Extract keywords and phrases from text content. Part of the DevTools Surf developer suite.
Use Cases
Extract key topics from a research paper or technical document for indexing.
Identify the main subjects covered in a customer support ticket corpus.
Find content keywords in a blog draft to inform meta tags and heading structure.
Analyze competitive content to reverse-engineer the topic coverage strategy.
Tips
Extract keywords from competitor pages and compare them with your own to find topical gaps before writing new content.
Use multi-word phrase extraction (bigrams, trigrams) in addition to single-word extraction to identify meaningful compound terms.
Filter out common function words (stopwords) to see only the content-bearing keywords that signal page topics.
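The phrase-extraction and stopword tips above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the tiny STOPWORDS set and the helper names tokenize and ngram_counts are assumptions made for this example, and a real stopword list would be much longer.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "with"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngram_counts(text, n):
    """Count n-grams (n=1 words, n=2 bigrams, ...) that contain no stopwords."""
    tokens = tokenize(text)
    # zip over n shifted copies of the token list yields consecutive n-tuples.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(
        " ".join(gram) for gram in grams
        if not any(word in STOPWORDS for word in gram)
    )

text = "Keyword extraction tools rank keyword extraction results for the search engine."
print(ngram_counts(text, 1).most_common(3))  # single words
print(ngram_counts(text, 2).most_common(3))  # bigrams
```

Dropping any n-gram that contains a stopword is one simple policy; some pipelines only reject n-grams that start or end with a stopword, so that phrases like "state of the art" survive.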
Fun Facts
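Inverse document frequency, the weighting at the heart of TF-IDF (Term Frequency–Inverse Document Frequency), was introduced by Karen Spärck Jones in 1972. TF-IDF remains one of the most widely used statistical measures for keyword extraction, and the closely related BM25 function is the default ranking formula in Lucene-based search engines such as Elasticsearch and Solr.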
RAKE (Rapid Automatic Keyword Extraction), published in 2010, can extract key phrases from a document in milliseconds without a training corpus by using word co-occurrence and frequency statistics.
Google's BERT model (published in 2018 and applied to Google Search in 2019) shifted keyword analysis from term frequency toward semantic meaning: a page can rank for a keyword that never appears verbatim if the surrounding context is semantically relevant.
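For readers who want to see the TF-IDF idea in code, here is a minimal sketch using the textbook weighting tf * log(N / df). The function names and the sample documents are hypothetical, and production search engines use smoothed or normalized variants rather than this exact formula.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tf_idf(docs):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    "search engines rank pages",
    "search engines index pages",
    "keyword extraction finds topics",
]
for doc_scores in tf_idf(docs):
    # Show the two highest-scoring terms for each document.
    print(sorted(doc_scores, key=doc_scores.get, reverse=True)[:2])
```

With this weighting, a term that appears in every document scores zero, which is why rare terms such as "rank" outscore shared terms such as "search" in the first sample document.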
FAQ
What algorithm does it use?
TF-IDF for single-document keyword scoring and RAKE for multi-word phrase extraction. Both are configurable; the output shows each keyword's score and frequency.
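As a rough illustration of how RAKE-style phrase scoring works, the sketch below follows the published algorithm's outline: split the text into candidate phrases at stopwords and punctuation, score each word as degree divided by frequency, and sum the word scores per phrase. It is a simplified sketch, not this tool's code; the STOPWORDS set and the function names are assumptions for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "are", "on", "with", "by"}

def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def rake_scores(text):
    """Score each candidate phrase as the sum of its words' degree/frequency ratios."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs in candidate phrases
    degree = defaultdict(int)  # total length of the phrases each word occurs in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {word: degree[word] / freq[word] for word in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}

text = "Keyword extraction is fast. Automatic keyword extraction of documents is useful."
for phrase, score in sorted(rake_scores(text).items(), key=lambda kv: -kv[1]):
    print(f"{score:.1f}  {phrase}")
```

Because the score depends only on co-occurrence within the one input text, no training corpus is needed, and longer phrases built from frequently co-occurring words rise to the top.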
Does it remove stopwords?
Yes. A built-in English stopword list removes common function words, and custom stopwords can be added. For other languages, a language-specific list is applied when the input language is detected.
Can it extract keywords from multiple documents?
Yes. Enter up to five documents, and the tool finds the keywords common to all of them (core topics) and those unique to each (differentiating topics), which is useful for content gap analysis.
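The common-versus-unique comparison can be approximated with plain set operations. The sketch below is an assumption about the general approach rather than the tool's actual logic: it treats every non-stopword as a keyword, and the names keywords and compare_documents are hypothetical.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "with"}

def keywords(text):
    """Return the set of non-stopword tokens in the text."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def compare_documents(docs):
    """Return (keywords shared by all docs, list of keywords unique to each doc)."""
    keyword_sets = [keywords(doc) for doc in docs]
    common = set.intersection(*keyword_sets) if keyword_sets else set()
    unique = []
    for i, own in enumerate(keyword_sets):
        # Union of every other document's keywords.
        others = set().union(*(k for j, k in enumerate(keyword_sets) if j != i))
        unique.append(own - others)
    return common, unique

docs = [
    "Python keyword extraction guide",
    "Keyword extraction in Rust",
    "Keyword extraction for SEO",
]
common, unique = compare_documents(docs)
print("core topics:", sorted(common))
for doc, terms in zip(docs, unique):
    print(f"{doc!r} only:", sorted(terms))
```

For gap analysis, the terms unique to a competitor's document are the ones worth reviewing first, since they point at topics your own page does not cover.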