- Which search engines does this cover?
- The analyzer simulates Elasticsearch/OpenSearch analyzer chains, including character filters, tokenizers, and token filters. Its output is equivalent to what the Analyze API returns.
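  As a rough illustration of what such a chain does (a minimal sketch, not this tool's actual implementation — the filter choices here are hypothetical stand-ins for built-ins like `html_strip`, `standard`, and `lowercase`):

  ```python
  import re

  def analyze(text):
      # Character filter: strip HTML tags (a stand-in for html_strip)
      text = re.sub(r"<[^>]+>", " ", text)
      # Tokenizer: split on non-word characters (roughly like "standard")
      tokens = [t for t in re.split(r"\W+", text) if t]
      # Token filter: lowercase each token
      return [t.lower() for t in tokens]

  print(analyze("<p>The Quick FOX</p>"))  # prints ['the', 'quick', 'fox']
  ```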
- What's the difference between a tokenizer and an analyzer?
- A tokenizer splits text into tokens. An analyzer is the full pipeline: character filters (pre-processing) → tokenizer → token filters (e.g., lowercasing, stemming, stop-word removal). An analyzer always contains exactly one tokenizer, but may have zero or more character filters and token filters.
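  That ordering can be sketched as a simple composition (toy components for illustration, not the real Elasticsearch implementations):

  ```python
  import re

  def make_analyzer(char_filters, tokenizer, token_filters):
      """Compose an analyzer: char filters run first, then exactly one
      tokenizer, then each token filter in order."""
      def analyze(text):
          for cf in char_filters:
              text = cf(text)
          tokens = tokenizer(text)
          for tf in token_filters:
              tokens = tf(tokens)
          return tokens
      return analyze

  # Toy stand-ins for built-in components
  strip_html = lambda s: re.sub(r"<[^>]+>", " ", s)
  whitespace_tokenizer = lambda s: s.split()
  lowercase = lambda ts: [t.lower() for t in ts]
  stopwords = lambda ts: [t for t in ts if t not in {"the", "a", "an"}]

  analyzer = make_analyzer([strip_html], whitespace_tokenizer, [lowercase, stopwords])
  print(analyzer("<b>The Quick</b> Brown Fox"))  # prints ['quick', 'brown', 'fox']
  ```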
- Does it support language-specific analyzers?
- Yes — English, French, German, Spanish, and other built-in language analyzers are included, with their respective stemmer and stopword configurations pre-loaded.
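  Conceptually, a language analyzer bundles lowercasing, language-specific stop words, and a stemmer. The sketch below mimics that shape for English; the stemmer is a toy suffix stripper, not the actual Porter stemmer Elasticsearch uses, and the stop-word set is abbreviated:

  ```python
  # Abbreviated stop-word list for illustration only
  ENGLISH_STOPWORDS = {"the", "a", "an", "and", "of", "to", "is"}

  def toy_stem(token):
      # Crude suffix stripping; real analyzers use a proper stemmer
      for suffix in ("ing", "ed", "es", "s"):
          if token.endswith(suffix) and len(token) > len(suffix) + 2:
              return token[: -len(suffix)]
      return token

  def english_analyze(text):
      tokens = [t.lower() for t in text.split()]
      tokens = [t for t in tokens if t not in ENGLISH_STOPWORDS]
      return [toy_stem(t) for t in tokens]

  print(english_analyze("The jumping foxes jumped"))  # prints ['jump', 'fox', 'jump']
  ```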