What's cosine similarity?

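Cosine similarity is the cosine of the angle between two vectors, so it depends only on direction and not on magnitude. It ranges from -1 (opposite directions) through 0 (orthogonal, usually read as unrelated) to 1 (same direction).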
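A minimal sketch of the calculation in plain Python, for illustration only. The function name `cosine_similarity` is chosen here and is not taken from the tool itself; the approach is the standard dot product divided by the product of the two vector lengths.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle to a zero vector is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Scaling either vector by a positive constant leaves the result unchanged, which is what "regardless of magnitude" means in practice.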

Which embedding models does it work with?

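Any embedding vector works: OpenAI text-embedding-3, Cohere embeddings, and Sentence Transformers are all handled the same way. Paste the vectors; the math is model-agnostic. Both vectors must have the same number of dimensions, and a score is only meaningful when both vectors come from the same model.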

What's a 'similar enough' threshold?

It depends on the model and the data. As a rough starting point, scores above 0.8 are often treated as 'similar' and scores above 0.95 as 'near duplicate', but there is no universal threshold, so calibrate on your own data.
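One possible way to calibrate, sketched under assumptions: you have a small set of pairs you have labeled by hand as similar or not, each with its cosine score. The function name `best_threshold` and the accuracy criterion are illustrative choices, not something the tool provides; precision or recall may matter more for your use case.

```python
def best_threshold(scored_pairs):
    """Pick the cutoff that classifies the most labeled pairs correctly.

    scored_pairs: list of (cosine_score, is_similar) tuples from
    hand-labeled data (hypothetical input format).
    Ties are resolved in favor of the lowest candidate threshold.
    """
    if not scored_pairs:
        raise ValueError("need at least one labeled pair")

    # Only the observed scores need to be tried as cutoffs.
    candidates = sorted({score for score, _ in scored_pairs})

    def accuracy(threshold):
        correct = sum(
            (score >= threshold) == is_similar
            for score, is_similar in scored_pairs
        )
        return correct / len(scored_pairs)

    return max(candidates, key=accuracy)
```

With only a handful of labeled pairs the result will be noisy, so treat it as a starting point and re-check it as you collect more data.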

Why cosine and not Euclidean?

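Cosine ignores magnitude, so only direction matters. For language embeddings, direction tends to capture meaning better than raw distance. Cosine is a common default metric in vector databases, and for unit-length vectors cosine similarity and Euclidean distance produce the same ranking.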
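The difference can be seen with a small sketch. The vectors and helper names below are made up for illustration, and `math.dist` requires Python 3.8 or later. It also checks the identity that links the two metrics for unit vectors: squared Euclidean distance equals 2 - 2 × cosine similarity.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity, computed as the dot product of unit vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

# Illustrative 2-D vectors (real embeddings have hundreds of dimensions).
a = [3.0, 4.0]
b = [30.0, 40.0]   # same direction as a, ten times the magnitude
c = [4.0, 3.0]     # different direction, same magnitude as a

# Raw Euclidean distance is dominated by magnitude: b looks far from a.
raw_ab = math.dist(a, b)
raw_ac = math.dist(a, c)

# Cosine sees a and b as identical in direction.
cos_ab = cosine(a, b)
cos_ac = cosine(a, c)

# After normalization the two metrics agree:
# squared distance == 2 - 2 * cosine similarity.
unit_dist_sq_ac = math.dist(normalize(a), normalize(c)) ** 2
```

Here `raw_ab` is much larger than `raw_ac` even though `a` and `b` point the same way, while `cos_ab` is higher than `cos_ac`.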

Embedding Distance Calculator

About Embedding Distance Calculator

Calculate the cosine similarity between two text embedding vectors. Part of the DevTools Surf developer suite.

Use Cases

ML engineers evaluating semantic search result quality
NLP researchers comparing sentence similarity across models
RAG developers tuning retrieval thresholds for AI chatbots
Data scientists clustering documents by embedding proximity

Tips

A cosine similarity near 1.0 means the vectors point in nearly the same direction, which for embeddings usually indicates similar meaning
Compare multiple vectors to find the closest semantic match
Normalize vectors to unit length before using dot product or Euclidean distance; cosine similarity itself is unaffected by magnitude
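The last two tips can be combined in a short sketch: normalize each vector once, then use the dot product (which equals cosine similarity for unit vectors) to find the closest candidate. The function names and the dictionary input format are assumptions made for this example.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def closest_match(query, candidates):
    """Return (name, score) of the candidate most similar to the query.

    candidates: dict mapping a label to its embedding vector
    (hypothetical input format).
    """
    if not candidates:
        raise ValueError("need at least one candidate vector")
    q = normalize(query)
    best_name, best_score = None, -math.inf
    for name, vec in candidates.items():
        # Dot product of two unit vectors is their cosine similarity.
        score = sum(x * y for x, y in zip(q, normalize(vec)))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

For large collections a vector index is the usual choice; this linear scan is only meant to show the arithmetic.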

Fun Facts

Cosine similarity was used in information retrieval as early as the 1960s in Gerard Salton's SMART system, begun at Harvard and continued at Cornell, decades before modern embeddings existed.
OpenAI's text-embedding-ada-002 produces 1,536-dimensional vectors; individual dimensions generally have no meaning that humans can interpret on their own.
The 'king - man + woman ≈ queen' analogy from Mikolov's 2013 Word2Vec work demonstrated that vector arithmetic could capture semantic relationships and helped popularize word embeddings.
