- What's cosine similarity?
- Measures the angle between two vectors regardless of magnitude. Ranges from -1 (opposite) to 1 (identical). 0 is unrelated.
- Which embedding models does it work with?
- Any embedding vector — OpenAI text-embedding-3, Cohere embeddings, Sentence Transformers, all work the same way. Paste the vectors; the math is model-agnostic.
- What's a 'similar enough' threshold?
- Depends on model and data. Often >0.8 is 'similar', >0.95 is 'near duplicate'. Calibrate on your own data; there's no universal threshold.
- Why cosine and not Euclidean?
- Cosine ignores magnitude — only direction matters. For language embeddings, direction captures meaning better than raw distance. Most vector DBs default to cosine.