- Which algorithms are supported?
- Levenshtein distance (character-level edits), Jaccard similarity (set overlap), cosine similarity (word-count vectors), and n-gram similarity (shared character or word sequences). Each suits a different use case.
- Which should I use?
- Use Levenshtein for typo detection, Jaccard for plagiarism or overlap checks, cosine for a rough word-frequency comparison of shorter texts, and n-gram for phrase matching.
- Is this semantic comparison?
- No. These are syntactic algorithms: they compare surface form (characters and words), not meaning. For semantic similarity (meaning, not spelling), use embeddings (OpenAI, Cohere), which are a different tool entirely.
- Does it work for code?
- Yes, for small code snippets, where Levenshtein works fine. For real diff-style comparison of code, use text-diff or git diff.
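
To make the four measures in the answers above concrete, here is a minimal Python sketch of each. It is an illustration only, not this tool's actual implementation. The function names, the whitespace tokenization, the lowercasing, and the default of character bigrams are all assumptions chosen for the example.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def jaccard(a: str, b: str) -> float:
    """Overlap of the two word sets: |A & B| / |A | B|."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # assumption: two empty texts count as identical
    return len(sa & sb) / len(sa | sb)


def cosine(a: str, b: str) -> float:
    """Cosine of the angle between the two word-count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Jaccard overlap of character n-gram sets (one common n-gram measure)."""
    def grams(s: str) -> set:
        return {s[i:i + n] for i in range(len(s) - n + 1)}

    ga, gb = grams(a.lower()), grams(b.lower())
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(jaccard("the cat sat", "the cat ran"))
    print(cosine("the cat sat", "the cat ran"))
    print(ngram_similarity("night", "nacht"))
```

Levenshtein returns a distance (lower means more alike), while the other three return a similarity between 0 and 1 (higher means more alike). None of them looks at meaning, so two sentences with the same meaning but different wording will still score as dissimilar.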