- How many languages does it support?
- 80+ languages including all major European, Asian, and Semitic languages. Less commonly digitized languages have lower accuracy due to limited training data.
- Does it work on code-switched text (multiple languages mixed)?
- Only partially. It returns the single dominant language with a confidence score and does not split the text into per-language segments. Code-switching detection (identifying language segments within a text) is a separate, harder problem not covered here.
- What's the minimum text length for reliable detection?
- With 50+ characters, accuracy is 95%+ for most language pairs. Under 20 characters, accuracy drops significantly. Single-word detection is unreliable for closely related language pairs.
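
The length guidance above can be turned into a simple guard around whatever detector is in use. The following is a minimal sketch, not the tool's actual API: `length_tier`, `detect_with_tier`, `Detection`, and `fake_detector` are hypothetical names, and the detector is assumed to be any callable that returns a `(language, confidence)` pair. The 50 and 20 character thresholds come from the answer above. The middle tier (20 to 49 characters) is an assumption, since no accuracy figure is given for that range.

```python
from typing import Callable, NamedTuple, Tuple

# Thresholds taken from the FAQ answer; the tier names are illustrative.
RELIABLE_MIN_CHARS = 50      # 50+ characters: 95%+ accuracy for most pairs
UNRELIABLE_BELOW_CHARS = 20  # under 20 characters: accuracy drops significantly


class Detection(NamedTuple):
    language: str
    confidence: float
    reliability: str  # "high", "reduced", or "low"


def length_tier(text: str) -> str:
    """Classify how much to trust a detection, based only on input length."""
    n = len(text.strip())
    if n >= RELIABLE_MIN_CHARS:
        return "high"
    if n >= UNRELIABLE_BELOW_CHARS:
        return "reduced"  # assumed intermediate tier, not stated in the FAQ
    return "low"


def detect_with_tier(
    text: str, detector: Callable[[str], Tuple[str, float]]
) -> Detection:
    """Run a detector and attach a length-based reliability tier."""
    language, confidence = detector(text)
    return Detection(language, confidence, length_tier(text))


def fake_detector(text: str) -> Tuple[str, float]:
    """Stand-in for a real detector; always reports English (hypothetical)."""
    return ("en", 0.99)


if __name__ == "__main__":
    print(detect_with_tier("Hello", fake_detector))
```

A caller might fall back to a default language or ask for more input when the tier is `"low"`, regardless of the confidence score the detector reports. For mixed-language input the result still describes only the dominant language.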