- What are the four normalization forms?
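- NFC (canonical composition: café with é stored as the single code point U+00E9), NFD (canonical decomposition: café as c + a + f + e + combining acute accent U+0301), NFKC (compatibility decomposition followed by canonical composition: the ﬀ ligature U+FB00 becomes the two letters ff), and NFKD (compatibility decomposition: the same folding, left decomposed). NFC is the usual choice for web content.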
- Which form should I use?
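- NFC for everything user-facing (storage, display, search indexing). NFD when you need to work with base letters and combining marks separately, as in typography processing. NFKC/NFKD only for derived keys such as loose search matching: they are lossy, so keep the original text alongside the normalized key.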
- Why does normalization matter?
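- Two visually identical strings can compare as unequal when they encode the same text with different code point sequences, for example precomposed é versus e followed by a combining accent. Login comparisons, cache keys, and file name matching all need both sides normalized to the same form before comparing.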
- Will it change visible characters?
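- NFC and NFD should not change how text looks: converting between them preserves canonical equivalence. The code points can still change, and a few singleton characters do not round-trip (the Ångström sign U+212B becomes Å U+00C5 and stays that way). NFKC/NFKD can change what the reader sees: ligatures, superscripts, and other compatibility characters are replaced with their plain forms (x² becomes x2), which usually reads the same but can lose formatting or meaning.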
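
The answers above can be checked with a short sketch using Python's standard `unicodedata` module. It is only an illustration: the `same_text` helper and the variable names are hypothetical, and comparing via NFC is one reasonable choice rather than the only one.

```python
import unicodedata

composed = "caf\u00e9"      # é as the single code point U+00E9
decomposed = "cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# The two strings render alike but are different code point sequences.
raw_equal = composed == decomposed  # False before normalization

# NFC composes, NFD decomposes; each maps one spelling onto the other.
to_nfc = unicodedata.normalize("NFC", decomposed)
to_nfd = unicodedata.normalize("NFD", composed)


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Compatibility forms fold characters that canonical forms leave alone.
ligature = "\ufb00"                                     # LATIN SMALL LIGATURE FF
nfc_ligature = unicodedata.normalize("NFC", ligature)   # unchanged
nfkc_ligature = unicodedata.normalize("NFKC", ligature) # two plain letters "ff"
nfkc_square = unicodedata.normalize("NFKC", "x\u00b2")  # superscript two becomes "2"
```

Here `same_text(composed, decomposed)` is true even though `raw_equal` is false. The NFKC results show why the compatibility forms suit derived search keys better than stored text: the ligature and the superscript cannot be recovered from the output.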