Revision as of 21:05, 25 April 2023 by Omegatron (Administrators; 35,823 edits): →Character duplication: Duplicate characters in Unicode (Tag: Visual edit)
Revision as of 17:02, 29 August 2023 by HouseBlaster (Edit filter managers, Administrators; 72,619 edits): m →Normal forms: expand contraction
Line 82: The normal forms are not [[closure (mathematics)|closed]] under string [[concatenation]].<ref>Per [http://www.unicode.org/faq/normalization.html#5 What should be done about concatenation]</ref> For defective Unicode strings starting with a Hangul vowel or trailing [[Hangul Jamo (Unicode block)|conjoining jamo]], concatenation can break composition. The normalization functions are also not [[injective function|injective]] (they map different original glyphs and sequences to the same normalized sequence) and thus not [[bijection|bijective]] (the original ~~can't~~cannot be restored). For example, the distinct Unicode strings "U+212B" (the angstrom sign "Å") and "U+00C5" (the Swedish letter "Å") are both expanded by NFD (or NFKD) into the sequence "U+0041 U+030A" (Latin letter "A" and combining [[ring above]] "°"), which is then reduced by NFC (or NFKC) to "U+00C5" (the Swedish letter "Å"). A single character (other than a Hangul syllable block) that will be replaced by another under normalization can be identified in the Unicode tables as having a non-empty compatibility field but no compatibility tag.
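The two properties described above can be checked with a minimal sketch, assuming Python's standard <code>unicodedata</code> module. The variable names and the <code>codepoints</code> helper are illustrative only and do not come from the article. The Hangul pair used here (U+1100 followed by U+1161) is one assumed instance of a string trailing in a conjoining jamo being concatenated with a string starting with a Hangul vowel.

```python
import unicodedata

def codepoints(s):
    """Illustrative helper: render a string as 'U+XXXX U+XXXX ...'."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Non-injectivity: two distinct inputs normalize to the same sequence.
angstrom_sign = "\u212B"   # ANGSTROM SIGN
a_with_ring = "\u00C5"     # LATIN CAPITAL LETTER A WITH RING ABOVE

nfd_sign = unicodedata.normalize("NFD", angstrom_sign)
nfd_letter = unicodedata.normalize("NFD", a_with_ring)
nfc_sign = unicodedata.normalize("NFC", angstrom_sign)
nfc_letter = unicodedata.normalize("NFC", a_with_ring)

print(codepoints(nfd_sign), "|", codepoints(nfd_letter))  # U+0041 U+030A | U+0041 U+030A
print(codepoints(nfc_sign), "|", codepoints(nfc_letter))  # U+00C5 | U+00C5

# Non-closure under concatenation: each piece is already in NFC,
# but the concatenation is not, because the jamo compose into a syllable.
left = "\u1100"    # HANGUL CHOSEONG KIYEOK (a conjoining leading consonant)
right = "\u1161"   # HANGUL JUNGSEONG A (a conjoining vowel)
left_is_nfc = unicodedata.normalize("NFC", left) == left
right_is_nfc = unicodedata.normalize("NFC", right) == right

joined = left + right
joined_nfc = unicodedata.normalize("NFC", joined)
joined_is_nfc = joined_nfc == joined

print(left_is_nfc, right_is_nfc, joined_is_nfc)  # True True False
print(codepoints(joined_nfc))                    # U+AC00
```

Once either "Å" has been normalized, nothing in the result records whether the input was U+212B or U+00C5, which is why the original cannot be restored. The concatenation case suggests that code joining already-normalized strings may need to normalize again at the boundary.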

Unicode equivalence: Difference between revisions