Unicode equivalence: Difference between revisions

==Normalization==
The implementation of Unicode string searches and comparisons in text processing software must take into account the presence of equivalent code points. In the absence of this feature, users searching for a particular code point sequence would be unable to find other visually indistinguishable glyphs that have a different, but canonically equivalent, code point representation.
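
As a minimal sketch of the point above, the following Python snippet compares and searches strings after normalizing both operands with the standard library's <code>unicodedata.normalize</code>. The helper names <code>canonically_equal</code> and <code>find_normalized</code> are hypothetical and chosen only for illustration, and NFC is used here as one possible choice of normal form.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Return True if a and b are equal after NFC normalization.

    Hypothetical helper: normalizing both operands to the same normal
    form makes canonically equivalent sequences compare equal.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def find_normalized(haystack: str, needle: str) -> int:
    """Search for needle in haystack after normalizing both to NFC.

    Hypothetical helper. The returned index refers to the normalized
    haystack, not to the original string, and is -1 if nothing matches.
    """
    normalized_haystack = unicodedata.normalize("NFC", haystack)
    normalized_needle = unicodedata.normalize("NFC", needle)
    return normalized_haystack.find(normalized_needle)


# "café" written with a precomposed é (U+00E9) ...
composed = "caf\u00e9"
# ... and with "e" followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "cafe\u0301"

print(composed == decomposed)                         # naive comparison
print(canonically_equal(composed, decomposed))        # normalized comparison
print(("un " + decomposed).find(composed))            # naive search
print(find_normalized("un " + decomposed, composed))  # normalized search
```

The naive comparison and search operate on raw code points and therefore miss the canonically equivalent spelling, whereas the normalized versions treat both spellings alike. Because the index returned by the search refers to the normalized text, real software would typically also need to map it back to a position in the original string.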
 
=== Algorithms ===
Unicode provides standard normalization algorithms that produce a unique (normal) code point sequence for all sequences that are equivalent; the equivalence criteria can be either canonical (NF) or compatibility (NFK). Since one can arbitrarily choose the [[representative (mathematics)|representative]] element of an [[equivalence class]], multiple canonical forms are possible for each equivalence criterion. Unicode provides two normal forms that are semantically meaningful for each of the two equivalence criteria: the composed forms NFC and NFKC, and the decomposed forms NFD and NFKD. Both the composed and decomposed forms impose a '''canonical ordering''' on the code point sequence, which is necessary for the normal forms to be unique.
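
The four normal forms can be illustrated with a short Python sketch, assuming the standard library's <code>unicodedata.normalize</code>. The helper names <code>all_forms</code> and <code>code_points</code> and the sample strings are hypothetical, picked only to make the differences visible.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def all_forms(text: str) -> dict:
    """Return the text normalized to each of the four normal forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Precomposed é (U+00E9) followed by the "fi" ligature (U+FB01).
# é has a canonical decomposition; the ligature has only a
# compatibility decomposition, so only NFKC and NFKD replace it.
sample = "\u00e9\ufb01"
for form, result in all_forms(sample).items():
    print(form, code_points(result))

# Canonical ordering: the same two combining marks in different orders.
# COMBINING DOT ABOVE (U+0307) and COMBINING DOT BELOW (U+0323) belong
# to different combining classes, so normalization sorts them into a
# single order and both spellings end up identical.
dot_above_first = "q\u0307\u0323"
dot_below_first = "q\u0323\u0307"
same_after_nfd = (
    unicodedata.normalize("NFD", dot_above_first)
    == unicodedata.normalize("NFD", dot_below_first)
)
print(same_after_nfd)
```

The first loop shows that the canonical forms (NFC, NFD) leave the ligature untouched while the compatibility forms (NFKC, NFKD) replace it with the letters "f" and "i". The second part shows why canonical ordering matters: without it, the two spellings of the same marked letter would remain distinct even after decomposition.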