Standard Compression Scheme for Unicode

 
== The scheme ==
The following sections briefly describe the anatomy of a compressed SCSU stream. For a full description (matching that of a decompressor), see the UTS #6 document.
 
=== Encoding modes ===
An SCSU stream starts in single-byte mode, which uses the compressed window encoding. Commands exist to switch to a UTF-16BE "Unicode" mode, and to switch from that mode back to single-byte mode.
 
=== Window encoding ===
The core of SCSU lies in its windows, which define the meanings of bytes 0x80-0xFF. There are eight static windows for simpler scripts and punctuation, and 6 types of dynamic windows (plus "half Unicode block" windows and custom windows for the supplementary planes) for scripts that use more characters.
 
Both static and dynamic windows are selected by special command characters. For individual characters that do not fit into the current window, command characters for quoting are provided.
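
As a rough illustration of how a window gives meaning to bytes 0x80-0xFF, the following sketch decodes a small subset of single-byte mode. It is not a full decompressor: the default dynamic window offsets and the window-selection tags (0x10-0x17) are recalled from UTS #6 and should be verified there, the function name <code>decode_window_subset</code> is hypothetical, and quoting, window definition and Unicode mode are deliberately left out.

```python
# Sketch of window-based decoding in single-byte mode. The offsets and tag
# values below are assumptions taken from UTS #6, not from this article.
DEFAULT_DYNAMIC_OFFSETS = [
    0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00,
]
SC0, SC7 = 0x10, 0x17          # select dynamic window 0..7 as the active window
PASS_THROUGH_CONTROLS = (0x00, 0x09, 0x0A, 0x0D)


def decode_window_subset(data: bytes) -> str:
    """Decode ASCII, window-selection tags and window bytes only."""
    offsets = list(DEFAULT_DYNAMIC_OFFSETS)
    active = 0  # dynamic window 0 is assumed active at the start of a stream
    out = []
    for b in data:
        if SC0 <= b <= SC7:
            active = b - SC0
        elif b >= 0x80:
            # A byte in 0x80-0xFF is an index into the active dynamic window.
            out.append(chr(offsets[active] + (b - 0x80)))
        elif b >= 0x20 or b in PASS_THROUGH_CONTROLS:
            out.append(chr(b))  # ASCII and a few controls pass through
        else:
            raise ValueError(f"tag 0x{b:02X} is not handled by this sketch")
    return "".join(out)
```

Under these assumptions, <code>12 9C BE C1 BA B2 B0</code> decodes to "Москва": the first byte selects the window starting at U+0400, and each following byte costs one byte per Cyrillic letter. Latin-1 text such as <code>D6 6C 20 66 6C 69 65 DF 74</code> ("Öl fließt") needs no command at all, because the initially active window starts at U+0080.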
=== Inline commands ===
 
== Comparison with general-purpose plain text compression schemes ==