Halfwidth and Fullwidth Forms (Unicode block): Difference between revisions

|symbols = Variant width characters
|1_0_0 = 216
|1_0_11_1 = 7
|3_2 = 2
|note = <ref>{{cite web|url=https://www.unicode.org/versions/Unicode1.0.0/Notice.pdf|title=Unicode 1.0.1 Addendum|work=The Unicode Standard|date=1992-11-03|access-date=2016-07-09|deadurl=no|archiveurl=https://web.archive.org/web/20160702004420/http://www.unicode.org/versions/Unicode1.0.0/Notice.pdf|archivedate=2016-07-02|df=}}</ref><ref>{{cite web|url=https://www.unicode.org/ucd/|title=Unicode character database|work=The Unicode Standard|access-date=2023-07-26}}</ref><ref>{{cite web|url=https://www.unicode.org/versions/enumeratedversions.html|title=Enumerated Versions of The Unicode Standard|work=The Unicode Standard|access-date=2023-07-26}}</ref>
}}
 
'''Halfwidth and Fullwidth Forms''' is the name of a [[Unicode block]] U+FF00&ndash;FFEF, provided so that older encodings containing both [[Halfwidth and fullwidth forms|halfwidth and fullwidth]] characters can have lossless translation to/from Unicode. It is the second-to-last block of the [[Basic Multilingual Plane]], followed only by the short [[Specials (Unicode block)|Specials]] block at U+FFF0&ndash;FFFF. Its block name in Unicode 1.0 was '''Halfwidth and Fullwidth Variants'''.<ref>{{cite web |url=https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf |work=The Unicode Standard |version=version 1.0 |title=3.8: Block-by-Block Charts |publisher=[[Unicode Consortium]]}}</ref>
 
Range U+FF01&ndash;FF5E reproduces the characters of [[ASCII]] 21 to 7E as fullwidth forms. U+FF00 does not correspond to a fullwidth ASCII 20 (space character), since that role is already fulfilled by U+3000 "[[ideographic space]]".
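
Because the fullwidth range mirrors ASCII 21 to 7E in order, the correspondence can be sketched as a constant offset of FEE0 (hexadecimal). The following Python sketch is illustrative only; the function names <code>to_fullwidth</code> and <code>to_halfwidth</code> are hypothetical, and mapping the ASCII space to U+3000 is an assumption based on the role described above rather than a rule stated by the block itself.

```python
# Offset between an ASCII code point (0x21..0x7E) and its fullwidth form.
FULLWIDTH_OFFSET = 0xFF01 - 0x21  # 0xFEE0
IDEOGRAPHIC_SPACE = "\u3000"      # fills the role of a fullwidth space


def to_fullwidth(text: str) -> str:
    """Map printable ASCII to fullwidth forms; leave everything else as-is."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        elif cp == 0x20:
            # Assumption: use U+3000, since U+FF00 is not a fullwidth space.
            out.append(IDEOGRAPHIC_SPACE)
        else:
            out.append(ch)
    return "".join(out)


def to_halfwidth(text: str) -> str:
    """Inverse of to_fullwidth for U+FF01..FF5E and U+3000."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        elif ch == IDEOGRAPHIC_SPACE:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)
```

Under these assumptions, <code>to_halfwidth(to_fullwidth(s))</code> returns <code>s</code> for any string that does not already contain fullwidth forms or U+3000.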
 
Range U+FF61&ndash;FF9F encodes halfwidth forms of [[katakana]] and related punctuation in a transposition of A1 to DF in the [[JIS X 0201]] encoding; see [[half-width kana]].
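
The transposition of JIS X 0201 bytes A1 to DF onto U+FF61&ndash;FF9F amounts to adding a constant offset of FEC0 (hexadecimal). A minimal Python sketch, with hypothetical function names chosen for illustration, might look like this:

```python
# Offset between a JIS X 0201 katakana byte (0xA1..0xDF) and its
# halfwidth code point in U+FF61..FF9F.
HALFWIDTH_KANA_OFFSET = 0xFF61 - 0xA1  # 0xFEC0


def jisx0201_kana_to_unicode(byte: int) -> str:
    """Return the halfwidth character for a JIS X 0201 byte in A1..DF."""
    if not 0xA1 <= byte <= 0xDF:
        raise ValueError(f"byte {byte:#04x} is outside A1..DF")
    return chr(byte + HALFWIDTH_KANA_OFFSET)


def unicode_to_jisx0201_kana(ch: str) -> int:
    """Return the JIS X 0201 byte for a character in U+FF61..FF9F."""
    cp = ord(ch)
    if not 0xFF61 <= cp <= 0xFF9F:
        raise ValueError(f"U+{cp:04X} is outside U+FF61..FF9F")
    return cp - HALFWIDTH_KANA_OFFSET
```

Bytes outside A1 to DF are rejected rather than passed through, so that the sketch covers only the range described above.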
 
The range U+FFA0&ndash;FFDC encodes halfwidth forms of [[Hangul Compatibility Jamo|compatibility jamo]] characters for [[Hangul]], in a transposition of their [[KS C 5601#1974|1974 standard]] layout. It is used in the mapping of some IBM encodings for Korean, such as IBM code page 933, which allows the use of the [[Shift Out and Shift In characters]] to shift to a double-byte character set.<ref name="ibm933">{{cite web|url=http://demo.icu-project.org/icu-bin/convexp?conv=ibm-933|title=ICU Demonstration - Converter Explorer|author=|date=|website=demo.icu-project.org|access-date=7 May 2018}}</ref> Since the double-byte character set could contain compatibility jamo, halfwidth variants are needed to provide round-trip compatibility.<ref name=hwfwblame>{{Cite web|url=https://harjit.moe/hwfwblame.html|title=Halfwidth and Fullwidth blame}}</ref><ref>{{Cite web|url=http://userguide.icu-project.org/conversion/data|title=Conversion Data - Old location of the ICU User Guide}}</ref>
 
Range U+FFE0&ndash;FFEE includes fullwidth and halfwidth symbols.
 
The block has [[Variant form (Unicode)|variation sequences]] defined for East Asian punctuation positional variants.<ref>{{cite web|url=https://www.unicode.org/L2/L2017/17436r-sv-eastsian-punct.pdf|title=L2/17-436: Proposal to add standardized variation sequences for fullwidth East Asian punctuation|date=2018-01-21|first=Ken|last=Lunde}}</ref><ref name="stdvar">{{cite web|url=https://www.unicode.org/Public/UNIDATA/StandardizedVariants.txt|title=Unicode Character Database: Standardized Variation Sequences | publisher=The Unicode Consortium}}</ref> They use {{sc2|U+FE00 VARIATION SELECTOR-1}} (VS01) and {{sc2|U+FE01 VARIATION SELECTOR-2}} (VS02):
{|border="1" cellspacing="0" cellpadding="5" class="wikitable nounderlines" style="border-collapse:collapse;background:#FFFFFF;font-size:large;text-align:center"
|+style="font-size:small" | Variation sequences for punctuation alignment
|-style="background:#F8F8F8;font-size:small"
| style="text-align:right" | U+ || FF01 || FF0C || FF0E || FF1A || FF1B || FF1F || style="background:#F8F8F8;font-size:small;text-align:left" | Description
|-
| style="background:#F8F8F8;font-size:small;text-align:left" | base&nbsp;code&nbsp;point || &#xff01; || &#xff0c; || &#xff0e; || &#xff1a; || &#xff1b; || &#xff1f; || style="font-size:small;text-align:left" |
|-
| style="background:#F8F8F8;font-size:small;text-align:left" | base + VS01 || &#xff01;&#xfe00; || &#xff0c;&#xfe00; || &#xff0e;&#xfe00; || &#xff1a;&#xfe00; || &#xff1b;&#xfe00; || &#xff1f;&#xfe00; || style="font-size:small;text-align:left" | corner-justified form
|-
| style="background:#F8F8F8;font-size:small;text-align:left" | base + VS02 || &#xff01;&#xfe01; || &#xff0c;&#xfe01; || &#xff0e;&#xfe01; || &#xff1a;&#xfe01; || &#xff1b;&#xfe01; || &#xff1f;&#xfe01; || style="font-size:small;text-align:left" | centered form
|}
 
An additional variant is defined for a fullwidth [[slashed zero|zero with a short diagonal stroke]]: U+FF10 FULLWIDTH DIGIT ZERO, U+FE00 VS01 ({{not a typo|&#xff10;&#xfe00;}}).<ref>{{cite web|url=https://www.unicode.org/L2/L2015/15268-slashed-zero.pdf|title=L2/15-268: Proposal to Represent the Slashed Zero Variant of Empty Set|date=2015-10-30|first1=Barbara|last1=Beeton|first2=Asmus|last2=Freytag|first3=Laurențiu|last3=Iancu|first4=Murray|last4=Sargent}}</ref><ref name="stdvar"/>
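
A variation sequence is simply the base character followed by the variation selector. The Python sketch below builds the sequences listed above; the names <code>punctuation_variants</code> and <code>describe</code> are hypothetical, and whether a given font or renderer honours a sequence is outside the scope of the sketch.

```python
VS01 = "\uFE00"  # VARIATION SELECTOR-1
VS02 = "\uFE01"  # VARIATION SELECTOR-2

# Base code points of the fullwidth punctuation listed in the table above.
PUNCTUATION_BASES = [0xFF01, 0xFF0C, 0xFF0E, 0xFF1A, 0xFF1B, 0xFF1F]

# Fullwidth digit zero with a short diagonal stroke.
SLASHED_FULLWIDTH_ZERO = "\uFF10" + VS01


def punctuation_variants() -> dict:
    """Map each base code point to its two positional variation sequences."""
    return {
        cp: {
            "corner-justified": chr(cp) + VS01,
            "centered": chr(cp) + VS02,
        }
        for cp in PUNCTUATION_BASES
    }


def describe(sequence: str) -> str:
    """Format a sequence as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in sequence)
```

For example, <code>describe(punctuation_variants()[0xFF01]["centered"])</code> yields the code point listing for the centered form of the fullwidth exclamation mark.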
 
==History==
The following Unicode-related documents record the purpose and process of defining specific characters in the Halfwidth and Fullwidth Forms block:
 
{{sticky header}}
{| class="wikitable collapsible sticky-header"
|-
! [[Unicode#Versions|Version]] !! {{nobr|Final code points<ref group=lower-alpha name=final/>}} !! Count !! [[International Committee for Information Technology Standards|L2]]&nbsp;ID !! [[ISO/IEC JTC 1/SC 2|WG2]]&nbsp;ID !! Document
|-
| rowspan="9" | 1.0.0 || rowspan="9" width="180" | U+FF01..FF5E, FF61..FFBE, FFC2..FFC7, FFCA..FFCF, FFD2..FFD7, FFDA..FFDC, FFE0..FFE6 || rowspan="9" | 216 || || || (to be determined)
|-
| || {{nobr|[https://www.unicode.org/wg2/docs/n4403.pdf N4403 (pdf],}} [https://www.unicode.org/wg2/docs/n4403.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 61, Holiday Inn, Vilnius, Lithuania; 2013-06-10/14|date=2014-01-28|first=V. S.|last=Umamaheswaran|section=Resolution M61.01}}
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17056-sv-western-vs-eastasian.pdf L2/17-056]}} || || {{Citation|title=Proposal to add standardized variation sequences|date=2017-02-13|first=Ken|last=Lunde|author-link=Ken Lunde}}
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17436r-sv-eastsian-punct.pdf L2/17-436]}} || || {{Citation|title=Proposal to add standardized variation sequences for fullwidth East Asian punctuation|date=2018-01-21|first=Ken|last=Lunde}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18115.htm L2/18-115]}} || || {{Citation|title=UTC #155 Minutes|date=2018-05-09|first=Lisa|last=Moore|section=Consensus 154-C17|quote=Add 16 standardized variation sequences based on L2/17-436R, for Unicode 12.0.}}
|-
| {{nobr|[https://www.unicode.org/L2/L2019/19055-segment-fullwd-digits.txt L2/19-055]}} || || {{Citation|title=Proposed Changes in the Segmentation Property Values for Fullwidth Digits |date=2019-01-14|first=Laurențiu|last=Iancu}}
|-
| {{nobr|[https://www.unicode.org/L2/L2019/19008.htm L2/19-008]}} || || {{Citation|title=UTC #158 Minutes |date=2019-02-08|first=Lisa|last=Moore|section=B.11.11.1.2 Proposed changes in the segmentation property values for fullwidth digits}}
|-
| 1.0.1 || width="180" | U+FFE8..FFEE || 7 || || || (to be determined)
|-
| rowspan="11" | 3.2 || rowspan="11" width="180" | U+FF5F..FF60 || rowspan="11" | 2 || {{nobr|[https://www.unicode.org/L2/L1999/99052.htm L2/99-052]}} || || {{Citation|title=The math pieces from the symbol font|date=1999-02-05|first=Asmus|last=Freytag}}
|-
| {{nobr|[https://www.unicode.org/L2/L2001/01033-addbrackets.htm L2/01-033]}} || || {{Citation|title=Disunify braces/brackets for math, computing science, and Z notation from similar-looking CJK braces/brackets|date=2001-01-16|first1=Kent|last1=Karlsson|first2=Asmus|last2=Freytag}}