Halfwidth and Fullwidth Forms (Unicode block): Difference between revisions


Revision as of 14:13, 6 June 2019 by HarJIT: Reverted to revision 625693190 by Drmccreedy: Demerge per Talk:Halfwidth and fullwidth forms (TW)
Latest revision as of 00:58, 7 April 2025 by Beland: →Block: {{not a typo}}
(32 intermediate revisions by 19 users not shown)
\|1_1 = 7
\|3_2 = 2
\|note = <ref>{{cite web\|url=https://www.unicode.org/versions/Unicode1.0.0/Notice.pdf\|title=Unicode 1.0.1 Addendum\|work=The Unicode Standard\|date=1992-11-03\|access-date=2016-07-09}}</ref><ref>{{cite web\|url=https://www.unicode.org/ucd/\|title=Unicode character database\|work=The Unicode Standard\|access-date=2023-07-26}}</ref><ref>{{cite web\|url=https://www.unicode.org/versions/enumeratedversions.html\|title=Enumerated Versions of The Unicode Standard\|work=The Unicode Standard\|access-date=2023-07-26}}</ref>
}}

'''Halfwidth and Fullwidth Forms''' is a [[Unicode block]] spanning U+FF00–FFEF, provided so that older encodings containing both [[Halfwidth and fullwidth forms\|halfwidth and fullwidth]] characters can have lossless translation to/from Unicode. It is the second-to-last block of the [[Basic Multilingual Plane]], followed only by the short [[Specials (Unicode block)\|Specials]] block at U+FFF0–FFFF. Its block name in Unicode 1.0 was '''Halfwidth and Fullwidth Variants'''.<ref>{{cite web \|url=https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf \|work=The Unicode Standard \|version=version 1.0 \|title=3.8: Block-by-Block Charts \|publisher=[[Unicode Consortium]]}}</ref>

Range U+FF01–FF5E reproduces the characters of [[ASCII]] 21 to 7E (hexadecimal) as fullwidth forms. U+FF00 does not correspond to a fullwidth ASCII 20 (space character), since that role is already fulfilled by U+3000 "[[ideographic space]]".
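The ASCII correspondence described above amounts to a fixed code point offset (U+FF01 minus U+0021, i.e. 0xFEE0), with the space handled separately through U+3000. The following is a minimal Python sketch of that mapping; the function names `to_fullwidth` and `to_ascii` are hypothetical and chosen only for this illustration.

```python
# Offset between ASCII 0x21..0x7E and the fullwidth forms U+FF01..U+FF5E.
FULLWIDTH_OFFSET = 0xFF01 - 0x21  # 0xFEE0
IDEOGRAPHIC_SPACE = "\u3000"      # plays the role of a fullwidth space


def to_fullwidth(text: str) -> str:
    """Map printable ASCII to fullwidth forms; leave everything else alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        elif ch == " ":
            # U+FF00 is not a fullwidth space, so U+3000 is used instead.
            out.append(IDEOGRAPHIC_SPACE)
        else:
            out.append(ch)
    return "".join(out)


def to_ascii(text: str) -> str:
    """Inverse of to_fullwidth for U+FF01..U+FF5E and U+3000."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        elif ch == IDEOGRAPHIC_SPACE:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)
```

For example, `to_fullwidth("A1!")` yields the three characters U+FF21, U+FF11 and U+FF01, and `to_ascii` reverses the mapping. Characters outside the two ranges pass through unchanged.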
Range U+FF61–FF9F encodes halfwidth forms of [[katakana]] and related punctuation in a transposition of A1 to DF in the [[JIS X 0201]] encoding – see [[half-width kana]].

The range U+FFA0–FFDC encodes halfwidth forms of [[Hangul Compatibility Jamo\|compatibility jamo]] characters for [[Hangul]], in a transposition of their [[KS C 5601#1974\|1974 standard]] layout. This range is used in the mapping of some IBM encodings for Korean, such as IBM code page 933, which allows the use of the [[Shift Out and Shift In characters]] to shift to a double-byte character set.<ref name="ibm933">{{cite web\|url=http://demo.icu-project.org/icu-bin/convexp?conv=ibm-933\|title=ICU Demonstration - Converter Explorer\|website=demo.icu-project.org\|access-date=7 May 2018}}</ref> Since the double-byte character set could contain compatibility jamo, halfwidth variants are needed to provide round-trip compatibility.<ref name=hwfwblame>{{Cite web\|url=https://harjit.moe/hwfwblame.html\|title=Halfwidth and Fullwidth blame}}</ref><ref>{{Cite web\|url=http://userguide.icu-project.org/conversion/data\|title=Conversion Data - Old location of the ICU User Guide}}</ref>

Range U+FFE0–FFEE includes fullwidth and halfwidth symbols.
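The halfwidth katakana transposition mentioned above (JIS X 0201 bytes A1 to DF onto U+FF61–FF9F) can likewise be expressed as a constant offset. The sketch below is illustrative only: the function names are hypothetical, and it covers just the katakana half of JIS X 0201, not the full encoding.

```python
# JIS X 0201 katakana bytes 0xA1..0xDF correspond to U+FF61..U+FF9F.
KANA_BYTE_FIRST = 0xA1
KANA_BYTE_LAST = 0xDF
HALFWIDTH_KANA_FIRST = 0xFF61
HALFWIDTH_KANA_LAST = 0xFF9F


def jisx0201_kana_to_unicode(byte: int) -> str:
    """Convert one JIS X 0201 katakana byte to its halfwidth Unicode form."""
    if not KANA_BYTE_FIRST <= byte <= KANA_BYTE_LAST:
        raise ValueError(f"byte 0x{byte:02X} is outside the katakana range")
    return chr(HALFWIDTH_KANA_FIRST + (byte - KANA_BYTE_FIRST))


def unicode_to_jisx0201_kana(ch: str) -> int:
    """Convert one halfwidth katakana character back to its JIS X 0201 byte."""
    cp = ord(ch)
    if not HALFWIDTH_KANA_FIRST <= cp <= HALFWIDTH_KANA_LAST:
        raise ValueError(f"U+{cp:04X} is outside the halfwidth katakana range")
    return KANA_BYTE_FIRST + (cp - HALFWIDTH_KANA_FIRST)
```

Because the mapping is a one-to-one shift, every byte in the range converts to Unicode and back without loss, which is the round-trip property the block is meant to provide.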
==Block==
{{Unicode chart Halfwidth and Fullwidth Forms}}

The block has [[Variant form (Unicode)\|variation sequences]] defined for East Asian punctuation positional variants.<ref>{{cite web\|url=https://www.unicode.org/L2/L2017/17436r-sv-eastsian-punct.pdf\|title=L2/17-436: Proposal to add standardized variation sequences for fullwidth East Asian punctuation\|date=2018-01-21\|first=Ken\|last=Lunde}}</ref><ref name="stdvar">{{cite web\|url=https://www.unicode.org/Public/UNIDATA/StandardizedVariants.txt\|title=Unicode Character Database: Standardized Variation Sequences \| publisher=The Unicode Consortium}}</ref> They use {{sc2\|U+FE00 VARIATION SELECTOR-1}} (VS01) and {{sc2\|U+FE01 VARIATION SELECTOR-2}} (VS02):

{\| class="wikitable nounderlines" style="border-collapse:collapse;background:#FFFFFF;font-size:large;text-align:center"
\|+style="font-size:small" \| Variation sequences for punctuation alignment
\|-style="background:#F8F8F8;font-size:small"
\| style="text-align:right" \| U+ \|\| FF01 \|\| FF0C \|\| FF0E \|\| FF1A \|\| FF1B \|\| FF1F \|\| style="background:#F8F8F8;font-size:small;text-align:left" \| Description
\|-
\| style="background:#F8F8F8;font-size:small;text-align:left" \| base code point \|\| ！ \|\| ， \|\| ． \|\| ： \|\| ； \|\| ？ \|\| style="font-size:small;text-align:left" \|
\|-
\| style="background:#F8F8F8;font-size:small;text-align:left" \| base + VS01 \|\| ！︀ \|\| ，︀ \|\| ．︀ \|\| ：︀ \|\| ；︀ \|\| ？︀ \|\| style="font-size:small;text-align:left" \| corner-justified form
\|-
\| style="background:#F8F8F8;font-size:small;text-align:left" \| base + VS02 \|\| ！︁ \|\| ，︁ \|\| ．︁ \|\| ：︁ \|\| ；︁ \|\| ？︁ \|\| style="font-size:small;text-align:left" \| centered form
\|}

An additional variant is defined for a fullwidth [[slashed zero\|zero with a short diagonal stroke]]: U+FF10 FULLWIDTH DIGIT ZERO, U+FE00 VS01 ({{not a typo\|０︀}}).<ref>{{cite web\|url=https://www.unicode.org/L2/L2015/15268-slashed-zero.pdf\|title=L2/15-268: Proposal to Represent the Slashed Zero Variant of Empty Set\|date=2015-10-30\|first1=Barbara\|last1=Beeton\|first2=Asmus\|last2=Freytag\|first3=Laurențiu\|last3=Iancu\|first4=Murray\|last4=Sargent}}</ref><ref name="stdvar"/>

==History==
The following Unicode-related documents record the purpose and process of defining specific characters in the Halfwidth and Fullwidth Forms block:

{{sticky header}}
{\| class="wikitable collapsible sticky-header"
\|-
! [[Unicode#Versions\|Version]] !! {{nobr\|Final code points<ref group=lower-alpha name=final/>}} !! Count !! [[International Committee for Information Technology Standards\|L2]] ID !! [[ISO/IEC JTC 1/SC 2\|WG2]] ID !! Document
\|-
\| rowspan="9" \| 1.0.0 \|\| rowspan="9" width="180" \| U+FF01..FF5E, FF61..FFBE, FFC2..FFC7, FFCA..FFCF, FFD2..FFD7, FFDA..FFDC, FFE0..FFE6 \|\| rowspan="9" \| 216 \|\| \|\| \|\| (to be determined)
\|-
\| \|\| {{nobr\|[https://www.unicode.org/wg2/docs/n4403.pdf N4403 (pdf],}} [https://www.unicode.org/wg2/docs/n4403.doc doc]) \|\| {{Citation\|title=Unconfirmed minutes of WG 2 meeting 61, Holiday Inn, Vilnius, Lithuania; 2013-06-10/14\|date=2014-01-28\|first=V. S.\|last=Umamaheswaran\|section=Resolution M61.01}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2017/17056-sv-western-vs-eastasian.pdf L2/17-056]}} \|\| \|\| {{Citation\|title=Proposal to add standardized variation sequences\|date=2017-02-13\|first=Ken\|last=Lunde\|author-link=Ken Lunde}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2017/17436r-sv-eastsian-punct.pdf L2/17-436]}} \|\| \|\| {{Citation\|title=Proposal to add standardized variation sequences for fullwidth East Asian punctuation\|date=2018-01-21\|first=Ken\|last=Lunde}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2018/18039-script-adhoc-rec.pdf L2/18-039]}} \|\| \|\| {{Citation\|title=Recommendations to UTC #154 January 2018 on Script Proposals\|date=2018-01-19\|first1=Deborah\|last1=Anderson\|first2=Ken\|last2=Whistler\|first3=Roozbeh\|last3=Pournader\|first4=Lisa\|last4=Moore\|first5=Hai\|last5=Liang\|first6=Richard\|last6=Cook\|section=24. Fullwidth East Asian Punctuation}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2017/17362.htm L2/17-362]}} \|\| \|\| {{Citation\|title=UTC #153 Minutes\|date=2018-02-02\|first=Lisa\|last=Moore\|section=B.4.1 New Proposal to add standardized variation sequence for U+FF10 FULL WIDTH DIGIT ZERO}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2018/18115.htm L2/18-115]}} \|\| \|\| {{Citation\|title=UTC #155 Minutes\|date=2018-05-09\|first=Lisa\|last=Moore\|section=Consensus 154-C17\|quote=Add 16 standardized variation sequences based on L2/17-436R, for Unicode 12.0.}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2019/19055-segment-fullwd-digits.txt L2/19-055]}} \|\| \|\| {{Citation\|title=Proposed Changes in the Segmentation Property Values for Fullwidth Digits\|date=2019-01-14\|first=Laurențiu\|last=Iancu}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2019/19008.htm L2/19-008]}} \|\| \|\| {{Citation\|title=UTC #158 Minutes\|date=2019-02-08\|first=Lisa\|last=Moore\|section=B.11.11.1.2 Proposed changes in the segmentation property values for fullwidth digits}}
\|-
\| 1.1 \|\| U+FFE8..FFEE \|\| 7 \|\| \|\| \|\| (to be determined)
\|-
\| rowspan="11" \| 3.2 \|\| rowspan="11" \| U+FF5F..FF60 \|\| rowspan="11" \| 2 \|\| {{nobr\|[https://www.unicode.org/L2/L1999/99052.htm L2/99-052]}} \|\| \|\| {{Citation\|title=The math pieces from the symbol font\|date=1999-02-05\|first=Asmus\|last=Freytag}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01033-addbrackets.htm L2/01-033]}} \|\| \|\| {{Citation\|title=Disunify braces/brackets for math, computing science, and Z notation from similar-looking CJK braces/brackets\|date=2001-01-16\|first1=Kent\|last1=Karlsson\|first2=Asmus\|last2=Freytag}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01159-N2344-MathAdHoc.pdf L2/01-159]}} \|\| [https://www.unicode.org/wg2/docs/n2344.pdf N2344] \|\| {{Citation\|title=Ad-hoc report on Mathematical Symbols\|date=2001-04-03}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01157-N2345R-brackets.pdf L2/01-157]}} \|\| [https://www.unicode.org/wg2/docs/n2345r.pdf N2345R] \|\| {{Citation\|title=Proposal to disunify certain fencing CJK punctuation marks from similar-looking Math fences\|date=2001-04-04\|first=Kent\|last=Karlsson}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01168-hell.txt L2/01-168]}} \|\| \|\| {{Citation\|title=Bracket Disunification & Normalization Hell\|date=2001-04-10\|first=Ken\|last=Whistler}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01012.htm L2/01-012R]}} \|\| \|\| {{Citation\|title=Minutes UTC #86 in Mountain View, Jan 2001\|date=2001-05-21\|first=Lisa\|last=Moore\|section=Disunifying Braces and Brackets}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01223.htm L2/01-223]}} \|\| \|\| {{Citation\|title=Discussion of Issues Regarding Bracket Disunification\|date=2001-05-23\|first=Michel\|last=Suignard}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01184.htm L2/01-184R]}} \|\| \|\| {{Citation\|title=Minutes from the UTC/L2 meeting\|date=2001-06-18\|first=Lisa\|last=Moore\|section=Motion 87-M21\|quote=Reverse the decision made in motion 86-M6 not to disunify brackets.}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2001/01317-bracket.htm L2/01-317]}} \|\| \|\| {{Citation\|title=Bracket Disunification & Normalization\|date=2001-08-14\|first=Michel\|last=Suignard}}
\|-
\| {{nobr\|[https://www.unicode.org/consortium/utc-minutes/UTC-088-200108.html L2/01-295R]}} \|\| \|\| {{Citation\|title=Minutes from the UTC/L2 meeting #88\|date=2001-11-06\|first=Lisa\|last=Moore\|section=Bracket Disunification and Normalization}}
\|-
\| {{nobr\|[https://www.unicode.org/L2/L2002/02154-n2403-minutes.pdf L2/02-154]}} \|\| [https://www.unicode.org/wg2/docs/n2403.pdf N2403] \|\| {{Citation\|title=Draft minutes of WG 2 meeting 41, Hotel Phoenix, Singapore, 2001-10-15/19\|date=2002-04-22\|first=V. S.\|last=Umamaheswaran\|section=Resolution M41.1}}
\|- class="sortbottom"
\| colspan="6" \| {{reflist\|group=lower-alpha\|refs=<ref name=final>Proposed code points and character names may differ from final code points and names</ref>}}
\|}

== See also ==
* [[CJK Symbols and Punctuation (Unicode block)]]
* [[Hangul Jamo (Unicode block)]]
* [[Katakana (Unicode block)]]
* [[Latin script in Unicode]]
* [[Enclosed Alphanumerics]] - bullet point sequences, some appear as full width (e.g. ⒈,⓵,⑴,⒜,ⓐ)

== References ==
{{Reflist}}

{{Unicode navigation}}

[[Category:Unicode blocks]]
[[Category:Latin-script Unicode blocks]]
[[Category:Kana]]
[[Category:Hangul jamo\|*Halfwidth]]