Halfwidth and fullwidth forms: Difference between revisions

Revision as of 17:30, 25 October 2007 by Liliana-60 (talk \| contribs): Undid revision 156347834 by 4.229.39.224 (talk)
Latest revision as of 03:28, 12 June 2025 by Drmccreedy (talk \| contribs): Undid revision 1295152914 by 27.55.70.178 (talk) Unexplained removal of content (Tag: Undo)
(192 intermediate revisions by more than 100 users not shown)
Line 1:
{{Short description\|Alternative width characters in East Asian typography}}
'''Halfwidth and Fullwidth Forms''' is the name of [[Unicode]] block U+FF00–FFEF, the last of the [[Basic Multilingual Plane]] excepting the short "[[Unicode Specials\|Specials]]" block at U+FFF0–FFFF.
{{For\|the Unicode ~~chart~~ block\|Halfwidth and Fullwidth Forms (Unicode block)}}
[[File:Command Prompt on Windows XP (Korean).png\|thumb\|349px\|A command prompt ([[cmd.exe]]) with Korean localisation, showing halfwidth and fullwidth characters]]
In [[CJK characters\|CJK]] (Chinese, Japanese, and Korean) computing, [[graphic character]]s are traditionally classed into '''fullwidth'''{{efn\|In [[Taiwan]] and [[Hong Kong]]: [[wikt:全形\|全形]]; in CJK: [[wikt:全角\|全角]].}} and '''halfwidth'''{{efn\|In [[Taiwan]] and [[Hong Kong]]: [[wikt:半形\|半形]]; in CJK: [[wikt:半角\|半角]].}} characters. Unlike in [[monospaced font]]s, a halfwidth character occupies half the width of a fullwidth character, hence the name.

''[[Halfwidth and Fullwidth Forms (Unicode block)\|Halfwidth and Fullwidth Forms]]'' is also the name of the [[Unicode block]] U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to and from Unicode.

U+FF01–FF5E reproduce the characters of [[ASCII]] 21 to 7E (hexadecimal) as [[fullwidth forms]] ([[zenkaku]]), that is, as [[monospace]] glyphs with the same width as a fullwidth [[Kanji]]. This is useful for typesetting Latin characters in a [[CJK]] environment.

U+FF00 does not correspond to a fullwidth ASCII 20 (space character), since that role is already fulfilled by U+3000 "ideographic space".

==Rationale==
~~U+FF65–FFDC encode [[halfwidth forms]] ([[hankaku]]) of [[Katakana]] and [[Hangul]] characters. U+FFE0–FFEE are fullwidth and halfwidth symbols.~~
{{More citations needed section\|date=April 2021}}
[[File:Alternative names of JIS X 0213.svg\|thumb\|Characters which appear in both [[JIS X 0201]] (single byte) and [[JIS X 0208]] / [[JIS X 0213]] (double byte) have both a halfwidth and a fullwidth form in [[Shift JIS]].\|class=skin-invert]]
In the days of [[text mode]] computing, Western characters were normally laid out in a grid on the screen, often 80 columns by 24 or 25 lines. Each character was displayed as a small [[dot matrix]], often about 8 [[pixel]]s wide, and an [[SBCS]] (single-byte character set) was generally used to encode characters of Western languages.

For aesthetic reasons and readability, it is preferable for [[Chinese characters]] to be approximately square-shaped, and therefore twice as wide as these fixed-width SBCS characters. As these were typically encoded in a [[double-byte character set\|DBCS]] (double-byte character set), this also meant that their width on screen in a [[duospaced font]] was proportional to their byte length. Some terminals and editing programs could not deal with double-byte characters starting at odd columns, only even ones (some could not even put double-byte and single-byte characters in the same line). The DBCS sets therefore generally also included Roman characters and digits, for use alongside the CJK characters in the same line.

==Chart==
{{Unicode chart Halfwidth and Fullwidth Forms}}

On the other hand, early Japanese computing used a single-byte code page called [[JIS X 0201]] for [[katakana]]. These would be rendered at the same width as the other single-byte characters, making them [[half-width kana]] characters rather than normally proportioned kana. Although the JIS X 0201 standard itself did not specify half-width display for katakana, this became the visually distinguishing feature in [[Shift JIS]] between the single-byte JIS X 0201 and double-byte [[JIS X 0208]] katakana.
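The relationship between byte length and display width described above can be sketched with Python's built-in `shift_jis` codec. This is only an illustration, not a description of how any particular terminal worked: the helper name `sjis_columns` is hypothetical, and the one-column-per-byte rule is the simplifying assumption taken from the duospaced text-mode layout discussed in this section.

```python
# Sketch: in Shift JIS, byte length tracks display width in a duospaced layout.
# Assumption: one screen column per encoded byte, as in the text-mode
# displays described above. `sjis_columns` is a hypothetical helper name.

def sjis_columns(text: str) -> int:
    """Return the columns `text` would occupy, counting one column per Shift JIS byte."""
    return len(text.encode("shift_jis"))

HALFWIDTH_KA = "\uFF76"  # ｶ, single-byte katakana inherited from JIS X 0201
FULLWIDTH_KA = "\u30AB"  # カ, double-byte katakana from JIS X 0208
FULLWIDTH_LATIN_A = "\uFF21"  # Ａ, double-byte Roman letter in the DBCS part

for label, ch in [
    ("halfwidth katakana", HALFWIDTH_KA),
    ("fullwidth katakana", FULLWIDTH_KA),
    ("ASCII letter", "A"),
    ("fullwidth Latin letter", FULLWIDTH_LATIN_A),
]:
    print(f"{label}: {ch!r} -> {sjis_columns(ch)} column(s)")
```

Under this assumption the single-byte katakana and the ASCII letter take one column each, while their double-byte counterparts take two, which is the distinction Unicode later had to preserve.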
Some IBM code pages used a similar treatment for [[Hangul#Letters\|Korean jamo]],<ref name="ibm933">{{cite web \|url=http://demo.icu-project.org/icu-bin/convexp?conv=ibm-933 \|title=ICU Demonstration - Converter Explorer \|website=demo.icu-project.org \|access-date=7 May 2018}}</ref> based on the [[KS C 5601#1974\|N-byte Hangul code]] and its [[EBCDIC]] translation.

==In Unicode==
{{see also\|Halfwidth and Fullwidth Forms (Unicode block)}}
For compatibility with existing character sets that contained both half- and fullwidth versions of the same character, [[Unicode]] allocated a single block at U+FF00–FFEF containing the necessary "alternative width" characters. This includes a fullwidth version of all the [[ASCII]] characters and some non-ASCII punctuation such as the Yen sign, halfwidth versions of katakana and [[hangul]], and halfwidth versions of some other symbols such as circles. Only characters needed for lossless round trip to existing character sets were allocated, rather than (for instance) making a fullwidth version of every Latin accented character.

Unicode assigns ''every'' code point an "East Asian width" [[Unicode character property\|property]]. This may be:<ref name="uax11">{{cite web \|url=https://unicode.org/reports/tr11/ \|title=Unicode® Standard Annex #11: East Asian Width \|last1=Lunde \|first1=Ken \|author-link=Ken Lunde \|publisher=[[Unicode Consortium]] \|date=2019-01-25}}</ref>

{\|class=wikitable
\|+Unicode character properties based on width
\|-
!scope="col"\|Abbreviation
!scope="col"\|Name
!scope="col"\|Description
\|-
!scope="row"\|W
\|Wide\|\|Naturally wide character, e.g. [[Hiragana]].
\|-
!scope="row"\|Na
\|Narrow\|\|Naturally narrow character, e.g. [[ISO Basic Latin alphabet]].
\|-
!scope="row"\|F
\|Fullwidth\|\|Wide variant with [[NFKC\|compatibility normalisation]] to naturally narrow character, e.g. fullwidth Latin script.
\|-
!scope="row"\|H
\|Halfwidth\|\|Narrow variant with [[NFKC\|compatibility normalisation]] to naturally wide character, e.g. [[half-width kana]]. Includes U+20A9 ([[won sign\|₩]]) as an exception.
\|-
!scope="row"\|A
\|Ambiguous\|\|Characters included in East Asian DBCS codes but also in European SBCS codes, e.g. [[Greek alphabet]]. Duospaced behaviour can consequently vary.
\|-
!scope="row"\|N
\|Neutral\|\|Characters which do not appear in East Asian DBCS codes, e.g. [[Devanagari]].
\|}

[[Terminal emulator]]s can use this property to decide whether a character should consume one or two "columns" when figuring out tabs and cursor position.

==In OpenType==
[[OpenType]] has the <code>fwid</code>, <code>halt</code>, <code>hwid</code>, and <code>vhal</code> feature tags for selecting fullwidth or halfwidth forms of a character. [[CSS]] provides control over these features using the <code>font-variant-east-asian</code> and <code>font-feature-settings</code> properties.<ref>{{cite web \|url=https://helpx.adobe.com/fonts/using/open-type-syntax.html \|title=Syntax for OpenType features in CSS \|publisher=[[Adobe Inc.\|Adobe]] \|access-date=2023-09-20}}</ref>

==See also==
* [[CJK Symbols and Punctuation (Unicode block)\|East Asian punctuation]]
* [[CJK]]
* [[Em size]] – full width forms
* [[Han unification]]
* [[Enclosed Alphanumerics]] – bullet point sequences; some appear as fullwidth (e.g. ⒈, ⓵, ⑴, ⒜, ⓐ)
* [[Monospace]]
* [[~~East Asian~~Han ~~Punctuation~~unification]]
* [[Hangul Jamo (Unicode block)]]
* [[Katakana (Unicode block)]]
* [[Latin script in Unicode]]

==~~Chart~~Notes==
{{Notelist}}

==References==
{{Reflist}}

==External links==
* [https://www.unicode.org/reports/tr11/tr11-31.html East Asian Width] Unicode Standard Annex #11
* http://www.alanwood.net/unicode/halfwidth_and_fullwidth_forms.html
* http://everything2.com/index.pl?node=Halfwidth%20and%20Fullwidth%20Forms

{{Unicode navigation}}

[[Category:~~Unicode~~East Asian typography]]
[[Category:Kana]]
[[Category:Hangul jamo\|*Halfwidth]]
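As a rough illustration of the East Asian Width property and the fullwidth ASCII range described in the article text, the following Python sketch uses the standard `unicodedata` module. The function names `display_columns` and `to_fullwidth_ascii` are hypothetical, and the column rule is a simplification: it ignores combining marks and control characters, and whether Ambiguous characters count as wide is left as a caller's choice, since the article notes that their duospaced behaviour varies.

```python
import unicodedata

def display_columns(text: str, ambiguous_wide: bool = False) -> int:
    """Estimate terminal columns: 2 for Wide/Fullwidth characters, 1 otherwise.

    Simplified sketch; combining marks and control characters are not handled.
    """
    columns = 0
    for ch in text:
        width = unicodedata.east_asian_width(ch)  # "W", "Na", "F", "H", "A" or "N"
        if width in ("W", "F") or (width == "A" and ambiguous_wide):
            columns += 2
        else:
            columns += 1
    return columns

def to_fullwidth_ascii(text: str) -> str:
    """Map ASCII 0x21-0x7E onto U+FF01-FF5E; the space maps to U+3000."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")  # ideographic space, not U+FF00
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + 0xFEE0))  # fixed offset between the two ranges
        else:
            out.append(ch)
    return "".join(out)

wide = to_fullwidth_ascii("A1!")
print(wide, display_columns(wide))
# Compatibility normalisation (NFKC) folds the fullwidth forms back to ASCII.
print(unicodedata.normalize("NFKC", wide))
```

The fixed offset works only because U+FF01–FF5E mirror ASCII 21 to 7E in order; other characters in the block, such as the halfwidth katakana, have no such arithmetic relationship to their counterparts and are better handled through NFKC normalisation.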