Basic Latin (Unicode block): Difference between revisions

Undid revision 1279543928 by 2600:6C5D:57F0:1F0:7509:4501:D19C:1EFB (talk) revert vandalism
 
(25 intermediate revisions by 14 users not shown)
Line 1:
{{for|a list of all Latin characters encoded in Unicode|Latin script in Unicode}}
{{also|Latin-1 Supplement|l1=C1 Controls and Latin-1 Supplement (Unicode block)}}
{{Infobox Unicode block
|blockname = Basic Latin<br/>{{nobold|1=''or''}}<br/>C0 Controls and Basic Latin
Line 12 ⟶ 10:
|controls = 33
|sources = [[ISO/IEC 8859]], [[ISO 646]]
|note = <ref>{{cite web|url=https://www.unicode.org/ucd/|title=Unicode character database|work=The Unicode Standard|accessdate=2023-07-26}}</ref><ref>{{cite web|url=https://www.unicode.org/versions/enumeratedversions.html|title=Enumerated Versions of The Unicode Standard|work=The Unicode Standard|accessdate=2023-07-26}}</ref>
|codechart = https://www.unicode.org/charts/PDF/U0000.pdf
}}
 
The '''Basic Latin''' [[Unicode block]],<ref>{{cite web|url=https://www.unicode.org/Public/UCD/latest/ucd/Blocks.txt|title=block.txt|accessdate=2023-03-23|publisher=The Unicode Consortium}}</ref> sometimes informally called '''C0 Controls and Basic Latin''',<ref>{{cite web|url=https://www.unicode.org/charts/PDF/U0000.pdf|title=C0 Controls and Basic Latin|work=The Unicode Standard, Version 15.0|publisher=[[Unicode Consortium|Unicode, Inc.]]|year=2022|access-date=March 22, 2023}}</ref> is the first block of the [[Unicode]] standard, and the only block whose characters are each encoded in one byte in [[UTF-8]]. The block contains all the [[ISO basic Latin alphabet|letters]] and [[ASCII control character|control codes]] of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters, and includes the [[C0 controls]], ASCII [[punctuation]] and [[symbol]]s, [[ASCII]] [[numerical digit|digits]], the [[uppercase]] and [[lowercase]] letters of the [[English alphabet]], and one further [[control character]], Delete.
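
The one-byte property can be checked mechanically. The following Python sketch is illustrative only and is not taken from the Unicode Standard; the names <code>BASIC_LATIN</code> and <code>utf8_length</code> are hypothetical, chosen for this example.

```python
# Illustrative sketch: every Basic Latin code point (U+0000..U+007F)
# should encode to exactly one byte in UTF-8.

BASIC_LATIN = range(0x0000, 0x0080)  # the 128 code points of the block


def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for one code point (hypothetical helper)."""
    return len(chr(code_point).encode("utf-8"))


all_one_byte = all(utf8_length(cp) == 1 for cp in BASIC_LATIN)

print(len(BASIC_LATIN), all_one_byte)  # 128 True
print(utf8_length(0x0080))             # 2 (first code point after the block)
```

The second line of output shows that U+0080, the first code point of the following block, already needs two bytes.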
 
The Basic Latin block was included in its present form from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.<ref name=Unicode1.0>{{cite book|title=The Unicode Standard Version 1.0, Volume 1|year=1990|publisher=Addison-Wesley Publishing Company, Inc.|isbn=0-201-56788-1}}</ref> Its block name in Unicode 1.0 was '''ASCII'''.<ref>{{cite web |url=https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf |work=The Unicode Standard |version=version 1.0 |title=3.8: Block-by-Block Charts |publisher=[[Unicode Consortium]]}}</ref>
Line 496 ⟶ 493:
|U+005B
|&#91;
|[[Bracket#Square brackets|Left Square Bracket]]
|
|-
Line 506 ⟶ 503:
|U+005D
|&#93;
|[[Bracket#Square brackets|Right Square Bracket]]
|
|-
Line 660 ⟶ 657:
|U+007B
|{
|[[Bracket#Curly brackets or braces .7B .7D|Left Curly Bracket]]
|
|-
Line 670 ⟶ 667:
|U+007D
| }
|[[Bracket#Curly brackets or braces .7B .7D|Right Curly Bracket]]
|
|-
Line 701 ⟶ 698:
 
===C0 controls===
The [[C0 and C1 control codes|C0 Controls]], referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. The alias names for the C0 controls are taken from the [[ISO/IEC 6429|ISO/IEC 6429:1992]] standard.<ref name=charts />
 
===ASCII punctuation and symbols===
Line 716 ⟶ 713:
 
===Control character===
The Control Character subheading contains the [[Delete character|"Delete" character]].<ref name=charts />
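
The figure of 33 controls given in the infobox (the 32 C0 controls plus Delete) can be reproduced from the General Category property. The sketch below is a minimal illustration that assumes the Unicode data shipped with Python's standard <code>unicodedata</code> module; the variable names are hypothetical.

```python
import unicodedata
from collections import Counter

# Tally the General Category of each code point in U+0000..U+007F.
counts = Counter(unicodedata.category(chr(cp)) for cp in range(0x80))

controls = counts["Cc"]                      # C0 controls plus U+007F DELETE
letters = counts["Lu"] + counts["Ll"]        # uppercase and lowercase letters
digits = counts["Nd"]                        # ASCII digits
others = 0x80 - controls - letters - digits  # space, punctuation and symbols

print(controls, letters, digits, others)     # 33 52 10 33
```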
 
==Number of symbols, letters and control codes==
Line 744 ⟶ 741:
A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1 (0&#xfe00;).<ref>{{cite web|url=https://www.unicode.org/L2/L2015/15268-slashed-zero.pdf|title=L2/15-268: Proposal to Represent the Slashed Zero Variant of Empty Set|date=2015-10-30|first1=Barbara|last1=Beeton|first2=Asmus|last2=Freytag|first3=Laurențiu|last3=Iancu|first4=Murray|last4=Sargent}}</ref><ref name="uts51"/>
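
As a small illustration, the sequence consists of two code points, the base digit followed by the variation selector. The Python sketch below only inspects the sequence; whether the short-stroke glyph is actually displayed depends on font support, and the variable name <code>slashed_zero</code> is hypothetical.

```python
import unicodedata

# U+0030 DIGIT ZERO followed by U+FE00 (VS1), as described above.
slashed_zero = "\u0030\uFE00"

for ch in slashed_zero:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
# U+0030 DIGIT ZERO
# U+FE00 VARIATION SELECTOR-1
```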
 
Twelve characters (#, *, and the digits) can be followed by U+FE0E VS15 or U+FE0F VS16 to create [[emoji]] variants.<ref>{{cite web|url=https://www.unicode.org/L2/L2011/11438-emoji-var.pdf|title=L2/11-438: Emoji Variation Sequences (Revision of L2/11-429)|date=2011-12-22|first=Peter|last=Edberg}}</ref><ref>{{cite web|url=https://www.unicode.org/L2/L2015/15301-emoji-sequences.pdf|title=L2/15-301: A proposal for 278 standardized variation sequences for emoji|date=2015-11-01|first=Roozbeh|last=Pournader}}</ref><ref name="UTR51">{{Cite web|url=http://unicode.org/reports/tr51/|title=UTR #51: Unicode Emoji|publisher=Unicode Consortium|date=2023-09-05}}</ref><ref name="EmojiData">{{Cite web|url=https://unicode.org/Public/UNIDATA/emoji/emoji-data.txt|title=UCD: Emoji Data for UTR #51|publisher=Unicode Consortium|date=2023-02-01}}</ref>
They are [[keycap]] base characters, for example #️⃣ (U+0023 NUMBER SIGN U+FE0F VS16 U+20E3 COMBINING ENCLOSING KEYCAP). The VS15 version is "text presentation" while the VS16 version is "emoji-style".<ref name="uts51">{{Cite web|url=https://unicode.org/Public/UNIDATA/emoji/emoji-variation-sequences.txt|title=UTS #51 Emoji Variation Sequences | publisher=The Unicode Consortium}}</ref>
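
The sequences described above can be assembled by plain string concatenation. The following Python sketch is illustrative; the constant names and the helper <code>keycap</code> are hypothetical, and how each sequence is rendered depends on the font and platform.

```python
VS15 = "\uFE0E"    # variation selector requesting text presentation
VS16 = "\uFE0F"    # variation selector requesting emoji presentation
KEYCAP = "\u20E3"  # COMBINING ENCLOSING KEYCAP

BASES = "#*0123456789"  # the twelve Basic Latin base characters


def keycap(base: str) -> str:
    """Build base + VS16 + enclosing keycap (hypothetical helper)."""
    return base + VS16 + KEYCAP


text_style = [b + VS15 for b in BASES]   # text-presentation sequences
emoji_style = [b + VS16 for b in BASES]  # emoji-presentation sequences
keycaps = [keycap(b) for b in BASES]     # full keycap sequences

print(len(BASES))                                # 12
print([f"U+{ord(c):04X}" for c in keycap("#")])  # ['U+0023', 'U+FE0F', 'U+20E3']
```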
 
Line 762 ⟶ 759:
The following Unicode-related documents record the purpose and process of defining specific characters in the Basic Latin block:
 
{{sticky header}}
{| class="wikitable sticky-header"
|-
! [[Unicode#Versions|Version]] !! {{nobr|Final code points<ref group=lower-alpha name=final/>}} !! Count !! [[Unicode Consortium|UTC]]&nbsp;ID !! [[International Committee for Information Technology Standards|L2]]&nbsp;ID !! [[ISO/IEC JTC 1/SC 2|WG2]]&nbsp;ID !! Document
Line 811 ⟶ 809:
==See also==
{{portal|Internet|Language}}
*[[Latin script in Unicode]]
*[[Character set]]
*[[Latin-1 Supplement]]
*[[Character encoding]]
*[[ISO/IEC 8859-1]]
*[[Latin script]]
*[[ISO basic Latin alphabet]]
*[[List of Basic Latin characters]] in [[English language|English]], [[German language|German]], [[French language|French]], [[Spanish language|Spanish]], and [[Latin language|Latin]]
{{clear}}
 
==References==
Line 821 ⟶ 820:
 
==External links==
{{Spoken Wikipedia|date=2023-11-08|En-Basic Latin (Unicode block)-article.ogg}}
{{sister project links|Unicode}}
{{Unicode navigation}}