Basic Latin (Unicode block): Difference between revisions

mNo edit summary
Undid revision 1279543928 by 2600:6C5D:57F0:1F0:7509:4501:D19C:1EFB (talk) revert vandalism
 
(43 intermediate revisions by 19 users not shown)
Line 1:
{{for|a list of all Latin characters encoded in Unicode|Latin script in Unicode}}
{{also|Latin-1 Supplement|l1=C1 Controls and Latin-1 Supplement (Unicode block)}}
{{Infobox Unicode block
|blockname = Basic Latin<br/>{{nobold|1=''or''}}<br/>C0 Controls and Basic Latin
|rangestart = 0000
|rangeend = 007F
|script1 = {{nowrap|[[Latin script|Latin]] (52 characters)}}
|script2 = {{nowrap|[[Script (Unicode)#Special script property values|Common]] (76 characters)}}
|symbols = [[Arabic numerals]]<br />[[Punctuation]]
|alphabets = [[English language|English]]<br />[[French language|French]]<br />[[German language|German]]<br />[[Spanish language|Spanish]]<br />[[Vietnamese language|Vietnamese]]
Line 12 ⟶ 10:
|controls = 33
|sources = [[ISO/IEC 8859]], [[ISO 646]]
|note = <ref>{{cite web|url=https://www.unicode.org/ucd/|title=Unicode character database|work=The Unicode Standard|accessdate=2023-07-26}}</ref><ref>{{cite web|url=https://www.unicode.org/versions/enumeratedversions.html|title=Enumerated Versions of The Unicode Standard|work=The Unicode Standard|accessdate=2023-07-26}}</ref>
|codechart = https://www.unicode.org/charts/PDF/U0000.pdf
}}
 
The '''Basic Latin''' [[Unicode block]],<ref>{{cite web|url=https://www.unicode.org/Public/UCD/latest/ucd/Blocks.txt|title=block.txt|accessdate=2023-03-23|publisher=The Unicode Consortium}}</ref> sometimes informally called '''C0 Controls and Basic Latin''',<ref>{{cite web|url=https://www.unicode.org/charts/PDF/U0000.pdf|title=C0 Controls and Basic Latin|work=The Unicode Standard, Version 15.0|publisher=[[Unicode Consortium|Unicode, Inc.]]|year=2022|access-date=March 22, 2023}}</ref> is the first block of the [[Unicode]] standard, and the only block which is encoded in one byte in [[UTF-8]]. The block contains all the [[ISO basic Latin alphabet|letters]] and [[ASCII control character|control codes]] of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the [[C0 controls]], ASCII [[punctuation]] and [[symbol]]s, [[ASCII]] [[numerical digit|digits]], the [[uppercase]] and [[lowercase]] letters of the [[English alphabet]], and one further [[control character]], Delete (U+007F).
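
The single-byte property can be checked directly. The following is a minimal Python sketch, not part of the Unicode Standard; the names <code>BLOCK_START</code>, <code>BLOCK_END</code> and <code>utf8_length</code> are illustrative assumptions. It encodes every code point of the block in UTF-8 and compares the result with the first code point after the block, U+0080.

```python
# Illustrative sketch: the constant and function names here are assumptions,
# not identifiers defined by the Unicode Standard.
BLOCK_START, BLOCK_END = 0x0000, 0x007F


def utf8_length(code_point: int) -> int:
    """Return the number of bytes used to encode one code point in UTF-8."""
    return len(chr(code_point).encode("utf-8"))


# Collect every code point of the block that fits in a single UTF-8 byte.
single_byte = [
    cp for cp in range(BLOCK_START, BLOCK_END + 1) if utf8_length(cp) == 1
]

print(len(single_byte))     # 128: the whole block
print(utf8_length(0x0080))  # 2: the first code point outside the block
```

Within the block, the single UTF-8 byte has the same value as the code point, which is why ASCII text is also valid UTF-8.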
 
The Basic Latin block was included in its present form from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.<ref name=Unicode1.0>{{cite book|title=The Unicode Standard Version 1.0, Volume 1|year=1990|publisher=Addison-Wesley Publishing Company, Inc.|isbn=0-201-56788-1}}</ref> Its block name in Unicode 1.0 was '''ASCII'''.<ref>{{cite web |url=https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf |work=The Unicode Standard |version=version 1.0 |title=3.8: Block-by-Block Charts |publisher=[[Unicode Consortium]]}}</ref>
Line 496 ⟶ 493:
|U+005B
|&#91;
|[[Bracket#Square brackets|Left Square Bracket]]
|
|-
Line 506 ⟶ 503:
|U+005D
|&#93;
|[[Bracket#Square brackets|Right Square Bracket]]
|
|-
Line 660 ⟶ 657:
|U+007B
|{
|[[Bracket#Curly brackets or braces .7B .7D|Left Curly Bracket]]
|
|-
Line 670 ⟶ 667:
|U+007D
| }
|[[Bracket#Curly brackets or braces .7B .7D|Right Curly Bracket]]
|
|-
Line 681 ⟶ 678:
|-
| U+007F
|
| [[Delete character|Delete]]
| DEL
Line 701 ⟶ 698:
 
===C0 controls===
The [[C0 and C1 control codes|C0 Controls]], referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. The alias names for C0 controls are taken from the [[ISO/IEC 6429|ISO/IEC 6429:1992]] standard.<ref name=charts />
 
===ASCII punctuation and symbols===
Line 716 ⟶ 713:
 
===Control character===
The Control Character subheading contains the [[Delete character|"Delete" character]].<ref name=charts />
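
The character counts for the block can be reproduced from the general categories in the Unicode Character Database. The sketch below assumes Python's standard <code>unicodedata</code> module; the variable names are illustrative. It tallies the category of each code point from U+0000 to U+007F: the 33 control codes are the 32 C0 controls (U+0000 to U+001F) plus Delete, and the 52 letters are the 26 uppercase and 26 lowercase Latin letters.

```python
import unicodedata
from collections import Counter

# Tally the Unicode general category of every code point in U+0000..U+007F.
counts = Counter(unicodedata.category(chr(cp)) for cp in range(0x0000, 0x0080))

controls = counts["Cc"]               # C0 controls plus Delete (U+007F)
letters = counts["Lu"] + counts["Ll"]  # uppercase plus lowercase letters
digits = counts["Nd"]                  # decimal digits 0-9

print(controls, letters, digits)  # 33 52 10
```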
 
==Number of symbols, letters and control codes==
Line 736 ⟶ 733:
|}
 
==Chart==
{{Unicode chart C0 Controls and Basic Latin}}
 
Line 744 ⟶ 741:
A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1 (0&#xfe00;).<ref>{{cite web|url=https://www.unicode.org/L2/L2015/15268-slashed-zero.pdf|title=L2/15-268: Proposal to Represent the Slashed Zero Variant of Empty Set|date=2015-10-30|first1=Barbara|last1=Beeton|first2=Asmus|last2=Freytag|first3=Laurențiu|last3=Iancu|first4=Murray|last4=Sargent}}</ref><ref name="uts51"/>
 
Twelve characters (#, *, and the digits) can be followed by U+FE0E VS15 or U+FE0F VS16 to create [[emoji]] variants.<ref>{{cite web|url=https://www.unicode.org/L2/L2011/11438-emoji-var.pdf|title=L2/11-438: Emoji Variation Sequences (Revision of L2/11-429)|date=2011-12-22|first=Peter|last=Edberg}}</ref><ref>{{cite web|url=https://www.unicode.org/L2/L2015/15301-emoji-sequences.pdf|title=L2/15-301: A proposal for 278 standardized variation sequences for emoji|date=2015-11-01|first=Roozbeh|last=Pournader}}</ref><ref name="UTR51">{{Cite web|url=http://unicode.org/reports/tr51/|title=UTR #51: Unicode Emoji|publisher=Unicode Consortium|date=2023-09-05}}</ref><ref name="EmojiData">{{Cite web|url=https://unicode.org/Public/UNIDATA/emoji/emoji-data.txt|title=UCD: Emoji Data for UTR #51|publisher=Unicode Consortium|date=2023-02-01}}</ref>
They are [[keycap]] base characters, for example #️⃣ (U+0023 NUMBER SIGN U+FE0F VS16 U+20E3 COMBINING ENCLOSING KEYCAP). The VS15 version is "text presentation" while the VS16 version is "emoji-style".<ref name="uts51">{{Cite web|url=https://unicode.org/Public/UNIDATA/emoji/emoji-variation-sequences.txt|title=UTS #51 Emoji Variation Sequences | publisher=The Unicode Consortium}}</ref>
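
As an illustration of how these sequences are composed, the following Python sketch builds them for the twelve base characters. The constant and function names (<code>VS15</code>, <code>VS16</code>, <code>KEYCAP</code>, <code>keycap_emoji</code>, <code>text_style</code>) are hypothetical helpers chosen for this example, and whether a sequence is displayed as a keycap depends on the font and renderer.

```python
# Hypothetical helper names; only the code point values come from the text.
VS15 = "\uFE0E"    # VARIATION SELECTOR-15: text presentation
VS16 = "\uFE0F"    # VARIATION SELECTOR-16: emoji presentation
KEYCAP = "\u20E3"  # COMBINING ENCLOSING KEYCAP
BASES = tuple("#*0123456789")  # the twelve Basic Latin base characters


def text_style(base: str) -> str:
    """Return the base character followed by VS15 (text presentation)."""
    if base not in BASES:
        raise ValueError(f"{base!r} is not a keycap base character")
    return base + VS15


def keycap_emoji(base: str) -> str:
    """Return base + VS16 + COMBINING ENCLOSING KEYCAP."""
    if base not in BASES:
        raise ValueError(f"{base!r} is not a keycap base character")
    return base + VS16 + KEYCAP


# Show the code points of the number-sign keycap sequence.
print([f"U+{ord(ch):04X}" for ch in keycap_emoji("#")])
# ['U+0023', 'U+FE0F', 'U+20E3']
```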
 
Line 762 ⟶ 759:
The following Unicode-related documents record the purpose and process of defining specific characters in the Basic Latin block:
 
{{sticky header}}
{| class="wikitable sticky-header"
|-
! [[Unicode#Versions|Version]] !! {{nobr|Final code points<ref group=lower-alpha name=final/>}} !! Count !! [[Unicode Consortium|UTC]]&nbsp;ID !! [[International Committee for Information Technology Standards|L2]]&nbsp;ID !! [[ISO/IEC JTC 1/SC 2|WG2]]&nbsp;ID !! Document
|-
| rowspan="18" | 1.0.0 || rowspan="18" | U+0000..007F || rowspan="18" | 128 || || || || (to be determined)
|-
| {{nobr|[https://www.unicode.org/L2/L1999-UTC/u1999-013.htm UTC/1999-013]}} || || || {{Citation|title=Tildes and micro sign decompositions|date=1999-05-27|first=Kent|last=Karlsson}}
Line 776 ⟶ 774:
| || {{nobr|[https://www.unicode.org/L2/L2004/04202-slash-c-feedback.txt L2/04-202]}} || || {{Citation|title=Slashed C Feedback|date=2004-06-07|first=Deborah|last=Anderson}}
|-
| || || [https://www.unicode.org/wg2/docs/n3046.pdf N3046] || {{Citation|title=Improving formal definition for control characters |date=2006-02-22|first=Michel|last=Suignard}}
|-
| || || {{nobr|[https://www.unicode.org/wg2/docs/n3103.pdf N3103 (pdf],}} [https://www.unicode.org/wg2/docs/n3103.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27|date=2006-08-25|first=V. S.|last=Umamaheswaran|section=M48.33}}
Line 796 ⟶ 794:
| || {{nobr|[https://www.unicode.org/L2/L2015/15254.htm L2/15-254]}} || || {{Citation|title=UTC #145 Minutes|date=2015-11-16|first=Lisa|last=Moore|section=B.12.1.2 Proposal to Represent the Slashed Zero Variant of Empty Set}}
|-
| || {{nobr|[https://www.unicode.org/L2/L2017/17294-fullwidth-slashed-zero.pdf L2/17-294]}} || [https://www.unicode.org/wg2/docs/n4914-17294-fullwidth-slashed-zero.pdf N4914] || {{Citation|title=Proposal to add standardized variation sequence for U+FF10 FULLWIDTH DIGIT ZERO|date=2017-08-14|first=Ken|last=Lunde|author-link=Ken Lunde}}
|-
| || {{nobr|[https://www.unicode.org/L2/L2022/22019-utc170-properties-recs.pdf L2/22-019]}} || || {{Citation|title=UTC #170 properties feedback & recommendations|date=2022-01-19|first1=Markus|last1=Scherer|display-authors=etal|section=F.2 F4: U+0019 in ISO vs. NameAliases.txt vs. chart/NamesList.txt}}
|-
| || {{nobr|[https://www.unicode.org/L2/L2022/22016.htm L2/22-016]}} || || {{Citation|title=UTC #170 Minutes|date=2022-04-21|first=Peter|last=Constable|section=Consensus 170-C24|quote=For U+0019, add a Name alias "EM" of type abbreviation, for Unicode version 15.0.}}
|- class="sortbottom"
| colspan="7" | {{reflist|group=lower-alpha|refs=
<ref name=final>Proposed code points and character names may differ from final code points and names</ref>
<ref name=also10458>See also [https://www.unicode.org/L2/L2010/10458-emoji-var.pdf L2/10-458], [https://www.unicode.org/L2/L2011/11414-emoji-var-seq.pdf L2/11-414], [https://www.unicode.org/L2/L2011/11415-unified-emoji-ref.pdf L2/11-415], and [https://www.unicode.org/L2/L2011/11429-emoji-var-seq-list.pdf L2/11-429]</ref>
<ref name=emojidocs>Refer to the [[Miscellaneous Symbols and Pictographs#History|history section]] of the Miscellaneous Symbols and Pictographs block for additional emoji-related documents</ref>
<ref name=also15198>See also [https://www.unicode.org/L2/L2015/15198-varseq-text-emoji.pdf L2/15-198] and [https://www.unicode.org/L2/L2015/15275-more-var-seqs-for-text-vs-emoji.pdf L2/15-275]</ref>}}
|}
 
==See also==
{{portal|Internet|Language}}
* [[Character set]]
* [[Latin script in Unicode]]
*[[Latin-1 Supplement]]
* [[Character encoding]]
*[[ISO/IEC 8859-1]]
*[[Latin script]]
*[[ISO basic Latin alphabet]]
 
==References==
<references/>
 
==External links==
{{Spoken Wikipedia|date=2023-11-08|En-Basic Latin (Unicode block)-article.ogg}}
{{sister project links|Unicode}}
{{Unicode navigation}}
 
{{authority control}}
 
[[Category:Latin-script Unicode blocks]]