Basic Latin (Unicode block): Difference between revisions

Revision as of 03:59, 26 February 2022 by Theknightwho (minor edit, no edit summary) ⟶ Latest revision as of 04:30, 9 March 2025 by Drmccreedy (edit summary: Undid revision 1279543928 by 2600:6C5D:57F0:1F0:7509:4501:D19C:1EFB (talk); revert vandalism; tag: Undo)
(43 intermediate revisions by 19 users not shown)
Line 1: ~~{{for\|a list of all Latin characters encoded in Unicode\|Latin script in Unicode}}~~ ~~{{also\|Latin-1 Supplement\|l1=C1 Controls and Latin-1 Supplement (Unicode block)}}~~ {{Infobox Unicode block \|blockname = [[Basic Latin<br/>{{nobold\|1=''or''}}<br/>C0 Controls]] and ~~[[ISO Basic Latin alphabet\|~~Basic Latin]] \|rangestart = 0000 \|rangeend = 007F \|script1 = {{nowrap\|[[Latin script\|Latin]] (52 ~~char.~~characters)}} \|script2 = {{nowrap\|[[Script (Unicode)#Special script property values\|Common]] (76 ~~char.~~characters)}} \|symbols = [[Arabic numerals]]<br />[[Punctuation]] \|alphabets = [[English language\|English]]<br />[[French language\|French]]<br />[[German language\|German]]<br />[[Spanish language\|Spanish]]<br />[[Vietnamese language\|Vietnamese]] Line 12 ⟶ 10: \|controls = 33 \|sources = [[ISO/IEC 8859]], [[ISO 646]] \|note = <ref>{{cite web\|url=https://www.unicode.org/ucd/\|title=Unicode character database\|work=The Unicode Standard\|accessdate=~~2016-07-09~~2023-07-26}}</ref><ref>{{cite web\|url=https://www.unicode.org/versions/enumeratedversions.html\|title=Enumerated Versions of The Unicode Standard\|work=The Unicode Standard\|accessdate=~~2016-07-09~~2023-07-26}}</ref> ▼ ~~\|codechart = https://www.unicode.org/charts/PDF/U0000.pdf~~ ▲\|note = <ref>{{cite web\|url=https://www.unicode.org\|title=Unicode character database\|work=The Unicode Standard\|accessdate=2016-07-09}}</ref><ref>{{cite web\|url=https://www.unicode.org/versions/enumeratedversions.html\|title=Enumerated Versions of The Unicode Standard\|work=The Unicode Standard\|accessdate=2016-07-09}}</ref> }} The '''Basic Latin''' or[[Unicode block]],<ref>{{cite web\|url=https://www.unicode.org/Public/UCD/latest/ucd/Blocks.txt\|title=block.txt\|accessdate=2023-03-23\|publisher=The Unicode Consortium}}</ref> sometimes informally called '''C0 Controls and Basic Latin''',<ref>{{cite web\|url=https://www.unicode.org/charts/PDF/U0000.pdf\|title=C0 Controls and Basic Latin\|work=The 
Unicode Standard, Version 15.0\|publisher=[[Unicode ~~block~~Consortium\|Unicode, Inc.]]\|year=2022\|access-date=March 22, 2023}}</ref> is the first block of the [[Unicode]] standard, and the only block which is encoded in one byte in [[UTF-8]]. The block contains all the [[ISO basic Latin alphabet\|letters]] and [[ASCII control character\|control codes]] of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the [[C0 controls]], ASCII [[punctuation]] and [[symbol]]s, [[ASCII]] [[numerical digit\|digits]], both the [[uppercase]] and [[lowercase]] of the [[English alphabet]] and a [[control character]]. The Basic Latin block was included in its present form from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.<ref name=Unicode1.0>{{cite book\|title=The Unicode Standard Version 1.0, Volume 1\|year=1990\|publisher=Addison-Wesley Publishing Company, Inc.\|isbn=0-201-56788-1}}</ref> Its block name in Unicode 1.0 was '''ASCII'''.<ref>{{cite web \|url=https://www.unicode.org/versions/Unicode1.0.0/CodeCharts2.pdf \|work=The Unicode Standard \|version=version 1.0 \|title=3.8: Block-by-Block Charts \|publisher=[[Unicode Consortium]]}}</ref> Line 496 ⟶ 493: \|U+005B \|[ \|[[Bracket#~~Box~~Square brackets ~~or square brackets .5B .5D~~\|Left Square Bracket]] \| \|- Line 506 ⟶ 503: \|U+005D \|] \|[[Bracket#~~Box~~Square brackets ~~or square brackets .5B .5D~~\|Right Square Bracket]] \| \|- Line 660 ⟶ 657: \|U+007B \|{ \|[[Bracket#Curly brackets ~~or braces .7B .7D~~\|Left Curly Bracket]] \| \|- Line 670 ⟶ 667: \|U+007D \| } \|[[Bracket#Curly brackets ~~or braces .7B .7D~~\|Right Curly Bracket]] \| \|- Line 681 ⟶ 678: \|- \| U+007F \| ␡ \| [[Delete character\|Delete]] \| DEL Line 701 ⟶ 698: ===C0 controls=== The [[C0 and C1 control codes\|C0 Controls]], referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. 
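The range and counts stated for the block (U+0000 to U+007F, 128 code points, one byte each in UTF-8, 33 control codes, 52 Latin letters) can be checked with a short script. This is only an illustrative sketch in Python, not part of the article's sources; the helper names `utf8_lengths` and `count_by_category` are hypothetical, and the counts rely on the general categories reported by Python's bundled `unicodedata` module.

```python
import unicodedata

# Block range as given in the article text.
BLOCK_START, BLOCK_END = 0x0000, 0x007F


def utf8_lengths(start, end):
    """Return the set of UTF-8 byte lengths seen for code points start..end."""
    return {len(chr(cp).encode("utf-8")) for cp in range(start, end + 1)}


def count_by_category(start, end):
    """Count control codes (Cc), cased letters (Lu/Ll) and everything else."""
    counts = {"control": 0, "letter": 0, "other": 0}
    for cp in range(start, end + 1):
        category = unicodedata.category(chr(cp))
        if category == "Cc":
            counts["control"] += 1
        elif category in ("Lu", "Ll"):
            counts["letter"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    print(BLOCK_END - BLOCK_START + 1)                 # 128 code points
    print(utf8_lengths(BLOCK_START, BLOCK_END))        # {1}
    print(len(chr(0x0080).encode("utf-8")))            # 2: first code point past the block
    print(count_by_category(BLOCK_START, BLOCK_END))   # 33 controls, 52 letters, 43 others
```

The 33 control codes are the 32 C0 controls (U+0000 to U+001F) plus U+007F (Delete); the remaining 43 code points are the space, digits, punctuation and symbols.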
The Alias names for C0 controls are taken from the [[ISO/IEC 6429\|ISO/IEC 6429:1992]] standard.<ref name=charts /> ===ASCII punctuation and symbols=== Line 716 ⟶ 713: ===Control character=== The Control Character subheading contains the [[Delete character\|"Delete" character]].<ref name=charts /> ==Number of symbols, letters and control codes== Line 736 ⟶ 733: \|} ==~~Block~~Chart== {{Unicode chart C0 Controls and Basic Latin}} Line 744 ⟶ 741: A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1 (0︀).<ref>{{cite web\|url=https://www.unicode.org/L2/L2015/15268-slashed-zero.pdf\|title=L2/15-268: Proposal to Represent the Slashed Zero Variant of Empty Set\|date=2015-10-30\|first1=Barbara\|last1=Beeton\|first2=Asmus\|last2=Freytag\|first3=Laurențiu\|last3=Iancu\|first4=Murray\|last4=Sargent}}</ref><ref name="uts51"/> Twelve characters (#, *, and the digits) can be followed by U+FE0E VS15 or U+FE0F VS16 to create [[emoji]] variants.<ref>{{cite web\|url=https://www.unicode.org/L2/L2011/11438-emoji-var.pdf\|title=L2/11-438: Emoji Variation Sequences (Revision of L2/11-429)\|date=2011-12-22\|first=Peter\|last=Edberg}}</ref><ref>{{cite web\|url=https://www.unicode.org/L2/L2015/15301-emoji-sequences.pdf\|title=L2/15-301: A proposal for 278 standardized variation sequences for emoji\|date=2015-11-01\|first=Roozbeh\|last=Pournader}}</ref><ref name="UTR51">{{Cite web\|url=http://unicode.org/reports/tr51/\|title=UTR #51: Unicode Emoji\|publisher=Unicode Consortium\|date=~~2020-02-11~~2023-09-05}}</ref><ref name="EmojiData">{{Cite web\|url=https://unicode.org/Public/UNIDATA/emoji/emoji-data.txt\|title=UCD: Emoji Data for UTR #51\|publisher=Unicode Consortium\|date=~~2021-08-26~~2023-02-01}}</ref> They are [[keycap]] base characters, for example #️⃣ (U+0023 NUMBER SIGN U+FE0F VS16 U+20E3 COMBINING ENCLOSING KEYCAP). 
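The keycap example above is a three-code-point sequence: base character, variation selector, combining keycap. A minimal Python sketch of how such sequences could be assembled follows. The function names `keycap`, `text_style` and `code_points` are hypothetical, and the set of twelve base characters (number sign, asterisk and the ten digits) is assumed from Unicode's emoji variation sequence data rather than taken from this page.

```python
VS15 = "\uFE0E"    # VARIATION SELECTOR-15
VS16 = "\uFE0F"    # VARIATION SELECTOR-16
KEYCAP = "\u20E3"  # COMBINING ENCLOSING KEYCAP

# Assumed set of the twelve Basic Latin keycap base characters.
KEYCAP_BASES = "#*0123456789"


def _check_base(base):
    """Reject anything that is not exactly one of the twelve base characters."""
    if len(base) != 1 or base not in KEYCAP_BASES:
        raise ValueError(f"not a keycap base character: {base!r}")


def keycap(base):
    """Build the emoji keycap sequence: base + VS16 + combining keycap."""
    _check_base(base)
    return base + VS16 + KEYCAP


def text_style(base):
    """Build the sequence base + VS15."""
    _check_base(base)
    return base + VS15


def code_points(s):
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


if __name__ == "__main__":
    print(code_points(keycap("#")))      # U+0023 U+FE0F U+20E3
    print(code_points(text_style("0")))  # U+0030 U+FE0E
```

Whether the resulting sequence is drawn as a keycap glyph depends on the font and renderer; the code only builds the code point sequence.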
The VS15 version is "text presentation" while the VS16 version is "emoji-style".<ref name="uts51">{{Cite web\|url=https://unicode.org/Public/UNIDATA/emoji/emoji-variation-sequences.txt\|title=UTS #51 Emoji Variation Sequences \| publisher=The Unicode Consortium}}</ref> Line 762 ⟶ 759: The following Unicode-related documents record the purpose and process of defining specific characters in the Basic Latin block: {{sticky header}} {\| class="wikitable sticky-header" \|- ! [[Unicode#Versions\|Version]] !! {{nobr\|Final code points<ref group=lower-alpha name=final/>}} !! Count !! [[Unicode Consortium\|UTC]] ID !! [[International Committee for Information Technology Standards\|L2]] ID !! [[ISO/IEC JTC 1/SC 2\|WG2]] ID !! Document \|- \| rowspan="~~16~~18" \| 1.0.0 \|\| rowspan="~~16~~18" \| U+0000..007F \|\| rowspan="~~16~~18" \| 128 \|\| \|\| \|\| \|\| (to be determined) \|- \| {{nobr\|[https://www.unicode.org/L2/L1999-UTC/u1999-013.htm UTC/1999-013]}} \|\| \|\| \|\| {{Citation\|title=Tildes and micro sign decompositions\|date=1999-05-27\|first=Kent\|last=Karlsson}} Line 776 ⟶ 774: \| \|\| {{nobr\|[https://www.unicode.org/L2/L2004/04202-slash-c-feedback.txt L2/04-202]}} \|\| \|\| {{Citation\|title=Slashed C Feedback\|date=2004-06-07\|first=Deborah\|last=Anderson}} \|- \| \|\| \|\| [https://www.unicode.org/wg2/docs/n3046.pdf N3046] \|\| {{Citation\|title=Improving formal definition for control characters \|date=2006-02-22\|first=Michel\|last=Suignard}} \|- \| \|\| \|\| {{nobr\|[https://www.unicode.org/wg2/docs/n3103.pdf N3103 (pdf],}} [https://www.unicode.org/wg2/docs/n3103.doc doc]) \|\| {{Citation\|title=Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27\|date=2006-08-25\|first=V. 
S.\|last=Umamaheswaran\|section=M48.33}} Line 796 ⟶ 794: \| \|\| {{nobr\|[https://www.unicode.org/L2/L2015/15254.htm L2/15-254]}} \|\| \|\| {{Citation\|title=UTC #145 Minutes\|date=2015-11-16\|first=Lisa\|last=Moore\|section=B.12.1.2 Proposal to Represent the Slashed Zero Variant of Empty Set}} \|- \| \|\| {{nobr\|[https://www.unicode.org/L2/L2017/17294-fullwidth-slashed-zero.pdf L2/17-294]}} \|\| [https://www.unicode.org/wg2/docs/n4914-17294-fullwidth-slashed-zero.pdf N4914] \|\| {{Citation\|title=Proposal to add standardized variation sequence for U+FF10 FULLWIDTH DIGIT ZERO\|date=2017-08-14\|first=Ken\|last=Lunde\|~~authorlink~~author-link=Ken Lunde}} \|- \| \|\| {{nobr\|[https://www.unicode.org/L2/L2022/22019-utc170-properties-recs.pdf L2/22-019]}} \|\| \|\| {{Citation\|title=UTC #170 properties feedback & recommendations\|date=2022-01-19\|first1=Markus\|last1=Scherer\|display-authors=etal\|section=F.2 F4: U+0019 in ISO vs. NameAliases.txt vs. chart/NamesList.txt}} \|- \| \|\| {{nobr\|[https://www.unicode.org/L2/L2022/22016.htm L2/22-016]}} \|\| \|\| {{Citation\|title=UTC #170 Minutes\|date=2022-04-21\|first=Peter\|last=Constable\|section=Consensus 170-C24\|quote=For U+0019, add a Name alias "EM" of type abbreviation, for Unicode version 15.0.}} \|- class="sortbottom" \| colspan="7" \| {{reflist\|group=lower-alpha\|refs= <ref name=final>Proposed code points and characters names may differ from final code points and names</ref> <ref name=also10458>See also [https://www.unicode.org/L2/L2010/10458-emoji-var.pdf L2/10-458], [https://www.unicode.org/L2/L2011/11414-emoji-var-seq.pdf L2/11-414], [https://www.unicode.org/L2/L2011/11415-unified-emoji-ref.pdf L2/11-415], and [https://www.unicode.org/L2/L2011/11429-emoji-var-seq-list.pdf L2/11-429]</ref> <ref name=emojidocs>Refer to the [[Miscellaneous Symbols and Pictographs#History\|history section]] of the Miscellaneous Symbols and Pictographs block for additional emoji-related documents</ref> <ref name=also15198>See 
also [https://www.unicode.org/L2/L2015/15198-varseq-text-emoji.pdf L2/15-198] and [https://www.unicode.org/L2/L2015/15275-more-var-seqs-for-text-vs-emoji.pdf L2/15-275]</ref>}} \|} ==See also== {{portal\|Internet\|Language}} [[Character set]]▼ * [[~~ISO~~Latin script in ~~8859-1~~Unicode]] [[Latin-1 Supplement]] ▲ [[Character ~~set~~encoding]] [[ISO/IEC 8859-1]] [[Latin script]] *[[ISO basic Latin alphabet]] ==References== <references/> ==External links== {{Spoken Wikipedia\|date=2023-11-08\|En-Basic Latin (Unicode block)-article.ogg}} {{sister project links\|Unicode}} {{Unicode navigation}} {{authority control}} [[Category:Latin-script Unicode blocks]]