Tai Tham (Unicode block): Difference between revisions

Encoding of Subscript Consonants: Lanna equivalent of OY.
History: fix links
 
(29 intermediate revisions by 12 users not shown)
Line 7:
|alphabets = Tai Tham
|5_2 = 127
|note = <ref>{{cite web|url=https://www.unicode.org/ucd/|title=Unicode character database|work=The Unicode Standard|accessdate=2023-07-26}}</ref><ref>{{cite web|url=https://www.unicode.org/versions/enumeratedversions.html|title=Enumerated Versions of The Unicode Standard|work=The Unicode Standard|accessdate=2023-07-26}}</ref>
}}
 
Line 15:
 
==History==
123 of the 127 code points initially encoded were proposed in L2/07-007R,<ref name=N3207R/> two more (U+1A5C and U+1A7C) in L2/08-037R2<ref name=N3379/> and a final pair (U+1A5D and U+1A5E) in L2/08-073.<ref name=N3384/> The last of these three documents modified the definitions of U+1A37 and U+1A38 given in the first of the three.
 
The following Unicode-related documents record the purpose and process of defining specific characters in the Tai Tham block:
 
{{sticky header}}
{| class="wikitable collapsible sticky-header"
|-
! [[Unicode#Versions|Version]] !! {{nobr|Final code points{{efn|name=final|Proposed code points and character names may differ from final code points and names}}}} !! Count !! [[International Committee for Information Technology Standards|L2]]&nbsp;ID !! [[ISO/IEC JTC 1/SC 2|WG2]]&nbsp;ID !! Document
|-
| rowspan="39" | 5.2{{efn|name=version|Changes to characters may have first taken effect in a later version of Unicode}} || rowspan="39" width="180" | U+1A20..1A5E, 1A60..1A7C, 1A7F..1A89, 1A90..1A99, 1AA0..1AAD || rowspan="39" | 127 || {{nobr|[https://www.unicode.org/L2/L1999/n2042.pdf L2/99-245]}} || [https://www.unicode.org/wg2/docs/n2042.pdf N2042] || {{Citation|title=Unicode Technical Report #3: Early Aramaic, Balti, Kirat (Limbu), Manipuri (Meitei) and Tai scripts|date=1999-07-20|first1=Michael|last1=Everson|author-link1=Michael Everson|first2=Rick|last2=McGowan|ref=none}}
|-
| {{nobr|X3L2/94-088}} || [https://www.evertype.com/standards/tai/n1013-lanna.pdf N1013] || {{Citation|title=The Motion on the Coding of the Old Xishuang Banna Dai Writing, Entering into BMP of ISO/IEC 10646|date=1994-04-18|ref=none}}
|-
| || {{nobr|[https://web.archive.org/web/20200215052615/http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1099.pdf N1099 (pdf],}} [https://web.archive.org/web/20200215052615/http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1099typed.doc doc]) || {{Citation|title=The motion on coding of the Old Xishuang Banna Dai Writing Entering into BMP of ISO/IEC 10646|date=1994-10-10|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2004/04351-lanna.pdf L2/04-351]}} || || {{Citation|title=Lanna Unicode: A Draft Proposal|date=2004-06-28|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2005/05095r-lanna.pdf L2/05-095R]}} || || {{Citation|title=Lanna Unicode: A Proposal|date=2005-04-25|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2005/05166-dekalb-gk-vb.pdf L2/05-166]}} || || {{Citation|title=Towards a Computerization of the Lao Tham System of Writing|date=2005-07-15|first1=G.|last1=Kourilsky|first2=V.|last2=Berment|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2005/05188-respond-05166.pdf L2/05-188]}} || || {{Citation|title=Lao Tham in Terms of Lanna: a response to L2/05-166 from L2/05-095|date=2005-08-02|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06258r-n3121-lanna.pdf L2/06-258R]}} || [https://www.unicode.org/wg2/docs/n3121r.pdf N3121R] || {{Citation|title=Proposal for encoding the Lanna script in the BMP of the UCS|date=2006-09-09|first1=Michael|last1=Everson|first2=Martin|last2=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06311-lanna-comments.pdf L2/06-311]}} || [https://www.unicode.org/wg2/docs/n3159.pdf N3159] || {{Citation|title=Response to N3121R: Proposal for encoding the Lanna script in the BMP of the UCS|date=2006-09-20|first=Ngwe|last=Tun|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06319-n3161.pdf L2/06-319]}} || [https://www.unicode.org/wg2/docs/n3161.pdf N3161] || {{Citation|title=Opinions on N3121-Lanna script|date=2006-09-22|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06320-n3169-lanna-adhoc.pdf L2/06-320]}} || [https://www.unicode.org/wg2/docs/n3169.pdf N3169R] || {{Citation|title=Lanna ad-hoc report|date=2006-09-26|first1=Zhuang|last1=Chen|first2=Michael|last2=Everson|first3=Martin|last3=Hosken|first4=Lin-Mei|last4=Wei|ref=none}}
|-
| || {{nobr|[https://www.unicode.org/wg2/docs/n3153.pdf N3153 (pdf],}} [https://www.unicode.org/wg2/docs/n3153.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29|date=2007-02-16|first=V. S.|last=Umamaheswaran|ref=none|section=M49.17}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07015.htm L2/07-015]}} || || {{Citation|title=UTC #110 Minutes|date=2007-02-08|first=Lisa|last=Moore|ref=none|section=Lanna (C.17)}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07007r-n3207r-lanna.pdf L2/07-007R]}} || [https://www.unicode.org/wg2/docs/n3207.pdf N3207] || {{Citation|title=Revised proposal for encoding the Lanna script in the BMP of the UCS|date=2007-03-21|first1=Michael|last1=Everson|first2=Martin|last2=Hosken|first3=Peter|last3=Constable|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07101-n3238.pdf L2/07-101]}} || [https://www.unicode.org/wg2/docs/n3238.pdf N3238] || {{Citation|title=Proposing on Encoding Old Tai Lue|date=2007-04-03|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07098-n3239-n3238-response.pdf L2/07-098]}} || [https://www.unicode.org/wg2/docs/n3239.pdf N3239] || {{Citation|title=Response to Chinese contribution N3238, "Proposing on Encoding Old Tai Lue"|date=2007-04-11|ref=none}}
|-
| || {{nobr|[https://www.unicode.org/wg2/docs/n3353.pdf N3353 (pdf],}} [https://www.unicode.org/wg2/docs/n3353.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27|date=2007-10-10|first=V. S.|last=Umamaheswaran|ref=none|section=M51.2}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07118.htm L2/07-118R2]}} || || {{Citation|title=UTC #111 Minutes|date=2007-05-23|first=Lisa|last=Moore|ref=none|section=111-C17}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07268-n3253.pdf L2/07-268]}} || {{nobr|[https://www.unicode.org/wg2/docs/n3253.pdf N3253 (pdf],}} [https://www.unicode.org/wg2/docs/n3253.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27|date=2007-07-26|first=V. S.|last=Umamaheswaran|ref=none|section=M50.10}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07307-n3313.pdf L2/07-307]}} || [https://www.unicode.org/wg2/docs/n3313.pdf N3313] || {{Citation|title=Comments on Lanna encoding in FPDAM4|date=2007-09-06|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07316-n3342.pdf L2/07-316]}} || [https://www.unicode.org/wg2/docs/n3342.pdf N3342] || {{Citation|title=Response to N3313|date=2007-09-10|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07319-n3346.pdf L2/07-319]}} || [https://www.unicode.org/wg2/docs/n3346.pdf N3346] || {{Citation|title=Ad hoc report on Lanna|date=2007-09-19|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07322r-n3349r.pdf L2/07-322]}} || [https://www.unicode.org/wg2/docs/n3349.pdf N3349R] || {{Citation|title=Summary of repertoire for FPDAM 5 of ISO/IEC 10646:2003 and future amendments|date=2007-09-28|first=Michael|last=Everson|ref=none|section=Tai Tham}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07353-wg2consent.txt L2/07-353]}} || || {{Citation|title=WG2 Consent Docket|date=2007-10-10|first=Ken|last=Whistler|ref=none|section=A. Lanna (FDAM 4 and FPDAM 5)}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07345.htm L2/07-345]}} || || {{Citation|title=UTC #113 Minutes|date=2007-10-25|first=Lisa|last=Moore|ref=none|section=Consensus 113-C10}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08037r2-n3379r2.pdf L2/08-037R2]}} || [https://www.unicode.org/wg2/docs/n3379.pdf N3379R2] || {{Citation|title=Tai Tham Ad-hoc Meeting Report|date=2008-04-18|first=Peter|last=Constable|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08073-subjoin-tham.pdf L2/08-073]}} || [https://www.unicode.org/wg2/docs/n3384.pdf N3384] || {{Citation|title=Tai Tham Subjoined Variants|date=2008-01-28|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08003.htm L2/08-003]}} || || {{Citation|title=UTC #114 Minutes|date=2008-02-14|first=Lisa|last=Moore|ref=none|section=Tai Tham}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08318-n3453.pdf L2/08-318]}} || {{nobr|[https://www.unicode.org/wg2/docs/n3453.pdf N3453 (pdf],}} [https://www.unicode.org/wg2/docs/n3453.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 52|date=2008-08-13|first=V. S.|last=Umamaheswaran|ref=none|section=M52.2a}}
|-
| {{nobr|[https://www.unicode.org/L2/L2014/14126r-indic-properties.pdf L2/14-126R (pdf],}} [https://www.unicode.org/L2/L2014/14126-files/ appendices]) || || {{Citation|title=Improvements requested for Unicode Indic properties [Affects U+1A55, 1A60, 1A80-1A89, and 1A90-1A99]|date=2014-05-08|first=Roozbeh|last=Pournader|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2014/14177.htm L2/14-177]}} || || {{Citation|title=UTC #140 Minutes|date=2014-10-17|first=Lisa|last=Moore|ref=none|section=B.14.5 [Affects U+1A56-1A5E, 1A75-1A7C, and 1A7F]<!--L2/14-199 is relevant for some scripts, but not for Tai Tham-->}}
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17120-isc-corrections.pdf L2/17-120]}} || || {{Citation|title=Corrections to the Indic Syllabic Category for the Tai Tham Script [Affects U+1A57, 1A5A-1A5E, 1A74, and 1A7A]|date=2017-04-30|first=Richard|last=Wordingham|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17169-tai-tham-category.txt L2/17-169]}} || || {{Citation|title=Proposed Indic Syllabic Category changes for Tai Tham for Unicode 10 [Affects U+1A57, 1A5A-1A5E, 1A74, and 1A7A]|date=2017-05-12|first=Roozbeh|last=Pournader|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17103.htm L2/17-103]}} || || {{Citation|title=UTC #151 Minutes|date=2017-05-18|first=Lisa|last=Moore|ref=none|section=B.14.9 [Affects U+1A57, 1A5A-1A5E, 1A74, and 1A7A]}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18053-consonant-suffixed.txt L2/18-053]}} || || {{Citation|title=New Indic Syllabic Category Consonant_Initial_Postfixed [Affects U+1A5A]|date=2018-01-24|first=Roozbeh|last=Pournader|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18007.htm L2/18-007]}} || || {{Citation|title=UTC #154 Minutes|date=2018-03-19|first=Lisa|last=Moore|ref=none|section=B.14.7 [Affects U+1A5A]}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18171-vowels-below.pdf L2/18-171]}} || || {{Citation|title=Positioning of Tai Tham Vowels Below [Affects U+1A69 and 1A6A]|date=2018-04-29|first=Richard|last=Wordingham|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18241-script-ad-hoc.pdf L2/18-241]}} || || {{Citation|title=Recommendations to UTC # 156 July 2018 on Script Proposals|date=2018-07-20|first1=Deborah|last1=Anderson|display-authors=etal|ref=none|section=15. Tai Tham [Affects 1A69 and 1A6A]}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18183.htm L2/18-183]}} || || {{Citation|title=UTC #156 Minutes|date=2018-11-20|first=Lisa|last=Moore|ref=none|section=D.12 Positioning of Tai Tham vowels below [Affects U+1A69 and 1A6A]}}
|- class="sortbottom"
| colspan="6" | {{notelist}}
|}
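The count of 127 can be checked against the ranges listed in the "Final code points" column. The following is a minimal sketch; the names <code>RANGES</code> and <code>count_code_points</code> are illustrative and do not come from any Unicode tool.

```python
# Inclusive code point ranges, copied from the "Final code points" column.
RANGES = [
    (0x1A20, 0x1A5E),
    (0x1A60, 0x1A7C),
    (0x1A7F, 0x1A89),
    (0x1A90, 0x1A99),
    (0x1AA0, 0x1AAD),
]


def count_code_points(ranges):
    """Return how many code points the inclusive (first, last) pairs cover."""
    return sum(last - first + 1 for first, last in ranges)


total = count_code_points(RANGES)

# Breakdown by proposal as given in the History text:
# L2/07-007R, then L2/08-037R2, then L2/08-073.
by_proposal = 123 + 2 + 2

print(total, by_proposal)
```

Both figures should agree with the "Count" column of the table.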
 
Line 101 ⟶ 112:
If a consonant has two subscript forms and the choice affects the meaning, the form typically used for syllable-final consonants will be encoded with SAKOT, and the other form will have its own code point. There are 7 consonants which have different subscript forms in this way, namely {{lang|nod|ᩁ}} RA, {{lang|nod|ᩃ}} LA, {{lang|nod|ᨷ}} BA, {{lang|nod|ᩈ}} HIGH SA, {{lang|nod|ᨾ}} MA, {{lang|nod|ᨮ}} HIGH RATHA, and {{lang|nod|ᨻ}} LOW PA.
 
{{lang|nod|ᨣᩕᩪ}} ({{IPA|nod|k'''ʰ'''uː}}) is encoded as &lt;U+1A23 LOW KA, '''U+1A55 MEDIAL RA''', U+1A6A SIGN UU&gt; but
{{lang|nod|ᨠᩣ᩠ᩁ}} ({{IPA|nod|kaː'''n'''|IPA}}) is encoded as &lt;U+1A20 HIGH KA, U+1A63 SIGN AA,
'''U+1A60 SAKOT, U+1A41 RA'''&gt;<ref name=N3207R/>{{rp|at=Section 4}}
 
{{lang|nod|ᩆᩦ᩠ᩃ}} ({{IPA|nod|siː'''n'''|IPA}}) is encoded as &lt;U+1A46 HIGH SHA, U+1A66 SIGN II,
'''U+1A60 SAKOT, U+1A43 LA'''&gt;<ref name=N3207R/>{{rp|at=Section 14.5}} but {{lang|nod|ᨸᩖᩦ}} ({{IPA|nod|piː|IPA}}) is encoded as &lt;U+1A38 HIGH PA, '''U+1A56 MEDIAL LA''', U+1A66 SIGN II&gt;.<ref name=N3207R/>{{rp|at=Section 4}} (For the use of LA as a syllable-final letter, compare {{lang|nod|ᩁᨭᩛᨷᩣ᩠ᩃ}}<ref name=N3207R/>{{rp|at=Section 4}} ({{IPA|nod|lat tha baːn}}).)
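The contrast between a dedicated subscript sign and a SAKOT sequence can be made concrete by building the two RA examples from their code points. This is a sketch only: the helper names <code>describe</code> and <code>uses_sakot_subscript</code> are hypothetical, and the character names printed depend on the Unicode data shipped with the Python interpreter.

```python
import unicodedata

SAKOT = "\u1A60"  # invokes the subscript form of the following consonant
RA = "\u1A41"
MEDIAL_RA = "\u1A55"  # dedicated code point for the other subscript form

# <LOW KA, MEDIAL RA, SIGN UU>
khuu = "\u1A23" + MEDIAL_RA + "\u1A6A"
# <HIGH KA, SIGN AA, SAKOT, RA>
kaan = "\u1A20\u1A63" + SAKOT + RA


def describe(text):
    """List each code point with its character name, in logical order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def uses_sakot_subscript(text, consonant):
    """True if the consonant appears in its SAKOT-encoded subscript form."""
    return SAKOT + consonant in text


for word in (khuu, kaan):
    print(describe(word), uses_sakot_subscript(word, RA))
```

The first word carries RA as MEDIAL RA and so contains no SAKOT; the second carries syllable-final RA as &lt;SAKOT, RA&gt;.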
 
U+1A57 SIGN LA TANG LAI looks like &lt;U+1A60 SAKOT, U+1A43 LA&gt; but is in origin a ligature of it with &lt;U+1A60 SAKOT, U+1A26 NGA&gt;. Tai Lue uses it to write the word {{lang|khb|ᨴᩢ᩵ᩗᩣ}} ({{IPA|khb|taŋ laːi|IPA}}).<ref name="blends">
{{cite web|url=http://www.seasite.niu.edu/tai/TaiLue/graphic%20blends.htm
|language=English
|title=Tai Lue: Complex Orthographic Rules: Graphic Blends(I)
Line 119 ⟶ 130:
<!-- See also http://www.seasite.niu.edu/tai/TaiLue/excerpt.htm -->
 
{{lang|nod|ᨣᩝᩴ}} ({{IPA|nod|kɔː '''b'''ɔː|IPA}}) is encoded as &lt;U+1A23 LOW KA, '''U+1A5D SIGN BA''', U+1A74 MAI KANG&gt;, but {{lang|nod|ᨠᩢ᩠ᨷ}} ({{IPA|nod|ka'''p'''|IPA}}) is encoded as &lt;U+1A20 HIGH KA, U+1A62 MAI SAT, '''U+1A60 SAKOT, U+1A37 BA'''&gt; and
{{lang|nod|ᨠᩢᨷ᩠ᨷ᩺}} ({{IPA|nod|kap|IPA}}) is encoded as &lt;U+1A20 HIGH KA, U+1A62 MAI SAT, U+1A37 BA, '''U+1A60 SAKOT, U+1A37 BA''', U+1A7A RA HAAM&gt;.
 
:In the final proposal,<ref name="N3207R">
{{cite web|url=https://www.unicode.org/L2/L2007/07007r-n3207r-lanna.pdf
|format=PDF
|language=en
|title=Revised proposal for encoding the Lanna script in the BMP of the UCS
|last1=Everson
|first1=Michael
|authorlink1=Michael_Everson
|last2=Hosken
Line 138 ⟶ 148:
}}
</ref>{{rp|page=1}} which the [[Unicode Consortium]] accepted, what is now SIGN BA (as in {{lang|nod|ᨣᩝᩴ}}) would be encoded as &lt;SAKOT, BA&gt; and what is now &lt;SAKOT, BA&gt; (as in {{lang|nod|ᨠᩢ᩠ᨷ}}) would be encoded as &lt;SAKOT, HIGH PA&gt;, but during the ISO process the meaning of &lt;SAKOT, BA&gt; changed<ref name="N3384">
{{cite web|url=https://www.unicode.org/L2/L2008/08073-subjoin-tham.pdf
|format=PDF
|language=English
|title=Tai Tham Subjoined Variants
Line 157 ⟶ 166:
 
Tai Khuen has an additional way of writing subscript MA. There is a special codepoint for this additional method<ref name="N3379">
{{cite web|url=https://www.unicode.org/L2/L2008/08037-n3379.pdf
|format=PDF
|language=en
|title=Tai Tham Ad-hoc Meeting Report (WG2 N3379)
Line 180 ⟶ 188:
| location= Chiang Mai
}}</ref>{{rp|page=368}} is encoded as &lt;U+1A36 NA, U+1A65 SIGN I, U+1A23 LOW KA,
U+1A31 RANA, '''U+1A5B SIGN HIGH RATHA OR LOW PA'''&gt;:
{{lang|nod|[[Rajabhat University system|{{lang|nod|ᩁᩣᨩᨽᩢ᩠ᨮ}}]]}}<ref name="N3207R"/>{{rp|page=3}} is encoded
&lt;U+1A41 RA, U+1A63 SIGN AA, U+1A29 LOW CA, U+1A3D LOW PHA, U+1A62 MAI SAT, '''U+1A60 SAKOT, U+1A2E HIGH RATHA'''&gt;.
{{lang|nod|ᨶᩥᨻᩛᩣᨶ}} is encoded as &lt;U+1A36 NA, U+1A65 SIGN I, U+1A3B LOW PA, '''U+1A5B SIGN HIGH RATHA OR LOW PA''', U+1A63 SIGN AA, U+1A36 NA&gt;:
{{lang|nod|ᨴᩮ᩠ᨻ}} is encoded as &lt;U+1A34 LOW TA, U+1A6E SIGN E, '''U+1A60 SAKOT, U+1A3B LOW PA'''&gt;.
The latter word is also written as {{lang|nod|ᨴᩮ᩠ᨷ}}.
The Lao-style consonant conjunct {{lang|lo|ᨲ᩠ᨳ}} (encoded as &lt;U+1A32 HIGH TA, U+1A60 SAKOT, U+1A33 HIGH THA&gt;) looks as though it is {{lang|lo|ᨲᩛ}} encoded as &lt;U+1A32 HIGH TA, U+1A5B SIGN HIGH RATHA OR LOW PA&gt;. The shape of U+1A5B depends upon the consonant it is subscript to.
 
The dependent vowel of words like {{lang|nod|ᨯᩬᨠ}} 'flower' is encoded by the special vowel &lt;U+1A6C SIGN OA BELOW&gt;; one should not use the sequence &lt;U+1A60 SAKOT, U+1A4B LETTER A&gt;. There is also an encoded dependent vowel for Tai Khuen, Tai Lue and Lao words such as {{lang|kkh|ᨶ᩶ᩭ}}, namely U+1A6D SIGN OY. This vowel is not encoded as &lt;U+1A6C SIGN OA BELOW, U+1A60 SAKOT, U+1A3F LOW YA&gt; (which is what Northern Thai uses for the corresponding words); nor is it the sequence &lt;U+1A60 SAKOT, U+1A40 HIGH YA&gt;.<ref name=N3207R/>&#x2060;{{rp|at=Section&nbsp;5}}
 
==Superscript Consonants==
Superscript consonants are encoded independently of the base consonants. Some characters serve both as superscript consonants and in other roles, and are therefore discussed further in this section.
[[Anusvara|Niggahita]] is encoded as U+1A74 MAI KANG. Superscript WA is not encoded separately; it is also encoded as MAI KANG. For example, Tai Khuen {{lang|kkh|ᨯ᩠ᨿᩴ}} ({{IPA|shn|deu|IPA}}) is encoded as &lt;U+1A2F DA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A74 MAI KANG&gt;. For the purposes of character sequencing, MAI KANG is generally treated as a vowel.
 
Superscript cluster-initial NGA is encoded as U+1A58 MAI KANG LAI. Note that Lao generally uses the same glyph for MAI KANG LAI and U+1A59 SIGN FINAL NGA.
Line 199 ⟶ 207:
U+1A62 MAI SAT serves three roles - it is a vowel, a final consonant, and a vowel shortener.
 
Choosing the encoding of the superscript form of RA and the vowel killers was difficult. In the 1940s the Tai Khuen wrote the consonant and the vowel killer the same way. The proposers of the encoding made enquiries and were told that the glyphs were still the same and therefore encoded them both as U+1A7A RA HAAM. It was then learnt that the Tai Khuen had changed the glyphs of the vowel killer, and a new character U+1A7C KARAN was added for the Tai Khuen style of the vowel killer. Some Northern Thai writers prefer to use U+1A7C as the vowel killer, and indeed the use of its glyph is not unknown in Northern Thai handwriting.
 
==Special Consonants==
The special forms {{lang|nod|ᩓ}} and {{lang|nod|&#x1a55;}} are encoded by the code points U+1A53 and U+1A55 respectively.
 
If the glyphs of U+1A36 NA and U+1A63 SIGN AA would be side by side, they are written as the ligature {{lang|nod|ᨶᩣ}} rather than as two separate glyphs {{lang|nod|ᨶ&zwnj;ᩣ}}. They are written as a ligature even if the NA has a subscript consonant or a non-following mark attached. Examples: {{lang|nod|ᨾᨶ᩠ᨲᩣ}} ({{IPA|nod|man taː|IPA}}, encoding &lt;U+1A3E MA, U+1A36 NA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A63 SIGN AA&gt;) and {{lang|nod|ᨶᩮᩢᩣ}} ({{IPA|nod|nau|IPA}}, encoding &lt;U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA&gt;). Subscript NA and SIGN AA do ''not'' similarly ligate, e.g. {{lang|nod|ᩉ᩠ᨶᩣ}} ({{IPA|nod|naː|IPA}}, encoded &lt;U+1A49 HIGH HA, U+1A60 SAKOT, U+1A36 NA, U+1A63 SIGN AA&gt;).
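The ligation rule for NA and SIGN AA can be approximated with a regular expression over the encoded text. This is a rough sketch under stated assumptions, not how any font or shaping engine is known to implement it: the set <code>NON_FOLLOWING</code> is an illustrative and incomplete guess at the marks that do not sit to the right of the base, and the function name <code>na_aa_ligates</code> is hypothetical.

```python
import re

NA, SIGN_AA, SAKOT = "\u1A36", "\u1A63", "\u1A60"

# ASSUMPTION: an illustrative, incomplete set of "non-following" marks
# (MAI SAT, vowel signs I..O, SIGN E, tone marks TONE-1..TONE-5).
NON_FOLLOWING = "\u1A62\u1A65-\u1A6B\u1A6E\u1A75-\u1A79"
CONSONANT = "\u1A20-\u1A4C"

# A base NA (not itself subscript, so not preceded by SAKOT), optionally
# carrying subscript consonants or non-following marks, then SIGN AA.
NA_AA = re.compile(
    f"(?<!{SAKOT}){NA}(?:{SAKOT}[{CONSONANT}]|[{NON_FOLLOWING}])*{SIGN_AA}"
)


def na_aa_ligates(text):
    """True if the text contains a NA + SIGN AA pair expected to ligate."""
    return NA_AA.search(text) is not None


man_taa = "\u1A3E\u1A36\u1A60\u1A32\u1A63"  # MA, NA, SAKOT, HIGH TA, SIGN AA
nau = "\u1A36\u1A6E\u1A62\u1A63"            # NA, SIGN E, MAI SAT, SIGN AA
hnaa = "\u1A49\u1A60\u1A36\u1A63"           # HIGH HA, SAKOT, NA, SIGN AA

for word in (man_taa, nau, hnaa):
    print(na_aa_ligates(word))
```

The three test words are the examples given in the paragraph above; only the first two are expected to ligate, because in the third the NA is subscript.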
 
The geminate consonant {{lang|nod|ᩔ}} is encoded separately because the word {{lang|nod|ᩅᩥᩈᩮ᩠ᩈ}} ({{IPA|nod|wiseːt}}, encoding &lt;U+1A45 WA, U+1A65 SIGN I, U+1A48 HIGH SA, U+1A6E SIGN E, U+1A60 SAKOT, U+1A48 HIGH SA&gt;) has an appearance very different from {{lang|nod|ᩅᩥᩔᩮ}}, but one may have occasion to fold the final syllable to &lt;HIGH SA, SAKOT, HIGH SA, SIGN E&gt;. Indeed, in 2019 to 2020 there was a campaign to establish the latter as its standard spelling.
 
By contrast, the geminate consonant {{lang|nod|ᨬ᩠ᨬ}} is encoded as the conjunct &lt;U+1A2C NYA, U+1A60 SAKOT, U+1A2C NYA&gt;, even though some of its glyphs may resemble the hypothetical conjunct {{lang|nod|ᨱ᩠ᨬ}} &lt;U+1A31 RANA, U+1A60 SAKOT, U+1A2C NYA&gt;.
Line 213 ⟶ 221:
The independent vowel {{lang|nod|ᩋ}} and the consonant {{lang|nod|ᩋ}} are the same character, U+1A4B.
 
The independent vowel {{lang|nod|ᩋᩣ}} and the sequence of the consonant {{lang|nod|ᩋ}} and dependent vowel {{lang|nod|&#x1a63;}} have the same appearance {{lang|nod|ᩋᩣ}} and are therefore both encoded &lt;U+1A4B LETTER A, U+1A63 SIGN AA&gt;.
 
Northern Thai uses 5 independent vowels with their own code points, namely {{lang|nod|ᩍ}}, {{lang|nod|ᩎ}}, {{lang|nod|ᩏ}}, {{lang|nod|ᩐ}} and {{lang|nod|ᩑ}}.<ref name=N3207R/>{{rp|at=Section 3}}
 
In Northern Thai the 8th independent vowel is no different from the sequence of the consonant {{lang|nod|ᩋ}} and dependent vowel {{lang|nod|&#x1A70;}}, i.e. {{lang|nod|ᩋᩰ}}, and they are therefore both encoded &lt;U+1A4B LETTER A, U+1A70 SIGN OO&gt;. Other languages use a distinct character {{lang|nod|ᩒ}} U+1A52 LETTER OO for the independent vowel.
 
==Character Order within Text==
Line 236 ⟶ 244:
The 'onset letters' are consonants, independent vowels or special symbols. The consonants in a group are ordered according to the order in which they are sounded or used to be sounded.
 
Example: {{lang|nod|ᨻᩩᨴ᩠ᨵ}} ({{IPA|nod|put thaʔ}})
:onset letter: {{lang|nod|ᨻ}}
:pure vowel: {{lang|nod| ᩩ}}
Line 246 ⟶ 254:
The encoding is &lt;U+1A3B LOW PA, U+1A69 SIGN U, U+1A34 LOW TA, U+1A60 SAKOT, U+1A35 LOW THA&gt;
 
Example: {{lang|nod|ᨻᩕ}} has a single consonant sound {{IPA|nod|pʰ}}, but formerly had 2 sounds, namely those of {{lang|nod|ᨻ}} and then {{lang|nod|ᩁ}} as in central Thai. This word is encoded as &lt;LOW PA, MEDIAL RA&gt;.
 
Apart from MEDIAL RA, the order of the consonant glyphs is the same as the order of the sounds. In most cases MEDIAL RA is the last consonant but the WA of /ua/ and the LOW YA of /ia/ follow MEDIAL RA.
Line 254 ⟶ 262:
:{{lang|nod|ᨠᩕᩈᩢ᩠ᨲ}} is encoded &lt;U+1A20 HIGH KA, U+1A55 MEDIAL RA, U+1A48 HIGH SA, U+1A62 MAI SAT, U+1A60 SAKOT, U+1A32 HIGH TA&gt;.
:{{lang|nod|ᩈᩕ᩠ᩅᨾ}} is encoded &lt;U+1A48 HIGH SA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A45 WA, U+1A3E MA&gt;.
:But {{lang|nod|ᨲᩕ᩠ᨶᩬᨾ}} ({{IPA|nod|tʰa nɔːm}})<ref name=MFL/>{{rp|269}} is encoded &lt;U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A36 NA, U+1A6C SIGN OA BELOW, U+1A3E MA&gt;.
 
For words like {{lang|nod|ᨧᩮᩢ᩶ᩣ}} there is the rule that symbols for vowels and tones have the order:<ref name=N3207R/>{{rp|at=Section 5 first part, 5.3 and 13}}
Line 269 ⟶ 277:
Examples:
:{{lang|nod|ᨧᩮᩢ᩶ᩣ}} is encoded as &lt;U+1A27 HIGH CA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A76 TONE-2, U+1A63 SIGN AA&gt;<ref name="N3207R"/>{{rp|at=Section 5 no. 29}}
:{{lang|nod|ᨾᩢᩣ}} ({{IPA|nod|maːk|IPA}}) is encoded as &lt;U+1A3E MA, U+1A62 MAI SAT, U+1A63 SIGN AA&gt;
:{{lang|nod|ᩃᩪᩢ}} ({{IPA|nod|luːk|IPA}}) is encoded as &lt;U+1A43 LA, U+1A6A SIGN UU, U+1A62 MAI SAT&gt;
:{{lang|nod|ᨶᩮᩢᩣ}} is encoded as &lt;U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA&gt;
:{{lang|nod|ᩋᩫᨶ᩠ᨲᩕᩣ᩠ᨿ}} ({{IPA|nod|on thaʔ laːi}}) is encoded as &lt;U+1A4B LETTER A, U+1A6B SIGN O, U+1A36 NA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A3F LOW YA&gt;
 
For /ia/ and /ua/ in all their forms, subscript LOW YA and WA are reckoned as onset consonants.<ref name="N3207R"/>{{rp|at=Section 14.3}}
 
Examples:
:{{lang|nod|ᩈ᩠ᨿᩮ}} is actually encoded &lt;U+1A48 HIGH SA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A6E SIGN E&gt;<ref name="N3207R"/>{{rp|at=Section 5 No. 33}}
:{{lang|nod|ᨸ᩠ᩃ᩠ᨿ᩵ᩁ}} is actually encoded &lt;U+1A38 HIGH PA, U+1A60 SAKOT, U+1A43 LA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A75 TONE-1, U+1A41 RA&gt;<ref name="N3207R"/>{{rp|at=Section 14.9}}
:{{lang|nod|ᨲ᩠ᩅᩫ}} is actually encoded &lt;U+1A32 HIGH TA, U+1A60 SAKOT, U+1A45 WA, U+1A6B SIGN O&gt;<ref name="N3207R"/>{{rp|at=Section 14.3}}
:{{lang|nod|ᩈ᩠ᩅ᩵ᩁ}} is actually encoded &lt;U+1A48 HIGH SA, U+1A60 SAKOT, U+1A45 WA, U+1A75 TONE-1, U+1A41 RA&gt;
:{{lang|nod|ᨠᩖ᩠ᩅ᩠᩶ᨿ}} is actually encoded as &lt;U+1A20 HIGH KA, U+1A56 MEDIAL LA, U+1A60 SAKOT, U+1A45 WA, U+1A76 TONE-2, U+1A60 SAKOT, U+1A3F LOW YA&gt;
::(&lt;U+1A60, U+1A76&gt; is canonically equivalent to &lt;U+1A76, U+1A60&gt;)
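The canonical equivalence noted above can be observed with Python's standard <code>unicodedata</code> module. This is a minimal sketch: it relies on the Unicode data bundled with the interpreter, and the variable names are illustrative.

```python
import unicodedata

SAKOT = "\u1A60"
TONE_2 = "\u1A76"

sakot_first = SAKOT + TONE_2
tone_first = TONE_2 + SAKOT

# Both marks have a non-zero canonical combining class, and the classes
# differ, so normalization may reorder them into a single canonical order.
classes = (unicodedata.combining(SAKOT), unicodedata.combining(TONE_2))

same_after_nfd = (
    unicodedata.normalize("NFD", sakot_first)
    == unicodedata.normalize("NFD", tone_first)
)

print(classes, sakot_first == tone_first, same_after_nfd)
```

The two strings differ code point by code point but compare equal once normalized, which is why software that normalizes text treats either order as the same spelling.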
 
Outside Northern Thailand, the MAI KANG in the symbol for /am/ is written on the SIGN AA component. In Northern Thailand, it is positioned variously – on the consonant, on the SIGN AA and between them. The Unicode Consortium refused a special character for the combination. The word {{lang|nod|ᨷᩴ᩠᩵ᨾᩣ}} ({{IPA|nod|bɔːmaː|IPA}}) should not appear to have the same vowel as {{lang|nod|ᨲ᩵ᩣᩴ}} ({{IPA|nod|tam|IPA}}). The combination for /am/ is therefore encoded as &lt;U+1A63 SIGN AA, U+1A74 MAI KANG&gt;. The word {{lang|nod|ᨷᩴ᩠᩵ᨾᩣ}} is encoded as &lt;U+1A37 BA, U+1A74 MAI KANG, U+1A75 TONE-1, U+1A60 SAKOT, U+1A3E MA, U+1A63 SIGN AA&gt;. The word {{lang|nod|ᨲ᩵ᩣᩴ}} is encoded as &lt;U+1A32 HIGH TA, U+1A75 TONE-1, U+1A63 SIGN AA, U+1A74 MAI KANG&gt;. The combination for /am/ with SIGN TALL AA is encoded as &lt;U+1A64 SIGN TALL AA, U+1A74 MAI KANG&gt;.
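Because /am/ is always encoded with MAI KANG immediately after SIGN AA or SIGN TALL AA, the two words can be told apart by a simple substring test on the encoded text. The sketch below assumes nothing beyond the sequences quoted above; the function name <code>has_am_vowel</code> is hypothetical.

```python
SIGN_AA = "\u1A63"
SIGN_TALL_AA = "\u1A64"
MAI_KANG = "\u1A74"


def has_am_vowel(text):
    """True if the text contains the encoded combination for /am/."""
    return (SIGN_AA + MAI_KANG) in text or (SIGN_TALL_AA + MAI_KANG) in text


# <BA, MAI KANG, TONE-1, SAKOT, MA, SIGN AA>: MAI KANG belongs to the first syllable.
bo_maa = "\u1A37\u1A74\u1A75\u1A60\u1A3E\u1A63"
# <HIGH TA, TONE-1, SIGN AA, MAI KANG>: MAI KANG follows SIGN AA, giving /am/.
tam = "\u1A32\u1A75\u1A63\u1A74"

print(has_am_vowel(bo_maa), has_am_vowel(tam))
```

Both words contain SIGN AA and MAI KANG, so only the order of the code points distinguishes them.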
 
U+1A5A SIGN LOW PA is a special case; the Tai Lue word {{lang|khb|ᨣᨽᩚ}} ({{IPA|khb|kap phaʔ|IPA}}) is encoded as &lt;U+1A23 LOW KA, U+1A3D LOW PHA, U+1A5A SIGN LOW PA&gt;.<ref name="N3207R"/>{{rp|at=Section 4}}
 
Examples showing mai kang lai and la tang lai:
:Pali word {{lang|nod|ᩈᩘᨥᩮᩣ}} (saṅgho) is encoded &lt;U+1A48 HIGH SA, U+1A58 MAI KANG LAI, U+1A25 LOW KHA, U+1A6E SIGN E, U+1A63 SIGN AA&gt;.
:Northern Thai word {{lang|nod|ᨴᩘ᩠ᩃᩣ᩠ᨿ}} ({{IPA|nod|taŋ laːi}}) is encoded &lt;U+1A34 LOW TA, U+1A58 MAI KANG LAI, U+1A60 SAKOT, U+1A43 LA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A3F LOW YA&gt;.
:Tai Lue word {{lang|khb|ᨴᩢᩗᩣ}} ({{IPA|khb|taŋ laːi}}) is encoded &lt;U+1A34 LOW TA, U+1A62 MAI SAT, U+1A57 SIGN LA TANG LAI, U+1A63 SIGN AA&gt;.
 
==External links==
Line 298 ⟶ 306:
== References ==
{{reflist}}
 
[[Category:Unicode blocks]]