Drmccreedy (talk | contribs) →History: fix links
(15 intermediate revisions by 8 users not shown)
Line 7:
|alphabets = Tai Tham
|5_2 = 127
|note = <ref>{{cite web|url=https://www.unicode.org/ucd/|title=Unicode character database|work=The Unicode Standard|accessdate=
}}
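The figure of 127 characters can be checked against the assigned ranges that the history table lists for Unicode 5.2. The following is a minimal sketch of that arithmetic; the names `TAI_THAM_RANGES`, `count_assigned` and `is_assigned` are illustrative and do not come from any Unicode tool.

```python
# Assigned ranges of the Tai Tham block as listed for Unicode 5.2
# (inclusive start and end code points). Names here are illustrative.
TAI_THAM_RANGES = [
    (0x1A20, 0x1A5E),
    (0x1A60, 0x1A7C),
    (0x1A7F, 0x1A89),
    (0x1A90, 0x1A99),
    (0x1AA0, 0x1AAD),
]


def count_assigned(ranges):
    """Return the number of code points covered by inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)


def is_assigned(code_point, ranges=TAI_THAM_RANGES):
    """Return True if code_point falls inside one of the listed ranges."""
    return any(start <= code_point <= end for start, end in ranges)


print(count_assigned(TAI_THAM_RANGES))  # 63 + 29 + 11 + 10 + 14 = 127
```

Gaps such as U+1A5F lie inside the block but outside these ranges, so `is_assigned(0x1A5F)` is false.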
Line 19:
The following Unicode-related documents record the purpose and process of defining specific characters in the Tai Tham block:
{{sticky header}}
{| class="wikitable collapsible sticky-header"
|-
! [[Unicode#Versions|Version]] !! {{nobr|Final code points{{efn|name=final|Proposed code points and character names may differ from final code points and names}}}} !! Count !! [[International Committee for Information Technology Standards|L2]] ID !! [[ISO/IEC JTC 1/SC 2|WG2]] ID !! Document
|-
| rowspan="39" | 5.2{{efn|name=version|Changes to characters may have first taken effect in a later version of Unicode}} || rowspan="39" width="180" | U+1A20..1A5E, 1A60..1A7C, 1A7F..1A89, 1A90..1A99, 1AA0..1AAD || rowspan="39" | 127 || {{nobr|[https://www.unicode.org/L2/L1999/n2042.pdf L2/99-245]}} || [https://www.unicode.org/wg2/docs/n2042.pdf N2042] || {{Citation|title=Unicode Technical Report #3: Early Aramaic, Balti, Kirat (Limbu), Manipuri (Meitei) and Tai Lü scripts|date=1999-07-20|first1=Michael|last1=Everson|
|-
| {{nobr|X3L2/94-088}} || [
|-
| || {{nobr|[https://web.archive.org/web/20200215052615/http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1099.pdf N1099 (pdf],}} [https://web.archive.org/web/20200215052615/http://std.dkuug.dk/jtc1/sc2/wg2/docs/n1099typed.doc doc]) || {{Citation|title=The motion on coding of the Old Xishuang Banna Dai Writing Entering into BMP of ISO/IEC 10646|date=1994-10-10|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2004/04351-lanna.pdf L2/04-351]}} || || {{Citation|title=Lanna Unicode: A Draft Proposal|date=2004-06-28|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2005/05095r-lanna.pdf L2/05-095R]}} || || {{Citation|title=Lanna Unicode: A Proposal|date=2005-04-25|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2005/05166-dekalb-gk-vb.pdf L2/05-166]}} || || {{Citation|title=Towards a Computerization of the Lao Tham System of Writing|date=2005-07-15|first1=G.|last1=Kourilsky|first2=V.|last2=Berment|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2005/05188-respond-05166.pdf L2/05-188]}} || || {{Citation|title=Lao Tham in Terms of Lanna: a response to L2/05-166 from L2/05-095|date=2005-08-02|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06258r-n3121-lanna.pdf L2/06-258R]}} || [https://www.unicode.org/wg2/docs/n3121r.pdf N3121R] || {{Citation|title=Proposal for encoding the Lanna script in the BMP of the UCS|date=2006-09-09|first1=Michael|last1=Everson|first2=Martin|last2=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06311-lanna-comments.pdf L2/06-311]}} || [https://www.unicode.org/wg2/docs/n3159.pdf N3159] || {{Citation|title=Response to N3121R: Proposal for encoding the Lanna script in the BMP of the UCS|date=2006-09-20|first=Ngwe|last=Tun|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06319-n3161.pdf L2/06-319]}} || [https://www.unicode.org/wg2/docs/n3161.pdf N3161] || {{Citation|title=Opinions on N3121-Lanna script|date=2006-09-22|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2006/06320-n3169-lanna-adhoc.pdf L2/06-320]}} || [https://www.unicode.org/wg2/docs/n3169.pdf N3169R] || {{Citation|title=Lanna ad-hoc report|date=2006-09-26|first1=Zhuang|last1=Chen|first2=Michael|last2=Everson|first3=Martin|last3=Hosken|first4=Lin-Mei|last4=Wei|ref=none}}
|-
| || {{nobr|[https://www.unicode.org/wg2/docs/n3153.pdf N3153 (pdf],}} [https://www.unicode.org/wg2/docs/n3153.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29|date=2007-02-16|first=V. S.|last=Umamaheswaran|ref=none|section=M49.17}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07015.htm L2/07-015]}} || || {{Citation|title=UTC #110 Minutes|date=2007-02-08|first=Lisa|last=Moore|ref=none|section=Lanna (C.17)}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07007r-n3207r-lanna.pdf L2/07-007R]}} || [https://www.unicode.org/wg2/docs/n3207.pdf N3207] || {{Citation|title=Revised proposal for encoding the Lanna script in the BMP of the UCS|date=2007-03-21|first1=Michael|last1=Everson|first2=Martin|last2=Hosken|first3=Peter|last3=Constable|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07101-n3238.pdf L2/07-101]}} || [https://www.unicode.org/wg2/docs/n3238.pdf N3238] || {{Citation|title=Proposing on Encoding Old Tai Lue|date=2007-04-03|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07098-n3239-n3238-response.pdf L2/07-098]}} || [https://www.unicode.org/wg2/docs/n3239.pdf N3239] || {{Citation|title=Response to Chinese contribution N3238, "Proposing on Encoding Old Tai Lue"|date=2007-04-11|ref=none}}
|-
| || {{nobr|[https://www.unicode.org/wg2/docs/n3353.pdf N3353 (pdf],}} [https://www.unicode.org/wg2/docs/n3353.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27|date=2007-10-10|first=V. S.|last=Umamaheswaran|ref=none|section=M51.2}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07118.htm L2/07-118R2]}} || || {{Citation|title=UTC #111 Minutes|date=2007-05-23|first=Lisa|last=Moore|ref=none|section=111-C17}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07268-n3253.pdf L2/07-268]}} || {{nobr|[https://www.unicode.org/wg2/docs/n3253.pdf N3253 (pdf],}} [https://www.unicode.org/wg2/docs/n3253.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27|date=2007-07-26|first=V. S.|last=Umamaheswaran|ref=none|section=M50.10}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07307-n3313.pdf L2/07-307]}} || [https://www.unicode.org/wg2/docs/n3313.pdf N3313] || {{Citation|title=Comments on Lanna encoding in FPDAM4|date=2007-09-06|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07316-n3342.pdf L2/07-316]}} || [https://www.unicode.org/wg2/docs/n3342.pdf N3342] || {{Citation|title=Response to N3313
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07319-n3346.pdf L2/07-319]}} || [https://www.unicode.org/wg2/docs/n3346.pdf N3346] || {{Citation|title=Ad hoc report on Lanna|date=2007-09-19|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/07322r-n3349r.pdf L2/07-322]}} || [https://www.unicode.org/wg2/docs/n3349.pdf N3349R] || {{Citation|title=Summary of repertoire for FPDAM 5 of ISO/IEC 10646:2003 and future amendments|date=2007-09-28|first=Michael|last=Everson|ref=none|section=Tai Tham}}
|-
| {{nobr|[https://www.unicode.org/L2/L2007/
|-
| {{nobr|[https://www.unicode.org/L2/L2007/
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08037r2-n3379r2.pdf L2/08-037R2]}} || [https://www.unicode.org/wg2/docs/n3379.pdf N3379R2] || {{Citation|title=Tai Tham Ad-hoc Meeting Report|date=2008-04-18|first=Peter|last=Constable|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08073-subjoin-tham.pdf L2/08-073]}} || [https://www.unicode.org/wg2/docs/n3384.pdf N3384] || {{Citation|title=Tai Tham Subjoined Variants|date=2008-01-28|first=Martin|last=Hosken|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08003.htm L2/08-003]}} || || {{Citation|title=UTC #114 Minutes|date=2008-02-14|first=Lisa|last=Moore|ref=none|section=Tai Tham}}
|-
| {{nobr|[https://www.unicode.org/L2/L2008/08318-n3453.pdf L2/08-318]}} || {{nobr|[https://www.unicode.org/wg2/docs/n3453.pdf N3453 (pdf],}} [https://www.unicode.org/wg2/docs/n3453.doc doc]) || {{Citation|title=Unconfirmed minutes of WG 2 meeting 52|date=2008-08-13|first=V. S.|last=Umamaheswaran|ref=none|section=M52.2a}}
|-
| {{nobr|[https://www.unicode.org/L2/L2014/14126r-indic-properties.pdf L2/14-
|-
| {{nobr|[https://www.unicode.org/L2/L2014/14177.htm L2/14-177]}} || || {{Citation|title=UTC #140 Minutes|date=2014-
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17120-isc-corrections.pdf L2/17-120]}} || || {{Citation|title=Corrections to the Indic Syllabic Category for the Tai Tham Script [Affects U+1A57, 1A5A-1A5E, 1A74, and 1A7A]|date=2017-
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17169-tai-tham-category.txt L2/17-169]}} || || {{Citation|title=Proposed Indic Syllabic Category changes for Tai Tham for Unicode 10 [Affects U+1A57, 1A5A-1A5E, 1A74, and 1A7A]|date=2017-05-12|first=Roozbeh|last=Pournader|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2017/17103.htm L2/17-103]}} || || {{Citation|title=UTC #151 Minutes|date=2017-05-18|first=Lisa|last=Moore|ref=none|section=B.14.9
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18053-consonant-suffixed.txt L2/18-053]}} || || {{Citation|title=New Indic Syllabic Category Consonant_Initial_Postfixed [Affects U+1A5A]|date=2018-01-24|first=Roozbeh|last=Pournader|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18007.htm L2/18-007]}} || || {{Citation|title=UTC #154 Minutes|date=2018-03-19|first=Lisa|last=Moore|ref=none|section=B.14.7
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18171-vowels-below.pdf L2/18-171]}} || || {{Citation|title=Positioning of Tai Tham Vowels Below [Affects U+1A69 and 1A6A]|date=2018-04-29|first=Richard|last=Wordingham|ref=none}}
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18241-script-ad-hoc.pdf L2/18-241]}} || || {{Citation|title=Recommendations to UTC # 156 July 2018 on Script Proposals
|-
| {{nobr|[https://www.unicode.org/L2/L2018/18183.htm L2/18-183]}} || || {{Citation|title=UTC #156 Minutes|date=2018-11-20|first=Lisa|last=Moore|ref=none|section=D.12 Positioning of Tai Tham vowels below
|- class="sortbottom"
| colspan="6" | {{notelist}}
Line 116 ⟶ 112:
If a consonant has two subscript forms and the choice affects the meaning, the form typically used for syllable-final consonants is encoded with SAKOT, and the other form has its own code point. Seven consonants have distinct subscript forms of this kind: {{lang|nod|ᩁ}} RA, {{lang|nod|ᩃ}} LA, {{lang|nod|ᨷ}} BA, {{lang|nod|ᩈ}} HIGH SA, {{lang|nod|ᨾ}} MA, {{lang|nod|ᨳ}} HIGH RATA, and {{lang|nod|ᨻ}} LOW PA.
{{lang|nod|ᨣᩕᩪ}} ({{IPA
{{lang|nod|ᨠᩣ᩠ᩁ}} ({{IPA
'''U+1A60 SAKOT, U+1A41 RA'''><ref name=N3207R/>{{rp|at=Section 4}}
{{lang|nod|ᩆᩦ᩠ᩃ}} ({{IPA
'''U+1A60 SAKOT, U+1A43 LA'''><ref name=N3207R/>{{rp|at=Section 14.5}} but {{lang|nod|ᨸᩖᩦ}} ({{IPA
U+1A57 SIGN LA TANG LAI looks like <U+1A60 SAKOT, U+1A43 LA> but is in origin a ligature of it with <U+1A60 SAKOT, U+
{{
|language=English
|title=Tai Lue: Complex Orthographic Rules: Graphic Blends(I)
Line 134 ⟶ 130:
<!-- See also http://www.seasite.niu.edu/tai/TaiLue/excerpt.htm -->
{{lang|nod|ᨣᩝᩴ}} ({{IPA
{{lang|nod|ᨠᩢᨷ᩠ᨷ᩺}} ({{IPA
:In the final proposal,<ref name="N3207R">
{{
|language=en
|title=Revised proposal for encoding the Lanna script in the BMP of the UCS
|
|
|authorlink1=Michael_Everson
|last2=Hosken
Line 153 ⟶ 148:
}}
</ref>{{rp|page=1}} which the [[Unicode Consortium]] accepted, what is now SIGN BA (as in {{lang|nod|ᨣᩝᩴ}}) would be encoded as <SAKOT, BA> and what is now <SAKOT, BA> (as in {{lang|nod|ᨠᩢ᩠ᨷ}}) would be encoded as <SAKOT, HIGH PA>, but during the ISO process the meaning of <SAKOT, BA> changed<ref name="N3384">
{{
|language=English
|title=Tai Tham Subjoined Variants
Line 172 ⟶ 166:
Tai Khuen has an additional way of writing subscript MA. This additional form has its own code point<ref name="N3379">
{{
|language=en
|title=Tai Tham Ad-hoc Meeting Report (WG2 N3379)
Line 196 ⟶ 189:
}}</ref>{{rp|page=368}} is encoded as <U+1A36 NA, U+1A65 SIGN I, U+1A23 LOW KA,
U+1A31 RANA, '''U+1A5B SIGN HIGH RATHA OR LOW PA'''>:
{{lang|nod|[[Rajabhat University system
<U+1A41 RA, U+1A63 SIGN AA, U+1A29 LOW CA, U+1A3D LOW PHA, U+1A62 MAI SAT, '''U+1A60 SAKOT, U+1A2E HIGH RATHA'''>.
{{lang|nod|ᨶᩥᨻᩛᩣᨶ}} is encoded as <U+1A36 NA, U+1A65 SIGN I, U+1A3B LOW PA, '''U+1A5B SIGN HIGH RATHA OR LOW PA''', U+1A63 SIGN AA, U+1A36 NA>:
{{lang|nod|ᨴᩮ᩠ᨻ}} is encoded as <U+1A34 LOW TA, U+1A6E SIGN E, '''U+1A60 SAKOT, U+1A3B LOW PA'''>.
The latter word is also written as {{lang|nod|ᨴᩮ᩠ᨷ}}.
The Lao-style consonant conjunct {{lang|lo|ᨲ᩠ᨳ}} (encoded as <U+1A32 HIGH TA, U+1A60 SAKOT, U+1A33 HIGH THA>) looks as though it is {{lang|lo|ᨲᩛ}} encoded as <U+1A32 HIGH TA, U+1A5B SIGN HIGH RATHA OR LOW PA>.
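Because the two sequences differ only in how they are encoded, a comparison of the underlying strings tells them apart even when a font renders them alike. The following is a minimal sketch using Python's standard `unicodedata` module; the variable and function names are illustrative.

```python
import unicodedata

# The two encodings discussed above, written with escapes so that the
# source does not depend on font rendering.
conjunct = "\u1A32\u1A60\u1A33"  # HIGH TA, SAKOT, HIGH THA
with_sign = "\u1A32\u1A5B"       # HIGH TA, SIGN HIGH RATHA OR LOW PA


def is_same_encoding(a, b):
    """Compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Normalization does not unify the two spellings, so searching or
# sorting treats them as different text.
print(is_same_encoding(conjunct, with_sign))
print(len(conjunct), len(with_sign))
```

A search for one spelling will therefore not find the other unless the application adds its own folding rule.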
The dependent vowel of words like {{lang|nod|ᨯᩬᨠ}} 'flower' is encoded by the special vowel <U+1A6C SIGN OA BELOW>; one should not use the sequence <U+1A60 SAKOT, U+1A4B LETTER A>. There is also an encoded dependent vowel for Tai Khuen, Tai Lue and Lao words such as {{lang|kkh|ᨶ᩶ᩭ}}, namely U+1A6D SIGN OY. This vowel is not encoded as <U+1A6C SIGN OA BELOW, U+1A60 SAKOT, U+1A3F LOW YA> (which is what Northern Thai uses for the corresponding words), nor is it the sequence <U+1A60 SAKOT, U+1A40 HIGH YA>.<ref name=N3207R/>{{rp|at=Section 5}}
Line 208 ⟶ 201:
Superscript consonants are encoded independently of the base consonants. Some characters serve both as superscript consonants and in other roles, and are therefore discussed further in this section.
[[Anusvara|Niggahita]] is encoded as U+1A74 MAI KANG. Superscript WA is not encoded separately; it is also encoded as MAI KANG. For example, Tai Khuen {{lang|kkh|ᨯ᩠ᨿᩴ}} ({{IPA
Superscript cluster-initial NGA is encoded as U+1A58 MAI KANG LAI. Note that Lao generally uses the same glyph for MAI KANG LAI and U+1A59 SIGN FINAL NGA.
Line 214 ⟶ 207:
U+1A62 MAI SAT serves three roles: it is a vowel, a final consonant, and a vowel shortener.
Choosing the encoding of the superscript form of RA and the vowel killers was difficult. In the
==Special Consonants==
The special forms {{lang|nod|ᩓ}} and {{lang|nod|ᩕ}} are encoded by the code points U+1A53 and U+1A55 respectively.
If the glyphs of U+1A36 NA and U+1A63 SIGN AA would be side by side they are written as the ligature {{lang|nod|ᨶᩣ}} rather than as two separate glyphs {{lang|nod|ᨶ‌ᩣ}}. They are written as a ligature even if the NA has a subscript consonant or a non-following mark attached. Examples: {{lang|nod|ᨾᨶ᩠ᨲᩣ}} ({{IPA
The geminate consonant {{lang|nod|ᩔ}} is encoded separately because the word {{lang|nod|ᩅᩥᩈᩮ᩠ᩈ}} ({{IPA
By contrast, the geminate consonant {{lang|nod|ᨬ᩠ᨬ}} is encoded as the conjunct <U+1A2C NYA, U+1A60 SAKOT, U+1A2C NYA>, even though some of its glyphs may resemble the hypothetical conjunct {{lang|nod|ᨱ᩠ᨬ}} <U+1A31 RANA, U+1A60 SAKOT, U+1A2C NYA>.
Line 251 ⟶ 244:
The 'onset letters' are consonants, independent vowels or special symbols. The consonants in a group are ordered according to the order in which they are sounded or used to be sounded.
Example: {{lang|nod|ᨻᩩᨴ᩠ᨵ}} ({{IPA
:onset letter: {{lang|nod|ᨻ}}
:pure vowel: {{lang|nod| ᩩ}}
Line 261 ⟶ 254:
The encoding is <U+1A3B LOW PA, U+1A69 SIGN U, U+1A34 LOW TA, U+1A60 SAKOT, U+1A35 LOW THA>.
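The angle-bracket notation used for encodings in this article can be converted to an actual string and back for inspection. The following is a minimal sketch; `parse_sequence` and `to_code_points` are hypothetical helper names, and the parser assumes every code point is written as `U+` followed by 4 to 6 hexadecimal digits.

```python
import re


def parse_sequence(notation):
    """Turn '<U+1A3B LOW PA, U+1A69 SIGN U>' into the string it denotes.

    Only the 'U+XXXX' parts are read; the character names are ignored.
    """
    code_points = re.findall(r"U\+([0-9A-Fa-f]{4,6})", notation)
    return "".join(chr(int(cp, 16)) for cp in code_points)


def to_code_points(text):
    """List the code points of a string in U+XXXX form."""
    return [f"U+{ord(ch):04X}" for ch in text]


word = parse_sequence(
    "<U+1A3B LOW PA, U+1A69 SIGN U, U+1A34 LOW TA, "
    "U+1A60 SAKOT, U+1A35 LOW THA>"
)
print(to_code_points(word))
```

Listing code points this way shows the stored order, which is not always the order in which the glyphs appear.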
Example: {{lang|nod|ᨻᩕ}} has a single consonant sound {{IPA
Apart from MEDIAL RA, the order of the consonant glyphs is the same as the order of the sounds. In most cases MEDIAL RA is the last consonant but the WA of /ua/ and the LOW YA of /ia/ follow MEDIAL RA.
Line 269 ⟶ 262:
:{{lang|nod|ᨠᩕᩈᩢ᩠ᨲ}} is encoded <U+1A20 HIGH KA, U+1A55 MEDIAL RA, U+1A48 HIGH SA, U+1A62 MAI SAT, U+1A60 SAKOT, U+1A32 HIGH TA>.
:{{lang|nod|ᩈᩕ᩠ᩅᨾ}} is encoded <U+1A48 HIGH SA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A45 WA, U+1A3E MA>.
:But {{lang|nod|ᨲᩕ᩠ᨶᩬᨾ}} ({{IPA
For words like {{lang|nod|ᨧᩮᩢ᩶ᩣ}} there is the rule that symbols for vowels and tones have the order:<ref name=N3207R/>{{rp|at=Section 5 first part, 5.3 and 13}}
Line 284 ⟶ 277:
Examples:
:{{lang|nod|ᨧᩮᩢ᩶ᩣ}} is encoded as <U+1A27 HIGH CA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A76 TONE-2, U+1A63 SIGN AA><ref name="N3207R"/>{{rp|at=Section 5 no. 29}}
:{{lang|nod|ᨾᩢᩣ}} ({{IPA
:{{lang|nod|ᩃᩪᩢ}} ({{IPA
:{{lang|nod|ᨶᩮᩢᩣ}} is encoded as <U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA>
:{{lang|nod|ᩋᩫᨶ᩠ᨲᩕᩣ᩠ᨿ}} ({{IPA
For /ia/ and /ua/ in all their forms, subscript LOW YA and WA are reckoned as onset consonants.<ref name="N3207R"/>{{rp|at=Section 14.3}}
Line 293 ⟶ 286:
Examples:
:{{lang|nod|ᩈ᩠ᨿᩮ}} is actually encoded <U+1A48 HIGH SA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A6E SIGN E><ref name="N3207R"/>{{rp|at=Section 5 No. 33}}
:{{lang|nod|
:{{lang|nod|ᨲ᩠ᩅᩫ}} is actually encoded <U+1A32 HIGH TA, U+1A60 SAKOT, U+1A45 WA, U+1A6B SIGN O><ref name="N3207R"/>{{rp|at=Section 14.3}}
:{{lang|nod|ᩈ᩠ᩅ᩵ᩁ}} is actually encoded <U+1A48 HIGH SA, U+1A60 SAKOT, U+1A45 WA, U+1A75 TONE-1, U+1A41 RA>
:{{lang|nod|ᨠᩖ᩠ᩅ᩠᩶ᨿ}} is actually encoded as <U+1A20 KA, U+1A56 MEDIAL LA, U+1A60 SAKOT, U+1A45 WA, U+
::(<U+1A60, U+1A76> is canonically equivalent to <U+1A76, U+1A60>)
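The canonical equivalence noted above can be observed with a normalization library. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `canonically_equivalent` is illustrative, and the printed combining classes depend on the Unicode data shipped with the Python version in use.

```python
import unicodedata

SAKOT = "\u1A60"   # U+1A60 SAKOT
TONE_2 = "\u1A76"  # U+1A76 TONE-2


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


sakot_first = SAKOT + TONE_2
tone_first = TONE_2 + SAKOT

# Marks with different non-zero combining classes are reordered during
# normalization, which is why both input orders normalize identically.
print(unicodedata.combining(SAKOT), unicodedata.combining(TONE_2))
print(canonically_equivalent(sakot_first, tone_first))
```

Comparing text only after normalization avoids treating the two input orders as different words.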
Outside Northern Thailand, the MAI KANG in the symbol for /am/ is written on the SIGN AA component. In Northern Thailand, its position varies: on the consonant, on the SIGN AA, or between them. The Unicode Consortium declined to encode a special character for the combination. The word {{lang|nod|ᨷᩴ᩠᩵ᨾᩣ}} ({{IPA
U+1A5A SIGN LOW PA is a special case; the Tai Lue word {{lang|khb|ᨣᨽᩚ}} ({{IPA
Examples showing MAI KANG LAI and LA TANG LAI:
:Pali word {{lang|nod|ᩈᩘᨥᩮᩣ}} (saṅgho) is encoded <U+1A48 HIGH SA, U+1A58 MAI KANG LAI, U+1A25 LOW KHA, U+1A6E SIGN E, U+1A63 SIGN AA>.
:
:Tai Lue word {{lang|khb|ᨴᩢᩗᩣ}} ({{IPA
==External links==