{{permprot}}
{{tfd-kept|entry=Wikipedia:Templates for deletion/Log/2006 February 20#Language templates}}
= Documentation =
== Usage ==
The purpose of this template is to indicate that a given span of text belongs to a particular language (see [[language code]]).
<nowiki>{{</nowiki>lang|''Language tag''|''Text''}}
Use [[List of ISO 639 codes|ISO 639 language codes]]. Example (where <tt>fr</tt> is the code for [[French language|French]]):
<pre>
* She said: "{{lang|fr|''Je suis Française.''}}"
</pre>
Results in your browser:
* She said: "{{lang|fr|''Je suis Française.''}}"
There are also versions of this template for specific languages that additionally print the language's name, intended for the first time that language appears in the article. For example, "{{tlx|lang-es|Español}}" and "{{tlx|lang-ru|русский язык}}" give "{{lang-es|Español}}" and "{{lang-ru|русский язык}}".
Language subtags can also be used to indicate writing script or regional variation of a language. According to the [[W3C]], "The golden rule when creating language tags is to keep the tag as short as possible",[http://www.w3.org/International/articles/language-tags/Overview.en.php] so such subtags should only be added if there is an important reason to use them. [[ISO 639-1]] is preferred over [[ISO 639-2]] and [[ISO 639-3]].
=== Indicating writing script ===
If necessary, add the [[List of ISO 15924 codes by letter code|ISO 15924]] code to indicate the script.
For example, [[Russian language|Russian]] is usually written in the [[Cyrillic alphabet]], so the '<tt>Cyrl</tt>' script code is superfluous and the language code is <tt>ru</tt> rather than <tt>ru-Cyrl</tt>. However, when that text is [[transliteration|transliterated]], the <tt>Latn</tt> code (Latin script) should be used because Latin isn't the default script for Russian: <tt>ru-Latn</tt>. Example:
<pre>
* Moscow ([[Russian language|Russian]]: {{lang|ru|Москва́}}, {{lang|ru-Latn|''Moskva''}})
</pre>
which is the same as
<pre>
* Moscow ({{lang-ru|Москва́}}, {{lang|ru-Latn|''Moskva''}})
</pre>
Results in your browser:
* Moscow ({{lang-ru|Москва́}}, {{lang|ru-Latn|''Moskva''}})
<nowiki>{{lang|ru-Latn|''Moskva''}}</nowiki> is equivalent to <nowiki>{{transl|ru|''Moskva''}}</nowiki>. To specify that you are using the [[ISO 9]] transliteration of Cyrillic, use <nowiki>{{transl|ru|ISO|''Moskva''}}</nowiki>:
* Moscow ({{lang-ru|Москва́}}, [[ISO 9]]: {{transl|ru|ISO|''Moskva''}})
[[Internet Assigned Numbers Authority|IANA]] maintains a list specifying when the script tag should be suppressed.[http://www.iana.org/assignments/language-subtag-registry] In some cases the script must always be specified, as with [[Tajik language|Tajik]], which can equally be written in the [[Arabic alphabet|Arabic]], [[Latin alphabet|Latin]] or Cyrillic alphabets:
<pre>
* Tajik ({{rtl-lang|tg-Arab|تاجیکی}}, {{lang|tg-Latn|''tojikī''}}, {{lang|tg-Cyrl|тоҷикӣ}})
</pre>
Which results in your browser:
* Tajik ({{rtl-lang|tg-Arab|تاجیکی}}, {{lang|tg-Latn|''tojikī''}}, {{lang|tg-Cyrl|тоҷикӣ}})
Note the use of {{tl|rtl-lang}} instead of {{tl|lang}} when using the Arabic script (see the [[#Writing direction|Writing direction]] section below).
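The suppression rule described in this section can be sketched in a few lines of Python. This is only an illustration: the <tt>SUPPRESS_SCRIPT</tt> table is a small hand-copied excerpt (the IANA registry remains the authoritative source), and <tt>minimal_tag</tt> is a hypothetical helper, not part of any template.

```python
# Hand-copied excerpt for illustration only; consult the IANA Language
# Subtag Registry for the authoritative "Suppress-Script" values.
SUPPRESS_SCRIPT = {
    "ru": "Cyrl",  # Russian is normally written in Cyrillic
    "en": "Latn",
    "fr": "Latn",
    # "tg" (Tajik) is deliberately absent: no default script is assumed here
}


def minimal_tag(language, script=None):
    """Return the shortest tag, dropping a script that is the language's default."""
    if script is None or SUPPRESS_SCRIPT.get(language) == script:
        return language
    return f"{language}-{script}"


print(minimal_tag("ru", "Cyrl"))  # ru
print(minimal_tag("ru", "Latn"))  # ru-Latn
print(minimal_tag("tg", "Cyrl"))  # tg-Cyrl
```

The script is kept whenever the table has no entry for the language, which mirrors the Tajik case above where the script must always be written out.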
=== Indicating regional variant ===
In some cases it may be necessary to add an [[ISO 3166-1 alpha-2]] country code to indicate usage specific to that country. All three codes can appear in the same tag: for example, the code <tt>zh-Hant-TW</tt> is used for [[Chinese language|Chinese]] text written with [[Traditional Chinese characters|Traditional Han characters]], containing words or expressions specific to [[Taiwan]]. Example:
<pre>
* {{lang|zh-Hant-TW|臺灣}}
</pre>
Results in your browser:
* {{lang|zh-Hant-TW|臺灣}}
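To make the structure of such a tag concrete, here is a minimal Python sketch that splits a tag into its language, script and region subtags. The function name <tt>split_tag</tt> is hypothetical, and the sketch assumes only the simple <tt>language[-Script][-REGION]</tt> shapes used on this page rather than the full grammar of RFC 4646.

```python
def split_tag(tag):
    """Split a simple language tag into language, script and region subtags.

    Simplified on purpose: variants, extensions and private-use subtags
    are ignored.
    """
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            result["script"] = part.title()  # ISO 15924 codes have four letters
        elif len(part) == 2 and part.isalpha():
            result["region"] = part.upper()  # ISO 3166-1 alpha-2 codes have two
    return result


print(split_tag("zh-Hant-TW"))
# {'language': 'zh', 'script': 'Hant', 'region': 'TW'}
```

Subtags are told apart by length alone, which is why the order and letter case conventions shown above are worth following.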
=== Writing direction ===
<s>{{tl|rtl-lang}} is a specific template for right-to-left languages like [[Arabic language|Arabic]] or [[Hebrew language|Hebrew]].</s>
For right-to-left paragraphs (as opposed to right-to-left strings embedded in an English paragraph), use {{tl|rtl-para}}.
== Rationale ==
* Web browsers can use the information to choose an appropriate font.
** This is great for [[CJK]] where a character can be given its language-specific shape but will fall back to another form if no appropriate font is found or if the preferred font lacks that character, for example because the language does not make use of that character: see [[User:Wikipeditor/CJK|this comparison table and screenshot]].
* For [[web accessibility|accessibility]] purposes: [[screen reader]]s need language information to produce correct audio output.
* For [[spell checker]]s and grammar checkers.
* To help browsers choose appropriate [[quotation mark]]s and make decisions about [[hyphenation]], [[ligature (typography)|ligature]]s, and spacing.
* Users can apply styles to languages in their [[style sheet]]s (useful for editors).
* Google and other [[search engine]]s can use this information when [[indexing]] text.
* Could be useful for application developers who re-publish Wikipedia.
* Could be useful for research or compiling statistics about language use in Wikipedia.
== Applying styles ==
You can apply [[Cascading Style Sheets|CSS]] styles in your user style sheet. Registered users can put styles into User:XXX/monobook.css, where ''XXX'' is the user name.
These examples will not work in Internet Explorer, because it doesn't support attribute selectors. Try [[Firefox (browser) |Firefox]].
Example: to apply a font to Russian-language text:
span[lang|=ru] { font-family: fonteskaya; }
Example: to apply a colour to text marked with any language:
span[lang] { color: green; }
Note: don't use quotation marks in your user style sheet. Wikitext will mangle them. They are recommended in CSS, but not required.
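The <tt>|=</tt> operator in the first selector matches more than the bare code. As a rough illustration, the following Python sketch mimics how <tt>[lang|=ru]</tt> decides whether an element's <tt>lang</tt> value matches. The function name is hypothetical, and the sketch compares case exactly, which is a simplification.

```python
def matches_lang_prefix(lang_value, selector_value):
    """Mimic the CSS [lang|=value] attribute selector.

    The selector matches when the attribute is exactly the value, or
    starts with the value immediately followed by a hyphen.
    """
    return (lang_value == selector_value
            or lang_value.startswith(selector_value + "-"))


print(matches_lang_prefix("ru", "ru"))       # True
print(matches_lang_prefix("ru-Latn", "ru"))  # True
print(matches_lang_prefix("uk", "ru"))       # False
```

So a rule written for <tt>ru</tt> also styles transliterated text tagged <tt>ru-Latn</tt>, while text in other languages is left alone.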
== See also ==
* [[:Category:Multilingual support templates]]
* [[List of ISO 639 codes]] (language codes)
* [[List of ISO 15924 codes]] (script codes)
* {{tl|rtl-lang}}, for right-to-left scripts like Arabic or Hebrew
* {{tl|transl}} and [[list of ISO transliterations]]
<!-- Can the functionality of the following template be merged into {{lang}}/{{rtl-lang}}?: * {{tl|script}} and [[list of writing systems]]-->
* [[:Category:Language icons]], templates for visually marking external links to foreign-language content
== References ==
* [[World Wide Web Consortium|W3C]]
** [http://www.w3.org/International/articles/language-tags/Overview.en.php Language tags in HTML and XML]—overview
** [http://www.w3.org/TR/i18n-html-tech-lang/ Internationalization Best Practices: Specifying Language in XHTML & HTML Content]—W3C Working Draft 21 July 2006
** [http://www.w3.org/International/articles/bcp47/ Understanding the New Language Tags]
** [http://www.w3.org/International/questions/qa-css-lang FAQ: Styling using the lang attribute]
* [[Internet Assigned Numbers Authority|IANA]]
** [http://www.iana.org/assignments/language-subtag-registry IANA Language Subtag Registry]
** [http://www.rfc-editor.org/rfc/rfc4646.txt Tags for Identifying Languages] (RFC 4646)
** [http://www.rfc-editor.org/rfc/rfc4647.txt Matching of Language Tags] (RFC 4647)
** [http://www.iana.org/assignments/language-tags Language tags]—(obsolete per RFC 4646)
= Discussion =
== lang-ru, lang-ar, ... ==
There are two templates right now that have similar functionality, [[Template:lang-ru|lang-ru]] and [[Template:lang-ar|lang-ar]]. (There may be others; these are the only two I know of). In addition to adding the span tag, these templates also prefix the encompassed text with the name of the language and a link to the relevant Wiki article.
I was thinking these two templates could be rewritten in terms of a simpler "lang" template, in the form:
<nowiki>[[{{{2}}} language|{{{2}}}]]: <span lang="{{{1}}}" xml:lang="{{{1}}}">{{{3}}}</span></nowiki>
This would be used like:
<nowiki>{{lang|ar|Arabic|لرررررل}}</nowiki>
And produce something like:
{{lang-ar|لرررررل}}
But then I found that this similar [[Template:lang|lang]] template already exists. Since this template doesn't seem to be used in too many places, would any of its authors mind if I rewrote it to work like the above? Or perhaps this template could be left as-is and a "langWithName" template could be created for the above. What do people think?
— <span style="font-size:80%">'''[[User:J'raxis|J’raxis]]''' ('''[[User talk:J'raxis|T]]''')</span> 18:22:26, 2005-08-03 (UTC)
: There's also [[:Template:lang-uk |lang-uk]]. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2005-08-3 21:55 Z</small>''
In Russian wikipedia, there are a lot of these '''lang-xx''' templates (see [[:ru:Википедия:Шаблоны:Языки]]). They are really handy in that form. As for the current '''lang''' template, it should be left as it is — very often foreign words or phrases are inserted many times in the same article, and it would be annoying if they are every time accompanied by (e.g.) German: .... German: .... German: .... — [[User:Monedula|Monedula]] 05:56, 4 August 2005 (UTC)
:Yep, there's a use for both of these types of templates. I'm going to make a new <nowiki>{{</nowiki>[[Template:langWithName|langWithName]]}} template and rewrite the three ''lang-xx'' ones in terms of it.
: — <span style="font-size:80%">'''[[User:J'raxis|J’raxis]]''' ('''[[User talk:J'raxis|T]]''')</span> 20:09:47, 2005-08-04 (UTC)
== Language templates are being removed ==
Language templates are being unilaterally removed by one user. Please see related discussion at [[template talk:lang-uk]]. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2005-10-16 22:29 Z</small>''
== <code>span lang</code> vs. <code>font lang</code> ==
Can anybody explain or illustrate the difference? [[User:Wikipeditor|Wikipeditor]] 10:48, 10 January 2006 (UTC)
:No real difference at all. But <code>span</code> is preferable, since it does not suggest that it is about fonts. Formerly, Wikipedia did not allow the use of <code>span</code>, so <code>font</code> was used instead. — [[User:Monedula|Monedula]] 11:11, 10 January 2006 (UTC)
::Thank you! I see either tag's main use in helping the browser to choose an appropriate font, as mentioned [[#just|above]]. I gather from your reply that there aren't really any situations where <code>font lang</code> is more appropriate than <code>span lang</code>.—[[User:Wikipeditor|Wikipeditor]]
:::In fact, the <code>font</code> tag is deprecated in HTML 4.0 and absent in XHTML. — [[User:Monedula|Monedula]] 06:39, 13 January 2006 (UTC)
== What should be tagged? ==
Question: should we mark transliterations? (I think not) ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2005-01-27 10:17 Z</small>''
:Why not? IMHO in the general case it is more important to tag transliterations than "original" words (words written with a different script are obviously not in English). --[[User:Suruena|surueña]] 10:38, 10 March 2006 (UTC)
::Because it looks bad. Example:
::A user chooses in their Firefox options to show "Western" (i.e. English and such languages) text in the font [[Gentium]], Cyrillic text in the font Georgia, and Simplified Chinese text in the font SimSun or SimHei.
::Now when this user sees a word in Cyrillic letters (e.g. {{lang|mk|Југославија}}), it will appear in Georgia characters if the template:lang is used, and as an irritating mix of Gentium characters (those whose shape also occurs in the Latin alphabet: Ј, о, с, а, ј) and Georgia characters (specifically Cyrillic characters: у, г, л, в, и) if it is not used.
::When the same user sees a Hanyu Pinyin romanisation (e.g. ''Bái Máo Nǚ''), it will be displayed in ugly SimSun or SimHei if the template is used to mark it as zh-CN, or in highly legible Gentium if not.
::If it weren't for this problem, I agree that it wouldn't make much sense to treat romanised text in a foreign language as Western, e.g. to tag pinyin as English in an article in the Chinese Wikipedia.<br/>[[User:Wikipeditor|Wikipeditor]] 18:39, 10 March 2006 (UTC)
:::[[Image:Template lang-roman numerals.png|frame|Screenshot with Firefox 1.5 under Linux]]
:::I'm mainly interested in this template for accessibility reasons (using the lang attribute is an important point specified in the [[Web Accessibility Initiative#Web Content Accessibility Guidelines (WCAG)|Web Content Accessibility Guidelines]])[http://www.w3.org/TR/WAI-WEBCONTENT/wai-pageauth.html#tech-identify-changes] because text-to-speech browsers need to know the language of every word (the pronunciation of an English word is very different from that of a French one, for example).
:::But you are right, if it looks ugly nobody will use it. Once I tried to use Unicode numerals[http://www.unicode.org/charts/PDF/U2150.pdf] instead of plain ASCII letters (yes, to allow screen readers to pronounce "the second" instead of "{{IPA|/i i/}}"), but in my browser (Firefox 1.5) it was illegible in boldface:
::::'''[[Ferdinand II]]''' -> '''Ferdinand Ⅱ'''
:::Maybe a new template should be created for [[transliteration]]s? --[[User:Suruena|surueña]] 20:17, 10 March 2006 (UTC)
::On a related note, do you think there is a way to use [[ISO 15924]] to have Simplified or Traditional Chinese appear as such in Wikipedia articles, instead of using [[ISO 3166-1 alpha-2]]? It feels so wrong to use country codes sometimes.<br/>[[User:Wikipeditor|Wikipeditor]] 18:39, 10 March 2006 (UTC)
:::I've never heard about that before, sorry. Maybe a new template like [[Template:Polytonic]] can be created for that. However, from the accessibility POV, it's not very useful to specify the script instead of the language. I mean, it's OK to create accessible templates like [[Template:Polytonic]], because it "specifies the script" (through <tt>class="polytonic"</tt>, i.e. useful to style sheets, although this isn't an HTML-standardized method of indicating the script) as well as the <strong>language</strong>. Please don't create templates that only specify the writing system.
:::But you are right, I don't think it's a good solution to tag Traditional Chinese with <tt>zh-tw</tt> and Simplified Chinese with <tt>zh-cn</tt>. --[[User:Suruena|surueña]] 20:17, 10 March 2006 (UTC)
::::Yes, it seems the W3C recommends the use of <tt>zh-Hant</tt> and <tt>zh-Hans</tt> language codes for Traditional and Simplified Chinese respectively, i.e. the [[ISO 639-1]] language tag followed by the [[ISO 15924]] script tag, as you suggest.[http://www.w3.org/TR/i18n-html-tech-lang/#ri20040429.113217290] I just modified some templates like [[Template:Chinese]] to follow these guidelines. It seems the language tags are richer than I expected, as there is a standard procedure to specify after the language the script, region, variant and even a private tag, see [http://www.w3.org/International/articles/language-tags/] --[[User:Suruena|surueña]] 10:28, 21 September 2006 (UTC)
I've come to the conclusion that all text in another language should be marked as that language. The HTML lang attribute does not indicate character set or alphabet, merely language—how it's displayed is a browser issue. So I would use the template for, e.g., both Cyrillic and romanized versions of a name in Ukrainian: <nowiki>{{lang |uk|Україна, ''Ukrayina''}}</nowiki>, yielding: {{lang |uk|Україна, ''Ukrayina''}}.
Regarding Firefox's mixed-font display, I don't think this should happen if the name is entered correctly. Please note that ''no'' Unicode letters occur in both Cyrillic and Latin alphabets. Don't use Latin ''a'' in place of Cyrillic ''а''! How does this appear in Firefox?
* without template: Југославија
* with lang|mk: {{lang|mk|Југославија}}
Won't Firefox just switch to a different font if necessary even when you don't specify it? Safari displays everything on this discussion page correctly in Lucida Grande, without any font shifts, and it doesn't even ask the user to choose fonts for different scripts. Why not pick a broad Unicode font like Lucida Grande, Arial Unicode, or Gentium as your default font for everything? ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-03-10 23:02 Z</small>''
:It seems to be a font issue. When I set Firefox to use a serif font and make Gentium the standard serif font for both Western and Cyrillic, the following happens:
:* Regular and bold text: у, г, л, в, и are displayed correctly (in the default font for Cyrillic where the template is used, in the default font for Western where it is not. In this case, Gentium is both.)
:* Italic and italic bold text: у, г, л, в, и are displayed in the default sans font for “other languages”. All other letters are displayed in Gentium.—[[User:Wikipeditor|Wikipeditor]]
:: Strange; Gentium does have an italic font: are you sure you have the italic version installed? And come to think of it, Gentium only has enough Cyrillic letters for Russian and Bulgarian, so it's just not a great choice for an international font if you're going to need other Cyrillic-alphabet languages. Sorry.
:: I think Arial Unicode has a pretty good range, although it displays a few characters incorrectly (see [[International Phonetic Alphabet|#Affricates and double articulation]]). ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-03-11 03:47 Z</small>''
:[[Image:Template lang-cyrillic and latin scripts.png|right|frame|Screenshot with Firefox 1.5 under Linux]]
:This is how I see it on my machine. They are rendered with different fonts (see the ''J''), maybe too big when the language tag is used. Both look OK to me, don't you think?
:--[[User:Suruena|surueña]] 11:50, 11 March 2006 (UTC)
<br clear="both" />
I propose using these language templates with every foreign word, i.e. words that currently should be put in italics in Wikipedia (except those commonly used in English, see [[Wikipedia:Manual of Style#Loan words]]), and also with foreign proper names (though I'm not completely sure about those). However, as [[User:Wikipeditor|Wikipeditor]] has said, they sometimes look very bad; for example, see the following test I made on [[Akira (film)]]:
[[Image:Template_lang-jap_and_latin_scripts.png|frame|Screenshot with Firefox 1.5 under Linux]]
<nowiki>
'''''{{lang|ja|Akira}}''''' ({{lang-ja|アキラ}}) is a ...
</nowiki>
Note that the title not only looks ugly, it's not even in italics (and every title should be put in italics, see [[Wikipedia:Manual of Style (titles)]]). The ''Akira'' at the bottom is rendered right because it doesn't have any <tt>lang</tt> attribute.
It's very important for accessibility reasons that every foreign word is marked with the proper language, but if it looks this bad nobody will use it. Maybe we can create another template for [[romanization|romanized]] words:
<nowiki>
<span lang="{{{1}}}" xml:lang="{{{1}}}" class="romanization">{{{2}}}</span>
</nowiki>
and a new standard style for that class that uses the right font (a font with all [[diacritical mark]]s, needed in words like ''[[rōmaji]]''). Maybe a good name for this template could be [[Template:latn]] (for ''latinization''), or [[Template:rom]] (for ''romanization''). Transliteration is not a good name because it suggests conversion between any two writing systems, whereas we want to refer only to latinized words. But I'm not very good at choosing template names! Please help me out with that :-)
See also: [[Template:IPA]], [[Template:Unicode]], [[Template:polytonic]], [[Template:Nihongo]], [[Template:Zh-all]], [[Template:Ruby]], [[Template:Ivrit]], [[Template:ArB]], [[Template:ArTranslit]]
--[[User:Suruena|surueña]] 12:41, 11 March 2006 (UTC)
: Proper names should not be tagged. That's overkill, and they are not in a ''foreign language''.
: A separate template for romanized text would just confuse things. There is no ''correct font'' for romanized words; different platforms have different sets of fonts, and choose how to display them differently. Everything on this page looks right in my Safari. You are just making a workaround for the selection of fonts and display settings of your Firefox (or whatever). This is a system and browser configuration problem, and it shouldn't be worked around by injecting custom HTML and styles into Wikipedia. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-03-11 16:25 Z</small>''
::: ''"Proper names should not be tagged. That's overkill, and they are not in a ''foreign language''."''
:: I'm not sure about that. The pronunciation of a name depends on its language, but maybe in practice the best behavior is to pronounce it like an English speaker, and refer to the phonetic notation if you want the native pronunciation. Maybe an accessibility expert can help with this.
::Of course there is not a specific font for romanization characters, I was talking about using a widely available font in the [[MediaWiki:Common.css]]. I agree with you that the romanization template is only a hack (as well as [[Template:polytonic]] and [[Template:Unicode]]), but nowadays browsers have some limitations and those problems should be circumvented. The best option is to use [[Template:lang]] for everything, but this is problematic for a high number of users. Also, I've checked the above examples with Firefox 1.5 under Windows and it looks much better, but still not perfect. The rendering with Internet Explorer 6.0 is nearly the same.
::So, [[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]], in your opinion, what's the best policy? Use the lang template only for words in different scripts? For every foreign word (except proper nouns)? This last option in my opinion is problematic, you know. --[[User:Suruena|surueña]] 17:33, 11 March 2006 (UTC)
::: Well, we could formulate all sorts of complicated rules: "[[Antonín Dvořák]]" should be pronounced as by a Czech, "[[August Dvorak]]" as by an American, etc. Should "Paris" be pronounced in French, even though it has its own English pronunciation? Pages of discussion and many revert wars would ensue, and screen readers probably wouldn't support any of the results anyway. A name is a name; it is not text in a foreign language. If a screen reader supports a database of foreign names, it will choose its own pronunciation for them; if it doesn't, then there's no point in our trying to coach it.
::: Regarding some of those other templates mentioned above:
::: Template:IPA, Template:Unicode, and Template:polytonic are necessary workarounds for a bug in MSIE/Windows; certain text ''will not display at all'' in MSIE/Win without these templates. The fix is performed by a style sheet rule which is hidden from all other browsers, so these templates do not affect display in any other browser (although they do allow you to use your user style sheet to alter the display; e.g. I have IPA displayed in green).
::: Template:Nihongo breaks accessibility by injecting junk text into the page and then hiding using CSS. This is bad, and it should not be used.
::: I can't tell what template:ZH-all is, and there is no documentation.
::: Template:Ruby breaks the validation of both the HTML and CSS of pages, by injecting non-HTML (MSIE-specific) tags into the page, and by incorporating a CSS hack which doesn't survive wikitext rendering. This should not be used.
::: All of these things display correctly in my browser. If your browser has a bug or problem that mixes up the fonts a bit, 1) please don't add code to the page that changes the display in my web browser to fix the display in yours, 2) editors should not be required to add specialized tags just for this purpose. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-03-11 17:56 Z</small>''
:::: Sorry for the delay, but I'm somewhat busy these days… I've been thinking about the [[Template:lang]] and the [[Template:rom]], and although a template for romanized words could be a good way to work around the current font problems of some browsers, you are right that this can be hard for editors to use (and this is only a short-term solution for the problems found in some Linux distributions AFAIK) and probably will not help voice browsers (probably each transliteration should be tagged with a different template to choose the right pronunciation, and this could be a real nightmare).
:::: I'm not an expert on accessibility, although I've read some articles in my free time about Web Accessibility, Universal Design and Device Independence, and made some web pages WCAG 1.0-AA compliant. I also have some knowledge of assistive technologies, but I've never used a screen reader or voice browser. Do you know how JAWS and other common assistive technologies use the <tt>lang</tt> attribute?
:::: In summary, [[Template:lang]] should be used for all non-English words, regardless of the script. But I'm still not sure about:
:::: * Book and film titles: ''[[Amistad (film)]]'', ''[[El Ingenioso Hidalgo Don Quixote de la Mancha]]''
:::: * Names of places: [[Palazzo Pitti]]
:::: * Names of non-English institutions: [[Real Academia Española]]
:::: * Words in two or more languages, see [[Template:lang-ru/uk]]
:::: Also, I suppose that when talking about the origins of words, nearly all of them must refer to Ancient Greek ([[Template:polytonic]]), i.e. it's wrong to tag it merely as Greek (<nowiki>{{</nowiki>lang|el|…}}) because it would refer to modern Greek.
:::: --[[User:Suruena|surueña]] 09:27, 16 March 2006 (UTC)
==Interwiki==
please add [[es:Plantilla:Lang]] to the interwiki --[[User:Yonghokim|Yonghokim]] 21:37, 22 April 2006 (UTC)
:Done.—[[User:Ezhiki|Ëzhiki (ërinacëus amurënsis)]] • ([[User talk:Ezhiki|yo?]]); 04:37, 23 April 2006 (UTC)
Please add [[:bg:Шаблон:lang]]. Thanks. --[[User:Petar Petrov|Petar Petrov]] 10:44, 29 December 2006 (UTC)
:Done [[User:Metros232|Metros232]] 15:14, 30 December 2006 (UTC)
== Overspecifying ==
Keeping in mind the W3C's golden rule, what is the purpose of the following additions?
:'' Slang words are typical examples of terms used only in some regions, for example the [[Mexican Spanish]] word ''[[chale (slang)|chale]]'':
This is not an example like Taiwanese, where the language script is used differently in a different location. It's merely a local word. We're not going to add a language tag to every English regionalism in Wikipedia, nor should we promote the needless practice of labelling regionalisms in foreign words as if they belonged to yet another language.
:'' Changes in script must always be specified (when not inferred from the language), even when the language doesn't change. The following example shows this for text always in English: ... In the sign can be read the word <nowiki>{{lang|en-Brai|⠃⠗⠊⠇⠇⠑}}</nowiki>, which means "braille".
Does this come from any W3C guideline? It seems to me that the language (en) is already specified for the page, and the fact that the script is braille should be inferred by any Unicode-capable web browser. Adding a language tag is superfluous. (Anyway, since this is not a braille encyclopedia, better style would be to write "the word ''braille'', itself written in braille.")
I'm going to remove these examples from the page, pending evidence that they represent good practice. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-09-26 01:35 Z</small>''
:I agree with you, neither example was very good. In the first, I wanted to add another example to show that the script isn't needed when specifying the variant, which the current Taiwan example doesn't make very clear. As a side note, in my opinion the Taiwan example isn't very good either: it doesn't show why that word is a variant found only in that region (and as far as I know it isn't; it's also used in any Chinese dialect). For that reason I didn't include any word when I wrote that example, only the language tag needed in a hypothetical case. I'm a native Spanish speaker, and I know when a Spanish word is a regionalism, but not in other languages.
:In the other example, I wanted to show a more exotic case: a non-intuitive script, and also that language tags can be needed for English words. It probably wasn't very well written, but I still think the idea is good. And it isn't an academic example: it seems browsers need those changes in script to be specified, otherwise the user won't be able to choose the font she wants (but I haven't tested this, so I may be wrong about current browsers). Anyway, I'm happy to see you again; comments are always welcome and you really know about this. Best regards --[[User:Suruena|surueña]] 19:17, 26 September 2006 (UTC)
:: I understand the desire to be exhaustive with the examples, but don't worry: sooner or later an editor will come here and add a real example which they used. These are both unusual cases, and if the need arises, a solution will come. In the meantime, simpler instructions are always better.
:: I think Taiwanese script has to be indicated because some of the characters are different, or have different meaning: it's a technical difference that the browser/OS needs to be aware of for correct rendering, and not an indicator to the reader that the content uses local vocabulary—but "[[Taiwanese language]]" doesn't really offer much information about this.
:: Perhaps en-brai should be used for braille text, but I think the Cyrillic and transliterated Russian example is sufficient to illustrate the same point as in the braille example.
:: Regarding the browser choosing script, I suggest you test with a few different browsers. I find that Safari and Firefox render almost anything correctly as long as the OS has an appropriate font installed, while Internet Explorer is brain-dead about blocks of text from different Unicode ranges, and requires font-specification hacks like [[:template:IPA]] and [[:template:Unicode]] to be used in Wikipedia, although even these sometimes fail. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-09-27 01:45 Z</small>''
:::After reading the article about "[[Taiwanese language]]" I think the example is even more misleading, because somebody could understand that TW is used for dialects, when in fact the correct tag in that case would be <tt>zh-min-nan-Hant</tt>. It isn't a good idea to use a macrolanguage for explaining this subtag, and maybe it would be better if we didn't explain them. Anyway, I think 99% of editors writing articles with loan words will use only the language code, and less frequently maybe the script code and the rtl writing direction. I think the variant subtag is very specialised. However, although it's true the variant code isn't an indicator for the reader, it is an indicator for a spell checker.
:::I'm interested in the cases where [[:template:IPA]] and [[:template:Unicode]] sometimes fail. I want to use a similar technique for improving language templates: adding an optional second parameter to language templates written in a non-Latin script to make it easier to add the transliteration. For example {{tlx|lang-ru|Москва́|<nowiki>''Moskva''</nowiki>}}, where the second parameter is optional for backwards compatibility, will generate the following code:
<pre>
[[Russian language|Russian]]: <span lang="ru" xml:lang="ru">Москва́</span>, [[Romanization of Russian| translit.]]: <span lang="ru-Latn" xml:lang="ru-Latn" class="Unicode"><i>Moskva</i></span>
</pre>
:::--[[User:Suruena|surueña]] 22:25, 28 September 2006 (UTC)
:::: It can't ever be simple, can it!? I thought the Taiwanese example would be safe because the W3C uses it in their overview page. Oh, well.... This stuff is new anyway, I'm sure more will be published on the subject, or conventions will emerge eventually.
:::: IPA and its friends sometimes fail to render particular characters because different fonts include different characters, and no one seems to be the best. One of the Windows XP default fonts with the widest range of supported characters has a bug and renders double-width combining tie bars incorrectly. You can read the details on the templates' talk pages. Or use Firefox and forget about it forever. (I use a Mac so I never suffer from any of the MSIE font bugs, but ironically I've done much of the implementation of these templates just so we could abandon ugly, confusing SAMPA and just use proper Unicode characters in Wikipedia.)
:::: The extended template with transliteration sounds like a very good idea. ''—[[User:Mzajac |Michael]] [[User talk:Mzajac |Z.]] <small>2006-09-29 03:42 Z</small>''
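::::: For illustration only, the output proposed above for a language template with an optional transliteration can be sketched in Python. This is a rough model, not the template's actual code: the function name <tt>lang_span</tt> and its parameters are hypothetical, and the link markup around the language name is left out.

```python
from html import escape


def lang_span(code, text, translit=None):
    """Return span markup for `text` tagged with language `code`.

    `translit` is optional (mirroring the optional second template
    parameter); when given, it is tagged as `<code>-Latn`.
    """
    code = escape(code)
    out = f'<span lang="{code}" xml:lang="{code}">{escape(text)}</span>'
    if translit is not None:
        latn = f"{code}-Latn"  # Latin-script transliteration of the same language
        out += (
            f', translit.: <span lang="{latn}" xml:lang="{latn}" '
            f'class="Unicode"><i>{escape(translit)}</i></span>'
        )
    return out


print(lang_span("ru", "Москва́", "Moskva"))
```

::::: Calling it with only two arguments produces the single span, so existing uses would be unaffected.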
:::::I'm a bit late to join in, but examples for regional variants could be:
:::::* Poetry or [[Choe Manri|other texts]] written in [[Classical Chinese]] by Koreans (i.e. most pre-1900 texts from Korea) that could be displayed using Korean variants, for example <tt><nowiki>{{lang|zho-Hant-KR|平}}</nowiki></tt> instead of <tt><nowiki>{{lang|zho-Hant|平}}</nowiki></tt>: printed texts for use in Korea usually use the variant where the small strokes point upwards (/\), whereas in texts printed for use in China, I think the other variant (\/) is more common. One problem is that old prints often don't use the standard variants of today, another is that the use of [[ISO 3166|KR]] would be an anachronism in many cases. Anyhow, as long as browsers give precedence to language tags over script tags, we must act as if those texts were Korean language: <tt><nowiki>{{lang|ko-Hant|平}}</nowiki></tt> = {{lang|ko-Hant|平}}.
:::::* Swiss texts use ss in words where other standards use [[ß]], but unlike the above Korean example, regional information (i.e. <tt>-CH</tt>/<tt>-LI</tt>) is of course unnecessary for such words to be correctly rendered by an application.
:::::I'm sure there are better examples, but I can't think of any right now. [[User:Wikipeditor|Wikipeditor]] 13:13, 16 October 2006 (UTC)
::::::Better late than never! :-) And any example can help, because this is a difficult topic. Actually, after thinking about this for some days I believe this is probably too specialised for Wikipedia (and maybe more appropriate for Wiktionary?). At least from my POV, we should start by tagging only the language, and the script when needed (not an easy task, though), and either remove the part about regional variants from the documentation or at least state that it is rarely needed.
::::::But I think your examples are interesting. I believed regional variants were only useful to spell checkers, but you showed more uses. Maybe in the future. However, I don't agree with you on the example about Korean poetry: although a Chinese text is printed with a Korean font, the language is still Chinese, so the language subtag must be zh instead of ko. We must circumvent browser bugs when we can, but in this case we would be tagging the text with a wrong tag and not with a compatible one. Anyway, it will be interesting to know the behavior of current browsers (and other user agents :-) regarding language tags. I'm still working on the [[Template:lang-ru test]] and [[Template:lang-uk test|lang-uk test]] templates (too busy these days), and I hope to have time to make some tests with IE as well. I will keep you informed. --[[User:Suruena|surueña]] 20:37, 16 October 2006 (UTC)
==[[ISO 639]]; more exactly==
I was trying to use {{tl|lang-gla}} and did not realise for some time we have {{tl|lang-gd}}. I think we should specify more exactly which codes to use to name templates. Namely, should we use ISO 639-1, ISO 639-2 or ISO 639-3? Then perhaps create redirects from other used codes. --[[User:Eleassar|'''Eleassar''']] <sup>[[User talk:Eleassar|my talk]]</sup> 13:02, 22 December 2006 (UTC)
: "The golden rule when creating language tags is to keep the tag as short as possible",[http://www.w3.org/International/articles/language-tags/Overview.en.php] so [[ISO 639-1]] is preferred over [[ISO 639-2]] and [[ISO 639-3]]. Also, some browsers only recognize alpha-2 codes. Best regards --[[User:Suruena|surueña]] 00:27, 25 December 2006 (UTC)
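:: As a rough illustration of the "shortest code" rule, here is a small Python sketch that rewrites a three-letter primary subtag to its two-letter equivalent when one exists. The table is a hypothetical, deliberately tiny sample (a real one would cover all of ISO 639-1), and the function name <tt>preferred_code</tt> is made up for this example.

```python
# Hypothetical sample table: ISO 639-2/639-3 codes that have an ISO 639-1 equivalent.
ALPHA3_TO_ALPHA2 = {
    "gla": "gd",  # Scottish Gaelic
    "rus": "ru",  # Russian
    "fra": "fr",  # French (terminology code)
    "fre": "fr",  # French (bibliographic code)
    "zho": "zh",  # Chinese
}


def preferred_code(tag):
    """Shorten the primary language subtag if a two-letter code is known.

    Any script or region subtags after the first hyphen are kept as they are.
    """
    primary, sep, rest = tag.partition("-")
    short = ALPHA3_TO_ALPHA2.get(primary.lower(), primary)
    return short + sep + rest


print(preferred_code("gla"))       # gd
print(preferred_code("zho-Hant"))  # zh-Hant
print(preferred_code("grc"))       # grc (no two-letter code, left alone)
```

:: The same table could drive redirects such as lang-gla to lang-gd.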
== Fonts customization through whole wiki ==
I suggest adding an id attribute to the span:
<pre><span id="lang-{{{1}}}" lang="{{{1}}}" xml:lang="{{{1}}}">{{{2}}}</span></pre>
If your language code is "cop", the id will be "lang-cop", and you can then add a line to [[MediaWiki:common.css]] like:
<pre>#lang-cop { font-family: "MPH 2B Damase"; }</pre>
--[[User:AlefZet|AlefZet]] 19:34, 2 April 2007 (UTC)
: See the current implementation of this idea at [[:kk:MediaWiki:common.css]], [[:kk:Template:lang]], [[:kk:Template:rtl-lang]]--[[User:AlefZet|AlefZet]] 11:42, 3 April 2007 (UTC)
:: You will have to use a class for this, not an id. Attribute selectors would be even nicer, but they don't work in IE6, while Mozilla and Opera don't need this hack. It is a good idea, though; I was planning on implementing this in the near future. —''[[User:Ruud Koot|Ruud]]'' 12:12, 3 April 2007 (UTC)
::: I am thinking about future use of ids in some JavaScript tricks--[[User:AlefZet|AlefZet]] 05:09, 27 April 2007 (UTC)
===smart selection of css class===
In the spirit of the above, note templates like {{tl|Ar}}, {{tl|ArB}}, {{tl|Hebrew}} and {{tl|Ivrit}}.
These are created unsystematically by users who feel that Hebrew or Arabic script isn't rendered nicely by default, trying to fix it ad hoc (by explicit imposition of fonts, font size, boldface etc.)
{{tl|Ivrit}} uses a "spanHe" css class.
Obviously, our aim should be to have people just use <nowiki>{{lang|ar|...}}</nowiki>, <nowiki>{{lang|he|...}}</nowiki>, etc. and delegate style issues to the css. For this purpose, this template may need some conditional statements that would insert things like <code>class="spanHe"</code> based on the language code.
The problem is that these issues are really a matter of the script used, not of the language proper. Thus, on one hand, not just 'ar' would need to trigger an "Arabic" class, but also fa, ur, ku, ps, tg, ug etc. On the other hand, transliterated Arabic etc. ("ar-Latn") should not trigger them.
As noted above, both <nowiki>{{lang|kk|Қазақ тілі}} and {{lang|kk|قازاق ڌﻳل}}</nowiki> should result in correctly formatted Kazakh spans. But these cases are rare, and could be disambiguated by using kk-Cyrl vs. kk-Arab.
Note the existence of the separate {{tl|ISOtranslit}} for ISO compliant romanizations (and {{tl|ArabDIN}} for DIN compliant Arabic transliteration in particular).
I am not sure what would be the best way of addressing this. At the moment, I think it might be introducing a third, optional, script parameter. Instead of <nowiki>{{lang|kk-Cyrl|Қазақ тілі}} and {{lang|kk-Arab|قازاق ڌﻳل}}</nowiki> we would then write <nowiki>{{lang|kk|Қазақ тілі|Cyrl}} and {{lang|kk|قازاق ڌﻳل|Arab}}</nowiki>, and the template would do something like
<nowiki><span lang="{{{1}}}" xml:lang="{{{1}}}" {{#if: {{{3|}}} | class="Span{{{3}}}" }}>{{{2}}}</span></nowiki>
or, if we really want to provide recognition of default script, something like
<nowiki><span lang="{{{1}}}" xml:lang="{{{1}}}" {{#if: {{{3|}}} | class="Span{{{3}}}" |
{{#switch: {{{1}}}
|fa
|ur
|ku
|ps
|tg
|ug
|ar = class="SpanArab"
|yi
|he = class="SpanHebr"
}} }}>{{{2}}}</span></nowiki>
This would need just a small number of css classes, say, SpanLatn (for romanizations), SpanCyrl (for cases like Tajik or Kazakh), SpanHebr (for the desired formatting elements), SpanArab. Others like SpanGrek or SpanTaml would not be needed unless there was some genuine formatting requirement, since proper usage would be <nowiki>{{lang|el|ελληνικά}}, {{lang|ta|தமிழ்}}</nowiki>, not <nowiki>{{lang|el|ελληνικά|Grek}}, {{lang|ta|தமிழ்|Taml}}</nowiki>.
Finally, it is unsatisfactory to use both <nowiki>{{lang|xx-Latn|...}}</nowiki> and <nowiki>{{ISOtranslit|xx|...}}</nowiki> for Romanization. Wouldn't it be cleaner to have a separate template for all romanizations, and restrict xx-Latn to languages that can be written in Latin natively (Kazakh etc.)? An exception is pinyin which has its own ISO 639 code pny (zh-Latn should be equivalent to pny, I suppose?)
thoughts?
[[User:Dbachmann|dab]] <small>[[User_talk:Dbachmann|(𒁳)]]</small> 12:03, 14 April 2007 (UTC)
: I think it should even be possible to do <nowiki>{{lang|kk|Cyrl|Қазақ тілі}} and {{lang|kk|Arab|قازاق ڌﻳل}}</nowiki> by testing for the existence of a third parameter. I'm not sure if the matching of a default script to a language should be done by template code or clever use of CSS. The first is more elegant, the second probably causes less strain on the servers (but then again [[Wikipedia:Don't worry about performance|Don't worry about performance]].) For transliterations I'd like to see something like <nowiki>{{transl|ar|DIN|al-Ḫuwārizmī}} / {{transl|ar|ALA|al-Khwārizmī}}</nowiki>. —''[[User:Ruud Koot|Ruud]]'' 12:31, 14 April 2007 (UTC)
::I agree entirely. It's important to keep the "script" and "scheme" (ISO, DIN) parameters optional (which is what you are saying, too). My main point is that the sooner we look towards clever standardization the better, to prevent more of the ilk of {{tl|ArB}} or {{tl|Hebrew}} turning up. Such a centralised approach may also somehow account for things like {{tl|IAST}} (and {{tl|PIE}}), but people will want to use these as shortcuts, and we can always use bots to subst: things. [[User:Dbachmann|dab]] <small>[[User_talk:Dbachmann|(𒁳)]]</small> 14:03, 14 April 2007 (UTC)
:I've created {{tl|transl}} along the lines you suggest. So far, it does nothing with classes, and I don't know what classes should or should not be included. I suppose we need a clean list first, there being disparate things like "IAST", "SpanHe" and "Arabic Unicode". I am transcluding {{tl|transl}} from {{tl|ArabDIN}} so far. [[User:Dbachmann|dab]] <small>[[User_talk:Dbachmann|(𒁳)]]</small> 14:35, 14 April 2007 (UTC)
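: To make the class-selection logic discussed in this section easier to reason about, here is a hedged Python model of it. It is only a sketch under assumptions: the function name <tt>css_class</tt> is hypothetical, the default-script table simply copies the language list from the <tt>#switch</tt> proposal above, and the set of styled scripts is the four classes suggested there (SpanLatn, SpanCyrl, SpanHebr, SpanArab).

```python
# Default script per language, copied from the #switch list proposed above.
DEFAULT_SCRIPT = {
    "ar": "Arab", "fa": "Arab", "ur": "Arab", "ku": "Arab",
    "ps": "Arab", "tg": "Arab", "ug": "Arab",
    "he": "Hebr", "yi": "Hebr",
}

# Only these scripts get a CSS class; others (Grek, Taml, ...) need none.
STYLED_SCRIPTS = {"Arab", "Hebr", "Latn", "Cyrl"}


def css_class(tag, script=None):
    """Pick a CSS class name for a language tag, or None if no class applies.

    Order of precedence: explicit `script` argument (the proposed third
    parameter), then a four-letter script subtag inside `tag`, then the
    language's default script.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    if script is None:
        for subtag in subtags[1:]:
            if len(subtag) == 4 and subtag.isalpha():
                script = subtag.title()  # normalise "latn" -> "Latn"
                break
    if script is None:
        script = DEFAULT_SCRIPT.get(language)
    return "Span" + script if script in STYLED_SCRIPTS else None


print(css_class("ar"))          # SpanArab
print(css_class("ar-Latn"))     # SpanLatn
print(css_class("kk", "Cyrl"))  # SpanCyrl
print(css_class("el"))          # None
```

: Note that transliterated Arabic ("ar-Latn") never reaches the default-script table, which is the behaviour asked for above.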
==glyph variants: classes or templates?==
we need a mechanism to select preferred fonts for cases considered "glyph variants" by Unicode. Most notably, this will apply to CJK, and to [[Nasta'liq script]] fonts for Persian, Pashto and Urdu. Less urgently, some rarely supported ({{tl|SMP}}, and perhaps {{tl|mufi}} for ang, non etc.) scripts will need selection of specific fonts.
shall we define these fonts in templates (such as {{tl|Hebrew}}), and select these templates from here ({{tl|lang}}), or will we define css classes (and, can css classes depend on the "lang" parameter, or will we have to explicitly switch "class" here?). Note {{tl|script}}, intended to select a script by [[ISO 15924]] codes, but ISO 15924 is unsatisfactory for the purposes outlined above (it only has "Ital" for Etruscan and Raetic scripts, it only has "Arab" for Naskh and Nastaliq and it has only "Xsux" for all sorts of cuneiform, but it ''does'' have e.g. Latf and Latg vs. Latn, and Hant, Hans, Japn vs. Hani) [[User:Dbachmann|dab]] <small>[[User_talk:Dbachmann|(𒁳)]]</small> 07:43, 16 April 2007 (UTC)
A web search reveals that the answer to my question is [http://www.w3.org/TR/CSS21/selector.html#lang language pseudo-classes]. The proper solution would be to introduce these to [[MediaWiki:Common.css|common.css]].
we'll need:
:lang(he) {
font-family: SBL Hebrew, Ezra SIL SR, Ezra SIL, Cardo, Chrysanthi Unicode, TITUS Cyberbit Basic, Arial Unicode MS, Narkisim, Times New Roman;
font-family /**/:inherit;
}
:lang(fa) {
font-family: Nafees Nastaleeq, Pak Nastaleeq, PDMS_Jauhar;
font-family /**/:inherit;
}
:lang(ps) {
font-family: Nafees Nastaleeq, Pak Nastaleeq, PDMS_Jauhar;
font-family /**/:inherit;
}
:lang(ur) {
font-family: Nafees Nastaleeq, Pak Nastaleeq, PDMS_Jauhar;
font-family /**/:inherit;
}
:lang(sux-Xsux) {
font-family: Akkadian;
font-family /**/:inherit;
}
:lang(ja) {
font-family: Code2000, Arial Unicode MS, Bitstream Cyberbit, Bitstream CyberCJK, IPAGothic, IPAPGothic, IPAUIGothic, Kochi Gothic, IPAMincho, IPAPMincho;
font-family /**/:inherit;
}
:lang(ko) {
font-family: Adobe Myungjo Std M, Baekmuk Batang, Baekmuk Gulim, Batang, Dotum, DotumChe, Gulim, GulimChe, HYGothic-Extra, HYMyeongJo-Extra, New Gulim, UnBatang, UnDotum, UnYetgul, UWKMJF;
font-family /**/:inherit;
}
:lang(zh-Hans) {
font-family: Adobe Song Std L, AR PL ShanHeiSun Uni, AR PL ShanHeiSun Uni MBE, MS Hei, MS Song, SimHei;
font-family /**/:inherit;
}
:lang(zh-Hant) {
font-family: Adobe Ming Std L, AR PL New Sung, AR PL ZenKai Uni, AR PL ZenKai Uni MBE, MingLiU, PMingLiU;
font-family /**/:inherit;
}
:lang(de-Latf) {
font-family: Gutenberg Textura, Humboldt Fraktur Regular, Alte Schwabacher, Hansa Gotisch;
font-family /**/:inherit;
}
:lang(ga-Latg) {
font-family: Corcaigh, Duibhlinn, Ceanannas, CeltScript;
font-family /**/:inherit;
}
:lang(got-Goth) {
font-family: Code2001;
font-family /**/:inherit;
}
:lang(gem-Runr) {
font-family: FreeMono, Junicode, Code2000;
font-family /**/:inherit;
}
:lang(ang-Runr) {
font-family: FreeMono, Junicode, Code2000;
font-family /**/:inherit;
}
:lang(non-Runr) {
font-family: FreeMono, Junicode, Code2000;
font-family /**/:inherit;
}
:lang(ang) {
font-family: Alphabetum, Cardo, LeedsUni, Junicode, "TITUS Cyberbit Basic", ALPHA-Demo;
font-family /**/:inherit;
}
:lang(non) {
font-family: Alphabetum, Cardo, LeedsUni, Junicode, "TITUS Cyberbit Basic", ALPHA-Demo;
font-family /**/:inherit;
}
:lang(goh) {
font-family: Alphabetum, Cardo, LeedsUni, Junicode, "TITUS Cyberbit Basic", ALPHA-Demo;
font-family /**/:inherit;
}
:lang(gmh) {
font-family: Alphabetum, Cardo, LeedsUni, Junicode, "TITUS Cyberbit Basic", ALPHA-Demo;
font-family /**/:inherit;
}
:lang(ett-Ital) {
font-family: Cardo, Code2001;
font-family /**/:inherit;
}
:lang(xrr-Ital) {
font-family: Cardo, Code2001;
font-family /**/:inherit;
}
:lang(gmy-Linb) {
font-family: Cardo, Code2001;
font-family /**/:inherit;
}
:lang(cu-Glag) {
font-family: Dilyana;
font-family /**/:inherit;
}
:lang(mn-Phag) {
font-family: Code2001;
font-family /**/:inherit;
}
:lang(so-Osma) {
font-family: Code2001;
font-family /**/:inherit;
}
:lang(grc-Cprt) {
font-family: Code2001;
font-family /**/:inherit;
}
:lang(grc) {
font-family: Athena, Gentium, "Palatino Linotype", "Arial Unicode MS", "Lucida Sans Unicode", "Lucida Grande", Code2000;
font-family /**/:inherit;
}
[[User:Dbachmann|dab]] <small>[[User_talk:Dbachmann|(𒁳)]]</small> 12:23, 16 April 2007 (UTC)
I've added some pseudo-classes to common.css now. Not the more esoteric, but the important ones, fa, ps, ur, he, ko, ja, grc. [[User:Dbachmann|dab]] <small>[[User_talk:Dbachmann|(𒁳)]]</small> 10:50, 17 April 2007 (UTC)
:Nice idea! But what about browser-side compatibility? Another suggestion: what about putting Vista's fonts for complex scripts first in the list where appropriate?--[[User:AlefZet|AlefZet]] 05:14, 27 April 2007 (UTC)
:: font-family /**/:inherit; doesn't work in MSIE7--[[User:AlefZet|AlefZet]] 05:36, 27 April 2007 (UTC)
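::: On the question of how the <tt>:lang()</tt> rules above pick their elements: as far as I understand the CSS 2.1 text linked above, <tt>:lang(C)</tt> matches when the element's language equals C or starts with C followed by a hyphen, ignoring case. A minimal Python model of that rule, with a made-up function name, would be:

```python
def lang_matches(selector_lang, element_lang):
    """Model of CSS 2.1 :lang(C): exact match, or C followed by "-", case-insensitive."""
    sel = selector_lang.lower()
    lang = element_lang.lower()
    return lang == sel or lang.startswith(sel + "-")


print(lang_matches("zh-Hant", "zh-Hant-TW"))  # True
print(lang_matches("ang", "ang-Runr"))        # True
print(lang_matches("ja", "jp"))               # False
```

::: One consequence: an element tagged <tt>ang-Runr</tt> is matched by both <tt>:lang(ang-Runr)</tt> and <tt>:lang(ang)</tt>. Since both selectors have the same specificity, the rule that comes later in the stylesheet wins, so the more specific tag probably needs to be listed after the general one.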
== Devanagari ==
Hi,
I've created [[:Template:Lang-dev]], for use to signify how something is written in [[Devanagari]], but without saying exactly which language. --[[User:Soman|Soman]] 12:56, 8 June 2007 (UTC)
:Hello, Soman. I'm afraid this template is wrong, because the {{tl|lang}} family of templates is meant to specify the language, not the script. The <tt>[[ISO_639:d#dev|dev]]</tt> language code does exist, but it is the code for a language from New Guinea called [[Trans-New Guinea languages|Domung]]. I have modified [[template:lang-dev]] accordingly (anyway, it wasn't used in any article or page).
:It's not straightforward to implement what you want to do. It's possible to create a template to specify the desired font, but the question is whether this should be done (see above). Best regards —[[User:Suruena|surueña]] 20:58, 18 July 2007 (UTC)