Wikipedia:WikiProject Typography/Unicode: Difference between revisions

m Unicode tables: trim redundent with new section
Articles with Unicode tables: OCR-A font → OCR-A
 
(24 intermediate revisions by 3 users not shown)
Line 1:
This page attempts to document standards and infrastructure for presentation of Unicode-related information on Wikipedia. It may also serve as a gathering point for work on building the same.
 
== Templates ==
Line 12:
== Glyph images ==
 
Wikipedia and/or Wikimedia Commons host many images of [[glyph]]s — characters rendered in a given font. In article text, we generally prefer to use literal Unicode characters, not these rendered images. Thus, these images are primarily used in articles ''about'' characters, where an illustration is appropriate. In particular, any [[#Unicode tables]] provide both the literal character and an image of the character.
FIXME
 
Ideally, all such glyph images would be [[vector graphics]], in [[Scalable Vector Graphics|SVG]] format. However, many exist in a [[raster graphics]] format, such as [[Graphics Interchange Format|GIF]]. These should be converted to, or replaced with, SVGs.
 
As of this writing, there is no standardized naming of these images. Sometimes an expression of the codepoint is used as the file name, e.g., <code>[[:File:U+2122.svg|U+2122.svg]]</code>. In other cases, the character name is used, e.g., <code>[[:File:OCR-A char Quotation Mark.svg|OCR-A char Quotation Mark.svg]]</code>.
 
== Unicode tables ==
 
Many articles dealing with Unicode include tables of Unicode characters. The standard form for such tables is given in the following example:
 
=== Example table ===
{| class="wikitable sortable"
 
|+ Unicode Table Example
{{Unicode table header|Example caption}}
! [[Character (computing)|Char]]
! Image
! Name
! [[Hexadecimal|Hex]]
! [[Decimal]]
|-
| {{Unicode|☷}}
|
| [[Ba gua (concept)|Trigram]] for Earth
| U+2637
| 9783
|-
| {{Unicode|☸}}
|
| [[Dharmacakra|Wheel of Dharma]]
| U+2638
| 9784
|-
| {{Unicode|☹}}
|
| White frowning face ([[Emoticon]])
| U+2639
| 9785
|-
| {{Unicode|☺}}
|
| White smiling face ([[Emoticon]])
| U+263A
| 9786
|}
 
=== Legend ===
:''A copy of this legend, or something like it, will be linked from or displayed with all Unicode tables, once we figure out exactly how that should be done.''
 
{| class="wikitable sortable"
! Char
| The literal character. If your computer lacks [[Help:Special characters|Unicode support]], you may see [[mojibake|other symbols]] instead of the proper character.
|-
! Image
| A sample image of the character, rendered in an example font.
|-
! Name
| The official name of the character. Additional information may be given in parentheses.
|-
! Hex
| The numeric [[code point]] for the character, in [[hexadecimal]] (base 16), with [[Unicode#Architecture and terminology|"U+" prefix]].
|-
! [[Decimal]]
| The same code point value, expressed in [[decimal]] (base 10).
|}
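
The relationship between the Char, Hex, and Decimal columns described in the legend can be illustrated with a short Python sketch. The function names <code>parse_hex_column</code> and <code>row_is_consistent</code> are hypothetical helpers invented for this illustration; they are not part of any Wikipedia tooling. The sketch assumes each Char cell holds exactly one code point.

```python
def parse_hex_column(text: str) -> int:
    """Turn a Hex-column value such as 'U+2638' into an integer code point."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"expected a 'U+' prefix, got {text!r}")
    # The digits after the prefix are base 16; int() accepts either letter case.
    return int(text[2:], 16)


def row_is_consistent(char: str, hex_text: str, decimal: int) -> bool:
    """Check that the Char, Hex and Decimal cells all name the same code point."""
    return ord(char) == parse_hex_column(hex_text) == decimal


print(row_is_consistent("\u2638", "U+2638", 9784))  # True: Wheel of Dharma row
print(row_is_consistent("\u2639", "U+2638", 9785))  # False: Hex cell names a different character
```

A check along these lines could catch a row whose hex and decimal values have drifted apart after manual editing.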
 
=== Design features ===
 
The table format has the following design features:
Line 56 ⟶ 78:
** The literal Unicode [[Character (computing)|character]]
** For web browsers which support Unicode and can render it properly, gives the user a "native" presentation
** Allows the reader to copy-and-paste the characters for real usage (like [[Character Map|Charmap]])
** The {{tl|Unicode}} template is used
* "Image" column
** A sample rendering of the Unicode [[glyph]] (see [[#Glyph images]])
** For systems/browsers which cannot render Unicode (or specific characters), allows the reader to see intended appearance
** Provides a consistency check for character, image, and browser. Discrepancies will stand out.
** When a glyph image isn't available, the table cell is left empty
* "Name" column
** The official [[codepoint]] name, as specified by the [[Unicode Consortium]]
Line 71 ⟶ 94:
*** All of "Wheel of Dharma" is wikilinked, because [[Dharmacakra]] is synonymous with "Wheel of Dharma"
*** "Emoticon" is a parenthetical, as that is not part of the official Unicode codepoint name
* "Hex" and "Decimal" columns
** The codepoint number, in both [[decimal]] (base ten) and [[hexadecimal]] (base 16) formats
** The “<code>U+</code>” prefix is used for hex, per the Unicode standard
** Decimal is not prefixed, per [[WP:MOSNUM]]
** Such syntax is specific to the usage (Unicode specification, HTML, Perl, etc.); it is not a universal form. By omitting it, anyone can copy-and-paste the actual value, and add whatever syntax they need (or none at all).
** Such prefixes also prevent table columns from sorting properly
* The plan is to eventually add some kind of standard explanation of the columns to the tables, most likely as an adjacent template, or maybe links from the headers. Ideas welcome!
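
As a rough illustration of the column conventions listed above, the following Python sketch builds the wikitext for one table row from a single character. The function <code>unicode_table_row</code> and its <code>image</code> parameter are hypothetical names chosen for this example. The sketch also assumes the official name is taken as-is from Python's <code>unicodedata</code> module, so any wikilinks or parenthetical notes in the Name column would still be added by hand.

```python
import unicodedata


def unicode_table_row(char: str, image: str = "") -> str:
    """Build one wikitext row with Char, Image, Name, Hex and Decimal cells."""
    code_point = ord(char)
    cells = [
        "{{Unicode|" + char + "}}",  # literal character inside the Unicode template
        image,                       # left empty when no glyph image is available
        unicodedata.name(char),      # official name, in capitals as published
        f"U+{code_point:04X}",       # hex with the "U+" prefix
        str(code_point),             # decimal, no prefix
    ]
    # rstrip() makes an empty cell a bare "|" rather than "| ".
    return "|-\n" + "\n".join(f"| {cell}".rstrip() for cell in cells)


print(unicode_table_row("\u2637"))
```

The image cell defaults to empty, matching the convention that a missing glyph image leaves the cell blank.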
 
Line 91 ⟶ 113:
* [[List of Unicode characters]]
* [[Unicode symbols]]
* [[OCR-A]]
* [[Wikipedia:MOSNUM#Common_mathematical_symbols]]
* [[Linear_B#Unicode]] (note the link to the Unicode standard)
 
== See also ==
 
* [[Wikipedia:Naming conventions (Unicode) (draft)]]
* Deletion discussions
** [[Wikipedia:Articles for deletion/List of precomposed Latin characters in Unicode]]
** [[List of Unicode characters]]
*** [[Wikipedia:Articles for deletion/List of Unicode characters]]
*** [[Wikipedia:Articles for deletion/List of Unicode characters (2nd nomination)]]
*** [[Wikipedia:Articles for deletion/List of Unicode characters (3rd nomination)]]
** [[Wikipedia:Articles for deletion/Various Unicode-related pages]]
* Help
** [[Help:Multilingual support]]
** [[Help:Gothic Unicode Fonts]]