Wikidata/Notes/Language fallback

This is an archived version of this page, as edited by Jeblad (talk | contribs) at 10:19, 10 July 2012 (Global language chain). It may differ significantly from the current version.

If a value, especially a label, is not available in the currently selected language, a fallback mechanism should be used to show a value from another language.

  • fallback can be based on:
    • the languages a user speaks
    • similarities between languages (en-gb -> en)
    • maybe eventually *any* language will do.
      • this should mostly work for labels, but not for descriptions?...
  • if a value from a different language is shown in the UI, then
    • the value's language should be indicated (small colored text?)
    • there should be a button to approve the value for the current language ("yes, it's the same in my language")
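
The points above can be sketched as a lookup that returns not only the value but also the language it came from, so the UI can mark fallback values and offer the approve button. This is only a minimal sketch: the names `ResolvedLabel` and `resolve_label`, and the plain dictionary of labels, are assumptions made for illustration and are not part of any existing Wikidata or MediaWiki API.

```python
from typing import Dict, List, NamedTuple, Optional


class ResolvedLabel(NamedTuple):
    """Hypothetical result type: the text plus where it came from."""
    text: str
    language: str      # language code the text was actually found in
    is_fallback: bool  # True if the text is not in the requested language


def resolve_label(labels: Dict[str, str],
                  requested: str,
                  fallbacks: List[str]) -> Optional[ResolvedLabel]:
    """Try the requested language first, then each fallback in order."""
    for code in [requested, *fallbacks]:
        text = labels.get(code)
        if text:
            return ResolvedLabel(text, code, code != requested)
    return None  # no label in any acceptable language


labels = {"en": "Berlin"}
result = resolve_label(labels, "en-gb", ["en"])
if result is not None and result.is_fallback:
    # The UI could show the language code next to the value here.
    print(f"{result.text} ({result.language})")
```

Returning the language alongside the text keeps the decision about how to indicate it (small colored text, an approve button) in the UI layer.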

User fallback chain

The preferred user languages are defined per user in Special:Preferences. The settings are specific to each user, but apply to every page whose content is tailored to that user. There are limits on where the user's preferred languages can be used. They can probably be used when the API is accessed, when the rendered content isn't cached, or when the cached objects can be specialized to a given language.

For now the user fallback chain is a single set of alternate languages that all carry the same weight. Later it might be possible to weight the languages, either by assigning weights explicitly or by giving the languages a specific order. Other hints could also be used, such as the language weights sent by the browser.

If the user has defined alternate languages, all of them can be tried in sequence, after the user's own main (primary) language has been tried first. Content, messages, and other strings may or may not exist in a given fallback language.

If all alternate languages have been tried without success, they can be retried using the global fallback chain. The language that has a defined string and is deemed closest to the primary language wins, and is used for that specific string.
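
A rough sketch of this two-step lookup follows. It assumes that "closest to the primary language" can be approximated by consulting the primary language's global fallbacks before those of the alternates. That ordering, the function name `resolve_for_user`, and the fallback data in the usage lines are all illustrative assumptions, not actual MediaWiki settings.

```python
from typing import Dict, List, Optional, Tuple


def resolve_for_user(strings: Dict[str, str],
                     primary: str,
                     alternates: List[str],
                     global_fallbacks: Dict[str, List[str]]
                     ) -> Optional[Tuple[str, str]]:
    """Return (language, string) for the best available match, or None."""
    # Step 1: the primary language first, then the user's alternates.
    user_chain = [primary] + [c for c in alternates if c != primary]
    for code in user_chain:
        if code in strings:
            return code, strings[code]

    # Step 2: retry each language through its global fallback chain.
    # The primary language comes first, so its fallbacks win over those
    # of the alternates (an assumed stand-in for "closest").
    tried = set(user_chain)
    for code in user_chain:
        for fallback in global_fallbacks.get(code, []):
            if fallback in tried:
                continue
            tried.add(fallback)
            if fallback in strings:
                return fallback, strings[fallback]
    return None


# Illustrative data only; these chains are made up for the example.
chains = {"sma": ["se"], "nb": ["nn", "sv"]}
print(resolve_for_user({"sv": "Stockholm"}, "sma", ["nb"], chains))
```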

Global fallback chain

Languages are related to each other. These relationships can be used to build fallback chains, so that a string missing in a given language can be replaced by the string from a closely related language. For example, the South Saami language is part of the Southern group, which is in turn part of the Western group of Saami languages. [1] The Western group also contains a Northern group, whose languages are Northern, Lule and Pite Saami.

In MediaWiki, languages have fallbacks. For a specific language a list of fallbacks can be acquired, with a default language at the end of the list. Each of the fallbacks can then be tested against whatever strings exist in any of the languages, and the string for the first matching language can be returned. If no match is found before the end of the chain, a second pass can be done, this time checking the fallbacks of each language in the initial list.

By keeping a list of already checked languages the second pass can be quite efficient. The list is also necessary to avoid loops, since the fallbacks really form a graph and not a hierarchy.
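
One way to sketch the passes together with the list of checked languages is a breadth-first walk over the fallback graph. Note that this generalizes the two passes described above to any depth, which is an assumption of the sketch. The function name `find_fallback`, the `default` parameter, and the sample graph are likewise hypothetical and do not reflect MediaWiki's own code.

```python
from collections import deque
from typing import Dict, List, Optional


def find_fallback(strings: Dict[str, str],
                  start: str,
                  fallbacks: Dict[str, List[str]],
                  default: str = "en") -> Optional[str]:
    """Return the first language code, nearest to `start`, that has a string."""
    checked = set()         # languages already tested; prevents loops
    queue = deque([start])  # nearer languages are tested before farther ones
    while queue:
        code = queue.popleft()
        if code in checked:
            continue
        checked.add(code)
        if code in strings:
            return code
        # Next pass: queue the fallbacks of the language just tested.
        queue.extend(fallbacks.get(code, []))
    # The default language sits at the very end of the chain.
    return default if default in strings else None


# A deliberately cyclic graph: "a" falls back to "b" and "b" back to "a".
graph = {"a": ["b"], "b": ["a", "c"]}
print(find_fallback({"c": "some string"}, "a", graph))
```

Without the `checked` set the cycle between "a" and "b" would keep the loop running forever; with it, each language is tested at most once.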

At present rather few languages define fallbacks. This is rather awkward for our use, but it is perhaps wanted behavior for an ordinary wiki. A first attempt could be to extend the fallbacks with languages from the same group, and perhaps also the supergroup. If this is impossible, we could go for a hook and do something similar but specific to our extension.
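
Extending the fallbacks with group and supergroup members could look roughly like the sketch below. The group tables only encode the Saami example given earlier, using the language codes sma, se, smj and sje. The table layout, the function name `extended_fallbacks`, and the choice to append group languages after the declared fallbacks are assumptions of this sketch.

```python
from typing import Dict, List

# Group membership taken from the Saami example in the text.
GROUP_MEMBERS: Dict[str, List[str]] = {
    "southern": ["sma"],               # South Saami
    "northern": ["se", "smj", "sje"],  # Northern, Lule and Pite Saami
}
# Both groups belong to the Western supergroup.
SUPERGROUP: Dict[str, str] = {"southern": "western", "northern": "western"}


def extended_fallbacks(code: str, declared: Dict[str, List[str]]) -> List[str]:
    """Declared fallbacks, then same-group languages, then the supergroup's."""
    chain = list(declared.get(code, []))
    own_group = next(
        (g for g, members in GROUP_MEMBERS.items() if code in members), None)
    if own_group is None:
        return chain  # language is in no known group; nothing to add

    candidates = list(GROUP_MEMBERS[own_group])
    parent = SUPERGROUP.get(own_group)
    for group, group_parent in SUPERGROUP.items():
        if group_parent == parent and group != own_group:
            candidates.extend(GROUP_MEMBERS[group])

    for candidate in candidates:
        if candidate != code and candidate not in chain:
            chain.append(candidate)
    return chain


print(extended_fallbacks("smj", {"smj": ["se"]}))
```

A real implementation would probably keep the default language at the very end of the chain, after the group languages; that detail is left out here.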