Interwiki synchronization: Difference between revisions
{{shortcut|[[IS]]}}
{{Interwiki redirect|d:Wikidata:interwiki conflicts}}
{{Historical}}
This page is intended to host discussions between representatives from individual wikis regarding [[Help:Interwiki_linking#Interlanguage_link|interlanguage links]] between Wikipedia articles.
Interlanguage links can often become entangled in [[Interwiki conflicts|conflicts]]. For example, if the page [[:en:Emergency medical technician]] links to [[:fr:Ambulancier]], but [[:fr:Ambulancier]] links to [[:en:Paramedic]], then two English articles compete for the same French article and the links are inconsistent.
Synchronizing these links manually is hard, but also interesting. A central hub for discussing such cases will foster further cooperation and concord between the different language Wikipedias.
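The conflict in the example above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical in-memory table of interlanguage links (the <code>links</code> dictionary and the function name <code>find_asymmetric_links</code> are illustrations only, not part of any existing tool): it reports every link whose target links back to a different article in the source language.

```python
# Hypothetical toy data: each page, keyed by (language, title), maps a
# target language to the title it links to in that language.
links = {
    ("en", "Emergency medical technician"): {"fr": "Ambulancier"},
    ("fr", "Ambulancier"): {"en": "Paramedic"},
    ("en", "Paramedic"): {"fr": "Ambulancier"},
}


def find_asymmetric_links(links):
    """Return (source, target, back) triples where the target page links
    back to a different page in the source's language."""
    conflicts = []
    for source, targets in sorted(links.items()):
        source_lang, source_title = source
        for target_lang, target_title in sorted(targets.items()):
            target = (target_lang, target_title)
            # Title that the target page links to in the source language, if any.
            back_title = links.get(target, {}).get(source_lang)
            if back_title is not None and back_title != source_title:
                conflicts.append((source, target, (source_lang, back_title)))
    return conflicts


conflicts = find_asymmetric_links(links)
for source, target, back in conflicts:
    print(f"{source[0]}:{source[1]} -> {target[0]}:{target[1]}, "
          f"which links back to {back[0]}:{back[1]}")
```

A missing back link is deliberately not reported here; the sketch only flags back links that point somewhere else.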
== Future directions ==
In the future the technology of interlanguage links may significantly change thanks to the Interlanguage Extension; see [[A newer look at the interlanguage link]]. When this happens, the discussions about interwiki synchronization may move to the wiki that will be set up for this extension.
== Automated analysis ==
; Note: This database appears very outdated. It's questionable if it has ever been updated since its creation in 2008. [[User:Have mörser, will travel|Have mörser, will travel]] 18:03, 23 September 2011 (UTC)
The largest [[:en:Connected component (graph theory)|connected component]] in the graph of interlanguage links between articles contains well over 70'000 nodes (including over 3'000 from the English edition) and over 3'000'000 links.
There are 24 more connected components with over 1'000 nodes. In total, there are about 60'000 inconsistent connected components in the graph.
These figures are based on an analysis of snapshots from late August 2008.
It is by now next to impossible for a human to untangle the largest graphs.
It is also infeasible to process all 60'000 or so components by hand, even though most of them are quite small.
For that reason, an automatic meaning detection approach, even a very imperfect one, might be quite useful.
An example of such automated analysis [[User:Bolo1729/Analysis/560ab7f3-7382-3406-8a5c-0961aa9c2661|can be found here]] (a middle-sized component with approx. 300 articles and about 25 meanings), [[:Category:Interlanguage links analysis|here are some more]].
The results show the identified meanings, the key nodes and links leading to semantic drift, and a complete set of links to remove to guarantee consistency.
A description of the graph of meanings in the [[:en:DOT language|DOT language]] used by [[:en:Graphviz|Graphviz]] is also provided.
A batch edit based on the above results is possible.
Personally, I am strongly in favor of such an automated correction.
Note that this would be a one-time action, which apparently has never been taken before.
The 70'000+ component must have been growing for years: it contained about 48'000 articles in March 2008.
I'll gladly answer any questions regarding the idea, and upload more examples of analyzed components. If the suggested course of action is approved by the community, I'd be glad to provide generated edit recommendations for all inconsistent components in a suitable format.
I have neither bot permissions nor the necessary experience, so I'm looking forward to cooperating with a bot owner.
I've also performed the analysis for the graph of category interlanguage links, and it requires similar action.
''This is an application of methods which I have developed during my PhD research. I'm currently in the process of writing two articles documenting the topology of the graph and the methods used to identify meanings. In short, the approach used here arranges the nodes in space using the force-based graph layout algorithm with a custom potential. Then, during the reconstruction of meanings, shorter links are considered to be more trustworthy.''
Thanks, [[User:Bolo1729|Bolo1729]] 22:41, 29 October 2008 (UTC)
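As a rough illustration of the idea described above (a force-based layout in which shorter links are treated as more trustworthy), here is a minimal sketch. It is not the author's method: the custom potential is not described on this page, so a generic inverse-square repulsion and a linear spring attraction are assumed, and the graph, the parameter values and the function names <code>force_layout</code> and <code>rank_links_by_length</code> are all hypothetical.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, step=0.05, seed=0):
    """Place nodes in 2D: all pairs repel, linked pairs attract.
    The potential used here is an assumption, not the one from the analysis."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Inverse-square repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = 1.0 / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Linear spring attraction along each link.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += dx
            force[a][1] += dy
            force[b][0] -= dx
            force[b][1] -= dy
        # Move each node along its net force, capping the force magnitude at 1.
        for n in nodes:
            fx, fy = force[n]
            mag = math.hypot(fx, fy)
            if mag > 0:
                scale = min(mag, 1.0) / mag
                pos[n][0] += step * fx * scale
                pos[n][1] += step * fy * scale
    return pos


def rank_links_by_length(pos, edges):
    """Return (length, a, b) for every link, shortest (most trusted) first."""
    ranked = [(math.dist(pos[a], pos[b]), a, b) for a, b in edges]
    return sorted(ranked)


# Hypothetical toy graph: two tightly linked triangles joined by one link.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("a3", "b1")]

positions = force_layout(nodes, edges)
ranked = rank_links_by_length(positions, edges)
for length, a, b in ranked:
    print(f"{a} - {b}: {length:.3f}")
```

In a layout of this kind, a link that joins two otherwise separate clusters tends to come out longer than the links inside each cluster, which is why long links are the first candidates for removal.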
'''Update''': I've opened [http://wikitools.icm.edu.pl/ this service], which presents all of the conflicts in which two or more English articles are involved (a little over 30'000 cases).
In each case, edit recommendations are also presented.
--[[User:Bolo1729|Bolo1729]] 17:01, 4 December 2008 (UTC)
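The connected components discussed in this section can be computed with a plain breadth-first search. The sketch below is only an illustration under stated assumptions: links are treated as undirected, a component is called inconsistent when it contains more than one article from the same language (this page does not spell out the definition used in the analysis), and the edge list and function names are hypothetical toy data.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes into connected components, treating links as undirected."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def is_inconsistent(component):
    """Assumed definition: some language occurs more than once in the component."""
    languages = [lang for lang, _title in component]
    return len(languages) != len(set(languages))


# Hypothetical toy links between (language, title) nodes.
edges = [
    (("en", "Emergency medical technician"), ("fr", "Ambulancier")),
    (("fr", "Ambulancier"), ("en", "Paramedic")),
    (("en", "Kangaroo"), ("de", "Kängurus")),
]

components = connected_components(edges)
for component in components:
    status = "inconsistent" if is_inconsistent(component) else "consistent"
    print(sorted(component), status)
```

On real data the same check would run over a full snapshot of interlanguage links; the toy edge list only shows the shape of the input.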
== Opening a new case ==
= Current discussions =
{{/Lonesome George & Chelonoidis abingdonii}}
{{/Nymph}}
{{/Dutch Brazil}}
{{/Order of Malta}}
{{/gunpowder}}
{{/Kangaroo}}
{{/Daucus carota}}
{{/Normandy Landings}}
{{/Open source vs Open source software}}
{{/Apple vs. Malus domesticus}}
{{/Finger vs. Digit (anatomy)}}
{{/Trauma}}
{{/Featured articles}}
{{/Occultation-Eclipse}}
{{/Antarctica}}
{{/Dollar}}
{{/Epistemology}}
{{/Brand and trademark}}
{{/Remembrance day}}
{{/elementary-primary-comprehensive-school-education-etc}}
{{/Google}}
{{/Soldier - Military personnel}}
{{/Microsoft_.NET_vs_Framework_.NET}}
{{/Rice}}
{{/Contre-la-montre-Time trial}}
{{/A Few Good Men}}
{{/Radioactivity}}
{{/publishing}}
{{/nl:Röntgenfoto vs de:Röntgen and de:Röntgenaufnahme}}
{{/curator}}
{{/Mobile and Mobile Phone Operating Systems}}
<!-- add new cases to the bottom of the list in the format: {{/CONFLICTNAME}} -->
*[[/Tax]]
==September 2008==
* [[/Black box/]]
==October 2008==
* [[/Philately/]]
* [[/Dumpling/]]
* [[A newer look at the interlanguage link]]
* [[Interwiki conflicts]]
* [[Help:Interwiki_linking#Interlanguage_link|Interlanguage links]]
* [[w:Wikipedia:WikiProject Interlanguage Links/Ideas from the Hebrew Wikipedia]]