XML retrieval: Difference between revisions

dead link
Monkbot (talk | contribs)
Line 5:
'''XML Retrieval''', or XML Information Retrieval,<ref>{{cite journal |
author=Luk, R.W.P.| coauthors= H.V. Leong, T.S. Dillon, Alvin T. S. Chan, W. B. Croft and J. Allan | year=2002|
title=A survey in indexing and searching XML documents | journal = Journal of the American Society for Information Science and Technology | volume = 53 | issue =6 | pages = 415–437| doi = 10.1002/asi.10056}}</ref> is the content-based retrieval of documents structured with [[XML]] (eXtensible Markup Language). It is used to compute the [[Relevance (information retrieval)|relevance]] of XML documents.<ref>{{Cite web|url=ftp://ftp.tm.informatik.uni-frankfurt.de/pub/papers/ir/An%20Architecture%20for%20XML%20Information%20Retrieval%20in%20a%20Peer-to-Peer%20Environment_2007.pdf|title=An Architecture for XML Information Retrieval in a Peer-to-Peer Environment|last=Winter|first=Judith|author2=Drobnik, Oswald |date=November 9, 2007|publisher=ACM|accessdate=2009-02-10}}</ref>
 
==Queries==
Line 17:
 
==Existing XML search engines==
An overview of two potential approaches is available.<ref>{{Cite web|url=http://www.sigmod.org/record/issues/0612/p16-article-yahia.pdf|title=XML Search: Languages, INEX and Scoring|last=Amer-Yahia|first=Sihem|author2=Lalmas, Mounia |year=2006|publisher=SIGMOD Rec. Vol. 35, No. 4|accessdate=2009-02-10}} {{Dead link|date=October 2010|bot=H3llBot}}</ref><ref>{{Cite web|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.5986&rep=rep1&type=pdf|title=XML Retrieval: A Survey|last=Pal|first=Sukomal|date=June 30, 2006|publisher=Technical Report, CVPR|accessdate=2013-07-04}}</ref> The INitiative for the Evaluation of XML-Retrieval (''INEX'') was founded in 2002 and provides a platform for evaluating such [[algorithm]]s.<ref name="INEX2006" /> Three different areas influence XML-Retrieval:<ref name="INEX2002">{{Cite web|url=http://www.is.informatik.uni-duisburg.de/bib/pdf/ir/Fuhr_etal:02a.pdf|title=INEX: Initiative for the Evaluation of XML Retrieval|last=Fuhr|first=Norbert|coauthors=Gövert, N.; Kazai, Gabriella; Lalmas, Mounia|year=2003|work=Proceedings of the First INEX Workshop, Dagstuhl, Germany, 2002|publisher=ERCIM Workshop Proceedings, France|accessdate=2009-02-10}}</ref>
 
===Traditional XML query languages===
[[Query language]]s such as the [[W3C]] standard [[XQuery]]<ref>{{Cite web|url=http://www.w3.org/TR/2007/REC-xquery-20070123/|title=XQuery 1.0: An XML Query Language|last=Boag|first=Scott|coauthors=Chamberlin, Don; Fernández, Mary F.; Florescu, Daniela; Robie, Jonathan; Siméon, Jérôme|date=23 January 2007|work=W3C Recommendation|publisher=World Wide Web Consortium|accessdate=2009-02-10}}</ref> support complex queries but return only exact matches. They therefore need to be extended to allow vague search with relevance computation. Most XML-centered approaches assume fairly precise knowledge of the documents' [[Database schema|schemas]].<ref name="Schlieder2002">{{Cite journal|url=http://web.archive.org/web/20070610002349/http://www.cis.uni-muenchen.de/people/Meuss/Pub/JASIS02.ps.gz|title=Querying and Ranking XML Documents|last=Schlieder|first=Torsten|author2=Meuss, Holger |year=2002|work= Journal of the American Society for Information Science and Technology, Vol. 53, No. 6|accessdate=2009-02-10}}</ref>
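The contrast between exact matching and vague search can be illustrated with a minimal sketch. It is not taken from the cited sources: the sample document, the <code>score</code> and <code>ranked</code> helpers, and the raw term-count scoring are all assumptions chosen for brevity, and Python's ElementTree path syntax stands in for a full XQuery processor.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample collection, invented for this illustration.
DOC = """
<library>
  <book><title>XML retrieval basics</title><abstract>Ranking XML elements by relevance.</abstract></book>
  <book><title>Databases</title><abstract>Relational storage and exact queries over XML data.</abstract></book>
  <book><title>Cooking</title><abstract>Recipes for soup.</abstract></book>
</library>
"""

root = ET.fromstring(DOC)

# Exact match: the predicate holds only if the title text equals the literal,
# and the query author must already know the path library/book/title.
exact = root.findall("./book[title='Databases']")


def tokens(text):
    """Lower-case a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(element, query):
    """Assumed toy relevance score: total occurrences of the query terms
    anywhere in the element's text, regardless of the child tag."""
    counts = Counter(tokens(" ".join(element.itertext())))
    return sum(counts[term] for term in tokens(query))


def ranked(tree_root, query):
    """Return (score, title) pairs for matching books, best score first."""
    scored = [(score(book, query), book.findtext("title"))
              for book in tree_root.iter("book")]
    return sorted((pair for pair in scored if pair[0] > 0),
                  key=lambda pair: -pair[0])


print([book.findtext("title") for book in exact])
print(ranked(root, "xml relevance"))
```

In this sketch the exact query for the title <code>Databases</code> returns a single element, and a query for the title <code>XML</code> would return nothing, whereas the ranked search orders every book that mentions a query term by its score. Real systems replace the raw term count with weighting schemes that also take element structure into account.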
 
===Databases===