Talk:Comparison of data-serialization formats: Difference between revisions

SeanJA (talk | contribs)
SeanJA (talk | contribs)
Line 12:
This section should be on the XML page...
 
===XML Advantages===
{{Prose|date=August 2009}}
 
===Advantages===
* XML provides a basic syntax that can be used to share information between different kinds of computers, different applications, and different organizations. XML data is stored in plain text format.<ref name="w3chowxmluse">{{cite web|url=http://www.w3schools.com/Xml/xml_usedfor.asp |title=How Can XML be Used? |publisher=W3schools.com |date= |accessdate=2009-07-31}}</ref> This software- and hardware-independent way of storing data allows otherwise incompatible systems to share data without passing it through many layers of conversion. It also makes it easier to expand or upgrade to new operating systems, new applications, or new browsers without losing any data.
* It supports [[Unicode]], allowing almost any information in any written human language to be communicated.
Line 24 ⟶ 25:
* Its predecessor, [[SGML]], has been in use since 1986, so there is extensive experience and software available.
 
===Disadvantages===
[[User:SeanJA|SeanJA]] ([[User talk:SeanJA|talk]]) 05:25, 12 September 2009 (UTC)
* XML syntax is redundant or large relative to binary representations of similar data,<ref name="Elliotte001">
{{cite book
| last = Harold
| first = Elliotte Rusty
| title = Processing XML with Java(tm): a guide to SAX, DOM, JDOM, JAXP, and TrAX
| publisher = Addison-Wesley
| year = 2002
| isbn = 0201771861
| ref = Reference-Rusty-2002-a
}}XML documents are too verbose compared with binary equivalents.</ref> especially with [[Table (information)|tabular]] data.
* The redundancy may affect application efficiency through higher storage, transmission and processing costs.<ref name="Elliotte000">
{{cite book
| last = Harold
| first = Elliotte Rusty
| title = XML in a Nutshell: A Desktop Quick Reference
| publisher = O'Reilly
| year = 2002
| isbn = 0596002920
| ref = Reference-Rusty-2002-b
}} XML documents are very verbose and searching is inefficient for
high-performance large-scale database applications.</ref><ref name="However000">However, the [[Binary XML]] effort strives to alleviate these problems by using a binary representation for the XML document. For example, the [[Java (programming language)|Java]] reference implementation of the [[Fast Infoset]] standard is faster at parsing by a factor of 10 compared to [[Java (programming language)|Java]] [[Xerces]], and by a factor of 4 compared to the [http://piccolo.sourceforge.net/ Piccolo driver], one of the fastest Java-based XML parsers [https://fi.dev.java.net/reports/parsing/report.html].</ref>
* XML syntax is verbose, especially for human readers, relative to other alternative 'text-based' data transmission formats.<ref name="Bierman000">
{{cite book
| last = Bierman
| first = Gavin
| title = Database Programming Languages: 10th international symposium, DBPL 2005 Trondheim, Norway
| publisher = Springer
| year = 2005
| isbn = 3540309519
}}XML syntax is too verbose for human readers in certain applications.
Proposes a dual syntax for human readability.</ref><ref name="VerbRebut000">However, many purportedly "less verbose" text formats actually cite XML as both inspiration and prior art. See, e.g., http://yaml.org/spec/current.html, http://innig.net/software/sweetxml/index.html, and http://www.json.org/xml.html.</ref>
* The [[hierarchical model]] for representation is limited in comparison to an [[object oriented|object-oriented]] [[Graph (mathematics)|graph]].<ref name="TreeLimit000">A hierarchical model only gives a fixed, monolithic view of the [[tree structure]]. For example, either actors under movies, or movies under actors, but not both.</ref><ref name="Lim000">
{{cite book
| last = Lim
| first = Ee-Peng
| title = Digital Libraries: People, Knowledge, and Technology
| publisher = Springer
| year = 2002
| isbn = 3540002618
}}Discusses some of the limitations of a fixed hierarchy. Proceedings of the 5th International Conference on Asian Digital Libraries, ICADL 2002, held in Singapore in December 2002.</ref>
* Expressing overlapping (non-hierarchical) node relationships requires extra effort.<ref name="Searle000">{{cite book
| last = Searle
| first = Leroy F.
| title = Voice, text, hypertext: emerging practices in textual studies
| publisher = University of Washington Press
| year = 2004
| isbn = 0295983051
}} Proposes an alternative system for encoding overlapping elements. </ref>
* XML namespaces are problematic to use and namespace support can be difficult to correctly implement in an XML parser.<ref name="Names000">(See e.g., http://www-128.ibm.com/developerworks/library/x-abolns.html )</ref>
* XML is commonly depicted as "[[self-documenting]]" but this depiction ignores critical ambiguities.<ref name="selfdesc000">{{cite web
| title = The Myth of Self-Describing XML
| url = http://www.oceaninformatics.biz/publications/e2.pdf
|format=PDF| accessdate = 2007-05-12
}}</ref><ref>(See e.g., [[Use–mention distinction]], [[Naming collision]], [[Polysemy]])</ref>
* The distinction between content and attributes in XML seems unnatural to some and makes designing XML data structures harder.<ref name="XMLSuck8">{{cite web
| title = Does XML Suck?
| url = http://xmlsucks.org/but_you_have_to_use_it_anyway/does-xml-suck.html
| accessdate = 2007-12-15
}}(See "8. Complexity: Attributes and Content")</ref>
* Transformations, even identity transforms, result in changes to format (whitespace, attribute ordering, attribute quoting, whitespace around attributes, newlines). These problems can make [[diff]]-ing the XML source very difficult except via [[Canonical XML]].
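
The last point, about diffing, can be illustrated with a minimal sketch. It assumes Python 3.8 or later, whose standard library provides <code>xml.etree.ElementTree.canonicalize</code> (an implementation of Canonical XML 2.0). The two sample documents and their element names are made up for illustration and do not come from any cited source.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical serializations of the same data. They differ only in
# attribute order, attribute quoting, and whitespace inside the start tag.
doc_a = '<movie year="1999" title="Example Film"><actor>Example Actor</actor></movie>'
doc_b = "<movie  title='Example Film'\n   year='1999' ><actor>Example Actor</actor></movie>"

# Canonicalization rewrites both into one normalized form:
# attributes sorted by name, double-quoted, single spaces in the tag.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

print(doc_a == doc_b)      # False: a plain text diff reports a change
print(canon_a == canon_b)  # True: the canonical forms are identical
print(canon_a)
```

With default options, whitespace in text content (for example, newlines between child elements) is preserved, so pretty-printed and compact versions of a document can still differ after canonicalization.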