Office Open XML file formats: Difference between revisions

Rescuing 2 sources and tagging 0 as dead.) #IABot (v2.0.9.5) (Eastmain - 17674
Bender the Bot (talk | contribs)
m Office MathML (OMML): HTTP to HTTPS for Blogspot
 
(6 intermediate revisions by 6 users not shown)
Line 5:
{{Infobox file format
| name = Office Open XML Document
| icon = X-office-document.docx icon.svg
| logo =
| screenshot =
| caption =
| extension = .docx, .docm
| mime = application/vnd.<br />openxmlformats-officedocument.<br />wordprocessingml.<br />document<ref name="mimetype">{{ cite web | url = https://technet.microsoft.com/en-us/library/cc179224.aspx | title = Register file extensions on third party servers | author = Microsoft | date = 26 February 2008 | access-date = 2009-09-04 | publisher = microsoft.com }}</ref>
| type code =
| uniform type =
Line 24:
| extended to =
| standard = ECMA-376, ISO/IEC 29500
| url = [http://www.ecma-international.org/publications-and-standards/standards/ecma-376/ ECMA-376], [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=45374 ISO/IEC 29500:2008] }}
{{Infobox file format
| name = Office Open XML Presentation
| icon = X-office-presentation.pptx icon.svg
| logo =
| screenshot =
| caption =
| extension = .pptx, .pptm
| mime = application/vnd.<br />openxmlformats-officedocument.<br />presentationml.<br />presentation<ref name="mimetype">{{ cite web | url = https://technet.microsoft.com/en-us/library/cc179224.aspx | title = Register file extensions on third party servers | author = Microsoft | date = 26 February 2008 | access-date = 2009-09-04 | publisher = microsoft.com }}</ref>
| type code =
| uniform type =
Line 46:
| extended to =
| standard = ECMA-376, ISO/IEC 29500
| url = [http://www.ecma-international.org/publications-and-standards/standards/ecma-376/ ECMA-376], [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=45374 ISO/IEC 29500:2008] }}
{{Infobox file format
| name = Office Open XML Workbook
| icon = X-office-spreadsheet.xlsx icon.svg
| logo =
| screenshot =
| caption =
| extension = .xlsx, .xlsm
| mime = application/vnd.<br />openxmlformats-officedocument.<br />spreadsheetml.<br />sheet<ref name="mimetype">{{ cite web | url = https://technet.microsoft.com/en-us/library/cc179224.aspx | title = Register file extensions on third party servers | author = Microsoft | date = 26 February 2008 | access-date = 2009-09-04 | publisher = microsoft.com }}</ref>
| type code =
| uniform type =
Line 68:
| extended to =
| standard = ECMA-376, ISO/IEC 29500
| url = [http://www.ecma-international.org/publications-and-standards/standards/ecma-376/ ECMA-376], [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=45374 ISO/IEC 29500:2008]
}}
 
Line 164:
Shared markup language materials include:
* Office Math Markup Language (OMML)
* DrawingML, used for vector drawing, charts, and, for example, text art (the deprecated [[Vector Markup Language|VML]] is also supported for drawing)
* Extended properties
* Custom properties
Line 190:
| date=21 October 2008 }}</ref>
 
The [[W3C XML Schema|XML Schema]] of Office Open XML emphasizes reducing load time and improving [[parsing]] speed.<ref>{{Cite web| title=Software Developer uses Office Open XML to Minimize File Space, Increase Interoperability| url=http://www.openxmlcommunity.org/documents/casestudies/Intellisafe_OpenXML_Final.pdf | author=Intellisafe Technologies}}{{Dead link|date=July 2025 |bot=InternetArchiveBot |fix-attempted=yes }}</ref> In a test of applications current in April 2007, however, XML-based office documents were slower to load than binary formats.<ref>{{cite web | url=http://blogs.zdnet.com/Ou/?p=480 | title=MS Office 2007 versus Open Office 2.2 shootout | author=George Ou | date=2007-04-27 | access-date=2007-04-27 | publisher=ZDnet.com | archive-date=2009-03-26 | archive-url=https://web.archive.org/web/20090326081043/http://blogs.zdnet.com/Ou/?p=480 | url-status=dead }}</ref> To enhance performance, Office Open XML uses very short element names for common elements, and spreadsheets save dates as index numbers (starting from 1900 or from 1904).<ref>{{ cite web | url=https://support.microsoft.com/en-us/kb/214330 | title=Differences between the 1900 and the 1904 date system in Excel | date=2013-03-05 | access-date=2016-08-23 | publisher=Microsoft}}</ref> To be systematic and generic, Office Open XML typically uses separate child elements for data and metadata (element names ending in ''Pr'', for ''properties'') rather than multiple attributes, which allows properties to be structured. Office Open XML does not use mixed content; instead, it uses elements to group a series of text runs (element name ''r'') into paragraphs (element name ''p''). The result is terse{{Citation needed|date=October 2009}} and highly nested, in contrast to [[HTML]], for example, which is fairly flat, is designed for humans to write in [[text editors]], and is more congenial for humans to read.
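
The date index numbers mentioned above can be illustrated with a short sketch. It is not taken from the text of the standard: the function name <code>serial_to_date</code> and its <code>date1904</code> parameter are hypothetical, and the treatment of index 60 (which Excel's 1900 system assigns to the non-existent 29 February 1900) is an assumption based on Excel's documented behavior rather than on anything stated in this section. Only whole-day values are handled.

<syntaxhighlight lang="python">
from datetime import date, timedelta


def serial_to_date(serial: int, date1904: bool = False) -> date:
    """Convert a whole-day spreadsheet date index to a calendar date.

    Illustrative only; assumes Excel-style 1900 and 1904 date systems.
    """
    if date1904:
        if serial < 0:
            raise ValueError("1904-system indexes start at 0")
        # 1904 system: index 0 is 1 January 1904.
        return date(1904, 1, 1) + timedelta(days=serial)

    if serial < 1:
        raise ValueError("1900-system indexes start at 1")
    if serial == 60:
        # Assumption: index 60 stands for 29 February 1900, a day that
        # never existed, so it has no calendar equivalent.
        raise ValueError("index 60 has no real calendar date")

    # 1900 system: index 1 is 1 January 1900. Indexes above 60 are shifted
    # by one day to compensate for the fictitious leap day.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)
</syntaxhighlight>

Under these assumptions, index 1462 in the 1900 system and index 0 in the 1904 system both denote 1 January 1904, which reflects the 1,462-day offset between the two systems.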
 
The naming of elements and attributes within the text has attracted some criticism. OOXML (ECMA-376) has three different syntaxes for specifying the color and alignment of text, depending on whether the document is a text document, a spreadsheet, or a presentation. Rob Weir (an [[IBM]] employee and co-chair of the [[OASIS (organization)|OASIS]] [[OpenDocument Format]] TC) asks "What is the engineering justification for this horror?". He contrasts this with [[OpenDocument]]: "ODF uses the W3C's XSL-FO vocabulary for text styling, and uses this vocabulary consistently".<ref>{{ cite web | url=http://www.robweir.com/blog/2008/03/disharmony-of-ooxml.html | title= Disharmony of OOXML | author=Rob Weir | date=14 March 2008}}</ref>
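
The point about differing syntaxes can be made concrete with a small sketch that reads the same red text color from each of the three vocabularies. The XML fragments are simplified illustrations written for this example, not quotations from ECMA-376 or from the cited blog post, and the helper function names are hypothetical. The element and attribute names shown (<code>w:color</code>, <code>color</code> with <code>rgb</code>, and <code>a:srgbClr</code>) are given as they commonly appear in generated files and should be checked against the specification.

<syntaxhighlight lang="python">
import xml.etree.ElementTree as ET

# Namespace URIs of the three markup vocabularies.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"

# Simplified, hand-written fragments (assumed shapes, for illustration only).
WORD_RUN_PROPS = f'<w:rPr xmlns:w="{W}"><w:color w:val="FF0000"/></w:rPr>'
SHEET_FONT = f'<font xmlns="{S}"><color rgb="FFFF0000"/></font>'
SLIDE_RUN_PROPS = (
    f'<a:rPr xmlns:a="{A}"><a:solidFill>'
    '<a:srgbClr val="FF0000"/></a:solidFill></a:rPr>'
)


def word_color(xml: str) -> str:
    """Text document: namespaced 'val' attribute on a 'color' element."""
    color = ET.fromstring(xml).find(f"{{{W}}}color")
    return color.get(f"{{{W}}}val")


def sheet_color(xml: str) -> str:
    """Spreadsheet: un-namespaced 'rgb' attribute holding alpha plus RGB."""
    color = ET.fromstring(xml).find(f"{{{S}}}color")
    return color.get("rgb")[-6:]  # keep the last six digits, dropping alpha


def slide_color(xml: str) -> str:
    """Presentation: 'srgbClr' element nested inside a fill element."""
    color = ET.fromstring(xml).find(f".//{{{A}}}srgbClr")
    return color.get("val")
</syntaxhighlight>

In this sketch, all three helpers return the same string, <code>FF0000</code>, but each needs its own lookup path, which is the inconsistency the criticism is aimed at.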
 
Some have argued that the design is based too closely on Microsoft applications.
In August 2007, the [[Linux Foundation]] published a blog post calling upon ISO National Bodies to vote "No, with comments" during the international standardization of OOXML. It said, "OOXML is a direct port of a single vendor's binary document formats. It avoids the re-use of relevant existing international standards (e.g. several cryptographic algorithms, VML, etc.). There are literally hundreds of technical flaws that should be addressed before standardizing OOXML including continued use of binary code tied to platform specific features, propagating bugs in MS-Office into the standard, proprietary units, references to proprietary/confidential tags, unclear [[Intellectual property|IP]] and patent rights, and much more".<ref>{{ cite web | url=http://www.linux-foundation.org/weblogs/cherry/2007/08/29/ooxml-vote-no-with-comments/ | title=OOXML&nbsp;— vote "No, with comments" | author=John Cherry | date=29 August 2007 | access-date=30 October 2009 | archive-date=22 August 2009 | archive-url=https://web.archive.org/web/20090822055654/http://www.linux-foundation.org/weblogs/cherry/2007/08/29/ooxml-vote-no-with-comments/ | url-status=dead }}</ref>
 
The version of the standard submitted to [[ISO/IEC JTC 1|JTC 1]] was 6546 pages long. The need for and appropriateness of such length have been questioned.<ref name="GooglesPositiononOOXML">{{ cite web | url = http://www.odfalliance.org/resources/Google%20OOXML%20Q%20%20A.pdf | title = Google's Position on OOXML as a Proposed ISO Standard | date = February 2008 | publisher = [[Google]] | quote = If ISO were to give OOXML with its 6546 pages the same level of review that other standards have seen, it would take 18 years (6576 days for 6546 pages) to achieve comparable levels of review to the existing ODF standard (871 days for 867 pages) which achieves the same purpose and is thus a good comparison. Considering that OOXML has only received about 5.5% of the review that comparable standards have undergone, reports about inconsistencies, contradictions and missing information are hardly surprising | url-status = dead | archive-url = https://web.archive.org/web/20100818112807/http://www.odfalliance.org/resources/Google%20OOXML%20Q%20%20A.pdf | archive-date = 2010-08-18 }}</ref><ref>{{ cite web | url = http://www.ibm.com/developerworks/library/x-ooxmlstandard.html | title = OOXML: What's the big deal? | date = 2008-02-19 | publisher = [[IBM]] | url-status = dead | archive-url = https://web.archive.org/web/20091003044227/http://www.ibm.com/developerworks/library/x-ooxmlstandard.html | archive-date = 2009-10-03 }}</ref> [[Google]] stated that "the ODF standard, which achieves the same goal, is only 867 pages".<ref name="GooglesPositiononOOXML"/>
Line 213:
=== Office MathML (OMML) ===
Office Math Markup Language is a mathematical markup language which can be embedded in WordprocessingML, with intrinsic support for including word processing markup like revision markings,<ref>{{cite web|url = http://idippedut.dk/post/Do-your-math-OOXML-and-OMML|title = Do your math - OOXML and OMML (Updated 2008-02-12)|author = Jesper Lund Stocholm|publisher = A Mooh Point blog|date = 2008-02-12|access-date = 2015-11-18|archive-date = 2016-03-26|archive-url = https://web.archive.org/web/20160326225935/http://idippedut.dk/post/Do-your-math-OOXML-and-OMML|url-status = dead}}</ref> footnotes, comments, images and elaborate formatting and styles.<ref>{{cite web| url=http://blogs.msdn.com/murrays/archive/2007/06/05/science-and-nature-have-difficulties-with-word-2007-mathematics.aspx| title=Science and Nature have difficulties with Word 2007 mathematics| author=Murray Sargent| publisher=MSDN blogs| date=2007-06-05| access-date=2007-07-31}}</ref>
OMML differs from the [[World Wide Web Consortium]] (W3C) [[MathML]] recommendation, which does not support those office features, but the two are partially compatible<ref>{{cite web| url=https://dpcarlisle.blogspot.com/2007/04/xhtml-and-mathml-from-office-20007.html| title=XHTML and MathML from Office 2007| author=David Carlisle| publisher=David Carlisle| date=2007-05-09| access-date=2007-09-20}}</ref> through [[XSL Transformations]]; conversion tools are provided with the office suite and are used automatically via clipboard transformations.<ref>{{Cite web|url=http://blogs.msdn.com/b/murrays/archive/2007/06/05/science-and-nature-have-difficulties-with-word-2007-mathematics.aspx|title = DevBlogs}}</ref>
 
The following Office MathML example defines the [[fraction (mathematics)|fraction]]: <math>\frac{\pi}{2}</math>
Line 228:
</syntaxhighlight>
 
Some have queried the need for Office MathML (OMML), instead advocating the use of [[MathML]], a [[World Wide Web Consortium|W3C]] recommendation for the "inclusion of mathematical expressions in Web pages" and "machine to machine communication".<ref>{{cite web| url=http://www.zdnet.com.au/news/software/soa/Microsoft-Office-dumped-by-Science-and-Nature/0,130061733,339278690,00.htm| title=Microsoft Office dumped by Science and Nature| publisher=ZDNet Australia| date=18 June 2007}}</ref> Murray Sargent has answered some of these issues in a blog post, which details some of the philosophical differences between the two formats.<ref>{{Cite web|url=http://blogs.msdn.com/b/murrays/archive/2006/10/07/mathml-and-ecma-math-_2800_omml_2900_-.aspx|title = DevBlogs}}</ref>
 
=== DrawingML ===<!-- English Metric Unit links to here -->