Comparison of data-serialization formats: Difference between revisions

Add Preserves
m resize table font
 
(35 intermediate revisions by 20 users not shown)
Line 1:
{{Short description|None}}
This is a '''comparison of [[data serialization]] formats''', various ways to convert complex [[object (computer science)|object]]s to sequences of [[bit]]s. It does not include [[markup language]]s used exclusively as [[document file format]]s.
 
==Overview==
 
{{sort-under}}
{{sticky table start}}
{| class="wikitable sortable sort-under sticky-table-head" style="font-size:75%"
|-
! Name
Line 16 ⟶ 19:
! Standard [[API]]s
! Supports [[zero-copy]] operations
|-
| [[Apache Arrow]]
| [[Apache Software Foundation]]
| {{n/a}}
| {{partial|''De facto''}}
| [https://arrow.apache.org/docs/format/Columnar.html Arrow Columnar Format]
| {{yes}}
| {{no}}
| {{yes}}
| {{yes|Built-in}}
| C, C++, C#, Go, Java, JavaScript, Julia, Matlab, Python, R, Ruby, Rust, Swift
| {{yes}}
|-
| [[Apache Avro]]
Line 40 ⟶ 55:
| Java, Python, C++
| {{no}}
|-
| [[Apache Thrift]]
| [[Facebook]] (creator)<br>[[Apache Software Foundation|Apache]] (maintainer)
| {{n/a}}
| {{no}}
| [http://thrift.apache.org/static/files/thrift-20070401.pdf Original whitepaper]
| {{yes}}
| {{partial}}{{ref|thrifttxt|c}}
| {{no}}
| {{yes|Built-in}}
| C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages<ref>[https://thrift.apache.org/ Apache Thrift]</ref>
| {{n/a}}
|-
| [[ASN.1]]
Line 91 ⟶ 118:
| [[CBOR]]
| Carsten Bormann, [[Paul Hoffman (engineer)|P. Hoffman]]
| [[MessagePack]]<ref>{{cite web|url=https://github.com/msgpack/msgpack/issues/258#issuecomment-449978394|title=CBOR relationship with msgpack|first1=Carsten|last1=Bormann|website=[[GitHub]] |date=2018-12-26|access-date=2023-08-14}}</ref>
| {{yes}}
| RFC 8949
Line 98 ⟶ 125:
| {{yes}}, <br/>through tagging
| {{yes|[https://tools.ietf.org/html/rfc8610 CDDL]}}
| {{yes|[[FIDO_Alliance|FIDO2]]}}
| {{no}}
| {{no}}
|-
Line 104 ⟶ 131:
| RFC author:<br>Yakov Shafranovich
| {{n/a}}
| {{partial|Myriad informal variants}}
| RFC 4180<br>(among others)
| {{no}}
Line 148 ⟶ 175:
| {{Yes|[[Document Object Model|DOM]], [[Simple API for XML|SAX]], [[StAX]], [[XQuery]], [[XPath]]}}
| {{n/a}}
|-
| [[Extensible Data Notation]] (edn)
| [[Rich Hickey]] / Clojure community
| [[Clojure]]
| {{yes}}
| [https://github.com/edn-format/edn Official edn spec]
| {{no}}
| {{yes}}
| {{no}}
| {{no}}
| Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python<ref>{{cite web|url=https://github.com/edn-format/edn/wiki/Implementations|title=Implementations|website=[[GitHub]] }}</ref>
| {{no}}
|-
| [[FlatBuffers]]
Line 217 ⟶ 256:
| {{yes}}
| {{yes|[https://tools.ietf.org/html/rfc6901 JSON Pointer (RFC{{nbsp}}6901)], or alternately, [http://goessner.net/articles/JsonPath/ JSONPath], [https://web.archive.org/web/20120922110739/http://bluelinecity.com/software/jpath/ JPath], [https://web.archive.org/web/20121203081945/http://www.jspon.org/ JSPON], [https://github.com/lloyd/JSONSelect json:select()]; and [[JSON-LD]]}}
| {{partial}}<br>([http://json-schema.org/ JSON Schema Proposal], [[ASN.1]] with [[JSON encoding rules|JER]], [http://www.kuwata-lab.com/kwalify/ Kwalify] {{Webarchive|url=https://web.archive.org/web/20210812231831/http://www.kuwata-lab.com/kwalify/ |date=2021-08-12 }}, [http://rjbs.manxome.org/rx/ Rx]), [[JSON-LD]]
| {{partial}}<br>([https://github.com/dscape/clarinet Clarinet], [https://www.sitepen.com/blog/jsonquery-data-querying-beyond-jsonpath JSONQuery] / [https://www.sitepen.com/blog/resource-query-language-a-query-language-for-the-web-nosql RQL], [http://goessner.net/articles/JsonPath/ JSONPath]), [[JSON-LD]]
| {{no}}
Line 304 ⟶ 343:
| {{yes}}
| {{no}}
|-
| Preserves
| Tony Garnock-Jones
| -
| {{yes}}
| [https://preserves.dev/preserves.html Specification]
| {{yes|[https://preserves.dev/preserves-binary.html Yes]}}
| {{yes|[https://preserves.dev/preserves-text.html Yes]}}
| {{yes}}
| {{yes|[https://preserves.dev/preserves-schema.html Yes]}}
| {{n/a}}
| {{n/a}}
|-
| [[Property list]]
Line 332 ⟶ 359:
| [[Google]]
| {{n/a}}
| {{no}}
| [https://developers.google.com/protocol-buffers/docs/encoding Developer Guide: Encoding], [https://developers.google.com/protocol-buffers/docs/reference/proto2-spec proto2 specification], and [https://developers.google.com/protocol-buffers/docs/reference/proto3-spec proto3 specification]
| {{yes}}
Line 345 ⟶ 372:
| [[Lisp (programming language)|Lisp]], [[Netstring]]s
| {{partial|Largely ''de facto''}}
| [http://people.csail.mit.edu/rivest/Sexp.txt "S-Expressions"] {{Webarchive|url=https://web.archive.org/web/20131007024815/http://people.csail.mit.edu/rivest/Sexp.txt |date=2013-10-07 }} [[Internet Draft]]
| {{yes}}, ''canonical representation''
| {{yes}}, ''advanced transport representation''
Line 387 ⟶ 414:
| {{no}}
|
| {{n/a}}
|-
Line 457 ⟶ 472:
| {{yes}}
| {{yes}}
| {{partial}}<br>([http://www.kuwata-lab.com/kwalify/ Kwalify] {{Webarchive|url=https://web.archive.org/web/20210812231831/http://www.kuwata-lab.com/kwalify/ |date=2021-08-12 }}, [http://rjbs.manxome.org/rx/ Rx], built-in language type-defs)
| {{no}}
| {{no}}
Line 473 ⟶ 488:
! Supports [[zero-copy]] operations
|}
{{sticky table end}}
 
{{ordered list
| list-style-type=lower-alpha
Line 486 ⟶ 503:
 
==Syntax comparison of human-readable formats==
 
{{sticky table start}}
{| class="wikitable sortable sort-under sticky-table-head" style="font-size:75%"
|-
! Format
Line 544 ⟶ 563:
A to Z,1,2,3</pre>
|-
| [[Extensible Data Notation|edn]]
| <code>nil</code>
| <code>true</code>
| <code>false</code>
| <code>685230</code><br><code>-685230</code>
| <code>6.8523015e+5</code>
| <code>"A to Z"</code>, <code>"A \"up to\" Z"</code>
| <code>[true nil -42.1e7 "A to Z"]</code>
| <code>{:kw 1, "42" true, "A to Z" [1 2 3]}</code>
|-
! Format
! [[Nullable type|Null]]
! [[Boolean data type|Boolean]] true
! [[Boolean data type|Boolean]] false
! [[Integer (computer science)|Integer]]
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array data type|Array]]
! [[Associative array]]/[[Object (computer science)|Object]]
|-
| [[Ion (Serialization format)|Ion]]
Line 627 ⟶ 646:
true
"A to Z", (1, 2, 3)</pre>
|-
! Format
! [[Nullable type|Null]]
! [[Boolean data type|Boolean]] true
! [[Boolean data type|Boolean]] false
! [[Integer (computer science)|Integer]]
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array data type|Array]]
! [[Associative array]]/[[Object (computer science)|Object]]
|-
| [[OpenDDL]]
Line 683 ⟶ 692:
| <code>(lI01\na(laF-421000000.0\naS'A to Z'\na.</code>
| <code>(dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas.</code>
|-
| Preserves
| <code><null></code>
| <code>#t</code>
| <code>#f</code>
| <code>685230</code>
| <code>685230.15f</code>
| <code>"A to Z"</code>
| <code>[#t <null> -421000000.0f "A to Z"]</code>
| <code>{42: #f "A to Z": [1 2 3]}</code>
|-
| [[Property list]]<br>(plain text format)<ref name="gnustep">{{cite web|url=http://www.gnustep.org/resources/documentation/Developer/Base/Reference/NSPropertyList.html|title=NSPropertyListSerialization class documentation|website=www.gnustep.org|access-date=2009-10-28|archive-url=https://web.archive.org/web/20110519164921/http://gnustep.org/resources/documentation/Developer/Base/Reference/NSPropertyList.html|archive-date=2011-05-19|url-status=dead}}</ref>
Line 763 ⟶ 762:
[extensionFieldThatIsAnEnum]: EnumValue
</syntaxhighlight>
|-
! Format
! [[Nullable type|Null]]
! [[Boolean data type|Boolean]] true
! [[Boolean data type|Boolean]] false
! [[Integer (computer science)|Integer]]
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array data type|Array]]
! [[Associative array]]/[[Object (computer science)|Object]]
|-
| [[S-expression]]s
Line 856 ⟶ 845:
</struct></syntaxhighlight>
|}
{{sticky table end}}
{{ordered list
| list-style-type=lower-alpha
Line 865 ⟶ 855:
| {{note|lispstd}}This syntax is not compatible with the Internet-Draft, but is used by some dialects of [[Lisp (programming language)|Lisp]].
}}
 
 
==Comparison of binary formats==
<!--This table is meant to describe how the various datatypes are encoded in binary in the various formats.-->
 
{{sticky table start}}
{| class="wikitable sortable sort-under sticky-table-head sticky-table-col1" style="font-size:75%"
|- style="vertical-align:bottom;"
! Format
Line 876 ⟶ 869:
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array (data type)|Array]]
! [[Associative array]]/[[object (computer science)|object]]
|- style="vertical-align:top;"
Line 900 ⟶ 893:
| Data specifications {{mono|SET OF}} (unordered) and {{mono|SEQUENCE OF}} (guaranteed order)
| User definable type
|- style="vertical-align:top;"
| Binn
| <code>\x00</code>
| {{ubli
| True: <code>\x01</code>
| False: <code>\x02</code>
}}
| [[Big-endian]] [[2's complement]] signed and unsigned 8/16/32/64 bits
| {{ubli
| [[Single-precision floating-point format|Singles]]: [[big-endian]] [[binary32]]
| [[Double-precision floating-point format|Doubles]]: [[big-endian]] [[binary64]]
}}
| [[UTF-8]]-encoded, null-terminated, preceded by int8 or int32 string length in bytes
| Typecode (1 byte) + 1–4 bytes size + 1–4 bytes items count + list items
| Typecode (1 byte) + 1–4 bytes size + 1–4 bytes items count + key/value pairs
|- style="vertical-align:top;"
| [[BSON]]
Line 971 ⟶ 949:
| Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead.
| Length prefixed set of items.
| {{No|Not in protocol.}}
|- style="vertical-align:top;"
| [[FlatBuffers]]
Line 1,042 ⟶ 1,020:
|- style="vertical-align:top;"
| [[Netstring]]s{{efn |group=binary |Interpretation of Netstrings is entirely application- or schema-dependent.}}
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
| Length-encoded as an ASCII string + ':' + data + ','<br>
Length counts only octets between ':' and ','
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
|- style="vertical-align:top;"
| [[OGDL]] Binary
Line 1,119 ⟶ 1,097:
|
|}
 
{{sticky table end}}
{{notelist|group=binary}}
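
As an illustration of the [[Netstring]]s row above, the length-prefixed framing it describes (ASCII decimal length, <code>:</code>, data, <code>,</code>) can be sketched in Python as follows. This is a minimal sketch rather than a reference implementation: the function names <code>encode_netstring</code> and <code>decode_netstring</code> are hypothetical, and stricter checks (such as rejecting leading zeros in the length) are omitted.

<syntaxhighlight lang="python">
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Frame payload as <ASCII decimal length>:<payload>, (hypothetical helper)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(data: bytes) -> Tuple[bytes, bytes]:
    """Return (payload, remaining bytes) for the first netstring in data."""
    length_text, colon, rest = data.partition(b":")
    if not colon or not length_text.isdigit():
        raise ValueError("missing or malformed length prefix")
    # The length counts only the octets between ':' and ','.
    length = int(length_text)
    if rest[length:length + 1] != b",":
        raise ValueError("payload is truncated or missing the trailing comma")
    return rest[:length], rest[length + 1:]


frame = encode_netstring(b"hello world!")   # b"12:hello world!,"
payload, remainder = decode_netstring(frame)
</syntaxhighlight>

Because the length is declared up front, a reader can take exactly that many octets without scanning the data for a delimiter; what those octets mean is left to the application.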
 
==See also==
*[[Comparison of document markup languages]]
 
==References==