Comparison of data-serialization formats: Difference between revisions

Google Protocol Buffers has not been standardized by any recognized standards body (e.g., W3C, ISO, IETF, etc.). It is owned and controlled by Google.
m resize table font
 
(33 intermediate revisions by 18 users not shown)
Line 1:
{{Short description|None}}
This is a '''comparison of [[data serialization]] formats''', various ways to convert complex [[object (computer science)|object]]s to sequences of [[bit]]s. It does not include [[markup language]]s used exclusively as [[document file format]]s.
 
==Overview==
 
{{sort-under}}
{{sticky table start}}
{| class="wikitable sortable sort-under sticky-table-head" style="font-size:75%"
|-
! Name
Line 16 ⟶ 19:
! Standard [[API]]s
! Supports [[zero-copy]] operations
|-
| [[Apache Arrow]]
| [[Apache Software Foundation]]
| {{n/a}}
| {{partial|''De facto''}}
| [https://arrow.apache.org/docs/format/Columnar.html Arrow Columnar Format]
| {{yes}}
| {{no}}
| {{yes}}
| {{yes|Built-in}}
| C, C++, C#, Go, Java, JavaScript, Julia, MATLAB, Python, R, Ruby, Rust, Swift
| {{yes}}
|-
| [[Apache Avro]]
Line 40 ⟶ 55:
| Java, Python, C++
| {{no}}
|-
| [[Apache Thrift]]
| [[Facebook]] (creator)<br>[[Apache Software Foundation|Apache]] (maintainer)
| {{n/a}}
| {{no}}
| [http://thrift.apache.org/static/files/thrift-20070401.pdf Original whitepaper]
| {{yes}}
| {{partial}}{{ref|thrifttxt|c}}
| {{no}}
| {{yes|Built-in}}
| C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages<ref>[https://thrift.apache.org/ Apache Thrift]</ref>
| {{n/a}}
|-
| [[ASN.1]]
Line 91 ⟶ 118:
| [[CBOR]]
| Carsten Bormann, [[Paul Hoffman (engineer)|P. Hoffman]]
| [[MessagePack]]<ref>{{cite web|url=https://github.com/msgpack/msgpack/issues/258#issuecomment-449978394|title=CBOR relationship with msgpack|first1=Carsten|last1=Bormann|website=[[GitHub]] |date=2018-12-26|access-date=2023-08-14}}</ref>
| {{yes}}
| RFC 8949
Line 98 ⟶ 125:
| {{yes}}, <br/>through tagging
| {{yes|[https://tools.ietf.org/html/rfc8610 CDDL]}}
| {{yes|[[FIDO_Alliance|FIDO2]]}}
| {{no}}
| {{no}}
|-
Line 104 ⟶ 131:
| RFC author:<br>Yakov Shafranovich
| {{n/a}}
| {{partial|Myriad informal variants}}
| RFC 4180<br>(among others)
| {{no}}
Line 148 ⟶ 175:
| {{Yes|[[Document Object Model|DOM]], [[Simple API for XML|SAX]], [[StAX]], [[XQuery]], [[XPath]]}}
| {{n/a}}
|-
| [[Extensible Data Notation]] (edn)
| [[Rich Hickey]] / Clojure community
| [[Clojure]]
| {{yes}}
| [https://github.com/edn-format/edn Official edn spec]
| {{no}}
| {{yes}}
| {{no}}
| {{no}}
| Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python<ref>{{cite web|url=https://github.com/edn-format/edn/wiki/Implementations|title=Implementations|website=[[GitHub]] }}</ref>
| {{no}}
|-
| [[FlatBuffers]]
Line 217 ⟶ 256:
| {{yes}}
| {{yes|[https://tools.ietf.org/html/rfc6901 JSON Pointer (RFC{{nbsp}}6901)], or alternately, [http://goessner.net/articles/JsonPath/ JSONPath], [https://web.archive.org/web/20120922110739/http://bluelinecity.com/software/jpath/ JPath], [https://web.archive.org/web/20121203081945/http://www.jspon.org/ JSPON], [https://github.com/lloyd/JSONSelect json:select()]; and [[JSON-LD]]}}
| {{partial}}<br>([http://json-schema.org/ JSON Schema Proposal], [[ASN.1]] with [[JSON encoding rules|JER]], [http://www.kuwata-lab.com/kwalify/ Kwalify] {{Webarchive|url=https://web.archive.org/web/20210812231831/http://www.kuwata-lab.com/kwalify/ |date=2021-08-12 }}, [http://rjbs.manxome.org/rx/ Rx]), [[JSON-LD]]
| {{partial}}<br>([https://github.com/dscape/clarinet Clarinet], [https://www.sitepen.com/blog/jsonquery-data-querying-beyond-jsonpath JSONQuery] / [https://www.sitepen.com/blog/resource-query-language-a-query-language-for-the-web-nosql RQL], [http://goessner.net/articles/JsonPath/ JSONPath]), [[JSON-LD]]
| {{no}}
Line 304 ⟶ 343:
| {{yes}}
| {{no}}
|-
| Preserves
| Tony Garnock-Jones
| -
| {{yes}}
| [https://preserves.dev/preserves.html Specification]
| {{yes|[https://preserves.dev/preserves-binary.html Yes]}}
| {{yes|[https://preserves.dev/preserves-text.html Yes]}}
| {{yes}}
| {{yes|[https://preserves.dev/preserves-schema.html Yes]}}
| {{n/a}}
| {{n/a}}
|-
| [[Property list]]
Line 345 ⟶ 372:
| [[Lisp (programming language)|Lisp]], [[Netstring]]s
| {{partial|Largely ''de facto''}}
| [http://people.csail.mit.edu/rivest/Sexp.txt "S-Expressions"] {{Webarchive|url=https://web.archive.org/web/20131007024815/http://people.csail.mit.edu/rivest/Sexp.txt |date=2013-10-07 }} [[Internet Draft]]
| {{yes}}, ''canonical representation''
| {{yes}}, ''advanced transport representation''
Line 387 ⟶ 414:
| {{no}}
|
| {{n/a}}
|-
| [[Apache Thrift]]
| [[Facebook]] (creator)<br>[[Apache Software Foundation|Apache]] (maintainer)
| {{n/a}}
| {{no}}
| [http://thrift.apache.org/static/files/thrift-20070401.pdf Original whitepaper]
| {{yes}}
| {{partial}}{{ref|thrifttxt|c}}
| {{no}}
| {{yes|Built-in}}
| C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages<ref>[https://thrift.apache.org/ Apache Thrift]</ref>
| {{n/a}}
|-
Line 457 ⟶ 472:
| {{yes}}
| {{yes}}
| {{partial}}<br>([http://www.kuwata-lab.com/kwalify/ Kwalify] {{Webarchive|url=https://web.archive.org/web/20210812231831/http://www.kuwata-lab.com/kwalify/ |date=2021-08-12 }}, [http://rjbs.manxome.org/rx/ Rx], built-in language type-defs)
| {{no}}
| {{no}}
Line 473 ⟶ 488:
! Supports [[zero-copy]] operations
|}
{{sticky table end}}
 
{{ordered list
| list-style-type=lower-alpha
Line 486 ⟶ 503:
 
==Syntax comparison of human-readable formats==
 
{{sticky table start}}
{| class="wikitable sortable sort-under sticky-table-head" style="font-size:75%"
|-
! Format
Line 544 ⟶ 563:
A to Z,1,2,3</pre>
|-
! Format
! [[Nullable type|Null]]
! [[Boolean data type|Boolean]] true
! [[Boolean data type|Boolean]] false
! [[Integer (computer science)|Integer]]
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array data type|Array]]
! [[Associative array]]/[[Object (computer science)|Object]]
|-
| [[Extensible Data Notation|edn]]
| <code>nil</code>
| <code>true</code>
| <code>false</code>
| <code>685230</code><br><code>-685230</code>
| <code>6.8523015e+5</code>
| <code>"A to Z"</code>, <code>"A \"up to\" Z"</code>
| <code>[true nil -42.1e7 "A to Z"]</code>
| <code>{:kw 1, "42" true, "A to Z" [1 2 3]}</code>
|-
| [[Ion (Serialization format)|Ion]]
Line 627 ⟶ 646:
true
"A to Z", (1, 2, 3)</pre>
|-
! Format
! [[Nullable type|Null]]
! [[Boolean data type|Boolean]] true
! [[Boolean data type|Boolean]] false
! [[Integer (computer science)|Integer]]
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array data type|Array]]
! [[Associative array]]/[[Object (computer science)|Object]]
|-
| [[OpenDDL]]
Line 683 ⟶ 692:
| <code>(lI01\na(laF-421000000.0\naS'A to Z'\na.</code>
| <code>(dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas.</code>
|-
| Preserves
| <code><null></code>
| <code>#t</code>
| <code>#f</code>
| <code>685230</code>
| <code>685230.15f</code>
| <code>"A to Z"</code>
| <code>[#t <null> -421000000.0f "A to Z"]</code>
| <code>{42: #f "A to Z": [1 2 3]}</code>
|-
| [[Property list]]<br>(plain text format)<ref name="gnustep">{{cite web|url=http://www.gnustep.org/resources/documentation/Developer/Base/Reference/NSPropertyList.html|title=NSPropertyListSerialization class documentation|website=www.gnustep.org|access-date=2009-10-28|archive-url=https://web.archive.org/web/20110519164921/http://gnustep.org/resources/documentation/Developer/Base/Reference/NSPropertyList.html|archive-date=2011-05-19|url-status=dead}}</ref>
Line 763 ⟶ 762:
[extensionFieldThatIsAnEnum]: EnumValue
</syntaxhighlight>
|-
! Format
! [[Nullable type|Null]]
! [[Boolean data type|Boolean]] true
! [[Boolean data type|Boolean]] false
! [[Integer (computer science)|Integer]]
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array data type|Array]]
! [[Associative array]]/[[Object (computer science)|Object]]
|-
| [[S-expression]]s
Line 856 ⟶ 845:
</struct></syntaxhighlight>
|}
{{sticky table end}}
{{ordered list
| list-style-type=lower-alpha
Line 865 ⟶ 855:
| {{note|lispstd}}This syntax is not compatible with the Internet-Draft, but is used by some dialects of [[Lisp (programming language)|Lisp]].
}}
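
The associative-array cell that reads <code>(dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas.</code> appears to be Python pickle protocol 0 text. As a minimal sketch, assuming each <code>\n</code> in the table stands for a literal newline byte, the standard-library <code>pickle</code> module can decode it:

```python
import pickle

# Protocol 0 pickle text copied from the table's associative-array column.
# Assumption: every "\n" shown in the table is a real newline byte.
data = b"(dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas."

# pickle.loads can run arbitrary code, so only unpickle bytes you trust.
value = pickle.loads(data)

# "I01" is how protocol 0 spells True; the nested list is built by
# the "(l" opcode pair followed by one append ("a") per element.
print(value)
```

The decoded object is an ordinary <code>dict</code> with one integer key and one string key, which matches the JSON-style example <code>{"42": true, "A to Z": [1, 2, 3]}</code> used elsewhere in the table, except that pickle keeps the key 42 as an integer.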
 
 
==Comparison of binary formats==
<!--This table is meant to describe how the various datatypes are encoded in binary in the various formats.-->
 
{{sticky table start}}
{| class="wikitable sortable sort-under sticky-table-head sticky-table-col1" style="font-size:75%"
|- style="vertical-align:bottom;"
! Format
Line 876 ⟶ 869:
! [[Floating-point]]
! [[String (computer science)|String]]
! [[Array (data type)|Array]]
! [[Associative array]]/[[object (computer science)|object]]
|- style="vertical-align:top;"
Line 900 ⟶ 893:
| Data specifications {{mono|SET OF}} (unordered) and {{mono|SEQUENCE OF}} (guaranteed order)
| User definable type
|- style="vertical-align:top;"
| Binn
| <code>\x00</code>
| {{ubli
| True: <code>\x01</code>
| False: <code>\x02</code>
}}
| [[Big-endian]] [[2's complement]] signed and unsigned 8/16/32/64 bits
| {{ubli
| [[Single-precision floating-point format|Singles]]: [[big-endian]] [[binary32]]
| [[Double-precision floating-point format|Doubles]]: [[big-endian]] [[binary64]]
}}
| [[UTF-8]]-encoded, null-terminated, preceded by int8 or int32 string length in bytes
| Typecode (1 byte) + 1–4 bytes size + 1–4 bytes items count + list items
| Typecode (1 byte) + 1–4 bytes size + 1–4 bytes items count + key/value pairs
|- style="vertical-align:top;"
| [[BSON]]
Line 971 ⟶ 949:
| Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead.
| Length prefixed set of items.
| {{No|Not in protocol.}}
|- style="vertical-align:top;"
| [[FlatBuffers]]
Line 1,042 ⟶ 1,020:
|- style="vertical-align:top;"
| [[Netstring]]s{{efn |group=binary |Interpretation of Netstrings is entirely application- or schema-dependent.}}
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
| Length-encoded as an ASCII string + ':' + data + ','<br>
Length counts only octets between ':' and ','
| {{No|Not in protocol.}}
| {{No|Not in protocol.}}
|- style="vertical-align:top;"
| [[OGDL]] Binary
Line 1,119 ⟶ 1,097:
|
|}
 
{{sticky table end}}
{{notelist|group=binary}}
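
The Netstrings row describes the framing as an ASCII decimal length, a colon, the data, and a trailing comma, with the length counting only the octets between the colon and the comma. The following is a minimal Python sketch of that rule. The function names <code>encode_netstring</code> and <code>decode_netstring</code> are hypothetical, and the sketch does not enforce stricter rules (such as rejecting leading zeros in the length) that a full implementation may require:

```python
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Frame payload as <ASCII length> ':' <payload> ','."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buf: bytes) -> Tuple[bytes, bytes]:
    """Decode one netstring from the front of buf.

    Returns (payload, remaining bytes after the trailing comma).
    """
    head, sep, tail = buf.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(head)
    # The length covers only the octets between ':' and ','.
    if len(tail) < length + 1 or tail[length:length + 1] != b",":
        raise ValueError("payload truncated or missing trailing comma")
    return tail[:length], tail[length + 1:]


framed = encode_netstring(b"hello world!")
payload, rest = decode_netstring(framed + b"3:abc,")
print(framed, payload, rest)
```

Because the length is read before the payload, the payload itself may contain colons or commas without any escaping. As the table's footnote says, how the decoded bytes are interpreted is left entirely to the application.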
 
==See also==
*[[Comparison of document markup languages]]
 
==References==