Multi-model database: Difference between revisions

The [[relational model|relational]] data model became popular after its publication by [[Edgar F. Codd]] in 1970. Due to increasing requirements for [[Scalability#Horizontal and vertical scaling|horizontal scalability]] and [[fault tolerance]], [[NoSQL]] databases became prominent after 2009. NoSQL databases use a variety of data models, with [[Document-oriented database|document]], [[Graph database|graph]], and key–value models being popular.<ref name="rise">[http://www.infoworld.com/article/2861579/database/the-rise-of-the-multimodel-database.html Infoworld, "The Rise of the Multi-Model Database"]</ref>
 
A multi-model database is a database that can store, index and query data in more than one model. For some time, databases have primarily supported only one model, for example a [[relational database]], [[document-oriented database]], [[graph database]] or [[triplestore]]. A database that combines several of these is multi-model. This should not be confused with multimodal database systems such as [https://pixeltable.com/ Pixeltable] or [https://www.aperturedata.io/ ApertureDB], which focus on unified management of different media types (images, video, audio, text) rather than different data models.
 
For some time,{{vague|date=April 2024}} it was all but forgotten (or considered irrelevant) that there were any other database models besides relational.{{citation needed|date=April 2024}} The relational model and the notion of [[third normal form]] were the default standard for all data storage. However, before relational data modeling became dominant (from about 1980 to 2005), the [[hierarchical database model]] was commonly used. Since 2000 or 2010, many non-relational [[NoSQL]] models, including documents, triples, key–value stores and graphs, have become popular. Arguably, [[geospatial data]], [[temporal data]], and [[text data]] are also separate models, though indexed, queryable text data is generally termed a "[[search engine]]" rather than a database.{{Citation needed|date=March 2021}}
 
The term "multi-model" was first associated with databases on May 30, 2012, in Cologne, Germany, during [[Luca Garulli]]'s keynote "''NoSQL Adoption – What’s the Next Step?''".<ref>{{Cite web|date=2012-06-01|title=Multi-Model storage 1/2 one product|url=http://www.slideshare.net/lvca/no-sql-matters2012keynote/47-MultiModel_storage_12_one_product}}</ref><ref>{{Cite web|url=https://2012.nosql-matters.org/cgn/wp-content/uploads/2012/06/KeyNote-Luca-Garulli.pdf|title=Nosql Matters Conference 2012 {{!}} NoSQL Matters CGN 2012|website=2012.nosql-matters.org|access-date=2017-01-12}}</ref> Garulli envisioned the evolution of first-generation NoSQL products into new products with more features, able to serve multiple use cases.
 
The idea of multi-model databases can be traced back to [[object–relational database|Object–Relational Data Management Systems (ORDBMS)]] in the early 1990s and, in a broader scope, even to [[federated DBMS|federated]] and [[integrated DBMS]]s in the early 1980s. An ORDBMS manages different types of data such as relational, object, text and spatial by plugging domain-specific data types, functions and index implementations into the DBMS kernels. A multi-model database is most directly a response to the "[[polyglot persistence]]" approach of knitting together multiple database products, each handling a different model, to achieve a multi-model capability as described by [[Martin Fowler (software engineer)|Martin Fowler]].<ref name="polyglot">[http://martinfowler.com/bliki/PolyglotPersistence.html Polyglot Persistence]</ref> This strategy has two major disadvantages: it leads to a significant increase in operational complexity, and there is no support for maintaining data consistency across the separate data stores. Multi-model databases have begun to fill this gap.
 
Multi-model databases are intended to offer the data modeling advantages of polyglot persistence,<ref name="polyglot"/> without its disadvantages. Operational complexity, in particular, is reduced through the use of a single data store.<ref name="rise"/>
 
== Databases ==
{{Main article|Comparison of multi-model databases}}
Multi-model databases include (in alphabetical order):
 
<!--Added databases should be "notable" with a sourced article on English Wikipedia.-->
* [[AllegroGraph]] – document (JSON, JSON-LD), graph
* [[ArangoDB]] – document (JSON), graph, key–value
* [[ArcadeDB]] – document (JSON), graph, key–value, time-series, [[SQL]], [[Cypher (query language)|Cypher query language]], [[Gremlin (query language)]]
* [[Cosmos DB]] – document (JSON), graph,<ref>{{Cite web|url=https://docs.microsoft.com/en-us/azure/cosmos-db/create-graph-dotnet|title = Build an Azure Cosmos DB .NET Framework, Core application using the Gremlin API}}</ref> key–value, SQL
* [[Couchbase]] – document (JSON), key–value, [[N1QL]]
* [[Datastax]] – key–value, tabular, graph
* [[EnterpriseDB]] – document (XML and JSON), key–value
* [[Informix]] – relational, objects, document (JSON and XML), binary, time-series
* [[MarkLogic]] – document (XML and JSON), graph triplestore, binary, SQL
* [[Microsoft Azure SQL Database]] – relational, document (JSON), graph, XML
* [[Oracle Database]] – relational, document (JSON and XML), graph triplestore, property graph, key–value, objects
* [[OrientDB]] – document (JSON), graph, key–value, reactive, SQL
* [[PostgreSQL]] – relational, document (JSON and XML), key–value, graph, arrays, objects
* [[Redis]] – key–value, document (JSON), property graph, streaming, time-series
* [[SAP HANA]] – relational, document (JSON), graph, streaming
* [[Virtuoso Universal Server]] – [[Relational database|relational]], document ([[XML]]), [[Triplestore|RDF graphs]]
 
== Benchmarking multi-model databases ==
As more platforms are proposed to handle multi-model data, several works have addressed benchmarking multi-model databases. For instance, [[Ewa Pluciennik|Pluciennik]],<ref>{{Cite journal|last=Ewa Pluciennik and Kamil Zgorzalek|title=The Multi-model Databases - A Review|journal=Bdas 2017|pages=141–152}}</ref> [[Fábio Roberto Oliveira|Oliveira]],<ref>{{Cite journal|last=Fábio Roberto Oliveira, Luis del Val Cura|title=Performance Evaluation of NoSQL Multi-Model Data Stores in Polyglot Persistence Applications|journal=Ideas '16|pages=230–235}}</ref> and [[UniBench]]<ref>{{Cite journal|last=Chao Zhang, Jiaheng Lu, Pengfei Xu, Yuxing Chen|title=UniBench: A Benchmark for Multi-Model Database Management Systems|url=https://www.cs.helsinki.fi/u/jilu/documents/UniBench.pdf|journal=TPCTC 2018}}</ref> reviewed existing multi-model databases and evaluated them against other SQL and NoSQL databases. They pointed out that the advantages of multi-model databases over single-model databases are as follows: {{olist|list-style-type=lower-roman| they can ingest a variety of data formats, such as [[Comma-separated values|CSV]] (including graph and relational data) and [[JSON]], into storage without additional effort. | they can employ a unified query language such as [[AQL (ArangoDB Query Language)|AQL]], [[Orient SQL]], [[SQL/XML]], [[SQL/JSON]] to retrieve correlated multi-model data, such as graph-JSON-key/value, XML-relational, and JSON-relational in a single platform. | they can support multi-model [[ACID]] transactions in stand-alone mode.}}
 
== Architecture ==
== User-defined data models ==
In addition to offering multiple data models in a single data store, some databases allow developers to easily define custom data models. This capability is enabled by ACID transactions with high performance and scalability. In order for a custom data model to support concurrent updates, the database must be able to synchronize updates across multiple keys. ACID transactions, if they are sufficiently performant, allow such synchronization.<ref name="multiple">[http://www.odbms.org/wp-content/uploads/2014/04/Multiple-Data-Models.pdf ODBMS, "Polyglot Persistence or Multiple Data Models?"]</ref> JSON documents, graphs, and relational tables can all be implemented in a manner that inherits the horizontal scalability and fault-tolerance of the underlying data store.
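The role of multi-key transactions can be illustrated with a minimal Python sketch. It is not taken from any particular product: the <code>TransactionalKV</code> and <code>GraphModel</code> classes are hypothetical names, and a single global lock stands in for a real transaction manager. The sketch shows a graph model layered on a key–value store, where adding an edge must update two keys (the outgoing and incoming adjacency sets) as one atomic unit.

```python
import threading
from contextlib import contextmanager


class TransactionalKV:
    """Toy in-memory key-value store with all-or-nothing multi-key writes.

    Hypothetical illustration only: one global lock serializes transactions,
    which is simple but would not scale like a real distributed store.
    """

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key, default=None):
        # Do not call this from inside a transaction block: the lock is not
        # reentrant. Use the staged dict handed out by transaction() instead.
        with self._lock:
            return self._data.get(key, default)

    @contextmanager
    def transaction(self):
        with self._lock:
            staged = dict(self._data)  # private working copy
            yield staged               # caller reads and writes the copy
            self._data = staged        # reached only if no exception was raised


class GraphModel:
    """Custom graph data model stored as adjacency sets in the key-value store."""

    def __init__(self, kv):
        self.kv = kv

    def add_edge(self, src, dst):
        # Both adjacency keys change together or not at all.
        # Values are immutable frozensets, so the shallow copy above is safe.
        with self.kv.transaction() as tx:
            tx[("out", src)] = tx.get(("out", src), frozenset()) | {dst}
            tx[("in", dst)] = tx.get(("in", dst), frozenset()) | {src}

    def neighbors(self, src):
        return self.kv.get(("out", src), frozenset())

    def predecessors(self, dst):
        return self.kv.get(("in", dst), frozenset())
```

If an exception is raised inside the <code>with</code> block, the staged copy is discarded and neither key changes, so readers never observe an edge recorded in only one direction. A production system would replace the global lock with a more performant concurrency-control scheme, which is the "sufficiently performant" condition mentioned above.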
 
== Theoretical foundation for multi-model databases ==
 
The traditional theory of relations is not enough to accurately describe multi-model database systems. Recent research<ref name="CT">[https://www.vldb.org/pvldb/vol14/p2663-uotila.pdf MultiCategory: Multi-model Query Processing Meets Category Theory and Functional Programming]</ref> focuses on developing a new theoretical foundation for these systems. [[Category theory]] can provide a unified, rigorous language for modeling, integrating, and transforming different data models. By representing multi-model data as sets and their relationships as functions or relations within the category of sets (Set), one can build a formal framework to describe, manipulate, and understand various data models and how they interact.
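The idea of treating data as sets and relationships as composable functions can be sketched in a few lines of Python. This is an illustrative toy, not the API of the cited system: the <code>Morphism</code> class and the example data (orders, customers, cities) are assumptions made for the sketch. Each object is a finite set, each morphism is a total function between two sets, and composition chains relationships that may originate in different data models.

```python
class Morphism:
    """A total function between two finite sets, stored as a lookup table.

    Hypothetical helper for illustration; names are not from any library.
    """

    def __init__(self, source, target, table):
        self.source = frozenset(source)
        self.target = frozenset(target)
        self.table = dict(table)
        # A morphism in Set must be defined on every source element ...
        if set(self.table) != self.source:
            raise ValueError("table must be defined exactly on the source set")
        # ... and every image must lie in the target set.
        if not set(self.table.values()) <= self.target:
            raise ValueError("table maps outside the target set")

    def __call__(self, x):
        return self.table[x]

    def then(self, other):
        """Compose: apply self first, then other."""
        if self.target != other.source:
            raise ValueError("target of first must equal source of second")
        return Morphism(self.source, other.target,
                        {x: other(self(x)) for x in self.source})


def identity(elements):
    """Identity morphism on a set."""
    return Morphism(elements, elements, {x: x for x in elements})


# Assumed example data: order ids from a relational table, customer keys
# from a document collection, and city nodes from a graph.
orders = {"o1", "o2", "o3"}
customers = {"c1", "c2"}
cities = {"Helsinki", "Cologne"}

placed_by = Morphism(orders, customers, {"o1": "c1", "o2": "c1", "o3": "c2"})
lives_in = Morphism(customers, cities, {"c1": "Helsinki", "c2": "Cologne"})

# A cross-model relationship obtained purely by composition.
ship_to = placed_by.then(lives_in)
```

Here <code>ship_to</code> maps each order directly to a city even though no single source stores that relationship. Because composition is checked against source and target sets, an ill-typed query across models is rejected when it is built rather than when it is run.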
 
 
 
== See also ==
* [http://www.odbms.org/wp-content/uploads/2014/04/Multiple-Data-Models.pdf ODBMS, "Polyglot Persistence or Multiple Data Models?"]
* [http://www.infoworld.com/article/2861579/database/the-rise-of-the-multimodel-database.html Infoworld, "The Rise of the Multi-Model Database"]
 
{{Databases}}
 
{{DEFAULTSORT:Multi-model Database}}
[[Category:Data analysis]]
[[Category:Big data]]
[[Category:Database management systems]]