Graph Query Language: Difference between revisions

== International Standard project ==
In September 2019 a proposal for a GQL standard project was approved by a vote of national standards bodies which are members of ISO/IEC Joint Technical Committee 1 (responsible for information technology standards).<ref name="39075 GQL">{{cite web|url=https://www.iso.org/standard/76120.html|title=ISO/IEC WD 39075 Information Technology — Database Languages — GQL|last=|first=|date=|website=|publisher=ISO|accessdate=September 29, 2019}}</ref> GQL is intended to be a declarative query language, like SQL.
 
The GQL project proposal states:
''Using graph as a fundamental representation for data modeling is an emerging approach in data management. In this approach, the data set is modeled as a graph, representing each data entity as a vertex (also called a node) of the graph and each relationship between two entities as an edge between corresponding vertices. The graph data model has been drawing attention for its unique advantages. Firstly, the graph model can be a natural fit for data sets that have hierarchical, complex, or even arbitrary structures. Such structures can be easily encoded into the graph model as edges. This can be more convenient than the relational model, which requires the normalization of the data set into a set of tables with fixed row types. Secondly, the graph model enables efficient execution of expensive queries or data analytic functions that need to observe multi-hop relationships among data entities, such as reachability queries, shortest or cheapest path queries, or centrality analysis. There are two graph models in current use: the Resource Description Framework (RDF) model and the Property Graph model. The RDF model has been standardized by W3C in a number of specifications. The Property Graph model, on the other hand, has a multitude of implementations in graph databases, graph algorithms, and graph processing facilities. However, a common, standardized query language for property graphs (like SQL for relational database systems) is missing. GQL is proposed to fill this void.''<ref name="BSI 39075 GQL">{{cite web|url=https://standardsdevelopment.bsigroup.com/projects/9019-02970|title=ISO/IEC JTC 1/SC 32 N 3007 - ISO/IEC NP 39075 Information Technology -- Database Languages -- GQL|last=|first=|date=|website=|publisher=British Standards Institute|accessdate=September 29, 2019}}</ref>.
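
The property graph model and the multi-hop queries mentioned in the proposal can be illustrated with a minimal Python sketch. It is not GQL and is not taken from the proposal: the node identifiers, labels, properties and the helper names <code>neighbours</code> and <code>hops</code> are all hypothetical, chosen only to show entities as nodes, relationships as labelled edges, and a reachability query answered by breadth-first search.

```python
from collections import deque

# Hypothetical toy property graph: each node has a label and a property.
nodes = {
    "a": {"label": "Person", "name": "Ann"},
    "b": {"label": "Person", "name": "Bob"},
    "c": {"label": "Person", "name": "Cy"},
    "d": {"label": "City", "name": "Oslo"},
}

# Each edge is (source node, edge label, target node).
edges = [
    ("a", "KNOWS", "b"),
    ("b", "KNOWS", "c"),
    ("c", "LIVES_IN", "d"),
]


def neighbours(node_id, edge_label=None):
    """Yield targets of outgoing edges, optionally filtered by edge label."""
    for source, label, target in edges:
        if source == node_id and (edge_label is None or label == edge_label):
            yield target


def hops(start, goal, edge_label=None):
    """Breadth-first search: fewest hops from start to goal, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node_id, distance = queue.popleft()
        if node_id == goal:
            return distance
        for nxt in neighbours(node_id, edge_label):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None


# Reachability over KNOWS edges only, then over edges of any label.
knows_distance = hops("a", "c", "KNOWS")
any_distance = hops("a", "d")
```

In a relational schema the same question would typically need a self-join per hop or a recursive query. A declarative graph query language lets the user state the path pattern and leaves the traversal strategy to the engine.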
 
==Managed alongside SQL by ISO/IEC JTC 1/SC 32 WG3==
The GQL project has a four-year timespan. Seven national standards bodies (those of the United States, China, Korea, the Netherlands, the United Kingdom, Denmark and Sweden) have nominated national subject-matter experts to work on the project, which is conducted by Working Group 3 (Database Languages) of ISO/IEC Joint Technical Committee 1 (Information Technology) Subcommittee 32 (Data Management and Interchange), usually abbreviated as ISO/IEC JTC 1/SC 32 WG3, or just "WG3" for short. WG3 (and direct predecessor committees within JTC 1) has been responsible for the SQL standard since 1987.<ref name="SC32 and WG3 history">{{cite web|url=https://jtc1info.org/sd_2-history_of_jtc1/jtc1-subcommittees/sc-32/|title=JTC 1/SC 32 Data Management and Interchange|last=|first=|date=|website=|publisher=ISO/IEC JTC1|accessdate=October 6, 2019}}</ref>
 
==Extending existing graph query languages==
The GQL project draws on multiple sources or inputs, notably existing industrial languages and a new section of the SQL standard. In preparatory discussions within WG3, surveys of the history and comparative content of some of these inputs were presented.<ref name="GQLs history">{{cite web|url=https://s3.amazonaws.com/artifacts.opencypher.org/website/materials/DM32.2/DM32.2-2018-00085R1-recent_history_of_property_graph_query_languages.pdf|title="An overview of the recent history of Graph Query Languages", Tobias Lindaaker|last=|first=|date=|website=|publisher=opencypher.org|accessdate=October 6, 2019}}</ref> GQL is intended to draw inspiration from declarative languages which play a similar role to SQL in the building of a database application, but other graph query languages have been defined which offer direct procedural features such as branching and looping[REF TINKERPOP, GREMLIN], [REF GQSL], and the ability to traverse a graph iteratively[REF TINKERPOP, GREMLIN][REF PETER WOOD][REF MARCELO].
# '''SQL/PGQ Property Graph Queries''' Prior work by WG3 and SC32 mirror bodies, particularly INCITS DM32, has helped to define a new planned part of the SQL Standard, which allows a read-only graph query to be called inside a SQL SELECT statement, matching a graph pattern using syntax which is very close to Cypher, PGQL and G-CORE, and returning a table of data values as the result. SQL/PGQ also contains DDL to allow SQL tables to be mapped to a graph view schema object with nodes and edges associated with sets of labels and sets of data properties.<ref name="SQL Part 16 PGQ">{{cite web|url=https://www.iso.org/standard/79473.html?browse=tc|title=ISO/IEC WD 9075-16 Information technology — Database languages SQL — Part 16: SQL Property Graph Queries (SQL/PGQ)|last=|first=|date=|website=|publisher=ISO|accessdate=October 6, 2019}}</ref><ref name="W3C Berlin SQL and GQL">{{cite web|url=https://www.w3.org/Data/events/data-ws-2019/assets/slides/KeithWHare-2.pdf|title=''SQL and GQL'', Keith Hare et al., W3C Workshop on Web Standardization for Graph Data. Creating Bridges: RDF, Property Graph and SQL|last=|first=|date=|website=|publisher=W3C|accessdate=October 6, 2019}}</ref><ref name="LDBC SQL/PGQ">{{cite web|url=http://wiki.ldbcouncil.org/download/attachments/106233859/ldbc_tuc_2019_sql-pgq.pdf?version=1&modificationDate=1562342465000&api=v2|title=''Property graph extensions for the SQL standard'', Vasileios Trigonakis (Oracle). LDBC 12th TUC.|last=|first=|date=|website=|publisher=LDBC|accessdate=October 6, 2019}}</ref>
 
# '''Cypher 9''' A language originally designed and implemented by Neo4j Inc., but since 2015 made available as an open source specification, with grammar tooling, a JVM front-end that parses Cypher queries, and a Technology Compatibility Kit that uses Cucumber to define over 2000 test scenarios, for implementation language portability. Cypher is implemented in Neo4j's database, by RedisGraph, by Cambridge Semantics' AnzoGraph, by Bitnine's AgensGraph, by Memgraph, and in the open source projects Cypher for Gremlin and Cypher for Apache Spark (now renamed Morpheus), as well as in research projects such as Cypher.PL and Ingraph.
# '''PGQL''' A language designed and implemented by Oracle Inc., but made available as an open source specification, along with JVM parsing software. PGQL combines familiar SQL SELECT syntax including SQL expressions and result ordering and aggregation with a pattern matching language very similar to that of Cypher. It allows the specification of the graph to be queried, and includes a facility for macros to capture "pattern views". It does not support insertion or updating operations, having been designed primarily for an analytics environment, such as Oracle's PGX product.<ref name="PGQL">{{cite web|url=http://pgql-lang.org/|title=PGQL|last=|first=|date=|website=|publisher=pgql.org|accessdate=October 6, 2019}}</ref>
 
# '''G-CORE'''
 
# '''GSQL'''
 
# '''Cypher 10 extensions implemented in Cypher for Apache Spark (now Morpheus)'''
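
The SQL/PGQ item in the list above describes two ideas: mapping SQL tables to a graph view, and matching a graph pattern whose result is returned as a table. The Python sketch below illustrates both under stated assumptions. It does not use SQL/PGQ, Cypher or PGQL syntax, and the table contents, column names and the functions <code>graph_view</code> and <code>match_one_hop</code> are hypothetical names introduced only for this illustration.

```python
# Hypothetical relational tables, represented as lists of rows.
person_rows = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
]
knows_rows = [
    {"src": 1, "dst": 2, "since": 2015},
    {"src": 2, "dst": 3, "since": 2018},
]


def graph_view(node_table, edge_table, node_label, edge_label):
    """Map two tables to a graph view: rows become labelled nodes and edges."""
    nodes = {
        row["id"]: {"label": node_label, "props": row} for row in node_table
    }
    edges = [
        {"label": edge_label, "src": row["src"], "dst": row["dst"], "props": row}
        for row in edge_table
    ]
    return nodes, edges


def match_one_hop(view, src_label, edge_label, dst_label):
    """Match the pattern (a:src_label)-[e:edge_label]->(b:dst_label).

    Returns a table: one (a.name, b.name, e.since) row per match.
    """
    nodes, edges = view
    result = []
    for edge in edges:
        a, b = nodes[edge["src"]], nodes[edge["dst"]]
        if (
            edge["label"] == edge_label
            and a["label"] == src_label
            and b["label"] == dst_label
        ):
            result.append(
                (a["props"]["name"], b["props"]["name"], edge["props"]["since"])
            )
    return result


view = graph_view(person_rows, knows_rows, "Person", "KNOWS")
table = match_one_hop(view, "Person", "KNOWS", "Person")
```

Because the match returns ordinary rows, the result can be consumed by the surrounding relational query like any other table, which is the composition the SQL/PGQ item describes.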
 
== References ==