Brianolsen2 (talk | contribs) No edit summary |
m Bot: link syntax and minor changes |
Line 16:
}}
'''Trino''' is an [[Open-source software|open-source]] distributed [[SQL]] query engine designed to query large data sets distributed over one or more heterogeneous data sources.<ref>{{cite web |title=Overview — Trino 361 Documentation |url=https://trino.io/docs/361/overview.html |website=trino.io |access-date=20 September 2021}}</ref> Trino is commonly used as a query engine over [[Data lake|data lakes]] and [[Data Warehouse|data warehouses]] using the [[Apache Hive|Hive]] and [[List of Apache Software Foundation projects#Active projects|Iceberg]]<ref name="iceberg">{{cite web |title=About - Apache Iceberg |url=http://iceberg.apache.org/ |website=iceberg.apache.org |access-date=18 September 2021}}</ref> table formats. In these configurations, Trino can query data in [[Free and open source software|open]] [[Column-oriented DBMS|column-oriented]] data file formats like [[Apache ORC|ORC]] or [[Apache Parquet|Parquet]] residing on different storage systems like [[Apache Hadoop#Hadoop distributed file system|HDFS]], [[Amazon S3|AWS S3]], [[
== History ==
Trino was originally designed and developed by Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang at [[Facebook]] to allow data analysts to run interactive queries on its large [[data warehouse]] in [[Apache Hadoop]]. The project was originally named [[Presto (SQL query engine)|Presto]] and shares the first six years of development with the Presto project.<ref>{{cite web |title=Contributors to trinodb/trino |url=https://github.com/trinodb/trino/graphs/contributors?from=2012-08-05&to=2018-08-05&type=c |website=GitHub |access-date=20 September 2021 |language=en}}</ref><ref>{{cite web |title=Contributors to prestodb/presto |url=https://github.com/prestodb/presto/graphs/contributors?from=2012-08-05&to=2018-08-05&type=c |website=GitHub |access-date=20 September 2021 |language=en}}</ref> Before Presto, data analysts at Facebook relied on [[Apache Hive]], which was too slow for running interactive SQL analytics on the company's 250-petabyte data warehouse.<ref name="2013facebook">{{Cite news|url=http://www.computerworld.com/article/2485668/business-intelligence/facebook-goes-open-source-with-query-engine-for-big-data.html|title=Facebook goes open source with query engine for big data|author=Joab Jackson|date=November 6, 2013|work=Computer World|access-date=April 26, 2017}}</ref>
Traverso, Sundstrom, Phillips, and Hwang began development in 2012 and deployed an initial version later that year. Facebook announced its release as open source in late fall 2013.<ref name="2013facebook" /><ref name="2013facebook2">{{Cite news|url=https://gigaom.com/2013/06/06/facebook-unveils-presto-engine-for-querying-250-pb-data-warehouse/|title=Facebook unveils Presto engine for querying 250 PB data warehouse|author=Jordan Novet|date=June 6, 2013|work=Giga Om|access-date=April 26, 2017}}</ref> As Presto gained popularity, many well-known companies, such as [[Netflix]]
In late 2018, a disagreement over the stewardship of Presto arose between the founders and Facebook as Facebook management pushed for tighter control over the project. This push included giving automatic committer rights to Facebook developers who had no prior experience with the project. Shortly after Facebook management moved forward with these changes, the creators left the original Presto project to create a fork.<ref name="2020rename">{{cite web |last1=Traverso |first1=Martin |last2=Sundstrom |first2=Dain |last3=Phillips |first3=David |title=We’re rebranding PrestoSQL as Trino |url=https://trino.io/blog/2020/12/27/announcing-trino.html |website=trino.io |access-date=7 September 2021 |language=en |date=27 December 2020}}</ref> This fork was also initially named Presto, so to differentiate the two, users called the original project PrestoDB and the fork PrestoSQL, after their respective web addresses, https://prestodb.io and [https://trino.io https://prestosql.io]. The split resembles the [[Jenkins (software)#History|Jenkins and Hudson split]].
Line 29:
In September 2019, Facebook donated PrestoDB to the [[Linux Foundation]], establishing the Presto Foundation.<ref>{{Cite web|url=https://www.linuxfoundation.org/press-release/2019/09/facebook-uber-twitter-and-alibaba-form-presto-foundation-to-tackle-distributed-data-processing-at-scale/|title=Facebook, Uber, Twitter and Alibaba form Presto Foundation to Tackle Distributed Data Processing at Scale|access-date=2019-11-12}}</ref> Neither the creators of Presto nor the top contributors and committers were invited to join this foundation.<ref>{{Cite news|url=https://github.com/trinodb/trino/issues/380#issuecomment-557691046|title=What's the relationship between prestosql and prestodb?|date=2019-11-22}}</ref><ref name="2020rename"/>
In December 2020, PrestoSQL was rebranded as Trino.
== Architecture ==
Line 45:
Trino supports separation of compute and storage and may be deployed both on premises and in the [[Cloud computing|cloud]].
Trino has a
== Use Cases ==
Line 53:
=== Data Lake Query Engine ===
Trino was originally created to replace the [[Apache Hive]] runtime while maintaining the ability to query data in [[Apache Hadoop#Hadoop distributed file system|HDFS]] or [[
=== Federated Query Engine ===