Trino (SQL query engine): Difference between revisions

Architecture: remove unsourced section
redirect to Presto (SQL query engine) to avoid WP:CONTENTFORK; please, discuss proposals for Presto/Trino organization at Talk:Presto (SQL query engine)
* #REDIRECT[[Presto (SQL query engine)]]
{{Infobox software
| name = Trino
| logo = Trino-logo-w-bk.svg
| logo caption =
| screenshot = Trino-dashboard.png
| caption = Trino UI Version 358
| author = Martin Traverso, Dain Sundstrom, David Phillips, Eric Hwang
| released = {{Start date and age|10 November 2013}}
| programming language = [[Java (programming language)|Java]]
| operating system = [[Cross-platform]]
| repo = {{URL|https://github.com/trinodb/trino|Trino Repository}}
| standard = [[ANSI]] [[SQL]], [[JDBC]]
| genre = [[Data Warehouse]]
| license = [[Apache License]] 2.0
| website = {{URL|https://trino.io}}
}}
 
'''Trino''' is an [[Open-source software|open-source]] distributed [[SQL]] query engine designed to query large data sets distributed over one or more heterogeneous data sources.<ref>{{cite web |title=Overview — Trino 361 Documentation |url=https://trino.io/docs/361/overview.html |website=trino.io |access-date=20 September 2021}}</ref> Trino is commonly used as a query engine over [[Data lake|data lakes]] and [[Data Warehouse|data warehouses]] using the [[Apache Hive|Hive]] and [[List of Apache Software Foundation projects#Active projects|Iceberg]]<ref name="iceberg">{{cite web |title=About - Apache Iceberg |url=http://iceberg.apache.org/ |website=iceberg.apache.org |access-date=18 September 2021}}</ref> table formats. In these configurations, Trino can query data in [[Free and open source software|open]] [[Column-oriented DBMS|column-oriented]] data file formats such as [[Apache ORC|ORC]] or [[Apache Parquet|Parquet]] residing on storage systems such as [[Apache Hadoop#Hadoop distributed file system|HDFS]], [[Amazon S3|AWS S3]], [[Google Cloud Storage]], or [[Microsoft Azure#Storage services|Azure Blob Storage]]. Trino can also run federated queries across multiple disparate data sources such as [[MySQL]], [[PostgreSQL]], [[Apache Cassandra|Cassandra]], [[Apache Kafka|Kafka]], [[MongoDB]] and [[Elasticsearch]]. Trino is community-driven and released under the [[Apache License]].
 
== History ==
Trino was originally designed and developed by Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang at [[Facebook]] to allow data analysts to run interactive queries on its large [[data warehouse]] in [[Apache Hadoop]]. The project was originally named [[Presto (SQL query engine)|Presto]] and shares the first six years of development with the Presto project.<ref>{{cite web |title=Contributors to trinodb/trino |url=https://github.com/trinodb/trino/graphs/contributors?from=2012-08-05&to=2018-08-05&type=c |website=GitHub |access-date=20 September 2021 |language=en}}</ref><ref>{{cite web |title=Contributors to prestodb/presto |url=https://github.com/prestodb/presto/graphs/contributors?from=2012-08-05&to=2018-08-05&type=c |website=GitHub |access-date=20 September 2021 |language=en}}</ref> Before Presto, data analysts at Facebook relied on [[Apache Hive]], which was too slow for running interactive SQL analytics on their 250-petabyte data warehouse.<ref name="2013facebook">{{Cite news|url=http://www.computerworld.com/article/2485668/business-intelligence/facebook-goes-open-source-with-query-engine-for-big-data.html|title=Facebook goes open source with query engine for big data|author=Joab Jackson|date=November 6, 2013|work=Computer World|access-date=April 26, 2017}}</ref>
 
Traverso, Sundstrom, Phillips, and Hwang began development in 2012 and deployed an initial version later that year. Facebook released it as open source in November 2013.<ref name="2013facebook" /><ref name="2013facebook2">{{Cite news|url=https://gigaom.com/2013/06/06/facebook-unveils-presto-engine-for-querying-250-pb-data-warehouse/|title=Facebook unveils Presto engine for querying 250 PB data warehouse|author=Jordan Novet|date=June 6, 2013|work=Giga Om|access-date=April 26, 2017}}</ref> As Presto gained popularity, well-known companies such as [[Netflix]]<ref>{{Cite news|url=http://techblog.netflix.com/2014/10/using-presto-in-our-big-data-platform.html|title=Using Presto in our Big Data Platform on AWS|authors=Eva Tse, Zhenxiao Luo, Nezih Yigitbasi|date=October 7, 2014|work=Netflix technical blog|access-date=April 26, 2017}}</ref> and [[Airbnb]]<ref>{{cite web |title=Airpal: a Web UI for PrestoDB |url=https://medium.com/airbnb-engineering/airpal-a-web-based-query-execution-tool-for-data-analysis-33c43265ed1f |website=Medium |access-date=20 September 2021 |language=en |date=4 April 2016}}</ref> disclosed that they used Presto in both on-premises and cloud deployments at comparable petabyte scales. In late 2016, Amazon announced that it would provide Presto as a service called Athena.<ref>{{cite web |title=AWS Launches Amazon Athena {{!}} Amazon.com, Inc. - Press Room |url=https://press.aboutamazon.com/news-releases/news-release-details/aws-launches-amazon-athena |website=press.aboutamazon.com |access-date=20 September 2021 |language=en}}</ref>
 
In late 2018, a disagreement over the stewardship of Presto arose between the founders and Facebook as Facebook management pushed for tighter control over the project.<ref name="2020rename">{{cite web |last1=Traverso |first1=Martin |last2=Sundstrom |first2=Dain |last3=Phillips |first3=David |title=We’re rebranding PrestoSQL as Trino |url=https://trino.io/blog/2020/12/27/announcing-trino.html |website=trino.io |access-date=7 September 2021 |language=en |date=27 December 2020}}</ref> This move included giving automatic committership rights to Facebook developers without prior experience with the project.<ref name="2020rename"/> Shortly after Facebook management moved forward with these changes, the creators left the original Presto project to create a fork.<ref name="2020rename"/> The fork was also initially named Presto, so to differentiate the two, users called the original project PrestoDB and the fork PrestoSQL, after their respective web addresses, https://prestodb.io and [https://trino.io https://prestosql.io].<ref name="380issue">{{cite web |title=What is the relationship of prestosql and prestodb? · Issue #380 · trinodb/trino |url=https://github.com/trinodb/trino/issues/380 |website=GitHub |access-date=24 September 2021 |language=en}}</ref> The split resembles the [[Jenkins (software)#History|Jenkins and Hudson split]].
 
In January 2019, the Trino Software Foundation (formerly Presto Software Foundation) was announced. The foundation is a not-for-profit organization dedicated to the advancement of the Trino open source distributed SQL query engine.<ref name="2019psf">{{Cite web|url=https://www.prweb.com/releases/presto_software_foundation_launches_to_advance_presto_open_source_community/prweb16070792.htm|title=Presto Software Foundation Launches to Advance Presto Open Source Community|website=PRWeb|access-date=2019-02-01}}</ref><ref name="2019psf2">{{Cite web|url=https://thenewstack.io/prestos-new-foundation-signals-growth-for-the-big-data-sql-engine/|title=Presto's New Foundation Signals Growth for the Big Data SQL Engine|date=2019-01-31|website=The New Stack|language=en-US|access-date=2019-02-01}}</ref>
 
In September 2019, Facebook donated PrestoDB to the [[Linux Foundation]], establishing the Presto Foundation.<ref>{{Cite web|url=https://www.linuxfoundation.org/press-release/2019/09/facebook-uber-twitter-and-alibaba-form-presto-foundation-to-tackle-distributed-data-processing-at-scale/|title=Facebook, Uber, Twitter and Alibaba form Presto Foundation to Tackle Distributed Data Processing at Scale|access-date=2019-11-12}}</ref> Neither the creators of Presto nor its top contributors and committers were invited to join this foundation.<ref name="2019comment">{{Cite news|url=https://github.com/trinodb/trino/issues/380#issuecomment-557691046|title=What's the relationship between prestosql and prestodb?|date=2019-11-22}}</ref><ref name="2020rename"/>
 
In December 2020, PrestoSQL was rebranded as Trino.<ref name="2020rename"/> The name is a shortening of [[neutrino]], a subatomic particle chosen for its fast and light properties. According to the project, the name Trino is also shorter and easier to search for on the web.<ref>{{cite web |title=8: Trino: A ludicrously fast query engine: past, present, and future |url=https://trino.io/episodes/8.html |website=trino.io |access-date=24 September 2021 |language=en |date=11 January 2021}}</ref>
 
==See also==
* [[Big data]]
* [[Data Intensive Computing]]
* [[Presto (SQL query engine)]]
* [[Computer cluster]]
 
== References ==
{{Reflist}}
 
== External links ==
 
* [https://trino.io/foundation.html Trino Software Foundation (formerly Presto Software Foundation)]
* [https://github.com/prestodb/foundation Presto Foundation] (under the [[Linux Foundation]])
 
[[:Category:SQL]]
[[:Category:Free system software]]
[[:Category:Hadoop]]
[[:Category:Cloud platforms]]
[[:Category:Java platform]]
 
{{Uncategorized|date=September 2021}}