Brianolsen2 (talk | contribs) ←Created page with '{{Infobox software | name = Trino | logo = Trino-logo-w-bk.svg | logo caption = | screenshot = Trino-dashboard.png | caption = Trino UI Version 358 | author = Martin Traverso, Dain Sundstrom, David Phillips, Eric Hwang | released = {{Start date and age|10 November 2013}} | programming language = Java | operating system = Cross-platform | repo = {{URL|https://github.com/trinodb/trino|Trino Repository}} | standard = [[ANSI]...' Tag: Disambiguation links added |
LesterMartin (talk | contribs) m showing breadth of file formats, not just columnar ones |
{{Short description|Open-source distributed SQL query engine}}
{{Infobox software
| name = Trino
| caption = Trino UI Version 358
| author = Martin Traverso, Dain Sundstrom, David Phillips, Eric Hwang
| programming language = [[Java (programming language)|Java]]
| operating system = [[Cross-platform]]
}}
'''Trino''' is an [[Open-source software|open-source]] distributed [[SQL]] query engine designed to query large data sets distributed over one or more heterogeneous data sources.<ref>{{cite web |title=Overview — Trino
== History ==
Trino was originally designed and developed by Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang at [[Facebook]] to allow data analysts to run interactive queries on its large [[data warehouse]] in [[Apache Hadoop]]. The project was originally named [[Presto (SQL query engine)|Presto]] and shares the first six years of development with the Presto project.<ref>{{cite web |title=Contributors to trinodb/trino |url=https://github.com/trinodb/trino/graphs/contributors?from=2012-08-05&to=2018-08-05&type=c |website=GitHub |access-date=20 September 2021 |language=en}}</ref><ref>{{cite web |title=Contributors to prestodb/presto |url=https://github.com/prestodb/presto/graphs/contributors?from=2012-08-05&to=2018-08-05&type=c |website=GitHub |access-date=20 September 2021 |language=en}}</ref> Before Presto, data analysts at Facebook relied on [[Apache Hive]], which was too slow for running interactive SQL analytics on their 250-petabyte data warehouse.<ref name="2013facebook">{{Cite news|url=http://www.computerworld.com/article/2485668/business-intelligence/facebook-goes-open-source-with-query-engine-for-big-data.html|title=Facebook goes open source with query engine for big data|author=Joab Jackson|date=November 6, 2013|work=Computer World|access-date=April 26, 2017}}</ref>
In January 2019, the Trino Software Foundation (formerly Presto Software Foundation) was announced. The foundation is a not-for-profit organization dedicated to the advancement of the Trino open source distributed SQL query engine.<ref name="2019psf">{{Cite web|url=https://www.prweb.com/releases/presto_software_foundation_launches_to_advance_presto_open_source_community/prweb16070792.htm|title=Presto Software Foundation Launches to Advance Presto Open Source Community|website=PRWeb|access-date=2019-02-01}}</ref><ref name="2019psf2">{{Cite web|url=https://thenewstack.io/prestos-new-foundation-signals-growth-for-the-big-data-sql-engine/|title=Presto's New Foundation Signals Growth for the Big Data SQL Engine|date=2019-01-31|website=The New Stack|language=en-US|access-date=2019-02-01}}</ref>
Trino is used in many data platforms and products from cloud providers and other vendors. These products range from unmodified Trino deployments to heavily customized systems, including integrations into specialized data platforms built for specific kinds of data. Examples include [https://trino.io/users Amazon Athena, Starburst Galaxy, Dune, and many others].
== Architecture ==
[[File:Figure 4-1 Trino architecture.png|thumb|Trino architecture overview with coordinator and workers<ref name="trino-definitive-guide-ch4">{{cite book |last1=Fuller |first1=Matt |last2=Moser |first2=Manfred |last3=Traverso |first3=Martin |title=Trino: The Definitive Guide |chapter=Chapter 4. Trino Architecture |date=2021 |publisher=O'Reilly Media, Inc, USA |isbn=9781098107710 |pages=43–72}}</ref>]]
Trino is written in [[Java (programming language)|Java]].<ref name="trino-definitive-guide-ch2">{{cite book |last1=Fuller |first1=Matt |last2=Moser |first2=Manfred |last3=Traverso |first3=Martin |title=Trino: The Definitive Guide |chapter=Chapter 2. Installing and Configuring Trino |date=2021 |publisher=O'Reilly Media, Inc, USA |isbn=9781098107710 |pages=19–24}}</ref> It runs on a cluster of servers that contains two types of nodes, a '''coordinator''' and a '''worker'''.<ref name="trino-definitive-guide-ch4" />
* The coordinator is responsible for parsing, analyzing, optimizing, planning, and scheduling a query submitted by a client. The coordinator interacts with the [[service provider interface]] (SPI) to obtain the available tables, table statistics, and other information needed to carry out its tasks.<ref name="trino-definitive-guide-ch4" />
* The workers are responsible for executing the tasks and operators fed to them by the scheduler. These tasks process rows from data sources and produce results that are returned to the coordinator and ultimately back to the client.
Trino adheres to the [[ANSI]] [[SQL]]<ref name="trino-definitive-guide-ch1">{{cite book |last1=Fuller |first1=Matt |last2=Moser |first2=Manfred |last3=Traverso |first3=Martin |title=Trino: The Definitive Guide |chapter=Chapter 1. Introducing Trino |date=2021 |publisher=O'Reilly Media, Inc, USA |isbn=9781098107710 |pages=3–17}}</ref> standard and includes various parts of the following ANSI specifications: [[SQL-92]], [[SQL:1999]], [[SQL:2003]], [[SQL:2008]], [[SQL:2011]], [[SQL:2016]], [[SQL:2023]].
Trino supports the separation of compute and storage<ref name="trino-definitive-guide-ch1" /> and may be deployed both on-premises and in the [[Cloud computing|cloud]].<ref name="trino-definitive-guide-ch13">{{cite book |last1=Fuller |first1=Matt |last2=Moser |first2=Manfred |last3=Traverso |first3=Martin |title=Trino: The Definitive Guide |chapter=Chapter 13. Real-World Examples |date=2021 |publisher=O'Reilly Media, Inc, USA |isbn=9781098107710 |pages=267–272}}</ref>
Trino has a [[Distributed computing|distributed]] [[massively parallel|MPP]] architecture. Trino first distributes work over multiple workers by running ad-hoc partitioning operations or by relying on existing partitions in the data of the underlying data store. Once the data has reached a worker, it is processed by pipelined operators running on multiple threads. Another deliberate characteristic of Trino was the lack of [[fault tolerance]], which avoids checkpointing operations that involve expensive writes to disk. This leaves queries vulnerable to needing a restart if there is a failure. In practice, this is not reported to happen often.
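The division of work between a coordinator and its workers can be illustrated with a toy model. The following Python sketch is not Trino code and does not use any Trino API. The function names (<code>partition_rows</code>, <code>worker_task</code>, <code>run_query</code>) and the sample rows are hypothetical, and threads stand in for worker nodes. It only shows the general idea of hash-partitioning rows, aggregating each partition in parallel, and merging the partial results, roughly what a query such as <code>SELECT region, SUM(amount) ... GROUP BY region</code> requires.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, key, num_workers):
    """Coordinator step (toy): assign each row to a worker by hashing its key."""
    partitions = [[] for _ in range(num_workers)]
    for row in rows:
        partitions[hash(row[key]) % num_workers].append(row)
    return partitions


def worker_task(partition, key, value):
    """Worker step (toy): compute a partial SUM(value) grouped by key."""
    partial = defaultdict(int)
    for row in partition:
        partial[row[key]] += row[value]
    return dict(partial)


def run_query(rows, key, value, num_workers=3):
    """Partition the rows, aggregate each partition in parallel, merge the results."""
    partitions = partition_rows(rows, key, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Threads stand in for worker nodes; an exception in any task
        # propagates here and fails the whole query.
        partials = list(pool.map(lambda p: worker_task(p, key, value), partitions))
    result = {}
    for partial in partials:
        for group, subtotal in partial.items():
            result[group] = result.get(group, 0) + subtotal
    return result


# Hypothetical sample data, used only for illustration.
rows = [
    {"region": "eu", "amount": 10},
    {"region": "us", "amount": 5},
    {"region": "eu", "amount": 7},
    {"region": "apac", "amount": 3},
]
totals = run_query(rows, "region", "amount")
```

In this model, as in the description above, nothing is checkpointed: if any worker task raises an error, the whole call fails and the query has to be run again.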
==See also==
* [[Presto (SQL query engine)]]
* [[Big data]]
* [[Data Intensive Computing]]
== References ==
{{Reflist}}
== External links ==
* [https://trino.io/foundation.html Trino Software Foundation (formerly Presto Software Foundation)]