General fixes, added uncategorised tag
→Use Cases: remove marketing section based exclusively on the official web page
Line 45:
Trino has a [[Distributed computing|distributed]] [[massively parallel|MPP]] architecture, which was a significant departure from the MapReduce design used by popular data lake systems such as Hive, Impala, and [[Apache Spark]]. Trino first distributes work across multiple workers, either by running ad-hoc partitioning operations or by relying on partitions that already exist in the underlying data store. Once the data reaches a worker, it is processed by pipelined operators that run on multiple threads. Another defining characteristic of Trino is that it avoids the [[Application checkpointing|checkpointing]] operations, which involve expensive writes, used by systems like Hive and Spark. The trade-off is that a query may have to be restarted in the rare case that a failure occurs during execution.
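The execution model described above can be illustrated with a minimal Python sketch. It is not Trino code and does not use any Trino API: the function names (<code>partition_rows</code>, <code>run_worker</code>, <code>run_query</code>), the sample columns, and the retry limit are all hypothetical, and threads stand in for separate worker nodes. Rows are hash-partitioned across workers, each worker streams its partition through chained generator operators without materialising intermediate results, and a failure causes the whole query to be rerun rather than resumed from a checkpoint.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_workers, key):
    """Ad-hoc hash partitioning: route each row to a worker by its key."""
    parts = [[] for _ in range(num_workers)]
    for row in rows:
        parts[hash(row[key]) % num_workers].append(row)
    return parts


def scan(rows):
    """Source operator: yields rows one at a time."""
    for row in rows:
        yield row


def filter_op(rows, predicate):
    """Pipelined filter: passes matching rows downstream as they arrive."""
    for row in rows:
        if predicate(row):
            yield row


def run_worker(part):
    """One worker's pipeline: scan -> filter -> partial aggregate."""
    pipeline = filter_op(scan(part), lambda r: r["amount"] > 10)
    return sum(r["amount"] for r in pipeline)


def run_query(rows, num_workers=4, max_attempts=2):
    """Run the query; on failure, restart from scratch (no checkpoints)."""
    for attempt in range(1, max_attempts + 1):
        try:
            parts = partition_rows(rows, num_workers, key="customer")
            with ThreadPoolExecutor(max_workers=num_workers) as pool:
                partials = list(pool.map(run_worker, parts))
            return sum(partials)  # final aggregation of per-worker results
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt


rows = [{"customer": i % 5, "amount": i} for i in range(20)]
print(run_query(rows))  # 135, the sum of amounts 11 through 19
```

Because no intermediate state is written out, the only recovery path in this sketch is the retry loop in <code>run_query</code>, which mirrors the trade-off noted above.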
==See also==