Iridescent (talk | contribs) m Cleanup and typo fixing, typo(s) fixed: don’t → don't, they’ve → they've, ’s → 's |
Line 60:
Interactive data transformation (IDT)<ref>Tope Omitola, André Freitas, Edward Curry, Sean O’Riain, Nicholas Gibbins, and Nigel Shadbolt. Capturing Interactive Data Transformation Operations using Provenance Workflows. Retrieved from: http://andrefreitas.org/papers/preprint_capturing%20interactive_data_transformation_eswc_highlights.pdf</ref> is an emerging capability that allows business analysts and business users to interact directly with large datasets through a visual interface,<ref name="digital.lib.washington.edu"/> understand the characteristics of the data (via automated data profiling or visualization), and change or correct the data through simple interactions such as clicking or selecting certain elements of the data.<ref name="livinglab.mit.edu"/>
Although IDT follows the same data integration process steps as batch data integration, the key difference is that the steps are not necessarily followed in a linear fashion and typically
A number of companies, primarily start-ups such as Trifacta, Alteryx and Paxata, provide interactive data transformation tools. They aim to efficiently analyze, map and transform large volumes of data without the technical and process complexity that currently exists.
Line 66:
IDT solutions provide an integrated visual interface that combines the previously disparate steps of data analysis, data mapping, code generation/execution, and data inspection.<ref name="The Value of Data Transformation"/> IDT interfaces incorporate visualization to show the user patterns and anomalies in the data so they can identify erroneous or outlying values.<ref name="digital.lib.washington.edu"/>
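The profiling step behind such an interface, which surfaces outlying values for the user to review and correct, can be approximated with a short sketch. This is an illustration only and not the method of any particular IDT product: the function names <code>flag_outliers</code> and <code>apply_correction</code>, the modified z-score rule, the 3.5 threshold, and the sample values are all assumptions made for the example.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indexes of values that look anomalous.

    Uses a modified z-score based on the median absolute deviation (MAD).
    This rule and the default threshold are assumptions for the sketch.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing can be ranked.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


def apply_correction(values, index, new_value):
    """Simulate the user selecting one cell and replacing its value."""
    corrected = list(values)  # leave the original column untouched
    corrected[index] = new_value
    return corrected


# Made-up sample column with one suspicious entry.
ages = [21, 23, 22, 24, 220, 23, 22]

suspects = flag_outliers(ages)
print("flagged indexes:", suspects)

# The user inspects the flagged cell and supplies the intended value.
cleaned = apply_correction(ages, suspects[0], 22)
print("after correction:", cleaned)
```

In an interactive tool the flagged indexes would drive a visual highlight, and the correction would come from a click or selection instead of a function call.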
Once
By removing the developer from the process, IDT systems shorten the time needed to prepare and transform the data, eliminate costly errors in interpretation of user requirements and empower business users and analysts to control their data and interact with it as needed.<ref name="ReferenceA"/>
Line 77:
* [[TXL (programming language)|TXL]] - prototyping language-based descriptions, used for source code or data transformation.
* [[XSLT]] - the standard XML data transformation language (substitutable by [[XQuery]] in many applications).
Additionally, companies such as Trifacta and Paxata have developed domain-specific transformational languages (DSLs) for servicing and transforming datasets. The development of domain-specific languages has been linked to increased productivity and accessibility for non-technical users.<ref>{{Cite web|url=https://docs.trifacta.com/display/PE/Wrangle+Language|title=Wrangle Language - Trifacta Wrangler - Trifacta Documentation|website=docs.trifacta.com|access-date=2017-09-20}}</ref>
Another advantage of the recent DSL trend is that a DSL can both abstract the underlying execution of the logic it defines and run that same logic on various processing engines, such as [[Apache Spark|Spark]], [[MapReduce]], and Dataflow. With a DSL, the transformation language is not tied to the engine.<ref name=":0" />
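The engine independence described above can be illustrated with a minimal sketch. The three-operation mini-DSL here (<code>filter_gt</code>, <code>uppercase</code>, <code>keep</code>) is hypothetical and is not Trifacta's or Paxata's language. The two "engines" are stand-ins chosen for the example: a plain Python interpreter and a compiler to SQL executed by SQLite. The same pipeline definition runs unchanged on both.

```python
import sqlite3

# A pipeline in a hypothetical mini-DSL: an ordered list of steps.
PIPELINE = [
    ("filter_gt", "amount", 10),   # keep rows where amount > 10
    ("uppercase", "customer"),     # upper-case one text column
    ("keep", ["customer"]),        # project to a subset of columns
]

COLUMNS = ["customer", "amount"]
ROWS = [  # made-up sample data
    {"customer": "ada", "amount": 25},
    {"customer": "bob", "amount": 5},
    {"customer": "cy", "amount": 40},
]


def run_in_python(steps, rows):
    """Engine 1: interpret the steps directly over a list of dicts."""
    rows = [dict(r) for r in rows]  # copy so the input is not mutated
    for step in steps:
        op = step[0]
        if op == "filter_gt":
            _, col, value = step
            rows = [r for r in rows if r[col] > value]
        elif op == "uppercase":
            _, col = step
            for r in rows:
                r[col] = r[col].upper()
        elif op == "keep":
            _, cols = step
            rows = [{c: r[c] for c in cols} for r in rows]
        else:
            raise ValueError(f"unknown operation: {op}")
    return rows


def compile_to_sql(steps, table, columns):
    """Engine 2, part 1: translate the same steps into nested SQL."""
    cols = list(columns)
    query = f"SELECT {', '.join(cols)} FROM {table}"
    params = []
    for step in steps:
        op = step[0]
        if op == "filter_gt":
            _, col, value = step
            query = f"SELECT {', '.join(cols)} FROM ({query}) WHERE {col} > ?"
            params.append(value)
        elif op == "uppercase":
            _, col = step
            exprs = [f"UPPER({c}) AS {c}" if c == col else c for c in cols]
            query = f"SELECT {', '.join(exprs)} FROM ({query})"
        elif op == "keep":
            _, keep = step
            cols = list(keep)
            query = f"SELECT {', '.join(cols)} FROM ({query})"
        else:
            raise ValueError(f"unknown operation: {op}")
    return query, params


def run_in_sqlite(steps, rows, columns):
    """Engine 2, part 2: load the rows into SQLite and run the SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE t ({', '.join(columns)})")
    conn.executemany(
        f"INSERT INTO t VALUES ({', '.join('?' for _ in columns)})",
        [tuple(r[c] for c in columns) for r in rows],
    )
    query, params = compile_to_sql(steps, "t", columns)
    cursor = conn.execute(query, params)
    names = [d[0] for d in cursor.description]
    result = [dict(zip(names, row)) for row in cursor.fetchall()]
    conn.close()
    return result


print(run_in_python(PIPELINE, ROWS))
print(run_in_sqlite(PIPELINE, ROWS, COLUMNS))
```

Because the pipeline is plain data, supporting another engine means writing another interpreter or compiler for the same steps, and existing pipeline definitions stay unchanged.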