{{Data transformation}}
{{Expert needed|computing|reason=The information appears outdated and requires sources both historical (history of data mapping) and current (how is data mapping performed today)|date=May 2018}}
In [[computing]] and [[data management]], '''data mapping''' is the process of creating [[data element]] [[Map (mathematics)|mapping]]s between two distinct [[data model]]s. Data mapping is used as a first step for a wide variety of [[data integration]] tasks, including:
* [[Data transformation]] or [[data mediation]] between a data source and a destination
* Identification of data relationships as part of [[data lineage]] analysis
* Discovery of hidden sensitive data, such as the last four digits of a Social Security number embedded in a user ID, as part of a [[data masking]] or [[de-identification]] project
* [[Data consolidation|Consolidation]] of multiple databases into a single database
For example, a company that would like to transmit and receive purchases and invoices with other companies might use data mapping to create data maps from a company's data to standardized [[ANSI ASC X12]] messages for items such as purchase orders and invoices.
==Standards==
In the future, tools based on [[semantic web]] languages such as [[Resource Description Framework|RDF]], the [[Web Ontology Language]] (OWL) and standardized [[metadata registry|metadata registries]] may make data mapping a more automatic process.

==Hand-coded, graphical manual==
Data mappings can be done in a variety of ways: using procedural code, creating [[XSLT]] transforms, or using graphical mapping tools that automatically generate executable transformation programs.
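A minimal sketch of the hand-coded, procedural approach; the field names and conversions below are illustrative examples, not drawn from any particular standard or product:

```python
# Hand-coded mapping sketch: source and target field names, and the
# per-field conversions, are hypothetical examples.

def map_record(source: dict) -> dict:
    """Map one record from a source schema to a target schema."""
    return {
        "customer_name": source["name"].strip().title(),
        "order_total": float(source["amount"]),          # string -> number
        "order_date": source["date"].replace("/", "-"),  # normalize format
    }

record = {"name": "  jane doe ", "amount": "42.50", "date": "2023/05/01"}
print(map_record(record))
# {'customer_name': 'Jane Doe', 'order_total': 42.5, 'order_date': '2023-05-01'}
```

Graphical mapping tools typically generate an equivalent transformation program (for example, an XSLT stylesheet) from a diagram of such field-to-field connections.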
==Data-driven mapping==
This is the newest approach in data mapping and involves simultaneously evaluating actual data values in two data sources using heuristics and statistics to automatically discover complex mappings between two data sets.
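One simple heuristic of this kind scores candidate column pairs by the overlap of their actual values; the sketch below uses Jaccard similarity over hypothetical column data:

```python
# Data-driven mapping sketch: score candidate column pairs by the overlap
# of their actual values (Jaccard similarity). Column names and data are
# hypothetical examples.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

source = {"cust_id": ["C1", "C2", "C3"], "country": ["US", "DE", "US"]}
target = {"customer": ["C2", "C3", "C4"], "nation": ["US", "FR", "DE"]}

# For each source column, pick the target column whose values overlap most.
mapping = {
    s: max(target, key=lambda t: jaccard(vals, target[t]))
    for s, vals in source.items()
}
print(mapping)  # {'cust_id': 'customer', 'country': 'nation'}
```

Real data-driven tools combine many such statistics (value distributions, formats, substring relationships) to discover more complex mappings than exact value overlap.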
==Semantic mapping==
[[Semantic mapper|Semantic mapping]] is similar to the auto-connect feature of data mappers with the exception that a [[metadata registry]] can be consulted to look up data element synonyms.
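The registry lookup can be sketched as follows; the registry contents and element names here are hypothetical:

```python
# Semantic-mapping sketch: consult a (hypothetical) metadata registry of
# synonyms so that differently named data elements can still be matched.

REGISTRY = {  # canonical term -> known synonyms
    "customer_id": {"cust_id", "client_no", "customer"},
    "postal_code": {"zip", "zipcode", "plz"},
}

def canonical(name: str):
    """Return the registry's canonical term for a data element name."""
    name = name.lower()
    for term, synonyms in REGISTRY.items():
        if name == term or name in synonyms:
            return term
    return None

# Two fields match if they resolve to the same canonical term,
# even though their names share no characters.
print(canonical("zip") == canonical("PLZ"))  # True
```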
Data lineage is a track of the life cycle of each piece of data as it is ingested, processed, and output by the analytics system. This provides visibility into the analytics pipeline and simplifies tracing errors back to their sources. It also enables replaying specific portions or inputs of the data flow for step-wise debugging or regenerating lost output. In fact, database systems have used such information, called data provenance, to address similar validation and debugging challenges already.<ref>De, Soumyarupa. (2012). Newt : an architecture for lineage based replay and debugging in DISC systems. UC San Diego: b7355202. Retrieved from: https://escholarship.org/uc/item/3170p7zn</ref>
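In its simplest form, lineage tracking records which inputs and processing steps produced each output; a minimal sketch, with hypothetical step and record identifiers:

```python
# Minimal lineage sketch: each processing step records which input
# records produced which output, so errors in the output can be traced
# back to their sources. Step names and record IDs are hypothetical.

lineage = {}  # output id -> list of (step name, contributing input ids)

def run_step(name, inputs, out_id, fn):
    """Apply fn to the input values and record their lineage."""
    lineage.setdefault(out_id, []).append((name, [i for i, _ in inputs]))
    return fn([v for _, v in inputs])

total = run_step("sum", [("in1", 2), ("in2", 3)], "out1", sum)
print(total, lineage["out1"])  # 5 [('sum', ['in1', 'in2'])]
```

With this record, a suspect output can be traced to its contributing inputs, and the step can be replayed on just those inputs for debugging.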
==See also==
* [[Data wrangling]]
* [[Identity transform]]
==References==
{{reflist}}
{{DEFAULTSORT:Data Mapping}}