Data transformation (computing): Difference between revisions

m Cleaned up using AutoEd
Line 9:
Data-element-to-data-element mapping is frequently complicated by transformations that require [[one-to-many]] and many-to-one transformation rules.
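The two kinds of rule can be illustrated with a minimal Python sketch. The record layout and the element names (<code>full_name</code>, <code>street</code>, <code>city</code>, <code>address</code>) are hypothetical and chosen only for illustration; they do not come from any particular mapping tool.

```python
def split_full_name(record):
    """One-to-many rule: one source element feeds two target elements."""
    first, _, last = record["full_name"].partition(" ")
    return {"first_name": first, "last_name": last}


def join_address(record):
    """Many-to-one rule: two source elements feed one target element."""
    return {"address": ", ".join([record["street"], record["city"]])}


def apply_rules(record, rules):
    """Apply each mapping rule to the source record and merge the results."""
    target = {}
    for rule in rules:
        target.update(rule(record))
    return target


# Hypothetical source record used only to exercise the rules.
source = {"full_name": "Ada Lovelace", "street": "1 Example Road", "city": "London"}
result = apply_rules(source, [split_full_name, join_address])
```

Keeping each rule as a separate function makes it possible to list the rules in a mapping specification and apply them uniformly, whatever the number of source and target elements each one touches.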
 
The code generation step takes the data element mapping specification and creates an [[executable program]] that can be run on a computer system. Code generation can also produce transformations in easy-to-maintain computer languages such as [[Java (programming language)|Java]] or [[XSLT]].
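As a rough sketch of the idea, the Python fragment below turns a small mapping specification into the source text of a transformation function and then loads it. The specification format, the element names and the helper <code>generate_transform_source</code> are all assumptions made for this example; real code generators work from richer specifications.

```python
# Hypothetical mapping specification: each target element is paired with an
# expression over the source record, which the generated code calls `src`.
MAPPING_SPEC = {
    "id": "int(src['customer_id'])",
    "name": "src['first'] + ' ' + src['last']",
}


def generate_transform_source(spec, func_name="generated_transform"):
    """Emit Python source text for a function implementing the mapping."""
    lines = [f"def {func_name}(src):", "    return {"]
    for target, expression in spec.items():
        lines.append(f"        {target!r}: {expression},")
    lines.append("    }")
    return "\n".join(lines) + "\n"


# Generate the program text, then load it so that it can be executed.
source_code = generate_transform_source(MAPPING_SPEC)
namespace = {}
exec(source_code, namespace)
generated_transform = namespace["generated_transform"]

output = generated_transform({"customer_id": "7", "first": "Grace", "last": "Hopper"})
```

Because the generated program is ordinary source text, it can be inspected, stored under version control and maintained like hand-written code.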
 
When the mapping is indirect via a mediating [[data model]], the process is also called '''data mediation'''.
 
==Transformational languages==
Many languages are available for performing data transformation. Many [[transformational language]]s require a [[grammar]] to be provided, and in many cases the grammar is structured using something closely resembling [[Backus–Naur form|Backus–Naur Form (BNF)]]. The languages vary in their accessibility (cost) and general usefulness. Examples include:
* [[XSLT]] - the XML transformation language
* [[TXL (programming language)|TXL]] - prototyping language-based descriptions using source transformation
 
Although transformational languages are typically best suited for transformation, something as simple as regular expressions can be used to achieve useful transformations. The text editor TextPad supports regular expressions with arguments, which allows every instance of a particular pattern to be replaced with another pattern built from parts of the original. For example:
 
<pre>
Line 37:
In other words, all instances of a function invocation of foo with three arguments, followed by a function invocation with two arguments, would be replaced with a single function invocation using some or all of the original set of arguments.
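The same kind of replacement can be sketched with Python's <code>re</code> module. The second function name (<code>bar</code>), the combined name (<code>foobar</code>) and the choice of which arguments to keep are hypothetical; the pattern is deliberately naive and assumes that no argument contains a comma or a parenthesis.

```python
import re

# One capture group per argument: three for foo(...), two for bar(...).
ARG = r"\s*([^,()]+?)\s*"
PATTERN = re.compile(
    r"foo\s*\(" + ARG + "," + ARG + "," + ARG + r"\)\s*;\s*"
    r"bar\s*\(" + ARG + "," + ARG + r"\)\s*;"
)

# Keep the first argument of foo and the second argument of bar.
REPLACEMENT = r"foobar(\1, \5);"

source = 'foo("some string", 42, gCommon); bar(someObj, anotherObj);'
rewritten = PATTERN.sub(REPLACEMENT, source)
```

Here <code>rewritten</code> is <code>foobar("some string", anotherObj);</code>. Text that does not match the pattern is copied through unchanged.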
 
Another advantage of regular expressions is that they do not fail the null transform test: run a sample program through a transformation that performs no changes, and check that the output is identical to the input. Many transformational languages fail this test.
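A minimal sketch of such a test in Python follows. It assumes that "passing" means the output is character-for-character identical to the input; the sample text and the function name <code>null_transform</code> are hypothetical.

```python
import re


def null_transform(text):
    """Rewrite every identifier as itself, so no visible change is made."""
    return re.sub(r"([A-Za-z_]\w*)", r"\1", text)


# Irregular spacing, a comment and a Windows line ending are easy to lose
# in a tool that parses the input and prints it back out again.
sample = "void  f ( )\t{ /* odd   spacing */ }\r\n"
unchanged = null_transform(sample) == sample
```

The substitution touches only the matched identifiers, so whitespace, comments and line endings pass through untouched and <code>unchanged</code> is true.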
 
==Difficult problems==
There are many challenges in data transformation. Probably the most difficult problem to address in C++ is "unstructured preprocessor directives": preprocessor directives that do not contain blocks of code with simple grammatical descriptions. For example:
 
<pre>
void MyFunc ()
{
    if (x>17)
    { printf("test");
# ifdef FOO
    } else {
# endif
        if (gWatch)
            mTest = 42;
    }
}
</pre>
 
A fully general solution is very hard, because such preprocessor directives can essentially edit the underlying language in arbitrary ways.
However, because such directives are not, in practice, used in completely arbitrary ways, one can build practical tools for handling preprocessed languages. The [[DMS Software Reengineering Toolkit]] is capable of handling structured macros and preprocessor conditionals.
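One rough way to see why the directive in the example counts as unstructured is to check whether the text it guards is balanced on its own. The Python sketch below is a heuristic written for this illustration, not the method used by any particular tool: it handles only non-nested <code>#ifdef</code>/<code>#endif</code> pairs and ignores braces inside strings and comments.

```python
SAMPLE = """\
void MyFunc ()
{
    if (x>17)
    { printf("test");
# ifdef FOO
    } else {
# endif
        if (gWatch)
            mTest = 42;
    }
}
"""


def guarded_blocks(source):
    """Yield the lines enclosed by each #ifdef/#endif pair (no nesting)."""
    block = None
    for line in source.splitlines():
        directive = line.strip().replace(" ", "")
        if directive.startswith("#ifdef"):
            block = []
        elif directive.startswith("#endif"):
            yield block
            block = None
        elif block is not None:
            block.append(line)


def is_well_nested(lines):
    """True if braces never close before they open and all are closed."""
    depth = 0
    for ch in "\n".join(lines):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


structured = [is_well_nested(block) for block in guarded_blocks(SAMPLE)]
```

For the sample, the guarded text is <code>} else {</code>, which closes a brace it never opened, so <code>structured</code> is <code>[False]</code>: the conditional cuts across the block structure of the surrounding code instead of enclosing a complete statement.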
 
==See also==
Line 72 ⟶ 71:
* [[Refinement]] (contrast)
* [[Identity transform]]
* [[v:2-c (8-d): File formats, transformation, migration|File Formats, Transformation, and Migration]] (related Wikiversity article)
 
==External links==