Pattern matching: Difference between revisions

Content deleted Content added
No edit summary
m Task 70: Update syntaxhighlight tags - remove use of deprecated <source> tags
Line 22:
The simplest pattern in pattern matching is an explicit value or a variable. For an example, consider a simple function definition in Haskell syntax (function parameters are not in parentheses but are separated by spaces, = is not assignment but definition):
 
<sourcesyntaxhighlight lang="haskell">
f 0 = 1
</syntaxhighlight>
</source>
 
Here, 0 is a single value pattern. Now, whenever f is given 0 as argument the pattern matches and the function returns 1. With any other argument, the matching and thus the function fail. As the syntax supports alternative patterns in function definitions, we can continue the definition extending it to take more generic arguments:
 
<sourcesyntaxhighlight lang="haskell">
f n = n * f (n-1)
</syntaxhighlight>
</source>
 
Here, the first <code>n</code> is a single variable pattern, which will match absolutely any argument and bind it to name n to be used in the rest of the definition. In Haskell (unlike at least [[Hope programming language|Hope]]), patterns are tried in order so the first definition still applies in the very specific case of the input being 0, while for any other argument the function returns <code>n * f (n-1)</code> with n being the argument.
Line 43:
In Haskell, the following line defines an algebraic data type <code>Color</code> that has a single data constructor <code>ColorConstructor</code> that wraps an integer and a string.
 
<sourcesyntaxhighlight lang="haskell">
data Color = ColorConstructor Integer String
</syntaxhighlight>
</source>
 
The constructor is a node in a tree and the integer and string are leaves in branches.
Line 53:
If we pass a variable that is of type Color, how can we get the data out of this variable? For example, for a function to get the integer part of <code>Color</code>, we can use a simple tree pattern and write:
 
<sourcesyntaxhighlight lang="haskell">
integerPart (ColorConstructor theInteger _) = theInteger
</syntaxhighlight>
</source>
 
As well:
<sourcesyntaxhighlight lang="haskell">
stringPart (ColorConstructor _ theString) = theString
</syntaxhighlight>
</source>
 
The creations of these functions can be automated by Haskell's data record syntax.
Line 67:
Pattern matching can be used to filter data of a certain structure. For instance, in Haskell a [[list comprehension]] could be used for this kind of filtering:
 
<sourcesyntaxhighlight lang="haskell">
[A x|A x <- [A 1, B 1, A 2, B 2]]
</syntaxhighlight>
</source>
 
evaluates to
Line 76:
==Pattern matching in Mathematica==
In [[Mathematica]], the only structure that exists is the [[Tree (data structure)|tree]], which is populated by symbols. In the [[Haskell (programming language)|Haskell]] syntax used thus far, this could be defined as
<sourcesyntaxhighlight lang="mathematica">
 
data SymbolTree = Symbol String [SymbolTree]
</syntaxhighlight>
</source>
An example tree could then look like
<sourcesyntaxhighlight lang="mathematica">
 
Symbol "a" [Symbol "b" [], Symbol "c" [] ]
</syntaxhighlight>
</source>
In the traditional, more suitable syntax, the symbols are written as they are and the levels of the tree are represented using [], so that for instance <code>a[b,c]</code> is a tree with a as the parent, and b and c as the children.
 
Line 94:
 
The Mathematica function <code>Cases</code> filters elements of the first argument that match the pattern in the second argument:
<sourcesyntaxhighlight lang="mathematica">
Cases[{a[1], b[1], a[2], b[2]}, a[_] ]
</syntaxhighlight>
</source>
evaluates to
<sourcesyntaxhighlight lang="mathematica">
{a[1], a[2]}
</syntaxhighlight>
</source>
Pattern matching applies to the ''structure'' of expressions. In the example below,
<sourcesyntaxhighlight lang="mathematica">
Cases[ {a[b], a[b, c], a[b[c], d], a[b[c], d[e]], a[b[c], d, e]}, a[b[_], _] ]
</syntaxhighlight>
</source>
returns
<sourcesyntaxhighlight lang="mathematica">
{a[b[c],d], a[b[c],d[e]]}
</syntaxhighlight>
</source>
because only these elements will match the pattern <code>a[b[_],_]</code> above.
 
In Mathematica, it is also possible to extract structures as they are created in the course of computation, regardless of how or where they appear. The function <code>Trace</code> can be used to monitor a computation, and return the elements that arise which match a pattern. For example, we can define the [[Fibonacci number|Fibonacci sequence]] as
<sourcesyntaxhighlight lang="mathematica">
fib[0|1]:=1
fib[n_]:= fib[n-1] + fib[n-2]
</syntaxhighlight>
</source>
Then, we can ask the question: Given fib[3], what is the sequence of recursive Fibonacci calls?
<sourcesyntaxhighlight lang="mathematica">
Trace[fib[3], fib[_]]
</syntaxhighlight>
</source>
returns a structure that represents the occurrences of the pattern <code>fib[_]</code> in the computational structure:
<sourcesyntaxhighlight lang="mathematica">
{fib[3],{fib[2],{fib[1]},{fib[0]}},{fib[1]}}
</syntaxhighlight>
</source>
 
===Declarative programming===
Line 130:
For instance, the [[Mathematica]] function <code>Compile</code> can be used to make more efficient versions of the code. In the following example the details do not particularly matter; what matters is that the subexpression <code>{{com[_], Integer}}</code> instructs <code>Compile</code> that expressions of the form <code>com[_]</code> can be assumed to be [[integer]]s for the purposes of compilation:
 
<sourcesyntaxhighlight lang="mathematica">
com[i_] := Binomial[2i, i]
Compile[{x, {i, _Integer}}, x^com[i], {{com[_], Integer}}]
</syntaxhighlight>
</source>
 
Mailboxes in [[Erlang programming language|Erlang]] also work this way.
Line 149:
In Haskell and functional programming languages in general, strings are represented as functional [[List (computing)|lists]] of characters. A functional list is defined as an empty list, or an element constructed on an existing list. In Haskell syntax:
 
<sourcesyntaxhighlight lang="haskell">
[] -- an empty list
x:xs -- an element x constructed on a list xs
</syntaxhighlight>
</source>
 
The structure for a list with some elements is thus <code>element:list</code>. When pattern matching, we assert that a certain piece of data is equal to a certain pattern. For example, in the function:
 
<sourcesyntaxhighlight lang="haskell">
head (element:list) = element
</syntaxhighlight>
</source>
 
We assert that the first element of <code>head</code>'s argument is called element, and the function returns this. We know that this is the first element because of the way lists are defined, a single element constructed onto a list. This single element must be the first. The empty list would not match the pattern at all, as an empty list does not have a head (the first element that is constructed).
Line 164:
In the example, we have no use for <code>list</code>, so we can disregard it, and thus write the function:
 
<sourcesyntaxhighlight lang="haskell">
head (element:_) = element
</syntaxhighlight>
</source>
 
The equivalent Mathematica transformation is expressed as
Line 181:
The same pattern in Haskell:
 
<sourcesyntaxhighlight lang="haskell">
['a', _]
</syntaxhighlight>
</source>
 
Symbolic entities can be introduced to represent many different classes of relevant features of a string. For instance,
Line 193:
In Haskell, [[guard (computing)|guards]] could be used to achieve the same matches:
 
<sourcesyntaxhighlight lang="haskell">
[letter, digit] | isAlpha letter && isDigit digit
</syntaxhighlight>
</source>
 
The main advantage of symbolic string manipulation is that it can be completely integrated with the rest of the programming language, rather than being a separate, special purpose subunit. The entire power of the language can be leveraged to build up the patterns themselves or analyze and transform the programs that contain them.