Array programming: Difference between revisions
{{Short description|Applying operations to whole sets of values simultaneously}}
{{Programming paradigms}}
In [[computer science]], '''array programming''' refers to solutions that allow the application of operations to an entire set of values at once. Such solutions are commonly used in [[computational science|scientific]] and [[engineering]] settings.
 
Modern programming languages that support array programming (also known as [[vector (data structure)|vector]] or [[multidimensional analysis|multidimensional]] languages) have been engineered specifically to generalize operations on [[scalar (computing)|scalar]]s to apply transparently to [[vector (geometric)|vector]]s, [[matrix (mathematics)|matrices]], and higher-dimensional arrays. These include [[APL (programming language)|APL]], [[J (programming language)|J]], [[Fortran 90]], Mata, [[MATLAB]], [[Analytica (software)|Analytica]], [[TK Solver]] (as lists), [[GNU Octave|Octave]], [[R (programming language)|R]], [[Cilk Plus]], [[Julia (programming language)|Julia]], [[Perl Data Language|Perl Data Language (PDL)]], [[Raku (programming language)|Raku]], and the [[NumPy]] extension to [[Python (programming language)|Python]]. In these languages, an operation that operates on entire arrays can be called a ''vectorized'' operation,<ref>{{cite journal |title=The NumPy array: a structure for efficient numerical computation |author=Stéfan van der Walt |author2=S. Chris Colbert |author3=Gaël Varoquaux |name-list-style=amp |journal=Computing in Science and Engineering |volume=13 |issue=2 |pages=22–30 |publisher=IEEE |year=2011 |doi=10.1109/mcse.2011.37|bibcode=2011CSE....13b..22V |arxiv=1102.1523 |s2cid=16907816 }}</ref> regardless of whether it is executed on a [[vector processor]], which implements vector instructions. Array programming primitives concisely express broad ideas about data manipulation. The level of concision can be dramatic in certain cases: it is not uncommon{{Example needed|date=September 2021}} to find array programming language [[one-liner program|one-liners]] that require several pages of object-oriented code.
 
==Concepts of array==
The fundamental idea behind array programming is that operations apply at once to an entire set of values. This makes it a [[high-level programming language|high-level programming]] model as it allows the programmer to think and operate on whole aggregates of data, without having to resort to explicit loops of individual scalar operations.
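As an illustrative sketch (using Python with the [[NumPy]] extension mentioned above; the variable names are hypothetical), the same elementwise computation can be written either as an explicit scalar loop or as a single operation on the whole aggregate:

```python
import numpy as np

prices = np.array([10.0, 12.5, 8.0, 20.0])
quantities = np.array([3, 1, 4, 2])

# Scalar thinking: an explicit loop over individual elements.
totals_loop = np.empty_like(prices)
for i in range(len(prices)):
    totals_loop[i] = prices[i] * quantities[i]

# Array thinking: one vectorized operation applied to the entire set at once.
totals = prices * quantities

assert (totals == totals_loop).all()
```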
 
[[Kenneth E. Iverson]] described the rationale behind array programming (actually referring to APL) as follows:<ref>{{cite journal |last=Iverson |first=K. E. |title=Notation as a Tool of Thought |journal= Communications of the ACM |volume= 23 |issue= 8 |pages= 444–465 |year= 1980 |doi= 10.1145/358896.358899 |author-link= Kenneth E. Iverson |doi-access= free }}</ref>
{{quote|most programming languages are decidedly inferior to mathematical notation and are little used as tools of thought in ways that would be considered significant by, say, an applied mathematician.
The thesis is that the advantages of executability and universality found in programming languages can be effectively combined, in a single coherent language, with the advantages offered by mathematical notation. it is important to distinguish the difficulty of describing and of learning a piece of notation from the difficulty of mastering its implications. For example, learning the rules for computing a matrix product is easy, but a mastery of its implications (such as its associativity, its distributivity over addition, and its ability to represent linear functions and geometric operations) is a different and much more difficult matter.
 
==Uses==
Array programming is very well suited to [[implicit parallelization]], a topic of much current research. Further, [[Intel]] and compatible CPUs developed and produced after 1997 contained various instruction set extensions, starting from [[MMX (instruction set)|MMX]] and continuing through [[SSSE3]] and [[3DNow!]], which include rudimentary [[Single instruction, multiple data|SIMD]] array capabilities. This has continued into the 2020s with instruction sets such as [[AVX-512]], making modern CPUs sophisticated vector processors. Array processing is distinct from [[parallel computing|parallel processing]] in that one physical processor performs operations on a group of items simultaneously, while parallel processing aims to split a larger problem into smaller ones ([[Multiple instruction, multiple data|MIMD]]) to be solved piecemeal by numerous processors. Processors with [[Multi-core processor|multiple cores]] and [[Graphics processing unit|GPU]]s with thousands of [[General-purpose computing on graphics processing units|general computing cores]] are common as of 2023.
 
==Languages==
The canonical examples of array programming languages are [[Fortran]], [[APL (programming language)|APL]], and [[J (programming language)|J]]. Others include: [[A+ (programming language)|A+]], [[Analytica (software)|Analytica]], [[Chapel (programming language)|Chapel]], [[IDL (programming language)|IDL]], [[Julia (programming language)|Julia]], [[K (programming language)|K]], Klong, [[Q (programming language from Kx Systems)|Q]], Mata, [[MATLAB]], [[GNU Octave]], [[Scilab]], [[FreeMat]], [[Perl Data Language|PDL]], [[R (programming language)|R]], [[Raku (programming language)|Raku]], [[S-Lang]], [[SAC programming language|SAC]], [[Nial]], [[ZPL (programming language)|ZPL]], [[Futhark (programming language)|Futhark]], and [[TI-BASIC]].
 
===Scalar languages===
 
While scalar languages like C do not have native array programming elements as part of the language proper, this does not mean programs written in these languages never take advantage of the underlying techniques of vectorization (i.e., utilizing a CPU's [[Single instruction, multiple data|vector-based instructions]] if it has them, or using multiple CPU cores). Some C compilers, such as [[GNU Compiler Collection|GCC]] at certain optimization levels, detect and vectorize sections of code that their heuristics determine would benefit from it. Another approach is given by the [[OpenMP]] API, which allows one to parallelize applicable sections of code by taking advantage of multiple CPU cores.
 
===Array languages===
 
Both MATLAB and GNU Octave natively support [[linear algebra]] operations such as matrix multiplication, [[matrix inversion]], and the numerical solution of [[system of linear equations|systems of linear equations]], even using the [[Moore–Penrose pseudoinverse]].<ref>{{cite web |title= GNU Octave Manual. Arithmetic Operators. |url= https://www.gnu.org/software/octave/doc/interpreter/Arithmetic-Ops.html |access-date= 2011-03-19}}</ref><ref>{{cite web |title= MATLAB documentation. Arithmetic Operators. |url= http://www.mathworks.com/help/techdoc/ref/arithmeticoperators.html |access-date= 2011-03-19 |archive-date= 2010-09-07 |archive-url= https://web.archive.org/web/20100907074906/http://www.mathworks.com/help/techdoc/ref/arithmeticoperators.html |url-status= dead }}</ref>
 
The [[Nial]] example of the inner product of two arrays can be implemented using the native matrix multiplication operator. If <code>a</code> is a row vector of size [1 n] and <code>b</code> is a corresponding column vector of size [n 1], then:
 
a * b;
 
By contrast, the [[entrywise product]] is implemented as:
 
a .* b;
 
The inner product between two matrices having the same number of elements can be implemented with the auxiliary operator <code>(:)</code>, which reshapes a given matrix into a column vector, and the [[transpose]] operator <code>'</code>:
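The same idea can be sketched in Python with NumPy (an analogue, not the MATLAB code itself): flattening both matrices to vectors and taking their dot product mirrors the effect of <code>A(:)' * B(:)</code>, since the sum over corresponding entries does not depend on the flattening order as long as both matrices are flattened the same way.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Analogue of MATLAB's A(:)' * B(:): flatten both matrices to
# vectors, then take the dot product of corresponding entries.
inner = A.ravel() @ B.ravel()
# Same value as summing the entrywise product of A and B.
```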
====R====
The R language supports the [[array paradigm]] by default. The following example illustrates multiplication of two matrices followed by addition of a scalar (which is, in fact, a one-element vector) and a vector:
<syntaxhighlight lang="rout">
> A <- matrix(1:6, nrow=2)  # nrow=2, so A has 2 rows
> A
[,1] [,2] [,3]
[1,] 30 21
[2,] 42 30
</syntaxhighlight>
 
====Raku====
Raku supports the array paradigm via its metaoperators.<ref>{{cite web |url=https://docs.raku.org/language/operators#Metaoperators |title=Metaoperators section of Raku Operator documentation}}</ref> The following example demonstrates the addition of the arrays <code>@a</code> and <code>@b</code> using the hyper-operator in conjunction with the plus operator.
 
<syntaxhighlight lang="raku">
[0] > my @a = [[1,1],[2,2],[3,3]];
[[1 1] [2 2] [3 3]]
 
[1] > my @b = [[4,4],[5,5],[6,6]];
[[4 4] [5 5] [6 6]]
 
[2] > @a »+« @b;
[[5 5] [7 7] [9 9]]
</syntaxhighlight>
 
 
If the system is overdetermined – so that <code>A</code> has more rows than columns – the pseudoinverse <code>A<sup>+</sup></code> (in MATLAB and GNU Octave languages: <code>pinv(A)</code>) can replace the inverse <code>A<sup>−1</sup></code>, as follows:
:{{code|2=matlab|1=pinv(A) * (A * x) == pinv(A) * b}}
:{{code|2=matlab|1=(pinv(A) * A) * x == pinv(A) * b}} &nbsp; &nbsp; &nbsp; (matrix-multiplication associativity)
:{{code|2=matlab|1=x = pinv(A) * b}}
 
However, these solutions are neither the most concise (e.g. the need to notationally distinguish overdetermined systems remains) nor the most computationally efficient. The latter point is easy to understand when considering again the scalar equivalent <code>a * x = b</code>, for which the solution <code>x = a^-1 * b</code> would require two operations instead of the more efficient <code>x = b / a</code>.
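The contrast can be sketched in Python with NumPy (an illustrative analogue of the MATLAB expressions above, with hypothetical data): forming the pseudoinverse explicitly works, but a dedicated least-squares solver, playing the role of MATLAB's <code>A \ b</code>, avoids constructing the pseudoinverse at all and is typically the more efficient route.

```python
import numpy as np

# Overdetermined system: A has more rows than columns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([6.0, 9.0, 12.0])

# Explicit pseudoinverse, mirroring pinv(A) * b.
x_pinv = np.linalg.pinv(A) @ b

# Least-squares solver: same solution without forming pinv(A),
# analogous in spirit to MATLAB's backslash operator A \ b.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_pinv, x_lstsq)
```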
 
==Third-party libraries==
The use of specialized and efficient libraries to provide more terse abstractions is also common in other programming languages. In [[C++]] several linear algebra libraries exploit the language's ability to [[operator overloading|overload operators]]. In some cases a very terse abstraction in those languages is explicitly influenced by the array programming paradigm, as the [[NumPy]] extension library to [[Python (programming language)|Python]], [[Armadillo (C++ library)|Armadillo]], and [[Blitz++]] libraries do.<ref>{{cite web |title= Reference for Armadillo 1.1.8. Examples of Matlab/Octave syntax and conceptually corresponding Armadillo syntax. |url= https://arma.sourceforge.net/docs.html#syntax |access-date= 2011-03-19}}</ref><ref>{{cite web |title= Blitz++ User's Guide. 3. Array Expressions. |url= http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC80 |access-date= 2011-03-19 |archive-url= https://web.archive.org/web/20110323013142/http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC80 |archive-date= 2011-03-23 |url-status= dead }}</ref>
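For instance (a minimal NumPy sketch, illustrative only), operator overloading lets a whole-array expression read like its scalar counterpart:

```python
import numpy as np

# NumPy overloads the arithmetic operators, so a polynomial is
# evaluated elementwise over the whole array with no explicit loop.
x = np.linspace(0.0, 1.0, 5)
y = 3.0 * x**2 + 2.0 * x + 1.0
```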
 
==See also==
* [[Array slicing]]
* [[List of programming languages by type#Array languages|List of array programming languages]]
* [[Automatic vectorization]]
 
==References==
*[http://www.nsl.com/ "No stinking loops" programming]
*[https://web.archive.org/web/20110227013846/http://www.vector.org.uk/archive/v223/smill222.htm Discovering Array Languages]
 
{{Programming paradigms navbox}}
{{Types of programming languages}}
 
[[Category:Array programming languages| ]]