Parallel coordinates: Difference between revisions

Avarai (talk | contribs)
m Higher dimensions: Fixed typo
{{short description|Chart displaying multivariate data}}
[[Image:ParCorFisherIris.png|right|400px|Parallel coordinates]]
[[File:Ggobi-flea2.png|right|400px|alt=Ggobi-flea2|Parallel coordinate plot of the flea data in [[GGobi]].]]
'''Parallel coordinates''' plots are a common method of visualizing high-dimensional datasets to analyze [[multivariate data]] having multiple variables, or attributes.
 
To plot, or visualize, a set of [[point (geometry)|points]] in an [[n-dimensional space|''n''-dimensional space]], ''n'' [[parallel (geometry)|parallel]] lines representing [[coordinate]] axes are drawn over the background, typically oriented vertically with equal spacing. Each point in ''n''-dimensional space is represented as an individual [[polyline]] with ''n'' [[vertex (geometry)|vertices]] placed on the parallel axes; the position of the vertex on the ''i''-th axis corresponds to the ''i''-th coordinate of the point, and consecutive vertices are connected by ''n''&nbsp;&minus;&nbsp;1 line segments.
 
This data visualization is similar to [[time series]] visualization, except that parallel coordinates are applied to data whose axes do not correspond to points in time and therefore have no natural order. Different axis arrangements can therefore be of interest, including reordering the axes or reflecting an axis, that is, inverting its attribute range.
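
The construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name <code>polyline_vertices</code>, the unit spacing between axes, and the per-axis <code>(low, high)</code> ranges are assumptions made for the example.

```python
def polyline_vertices(point, axis_ranges, spacing=1.0):
    """Map an n-dimensional point to n (x, y) vertices, one per parallel axis.

    point       -- sequence of n coordinate values
    axis_ranges -- sequence of n (low, high) pairs, one per axis (assumed known)
    spacing     -- horizontal distance between neighboring axes (assumed equal)
    """
    vertices = []
    for i, (value, (low, high)) in enumerate(zip(point, axis_ranges)):
        x = i * spacing                      # the i-th axis sits at x = i * spacing
        y = (value - low) / (high - low)     # relative height of the value on that axis
        vertices.append((x, y))
    return vertices


# A 3-dimensional point becomes a polyline with 3 vertices and 2 segments.
vertices = polyline_vertices([5.0, 2.0, 30.0], [(0.0, 10.0), (0.0, 4.0), (0.0, 40.0)])
```

Drawing straight segments between consecutive entries of <code>vertices</code> gives the polyline for that point; repeating this for every point in the data set gives the full plot.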
 
== History ==
 
The concept of parallel coordinates is often said to have originated in 1885 with the French mathematician [[Philbert Maurice d'Ocagne]].<ref>Ocagne, M. (1885). Coordonnées Parallèles et Axiales: Méthode de transformation géométrique et procédé nouveau de calcul graphique déduits de la considération des coordonnées parallèles. Gauthier-Villars. [https://archive.org/details/coordonnesparal00ocaggoog archive.org]</ref> D'Ocagne sought a way to provide graphical calculation of mathematical functions using alignment diagrams called [[nomogram]]s, which used parallel axes with different scales.
For example, a three-variable equation could be solved using three parallel axes, marking known values on their scales, then drawing a line between them, with an unknown read from the scale at the point where the line intersects that scale.
 
The use of parallel coordinates as a visualization technique to show data is also often said to have originated earlier with [[Henry Gannett]], in work preceding the Statistical Atlas of the United States for the 1890 Census. An example is his "General Summary, Showing the Rank of States, by Ratios, 1880",<ref name="hg">{{cite book |first=Henry |last=Gannett |title=Scribner's statistical atlas of the United States |section=General Summary Showing the Rank of States by Ratios 1880 |url=https://www.davidrumsey.com/luna/servlet/detail/RUMSEY~8~1~32803~1152181}}</ref> which shows the rank of 10 measures (population, occupations, wealth, manufacturing, agriculture, and so forth) on parallel axes connected by lines for each state.
 
However, both d'Ocagne and Gannett were far preceded in this by [[André-Michel Guerry]],<ref>Guerry, A.-M. (1833). Essai sur la Statistique Morale de la France. Paris: Crochard.</ref> whose Plate IV, "Influence de l'Age", showed rankings of crimes against persons by age along parallel axes, connecting the same crime across age groups.<ref>Friendly, M. (2022). The life and works of André-Michel Guerry, revisited. Sociological Spectrum, 42(4-6), 233–259. https://doi.org/10.1080/02732173.2022.2078450</ref>
 
Parallel coordinates were popularised again by [[Alfred Inselberg]]<ref name="pc">{{cite journal |first=Alfred |last=Inselberg |title=The Plane with Parallel Coordinates |journal=Visual Computer |volume=1 |issue=4 |pages=69–91 |year=1985 |doi=10.1007/BF01898350 |s2cid=15933827 }}</ref> in 1985, having been systematically developed as a coordinate system starting from 1977. Some important applications are in [[Traffic collision avoidance system|collision avoidance algorithms]] for [[air traffic control]] (1987, three USA patents), [[data mining]] (USA patent), [[computer vision]] (USA patent), optimization, [[process control]], and more recently [[Intrusion detection system|intrusion detection]] and elsewhere.
 
==Higher dimensions==
On the plane with an XY Cartesian coordinate system, adding more [[dimensions]] in parallel coordinates (often abbreviated ||-coords, PCP, or PC) involves adding more axes. The value of parallel coordinates is that certain geometrical properties in high dimensions transform into easily seen 2D patterns. For example, a set of points on a line in ''n''-space transforms to a set of [[polyline]]s in parallel coordinates all intersecting at ''n''&nbsp;&minus;&nbsp;1 points. For ''n'' = 2 this yields a point-line duality, which shows why the mathematical foundations of parallel coordinates are developed in [[Projective space|projective]] rather than [[Euclidean space|Euclidean]] space. A pair of lines intersects at a unique point, which has two coordinates and can therefore correspond to a unique line, which is also specified by two parameters (or two points). In contrast, more than two points are required to specify a curve, and a pair of curves may not have a unique intersection. Hence, by using curves in parallel coordinates instead of lines, the point-line duality is lost, together with all the other properties of projective geometry and the known higher-dimensional patterns corresponding to (hyper)planes, curves, several smooth (hyper)surfaces, proximities, convexity and, recently, non-orientability.<ref name="pc2">{{cite book |first=Alfred |last=Inselberg |title=Parallel Coordinates: VISUAL Multidimensional Geometry and its Applications |publisher=Springer |year=2009 |isbn=978-0387215075 }}</ref> The goal is to map ''n''-dimensional relations into 2D patterns. Hence, parallel coordinates is not a point-to-point mapping but rather an ''n''D-subset to 2D-subset mapping, and there is no loss of information. Note that even a point in ''n''D is not mapped into a point in 2D, but to a polygonal line, which is a subset of 2D.
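
The point-line duality for ''n'' = 2 can be checked numerically. The sketch below assumes two vertical axes placed at horizontal positions 0 and ''d'', and a line ''x''<sub>2</sub> = ''m''·''x''<sub>1</sub> + ''b'' in the Cartesian plane; under these assumptions every point on the line maps to a segment passing through the single point (''d''/(1&nbsp;&minus;&nbsp;''m''), ''b''/(1&nbsp;&minus;&nbsp;''m'')). The function names are illustrative only.

```python
def dual_point(m, b, d=1.0):
    """Point in the parallel-coordinates plane that represents the line x2 = m*x1 + b.

    Assumes the first axis is at horizontal position 0 and the second at d.
    For m == 1 the segments are parallel and meet only at infinity, which is
    why the duality is stated in projective rather than Euclidean terms.
    """
    if m == 1:
        raise ValueError("slope 1 corresponds to a point at infinity")
    return (d / (1 - m), b / (1 - m))


def segment_height(x1, x2, X, d=1.0):
    """Height at horizontal position X of the line through (0, x1) and (d, x2)."""
    return x1 + (X / d) * (x2 - x1)


# Three points on the line x2 = -x1 + 2, each drawn as a segment between the axes.
m, b = -1.0, 2.0
X, Y = dual_point(m, b)
heights = [segment_height(x1, m * x1 + b, X) for x1 in (0.0, 1.0, 3.0)]
```

All entries of <code>heights</code> equal <code>Y</code>, so the three segments meet at the one point that stands for the original line.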
 
==Statistical considerations==
The rotation of the axes is a translation in the parallel coordinates, and if the lines intersect outside the parallel axes, the intersection can be translated between them by rotations. The simplest example of this is rotating the axis by 180 degrees.<ref name="Gpc2" />
 
Scaling is necessary because the plot is based on interpolation (linear combination) of consecutive pairs of variables.<ref name="Gpc2">{{cite book |first1=Rida |last1=Moustafa |first2=Edward J. |last2=Wegman |chapter=Multivariate continuous data – Parallel Coordinates |editor1=Unwin, A. |editor2=Theus, M. |editor3=Hofmann, H. |title=Graphics of Large Datasets: Visualizing a Million |publisher=Springer |pages=143–156 |year=2006 |isbn=978-0387329062 }}</ref> Therefore, the variables must be on a common scale, and there are many scaling methods to consider as part of the data preparation process, some of which can reveal more informative views.
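
As one illustration of putting variables on a common scale, the sketch below applies min-max scaling to each variable independently. This is only one of the many possible scaling methods and is chosen here as an assumption for the example; the function name and the handling of constant columns are likewise illustrative.

```python
def min_max_scale(columns):
    """Rescale each variable (column) to the interval [0, 1].

    columns -- list of lists, one inner list of values per variable.
    A constant column has no spread, so its values are placed at 0.5.
    """
    scaled = []
    for column in columns:
        low, high = min(column), max(column)
        span = high - low
        scaled.append([(v - low) / span if span else 0.5 for v in column])
    return scaled


# Two variables measured in very different units end up on the same scale.
scaled = min_max_scale([[1, 3, 5], [1000, 3000, 2000]])
```

After scaling, the height of a vertex on an axis depends only on where the value lies within that variable's observed range, not on the variable's original units.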
 
A smooth parallel coordinate plot is achieved with splines.<ref name="Gpc1">{{cite journal |first1=Rida |last1=Moustafa |first2=Edward J. |last2=Wegman |title=On Some Generalizations of Parallel Coordinate Plots |journal=Seeing a Million, A Data Visualization Workshop, Rain Am Lech (Nr.), Germany |year=2002 |url=http://herakles.zcu.cz/seminars/docs/infovis/papers/Moustafa_generalized_parallel_coordinates.pdf |archive-url=https://web.archive.org/web/20131224111246/http://herakles.zcu.cz/seminars/docs/infovis/papers/Moustafa_generalized_parallel_coordinates.pdf |url-status=dead |archive-date=2013-12-24 }}</ref> In the smooth plot, every observation is mapped into a parametric line (or curve), which is smooth, continuous on the axes, and orthogonal to each parallel axis. This design emphasizes the quantization level for each data attribute.<ref name="Gpc2" />
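
The properties listed above (smooth, continuous on the axes, orthogonal to each axis) can be illustrated with a simple cubic blend between two neighboring axes. This is a stand-in chosen for brevity and is not the spline construction of the cited work; the function name and the sampling scheme are assumptions for the example.

```python
def smooth_segment(y0, y1, samples=5):
    """Sample a curve from the vertex (0, y0) to the vertex (1, y1).

    Uses the cubic blend 3t^2 - 2t^3, whose derivative 6t - 6t^2 is zero at
    t = 0 and t = 1, so the curve meets both vertical axes at a right angle.
    """
    points = []
    for k in range(samples):
        t = k / (samples - 1)                # horizontal position between the two axes
        blend = 3 * t**2 - 2 * t**3          # goes smoothly from 0 to 1
        points.append((t, y0 + (y1 - y0) * blend))
    return points


curve = smooth_segment(0.0, 1.0, samples=5)
```

Chaining such segments across all neighboring axis pairs gives a curve per observation that passes through the same vertices as the straight-line polyline.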
 
== Reading ==
Inselberg ({{harvnb|Inselberg|1997}}) made a full review of how to visually read out relational patterns in parallel coordinates.<ref>{{citation|last1=Inselberg |first1=A.|year=1997 |chapter=Multidimensional detective |title=Information Visualization, 1997. Proceedings., IEEE Symposium on |isbn=0-8186-8189-6|pages=100–107|doi=10.1109/INFVIS.1997.636793|s2cid=1823293 |citeseerx=10.1.1.457.3745 }}</ref> When most lines between two parallel axes are somewhat parallel to each other, this suggests a positive relationship between the two dimensions. When lines cross in a kind of superposition of X-shapes, this suggests a negative relationship. When lines cross randomly, there is no particular relationship.
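
The reading rules above can be made concrete by counting how many pairs of lines cross between two neighboring axes. The measure below is an illustrative assumption and is not taken from the cited review: two lines cross exactly when their order on the left axis is reversed on the right axis.

```python
from itertools import combinations


def crossing_fraction(left, right):
    """Fraction of line pairs that cross between two neighboring axes.

    left, right -- values of the same observations on the two axes.
    Lines i and j cross when their vertical order differs on the two axes.
    """
    pairs = list(combinations(range(len(left)), 2))
    crossings = sum(
        1 for i, j in pairs
        if (left[i] - left[j]) * (right[i] - right[j]) < 0
    )
    return crossings / len(pairs)


mostly_parallel = crossing_fraction([1, 2, 3, 4], [2, 4, 6, 8])   # positive relationship
x_shaped = crossing_fraction([1, 2, 3, 4], [8, 6, 4, 2])          # negative relationship
```

A fraction near 0 corresponds to mostly parallel lines, a fraction near 1 to the superposed X-shapes, and values in between to lines that cross without a clear pattern.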
 
== Limitations ==
In parallel coordinates, each axis can have at most two neighboring axes (one on the left, and one on the right). For an ''n''-dimensional data set, at most ''n''&nbsp;&minus;&nbsp;1 relationships can be shown at a time without altering the approach. In [[time series]] visualization, there exists a natural predecessor and successor; therefore, in this special case, there exists a preferred arrangement. However, when the axes do not have a unique order, finding a good axis arrangement requires heuristics, experimentation and feature engineering. To explore more complex relationships, axes may be reordered or restructured.
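
The gap between the relationships shown and the relationships that exist can be counted directly. In the sketch below the attribute names are hypothetical; only neighboring axes in the chosen order form a visible pair, while every pair of attributes is a potential relationship.

```python
from itertools import combinations


def shown_relationships(order):
    """Pairs of attributes that sit on neighboring axes for a given axis order."""
    return list(zip(order, order[1:]))


def all_relationships(order):
    """Every unordered pair of attributes, whether or not the axes are neighbors."""
    return list(combinations(order, 2))


order = ["price", "weight", "speed", "range"]   # hypothetical attribute names
shown = shown_relationships(order)
possible = all_relationships(order)
```

With four axes, three of the six attribute pairs are visible at once, and a different axis order exposes a different subset.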
 
One approach arranges the axes in 3-dimensional space (still in parallel, forming a [[Lattice graph]]), so that an axis can have more than two neighbors arranged in a circle around the central attribute; the arrangement problem then becomes easier, for example by using a [[minimum spanning tree]].<ref name="sigmod13">{{cite book |author=Elke Achtert |author2=[[Hans-Peter Kriegel]] |author3=Erich Schubert |author4=Arthur Zimek
| title=Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
| chapter=Interactive data mining with 3D-parallel-coordinate-trees
| pages=1009–1012
| publisher=Association for Computing Machinery
| location=New York City, NY | year=2013 | doi=10.1145/2463676.2463696| isbn=9781450320375
| s2cid=14850709
}}</ref> A prototype of this visualization is available as an extension to the data mining software [[ELKI]]. However, the visualization is harder to interpret and interact with than a linear order.
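
How a minimum spanning tree can give an axis more than two neighbors is sketched below using Prim's algorithm. This is a generic illustration and does not reproduce the procedure of the cited paper or of ELKI; in particular, the weight matrix is assumed to hold some dissimilarity between attributes (smaller meaning more closely related), and the example values are made up.

```python
def minimum_spanning_tree(weights):
    """Prim's algorithm on a symmetric weight matrix.

    weights -- n x n list of lists; weights[i][j] is the assumed dissimilarity
               between attributes i and j.
    Returns the tree as a list of (i, j) edges, grown outward from attribute 0.
    """
    n = len(weights)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that connects the current tree to a new attribute.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda edge: weights[edge[0]][edge[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Made-up dissimilarities: attribute 0 is closely related to all the others.
weights = [
    [0, 1, 2, 3],
    [1, 0, 6, 7],
    [2, 6, 0, 8],
    [3, 7, 8, 0],
]
tree = minimum_spanning_tree(weights)
```

Here attribute 0 ends up with three neighbors in the tree, an arrangement that a single left-to-right axis order cannot express.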
 
== Software ==
While there are a large number of papers about parallel coordinates, only a few notable software packages are publicly available to convert databases into parallel coordinates graphics.<ref>{{cite web|url=http://eagereyes.org/techniques/parallel-coordinates|title=Parallel Coordinates|last=Kosara|first=Robert|year=2010}}</ref> Notable software packages include [[ELKI]], [[GGobi]], [[Macrofocus High-D]], [[Mondrian data analysis|Mondrian]], [[Orange (software)|Orange]] and [[ROOT]]. The libraries [[Protovis.js]]<ref>{{cite web|url=https://mbostock.github.com/protovis/ex/cars.html|title=Protovis.js: Parallel Coordinates|last=Bostock|first=Mike|year=2011}}</ref> and [[D3.js]]<ref>{{cite web|url=https://mbostock.github.com/d3/talk/20111116/iris-parallel.html|title=D3.js: Parallel Coordinates|last=Bostock|first=Mike|year=2012}}</ref><ref>{{cite web|url=http://bl.ocks.org/1341281|title=Parallel Coordinates|last=Davies|first=Jason|year=2011}}</ref> provide basic examples, while more complex examples are also available.<ref>{{cite web|url=http://exposedata.com/parallel/|title=Nutrient Contents - Parallel Coordinates|last=Chang|first=Kai|year=2012|url-status=dead|archive-url=https://web.archive.org/web/20160502023325/http://exposedata.com/parallel/|archive-date=2016-05-02}}</ref><ref>http://bl.ocks.org/syntagmatic</ref><ref>{{Cite web|url=https://bl.ocks.org/IlievskiV/510869afe89b36eb46744cfbc2f1c1f1|title=Interactive exploring of the Laptop Prices dataset with Parallel Coordinates|last=Ilievski|first=Vladimir|date=2020-02-08|website=bl.ocks.org|url-status=live|access-date=2020-02-17}}</ref> D3.Parcoords.js<ref>{{cite web|url=https://syntagmatic.github.com/parallel-coordinates/|title=Parallel Coordinates (beta)|year=2012|last=Chang|first=Kai}}</ref> (a D3-based library) and the [[Macrofocus High-D|Macrofocus High-D API]] (a Java library), both specifically dedicated to creating parallel coordinates graphics, have also been published.
The [[Python (programming language)|Python]] data structure and analysis library [[Pandas (software)|Pandas]] implements parallel coordinates plotting, using the plotting library [[matplotlib]].<ref>[https://pandas.pydata.org/pandas-docs/version/0.21.0/visualization.html#parallel-coordinates Parallel Coordinates in Pandas]</ref> The [[R (programming language)|R]] programming language package [https://cran.r-project.org/web/packages/GGally/index.html GGally], among others, also implements parallel coordinates plotting.<ref>[https://cran.r-project.org/web/packages/GGally/index.html Parallel coordinates in R]</ref> High-performance interactive parallel coordinates plots rendered with WebGL can be made with the [[Plotly]] libraries in [https://plot.ly/python/parallel-coordinates-plot/ Python], [https://plot.ly/r/parallel-coordinates-plot/ R], and [https://plot.ly/javascript/parallel-coordinates-plot/ JavaScript].
 
== Other visualizations for multivariate data ==
* [[Radar chart]] – A visualization with coordinate axes arranged radially.
* [[Andrews plot]] – A Fourier transform of the parallel coordinates graph.
* [[Sankey diagram]] – A visualization that emphasizes flow/movement/change from one state to another.
 
== References ==
* Heinrich, Julian and Weiskopf, Daniel (2013) ''[https://diglib.eg.org/handle/10.2312/conf.EG2013.stars.095-116 State of the Art of Parallel Coordinates]'', Eurographics 2013 - State of the Art Reports, pp.&nbsp;95–116
* Moustafa, Rida (2011) '' Parallel coordinate and parallel coordinate density plots'', Wiley Interdisciplinary Reviews: Computational Statistics, Vol 3(2), pp.&nbsp;134–148.
* Weidele, Daniel Karl I. (2019) ''[https://doi.org/10.1109/VISUAL.2019.8933632 Conditional Parallel Coordinates]'', IEEE Visualization Conference (VIS) 2019, pp.&nbsp;221–225
 
==External links==
* [http://www.cs.tau.ac.il/~aiisreal Alfred Inselberg's Homepage], with Visual Tutorial, History, Selected Publications and Applications
* [http://www.agocg.ac.uk/reports/visual/casestud/brunsdon/abstract.htm An Investigation of Methods for Visualising Highly Multivariate Datasets] by C. Brunsdon, A. S. Fotheringham & M. E. Charlton, [[University of Newcastle upon Tyne|University of Newcastle]], [[UK]]
* [http://www.dcs.napier.ac.uk/~marting/parCoord/GrahamKennedyParallelCurvesIV03.pdf Using Curves to Enhance Parallel Coordinate Visualisations] {{Webarchive|url=https://web.archive.org/web/20070315191533/http://www.dcs.napier.ac.uk/~marting/parCoord/GrahamKennedyParallelCurvesIV03.pdf |date=2007-03-15 }} by Martin Graham & Jessie Kennedy, [[Napier University]], [[Edinburgh]], [[UK]]
* [http://eagereyes.org/techniques/parallel-coordinates Parallel Coordinates], a tutorial by Robert Kosara
* [http://www.ichrome.com/grapheme Grapheme] – an intuitive and powerful data visualization tool.
* [https://github.com/IBM/conditional-parallel-coordinates Conditional Parallel Coordinates] – Recursive variant of Parallel Coordinates, where a categorical value can expand to reveal another level of Parallel Coordinates.
* [http://davis.wpi.edu/xmdv/vis_parcoord.html Parallel coordinates plot in the public-domain software package XmdvTool]
* [http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/ Clustergram: A graph for visualizing cluster analyses] based on the Parallel Coordinates of each observation's cluster mean over the number of potential clusters (implemented in [[R (programming language)|R]]).
* [http://www.xdat.org/ XDAT] – free [[GPL]] Java-based software for plotting parallel coordinates.
* [https://ilievskiv.github.io/blog/2020-02-08-interactive-dataviz/ The importance of interactive data visualization], a tutorial by Vladimir Ilievski on how to use the Parallel Coordinates plot to explore the data
* [http://www.high-d.com/ High-D] – A multi-platform commercial tool for creating parallel coordinates visualizations (with examples)
* [http://archives.visokio.com/forums/forums.visokio.com/discussion/2136/radar-view-omniscope-2.html#parallel_coordinates Parallel coordinates plot in Omniscope Classic]
* [http://cda.ornl.gov/projects/eden/ EDEN] – An open source Java-based tool for interactive parallel coordinates plots developed by Chad A. Steed.
* [http://www.sliversoftware.com/ Sliver] – Data visualization software incorporating parallel coordinates plots.
 
[[Category:Data and information visualization]]
[[Category:Multi-dimensional geometry]]
[[Category:Statistical charts and diagrams]]