[[Image:ParCorFisherIris.png|right|400px|Parallel coordinates]]
[[File:Ggobi-flea2.png|right|400px|alt=Ggobi-flea2|Parallel coordinate plot of the flea data in [[GGobi]].]]
'''Parallel Coordinates''' plots are a common method of visualizing and analyzing [[multivariate data|high-dimensional datasets]].
To plot, or visualize, a set of [[point (geometry)|points]] in [[n-dimensional space|''n''-dimensional space]], ''n'' [[parallel (geometry)|parallel]] lines are drawn over the background representing [[coordinate]] axes, typically oriented vertically with equal spacing. Points in ''n''-dimensional space are represented as individual [[polyline]]s with ''n'' [[vertex (geometry)|vertices]] placed on the parallel axes corresponding to each [[coordinate]] entry of the ''n''-dimensional point,
This data visualization is similar to [[time series]] visualization, except that Parallel Coordinates are applied to data that do not correspond to chronological time. Therefore, different axis arrangements can be of interest, including reflecting an axis, that is, inverting its attribute range.
== History ==
The concept of Parallel Coordinates is often said to have originated in 1885 with the French mathematician [[Philbert Maurice d'Ocagne]].<ref>Ocagne, M. (1885). Coordonnées Parallèles et Axiales: Méthode de transformation géométrique et procédé nouveau de calcul graphique déduits de la considération des coordonnées parallèles. Gauthier-Villars. [https://archive.org/details/coordonnesparal00ocaggoog
For example, a three-variable equation could be solved using three parallel axes, marking known values on their scales, then drawing a line between them, with an unknown read from the scale at the point where the line intersects that scale.
The use of Parallel Coordinates as a visualization technique to show data is also often said to have originated earlier with [[Henry Gannett]] in work preceding the Statistical Atlas of the United States
for the 1890 Census, for example his "General Summary, Showing the Rank of States, by Ratios, 1880", <ref name="hg">{{cite
that shows the rank of 10 measures (population, occupations, wealth, manufacturing, agriculture, and so forth) on parallel axes connected by lines for each state.
==Higher dimensions==
On the plane with an XY Cartesian coordinate system, adding more [[dimensions]] in parallel coordinates (often abbreviated ||-coords, PCP, or PC) involves adding more axes. The value of parallel coordinates is that certain geometrical properties in high dimensions transform into easily seen 2D patterns. For example, a set of points on a line in ''n''-space transforms to a set of [[polyline]]s in parallel coordinates all intersecting at ''n'' − 1 points. For ''n'' = 2 this yields a point-line duality, which explains why the mathematical foundations of parallel coordinates are developed in [[Projective space|projective]] rather than [[Euclidean space|Euclidean]] space. A pair of lines intersects at a unique point, which has two coordinates and can therefore correspond to a unique line, which is also specified by two parameters (or two points).
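The ''n'' = 2 case of this duality can be checked numerically. The following is a minimal sketch and is not taken from the cited sources. It assumes the first axis is drawn at horizontal position 0 and the second at horizontal position ''d''; the function names <code>dual_point</code> and <code>height_at</code> are hypothetical. Under these assumptions, every point on the line ''x''<sub>2</sub> = ''m''·''x''<sub>1</sub> + ''b'' with ''m'' ≠ 1 is drawn as a segment whose supporting line passes through (''d''/(1 − ''m''), ''b''/(1 − ''m'')).

```python
from fractions import Fraction


def dual_point(m, b, d=1):
    """Point shared by all segments representing points on x2 = m*x1 + b.

    Assumes the x1-axis is drawn at horizontal position 0 and the x2-axis
    at horizontal position d. Undefined for m == 1, where the segments
    are parallel and never meet.
    """
    if m == 1:
        raise ValueError("m == 1: the segments are parallel and do not meet")
    return (Fraction(d) / (1 - m), Fraction(b) / (1 - m))


def height_at(x1, x2, pos, d=1):
    """Height of the straight line through (0, x1) and (d, x2) at position pos."""
    return x1 + (x2 - x1) * Fraction(pos) / d


m, b = Fraction(-2), Fraction(3)
px, py = dual_point(m, b)

# Every point (x1, m*x1 + b) on the line gives a segment through (px, py).
heights = [height_at(x1, m * x1 + b, px) for x1 in map(Fraction, range(-3, 4))]
print(px, py, all(h == py for h in heights))
```

With a negative slope, as in this example, the shared point lies between the two axes; for ''m'' = 1 no finite shared point exists, which is one reason the projective setting is convenient.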
==Statistical considerations==
== Reading ==
Inselberg ({{harvnb|Inselberg|1997|p= }}) gave a full review of how to visually read relational patterns in parallel coordinates.<ref>{{citation|last1=Inselberg |first1=A.|year=1997 |chapter=Multidimensional detective |title=Information Visualization, 1997. Proceedings., IEEE Symposium on |isbn=0-8186-8189-6|pages=100–107|doi=10.1109/INFVIS.1997.636793|s2cid=1823293 |citeseerx=10.1.1.457.3745 }}</ref> When most lines between two parallel
== Limitations ==
In parallel coordinates, each axis can have at most two neighboring axes (one on the left and one on the right). For an ''n''-dimensional data set, at most ''n''−1 relationships can be shown at a time without altering the approach. In [[time series]] visualization, there is a natural predecessor and successor, so in this special case a preferred arrangement exists. However, when the axes do not have a unique order, finding a good axis arrangement requires experimentation and feature engineering. To explore more relationships, axes may be reordered or restructured.
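The counting argument above can be illustrated with a short sketch. The axis names below are illustrative placeholders and the helper <code>adjacent_pairs</code> is hypothetical; the code only enumerates which attribute pairs end up next to each other for a given left-to-right order.

```python
from itertools import combinations


def adjacent_pairs(order):
    """Pairs of axes drawn next to each other for a given left-to-right order."""
    return [(a, b) for a, b in zip(order, order[1:])]


# Illustrative axis names (an assumption, not prescribed by the text).
axes = ["sepal length", "sepal width", "petal length", "petal width"]
shown = adjacent_pairs(axes)
all_pairs = list(combinations(axes, 2))

print(len(shown), "of", len(all_pairs), "pairwise relationships are adjacent")

# Reordering the axes exposes pairs that were not adjacent before.
reordered = [axes[1], axes[3], axes[0], axes[2]]
print(adjacent_pairs(reordered))
```

With four axes, a single arrangement shows three of the six possible pairs. In this particular example the second ordering shows exactly the three pairs that the first one misses, so the two plots together cover all six.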
One approach arranges the axes in 3-dimensional space (still in parallel, forming a [[Lattice graph]]), so that an axis can have more than two neighbors, placed in a circle around the central attribute; the arrangement problem can be improved by using a [[minimum spanning tree]].<ref name="sigmod13">{{cite book
| title=Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
| chapter=Interactive data mining with 3D-parallel-coordinate-trees
| pages=1009–1012
| publisher=Association for Computing Machinery
 | location=New York City, NY | year=2013 | doi=10.1145/2463676.2463696| isbn=9781450320375
| s2cid=14850709
== Software ==
While there are a large number of papers about parallel coordinates, only a few notable software packages are publicly available to convert databases into parallel coordinates graphics.<ref>{{cite web|url=http://eagereyes.org/techniques/parallel-coordinates|title=Parallel Coordinates|last=Kosara|first=Robert|year=2010}}</ref> Notable packages include [[ELKI]], [[GGobi]], [[Mondrian data analysis|Mondrian]], [[Orange (software)|Orange]] and [[ROOT]]. Libraries include [[Protovis.js]], while [[D3.js]] provides basic examples. D3.Parcoords.js, a D3-based library dedicated specifically to creating parallel coordinates graphics, has also been published. The [[Python (programming language)|Python]] data structure and analysis library [[Pandas (software)|Pandas]] implements parallel coordinates plotting, using the plotting library [[matplotlib]].<ref>[https://pandas.pydata.org/pandas-docs/version/0.21.0/visualization.html#parallel-coordinates Parallel Coordinates in Pandas]</ref>
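As an illustration of the Pandas route, the following minimal sketch draws a small plot with <code>pandas.plotting.parallel_coordinates</code>. The data values, the column names and the output file name are made up for this example, and the non-interactive <code>Agg</code> backend is chosen only so that the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Small made-up data set: three numeric attributes and one class column.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [4.0, 3.0, 2.0, 1.0],
        "c": [2.0, 2.5, 1.0, 3.5],
        "group": ["x", "x", "y", "y"],
    }
)

fig, ax = plt.subplots()
# One polyline per row, coloured by the value in the class column.
parallel_coordinates(df, class_column="group", ax=ax, color=["tab:blue", "tab:orange"])
fig.savefig("parallel_coordinates.png")
```

The function plots the numeric columns on a shared vertical scale, so attributes with very different ranges may need to be rescaled beforehand.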
== Other visualizations for multivariate data ==
* [[Radar chart]] –
* [[Andrews plot]] – A Fourier transform of the Parallel Coordinates graph.
* [[Sankey diagram]] – A visualization that emphasizes flow/movement/change from one state to another.
== References ==
*[https://github.com/IBM/conditional-parallel-coordinates Conditional Parallel Coordinates] – Recursive variant of Parallel Coordinates, where a categorical value can expand to reveal another level of Parallel Coordinates.
[[Category:Data and information visualization]]
[[Category:Multi-dimensional geometry]]
[[Category:Statistical charts and diagrams]]