Data and information visualization

This is an old revision of this page, as edited by Mdd (talk | contribs) at 20:19, 22 July 2008 (Some more history). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Data visualization is the study of the visual representation of data, where data means information that has been abstracted in some schematic form, including attributes or variables for the units of information.[2]

Figure: The research process from data to visualization.[1]

Overview

The main goal of data visualization is to communicate information clearly and effectively. This does not mean that a data visualization needs to look boring to be functional, or extremely sophisticated to look beautiful. To convey ideas effectively, aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key aspects in a more intuitive way. Yet designers often fail to balance design and function, creating gorgeous data visualizations that fail to serve their main purpose: communicating information.[3]

Data visualization is closely related to:

History

The origins of this field are in the early days of computer graphics in the 1950s, when the first graphs and figures were generated by computers. A strong impulse was given to the field by the appearance, in 1987, of the NSF report "Visualization in Scientific Computing" edited by Bruce H. McCormick, Thomas A. DeFanti and Maxine D. Brown. In this report the need for new computer-based visualization techniques was stressed. With the rapid increase of computing power, larger and more complex numerical models were developed, resulting in the generation of huge numerical data sets. Also, large data sets were generated by data acquisition devices such as medical scanners and microscopes, and data was collected in large databases containing text, numerical information and multimedia information. Advanced computer graphics techniques were needed to process and visualize these massive data sets.[4]

The phrase "Visualization in Scientific Computing", which later became "Scientific Visualization", was used initially to refer to visualization as part of a process of scientific computing: the use of computer modelling and simulation in scientific and engineering practice. More recently, visualization has also become concerned with data from other sources, including large and heterogeneous data collections found in business and finance, administration, digital media, etc. A new research area called Information Visualization was launched in the early 1990s to support analysis of abstract and heterogeneous data sets in many application areas. As a result, the phrase "Data Visualization" is gaining acceptance as a term that includes both the scientific and information visualization fields.[4]

Since then, data visualization has been an evolving concept whose boundaries are continually expanding and, as such, is best defined in terms of loose generalizations. It refers to the more technologically advanced techniques that allow visual interpretation of data through the representation, modelling and display of solids, surfaces, properties and animations, involving the use of graphics, image processing, computer vision and user interfaces. It encompasses a much broader range of techniques than specific techniques such as solid modelling.[5]

Data visualization subjects

According to Michael Friendly (2008), the two main parts of data visualization are:[2]

The article "Data Visualization: Modern Approaches" (2007) gives an overview of seven subjects of data visualization:[6]

These subjects are all closely related to graphic design.

Frits H. Post (2002) gives quite a different overview of data visualization subjects. He lists:[4]

  • Visualization Algorithms and Techniques
  • Volume Visualization
  • Information Visualization
  • Multiresolution Methods
  • Modelling Techniques
  • Interaction Techniques and Architectures

Data acquisition

Data acquisition is the sampling of the real world to generate data that can be manipulated by a computer. Sometimes abbreviated DAQ or DAS, data acquisition typically involves acquisition of signals and waveforms and processing the signals to obtain desired information. The components of data acquisition systems include appropriate sensors that convert any measurement parameter to an electrical signal, which is acquired by data acquisition hardware.
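The sampling step described above can be illustrated with a minimal sketch. It assumes an idealized acquisition device that reads a voltage at a fixed rate and converts it to integer codes. The sensor function, the 100 Hz sample rate, the 0 to 5 V range and the 8-bit resolution are all hypothetical values chosen for illustration, and do not describe any particular hardware.

```python
import math


def sample_signal(signal, sample_rate_hz, duration_s):
    """Sample a continuous-time function at evenly spaced instants."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [signal(i / sample_rate_hz) for i in range(n_samples)]


def quantize(samples, v_min, v_max, bits):
    """Map voltages to integer codes, as an idealized converter would."""
    levels = 2 ** bits - 1
    codes = []
    for v in samples:
        # Clip readings that fall outside the input range of the converter.
        clipped = min(max(v, v_min), v_max)
        codes.append(round((clipped - v_min) / (v_max - v_min) * levels))
    return codes


def sensor_voltage(t):
    # Hypothetical sensor output: a 5 Hz sine wave between 0 V and 5 V.
    return 2.5 + 2.5 * math.sin(2 * math.pi * 5 * t)


samples = sample_signal(sensor_voltage, sample_rate_hz=100, duration_s=1.0)
codes = quantize(samples, v_min=0.0, v_max=5.0, bits=8)
```

Here `samples` holds the electrical signal as floating-point voltages and `codes` holds the integers a computer would store and manipulate.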

Data analysis

Data analysis is the process of looking at and summarizing data with the intent to extract useful information and develop conclusions. Data analysis is closely related to data mining, but data mining tends to focus on larger data sets, with less emphasis on making inference, and often uses data that was originally collected for a different purpose. In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis (EDA) and confirmatory data analysis (CDA), where EDA focuses on discovering new features in the data and CDA on confirming or falsifying existing hypotheses.
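The difference between summarizing data and testing a hypothesis can be sketched briefly. The example below computes descriptive statistics for one group, then uses a permutation test as one possible confirmatory method. The choice of test is an assumption made for this illustration, and the two samples are invented values.

```python
import random
import statistics


def describe(values):
    """Descriptive statistics: summarize a sample without testing anything."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Confirmatory step: estimate how often a difference in means at least
    as large as the observed one arises when group labels are shuffled."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)])
                   - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical measurements for two groups.
control = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]
treated = [4.9, 5.1, 4.7, 5.3, 5.0, 4.8]

summary = describe(control)
p_value = permutation_p_value(control, treated)
```

A small `p_value` suggests the difference between the groups is unlikely to be due to chance labelling alone. An exploratory analysis would instead start from plots and summaries such as `summary`, without a hypothesis fixed in advance.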

Types of data analysis are:

Data governance

Data governance encompasses the people, processes and technology required to create a consistent, enterprise view of an organisation's data in order to:

  • Increase consistency & confidence in decision making
  • Decrease the risk of regulatory fines
  • Improve data security
  • Maximize the income generation potential of data
  • Designate accountability for information quality

Data management

Data management comprises all the academic disciplines related to managing data as a valuable resource. The official definition provided by DAMA is that "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise." This definition is fairly broad and encompasses a number of professions which may not have direct technical contact with lower-level aspects of data management, such as relational database management.

Data mining

Data mining is the process of sorting through large amounts of data and picking out relevant information. It is usually used by business intelligence organizations and financial analysts, but is increasingly being used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods.

It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data"[7] and "the science of extracting useful information from large data sets or databases."[8] Data mining in relation to enterprise resource planning is the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making.[9]
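As a rough illustration of looking for patterns in transaction data, the sketch below counts how often pairs of items occur together and keeps the pairs that appear in at least a given share of transactions. Frequent-pair counting is just one simple technique, chosen here as an assumption. The function name, the threshold and the transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose share of transactions is at least min_support."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so each pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical transaction data.
transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
    ["milk", "bread"],
]

patterns = frequent_pairs(transactions, min_support=0.5)
```

With this data, only bread and butter occur together in at least half of the transactions. On real data sets the number of candidate pairs grows quickly, which is one reason dedicated data mining algorithms exist.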

See also

Software programs / visualization applications / graphics toolkits
Organizations

References

  1. ^ National Visualization and Analytics Center. Retrieved 1 July 2008.
  2. ^ a b Michael Friendly (2008). "Milestones in the history of thematic cartography, statistical graphics, and data visualization".
  3. ^ "Data Visualization and Infographics" in: Graphics, Monday Inspiration, January 14th, 2008.
  4. ^ a b c Frits H. Post, Gregory M. Nielson and Georges-Pierre Bonneau (2002). Data Visualization: The State of the Art.
  5. ^ Paul Reilly, S. P. Q. Rahtz (eds.) 1992. Archaeology and the Information Age: A Global Perspective. p.92.
  6. ^ "Data Visualization: Modern Approaches". in: Graphics, August 2nd, 2007
  7. ^ W. Frawley, G. Piatetsky-Shapiro and C. Matheus (Fall 1992). "Knowledge Discovery in Databases: An Overview". AI Magazine: pp. 213–228. ISSN 0738-4602.
  8. ^ D. Hand, H. Mannila, P. Smyth (2001). Principles of Data Mining. MIT Press, Cambridge, MA. ISBN 0-262-08290-X.
  9. ^ Ellen Monk, Bret Wagner (2006). Concepts in Enterprise Resource Planning, Second Edition. Thomson Course Technology, Boston, MA. ISBN 0-619-21663-8.

Further reading