Multiway data analysis: Difference between revisions

{{short description|Method of analyzing large data sets}}
'''Multiway data analysis''' is a method of analyzing large data sets by representing a collection of observations as a [[multiway array]], <math> {\mathcal A}\in{\mathbb C}^{I_0\times I_1\times \dots \times I_c\times \dots \times I_C}</math>. It is distinguished from [[multivariate analysis]], [[multilevel analysis]] and [[multidimensional analysis]]. The proper choice of data organization into a ''(C+1)''-way array, and of analysis techniques, can reveal patterns in the underlying data that go undetected by other methods.<ref name=Coppi1989>
{{cite book
|editor1-last=Coppi|editor1-first=R.
|editor2-last=Bolasco|editor2-first=S.
|title=Multiway Data Analysis
|publisher=North-Holland
|___location=Amsterdam
|year=1989
|isbn=9780444874108
}}</ref>
 
==History==
The study of multiway data analysis was first formalized as the result of a conference held in 1988. The result of this conference was the first text specifically addressed to this field, Coppi and Bolasco's ''Multiway Data Analysis''.<ref name= Coppi1989>
{{cite book
|editor1-last=Coppi|editor1-first=R.
|editor2-last=Bolasco|editor2-first=S.
|title=Multiway Data Analysis
|publisher=North-Holland
|___location=Amsterdam
|year=1989
|isbn=9780444874108
}}</ref> At that time, the application areas for multiway analysis included [[statistics]], [[econometrics]] and [[psychometrics]]. In recent years, applications have expanded to include [[chemometrics]], [[agriculture]], [[social network analysis]] and the [[food industry]].<ref name=Bro1998>
{{cite thesis
|url=http://curis.ku.dk/ws/files/13035961/rasmus_bro.pdf
|title=Multi-way Analysis in the Food Industry: Models, Algorithms, and Applications
|first=Rasmus|last=Bro
|degree=Ph.D.
|publisher=[[University of Amsterdam]]
|date=20 November 1998
}}</ref>
 
In 1988, an Italian-Dutch-French-English consortium organized an international meeting on multiway data analysis in Rome.<ref>Pieter M. Kroonenberg, ''Applied Multiway Data Analysis'', Wiley, 2008, p. xv.</ref>
 
== Composition of multiway data analysis ==
Multiway data analysts use the term ''way'' to refer to the number of sources of data variation, while reserving the word ''mode'' for the methods or models used to analyze the data.<ref name=Kroonenberg2008>
{{cite book
|page=xv
|title=Applied Multiway Data Analysis
|volume=702
|series=Wiley Series in Probability and Statistics
|first=Pieter M.|last=Kroonenberg
|publisher=John Wiley & Sons
|year=2008
|isbn=9780470237991
}}</ref>{{rp|xviii}}
 
===Multiway data===
In this sense, the various ''ways'' of data can be defined as follows:
* ''One-way data'': A data point with <math>I_0</math> dimensions, <math>{\bf a}\in {\mathbb C}^{I_0}</math>, is a [[Vector (mathematics and physics)|vector]] that is stored in a ''one-way array'' data structure.
* ''Two-way data'': A collection of <math>I_1</math> data points <math>{\bf a}\in {\mathbb C}^{I_0}</math> is stored in a ''two-way array'', <math>{\bf A}\in {\mathbb C}^{I_0\times I_1}</math>. A [[spreadsheet]] can be used to visualize such data in the case of discrete dimensions.
* ''Three-way data'': A collection of data points <math>{\bf a}\in {\mathbb C}^{I_0}</math> that has two modes of variation is stored in a ''three-way array'', <math>{\bf A}\in {\mathbb C}^{I_0\times I_1\times I_2}</math>. Such data might represent the temperature at different locations (two-way data) sampled at different times (leading to three-way data).
* ''Four-way data'', using the same spreadsheet analogy, can be represented as a file folder full of separate workbooks.
* ''Five-way data'' and ''six-way data'' can be represented by similarly higher levels of data aggregation.
 
In general, multiway data is stored in a multiway array. It may be measured at different times or in different places, using different methodologies, and may contain inconsistencies such as missing data or discrepancies in data representation.
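
The definitions above can be illustrated with a minimal sketch in Python using NumPy. The array sizes, the variable names and the use of real-valued random numbers (rather than complex values) are assumptions made for illustration only; missing data is marked here with NaN, which is one common convention among several.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes for the three ways (assumed values, not from the article).
I0, I1, I2 = 4, 3, 5

# One-way data: a single data point with I0 measurements.
a = rng.normal(size=I0)

# Two-way data: I1 data points, stored as the columns of an I0 x I1 array.
A2 = rng.normal(size=(I0, I1))

# Three-way data: the two-way array sampled I2 times,
# e.g. measurements at I1 locations repeated over I2 time steps.
A3 = rng.normal(size=(I0, I1, I2))

# Multiway data may be incomplete; mark one missing entry with NaN.
A3[0, 1, 2] = np.nan
n_missing = int(np.isnan(A3).sum())

for name, arr in [("one-way", a), ("two-way", A2), ("three-way", A3)]:
    print(f"{name}: {arr.ndim} way(s), shape {arr.shape}")
print("missing entries in three-way array:", n_missing)
```

The number of ways corresponds to the number of array axes (<code>ndim</code>), while the size of each way is given by the corresponding entry of <code>shape</code>.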
As a simple informal analogy, the different orders of multiway data can be pictured as follows:
* One-way [[data]] looks like a [[line chart]], which displays information as a series of data points connected by straight line segments.
* Two-way [[data]] looks like a [[spreadsheet]] containing [[row and column spaces|rows and columns]].
* Three-way [[data]] looks like a set of [[spreadsheets]], such as an Excel workbook containing multiple sheets, each with rows and columns.
* Four-way [[data]] looks like a file folder containing multiple Excel workbooks.
* Five-way [[data]] looks like a [[company]] that has invested in [[IT infrastructure]] to connect different [[data storage devices]], managing multiple file folders together with multiple [[database]]s and [[file formats]] (e.g. Excel, [[MS SQL]], [[Sybase]], DB2, [[ISAM]]).
* Six-way [[data]] looks like a [[multinational corporation]] with multiple [[subsidiaries]] operating multiple [[business]]es across multiple [[geographical region]]s, whose [[management team]] has decided to deploy this multiway data from existing heterogeneous [[IT infrastructure]] to [[cloud infrastructure]].
 
===Multiway model===
Multiway data can be transformed upward or downward to different levels by using suitable models. However, transforming the above example of six-way [[data]] downward to two-way [[data]], and ultimately to one-way [[data]], in order to obtain relevant [[information]] for the [[management team]] encounters multifaceted problems, even though multiple [[methodologies]] are available.
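
One simple instance of such a downward transformation is ''unfolding'' (matricization), in which a three-way array is rearranged into a two-way array, or flattened into a one-way array, without losing any entries. The following sketch is illustrative only: the helper names <code>unfold</code> and <code>fold</code> are hypothetical, and the column ordering used here is one of several conventions found in the literature.

```python
import numpy as np

def unfold(tensor, mode):
    """Rearrange a multiway array into a two-way array.

    Rows index the chosen way `mode`; columns index all remaining ways.
    """
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold`: rebuild the multiway array of the given shape."""
    rest = [size for axis, size in enumerate(shape) if axis != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)

# A small three-way array with sizes 2 x 3 x 4.
A3 = np.arange(24, dtype=float).reshape(2, 3, 4)

A2 = unfold(A3, 1)               # downward to two-way: 3 rows, 2 * 4 = 8 columns
a1 = A3.reshape(-1)              # downward to one-way: 24 entries
A3_back = fold(A2, 1, A3.shape)  # upward again to three-way

print(A2.shape, a1.shape, np.array_equal(A3_back, A3))
```

Because unfolding only rearranges entries, it is reversible. Models that summarize or aggregate the data while moving between levels are generally not reversible.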
 
===Multiway application===
Multiway data analysis can be employed in a variety of applications to address the problem of finding hidden multilinear structure in multiway datasets. Examples of applications in different fields include:<ref>
{{cite thesis
|url=http://www.cs.rpi.edu/research/pdf/07-06.pdf
|title=Unsupervised Multiway Data Analysis: A Literature Survey
|first1=Evrim|last1=Acar
|first2=Bulent|last2=Yener
|publisher=[[Rensselaer Polytechnic Institute]]
}}</ref>
 
* [[Computer vision]] - TensorFaces<ref name=Vasilescu2002Tensorfaces>{{cite journal
|first=M.A.O. |last=Vasilescu
|first2=D. |last2=Terzopoulos
|url=http://www.cs.toronto.edu/~maov/tensorfaces/Springer%20ECCV%202002_files/eccv02proceeding_23500447.pdf
|title=Multilinear Analysis of Image Ensembles: TensorFaces
|series=Lecture Notes in Computer Science 2350; (Presented at Proc. 7th European Conference on Computer Vision (ECCV'02), Copenhagen, Denmark)
|publisher=Springer, Berlin, Heidelberg
|doi=10.1007/3-540-47969-4_30
|isbn=978-3-540-43745-1
|year=2002
}}</ref><ref name="MPCA-MICA2005">M.A.O. Vasilescu, D. Terzopoulos (2005) [http://www.media.mit.edu/~maov/mica/mica05.pdf "Multilinear Independent Component Analysis"], "Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, June 2005, vol.1, 547–553."</ref> and Human motion signatures<ref name="Vasilescu2002b">M.A.O. Vasilescu (2002) [http://www.media.mit.edu/~maov/motionsignatures/hms_icpr02_corrected.pdf "Human Motion Signatures: Analysis, Synthesis, Recognition," Proceedings of International Conference on Pattern Recognition (ICPR 2002), Vol. 3, Quebec City, Canada, Aug, 2002, 456–460.]</ref> analyze facial images and human joint angle data, respectively, organized in a multiway array. Multiway data analysis is employed to compute a set of causal factor representations.<ref name="Vasilescu2002tensorfaces">{{cite conference
|first=M.A.O. |last=Vasilescu
|first2=Eric |last2=Kim
|first3=Xiao |last3=Zeng
|url=http://www.cs.toronto.edu/~maov/tensorfaces/Springer%20ECCV%202002_files/eccv02proceeding_23500447.pdf
|title=CausalX: Causal eXplanations and Block Multilinear Factor Analysis
|conference=Proceedings of the 25th International Conference on Pattern Recognition (ICPR 2020)
|___location=Milan, Italy
|pages=10736-10743
|year=2021
}}</ref>
* [[Electroanalytical chemistry]]
* [[Neuroscience]]
* [[Social network analysis]]/web-mining
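
As an illustration of what finding multilinear structure can involve, the following sketch computes a higher-order singular value decomposition of a small three-way array with NumPy. It is a generic textbook-style construction chosen here for illustration, not the specific method of any of the works cited above; the function names and the random test data are assumptions.

```python
import numpy as np

def unfold(tensor, mode):
    """Two-way view of a multiway array: rows index way `mode`."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply a multiway array by a matrix along the way `mode`."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))  # contracted way comes out first
    return np.moveaxis(out, 0, mode)                    # put it back in position

def hosvd(tensor):
    """Return a core array and one orthonormal factor matrix per way."""
    factors = [
        np.linalg.svd(unfold(tensor, mode), full_matrices=False)[0]
        for mode in range(tensor.ndim)
    ]
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.conj().T, mode)
    return core, factors

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3, 5))  # assumed example data

core, factors = hosvd(A)

# Rebuild the array from the core and the factors to check the decomposition.
A_rec = core
for mode, U in enumerate(factors):
    A_rec = mode_product(A_rec, U, mode)

max_error = float(np.abs(A - A_rec).max())
print("core shape:", core.shape, "max reconstruction error:", max_error)
```

Each factor matrix describes the variation along one way of the array, and the core array describes how those factors interact. Truncating the factor matrices would give a compressed, approximate representation.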
 
The following multiway applications are not new as fields of study and practice:

* Auditing and assurance
* Financial and management accounting

However, participants in these two fields seldom describe their research studies and professional publications using the term "multiway data analysis". The following are some related terms associated with multiway data analysis:

'''Classified by IT terminology:'''

* Aggregation and consolidation
* Tagging and denormalization
* System integration
* Data mapping

'''Classified by marketing terminology:'''

* [[Business Intelligence]] (BI)
* [[Business Process Management]] (BPM)
* [[Enterprise Performance Management]] (EPM)
* [[Extract, transform, load]] (ETL)
* [[Record to report]] (R2R)

=== Multiway processing ===
Multiway processing is the execution of designed and determined multiway models that transform multiway data to the desired level, addressing the specific needs of a particular multiway application. A typical example is the processing of data generated with a potentiometric electronic tongue.<ref>
{{cite journal
|first1=Raul|last1=Cartas
|first2=Aitor|last2=Mimendia
|first3=Andrey|last3=Legin
|first4=Manel|last4=del Valle
|title=Multiway Processing of Data Generated with a Potentiometric Electronic Tongue in a SIA System
|year=2011
|journal=Electroanalysis
|volume=23
|issue=4
|pages=953–961
|doi=10.1002/elan.201000642
}}</ref>

== See also ==
*[[AVESTA computing issues]] of financial and management accounting
*[[List of data structures]]
*[[Mathematics]]
*[[Multilinear subspace learning]]