Higher-order singular value decomposition: Difference between revisions

Content deleted Content added
No edit summary
No edit summary
Line 1:
{{Short description|Tensor decomposition}}
In [[multilinear algebra]], the '''higher-order singular value decomposition''' ('''HOSVD''') is a misnomer. There does not exist a single tensor decomposition that retains all the defining properties of the matrix SVD. The matrix SVD simultaneously yields a ''rank-𝑅'' decomposition and computes orthonormal subspaces for the row and column spaces and a diagonal matrix. These properties are not realized within a single algorithm for higher-order tensors, but are instead realized by two distinct algorithmic developments and represent two distinct research directions. Harshman, as well as, the team of Carol and Chang proposed [[Canonical polyadic decomposition]] (CPD), which is a variant of the [[tensor rank decomposition]], in which a tensor is approximated as a sum of ''K rank-1'' tensors for a user-specified ''K''. [[L. R. Tucker]] proposed a strategy for computing orthonormal subspaces for third order tensors. Aspecsts of these algorithms can be traced as far back as [[F. L. Hitchcock]] in 1928.<ref name=":0">{{Cite journal|last=Hitchcock|first=Frank L|date=1928-04-01|title=Multiple Invariants and Generalized Rank of a M-Way Array or Tensor|journal=Journal of Mathematics and Physics|language=en|volume=7|issue=1–4|pages=39–79|doi=10.1002/sapm19287139|issn=1467-9590}}</ref>
 
 
[[Lieven De Lathauwer |De Lathauwer]] ''et al.''<ref name=":2">{{Cite journal|last1=De Lathauwer|first1=L.|last2=De Moor|first2=B.|last3=Vandewalle|first3=J.|date=2000-01-01|title=On the Best Rank-1 and Rank-(R1 ,R2 ,. . .,RN) Approximation of Higher-Order Tensors|journal=SIAM Journal on Matrix Analysis and Applications|volume=21|issue=4|pages=1324–1342|doi=10.1137/S0895479898346995|issn=0895-4798|citeseerx=10.1.1.102.9135}}</ref><ref name="DeLathauwer00">{{Cite journal|last1=De Lathauwer|first1=L.|last2=De Moor|first2=B.|last3=Vandewalle|first3=J.|date=2000-01-01|title=A Multilinear Singular Value Decomposition|journal=SIAM Journal on Matrix Analysis and Applications|volume=21|issue=4|pages=1253–1278|doi=10.1137/s0895479896305696|issn=0895-4798|citeseerx=10.1.1.102.9135}}</ref> introduced clarity to the Tucker's concepts, while [[Vasilescu]] and [[Demetri Terzopoulos| Terzopoulos]]<ref name=":Vasilescu2002">{{cite conference
|author=M. A. O. Vasilescu, D. Terzopoulos
|title=Multilinear Analysis of Image Ensembles: TensorFaces
Line 20:
|book-title=Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR’05)
|___location=San Diego, CA
|year=2005}}</ref> introduced algorithmic clarity. Vasilescu and Terzopoulos
|year=2005}}</ref> introduced algorithmic clarity. Vasilescu and Terzopoulos introduced the '''M-mode SVD''' which is currently referred in the literature as the '''Tucker''' or the '''HOSVD'''. However, the '''Tucker''' algorithm, and De Lathauwer ''et al.'''s companion algorithm<ref name=":2"/> are sequential, relying on iterative methods such as gradient descent or the power method, respectively. Vasilescu and Terzopoulos synthesized a set of ideas into an elegant two-step algorithm that can be executed sequentially or in parallel, whose simplicity belies the complexity it resolves. The term '''M-mode SVD''' accurately reflects the algorithm employed without overpromising; it captures the actual computation (a set of SVDs on mode-flattenings) without making assumptions about the structure of the core tensor or implying a rank decomposition.
 
: This misattribution has had lasting impact on the scholarly record, obscuring the original source of a widely adopted algorithm, and complicating efforts to trace its development, reproduce results, and recognizing the respective contributions of different research efforts.
Line 34 ⟶ 35:
&=& \mathcal{A} \times ({\bf U}_1 {\bf U}_1^H, {\bf U}_2 {\bf U}_2^H, \ldots, {\bf U}_M {\bf U}_M^H) \\
&=& \left(\mathcal{A} \times ({\bf U}_1^H, {\bf U}_2^H, \ldots, {\bf U}_M^H) \right) \times ({\bf U}_1, {\bf U}_2, \ldots, {\bf U}_M),
\end{array}</math>where <math>\cdot^H</math> denotes the [[conjugate transpose]]. The second equality is because the <math>{\bf U}_m</math>'s are unitary matrices. Define now the '''core tensor'''<math display="block">\mathcal{S} := \mathcal{A} \times ({\bf U}_1^H, {\bf U}_2^H, \ldots, {\bf U}_M^H).</math>Then, the M-mode SVD(HOSVD)<ref name=":2" /> of <math>\mathcal{A}</math> is the decomposition<math display="block">\mathcal{A} = \mathcal{S}\times ({\bf U}_1, {\bf U}_2, \ldots, {\bf U}_M).</math> The above construction shows that every tensor has a M-mode SVD(HOSVD).
 
== Compact M-mode SVD (HOSVD) ==
As in the case of the [[Singular value decomposition#Compact SVD|compact singular value decomposition]] of a matrix, where the rows and columns corresponding to vanishing singular values are dropped, it is also possible to consider a '''compact HOSVDM-mode SVD'''(HOSVD), which is very useful in applications.
 
Assume that <math>{\bf U}_m \in \mathbb{C}^{I_m \times R_m}</math> is a matrix with unitary columns containing a basis of the left singular vectors corresponding to the nonzero singular values of the standard factor-''m'' flattening <math>\mathcal{A}_{[m]}</math> of <math>\mathcal{A}</math>. Let the columns of <math>{\bf U}_m</math> be sorted such that the <math>r_m</math> th column <math>{\bf u}_{r_m}</math> of <math>{\bf U}_m</math> corresponds to the ''<math>r_m</math>''th largest nonzero singular value of <math>\mathcal{A}_{[m]}</math>. Since the columns of <math>{\bf U}_m</math> form a basis for the image of <math>\mathcal{A}_{[m]}</math>, we have<math display="block">\mathcal{A}_{[m]} = {\bf U}_m {\bf U}_m^H \mathcal{A}_{[m]} = \bigl( \mathcal{A} \times_m ({\bf U}_m {\bf U}_m^H) \bigr)_{[m]},</math>where the first equality is due to the properties of [[Projection (linear algebra)|orthogonal projections]] (in the Hermitian inner product) and the last equality is due to the properties of multilinear multiplication. As flattenings are bijective maps and the above formula is valid for all <math>m=1,2,\ldots,m,\ldots,M</math>, we find as before that<math display="block">\begin{array}{rcl}
Line 49 ⟶ 50:
The '''multilinear rank'''<ref name=":0" /> of <math>\mathcal{A}</math> is denoted with rank-<math>(R_1, R_2, \ldots, R_M) </math>. The multilinear rank is a tuple in <math>\mathbb{N}^M</math> where <math>R_m := \mathrm{rank}( \mathcal{A}_{[m]} )</math>. Not all tuples in <math>\mathbb{N}^M</math> are multilinear ranks.<ref name=":3">{{Cite journal|last1=Carlini|first1=Enrico|last2=Kleppe|first2=Johannes|title=Ranks derived from multilinear maps|journal=Journal of Pure and Applied Algebra|volume=215|issue=8|pages=1999–2004|doi=10.1016/j.jpaa.2010.11.010|year=2011|doi-access=free}}</ref> The multilinear ranks are bounded by <math>1 \le R_m \le I_m</math> and it satisfy the constraint <math display="inline">R_m \le \prod_{i \ne m} R_i</math> must hold.<ref name=":3" />
 
The compact M-mode SVD(HOSVD) is a rank-revealing decomposition in the sense that the dimensions of its core tensor correspond with the components of the multilinear rank of the tensor.
 
== Interpretation ==
The following geometric interpretation is valid for both the full and compact M-mode SVD(HOSVD). Let <math>(R_1, R_2, \ldots, R_M)</math> be the multilinear rank of the tensor <math>\mathcal{A}</math>. Since <math>\mathcal{S} \in {\mathbb C}^{R_1 \times R_2 \times \cdots \times R_M}</math> is a multidimensional array, we can expand it as follows<math display="block">\mathcal{S} = \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} \cdots \sum_{r_M=1}^{R_M} s_{r_1,r_2,\ldots,r_M} \mathbf{e}_{r_1} \otimes \mathbf{e}_{r_2} \otimes \cdots \otimes \mathbf{e}_{r_M},</math>where <math>\mathbf{e}_{r_m}</math> is the <math>r_m</math>th standard basis vector of <math>{\mathbb C}^{I_m}</math>. By definition of the multilinear multiplication, it holds that<math display="block">\mathcal{A} = \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} \cdots \sum_{r_M=1}^{R_M} s_{r_1,r_2,\ldots,r_M}
\mathbf{u}_{r_1} \otimes \mathbf{u}_{r_2} \otimes \cdots \otimes \mathbf{u}_{r_M},</math>where the <math>\mathbf{u}_{r_m}</math> are the columns of <math>{\bf U}_m \in {\mathbb C}^{I_m \times R_m}</math>. It is easy to verify that <math>B = \{ \mathbf{u}_{r_1} \otimes \mathbf{u}_{r_2} \otimes \cdots \otimes \mathbf{u}_{r_M} \}_{r_1,r_2,\ldots,r_M}</math> is an orthonormal set of tensors. This means that the M-mode SVD(HOSVD) can be interpreted as a way to express the tensor <math>\mathcal{A}</math> with respect to a specifically chosen orthonormal basis <math>B</math> with the coefficients given as the multidimensional array <math>\mathcal{S}</math>.
 
== Computation ==
Line 59 ⟶ 60:
 
=== Classic computation ===
While De Lathauwer et al. clarified Tucker’s framework through two influential papers, Vasilescu and Terzopoulos synthesized the ideas into an elegant two-step algorithm—one whose simplicity belies the complexity it resolves. The Tucker/Kroonenberg algorithm and De Lathauwer ''et al.''<ref name=:2/> algorithmscompanion algorithm are sequential, relying on iterative methods such as gradient descent or the power method. In contrast, the M-mode SVD is a closed-form solution that can be computed sequentially, but it is also well-suited for parallel computation.
 
=== M-mode SVD (also referred to as HOSVD or Tucker)===
Line 89 ⟶ 90:
A simple idea for trying to solve this optimization problem is to truncate the (compact) SVD in step 2 of either the classic or the interlaced computation. A '''classically''' '''truncated M-mode SVD/HOSVD''' is obtained by replacing step 2 in the classic computation by
* Compute a rank-<math>\bar R_m </math> truncated SVD <math>\mathcal{A}_{[m]} \approx U_m \Sigma_m V^T_m </math>, and store the top <math>\bar R_m </math> left singular vectors <math>U_m \in F^{I_m \times \bar R_m}</math>;
while a '''sequentially truncated M-mode SVD (HOSVD)''' (or '''successively truncated M-mode SVD/(HOSVD)''') is obtained by replacing step 2 in the interlaced computation by
* Compute a rank-<math>\bar R_m </math> truncated SVD <math>\mathcal{A}_{[m]}^{m-1} \approx U_m \Sigma_m V^T_m </math>, and store the top <math>\bar R_m </math> left singular vectors <math>U_m \in F^{I_m \times \bar R_m}</math>. Unfortunately, truncation does not result in an optimal solution for the best low multilinear rank optimization problem,.<ref name=":2" /><ref name=":Vasilescu2002"/><ref name=":4" /><ref name=":fist_hosvd" /> However, both the classically and interleaved truncated M-mode SVD/HOSVD result in a '''quasi-optimal''' solution:<ref name=":4" /><ref name=":fist_hosvd" /><ref name=":5" /><ref>{{Cite journal|last=Grasedyck|first=L.|date=2010-01-01|title=Hierarchical Singular Value Decomposition of Tensors|journal=SIAM Journal on Matrix Analysis and Applications|volume=31|issue=4|pages=2029–2054|doi=10.1137/090764189|issn=0895-4798|citeseerx=10.1.1.660.8333}}</ref> if <math>\mathcal{\bar A}_t </math> denotes the classically or sequentially truncated M-mode SVD(HOSVD) and <math>\mathcal{\bar A}^* </math> denotes the optimal solution to the best low multilinear rank approximation problem, then<math display="block">\| \mathcal{A} - \mathcal{\bar A}_t \|_F \le \sqrt{M} \| \mathcal{A} - \mathcal{\bar A}^* \|_F; </math>in practice this means that if there exists an optimal solution with a small error, then a truncated M-mode SVD/HOSVD will for many intended purposes also yield a sufficiently good solution.
 
== Applications ==
Line 103 ⟶ 104:
It is also used in [[tensor product model transformation]]-based controller design.<ref name="Baranyi042">{{cite journal|author=P. Baranyi|date=April 2004|title=TP model transformation as a way to LMI based controller design|journal=IEEE Transactions on Industrial Electronics|volume=51|pages=387&ndash;400|doi=10.1109/tie.2003.822037|number=2|s2cid=7957799}}</ref><ref name="compind2">{{cite journal|author1=P. Baranyi|author2=D. Tikk|author3=Y. Yam|author4=R. J. Patton|year=2003|title=From Differential Equations to PDC Controller Design via Numerical Transformation|journal=Computers in Industry|volume=51|issue=3|pages=281&ndash;297|doi=10.1016/s0166-3615(03)00058-7}}</ref>
 
The concept of M-mode SVD (HOSVD) was carried over to functions by Baranyi and Yam via the [[TP model transformation]].<ref name="Baranyi042" /><ref name="compind2" /> This extension led to the definition of the HOSVD-based canonical form of tensor product functions and Linear Parameter Varying system models<ref name="canon12">{{cite conference|title=Definition of the HOSVD-based canonical form of polytopic dynamic models|author1=P. Baranyi|author2=L. Szeidl|author3=P. Várlaki|author4=Y. Yam|date=July 3–5, 2006|___location=Budapest, Hungary|pages=660–665|conference=3rd International Conference on Mechatronics (ICM 2006)}}</ref> and to convex hull manipulation based control optimization theory, see [[TP model transformation in control theories]].
 
M-mode SVD (HOSVD) was proposed to be applied to multi-view data analysis in an unsupervised manner<ref>{{Cite journal|author1=Y-h. Taguchi|date=August 2017|title=Tensor decomposition-based unsupervised feature extraction applied to matrix products for multi-view data processing|journal=PLOS ONE|volume=12|issue=8|pages=e0183933|doi=10.1371/journal.pone.0183933|pmc=5571984|pmid=28841719|bibcode=2017PLoSO..1283933T|doi-access=free}}</ref> and was successfully applied to in silico drug discovery from gene expression.<ref>{{Cite journal|author1=Y-h. Taguchi|date=October 2017|title=Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets|journal=Scientific Reports|volume=7|issue=1|pages=13733|doi=10.1038/s41598-017-13003-0|pmc=5653784|pmid=29062063|bibcode=2017NatSR...713733T}}</ref>
 
== Robust L1-norm variant ==
L1-Tucker is the [[Lp space|L1-norm]]-based, [[Robust statistics|robust]] variant of [[Tucker decomposition]].<ref name="l1tucker"/><ref name="l1tucker3"/> L1-HOSVD is the analogous of M-mode SVD(HOSVD) for the solution to L1-Tucker.<ref name="l1tucker"/><ref name="l1tucker2"/>
 
== References ==