A '''CUR matrix approximation''' is three matrices that, when multiplied together, closely approximate a given matrix.<ref name=mahoney>{{cite web|title=CUR matrix decompositions for improved data analysis|url=http://www.pnas.org/content/106/3/697.full|accessdate=26 June 2012|author=Michael W. Mahoney|author2=Petros Drineas}}</ref> A CUR approximation can be used in the same way as the low-rank approximation of the [[singular value decomposition]] (SVD). CUR approximations are less accurate than the SVD, but because the rows and columns come from the original matrix (rather than being left and right singular vectors), the CUR approximation is often easier for users to comprehend.
 
Formally, a CUR matrix approximation of a matrix A is three matrices C, U, and R such that C is made from columns of A, R is made from rows of A, and the product CUR closely approximates A. Usually the CUR is selected to be a [[Rank (linear algebra)|rank]]-k approximation, which means that C contains k columns of A, R contains k rows of A, and U is a k-by-k matrix. There are many possible CUR matrix approximations, and many CUR matrix approximations for a given rank.
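
As a minimal illustrative sketch (not a specific published algorithm), one simple way to build such a factorization is to sample k columns and k rows uniformly at random and then choose U via pseudoinverses so that CUR best fits A in the least-squares sense; practical CUR algorithms typically use more careful column/row selection, such as leverage-score sampling.

<syntaxhighlight lang="python">
import numpy as np

def cur_approximation(A, k, seed=None):
    """Rank-k CUR sketch: sample k columns and k rows of A uniformly at random,
    then set U = pinv(C) @ A @ pinv(R), the least-squares optimal core matrix
    for the chosen C and R. Uniform sampling is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    cols = rng.choice(n, size=k, replace=False)  # indices of sampled columns
    rows = rng.choice(m, size=k, replace=False)  # indices of sampled rows
    C = A[:, cols]                               # k actual columns of A
    R = A[rows, :]                               # k actual rows of A
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)  # k-by-k core matrix
    return C, U, R
</syntaxhighlight>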
 
The CUR matrix approximation is often used in place of the low-rank approximation of the SVD in [[principal component analysis]] (PCA). The CUR is less accurate, but the columns of C and the rows of R are taken directly from A. In PCA, each column of A contains a data sample, so the matrix C consists of a subset of data samples. This is much easier to interpret than the SVD's left singular vectors, which represent the data in a rotated space. Similarly, the matrix R is made of a subset of the variables measured for each data sample. This is easier to comprehend than the SVD's right singular vectors, which are another rotation of the data in space.
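
The trade-off can be seen in a small self-contained sketch using synthetic data and the uniform-sampling construction above (both hypothetical choices): the truncated SVD gives a smaller reconstruction error, but the factors C and R of the CUR approximation remain actual columns (data samples) and rows (variables) of A.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 200))   # synthetic data: 200 samples, each with 50 variables
k = 10

# CUR: k actual columns (data samples) and k actual rows (variables) of A
cols = rng.choice(A.shape[1], size=k, replace=False)
rows = rng.choice(A.shape[0], size=k, replace=False)
C, R = A[:, cols], A[rows, :]
U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)

# Rank-k truncated SVD for comparison (optimal in the Frobenius norm)
W, s, Vt = np.linalg.svd(A, full_matrices=False)
A_svd = (W[:, :k] * s[:k]) @ Vt[:k, :]

print("CUR error:", np.linalg.norm(A - C @ U @ R))  # larger, but interpretable factors
print("SVD error:", np.linalg.norm(A - A_svd))      # smaller, but rotated basis
</syntaxhighlight>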