Non-negative matrix factorization
| bibcode = 1995AtmEn..29.1705A
}}</ref>
It became more widely known as ''non-negative matrix factorization'' after Lee and [[Sebastian Seung|Seung]] investigated the properties of the algorithm and published some simple and useful
algorithms for two types of factorizations.<ref name="lee-seung">{{Cite journal
| author = Daniel D. Lee
 
== Clustering property ==
NMF has an inherent clustering property,<ref name="DingSDM2005" /> i.e., it automatically clusters the columns of input data <math>\mathbf{V} = (v_1, \dots, v_n) </math>.
 
More specifically, the approximation of <math>\mathbf{V}</math> by <math>\mathbf{V} \simeq \mathbf{W}\mathbf{H}</math> is achieved by finding <math>W</math> and <math>H</math> that minimize the error function
 
<math> \left\| V - WH \right\|_F,</math> subject to <math>W \geq 0, H \geq 0.</math>
 
If we furthermore impose an orthogonality constraint on <math> \mathbf{H} </math>,
i.e. <math>\mathbf{H}\mathbf{H}^T = I</math>, then the above minimization is mathematically equivalent to the minimization of [[K-means clustering]].<ref name="DingSDM2005" />
 
Furthermore, the computed <math>H</math> gives the cluster membership, i.e.,
if <math>\mathbf{H}_{kj} > \mathbf{H}_{ij} </math> for all ''i'' ≠ ''k'', this suggests that
the input data <math> v_j </math>
belongs to the <math>k</math>-th cluster.
The computed <math>W</math> gives the cluster centroids, i.e.,
the <math>k</math>-th column
gives the cluster centroid of
the <math>k</math>-th cluster. This centroid's representation can be significantly enhanced by convex NMF.
 
When the orthogonality constraint <math> \mathbf{H}\mathbf{H}^T = I </math> is not explicitly imposed, the orthogonality holds to a large extent, and the clustering property holds too. Clustering is the main objective of most [[data mining]] applications of NMF.{{citation needed|date=April 2015}}
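As an illustration, the following minimal sketch performs NMF-based clustering with the <code>NMF</code> class from [[scikit-learn]] (an implementation choice for this example, not part of the method itself). scikit-learn factorizes row-wise, with data points as rows, so the input matrix is transposed to match the column-wise convention above; each <math>v_j</math> is then assigned to the cluster with the largest coefficient.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
V = np.abs(rng.normal(size=(40, 200)))  # synthetic data; columns are the points v_1, ..., v_n

k = 4
model = NMF(n_components=k, init='nndsvd', max_iter=500)
# scikit-learn computes X ~ W_sk H_sk with samples as rows, so V^T is passed;
# W_sk then plays the role of H^T in the column-wise convention of this section.
W_sk = model.fit_transform(V.T)     # shape (n, k): one row per column of V
labels = np.argmax(W_sk, axis=1)    # cluster of each v_j: index of the largest coefficient
centroids = model.components_.T     # shape (m, k): columns are the cluster centroids
</syntaxhighlight>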
 
=== Convex non-negative matrix factorization ===
In standard NMF, matrix factor {{math|'''W''' ∈ '''R'''<sub>+</sub><sup>''m'' × ''k''</sup>}}, i.e., {{math|'''W'''}} can be anything in that space. Convex NMF<ref name="ding">C Ding, T Li, MI Jordan, Convex and semi-nonnegative matrix factorizations, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 45-55, 2010</ref> restricts the columns of {{math|'''W'''}} to [[convex combination]]s of the input data vectors <math> (v_1, \dots, v_n) </math>. This greatly improves the quality of data representation of {{math|'''W'''}}. Furthermore, the resulting matrix factor {{math|'''H'''}} becomes more sparse and orthogonal.
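As a rough illustration of the constraint (not the multiplicative algorithm of Ding, Li and Jordan), the following sketch parametrizes <math>\mathbf{W} = \mathbf{V}\mathbf{G}</math> with a non-negative, column-stochastic <math>\mathbf{G}</math>, so that every column of <math>\mathbf{W}</math> is a convex combination of the data vectors; the alternating update scheme, step size, and iteration count are arbitrary choices for the example and may need tuning.

<syntaxhighlight lang="python">
import numpy as np

def convex_nmf(V, k, n_iter=200, eta=1e-3, seed=0):
    """Illustrative convex NMF via alternating updates (not Ding et al.'s algorithm)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    eps = 1e-10
    G = rng.random((n, k))
    G /= G.sum(axis=0, keepdims=True)          # columns of G are convex-combination weights
    H = rng.random((k, n))
    for _ in range(n_iter):
        W = V @ G                              # columns of W lie in the convex hull of v_1..v_n
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative update for H, W fixed
        grad_G = -2 * V.T @ (V - V @ G @ H) @ H.T
        G = np.maximum(G - eta * grad_G, 0)    # projected gradient step for G
        G /= G.sum(axis=0, keepdims=True) + eps
    return V @ G, H
</syntaxhighlight>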
 
=== Nonnegative rank factorization ===
The factorization problem in the squared error version of NMF may be stated as:
Given a matrix <math>\mathbf{V}</math>, find nonnegative matrices <math>\mathbf{W}</math> and <math>\mathbf{H}</math> that minimize the function
: <math>F(\mathbf{W},\mathbf{H}) = \left\|\mathbf{V} - \mathbf{WH} \right\|^2_F</math>
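A minimal sketch of the multiplicative update rules published by Lee and Seung for this squared-error objective is given below; the small constant <code>eps</code> guards against division by zero and is an implementation detail, not part of the published rules.

<syntaxhighlight lang="python">
import numpy as np

def nmf_multiplicative(V, k, n_iter=500, eps=1e-10, seed=0):
    """Lee-Seung multiplicative updates for F(W, H) = ||V - WH||_F^2."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H with W held fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W with H held fixed
    return W, H
</syntaxhighlight>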
 
Another type of NMF for images is based on the [[total variation norm]].<ref>{{Cite journal | last1 = Zhang | first1 = T. | last2 = Fang | first2 = B. | last3 = Liu | first3 = W. | last4 = Tang | first4 = Y. Y. | last5 = He | first5 = G. | last6 = Wen | first6 = J. | doi = 10.1016/j.neucom.2008.01.022 | title = Total variation norm-based nonnegative matrix factorization for identifying discriminant representation of image patterns | journal = [[Neurocomputing (journal)|Neurocomputing]]| volume = 71 | issue = 10–12 | pages = 1824–1831| year = 2008 }}</ref>