{{Short description|Algorithms for matrix decomposition}}
{{Machine learning|Dimensionality reduction}}
[[File:NMF.png|thumb|400px|Illustration of approximate non-negative matrix factorization: the matrix {{math|'''V'''}} is represented by the two smaller matrices {{math|'''W'''}} and {{math|'''H'''}}, which, when multiplied, approximately reconstruct {{math|'''V'''}}.]]
 
i.e. <math> \mathbf{H}\mathbf{H}^T = I </math>, then the above minimization is mathematically equivalent to the minimization of the [[K-means clustering]] objective.<ref name="DingSDM2005" />
 
Furthermore, the computed <math>H</math> gives the cluster membership, i.e., if <math>\mathbf{H}_{kj} > \mathbf{H}_{ij} </math> for all ''i'' ≠ ''k'', this suggests that the input data <math> v_j </math> belongs to the <math>k</math>-th cluster. The computed <math>W</math> gives the cluster centroids, i.e., the <math>k</math>-th column gives the cluster centroid of the <math>k</math>-th cluster. This centroid's representation can be significantly enhanced by convex NMF.
 
When the orthogonality constraint <math> \mathbf{H}\mathbf{H}^T = I </math> is not explicitly imposed, the orthogonality holds to a large extent, and the clustering property holds too. Clustering is the main objective of most [[data mining]] applications of NMF.{{citation needed|date=April 2015}}
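
For illustration, the clustering interpretation above can be sketched with scikit-learn's <code>NMF</code>; the random data and the number of components here are hypothetical, and scikit-learn arranges samples as rows, so its coefficient matrix plays the role of {{math|'''H'''}}<sup>T</sup> in the notation above.

<syntaxhighlight lang="python">
# Minimal sketch of NMF-based clustering (hypothetical random data).
# In the notation above, V ~ W H with data points as columns of V; scikit-learn
# factors X ~ W_sk H_sk with samples as rows, so W_sk corresponds to H^T
# (cluster memberships) and the rows of H_sk act as cluster centroids.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((100, 20))              # non-negative data: 100 samples, 20 features

k = 3                                  # number of clusters / components (hypothetical)
model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
memberships = model.fit_transform(X)   # shape (100, k): per-sample component weights
centroids = model.components_          # shape (k, 20): rows act as cluster centroids

labels = memberships.argmax(axis=1)    # assign each sample to its strongest component
print(labels[:10])
</syntaxhighlight>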
 
When the error function is the [[Kullback–Leibler divergence]], NMF is identical to [[probabilistic latent semantic analysis]] (PLSA), a popular document clustering method.<ref>{{cite journal |vauthors=Ding C, Li Y, Peng W |url=http://users.cis.fiu.edu/~taoli/pub/NMFpLSIequiv.pdf |title=On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing |archive-url=https://web.archive.org/web/20160304070027/http://users.cis.fiu.edu/~taoli/pub/NMFpLSIequiv.pdf |archive-date=2016-03-04 |url-status=dead |journal=Computational Statistics & Data Analysis |year=2008 |volume=52 |issue=8 |pages=3913–3927|doi=10.1016/j.csda.2008.01.011 }}</ref>
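
As a rough illustration of this PLSA-like setting, the sketch below clusters a hypothetical toy corpus with KL-divergence NMF, using scikit-learn's <code>NMF</code> with <code>beta_loss="kullback-leibler"</code>.

<syntaxhighlight lang="python">
# Sketch of PLSA-like document clustering with KL-divergence NMF.
# The toy corpus and the number of topics are hypothetical.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the cat sat on the mat",
    "dogs and cats are popular pets",
    "stock markets fell sharply today",
    "investors worry about volatile markets",
]
X = CountVectorizer().fit_transform(docs)      # non-negative term-count matrix

# solver="mu" with beta_loss="kullback-leibler" minimizes the generalized KL divergence
model = NMF(n_components=2, solver="mu", beta_loss="kullback-leibler",
            init="random", max_iter=1000, random_state=0)
doc_topic = model.fit_transform(X)             # document-topic weights
topic_term = model.components_                 # topic-term weights

print(doc_topic.argmax(axis=1))                # cluster each document by its dominant topic
</syntaxhighlight>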
 
== Types ==
 
=== Nonnegative rank factorization ===
If the [[nonnegative rank (linear algebra)|nonnegative rank]] of {{math|'''V'''}} is equal to its actual rank, {{math|1='''V''' = '''WH'''}} is called a nonnegative rank factorization (NRF).<ref name=BermanPlemmons74>{{cite journal|last=Berman|first=A.|author2=R.J. Plemmons |title=Inverses of nonnegative matrices|journal=Linear and Multilinear Algebra|year=1974|volume=2|issue=2|pages=161–172|doi=10.1080/03081087408817055}}</ref><ref name=BermanPlemmons94>{{cite book|author1=A. Berman |author2=R.J. Plemmons |title=Nonnegative matrices in the Mathematical Sciences|year=1994|publisher=SIAM|___location=Philadelphia}}</ref><ref name=Thomas74>{{cite journal |last=Thomas|first=L.B.|title=Problem 73-14, Rank factorization of nonnegative matrices|journal=SIAM Rev.|year=1974|volume=16|issue=3|pages=393–394|doi=10.1137/1016064}}</ref> The problem of finding the NRF of {{math|'''V'''}}, if it exists, is known to be NP-hard.<ref name=Vavasis09>{{cite journal|last=Vavasis|first=S.A.|title=On the complexity of nonnegative matrix factorization|journal=SIAM J. Optim.|year=2009|volume=20|issue=3|pages=1364–1377 |doi=10.1137/070709967|arxiv=0708.4149|s2cid=7150400}}</ref>
 
=== Different cost functions and regularizations ===
although it may also still be referred to as NMF.<ref>{{Cite conference | last1 = Hsieh | first1 = C. J. | last2 = Dhillon | first2 = I. S. | doi = 10.1145/2020408.2020577 | title = Fast coordinate descent methods with variable selection for non-negative matrix factorization | conference = Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '11 | pages = 1064| year = 2011 | isbn = 9781450308137 | url = http://www.cs.utexas.edu/~cjhsieh/nmf_kdd11.pdf}}</ref>
 
=== Online NMF ===
Many standard NMF algorithms analyze all the data together; i.e., the whole matrix is available from the start. This may be unsatisfactory in applications where there are too many data to fit into memory or where the data are provided in [[Data stream|streaming]] fashion. One such use is for [[collaborative filtering]] in [[recommender system|recommendation systems]], where there may be many users and many items to recommend, and it would be inefficient to recalculate everything when one user or one item is added to the system. The cost function for optimization in these cases may or may not be the same as for standard NMF, but the algorithms need to be rather different.<ref>http://www.ijcai.org/papers07/Papers/IJCAI07-432.pdf {{Bare URL PDF|date=March 2022}}</ref><ref>{{cite book|url=http://dl.acm.org/citation.cfm?id=1339264.1339709|title=Online Discussion Participation Prediction Using Non-negative Matrix Factorization |first1=Yik-Hing|last1=Fung|first2=Chun-Hung|last2=Li|first3=William K.|last3=Cheung|date=2 November 2007|publisher=IEEE Computer Society|pages=284–287|via=dl.acm.org|isbn=9780769530284|series=Wi-Iatw '07}}</ref><ref>{{Cite journal |author=Naiyang Guan|author2=Dacheng Tao|author3=Zhigang Luo|author4=Bo Yuan|name-list-style=amp|date=July 2012|title=Online Nonnegative Matrix Factorization With Robust Stochastic Approximation|journal=IEEE Transactions on Neural Networks and Learning Systems |issue=7 |doi=10.1109/TNNLS.2012.2197827|pmid=24807135|volume=23|pages=1087–1099|s2cid=8755408}}</ref>
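
The sketch below illustrates one simple scheme of this kind (a toy construction for illustration, not a specific published online NMF algorithm): coefficients for each incoming mini-batch are computed with the current basis fixed, and the basis is then refreshed from accumulated sufficient statistics.

<syntaxhighlight lang="python">
# Illustrative sketch of a streaming NMF update (not a specific published algorithm).
# Each incoming batch of columns V_b is encoded with the current basis W fixed,
# then W is refreshed from running sums A and B using Frobenius-norm multiplicative updates.
import numpy as np

rng = np.random.default_rng(0)
n_features, k = 50, 5
eps = 1e-10

W = rng.random((n_features, k))            # current basis
A = np.zeros((n_features, k))              # running sum of V_b @ H_b.T
B = np.zeros((k, k))                       # running sum of H_b @ H_b.T

def solve_h(W, V_b, n_iter=50):
    """Multiplicative updates for H with W held fixed."""
    H = rng.random((W.shape[1], V_b.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V_b) / (W.T @ W @ H + eps)
    return H

for _ in range(20):                        # 20 incoming mini-batches
    V_b = rng.random((n_features, 10))     # 10 new data columns arrive
    H_b = solve_h(W, V_b)
    A += V_b @ H_b.T
    B += H_b @ H_b.T
    W *= A / (W @ B + eps)                 # refresh the basis from the statistics

print(W.shape)                             # (50, 5)
</syntaxhighlight>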
 
== Algorithms ==
 
=== Astronomy ===
In astronomy, NMF is a promising method for [[dimension reduction]] in the sense that astrophysical signals are non-negative. NMF has been applied to spectroscopic observations<ref name=":0" /><ref name="blantonRoweis07">{{Cite journal |arxiv=astro-ph/0606170|last1= Blanton|first1= Michael R.|title= K-corrections and filter transformations in the ultraviolet, optical, and near infrared |journal= The Astronomical Journal|volume= 133|issue= 2|pages= 734–754|last2= Roweis|first2= Sam |year= 2007|doi= 10.1086/510127|bibcode = 2007AJ....133..734B |s2cid= 18561804}}</ref> and to direct imaging observations<ref name = "ren18">{{Cite journal|arxiv=1712.10317|last1= Ren|first1= Bin |title= Non-negative Matrix Factorization: Robust Extraction of Extended Structures|journal= The Astrophysical Journal|volume= 852|issue= 2|pages= 104|last2= Pueyo|first2= Laurent|last3= Zhu | first3 = Guangtun B.|last4= Duchêne|first4= Gaspard |year= 2018|doi= 10.3847/1538-4357/aaa1f2|bibcode = 2018ApJ...852..104R |s2cid= 3966513}}</ref> as a method to study the common properties of astronomical objects and to post-process the observations. The work on spectroscopic observations by Blanton & Roweis (2007)<ref name="blantonRoweis07" /> takes into account the uncertainties of astronomical observations; it was later improved by Zhu (2016),<ref name="zhu16">{{Cite arXiv|last=Zhu|first=Guangtun B.|date=2016-12-19|title=Nonnegative Matrix Factorization (NMF) with Heteroscedastic Uncertainties and Missing data |eprint=1612.06037|class=astro-ph.IM}}</ref> where missing data are also considered and [[parallel computing]] is enabled. Their method was then adapted by Ren et al. (2018)<ref name="ren18" /> for the direct imaging field as one of the [[methods of detecting exoplanets]], especially for the direct imaging of [[circumstellar disks]].
 
Ren et al. (2018)<ref name="ren18" /> were able to prove the stability of NMF components when they are constructed sequentially (i.e., one by one), which makes the NMF modeling process [[linearity|linear]]; the linearity property is used to separate the stellar light from the light scattered by the [[exoplanets]] and [[circumstellar disks]].
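
The sketch below gives a schematic illustration of the idea of sequential construction (it is not the implementation of Ren et al. 2018): each new component is fitted with all previously found components held fixed, here using plain Frobenius-norm multiplicative updates on hypothetical random data.

<syntaxhighlight lang="python">
# Schematic sketch of building NMF components one at a time
# (illustrative only; not the implementation of Ren et al. 2018).
import numpy as np

rng = np.random.default_rng(0)
eps = 1e-10
V = rng.random((40, 200))                        # hypothetical non-negative data

def sequential_nmf(V, n_components, n_iter=200):
    """Fit components sequentially; earlier columns of W stay fixed."""
    n = V.shape[0]
    W = np.empty((n, 0))
    for r in range(n_components):
        W = np.hstack([W, rng.random((n, 1))])   # append one new component
        H = rng.random((r + 1, V.shape[1]))
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)     # update all coefficients
            num = (V @ H.T)[:, r]                    # update only the newest column of W
            den = (W @ H @ H.T)[:, r] + eps
            W[:, r] *= num / den
    return W, H

W, H = sequential_nmf(V, n_components=3)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))   # relative reconstruction error
</syntaxhighlight>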
 
In direct imaging, to reveal the faint exoplanets and circumstellar disks against the bright surrounding stellar light, which has a typical contrast from 10⁵ to 10¹⁰, various statistical methods have been adopted,<ref>{{Cite journal |arxiv=0902.3247 |last1=Lafrenière|first1=David |title=HST/NICMOS Detection of HR 8799 b in 1998 |journal=The Astrophysical Journal Letters |volume=694|issue=2|pages=L148|last2=Marois |first2= Christian|last3= Doyon |first3=René| last4=Barman|first4=Travis|year=2009|doi=10.1088/0004-637X/694/2/L148|bibcode=2009ApJ...694L.148L |s2cid=7332750}}</ref><ref>{{Cite journal|arxiv=1207.6637 |last1= Amara|first1= Adam |title= PYNPOINT: an image processing package for finding exoplanets|journal= Monthly Notices of the Royal Astronomical Society|volume= 427|issue= 2|pages= 948|last2= Quanz|first2= Sascha P.|year= 2012|doi= 10.1111/j.1365-2966.2012.21918.x|bibcode = 2012MNRAS.427..948A|s2cid= 119200505}}</ref><ref name = "soummer12">{{Cite journal|arxiv=1207.4197|last1= Soummer|first1= Rémi |title= Detection and Characterization of Exoplanets and Disks Using Projections on Karhunen-Loève Eigenimages|journal= The Astrophysical Journal Letters |volume= 755|issue= 2|pages= L28|last2= Pueyo|first2= Laurent|last3= Larkin |first3=James|year=2012|doi=10.1088/2041-8205/755/2/L28|bibcode=2012ApJ...755L..28S|s2cid=51088743}}</ref> however, the light from the exoplanets or circumstellar disks is usually over-fitted, so that forward modeling has to be adopted to recover the true flux.<ref>{{Cite journal|arxiv= 1502.03092 |last1= Wahhaj |first1= Zahed |title=Improving signal-to-noise in the direct imaging of exoplanets and circumstellar disks with MLOCI |last2=Cieza|first2=Lucas A.|last3=Mawet|first3=Dimitri|last4=Yang|first4=Bin|last5=Canovas |first5=Hector|last6=de Boer|first6=Jozua|last7=Casassus |first7=Simon|last8= Ménard|first8= François |last9=Schreiber|first9=Matthias R.|last10=Liu|first10=Michael C.|last11=Biller|first11=Beth A. |last12=Nielsen|first12=Eric L.|last13=Hayward|first13=Thomas L.|journal= Astronomy & Astrophysics|volume= 581|issue= 24|pages= A24|year= 2015|doi= 10.1051/0004-6361/201525837|bibcode = 2015A&A...581A..24W|s2cid= 20174209}}</ref><ref name="pueyo16">{{Cite journal|arxiv= 1604.06097 |last1= Pueyo|first1= Laurent |title= Detection and Characterization of Exoplanets using Projections on Karhunen Loeve Eigenimages: Forward Modeling |journal= The Astrophysical Journal |volume= 824|issue= 2|pages= 117|year= 2016|doi= 10.3847/0004-637X/824/2/117 |bibcode = 2016ApJ...824..117P|s2cid= 118349503}}</ref> Forward modeling is currently optimized for point sources,<ref name="pueyo16"/> but not for extended sources, especially for irregularly shaped structures such as circumstellar disks. In this situation, NMF has proved an excellent method, being less prone to over-fitting because of the non-negativity and [[sparsity]] of the NMF modeling coefficients; forward modeling can therefore be performed with a few scaling factors<ref name="ren18" /> rather than a computationally intensive data re-reduction on generated models.
| pages = 126&ndash;135
| doi = 10.1145/1150402.1150420
|isbn = 1595933395|s2cid = 165018}}</ref> and has been used for drug repurposing tasks to predict novel protein targets and therapeutic indications for approved drugs<ref>{{Cite journal
| last1 = Ceddia|last2 = Pinoli|last3 = Ceri|last4 = Masseroli
| title = Matrix factorization-based technique for drug repurposing predictions
# Cohen and Rothblum 1993 problem: whether a rational matrix always has an NMF of minimal inner dimension whose factors are also rational. Recently, this problem has been answered negatively.<ref>{{Cite arXiv|last1=Chistikov|first1=Dmitry|last2=Kiefer|first2=Stefan|last3=Marušić|first3=Ines|last4=Shirmohammadi|first4=Mahsa|last5=Worrell|first5=James|date=2016-05-22|title=Nonnegative Matrix Factorization Requires Irrationality |eprint=1605.06848|class=cs.CC}}</ref>
 
== See also ==
*[[Multilinear algebra]]
*[[Multilinear subspace learning]]