Content deleted Content added
mNo edit summary |
mNo edit summary |
||
Line 1:
{{Short description|Algorithms for matrix decomposition}}
{{
[[File:NMF.png|thumb|400px|Illustration of approximate non-negative matrix factorization: the matrix {{math|'''V'''}} is represented by the two smaller matrices {{math|'''W'''}} and {{math|'''H'''}}, which, when multiplied, approximately reconstruct {{math|'''V'''}}.]]
Line 96:
i.e. <math> \mathbf{H}\mathbf{H}^T = I </math>, then the above minimization is mathematically equivalent to the minimization of [[K-means clustering]].<ref name="DingSDM2005" />
Furthermore, the computed <math>H</math> gives the cluster membership, i.e., if <math>\mathbf{H}_{kj} > \mathbf{H}_{ij} </math> for all ''i'' ≠ ''k'', this suggests that the input data <math> v_j </math> belongs to <math>k</math>-th cluster. The computed <math>W</math> gives the cluster centroids, i.e., the <math>k</math>-th column gives the cluster centroid of <math>k</math>-th cluster. This centroid's representation can be significantly enhanced by convex NMF.
When the orthogonality constraint <math> \mathbf{H}\mathbf{H}^T = I </math> is not explicitly imposed, the orthogonality holds to a large extent, and the clustering property holds too. Clustering is the main objective of most [[data mining]] applications of NMF.{{citation needed|date=April 2015}}
When the error function to be used is [[Kullback–Leibler divergence]], NMF is identical to the [[
== Types ==
Line 120 ⟶ 113:
=== Nonnegative rank factorization ===
In case the [[
=== Different cost functions and regularizations ===
Line 151 ⟶ 144:
although it may also still be referred to as NMF.<ref>{{Cite conference | last1 = Hsieh | first1 = C. J. | last2 = Dhillon | first2 = I. S. | doi = 10.1145/2020408.2020577 | title = Fast coordinate descent methods with variable selection for non-negative matrix factorization | conference = Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '11 | pages = 1064| year = 2011 | isbn = 9781450308137 | url = http://www.cs.utexas.edu/~cjhsieh/nmf_kdd11.pdf}}</ref>
=== Online NMF ===
Many standard NMF algorithms analyze all the data together; i.e., the whole matrix is available from the start. This may be unsatisfactory in applications where there are too many data to fit into memory or where the data are provided in [[Data stream|streaming]] fashion. One such use is for [[collaborative filtering]] in [[recommender system|recommendation systems]], where there may be many users and many items to recommend, and it would be inefficient to recalculate everything when one user or one item is added to the system. The cost function for optimization in these cases may or may not be the same as for standard NMF, but the algorithms need to be rather different.<ref>http://www.ijcai.org/papers07/Papers/IJCAI07-432.pdf {{Bare URL PDF|date=March 2022}}</ref><ref>{{cite book|url=http://dl.acm.org/citation.cfm?id=1339264.1339709|title=Online Discussion Participation Prediction Using Non-negative Matrix Factorization |first1=Yik-Hing|last1=Fung|first2=Chun-Hung|last2=Li|first3=William K.|last3=Cheung|date=2 November 2007|publisher=IEEE Computer Society|pages=284–287|via=dl.acm.org|isbn=9780769530284|series=Wi-Iatw '07}}</ref><ref>{{Cite journal |author=Naiyang Guan|author2=Dacheng Tao|author3=Zhigang Luo|author4=Bo Yuan|name-list-style=amp|date=July 2012|title=Online Nonnegative Matrix Factorization With Robust Stochastic Approximation|journal=IEEE Transactions on Neural Networks and Learning Systems |issue=7 |doi=10.1109/TNNLS.2012.2197827|pmid=24807135|volume=23|pages=1087–1099|s2cid=8755408}}</ref>
== Algorithms ==
Line 375 ⟶ 368:
=== Astronomy ===
In astronomy, NMF is a promising method for [[dimension reduction]] in the sense that astrophysical signals are non-negative. NMF has been applied to the spectroscopic observations
Ren et al. (2018)
In direct imaging, to reveal the faint exoplanets and circumstellar disks from bright the surrounding stellar lights, which has a typical contrast from 10⁵ to 10¹⁰, various statistical methods have been adopted,<ref>{{Cite journal |arxiv=0902.3247 |last1=Lafrenière|first1=David |title=HST/NICMOS Detection of HR 8799 b in 1998 |journal=The Astrophysical Journal Letters |volume=694|issue=2|pages=L148|last2=Maroid |first2= Christian|last3= Doyon |first3=René| last4=Barman|first4=Travis|year=2009|doi=10.1088/0004-637X/694/2/L148|bibcode=2009ApJ...694L.148L |s2cid=7332750}}</ref><ref>{{Cite journal|arxiv=1207.6637 |last1= Amara|first1= Adam |title= PYNPOINT: an image processing package for finding exoplanets|journal= Monthly Notices of the Royal Astronomical Society|volume= 427|issue= 2|pages= 948|last2= Quanz|first2= Sascha P.|year= 2012|doi= 10.1111/j.1365-2966.2012.21918.x|bibcode = 2012MNRAS.427..948A|s2cid= 119200505}}</ref><ref name = "soummer12">{{Cite journal|arxiv=1207.4197|last1= Soummer|first1= Rémi |title= Detection and Characterization of Exoplanets and Disks Using Projections on Karhunen-Loève Eigenimages|journal= The Astrophysical Journal Letters |volume= 755|issue= 2|pages= L28|last2= Pueyo|first2= Laurent|last3= Larkin |first3=James|year=2012|doi=10.1088/2041-8205/755/2/L28|bibcode=2012ApJ...755L..28S|s2cid=51088743}}</ref> however the light from the exoplanets or circumstellar disks are usually over-fitted, where forward modeling have to be adopted to recover the true flux.<ref>{{Cite journal|arxiv= 1502.03092 |last1= Wahhaj |first1= Zahed |title=Improving signal-to-noise in the direct imaging of exoplanets and circumstellar disks with MLOCI |last2=Cieza|first2=Lucas A.|last3=Mawet|first3=Dimitri|last4=Yang|first4=Bin|last5=Canovas |first5=Hector|last6=de Boer|first6=Jozua|last7=Casassus |first7=Simon|last8= Ménard|first8= François |last9=Schreiber|first9=Matthias R.|last10=Liu|first10=Michael C.|last11=Biller|first11=Beth A. |last12=Nielsen|first12=Eric L.|last13=Hayward|first13=Thomas L.|journal= Astronomy & Astrophysics|volume= 581|issue= 24|pages= A24|year= 2015|doi= 10.1051/0004-6361/201525837|bibcode = 2015A&A...581A..24W|s2cid= 20174209}}</ref><ref name="pueyo16">{{Cite journal|arxiv= 1604.06097 |last1= Pueyo|first1= Laurent |title= Detection and Characterization of Exoplanets using Projections on Karhunen Loeve Eigenimages: Forward Modeling |journal= The Astrophysical Journal |volume= 824|issue= 2|pages= 117|year= 2016|doi= 10.3847/0004-637X/824/2/117 |bibcode = 2016ApJ...824..117P|s2cid= 118349503}}</ref> Forward modeling is currently optimized for point sources,<ref name="pueyo16"/> however not for extended sources, especially for irregularly shaped structures such as circumstellar disks. In this situation, NMF has been an excellent method, being less over-fitting in the sense of the non-negativity and [[sparsity]] of the NMF modeling coefficients, therefore forward modeling can be performed with a few scaling factors,<ref name="ren18" /> rather than a computationally intensive data re-reduction on generated models.
Line 555 ⟶ 548:
| pages = 126–135
| doi = 10.1145/1150402.1150420
|isbn = 1595933395|s2cid = 165018}}</ref> as been use for drug repurposing tasks in order to predict novel protein targets and therapeutic indications for approved drugs
| last1 = Ceddia|last2 = Pinoli|last3 = Ceri|last4 = Masseroli
| title = Matrix factorization-based technique for drug repurposing predictions
Line 680 ⟶ 673:
# Cohen and Rothblum 1993 problem: whether a rational matrix always has an NMF of minimal inner dimension whose factors are also rational. Recently, this problem has been answered negatively.<ref>{{Cite arXiv|last1=Chistikov|first1=Dmitry|last2=Kiefer|first2=Stefan|last3=Marušić|first3=Ines|last4=Shirmohammadi|first4=Mahsa|last5=Worrell|first5=James|date=2016-05-22|title=Nonnegative Matrix Factorization Requires Irrationality |eprint=1605.06848|class=cs.CC}}</ref>
== See also ==
*[[Multilinear algebra]]
*[[Multilinear subspace learning]]
|