Non-negative matrix factorization: Difference between revisions

Content deleted Content added
Citation bot (talk | contribs)
Alter: title, template type. Add: date, chapter. Removed parameters. | Use this bot. Report bugs. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox3 | #UCB_webform_linked 1408/2306
Citation bot (talk | contribs)
Alter: title. Add: chapter, doi-access. | Use this bot. Report bugs. | #UCB_CommandLine
Line 5:
'''Non-negative matrix factorization''' ('''NMF''' or '''NNMF'''), also '''non-negative matrix approximation'''<ref name="dhillon"/><ref>{{cite report|last1=Tandon|first1=Rashish|last2=Sra|first2=Suvrit |title=Sparse nonnegative matrix approximation: new formulations and algorithms|date=September 13, 2010 |url=https://is.tuebingen.mpg.de/fileadmin/user_upload/files/publications/MPIK-TR-193_%5B0%5D.pdf |id=Technical Report No. 193 |publisher=Max Planck Institute for Biological Cybernetics}}</ref> is a group of [[algorithm]]s in [[multivariate analysis]] and [[linear algebra]] where a [[matrix (mathematics)|matrix]] {{math|'''V'''}} is [[Matrix decomposition|factorized]] into (usually) two matrices {{math|'''W'''}} and {{math|'''H'''}}, with the property that all three matrices have no negative elements. This non-negativity makes the resulting matrices easier to inspect. Also, in applications such as processing of audio spectrograms or muscular activity, non-negativity is inherent to the data being considered. Since the problem is not exactly solvable in general, it is commonly approximated numerically.
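
As an illustration of the numerical approximation mentioned above, the following is a minimal sketch of one common approach, multiplicative updates that reduce the Frobenius norm of {{math|'''V''' − '''WH'''}}. It assumes NumPy is available; the function name <code>nmf_multiplicative</code>, the iteration count, and the small constant <code>eps</code> are illustrative choices, not part of any particular library.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=500, eps=1e-12, seed=0):
    """Approximate a non-negative matrix V (m x n) as W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm objective.
    All parameter names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Strictly positive random start, so the updates never divide by zero.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each factor is multiplied by a non-negative ratio, so it stays non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage: factorize a small matrix that is exactly a product of non-negative factors.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 5))
W, H = nmf_multiplicative(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.3g}")
```

The result depends on the random starting point, and the updates only find a local minimum, so in practice several initializations are often compared.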
 
NMF finds applications in such fields as [[astronomy]],<ref name="blantonRoweis07"/><ref name="ren18"/> [[computer vision]], [[document clustering]],<ref name="dhillon" /> [[Imputation (statistics)|missing data imputation]],<ref name="ren20">{{Cite journal|arxiv=2001.00563|last1= Ren|first1= Bin |title= Using Data Imputation for Signal Separation in High Contrast Imaging|journal= The Astrophysical Journal|volume= 892|issue= 2|pages= 74|last2= Pueyo|first2= Laurent|last3= Chen | first3 = Christine|last4= Choquet|first4= Elodie |last5= Debes|first5= John H|last6= Duchêne |first6= Gaspard|last7= Menard|first7=Francois|last8=Perrin|first8=Marshall D.|year= 2020|doi= 10.3847/1538-4357/ab7024 | bibcode = 2020ApJ...892...74R |s2cid= 209531731|doi-access= free}}</ref> [[chemometrics]], [[audio signal processing]], [[recommender system|recommender systems]],<ref name="gemulla">{{cite conference |author=Rainer Gemulla |author2=Erik Nijkamp |author3=Peter J. Haas|author3-link= Peter J. Haas (computer scientist)|author4=Yannis Sismanis |title=Large-scale matrix factorization with distributed stochastic gradient descent |conference=Proc. ACM SIGKDD Int'l Conf. on Knowledge discovery and data mining |url=<!-- http://www.mpi-inf.mpg.de/~rgemulla/publications/rj10481rev.pdf --><!--removing dead link--> |year=2011 |pages=69–77 }}</ref><ref>{{cite conference |author=Yang Bao|title=TopicMF: Simultaneously Exploiting Ratings and Reviews for Recommendation |conference=AAAI |url=http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8273 |year=2014 |display-authors=etal}}</ref> and [[bioinformatics]].<ref>{{cite journal |author=Ben Murrell|title=Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution |journal=PLOS ONE |volume=6 |issue=12 |year=2011 |pages=e28898|display-authors=etal|doi=10.1371/journal.pone.0028898 |pmid=22216138 |pmc=3245233 |bibcode=2011PLoSO...628898M |doi-access=free }}</ref>
 
== History ==
Line 371:
 
=== Astronomy ===
In astronomy, NMF is a promising method for [[dimension reduction]] because astrophysical signals are non-negative. NMF has been applied to spectroscopic observations<ref name=":0">{{Cite journal |last1=Berné |first1=O. |last2=Joblin |first2=C.|author2-link=Christine Joblin |last3=Deville |first3=Y. |last4=Smith |first4=J. D. |last5=Rapacioli |first5=M. |last6=Bernard |first6=J. P. |last7=Thomas |first7=J. |last8=Reach |first8=W. |last9=Abergel |first9=A. |date=2007-07-01 |title=Analysis of the emission of very small dust particles from Spitzer spectro-imagery data using blind signal separation methods |url=https://www.aanda.org/articles/aa/abs/2007/26/aa6282-06/aa6282-06.html |journal=Astronomy & Astrophysics |language=en |volume=469 |issue=2 |pages=575–586 |doi=10.1051/0004-6361:20066282 |arxiv=astro-ph/0703072 |bibcode=2007A&A...469..575B |issn=0004-6361|doi-access=free }}</ref><ref name="blantonRoweis07">{{Cite journal |arxiv=astro-ph/0606170|last1= Blanton|first1= Michael R.|title= K-corrections and filter transformations in the ultraviolet, optical, and near infrared |journal= The Astronomical Journal|volume= 133|issue= 2|pages= 734–754|last2= Roweis|first2= Sam |year= 2007|doi= 10.1086/510127|bibcode = 2007AJ....133..734B |s2cid= 18561804}}</ref> and direct imaging observations<ref name = "ren18">{{Cite journal|arxiv=1712.10317|last1= Ren|first1= Bin |title= Non-negative Matrix Factorization: Robust Extraction of Extended Structures|journal= The Astrophysical Journal|volume= 852|issue= 2|pages= 104|last2= Pueyo|first2= Laurent|last3= Zhu | first3 = Guangtun B.|last4= Duchêne|first4= Gaspard |year= 2018|doi= 10.3847/1538-4357/aaa1f2|bibcode = 2018ApJ...852..104R |s2cid= 3966513|doi-access= free}}</ref> as a method to study the common properties of astronomical objects and to post-process astronomical observations.
The work on spectroscopic observations by Blanton & Roweis (2007)<ref name="blantonRoweis07" /> takes into account the uncertainties of astronomical observations; it was later improved by Zhu (2016),<ref name="zhu16">{{Cite arXiv|last=Zhu|first=Guangtun B.|date=2016-12-19|title=Nonnegative Matrix Factorization (NMF) with Heteroscedastic Uncertainties and Missing data |eprint=1612.06037|class=astro-ph.IM}}</ref> who also considered missing data and enabled [[parallel computing]]. Their method was then adapted by Ren et al. (2018)<ref name="ren18" /> to the direct imaging field as one of the [[methods of detecting exoplanets]], especially for the direct imaging of [[circumstellar disks]].
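
The idea of accounting for per-element uncertainties and missing data can be sketched with weighted multiplicative updates, where each entry is weighted by its inverse variance and missing entries get zero weight. This is a generic illustration and not the implementation from the works cited above; the names <code>weighted_nmf</code>, <code>sigma</code>, and <code>mask</code> (True where a value was observed) are assumptions made for this example.

```python
import numpy as np

def weighted_nmf(V, sigma, mask, rank, n_iter=500, eps=1e-12, seed=0):
    """Weighted NMF sketch: minimize sum(mask * ((V - W @ H) / sigma)**2).

    sigma: per-element uncertainties (assumed positive everywhere).
    mask:  boolean array, True where V was observed (hypothetical convention).
    """
    rng = np.random.default_rng(seed)
    weights = np.where(mask, 1.0 / sigma**2, 0.0)  # zero weight for missing data
    V_filled = np.where(mask, V, 0.0)              # neutralize NaNs in missing entries
    WV = weights * V_filled
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Same structure as the unweighted updates, with the weights applied elementwise.
        H *= (W.T @ WV) / (W.T @ (weights * (W @ H)) + eps)
        W *= (WV @ H.T) / ((weights * (W @ H)) @ H.T + eps)
    return W, H

def weighted_chi2(V, sigma, mask, W, H):
    """Weighted squared residual summed over observed entries only."""
    return np.sum(np.where(mask, ((V - W @ H) / sigma) ** 2, 0.0))

# Usage: a small synthetic matrix with three entries marked as missing.
rng = np.random.default_rng(2)
V_true = rng.random((8, 2)) @ rng.random((2, 6))
mask = np.ones(V_true.shape, dtype=bool)
mask[0, 1] = mask[3, 4] = mask[5, 0] = False
V_obs = np.where(mask, V_true, np.nan)
sigma = np.full(V_true.shape, 0.1)
W, H = weighted_nmf(V_obs, sigma, mask, rank=2)
print(weighted_chi2(V_obs, sigma, mask, W, H))
```

Because missing entries carry zero weight, they do not influence the fit, and the product of the two factors provides model values at those positions.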
 
Ren et al. (2018)<ref name="ren18" /> proved the stability of NMF components when they are constructed sequentially (i.e., one by one), which enables the [[linearity]] of the NMF modeling process; this linearity is used to separate the stellar light from the light scattered from the [[exoplanets]] and [[circumstellar disks]].
 
In direct imaging, to reveal the faint exoplanets and circumstellar disks against the bright surrounding stellar light, which has a typical contrast from 10⁵ to 10¹⁰, various statistical methods have been adopted.<ref>{{Cite journal |arxiv=0902.3247 |last1=Lafrenière|first1=David |title=HST/NICMOS Detection of HR 8799 b in 1998 |journal=The Astrophysical Journal Letters |volume=694|issue=2|pages=L148|last2=Marois |first2= Christian|last3= Doyon |first3=René| last4=Barman|first4=Travis|year=2009|doi=10.1088/0004-637X/694/2/L148|bibcode=2009ApJ...694L.148L |s2cid=7332750}}</ref><ref>{{Cite journal|arxiv=1207.6637 |last1= Amara|first1= Adam |title= PYNPOINT: an image processing package for finding exoplanets|journal= Monthly Notices of the Royal Astronomical Society|volume= 427|issue= 2|pages= 948|last2= Quanz|first2= Sascha P.|year= 2012|doi= 10.1111/j.1365-2966.2012.21918.x|bibcode = 2012MNRAS.427..948A|s2cid= 119200505}}</ref><ref name = "soummer12">{{Cite journal|arxiv=1207.4197|last1= Soummer|first1= Rémi |title= Detection and Characterization of Exoplanets and Disks Using Projections on Karhunen-Loève Eigenimages|journal= The Astrophysical Journal Letters |volume= 755|issue= 2|pages= L28|last2= Pueyo|first2= Laurent|last3= Larkin |first3=James|year=2012|doi=10.1088/2041-8205/755/2/L28|bibcode=2012ApJ...755L..28S|s2cid=51088743}}</ref> However, the light from the exoplanets or circumstellar disks is usually over-fitted, so forward modeling has to be adopted to recover the true flux.<ref>{{Cite journal|arxiv= 1502.03092 |last1= Wahhaj |first1= Zahed |title=Improving signal-to-noise in the direct imaging of exoplanets and circumstellar disks with MLOCI |last2=Cieza|first2=Lucas A.|last3=Mawet|first3=Dimitri|last4=Yang|first4=Bin|last5=Canovas |first5=Hector|last6=de Boer|first6=Jozua|last7=Casassus |first7=Simon|last8= Ménard|first8= François |last9=Schreiber|first9=Matthias R.|last10=Liu|first10=Michael C.|last11=Biller|first11=Beth A. |last12=Nielsen|first12=Eric L.|last13=Hayward|first13=Thomas L.|journal= Astronomy & Astrophysics|volume= 581|issue= 24|pages= A24|year= 2015|doi= 10.1051/0004-6361/201525837|bibcode = 2015A&A...581A..24W|s2cid= 20174209}}</ref><ref name="pueyo16">{{Cite journal|arxiv= 1604.06097 |last1= Pueyo|first1= Laurent |title= Detection and Characterization of Exoplanets using Projections on Karhunen Loeve Eigenimages: Forward Modeling |journal= The Astrophysical Journal |volume= 824|issue= 2|pages= 117|year= 2016|doi= 10.3847/0004-637X/824/2/117 |bibcode = 2016ApJ...824..117P|s2cid= 118349503|doi-access= free}}</ref> Forward modeling is currently optimized for point sources,<ref name="pueyo16"/> but not for extended sources, especially irregularly shaped structures such as circumstellar disks. In this situation, NMF has been an excellent method: the non-negativity and [[sparsity]] of the NMF modeling coefficients make it less prone to over-fitting, so forward modeling can be performed with a few scaling factors,<ref name="ren18" /> rather than a computationally intensive data re-reduction on generated models.
 
=== Data imputation ===
Line 520:
| pages=e1000029
| bibcode = 2008PLSCB...4E0029D
| doi-access = free
}}</ref><ref name="kim2007sparse">{{Cite journal
|author1=Hyunsoo Kim |author2=Haesun Park
Line 662 ⟶ 663:
| author3 = Jing Gao
| author4 = Jiawei Han
| title = Proceedings of the 2013 SIAM International Conference on Data Mining
| chapter = Multi-View Clustering via Joint Nonnegative Matrix Factorization
| name-list-style = amp
| title = Multi-View Clustering via Joint Nonnegative Matrix Factorization
| journal = Proceedings of SIAM Data Mining Conference
| year = 2013