Citation bot (talk | contribs) Alter: pages, template type, journal. Add: page, pmid, volume, s2cid, isbn, doi, issue, year. Formatted dashes. | Use this bot. Report bugs. | Suggested by AManWithNoPlan | #UCB_webform 351/1682 |
tag with {{Bare URL PDF}} |
Line 152:
===Online NMF===
Many standard NMF algorithms analyze all the data together; i.e., the whole matrix is available from the start. This may be unsatisfactory in applications where there are too many data to fit into memory or where the data are provided in [[Data stream|streaming]] fashion. One such use is for [[collaborative filtering]] in [[recommendation systems]], where there may be many users and many items to recommend, and it would be inefficient to recalculate everything when one user or one item is added to the system. The cost function for optimization in these cases may or may not be the same as for standard NMF, but the algorithms need to be rather different.<ref>http://www.ijcai.org/papers07/Papers/IJCAI07-432.pdf {{Bare URL PDF|date=March 2022}}</ref><ref>{{cite book|url=http://dl.acm.org/citation.cfm?id=1339264.1339709|title=Online Discussion Participation Prediction Using Non-negative Matrix Factorization |first1=Yik-Hing|last1=Fung|first2=Chun-Hung|last2=Li|first3=William K.|last3=Cheung|date=2 November 2007|publisher=IEEE Computer Society|pages=284–287|via=dl.acm.org|isbn=9780769530284|series=Wi-Iatw '07}}</ref><ref>{{Cite journal |author=Naiyang Guan|author2=Dacheng Tao|author3=Zhigang Luo|author4=Bo Yuan|name-list-style=amp|date=July 2012|title=Online Nonnegative Matrix Factorization With Robust Stochastic Approximation|journal=IEEE Transactions on Neural Networks and Learning Systems |issue=7 |doi=10.1109/TNNLS.2012.2197827|pmid=24807135|volume=23|pages=1087–1099|s2cid=8755408}}</ref>
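The streaming setting described above can be illustrated with a minimal sketch. It is not the method of any of the cited references. It assumes a squared-error cost, samples arriving one non-negative column at a time, and a hypothetical class name <code>OnlineNMF</code>. The basis matrix is refreshed from two running summary matrices, so earlier samples never need to be stored or revisited.

```python
import numpy as np


class OnlineNMF:
    """Illustrative online NMF: V[:, t] ~ W @ h_t, one sample at a time.

    Assumptions (not from the article): squared-error cost, multiplicative
    updates, and running statistics A = sum(h h^T), B = sum(v h^T).
    """

    def __init__(self, n_features, n_components, seed=0, eps=1e-9):
        rng = np.random.default_rng(seed)
        self.W = rng.random((n_features, n_components)) + eps
        self.A = np.zeros((n_components, n_components))
        self.B = np.zeros((n_features, n_components))
        self.eps = eps

    def encode(self, v, n_iter=50):
        """Non-negative coefficients for one sample with W held fixed."""
        h = np.ones(self.W.shape[1])
        WtV = self.W.T @ v
        WtW = self.W.T @ self.W
        for _ in range(n_iter):
            h *= WtV / (WtW @ h + self.eps)
        return h

    def partial_fit(self, v, n_iter=5):
        """Absorb one new sample and refresh W from the running statistics."""
        h = self.encode(v)
        self.A += np.outer(h, h)
        self.B += np.outer(v, h)
        # Multiplicative update for min_W sum_t ||v_t - W h_t||^2 with the
        # stored coefficients; the gradient is W A - B.
        for _ in range(n_iter):
            self.W *= self.B / (self.W @ self.A + self.eps)
        return h


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    true_W = rng.random((8, 3))
    model = OnlineNMF(n_features=8, n_components=3)
    for _ in range(200):
        sample = true_W @ rng.random(3)  # one new user or item arrives
        model.partial_fit(sample)
    print(model.W.shape)
```

Each call to <code>partial_fit</code> costs the same regardless of how many samples have been seen, which is the property that makes this style of update attractive when users or items are added continually.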
== Algorithms ==
Line 384:
To impute missing data in statistics, NMF can accommodate missing entries while minimizing its cost function, rather than treating them as zeros.<ref name="ren20"/> This makes it a mathematically proven method for [[Imputation (statistics)|data imputation]] in statistics.<ref name="ren20"/> By first proving that the missing data are ignored in the cost function, then proving that the impact from missing data can be as small as a second-order effect, Ren et al. (2020)<ref name="ren20"/> studied and applied such an approach in the field of astronomy. Their work focuses on two-dimensional matrices; specifically, it includes mathematical derivation, simulated data imputation, and application to on-sky data.
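One common way to write a cost function that ignores missing entries is with a binary mask. The notation here is an assumption made for illustration and may differ from that of Ren et al. (2020): <math>V</math> is the data matrix, <math>W</math> and <math>H</math> are the non-negative factors, and <math>M_{ij}</math> equals 1 where <math>V_{ij}</math> is observed and 0 where it is missing.

<math display="block">C(W, H) = \left\| M \circ (V - WH) \right\|_F^2 = \sum_{i,j} M_{ij} \left( V_{ij} - (WH)_{ij} \right)^2, \qquad W \ge 0,\; H \ge 0</math>

Terms with <math>M_{ij} = 0</math> vanish from the sum, so the missing values have no direct influence on the fit, whereas replacing them with zeros would pull the corresponding entries of <math>WH</math> toward zero.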
The data imputation procedure with NMF can be composed of two steps. First, when the NMF components are known, Ren et al. (2020) proved that the impact from missing data during data imputation ("target modeling" in their study) is a second-order effect. Second, when the NMF components are unknown, the authors proved that the impact from missing data during component construction is a first-to-second-order effect.
Depending on how the NMF components are obtained, the former step above can be either independent of or dependent on the latter. In addition, the imputation quality can be improved when more NMF components are used; see Figure 4 of Ren et al. (2020) for an illustration.<ref name="ren20"/>
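A minimal sketch of imputation with a masked factorization follows. It is an illustration under stated assumptions rather than the implementation of Ren et al. (2020): it uses a squared-error cost restricted to observed entries, standard weighted multiplicative updates, and hypothetical function names <code>masked_nmf</code> and <code>impute</code>.

```python
import numpy as np


def masked_nmf(V, mask, n_components, n_iter=300, seed=0, eps=1e-9):
    """Factor V ~ W @ H using only the entries where mask is True.

    Missing entries (mask False) may hold any value, including NaN;
    they never enter the updates.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, n_components)) + eps
    H = rng.random((n_components, n_cols)) + eps
    M = mask.astype(float)
    MV = M * np.nan_to_num(V)  # zero weight removes missing entries
    for _ in range(n_iter):
        # Weighted multiplicative updates: the mask is applied to both
        # the data and the current model W @ H.
        H *= (W.T @ MV) / (W.T @ (M * (W @ H)) + eps)
        W *= (MV @ H.T) / ((M * (W @ H)) @ H.T + eps)
    return W, H


def impute(V, mask, W, H):
    """Keep observed entries; fill missing ones from the model W @ H."""
    return np.where(mask, V, W @ H)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V_true = rng.random((20, 2)) @ rng.random((2, 30))
    mask = rng.random(V_true.shape) > 0.2        # about 20% missing
    V_obs = np.where(mask, V_true, np.nan)
    W, H = masked_nmf(V_obs, mask, n_components=2)
    V_filled = impute(V_obs, mask, W, H)
    print(np.abs(V_filled - V_true)[~mask].mean())
```

Here the components and the coefficients are learned together from the incomplete matrix. When the components are already known, the same idea applies with <code>W</code> held fixed and only the <code>H</code> update iterated.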