{{Short description|Mathematical procedure}}
{{
{{Recommender systems}}
'''Matrix factorization''' is a class of [[collaborative filtering]] algorithms used in [[recommender
== Techniques ==
The idea behind matrix factorization is to represent users and items in a lower dimensional [[latent space
=== Funk MF ===
The original algorithm proposed by Simon Funk in his blog post
The predicted ratings can be computed as <math>\tilde{R}=H W</math>, where <math>\tilde{R} \in \mathbb{R}^{\text{users} \times \text{items}}</math> is the predicted user-item rating matrix, <math>H \in \mathbb{R}^{\text{users} \times \text{latent factors}}</math> contains the users' latent factors and <math>W \in \mathbb{R}^{\text{latent factors} \times \text{items}}</math> contains the items' latent factors.
Specifically, the predicted rating user ''u'' will give to item ''i'' is computed as:
:<math>\tilde{r}_{ui} = \sum_{f=0}^{\text{n factors} } H_{u,f}W_{f,i}</math>
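As a minimal sketch of the formula above, assuming NumPy and small random factor matrices (the sizes and variable names are illustrative and not taken from any reference implementation), the full prediction matrix is a single matrix product, and a single entry is a dot product over the latent factors:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 4, 5, 3  # illustrative sizes (assumption)

H = rng.normal(size=(n_users, n_factors))  # user latent factors
W = rng.normal(size=(n_factors, n_items))  # item latent factors

# Predicted rating matrix, shape (n_users, n_items)
R_pred = H @ W

def predict(u, i):
    """Predicted rating of user u for item i: sum over f of H[u, f] * W[f, i]."""
    return float(H[u, :] @ W[:, i])
```

Each entry of <code>R_pred</code> equals the corresponding call to <code>predict</code>; the matrix form is simply the batched version of the per-pair sum.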
It is possible to tune the expressive power of the model by changing the number of latent factors. It has been demonstrated
Funk MF was developed for the ''rating prediction'' problem and therefore uses explicit numerical ratings as user-item interactions.
All things considered, Funk MF minimizes the following objective function:
:<math>\underset{H, W}{\operatorname{arg\,min}}\, \|R - \tilde{R}\|_{\rm F} + \alpha\|H\| + \beta\|W\|</math>
Here <math>\|.\|_{\rm F}</math> is the [[Frobenius norm]], whereas the other norms may be either the Frobenius norm or another norm, depending on the specific recommendation problem.<ref name="Paterek07">{{cite journal |last1=Paterek |first1=Arkadiusz |title=Improving regularized singular value decomposition for collaborative filtering |journal=Proceedings of KDD Cup and Workshop |date=2007 |url=https://www.mimuw.edu.pl/~paterek/ap_kdd.pdf}}</ref>
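One common way to minimize an objective of this kind is stochastic gradient descent over the observed ratings only. The sketch below is an illustration under stated assumptions, not the original implementation: it uses the squared error on observed entries with squared L2 penalties (a frequently used variant, not exactly the objective as written above), and the function names, learning rate and epoch count are hypothetical choices.

```python
import numpy as np

def funk_mf_sgd(ratings, n_users, n_items, n_factors=2,
                lr=0.01, alpha=0.02, beta=0.02, epochs=300, seed=0):
    """Fit H and W by SGD.

    ratings: list of (user, item, rating) triples for observed entries only.
    alpha, beta: regularization weights for H and W (assumed squared L2).
    """
    rng = np.random.default_rng(seed)
    H = rng.normal(scale=0.1, size=(n_users, n_factors))
    W = rng.normal(scale=0.1, size=(n_factors, n_items))
    for _ in range(epochs):
        # Visit the observed ratings in a fresh random order each epoch
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - H[u] @ W[:, i]          # prediction error on this rating
            h_old = H[u].copy()               # keep old value for the W update
            H[u] += lr * (err * W[:, i] - alpha * H[u])
            W[:, i] += lr * (err * h_old - beta * W[:, i])
    return H, W

def rmse(ratings, H, W):
    """Root mean squared error on the given (user, item, rating) triples."""
    errs = [r - H[u] @ W[:, i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))
```

Calling <code>funk_mf_sgd</code> with <code>epochs=0</code> returns the random initialization, which makes it easy to compare the training error before and after fitting.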
=== SVD++ ===
While Funk MF can provide very good recommendation quality, its restriction to explicit numerical ratings as user-item interactions is a limitation. Modern [[recommender systems]] should exploit all available interactions, both explicit (e.g. numerical ratings) and implicit (e.g. likes, purchases, skips, bookmarks). To this end, SVD++ was designed to take implicit interactions into account as well.<ref name="Cao15">{{cite book |last1=Cao |first1=Jian |last2=Hu |first2=Hengkui |last3=Luo |first3=Tianyan |last4=Wang |first4=Jia |last5=Huang |first5=May |last6=Wang |first6=Karl |last7=Wu |first7=Zhonghai |last8=Zhang |first8=Xing |title=Distributed Design and Implementation of SVD++ Algorithm for E-commerce Personalized Recommender System |volume=572 |date=2015 |pages=30–44 |doi=10.1007/978-981-10-0421-6_4 |publisher=Springer Singapore |language=en|series=Communications in Computer and Information Science |isbn=978-981-10-0420-9 }}</ref><ref name="Jia14">{{cite book |last1=Jia |first1=Yancheng |title
Compared to Funk MF, SVD++ also takes user and item bias into account.
The predicted rating user ''u'' will give to item ''i'' is computed as:
:<math>\tilde{r}_{ui} = \mu + b_i + b_u + \sum_{f=0}^{\text{n factors}} H_{u,f}W_{f,i}</math>
Where <math>\mu</math> refers to the overall average rating over all items, and <math>b_i</math> and <math>b_u</math> refer to the observed deviations of the item {{mvar|i}} and the user {{mvar|u}}, respectively, from the average.<ref>{{Cite journal |last1=Koren |first1=Yehuda |last2=Bell |first2=Robert |last3=Volinsky |first3=Chris |date=August 2009 |title=Matrix factorization techniques for recommender systems |url=https://datajobs.com/data-science-repo/Recommender-Systems-[Netflix].pdf |journal=Computer |pages=45}}</ref> SVD++ has some disadvantages, however, the main one being that the method is not ''model-based''. This means that if a new user is added, the algorithm cannot model that user unless the whole model is retrained. Even though the system might have gathered some interactions for the new user, that user's latent factors are not available and therefore no recommendations can be computed. This is an example of a [[Cold start (
A possible way to address this cold start problem is to modify SVD++ so that it becomes a ''model-based'' algorithm, which makes it easy to handle new items and new users.
As mentioned above, SVD++ has no latent factors for new users, so they must be represented in a different way. A user's latent factors represent that user's preference for the corresponding item latent factors, so they can be estimated from the user's past interactions. If the system is able to gather some interactions for the new user, it is possible to estimate that user's latent factors.
Note that this does not entirely solve the [[Cold start (
:<math>\tilde{r}_{ui} = \mu + b_i + b_u + \sum_{f=0}^{\text{n factors}} \biggl( \sum_{j=0}^{\text{n items}} r_{uj} W^{\rm T}_{j,f} \biggr) W_{f,i}</math>
With this formulation, the equivalent [[item-item recommender]] would be <math>\tilde{R} = R S = R W^{\rm T} W</math>. Therefore the similarity matrix is symmetric.
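The following sketch illustrates this model-based formulation for a single new user, assuming NumPy. The item factor matrix is a random stand-in for a trained one, the ratings are made up, and the bias terms <math>\mu</math>, <math>b_i</math> and <math>b_u</math> are omitted for brevity; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_factors = 5, 2

# Stand-in for learned item factors (assumption: random values, not trained)
W = rng.normal(size=(n_factors, n_items))

# Item-item similarity implied by the formulation: S = W^T W
S = W.T @ W

# Ratings of a new user; 0 means no interaction (made-up values)
r_new = np.array([5.0, 0.0, 3.0, 0.0, 0.0])

# Estimated user factors: sum over j of r_uj * W^T[j, f]
h_new = r_new @ W.T

# Factor part of the predicted ratings for every item
scores = h_new @ W
```

Because the user factors are computed from the ratings rather than learned, no retraining is needed for the new user, and <code>scores</code> coincides with <code>r_new @ S</code>, the item-item form.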
=== Asymmetric SVD ===
Asymmetric SVD aims to combine the advantages of SVD++ with those of a model-based algorithm, and can therefore handle new users with a few ratings without retraining the whole model. In contrast to the model-based SVD, here the user latent factor matrix H is replaced by Q, which learns the user's preferences as a function of their ratings.<ref name="Pu13">{{cite book |last1=Pu |first1=Li |title=Proceedings of the 7th ACM conference on Recommender systems
The predicted rating user ''u'' will give to item ''i'' is computed as:
:<math>\tilde{r}_{ui} = \mu + b_i + b_u + \sum_{f=0}^{\text{n factors}} \sum_{j=0}^{\text{n items}} r_{uj} Q^{\rm T}_{j,f}W_{f,i}</math>
With this formulation, the equivalent [[item-item recommender]] would be <math>\tilde{R} = R S = R Q^{\rm T} W</math>. Since matrices Q and W are different the similarity matrix is asymmetric, hence the name of the model.
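A small numerical sketch of the item-item form, assuming NumPy and random stand-ins for the learned matrices. Here Q is stored with shape (latent factors × items), an assumption chosen so that the product <math>R Q^{\rm T} W</math> is well defined; the bias terms are omitted and the toy ratings are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, n_factors = 3, 4, 2

# Toy rating matrix; 0 means unrated (made-up values)
R = rng.integers(0, 6, size=(n_users, n_items)).astype(float)

# Random stand-ins for the two learned matrices (assumption: not trained)
Q = rng.normal(size=(n_factors, n_items))
W = rng.normal(size=(n_factors, n_items))

# Item-item similarity; in general S != S^T because Q and W differ
S = Q.T @ W

# Factor part of the predictions for all users and items
R_pred = R @ S
```

Replacing <code>Q</code> with <code>W</code> recovers the symmetric similarity of the model-based SVD++ variant.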
=== Group-specific SVD ===
A group-specific SVD can be an effective approach to the [[Cold start (recommender systems)|cold-start]] problem in many scenarios.<ref name="bi2017"/> It clusters users and items based on dependency information and similarities in characteristics. Once a new user or item arrives, it can be assigned a group label, and its latent factors can be approximated by the group effects of the corresponding group. Therefore, although ratings associated with the new user or item are not necessarily available, the group effects provide immediate and effective predictions.
The predicted rating user ''u'' will give to item ''i'' is computed as:
:<math>\tilde{r}_{ui} = \sum_{f=0}^{\text{n factors}} (H_{u,f}+S_{v_u,f})(W_{f,i}+T_{f,j_i})</math>
Here <math>v_u</math> and <math>j_i</math> represent the group labels of user ''u'' and item ''i'', respectively, which are identical across members of the same group, and {{mvar|S}} and {{mvar|T}} are matrices of group effects. For example, for a new user <math>u_{new}</math> whose latent factor <math>H_{u_{new}}</math> is not available, we can at least identify their group label <math>v_{u_{new}}</math> and predict their ratings as:
:<math>\tilde{r}_{u_{new}i} = \sum_{f=0}^{\text{n factors}} S_{v_{u_{new}},f}(W_{f,i}+T_{f,j_i})</math>
This provides a good approximation to the unobserved ratings.
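The two prediction formulas can be sketched as follows, assuming NumPy. The factor and group-effect matrices are random stand-ins, the group assignments are made up, and the function names are hypothetical; how groups are actually formed is outside the scope of this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items, n_factors = 4, 5, 2
n_user_groups, n_item_groups = 2, 3

H = rng.normal(size=(n_users, n_factors))        # user latent factors
W = rng.normal(size=(n_factors, n_items))        # item latent factors
S = rng.normal(size=(n_user_groups, n_factors))  # user-group effects
T = rng.normal(size=(n_factors, n_item_groups))  # item-group effects

user_group = np.array([0, 1, 0, 1])      # v_u for each user (made up)
item_group = np.array([0, 2, 1, 1, 0])   # j_i for each item (made up)

def predict(u, i):
    """Known user: individual factors plus group effects on both sides."""
    return float((H[u] + S[user_group[u]]) @ (W[:, i] + T[:, item_group[i]]))

def predict_new_user(group, i):
    """New user with no individual factors: use only the user-group effect."""
    return float(S[group] @ (W[:, i] + T[:, item_group[i]]))
```

The new-user prediction is the known-user prediction with the individual term <code>H[u]</code> dropped, so it needs only a group label.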
=== Hybrid MF ===
In recent years many other matrix factorization models have been developed to exploit the ever increasing amount and variety of available interaction data and use cases. Hybrid matrix factorization algorithms are capable of merging explicit and implicit interactions
===Deep-
In recent years a number of neural and deep-learning techniques have been proposed, some of which generalize traditional
While deep learning has been applied to many different scenarios
==See also==