Multi-task learning

In a widely cited 1997 paper, Rich Caruana gave the following characterization:<blockquote>Multitask Learning is an approach to [[inductive transfer]] that improves [[Generalization error|generalization]] by using the ___domain information contained in the training signals of related tasks as an [[inductive bias]]. It does this by learning tasks in parallel while using a shared [[Representation learning|representation]]; what is learned for each task can help other tasks be learned better.<ref name=":2">{{Cite journal|url = http://www.cs.cornell.edu/~caruana/mlj97.pdf|title = Multi-task learning|last = Caruana|first = R.|date = 1997|journal = Machine Learning|doi = 10.1023/A:1007379606734|volume=28|pages=41–75|doi-access = free}}</ref></blockquote>
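Caruana's "shared representation" is most often realized as hard parameter sharing: a common trunk of layers feeds several task-specific output heads, and all tasks are trained in parallel on the same trunk. A minimal illustrative sketch in PyTorch follows; the class name, layer sizes, and the choice of binary-classification heads are assumptions made for this example rather than details from the cited paper.
<syntaxhighlight lang="python">
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Shared trunk (the "shared representation") with one output head per task."""
    def __init__(self, n_features, n_tasks, hidden=64):
        super().__init__()
        # The trunk receives gradients from every task, so what is learned
        # for one task can help the others.
        self.trunk = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
        )
        # One small task-specific head per task (binary classification here).
        self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_tasks)])

    def forward(self, x):
        z = self.trunk(x)                        # shared representation
        return [head(z) for head in self.heads]  # one logit tensor per task
</syntaxhighlight>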
 
In the classification context, MTL aims to improve the performance of multiple classification tasks by learning them jointly. One example is spam filtering, which can be treated as distinct but related classification tasks across different users. To make this more concrete, consider that different people have different distributions of features distinguishing spam emails from legitimate ones; for example, an English speaker may find that all emails in Russian are spam, while a Russian speaker would not. Yet there is a definite commonality in this classification task across users; for example, one common feature might be text related to money transfers. Solving each user's spam classification problem jointly via MTL lets the solutions inform one another and can improve performance.<ref name=":0">{{Cite web|url = http://www.cs.cornell.edu/~kilian/research/multitasklearning/multitasklearning.html|title = Multi-task Learning|last = Weinberger|first = Kilian}}</ref> Further examples of settings for MTL include [[multiclass classification]] and [[multi-label classification]].<ref name=":1">{{Cite arXiv|eprint = 1504.03101|title = Convex Learning of Multiple Tasks and their Structure|last = Ciliberto|first = C.|date = 2015 |class = cs.LG}}</ref>
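A joint-training sketch for the spam example is shown below, reusing the illustrative <code>HardSharingMTL</code> class from above. Each user's classifier is one head, and a single joint loss is minimized so that every user's training signal updates the shared trunk; the data and hyperparameters are hypothetical stand-ins.
<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

n_users, n_features = 3, 100
model = HardSharingMTL(n_features, n_tasks=n_users)  # illustrative class defined above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy stand-in data: (email feature vectors, spam labels) for each user.
user_data = [(torch.randn(32, n_features), torch.randint(0, 2, (32, 1)).float())
             for _ in range(n_users)]

for step in range(100):
    opt.zero_grad()
    loss = 0.0
    for user, (x, y) in enumerate(user_data):
        user_logits = model(x)[user]  # output of this user's head
        loss = loss + F.binary_cross_entropy_with_logits(user_logits, y)
    loss.backward()  # every user's loss contributes gradients to the shared trunk
    opt.step()
</syntaxhighlight>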
 
Multi-task learning works because the [[Regularization (mathematics)|regularization]] induced by requiring an algorithm to perform well on a related task can be superior to regularization that prevents [[overfitting]] by penalizing all complexity uniformly. One situation where MTL may be particularly helpful is when the tasks share significant commonalities and are generally slightly undersampled.<ref name=":bmdl"/><ref name=":0" /> However, as discussed below, MTL has also been shown to be beneficial for learning unrelated tasks.<ref name=":bmdl"/><ref name=":3">Romera-Paredes, B., Argyriou, A., Bianchi-Berthouze, N., & Pontil, M., (2012) Exploiting Unrelated Tasks in Multi-Task Learning. http://jmlr.csail.mit.edu/proceedings/papers/v22/romera12/romera12.pdf</ref>
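This regularization view can be made explicit by writing joint training as a single objective in which a task-coupling penalty takes the place of a uniform complexity penalty; the notation below is a generic illustration rather than a formula from the cited references:

:<math>\min_{w_1,\dots,w_T}\ \sum_{t=1}^{T}\sum_{i=1}^{n_t} L\bigl(y_{ti},\, f(x_{ti}; w_t)\bigr) \;+\; \lambda\,\Omega(w_1,\dots,w_T),</math>

where <math>T</math> is the number of tasks, <math>n_t</math> the number of examples for task <math>t</math>, <math>L</math> a loss function, and <math>\Omega</math> a penalty that couples the task parameters, for example by penalizing each <math>w_t</math>'s distance from the average over all tasks, rather than penalizing each model's complexity in isolation.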
 
==Methods==