Revision as of 08:03, 16 December 2020 by WikiCleanerBot (m: v2.04b - Bot T21 - Fix errors for CW project (Missing whitespace before a link - Reference before punctuation)); revision as of 15:19, 13 January 2021 by Kku (m: link adjacency matrix using Find link)
* Letting <math display="inline">A^\dagger = \gamma I_T + ( \gamma - \lambda)\frac {1} T \mathbf{1}\mathbf{1}^\top </math> (where <math>I_T </math> is the ''T''×''T'' identity matrix, and <math display="inline">\mathbf{1}\mathbf{1}^\top </math> is the ''T''×''T'' matrix of ones) is equivalent to letting {{math|γ}} control the variance <math display="inline">\sum_t || f_t - \bar f|| _{\mathcal H_k}^2 </math> of tasks from their mean <math display="inline">\bar f = \frac 1 T \sum_t f_t </math>. For example, blood levels of some biomarker may be taken on {{mvar|T}} patients at <math>n_t</math> time points during the course of a day, and interest may lie in regularizing the variance of the predictions across patients.
* Letting <math> A^\dagger = \alpha I_T +(\alpha - \lambda )M </math>, where <math> M_{t,s} = \frac 1 {|G_r|} \mathbb I(t,s\in G_r) </math>, is equivalent to letting <math> \alpha </math> control the variance measured with respect to a group mean: <math> \sum _{r} \sum _{t \in G_r } ||f_t - \frac 1 {|G_r|} \sum _{s\in G_r} f_s||_{\mathcal H_k}^2 </math>. (Here <math> |G_r| </math> is the cardinality of group ''r'', and <math> \mathbb I </math> is the indicator function.) For example, people in different political parties (groups) might be regularized together with respect to predicting the favorability rating of a politician. Note that this penalty reduces to the first when all tasks are in the same group.
* Letting <math> A^\dagger = \delta I_T + (\delta -\lambda)L </math>, where <math> L=D-M</math> is the [[Laplacian matrix|Laplacian]] for the graph with [[adjacency matrix]] ''M'' giving pairwise similarities of tasks (and ''D'' is the corresponding diagonal degree matrix), is equivalent to giving a larger penalty to the distance separating tasks ''t'' and ''s'' when they are more similar (according to the weight <math> M_{t,s} </math>), i.e. <math>\delta </math> regularizes <math> \sum _{t,s}||f_t - f_s ||_{\mathcal H _k }^2 M_{t,s} </math>.
* All of the above choices of ''A'' also induce the additional regularization term <math display="inline">\lambda \sum_t ||f_t|| _{\mathcal H_k} ^2 </math>, which penalizes complexity in each <math>f_t</math> more broadly.
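
The three structure matrices in the list above can be written out numerically. The following is a minimal NumPy sketch, not a reference implementation: the function names (<code>mean_structure</code>, <code>group_matrix</code>, <code>group_structure</code>, <code>graph_laplacian</code>, <code>laplacian_structure</code>) are hypothetical, and each task function <math>f_t</math> is replaced by a single scalar so that the norms become ordinary squared differences. Under that simplifying assumption, the sketch builds each matrix as written in the list and checks the standard quadratic-form identities behind the three penalties (variance about the overall mean, variance about group means, and the graph-weighted pairwise distance).

```python
import numpy as np

def mean_structure(T, gamma, lam):
    """gamma * I_T + (gamma - lam) * (1/T) * 11^T, as written in the first bullet."""
    return gamma * np.eye(T) + (gamma - lam) * np.ones((T, T)) / T

def group_matrix(groups, T):
    """M[t, s] = 1/|G_r| if tasks t and s are both in group G_r, else 0."""
    M = np.zeros((T, T))
    for g in groups:
        idx = np.asarray(g)
        M[np.ix_(idx, idx)] = 1.0 / len(idx)
    return M

def group_structure(groups, T, alpha, lam):
    """alpha * I_T + (alpha - lam) * M, as written in the second bullet."""
    return alpha * np.eye(T) + (alpha - lam) * group_matrix(groups, T)

def graph_laplacian(M):
    """L = D - M, with D the diagonal matrix of row sums of M."""
    return np.diag(M.sum(axis=1)) - M

def laplacian_structure(M, delta, lam):
    """delta * I_T + (delta - lam) * L, as written in the third bullet."""
    return delta * np.eye(M.shape[0]) + (delta - lam) * graph_laplacian(M)

# Toy setting (assumption): one scalar output per task stands in for f_t.
rng = np.random.default_rng(0)
T = 5
f = rng.normal(size=T)
groups = [[0, 1, 2], [3, 4]]

# Variance of tasks about the overall mean equals f^T (I - 11^T / T) f.
var_mean = np.sum((f - f.mean()) ** 2)
quad_mean = f @ (np.eye(T) - np.ones((T, T)) / T) @ f

# Variance about group means equals f^T (I - M) f for the group matrix M.
G = group_matrix(groups, T)
var_group = sum(np.sum((f[g] - f[g].mean()) ** 2) for g in groups)
quad_group = f @ (np.eye(T) - G) @ f

# Weighted pairwise distances equal 2 f^T L f for a symmetric similarity matrix.
W = rng.random((T, T))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
pairwise = sum(W[t, s] * (f[t] - f[s]) ** 2 for t in range(T) for s in range(T))
quad_lap = 2 * f @ graph_laplacian(W) @ f

# With a single group containing every task, the group matrix is (1/T) 11^T,
# so the second construction coincides with the first.
one_group = group_structure([list(range(T))], T, alpha=0.7, lam=0.2)
```

Comparing <code>var_mean</code> with <code>quad_mean</code>, <code>var_group</code> with <code>quad_group</code>, and <code>pairwise</code> with <code>quad_lap</code> (for instance with <code>np.isclose</code>) illustrates why these matrices encode the stated penalties; the values of <code>alpha</code>, <code>lam</code> and the random similarity matrix are arbitrary choices for illustration only.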

Multi-task learning: Difference between revisions