:<math>\min_{w\in\mathbb{R}^d} \frac{1}{n}\sum_{i=1}^n (y_i- \langle w,x_i\rangle)^2+ \lambda \|w\|_1, \quad \text{ where } x_i\in \mathbb{R}^d\text{ and } y_i\in\mathbb{R}.</math>
Proximal gradient methods offer a general framework for solving regularization problems from statistical learning theory with penalties tailored to the specific problem at hand.<ref name=combettes>{{cite journal|last=Combettes|first=Patrick L.|author2=Wajs, Valérie R. |title=Signal Recovery by Proximal Forward-Backward Splitting|journal=Multiscale Modeling & Simulation|year=2005|volume=4|issue=4|pages=1168–1200|doi=10.1137/050626090|s2cid=15064954|url=https://semanticscholar.org/paper/56974187b4d9a8757f4d8a6fd6facc8b4ad08240}}</ref><ref name=structSparse>{{cite journal|last=Mosci|first=S.|author2=Rosasco, L. |author3=Santoro, M. |author4=Verri, A. |author5=Villa, S. |title=Solving Structured Sparsity Regularization with Proximal Methods|journal=Machine Learning and Knowledge Discovery in Databases|year=2010|volume=6322|pages=418–433 |doi=10.1007/978-3-642-15883-4_27|series=Lecture Notes in Computer Science|isbn=978-3-642-15882-7|doi-access=free}}</ref> Such customized penalties can induce certain structure in problem solutions, such as ''sparsity'' (in the case of [[Lasso (statistics)|lasso]]) or ''group structure'' (in the case of [[Lasso (statistics)#Group LASSO|group lasso]]).
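For the lasso problem above, the proximal gradient (forward–backward) iteration alternates a gradient step on the smooth least-squares term with the proximal operator of the <math>\ell_1</math> penalty, which is component-wise soft thresholding. The following is a minimal sketch in Python with NumPy, not taken from any of the cited references; the function names <code>soft_threshold</code> and <code>ista_lasso</code> and the fixed step size <math>1/L</math> are illustrative choices.

<syntaxhighlight lang="python">
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: component-wise soft thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista_lasso(X, y, lam, n_iter=500):
    # Proximal gradient (ISTA) for min_w (1/n) * ||y - X w||^2 + lam * ||w||_1.
    n, d = X.shape
    w = np.zeros(d)
    # Fixed step size 1/L, where L = (2/n) * sigma_max(X)^2 is a Lipschitz
    # constant of the gradient of the smooth least-squares term.
    L = 2.0 / n * np.linalg.norm(X, ord=2) ** 2
    for _ in range(n_iter):
        grad = 2.0 / n * X.T @ (X @ w - y)         # gradient of the smooth term
        w = soft_threshold(w - grad / L, lam / L)  # proximal step on the l1 term
    return w
</syntaxhighlight>

Accelerated variants such as FISTA add a momentum step to this iteration but use the same proximal operator.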
== Relevant background ==
=== Group lasso ===
Group lasso is a generalization of the [[Lasso (statistics)|lasso method]] to the case in which features are grouped into disjoint blocks.<ref name=groupLasso>{{cite journal|last=Yuan|first=M.|author2=Lin, Y. |title=Model selection and estimation in regression with grouped variables|journal=Journal of the Royal Statistical Society, Series B|year=2006|volume=68|issue=1|pages=49–67|doi=10.1111/j.1467-9868.2005.00532.x|s2cid=6162124|url=https://semanticscholar.org/paper/d98ef875e2cbde3e2cc8fad521e3cbfe1bddbd69}}</ref> Suppose the features are grouped into blocks <math>\{w_1,\ldots,w_G\}</math>. Here we take as a regularization penalty
:<math>R(w) =\sum_{g=1}^G \|w_g\|_2,</math>
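The proximal operator of this penalty acts block-wise: each group <math>w_g</math> is shrunk toward zero by block soft thresholding, and any group whose norm falls below the threshold is set exactly to zero. A minimal sketch in Python follows; the function name <code>prox_group_lasso</code> and the representation of the blocks as a list of index arrays <code>groups</code> are illustrative assumptions, not part of any particular library.

<syntaxhighlight lang="python">
import numpy as np

def prox_group_lasso(w, groups, tau):
    # Proximal operator of tau * sum_g ||w_g||_2 (block soft thresholding).
    # `groups` is a list of index arrays forming a partition of the coordinates.
    out = np.asarray(w, dtype=float).copy()
    for g in groups:
        norm_g = np.linalg.norm(out[g])
        if norm_g <= tau:
            out[g] = 0.0                      # the whole block is zeroed out
        else:
            out[g] *= 1.0 - tau / norm_g      # shrink the block toward zero
    return out
</syntaxhighlight>

Substituting this operator for the component-wise soft thresholding in the proximal gradient iteration above yields a solver for the group lasso problem.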