Talk:Ridge regression
This article is rated Start-class on Wikipedia's content assessment scale.
The content of Tikhonov regularization was merged into Ridge regression on 17 October 2022. The former page's history now serves to provide attribution for that content in the latter page, and it must not be deleted as long as the latter page exists. For the discussion at that ___location, see its talk page.
Merging that way is not advised; they are overlapping methods, but in different theoretical frameworks, and the Bayesian one is more general. -Anon
Fold into Tikhonov regularization?
Ridge regression is Tikhonov regularization, and that article is much more detailed. I propose we merge the two articles: basically delete this one and point to the other. David (talk) 01:42, 22 January 2022 (UTC)
- Yeah, it definitely should be; this article is also worse than the one on Tikhonov regularization. For instance, this article claims that ridge regression is used to deal with multicollinearity, when in reality there is no need for the variables to be correlated. Ridge regression just regularizes coefficients towards 0 to improve out-of-sample performance. Closed Limelike Curves (talk) 00:49, 15 May 2022 (UTC)
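For reference, a rough sketch of that point in standard notation (the symbols below are assumed, not quoted from the article): ridge regression solves the penalized least-squares problem

<math display="block">\hat\beta_\text{ridge} = \arg\min_\beta \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2 = \left(X^\mathsf{T} X + \lambda I\right)^{-1} X^\mathsf{T} y ,</math>

which shrinks every coefficient towards zero for any lambda > 0, whether or not the columns of X are correlated; strong correlation merely makes X^T X nearly singular, which is one situation where the shrinkage is especially helpful.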
Explain the name
The article left me confused about the name of the regression. Was the name originally 'RIDGE', i.e. an abbreviation? Did it later change to 'Ridge', perhaps because the abbreviation seemed clumsy in hindsight? Or does it refer to something within the mathematical method?
Applications
The method (Tikhonov regularization) is applied in several fields of applied physics and beyond, for example:
- Geophysics (seismic studies, for example)
- Atmospheric inverse problems; in general it is an important method in remote sensing
- Medicine, in studies using functional magnetic resonance imaging
- Deep learning regularization techniques also use this method (for genomic data, for instance)
I would suggest adding this kind of information to the page. AyubuZimbale (talk) 07:31, 4 November 2024 (UTC)
"Most real-world phenomena have the effect of low-pass filters[clarification needed] in the forward direction where A {\displaystyle A} maps x {\displaystyle \mathbf {x} } to b {\displaystyle \mathbf {b} }"
There is a clarification needed tag. Basically, this says that if we take an x with "reasonable" values and add noise to it with mean zero but possibly "unreasonable" (large) amplitude (i.e., a large standard deviation of the noise), then the matrix A will act to make most of these additional noise deviations cancel one another.
More specifically, if we think of x as a time series, that is, a continuous-time process sampled at discrete points in time, then any white noise superimposed on the underlying "actual" signal will largely cancel out under A, and this effect is strongest for the high-frequency components.
The trouble with the passage is that "low-pass filter" was added here by some author (not me) who felt this analogy would aid intuition. If clarification is needed, then it just misses its mark and the reader had best skip it altogether. 145.53.11.225 (talk) 11:38, 2 July 2025 (UTC)
- In other cases, high-pass operators (e.g., a difference operator or a weighted Fourier operator) may be used to enforce smoothness if the underlying vector is believed to be mostly continuous
- This passage continues the above idea and, if anything, more clarification is needed here, because the talk of being "mostly continuous" only makes sense if we think of x as a discrete sampling of some underlying smooth signal, such as an acoustic signal. In that context the remark makes perfect sense, but it probably looks quite odd if you came here from a statistical linear regression problem.
- (Indeed, all this mathematically rather loose talk seems to indicate that a signal-processing engineer has been at this page. But no matter.) 145.53.11.225 (talk) 11:45, 2 July 2025 (UTC)
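To make the low-pass/high-pass remarks above a bit more concrete, here is a minimal sketch in illustrative notation (the moving-average A and difference operator Γ below are assumptions for illustration, not taken from the article):

<math display="block">(Ax)_i = \tfrac{1}{3}\left(x_{i-1} + x_i + x_{i+1}\right) \quad \text{(a smoothing, i.e. low-pass, forward operator)}, \qquad \min_x \; \|Ax - b\|_2^2 + \|\Gamma x\|_2^2 \quad \text{with} \quad (\Gamma x)_i = x_{i+1} - x_i .</math>

High-frequency components of x are largely cancelled by the averaging in A, so they are poorly determined by b; the difference operator Γ in the penalty is large exactly for such rough components, which is what "enforce smoothness" means in the quoted passage.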
Generalized Tikhonov regularization
The alternate solution at the end stipulates "Q is not a null matrix"; however, when Q is the null matrix both solutions are equal, and both reduce to the standard least squares solution. I propose removing that condition. Chris2crawford (talk) 12:29, 23 August 2025 (UTC)
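For context, a rough sketch of the generalized problem as it is usually written (P and Q are the data and prior weighting matrices; the exact notation in the article may differ):

<math display="block">\min_x \; (Ax - b)^\mathsf{T} P (Ax - b) + (x - x_0)^\mathsf{T} Q (x - x_0) \;\Longrightarrow\; x^* = \left(A^\mathsf{T} P A + Q\right)^{-1} \left(A^\mathsf{T} P b + Q x_0\right).</math>

With Q = 0 this reduces to the weighted least squares solution x^* = (A^T P A)^{-1} A^T P b, so the "Q is not a null matrix" proviso does not appear to select a different solution.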
Error in the introduction
The overview states that the regularization is equivalent to the 'constraint' beta^T beta = c. This has two problems: 1) as shown in the section "Tikhonov regularization", the value of c is actually 0 in this case (lambda = alpha^2); not even the generalized problem "constrains" beta to the surface of a sphere, but rather to the point beta_0, i.e. minimizing |beta - beta_0|; 2) it is not a constraint in the technical sense of the word, but an additional term in the objective function pulling beta towards \vec 0. Chris2crawford (talk) 12:59, 23 August 2025 (UTC)
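For reference, the usual relationship between the penalized and constrained formulations (standard notation, assumed rather than quoted from the overview):

<math display="block">\min_\beta \; \|y - X\beta\|_2^2 + \lambda\, \beta^\mathsf{T}\beta \qquad \text{vs.} \qquad \min_\beta \; \|y - X\beta\|_2^2 \;\; \text{subject to} \;\; \beta^\mathsf{T}\beta \le c .</math>

For each lambda > 0 there is some data-dependent c >= 0 at which the two problems share a solution, but the constraint is an inequality and c is not a constant one can state in advance, which is part of why presenting beta^T beta = c as a 'constraint' is at best loose.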
Disjoint transition between Ridge and Tikhonov sections
There is an abrupt change in notation from y = X beta to b = A x, which unfortunately permutes two of the three variable letters! Thus, although the two sections cover the same concept, the current merge confounds rather than unifies the connection.
While the math in the Tikhonov section is more correct than that in the Ridge section, its introduction is sloppy and incorrect. The problem "However, if no x satisfies the equation" was already addressed by the previous sentence, "The standard approach is ordinary least squares...", which itself does not solve the equation Ax = b but minimizes |Ax - b|^2. So the problem of overdetermination is already handled by least squares, and only underdetermination is addressed by Tikhonov regularization, but the two are conflated in the introduction. Chris2crawford (talk) 13:24, 23 August 2025 (UTC)
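For anyone trying to line the two sections up, the correspondence between notations appears to be (my reading, worth double-checking against the current text):

<math display="block">y \leftrightarrow b, \qquad X \leftrightarrow A, \qquad \beta \leftrightarrow x, \qquad \min_\beta \|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2 \;\;\leftrightarrow\;\; \min_x \|Ax - b\|_2^2 + \|\Gamma x\|_2^2 \;\; \text{with} \;\; \Gamma = \sqrt{\lambda}\, I ,</math>

i.e. the ridge case is the Tikhonov problem with the Tikhonov matrix Γ taken to be a scaled identity.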