Content deleted Content added
Add links for algorithms without separate pages. |
grammar mistake, editing recently to recent |
||
Line 5:
'''Symbolic regression''' ('''SR''') is a type of [[regression analysis]] that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of accuracy and simplicity.
No particular model is provided as a starting point for symbolic regression. Instead, initial expressions are formed by randomly combining mathematical building blocks such as [[Operation (mathematics)|mathematical operators]], [[analytic function]]s, [[Constant (mathematics)|constants]], and [[state variable]]s. Usually, a subset of these primitives will be specified by the person operating it, but that's not a requirement of the technique. The symbolic regression problem for mathematical functions has been tackled with a variety of methods, including recombining equations most commonly using [[genetic programming]],<ref name="schmidt2009distilling"/> as well as more
By not requiring ''a priori'' specification of a model, symbolic regression isn't affected by human bias, or unknown gaps in [[___domain knowledge]]. It attempts to uncover the intrinsic relationships of the dataset, by letting the patterns in the data itself reveal the appropriate models, rather than imposing a model structure that is deemed mathematically tractable from a human perspective. The [[fitness function]] that drives the evolution of the models takes into account not only [[Residual (numerical analysis)|error metrics]] (to ensure the models accurately predict the data), but also special complexity measures,<ref name="complexity"/> thus ensuring that the resulting models reveal the data's underlying structure in a way that's understandable from a human perspective. This facilitates reasoning and favors the odds of getting insights about the data-generating system.
|