Revision as of 20:17, 29 June 2025 edit Peckjonk (talk \| contribs) 2 edits add SPSS Statistics implementation of MARS ← Previous edit		Revision as of 08:55, 1 July 2025 edit undo Otr500 (talk \| contribs) Autopatrolled, Extended confirmed users, New page reviewers 16,256 edits Move unsourced material and 2016 career maintenance tags, as well as the entire "External links" section for any possible discussion per WP:ELBURDEN. Next edit →
Line 191: == Pros and cons == MARS models are simple to understand and interpret.<ref name=":0">{{Cite book\|title=Applied Predictive Modeling\|last1=Kuhn\|first1=Max\|last2=Johnson\|first2=Kjell\|date=2013\|publisher=Springer New York\|isbn=9781461468486\|location=New York, NY\|language=en\|doi=10.1007/978-1-4614-6849-3}}</ref> ~~Compare the equation for ozone concentration above to, say, the innards of a trained [[Artificial neural network\|neural network]] or a [[random forest]].~~▼ ~~{{original research\|date=October 2016}}~~ MARS can handle both continuous and [[categorical data]].<ref>{{cite book \| last=Friedman \| first=Jerome H. \| chapter=Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines \| author-link=Friedman, J. H.\|year=1993\|title=New Directions in Statistical Data Analysis and Robustness \|editor=Stephan Morgenthaler \|editor2=Elvezio Ronchetti \|editor3=Werner Stahel\|publisher=Birkhauser}}</ref><ref name="Friedman 1991">{{cite journal \| last=Friedman \| first=Jerome H. 
\| title=Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines \| website=DTIC \| date=1991-06-01 \| url=https://apps.dtic.mil/sti/citations/ADA590939 \| archive-url=https://web.archive.org/web/20220411085148/https://apps.dtic.mil/sti/citations/ADA590939 \| url-status=live \| archive-date=April 11, 2022 \| access-date=2022-04-11}}</ref> ~~MARS tends to be better than recursive partitioning for numeric data because hinges are more appropriate for numeric variables than the piecewise constant segmentation used by recursive partitioning.~~▼ ~~No regression modeling technique is best for all situations.~~ ~~The guidelines below are intended to give an idea of the pros and cons of MARS,~~ ~~but there will be exceptions to the guidelines.~~ ~~It is useful to compare MARS to [[recursive partitioning]] and this is done below.~~ ~~(Recursive partitioning is also commonly called ''regression trees'',~~ ~~''decision trees'', or [[Predictive analytics#Classification and regression trees\|CART]];~~ ~~see the [[Decision tree learning\|recursive partitioning]] article for details).~~ MARS models are more flexible than [[linear regression]] models. ▲MARS models are simple to understand and interpret.<ref name=":0">{{Cite book\|title=Applied Predictive Modeling\|last1=Kuhn\|first1=Max\|last2=Johnson\|first2=Kjell\|date=2013\|publisher=Springer New York\|isbn=9781461468486\|location=New York, NY\|language=en\|doi=10.1007/978-1-4614-6849-3}}</ref> Compare the equation for ozone concentration above to, say, the innards of a trained [[Artificial neural network\|neural network]] or a [[random forest]]. ▲MARS can handle both continuous and [[categorical data]].<ref>{{cite book \| last=Friedman \| first=Jerome H. \| chapter=Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines \| author-link=Friedman, J. 
H.\|year=1993\|title=New Directions in Statistical Data Analysis and Robustness \|editor=Stephan Morgenthaler \|editor2=Elvezio Ronchetti \|editor3=Werner Stahel\|publisher=Birkhauser}}</ref><ref name="Friedman 1991">{{cite journal \| last=Friedman \| first=Jerome H. \| title=Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines \| website=DTIC \| date=1991-06-01 \| url=https://apps.dtic.mil/sti/citations/ADA590939 \| archive-url=https://web.archive.org/web/20220411085148/https://apps.dtic.mil/sti/citations/ADA590939 \| url-status=live \| archive-date=April 11, 2022 \| access-date=2022-04-11}}</ref> MARS tends to be better than recursive partitioning for numeric data because hinges are more appropriate for numeric variables than the piecewise constant segmentation used by recursive partitioning. Building MARS models often requires little or no data preparation.<ref name=":0" /> The hinge functions automatically partition the input data, so the effect of outliers is contained. In this respect MARS is similar to [[recursive partitioning]] which also partitions the data into disjoint regions, although using a different method. MARS (like recursive partitioning) does automatic [[Feature selection\|variable selection]] (meaning it includes important variables in the model and excludes unimportant ones). However, there can be some arbitrariness in the selection, especially when there are correlated predictors, and this can affect interpretability.<ref name=":0" /> Building MARS models often requires little or no data preparation.<ref name=":0" /> MARS models tend to have a good bias-variance trade-off. The models are flexible enough to model non-linearity and variable interactions (thus MARS models have fairly low bias), yet the constrained form of MARS basis functions prevents too much flexibility (thus MARS models have fairly low variance). 
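The hinge functions mentioned above can be illustrated with a minimal Python sketch. The knot, coefficients, and the names <code>hinge</code>, <code>predict</code>, <code>INTERCEPT</code>, and <code>TERMS</code> are hypothetical choices made for illustration only; they are not taken from the ozone example or from any of the cited implementations. The sketch shows how a pair of mirrored hinges splits the input range at a knot, and why the fitted function is continuous but has a kink there.

```python
def hinge(x, knot, direction=+1):
    """Hinge basis function.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


# Hypothetical one-predictor model (all numbers are made up for illustration):
#   y = 5.0 + 2.0 * max(0, x - 10) - 1.5 * max(0, 10 - x)
INTERCEPT = 5.0
TERMS = [
    (2.0, 10.0, +1),   # (coefficient, knot, direction): active for x > 10
    (-1.5, 10.0, -1),  # mirrored hinge: active for x < 10
]


def predict(x):
    """Evaluate the model: a weighted sum of hinge basis functions."""
    return INTERCEPT + sum(coef * hinge(x, knot, d) for coef, knot, d in TERMS)


if __name__ == "__main__":
    for x in (8.0, 10.0, 12.0):
        print(x, predict(x))
```

Each hinge is zero on one side of its knot, so only the term on the active side contributes to a prediction. In this sketch the two slopes (1.5 below the knot, 2.0 above it) meet at the same value at x = 10, which is why the function is continuous there but not differentiable.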
* [https://web.stat.tamu.edu/~bmallick/wileybook/book_code.html Code] from the book ''Bayesian Methods for Nonlinear Classification and Regression''<ref>{{cite book \|last1=Denison \|first1=D. G. T. \|last2=Holmes \|first2=C. C. \|last3=Mallick \|first3=B. K. \|last4=Smith \|first4=A. F. M. \|title=Bayesian methods for nonlinear classification and regression \|date=2002 \|publisher=Wiley \|location=Chichester, England \|isbn=978-0-471-49036-4}}</ref> for Bayesian MARS.▼ MARS is suitable for handling large datasets, and implementations run very quickly. However, recursive partitioning can be faster than MARS{{Citation needed\|date=March 2019}}. With MARS models, as with any non-parametric regression, parameter confidence intervals and other checks on the model cannot be calculated directly (unlike [[linear regression]] models). [[Cross-validation (statistics)\|Cross-validation]] and related techniques must be used for validating the model instead. The <code>earth</code>, <code>mda</code>, and <code>polspline</code> implementations do not allow missing values in predictors, but free implementations of regression trees (such as <code>rpart</code> and <code>party</code>) do allow missing values using a technique called surrogate splits. MARS models can make predictions very quickly, as they only require evaluating a linear function of the predictors. The resulting fitted function is continuous, unlike recursive partitioning, which can give a more realistic model in some situations. (However, the model is not smooth or differentiable). == Extensions and related concepts == Line 239 ⟶ 223: Berk R.A. 
(2008) ''Statistical learning from a regression perspective'', Springer, {{ISBN\|978-0-387-77500-5}} ~~== External links ==~~ ~~{{external cleanup\|date=October 2016}}~~ ~~Several free and commercial software packages are available for fitting MARS-type models.~~ ~~; Free software:~~ * [[R (programming language)\|R]] packages: <code>earth</code> function in the <code>[https://cran.r-project.org/web/packages/earth/index.html earth]</code> package <code>mars</code> function in the <code>[https://cran.r-project.org/web/packages/mda/index.html mda]</code> package <code>polymars</code> function in the <code>[https://cran.r-project.org/web/packages/polspline/index.html polspline]</code> package. Not Friedman's MARS. <code>bass</code> function in the <code>[https://cran.r-project.org/web/packages/BASS/index.html BASS]</code> package for Bayesian MARS. * Matlab code: [http://www.cs.rtu.lv/jekabsons/regression.html ARESLab: Adaptive Regression Splines toolbox for Matlab] ▲ [https://web.stat.tamu.edu/~bmallick/wileybook/book_code.html Code] from the book ''Bayesian Methods for Nonlinear Classification and Regression''<ref>{{cite book \|last1=Denison \|first1=D. G. T. \|last2=Holmes \|first2=C. C. \|last3=Mallick \|first3=B. K. \|last4=Smith \|first4=A. F. M. \|title=Bayesian methods for nonlinear classification and regression \|date=2002 \|publisher=Wiley \|location=Chichester, England \|isbn=978-0-471-49036-4}}</ref> for Bayesian MARS. * Python [http://orange.biolab.si/blog/2011/12/20/earth-multivariate-adaptive-regression-splines/ Earth – Multivariate adaptive regression splines] [https://github.com/jcrudy/py-earth/ py-earth] ** [https://github.com/lanl/pyBASS pyBASS] for Bayesian MARS. ~~; Commercial software:~~ * [http://www.salford-systems.com/mars.php MARS] from Salford Systems. Based on Friedman's implementation. 
* [https://web.archive.org/web/20101203023609/http://www.statsoft.com/products/data-mining-solutions/ STATISTICA Data Miner] from StatSoft * [http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_adaptivereg_overview.htm ADAPTIVEREG from SAS.] * [https://www.ibm.com/products/spss STATS EARTH extension command in IBM SPSS Statistics]. [[Category:Nonparametric regression]]

Multivariate adaptive regression spline: Difference between revisions