*MARS models are more flexible than [[linear regression]] models.
*MARS models are simple to understand and interpret.<ref name=":0">{{Cite book|url=http://link.springer.com/10.1007/978-1-4614-6849-3|title=Applied Predictive Modeling|last=Kuhn|first=Max|last2=Johnson|first2=Kjell|date=2013|publisher=Springer New York|isbn=9781461468486|___location=New York, NY|language=en|doi=10.1007/978-1-4614-6849-3}}</ref> Compare the equation for ozone concentration above to, say, the innards of a trained [[Artificial neural network|neural network]] or a [[random forest]].
*MARS can handle both continuous and categorical data.<ref>[[Friedman, J. H.]] (1993) ''Estimating Functions of Mixed Ordinal and Categorical Variables Using Adaptive Splines'', New Directions in Statistical Data Analysis and Robustness (Morgenthaler, Ronchetti, Stahel, eds.), Birkhauser</ref> MARS tends to be better than recursive partitioning for numeric data because hinges are more appropriate for numeric variables than the piecewise constant segmentation used by recursive partitioning.
*Building MARS models often requires little or no data preparation.<ref name=":0" /> The hinge functions automatically partition the input data, so the effect of outliers is contained.
*MARS (like recursive partitioning) does automatic variable selection, meaning it includes important variables in the model and excludes unimportant ones.
*MARS models tend to have a good bias-variance trade-off: they are flexible enough to model non-linearity and variable interactions, yet the constrained form of the basis functions limits overfitting.
*MARS is suitable for handling fairly large datasets.
*With MARS models, as with any non-parametric regression, parameter confidence intervals and other checks on the model cannot be calculated directly (unlike [[linear regression]] models).
*MARS models do not give as good fits as [[Boosting (meta-algorithm)|boosted]] trees, but can be built much more quickly and are more interpretable. (An 'interpretable' model is in a form that makes it clear what the effect of each predictor is.)
*The <code>earth</code>, <code>mda</code>, and <code>polspline</code> implementations do not allow missing values in predictors, but free implementations of regression trees (such as <code>rpart</code> and <code>party</code>) do allow missing values using a technique called surrogate splits.
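The hinge functions mentioned above are what make a fitted MARS model piecewise linear, continuous, but not differentiable at the knots. The following Python sketch is purely illustrative: the knot ___location and coefficients are invented for demonstration, not taken from any fitted model or implementation.

```python
def hinge(x, knot):
    """Right hinge function max(0, x - knot): zero below the knot,
    linear with slope 1 above it."""
    return max(0.0, x - knot)


def mirrored_hinge(x, knot):
    """Left hinge function max(0, knot - x): the mirror image,
    linear below the knot and zero above it."""
    return max(0.0, knot - x)


def toy_mars_predict(x):
    """A toy MARS-style model: an intercept plus a weighted sum of
    hinge basis functions sharing a knot at x = 3. The result is
    piecewise linear and continuous, with a kink at the knot."""
    return 5.0 + 2.0 * hinge(x, 3.0) - 1.5 * mirrored_hinge(x, 3.0)


# Below the knot only the mirrored hinge is active; above it, only
# the right hinge. The two linear pieces meet at the knot.
print(toy_mars_predict(1.0))  # 5.0 - 1.5 * (3 - 1) = 2.0
print(toy_mars_predict(3.0))  # exactly at the knot: 5.0
print(toy_mars_predict(5.0))  # 5.0 + 2.0 * (5 - 3) = 9.0
```

Because each basis function is zero over part of the input range, a hinge at an extreme knot affects predictions only locally, which is why outliers have a contained effect on the rest of the fit.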