'''Decision tree learning''' is a supervised learning approach used in [[statistics]], [[data mining]] and [[machine learning]]. In this formalism, a classification or regression [[decision tree]] is used as a [[predictive model]] to draw conclusions about a set of observations.
Tree models where the target variable can take a discrete set of values are called '''[[classification]] [[decision tree|trees]]'''; in these tree structures, [[leaf node|leaves]] represent class labels and branches represent [[Logical conjunction|conjunction]]s of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically [[real numbers]]) are called '''[[regression analysis|regression]] [[decision tree|trees]]'''. More generally, the concept of regression tree can be extended to any kind of object equipped with pairwise dissimilarities such as categorical sequences.<ref name=":1">{{Cite journal |
Decision trees are among the most popular machine learning algorithms given their intelligibility and simplicity.<ref>{{Cite journal|last1=Wu|first1=Xindong|last2=Kumar|first2=Vipin|last3=Ross Quinlan|first3=J.|last4=Ghosh|first4=Joydeep|last5=Yang|first5=Qiang|last6=Motoda|first6=Hiroshi|last7=McLachlan|first7=Geoffrey J.|last8=Ng|first8=Angus|last9=Liu|first9=Bing|last10=Yu|first10=Philip S.|last11=Zhou|first11=Zhi-Hua|date=2008-01-01|title=Top 10 algorithms in data mining|journal=Knowledge and Information Systems|language=en|volume=14|issue=1|pages=1–37|doi=10.1007/s10115-007-0114-2|s2cid=2367747|issn=0219-3116|hdl=10983/15329|hdl-access=free}}</ref>
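The distinction between classification and regression trees can be illustrated with a minimal sketch; the use of [[scikit-learn]] and its bundled iris and diabetes datasets below is an illustrative assumption, not something drawn from this article:

<syntaxhighlight lang="python">
# A minimal sketch, not taken from the article: scikit-learn and its bundled
# iris (discrete class labels) and diabetes (continuous target) datasets are
# illustrative assumptions used only to show the classification/regression split.
from sklearn.datasets import load_iris, load_diabetes
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: leaves hold class labels for a discrete target variable.
X_cls, y_cls = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_cls, y_cls)
print(clf.predict(X_cls[:5]))   # predicted class labels

# Regression tree: leaves hold real-valued predictions for a continuous target.
X_reg, y_reg = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_reg, y_reg)
print(reg.predict(X_reg[:5]))   # predicted real numbers
</syntaxhighlight>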
=== Advantages ===
* '''Non-parametric approach''' that makes no assumptions about the training data or prediction residuals; e.g., no distributional, independence, or constant variance assumptions.
* '''Performs well with large datasets.''' Large amounts of data can be analyzed using standard computing resources in reasonable time.
* '''Accuracy with flexible modeling.''' These methods model complex, non-linear relationships without assuming a fixed functional form, and have been applied in healthcare research with increased accuracy relative to more rigid parametric models.<ref>{{Cite journal |
* '''Mirrors human decision making more closely than other approaches.'''<ref name=":0" /> This could be useful when modeling human decisions/behavior.
* '''Robust against co-linearity''', particularly when the trees are combined through boosting, as illustrated in the sketch below.
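The co-linearity point can be illustrated with a minimal sketch, assuming synthetic data with two nearly identical features and scikit-learn's <code>GradientBoostingClassifier</code> as the boosted tree ensemble; both choices are illustrative assumptions and are not taken from the cited sources:

<syntaxhighlight lang="python">
# A minimal sketch, assuming synthetic data and scikit-learn's
# GradientBoostingClassifier as the boosted tree ensemble (illustrative
# assumptions, not drawn from the cited sources).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly collinear copy of x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = (x1 + x3 > 0).astype(int)              # labels depend on x1 and x3 only

# The boosted tree ensemble is fit on the correlated features as-is,
# without the decorrelation or feature removal a linear model may need.
model = GradientBoostingClassifier(n_estimators=100, max_depth=2, random_state=0)
print(cross_val_score(model, X, y, cv=5).mean())   # mean cross-validated accuracy
</syntaxhighlight>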