The dependent variable, <math>Y</math>, is the target variable that we are trying to understand, classify or generalize. The vector <math>\textbf{x}</math> is composed of the features, <math>x_1, x_2, x_3</math> etc., that are used for that task.
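As a concrete illustration, the following minimal Python sketch sets up data in this form; the feature values and targets are hypothetical, not from any source dataset:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical training data: each row of X is a feature vector
# (x_1, x_2, x_3), and the matching entry of Y is the target.
X = np.array([
    [2.7, 0, 1.5],
    [1.1, 1, 0.3],
    [3.9, 0, 2.2],
])
Y = np.array([4.2, 1.0, 5.8])  # continuous target (regression setting)
</syntaxhighlight>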
 
[[File:Cart tree kyphosis.png|thumb|500x500px|
alt=Three different representations of a regression tree of kyphosis data|
An example tree which estimates the probability of kyphosis after spinal surgery, given the age of the patient and the vertebra at which surgery was started. The same tree is shown in three different ways.]]
 
===Variance reduction===
Introduced in CART,<ref name="bfos"/> variance reduction is often employed in cases where the target variable is continuous (regression tree), since many other metrics would first require the target to be discretized before being applied. The variance reduction of a node {{mvar|N}} is defined as the total reduction of the variance of the target variable {{mvar|Y}} due to the split at this node:
 
:<math>
I_V(N) = \frac{1}{|S|^2}\sum_{i\in S} \sum_{j\in S} \frac{1}{2}(y_i - y_j)^2 - \left(\frac{|S_t|^2}{|S|^2}\frac{1}{|S_t|^2}\sum_{i\in S_t} \sum_{j\in S_t} \frac{1}{2}(y_i - y_j)^2 + \frac{|S_f|^2}{|S|^2}\frac{1}{|S_f|^2}\sum_{i\in S_f} \sum_{j\in S_f} \frac{1}{2}(y_i - y_j)^2\right),
</math>

where <math>S</math>, <math>S_t</math>, and <math>S_f</math> are the set of presplit sample indices, the set of sample indices for which the split test is true, and the set of sample indices for which the split test is false, respectively. Each of the above summands is a variance estimate, written in a form that does not directly refer to the mean.
By replacing <math>(y_i - y_j)^2</math> in the formula above with the dissimilarity <math>d_{ij}</math> between two objects <math>i</math> and <math>j</math>, the variance reduction criterion applies to any kind of object for which pairwise dissimilarities can be computed.<ref name=":1" />
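The following is a minimal Python sketch of this criterion, evaluated directly in its pairwise form; the function names are illustrative, not from the cited sources. Replacing the squared difference inside <code>pairwise_variance</code> with any dissimilarity <math>d_{ij}</math> yields the generalization described above.

<syntaxhighlight lang="python">
import numpy as np

def pairwise_variance(y):
    """(1/|S|^2) * sum_i sum_j (1/2)(y_i - y_j)^2: the population
    variance of y, written without reference to the mean."""
    diff = y[:, None] - y[None, :]   # all pairwise differences
    return 0.5 * np.mean(diff ** 2)

def variance_reduction(y, mask):
    """Variance reduction I_V(N) at a node whose samples y are split
    into those where the test is true (mask) and false (~mask)."""
    y = np.asarray(y, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    y_t, y_f = y[mask], y[~mask]
    n, n_t, n_f = len(y), len(y_t), len(y_f)
    return pairwise_variance(y) - (
        (n_t / n) ** 2 * pairwise_variance(y_t)
        + (n_f / n) ** 2 * pairwise_variance(y_f)
    )

# Example: score a candidate split on a feature threshold.
y = np.array([1.0, 1.2, 0.9, 4.8, 5.1, 5.3])
x = np.array([0.2, 0.3, 0.1, 0.8, 0.9, 0.7])
print(variance_reduction(y, x < 0.5))  # large reduction: a good split
</syntaxhighlight>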
 
===Measure of "goodness"===
Used by CART in 1984,<ref name="ll">{{Cite book
|last=Larose