Revision as of 17:32, 10 August 2023 by 198.187.147.105: link pointed to a missing section (Tag: Visual edit)
Revision as of 18:39, 4 September 2023 by 88.17.172.167: Grammar
Line 166:
===Gini impurity===
'''Gini impurity''', '''Gini's diversity index''',<ref>{{cite web |title=Growing Decision Trees |url=https://www.mathworks.com/help/stats/growing-decision-trees.html |website=MathWorks }}</ref> or '''[[Diversity index#Gini–Simpson index|Gini-Simpson Index]]''' in biodiversity research, is named after Italian mathematician [[Corrado Gini]] and used by the CART (classification and regression tree) algorithm for classification trees. Gini impurity measures how often a randomly chosen element of a set would be incorrectly labeled if it ~~was~~were labeled randomly and independently according to the distribution of labels in the set. It reaches its minimum (zero) when all cases in the node fall into a single target category. For a set of items with <math>J</math> classes and relative frequencies <math>p_i</math>, <math>i \in \{1, 2, \ldots, J\}</math>, the probability of choosing an item with label <math>i</math> is <math>p_i</math>, and the probability of miscategorizing that item is <math>\sum_{k \ne i} p_k = 1-p_i</math>. The Gini impurity is computed by summing pairwise products of these probabilities for each class label:
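The computation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the CART implementation: the function name <code>gini_impurity</code> and the sample labels are hypothetical and do not come from the article. The sketch sums <math>p_i (1 - p_i)</math> over the class labels, which is algebraically equal to <math>1 - \sum_i p_i^2</math>.

```python
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity of a non-empty sequence of class labels.

    Hypothetical helper written for illustration only.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("labels must be non-empty")
    # Relative frequency p_i of each distinct class label.
    freqs = [count / n for count in Counter(labels).values()]
    # For each label: probability of drawing it (p_i) times the
    # probability of then mislabeling it (1 - p_i).
    return sum(p * (1 - p) for p in freqs)


pure = gini_impurity(["a", "a", "a"])   # every item in one class
mixed = gini_impurity(["a", "b"])       # two equally frequent classes
```

A node whose items all share one label gives an impurity of zero, matching the minimum stated above, while two equally frequent labels give 0.5.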

Decision tree learning: Difference between revisions