In [[statistics]] and [[Machine_learning|machine learning]], '''discretization''' is the process of converting or partitioning continuous [[Variable_(statistics)#Applied_statistics|attributes]], [[Features_(pattern_recognition)|features]] or [[Dependent_and_independent_variables|variables]] into discretized or [[nominal data|nominal]] attributes, features, variables or [[Interval_(mathematics)|intervals]]. This can be useful when creating probability mass functions, formally in [[density estimation]]. It is a form of [[data binning|binning]], as in making a [[histogram]].
Typically, data is discretized into ''K'' partitions of equal width (equal intervals) or into partitions that each contain ''K''% of the total data (equal frequencies).<ref name=clarke>{{cite web|url=http://sci2s.ugr.es/keel/pdf/specific/articulo/IJIS00.pdf |title=Entropy and MDL Discretization of Continuous Variables for Bayesian Belief Networks |accessdate=2008-07-10 }}</ref>
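The two schemes can be illustrated with a minimal sketch in Python using NumPy. The function names <code>equal_width_bins</code> and <code>equal_frequency_bins</code> and the sample data are hypothetical and chosen only for illustration; the sketch assumes a one-dimensional numeric sample and takes the number of bins <code>k</code> as its parameter, so equal-frequency bins each hold roughly 100/<code>k</code> percent of the data.

```python
import numpy as np


def equal_width_bins(values, k):
    """Assign each value to one of k intervals of equal width (labels 0..k-1)."""
    values = np.asarray(values, dtype=float)
    # k + 1 evenly spaced edges spanning the observed range.
    edges = np.linspace(values.min(), values.max(), k + 1)
    # Only the interior edges are used, so the maximum lands in the last bin.
    return np.digitize(values, edges[1:-1])


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    values = np.asarray(values, dtype=float)
    # Interior cut points at the 1/k, 2/k, ... quantiles of the sample.
    cut_points = np.quantile(values, np.linspace(0, 1, k + 1)[1:-1])
    return np.digitize(values, cut_points)


# Illustrative sample with one outlier (assumed data, not from a real dataset).
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
width_labels = equal_width_bins(data, 3)
frequency_labels = equal_frequency_bins(data, 3)
```

With this sample, the single outlier stretches the equal-width intervals so that almost all values share the first bin, whereas the equal-frequency scheme places three values in each bin.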
Some mechanisms for discretizing continuous data include: