Discretization of continuous features: Difference between revisions

m top: replace id={{foobar|...}} with |foobar=..., Report bugs, errors, and suggestions at User talk:CitationCleanerBot.
m top: clean up spacing around commas and other punctuation fixes, replaced: ; → ;
 
(6 intermediate revisions by 5 users not shown)
Line 8:
| journal = International Journal of Intelligent Systems
| volume = 15
| pages = 61–92
| year = 2000
| pmid =
Line 16:
</ref>
 
Mechanisms for discretizing continuous data include [[Usama Fayyad|Fayyad]] & Irani's MDL method,<ref>Fayyad, Usama M.; Irani, Keki B. (1993) {{cite web|hdl=2014/35171 | url = https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf | title = Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning| date = 29 July 2023 }}, ''Proc. 13th Int. Joint Conf. on Artificial Intelligence'' (Q334 .I571 1993), pp. 1022–1027</ref> which uses [[mutual information]] to recursively define the best bins, as well as CAIM, CACC, Ameva, and many others.<ref>Dougherty, J.; Kohavi, R.; Sahami, M. (1995). "[http://robotics.stanford.edu/users/sahami/papers-dir/disc.pdf Supervised and Unsupervised Discretization of Continuous Features]". In A. Prieditis & S. J. Russell, eds. ''Work''. Morgan Kaufmann, pp. 194–202</ref>
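
The recursive, entropy-based binning described above can be illustrated with a minimal Python sketch. It follows the commonly cited form of the Fayyad & Irani criterion: choose the cut that maximises information gain (the mutual information between the class and the binary split), accept it only if the gain exceeds an MDL-derived threshold, then recurse on both halves. The function names (<code>entropy</code>, <code>best_cut</code>, <code>mdlp_cut_points</code>) are hypothetical and are not taken from any of the packages listed in this article; the sketch favours clarity over efficiency.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut(xs, ys):
    """Return (index, weighted_entropy) of the lowest-entropy binary split.

    xs must be sorted; the split puts items [0, index) on the left.
    Returns None when no cut is possible (fewer than two distinct values).
    """
    n = len(xs)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # cannot cut between identical values
        e = (i / n) * entropy(ys[:i]) + ((n - i) / n) * entropy(ys[i:])
        if best is None or e < best[1]:
            best = (i, e)
    return best


def _split(xs, ys, cuts):
    """Recursive step: accept the best cut only if it passes the MDL test."""
    n = len(xs)
    found = best_cut(xs, ys)
    if found is None:
        return
    i, split_entropy = found
    left, right = ys[:i], ys[i:]
    ent_s = entropy(ys)
    gain = ent_s - split_entropy  # information gain of the cut
    k, k1, k2 = len(set(ys)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (
        k * ent_s - k1 * entropy(left) - k2 * entropy(right)
    )
    # MDL stopping rule: reject the cut if the gain does not pay for
    # the cost of encoding it.
    if gain <= (log2(n - 1) + delta) / n:
        return
    cuts.append((xs[i - 1] + xs[i]) / 2)  # midpoint between neighbours
    _split(xs[:i], left, cuts)
    _split(xs[i:], right, cuts)


def mdlp_cut_points(values, labels):
    """Return the sorted list of accepted cut points for one feature."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [v for v, _ in pairs]
    ys = [c for _, c in pairs]
    cuts = []
    _split(xs, ys, cuts)
    return sorted(cuts)
```

For example, <code>mdlp_cut_points(list(range(1, 21)), ['a'] * 10 + ['b'] * 10)</code> returns <code>[10.5]</code>, a single cut between the two pure class regions, whereas strictly alternating labels over eight values yield no cut because no split gains enough information to pass the MDL test.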
 
Many machine learning algorithms are known to produce better models when continuous attributes are discretized.<ref>{{cite journal|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.3084&rep=rep1&type=pdf | first1=S. |last1=Kotsiantis |first2= D| last2= Kanellopoulos |title=Discretization Techniques: A recent survey|journal= GESTS International Transactions on Computer Science and Engineering |volume=32 |issue=1 |year=2006 |pages= 47–58|citeseerx = 10.1.1.109.3084}}</ref>
 
== Software ==
This is a partial list of software that implements the MDL algorithm.
* [https://gforge.inria.fr/projects/discretize4crf discretize4crf] tool designed to work with popular [[Conditional random field|CRF]] implementations ([[C++]])
* [https://cran.r-project.org/web/packages/discretization/discretization.pdf mdlp] in the R package discretization
* [https://cran.r-project.org/web/packages/RWeka/RWeka.pdf Discretize] in the R package RWeka