Revision as of 19:50, 5 May 2008 by 80.216.41.247 (no edit summary)
Revision as of 01:28, 15 January 2009 by MLauba (m: References wikified as part of the Wikification WikiProject. Still needs work.)
{{context}}
Many factors affect the success of [[machine learning]] (ML) on a given task. The representation and quality of the instance [[data]] come first and foremost.<ref>Pyle, D. (1999). Data Preparation for Data Mining. Morgan Kaufmann Publishers, Los Altos, CA.</ref> If the data contain much irrelevant and redundant information, or are noisy and unreliable, then knowledge discovery during the training phase is more difficult. Data preparation and filtering steps can take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and feature selection. The product of data pre-processing is the final training set. Kotsiantis et al. (2006) present a well-known algorithm for each step of data pre-processing.<ref>Kotsiantis, S., Kanellopoulos, D., Pintelas, P. (2006). Data Preprocessing for Supervised Leaning. International Journal of Computer Science, Vol. 1, No. 2, pp. 111-117.</ref>

==References==
{{reflist}}

[[Category:Machine learning]]
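Three of the pre-processing steps named in the article (data cleaning, normalization and feature selection) can be illustrated with a minimal sketch. The choices here are assumptions made for illustration only: mean imputation for missing values, min-max scaling, and dropping constant columns. They are not the algorithms presented by Kotsiantis et al., and the function names `clean`, `normalize` and `select_features` are hypothetical.

```python
from statistics import mean


def clean(rows):
    """Data cleaning: replace missing values (None) with the column mean.

    Assumes every column has at least one non-missing value.
    """
    columns = list(zip(*rows))
    means = [mean(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]


def select_features(rows):
    """Feature selection: keep only columns whose values actually vary."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if max(col) > min(col)]
    return [[row[i] for i in keep] for row in rows], keep


def normalize(rows):
    """Normalization: min-max rescale each column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Toy instance data: one missing value and one constant (uninformative) column.
raw = [
    [1.0, 10.0, 5.0],
    [2.0, None, 5.0],
    [3.0, 30.0, 5.0],
]

cleaned = clean(raw)
selected, kept = select_features(cleaned)
training_set = normalize(selected)  # the product of pre-processing
print(training_set)
```

Each step takes and returns a plain list of rows, so the steps can be reordered or replaced independently; a real pipeline would typically use a dedicated library rather than hand-written helpers.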

Data preprocessing: Difference between revisions