Revision as of 22:54, 26 August 2007 by SmackBot (3,734,324 edits): Date/fix the maintenance tags or gen fixes
Revision as of 02:50, 3 September 2007 by RichardVeryard (Extended confirmed user, 3,183 edits): category added
{{context}}
Many factors affect the success of [[Machine learning]] (ML) on a given task. The representation and quality of the instance [[data]] come first and foremost (Pyle, 1999). If the data contain much irrelevant or redundant information, or are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be convenient if a single sequence of data pre-processing algorithms gave the best performance on every data set, but this is not the case. Kotsiantis et al. (2006) present the best-known algorithms for each step of data pre-processing, so that one can achieve the best performance for a given data set.

==References==
S. Kotsiantis, D. Kanellopoulos, P. Pintelas, Data Preprocessing for Supervised Leaning, International Journal of Computer Science, 2006, Vol 1 N. 2, pp 111-117.

Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann Publishers, Los Altos, CA.

[[Category:Machine learning]]
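
As an illustration of three of the pre-processing steps named in the article text (data cleaning, normalization, and feature selection), the following is a minimal sketch in Python. It is not taken from Kotsiantis et al. or Pyle. The function names, the choice of min-max scaling, the variance threshold, and the sample data are all assumptions made for this example, and the fixed order of steps shown here is only one possible sequence.

```python
from statistics import pvariance


def clean(rows):
    """Data cleaning (illustrative): drop rows that contain a missing value (None)."""
    return [row for row in rows if all(v is not None for v in row)]


def min_max_normalize(rows):
    """Normalization (illustrative): rescale each column to the range [0, 1]."""
    columns = list(zip(*rows))
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column has zero span; map every value to 0.0.
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled)]


def select_features(rows, min_variance=1e-12):
    """Feature selection (illustrative): keep columns whose variance exceeds a threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > min_variance]
    return [[row[i] for i in keep] for row in rows], keep


def preprocess(rows):
    """Apply the steps in one fixed order; other data sets may need another order."""
    cleaned = clean(rows)
    normalized = min_max_normalize(cleaned)
    return select_features(normalized)


# Hypothetical raw instances: the second row has a missing value,
# and the third column is constant (uninformative).
raw = [
    [1.0, 10.0, 5.0],
    [2.0, None, 5.0],
    [3.0, 30.0, 5.0],
    [5.0, 20.0, 5.0],
]
training_set, kept_columns = preprocess(raw)
```

With this sample input, the row containing the missing value is dropped, the remaining values are rescaled per column, and the constant third column is removed, so `training_set` holds three instances with two features each.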

Data preprocessing: Difference between revisions