Data preprocessing

'''Data preprocessing''' can refer to the manipulation or dropping of data before it is used, in order to ensure or enhance performance,<ref>{{Cite web|title=Guide To Data Cleaning: Definition, Benefits, Components, And How To Clean Your Data|url=https://www.tableau.com/learn/articles/what-is-data-cleaning|access-date=2021-10-17|website=Tableau|language=en-US}}</ref> and is an important step in the [[data mining]] process. The phrase [[GIGO|"garbage in, garbage out"]] is particularly applicable to [[data mining]] and [[machine learning]] projects. [[Data collection|Data-gathering]] methods are often loosely controlled, resulting in [[range error|out-of-range]] values (e.g., Income: −100), impossible data combinations (e.g., Sex: Male, Pregnant: Yes), and [[missing values]], among other issues.
 
Analyzing data that has not been carefully screened for such problems can produce misleading results. Thus, the representation and [[data quality|quality of the data]] must be addressed first, before any analysis is run.<ref>Pyle, D., 1999. ''Data Preparation for Data Mining.'' Morgan Kaufmann Publishers, [[Los Altos, California]].</ref>
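
The kinds of problems named above (out-of-range values, impossible combinations, and missing values) can be screened for with a few simple checks. The following is a minimal sketch in Python, not a standard routine: the function name <code>screen_record</code>, the field names, and the rule that income must be non-negative are all assumptions chosen to mirror the examples in the text.

```python
def screen_record(record):
    """Return a list of data-quality problems found in one record.

    Hypothetical helper: field names and rules are illustrative only.
    """
    problems = []

    # Missing values: a field counts as missing if absent or None.
    for field in ("income", "sex", "pregnant"):
        if record.get(field) is None:
            problems.append("missing value: " + field)

    # Out-of-range value: income is assumed to be non-negative.
    income = record.get("income")
    if income is not None and income < 0:
        problems.append("out-of-range value: income")

    # Impossible combination, as in the example above.
    if record.get("sex") == "Male" and record.get("pregnant") == "Yes":
        problems.append("impossible combination: sex/pregnant")

    return problems


records = [
    {"income": 52000, "sex": "Female", "pregnant": "No"},
    {"income": -100, "sex": "Female", "pregnant": "No"},
    {"income": 31000, "sex": "Male", "pregnant": "Yes"},
    {"income": None, "sex": "Male", "pregnant": "No"},
]

# Keep only the records that pass every check.
clean = [r for r in records if not screen_record(r)]
```

In this sketch only the first record survives screening. Whether a flagged record should be dropped, corrected, or imputed depends on the analysis and is left open here.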