Replaced a potentially offensive example of an impossible data combination. Tag: Reverted |
Nmacpherson (talk | contribs) m Reverted edits by 208.104.252.252 (talk) (AV) |
Line 1:
'''Data preprocessing''' can refer to the manipulation or removal of data before it is used, in order to ensure or improve performance,<ref>{{Cite web|title=Guide To Data Cleaning: Definition, Benefits, Components, And How To Clean Your Data|url=https://www.tableau.com/learn/articles/what-is-data-cleaning|access-date=2021-10-17|website=Tableau|language=en-US}}</ref> and is an important step in the [[data mining]] process. The phrase [[GIGO|"garbage in, garbage out"]] is particularly applicable to data mining and [[machine learning]] projects. [[Data collection|Data-gathering]] methods are often loosely controlled, resulting in [[range error|out-of-range]] values (e.g., Income: −100), impossible data combinations (e.g.,
Analyzing data that has not been carefully screened for such problems can produce misleading results. Thus, the representation and [[data quality|quality of data]] must be checked first, before running any analysis.<ref>Pyle, D., 1999. ''Data Preparation for Data Mining.'' Morgan Kaufmann Publishers, [[Los Altos, California]].</ref>
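As a minimal sketch of the kind of screening described above, the following Python snippet flags out-of-range values before any analysis is run. The function name <code>find_out_of_range</code>, the sample records, and the bounds 0 and 10,000,000 are assumptions chosen for illustration, not taken from the cited sources. Only the Income value of −100 mirrors the example in the text, and all values are assumed to be numeric.

```python
def find_out_of_range(records, field, minimum, maximum):
    """Return (index, value) pairs for records whose field lies outside [minimum, maximum]."""
    flagged = []
    for index, record in enumerate(records):
        value = record[field]
        # A value is out of range if it falls below the minimum or above the maximum.
        if value < minimum or value > maximum:
            flagged.append((index, value))
    return flagged


# Hypothetical records; the second one reproduces the "Income: -100" example.
records = [
    {"Income": 52000},
    {"Income": -100},
    {"Income": 71000},
]

# The bounds are illustrative assumptions; real limits depend on the data set.
flagged = find_out_of_range(records, "Income", 0, 10_000_000)
print(flagged)
```

Flagged records can then be corrected, removed, or examined further, depending on the project. The sketch only reports them and leaves that decision to the analyst.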