Data editing: Difference between revisions

m Bellerophon moved page Draft:Data editing to Draft:Statistical Data editing: To avoid this being repeatedly declined as already existing. The topic is different.
Commenting on Wikipedia:Articles for creation submission (AFCH)
{{AFC submission|d|exists|Data editing|u=Stats sanx|ns=118|decliner=FoCuSandLeArN|declinets=20140610154447|ts=20140610141312|small=yes}}
{{AFC submission|d|exists|Data editing|u=Stats sanx|ns=118|decliner=MatthewVanitas|declinets=20140609152707|small=yes|ts=20140609145436}} <!-- Do not remove this line! -->
 
{{AFC comment|1=Is there a reason why you submitted again when the topic is already covered in Wikipedia? [[User:FoCuSandLeArN|FoCuSandLeArN]] ([[User talk:FoCuSandLeArN|talk]]) 15:44, 10 June 2014 (UTC)}}
 
{{AFC comment|1=[[User:MatthewVanitas|MatthewVanitas]] ([[User talk:MatthewVanitas|talk]]) 15:27, 9 June 2014 (UTC)}}
 
{{afc comment|1=The topic the author is writing about is not the same as [[Data editing]]. [[User:Bellerophon|<span style="font:small-caps 1.0em Alexandria,serif;color=#00008B">'''Bellerophon''']]</span> [[User talk:Bellerophon|<span style="font:0.75em Verdana,Geneva,sans-serif;color:#9966CC;"><sub>''talk to me''</sub>]]</span> 19:49, 10 June 2014 (UTC)}}
----
<!-- Do not remove this line! -->
<!-- Do not remove this line! -->
 
<!-- EDIT BELOW THIS LINE -->
=Statistical Data Editing=
Data editing is the process of reviewing and adjusting collected survey data. Its purpose is to control the quality of the collected data.<ref>[http://www.unece.org/stats/editing.html UNECE]</ref> Data editing can be performed manually, with the assistance of a computer, or by a combination of both.<ref>http://www.statcan.gc.ca/edu/power-pouvoir/ch3/editing-edition/5214781-eng.htm</ref>
== Editing Methods ==
 
=== Interactive editing ===
 
The term interactive editing is commonly used for modern computer-assisted manual editing. Most interactive data editing tools applied at National Statistical Institutes (NSIs) allow one to check the specified edits during or after data entry and, if necessary, to correct erroneous data immediately. Several approaches can be followed to correct erroneous data:
*Recontact the respondent
*Use the subject matter knowledge of the human editor
Interactive editing is a standard way to edit data. It can be used to edit both [[categorical]] and [[continuous]] data.<ref>Waal, Ton de et al. "Handbook of Statistical Data Editing and Imputation". Wiley publication, 2011, p. 15.</ref> Interactive editing reduces the time needed to complete the cyclical process of review and adjustment.<ref>http://www.unece.org/fileadmin/DAM/stats/publications/editing/SDE1chA.pdf</ref>
=== Selective editing ===
 
Selective editing is an umbrella term for several methods of identifying influential errors<ref group=note>Errors that have a substantial impact on the publication figures.</ref> and outliers.<ref group=note>Values that do not fit a model of the data well.</ref> Selective editing techniques aim to apply interactive editing to a well-chosen subset of the records, so that the limited time and resources available for interactive editing are allocated to those records where it has the most effect on the quality of the final estimates of publication figures. In selective editing, data is split into two streams:
*The critical stream
*The noncritical stream
The critical stream consists of records that are more likely to contain influential errors. These critical records are edited in a traditional interactive manner. The records in the noncritical stream, which are unlikely to contain influential errors, are not edited in a computer-assisted manner.<ref>Waal, Ton de et al. "Handbook of Statistical Data Editing and Imputation". Wiley publication, 2011, p. 16.</ref>
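
The split into two streams can be illustrated with a minimal Python sketch. The score used here (survey weight times the absolute deviation from an anticipated value), the threshold and all names are assumptions made for illustration only; they are not taken from the cited handbook.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    reported: float      # value reported by the respondent
    anticipated: float   # hypothetical reference value, e.g. last period's figure
    weight: float        # survey weight of the record


def score(record: Record) -> float:
    """Assumed score: weighted absolute deviation from the anticipated value."""
    return record.weight * abs(record.reported - record.anticipated)


def split_streams(records, threshold):
    """Return (critical, noncritical) lists of records for a given threshold."""
    critical = [r for r in records if score(r) > threshold]
    noncritical = [r for r in records if score(r) <= threshold]
    return critical, noncritical


records = [
    Record("A", reported=1200.0, anticipated=1150.0, weight=2.0),
    Record("B", reported=98000.0, anticipated=9800.0, weight=5.0),
    Record("C", reported=430.0, anticipated=400.0, weight=10.0),
]
critical, noncritical = split_streams(records, threshold=1000.0)
```

With these illustrative figures only record B ends up in the critical stream, so it alone would be passed to a human editor.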
=== Macro editing ===
 
There are two forms of macro editing:<ref>Waal, Ton de et al. "Handbook of Statistical Data Editing and Imputation". Wiley publication, 2011, p. 16.</ref>
==== Aggregation method ====
 
This method is followed in almost every statistical agency before publication: it verifies whether the figures to be published seem plausible. This is accomplished by comparing quantities in publication tables with the same quantities in previous publications. If an unusual value is observed, a micro-editing procedure is applied to the individual records and fields contributing to the suspicious quantity.<ref>http://www.unece.org/fileadmin/DAM/stats/publications/editing/SDE1chB.pdf</ref>
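
As a rough sketch of the comparison step, the following Python fragment flags publication cells whose value changed by more than a chosen relative tolerance since the previous publication. The function name, the 25% tolerance and the sample figures are hypothetical; agencies may judge plausibility quite differently.

```python
def flag_suspicious_aggregates(current, previous, tolerance=0.25):
    """Return the keys whose relative change from the previous publication
    exceeds the tolerance (an assumed plausibility criterion)."""
    flagged = []
    for key, value in current.items():
        prior = previous.get(key)
        if prior is None or prior == 0:
            continue  # no usable basis for comparison
        relative_change = abs(value - prior) / abs(prior)
        if relative_change > tolerance:
            flagged.append(key)
    return flagged


# Hypothetical publication totals per industry.
previous = {"retail": 1000.0, "construction": 500.0, "transport": 200.0}
current = {"retail": 1040.0, "construction": 900.0, "transport": 190.0}
suspicious = flag_suspicious_aggregates(current, previous)
```

Here only the construction total is flagged, so only the records contributing to that cell would be examined further.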
==== Distribution method ====
 
The available data are used to characterize the distribution of the variables. All individual values are then compared with this distribution. Records containing values that could be considered uncommon (given the distribution) are candidates for further inspection and possibly for editing.<ref>Bethlehem, J. "Applied Survey Methods: A Statistical Perspective". Wiley publication, 2009, p. 205.</ref>
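
One possible way to compare values with the distribution is sketched below in Python. It summarizes the distribution by the median and the median absolute deviation (MAD) and flags values that lie far from the median. This particular statistic and the cutoff of 5 are assumptions chosen for illustration, not the method prescribed by the cited source.

```python
from statistics import median


def uncommon_value_indices(values, cutoff=5.0):
    """Return indices of values whose distance from the median exceeds
    cutoff times the median absolute deviation (MAD)."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure distance against
    return [i for i, v in enumerate(values) if abs(v - center) / mad > cutoff]


# Hypothetical reported values for one variable.
values = [10, 12, 11, 13, 12, 95, 11]
candidates = uncommon_value_indices(values)
```

The flagged records are only candidates for inspection; whether a value is actually in error still has to be decided separately.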
=== Automatic editing ===
 
In automatic editing, records are edited by a computer without human intervention.<ref>Waal, Ton de et al. "Handbook of Statistical Data Editing and Imputation". Wiley publication, 2011, p. 16.</ref><ref>http://www.unece.org/fileadmin/DAM/stats/publications/editing/SDE1chC.pdf</ref> Prior knowledge about the values of a single variable or a combination of variables can be formulated as a set of edit rules which specify or constrain the admissible values.<ref>http://www.cbs.nl/NR/rdonlyres/E1FF7D78-E697-42E7-A36D-94AE74EDB83A/0/201309x10pub.pdf</ref>
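
The idea of edit rules can be sketched in Python as named predicates over a record. The three rules and the field names below are hypothetical examples, and the sketch only detects which rules a record violates; deciding which fields to change is a further step that is not shown.

```python
# Each edit rule returns True when the record is admissible under that rule.
EDIT_RULES = {
    # constraint on a single variable
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
    # constraint on a combination of categorical and numerical variables
    "married_implies_adult": lambda r: r["marital_status"] != "married" or r["age"] >= 15,
    # balance constraint on numerical variables
    "profit_balance": lambda r: r["turnover"] - r["costs"] == r["profit"],
}


def violated_edits(record):
    """Return the names of all edit rules that the record fails."""
    return [name for name, rule in EDIT_RULES.items() if not rule(record)]


record = {"age": 8, "marital_status": "married",
          "turnover": 100, "costs": 60, "profit": 40}
violations = violated_edits(record)
```

For this record only the rule combining marital status and age is violated, which indicates that at least one of those two fields is inconsistent.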
== Notes ==
 
{{reflist|group=note}}
 
== References ==
 
{{reflist}}