Data validation and reconciliation: Difference between revisions

Gmstanley (talk | contribs)
m History: punctuation corrected ~~~~
Gmstanley (talk | contribs)
m Error types: Clarified assumption of normal distribution, nature of errors being reconciled ~~~~
Line 16:
# [[systematic errors]] (or gross errors) due to sensor [[calibration]] or faulty data transmission.
 
A [[random error]] means that the measurement <math style="vertical-align:-30%;">y\,\!</math> is a [[normal distribution|normally distributed]] [[random variable]] with [[mean]] <math style="vertical-align:-30%;">y^*\,\!</math>, where <math style="vertical-align:-30%;">y^*\,\!</math> is the true value, which is typically not known. A [[systematic error]], on the other hand, is characterized by a measurement <math style="vertical-align:-30%;">y\,\!</math> that is a normally distributed random variable with [[mean]] <math>\bar{y}\,\!</math>, which is not equal to the true value <math style="vertical-align:-30%;">y^*\,\!</math>. For ease in deriving and implementing an optimal estimation solution, and based on the argument that errors are the sum of many factors (so that the [[central limit theorem]] applies at least approximately), data reconciliation assumes these errors are [[normal distribution|normally distributed]].
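
The distinction between the two error types can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the true value, standard deviation, and bias are hypothetical numbers chosen for illustration, and the helper <code>simulate_measurements</code> is not part of any data reconciliation package. With purely random error the sample mean approaches <math>y^*</math>; with a systematic error it approaches <math>\bar{y} \neq y^*</math>.

```python
import random
import statistics


def simulate_measurements(true_value, sigma, bias=0.0, n=10_000, seed=0):
    """Draw n normally distributed readings with mean true_value + bias."""
    rng = random.Random(seed)
    return [rng.gauss(true_value + bias, sigma) for _ in range(n)]


# Hypothetical values, chosen only for illustration.
Y_TRUE = 100.0  # true value y* (unknown in practice)
SIGMA = 2.0     # standard deviation of the random error
BIAS = 5.0      # systematic offset, e.g. from a miscalibrated sensor

random_only = simulate_measurements(Y_TRUE, SIGMA)
with_bias = simulate_measurements(Y_TRUE, SIGMA, bias=BIAS)

# Mean of readings with random error only: close to y*.
mean_random = statistics.fmean(random_only)
# Mean of readings with a systematic error: close to y* + BIAS, not y*.
mean_biased = statistics.fmean(with_bias)

print(f"random error only: mean = {mean_random:.2f}")
print(f"systematic error:  mean = {mean_biased:.2f}")
```

Averaging more readings narrows the spread around each mean, but it does not move the biased mean toward the true value.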
 
Other sources of error when calculating plant balances are small instabilities in plant operations. Not all measurements and samples are taken at the same time, which causes discrepancies between measurements. Using time averages for plant data partly reduces this problem, but lab analyses cannot be averaged.
 
Other sources of error when calculating plant balances include process faults such as leaks, unmodeled heat losses, incorrect physical properties or other physical parameters used in equations, and incorrect structure such as unmodeled bypass lines. Further errors arise from unmodeled plant dynamics such as holdup changes, and from other instabilities in plant operations that violate steady-state (algebraic) models. Additional dynamic errors arise when measurements and samples are not taken at the same time, especially lab analyses.
 
The normal practice of using time averages for the data input partly reduces the dynamic problems. However, it does not completely resolve timing inconsistencies for infrequently sampled data such as lab analyses.
 
This use of average values, like a [[moving average]], acts as a [[low-pass filter]], so high-frequency noise is mostly eliminated. The result is that, in practice, data reconciliation mainly makes adjustments to correct systematic errors such as biases.
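
The filtering effect described above can be sketched numerically. The example below is an illustration under assumed values, not a description of any particular plant historian or reconciliation tool: the signal is a constant true value plus a hypothetical bias and normally distributed noise, and <code>moving_average</code> is a plain trailing average written for this sketch. The averaged signal has a much smaller spread than the raw one, while the bias passes through unchanged.

```python
import random
import statistics


def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averaged = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest sample, drop the oldest.
        running += values[i] - values[i - window]
        averaged.append(running / window)
    return averaged


# Hypothetical values, chosen only for illustration.
Y_TRUE = 100.0  # true steady-state value
SIGMA = 2.0     # standard deviation of the random noise
BIAS = 5.0      # systematic offset in the sensor
WINDOW = 50     # number of samples per average

rng = random.Random(1)
raw = [rng.gauss(Y_TRUE + BIAS, SIGMA) for _ in range(5000)]
smoothed = moving_average(raw, WINDOW)

raw_spread = statistics.pstdev(raw)
smoothed_spread = statistics.pstdev(smoothed)
residual_bias = statistics.fmean(smoothed) - Y_TRUE

print(f"spread before averaging: {raw_spread:.2f}")
print(f"spread after averaging:  {smoothed_spread:.2f}")
print(f"bias left after averaging: {residual_bias:.2f}")
```

Because averaging leaves the offset in place, a later reconciliation step that works on averaged data is left mostly with bias-like errors to correct.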
 
===Necessity of removing measurement errors===