Data validation and reconciliation: Difference between revisions

DVR replaced by PDR (process data reconciliation). Current VDI 2048 edition is 2017 not 2000.
Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5
 
(11 intermediate revisions by 10 users not shown)
Line 1:
{{Short description|A technology to correct measurements in industrial processes}}
'''Industrial process data validation and reconciliation''', or more briefly, '''process data reconciliation (PDR)''', is a technology that uses process information and mathematical methods to automatically ensure [[data validation]] and reconciliation by correcting measurements in industrial processes. PDR extracts accurate and reliable information about the state of industrial processes from raw measurement [[data]] and produces a single consistent set of data representing the most likely process operation.
 
Line 12 ⟶ 13:
File:Normal_with_bias.jpg|Normally distributed measurements with bias.
</gallery>
Data typically originate from [[measurements]] taken at different places throughout the industrial site, for example temperature, pressure, and volumetric flow rate measurements. To understand the basic principles of PDR, it is important to first recognize that plant measurements are never 100% correct, i.e. the raw measurement <math>y\,</math> is not a solution of the nonlinear system <math>F(y)=0\,\!</math>. When measurements are used without correction to generate plant balances, inconsistencies are common. [[Observational error|Measurement errors]] can be categorized into two basic types:
# [[random error]]s due to intrinsic [[sensor]] [[accuracy]] and
# [[systematic errors]] (or gross errors) due to sensor [[calibration]] or faulty data transmission.
Line 32 ⟶ 33:
==History==
PDR has grown in importance as industrial processes have become more complex. PDR started in the early 1960s with applications aiming at closing [[mass balance|material balances]] in production processes where raw measurements were available for all [[variable (mathematics)|variables]].<ref>D.R. Kuehn, H. Davidson, ''Computer Control II. Mathematics of Control'', Chem. Eng. Progress 57: 44–47, 1961.</ref> At the same time, the problem of [[systematic error|gross error]] identification and elimination was presented.<ref>V. Vaclavek, ''Studies on System Engineering I. On the Application of the Calculus of the Observations of Calculations of Chemical Engineering Balances'', Coll. Czech Chem. Commun. 34: 3653, 1968.</ref> In the late 1960s and 1970s, unmeasured variables were taken into account in the data reconciliation process.<ref>V. Vaclavek, M. Loucka, ''Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plant'', Chem. Eng. Sci. 31: 1199–1205, 1976.</ref><ref name="Mah-Stanley-Downing-1976">[[Richard S. H. Mah|R.S.H. Mah]], G.M. Stanley, D.W. Downing, [http://gregstanleyandassociates.com/ReconciliationRectificationProcessData-1976.pdf ''Reconciliation and Rectification of Process Flow and Inventory Data'', Ind. & Eng. Chem. Proc. Des. Dev. 15: 175–183, 1976.]</ref> PDR also matured by considering general nonlinear equation systems coming from thermodynamic models.<ref>J.C. Knepper, J.W. Gorman, ''Statistical Analysis of Constrained Data Sets'', AIChE Journal 26: 260–264, 1980.</ref><ref name="Stanley-Mah-1977">G.M. Stanley and R.S.H. Mah, [http://gregstanleyandassociates.com/AIChEJ-1977-EstimationInProcessNetworks.pdf ''Estimation of Flows and Temperatures in Process Networks'', AIChE Journal 23: 642–650, 1977.]</ref><ref>P. Joris, B. Kalitventzeff, ''Process measurements analysis and validation'', Proc. CEF’87: Use Comput. Chem. Eng., Italy, 41–46, 1987.</ref> Quasi-steady-state dynamics for filtering and simultaneous parameter estimation over time were introduced in 1977 by Stanley and Mah.<ref name="Stanley-Mah-1977"/> Dynamic PDR was formulated as a nonlinear optimization problem by Liebman et al. in 1992.<ref>M.J. Liebman, T.F. Edgar, L.S. Lasdon, ''Efficient Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Programming Techniques'', Computers Chem. Eng. 16: 963–986, 1992.</ref>
 
==Data reconciliation==
Line 68 ⟶ 69:
[https://gregstanleyandassociates.com/CES-1981a-ObservabilityRedundancy.pdf Stanley G.M. and Mah, R.S.H., "Observability and Redundancy in Process Data Estimation", Chem. Engng. Sci. 36, 259 (1981)]</ref> for these cases with set constraints such as algebraic equations and inequalities. Next, we illustrate some special cases:
 
Topological redundancy is intimately linked with the [[degrees of freedom (physics and chemistry)|degrees of freedom]] (<math>dof\,\!</math>) of a mathematical system,<ref name="vdi">VDI-Gesellschaft Energie und Umwelt, "Guidelines - VDI 2048 Blatt 1 - “Control and quality improvement of process data and their uncertainties by means of correction calculation for operation and acceptance tests”; VDI 2048 Part 1; September 2017", ''[http://www.vdi.de/401.0.html Association of German Engineers] {{Webarchive|url=https://web.archive.org/web/20100325223512/http://www.vdi.de/401.0.html |date=2010-03-25 }}'', 2017.</ref> i.e. the minimum number of pieces of information (i.e. measurements) that are required in order to calculate all of the system variables. For instance, in the example above the flow conservation requires that <math>a=b+c\,</math>. One needs to know the values of two of the three variables in order to calculate the third one. The number of degrees of freedom for the model in that case is therefore 2. At least two measurements are needed to estimate all the variables, and three would be needed for redundancy.
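
The degrees-of-freedom count in this example can be checked numerically. The following is a minimal sketch, not part of the cited guideline: it assumes purely linear constraints written as rows of a coefficient matrix, takes the degrees of freedom as the number of variables minus the rank of that matrix, and treats redundancy as the number of measurements minus the degrees of freedom. The helper names <code>rank</code> and <code>redundancy</code> are illustrative only.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a nonzero pivot in this column, at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate the column entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * p for x, p in zip(m[i], m[r])]
        r += 1
    return r

# Flow conservation a = b + c, written as a - b - c = 0; columns are (a, b, c).
constraints = [[1, -1, -1]]
n_variables = 3

# Assumption: dof = number of variables - number of independent equations.
dof = n_variables - rank(constraints)

def redundancy(n_measurements, dof):
    """Assumed definition: measurements beyond the degrees of freedom."""
    return n_measurements - dof

print(dof)                 # 2
print(redundancy(2, dof))  # 0: just enough to calculate every variable
print(redundancy(3, dof))  # 1: one redundant measurement
```

Exact rational arithmetic is used so that the rank does not depend on a floating-point tolerance. For nonlinear constraints the same count would apply to the Jacobian of <math>F\,\!</math> at the operating point.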
 
When speaking about topological redundancy we have to distinguish between measured and unmeasured variables. In the following let us denote by <math>x\,\!</math> the unmeasured variables and <math>y\,\!</math> the measured variables. Then the system of the process constraints becomes <math>F(x,y)=0\,\!</math>, which is a nonlinear system in <math>y\,\!</math> and <math>x\,\!</math>.
Line 89 ⟶ 90:
We incorporate only flow conservation constraints and obtain <math>a+b=c\,\!</math> and <math>c=d\,\!</math>. It is possible that the system <math>F(x,y)=0\,\!</math> is not calculable, even though <math>p-m\ge 0\,\!</math>.
 
If we have measurements for <math>c\,\!</math> and <math>d\,\!</math>, but not for <math>a\,\!</math> and <math>b\,\!</math>, then the system cannot be calculated (knowing <math>c\,\!</math> does not give information about <math>a\,\!</math> and <math>b\,\!</math>). On the other hand, if <math>a\,\!</math> and <math>d\,\!</math> are known, but not <math>b\,\!</math> and <math>c\,\!</math>, then the system can be calculated.
 
In 1981, observability and redundancy criteria were proven for these sorts of flow networks involving only mass and energy balance constraints.<ref name="Stanley-Mah-1981b">[https://gregstanleyandassociates.com/CES-1981b-ObservabilityRedundancyProcessNetworks.pdf Stanley G.M., and Mah R.S.H., "Observability and Redundancy Classification in Process Networks", Chem. Engng. Sci. 36, 1941 (1981) ]</ref> After combining all the plant inputs and outputs into an "environment node", loss of observability corresponds to cycles of unmeasured streams. That is seen in the second case above, where streams a and b are in a cycle of unmeasured streams. Redundancy classification follows, by testing for a path of unmeasured streams, since that would lead to an unmeasured cycle if the measurement was removed. Measurements c and d are redundant in the second case above, even though part of the system is unobservable.
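
The graph criterion can be sketched for the small network above. This is only an illustration of the cycle and path tests as described here, for pure mass-flow networks. The topology is an assumption inferred from the constraints <math>a+b=c\,\!</math> and <math>c=d\,\!</math>: streams a and b run from the environment node to a first unit, c runs from the first unit to a second unit, and d returns to the environment. The node names and the <code>classify</code> helper are hypothetical.

```python
def connected(edges, start, goal):
    """True if goal can be reached from start over undirected edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for u, v in edges:
            for here, there in ((u, v), (v, u)):
                if here == node and there not in seen:
                    seen.add(there)
                    stack.append(there)
    return False

def classify(streams, measured):
    """Label each stream using cycles and paths of unmeasured streams."""
    unmeasured = {s: ends for s, ends in streams.items() if s not in measured}
    result = {}
    for name, (u, v) in streams.items():
        # All unmeasured streams other than the one being classified.
        others = [ends for s, ends in unmeasured.items() if s != name]
        linked = connected(others, u, v)
        if name in measured:
            # A path of unmeasured streams between the endpoints would
            # become an unmeasured cycle if this measurement were removed.
            result[name] = "nonredundant" if linked else "redundant"
        else:
            # An unmeasured stream lying on a cycle of unmeasured streams.
            result[name] = "unobservable" if linked else "observable"
    return result

# Assumed topology; ENV is the combined environment node.
streams = {
    "a": ("ENV", "N1"),
    "b": ("ENV", "N1"),
    "c": ("N1", "N2"),
    "d": ("N2", "ENV"),
}

print(classify(streams, measured={"c", "d"}))
print(classify(streams, measured={"a", "d"}))
```

With c and d measured, a and b come out unobservable while c and d are redundant. With a and d measured, b and c are observable and neither measurement is redundant.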
Line 114 ⟶ 115:
The individual test compares each penalty term in the objective function with the critical values of the normal distribution. If the <math>i</math>-th penalty term is outside the 95% confidence interval of the normal distribution, then there is reason to believe that this measurement has a gross error.
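
A minimal sketch of such an individual test follows. It assumes that the <math>i</math>-th penalty term is the squared standardized correction <math>\left((y_i^*-y_i)/\sigma_i\right)^2</math> and compares its square root with the two-sided 95% critical value of the standard normal distribution. The function name and the numbers are illustrative. The reconciled values are the weighted least-squares solution for the single constraint <math>a=b+c\,</math> with the stated standard deviations.

```python
from statistics import NormalDist

def individual_test(measured, reconciled, sigma, confidence=0.95):
    """Flag measurements whose standardized correction is outside the interval."""
    # Two-sided critical value of the standard normal, about 1.96 for 95%.
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    flags = []
    for y, y_star, s in zip(measured, reconciled, sigma):
        z = (y_star - y) / s  # square root of the assumed penalty term
        flags.append(abs(z) > z_crit)
    return flags

# Illustrative values for streams (a, b, c) with constraint a = b + c.
measured = [100.0, 61.0, 48.0]
sigma = [1.0, 1.0, 2.0]
reconciled = [101.5, 59.5, 42.0]  # satisfies 101.5 = 59.5 + 42.0

print(individual_test(measured, reconciled, sigma))  # [False, False, True]
```

Here the standardized corrections are 1.5, −1.5 and −3.0, so only the third measurement is flagged. Dividing by the measurement standard deviation is a simplification made for this sketch.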
 
==Advanced process data validation and reconciliation==
Advanced process data reconciliation (PDR) is an integrated approach of combining data reconciliation and data validation techniques, which is characterized by
* complex models that incorporate, besides mass balances, thermodynamics, momentum balances, equilibrium constraints, hydrodynamics, etc.
Line 163 ⟶ 164:
* Rankin, J. & Wasik, L. "Dynamic Data Reconciliation of Batch Pulping Processes (for On-Line Prediction)" PAPTAC Spring Conference 2009.
* S. Narasimhan, C. Jordache, ''Data reconciliation and gross error detection: an intelligent use of process data'', Gulf Publishing Company, Houston, 2000.
* V. Veverka, F. Madron, ''Material and Energy Balancing in the Process Industries'', Elsevier Science BV, Amsterdam, 1997.
* J. Romagnoli, M.C. Sanchez, ''Data processing and reconciliation for chemical process operations'', Academic Press, 2000.