Data validation and reconciliation: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 22:54, 25 September 2012 edit Gmstanley (talk \| contribs) 70 edits m →History: Added reference to first use of dynamics and filtering over time with Data Reconciliation ~~~~ ← Previous edit		Latest revision as of 02:24, 18 August 2025 edit undo InternetArchiveBot (talk \| contribs) Bots, Pending changes reviewers 5,694,577 edits Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5
(47 intermediate revisions by 33 users not shown)
Line 1: {{Short description\|A technology to correct measurements in industrial processes}} '''Industrial process data validation and reconciliation''', or more briefly, '''process data ~~validation and~~ reconciliation (~~DVR~~PDR)''', is a technology that uses process information and mathematical methods in order to automatically ~~correct~~ensure [[data validation]] and reconciliation by correcting measurements in industrial processes. The use of ~~DVR~~PDR allows for extracting accurate and reliable information about the state of industry processes from raw measurement [[data]] and produces a single consistent set of data representing the most likely process operation. ==Models, data and measurement errors== Line 5 ⟶ 6: # Models that express the general structure of the processes, # Data that reflects the state of the processes at a given point in time. Models can have different levels of detail, for example one can incorporate simple mass or compound conservation balances, or more advanced thermodynamic models including energy conservation laws. Mathematically the model can be expressed by a [[nonlinear system\|nonlinear system of equations]] <math ~~style="vertical-align:-30%;"~~>F(y)=0\,</math> in the variables <math ~~style="vertical-align:-20%;"~~>y=(y_1,\ldots,y_n)</math>, which incorporates all the above-mentioned system constraints (for example the mass or heat balances around a unit). A variable could be the temperature or the pressure at a certain place in the plant. ===Error types=== <gallery caption="Random and systematic errors" widths="~~300%~~225px" perrow="2" ~~align~~style="float:right"> File:Normal_no_bias.jpg\|Normally distributed measurements without bias. File:Normal_with_bias.jpg\|Normally distributed measurements with bias. </gallery> Data originates typically from [[measurements]] taken at different places throughout the industrial site, for example temperature, pressure, volumetric flow rate measurements etc. To understand the basic principles of ~~DVR~~PDR, it is important to first recognize that plant measurements are never 100% correct, i.e. raw measurement <math ~~style="vertical-align:-20%;"~~>y\,</math> is not a solution of the nonlinear system <math ~~style="vertical-align:-30%;"~~>F(y)=0\,\!</math>. When using measurements without correction to generate plant balances, it is common to have incoherencies. [[Observational error\|Measurement errors]] can be categorized into two basic types: # [[random error]]s due to intrinsic [[sensor]] [[accuracy]] and # [[systematic errors]] (or gross errors) due to sensor [[calibration]] or faulty data transmission. [[Random error]]s means that the measurement <math ~~style="vertical-align:-30%;"~~>y\,\!</math> is a ~~[[normal distribution\|normally distributed]]~~ [[random variable]] with [[mean]] <math ~~style="vertical-align:-30%;"~~>y^\,\!</math>, where <math ~~style="vertical-align:-30%;"~~>y^\,\!</math> is the true value that is typically not known. A [[systematic error]] on the other hand is characterized by a measurement <math ~~style="vertical-align:-30%;"~~>y\,\!</math> which is a ~~normally distributed~~ random variable with [[mean]] <math>\bar{y}\,\!</math>, which is not equal to the true value <math ~~style="vertical-align:-30%;"~~>y^\,</math>. For ease in deriving and implementing an optimal estimation solution, and based on arguments that errors are the sum of many factors (so that the [[Central limit theorem]] has some effect), data reconciliation assumes these errors are [[normal distribution\|normally distributed]]. Other sources of errors when calculating plant balances include process faults such as leaks, unmodeled heat losses, incorrect physical properties or other physical parameters used in equations, and incorrect structure such as unmodeled bypass lines. Other errors include unmodeled plant dynamics such as holdup changes, and other instabilities in plant operations that violate steady state (algebraic) models. Additional dynamic errors arise when measurements and samples are not taken at the same time, especially lab analyses. Other sources of error when calculating plant balances, are small instabilities in plant operations. Not all measurements and samples are taken at the same time, causing discrepancies between measurements. Using time averages for plant data partly reduces this problem but lab analyses cannot be averaged. The normal practice of using time averages for the data input partly reduces the dynamic problems. However, that does not completely resolve timing inconsistencies for infrequently-sampled data like lab analyses. This use of average values, like a [[moving average]], acts as a [[low-pass filter]], so high frequency noise is mostly eliminated. The result is that, in practice, data reconciliation is mainly making adjustments to correct systematic errors like biases. ===Necessity of removing measurement errors=== Line 27 ⟶ 32: ==History== ~~DVR~~PDR has become more and more important due to industrial processes that are becoming more and more complex. ~~DVR~~PDR started in the early 1960s with applications aiming at closing [[mass balance\|material balances]] in production processes where raw measurements were available for all [[variable (mathematics)\|variables]].<ref>D.R. Kuehn, H. Davidson, ''Computer Control II. Mathematics of Control'', Chem. Eng. Process 57: 44–47, 1961.</ref> At the same time the problem of [[systematic error\|gross error]] identification and elimination has been presented.<ref>V. Vaclavek, ''Studies on System Engineering I. On the Application of the Calculus of the Observations of Calculations of Chemical Engineering Balances'', Coll. Czech Chem. Commun. 34: 3653, 1968.</ref> In the late 1960s and 1970s unmeasured variables were taken into account in the data reconciliation process.,<ref>V. Vaclavek, M. Loucka, ''Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plant'', Chem. Eng. Sci. 31: 1199–1205, 1976.</ref>,<ref name="Mah-Stanley-Downing-1976">[~~http://gregstanleyandassociates~~[Richard S.~~com/ReconciliationRectificationProcessData-1976~~ H.~~pdf~~ Mah\|R.S.H. Mah]], G.M. Stanley, D.W. Downing, [http://gregstanleyandassociates.com/ReconciliationRectificationProcessData-1976.pdf ''Reconciliation and Rectification of Process Flow and Inventory Data'', Ind. & Eng. Chem. Proc. Des. Dev. 15: 175–183, 1976.]</ref> ~~During~~PDR ~~the 1980s the area of DVR~~also became more mature by considering general nonlinear equation systems coming from thermodynamic models.,<ref>J.C. Knepper, J.W. Gorman, ''Statistical Analysis of Constrained Data Sets'', AiChE Journal 26: 260–164, 1961.</ref> ,<ref name="Stanley-Mah-1977">G.M. Stanley and R.S.H. Mah, [http://gregstanleyandassociates.com/AIChEJ-1977-EstimationInProcessNetworks.pdf ''Estimation of Flows and Temperatures in Process Networks'', AIChE Journal 23: 642–650, 1977.]</ref><ref>P. Joris, B. Kalitventzeff, ''Process measurements analysis and validation'', Proc. CEF’87: Use Comput. Chem. Eng., Italy, 41–46, 1987.</ref> Quasi steady state dynamics for filtering and simultaneous parameter estimation over time were introduced in 1977 by Stanley and Mah, .<ref~~>[http://gregstanleyandassociates.com/AIChEJ-1977-EstimationInProcessNetworks.pdf G.M.~~ name="Stanley ~~and R.S.H.~~ -Mah~~, ''Estimation of Flows and Temperatures in Process Networks'', AIChE Journal 23: 642–650,~~ -1977~~.]<~~"/~~ref~~>. Dynamic ~~DVR~~PDR was formulated as a nonlinear optimization problem by Liebman et al. in 1992.<ref>M.J. Liebman, T.F. Edgar, L.S. Lasdon, ''Efficient Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Programming Techniques'', Computers Chem. Eng. 16: 963–986, 1992.</ref> ==Data reconciliation== Line 43 ⟶ 49: where <math ~~style="vertical-align:-30%;"~~>y_i^\,\!</math> is the reconciled value of the <math ~~style="vertical-align:-0%;"~~>i</math>-th measurement (<math ~~style="vertical-align:-25%;"~~>i=1,\ldots,n\,\!</math>), <math ~~style="vertical-align:-20%;"~~>y_i\,\!</math> is the measured value of the <math ~~style="vertical-align:-0%;"~~>i</math>-th measurement (<math ~~style="vertical-align:-25%;"~~>i=1,\ldots,n\,\!</math>), <math ~~style="vertical-align:-40%;"~~>x_j\,\!</math> is the <math ~~style="vertical-align:-0%;"~~>j</math>-th unmeasured variable (<math ~~style="vertical-align:-25%;"~~>j=1,\ldots,m\,\!</math>), and <math>\sigma_i\,\!</math> is the standard deviation of the <math ~~style="vertical-align:-0%;"~~>i</math>-th measurement (<math ~~style="vertical-align:-20%;"~~>i=1,\ldots,n\,\!</math>), <math ~~style="vertical-align:-20%;"~~>F(x,y^)=0\,\!</math> are the <math ~~style="vertical-align:-20%;"~~>p\,\!</math> process equality constraints and <math ~~style="vertical-align:-20%;"~~>x_{\min}, x_{\max}, y_{\min}, y_{\max}\,\!</math> are the bounds on the measured and unmeasured variables. The term <math>\left(\frac{y_i^-y_i}{\sigma_i}\right)^2\,\!</math> is called the ''penalty'' of measurement ''i''. The objective function is the sum of the penalties, which will be denoted in the following by <math>f(y^)=\sum_{i=1}^n\left(\frac{y_i^-y_i}{\sigma_i}\right)^2</math>. In other words, one wants to minimize the overall correction (measured in the least squares term) that is needed in order to satisfy the [[constraint (mathematics)\|system constraints]]. Additionally, each least squares term is weighted by the [[standard deviation]] of the corresponding measurement. The standard deviation is related to the accuracy of the measurement. For example, at a 95% confidence level, the standard deviation is about half the accuracy. ===Redundancy=== <gallery caption="Sensor and topological redundancy" ~~heights~~widths="~~150px~~225px" ~~widths~~heights="~~225px~~150px" perrow="2" ~~align~~style="float:right"> File:sensor_red.jpg\|Sensor redundancy arising from multiple sensors of the same quantity at the same time at the same place. File:topological_red.jpg\|Topological redundancy arising from model information, using the mass conservation constraint <math ~~style="vertical-align:-10%;"~~>a=b+c\,\!</math>, for example one can calculate <math ~~style="vertical-align:-0%;"~~>c\,\!</math>, when <math ~~style="vertical-align:-0%;"~~>a\,\!</math> and <math ~~style="vertical-align:-0%;"~~>b\,\!</math> are known. </gallery> Data reconciliation isrelies strongly ~~relying~~ on the concept of [[redundancy ~~(information theory)\|redundancy]]. Redundancy is a source of information that is used~~ to correct the measurements as little as possible in order to satisfy the process constraints. ~~Redundancy~~ ~~can~~Here, beredundancy ~~due~~is todefined differently from [[~~redundancy~~Redundancy (~~engineering~~information theory)\|~~sensor~~ redundancy~~]],~~ ~~where~~in ~~sensors~~information ~~are~~theory]]. ~~duplicated~~ inInstead, ~~order~~redundancy toarises ~~have~~from ~~more~~combining ~~than~~sensor ~~one~~data ~~measurement of~~with the ~~same~~model ~~quantity.~~(algebraic ~~Redundancy~~constraints), ~~can~~sometimes ~~also~~more ~~arise~~specifically ~~from~~called ~~[[topological~~"spatial redundancy]]",<ref ~~where~~name="Stanley-Mah-1977"/> a"analytical ~~single variable can be estimated in several independent ways~~redundancy", ~~from~~or ~~separate~~"topological ~~sets of measurements~~redundancy"./ Topological redundancy is intimately linked with the [[degrees of freedom (physics and chemistry)\|degree of freedom]] (<math style="vertical-align:-25%;">dof\,\!</math>) of a mathematical system,<ref name="vdi">VDI-Gesellschaft Energie und Umwelt, "Guidelines - VDI 2048 Blatt 1 - Uncertainties of measurements at acceptance tests for energy conversion and power plants - Fundamentals", ''[http://www.vdi.de/401.0.html Association of German Engineers]'', 2000.</ref> i.e. the minimum number of pieces of information (i.e. measurements) that are required in order to calculate all of the system variables. For instance, in the example above the flow conservation requires that <math style="vertical-align:-10%;">a=b+c\,</math>, and it is clear that one needs to know the value of two of the 3 variables in order to calculate the third one. Therefore the degree of freedom in that case is equal to 2.▼ Redundancy can be due to [[redundancy (engineering)\|sensor redundancy]], where sensors are duplicated in order to have more than one measurement of the same quantity. Redundancy also arises when a single variable can be estimated in several independent ways from separate sets of measurements at a given time or time averaging period, using the algebraic constraints. When speaking about topological redundancy we have to distinguish between measured and unmeasured variables. In the following let us denote by <math style="vertical-align:-0%;">x\,\!</math> the unmeasured variables and <math style="vertical-align:-30%;">y\,\!</math> the measured variables. Then the system of the process constraints becomes <math style="vertical-align:-25%;">F(x,y)=0\,\!</math>, which is a nonlinear system in <math style="vertical-align:-30%;">y\,\!</math> and <math style="vertical-align:-0%;">x\,\!</math>.▼ If the system <math style="vertical-align:-25%;">F(x,y)=0\,\!</math> is calculable with the <math style="vertical-align:-0%;">n\,</math> measurements given, then the level of topological redundancy is defined as <math style="vertical-align:-25%;">red= n - dof\,\!</math>, i.e. the number of additional measurements that are at hand on top of those measurements which are required in order to just calculate the system. Another way of viewing the level of redundancy is to use the definition of <math style="vertical-align:-20%;">dof\,</math>, which is the difference between the number of variables (measured and unmeasured) and the number of equations. Then one gets▼ Redundancy is linked to the concept of [[observability]]. A variable (or system) is observable if the models and sensor measurements can be used to uniquely determine its value (system state). A sensor is redundant if its removal causes no loss of observability. Rigorous definitions of observability, calculability, and redundancy, along with criteria for determining it, were established by Stanley and Mah,<ref name="Stanley-Mah-1981a"> [https://gregstanleyandassociates.com/CES-1981a-ObservabilityRedundancy.pdf Stanley G.M. and Mah, R.S.H., "Observability and Redundancy in Process Data Estimation, Chem. Engng. Sci. 36, 259 (1981)]</ref> for these cases with set constraints such as algebraic equations and inequalities. Next, we illustrate some special cases: ▲Topological redundancy is intimately linked with the [[degrees of freedom (physics and chemistry)\|~~degree~~degrees of freedom]] (<math ~~style="vertical-align:-25%;"~~>dof\,\!</math>) of a mathematical system,<ref name="vdi">VDI-Gesellschaft Energie und Umwelt, "Guidelines - VDI 2048 Blatt 1 - ~~Uncertainties~~“Control and quality improvement of ~~measurements~~process atdata ~~acceptance~~and ~~tests~~their ~~for~~uncertainties ~~energy~~by ~~conversion~~means of correction calculation for operation and ~~power~~acceptance ~~plants~~tests”; -VDI 2048 Part 1; September ~~Fundamentals~~2017", ''[http://www.vdi.de/401.0.html Association of German Engineers] {{Webarchive\|url=https://web.archive.org/web/20100325223512/http://www.vdi.de/401.0.html \|date=2010-03-25 }}'', ~~2000~~2017.</ref> i.e. the minimum number of pieces of information (i.e. measurements) that are required in order to calculate all of the system variables. For instance, in the example above the flow conservation requires that <math ~~style="vertical-align:-10%;"~~>a=b+c\,</math>,. ~~and~~ ~~it is clear that one~~One needs to know the value of two of the 3 variables in order to calculate the third one. ~~Therefore~~The ~~the degree~~degrees of freedom for the model in that case is equal to 2. At least 2 measurements are needed to estimate all the variables, and 3 would be needed for redundancy. ▲When speaking about topological redundancy we have to distinguish between measured and unmeasured variables. In the following let us denote by <math ~~style="vertical-align:-0%;"~~>x\,\!</math> the unmeasured variables and <math ~~style="vertical-align:-30%;"~~>y\,\!</math> the measured variables. Then the system of the process constraints becomes <math ~~style="vertical-align:-25%;"~~>F(x,y)=0\,\!</math>, which is a nonlinear system in <math ~~style="vertical-align:-30%;"~~>y\,\!</math> and <math ~~style="vertical-align:-0%;"~~>x\,\!</math>. ▲If the system <math ~~style="vertical-align:-25%;"~~>F(x,y)=0\,\!</math> is calculable with the <math ~~style="vertical-align:-0%;"~~>n\,</math> measurements given, then the level of topological redundancy is defined as <math ~~style="vertical-align:-25%;"~~>red= n - dof\,\!</math>, i.e. the number of additional measurements that are at hand on top of those measurements which are required in order to just calculate the system. Another way of viewing the level of redundancy is to use the definition of <math ~~style="vertical-align:-20%;"~~>dof\,</math>, which is the difference between the number of variables (measured and unmeasured) and the number of equations. Then one gets :<math>\begin{align} Line 66 ⟶ 78: \end{align}</math> i.e. the redundancy is the difference between the number of equations <math ~~style="vertical-align:-30%;"~~>p\,</math> and the number of unmeasured variables <math>m\,</math>. The level of total redundancy is the sum of sensor redundancy and topological redundancy. We speak of positive redundancy if the system is calculable and the total redundancy is positive. One can see that the level of topological redundancy merely depends on the number of equations (the more equations the higher the redundancy) and the number of unmeasured variables (the more unmeasured variables, the lower the redundancy) and not on the number of measured variables. However, it is possible that the system <math style="vertical-align:-30%;">F(x,y)=0\,\!</math> is not calculable, even though <math style="vertical-align:-30%;">p-m\ge 0\,\!</math>, as illustrated in the following example. Simple counts of variables, equations, and measurements are inadequate for many systems, breaking down for several reasons: (a) Portions of a system might have redundancy, while others do not, and some portions might not even be possible to calculate, and (b) Nonlinearities can lead to different conclusions at different operating points. As an example, consider the following system with 4 streams and 2 units. ====Example of calculable and non-calculable systems==== <gallery caption="Calculable and non-calculable systems" ~~heights~~widths="~~150px~~225px" ~~widths~~heights="~~225px~~150px" perrow="2" ~~align~~style="float:right"> File:calculable_system.jpg\|Calculable system, from <math ~~style="vertical-align:-0%;"~~>d\,\!</math> one can compute <math ~~style="vertical-align:-0%;"~~>c\,\!</math>, and knowing <math ~~style="vertical-align:-0%;"~~>a\,\!</math> yields <math ~~style="vertical-align:-0%;"~~>b\,\!</math>. File:uncalculable_system.jpg\|non-calculable system, knowing <math ~~style="vertical-align:-0%;"~~>c\,\!</math> does not give information about <math ~~style="vertical-align:-0%;"~~>a\,\!</math> and <math ~~style="vertical-align:-0%;"~~>b\,\!</math>. </gallery> Let us consider a small system with 4 streams and 2 units. We incorporate only flow conservation constraints and obtain <math style="vertical-align:-10%;">a+b=c\,\!</math> and <math style="vertical-align:-0%;">c=d\,\!</math>. If we have measurements for <math style="vertical-align:-0%;">c\,\!</math> and <math style="vertical-align:-0%;">d\,\!</math>, but not for <math style="vertical-align:-0%;">a\,\!</math> and <math style="vertical-align:-0%;">b\,\!</math>, then the system cannot be calculated (knowing <math style="vertical-align:-0%;">c\,\!</math> does not give information about <math style="vertical-align:-0%;">a\,\!</math> and <math style="vertical-align:-0%;">b\,\!</math>). On the other hand, if <math style="vertical-align:-0%;">a\,\!</math> and <math style="vertical-align:-0%;">c\,\!</math> are known, but not <math style="vertical-align:-0%;">b\,\!</math> and <math style="vertical-align:-0%;">d\,\!</math>, then the system can be calculated.▼ We incorporate only flow conservation constraints and obtain <math>a+b=c\,\!</math> and <math>c=d\,\!</math>. It is possible that the system <math>F(x,y)=0\,\!</math> is not calculable, even though <math>p-m\ge 0\,\!</math>. ▲Let us consider a small system with 4 streams and 2 units. We incorporate only flow conservation constraints and obtain <math style="vertical-align:-10%;">a+b=c\,\!</math> and <math style="vertical-align:-0%;">c=d\,\!</math>. If we have measurements for <math ~~style="vertical-align:-0%;"~~>c\,\!</math> and <math ~~style="vertical-align:-0%;"~~>d\,\!</math>, but not for <math ~~style="vertical-align:-0%;"~~>a\,\!</math> and <math ~~style="vertical-align:-0%;"~~>b\,\!</math>, then the system cannot be calculated (knowing <math ~~style="vertical-align:-0%;"~~>c\,\!</math> does not give information about <math ~~style="vertical-align:-0%;"~~>a\,\!</math> and <math ~~style="vertical-align:-0%;"~~>b\,\!</math>). On the other hand, if <math ~~style="vertical-align:-0%;"~~>a\,\!</math> and <math ~~style="vertical-align:-0%;"~~>cd\,\!</math> are known, but not <math ~~style="vertical-align:-0%;"~~>b\,\!</math> and <math ~~style="vertical-align:-0%;"~~>dc\,\!</math>, then the system can be calculated. In 1981, observability and redundancy criteria were proven for these sorts of flow networks involving only mass and energy balance constraints.<ref name="Stanley-Mah-1981b">[https://gregstanleyandassociates.com/CES-1981b-ObservabilityRedundancyProcessNetworks.pdf Stanley G.M., and Mah R.S.H., "Observability and Redundancy Classification in Process Networks", Chem. Engng. Sci. 36, 1941 (1981) ]</ref> After combining all the plant inputs and outputs into an "environment node", loss of observability corresponds to cycles of unmeasured streams. That is seen in the second case above, where streams a and b are in a cycle of unmeasured streams. Redundancy classification follows, by testing for a path of unmeasured streams, since that would lead to an unmeasured cycle if the measurement was removed. Measurements c and d are redundant in the second case above, even though part of the system is unobservable. ===Benefits=== Line 92 ⟶ 111: * the individual test. If no gross errors exist in the set of measured values, then each penalty term in the objective function is a [[normal distribution\|random variable]] that is normally distributed with mean equal to 0 and variance equal to 1. By consequence, the objective function is a random variable which follows a [[chi-square distribution]], since it is the sum of the square of normally distributed random variables. Comparing the value of the objective function <math ~~style="vertical-align:-30%;"~~>f(y^)\,\!</math> with a given [[percentile]] <math ~~style="vertical-align:-20%;"~~>P_{\alpha}\,</math> of the probability density function of a chi-square distribution (e.g. the 95th percentile for a 95% confidence) gives an indication of whether a gross error exists: If <math ~~style="vertical-align:-20%;"~~>f(y^)\le P_{95}</math>, then no gross errors exist with 95% probability. The chi square test gives only a rough indication about the existence of gross errors, and it is easy to conduct: one only has to compare the value of the objective function with the critical value of the chi square distribution. The individual test compares each penalty term in the objective function with the critical values of the normal distribution. If the <math>i</math>-th penalty term is outside the 95% confidence interval of the normal distribution, then there is reason to believe that this measurement has a gross error. ==Advanced process data ~~validation and~~ reconciliation== Advanced process data ~~validation and~~ reconciliation (~~DVR~~PDR) is an integrated approach of combining data reconciliation and data validation techniques, which is characterized by * complex models incorporating besides mass balances also thermodynamics, momentum balances, equilibria constraints, hydrodynamics etc. * gross error remediation techniques to ensure meaningfulness of the reconciled values, Line 103 ⟶ 122: ===Thermodynamic models=== Simple models include mass balances only. When adding thermodynamic constraints such as [[~~energy~~First ~~balance~~law of thermodynamics\|~~heat~~energy balances]] to the model, its scope and the level of [[Data redundancy\|redundancy]] increases. Indeed, as we have seen above, the level of redundancy is defined as <math>p-m</math>, where <math>p</math> is the number of equations. Including energy balances means adding equations to the system, which results in a higher level of redundancy (provided that enough measurements are available, or equivalently, not too many variables are unmeasured). ===Gross error remediation=== [[image:scheme reconciliation.jpg\|thumb\|350px\|The workflow of an advanced data validation and reconciliation process.]] Gross errors are measurement systematic errors that may [[bias]] the reconciliation results. Therefore, it is important to identify and eliminate these gross errors from the reconciliation process. After the reconciliation [[statistical tests]] can be applied that indicate whether or not a gross error does exist somewhere in the set of measurements. These techniques of gross error remediation are based on two concepts: * gross error elimination * gross error relaxation. Line 114 ⟶ 133: Gross error relaxation targets at relaxing the estimate for the uncertainty of suspicious measurements so that the reconciled value is in the 95% confidence interval. Relaxation typically finds application when it is not possible to determine which measurement around one unit is responsible for the gross error (equivalence of gross errors). Then measurement uncertainties of the measurements involved are increased. It is important to note that the remediation of gross errors reduces the quality of the reconciliation, either the redundancy decreases (elimination) or the uncertainty of the measured data increases (relaxation). Therefore, it can only be applied when the initial level of redundancy is high enough to ensure that the data reconciliation can still be done (see Section 2,<ref name="vdi" />). ===Workflow=== Advanced ~~DVR~~PDR solutions offer an integration of the techniques mentioned above: # data acquisition from data historian, data base or manual inputs # data validation and filtering of raw measurements Line 125 ⟶ 144: #* gross error remediation (and go back to step 3) # result storage (raw measurements together with reconciled values) The result of an advanced ~~DVR~~PDR procedure is a coherent set of validated and reconciled process data. ==Applications== ~~DVR~~PDR finds application mainly in industry sectors where either measurements are not accurate or even non-existing, like for example in the [[upstream (fossil-fuel industry)\|upstream sector]] where [[flow measurement\|flow meters]] are difficult or expensive to position (see <ref>P. Delava, E. Maréchal, B. Vrielynck, B. Kalitventzeff (1999), ''Modelling of a Crude Oil Distillation Unit in Term of Data Reconciliation with ASTM or TBP Curves as Direct Input – Application : Crude Oil Preheating Train'', Proceedings of ESCAPE-9 conference, Budapest, May 31-June 2, 1999, supplementary volume, p. 17-20.</ref>); or where accurate data is of high importance, for example for security reasons in [[nuclear power plants]] (see <ref>M. Langenstein, J. Jansky, B. Laipple (2004), ''Finding Megawatts in nuclear power plants with process data validation'', Proceedings of ICONE12, Arlington, USA, April 25–29, 2004.</ref>). Another field of application is [[Performance test (assessment)\|performance and process monitoring]] (see <ref>Th. Amand, G. Heyen, B. Kalitventzeff, ''Plant Monitoring and Fault Detection: Synergy between Data Reconciliation and Principal Component Analysis'', Comp. and Chem, Eng. 25, p. 501-507, 2001.</ref>) in oil refining or in the chemical industry. As ~~DVR~~PDR enables to calculate estimates even for unmeasured variables in a reliable way, the German Engineering Society (VDI Gesellschaft Energie und Umwelt) has accepted the technology of ~~DVR~~PDR as a means to replace expensive sensors in the nuclear power industry (see VDI norm 2048,<ref name="vdi" />). ==See also== Line 137 ⟶ 156: * [[Industrial processes]] * [[Chemical engineering]] ~~==DVR software==~~ * Aspen Operations Reconciliation and Accounting ([http://www.aspentech.com/ Aspen Technology, Inc.]) * Bilmat ([http://www.algosys.com/ Algosys]) * CADSIM Plus([http://www.aurelsystems.com Aurel Systems]) * DATREC ([http://www.technip.com/sites/default/files/technip/field_activity/attachments/DATREC.pdf Technip]) * ErrorSolver ([http://www.mesenter.com MESenter]) * Resolver ([http://inova8.com/joomla/index.php/products/resolver-production-data-reconciliation inova8]) * ROMeo ([http://iom.invensys.com/EN/Pages/SimSci-Esscor_ROMeoOnlinePerformanceSuite.aspx Invensys]) * Sigmafine ([http://www.pimsoftinc.com/ Pimsoft]) * S-TMS ([http://www.soteica.com/en/s-tms_yield_accounting.php Soteica]) * VALI ([http://www.belsim.com Belsim]) * VisioBal ([http://www.mathsmet.com/VisioBal MathsMet]) ==References== Line 155 ⟶ 161: {{Reflist}} * Alexander, Dave, Tannar, Dave & Wasik, Larry "Mill Information System uses Dynamic Data Reconciliation for Accurate Energy Accounting" TAPPI Fall Conference 2007.[http://www.tappi.org/Downloads/Conference-Papers/2007/07EPE/07epe87.aspx]{{Dead link\|date=July 2019 \|bot=InternetArchiveBot \|fix-attempted=yes }} * Rankin, J. & Wasik, L. "Dynamic Data Reconciliation of Batch Pulping Processes (for On-Line Prediction)" PAPTAC Spring Conference 2009. * S. Narasimhan, C. Jordache, ''Data reconciliation and gross error detection: an intelligent use of process data'', Golf Publishing Company, Houston, 2000. * V. Veverka, F. Madron, ''Material and Energy Balancing in the Process Industries'', Elsevier Science BV, Amsterdam, 1997. * J. Romagnoli, M.C. Sanchez, ''Data processing and reconciliation for chemical process operations'', Academic Press, 2000. ~~==External links==~~ ~~Some research groups working on data reconciliation:~~ * [http://www.aurelsystems.com/ Chemical Process Simulation Software Tools & Services , Vancouver, Canada] * [http://www.ou.edu/class/che-design/ Process and product design – Plant operations, University of Oklahoma, USA] * [http://www.che.iitm.ac.in/~naras/research.htm Indian Institute of Technology Madras, India] * [http://www.lassc.ulg.ac.be/ Laboratory for Analysis and Synthesis of Chemical Systems, University of Liege, Belgium] * [http://leni.epfl.ch/ Industrial Energy Systems Laboratory, Lausanne, Switzerland] ~~White papers:~~ * [http://www.aurelsystems.com/papers.htm Mill Information System uses Dynamic Data Reconciliation for Accurate Energy Accounting] * [http://www.aurelsystems.com/papers.htm Dynamic Data Reconciliation of Batch Pulping Processes (for On-Line Prediction)] * [http://gregstanleyandassociates.com/whitepapers/DataRec/datarec.htm Data Reconciliation, Observability, and Redundancy papers] ~~{{DEFAULTSORT:Data Validation And Reconciliation}}~~ [[Category:Data management]]