Mean squared prediction error: Difference between revisions

top: ce
new section: Computation of MSPE over out-of-sample data
Line 12:
 
Knowledge of ''g'' is required in order to calculate the MSPE exactly; otherwise, it can be estimated.
 
==Computation of MSPE over out-of-sample data==
 
The mean squared prediction error can be computed exactly in two contexts. First, with a [[sample (statistics)|data sample]] of length ''n'', the [[data analyst]] may run the [[regression analysis|regression]] over only ''q'' of the data points (with ''q'' < ''n''), holding back the other ''n'' − ''q'' data points specifically in order to compute the estimated model's MSPE out of sample (i.e., using data that were not used in the model estimation process). Since the regression process is tailored to the ''q'' in-sample points, the in-sample MSPE will normally be smaller than the out-of-sample MSPE computed over the ''n'' − ''q'' held-back points. If the MSPE increases only slightly from in sample to out of sample, the model is viewed favorably. If two models are to be compared, the one with the lower MSPE over the ''n'' − ''q'' out-of-sample data points is viewed more favorably, regardless of the models' relative in-sample performances. The out-of-sample MSPE in this context is exact for the out-of-sample data points over which it was computed, but it is merely an estimate of the model's MSPE for the mostly unobserved population from which the data were drawn.
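The hold-out procedure described above can be illustrated with a minimal Python sketch. It assumes a simple linear regression with one explanatory variable, fitted by ordinary least squares, and it takes the first ''q'' points as the estimation sample; the function names <code>fit_line</code>, <code>mspe</code> and <code>holdout_mspe</code>, as well as the synthetic data, are hypothetical and chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mspe(xs, ys, intercept, slope):
    """Mean of squared differences between observed and predicted values."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def holdout_mspe(xs, ys, q):
    """Fit on the first q points; return (in-sample, out-of-sample) MSPE.

    Assumption: the first q points form the estimation sample and the
    remaining n - q points are held back.
    """
    if not 0 < q < len(xs):
        raise ValueError("q must satisfy 0 < q < n")
    intercept, slope = fit_line(xs[:q], ys[:q])
    in_sample = mspe(xs[:q], ys[:q], intercept, slope)
    out_of_sample = mspe(xs[q:], ys[q:], intercept, slope)
    return in_sample, out_of_sample


if __name__ == "__main__":
    # Synthetic data (illustrative only): y = 2 + 0.5 x + Gaussian noise.
    rng = random.Random(0)
    n, q = 100, 70
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]
    in_sample, out_of_sample = holdout_mspe(xs, ys, q)
    print(f"in-sample MSPE:     {in_sample:.3f}")
    print(f"out-of-sample MSPE: {out_of_sample:.3f}")
```

To compare two candidate models under the same scheme, each would be fitted on the same ''q'' points and the one with the lower out-of-sample value preferred. In a single random split the out-of-sample value is not guaranteed to exceed the in-sample value.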
 
Second, as time goes on, more data may become available to the data analyst, and the MSPE can then be computed over these new data.
 
==Estimation of MSPE==