Content deleted Content added
typo |
Ira Leviton (talk | contribs) Fixed typos found with Wikipedia:Typo_Team/moss. |
||
Line 37:
It should be pointed out here that the “pseudo-BEMD” method is not limited to one spatial dimension; rather, it can be applied to data of any number of spatial-temporal dimensions. Since the spatial structure is essentially determined by the timescales of the variability of a physical quantity at each location, and the decomposition is based entirely on the characteristics of the individual time series at each spatial location, no spatially coherent structure of this physical quantity is assumed. When a coherent spatial structure emerges, it better reflects the physical processes that drive the evolution of the physical quantity on the timescale of each component. Therefore, we expect this method to have significant applications in spatial-temporal data analysis.
To design a pseudo-BEMD algorithm, the key step is to translate the algorithm of the 1D [[Hilbert huang transform|EMD]] into a Bi-dimensional Empirical Mode Decomposition (BEMD) and then to extend it to three or more dimensions, which is done in the same way as for the BEMD by applying the procedure to successive dimensions. For a 3D data cube of i × j × k elements, the pseudo-BEMD will yield detailed 3D components of m × n × q, where m, n and q are the numbers of IMFs decomposed from each dimension having i, j, and k elements, respectively.
Mathematically, let us represent a 2D signal as an i × j matrix with a finite number of elements.
Line 68:
# RX(m,i,j) will be decomposed into CRX(m,1,i,j), CRX(m,2,i,j),…, CRX(m,n,i,j)
where C means column decomposing. Finally, the 2D decomposition will result in m × n matrices, which are the 2D EMD components of the original data X(i,j). The matrix expression for the result of the 2D decomposition is
: <math>
Line 76:
</math><ref name=":5" />
where each element in the matrix CRX is an i × j sub-matrix representing a 2D EMD decomposed component. We use the arguments (or suffixes) m and n to represent the component number of row decomposition and column decomposition, respectively, rather than the subscripts indicating the row and the column of a matrix. Notice that m and n indicate the number of components resulting from row (horizontal) decomposition and then column (vertical) decomposition, respectively.
Combining the components of the same scale, or of comparable scales with minimal difference, yields a 2D feature with the best physical significance. The components of the first row and the first column are of approximately the same or comparable scale, although their scales increase gradually along the row or column. Therefore, combining the components of the first row and the first column yields the first complete 2D component (C2D1). The same combination technique is then applied to the rest of the components, so that the contribution of the noise is distributed among the separate components according to scale. As a result, the coherent structures of the components emerge. In this way, the pseudo-BEMD method can be applied to reveal the evolution of spatial structures of data.
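The combination rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name <code>combine_2d_components</code> and the array layout (a NumPy array of shape m × n × i × j holding the sub-matrices CRX) are assumptions made here, and the rule used for the second and later components is one reading of "the same combination technique", namely that the l-th complete component collects the not yet used entries of row l and column l.

```python
import numpy as np

def combine_2d_components(crx):
    """Combine an (m, n, i, j) grid of components into complete 2D components.

    crx[a, b] is assumed to be the i x j sub-matrix obtained from row
    component a followed by column component b (hypothetical layout).
    """
    m, n = crx.shape[:2]
    combined = []
    for l in range(min(m, n)):
        # Entries of row l that have not been used by earlier components.
        row_part = crx[l, l:].sum(axis=0)
        # Entries of column l below the diagonal element (empty slice sums to 0).
        col_part = crx[l + 1:, l].sum(axis=0)
        combined.append(row_part + col_part)
    return np.stack(combined)

# Usage with random stand-in data for a 3 x 4 grid of 5 x 6 components.
rng = np.random.default_rng(0)
crx = rng.normal(size=(3, 4, 5, 6))
c2d = combine_2d_components(crx)
```

Under this reading every sub-matrix is used exactly once, so the complete components add up to the sum of all CRX entries, and their number is the smaller of m and n.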
Line 88:
given as <math>I=f(x_1,x_2,x_3,x_4,\ldots,x_n)</math>
in which the subscript ''n'' indicates the number of dimensions. The procedure is identical to that stated above: the decomposition starts with the first dimension and proceeds to the second and third until all the dimensions are exhausted. The decomposition is still implemented by slices. This new approach is based on separating the original data into one-dimensional slices, then applying ensemble EMD to each one-dimensional slice. The key part of the method is the construction of the IMFs according to the principle of combining the comparable minimal-scale components.
For example, the matrix expression for the result of a 3D decomposition is TCRX(m,n,q,i,j,k) where T denotes the depth (or time) decomposition. Based on the comparable minimal scale combination principle as applied in the 2D case, the number of complete 3D components will be the smallest value of ''m'', ''n'', and ''q''. The general equation for deriving 3D components is
Line 102:
</math>
The pseudo-BEMD method has several advantages. For instance, the sifting procedure of the pseudo-BEMD is a combination of one-dimensional sifting. It employs 1D curve fitting in the sifting process of each dimension, and so avoids the difficulty encountered in 2D EMD algorithms that use surface fitting, which must decide whether a saddle point counts as a local maximum or minimum. Sifting is the process that separates an IMF and is repeated until the residue is obtained. The first step of sifting is to determine the upper and lower envelopes encompassing all the data by using the spline method. The sifting scheme for pseudo-BEMD is like 1D sifting, except that the local mean of the standard EMD is replaced by the mean of multivariate envelope curves.
The major disadvantage of this method is that, although the algorithm can be extended to data of any dimension, in practice it is used only for two-dimensional applications, because the computation time for higher-dimensional data is proportional to the number of IMFs of the succeeding dimensions. It could therefore exceed the computational capacity of a geophysical data processing system when the number of EMDs in the algorithm is large. Faster techniques that address this disadvantage are described below.
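A single one-dimensional sifting pass, and its application along one dimension of a 2D array, can be sketched as follows. This is a simplified illustration rather than the published algorithm: the function names are hypothetical, the envelopes are built with linear interpolation (<code>np.interp</code>) instead of the spline method named above, and no boundary treatment or stopping criterion is included.

```python
import numpy as np

def local_extrema(x):
    """Indices of strict interior local maxima and minima of a 1D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    t = np.arange(len(x))
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return x.copy()  # too few extrema to build envelopes
    # Assumption: linear envelopes; standard EMD uses cubic splines here.
    upper = np.interp(t, maxima, x[maxima])
    lower = np.interp(t, minima, x[minima])
    return x - 0.5 * (upper + lower)

# A fast oscillation riding on a slow trend.
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 20) + 0.05 * t
proto_imf = sift_once(signal)

# Pseudo-BEMD style use: sift every row of a 2D array independently.
image = np.tile(signal, (3, 1))
row_sifted = np.apply_along_axis(sift_once, 1, image)
```

Because each row (or column) is handled as an ordinary time series, no surface fitting and no classification of saddle points is needed.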
Line 112:
# EEMD on the compressed data; this is the most challenging step, since decomposing the compressed data carries a high probability of losing key information. A data compression method that uses principal component analysis (PCA)/empirical orthogonal function (EOF) analysis or principal oscillation pattern analysis is used to compress data.
==== Principal component analysis (PCA) or empirical orthogonal function analysis (EOF) ====
The [[principal component analysis]]/[[Empirical orthogonal functions|empirical orthogonal function]] analysis (PCA/EOF) has been widely used in data analysis and image compression. Its main objective is to reduce a data set containing a large number of variables to a data set containing fewer variables that still represents a large fraction of the variability contained in the original data set. In climate studies, EOF analysis is often used to study possible spatial modes (i.e., patterns) of variability and how they change with time. In statistics, EOF analysis is known as [[principal component analysis]] (PCA).
Typically, the EOFs are found by computing the eigenvalues and eigenvectors of a spatially weighted anomaly covariance matrix of a field. Most commonly, the spatial weights are the cos(latitude) or, better for EOF analysis, the sqrt(cos(latitude)). The derived eigenvalues provide a measure of the percent variance explained by each mode. Unfortunately, the eigenvalues are not necessarily distinct due to sampling issues. North et al. (Mon. Wea. Rev., 1982, eqns 24-26) provide a 'rule of thumb' for determining if a particular eigenvalue (mode) is distinct from its nearest neighbor.
Line 120:
Atmospheric and oceanographic processes are typically 'red', which means that most of the variance (power) is contained within the first few modes. The time series of each mode (also known as the principal components) are determined by projecting the derived eigenvectors onto the spatially weighted anomalies. This results in the amplitude of each mode over the period of record.
By construction, the EOF patterns are mutually orthogonal and the principal components are mutually uncorrelated. Two factors inhibit physical interpretation of EOFs: (i) the orthogonality constraint and (ii) the derived patterns may be domain dependent. Physical systems are not necessarily orthogonal, and if the patterns depend on the region used they may not exist if the domain changes.
==== Spatial-temporal signal using multi-dimensional ensemble empirical mode decomposition ====
Assume we have spatio-temporal data
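The procedure described in this section can be sketched with NumPy as follows. The function name <code>eof_analysis</code>, the (time, latitude, longitude) array layout and the random test field are assumptions made for illustration; the sketch forms the covariance matrix explicitly, which is only practical for small grids.

```python
import numpy as np

def eof_analysis(field, lat_deg):
    """EOFs and PCs of a (time, nlat, nlon) field, weighted by sqrt(cos(latitude))."""
    nt, nlat, nlon = field.shape
    anomalies = field - field.mean(axis=0)              # remove the time mean
    weights = np.sqrt(np.cos(np.deg2rad(lat_deg)))[:, None]
    weighted = (anomalies * weights).reshape(nt, nlat * nlon)
    cov = weighted.T @ weighted / (nt - 1)              # spatial covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)              # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                   # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = weighted @ eigvecs                            # amplitude of each mode in time
    explained = eigvals / eigvals.sum()                 # fraction of variance per mode
    return eigvals, eigvecs, pcs, explained

# Stand-in data: 50 time steps on a 4 x 6 latitude-longitude grid.
rng = np.random.default_rng(0)
lat = np.array([-30.0, -10.0, 10.0, 30.0])
field = rng.normal(size=(50, 4, 6))
eigvals, eofs, pcs, explained = eof_analysis(field, lat)
```

The columns of <code>eofs</code> are orthonormal spatial patterns, and the columns of <code>pcs</code> are uncorrelated in time with variances equal to the eigenvalues.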
Using PCA/EOF, one can express
where
If the data subjected to PCA/EOF analysis are all white noise, all eigenvalues are theoretically equal and there is no preferred vector direction for the principal component in PCA/EOF space. To retain most of the information in the data, one needs to retain almost all the PCs and EOFs, making the size of the PCA/EOF expression even larger than that of the original. If, however, the original data contain only one spatial structure that oscillates with time, then the original data can be expressed as the product of one PC and one EOF, implying that original data of large size can be expressed by data of small size without losing information, i.e. the data are highly compressible.
The variability of a smaller region tends to be more spatio-temporally coherent than that of a larger region containing it, so fewer PC/EOF components are expected to be required to account for a threshold level of variance. One way to improve the efficiency of representing data in terms of PC/EOF components is therefore to divide the global spatial domain into a set of sub-regions. If we divide the original global spatial domain into n sub-regions containing N1, N2, . . . , Nn spatial grids, respectively, with all Ni, where i=1, . . . , n, greater than M, where M denotes the number of temporal locations, we anticipate that the numbers of retained PC/EOF pairs for all sub-regions, K1, K2, . . . , Kn, are all smaller than K. By the equation given above, the total number of data values in the PCA/EOF representation of the original data of the global spatial domain is K×(N+M). For the new approach of using spatial division, the total number of values in the PCA/EOF representation is
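The two extreme cases can be checked numerically. In the sketch below the function name <code>retained_pairs</code>, the 99% variance threshold and the array sizes are assumptions chosen for illustration; the number of retained pairs is obtained from the singular values of an M × N data matrix (M temporal locations, N spatial grids), and the storage cost of K pairs is counted as K×(N+M).

```python
import numpy as np

def retained_pairs(data, threshold=0.99):
    """Number of PC/EOF pairs needed to reach the given fraction of total variance."""
    s = np.linalg.svd(data, compute_uv=False)
    fraction = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(fraction, threshold)) + 1

M, N = 40, 60                                  # temporal locations, spatial grids
rng = np.random.default_rng(0)

# One fixed spatial structure oscillating in time: a rank-one field.
pattern = rng.normal(size=N)
oscillation = np.sin(2 * np.pi * np.arange(M) / 10)
coherent = np.outer(oscillation, pattern)

noise = rng.normal(size=(M, N))                # white noise in space and time

k_coherent = retained_pairs(coherent)
k_noise = retained_pairs(noise)

# Values stored by the PC/EOF representation versus the raw field.
storage_coherent = k_coherent * (N + M)
storage_noise = k_noise * (N + M)
storage_raw = M * N
```

For the coherent field one pair suffices, so the representation needs far fewer values than the raw field, whereas for white noise nearly all pairs are needed and the representation is no smaller than the original.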
: <math> \sum_{i=1}^nK_i(M+N_i)=K'_iN+M\sum_{i=1}^nK_i</math>
Line 144:
=== Fast multidimensional ensemble empirical mode decomposition<ref name=":7" /> ===
For a temporal signal of length ''M'', the complexity of cubic spline sifting through its local extrema is of the order of ''M'', and so is that of the EEMD, as it only repeats the spline fitting operation a number of times that does not depend on ''M''. However, because the sifting number (often selected as 10) and the ensemble number (often a few hundred) multiply the number of spline sifting operations, the EEMD is time-consuming compared with many other time series analysis methods such as Fourier transforms and wavelet transforms. The MEEMD applies EEMD decomposition to the time series at each division grid of the initial temporal signal, so the EEMD operation is repeated as many times as there are grid points in the domain. The idea of the fast MEEMD is very simple: as PCA/EOF-based compression expresses the original data in terms of pairs of PCs and EOFs, decomposing the PCs instead of the time series of each grid point, and using the spatial structure depicted by the corresponding EOFs, significantly reduces the computational burden.
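The idea can be sketched as follows. This is a schematic illustration, not the published implementation: <code>toy_decompose</code> is a hypothetical linear stand-in (a moving-average split into a fast and a slow part) used in place of EEMD so that the example stays self-contained, and the compression is done with a truncated singular value decomposition.

```python
import numpy as np

def toy_decompose(series, window=5):
    """Hypothetical stand-in for EEMD: split a series into fast and slow parts."""
    kernel = np.ones(window) / window
    slow = np.convolve(series, kernel, mode="same")
    return np.stack([series - slow, slow])     # shape (2, M); parts sum to series

def fast_decompose(data, n_pairs, decompose=toy_decompose):
    """Decompose an (M, N) field by decomposing only its leading PCs."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    pcs = u[:, :n_pairs] * s[:n_pairs]         # (M, K) temporal amplitudes
    eofs = vt[:n_pairs]                        # (K, N) spatial patterns
    # K one-dimensional decompositions instead of N (one per grid point).
    parts = np.stack([decompose(pcs[:, k]) for k in range(n_pairs)], axis=1)
    # Component c of the field: sum over k of parts[c, k] times eofs[k].
    return np.einsum("ckm,kn->cmn", parts, eofs)

# A field built from two spatial patterns, i.e. exactly two PC/EOF pairs.
M, N = 100, 30
rng = np.random.default_rng(0)
t = np.arange(M)
data = (np.outer(np.sin(2 * np.pi * t / 8), rng.normal(size=N))
        + np.outer(0.01 * t, rng.normal(size=N)))

components = fast_decompose(data, n_pairs=2)   # 2 decompositions
per_grid = np.stack([toy_decompose(data[:, n]) for n in range(N)], axis=2)  # 30
```

Here the two routes agree to rounding error, but only because the stand-in is linear and the field has exactly two pairs. EEMD is nonlinear, so with a real EEMD the fast route would be expected to agree with the per-grid route only approximately.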
The fast MEEMD includes the following steps:
Line 165:
# Dynamic data variations: In EEMD, white noises change the number of extrema causing some irregularity and load imbalance, and thus slowing down the parallel execution.
# Stride memory accesses of high-dimensional data: High dimensional data are stored in non-continuous memory locations. Accesses along high dimensions are thus strided and uncoalesced, wasting available memory bandwidth.
# Limited resources to harness parallelism: While the independent EMDs and/or EEMDs comprising an MEEMD provide high parallelism, the computational capacities of multi-core and many-core processors may not be sufficient to fully exploit the inherent parallelism of MEEMD. Moreover, increased parallelism may increase memory requirements beyond the memory capacities of these processors.
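The strided-access issue in the list above can be illustrated on the CPU with NumPy, which exposes the byte step of each array axis. This is only a sketch: the array shape, the choice of time as the first axis and the helper name <code>series_stride_bytes</code> are assumptions made here, and memory coalescing on GPUs involves further details that are not modelled.

```python
import numpy as np

def series_stride_bytes(cube, y, x):
    """Byte step between consecutive samples of the time series at grid point (y, x).

    Assumes time is stored along axis 0 of the array.
    """
    return cube[:, y, x].strides[0]

# (time, y, x) cube in row-major order: the time axis varies slowest in memory.
cube = np.zeros((100, 8, 8), dtype=np.float64)
strided_step = series_stride_bytes(cube, 1, 2)        # 8 * 8 * 8 = 512 bytes

# Copy the data so that time becomes the fastest-varying (last) axis.
time_last = np.ascontiguousarray(np.moveaxis(cube, 0, -1))   # (y, x, time)
contiguous_step = time_last[1, 2].strides[0]          # 8 bytes, one float64
```

With time as the last axis, each time series passed to a 1D decomposition occupies consecutive memory locations, at the cost of one up-front copy.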
[[File:Sample_BEMD_Simulation_results_for_a_noisy_signal_with_imf.jpg|thumb|Bi-Dimensional EMD Intrinsic mode function along with the residue eliminating the noise level.]]
In MEEMD, when a high degree of parallelism is given by the ensemble dimension and/or the non-operating dimensions, the benefits of using a thread-level parallel algorithm are threefold.<ref name=":8" />
# It can exploit more parallelism than a block-level parallel algorithm.
Line 240 ⟶ 242:
===Advantages===
This method (FABEMD) obtains the result rapidly with less computation, and it allows a more accurate estimation of the BIMFs. Moreover, the FABEMD adapts better to large inputs than the traditional BEMD. In addition, with the FABEMD there is no need to consider boundary effects or overshoot-undershoot problems.
===Limitations===
Line 248 ⟶ 250:
===Concept===
The '''Partial Differential Equation-Based Multidimensional Empirical Mode Decomposition (PDE-based MEMD)''' approach is a way to overcome the difficulties of mean-envelope estimation of a signal in the traditional EMD. The PDE-based MEMD focuses on modifying the original algorithm for MEMD. The result thus provides an analytical formulation that can facilitate theoretical analysis and performance observation. In order to perform multidimensional EMD, we need to extend the 1-D PDE-based sifting process<ref name=":2" /> to 2-D space as shown by the steps below.
Here, we take 2-D PDE-based EMD as an example.
|