
Optimal Signal Extraction with Correlated Components

  • Tucker S. McElroy and Agustin Maravall
Published/Copyright: April 15, 2014

Abstract

While it is typical in the econometric signal extraction literature to assume that the unobserved signal and noise components are uncorrelated, there is nevertheless an interest among econometricians in the hypothesis of hysteresis, i.e. that major movements in the economy are fundamentally linked. Although specific models involving correlated signal and noise innovation sequences have been developed and applied using state space methods, there is no systematic treatment of optimal signal extraction with correlated components. This paper provides the mean square error optimal formulas for both finite samples and bi-infinite samples, and furthermore relates these filters to the better-known Wiener–Kolmogorov (WK) and Beveridge–Nelson (BN) signal extraction formulas in the case of ARIMA component models. We obtain the result that the optimal filter for correlated components can be viewed as a weighted linear combination of the WK and BN filters. The gain and phase functions of the resulting filters are plotted for some standard cases. Some discussion of estimation of hysteretic models is presented, along with empirical results on an economic time series. Comparisons are made between signal extractions from traditional WK filters and those arising from the hysteretic models.

Appendix

Derivation of the BN filter

The original BN filter of Beveridge and Nelson (1981) was applied to nonseasonal time series, yielding a decomposition into permanent and transitory components. This was achieved by supposing the innovations of signal and noise to be identical (rather than orthogonal). We extend this notion of the BN filter to the general scenario outlined in Section 3 (and following) by taking $a_t = b_t = c_t$ in eq. [6]. Then the MSE optimal filters have zero error and are given by de-correlating the aggregate process (via the filter $\varphi(z)/\theta(z)$) followed by re-correlating according to the signal's pattern, namely by the filter $\theta_S(z)/\varphi_S(z)$. Multiplying these filters (and using $\varphi(z) = \varphi_S(z)\,\varphi_N(z)$) yields $\theta_S(z)\varphi_N(z)/\theta(z)$. The formula we provide in Section 3 generalizes this treatment slightly to the case of a filter derived under the assumption that the innovations are fully (positively) correlated (i.e. they may have different variances). This leads to the insertion of the factor $\sigma_b/\sigma_a$ in the formula for $\Psi^{BN}$.
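As a small numerical sketch of this recipe (the component polynomials $\theta_S$, $\varphi_N$, the aggregate MA polynomial $\theta$, and the ratio $\sigma_b/\sigma_a$ below are hypothetical illustrative choices, not values from the paper), the coefficients of $\Psi^{BN}(z) = (\sigma_b/\sigma_a)\,\theta_S(z)\varphi_N(z)/\theta(z)$ can be obtained by power-series division:

```python
import numpy as np

def series_div(num, den, n_terms):
    # power-series expansion of num(z)/den(z) up to n_terms coefficients,
    # assuming den[0] != 0 (invertibility of the denominator)
    num = np.concatenate([num, np.zeros(n_terms)])
    out = np.zeros(n_terms)
    for j in range(n_terms):
        acc = num[j] - sum(den[k] * out[j - k]
                           for k in range(1, min(j, len(den) - 1) + 1))
        out[j] = acc / den[0]
    return out

# hypothetical components: theta_S(z) = 1, phi_N(z) = 1,
# aggregate theta(z) = 1 - 0.5 z, and sigma_b / sigma_a = 0.8
theta_S = np.array([1.0])
phi_N = np.array([1.0])
theta = np.array([1.0, -0.5])
ratio = 0.8

psi = ratio * series_div(np.convolve(theta_S, phi_N), theta, 20)
```

Multiplying the computed coefficients back by $\theta(z)$ recovers $(\sigma_b/\sigma_a)\,\theta_S(z)\varphi_N(z)$ up to truncation, which is a convenient sanity check on the expansion.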

By definition, this filter produces optimal extractions under the full positive correlation hypothesis, and the signal and noise extractions aggregate back to the original data. This defines the filter; it may of course be applied to data that do not satisfy the hypotheses under which it was derived. It is in this sense that we may speak of the hysteretic filter as being a convex combination of WK and BN filters – it is an algebraic fact, involving statistical quantities derived under incompatible stochastic assumptions.

We are not aware of prior references to this generalized BN filter, but are not comfortable claiming that our derivation is novel. This sort of idea, and construction, has close precedents in Morley, Nelson, and Zivot (2003) and Proietti (2006).

We also mention an interesting interpretation of the BN filter in the case of a trend plus noise decomposition, namely that it can be interpreted as the optimal concurrent filter in an orthogonal decomposition. Consider the special case of an I(1) data process $\{Y_t\}$ that satisfies $(1-B)Y_t = \theta(B)a_t$, where the polynomial (or causal power series) $\theta(z)$ is invertible. Then the BN decomposition can be written

$$\theta(B)a_t = \theta(1)\,a_t + \left[\theta(B) - \theta(1)\right]a_t,$$

which can be compared to $(1-B)Y_t = (1-B)S_t + (1-B)N_t$. Thus, using the notation of eq. [6] we have $\theta_S(z) = \theta(1)$, a constant, while $\theta_N(z) = \left(\theta(z) - \theta(1)\right)/(1-z)$, $\varphi_S(z) = 1-z$, and $\varphi_N(z) = 1$. Note that a single innovation $\{a_t\}$ drives both component processes. The assumed invertibility of $\theta(z)$ ensures that $\theta_N(z)$ has a convergent power series expansion. The BN filter can then be expressed as $\theta(1)/\theta(z)$.
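A quick simulation illustrates this decomposition in the simplest case, an MA(1) model for $(1-B)Y_t$ (the coefficient $\theta_1 = 0.6$ and the simulated innovations are illustrative assumptions). Here $\theta_N(z) = (\theta(z) - \theta(1))/(1-z) = -\theta_1$, so the transitory component is simply $-\theta_1 a_t$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta1, n = 0.6, 500                       # hypothetical MA(1) coefficient
a = rng.standard_normal(n + 1)             # innovations a_0, ..., a_n

# differenced data: W_t = (1 - B) Y_t = a_t + theta1 * a_{t-1}, t = 1..n
W = a[1:] + theta1 * a[:-1]
Y = np.cumsum(W)                           # levels, with Y_0 = 0

# permanent component: (1 - B) T_t = theta(1) a_t, with theta(1) = 1 + theta1
T = np.cumsum((1 + theta1) * a[1:])

# transitory component: theta_N(z) = (theta(z) - theta(1)) / (1 - z) = -theta1,
# so Y_t - T_t = -theta1 * a_t up to a constant carried by the initial value a_0
resid = Y - T + theta1 * a[1:]
```

In this toy model `resid` is constant (equal to $\theta_1 a_0$), confirming that the levels split exactly into a random-walk permanent part plus the stationary transitory part $-\theta_1 a_t$.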

Suppose now that we take the same component model definitions – namely $(1-B)S_t = b_t$, a white noise of variance $\theta^2(1)\sigma_a^2$, and $N_t = \theta_N(B)c_t$, with $\{c_t\}$ another white noise – but instead of having a common innovation drive the components, we assume they are uncorrelated. The concurrent filter in this case, using the formulas of Bell and Martin (2004), is

$$\frac{1-z}{\sigma_a^2\,\theta(z)}\left[\frac{1-\bar{z}}{\theta(\bar{z})}\cdot\frac{\sigma_b^2}{(1-z)(1-\bar{z})}\right]_+ = \frac{\sigma_b^2}{\sigma_a^2\,\theta(z)\,\theta(1)},$$

which simplifies to $\theta(1)/\theta(z)$, since $\sigma_b^2 = \theta^2(1)\sigma_a^2$. Here the bracket notation with the plus subscript indicates that in the power series expansion, we only retain the terms corresponding to non-negative powers of $z$.
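The simplification rests on the truncation identity $\left[1/\left((1-z)\theta(\bar{z})\right)\right]_+ = 1/\left((1-z)\theta(1)\right)$: every coefficient of $1/\theta(\bar{z})$ contributes to each non-negative power of $z$ through the factor $1/(1-z)$. A small numerical check confirms this (the invertible MA(1) $\theta(z) = 1 + 0.6z$ is an illustrative choice):

```python
import numpy as np

theta1 = 0.6       # hypothetical invertible MA(1): theta(z) = 1 + theta1 * z
K = 200            # truncation order for the series in zbar = 1/z

# coefficients of 1/theta(zbar) = sum_k pi_k zbar^k
pi = (-theta1) ** np.arange(K)

# in the product with 1/(1 - z) = sum_{j >= 0} z^j, every pi_k contributes to
# each non-negative power of z, so each [.]_+ coefficient is sum_k pi_k
coef_plus = pi.sum()

# the claimed closed form: each coefficient of 1/((1 - z) theta(1)) is 1/theta(1)
assert abs(coef_plus - 1.0 / (1 + theta1)) < 1e-12
```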

Proofs

Proof of Theorem 1. Following the same technique as McElroy (2008), it suffices to show that the error process $\varepsilon = \hat{S} - S$ is uncorrelated with $Y$. Using $I$ to denote an identity matrix, the error process is

$$\varepsilon = FN - (I-F)S = M^{-1}\left(\Delta_N'\Gamma_V^{-1} + P\,\Gamma_W^{-1}\underline{\Delta}_S\right)V - M^{-1}\left(\Delta_S'\Gamma_U^{-1} - P\,\Gamma_W^{-1}\underline{\Delta}_N\right)U.$$

From Assumption A, it follows that $\varepsilon$ is uncorrelated with the initial values $Y_\star$. Since $Y$ can be expressed as a linear combination of $Y_\star$ and $W$, as discussed in McElroy (2008), it suffices to show that $\varepsilon$ is uncorrelated with $W$. To that end, we have

$$E[\varepsilon W'] = M^{-1}\left(\Delta_N'\Gamma_V^{-1} + P\,\Gamma_W^{-1}\underline{\Delta}_S\right)\left(\Gamma_V\underline{\Delta}_S' + \Gamma_{VU}\underline{\Delta}_N'\right) - M^{-1}\left(\Delta_S'\Gamma_U^{-1} - P\,\Gamma_W^{-1}\underline{\Delta}_N\right)\left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right)$$
$$= M^{-1}\left[\Delta' + \Delta_N'\Gamma_V^{-1}\Gamma_{VU}\underline{\Delta}_N' + P\,\Gamma_W^{-1}\left(\underline{\Delta}_S\Gamma_V\underline{\Delta}_S' + \underline{\Delta}_S\Gamma_{VU}\underline{\Delta}_N'\right)\right] - M^{-1}\left[\Delta' + \Delta_S'\Gamma_U^{-1}\Gamma_{UV}\underline{\Delta}_S' - P\,\Gamma_W^{-1}\left(\underline{\Delta}_N\Gamma_U\underline{\Delta}_N' + \underline{\Delta}_N\Gamma_{UV}\underline{\Delta}_S'\right)\right]$$
$$= M^{-1}\left(-P + P\right) = 0,$$

using eq. [4]. This establishes MSE linear optimality. For the error covariance matrix, we obtain the formula by expanding $E[\varepsilon\varepsilon']$ and simplifying the algebra.

It may be instructive to offer a constructive derivation of the formula for $F$. Recall that $U = \Delta_S S$ and $V = \Delta_N N$; the minimum MSE linear estimator of $U$ given $Y$ is the same as the optimal estimator of $U$ given $W$ (by Assumption A), and hence its formula is

$$\mathrm{Cov}(U, W)\,\mathrm{Cov}(W, W)^{-1}\,W = \left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right)\Gamma_W^{-1}W,$$

which utilizes eq. [3]. By linearity, the above estimate must be equal to $\Delta_S$ times the optimal estimate of $S$ given $Y$. Similarly, we can derive the minimum MSE linear estimator of $V$ given $Y$ via

$$\mathrm{Cov}(V, W)\,\mathrm{Cov}(W, W)^{-1}\,W = \left(\Gamma_V\underline{\Delta}_S' + \Gamma_{VU}\underline{\Delta}_N'\right)\Gamma_W^{-1}W.$$

(This tactic, whereby we directly compute the best linear estimate of $S$ given $Y$, can be used to derive $F$ as well, but requires considerably more algebra than the approach given here.) Collecting these results, and letting $F$ denote the matrix such that $FY$ is the minimum MSE linear estimator of $S$ given $Y$, we obtain

$$\Delta_S F = \left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right)\Gamma_W^{-1}\Delta$$
$$\Delta_N F = \Delta_N - \left(\Gamma_V\underline{\Delta}_S' + \Gamma_{VU}\underline{\Delta}_N'\right)\Gamma_W^{-1}\Delta.$$

The second equation utilizes $\Delta_N F = \Delta_N - \Delta_N(I - F)$ and the fact that $I - F$ is the linear optimal estimator of the noise $N$ given $Y$. Now knowing algebraically the form of both $\Delta_S F$ and $\Delta_N F$ permits us to solve for $F$, presuming that the intersection of the null spaces of $\Delta_S$ and $\Delta_N$ is the zero vector; this follows from the assumption that $\delta_S(z)$ and $\delta_N(z)$ are relatively prime. Then multiply $\Delta_S F$ on the left by $\Delta_S'\Gamma_U^{-1}$ (this is not a unique choice, e.g. we could just utilize $\Delta_S'$ instead and get an alternative expression for the unique $F$) and $\Delta_N F$ by $\Delta_N'\Gamma_V^{-1}$, and add the result. Then utilizing eq. [2], this yields

$$MF = \left(\Delta' + \Delta_S'\Gamma_U^{-1}\Gamma_{UV}\underline{\Delta}_S'\right)\Gamma_W^{-1}\Delta + \Delta_N'\Gamma_V^{-1}\Delta_N - \left(\Delta' + \Delta_N'\Gamma_V^{-1}\Gamma_{VU}\underline{\Delta}_N'\right)\Gamma_W^{-1}\Delta,$$

which simplifies to $\Delta_N'\Gamma_V^{-1}\Delta_N + P\,\Gamma_W^{-1}\Delta$. Inverting $M$ then yields the formula for $F$.
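The matrix algebra in this proof can be spot-checked numerically. The sketch below (toy dimensions, randomly generated covariances; the first-difference choices for $\Delta_S$, $\Delta_N$ and the values of $\Gamma_U$, $\Gamma_V$, $\Gamma_{UV}$ are illustrative assumptions, with $\Gamma_W$ and $P$ built exactly as in the proof) verifies that the two terms of $E[\varepsilon W']$ cancel:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12                                    # toy sample size; d_S = d_N = 1 here

def diff_matrix(m):
    # (m-1) x m first-difference matrix: (D x)_t = x_{t+1} - x_t
    return np.diff(np.eye(m), axis=0)

D_S, D_N = diff_matrix(n), diff_matrix(n)            # Delta_S, Delta_N
uD_S, uD_N = diff_matrix(n - 1), diff_matrix(n - 1)  # underlined versions
D = uD_N @ D_S                                       # full differencing matrix

# a randomly generated, jointly positive-definite covariance for (U, V)
m = n - 1
A = rng.standard_normal((2 * m, 2 * m))
Sig = A @ A.T + 0.1 * np.eye(2 * m)
G_U, G_V, G_UV = Sig[:m, :m], Sig[m:, m:], Sig[:m, m:]
G_VU = G_UV.T

# Gamma_W and P constructed as in the proof
G_W = (uD_N @ G_U @ uD_N.T + uD_S @ G_V @ uD_S.T
       + uD_N @ G_UV @ uD_S.T + uD_S @ G_VU @ uD_N.T)
P = (D_S.T @ np.linalg.solve(G_U, G_UV) @ uD_S.T
     - D_N.T @ np.linalg.solve(G_V, G_VU) @ uD_N.T)

G_W_inv = np.linalg.inv(G_W)
# the two terms of E[eps W'] from the displayed error expression
term1 = (D_N.T @ np.linalg.inv(G_V) + P @ G_W_inv @ uD_S) @ (G_V @ uD_S.T + G_VU @ uD_N.T)
term2 = (D_S.T @ np.linalg.inv(G_U) - P @ G_W_inv @ uD_N) @ (G_U @ uD_N.T + G_UV @ uD_S.T)
gap = np.max(np.abs(term1 - term2))       # zero in exact arithmetic
```

Expanding both terms gives $-P + P\,\Gamma_W^{-1}\Gamma_W = 0$ identically, so `gap` is zero up to floating-point error for any jointly positive-definite choice of the covariances.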

Proof of Theorem 2. Our strategy is to demonstrate that the filter $\Psi(z)$ produces a signal extraction error process that is orthogonal to the aggregate process, implying MSE linear optimality [cf. Bell (1984)]. It is clear from the given formula that $\delta_N(z)$ can be factored out, leaving $\Psi(z) = \Omega(z)\,\delta_N(z)$ with

$$\Omega(z) = \frac{f_U(\lambda)\,\delta_N(\bar{z}) + f_{UV}(\lambda)\,\delta_S(\bar{z})}{f_W(\lambda)}.$$

Similarly, it is easily verified that $1 - \Psi(z) = \Phi(z)\,\delta_S(z)$ with

$$\Phi(z) = \frac{f_V(\lambda)\,\delta_S(\bar{z}) + f_{VU}(\lambda)\,\delta_N(\bar{z})}{f_W(\lambda)}.$$

Then $\varepsilon_t = \Psi(B)Y_t - S_t = \Omega(B)V_t - \Phi(B)U_t$. This shows that $\{\varepsilon_t\}$ is weakly stationary. The aggregate process $\{Y_t\}$ can be written in terms of a linear combination of $d$ initial values summed with a linear function of the differenced process $\{W_t\}$ – see Bell (1984). Hence by Assumption A, it is sufficient to demonstrate that $\varepsilon_t$ is uncorrelated with $W_{t+h}$ for any $t$ and $h$. Now $W_{t+h} = \delta_N(B)U_{t+h} + \delta_S(B)V_{t+h}$, so that

$$E[\varepsilon_t W_{t+h}] = \Omega(B)\left[\delta_S(B^{-1})\gamma_V(h) + \delta_N(B^{-1})\gamma_{VU}(h)\right] - \Phi(B)\left[\delta_N(B^{-1})\gamma_U(h) + \delta_S(B^{-1})\gamma_{UV}(h)\right],$$

which is independent of t, but holds for all h. Thus, taking the Fourier transform yields

$$\sum_h z^h\,E[\varepsilon_t W_{t+h}] = \Omega(z)\left[\delta_S(\bar{z})f_V(\lambda) + \delta_N(\bar{z})f_{VU}(\lambda)\right] - \Phi(z)\left[\delta_N(\bar{z})f_U(\lambda) + \delta_S(\bar{z})f_{UV}(\lambda)\right].$$

Then simple algebra, along with the above formulas for $\Omega(z)$ and $\Phi(z)$, produces $\sum_h z^h E[\varepsilon_t W_{t+h}] = 0$, and hence $E[\varepsilon_t W_{t+h}] = 0$ for all $h$. A second calculation produces

$$E[\varepsilon_t\varepsilon_{t+h}] = \Omega(B)\Omega(B^{-1})\gamma_V(h) - \Omega(B)\Phi(B^{-1})\gamma_{VU}(h) - \Phi(B)\Omega(B^{-1})\gamma_{UV}(h) + \Phi(B)\Phi(B^{-1})\gamma_U(h).$$

Again, summing against $z^h$ yields

$$f_\varepsilon(\lambda) = \Omega(z)\Omega(\bar{z})f_V(\lambda) - \Omega(z)\Phi(\bar{z})f_{VU}(\lambda) - \Phi(z)\Omega(\bar{z})f_{UV}(\lambda) + \Phi(z)\Phi(\bar{z})f_U(\lambda).$$

Now plugging in for $\Omega$ and $\Phi$ gives the stated result.
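A basic consistency property of these formulas – that the signal filter $\Psi(z) = \Omega(z)\delta_N(z)$ and its complement $\Phi(z)\delta_S(z)$ sum to one at every frequency – can be checked numerically. The toy spectra below are illustrative assumptions ($\delta_S(z) = 1-z$, $\delta_N(z) = 1$, white-noise $U_t$, MA(1) $V_t$, and a cross-spectrum scaled by a hypothetical $\rho$), with $f_W$ built from the components so that it matches the numerators of $\Omega$ and $\Phi$:

```python
import numpy as np

rho, theta = 0.4, 0.5                      # hypothetical correlation and MA(1) coefficient
lam = np.linspace(-np.pi, np.pi, 401)
z, zbar = np.exp(-1j * lam), np.exp(1j * lam)

f_U = np.ones_like(lam)                    # white-noise differenced signal
f_V = (1 + theta * z) * (1 + theta * zbar) # MA(1) differenced noise
f_UV = rho * (1 + theta * zbar)            # cross-spectrum (illustrative)
f_VU = np.conj(f_UV)

d_S, dbar_S = 1 - z, 1 - zbar              # delta_S(z) = 1 - z
d_N = dbar_N = np.ones_like(z)             # delta_N(z) = 1

# spectral density of W built from the components
f_W = d_N * dbar_N * f_U + d_S * dbar_S * f_V + d_N * dbar_S * f_UV + d_S * dbar_N * f_VU

Omega = (f_U * dbar_N + f_UV * dbar_S) / f_W
Phi = (f_V * dbar_S + f_VU * dbar_N) / f_W

# Psi + (1 - Psi) = 1: signal and noise filters reconstruct the identity filter
assert np.allclose(Omega * d_N + Phi * d_S, 1.0)
```

The identity holds exactly because the numerator of $\Omega(z)\delta_N(z) + \Phi(z)\delta_S(z)$ reproduces $f_W(\lambda)$ term by term.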

Alternative calculation of F

We mentioned in the proof of Theorem 1 that there is more than one expression for the signal extraction matrix $F$, although numerically all such formulas are equal. A different approach to the problem, based upon results of Bell and Hillmer (1988), can be developed that has some computational advantages. Let $S_\star$ denote the first $d_S$ values of the vector $S$, conceived of as initial values of the process. Likewise, let $N_\star$ denote the first $d_N$ values of the vector $N$, and $Y_\star$ the first $d$ values of the vector $Y$. As described in Bell and Hillmer (1988), it is always possible to algebraically describe $S_\star$ as a linear function of $Y_\star$, $U$, and $V$, and likewise for $N_\star$. We review and develop these relationships below.

We use the notation $I_n$ to denote an identity matrix of dimension $n$. We can relate the signal vector $S$ to its initial values $S_\star$ and differenced values $U$, and similarly for the noise, via the transformations

$$\begin{bmatrix} S_\star \\ U \end{bmatrix} = \begin{bmatrix} [\,I_{d_S}\;\;0\,] \\ \Delta_S \end{bmatrix} S \qquad\text{and}\qquad \begin{bmatrix} N_\star \\ V \end{bmatrix} = \begin{bmatrix} [\,I_{d_N}\;\;0\,] \\ \Delta_N \end{bmatrix} N.$$

Let us denote these matrices by $\mathcal{S}$ and $\mathcal{N}$, respectively; they are unit lower triangular and invertible, which allows us to express $S$ directly in terms of $S_\star$ and $U$ (and $N$ in terms of $N_\star$ and $V$). Taking only the first $d$ rows of $\mathcal{S}^{-1}$ yields

$$[\,I_d\;\;0\,]\,S = [\,I_d\;\;0\,]\,\mathcal{S}^{-1}\begin{bmatrix} S_\star \\ U \end{bmatrix} = \begin{bmatrix} I_{d_S} & 0 \\ A_S & B_S \end{bmatrix}\begin{bmatrix} S_\star \\ U \end{bmatrix}.$$

This calculation first uses the fact that – because $\mathcal{S}$ is unit lower triangular with first $d_S$ rows given by $[\,I_{d_S}\;\;0\,]$ – the first $d_S$ rows of $\mathcal{S}^{-1}$ are also given by $[\,I_{d_S}\;\;0\,]$. The matrices $A_S$ and $B_S$ correspond to the next $d_N$ rows of $\mathcal{S}^{-1}$. With a similar notation and decomposition for $\mathcal{N}$, and noting that $S + N = Y$, we obtain

$$Y_\star = [\,I_d\;\;0\,]\,S + [\,I_d\;\;0\,]\,N = \begin{bmatrix} I_{d_S} & 0 \\ A_S & B_S \end{bmatrix}\begin{bmatrix} S_\star \\ U \end{bmatrix} + \begin{bmatrix} I_{d_N} & 0 \\ A_N & B_N \end{bmatrix}\begin{bmatrix} N_\star \\ V \end{bmatrix} = \begin{bmatrix} I_{d_S} & I_{d_N} \\ A_S & A_N \end{bmatrix}\begin{bmatrix} S_\star \\ N_\star \end{bmatrix} + \begin{bmatrix} 0 \\ B_S \end{bmatrix} U + \begin{bmatrix} 0 \\ B_N \end{bmatrix} V.$$

The $d \times d$ matrix that multiplies the signal and noise initial values is denoted by $[H_1\;H_2]$ in Bell and Hillmer (1988) and is there proved to be invertible. We will denote it by the symbol $\Omega$. Note that its inversion is inexpensive due to its relatively low dimension of $d$. It now follows that

$$S_\star = [\,I_{d_S}\;\;0\,]\,\Omega^{-1}\left(Y_\star - \begin{bmatrix} 0 \\ B_S \end{bmatrix}U - \begin{bmatrix} 0 \\ B_N \end{bmatrix}V\right)$$
$$N_\star = [\,0\;\;I_{d_N}\,]\,\Omega^{-1}\left(Y_\star - \begin{bmatrix} 0 \\ B_S \end{bmatrix}U - \begin{bmatrix} 0 \\ B_N \end{bmatrix}V\right).$$

These relations between signal and noise initial values are exact algebraic relations, and a direct implication of the nonstationary signal and noise structure; no stochastic assumptions have been used yet. The relations can be utilized to produce signal extraction estimates as follows. From $S = \mathcal{S}^{-1}[S_\star',\,U']'$ we deduce that the linear optimal estimate of $S$ can be constructed from estimates of $S_\star$ and $U$, followed by application of $\mathcal{S}^{-1}$. This latter matrix does not require inversion in general, but rather expresses in matrix notation the notion of recursion. Namely, we can always compute $S_t$ from $d_S$ prior values (in time) together with $U_t$, i.e. $S_t = -\sum_{j=1}^{d_S} (\delta_j^S/\delta_0^S)\,S_{t-j} + U_t$. In the proof of Theorem 1, we discussed the optimal linear estimates of $U$ given $Y$, and estimates of $V$ given $Y$, which we shall denote by $\hat{U}$ and $\hat{V}$. Then the optimal linear estimates of the signal and noise initial values are just

$$\hat{S}_\star = [\,I_{d_S}\;\;0\,]\,\Omega^{-1}\left(Y_\star - \begin{bmatrix} 0 \\ B_S \end{bmatrix}\hat{U} - \begin{bmatrix} 0 \\ B_N \end{bmatrix}\hat{V}\right)$$
$$\hat{N}_\star = [\,0\;\;I_{d_N}\,]\,\Omega^{-1}\left(Y_\star - \begin{bmatrix} 0 \\ B_S \end{bmatrix}\hat{U} - \begin{bmatrix} 0 \\ B_N \end{bmatrix}\hat{V}\right).$$

It is easy to check, using Assumption A, that the error $S_\star - \hat{S}_\star$ is orthogonal to $Y$. The algorithm is to first compute $\hat{U}$ and $\hat{V}$ via $\Delta_S F Y$ and $\Delta_N (I - F) Y$, i.e.

$$\hat{U} = \left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right)\Gamma_W^{-1}W$$
$$\hat{V} = \left(\Gamma_V\underline{\Delta}_S' + \Gamma_{VU}\underline{\Delta}_N'\right)\Gamma_W^{-1}W.$$

Then the formula for $\hat{S}_\star$ is utilized, and finally $\hat{S}$ is obtained recursively (and $\hat{N} = Y - \hat{S}$). Such an algorithm can avoid the inversion of large matrices, excepting the work involved in inverting $\Gamma_W$; however, this is a Toeplitz matrix, and hence the innovations algorithm of Brockwell and Davis (1991) can be utilized.
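Two computational points in this algorithm can be sketched numerically (toy dimensions and hypothetical parameter values throughout; for the Toeplitz step we use scipy's Levinson-type solver rather than the innovations algorithm, but the point – avoiding a dense inversion of $\Gamma_W$ – is the same):

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

rng = np.random.default_rng(2)

# (i) applying script-S^{-1} is a recursion, not a matrix inversion:
# toy case d_S = 1, delta_S(z) = 1 - z, so S_t = S_{t-1} + U_t
n = 8
S_star, U = rng.standard_normal(1), rng.standard_normal(n - 1)
scriptS = np.eye(n)
for t in range(1, n):
    scriptS[t, t - 1] = -1.0              # rows 2..n implement (1 - B)
S_solve = np.linalg.solve(scriptS, np.concatenate([S_star, U]))
S_rec = np.concatenate([S_star, S_star[0] + np.cumsum(U)])
assert np.allclose(S_solve, S_rec)

# (ii) Gamma_W is Toeplitz, so Gamma_W^{-1} w can be computed with a fast
# Levinson-type solver (MA(1) autocovariances for W; theta is hypothetical)
theta, m = 0.5, 200
gamma = np.zeros(m)
gamma[0], gamma[1] = 1 + theta**2, theta
w = rng.standard_normal(m)
assert np.allclose(solve_toeplitz(gamma, w), np.linalg.solve(toeplitz(gamma), w))
```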

Some readers may find it illuminating to derive $F$ directly from $\mathrm{Cov}(S, Y)\,\mathrm{Cov}(Y, Y)^{-1}$, which we now derive, utilizing the above expressions. Let $\tilde{\Delta}$ denote the differencing matrix with upper rows $[\,I_d\;\;0\,]$ and bottom rows $\Delta$, so that $\tilde{\Delta}Y = [Y_\star',\,W']'$. Because $W$ is orthogonal to $Y_\star$, we find that

$$\mathrm{Cov}(Y, Y) = \tilde{\Delta}^{-1}\begin{bmatrix} \mathrm{Cov}(Y_\star, Y_\star) & 0 \\ 0 & \Gamma_W \end{bmatrix}\left(\tilde{\Delta}'\right)^{-1}.$$

Moreover, $U$ is orthogonal to $Y_\star$, so that

$$\mathrm{Cov}(S, Y) = \mathcal{S}^{-1}\begin{bmatrix} \mathrm{Cov}(S_\star, Y_\star) & \mathrm{Cov}(S_\star, W) \\ 0 & \mathrm{Cov}(U, W) \end{bmatrix}\left(\tilde{\Delta}'\right)^{-1}.$$

Therefore, we obtain

$$\mathrm{Cov}(S, Y)\,\mathrm{Cov}(Y, Y)^{-1} = \mathcal{S}^{-1}\begin{bmatrix} \mathrm{Cov}(S_\star, Y_\star) & \mathrm{Cov}(S_\star, W) \\ 0 & \Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S' \end{bmatrix}\begin{bmatrix} \mathrm{Cov}(Y_\star, Y_\star)^{-1} & 0 \\ 0 & \Gamma_W^{-1} \end{bmatrix}\tilde{\Delta} = \mathcal{S}^{-1}\begin{bmatrix} \mathrm{Cov}(S_\star, Y_\star)\,\mathrm{Cov}(Y_\star, Y_\star)^{-1}\,[\,I_d\;\;0\,] + \mathrm{Cov}(S_\star, W)\,\Gamma_W^{-1}\Delta \\ \left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right)\Gamma_W^{-1}\Delta \end{bmatrix}.$$

Now utilizing the expression for $S_\star$ above, written in terms of $Y_\star$, $U$, and $V$, it follows that

$$\mathrm{Cov}(S_\star, Y_\star) = [\,I_{d_S}\;\;0\,]\,\Omega^{-1}\,\mathrm{Cov}(Y_\star, Y_\star)$$
$$\mathrm{Cov}(S_\star, W) = -[\,I_{d_S}\;\;0\,]\,\Omega^{-1}\left(\begin{bmatrix} 0 \\ B_S \end{bmatrix}\left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right) + \begin{bmatrix} 0 \\ B_N \end{bmatrix}\left(\Gamma_V\underline{\Delta}_S' + \Gamma_{VU}\underline{\Delta}_N'\right)\right).$$

As a result,

$$\mathrm{Cov}(S, Y)\,\mathrm{Cov}(Y, Y)^{-1} = \mathcal{S}^{-1}\begin{bmatrix} [\,I_{d_S}\;\;0\,]\,\Omega^{-1}\left\{[\,I_d\;\;0\,] - \left(\begin{bmatrix} 0 \\ B_S \end{bmatrix}\left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right) + \begin{bmatrix} 0 \\ B_N \end{bmatrix}\left(\Gamma_V\underline{\Delta}_S' + \Gamma_{VU}\underline{\Delta}_N'\right)\right)\Gamma_W^{-1}\Delta\right\} \\ \left(\Gamma_U\underline{\Delta}_N' + \Gamma_{UV}\underline{\Delta}_S'\right)\Gamma_W^{-1}\Delta \end{bmatrix}.$$

This is the same formula for F as given above.

References

Bell, W. 1984. "Signal Extraction for Nonstationary Time Series." The Annals of Statistics 12:646–64. doi:10.1214/aos/1176346512.

Bell, W. 2004. "On RegComponent Time Series Models and Their Applications." In State Space and Unobserved Component Models: Theory and Applications, edited by A. C. Harvey, S. J. Koopman, and N. Shephard. Cambridge, UK: Cambridge University Press. doi:10.1017/CBO9780511617010.013.

Bell, W., and S. Hillmer. 1984. "Issues Involved with the Seasonal Adjustment of Economic Time Series." Journal of Business and Economic Statistics 2:291–320.

Bell, W., and S. Hillmer. 1988. "A Matrix Approach to Likelihood Evaluation and Signal Extraction for ARIMA Component Time Series Models." Technical Report, U.S. Census Bureau.

Bell, W., and S. Hillmer. 1991. "Initializing the Kalman Filter for Nonstationary Time Series Models." Journal of Time Series Analysis 12:283–300. doi:10.1111/j.1467-9892.1991.tb00084.x.

Bell, W., and D. Martin. 2004. "Computation of Asymmetric Signal Extraction Filters and Mean Squared Error for ARIMA Component Models." Journal of Time Series Analysis 25:603–25. doi:10.1111/j.1467-9892.2004.01920.x.

Beveridge, S., and C. Nelson. 1981. "A New Approach to the Decomposition of Economic Time Series into Permanent and Transitory Components with Particular Attention to the Measurement of the 'Business Cycle'." Journal of Monetary Economics 7:151–74. doi:10.1016/0304-3932(81)90040-4.

Brewer, K. 1979. "Seasonal Adjustment of ARIMA Series." Économie Appliquée 1:7–22.

Brewer, K., P. Hagan, and P. Perazzelli. 1975. "Seasonal Adjustment Using Box–Jenkins Models." Bulletin of the International Statistical Institute, Proceedings of the 40th Session 31:130–36.

Brockwell, P., and R. Davis. 1991. Time Series: Theory and Methods. New York: Springer. doi:10.1007/978-1-4419-0320-4.

Findley, D., and D. Martin. 2006. "Frequency Domain Analyses of SEATS and X-12-ARIMA Seasonal Adjustment Filters for Short and Moderate-Length Time Series." Journal of Official Statistics 22:1–34.

Gersch, W., and G. Kitagawa. 1983. "The Prediction of Time Series with Trends and Seasonalities." Journal of Business and Economic Statistics 1:253–64.

Ghysels, E. 1987. "Seasonal Extraction in the Presence of Feedback." Journal of Business and Economic Statistics 5:191–94.

Harvey, A. 1989. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press. doi:10.1017/CBO9781107049994.

Harvey, A., and A. Jäger. 1993. "Detrending, Stylized Facts and the Business Cycle." Journal of Applied Econometrics 8:231–47. doi:10.1002/jae.3950080302.

Harvey, A., and T. Trimbur. 2007. "Trend Estimation, Signal-Noise Ratios, and the Frequency of Observations." In Growth and Cycle in the Eurozone, edited by G. L. Mazzi and G. Savio. Basingstoke, UK: Palgrave MacMillan.

Hillmer, S., and G. Tiao. 1982. "An ARIMA-Model-Based Approach to Seasonal Adjustment." Journal of the American Statistical Association 77:63–70. doi:10.1080/01621459.1982.10477767.

Hyndman, R., A. Koehler, R. Snyder, and S. Grose. 2002. "A State Space Framework for Automatic Forecasting Using Exponential Smoothing Methods." International Journal of Forecasting 18:439–54. doi:10.1016/S0169-2070(01)00110-8.

Jäger, A., and M. Parkinson. 1994. "Some Evidence on Hysteresis in Unemployment Rates." European Economic Review 38:329–42. doi:10.1016/0014-2921(94)90061-2.

Kaiser, R., and A. Maravall. 2005. "Combining Filter Design with Model-Based Filtering: An Application to Business-Cycle Estimation." International Journal of Forecasting 21:691–710. doi:10.1016/j.ijforecast.2005.04.016.

Maravall, A. 1995. "Unobserved Components in Economic Time Series." In The Handbook of Applied Econometrics, edited by H. Pesaran and M. Wickens, Chap. 1, 12–72. Oxford: Basil Blackwell.

Maravall, A., and G. Caporello. 2004. "Program TSW: Revised Reference Manual." Working Paper 2004, Research Department, Bank of Spain.

McElroy, T. 2008. "Matrix Formulas for Nonstationary ARIMA Signal Extraction." Econometric Theory 24:1–22. doi:10.1017/S0266466608080389.

McElroy, T. 2012. "An Alternative Model-Based Seasonal Adjustment That Reduces Over-Adjustment." Taiwan Economic Forecast and Policy 43:33–70.

McElroy, T., and A. Sutcliffe. 2006. "An Iterated Parametric Approach to Nonstationary Signal Extraction." Computational Statistics and Data Analysis, Special Issue on Signal Extraction 50:2206–31. doi:10.1016/j.csda.2005.07.008.

Morley, J., C. Nelson, and E. Zivot. 2003. "Why Are Beveridge–Nelson and Unobserved-Component Decompositions of GDP so Different?" Review of Economics and Statistics 85:235–43. doi:10.1162/003465303765299765.

Oh, K., E. Zivot, and D. Creal. 2008. "The Relationship between the Beveridge–Nelson Decomposition and Other Permanent-Transitory Decompositions That Are Popular in Economics." Journal of Econometrics 146:207–19. doi:10.1016/j.jeconom.2008.08.021.

Ord, J., A. Koehler, and R. Snyder. 1997. "Estimation and Prediction for a Class of Dynamic Nonlinear Statistical Models." Journal of the American Statistical Association 92:1621–29. doi:10.1080/01621459.1997.10473684.

Pinheiro, J., and D. Bates. 1996. "Unconstrained Parameterizations for Variance-Covariance Matrices." Statistics and Computing 6:289–96. doi:10.1007/BF00140873.

Proietti, T. 1995. "The Beveridge–Nelson Decomposition: Properties and Extensions." Journal of the Italian Statistical Society 4:101–24. doi:10.1007/BF02589061.

Proietti, T. 2006. "Trend-Cycle Decompositions with Correlated Components." Econometric Reviews 25:61–84. doi:10.1080/07474930500545496.

R Development Core Team. 2012. "R: A Language and Environment for Statistical Computing." R Foundation for Statistical Computing, Vienna, Austria.

Snyder, R. 1985. "Recursive Estimation of Dynamic Linear Models." Journal of the Royal Statistical Society, Series B 47:272–76. doi:10.1111/j.2517-6161.1985.tb01355.x.

Taniguchi, M., and Y. Kakizawa. 2000. Asymptotic Theory of Statistical Inference for Time Series. New York: Springer. doi:10.1007/978-1-4612-1162-4.

Whittle, P. 1963. Prediction and Regulation. London: English Universities Press.

Wiener, N. 1949. The Extrapolation, Interpolation, and Smoothing of Stationary Time Series with Engineering Applications. New York: Wiley. doi:10.7551/mitpress/2946.001.0001.

  1. By this we mean any unobserved component time series model – see Gersch and Kitagawa (1983) and Harvey (1989) – where each component, once suitably differenced to reduce to stationarity, can be viewed as a linear process.

  2. Technically, the WK filter only applies to the case of stationary signal and noise, but by a standard abuse of terminology we extend this appellation to the case of nonstationary signal and noise as developed in Bell (1984).

  3. R code (R Development Core Team 2012) for the finite sample filters is available from the first author.

  4. See the Appendix for the derivation of the BN filter.

  5. Standard algorithms – in R for instance – produce the autocovariances for $f_U$ and $f_V$; a little more work is required for the cross-covariances, but similar principles are in play.

  6. This is software developed at the Bank of Spain (Maravall and Caporello 2004).

  7. 34 foreign trade, 4 retail, 6 housing, and 44 manufacturing.

  8. This phenomenon was quite common for the BHSM fitted to other series; redundancy in the parameter vector $\psi$ manifested numerically through a final Hessian with some eigenvalues equal to zero, or even a negative number. The latter case indicates a saddle point in the likelihood, where routines such as Nelder–Mead, BFGS, and simulated annealing typically fail. However, m42110 and x42100 did not have this problem, their Hessians being positive definite.

Published Online: 2014-4-15
Published in Print: 2014-7-1

©2014 by Walter de Gruyter Berlin / Boston
