The Rescaled VAR Model with an Application to Mixed-Frequency Macroeconomic Forecasting

Andrea Giusto and Talan B. İşcan
Published/Copyright: April 30, 2018

Abstract

This paper introduces the rescaled representation of VAR models (R-VARs) and demonstrates its application in forecasting mixed-frequency macroeconomic data. We develop the model, illustrate how to implement it, and derive the asymptotic properties of the estimates. We show that R-VARs provide reliable estimates of the prediction error bands while maintaining the precision of the point forecasts. We illustrate these features by comparing the R-VAR to a mixed-frequency Bayesian VAR model, the leading alternative in the existing literature.

JEL Classification: C15; C32

Acknowledgement

We thank two anonymous reviewers for their comments and suggestions. This research was not externally funded. The authors declare no competing financial interests.

A Appendix

A.1 Proof of Theorem 1

The solutions to non-stochastic homogeneous systems are given by the matrix exponential function defined, for complex $t$ and a square stable matrix $A$, as $\exp(tA) = \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n$. Hence, to find a particular solution to (3) – given initial conditions $X_0$ at $t_0 = 0$ – we define $y(t)$ as

(6) $$X(t) = e^{tA} y(t),$$

which implies that $dX(t) = A e^{tA} y(t)\,dt + e^{tA}\,dy$, and, using (3), it follows that

$$B\,dt + AX\,dt + \Gamma\,dW = A e^{At} y(t)\,dt + e^{At}\,dy,$$
$$dy = e^{-At} B\,dt + e^{-At}\Gamma\,dW;$$

integrating between 0 and t, we have

$$y(t) = \int_0^t e^{-As} B\,ds + \int_0^t e^{-As}\Gamma\,dW(s).$$

Using now the initial condition $X(0) = X_0$, the particular solution to (3) is

$$X(t) = e^{tA} X_0 + e^{tA}\int_0^t e^{-sA} B\,ds + e^{tA}\int_0^t e^{-sA}\Gamma\,dW(s),$$

where the last integral is a normally distributed random variable. Therefore, our DGP maps into that of Sims (1971) and Geweke (1978) by specifying the functions x and b in those papers as

$$x(s)b(t-s) = \begin{cases} e^{tA}X_0 + e^{(t-s)A}B & s\in[0,t]\\ 0 & \text{otherwise.}\end{cases}$$

Now let δ > 0 denote the interval of time between observations of this process. The solution is

$$\begin{aligned}
X(t+\delta) &= e^{(t+\delta)A}X_0 + e^{(t+\delta)A}\int_0^{t+\delta} e^{-sA}B\,ds + e^{(t+\delta)A}\int_0^{t+\delta} e^{-sA}\Gamma\,dW(s)\\
&= e^{\delta A}\left[e^{tA}X_0 + e^{tA}\int_0^{t} e^{-sA}B\,ds + e^{tA}\int_0^{t} e^{-sA}\Gamma\,dW(s)\right] + e^{(t+\delta)A}\int_t^{t+\delta} e^{-sA}B\,ds + e^{(t+\delta)A}\int_t^{t+\delta} e^{-sA}\Gamma\,dW(s)\\
&= e^{\delta A}X(t) + \left[e^{\delta A}-I\right]A^{-1}B + \int_t^{t+\delta} e^{(t+\delta-s)A}\Gamma\,dW(s).
\end{aligned}$$

This implies that $X(t)$ and $X(t+\delta)$ satisfy a VAR(1) recursion; now apply the change of variable $s = t + \delta - z$ to obtain

$$X(t+\delta) = e^{A\delta}X(t) + \left[e^{A\delta}-I\right]A^{-1}B + \int_\delta^{0} e^{Az}(-1)\Gamma\,dW(z) = e^{A\delta}X(t) + \left[e^{A\delta}-I\right]A^{-1}B + \int_0^{\delta} e^{Az}\Gamma\,dW(z),$$

where the integral is a Gaussian random variable that is uncorrelated with X(t), with zero mean and covariance matrix

(7) $$\int_0^\delta e^{Az}\Gamma\Gamma' e^{A'z}\,dz.$$

The proof is completed by letting $\delta$ equal 1 first and then $1/\tau$. Note that the inverse of $e^{\delta A}$ exists for all $\delta$ and $A$, thus showing the uniqueness of $\Phi_1$ and $\Phi_\tau$.
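
The discretization in Theorem 1 is easy to check numerically. The following Python sketch, with arbitrary illustrative matrices that are not taken from the paper, computes $\Phi_\delta = e^{\delta A}$, $C_\delta = (e^{\delta A} - I)A^{-1}B$, and the covariance integral (7) by brute force, and verifies that the low-frequency ($\delta = 1$) and high-frequency ($\delta = 1/\tau$) parameters implied by the same continuous-time DGP are mutually consistent.

```python
# Numerical check of the exact discretization in Theorem 1 (a sketch with
# arbitrary illustrative matrices, not the paper's data): for
# dX = (AX + B)dt + Gamma dW, sampling every delta gives a VAR(1) with
#   Phi = e^{delta A},  C = (e^{delta A} - I) A^{-1} B,
#   Var(innovation) = int_0^delta e^{zA} GG' e^{zA'} dz.
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.5, 0.2], [0.1, -0.8]])   # stable drift matrix (example values)
B = np.array([0.3, -0.1])
Gamma = np.array([[0.4, 0.0], [0.1, 0.3]])
Q = Gamma @ Gamma.T

def discretize(delta, n_grid=4000):
    """Exact VAR(1) parameters implied by the SDE at sampling interval delta."""
    Phi = expm(delta * A)
    C = (Phi - np.eye(2)) @ np.linalg.solve(A, B)
    z = np.linspace(0.0, delta, n_grid)     # covariance integral by Riemann sum
    Sigma = sum(expm(zi * A) @ Q @ expm(zi * A).T for zi in z) * (delta / n_grid)
    return Phi, C, Sigma

tau = 3
Phi1, C1, Sigma1 = discretize(1.0)          # low frequency, delta = 1
Phit, Ct, Sigmat = discretize(1.0 / tau)    # high frequency, delta = 1/tau

# one low-frequency step must equal tau compounded high-frequency steps
print(np.allclose(Phi1, np.linalg.matrix_power(Phit, tau)))
Sigma_comp = sum(np.linalg.matrix_power(Phit, k) @ Sigmat @ np.linalg.matrix_power(Phit, k).T
                 for k in range(tau))
print(np.allclose(Sigma1, Sigma_comp, atol=1e-3))
```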

A.2 Proof of Theorem 2

Define $Q = \Gamma\Gamma'$, and

(8) $$S = \int_0^\infty e^{zA} Q e^{zA'}\,dz.$$

Noticing that Q is positive definite, we then have

$$AS + SA' = \int_0^\infty \left[A e^{zA} Q e^{zA'} + e^{zA} Q e^{zA'} A'\right]dz = \int_0^\infty \frac{d}{dz}\left[e^{zA} Q e^{zA'}\right]dz,$$
$$AS + SA' = \left[e^{zA} Q e^{zA'}\right]_{z=0}^{z=\infty},$$

and taking the limit yields the following Lyapunov equation

$$AS + SA' + Q = 0.$$

Lyapunov equations have a unique solution for $S$ if $A$ is stable. Moreover, the solution $S$ is symmetric and positive definite. To solve for $S$, we use a vectorization formula [see Hamilton (1994), equations 10.2.13–10.2.18]:

$$(I\otimes A + A\otimes I)\,\mathrm{vec}(S) = -\mathrm{vec}(Q),$$
$$\mathrm{vec}(S) = -(A\oplus A)^{-1}\mathrm{vec}(Q),$$

where ⊕ denotes the Kronecker sum.[9] Consider now the integral on the left side of (4)

$$\int_0^\delta e^{zA}\Gamma\Gamma' e^{zA'}\,dz = \int_0^\infty e^{zA}\Gamma\Gamma' e^{zA'}\,dz - \int_\delta^\infty e^{zA}\Gamma\Gamma' e^{zA'}\,dz,$$

and using the definition of S in equation (8) we have

$$\int_0^\delta e^{zA}\Gamma\Gamma' e^{zA'}\,dz = S - \int_0^\infty e^{(s+\delta)A}\Gamma\Gamma' e^{(s+\delta)A'}\,ds,$$

where we applied the change of variable $z = s + \delta$; using (8) once more proves the theorem.
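
As a sanity check on Theorem 2, the sketch below (again with illustrative example matrices, not values from the paper) solves the Lyapunov equation through the vectorization formula, cross-checks it against SciPy's solver, and verifies that the finite-horizon covariance integral equals $S - e^{\delta A} S e^{\delta A'}$.

```python
# Sketch verifying Theorem 2: S = int_0^inf e^{zA} Q e^{zA'} dz solves
# AS + SA' + Q = 0, is obtained from vec(S) = -(A (+) A)^{-1} vec(Q), and the
# finite integral satisfies int_0^delta e^{zA} Q e^{zA'} dz = S - e^{dA} S e^{dA'}.
# Example matrices below are illustrative, not from the paper.
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

A = np.array([[-0.5, 0.2], [0.1, -0.8]])
Gamma = np.array([[0.4, 0.0], [0.1, 0.3]])
Q = Gamma @ Gamma.T
K = A.shape[0]

# Kronecker sum A (+) A = I x A + A x I  (column-major vec convention)
ksum = np.kron(np.eye(K), A) + np.kron(A, np.eye(K))
S_vec = np.linalg.solve(-ksum, Q.flatten(order="F"))
S = S_vec.reshape(K, K, order="F")

# cross-check against SciPy's Lyapunov solver: A S + S A' = -Q
print(np.allclose(S, solve_continuous_lyapunov(A, -Q)))

# finite-horizon covariance integral by Riemann sum vs. S - e^{dA} S e^{dA'}
delta, n = 1.0, 4000
z = np.linspace(0.0, delta, n)
integral = sum(expm(zi * A) @ Q @ expm(zi * A).T for zi in z) * (delta / n)
print(np.allclose(integral, S - expm(delta * A) @ S @ expm(delta * A).T, atol=1e-3))
```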

A.3 Proof of Theorem 3

A.3.1 Preliminary Lemmas

The next two lemmas are well-known but are included here to make the paper self-contained.

Lemma 1

Let Δ be a complex number. Then

  (a) $\lim_{N\to\infty}\left(I+\frac{\Delta}{N}A\right)^N = e^{\Delta A}$.

  (b) $\lim_{N\to\infty}\sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N} = \left[e^{\Delta A}-I\right]A^{-1}$.

Proof.

Since the matrices $A$ and $I$ trivially commute, we can apply the binomial theorem:

$$\left(I+\frac{\Delta}{N}A\right)^N = \sum_{n=0}^{N}\frac{1}{n!}\prod_{k=0}^{n-1}(N-k)\left(\frac{\Delta}{N}A\right)^n = I + \Delta A + \frac{N(N-1)}{2!\,N^2}\Delta^2 A^2 + \frac{N(N-1)(N-2)}{3!\,N^3}\Delta^3 A^3 + \cdots,$$

and letting $N$ go to infinity proves (a). Using the binomial theorem again for each of the terms of the summation above yields

$$\lim_{N\to\infty}\sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N} = \lim_{N\to\infty}\sum_{i=0}^{N-1}\sum_{n=0}^{\infty}\frac{1}{n!}\prod_{k=0}^{n-1}(i-k)\left(\frac{\Delta}{N}\right)^n A^n\,\frac{\Delta}{N} = \lim_{N\to\infty}\sum_{i=0}^{N-1}\sum_{n=0}^{\infty}\frac{1}{n!}\prod_{k=0}^{n-1}(i-k)\left(\frac{\Delta}{N}\right)^{n+1} A^n.$$

Notice that every term with $n > i$ is equal to zero, since the product then contains the factor $(i - i)$:

$$\begin{array}{c|cccccc}
i\backslash n & 0 & 1 & 2 & \cdots & N-1 & N\\ \hline
0: & \frac{\Delta}{N}I & +\frac{1}{1!}(0-0)\left(\frac{\Delta}{N}\right)^2 A & +0 & +\cdots & +0 & +0\\
1: & +\frac{\Delta}{N}I & +\frac{1}{1!}(1-0)\left(\frac{\Delta}{N}\right)^2 A & +\frac{1}{2!}(1-0)(1-1)\left(\frac{\Delta}{N}\right)^3 A^2 & +\cdots & +0 & +0\\
2: & +\frac{\Delta}{N}I & +\frac{1}{1!}(2-0)\left(\frac{\Delta}{N}\right)^2 A & +\frac{1}{2!}(2-0)(2-1)\left(\frac{\Delta}{N}\right)^3 A^2 & +\cdots & +0 & +0\\
\vdots & & & & & & \\
N-1: & +\frac{\Delta}{N}I & +\frac{1}{1!}(N-1)\left(\frac{\Delta}{N}\right)^2 A & +\frac{1}{2!}(N-1)(N-2)\left(\frac{\Delta}{N}\right)^3 A^2 & +\cdots & +\frac{(N-1)!}{(N-1)!}\left(\frac{\Delta}{N}\right)^N A^{N-1} & +0.
\end{array}$$

Thus we have

$$\sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N} = \sum_{n=0}^{N-1}\frac{\Delta}{N}I + \sum_{n=1}^{N-1}n\left(\frac{\Delta}{N}\right)^2 A + \sum_{n=2}^{N-1}\frac{(n-1)(n-2)}{2!}\left(\frac{\Delta}{N}\right)^3 A^2 + \cdots + \left(\frac{\Delta}{N}\right)^N A^{N-1}.$$

Letting N → ∞, this expression becomes

$$\Delta I + \frac{1}{2!}\Delta^2 A + \frac{1}{3!}\Delta^3 A^2 + \cdots.$$

Post-multiplying it by $AA^{-1}$ proves (b).    □
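
A quick numerical illustration of Lemma 1 follows; the example matrix and the value of $\Delta$ are arbitrary, and part (b) is evaluated through the matrix geometric-series identity rather than term by term.

```python
# Numerical illustration of Lemma 1 (a sketch with arbitrary example values):
# both matrix limits are checked for a moderately large N.
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.5, 0.2], [0.1, -0.8]])
Delta, N = 0.7, 200000
M = np.eye(2) + (Delta / N) * A

# (a): (I + Delta/N A)^N -> e^{Delta A}
lhs_a = np.linalg.matrix_power(M, N)
print(np.max(np.abs(lhs_a - expm(Delta * A))))        # ~ O(1/N)

# (b): sum_{i=0}^{N-1} (I + Delta/N A)^i (Delta/N) -> (e^{Delta A} - I) A^{-1},
# using the geometric-series identity sum_i M^i = (M^N - I)(M - I)^{-1}
lhs_b = (np.linalg.matrix_power(M, N) - np.eye(2)) @ np.linalg.inv(M - np.eye(2)) * (Delta / N)
rhs_b = (expm(Delta * A) - np.eye(2)) @ np.linalg.inv(A)
print(np.max(np.abs(lhs_b - rhs_b)))                  # ~ O(1/N)
```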

Lemma 2

Let Δ be a complex number, and let S be a matrix that solves the discrete-time Lyapunov equation

(9) $$\left(I+\frac{\Delta}{N}A\right)S\left(I+\frac{\Delta}{N}A\right)' - S + \Gamma\Gamma' = 0.$$

Then $\mathrm{vec}(\Gamma\Gamma') = -(A\oplus A)\lim_{N\to\infty}\frac{\Delta}{N}\mathrm{vec}(S)$.

Proof.

The solution to the Lyapunov equation satisfies

(10) $$\mathrm{vec}(\Gamma\Gamma') = \left[I - \left(I+\frac{\Delta}{N}A\right)\otimes\left(I+\frac{\Delta}{N}A\right)\right]\mathrm{vec}(S) = \frac{N}{\Delta}\left[-\frac{\Delta}{N}A\otimes I - I\otimes\frac{\Delta}{N}A - \frac{\Delta}{N}A\otimes\frac{\Delta}{N}A\right]\frac{\Delta}{N}\mathrm{vec}(S).$$

Taking the limit gives

$$\lim_{N\to\infty}\mathrm{vec}(\Gamma\Gamma') = \lim_{N\to\infty}\left[-I\otimes A - A\otimes I - A\otimes\frac{\Delta}{N}A\right]\frac{\Delta}{N}\mathrm{vec}(S),$$
$$\mathrm{vec}(\Gamma\Gamma') = -\left[I\otimes A + A\otimes I\right]\lim_{N\to\infty}\frac{\Delta}{N}\mathrm{vec}(S).$$

This proves the Lemma.    □
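
Lemma 2 can be illustrated in the same way: as $N$ grows, the discrete Lyapunov solution $S$, scaled by $\Delta/N$ and multiplied by $-(A\oplus A)$, approaches $\mathrm{vec}(\Gamma\Gamma')$. The sketch below uses illustrative matrices only.

```python
# Numerical illustration of Lemma 2 (example values only): S_N solves the
# discrete Lyapunov equation (I + Delta/N A) S (I + Delta/N A)' - S + GG' = 0,
# and -(A (+) A) * (Delta/N) vec(S_N) approaches vec(GG') as N grows.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[-0.5, 0.2], [0.1, -0.8]])
Gamma = np.array([[0.4, 0.0], [0.1, 0.3]])
Q = Gamma @ Gamma.T
K = A.shape[0]
ksum = np.kron(np.eye(K), A) + np.kron(A, np.eye(K))        # Kronecker sum A (+) A

Delta = 0.7
for N in (10, 100, 10000):
    M = np.eye(K) + (Delta / N) * A
    S_N = solve_discrete_lyapunov(M, Q)                      # M S M' - S + Q = 0
    approx = -ksum @ ((Delta / N) * S_N).flatten(order="F")
    print(N, np.max(np.abs(approx - Q.flatten(order="F"))))  # shrinks in N
```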

A.3.2 Theorem 3

Let $\theta = [A', B', \Gamma']'$. Process (3) is linear and Gaussian, and therefore there exists a unique family of probability measures $\{P_{\theta,s,x}: s \ge 0,\; x \in \mathbb{R}^K\}$ induced by the solutions of (3) for $t \ge s$, given $\theta$ and $X_s = x$; see Pedersen (1995). Define the transition probability function implied by this family of probabilities as

$$P(s, x, t, D; \theta) = P_{\theta,s,x}(X_t \in D)$$

for a Borel set $D \subseteq \mathbb{R}^K$. Define $\Delta = t - s$ and let $p(s, x, t, y; \theta)$ denote the density of $P$ with respect to the Lebesgue measure on $\mathbb{R}^K$. The Euler-Maruyama approximation of order $N = 1, 2, \ldots$ under $P_{\theta,s,x}$ to the continuous-time stochastic process, then, is

(11) $$n_j = j\frac{\Delta}{N} + s, \qquad X_s^{(N)} = x, \qquad X_{n_j}^{(N)} = X_{n_{j-1}}^{(N)} + \frac{\Delta}{N}\left[A X_{n_{j-1}}^{(N)} + B\right] + \Gamma\left(W_{n_j}^{\theta,s} - W_{n_{j-1}}^{\theta,s}\right),$$

where $W_t^{\theta,s} = \int_s^t (\Gamma\Gamma')^{-1/2}\,d\!\left[X_u - x - \int_s^u (A X_v + B)\,dv\right]$, for $t \ge s$, is a $K$-dimensional Wiener process starting at time $s$, under $P_{\theta,s,x}$. Then it can be shown that $X_{n_N}^{(N)} \to X_t$ in $L^1$ norm (with measure $P_{\theta,s,x}$) as $N$ diverges (Pedersen, 1995). The transition probabilities of the Euler-Maruyama approximation are such that

(12) $$p^{(N)}(s, x, t, y; \theta) = E\!\left[p^{(1)}\!\left(n_{N-1}, X_{n_{N-1}}^{(N)}, t, y; \theta\right)\right],$$

where the expectation is taken with respect to $P_{\theta,s,x}$. Intuitively, $p^{(N)}$ is equal to the probability that the Euler-Maruyama process will transition to $y$ at time $t$ from the position $X_{n_{N-1}}^{(N)}$ that it is expected to take at $t - \Delta/N$. Then $p^{(N)} \to p$ as $N \to \infty$ (Pedersen, 1995, Theorem 1). Backward substitution in the Euler-Maruyama $N$-approximation yields

$$X_{n_{N-1}}^{(N)} = \left(I+\frac{\Delta}{N}A\right)^{N-1}x + \sum_{i=0}^{N-2}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N}B + \sum_{i=0}^{N-2}\left(I+\frac{\Delta}{N}A\right)^i\Gamma\varepsilon_{n_{N-1-i}},$$

where $\varepsilon_{n_j} = W_{n_j}^{\theta,s} - W_{n_{j-1}}^{\theta,s}$. It follows that $X_{n_{N-1}}^{(N)}$ is normal with

$$E\!\left[X_{n_{N-1}}^{(N)}\right] = \left(I+\frac{\Delta}{N}A\right)^{N-1}x + \sum_{i=0}^{N-2}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N}B, \qquad \mathrm{var}\!\left[X_{n_{N-1}}^{(N)}\right] = \frac{\Delta}{N}\sum_{i=0}^{N-2}\left(I+\frac{\Delta}{N}A\right)^i\Gamma\Gamma'\left(I+\frac{\Delta}{N}A'\right)^i.$$

Together with (12), these imply that the probability that $X_{n_N}^{(N)} = y$ at $n_N = t$ can be found by taking the expectation of the following expression:

$$\left(I+\frac{\Delta}{N}A\right)X_{n_{N-1}}^{(N)} + \frac{\Delta}{N}B + \Gamma\varepsilon_{n_N} = y.$$

Moreover, because $X_{n_{N-1}}^{(N)}$ is normal, we have that $p^{(N)}(s, x, t, y; \theta)$ also is normal, with mean $\left(I+\frac{\Delta}{N}A\right)^N x + \sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N}B$ and covariance matrix $\frac{\Delta}{N}\sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\Gamma\Gamma'\left(I+\frac{\Delta}{N}A'\right)^i$.

Define now $S = \sum_{i=0}^{\infty}\left(I+\frac{\Delta}{N}A\right)^i\Gamma\Gamma'\left(I+\frac{\Delta}{N}A'\right)^i$. Under Assumption 2 and Assumption 3, none of the eigenvalues of $I+\frac{\Delta}{N}A$ can be the reciprocal of another one, and therefore $S$ is the unique solution to the Lyapunov equation (9), and it satisfies equation (10). Now, from the definition of $S$ it follows that

$$\sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\Gamma\Gamma'\left(I+\frac{\Delta}{N}A'\right)^i = S - \left(I+\frac{\Delta}{N}A\right)^N S\left(I+\frac{\Delta}{N}A'\right)^N,$$

which shows that p(N)(s, x, s + Δ, y; θ) is normal with mean

$$\mu^{(N)} = \left(I+\frac{\Delta}{N}A\right)^N x + \sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N}B,$$

and covariance

$$\Sigma^{(N)} = \frac{\Delta}{N}\left(S - \left(I+\frac{\Delta}{N}A\right)^N S\left(I+\frac{\Delta}{N}A'\right)^N\right).$$

From Lemma 1 the limiting distribution of p(N) as N → ∞ is normal with mean

$$\mu = \lim_{N\to\infty}\left[\left(I+\frac{\Delta}{N}A\right)^N x + \sum_{i=0}^{N-1}\left(I+\frac{\Delta}{N}A\right)^i\frac{\Delta}{N}B\right] = e^{\Delta A}x + \left[e^{\Delta A}-I\right]A^{-1}B,$$

which is the desired result for the mean. To derive the covariance matrix, use Lemma 1 and Lemma 2; since Assumption 2 ensures that $A\oplus A$ is invertible, we have

$$\Sigma = \lim_{N\to\infty}\frac{\Delta}{N}\left(S - \left(I+\frac{\Delta}{N}A\right)^N S\left(I+\frac{\Delta}{N}A'\right)^N\right).$$

Finally,

$$\begin{aligned}
\mathrm{vec}(\Sigma) &= \lim_{N\to\infty}\frac{\Delta}{N}\left[\mathrm{vec}(S) - \left(\left(I+\frac{\Delta}{N}A\right)^N\otimes\left(I+\frac{\Delta}{N}A\right)^N\right)\mathrm{vec}(S)\right]\\
&= \lim_{N\to\infty}\left[I - \left(I+\frac{\Delta}{N}A\right)^N\otimes\left(I+\frac{\Delta}{N}A\right)^N\right]\frac{\Delta}{N}\mathrm{vec}(S)\\
&= \left(I - e^{\Delta(A\oplus A)}\right)\left[-(A\oplus A)\right]^{-1}\mathrm{vec}(\Gamma\Gamma').
\end{aligned}$$

This completes the proof.
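
The two limits established in Theorem 3 can also be checked numerically: for a large but finite $N$, the Euler-Maruyama mean $\mu^{(N)}$ and covariance $\Sigma^{(N)}$ should be close to the exact-discretization moments. The following sketch uses arbitrary example values for $A$, $B$, $\Gamma$, $x$, and $\Delta$, none of which come from the paper.

```python
# Sketch checking the limits in Theorem 3 numerically (illustrative values, not
# from the paper): for a large finite N, the Euler-Maruyama transition mean and
# covariance are compared with mu = e^{DA}x + (e^{DA} - I)A^{-1}B and
# vec(Sigma) = (I - e^{D(A(+)A)}) [-(A(+)A)]^{-1} vec(GG').
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.5, 0.2], [0.1, -0.8]])
B = np.array([0.3, -0.1])
Gamma = np.array([[0.4, 0.0], [0.1, 0.3]])
Q = Gamma @ Gamma.T
x = np.array([1.0, 2.0])
Delta, N, K = 1.0, 5000, 2

M = np.eye(K) + (Delta / N) * A
powers = [np.eye(K)]                       # powers[i] = (I + Delta/N A)^i
for _ in range(N - 1):
    powers.append(M @ powers[-1])

mu_N = np.linalg.matrix_power(M, N) @ x + sum(P @ B for P in powers) * (Delta / N)
Sigma_N = (Delta / N) * sum(P @ Q @ P.T for P in powers)

ksum = np.kron(np.eye(K), A) + np.kron(A, np.eye(K))          # A (+) A
mu = expm(Delta * A) @ x + (expm(Delta * A) - np.eye(K)) @ np.linalg.solve(A, B)
vec_Sigma = (np.eye(K * K) - expm(Delta * ksum)) @ np.linalg.solve(-ksum, Q.flatten(order="F"))
Sigma = vec_Sigma.reshape(K, K, order="F")

print(np.max(np.abs(mu_N - mu)), np.max(np.abs(Sigma_N - Sigma)))  # both shrink as O(1/N)
```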

A.4 Proof of Theorem 4

Let $\mathcal{L}\!\left(\{X_i\}_{i=1}^{T}; C_1, \Phi_1, \Sigma_1\right)$ denote the likelihood function for the observed data set, given $X_0$; we have

(13) $$\log\mathcal{L}\!\left(\{X_i\}_{i=1}^{T}; C_1, \Phi_1, \Sigma_1\right) = -\frac{KT}{2}\log 2\pi - \frac{T}{2}\log|\Sigma_1| - \frac{1}{2}\sum_{t=1}^{T}\left[(X_t - m) - \Phi_1(X_{t-1} - m)\right]'\Sigma_1^{-1}\left[(X_t - m) - \Phi_1(X_{t-1} - m)\right],$$

where $m = [I - \Phi_1]^{-1}C_1$. The maximum likelihood estimates for the LF-VAR are $(\tilde{C}_1, \tilde{\Phi}_1, \tilde{\Sigma}_1) = \arg\max \log\mathcal{L}\!\left(\{X_i\}_{i=1}^{T}; C_1, \Phi_1, \Sigma_1\right)$. Under the assumption that (3) is the DGP, we can also express the likelihood of these data as follows:

$$\log\mathcal{L}\!\left(\{X_i\}_{i=1}^{T}; A, B, \Gamma\right) = \sum_{t=1}^{T}\log p\bigl(t, X(t), t+1, X(t+1)\bigr),$$

where $p$ is the transition probability function defined in the proof of Theorem 3: given $X(t)$ at $t$, the distribution of $X(t+1)$ at $t+1$ is normal with mean $e^{A}X(t) + [e^{A} - I]A^{-1}B$ and (vectorized) variance $\mathrm{vec}(\Sigma) = (I - e^{A\oplus A})\left[-(A\oplus A)\right]^{-1}\mathrm{vec}(\Gamma\Gamma')$. Define now $X_0(t) \equiv X(t) + A^{-1}B$, so that $p(t, X_0(t), t+1, X_0(t+1))$ is normal with mean $e^{A}X_0(t)$ and variance $\Sigma$. It follows that this likelihood satisfies

(14) $$\log\mathcal{L}\!\left(\{X_i\}_{i=1}^{T}; A, B, \Gamma\right) = -\frac{KT}{2}\log 2\pi - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^{T}\left[X_0(t) - e^{A}X_0(t-1)\right]'\Sigma^{-1}\left[X_0(t) - e^{A}X_0(t-1)\right].$$

Under the mapping defined by the rescaling algorithm, we have that (i) $e^{A} = \Phi_1$; (ii) $C_1 = (e^{A} - I)A^{-1}B$, so that $m = -A^{-1}B$; and (iii) $\mathrm{vec}(\Sigma) = (e^{A\oplus A} - I)(A\oplus A)^{-1}\mathrm{vec}(\Gamma\Gamma') = \mathrm{vec}(\Sigma_1)$; and therefore the rescaling algorithm maps one-to-one the maximizers of (13) into the maximizers of (14). It is now immediate from Theorem 3 that the rescaling algorithm also maps one-to-one the maximizers of (14) into the maximizers of the likelihood of the (unobserved) data set in which all the variables are measured at the higher frequency,

$$\log\mathcal{L}\!\left(\{X_{i/\tau}\}_{i=1}^{T\tau}; A, B, \Gamma\right),$$

and therefore the matrices $\tilde{C}_\tau$, $\tilde{\Phi}_\tau$, and $\tilde{\Sigma}_\tau$ are, asymptotically, ML estimates of the R-VAR's parameters. The distributions of the estimators follow from standard results; see, for example, Lütkepohl (2007, Proposition 3.1).
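
The mapping (i)–(iii) suggests a direct numerical implementation of the rescaling step used in Theorem 4. The sketch below is a minimal illustration under the stated mapping, not the authors' code: it assumes $\Phi_1$ admits a real matrix logarithm (which the paper's assumptions on the DGP guarantee for the true parameters), recovers $(A, B, \Gamma\Gamma')$ from the low-frequency estimates, and rescales them to the $\delta = 1/\tau$ frequency. The low-frequency inputs shown in the usage example are made up for illustration.

```python
# A minimal sketch of the rescaling mapping in Theorem 4 (not the authors'
# code): low-frequency ML estimates (C1, Phi1, Sigma1) are mapped to the
# continuous-time parameters (A, B, GG') via (i)-(iii), and from those to the
# high-frequency VAR(1) parameters (C_tau, Phi_tau, Sigma_tau) with delta = 1/tau.
# It assumes Phi1 has a real matrix logarithm; this is not checked here.
import numpy as np
from scipy.linalg import expm, logm

def rescale(C1, Phi1, Sigma1, tau):
    K = Phi1.shape[0]
    I_K, I_K2 = np.eye(K), np.eye(K * K)
    A = np.real(logm(Phi1))                                   # (i)  e^A = Phi1
    B = A @ np.linalg.solve(Phi1 - I_K, C1)                   # (ii) C1 = (e^A - I) A^{-1} B
    ksum = np.kron(I_K, A) + np.kron(A, I_K)                  # Kronecker sum A (+) A
    vecQ = ksum @ np.linalg.solve(expm(ksum) - I_K2,          # (iii) invert the cov. mapping
                                  Sigma1.flatten(order="F"))
    # high-frequency (delta = 1/tau) counterparts of the same continuous-time DGP
    Phi_t = expm(A / tau)
    C_t = (Phi_t - I_K) @ np.linalg.solve(A, B)
    Sigma_t = np.linalg.solve(ksum, (expm(ksum / tau) - I_K2) @ vecQ).reshape(K, K, order="F")
    return C_t, Phi_t, Sigma_t

# usage with made-up low-frequency estimates; compounding tau high-frequency
# steps reproduces the low-frequency autoregressive matrix by construction
Phi1 = np.array([[0.7, 0.1], [0.05, 0.5]])
C1 = np.array([0.2, 0.1])
Sigma1 = np.array([[0.05, 0.01], [0.01, 0.04]])
C_t, Phi_t, Sigma_t = rescale(C1, Phi1, Sigma1, tau=3)
print(np.allclose(np.linalg.matrix_power(Phi_t, 3), Phi1))
```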

References

Clark, Todd E., and Michael W. McCracken. 2017. "Tests of Predictive Ability for Vector Autoregressions Used for Conditional Forecasting." Journal of Applied Econometrics 32 (3): 533–553. https://doi.org/10.1002/jae.2529.

Geweke, John. 1978. "Temporal Aggregation in the Multiple Regression Model." Econometrica 46 (3): 643–661. https://doi.org/10.2307/1914238.

Ghysels, Eric. 2016. "Macroeconomics and the Reality of Mixed Frequency Data." Journal of Econometrics 193 (2): 294–314. https://doi.org/10.1016/j.jeconom.2016.04.008.

Ghysels, Eric, Andrew C. Harvey, and Eric Renault. 1996. "Stochastic Volatility." Handbook of Statistics 14: 119–191. https://doi.org/10.1016/S0169-7161(96)14007-4.

Giacomini, Raffaella, and Halbert White. 2006. "Tests of Conditional Predictive Ability." Econometrica 74 (6): 1545–1578. https://doi.org/10.1111/j.1468-0262.2006.00718.x.

Götz, Thomas B., Alain Hecq, and Stephan Smeekes. 2016. "Testing for Granger Causality in Large Mixed-Frequency VARs." Journal of Econometrics 193 (2): 418–432. https://doi.org/10.1016/j.jeconom.2016.04.015.

Gourieroux, Christian, and Joann Jasiak. 2016. "Filtering, Prediction and Simulation Methods for Noncausal Processes." Journal of Time Series Analysis 37 (3): 405–430. https://doi.org/10.1111/jtsa.12165.

Hamilton, James Douglas. 1994. Time Series Analysis. Princeton: Princeton University Press. https://doi.org/10.1515/9780691218632.

Higham, Nicholas J. 2002. "Computing the Nearest Correlation Matrix – A Problem from Finance." IMA Journal of Numerical Analysis 22 (3): 329–343. https://doi.org/10.1093/imanum/22.3.329.

Hyndman, Rob J., Anne B. Koehler, Ralph D. Snyder, and Simone Grose. 2002. "A State Space Framework for Automatic Forecasting Using Exponential Smoothing Methods." International Journal of Forecasting 18 (3): 439–454. https://doi.org/10.1016/S0169-2070(01)00110-8.

Knuth, D. E. 1981. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms. Reading, MA: Addison-Wesley.

Lütkepohl, Helmut. 2007. New Introduction to Multiple Time Series Analysis. Berlin: Springer.

Marcellino, Massimiliano. 1999. "Some Consequences of Temporal Aggregation in Empirical Analysis." Journal of Business and Economic Statistics 17 (1): 129–136. https://doi.org/10.1080/07350015.1999.10524802.

Nyberg, Henri, and Pentti Saikkonen. 2014. "Forecasting with a Noncausal VAR Model." Computational Statistics & Data Analysis 76: 536–555. https://doi.org/10.1016/j.csda.2013.10.014.

Pedersen, Asger Roer. 1995. "A New Approach to Maximum Likelihood Estimation for Stochastic Differential Equations Based on Discrete Observations." Scandinavian Journal of Statistics 22 (1): 55–71.

Press, William H., Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. 2007. Numerical Recipes, 3rd Edition: The Art of Scientific Computing. Cambridge: Cambridge University Press.

Qian, Hang. 2016. "A Computationally Efficient Method for Vector Autoregression with Mixed Frequency Data." Journal of Econometrics 193 (2): 433–437. https://doi.org/10.1016/j.jeconom.2016.04.016.

Ripley, B. D. 1990. "Thoughts on Pseudorandom Number Generators." Journal of Computational and Applied Mathematics 31 (1): 153–163. https://doi.org/10.1016/0377-0427(90)90346-2.

Schorfheide, Frank, and Dongho Song. 2015. "Real-Time Forecasting with a Mixed Frequency VAR." Journal of Business and Economic Statistics 33 (3): 366–380. https://doi.org/10.3386/w19712.

Sims, Christopher A. 1971. "Discrete Approximations to Continuous Time Distributed Lags in Econometrics." Econometrica 39 (3): 545–563. https://doi.org/10.2307/1913265.

Published Online: 2018-04-30

©2018 Walter de Gruyter GmbH, Berlin/Boston
