Steady-state priors and Bayesian variable selection in VAR forecasting

Dimitrios P. Louzis

doi:10.1515/snde-2015-0048


Published/Copyright: March 16, 2016


From the journal Studies in Nonlinear Dynamics & Econometrics Volume 20 Issue 5

Abstract

This study proposes methods for estimating Bayesian vector autoregressions (VARs) with a (semi-) automatic variable selection and an informative prior on the unconditional mean or steady-state of the system. We show that extant Gibbs sampling methods for Bayesian variable selection can be efficiently extended to incorporate prior beliefs on the steady-state of the economy. Empirical analysis, based on three major US macroeconomic time series, indicates that the out-of-sample forecasting accuracy of a VAR model is considerably improved when it combines both variable selection and steady-state prior information.

Keywords: Bayesian VAR; macroeconomic forecasting; steady-states; variable selection

1 Introduction

The seminal studies of Sims (1980), Doan, Litterman, and Sims (1984) and Litterman (1986) kick-started a flurry of research on Bayesian vector autoregressions (Bayesian VARs or BVARs hereafter) and their ability to generate accurate macroeconomic forecasts. Almost 30 years later, BVARs have been established as a standard forecasting tool in empirical macroeconomics (e.g. see Karlsson 2013 for a recent review and references therein).

A key element in the plethora of BVAR specifications is the shrinkage of dynamic parameters towards a specific representation of the data which reflects researchers’ prior beliefs and deals with the over-parameterization problem. A popular shrinkage method is the Minnesota (or Litterman) prior (Doan, Litterman, and Sims 1984; Litterman 1986) and its variants (Banbura, Giannone, and Reichlin 2010; Koop 2013; Giannone, Lenza, and Primiceri 2015), which allow for different levels of shrinkage on VAR coefficients and in some cases lead to conjugate posterior densities, eliminating the need for posterior simulation. Recently, the Bayesian least absolute shrinkage and selection operator (Lasso) has also been proposed for VAR shrinkage (Korobilis 2013; Gefang 2014).

Another strand of the literature has proposed Bayesian variable selection as an alternative way of VAR shrinkage. In general, variable selection techniques refer to a statistical procedure that decides stochastically which of the variables enter each VAR equation and which do not, based on information provided by the data. Variable selection can be performed either by imposing a tight prior around zero on some of the VAR coefficients (George, Sun, and Ni 2008; Korobilis 2008; Koop 2013) or by restricting coefficients to be exactly zero (Korobilis 2013). In the former case, all variables enter the VAR equations but some of them have coefficients very close, but not exactly equal, to zero; in the latter case, some of the variables are actually excluded from the VAR system, leading to a restricted VAR specification. The method of George, Sun, and Ni (2008) is generally faster and computationally more stable than the method of Korobilis (2013) (see also George and McCulloch 1997). On the other hand, the method of Korobilis (2013) is considered more flexible since it is fully automatic and independent of the prior specified on the dynamic parameters, a feature that simplifies posterior simulations and enables the adoption of the method in non-linear VAR models.

Although the bulk of the BVAR literature concentrates on prior specification and shrinkage techniques for the dynamic VAR coefficients, Villani (2009) argues that the unconditional mean, or steady state, of the system is an equally, or even more, important aspect of BVARs’ forecasting performance. The rationale is that longer-term forecasts of a stationary VAR converge to their unconditional means and thus their estimated level plays a crucial role in forecasting accuracy. Moreover, economists usually have a more crystallized view on the steady-state level of an economy than on its short-term fluctuations, implying that a VAR model that accounts for this kind of prior information can improve its forecasting behavior. Therefore, the author proposes a mean-adjusted representation of a VAR model that enables the incorporation of prior beliefs on steady-states. The empirical evidence, so far, suggests that steady-state VAR models outperform their counterparts with uninformative priors on constant terms (e.g. see Beechey and Österholm 2008; Villani 2009; Clark 2011; Wright 2013). In Beechey and Österholm (2010), the authors also show that univariate steady-state AR models have better forecasting performance than AR models with traditional specifications.^[1]

Against this background, this study extends those of Villani (2009), George, Sun, and Ni (2008) and Korobilis (2013) and proposes methods of estimating a VAR model that incorporates prior beliefs on the steady-state and also adopts a Bayesian variable selection method. More specifically, we show that extant Gibbs sampling algorithms for Bayesian variable selection proposed by George, Sun, and Ni (2008) or Korobilis (2013) can be efficiently extended to allow for priors on the steady-state (Villani 2009). The essential steps include re-writing the VAR model in a mean-adjusted form, which allows prior elicitation for the unconditional mean, and adding an extra block to the Gibbs sampler of George, Sun, and Ni (2008) or Korobilis (2013) that draws from the full conditional posterior density of the steady-state parameters.

The proposed specification is evaluated in terms of an out-of-sample point forecasting exercise based on three US macroeconomic variables, namely, real gross domestic product (GDP) growth, inflation rate (consumer price index, CPI) and short-term interest rates (Federal funds rate). The out-of-sample period covers almost 46 years from 1969:Q4 to 2015:Q1 and the suggested model is compared to alternative established specifications such as the steady-state VAR model of Villani (2009) and the VAR model with variable selection methods either à la George, Sun, and Ni (2008) or à la Korobilis (2013). Finally, as a robustness check, we use three alternative prior specifications on the dynamic VAR parameters and variables in both ‘gap’ form, i.e. deviations from the long-run expectations, and non-gap form. Nonetheless, we should mention that since our main purpose is to evaluate the alternative specifications in terms of point forecasting, we do not consider steady-state VAR specifications enhanced with stochastic volatility that materially improve density forecasts (Clark 2011).

The rest of the paper is organized as follows. Section 2 develops the steady-state VAR with variable selection techniques and the Gibbs sampling algorithms. Section 3 presents the empirical results while Section 4 concludes this paper.

2 Steady-state VARs with variable selection

A reduced-form VAR is written as:

$$B(L)\,y_t = c\,d_t + \varepsilon_t \tag{1}$$

where y_t is an m×1 vector of time series at time t with t=1, …, T observations, B(L)=I_m–B₁L–…–B_pL^p with Ly_t=y_{t–1}, ε_t are the errors distributed as N(0, Σ) with Σ being the m×m covariance matrix and d_t is a q-dimensional vector of exogenous deterministic variables such as constants, dummies or time trends. Assuming stationarity for y_t, the unconditional mean or steady-state of the VAR process in Eq. (1) is defined as E(y_t)=μ_t=B(L)^{–1}cd_t. From the steady-state definition it is clear that it is hard to encapsulate prior opinions with respect to μ_t into Eq. (1).^[2] To circumvent this problem, Villani (2009) proposes a steady-state representation of the VAR model that is practically a deviations-from-mean parameterization, i.e.:

$$B(L)\,(y_t - \varphi d_t) = \varepsilon_t \tag{2}$$

where φ=B(L)^–1c and the long-run mean is μ_t=φd_t. Therefore, a researcher can incorporate his prior beliefs on μ_t by directly specifying priors on φ. The steady-state VAR in Eq. (2) can be rewritten as:

$$\tilde{y}_t = B_1\tilde{y}_{t-1} + \dots + B_p\tilde{y}_{t-p} + \varepsilon_t \tag{3}$$

with ỹ_t=y_t–φd_t being the mean-adjusted time series of y_t. It is straightforward to extend the steady-state VAR in Eq. (3) to allow for Bayesian variable selection. The alternative variable selection approaches are discussed in Sections 2.1 and 2.2 below.
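To illustrate why the steady-state matters for longer horizons, the following minimal sketch computes the unconditional mean of a VAR and iterates the mean-adjusted form of Eq. (3) forward. It assumes the special case of a constant-only deterministic term (d_t=1), so that μ=(I–B₁–…–B_p)^{–1}c; the function names and the numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

def steady_state(B_list, c):
    """Unconditional mean mu = (I - B_1 - ... - B_p)^{-1} c (constant-only case)."""
    m = c.shape[0]
    A = np.eye(m) - sum(B_list)
    return np.linalg.solve(A, c)

def iterate_forecast(B_list, mu, y_hist, horizon):
    """Iterate the mean-adjusted VAR forward with future errors set to zero.

    y_hist holds at least the last p observations, oldest first.
    """
    p = len(B_list)
    dev = [y - mu for y in y_hist[-p:]]      # deviations from the steady-state
    out = []
    for _ in range(horizon):
        nxt = sum(B_list[i] @ dev[-1 - i] for i in range(p))
        dev.append(nxt)
        out.append(nxt + mu)                 # add the steady-state back
    return np.array(out)

# Hypothetical stationary bivariate VAR(1); values are illustrative only.
B1 = np.array([[0.5, 0.1],
               [0.0, 0.8]])
c = np.array([1.0, 0.4])
mu = steady_state([B1], c)
path = iterate_forecast([B1], mu, [np.array([5.0, 0.0])], horizon=200)
print(mu, path[-1])
```

In this parameterization the long-horizon forecast is pinned down by μ alone, which is why a prior placed directly on φ translates into a prior on where the forecasts settle.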

2.1 Stochastic search variable selection of George, Sun, and Ni (2008)

Before proceeding to the description of the stochastic search variable selection (SSVS) method proposed by George, Sun, and Ni (2008), it is useful to rewrite the steady-state VAR using matrix notation, i.e.:

$$\tilde{Y} = \tilde{X}B + E \tag{4}$$

where Ỹ and E are T×m matrices whose t-th rows are Ỹ_t=ỹ′_t=(ỹ_1t, ỹ_2t, …, ỹ_mt) and E_t=ε′_t=(ε_1t, ε_2t, …, ε_mt), respectively. X̃ is a T×k matrix whose t-th row is X̃_t=x̃_t=(ỹ′_{t–1}, …, ỹ′_{t–p}), and k=mp is the number of explanatory variables in each VAR equation. The matrix of VAR coefficients is defined as B=(B₁, …, B_p)′, with β=vec(B) being the n×1 vector of regression coefficients and n=mk=m²p being the total number of VAR dynamic coefficients.

George, Sun, and Ni (2008) use the insights of hierarchical Bayesian modeling to specify the prior distribution on β as a mixture of two Normal distributions. More specifically:

$$\beta \mid \gamma \sim N_n(b,\, DD) \tag{5}$$

with γ=(γ₁, …, γ_n)′ being an n×1 vector of dummy variables which take the values 1 and 0, and D=diag(h₁, …, h_n) being an n×n diagonal matrix where:

$$h_j = \begin{cases} \tau_{0j} & \text{if } \gamma_j = 0 \\ \tau_{1j} & \text{if } \gamma_j = 1 \end{cases} \tag{6}$$

Eqs. (5) and (6) imply the following prior on each of the j=1, … n elements of β

$$\beta_j \mid \gamma_j \sim (1-\gamma_j)\,N(b_j,\, \tau_{0j}^2) + \gamma_j\,N(b_j,\, \tau_{1j}^2) \tag{7}$$

The prior in Eq. (7) is hierarchical since it depends on the unknown dummy parameter γ_j, for which we specify a prior and whose posterior distribution we estimate using the available information. The degree of shrinkage towards the prior means, b_j, depends on the hyperparameters τ_0j² and τ_1j², which are preselected by the researcher so that τ_0j²<τ_1j². Thus, if γ_j equals zero, then β_j is forced to take a value very close to its prior mean because it is drawn from the first ‘tight’ Normal with a very small prior variance. In particular, if the prior mean is zero, the dynamic coefficient is shrunk towards zero (but not exactly to zero) and the variable is almost excluded from the model. If, instead, γ_j equals one, then β_j is drawn from the second ‘loose’ Normal, which is relatively uninformative due to its large prior variance.
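As a hedged illustration of how the mixture prior separates ‘included’ from ‘almost excluded’ coefficients, the sketch below evaluates the conditional probability that γ_j=1 given a value of β_j. It assumes the prior covariance DD in Eq. (5) (no cross-coefficient correlation) and an independent Bernoulli prior with Pr(γ_j=1)=π_j, so that the probability is proportional to π_j times the ‘loose’ Normal density at β_j. The function name and the hyperparameter values are hypothetical; the authors’ own expressions are in their Appendix A, which is not reproduced here.

```python
import numpy as np

def ssvs_inclusion_prob(beta, b_prior, tau0, tau1, pi):
    """Pr(gamma_j = 1 | beta_j) under a two-component Normal mixture prior.

    tau0 < tau1 are the prior standard deviations of the 'tight' and
    'loose' components; pi is the prior inclusion probability.
    """
    z = np.asarray(beta, dtype=float) - b_prior
    # Log of (Normal density x prior weight), dropping the common constant.
    log_u1 = -np.log(tau1) - 0.5 * (z / tau1) ** 2 + np.log(pi)
    log_u0 = -np.log(tau0) - 0.5 * (z / tau0) ** 2 + np.log(1.0 - pi)
    # Normalise in log space to avoid underflow for large |z|.
    mx = np.maximum(log_u1, log_u0)
    u1 = np.exp(log_u1 - mx)
    u0 = np.exp(log_u0 - mx)
    return u1 / (u1 + u0)

# Illustrative hyperparameters only (not the paper's settings).
p_small = ssvs_inclusion_prob(0.0, b_prior=0.0, tau0=0.1, tau1=10.0, pi=0.5)
p_large = ssvs_inclusion_prob(1.0, b_prior=0.0, tau0=0.1, tau1=10.0, pi=0.5)
print(p_small, p_large)
```

A coefficient sitting at its prior mean is attributed almost entirely to the tight component, while a coefficient ten tight-standard-deviations away is attributed almost entirely to the loose one.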

The three alternative specifications used for the hyperparameters b and D are discussed in Section 3 and in more detail in Appendix B. Moreover, the γ_j dummy variables are assumed to be independent of each other ∀j and their prior is defined as a Bernoulli density with π_j prior probability, i.e.:

$$\gamma_j \mid \gamma_{\backslash -j} \sim \text{Bernoulli}(\pi_j), \tag{8}$$

with Pr(γ_j=1)=π_j and Pr(γ_j=0)=1–π_j and γ_\–j denoting all elements of γ except for the j-th.

We also follow George, Sun, and Ni (2008) and impose restrictions on the inverse of the error covariance matrix, Σ^–1, via its Cholesky factor, Ψ. In particular, let Σ^–1=ΨΨ′ with Ψ being an upper triangular m×m matrix. The authors propose a Gamma prior for the squared elements of the main diagonal of Ψ, ψ=(ψ₁₁, …, ψ_mm)′, i.e. ψ_jj²~gamma(a_j, b_j) (with the ψ_jj being independent of each other), and a mixture of two Normal distributions for the non-zero off-diagonal elements of Ψ. In particular, we define η_i=(ψ_{1i}, …, ψ_{i–1,i})′ for i=2, …, m and we assume that:

$$\eta_i \mid \omega_i \sim N_{i-1}(0,\, D_iD_i) \tag{9}$$

where ω_i=(ω_{1i}, …, ω_{i–1,i})′ is a vector of dummy variables and D_i=diag(h_{1i}, …, h_{i–1,i}) is an (i–1)×(i–1) diagonal matrix with:

$$h_{ji} = \begin{cases} \kappa_{0ji} & \text{if } \omega_{ji} = 0 \\ \kappa_{1ji} & \text{if } \omega_{ji} = 1 \end{cases} \tag{10}$$

Hyperparameters κ_0ji and κ_1ji are predetermined constants with κ_0ji<κ_1ji. As in Eq. (7), Eqs. (9) and (10) imply the following mixture of Normals for each element of η_i:

$$\psi_{ji} \mid \omega_{ji} \sim (1-\omega_{ji})\,N(0,\, \kappa_{0ji}^2) + \omega_{ji}\,N(0,\, \kappa_{1ji}^2) \tag{11}$$

Again, the elements of the vector ω=(ω′_2, …, ω′_m)′ are independent of each other and each has a Bernoulli prior density, i.e.:

$$\omega_{ji} \mid \omega_{\backslash -ji} \sim \text{Bernoulli}(q_{ji}) \tag{12}$$

with Pr(ω_ji=1)=q_ji and Pr(ω_ji=0)=1–q_ji.

Finally, the prior on the steady-state parameter, φ, is φ~N_mq(b_ϕ, V_ϕ) assuming independence between φ and β as in Villani (2009).

2.1.1 Posterior inference for the steady-state VAR with stochastic search variable selection

The posterior inference is based on the idea that conditional on the steady-state parameters, φ, the VAR specification in Eq. (4) is a standard SSVS VAR for the mean-adjusted time series ỹ_t, and therefore the Gibbs sampler framework of George, Sun, and Ni (2008) can be implemented.^[3] In fact, estimation of Eq. (4) requires only an extra block which samples the steady-state parameters from a normal density described in Villani (2009). A general Gibbs sampler algorithm for the steady-state VAR with stochastic search variable selection, which draws sequentially from the full conditional posterior density of the parameters, contains the following steps:^[4]

Draw ψ_jj²|β, φ, η; 𝒟 from gamma(a̅_j, b̅_j);
Draw η_i|β, φ, ψ; 𝒟 from N_i–1(h̅_i, Δ̅_i);
Draw ω_ji|ω_\–ji, β, φ, ψ; 𝒟 from Bernoulli(q̅_ji);
Draw β|γ, φ, Ψ; 𝒟 from N_n(β̅, V̅);
Draw γ_j|γ_\–j, β, φ, Ψ; 𝒟 from Bernoulli(π̅_j);
Draw φ|β, Ψ; 𝒟 from N_mq(b̅_ϕ, V̅_ϕ).

where 𝒟={y₁, …, y_T, d₁, …, d_T} denotes the data. The quantities a̅_j, b̅_j, h̅_i, Δ̅_i, q̅_ji, β̅, V̅, π̅_j, b̅_ϕ and V̅_ϕ as well as the posterior sampling procedure are described in detail in Appendix A.

2.2 Variable selection of Korobilis (2013)

The steady-state VAR in Eq. (3) can also be extended to allow for variable selection (VS) in the form of Korobilis (2013). In contrast to George, Sun, and Ni (2008), the Korobilis (2013) method does exclude variables, in the sense that the dynamic coefficients take a value of exactly zero rather than close to zero. This means that each of the VAR equations may have different lagged variables and the steady-state VAR in Eq. (3) should be rewritten as a system of seemingly unrelated regressions (SUR), i.e.:

$$\tilde{y}_t = \tilde{z}_t\beta + \varepsilon_t \tag{13}$$

where z̃_t=I_m⊗x̃_t is an m×n matrix, with n=m²p being, again, the total number of coefficients in Eq. (13), x̃_t=(ỹ′_{t–1}, …, ỹ′_{t–p}) is a 1×k (k=mp) vector containing all regressors at time t and β=vec(B) is the n×1 vector of the VAR dynamic coefficients. The model in Eq. (13) is an unrestricted steady-state VAR since no restrictions are imposed on the elements β_j, j=1, …, n, of β. By contrast, the Bayesian variable selection method proposed by Korobilis (2013) restricts some of the β_j coefficients to be zero as follows:

$$\begin{cases} \beta_j = 0 & \text{if } \gamma_j = 0 \\ \beta_j \neq 0 & \text{if } \gamma_j = 1 \end{cases} \tag{14}$$

where γ_j is an indicator variable and the j-th element of the vector γ=(γ₁, …, γ_n)′. We define the steady-state VAR with variable selection as

$$\tilde{y}_t = \tilde{z}_t\theta + \varepsilon_t \tag{15}$$

where θ=Γβ and Γ is an n×n diagonal matrix with the elements of γ on its main diagonal, i.e. Γ_jj=γ_j. The specification in Eq. (15) implies that for γ_j=1 ∀j we get the unrestricted steady-state VAR of Eq. (13).
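The SUR notation can be made concrete with a small numerical sketch. It builds z̃_t=I_m⊗x̃_t with a Kronecker product, stacks β=vec(B) column by column, and applies θ=Γβ. The dimensions, random values and the particular pattern of γ are hypothetical and serve only to show that a zero in γ removes the corresponding regressor from the corresponding equation.

```python
import numpy as np

m, p = 3, 1                      # three variables, one lag (illustrative)
k = m * p                        # regressors per equation
n = m * k                        # total number of dynamic coefficients

rng = np.random.default_rng(0)
x_t = rng.normal(size=(1, k))    # 1 x k row of lagged, mean-adjusted values
z_t = np.kron(np.eye(m), x_t)    # m x n block-diagonal regressor matrix

B = rng.normal(size=(k, m))      # column i holds the coefficients of equation i
beta = B.reshape(-1, order="F")  # beta = vec(B), columns stacked

# Hypothetical inclusion indicators, ordered like beta.
gamma = np.array([1, 0, 1,  0, 1, 0,  1, 1, 1])
theta = np.diag(gamma) @ beta    # theta = Gamma * beta

fitted = z_t @ theta             # conditional mean of the mean-adjusted y_t
print(fitted)
```

Because Γ is diagonal, the same result is obtained by an element-wise product γ⊙β, which avoids forming an n×n matrix when n is large.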

Bayesian estimation and inference on β, γ, φ and Σ requires the specification of prior distributions, which are generally based on the propositions of Korobilis (2013) and Villani (2009). Accordingly, we define the prior density for the dynamic parameters, β, as a multivariate normal distribution, i.e. β~N_n(b, V), with hyperparameters b and V being further specified in Section 3 and Appendix B. The γ_j dummy variables are assumed to be independent of each other ∀j and their prior is defined as a Bernoulli density with π_j prior probability, i.e. γ_j|γ_\–j~Bernoulli(π_j), where γ_\–j denotes all elements of γ except for the j-th. Finally, the prior on Σ is taken to be the scale-invariant improper Jeffreys prior, i.e. p(Σ)∝|Σ|^{–(m+1)/2}, while the prior on φ is φ~N_mq(b_ϕ, V_ϕ) assuming independence between φ and β.

2.2.1 Posterior inference for the steady-state VAR with variable selection

The posterior inference is based on the same idea already presented in Section 2.1.1 for the steady-state VAR with stochastic search variable selection. This means that we implement the Gibbs sampler framework of Korobilis (2013) for the mean-adjusted time series ỹ_t and then add an extra fourth block which samples the steady-state parameters from a normal density described in Villani (2009). A general Gibbs sampler algorithm for the steady-state VAR with variable selection contains the following four steps:

Draw β|γ, φ, Σ; 𝒟 from N_n(β̅, V̅);
Draw γ_j|γ_\–j, β, φ, Σ; 𝒟 from Bernoulli(π̅_j);
Draw Σ^–1|γ, β, φ; 𝒟 from Wishart(T, S^–1);
Draw φ|γ, β, Σ; 𝒟 from N_mq(b̅_ϕ, V̅_ϕ).

Appendix A describes in detail the formulation of β̅, V̅, π̅_j, S, b̅_ϕ and V̅_ϕ as well as the posterior sampling procedure.

In general, the extra block which draws the steady-state parameters requires only a very small portion of the total computing time, for both Gibbs sampling algorithms described in Sections 2.1.1 and 2.2.1, as also pointed out in Villani (2009). In practice, the Gibbs sampler proves to be very efficient and convergence problems may arise only under certain conditions. More specifically, the unconditional mean is not identified for a non-stationary VAR process and this may lead to convergence difficulties only if it is combined with an uninformative (i.e. very large prior variance) steady-state prior (Villani 2005; Villani 2009, 646–647). Villani (2005, 2009) shows theoretically how an informative prior on the steady-state stabilizes the Gibbs sampler even if the VAR system approaches a unit root process. The author also uses a simulation exercise and shows that even moderately informative steady-state priors can produce acceptable posterior simulations. We also confirm the abovementioned findings for the proposed models by implementing a similar simulation exercise.^[5]

3 Empirical analysis

We use three major US macroeconomic series, i.e. real GDP growth rate, CPI inflation rate and effective Federal funds rate, in order to estimate the proposed model and evaluate its forecasting performance (henceforth, we refer to these variables as GDP (r_t), inflation (π_t) and interest rate (i_t), respectively). The data series cover the period from 1954:Q2 to 2015:Q1 and were obtained from the St. Louis Fed’s FRED database on a quarterly basis.^[6] Real GDP growth and CPI inflation rate are calculated as annualized quarter-on-quarter percentage changes, i.e. r_t=400×ln(x_t/x_{t–1}) and π_t=400×ln(p_t/p_{t–1}), where x and p are quarterly real GDP and the quarterly CPI index, respectively, while we use the interest rate in levels. Figure 1 shows the evolution of the variables for the full sample period.
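The transformation of the level series can be written in a few lines. This is a minimal sketch of the annualized quarter-on-quarter log change, 400×ln(x_t/x_{t–1}); the function name and the input values are hypothetical and do not come from the FRED data used in the paper.

```python
import numpy as np

def annualized_log_growth(levels):
    """Annualized quarter-on-quarter growth in percent: 400 * ln(x_t / x_{t-1})."""
    levels = np.asarray(levels, dtype=float)
    return 400.0 * np.log(levels[1:] / levels[:-1])

# Hypothetical quarterly index levels (not actual GDP or CPI data).
growth = annualized_log_growth([100.0, 101.0, 101.5])
print(growth)
```

The output has one observation fewer than the input, since the first quarter has no predecessor.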

Figure 1:

Annualized real GDP growth rate, CPI inflation rate and Federal Funds rate in (%).

The inflation graph reveals the well-known slowly moving trend component which rises during the 1970s (Great Inflation) and starts falling after a decade (Great Moderation). The main bulk of the macroeconomic forecasting literature argues that this trend has to be taken into consideration in terms of inflation forecasting, while others argue that it represents the agent’s perceptions of the Fed’s time varying long-run inflation target (see Faust and Wright 2013 and references therein). In the same vein, Chan and Koop (2014) find a structural break on the steady-state of inflation and interest rates during the 1970s, while Clark (2011) and Wright (2013) account for this time-varying local mean for inflation and interest rates by employing the long-term inflation expectations based on Blue Chip surveys.^[7] Moreover, a number of studies show that the survey based long-run inflation expectations can be conveniently approximated by using a simple exponential smoothing (e.g. see Cogley 2002; Clark 2011; Faust and Wright 2013 among others).

Here, we consider a twofold approach. First, we estimate the alternative specifications using the raw figures as endogenous variables, i.e. the vector of endogenous variables is given by y_t=(r_t, π_t, i_t)′. Second, we also use inflation and interest rates in gap form, i.e. as deviations from long-run inflation expectations, where the long-run expectations are roughly proxied by the exponentially smoothed counterpart of inflation, π_t. More specifically, we estimate exponentially smoothed inflation as π_t^es=απ_{t–1}^es+(1–α)π_t, where the exponential smoothing parameter α is set equal to 0.95. Then, the inflation and interest rate variables in gap form are defined as π_t^gap=π_t–π_t^es and i_t^gap=i_t–π_t^es, respectively, which are also depicted in Figure 1 along with the π_t^es variable. The vector of endogenous variables is now given by y_t=(r_t, π_t^gap, i_t^gap)′ and the steady-state priors are set on the gap variables. Since the alternative VAR models provide forecasts only for the π_t^gap and i_t^gap variables, we also assume a random walk process for the low-frequency component of inflation, π_t^es, in order to get forecasts for the raw figures of inflation and interest rate, π_t and i_t, respectively (Faust and Wright 2013). Although the two alternative approaches are mainly used as a robustness check for the alternative specifications, a comparison between gap and non-gap models is always interesting.
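The gap construction can be sketched as follows. The recursion and α=0.95 are as stated in the text; the initial value of the smoothed series is not specified there, so starting it at the first inflation observation is an assumption of this sketch, as are the function names. Under the random walk assumption for the trend, forecasts of the raw variables are recovered by adding the last smoothed value back to the gap forecasts.

```python
import numpy as np

def exp_smooth(pi, alpha=0.95, init=None):
    """pi_es[t] = alpha * pi_es[t-1] + (1 - alpha) * pi[t].

    init is the value of the smoothed series before the first observation;
    defaulting to pi[0] is an assumption, not taken from the paper.
    """
    pi = np.asarray(pi, dtype=float)
    es = np.empty_like(pi)
    prev = pi[0] if init is None else init
    for t, x in enumerate(pi):
        prev = alpha * prev + (1.0 - alpha) * x
        es[t] = prev
    return es

def gap_variables(pi, i, alpha=0.95):
    """Return (inflation gap, interest rate gap, smoothed inflation)."""
    es = exp_smooth(pi, alpha)
    return np.asarray(pi, float) - es, np.asarray(i, float) - es, es

def undo_gap(gap_forecasts, last_es):
    """Random-walk trend: every horizon reuses the last smoothed value."""
    return np.asarray(gap_forecasts, dtype=float) + last_es

# Hypothetical constant series, chosen so the result is easy to verify.
pi_gap, i_gap, es = gap_variables([2.0, 2.0, 2.0], [4.0, 4.0, 4.0])
raw = undo_gap([0.0, 0.1], last_es=es[-1])
print(pi_gap, i_gap, raw)
```

With constant inflation of 2 and a constant interest rate of 4, the inflation gap is 0 and the interest rate gap is 2.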

3.1 Prior specifications and alternative models

A first step in Bayesian estimation is to specify the hyperparameters of the prior distributions described in Sections 2.1 and 2.2. We follow George, Sun, and Ni (2008), Koop (2013) and Korobilis (2013) and use three distinct prior specifications on the dynamic parameters vector, β. In particular, for the SSVS VAR models we use (i) the default ‘semi-automatic’ approach of George, Sun, and Ni (2008) (standard prior hereafter), (ii) the combination of the SSVS method with the Minnesota prior of Koop (2013) and (iii) the combination of the SSVS method with the Ridge prior (see also Korobilis 2013). Similarly, for the VS VARs we use (i) the ridge regression prior (ridge prior hereafter), (ii) the Minnesota prior and (iii) the hierarchical Bayes shrinkage prior (shrink prior hereafter); all three of them are based on the Normal distribution and are briefly described in Appendix B.^[8] As regards the VS specifications, we set the prior probability of the Bernoulli density equal to 0.8, i.e. π_j=0.8, while the standard non-informative prior |Σ|^{–(m+1)/2} is used for Σ. For the SSVS specifications we set π_j=0.5, q_ji=0.7, κ_0ji=1, κ_1ji=5, a_j=0.01 and b_j=0.01.

In the steady-state version of the VAR models we also have to specify the prior mean and standard deviation of the steady-state coefficients, φ. Following the recent literature (e.g. see Clark 2011; Österholm 2012), we set the steady-state GDP growth and inflation rate equal to 3% and 2%, respectively, and the nominal interest rate equal to 4%, while we assume a 0.5 standard deviation for GDP and inflation and 0.7 for the interest rate. Finally, the steady-state levels for the inflation and interest rate variables in gap form are set equal to 0% and 2%, respectively (see Clark 2011).

The primary scope of the empirical analysis is to examine whether the forecasting ability of a Bayesian VAR model is improved when we incorporate prior beliefs with respect to the steady-state of the economy and we also allow for a (semi-) automatic Bayesian variable selection. To that end, we compare the forecasting ability of the following six alternative BVAR specifications (models’ names in parentheses):

a baseline BVAR without variable selection and steady-state priors (Standard),
a steady-state BVAR (Villani 2009) (SSP),
a BVAR with variable selection (Korobilis 2013) (VS),
a BVAR with both variable selection and steady-state priors (VS-SSP),
a BVAR with stochastic search variable selection (George, Sun, and Ni 2008) (SSVS),
a BVAR with stochastic search variable selection and steady-state priors (SSVS-SSP).

These six alternative specifications are estimated using each of the three prior specifications on β meaning that we have in total 18 distinct BVAR models. For all models we use a lag length of 4. For comparison reasons we also estimate a standard VAR model using ordinary least squares (OLS) with one lag.

3.2 In-sample analysis

In this section we present some estimation results using the full data sample (1954:Q2–2015:Q1). We use the Gibbs samplers in Appendix A to sample 10,000 draws with a thinning of 5, i.e. we keep one of every five draws, after discarding the first 2000 draws used for initial convergence (burn-in period). We evaluate the convergence of the proposed Markov Chain Monte Carlo (MCMC) algorithms using the inefficiency factors (IFs) metric proposed by Primiceri (2005). Table 1 presents a summary of the distribution of the inefficiency factors of the parameters’ posterior draws, while Figure 2 shows the IFs across all parameters. The empirical evidence reveals that the convergence of the Gibbs samplers is more than satisfactory with the majority of the IFs being well below the threshold value of 20 (see Primiceri 2005).^[9]
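For readers who want to reproduce this kind of diagnostic, the sketch below estimates an inefficiency factor, 1+2∑ρ_k, from a single chain of posterior draws. The truncation at 4% of the chain length combined with linearly declining (Bartlett) weights is one possible reading of a ‘4% tapered window’ and is an assumption of this sketch; the exact window used by the author may differ. The function name and the simulated chains are hypothetical.

```python
import numpy as np

def inefficiency_factor(chain, max_lag=None):
    """Estimate 1 + 2 * sum_k rho_k with a Bartlett-tapered, truncated sum.

    max_lag defaults to 4% of the chain length (an assumed interpretation).
    """
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    if max_lag is None:
        max_lag = max(1, int(0.04 * n))
    var = x @ x / n
    lags = np.arange(1, max_lag + 1)
    rho = np.array([x[:n - k] @ x[k:] / n for k in lags]) / var
    weights = 1.0 - lags / (max_lag + 1)   # linearly declining taper
    return 1.0 + 2.0 * np.sum(weights * rho)

rng = np.random.default_rng(0)
n = 20000

iid_chain = rng.normal(size=n)             # independent draws: IF close to 1

ar_chain = np.zeros(n)                     # AR(1) draws with coefficient 0.9,
for t in range(1, n):                      # for which (1+0.9)/(1-0.9) = 19
    ar_chain[t] = 0.9 * ar_chain[t - 1] + rng.normal()

if_iid = inefficiency_factor(iid_chain)
if_ar = inefficiency_factor(ar_chain)
print(if_iid, if_ar)
```

An IF near 1 indicates draws that are almost as informative as independent ones, whereas a strongly autocorrelated chain yields a much larger value.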

Table 1:

MCMC convergence diagnostics: summary of distribution of inefficiency factors.

Model	Median	Mean	Min	Max	10th percentile	90th percentile
Min-VS-SSP	1.09	1.38	1.00	7.08	1.00	2.13
Ridge-VS-SSP	1.15	2.63	1.00	27.06	1.00	6.39
Shrink-VS-SSP	1.55	3.06	1.00	15.15	1.00	6.55
SSVS-SSP	1.00	1.02	1.00	1.64	1.00	1.02
Min-SSVS-SSP	1.00	1.02	1.00	1.37	1.00	1.06
Ridge-SSVS-SSP	1.00	1.05	1.00	1.50	1.00	1.20

This table presents a summary of the distribution of inefficiency factors (IFs) for the posterior estimates of the parameters. Models are estimated using the full sample and variables in a non-gap form. Inefficiency factors are defined as the inverse of the relative numerical efficiency measure of Geweke (1992) and are calculated as 1+2∑_{k=1}^{∞}ρ_k, with ρ_k being the k-th autocorrelation of the posterior estimates chain. Following Primiceri (2005), we calculate IFs using a 4% tapered window for the estimation of spectral density at frequency zero. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. (prior)-VS-SSP is a steady-state BVAR(4) model with variable selection à la Korobilis (2013). (prior)-SSVS-SSP is a steady-state BVAR(4) model with stochastic search variable selection à la George, Sun, and Ni (2008). In SSVS-SSP we use the standard prior specification for the regression coefficients proposed in George, Sun, and Ni (2008).

Figure 2:

MCMC convergence diagnostics: inefficiency factors.

Notes: The figures show the inefficiency factors for all parameters across the proposed specifications using the full sample and variables in a non-gap form. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. (prior)-VS-SSP is a steady-state BVAR(4) model with variable selection à la Korobilis (2013). (prior)-SSVS-SSP is a steady-state BVAR(4) model with stochastic search variable selection à la George, Sun, and Ni (2008). In SSVS-SSP we use the standard prior specification for the regression coefficients proposed in George, Sun, and Ni (2008).

Table 2 presents the posterior means and standard deviations for the steady-states across different priors and VAR specifications. Overall, the results show that neither different priors nor parameter restrictions (i.e. models with variable selection) have a significant impact on steady-states estimation. In particular, the steady-state growth rate of GDP is estimated to be close to 3% ranging between 2.99% and 3.06%. The average CPI inflation across priors and specifications is close to 2.4% and 0.02% for the inflation in a gap form, while the steady-state of the Fed funds rate is estimated to be close to 3.85% (1.7% for the gap form).

Table 2:

Posterior means and standard deviations of the steady-states.

	VAR dynamic coefficients prior
	Ridge (no VS)	Ridge (with VS)	Minnesota (no VS)	Minnesota (with VS)	Shrink (no VS)	Shrink (with VS)	Standard SSVS	Minnesota SSVS	Ridge SSVS
Real GDP growth
Posterior mean	3.03	3.03	3.02	3.03	3.04	3.04	3.06	3.06	3.02
	(3.02)	(3.00)	(3.02)	(3.01)	(2.99)	(3.01)	(2.99)	(2.99)	(3.04)
St. deviation	0.30	0.29	0.29	0.29	0.27	0.27	0.30	0.29	0.27
	(0.26)	(0.27)	(0.27)	(0.27)	(0.25)	(0.25)	(0.27)	(0.26)	(0.16)
CPI Inflation
Posterior mean	2.40	2.40	2.42	2.42	2.39	2.39	2.42	2.39	2.42
	(0.01)	(0.00)	(0.00)	(0.03)	(0.03)	(0.02)	(0.03)	(0.03)	(0.02)
St. deviation	0.43	0.43	0.44	0.42	0.42	0.44	0.42	0.45	0.43
	(0.30)	(0.31)	(0.30)	(0.30)	(0.32)	(0.31)	(0.36)	(0.34)	(0.17)
Federal funds rate
Posterior mean	3.85	3.86	3.83	3.84	3.84	3.85	3.83	3.89	3.84
	(1.69)	(1.71)	(1.68)	(1.69)	(1.70)	(1.71)	(1.61)	(1.65)	(1.89)
St. deviation	0.58	0.57	0.58	0.56	0.57	0.58	0.55	0.55	0.57
	(0.51)	(0.50)	(0.51)	(0.50)	(0.52)	(0.53)	(0.46)	(0.46)	(0.44)

Models are estimated using the full sample period. Numbers in parentheses show the steady-state estimates using inflation and the Federal funds rate in gap form. ‘no VS’ denotes the models without variable selection while ‘with VS’ denotes the models with variable selection using the method of Korobilis (2013). SSVS denotes the stochastic search variable selection method of George, Sun, and Ni (2008).

In Table 3 we also present estimation results regarding the restriction parameters γ_j using the VS method. As pointed out in Korobilis (2013), the posterior mean of γ_j can be seen as an average probability of including the respective β_j parameter in the true model. Our intention is to examine whether the incorporation of steady-state priors in a restricted VAR model has a significant impact on the estimates of γ_j. Indeed, the results in Table 3 reveal that the differences in the γ_j estimates between models with and without steady-state priors are negligible, with the greatest divergences being observed among the models with different priors on the regression coefficients. We also see that the first own lag is always included in the model across variables and priors, as expected. Moreover, the estimation results with respect to the γ_j and ω_ji parameters of the SSVS method (not shown here due to space considerations) also show that the inclusion of the steady-state priors in an SSVS VAR model has an insignificant impact.

Table 3:

Posterior means of the γ vector elements using the method of Korobilis (2013).

	Ridge		Minnesota		Shrink
	Uninformative	Steady-state	Uninformative	Steady-state	Uninformative	Steady-state
Dependent variable: r_t
r_t–1	1.00	1.00	1.00	1.00	1.00	1.00
π_t–1	0.13	0.12	0.24	0.28	0.74	0.68
i_t–1	0.26	0.34	0.26	0.28	0.68	0.71
r_t–2	0.76	0.77	0.97	0.96	0.78	0.75
π_t–2	0.18	0.22	0.43	0.48	0.74	0.74
i_t–2	1.00	1.00	0.98	0.99	0.81	0.78
r_t–3	0.09	0.09	0.64	0.64	0.67	0.68
π_t–3	0.31	0.21	0.64	0.57	0.70	0.70
i_t–3	1.00	0.99	0.98	0.97	0.80	0.79
r_t–4	0.06	0.07	0.65	0.66	0.62	0.67
π_t–4	0.13	0.15	0.59	0.57	0.72	0.69
i_t–4	0.34	0.30	0.58	0.59	0.68	0.67
Dependent variable: π_t
r_t–1	0.06	0.07	0.34	0.37	0.62	0.70
π_t–1	1.00	1.00	1.00	1.00	1.00	1.00
i_t–1	1.00	1.00	1.00	1.00	0.98	0.96
r_t–2	0.14	0.07	0.69	0.57	0.67	0.66
π_t–2	0.09	0.10	0.61	0.66	0.68	0.65
i_t–2	0.86	0.96	0.88	0.96	0.78	0.87
r_t–3	0.05	0.06	0.55	0.59	0.62	0.67
π_t–3	1.00	1.00	1.00	1.00	1.00	1.00
i_t–3	0.40	0.23	0.71	0.63	0.83	0.76
r_t–4	0.16	0.27	0.83	0.86	0.69	0.72
π_t–4	0.08	0.07	0.69	0.71	0.61	0.64
i_t–4	0.21	0.20	0.61	0.59	0.71	0.69
Dependent variable: i_t
r_t–1	1.00	1.00	1.00	1.00	1.00	1.00
π_t–1	0.05	0.05	0.26	0.35	0.62	0.67
i_t–1	1.00	1.00	1.00	1.00	1.00	1.00
r_t–2	0.22	0.19	0.80	0.84	0.71	0.71
π_t–2	0.92	0.70	0.99	0.94	0.98	0.92
i_t–2	1.00	1.00	0.99	1.00	0.94	0.89
r_t–3	0.02	0.03	0.61	0.59	0.61	0.61
π_t–3	0.03	0.05	0.55	0.55	0.67	0.59
i_t–3	1.00	0.98	1.00	1.00	0.79	0.79
r_t–4	0.02	0.03	0.64	0.60	0.64	0.58
π_t–4	0.04	0.04	0.57	0.58	0.60	0.61
i_t–4	0.92	0.86	0.82	0.81	0.77	0.70

Models are estimated using the full sample and variables in a non-gap form. ‘Uninformative’ denotes models with uninformative priors on the steady-state.

3.3 Out-of-sample forecasting analysis

We evaluate the alternative VAR specifications using a forecasting horizon of 12 quarters (3 years), h=1, …, 12, and an out-of-sample period spanning from 1969:Q4 to 2015:Q1. The forecasts across all horizons are produced using a recursive forecasting scheme. This means that we use an initial sample (1954:Q2–1969:Q3) to generate forecasts from 1969:Q4 to 1972:Q3, i.e. 12 quarters ahead. Next, we allow the sample to expand and include one more period, i.e. 1954:Q2–1969:Q4, and generate forecasts from 1970:Q1 to 1972:Q4. This procedure is continued until the end of the sample period. As mentioned in Section 3.2, estimation and forecasting results are based on 10,000 posterior simulations (after a burn-in period of 2000 simulations), keeping one of every five draws, and forecasts are generated iteratively following Korobilis (2013, 215–216).
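The recursive (expanding-window) scheme can be sketched as below. To keep the example short it uses an OLS VAR(1) with an intercept as a stand-in for the Bayesian models, and simulated data instead of the actual series; all function names are hypothetical. The sample sizes follow from counting quarters in the ranges stated above: 244 observations in 1954:Q2–2015:Q1, of which the first 62 (1954:Q2–1969:Q3) form the initial sample.

```python
import numpy as np

def fit_var1_ols(Y):
    """OLS VAR(1) with intercept; returns (c, B1) with y_t = c + B1 y_{t-1} + e_t."""
    X = np.hstack([np.ones((Y.shape[0] - 1, 1)), Y[:-1]])
    coef, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
    return coef[0], coef[1:].T

def iterate(c, B1, y_last, horizon):
    """Iterated point forecasts for h = 1, ..., horizon."""
    out, y = [], y_last
    for _ in range(horizon):
        y = c + B1 @ y
        out.append(y)
    return np.array(out)

def recursive_forecasts(Y, first_origin, horizon=12):
    """Origin t is estimated on Y[:t] and forecasts rows t, ..., t+horizon-1."""
    results = {}
    for t in range(first_origin, Y.shape[0]):
        c, B1 = fit_var1_ols(Y[:t])          # sample grows by one row per origin
        results[t] = iterate(c, B1, Y[t - 1], horizon)
    return results

# Simulated stand-in data: T = 244 quarters, m = 3 variables.
rng = np.random.default_rng(0)
T, m = 244, 3
Y = np.zeros((T, m))
for t in range(1, T):
    Y[t] = 1.0 + 0.5 * Y[t - 1] + rng.normal(size=m)

forecasts = recursive_forecasts(Y, first_origin=62, horizon=12)
print(len(forecasts), forecasts[62].shape)
```

Forecasts made near the end of the sample extend beyond the last observation, so only the horizons with a realized value can enter a forecast evaluation.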

Before proceeding to the forecasting evaluation, we present the sequential GDP, inflation and interest rate out-of-sample forecasts along with the actual variables in Figures 3–5. The forecasts have been generated by five alternative models: a VAR(1) estimated with OLS, two VAR models with variable selection (VS and SSVS) and a Minnesota prior, and two steady-state VAR models with variable selection (VS and SSVS) and a Minnesota prior. Overall, the forecasts generated by the steady-state VARs tend to converge to the steady-state of each variable much faster than those of their counterparts, as expected. This phenomenon is particularly evident during the Great Inflation period, i.e. 1970–1980; take as an example the inflation forecasts generated by the models at the end of 1979, when the variable takes its maximum value. The models without a steady-state prior seem to over-estimate the steady-state levels and their forecasts drift away to even higher levels. Another example is the GDP forecasts during the recession periods of the 1970s and 1980s, during which the non-steady-state models seem to under-estimate the unconditional means of the processes. Another interesting point is that the differences in the forecast patterns between steady-state and non-steady-state models are mitigated when we use variables in a gap form (see the right columns of Figures 3–5), implying that these two distinct categories of models may also be closer in terms of forecasting evaluation in that case (see also the discussion in Section 3.3.2).

Figure 3:

Out-of-sample forecasts for real GDP growth (%).

Notes: This figure presents the annualized GDP growth out-of-sample forecasts for 12 quarters ahead (dotted lines) along with the actual values (solid line). The left (right) column presents the forecasts generated by models with inflation and interest rate variables specified in “non-gap” (“gap”) form. MIN denotes the Minnesota prior, VS and SSVS denote the variable selection and stochastic search variable selection as employed by Korobilis (2013) and George, Sun, and Ni (2008), respectively, and SSP denotes the steady-state prior. The forecasts are plotted every other quarter for clarity of presentation.

Figure 4:

Out-of-sample forecasts for CPI inflation (%).

Notes: This figure presents the annualized CPI inflation out-of-sample forecasts for 12 quarters ahead (dotted lines) along with the actual values (solid line). The left (right) column presents the forecasts generated by models with inflation and interest rate variables specified in “non-gap” (“gap”) form. MIN denotes the Minnesota prior, VS and SSVS denote the variable selection and stochastic search variable selection as employed by Korobilis (2013) and George, Sun, and Ni (2008), respectively, and SSP denotes the steady-state prior. The forecasts are plotted every other quarter for clarity of presentation.

Figure 5:

Out-of-sample forecasts for Federal Funds rate (%).

Notes: This figure presents the Federal Funds rate out-of-sample forecasts for 12 quarters ahead (dotted lines) along with the actual values (solid line). The left (right) column presents the forecasts generated by models with inflation and interest rate variables specified in “non-gap” (“gap”) form. MIN denotes the Minnesota prior, VS and SSVS denote the variable selection and stochastic search variable selection as employed by Korobilis (2013) and George, Sun, and Ni (2008), respectively, and SSP denotes the steady-state prior. The forecasts are plotted every other quarter for clarity of presentation.

Our main research interest is to examine whether the abovementioned forecasting behavior leads to superior forecasting ability for the models that incorporate both steady-state priors and variable selection techniques. To that end, we use two popular forecasting evaluation metrics, namely, the root mean squared forecast error (RMSFE) and the mean absolute forecast error (MAFE). Following the usual convention, we use relative forecasting evaluation metrics, i.e. we present the RMSFE and MAFE metrics of the various competing models as a proportion of the corresponding RMSFE and MAFE metrics of a random walk with drift (RW) process for the levels of the variables.

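$$\text{Relative RMSFE}_{ij}^{h}=\frac{\text{RMSFE}_{ij}^{h}}{\text{RMSFE}_{i,RW}^{h}},\qquad \text{Relative MAFE}_{ij}^{h}=\frac{\text{MAFE}_{ij}^{h}}{\text{MAFE}_{i,RW}^{h}}$$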

where h=1, …, 12 is the forecasting horizon, i is the variable of interest, i.e. GDP, inflation and interest rate, and j indexes the various competing models outlined in Section 3.1. Values below one indicate that the corresponding model outperforms the random walk process and vice versa.
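As a minimal sketch of how the relative metrics are formed for one variable, one model and one horizon, the following Python snippet divides a model's RMSFE and MAFE by those of the benchmark. The function names and the forecast errors are made up for illustration and are not taken from the paper's data.

```python
import math


def rmsfe(errors):
    """Root mean squared forecast error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mafe(errors):
    """Mean absolute forecast error."""
    return sum(abs(e) for e in errors) / len(errors)


def relative_metrics(model_errors, rw_errors):
    """Model metrics as a proportion of the random walk benchmark metrics."""
    return {
        "relative_rmsfe": rmsfe(model_errors) / rmsfe(rw_errors),
        "relative_mafe": mafe(model_errors) / mafe(rw_errors),
    }


# Made-up h-step-ahead forecast errors (illustration only).
model_errors = [0.3, -0.4, 0.0, 0.5]
rw_errors = [0.6, -0.8, 0.0, 1.0]
result = relative_metrics(model_errors, rw_errors)
print(result)
```

Because the hypothetical model errors are exactly half the benchmark errors, both ratios come out at about 0.5, i.e. below one, which is read as the model outperforming the random walk.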

3.3.1 Forecasting results using variables in a non-gap form

Figures 6 and 7 present the relative RMSFE and MAFE out-of-sample results for the (i) Standard, (ii) VS, (iii) SSP and (iv) VS-SSP models using variables in a non-gap form. Thus, Figures 6 and 7 examine the forecasting ability of the models that adopt the variable selection technique as implemented in Korobilis (2013). On the other hand, Figures 8 and 9 compare models that employ the stochastic search variable selection of George, Sun, and Ni (2008). In particular, Figures 8 and 9 present the relative RMSFE and MAFE out-of-sample results for the (i) Standard, (iii) SSP, (v) SSVS and (vi) SSVS-SSP models.^[10] The forecasting results from an OLS VAR(1) are always included in the figures for comparison, while each row of a figure corresponds to a different prior specification for the dynamic coefficients.^[11]

Figure 6:

Relative RMSFE results for models using variable selection and variables in a non-gap form.

Notes: The figures show the relative RMSFE metric as a function of the forecasting horizon for three alternative priors for the VAR dynamic coefficients, i.e. Minnesota, Ridge and Shrink priors. OLS is a standard VAR(1) estimated using ordinary least squares. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with steady-state prior and VS-SSP is a steady-state BVAR(4) model with variable selection. Inflation and Federal Funds rate are specified in a non-gap form.

Figure 7:

Relative MAFE results for models using variable selection and variables in a non-gap form.

Notes: The figures show the relative MAFE metric as a function of the forecasting horizon for three alternative priors for the VAR dynamic coefficients, i.e. Minnesota, Ridge and Shrink priors. OLS is a standard VAR(1) estimated using ordinary least squares. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with steady-state prior and VS-SSP is a steady-state BVAR(4) model with variable selection. Inflation and Federal Funds rate are specified in a non-gap form.

Figure 8:

Relative RMSFE results for models using stochastic search variable selection and variables in a non-gap form.

Notes: The figures show the relative RMSFE metric as a function of the forecasting horizon for three alternative priors for the VAR dynamic coefficients, i.e. Standard, Minnesota and Ridge priors. Standard prior is specified as in George, Sun, and Ni (2008). OLS is a standard VAR(1) estimated using ordinary least squares. SSVS is a BVAR(4) model with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with steady-state prior and stochastic search variable selection. Inflation and Federal Funds rate are specified in a non-gap form.

Figure 9:

Relative MAFE results for models using stochastic search variable selection and variables in a non-gap form.

Notes: The figures show the relative MAFE metric as a function of the forecasting horizon for three alternative priors for the VAR dynamic coefficients, i.e. Standard, Minnesota and Ridge priors. Standard prior is specified as in George, Sun, and Ni (2008). OLS is a standard VAR(1) estimated using ordinary least squares. SSVS is a BVAR(4) model with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with steady-state prior and stochastic search variable selection. Inflation and Federal Funds rate are specified in a non-gap form.

Beginning with Figures 6 and 7, the empirical findings indicate that the proposed model, i.e. the steady-state VAR model with variable selection (VS-SSP) (purple dashed line), is the best performing model in the majority of the cases examined here, followed closely by the steady-state VAR (SSP) (red solid line). In some cases the differences between the two models are negligible, e.g. see the Inflation forecasts for the Minnesota and Shrink priors; however, we should note that the SSP model clearly outperforms the proposed model only in the case of the interest rate variable for the Minnesota and Ridge priors and for horizons greater than six (6) and ten (10), respectively. These results also align with other studies that highlight the contribution of steady-state priors to the accuracy of macroeconomic forecasts (Beechey and Österholm 2008; Villani 2009; Clark 2011; Wright 2013).

Moreover, it is also evident that the degree of improvement for the VS-SSP over the SSP model clearly depends on the prior used for the dynamic coefficients and the level of information it carries (see also Korobilis 2013). Thus, the improvement is substantial when we use the relatively uninformative ridge prior (the figures in the middle row) and marginal or even negligible when we use priors that are more informative (Shrink or Minnesota priors), as expected.

As regards the rest of the models, their ranking for the GDP and Inflation variables is qualitatively similar, with the VS-SSP and SSP models being followed by the VS (green dashed line) and Standard (blue solid line) Bayesian VAR models. Again, we find that, overall, variable selection improves forecasting accuracy mainly when it is combined with less informative priors, while the OLS model is consistently the worst performer. By contrast, the forecasting results for the interest rates reveal that steady-state VAR models with or without variable selection (SSP or VS-SSP) and the OLS model are the main competitors, since they produce overall the best forecasts. In particular, Bayesian models (SSP or VS-SSP) outperform the OLS model for forecasting horizons longer than four or five quarters, across evaluation metrics. In line with the literature, a RW process also provides good short-term forecasts for the interest rates (e.g. see Villani 2009; Banbura, Giannone, and Reichlin 2010; Korobilis 2013; Giannone, Lenza, and Primiceri 2015).^[12]

We now turn to the evaluation of the models that adopt the stochastic search variable selection of George, Sun, and Ni (2008) and are presented in Figures 8 and 9. The overall picture is more or less the same as the one presented in Figures 6 and 7, with the proposed specification, i.e. the SSVS-SSP model, being the overall best performing model. In particular, the SSVS-SSP is rarely outranked by its competitors, with the most striking exceptions being the GDP and the interest rate forecasts for horizons greater than eight (8) and smaller than four (4) quarters, where the proposed model is usually beaten by the SSVS and the OLS models, respectively.

Moreover, in line with the findings of Koop (2013), we also find that the SSVS model tends to improve forecasting accuracy over the OLS and Standard Bayesian models for most of the cases examined here. Furthermore, the general ranking of the models is not that different from the one reported for Figures 6 and 7, with the most remarkable exception being the interest rate variable, where the SSVS model consistently outranks the Standard BVAR model (while the corresponding VS model in Figures 6 and 7 does not). Finally, a visual inspection of Figures 6–9 also reveals that, in general, SSVS models offer greater forecasting improvements compared to the VS models. A formal analysis of this issue as well as a comparison among all alternative specifications is provided in Section 3.3.3.

3.3.2 Forecasting results using variables in a gap form

This section discusses the forecasting results produced by VAR models that employ the inflation and interest rate variables in gap form as described in Section 3. Figures 10 and 11 present the forecasting results for the relative RMSFE and MAFE evaluation metrics for the models using the variable selection technique à la Korobilis (2013), while Figures 12 and 13 present the forecasting results for the models using the variable selection technique à la George, Sun, and Ni (2008).

Figure 10:

Relative RMSFE results for models using variable selection and variables in a gap form.

Figure 11:

Relative MAFE results for models using variable selection and variables in a gap form.

Figure 12:

Relative RMSFE results for models using stochastic search variable selection and variables in a gap form.

Figure 13:

Relative MAFE results for models using stochastic search variable selection and variables in a gap form.

The overall picture remains relatively unchanged with respect to the results presented in Figures 6–9 for the models using variables in a non-gap form. Again, the steady-state VARs combined with both types of variable selection can lead to forecasting improvements across variables, evaluation metrics and priors. In addition, we observe that the adoption of variable selection techniques leads to significant forecasting improvements only in the case of relatively uninformative priors (e.g. the ridge prior), as expected, while the incorporation of steady-state priors is also usually beneficial for forecasting accuracy.

Nonetheless, we should also note that the difference in the forecasting accuracy between the steady-state models and the models with uninformative steady-state priors has been mitigated, especially for the inflation and interest rate variables, compared to the models using non-gap variables (see Figures 6–9). This probably reflects the forecasting patterns presented in the right column of Figures 3–5 and shows that the misestimation of steady-state levels by the Bayesian models with uninformative steady-state priors is not that severe compared to the models using non-gap variables (left column of Figures 3–5).

A possible explanation is that in the case of models with variables in a gap form, the two components of the variables, i.e. the smoothed and the gap component, are modeled separately. In particular, since the low-frequency component forecasts are common across all models, the potential improvement for the steady-state models over the models with uninformative steady-state comes exclusively from the gap component of the variable. This is obviously a smaller portion of the final number to be forecasted and consequently, the prospective improvement would also be analogously smaller.

3.3.3 Model confidence set results

In this section, we employ the Model Confidence Set (MCS) method of Hansen, Lunde, and Nason (2003, 2011) and we construct a set of models, $M^{*}_{1-a}\subseteq M_0$, that present statistically superior predictive ability at a given confidence level. The scope of this analysis is twofold: first, based on the RMSFE and MAFE metrics, we identify the set of models that generate statistically significantly superior forecasts and second, we evaluate the forecasting ability of the full set of models across priors and specifications. Next, we briefly describe the MCS methodology.

Assuming an initial set of $M=M_0$ models, the MCS method is based on a specific loss function, $L_{m,t}$ with $m=1,\dots,M$, and applies an iterative process of sequential Equal Predictive Ability (EPA) tests of the form:

$$H_{0,M_0}: E(d_{mk,t})=0 \quad \text{for all } m,k\in M \tag{16}$$

where $d_{mk,t}=L_{m,t}-L_{k,t}$ is the loss differential between models m and k and $L_{\bullet,t}$ is one of the RMSFE or MAFE at each point in time, t. A rejection of the null hypothesis indicates that a model has inferior predictive ability and should not be included in the MCS at the a significance level. The EPA test in (16) is repeated for the remaining $M_{1-a}$ models, with $M_{1-a}\subset M$, and this procedure continues until the null hypothesis cannot be rejected. The final set of surviving models forms the MCS at a 1–a confidence level, denoted by $M^{*}_{1-a}$. The models included in the MCS have equal predictive ability, but they outperform the eliminated models, while the MCS p-values indicate the probability of a model being a member of the MCS.^[13]
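To make the loss differentials concrete, the sketch below computes the average differential between every pair of models and flags the model with the largest average loss relative to the others. It is a deliberately simplified illustration: the bootstrap-based test statistic and p-values of the MCS procedure are omitted, the unstandardized ranking is a stand-in for the actual elimination rule, and the model names and losses are made up.

```python
def mean_loss_differentials(losses):
    """losses maps a model name to its per-period losses L_{m,t}.

    Returns d_bar[m][k], the time average of d_{mk,t} = L_{m,t} - L_{k,t}.
    """
    names = list(losses)
    n_periods = len(losses[names[0]])
    return {
        m: {
            k: sum(a - b for a, b in zip(losses[m], losses[k])) / n_periods
            for k in names
        }
        for m in names
    }


def worst_relative_model(losses):
    """Model whose loss is, on average, largest relative to all models in the set."""
    d_bar = mean_loss_differentials(losses)
    average = {m: sum(row.values()) / len(row) for m, row in d_bar.items()}
    return max(average, key=average.get)


# Made-up squared forecast errors for three hypothetical models.
losses = {
    "A": [1.0, 2.0, 3.0],
    "B": [1.0, 1.0, 1.0],
    "C": [3.0, 3.0, 3.0],
}
d_bar = mean_loss_differentials(losses)
print(d_bar["A"]["B"], worst_relative_model(losses))
```

In a full implementation, a statistic built from these averages would be compared against its bootstrap distribution before any model is dropped from the set.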

Table 4 presents only a synopsis of the main MCS results for four selected forecasting horizons, i.e. h=1, 4, 8 and 12, while we refer the interested reader to Tables C.1–C.4 of Appendix C for a more detailed analysis. The first and second columns for each of the variables in Table 4 present the percentage of times a model is included in the MCS at the 10% significance level across the selected forecasting horizons and evaluation metrics for models using variables in a gap and non-gap form, respectively. The third column repeats the MCS results across all (gap and non-gap) models.^[14] The last three columns of Table 4, named Avg., show the average percentage of times a model is included in the MCS across all three variables. The initial set includes all 20 models: the RW and OLS models, and all 18 BVARs.

Table 4:

Synopsis of the Model confidence set (MCS) results.

Models\variables		GDP			CPI			FFR			Avg.
Models\variables		Gap	Non-gap	Total	Gap	Non-gap	Total	Gap	Non-gap	Total	Gap	Non-gap	Total
1.	RW (benchmark)	50	75	63	0	0	0	25	50	38	25	42	34
2.	OLS	25	25	25	13	0	6	50	38	44	29	21	25
3.	Min-Standard	38	13	25	25	13	19	25	38	31	29	21	25
4.	Min-VS	50	38	44	25	25	25	38	38	38	38	34	36
5.	Min-SSP	63	63	63	50	50	50	50	50	50	54	54	54
6.	Min-VS-SSP	88	100	94	38	50	44	75	50	63	67	67	67
7.	Ridge-standard	25	13	19	25	13	19	0	38	19	17	21	19
8.	Ridge-VS	63	38	50	25	25	25	25	0	13	38	21	29
9.	Ridge-SSP	50	25	38	50	38	44	25	38	31	42	34	38
10.	Ridge-VS-SSP	88	100	94	100	63	81	50	50	50	79	71	75
11.	Shrink-standard	25	38	31	13	13	13	25	13	19	21	21	21
12.	Shrink-VS	25	38	31	13	13	13	38	13	25	25	21	23
13.	Shrink-SSP	50	63	56	50	38	44	75	50	63	58	50	54
14.	Shrink-VS-SSP	38	100	69	63	38	50	75	75	75	59	71	65
15.	SSVS	50	50	50	13	50	31	75	63	69	46	54	50
16.	SSVS-SSP	100	25	63	50	100	75	0	88	44	50	71	61
17.	Min-SSVS	75	63	69	63	13	38	38	25	31	59	34	46
18.	Min-SSVS-SSP	100	75	88	63	100	81	63	75	69	75	83	79
19.	Ridge-SSVS	63	50	56	13	13	13	88	38	63	55	34	44
20.	Ridge-SSVS-SSP	88	100	94	38	88	63	50	88	69	59	92	75

This table shows the percentage (%) of times a model is included in the MCS at the 10% significance level across evaluation metrics and forecasting horizons. Bold faced numbers indicate the models with the highest (%) of times in the MCS. RW is a random walk model with drift in levels. OLS is a standard VAR(1) estimated using ordinary least squares. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with a steady-state prior, VS-SSP is a steady-state BVAR(4) model with variable selection, SSVS is a BVAR(4) with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with steady-state prior and stochastic search variable selection. For the SSVS (15) and SSVS-SSP (16) models the prior on the dynamic coefficients is that of George, Sun, and Ni (2008).

In general, the synopsis of the MCS results in Table 4 confirms the graphical evidence presented in Sections 3.3.1 and 3.3.2, but also allows for several other comparisons among models that were not feasible before. The main conclusions that emerge from Table 4 are discussed below.^[15]

First, except for the interest rate variable for models with gap variables, in all other cases the best performing model (denoted with bold faced entries), i.e. the model that maximizes the percentage of times included in the MCS, is a steady-state model with some kind of variable selection (VS or SSVS). Thus, we can state that, in general, the proposed specification which incorporates steady-state prior beliefs in a VAR model with Bayesian variable selection enhances the macroeconomic forecasting accuracy of small-scale VARs irrespective of the prior used on the dynamic parameters.

Second, the empirical evidence also confirms that, on average, there is an unambiguous ranking of the various Bayesian VAR models (see Avg. columns).^[16] Particularly, Standard BVARs are usually outranked by models with variable selection which, in turn, are consistently outperformed by BVARs with an informative steady-state prior. Of course, as already mentioned, models that combine both steady-state priors and variable selection outrank all the above mentioned specifications since the percentage of times included in the MCS is consistently higher compared to other specifications.

Third, the synopsis of the results presented in Table 4 gives us the opportunity to compare the two alternative variable selection techniques, i.e. the variable selection as implemented in Korobilis (2013) and the stochastic search variable selection of George, Sun, and Ni (2008). Overall, the SSVS models tend to outperform the VS models irrespective of whether they incorporate a steady-state prior or not. More specifically, the average percentage of times a VS or a VS-SSP model is part of the MCS is 29% and 69%, while the corresponding averages for the SSVS and SSVS-SSP models are 47% and 72%, respectively.
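The four averages quoted above can be reproduced from the last column of Table 4 (Avg., Total). The short Python check below groups the models by specification and averages their entries; the grouping into four families is our reading of the comparison and is stated here as an assumption.

```python
# Entries of the "Avg., Total" column of Table 4 (percent of times in the MCS).
avg_total = {
    "Min-VS": 36, "Ridge-VS": 29, "Shrink-VS": 23,
    "Min-VS-SSP": 67, "Ridge-VS-SSP": 75, "Shrink-VS-SSP": 65,
    "SSVS": 50, "Min-SSVS": 46, "Ridge-SSVS": 44,
    "SSVS-SSP": 61, "Min-SSVS-SSP": 79, "Ridge-SSVS-SSP": 75,
}

# Assumed grouping of the models into the four families being compared.
groups = {
    "VS": ["Min-VS", "Ridge-VS", "Shrink-VS"],
    "VS-SSP": ["Min-VS-SSP", "Ridge-VS-SSP", "Shrink-VS-SSP"],
    "SSVS": ["SSVS", "Min-SSVS", "Ridge-SSVS"],
    "SSVS-SSP": ["SSVS-SSP", "Min-SSVS-SSP", "Ridge-SSVS-SSP"],
}

group_means = {
    name: round(sum(avg_total[m] for m in members) / len(members))
    for name, members in groups.items()
}
print(group_means)  # {'VS': 29, 'VS-SSP': 69, 'SSVS': 47, 'SSVS-SSP': 72}
```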

Fourth, evidence presented in Tables C.1–C.4 shows that controlling for the slowly moving local mean for inflation is beneficial for forecasting purposes, as has been repeatedly noted in the extant literature (e.g. see Faust and Wright 2013 and references therein). Indeed, the average relative RMSFE and MAFE metrics for inflation and interest rate are lower for models with variables in a gap form relative to models with raw variables. However, the difference in the forecasting accuracy for gap and non-gap models is greater when we use models with uninformative steady-state priors.

Finally, the overall best performing model is the SSVS-SSP model combined with the Minnesota prior which is included in the MCS in 79% of cases (see column Avg., Total) followed closely by the SSVS-SSP and VS-SSP models with a Ridge prior, which are both part of the MCS in 75% of cases. Therefore, empirical evidence also suggests that the simple and relatively uninformative ridge prior can also provide accurate forecasts when combined with variable selection techniques, compared to the more informative Minnesota or Shrink priors.

4 Conclusions

Empirical evidence in the extant literature has highlighted the importance of steady-state prior beliefs in Bayesian VAR forecasting. Moreover, Bayesian variable selection techniques have also been suggested as an efficient and automatic way of VAR shrinkage and model parsimony with beneficial effects on forecasting accuracy. In this paper, we propose Gibbs sampler algorithms for estimating Bayesian VAR models that efficiently combine these two promising VAR specifications: informative priors on the steady state of the system and parameter restrictions based on Bayesian variable selection methods. We evaluate the proposed specification in terms of an out-of-sample forecasting exercise using three major US macroeconomic variables and we find that it clearly outperforms alternative VAR models that encapsulate only one (or none) of the abovementioned specifications. Empirical evidence also suggests that these results are robust against alternative priors on dynamic parameters that carry different degrees of information regarding VAR shrinkage.

Corresponding author: Dimitrios P. Louzis, Bank of Greece – Economic Analysis and Research Department, Athens, Greece, dlouzis@bankofgreece.gr

Acknowledgments

I gratefully acknowledge two anonymous reviewers, Bruce Mizrach (the Editor-in-Chief), Gary Koop (the Associated Editor) and Heather Gibson for their constructive and insightful comments and suggestions that considerably improved this article.

Disclaimer: The views expressed in this article do not necessarily represent those of the Bank of Greece.

Appendix A: Posterior inference in the steady-state VARs with Bayesian variable selection

As already mentioned, conditional on the steady-state parameters, φ, the VAR models described in Section 2 are standard VAR models with (stochastic search) variable selection for the mean-adjusted series $\tilde{y}_t$. Therefore, the first five and three blocks of the Gibbs samplers presented here reproduce the George, Sun, and Ni (2008) and Korobilis (2013) algorithms for the mean-adjusted series, respectively. The last extra block samples the steady-state parameters, φ, conditional on all other parameters, from a normal posterior density. Conditional on β and γ (or equivalently θ for the variable selection method) the model is a standard steady-state VAR model and can be written in a form so that we can apply Villani’s (2009) methodology.

Stochastic search variable selection à la George, Sun, and Ni (2008)

Next, we derive the full conditional posteriors of the six-block Gibbs sampler algorithm for the steady-state VAR with stochastic search variable selection, which employs the priors of Section 2.1. The notation used in this appendix is identical to the one used in Section 2.1. More specifically, the Markov Chain Monte Carlo (MCMC) algorithm comprises sequential drawing from the posterior distribution using the following steps.

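Step 1. Draw $\psi_{jj}^{2}$, $j=1,\dots,m$, independently, from the following Gamma density:
$$\psi_{jj}^{2}\mid\beta,\varphi,\eta;Y\sim\text{gamma}(\bar{a}_j,\bar{b}_j) \tag{A.1}$$
where $\bar{a}_j=\underline{a}_j+\tfrac{1}{2}T$ and $\bar{b}_j$ is given by:
$$\bar{b}_j=\begin{cases}\underline{b}_1+\tfrac{1}{2}v_{11} & \text{if } j=1\\ \underline{b}_j+\tfrac{1}{2}\left\{v_{jj}-v_j'\left[V_{j-1}+(D_jD_j)^{-1}\right]^{-1}v_j\right\} & \text{if } j=2,\dots,m\end{cases}$$
where $V=(\tilde{Y}-\tilde{X}B)'(\tilde{Y}-\tilde{X}B)$ with elements $v_{ij}$, $V_j$ is the upper left $j\times j$ block of $V$ and $v_j=(v_{1j},\dots,v_{j-1,j})'$.

Step 2. Draw $\eta_i$, $i=2,\dots,m$, independently of one another, from the following $(i-1)$-dimensional multivariate Normal distribution:
$$\eta_i\mid\beta,\omega,\varphi,\psi;Y\sim N_{i-1}(\bar{h}_i,\bar{\Delta}_i) \tag{A.2}$$
where $\bar{\Delta}_i=\left[V_{i-1}+(D_iD_i)^{-1}\right]^{-1}$ and $\bar{h}_i=-\psi_{ii}\bar{\Delta}_iv_i$.

Step 3. Draw $\omega_{ji}$, $i=2,\dots,m$ and $j=1,\dots,i-1$, independently of one another, from the following Bernoulli distribution:
$$\omega_{ji}\mid\omega_{\backslash -ji},\beta,\varphi,\psi;Y\sim\text{Bernoulli}(\bar{q}_{ji}) \tag{A.3}$$
where
$$\bar{q}_{ji}=\frac{\frac{1}{\kappa_{0ji}}\exp\left(-\frac{\eta_{ji}^{2}}{2\kappa_{0ji}^{2}}\right)\underline{q}_{ji}}{\frac{1}{\kappa_{0ji}}\exp\left(-\frac{\eta_{ji}^{2}}{2\kappa_{0ji}^{2}}\right)\underline{q}_{ji}+\frac{1}{\kappa_{1ji}}\exp\left(-\frac{\eta_{ji}^{2}}{2\kappa_{1ji}^{2}}\right)(1-\underline{q}_{ji})}.$$

Step 4. Draw β from the following n-dimensional multivariate Normal distribution:
$$\beta\mid\gamma,\varphi,\Psi;Y\sim N_n(\bar{\beta},\bar{V}) \tag{A.4}$$
where $\bar{\beta}=\bar{V}\left[(DD)^{-1}\underline{b}+\text{vec}(\tilde{X}'\tilde{Y}\Sigma^{-1})\right]$ and $\bar{V}=\left[\Sigma^{-1}\otimes(\tilde{X}'\tilde{X})+(DD)^{-1}\right]^{-1}$.

Step 5. Draw $\gamma_j$, $j=1,\dots,n$, independently of one another, from the following Bernoulli distribution:
$$\gamma_j\mid\gamma_{\backslash -j},\beta,\varphi,\Psi;Y\sim\text{Bernoulli}(\bar{\pi}_j) \tag{A.5}$$
where
$$\bar{\pi}_j=\frac{\frac{1}{\tau_{0j}}\exp\left(-\frac{\beta_j^{2}}{2\tau_{0j}^{2}}\right)\underline{\pi}_j}{\frac{1}{\tau_{0j}}\exp\left(-\frac{\beta_j^{2}}{2\tau_{0j}^{2}}\right)\underline{\pi}_j+\frac{1}{\tau_{1j}}\exp\left(-\frac{\beta_j^{2}}{2\tau_{1j}^{2}}\right)(1-\underline{\pi}_j)}.$$

Step 6. Draw the steady-state coefficients φ from the following mq-dimensional multivariate Normal density:
$$\varphi\mid\beta,\Psi;Y\sim N_{mq}(\bar{b}_{\varphi},\bar{V}_{\varphi}) \tag{A.6}$$
For this step the VAR model conditional on β is written as follows:
$$(y_t-\varphi d_t)=B_1(y_{t-1}-\varphi d_{t-1})+\dots+B_p(y_{t-p}-\varphi d_{t-p})+\varepsilon_t \tag{A.7}$$
Rearranging the terms in Eq. (A.7) we get:
$$B(L)y_t=B(L)\varphi d_t+\varepsilon_t=\varphi d_t-B_1\varphi d_{t-1}-\dots-B_p\varphi d_{t-p}+\varepsilon_t \tag{A.8}$$
where $B(L)=I_m-B_1L-\dots-B_pL^{p}$. Following Villani (2009) we define $Y_{\varphi}$ as a $T\times m$ matrix with its t-th row being $Y_{\varphi t}=(B(L)y_t)'$, $D$ as a $T\times(p+1)q$ matrix whose t-th row is given by $D_t=(d_t',-d_{t-1}',\dots,-d_{t-p}')$ and $\Lambda=[\varphi,B_1\varphi,\dots,B_p\varphi]'$ as the $(p+1)q\times m$ matrix of coefficients. Therefore, the model in (A.8) can be written as $Y_{\varphi}=D\Lambda+E$ and standard results for multivariate regressions can be applied (see Villani 2009, 646–647). Thus, given that $\text{vec}(\Lambda')=U\text{vec}(\varphi)$ with
$$U=\begin{pmatrix}I_{mq}\\ I_q\otimes B_1\\ \vdots\\ I_q\otimes B_p\end{pmatrix}$$
we define the mean, $\bar{b}_{\varphi}$, and variance, $\bar{V}_{\varphi}$, of the posterior distribution in (A.6) as
$\bar{b}_{\varphi}=\bar{V}_{\varphi}\left(U'\text{vec}(\Sigma^{-1}Y_{\varphi}'D)+\underline{V}_{\varphi}^{-1}\underline{b}_{\varphi}\right)$ and $\bar{V}_{\varphi}^{-1}=U'(D'D\otimes\Sigma^{-1})U+\underline{V}_{\varphi}^{-1}$, respectively.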
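The Bernoulli probabilities in (A.3) and (A.5) share the same form: the posterior weight of one component of a two-component zero-mean Normal mixture. A minimal Python sketch of that calculation is given below. The function names are hypothetical, the "first" component is simply the one that appears in the numerator as the formulas are written above (κ₀ or τ₀ with its prior weight), and the computation is done in logs purely as a numerical-stability choice of this sketch.

```python
import math


def log_normal_kernel(x, sd):
    """Log of (1 / sd) * exp(-x^2 / (2 sd^2)); constants common to both components drop out."""
    return -math.log(sd) - 0.5 * (x / sd) ** 2


def first_component_probability(x, sd_first, sd_second, prior_first):
    """Posterior weight of the first mixture component given the coefficient value x.

    prior_first must lie strictly between 0 and 1.
    """
    log_a = math.log(prior_first) + log_normal_kernel(x, sd_first)
    log_b = math.log(1.0 - prior_first) + log_normal_kernel(x, sd_second)
    top = max(log_a, log_b)  # subtract the maximum before exponentiating
    w_a = math.exp(log_a - top)
    w_b = math.exp(log_b - top)
    return w_a / (w_a + w_b)


# Illustrative values only: a coefficient close to zero favours the tighter component.
print(first_component_probability(0.0, 0.1, 1.0, 0.5))
print(first_component_probability(3.0, 0.1, 1.0, 0.5))
```

Drawing the indicator then amounts to a single Bernoulli draw with this probability, repeated for each coefficient.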

Variable selection à la Korobilis (2013)

We follow Korobilis (2013) and re-write the VAR model in a vectorized form to exploit the computational efficiency of matrix multiplications. To this end, we define $\tilde{Y}$ as a $T\times m$ matrix with its t-th row being $\tilde{Y}_t=\tilde{y}_t'=(\tilde{y}_{1t},\tilde{y}_{2t},\dots,\tilde{y}_{mt})$, $\tilde{X}$ as a $T\times k$ matrix with its t-th row being $\tilde{X}_t=\tilde{x}_t$ and $E$ as a $T\times m$ matrix with its t-th row being $E_t=\varepsilon_t'=(\varepsilon_{1t},\varepsilon_{2t},\dots,\varepsilon_{mt})$. Then, $\tilde{y}=\text{vec}(\tilde{Y})$ and $\varepsilon=\text{vec}(E)$ are $Tm\times1$ vectors, with vec(∘) being the standard operator which stacks the columns of a matrix. The VAR model can then be written as

$$\underset{(Tm\times1)}{\tilde{y}}=\underset{(Tm\times n)}{\tilde{Z}}\,\underset{(n\times1)}{\theta}+\underset{(Tm\times1)}{\varepsilon} \tag{A.9}$$

where $\tilde{Z}=I_m\otimes\tilde{X}$ is a $Tm\times n$ block diagonal matrix with $\tilde{X}$ repeated on its main diagonal and $\theta=\Gamma\beta$ is the $n\times1$ vector of coefficients with $\beta=\text{vec}(B)$. For ease of reading, recall that m is the number of variables, k=mp is the number of explanatory variables in each VAR equation, and n=mk=m²p is the total number of VAR coefficients. Also recall that $\tilde{y}_t=y_t-\varphi d_t$ and $\tilde{x}_t=(\tilde{y}_{t-1}',\dots,\tilde{y}_{t-p}')$ are the mean-adjusted time-series of y_t and x_t, respectively.

The full conditional posteriors for the steady-state VAR with variable selection using the priors specified in Section 2.2 are given below. More specifically, the four-block Gibbs sampler algorithm includes the following steps:

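Step 1. Draw the slope coefficients β from the following n-dimensional multivariate Normal density:
$$\beta\mid\gamma,\varphi,\Sigma;y,Z\sim N_n(\bar{\beta},\bar{V}) \tag{A.10}$$
where $\bar{V}=\left(\underline{V}^{-1}+\tilde{Z}^{*\prime}(\Sigma^{-1}\otimes I_T)\tilde{Z}^{*}\right)^{-1}$, $\bar{\beta}=\bar{V}\left(\underline{V}^{-1}\underline{b}+\tilde{Z}^{*\prime}(\Sigma^{-1}\otimes I_T)\tilde{y}\right)$ and $\tilde{Z}^{*}=\tilde{Z}\Gamma$.

Step 2. Draw $\gamma_j$ in random order j, with $j=1,\dots,n$, from
$$\gamma_j\mid\gamma_{\backslash -j},\beta,\varphi,\Sigma;y,Z\sim\text{Bernoulli}(\bar{\pi}_j) \tag{A.11}$$
where $\bar{\pi}_j=l_{0j}/(l_{0j}+l_{1j})$, $l_{0j}=\exp\left(-0.5(\tilde{y}-\tilde{Z}\theta^{*})'(\Sigma^{-1}\otimes I_T)(\tilde{y}-\tilde{Z}\theta^{*})\right)\pi_j$ and $l_{1j}=\exp\left(-0.5(\tilde{y}-\tilde{Z}\theta^{**})'(\Sigma^{-1}\otimes I_T)(\tilde{y}-\tilde{Z}\theta^{**})\right)(1-\pi_j)$, with $\theta^{*}$ and $\theta^{**}$ being equal to θ but with their j-th element being equal to $\beta_j$ and 0, respectively.

Step 3. Draw $\Sigma^{-1}$ from
$$\Sigma^{-1}\mid\beta,\gamma,\varphi;y,Z\sim\text{Wishart}(T,S^{-1}) \tag{A.12}$$
For this step, re-write the VAR as
$$\underset{(T\times m)}{\tilde{Y}}=\underset{(T\times k)}{\tilde{X}}\,\underset{(k\times m)}{\Theta}+\underset{(T\times m)}{E} \tag{A.13}$$
where Θ is a $k\times m$ matrix whose ij-th element is given by $\Theta_{ij}=\theta_{(j-1)k+i}$ for $i=1,\dots,k$ and $j=1,\dots,m$. Then, S is given by $S=E'E=(\tilde{Y}-\tilde{X}\Theta)'(\tilde{Y}-\tilde{X}\Theta)$.

Step 4. Draw the steady-state coefficients φ from the following mq-dimensional multivariate Normal density:
$$\varphi\mid\beta,\gamma,\Sigma;y,Z\sim N_{mq}(\bar{b}_{\varphi},\bar{V}_{\varphi}) \tag{A.14}$$
For this step the VAR model conditional on β and γ (or equivalently θ) is written as follows:
$$(y_t-\varphi d_t)=\Theta_1(y_{t-1}-\varphi d_{t-1})+\dots+\Theta_p(y_{t-p}-\varphi d_{t-p})+\varepsilon_t \tag{A.15}$$
where $\Theta_l$, $l=1,\dots,p$, is an $m\times m$ matrix of coefficients whose ij-th element is given by $\Theta_{l,ij}=\Theta_{(l-1)m+j,\,i}$ for $l=1,\dots,p$ and $i,j=1,\dots,m$. Rearranging the terms in Eq. (A.15) we get
$$\Theta(L)y_t=\Theta(L)\varphi d_t+\varepsilon_t=\varphi d_t-\Theta_1\varphi d_{t-1}-\dots-\Theta_p\varphi d_{t-p}+\varepsilon_t \tag{A.16}$$
where $\Theta(L)=I_m-\Theta_1L-\dots-\Theta_pL^{p}$. Following Villani (2009) we define $Y_{\varphi}$ as a $T\times m$ matrix with its t-th row being $Y_{\varphi t}=(\Theta(L)y_t)'$, $D$ as a $T\times(p+1)q$ matrix whose t-th row is given by $D_t=(d_t',-d_{t-1}',\dots,-d_{t-p}')$ and $\Lambda=[\varphi,\Theta_1\varphi,\dots,\Theta_p\varphi]'$ as the $(p+1)q\times m$ matrix of coefficients. Therefore, the model in Eq. (A.16) can be written as $Y_{\varphi}=D\Lambda+E$ and standard results for multivariate regressions can be applied (see Villani 2009, 646–647). Thus, given that $\text{vec}(\Lambda')=U\text{vec}(\varphi)$ with
$$U=\begin{pmatrix}I_{mq}\\ I_q\otimes\Theta_1\\ \vdots\\ I_q\otimes\Theta_p\end{pmatrix}$$
we define the mean, $\bar{b}_{\varphi}$, and variance, $\bar{V}_{\varphi}$, of the posterior distribution in (A.14) as $\bar{b}_{\varphi}=\bar{V}_{\varphi}\left(U'\text{vec}(\Sigma^{-1}Y_{\varphi}'D)+\underline{V}_{\varphi}^{-1}\underline{b}_{\varphi}\right)$ and $\bar{V}_{\varphi}^{-1}=U'(D'D\otimes\Sigma^{-1})U+\underline{V}_{\varphi}^{-1}$, respectively.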
The Gibbs samplers described in this Appendix were implemented in Matlab and extend the Matlab codes for VAR models with variable selection and stochastic search variable selection kindly provided by Gary Koop and Dimitris Korobilis in their websites: personal.strath.ac.uk/gary.koop/bayes_matlab_code_by_koop_and_korobilis.html and https://sites.google.com/site/dimitriskorobilis/matlab.
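The final block of both samplers, the draw of the steady-state coefficients, can be sketched with NumPy as follows. This is an illustrative re-implementation of the formulas for the posterior mean and variance of vec(φ), not the author's Matlab code: the function and variable names are hypothetical, the lag matrices (B_l or Θ_l) and Σ are treated as given, and the data are simulated from a small steady-state VAR(1) with made-up parameter values.

```python
import numpy as np


def vec(a):
    """Stack the columns of a matrix into one vector."""
    return a.reshape(-1, order="F")


def build_U(lag_matrices, q):
    """U = [I_mq; I_q (x) A_1; ...; I_q (x) A_p], so that vec(Lambda') = U vec(phi)."""
    m = lag_matrices[0].shape[0]
    blocks = [np.eye(m * q)] + [np.kron(np.eye(q), A) for A in lag_matrices]
    return np.vstack(blocks)


def steady_state_posterior(y, d, lag_matrices, Sigma, b_prior, V_prior):
    """Posterior mean and covariance of vec(phi), conditional on the lags and Sigma.

    y is (T + p) x m and d is (T + p) x q; the first p rows are presample values.
    """
    p = len(lag_matrices)
    T = y.shape[0] - p
    q = d.shape[1]
    Y_phi = y[p:].astype(float)   # rows become (A(L) y_t)'
    D_blocks = [d[p:]]            # rows become (d_t', -d_{t-1}', ..., -d_{t-p}')
    for lag, A in enumerate(lag_matrices, start=1):
        Y_phi = Y_phi - y[p - lag:p - lag + T] @ A.T
        D_blocks.append(-d[p - lag:p - lag + T])
    D = np.hstack(D_blocks)
    U = build_U(lag_matrices, q)
    Sigma_inv = np.linalg.inv(Sigma)
    V_prior_inv = np.linalg.inv(V_prior)
    precision = U.T @ np.kron(D.T @ D, Sigma_inv) @ U + V_prior_inv
    V_post = np.linalg.inv(precision)
    b_post = V_post @ (U.T @ vec(Sigma_inv @ Y_phi.T @ D) + V_prior_inv @ b_prior)
    return b_post, V_post


# Simulated illustration with made-up values: two variables, a constant, one lag.
rng = np.random.default_rng(0)
m, q, p, T = 2, 1, 1, 200
phi_true = np.array([[2.0], [5.0]])
A1 = 0.5 * np.eye(m)
Sigma = 0.01 * np.eye(m)
d = np.ones((T + p, q))
y = np.zeros((T + p, m))
y[0] = phi_true[:, 0]
for t in range(1, T + p):
    shock = rng.multivariate_normal(np.zeros(m), Sigma)
    y[t] = phi_true[:, 0] + A1 @ (y[t - 1] - phi_true[:, 0]) + shock

b_post, V_post = steady_state_posterior(
    y, d, [A1], Sigma, b_prior=np.zeros(m * q), V_prior=100.0 * np.eye(m * q)
)
# One Gibbs draw of phi, reshaped back to m x q.
phi_draw = rng.multivariate_normal(b_post, V_post).reshape(m, q, order="F")
print(b_post)
```

With a loose prior, as in this example, the posterior mean should sit close to the steady state used in the simulation; inside a Gibbs sampler the lag matrices and Σ would be replaced by their current draws at every iteration.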

Appendix B: Specifying priors on β

This appendix briefly describes the three different types of prior distribution on β used in the VS and SSVS models (for more details see George, Sun, and Ni (2008), Koop (2013), Korobilis (2013) and references therein). More specifically, for the VS models we use:

(a) The Ridge regression prior, which defines b=0_n×1 and V=λI_n, with the hyperparameter λ determining the degree of shrinkage on β. We choose λ=100 for the intercepts (diffuse prior) and λ=9 for the dynamic parameters.
(b) The popular Minnesota prior, which assumes that the variables follow an AR(1) process, implying that all elements of b are zero except for the coefficient on the first own lag of each variable, which is equal to δ_i.^[17] Here, we set the autoregressive parameter δ_i equal to 0.25 for GDP and 0.8 for the inflation and interest rate variables. The prior covariance matrix V is assumed to be an n×n diagonal matrix with each of its diagonal elements defined as
$$v_{ijl} = \begin{cases} 100\,\sigma_i^2 & \text{for the intercepts} \\ \lambda_1 / l^2 & \text{for own lag parameters} \\ \lambda_1 \lambda_2 \sigma_i^2 / (l^2 \sigma_j^2) & \text{for the } j\text{th lagged variable parameter } (i \neq j) \end{cases} \tag{B.1}$$
where l=1, …, p denotes the lag. We define σ_i² as the residual variance from a univariate AR(p) model for variable i estimated with OLS. We choose λ₁=0.2 and λ₂=0.5 as in Clark (2011).
(c) The hierarchical Bayes shrinkage prior (Shrink) proposed by Korobilis (2013), which is a hierarchical Normal-Jeffreys prior that defines b=0_n×1 and V_jj=λ_j, j=1, …, n with
$$\lambda_j = \begin{cases} \delta_{100}(\lambda) & \text{for the intercepts} \\ 1/\lambda_j & \text{otherwise} \end{cases} \tag{B.2}$$
where δ₁₀₀(λ) is the Dirac delta function. Assuming a scale-invariant Jeffreys prior on λ_j, its posterior value is solely data driven. This is in contrast to the previous two approaches, where the shrinkage parameter λ is chosen ad hoc by the researcher.
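To make (B.1) concrete, here is a minimal Python sketch that fills in the Minnesota prior variances equation by equation. The function name, the coefficient ordering (intercept first, then lags 1 to p of all variables) and the example residual variances are assumptions made for illustration; only the default values λ₁=0.2 and λ₂=0.5 come from the text.

```python
def minnesota_prior_variances(sigma2, p, lam1=0.2, lam2=0.5):
    """Diagonal prior variances following (B.1), one list per equation.

    sigma2 : residual variances from univariate AR(p) fits, one per variable
    p      : number of lags
    Ordering within each equation (an assumption of this sketch):
    [intercept, lag 1 of variables 1..m, ..., lag p of variables 1..m].
    """
    m = len(sigma2)
    variances = []
    for i in range(m):
        row = [100.0 * sigma2[i]]                       # intercept
        for lag in range(1, p + 1):
            for j in range(m):
                if i == j:
                    row.append(lam1 / lag ** 2)         # own lag
                else:                                   # lag of another variable
                    row.append(lam1 * lam2 * sigma2[i] / (lag ** 2 * sigma2[j]))
        variances.append(row)
    return variances

# Hypothetical residual variances for a two-variable VAR(2)
example = minnesota_prior_variances([2.0, 4.0], p=2)
for equation in example:
    print(equation)
```

The prior tightens quadratically with the lag length, and coefficients on lags of other variables are shrunk more strongly than own lags by the factor λ₂, rescaled by the ratio of residual variances.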
For the SSVS models we use the following three priors on regression coefficients:
The standard ‘semi-automatic’ prior of George, Sun, and Ni (2008), which defines τ_0j=c₀var(β_j) and τ_1j=c₁var(β_j), where var(β_j) is the OLS estimate of the variance of the β_j parameters, c₀=1/10 and c₁=10.
The conjunction of the Minnesota prior and the SSVS method proposed by Koop (2013), where τ_0j=c₀var_Min(β_j) and τ_1j=c₁var_Min(β_j), with var_Min(β_j) being the variance of the β_j parameters as defined by the Minnesota prior in (b), with c₀=1/10 and c₁=1. This prior specification implies that if γ_j=1 we impose the same degree of shrinkage on β_j as in the Minnesota prior, whereas if γ_j=0 we impose additional shrinkage.
The conjunction of the Ridge prior and the SSVS method, where τ_0j=c₀var_Ridge(β_j) and τ_1j=c₁var_Ridge(β_j), with var_Ridge(β_j) being the variance of the β_j parameters as defined by the Ridge prior in (a) above, with c₀=1/10 and c₁=1. Again, the prior implies that if γ_j=1 we impose the same degree of shrinkage on β_j as in the Ridge prior, whereas if γ_j=0 we impose additional shrinkage.
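The role of c₀ and c₁ in these three specifications can be illustrated with a short Python sketch. It takes the definitions above literally (τ_0j=c₀·var, τ_1j=c₁·var) and shows which scale applies for a given inclusion indicator γ_j. The function names and the example variance are hypothetical, and how τ enters the SSVS mixture prior follows the main text, not this sketch.

```python
def ssvs_scales(var_beta, c0=0.1, c1=1.0):
    """Return (tau0, tau1) = (c0 * var, c1 * var) for each coefficient."""
    return [(c0 * v, c1 * v) for v in var_beta]

def active_scale(tau0, tau1, gamma):
    """Scale in force for a coefficient: tau1 if gamma == 1, tau0 otherwise."""
    return tau1 if gamma == 1 else tau0

# Hypothetical Minnesota-type prior variance for a first own lag
var_min = [0.2]
(tau0, tau1), = ssvs_scales(var_min, c0=0.1, c1=1.0)

print(active_scale(tau0, tau1, gamma=1))  # same scale as the underlying prior
print(active_scale(tau0, tau1, gamma=0))  # ten times tighter: extra shrinkage
```

With c₁=1 the included state (γ_j=1) reproduces the underlying Minnesota or Ridge prior exactly, while the semi-automatic prior uses c₁=10 and therefore loosens the OLS-based scale for included coefficients.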

We should note that the mean-adjusted form used in the steady-state VARs does not contain constant terms and the priors presented above are implemented only for the dynamic coefficients.

Appendix C: Additional forecasting results

Table C.1:

Relative RMSFE out-of-sample results (variables in a non-gap form).

Models\horizon (quarters)		Real GDP growth				CPI inflation				Federal funds rate
Models\horizon (quarters)		h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12
1.	RW (benchmark)	3.36	3.38•	3.33•	3.32•	3.38	3.45	3.55	3.64	0.99•	2.22	3.33	3.86
2.	OLS	0.97•	1.04	1.13	1.18	0.64	0.88	1.12	1.16	0.99•	1.00	0.99	0.99
3.	Min-standard	0.98	1.02	1.04	1.04	0.61	0.81	1.05	1.07	1.01•	1.03	1.00	0.99
4.	Min-VS	0.95•	1.01	1.03	1.05	0.61•	0.81	1.05	1.07	1.01•	1.03	1.01	1.00
5.	Min-SSP	0.97	0.99•	1.00	1.00•	0.61•	0.77•	0.93	0.93	0.99•	1.00	0.97	0.95
6.	Min-VS-SSP	0.93•	0.99•	1.00•	1.00•	0.61•	0.77•	0.93	0.93	0.99•	1.01	0.98	0.96
7.	Ridge-standard	0.98	1.05	1.05	1.03	0.62	0.81	1.07	1.11	1.11•	1.05	1.04	1.02
8.	Ridge-VS	0.94•	1.01	1.03	1.05	0.61•	0.81	1.05	1.08	1.12	1.07	1.05	1.04
9.	Ridge-SSP	0.97	1.02	1.01	0.99•	0.61	0.77•	0.94	0.93	1.08•	1.00	0.99	0.96
10.	Ridge-VS-SSP	0.93•	0.99•	0.99•	0.99•	0.60•	0.77•	0.92•	0.93	1.05•	1.00	0.98	0.97
11.	Shrink-standard	0.95•	1.01	1.04	1.06	0.62	0.82	1.07	1.09	1.07•	1.05	1.04	1.03
12.	Shrink-VS	0.95•	1.01	1.04	1.06	0.62	0.83	1.07	1.10	1.06•	1.06	1.05	1.04
13.	Shrink-SSP	0.93•	0.98•	1.00	1.00	0.61•	0.78	0.93	0.94	1.04•	0.99	0.97	0.97
14.	Shrink-VS-SSP	0.93•	0.98•	1.00•	1.00•	0.61	0.78	0.93•	0.94	1.03•	0.98•	0.97•	0.97
15.	SSVS	0.96•	1.08	1.02	0.99•	0.61•	0.77•	0.99	1.03	1.20	0.99•	0.98•	0.98
16.	SSVS-SSP	0.98	1.00	1.03	1.00	0.61•	0.77•	0.90•	0.90•	1.01•	0.99	0.95•	0.94•
17.	Min-SSVS	0.96•	1.09	1.01	1.00•	0.61•	0.77	1.01	1.06	1.18	1.01	0.99	0.99
18.	Min-SSVS-SSP	0.95•	1.00•	1.00	0.99•	0.61•	0.75•	0.91•	0.91•	1.17	1.00	0.95•	0.93•
19.	Ridge-SSVS	0.98	1.10	1.02	0.98•	0.62	0.80	1.02	1.06	1.00•	1.00	1.00	1.00
20.	Ridge-SSVS-SSP	0.95•	0.99•	1.00•	1.00•	0.61	0.78•	0.91•	0.92•	0.99•	0.99	0.95•	0.94•

The first row shows the root mean squared forecast error (RMSFE) for the benchmark random walk (RW) model. For the rest of the models the table shows the relative RMSFE. The symbol • indicates that a model belongs to the model confidence set (MCS) because its p-value is greater than the prespecified significance level, a, where a=0.10. The MCS p-values are calculated using the quadratic test statistic. RW is a random walk model with drift in levels. OLS is a standard VAR(1) estimated using ordinary least squares. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with a steady-state prior, VS-SSP is a steady-state BVAR(4) model with variable selection, SSVS is a BVAR(4) with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with a steady-state prior and stochastic search variable selection. For the SSVS (15) and SSVS-SSP (16) models the prior on the dynamic coefficients is that of George, Sun, and Ni (2008). The out-of-sample period is 1969:Q4–2015:Q1.
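For readers who want to reproduce entries of this kind, the sketch below computes an RMSFE and its ratio to a benchmark, assuming the usual definition of a relative RMSFE as the model's RMSFE divided by that of the RW benchmark (so values below 1 favour the model). The function names and the toy series are illustrative assumptions, not the paper's data.

```python
import math

def rmsfe(actual, forecast):
    """Root mean squared forecast error over matching periods."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def relative_rmsfe(actual, model_forecast, benchmark_forecast):
    """RMSFE of the model divided by the RMSFE of the benchmark."""
    return rmsfe(actual, model_forecast) / rmsfe(actual, benchmark_forecast)

# Toy example: the benchmark misses by 2 every period, the model by 1
actual = [1.0, 2.0, 3.0, 4.0]
benchmark = [3.0, 4.0, 5.0, 6.0]
model = [2.0, 3.0, 4.0, 5.0]

print(rmsfe(actual, benchmark))
print(relative_rmsfe(actual, model, benchmark))
```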

Table C.2:

Relative MAFE out-of-sample results (variables in a non-gap form).

Models\horizon (quarters)		Real GDP growth				CPI inflation				Federal funds rate
Models\horizon (quarters)		h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12
1.	RW (benchmark)	2.39	2.40•	2.36•	2.34•	2.42	2.46	2.53	2.62	0.58•	1.59•	2.52•	3.03
2.	OLS	0.99•	1.05	1.17	1.21	0.61	0.86	1.06	1.12	0.97•	1.01•	1.01	0.99
3.	Min-standard	1.00	1.04	1.05	1.05•	0.59•	0.83	1.04	1.05	1.03•	1.03•	1.03	0.98
4.	Min-VS	0.97•	1.02	1.05	1.06•	0.58•	0.83	1.04	1.05	1.02•	1.03•	1.04	0.99
5.	Min-SSP	0.99	1.00•	1.02•	1.01•	0.59•	0.78•	0.92	0.86	1.02•	1.00•	0.95•	0.88
6.	Min-VS-SSP	0.96•	0.99•	1.01•	1.01•	0.59•	0.78•	0.92	0.86	1.01•	1.00•	0.96•	0.90
7.	Ridge-standard	1.01	1.06	1.06	1.04•	0.59•	0.84	1.07	1.08	1.08•	1.04•	1.07	1.00
8.	Ridge-VS	0.98•	1.02	1.05	1.06•	0.58•	0.82	1.06	1.06	1.08	1.07	1.07	1.02
9.	Ridge-SSP	1.01	1.02	1.03	1.00•	0.59•	0.78•	0.93	0.87	1.05•	1.00•	0.97	0.90
10.	Ridge-VS-SSP	0.96•	0.99•	1.01•	1.01•	0.58•	0.78•	0.92	0.86•	1.01•	0.99•	0.97•	0.91
11.	Shrink-standard	0.98•	1.02	1.06	1.07•	0.59•	0.83	1.05	1.06	1.04•	1.05	1.05	1.02
12.	Shrink-VS	0.98•	1.02	1.06	1.07•	0.59•	0.84	1.05	1.06	1.04•	1.06	1.06	1.02
13.	Shrink-SSP	0.96•	0.99•	1.02	1.02•	0.59•	0.78•	0.92	0.86	1.01•	0.98•	0.96•	0.91
14.	Shrink-VS-SSP	0.95•	0.99•	1.02•	1.02•	0.60•	0.78•	0.92	0.86	1.00•	0.97•	0.96•	0.91
15.	SSVS	0.97•	1.08	1.03	1.01•	0.59•	0.77•	0.96	0.97	1.36	1.01•	0.99•	0.97
16.	SSVS-SSP	1.00	1.01•	1.06	1.02•	0.58•	0.77•	0.89•	0.83•	1.09	1.01•	0.95•	0.88•
17.	Min-SSVS	0.97•	1.07	1.01•	1.00•	0.60	0.80	1.00	1.01	1.28	1.03•	1.01	1.00
18.	Min-SSVS-SSP	0.96•	1.00•	1.02	1.01•	0.60•	0.77•	0.90•	0.84•	1.27	1.02•	0.94•	0.87•
19.	Ridge-SSVS	0.98•	1.10	1.03•	1.00•	0.59•	0.81	0.98	0.99	1.07•	1.01•	1.00	0.99
20.	Ridge-SSVS-SSP	0.97•	0.98•	1.02•	1.01•	0.59•	0.76•	0.89•	0.84•	1.07	1.01•	0.95•	0.88•

The first row shows the mean absolute forecast error (MAFE) for the benchmark random walk (RW) model. For the rest of the models the table shows the relative MAFE. The symbol • indicates that a model belongs to the model confidence set (MCS) because its p-value is greater than the prespecified significance level, a, where a=0.10. The MCS p-values are calculated using the quadratic test statistic. RW is a random walk model with drift in levels. OLS is a standard VAR(1) estimated using ordinary least squares. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with a steady-state prior, VS-SSP is a steady-state BVAR(4) model with variable selection, SSVS is a BVAR(4) with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with a steady-state prior and stochastic search variable selection. For the SSVS (15) and SSVS-SSP (16) models the prior on the dynamic coefficients is that of George, Sun, and Ni (2008). The out-of-sample period is 1969:Q4–2015:Q1.

Table C.3:

Relative RMSFE results (variables in a gap form).

Models\horizon (quarters)		Real GDP growth				CPI inflation				Federal funds rate
Models\horizon (quarters)		h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12
1.	RW (benchmark)	3.36	3.38	3.33•	3.32	3.38	3.45	3.55	3.64	0.99	2.22	3.33	3.86
2.	OLS	0.94•	1.00	1.04	1.05	0.63	0.84	0.97	0.97	0.97•	0.95•	0.88•	0.83
3.	Min-standard	0.96	1.00	1.00•	1.00	0.60•	0.75	0.88	0.89	0.99	0.97	0.89	0.83
4.	Min-VS	0.93•	0.99	1.00•	1.00	0.60•	0.76	0.88	0.89	0.98	0.97	0.89	0.82•
5.	Min-SSP	0.95	0.99	0.99•	0.99	0.60•	0.75•	0.87•	0.87	0.98•	0.96•	0.88	0.83
6.	Min-VS-SSP	0.92•	0.98•	0.99•	0.99	0.60•	0.75	0.86•	0.87	0.98•	0.96•	0.88	0.83
7.	Ridge-standard	0.97	1.03	1.00	1.00	0.61•	0.75	0.89	0.88	1.07	0.97	0.90	0.83
8.	Ridge-VS	0.93•	0.99•	0.99•	1.00	0.60•	0.75	0.87	0.87	1.06	0.97	0.88	0.82•
9.	Ridge-SSP	0.96	1.02	1.00	0.99•	0.61•	0.74•	0.87	0.87	1.07	0.96•	0.90	0.84
10.	Ridge-VS-SSP	0.92•	0.98•	0.99•	0.99	0.60•	0.74•	0.85•	0.85•	1.05	0.96•	0.88	0.82•
11.	Shrink-standard	0.94•	0.99	1.01	1.00	0.61	0.77	0.88	0.88	1.04	0.97	0.88	0.82•
12.	Shrink-VS	0.94•	1.00	1.01	1.00	0.61	0.77	0.89	0.88	1.03	0.97	0.88	0.82•
13.	Shrink-SSP	0.93•	0.99•	1.00	0.99	0.61	0.75	0.86•	0.86	1.03	0.96•	0.87•	0.82•
14.	Shrink-VS-SSP	0.94•	0.99	1.00	0.99	0.61	0.76	0.86•	0.86	1.02	0.96•	0.88•	0.82•
15.	SSVS	0.93•	1.04	1.00	0.98•	0.62	0.76	0.88	0.89	0.99	0.95•	0.88•	0.82•
16.	SSVS-SSP	0.92•	0.97•	0.99•	0.99•	0.61	0.75•	0.86•	0.87	0.99	0.96	0.88	0.82
17.	Min-SSVS	0.94•	1.04	0.98•	0.97•	0.60•	0.74•	0.88	0.89	1.17	0.95•	0.89	0.84
18.	Min-SSVS-SSP	0.93•	0.99•	0.99•	0.99•	0.60•	0.74•	0.86•	0.87	1.17	0.95•	0.87•	0.82•
19.	Ridge-SSVS	0.95•	1.05	1.00•	0.97•	0.61	0.76	0.88	0.90	0.98•	0.94•	0.87•	0.82•
20.	Ridge-SSVS-SSP	0.93•	0.97•	0.99•	0.99	0.61•	0.75	0.86•	0.88	0.99	0.95•	0.87•	0.82•

The first row shows the root mean squared forecast error (RMSFE) for the benchmark random walk (RW) model. For the rest of the models the table shows the relative RMSFE. The symbol • indicates that a model belongs to the model confidence set (MCS) because its p-value is greater than the prespecified significance level, a, where a=0.10. The MCS p-values are calculated using the quadratic test statistic. RW is a random walk model with drift in levels. OLS is a standard VAR(1) estimated using ordinary least squares. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with a steady-state prior, VS-SSP is a steady-state BVAR(4) model with variable selection, SSVS is a BVAR(4) with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with a steady-state prior and stochastic search variable selection. For the SSVS (15) and SSVS-SSP (16) models the prior on the dynamic coefficients is that of George, Sun, and Ni (2008). The out-of-sample period is 1969:Q4–2015:Q1.

Table C.4:

Relative MAFE out-of-sample results (variables in a gap form).

Models\horizon (quarters)		Real GDP growth				CPI inflation				Federal funds rate
Models\horizon (quarters)		h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12	h=1	h=4	h=8	h=12
1.	RW (benchmark)	2.39•	2.40•	2.36•	2.34	2.42	2.46	2.53	2.62	0.58•	1.59•	2.52	3.03
2.	OLS	0.99•	1.03	1.07	1.08	0.62•	0.85	0.99	1.00	0.99•	1.01	0.94	0.85
3.	Min-standard	1.00•	1.02	1.01•	1.00	0.58•	0.79	0.92	0.91	1.03•	0.99•	0.92	0.83
4.	Min-VS	0.98•	1.02	1.01•	1.00	0.58•	0.79	0.92	0.91	1.03•	0.99•	0.92	0.82
5.	Min-SSP	0.99•	1.01•	1.00•	0.99•	0.58•	0.77•	0.89	0.88	1.03	0.98•	0.91•	0.82
6.	Min-VS-SSP	0.96•	1.00•	1.00•	0.99•	0.58•	0.78	0.89	0.88	1.02•	0.98•	0.91•	0.81•
7.	Ridge-standard	1.01•	1.05	1.01•	1.00	0.58•	0.79	0.92	0.90	1.06	1.00	0.93	0.83
8.	Ridge-VS	0.98•	1.02	1.01•	1.01	0.58•	0.78	0.90	0.88	1.03•	1.00	0.93	0.82
9.	Ridge-SSP	1.00•	1.03	1.01•	0.99•	0.58•	0.77•	0.90	0.88	1.06	0.99•	0.92	0.82
10.	Ridge-VS-SSP	0.98•	1.00•	1.00•	1.00•	0.57•	0.76•	0.87•	0.86•	1.02•	0.99•	0.92	0.81
11.	Shrink-standard	0.98•	1.03	1.02	1.02	0.59•	0.78	0.91	0.89	1.03	0.99•	0.92	0.82
12.	Shrink-VS	0.99•	1.03	1.02	1.02	0.59•	0.79	0.91	0.89	1.03•	0.99•	0.92	0.83
13.	Shrink-SSP	0.98•	1.01	1.01•	1.00	0.58•	0.77•	0.88•	0.86	1.02•	0.98•	0.91•	0.81
14.	Shrink-VS-SSP	0.99•	1.02	1.01•	1.01	0.59•	0.77•	0.88•	0.86•	1.01•	0.98•	0.91•	0.82
15.	SSVS	0.95•	1.04	1.04	0.99•	0.59•	0.79	0.91	0.91	1.09	0.98•	0.90•	0.80•
16.	SSVS-SSP	0.97•	0.99•	1.00•	0.99•	0.58•	0.78•	0.90	0.90	1.11	1.01	0.91	0.82
17.	Min-SSVS	0.95•	1.03	1.00•	0.97•	0.59•	0.76•	0.89•	0.89	1.28	0.99•	0.91•	0.81
18.	Min-SSVS-SSP	0.96•	1.01•	1.00•	0.99•	0.59•	0.76•	0.89	0.88	1.28	1.00	0.90•	0.80•
19.	Ridge-SSVS	0.96•	1.06	1.04	1.00•	0.58•	0.79	0.91	0.91	1.06	0.97•	0.89•	0.80•
20.	Ridge-SSVS-SSP	0.98•	0.99•	1.00•	0.99•	0.59•	0.78	0.90	0.90	1.08	1.00	0.91•	0.81

The first row shows the mean absolute forecast error (MAFE) for the benchmark random walk (RW) model. For the rest of the models the table shows the relative MAFE. The symbol • indicates that a model belongs to the model confidence set (MCS) because its p-value is greater than the prespecified significance level, a, where a=0.10. The MCS p-values are calculated using the quadratic test statistic. RW is a random walk model with drift in levels. OLS is a standard VAR(1) estimated using ordinary least squares. Min, Ridge and Shrink are the Minnesota, Ridge and Shrink priors for the dynamic coefficients, respectively. Standard is a BVAR(4) model, VS is a BVAR(4) with variable selection à la Korobilis (2013), SSP is a BVAR(4) with a steady-state prior, VS-SSP is a steady-state BVAR(4) model with variable selection, SSVS is a BVAR(4) with stochastic search variable selection à la George, Sun, and Ni (2008) and SSVS-SSP is a BVAR(4) with a steady-state prior and stochastic search variable selection. For the SSVS (15) and SSVS-SSP (16) models the prior on the dynamic coefficients is that of George, Sun, and Ni (2008). The out-of-sample period is 1969:Q4–2015:Q1.

References

Adolfson, M., M. K. Andersson, J. Lindé, M. Villani, and A. Vredin. 2007. “Modern Forecasting Models in Action: Improving Macroeconomic Analyses at Central Banks.” International Journal of Central Banking 3: 111–144. doi:10.2139/ssrn.980666

Banbura, M., D. Giannone, and L. Reichlin. 2010. “Large Bayesian Vector Auto Regressions.” Journal of Applied Econometrics 25: 71–92. doi:10.1002/jae.1137

Beechey, M., and P. Österholm. 2008. “A Bayesian Vector Autoregressive Model with Informative Steady-State Priors for the Australian Economy.” The Economic Record 84 (267): 449–465. doi:10.1111/j.1475-4932.2008.00510.x

Beechey, M., and P. Österholm. 2010. “Forecasting Inflation in an Inflation-Targeting Regime: A Role for Informative Steady-State Priors.” International Journal of Forecasting 26 (2): 248–264. doi:10.1016/j.ijforecast.2009.10.006

Chan, J. C. C., and G. Koop. 2014. “Modelling Breaks and Clusters in Steady States of Macroeconomic Variables.” Computational Statistics & Data Analysis 76: 186–193. doi:10.1016/j.csda.2013.05.007

Clark, T. E. 2011. “Real-Time Density Forecasts from Bayesian Vector Autoregressions with Stochastic Volatility.” Journal of Business and Economic Statistics 29 (3): 327–341. doi:10.1198/jbes.2010.09248

Cogley, T. 2002. “A Simple Adaptive Measure of Core Inflation.” Journal of Money, Credit and Banking 34 (1): 94–113. doi:10.1353/mcb.2002.0027

Doan, T., R. Litterman, and C. Sims. 1984. “Forecasting and Conditional Projection using Realistic Prior Distributions.” Econometric Reviews 3 (1): 1–144. doi:10.3386/w1202

Faust, J., and J. H. Wright. 2013. “Forecasting Inflation.” Handbook of Economic Forecasting 2 (Part A): 3–56. doi:10.1016/B978-0-444-53683-9.00001-3

Gefang, D. 2014. “Bayesian Doubly Adaptive Elastic-Net Lasso for VAR Shrinkage.” International Journal of Forecasting 30 (1): 1–11. doi:10.1016/j.ijforecast.2013.04.004

George, E. I., and R. E. McCulloch. 1997. “Approaches for Bayesian Variable Selection.” Statistica Sinica 7 (2): 339–373.

George, E. I., D. Sun, and S. Ni. 2008. “Bayesian Stochastic Search for VAR Models Restrictions.” Journal of Econometrics 142 (1): 553–580. doi:10.1016/j.jeconom.2007.08.017

Geweke, J. 1992. “Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments.” In Bayesian Statistics, edited by J. M. Bernardo, J. Berger, A. P. Dawid and A. F. M. Smith, 169–193. Oxford: Oxford University Press. doi:10.21034/sr.148

Giannone, D., M. Lenza, and G. Primiceri. 2015. “Prior Selection for Vector Autoregressions.” Review of Economics and Statistics 97 (2): 436–451. doi:10.3386/w18467

Hansen, P. R., A. Lunde, and J. M. Nason. 2003. “Choosing the Best Volatility Models: The Model Confidence Set Approach.” Oxford Bulletin of Economics and Statistics 65 (S1): 839–861. doi:10.1046/j.0305-9049.2003.00086.x

Hansen, P. R., A. Lunde, and J. M. Nason. 2011. “The Model Confidence Set.” Econometrica 79 (2): 453–479. doi:10.2139/ssrn.522382

Jarociński, M., and J. F. Smets. 2008. “House Prices and the Stance of Monetary Policy.” Federal Reserve Bank of St. Louis Review 90: 339–365. doi:10.20955/r.90.339-366

Karlsson, S. 2013. “Forecasting with Bayesian Vector Autoregressions.” Handbook of Economic Forecasting 2 (Part B): 791–897. doi:10.1016/B978-0-444-62731-5.00015-4

Koop, G. 2013. “Forecasting with Medium and Large Bayesian VARs.” Journal of Applied Econometrics 28 (2): 177–203. doi:10.1002/jae.1270

Korobilis, D. 2008. “Forecasting with Vector Autoregressions with Many Predictors.” Advances in Econometrics 23: 403–431. doi:10.1016/S0731-9053(08)23012-4

Korobilis, D. 2009. “VAR Forecasting Using Bayesian Variable Selection.” MPRA Paper No. 21124. doi:10.2139/ssrn.1564378

Korobilis, D. 2013. “VAR Forecasting Using Bayesian Variable Selection.” Journal of Applied Econometrics 28: 204–230. doi:10.1002/jae.1271

Litterman, R. B. 1986. “Forecasting with Bayesian Vector Autoregressions – Five Years of Experience.” Journal of Business and Economic Statistics 4 (1): 25–38. doi:10.1080/07350015.1986.10509491

Österholm, P. 2012. “The Limited Usefulness of Macroeconomic Bayesian VARs when Estimating the Probability of a US Recession.” Journal of Macroeconomics 34 (1): 76–86. doi:10.1016/j.jmacro.2011.10.002

Primiceri, G. E. 2005. “Time Varying Structural Vector Autoregressions and Monetary Policy.” The Review of Economic Studies 72 (3): 821–852. doi:10.1111/j.1467-937X.2005.00353.x

Sims, C. 1980. “Macroeconomics and Reality.” Econometrica 48 (1): 1–48. doi:10.2307/1912017

Van Roye, B. 2011. “Financial Stress and Economic Activity in Germany and the Euro Area.” Kiel Working Papers No 1743.

Villani, M. 2005. “Inference in Vector Autoregressive Models with Informative Prior on the Steady State.” Sveriges Riksbank Working Paper Series No 181. doi:10.2139/ssrn.938726

Villani, M. 2009. “Steady-State Priors for Vector Autoregressions.” Journal of Applied Econometrics 24: 630–650. doi:10.1002/jae.1065

Wright, J. H. 2013. “Evaluating Real-Time VAR Forecasts with an Informative Democratic Prior.” Journal of Applied Econometrics 28: 762–776. doi:10.21799/frbp.wp.2010.19

Supplemental Material:

The online version of this article (DOI: 10.1515/snde-2015-0048) offers supplementary material, available to authorized users.

Published Online: 2016-3-16

Published in Print: 2016-12-1
