The forward rate premium puzzle: a case of misspecification?¹⁾

Stephen G. Hall; Amangeldi Kenjegaliev; P. A. V. B. Swamy; George S. Tavlas

doi:10.1515/snde-2013-0009

Published/Copyright: April 16, 2013

From the journal Studies in Nonlinear Dynamics and Econometrics Volume 17 Issue 3

Abstract

Empirical studies often report a negative relationship between the change in the spot exchange rate and the forward premium, violating the forward-rate unbiasedness hypothesis. Using standard regression on a sample of ten exchange rates, we obtain both positive and negative coefficients. We argue that the negative coefficients could arise from non-linearities in the relationship and from misspecification. As an alternative to the standard regression, we use a time-varying-coefficient technique that estimates bias-free coefficients and, thus, should provide better estimates of the link between spot and forward rates. Our findings strongly support the forward-rate unbiasedness hypothesis. All the parameters are very close to unity and significant.

Keywords: forward premium anomaly; time-varying-coefficient; spurious relationship

JEL classifications: C51; E43

Corresponding author: Stephen G. Hall, Economics Dept, University of Leicester, Leicester, LE1 7RH, UK

Appendix A: technical exposition of TVC estimation

When studying the relation of a dependent variable, denoted by $y_t^*$, to a hypothesized set of K–1 of its determinants, denoted by $x_{1t}^*, \ldots, x_{K-1,t}^*$, where these K–1 variables may be only a subset of the complete set of determinants of $y_t^*$, a number of problems may arise. Any specific functional form of the relation may be incorrect and may therefore lead to specification errors. Another problem that can arise in investigating the relationship between the dependent variable and its determinants is that $x_{1t}^*, \ldots, x_{K-1,t}^*$ may not exhaust the complete list of the determinants of $y_t^*$, in which case the relation of $y_t^*$ to $x_{1t}^*, \ldots, x_{K-1,t}^*$ may be subject to omitted-variable biases. In addition to these problems, the available data on $y_t^*$ and $x_{1t}^*, \ldots, x_{K-1,t}^*$ may not be perfect measures of the underlying true variables, causing errors-in-variables problems. In what follows, we propose the correct interpretations and an appropriate method of estimation of the coefficients of the relationship between $y_t^*$ and $x_{1t}^*, \ldots, x_{K-1,t}^*$ in the presence of the foregoing problems.

Suppose that T measurements on $y_t^*$ and $x_{1t}^*, \ldots, x_{K-1,t}^*$ are made and that these measurements are, in fact, the sums of “true” values and measurement errors: $y_t = y_t^* + v_{0t}$ and $x_{jt} = x_{jt}^* + v_{jt}$, j=1,…,K–1, t=1,…,T, where the variables $y_t, x_{1t}, \ldots, x_{K-1,t}$ without an asterisk are the observable variables, the variables with an asterisk are the unobservable “true” values, and the v’s are measurement errors. Also, given the possibilities that the functional form we are estimating may be misspecified and that some important variables may be missing from $x_{1t}, \ldots, x_{K-1,t}$, we need a model that can accommodate all of these potential problems.
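The effect of the measurement errors alone can be illustrated with a small simulation. This is a minimal sketch on invented data, not the paper's data or estimator: the true slope of 1, the error variances, and the helper name `ols_slope` are all assumptions made for the example.

```python
import random

rng = random.Random(42)
T = 5000

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical "true" variables: y* = 1.0 * x*, so the true slope is 1.
x_true = [rng.gauss(0.0, 1.0) for _ in range(T)]
y_true = [1.0 * xs for xs in x_true]

# Observed variables are true values plus measurement errors.
x_obs = [xs + rng.gauss(0.0, 1.0) for xs in x_true]  # error in the regressor
y_obs = [ys + rng.gauss(0.0, 0.1) for ys in y_true]  # error in the dependent variable

slope_true = ols_slope(x_true, y_true)
slope_obs = ols_slope(x_obs, y_obs)
print(f"slope using true values:     {slope_true:.3f}")
print(f"slope using observed values: {slope_obs:.3f}")
```

Because the regressor's error variance equals the variance of the true regressor in this setup, the observed slope should settle near one half, the familiar attenuation factor var(x*)/(var(x*)+var(v)).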

It is useful at this point to clarify what we believe is the main objective of econometric estimation. In our view, the objective is to obtain unbiased estimates of the effect on a dependent variable of changing one of its complete set of determinants, holding all of the remaining determinants of this set constant. That is, we aim to find a consistent estimator of the bias-free component of the coefficient of any $x_{jt}$ in the regression of $y_t$ on $x_{1t}, \ldots, x_{K-1,t}$. This interpretation is, of course, the standard one usually placed on the coefficients of a typical econometric model, but its validity depends crucially on the assumption that the conventional model gives bias-free coefficients, which is not the case in the presence of model misspecification.

To deal with this issue, consider a set of time-varying coefficients that provide a complete explanation of the dependent variable y:

$$y_t = \gamma_{0t} + \gamma_{1t} x_{1t} + \cdots + \gamma_{K-1,t} x_{K-1,t}, \qquad \text{(A1)}$$

where $y_t = s_{t+k} - s_t$, $x_{1t} = f_t - s_t$, and the other connections between models (3) and (A1) are shown below. We call (A1) “the time-varying coefficient (TVC) model”. (Note that this equation is formulated in terms of the observed variables.) As this model provides a complete explanation of y, all the misspecifications in the model, as well as the correct components, must be present in the time-varying coefficients. Note that, if the true functional form is non-linear, the time-varying coefficients may be thought of as the true non-linear structure, and so they are able to capture any possible function. These coefficients will also capture the biases introduced by measurement error and omitted variables. Our aim is to find a way of decomposing these coefficients into the bias and the bias-free components.

It is important to stress that, while we start from a TVC model, and this technique is typically referred to in the literature as time-varying-coefficient estimation, the objective here is not to simply estimate a model with changing coefficients. We start from (A1) because this is a representation of the underlying data generation process, which is correct. This is the case simply because, if the coefficients can vary at each point in time, they are able to explain one-hundred percent of the variation in the dependent variable. In the case of the TVC procedure followed in this paper, however, we extend the standard TVC model typically considered in the literature; specifically, we decompose each of these varying coefficients into two parts, a consistent estimate of the bias-free part and the remaining part, which is due to biases from the various misspecifications in the model. If the true model is linear, we would get back to a constant coefficient model. If the true model is non-linear, the coefficients of its linear-in-variables form will vary over time to reflect this circumstance. The key point is that the TVC technique used here produces consistent estimates of structural relationships in the presence of model misspecification.
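The claim that coefficients free to vary at each point in time can reproduce the dependent variable exactly, even when the true relation is non-linear, can be checked on a toy case. The sketch below assumes a hypothetical relation y = x² and a zero intercept; neither comes from the paper, and the helper `ols` is illustrative.

```python
import random

random.seed(0)
T = 200
x = [random.uniform(0.5, 2.0) for _ in range(T)]
y = [xt ** 2 for xt in x]  # hypothetical non-linear "true" relation

def ols(x, y):
    """Return (intercept, slope) of a constant-coefficient least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Constant coefficients: the straight line cannot match the curve.
a, b = ols(x, y)
fixed_resid = [yt - (a + b * xt) for xt, yt in zip(x, y)]

# Time-varying slope gamma_1t = y_t / x_t (intercept set to zero):
# the linear-in-variables form now fits every observation.
gamma1 = [yt / xt for xt, yt in zip(x, y)]
tvc_resid = [yt - g * xt for xt, yt, g in zip(x, y, gamma1)]

max_fixed = max(abs(r) for r in fixed_resid)
max_tvc = max(abs(r) for r in tvc_resid)
print(f"largest residual, constant coefficients:     {max_fixed:.4f}")
print(f"largest residual, time-varying coefficients: {max_tvc:.2e}")
```

The exact fit is mechanical and says nothing yet about structure, which is the point made above: the useful step is the decomposition of the varying coefficient, not the variation itself.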

For empirical implementation, model (A1) has to be embedded in a stochastic framework. To do so, we need to answer the question: What are the correct stochastic assumptions about the TVC’s of (A1)? We believe that the correct answer is: the correct interpretation of the TVC’s and the assumptions about them must be based on an understanding of the model misspecification which comes from any (i) omitted variables, (ii) measurement errors, and (iii) misspecification of the functional form. We expand on this argument in what follows.

Notation and assumptions

Let $m_t$ denote the total number of the determinants of $y_t^*$. The exact value of $m_t$ cannot be known at any time. We assume that $m_t$ is larger than K–1 (that is, the total number of determinants is greater than the number of determinants for which we have observations) and possibly varies over time.⁸ This assumption means that there are determinants of $y_t^*$ that are excluded from Equation (A1), since (A1) includes only K–1 determinants. Let $x_{gt}^*$, g=K,…,$m_t$, denote these excluded determinants. Let $\alpha_{0t}^*$ denote the intercept and let both $\alpha_{jt}^*$, j=1,…,K–1, and $\alpha_{gt}^*$, g=K,…,$m_t$, denote the other coefficients of the regression of $y_t^*$ on all of its determinants. The true functional forms of the partial derivatives of this regression determine the time profiles of the $\alpha^*$s. These time profiles are unknown, since the true functional form is unknown. Note that an equation that is linear in variables accurately represents a non-linear equation, provided the coefficients of the former equation are time-varying with time profiles determined by the true functional forms of the partial derivatives of the latter equation. This type of representation of a non-linear equation is convenient, particularly when the true functional form of the non-linear equation is unknown. Such a representation is not subject to the criticism of misspecified functional form.

For g=K,…,$m_t$, let $\lambda_{0gt}^*$ denote the intercept and let $\lambda_{jgt}^*$, j=1,…,K–1, denote the other coefficients of the regression of $x_{gt}^*$ on $x_{1t}^*, \ldots, x_{K-1,t}^*$. The true functional forms of these regressions determine the time profiles of the $\lambda^*$s. Let a set, denoted by $S_1$, contain those regressors of (A1) that take the value zero with probability zero, and let another set, denoted by $S_2$, contain the remaining regressors of (A1), which take the value zero with positive probability.

The following theorem gives the correct interpretations of the coefficients of equation (A1):

Theorem 1. The intercept of (A1) satisfies the equation,

and the coefficients of (A1) other than the intercept satisfy the equations,

Proof: See Swamy and Tavlas (2007).⁹ □
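The decomposition behind Theorem 1 can be traced by direct substitution. The following is a sketch under the definitions given in this appendix, not a reproduction of the published equations: the label $v_{0t}$ for the measurement error in $y_t$ and the arrangement of terms are assumptions, and the authoritative forms of (A2) and (A3) are those of Swamy and Tavlas (2007) as modified in the way footnote 9 describes.

```latex
% Complete relation, and regressions of excluded on included determinants
\begin{align*}
y_t^* &= \alpha_{0t}^* + \sum_{j=1}^{K-1} \alpha_{jt}^* x_{jt}^*
        + \sum_{g=K}^{m_t} \alpha_{gt}^* x_{gt}^*, \\
x_{gt}^* &= \lambda_{0gt}^* + \sum_{j=1}^{K-1} \lambda_{jgt}^* x_{jt}^*,
        \qquad g = K, \ldots, m_t .
\end{align*}

% Substitute the second line into the first, then use
% y_t = y_t^* + v_{0t} and x_{jt}^* = x_{jt} - v_{jt}
\begin{align*}
y_t &= \alpha_{0t}^* + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{0gt}^* + v_{0t}
      + \sum_{j=1}^{K-1} \Bigl( \alpha_{jt}^*
      + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{jgt}^* \Bigr) (x_{jt} - v_{jt}) .
\end{align*}

% Matching terms with y_t = \gamma_{0t} + \sum_j \gamma_{jt} x_{jt} in (A1)
\begin{align*}
\gamma_{0t} &= \alpha_{0t}^* + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{0gt}^* + v_{0t}
      - \sum_{j \in S_2} \Bigl( \alpha_{jt}^*
      + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{jgt}^* \Bigr) v_{jt}, \\
\gamma_{jt} &= \alpha_{jt}^* + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{jgt}^*
      - \Bigl( \alpha_{jt}^* + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{jgt}^* \Bigr)
        \frac{v_{jt}}{x_{jt}}, \qquad j \in S_1, \\
\gamma_{jt} &= \alpha_{jt}^* + \sum_{g=K}^{m_t} \alpha_{gt}^* \lambda_{jgt}^*,
      \qquad j \in S_2 .
\end{align*}
```

In this sketch the leading term $\alpha_{jt}^*$ is the bias-free component, the sum over g is the omitted-variable bias, and the term in $v_{jt}/x_{jt}$ is the measurement-error bias. Division by $x_{jt}$ is only possible for regressors in $S_1$, so for regressors in $S_2$ the measurement-error terms are collected in the intercept instead.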

Thus, we may interpret the TVC’s in terms of the underlying correct coefficients, the observed explanatory variables and measurement errors in the dependent variable and included regressors. It should be noted that, by assuming that the $\lambda^*$s in equations (A2) and (A3) are possibly nonzero, we do not require that the determinants of $y_t^*$ included in (A1) be independent of the determinants of $y_t^*$ excluded from (A1). Pratt and Schlaifer (1988: p. 34) show that this condition is what they call “meaningless”. By the same logic, the usual exogeneity assumption of independence between a regressor and the disturbances of an econometric model is, in their terminology, also “meaningless” if the disturbances are assumed to represent the net effect on the dependent variable of the determinants of the dependent variable excluded from the model. The real culprit appears to be the interpretation that the disturbances of an econometric model represent the net effect on the dependent variable of the unidentified determinants of the dependent variable excluded from the model.

By assuming that the $\alpha^*$s and $\lambda^*$s are possibly time-varying, we do not a priori rule out the possibility that the relationship of $y_t^*$ with all of its determinants and the regressions of the determinants of $y_t^*$ excluded from (A1) on the determinants of $y_t^*$ included in (A1) are non-linear. It should be noted that for ℓ=0, 1,…,$m_t$, each $\alpha_{\ell t}^*$ is functionally dependent on all of the determinants of $y_t^*$. Also, the last term on the right-hand side of the first equation in (A3) implies that the regressors of (A1) are correlated with their own coefficients.¹⁰

Theorem 2. For j=1,…,K–1, the component $\alpha_{jt}^*$ of $\gamma_{jt}$ in (A3) is its bias-free component and is unique.

Proof. It can be seen from Equation (A3) that the component $\alpha_{jt}^*$ of $\gamma_{jt}$ is free of omitted-variables bias, of measurement-error bias and of functional-form bias, since we allow the $\alpha^*$s and $\lambda^*$s to have the correct time profiles. These biases are not unique, being dependent on which determinants of $y_t^*$ are excluded from (A1) and on the $v_{jt}$.¹¹ Note that $\alpha_{jt}^*$ is the coefficient of $x_{jt}^*$ in the correctly specified relation of $y_t^*$ to all of its determinants. Hence, $\alpha_{jt}^*$ represents the bias-free component of the coefficient of $x_{jt}$. □

The bias-free component $\alpha_{jt}^*$ is constant if the relationship between $y_t^*$ and all of its determinants is linear; alternatively, it is variable if the relationship is non-linear. We often have information from theory as to the right sign of $\alpha_{jt}^*$. Any observed correlation between $y_t$ and $x_{jt}$ is spurious if $\alpha_{jt}^* = 0$.¹²
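A spurious correlation in this sense is easy to produce by simulation. The sketch below uses invented data in which the dependent variable depends only on an omitted determinant `z`, so the bias-free coefficient on the included regressor `x` is zero by construction; all names and parameter values are assumptions of the example.

```python
import random

rng = random.Random(7)
T = 5000

def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Residuals of y after removing its least-squares fit on x."""
    a, b = ols(x, y)
    return [yt - a - b * xt for xt, yt in zip(x, y)]

z = [rng.gauss(0.0, 1.0) for _ in range(T)]        # omitted determinant
x = [0.8 * zt + rng.gauss(0.0, 0.6) for zt in z]   # included regressor, correlated with z
y = [2.0 * zt + rng.gauss(0.0, 0.5) for zt in z]   # y depends on z only

# Regression that omits z: x appears to matter.
_, slope_short = ols(x, y)

# Coefficient on x once z is controlled for (partialling z out of both sides).
_, slope_long = ols(residuals(z, x), residuals(z, y))

print(f"slope of y on x, z omitted:  {slope_short:.3f}")
print(f"slope of y on x, z included: {slope_long:.3f}")
```

The first slope is far from zero purely because `x` proxies for `z`; once `z` is accounted for, the coefficient on `x` is close to its bias-free value of zero.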

Thus, the “correctness” of the specification of model (A1) comes from the correct interpretations of the components of its coefficients in (A2) and (A3), and from the implications of those interpretations. Model (A1) is clearly more general than model (4); the former gives an appropriate framework for testing the null hypothesis of forward-rate unbiasedness against a very general and realistic class of alternative hypotheses. If we interpret the error term of (4) as a function of the misspecification of the model, then the error term is not conditionally independent of the included regressors.

A key implication of (A2) and (A3) is that, in the presence of a misspecified functional form, omitted variables, and measurement error, the errors in a standard regression will contain the difference between the right-hand side of (A1) and the right-hand side of the standard regression with the errors suppressed, as in (5) and (6). So the errors will contain the included x variables. This means that the independence condition between an error term and instrumental variables underlying the GMM and instrumental variables method cannot be met as the errors contain exactly the same variables that we require the instruments to have a strong correlation with. In effect, if the instruments are highly correlated with the x variables, they cannot be uncorrelated with the errors as these errors contain exactly the same x variables.
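The instrument argument can be made concrete with a constructed example. The sketch below is an illustration under strong assumptions, not a general proof: the omitted determinant `z` is built to depend on the included regressor `x`, the bias-free coefficient on `x` is set to 1, and `w` is a hypothetical instrument that is relevant for `x`.

```python
import math
import random

rng = random.Random(3)
T = 5000

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)

def corr(a, b):
    """Sample correlation coefficient."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

w = [rng.gauss(0.0, 1.0) for _ in range(T)]        # candidate instrument
x = [wt + rng.gauss(0.0, 1.0) for wt in w]         # included regressor
z = [0.5 * xt + rng.gauss(0.0, 1.0) for xt in x]   # omitted determinant, depends on x
y = [1.0 * xt + 2.0 * zt for xt, zt in zip(x, z)]  # bias-free coefficient on x is 1

# Error of the regression written with the bias-free coefficient:
# u = y - 1.0 * x = 2 * z, which itself contains x.
u = [yt - 1.0 * xt for xt, yt in zip(x, y)]

corr_wx = corr(w, x)                # relevance of the instrument
corr_wu = corr(w, u)                # should be zero for a valid instrument
iv_slope = cov(w, y) / cov(w, x)    # simple instrumental-variables estimate

print(f"corr(w, x) = {corr_wx:.3f}")
print(f"corr(w, u) = {corr_wu:.3f}")
print(f"IV slope   = {iv_slope:.3f}  (bias-free value is 1)")
```

In this setup the instrument inherits a correlation with the error precisely because it is correlated with `x`, and the instrumental-variables estimate settles near the total coefficient of 2 rather than the bias-free value of 1.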

The time-varying coefficients are decomposed to give consistent estimators of the bias-free components of coefficients in a model which is misspecified in terms of its functional form, its excluded variables and measurement error.¹³ The key to this decomposition is to use a set of observable variables, called coefficient drivers, which explain the time variation in the coefficients. It is assumed that the regressors of (A1) are conditionally independent of its coefficients given the coefficient drivers. This set of coefficient drivers can be split into three subsets so that one subset, say the first subset, should be correlated with any true variation in the bias-free component while the other two subsets, say the second and third subsets, should be correlated with the biases that are present.¹⁴ Once this is achieved we can estimate the biases, which come from the second and third subsets of coefficient drivers. We remove the estimates of biases from the estimates of total coefficients to obtain a consistent estimator of the underlying bias-free components. These second and third subsets of coefficient drivers act rather like the dual of conventional instruments. The key difference, however, is that some of these drivers should be correlated with the misspecifications rather than uncorrelated with an error term, as in the case of instruments, and this should be much easier to achieve in a real world situation.
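The role of coefficient drivers can be sketched in a stylised one-driver case. Everything below is an assumption made for illustration: the total slope is taken to be `pi10 + pi11 * z_t`, with the driver `z` tracking only the bias, and plain least squares stands in for whatever estimator is used in practice. It is not the authors' procedure.

```python
import random

rng = random.Random(1)
T = 5000

def solve(A, b):
    """Solve A beta = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients for a design matrix given as a list of rows."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yt for r, yt in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

x = [rng.gauss(0.0, 1.0) for _ in range(T)]    # regressor (e.g. a forward premium)
z = [rng.gauss(-2.0, 1.0) for _ in range(T)]   # hypothetical driver tracking the bias

# Total slope = bias-free part (1.0) + bias (0.8 * z_t).
y = [(1.0 + 0.8 * zt) * xt + rng.gauss(0.0, 0.1) for xt, zt in zip(x, z)]

# Constant-coefficient regression mixes the bias into the slope.
_, slope_fixed = ols([[1.0, xt] for xt in x], y)

# Substituting the driver equation into the model gives regressors 1, x, z*x.
pi00, pi10, pi11 = ols([[1.0, xt, zt * xt] for xt, zt in zip(x, z)], y)

print(f"constant-coefficient slope:     {slope_fixed:.3f}")
print(f"bias-free component (pi10):     {pi10:.3f}")
print(f"driver coefficient, bias (pi11): {pi11:.3f}")
```

Here the constant-coefficient slope is negative even though the bias-free coefficient is 1, and removing the part of the slope attributed to the driver recovers a value close to 1. The example works only because the driver was built to capture the bias exactly; choosing drivers that do so in real data is the substantive step.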

¹ Swamy et al. (2010), in turn, draw on papers by Swamy and Tavlas (2001), Chang, Hallahan and Swamy (1992) and Chang et al. (2000).
² In their survey, MacDonald and Torrance (1989) noted that the coefficient of β usually turned out to be close to –1.
³ See Chakraborty (2007) for a formal analysis of how learning may affect the coefficient.
⁴ Learning could take many specific forms. For example, the error term in (4) could be serially correlated because of slow learning, in which case (4) becomes $s_{t+k} - s_t = \alpha_t + \beta_t (f_t - s_t) + u_{t+k}$, where $u_{t+k} = \varepsilon_{t+k} + \rho u_t$. If ρ→1, the variables will tend to exhibit near-unit-root behavior, as in Maynard and Phillips (2001). It is also possible that the speed of learning is affected by the size of the error, which would create serious nonlinearities. If $s_{t+k} - s_t = \alpha_t + \beta_t (f_t - s_t) + u_{t+k}$, where $u_{t+k} = \varepsilon_{t+k} + \rho_t u_t$ and $\rho_t = f(u_t)$, so that large errors cause faster learning than small errors, and where f could be either a continuous function or a threshold function, we generate both near-unit-root behavior and nonlinearities. Fortunately, these and all other possibilities may be captured in Equation (5) through the time-varying coefficients.
⁵ Following Pratt and Schlaifer (1984), we can show that the coefficients and error term of (4) or (5) are not unique and have multiple forms. Because of its nonuniqueness, the error term in one form can be shown to be correlated and in another form may or may not be correlated with any included regressors.
⁶ This is also the case for the other currencies considered.
⁷ We follow the approach of Swamy et al. (2010).
⁸ That is, the number of determinants is itself time-variant.
⁹ The differences between Equations (A2) and (A3) and those in Swamy and Tavlas (2007) arise as a direct consequence of our division of the set of the regressors of (A1) into $S_1$ and $S_2$ in this paper.
¹⁰ These correlations are typically ignored in the analyses of state-space models. Thus, inexpressive conditions and restrictive functional forms are avoided in arriving at Equations (A2) and (A3) so that Theorem 1 can easily hold; for further discussion and interpretation of the terms in (A2) and (A3), see Swamy and Tavlas (2001).
¹¹ However, the sum of the components of $\gamma_{jt}$ is unique when their correct interpretations given by (A2) and (A3) are adopted (see Swamy and Tavlas, 2007: p. 300).
¹² We use the term spurious in a more general sense than Granger and Newbold’s (1974), where it strictly applies to linear models expressed in terms of integrated variables. Here we mean any correlation which is observed between two variables when the true bias-free component of the coefficient in the regression of one on the other is actually zero.
¹³ See Swamy et al. (2010).
¹⁴ Effectively, the coefficient drivers absorb the specification errors.
¹⁾ The views expressed are those of the authors and not those of their respective institutions.

References

Baillie, R. T., and T. Bollerslev. 1994. “The Long Memory of the Forward Premium.” Journal of International Money and Finance 11: 208–219. doi:10.1016/0261-5606(94)90005-1

Baillie, R. T., and T. Bollerslev. 2000. “The Forward Premium Anomaly is Not as Bad as You Think.” Journal of International Money and Finance 19: 471–488. doi:10.1016/S0261-5606(00)00018-8

Baillie, R. T., and R. Kilic. 2006. “Do Asymmetric and Nonlinear Adjustments Explain the Forward Premium Anomaly?” Journal of International Money and Finance 25: 22–47. doi:10.1016/j.jimonfin.2005.10.002

Bilson, J. F. O. 1981. “The ‘Speculative Efficiency’ Hypothesis.” The Journal of Business 54: 435–451. doi:10.1086/296139

Chakraborty, A. 2007. “Learning, Forward Premium Puzzle and Exchange Rate Fundamentals Under Sticky Prices.” Economics Bulletin 34: 1–13.

Chang, I.-L., C. Hallahan, and P. A. V. B. Swamy. 1992. “Efficient Computation of Coefficient Models.” In: Computational Economics and Econometrics, edited by H. M. Amman, D. A. Belsley, and L. F. Pau, 43–54. Boston, MA: Kluwer Academic Press. doi:10.1007/978-94-011-3162-9_4

Chang, I., P. A. V. B. Swamy, C. Hallahan, and G. S. Tavlas. 2000. “A Computational Approach to Finding Causal Economic Laws.” Computational Economics 16: 105–136. doi:10.1023/A:1008709704755

Chao, J. C., and N. R. Swanson. 2003. “Consistent Estimation with a Large Number of Weak Instruments.” Econometrica 73: 1673–1692. doi:10.1111/j.1468-0262.2005.00632.x

Chin, M. D., and G. Meredith. 2004. “Monetary Policy and the Long Horizon Uncovered Interest Parity.” IMF Staff Papers 51: 409–430.

Engel, C. 1996. “The Forward Discount Anomaly and the Risk Premium: A Survey of Recent Evidence.” Journal of Empirical Finance 3: 123–192. doi:10.1016/0927-5398(95)00016-X

Fama, E. F. 1984. “Forward and Spot Exchange Rates.” Journal of Monetary Economics 14: 319–338. doi:10.1016/0304-3932(84)90046-1

Frankel, J., and J. Poonawala. 2010. “The Forward Market in Emerging Currencies: Less Biased than in Major Currencies.” Journal of International Money and Finance 29: 585–598. doi:10.1016/j.jimonfin.2009.11.004

Froot, K. A., and J. A. Frankel. 1989. “Forward Discount Bias: Is it an Exchange Risk Premium?” The Quarterly Journal of Economics 104: 139–161. doi:10.2307/2937838

Granger, C. W. J. 2008. “Non-linear Models: Where we go Next-Time Varying Parameters.” Studies in Non-Linear Dynamics and Econometrics 12: 1–10. doi:10.2202/1558-3708.1639

Granger, C. W. J., and P. Newbold. 1974. “Spurious Regressions In Econometrics.” Journal of Econometrics 2: 111–120. doi:10.1016/0304-4076(74)90034-7

Greene, W. H. 2003. Econometric Analysis, 5th edition. Upper Saddle River, New Jersey: Prentice Hall.

Hall, S. G., P. A. V. B. Swamy, and G. S. Tavlas. 2009. “The Nonexistence of Instrumental Variables.” Discussion Papers in Economics 09/16, Department of Economics, University of Leicester.

Hall, S. G., P. A. V. B. Swamy, and G. S. Tavlas. 2012. “Generalized Cointegration: A New Concept with an Application to Health Expenditure and Health Outcomes.” Empirical Economics 42(2): 603–618. doi:10.1007/s00181-011-0483-y

Hausman, J. A. 1975. “An Instrumental Variable Approach to Full Information Estimators for Linear and Certain Nonlinear Econometric Models.” Econometrica 43: 727–738. doi:10.2307/1913081

Hodrick, R. J., and S. Srivastava. 1986. “The Covariation of Risk Premiums and Expected Future Spot Exchange Rates.” Journal of International Money and Finance 5: 5–21. doi:10.1016/0261-5606(86)90015-X

Lewis, K. K. 1995. “Puzzles in International Financial Markets.” Handbook of International Economics 3: 1913–1971. doi:10.1016/S1573-4404(05)80017-6

MacDonald, R., and T. S. Torrance. 1989. “Some Survey Based Tests of Uncovered Interest Parity.” In: Exchange Rates and Open Economy Macroeconomic Models, edited by R. MacDonald and M. P. Taylor. Oxford: Blackwell.

Maynard, A., and P. C. B. Phillips. 2001. “Rethinking an Old Empirical Puzzle: Empirical Evidence on the Forward Discount Anomaly.” Journal of Applied Econometrics 16: 671–708. doi:10.1002/jae.624

Pratt, J. W., and R. Schlaifer. 1984. “On the Nature and Discovery of Structure.” Journal of the American Statistical Association 79: 9–22. doi:10.1080/01621459.1984.10477054

Pratt, J. W., and R. Schlaifer. 1988. “On the Interpretation and Observation of Laws.” Journal of Econometrics 39: 23–52. doi:10.1016/0304-4076(88)90039-5

Sakoulis, G., E. Zivot, and K. Choi. 2010. “Structural Change in the Forward Discount: Implications for the Forward Rate Unbiasedness Hypothesis.” Journal of Empirical Finance 17: 957–966. doi:10.1016/j.jempfin.2010.08.001

Sarno, L., G. Valente, and H. Leon. 2006. “Nonlinearity in Deviations from Uncovered Interest Parity: An Explanation of the Forward Bias Puzzle.” Review of Finance 10: 443–482. doi:10.1007/s10679-006-9001-z

Stock, J. H., and M. Yogo. 2001. “Testing for Weak Instruments in Linear IV Regression.” NBER Working Papers No. 0284. doi:10.3386/t0284

Stock, J. H., and M. W. Watson. 2003. Introduction to Econometrics. Boston: Addison Wesley.

Swamy, P. A. V. B., and G. S. Tavlas. 2001. “Random Coefficient Models.” In: A Companion to Theoretical Econometrics, edited by B. H. Baltagi, 410–428. Malden: Blackwell. doi:10.1002/9780470996249.ch20

Swamy, P. A. V. B., and G. S. Tavlas. 2007. “The New Keynesian Philips Curve and Inflation Expectations: Re-Specification and Interpretation.” Economic Theory 31: 293–306. doi:10.1007/s00199-006-0100-z

Swamy, P. A. V. B., G. S. Tavlas, S. G. Hall, and G. Hondroyiannis. 2010. “Estimation of Parameters in the Presence of Model Misspecification and Measurement Error.” Studies in Nonlinear Dynamics & Econometrics 14: 1–33. doi:10.2202/1558-3708.1743

Published Online: 2013-04-16

Published in Print: 2013-05-01
