On the Robustness of Coefficient Estimates to the Inclusion of Proxy Variables

Christopher R. Bollinger; Jenny Minier

doi:10.1515/jem-2012-0008


Published/Copyright: March 13, 2014


From the journal Journal of Econometric Methods, Volume 4, Issue 1

Abstract

This paper considers the use of multiple proxy measures for an unobserved variable and contrasts the approach taken in the measurement error literature with that of the model specification literature. We find that including all available proxy variables in the regression minimizes the bias on coefficients of correctly measured variables in the regression. We derive a set of bounds for all parameters in the model, and compare these results to extreme bounds analysis. Monte Carlo simulations demonstrate the performance of our bounds relative to extreme bounds. We conclude with an empirical example from the cross-country growth literature in which human capital is measured through three proxy variables (literacy rates, primary school enrollment, and secondary school enrollment), and show that our approach yields results that contrast sharply with extreme bounds analysis.

Keywords: cross-country growth regressions; econometric bounds; latent variable; measurement error

JEL Codes: C4; C51; O47

Corresponding author: Christopher R. Bollinger, Department of Economics, University of Kentucky, Lexington, KY 40506, USA, E-mail: crboll@uky.edu

Acknowledgment

We thank Helle Bunzel, Steven Durlauf, Josh Ederington, Per Hjerstrand, Brian Krauth, Brent Krieder, Mike McCracken, John Pepper, Shinichi Sakata, Justin Tobias, Ken Troske, Tom Wansbeek, Hendrik Wolff, Jim Ziliak, and participants in seminars at the Universities of California, Berkeley and Santa Cruz, University of Oregon, University of Washington, Iowa State University, IUPUI, the International Measurement Error Conference, Canadian Economics Association, and the Southern Economic Association meetings for helpful comments and discussion.

Appendix: Proofs

Let

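$$V\begin{pmatrix} Z_{1i} \\ Z_{2i} \end{pmatrix} = \begin{bmatrix} V_1 & C \\ C' & V_2 \end{bmatrix}.$$

The matrix $V_1$ is the $k \times k$ variance matrix of $Z_{1i}$, $C$ is the $k \times 1$ covariance between $Z_{1i}$ and $Z_{2i}$, and $V_2$ is the scalar variance of $Z_{2i}$. Let $\delta$ be an arbitrary $l \times 1$ vector such that $\delta'\rho = \gamma > 0$ for some given value $\gamma$. Let $\theta = \beta/\gamma$.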

The next three Lemmas establish key results for Proposition 3.

Lemma 1. Expressions for $(\alpha - a)$ and $(\theta - t)$.

Then

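$$\begin{bmatrix} a \\ t \end{bmatrix} = \begin{bmatrix} V_1 & \gamma C \\ \gamma C' & \gamma^2 V_2 + (\delta'\Sigma\delta) \end{bmatrix}^{-1} \begin{bmatrix} V_1\alpha + \theta\gamma C \\ \gamma C'\alpha + \theta\gamma^2 V_2 \end{bmatrix}.$$

Rewriting yields

$$\begin{bmatrix} V_1 & \gamma C \\ \gamma C' & \gamma^2 V_2 + (\delta'\Sigma\delta) \end{bmatrix} \begin{bmatrix} a \\ t \end{bmatrix} = \begin{bmatrix} V_1\alpha + \theta\gamma C \\ \gamma C'\alpha + \theta\gamma^2 V_2 \end{bmatrix},$$

which is equivalent to

$$\begin{bmatrix} V_1 & \gamma C \\ \gamma C' & \gamma^2 V_2 \end{bmatrix}^{-1} \begin{bmatrix} V_1 & \gamma C \\ \gamma C' & \gamma^2 V_2 + (\delta'\Sigma\delta) \end{bmatrix} \begin{bmatrix} a \\ t \end{bmatrix} = \begin{bmatrix} V_1 & \gamma C \\ \gamma C' & \gamma^2 V_2 \end{bmatrix}^{-1} \begin{bmatrix} V_1\alpha + \theta\gamma C \\ \gamma C'\alpha + \theta\gamma^2 V_2 \end{bmatrix}.$$

This yields

$$\begin{bmatrix} I & \gamma (V_1 - C V_2^{-1} C')^{-1} C \left(1 - (\gamma^2 V_2)^{-1}\left(\gamma^2 V_2 + (\delta'\Sigma\delta)\right)\right) \\ 0 & \left(\gamma^2 (V_2 - C' V_1^{-1} C)\right)^{-1} \left(\gamma^2 V_2 + (\delta'\Sigma\delta) - \gamma^2 C' V_1^{-1} C\right) \end{bmatrix} \begin{bmatrix} a \\ t \end{bmatrix} = \begin{bmatrix} \alpha \\ \theta \end{bmatrix}.$$

Noting that $V_2$, $\gamma$, and $(\delta'\Sigma\delta)$ are all scalars, this can be written as

$$\begin{bmatrix} I & -\gamma (V_1 - C V_2^{-1} C')^{-1} C \left(\dfrac{(\delta'\Sigma\delta)}{\gamma^2 V_2}\right) \\ 0' & 1 + \dfrac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C)} \end{bmatrix} \begin{bmatrix} a \\ t \end{bmatrix} = \begin{bmatrix} \alpha \\ \theta \end{bmatrix}.$$

Rearranging gives

$$\begin{bmatrix} a - \gamma (V_1 - C V_2^{-1} C')^{-1} C \left(\dfrac{(\delta'\Sigma\delta)}{\gamma^2 V_2}\right) t \\ \left(1 + \dfrac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C)}\right) t \end{bmatrix} = \begin{bmatrix} \alpha \\ \theta \end{bmatrix}.$$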

Thus

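$$(a - \alpha) = \gamma (V_1 - C V_2^{-1} C')^{-1} C \, \frac{V_2 - C' V_1^{-1} C}{V_2} \, \frac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)} \, \theta = (V_1 - C V_2^{-1} C')^{-1} C \, \frac{V_2 - C' V_1^{-1} C}{V_2} \, \frac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)} \, \beta,$$

and

$$(t - \theta) = \left(\frac{\gamma^2 (V_2 - C' V_1^{-1} C)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)}\right)\theta - \theta = -\left(\frac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)}\right)\theta.$$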

QED.
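As a sanity check on Lemma 1, the following minimal Python sketch solves the population moment equations directly in the scalar case k = 1 and compares the result with the two closed-form expressions. The parameter values and the helper names `solve_2x2` and `lemma1_gaps` are illustrative assumptions, not taken from the paper; `err_var` stands in for the scalar δ′Σδ.

```python
# Numerical check of Lemma 1 for k = 1, so V1 and C are plain numbers.
# All numbers and function names here are illustrative assumptions.

def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return (s * rhs[0] - q * rhs[1]) / det, (p * rhs[1] - r * rhs[0]) / det


def lemma1_gaps(V1, C, V2, alpha, beta, gamma, err_var):
    """Return (a - alpha, its closed form, t - theta, its closed form)."""
    theta = beta / gamma
    # Moment equations for the regression of y on Z1 and X^delta.
    moments = ((V1, gamma * C), (gamma * C, gamma**2 * V2 + err_var))
    rhs = (V1 * alpha + theta * gamma * C,
           gamma * C * alpha + theta * gamma**2 * V2)
    a, t = solve_2x2(moments, rhs)

    D = V2 - C * C / V1                      # V2 - C' V1^{-1} C
    shrink = err_var / (gamma**2 * D + err_var)
    a_gap = C / (V1 - C * C / V2) * (D / V2) * shrink * beta
    t_gap = -shrink * theta
    return a - alpha, a_gap, t - theta, t_gap


gaps = lemma1_gaps(V1=2.0, C=0.5, V2=1.0, alpha=1.0, beta=2.0,
                   gamma=1.0, err_var=0.5)
print(gaps)
```

Under these assumed values the directly solved gaps and the closed forms agree to floating-point precision, and the gap in t has the opposite sign to θ, which is the attenuation the lemma describes.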

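Lemma 2. The term $\dfrac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)}$ is positive and increasing in $(\delta'\Sigma\delta)$.

The term $\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)$ is positive provided that $\gamma \neq 0$ and $\Sigma$ is positive semi-definite. The term $V_2 - C' V_1^{-1} C$ is the Schur complement of $V_1$ in $V(Z_{1i}, Z_{2i})$, that is, the determinant of $V(Z_{1i}, Z_{2i})$ divided by the determinant of $V_1$, and so is, by necessary assumption, positive. The term $(\delta'\Sigma\delta)$ will be non-negative provided $\Sigma$ is positive semi-definite. The derivative with respect to the term $(\delta'\Sigma\delta)$ is

$$\frac{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta) - (\delta'\Sigma\delta)}{\left(\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)\right)^2} = \frac{\gamma^2 (V_2 - C' V_1^{-1} C)}{\left(\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)\right)^2} > 0.$$

Hence the inconsistency in both $a$ and $t$ is increasing in $(\delta'\Sigma\delta)$. QED.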
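The monotonicity in Lemma 2 can be illustrated numerically. The short Python sketch below evaluates the term on a grid of error variances and compares a finite-difference slope with the derivative given in the proof. The values of `gamma` and `D` (standing in for V₂ − C′V₁⁻¹C) and the function names are assumptions chosen only for illustration.

```python
# Illustrative check of Lemma 2; gamma, D and the grid are assumed values.

def attenuation_term(err_var, gamma, D):
    """err_var / (gamma^2 D + err_var), with err_var = delta' Sigma delta."""
    return err_var / (gamma**2 * D + err_var)


def derivative(err_var, gamma, D):
    """Derivative of the term with respect to err_var, as in the proof."""
    return gamma**2 * D / (gamma**2 * D + err_var) ** 2


gamma, D = 1.5, 0.875
grid = [0.0, 0.1, 0.5, 1.0, 5.0, 50.0]
values = [attenuation_term(s, gamma, D) for s in grid]

# Central finite difference at one interior point of the grid.
h = 1e-6
numeric_slope = (attenuation_term(0.5 + h, gamma, D)
                 - attenuation_term(0.5 - h, gamma, D)) / (2 * h)
print(values, numeric_slope, derivative(0.5, gamma, D))
```

The values rise from 0 toward 1 as the error variance grows, so a noisier proxy index moves both coefficients further from their targets.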

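Lemma 3. The solution to $\min_{\delta} (\delta'\Sigma\delta)$ s.t. $\delta'\rho = \gamma$ is $\delta = \gamma \Sigma^{-1}\rho (\rho'\Sigma^{-1}\rho)^{-1}$.

The Lagrangian is

$$(\delta'\Sigma\delta) - \lambda(\delta'\rho - \gamma).$$

The first-order conditions are

$$2\Sigma\delta - \lambda\rho = 0, \qquad \delta'\rho - \gamma = 0.$$

Solving:

$$\delta = \tfrac{1}{2}\lambda \Sigma^{-1}\rho, \qquad \tfrac{1}{2}\rho'\Sigma^{-1}\rho \, \lambda = \gamma.$$

Substitution yields

$$\delta = \gamma \Sigma^{-1}\rho (\rho'\Sigma^{-1}\rho)^{-1}, \qquad \lambda = 2\gamma (\rho'\Sigma^{-1}\rho)^{-1}.$$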

QED.
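A small numerical illustration of Lemma 3 with two proxies (l = 2) is sketched below in plain Python. The matrix `sigma`, the vector `rho`, and all helper names are assumed for illustration. With l = 2 the constraint set is a line, so every other feasible δ equals the candidate solution plus a multiple of a vector orthogonal to ρ.

```python
# Illustrative check of Lemma 3 for l = 2; all inputs are assumed values.

def inv_2x2(m):
    """Inverse of a 2x2 matrix."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return ((s / det, -q / det), (-r / det, p / det))


def mat_vec(m, v):
    return tuple(sum(x * y for x, y in zip(row, v)) for row in m)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def optimal_delta(sigma, rho, gamma):
    """delta = gamma * Sigma^{-1} rho / (rho' Sigma^{-1} rho)."""
    sinv_rho = mat_vec(inv_2x2(sigma), rho)
    scale = gamma / dot(rho, sinv_rho)
    return tuple(scale * x for x in sinv_rho)


def error_variance(sigma, delta):
    """delta' Sigma delta."""
    return dot(delta, mat_vec(sigma, delta))


sigma = ((1.0, 0.3), (0.3, 2.0))
rho = (1.0, 0.5)
gamma = 1.0

delta_star = optimal_delta(sigma, rho, gamma)
min_var = error_variance(sigma, delta_star)

# Move along the constraint: orth is orthogonal to rho, so delta' rho is unchanged.
orth = (-rho[1], rho[0])
perturbed = tuple(d + 0.1 * o for d, o in zip(delta_star, orth))
print(delta_star, min_var, error_variance(sigma, perturbed))
```

Substituting the solution back into the objective gives a minimized error variance of γ²/(ρ′Σ⁻¹ρ), which the sketch reproduces for the assumed inputs.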

Proof. The proof of proposition 1 follows from the details in the text combined with the above lemmas. ■

Proof. Proof of Corollary 1. Substitution of the results from proposition 1 into the expressions in Lemma 1 yields

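$$(\delta'\Sigma\delta) = \frac{\gamma^2 \rho'\Sigma^{-1}\Sigma\Sigma^{-1}\rho}{(\rho'\Sigma^{-1}\rho)^2} = \frac{\gamma^2}{(\rho'\Sigma^{-1}\rho)}.$$

From Lemma 1 we have that

$$(t - \theta) = -\theta\left(\frac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)}\right).$$

Alternatively,

$$t = \theta\left(1 - \frac{(\delta'\Sigma\delta)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)}\right) = \frac{\beta}{\gamma}\left(\frac{\gamma^2 (V_2 - C' V_1^{-1} C)}{\gamma^2 (V_2 - C' V_1^{-1} C) + (\delta'\Sigma\delta)}\right).$$

Substitute the optimal choice of $\delta$ from proposition 1 which yields

$$t = \frac{\beta}{\gamma}\left(\frac{\gamma^2 (V_2 - C' V_1^{-1} C)}{\gamma^2 (V_2 - C' V_1^{-1} C) + \gamma^2 (\rho'\Sigma^{-1}\rho)^{-1}}\right) = \frac{\beta}{\gamma}\left(\frac{(V_2 - C' V_1^{-1} C)}{(V_2 - C' V_1^{-1} C) + (\rho'\Sigma^{-1}\rho)^{-1}}\right).$$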

Hence, by choosing

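$$\gamma = \frac{(V_2 - C' V_1^{-1} C)}{(V_2 - C' V_1^{-1} C) + (\rho'\Sigma^{-1}\rho)^{-1}} = \left(1 + \frac{1}{(\rho'\Sigma^{-1}\rho)(V_2 - C' V_1^{-1} C)}\right)^{-1},$$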

we have t=β: no inconsistency in the coefficient on X^δ. QED ■

Lemma 4 (Sherman-Morrison-Woodbury Matrix Inversion Lemma). If $A$ and $B$ are non-singular matrices, and $X$ is conformable, then $(A + XBX')^{-1} = A^{-1} - A^{-1}X(B^{-1} + X'A^{-1}X)^{-1}X'A^{-1}$.
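The identity in Lemma 4 can be checked numerically in the rank-one form used in the proof below, with A = Σ, X = ρ, and B a positive scalar. The following Python sketch assumes a symmetric 2×2 Σ; the inputs and function names are illustrative only.

```python
# Rank-one check of the matrix inversion lemma: A = sigma (symmetric 2x2),
# X = rho (2x1), B = d (positive scalar). All inputs are assumed values.

def inv_2x2(m):
    """Inverse of a 2x2 matrix."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def direct_inverse(sigma, rho, d):
    """Invert sigma + rho * d * rho' directly."""
    m = [[sigma[i][j] + d * rho[i] * rho[j] for j in range(2)] for i in range(2)]
    return inv_2x2(m)


def lemma_inverse(sigma, rho, d):
    """sigma^{-1} - sigma^{-1} rho (1/d + rho' sigma^{-1} rho)^{-1} rho' sigma^{-1}."""
    s_inv = inv_2x2(sigma)
    # u = sigma^{-1} rho; because sigma is symmetric, rho' sigma^{-1} = u'.
    u = [s_inv[i][0] * rho[0] + s_inv[i][1] * rho[1] for i in range(2)]
    core = 1.0 / (1.0 / d + rho[0] * u[0] + rho[1] * u[1])
    return [[s_inv[i][j] - core * u[i] * u[j] for j in range(2)] for i in range(2)]


sigma = [[1.0, 0.3], [0.3, 2.0]]
rho = [1.0, 0.5]
d = 0.875

lhs = direct_inverse(sigma, rho, d)
rhs = lemma_inverse(sigma, rho, d)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The practical appeal of the lemma here is that the inner inverse is a scalar, so expressions involving (Σ + ρ(V₂ − C′V₁⁻¹C)ρ′)⁻¹ collapse to ratios of scalars.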

Proof. Proof of Proposition 2:

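The linear regression of $y_i$ on $Z_{1i}$ and $X_i$ yields slope coefficients consistent for

$$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{bmatrix} V_1 & C\rho' \\ \rho C' & (\rho\rho' V_2 + \Sigma) \end{bmatrix}^{-1} \begin{bmatrix} V_1\alpha + C\beta \\ \rho C'\alpha + \rho V_2 \beta \end{bmatrix}.$$

Rewriting yields

$$\begin{bmatrix} V_1 & C\rho' \\ \rho C' & (\rho\rho' V_2 + \Sigma) \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} V_1\alpha + C\beta \\ \rho C'\alpha + \rho V_2 \beta \end{bmatrix},$$

which is equivalent to

$$\begin{bmatrix} V_1 & C\rho' \\ \rho C' & I\rho'\rho V_2 \end{bmatrix}^{-1} \begin{bmatrix} V_1 & C\rho' \\ \rho C' & (\rho\rho' V_2 + \Sigma) \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} V_1 & C\rho' \\ \rho C' & I\rho'\rho V_2 \end{bmatrix}^{-1} \begin{bmatrix} V_1\alpha + C\beta \\ \rho C'\alpha + \rho V_2 \beta \end{bmatrix},$$

where $I$ is the identity matrix of appropriate dimensions. The inverse of the leading matrix (a partitioned matrix) can be written as

$$\begin{bmatrix} (V_1 - C\rho'(I\rho'\rho V_2)^{-1}\rho C')^{-1} & -(V_1 - C\rho'(I\rho'\rho V_2)^{-1}\rho C')^{-1} C\rho'(I\rho'\rho V_2)^{-1} \\ -(I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1}\rho C' V_1^{-1} & (I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1} \end{bmatrix}.$$

Since $\rho'\rho V_2$ is a scalar, this reduces to

$$\begin{bmatrix} (V_1 - C V_2^{-1} C')^{-1} & -(V_1 - C V_2^{-1} C')^{-1} C\rho'(\rho'\rho V_2)^{-1} \\ -(I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1}\rho C' V_1^{-1} & (I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1} \end{bmatrix}.$$

Substitution and simplification yields

$$\begin{bmatrix} I & -(V_1 - C V_2^{-1} C')^{-1}\left(C\rho'(\rho'\rho V_2)^{-1}\Sigma\right) \\ 0 & (I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1}\left(\rho (V_2 - C' V_1^{-1} C)\rho' + \Sigma\right) \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \alpha \\ (I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1}\rho (V_2 - C' V_1^{-1} C)\beta \end{bmatrix},$$

$$\begin{bmatrix} a - (V_1 - C V_2^{-1} C')^{-1}\left(C\rho'(\rho'\rho V_2)^{-1}\Sigma\right) b \\ (I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1}\left(\rho (V_2 - C' V_1^{-1} C)\rho' + \Sigma\right) b \end{bmatrix} = \begin{bmatrix} \alpha \\ (I\rho'\rho V_2 - \rho C' V_1^{-1} C\rho')^{-1}\rho (V_2 - C' V_1^{-1} C)\beta \end{bmatrix}.$$

We can write

$$b = \left(\rho (V_2 - C' V_1^{-1} C)\rho' + \Sigma\right)^{-1}\rho (V_2 - C' V_1^{-1} C)\beta,$$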

and

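$$a = \alpha + (V_1 - C V_2^{-1} C')^{-1}\left(C\rho'(\rho'\rho V_2)^{-1}\Sigma\right) \times \left(\rho (V_2 - C' V_1^{-1} C)\rho' + \Sigma\right)^{-1}\rho (V_2 - C' V_1^{-1} C)\beta.$$

Turning first to the term $a$ and applying the Sherman-Morrison-Woodbury Matrix Inversion Lemma:

$$a = \alpha + (V_1 - C V_2^{-1} C')^{-1}\left(C\rho'(\rho'\rho V_2)^{-1}\Sigma\right) \times \left(\Sigma^{-1} - \Sigma^{-1}\rho\left((V_2 - C' V_1^{-1} C)^{-1} + \rho'\Sigma^{-1}\rho\right)^{-1}\rho'\Sigma^{-1}\right)\rho (V_2 - C' V_1^{-1} C)\beta.$$

Simplification yields

$$\begin{aligned} a &= \alpha + (V_1 - C V_2^{-1} C')^{-1} C \, \frac{(V_2 - C' V_1^{-1} C)}{V_2} \times \left(1 - \frac{\rho'\Sigma^{-1}\rho}{(V_2 - C' V_1^{-1} C)^{-1} + \rho'\Sigma^{-1}\rho}\right)\beta \\ &= \alpha + (V_1 - C V_2^{-1} C')^{-1} C \, \frac{(V_2 - C' V_1^{-1} C)}{V_2} \times \frac{(\rho'\Sigma^{-1}\rho)^{-1}}{(V_2 - C' V_1^{-1} C) + (\rho'\Sigma^{-1}\rho)^{-1}} \, \beta, \end{aligned}$$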

which is the expression for a when the error-variance-minimizing choice of δ is used to construct X^δ (See Corollary 2).

Turning now to b, consider

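$$\rho' b = \rho'\left(\rho (V_2 - C' V_1^{-1} C)\rho' + \Sigma\right)^{-1}\rho (V_2 - C' V_1^{-1} C)\beta.$$

Again using the Sherman-Morrison-Woodbury Matrix Inversion Lemma,

$$\begin{aligned} \rho' b &= \rho'\left(\Sigma^{-1} - \Sigma^{-1}\rho\left((V_2 - C' V_1^{-1} C)^{-1} + \rho'\Sigma^{-1}\rho\right)^{-1}\rho'\Sigma^{-1}\right)\rho (V_2 - C' V_1^{-1} C)\beta \\ &= \left(\rho'\Sigma^{-1}\rho - \rho'\Sigma^{-1}\rho\left((V_2 - C' V_1^{-1} C)^{-1} + \rho'\Sigma^{-1}\rho\right)^{-1}\rho'\Sigma^{-1}\rho\right)(V_2 - C' V_1^{-1} C)\beta \\ &= \frac{(V_2 - C' V_1^{-1} C)^{-1}(\rho'\Sigma^{-1}\rho) + (\rho'\Sigma^{-1}\rho)^2 - (\rho'\Sigma^{-1}\rho)^2}{(V_2 - C' V_1^{-1} C)^{-1} + \rho'\Sigma^{-1}\rho}\,(V_2 - C' V_1^{-1} C)\beta \\ &= \frac{(V_2 - C' V_1^{-1} C)}{(V_2 - C' V_1^{-1} C) + (\rho'\Sigma^{-1}\rho)^{-1}}\,\beta. \end{aligned}$$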

This is equal to the expression for $t$ when the error-variance-minimizing choice of $\delta$ is used to construct $X^\delta$ in Corollary 1 if $\gamma = 1$. QED ■
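The two closed forms derived in this proof can be checked against a direct solution of the population normal equations. The Python sketch below does so for one correctly measured regressor (k = 1) and two proxies (l = 2). The parameter values, the generic `solve` helper, and the variable names are assumptions made for illustration, not part of the paper.

```python
# Check of the closed forms for a and rho'b with k = 1 and l = 2.
# All numerical inputs are assumed values chosen for illustration.

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(v)] for row, v in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


V1, C, V2 = 2.0, 0.5, 1.0
alpha, beta = 1.0, 2.0
rho = (1.0, 0.5)
sigma = ((1.0, 0.3), (0.3, 2.0))

# Moment matrix of (Z1, X1, X2) and its covariance with y.
moments = [[V1, C * rho[0], C * rho[1]]]
for i in range(2):
    moments.append([rho[i] * C] +
                   [rho[i] * rho[j] * V2 + sigma[i][j] for j in range(2)])
rhs = [V1 * alpha + C * beta] + [r * (C * alpha + V2 * beta) for r in rho]
a, b1, b2 = solve(moments, rhs)

D = V2 - C * C / V1                                    # V2 - C' V1^{-1} C
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
r_quad = (sigma[1][1] * rho[0] ** 2 - 2 * sigma[0][1] * rho[0] * rho[1]
          + sigma[0][0] * rho[1] ** 2) / det           # rho' Sigma^{-1} rho

rho_b = rho[0] * b1 + rho[1] * b2
rho_b_formula = D / (D + 1 / r_quad) * beta
a_formula = alpha + C / (V1 - C * C / V2) * (D / V2) * (1 / r_quad) / (D + 1 / r_quad) * beta
print(a, a_formula, rho_b, rho_b_formula)
```

For the assumed inputs both pairs agree to floating-point precision, which is consistent with the claim that entering all proxies separately reproduces the coefficient on the correctly measured variable obtained with the variance-minimizing index.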


Published Online: 2014-3-13

Published in Print: 2015-1-1
