Selecting Instrumental Variables in a Data Rich Environment
Serena Ng
and Jushan Bai
Practitioners often have at their disposal a large number of instruments that are weakly exogenous for the parameter of interest. However, not every instrument has the same predictive power for the endogenous variable, and using too many instruments can induce bias. We consider two ways of handling these problems. The first is to form principal components from the observed instruments, and the second is to reduce the number of instruments by subset variable selection. For the latter, we consider boosting, a method that does not require an a priori ordering of the instruments. We also suggest a way to pre-order the instruments and then screen the instruments using the goodness of fit of the first stage regression and information criteria. We find that the principal components are often better instruments than the observed data except when the number of relevant instruments is small. While no single method dominates, a hard-thresholding method based on the t test generally yields estimates with small biases and small root-mean-squared errors.
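As an illustration of two of the ideas mentioned in the abstract, the following is a minimal sketch, not the authors' procedure. It assumes a single endogenous regressor, a simulated design in which only the first 5 of 50 instruments are relevant, a critical value of 2.0 for the t test, and 5 principal components. All function names, the data-generating process, and these tuning values are hypothetical choices made for the example.

```python
import numpy as np

def first_stage_t_stats(x, Z):
    """t statistic of each instrument in a separate regression of x on (1, z_j)."""
    n = len(x)
    t = np.empty(Z.shape[1])
    for j in range(Z.shape[1]):
        W = np.column_stack([np.ones(n), Z[:, j]])
        coef = np.linalg.lstsq(W, x, rcond=None)[0]
        resid = x - W @ coef
        sigma2 = resid @ resid / (n - 2)
        cov = sigma2 * np.linalg.inv(W.T @ W)
        t[j] = coef[1] / np.sqrt(cov[1, 1])
    return t

def hard_threshold(x, Z, crit=2.0):
    """Indices of instruments whose first-stage |t| exceeds crit (assumed cutoff)."""
    return np.flatnonzero(np.abs(first_stage_t_stats(x, Z)) > crit)

def principal_components(Z, k):
    """First k principal components of the standardized instruments."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    U, s, _ = np.linalg.svd(Zs, full_matrices=False)
    return U[:, :k] * s[:k]

def tsls(y, x, Z):
    """Two-stage least squares slope for one endogenous regressor plus intercept."""
    n = len(y)
    W = np.column_stack([np.ones(n), Z])
    x_hat = W @ np.linalg.lstsq(W, x, rcond=None)[0]   # first-stage fitted values
    X_hat = np.column_stack([np.ones(n), x_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][1]  # second-stage slope

# Simulated data (hypothetical design): true coefficient on x is 1.0.
rng = np.random.default_rng(0)
n, N = 500, 50
Z = rng.standard_normal((n, N))
v = rng.standard_normal(n)
u = 0.8 * v + 0.6 * rng.standard_normal(n)   # u correlated with v, so x is endogenous
x = Z[:, :5] @ np.full(5, 0.5) + v
y = 1.0 * x + u

selected = hard_threshold(x, Z)               # subset chosen by the t test
beta_ht = tsls(y, x, Z[:, selected])          # IV estimate with selected instruments
pcs = principal_components(Z, 5)              # principal components as instruments
beta_pc = tsls(y, x, pcs)
print(selected, beta_ht, beta_pc)
```

In this design the relevant instruments are few and the instruments have no common factor structure, so the principal components need not be strong instruments. The sketch is intended only to show the mechanics of each step.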
©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in this issue
- Article
- Statistical Fourier Analysis: Clarifications and Interpretations
- Asymptotics of the QMLE for Non-Linear ARCH Models
- Price Level Convergence, Purchasing Power Parity and Multiple Structural Breaks in Panel Data Analysis: An Application to U.S. Cities
- Selecting Instrumental Variables in a Data Rich Environment
- The KPSS Test Using Fixed-b Critical Values: Size and Power in Highly Autocorrelated Time Series