Factor Modeling for High-Dimensional Interval-Valued Data

Yan Guo; Guchu Zou; Jianhong Wu

doi:10.1515/snde-2024-0019

Article

Published/Copyright: February 4, 2025

From the journal Studies in Nonlinear Dynamics & Econometrics

Abstract

The paper considers an approximate factor model for interval-valued panel data with both large numbers of cross-section units and time series observations. A ratio-type estimator is proposed for the number of interval-valued factors in the approximate factor model. A variant of the estimator is also suggested, which is robust to the case with dominant factors. Under certain conditions, the estimators can be proved to be consistent. Moreover, the estimators of interval-valued factors and the pooled loadings can be obtained by the principal component analysis method for point-valued data. Monte Carlo simulation studies show that the proposed estimators have the desired finite sample properties.

Keywords: approximate factor models; factor analysis; interval-valued data; principal components; the number of factors

JEL Classification: C01; C13

Corresponding author: Jianhong Wu, College of Mathematics and Science, Shanghai Normal University, Shanghai 200234, China; and Lab for Educational Big Data and Policymaking, Ministry of Education, Shanghai 200234, China, E-mail: wujianhong@shnu.edu.cn

Yan Guo and Guchu Zou contributed equally to this work.

Acknowledgments

We are deeply grateful to Professor Jeremy Piger and two anonymous referees for valuable comments that led to substantial improvement of this paper.

Research funding: This research is supported in part by the National Natural Science Foundation of China (Grant No. 72173086).

Appendix: Technical Details

Proof of Theorem 1.

Under Assumptions 1–3, it follows from Lemma A.11 of Ahn and Horenstein (2013) that, for any $k \le r$, $\tilde{\mu}_{NT,k}^{L} = O_p(1)$ and $\tilde{\mu}_{NT,k}^{R} = O_p(1)$. Thus, we have $\omega\tilde{\mu}_{NT,k}^{L} = O_p(1)$ and $(1-\omega)\tilde{\mu}_{NT,k}^{R} = O_p(1)$ for $\omega \in [0,1]$ and $k = 1, 2, \ldots, r$. Accordingly, for any weight $0 \le \omega \le 1$,

$$\frac{\omega\tilde{\mu}_{NT,k}^{L} + (1-\omega)\tilde{\mu}_{NT,k}^{R}}{\omega\tilde{\mu}_{NT,k+1}^{L} + (1-\omega)\tilde{\mu}_{NT,k+1}^{R}} = O_p(1), \quad k = 1, 2, \ldots, r-1.$$

Also, it follows from Lemma A.9 of Ahn and Horenstein (2013) that, for any $r+1 \le k \le [d^{c} m] - r$, $\tilde{\mu}_{NT,k}^{L} = O_p(1/m)$ and $\tilde{\mu}_{NT,k}^{R} = O_p(1/m)$. Then, we have $\omega\tilde{\mu}_{NT,k}^{L} = O_p(1/m)$ and $(1-\omega)\tilde{\mu}_{NT,k}^{R} = O_p(1/m)$ for any $r+1 \le k \le [d^{c} m] - r$ and $\omega \in [0,1]$. Accordingly,

$$\frac{\omega\tilde{\mu}_{NT,k}^{L} + (1-\omega)\tilde{\mu}_{NT,k}^{R}}{\omega\tilde{\mu}_{NT,k+1}^{L} + (1-\omega)\tilde{\mu}_{NT,k+1}^{R}} = O_p(1), \quad k = r+1, r+2, \ldots, r_{\max}.$$

However, when $k = r$,

$$\frac{\omega\tilde{\mu}_{NT,r}^{L} + (1-\omega)\tilde{\mu}_{NT,r}^{R}}{\omega\tilde{\mu}_{NT,r+1}^{L} + (1-\omega)\tilde{\mu}_{NT,r+1}^{R}} = O_p(m) \to \infty.$$

It then follows that

$$\lim_{m \to \infty} \Pr(\hat{r}_{\omega} = r) = 1.$$
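The argument above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that $\tilde{\mu}_{NT,k}^{L}$ and $\tilde{\mu}_{NT,k}^{R}$ are the $k$-th largest eigenvalues of $XX'/(NT)$ computed from the panels of left and right interval endpoints, and that $\hat{r}_{\omega}$ maximizes the weighted eigenvalue ratio over $k \le r_{\max}$. Both definitions belong to the main text, which is not reproduced here. The function names and the simulated design are hypothetical.

```python
import numpy as np


def scaled_eigenvalues(X):
    """Descending eigenvalues of X X' / (N T) for a T x N panel X."""
    T, N = X.shape
    # The nonzero eigenvalues of X X' and X' X coincide, so use the smaller Gram matrix.
    gram = X @ X.T if T <= N else X.T @ X
    vals = np.linalg.eigvalsh(gram / (N * T))  # returned in ascending order
    return np.clip(vals[::-1], 0.0, None)      # descending, tiny negatives clipped


def weighted_ratio_estimator(X_left, X_right, omega=0.5, r_max=8):
    """Return (r_hat, ratios) for the weighted eigenvalue ratio criterion.

    Assumed form: r_hat = argmax over k = 1..r_max of mu_k / mu_{k+1}, where
    mu_k = omega * mu_k^L + (1 - omega) * mu_k^R.
    """
    mu = omega * scaled_eigenvalues(X_left) + (1.0 - omega) * scaled_eigenvalues(X_right)
    ratios = mu[:r_max] / mu[1:r_max + 1]
    return int(np.argmax(ratios)) + 1, ratios


def simulate_panel(T=100, N=100, r=3, seed=0):
    """Hypothetical design: both endpoint panels share r common factors.

    The ordering left <= right is not enforced, because the eigenvalue
    computation does not depend on it; this is a simplification.
    """
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, r))
    X_left = F @ rng.standard_normal((N, r)).T + rng.standard_normal((T, N))
    X_right = F @ rng.standard_normal((N, r)).T + rng.standard_normal((T, N))
    return X_left, X_right


X_left, X_right = simulate_panel()
r_hat, ratios = weighted_ratio_estimator(X_left, X_right, omega=0.5, r_max=8)
print("estimated number of factors:", r_hat)
print("eigenvalue ratios:", np.round(ratios, 2))
```

In this design the first three weighted eigenvalues are of order one and the remaining ones are of order $1/m$, so the ratio at $k = 3$ stands out from the others, which is the pattern the proof relies on.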

Proof of Theorem 2.

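Let $\tilde{\mu}_{NT,k}^{*}$ be $\omega\tilde{\mu}_{NT,k}^{L} + (1-\omega)\tilde{\mu}_{NT,k}^{R}$. It follows from the proof of Theorem 1 that, under Assumptions 1–2, $\tilde{\mu}_{NT,k}^{*} = O_p(1)$ for $k \le r$ and $\tilde{\mu}_{NT,k}^{*} = O_p(1/m)$ for $k \ge r+1$. Therefore, for $k \le r$,

$$2\Phi\left(\tilde{\mu}_{NT,k}^{*}\right) - 1 = \int_{0}^{\tilde{\mu}_{NT,k}^{*}} \frac{2}{\sqrt{2\pi}}\, e^{-t^{2}/2}\, dt \ \ge\ \frac{2}{\sqrt{2\pi}}\, e^{-\left(\tilde{\mu}_{NT,k}^{*}\right)^{2}/2}\, \tilde{\mu}_{NT,k}^{*} = O_p(1).$$

For $r+1 \le k \le [d^{c} m] - r$, we have $2\Phi\left(\tilde{\mu}_{NT,k}^{*}\right) - 1 = O_p(1/m)$, because

$$2\Phi\left(\tilde{\mu}_{NT,k}^{*}\right) - 1 = \int_{0}^{\tilde{\mu}_{NT,k}^{*}} \frac{2}{\sqrt{2\pi}}\, e^{-t^{2}/2}\, dt \ \ge\ \frac{2}{\sqrt{2\pi}}\, e^{-\left(\tilde{\mu}_{NT,k}^{*}\right)^{2}/2}\, \tilde{\mu}_{NT,k}^{*} = O_p\left(\frac{1}{m}\right),$$

$$2\Phi\left(\tilde{\mu}_{NT,k}^{*}\right) - 1 = \int_{0}^{\tilde{\mu}_{NT,k}^{*}} \frac{2}{\sqrt{2\pi}}\, e^{-t^{2}/2}\, dt \ \le\ \frac{2}{\sqrt{2\pi}}\, \tilde{\mu}_{NT,k}^{*} = O_p\left(\frac{1}{m}\right).$$

Thus, we have

$$\frac{2\Phi\left(\tilde{\mu}_{NT,k}^{*}\right) - 1}{2\Phi\left(\tilde{\mu}_{NT,k+1}^{*}\right) - 1} = O_p(1), \quad k = 1, \ldots, r-1, r+1, \ldots, r_{\max}.$$

However, when $k = r$,

$$\frac{2\Phi\left(\tilde{\mu}_{NT,r}^{*}\right) - 1}{2\Phi\left(\tilde{\mu}_{NT,r+1}^{*}\right) - 1} = O_p(m) \to \infty.$$

It then follows that

$$\lim_{m \to \infty} \Pr(\tilde{r}_{\omega} = r) = 1.$$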
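The two bounds used in this proof, and the effect of the transformation $x \mapsto 2\Phi(x) - 1$ on the ratio criterion, can be checked with a short sketch. It assumes that $\tilde{r}_{\omega}$ maximizes the ratio of transformed weighted eigenvalues over $k \le r_{\max}$, which is how the proof reads but is not stated explicitly in this appendix. The eigenvalue sequence below is hypothetical and chosen only to mimic one dominant factor, two weaker factors, and a noise tail; it is not taken from the paper.

```python
import math


def transform(x):
    """2 * Phi(x) - 1 for x >= 0, computed via the identity erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0))


def ratio_argmax(values, r_max):
    """Return (k_hat, ratios), where k_hat maximizes values[k-1] / values[k] over k = 1..r_max."""
    ratios = [values[k] / values[k + 1] for k in range(r_max)]
    return ratios.index(max(ratios)) + 1, ratios


# Hypothetical weighted eigenvalues mu*_k: one dominant factor (50), two weaker
# factors (1.0, 0.8), then a slowly decaying noise tail. Illustrative values only.
mu_star = [50.0, 1.0, 0.8, 0.020, 0.019, 0.018, 0.017, 0.016, 0.015]

# Numerical check of the bounds used in the proof, for x >= 0:
#   (2 / sqrt(2 pi)) * x * exp(-x^2 / 2)  <=  2 Phi(x) - 1  <=  (2 / sqrt(2 pi)) * x
c = 2.0 / math.sqrt(2.0 * math.pi)
for x in mu_star:
    g = transform(x)
    assert c * x * math.exp(-x * x / 2.0) <= g + 1e-12
    assert g <= c * x + 1e-12

r_plain, plain_ratios = ratio_argmax(mu_star, r_max=8)
r_transformed, transformed_ratios = ratio_argmax([transform(x) for x in mu_star], r_max=8)
print("plain ratio picks k =", r_plain)
print("transformed ratio picks k =", r_transformed)
```

With these inputs the plain ratio is largest at $k = 1$, because the dominant eigenvalue dwarfs the second one, while the transformed ratio is largest at $k = 3$. The transformation is bounded by one, which caps the influence of the dominant eigenvalue, and it is approximately linear near zero, which leaves the $O_p(1/m)$ tail essentially unchanged.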

References

Ahn, S. C., and A. R. Horenstein. 2013. “Eigenvalue Ratio Test for the Number of Factors.” Econometrica 81 (3): 1203–27. https://doi.org/10.3982/ECTA8968.

Bai, J., and S. Ng. 2002. “Determining the Number of Factors in Approximate Factor Models.” Econometrica 70 (1): 191–221. https://doi.org/10.1111/1468-0262.00273.

Bai, J. 2003. “Inferential Theory for Factor Models of Large Dimensions.” Econometrica 71 (1): 135–71. https://doi.org/10.1111/1468-0262.00392.

Bai, J., and K. Li. 2016. “Maximum Likelihood Estimation and Inference for Approximate Factor Models of High Dimension.” Review of Economics and Statistics 98 (2): 298–309. https://doi.org/10.1162/rest_a_00519.

Billard, L., and E. Diday. 2000. “Regression Analysis for Interval-Valued Data.” In Data Analysis, Classification, and Related Methods, 369–74. Berlin: Springer. https://doi.org/10.1007/978-3-642-59789-3_58.

Billard, L., and E. Diday. 2002. “Symbolic Regression Analysis.” In Classification, Clustering, and Data Analysis: Recent Advances and Applications, 281–8. Berlin: Springer. https://doi.org/10.1007/978-3-642-56181-8_31.

Billard, L., and E. Diday. 2003. “From the Statistics of Data to the Statistics of Knowledge: Symbolic Data Analysis.” Journal of the American Statistical Association 98 (462): 470–87. https://doi.org/10.1198/016214503000242.

Dias, S., and P. Brito. 2017. “Off the Beaten Track: A New Linear Model for Interval Data.” European Journal of Operational Research 258 (3): 1118–30. https://doi.org/10.1016/j.ejor.2016.09.006.

González-Rivera, G., and W. Lin. 2013. “Constrained Regression for Interval-Valued Data.” Journal of Business and Economic Statistics 31 (4): 473–90. https://doi.org/10.1080/07350015.2013.818004.

Han, A., Y. Hong, and S. Wang. 2012. “Autoregressive Conditional Models for Interval-Valued Time Series Data.” In Proceedings of the 3rd International Conference on Singular Spectrum Analysis and its Applications.

Hukuhara, M. 1967. “Intégration des applications mesurables dont la valeur est un compact convexe.” Funkcialaj Ekvacioj 10 (3): 205–23.

Kaucher, E. 1980. “Interval Analysis in the Extended Interval Space IR.” Computing (Suppl 2): 33–49. https://doi.org/10.1007/978-3-7091-8577-3_3.

Lam, C., and Q. Yao. 2012. “Factor Modeling for High-Dimensional Time Series: Inference for the Number of Factors.” Annals of Statistics 40 (2): 694–726. https://doi.org/10.1214/12-aos970.

Lima Neto, E. A., and F. D. A. De Carvalho. 2008. “Centre and Range Method for Fitting a Linear Regression Model to Symbolic Interval Data.” Computational Statistics and Data Analysis 52 (3): 1500–15. https://doi.org/10.1016/j.csda.2007.04.014.

Lima Neto, E. A., and F. D. A. De Carvalho. 2010. “Constrained Linear Regression Models for Symbolic Interval-Valued Variables.” Computational Statistics and Data Analysis 54 (2): 333–47. https://doi.org/10.1016/j.csda.2009.08.010.

Onatski, A. 2010. “Determining the Number of Factors from Empirical Distribution of Eigenvalues.” Review of Economics and Statistics 92 (4): 1004–16. https://doi.org/10.1162/rest_a_00043.

Rodrigues, P. M., and N. Salish. 2015. “Modeling and Forecasting Interval Time Series with Threshold Models.” Advances in Data Analysis and Classification 9 (1): 41–57. https://doi.org/10.1007/s11634-014-0170-x.

Sun, Y., A. Han, Y. Hong, and S. Wang. 2018. “Threshold Autoregressive Models for Interval-Valued Time Series Data.” Journal of Econometrics 206 (2): 414–46. https://doi.org/10.1016/j.jeconom.2018.06.009.

Sun, Y., X. Zhang, A. T. Wan, and S. Wang. 2022. “Model Averaging for Interval-Valued Data.” European Journal of Operational Research 301 (2): 772–84. https://doi.org/10.1016/j.ejor.2021.11.015.

Wu, J. 2016. “Robust Determination for the Number of Common Factors in the Approximate Factor Models.” Economics Letters 144: 102–6. https://doi.org/10.1016/j.econlet.2016.04.026.

Wu, J. 2018. “Eigenvalue Difference Test for the Number of Common Factors in the Approximate Factor Models.” Economics Letters 169: 63–7. https://doi.org/10.1016/j.econlet.2018.05.009.

Xia, Q., W. Xu, and L. Zhu. 2015. “Consistently Determining the Number of Factors in Multivariate Volatility Modelling.” Statistica Sinica 25 (3): 1025–44. https://doi.org/10.5705/ss.2013.252.

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/snde-2024-0019).

Received: 2024-03-13

Accepted: 2024-12-19

Published Online: 2025-02-04
