On the performance of information criteria for model identification of count time series

Christian H. Weiß; Martin H.-J.M. Feld

doi:10.1515/snde-2018-0012

Abstract

Model fitting for count time series is of great relevance for many economic applications. Here, we focus on the step of model selection, where information criteria like AIC and BIC are commonly used in practice. Previous studies of their model selection abilities concentrated on real-valued time series; here, we comprehensively investigate AIC and BIC in a count time series context. In our simulations, we consider diverse model selection scenarios, such as the identification of serial (in)dependence, overdispersion, zero inflation or a trend, order selection within a given model family, and model selection across model families. We apply our findings to economic count time series of monthly numbers of strikes in the US and of monthly numbers of corporate insolvencies in the districts of Rhineland-Palatinate.

Keywords: AIC; BIC; corporate insolvencies; count process; model selection

Acknowledgements

The authors thank the referees for carefully reading the article and for their comments, which greatly improved the article.

Appendix A

Some models for count time series

This appendix briefly summarizes the definitions and relevant properties of the count time series models being considered in this article; more details and references can be found in the book by Weiß (2018).

In the simplest case, the generated counts are independent and identically distributed (i.i.d.). But most of the models considered here are types of regression models. The integer-valued autoregressive conditional heteroskedasticity (INARCH) models in Sections 2 to 5 assume the conditional mean at time $t$, $M_t := E[X_t \mid X_{t-1}, \ldots]$, to be a linear function of the last $p$ observations, i.e. $M_t = \beta + \sum_{i=1}^{p} \alpha_i X_{t-i}$, whereas the marginal regression models in Section 6 assume a linear trend in time: $M_t = E[X_t] = a + b \cdot t$. Given the mean at time $t$, the actual count is generated from either the Poisson distribution $\mathrm{Poi}(M_t)$ (Poi-INARCH or Poi-Reg model, respectively) or the negative binomial distribution $\mathrm{NB}\big(M_t \tfrac{\pi}{1-\pi}, \pi\big)$ (NB-INARCH or NB-Reg model, respectively). Here, the NB-parameter $\pi$ controls the degree of overdispersion, because the dispersion index of an NB-variate is equal to $1/\pi$. Some of our analyses related to the INARCH model also use the zero-inflated Poisson distribution $\mathrm{ZIP}\big(\tfrac{1}{1-\omega} M_t, \omega\big)$ (ZIP-INARCH), where $\omega$ determines the extent of zero inflation. In view of likelihood computation as required for Appendix B, the following formulae for computing (conditional) probabilities are relevant:

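\begin{equation}
\begin{array}{l|l|l}
 & \text{INARCH}(p):\ P(X_t = x \mid x_{t-1}, \ldots) = & \text{Reg}:\ P(X_t = x) = \\
\hline
\text{Poi-} & e^{-m_t} \cdot \dfrac{m_t^{x}}{x!}, \ \text{where } m_t = \beta + \sum_{i=1}^{p} \alpha_i x_{t-i} & e^{-m_t} \cdot \dfrac{m_t^{x}}{x!}, \ \text{where } m_t = a + b \cdot t \\
\text{NB-} & \dfrac{(n_t + x - 1)_{(x)}}{x!} \cdot (1-\pi)^{x} \cdot \pi^{n_t}, \ \text{where } n_t = \Big(\beta + \sum_{i=1}^{p} \alpha_i x_{t-i}\Big) \dfrac{\pi}{1-\pi} & \dfrac{(n_t + x - 1)_{(x)}}{x!} \cdot (1-\pi)^{x} \cdot \pi^{n_t}, \ \text{where } n_t = (a + b \cdot t) \dfrac{\pi}{1-\pi} \\
\text{ZIP-} & \omega \, \mathbb{1}_{\{x = 0\}} + (1-\omega) \cdot e^{-\lambda_t} \cdot \dfrac{\lambda_t^{x}}{x!}, \ \text{where } \lambda_t = \Big(\beta + \sum_{i=1}^{p} \alpha_i x_{t-i}\Big) \dfrac{1}{1-\omega} &
\end{array}
\tag{5}
\end{equation}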

where $k_{(l)} = k \cdots (k-l+1)$ denotes the falling factorials, and $\mathbb{1}_{\{\cdot\}}$ the indicator function.
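The conditional probabilities in (5) for the INARCH case can be sketched in a few lines of code. The following is only an illustration in Python (the authors' own code is in R and is not reproduced here); all function names and the `family`/`extra` arguments are hypothetical. It assumes $\beta > 0$, so that the conditional mean is strictly positive, and it uses the identity $(n+x-1)_{(x)} = \Gamma(n+x)/\Gamma(n)$ to evaluate the NB probabilities on the log scale.

```python
import math

def poisson_pmf(x, mean):
    """P(X = x) for X ~ Poi(mean), evaluated on the log scale (mean > 0 assumed)."""
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))

def negbin_pmf(x, n, pi):
    """P(X = x) for X ~ NB(n, pi): (n+x-1)_(x)/x! * (1-pi)^x * pi^n."""
    log_p = (math.lgamma(n + x) - math.lgamma(n) - math.lgamma(x + 1)
             + x * math.log(1.0 - pi) + n * math.log(pi))
    return math.exp(log_p)

def zip_pmf(x, lam, omega):
    """P(X = x) for X ~ ZIP(lam, omega): point mass at 0 mixed with Poi(lam)."""
    return omega * (x == 0) + (1.0 - omega) * poisson_pmf(x, lam)

def inarch_cond_pmf(x, past, beta, alphas, family="poi", extra=None):
    """Conditional probability P(X_t = x | x_{t-1}, ...) as in (5).

    past[0] is x_{t-1}, past[1] is x_{t-2}, and so on.
    extra is pi for family "nb" and omega for family "zip" (hypothetical interface).
    """
    m_t = beta + sum(a * x_lag for a, x_lag in zip(alphas, past))
    if family == "poi":
        return poisson_pmf(x, m_t)
    if family == "nb":
        return negbin_pmf(x, m_t * extra / (1.0 - extra), extra)
    if family == "zip":
        return zip_pmf(x, m_t / (1.0 - extra), extra)
    raise ValueError("family must be 'poi', 'nb' or 'zip'")
```

For all three families, the parametrization is chosen such that the conditional mean equals $m_t$; summing `x * inarch_cond_pmf(x, ...)` over a sufficiently long range of `x` is a quick way to check this numerically.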

In Sections 2 to 4, we also consider types of integer-valued autoregressive (INAR) models of order 1. The INAR(1) model is defined by the recursion $X_t = \alpha \circ X_{t-1} + \epsilon_t$, where the binomial thinning operator “$\alpha \circ$” is defined by the conditional distribution $\alpha \circ X \mid X \sim \mathrm{Bin}(X, \alpha)$, and where the innovations $(\epsilon_t)_{\mathbb{N}}$ are i.i.d. count random variables. We consider either Poisson or negative binomial or zero-inflated Poisson innovations, $\epsilon_t \sim \mathrm{Poi}(\lambda)$ or $\epsilon_t \sim \mathrm{NB}(n, \pi)$ or $\epsilon_t \sim \mathrm{ZIP}(\lambda, \omega)$, leading to the Poi- or NB- or ZIP-INAR(1) model, respectively. The corresponding conditional probabilities are

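\begin{equation}
\begin{aligned}
P(X_t = x \mid x_{t-1}, \ldots) &= \sum_{j=0}^{\min\{x,\, x_{t-1}\}} \binom{x_{t-1}}{j} \alpha^{j} (1-\alpha)^{x_{t-1}-j} \cdot P(\epsilon_t = x-j), \quad \text{where} \\
P(\epsilon_t = x-j) &=
\begin{cases}
e^{-\lambda} \cdot \dfrac{\lambda^{x-j}}{(x-j)!} & \text{if Poi-INAR(1)}, \\[1ex]
\dfrac{(n + x - j - 1)_{(x-j)}}{(x-j)!} \cdot (1-\pi)^{x-j} \cdot \pi^{n} & \text{if NB-INAR(1)}, \\[1ex]
\omega \, \mathbb{1}_{\{x-j = 0\}} + (1-\omega) \cdot e^{-\lambda} \cdot \dfrac{\lambda^{x-j}}{(x-j)!} & \text{if ZIP-INAR(1)}.
\end{cases}
\end{aligned}
\tag{6}
\end{equation}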

Note that the Poi-INAR(1) model also has a Poisson marginal distribution, $X_t \sim \mathrm{Poi}\big(\tfrac{\lambda}{1-\alpha}\big)$, whereas the marginal distributions of the NB- and ZIP-INAR(1) models (as well as of all INARCH models) are not known explicitly.
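The convolution in (6) translates directly into code. The sketch below is again an illustration in Python rather than the authors' R implementation, and the names `inar1_transition` and `innov_pmf` are hypothetical. The innovation distribution is passed in as a function, so that Poisson, NB or ZIP innovations can be plugged in; only the Poisson case is spelled out here.

```python
import math

def binom_pmf(j, n, alpha):
    """P(alpha o X = j | X = n), the binomial thinning probability."""
    return math.comb(n, j) * alpha**j * (1.0 - alpha) ** (n - j)

def poisson_pmf(k, lam):
    """P(eps = k) for eps ~ Poi(lam), with lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def inar1_transition(x, x_prev, alpha, innov_pmf):
    """P(X_t = x | X_{t-1} = x_prev) as in (6): thinning convolved with innovations."""
    return sum(binom_pmf(j, x_prev, alpha) * innov_pmf(x - j)
               for j in range(min(x, x_prev) + 1))

# Numerical check of the Poisson marginal for the Poi-INAR(1) model:
# if X_{t-1} ~ Poi(lam / (1 - alpha)), then X_t has the same distribution.
alpha, lam = 0.5, 1.0
mu = lam / (1.0 - alpha)
for x in range(4):
    mixed = sum(poisson_pmf(k, mu) * inar1_transition(x, k, alpha, lambda e: poisson_pmf(e, lam))
                for k in range(80))  # the sum over k is truncated at 80
    print(x, round(mixed, 6), round(poisson_pmf(x, mu), 6))
```

The two printed columns should agree up to rounding, which illustrates the invariance of the $\mathrm{Poi}\big(\tfrac{\lambda}{1-\alpha}\big)$ distribution under the Poi-INAR(1) transition.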

Appendix B

About the simulations

The simulation studies of the present article have been done with R (R Core Team 2018). In contrast to previous works like Emiliano, Vivanco, and de Menezes (2014) and Rinke and Sibbertsen (2016), we even used 10,000 replications per simulated scenario. The resulting time series according to the DGP were then used to fit all of the specified candidate models via maximum likelihood (ML) estimation, where we computed the required log-likelihood based on formulae (5) and (6). For the autoregressive models of order p, we actually used a conditional ML approach, i.e. we maximized the log-likelihood conditioned on the first p observations. To correct for this reduced number of unconditional observations, we followed the suggestion in Weiß (2018) and multiplied the maximized log-likelihood by the factor T/(T − p); then we computed the values of AIC and BIC for each candidate model.
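The rescaling of the conditional log-likelihood can be illustrated as follows. This is a minimal sketch in Python, not the authors' R code: the function name and arguments are hypothetical, the criteria are taken in their standard forms $\mathrm{AIC} = -2\ell + 2k$ and $\mathrm{BIC} = -2\ell + k \ln T$ with $k$ the number of estimated parameters, and it is an assumption of this sketch that the full length $T$ enters the BIC penalty after the rescaling.

```python
import math

def information_criteria(cond_loglik, n_params, T, p=0):
    """AIC and BIC from a conditional log-likelihood maximized over T - p observations.

    The log-likelihood is first rescaled by T / (T - p), so that models of
    different autoregressive order p are compared on the same footing.
    Assumption of this sketch: the BIC penalty uses the full length T.
    """
    scaled_loglik = cond_loglik * T / (T - p)
    aic = -2.0 * scaled_loglik + 2.0 * n_params
    bic = -2.0 * scaled_loglik + n_params * math.log(T)
    return aic, bic

# Hypothetical numbers: the same maximized value of -150 with and without conditioning
print(information_criteria(-150.0, n_params=2, T=100, p=0))
print(information_criteria(-150.0, n_params=2, T=100, p=1))
```

For each criterion, the candidate model with the smallest value is selected.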

To get a comprehensive picture of the information criteria’s model selection abilities, we fixed the length $T$ as well as marginal properties per given scenario, but we randomly drew the autoregressive parameters from the possible range (in case of the autoregressive INAR and INARCH models). More precisely, for order $p = 1$, we drew $\alpha_1$ (or $\alpha$, respectively) from a uniform distribution on the interval $(0.1; 0.9) \subset (0; 1)$ (we avoided more extreme values of $\alpha_1$ to circumvent computational problems). Similarly, $\alpha_1, \ldots, \alpha_p$ were drawn considering the stationarity condition $\sum_{i=1}^{p} \alpha_i < 1$ together with the additional constraint $0.1 < \alpha_i < 0.9$. To ensure that the DGP (nearly) reached its stationary state, we always used a prerun of length 250.
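One possible way to implement this parameter drawing and the prerun is sketched below in Python (the authors used R). Several details are assumptions of this sketch and are not taken from the article: the constraints are enforced by simple rejection sampling, the marginal mean $\mu$ of a Poi-INARCH($p$) process is fixed by setting $\beta = \mu \, (1 - \sum_{i=1}^{p} \alpha_i)$, the prerun is started at the rounded value of $\mu$, and all function names are hypothetical.

```python
import math
import random

def draw_ar_parameters(p, rng):
    """Draw alpha_1..alpha_p uniformly on (0.1, 0.9), rejecting draws with sum >= 1.

    Rejection sampling is an assumption of this sketch; it is only practical for small p.
    """
    if p * 0.1 >= 1.0:
        raise ValueError("no admissible parameters: the sum would be at least 1")
    while True:
        alphas = [rng.uniform(0.1, 0.9) for _ in range(p)]
        if sum(alphas) < 1.0:
            return alphas

def draw_poisson(mean, rng):
    """Poisson draw by inversion of the CDF (adequate for moderate means)."""
    u = rng.random()
    x, prob = 0, math.exp(-mean)
    cdf = prob
    while u > cdf and prob > 0.0:
        x += 1
        prob *= mean / x
        cdf += prob
    return x

def simulate_poi_inarch(T, mu, alphas, rng, prerun=250):
    """Simulate a Poi-INARCH(p) series of length T with marginal mean mu."""
    p = len(alphas)
    beta = mu * (1.0 - sum(alphas))   # fixes the marginal mean at mu
    past = [round(mu)] * p            # hypothetical start values; past[0] is x_{t-1}
    series = []
    for _ in range(prerun + T):
        m_t = beta + sum(a * x for a, x in zip(alphas, past))
        x_t = draw_poisson(m_t, rng)
        series.append(x_t)
        past = [x_t] + past[:-1]
    return series[prerun:]            # discard the prerun

rng = random.Random(2018)
alphas = draw_ar_parameters(2, rng)
xs = simulate_poi_inarch(T=250, mu=3.0, alphas=alphas, rng=rng)
```

Discarding the first 250 values reduces the influence of the arbitrary start values on the series that is actually analysed.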

Considering these issues, the architecture of our simulation codes is as follows: Having specified the candidate models and the DGP,

1. we simulated 10,000 time series according to the DGP,
2. we fitted the candidate models by ML estimation and stored the resulting maximized log-likelihood values,
3. we computed the considered information criteria and used these for model selection.
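The three steps above can be sketched as a small loop. The following Python code is a toy illustration only, not the authors' R code: to stay self-contained, it uses two i.i.d. candidate models whose ML estimates are available in closed form (Poisson, and NB with $n$ fixed at 1), whereas the models of the article require numerical maximization of the log-likelihoods based on (5) and (6). All names are hypothetical, and only 200 replications are used to keep the run time short.

```python
import math
import random
from collections import Counter

def draw_poisson(mean, rng):
    """Poisson draw by inversion of the CDF (adequate for moderate means)."""
    u = rng.random()
    x, prob = 0, math.exp(-mean)
    cdf = prob
    while u > cdf and prob > 0.0:
        x += 1
        prob *= mean / x
        cdf += prob
    return x

def fit_iid_poisson(xs):
    """Closed-form ML fit of i.i.d. Poi(mu); returns (max. log-likelihood, no. of parameters)."""
    mu = sum(xs) / len(xs)
    loglik = sum(-mu + x * math.log(mu) - math.lgamma(x + 1) for x in xs)
    return loglik, 1

def fit_iid_nb1(xs):
    """Closed-form ML fit of i.i.d. NB(1, pi), i.e. P(X = x) = (1 - pi)^x * pi."""
    pi = 1.0 / (1.0 + sum(xs) / len(xs))
    loglik = sum(x * math.log(1.0 - pi) + math.log(pi) for x in xs)
    return loglik, 1

def run_study(simulate, candidates, T, n_rep, rng):
    """Count how often each candidate model is selected by AIC and by BIC."""
    selected = {"AIC": Counter(), "BIC": Counter()}
    for _ in range(n_rep):
        xs = simulate(T, rng)                                        # step 1: simulate from the DGP
        fits = {name: fit(xs) for name, fit in candidates.items()}  # step 2: ML fits
        aic = {name: -2.0 * ll + 2.0 * k for name, (ll, k) in fits.items()}
        bic = {name: -2.0 * ll + k * math.log(T) for name, (ll, k) in fits.items()}
        selected["AIC"][min(aic, key=aic.get)] += 1                  # step 3: model selection
        selected["BIC"][min(bic, key=bic.get)] += 1
    return selected

candidates = {"iid-Poisson": fit_iid_poisson, "iid-NB(1)": fit_iid_nb1}
simulate = lambda T, rng: [draw_poisson(2.0, rng) for _ in range(T)]
result = run_study(simulate, candidates, T=100, n_rep=200, rng=random.Random(42))
print(result)
```

Because both toy candidates have one parameter, AIC and BIC necessarily agree here; differences between the two criteria only show up once the candidates differ in their number of parameters, as in the scenarios of the article.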

The full R codes are available from the authors upon request.

References

Akaike, H. 1974. “A New Look at the Statistical Model Identification.” IEEE Transactions on Automatic Control 19 (6): 716–723. doi:10.1007/978-1-4612-1694-0_16

Altman, E. I., and P. Narayanan. 1997. “An International Survey of Business Failure Classification Models.” Financial Markets, Institutions & Instruments 6 (2): 1–57. doi:10.1111/1468-0416.00010

Burnham, K. P., and D. R. Anderson. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd edition. New York: Springer-Verlag Inc.

Christou, V., and K. Fokianos. 2015. “On Count Time Series Prediction.” Journal of Statistical Computation and Simulation 85 (2): 357–373. doi:10.1080/00949655.2013.823612

Czado, C., T. Gneiting, and L. Held. 2009. “Predictive Model Assessment for Count Data.” Biometrics 65 (4): 1254–1261. doi:10.1111/j.1541-0420.2009.01191.x

Davis, R. A., S. H. Holan, R. Lund, and N. Ravishanker, eds. 2016. Handbook of Discrete-Valued Time Series. Boca Raton: Chapman & Hall/CRC Press. doi:10.1201/b19485

Emiliano, P. C., M. J. F. Vivanco, and F. S. de Menezes. 2014. “Information Criteria: How Do They Behave in Different Models?” Computational Statistics and Data Analysis 69: 141–153. doi:10.1016/j.csda.2013.07.032

Hannan, E. J., and B. G. Quinn. 1979. “The Determination of the Order of an Autoregression.” Journal of the Royal Statistical Society, Series B 41 (2): 190–195. doi:10.1111/j.2517-6161.1979.tb01072.x

Hughes, A. W., M. L. King, and K. T. Kwek. 2004. “Selecting the Order of an ARCH Model.” Economics Letters 83 (2): 269–275. doi:10.1016/j.econlet.2003.05.003

Jung, R. C., and A. R. Tremayne. 2011. “Useful Models for Time Series of Counts or Simply Wrong Ones?” AStA Advances in Statistical Analysis 95 (1): 59–91. doi:10.1007/s10182-010-0139-9

Jung, R. C., B. P. M. McCabe, and A. R. Tremayne. 2016. “Model Validation and Diagnostics.” In Handbook of Discrete-Valued Time Series, ed. by Davis et al., 189–218. Boca Raton: Chapman & Hall/CRC Press.

Katz, R. W. 1981. “On Some Criteria for Estimating the Order of a Markov Chain.” Technometrics 23 (3): 243–249. doi:10.2307/1267787

Psaradakis, Z., M. Sola, F. Spagnolo, and N. Spagnolo. 2009. “Selecting Nonlinear Time Series Models Using Information Criteria.” Journal of Time Series Analysis 30 (4): 369–394. doi:10.1111/j.1467-9892.2009.00614.x

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org

Reschenhofer, E. 1996. “Prediction with Vague Prior Knowledge.” Communications in Statistics – Theory and Methods 25 (3): 601–608. doi:10.1080/03610929608831716

Rinke, S., and P. Sibbertsen. 2016. “Information Criteria for Nonlinear Time Series Models.” Studies in Nonlinear Dynamics & Econometrics 20 (3): 325–341. doi:10.1515/snde-2015-0026

Röhl, K.-H., and G. Vogt. 2016. “Unternehmensinsolvenzen – Anhaltender Rückgang bei Fortbestehenden Regionalen Differenzen (in German).” IW-Trends 43 (3): 21–37.

Schwarz, G. 1978. “Estimating the Dimension of a Model.” Annals of Statistics 6 (2): 461–464. doi:10.1214/aos/1176344136

Weiß, C. H. 2018. An Introduction to Discrete-Valued Time Series. Chichester: John Wiley & Sons, Inc. doi:10.1002/9781119097013

Weiß, C. H., A. Homburg, and P. Puig. 2016. “Testing for Zero Inflation and Overdispersion in INAR(1) Models.” Statistical Papers, forthcoming. doi:10.1007/s00362-016-0851-y

Wu, T.-J., and A. Sepulveda. 1998. “The Weighted Average Information Criterion for Order Selection in Time Series and Regression Models.” Statistics & Probability Letters 39 (1): 1–10. doi:10.1016/S0167-7152(98)00003-0

Supplementary Material

The online version of this article offers supplementary material (DOI: https://doi.org/10.1515/snde-2018-0012).

Published Online: 2019-05-09
