Article Open Access

A New INAR(1) Model for ℤ-Valued Time Series Using the Relative Binomial Thinning Operator

  • Maher Kachour, Hassan S. Bakouch and Zohreh Mohammadi
Published/Copyright: June 7, 2023

Abstract

A new first-order integer-valued autoregressive process (INAR(1)) with extended Poisson innovations is introduced, based on a signed version of the thinning operator, called the relative binomial thinning operator, which can be considered as an extension of the standard binomial thinning operator introduced by Steutel, F.W. and van Harn, K. (1979. Discrete analogues of self-decomposability and stability. Ann. Probab. 7: 893–899). It is appropriate for modeling ℤ-valued time series and can capture either positive or negative correlation. Some properties of the process are established. Conditional least squares, Yule–Walker and conditional maximum likelihood methods are considered for the parameter estimation of the model. Moreover, simulation experiments are carried out to attest to the performance of the estimation methods. The applicability of the proposed model is investigated through a practical data set of the Saudi stock market.

JEL Classification: C13; C15; C22; C53

1 Introduction

Discrete-valued time series data are fairly common in practice, which has attracted the attention of many researchers. The early studies on modeling this type of data were conducted by McKenzie (1985) and Al-Osh and Alzaid (1987), who suggested the first-order integer-valued autoregressive (INAR(1)) models based on the binomial thinning operator introduced by Steutel and van Harn (1979). Afterward, researchers introduced alternative INAR(1) processes with different marginal and innovation distributions based on this operator. Moreover, various modifications of the binomial thinning operator have been proposed for modeling count time series. Good reviews of INAR models can be found in Weiß (2008) and Scotto et al. (2015).

All the above models are mainly applicable to time series with non-negative integer values, but in practice one can also encounter integer-valued time series data that include negative values. For example, in the stock market, when we analyze intra-daily stock prices, the changes belong to a discrete set that can be represented by ℤ, known as ticks (the price can go up or down by certain predefined amounts). In image analysis, we model the distribution of intensity differences between two neighboring pixels; since intensities are actually discrete, this is the difference of two discrete variables. In football, the result of a match can be expressed in ℤ as the difference in the number of goals, a value often used in sports betting. In medicine, for some diseases we care about the difference in some count variable, such as the number of pimples, before and after a treatment is applied. Moreover, in some fields, e.g. meteorology, for the sake of simplicity and to facilitate reading and interpretation, the recorded data are rounded, which leads to data with ℤ as support. Furthermore, when we analyze a non-stationary integer-valued time series with non-negative values, we use the differencing operator to achieve stationarity, and we may thereby obtain a time series on ℤ.

To the best of our knowledge, limited studies have been conducted on modeling time series data defined on the set ℤ with an autoregressive structure. Kim and Park (2008) proposed an integer-valued autoregressive process of order p ≥ 1 with a signed binomial thinning operator. Kachour and Yao (2009) introduced the first-order rounded integer-valued autoregressive process, based on the rounding operator. A more general setup has since been introduced by Kachour (2014). Recently, Liu et al. (2021) proposed a semiparametric autoregressive model with a log-concave innovation, based on the model of Kachour (2014). Zhang et al. (2010) presented integer-valued autoregressive processes of order p ≥ 1 with a signed generalized power series thinning operator. Kachour and Truquet (2011) proposed an extension of INAR models using a modified version of the generalized thinning operator, which is different from the one introduced by Kim and Park (2008). Zhang et al. (2012) introduced a random coefficient process called the generalized random coefficient first-order integer-valued autoregressive process with signed thinning operator. In the literature, there also exists a family of models defined on ℤ that arise as the difference between two discrete distributions. For example, Freeland (2010) defined the true integer-valued autoregressive process of order one as the difference of two Poisson INAR(1) processes; Barreto-Souza and Bourguignon (2015) proposed the skew INAR(1) process, defined as the difference of two INAR(1) processes with geometric marginal distribution; Bourguignon and Vasconcellos (2016) defined a new skew integer-valued process as the difference between a Poisson INAR(1) process and a geometric INAR(1) process; and Taveira da Cunha et al. (2018) proposed a new integer-valued autoregressive process with generalized Poisson difference marginal distributions, based on the difference of two quasi-binomial thinning operators. It is also important to mention certain studies that deal with multivariate ℤ-valued autoregressive models. For example, Bulla et al. (2011) introduced bivariate autoregressive integer-valued time-series models based on the signed thinning operator. More recently, Chen et al. (2023) proposed a new bivariate ℤ-valued autoregressive model based on the bivariate Skellam distribution.

In this paper we introduce a new integer-valued process, denoted by EP-RBINAR(1), by imposing a parametric assumption on the common distribution of the counting sequence of the signed integer-valued autoregressive process proposed by Kachour and Truquet (2011). Thus, the EP-RBINAR(1) can fit integer-valued time series with possibly negative values. Moreover, similar to an AR(1) process, the autocorrelation function of the EP-RBINAR(1) can also take negative values. Indeed, unlike models that arise as the difference between two discrete distributions, the EP-RBINAR(1) process can fit ℤ-valued time series without relying on such a specific construction. Moreover, in comparison with the model proposed by Kachour and Yao (2009), the EP-RBINAR(1) does not suffer from the slight lack of identifiability of the parameters. Note that the thinning part of the new process can be considered as the sum of steps of a “discrete random walk”, where one can go one step forward, one step back, or keep still. Furthermore, the innovations of the EP-RBINAR(1) process follow the extended Poisson distribution introduced by Bakouch et al. (2016). This distribution is defined on ℤ, has dispersion flexibility and, under some assumptions, can be approximated by the Gaussian distribution.

The paper is structured as follows. EP-RBINAR(1) process is formally defined in Section 2 and some of its properties are outlined. In Section 3, estimation methods for the process parameters are proposed. Section 4 discusses some simulation results for the estimation methods. Moreover, the EP-RBINAR(1) model is applied to a practical data set of the Saudi stock market. Finally, the proofs of all propositions and theorems are contained in the appendix.

2 The EP-RBINAR(1) Process

2.1 Definition

In this section, we first consider a special version of the signed thinning operator originally proposed by Latour and Truquet (2008). It is referred to as the “relative binomial thinning operator” and is an extension of the classical Steutel and van Harn operator to ℤ-valued random variables. Then, we propose a new signed integer-valued autoregressive process based on this operator with extended Poisson innovations.

Definition 1.

(Relative binomial thinning operator) Let $\{Y_i\}_{i\in\mathbb{N}}$ be a sequence of independent and identically distributed (i.i.d.) integer-valued random variables with common distribution F, independent of an integer-valued random variable X. The relative binomial thinning operator, denoted by $F\circ$, is defined as

(1) $F\circ X = \begin{cases} \operatorname{sign}(X)\displaystyle\sum_{i=1}^{|X|} Y_i, & \text{if } X \neq 0,\\ 0, & \text{otherwise,} \end{cases}$

where, for any integer x ≠ 0, sign(x) = 1 if x > 0 and sign(x) = −1 if x < 0, and F is defined as follows:

(2) $P(Y_i = y) = \begin{cases} \alpha^2, & \text{if } y = 1,\\ 2\alpha(1-\alpha), & \text{if } y = 0,\\ (1-\alpha)^2, & \text{if } y = -1, \end{cases}$

with 0 ≤ α ≤ 1 (F is called a relative Bernoulli distribution).
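Concretely, the operator can be sampled exactly as Definition 1 prescribes: draw |X| variables from the relative Bernoulli pmf (2) and apply the sign of X. The sketch below is an illustrative Python translation (the paper's own computations use R; all function names are ours):

```python
import random

def relative_bernoulli(alpha, rng):
    # pmf (2): P(Y=1) = alpha^2, P(Y=0) = 2*alpha*(1-alpha), P(Y=-1) = (1-alpha)^2
    u = rng.random()
    if u < alpha ** 2:
        return 1
    if u < alpha ** 2 + 2 * alpha * (1 - alpha):
        return 0
    return -1

def relative_binomial_thinning(x, alpha, rng):
    # F o X evaluated at X = x: sign(x) times the sum of |x| i.i.d. draws from (2)
    if x == 0:
        return 0
    total = sum(relative_bernoulli(alpha, rng) for _ in range(abs(x)))
    return total if x > 0 else -total

rng = random.Random(42)
draws = [relative_binomial_thinning(5, 0.7, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
```

With α = 0.7 and x = 5, the empirical mean of the draws should be close to (2α − 1)x = 2, in line with the moment given in Lemma 1 below.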

Remark 1.

For the classical Steutel and van Harn operator, X is positive and $\{Y_i\}_{i\in\mathbb{N}}$ is a sequence of Bernoulli variables (i.e. $Y_i\in\Omega=\{0,1\}$), where “Y_i = 1” represents “success” (survival) and “Y_i = 0” represents “failure” (death). However, for the relative binomial thinning operator, Y_i can be seen as the step of a “discrete random walk”, where “Y_i = 1” can be interpreted as one step forward, “Y_i = −1” as one step back, and “Y_i = 0” as keeping still.

Remark 2.

Note that Chesneau and Kachour (2012) originally introduced the relative binomial distribution, without giving it this name. Moreover, one can see that $Y_i \overset{d}{=} Z_i - 1$, where $Z_i \sim B(2, \alpha)$, with $E(Y_i) = 2\alpha - 1$ and $V(Y_i) = 2\alpha(1-\alpha)$. Moreover, let l be a positive integer such that l > 2. Then one can deduce that

$\alpha^2 + (-1)^l (1-\alpha)^2 = E(Y^l) < +\infty.$

In the following lemma, some useful basic properties of the relative binomial thinning operator are presented.

Lemma 1.

Let $x\in\mathbb{Z}\setminus\{0\}$, and set $F\circ X\,|\,(X=x) = F\circ x$. Then the relative binomial thinning operator in Definition 1 has the following properties:

  1. For all $y\in\mathbb{Z}$,

    $P(F\circ x = y) = \dbinom{2|x|}{|x| + \operatorname{sign}(x)y}\,\alpha^{|x|+\operatorname{sign}(x)y}\,(1-\alpha)^{|x|-\operatorname{sign}(x)y}\,\mathbb{I}_A(y),$

    where $\mathbb{I}_A(y)$ denotes the indicator function, which has the value 1 when y takes values in A and the value 0 otherwise, and $A = \{-|x|, \ldots, 0, \ldots, |x|\}$.

  2. $E(F\circ x) = (2\alpha-1)x$,

  3. $V(F\circ x) = 2\alpha(1-\alpha)|x|$,

  4. $E(s^{F\circ x}) = s^{-x}\left(1-\alpha+\alpha s^{\operatorname{sign}(x)}\right)^{2|x|}$,

  5. $\max\left((2\alpha-1)|x|,\, 0\right) \le E(|F\circ x|) \le |x|(2\alpha+1)$.
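Since $Y_i \overset{d}{=} Z_i - 1$ with $Z_i \sim B(2,\alpha)$ (Remark 2), properties 1 and 2 can be checked by brute force: convolving |x| copies of the relative Bernoulli pmf must reproduce the shifted-binomial expression of property 1. A small illustrative Python check (names are ours, not from the paper):

```python
from math import comb

def pmf_F_given_x(x, alpha):
    # exact pmf of F o x obtained by convolving |x| relative Bernoulli pmfs (2)
    base = {1: alpha**2, 0: 2 * alpha * (1 - alpha), -1: (1 - alpha)**2}
    dist = {0: 1.0}
    for _ in range(abs(x)):
        new = {}
        for s, ps in dist.items():
            for y, py in base.items():
                new[s + y] = new.get(s + y, 0.0) + ps * py
        dist = new
    sign = 1 if x > 0 else -1
    return {sign * s: p for s, p in dist.items()}  # apply sign(x)

def pmf_formula(x, y, alpha):
    # property 1 of Lemma 1
    sign = 1 if x > 0 else -1
    k = abs(x) + sign * y
    if k < 0 or k > 2 * abs(x):
        return 0.0
    return comb(2 * abs(x), k) * alpha**k * (1 - alpha)**(2 * abs(x) - k)

alpha, x = 0.3, -4
conv = pmf_F_given_x(x, alpha)
mean = sum(y * p for y, p in conv.items())  # should equal (2*alpha - 1) * x
```

The convolution and the closed form agree on the whole support, and the mean matches property 2 exactly.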

Using the relative binomial thinning operator in Definition 1, we now introduce a new process to be used for modeling Z -valued time series.

Definition 2.

A sequence $\{X_t\}_{t\in\mathbb{Z}}$ is said to be an EP-RBINAR(1) (first-order Extended-Poisson Relative Binomial Integer-valued Autoregressive) process if it has the following representation:

(3) $X_t = F\circ X_{t-1} + \epsilon_t, \quad t\in\mathbb{Z},$

where F is defined in (2) and $\{\epsilon_t\}_{t\in\mathbb{Z}}$ is a sequence of i.i.d. random variables, called innovations, following the extended Poisson distribution introduced by Bakouch et al. (2016). It is denoted by ϵ_t ∼ E-Po(p, λ) and defined by the following probability mass function (pmf):

(4) $P(\epsilon_t = k) = \begin{cases} e^{-\lambda} & \text{if } k = 0,\\ p\,e^{-\lambda}\dfrac{\lambda^k}{k!} & \text{if } k = 1, 2, \ldots,\\ (1-p)\,e^{-\lambda}\dfrac{\lambda^{|k|}}{|k|!} & \text{if } k = \ldots, -2, -1, \end{cases}$

where λ > 0 and 0 ≤ p ≤ 1. All $\{Y_i\}_{i\in\mathbb{N}}$ in the counting sequence are independent of ϵ_t. Moreover, ϵ_t and the sigma-algebra σ(X_{t−1}, X_{t−2}, …) are supposed to be mutually independent.
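An E-Po(p, λ) innovation can be drawn by exploiting the decomposition noted in Remark 5 below: |ϵ_t| ∼ Po(λ), and a nonzero magnitude receives a positive sign with probability p. An illustrative Python sampler (the paper works in R; names are ours):

```python
import math, random

def rpois(lam, rng):
    # Poisson(lam) sampler by cdf inversion (adequate for moderate lam)
    u = rng.random()
    k, term = 0, math.exp(-lam)
    cdf = term
    while u > cdf:
        k += 1
        term *= lam / k
        cdf += term
    return k

def repo(p, lam, rng):
    # E-Po(p, lam): |eps| ~ Po(lam); a nonzero magnitude gets sign +1 w.p. p
    n = rpois(lam, rng)
    if n == 0:
        return 0
    return n if rng.random() < p else -n

rng = random.Random(7)
sample = [repo(0.3, 2.0, rng) for _ in range(50000)]
emp_mean = sum(sample) / len(sample)
```

With p = 0.3 and λ = 2, the empirical mean should approach (2p − 1)λ = −0.8, and the empirical frequency of zero should approach e^{−λ}, consistent with pmf (4).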

Remark 3.

Given that the innovations (i.e. $\{\epsilon_t\}_{t\in\mathbb{Z}}$) have a known distribution, our model defined in (3) can be considered as a special case of the P-SINAR(1) process introduced by Chesneau and Kachour (2012), which is itself a particular case of the SINAR(1) process presented by Kachour and Truquet (2011).

Remark 4.

The use of the E-Po(p, λ) distribution guarantees innovations with possibly negative values and more dispersion flexibility (i.e. the dispersion of the E-Po(p, λ) distribution depends on the value of p).

Remark 5.

From Bakouch et al. (2016), the corresponding formulas for mean, variance and probability generating function (pgf) of E-Po(p, λ) distribution with the pmf (4) are:

$E(\epsilon_t) = (2p-1)\lambda,$

$V(\epsilon_t) = \lambda + 4p(1-p)\lambda^2,$

(5) $\Phi_{\epsilon_t}(s) = e^{-\lambda}\left(p\,e^{\lambda s} + (1-p)\,e^{\lambda/s}\right).$

Obviously, if ϵ_t ∼ E-Po(p, λ), then |ϵ_t| ∼ Po(λ). As a result, we have

$E(|\epsilon_t|) = V(|\epsilon_t|) = \lambda.$

Moreover, for any positive integer l > 2, we have

$E\left(|\epsilon_t|^l\right) \le \lambda^l e^{l^2/(2\lambda)} < +\infty.$
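The mean, variance and pgf formulas above can be checked numerically against pmf (4) by truncating the support. An illustrative Python sketch (the truncation point K = 60 is our choice; the truncated tails are negligible for these parameter values):

```python
import math

def epo_pmf(k, p, lam):
    # pmf (4) of the E-Po(p, lambda) distribution
    if k == 0:
        return math.exp(-lam)
    base = math.exp(-lam) * lam ** abs(k) / math.factorial(abs(k))
    return p * base if k > 0 else (1 - p) * base

def epo_pgf(s, p, lam):
    # pgf (5): exp(-lam) * (p*exp(lam*s) + (1-p)*exp(lam/s))
    return math.exp(-lam) * (p * math.exp(lam * s) + (1 - p) * math.exp(lam / s))

p, lam, K = 0.7, 1.5, 60
support = range(-K, K + 1)
total = sum(epo_pmf(k, p, lam) for k in support)
mean = sum(k * epo_pmf(k, p, lam) for k in support)
var = sum(k * k * epo_pmf(k, p, lam) for k in support) - mean ** 2
pgf_series = sum(0.8 ** k * epo_pmf(k, p, lam) for k in support)
```

The truncated sums reproduce E(ϵ) = (2p − 1)λ, V(ϵ) = λ + 4p(1 − p)λ², and Φ(0.8) from (5) to machine precision.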

Remark 6.

Under the previous parametric assumptions, the EP-RBINAR(1) process can be seen as an extension on ℤ of the Poisson INAR(1) process originally proposed by Al-Osh and Alzaid (1987).

Remark 7.

Based on its construction, one can deduce that, under specific parameter values, the EP-RBINAR(1) process can provide zero-inflated integer (positive and negative) observations. For example, suppose (for the sake of simplicity) that the innovation distribution is symmetric (i.e. p = 0.5) and concentrated at zero (i.e. λ < 1), and that α is chosen so that zero is the mode of the distribution F. In this case, it is very likely that zero is the most represented value of the process. Figure 1 shows the plot and distribution of n = 1000 observations simulated from the EP-RBINAR(1) process, with actual values α = 0.45, p = 0.5, and λ = 0.3. One can see that the empirical frequency of zero equals 60 % and that there is an almost perfect balance between the distributions of positive and negative values.

Figure 1: 
From left to right: plot and distribution of n = 1000 observations simulated from EP-RBINAR(1) process, where actual values are α = 0.45, p = 0.5, and λ = 0.3.
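The zero-inflation behaviour described in Remark 7 can be reproduced by simulating recursion (3) directly. An illustrative Python sketch (in place of the paper's R code; the seed and burn-in length are our choices):

```python
import math, random

def rpois(lam, rng):
    # Poisson(lam) by cdf inversion
    u = rng.random()
    k, term = 0, math.exp(-lam)
    cdf = term
    while u > cdf:
        k += 1
        term *= lam / k
        cdf += term
    return k

def repo(p, lam, rng):
    # E-Po(p, lam) innovation: |eps| ~ Po(lam), nonzero sign +1 w.p. p
    n = rpois(lam, rng)
    return 0 if n == 0 else (n if rng.random() < p else -n)

def thin(x, alpha, rng):
    # relative binomial thinning F o x with pmf (2)
    if x == 0:
        return 0
    s = 0
    for _ in range(abs(x)):
        u = rng.random()
        if u < alpha ** 2:
            s += 1
        elif u >= alpha ** 2 + 2 * alpha * (1 - alpha):
            s -= 1
    return s if x > 0 else -s

def simulate_ep_rbinar1(n, alpha, p, lam, rng, burn=200):
    # recursion (3): X_t = F o X_{t-1} + eps_t, after a burn-in period
    x, path = 0, []
    for t in range(n + burn):
        x = thin(x, alpha, rng) + repo(p, lam, rng)
        if t >= burn:
            path.append(x)
    return path

rng = random.Random(123)
path = simulate_ep_rbinar1(1000, alpha=0.45, p=0.5, lam=0.3, rng=rng)
zero_freq = path.count(0) / len(path)
```

With the parameters of Figure 1 (α = 0.45, p = 0.5, λ = 0.3), roughly 60 % of the simulated observations are zero, and positive and negative values occur in near balance.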

2.2 Properties

The stationarity of the proposed EP-RBINAR(1) process is established in the following theorem.

Theorem 1.

Suppose that $\alpha\in(0,1)\setminus\{\tfrac{1}{2}\}$ and 0 < p < 1. Then the process $X_t$ defined in (3) has a unique stationary solution. Moreover, for any positive integer l > 2, we have

$E\left(|X_0|^l\right) < +\infty.$

In the sequel, under stationary conditions and using Lemma 1, we derive some properties of the EP-RBINAR(1) process.

Proposition 1.

Let $\mathcal{F}_j$ represent the filtration $\sigma(X_{j-i}\,|\,i\in\mathbb{N})$. Under the stationarity conditions presented in Theorem 1, one can deduce that

  1. $E(X_t\,|\,\mathcal{F}_{t-1}) = (2\alpha-1)X_{t-1} + (2p-1)\lambda$.

  2. $V(X_t\,|\,\mathcal{F}_{t-1}) = 2\alpha(1-\alpha)|X_{t-1}| + \lambda + 4p(1-p)\lambda^2$.

  3. $E(X_t) = \dfrac{(2p-1)\lambda}{2(1-\alpha)}$,

  4. $V(X_t) = \dfrac{2\alpha(1-\alpha)E(|X_t|) + \lambda\left(1+4p(1-p)\lambda\right)}{1-(2\alpha-1)^2}$,

  5. The probability generating function of the stationary EP-RBINAR(1) process is obtained as

    (6) $\Phi_{X_t}(s) = E\left(s^{F\circ X_{t-1}}\right)\Phi_{\epsilon_t}(s)$

    where

    $E\left(s^{F\circ X_{t-1}}\right) = P(X_t=0) + \displaystyle\sum_{x=1}^{+\infty} s^{-x}\left(1-\alpha+\alpha s\right)^{2x} P(X_t=x) + \displaystyle\sum_{x=-\infty}^{-1} s^{-x}\left(1-\alpha+\alpha s^{-1}\right)^{2|x|} P(X_t=x)$

    and $\Phi_{\epsilon_t}(s)$ is the pgf of {ϵ_t} given in (5).

  6. $\rho(1) = \operatorname{corr}(X_t, X_{t-1}) = 2\alpha - 1$

Remark 8.

The stationarity conditions associated with the EP-RBINAR(1) process are similar to those of a classical AR(1) process. Indeed, one can see that

$X_t = (2\alpha-1)X_{t-1} + \xi_t,$

where $\xi_t$ is a stationary process defined by

$\xi_t = \epsilon_t + F\circ X_{t-1} - (2\alpha-1)X_{t-1}.$

Moreover, based on the properties of the signed thinning operator, one can see that $\xi = (\xi_t)$ is an uncorrelated process. Thus, the EP-RBINAR(1) process has the same autocorrelation structure as a classical AR(1) process. In other words, the autocorrelation function of the EP-RBINAR(1) process is obtained as

$\rho(k) = \operatorname{corr}(X_t, X_{t+k}) = (2\alpha-1)^{k}, \quad k \ge 1, \text{ with } \alpha\in(0,1)\setminus\{\tfrac{1}{2}\}.$

Therefore, the spectral density function of the model, for $\omega\in[-\pi, +\pi]$, is obtained as

$f_{X,X}(\omega) = \frac{1}{2\pi}\sum_{k=-\infty}^{+\infty}\operatorname{cov}(X_t, X_{t-k})\,e^{-i\omega k} = \frac{V(X_t)}{2\pi}\sum_{k=-\infty}^{+\infty}(2\alpha-1)^{|k|}e^{-i\omega k} = \frac{V(X_t)}{2\pi}\cdot\frac{1-(2\alpha-1)^2}{1+(2\alpha-1)^2-2(2\alpha-1)\cos(\omega)}.$
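A quick numerical sanity check on the spectral density: integrating f over [−π, π] must return γ(0) = V(X_t). Illustrative Python, with arbitrary values for V(X_t) and α (midpoint rule; our choices):

```python
import math

def spectral_density(omega, var_x, alpha):
    # f(omega) = V(X)/(2*pi) * (1 - phi^2) / (1 + phi^2 - 2*phi*cos(omega)), phi = 2*alpha - 1
    phi = 2 * alpha - 1
    return var_x / (2 * math.pi) * (1 - phi ** 2) / (1 + phi ** 2 - 2 * phi * math.cos(omega))

var_x, alpha, N = 3.7, 0.8, 20000
h = 2 * math.pi / N
integral = h * sum(spectral_density(-math.pi + (i + 0.5) * h, var_x, alpha)
                   for i in range(N))
```

The midpoint sum recovers V(X_t) to high accuracy, as it must for any valid spectral density.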

Under stationary conditions, the marginal probability function of the EP-RBINAR(1) process is given in the following proposition.

Proposition 2.

Let {X_t} be an EP-RBINAR(1) process, with 0 < α < 1, α ≠ ½, and 0 < p < 1. Thus, for all $a\in\mathbb{Z}$, we have

$P(X_t = a) = \displaystyle\sum_{j\in\mathbb{Z}^*}\sum_{i=-|j|}^{|j|}\binom{2|j|}{|j|+\operatorname{sign}(j)i}\,\alpha^{|j|+\operatorname{sign}(j)i}\,(1-\alpha)^{|j|-\operatorname{sign}(j)i}\, P(X_{t-1}=j)\,P(\epsilon_t = a-i) + P(X_{t-1}=0)\,P(\epsilon_t = a),$

where

$P(\epsilon_t = a-i) = \begin{cases} e^{-\lambda} & \text{if } a = i,\\ p\,e^{-\lambda}\dfrac{\lambda^{a-i}}{(a-i)!} & \text{if } i < a,\\ (1-p)\,e^{-\lambda}\dfrac{\lambda^{|a-i|}}{|a-i|!} & \text{if } i > a. \end{cases}$

The EP-RBINAR(1) process's one-step transition probability, denoted by $P_{i,j} = P(X_{t+1}=j\,|\,X_t=i)$, is derived as

(7) $P_{0,j} = P(\epsilon_t = j)$

and, for i ≠ 0,

(8) $P_{i,j} = \displaystyle\sum_{k=-|i|}^{|i|}\binom{2|i|}{|i|+\operatorname{sign}(i)k}\,\alpha^{\operatorname{sign}(i)(i+k)}\,(1-\alpha)^{\operatorname{sign}(i)(i-k)}\,P(\epsilon_t = j-k),$

where $P(\epsilon_t = j)$ is the pmf of {ϵ_t} defined by (4).

Moreover, using the first-order dependence of the process, the joint probability can be obtained as

(9) $P(X_1=i_1, X_2=i_2, \ldots, X_m=i_m) = P(X_1=i_1)\,P(X_2=i_2|X_1=i_1)\cdots P(X_m=i_m|X_{m-1}=i_{m-1}) = P(X_1=i_1)\displaystyle\prod_{s=1}^{m-1}P_{i_s, i_{s+1}}.$

The k-step-ahead conditional mean of the process (usually used for time series forecasting) is derived in the following proposition.

Proposition 3.

Let {X_t} be an EP-RBINAR(1) process with 0 < α < 1, α ≠ ½, and 0 < p < 1. Thus, the k-step-ahead conditional mean is given by

(10) $E(X_{t+k}\,|\,X_t) = (2\alpha-1)^k X_t + (2p-1)\lambda\,\dfrac{1-(2\alpha-1)^k}{2(1-\alpha)}.$

Remark 9.

Using Equation (10), we find that

$\lim_{k\to+\infty} E(X_{t+k}\,|\,X_t) = \dfrac{(2p-1)\lambda}{2(1-\alpha)},$

which is the unconditional mean of the EP-RBINAR(1) process.
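Both Equation (10) and the limit above can be checked directly. An illustrative Python sketch (the parameter values are ours):

```python
def k_step_mean(x_t, k, alpha, p, lam):
    # Equation (10): E(X_{t+k} | X_t)
    phi = 2 * alpha - 1
    return phi ** k * x_t + (2 * p - 1) * lam * (1 - phi ** k) / (2 * (1 - alpha))

alpha, p, lam = 0.7, 0.4, 2.0
one_step = k_step_mean(5, 1, alpha, p, lam)    # k = 1 reduces to Proposition 1, item 1
long_run = k_step_mean(5, 200, alpha, p, lam)  # converges to the unconditional mean
```

For k = 1 the formula reduces to (2α − 1)X_t + (2p − 1)λ, because 1 − (2α − 1) = 2(1 − α); for large k it approaches (2p − 1)λ / (2(1 − α)), the unconditional mean.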

3 Parameters Estimation

Let $(X_1,\ldots,X_m)^t$ be a vector of observations from our model defined in (3), and let $\theta = (\alpha, p, \lambda)^t \in \left\{(\alpha, p, \lambda)^t;\ 0<\alpha<1,\ \alpha\neq\tfrac12,\ 0<p<1,\ \lambda>0\right\}$ denote the unknown parameter vector. In order to estimate θ, we propose three estimation methods, namely Yule–Walker (YW), conditional least squares (CLS), and conditional maximum likelihood (CML).

3.1 Yule–Walker Estimation

Let $\bar{X} = \frac{1}{m}\sum_{i=1}^{m} X_i$, $\overline{|X|} = \frac{1}{m}\sum_{i=1}^{m} |X_i|$ and

$\hat{\rho}(1) = \dfrac{\hat\gamma(1)}{\hat\gamma(0)} = \dfrac{\sum_{t=1}^{m-1}(X_t-\bar X)(X_{t+1}-\bar X)}{\sum_{t=1}^{m}(X_t-\bar X)^2}.$

In this method, the estimates are obtained by solving the system of equations that results from equating theoretical moments with their empirical counterparts. Thus, the YW estimate of α is obtained as

(11) $\hat\alpha_{YW} = \dfrac{\hat\rho(1)+1}{2},$

and the YW estimates of p and λ are obtained by solving the following equations:

(12) $\hat\gamma(0)\left(1-\hat\rho^2(1)\right) = 2\hat\alpha_{YW}(1-\hat\alpha_{YW})\,\overline{|X|} + \lambda\left(1+4p(1-p)\lambda\right), \qquad 2\bar X(1-\hat\alpha_{YW}) = \lambda(2p-1).$

Remark 10.

Consider the special case of our model in which α = p. In this case, the YW estimate of α (and p) is

$\hat\alpha_{YW} = \dfrac{\hat\rho(1)+1}{2},$

and the YW estimate of λ is given by

$\hat\lambda_{YW} = \dfrac{2(1-\hat\alpha_{YW})}{2\hat\alpha_{YW}-1}\,\bar X = \left(\dfrac{1}{\hat\rho(1)}-1\right)\bar X.$
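The closed forms of Remark 10 can be exercised end-to-end on simulated data. An illustrative Python sketch (replacing the paper's R workflow; the seed, burn-in and sample size are our choices):

```python
import math, random

def rpois(lam, rng):
    # Poisson(lam) by cdf inversion
    u = rng.random()
    k, term = 0, math.exp(-lam)
    cdf = term
    while u > cdf:
        k += 1
        term *= lam / k
        cdf += term
    return k

def step(x, alpha, p, lam, rng):
    # one transition of (3): thinning of x plus an E-Po(p, lam) innovation
    s = 0
    for _ in range(abs(x)):
        u = rng.random()
        if u < alpha ** 2:
            s += 1
        elif u >= alpha ** 2 + 2 * alpha * (1 - alpha):
            s -= 1
    thinned = s if x > 0 else (-s if x < 0 else 0)
    n = rpois(lam, rng)
    eps = 0 if n == 0 else (n if rng.random() < p else -n)
    return thinned + eps

rng = random.Random(1)
alpha = p = 0.6
lam = 2.0
x, xs = 0, []
for t in range(5200):           # 200 burn-in + 5000 kept observations
    x = step(x, alpha, p, lam, rng)
    if t >= 200:
        xs.append(x)

xbar = sum(xs) / len(xs)
num = sum((xs[t] - xbar) * (xs[t + 1] - xbar) for t in range(len(xs) - 1))
den = sum((xi - xbar) ** 2 for xi in xs)
rho1 = num / den
alpha_yw = (rho1 + 1) / 2        # Equation (11)
lam_yw = (1 / rho1 - 1) * xbar   # Remark 10
```

With the true values α = p = 0.6 and λ = 2, the YW estimates land close to the truth, in line with the simulation results of Section 4.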

3.2 Conditional Least Squares Estimation

Let γ = 2α − 1 and μ = (2p − 1)λ. Then the CLS estimator of θ* = (γ, μ) is defined by

(13) $\hat\theta^*_m = \arg\min_{\theta^*\in\Theta} S_m(\theta^*),$

where $\hat\theta^*_m = (\hat\gamma_m, \hat\mu_m)$ and

$S_m(\theta^*) = \frac{1}{m}\sum_{t=2}^{m}\left(X_t - E(X_t\,|\,\mathcal{F}_{t-1})\right)^2 = \frac{1}{m}\sum_{t=2}^{m}\left(X_t - \gamma X_{t-1} - \mu\right)^2.$
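Since S_m is a quadratic function of (γ, μ), its unconstrained minimizer is the ordinary least-squares fit of X_t on X_{t−1}. An illustrative Python sketch (the noiseless recursion used as a sanity check is our own construction):

```python
def cls_estimates(xs):
    # minimizing S_m is OLS of X_t on X_{t-1}:
    # gamma_hat = sample cov(X_{t-1}, X_t) / sample var(X_{t-1}); mu_hat = ybar - gamma_hat * xbar
    y = xs[1:]
    x = xs[:-1]
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    gamma = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    mu = my - gamma * mx
    return gamma, mu

# sanity check on an exact linear recursion X_t = 0.4 X_{t-1} + 1:
# CLS must recover gamma = 0.4 and mu = 1 exactly (up to rounding)
xs = [3.0]
for _ in range(20):
    xs.append(0.4 * xs[-1] + 1)
gamma_hat, mu_hat = cls_estimates(xs)
```

On real EP-RBINAR(1) data the same two formulas give $\hat\gamma_m$ and $\hat\mu_m$, from which α and the innovation mean are recovered as in Remark 12.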

Remark 11.

Based on Theorem 1, we have $E(X_0^2) < \infty$ and $E(|X_0|^3) < \infty$. Thus, based on Theorem 2 of Kachour and Truquet (2011), one can deduce that

  1. $\hat\theta^*_m$ is strongly consistent (i.e. $\lim_{m\to\infty}\hat\theta^*_m = \theta^*$ a.s.);

  2. the CLS estimator is asymptotically normal.

Remark 12.

Consider again the special case of the proposed model in which α = p. In this case, the CLS estimate of α (and p) is given by

$\hat\alpha_{CLS} = \dfrac{\hat\gamma_m + 1}{2},$

and the CLS estimate of λ is given by

$\hat\lambda_{CLS} = \dfrac{\hat\mu_m}{2\hat\alpha_{CLS}-1} = \dfrac{\hat\mu_m}{\hat\gamma_m}.$

3.3 Conditional Maximum Likelihood Estimation

From the joint probability function (9), the conditional log-likelihood function for the EP-RBINAR(1) model can be written as

(14) $\ell_m(\theta\,|\,X_1) = \displaystyle\sum_{s=1}^{m-1}\log\left[P(\epsilon_t = X_{s+1})\,\mathbb{I}_{\{0\}}(X_s) + \mathbb{I}_{\mathbb{Z}^*}(X_s)\sum_{k=-|X_s|}^{|X_s|}\binom{2|X_s|}{|X_s|+\operatorname{sign}(X_s)k}\,\alpha^{\operatorname{sign}(X_s)(X_s+k)}\,(1-\alpha)^{\operatorname{sign}(X_s)(X_s-k)}\,P(\epsilon_t = X_{s+1}-k)\right],$

where P ( ϵ t = j ) is the pmf of {ϵ t } defined by (4) and I A ( X s ) denotes the indicator function, which has the value 1 when X s takes values in A and has the value 0, otherwise.

The conditional maximum likelihood (CML) estimator $\hat\theta_{CML} = (\hat\alpha_{CML}, \hat p_{CML}, \hat\lambda_{CML})'$ of θ = (α, p, λ)′ is defined as the value of θ that maximizes the conditional log-likelihood function (14). Since equating the first-order derivatives of the conditional log-likelihood to zero leads to a complicated system of equations, the CML estimates are obtained using numerical methods.
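The conditional log-likelihood (14) can be evaluated directly from the transition probabilities (7)–(8). The illustrative Python sketch below (the toy series and all names are ours; the paper maximizes (14) numerically with R's nlm) merely checks that the likelihood prefers plausible parameters over implausible ones:

```python
import math

def epo_pmf(k, p, lam):
    # innovation pmf (4)
    if k == 0:
        return math.exp(-lam)
    base = math.exp(-lam) * lam ** abs(k) / math.factorial(abs(k))
    return p * base if k > 0 else (1 - p) * base

def trans_prob(i, j, alpha, p, lam):
    # one-step transition probabilities (7)-(8)
    if i == 0:
        return epo_pmf(j, p, lam)
    s = 1 if i > 0 else -1
    total = 0.0
    for k in range(-abs(i), abs(i) + 1):
        m = abs(i) + s * k  # the exponent |i| + sign(i) k
        w = math.comb(2 * abs(i), m) * alpha ** m * (1 - alpha) ** (2 * abs(i) - m)
        total += w * epo_pmf(j - k, p, lam)
    return total

def cond_loglik(xs, alpha, p, lam):
    # conditional log-likelihood (14), conditioning on the first observation
    return sum(math.log(trans_prob(xs[s], xs[s + 1], alpha, p, lam))
               for s in range(len(xs) - 1))

xs = [0, 1, -1, 0, 2, 1, 0, -1, -2, 0, 1, 0]   # small toy series around zero
ll_near = cond_loglik(xs, 0.5, 0.5, 1.0)        # plausible parameters
ll_far = cond_loglik(xs, 0.95, 0.95, 5.0)       # implausible parameters
```

Each transition row sums to one, and the likelihood of the small-magnitude toy series is far higher under moderate (α, p, λ) than under a large-λ alternative; the CML estimator simply searches for the maximizing θ.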

4 Empirical Study

This section includes two subsections. In the first part, the performance of the estimation methods proposed for the parameters of the process is compared through a simulation study. To attest to the applicability of the proposed model, the second part is devoted to analyzing a practical data set of the Saudi stock market.

4.1 Simulation

4.1.1 The Case When α = p

In this section, we aim to assess the efficiency of the parameter estimation methods discussed in the previous section. For the sake of simplicity, we suppose that α = p. Thus, we simulate (using the R programming language) 1000 paths of length 100, 500, 1000 and 10,000. These paths are simulated using Equation (3) with three sets of parameters: (a) α = p = 0.1 and λ = 1; (b) α = p = 0.6 and λ = 2; (c) α = p = 0.9 and λ = 3. Mean values of the YW, CLS, and CML estimates for each set of parameters are given in Table 1. The standard deviations of the estimates are stated in brackets under the estimated values.

Table 1:

Estimated parameters and the corresponding standard errors (in brackets) stated under the estimates for YW, CLS and CML methods (when α = p).

Size α̂_YW λ̂_YW α̂_CLS λ̂_CLS α̂_CML λ̂_CML
a) α = p = 0.1 and λ = 1

n = 100 0.110 1.009 0.109 1.010 0.101 0.977
(0.034) (0.176) (0.034) (0.175) (0.028) (0.120)
n = 500 0.102 1.002 0.101 1.002 0.102 1.000
(0.015) (0.076) (0.015) (0.076) (0.011) (0.050)
n = 1000 0.101 1.002 0.101 1.002 0.099 1.006
(0.010) (0.054) (0.010) (0.054) (0.009) (0.041)
n = 10,000 0.100 0.999 0.100 0.999 0.0999 1.000

(0.003) (0.016) (0.003) (0.016) (0.002) (0.011)

b) α = p = 0.6 and λ = 2

n = 100 0.592 1.584 0.592 1.513 0.593 2.019
(0.050) (21.931) (0.050) (23.147) (0.033) (0.172)
n = 500 0.598 2.251 0.597 2.252 0.599 1.987
(0.023) (1.048) (0.023) (1.047) (0.015) (0.082)
n = 1000 0.599 2.101 0.599 2.101 0.597 1.999
(0.016) (0.660) (0.016) (0.659) (0.011) (0.061)
n = 10,000 0.599 2.004 0.599 2.005 0.598 1.999

(0.005) (0.184) (0.005) (0.184) (0.004) (0.052)

c) α = p = 0.9 and λ = 3

n = 100 0.887 3.454 0.883 3.704 0.899 2.999
(0.317) (1.377) (0.032) (1.445) (0.012) (0.283)
n = 500 0.897 3.100 0.896 3.152 0.899 2.996
(0.013) (0.535) (0.136) (0.544) (0.005) (0.129)
n = 1000 0.898 3.040 0.898 3.065 0.899 3.000
(0.009) (0.379) (0.009) (0.382) (0.003) (0.089)
n = 10,000 0.899 3.004 0.899 3.007 0.899 3.002
(0.003) (0.114) (0.003) (0.114) (0.002) (0.064)

Remark 13.

In the case α = p, one can use the results presented in Remarks 10 and 12 to calculate the YW and CLS estimates. However, the CML estimates are calculated using numerical methods. Explicitly, we use the nlm function (Non-Linear Minimization, from the “stats” package of the R software) to find the values that maximize the conditional log-likelihood function (14) (where we set α = p).

In general, from the obtained results, one can deduce that the YW, CLS and CML methods provide estimates that are quite close to the actual values. Moreover, the performances of the YW and CLS methods are very similar: when the series length exceeds 500, they converge quickly to the actual parameter values. The CML method, however, is more efficient even for short series. For all the proposed methods, the standard deviation of the estimates decreases as the series length increases. Furthermore, in most of the studied cases, the CML method provides the lowest standard deviation of the estimates.

However, for the YW and CLS methods, it is important to point out that convergence is slower for λ. Indeed, this can be explained by the fact that, for both methods, the estimate of λ depends on the estimated value of α. Thus, for these methods, a large actual value of λ implies a high standard deviation of the estimates, especially when the series is short. This is illustrated by the simulation results obtained for parameter set (c) (for details, see Table 1). The boxplots for parameter set (c) are given in Figures 2–4 for the CLS, YW and CML methods, respectively. One can see that, contrary to the other methods, the estimates from the CML method have few or no outliers.

Figure 2: 
The boxplot for CLS estimates for parameters set (c), where actual values are α = p = 0.9 and λ = 3.

Figure 3: 
The boxplot for YW estimates for parameters set (c), where actual values are α = p = 0.9 and λ = 3.

Figure 4: 
The boxplot for CML estimates for parameters set (c), where actual values are α = p = 0.9 and λ = 3.

On the other hand, based on the simulation results of the YW and CLS methods for parameter set (b) when n = 100, one can see that the standard deviation of the λ estimates is very high for both methods (see Table 1). In fact, these results are not surprising, because the actual value of α (= p) is close to 0.5 (which is an excluded value of the parameter). So, with a short series, the estimated value of α may be close to 0.5, which can “explode” the estimated value of λ. However, this issue is not present with the CML method (in this case, even for a short series, the standard deviation is small, see Table 1). Finally, the fit to the Gaussian distribution is illustrated in Figures 5 and 6 for the YW, CLS and CML estimators (with parameter set (c) and n = 10,000). These figures illustrate numerically the asymptotic normality of the proposed estimators.

Figure 5:
Normal Q-Q plots for the errors $(\hat{\alpha}_{YW} - 0.9)$, $(\hat{\alpha}_{CLS} - 0.9)$ and $(\hat{\alpha}_{CML} - 0.9)$, when n = 10,000. The horizontal axis indicates theoretical quantiles and the vertical axis indicates sample quantiles.

Figure 6:
Normal Q-Q plots for the errors $(\hat{\lambda}_{YW} - 3)$, $(\hat{\lambda}_{CLS} - 3)$ and $(\hat{\lambda}_{CML} - 3)$, when n = 10,000. The horizontal axis indicates theoretical quantiles and the vertical axis indicates sample quantiles.

4.1.2 The Case When α ≠ p

As mentioned in the previous section, the performances of the YW and CLS methods are quite close. Moreover, we have also seen that, if α ≠ p, the estimates from these methods do not have an explicit form. Thus, in this section, we compare the performances of the YW and CML methods based on one set of parameters: α = 0.75, p = 0.4, and λ = 2. We simulate (using the R programming language) 10,000 paths of length 100, 250, and 1000. These paths are simulated using Equation (3) with the above set of parameters. To calculate the CML estimates, we use the nlm function (Non-Linear Minimization, from the “stats” package of the R software) to find the values that maximize the conditional log-likelihood function (14). On the other hand, the YW estimate of α is given by (11). Once we estimate α (based on (11)) and substitute this value into Equation (12), we obtain a system of two equations in the two unknowns λ and p. To find the YW estimates of these parameters, this system has to be solved numerically; the numerical procedure is conducted in the programming language R, using the BB package. Mean values of the YW and CML estimates for each parameter are given in Table 2. The standard deviations of the estimates are stated in brackets under the estimated values. Once again, the precision of the estimates (from both methods) increases as the size n increases.

Table 2:

Estimated parameters and the corresponding standard errors (in brackets) stated under the estimates for YW and CML methods (when α ≠ p).

α = 0.75, λ = 2, and p = 0.4
Size α̂_YW λ̂_YW p̂_YW α̂_CML λ̂_CML p̂_CML
n = 100 0.739 1.981 0.389 0.739 1.997 0.397
(0.046) (0.186) (0.074) (0.078) (0.171) (0.065)
n = 250 0.748 1.982 0.412 0.746 2.012 0.402
(0.015) (0.129) (0.045) (0.069) (0.103) (0.039)
n = 1000 0.749 2.002 0.401 0.749 1.999 0.399
(0.015) (0.054) (0.019) (0.019) (0.052) (0.019)

Remark 14.

In order to study the compatibility of the proposed model with the estimation results, we generated, based on Equation (3), n = 1000 observations of the EP-RBINAR(1) process, with α = 0.75, p = 0.4, and λ = 2. Basic descriptive statistics of the simulated data are reported in the following table:

Indicator Mean Variance First autocorrelation Second autocorrelation Third autocorrelation
Empirical value −0.936 8.590494 0.48065090 0.20075170 0.06104609

Using the simulated data and the procedure presented in the previous section, we compute the CML estimates of the process parameters: α̂ = 0.7431100, p̂ = 0.3854399 and λ̂ = 1.9496662. Now, using these estimates and the properties of the EP-RBINAR(1) process (see Proposition 1 and Remark 8), we calculate the same indicators:

Indicator Mean Variance First autocorrelation Second autocorrelation Third autocorrelation
Estimate value −0.8694537 8.487022 0.48622 0.2364099 0.1149472

Thus, one can see that the empirical values of these indicators are very close to the estimated values (calculated from the estimated parameters and the properties of the process).

4.2 A Practical Data Example

Here, we present an application of the EP-RBINAR(1) model based on real-world data from the Saudi stock market. In 2007, the minimum amount of change was 0.25 SR (Saudi Riyal) for all stocks. The daily close price, expressed as a number of ticks (ticks = close price × 4), in 2007 for the Saudi Telecommunication Company (STC) stock is considered. Note that these data were originally introduced by Alzaid and Omair (2014) as an application of a new integer-valued model with a Poisson difference marginal distribution, which can be used as a tool to model non-stationary count data. Basic descriptive statistics concerning these data are presented in Table 3.

Table 3:

Basic descriptive statistics for the STC stock data set.

Length Mean Minimum Maximum First quartile Median Third quartile Standard deviation
248 49.82 45 66 46 48 52 4.6968

After examining the time series plot of the data, the authors (Alzaid and Omair (2014)) noticed non-stationarity in the mean (this was confirmed by a sustained large autocorrelation function (ACF) and an exceptionally large first-lag partial autocorrelation function (PACF)). Thus, the authors considered that differencing was needed. Explicitly, if {X_t} is the process behind the above data, they propose to study the lag-one differenced process, denoted by {Y_t}, where Y_t = X_t − X_{t−1}. Indeed, the data observed from the process {Y_t} can be viewed as the difference between the numbers of ticks (associated with the close price) on two consecutive days. Thus, a negative value represents the number of ticks “lost” at the close between two consecutive days (i.e. a drop in price), while a positive value reveals the number of ticks “gained” at the close between two consecutive days (i.e. a price increase). Finally, if the observed value is zero, the number of ticks at closing has remained “stable” between the two days (i.e. the price has not changed).

The time series plot of the differenced stock is shown in Figure 7, and basic descriptive statistics for the differenced data are presented in Table 4. Figure 7 shows that stationarity in the mean is now achieved. The ACF and PACF of the differenced stock are shown in Figure 8. These figures show that the lag-one correlation is positive and significant, which implies that the differenced process has the same autocorrelation structure as an AR(1) process. Moreover, after the lag-one differencing stage, the data are integers including negative values.

Figure 7: The time series plot of the differenced STC stock data set.

Table 4:

Basic descriptive statistics for the differenced STC data set.

Length Mean Minimum Maximum First quartile Median Third quartile Standard deviation
247 0.004049 −19 13 −2 0 3 4.715049
Figure 8: The ACF and PACF of the differenced STC stock data set.

Remark 15.

Bourguignon and Vasconcellos (2016) used data, dating from 2012, also coming from Saudi Telecom. The choice of these data is consistent with the model proposed by those authors (a new process, defined on Z, constructed as the difference between a Poisson INAR(1) process and a geometric INAR(1) process). At first glance, it is quite possible to use the model of Bourguignon and Vasconcellos (2016) to fit the differenced data of our application. However, we cannot claim that the differenced data used in our application arise as the difference between two independent sets of positive integer-valued data, which goes against the philosophy of their model. Conversely, since our model belongs to the SINAR(1) process family, it can be used to fit the data of Bourguignon and Vasconcellos (2016) without taking into account how the data were constructed.

To fit the differenced data, we propose our EP-RBINAR(1) process. Explicitly, we consider that

Y_t = F ∘ Y_{t−1} + ε_t,

where F ∘ is the operator defined in (2) and ε_t follows an E-Po(p, λ) distribution, with 0 &lt; p &lt; 1, λ > 0 and α ∈ (0, 1/2) ∪ (1/2, 1). The estimated values of the unknown parameters are obtained by the CML method; we obtain α̂_CML = 0.6448645, p̂_CML = 0.5345621 and λ̂_CML = 3.402455.
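As a numerical sanity check, the fitted process can be simulated. The sketch below is a minimal illustration, not the authors' code: the thinning draw follows Lemma 1 (F ∘ x = sign(x)(Z − |x|) with Z ~ B(2|x|, α)), while the E-Po(p, λ) innovation is drawn via a signed-Poisson construction (+N with probability p, −N with probability 1 − p, N ~ Poisson(λ)); this construction is an assumption, chosen so that its mean (2p − 1)λ and variance λ + 4p(1 − p)λ² match the moments used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def thin(x, alpha):
    # Relative binomial thinning (Lemma 1): F∘x = sign(x) * (Z - |x|),
    # Z ~ Bin(2|x|, alpha); by convention F∘0 = 0.
    if x == 0:
        return 0
    z = rng.binomial(2 * abs(x), alpha)
    return (1 if x > 0 else -1) * (z - abs(x))

def epo(p, lam):
    # Signed-Poisson stand-in for an E-Po(p, lam) draw -- an assumption
    # matching the stated mean (2p-1)*lam and variance lam + 4p(1-p)*lam**2.
    n = rng.poisson(lam)
    return n if rng.random() < p else -n

def simulate(T, alpha, p, lam, x0=0):
    x, out = x0, []
    for _ in range(T):
        x = thin(x, alpha) + epo(p, lam)
        out.append(x)
    return np.array(out)

# CML estimates reported above for the differenced STC data
y = simulate(20000, alpha=0.6448645, p=0.5345621, lam=3.402455)
print(y.mean())                          # ~ 0.33 (Remark 16: 0.3311299)
print(y.std())                           # ~ 4.25 (Remark 16: 4.246681)
print(np.corrcoef(y[:-1], y[1:])[0, 1])  # ~ 0.29 (Remark 16: 0.289729)
```

The simulated mean, standard deviation, and lag-1 autocorrelation should land near the theoretical values implied by the estimated parameters.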

Remark 16.

Based on the properties of the process and using the estimates of the unknown parameters, one can deduce that the mean equals 0.3311299 (which is not especially close to 0.004048583, the mean of the data; however, both values can be regarded as near zero), the standard deviation equals 4.246681 (close to 4.715049, the standard deviation of the data), and the lag-1 autocorrelation equals 0.289729 (close to 0.2194211, the lag-1 autocorrelation of the data). These first results support our choice of the EP-RBINAR(1) model for these data.

Moreover, based on the conditional one-step ahead mean, the estimated residuals are computed as

ê_t = y_t − ŷ_t = y_t − ((2α̂_CML − 1) y_{t−1} + (2p̂_CML − 1) λ̂_CML).

Remark 17.

In general, the ŷ_t are real-valued; a mapping into the discrete support of the series is obtained by rounding to the nearest integer.
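Combining the residual formula above with this rounding step, the computation can be sketched as follows (the short series `y` is hypothetical, standing in for the differenced data):

```python
import numpy as np

alpha_hat, p_hat, lam_hat = 0.6448645, 0.5345621, 3.402455

# Hypothetical short series standing in for the differenced STC data
y = np.array([-1, 2, 0, -3, 4, 1])

# One-step-ahead conditional mean, rounded to the nearest integer
y_hat = np.rint((2 * alpha_hat - 1) * y[:-1]
                + (2 * p_hat - 1) * lam_hat).astype(int)

e_hat = y[1:] - y_hat  # estimated residuals
print(y_hat.tolist())  # [0, 1, 0, -1, 1]
print(e_hat.tolist())  # [2, -1, -3, 5, 0]
```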

To verify the adequacy of the considered model, the ACF and PACF of the estimated residuals are plotted in Figure 9. Clearly, this graph shows that the residuals can be regarded as observations from a white noise. Moreover, in Figure 10, we compare the empirical frequencies of ê_t with the corresponding theoretical probabilities under E-Po(0.5345621, 3.402455). This comparison shows that the frequencies and probabilities associated with the observed values are relatively close, which further justifies our choice of the EP-RBINAR(1) model for these data.

Figure 9: The ACF and PACF of the estimated residuals of the differenced STC stock data set.

Figure 10: Comparison between the empirical frequencies of estimated residuals and their associated theoretical probabilities resulting from E-Po(0.5345621, 3.402455).

Remark 18.

Based on the conditional one-step ahead standard deviation, the estimated standardized Pearson residuals are computed as

k̂_t = [y_t − ((2α̂_CML − 1) y_{t−1} + (2p̂_CML − 1) λ̂_CML)] / √(2α̂_CML(1 − α̂_CML)|y_{t−1}| + λ̂_CML + 4p̂_CML(1 − p̂_CML) λ̂²_CML).

For details concerning the standardized Pearson residuals and their role in checking model adequacy for count time series, see Weiß et al. (2019). Note that the empirical mean, median, and standard deviation of the estimated standardized Pearson residuals equal −0.05694, 0, and 1.127245, respectively. Thus, the empirical mean is very close to zero and the empirical standard deviation is close to one. Moreover, the ACF, the PACF, and the fit to the Gaussian distribution (normal Q-Q plot) of the estimated standardized Pearson residuals are plotted in Figure 11. These results provide additional support for the adequacy of the model.
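The formula in Remark 18 translates directly into code; a minimal sketch (again on a hypothetical short series standing in for the differenced data):

```python
import numpy as np

alpha_hat, p_hat, lam_hat = 0.6448645, 0.5345621, 3.402455
y = np.array([-1, 2, 0, -3, 4, 1])  # hypothetical differenced observations

# Conditional one-step-ahead mean and variance (Proposition 1)
cond_mean = (2 * alpha_hat - 1) * y[:-1] + (2 * p_hat - 1) * lam_hat
cond_var = (2 * alpha_hat * (1 - alpha_hat) * np.abs(y[:-1])
            + lam_hat + 4 * p_hat * (1 - p_hat) * lam_hat**2)

# Standardized Pearson residuals; mean ~0 and variance ~1 indicate adequacy
k_hat = (y[1:] - cond_mean) / np.sqrt(cond_var)
print(k_hat.round(3))
```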

Figure 11: From top to bottom: ACF, PACF, and normal Q-Q plot of estimated standardized Pearson residuals of the differenced STC stock data set.

Remark 19.

As mentioned before, the data processed in this section were studied by Alzaid and Omair (2014). Indeed, the authors proposed their model, called the PDINAR(1) process, to fit the data. However, this process has two variants, denoted PDINAR+(1) and PDINAR−(1), depending on the sign of the correlation. In practice, to decide which variant should be used to fit the data, a preliminary idea of the sign of ρ̂(1) is necessary. Thus, since the empirical first-order autocorrelation is positive, Alzaid and Omair (2014) considered the PDINAR+(1) to study the data. For our model, by contrast, we do not need to know the sign of ρ̂(1) in advance. On the other hand, Alzaid and Omair (2014) show that the one-step ahead least squares predictions can be calculated by

ỹ_t = 0.2201 y_{t−1} + 10.4088 − 10.3953.

These predicted values also need to be rounded to the nearest integer to be consistent with the discrete support of the data. From Figure 12, one can see that the predictions (based on the conditional one-step ahead mean) provided by the PDINAR+(1) process and those of the EP-RBINAR(1) process are very similar. This implies that the two models have very close prediction performance. Indeed, the root mean squared error (RMSE) equals 4.604699 (resp. 4.590552) for the forecasts (based on the conditional mean) from the PDINAR+(1) (resp. EP-RBINAR(1)) process. However, for both models there remains a noticeable discrepancy between the forecasts and the actual data.

Figure 12: Comparison of conditional mean prediction results from PDINAR(1) and EP-RBINAR(1).

Remark 20.

As mentioned above, the conditional one-step ahead mean leads to real-valued forecasts, which is inconsistent with the support of an integer-valued time series. Recently, coherent techniques have been introduced in order to provide integer-valued forecasts. One of the most widely used is the conditional median (for details, see Homburg et al. (2019)), obtained by calculating the conditional probability of each possible integer value and then selecting the smallest value whose cumulative conditional probability exceeds 0.5. From Figure 13, one can see that the discrepancy between the conditional median forecasts and the actual data is reduced, especially for large positive values. Moreover, the mean absolute error (MAE) based on the conditional median forecasts equals 3.768293 and the median absolute error (MedAE) equals 3.
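The conditional-median forecast just described can be sketched as follows. The thinning probabilities come from the exact formula of Lemma 1; the E-Po(p, λ) innovation pmf is again replaced by the signed-Poisson stand-in assumed earlier (+N with probability p, −N with probability 1 − p, N ~ Poisson(λ)), so the numbers are illustrative rather than the paper's exact forecasts:

```python
import math

alpha, p, lam = 0.6448645, 0.5345621, 3.402455

def binom_pmf(z, n, a):
    return math.comb(n, z) * a**z * (1 - a)**(n - z)

def pois_pmf(n, mu):
    return math.exp(-mu) * mu**n / math.factorial(n)

def thin_pmf(x):
    # Lemma 1: P(F∘x = y) = P(Z = |x| + sign(x)*y), Z ~ Bin(2|x|, alpha)
    if x == 0:
        return {0: 1.0}
    s = 1 if x > 0 else -1
    return {s * (z - abs(x)): binom_pmf(z, 2 * abs(x), alpha)
            for z in range(2 * abs(x) + 1)}

def epo_pmf(k):
    # Signed-Poisson stand-in for E-Po(p, lam) -- an assumption matching
    # the stated mean and variance of the innovation
    if k == 0:
        return pois_pmf(0, lam)
    return (p if k > 0 else 1 - p) * pois_pmf(abs(k), lam)

def cond_median(x_prev, lo=-60, hi=60):
    # Coherent one-step forecast: smallest integer y whose cumulative
    # conditional probability P(X_t <= y | X_{t-1} = x_prev) reaches 0.5
    tp = thin_pmf(x_prev)
    cum = 0.0
    for y in range(lo, hi + 1):
        cum += sum(pf * epo_pmf(y - f) for f, pf in tp.items())
        if cum >= 0.5:
            return y

print(cond_median(0), cond_median(3))
```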

Figure 13: Comparison between conditional mean predictions of PDINAR(1) and conditional median predictions of EP-RBINAR(1).

5 Concluding Remarks

In this paper, we introduced a new stationary first-order integer-valued autoregressive process, denoted EP-RBINAR(1), whose thinning part can be viewed as a random-walk sum. The new model has several advantages: it allows negative values for the time series, negative autocorrelation, and an innovation structure that can be interpreted as an extension of the Poisson distribution to Z. Moreover, for specific parameter values, the EP-RBINAR(1) can produce zero-inflated integer (positive and negative) observations. The main properties of the model were derived. We then considered parameter estimation and derived properties of the Yule-Walker, conditional least squares, and conditional maximum likelihood estimators. An application to a real-world data set, based on the number of ticks (associated with the close price) of the Saudi Telecommunication Company (STC) stock in 2007, illustrates the importance and potential of the new model. We are convinced that this new process may attract many wider applications in time series analysis. As future research, it would be of interest to extend the proposed process to order p.


Corresponding author: Maher Kachour, ESSCA School of Management, Lyon, France.

Appendix

Proof of Lemma 1. In proving all cases, we use the fact that Y_i has the same distribution as Z_i − 1, where Z_i ~ B(2, α).

  1. For any y ∈ Z, we get

    P(F ∘ x = y) = P(sign(x) Σ_{i=1}^{|x|} Y_i = y) = P(Σ_{i=1}^{|x|} Z_i = |x| + sign(x) y),

    where Σ_{i=1}^{|x|} Z_i has a binomial distribution with parameters 2|x| and α, i.e. B(2|x|, α). This completes the proof.

  2. E(F ∘ x) = Σ_{y∈Z} y P(F ∘ x = y) = Σ_{z=0}^{2|x|} sign(x)(z − |x|) P(Z = z),

    where Z ~ B(2|x|, α) and sign(x)|x| = x. So, we have

    E(F ∘ x) = sign(x)(E(Z) − |x|) = sign(x)(2α|x| − |x|) = (2α − 1)x.

  3. V(F ∘ x) = E((F ∘ x)²) − (E(F ∘ x))² = Σ_{y∈Z} y² P(F ∘ x = y) − (2α − 1)²x² = E(Z²) − 2|x|E(Z) + |x|² − (2α − 1)²x² = 2α(1 − α)|x|.

  4. E(s^{F∘x}) = Σ_{y∈Z} s^y P(F ∘ x = y) = Σ_{z=0}^{2|x|} s^{sign(x)(z−|x|)} P(Z = z) = s^{−x}(1 − α + α s^{sign(x)})^{2|x|}.

  5. First note that

    E(|F ∘ x|) = Σ_{y∈Z} |y| P(F ∘ x = y) = Σ_{z=0}^{2|x|} |sign(x)(z − |x|)| P(Z = z) = E(|Z − |x||).

    By utilizing the fact that max(|a| − |b|, 0) ≤ |a − b| ≤ |a| + |b|, we obtain

    max(E(Z) − |x|, 0) ≤ E(|Z − |x||) ≤ E(Z) + |x|,

    and the proof is completed by substituting E(Z) = 2|x|α. □
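The distributional identity in part 1 can be checked numerically. The sketch below (an illustration with arbitrary α and x, not part of the paper) simulates F ∘ x through its definition Y_i = Z_i − 1, Z_i ~ B(2, α), and compares the empirical frequencies with the B(2|x|, α) formula of Lemma 1:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
alpha, x = 0.3, -4  # arbitrary illustrative values
sgn = 1 if x > 0 else -1

# Simulate F∘x = sign(x) * sum_{i=1}^{|x|} Y_i with Y_i = Z_i - 1, Z_i ~ B(2, alpha)
draws = sgn * (rng.binomial(2, alpha, size=(200_000, abs(x))) - 1).sum(axis=1)

def pmf(y):
    # Lemma 1: P(F∘x = y) = P(Z = |x| + sign(x)*y) with Z ~ Bin(2|x|, alpha)
    z, n = abs(x) + sgn * y, 2 * abs(x)
    if not 0 <= z <= n:
        return 0.0
    return math.comb(n, z) * alpha**z * (1 - alpha)**(n - z)

for y in range(-abs(x), abs(x) + 1):
    print(y, round(float((draws == y).mean()), 4), round(pmf(y), 4))
```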

Proof of Theorem 1. Under the assumptions of Theorem 1, one can deduce that

  1. F({0}) = P(Y_i = 0) > 0,

  2. for all a ∈ Z, P(ε_0 = a) > 0,

  3. the root of the polynomial function

    P(z) = 1 − (2α − 1)z

    lies outside the unit disc (i.e. 1/|2α − 1| > 1).

Consequently, the assumptions of Theorem 1 of Kachour and Truquet (2011) are satisfied. Thus, the process X_t defined in (3) has a unique stationary solution. On the other hand, from Remarks 2 and 5, for every integer l > 2, we have that

  1. the lth moment of F is finite,

  2. E|ε_t|^l < +∞.

Once again, based on Theorem 1 of Kachour and Truquet (2011), one can deduce that E|X_0|^l < +∞. □

Proof of Proposition 1. Let F_j denote the filtration σ(X_{j−i} | i ∈ N). Then, based on Lemma 1, we have

  1. E(X_t | F_{t−1}) = E(F ∘ X_{t−1} | F_{t−1}) + E(ε_t) = (2α − 1)X_{t−1} + (2p − 1)λ.

  2. V(X_t | F_{t−1}) = V(F ∘ X_{t−1} | F_{t−1}) + V(ε_t) = 2α(1 − α)|X_{t−1}| + λ + 4p(1 − p)λ².

  3. E(X_t) = E(F ∘ X_{t−1}) + E(ε_t) = E(E(F ∘ X_{t−1} | F_{t−1})) + (2p − 1)λ = (2α − 1)E(X_t) + (2p − 1)λ.

  4. V(X_t) = V(F ∘ X_{t−1}) + V(ε_t) = V(E(F ∘ X_{t−1} | F_{t−1})) + E(V(F ∘ X_{t−1} | F_{t−1})) + V(ε_t) = (2α − 1)²V(X_t) + 2α(1 − α)E(|X_t|) + λ(1 + 4p(1 − p)λ).

  5. E(s^{F∘X_{t−1}}) = E(E(s^{F∘X_{t−1}} | F_{t−1})) = P(X_{t−1} = 0) + Σ_{x∈Z*} E(s^{F∘x}) P(X_{t−1} = x).

  6. cov(X_t, X_{t−1}) = cov(F ∘ X_{t−1}, X_{t−1}) = E(X_{t−1} E(F ∘ X_{t−1} | F_{t−1})) − (2α − 1)E²(X_{t−1}) = (2α − 1)E(X²_{t−1}) − (2α − 1)E²(X_{t−1}) = (2α − 1)V(X_{t−1}). □
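A quick Monte Carlo check of the conditional mean and variance in parts 1 and 2 (illustrative parameter values; the E-Po draw again uses the signed-Poisson construction assumed earlier, which matches the stated innovation moments):

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, p, lam, x = 0.7, 0.4, 2.0, -5  # illustrative values
n = 500_000

# F∘x given X_{t-1} = x (Lemma 1); sign(x) = -1 for this x
thin = -(rng.binomial(2 * abs(x), alpha, n) - abs(x))
# Signed-Poisson stand-in for E-Po(p, lam)
eps = np.where(rng.random(n) < p, 1, -1) * rng.poisson(lam, n)
xt = thin + eps

# Compare empirical moments with the Proposition 1 formulas
print(xt.mean(), (2 * alpha - 1) * x + (2 * p - 1) * lam)
print(xt.var(),
      2 * alpha * (1 - alpha) * abs(x) + lam + 4 * p * (1 - p) * lam**2)
```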

Proof of Proposition 2. Note that, for all a ∈ Z, the marginal probability function of X_t can be calculated as

(15) P(X_t = a) = Σ_{i∈Z} P(F ∘ X_{t−1} = i) P(ε_t = a − i),

where

(16) P(F ∘ X_{t−1} = i) = Σ_{j∈Z} P(F ∘ X_{t−1} = i, X_{t−1} = j) = Σ_{j∈Z*} P(F ∘ X_{t−1} = i | X_{t−1} = j) P(X_{t−1} = j) + P(F ∘ X_{t−1} = i | X_{t−1} = 0) P(X_{t−1} = 0).

Now, by substituting (16) into (15), we have

(17) P(X_t = a) = Σ_{i∈Z} Σ_{j∈Z*} P(F ∘ j = i) P(X_{t−1} = j) P(ε_t = a − i) + Σ_{i∈Z} P(F ∘ X_{t−1} = i | X_{t−1} = 0) P(X_{t−1} = 0) P(ε_t = a − i),

where

(18) P(F ∘ X_{t−1} = i | X_{t−1} = 0) = 1 if i = 0, and 0 otherwise.

The proof is completed. □

Proof of Proposition 3. Using Proposition 1, the one-step-ahead conditional mean equals

E(X_{t+1} | X_t) = E(F ∘ X_t + ε_{t+1} | X_t) = (2α − 1)X_t + (2p − 1)λ.

Based on the above equation, we have

E(X_{t+2} | X_t) = E(E(X_{t+2} | X_{t+1}) | X_t) = E((2α − 1)X_{t+1} + (2p − 1)λ | X_t) = (2α − 1)²X_t + (2p − 1)λ(1 + (2α − 1)),

and in the same manner, one can see that

E(X_{t+3} | X_t) = E(E(X_{t+3} | X_{t+1}) | X_t) = E((2α − 1)²X_{t+1} + (2p − 1)λ(1 + (2α − 1)) | X_t) = (2α − 1)²((2α − 1)X_t + (2p − 1)λ) + (2p − 1)λ(1 + (2α − 1)) = (2α − 1)³X_t + (2p − 1)λ(1 + (2α − 1) + (2α − 1)²).

Hence, by induction, one can conclude that

E(X_{t+k} | X_t) = (2α − 1)^k X_t + (2p − 1)λ(1 + (2α − 1) + ⋯ + (2α − 1)^{k−1}) = (2α − 1)^k X_t + (2p − 1)λ (1 − (2α − 1)^k) / (2(1 − α)). □
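The closed form above can be coded directly. The sketch below (illustrative, using the CML estimates from Section 4.2) shows the k-step forecasts decaying geometrically toward the stationary mean (2p − 1)λ / (2(1 − α)):

```python
alpha, p, lam = 0.6448645, 0.5345621, 3.402455  # CML estimates, Section 4.2

def k_step_mean(x_t, k, alpha, p, lam):
    # E(X_{t+k} | X_t) = (2a-1)^k x_t + (2p-1) lam (1 - (2a-1)^k) / (2(1-a))
    a1 = 2 * alpha - 1
    return a1**k * x_t + (2 * p - 1) * lam * (1 - a1**k) / (2 * (1 - alpha))

# Forecasts starting from X_t = 10 converge to the stationary mean
for k in (1, 5, 20):
    print(k, round(k_step_mean(10, k, alpha, p, lam), 4))
print(round((2 * p - 1) * lam / (2 * (1 - alpha)), 4))  # stationary mean
```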

References

Al-Osh, M. and Alzaid, A.A. (1987). First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 8: 261–275, https://doi.org/10.1111/j.1467-9892.1987.tb00438.x.

Alzaid, A.A. and Omair, M.A. (2014). Poisson difference integer valued autoregressive model of order one. Bull. Malaysian Math. Sci. Soc. 37: 465–485.

Bakouch, H.S., Kachour, M., and Nadarajah, S. (2016). An extended Poisson distribution. Commun. Stat. Theor. Methods 45: 6746–6764, https://doi.org/10.1080/03610926.2014.967587.

Barreto-Souza, W. and Bourguignon, M. (2015). A skew INAR(1) process on Z. AStA Adv. Stat. Anal. 99: 189–208, https://doi.org/10.1007/s10182-014-0236-2.

Bourguignon, M. and Vasconcellos, K.L. (2016). A new skew integer valued time series process. Stat. Methodol. 31: 8–19, https://doi.org/10.1016/j.stamet.2016.01.002.

Bulla, J., Chesneau, C., and Kachour, M. (2011). A bivariate first-order signed integer-valued autoregressive process. Commun. Stat. Theor. Methods 46: 6590–6604, https://doi.org/10.1080/03610926.2015.1132322.

Chen, H., Zhu, F., and Liu, X. (2023). Two-step conditional least squares estimation for the bivariate Z-valued INAR(1) model with bivariate Skellam innovations. Commun. Stat. Theor. Methods, https://doi.org/10.1080/03610926.2023.2172587.

Chesneau, C. and Kachour, M. (2012). A parametric study for the first-order signed integer-valued autoregressive process. J. Stat. Theory Pract. 6: 760–782, https://doi.org/10.1080/15598608.2012.719816.

Freeland, R.K. (2010). True integer value time series. AStA Adv. Stat. Anal. 94: 217–229, https://doi.org/10.1007/s10182-010-0135-0.

Homburg, A., Weiß, C.H., Alwan, L.C., Frahm, G., and Göb, R. (2019). Evaluating approximate point forecasting of count processes. Econometrics 7: 30, https://doi.org/10.3390/econometrics7030030.

Kachour, M. (2014). On the rounded integer-valued autoregressive process. Commun. Stat. Theor. Methods 43: 355–376, https://doi.org/10.1080/03610926.2012.661506.

Kachour, M. and Truquet, L. (2011). A p-order signed integer-valued autoregressive (SINAR(p)) model. J. Time Ser. Anal. 32: 223–236, https://doi.org/10.1111/j.1467-9892.2010.00694.x.

Kachour, M. and Yao, J.F. (2009). First-order rounded integer-valued autoregressive (RINAR(1)) process. J. Time Ser. Anal. 30: 417–448, https://doi.org/10.1111/j.1467-9892.2009.00620.x.

Kim, H.Y. and Park, Y. (2008). A non-stationary integer-valued autoregressive model. Stat. Pap. 49: 485–502, https://doi.org/10.1007/s00362-006-0028-1.

Latour, A. and Truquet, L. (2008). An integer-valued bilinear type model. Available at: http://hal.archives-ouvertes.fr/hal-00373409/fr/.

Liu, Z., Li, Q., and Zhu, F. (2021). Semiparametric integer-valued autoregressive models on Z. Can. J. Stat. 49: 1317–1337, https://doi.org/10.1002/cjs.11621.

McKenzie, E. (1985). Some simple models for discrete variate time series. Water Resour. Bull. 21: 645–650, https://doi.org/10.1111/j.1752-1688.1985.tb05379.x.

Scotto, M.G., Weiß, C.H., and Gouveia, S. (2015). Thinning-based models in the analysis of integer-valued time series: a review. Stat. Model. Int. J. 15: 590–618, https://doi.org/10.1177/1471082x15584701.

Steutel, F.W. and van Harn, K. (1979). Discrete analogues of self-decomposability and stability. Ann. Probab. 7: 893–899, https://doi.org/10.1214/aop/1176994950.

Taveira da Cunha, E., Vasconcellos, K.L., and Bourguignon, M. (2018). A skew integer-valued time-series process with generalized Poisson difference marginal distribution. J. Stat. Theory Pract. 12: 718–743, https://doi.org/10.1080/15598608.2018.1470046.

Weiß, C.H. (2008). Thinning operations for modeling time series of counts: a survey. AStA Adv. Stat. Anal. 92: 319–343, https://doi.org/10.1007/s10182-008-0072-3.

Weiß, C., Scherer, L., Aleksandrov, B., and Feld, M. (2019). Checking model adequacy for count time series by using Pearson residuals. J. Time Ser. Econom. 12, https://doi.org/10.1515/jtse-2018-0018.

Zhang, H., Wang, D., and Zhu, F. (2010). Inference for INAR(p) processes with signed generalized power series thinning operator. J. Stat. Plann. Inference 140: 667–683, https://doi.org/10.1016/j.jspi.2009.08.012.

Zhang, H., Wang, D., and Zhu, F. (2012). Generalized RCINAR(1) process with signed thinning operator. Commun. Stat. Theor. Methods 41: 1750–1770, https://doi.org/10.1080/03610926.2010.551452.

Received: 2022-09-26
Accepted: 2023-04-21
Published Online: 2023-06-07
Published in Print: 2023-04-25

© 2023 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
