
Detecting Change-Point via Saddlepoint Approximations

  • Zhaoyuan Li and Maozai Tian
Published/Copyright: June 8, 2017

Abstract

It is well known that the change-point problem is an important part of statistical model analysis. Most existing methods are not robust to the criteria used to evaluate change-point problems. In this article, we consider the “mean-shift” problem in change-point studies. A quantile test for a single quantile is proposed based on the saddlepoint approximation method. In order to utilize the information at different quantiles of the sequence, we further construct a “composite quantile test” to calculate the probability of every location of the sequence being a change-point. The location of a change-point can be pinpointed rather than estimated within an interval. The proposed tests make no assumptions about the functional form of the sequence distribution and work sensitively on both large and small samples, on change-points in the tails, and in multiple change-point situations. The good performance of the tests is confirmed by simulations and real data analysis. The saddlepoint-approximation-based distribution of the test statistic developed in this paper may be of independent interest to readers in this research area.

1 Introduction

Consider a sequence of random variables X1, X2, ⋯, Xn. The sequence is said to have a change-point at m if Xt for t = 1, 2, ⋯, m have a common distribution function F1(x), Xt for t = m + 1, m + 2, ⋯, n have a common distribution function F2(x), and F1(x) ≠ F2(x). Change-point analysis is very important and is applied in many fields, including macroeconomics[2], the environment[3,4], quality control[5–7], medicine[8], and geology[9].

Theoretically, various authors have concentrated on change-point problems. Broadly, studies fall into two streams, “change-point in distribution” and “change-point in regression function”. “Change-point in distribution” refers to the existence of a pre-change distribution and a post-change distribution in the sequence[10]. Studies on this issue include Pettitt[1, 11], Hinkley[12, 13], Talwar and Gentle[14], Carlstein[15], Inclan and Tiao[16], Hawkins and Zamba[35], and Kawahara and Sugiyama[18]. “Change-point in regression function”, on the other hand, refers to changes in the regression model, such as parameter instability and structural change[19]. Previous studies include Muller[20] and Loader[21], who adopt kernel smoothers, and Davis, et al.[22], Huskova, et al.[23] and Gombay, et al.[24], who consider autoregressive models.

The existing methods have several limitations. 1) Large-sample properties have been derived in most previous works, while few developments have been made on small-sample properties; these methods usually fail when the available sample is very small. 2) The computations for existing methods are complicated and time-consuming. 3) When the change-point occurs in the lower or upper tail of the sequence, most approaches become ineffective. 4) Even in common settings without a change-point in the tails, existing methods can only give an estimated interval in which the change-point may be located. These limitations hinder the application of change-point methods in practice.

To settle these problems, we first propose a quantile test for a single τ based on the theory of saddlepoint approximations. Further, a “composite quantile test” is constructed to collect the information at each τ of the sequence and give the probability of each location being a change-point. This detection scheme effectively solves the problems above and has several good properties. 1) No assumptions are made about the functional form of the distribution of the independent random variables X1, X2, ⋯, Xn. 2) The computation of the “composite quantile test” is quick and simple. 3) The detection scheme is very sensitive to the varying information of the sequence; not only can it give the probable interval containing a change-point, it can also sensitively pinpoint the location. 4) The approach can solve the change-point problem with much smaller sample sizes than other existing methods. 5) The scheme works well when the change-point occurs in the tails of the sequence. 6) Multiple change-points can also be successfully detected.

The “mean shift” problem is an important issue in studies of both change-point in distribution and change-point in regression function. Pre-shift and post-shift means are to be estimated concurrently with the change point. Many former studies focus on this problem, such as Kokoszka and Leipus[26], Ferger[27] and Shi, et al.[28, 29]. This paper presents the performance of the “composite quantile test” on the “mean shift” problem as a representative attempt. Other change-point models may be derived based on this idea.

Suppose X1, X2, ⋯, Xm are independent with mean μ, and Xm+1, Xm+2, ⋯, Xn are independent with mean μ + δ. The two sub-sequences are also independent of each other, where m is an unknown positive integer giving the location of the change point. The mean change-point model is formulated as:

(1.1)
$$
X_i=
\begin{cases}
Z_i, & \text{if } i \le m,\\
Z_i+\delta, & \text{if } i > m,
\end{cases}
$$

where {Zi, i = 1, 2, ⋯, n} is a sequence of random variables with constant mean μ, and δ is the unknown amount of change in the mean. If δ = 0, there is no change-point at m; otherwise, a change-point exists. Under the null hypothesis, X1, X2, ⋯, Xn is an i.i.d. sequence, whereas under the alternative two sub-sequences exist with δ < 0, indicating that the mean of X1, X2, ⋯, Xm is greater than the mean of Xm+1, Xm+2, ⋯, Xn. In practice, this problem arises, for example, in detecting whether the quality of articles declines[7, 30]. Similarly, δ > 0 means that the mean of quantity or quality increases by δ. The objective is to determine whether δ ≠ 0 at m and pinpoint the location of the change-point.
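As a concrete illustration, model (1.1) can be simulated directly. The following is a minimal sketch assuming normally distributed Zi; the function name and default values are ours, not the paper's:

```python
import numpy as np

def generate_mean_shift(n, m, mu=0.0, delta=-1.0, sigma=1.0, seed=0):
    """Simulate model (1.1): X_i = Z_i for i <= m and Z_i + delta for i > m,
    with Z_i ~ Norm(mu, sigma). Indices are 0-based, so x[m:] holds X_{m+1..n}."""
    rng = np.random.default_rng(seed)
    z = rng.normal(mu, sigma, size=n)  # the no-change sequence {Z_i}
    x = z.copy()
    x[m:] += delta                     # shift the post-change segment by delta
    return x
```

With δ < 0 the post-change mean drops by |δ|, matching the alternative H1 considered in Section 2.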

2 Quantile Test for Single Quantile

The hypothesis test used here is

  • Null hypothesis H0: δ = 0, that is, a mean shift does not happen at location m, m = 2, 3, ⋯, n – 1, denoted as η ∈ Θ0;

  • Alternative hypothesis H1: δ < 0, that is, a change-point occurs and the mean of the sequence decreases, denoted as η ∈ Θ0c.

For the case of an increasing mean, the inequality in the alternative hypothesis above is reversed. For the null hypothesis of “no change” H0, the test statistic Tτ is constructed as the number of observations among X1, X2, ⋯, Xm which are greater than the τ quantile of the whole sequence X1, X2, ⋯, Xn, that is,

$$T_\tau=\sum_{i=1}^{m} I\left(X_i > Q_\tau(X)\right), \quad m=2,3,\cdots,n-1,$$

where I(·) is the indicator function, which is 1 for Xi > Qτ(X) and 0 for Xi ≤ Qτ(X), and Qτ(X) is the τ quantile of the whole sequence.
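The statistic Tτ is a simple indicator count. A minimal sketch follows, using NumPy's empirical quantile as Qτ(X); the choice of sample-quantile convention is our assumption, since the paper does not fix one:

```python
import numpy as np

def T_tau(x, m, tau):
    """Number of observations among X_1..X_m that exceed the tau-quantile
    of the whole sequence X_1..X_n (the statistic T_tau)."""
    q = np.quantile(x, tau)           # empirical Q_tau(X) of the full sequence
    return int(np.sum(x[:m] > q))     # I(X_i > Q_tau(X)) summed over i <= m
```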

Remark 1

In practice, we do not know the location of the change point; however, we can proceed step by step. For more details about homogeneity testing, we refer to Tian and Chan[50, 51] and the references therein. Specifically,

  1. Initialization

    As a starting point, consider the initial state in which m = 2. To this end, the data set is divided into two parts: {X1, X2} and {X3, X4, ⋯, Xn}. Calculate the overall type I error, etc.

  2. Enlarging the homogeneous region

    Based on the previous step, consider m = 3, and conduct a test for the homogeneity hypothesis (i.e., no change point). In this case, the data set is divided into two parts, {X1, X2, X3} and {X4, X5, ⋯, Xn}, and the overall type I error is calculated based on the two updated parts.

  3. Iteration

    Following the previous two steps, repeat until m = n.

In order to speed up the procedure, we may use, for example, m = 2k, k = 1, 2, ⋯ instead of m = 2, 3, ⋯. Of course, there are many other ways to deal with such a problem, such as the inverse procedure, in which the whole data set is split into two parts in the first step; the sample sizes of the two parts need not be equal. Then a test of the homogeneity hypothesis (i.e., no change point) is conducted for each part. Repeat until all change-points are detected.
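The stepwise scan of Remark 1 can be sketched as follows. Here `threshold_fn` stands in for the null criterion t0(τ) developed later in this section; its form and the function names are assumptions of this sketch:

```python
import numpy as np

def scan_for_change_points(x, tau, threshold_fn):
    """Scan m = 2, ..., n-1 and flag every location whose observed count
    t1 exceeds the null threshold t0 = threshold_fn(n, m, tau)."""
    n = len(x)
    q = np.quantile(x, tau)                 # tau-quantile of the whole sequence
    flagged = []
    for m in range(2, n):                   # m = 2, 3, ..., n-1
        t1 = int(np.sum(x[:m] > q))
        if t1 > threshold_fn(n, m, tau):
            flagged.append(m)
    return flagged
```

Under H0 roughly m(1 − τ) of the first m points exceed Qτ(X), so a crude placeholder threshold is m(1 − τ); the paper replaces this with the saddlepoint-based threshold t0(τ).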

The distribution of Tτ is of main interest. The cumulative distribution function of Tτ is

(2.1)
$$F_{T_\tau}(t)=P(T_\tau\le t).$$

Proposition 1

Assume that the sequence X1, X2, ⋯, Xn is i.i.d. with positive variance σ2(X) and X ∈ ℝ, on probability space (Ω, 𝓐, P). Applying the saddlepoint approximation method, the first-order saddlepoint approximation of formula (2.1) is given by

(2.2)
$$
P(T_\tau\le t)\approx 1-\frac{\{n-\tau(n+1)\}^{\,n-\tau(n+1)+\frac12}\,\{\tau(n+1)\}^{\,\tau(n+1)+\frac12}\,(n-m)^{\,n-m+\frac12}\,m^{\,m+\frac12}\,\sqrt{(t+1)\{\tau(n+1)+t-m+1\}}\;B_0(\hat\lambda)}{n^{\,n+\frac12}\,\{n-\tau(n+1)-t-1\}^{\,n-\tau(n+1)-t-\frac12}\,\{\tau(n+1)+t-m+1\}^{\,\tau(n+1)+t-m+\frac32}\,(t+1)^{\,t+\frac32}\,(m-t-1)^{\,m-t-\frac12}\,\sqrt{n(t+1)-nm+\tau(n+1)m}},
$$

where

$$
\hat\lambda=\log\!\left[\frac{(t+1)\{\tau(n+1)+t-m+1\}}{(m-t-1)\{n-\tau(n+1)-t-1\}}\right]\times\frac{(t+1)\{n-\tau(n+1)-t-1\}\,(m-t-1)\{\tau(n+1)+t-m+1\}}{\big[(t+1)\{n-\tau(n+1)-t-1\}\tau(n+1)+(m-t-1)\big]^{1/2}}\times\big\{(\tau(n+1)+t-m+1)(n-\tau(n+1))\big\}^{1/2},
$$

and B0(λ^) is Esscher function, given by

$$B_0(\hat\lambda)=\hat\lambda\,\exp\!\left(\frac{\hat\lambda^2}{2}\right)\left[1-\mathfrak{N}(\hat\lambda)\right],$$

where 𝔑(x) is standard normal distribution function,

$$\mathfrak{N}(y)=\int_{-\infty}^{y} n(s)\,ds=\int_{-\infty}^{y}\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{s^2}{2}\right)ds.$$

When $t \le \frac{m[n-\tau(n+1)]}{n}$, formula (2.2) is valid. In order to compute the cumulative probability $P(T_\tau\le t)$, let $m\tau \le \frac{2n}{n+1}$.

The derivation of Proposition 1 proceeds as follows. Let ϖ(Tτ) be the statistic related to Tτ, which helps approximate the distribution of Tτ. The Laplace transform of ϖ(Tτ) is defined as

$$\varphi(\theta)=E\exp\!\left(\theta\varpi(T_\tau)\right)=\sum_{\varpi(T_\tau)}\exp\!\left(\theta\varpi(T_\tau)\right)P\!\left(\varpi(T_\tau)\right).$$

A cumulant transform function is defined as

κ(θ)=logφ(θ),

and a characteristic function is defined as

ϕ(t)=Eexp(itϖ(Tτ)),

where θ ∈ ℝ. The domain of the Laplace transform is Θθ = {θ ∈ ℝ : φ(θ) < ∞}. The positive variance of X1, X2, ⋯, Xn makes the cumulant transform κ(θ) strictly convex on Θθ[25].

Applications of the saddlepoint method to approximating the probability density of a statistic have developed considerably. Jensen[32] proposed several methods and ideas that produce very accurate approximations, and we use one of them in our derivation. Let ϖ̅ be the mean of ϖ(Tτ). The saddlepoint approximation to the density of ϖ̅ is

(2.3)
$$
f_{\bar\varpi}(w)\approx \varphi(\hat\theta)^{\,n}\exp(-n\hat\theta w)\left(\frac{n}{2\pi\Sigma(\hat\theta)}\right)^{1/2}\left[1+\frac{1}{n}\left(\frac{\kappa_4(\hat\theta)}{8\Sigma^2(\hat\theta)}-\frac{5\kappa_3(\hat\theta)^2}{24\Sigma^3(\hat\theta)}\right)\right],
$$

where κ3(·) and κ4(·) are the third and fourth derivatives of κ(·), Σ(θ̂) = σ2(θ̂), and θ̂ is the saddlepoint satisfying the saddlepoint equation,

$$\kappa'(\hat\theta)=w.$$

The equation is also available in (1.5) of Butler[25], and the density approximation formula (2.3) was derived from Jensen[32].
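Because κ is strictly convex, κ′ is strictly increasing and the saddlepoint equation has a unique root, which can be found by simple bisection. A minimal sketch (the helper name is ours), using the Bernoulli(1/2) cumulant of the variables Ui that appear later in formula (2.6):

```python
import math

def solve_saddlepoint(kappa_prime, w, lo=-50.0, hi=50.0):
    """Solve kappa'(theta) = w by bisection, assuming kappa' is
    strictly increasing (kappa strictly convex) on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kappa_prime(mid) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Bernoulli(1/2): kappa(theta) = log((1 + e^theta)/2),
# so kappa'(theta) = e^theta / (1 + e^theta).
kappa_prime = lambda th: math.exp(th) / (1.0 + math.exp(th))
theta_hat = solve_saddlepoint(kappa_prime, 0.75)   # exact root: log 3
```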

For continuous random variables, a saddlepoint approximation to the cumulative distribution function (CDF) of ϖ̅ is equally straightforward,

(2.4)
$$
P(\bar\varpi\ge w)\approx\frac{\varphi(\hat\theta)^{\,n}\exp(-n\hat\theta w)}{\sqrt{n}\,|\hat\theta|\,\sigma(\hat\theta)}\left\{B_0(\lambda)+\operatorname{sgn}(\hat\theta)\frac{\kappa_3(\hat\theta)}{6\sqrt{n}\,\Sigma^{3/2}(\hat\theta)}B_3(\lambda)-\frac{1}{n}\left[\frac{\kappa_4(\hat\theta)}{24\Sigma^2(\hat\theta)}B_4(\lambda)+\frac{\kappa_3(\hat\theta)^2}{72\Sigma^3(\hat\theta)}B_6(\lambda)\right]\right\},
$$

where

$$\Sigma(\hat\theta)=\sigma^2(\hat\theta)=\kappa''(\hat\theta),\qquad \lambda=\sqrt{n}\,|\hat\theta|\,\sigma(\hat\theta),$$
$$B_3(\lambda)=\lambda^3 B_0(\lambda)-(\lambda^3-\lambda)(2\pi)^{-1/2},$$
$$B_4(\lambda)=\lambda^4 B_0(\lambda)-(\lambda^4-\lambda^2)(2\pi)^{-1/2},$$
$$B_6(\lambda)=\lambda^6 B_0(\lambda)-(\lambda^6-\lambda^4+3\lambda^2)(2\pi)^{-1/2}.$$

In this case, the saddlepoint θ̂ is obtained by solving κ′(θ̂) = w − ϵ with ϵ = 0.

For lattice random variables, an expanded formula similar to (2.4) is used:

(2.5)
$$
\begin{aligned}
P(\bar\varpi\ge w)\approx{}&\frac{\varphi(\hat\theta)^{\,n}\exp(-n\hat\theta w)}{\sqrt{n}\,\sigma(\hat\theta)\{1-\exp(-|\hat\theta|)\}}\times\Bigg\{B_0(\lambda)+\frac{1}{\sqrt{n}}\left[\frac{1}{\sigma(\hat\theta)}\left(\frac{1}{|\hat\theta|}-\frac{\exp(-|\hat\theta|)}{1-\exp(-|\hat\theta|)}\right)B_1(\lambda)+\frac{\kappa_3(\hat\theta)}{6\Sigma^{3/2}(\hat\theta)}\operatorname{sgn}(\hat\theta)B_3(\lambda)\right]\\
&+\frac{1}{n}\Bigg[\left(\frac{\exp(-|\hat\theta|)}{1-\exp(-|\hat\theta|)}-\frac{1}{2|\hat\theta|}+\left[\frac{\exp(-|\hat\theta|)}{1-\exp(-|\hat\theta|)}\right]^2\right)\frac{B_2(\lambda)}{\Sigma(\hat\theta)}+\frac{1}{\sigma(\hat\theta)}\left(\frac{1}{|\hat\theta|}-\frac{\exp(-|\hat\theta|)}{1-\exp(-|\hat\theta|)}\right)\frac{\kappa_3(\hat\theta)}{6\Sigma^{3/2}(\hat\theta)}\operatorname{sgn}(\hat\theta)B_4(\lambda)\\
&+\frac{\kappa_4(\hat\theta)}{24\Sigma^2(\hat\theta)}B_4(\lambda)+\frac{\kappa_3(\hat\theta)^2}{72\Sigma^3(\hat\theta)}B_6(\lambda)\Bigg]\Bigg\},
\end{aligned}
$$

where

$$B_1(\lambda)=\lambda\left\{B_0(\lambda)-(2\pi)^{-1/2}\right\},\qquad B_2(\lambda)=\lambda^2\left\{B_0(\lambda)-(2\pi)^{-1/2}\right\}.$$

The saddlepoint θ̂ in this case is obtained by solving κ′(θ̂) = w − ϵ/2. Without loss of generality, let ϵ = 1/n.

The distribution of Tτ can be represented as a conditional distribution. Let ϑi = 0 for i ≤ τ(n + 1) and ϑi = 1 for i > τ(n + 1). The cumulative distribution function of Tτ is

(2.6)
$$F_{T_\tau}(t)=P\!\left(\sum_{i=1}^{n}\vartheta_i U_i \le t \,\Big|\, \sum_{i=1}^{n} U_i=m\right),\quad t=0,1,\cdots,m,$$

where U1, U2, ⋯, Un are i.i.d. with $P(U_i=1)=P(U_i=0)=\frac12$. Therefore, the statistic is $\varpi(T_\tau)=\left(\sum_{i=1}^{n}\vartheta_i U_i,\ \sum_{i=1}^{n} U_i\right)$.

In order to approximate the above formula, in which θ ∈ ℝ2, formula (2.5) is extended to formula (2.2.15) in Jensen[32],

(2.7)
$$
P\!\left(\bar\varpi_1(T_\tau)\ge w_1\mid \bar\varpi_2(T_\tau)=w_2\right)=\exp\!\left(-\frac{r^2}{2}\right)\left(\frac{|\Sigma_{22}(\tilde\theta)|}{n\,|\Sigma(\hat\theta)|}\right)^{1/2}\frac{1}{|\hat\theta_1|}\,B_0(\hat\lambda)+O\!\left(B_0(\hat\lambda)\,\frac{1+\hat\lambda}{n^{1/2}}\right),
$$

where $r^2=2n\{(\hat\theta-\tilde\theta)'s-\kappa(\hat\theta)+\kappa(\tilde\theta)\}$, and ϖ̅1 and ϖ̅2 are defined as in the Appendix. This outlines the derivation of Proposition 1; details of the derivation from formula (2.6) to formula (2.2) for our case are given in the Appendix.

According to Proposition 1, the probability distribution of Tτ can be obtained via P(Tτ = t) = P(Tτ ≤ t) − P(Tτ ≤ t − 1), which is denoted as fTτ(t) for t = 0, 1, ⋯, m.

If a change-point does not exist at location m of the sequence, the distribution of Tτ is well approximated by the saddlepoint approximation, formula (2.2). For numerically stable computation, formula (2.2) is rewritten as

(2.8)
$$
\begin{aligned}
P(T_\tau\le t)\approx{}&1-\left[\frac{n-\tau(n+1)}{n-\tau(n+1)-t-1}\right]^{n-\tau(n+1)-t-\frac12}\times\left[\frac{\tau(n+1)}{\tau(n+1)+t-m+1}\right]^{\tau(n+1)+\frac12}\times\left[\frac{n-\tau(n+1)}{\tau(n+1)+t-m+1}\right]^{t+1}\\
&\times\left[\frac{n-m}{n}\right]^{n+\frac12}\times\left[\frac{\tau(n+1)+t-m+1}{n-m}\right]^{m}\times\left[\frac{m}{m-t-1}\right]^{m+\frac12}\times\left[\frac{m-t-1}{t+1}\right]^{t+1}\times(t+1)^{-\frac12}\\
&\times\frac{\sqrt{(t+1)[\tau(n+1)+t-m+1]}\;B_0(\hat\lambda)}{\sqrt{n(t+1)-nm+\tau(n+1)m}}.
\end{aligned}
$$

The saddlepoint-approximated distribution of Tτ is derived under the null hypothesis of “no change”.

For each location m, the significance level is set to α = 0.05. The criterion for the hypothesis test is the value of Tτ, denoted t0(τ), at which the cumulative distribution function of Tτ equals 0.95. That is,

$$t_0(\tau)=\arg\left\{\sum_{t=0}^{t_0(\tau)} f_{T_\tau}(t)=0.95\right\}.$$

Here, t0(τ) is not necessarily an integer, and

$$\sum_{t=0}^{\mathrm{int}(t_0(\tau))} f_{T_\tau}(t)\le 0.95\le \sum_{t=0}^{\mathrm{int}(t_0(\tau))+1} f_{T_\tau}(t),$$

where int(t) denotes the largest integer smaller than t. The plots of FTτ for m = 10, m = 50 and m = 90 are presented in Figure 1; the curve of FTτ intersects 0.95 at t0(τ). When k values of τ are considered, for each location m = 4, 5, ⋯, n – 1, the threshold matrix of t0(τ) over {τj, j = 1, 2, ⋯, k} is plotted in Figure 2.
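Given the null pmf fTτ(t) from Proposition 1, the non-integer threshold t0(τ) can be read off the CDF by linear interpolation between the two bracketing integers. A sketch, with names of our choosing:

```python
import numpy as np

def threshold_t0(pmf, level=0.95):
    """Return the (possibly non-integer) t0 where the CDF of T_tau
    crosses `level`, interpolating linearly between int(t0) and int(t0)+1."""
    cdf = np.cumsum(pmf)
    k = int(np.searchsorted(cdf, level))       # smallest t with CDF(t) >= level
    if k == 0:
        return -1 + level / cdf[0]             # crossing below t = 0
    return (k - 1) + (level - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
```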

Figure 1  FTτ: (a) describes FTτ when m = 10, which intersects the line 0.95 at t0(τ); (b) describes FTτ when m = 50, which intersects the line 0.95 at t0(τ); (c) describes FTτ when m = 90, which intersects the line 0.95 at t0(τ)

Figure 2  The matrix of t0(τ): (a) describes the surface of t0(τ) in the large sample case (n = 100); (b) is the surface of t0(τ) in the small sample case (n = 20)

When a “mean shift” happens, the distribution of Tτ for the data deviates greatly from the distribution under the null hypothesis. If m is given, Tτ for the target sequence is calculated as t1(τ) and compared with t0(τ). If t1(τ) > t0(τ), the null hypothesis is rejected and a change point exists at location m of the sequence X1, X2, ⋯, Xn. If t1(τ) < t0(τ) instead, the null hypothesis cannot be rejected and a change-point cannot be declared. By shifting the value of m from 2 to n – 1, we can detect whether a change point exists in the target sequence and pinpoint its location. Therefore, a quantile test that rejects H0 if t1X(τ) > t0(τ) is a test with rejection region R = {tX : t1X(τ) > t0(τ)}.

To compute the power of the hypothesis test, the distribution of Tτ under the alternative hypothesis is obtained by applying the bootstrap method[33]. The basic steps proceed as follows. Assume that a random sample of size n is drawn from an unspecified probability distribution Ξ.

Step 1. Place a probability of 1/n at each point x1, x2, ⋯, xn; each sample element has the same probability of being drawn.

Step 2. From the sample x1, x2, ⋯, xn, draw a random sample of size d with replacement.

Step 3. Calculate the statistic of interest on the resample and rescale it, yielding $T_\tau^{*}=T_\tau\times\frac{n}{d}$.

Step 4. Repeat Steps 2 and 3 N (N > 1000) times.

Step 5. Construct the relative frequency distribution of the N values of $T_\tau^{*}$ by placing a probability of 1/N at each point $T_\tau^{*1}, T_\tau^{*2}, \cdots, T_\tau^{*N}$. The distribution obtained is the bootstrapped estimate of the sampling distribution of Tτ.
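Steps 1–5 above can be sketched as follows. The resample size d, the rescaling by n/d from Step 3 and the 1/N masses from Step 5 follow the description above; the function name, defaults, and the choice of counting over the first min(m, d) resampled points are our simplifying assumptions:

```python
import numpy as np

def bootstrap_T_tau(x, m, tau, d=None, N=2000, seed=0):
    """Bootstrap estimate of the sampling distribution of T_tau.
    Step 1 is implicit in uniform resampling; Steps 2-3 are repeated
    N times (Step 4); Step 5 returns values with probability 1/N each."""
    rng = np.random.default_rng(seed)
    n = len(x)
    d = d if d is not None else n
    stats = np.empty(N)
    for b in range(N):
        xb = rng.choice(x, size=d, replace=True)   # Step 2: resample of size d
        q = np.quantile(xb, tau)
        t = np.sum(xb[:min(m, d)] > q)             # statistic on the resample
        stats[b] = t * n / d                       # Step 3: rescale by n/d
    values, counts = np.unique(stats, return_counts=True)
    return values, counts / N                      # Step 5: mass 1/N per draw
```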

The bootstrapped sampling distribution of Tτ is denoted fTτboot(t). The power of the hypothesis test, denoted βτ(η), can be calculated by

(2.9)
$$
\beta_\tau(\eta)=
\begin{cases}
\displaystyle\sum_{t=\mathrm{int}(t_0(\tau))+1}^{m} f_{T_\tau}(t), & \text{if } \eta\in\Theta_0,\\[2ex]
\displaystyle 1-\sum_{t=0}^{\mathrm{int}(t_0(\tau))} f_{T_\tau}^{boot}(t), & \text{if } \eta\in\Theta_0^{c},
\end{cases}
$$

where $\sum_{t=\mathrm{int}(t_0(\tau))+1}^{m} f_{T_\tau}(t)$ is the probability of a Type I error and $\sum_{t=0}^{\mathrm{int}(t_0(\tau))} f_{T_\tau}^{boot}(t)$ is the probability of a Type II error. A good hypothesis test has power near 1 for most η ∈ Θ0c and near 0 for most η ∈ Θ0. Since

$$\sup_{\eta\in\Theta_0}\beta_\tau(\eta)=\alpha,$$

the quantile test is a size-α test.

3 Composite Quantile Test

Instead of adopting one τ quantile (the median or another τ) for change-point detection, which drops much valuable information, we collect information from several τ of the sequence X1, X2, ⋯, Xn. A complete detection scheme is constructed to compute the probability of each location m being a change point. The information across k uniformly spaced quantiles {τj, j = 1, 2, ⋯, k} is aggregated to maximise the efficiency of inference, where

$$\tau_j=\frac{j}{k+1},\quad j=1,2,\cdots,k.$$

The probability of a location m being a change-point is calculated by

(3.1)
$$P(m)=\frac{\Lambda_m}{\sum_{m=2}^{n-1}\Lambda_m},$$

where

$$\Lambda_m=\Delta_m-\min\left\{\Delta_m,\ m\in\{2,3,\cdots,n-1\}\right\},$$

and

$$\Delta_m=\sum_{j=1}^{k}\operatorname{sgn}\!\left(t_1(\tau_j)-t_0(\tau_j)\right)\times\left(t_1(\tau_j)-t_0(\tau_j)\right)^2,$$

where

$$
\operatorname{sgn}\!\left(t_1(\tau_j)-t_0(\tau_j)\right)=
\begin{cases}
1, & \text{if } t_1(\tau_j)\ge t_0(\tau_j),\\
-1, & \text{if } t_1(\tau_j)< t_0(\tau_j),
\end{cases}
$$

t0(τj) is the theoretical criterion for the hypothesis test at significance level α = 0.05 when m and τj are given, and t1(τj) denotes the number of points in the series x1, x2, ⋯, xm which are larger than the τj quantile of the whole sequence x1, x2, ⋯, xn. The distance between t1(τj) and t0(τj) indicates the deviation of the distribution of the proposed statistic in the target sequence from the theoretical criterion under the null hypothesis. Euclidean distance is used here.
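Formula (3.1) together with the definitions of Λm and Δm combines into a few lines. A sketch, where t1 and t0 are arrays over locations m = 2, ⋯, n − 1 (rows) and quantiles τj (columns); the function name is ours:

```python
import numpy as np

def composite_quantile_probability(t1, t0):
    """P(m) from formula (3.1): signed squared deviations Delta_m are
    shifted by their minimum (Lambda_m) and normalised over locations.
    Assumes not all Delta_m are equal, so the normaliser is nonzero."""
    diff = np.asarray(t1, float) - np.asarray(t0, float)
    sign = np.where(diff >= 0.0, 1.0, -1.0)   # sgn(t1 - t0), +1 when equal
    delta = np.sum(sign * diff ** 2, axis=1)  # Delta_m
    lam = delta - delta.min()                 # Lambda_m >= 0
    return lam / lam.sum()                    # P(m), sums to 1
```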

The proposed “composite quantile test” can be considered a special procedure combining the results of the single quantile tests with varying weights. For given j, the distance between t1(τj) and t0(τj) determines the weight of the information from the τj quantile test. In this regard, the “composite quantile test” is a series of hypothesis tests collecting, to the greatest extent, the valuable information from the whole sequence instead of limited data points. This is a more sensitive way to detect the change-point location without restricting the form of the sequence distribution.

4 Simulation

Monte Carlo simulation is applied to present the performance of the proposed single quantile test and composite quantile test. Three other detection methods, CUSUM[5, 34], the Student-t statistic[35, 36] and the Mann-Whitney statistic[37, 38], are adopted as benchmarks. Three scenarios with different degrees of “mean shift” are considered in both large and small sample cases. The multiple change-points situation is also tested.

4.1 Single Quantile Test and Power

We analyze the situations of a large sample, n = 100, and a small sample, n = 20. Given simulated sequences with a change-point located at m = 4, 5, ⋯, n – 1, the rate of successful detection in the Monte Carlo simulation is defined as S(τ) = N*/N, where N* is the number of successful detections and N (N = 104) denotes the number of Monte Carlo experiments. The generated sequence Xt for t = 1, 2, ⋯, m follows Norm(μ1, σ) and Xt for t = m + 1, ⋯, n follows Norm(μ2, σ), where μ1 = μ2 + δ·σ. In previous empirical studies, the mean shift is usually larger than one σ; thus, δ is set to −1, −1.5 and −2 in the three scenarios, indicating the degree of “mean shift”. For simplicity, σ = 1. The results for S(τ) are presented in Figure 3. Most locations m show a high success rate S(τ), except for the extreme conditions m ≥ n − 3. This is a very good performance in comparison with earlier studies. The quantile test possesses the advantage that it can find the location of the change point exactly.

Figure 3  S(τ) in cases of n = 100 and n = 20 with δ = −1, −1.5, −2, respectively

Sequences without change-points are generated to test the probability of a Type I error, defined as W(τ) = N̂/N, where N̂ is the number of Monte Carlo experiments resulting in a Type I error. As reported in Figure 4, some large values of W(τ) occur at the lower tail of the sequence when a small τ is adopted; at other locations W(τ) is close to zero. Even in the extreme case of a small sample (n = 20), W(τ) is still very small for more than half of the locations. However, small-quantile tests fail when the sample size is extremely small. For the large sample size, the Type I errors of CUSUM, Student-t and Mann-Whitney are 0.9185, 0.0522 and 0.0556, respectively; the corresponding values are 0.9908, 0.0482 and 0.051 for the extremely small sample size. CUSUM fails, while Student-t and Mann-Whitney still perform well. The Type I errors of Student-t and Mann-Whitney are close to the means of the Type I errors of the 0.5, 0.6 and 0.7 quantile tests obtained from Figure 4, which are 0.0580, 0.0549 and 0.0538 for the large sample size, and 0.0639, 0.0490 and 0.0573 for the extremely small size.

Figure 4  W(τ) in cases of n = 100 and n = 20

The power of the quantile test is defined by formula (2.9). Bootstrap resampling is used with N = 104 replications. Figure 5 presents the results for βτ in the three scenarios δ = −1, −1.5, −2 with n = 100 and n = 20, and the mean of βτ is reported in Table 1. βτ, which increases with the degree of “mean shift”, shows the good performance of the quantile test for a single τ. Figure 6 shows the powers of CUSUM, Student-t, Mann-Whitney, and the single quantile tests using the 0.5, 0.6 and 0.7 quantiles. In the simulation, the sequences are generated with one change point at location m. The single quantile test can find this location exactly, while CUSUM, Student-t and Mann-Whitney can only detect the existence of a change point and cannot always give the exact location; in the power calculations of Figure 6 we do not consider whether they find the exact location. The high power of CUSUM is accompanied by an extremely high Type I error, so it makes little sense to compare the power of CUSUM with the other methods. The powers of Student-t and Mann-Whitney are close. For both large and small sample sizes, the power of the single quantile tests is almost always higher than these two methods, and the single quantile tests are more powerful when the change point lies in the first half of the sequence. As shown in Table 2, the powers of Student-t and Mann-Whitney are very weak when we require the exact location of the change point to be found: the power of CUSUM can reach 67 percent, while the powers of Student-t and Mann-Whitney are smaller than 20 percent, far below those of the single quantile test shown in Table 1.

Table 1

β̅τ in three scenarios for both n = 100 and n = 20

δ         −1       −1.5     −2
n = 100   0.9088   0.9763   0.9895
n = 20    0.7941   0.9575   0.9577

Table 2

Summary of statistics about the powers of three methods under the requirement of finding exact location of change point

n = 100
δ = −1          Max      Min      Median   Mean
CUSUM           0.2983   0        0.2245   0.2006
Student-t       0.1113   0        0.1014   0.0895
Mann-Whitney    0.1160   0        0.1013   0.0884
δ = −1.5        Max      Min      Median   Mean
CUSUM           0.4908   0        0.3737   0.3353
Student-t       0.1319   0        0.1246   0.1174
Mann-Whitney    0.1496   0        0.1218   0.1143
δ = −2          Max      Min      Median   Mean
CUSUM           0.6538   0.0087   0.5085   0.4559
Student-t       0.1205   0        0.1109   0.1090
Mann-Whitney    0.1616   0        0.1129   0.1071

n = 20
δ = −1          Max      Min      Median   Mean
CUSUM           0.3281   0        0.2783   0.2468
Student-t       0.0451   0        0.0374   0.0325
Mann-Whitney    0.0526   0        0.0414   0.0337
δ = −1.5        Max      Min      Median   Mean
CUSUM           0.5153   0        0.4413   0.3906
Student-t       0.0906   0        0.0776   0.0667
Mann-Whitney    0.1051   0        0.0793   0.0655
δ = −2          Max      Min      Median   Mean
CUSUM           0.6726   0.2430   0.5747   0.5373
Student-t       0.1079   0        0.0948   0.0869
Mann-Whitney    0.1365   0        0.0941   0.0845

Figure 5  βτ in cases of n = 100 and n = 20 under three scenarios δ = −1, −1.5, −2, respectively

Figure 6  Power comparison of CUSUM, Student-t, Mann-Whitney and single quantile tests in cases of n = 100 and n = 20 under three scenarios δ = −1, −1.5, −2, respectively

4.2 Composite Quantile Test and Multiple Change-Points Situation

Based on the quantile test for a single τ, the “composite quantile test” is constructed as in formula (3.1). In this part, a single change-point is set at location m of the generated sequence, with m shifting from 4 to n − 1. For each given m, the probability P(m) of each location being a change-point is calculated by collecting the information of the single quantile tests. The results are presented in Figure 7 for both n = 100 and n = 20. A series of peaks can be observed on the diagonal of the surface, where the simulated change-points are actually located; that is, by applying the “composite quantile test”, the highest probabilities pinpoint the actual change-point locations. Similar patterns are observed for the small sample case. The ridge-like surfaces of Figure 7 reflect the sensitivity of the “composite quantile test”. Even in the extreme condition that the generated change-point occurs in the tails of the sequence, the proposed method still responds sensitively and assigns the highest probability to the right location.

Figure 7  P(m) in cases of n = 100 and n = 20 under three scenarios δ = −1, −1.5, −2, respectively

Multiple change-points cases are further investigated to confirm the advantage of the proposed method. For the two change-points condition, sequences with two change-points are generated. The two change-points start on the two tails of the sequence, one at the lower tail and the other at the upper tail; the two points then approach each other and eventually join. The degree of “mean shift” is set as δ = −1 and the length of the sequence as n = 150. The three change-points condition is the same, except that a third change-point is held fixed in the middle of a sequence of length n = 200. The results for the two and three change-points situations are plotted in Figure 8.

Two ridges are observed in Figure 8(a) and (b), exactly positioning the locations of the change-points. Similar patterns are found for the three change-points situation, except that the change-point in the middle is more emphasized by the method when the two other points are on the tails. The multiple change-points results indicate that the “composite quantile test” works robustly under unfavorable data conditions.

Figure 8  P(m) for two and three change-points, δ = −1

5 Real Data

Two data sets are analyzed. The first is a much-analysed[39–43] data set of intervals between British coal-mining disasters that resulted in 10 or more fatalities during the 112-year period 1851–1962, gathered by Maguire et al.[44] and extended and corrected by Jarrett[45]; Carlin, et al.[46, 47] converted Jarrett’s table into yearly intervals. The second data set is a 60-year (1912–1971) annual series of temperatures from Libby, Montana, whose documented change point is 1938 (resulting from a change in latitude and elevation)[48]. The plots of these two time series are shown in Figure 9(a) and (b). As presented in Figure 9(c) and (d), the probability of each location being a change point is computed by the “composite quantile test”.

The aim is to find a location m* cutting the sequence so that F1(x) ≠ F2(x), where F1(x) and F2(x) are the common distribution functions of Xt, t = 1, 2, ⋯, m and Xt, t = m + 1, m + 2, ⋯, n, respectively. For the first data set, two peaks with very close locations are identified, at m = 36 and m = 39, and the probabilities of these two points are also very close. The standard deviation is σL = 1.6475, which is the empirical basis for choosing the value of δ in the Monte Carlo experiments. In detail, at location m = 36, the mean of the sequence X1, X2, ⋯, Xm is μ̄1 = 3.2500 and the mean of the sequence Xm+1, Xm+2, ⋯, Xn is μ̄2 = 0.9737; the mean difference is μ̄1 − μ̄2 = 1.382σL. At location m = 39, on the other hand, μ̄1 = 3.1538 and μ̄2 = 0.9315, so μ̄1 − μ̄2 = 1.349σL. It should be noticed that a mean difference at m does not necessarily lead to a high P(m); a change-point m* makes the sample points before and after m* come from different distributions. According to the result of the composite quantile test, two special points are pinpointed as the most likely change-points. In previous studies, a probable interval is usually given without an exact change-point location; for example, Carlin, et al.[49] gave the probable change-point interval 1889–1892, which is m = 39 to m = 42 in our context. The point m = 36 is not detected there, and the interval m = 39 to m = 42 is also too wide to be accurate. Student-t detects only one change point, at m = 36, and Mann-Whitney detects a change point at m = 41. In this regard, the “composite quantile test” responds more sensitively to the changing “tendency” of the sequence than earlier methods.

Figure 9  Real data: (a) and (b) describe the time series of data sets 1 and 2; (c) and (d) describe P(m), the probability of being a change-point

For the second data set, one peak is observed at m = 19, corresponding to the year 1930. The standard deviation is σS = 0.8769. The mean of the points before this peak is μ̄1 = 6.5626 and the mean of the points after it is μ̄2 = 7.3521; the mean difference is μ̄2 − μ̄1 = 0.9σS. The proposed method senses a tendency of the sample distribution to change. Student-t and Mann-Whitney detect the same change-point location. This data set was also studied by Reeves, et al.[48], who applied eight methods: the standard normal homogeneity (SNH) test and the nonparametric SNH procedure detect a change point in 1930; Sawa’s Bayes criterion, two-phase regression with a common trend and a modified Vincent method detect a change point in 1947; and Akaike’s information criterion, two-phase regression (LR) and a modified LR method detect a change point in 1945. Our method detects the same change-point location as the SNH test and the nonparametric SNH procedure. This location differs from the documented change point; the discrepancy may be caused by other undocumented change points in the series or by changes occurring before the location of the true change point.

6 Conclusion

The “mean shift” problem in change-point studies is discussed in this paper. The saddlepoint approximation method is applied to approximate the cumulative distribution function of the statistic Tτ under the null hypothesis of “no change”, and a quantile test for a single τ is proposed. In order to capture the overall tendency information of the sequence, we construct the “composite quantile test” to calculate the probability of each location being a change-point. As shown in the Monte Carlo experiments and real data analysis, the proposed detection scheme performs better than existing methods. The functional form of the sequence distribution is not restricted. The sample sizes (n = 100, 20) considered in our context are much smaller than in previous studies. In cases where change-points occur at the lower or upper tails of the sequence, the proposed test still works well. In addition, the robustness of the scheme is confirmed under multiple change-points conditions. The ideas behind this paper are extendable to other complex issues in change-point studies; the extension of the test to the two-sided case is left as a future research topic.

Acknowledgments

The work was partially supported by the major research projects of philosophy and social science of the Chinese Ministry of Education (15JZD015), National Natural Science Foundation of China (11271368), Project supported by the Major Program of Beijing Philosophy and Social Science Foundation of China (15ZDA17), Project of Ministry of Education supported by the Specialized Research Fund for the Doctoral Program of Higher Education of China (20130004110007), the Key Program of National Philosophy and Social Science Foundation Grant (13AZD064). The work was partially supported by the Major Project of Humanities Social Science Foundation of Ministry of Education (15JJD910001), the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China (15XNL008), the Project of Flying Apsaras Scholar of Lanzhou University of Finance & Economics, and the Project of Tianshan Mountain Scholar of Xinjiang University of Finance & Economics.

References

[1] Pettitt A N. A non-parametric approach to the change-point problem. Applied Statistics, 1979, 28(2): 126–135. DOI: 10.2307/2346729.

[2] Kim C J, Nelson C R. Has the US economy become more stable? A Bayesian approach based on a Markov-switching model of the business cycle. Review of Economics & Statistics, 1999, 81(4): 608–616. DOI: 10.1162/003465399558472.

[3] Solow A. Testing for climate change — An application of the two-phase regression model. Journal of Climate & Applied Meteorology, 1987, 26(10): 1401–1405. DOI: 10.1175/1520-0450(1987)026<1401:TFCCAA>2.0.CO;2.

[4] Maguire R O, Sims J T. Measuring agronomic and environmental soil phosphorus saturation and predicting phosphorus leaching with Mehlich 3. Soil Science Society of America Journal, 2002, 66(6): 2033–2039. DOI: 10.2136/sssaj2002.2033.

[5] Page E S. Continuous inspection schemes. Biometrika, 1954, 41(1/2): 100–115. DOI: 10.1093/biomet/41.1-2.100.

[6] Lai T L. Sequential changepoint detection in quality control and dynamical systems. Journal of the Royal Statistical Society. Series B (Methodological), 1995, 57(4): 613–658. DOI: 10.1111/j.2517-6161.1995.tb02052.x.

[7] Mahmoud M A, Parker P A, Woodall W H, et al. A change point method for linear profile data. Quality & Reliability Engineering International, 2007, 23(2): 247–268. DOI: 10.1002/qre.788.

[8] Hall C B, Lipton R B, Sliwinski M, et al. A change point model for estimating the onset of cognitive decline in preclinical Alzheimer's disease. Statistics in Medicine, 2000, 19(11–12): 1555–1566. DOI: 10.1002/(SICI)1097-0258(20000615/30)19:11/12<1555::AID-SIM445>3.0.CO;2-3.

[9] Leonte D, Nott D J, Dunsmuir W T M. Smoothing and change point detection for gamma ray count data. Mathematical Geology, 2003, 35(2): 175–194. DOI: 10.1023/A:1023287521958.

[10] Mei Y. Sequential change-point detection when unknown parameters are present in the pre-change distribution. The Annals of Statistics, 2006, 34(1): 92–122. DOI: 10.1214/009053605000000859.

[11] Pettitt A N. A simple cumulative sum type statistic for the change-point problem with zero-one observations. Biometrika, 1980, 67(1): 79–84. DOI: 10.1093/biomet/67.1.79.

[12] Hinkley D V. Inference about the change-point in a sequence of random variables. Biometrika, 1970, 57(1): 1–17. DOI: 10.1093/biomet/57.1.1.

[13] Hinkley D V. Inference about the change-point from cumulative sum tests. Biometrika, 1971, 58(3): 509–523. DOI: 10.1093/biomet/58.3.509.

[14] Talwar P P, Gentle J E. Detecting a scale shift in a random sequence at an unknown time point. Journal of the Royal Statistical Society. Series C (Applied Statistics), 1981, 30(3): 301–304. DOI: 10.2307/2346356.

[15] Carlstein E. Nonparametric change-point estimation. The Annals of Statistics, 1988, 16(1): 188–197. DOI: 10.1214/aos/1176350699.

[16] Inclan C, Tiao G C. Use of cumulative sums of squares for retrospective detection of changes of variance. Journal of the American Statistical Association, 1994, 89(427): 913–923.

[17] Hawkins D M, Zamba K D. A change-point model for a shift in variance. Journal of Quality Technology, 2005, 37(1): 21–31. DOI: 10.1080/00224065.2005.11980297.

[18] Kawahara Y, Sugiyama M. Sequential change-point detection based on direct density-ratio estimation. Statistical Analysis & Data Mining, 2012, 5(2): 114–127. DOI: 10.1002/sam.10124.

[19] Andrews D W K. Tests for parameter instability and structural change with unknown change point. Econometrica, 1993, 61(4): 821–856. DOI: 10.2307/2951764.

[20] Muller H G. Change-points in nonparametric regression analysis. The Annals of Statistics, 1992, 20(2): 737–761. DOI: 10.1214/aos/1176348654.

[21] Loader C R. Change point estimation using nonparametric regression. The Annals of Statistics, 1996, 24(4): 1667–1678. DOI: 10.1214/aos/1032298290.

[22] Davis R A, Huang D, Yao Y C. Testing for a change in the parameter values and order of an autoregressive model. The Annals of Statistics, 1995, 23(1): 282–304. DOI: 10.1214/aos/1176324468.

[23] Huskova M, Praskova Z, Steinebach J. On the detection of changes in autoregressive time series I. Asymptotics. Journal of Statistical Planning & Inference, 2007, 137(4): 1243–1259. DOI: 10.1016/j.jspi.2006.02.010.

[24] Gombay E, Serban D. Monitoring parameter change in time series models. Journal of Multivariate Analysis, 2009, 100(4): 715–725. DOI: 10.1016/j.jmva.2008.08.005.

[25] Bai J. Least squares estimation of a shift in linear processes. Journal of Time Series Analysis, 1994, 15(5): 453–472. DOI: 10.1111/j.1467-9892.1994.tb00204.x.

[26] Kokoszka P, Leipus R. Change-point in the mean of dependent observations. Statistics & Probability Letters, 1998, 40(4): 385–393. DOI: 10.1016/S0167-7152(98)00145-X.

[27] Ferger D. Exponential and polynomial tailbounds for change-point estimators. Journal of Statistical Planning & Inference, 2001, 92(1–2): 73–109. DOI: 10.1016/S0378-3758(00)00148-8.

[28] Shi X, Wu Y, Miao B. A note on the convergence rate of the kernel density estimator of the mode. Statistics & Probability Letters, 2009, 79(17): 1866–1871. DOI: 10.1016/j.spl.2009.05.015.

[29] Shi X, Wu Y, Miao B. Strong convergence rate of estimators of change point and its application. Computational Statistics & Data Analysis, 2009, 53(4): 990–998. DOI: 10.1016/j.csda.2008.11.015.

[30] Samuel T R, Pignatiello Jr J J, Calvin J A. Identifying the time of a step change with X̄ control charts. Quality Engineering, 1998, 10(3): 521–527. DOI: 10.1080/08982119808919166.

[31] Barndorff-Nielsen O. Hyperbolic distributions and distributions on hyperbolae. Scandinavian Journal of Statistics, 1978, 5(3): 151–157.

[32] Jensen J L. Saddlepoint approximations. Clarendon Press, Oxford, 1995.

[33] Varian H. Bootstrap tutorial. Mathematica Journal, 2005, 9(4): 768–775.

[34] Csorgo M, Horvath L. Limit theorems in change-point analysis. Wiley, New York, 1997.

[35] Hawkins D M, Qiu P, Chang W K. The changepoint model for statistical process control. Journal of Quality Technology, 2003, 35(4): 355–366. DOI: 10.1080/00224065.2003.11980233.

[36] Hawkins D M, Zamba K D. Statistical process control for shifts in mean or variance using a changepoint formulation. Technometrics, 2005, 47(2): 164–173. DOI: 10.1198/004017004000000644.

[37] Hawkins D M, Deng Q. A nonparametric change-point control chart. Journal of Quality Technology, 2010, 42(2): 165–173. DOI: 10.1080/00224065.2010.11917814.

[38] Ross G J, Tasoulis D K, Adams N M. Nonparametric monitoring of data streams for changes in location and scale. Technometrics, 2011, 53(4): 379–389. DOI: 10.1198/TECH.2011.10069.

[39] Worsley K J. Confidence regions and tests for a change-point in a sequence of exponential family random variables. Biometrika, 1986, 73(1): 91–104. DOI: 10.1093/biomet/73.1.91.

[40] Raftery A E, Akman V E. Bayesian analysis of a Poisson process with a changepoint. Biometrika, 1986, 73(1): 85–89. DOI: 10.1093/biomet/73.1.85.

[41] Siegmund D. Boundary crossing probabilities and statistical applications. The Annals of Statistics, 1986, 14(2): 361–404. DOI: 10.1214/aos/1176349928.

[42] Siegmund D. Confidence sets in change-point problems. International Statistical Review, 1988, 56(1): 31–48. DOI: 10.2307/1403360.

[43] Hand D J, Daly F, Lunn A D, et al. A handbook of small data sets, Volume 1. CRC Press, US, 1994. DOI: 10.1007/978-1-4899-7266-8.

[44] Maguire B A, Pearson E S, Wynn A H A. The time intervals between industrial accidents. Biometrika, 1952, 39(1/2): 168–180. DOI: 10.1093/biomet/39.1-2.168.

[45] Jarrett R G. A note on the intervals between coal-mining disasters. Biometrika, 1979, 66(1): 191–193. DOI: 10.1093/biomet/66.1.191.

[46] Carlin B P, Polson N G, Stoffer D S. A Monte Carlo approach to nonnormal and nonlinear state-space modelling. Journal of the American Statistical Association, 1992, 87(418): 493–500. DOI: 10.1080/01621459.1992.10475231.

[47] Carlin B P, Chib S. Bayesian model choice via Markov chain Monte Carlo methods. Journal of the Royal Statistical Society. Series B (Methodological), 1995, 57(3): 473–484. DOI: 10.1111/j.2517-6161.1995.tb02042.x.

[48] Reeves J, Chen J, Wang X L, et al. A review and comparison of changepoint detection techniques for climate data. Journal of Applied Meteorology & Climatology, 2007, 46(6): 900–915. DOI: 10.1175/JAM2493.1.

[49] Carlin B P, Gelfand A E, Smith A F M. Hierarchical Bayesian analysis of changepoint problems. Applied Statistics, 1992, 41(2): 389–405. DOI: 10.2307/2347570.

[50] Tian M Z, Chan N H. Saddle point approximation and volatility estimation of value-at-risk. Statistica Sinica, 2010, 20: 1239–1256.

[51] Tian M Z, Chan N H. Adaptive quantile regression with precise risk bounds. Statistica Sinica, to appear. DOI: 10.1007/s11425-015-0199-7.

[52] Butler R W. Saddlepoint approximations with applications. Cambridge University Press, UK, 2007. DOI: 10.1017/CBO9780511619083.

Appendix

The Proof of Formula (2.2)

Let $\varpi_1(T_\tau)=\frac{1}{n}\sum_i\vartheta_iU_i$ and $\varpi_2(T_\tau)=\frac{1}{n}\sum_iU_i$; then the Laplace transform $\varphi(\theta_1,\theta_2)$ of $(\varpi_1(T_\tau),\varpi_2(T_\tau))$ is

$$\varphi(\theta_1,\theta_2)=\left(\frac{1+\exp(\theta_2)}{2}\right)^{\tau(n+1)}\left(\frac{1+\exp(\theta_1+\theta_2)}{2}\right)^{n-\tau(n+1)}.$$

The cumulant transform $\kappa(\theta_1,\theta_2)$ is

$$\kappa(\theta_1,\theta_2)=\log\varphi(\theta_1,\theta_2)=\tau(n+1)\log\frac{1+\exp(\theta_2)}{2}+\big(n-\tau(n+1)\big)\log\frac{1+\exp(\theta_1+\theta_2)}{2}.$$

Through

$$\kappa_n'(\theta_1,\theta_2)=\frac{1}{n}\frac{\partial\kappa(\theta_1,\theta_2)}{\partial(\theta_1,\theta_2)}=\left(\frac{1-c}{1+\exp(-\theta_1-\theta_2)},\ \frac{1+c\exp(-\theta_1-\theta_2)+(1-c)\exp(-\theta_2)}{(1+\exp(-\theta_2))(1+\exp(-\theta_1-\theta_2))}\right),$$

we get the saddlepoints

$$\hat\theta_1=\log\frac{b_1(c-b_2+b_1)}{(b_2-b_1)(1-c-b_1)},\qquad \hat\theta_2=\log\frac{b_2-b_1}{c-b_2+b_1},$$

which solve $\kappa_n'(\hat\theta_1,\hat\theta_2)=(b_1,b_2)$, where $b_1$ and $b_2$ are the maximum likelihood estimates of the means of $\varpi_1(T_\tau)$ and $\varpi_2(T_\tau)$, respectively. Here $b_1=t/n$, $b_2=m/n$ and $c=\tau(1+1/n)$. Meanwhile,

$$\Sigma(\theta_1,\theta_2)=\frac{1}{n}\frac{\partial^2\kappa(\theta_1,\theta_2)}{\partial(\theta_1,\theta_2)\,\partial(\theta_1,\theta_2)^\top},\qquad \Sigma_{22}(\tilde\theta)=\frac{c\exp(-\theta_2)}{\{1+\exp(-\theta_2)\}^2}+\frac{(1-c)\exp(-\theta_1-\theta_2)}{\{1+\exp(-\theta_1-\theta_2)\}^2}.$$

In fact, $\hat\theta_1$ is equivalent to $\hat\theta$ of (2.7), and $\hat\theta_2$ is equivalent to $\tilde\theta$ in (2.7). Substituting these results into formula (2.7), formula (2.2) follows.
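As a quick sanity check, the closed-form saddlepoints can be plugged back into the gradient of the cumulant transform. The sketch below is illustrative only: the values of $b_1$, $b_2$, $c$ are hypothetical (chosen to satisfy the range constraints $0<b_1<1-c$, $0<b_2-b_1<c$), and it assumes the gradient convention written above.

```python
import math

def kappa_prime(c, t1, t2):
    """Gradient (1/n) dκ/d(θ1, θ2) of the cumulant transform:
    first component (1−c)/(1+e^{−θ1−θ2}); second adds c/(1+e^{−θ2})."""
    g = (1 - c) / (1 + math.exp(-t1 - t2))
    return g, c / (1 + math.exp(-t2)) + g

def saddlepoints(c, b1, b2):
    """Closed-form solution of κ'(θ̂1, θ̂2) = (b1, b2),
    valid for 0 < b1 < 1−c and 0 < b2−b1 < c."""
    t1 = math.log(b1 * (c - b2 + b1) / ((b2 - b1) * (1 - c - b1)))
    t2 = math.log((b2 - b1) / (c - b2 + b1))
    return t1, t2

# Hypothetical values, not taken from the paper's data
c, b1, b2 = 0.3, 0.2, 0.4
t1, t2 = saddlepoints(c, b1, b2)
g1, g2 = kappa_prime(c, t1, t2)
# g1 recovers b1 and g2 recovers b2 up to floating-point error
```

Solving the two components in sequence (first for $\theta_1+\theta_2$, then for $\theta_2$) is what yields the logarithmic closed forms.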

The Proof of Proposition 1

In the more general situation (3.2), the saddlepoint approximation to the density (3.1) is derived as follows.

The sample $X_1,X_2,\cdots,X_n$ is i.i.d. with a positive definite variance. Let $\varphi(\theta)=E\exp(\theta^\top X_1)$ be the Laplace transform, defined for $\theta\in\Theta$.

Let $f_n$ be the density of $\bar X=(X_1+X_2+\cdots+X_n)/n$. The first $d_1$ coordinates $X_i^{(1)}$ of $X_i$ are continuous variables and the remaining $d_2=d-d_1$ coordinates $X_i^{(2)}$ belong to the lattice $\mathbb Z^{d_2}$, $\mathbb Z^{d_2}$ being the minimal span for $X^{(2)}$. For any $\theta\in\Theta$ we have

$$f_n(x)=\varphi(\theta)^n\exp(-n\theta^\top x)\,n^{(d_1-d_2)/2}f_{n,\theta}(0),\tag{A.1}$$

where $f_{n,\theta}$ is $n^{d_2/2}$ times the density of $\{\sqrt n(\bar X^{(1)}-x^{(1)}),\,\bar X^{(2)}-x^{(2)}\}$ under the measure $P_\theta$. Let $\gamma(t)=E\exp(it^\top X)$ be the characteristic function of $X$. Define $\mu_k=EX^k$ and $\bar\mu_k=E|X|^k$; then $|\mu_j|\le\bar\mu_j\le(\bar\mu_{k+1})^{j/(k+1)}$ for $j=1,2,\cdots,k$. According to the inequalities

$$\Big|\exp(ix)-\sum_{j=0}^k\frac{(ix)^j}{j!}\Big|\le\frac{|x|^{k+1}}{(k+1)!},\qquad \Big|\log(1-z)+\sum_{j=1}^k\frac{z^j}{j}\Big|\le\frac{|z|^{k+1}}{(k+1)(1-|z|)},\quad\text{for }z=1-\gamma(t),\ |z|<1,$$

we have

$$\Big|\gamma(t)-\sum_{j=0}^k\frac{\mu_j}{j!}(it)^j\Big|\le\frac{\bar\mu_{k+1}}{(k+1)!}|t|^{k+1},\tag{A.2}$$

so we can find constants $c_1(\bar\mu_{k+1})$, $c_2(\bar\mu_{k+1})$ depending on $\bar\mu_{k+1}$ such that

$$\Big|\log\gamma(t)-\sum_{j=1}^k\frac{\kappa_j}{j!}(it)^j\Big|\le c_2(\bar\mu_{k+1})|t|^{k+1}\quad\text{for }|t|\le c_1(\bar\mu_{k+1}),\tag{A.3}$$

where the cumulants $\kappa_j$ of $X$ are defined by

$$\kappa_j=\frac{1}{i^j}\frac{d^j}{dt^j}\ln\gamma(t)\Big|_{t=0}.\tag{A.4}$$

For the characteristic function of $Y=\sqrt n(\bar X-\mu_1)$, by (A.3),

$$E\exp(itY)=\gamma\Big(\frac{t}{\sqrt n}\Big)^n\exp(-i\sqrt n\,t\mu_1)=\exp\Big(-\frac{\kappa_2}{2}t^2-\frac{i\kappa_3}{6\sqrt n}t^3+\cdots+\frac{\kappa_k}{k!\,n^{(k-2)/2}}(it)^k+\omega\,c_2(\bar\mu_{k+1})n^{-(k-1)/2}|t|^{k+1}\Big)\tag{A.5}$$

for $|t|\le\sqrt n\,c_1(\bar\mu_{k+1})$. Since $E\exp(itY)\in L^1$ for $n>\xi$, the variable $Y$ has a bounded density $g_n$ given by

$$\begin{aligned}g_n(y)&=\frac{1}{2\pi}\int\exp(-ity)E\{\exp(itY)\}\,dt\\&=\frac{1}{2\pi}\int\exp\Big(-\frac{\kappa_2}{2}t^2\Big)\Big(1+\sum_{j=3}^k n^{-(j-2)/2}P_j(it)\Big)\exp(-ity)\,dt\\&\quad-\frac{1}{2\pi}\int_{|t|>\sqrt n\,c_3}\exp\Big(-\frac{\kappa_2}{2}t^2\Big)\Big(1+\sum_{j=3}^k n^{-(j-2)/2}P_j(it)\Big)\exp(-ity)\,dt\\&\quad+\omega_1 c_4(\kappa_2,\bar\mu_{k+1})n^{-(k-1)/2}\frac{1}{2\pi}\int_{|t|<\sqrt n\,c_3}\exp\Big(-\frac{\kappa_2}{4}t^2\Big)\big(|t|^{k+1}+|t|^{3(k-1)}\big)\,dt\\&\quad+\frac{1}{2\pi}\int_{|t|>\sqrt n\,c_3}\gamma\Big(\frac{t}{\sqrt n}\Big)^n\exp(-i\sqrt n\,t\mu_1)\exp(-ity)\,dt.\end{aligned}\tag{A.6}$$

Define

$$\delta(c)=\sup_{|t|>c}|\gamma(t)|.\tag{A.7}$$

If $X$ has a density w.r.t. Lebesgue measure then $\gamma(t)\to0$ as $|t|\to\infty$, so $\gamma(t)^n$ tends to zero; the last term can then be estimated by

$$\sqrt n\,\delta(c_3)^{n-\xi}\frac{1}{2\pi}\int|\gamma(t)|^\xi\,dt=\frac{1}{2\pi}\sqrt n\,\delta(c_3)^{n-\xi}\|\gamma\|_\xi^\xi.\tag{A.8}$$

Then the expansion is

$$\Big|g_n(y)-n(y;\kappa_2)-\sum_{j=3}^k n^{-(j-2)/2}\frac{1}{2\pi}\int\exp\Big(-\frac{\kappa_2}{2}t^2\Big)P_j(it)\exp(-ity)\,dt\Big|\le c_5(\kappa_2,\bar\mu_{k+1})n^{-(k-1)/2}+\frac{1}{2\pi}\sqrt n\,\delta\big(c_3(\kappa_2,\bar\mu_{k+1})\big)^{n-\xi}\|\gamma\|_\xi^\xi,\tag{A.9}$$

where $n(y;\kappa_2)=(2\pi\kappa_2)^{-1/2}\exp\big(-\frac{y^2}{2\kappa_2}\big)$ is the density of a normally distributed variable with variance $\kappa_2$.

Using the simple form of $n$,

$$n(y)=(2\pi)^{-1}\int\exp\Big(-\frac{t^2}{2}\Big)\exp(-ity)\,dt,$$

with $n(y)=n(y;1)$, we have

$$\frac{d^m}{dy^m}n(y)=\frac{1}{2\pi}\int(-it)^m\exp\Big(-\frac{t^2}{2}\Big)\exp(-ity)\,dt.$$

Define the Hermite polynomial $H_m(y)$ by $H_m(y)=(-1)^m n(y)^{-1}\frac{d^m}{dy^m}n(y)$, so

$$\frac{1}{2\pi}\int(it)^m\exp\Big(-\frac{\kappa_2}{2}t^2\Big)\exp(-ity)\,dt=H_m\Big(\frac{y}{\sqrt{\kappa_2}}\Big)n\Big(\frac{y}{\sqrt{\kappa_2}}\Big)\kappa_2^{-(m+1)/2}=\kappa_2^{-m/2}H_m\Big(\frac{y}{\sqrt{\kappa_2}}\Big)n(y;\kappa_2).\tag{A.10}$$

Combining (A.9) and (A.10), the expansion with $k=4$ becomes

$$g_n(y)\approx n(y;\kappa_2)\Big(1+\frac{\zeta_3}{6\sqrt n}H_3\Big(\frac{y}{\sqrt{\kappa_2}}\Big)+\frac1n\Big[\frac{\zeta_4}{24}H_4\Big(\frac{y}{\sqrt{\kappa_2}}\Big)+\frac{\zeta_3^2}{72}H_6\Big(\frac{y}{\sqrt{\kappa_2}}\Big)\Big]\Big),\tag{A.11}$$

where

$$\zeta_m=\kappa_m/\kappa_2^{m/2},\qquad m=3,4,\cdots,$$

are the standardized cumulants. (A.11) is the Edgeworth expansion of the density $g_n(y)$.
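The Hermite polynomials used in (A.10)–(A.11) satisfy the standard recurrence $H_{m+1}(y)=yH_m(y)-mH_{m-1}(y)$. The sketch below (standard facts, not specific to this paper) builds them by that recurrence and evaluates the one-dimensional Edgeworth correction of (A.11):

```python
import math

def hermite(m, y):
    """Probabilists' Hermite polynomial H_m(y), i.e.
    H_m(y) = (−1)^m n(y)^{−1} d^m n(y)/dy^m, computed via the
    recurrence H_{m+1}(y) = y H_m(y) − m H_{m−1}(y)."""
    h_prev, h = 1.0, y  # H_0 and H_1
    if m == 0:
        return h_prev
    for j in range(1, m):
        h_prev, h = h, y * h - j * h_prev
    return h

def edgeworth_density(y, n, kappa2, zeta3, zeta4):
    """Edgeworth expansion (A.11) for the density of Y = sqrt(n)(X̄ − μ1),
    with variance κ2 and standardized cumulants ζ3, ζ4."""
    z = y / math.sqrt(kappa2)
    base = math.exp(-y * y / (2 * kappa2)) / math.sqrt(2 * math.pi * kappa2)
    corr = (1 + zeta3 / (6 * math.sqrt(n)) * hermite(3, z)
            + (zeta4 / 24 * hermite(4, z) + zeta3 ** 2 / 72 * hermite(6, z)) / n)
    return base * corr
```

Since $\int n(y;\kappa_2)H_m(y/\sqrt{\kappa_2})\,dy=0$ for every $m\ge1$, the correction terms integrate to zero and the expanded density still integrates to one.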

In the case we consider, (A.11) becomes

$$g_n(y)\approx n(y;\kappa_2)\Big(1+\frac{1}{6\sqrt n}H_3^d(y;\{\kappa\})+\frac1n\Big[\frac1{24}H_4^d(y;\{\kappa\})+\frac1{72}H_{3,3}^d(y;\{\kappa\})\Big]\Big),\tag{A.12}$$

where

$$H_m^d(y;\{\kappa\})=\sum_{i_1,\cdots,i_m=1}^d\kappa_{m,(i_1,\cdots,i_m)}H_{m,(i_1,\cdots,i_m)}^d(y;\kappa_2)$$

and

$$H_{m,l}^d(y;\{\kappa\})=\sum_{i_1,\cdots,i_m=1}^d\sum_{j_1,\cdots,j_l=1}^d\kappa_{m,(i_1,\cdots,i_m)}\kappa_{l,(j_1,\cdots,j_l)}H_{m+l,(i_1,\cdots,i_m,j_1,\cdots,j_l)}^d(y;\kappa_2).$$

The error term of the approximation is, with $k=4$,

$$c_2(\kappa_2,\bar\mu_{k+1})n^{-(k-1)/2}+n^{d/2}(2\pi)^{-d}\delta\big(c_1(\kappa_2,\bar\mu_{k+1})\big)^{n-\xi}\|\gamma\|_\xi^\xi,\tag{A.13}$$

where

$$\delta(c)=\sup_{|t|>c,\ t\in\mathbb R^{d_1}\times[-\pi,\pi]^{d_2}}|\gamma(t)|$$

and $\|\gamma\|_\xi$ is the $L^\xi(\mathbb R^{d_1}\times[-\pi,\pi]^{d_2})$-norm.

With

$$\varphi_\theta(t)=\frac{\varphi(\theta+it)}{\varphi(\theta)}\exp(-it^\top\mu(\theta)),$$

(A.1) can be written as

$$f_{n,\theta}(0)=(2\pi)^{-d}\int_{\mathbb R^{d_1}\times[-\sqrt n\pi,\sqrt n\pi]^{d_2}}\varphi_\theta\Big(\frac{t}{\sqrt n}\Big)^n\exp\{-i\sqrt n\,t^\top(\mu(\theta)-x)\}\,dt.\tag{A.14}$$

Expanding the inversion integral

$$\frac{1}{(2\pi)^d}\int_{J(\theta)\sqrt n}\bar\varphi_\theta\Big(\frac{it}{\sqrt n}\Big)^n dt,$$

the main term of the expansion is $n(0;I)=(2\pi)^{-d/2}$ and the error term is, according to (A.13) with $k=5$, of the form

$$a_2\big(E_\theta|X_\theta|^6\big)n^{-2}+(2\pi)^{-d}n^{d/2}\delta\big\{a_1\big(E_\theta|X_\theta|^6\big)\big\}^{n-\xi}\|\bar\varphi_\theta\|_\xi^\xi,$$

where $a_1$ and $a_2$ are constants depending on the specified argument. We obtain, uniformly for $\theta\in\Theta_0$, an error term of the form $O(n^{-2})$. Since

$$H_{m,(i_1,\cdots,i_m)}^d(0;\kappa_2)=0\quad\text{for }m\text{ odd and for any }(i_1,\cdots,i_m),$$

the term of order $n^{-3/2}$ that would appear in the expansion is zero. Therefore, combining (A.1), (A.12) and (A.13), we obtain

$$f_n(x)=\varphi(\theta)^n\exp(-n\theta^\top x)\,n^{(d_1-d_2)/2}\,n(0;\Sigma(\theta))\Big(1+\frac1n\Big[\frac1{24}H_4^d\big(0;\{\kappa(\theta)\}\big)+\frac1{72}H_{3,3}^d\big(0;\{\kappa(\theta)\}\big)\Big]+O(n^{-2})\Big),\tag{A.15}$$

where $\{\kappa(\theta)\}$ are the cumulants $\kappa_{m,(i_1,\cdots,i_m)}(\theta)$ of the form

$$\kappa_{m,(i_1,\cdots,i_m)}(\theta)=\frac{\partial^m\ln\varphi(\theta)}{\partial\theta_{i_1}\cdots\partial\theta_{i_m}}.$$

Now we give the expansion

$$\Big|E\exp(it^\top Y)-\exp\Big(-\frac12 t^\top\kappa_2 t\Big)\Big(1+\sum_{j=3}^k n^{-(j-2)/2}P_j(\{\kappa(it)\})\Big)\Big|\le c_2(\kappa_2,\bar\mu_{k+1})n^{-(k-1)/2}\exp\Big(-\frac{|t|^2}{4}\lambda(\kappa_2)\Big)\big(|t|^{k+1}+|t|^{3(k-1)}\big)\tag{A.16}$$

for $|t|<\sqrt n\,c_1(\kappa_2,\bar\mu_{k+1})$, where $\lambda(\kappa_2)$ is the smallest eigenvalue of $\kappa_2$. With $\kappa_2=I$, $\kappa$ is replaced by $\zeta_{m,(i_1,\cdots,i_m)}$:

$$\zeta_{m,(i_1,\cdots,i_m)}(\theta)=\frac{\partial^m}{\partial\xi_{i_1}\cdots\partial\xi_{i_m}}\log\varphi\big(\theta+\Theta(\theta)\xi\big)\Big|_{\xi=0}.$$

According to

$$\begin{aligned}&\int_{\mathbb R^d}\exp\Big(-\frac12 t^\top\Theta t\Big)(it_{l_1})\cdots(it_{l_m})\frac{(it_1)^{a_1}}{1+it_1/\beta}\,dt\\&\quad=\int_{\mathbb R^d}\exp\Big(-\frac12\big(t^{(2)}-t_1\gamma\big)^\top\Sigma_{22}\big(t^{(2)}-t_1\gamma\big)-\frac{\sigma_1^2}{2}t_1^2\Big)(it_{l_1})\cdots(it_{l_m})\frac{(it_1)^{a_1}}{1+it_1/\beta}\,dt\\&\quad=\int_{\mathbb R^d}\exp\Big(-\frac12 u^\top\Sigma_{22}u\Big)\prod_{j=1}^m\big(iu_{l_j}+it_1\gamma_{l_j}\big)\exp\Big(-\frac{\sigma_1^2}{2}t_1^2\Big)\frac{(it_1)^{a_1}}{1+it_1/\beta}\,du\,dt_1\\&\quad=\sum_{s\in I(m)}\int_{\mathbb R^d}\exp\Big(-\frac12 u^\top\Sigma_{22}u\Big)\big(iu_{l_{s_1}}\big)\cdots\big(iu_{l_{s_{|s|}}}\big)\delta_s\exp\Big(-\frac{\sigma_1^2}{2}t_1^2\Big)\frac{(it_1)^{a_1+b_s}}{1+it_1/\beta}\,du\,dt_1,\end{aligned}\tag{A.17}$$

where $\delta_s=\prod_{j\in\{1,2,\cdots,m\}\setminus s}q_j$ with $q_j=\gamma_{l_j}$, for $\theta_1\ge0$ the expansion is

$$\begin{aligned}\int_{x_1}^\infty f_n(y,x_2,\cdots,x_d)\,dy=\ &\frac{\varphi(\theta)^n\exp(-n\theta^\top x)}{n|\theta_1|\sigma_1(\theta)}\,n^{(d_1-d_2)/2}\,n\big(0;\Sigma_{22}(\theta)\big)\\&\times\Big\{B_0(\lambda)+\frac{\operatorname{sgn}(\theta_1)}{\sqrt n}\big[\tau_1(\theta)B_1(\lambda)+\tau_3(\theta)B_3(\lambda)\big]\\&\qquad+\frac1n\big[\tau_0(\theta)B_0(\lambda)+\tau_2(\theta)B_2(\lambda)+\tau_4(\theta)B_4(\lambda)+\tau_6(\theta)B_6(\lambda)\big]+O\big(B_0(\lambda)n^{-3/2}\big)\Big\},\end{aligned}\tag{A.18}$$

where $\lambda=\sqrt n|\theta_1|\sigma_1(\theta)$. The coefficients $\tau_0(\theta),\cdots,\tau_6(\theta)$ are

$$\begin{aligned}\tau_1(\theta)&=\frac12\sum_{j=2}^d\zeta_{(1jj)},\qquad\tau_3(\theta)=\frac16\zeta_{(111)},\\ \tau_0(\theta)&=\frac18\sum_{j,k=2}^d\zeta_{(jjkk)}-\sum_{j,k,l=2}^d\Big[\frac1{12}\zeta_{(jkl)}^2+\frac18\zeta_{(jjk)}\zeta_{(llk)}\Big],\\ \tau_2(\theta)&=\frac14\sum_{j=2}^d\zeta_{(11jj)}+\sum_{j,k=2}^d\Big[\frac14\zeta_{(11j)}\zeta_{(jkk)}+\frac18\zeta_{(1jj)}\zeta_{(1kk)}+\frac14\zeta_{(1jk)}^2\Big],\\ \tau_4(\theta)&=\frac1{24}\zeta_{(1111)}-\sum_{j=2}^d\Big[\frac1{12}\zeta_{(111)}\zeta_{(1jj)}+\frac18\zeta_{(11j)}^2\Big],\qquad\tau_6(\theta)=\frac1{72}\zeta_{(111)}^2.\end{aligned}$$

For the lattice case, (A.18) has the similar expression

$$\begin{aligned}\sum_{y\ge x_1,\,y\in\mathbb Z/n}f_n(y,x_2,\cdots,x_d)=\ &\frac{\varphi(\theta)^n\exp(-n\theta^\top x)}{\sigma_1(\theta)\{1-\exp(-|\theta_1|)\}}\,n^{(d_1-d_2)/2}\,n\big(0;\Sigma_{22}(\theta)\big)\\&\times\Big\{B_0(\lambda)+\frac{1}{\sqrt n}\Big[\Big(\frac{1}{|\theta_1|}-\gamma_{\theta_1}\Big)\frac{B_1(\lambda)}{\sigma_1(\theta)}+\operatorname{sgn}(\theta_1)\big[\tau_1(\theta)B_1(\lambda)+\tau_3(\theta)B_3(\lambda)\big]\Big]\\&\qquad+\frac1n\Big[\Big(\gamma_{\theta_1}-\frac{1}{2|\theta_1|}+\gamma_{\theta_1}^2\Big)\frac{B_2(\lambda)}{\sigma_1(\theta)^2}+\tau_0(\theta)B_0(\lambda)+\tau_2(\theta)B_2(\lambda)+\tau_4(\theta)B_4(\lambda)+\tau_6(\theta)B_6(\lambda)\\&\qquad\qquad+\frac{\operatorname{sgn}(\theta_1)}{\sigma_1(\theta)}\Big(\frac{1}{|\theta_1|}-\gamma_{\theta_1}\Big)\big[\tau_1(\theta)B_2(\lambda)+\tau_3(\theta)B_4(\lambda)\big]\Big]+O\big(B_0(\lambda)n^{-3/2}\big)\Big\},\end{aligned}\tag{A.19}$$

where

$$\lambda=\sqrt n|\theta_1|\sigma_1(\theta),\qquad\gamma_{\theta_1}=\exp(-|\theta_1|)/\{1-\exp(-|\theta_1|)\},\qquad\sigma_1(\theta)^2=\Sigma_{11}(\theta)-\Sigma_{12}(\theta)\Sigma_{22}(\theta)^{-1}\Sigma_{21}(\theta).$$

Applying (A.15) to the marginal density of $(\bar X_2,\cdots,\bar X_d)$ yields the following approximation:

$$P\big(\bar X_1\ge x_1\mid(\bar X_2,\cdots,\bar X_d)=(x_2,\cdots,x_d)\big)=\frac{\varphi(\hat\theta)^n\exp(-n\hat\theta^\top x)\,n\big(0;\Sigma_{22}(\hat\theta)\big)}{n^{1/2}|\hat\theta_1|\sigma_1(\hat\theta)\,\varphi(\tilde\theta)^n\exp(-n\tilde\theta^\top x)\,n\big(0;\Sigma_{22}(\tilde\theta)\big)}\Big\{B_0(\hat\lambda)+O\big(B_0(\hat\lambda)(1+\hat\lambda)n^{-1/2}\big)\Big\},\tag{A.20}$$

where $\hat\lambda=\sqrt n|\hat\theta_1|\sigma_1(\hat\theta)$ and

$$\mu(\hat\theta)=x,\qquad\tilde\theta=(0,\tilde\theta^{(2)}),\qquad\big(\mu(\tilde\theta)_2,\cdots,\mu(\tilde\theta)_d\big)=(x_2,\cdots,x_d).$$

Let the exponential family generated by $X$ and $P$ consist of the probability measures $P_\theta$ given by

$$\frac{dP_\theta}{dP}(\omega)=\varphi(\theta)^{-1}\exp\big(\theta^\top X(\omega)\big),$$

and let $r^2$ be the log-likelihood-ratio statistic for testing $\theta_1=0$ in the exponential family; we obtain

$$r^2=2n\big\{(\hat\theta-\tilde\theta)^\top x-\kappa(\hat\theta)+\kappa(\tilde\theta)\big\}.$$

Therefore

$$P\big(\bar X_1\ge x_1\mid(\bar X_2,\cdots,\bar X_d)=(x_2,\cdots,x_d)\big)=\exp\Big(-\frac{r^2}{2}\Big)\Big(\frac{|\Sigma_{22}(\tilde\theta)|}{n|\Sigma(\hat\theta)|}\Big)^{1/2}\frac{1}{|\hat\theta_1|}\Big\{B_0(\hat\lambda)+O\big(B_0(\hat\lambda)(1+\hat\lambda)n^{-1/2}\big)\Big\}.\tag{A.21}$$

The Laplace transform $\varphi_n(\theta_1,\theta_2)$ of $\big(\sum_{i=1}^n a_iZ_i,\sum_{i=1}^n Z_i\big)$ is

$$\varphi_n(\theta_1,\theta_2)=\Big(\frac{1+\exp(\theta_2)}{2}\Big)^k\Big(\frac{1+\exp(\theta_1+\theta_2)}{2}\Big)^{n-k}.\tag{A.22}$$

With $\kappa_n(\theta_1,\theta_2)=\log\varphi_n(\theta_1,\theta_2)$ and $\alpha=k/n$ we find

$$\mu_n(\theta_1,\theta_2)=\frac1n\frac{\partial\kappa_n(\theta_1,\theta_2)}{\partial(\theta_1,\theta_2)}=\Big(\frac{1-\alpha}{1+\exp(-\theta_1-\theta_2)},\ \frac{\alpha}{1+\exp(-\theta_2)}+\frac{1-\alpha}{1+\exp(-\theta_1-\theta_2)}\Big)\tag{A.23}$$

and

$$\Sigma_n(\theta_1,\theta_2)=\frac1n\frac{\partial^2\kappa_n(\theta)}{\partial\theta\,\partial\theta^\top}=\begin{pmatrix}\dfrac{(1-\alpha)\exp(-\theta_1-\theta_2)}{[1+\exp(-\theta_1-\theta_2)]^2}&\dfrac{(1-\alpha)\exp(-\theta_1-\theta_2)}{[1+\exp(-\theta_1-\theta_2)]^2}\\[3mm]\dfrac{(1-\alpha)\exp(-\theta_1-\theta_2)}{[1+\exp(-\theta_1-\theta_2)]^2}&\dfrac{\alpha\exp(-\theta_2)}{[1+\exp(-\theta_2)]^2}+\dfrac{(1-\alpha)\exp(-\theta_1-\theta_2)}{[1+\exp(-\theta_1-\theta_2)]^2}\end{pmatrix}.\tag{A.24}$$

The solution $\hat\theta$ to $\mu_n(\hat\theta)=(x_1,x_2)$, in the range $0<x_1<1-\alpha$, $0<x_2-x_1<\alpha$, is given by

$$\hat\theta_2=-\log\Big(\frac{\alpha}{x_2-x_1}-1\Big)$$

and

$$\hat\theta_1+\hat\theta_2=-\log\Big(\frac{1-\alpha}{x_1}-1\Big),$$

and the solution $\tilde\theta_2$ to $\mu_n(0,\tilde\theta_2)_2=x_2$ is

$$\tilde\theta_2=-\log\Big(\frac{1}{x_2}-1\Big).\tag{A.25}$$

For $x_1>(1-\alpha)x_2$ and with $x_1=s/n$, $x_2=m/n$, (3.1) becomes

$$\begin{aligned}P\Big(\sum_{i=1}^n a_iZ_i\ge s\,\Big|\,\sum_{i=1}^n Z_i=m\Big)&=\frac{\Big(\dfrac{\alpha}{\alpha+x_1-x_2}\Big)^k\Big(\dfrac{1-\alpha}{1-\alpha-x_1}\Big)^{n-k}\Big(\dfrac{1-\alpha-x_1}{x_1}\Big)^{nx_1}\Big(\dfrac{\alpha+x_1-x_2}{x_2-x_1}\Big)^{n(x_2-x_1)}}{\Big(\dfrac{1}{1-x_2}\Big)^{n}\Big(\dfrac{1-x_2}{x_2}\Big)^{nx_2}}\sqrt{\frac{x_2(1-x_2)}{n|\Sigma_n(\hat\theta)|}}\,\frac{B_0(\hat\lambda)}{1-\exp(-\hat\theta_1)}\\&=\frac{(n-k)^{n-k+\frac12}\,k^{k+\frac12}\,(n-m)^{n-m+\frac12}\,m^{m+\frac12}\,s(k+s-m)}{\big[ns-(n-k)m\big]\,n^{n+\frac12}\,(n-k-s)^{n-k-s+\frac12}\,(k+s-m)^{k+s-m+\frac12}\,s^{s+\frac12}\,(m-s)^{m-s+\frac12}}\,B_0(\hat\lambda),\end{aligned}\tag{A.26}$$

where

$$\hat\lambda=\sqrt n\,\hat\theta_1\sigma_1(\hat\theta)=\log\frac{s(k+s-m)}{(m-s)(n-k-s)}\sqrt{\frac{s(n-k-s)(m-s)(k+s-m)}{s(n-k-s)k+(m-s)(k+s-m)(n-k)}},\tag{A.27}$$

and $B_0(\hat\lambda)$ is the Esscher function,

$$B_0(\hat\lambda)=\hat\lambda\exp\Big(\frac{\hat\lambda^2}{2}\Big)\big[1-\mathfrak N(\hat\lambda)\big],\tag{A.28}$$

where $\mathfrak N(x)$ is the standard normal distribution function,

$$\mathfrak N(x)=\int_{-\infty}^x n(y)\,dy=\int_{-\infty}^x\frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{y^2}{2}\Big)dy.$$
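Formulas (A.26)–(A.28) admit a direct numerical check: with i.i.d. Bernoulli(1/2) variables $Z_i$ and $a_i=1$ on exactly $n-k$ indices, as in (A.22), the count $\sum a_iZ_i$ conditional on $\sum Z_i=m$ is exactly hypergeometric. The sketch below (with hypothetical parameter values $n=100$, $k=60$, $m=30$, $s=16$, not from the paper's data) implements $B_0$ and the saddlepoint tail in log space and compares it with the exact tail:

```python
import math

def B0(lam):
    """Esscher function B0(lam) = lam * exp(lam^2/2) * (1 - N(lam)) of (A.28),
    using 1 - N(lam) = erfc(lam/sqrt(2))/2; fine for moderate lam."""
    return lam * math.exp(lam * lam / 2.0) * 0.5 * math.erfc(lam / math.sqrt(2.0))

def sp_cond_tail(n, k, m, s):
    """Saddlepoint approximation (A.26)-(A.27) to P(sum a_i Z_i >= s | sum Z_i = m),
    computed in log space to avoid overflow of the x^(x+1/2) factors."""
    assert 0 < s < n - k and 0 < m - s < k and n * s > (n - k) * m
    lam = math.log(s * (k + s - m) / ((m - s) * (n - k - s))) * math.sqrt(
        s * (n - k - s) * (m - s) * (k + s - m)
        / (s * (n - k - s) * k + (m - s) * (k + s - m) * (n - k)))
    h = lambda a: (a + 0.5) * math.log(a)  # (a + 1/2) * log(a)
    logp = (h(n - k) + h(k) + h(n - m) + h(m)
            + math.log(s * (k + s - m)) - math.log(n * s - (n - k) * m)
            - h(n) - h(n - k - s) - h(k + s - m) - h(s) - h(m - s))
    return math.exp(logp) * B0(lam)

def exact_cond_tail(n, k, m, s):
    """Exact conditional tail: given sum Z_i = m, the count among the
    n - k indices with a_i = 1 is hypergeometric."""
    return sum(math.comb(n - k, j) * math.comb(k, m - j)
               for j in range(s, min(m, n - k) + 1)) / math.comb(n, m)

approx = sp_cond_tail(100, 60, 30, 16)
exact = exact_cond_tail(100, 60, 30, 16)
```

For these illustrative values the approximation agrees with the exact hypergeometric tail to within roughly ten percent; note also that $B_0(\lambda)\to(2\pi)^{-1/2}$ as $\lambda\to\infty$, which is what makes the leading term reduce to the usual normal-tail behaviour.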
Received: 2015-12-4
Accepted: 2016-1-7
Published Online: 2017-6-8

© 2017 Walter de Gruyter GmbH, Berlin/Boston
