
Simultaneous prediction in the generalized linear model

Chao Bai and Haiqi Li
Published/Copyright: August 24, 2018

Abstract

This paper studies prediction based on a composite target function that allows one to predict simultaneously the actual and the mean values of the unobserved regressand in the generalized linear model. The best linear unbiased prediction (BLUP) of the target function is derived. Studies show that our BLUP has better properties than some other predictions. Simulations confirm its better finite sample performance.

MSC 2010: 62M20; 62J12

1 Introduction

Generalized linear models have a long history in the statistical literature and have been used to analyze data from various branches of science on account of both mathematical and practical convenience. Consider the following generalized linear model:

$$\begin{pmatrix} y \\ y_0 \end{pmatrix} = \begin{pmatrix} X \\ X_0 \end{pmatrix}\beta + \begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix}, \tag{1}$$

where

y is the n-dimensional vector of observed data;

y0 is the m-dimensional vector of unobserved values that is to be predicted;

X and X0 are n × p and m × p known matrices of explanatory variables. Let rk(A) denote the rank of matrix A and suppose rk(X) ≀ p;

ÎČ is the p × 1 unknown vector of regression coefficients, and

Δ and Δ0 are random errors with zero mean and covariance matrix

$$\operatorname{Cov}\begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix} = \begin{pmatrix} \Sigma & V' \\ V & \Sigma_0 \end{pmatrix},$$

where ÎŁ ≄ 0 and ÎŁ0 ≄ 0 are known positive semi-definite matrices of arbitrary ranks.

The problem of predicting unobserved variables plays an important role in decision making and has received much attention in recent years. For the prediction of y0 in model (1), [1] obtained the best linear unbiased predictor (BLUP) when ÎŁ > 0. Bayes and minimax predictions were obtained by [2] when the random errors were normally distributed. [3] and [4] derived the linear minimax prediction under a modified quadratic loss function. [5] considered the optimal Stein-rule prediction. [6] reviewed the existing theory of minimum mean squared error predictors and extended it based on the principle of equivariance. [7] investigated the admissibility of linear predictors with inequality constraints under the mean squared error loss function. Another subject of interest is the prediction of the mean of y0, since [8] showed that, under the criterion of minimum mean squared error, the best predictor of y0 is its conditional mean. In model (1), prediction of the mean value of y0 (namely Ey0 = X0ÎČ) relates naturally to plug-in estimators of the parameter ÎČ. [9] proposed the simple projection predictor (SPP) of X0ÎČ by plugging in the best linear unbiased estimator (BLUE) of ÎČ. [10, 11] considered plugging in the prediction of ÎČ under the balanced loss function. The plug-in approach spawned a large literature on combined prediction; see [12, 13, 14].

Generally, predictions are investigated either for y0 or for Ey0, one at a time. However, in fields such as medicine and economics, people sometimes want to know the actual value of y0 and its mean value Ey0 simultaneously. For example, in financial markets, some investors may want to know the actual profit while others are more interested in the mean profit. Therefore, in order to meet different requirements, the market manager should acquire predictions of the actual profit and of the mean profit simultaneously. Setting aside investors’ demands, from the point of view of a decision maker the market manager needs to determine which prediction should be preferred, or to provide a comprehensive combined prediction of both the actual and the mean profit based on empirical data. [15] gave other examples of practical situations where one is required to predict both the mean and the actual values of a variable. Under these circumstances, we consider predictions of the following target function

$$\delta = \lambda y_0 + (1-\lambda)Ey_0, \tag{2}$$

where λ ∈ [0, 1] is a non-stochastic weight scalar representing the preference between predicting the actual value and the mean value of the studied variable. Note that Ύ = y0 if λ = 1 and Ύ = Ey0 if λ = 0, so predicting Ύ achieves the prediction of y0 and of Ey0 simultaneously. If 0 < λ < 1, then the prediction of Ύ balances the prediction of the actual value and the average value of y0. Besides, an unbiased prediction of Ύ is also an unbiased prediction of y0 or Ey0. Therefore, Ύ is a more flexible and inclusive target to study.

Studies on the prediction of ÎŽ have been carried out in the literature from various perspectives. The properties of predictors obtained by plugging in Stein-rule estimators were examined by [16, 17, 18]. [19] investigated the Stein-rule prediction of ÎŽ in the linear regression model when the error covariance matrix is positive definite but unknown. [20] studied the admissible prediction of ÎŽ. [21, 22], and [23] considered predictors of ÎŽ in linear regression models with stochastic or non-stochastic linear constraints on the regression coefficients. The issue of simultaneous prediction in measurement error models was addressed in [24] and [25]. [26] considered a scalar multiple of the classical prediction vector for the prediction of ÎŽ and discussed its performance properties.

For model (1), most previous work concerned biased prediction under ÎŁ > 0 (including the special case ÎŁ = I) and did not discuss the value of the weight scalar λ in (2). In this paper, supposing ÎŁ ≄ 0, we study the best linear unbiased prediction (BLUP) of ÎŽ and compare it with the usual BLUP of y0 and the SPP of Ey0. We also propose a method to choose the value of λ in (2), which gives a way to determine, from finite sample data, which of the predictions of ÎŽ, y0 or Ey0 should be provided.

The rest of the paper is organized as follows. In Section 2, we derive the BLUP of the target function (2) in the generalized linear model and discuss its efficiency compared with the usual BLUP and SPP. Simulation studies are provided in Section 3 to illustrate the determination of the weight scalar in our BLUP and the performance of our proposed BLUP compared with the other two predictors. Concluding remarks are given in Section 4.

2 The BLUP of ÎŽ and its efficiency

Denote by ℒℋ = {Cy ∣ C is an m × n matrix} the set of all homogeneous linear predictors of y0, and by ÎŽÌ‚BLUP the best linear unbiased predictor of ÎŽ in model (1). In this section, we first derive the expression of ÎŽÌ‚BLUP in ℒℋ, and then study its performance compared with the BLUP of y0 and the SPP of Ey0. All of the predictors discussed in this paper are derived under the criterion of minimum mean squared error. Some preliminaries and basic results are given as follows:

Definition 2.1

The predictor ÎŽÌ‚ of ÎŽ is unbiased if E ÎŽÌ‚ = E ÎŽ.

Definition 2.2

ή is linearly predictable if there exists a linear predictor Cy in ℒℋ such that Cy is an unbiased predictor of d.

Lemma 2.3

In model (1), ÎŽ is linearly predictable if there exists a matrix C such that CX = X0, or ℳ(X0â€Č)⊆ ℳ(Xâ€Č).

Proof

From Definitions 2.1 and 2.2, there exists a matrix C such that E(Cy) = EÎŽ for any ÎČ, namely CX = X0 or Xâ€ČCâ€Č = X0â€Č, which is equivalent to ℳ(X0â€Č) ⊆ ℳ(Xâ€Č). □

If not specified otherwise, the variables we aim to predict in this paper are all linearly predictable.

Lemma 2.4

([27]). Suppose the n × n matrix ÎŁ ≄ 0 and let X be an n × p matrix, then

$$\begin{pmatrix}\Sigma & X\\ X' & 0\end{pmatrix}^{-} = \begin{pmatrix} T^{+}-T^{+}X(X'T^{+}X)^{-}X'T^{+} & T^{+}X(X'T^{+}X)^{-}\\ (X'T^{+}X)^{-}X'T^{+} & I-(X'T^{+}X)^{-}\end{pmatrix},$$

where T = ÎŁ + XXâ€Č. Especially, if ÎŁ > 0, then

$$\begin{pmatrix}\Sigma & X\\ X' & 0\end{pmatrix}^{-} = \begin{pmatrix}\Sigma^{-1}-\Sigma^{-1}X(X'\Sigma^{-1}X)^{-}X'\Sigma^{-1} & \Sigma^{-1}X(X'\Sigma^{-1}X)^{-}\\ (X'\Sigma^{-1}X)^{-}X'\Sigma^{-1} & -(X'\Sigma^{-1}X)^{-}\end{pmatrix}.$$

Lemma 2.5

In model (1), the BLUP of y0 and the SPP of Ey0 are respectively

$$\tilde{y}_{0BLUP} = X_0\tilde\beta + VT^{+}(y - X\tilde\beta), \qquad \tilde{y}_{0SPP} = X_0\tilde\beta,$$

where T = ÎŁ + XXâ€Č and ÎČÌƒ = (Xâ€ČT+X)–Xâ€ČT+y is the best linear unbiased estimator (BLUE) of ÎČ in model (1).

If ÎŁ > 0 and rk(X) = p in model (1), the BLUP of y0 and the SPP of Ey0 are respectively

$$\hat{y}_{0BLUP} = X_0\hat\beta_{BLUE} + V\Sigma^{-1}(y - X\hat\beta_{BLUE}), \qquad \hat{y}_{0SPP} = X_0\hat\beta_{BLUE},$$

where ÎČ̂BLUE = (Xâ€ČΣ–1X)–1Xâ€ČΣ–1y is the BLUE of ÎČ.

Proof

BLUPs of y0 in Lemma 2.5 were derived by [1] and [28]. The SPPs of Ey0 were derived by [9]. □

The BLUPs and SPPs are presented here for further comparisons.
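As a concrete numerical illustration of the ÎŁ > 0 formulas of Lemma 2.5, the sketch below computes ÎČ̂BLUE, the BLUP of y0 and the SPP of Ey0 with NumPy. The sample sizes, design matrices, and cross-covariance V here are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 50, 2, 3

X = rng.uniform(1.0, 5.0, (n, p))        # observed design matrix
X0 = rng.uniform(1.0, 5.0, (m, p))       # design matrix of the unobserved part
beta = np.array([1.0, 0.8, 0.2])
Sigma = np.eye(n)                        # Sigma > 0 (identity for simplicity)
V = 0.2 * rng.standard_normal((m, n))    # Cov(eps0, eps) = V, illustrative
y = X @ beta + rng.standard_normal(n)

Si = np.linalg.inv(Sigma)
# BLUE of beta: (X' Sigma^-1 X)^-1 X' Sigma^-1 y
beta_blue = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)

y0_spp = X0 @ beta_blue                              # SPP of Ey0
y0_blup = y0_spp + V @ Si @ (y - X @ beta_blue)      # BLUP of y0
```

With Sigma = I the BLUE reduces to ordinary least squares, and the BLUP differs from the SPP only by the correction term VΣ–1(y – XÎČ̂BLUE).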

2.1 The best linear unbiased predictor of ÎŽ

Theorem 2.6

In model (1), the BLUP of ή in ℒℋ is

$$\hat\delta_{BLUP} = X_0\tilde\beta + \lambda VT^{+}(y - X\tilde\beta),$$

where T = ÎŁ + XXâ€Č and ÎČÌƒ = (Xâ€ČT+X)–Xâ€ČT+y.

Proof

Suppose ÎŽÌ‚ = Cy ∈ ℒℋ is unbiased; then by Lemma 2.3, CX = X0. Denoting by R(ÎŽÌ‚;ÎČ) the risk of ÎŽÌ‚ and by tr(A) the trace of a square matrix A, we have

$$\begin{aligned}
R(\hat\delta;\beta) &= E[(\hat\delta-\delta)'(\hat\delta-\delta)]\\
&= E\{[Cy-\lambda y_0-(1-\lambda)X_0\beta]'[Cy-\lambda y_0-(1-\lambda)X_0\beta]\}\\
&= E(Cy)'(Cy) - 2\lambda E(Cy)'y_0 - 2(1-\lambda)E(Cy)'X_0\beta + \lambda^2 Ey_0'y_0\\
&\qquad + 2\lambda(1-\lambda)Ey_0'X_0\beta + (1-\lambda)^2(X_0\beta)'X_0\beta\\
&= \operatorname{tr}(C\Sigma C') + \lambda^2\operatorname{tr}\Sigma_0 - 2\lambda\operatorname{tr}(CV') + \beta'(CX-X_0)'(CX-X_0)\beta.
\end{aligned}$$

Minimizing R(ÎŽÌ‚;ÎČ) is equivalent to solving the following optimization problem for C:

$$\min_{C:\; CX - X_0 = 0}\ \big[\operatorname{tr}(C\Sigma C') + \lambda^2\operatorname{tr}\Sigma_0 - 2\lambda\operatorname{tr}(CV')\big].$$

Let Λ be a p × m Lagrange multiplier and construct the Lagrange function as

$$L(C,\Lambda) = \operatorname{tr}(C\Sigma C') + \lambda^2\operatorname{tr}\Sigma_0 - 2\lambda\operatorname{tr}(CV') + 2\operatorname{tr}[(CX-X_0)\Lambda].$$

Setting ∂L/∂C = 0 and ∂L/∂Λ = 0, we have

$$C\Sigma - \lambda V + \Lambda'X' = 0, \qquad X'C' = X_0',$$

namely

$$\begin{pmatrix}\Sigma & X\\ X' & 0\end{pmatrix}\begin{pmatrix}C'\\ \Lambda\end{pmatrix} = \begin{pmatrix}\lambda V'\\ X_0'\end{pmatrix}, \tag{3}$$

and

$$\begin{pmatrix}C'\\ \Lambda\end{pmatrix} = \begin{pmatrix}\Sigma & X\\ X' & 0\end{pmatrix}^{-}\begin{pmatrix}\lambda V'\\ X_0'\end{pmatrix}.$$

By Lemma 2.4, we obtain C = X0(Xâ€ČT+X)–Xâ€ČT+ + λVT+[I – X(Xâ€ČT+X)–Xâ€ČT+]. Letting ÎČÌƒ = (Xâ€ČT+X)–Xâ€ČT+y, we thus have ÎŽÌ‚BLUP = Cy = X0ÎČÌƒ + λVT+(y – XÎČÌƒ). □
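A minimal numerical sketch of the formula in Theorem 2.6, using Moore-Penrose inverses so that ÎŁ is only required to be positive semi-definite; all sizes and matrices below are illustrative assumptions rather than values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 40, 2, 3
X = rng.uniform(1.0, 5.0, (n, p))
X0 = rng.uniform(1.0, 5.0, (m, p))
beta = np.array([1.0, 0.8, 0.2])
Sigma = np.eye(n)                        # any Sigma >= 0 is allowed here
V = 0.3 * rng.standard_normal((m, n))    # Cov(eps0, eps), illustrative
y = X @ beta + rng.standard_normal(n)

def delta_blup(lam):
    """BLUP of delta = lam*y0 + (1-lam)*Ey0, as in Theorem 2.6."""
    T = Sigma + X @ X.T
    Tp = np.linalg.pinv(T)                    # T^+
    G = np.linalg.pinv(X.T @ Tp @ X)          # one choice of (X'T^+X)^-
    beta_t = G @ X.T @ Tp @ y                 # BLUE of beta
    return X0 @ beta_t + lam * V @ Tp @ (y - X @ beta_t)

# lam = 1 recovers the BLUP of y0; lam = 0 recovers the SPP X0*beta_t
```

Since ÎŽÌ‚BLUP is linear in λ, one can check numerically that ÎŽÌ‚BLUP(λ) = λ·ΎÌ‚BLUP(1) + (1 – λ)·ΎÌ‚BLUP(0), matching the tradeoff interpretation given in Remark 2.9.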

Corollary 2.7

If ÎŁ > 0 and rk(X) = p in model (1), then the BLUP of ÎŽ is

$$\hat\delta_{BLUP} = X_0\hat\beta_{BLUE} + \lambda V\Sigma^{-1}(y - X\hat\beta_{BLUE}),$$

where ÎČ̂BLUE = (Xâ€ČΣ–1X)–1Xâ€ČΣ–1y.

Proof

If ÎŁ > 0 and rk(X) = p, then Xâ€ČΣ–1X is nonsingular. Since

$$\begin{vmatrix}\Sigma & X\\ X' & 0\end{vmatrix} = |\Sigma|\,\lvert -X'\Sigma^{-1}X\rvert = (-1)^{p}\,|\Sigma|\,|X'\Sigma^{-1}X| \neq 0,$$

the matrix $\left(\begin{smallmatrix}\Sigma & X\\ X' & 0\end{smallmatrix}\right)$ is nonsingular. By Lemma 2.4,

$$\begin{pmatrix}\Sigma & X\\ X' & 0\end{pmatrix}^{-1} = \begin{pmatrix}\Sigma^{-1}-\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} & \Sigma^{-1}X(X'\Sigma^{-1}X)^{-1}\\ (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} & -(X'\Sigma^{-1}X)^{-1}\end{pmatrix}.$$

With similar calculations as in the proof of Theorem 2.6, the solution of (3) gives that

$$C = X_0(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} + \lambda V\Sigma^{-1}[I - X(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}],$$

and therefore ÎŽÌ‚BLUP = X0ÎČ̂BLUE + λ VΣ–1(y – XÎČ̂BLUE). □

Theorem 2.8

For the prediction of (2) in model (1), EÎŽÌ‚BLUP = Eỹ0BLUP = Eỹ0SPP = Ey0 = X0ÎČ.

Proof

By Theorem 2.6, EÎŽÌ‚BLUP = E[X0ÎČÌƒ + λVT+(y – XÎČÌƒ)] = X0ÎČ = Ey0. From Lemma 2.5, it is easy to prove that Eỹ0BLUP = Eỹ0SPP = Ey0 = X0ÎČ. □

Remark 2.9

According to Definition 2.1 and Theorem 2.8, ÎŽÌ‚BLUP, ỹ0BLUP and ỹ0SPP are all unbiased predictors of y0 or Ey0. If λ = 1, then ÎŽÌ‚BLUP = ỹ0BLUP is the BLUP of y0; if λ = 0, then ÎŽÌ‚BLUP = X0ÎČÌƒ is the SPP of Ey0. This shows that the function (2) can simultaneously predict the actual value of y0 and its mean value. Since ÎŽÌ‚BLUP = λỹ0BLUP + (1 – λ)ỹ0SPP, ÎŽÌ‚BLUP can be viewed as a tradeoff between the BLUP of y0 and the SPP of Ey0. By using ÎŽÌ‚BLUP in practical applications, forecasters can provide a more comprehensive predictor by assigning different weights in ÎŽÌ‚BLUP.

As for the choice of λ, the weight scalar should usually be given before predicting. Since λ represents the weight given to the prediction of y0 and is not a model parameter, there is no “true” value of λ, only a suitable one. One method is to select λ by the forecaster’s subjective preference; for example, if the predictions of y0 and Ey0 are treated equally, then λ = 0.5. Another method is to determine λ from the observed data (y, X) in model (1). In this paper we recommend the leave-one-out cross-validation technique. In order to determine λ, we take ÎŽÌ‚BLUP as the predictor of y0 (justified by Theorem 2.8, since the true ÎČ in Ey0 = X0ÎČ is unknown). Define ÎŽÌ‚(–j)(λ) to be the predictor of yj when the jth case of (y, X) in (1) is deleted, and denote 𝒯 = {λi | 0 ≀ λi ≀ 1, i = 1, 2, ⋯}. The predicted residual sum of squares is defined as

$$CV(\lambda) = \sum_{j=1}^{n}\big[y_j - \hat\delta_{(-j)}(\lambda)\big]^2.$$

For each λi ∈ 𝒯, compute CV(λi). The chosen λ is the one that minimizes CV(λ) over 𝒯. Simulations in Section 3 indicate that the leave-one-out cross-validation technique for selecting λ is feasible. Through the selection of λ from observed data, forecasters can determine which of ÎŽÌ‚BLUP, ỹ0BLUP and ỹ0SPP is the most suitable to provide.
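The leave-one-out procedure above can be sketched as follows. This is a plain, unoptimized implementation under illustrative assumptions: each held-out yj plays the role of the unobserved “y0”, so its cross-covariance row V is read off the corresponding row of ÎŁ; the function name and interface are our own.

```python
import numpy as np

def select_lambda(y, X, Sigma, grid):
    """Pick lambda on `grid` minimizing the leave-one-out criterion CV(lambda).

    For each held-out case j, the model is refit once and the SPP part and
    BLUP correction are reused across the whole lambda grid, since the
    predictor is linear in lambda."""
    n = len(y)
    preds = np.empty((len(grid), n))
    for j in range(n):
        keep = np.arange(n) != j
        Xj, yj = X[keep], y[keep]
        Sj = Sigma[np.ix_(keep, keep)]
        Vj = Sigma[j, keep]                       # Cov(eps_j, eps_{-j})
        T = Sj + Xj @ Xj.T
        Tp = np.linalg.pinv(T)
        bt = np.linalg.pinv(Xj.T @ Tp @ Xj) @ Xj.T @ Tp @ yj
        base = X[j] @ bt                          # SPP part of the predictor
        corr = Vj @ Tp @ (yj - Xj @ bt)           # BLUP correction term
        for i, lam in enumerate(grid):
            preds[i, j] = base + lam * corr
    cv = ((y - preds) ** 2).sum(axis=1)
    return grid[int(np.argmin(cv))], cv
```

Because ÎŽÌ‚(–j)(λ) is affine in λ, only one refit per deleted case is needed regardless of the grid size; this keeps the sketch usable even for a fine grid such as the 0.001 step used in Section 3.1.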

2.2 Efficiency of Ύ̂BLUP

According to Theorem 2.8, Ύ̂BLUP, ỹ0BLUP and ỹ0SPP are all unbiased predictors of y0 or Ey0. From the point of view of linearity and unbiasedness, we mainly discuss the performance of Ύ̂BLUP compared with ỹ0BLUP and ỹ0SPP in what follows.

Theorem 2.10

For model (1),

$$\operatorname{Cov}(\hat\delta_{BLUP}) \le \operatorname{Cov}(\tilde{y}_{0BLUP}),$$

and the equality holds if and only if (1 – λÂČ)VT+[I – T+X(Xâ€ČT+X)–Xâ€Č]T+Vâ€Č = 0.

Proof

Denote by Δ̃0 = VT+(y – XÎČÌƒ) the predictor of Δ0; then we have

$$\operatorname{Cov}(\hat\delta_{BLUP}) = \operatorname{Cov}(X_0\tilde\beta + \lambda\tilde\varepsilon_0), \qquad \operatorname{Cov}(\tilde{y}_{0BLUP}) = \operatorname{Cov}(X_0\tilde\beta + \tilde\varepsilon_0).$$

Since ÎŁ = T – XXâ€Č and Xâ€Č[I – T+X(Xâ€ČT+X)–Xâ€Č] = 0, then

$$\begin{aligned}
\operatorname{Cov}(X_0\tilde\beta,\,\tilde\varepsilon_0) &= X_0(X'T^{+}X)^{-}X'T^{+}\Sigma[I - T^{+}X(X'T^{+}X)^{-}X']T^{+}V'\\
&= X_0(X'T^{+}X)^{-}X'T^{+}(T - XX')[I - T^{+}X(X'T^{+}X)^{-}X']T^{+}V' = 0.
\end{aligned}$$

Therefore, Cov(ÎŽÌ‚BLUP) – Cov(ỹ0BLUP) = (λÂČ – 1)Cov(Δ̃0) ≀ 0, so that

$$\operatorname{Cov}(\hat\delta_{BLUP}) \le \operatorname{Cov}(\tilde{y}_{0BLUP}),$$

and the equality holds if and only if (1 – λÂČ)Cov(Δ̃0) = (1 – λÂČ)VT+[I – T+X(Xâ€ČT+X)–Xâ€Č]T+Vâ€Č = 0. □

Corollary 2.11

If ÎŁ > 0 and rk(X) = p in model (1), then

$$\operatorname{Cov}(\hat\delta_{BLUP}) \le \operatorname{Cov}(\hat{y}_{0BLUP}),$$

and the equality holds if and only if (1 – λÂČ)VΣ–1[I – Σ–1X(Xâ€ČΣ–1X)–1Xâ€Č]Σ–1Vâ€Č = 0.

Proof

Corollary 2.11 is easily proved by Lemma 2.4 and Theorem 2.10. □

Remark 2.12

Theorem 2.10 and Corollary 2.11 show that Ύ̂BLUP is better than ỹ0BLUP under the criterion of covariance.

Theorem 2.13

For model (1), if DT+Vâ€ČX0(Xâ€ČT+X)–Xâ€ČT+ + T+X(Xâ€ČT+X)–X0â€ČVT+D ≄ 0, where D = I – X(Xâ€ČT+X)–Xâ€ČT+, then

$$E(\tilde{y}_{0SPP}-X_0\beta)'(\tilde{y}_{0SPP}-X_0\beta) \le E(\hat\delta_{BLUP}-X_0\beta)'(\hat\delta_{BLUP}-X_0\beta) \le E(\tilde{y}_{0BLUP}-X_0\beta)'(\tilde{y}_{0BLUP}-X_0\beta).$$

Proof

Denote

$$\begin{aligned}
C_1 &= X_0(X'T^{+}X)^{-}X'T^{+} + \lambda VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}],\\
C_2 &= X_0(X'T^{+}X)^{-}X'T^{+} + VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}],
\end{aligned}$$

then Ύ̂BLUP = C1y and y͠0BLUP = C2y. By the unbiasedness, C1X = X0 and C2X = X0. Therefore,

$$\begin{aligned}
&E(\hat\delta_{BLUP}-X_0\beta)'(\hat\delta_{BLUP}-X_0\beta) - E(\tilde{y}_{0BLUP}-X_0\beta)'(\tilde{y}_{0BLUP}-X_0\beta)\\
&\qquad = (X\beta)'(C_1'C_1 - C_2'C_2)X\beta + \operatorname{tr}(C_1\Sigma C_1' - C_2\Sigma C_2').
\end{aligned} \tag{4}$$

Note that D is a symmetric idempotent matrix and

$$\begin{aligned}
C_1\Sigma C_1' &= C_1(T - XX')C_1' = X_0(X'T^{+}X)^{-}X_0' + \lambda^2 VT^{+}DV' - X_0X_0',\\
C_2\Sigma C_2' &= C_2(T - XX')C_2' = X_0(X'T^{+}X)^{-}X_0' + VT^{+}DV' - X_0X_0',
\end{aligned}$$

then we have

$$C_1\Sigma C_1' - C_2\Sigma C_2' = -(1-\lambda^2)VT^{+}DV' \le 0, \quad\text{and}\quad \operatorname{tr}(C_1\Sigma C_1' - C_2\Sigma C_2') \le 0. \tag{5}$$

Besides,

$$\begin{aligned}
C_1'C_1 - C_2'C_2 &= (\lambda-1)[DT^{+}V'X_0(X'T^{+}X)^{-}X'T^{+} + T^{+}X(X'T^{+}X)^{-}X_0'VT^{+}D] + (\lambda^2-1)DT^{+}V'VT^{+}D\\
&\le (\lambda-1)[DT^{+}V'X_0(X'T^{+}X)^{-}X'T^{+} + T^{+}X(X'T^{+}X)^{-}X_0'VT^{+}D] \le 0.
\end{aligned} \tag{6}$$

Substituting (5) and (6) into (4), we have

$$E(\hat\delta_{BLUP}-X_0\beta)'(\hat\delta_{BLUP}-X_0\beta) \le E(\tilde{y}_{0BLUP}-X_0\beta)'(\tilde{y}_{0BLUP}-X_0\beta).$$

Letting λ = 0 in (2), Theorem 2.6 gives $\tilde{y}_{0SPP} = X_0\tilde\beta = \arg\min_{\hat{y}_0\in\mathcal{LH}} E(\hat{y}_0 - X_0\beta)'(\hat{y}_0 - X_0\beta)$. It is obvious that

$$E(\tilde{y}_{0SPP}-X_0\beta)'(\tilde{y}_{0SPP}-X_0\beta) \le E(\hat\delta_{BLUP}-X_0\beta)'(\hat\delta_{BLUP}-X_0\beta).$$

□

By Lemma 2.4 and Theorem 2.13, we have

Corollary 2.14

In model (1), if ÎŁ > 0, rk(X) = p and DΣ–1Vâ€ČX0(Xâ€ČΣ–1X)–1Xâ€ČΣ–1 + Σ–1X (Xâ€ČΣ–1X)–1X0â€ČVΣ–1D ≄ 0, where D = I – X(Xâ€ČΣ–1X)–1Xâ€ČΣ–1, then

$$E(\hat{y}_{0SPP}-X_0\beta)'(\hat{y}_{0SPP}-X_0\beta) \le E(\hat\delta_{BLUP}-X_0\beta)'(\hat\delta_{BLUP}-X_0\beta) \le E(\hat{y}_{0BLUP}-X_0\beta)'(\hat{y}_{0BLUP}-X_0\beta).$$

Remark 2.15

Theorem 2.13 and Corollary 2.14 show that Ύ̂BLUP is better than ỹ0BLUP under the squared loss function as a predictor of Ey0.

Theorem 2.16

For model (1),

$$E(\tilde{y}_{0BLUP}-y_0)'(\tilde{y}_{0BLUP}-y_0) \le E(\hat\delta_{BLUP}-y_0)'(\hat\delta_{BLUP}-y_0) \le E(\tilde{y}_{0SPP}-y_0)'(\tilde{y}_{0SPP}-y_0).$$

Proof

Denote

$$\begin{aligned}
C_1 &= X_0(X'T^{+}X)^{-}X'T^{+} + \lambda VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}],\\
C_2 &= X_0(X'T^{+}X)^{-}X'T^{+} + VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}],\\
C_3 &= X_0(X'T^{+}X)^{-}X'T^{+},
\end{aligned}$$

then ÎŽÌ‚BLUP = C1y, yÍ 0BLUP = C2y and yÍ 0SPP = X0ÎČÍ  = C3y. By Lemma 2.3, C1X = X0, C2X = X0 and C3X = X0. Since

$$\begin{aligned}
E(C_iy-y_0)'(C_iy-y_0) &= \operatorname{tr}(C_i\Sigma C_i') - 2\operatorname{tr}(C_iV') + \operatorname{tr}\Sigma_0,\\
E(C_iy-y_0)'(C_iy-y_0) - E(C_jy-y_0)'(C_jy-y_0) &= \operatorname{tr}(C_i\Sigma C_i' - C_j\Sigma C_j') - 2\operatorname{tr}[(C_i-C_j)V'],
\end{aligned}$$
for 1 ≀ i, j ≀ 3 and 0 ≀ λ ≀ 1,

we have

$$\begin{aligned}
E(C_1y-y_0)'(C_1y-y_0) - E(C_2y-y_0)'(C_2y-y_0) &= (\lambda-1)^2\operatorname{tr}(VT^{+}DV') \ge 0,\\
E(C_1y-y_0)'(C_1y-y_0) - E(C_3y-y_0)'(C_3y-y_0) &= [(\lambda-1)^2-1]\operatorname{tr}(VT^{+}DV') \le 0,
\end{aligned}$$

which give that

$$E(\tilde{y}_{0BLUP}-y_0)'(\tilde{y}_{0BLUP}-y_0) \le E(\hat\delta_{BLUP}-y_0)'(\hat\delta_{BLUP}-y_0) \le E(\tilde{y}_{0SPP}-y_0)'(\tilde{y}_{0SPP}-y_0).$$

□

By Lemma 2.4 and Theorem 2.16, we have

Corollary 2.17

In model (1), if ÎŁ > 0 and rk(X) = p, then

$$E(\hat{y}_{0BLUP}-y_0)'(\hat{y}_{0BLUP}-y_0) \le E(\hat\delta_{BLUP}-y_0)'(\hat\delta_{BLUP}-y_0) \le E(\hat{y}_{0SPP}-y_0)'(\hat{y}_{0SPP}-y_0).$$

Remark 2.18

Theorem 2.16 and Corollary 2.17 show that Ύ̂BLUP is better than ỹ0SPP under the squared loss function as a predictor of y0.

3 Simulation studies

In this section, we conduct simulations to illustrate the selection of λ in ÎŽÌ‚BLUP and the finite sample performance of our simultaneous prediction compared with Ć·0BLUP and Ć·0SPP.

The data are generated from the following model:

$$\begin{pmatrix} y \\ y_0 \end{pmatrix} = \begin{pmatrix} X \\ X_0 \end{pmatrix}\beta + \begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix}, \qquad \begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix} \sim N(0,\Sigma), \tag{7}$$

where
$$\Sigma = \begin{pmatrix} 50 & 2 & \cdots & 2\\ 2 & 50 & \cdots & 2\\ \vdots & \vdots & \ddots & \vdots\\ 2 & 2 & \cdots & 50 \end{pmatrix}.$$

We assume y is observed with sample size n = 200 and y0 is to be predicted with sample size m = 1. In Section 3.1 we only need the sample data of y to determine λ, while in Section 3.2 we use the sample data of both y and y0 for comparisons with various λ. Elements of the corresponding matrices X and X0 are generated from the uniform distribution on [1.1, 30.7].
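The data-generating setup described above can be sketched as follows; the NumPy generator, seed, and the scalar ÎČ = 0.8 used in Section 3.1 are assumptions made for the sketch, mirroring the described configuration rather than reproducing the authors’ code.

```python
import numpy as np

rng = np.random.default_rng(2018)
n, m, p = 200, 1, 1
N = n + m

# (n+m) x (n+m) covariance of (eps, eps0): 50 on the diagonal, 2 elsewhere
Sigma_full = np.full((N, N), 2.0) + 48.0 * np.eye(N)

X_full = rng.uniform(1.1, 30.7, (N, p))       # regressors as described above
beta = np.array([0.8])                         # true coefficient (Section 3.1)
eps = rng.multivariate_normal(np.zeros(N), Sigma_full)
y_full = X_full @ beta + eps

y, y0 = y_full[:n], y_full[n:]                 # observed / to be predicted
X, X0 = X_full[:n], X_full[n:]
Sigma = Sigma_full[:n, :n]                     # Cov(eps)
V = Sigma_full[n:, :n]                         # Cov(eps0, eps)
```

The partition at index n splits the joint covariance into the blocks ÎŁ, V and ÎŁ0 of model (1), so the predictors of Section 2 can be applied directly to (y, X, ÎŁ, V).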

3.1 Selection of λ in ÎŽÌ‚BLUP

We set ÎČ to be the one-dimensional parameter with the true value 0.8. The number of simulated realizations for choosing λ is 1000. In each simulation, let λ vary from 0 to 1 with step size 0.001. We use the leave-one-out cross-validation technique (see Section 2.1) to determine λ. Let λ* be the selected value of λ, then

$$\lambda^{\star} = \mathop{\arg\min}_{0\le\lambda\le 1} CV(\lambda) = \mathop{\arg\min}_{0\le\lambda\le 1} \sum_{j=1}^{200}\big[y_j - \hat\delta_{(-j)}(\lambda)\big]^2.$$

Simulations show that the relationship between CV(λ) and λ varies. Three of the simulations are presented in Figure 1 to illustrate the relation between λ and log CV(λ). Subfigure (a) indicates that λ* = 1, so Ć·0BLUP should be provided when predicting; (b) indicates that λ* = 0, so Ć·0SPP should be preferred; (c) indicates that λ* = 0.315, so ÎŽÌ‚BLUP should be provided when predicting. The relationship between CV(λ) and λ also tells us that there are three kinds of λ* in our simulations. Table 1 shows that among the 1000 simulations, 267 give λ* = 0, 332 give λ* = 1, and 401 give 0 < λ* < 1. This performance shows that the leave-one-out cross-validation technique for the selection of λ is feasible and provides a way to answer the question of which of ÎŽÌ‚BLUP, Ć·0BLUP and Ć·0SPP should be preferred given the observations.

Fig. 1 Relationships between λ and log[CV(λ)] in three simulations (a), (b) and (c), and the corresponding selections of λ

Table 1

Frequency of occurrences of three kinds of λ* in 1000 simulations

             λ* = 0      λ* = 1      0 < λ* < 1
Frequency    267/1000    332/1000    401/1000

3.2 Finite sample performance of the predictors

Let n = 200, m = 1, p = 3 and the true ÎČ = (1, 0.8, 0.2)â€Č in (7). λ in ÎŽÌ‚BLUP varies on a grid from 0.1 to 0.9. For each λ, the number of simulations is 1000, and in each simulation we compare ÎŽÌ‚BLUP, Ć·0BLUP and Ć·0SPP. For the prediction errors ÎŽÌ‚BLUP – y0, Ć·0BLUP – y0 and Ć·0SPP – y0, the sample means (sm), standard deviations (std) and mean squares (ms) are reported in Table 2. Likewise, for ÎŽÌ‚BLUP – X0ÎČ, Ć·0BLUP – X0ÎČ and Ć·0SPP – X0ÎČ, the sm, std and ms values are presented in Table 3.

Table 2

Finite sample performance of forecast precision of Ć·0BLUP, ÎŽÌ‚BLUP (with different λ) and Ć·0SPP

                  λ=0.1    λ=0.2    λ=0.3    λ=0.4    λ=0.5    λ=0.6    λ=0.7    λ=0.8    λ=0.9
sm   Ć·0BLUP–y0   -0.2596   0.1509  -0.1056   0.3124   0.4998  -0.0703   0.1251   0.2432   0.2150
     ÎŽÌ‚BLUP–y0    0.2524   0.1790  -0.0662   0.3314   0.4911  -0.0783   0.1292   0.2358   0.2152
     Ć·0SPP–y0    0.2516   0.1861  -0.0494   0.3341   0.4825  -0.0904   0.1389   0.2060   0.2172
std  Ć·0BLUP–y0    7.2516   7.1930   6.9865   6.8545   6.9253   6.9844   6.8622   6.9193   6.9606
     ÎŽÌ‚BLUP–y0    7.2758   7.2239   7.0448   6.8462   6.9624   6.9791   6.8682   6.9277   6.9640
     Ć·0SPP–y0    7.2843   7.2433   7.0890   6.8656   7.0298   7.0051   6.9230   7.0053   7.0463
ms   Ć·0BLUP–y0   52.601   51.711   48.773   47.035   48.162   48.738   47.058   47.888   48.448
     ÎŽÌ‚BLUP–y0   52.948   52.163   49.584   46.933   48.668   48.665   47.141   48.000   48.496
     Ć·0SPP–y0   53.072   52.447   50.206   47.207   49.601   49.030   47.899   49.068   49.648

Table 3

Finite sample performance of goodness of fit of Ć·0BLUP, ÎŽÌ‚BLUP (with different λ) and Ć·0SPP

                   λ=0.1    λ=0.2    λ=0.3    λ=0.4    λ=0.5    λ=0.6    λ=0.7    λ=0.8    λ=0.9
sm   Ć·0BLUP–X0ÎČ   0.0249  -0.0190  -0.0742  -0.0077  -0.0256  -0.0257   0.0047  -0.0543  -0.0340
     ÎŽÌ‚BLUP–X0ÎČ   0.0177   0.0091  -0.0349   0.0113  -0.0343  -0.0337   0.0089  -0.0618  -0.0338
     Ć·0SPP–X0ÎČ   0.0169   0.0162  -0.0180   0.0240  -0.0429   0.0457   0.0186  -0.0915  -0.0318
std  Ć·0BLUP–X0ÎČ   1.6389   1.6769   1.6415   1.6124   1.6121   1.6640   1.6401   1.5242   1.6445
     ÎŽÌ‚BLUP–X0ÎČ   1.3389   1.3831   1.3844   1.3774   1.3914   1.5010   1.5048   1.4389   1.5966
     Ć·0SPP–X0ÎČ   1.3334   1.3629   1.3626   1.3308   1.3039   1.3983   1.3547   1.2949   1.3700
ms   Ć·0BLUP–X0ÎČ   2.6841   2.8097   2.6974   2.5973   2.5968   2.7668   2.6872   2.3239   2.7028
     ÎŽÌ‚BLUP–X0ÎČ   1.7910   1.9112   1.9159   1.8955   1.9352   2.2518   2.2621   2.0721   2.5477
     Ć·0SPP–X0ÎČ   1.7765   1.8558   1.8551   1.7697   1.7002   1.9554   1.8336   1.6835   1.8761

From Table 2 and Table 3, we make the following observations:

  1. As for prediction precision, no matter what λ is set to be, the sample means (sm) of the prediction errors of Ć·0BLUP, ÎŽÌ‚BLUP and Ć·0SPP are all small. Comparisons of the sm values cannot tell which of the three predictors is better, yet the standard deviations (std) and mean squares (ms) of ÎŽÌ‚BLUP – y0 are less than those of Ć·0SPP – y0.

  2. No matter what λ is set to be, the sample means (sm) of Ć·0BLUP – X0ÎČ, ÎŽÌ‚BLUP – X0ÎČ and Ć·0SPP – X0ÎČ are all small. Comparisons of the sm values cannot determine which predictor is better, yet the standard deviations (std) and mean squares (ms) of ÎŽÌ‚BLUP – X0ÎČ are less than those of Ć·0BLUP – X0ÎČ.

The above facts imply that for any λ ∈ (0, 1), ÎŽÌ‚BLUP, Ć·0BLUP and Ć·0SPP are all unbiased predictors of y0 and Ey0. ÎŽÌ‚BLUP is more efficient than Ć·0SPP = X0ÎČ̂BLUE when predicting the actual value, and more efficient than Ć·0BLUP when predicting the mean value. The simulations thus verify the results in Section 2.2.

4 Conclusion

In this paper, we study prediction based on a composite target function that allows one to predict simultaneously the actual and the mean values of the unobserved regressand in the generalized linear model. The BLUP of the target function is derived when the model error covariance matrix is positive semi-definite. This BLUP is also an unbiased prediction of the actual and the mean values of the unobserved regressand. We propose the leave-one-out cross-validation technique to determine the value of the weight scalar in our prediction, which helps to provide a suitable prediction. As for the efficiency of the proposed BLUP, studies show that it is better than the usual BLUP under the criterion of covariance and dominates it as a prediction of the mean value of the regressand; besides, the proposed BLUP is better than the SPP as a prediction of the actual value of the regressand. Simulation studies illustrate the selection of the weight scalar in the proposed BLUP and show that it has good finite sample performance. Further research on simultaneous prediction is in progress.

Acknowledgement

The authors are grateful to the responsible editor and the anonymous reviewers for their valuable comments and suggestions, which have greatly improved this paper. This research is supported by the Scientific Research Fund of Hunan Provincial Education Department (13C1139), the Youth Scientific Research Foundation of Central South University of Forestry and Technology of China (QJ2012013A) and the Natural Science Foundation of Hunan Province (2015JJ4090).

References

[1] Goldberger A. S., Best linear unbiased prediction in the generalized linear regression model, Journal of the American Statistical Association, 1962, 57(298), 369–375. doi:10.1080/01621459.1962.10480665

[2] Bolfarine H., Zacks S., Bayes and minimax prediction in finite populations, J. Statist. Plann. Infer., 1991, 28, 139–151. doi:10.1016/0378-3758(91)90022-7

[3] Yu S. H., The linear minimax predictor in finite populations with arbitrary rank under quadratic loss function, Chin. Ann. Math., 2004, 25, 485–496

[4] Xu L. W., Wang S. G., The minimax predictor in finite populations with arbitrary rank in normal distribution, Chin. Ann. Math., 2006, 27, 405–416

[5] Gotway C. A., Cressie N., Improved multivariate prediction under a general linear model, J. Multivariate Anal., 1993, 45, 56–72. doi:10.1006/jmva.1993.1026

[6] Teunissen P. J. G., Best prediction in linear models with mixed integer/real unknowns: theory and application, Journal of Geodesy, 2007, 81(12), 759–780. doi:10.1007/s00190-007-0140-6

[7] Xu L. W., Admissible linear predictors in the superpopulation model with respect to inequality constraints, Comm. Statist. Theory Methods, 2009, 38, 2528–2540. doi:10.1080/03610920802571211

[8] Searle S. R., Casella G., McCulloch C. E., Variance Components, 1992, New York: Wiley. doi:10.1002/9780470316856

[9] Bolfarine H., Rodrigues J., On the simple projection predictor in finite populations, Australian Journal of Statistics, 1988, 30(3), 338–341. doi:10.1111/j.1467-842X.1988.tb00627.x

[10] Hu G. K., Li Q. G., Yu S. H., Optimal and minimax prediction in multivariate normal populations under a balanced loss function, J. Multivariate Anal., 2014, 128, 154–164. doi:10.1016/j.jmva.2014.03.014

[11] Hu G. K., Peng P., Linear admissible predictor of finite population regression coefficient under a balanced loss function, J. Math., 2014, 34, 820–828

[12] Diebold F. X., Lopez J. A., Forecast evaluation and combination, Handbook of Statistics, 1996, 14, 241–268. doi:10.1016/S0169-7161(96)14010-4

[13] Hendry D. F., Clements M. P., Pooling of forecasts, Econometrics Journal, 2002, 5, 1–26. doi:10.1111/j.1368-423X.2004.00119.x

[14] Timmermann A., Forecast combinations, Handbook of Economic Forecasting, 2006, 1, 135–196. doi:10.1016/S1574-0706(05)01004-9

[15] Shalabh, Performance of Stein-rule procedure for simultaneous prediction of actual and average values of study variable in linear regression models, Bull. Internat. Statist. Inst., 1995, 56, 1357–1390

[16] Chaturvedi A., Singh S. P., Stein rule prediction of the composite target function in a general linear regression model, Statist. Papers, 2000, 41(3), 359–367. doi:10.1007/BF02925929

[17] Chaturvedi A., Kesarwani S., Chandra R., Simultaneous prediction based on shrinkage estimator, in: Shalabh, C. Heumann (Eds.), Recent Advances in Linear Models and Related Areas, Springer, 2008, 181–204. doi:10.1007/978-3-7908-2064-5_10

[18] Shalabh, Heumann C., Simultaneous prediction of actual and average values of study variable using Stein-rule estimators, in: K. Kumar, A. Chaturvedi (Eds.), Some Recent Developments in Statistical Theory and Application, Brown Walker Press, USA, 2012, 68–81

[19] Chaturvedi A., Wan A. T. K., Singh S. P., Improved multivariate prediction in a general linear model with an unknown error covariance matrix, J. Multivariate Anal., 2002, 83(1), 166–182. doi:10.1006/jmva.2001.2042

[20] Bai C., Li H., Admissibility of simultaneous prediction for actual and average values in finite population, J. Inequal. Appl., 2018, 2018(1), 117. doi:10.1186/s13660-018-1707-x

[21] Toutenburg H., Shalabh, Predictive performance of the methods of restricted and mixed regression estimators, Biometrical J., 1996, 38(8), 951–959. doi:10.1002/bimj.4710380807

[22] Toutenburg H., Shalabh, Improved predictions in linear regression models with stochastic linear constraints, Biom. J., 2000, 42(1), 71–86. doi:10.1002/(SICI)1521-4036(200001)42:1<71::AID-BIMJ71>3.0.CO;2-H

[23] Dube M., Manocha V., Simultaneous prediction in restricted regression models, J. Appl. Statist. Sci., 2002, 11(4), 277–288

[24] Shalabh, Paudel C. M., Kumar N., Simultaneous prediction of actual and average values of response variable in replicated measurement error models, in: Shalabh, C. Heumann (Eds.), Recent Advances in Linear Models and Related Areas, Springer, 2008, 105–133. doi:10.1007/978-3-7908-2064-5_7

[25] Garg G., Shalabh, Simultaneous predictions under exact restrictions in ultrastructural model, Journal of Statistical Research (Special Volume on Measurement Error Models), 2011, 45(2), 139–154

[26] Shalabh, A revisit to efficient forecasting in linear regression models, J. Multivariate Anal., 2013, 114, 161–170. doi:10.1016/j.jmva.2012.07.017

[27] Wang S. G., Shi J. H., Introduction to the Linear Model, 2004, Science Press, Beijing

[28] Yu S. H., Xu L. W., Admissibility of linear prediction under quadratic loss, Acta Mathematicae Applicatae Sinica, 2004, 27, 385–396

Received: 2017-11-28
Accepted: 2018-07-17
Published Online: 2018-08-24

© 2018 Bai and Li, published by De Gruyter

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

  61. On the power sum problem of Lucas polynomials and its divisible property
  62. Existence of solutions for a shear thickening fluid-particle system with non-Newtonian potential
  63. On generalized P-reducible Finsler manifolds
  64. On Banach and Kuratowski Theorem, K-Lusin sets and strong sequences
  65. On the boundedness of square function generated by the Bessel differential operator in weighted Lebesque Lp,α spaces
  66. On the different kinds of separability of the space of Borel functions
  67. Curves in the Lorentz-Minkowski plane: elasticae, catenaries and grim-reapers
  68. Functional analysis method for the M/G/1 queueing model with single working vacation
  69. Existence of asymptotically periodic solutions for semilinear evolution equations with nonlocal initial conditions
  70. The existence of solutions to certain type of nonlinear difference-differential equations
  71. Domination in 4-regular Knödel graphs
  72. Stepanov-like pseudo almost periodic functions on time scales and applications to dynamic equations with delay
  73. Algebras of right ample semigroups
  74. Random attractors for stochastic retarded reaction-diffusion equations with multiplicative white noise on unbounded domains
  75. Nontrivial periodic solutions to delay difference equations via Morse theory
  76. A note on the three-way generalization of the Jordan canonical form
  77. On some varieties of ai-semirings satisfying xp+1 ≈ x
  78. Abstract-valued Orlicz spaces of range-varying type
  79. On the recursive properties of one kind hybrid power mean involving two-term exponential sums and Gauss sums
  80. Arithmetic of generalized Dedekind sums and their modularity
  81. Multipreconditioned GMRES for simulating stochastic automata networks
  82. Regularization and error estimates for an inverse heat problem under the conformable derivative
  83. Transitivity of the Δm-relation on (m-idempotent) hyperrings
  84. Learning Bayesian networks based on bi-velocity discrete particle swarm optimization with mutation operator
  85. Simultaneous prediction in the generalized linear model
  86. Two asymptotic expansions for gamma function developed by Windschitl’s formula
  87. State maps on semihoops
  88. 𝓜𝓝-convergence and lim-inf𝓜-convergence in partially ordered sets
  89. Stability and convergence of a local discontinuous Galerkin finite element method for the general Lax equation
  90. New topology in residuated lattices
  91. Optimality and duality in set-valued optimization utilizing limit sets
  92. An improved Schwarz Lemma at the boundary
  93. Initial layer problem of the Boussinesq system for Rayleigh-Bénard convection with infinite Prandtl number limit
  94. Toeplitz matrices whose elements are coefficients of Bazilevič functions
  95. Epi-mild normality
  96. Nonlinear elastic beam problems with the parameter near resonance
  97. Orlicz difference bodies
  98. The Picard group of Brauer-Severi varieties
  99. Galoisian and qualitative approaches to linear Polyanin-Zaitsev vector fields
  100. Weak group inverse
  101. Infinite growth of solutions of second order complex differential equation
  102. Semi-Hurewicz-Type properties in ditopological texture spaces
  103. Chaos and bifurcation in the controlled chaotic system
  104. Translatability and translatable semigroups
  105. Sharp bounds for partition dimension of generalized Möbius ladders
  106. Uniqueness theorems for L-functions in the extended Selberg class
  107. An effective algorithm for globally solving quadratic programs using parametric linearization technique
  108. Bounds of Strong EMT Strength for certain Subdivision of Star and Bistar
  109. On categorical aspects of S -quantales
  110. On the algebraicity of coefficients of half-integral weight mock modular forms
  111. Dunkl analogue of SzĂĄsz-mirakjan operators of blending type
  112. Majorization, “useful” Csiszár divergence and “useful” Zipf-Mandelbrot law
  113. Global stability of a distributed delayed viral model with general incidence rate
  114. Analyzing a generalized pest-natural enemy model with nonlinear impulsive control
  115. Boundary value problems of a discrete generalized beam equation via variational methods
  116. Common fixed point theorem of six self-mappings in Menger spaces using (CLRST) property
  117. Periodic and subharmonic solutions for a 2nth-order p-Laplacian difference equation containing both advances and retardations
  118. Spectrum of free-form Sudoku graphs
  119. Regularity of fuzzy convergence spaces
  120. The well-posedness of solution to a compressible non-Newtonian fluid with self-gravitational potential
  121. On further refinements for Young inequalities
  122. Pretty good state transfer on 1-sum of star graphs
  123. On a conjecture about generalized Q-recurrence
  124. Univariate approximating schemes and their non-tensor product generalization
  125. Multi-term fractional differential equations with nonlocal boundary conditions
  126. Homoclinic and heteroclinic solutions to a hepatitis C evolution model
  127. Regularity of one-sided multilinear fractional maximal functions
  128. Galois connections between sets of paths and closure operators in simple graphs
  129. KGSA: A Gravitational Search Algorithm for Multimodal Optimization based on K-Means Niching Technique and a Novel Elitism Strategy
  130. Ξ-type Calderón-Zygmund Operators and Commutators in Variable Exponents Herz space
  131. An integral that counts the zeros of a function
  132. On rough sets induced by fuzzy relations approach in semigroups
  133. Computational uncertainty quantification for random non-autonomous second order linear differential equations via adapted gPC: a comparative case study with random Fröbenius method and Monte Carlo simulation
  134. The fourth order strongly noncanonical operators
  135. Topical Issue on Cyber-security Mathematics
  136. Review of Cryptographic Schemes applied to Remote Electronic Voting systems: remaining challenges and the upcoming post-quantum paradigm
  137. Linearity in decimation-based generators: an improved cryptanalysis on the shrinking generator
  138. On dynamic network security: A random decentering algorithm on graphs
Downloaded on 6.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/math-2018-0087/html