Testing for Spatial Lag Effects in Varying Coefficient Spatial Autoregressive Models

Shuang Guo; Chuanhua Wei

doi:10.1515/JSSI-2015-0561

Published/Copyright: December 25, 2015

From the journal Journal of Systems Science and Information Volume 3 Issue 6

Abstract

This paper is concerned with hypothesis testing in varying coefficient spatial autoregressive models. Based on the profile likelihood estimation procedure, a profile generalized likelihood ratio test is proposed to test for spatial lag effects, and a residual-based bootstrap procedure is used to derive the p-value of the test. Simulations are conducted to assess the performance of the test: the estimated sizes are close to the nominal level, and the power increases rapidly as the spatial lag parameter moves away from zero.

Keywords: varying coefficient spatial autoregressive model; profile likelihood estimation; profile generalized likelihood ratio test; local linear method; bootstrap

1 Introduction

In the last three decades, spatial autoregressive models, a popular spatial econometric method, have received much attention in the literature. For a long time, however, the main theory centered on parametric models in which the relationship between the response and the covariates is assumed to be linear. In recent years, many useful semiparametric spatial autoregressive models have been proposed to relax traditional parametric models. Su and Jin[1] proposed a partially linear spatial autoregressive model and studied the properties of the profile quasi-maximum likelihood estimators for the parameters in the model. Li and Mei[2] proposed a statistical test procedure to check a polynomial relationship of the non-parametric component in partially linear spatial autoregressive models. Su[3] studied a nonparametric spatial autoregressive model in which the spatially lagged response variable enters the model linearly while the covariates enter the model nonparametrically.

Like parametric models, semiparametric models have various forms. An alternative approach to relaxing the conditions imposed on traditional parametric models and exploring the hidden structure is the class of varying coefficient models, which were introduced by Cleveland et al.[4] and popularized by Hastie and Tibshirani[5]. The varying coefficient model is a useful extension of the linear regression model that allows all the regression coefficients to vary as unknown functions of other factors. Due to its flexibility, the varying coefficient model has been studied in many different contexts and has been successfully applied to nonlinear time series analysis, longitudinal and functional data analysis, and time-varying models in finance; see Fan and Zhang[6] for a comprehensive survey. In this paper, we consider the following varying coefficient spatial autoregressive model, which combines the spatial autoregressive model and the varying coefficient regression model:

$$
Y_i = \rho \sum_{j=1}^{n} w_{ij} Y_j + \mathbf{X}_i^T \alpha(U_i) + \varepsilon_i, \quad i = 1, 2, \cdots, n \tag{1}
$$

where the Y_i's are responses, X_i = (X_i₁, X_i₂, · · ·, X_ip)^T and U_i are the associated covariates, and α(·) = (α₁(·), α₂(·), · · ·, α_p(·))^T is a p-dimensional vector of unknown functions. W = (w_ij), 1 ≤ i, j ≤ n, is a specified n × n spatial weight matrix, and the ε_i's are independent and identically distributed random errors with zero mean and finite variance σ². Obviously, when α(·) = α is constant, model (1) becomes the standard spatial autoregressive model, which was studied by Cliff and Ord[7], Anselin[8], Kelejian and Prucha[9] and Lee[10,11]. When ρ = 0, model (1) reduces to the varying coefficient model.

For model (1), Li and Chen[12] proposed a profile likelihood approach based on the local linear method to estimate the unknown coefficient functions α(·) and the spatial lag parameter ρ. An important question for spatial autoregressive models is whether spatial effects exist at all. This leads to the following testing problem

$$
H_0: \rho = 0 \quad \text{vs.} \quad H_1: \rho \neq 0 \tag{2}
$$

For the standard linear spatial autoregressive model, the likelihood ratio test and Rao's score (Lagrange multiplier) test can be applied to solve this problem; details can be found in Anselin[8]. However, the testing problem (2) sets one semiparametric hypothesis against another, so many traditional tests cannot be applied directly. For this kind of testing problem, Fan and Huang[13] proposed a profile generalized likelihood ratio (PGLR) test based on the partially linear varying coefficient model. We are thus motivated to extend the PGLR test procedure to the testing problem (2) for model (1).

The rest of this paper is organized as follows. In Section 2, the profile maximum likelihood method for fitting the varying coefficient spatial autoregressive model is briefly described to facilitate the subsequent discussion. In Section 3, the test statistic is proposed and a residual-based bootstrap procedure is suggested to derive the p-value of the test. Simulations are conducted in Section 4 to examine the finite sample performance of the proposed test procedure. A conclusion is presented in Section 5.

2 Profile Likelihood Estimation Procedure

To construct the test statistic, we first introduce the profile likelihood estimation approach for model (1) proposed by Li and Chen[12]. We work in matrix notation. Denote

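$$
\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
\mathbf{X} = \begin{pmatrix} \mathbf{X}_1^T \\ \mathbf{X}_2^T \\ \vdots \\ \mathbf{X}_n^T \end{pmatrix}, \quad
\mathbf{M} = \begin{pmatrix} \mathbf{X}_1^T \alpha(U_1) \\ \mathbf{X}_2^T \alpha(U_2) \\ \vdots \\ \mathbf{X}_n^T \alpha(U_n) \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
$$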

Then model (1) can be written as

$$
\mathbf{Y} = \rho \mathbf{W} \mathbf{Y} + \mathbf{M} + \varepsilon \tag{3}
$$
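The matrix form also shows how a response vector can be generated for given ρ, W, M and ε: provided I_n − ρW is invertible, Y solves the linear system (I_n − ρW)Y = M + ε. The following is a minimal sketch of that step, not code from the paper; the function name `simulate_sar_response` and the three-unit weight matrix are illustrative assumptions.

```python
import numpy as np

def simulate_sar_response(rho, W, M, eps):
    """Solve (I_n - rho W) Y = M + eps for Y (assumes the matrix is invertible)."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    return np.linalg.solve(A, M + eps)

# Toy example: three units on a line with row-normalised weights (an assumption).
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
M = np.array([1.0, 2.0, 3.0])
eps = rng.normal(size=3)
Y = simulate_sar_response(0.1, W, M, eps)

# Y satisfies the structural equation Y = rho W Y + M + eps up to rounding.
residual_check = Y - (0.1 * W @ Y + M + eps)
```

Solving the system is generally preferable to forming the inverse of I_n − ρW explicitly.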

Assume that ε ∼ N(0, σ²I_n) and let θ = (ρ, σ²)^T. The log-likelihood function of model (3) is then

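$$
\log L_n(\mathbf{Y} \mid \theta, \alpha(U_1), \alpha(U_2), \cdots, \alpha(U_n)) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) + \log|\mathbf{I}_n - \rho\mathbf{W}| - \frac{(\mathbf{Y} - \rho\mathbf{W}\mathbf{Y} - \mathbf{M})^T(\mathbf{Y} - \rho\mathbf{W}\mathbf{Y} - \mathbf{M})}{2\sigma^2} \tag{4}
$$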

where I_n is the identity matrix of order n.
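As an illustration, the log-likelihood (4) can be evaluated numerically as below. This is a hedged sketch rather than the authors' implementation: the function name `sar_log_likelihood` is hypothetical, and the determinant term is taken as the log of the absolute determinant via `numpy.linalg.slogdet`.

```python
import numpy as np

def sar_log_likelihood(Y, W, M, rho, sigma2):
    """Evaluate the Gaussian log-likelihood (4) for given rho, sigma2 and M."""
    n = Y.shape[0]
    A = np.eye(n) - rho * W
    sign, logdet = np.linalg.slogdet(A)   # logdet = log|det(I_n - rho W)|
    if sign == 0:
        raise ValueError("I_n - rho W is singular")
    resid = A @ Y - M                     # Y - rho W Y - M
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * n * np.log(sigma2)
            + logdet
            - resid @ resid / (2 * sigma2))

# With rho = 0 the expression reduces to an ordinary normal log-likelihood.
Y_demo = np.array([1.0, 2.0])
ll_demo = sar_log_likelihood(Y_demo, np.zeros((2, 2)), np.zeros(2), 0.0, 1.0)
```

Using `slogdet` instead of `log(det(...))` avoids overflow or underflow of the determinant for large n.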

If the parameter ρ is known, then model (1) can be written as

$$
Y_i^* = \alpha_1(U_i) X_{i1} + \alpha_2(U_i) X_{i2} + \cdots + \alpha_p(U_i) X_{ip} + \varepsilon_i \tag{5}
$$

where $(Y_1^*, Y_2^*, \cdots, Y_n^*)^T = \mathbf{Y} - \rho\mathbf{W}\mathbf{Y} = (\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}$. This transforms the varying coefficient spatial autoregressive model (1) into the standard varying coefficient model (5). Now, we apply a local linear regression technique to estimate the varying coefficient functions {α_j(·), j = 1, 2, · · ·, p} in model (5). For u in a small neighborhood of u₀, one can approximate α_j(·) locally by a linear function

$$
\alpha_j(u) \approx \alpha_j(u_0) + \alpha_j'(u_0)(u - u_0), \quad j = 1, 2, \cdots, p
$$

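This leads to the following weighted local least-squares problem: find $\alpha_j(u_0)$ and $\alpha_j'(u_0)$, j = 1, 2, · · ·, p, to minimize

$$
\sum_{i=1}^{n} \Big[ Y_i^* - \sum_{j=1}^{p} \{\alpha_j(u_0) + \alpha_j'(u_0)(U_i - u_0)\} X_{ij} \Big]^2 K_h(U_i - u_0) \tag{6}
$$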

where K is a kernel function, h is a bandwidth and K_h(·) = K(·/h)/h.

Let

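$$
\mathbf{D}_{u_0} = \begin{pmatrix}
\mathbf{X}_1^T & (U_1 - u_0)\mathbf{X}_1^T \\
\mathbf{X}_2^T & (U_2 - u_0)\mathbf{X}_2^T \\
\vdots & \vdots \\
\mathbf{X}_n^T & (U_n - u_0)\mathbf{X}_n^T
\end{pmatrix}, \quad
\mathbf{S} = \begin{pmatrix}
(\mathbf{X}_1^T \ \ \mathbf{0}_{1 \times p}) \{\mathbf{D}_{U_1}^T \mathbf{K}_{U_1} \mathbf{D}_{U_1}\}^{-1} \mathbf{D}_{U_1}^T \mathbf{K}_{U_1} \\
(\mathbf{X}_2^T \ \ \mathbf{0}_{1 \times p}) \{\mathbf{D}_{U_2}^T \mathbf{K}_{U_2} \mathbf{D}_{U_2}\}^{-1} \mathbf{D}_{U_2}^T \mathbf{K}_{U_2} \\
\vdots \\
(\mathbf{X}_n^T \ \ \mathbf{0}_{1 \times p}) \{\mathbf{D}_{U_n}^T \mathbf{K}_{U_n} \mathbf{D}_{U_n}\}^{-1} \mathbf{D}_{U_n}^T \mathbf{K}_{U_n}
\end{pmatrix}
$$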

where $\mathbf{0}_{1 \times p}$ is the 1 × p zero matrix, and

$$
\mathbf{K}_{u_0} = \mathrm{diag}\{K_h(U_1 - u_0), K_h(U_2 - u_0), \cdots, K_h(U_n - u_0)\}.
$$

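The solution of problem (6) is given by

$$
\big(\hat\alpha_1(u_0), \cdots, \hat\alpha_p(u_0), \hat\alpha_1'(u_0), \cdots, \hat\alpha_p'(u_0)\big)^T = \{\mathbf{D}_{u_0}^T \mathbf{K}_{u_0} \mathbf{D}_{u_0}\}^{-1} \mathbf{D}_{u_0}^T \mathbf{K}_{u_0} (\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}.
$$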

Then, we have

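$$
\hat\alpha(u_0) = (\mathbf{I}_p \ \ \mathbf{0}_p) \{\mathbf{D}_{u_0}^T \mathbf{K}_{u_0} \mathbf{D}_{u_0}\}^{-1} \mathbf{D}_{u_0}^T \mathbf{K}_{u_0} (\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y} \tag{7}
$$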

where 0_p is the p × p zero matrix.
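The estimator (7) is a weighted least-squares fit at u₀ followed by selecting the first p coefficients. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the Gaussian kernel is used, and `Y_star` stands for the transformed response (I_n − ρW)Y.

```python
import numpy as np

def gaussian_kernel(z):
    """Standard normal density, used here as the kernel K."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)

def local_linear_coef(u0, U, X, Y_star, h):
    """Local linear estimate of alpha(u0): first p entries of the WLS solution."""
    n, p = X.shape
    D = np.hstack([X, (U - u0)[:, None] * X])   # n x 2p design matrix D_{u0}
    k = gaussian_kernel((U - u0) / h) / h        # kernel weights K_h(U_i - u0)
    DtK = D.T * k                                # D^T K without forming diag(k)
    beta = np.linalg.solve(DtK @ D, DtK @ Y_star)
    return beta[:p]                              # (I_p, 0_p) selects alpha_hat(u0)

# Noise-free check: coefficients that are exactly linear in u are recovered
# up to rounding error, because the local linear approximation is then exact.
rng = np.random.default_rng(1)
U = rng.uniform(0, 1, 60)
X = rng.normal(size=(60, 2))
Y_star = (1 + 2 * U) * X[:, 0] + (3 - U) * X[:, 1]
alpha_hat = local_linear_coef(0.4, U, X, Y_star, h=0.2)
```

Multiplying the columns of D^T by the weight vector avoids building the n × n diagonal matrix K.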

Taking u₀ to be each of U₁, U₂, · · ·, U_n in turn, we obtain the estimators α̂(U_i), i = 1, 2, · · ·, n. We can then define the estimator of M as

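$$
\bar{\mathbf{M}} = \big(\mathbf{X}_1^T \hat\alpha(U_1), \mathbf{X}_2^T \hat\alpha(U_2), \cdots, \mathbf{X}_n^T \hat\alpha(U_n)\big)^T = \mathbf{S}(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y} \tag{8}
$$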
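The smoothing matrix S depends only on the covariates, the kernel and the bandwidth, so it can be built once and reused for every value of ρ. A minimal sketch follows, assuming a Gaussian kernel; the function name `smoother_matrix` is illustrative.

```python
import numpy as np

def smoother_matrix(U, X, h):
    """Build the n x n local linear smoothing matrix S row by row."""
    n, p = X.shape
    S = np.zeros((n, n))
    for i in range(n):
        d = U - U[i]                                   # U_j - U_i for all j
        D = np.hstack([X, d[:, None] * X])             # design matrix D_{U_i}
        k = np.exp(-0.5 * (d / h) ** 2) / (np.sqrt(2 * np.pi) * h)
        DtK = D.T * k
        H = np.linalg.solve(DtK @ D, DtK)              # {D^T K D}^{-1} D^T K, 2p x n
        S[i] = X[i] @ H[:p]                            # (X_i^T, 0_{1xp}) H
    return S
```

With S in hand, the fitted values for any ρ are `S @ ((np.eye(n) - rho * W) @ Y)`. The loop solves n small 2p × 2p systems, which is adequate for grid sizes such as those in the simulation study.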

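Replacing M in (4) by M̄ and noting that $\mathbf{Y} - \rho\mathbf{W}\mathbf{Y} - \bar{\mathbf{M}} = (\mathbf{I}_n - \mathbf{S})(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}$, we obtain the following profile log-likelihood function

$$
\log L_n(\mathbf{Y} \mid \theta) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) + \log|\mathbf{I}_n - \rho\mathbf{W}| - \frac{\big[(\mathbf{I}_n - \mathbf{S})(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}\big]^T \big[(\mathbf{I}_n - \mathbf{S})(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}\big]}{2\sigma^2} \tag{9}
$$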

Given ρ, by differentiating log L_n(Y | θ) with respect to σ², we obtain the following equation:

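$$
\frac{\partial \log L_n(\mathbf{Y} \mid \theta)}{\partial \sigma^2} = 0 \ \Rightarrow \ n\sigma^2 = \big[(\mathbf{I}_n - \mathbf{S})(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}\big]^T \big[(\mathbf{I}_n - \mathbf{S})(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y}\big] \tag{10}
$$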

Then, the profile maximum likelihood estimator of σ² can be obtained as

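$$
\hat\sigma^2(\rho) = \frac{1}{n} \mathbf{Y}^T (\mathbf{I}_n - \rho\mathbf{W})^T (\mathbf{I}_n - \mathbf{S})^T (\mathbf{I}_n - \mathbf{S})(\mathbf{I}_n - \rho\mathbf{W})\mathbf{Y} \tag{11}
$$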

Furthermore, the concentrated log-likelihood function of ρ is

$$
\log L_n(\rho) = -\frac{n}{2}[\log(2\pi) + 1] - \frac{n}{2}\log[\hat\sigma^2(\rho)] + \log|\mathbf{I}_n - \rho\mathbf{W}| \tag{12}
$$
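The constant 1 inside the first bracket of (12) comes from the quadratic term of the profile log-likelihood. As a short check, write e(ρ) = (I_n − S)(I_n − ρW)Y (notation introduced here for brevity), so that (11) reads σ̂²(ρ) = e(ρ)^T e(ρ)/n. Substituting σ² = σ̂²(ρ) into the profile log-likelihood gives:

```latex
\begin{aligned}
-\frac{e(\rho)^T e(\rho)}{2\hat\sigma^2(\rho)}
  &= -\frac{n\hat\sigma^2(\rho)}{2\hat\sigma^2(\rho)} = -\frac{n}{2}, \\
\log L_n(\rho)
  &= -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log[\hat\sigma^2(\rho)]
     + \log|\mathbf{I}_n - \rho\mathbf{W}| - \frac{n}{2} \\
  &= -\frac{n}{2}[\log(2\pi) + 1] - \frac{n}{2}\log[\hat\sigma^2(\rho)]
     + \log|\mathbf{I}_n - \rho\mathbf{W}|.
\end{aligned}
```

Only the last two terms depend on ρ, so maximizing the concentrated log-likelihood trades off the fitted residual variance against the log-determinant term.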

Maximizing log L_n(ρ) leads to the estimator ρ̂ of ρ. Then, we can define the final estimator of σ² as σ̂² = σ̂²(ρ̂). Substituting ρ̂ into (8), we can obtain the estimator of M as

$$
\hat{\mathbf{M}} = \mathbf{S}(\mathbf{I}_n - \hat\rho\mathbf{W})\mathbf{Y} \tag{13}
$$

Finally, the residual vector is

$$
\hat\varepsilon = \mathbf{Y} - \hat\rho\mathbf{W}\mathbf{Y} - \hat{\mathbf{M}} \tag{14}
$$
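The estimation steps of this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the maximization over ρ is done by a simple grid search, the default search range (−0.9, 0.9) is a hypothetical choice that suits row-normalised weight matrices, and the function names are illustrative.

```python
import numpy as np

def concentrated_loglik(rho, Y, W, S):
    """Concentrated log-likelihood of rho for a given smoothing matrix S."""
    n = Y.shape[0]
    y_star = Y - rho * (W @ Y)            # (I_n - rho W) Y
    e = y_star - S @ y_star               # (I_n - S)(I_n - rho W) Y
    sigma2 = e @ e / n                    # profile estimator of sigma^2
    _, logdet = np.linalg.slogdet(np.eye(n) - rho * W)
    return -0.5 * n * (np.log(2 * np.pi) + 1) - 0.5 * n * np.log(sigma2) + logdet

def profile_mle_rho(Y, W, S, grid=None):
    """Grid-search maximiser of the concentrated log-likelihood."""
    if grid is None:
        grid = np.linspace(-0.9, 0.9, 181)    # hypothetical search range
    values = [concentrated_loglik(r, Y, W, S) for r in grid]
    rho_hat = float(grid[int(np.argmax(values))])
    y_star = Y - rho_hat * (W @ Y)
    resid = y_star - S @ y_star               # residual vector as in (14)
    sigma2_hat = float(resid @ resid / Y.shape[0])
    return rho_hat, sigma2_hat, resid
```

A grid search is crude but transparent; a one-dimensional optimiser could replace it once the admissible range of ρ for the given W is known.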

3 Profile Generalized Likelihood Ratio Test

3.1 Construction of the Test Statistic

On one hand, under the alternative hypothesis H₁, we can obtain the profile log-likelihood as

$$
l(H_1) = -\frac{n}{2}[\log(2\pi) + 1] - \frac{n}{2}\log[\hat\sigma^2(\hat\rho)] + \log|\mathbf{I}_n - \hat\rho\mathbf{W}| \tag{15}
$$

On the other hand, if the null hypothesis H₀ is true, model (1) reduces to the following standard varying coefficient model

$$
Y_i = \mathbf{X}_i^T \alpha(U_i) + \varepsilon_i, \quad i = 1, 2, \cdots, n \tag{16}
$$

As a special case of model (3), we can obtain the profile log-likelihood of model (16) as

$$
l(H_0) = -\frac{n}{2}[\log(2\pi) + 1] - \frac{n}{2}\log(\tilde\sigma^2) \tag{17}
$$

with

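$$
\tilde\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \big[Y_i - \mathbf{X}_i^T \tilde\alpha(U_i)\big]^2, \quad
\tilde\alpha(U_i) = (\mathbf{I}_p \ \ \mathbf{0}_p) \{\mathbf{D}_{U_i}^T \mathbf{K}_{U_i} \mathbf{D}_{U_i}\}^{-1} \mathbf{D}_{U_i}^T \mathbf{K}_{U_i} \mathbf{Y}
$$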

Following Fan and Huang[13], we define the profile generalized likelihood ratio test statistic

$$
T = l(H_1) - l(H_0) = \frac{n}{2}\log\frac{\tilde\sigma^2}{\hat\sigma^2(\hat\rho)} + \log|\mathbf{I}_n - \hat\rho\mathbf{W}| \tag{18}
$$
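Once the two variance estimates and ρ̂ are available, the statistic (18) is a one-line computation. The sketch below is illustrative only; the function name `pglr_statistic` and the two-unit toy inputs are assumptions.

```python
import numpy as np

def pglr_statistic(sigma2_null, sigma2_alt, rho_hat, W):
    """Profile generalized likelihood ratio statistic T = l(H1) - l(H0)."""
    n = W.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) - rho_hat * W)
    return 0.5 * n * np.log(sigma2_null / sigma2_alt) + logdet

# Toy check with n = 2: det(I - 0.5 W) = 0.75, so T = log(2) + log(0.75) = log(1.5).
W_toy = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
T_toy = pglr_statistic(sigma2_null=2.0, sigma2_alt=1.0, rho_hat=0.5, W=W_toy)
```

Note that when ρ̂ = 0 the log-determinant term vanishes and T reduces to the variance-ratio term.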

Intuitively, let RSS(H₀) = nσ̃² and RSS(H₁) = nσ̂²(ρ̂) denote the residual sums of squares under H₀ and H₁. If H₀ is true, there should be no significant difference between RSS(H₀) and RSS(H₁). Otherwise, RSS(H₀) – RSS(H₁) will tend to take a large positive value, since RSS(H₀) should become systematically larger than RSS(H₁) under H₁. Hence, a large value of the test statistic T indicates that the null hypothesis should be rejected. If we denote the observed value of T by t, the p-value of the test is

$$
p = P_{H_0}(T \geq t) \tag{19}
$$

where $P_{H_0}(\cdot)$ refers to the probability computed under the null hypothesis H₀. For a given significance level a, reject H₀ if p < a; otherwise, do not reject H₀.

3.2 Calculation of the p-Value by the Residual Based Bootstrap Approach

For the proposed test statistic T, it is difficult to obtain the asymptotic null distribution because of the presence of the spatial lag term in model (1). For problems of this kind, many researchers have suggested using the bootstrap to obtain the p-value (Cai et al.[14], Fan and Jiang[15], Hall and Hart[16], Herwartz and Xu[17]). In the following, we propose a residual-based bootstrap procedure to derive the p-value of the test.

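Step 1 Based on the data set $\{Y_i, \mathbf{X}_i, U_i\}_{i=1}^{n}$, under H₁, compute the residual vector $\hat\varepsilon_1 = (\hat\varepsilon_{11}, \hat\varepsilon_{21}, \cdots, \hat\varepsilon_{n1})$ shown in (14) and center it to obtain $\hat\varepsilon = (\hat\varepsilon_{11} - \bar{\hat\varepsilon}_1, \hat\varepsilon_{21} - \bar{\hat\varepsilon}_1, \cdots, \hat\varepsilon_{n1} - \bar{\hat\varepsilon}_1)$, in which $\bar{\hat\varepsilon}_1 = \frac{1}{n}\sum_{i=1}^{n} \hat\varepsilon_{i1}$. Furthermore, with the estimation results under both H₀ and H₁, compute the observed value t of the test statistic T by (18).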

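Step 2 Generate the bootstrap residuals $\hat\varepsilon^* = (\hat\varepsilon_1^*, \hat\varepsilon_2^*, \cdots, \hat\varepsilon_n^*)$ from the empirical distribution function of ε̂. Define

$$
Y_i^* = \mathbf{X}_i^T \tilde\alpha(U_i) + \hat\varepsilon_i^*
$$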

Step 3 Calculate the bootstrap test statistic $T^*$ based on the data set $\{Y_i^*, \mathbf{X}_i, U_i\}_{i=1}^{n}$.

Step 4 Repeat Steps 2 and 3 k times and obtain a bootstrap sample of the test statistic T as $T_1^*, T_2^*, \cdots, T_k^*$. The p-value is then estimated by

$$
\hat p = \frac{\sharp\{T_i^* : T_i^* \geq t\}}{k}
$$

where t is the observed value of the test statistic T obtained in Step 1 and ♯A denotes the number of the elements in set A.

For the sake of simplicity, we use the same bandwidth in calculating $T_i^*$ as in calculating t.
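Steps 1 to 4 can be sketched generically as below. This is a hedged illustration rather than the authors' implementation: `statistic` is a hypothetical user-supplied callable that refits the model under H₀ and H₁ on a bootstrap response vector and returns T, and `fitted_null` stands for the vector of null fitted values X_i^T α̃(U_i).

```python
import numpy as np

def bootstrap_p_value(t_obs, fitted_null, resid_alt, statistic, k=500, seed=0):
    """Residual-based bootstrap estimate of P(T >= t_obs) under H0."""
    rng = np.random.default_rng(seed)
    centred = resid_alt - resid_alt.mean()              # Step 1: centre the H1 residuals
    t_star = np.empty(k)
    for b in range(k):
        # Step 2: resample residuals with replacement and rebuild the response under H0
        eps_star = rng.choice(centred, size=centred.size, replace=True)
        y_star = fitted_null + eps_star
        # Step 3: recompute the test statistic on the bootstrap data
        t_star[b] = statistic(y_star)
    # Step 4: proportion of bootstrap statistics at least as large as t_obs
    return float(np.mean(t_star >= t_obs))
```

Passing the statistic as a callable keeps the resampling logic separate from the estimation code, so the same bandwidth and smoothing matrix can be reused across bootstrap replications.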

4 Simulation Studies

In this section, we conduct some simulations to examine the performance of the proposed test procedure.

Assume that the observations are collected from a uniform, two-dimensional grid consisting of m × m lattice points with unit distance between any two neighboring points along the horizontal and vertical axes. These m² points are arranged in an orthogonal coordinate system. We generate the spatial weight matrix W according to the principle of Rook contiguity.
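A Rook contiguity matrix on an m × m lattice links each point to its horizontal and vertical neighbors only. The sketch below shows one way to build it; whether the rows are normalised to sum to one is not stated in the text, so the `row_normalize` flag is an assumption, and the function name is illustrative.

```python
import numpy as np

def rook_weights(m, row_normalize=True):
    """Rook contiguity weights for an m x m lattice (m >= 2), points numbered row by row."""
    n = m * m
    W = np.zeros((n, n))
    for r in range(m):
        for c in range(m):
            i = r * m + c
            # Neighbours share an edge: up, down, left, right.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < m and 0 <= cc < m:
                    W[i, rr * m + cc] = 1.0
    if row_normalize:
        W /= W.sum(axis=1, keepdims=True)   # each row sums to one (assumed convention)
    return W

W_binary = rook_weights(3, row_normalize=False)
# On a 3 x 3 lattice, corner points have 2 neighbours and the centre point has 4.
```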

The data are generated from the following varying coefficient spatial autoregressive model

$$
Y_i = \rho \sum_{j=1}^{n} w_{ij} Y_j + \alpha_1(u_i) x_{i1} + \alpha_2(u_i) x_{i2} + \varepsilon_i
$$

where $x_{i1} \sim U(-2, 2)$, $x_{i2} \sim N(1, 1)$, $u_i \sim U(0, 1)$, and

$$
\alpha_1(u_i) = \sin(2\pi u_i) + 1, \qquad \alpha_2(u_i) = 2e^{-2(2u_i - 1)^2} + 3u_i.
$$

The spatial lag parameter ρ took each of the values in the set {–0.15, –0.1, –0.05, 0, 0.05, 0.1, 0.15}. To gain an idea of the effect of the error distribution on our results, we consider two error distributions: 1) ε_i ∼ N(0, 1), 2) ε_i ∼ U(−3, 3). In the simulation, the kernel function was taken to be the Gaussian kernel $K(z) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2}\right)$, and the smoothing parameter h was taken as $s_u n^{-1/5}$ for simplicity, where s_u is the sample standard deviation of u₁, u₂, · · ·, u_n.
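The data-generating process can be sketched as follows. Several details are assumptions made for illustration: α₂ is read as 2·exp(−2(2u − 1)²) + 3u, the sample standard deviation uses the n − 1 divisor, and the function names and seed handling are hypothetical.

```python
import numpy as np

def alpha1(u):
    return np.sin(2 * np.pi * u) + 1

def alpha2(u):
    # Assumed reading of the second coefficient function.
    return 2 * np.exp(-2 * (2 * u - 1) ** 2) + 3 * u

def generate_data(rho, W, error="normal", seed=0):
    """Draw one sample (Y, X, u) from the simulation design and the bandwidth h."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    u = rng.uniform(0, 1, n)
    X = np.column_stack([rng.uniform(-2, 2, n), rng.normal(1, 1, n)])
    if error == "normal":
        eps = rng.normal(0, 1, n)
    else:
        eps = rng.uniform(-3, 3, n)
    M = alpha1(u) * X[:, 0] + alpha2(u) * X[:, 1]
    Y = np.linalg.solve(np.eye(n) - rho * W, M + eps)   # reduced form of the model
    h = u.std(ddof=1) * n ** (-1 / 5)                   # bandwidth s_u * n^(-1/5)
    return Y, X, u, h
```

For n = 100 and u uniform on (0, 1), this rule of thumb gives a bandwidth of roughly 0.29 × 100^(−1/5) ≈ 0.11.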

For each value of ρ and each error distribution, with the bandwidth chosen as above, 1000 replications with n = 10² and n = 15² (that is, m = 10 and m = 15) were run, and the rejection rate at the significance level α = 0.05 was computed as the simulated power of the proposed test procedure. For each replication, the p-value was computed based on k = 500 bootstrap samples. The results are shown in Table 1.

Table 1

Rejection frequencies for H₀ : ρ = 0 at the significance level α = 0.05

ρ	ε ∼ N(0, 1), m = 10	ε ∼ N(0, 1), m = 15	ε ∼ U(−3, 3), m = 10	ε ∼ U(−3, 3), m = 15
–0.15	0.944	0.999	0.954	1.000
–0.1	0.692	0.964	0.691	0.956
–0.05	0.229	0.453	0.245	0.464
0	0.055	0.059	0.049	0.054
0.05	0.256	0.549	0.237	0.538
0.1	0.768	0.977	0.737	0.986
0.15	0.980	1.000	0.998	1.000

We summarize our findings as follows. When the null hypothesis is true (that is, ρ = 0), the rejection frequencies (estimated sizes) of the proposed test range from 0.049 to 0.059 and are thus close to the nominal level 0.05 under both error distributions. Under the alternative hypothesis, the rejection rate is robust to the type of error distribution and increases rapidly as ρ moves away from 0 and as the sample size grows.
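One way to make "close to the nominal level" concrete is to compare the estimated sizes in Table 1 with the Monte Carlo error expected from 1000 replications. The check below is an illustrative addition, assuming independent replications and a normal approximation to the binomial; the function name `mc_band` is hypothetical.

```python
import math

def mc_band(level=0.05, reps=1000, z=1.96):
    """Approximate 95% band for a rejection frequency whose true value is `level`."""
    se = math.sqrt(level * (1 - level) / reps)
    return level - z * se, level + z * se

# Estimated sizes from the rho = 0 row of Table 1.
sizes = {
    "N(0,1), m=10": 0.055,
    "N(0,1), m=15": 0.059,
    "U(-3,3), m=10": 0.049,
    "U(-3,3), m=15": 0.054,
}
lo, hi = mc_band()                      # roughly (0.0365, 0.0635)
inside = {name: lo <= s <= hi for name, s in sizes.items()}
```

All four estimated sizes fall inside this band, which is consistent with the test holding its nominal level in these designs.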

5 Conclusion

In this paper, a test approach is proposed to check the existence of spatial effects in varying coefficient spatial autoregressive models, in which a residual-based bootstrap procedure is suggested to derive the p-value of the test. The simulation experiment demonstrates that the proposed test performs satisfactorily.

Supported by Beijing Higher Education Young Elite Teacher Project (YETP1316), and the National Natural Science Foundation of China (11301565)

References

[1] Su L J, Jin S N. Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. Journal of Econometrics, 2010, 157(1): 18–33. doi:10.1016/j.jeconom.2009.10.033

[2] Li T Z, Mei C L. Testing a polynomial relationship of the non-parametric component in partially linear spatial autoregressive models. Papers in Regional Science, 2013, 92(3): 633–649. doi:10.1111/j.1435-5957.2012.00428.x

[3] Su L J. Semiparametric GMM estimation of spatial autoregressive models. Journal of Econometrics, 2012, 167(2): 543–560. doi:10.1016/j.jeconom.2011.09.034

[4] Cleveland W S, Grosse E, Shyu W M. Local regression models. In Statistical Models in S, Eds. by Chambers J M and Hastie T J. Wadsworth and Brooks, Pacific Grove, 1991, 309–376. doi:10.1201/9780203738535-8

[5] Hastie T, Tibshirani R. Varying-coefficient models (with discussion). Journal of the Royal Statistical Society Series B, 1993, 55(4): 757–796.

[6] Fan J Q, Zhang W Y. Statistical methods with varying coefficient models. Statistics and its Interface, 2008, 1: 179–195. doi:10.4310/SII.2008.v1.n1.a15

[7] Cliff A D, Ord J K. Spatial Autocorrelation. London: Pion Ltd., 1973.

[8] Anselin L. Spatial econometrics: Methods and models. Dordrecht: Kluwer Academic Publishers, 1988. doi:10.1007/978-94-015-7799-1

[9] Kelejian H H, Prucha I R. A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review, 1999, 40(2): 509–533. doi:10.1111/1468-2354.00027

[10] Lee L F. Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 2004, 72(6): 1899–1925. doi:10.1111/j.1468-0262.2004.00558.x

[11] Lee L F. GMM and 2SLS estimation of mixed regressive spatial autoregressive models. Journal of Econometrics, 2007, 137(2): 489–514. doi:10.1016/j.jeconom.2005.10.004

[12] Li K M, Chen J B. Profile maximum likelihood estimation of semiparametric varying coefficient spatial lag model. The Journal of Quantitative Technical Economics, 2013, 4: 85–98 (in Chinese).

[13] Fan J Q, Huang T. Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli, 2005, 11(6): 1031–1057. doi:10.3150/bj/1137421639

[14] Cai Z W, Fan J Q, Yao Q W. Functional-coefficient regression models for nonlinear time series. Journal of the American Statistical Association, 2000, 95(451): 941–956. doi:10.1080/01621459.2000.10474284

[15] Fan J Q, Jiang J C. Nonparametric inferences for additive models. Journal of the American Statistical Association, 2005, 100(471): 890–907. doi:10.1198/016214504000001439

[16] Hall P, Hart J D. Bootstrap test for difference between means in nonparametric regression. Journal of the American Statistical Association, 1990, 85(412): 1039–1049. doi:10.1080/01621459.1990.10474974

[17] Herwartz H, Xu F. A new approach to bootstrap inference in functional coefficient models. Computational Statistics and Data Analysis, 2009, 53(6): 2155–2167. doi:10.1016/j.csda.2008.09.014

Received: 2015-4-12

Accepted: 2015-7-22

Published Online: 2015-12-25
