
On the Use of the Helmert Transformation, and its Applications in Panel Data Econometrics

  • Gueorgui I. Kolev and Helmuts Āzacis
Published/Copyright: October 5, 2022

Abstract

We revisit the Helmert transformation, and provide a useful and simple derivation of the joint distribution of the sample mean and the sample variance in samples from independently and identically distributed normal random variables. Our derivation is distinguished by concreteness, very little abstractness, and should be appealing to beginning students of statistics, and to both beginning and advanced students of econometrics. We also highlight one fruitful application of the Helmert transformation in panel data econometrics. The Helmert transformation can be used to eliminate the fixed effects in the estimation of fixed effects models, and we briefly review this application of the transformation in the panel data context.

JEL Classification: B23; C16; C20

1 Introduction

The Helmert transformation is named after the German geodesist Friedrich Robert Helmert (1876) and has a long history of use in statistics (Sawkins 1940; Cramér 1946, p. 116; Kruskal 1946; Weatherburn 1961, p. 164; Brownlee 1965, p. 271; Rao 1973, p. 182, among others). One application (the application from now on, for brevity) of the Helmert transformation in statistics is to find the joint distribution of the sample mean and the sample variance calculated from a sample from a normal population. Although the cited references are authoritative, their presentations of the application are in our view often rather complicated, which inhibits readers’ understanding of the topic. We present our version of how the Helmert transformation works in the application, and our version should be particularly suitable and accessible to beginning undergraduate students in statistics, to undergraduate and postgraduate students in econometrics, and to students in quantitative social and political sciences that use econometrics.

For example, Cramér (1946, p. 116) and Weatherburn (1961, p. 164) both follow Sawkins (1940), and introduce in abstract terms an orthogonal linear transformation having certain properties. Instead, we think that explicitly stating the transformation, and then directly establishing its key properties, has pedagogical advantages. Rao (1973, p. 182) similarly uses an orthogonal matrix, i.e. a Helmert matrix, which he defines in terms of certain abstract properties. We think there is a pedagogical advantage in instead explicitly displaying one such Helmert matrix, so that the reader can visualize what is going on, and then proceeding to establish its properties and what exactly it does in the context of the application.

Upon presenting our treatment of the application, which of course uses the very same ideas of the Helmert transformation and the associated Helmert matrix as the sources mentioned above, we also briefly discuss the use of the Helmert transformation in panel data econometrics. In particular, the Helmert transformation can be employed to eliminate fixed effects in the estimation of fixed effects models. Our treatment thus has the advantage of being very concrete and explicit compared to the previous literature, and it advocates for more extensive use of the Helmert transformation. Unlike previous authors, we first explicitly state the Helmert transformation and display the Helmert matrix, and then proceed to prove the implications of applying the Helmert transformation in the application.

Our proof is closest to Brownlee (1965, p. 271), who starts with a formula for the sample variance and, using it, derives new variables that possess certain desired properties. This derivation of the new variables is fairly involved. We instead start by defining the new variables and then, using a simple induction argument, prove that these variables have the same variance as the original variables.

We use mathematical induction in our proof, and Stigler (1984) also uses mathematical induction in his proof, but the argument there is different. He first establishes the joint distribution of the sample mean and variance for a sample of just two observations and then via the induction argument, proves that this joint distribution extends to samples of larger size. Zehna (1991) asserts that Stigler’s proof is not completely rigorous, and relies three times throughout the proof on a faulty argument.

This article proceeds as follows. In Section 2 we introduce the Helmert transformation and derive the joint distribution of the sample mean and the sample variance. In Section 3 we introduce the Helmert matrix associated with the transformation, and provide some remarks on its properties. In Section 4 we suggest that the use of the Helmert transformation has been somewhat overlooked in the within estimation of one-way error component fixed effects models in panel data econometrics, and we briefly survey the extant applications of the Helmert transformation in the panel data context. The last section concludes.

2 The Helmert Transformation and the Joint Distribution of the Sample Mean and the Sample Variance in Samples from a Normal Population

We have a set of $T$ random variables $x_t$, $t = 1, 2, \ldots, T$, independently and identically distributed (i.i.d. from now on) as Normal($\mu$, $\sigma^2$), and we want to find the joint distribution of the sample mean $\bar{x}_T = \sum_{t=1}^{T} x_t/T$ and the sample variance $s^2 = \sum_{t=1}^{T} (x_t - \bar{x}_T)^2/(T-1)$.

Consider the Helmert transformation, which takes the set $x_t$, $t = 1, 2, \ldots, T$, and produces a new set of variables $z_t$, $t = 1, 2, \ldots, T$, as follows:

$z_1 = \bar{x}_T, \quad z_2 = (x_2 - x_1)\sqrt{1/2}, \quad z_3 = \big\{x_3 - [(x_2 + x_1)/2]\big\}\sqrt{2/3}, \quad \ldots,$

(1) $z_t = (x_t - \bar{x}_{t-1})\sqrt{(t-1)/t}, \quad \ldots, \quad z_T = (x_T - \bar{x}_{T-1})\sqrt{(T-1)/T},$

where $\bar{x}_{t-1} = \sum_{\tau=1}^{t-1} x_\tau/(t-1)$.
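For readers who prefer to see the transformation in code, the following short numpy sketch (our own illustration; the function name helmert_transform is ours, not from any library) implements Eq. (1) directly:

```python
import numpy as np

def helmert_transform(x):
    """Map x_1, ..., x_T to z_1, ..., z_T as defined in Eq. (1)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    z = np.empty(T)
    z[0] = x.mean()  # z_1 is the overall sample mean
    for t in range(2, T + 1):
        # z_t = (x_t - mean of x_1..x_{t-1}) * sqrt((t-1)/t)
        z[t - 1] = (x[t - 1] - x[:t - 1].mean()) * np.sqrt((t - 1) / t)
    return z

print(helmert_transform([1.0, 2.0, 4.0, 8.0]))
```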

The new variables $z_t$ have convenient properties. (In what follows, we liberally use the properties of the expectation E(⋅), variance Var(⋅), and covariance Cov(⋅, ⋅) operators, and we assume that the reader is familiar with these properties.)

Properties of the Helmert transformation:

  1. $z_t$, $t = 1, 2, \ldots, T$, are mutually uncorrelated: $\mathrm{Cov}(z_1, z_t) = \mathrm{Cov}\big(\bar{x}_T, (x_t - \bar{x}_{t-1})\sqrt{(t-1)/t}\big) = (1/T)\sqrt{(t-1)/t}\;\mathrm{Cov}\big(\sum_{\tau=1}^{T} x_\tau,\; x_t - \sum_{\tau=1}^{t-1} x_\tau/(t-1)\big) = (1/T)\sqrt{(t-1)/t}\,[\sigma^2 - (t-1)\sigma^2/(t-1)] = 0$ for $t = 2, \ldots, T$. And $\mathrm{Cov}(z_t, z_\theta) = \sqrt{(t-1)/t}\sqrt{(\theta-1)/\theta}\;\mathrm{Cov}\big(x_t - \sum_{\tau=1}^{t-1} x_\tau/(t-1),\; x_\theta - \sum_{\tau=1}^{\theta-1} x_\tau/(\theta-1)\big) = \sqrt{(t-1)/t}\sqrt{(\theta-1)/\theta}\,\big\{-\sigma^2/(\theta-1) + (t-1)\sigma^2/[(t-1)(\theta-1)]\big\} = 0$, where without loss of generality we take $\theta > t > 1$. (Property 1 does not require normality; it is sufficient that the variables are i.i.d.)

  2. $z_t$, $t = 1, 2, \ldots, T$, are linear combinations of $x_t$, $t = 1, 2, \ldots, T$ (which, by assumption, are jointly normally distributed), and hence $z_t$, $t = 1, 2, \ldots, T$, are jointly normally distributed as well.

  3. $\mathrm{E}\,z_1 = \mathrm{E}\,\bar{x}_T = \mu$; $\mathrm{E}\,z_t = \mathrm{E}\big[(x_t - \bar{x}_{t-1})\sqrt{(t-1)/t}\big] = \sqrt{(t-1)/t}\,(\mu - \mu) = 0$, for $t = 2, 3, \ldots, T$.

  4. $\mathrm{Var}\,z_1 = \sigma^2/T$, $\mathrm{Var}\,z_t = [\sigma^2 + \sigma^2/(t-1)]\,(t-1)/t = \sigma^2$, $t = 2, \ldots, T$. So $z_2, \ldots, z_T$ are homoskedastic.

  5. $\sum_{t=1}^{T} (x_t - \bar{x}_T)^2 = \sum_{t=2}^{T} z_t^2 = \sum_{t=2}^{T} (x_t - \bar{x}_{t-1})^2\,(t-1)/t$, which we prove in the theorem below.
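Properties 1–5 are easy to confirm numerically. The Monte Carlo sketch below (our own check, under the section's assumption of i.i.d. Normal($\mu$, $\sigma^2$) draws) transforms many simulated samples and inspects the resulting moments; it also spot-checks the sum-of-squares identity of Property 5, which the Auxiliary Theorem below proves exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, T, R = 3.0, 2.0, 5, 100_000  # R Monte Carlo replications

def helmert_transform(x):
    T = x.size
    z = np.empty(T)
    z[0] = x.mean()
    for t in range(2, T + 1):
        z[t - 1] = (x[t - 1] - x[:t - 1].mean()) * np.sqrt((t - 1) / t)
    return z

Z = np.array([helmert_transform(rng.normal(mu, sigma, T)) for _ in range(R)])

# Properties 1 and 4: covariance matrix ~ diag(sigma^2/T, sigma^2, ..., sigma^2)
print(np.cov(Z.T).round(2))
# Property 3: means ~ (mu, 0, ..., 0)
print(Z.mean(axis=0).round(2))

# Property 5: the identity holds exactly for any sample, not just on average
x = rng.normal(mu, sigma, T)
z = helmert_transform(x)
print(np.allclose(np.sum((x - x.mean()) ** 2), np.sum(z[1:] ** 2)))  # True
```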

An Auxiliary Theorem

Let $x_t$ be a scalar quantity observed over $t = 1, 2, \ldots, T$ periods. Then,

(2) $\sum_{t=1}^{T} (x_t - \bar{x}_T)^2 = \sum_{t=2}^{T} (x_t - \bar{x}_{t-1})^2\,(t-1)/t,$

where $\bar{x}_T = \sum_{t=1}^{T} x_t/T$ and $\bar{x}_{t-1} = \sum_{\tau=1}^{t-1} x_\tau/(t-1)$.

Proof of the Auxiliary Theorem by induction

For T = 2 we have

$\sum_{t=1}^{2} (x_t - \bar{x}_2)^2 = (x_1 - x_1/2 - x_2/2)^2 + (x_2 - x_1/2 - x_2/2)^2 = (x_2 - x_1)^2/2 = \sum_{t=2}^{2} (x_t - \bar{x}_{t-1})^2\,(t-1)/t,$

so the relationship indeed holds for T = 2.

Now we assume that the relationship holds for T, and demonstrate that if it holds for T, then it holds for T + 1 too. First,

$\sum_{t=2}^{T+1} (x_t - \bar{x}_{t-1})^2\,(t-1)/t = \sum_{t=2}^{T} (x_t - \bar{x}_{t-1})^2\,(t-1)/t + (x_{T+1} - \bar{x}_T)^2\,T/(T+1).$

Second,

$\sum_{t=1}^{T+1} (x_t - \bar{x}_{T+1})^2 = \sum_{t=1}^{T+1} x_t^2 - (T+1)\bar{x}_{T+1}^2 = \sum_{t=1}^{T} x_t^2 + x_{T+1}^2 - (T+1)\bar{x}_{T+1}^2 = \sum_{t=1}^{T} (x_t - \bar{x}_T)^2 + T\bar{x}_T^2 + x_{T+1}^2 - (T+1)\bar{x}_{T+1}^2.$

By the induction hypothesis

$\sum_{t=1}^{T} (x_t - \bar{x}_T)^2 = \sum_{t=2}^{T} (x_t - \bar{x}_{t-1})^2\,(t-1)/t.$

Therefore, what remains to be shown is that

$(x_{T+1} - \bar{x}_T)^2\,T/(T+1) = T\bar{x}_T^2 + x_{T+1}^2 - (T+1)\bar{x}_{T+1}^2.$

Rewrite

$(T+1)\bar{x}_{T+1}^2 = \Big(\sum_{t=1}^{T} x_t + x_{T+1}\Big)^2\big/(T+1) = (T\bar{x}_T + x_{T+1})^2/(T+1) = [T^2/(T+1)]\bar{x}_T^2 + 2[T/(T+1)]\bar{x}_T x_{T+1} + x_{T+1}^2/(T+1).$

Hence,

$T\bar{x}_T^2 + x_{T+1}^2 - (T+1)\bar{x}_{T+1}^2 = T\bar{x}_T^2 + x_{T+1}^2 - [T^2/(T+1)]\bar{x}_T^2 - 2[T/(T+1)]\bar{x}_T x_{T+1} - x_{T+1}^2/(T+1) = [T/(T+1)]\bar{x}_T^2 - 2[T/(T+1)]\bar{x}_T x_{T+1} + [T/(T+1)]x_{T+1}^2,$

which is indeed equal to $(x_{T+1} - \bar{x}_T)^2\,T/(T+1)$. □

We will now use the properties of the Helmert transformation to deduce the joint distribution of the sample mean and the sample variance in samples from a normal population. We use the following three facts. First, linear combinations of jointly normal random variables are themselves jointly normally distributed (Cramér 1946, p. 213; Weatherburn 1961, p. 57). Second, for a set of jointly normal random variables, if they are uncorrelated, they are independent as well (e.g. Lancaster 1959; David 2009). There is a subtle point here: the set of variables needs to be jointly normal, as one can construct counterexamples where the marginal distributions are normal but the joint distribution is not, the variables are uncorrelated, and yet they are not independent. Lancaster (1959) presents such a counterexample, and precisely states the conditions in his Theorem 1. Third, the sum of $k$ independent, squared, standard normal variables is distributed as $\chi^2(k)$, a distribution discovered and described by Helmert (1876). (Helmert (1876) discovered the $\chi^2$ distribution; however, he did not observe that the sample mean and the sample variance are independent, see David (2009).)

Main Theorem

The joint distribution of the sample mean and the sample variance in a sample of $T$ i.i.d. random variables $x_t$, $t = 1, 2, \ldots, T$, each $x_t$ distributed as Normal($\mu$, $\sigma^2$), has the following properties:

  1. The sample mean $\bar{x}_T$ and the sample variance $s^2$ are independently distributed.

  2. The sample mean $\bar{x}_T$ is distributed as Normal($\mu$, $\sigma^2/T$).

  3. $(T-1)s^2/\sigma^2$ is distributed as $\chi^2(T-1)$.

Proof of the Main Theorem

In the proof we will refer to the listed Properties of the Helmert transformation as Property plus the number under which the property was listed.

  1. The sample mean $\bar{x}_T$ is a function of $z_1$ only, and by Property 5, the sample variance $s^2$ is a function of $z_t$, $t = 2, \ldots, T$, only. By Property 1, $z_t$, $t = 1, 2, \ldots, T$, are uncorrelated, and by Property 2, $z_t$, $t = 1, 2, \ldots, T$, are jointly normally distributed. Because $z_t$, $t = 1, 2, \ldots, T$, are a set of uncorrelated and jointly normal random variables, they are a set of independent random variables as well (Lancaster 1959, Theorem 1). Therefore the independence of $\bar{x}_T$ (a function of $z_1$ only) and $s^2$ (a function of $z_2, z_3, \ldots, z_T$ only) follows because $z_1$ is independent of $z_2, z_3, \ldots, z_T$.

  2. The sample mean $\bar{x}_T$ is normally distributed by Property 2. Its mean is given in Property 3, and its variance is given in Property 4. Overall, $\bar{x}_T$ is distributed as Normal($\mu$, $\sigma^2/T$).

  3. $(T-1)s^2/\sigma^2 = \sum_{t=1}^{T} (x_t - \bar{x}_T)^2/\sigma^2 = \sum_{t=2}^{T} z_t^2/\sigma^2$, where the last equality follows by dividing both sides of Property 5 by $\sigma^2$. However, $\sum_{t=2}^{T} z_t^2/\sigma^2 = \sum_{t=2}^{T} (z_t/\sigma)^2$ is the sum of $T-1$ squared standard normal variables (by Properties 3 and 4, each $z_t/\sigma$, $t = 2, \ldots, T$, is standard normal), and hence is distributed as $\chi^2(T-1)$.

The Main Theorem was first proved by Fisher (1915, 1925), but Fisher’s “geometric arguments” are difficult to follow. What we have presented, albeit somewhat lengthy, is an elementary and concrete proof which requires very little abstract thinking. Our proof should be accessible to students with any moderately quantitative background, and should be much preferable for students who are not used to elaborate abstract mathematical thinking, such as beginning undergraduates in statistics, or both undergraduate and graduate students in econometrics. Overall, from the transformed set of variables $z_t$ it is very easy to deduce the joint distribution of the sample mean and the sample variance.
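A quick simulation (again our own illustration, not part of the proof) makes the three conclusions of the Main Theorem tangible: across replications the sample mean and variance are uncorrelated, the mean has the stated normal distribution, and the scaled variance matches the first two moments of a $\chi^2(T-1)$ variable:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, T, R = 1.0, 3.0, 10, 200_000

X = rng.normal(mu, sigma, size=(R, T))   # R independent samples of size T
xbar = X.mean(axis=1)
s2 = X.var(axis=1, ddof=1)               # sample variance with divisor T - 1

print(round(np.corrcoef(xbar, s2)[0, 1], 3))        # ~0, consistent with independence
print(round(xbar.mean(), 3), round(xbar.var(), 3))  # ~mu = 1 and ~sigma^2/T = 0.9

q = (T - 1) * s2 / sigma ** 2
print(round(q.mean(), 2), round(q.var(), 2))        # chi^2(9): mean ~9, variance ~18
```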

We finish this section by stating another property of the Helmert transformation, which we will use briefly in Section 4. Suppose there is another set of variables $y_t$, $t = 1, \ldots, T$. Then,

  6. $\sum_{t=1}^{T} (x_t - \bar{x}_T)(y_t - \bar{y}_T) = \sum_{t=2}^{T} (x_t - \bar{x}_{t-1})(y_t - \bar{y}_{t-1})\,(t-1)/t$, where $\bar{y}_T = \sum_{t=1}^{T} y_t/T$ and $\bar{y}_{t-1} = \sum_{\tau=1}^{t-1} y_\tau/(t-1)$.

The proof of this property follows the same steps as the proof of the Auxiliary Theorem and, therefore, is omitted. Note that if $y_t = x_t$ for all $t$, this property reduces to Property 5.
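As with Property 5, the cross-product identity is exact for any two series, which a short numerical check (our own, with arbitrary random inputs) confirms:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 7
x, y = rng.normal(size=T), rng.normal(size=T)

lhs = np.sum((x - x.mean()) * (y - y.mean()))
# zero-based index k corresponds to the paper's t = k + 1, weight (t-1)/t = k/(k+1)
rhs = sum((x[k] - x[:k].mean()) * (y[k] - y[:k].mean()) * k / (k + 1)
          for k in range(1, T))
print(np.isclose(lhs, rhs))  # True
```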

3 The Helmert Matrix

The previous section is self-contained, and using the Helmert transformation to deduce the joint distribution of the sample mean and the sample variance does not require any matrix algebra. However, if there is a need or desire to do so, one can also relate the Helmert transformation from the previous section to what Lancaster (1965) calls a Helmert matrix in the strict sense.

Consider the following $T \times T$ matrix:

$$H_o \equiv \begin{pmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ -1 & 1 & 0 & 0 & \cdots & 0 \\ -1/2 & -1/2 & 1 & 0 & \cdots & 0 \\ -1/3 & -1/3 & -1/3 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ -1/(T-1) & -1/(T-1) & -1/(T-1) & -1/(T-1) & \cdots & 1 \end{pmatrix}.$$

We can verify by direct multiplication that the rows of this matrix are orthogonal, that is, $H_o H_o'$ is a diagonal matrix. We can also consider the rescaled version of $H_o$,

$$H_m \equiv \mathrm{diag}\Big(1/T,\ \sqrt{1/2},\ \sqrt{2/3},\ \sqrt{3/4},\ \ldots,\ \sqrt{(T-1)/T}\Big) \cdot H_o,$$

where $\mathrm{diag}(v)$ is an operator that transforms a vector $v$ into a diagonal matrix, and the operation $\mathrm{diag}(v) \cdot H_o$ multiplies the $i$th row of the matrix $H_o$ by the $i$th element of the vector $v$. With the first element of $v$ chosen as $1/T$, we can see by direct multiplication that $H_m' H_m$ is symmetric, with one element repeated on the main diagonal and another element repeated everywhere off the main diagonal. $H_m H_m'$ is diagonal and is almost the identity matrix: only the upper-left element, which equals $1/T$, differs from 1; the rest of $H_m H_m'$ coincides with the identity matrix.

If we instead choose the first element of $v$ to be $1/\sqrt{T}$,

$$H_n \equiv \mathrm{diag}\Big(1/\sqrt{T},\ \sqrt{1/2},\ \sqrt{2/3},\ \sqrt{3/4},\ \ldots,\ \sqrt{(T-1)/T}\Big) \cdot H_o,$$

direct multiplication shows that $H_n' H_n$ and $H_n H_n'$ are both the identity matrix, and we would call such a matrix $H_n$ an orthonormal matrix.

We see that if we arrange the set $x_t$, $t = 1, 2, \ldots, T$, into a column vector $x \equiv (x_1, x_2, \ldots, x_T)'$ and choose the first element of $v$ to be $1/T$, we obtain the Helmert transformation from the previous section, $z = H_m x$, where $z \equiv (z_1, z_2, \ldots, z_T)'$. Because of the appealing aesthetics of $H_n$ with the first element of $v$ chosen as $1/\sqrt{T}$, i.e. a choice resulting in an orthonormal matrix, $H_n' H_n = H_n H_n' = I$, all authors we are aware of use this version of the Helmert matrix. However, for our application all we need is that $H_m H_m'$ be a diagonal matrix with all elements on the main diagonal below the first equal to 1. In this situation, the elements of the vector $z = H_m x$ are mutually uncorrelated and each element $z_t$, $t = 2, \ldots, T$, is homoskedastic. Therefore, for our application, choosing the first element of $v$ as $1/T$ serves perfectly well.
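The three matrices and their multiplication properties can be checked directly. The sketch below (our own construction for T = 5, following the definitions above) builds $H_o$, $H_m$, and $H_n$ and verifies that $H_o H_o'$ is diagonal, $H_m H_m' = \mathrm{diag}(1/T, 1, \ldots, 1)$, $H_n$ is orthonormal, and $z = H_m x$ reproduces Eq. (1):

```python
import numpy as np

T = 5
Ho = np.zeros((T, T))
Ho[0, :] = 1.0
for t in range(1, T):        # row t+1: -1/t in the first t columns, then 1, then zeros
    Ho[t, :t] = -1.0 / t
    Ho[t, t] = 1.0

scale = np.sqrt(np.arange(1, T) / np.arange(2, T + 1))      # sqrt((t-1)/t), t = 2..T
Hm = np.diag(np.concatenate(([1.0 / T], scale))) @ Ho
Hn = np.diag(np.concatenate(([1.0 / np.sqrt(T)], scale))) @ Ho

HoHo = Ho @ Ho.T
print(np.allclose(HoHo, np.diag(np.diag(HoHo))))            # rows of Ho are orthogonal
print(np.allclose(Hm @ Hm.T, np.diag([1.0 / T] + [1.0] * (T - 1))))
print(np.allclose(Hn @ Hn.T, np.eye(T)), np.allclose(Hn.T @ Hn, np.eye(T)))

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
print(Hm @ x)                # matches the z_t produced by Eq. (1)
```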

4 The Helmert Transformation in Panel Data Models

We have derived the joint distribution of the sample mean and the sample variance with particular focus on the simplicity and concreteness of the derivation, and on the use of the Helmert transformation. The Helmert transformation has found application in the “fixed effects” panel data model too. Consider the standard one-way “fixed effects” panel data model (e.g. Wooldridge 2010, Ch. 10; Hsiao 2014, Ch. 3)

(3) $y_{it} = x_{it}'\beta + \mu_i + \varepsilon_{it}, \quad i = 1, 2, \ldots, I, \quad t = 1, 2, \ldots, T,$

where the regressand $y_{it}$ is a scalar, the regressor vector $x_{it}$ is $K \times 1$, and $(y_{it}, x_{it})$ are i.i.d. across $i = 1, 2, \ldots, I$, i.e. the variables constitute an i.i.d. random sample in the cross section. The individual “fixed effects” $\mu_i$ (so called by convention) are time constant, random, and potentially correlated with the regressor vector. The idiosyncratic error term $\varepsilon_{it}$ is uncorrelated with the regressor $x_{j\tau}$ for all $t, \tau, i, j$, i.e. the regressor $x_{it}$ is strictly exogenous with respect to $\varepsilon_{it}$ conditional on the fixed effects, and $\varepsilon_{it}$ is i.i.d. in both the cross-sectional and the time-series dimensions.

Consistent estimation of the parameter vector $\beta$ in Eq. (3), under the assumptions that the fixed effects can be arbitrarily correlated with the regressors and that the regressors are strictly exogenous with respect to $\varepsilon_{it}$ conditional on the fixed effects, proceeds by eliminating the fixed effects. The conventional way of eliminating the fixed effects is the within transformation: first average Eq. (3) across time to obtain $\bar{y}_{iT} = \bar{x}_{iT}'\beta + \mu_i + \bar{\varepsilon}_{iT}$, where $\bar{y}_{iT} = \sum_{t=1}^{T} y_{it}/T$, $\bar{x}_{iT} = \sum_{t=1}^{T} x_{it}/T$ and $\bar{\varepsilon}_{iT} = \sum_{t=1}^{T} \varepsilon_{it}/T$. Then we subtract this averaged equation from Eq. (3) to eliminate the fixed effects and to obtain the estimating equation

(4) $y_{it} - \bar{y}_{iT} = (x_{it} - \bar{x}_{iT})'\beta + (\varepsilon_{it} - \bar{\varepsilon}_{iT}), \quad i = 1, 2, \ldots, I, \quad t = 1, 2, \ldots, T.$

Finally, we estimate Eq. (4) by ordinary least squares (OLS) over the IT pooled observations to obtain the within estimator

(5) $\hat{\beta} = \Big[\sum_{i=1}^{I}\sum_{t=1}^{T} (x_{it} - \bar{x}_{iT})(x_{it} - \bar{x}_{iT})'\Big]^{-1} \sum_{i=1}^{I}\sum_{t=1}^{T} (x_{it} - \bar{x}_{iT})(y_{it} - \bar{y}_{iT}).$
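To make the estimator in Eq. (5) concrete, here is a self-contained simulation sketch (our own; the design, sample sizes, and variable names are illustrative assumptions) that generates a panel with fixed effects correlated with the regressors and computes the within estimator:

```python
import numpy as np

rng = np.random.default_rng(3)
I, T, K = 500, 6, 2
beta = np.array([1.5, -0.7])

mu = rng.normal(size=(I, 1))                     # fixed effects
x = rng.normal(size=(I, T, K)) + mu[:, :, None]  # regressors correlated with mu
eps = rng.normal(size=(I, T))                    # i.i.d. idiosyncratic errors
y = x @ beta + mu + eps

xd = x - x.mean(axis=1, keepdims=True)           # within (time-demeaning) transformation
yd = y - y.mean(axis=1, keepdims=True)

XX = np.einsum('itk,itl->kl', xd, xd)            # sum over i,t of (x - xbar)(x - xbar)'
Xy = np.einsum('itk,it->k', xd, yd)
beta_within = np.linalg.solve(XX, Xy)
print(beta_within)                                # close to (1.5, -0.7)
```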

The within estimator is well studied, well understood, and a basic building block in the panel data econometrics literature. However, the within transformation introduces strong correlation between the transformed errors in the estimating Eq. (4), because all the transformed errors $(\varepsilon_{it} - \bar{\varepsilon}_{iT})$ for a cross-sectional unit $i$ share the same $\bar{\varepsilon}_{iT}$. This makes residual diagnostic checks and residual analysis awkward and difficult, for example, if we wanted to test whether the errors $\varepsilon_{it}$ are indeed i.i.d.

On the other hand, if we apply the Helmert transformation in Eq. (1) to each variable in the fixed effects model Eq. (3), for each cross-sectional unit $i$ separately, to construct transformed variables corresponding to $z_2, z_3, \ldots, z_T$, which have mean 0, then we again eliminate the fixed effects $\mu_i$:

(6) $(y_{it} - \bar{y}_{i,t-1})\sqrt{(t-1)/t} = (x_{it} - \bar{x}_{i,t-1})'\beta\,\sqrt{(t-1)/t} + (\varepsilon_{it} - \bar{\varepsilon}_{i,t-1})\sqrt{(t-1)/t},$

$i = 1, 2, \ldots, I$, $t = 2, \ldots, T$, where $\bar{y}_{i,t-1} = \sum_{\tau=1}^{t-1} y_{i\tau}/(t-1)$, $\bar{x}_{i,t-1} = \sum_{\tau=1}^{t-1} x_{i\tau}/(t-1)$, and $\bar{\varepsilon}_{i,t-1} = \sum_{\tau=1}^{t-1} \varepsilon_{i\tau}/(t-1)$.

We can proceed with the OLS estimation of the Helmert-transformed estimating Eq. (6),

(7) $\hat{\beta} = \Big[\sum_{i=1}^{I}\sum_{t=2}^{T} \tfrac{t-1}{t}(x_{it} - \bar{x}_{i,t-1})(x_{it} - \bar{x}_{i,t-1})'\Big]^{-1} \sum_{i=1}^{I}\sum_{t=2}^{T} \tfrac{t-1}{t}(x_{it} - \bar{x}_{i,t-1})(y_{it} - \bar{y}_{i,t-1}),$

which is the best linear unbiased estimator, because the errors in the Helmert-transformed estimating equation, $(\varepsilon_{it} - \bar{\varepsilon}_{i,t-1})\sqrt{(t-1)/t}$, are uncorrelated and homoskedastic (and normal if the original $\varepsilon_{it}$ were normal to start with). By invoking Property 6 of the Helmert transformation, one can verify that the estimator in Eq. (7) coincides with the within estimator in Eq. (5). However, as an added benefit, we can apply any residual diagnostics and checks that we might have in mind, because the Helmert-transformed errors in the estimating equation have the same stochastic properties as the original errors in the structural model. To the best of our knowledge, the existing literature on panel data models has not exploited this simplification in the analysis afforded by the Helmert transformation.
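The equivalence of Eq. (7) and the within estimator can be verified numerically. The sketch below (our own illustration; the backward_od helper is a name we introduce here) applies backward orthogonal deviations to the simulated panel design above and confirms that the two estimators coincide:

```python
import numpy as np

rng = np.random.default_rng(3)
I, T, K = 500, 6, 2
beta = np.array([1.5, -0.7])
mu = rng.normal(size=(I, 1))
x = rng.normal(size=(I, T, K)) + mu[:, :, None]
y = x @ beta + mu + rng.normal(size=(I, T))

def backward_od(a):
    """Backward orthogonal deviation along axis 1: for the paper's t = 2..T,
    (a_t - mean(a_1..a_{t-1})) * sqrt((t-1)/t); unit-level constants drop out."""
    cols = [(a[:, t, ...] - a[:, :t, ...].mean(axis=1)) * np.sqrt(t / (t + 1))
            for t in range(1, a.shape[1])]
    return np.stack(cols, axis=1)

xh, yh = backward_od(x), backward_od(y)

beta_helmert = np.linalg.solve(np.einsum('itk,itl->kl', xh, xh),
                               np.einsum('itk,it->k', xh, yh))

xd = x - x.mean(axis=1, keepdims=True)           # within estimator for comparison
yd = y - y.mean(axis=1, keepdims=True)
beta_within = np.linalg.solve(np.einsum('itk,itl->kl', xd, xd),
                              np.einsum('itk,it->k', xd, yd))
print(np.allclose(beta_helmert, beta_within))    # True, as implied by Property 6
```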

The use of the Helmert transformation in the context of panel models and, in particular, dynamic panel models has been popularized by Arellano and Bover (1995) and Arellano (2003). (See also Alvarez and Arellano 2003.) It is also described in Hansen (2022, Section 17.43). A dynamic panel model is like the one given in Eq. (3), but it additionally contains the lagged values of the dependent variable as regressors. For example, if y it (directly) depends only on its own value in the previous period, then the model is

(8) $y_{it} = y_{i,t-1}\alpha + x_{it}'\beta + \mu_i + \varepsilon_{it}, \quad i = 1, 2, \ldots, I, \quad t = 2, \ldots, T.$

Arellano and Bover (1995) suggest to transform the variables in the following way:

(9) $(y_{it} - \bar{y}_{i,t+1})\sqrt{(T-t)/(T-t+1)}, \quad (x_{it} - \bar{x}_{i,t+1})\sqrt{(T-t)/(T-t+1)},$

$i = 1, 2, \ldots, I$, $t = 1, \ldots, T-1$, where $\bar{y}_{i,t+1} = \sum_{\tau=t+1}^{T} y_{i\tau}/(T-t)$ and $\bar{x}_{i,t+1} = \sum_{\tau=t+1}^{T} x_{i\tau}/(T-t)$. Observe that this is the same Helmert transformation as in Eq. (1), but with the variables ordered in reverse according to the time index. Arellano and Bover (1995) refer to this transformation as “the forward orthogonal deviation”, as opposed to “the backward orthogonal deviation”, which is the one displayed in Eq. (6).

Both forward and backward orthogonal deviations produce uncorrelated and homoskedastic errors in the transformed model, provided that the original errors $\varepsilon_{it}$ are i.i.d. However, the forward orthogonal deviation has the following advantage in the dynamic panel model. Whichever transformation is used to remove the fixed effects, it introduces correlation between the transformed lagged dependent variable and the transformed error term, so one needs to use an instrumental variables estimator to obtain consistent estimates of $\alpha$ and $\beta$. With the forward orthogonal deviation, the past values of the dependent variable $y_{i1}, \ldots, y_{i,t-1}$ are valid instruments for $(y_{i,t-1} - \bar{y}_{i,t})\sqrt{(T-t-1)/(T-t)}$, which is the new lagged dependent variable after the variable transformation. Furthermore, Hayakawa (2009a, 2009b) shows that the instrumental variables estimator is more efficient if the instruments themselves are constructed using backward orthogonal deviations.
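A minimal sketch of the forward orthogonal deviation in Eq. (9) (our own illustration for a single unit's time series; the function name forward_od is ours) shows both the transformation and the fact that a unit-level constant, such as a fixed effect, is eliminated:

```python
import numpy as np

def forward_od(a):
    """(a_t - mean(a_{t+1}..a_T)) * sqrt((T-t)/(T-t+1)) for t = 1..T-1."""
    a = np.asarray(a, dtype=float)
    T = a.size
    return np.array([(a[t] - a[t + 1:].mean()) * np.sqrt((T - t - 1) / (T - t))
                     for t in range(T - 1)])

a = np.array([1.0, 2.0, 4.0, 8.0])
print(forward_od(a))                          # T - 1 = 3 transformed values
print(forward_od(a + 10.0) - forward_od(a))   # ~0: an additive constant drops out
```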

5 Conclusion

We revisit the Helmert transformation, and we provide a simple and useful induction-based derivation of the joint distribution of the sample mean and the sample variance in samples from independently and identically distributed normal random variables. Our derivation is concrete and should be appealing to students of statistics and econometrics. We also suggest one fruitful application of the Helmert transformation in panel data econometrics: residual-based tests in fixed effects models. We briefly review the applications of the Helmert transformation in the panel data context, where the transformation is more commonly known as “the forward/backward orthogonal deviations operator”.


Corresponding author: Gueorgui I. Kolev, Department of Finance, Tilburg University, Tilburg, Netherlands, E-mail:

We would like to thank two anonymous reviewers, an associate editor, and the editor Prof. Tong Li for their comments and suggestions which improved the quality of this work. Remaining errors are our own.


References

Alvarez, J., and M. Arellano. 2003. “The Time Series and Cross-Section Asymptotics of Dynamic Panel Data Estimators.” Econometrica 71 (4): 1121–59. https://doi.org/10.1111/1468-0262.00441.

Arellano, M. 2003. Panel Data Econometrics. Oxford: Oxford University Press. https://doi.org/10.1093/0199245282.001.0001.

Arellano, M., and O. Bover. 1995. “Another Look at the Instrumental Variable Estimation of Error-Components Models.” Journal of Econometrics 68 (1): 29–51. https://doi.org/10.1016/0304-4076(94)01642-d.

Brownlee, K. A. 1965. Statistical Theory and Methodology in Science and Engineering. New York: Wiley.

Cramér, H. 1946. Mathematical Methods of Statistics. Princeton: Princeton University Press. https://doi.org/10.1515/9781400883868.

David, H. A. 2009. “A Historical Note on Zero Correlation and Independence.” The American Statistician 63 (2): 185–6. https://doi.org/10.1198/tast.2009.0034.

Fisher, R. A. 1915. “Frequency Distribution of the Values of the Correlation Coefficient in Samples from an Indefinitely Large Population.” Biometrika 10 (4): 507–21. https://doi.org/10.2307/2331838.

Fisher, R. A. 1925. “Applications of ‘Student’s’ Distribution.” Metron 5: 90–104.

Hansen, B. 2022. Econometrics. Princeton: Princeton University Press.

Hayakawa, K. 2009a. “A Simple Efficient Instrumental Variable Estimator for Panel AR(p) Models when Both N and T Are Large.” Econometric Theory 25 (3): 873–90. https://doi.org/10.1017/s0266466609090707.

Hayakawa, K. 2009b. “First Difference or Forward Orthogonal Deviation: Which Transformation Should Be Used in Dynamic Panel Data Models? A Simulation Study.” Economics Bulletin 29 (3): 2008–17.

Helmert, F. R. 1876. “Die Genauigkeit der Formel von Peters zur Berechnung des wahrscheinlichen Beobachtungsfehlers direkter Beobachtungen gleicher Genauigkeit.” Astronomische Nachrichten 88: 113. https://doi.org/10.1002/asna.18760880802.

Hsiao, C. 2014. Analysis of Panel Data, 3rd ed. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139839327.

Kruskal, W. 1946. “Helmert’s Distribution.” The American Mathematical Monthly 53 (8): 435–8. https://doi.org/10.1080/00029890.1946.11991723.

Lancaster, H. O. 1959. “Zero Correlation and Independence.” Australian Journal of Statistics 1 (2): 53–6. https://doi.org/10.1111/j.1467-842x.1959.tb00274.x.

Lancaster, H. O. 1965. “The Helmert Matrices.” The American Mathematical Monthly 72 (1): 4–12. https://doi.org/10.2307/2312989.

Rao, C. R. 1973. Linear Statistical Inference and its Applications, 2nd ed. New York: Wiley. https://doi.org/10.1002/9780470316436.

Sawkins, D. T. 1940. “Elementary Presentation of the Frequency Distribution of Certain Statistical Populations.” Journal and Proceedings of the Royal Society of New South Wales 74: 209–39. https://doi.org/10.5962/p.360296.

Stigler, S. M. 1984. “Kruskal’s Proof of the Joint Distribution of $\bar{X}$ and $S^2$.” The American Statistician 38 (2): 134–5. https://doi.org/10.2307/2683251.

Weatherburn, C. E. 1961. A First Course in Mathematical Statistics. Cambridge: The English Language Book Society and Cambridge University Press.

Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press.

Zehna, P. W. 1991. “On Proving that $\bar{X}$ and $S^2$ are Independent.” The American Statistician 45 (2): 121–2. https://doi.org/10.1080/00031305.1991.10475782.

Received: 2021-09-03
Revised: 2022-09-06
Accepted: 2022-09-09
Published Online: 2022-10-05

© 2022 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
