
Fast inference methods for high-dimensional factor copulas

Alex Verhoijsen and Pavel Krupskiy
Published/Copyright: September 9, 2022

Abstract

Gaussian factor models allow the statistician to capture multivariate dependence between variables. However, they are computationally cumbersome in high dimensions and cannot capture multivariate skewness in the data. We propose a copula model that allows for arbitrary margins and for multivariate skewness by including a non-Gaussian factor whose dependence structure results from a one-factor copula model. Estimation is carried out using a two-step procedure: margins are modelled separately and transformed to the normal scale, after which the dependence structure is estimated. We develop an estimation procedure that allows for fast estimation of the model parameters in a high-dimensional setting. We first prove the theoretical results for the model with up to three Gaussian factors. Simulation results then confirm that the model performs well as the sample size and dimension grow. Finally, we apply the model to a selection of stocks of the S&P500, which demonstrates that our model is able to capture cross-sectional skewness in stock market data.

1 Introduction

In the literature, Gaussian factor models are used as a dimensionality reduction technique that describes a vector of $d$ variables $X = (X_1, \dots, X_d)$ as a linear combination of a vector of $p$ Gaussian latent factors $Z = (Z_1, \dots, Z_p)$, where in meaningful cases $p < d$. Variables thus form groups based on their correlation, such that correlation within one particular group of variables can be high, while the correlation between variables assigned to different groups can be low [4,5]. Because of the minimal assumptions underlying factor models, this technique is widely used in a range of applications. These include econometrics, where factor models are either used to identify the factor structure itself [1,2,18] or to control for endogenous common factors in factor-augmented panel data regressions [15]. Other applications include the finance literature, see Chapter 5 in [10] or Chapter 9 in [21], and hydrology [19].

Nevertheless, the assumption of multivariate Gaussianity implies that these factor models cannot capture cross-sectional skewness or kurtosis in the underlying multivariate distribution. Copula models, on the other hand, provide a straightforward technique to capture the entire dependence structure between random variables and allow for non-Gaussian skewness and kurtosis. A $d$-dimensional copula $C: [0,1]^d \to [0,1]$ is the cumulative distribution function (cdf) of a random vector $U = (U_1, \dots, U_d)$, where $U_j \sim U(0,1)$ for $j = 1, \dots, d$. Let $F_1, \dots, F_d$ be univariate cdfs, and let $x = (x_1, \dots, x_d)$. Then by [17], the function $F(x) = C(F_1(x_1), \dots, F_d(x_d))$ is a $d$-dimensional cdf with margins $F_1, \dots, F_d$. Conversely, if $F$ is a $d$-dimensional cdf with univariate cdfs $F_1, \dots, F_d$, then there exists a copula such that $F(x) = C(F_1(x_1), \dots, F_d(x_d))$ holds [17]. If the margins are continuous, then one speaks of the copula, which is unique and equal to $C(u) = F(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d))$, where $F_j^{-1}$ is the generalised inverse function of $F_j$.[1]

While copula models provide an elegant technique to model dependence between random variables, they come with certain restrictions. On the one hand, they can be computationally demanding in a high-dimensional setting, while on the other hand they might not be flexible enough to model dependence between a large number of random variables. A solution was provided by Krupskii and Joe [7], who proposed a class of factor copula models that are both parsimonious and flexible. The idea is to assume $p$ latent factors $V_1, \dots, V_p$ that induce dependence between $d$ observable variables $U_1, \dots, U_d$ via bivariate linking copulas. These factor copula models can be extended to allow for a more structured approach [8]. A different type of factor copula model was proposed by [13]. The latter model uses the simulated method of moments (SMM) estimation method described in [12] and is extended to allow for time-varying parameters in [14]. However, estimation for this type of factor copula model is particularly slow in higher dimensions, as SMM is based on a resampling scheme. Bayesian estimation methods for factor copula models are explored by [3,6,11,16,20], but are generally also computationally demanding. A novel method to decrease the computational time needed to estimate factor copula models is proposed by Krupskii and Joe [9], who provide a procedure that allows for fast estimation of the copula parameters in non-overlapping factor copula models. The latter include the one-factor copula model in [7] and the nested factor copula model in [8].

The main contribution of this article is to provide a fast and computationally tractable procedure to model very high-dimensional data characterised by non-Gaussian skewness in the multivariate distribution. This is done in a two-step approach. First, marginals are modelled separately and transformed to normal scores. Second, dependence between the marginals is modelled by means of p + 1 common factors: one factor resulting from a one-factor copula model as described in [7] and p Gaussian factors.

Factor loadings and the residual correlation parameter are estimated using a generalised method of moments (GMM) approach, which provides consistent and asymptotically normal estimates as the sample size n tends to infinity. In a subsequent step, the unobserved Gaussian factors are estimated using a weighted combination of the observed variables. Weights are obtained by maximising the correlation between the estimated and the true Gaussian factors, while convergence is shown by letting the sample size n tend to infinity, and setting the dimension d sufficiently large. To allow for fast inference of the one-factor copula parameter, we use the proxy factor idea proposed by Krupskii and Joe [9]. We show that as n and d tend to infinity, the proxy converges towards the true factor.

In sum, the proposed model exploits both the versatility of Gaussian factor models and the fast inference scheme of non-overlapping factor copula models. The remainder of the article is organised as follows. The new model is proposed in Section 2, its main properties are discussed, and the statistical procedure is outlined. The theoretical results are outlined in Section 3, and the estimation strategy is illustrated with a simulation experiment in Section 4, where we propose a closed-form expression to quickly invert the variance-covariance matrix of the observed data. We illustrate the model by applying it to a real-world data example in Section 5. Section 6 wraps up the article with a discussion and ideas for future research.

2 Model outline and inference

In Section 2.1, a new factor copula model is presented that combines the characteristics of Gaussian factor models and of factor copulas, while still being computationally tractable in high dimensions. The estimation procedure is subsequently discussed in Section 2.2.

2.1 Model outline

Assume one has at hand data from arbitrary margins, which in a first step are transformed to $N(0,1)$ margins. Then $X = (X_1, \dots, X_d)$ is a $d$-dimensional vector of observable variables with $X_j \sim N(0,1)$ for $j = 1, \dots, d$. In a $p$-factor model, the observable variables $X_j$ are a linear combination of $p$ independent latent Gaussian factors and an error term:

(2.1) $X_j = \sum_{k=1}^{p} \lambda_{jk} Z_k + \gamma_j \varepsilon_j, \qquad Z_1, \dots, Z_p, \varepsilon_j \overset{\text{i.i.d.}}{\sim} N(0,1),$

where $\lambda_{jk} = \operatorname{Corr}(X_j, Z_k)$ and $\gamma_j = (1 - \sum_{k=1}^{p} \lambda_{jk}^2)^{1/2}$, $j = 1, \dots, d$. If the errors $\varepsilon_1, \dots, \varepsilon_d$ are independent, then the model (2.1) is a classical Gaussian $p$-factor model. In this work, we assume that $E(\varepsilon_{j_1} \varepsilon_{j_2}) = \eta$ for $j_1 \neq j_2$, and the errors are independent conditional on some unobserved factor $V \sim U(0,1)$ with the respective copula

(2.2) $C(u_1, \dots, u_d) = \int_0^1 \prod_{j=1}^{d} C_{U|V}(u_j \mid v) \, \mathrm{d}v,$

where $C_{U|V}(u_j \mid v) = \partial C_{U,V}(u_j, v) / \partial v$, and $C_{U,V}$ is the copula linking $(\varepsilon_j, V)$. In other words, $C$ is a one-factor copula; see [7] for more details. Casting the model in (2.1) into matrix notation gives

(2.3) $X = \Lambda Z + \Gamma^{1/2} \varepsilon,$

where $X = (X_1, \dots, X_d)'$ is a column vector of $d$ observable variables, and $\Lambda = (\lambda_1, \dots, \lambda_p)$ is a $d \times p$ matrix of $p$ column vectors $\lambda_k = (\lambda_{1k}, \dots, \lambda_{dk})'$ consisting of $d$ factor loadings, where $\lambda_{jk}$ captures the impact of factor $Z_k$ on the $j$th variable $X_j$. The $p$ Gaussian factors are elements of the column vector $Z = (Z_1, \dots, Z_p)'$, while the loadings on the $d$ residuals $\varepsilon = (\varepsilon_1, \dots, \varepsilon_d)'$ are given in the $d \times d$ diagonal matrix $\Gamma^{1/2} = \operatorname{diag}(\gamma_1, \dots, \gamma_d)$. Finally, dependence between the residuals is modelled by the homogeneous one-factor copula model in (2.2).

Denote

$$H = \begin{pmatrix} 1 & \eta & \cdots & \eta \\ \eta & 1 & \cdots & \eta \\ \vdots & \vdots & \ddots & \vdots \\ \eta & \eta & \cdots & 1 \end{pmatrix},$$

a $d \times d$ matrix with $\eta = E(\varepsilon_{j_1} \varepsilon_{j_2})$, where $j_1 \neq j_2$ and $j_1, j_2 = 1, \dots, d$. Then the correlation matrix of $X$ is

(2.4) $\Sigma_X = \Lambda \Lambda' + \Gamma^{1/2} H \Gamma^{1/2}.$

The difference from a standard Gaussian factor model lies in the assumption that $E(\varepsilon \varepsilon') = H \neq I_d$, which implies the presence of correlation between the residual terms.
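To make (2.3) and (2.4) concrete, the following minimal numpy sketch assembles the correlation matrix from a loading matrix $\Lambda$ and the residual correlation $\eta$ (the helper name sigma_x is ours):

```python
import numpy as np

def sigma_x(Lambda, eta):
    """Correlation matrix (2.4): Sigma_X = Lambda Lambda' + Gamma^{1/2} H Gamma^{1/2}."""
    d = Lambda.shape[0]
    gamma = np.sqrt(1.0 - np.sum(Lambda**2, axis=1))  # gamma_j = (1 - sum_k lambda_jk^2)^{1/2}
    H = np.full((d, d), eta)                          # equicorrelation matrix H
    np.fill_diagonal(H, 1.0)
    return Lambda @ Lambda.T + np.outer(gamma, gamma) * H

# sanity check: unit diagonal, so Sigma_X is a valid correlation matrix
rng = np.random.default_rng(0)
S = sigma_x(rng.uniform(0.2, 0.6, size=(5, 2)), eta=0.3)
assert np.allclose(np.diag(S), 1.0)
```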

The following example shows that model (2.1) can capture asymmetric dependence.

Example 2.1

Let $p = 2$, $d = 4$, and assume that the copula linking $(\varepsilon_j, V)$ is the Clayton copula with parameter 2, which corresponds to moderate lower tail dependence. In this case, $\eta \approx 0.52$. Consider the model:

$$X_j = 0.6 Z_1 + 0.8 \varepsilon_j, \quad j = 1, 2, \qquad X_j = 0.6 Z_2 + 0.8 \varepsilon_j, \quad j = 3, 4.$$

Figure 1 shows scatterplots of different pairs of the observed variables. It is seen that the pairs $(X_1, X_2)$ and $(X_3, X_4)$ have stronger dependence in the lower tail, while $(X_1, X_3)$ has much weaker dependence and does not exhibit strong asymmetry.

Figure 1: Scatterplots of different pairs of observed variables in Example 2.1.
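The data behind Figure 1 can be reproduced with a short simulation. The sketch below samples the one-factor Clayton copula through its conditional inverse; the helper name clayton_cond_inv is ours, and the closed form is the standard Clayton conditional quantile:

```python
import numpy as np
from scipy.stats import norm

def clayton_cond_inv(w, v, theta):
    """Inverse conditional Clayton copula: u = C^{-1}_{U|V}(w | v)."""
    return (1.0 + v**(-theta) * (w**(-theta / (1.0 + theta)) - 1.0))**(-1.0 / theta)

rng = np.random.default_rng(1)
n, theta = 2000, 2.0
V = rng.uniform(size=n)                        # latent copula factor
U = clayton_cond_inv(rng.uniform(size=(n, 4)), V[:, None], theta)
eps = norm.ppf(U)                              # dependent residuals on the normal scale

Z = rng.standard_normal((n, 2))
X = np.empty((n, 4))
X[:, :2] = 0.6 * Z[:, [0]] + 0.8 * eps[:, :2]  # X_j = 0.6 Z_1 + 0.8 eps_j, j = 1, 2
X[:, 2:] = 0.6 * Z[:, [1]] + 0.8 * eps[:, 2:]  # X_j = 0.6 Z_2 + 0.8 eps_j, j = 3, 4
# scatterplots of, e.g., (X_1, X_2) and (X_1, X_3) reproduce the patterns in Figure 1
```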

2.2 Inference

In this section, we present the statistical procedure that ensures we can estimate the model parameters, the Gaussian factors, and the latent factor of the one-factor copula model. The corresponding theoretical results are outlined in Section 3.

We begin by estimating the parameters in the model, for which we implement a GMM approach. Let $X_n = (X_1, \dots, X_n)$ be a sample of size $n$, where $X_i = (X_{i1}, \dots, X_{id})'$ for $i = 1, \dots, n$. Moreover, let $\hat\Sigma_n = \hat\Sigma_{X_n}$ denote the sample correlation matrix and define

$$Q_n(\Lambda, \eta) = \sum_{j=1}^{d} \sum_{k=1}^{d} \{(\Sigma_X - \hat\Sigma_n)^2\}_{j,k},$$

then the model parameters can be obtained by

$$(\hat\Lambda, \hat\eta) = \operatorname*{argmin}_{(\Lambda, \eta)} Q_n(\Lambda, \eta).$$
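A minimal numerical sketch of this optimisation, reading $Q_n$ as the sum of squared entrywise differences; the starting values and box constraints are our choices, and the identification restriction discussed in Appendix B (fixing one loading, as done in Section 4) is omitted for brevity:

```python
import numpy as np
from scipy.optimize import minimize

def q_n(params, Sigma_hat, d, p):
    """GMM objective: Q_n = sum of squared entries of Sigma_X(Lambda, eta) - Sigma_hat."""
    Lambda, eta = params[:d * p].reshape(d, p), params[-1]
    g2 = 1.0 - np.sum(Lambda**2, axis=1)
    if np.any(g2 <= 0.0):                     # gamma_j must stay real
        return 1e10
    H = np.full((d, d), eta); np.fill_diagonal(H, 1.0)
    Sigma = Lambda @ Lambda.T + np.outer(np.sqrt(g2), np.sqrt(g2)) * H
    return np.sum((Sigma - Sigma_hat)**2)

def fit_gmm(X, p):
    """Minimise Q_n over (Lambda, eta); identification restrictions are omitted here."""
    n, d = X.shape
    Sigma_hat = np.corrcoef(X, rowvar=False)  # sample correlation matrix
    x0 = np.concatenate([np.full(d * p, 0.3), [0.2]])
    res = minimize(q_n, x0, args=(Sigma_hat, d, p), method="L-BFGS-B",
                   bounds=[(-0.95, 0.95)] * (d * p) + [(0.01, 0.95)])
    return res.x[:d * p].reshape(d, p), res.x[-1]
```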

After obtaining the parameters, we want to estimate the Gaussian factors. This can be done by creating a proxy $\hat Z_{knd} = \hat w_{knd}' X_n$ of the true Gaussian factors $Z_k$ (for $k = 1, \dots, p$) by taking a weighted combination of the observable variables. Moreover, it is possible to obtain unique optimal weights $\hat w_{knd} = (\hat\lambda_k' \hat\Sigma_n^{-1} \hat\lambda_k)^{-1/2} \hat\Sigma_n^{-1} \hat\lambda_k$ by minimising the $L_2$ norm between the proxy factor and the true factor. In Section 4.1, we propose a closed-form expression to quickly invert the $d \times d$ variance-covariance matrix $\Sigma_X$.
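In code, the proxies follow directly from the estimated loadings and the sample correlation matrix; a sketch (the function name is ours):

```python
import numpy as np

def factor_proxies(X, Lambda_hat):
    """Proxies Z_hat_k = w_k' X with w_k = (lam_k' S^{-1} lam_k)^{-1/2} S^{-1} lam_k."""
    S_inv = np.linalg.inv(np.corrcoef(X, rowvar=False))    # see Section 4.1 for a closed form
    W = S_inv @ Lambda_hat                                 # one column of weights per factor
    scale = np.sqrt(np.einsum('jk,jk->k', Lambda_hat, W))  # (lam_k' S^{-1} lam_k)^{1/2}
    return X @ (W / scale)                                 # n x p matrix of factor proxies
```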

The next step is to recover the latent factor in the one-factor copula model. To this end, the estimated residuals can be recovered as:

$$\hat\varepsilon_{jnd} = \frac{1}{\hat\gamma_j}\left(X_j - \sum_{k=1}^{p} \hat\lambda_{jk} \hat Z_{knd}\right),$$

using the estimated values of the factor loadings. We can then use the proxy $\bar U_{nd} = \sum_{j=1}^{d} \hat\varepsilon_{jnd} / d$ to approximate the unknown factor $V$ in the one-factor copula model. The density of the one-factor copula model is obtained by differentiating (2.2), that is,

$$c(u_1, \dots, u_d) = \frac{\partial^d C(u_1, \dots, u_d)}{\partial u_1 \cdots \partial u_d} = \int_0^1 \prod_{j=1}^{d} c_{U,V}(u_j, v) \, \mathrm{d}v.$$

Thus, by plugging in $\bar U_{nd}$ as a proxy for $V$, we can obtain the copula parameter $\theta$ by maximising the log-likelihood

$$\sum_{j=1}^{d} \sum_{i=1}^{n} \ln c_{U, \bar U_{nd}}(u_{ij}, v_i; \theta).$$
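A sketch of this last step with Clayton linking copulas; transforming the averaged residuals to the uniform scale by ranks follows the proxy approach of [9], but the exact transformation is not spelled out above, so this detail is our assumption:

```python
import numpy as np
from scipy.stats import norm, rankdata
from scipy.optimize import minimize_scalar

def clayton_logpdf(u, v, theta):
    """Log-density of the bivariate Clayton copula."""
    return (np.log1p(theta) - (theta + 1.0) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(u**(-theta) + v**(-theta) - 1.0))

def fit_copula_theta(eps_hat):
    """MLE of the Clayton parameter, with the averaged residuals as the factor proxy."""
    n, d = eps_hat.shape
    u = norm.cdf(eps_hat)                            # residuals back on the uniform scale
    v = rankdata(eps_hat.mean(axis=1)) / (n + 1.0)   # rank-transformed proxy for V (assumption)
    nll = lambda th: -np.sum(clayton_logpdf(u, v[:, None], th))
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x
```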

3 Theoretical results

We prove these results for the model with $p \leq 3$ Gaussian factors. All proofs are deferred to Appendix B.

We assume the following conditions hold.

Condition 3.1

Let $\delta_{jk} = \lambda_{jk} / \gamma_j$, let $\Delta_k = \{(\delta_{lk} - \delta_{mk}) / d\}_{l,m=1}^{d}$, and let $\phi_{k_1,k_2}$ be the angle between $\Delta_{k_1}$ and $\Delta_{k_2}$. There exist $\xi_0, \xi_L, \xi_U \in (0, \infty)$ and a constant integer $d_0 \in \mathbb{N}$ such that the following hold for any $d > d_0$, all $j = 1, \dots, d$, and all $k, k_1, k_2, k_3 = 1, \dots, p$:

  1. $\xi_L < \frac{1}{d} \sum_{l=1}^{d} \delta_{lk}^2 < \xi_U < \infty$;

  2. $\Sigma_X$ is invertible and $\xi_0 < (\Sigma_X)_{j,k} < 1 - \xi_0$ for $j \neq k$;

  3. $\|\Delta_k\| > \xi_0$, $\phi_{k_1,k_2} + \phi_{k_1,k_3} - \phi_{k_2,k_3} > \xi_0$ for $k_1 \neq k_2 \neq k_3$, and $\phi_{k_1,k_2} + \phi_{k_1,k_3} + \phi_{k_2,k_3} < 2\pi - \xi_0$, where $\|\Delta_k\| = \frac{1}{d}\left(\sum_{l=1}^{d} \sum_{m=1}^{d} (\delta_{lk} - \delta_{mk})^2\right)^{1/2}$ is the Euclidean norm.[2]

The next theorem states that GMM estimation yields parameter estimates that converge in probability to their true values, given some regularity conditions. Moreover, the resulting estimators are asymptotically normal, as stated in the subsequent theorem.

Theorem 3.2

(Consistency of the GMM estimators) Let $f(X_n, \Lambda, \eta) = \Sigma_X - \hat\Sigma_n$. The GMM estimators converge in probability to their true values as $n \to \infty$:

(3.1) $(\hat\Lambda, \hat\eta) \overset{P}{\to} (\Lambda_0, \eta_0),$

if the following conditions are satisfied:

  1. $E[f(X, \Lambda, \eta)] = 0$ only for $(\Lambda, \eta) = (\Lambda_0, \eta_0)$;

  2. The parameter space $\Theta \subset \mathbb{R}^{pd+1}$ is compact;

  3. With probability one, $f(X, \Lambda, \eta)$ is continuous at each $(\Lambda, \eta)$;

  4. $E[\sup_{(\Lambda, \eta) \in \Theta} \|f(X, \Lambda, \eta)\|] < \infty$.

Theorem 3.3

(Asymptotic normality of the GMM estimators) Let $f(X_n, \Lambda, \eta) = \Sigma_X - \hat\Sigma_n$. The GMM estimators are asymptotically normal if the following conditions are satisfied:

  1. $(\hat\Lambda, \hat\eta)$ is consistent;

  2. The parameter space $\Theta \subset \mathbb{R}^{pd+1}$ is compact;

  3. $f(X, \Lambda, \eta)$ is continuously differentiable in some neighbourhood of $(\Lambda_0, \eta_0)$ with probability one;

  4. $E[\|f(X_i, \Lambda, \eta)\|^2] < \infty$;

  5. $\sup_{(\Lambda, \eta) \in \Theta} \|\nabla_{(\Lambda, \eta)} \Sigma_X\| < \infty$;

  6. The matrix $(\nabla_{(\Lambda, \eta)} \Sigma_X)' \nabla_{(\Lambda, \eta)} \Sigma_X$ is non-singular.

The following lemma states that we can obtain a proxy of the true Gaussian factors $Z_k$ (for $k = 1, \dots, p$) by taking a weighted combination of the observable variables $X = (X_1, \dots, X_d)'$.

Lemma 3.4

(Optimal weights) Let $w_{kd} = (w_{k1}, \dots, w_{kd})'$ be a weight vector of length $d$ (for $k = 1, \dots, p$), such that the weighted sum of the observed variables $\hat Z_{kd} = w_{kd}' X$ is a proxy of the Gaussian factor $Z_k$. Then one can obtain the unique optimal weight vector $w_{kd} = (\lambda_k' \Sigma_X^{-1} \lambda_k)^{-1/2} \Sigma_X^{-1} \lambda_k$ by minimising $E\{(\hat Z_{kd} - Z_k)^2\}$ with respect to $w_{kd}$ such that $\operatorname{Var}(\hat Z_{kd}) = 1$.

Theorem 3.5 states that $\hat Z_{kd} = w_{kd}' X$ will converge to its true value if the dimension is sufficiently large.

Theorem 3.5

(Convergence in $L_2$ of the Gaussian factor proxies) Assume Condition 3.1 is satisfied. For $k = 1, \dots, p$, let $\hat Z_{kd} = w_{kd}' X$, with $w_{kd} = (\lambda_k' \Sigma_X^{-1} \lambda_k)^{-1/2} \Sigma_X^{-1} \lambda_k$. Then the following holds as $d \to \infty$:

$$\lim_{d \to \infty} E\{(\hat Z_{kd} - Z_k)^2\} = 0.$$

Theorem 3.6 states that $\hat Z_{knd}$ will also converge to its true value.

Theorem 3.6

(Convergence in $L_2$ of the estimated Gaussian factor proxies) Let $\hat Z_{knd} = \hat w_{knd}' X_n$, with $\hat w_{knd} = (\hat\lambda_k' \hat\Sigma_n^{-1} \hat\lambda_k)^{-1/2} \hat\Sigma_n^{-1} \hat\lambda_k$, where $\hat\Sigma_n^{-1}$ and $\hat\lambda_k$ are the estimated counterparts of $\Sigma_X^{-1}$ and $\lambda_k$, respectively. Then, for $k = 1, \dots, p$,

$$\lim_{d \to \infty} \lim_{n \to \infty} E\{(\hat Z_{knd} - Z_k)^2\} = 0.$$

Note that in Theorem 3.6, the order of the limits is crucial to ensure that convergence takes place.

Denote

$$\hat\varepsilon_{jd} = \frac{1}{\gamma_j}\left(X_j - \sum_{k=1}^{p} \lambda_{jk} \hat Z_{kd}\right).$$

A first step is to show in Lemma 3.7 that the estimated proxy for the latent factor, $\bar U_{nd} = \sum_{j=1}^{d} \hat\varepsilon_{jnd} / d$, converges towards the proxy for the latent factor, $\bar U_d = \sum_{j=1}^{d} \hat\varepsilon_{jd} / d$. Subsequently, in Theorem 3.8 we show that $\bar U_d$ converges in probability to a monotone function of the true latent factor if the dimension is sufficiently large.

Lemma 3.7

Let $\bar U_{nd} = \sum_{j=1}^{d} \hat\varepsilon_{jnd} / d$, and let $\bar U_d = \sum_{j=1}^{d} \hat\varepsilon_{jd} / d$. Then

$$\bar U_{nd} \overset{P}{\to} \bar U_d, \quad \text{as } n \to \infty.$$

Theorem 3.8

(Convergence in probability to a monotonic function of the factor $V$ as $d \to \infty$) Let $\bar U_{nd} = \sum_{j=1}^{d} \hat\varepsilon_{jnd} / d$. Under the assumptions of Theorem 5 in [9] and Theorem 3.5,

$$\bar U_{nd} \overset{P}{\to} m(V),$$

as $n \to \infty$ and $d \to \infty$, where $m(\cdot)$ is a monotone function.

The next result shows that the observed variables in (2.1) are asymptotically independent, so this model is not suitable for modelling data with strong tail dependence.

Theorem 3.9

(Asymptotic independence of $X$) If $0 < \gamma_l < 1$, $0 < \gamma_m < 1$, and $\gamma_l (1 - \gamma_m^2)^{1/2} \neq \gamma_m (1 - \gamma_l^2)^{1/2}$ for $1 \leq l < m \leq d$, then

$$\lim_{z \to -\infty} \frac{\Pr(X_l < z, X_m < z)}{\Phi(z)} = 0,$$

where $\Phi(\cdot)$ is the cdf of the standard normal distribution. This result implies that the copula linking $(X_l, X_m)$ has no lower tail dependence. A similar result holds for the upper tail.

4 Simulation study

In this section, we perform a simulation study with the goal of evaluating the performance of the estimation methods for model (2.1). The numerical implementation techniques that can be used to efficiently compute the presented estimators of the model are discussed in Section 4.1, and Section 4.2 sets up the simulation experiment for which the results are presented in Section 4.3.

4.1 Numerical implementation

The main hurdle in applying the calculations in the previous section is obtaining the inverse of the correlation matrix, $\Sigma_X^{-1}$. Such an inversion becomes computationally infeasible whenever the dimension $d$ is too large. As mentioned in the proof of Theorem 3.5, the following expression can be obtained using Woodbury's formula [Chapter 2 in 22]:

$$\Sigma_X^{-1} = (\Gamma^{1/2} H \Gamma^{1/2})^{-1} - (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda (\Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda + I_p)^{-1} \Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1}.$$

As one can see, the bottleneck lies in the inversion of the symmetric $d \times d$ matrix $\Gamma^{1/2} H \Gamma^{1/2}$. However, we find that $(\Gamma^{1/2} H \Gamma^{1/2})^{-1} = \{(y_{j,k})\}_{j,k=1}^{d}$, where

$$y_{jj} = \frac{1 + (d-2)\eta}{\gamma_j^2 (1 + (d-2)\eta - (d-1)\eta^2)}, \qquad y_{jk} = \frac{-\eta}{\gamma_j \gamma_k (1 + (d-2)\eta - (d-1)\eta^2)},$$

for $j \neq k$, $j, k = 1, \dots, d$. Using this closed-form expression for the inverse greatly improves computational efficiency, as we no longer have to deal with the inversion of a $d \times d$ matrix. Indeed, the remaining computations only require the inversion of the $p \times p$ matrix $\Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda + I_p$, for which in practical applications $p \ll d$.
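The closed form translates directly into code; the sketch below (the function name is ours) builds the inverse at $O(d^2)$ cost and checks it against the generic $O(d^3)$ inverse:

```python
import numpy as np

def inv_gamma_h_gamma(gamma, eta):
    """Closed-form inverse of Gamma^{1/2} H Gamma^{1/2} from Section 4.1."""
    d = gamma.shape[0]
    denom = 1.0 + (d - 2) * eta - (d - 1) * eta**2
    Y = -eta / (np.outer(gamma, gamma) * denom)                      # off-diagonal y_jk
    np.fill_diagonal(Y, (1.0 + (d - 2) * eta) / (gamma**2 * denom))  # diagonal y_jj
    return Y

# check against the generic O(d^3) inverse
rng = np.random.default_rng(2)
d, eta = 200, 0.4
gamma = rng.uniform(0.5, 0.9, size=d)
H = np.full((d, d), eta); np.fill_diagonal(H, 1.0)
assert np.allclose(inv_gamma_h_gamma(gamma, eta),
                   np.linalg.inv(np.outer(gamma, gamma) * H))
```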

4.2 Simulation setup

We assume that we have at hand a sample of size $n$ and dimension $d$ which admits the factor model in (2.1) with two Gaussian factors and residual one-factor copula dependence:

(4.1) $X_{ij} = \lambda_{j1} Z_{i1} + \lambda_{j2} Z_{i2} + \gamma_j \varepsilon_{ij},$

for $i = 1, \dots, n$ and $j = 1, \dots, d$. In addition, let $\lambda_{j1} \sim U(0.4, 0.8)$ and $\lambda_{j2} \sim U(0.2, 0.6)$. The following simulation procedure is used to generate the Gaussian factors and the residuals $\varepsilon_{ij}$:

  1. Draw a sample of size $n$ of $Z_{i1} \sim N(0,1)$, $Z_{i2} \sim N(0,1)$, and $W_{i1}, \dots, W_{id}, V_i \sim U(0,1)$, with $i = 1, \dots, n$.

  2. For each $j = 1, \dots, d$, let $U_{ij} = C_{U|V}^{-1}(W_{ij} \mid V_i)$.

  3. Set $\varepsilon_{ij} = \Phi^{-1}(U_{ij})$, where $\Phi^{-1}(\cdot)$ is the quantile function of the Gaussian cdf (see the sketch after this list).
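Steps 1 to 3, together with the loading design for model (4.1), can be sketched as follows for the Clayton case (all function names are ours; the conditional inverse is the same one used in the Example 2.1 sketch):

```python
import numpy as np
from scipy.stats import norm

def clayton_cond_inv(w, v, theta):
    """u = C^{-1}_{U|V}(w | v) for the Clayton copula."""
    return (1.0 + v**(-theta) * (w**(-theta / (1.0 + theta)) - 1.0))**(-1.0 / theta)

def simulate_model(n, d, theta, rng):
    """Steps 1-3 of the simulation design for model (4.1) with Clayton linking copulas."""
    lam1 = rng.uniform(0.4, 0.8, size=d)          # loadings on the first factor
    lam2 = rng.uniform(0.2, 0.6, size=d)          # loadings on the second factor
    gamma = np.sqrt(1.0 - lam1**2 - lam2**2)
    Z1, Z2 = rng.standard_normal((2, n))          # step 1: Gaussian factors
    V = rng.uniform(size=n)                       # step 1: latent copula factor
    W = rng.uniform(size=(n, d))
    U = clayton_cond_inv(W, V[:, None], theta)    # step 2: U_ij = C^{-1}_{U|V}(W_ij | V_i)
    eps = norm.ppf(U)                             # step 3: residuals on the normal scale
    return np.outer(Z1, lam1) + np.outer(Z2, lam2) + eps * gamma

X = simulate_model(n=500, d=100, theta=2.0, rng=np.random.default_rng(3))
```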

In a subsequent step, the factor model parameters are estimated using GMM, with the parameter $\lambda_{11}$ fixed at its true value to make the model parameters identifiable. Then, the factors are estimated using the weighted-variables approach, where the weights are computed as in Theorem 3.6. Finally, the factor copula parameter is estimated using standard maximum likelihood.

This experiment is repeated for 1,000 independent simulations. For every experiment, one of the following settings is used for the sample size n , the dimension d , the copula parameter θ , and the copula family C :

  • The sample size is set to $n \in \{250, 500, 750, 1000, 2500, 5000\}$.

  • The dimension is set to $d \in \{10, 50, 100, 250, 500\}$.

  • The copula parameter is calculated from Kendall's tau, $\tau_K = 0.50$.

  • The families of bivariate linking copulas are Clayton ($C_{\text{Cl}}$) and $t_4$ ($C_{t_4}$).

Next, over all simulation results, we calculate the mean and variance of the correlation between the true and the estimated Gaussian factors, and the Pearson correlation coefficient between the estimated copula factor $\hat V$ and the true copula factor $V$.[3] Finally, we calculate the average bias of the copula parameter and its variance over all simulations. In the next section, we report these estimation results.

4.3 Simulation results

The results of the simulation experiments are presented in Tables 1 and 2. As one can see in both tables, for fixed n , the correlation between the estimated and the true factors increases as d becomes larger. Similarly, for the bias of the estimated copula parameter reported in the last column of the tables, we can see that, for each n , there is a decrease in the bias as the dimension d becomes larger. However, as d increases from 250 to 500, the bias starts to increase again. On the other hand, as the sample size n increases, this decrease becomes less important.

Table 1

Simulation results for 1,000 independent runs from a two-factor model with residual one-factor copula dependence with Clayton linking copulas

Clayton copula with $\tau_K = 0.5$
n  d  Cor($Z_1$, $\hat Z_{1nd}$) (s.d.)  Cor($Z_2$, $\hat Z_{2nd}$) (s.d.)  Cor($V$, $\hat V_{nd}$) (s.d.)  Bias($\hat\theta$) (s.d.)
250 10 0.84 (0.05) 0.68 (0.10) 0.82 (0.09) 0.13 (0.38)
50 0.95 (0.01) 0.92 (0.01) 0.92 (0.01) 0.07 (0.20)
100 0.97 (0.01) 0.95 (0.01) 0.95 (0.01) 0.10 (0.19)
250 0.98 (0.01) 0.98 (0.01) 0.97 (0.01) 0.07 (0.19)
500 0.98 (0.03) 0.97 (0.03) 0.98 (0.01) 0.07 (0.19)
500 10 0.86 (0.04) 0.74 (0.06) 0.85 (0.04) 0.25 (0.22)
50 0.96 (0.01) 0.93 (0.01) 0.92 (0.01) 0.07 (0.15)
100 0.97 ( < 0.01 ) 0.95 (0.01) 0.95 (0.01) 0.05 (0.14)
250 0.99 ( < 0.01 ) 0.98 (0.01) 0.98 ( < 0.01 ) 0.04 (0.14)
500 0.99 (0.02) 0.98 (0.02) 0.98 ( < 0.01 ) 0.05 (0.14)
750 10 0.86 (0.03) 0.75 (0.04) 0.86 (0.01) 0.29 (0.16)
50 0.96 ( < 0.01 ) 0.93 (0.02) 0.93 ( < 0.01 ) 0.06 (0.12)
100 0.97 ( < 0.01 ) 0.95 ( < 0.01 ) 0.95 ( < 0.01 ) 0.04 (0.12)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.98 ( < 0.01 ) 0.03 (0.12)
500 0.99 (0.02) 0.98 (0.02) 0.99 ( < 0.01 ) 0.05 (0.12)
1,000 10 0.87 (0.03) 0.76 (0.03) 0.86 (0.01) 0.29 (0.14)
50 0.96 ( < 0.01 ) 0.93 ( < 0.01 ) 0.93 ( < 0.01 ) 0.06 (0.10)
100 0.97 ( < 0.01 ) 0.95 ( < 0.01 ) 0.96 ( < 0.01 ) 0.04 (0.10)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.98 ( < 0.01 ) 0.02 (0.10)
500 0.99 (0.01) 0.98 (0.01) 0.99 ( < 0.01 ) 0.05 (0.10)
2,500 10 0.88 (0.02) 0.78 (0.02) 0.87 (0.01) 0.31 (0.09)
50 0.96 ( < 0.01 ) 0.93 ( < 0.01 ) 0.93 ( < 0.01 ) 0.04 (0.07)
100 0.98 ( < 0.01 ) 0.96 ( < 0.01 ) 0.96 ( < 0.01 ) 0.03 (0.06)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.98 ( < 0.01 ) 0.02 (0.06)
500 0.99 (0.01) 0.98 (0.01) 0.99 ( < 0.01 ) 0.04 (0.07)
5,000 10 0.88 (0.01) 0.78 (0.01) 0.87 ( < 0.01 ) 0.31 (0.06)
50 0.96 ( < 0.01 ) 0.93 ( < 0.01 ) 0.93 ( < 0.01 ) 0.04 (0.05)
100 0.98 ( < 0.01 ) 0.96 ( < 0.01 ) 0.96 ( < 0.01 ) 0.03 (0.05)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.98 ( < 0.01 ) 0.01 (0.05)
500 0.99 (0.01) 0.98 (0.01) 0.99 ( < 0.01 ) 0.04 (0.05)

Note: Loadings on the first factor are uniformly chosen between 0.4 and 0.8, while loadings on the second factor are uniformly chosen between 0.2 and 0.6.

Table 2

Simulation results for 1,000 independent runs from a two-factor model with residual one-factor copula dependence with t 4 linking copulas

$t_4$ copula with $\tau_K = 0.5$
n  d  Cor($Z_1$, $\hat Z_{1nd}$) (s.d.)  Cor($Z_2$, $\hat Z_{2nd}$) (s.d.)  Cor($V$, $\hat V_{nd}$) (s.d.)  Bias($\hat\theta$) (s.d.)
250 10 0.83 (0.05) 0.63 (0.11) 0.79 (0.13) 0.05 (0.02)
50 0.94 (0.02) 0.90 (0.03) 0.93 (0.02) 0.02 (0.03)
100 0.96 (0.02) 0.93 (0.03) 0.96 (0.02) 0.02 (0.03)
250 0.98 (0.02) 0.97 (0.02) 0.98 (0.01) 0.02 (0.02)
500 0.96 (0.05) 0.95 (0.06) 0.98 (0.02) 0.02 (0.03)
500 10 0.84 (0.04) 0.70 (0.08) 0.86 (0.07) 0.02 (0.12)
50 0.95 (0.01) 0.92 (0.01) 0.94 (0.01) 0.01 (0.02)
100 0.97 (0.01) 0.95 (0.01) 0.97 ( < 0.01 ) 0.01 (0.02)
250 0.99 (0.01) 0.98 (0.01) 0.99 ( < 0.01 ) 0.01 (0.02)
500 0.98 (0.03) 0.97 (0.03) 0.99 ( < 0.01 ) 0.01 (0.02)
750 10 0.85 (0.04) 0.73 (0.06) 0.88 (0.04) 0.04 (0.06)
50 0.95 ( < 0.01 ) 0.92 ( < 0.01 ) 0.95 (0.01) 0.01 (0.02)
100 0.97 ( < 0.01 ) 0.95 ( < 0.01 ) 0.97 ( < 0.01 ) 0.01 (0.01)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.99 ( < 0.01 ) 0.01 (0.01)
500 0.99 (0.02) 0.98 (0.02) 0.99 ( < 0.01 ) 0.01 (0.01)
1,000 10 0.86 (0.04) 0.74 (0.04) 0.88 (0.02) 0.05 (0.02)
50 0.95 ( < 0.01 ) 0.92 ( < 0.01 ) 0.95 ( < 0.01 ) 0.01 (0.01)
100 0.97 ( < 0.01 ) 0.95 ( < 0.01 ) 0.97 ( < 0.01 ) 0.01 (0.01)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.99 ( < 0.01 ) 0.01 (0.01)
500 0.99 (0.01) 0.98 (0.01) 0.99 ( < 0.01 ) 0.01 (0.01)
2,500 10 0.87 (0.02) 0.77 (0.02) 0.89 (0.02) 0.05 (0.01)
50 0.96 ( < 0.01 ) 0.93 ( < 0.01 ) 0.95 ( < 0.01 ) 0.01 (0.01)
100 0.97 ( < 0.01 ) 0.95 ( < 0.01 ) 0.97 ( < 0.01 ) 0.00 (0.01)
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.99 ( < 0.01 ) 0.00 (0.01)
500 0.99 (0.01) 0.98 (0.01) 0.99 ( < 0.01 ) 0.01 (0.01)
5,000 10 0.88 ( < 0.01 ) 0.77 ( < 0.01 ) 0.89 ( < 0.01 ) 0.05 ( < 0.01 )
50 0.96 ( < 0.01 ) 0.93 ( < 0.01 ) 0.95 ( < 0.01 ) 0.00 ( < 0.01 )
100 0.97 ( < 0.01 ) 0.95 ( < 0.01 ) 0.97 ( < 0.01 ) 0.00 ( < 0.01 )
250 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.99 ( < 0.01 ) 0.01 ( < 0.01 )
500 0.99 ( < 0.01 ) 0.98 ( < 0.01 ) 0.99 ( < 0.01 ) 0.01 ( < 0.01 )

Note: Loadings on the first factor are uniformly chosen between 0.4 and 0.8, while loadings on the second factor are uniformly chosen between 0.2 and 0.6.

5 Data application

In this section, we apply the model outlined in the article to a real-world dataset. We consider three different sectors in the S&P500 between January 3, 2017, and December 30, 2020, which accounts for 1,006 observations. The sectors considered are Industrials (71 stocks), Information Technology (74 stocks), and Financials (64 stocks) as classified by the Global Industry Classification Standard.[4]

To model the univariate time series, we use an AR(1)-GARCH(1,1) process for the log-returns $s_{j,t} = \log(S_{j,t} / S_{j,t-1})$ of stock price $S_{j,t}$, for each individual stock $j = 1, \dots, d$ and each time period $t = 1, \dots, T$. The univariate time series model can be written as follows:

$$s_{j,t} = \mu_j + \phi_j s_{j,t-1} + \kappa_{j,t} \xi_{j,t}, \qquad \kappa_{j,t}^2 = \omega_j + \alpha_j \kappa_{j,t-1}^2 \xi_{j,t-1}^2 + \beta_j \kappa_{j,t-1}^2,$$

where $\mu_j$ is the unconditional mean, and $\phi_j$ gives, for stock $j$, the dynamic impact of a change in the log-returns in period $t-1$. Furthermore, $\kappa_{j,t}$ introduces, for stock $j$, non-constant volatility to the process. The latter is a function of the average volatility $\omega_j$, the dynamic impact $\alpha_j$ of a change in the previous period of the squared series, and $\beta_j$, which captures the impact of a change in the volatility at time $t-1$. Finally, the error terms $\xi_{j,t}$ are i.i.d. Skew-$t$ distributed with $\nu$ degrees of freedom and skewness parameter $\gamma$. After estimating the univariate parameters, we transform the estimated residuals to a uniform scale.
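As an illustration of this filtering step, the sketch below uses the Python arch package, one of several tools that fit AR(1)-GARCH(1,1) models with skew-t errors; the article does not name its software, and mapping the standardised residuals to the uniform scale by ranks is one standard reading of the transformation described above:

```python
import pandas as pd
from arch import arch_model
from scipy.stats import rankdata

def garch_filter(returns):
    """Fit an AR(1)-GARCH(1,1) with skew-t errors to one log-return series and
    map the standardised residuals to the uniform scale by ranks."""
    am = arch_model(100 * returns, mean="AR", lags=1,
                    vol="GARCH", p=1, q=1, dist="skewt")  # returns scaled for stability
    res = am.fit(disp="off")
    z = res.std_resid.dropna()                            # standardised residuals
    return pd.Series(rankdata(z) / (len(z) + 1.0), index=z.index)

# applied column-wise to a DataFrame of log-returns s_{j,t}:
# U = log_returns.apply(garch_filter)
```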

Next, we want to identify the number of underlying factors. In Section 2.4 of [13] it is shown that, under regularity conditions, one can use a scree plot based on the eigenvalues of the rank correlation matrix to determine the appropriate number of factors. The scree plot for the AR(1)-GARCH(1,1) filtered data transformed to uniform ranks is given in Figure 2. One can see that for the Industrials sector, the majority of the variance in the data is explained by one factor, while for the Information Technology and Financials sectors, two common factors appear to be more appropriate to capture the variance in the data. Thus, when we look at the number of factors required to simultaneously model the Industrials, Information Technology, and Financials sectors, it is no surprise that the number of factors appears to be between 2 and 4.

Figure 2: Scree plot of the ten largest eigenvalues of the rank correlation matrix for the AR(1)-GARCH(1,1) filtered observations in the Industrials, Information Technology, and Financials stocks of the S&P500.

Next, we model the multivariate dependence between the different log-returns of stock prices. First, we transform the data to normal scores and look at the empirical semi-correlations of the data. The semi-correlation is the correlation computed in the lower tail ($\rho_{<0}$, both normal scores negative) or in the upper tail ($\rho_{>0}$, both normal scores positive). We can see in Table 3 that the average absolute semi-correlations indicate that the data in each of the three sectors are skewed to the left.[5] This is consistent with stylised facts in the financial literature that show stronger correlation during downturns. We therefore expect that a copula with stronger dependence in the lower tail is more appropriate to model these data. For comparison, we use different copula models to see which one fits these characteristics best.
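Concretely, we read the semi-correlations as Pearson correlations restricted to the quadrants where both normal scores share a sign; a minimal sketch under that assumption:

```python
import numpy as np

def semi_correlations(x, y):
    """Semi-correlations of two normal-score series: the Pearson correlation
    restricted to observations where both values are negative (lower tail)
    or both positive (upper tail)."""
    lo = (x < 0) & (y < 0)
    hi = (x > 0) & (y > 0)
    return np.corrcoef(x[lo], y[lo])[0, 1], np.corrcoef(x[hi], y[hi])[0, 1]
```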

Table 3

Sector-wise average absolute lower semi-correlation ($\rho_{<0}$) and upper semi-correlation ($\rho_{>0}$) using three different sectors of the S&P500

Sector  $\rho_{<0}$  $\rho_{>0}$
Industrials 0.38 0.25
Information Technology 0.39 0.21
Financials 0.44 0.34

First, we model dependence within each sector separately using only one Gaussian factor, and a Clayton, Frank, and normal one-factor copula model to capture the residual dependence. Because the lower tail dependence does not appear to be very strong, we also use a reflected Gumbel copula model to capture the residual dependence. We then subsequently use two and three Gaussian factors while using the same one-factor copula models. The estimation results for each sector are presented in Table 4. As one can see in the model with one Gaussian factor, all three sectors exhibit a moderate degree of dependence between their constituent stocks. This is not surprising, as linear dependence is already captured by the Gaussian factor.[6] Similarly, when pairing industry sectors together in a model with two Gaussian factors, we can see that there is a moderate degree of dependence between the stocks in each of the pairs of industry sectors. Finally, when using three Gaussian factors, we can see that the estimated parameter for each of the three copula models is somewhat higher than the estimated dependence parameters in the model with two Gaussian factors.

Table 4

Estimated copula parameters for three different sectors in the S&P500

Sectors r-Gumbel Clayton Frank Normal d
Industrials 1.46 0.73 3.51 0.50 71
Information Technology 1.50 0.79 3.75 0.52 74
Financials 1.74 1.12 4.98 0.63 64
Industrials-IT 1.35 0.56 2.86 0.43 145
Industrials-Financials 1.49 0.77 3.61 0.51 135
IT-Financials 1.46 0.73 3.48 0.50 138
Industrials-IT-Financials 1.61 0.91 4.32 0.57 209

Note: Top: one-factor model, middle: two-factor model, bottom: three-factor model.

To evaluate the performance of our model, we use the estimated parameters to simulate a sample of size 1,000,000. We then calculate the semi-correlations of this simulated sample and compute the average (absolute) bias with respect to the empirical semi-correlations in Table 3. These results are reported in Tables 5 and 6. As we can see, the average absolute bias of the lower semi-correlation, b($\rho_{<0}$), is smallest for the model using reflected Gumbel linking copulas with one Gaussian factor. This shows that our model is capable of capturing multivariate skewness in high-dimensional data. The average absolute bias of the upper semi-correlation is somewhat lower for the Frank and normal copulas, but the difference with that of the reflected Gumbel copula is relatively small, except for the model with one Gaussian factor for the Financials sector. Finally, we also modelled the data using a multivariate Student $t_\nu$ copula with a three-factor covariance structure. However, the estimated degrees-of-freedom parameter is close to 20, which makes it practically indistinguishable from a normal copula model.

Table 5

Absolute average bias for simulated semi-correlations, with n = 1,000,000 , and dimension matching that of each sector

r-Gumbel Clayton Frank Normal
Sectors  b($\rho_{<0}$)  b($\rho_{>0}$)  b($\rho_{<0}$)  b($\rho_{>0}$)  b($\rho_{<0}$)  b($\rho_{>0}$)  b($\rho_{<0}$)  b($\rho_{>0}$)
Ind 0.06 0.07 0.07 0.10 0.16 0.05 0.14 0.05
IT 0.07 0.07 0.07 0.09 0.18 0.06 0.16 0.06
Fin 0.06 0.07 0.06 0.12 0.14 0.06 0.11 0.05
Ind-IT 0.08 0.06 0.09 0.07 0.16 0.05 0.15 0.05
Ind-Fin 0.11 0.09 0.12 0.10 0.18 0.07 0.17 0.06
IT-Fin 0.10 0.06 0.10 0.08 0.17 0.06 0.17 0.05
Ind-IT-Fin 0.06 0.06 0.06 0.09 0.15 0.05 0.13 0.06

Note: Parameters used in the simulation are the estimated parameters for each sector.

Table 6

Average bias for simulated semi-correlations, with n = 1,000,000 , and dimension matching that of each sector

r-Gumbel Clayton Frank Normal
Sectors  b($\rho_{<0}$)  b($\rho_{>0}$)  b($\rho_{<0}$)  b($\rho_{>0}$)  b($\rho_{<0}$)  b($\rho_{>0}$)  b($\rho_{<0}$)  b($\rho_{>0}$)
Ind 0.04 0.06 0.04 0.09 0.16 0.02 0.14 0.01
IT 0.05 0.03 0.04 0.07 0.18 0.01 0.16 0.02
Fin 0.01 0.06 0.01 0.12 0.14 0.03 0.11 0.01
Ind-IT 0.07 0.03 0.08 0.05 0.16 0.00 0.15 0.01
Ind-Fin 0.11 0.08 0.11 0.10 0.18 0.06 0.17 0.05
IT-Fin 0.09 0.04 0.09 0.07 0.17 0.02 0.17 0.01
Ind-IT-Fin 0.01 0.04 0.01 0.08 0.15 −0.00 0.13 0.02

Note: Parameters used in the simulation are the estimated parameters for each sector.

6 Discussion

Applying Gaussian factor models to high-dimensional data requires extensive computational power. Moreover, Gaussian factor models do not capture multivariate skewness in the data. In this article, we provided a method that exploits the high dimensionality of the data to accurately estimate the unobserved factors and simultaneously allows us to model multivariate skewness in the data. The proposed model is flexible, as it allows us to separately model the margins from the dependence structure. We proved that our model works for a factor model with up to three Gaussian factors. Simulation studies showed that the accuracy of our estimations improves as the sample size and dimension increase. To illustrate the model, we applied it to a dataset of three constituent sectors of the S&P500 stock data and found evidence that the stocks in the three sectors are skewed, consistent with findings in the finance literature.

There are different avenues for future research. One direction is to include fast inference models that allow for tail dependence in factor models or for a nested structure in the factor copula dependence. Another option is to include multiple variables that capture dependence between the data using different one-factor copula models.

Acknowledgments

The authors would like to thank the referee and Associate Editor for their constructive comments that have helped to improve this paper.

  1. Funding information: This research was supported by the Andrew Sisson Grant and the Melbourne Research Scholarship from the University of Melbourne.

  2. Conflict of interest: The authors state no conflict of interest.

Appendix A Validity of Condition 3.1(iii)

We can illustrate Condition 3.1(iii) in the case where the factor loadings are uniformly spread out in the interval $[a_k, b_k]$, with $-1 < a_k < b_k < 1$:

$$\lambda_{lk} = a_k + \frac{\sigma_k(l)}{d}(b_k - a_k),$$

where $\sigma_k(\cdot)$ is a permutation of the indexes $\{1, 2, \dots, d\}$. Recall that $\delta_{lk} = \lambda_{lk} / \gamma_l$.

We first consider the case $p = 1$, where we can assume without loss of generality that $\sigma_1(j) = j$, so that

$$\delta_{l1} = \frac{a_1 + (l/d)(b_1 - a_1)}{[1 - \{a_1 + (l/d)(b_1 - a_1)\}^2]^{1/2}}.$$

Let $g_k(x) = a_k + x(b_k - a_k)$. We find that

(6.1) $$\lim_{d \to \infty} \|\Delta_1\|^2 = \lim_{d \to \infty} \frac{1}{d^2} \sum_{l=1}^{d} \sum_{m=1}^{d} (\delta_{l1} - \delta_{m1})^2 = 2 \lim_{d \to \infty} \frac{1}{d} \sum_{l=1}^{d} \delta_{l1}^2 - 2 \lim_{d \to \infty} \left( \frac{1}{d} \sum_{l=1}^{d} \delta_{l1} \right)^2 = 2 \int_0^1 \frac{g_1(x)^2}{1 - g_1(x)^2} \, \mathrm{d}x - 2 \left( \int_0^1 \frac{g_1(x)}{\{1 - g_1(x)^2\}^{1/2}} \, \mathrm{d}x \right)^2.$$

We illustrate the validity of Condition 3.1(iii) in Tables A1 and A2 by plugging a range of values for $a_1$ and $b_1$ into (6.1).

Table A1

Values for the integral in (6.1) for different values of $(a_1, b_1)$

$a_1$  −0.99  −0.80  −0.50  0.00  0.50  0.80
$b_1$  0.99  0.99  0.99  0.99  0.99  0.99
$\lim_{d \to \infty} \|\Delta_1\|^2$  3.347  2.053  1.816  1.841  2.183  2.627
Table A2

Values for the integral in (6.1) for different values of $(a_1, b_1)$

$a_1$  −0.20  −0.15  −0.10  0.00  0.10  0.15
$b_1$  0.20  0.20  0.20  0.20  0.20  0.20
$\lim_{d \to \infty} \|\Delta_1\|^2$  0.0273  0.0208  0.0153  0.0069  0.0018  0.0005
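The entries of Tables A1 and A2 can be checked numerically; the sketch below evaluates (6.1) by quadrature and reproduces, for example, the values 1.841 and 3.347 from Table A1:

```python
import numpy as np
from scipy.integrate import quad

def limit_norm_sq(a1, b1):
    """Evaluate (6.1): the limit of ||Delta_1||^2 for loadings uniform on [a1, b1]."""
    g = lambda x: a1 + x * (b1 - a1)
    i1 = quad(lambda x: g(x)**2 / (1.0 - g(x)**2), 0.0, 1.0)[0]
    i2 = quad(lambda x: g(x) / np.sqrt(1.0 - g(x)**2), 0.0, 1.0)[0]
    return 2.0 * i1 - 2.0 * i2**2

print(round(limit_norm_sq(0.0, 0.99), 3))    # 1.841, as in Table A1
print(round(limit_norm_sq(-0.99, 0.99), 3))  # 3.347
```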

The limiting norm of the vector $\Delta_1$ is smaller if $a_1$ and $b_1$ are close. In practical applications, this indicates that more accurate parameter estimates can be obtained if the values of the loadings $\lambda_{l1}$ are more spread out.

Now we consider the case $p = 2$ with

$$\delta_{lk} = \frac{a_k + (\sigma_k(l)/d)(b_k - a_k)}{[1 - \{a_1 + (\sigma_1(l)/d)(b_1 - a_1)\}^2 - \{a_2 + (\sigma_2(l)/d)(b_2 - a_2)\}^2]^{1/2}}, \qquad k = 1, 2.$$

Furthermore, we consider the case when the loadings $\Lambda_1$ and $\Lambda_2$ are monotonically dependent with $\sigma_1(l) = \sigma_2(l)$ (case 1), and when they are independent (case 2).

In the first case, for $k = 1, 2$,

$$\lim_{d \to \infty} \|\Delta_k\|^2 = 2 \int_0^1 \frac{g_k(x)^2}{1 - g_1(x)^2 - g_2(x)^2} \, \mathrm{d}x - 2 \left( \int_0^1 \frac{g_k(x)}{\{1 - g_1(x)^2 - g_2(x)^2\}^{1/2}} \, \mathrm{d}x \right)^2,$$

and in the second case, for $k = 1, 2$,

$$\lim_{d \to \infty} \|\Delta_k\|^2 = 2 \int_0^1 \int_0^1 \frac{g_k(x_k)^2}{1 - g_1(x_1)^2 - g_2(x_2)^2} \, \mathrm{d}x_1 \, \mathrm{d}x_2 - 2 \left( \int_0^1 \int_0^1 \frac{g_k(x_k)}{\{1 - g_1(x_1)^2 - g_2(x_2)^2\}^{1/2}} \, \mathrm{d}x_1 \, \mathrm{d}x_2 \right)^2.$$

Furthermore, we need to check whether $\cos^2(\phi_{1,2}) = \langle \Delta_1, \Delta_2 \rangle^2 / (\|\Delta_1\|^2 \|\Delta_2\|^2)$ is bounded away from one as $d \to \infty$, where $\langle \Delta_1, \Delta_2 \rangle$ is the inner product. We find

$$\langle \Delta_1, \Delta_2 \rangle = \frac{1}{d^2} \sum_{l=1}^{d} \sum_{m=1}^{d} (\delta_{l1} - \delta_{m1})(\delta_{l2} - \delta_{m2}) = \frac{2}{d} \sum_{l=1}^{d} \delta_{l1} \delta_{l2} - 2 \left( \frac{1}{d} \sum_{l=1}^{d} \delta_{l1} \right) \left( \frac{1}{d} \sum_{l=1}^{d} \delta_{l2} \right).$$

It implies that, in case 1,

$$\lim_{d \to \infty} \langle \Delta_1, \Delta_2 \rangle = 2 \int_0^1 \frac{g_1(x) g_2(x)}{1 - g_1(x)^2 - g_2(x)^2} \, \mathrm{d}x - 2 \prod_{k=1}^{2} \int_0^1 \frac{g_k(x)}{\{1 - g_1(x)^2 - g_2(x)^2\}^{1/2}} \, \mathrm{d}x,$$

and in case 2,

$$\lim_{d \to \infty} \langle \Delta_1, \Delta_2 \rangle = 2 \int_0^1 \int_0^1 \frac{g_1(x_1) g_2(x_2)}{1 - g_1(x_1)^2 - g_2(x_2)^2} \, \mathrm{d}x_1 \, \mathrm{d}x_2 - 2 \prod_{k=1}^{2} \int_0^1 \int_0^1 \frac{g_k(x_k)}{\{1 - g_1(x_1)^2 - g_2(x_2)^2\}^{1/2}} \, \mathrm{d}x_1 \, \mathrm{d}x_2.$$

Tables A3 and A4 show the limiting values for some pairs $(a_1, b_1)$ and $(a_2, b_2)$ for cases 1 and 2, respectively.

Table A3

Limiting values of $\|\Delta_k\|^2$, $k = 1, 2$, and $\cos^2(\phi_{1,2})$ for different values of $(a_1, b_1)$ and $(a_2, b_2)$; case 1

($a_1$, $b_1$)  (−0.5, 0.6)  (0.2, 0.6)  (0.2, 0.6)  (0.2, 0.4)  (0.2, 0.4)
($a_2$, $b_2$)  (−0.3, 0.7)  (0.2, 0.6)  (0.0, 0.4)  (0.1, 0.9)  (0.4, 0.8)
$\lim_{d \to \infty} \|\Delta_1\|^2$  0.427  0.119  0.067  0.219  0.061
$\lim_{d \to \infty} \|\Delta_2\|^2$  0.419  0.119  0.051  1.359  0.243
$\lim_{d \to \infty} \cos^2(\phi_{1,2})$  0.997  1.000  0.999  0.997  1.000
Table A4

Limiting values of $\|\Delta_k\|^2$, $k = 1, 2$, and $\cos^2(\phi_{1,2})$ for different values of $(a_1, b_1)$ and $(a_2, b_2)$; case 2

($a_1$, $b_1$)  (−0.5, 0.6)  (0.2, 0.6)  (0.2, 0.6)  (0.2, 0.4)  (0.2, 0.4)
($a_2$, $b_2$)  (−0.3, 0.7)  (0.2, 0.6)  (0.0, 0.4)  (0.1, 0.9)  (0.4, 0.8)
$\lim_{d \to \infty} \|\Delta_1\|^2$  0.319  0.075  0.054  0.071  0.028
$\lim_{d \to \infty} \|\Delta_2\|^2$  0.296  0.075  0.040  0.796  0.180
$\lim_{d \to \infty} \cos^2(\phi_{1,2})$  0.004  0.189  0.041  0.703  0.416

Note that in case 1, $\cos^2(\phi_{1,2})$ is very close to one, so the convergence of the parameter estimates to their true values may be very slow. In fact, $\Lambda_1$ and $\Lambda_2$ are linearly dependent with $\Lambda_1 = \Lambda_2$ when $(a_1, b_1) = (a_2, b_2) = (0.2, 0.6)$, and $\Lambda_1 = 0.5 \Lambda_2$ with $(a_1, b_1) = (0.2, 0.4)$ and $(a_2, b_2) = (0.4, 0.8)$. Condition 3.1(iii) fails in this case.

In case 2, $\cos^2(\phi_{1,2})$ is far from one in all cases, and accurate parameter estimates can be obtained.

Similar results can be obtained when p = 3 , so they are not reported here.

B Proofs

In this section, proofs are provided for theoretical results presented in Section 3.

Proof of Theorem 3.2

  1. In order for the parameter vector $(\Lambda_0, \eta_0)$ to be unique, one has to impose $p(p-1)/2$ restrictions on the factor loadings $\Lambda_0$. Otherwise, one can always find a rotation $\Lambda_{\text{rot}} = \Lambda A$, with $A$ a $p \times p$ orthogonal matrix, such that the variance-covariance matrix remains the same.

  2. This is trivially satisfied, as the values of the correlation matrix are by definition restricted to the interval $[-1, 1]$.

  3. This is trivially satisfied, as the correlation matrix is continuous at each $(\Lambda, \eta)$ in the parameter space.

  4. Since the correlation parameter $\sigma_{ij} = \operatorname{Corr}(X_i, X_j)$ is bounded by the interval $[-1, 1]$, one can show that

    $$E[\sup_{(\Lambda, \eta) \in \Theta} \|f(X, \Lambda, \eta)\|] \leq 2d < \infty.□$$

Proof of Theorem 3.3

  1. The GMM estimator is consistent if the conditions from Theorem 3.2 are satisfied.

  2. This condition is satisfied, as the parameter space for the candidate loadings $\lambda_{jk}$ and the correlation between the residuals, $\eta$, is restricted to the interval $[-1, 1]$.

  3. We know that $\nabla_{(\Lambda, \eta)} f(X_i, \Lambda, \eta) = \nabla_{(\Lambda, \eta)} \Sigma_X$. So for every element $\sigma_{jl} = \sum_{k=1}^{p} \lambda_{jk} \lambda_{lk} + \eta \gamma_j \gamma_l$, we have

    $$\frac{\partial \sigma_{jl}}{\partial \lambda_{jk}} = \gamma_l (\delta_{lk} - \eta \delta_{jk}), \qquad \frac{\partial \sigma_{jl}}{\partial \eta} = \gamma_j \gamma_l.$$

    The derivatives exist and are finite under Condition 3.1(i).

  4. It is easy to see that $E[\|f(X_i, \Lambda, \eta)\|^2] \leq 4 d^2$.

  5. Note that

    $$\left( \frac{\partial \sigma_{jl}}{\partial \lambda_{jk}} \right)^2 \leq 2 \delta_{lk}^2 + 2 \delta_{jk}^2, \qquad \left( \frac{\partial \sigma_{jl}}{\partial \lambda_{lk}} \right)^2 \leq 2 \delta_{lk}^2 + 2 \delta_{jk}^2, \qquad \left( \frac{\partial \sigma_{jl}}{\partial \eta} \right)^2 \leq 1.$$

    Under Condition 3.1(i),

    $$\sup_{(\Lambda, \eta) \in \Theta} \|\nabla_{(\Lambda, \eta)} f(X_i, \Lambda, \eta)\|^2 \leq 8 \sum_{k=1}^{p} \sum_{l=1}^{d} \delta_{lk}^2 + pd \leq pd (8 \xi_U + 1).$$

  6. To check this condition, we have to show that the matrix

    $$G'G = (\nabla_{(\Lambda, \eta)} \Sigma_X)' \nabla_{(\Lambda, \eta)} \Sigma_X$$

    is invertible. The condition is illustrated for a model with one Gaussian factor; the proof is similar in the general case. The model with one factor has the following derivatives:

    $$\frac{\partial \sigma_{ij}}{\partial \lambda_i} = \lambda_j - \eta \gamma_j \frac{\lambda_i}{\gamma_i}, \qquad \frac{\partial \sigma_{ij}}{\partial \lambda_j} = \lambda_i - \eta \gamma_i \frac{\lambda_j}{\gamma_j}, \qquad \frac{\partial \sigma_{ij}}{\partial \eta} = \gamma_i \gamma_j,$$

    for $i, j = 1, \dots, d$ and $i \neq j$. Then it suffices to show that the following does not hold:

    $$a_i \left( \lambda_j - \eta \gamma_j \frac{\lambda_i}{\gamma_i} \right) + a_j \left( \lambda_i - \eta \gamma_i \frac{\lambda_j}{\gamma_j} \right) = \gamma_i \gamma_j, \quad \text{i.e.,} \quad \tilde a_i (\delta_j - \eta \delta_i) + \tilde a_j (\delta_i - \eta \delta_j) = 1,$$

    where $\tilde a_i = a_i / \gamma_i$ and $\delta_j = \lambda_j / \gamma_j$. Then for $i \neq j$ and $i \neq k$, we have the following system of equations:

    $$\tilde a_i (\delta_j - \eta \delta_i) + \tilde a_j (\delta_i - \eta \delta_j) = 1, \qquad \tilde a_i (\delta_k - \eta \delta_i) + \tilde a_k (\delta_i - \eta \delta_k) = 1;$$

    subtracting the two equations gives another set of equations, where $j \neq l$ and $k \neq l$:

    $$\tilde a_i (\delta_j - \delta_k) + \delta_i (\tilde a_j - \tilde a_k) + \eta (\tilde a_k \delta_k - \tilde a_j \delta_j) = 0, \qquad \tilde a_l (\delta_j - \delta_k) + \delta_l (\tilde a_j - \tilde a_k) + \eta (\tilde a_k \delta_k - \tilde a_j \delta_j) = 0,$$

    and again subtracting these equations gives

    (A1) $$(\tilde a_i - \tilde a_l)(\delta_j - \delta_k) = -(\tilde a_j - \tilde a_k)(\delta_i - \delta_l).$$

    Hence, there are two cases. In the first case, $\tilde a_i - \tilde a_l = 0$, which implies that $\delta_i - \delta_l = 0$; otherwise, all differences $\tilde a_i - \tilde a_l$ and $\tilde a_j - \tilde a_k$ would be zero, for every possible combination of the pairs $(i, l)$ and $(j, k)$. In the second case, $\delta_i - \delta_l$ and $\delta_j - \delta_k$ are different from zero, and we can rewrite the display in (A1) as

    $$\frac{\tilde a_i - \tilde a_l}{\delta_i - \delta_l} = -\frac{\tilde a_j - \tilde a_k}{\delta_j - \delta_k}.$$

    The numerator and denominator on both sides of the above display are nonzero, even though the ratios are the same. However, looping through every possible combination of the pairs $(i, l)$ and $(j, k)$, the only valid result is that all differences $\tilde a_i - \tilde a_l$ are zero, which implies that all $\delta$s are equal. This contradicts Condition 3.1(iii), which states that there can only be up to $O(d^2)$ terms $(\delta_l - \delta_m)^2$ that are zero, which concludes the proof.□

Proof of Lemma 3.4

The optimal weight vector $w_{kd}$ can be derived by differentiating

(A2) $$E\{(\hat Z_{kd} - Z_k)^2\} = w_{kd}' \Sigma_X w_{kd} - 2 w_{kd}' \lambda_k + 1$$

with respect to $w_{kd}$ for each $k = 1, \dots, p$. Recall that $E(\hat Z_{kd}) = E(Z_k) = 0$. Then, assuming that $\Sigma_X$ is invertible, one can obtain the optimal weight vector $w_{kd}$ by optimising (A2) with respect to $w_{kd}$ subject to the constraint $\operatorname{Var}(\hat Z_{kd}) = w_{kd}' \Sigma_X w_{kd} = 1$.□

Proof of Theorem 3.5

After substituting $w_{kd} = (\lambda_k' \Sigma_X^{-1} \lambda_k)^{-1/2} \Sigma_X^{-1} \lambda_k$ back into (A2), one obtains

$$\lim_{d \to \infty} E\{(\hat Z_{kd} - Z_k)^2\} = 2 - 2 \lim_{d \to \infty} (\lambda_k' \Sigma_X^{-1} \lambda_k)^{1/2}.$$

Therefore, we need to show that $\lim_{d \to \infty} \lambda_k' \Sigma_X^{-1} \lambda_k = 1$. Using Woodbury's formula, one can show that

$$\lambda_k' \Sigma_X^{-1} \lambda_k = \lambda_k' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \lambda_k - \lambda_k' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda (\Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda + I_p)^{-1} \Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \lambda_k.$$

Without loss of generality, convergence of the factor estimates is illustrated for $k = 1$; the results hold for any $k = 1, \dots, p$. Furthermore, define $b_{qr} = \lambda_q' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \lambda_r$ and $a_k = b_{kk}$ for $k, r, q = 1, \dots, p$. We find

$$a_k = \frac{1 + (d-2)\eta}{1 + (d-2)\eta - (d-1)\eta^2} \sum_{l=1}^{d} \delta_{lk}^2 - \frac{\eta}{1 + (d-2)\eta - (d-1)\eta^2} \sum_{l \neq m} \delta_{lk} \delta_{mk}, \qquad b_{qr} = \frac{1 + (d-2)\eta}{1 + (d-2)\eta - (d-1)\eta^2} \sum_{l=1}^{d} \delta_{lq} \delta_{lr} - \frac{\eta}{1 + (d-2)\eta - (d-1)\eta^2} \sum_{l \neq m} \delta_{lq} \delta_{mr},$$

where $\delta_{lk} = \lambda_{lk} / \gamma_l$. In particular, $a_1 = \lambda_1' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \lambda_1$ is a scalar.

Proof for a one-factor model ($p = 1$, $k = 1$). In the case where $p = 1$, it follows that $\Lambda = \lambda_1$. Hence,

$$\lambda_1' \Sigma_X^{-1} \lambda_1 = \frac{a_1}{1 + a_1}.$$

It remains to be shown that $a_1 \to \infty$ as $d \to \infty$. If Conditions 3.1(i) and 3.1(ii) are satisfied, then

$$a_1 / d = \frac{1}{1 - \eta} \left\{ \frac{1}{d} \sum_{l=1}^{d} \delta_{l1}^2 - \frac{1}{d^2} \sum_{l=1}^{d} \sum_{m=1}^{d} \delta_{l1} \delta_{m1} \right\} + O(1/d) = \frac{1}{1 - \eta} \left\{ \frac{1}{d} \sum_{l=1}^{d} \delta_{l1}^2 - \left( \frac{1}{d} \sum_{l=1}^{d} \delta_{l1} \right)^2 \right\} + O(1/d) = \frac{1}{1 - \eta} \cdot \frac{1}{2 d^2} \sum_{l=1}^{d} \sum_{m=1}^{d} (\delta_{l1} - \delta_{m1})^2 + O(1/d) = \frac{0.5 \|\Delta_1\|^2}{1 - \eta} + O(1/d) > \frac{0.5 \xi_0^2}{1 - \eta} + O(1/d),$$

where Condition 3.1(iii) is used in the last step. It follows that $\lim_{d \to \infty} \lambda_1' \Sigma_X^{-1} \lambda_1 = 1$.

Proof for a two-factor model ($p = 2$, $k = 1$).

In the case of a two-factor model,

$$\Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda + I_2 = \begin{pmatrix} 1 + a_1 & b_{12} \\ b_{12} & 1 + a_2 \end{pmatrix}.$$

Computing the inverse, we obtain

$$(\Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda + I_2)^{-1} = \frac{1}{(1 + a_1)(1 + a_2) - b_{12}^2} \begin{pmatrix} 1 + a_2 & -b_{12} \\ -b_{12} & 1 + a_1 \end{pmatrix}.$$

Finally,

$$\lambda_1' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda = (a_1, b_{12}).$$

Putting everything together, we obtain

$$\lambda_1' \Sigma_X^{-1} \lambda_1 = \frac{a_1 (1 + a_2) - b_{12}^2}{(1 + a_1)(1 + a_2) - b_{12}^2} = 1 - \frac{1}{1 + a_1} \left( 1 - \frac{b_{12}^2}{(1 + a_1)(1 + a_2)} \right)^{-1}.$$

It is required to show that $\alpha_{12} = 1 - b_{12}^2 / [(1 + a_1)(1 + a_2)]$ is bounded away from zero. To start with,

$$b_{12} / d = \frac{1}{1 - \eta} \left\{ \frac{1}{d} \sum_{l=1}^{d} \delta_{l1} \delta_{l2} - \frac{1}{d^2} \sum_{l=1}^{d} \sum_{m \neq l} \delta_{l1} \delta_{m2} \right\} + O(1/d) = \frac{1}{1 - \eta} \left\{ \frac{1}{d} \sum_{l=1}^{d} \delta_{l1} \delta_{l2} - \frac{1}{d} \sum_{l=1}^{d} \delta_{l1} \cdot \frac{1}{d} \sum_{m=1}^{d} \delta_{m2} \right\} + O(1/d) = \frac{0.5 \langle \Delta_1, \Delta_2 \rangle}{1 - \eta} + O(1/d) = \frac{0.5 \|\Delta_1\| \|\Delta_2\| \cos(\phi_{1,2})}{1 - \eta} + O(1/d).$$

Then Condition 3.1(iii) implies that $\alpha_{12} = 1 - \cos^2(\phi_{1,2}) + O(1/d) = \sin^2(\phi_{1,2}) + O(1/d) > \sin^2(\xi_0 / 2) + O(1/d)$.

Proof for a three-factor model ($p = 3$, $k = 1$). For a three-factor model, we obtain the following expression:

$$\Lambda' (\Gamma^{1/2} H \Gamma^{1/2})^{-1} \Lambda + I_p = \begin{pmatrix} 1 + a_1 & b_{12} & b_{13} \\ b_{12} & 1 + a_2 & b_{23} \\ b_{13} & b_{23} & 1 + a_3 \end{pmatrix}.$$

After some calculations, we find

(A3) $$\lambda_1' \Sigma_X^{-1} \lambda_1 = 1 - b_{123}^{-1} \, \frac{1}{a_1 + 1} \left( 1 - \frac{b_{23}^2}{(1 + a_2)(1 + a_3)} \right),$$

where

$$b_{123} = 1 + \frac{2 b_{12} b_{13} b_{23}}{(1 + a_1)(1 + a_2)(1 + a_3)} - \frac{b_{23}^2}{(1 + a_2)(1 + a_3)} - \frac{b_{13}^2}{(1 + a_1)(1 + a_3)} - \frac{b_{12}^2}{(1 + a_1)(1 + a_2)} = 1 + 2 \cos(\phi_{1,2}) \cos(\phi_{1,3}) \cos(\phi_{2,3}) - \cos^2(\phi_{1,2}) - \cos^2(\phi_{1,3}) - \cos^2(\phi_{2,3}) + O(1/d) = 4 \sin(0.5(\phi_{1,2} + \phi_{1,3} - \phi_{2,3})) \sin(0.5(\phi_{1,2} + \phi_{2,3} - \phi_{1,3})) \sin(0.5(\phi_{1,3} + \phi_{2,3} - \phi_{1,2})) \times \sin(0.5(\phi_{1,2} + \phi_{1,3} + \phi_{2,3})) + O(1/d) \geq 4 \sin^4(\xi_0 / 2) + O(1/d)$$

by Condition 3.1(iii). It implies that $\lambda_1' \Sigma_X^{-1} \lambda_1 \geq 1 - \sin^{-4}(\xi_0 / 2) / (a_1 + 1) + O(1/d) \to 1$ as $d \to \infty$.□

Proof of Theorem 3.6

Using the properties of the GMM estimator, we know that $\hat\lambda_k \overset{P}{\to} \lambda_k$ and $\hat\Sigma_n \overset{P}{\to} \Sigma_X$. Under Condition 3.1(ii), $\Sigma_X$ is invertible. Thus, by the continuous mapping theorem (CMT), it follows that $\hat\Sigma_n^{-1} \overset{P}{\to} \Sigma_X^{-1}$. As a result, $\hat w_{knd} = (\hat\lambda_k' \hat\Sigma_n^{-1} \hat\lambda_k)^{-1/2} \hat\Sigma_n^{-1} \hat\lambda_k \overset{P}{\to} w_{kd} = (\lambda_k' \Sigma_X^{-1} \lambda_k)^{-1/2} \Sigma_X^{-1} \lambda_k$. By expanding, we obtain

(A4) $$\lim_{d \to \infty} \lim_{n \to \infty} E\{(\hat Z_{knd} - Z_k)^2\} = \lim_{d \to \infty} \lim_{n \to \infty} E\{(X' \hat w_{knd})^2\} - 2 \lim_{d \to \infty} \lim_{n \to \infty} E(\hat w_{knd}' X Z_k) + 1.$$

The first and second terms in (A4) can be rewritten as

(A5) $$E\{(X' \hat w_{knd})^2\} = E\{X'(\hat w_{knd} + w_{kd}) \, X'(\hat w_{knd} - w_{kd})\} + E\{(X' w_{kd})^2\},$$

(A6) $$E(\hat w_{knd}' X Z_k) = E\{(\hat w_{knd} - w_{kd})' X Z_k\} + E(w_{kd}' X Z_k),$$

respectively. From Theorem 3.5, we know that

$$\lim_{d \to \infty} E\{(X' w_{kd})^2\} - 2 \lim_{d \to \infty} E(w_{kd}' X Z_k) + 1 = 0.$$

Thus, it remains to be shown that the first terms in (A5) and in (A6) converge to zero as $n \to \infty$. Using Cauchy's inequality, the first term in (A5) is bounded by

$$(E[\{X'(\hat w_{knd} + w_{kd})\}^2] \, E[\{X'(\hat w_{knd} - w_{kd})\}^2])^{1/2}.$$

Again using Cauchy's inequality, one obtains

$$E[\{X'(\hat w_{knd} + w_{kd})\}^2] \leq [E(\|X\|^4) \, E(\|\hat w_{knd} + w_{kd}\|^4)]^{1/2}, \qquad E[\{X'(\hat w_{knd} - w_{kd})\}^2] \leq [E(\|X\|^4) \, E(\|\hat w_{knd} - w_{kd}\|^4)]^{1/2}.$$

Note that $E(\|X\|^4) \leq 3 d^2$ and $\hat w_{knd}' \hat\Sigma_n \hat w_{knd} = w_{kd}' \Sigma_X w_{kd} = 1$. From Condition 3.1(ii), we can conclude that $w_{kd}$ and $\hat w_{knd}$ are bounded.

Since we know that $\hat w_{knd} \overset{P}{\to} w_{kd}$ and $\|\hat w_{knd} - w_{kd}\|^4$ is uniformly integrable, we conclude that $E(\|\hat w_{knd} - w_{kd}\|^4) \to 0$ as $n \to \infty$ by Vitali's convergence theorem. It implies that $E\{X'(\hat w_{knd} + w_{kd}) \, X'(\hat w_{knd} - w_{kd})\} \to 0$ as $n \to \infty$. Similarly, $E\{(\hat w_{knd} - w_{kd})' X Z_k\} \to 0$ as $n \to \infty$, which concludes the proof.□

Proof of Lemma 3.7

This follows immediately from the fact that the GMM estimators converge in probability combined with the CMT, and from Theorem 3.6.□

Proof of Theorem 3.8

From Lemma 3.7, we know that $\bar U_{nd} \overset{P}{\to} \bar U_d$ as $n \to \infty$. Moreover, from Theorem 3.5, it follows that $\hat Z_{kd} \overset{L_2}{\to} Z_k$ for each $k = 1, \dots, p$. Rewriting, we obtain

$$\hat\varepsilon_{jd} = \frac{1}{\gamma_j} \left( X_j - \sum_{k=1}^{p} \lambda_{jk} \hat Z_{kd} \right) = \varepsilon_j - \frac{1}{\gamma_j} \sum_{k=1}^{p} \lambda_{jk} (\hat Z_{kd} - Z_k)$$

and

$$\bar U_d = \bar\varepsilon_d - \frac{1}{d} \sum_{j=1}^{d} \sum_{k=1}^{p} \delta_{jk} (\hat Z_{kd} - Z_k),$$

where $\bar\varepsilon_d = (1/d) \sum_{j=1}^{d} \varepsilon_j \overset{P}{\to} m(V)$ by Theorem 5 in [9], and $E\{(\hat Z_{kd} - Z_k)^2\} < \xi / d$ for some constant $\xi > 0$, as follows from the proof of Theorem 3.5, so that

$$E\left[ \left\{ \frac{1}{d} \sum_{j=1}^{d} \sum_{k=1}^{p} \delta_{jk} (\hat Z_{kd} - Z_k) \right\}^2 \right] \leq \frac{p}{d} \sum_{j=1}^{d} \sum_{k=1}^{p} \delta_{jk}^2 \, E\{(\hat Z_{kd} - Z_k)^2\} \leq \frac{\xi p}{d} \sum_{k=1}^{p} \left( \frac{1}{d} \sum_{j=1}^{d} \delta_{jk}^2 \right) \to 0 \quad \text{as } d \to \infty.$$

It implies that $\bar U_d \overset{P}{\to} m(V)$ as $d \to \infty$, and hence $\bar U_{nd} \overset{P}{\to} m(V)$ as $n \to \infty$ and $d \to \infty$.□

Proof of Theorem 3.9

Let $\tilde\gamma_l = (1 - \gamma_l^2)^{1/2}$ and $\tilde\gamma_m = (1 - \gamma_m^2)^{1/2}$. Denote $W_l = \tilde\gamma_l^{-1} \sum_{k=1}^{p} \lambda_{lk} Z_k$ and $W_m = \tilde\gamma_m^{-1} \sum_{k=1}^{p} \lambda_{mk} Z_k$. Note that $(W_l, W_m)$ has a bivariate standard normal distribution. Let $\phi_{l,m}$ be the joint density of $(W_l, W_m)$ and $F_{l,m}$ be the joint cdf of $(\varepsilon_l, \varepsilon_m)$. We find

$$\Pr(X_l < z, X_m < z) = \int_{\mathbb{R}^2} F_{l,m} \left( \frac{z - \tilde\gamma_l w_l}{\gamma_l}, \frac{z - \tilde\gamma_m w_m}{\gamma_m} \right) \phi_{l,m}(w_l, w_m) \, \mathrm{d}w_l \, \mathrm{d}w_m \leq \int_{\mathbb{R}^2} \min\left\{ \Phi\left( \frac{z - \tilde\gamma_l w_l}{\gamma_l} \right), \Phi\left( \frac{z - \tilde\gamma_m w_m}{\gamma_m} \right) \right\} \phi_{l,m}(w_l, w_m) \, \mathrm{d}w_l \, \mathrm{d}w_m = \Pr(\tilde\gamma_l W_l + \gamma_l V < z, \tilde\gamma_m W_m + \gamma_m V < z) = \Phi_{\rho_{l,m}}(z, z),$$

where $V \sim N(0,1)$ is independent of $(W_l, W_m)$, and $\Phi_{\rho_{l,m}}$ is the cdf of the standard bivariate normal distribution with correlation $\rho_{l,m} = \tilde\gamma_l \tilde\gamma_m \operatorname{Cor}(W_l, W_m) + \gamma_l \gamma_m < 1$, so that

$$\lim_{z \to -\infty} \frac{\Pr(X_l < z, X_m < z)}{\Phi(z)} \leq \lim_{z \to -\infty} \frac{\Phi_{\rho_{l,m}}(z, z)}{\Phi(z)} = 0.□$$

References

[1] Bai, J., & Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221. doi:10.1111/1468-0262.00273

[2] Bai, J., & Ng, S. (2008). Large dimensional factor analysis. Foundations and Trends in Econometrics, 3(2), 89–163. doi:10.1561/0800000002

[3] Creal, D. D., & Tsay, R. S. (2015). High dimensional dynamic stochastic copula models. Journal of Econometrics, 189(2), 335–345. doi:10.1016/j.jeconom.2015.03.027

[4] Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (Vol. 5). Upper Saddle River, NJ: Prentice Hall.

[5] Kent, J., Bibby, J., & Mardia, K. (1979). Multivariate analysis. Amsterdam: Academic Press.

[6] Kreuzer, A., & Czado, C. (2021). Bayesian inference for a single factor copula stochastic volatility model using Hamiltonian Monte Carlo. Econometrics and Statistics, 19, 130–150. doi:10.1016/j.ecosta.2020.12.001

[7] Krupskii, P., & Joe, H. (2013). Factor copula models for multivariate data. Journal of Multivariate Analysis, 120, 85–101. doi:10.1016/j.jmva.2013.05.001

[8] Krupskii, P., & Joe, H. (2015). Structured factor copula models: Theory, inference and computation. Journal of Multivariate Analysis, 138, 53–73. doi:10.1016/j.jmva.2014.11.002

[9] Krupskii, P., & Joe, H. (2021). Approximate likelihood with proxy variables for parameter estimation in high-dimensional factor copula models. Statistical Papers, 63, 1–27. doi:10.1007/s00362-021-01252-1

[10] McNeil, A., Frey, R., & Embrechts, P. (2005). Quantitative risk management: Concepts, techniques and tools. Princeton, NJ: Princeton University Press.

[11] Nguyen, H., Ausín, M. C., & Galeano, P. (2020). Variational inference for high dimensional structured factor copulas. Computational Statistics and Data Analysis, 151, 107012. doi:10.1016/j.csda.2020.107012

[12] Oh, D. H., & Patton, A. J. (2013). Simulated method of moments estimation for copula-based multivariate models. Journal of the American Statistical Association, 108(502), 689–700. doi:10.1080/01621459.2013.785952

[13] Oh, D. H., & Patton, A. J. (2017). Modeling dependence in high dimensions with factor copulas. Journal of Business and Economic Statistics, 35(1), 139–154. doi:10.1080/07350015.2015.1062384

[14] Oh, D. H., & Patton, A. J. (2018). Time-varying systemic risk: Evidence from a dynamic copula model of CDS spreads. Journal of Business and Economic Statistics, 36(2), 181–195. doi:10.1080/07350015.2016.1177535

[15] Pesaran, M. H. (2006). Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica, 74(4), 967–1012. doi:10.1111/j.1468-0262.2006.00692.x

[16] Schamberger, B., Gruber, L., & Czado, C. (2017). Bayesian inference for latent factor copulas and application to financial risk forecasting. Econometrics, 5(2), 21. doi:10.3390/econometrics5020021

[17] Sklar, M. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8, 229–231.

[18] Stock, J. H., & Watson, M. W. (2002). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97(460), 1167–1179. doi:10.1198/016214502388618960

[19] Subbarao, C., Subbarao, N., & Chandu, S. (1996). Characterization of groundwater contamination using factor analysis. Environmental Geology, 28(4), 175–180. doi:10.1007/s002540050091

[20] Tan, B. K., Panagiotelis, A., & Athanasopoulos, G. (2019). Bayesian inference for the one-factor copula model. Journal of Computational and Graphical Statistics, 28(1), 155–173. doi:10.1080/10618600.2018.1482765

[21] Tsay, R. S. (2005). Analysis of financial time series (Vol. 543). New Jersey: John Wiley & Sons. doi:10.1002/0471746193

[22] Van Loan, C. F., & Golub, G. (1996). Matrix computations. Baltimore, MD: Johns Hopkins University Press.

Received: 2021-11-29
Revised: 2022-06-06
Accepted: 2022-07-12
Published Online: 2022-09-09

© 2022 Alex Verhoijsen and Pavel Krupskiy, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
