Article · Open Access

Fast estimation of Kendall's Tau and conditional Kendall's Tau matrices under structural assumptions

Rutger van der Spek and Alexis Derumigny
Published/Copyright: April 4, 2025

Abstract

Kendall’s tau and conditional Kendall’s tau matrices are multivariate (conditional) dependence measures between the components of a random vector. For large dimensions, available estimators are computationally expensive and can be improved by averaging. Under structural assumptions on the underlying Kendall’s tau and conditional Kendall’s tau matrices, we introduce new estimators that have a significantly reduced computational cost while keeping a similar error level. In the unconditional setting, we assume that, up to reordering, the underlying Kendall’s tau matrix is block structured with constant values in each of the off-diagonal blocks. Consequences on the underlying correlation matrix are then discussed. The estimators take advantage of this block structure by averaging over (part of) the pairwise estimates in each of the off-diagonal blocks. Derived explicit variance expressions show their improved efficiency. In the conditional setting, the conditional Kendall’s tau matrix is assumed to have a block structure, for some value of the conditioning variable. Conditional Kendall’s tau matrix estimators are constructed similarly as in the unconditional case by averaging over (part of) the pairwise conditional Kendall’s tau estimators. We establish their joint asymptotic normality and show that the asymptotic variance is reduced compared to the naive estimators. Then, we perform a simulation study that displays the improved performance of both the unconditional and conditional estimators. Finally, the estimators are used for estimating the value at risk of a large stock portfolio; backtesting illustrates the obtained improvements compared to the previous estimators.

1 Introduction

In dependence modeling, the main object of interest is the copula, which is a cumulative distribution function on $[0,1]^p$ with uniform margins, describing the links between the elements of a $p$-dimensional random vector $X$. However, the copula belongs to an infinite-dimensional space, and it is not easy to represent as soon as $p$ is larger than 3 or 4. In such cases, finite-dimensional statistics become more useful for understanding the dependence, the most well known of them being Kendall’s tau matrix.

Kendall’s tau between two random variables $X_i$ and $X_j$, denoted by $\tau_{i,j} = \tau(P_{i,j})$, is defined as the probability of concordance between two independent replications from the distribution $P_{i,j}$ of $(X_i, X_j)$, minus the probability of discordance. The equality $\tau(P_{i,j}) = 4 \int C_{i,j}\, \mathrm{d}C_{i,j} - 1$ relates Kendall’s tau to the copula $C_{i,j}$ of $X_i$ and $X_j$; we refer to [32] for an extensive introduction to Kendall’s tau and copulas.

When a covariate $Z \in \mathbb{R}^d$ is available, we can extend the definition of Kendall’s tau to the conditional setting. Conditional Kendall’s tau is then defined as $\tau_{i,j|Z=z} \coloneqq \tau(P_{(i,j)|Z=z})$, where $P_{(i,j)|Z=z}$ denotes the conditional law of $(X_i, X_j)$ given $Z = z$, for some $z \in \mathbb{R}^d$. In [10,20,45], smoothing-based estimators of conditional Kendall’s tau are studied. In [9], it is shown that the estimation of conditional Kendall’s tau can be written as a classification task, and classification algorithms are proposed to estimate it. In [11], a regression-type model is used to estimate conditional Kendall’s tau in a parametric conditional framework. In [3], conditional Kendall’s tau is used for hypothesis testing.

For a random vector $X$, we define the Kendall’s tau matrix by $T \coloneqq [\tau_{i,j}]_{1 \le i,j \le p}$, which contains all pairwise Kendall’s taus; in the conditional framework, its natural counterpart is the conditional Kendall’s tau matrix, denoted by $T_{|Z=z} \coloneqq [\tau_{i,j|Z=z}]_{1 \le i,j \le p}$. The Kendall’s tau matrix is especially useful for elliptical graphical models and their generalizations, see [4,26]. In the study by Lu et al. [28], a time-varying graphical model is studied using an estimate of the conditional Kendall’s tau matrix. The Kendall’s tau matrix plays an important role since it allows robust estimation of the dependence and can be used to fit an appropriate copula [19]. In an elliptical distribution framework, it can also be used to estimate the Value at Risk of a portfolio, see [37,42].

Estimation of the $p \times p$ Kendall’s tau matrix $T$ becomes particularly challenging in the high-dimensional setting when $p$ is large. Simple use of the naive Kendall’s tau matrix estimator consisting of all pairwise sample Kendall’s taus will result in noisy estimates, with estimation errors piling up due to the estimates’ individual imprecision [15]. Over the past two decades, various regularization strategies have been proposed to reduce the aggregation of estimation errors. Ultimately, these methods all make certain assumptions on the underlying dependence structure, thereby reducing the number of free parameters to estimate.

In many instances, sparsity of the target matrix is assumed. For such settings, various (combinations of) thresholding and shrinkage methods have been proposed [5,21,39]. However, such assumptions are certainly not appropriate for the modeling of most financial data, e.g., market risk is reflected in all share prices, and therefore, their returns are certainly correlated. To this end, factor models are usually imposed, where the correlations depend on a number of common factors, which may or may not be latent [15,16].

In the studies by Perreault et al. [34,35], an alternative approach to estimating large Kendall’s tau matrices was introduced. They studied a model in which it is assumed that the set of variables can be partitioned into smaller clusters with exchangeable dependence. As such, after reordering the variables by cluster, the corresponding Kendall’s tau matrix is block structured with constant values within each block. A natural consequence is improved estimation by averaging all pairwise sample Kendall’s taus within each of the blocks. In addition, they proposed a robust algorithm for identifying such structures (see also [36] for testing for the presence of such a structure).

In this article, we study a framework similar to that of [35], where we relax the partial exchangeability assumption: we only assume that the off-diagonal blocks of the Kendall’s tau matrix are constant. One of the drawbacks of the estimator studied in [35] is its computational cost, which is close to that of the naive Kendall’s tau matrix estimator: the number of pairwise sample Kendall’s taus to be computed scales quadratically with the dimension $p$.

Naturally, the idea of averaging several Kendall’s taus can be applied to only part of each block, which allows for faster computations. As such, we propose several estimators that average over part of the Kendall’s taus in each off-diagonal block, and we study their efficiencies and computational costs. For every off-diagonal block, we consider averaging over elements in the same row, averaging over elements on the diagonal, and averaging over a number of randomly selected elements. We will refer to these estimators as the row, diagonal, and random estimators; the estimator that averages over all elements is referred to as the block estimator.

We then extend this model to the conditional setup: the conditional Kendall’s taus depend on $z$ and are assumed to be clustered such that, for all $z \in \mathbb{R}^d$, the Kendall’s tau conditionally on $Z = z$ between variables of different groups depends only on the group numbers and on the value of $z$. In view of applications to finance, the conditional version of our structural assumption could, for example, be seen as assuming that the correlations between European stocks of two different groups are equal and react equally to changes of some other American stock or portfolio. Furthermore, it was shown in [2,14,27] that stock returns actually exhibit higher correlations during market declines than during market upturns, and the same was shown for exchange rates in [33]. In such a model, it is also important to limit computation times and to study improved estimators that can take advantage of the block structure of the Kendall’s tau matrix.

In this framework, we adopt nonparametric estimates of the conditional Kendall’s tau based on kernel smoothing. On the basis of these nonparametric estimates, we introduce conditional versions of the averaging estimators and study their asymptotic behavior as the sample size $n$ tends to infinity. It is worth noting that conditional estimates of Kendall’s tau using kernel smoothing carry a significantly higher computational cost than their unconditional counterparts, especially when the covariate’s dimension $d$ is large. Therefore, faster computation of conditional Kendall’s tau matrices is of particular use in the conditional, nonparametric setup.

The rest of this article is structured as follows. In Section 2, we present the unconditional framework, and detail a few consequences on the correlation matrix. Then we construct the different estimators in this framework and derive variance expressions. Similarly, Section 3 is devoted to the improved estimation of the conditional Kendall’s tau matrix, where we propose averaged conditional estimators and we derive the estimators’ joint asymptotic normality. In Section 4, we perform a simulation study in order to support the theoretical findings. Finally, in Section 5, we examine a possible application to study the behavior of the estimators in real data conditions. The estimators are used for the robust inference of the covariance matrix to estimate the value at risk of a large stock portfolio. Proofs are postponed to the Appendix.

Notations. We denote by $\mathbf{1}$ the vector and the matrix with all entries equal to 1, whose dimensions can be inferred from the context. For a matrix $M$ of size $p \times p$ and a set of indices $J \subset \{1, \ldots, p\}^2$, we denote by $[M]_J$ the submatrix $(M_j)_{j \in J}$.

2 Fast estimation of Kendall’s tau matrix

2.1 The structural assumption

Let $n \ge 2$ and assume that we observe $n$ i.i.d. replications $X_i = (X_{i,1}, \ldots, X_{i,p})$, $i = 1, \ldots, n$, of a random vector $X = (X_1, \ldots, X_p) \in \mathbb{R}^p$. Moreover, assume that the Kendall’s tau matrix $T \coloneqq [\tau_{i,j}]_{1 \le i,j \le p}$ of $X$ satisfies the following structural assumption.

Assumption 1

(Structural assumption) There exist $K > 0$, a partition $\mathcal{G} = \{G_1, \ldots, G_K\}$ of $\{1, \ldots, p\}$, a set $J \subset \{1, \ldots, K\}^2$, and constants $(\tau_{k_1,k_2})_{(k_1,k_2) \in J} \in [-1, 1]^{|J|}$ such that, for all $(k_1, k_2) \in J$,

$$[T]_{G_{k_1} \times G_{k_2}} = \tau_{k_1,k_2}\, \mathbf{1}.$$

Note that, after reordering the variables by group, the corresponding Kendall’s tau matrix is block structured with constant values in some of the off-diagonal blocks. The interest in investigating this structural assumption originates from applications in stock return modeling. In this context, the clustering of the variables could be considered as grouping companies by sector or economy. It then seems at least intuitive to assume that companies from different groups have correlations that depend only on the groups they are in, without making any assumptions on the correlations between companies from the same group. We will therefore call $\tau_{k_1,k_2}$ the intergroup Kendall’s tau between the groups $G_{k_1}$ and $G_{k_2}$.

This can be clearly seen in Figure 1: in each of the off-diagonal blocks, the Kendall’s tau is mostly homogeneous, but significant differences can be seen in the fourth diagonal block. Indeed, it gathers companies whose link to other groups is constant, but with different relationships inside the group. This may be explained by the presence of subgroups inside this fourth group, even if the relationship with variables from other groups does not seem to be related to these subgroup structures.

Figure 1: Heatmap plots of the sample Kendall’s tau matrix computed on the daily log returns from 01 January 2007 until 14 January 2022 of all 240 portfolio stocks (whose list is available in Appendix C). (a) Unclustered and (b) clustered.

Obviously, the structural assumption is satisfied for any set of variables by using only groups of size 1. Therefore, assuming larger groups makes the assumption more constraining. Indeed, in this framework, the Kendall’s tau matrix depends on

$$\frac{1}{2} K(K-1) + \frac{1}{2} \sum_{k=1}^{K} |G_k| \, (|G_k| - 1)$$

free parameters. For a dimension of 100, assuming we can split the variables into $K = 10$ groups of equal size, this translates into a reduction by a factor of 10 of the number of free parameters to estimate (from 4950 to 495). Such a reduction suggests that the use of appropriate estimators can lead to significant estimation improvements. We will define estimators of the Kendall’s tau matrix $T$ under Assumption 1 for some known partition $\mathcal{G}$ of $\{1, \ldots, p\}$. Note that such a partition can also be inferred from the data, see [35,36] and the thesis [34]. Even if Assumption 1 is not satisfied, the estimators that we will propose can still be of interest, for example, for linear shrinkage.
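As a quick sanity check, these parameter counts can be reproduced numerically; a minimal R sketch:

```r
# Number of free parameters of a p x p Kendall's tau matrix:
# one per unordered pair of variables.
p <- 100
choose(p, 2)  # 4950

# Under Assumption 1, with K groups whose sizes are given in `sizes`:
# K(K-1)/2 intergroup values plus the free intragroup entries.
K <- 10
sizes <- rep(p / K, K)
K * (K - 1) / 2 + sum(sizes * (sizes - 1) / 2)  # 495
```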

Note that, although our results are only interesting when $k_1 \ne k_2$, they also hold if a pair of the form $(k_1, k_1)$ belongs to $J$. Indeed, in this case, we must have $\tau_{k_1,k_1} = 1$, and then all pairs of observations in the block will be concordant, so all our estimators will be equal to 1.

In this article, we are interested in the estimation of the intergroup Kendall’s tau $\tau_{k_1,k_2}$ for some pair $(k_1, k_2)$ belonging to $J$, i.e., such that the block $[T]_{G_{k_1} \times G_{k_2}}$ is a constant block. This means that the random vector $(X_{G_{k_1}}, X_{G_{k_2}})$ satisfies the following assumption.

Assumption 2

(Simplified structural assumption) Let $X$ be a $p$-dimensional random vector of interest. The Kendall’s tau matrix of $X$ can be written in the block form

$$T = \begin{pmatrix} * & \tau \mathbf{1} \\ \tau \mathbf{1} & * \end{pmatrix},$$

where $\tau \mathbf{1}$ represents a block filled with the value $\tau \in [-1, 1]$ and the symbols $*$ represent arbitrary matrices of respective sizes $b_1 \times b_1$ and $b_2 \times b_2$, for some $b_1 \in \{1, \ldots, p\}$ and $b_2 \coloneqq p - b_1$.

Under Assumption 2, we call $\tau$ the intergroup Kendall’s tau, as a particular case of the previous framework. As discussed earlier, Assumption 2 may seem more constraining than the previous Assumption 1, but both assumptions are actually equivalent insofar as we are only interested in the estimation of each interblock Kendall’s tau $\tau_{k_1,k_2}$. Therefore, and in order to simplify the notation, we will choose to work under Assumption 2. As with the more general Assumption 1, Assumption 2 can be tested using the framework developed in [36].

Note that [35] proposed a similar model with a more restrictive version of Assumption 1, the partial exchangeability assumption, by assuming that the variables could be partitioned into K clusters with exchangeable dependence.

Assumption 3

(Partial exchangeability assumption) For $j \in \{1, \ldots, p\}$, let $U_j = F_j(X_j)$, where $F_j$ is the cumulative distribution function of $X_j$, and let $C$ be the copula of $X$. For any partition $\mathcal{G} = \{G_1, \ldots, G_K\}$ of $\{1, \ldots, p\}$, let $\pi(\mathcal{G})$ be the set of permutations $\pi$ of $\{1, \ldots, p\}$ such that for all $j \in \{1, \ldots, p\}$ and all $k \in \{1, \ldots, K\}$, $\pi(j) \in G_k$ if and only if $j \in G_k$.

A partition $\mathcal{G} = \{G_1, \ldots, G_K\}$ satisfies the partial exchangeability assumption if for any $u_1, \ldots, u_p \in [0, 1]$ and any permutation $\pi \in \pi(\mathcal{G})$, one has

$$C(u_1, \ldots, u_p) = C(u_{\pi(1)}, \ldots, u_{\pi(p)}),$$

or, equivalently, $(U_1, \ldots, U_p) \stackrel{\text{law}}{=} (U_{\pi(1)}, \ldots, U_{\pi(p)})$.

Note that the partial exchangeability assumption imposes restrictions on the underlying copula, whereas Assumption 1 only does so on the underlying Kendall’s tau matrix, making the latter much less restrictive. Further, under Assumption 3, the Kendall’s tau matrix is fully block structured, including constant diagonal blocks, after reordering of the variables. In contrast to [35], we are more interested in a model where we consider neither partial exchangeability nor constant interdependence of marginal variables within the same cluster. Particularly in view of the aforementioned application to stock returns, the partial exchangeability assumption seems quite restrictive, and a model without partial exchangeability, in which companies from the same cluster have different mutual dependence, is more plausible (see Figure 1). For these reasons, we opt for a more flexible variant of this model.

2.2 Consequences of the block structure on the correlation matrix

As explained in [24, Chapter 3], if $X$ follows a multivariate Gaussian distribution with exchangeable dependence and correlation $\rho \in (-1, 1)$, then $\mathrm{Corr}(X)$ is positive definite if and only if $\rho > -1/(p-1)$. In terms of Kendall’s tau, this constraint translates as $\tau > (2/\pi) \arcsin(-1/(p-1))$. Assumption 1 is weaker than exchangeable dependence, and even in the Gaussian setting with two groups of variables, it allows for arbitrary negative correlation in the off-diagonal block. The results presented in this section are related to those of [6], who study the eigenstructure of such block-structured correlation matrices, but without giving precise constraints on the allowed values of the correlation. McNeil et al. [30] discuss the attainability of Kendall’s tau matrices in a general framework, i.e., without discussing the block structure specifically.

Proposition 1

Let $b_1, b_2 \ge 2$ be integers. Let $\rho_1, \rho_2, \rho_3 \in (-1, 1)$, and define the block matrix $M \in \mathbb{R}^{(b_1+b_2) \times (b_1+b_2)}$, with blocks of sizes $b_1$ and $b_2$, by

$$M \coloneqq \begin{pmatrix} I + \rho_1 \tilde{I} & \rho_3 \mathbf{1} \\ \rho_3 \mathbf{1} & I + \rho_2 \tilde{I} \end{pmatrix},$$

where $I$ is the identity matrix and $\tilde{I} \coloneqq \mathbf{1} - I$ is the matrix with 1 at each off-diagonal entry and 0 on the diagonal. Then $M$ is positive definite if and only if

$$(b_1 b_2 - b_1 - b_2 + 1)\, \rho_1 \rho_2 - b_1 b_2\, \rho_3^2 + (b_2 - 1)\rho_2 + (b_1 - 1)\rho_1 + 1 > 0. \tag{1}$$

Furthermore, this inequality is satisfied as soon as

$$\rho_1 > \frac{b_1 b_2\, \rho_3^2 - (b_2 - 1)\rho_2 - 1}{(b_1 b_2 - b_1 - b_2 + 1)\rho_2 + b_1 - 1}, \qquad \rho_2 > -\frac{b_1 - 1}{b_1 b_2 - b_1 - b_2 + 1}.$$

This result is proved in Appendix B.1. We can remark that the constraint (1) is always satisfied as soon as

$$\rho_1 \rho_2 \ge \rho_3^2, \tag{2}$$

i.e., the absolute value of $\rho_3$ has to be smaller than the geometric mean of $\rho_1$ and $\rho_2$. Furthermore, in the high-dimensional setting, where $b_1, b_2 \to +\infty$ and $\rho_1, \rho_2$ are fixed and positive, (1) actually becomes equivalent to the simplified constraint (2). Note that (1) allows for situations where $\rho_3$ is arbitrarily close to 1, for any choice of block sizes; this is possible, for example, by setting $\rho_1 = \rho_2 = \rho_3$. Concretely, if all variables of one group are to have a high correlation with all variables of the second group, then within each group the intragroup correlation must be high. Such a result translates directly to the Kendall’s tau matrix via the relationship $\tau = (2/\pi) \arcsin(\rho)$, allowing for Kendall’s tau matrices with arbitrary entries in the off-diagonal blocks.

Rather surprisingly, as soon as the group sizes $b_1$ and $b_2$ are large enough, they no longer appear in the constraint (2). This phenomenon is in fact typical of block-structured matrices and will appear again in the performance of our estimators in the next sections. We now give a lower bound for the intergroup Kendall’s tau in the setting where $K$ groups are present, with equal intergroup Kendall’s taus. It is proved in Appendix B.1.

Proposition 2

Let $K \ge 2$ and let $b_1, \ldots, b_K$ be positive integers. Let $\rho \in (-1, 1)$, let $\Sigma_1, \ldots, \Sigma_K$ be $K$ correlation matrices of sizes $b_1, \ldots, b_K$, respectively, and let $M \in \mathbb{R}^{(b_1 + \cdots + b_K) \times (b_1 + \cdots + b_K)}$ be the block matrix defined by

$$M \coloneqq \begin{pmatrix} \Sigma_1 & \rho \mathbf{1} & \cdots & \rho \mathbf{1} \\ \rho \mathbf{1} & \Sigma_2 & \cdots & \rho \mathbf{1} \\ \vdots & \vdots & \ddots & \vdots \\ \rho \mathbf{1} & \rho \mathbf{1} & \cdots & \Sigma_K \end{pmatrix}.$$

Then

$$\inf\{\rho \in (-1, 1) : \exists\, \Sigma_1, \ldots, \Sigma_K \text{ such that } M \text{ is a correlation matrix}\} = -\frac{1}{K - 1}.$$

Interestingly, this bound does not depend on the sizes $b_1, \ldots, b_K$ of the blocks. This shows that, for such a matrix, the structure of each diagonal block does not have a strong influence on the possible choice of the intergroup Kendall’s tau. The constraint $\rho \ge -1/(p-1)$ for exchangeable correlation matrices becomes, in this framework, $\rho \ge -1/(K-1)$ for exchangeable intergroup dependence, suggesting that the number of blocks (instead of the number of variables) becomes the relevant dimension for this problem. Nevertheless, knowledge of the intragroup correlation matrices $\Sigma_1, \ldots, \Sigma_K$ will still constrain the range of possible values of $\rho$, as was the case in Proposition 1 for the particular case $K = 2$ with exchangeable dependence in each block.

2.3 Construction of estimators

First, note that we can naturally rely on the usual estimator of Kendall’s tau between $X_{j_1}$ and $X_{j_2}$, defined by

$$\widehat{\tau}_{j_1,j_2} \coloneqq \frac{2}{n(n-1)} \sum_{i_1 < i_2} \mathrm{sign}\big((X_{i_1,j_1} - X_{i_2,j_1})(X_{i_1,j_2} - X_{i_2,j_2})\big), \tag{3}$$

for any $1 \le j_1, j_2 \le p$. We denote the corresponding Kendall’s tau matrix estimator by $\widehat{T} = [\widehat{\tau}_{j_1,j_2}]_{1 \le j_1, j_2 \le p}$, which serves as a first-step estimator for obtaining a better estimator of the Kendall’s tau matrix. This estimated Kendall’s tau matrix $\widehat{T}$ does not make any use of the underlying structure and is therefore a rather naive tool in practice. As discussed in the previous section, we will introduce several estimators in the setting of Assumption 2, with straightforward generalizations under Assumption 1. More precisely, for every estimator $\widehat{\tau}^{A2}$ of $\tau$ under Assumption 2, we define a corresponding matrix estimator $\widehat{T}^{A1}$ under Assumption 1 by

$$\widehat{T}^{A1} \coloneqq [\widehat{T}^{A1}_{j_1,j_2}]_{1 \le j_1, j_2 \le p}, \qquad \widehat{T}^{A1}_{j_1,j_2} = \begin{cases} \widehat{\tau}^{A2}(G_{k_1}, G_{k_2}), & \text{if } j_1 \in G_{k_1},\ j_2 \in G_{k_2} \text{ and } (k_1, k_2) \in J, \\ \widehat{\tau}_{j_1,j_2}, & \text{else.} \end{cases} \tag{4}$$

Remember that $J$ is the set of pairs $(k_1, k_2)$ of block indices such that Kendall’s tau $\tau_{j_1,j_2}$ does not depend on $j_1 \in G_{k_1}$, $j_2 \in G_{k_2}$. Since we assume that the Kendall’s taus in (some of) the off-diagonal blocks are equal, the idea of averaging the pairwise sample Kendall’s taus follows naturally. Let us introduce the block estimator $\widehat{\tau}^B$, which averages all sample Kendall’s taus within each of the off-diagonal blocks. Formally, we have

$$\widehat{\tau}^B \coloneqq \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \widehat{\tau}_{j_1,j_2}.$$

We define $\widehat{T}^B$, the corresponding estimator under Assumption 1, by Equation (4).

Under the partial exchangeability assumption, Perreault et al. [35] showed that the estimator $\widehat{T}^B$ is asymptotically normal and optimal with respect to the Mahalanobis distance. However, in terms of computational efficiency, the block estimator $\widehat{T}^B$ shows no improvement over the usual estimator $\widehat{T}$, as both estimators require the computation of the usual Kendall’s tau between all pairs of variables anyway.

To reduce the computation time, we propose not to average over all Kendall’s taus in the block, but only over some of them, leading to computationally cheaper estimates. Naturally, the question arises which elements to average over. For this purpose, we introduce several estimators that average over different subsets of elements within each of the off-diagonal blocks.

We introduce two estimators that each average $N \in \{1, \ldots, b_1 b_2\}$ pairs in the off-diagonal block under Assumption 2, so that we can compare estimators that average either pairs in the same row/column or pairs on the diagonal. Without loss of generality, since we can switch both blocks of variables, we assume that $b_1 \le b_2$, and we will average over the row. For averaging pairs on the diagonal, it is moreover required that $N \le b_1 \wedge b_2$. The number of Kendall’s tau estimates is then reduced to scaling linearly with the group size, a significant improvement over the previous quadratic growth. We set

$$\widehat{\tau}^R \coloneqq \frac{1}{N} \sum_{j=1}^{N} \widehat{\tau}_{1,\, b_1 + j}, \qquad \widehat{\tau}^D \coloneqq \frac{1}{N} \sum_{j=1}^{N} \widehat{\tau}_{j,\, b_1 + j}.$$

Then, the “row-based” Kendall’s tau matrix estimator $\widehat{T}^R$ and the “diagonal-based” Kendall’s tau matrix estimator $\widehat{T}^D$ are defined by Equation (4) for the choices $\widehat{\tau}^{A2} = \widehat{\tau}^R$ and $\widehat{\tau}^{A2} = \widehat{\tau}^D$, respectively. As such, for each of the off-diagonal blocks, $\widehat{T}^R$ averages only the pairs on the first line along the largest side, whereas $\widehat{T}^D$ averages only the pairs along the first diagonal.

Finally, we introduce the estimator that randomly selects the pairs to average over in each block. We denote the (deterministic) number of averaged pairs per block by $N \in \{1, \ldots, b_1 \times b_2\}$ and the corresponding estimator by $\widehat{\tau}^U$. The pairs are selected with uniform probability and without replacement. We define

$$\widehat{\tau}^U \coloneqq \frac{1}{N} \sum_{j_1=1}^{b_1} \sum_{j_2=1}^{b_2} W_{j_1,j_2}\, \widehat{\tau}_{j_1,\, b_1 + j_2},$$

where $W$ is a $b_1 \times b_2$ matrix of random weights that selects $N$ pairs per off-diagonal block with uniform probability and without replacement: $W_{j_1,j_2} = 1$ corresponds to selecting the pair $(X_{j_1}, X_{b_1+j_2})$ and $W_{j_1,j_2} = 0$ to passing over it. The corresponding matrix estimator is then denoted by $\widehat{T}^U$, with $W$ defined so as to select $N$ pairs in each of the averaged blocks.
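To make these constructions concrete, here is a minimal R sketch of the four averaging schemes, operating on a precomputed matrix of pairwise sample Kendall’s taus (the function names are ours, not those of any published code):

```r
# Averaging estimators of the intergroup Kendall's tau under Assumption 2.
# `tau_hat` is the p x p matrix of pairwise sample Kendall's taus, with
# the first b1 variables forming group 1 and the remaining ones group 2.
block_tau <- function(tau_hat, b1) {
  p <- ncol(tau_hat)
  mean(tau_hat[1:b1, (b1 + 1):p])             # tau^B: whole block
}
row_tau <- function(tau_hat, b1, N) {
  mean(tau_hat[1, b1 + (1:N)])                # tau^R: first row
}
diag_tau <- function(tau_hat, b1, N) {
  mean(tau_hat[cbind(1:N, b1 + (1:N))])       # tau^D: first diagonal
}
random_tau <- function(tau_hat, b1, N) {
  p <- ncol(tau_hat)
  block <- tau_hat[1:b1, (b1 + 1):p]
  mean(block[sample(length(block), N)])       # tau^U: N uniform pairs
}
```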

2.4 Comparison of their variances

Before we proceed with the main theoretical results on the estimators’ variances, let us introduce some auxiliary notation. For every $j_1, j_2 \in \{1, \ldots, p\}$, we set $P_{j_1,j_2} \coloneqq \mathbb{P}\big((X_{1,j_1} - X_{2,j_1})(X_{1,j_2} - X_{2,j_2}) > 0\big)$. The quantity $P_{j_1,j_2}$ is equal to the probability of concordance of the variables $X_{j_1}$ and $X_{j_2}$, and thus $\tau_{j_1,j_2} = 2 P_{j_1,j_2} - 1$. As such, the structural Assumption 2 ensures that $P_{j_1,j_2}$ is independent of the choice of the pair whenever $j_1$ and $j_2$ are not in the same block. Alternatively, we can write $P_{j_1,j_2}$ in terms of the copula $C_{j_1,j_2}$ of $(X_{j_1}, X_{j_2})$ as $P_{j_1,j_2} = 2 \int_{[0,1]^2} C_{j_1,j_2}(u_1, u_2)\, \mathrm{d}C_{j_1,j_2}(u_1, u_2)$. By extension, we define

$$P_{j_1,j_2,j_3,j_4} \coloneqq \mathbb{P}\big((X_{1,j_1} - X_{2,j_1})(X_{1,j_2} - X_{2,j_2}) > 0,\ (X_{1,j_3} - X_{2,j_3})(X_{1,j_4} - X_{2,j_4}) > 0\big),$$

$$Q_{j_1,j_2,j_3,j_4} \coloneqq \mathbb{P}\big((X_{1,j_1} - X_{2,j_1})(X_{1,j_2} - X_{2,j_2}) > 0,\ (X_{1,j_3} - X_{3,j_3})(X_{1,j_4} - X_{3,j_4}) > 0\big),$$

$$S_{j_1,j_2,j_3,j_4} \coloneqq \mathbb{P}\big((X_{1,j_1} - X_{2,j_1})(X_{1,j_2} - X_{2,j_2}) > 0,\ (X_{4,j_3} - X_{3,j_3})(X_{4,j_4} - X_{3,j_4}) > 0\big) = P_{j_1,j_2}\, P_{j_3,j_4},$$

for every $j_1, j_2, j_3, j_4 \in \{1, \ldots, p\}$. Note that both the $P$ and $Q$ quantities can be understood as “cross-concordance measures,” but there is an important difference between them: the $Q$ measure of cross-concordance requires a third copy $X_{3,1:p}$. When $j_1 = j_3$ and $j_2 = j_4$, we obtain $P_{j_1,j_2,j_3,j_4} = P_{j_1,j_2}$. $Q$-type measures of cross-concordance naturally appear in the asymptotic variance of the usual estimator of Kendall’s tau through the particular case

$$Q_{j_1,j_2} \coloneqq Q_{j_1,j_2,j_1,j_2} = \mathbb{P}\big((X_{1,j_1} - X_{2,j_1})(X_{1,j_2} - X_{2,j_2}) > 0,\ (X_{1,j_1} - X_{3,j_1})(X_{1,j_2} - X_{3,j_2}) > 0\big) = \int_{[0,1]^2} \big(C_{j_1,j_2}(u_1, u_2) + \bar{C}_{j_1,j_2}(u_1, u_2)\big)^2\, \mathrm{d}C_{j_1,j_2}(u_1, u_2), \tag{5}$$

where $\bar{C}$ denotes the survival function of a copula $C$. For $S$, a fourth independent copy is needed, but by independence, it reduces to the product $P_{j_1,j_2}\, P_{j_3,j_4}$. For completeness, this expression is derived in Appendix B.2; note that this equality was already given in [19, Equation (8)]. Both $P$ and $Q$ can be written as eight-dimensional integrals involving the copula of $(X_{j_1}, X_{j_2}, X_{j_3}, X_{j_4})$; as a consequence, they are functions of the joint law $P_{(j_1,j_2,j_3,j_4)}$ of the random vector $X_{(j_1,j_2,j_3,j_4)}$. Note further that, even under Assumption 1, they both depend on the choice of pairs within the considered off-diagonal block. Obviously, this is not the case under the stronger Assumption 3.
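These concordance probabilities are straightforward to approximate by Monte Carlo; a small R sketch of our own (the three independent copies are obtained by sampling triples of distinct rows):

```r
# Monte Carlo approximation of the concordance probability P_{j1,j2}
# and of the cross-concordance Q_{j1,j2} from eq. (5).
estimate_PQ <- function(X, j1, j2, B = 1e4) {
  n <- nrow(X)
  idx <- t(replicate(B, sample(n, 3)))  # B triples of distinct rows
  d12 <- (X[idx[, 1], j1] - X[idx[, 2], j1]) *
         (X[idx[, 1], j2] - X[idx[, 2], j2])
  d13 <- (X[idx[, 1], j1] - X[idx[, 3], j1]) *
         (X[idx[, 1], j2] - X[idx[, 3], j2])
  c(P = mean(d12 > 0), Q = mean(d12 > 0 & d13 > 0))
}
# Sanity check: for continuous data, 2 * P - 1 should be close to
# cor(X[, j1], X[, j2], method = "kendall").
```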

To give explicit expressions for our estimators, we need to average P , Q , and S quantities. We need to separate these averages depending on the number of common variables in the cross-concordance terms. We define

$$P_{B,2} \coloneqq \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} P_{j_1,j_2},$$

$$P_{B,1} \coloneqq \frac{1}{b_1(b_1-1)b_2 + b_2(b_2-1)b_1} \left( \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \sum_{\substack{j_3=1 \\ j_3 \ne j_1}}^{b_1} P_{j_1,j_2,j_3,j_2} + \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \sum_{\substack{j_4=b_1+1 \\ j_4 \ne j_2}}^{p} P_{j_1,j_2,j_1,j_4} \right),$$

$$P_{B,0} \coloneqq \frac{1}{b_1(b_1-1)b_2(b_2-1)} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \sum_{\substack{j_3=1 \\ j_3 \ne j_1}}^{b_1} \sum_{\substack{j_4=b_1+1 \\ j_4 \ne j_2}}^{p} P_{j_1,j_2,j_3,j_4},$$

and similarly, we define $P_{\alpha,\beta}$, $Q_{\alpha,\beta}$, and $S_{\alpha,\beta}$ for $\alpha \in \{B, R, D\}$ and $\beta \in \{0, 1, 2\}$. In the latter expressions, $\alpha$ denotes the type of averaging (respectively, over the block, over the row, and over the diagonal) and $\beta$ denotes the number of common variables, i.e., the size of $\{j_1, j_2\} \cap \{j_3, j_4\}$. This means that, for $\beta = 2$, $P_{B,2}$ is the average of the $P_{j_1,j_2,j_3,j_4}$ over the set of $(j_1, j_2, j_3, j_4)$ such that $j_1 = j_3$ and $j_2 = j_4$ (two common variables, since both variables are the same). $P_{B,1}$ is the average of the $P_{j_1,j_2,j_3,j_4}$ over the set of $(j_1, j_2, j_3, j_4)$ such that either $j_1 = j_3$ or $j_2 = j_4$, but not both (one common variable, since one is the same and one is different). Finally, $P_{B,0}$ is the average of the $P_{j_1,j_2,j_3,j_4}$ over the set of $(j_1, j_2, j_3, j_4)$ such that $j_1 \ne j_3$ and $j_2 \ne j_4$ (no common variables, as all variables are different). To deal with the random estimator, we need to define the corresponding weighted quantities for a $p \times p$ matrix $w$:

$$P_{w,2} \coloneqq \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} w_{j_1,j_2}\, P_{j_1,j_2},$$

$$P_{w,1} \coloneqq \frac{1}{b_1(b_1-1)b_2 + b_2(b_2-1)b_1} \left( \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \sum_{\substack{j_3=1 \\ j_3 \ne j_1}}^{b_1} w_{j_1,j_2} w_{j_3,j_2}\, P_{j_1,j_2,j_3,j_2} + \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \sum_{\substack{j_4=b_1+1 \\ j_4 \ne j_2}}^{p} w_{j_1,j_2} w_{j_1,j_4}\, P_{j_1,j_2,j_1,j_4} \right),$$

$$P_{w,0} \coloneqq \frac{1}{b_1(b_1-1)b_2(b_2-1)} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \sum_{\substack{j_3=1 \\ j_3 \ne j_1}}^{b_1} \sum_{\substack{j_4=b_1+1 \\ j_4 \ne j_2}}^{p} w_{j_1,j_2} w_{j_3,j_4}\, P_{j_1,j_2,j_3,j_4},$$

and similarly for $Q_{w,\beta}$ and $S_{w,\beta}$, for $\beta \in \{0, 1, 2\}$.

Now that we have all auxiliary notations in place, let us start by showing that each of the estimators is in fact a U-statistic.

Lemma 3

Under Assumption 2, the estimators $\widehat{\tau}_{j_1,j_2}$, $\widehat{\tau}^B$, $\widehat{\tau}^R$, $\widehat{\tau}^D$, and $\widehat{\tau}^U$ are all second-order U-statistics, in the sense that they can be written as $(n(n-1))^{-1} \sum_{1 \le i_1 \ne i_2 \le n} g(X_{i_1}, X_{i_2})$ for some real-valued function $g$. We denote these functions, respectively, by $g^*$, $g^B$, $g^R$, $g^D$, and $g^U$.

This lemma is proved in Appendix B.3, where the expressions of the kernels $g^*$, $g^B$, $g^R$, $g^D$, and $g^U$ are given. From Lemma 3 and the fact that $\mathbb{E}[g(X_1, X_2)] = \tau$ for each kernel $g$, it follows that $\widehat{T}$, $\widehat{T}^B$, $\widehat{T}^R$, $\widehat{T}^D$, and $\widehat{T}^U$ are all unbiased estimators of the Kendall’s tau matrix under Assumption 1. The finite sample variances are given in the following theorem, proved in Appendix B.4.

Theorem 4

Let $1 \le j_1, j_2 \le p$, $1 \le b_1 \le p$ and let $b_2 \coloneqq p - b_1$. The variances of $\widehat{\tau}_{j_1,j_2}$, $\widehat{\tau}^B$, $\widehat{\tau}^R$, $\widehat{\tau}^D$, and $\widehat{\tau}^U$, and the conditional variance of $\widehat{\tau}^U$ given $W$, are as follows:

  1. $\mathrm{Var}[\widehat{\tau}_{j_1,j_2}] = \frac{8}{n(n-1)} \big[ 2(n-2)(Q_{j_1,j_2} - P_{j_1,j_2}^2) + P_{j_1,j_2} - P_{j_1,j_2}^2 \big]$.

  2. $\mathrm{Var}[\widehat{\tau}^B] = \frac{8}{b_1 b_2 n(n-1)} \big[ P_{B,2} - S_{B,2} + (b_1 + b_2 - 2)(P_{B,1} - S_{B,1}) + (b_1 - 1)(b_2 - 1)(P_{B,0} - S_{B,0}) + 2(n-2) \big( Q_{B,2} - S_{B,2} + (b_1 + b_2 - 2)(Q_{B,1} - S_{B,1}) + (b_1 - 1)(b_2 - 1)(Q_{B,0} - S_{B,0}) \big) \big]$.

  3. $\mathrm{Var}[\widehat{\tau}^R] = \frac{8}{N n(n-1)} \big[ P_{R,2} - S_{R,2} + (N-1)(P_{R,1} - S_{R,1}) + 2(n-2) \big( Q_{R,2} - S_{R,2} + (N-1)(Q_{R,1} - S_{R,1}) \big) \big]$.

  4. $\mathrm{Var}[\widehat{\tau}^D] = \frac{8}{N n(n-1)} \big[ P_{D,2} - S_{D,2} + (N-1)(P_{D,0} - S_{D,0}) + 2(n-2) \big( Q_{D,2} - S_{D,2} + (N-1)(Q_{D,0} - S_{D,0}) \big) \big]$.

  5. $\mathrm{Var}[\widehat{\tau}^U \mid W] = \frac{8}{b_1 b_2 n(n-1)} \big[ P_{W,2} - S_{W,2} + (b_1 + b_2 - 2)(P_{W,1} - S_{W,1}) + (b_1 - 1)(b_2 - 1)(P_{W,0} - S_{W,0}) + 2(n-2) \big( Q_{W,2} - S_{W,2} + (b_1 + b_2 - 2)(Q_{W,1} - S_{W,1}) + (b_1 - 1)(b_2 - 1)(Q_{W,0} - S_{W,0}) \big) \big]$.

  6. $\mathrm{Var}[\widehat{\tau}^U] = \mathrm{Var}[\tau_{J_1,J_2}] + \frac{8}{b_1 b_2 n(n-1)} \Big[ 2(n-2) \Big( Q_{B,2} - S_{B,2} + \frac{N-1}{b_1 b_2 - 1}(b_1 + b_2 - 2)(Q_{B,1} - S_{B,1}) + \frac{N-1}{b_1 b_2 - 1}(b_1 - 1)(b_2 - 1)(Q_{B,0} - S_{B,0}) \Big) + \Big( P_{B,2} - S_{B,2} + \frac{N-1}{b_1 b_2 - 1}(b_1 + b_2 - 2)(P_{B,1} - S_{B,1}) + \frac{N-1}{b_1 b_2 - 1}(b_1 - 1)(b_2 - 1)(P_{B,0} - S_{B,0}) \Big) \Big]$,

  where $J_1$ and $J_2$ are independent random variables, uniformly distributed on $\{1, \ldots, b_1\}$ and $\{b_1 + 1, \ldots, p\}$, respectively.

Note that the variance of the usual Kendall’s tau estimator $\widehat{\tau}_{j_1,j_2}$ is already known [19]. We can also remark that this theorem always holds, even if Assumption 2 is not satisfied; in that case, however, the estimators have different expectations, namely the average of the considered Kendall’s taus over the corresponding pairs. Note that the conditional variance $\mathrm{Var}[\widehat{\tau}^U \mid W]$ is the variance of the estimator $\widehat{\tau}^U$ for a fixed choice of the pairs to average; it only measures the variability due to the randomness of the sample. On the contrary, the unconditional variance $\mathrm{Var}[\widehat{\tau}^U]$ takes into account both the randomness of the sample and the randomness of the choice of the pairs.

Remark 5

If Assumption 2 holds, the first term in the variance of $\widehat{\tau}^U$ vanishes, $P_{B,2} = P_{R,2} = P_{D,2} = P_{j_1,j_2}$, and all $S$ terms become equal to $P_{j_1,j_2}^2$. If the stronger Assumption 3 holds, all quantities $P_{j_1,j_2,j_3,j_4}$ and $Q_{j_1,j_2,j_3,j_4}$ become independent of the choice of the indices $(j_1, j_2, j_3, j_4)$. Therefore, all $P$-type averages become equal, as do all $Q$-type averages.

Corollary 6

Under Assumption 2, $P^2 \coloneqq P_{j_1,j_2}^2$ does not depend on the choice of $j_1, j_2$, and it holds that, as $n \to \infty$,

$$n^{1/2} (\widehat{\tau} - \tau) \xrightarrow{\text{law}} \mathcal{N}(0, V).$$

Here, $\widehat{\tau}$ denotes any of the estimators $\widehat{\tau}_{j_1,j_2}$, $\widehat{\tau}^B$, $\widehat{\tau}^R$, $\widehat{\tau}^D$, $\widehat{\tau}^U$ given $W$, or $\widehat{\tau}^U$, and the corresponding asymptotic variances $V$ are, respectively, given by

$$V_{j_1,j_2} = V_{j_1,j_2}(P_X) \coloneqq 16 (Q_{j_1,j_2} - P^2), \tag{6}$$

$$V_B = V_B(P_X) \coloneqq \frac{16}{b_1 b_2} \big( Q_{B,2} - P^2 + (b_1 + b_2 - 2)(Q_{B,1} - P^2) + (b_1 - 1)(b_2 - 1)(Q_{B,0} - P^2) \big), \tag{7}$$

$$V_R = V_R(P_X) \coloneqq \frac{16}{N} \big( Q_{R,2} - P^2 + (N-1)(Q_{R,1} - P^2) \big), \tag{8}$$

$$V_D = V_D(P_X) \coloneqq \frac{16}{N} \big( Q_{D,2} - P^2 + (N-1)(Q_{D,0} - P^2) \big), \tag{9}$$

$$V_{U|W} = V_{U|W}(P_X) \coloneqq \frac{16}{N} \big( Q_{W,2} - P^2 + (b_1 + b_2 - 2)(Q_{W,1} - P^2) + (b_1 - 1)(b_2 - 1)(Q_{W,0} - P^2) \big), \tag{10}$$

$$V_U = V_U(P_X) \coloneqq \frac{16}{N} \Big( Q_{B,2} - P^2 + \frac{N-1}{b_1 b_2 - 1} \big( (b_1 + b_2 - 2)(Q_{B,1} - P^2) + (b_1 - 1)(b_2 - 1)(Q_{B,0} - P^2) \big) \Big), \tag{11}$$

where $P_X$ denotes the law of the random vector $X$. Furthermore, these distributions are degenerate whenever the corresponding $Q$-type averages are equal to $P^2$; a sufficient condition for this to happen is that all Kendall’s taus are equal to 1.

This corollary can be derived straightforwardly by combining Theorem A of [43, Section 5.5.1] with the computations of the corresponding $\zeta_1$’s in the proof of Theorem 4. The asymptotic normality of $\widehat{\tau}^B$ was already known to hold under the stronger assumption of partial exchangeability [34, Theorem 1.2]. From the lengthy expressions in Theorem 4, we can derive the asymptotic variances in the setting where $n, b_1, b_2, N \to +\infty$.

Corollary 7

Under Assumption 2, as $n, b_1, b_2, N \to +\infty$, we have the following equivalents:

  • $\mathrm{Var}[\widehat{\tau}_{j_1,j_2}] \sim \frac{16}{n} (Q_{j_1,j_2} - P^2) = \frac{1}{n} \times V_{j_1,j_2}(P_X)$,

  • $\mathrm{Var}[\widehat{\tau}^B] \sim \frac{16}{n} (Q_{B,0} - P^2) = \frac{1}{n} \times \lim_{b_1, b_2 \to +\infty} V_B(P_X)$,

  • $\mathrm{Var}[\widehat{\tau}^R] \sim \frac{16}{n} (Q_{R,1} - P^2) = \frac{1}{n} \times \lim_{N \to +\infty} V_R(P_X)$,

  • $\mathrm{Var}[\widehat{\tau}^D] \sim \frac{16}{n} (Q_{D,0} - P^2) = \frac{1}{n} \times \lim_{N \to +\infty} V_D(P_X)$,

  • $\mathrm{Var}[\widehat{\tau}^U \mid W] \sim \frac{16}{n} (Q_{W,0} - P^2) = \frac{1}{n} \times \lim_{b_1, b_2, N \to +\infty} V_{U|W}(P_X)$,

  • $\mathrm{Var}[\widehat{\tau}^U] \sim \frac{16}{n} (Q_{B,0} - P^2) = \frac{1}{n} \times \lim_{b_1, b_2, N \to +\infty} V_U(P_X)$,

assuming, respectively, that $Q_{j_1,j_2}$, $Q_{B,0}$, $Q_{R,1}$, $Q_{D,0}$, $Q_{W,0}$, and $Q_{B,0}$ are strictly larger than $P^2$. If this is not the case, the corresponding variances converge to 0 at a rate faster than $O(1/n)$.

Surprisingly, the variances do not depend on the block dimensions as soon as these are large enough. This remains true if the block dimensions tend to infinity at different rates. In the limit, the quality of the estimator will therefore not improve, in general, by averaging over additional elements. Note that this is coherent, since we do not assume the dependence to converge to 0, which would correspond to some mixing assumption. Therefore, averaging can only have a limited effect, as in the simpler statistical model $Y_i = \theta + \varepsilon_i$, where $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)$ follows a centered exchangeable normal distribution with correlation $\rho > 0$. In this case, the average $\bar{Y}_n$ is inconsistent for estimating $\theta$, since $\mathbb{E}[(\bar{Y}_n - \theta)^2] = n^{-2} \sum_{i,j=1}^{n} \mathbb{E}[\varepsilon_i \varepsilon_j] = \frac{1}{n} + \frac{(n-1)\rho}{n} \to \rho$ as $n \to \infty$.
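This variance floor is easy to see by simulation; a quick R sketch (the parameter values are illustrative):

```r
# The mean of n exchangeable Gaussians with correlation rho has
# variance rho + (1 - rho) / n, which tends to rho instead of 0.
set.seed(1)
rho <- 0.3; theta <- 0
var_of_mean <- function(n, n_rep = 2000) {
  means <- replicate(n_rep, {
    common <- rnorm(1)  # shared factor creates exchangeable correlation
    eps <- sqrt(rho) * common + sqrt(1 - rho) * rnorm(n)
    mean(theta + eps)
  })
  var(means)
}
sapply(c(10, 100, 1000), var_of_mean)  # approaches rho = 0.3
```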

For large sample sizes, only the $Q$-type averages determine the levels of variance. For the diagonal, random, and block estimators, the number of terms corresponding to non-overlapping combinations grows faster than the number of overlapping combinations. However, the row estimator only averages pairs within the same row, and thus its limiting variance contains the quantity $Q_{R,1}$ (instead of a hypothetical $Q_{R,0}$).

Open problem 1. As all asymptotic variances are equal up to a constant, it is natural to ask which estimator is best. For this, we would need to compare these constants. However, these constants are defined through eight-dimensional integrals, making explicit computations difficult.

Interestingly, the block estimator and the random estimator perform equally well in the limit. Hence, we can greatly reduce computation time by using the random estimator instead of the block estimator, while still maintaining a low asymptotic variance.

This is coherent with Theorem 1 of [35], which shows that, under Assumption 3, the block averaging estimator is optimal with respect to the Mahalanobis distance. Furthermore, since the diagonal estimator averages solely over non-overlapping combinations, it should converge faster than the random estimator. Therefore, if computation costs are to be reduced, the diagonal estimator is preferable to both the random and the row estimators.

Finally, we note that if only part of the row or of the diagonal is averaged, the asymptotic variances of the resulting estimators do not change. By doing so, we can further lower computation times, but at the cost of attaining the limiting variances at slower rates. Therefore, it makes sense to choose $N$ large enough to attain the asymptotic regime, but not so large that the computation time becomes unnecessarily high.

3 Fast estimation of conditional Kendall’s tau matrix

We extend the aforementioned setting to the conditional setup, where a $d$-dimensional covariate $Z$ is available, taking values in $\mathcal{Z} \subset \mathbb{R}^d$. Formally, this means that we observe a sample $(X_i, Z_i)_{i=1,\ldots,n}$ of $n$ independent and identically distributed replications of a random vector $(X, Z) \in \mathbb{R}^{p+d}$. The objective is now to estimate the $p \times p$ conditional Kendall’s tau matrix $T_{|Z=z} = [\tau_{j_1,j_2|Z=z}]_{1 \le j_1, j_2 \le p}$ at a given point $z \in \mathcal{Z}$.

3.1 Estimation of conditional Kendall’s tau

To construct nonparametric estimates of the conditional Kendall’s tau, let us start by recalling its expression, following [10]:

$$\tau_{1,2|Z=z} = \mathbb{P}\big((X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) > 0 \mid Z_1 = Z_2 = z\big) - \mathbb{P}\big((X_{1,1} - X_{2,1})(X_{1,2} - X_{2,2}) < 0 \mid Z_1 = Z_2 = z\big).$$

Following the approach of [10], we introduce a kernel-based estimator of $\tau_{1,2|Z=z}$ as follows:

$$\widehat{\tau}_{1,2|Z=z} \coloneqq \frac{1}{1 - s_n} \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} w_{i_1,n}(z)\, w_{i_2,n}(z)\, g(X_{i_1,(1,2)}, X_{i_2,(1,2)}), \tag{12}$$

where $g(X_{i_1,(1,2)}, X_{i_2,(1,2)}) \coloneqq \mathrm{sign}\big((X_{i_1,1} - X_{i_2,1})(X_{i_1,2} - X_{i_2,2})\big)$, with Nadaraya-Watson weights $w_{i,n}$ given by

$$w_{i,n}(z) \coloneqq \frac{K_h(Z_i - z)}{\sum_{k=1}^{n} K_h(Z_k - z)}, \tag{13}$$

and $s_n \coloneqq \sum_{i=1}^{n} w_{i,n}^2(z)$, for some kernel $K$ on $\mathbb{R}^d$ and a bandwidth sequence $h = h(n)$ converging to zero as $n \to \infty$. In this sense, $\widehat{\tau}_{1,2|Z=z}$ is a smoothed estimator of $\tau_{1,2|Z=z} = \mathbb{E}[g(X_1, X_2) \mid Z_1 = Z_2 = z]$. The factor $1/(1 - s_n)$ tends to 1 as $n \to \infty$ and ensures that the estimated conditional Kendall’s tau takes values in the whole interval $[-1, 1]$.
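A direct R implementation of (12) and (13) might look as follows (a minimal sketch for a one-dimensional covariate; the Epanechnikov kernel and the $O(n^2)$ double sum are our choices):

```r
# Kernel-based estimator of the conditional Kendall's tau between
# columns j1 and j2 of X given Z = z, following eq. (12)-(13).
cond_kendall <- function(X, Z, z, j1, j2, h) {
  epanechnikov <- function(u) 0.75 * pmax(1 - u^2, 0)  # compactly supported
  Kh <- epanechnikov((Z - z) / h) / h
  w <- Kh / sum(Kh)                 # Nadaraya-Watson weights, eq. (13)
  s_n <- sum(w^2)
  # Doubly weighted sum of concordance signs over all pairs (i1, i2);
  # the diagonal contributes zero since sign(0) = 0.
  d1 <- outer(X[, j1], X[, j1], "-")
  d2 <- outer(X[, j2], X[, j2], "-")
  sum(outer(w, w) * sign(d1 * d2)) / (1 - s_n)
}
```

For larger samples, a weighted Kendall's tau routine (such as the `weights` argument of `wdm::wdm`, mentioned in Section 4) provides essentially the same computation at a lower cost.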

We adapt the simplified structural Assumption 2 by assuming that the underlying structural pattern applies to the conditional Kendall’s tau matrix given Z = z .

Assumption 4

(Simplified structural assumption conditionally on $Z = z \in \mathcal{Z}$) The Kendall’s tau matrix of the random vector $X$ conditionally on $Z = z$ can be written in the block form

$$T_{|Z=z} = \begin{pmatrix} * & \tau_{|Z=z} \mathbf{1} \\ \tau_{|Z=z} \mathbf{1} & * \end{pmatrix}$$

for some value $\tau_{|Z=z} \in [-1, 1]$, where $\tau_{|Z=z} \mathbf{1}$ represents a block filled with the value $\tau_{|Z=z}$ and the symbols $*$ represent arbitrary matrices of respective sizes $b_1 \times b_1$ and $b_2 \times b_2$, for some $b_1 \in \{1, \ldots, p\}$ and $b_2 \coloneqq p - b_1$.

In terms of stock return modeling, Assumption 4 has the following interpretation: conditionally on a given market state or portfolio movement, every pair of stocks taken from two different sectors/countries has the same rank correlation. This could, for instance, be used for the computation of conditional risk measures.

Note that Assumption 4 only concerns a fixed value of $z$, and it is quite general in the sense that it allows for different block structures depending on the value $z$ of the conditioning variable.

Let us denote the naive (unaveraged) conditional Kendall’s tau matrix estimator by $\widehat{T}_{|Z=z} = [\widehat{\tau}_{j_1,j_2|Z=z}]_{1 \le j_1, j_2 \le p}$, with $\widehat{\tau}_{j_1,j_2|Z=z}$ as in (12). As in the unconditional framework studied previously, we define averaged versions of the conditional estimators, $\widehat{\tau}^B_{|Z=z}$, $\widehat{\tau}^R_{|Z=z}$, $\widehat{\tau}^D_{|Z=z}$, and $\widehat{\tau}^U_{|Z=z}$, which average, respectively, over the whole block, the first row, the first diagonal, and uniformly chosen entries of the off-diagonal block.

3.2 Comparison of their asymptotic variances

Before proceeding with the asymptotic results, we need to formalize some regularity assumptions on the kernel $K$, the covariate $Z$, and the bandwidth sequence $h(n)$. Since we will give results similar to [10, Proposition 9], where the case of the (bivariate) conditional Kendall’s tau was treated, we give an adapted version of their assumptions.

Assumption 5

  1. The kernel $K$ is bounded, compactly supported, symmetric in the sense that $K(u) = K(-u)$ for every $u \in \mathbb{R}^d$, and satisfies $\int K = 1$, $\int |K| < \infty$, and $\int K^2 < \infty$.

  2. The kernel is of order $\alpha$ for some integer $\alpha > 1$, i.e., for all $k = 1, \ldots, \alpha - 1$ and all indices $j_1, \ldots, j_k \in \{1, \ldots, d\}$,

     $$\int K(u)\, u_{j_1} \cdots u_{j_k}\, \mathrm{d}u = 0.$$

  3. In addition, $\mathbb{E}[K_h(Z - z)] > 0$ for every $h > 0$.

These assumptions are classical in nonparametric statistics for obtaining convergence rates of kernel-based estimators. The compactness of the kernel’s support means that estimators taken at different points of $\mathcal{Z}$ are independent when the sample size is large enough, since the bandwidth $h_n$ tends to 0. The assumptions that the kernel is bounded and that $\int |K|$ and $\int K^2$ are finite rule out overly irregular kernels with fat tails or erratic behavior. Higher-order kernels allow one to obtain faster-converging estimators, under the assumption below that the joint density $f_{X,Z}$ is smooth enough; see Section 1.2.1 of [44]. Assumption 5(c) ensures that the weights appearing in Equation (13) are asymptotically well-defined, since the denominator converges to a strictly positive value by the law of large numbers.

Assumption 6

For every $x \in \mathbb{R}^p$, $z' \mapsto f_{X,Z}(x, z')$ is continuous and almost everywhere differentiable up to the order $\alpha$ on a neighborhood of $z$. For every $0 \le k \le \alpha$ and every $1 \le j_1, \ldots, j_\alpha \le d$, let

$$\Psi_{k,\mathbf{j}}(u, v, x_1, x_2, z) \coloneqq \sup_{t \in [0,1]} \left| \frac{\partial^k f_{X,Z}}{\partial z_{j_1} \cdots \partial z_{j_k}}(x_1, z + thu)\; \frac{\partial^{\alpha-k} f_{X,Z}}{\partial z_{j_{k+1}} \cdots \partial z_{j_\alpha}}(x_2, z + thv) \right|,$$

denoting $\mathbf{j} = (j_1, \ldots, j_\alpha)$. Assume that $\Psi_{k,\mathbf{j}}(u, v, x_1, x_2, z)$ is integrable and that there exists a finite constant $C_{XZ,\alpha} > 0$ such that, for every $h < 1$,

$$\int |K(u)|\, |K(v)| \sum_{k=0}^{\alpha} \binom{\alpha}{k} \sum_{j_1, \ldots, j_\alpha = 1}^{d} \Psi_{k,\mathbf{j}}(u, v, x_1, x_2, z)\, |u_{j_1} \cdots u_{j_k}|\, |v_{j_{k+1}} \cdots v_{j_\alpha}|\; \mathrm{d}u\, \mathrm{d}v\, \mathrm{d}x_1\, \mathrm{d}x_2$$

is less than $C_{XZ,\alpha}$.

The regularity condition on $f_{X,Z}$ can be interpreted in the classical way: smoother functions are easier to estimate than very irregular ones. Therefore, densities $f_{X,Z}$ that are $\alpha$-times differentiable allow the use of a larger range of bandwidths for large $\alpha$; this can be seen in Assumption 7 below.

Note that $C_{XZ,\alpha} \le C_{\alpha,p} \int K^2\, \sup_{k \in \{0, \ldots, \alpha\}} \sup_{\mathbf{j} \in \{1, \ldots, d\}^k} \big\| \partial^k f_{X,Z} / (\partial z_{j_1} \cdots \partial z_{j_k}) \big\|_\infty^2$, where $C_{\alpha,p}$ is a constant that depends only on $\alpha$ and $p$. In this sense, the second part of Assumption 6 can be seen as a relaxed version of a uniform control on the higher-order derivatives of the density. Therefore, Assumption 6 is implied by Assumption 5 together with the (stronger) assumption that all partial derivatives of $f_{X,Z}$ with respect to the components of $z$ exist up to the order $\alpha$ and are bounded.

Assumption 7

$n h_n^d \to \infty$ and $n h_n^{d+2\alpha} \to 0$ as $n \to \infty$.

This assumption controls the rate at which the sequence $(h_n)$ tends to 0: it should tend to 0 fast enough, but not too fast. The second condition controls the bias (see Equation (A16) in the proof), while the first condition ensures that the rate $(n h_n^d)^{1/2}$ is meaningful and allows us to verify Lyapunov’s condition (second part of [10, Section A.10]). For instance, for $d = 1$ and a kernel of order $\alpha = 2$, any bandwidth $h_n = n^{-c}$ with $1/5 < c < 1$ satisfies Assumption 7. We now present our main theoretical result on the joint asymptotic normality at different points of the conditioning variable $Z$, at the nonparametric rate $(n h_n^d)^{1/2}$.

Theorem 8

(Joint asymptotic normality at different points) Let $n' > 0$, and let $z_1, \ldots, z_{n'}$ be a collection of $n'$ points of $\mathcal{Z} \subset \mathbb{R}^d$ such that Assumptions 4–7 are satisfied for any choice $z \in \{z_1, \ldots, z_{n'}\}$. Then, as $n \to \infty$,

$$(n h_n^d)^{1/2} \big( \widehat{\tau}_{|Z=z_j} - \tau_{|Z=z_j} \big)_{j=1,\ldots,n'} \xrightarrow{\text{law}} \mathcal{N}(0, H),$$

where the diagonal matrix $H$ is given by

$$H = \left[ \frac{\mathbb{1}\{j_1 = j_2\} \int K^2}{f_Z(z_{j_1})} \times V(P_{X|Z=z_{j_1}}) \right]_{1 \le j_1, j_2 \le n'},$$

and the asymptotic variance functions $V$ are, respectively, defined in Equations (6)–(11).

The proof of this result is given in Appendix B.5.

Remark 9

Under the assumptions of Theorem 8, we have, for a given $z \in \mathcal{Z}$,

$$\lim_{n \to +\infty} (n h_n^d)^{1/2} \big( \widehat{\tau}_{|Z=z} - \tau_{|Z=z} \big) \stackrel{\text{law}}{=} \left( \frac{\int K^2}{f_Z(z)} \right)^{1/2} \times \lim_{n \to +\infty} n^{1/2} \big( \widehat{\tau}(P_{X|Z=z}) - \tau_{|Z=z} \big),$$

where on the left-hand side, $\widehat{\tau}_{|Z=z}$ denotes any of the estimators $\widehat{\tau}_{j_1,j_2|Z=z}$, $\widehat{\tau}^B_{|Z=z}$, $\widehat{\tau}^R_{|Z=z}$, $\widehat{\tau}^D_{|Z=z}$, $\widehat{\tau}^U_{|Z=z}$, and on the right-hand side, $\widehat{\tau}(P_{X|Z=z})$ denotes the similarly averaged estimated Kendall’s tau if we had observed a sample of size $n$ from the distribution $P_{X|Z=z}$.

Corollary 10

Under the same assumptions as in Theorem 8, and letting the sample size and the dimensions tend to infinity, the following equivalents hold for $P_Z$-almost every $z \in \mathcal{Z}$:

  • $\mathrm{Var}[\widehat{\tau}_{j_1,j_2|Z=z}] \sim \frac{16 \int K^2}{n h^d f_Z(z)} (Q_{j_1,j_2|Z=z} - P_{|Z=z}^2)$,

  • $\mathrm{Var}[\widehat{\tau}^B_{|Z=z}] \sim \frac{16 \int K^2}{n h^d f_Z(z)} (Q_{B,0|Z=z} - P_{|Z=z}^2)$,

  • $\mathrm{Var}[\widehat{\tau}^R_{|Z=z}] \sim \frac{16 \int K^2}{n h^d f_Z(z)} (Q_{R,1|Z=z} - P_{|Z=z}^2)$,

  • $\mathrm{Var}[\widehat{\tau}^D_{|Z=z}] \sim \frac{16 \int K^2}{n h^d f_Z(z)} (Q_{D,0|Z=z} - P_{|Z=z}^2)$,

  • $\mathrm{Var}[\widehat{\tau}^U_{|Z=z} \mid W] \sim \frac{16 \int K^2}{n h^d f_Z(z)} (Q_{W,0|Z=z} - P_{|Z=z}^2)$,

  • $\mathrm{Var}[\widehat{\tau}^U_{|Z=z}] \sim \frac{16 \int K^2}{n h^d f_Z(z)} (Q_{B,0|Z=z} - P_{|Z=z}^2)$,

assuming, respectively, that $Q_{j_1,j_2|Z=z}$, $Q_{B,0|Z=z}$, $Q_{R,1|Z=z}$, $Q_{D,0|Z=z}$, $Q_{W,0|Z=z}$, and $Q_{B,0|Z=z}$ are strictly larger than $P_{|Z=z}^2$, where the conditional versions of $P$, $Q$, and their averages are defined by the same expressions as in the previous section, applied to the conditional law $P_{X|Z=z}$.

If these assumptions are not met, the corresponding variances converge to 0 at a rate faster than $O(1/(n h^d))$.

As seen earlier, the asymptotic variances have expressions analogous to those of their unconditional counterparts. Therefore, all averaging estimators exhibit a lower asymptotic variance than the naive conditional Kendall’s tau estimator. Also, the row averaging estimator intuitively performs worse than the block, diagonal, and random estimators, and, for growing dimensions, the block, diagonal, and random estimators perform (almost) equally well in the limit, assuming that the averages of $Q$ are not far apart. Again, we can greatly reduce the computation time by using either the diagonal or the random averaging estimator instead of the block averaging estimator, since then only part of all conditional Kendall’s taus have to be computed. Finally, it again holds that if only part of the row or of the diagonal is averaged, the asymptotic variances of the resulting estimators do not change. By doing so, we can further decrease the computation time, but at the cost of attaining the limiting variances at slower rates.

4 Simulation study

We perform a simulation study to assess the finite sample properties of our estimators. First, in Section 4.1, we compare the unconditional estimators by studying their variances and computation times for varying block and sample sizes. In Section 4.2, we focus on the conditional versions of the diagonal and block estimators, and we let the Kendall’s taus depend on a one-dimensional covariate. Similarly, we compare their accuracy and computational efficiency for varying sample sizes and block dimensions. In addition, we examine the estimators’ optimal bandwidths under varying conditional dependencies of the Kendall’s tau matrix. The simulations are all executed with the statistical environment R [38] on the DelftBlue supercomputer [7]. For simplicity, we choose $N = \min(b_1, b_2)$, so that the diagonal, row, and random estimators average over the same number of terms.

4.1 Unconditional Kendall’s tau

In the unconditional framework, we compare the block, row, diagonal, random, and naive Kendall’s tau matrix estimators. We examine how the estimators’ variance changes as a function of the block dimensions and the sample size. For this purpose, we consider mean squared errors (MSEs), which here measure the variance, as all unconditional estimators are unbiased. Furthermore, we measure computation times to compare computational efficiency. For computing the pairwise sample Kendall’s taus, we use the function wdm in the R package wdm [31], which can efficiently calculate sample and weighted Kendall’s taus with time complexity $O(n \log(n))$. The row, diagonal, random, and block estimators are now available as part of the package ElliptCopulas [13] and can be computed using the function KTMatrixEst.
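For illustration, a minimal usage sketch is given below; the wdm call is standard, while the argument names of KTMatrixEst follow our reading of the package documentation and should be checked against the installed version:

```r
# install.packages(c("wdm", "ElliptCopulas"))
library(wdm)
library(ElliptCopulas)

X <- matrix(rnorm(500 * 6), ncol = 6)     # toy data with 6 variables
wdm(X[, 1], X[, 2], method = "kendall")   # one pairwise sample Kendall's tau

# Averaged estimation with a known block structure: two groups of 3
# variables (argument names as we understand the documentation).
blocks <- list(1:3, 4:6)
KTMatrixEst(dataMatrix = X, blockStructure = blocks, averaging = "diag")
```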

In each simulation, data are generated using a meta-elliptical copula [1,12,17,18]. A copula is said to be meta-elliptical if it is the copula of a distribution with density $x \mapsto |\Sigma|^{-1/2}\, g(x^\top \Sigma^{-1} x)$ for a covariance matrix $\Sigma$ (here chosen to be a correlation matrix) and a function $g: \mathbb{R}^+ \to \mathbb{R}^+$, called the generator of the meta-elliptical copula. This means that we simulate data from the Gaussian copula, or from other meta-elliptical copulas with different generators. Note that for meta-elliptical copulas, the matrix $\Sigma$ can be obtained directly from the Kendall’s tau matrix. We let the underlying Kendall’s tau matrix be block structured, corresponding to two groups of equal size, which we refer to as the block size. This results in two diagonal blocks and a single distinct off-diagonal block, by symmetry.
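In the Gaussian-copula case, $\Sigma$ is recovered entrywise from the Kendall’s tau matrix via $\rho = \sin(\pi \tau / 2)$; the following R sketch (with illustrative block size and tau values) simulates such block-structured data and is reused in the experiment sketch below:

```r
# Simulate n observations from a Gaussian copula whose Kendall's tau
# matrix is block structured: two groups of size b, with intragroup
# tau `tau_intra` and intergroup tau `tau_inter`.
simulate_block_data <- function(n, b, tau_intra = 0.5, tau_inter = 0.3) {
  Tau <- rbind(cbind(matrix(tau_intra, b, b), matrix(tau_inter, b, b)),
               cbind(matrix(tau_inter, b, b), matrix(tau_intra, b, b)))
  diag(Tau) <- 1
  Sigma <- sin(pi * Tau / 2)   # tau-rho relation for elliptical models
  Z <- matrix(rnorm(n * 2 * b), ncol = 2 * b) %*% chol(Sigma)
  pnorm(Z)                     # uniform margins: a Gaussian copula sample
}
X <- simulate_block_data(n = 1000, b = 4)
cor(X[, 1], X[, 5], method = "kendall")  # close to tau_inter = 0.3
```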

Open problem 2. Simulating from meta-elliptical copulas is easy, for example, using the ElliptCopulas package [13], as they rely on elliptical distributions, which are well-understood. Constructing explicit non-meta-elliptical models of dependence that satisfy Assumption 2 seems difficult (unless both groups are independent) and is left for future research. Indeed, even in the low-dimensional case where $b_1 = b_2 = 2$ (so that the two blocks are $\{1, 2\}$ and $\{3, 4\}$), a D-vine (for example) would give a decomposition of the copula $c_{1,2,3,4} = c_{1,2}\, c_{2,3}\, c_{3,4}\, c_{1,3;2}\, c_{2,4;3}\, c_{1,4;2,3}$ of $(X_1, X_2, X_3, X_4)$. Kendall’s tau $\tau_{2,3}$ can then easily be chosen through the specification of the copula $c_{2,3}$. However, this would not give an explicit expression for the other interblock Kendall’s taus $\tau_{1,3}$, $\tau_{1,4}$, and $\tau_{2,4}$: indeed, they depend in a complicated way on all the copulas’ and conditional copulas’ parameters through integration.

As obtained in Theorem 4, the variances depend on the averages of the auxiliary quantities $P$, $Q$, and $S$ over pairs either along the row, along the diagonal, or over the entire block. For a fair comparison of the different estimators, we need all of these averages to be equal. As such, in addition to having identical Kendall’s tau values in the off-diagonal block, we take the values within the diagonal blocks to be identical as well. In that case, all auxiliary quantities are independent of the choice of pairs within the off-diagonal block, and moreover, the partial exchangeability assumption holds.

For the performance analysis, we focus on estimates of the single off-diagonal block, as all estimators treat the diagonal blocks equally. As such, the reported computation times and MSEs result from estimating only the single off-diagonal block.

4.1.1 Effect of the sample size

In the first experiment, we study the dependence of the MSE on the sample size. To this end, the sample size is varied and the block size is fixed to $32 \times 32$. The true Kendall’s tau values are fixed in the following way: the intragroup Kendall’s taus are $\tau_{1,1} = \tau_{2,2} = 0.5$ and the intergroup Kendall’s tau is $\tau_{1,2} = 0.3$. We examine data generated from the Gaussian distribution. For each estimator, the MSEs are calculated using 3,000 replications. The results are shown on a log–log scale in Figure 2.
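The replication loop behind such an experiment can be sketched as follows (a scaled-down version of our own, reusing `simulate_block_data`, `block_tau`, and `diag_tau` from the earlier sketches):

```r
# Scaled-down MSE experiment: two groups of size b, Gaussian copula,
# tau_intra = 0.5, tau_inter = 0.3, and a few hundred replications.
mse_experiment <- function(n, b = 8, n_rep = 200, tau_inter = 0.3) {
  est <- replicate(n_rep, {
    X <- simulate_block_data(n, b)
    tau_hat <- wdm::wdm(X, method = "kendall")  # pairwise tau matrix
    c(naive = tau_hat[1, b + 1],
      block = block_tau(tau_hat, b),
      diag  = diag_tau(tau_hat, b, N = b))
  })
  rowMeans((est - tau_inter)^2)  # MSE of each estimator
}
mse_experiment(n = 500)
```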

Figure 2

Log–log plots of the MSE of the estimators $\hat{\tau}_{j_1,j_2}$ (“naive”), $\hat{\tau}^B$ (“block”), $\hat{\tau}^R$ (“row”), $\hat{\tau}^D$ (“diag”), and $\hat{\tau}^U$ (“random” with uniform selection of pairs) as a function of the sample size. The diagonal block Kendall's taus are set at 0.5 and the off-diagonal block values at 0.3.

In Figure 2, we clearly observe almost straight lines for all of the estimators, with slopes indicating an inverse relationship between MSE and sample size. This not only confirms that the limiting variances are inversely proportional to the sample size, but also that this relationship already holds accurately for small sample sizes. In addition, we see that averaging the sample Kendall's taus does indeed lead to better estimates, and this holds for any given sample size, as all estimates depend on it equally. As expected, the block estimator performs best, closely followed by the diagonal estimator.

Next, we study the dependency of the computation times on the sample size. For this experiment, we compute the average computation time. The results are shown in Figure 3 on a log–log scale. They show that the computation times gradually increase with the sample size, to a point where they appear to scale almost linearly with it. These observations are in line with the $O(n \log(n))$ computation time of the pairwise sample Kendall's tau estimator. As expected, the computation times of the block and sample Kendall's tau matrix estimators are very similar, as are those of the row and diagonal estimators, with the latter two being significantly more efficient for any given sample size.
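The paper does not state which implementation of the pairwise estimator was timed; as an illustration of the $O(n \log(n))$ versus $O(n^2)$ gap, one can compare cor.fk from the pcaPP package, which implements Knight's $O(n \log(n))$ algorithm, with the base R implementation (the choice of package is our assumption, not the authors' setup):

    # Illustrative timing only; the paper's own implementation is not specified.
    library(pcaPP)                              # provides cor.fk, O(n log n)
    n <- 5000
    x <- rnorm(n); y <- x + rnorm(n)
    system.time(cor.fk(x, y))                   # fast, O(n log n)
    system.time(cor(x, y, method = "kendall"))  # base R, O(n^2)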

Figure 3

Log–log plot of the mean computation time [ms] of the estimators $\hat{\tau}_{j_1,j_2}$ (“naive”), $\hat{\tau}^B$ (“block”), $\hat{\tau}^R$ (“row”), $\hat{\tau}^D$ (“diag”), and $\hat{\tau}^U$ (“random” with uniform selection of pairs) as a function of the sample size, calculated using a block size of 32.

4.1.2 Effect of the block size

We first study the behavior of the MSE with respect to varying block sizes, with off-diagonal block Kendall's taus of 0.3 and diagonal block Kendall's taus of 0.5. In this experiment, we set the sample size to 4 to reduce the computational cost of running a sufficient number of replications. Again, we examine data generated from the Gaussian distribution. The MSEs are calculated using 3,000 replications. See Figure 4 for a log–log plot of the MSEs as a function of the block size.

Figure 4

Log–log plots of the mean squared error of the estimators $\hat{\tau}_{j_1,j_2}$ (“naive”), $\hat{\tau}^B$ (“block”), $\hat{\tau}^R$ (“row”), $\hat{\tau}^D$ (“diag”), and $\hat{\tau}^U$ (“random” with uniform selection of pairs) as a function of the block size. The diagonal block Kendall's taus are set at 0.5 and the off-diagonal block values at 0.3.

Figure 4 shows that all of the averaging estimators perform increasingly better than the sample Kendall's tau estimator as the block dimensions grow. For large block dimensions, the MSEs seem to reach constant values, confirming that the asymptotic variances do not depend on the block dimensions. As expected, the block and diagonal averaging estimators both converge to the lowest limiting variance, which is approached fastest by the block averaging estimator. The row and the random averaging estimators perform considerably worse.

Furthermore, we find that the relative difference between the diagonal and the block estimator is largest for small dimensions, but they still remain well within a factor of 1.5 of each other. As the dimension increases, the MSE of the diagonal estimator converges rapidly to that of the block estimator, again confirming that the block and diagonal estimators have close variances for large block dimensions. A more detailed presentation of the interplay between the sample size $n$ and the sizes $b_1$ and $b_2$ of the blocks is available in the Supplementary file “MSE_n,b1,b2.pdf.”

4.1.3 Effect of the value of the true Kendall’s taus

In this section, we fix the sample size $n = 16$ and the block sizes $b_1 = b_2 = 4$, and we change the value of Kendall's tau. The MSE is displayed in Figure 5 for different combinations of $\tau_{1,1}$, $\tau_{1,2}$, and $\tau_{2,2}$. Note that some combinations are not present due to the constraints presented in Section 2.2.

Figure 5

MSE as a function of the intergroup Kendall's tau $\tau_{1,2}$ for different combinations of $\tau_{1,1}$ and $\tau_{2,2}$. We use the sample size $n = 16$ and the block sizes $b_1 = b_2 = 4$.

As expected, the relative order of the estimators is the same for all values of Kendall's tau. A more detailed version of this figure is available as supplementary material (file “MSE_tau.pdf”), showing the same phenomena for a larger range of values of the Kendall's taus. Interestingly, the performance of all estimators is very similar when both intragroup Kendall's taus are equal to 0.9. Indeed, in this case, the variables in each block are nearly identical, and averaging then no longer changes the situation.

4.1.4 Effect of the copula

In this section, we fix the sample size, block sizes, and Kendall's tau values. We vary instead the copula of the distribution. For this, we use different meta-elliptical copulas, because of their natural relationship between Kendall's tau and the underlying correlation matrix. These meta-elliptical copulas are defined as the copulas of the distributions with densities $|\Sigma|^{-1/2} g(x \Sigma^{-1} x^\top)$, for a covariance matrix $\Sigma$ (here chosen to be a correlation matrix) and a density generator $g: \mathbb{R}_+ \to \mathbb{R}_+$, respectively chosen as the following (a sampling sketch is given after the list):

  • $x/(1 + x^3)$,

  • $1/(1 + x^2)$,

  • $\exp(-x) \times \cos(x)$,

  • $\exp(-x^2) + \exp(-x) \times \cos(x)$,

  • $\exp(-x) + \exp(-x^3) \times \cos(x)^2$.
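As a rough illustration (our sketch, using the Gaussian-type generator $g(u) = \exp(-u/2)$ as a stand-in, since it is integrable in any dimension), such a distribution can be sampled through its stochastic representation: a radius $R$ with density proportional to $r^{p-1} g(r^2)$, multiplied by a uniform direction on the unit sphere.

    # Sketch of sampling from an elliptical distribution with generator g,
    # via X = R * U %*% A with U uniform on the sphere; illustrative only.
    set.seed(3)
    p <- 4; N <- 1000
    Sigma <- diag(p); A <- chol(Sigma)
    g <- function(u) exp(-u / 2)             # stand-in generator (Gaussian case)

    rgrid <- seq(1e-4, 20, length.out = 1e5)
    R <- sample(rgrid, N, replace = TRUE,
                prob = rgrid^(p - 1) * g(rgrid^2))  # discretized radial density

    U <- matrix(rnorm(N * p), N)
    U <- U / sqrt(rowSums(U^2))              # uniform on the unit sphere
    X <- (R * U) %*% A                       # elliptical sample with scatter Sigma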

The results are displayed in Figure 6. We can observe that the relative order of the estimators is mostly the same as mentioned earlier, with the averaging estimators performing best and the no-averaging (naive) estimator having the highest mean squared error.

Figure 6

MSE for different meta-elliptical copulas and different estimators.

4.2 Conditional Kendall’s tau

In this section, we study the conditional versions of the block and diagonal estimators. Since the estimators make use of kernel regression, a larger sample size is needed for obtaining stable results. We therefore consider only a one-dimensional covariate Z , so that we do not need to increase the sample size even further and can run a sufficient number of replications. Kernel estimation is carried out with the Epanechnikov kernel [44], and the estimation procedures are now available in the function CKTmatrix.kernel of the R package CondCopulas [8].
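For intuition about what such a kernel-based estimate computes, the following is a minimal sketch (ours) of a kernel-smoothed pairwise conditional Kendall's tau with Epanechnikov Nadaraya–Watson weights. This is one common form of the estimator; the exact weighting and interface of CKTmatrix.kernel may differ.

    # Minimal sketch (assumed form) of a kernel-based conditional Kendall's tau
    # between x1 and x2 given Z = z.
    epanechnikov <- function(u) 0.75 * pmax(1 - u^2, 0)

    ckt_kernel <- function(x1, x2, z_obs, z, h) {
      w <- epanechnikov((z_obs - z) / h)
      w <- w / sum(w)                                      # normalized weights
      s <- sign(outer(x1, x1, "-") * outer(x2, x2, "-"))   # concordance signs
      sum(outer(w, w) * s) / (1 - sum(w^2))                # drop the diagonal mass
    }

    # Toy usage: tau_{1,2 | Z = 0.5} with bandwidth h = 0.5
    set.seed(4)
    n <- 500; zvals <- runif(n)
    x <- cbind(zvals + rnorm(n), zvals + rnorm(n))
    ckt_kernel(x[, 1], x[, 2], zvals, z = 0.5, h = 0.5)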

In each of the experiments, we let the covariate Z be uniformly distributed on the interval [0, 1]. We estimate conditional Kendall's taus at points z ranging from 0 to 1 in steps of 0.1. We generate data from the Gaussian distribution, as other distributions yield similar results. All variables have mean Z and variance $1 + Z^2$. The Kendall's tau matrix is again block-structured, corresponding to two groups of equal size. Similarly to the unconditional case, we only focus on the estimates of the single off-diagonal block. We set all Kendall's taus within the diagonal blocks to a constant value of 0.3, independent of Z. Finally, we let the Kendall's taus within the off-diagonal block depend on the covariate Z.

In Section 4.2.1, we examine the accuracy and computational efficiency of the estimates under a varying sample size. A similar analysis for the effect of the block size is done in Appendix A.1. To this end, we set Kendall's tau in the off-diagonal blocks to $0.1 \times Z$. As Z is distributed on [0, 1], the conditional Kendall's taus range from 0 to 0.1. As such, the underlying variables are again partially exchangeable conditionally on Z = z, for any z ∈ [0, 1]. It follows that the biases of the pairwise estimates in the off-diagonal block are all equal, and thus that averaging over them does not change the total bias. Since all estimators therefore have equal biases, we focus on the sample variances instead of the MSEs when comparing accuracies.

Then, in Section 4.2.2, we study optimal bandwidths while varying the way in which the off-diagonal block Kendall's taus depend on Z. We consider a model in which the off-diagonal block conditional Kendall's taus are given by

$[\mathbb{T}_{|Z=z}]_{1,2} = 0.1 \times (\cos(0.5 \pi \omega z) + 1),$

with frequencies $\omega$ in $\{1, 2, 3, 4\}$. As such, these conditional Kendall's taus range from 0 to 0.2. For comparing the accuracies under varying bandwidths, we study mean integrated squared errors (MISEs), computed by averaging the MSEs of the conditional estimates at conditioning points ranging from 0 to 1 in steps of 0.1.

4.2.1 Effect of the sample size

In this experiment, we study the dependency of the variances on the sample size. To this end, we vary the sample size under a fixed block size of 4 and a bandwidth of 0.5. We use this relatively large bandwidth to ensure stable results even at lower sample sizes. The integrated variance $\mathrm{IVAR} := \int_0^1 \mathrm{Var}[\hat{\tau}_{|Z=z}]\, \mathrm{d}z$ is estimated as the average of the sample variances at the grid points $z_i = i/10$, $i = 0, \ldots, 10$, using 3,000 replications. Supplementary plots showing the influence of the sample size on each term $\mathrm{Var}[\hat{\tau}_{|Z=z_i}]$ are available in Appendix A, Figure A1, while the influence of the sample size on IVAR is shown in Figure 7.
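Concretely, with the replicated estimates arranged in a matrix (one row per replication, one column per grid point), the IVAR estimate is simply the mean of the column-wise sample variances. The sketch below uses a random placeholder matrix in place of the actual estimates:

    # Schematic IVAR computation; `est` is a placeholder for the 3,000 x 11
    # matrix of replicated estimates of tau_{|Z = z_i}, z_i = 0, 0.1, ..., 1.
    set.seed(5)
    est <- matrix(0.05 + rnorm(3000 * 11, sd = 0.1), nrow = 3000)
    IVAR_hat <- mean(apply(est, 2, var))   # average of per-grid-point variances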

Figure 7

Log–log plots of the conditional estimators’ integrated variance as a function of the sample size, using a block size of 4 and a bandwidth of 0.5.

Unsurprisingly, the conditional variances are also inversely related to the sample size. It follows that if the bandwidth is kept constant, the MSEs converge to the squared bias. As such, appropriate bandwidths are naturally smaller for larger sample sizes. Furthermore, it is seen that the estimates near the edges of the interval [0, 1] are less accurate than those in the middle. This can be attributed to the fact that there are fewer observations of Z near grid points close to the edges than near grid points in the middle, where observations can be found on both sides. Evidently, a change in the distribution of Z would also change the level of the variances.

Next, let us study the dependency of the computation time on the sample size. We leave the setting unchanged, though the results correspond to the calculation of the conditional block estimates at a single grid point. The results are computed using 500 replications and are represented on a log–log scale in Figure 8. Here, it is seen that the computation times gradually increase with the sample size, to a point where they appear to scale quadratically with it. This behavior follows from the fact that the conditional estimates require the calculation of a double sum over the n observations, that is, of order $n^2$ terms. Note that the computation times of the diagonal and block estimators are relatively close, since only a block size of 4 is used here.

Figure 8

Log–log plot of the estimated mean computation time [ms] of the conditional estimator as a function of the sample size, for a block size of 4.

4.2.2 Bandwidth selection

Let us compare the estimators' MISEs for different bandwidths. In this experiment, we set the diagonal block Kendall's taus to 0.3 and the off-diagonal block Kendall's taus conditionally on Z = z to

$0.1 \times (\cos(0.5 \pi \omega z) + 1),$

with frequencies $\omega \in \{1, 2, 3, 4\}$. The block size is fixed at 8 and the sample size at 200. The MISEs and 95% confidence intervals are estimated using 100 replications (Figure 9).

Figure 9

Log-plots of the conditional estimators' MISEs as a function of the bandwidth for different frequencies $\omega$, including 95% confidence intervals, for a sample size of 200 and a block size of 8. “Conditional KT estimate” refers to the naive estimator of conditional Kendall's tau $\hat{\tau}_{1,2|Z=z}$.

The figure confirms that the averaging estimators indeed have smaller optimal bandwidths than the naive estimator. It should be noted that only a block size of 8 is used here, and that the optimal bandwidth decreases with the block size until the limiting values are reached. Furthermore, the figure shows that as the frequency increases, the optimal bandwidth decreases. This is fully consistent with kernel regression theory: increasing the frequency increases the difference between the Kendall's tau values conditionally on adjacent points z, and therefore a smaller bandwidth must be chosen. Finally, it should be noted that as the bandwidth increases, the effect of averaging becomes less and less visible. This can be attributed to the fact that by increasing the bandwidth, the variance term within the MISE becomes less and less prominent, while the bias term generally increases.

5 Application to real data

In this section, we study the behavior of the estimators under real data conditions and provide value at risk (VaR) computations for a large stock portfolio as an example of possible applications. In Section 5.1, we recall explicit VaR expressions for elliptical distributions. In Section 5.2, we describe the methods used to estimate the VaR input parameters. The results are presented in Section 5.3, where backtesting is applied to assess their viability. All computations have been done using the R statistical environment [38].

5.1 Value at risk for elliptical distributions

The value at risk (VaR) is a widely used risk measure in a variety of financial fields, ranging from auditing and financial reporting to risk management and the calculation of regulatory capital [29]. It is used to quantify the potential losses of some financial entity or portfolio of assets over a specific time frame. We will follow the approach of [37,42], in which explicit expressions for the VaR of elliptical distributions were derived. For the reader's convenience, we recall these expressions in the present section.

Let X be the loss of a given portfolio, i.e., X > 0 means that the portfolio manager is losing X euros. The VaR at level $\alpha \in (0, 1)$ is defined as the quantile of X at level $1 - \alpha$. To estimate the VaR of a given portfolio of assets, it is often assumed that the portfolio's profits and losses are a linear function of the returns of the individual constituents. More formally, a portfolio with value $\Pi(t)$ at time t is called linear if its profit and loss $\Delta\Pi(t) = \Pi(t) - \Pi(0)$ over a time window [0, t] is a linear function of the returns $X_1(t), \ldots, X_p(t)$:

$\Delta\Pi(t) = \delta_1 X_1(t) + \delta_2 X_2(t) + \cdots + \delta_p X_p(t).$

This clearly applies to any common stock portfolio when using the ordinary returns of the individual shares; when considering log returns instead, it holds to a good approximation provided that the time window [0, t] is small, e.g., for daily log returns. The time window t will be kept constant and is therefore omitted from the notation in what follows.
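A toy numeric check of this approximation (our illustration, with made-up prices) shows how close the log-return version is for small daily moves:

    # Exact P&L versus the linear approximation based on daily log returns.
    p0 <- c(100, 50, 20)                     # prices at time 0 (illustrative)
    p1 <- c(101.0, 49.5, 20.3)               # prices one day later
    shares <- c(1, 2, 5)                     # position sizes
    exact  <- sum(shares * (p1 - p0))             # true profit and loss: 1.5
    linear <- sum(shares * p0 * log(p1 / p0))     # approximation: approx 1.48
    c(exact = exact, linear = linear)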

Furthermore, we will assume that the $X_j$ are elliptically distributed with mean $\mu$, covariance matrix $\Sigma$ with Cholesky decomposition $\Sigma = A A^\top$, and density generator $g$. Thus, the probability density function $f_X$ of $X = (X_1, \ldots, X_p)$ is given by

$f_X(x) = |\Sigma|^{-1/2}\, g\big((x - \mu) \Sigma^{-1} (x - \mu)^\top\big).$

When considering elliptically distributed risk factors, we cannot simply use the delta-normal approach to calculate the VaR, as it relies on the stronger assumption of normality. A generalization of the delta-normal method was derived for the class of elliptical distributions in [42].

Let us start by noting that the VaR of the portfolio profits and losses $\Delta\Pi(t)$ can be rewritten as $P(\Delta\Pi < -\mathrm{VaR}_\alpha) = \alpha$. Then, given the linearity of the portfolio and the fact that X follows an elliptical distribution, the VaR is obtained by solving the following equation:

$\alpha = |\Sigma|^{-1/2} \int_{\{\delta x^\top \le -\mathrm{VaR}_\alpha\}} g\big((x - \mu) \Sigma^{-1} (x - \mu)^\top\big)\, \mathrm{d}x,$

where δ denotes the vector of weights ( δ 1 , , δ p ) . After several changes of variables, we obtain

(14) $\alpha = S_{p-2} \int_0^{+\infty} r^{p-2} \int_{-\infty}^{\frac{-\delta\mu^\top - \mathrm{VaR}_\alpha}{\Vert \delta A \Vert}} g(z_1^2 + r^2)\, \mathrm{d}z_1\, \mathrm{d}r,$

where $S_{p-2} := 2\pi^{(p-1)/2} / \Gamma\big(\frac{p-1}{2}\big)$. Let us now introduce the function

(15) $G(s) = \frac{2\pi^{(p-1)/2}}{\Gamma\big(\frac{p-1}{2}\big)} \int_{-\infty}^{-s} \int_0^{+\infty} r^{p-2}\, g(z_1^2 + r^2)\, \mathrm{d}r\, \mathrm{d}z_1 = \frac{\pi^{(p-1)/2}}{\Gamma\big(\frac{p-1}{2}\big)} \int_{-\infty}^{-s} \int_{z^2}^{+\infty} (u - z^2)^{(p-3)/2}\, g(u)\, \mathrm{d}u\, \mathrm{d}z,$

where we have changed variables to $u = r^2 + z_1^2$ and $z = z_1$. Let us denote by $q_{\alpha,p}^g$ the unique solution of the transcendental equation

(16) $\alpha = G(q_{\alpha,p}^g).$

It then finally follows from expressions (14) and (15) that the Delta-Elliptic VaR is given by

(17) $\mathrm{VaR}_\alpha = -\delta\mu^\top + q_{\alpha,p}^g \Vert \delta A \Vert = -\delta\mu^\top + q_{\alpha,p}^g \sqrt{\delta \Sigma \delta^\top}.$

Note that this equation has a clear financial interpretation: the portfolio's average return is given by $\delta\mu^\top$ and the portfolio's standard deviation by $\sqrt{\delta \Sigma \delta^\top}$. Further note that the result is analogous to that of the delta-normal VaR, in which we simply replace $q_{\alpha,p}^g$ with the $1 - \alpha$ quantile of the standard normal distribution.
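To make (15)–(17) concrete, here is a sketch (ours, with an illustrative dimension and portfolio) that solves $\alpha = G(q)$ numerically for the Gaussian generator $g(u) = (2\pi)^{-p/2} \exp(-u/2)$. In that case, $q_{\alpha,p}^g$ should coincide with the delta-normal quantile $\Phi^{-1}(1 - \alpha)$, which provides a sanity check:

    # Numerically solving the transcendental equation (16) for an assumed
    # Gaussian generator, then plugging the quantile into the VaR formula (17).
    p <- 5                                            # illustrative dimension
    g <- function(u) (2 * pi)^(-p / 2) * exp(-u / 2)  # Gaussian density generator

    # Inner integral of (15): int_{z^2}^Inf (u - z^2)^((p-3)/2) g(u) du
    inner <- function(z) integrate(function(u) (u - z^2)^((p - 3) / 2) * g(u),
                                   lower = z^2, upper = Inf)$value

    G <- function(s) pi^((p - 1) / 2) / gamma((p - 1) / 2) *
      integrate(Vectorize(inner), lower = -Inf, upper = -s)$value

    alpha <- 0.05
    q <- uniroot(function(s) G(s) - alpha, c(0, 10))$value
    c(q = q, delta_normal = qnorm(1 - alpha))         # both approx 1.645

    # Delta-Elliptic VaR (17) for illustrative portfolio inputs
    delta <- rep(1 / p, p); mu <- rep(2e-4, p); Sigma <- diag(p) * 1e-4
    VaR <- -sum(delta * mu) + q * sqrt(drop(delta %*% Sigma %*% delta))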

5.2 Estimation procedure

To test the estimators under real data conditions, we consider a portfolio consisting of 240 different stocks. All stocks are listed on the Euronext markets, and the data have been downloaded from Yahoo Finance. The complete list of all shares involved is available in Appendix C. We will estimate the portfolio's daily VaR, assuming that the portfolio value is set at a level of 100 and that all stocks in the portfolio are equally weighted. To this end, we model the daily log returns of the individual stocks, assuming that they follow an elliptical distribution.

To achieve a proper clustering, we compute the pairwise Kendall's tau matrix over a long time period, from 01 January 2007 to 14 January 2022, after which we reorder the variables in order to obtain the intended block structure. Since we have not proposed a clustering method, we simply use the GW_Ward method from the R package seriation [22], along with a few manual adjustments. The resulting reordering corresponds to four large groups, which are specified further in Appendix C. See Figure 1 for a heatmap of the pairwise Kendall's tau matrix before and after reordering the variables by group. To indicate the groups, lines have been drawn around the diagonal blocks. It should be noted that, if studied carefully, the large groups could be broken down into smaller and more accurate groups. Nevertheless, these large groups already seem to be quite useful, and therefore, we will simply use them for our further analysis.
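A sketch of this reordering step (assumed usage of the seriation package; the manual adjustments are not shown):

    # Reorder variables so the block structure of the Kendall's tau matrix
    # becomes visible; `X` is assumed to be the matrix of daily log returns.
    library(seriation)
    tau_mat <- cor(X, method = "kendall")          # pairwise Kendall's tau matrix
    d <- as.dist(1 - tau_mat)                      # dissimilarity between stocks
    ord <- get_order(seriate(d, method = "GW_Ward"))
    tau_reordered <- tau_mat[ord, ord]             # heatmap now shows the blocks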

Based on the groups displayed in Figure 1, the objective is to compute the VaR at 30 June 2017, leaving sufficient future data for backtesting the results. To this end, we estimate the Kendall’s tau matrix of the log returns using the block, row, diagonal, and sample Kendall’s tau matrix estimators using data points over the period 01 August 2015 to 30 June 2017. To estimate the standard deviations and averages over the same period, we use the sample mean and sample standard deviation.

Following the elliptical assumption, we can now obtain covariance matrix estimates from each of the Kendall’s tau matrix estimates. Subsequently, we can compute nonparametric estimates of the density generator for each of these inputs. To this end, we make use of the function EllDistrEst from the ElliptCopulas package [13], which implements Liebscher’s procedure [25].
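The Kendall's tau-to-correlation step relies on the classical elliptical identity $\tau = (2/\pi) \arcsin(\rho)$; a minimal sketch of the conversion follows (ours; the generator estimation itself is done with EllDistrEst, as described above):

    # From a Kendall's tau matrix estimate to a covariance matrix estimate
    # under the elliptical assumption; `tau_hat` may be any of the block, row,
    # diagonal, or naive estimates, and `X` the matrix of log returns.
    rho_hat <- sin(pi * tau_hat / 2)               # invert tau = (2/pi) asin(rho)
    sds <- apply(X, 2, sd)                         # sample standard deviations
    Sigma_hat <- diag(sds) %*% rho_hat %*% diag(sds)
    # Note: averaging may leave rho_hat slightly non-positive-definite; a
    # nearest-correlation-matrix correction can be applied if needed.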

For the density generator estimation, we require a complete data set with no missing values. As such, the interval on which we estimate the density generator is chosen shorter (01 June 2016 to 30 June 2017). The kernel function is chosen as the Epanechnikov kernel. The choice of tuning parameters in this setting is discussed in [41]. For simplicity, we use Silverman's rule of thumb for bandwidth selection to estimate elliptical density generators [37], which for a sample size of n is given by

(18) $h = 1.06 \sqrt{\mathrm{Var}[\hat{\xi}]}\, n^{-1/5},$

where

$\hat{\xi}_i = -1 + \Big(1 + \big((x_i - \mu) \hat{\Sigma}^{-1} (x_i - \mu)^\top\big)^{p/2}\Big)^{2/p},$

for $i = 1, \ldots, n$ and $p = 240$. Here, $x_i$ stands for the vector of log returns at the i-th date, $\hat{\Sigma}$ for one of the covariance matrix estimates, and $\mu$ for the sample mean of the log returns. Clearly, with this bandwidth selection method, different Kendall's tau matrix estimators yield different values of the bandwidth. To get a better idea of the effects of the bandwidth choice, we also consider several deterministic bandwidths and compare the performance of the estimators for each of them.
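A sketch of this bandwidth computation, based on our reconstruction of the $\hat{\xi}$ transform in the display above (names illustrative):

    # Silverman's rule of thumb (18) for the generator estimation bandwidth.
    xi_hat <- function(X, mu, Sigma_inv, p) {
      q <- apply(X, 1, function(x) c((x - mu) %*% Sigma_inv %*% (x - mu)))
      -1 + (1 + q^(p / 2))^(2 / p)               # the transform of the display
    }
    h_silverman <- function(xi, n) 1.06 * sd(xi) * n^(-1 / 5)

    # Usage: h <- h_silverman(xi_hat(X, colMeans(X), solve(Sigma_hat), ncol(X)),
    #                         nrow(X))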

Finally, we can numerically solve the transcendental equation (16) to arrive at the corresponding quantiles. We have thus discussed all ingredients for calculating the VaR as in (17). To test the results, we perform backtesting on two intervals: one in the future, from 01 July 2017 to 14 January 2022, and one during the period on which the estimations are based, from 01 August 2015 to 30 June 2017.

5.3 Results

We compute the portfolio’s 5 and 10% VaR values by following the estimation procedure described in Section 5.2. Table 1 shows the quantile estimates obtained by solving the transcendental equation for each of the different density generator estimates. The density generators were estimated using each of the block, row, diagonal, and naive Kendall’s tau matrix estimators and using varying values of the bandwidth.

Table 1

Estimated quantiles corresponding to the 5 and 10% VaRs calculated by estimating an elliptical distribution for the daily log returns using each of the different Kendall’s tau matrix estimates and several values of the bandwidth

Quantiles $q_{\alpha,p}^{\hat{g}_h}$ (estimated)
α Estimator h = 20 h = 40 h = 100 Silverman’s h
5% Naive 2.11 1.94 1.98 2.12 ( h = 586.8 )
Block 1.60 1.60 1.60 1.60 ( h = 40.8 )
Row 1.60 1.60 1.60 1.60 ( h = 41.1 )
Diagonal 1.59 1.59 1.60 1.59 ( h = 40.5 )
10% Naive 1.48 1.38 1.40 1.53 ( h = 586.8 )
Block 1.23 1.23 1.23 1.23 ( h = 40.8 )
Row 1.24 1.24 1.23 1.24 ( h = 41.1 )
Diagonal 1.23 1.23 1.23 1.23 ( h = 40.5 )

The table shows that the averaging estimators yield very similar quantiles, which are all relatively constant across the different choices of the bandwidth. In contrast, the quantiles of the naive estimator lie substantially higher and vary significantly with the bandwidth. In that sense, the estimates obtained with the averaging estimators seem much more stable. Moreover, the Silverman bandwidths of the averaging estimators are also all very similar, while that of the naive estimator is again considerably larger.

Table 2 shows the VaR estimates for each of the different estimators and bandwidths, together with the backtested VaR values. As discussed in Section 5.2, backtests were conducted on two intervals: interval 1 refers to the future interval from 01 July 2017 until 14 January 2022, and interval 2 to the interval on which the estimation is based, from 01 August 2015 until 30 June 2017.

Table 2

Estimated 5 and 10% VaRs including the corresponding backtesting results on two intervals. Interval 1 corresponds to 01 July 2017 until 14 January 2022 and interval 2 to 01 August 2015 until 30 June 2017

VaR Estimated Backtested
α Estimator h = 20 h = 40 h = 100 Silverman’s h Interval 1 Interval 2
5% Naive 1.647 1.512 1.544 1.655 ( h = 586.8 ) 1.392 1.262
Block 1.320 1.320 1.320 1.320 ( h = 40.8 )
Row 1.332 1.332 1.332 1.332 ( h = 41.1 )
Diagonal 1.284 1.284 1.292 1.284 ( h = 40.5 )
10% Naive 1.147 1.083 1.068 1.187 ( h = 586.8 ) 0.861 0.839
Block 1.008 1.008 1.008 1.008 ( h = 40.8 )
Row 1.026 1.017 1.026 1.026 ( h = 41.1 )
Diagonal 0.987 0.987 0.987 0.987 ( h = 40.5 )

This clearly shows that the averaging estimators perform significantly better than the naive estimator on both backtesting intervals. For both α-levels, the VaRs generated using the naive estimator are considerably larger than those using the averaging estimators, which themselves produce relatively similar values. Furthermore, the 5% VaRs of the averaging estimators agree fairly well with the backtesting results, unlike those of the naive estimator.

However, the 10% VaR estimates are less accurate, and all estimators yield considerably higher VaRs than those obtained by backtesting. This could indicate that the log returns are not elliptically distributed, or that the interval on which we estimate the density generator is too short. Recall that this interval runs merely from 01 June 2016 until 30 June 2017. This lack of performance is hard to relate directly to the block sizes. Indeed, the “naive” estimator of Kendall's tau (i.e., without any averaging) corresponds to the case where all blocks have size 1, so all block sizes are as small as possible; still, the VaR estimates from this estimator are the worst.

To obtain a better understanding of how well the VaR estimates correspond with the backtesting results, we examine how often the estimates are exceeded by the portfolio’s losses in each of the backtesting periods. Tables 3 and 4 show the number of exceedances in interval 1 and interval 2, respectively.

Table 3

The number of exceedances of the estimated 5 and 10% VaRs during backtesting interval 1, from 1 July 2017 until 14 January 2022

# Exceedances Estimated Backtested
α Estimator h = 20 h = 40 h = 100 Silverman’s h Interval 1
5% Naive 47 53 53 46 ( h = 586.8 ) 58
Block 58 58 58 58 ( h = 40.8 )
Row 58 58 58 58 ( h = 41.1 )
Diagonal 61 61 59 61 ( h = 40.5 )
10% Naive 76 85 83 72 ( h = 586.8 ) 116
Block 94 94 94 94 ( h = 40.8 )
Row 91 91 92 91 ( h = 41.1 )
Diagonal 100 100 100 100 ( h = 40.5 )
Table 4

The number of exceedances of the estimated 5 and 10% VaRs during backtesting interval 2, from 01 August 2015 until 30 June 2017

# Exceedances Estimated Backtested
α Estimator h = 20 h = 40 h = 100 Silverman’s h interval 2
5% Naive 9 12 12 9 ( h = 586.8 ) 25
Block 20 20 20 20 ( h = 40.8 )
Row 20 20 20 20 ( h = 41.1 )
Diagonal 22 22 21 22 ( h = 40.5 )
10% Naive 30 32 32 30 ( h = 586.8 ) 49
Block 35 35 35 35 ( h = 40.8 )
Row 35 35 35 35 ( h = 41.1 )
Diagonal 35 35 35 35 ( h = 40.5 )

Both tables show that the difference between the theoretical and the observed number of exceedances is much larger when using the naive sample Kendall's tau matrix estimator than when using any of the averaging estimators, and this applies to both α-levels as well as to all bandwidths. As such, the averaging estimators are overall significantly better performers than the naive estimator. In addition, although there are subtle differences in the performance of the block, row, and diagonal estimators, there is no clear winner in this example. This shows that computing all Kendall's taus using the block estimator incurs no clear additional benefit compared to using only the row or diagonal estimators, which are computationally much cheaper.

6 Conclusion

We have provided an alternative approach to the generally challenging task of estimating Kendall’s tau and conditional Kendall’s tau matrices in high-dimensional settings. By imposing structural assumptions on the underlying (conditional) Kendall’s tau matrix, we have introduced new estimators that have significantly reduced computational costs without much loss in performance.

For the unconditional case, a model was studied in which the set of variables could be grouped in such a way that the Kendall’s taus of variables from different groups depend only on the group numbers. After reordering the variables by group, the underlying Kendall’s tau matrix is then block-structured with constant values in the off-diagonal blocks. We have proposed several (unbiased) estimators that take advantage of this block structure by averaging over the usual pairwise Kendall’s tau estimates in each of the off-diagonal blocks: the block estimator averages over all pairwise estimates, whereas the row, the diagonal, and the random estimators only average over part of the off-diagonal blocks (respectively, over the pairs on the first row, on the first diagonal and over a random selection of pairs).

We have formally derived variance expressions, which showed not only that all estimators are improvements over the usual sample Kendall's tau matrix estimator, but also, interestingly, that the asymptotic variances do not depend on the block dimensions. Furthermore, we have seen that the block, the diagonal, and the random estimators have similar asymptotic variances, whereas that of the row estimator was different. In most examples, the diagonal estimator performed the best, but a formal characterization of the set of such copulas is left for future work. Under light assumptions, we have shown that these asymptotic variances are equal, and that the common limiting variance is approached fastest by the block estimator, followed by the diagonal estimator and then the random estimator. Hence, if computational costs are to be reduced, the diagonal estimator is preferable to both the random and the row estimators.

Furthermore, a model was studied in which the Kendall’s taus depend on a conditioning variable. Here it was assumed that the conditional Kendall’s tau matrix has the aforementioned block structure and, moreover, that it is preserved under fluctuations of the conditioning variable. We have adopted nonparametric, kernel-based estimates of the conditional Kendall’s tau to construct the conditional versions of the block, row, diagonal, and random estimators. Under some additional regularity assumptions, we have shown that the estimators are all asymptotically normal conditionally to different values of the covariate. Following from these expressions, we have seen that the asymptotic variances have analogous expressions to their unconditional counterparts. As such, all estimators are again improvements over the naive estimator, with the block estimator having the best performance. Similarly, if computational costs were to be reduced, the diagonal estimator is preferable to both the random and the row estimator. Moreover, the reduction of computing costs becomes particularly relevant in the conditional setting, as the use of kernel smoothing introduces additional complexity.

We have performed a simulation study in order to support the theoretical findings. In the unconditional setting, simulations were performed with different meta-elliptical copulas. It was furthermore confirmed that the diagonal and the block estimator indeed have the lowest asymptotic variance in most cases, with the block estimator converging the fastest, though closely followed by the diagonal estimator. This emphasizes the practical use of the diagonal estimator.

We remarked again that the conditional estimators’ variances decrease in a similar fashion for growing block dimensions. As a consequence, the averaging estimators allow for a reduced optimal bandwidth; this was indeed confirmed in the simulations. This makes the averaging estimators perfectly suited for practical applications, as reducing the bandwidth goes hand in hand with reducing the estimation bias.

Finally, we have demonstrated the use of the estimators in a real world application. The estimators were used to model the daily log returns of a large stock portfolio consisting of 240 Euronext listed stocks. After clustering the sample Kendall’s tau matrix, the proposed block structure was clearly visible. Building on these groups, robust estimates of the correlation matrix were obtained by assuming that the log returns follow an elliptical distribution. Using each of these estimates, the portfolio’s 5 and 10% VaR values were estimated. The results of the averaging estimators were much more stable under changes in the bandwidth used for the estimation of the density generator. Moreover, the averaging VaRs were significant improvements over the naive estimates. This example confirmed that the proposed block structures are well reflected in real data conditions and that the averaging estimators lead to significantly more stable and accurate results.

Acknowledgements

The authors thank Thomas Nagler for useful comments on a previous draft, and Dorota Kurowicka for a discussion and references that lead to Section 2.2. The authors also thank the Associate Editor and two anonymous reviewers for their useful comments which significantly improved the manuscript.

  1. Funding information: The authors state that no specific funding was involved in this research.

  2. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and consented to its submission to the journal, reviewed all the results, and approved the final version of the manuscript. RVDS: conceptualization, formal analysis, software, investigation, writing. AD: conceptualization, methodology, formal analysis, software, writing, supervision.

  3. Conflict of interest: The authors state no conflict of interest.

  4. Data availability statement: All data were obtained using the getSymbols function from the quantmod R package [40], fetching the data from Yahoo Finance.

Appendix A Additional figures

Figure A1

Log–log plots of the conditional estimators’ variances as a function of the sample size on several conditioning points, using a block size of 4 and a bandwidth of 0.5.

A.1 Effect of the block size in the conditional framework (Section 4.2)

We first study the estimators' variance under varying block dimensions. To run a sufficient number of replications, we set the sample size to 20 and, consequently, the bandwidth to the relatively large value of 0.5. The variances and 95% confidence intervals are estimated using 30,000 replications. For each grid point z, the resulting sample variances are displayed on a log–log scale in Figure A2.

Figure A2

Log–log plots of the conditional estimators’ variances as a function of the block size on several conditioning points including 95% confidence intervals, for a sample size of 20 and a bandwidth of 0.5.

From the figure, we observe that the estimators’ variances behave similarly to the unconditional setting under varying block dimensions, for each of the grid points. That is, both estimators are improvements over the naive estimator, both limiting variances are identical, and the block estimator converges slightly faster than the diagonal estimator. It further follows that since averaging reduces variance, it also reduces the optimal bandwidth. This is studied in Section 4.2.2. Again, as there are fewer observations of Z near grid points close to the edges of [ 0 , 1 ] , the variance levels vary slightly over the different grid points.

As for the computation times, there is clearly no fundamental change in how these depend on the block size when compared to the unconditional setting. However, since the conditional estimators are kernel based, it should be noted that they generally require more computation time than their unconditional counterparts, as was also seen in Figure 8. For the sake of completeness, we still include a plot of the average computation time against the block size, see Figure A3. The results correspond to estimating the off-diagonal block conditional Kendall’s taus simultaneously on the 11 grid points and follow from 10,000 replications with a sample size of 150. As expected, the block estimator scales quadratically with the block size, while the diagonal estimator scales linearly with block size. Therefore, as in the unconditional case, one may prefer the diagonal estimator over the block estimator to gain substantial computational efficiency and lose only little precision.

Figure A3

Log–log plot of the conditional estimators’ mean computation time [s] as a function of the block size, for a sample size of 150.

B Proofs

B.1 Proofs for Section 2.2

Proof of Proposition 1

In this proof, we will need the following notation: for an integer $i \in \{1, \ldots, p\}$, we define the vector $\mathbb{1}_i$ as the vector with a 1 at the i-th component and 0 elsewhere. For a set $I \subset \{1, \ldots, p\}$, we define $\mathbb{1}_I := \sum_{i \in I} \mathbb{1}_i$, which is the vector with 1 at the components in I and 0 elsewhere. Note that

$M(\mathbb{1}_1 - \mathbb{1}_2) = (1 - \rho_1,\; \rho_1 - 1,\; 0, \ldots, 0)^\top = (1 - \rho_1)(\mathbb{1}_1 - \mathbb{1}_2),$
$M(\mathbb{1}_{b_1+1} - \mathbb{1}_{b_1+2}) = (0, \ldots, 0,\; 1 - \rho_2,\; \rho_2 - 1,\; 0, \ldots, 0)^\top = (1 - \rho_2)(\mathbb{1}_{b_1+1} - \mathbb{1}_{b_1+2}).$

This gives us $(b_1 - 1) + (b_2 - 1)$ eigenvectors with eigenvalues $1 - \rho_1$ and $1 - \rho_2$, which are positive since $\rho_1, \rho_2$ are smaller than 1. Moreover, remark that

$M \mathbb{1}_{1:b_1} = (1 + (b_1 - 1)\rho_1)\, \mathbb{1}_{1:b_1} + b_1 \rho_3\, \mathbb{1}_{b_1+(1:b_2)}, \qquad M \mathbb{1}_{b_1+(1:b_2)} = b_2 \rho_3\, \mathbb{1}_{1:b_1} + (1 + (b_2 - 1)\rho_2)\, \mathbb{1}_{b_1+(1:b_2)},$

so the eigenvalues of the matrix

$\begin{pmatrix} 1 + (b_1 - 1)\rho_1 & b_1 \rho_3 \\ b_2 \rho_3 & 1 + (b_2 - 1)\rho_2 \end{pmatrix}$

are also eigenvalues of M . These eigenvalues are

$1 + \frac{(b_1 - 1)\rho_1}{2} + \frac{(b_2 - 1)\rho_2}{2} \pm \frac{1}{2}\sqrt{\big((b_1 - 1)\rho_1 - (b_2 - 1)\rho_2\big)^2 + 4 b_1 b_2 \rho_3^2}.$

The smallest eigenvalue is positive if and only if

$\left(1 + \frac{(b_1 - 1)\rho_1}{2} + \frac{(b_2 - 1)\rho_2}{2}\right)^2 > \frac{\big((b_1 - 1)\rho_1 - (b_2 - 1)\rho_2\big)^2 + 4 b_1 b_2 \rho_3^2}{4},$

i.e.,

$\big((b_1 b_2 - b_1 - b_2 + 1)\rho_2 + b_1 - 1\big)\rho_1 - b_1 b_2 \rho_3^2 + (b_2 - 1)\rho_2 + 1 > 0.$

A sufficient condition is

$\rho_1 > \frac{b_1 b_2 \rho_3^2 - (b_2 - 1)\rho_2 - 1}{(b_1 b_2 - b_1 - b_2 + 1)\rho_2 + b_1 - 1}, \qquad \rho_2 > -\frac{b_1 - 1}{b_1 b_2 - b_1 - b_2 + 1}.\qquad\Box$
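A quick numeric check of these eigenvalue formulas (our illustration, with arbitrary admissible parameters):

    # Verify the eigenvalues of M derived above on a small example.
    b1 <- 3; b2 <- 5; rho1 <- 0.6; rho2 <- 0.4; rho3 <- 0.3
    M <- rbind(cbind(matrix(rho1, b1, b1), matrix(rho3, b1, b2)),
               cbind(matrix(rho3, b2, b1), matrix(rho2, b2, b2)))
    diag(M) <- 1
    sort(eigen(M)$values)
    # Compare with 1 - rho1 (multiplicity b1 - 1), 1 - rho2 (multiplicity
    # b2 - 1), and the two eigenvalues of the 2 x 2 matrix above:
    1 + (b1 - 1) * rho1 / 2 + (b2 - 1) * rho2 / 2 +
      c(-1, 1) * sqrt(((b1 - 1) * rho1 - (b2 - 1) * rho2)^2 +
                      4 * b1 * b2 * rho3^2) / 2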

Proof of Proposition 2

Assume that M is a correlation matrix, and let $X \sim \mathcal{N}(0, M)$. Take one random variable from each block. Their correlation matrix is $(1 - \rho) I_K + \rho J_K$, where $J_K$ is the $K \times K$ matrix of ones, and it must therefore be positive semidefinite. This yields the constraint $\rho \ge -1/(K - 1)$.

Conversely, this bound is attained in the limit by choosing each within-group correlation matrix $\Sigma_k$, $k = 1, \ldots, K$, to be a matrix of ones and considering M as the correlation matrix of $(X_1, \ldots, X_1, X_2, \ldots, X_2, \ldots, X_K, \ldots, X_K) \in \mathbb{R}^{b_1 + b_2 + \cdots + b_K}$, where $(X_1, \ldots, X_K)$ follows an exchangeable normal distribution with correlation arbitrarily close to $-1/(K - 1)$.□

B.2 Derivation of Equation (5)

We have

$Q_{j_1,j_2} = P(X_{1,(j_1,j_2)} < X_{2,(j_1,j_2)},\; X_{1,(j_1,j_2)} < X_{3,(j_1,j_2)}) + P(X_{1,(j_1,j_2)} < X_{2,(j_1,j_2)},\; X_{1,(j_1,j_2)} > X_{3,(j_1,j_2)}) + P(X_{1,(j_1,j_2)} > X_{2,(j_1,j_2)},\; X_{1,(j_1,j_2)} > X_{3,(j_1,j_2)}) + P(X_{1,(j_1,j_2)} > X_{2,(j_1,j_2)},\; X_{1,(j_1,j_2)} < X_{3,(j_1,j_2)}),$ where inequalities between vectors are understood componentwise.

Let us write these probabilities in terms of the copula $C_{j_1,j_2}$ of $X_{(j_1,j_2)} = (X_{j_1}, X_{j_2})$. This gives

$Q_{j_1,j_2} = \int_{[0,1]^2} \int_{(u_1,u_2)}^{(1,1)} \int_{(u_1,u_2)}^{(1,1)} \mathrm{d}C_{j_1,j_2}(u_5,u_6)\, \mathrm{d}C_{j_1,j_2}(u_3,u_4)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2) + \int_{[0,1]^2} \int_{(0,0)}^{(u_1,u_2)} \int_{(u_1,u_2)}^{(1,1)} \mathrm{d}C_{j_1,j_2}(u_5,u_6)\, \mathrm{d}C_{j_1,j_2}(u_3,u_4)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2) + \int_{[0,1]^2} \int_{(u_1,u_2)}^{(1,1)} \int_{(0,0)}^{(u_1,u_2)} \mathrm{d}C_{j_1,j_2}(u_5,u_6)\, \mathrm{d}C_{j_1,j_2}(u_3,u_4)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2) + \int_{[0,1]^2} \int_{(0,0)}^{(u_1,u_2)} \int_{(0,0)}^{(u_1,u_2)} \mathrm{d}C_{j_1,j_2}(u_5,u_6)\, \mathrm{d}C_{j_1,j_2}(u_3,u_4)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2)$

$= \int_{[0,1]^2} \bar{C}_{j_1,j_2}^2(u_1,u_2)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2) + \int_{[0,1]^2} \bar{C}_{j_1,j_2}(u_1,u_2)\, C_{j_1,j_2}(u_1,u_2)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2) + \int_{[0,1]^2} C_{j_1,j_2}(u_1,u_2)\, \bar{C}_{j_1,j_2}(u_1,u_2)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2) + \int_{[0,1]^2} C_{j_1,j_2}^2(u_1,u_2)\, \mathrm{d}C_{j_1,j_2}(u_1,u_2)$

$= \int_{[0,1]^2} \big(C_{j_1,j_2}(u_1,u_2) + \bar{C}_{j_1,j_2}(u_1,u_2)\big)^2\, \mathrm{d}C_{j_1,j_2}(u_1,u_2),$

where $\bar{C}_{j_1,j_2}$ denotes the joint survival function associated with $C_{j_1,j_2}$,

as claimed.
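The identity above can be verified numerically for a specific copula. For the independence copula $C(u_1, u_2) = u_1 u_2$, the integral $\int_{[0,1]^2} (C + \bar{C})^2\, dC$ evaluates to $5/18$; the following illustrative Monte Carlo sketch (not from the paper) checks this against the probability representation of $Q_{j_1,j_2}$.

```python
# Monte Carlo check (illustrative, independence copula only) of the identity
# above: Q is the probability that both pairs (X1, X2) and (X1, X3) are
# componentwise ordered, and for C(u, v) = uv the integral of (C + C_bar)^2 dC
# equals 5/18.
import numpy as np

rng = np.random.default_rng(0)
n_rep = 10**6
U = rng.random((n_rep, 3, 2))  # three independent points of the unit square
ordered12 = np.all(U[:, 0] < U[:, 1], axis=1) | np.all(U[:, 0] > U[:, 1], axis=1)
ordered13 = np.all(U[:, 0] < U[:, 2], axis=1) | np.all(U[:, 0] > U[:, 2], axis=1)
print((ordered12 & ordered13).mean(), 5 / 18)  # both approximately 0.2778
```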

B.3 Proof of Lemma 3

Proof

First, check that, trivially, the sample Kendall's tau $\hat{\tau}_{j_1,j_2}$ is a U-statistic of order 2 with (symmetric) kernel

$$
g^*\big( X_{i_1,(j_1,j_2)}, X_{i_2,(j_1,j_2)} \big) \coloneqq \operatorname{sign}\big( (X_{i_1,j_1} - X_{i_2,j_1})(X_{i_1,j_2} - X_{i_2,j_2}) \big).
$$

Consequently,

$$
\hat{\tau}_B = \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \hat{\tau}_{j_1,j_2}
= \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \frac{2}{n(n-1)} \sum_{i_1 < i_2} g^*\big( X_{i_1,(j_1,j_2)}, X_{i_2,(j_1,j_2)} \big)
= \frac{2}{n(n-1)} \sum_{i_1 < i_2} \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} g^*\big( X_{i_1,(j_1,j_2)}, X_{i_2,(j_1,j_2)} \big),
$$

and it follows that $\hat{\tau}_B$ is a U-statistic with kernel

$$
g^B(X_{i_1}, X_{i_2}) = \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} g^*\big( X_{i_1,(j_1,j_2)}, X_{i_2,(j_1,j_2)} \big).
$$

In a similar manner, it is easily seen that $\hat{\tau}_R$, $\hat{\tau}_D$, and $\hat{\tau}_U$ are all U-statistics as well, with respective kernels

$$
\begin{aligned}
g^R(X_{i_1}, X_{i_2}) &= \frac{1}{N} \sum_{j=1}^{N} g^*\big( X_{i_1,(1,b_1+j)}, X_{i_2,(1,b_1+j)} \big), \\
g^D(X_{i_1}, X_{i_2}) &= \frac{1}{N} \sum_{j=1}^{N} g^*\big( X_{i_1,(j,b_1+j)}, X_{i_2,(j,b_1+j)} \big), \\
g^U(X_{i_1}, X_{i_2}) &= \frac{1}{N} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} W_{j_1,j_2}\, g^*\big( X_{i_1,(j_1,j_2)}, X_{i_2,(j_1,j_2)} \big).
\end{aligned}
$$

Note that the kernel $g^U$ is random, as it depends on the weights $W$.□
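The interchange of summations above is easy to confirm numerically. The following minimal sketch (synthetic data, illustrative only) checks that averaging the pairwise sample Kendall's taus over the off-diagonal block equals the single U-statistic built from the averaged kernel $g^B$.

```python
# Minimal numerical illustration (synthetic data) that the block estimator
# tau_hat_B, i.e., the average of pairwise sample Kendall's taus, coincides
# with the order-2 U-statistic whose kernel g^B averages the sign kernels g*.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, b1, b2 = 30, 2, 3
p = b1 + b2
X = rng.standard_normal((n, p))

def g_star(xa, xb, j1, j2):
    return np.sign((xa[j1] - xb[j1]) * (xa[j2] - xb[j2]))

pairs = [(j1, j2) for j1 in range(b1) for j2 in range(b1, p)]

# tau_hat_B as an average of pairwise U-statistics.
tau_B = np.mean([
    np.mean([g_star(X[i1], X[i2], j1, j2) for i1, i2 in combinations(range(n), 2)])
    for (j1, j2) in pairs])

# The same quantity as a single U-statistic with the averaged kernel g^B.
tau_B_kernel = np.mean([
    np.mean([g_star(X[i1], X[i2], j1, j2) for (j1, j2) in pairs])
    for i1, i2 in combinations(range(n), 2)])

print(tau_B, tau_B_kernel)  # identical up to floating-point rounding
```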

B.4 Proof of Theorem 4

Recall from Lemma 3 that $\hat{\tau}_{j_1,j_2}$, $\hat{\tau}_B$, $\hat{\tau}_R$, $\hat{\tau}_D$, $\hat{\tau}_U$ can all be written as U-statistics of order 2 with the symmetric kernels defined earlier. To compute the variance of these U-statistics, we will use Hoeffding's formula [23] (see also [43, Section 5.2.1]), which we recall for the reader's convenience. For a second-order U-statistic $U_n \coloneqq \binom{n}{2}^{-1} \sum_{1 \leq i_1 < i_2 \leq n} g(X_{i_1}, X_{i_2})$ with symmetric kernel $g: \mathbb{R}^p \times \mathbb{R}^p \to \mathbb{R}$ satisfying $E|g(X_1, X_2)| < +\infty$, the variance is given by

$$
\operatorname{Var}[U_n] = \binom{n}{2}^{-1} \sum_{c=1}^{2} \binom{2}{c} \binom{n-2}{2-c} \zeta_c = \frac{2}{n(n-1)} \big( 2(n-2)\,\zeta_1 + \zeta_2 \big), \tag{A1}
$$

where $\zeta_1 \coloneqq \operatorname{Var}[g_1(X)]$, $\zeta_2 \coloneqq \operatorname{Var}[g(X_1, X_2)]$, and $g_1(x) \coloneqq E[g(X_1, x)]$. Further, note that $E[g_1(X)] = E[g(X_1, X_2)] = E[U_n]$.

We proceed to evaluate $\zeta_1$ and $\zeta_2$ for the different kernels, and then substitute them into (A1). Since the kernels $g^*$, $g^B$, $g^R$, and $g^D$ are all deterministic, this directly yields the variances of the corresponding estimators. First, we prove items (i)–(iv) of Theorem 4 and then proceed with the proofs of (v) and (vi), where we deal with the randomness of $g^U$.

B.4.1 Proof of (i)

Under Assumption 2, we have, for every $(j_1, j_2) \in \{1, \ldots, b_1\} \times \{b_1+1, \ldots, p\}$,

$$
E\big[ g^*\big( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} \big) \big] = \tau = 2 P_{j_1,j_2} - 1.
$$

Also,

$$
g_1^*(x_{j_1}, x_{j_2}) = E\big[ 2\big( \mathbf{1}\{ x_{j_1} < X_{1,j_1}, x_{j_2} < X_{1,j_2} \} + \mathbf{1}\{ X_{1,j_1} < x_{j_1}, X_{1,j_2} < x_{j_2} \} \big) - 1 \big] = 2 P_c(x_{j_1}, x_{j_2}) - 1,
$$

where $P_c(x_{j_1}, x_{j_2})$ denotes the probability of concordance between two independent copies of $(X_{j_1}, X_{j_2})$, given that one of them equals $(x_{j_1}, x_{j_2})$. Then,

$$
\zeta_1 = \operatorname{Var}[g_1^*(X_{j_1}, X_{j_2})]
= E\big[ \big( 2 P_c(X_{j_1}, X_{j_2}) - 1 - \tau \big)^2 \big]
= 4 E[P_c(X_{j_1}, X_{j_2})^2] - 4(1+\tau)\, E[P_c(X_{j_1}, X_{j_2})] + (1+\tau)^2.
$$

Note that $E[P_c(X_{j_1}, X_{j_2})] = P_{j_1,j_2}$ and $E[P_c(X_{j_1}, X_{j_2})^2] = Q_{j_1,j_2}$. Furthermore, substitution of $\tau = 2 P_{j_1,j_2} - 1$ gives us

$$
\zeta_1 = 4 \big( Q_{j_1,j_2} - P_{j_1,j_2}^2 \big). \tag{A2}
$$

For $\zeta_2$, we find

$$
\begin{aligned}
\zeta_2 = \operatorname{Var}\big[ g^*\big( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} \big) \big]
={}& E\big[ \big( 2\big( \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big) - 1 - \tau \big)^2 \big] \\
={}& 4 E\big[ \big( \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big)^2 \big] \\
&- 4(1+\tau)\, E\big[ \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big] + (1+\tau)^2.
\end{aligned}
$$

Furthermore, note that

$$
\mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \in \{0, 1\},
$$

so that this expression is equal to its own square. We obtain

$$
\begin{aligned}
\zeta_2 &= 4 E\big[ \big( \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big)^2 \big] \\
&\quad - 4(1+\tau)\, E\big[ \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big] + (1+\tau)^2 \\
&= -4\tau\, E\big[ \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big] + (1+\tau)^2 \\
&= -4 (2 P_{j_1,j_2} - 1) P_{j_1,j_2} + (1 + 2 P_{j_1,j_2} - 1)^2 = 4 \big( P_{j_1,j_2} - P_{j_1,j_2}^2 \big), 
\end{aligned} \tag{A3}
$$

where in the second step we have used that

$$
E\big[ \mathbf{1}\{X_{1,j_1} < X_{2,j_1}, X_{1,j_2} < X_{2,j_2}\} + \mathbf{1}\{X_{2,j_1} < X_{1,j_1}, X_{2,j_2} < X_{1,j_2}\} \big] = P_{j_1,j_2}.
$$

Substitution of (A2) and (A3) into (A1) gives us the final expression

$$
\operatorname{Var}[\hat{\tau}_{j_1,j_2}] = \frac{8}{n(n-1)} \Big( 2(n-2)\big( Q_{j_1,j_2} - P_{j_1,j_2}^2 \big) + P_{j_1,j_2} - P_{j_1,j_2}^2 \Big).
$$
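As a sanity check (illustrative, not from the paper): under independence, $P_{j_1,j_2} = 1/2$ and $Q_{j_1,j_2} = 5/18$ (by the computation of Appendix B.2), so this formula reduces to the classical expression $2(2n+5)/(9n(n-1))$, which can be verified by simulation.

```python
# Monte Carlo check (illustrative) of Var[tau_hat_{j1,j2}] for an independent
# pair, where P = 1/2 and Q = 5/18 and the formula above reduces to the
# classical expression 2(2n+5)/(9n(n-1)).
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
n, n_rep = 20, 20000
taus = np.empty(n_rep)
for r in range(n_rep):
    x, y = rng.random(n), rng.random(n)   # independent, so tau = 0
    taus[r] = kendalltau(x, y)[0]
P, Q = 1 / 2, 5 / 18
formula = 8 / (n * (n - 1)) * (2 * (n - 2) * (Q - P**2) + (P - P**2))
print(taus.var(), formula, 2 * (2 * n + 5) / (9 * n * (n - 1)))  # all close
```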

B.4.2 Proof of (ii)

We have

$$
E[g^B(X_1, X_2)] = 2 P_{B,1} - 1,
$$

and for any $x = (x_1, \ldots, x_p) \in \mathbb{R}^p$,

$$
g_1^B(x) = E\Bigg[ \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} g^*\big( X_{1,(j_1,j_2)}, x_{(j_1,j_2)} \big) \Bigg]
= \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \big( 2 P_c(x_{j_1}, x_{j_2}) - 1 \big).
$$

For $\zeta_1$, we then obtain

$$
\begin{aligned}
\zeta_1 = \operatorname{Var}[g_1^B(X)]
={}& E\Bigg[ \Bigg( \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \big( 2 P_c(X_{j_1}, X_{j_2}) - 1 - \tau_{j_1,j_2} \big) \Bigg)^2 \Bigg] \\
={}& \frac{1}{b_1^2 b_2^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( E\big[ \big( 2 P_c(X_{j_1}, X_{j_2}) - 1 - \tau_{j_1,j_2} \big)^2 \big] \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 1}} E\big[ \big( 2 P_c(X_{j_1}, X_{j_2}) - 1 - \tau_{j_1,j_2} \big)\big( 2 P_c(X_{j_3}, X_{j_4}) - 1 - \tau_{j_3,j_4} \big) \big] \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 0}} E\big[ \big( 2 P_c(X_{j_1}, X_{j_2}) - 1 - \tau_{j_1,j_2} \big)\big( 2 P_c(X_{j_3}, X_{j_4}) - 1 - \tau_{j_3,j_4} \big) \big] \Bigg).
\end{aligned}
$$

Now check that, for any $j_1, j_2, j_3, j_4 \in \{1, \ldots, p\}$,

$$
E\big[ P_c(X_{j_1}, X_{j_2})\, P_c(X_{j_3}, X_{j_4}) \big] = Q_{j_1,j_2,j_3,j_4}.
$$

We then have

$$
\begin{aligned}
\zeta_1 = \frac{1}{b_1^2 b_2^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( 4\big( Q_{j_1,j_2} - P_{j_1,j_2}^2 \big)
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 1}} 4\big( Q_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 0}} 4\big( Q_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \Bigg).
\end{aligned}
$$

Therefore, we have

$$
\zeta_1 = \frac{4}{b_1 b_2} \Big( Q_{B,2} - S_{B,2} + (b_1 + b_2 - 2)(Q_{B,1} - S_{B,1}) + (b_1 - 1)(b_2 - 1)(Q_{B,0} - S_{B,0}) \Big). \tag{A4}
$$

For $\zeta_2$, we have

$$
\begin{aligned}
\zeta_2 = \operatorname{Var}[g^B(X_1, X_2)]
={}& E\Bigg[ \Bigg( \frac{1}{b_1 b_2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \big( g^*\big( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} \big) - \tau_{j_1,j_2} \big) \Bigg)^2 \Bigg] \\
={}& \frac{1}{b_1^2 b_2^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( E\big[ \big( g^*( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} ) - \tau_{j_1,j_2} \big)^2 \big] \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 1}} E\big[ \big( g^*( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} ) - \tau_{j_1,j_2} \big)\big( g^*( X_{1,(j_3,j_4)}, X_{2,(j_3,j_4)} ) - \tau_{j_3,j_4} \big) \big] \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 0}} E\big[ \big( g^*( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} ) - \tau_{j_1,j_2} \big)\big( g^*( X_{1,(j_3,j_4)}, X_{2,(j_3,j_4)} ) - \tau_{j_3,j_4} \big) \big] \Bigg).
\end{aligned}
$$

Remark that

$$
E\big[ g^*\big( X_{1,(j_1,j_2)}, X_{2,(j_1,j_2)} \big)\, g^*\big( X_{1,(j_3,j_4)}, X_{2,(j_3,j_4)} \big) \big] = 4 P_{j_1,j_2,j_3,j_4} - 2 P_{j_1,j_2} - 2 P_{j_3,j_4} + 1.
$$

Therefore,

$$
\begin{aligned}
\zeta_2 = \frac{1}{b_1^2 b_2^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( 4\big( P_{j_1,j_2} - P_{j_1,j_2}^2 \big)
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 1}} 4\big( P_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 0}} 4\big( P_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \Bigg),
\end{aligned}
$$

and we obtain the expression

$$
\zeta_2 = \frac{4}{b_1 b_2} \Big( P_{B,2} - S_{B,2} + (b_1 + b_2 - 2)(P_{B,1} - S_{B,1}) + (b_1 - 1)(b_2 - 1)(P_{B,0} - S_{B,0}) \Big). \tag{A5}
$$

Finally, by substituting (A4) and (A5) into (A1), we find

$$
\begin{aligned}
\operatorname{Var}[\hat{\tau}_B] = \frac{8}{b_1 b_2\, n(n-1)} \Big( 2(n-2) \big( Q_{B,2} - S_{B,2} &+ (b_1 + b_2 - 2)(Q_{B,1} - S_{B,1}) + (b_1 - 1)(b_2 - 1)(Q_{B,0} - S_{B,0}) \big) \\
{}+ \big( P_{B,2} - S_{B,2} &+ (b_1 + b_2 - 2)(P_{B,1} - S_{B,1}) + (b_1 - 1)(b_2 - 1)(P_{B,0} - S_{B,0}) \big) \Big).
\end{aligned}
$$

B.4.3 Proof of (iii)

In a similar manner to the proof of (ii), we obtain

$$
\zeta_1 = \frac{1}{N^2} \sum_{j_1=1}^{N} \Bigg( 4\big( Q_{1,b_1+j_1} - P_{1,b_1+j_1}^2 \big) + \sum_{j_2=1, j_2 \neq j_1}^{N} 4\big( Q_{1,b_1+j_1,1,b_1+j_2} - P_{1,b_1+j_1} P_{1,b_1+j_2} \big) \Bigg).
$$

Note that there is one summation fewer than in the corresponding expression in (ii), since $\{1, b_1+j_1\} \cap \{1, b_1+j_2\} \neq \emptyset$ for every $j_1, j_2$. Therefore,

$$
\zeta_1 = \frac{4}{N} \big( Q_{R,2} - S_{R,2} + (N-1)(Q_{R,1} - S_{R,1}) \big).
$$

Similarly, we find that

$$
\zeta_2 = \frac{4}{N} \big( P_{R,2} - S_{R,2} + (N-1)(P_{R,1} - S_{R,1}) \big).
$$

Hence,

$$
\operatorname{Var}[\hat{\tau}_R] = \frac{8}{N n(n-1)} \Big( 2(n-2) \big( Q_{R,2} - S_{R,2} + (N-1)(Q_{R,1} - S_{R,1}) \big) + \big( P_{R,2} - S_{R,2} + (N-1)(P_{R,1} - S_{R,1}) \big) \Big).
$$

B.4.4 Proof of (iv)

Again, in a similar manner to the proof of (ii), we obtain

$$
\zeta_1 = \frac{1}{N^2} \sum_{j_1=1}^{N} \Bigg( 4\big( Q_{j_1,b_1+j_1} - P_{j_1,b_1+j_1}^2 \big) + \sum_{j_2=1, j_2 \neq j_1}^{N} 4\big( Q_{j_1,b_1+j_1,j_2,b_1+j_2} - P_{j_1,b_1+j_1} P_{j_2,b_1+j_2} \big) \Bigg).
$$

Note that, as in (iii), there is one summation fewer than in the corresponding expression in (ii), since now $\{j_1, b_1+j_1\} \cap \{j_2, b_1+j_2\} = \emptyset$ for every $j_1 \neq j_2$. Therefore,

$$
\zeta_1 = \frac{4}{N} \big( Q_{D,2} - S_{D,2} + (N-1)(Q_{D,0} - S_{D,0}) \big).
$$

Similarly, we find that

$$
\zeta_2 = \frac{4}{N} \big( P_{D,2} - S_{D,2} + (N-1)(P_{D,0} - S_{D,0}) \big).
$$

Hence,

$$
\operatorname{Var}[\hat{\tau}_D] = \frac{8}{N n(n-1)} \Big( 2(n-2) \big( Q_{D,2} - S_{D,2} + (N-1)(Q_{D,0} - S_{D,0}) \big) + \big( P_{D,2} - S_{D,2} + (N-1)(P_{D,0} - S_{D,0}) \big) \Big).
$$

B.4.5 Proof of (v)

This proof is very similar to that of (ii), except that we replace all expectations by conditional expectations given $W = w$. Note that, conditionally on $W$, $\hat{\tau}_U$ is a U-statistic with (deterministic) kernel $g^U$. As before, we obtain the corresponding $\zeta_1$ and $\zeta_2$,

$$
\begin{aligned}
\zeta_1 = \frac{1}{N^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( 4 W_{j_1,j_2}^2 \big( Q_{j_1,j_2} - P_{j_1,j_2}^2 \big)
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 1}} 4 W_{j_1,j_2} W_{j_3,j_4} \big( Q_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 0}} 4 W_{j_1,j_2} W_{j_3,j_4} \big( Q_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \Bigg),
\end{aligned}
$$

and

$$
\begin{aligned}
\zeta_2 = \frac{1}{N^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( 4 W_{j_1,j_2}^2 \big( P_{j_1,j_2} - P_{j_1,j_2}^2 \big)
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 1}} 4 W_{j_1,j_2} W_{j_3,j_4} \big( P_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \\
&+ \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ \#(\{j_1,j_2\} \cap \{j_3,j_4\}) = 0}} 4 W_{j_1,j_2} W_{j_3,j_4} \big( P_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \Bigg),
\end{aligned}
$$

which lead to the claimed result.□

B.4.6 Proof of (vi)

Finally, to obtain the variance of $\hat{\tau}_U$, we need to deal with the random kernel $g^U$. To this end, we use the law of total variance and find

$$
\operatorname{Var}[\hat{\tau}_U]
= \operatorname{Var}\big[ E[\hat{\tau}_U \mid W] \big] + E\big[ \operatorname{Var}[\hat{\tau}_U \mid W] \big]
= \operatorname{Var}_{J_1,J_2}[\tau_{J_1,J_2}] + E\big[ \operatorname{Var}[\hat{\tau}_U \mid W] \big].
$$

We can thus obtain the desired variance by first evaluating the variance of $\hat{\tau}_U$ for a given $W$ and then taking the expectation with respect to $W$. Therefore,

$$
E\big[ \operatorname{Var}[\hat{\tau}_U \mid W] \big] = \frac{2}{n(n-1)} \big( 2(n-2)\, E[\zeta_1] + E[\zeta_2] \big). \tag{A6}
$$

By inserting the expressions of $\zeta_1$ and $\zeta_2$ found in (v) into (A6), we obtain

$$
\begin{aligned}
E\big[ \operatorname{Var}[\hat{\tau}_U \mid W] \big]
={}& \frac{2}{n(n-1)} \Bigg[ \frac{2(n-2)}{N^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( 4 E[W_{j_1,j_2}^2] \big( Q_{j_1,j_2} - P_{j_1,j_2}^2 \big) \\
&\qquad + \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ (j_3,j_4) \neq (j_1,j_2)}} 4 E[W_{j_1,j_2} W_{j_3,j_4}] \big( Q_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \Bigg) \\
&\quad + \frac{1}{N^2} \sum_{j_1=1}^{b_1} \sum_{j_2=b_1+1}^{p} \Bigg( 4 E[W_{j_1,j_2}^2] \big( P_{j_1,j_2} - P_{j_1,j_2}^2 \big) \\
&\qquad + \sum_{\substack{(j_3,j_4) \in \{1,\ldots,b_1\} \times \{b_1+1,\ldots,p\} \\ (j_3,j_4) \neq (j_1,j_2)}} 4 E[W_{j_1,j_2} W_{j_3,j_4}] \big( P_{j_1,j_2,j_3,j_4} - P_{j_1,j_2} P_{j_3,j_4} \big) \Bigg) \Bigg]. \tag{A7}
\end{aligned}
$$

Recall that we select $N$ pairs out of the $b_1 b_2$ possible pairs with uniform probability and without replacement. Therefore, for any two distinct pairs $(j_1, j_2)$ and $(j_3, j_4)$, we can compute the following expectations:

$$
E[W_{j_1,j_2}^2] = E[W_{j_1,j_2}] = \frac{N}{b_1 b_2}, \qquad
E[W_{j_1,j_2} W_{j_3,j_4}] = \frac{N(N-1)}{b_1 b_2 (b_1 b_2 - 1)}.
$$
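These moments are easy to confirm by simulation; the sketch below (arbitrary $b_1$, $b_2$, $N$; illustrative only) draws the indicator weights by sampling $N$ of the $b_1 b_2$ pairs without replacement.

```python
# Quick simulation (assuming indicator weights W in {0, 1}) verifying the
# selection-weight moments: E[W] = N/(b1*b2) and, for two distinct pairs,
# E[W W'] = N(N-1)/(b1*b2*(b1*b2 - 1)).
import numpy as np

rng = np.random.default_rng(1)
b1, b2, N, n_rep = 4, 5, 7, 100000
m = b1 * b2
w_first = w_prod = 0.0
for _ in range(n_rep):
    sel = rng.choice(m, size=N, replace=False)  # uniform, without replacement
    w = np.zeros(m)
    w[sel] = 1.0
    w_first += w[0]
    w_prod += w[0] * w[1]
print(w_first / n_rep, N / m)                       # ~ N/(b1*b2)
print(w_prod / n_rep, N * (N - 1) / (m * (m - 1)))  # ~ N(N-1)/(b1b2(b1b2-1))
```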

Finally, by combining this with Equation (A7), we establish the desired formula

$$
\begin{aligned}
E\big[ \operatorname{Var}[\hat{\tau}_U \mid W] \big] = \frac{8}{N n(n-1)} \Bigg( 2(n-2) \Big( Q_{B,2} - S_{B,2} &+ \frac{N-1}{b_1 b_2 - 1} (b_1 + b_2 - 2)(Q_{B,1} - S_{B,1}) \\
&+ \frac{N-1}{b_1 b_2 - 1} (b_1 - 1)(b_2 - 1)(Q_{B,0} - S_{B,0}) \Big) \\
{}+ \Big( P_{B,2} - S_{B,2} &+ \frac{N-1}{b_1 b_2 - 1} (b_1 + b_2 - 2)(P_{B,1} - S_{B,1}) \\
&+ \frac{N-1}{b_1 b_2 - 1} (b_1 - 1)(b_2 - 1)(P_{B,0} - S_{B,0}) \Big) \Bigg).
\end{aligned}
$$

Note that, for $N = b_1 b_2$, this expression reduces to $\operatorname{Var}[\hat{\tau}_B]$, as expected.

B.5 Proof of Theorem 8

Proof

In [10] (see p. 299), it was already shown that the conditional Kendall's tau estimator defined in (12) is asymptotically normal at different points of the conditioning variable. However, the asymptotic normality of the averaging estimators remains to be proven. To this end, we follow their approach of studying the joint distribution of U-statistics at several conditioning points, and give a detailed proof for completeness. The asymptotic covariance matrices are then obtained by combining the results with the appropriate kernels under Assumption 4.

Remember that the averaged conditional Kendall's tau estimators are conditional U-statistics with the kernels given by Lemma 3, which treats the unconditional case. Furthermore, for any measurable function $g: \mathbb{R}^p \times \mathbb{R}^p \to \mathbb{R}$, let us define the second-order U-statistic

$$
U_{n,j}(g) \coloneqq \frac{1}{n(n-1)} \sum_{1 \leq i_1 \neq i_2 \leq n} g_{i_1,i_2}, \tag{A8}
$$

where

$$
g_{i_1,i_2} \coloneqq g(X_{i_1}, X_{i_2})\, \frac{K_h(z_j - Z_{i_1})\, K_h(z_j - Z_{i_2})}{E[K_h(z_j - Z)]^2}.
$$

It follows easily that the averaging estimators can be written in terms of $U_{n,j}$ as

$$
\hat{\tau}^{Z=z_j} = \frac{U_{n,j}(g)}{U_{n,j}(1) + \varepsilon_{n,j}}, \tag{A9}
$$

where we write $\hat{\tau}^{Z=z_j}$ for any of the estimators $\hat{\tau}_B^{Z=z_j}$, $\hat{\tau}_R^{Z=z_j}$, $\hat{\tau}_D^{Z=z_j}$, $\hat{\tau}_U^{Z=z_j}$, with $g$ given by, respectively, $g^B$, $g^R$, $g^D$, and $g^U$. The residual term $\varepsilon_{n,j}$ is given by

$$
\varepsilon_{n,j} \coloneqq \frac{\sum_{i=1}^{n} K_h^2(z_j - Z_i)}{n(n-1)\, E[K_h(z_j - Z)]^2}.
$$

Further, we set

$$
\tilde{\tau}^{Z=z_j} \coloneqq \frac{U_{n,j}(g)}{U_{n,j}(1)}, \tag{A10}
$$

to be the equivalent of (A9) with the term $\varepsilon_{n,j}$ removed. This is a simpler version of (A9) that will be easier to analyze theoretically. We now show that $\hat{\tau}^{Z=z_j}$ and $\tilde{\tau}^{Z=z_j}$ are close. Under Assumption 5, we replace both $\tilde{\tau}^{Z=z_j} = U_{n,j}(g)/U_{n,j}(1)$ and $\hat{\tau}^{Z=z_j} = U_{n,j}(g)/(U_{n,j}(1) + \varepsilon_{n,j})$ by their expressions to obtain

$$
\begin{aligned}
E\left[ \frac{1}{\varepsilon_{n,j}} \big( \tilde{\tau}^{Z=z_j} - \hat{\tau}^{Z=z_j} \big) \right]
&= E\left[ \frac{1}{\varepsilon_{n,j}} \left( \frac{U_{n,j}(g)}{U_{n,j}(1)} - \frac{U_{n,j}(g)}{U_{n,j}(1) + \varepsilon_{n,j}} \right) \right]
= E\left[ \frac{U_{n,j}(g)\big( U_{n,j}(1) + \varepsilon_{n,j} \big) - U_{n,j}(g)\, U_{n,j}(1)}{\varepsilon_{n,j}\, U_{n,j}(1) \big( U_{n,j}(1) + \varepsilon_{n,j} \big)} \right] \\
&= E\left[ \frac{U_{n,j}(g)\, U_{n,j}(1) + U_{n,j}(g)\, \varepsilon_{n,j} - U_{n,j}(g)\, U_{n,j}(1)}{\varepsilon_{n,j}\, U_{n,j}(1) \big( U_{n,j}(1) + \varepsilon_{n,j} \big)} \right]
= E\left[ \frac{U_{n,j}(g)\, \varepsilon_{n,j}}{\varepsilon_{n,j}\, U_{n,j}(1) \big( U_{n,j}(1) + \varepsilon_{n,j} \big)} \right] \\
&= E\left[ \frac{1}{U_{n,j}(1)} \cdot \frac{U_{n,j}(g)}{U_{n,j}(1) + \varepsilon_{n,j}} \right]
= E\left[ \frac{\hat{\tau}^{Z=z_j}}{U_{n,j}(1)} \right] = O(1),
\end{aligned}
$$

because $\hat{\tau}^{Z=z_j}$ is bounded by 1 and $U_{n,j}(1)$ is asymptotically normal, hence convergent, by Lemma 17 of [10]. Therefore, $\hat{\tau}^{Z=z_j} - \tilde{\tau}^{Z=z_j} = O_P(\varepsilon_{n,j})$ using Markov's inequality. By Assumption 5(c) and by Bochner's lemma (see, e.g., [44]), we see that $\varepsilon_{n,j} = O_P\big( (n h^d)^{-1} \big)$. It then follows by Assumption 7 that

$$
(n h^d)^{1/2} \big( \hat{\tau}^{Z=z_j} - \tilde{\tau}^{Z=z_j} \big) = O_P\big( (n h^d)^{1/2}\, \varepsilon_{n,j} \big) = o_P(1).
$$

It therefore suffices to obtain the limiting law of $(n h^d)^{1/2} \big( \tilde{\tau}^{Z=z_j} - \tau^{Z=z_j} \big)$ as $n \to \infty$.
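To fix ideas, here is a minimal illustrative sketch of the ratio $U_{n,j}(g)/U_{n,j}(1)$, i.e., $\tilde{\tau}^{Z=z_j}$, for the pairwise sign kernel $g = g^*$. It assumes $d = 1$ and a Gaussian kernel, and it is not the authors' implementation (see the R package CondCopulas [8] for one); the normalizing constant of the kernel cancels in the ratio.

```python
# Illustrative sketch of tau_tilde^{Z=z0} = U_n(g)/U_n(1) for the pairwise
# sign kernel g*, assuming d = 1 and a Gaussian kernel.
import numpy as np

def cond_kendall_tau(x1, x2, z, z0, h):
    """Kernel-weighted conditional Kendall's tau of (x1, x2) at Z = z0."""
    w = np.exp(-0.5 * ((z - z0) / h) ** 2)  # proportional to K_h(z0 - Z_i)
    n = len(z)
    num = den = 0.0
    for i in range(n):
        for k in range(n):
            if i == k:
                continue  # the diagonal i1 = i2 is excluded, as in (A8)
            s = np.sign((x1[i] - x1[k]) * (x2[i] - x2[k]))
            num += w[i] * w[k] * s
            den += w[i] * w[k]
    return num / den

# Tiny demo: dependence induced through the common covariate Z.
rng = np.random.default_rng(0)
z = rng.random(200)
x1 = z + 0.1 * rng.standard_normal(200)
x2 = z + 0.1 * rng.standard_normal(200)
print(cond_kendall_tau(x1, x2, z, z0=0.5, h=0.1))
```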

Now let us apply Lemma 17 of [10] again, this time to the joint asymptotic law of U-statistics of the form $U_{n,j}$. That is, under Assumptions 5–7 and for any two bounded measurable functions $g_1$ and $g_2$,

$$
(n h^d)^{1/2} \Big( \big( U_{n,j}(g_1) - E[U_{n,j}(g_1)] \big)_{j=1,\ldots,n},\ \big( U_{n,j}(g_2) - E[U_{n,j}(g_2)] \big)_{j=1,\ldots,n} \Big)
\xrightarrow{\text{law}} \mathcal{N}\left( 0,\ \begin{pmatrix} M(g_1, g_1) & M(g_1, g_2) \\ M(g_1, g_2) & M(g_2, g_2) \end{pmatrix} \right), \tag{A11}
$$

as $n \to \infty$, where

$$
[M(g_1, g_2)]_{j_1,j_2} \coloneqq \frac{4 \int K^2\, \mathbf{1}\{z_{j_1} = z_{j_2}\}}{f_Z(z_{j_1})} \int g_1(x_1, x)\, g_2(x_2, x)\, f_{X \mid Z = z_{j_1}}(x)\, f_{X \mid Z = z_{j_1}}(x_1)\, f_{X \mid Z = z_{j_1}}(x_2)\, dx\, dx_1\, dx_2. \tag{A12}
$$

Let us investigate the expectation of $U_{n,j}(g)$. By (A8), we write

$$
E[U_{n,j}(g)] = \frac{1}{E[K_h(z_j - Z)]^2}\, E\big[ g(X_1, X_2)\, K_h(z_j - Z_1)\, K_h(z_j - Z_2) \big].
$$

Further, by a change of variables, we find

$$
\begin{aligned}
E\big[ g(X_1, X_2)\, K_h(z_j - Z_1)\, K_h(z_j - Z_2) \big]
&= \int g(x_1, x_2)\, K_h(z_j - z_1)\, K_h(z_j - z_2)\, f_{X,Z}(x_1, z_1)\, f_{X,Z}(x_2, z_2)\, dx_1\, dx_2\, dz_1\, dz_2 \\
&= \int g(x_1, x_2)\, K(u_1)\, K(u_2)\, f_{X,Z}(x_1, z_j + h u_1)\, f_{X,Z}(x_2, z_j + h u_2)\, dx_1\, dx_2\, du_1\, du_2,
\end{aligned} \tag{A13}
$$

with the change of variables $z_1 = z_j + h u_1$ and $z_2 = z_j + h u_2$. Let us define the function $\phi_{x_1,x_2,u_1,u_2}(t) \coloneqq f_{X,Z}(x_1, z_j + t h u_1)\, f_{X,Z}(x_2, z_j + t h u_2)$ for $t \in [0, 1]$. By Assumption 6, $\phi_{x_1,x_2,u_1,u_2}$ is $\alpha$ times differentiable, allowing us to apply the Taylor–Lagrange formula. This gives

$$
\phi_{x_1,x_2,u_1,u_2}(1) = \sum_{k=0}^{\alpha-1} \frac{1}{k!}\, \phi_{x_1,x_2,u_1,u_2}^{(k)}(0) + \frac{1}{\alpha!}\, \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*), \tag{A14}
$$

for some $t^* \in [0, 1]$, where $\phi_{x_1,x_2,u_1,u_2}^{(k)}(t)$ is equal to

$$
\sum_{l=0}^{k} \binom{k}{l} \sum_{j_1, \ldots, j_k = 1}^{d} h^k\, u_{j_1,1} \cdots u_{j_l,1}\, u_{j_{l+1},2} \cdots u_{j_k,2}\,
\frac{\partial^l f_{X,Z}}{\partial z_{j_1} \cdots \partial z_{j_l}}(x_1, z_j + t h u_1)\,
\frac{\partial^{k-l} f_{X,Z}}{\partial z_{j_{l+1}} \cdots \partial z_{j_k}}(x_2, z_j + t h u_2).
$$

After substituting (A14) into (A13), we obtain

$$
\begin{aligned}
\int g(x_1, x_2)&\, K(u_1) K(u_2) \left( \sum_{k=0}^{\alpha-1} \frac{1}{k!}\, \phi_{x_1,x_2,u_1,u_2}^{(k)}(0) + \frac{1}{\alpha!}\, \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*) \right) dx_1\, dx_2\, du_1\, du_2 \\
&= \int g(x_1, x_2)\, K(u_1) K(u_2) \left( \phi_{x_1,x_2,u_1,u_2}(0) + \frac{1}{\alpha!}\, \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*) \right) dx_1\, dx_2\, du_1\, du_2 \\
&= \int g(x_1, x_2)\, K(u_1) K(u_2)\, f_{X,Z}(x_1, z_j)\, f_{X,Z}(x_2, z_j)\, dx_1\, dx_2\, du_1\, du_2 \\
&\qquad + \frac{1}{\alpha!} \int g(x_1, x_2)\, K(u_1) K(u_2)\, \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*)\, dx_1\, dx_2\, du_1\, du_2 \\
&= f_Z^2(z_j)\, E[g(X_1, X_2) \mid Z_1 = Z_2 = z_j] + \frac{1}{\alpha!} \int g(x_1, x_2)\, K(u_1) K(u_2)\, \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*)\, dx_1\, dx_2\, du_1\, du_2,
\end{aligned}
$$

where in the first equality we have used the fact that $\int K(u)\, u_{j_1} \cdots u_{j_k}\, du = 0$ for all $k = 1, \ldots, \alpha - 1$, as stated in Assumption 5(b), which results in the elimination of all the terms in the sum except the first one. In the second equality, we replaced $\phi_{x_1,x_2,u_1,u_2}(0)$ by its expression. In the last equality, we factor out $f_Z^2(z_j)$ and recognize that the corresponding integral is a conditional expectation.

We now bound the second term of the previous display. By Assumption 6, we have

$$
\left| \frac{1}{\alpha!} \int g(x_1, x_2)\, K(u_1) K(u_2)\, \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*)\, dx_1\, dx_2\, du_1\, du_2 \right|
\leq \frac{1}{\alpha!} \int K(u_1) K(u_2)\, \big| \phi_{x_1,x_2,u_1,u_2}^{(\alpha)}(t^*) \big|\, dx_1\, dx_2\, du_1\, du_2
\leq C_{X,Z}\, h^\alpha,
$$

using that $g$ is bounded. Therefore,

$$
E\big[ g(X_1, X_2)\, K_h(z_j - Z_1)\, K_h(z_j - Z_2) \big] = f_Z^2(z_j)\, E[g(X_1, X_2) \mid Z_1 = Z_2 = z_j] + O(h^\alpha),
$$

and by the same reasoning, we obtain

$$
E[K_h(z_j - Z)]^2 = f_Z^2(z_j) + O(h^\alpha).
$$

Consequently, we find that

$$
E[U_{n,j}(g)] = \frac{E\big[ g(X_1, X_2)\, K_h(z_j - Z_1)\, K_h(z_j - Z_2) \big]}{E[K_h(z_j - Z)]^2} = E[g(X_1, X_2) \mid Z_1 = Z_2 = z_j] + r_{n,j}, \tag{A15}
$$

where $|r_{n,j}| \leq C_0 h^\alpha$ for some constant $C_0$ independent of $j$. Then, by Assumption 7,

$$
(n h^d)^{1/2} \big( E[U_{n,j}(g)] - E[g(X_1, X_2) \mid Z_1 = Z_2 = z_j] \big) = O\big( (n h^d)^{1/2} h^\alpha \big) = o(1). \tag{A16}
$$

Therefore, the asymptotic law (A11) still holds after replacing $E[U_{n,j}(g)]$ with $E[g(X_1, X_2) \mid Z_1 = Z_2 = z_j]$. As such,

$$
(n h^d)^{1/2} \Big( \big( U_{n,j}(g) - \tau^{Z=z_j} \big)_{j=1,\ldots,n},\ \big( U_{n,j}(1) - 1 \big)_{j=1,\ldots,n} \Big)
\xrightarrow{\text{law}} \mathcal{N}\left( 0,\ \begin{pmatrix} M(g, g) & M(g, 1) \\ M(g, 1) & M(1, 1) \end{pmatrix} \right), \tag{A17}
$$

as $n \to \infty$, where we have used that $E[g(X_1, X_2) \mid Z_1 = Z_2 = z_j] = \tau^{Z=z_j}$ under Assumption 4.

To derive the asymptotic law of $(n h^d)^{1/2} \big( \tilde{\tau}^{Z=z_j} - \tau^{Z=z_j} \big)$, we apply the Delta method to (A17) with the function $\gamma(x, y) \coloneqq x / y$, which divides two real vectors $x, y$ of size $n$ component-wise. The corresponding Jacobian is given by the $n \times 2n$ matrix

$$
J_\gamma(x, y) = \big[ \operatorname{Diag}\big( y_1^{-1}, \ldots, y_n^{-1} \big),\ \operatorname{Diag}\big( -x_1 y_1^{-2}, \ldots, -x_n y_n^{-2} \big) \big].
$$

Hence, as $n \to \infty$,

$$
(n h^d)^{1/2} \big( \hat{\tau}^{Z=z_j} - \tau^{Z=z_j} \big)_{j=1,\ldots,n} \xrightarrow{\text{law}} \mathcal{N}(0, H),
$$

setting

$$
H \coloneqq J_\gamma(\tau_Z, e)
\begin{pmatrix} M(g, g) & M(g, 1) \\ M(g, 1) & M(1, 1) \end{pmatrix}
J_\gamma(\tau_Z, e)^\top,
$$

where $\tau_Z$ and $e$ denote $n$-dimensional vectors filled with, respectively, $\tau^{Z=z_j}$ and 1. This gives

$$
H = M(g, g) - \operatorname{Diag}(\tau_Z)\, M(g, 1) - M(g, 1)\, \operatorname{Diag}(\tau_Z) + \operatorname{Diag}(\tau_Z)\, M(1, 1)\, \operatorname{Diag}(\tau_Z),
$$

and for $1 \leq j_1, j_2 \leq n$, we find

$$
\begin{aligned}
[M(g, g)]_{j_1,j_2} &= \frac{4 \int K^2\, \mathbf{1}\{z_{j_1} = z_{j_2}\}}{f_Z(z_{j_1})}\, E\big[ g(X_1, X_3)\, g(X_2, X_3) \mid Z_1 = Z_2 = Z_3 = z_{j_1} \big], \\
[\operatorname{Diag}(\tau_Z)\, M(g, 1)]_{j_1,j_2} &= \frac{4 \int K^2\, \mathbf{1}\{z_{j_1} = z_{j_2}\}}{f_Z(z_{j_1})}\, \tau^{Z=z_{j_1}}\, E\big[ g(X_1, X_2) \mid Z_1 = Z_2 = z_{j_1} \big]
= \frac{4 \int K^2\, \mathbf{1}\{z_{j_1} = z_{j_2}\}}{f_Z(z_{j_1})}\, \big( \tau^{Z=z_{j_1}} \big)^2 \\
&= [M(g, 1)\, \operatorname{Diag}(\tau_Z)]_{j_1,j_2} = [\operatorname{Diag}(\tau_Z)\, M(1, 1)\, \operatorname{Diag}(\tau_Z)]_{j_1,j_2},
\end{aligned}
$$

and thus,

$$
[H]_{j_1,j_2} = \frac{4 \int K^2\, \mathbf{1}\{z_{j_1} = z_{j_2}\}}{f_Z(z_{j_1})} \Big( E\big[ g(X_1, X_2)\, g(X_1, X_3) \mid Z_1 = Z_2 = Z_3 = z_{j_1} \big] - \big( \tau^{Z=z_{j_1}} \big)^2 \Big).
$$

Finally, by substituting the appropriate kernels and by following steps similar to the derivation of the corresponding $\zeta_1$'s in the proof of Theorem 4, it is easily seen that, under Assumption 4, we obtain the desired asymptotic covariance matrices.□

C List of stocks used

Group 1

  1. Cafom (CAFO)

  2. Techstep (TECH)

  3. Gold by Gold (ALGLD)

  4. Fonciere Inea (INEA)

  5. NSC Groupe (ALNSC)

  6. Hofseth BioCare (HBC)

  7. GC Rieber Shipping (RISH)

  8. Aega (AEGA)

  9. i2S (ALI2S)

  10. Moury Construct (MOUR)

  11. Gascogne (ALBI)

  12. Thunderbird (TBIRD)

  13. Hydratec Industries (HYDRA)

  14. Sparebank 1 Ostfold Akershus (SOAG)

  15. Altareit (AREIT)

  16. Unibel (UNBL)

  17. Cheops Technology France (MLCHE)

  18. Zenobe Gramme Cert (ZEN)

  19. Indel Fin (INFE)

  20. Artois Nom (ARTO)

  21. IDS (MLIDS)

  22. Musee Grevin (GREV)

  23. Robertet (CBE)

  24. Aurskog Sparebank (AURG)

  25. Alliance Developpement Capital (ALDV)

  26. Fonciere Atland (FATL)

  27. FREYR Battery (FREY)

  28. Hotels de Paris (HDP)

  29. Phone Web (MLPHW)

  30. Maroc Telecom (IAM)

  31. Sporting (SCP)

  32. MG International (ALMGI)

  33. Ucar (ALUCR)

  34. Cumulex (CLEX)

  35. Televerbier (TVRB)

  36. Alan Allman Associates (AAA)

  37. Serma Group (ALSER)

  38. Planet Media (ALPLA)

  39. Philly Shipyard (PHLY)

  40. Augros Cosmetic Packaging (AUGR)

  41. Sequa Petroleum (MLSEQ)

  42. EMOVA Group (ALEMV)

  43. Streamwide (ALSTW)

  44. Accentis (ACCB)

  45. Smalto (MLSML)

  46. Signaux Girod (ALGIR)

Group 2

  1. Interoil (IOX)

  2. Ensurge Micropower (ENSU)

  3. Idex Biometrics (IDEX)

  4. SD Standard Drilling (SDSD)

  5. SpareBank 1 Nord-Norge (NONG)

  6. Bonheur (BONHR)

  7. Eidesvik Offshore (EIOF)

  8. DOF (DOF)

  9. Solstad Offshore (SOFF)

  10. Havila Shipping (HAVI)

  11. Awilco LNG (ALNG)

  12. FLEX LNG (FLNG)

  13. Avance Gas Holding (AGAS)

  14. Hunter Group (HUNT)

  15. Itera (ITERA)

  16. Q-Free (QFR)

  17. Photocure (PHO)

  18. PCI Biotech Holding (PCIB)

  19. Hexagon Composites (HEX)

  20. Nel (NEL)

  21. McPhy Energy (MCPHY)

  22. Vow (VOW)

  23. Axactor (ACR)

  24. Magseis Fair Fairfield (MSEIS)

  25. NRC Group (NRC)

  26. Petrolia (PSE)

  27. Ctac (CTAC)

  28. StrongPoint (STRO)

  29. Crescent (OPTI)

  30. Magnora (MGN)

  31. Rec Silicon (RECSI)

  32. Questerre Energy Corp (QEC)

  33. ElectroMagnetic GeoServices (EMGS)

  34. ABG Sundal Collier Holding (ABG)

  35. Nekkar (NKR)

  36. SeaBird Exploration (GEG)

  37. Wilh. Wilhelmsen Holding (WWI)

  38. Golden Ocean Group (GOGL)

  39. Frontline (FRO)

  40. Euronav (EURN)

  41. Norwegian Air Shuttle (NAS)

  42. Atea (ATEA)

  43. Vopak (VPK)

  44. Orkla (ORK)

  45. Corbion (CRBN)

  46. Otello Corporation (OTEC)

Group 3

  1. Aures Technologies (AURS)

  2. Keyrus (ALKEY)

  3. Nextedia (ALNXT)

  4. Cabasse Group (ALCG)

  5. Groupe Guillin (ALGIL)

  6. Guillemot (GUI)

  7. Solutions 30 (S30)

  8. Esker (ALESK)

  9. Wavestone (WAVE)

  10. Groupe Open (OPN)

  11. Envea (ALTEV)

  12. Stern Groep (STRN)

  13. IT Link (ALITL)

  14. Lectra (LSS)

  15. Groupe CRIT (CEN)

  16. Aubay (AUB)

  17. Sword Group (SWP)

  18. NRJ Group (NRG)

  19. Van de Velde (VAN)

  20. Hunter Douglas (HDG)

  21. Oeneo (SBT)

  22. Axway Software (AXW)

  23. SES-imagotag (SESL)

  24. Ateme (ATEME)

  25. Infotel (INF)

  26. Sergeferrari Group (SEFER)

  27. Umanis (ALUMS)

  28. Corticeira Amorim (COR)

  29. Pharmagest Interactive (PHA)

  30. Asetek (ASTK)

  31. ID Logistics Group (IDL)

  32. Scana (SCANA)

  33. Acheter Louer fr (ALOLO)

  34. Adomos (ALADO)

  35. Glintt (GLINT)

  36. Inapa (INA)

  37. Cegedim (CGM)

  38. Lavide Holding (LVIDE)

  39. TIE Kinetix (TIE)

  40. Alumexx (ALX)

  41. Ober (ALOBR)

  42. Cibox Interactive (CIB)

  43. Evolis (ALTVO)

  44. Proactis (PROAC)

  45. Visiodent (SDT)

  46. Fashion B Air (ALFBA)

  47. Adthink (ALADM)

  48. Innelec Multimedia (ALINN)

  49. Herige (ALHRG)

  50. Egide (GID)

  51. U10 Corp (ALU10)

  52. Mr. Bricolage (ALMRB)

  53. Coheris (COH)

  54. Pcas (PCA)

  55. Rosier (ENGB)

  56. Itesoft (ITE)

  57. Gea Grenobl.Elect. (GEA)

  58. Immobel (IMMO)

  59. IGE+XAO Group (IGE)

  60. Koninklijke Brill (BRILL)

  61. Argan (ARG)

  62. Fonciere Lyonnaise (FLY)

  63. Covivio Hotels (COVH)

  64. Electricite de Strasbourg (ELEC)

  65. Robertet (RBT)

  66. Norway Royal Salmon (NRS)

  67. Olympique Lyonnais Groupe (OLG)

  68. GeoJunxion (GOJXN)

  69. Hybrid Software Group (HYSG)

  70. Cast (CAS)

  71. Acteos (EOS)

  72. HF Company (ALHF)

  73. Vranken-Pommery Monopole (VRAP)

  74. Generix Group (GENX)

  75. Union Technologies Infor. (FPG)

  76. Diagnostic Medical Systems (DGM)

  77. Capelli (CAPLI)

  78. EXEL Industries (EXE)

  79. Groupe LDLC (ALLDL)

  80. genOway (ALGEN)

  81. CBo Territoria (CBOT)

  82. Aurea (AURE)

  83. EO2 (ALEO2)

  84. RAK Petroleum (RAKP)

Group 4

  1. DNO (DNO)

  2. Archer Limited (ARCH)

  3. Odfjell Drilling (ODL)

  4. BW Offshore (BWO)

  5. Panoro Energy (PEN)

  6. PGS (PGS)

  7. TGS (TGS)

  8. Subsea 7 (SUBC)

  9. Equinor (EQNR)

  10. Aker BP (AKRBP)

  11. Aker (AKER)

  12. Aker Solutions (AKSO)

  13. Akastor (AKAST)

  14. Etablissements Maurel et Prom (MAU)

  15. Vallourec (VK)

  16. CGG (CGG)

  17. TechnipFMC (FTI)

  18. Fugro (FUR)

  19. SBM Offshore (SBMO)

  20. Galp Energia (GALP)

  21. TotalEnergies (TTE)

  22. Royal Dutch Shell B (RDSB)

  23. Schlumberger Limited (LSD)

  24. Alten (ATE)

  25. Capgemini (CAP)

  26. Atos (ATO)

  27. STMicroelectronics (STM)

  28. ASML Holding (ASML)

  29. ASM International (ASM)

  30. BE Semiconductors (BESI)

  31. Banco Comercial Portugues (BCP)

  32. Mota-Engil (EGL)

  33. Altri (ALTR)

  34. The Navigator Company (NVG)

  35. Semapa (SEM)

  36. Trigano (TRI)

  37. Ipsos (IPS)

  38. Barco (BAR)

  39. TomTom (TOM2)

  40. PostNL (PNL)

  41. TF1 (TFI)

  42. Derichebourg (DBG)

  43. Heijmans (HEIJM)

  44. Koninklijke BAM Groep (BAMNB)

  45. Aegon (AGN)

  46. AXA (CS)

  47. Agaas (AGS)

  48. Bouygues (EN)

  49. VINCI (DG)

  50. Eiffage (FGR)

  51. Faurecia (EO)

  52. Valeo (FR)

  53. Michelin (ML)

  54. Ackermans & Van Haaren (ACKB)

  55. Royal Boskalis Westminster (BOKA)

  56. Imerys (NK)

  57. Solvay (SOLB)

  58. Umicore (UMI)

  59. AkzoNobel (AKZA)

  60. Air Liquide (AI)

  61. Koninklijke Philips (PHIA)

  62. Aperam (APAM)

  63. Eramet (ERA)

  64. Norsk Hydro (NHY)

References

[1] Abdous, B., Genest, C., & Rémillard, B. (2005). Dependence properties of meta-elliptical distributions. In Statistical modeling and analysis for complex data problems (pp. 1–15). Boston, MA: Springer US. 10.1007/0-387-24555-3_1

[2] Ang, A., & Bekaert, G. (2002). International asset allocation with regime shifts. Review of Financial Studies, 15, 1137–1187. 10.1093/rfs/15.4.1137

[3] Ascorbebeitia, J., Ferreira, E., & Orbe, S. (2022). Testing conditional multivariate rank correlations: the effect of institutional quality on factors influencing competitiveness. TEST, 31, 931–949. 10.1007/s11749-022-00806-1

[4] Barber, R. F., & Kolar, M. (2018). Rocket: Robust confidence intervals via Kendall's tau for transelliptical graphical models. The Annals of Statistics, 46(6B), 3422–3450. 10.1214/17-AOS1663

[5] Bickel, P., & Levina, E. (2008). Covariance regularization by thresholding. The Annals of Statistics, 36(6), 2577–2604. 10.1214/08-AOS600

[6] Cadima, J., Calheiros, F. L., & Preto, I. P. (2010). The eigenstructure of block-structured correlation matrices and its implications for principal component analysis. Journal of Applied Statistics, 37(4), 577–589. 10.1080/02664760902803263

[7] Delft High Performance Computing Centre (DHPC) (2022). DelftBlue Supercomputer (Phase 1). https://www.tudelft.nl/dhpc/ark:/44463/DelftBluePhase1.

[8] Derumigny, A. (2023). CondCopulas: Estimation and Inference for Conditional Copulas Models. R package version 0.1.3. https://cran.r-project.org/package=CondCopulas. 10.32614/CRAN.package.CondCopulas

[9] Derumigny, A., & Fermanian, J.-D. (2019). A classification point-of-view about conditional Kendall's tau. Computational Statistics & Data Analysis, 135, 70–94. 10.1016/j.csda.2019.01.013

[10] Derumigny, A., & Fermanian, J.-D. (2019). On kernel-based estimation of conditional Kendall's tau: finite-distance bounds and asymptotic behavior. Dependence Modeling, 7(1), 292–321. 10.1515/demo-2019-0016

[11] Derumigny, A., & Fermanian, J.-D. (2020). On Kendall's regression. Journal of Multivariate Analysis, 178, 104610. 10.1016/j.jmva.2020.104610

[12] Derumigny, A., & Fermanian, J.-D. (2022). Identifiability and estimation of meta-elliptical copula generators. Journal of Multivariate Analysis, 190, 104962. 10.1016/j.jmva.2022.104962

[13] Derumigny, A., & Fermanian, J.-D. (2023). ElliptCopulas: Inference of Elliptical Copulas and Elliptical Distributions. R package version 0.1.3. https://cran.r-project.org/package=ElliptCopulas. 10.32614/CRAN.package.ElliptCopulas

[14] Erb, C., Harvey, C., & Viskanta, T. (1994). Forecasting international equity correlations. Financial Analysts Journal, 50, 32–45. 10.2469/faj.v50.n6.32

[15] Fan, J., Fan, Y., & Lv, J. (2008). High dimensional covariance matrix estimation using a factor model. Journal of Econometrics, 147(1), 186–197. 10.1016/j.jeconom.2008.09.017

[16] Fan, J., Liao, Y., & Wang, W. (2014). Projected principal component analysis in factor models. SSRN Electronic Journal, 44, 219–254. 10.2139/ssrn.2450770

[17] Fang, H.-B., Fang, K.-T., & Kotz, S. (2002). The meta-elliptical distributions with given marginals. Journal of Multivariate Analysis, 82(1), 1–16. 10.1006/jmva.2001.2017

[18] Genest, C., Favre, A.-C., Béliveau, J., & Jacques, C. (2007). Metaelliptical copulas and their use in frequency analysis of multivariate hydrological data. Water Resources Research, 43(9), W09401, 1–12. 10.1029/2006WR005275

[19] Genest, C., Nešlehová, J., & Ghorbal, N. (2011). Estimators based on Kendall's tau in multivariate copula models. Australian & New Zealand Journal of Statistics, 53, 157–177. 10.1111/j.1467-842X.2011.00622.x

[20] Gijbels, I., Veraverbeke, N., & Omelka, M. (2011). Conditional copulas, association measures and their applications. Computational Statistics & Data Analysis, 55, 1919–1932. 10.1016/j.csda.2010.11.010

[21] Gray, H., Leday, G. G., Vallejos, C. A., & Richardson, S. (2018). Shrinkage estimation of large covariance matrices using multiple shrinkage targets. ArXiv: arXiv:1809.08024.

[22] Hahsler, M., Buchta, C., & Hornik, K. (2022). seriation: Infrastructure for Ordering Objects Using Seriation. R package version 1.3.2. https://cran.r-project.org/package=seriation.

[23] Hoeffding, W. (1948). A non-parametric test of independence. The Annals of Mathematical Statistics, 19(4), 546–557. 10.1214/aoms/1177730150

[24] Kurowicka, D., & Cooke, R. M. (2006). Uncertainty analysis with high dimensional dependence modelling. England: John Wiley & Sons. 10.1002/0470863072

[25] Liebscher, E. (2005). A semiparametric density estimator based on elliptical distributions. Journal of Multivariate Analysis, 92(1), 205–225. 10.1016/j.jmva.2003.09.007

[26] Liu, H., Han, F., & Zhang, C.-H. (2012). Transelliptical graphical models. In Advances in Neural Information Processing Systems (vol. 25, pp. 800–808).

[27] Longin, F., & Solnik, B. (2001). Extreme value correlation of international equity markets. The Journal of Finance, 56, 649–676. 10.1111/0022-1082.00340

[28] Lu, J., Kolar, M., & Liu, H. (2018). Post-regularization inference for time-varying nonparanormal graphical models. Journal of Machine Learning Research, 18, 1–78.

[29] McNeil, A., Frey, R., & Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques, and Tools (vol. 101). New Jersey, US: Princeton University Press.

[30] McNeil, A. J., Nešlehová, J. G., & Smith, A. D. (2022). On attainability of Kendall's tau matrices and concordance signatures. Journal of Multivariate Analysis, 191, 105033. 10.1016/j.jmva.2022.105033

[31] Nagler, T. (2023). wdm: Weighted Dependence Measures. R package version 0.2.4.

[32] Nelsen, R. B. (2007). An Introduction to Copulas. New York, NY, USA: Springer Science & Business Media.

[33] Patton, A. (2006). Modeling asymmetric exchange rate dependence. International Economic Review, 47, 527–556. 10.1111/j.1468-2354.2006.00387.x

[34] Perreault, S. (2020). Structures de corrélation partiellement échangeables: inférence et apprentissage automatique [Partially exchangeable correlation structures: inference and machine learning]. PhD thesis, Université Laval.

[35] Perreault, S., Duchesne, T., & Nešlehová, J. (2019). Detection of block-exchangeable structure in large-scale correlation matrices. Journal of Multivariate Analysis, 169, 400–422. 10.1016/j.jmva.2018.10.009

[36] Perreault, S., Nešlehová, J. G., & Duchesne, T. (2022). Hypothesis tests for structured rank correlation matrices. Journal of the American Statistical Association, 118(544), 2889–2900. 10.1080/01621459.2022.2096619

[37] Pimenova, I. (2012). Semi-parametric Estimation of Elliptical Distribution in Case of High Dimensionality. Master's thesis, Humboldt-Universität zu Berlin, Wirtschaftswissenschaftliche Fakultät.

[38] R Core Team. (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

[39] Rothman, A., Levina, E., & Zhu, J. (2009). Generalized thresholding of large covariance matrices. Journal of the American Statistical Association, 104(485), 177–186. 10.1198/jasa.2009.0101

[40] Ryan, J. A., & Ulrich, J. M. (2024). quantmod: Quantitative Financial Modelling Framework. R package version 0.4.26.

[41] Ryan, V., & Derumigny, A. (2024). On the choice of the two tuning parameters for nonparametric estimation of an elliptical distribution generator. ArXiv: arXiv:2408.17087.

[42] Sadefo Kamdem, J. (2005). Value-at-risk and expected shortfall for linear portfolios with elliptically distributed risk factors. International Journal of Theoretical and Applied Finance, 8, 537–551. 10.1142/S0219024905003104

[43] Serfling, R. J. (2009). Approximation Theorems of Mathematical Statistics (vol. 162). New York, NY, USA: John Wiley & Sons.

[44] Tsybakov, A. (2003). Introduction à l'estimation non paramétrique [Introduction to nonparametric estimation] (vol. 41). Berlin Heidelberg, Germany: Springer Science & Business Media.

[45] Veraverbeke, N., Omelka, M., & Gijbels, I. (2011). Estimation of a conditional copula and association measures. Scandinavian Journal of Statistics, 38, 766–780. 10.1111/j.1467-9469.2011.00744.x

Received: 2023-09-29
Revised: 2024-12-27
Accepted: 2025-02-04
Published Online: 2025-04-04

© 2025 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
