
Improved algorithms for computing worst Value-at-Risk

Marius Hofert, Amir Memartoluie, David Saunders and Tony Wirjanto
Published/Copyright: February 3, 2017

Abstract

Numerical challenges inherent in algorithms for computing worst Value-at-Risk in homogeneous portfolios are identified and solutions as well as words of warning concerning their implementation are provided. Furthermore, both conceptual and computational improvements to the Rearrangement Algorithm for approximating worst Value-at-Risk for portfolios with arbitrary marginal loss distributions are given. In particular, a novel Adaptive Rearrangement Algorithm is introduced and investigated. These algorithms are implemented using the R package qrmtools and may be of interest in any context in which it is required to find columnwise permutations of a matrix such that the minimal (maximal) row sum is maximized (minimized).

MSC 2010: 65C60; 62P05

1 Introduction

An integral part of Quantitative Risk Management is to analyze the one-period ahead vector of losses $\boldsymbol{L} = (L_1, \dots, L_d)$, where $L_j$ represents the loss (a random variable) associated with a given business line or risk type with counterparty $j$, $j \in \{1, \dots, d\}$, over a fixed time horizon. For financial institutions, the aggregated loss

$L^+ = \sum_{j=1}^{d} L_j$

is of particular interest. A risk measure $\rho(\cdot)$ is used to map the aggregate position $L^+$ to $\rho(L^+)$, the amount of capital required to account for future losses over a predetermined time period. As a risk measure, Value-at-Risk ($\mathrm{VaR}_\alpha$) has been widely adopted by the financial industry since the mid-nineties. It is defined as the $\alpha$-quantile of the distribution function $F_{L^+}$ of $L^+$, i.e.,

$\mathrm{VaR}_\alpha(L^+) = F_{L^+}^-(\alpha) = \inf\{x \in \mathbb{R} : F_{L^+}(x) \ge \alpha\}, \quad \alpha \in (0,1),$

where $F_{L^+}^-$ denotes the quantile function of $F_{L^+}$; see [11] for more details. There are various methods for estimating the marginal loss distributions $F_1, \dots, F_d$ of $L_1, \dots, L_d$, respectively, but capturing the $d$-variate dependence structure (i.e., the underlying copula $C$) of $\boldsymbol{L}$ is often more difficult. This is due to the fact that typically not much is known about $C$ and estimation is often not feasible (e.g., for rare-event losses occurring in different geographic regions). However, partial dependence information is often available (e.g., through knowledge of the variance of the sum). This case (and thus a possibly smaller dependence uncertainty spread $\overline{\mathrm{VaR}}_\alpha(L^+) - \underline{\mathrm{VaR}}_\alpha(L^+)$) is studied by [5, 6, 7, 8, 21], where the RA has also been shown to be a useful tool.

In this work we focus on the case where $C$ is unknown and study the problem of computing the $\mathrm{VaR}_\alpha(L^+)$ bounds $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$, where

$\underline{\mathrm{VaR}}_\alpha(L^+) = \inf\{\mathrm{VaR}_\alpha(L^+) : L_1 \sim F_1, \dots, L_d \sim F_d\},$
$\overline{\mathrm{VaR}}_\alpha(L^+) = \sup\{\mathrm{VaR}_\alpha(L^+) : L_1 \sim F_1, \dots, L_d \sim F_d\},$

i.e., $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$ denote the best and the worst $\mathrm{VaR}_\alpha(L^+)$ over all distributions of $\boldsymbol{L}$ with marginals $F_1, \dots, F_d$, respectively. This problem is non-trivial as $\mathrm{VaR}_\alpha$ can be superadditive, so, in particular, computing $\mathrm{VaR}_\alpha(L^+)$ in the comonotonic case (of all losses being equal in distribution to $F_1^-(U), \dots, F_d^-(U)$ for $U \sim \mathrm{U}[0,1]$) may not lead to $\overline{\mathrm{VaR}}_\alpha(L^+)$.
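To make the comonotonic reference point concrete, the following R sketch computes $\mathrm{VaR}_\alpha(L^+)$ under comonotonicity, where VaR is additive in the marginal quantiles; the two margins below are hypothetical choices of ours, purely for illustration.

```r
## Comonotonic VaR_alpha(L^+): sum of the marginal alpha-quantiles
## (quantile additivity under comonotonicity); margins are hypothetical.
alpha <- 0.99
qF <- list(function(p) qexp(p, rate = 1/10),   # Exp(1/10) margin
           function(p) qlnorm(p, meanlog = 0)) # LN(0,1) margin
sum(sapply(qF, function(qf) qf(alpha))) # comonotonic VaR; not the worst VaR in general
```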

The dependence uncertainty interval $[\underline{\mathrm{VaR}}_\alpha(L^+), \overline{\mathrm{VaR}}_\alpha(L^+)]$ can be wide, see, e.g., [14], but financial firms are interested in computing it (often for large $d$) to determine their risk capital for $L^+$ within this range. As we show in this work, even the computations for small values of $d$ (and otherwise moderate parameter choices) require care.

We investigate analytical solutions in the homogeneous case (i.e., $F_1 = \dots = F_d$); see [14]. In the general, inhomogeneous case (i.e., not all marginals $F_j$ necessarily being equal), we consider and improve the Rearrangement Algorithm (RA) of [13] for computing $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$; we mostly focus on $\overline{\mathrm{VaR}}_\alpha(L^+)$. All presented algorithms have been implemented in the R package qrmtools; see also the accompanying vignette VaR_bounds, which provides further results, numerical investigations, diagnostic checks and an application. The results presented in this paper can be reproduced with the package and vignette (and, obviously, other parameter values can be chosen).

Another direction to improve the RA is by considering so-called block rearrangements (thus leading to the so-called block RA (BRA)), where blocks of columns (instead of single columns) are rearranged at a time. This idea was introduced in [3] and further applied in [5]. It is related to the notion of Σ-countermonotonicity studied in [22]. Whenever a block of columns can be identified with the property that its row sums are not oppositely ordered with respect to the row sums across the remaining columns, the outcome of the RA can be improved. In general, however, there are many blocks to consider and it is not clear whether in practice this approach always leads to better results (checking all blocks is not feasible for moderate or large dimensions of the underlying matrix). Bernard and McLeish [2] address the issue of non-convergence of the BRA to a global optimum by applying Markov Chain Monte Carlo methods.

The remaining parts of this paper are organized as follows. In Section 2 we highlight and solve numerical challenges that practitioners may face when implementing theoretical solutions for $\overline{\mathrm{VaR}}_\alpha(L^+)$ in the homogeneous case $F_1 = \dots = F_d$. Section 3 presents the main concept underlying the Rearrangement Algorithm for computing $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$, improves the choices of its tuning parameters and investigates its empirical performance under various scenarios. Section 4 then presents a conceptually and numerically improved version of the RA, which we call the Adaptive Rearrangement Algorithm (ARA), for calculating $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$. Section 5 concludes.

2 Known optimal solutions in the homogeneous case and their tractability

For $d = 2$, Embrechts, Puccetti and Rüschendorf [13, Proposition 2] provide an analytical solution for computing $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$ for distribution functions concentrated on $[0,\infty)$ with ultimately decreasing densities (i.e., densities which are decreasing beyond a certain point); the dependence underlying $\overline{\mathrm{VaR}}_\alpha(L^+)$ above the confidence level $\alpha \in (0,1)$ is simply countermonotonicity, see [13, Remark 3]. In particular, a sum which is minimal with respect to the convex ordering is obtained under countermonotonicity and this is sufficient to get sharp $\mathrm{VaR}_\alpha$ bounds; see [4, Theorem 2.5] for more details. The latter result also holds for $d \ge 3$ and implies that maximization of $\mathrm{VaR}_\alpha$ can be achieved by constructing a sum that is minimal with respect to the convex ordering above the $\alpha$-quantile. The RA provides a practical way of achieving such a sum. In order to assess its quality from a numerical point of view, we need to know (at least some) optimal solutions to which we can compare the RA. The paper [12] presents a formula for obtaining $\overline{\mathrm{VaR}}_\alpha(L^+)$ in the homogeneous case; see also [13, Proposition 4] or [14, Proposition 1]. In this section, we address the corresponding numerical aspects and algorithmic improvements. We assume $d \ge 3$ throughout.

2.1 Crude bounds for any $\mathrm{VaR}_\alpha(L^+)$

The following lemma is a consequence of [18, Theorem 2.7] and provides (crude) bounds for $\mathrm{VaR}_\alpha(L^+)$ which are useful for computing initial intervals (see Section 2.2) and for conducting sanity checks. Note that this lemma does not involve any assumptions on the marginal loss distributions.

Lemma 2.1 (Crude bounds for $\mathrm{VaR}_\alpha(L^+)$).

Let $L_j \sim F_j$, $j \in \{1, \dots, d\}$. For any $\alpha \in (0,1)$,

(1) $d \min_{1 \le j \le d} F_j^-\Bigl(\frac{\alpha}{d}\Bigr) \le \mathrm{VaR}_\alpha(L^+) \le d \max_{1 \le j \le d} F_j^-\Bigl(1 - \frac{1-\alpha}{d}\Bigr),$

where $F_j^-$ denotes the quantile function of $F_j$.

The bounds (1) can be computed with the function crude_VaR_bounds() in the R package qrmtools.
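For illustration, a call could look as follows; the argument order and names are assumptions based on the package documentation at the time of writing and should be checked against ?crude_VaR_bounds.

```r
library(qrmtools)
## Hedged usage sketch (check ?crude_VaR_bounds): crude VaR_0.99 bounds for
## d = 8 equal Par(2) margins with quantile function F^-(p) = (1-p)^(-1/2) - 1
qF.Par2 <- function(p) (1 - p)^(-1/2) - 1
crude_VaR_bounds(0.99, qF = qF.Par2, d = 8)
```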

2.2 The dual bound approach for computing $\overline{\mathrm{VaR}}_\alpha(L^+)$

This approach, termed the dual bound approach, for computing $\overline{\mathrm{VaR}}_\alpha(L^+)$ in the homogeneous case with margin(s) $F$ is presented in [12]; note that there is no corresponding algorithm for computing $\underline{\mathrm{VaR}}_\alpha(L^+)$ with this approach. In the remaining part of this subsection we assume that $F(0) = 0$, $F(x) < 1$ for all $x \in [0,\infty)$ and that $F$ is absolutely continuous with an ultimately decreasing density. Let

(2) $D(s,t) = \frac{d}{s - dt} \int_t^{s-(d-1)t} \bar{F}(x)\,dx \quad\text{and}\quad D(s) = \min_{t \in [0, s/d]} D(s,t),$

where $\bar{F}$ denotes the survival function of $F$, i.e., $\bar{F}(x) = 1 - F(x)$; $D(s)$ is known as the dual bound. Note that by l'Hôpital's rule,

$\lim_{t \uparrow s/d} D(s,t) = d\,\bar{F}\Bigl(\frac{s}{d}\Bigr)$

(concerning the required differentiability assumptions, the numerator $t \mapsto \int_t^{s-(d-1)t} \bar{F}(x)\,dx$ is differentiable by absolute continuity of $F$ and the denominator $t \mapsto s - dt$ has derivative $-d \ne 0$).
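For illustration, the dual bound in (2) can be evaluated by combining one-dimensional numerical integration with one-dimensional minimization. The following hand-rolled R sketch does this for Par(2) margins; it is meant to illustrate the definition only and is not the package's (tested) dual_bound().

```r
## Hand-rolled sketch of (2); for illustration only.
D.st <- function(s, t, d, pF) { # D(s, t) with survival function 1 - pF
  Fbar <- function(x) 1 - pF(x)
  d / (s - d * t) * integrate(Fbar, lower = t, upper = s - (d - 1) * t)$value
}
D.s <- function(s, d, pF) # D(s) = minimum over t in [0, s/d]
  optimize(function(t) D.st(s, t, d, pF), interval = c(0, s / d))$objective

## Example with Par(2) margins, i.e., F(x) = 1 - (1 + x)^{-2}
pPar2 <- function(x) 1 - (1 + x)^(-2)
D.s(100, d = 8, pF = pPar2)
```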

Numerically, the procedure for computing $\overline{\mathrm{VaR}}_\alpha(L^+)$ can now be given as follows.

Algorithm 2.2 (Computing $\overline{\mathrm{VaR}}_\alpha(L^+)$ according to the dual bound approach).

Proceed as follows.

  1. Specify initial intervals $[s_l, s_u]$ and $[t_l, t_u]$.

  2. Inner root-finding in $t$: For each considered $s \in [s_l, s_u]$, compute $D(s)$ by finding a $t^* \in [t_l, t_u]$ such that $h(s, t^*) = 0$, where

    $h(s,t) := D(s,t) - \bigl(\bar{F}(t) + (d-1)\bar{F}(s - (d-1)t)\bigr).$

    Then $D(s) = \bar{F}(t^*) + (d-1)\bar{F}(s - (d-1)t^*)$.

  3. Outer root-finding in $s$: Repeat Step (2) for $s \in [s_l, s_u]$ until an $s^*$ is found for which $D(s^*) = 1 - \alpha$. Then return $s^* = \overline{\mathrm{VaR}}_\alpha(L^+)$.

Algorithm 2.2 is implemented in the function VaR_bounds_hom(..., method="dual") in the R package qrmtools; the dual bound $D$ is available via dual_bound(). It requires a one-dimensional numerical integration (unless $\bar{F}$ can be integrated analytically) within two nested calls of a root-finding algorithm (uniroot() in R). Note that Algorithm 2.2 requires the specification of the two initial intervals $[s_l, s_u]$ and $[t_l, t_u]$.

First consider $[t_l, t_u]$. Under the assumptions on $F$, one can choose $t_l = 0$. For $t_u$ one would like to choose $s/d$; see the definition of $D(s)$ in (2). However, care has to be exercised as $h(s, s/d) = 0$ for any $s$, so the inner root-finding procedure would immediately stop upon finding the root at $t_u = s/d$. To address this issue, the inner root-finding algorithm in VaR_bounds_hom(..., method="dual") fixes f.upper, i.e., the value $h(s, t_u)$ for the considered $s$, to $-h(s, 0)$ so that a root below $t_u = s/d$ can be detected; note that this is an (inelegant, for lack of a better method) adjustment of the function value, not of the root-finding interval $[t_l, t_u]$.

Now consider $[s_l, s_u]$, in particular $s_l$. If it is chosen too small, the inner root-finding procedure in Step (2) of Algorithm 2.2 will not be able to locate a root. A natural candidate is $\mathrm{VaR}_\alpha(L^+)$ under comonotonicity, so $s_l = d\,\mathrm{VaR}_\alpha(L)$ for $L \sim F$. Given $s_l$, one can then choose $s_u$ as the maximum of $s_l + 1$ and, for example, the upper bound on $\mathrm{VaR}_\alpha(L^+)$ given in (1); another option for $s_u$ is the worst expected shortfall (see Section 4.3) if it exists, so $d\,\mathrm{ES}_\alpha(L)$.
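Combining these choices, a call of the dual-bound method for Par(2) margins might look as follows; the argument names (in particular interval and pF) are assumptions based on the package documentation and should be verified with ?VaR_bounds_hom.

```r
library(qrmtools)
alpha <- 0.99; d <- 8
qF <- function(p) (1 - p)^(-1/2) - 1 # Par(2) quantile function
pF <- function(q) 1 - (1 + q)^(-2)   # Par(2) distribution function
## s_l: comonotonic VaR; s_u: crude upper bound from (1) (both as in the text)
s.l <- d * qF(alpha)
s.u <- max(s.l + 1, d * qF(1 - (1 - alpha) / d))
## Hedged call: argument names are assumptions, check ?VaR_bounds_hom
VaR_bounds_hom(alpha, d = d, method = "dual", interval = c(s.l, s.u), pF = pF)
```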

On the theoretical side, the following proposition implies that if $\bar{F}$ is strictly convex, so is $D(s, \cdot)$ for fixed $s$. This shows the uniqueness of the minimum when computing $D(s)$ as in (2). A standard result on convexity of objective value functions then implies that $D(s)$ itself is also convex (and therefore continuous on the interior of its domain); see [23, Proposition 2.22].

Proposition 2.3 (Properties of $D(s,t)$ and $D(s)$).

The following statements hold.

  1. $s \mapsto D(s)$ is decreasing. If $\bar{F}$ is strictly decreasing, so is $D$.

  2. If $\bar{F}$ is convex, so is $D(s,t)$.

Proof.

(1) Let $s \le s'$ and $t \in [0, s/d]$ be such that $D(s,t) = D(s)$. Define

$t' = \frac{s' - (s - td)}{d} = \frac{s' - s}{d} + t,$

so that $0 \le t \le t' \le s'/d$. Let $\kappa = s - td = s' - t'd$. If $\kappa > 0$, noting that $\bar{F}$ is decreasing and that $t \le t'$, we obtain

$D(s,t) - D(s',t') = \frac{d}{\kappa}\Bigl(\int_t^{t+\kappa} \bar{F}(x)\,dx - \int_{t'}^{t'+\kappa} \bar{F}(x)\,dx\Bigr) \ge 0,$

so that $D(s') \le D(s',t') \le D(s,t) = D(s)$. If $\kappa = 0$, then

$D(s') \le D(s', s'/d) = d\,\bar{F}(s'/d) \le d\,\bar{F}(s/d) = D(s, s/d) = D(s).$

When $\bar{F}$ is strictly decreasing, the inequalities are strict.

(2) Recall that

$D(s,t) = \frac{d}{s - td} \int_t^{t + (s - td)} \bar{F}(x)\,dx.$

Using the transformation $z = (x - t)/(s - td)$, we have

$D(s,t) = d \int_0^1 \bar{F}\bigl(sz + t(1 - zd)\bigr)\,dz.$

Define $C = \{(s,t) : 0 \le s < \infty,\ 0 \le t \le s/d\}$, and note that $C$ is convex. Furthermore, if $\bar{F}$ is convex, then $D(s,t)$ is jointly convex in $s$ and $t$ on $C$ since, for $\lambda \in (0,1)$,

$D(\lambda s_1 + (1-\lambda)s_2, \lambda t_1 + (1-\lambda)t_2) = d \int_0^1 \bar{F}\bigl((\lambda s_1 + (1-\lambda)s_2)z + (\lambda t_1 + (1-\lambda)t_2)(1 - zd)\bigr)\,dz$
$= d \int_0^1 \bar{F}\bigl(\lambda(s_1 z + t_1(1 - zd)) + (1-\lambda)(s_2 z + t_2(1 - zd))\bigr)\,dz$
$\le d \int_0^1 \bigl[\lambda \bar{F}(s_1 z + t_1(1 - zd)) + (1-\lambda)\bar{F}(s_2 z + t_2(1 - zd))\bigr]\,dz$
$= \lambda D(s_1, t_1) + (1-\lambda) D(s_2, t_2). \qquad \square$

2.3 Wang's approach for computing $\overline{\mathrm{VaR}}_\alpha(L^+)$

Theory

The approach mentioned in [14, Proposition 1] is termed Wang's approach here. It originates from [25, Corollary 3.7]. For notational simplicity, let us introduce

$a_c = \alpha + (d-1)c, \qquad b_c = 1 - c,$

for $c \in [0, (1-\alpha)/d]$ (so that $a_c \in [\alpha, 1 - (1-\alpha)/d]$ and $b_c \in [1 - (1-\alpha)/d, 1]$) and

$\bar{I}(c) := \frac{1}{b_c - a_c} \int_{a_c}^{b_c} F^-(y)\,dy, \quad c \in \Bigl(0, \frac{1-\alpha}{d}\Bigr],$

with the assumption that $F$ admits a density which is positive and decreasing on $[\beta, \infty)$ for some $\beta \le F^-(\alpha)$. Then, for $L \sim F$,

(3) $\overline{\mathrm{VaR}}_\alpha(L^+) = d\,\mathbb{E}\bigl[L \mid L \in [F^-(a_c), F^-(b_c)]\bigr], \quad \alpha \in [F(\beta), 1),$

where $c$ (typically depending on $d$, $\alpha$) is the smallest number in $(0, (1-\alpha)/d]$ such that

(4) $\bar{I}(c) \ge \frac{d-1}{d} F^-(a_c) + \frac{1}{d} F^-(b_c).$

In contrast to what is given in [14], note that the interval $(0, (1-\alpha)/d]$ has to exclude $0$ since otherwise, for $\mathrm{Par}(\theta)$ margins with $\theta \in (0,1]$, $c$ equals $0$ and thus, erroneously, $\overline{\mathrm{VaR}}_\alpha(L^+) = \infty$. If $F$ is the distribution function of the $\mathrm{Par}(\theta)$ distribution, then $\bar{I}$ is given by

(5) $\bar{I}(c) = \begin{cases} \dfrac{1}{b_c - a_c} \dfrac{\theta}{1-\theta} \bigl((1-b_c)^{1-1/\theta} - (1-a_c)^{1-1/\theta}\bigr) - 1, & \text{if } \theta \ne 1,\\[2mm] \dfrac{1}{b_c - a_c} \log\Bigl(\dfrac{1-a_c}{1-b_c}\Bigr) - 1, & \text{if } \theta = 1. \end{cases}$
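The following R sketch is a direct transcription of (5); the function name is ours.

```r
## Sketch of (5): closed form of Ibar(c) for Par(theta) margins
Ibar.Par <- function(c, alpha, d, theta) {
  a <- alpha + (d - 1) * c # a_c
  b <- 1 - c               # b_c
  if (theta == 1) log((1 - a) / (1 - b)) / (b - a) - 1
  else (theta / (1 - theta)) *
       ((1 - b)^(1 - 1/theta) - (1 - a)^(1 - 1/theta)) / (b - a) - 1
}
Ibar.Par(0.0005, alpha = 0.99, d = 8, theta = 2) # example evaluation
```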

The conditional distribution function $F_{L \mid L \in [F^-(a_c), F^-(b_c)]}$ of $L \mid L \in [F^-(a_c), F^-(b_c)]$ is given by

$F_{L \mid L \in [F^-(a_c), F^-(b_c)]}(x) = \frac{F(x) - a_c}{b_c - a_c}, \quad x \in [F^-(a_c), F^-(b_c)].$

Using this fact and by substitution, we obtain that, for $\alpha \in [F(\beta), 1)$, (3) becomes

(6) $\overline{\mathrm{VaR}}_\alpha(L^+) = d \int_{F^-(a_c)}^{F^-(b_c)} x\,dF_{L \mid L \in [F^-(a_c), F^-(b_c)]}(x) = \frac{d \int_{F^-(a_c)}^{F^-(b_c)} x\,dF(x)}{b_c - a_c} = d\,\bar{I}(c).$

Equation (6) has the advantage that the integration in $\bar{I}(c)$ is over an (at least theoretically) compact interval. Furthermore, finding the smallest $c$ such that (4) holds also involves $\bar{I}(c)$. We thus only need to know the quantile function $F^-$ in order to compute $\overline{\mathrm{VaR}}_\alpha(L^+)$. This leads to the following algorithm.

Algorithm 2.4 (Computing $\overline{\mathrm{VaR}}_\alpha(L^+)$ according to Wang's approach).

Proceed as follows.

  1. Specify an initial interval $[c_l, c_u]$ with $0 \le c_l < c_u < (1-\alpha)/d$.

  2. Root-finding in $c$: Find a $c^* \in [c_l, c_u]$ such that $h(c^*) = 0$, where

    $h(c) := \bar{I}(c) - \Bigl(\frac{d-1}{d} F^-(a_c) + \frac{1}{d} F^-(b_c)\Bigr), \quad c \in \Bigl(0, \frac{1-\alpha}{d}\Bigr].$

    Then return $(d-1)F^-(a_{c^*}) + F^-(b_{c^*})$.

This procedure is implemented in the function VaR_bounds_hom(..., method="Wang") in the R package qrmtools with numerical integration via R's integrate() for computing the integral $\bar{I}(c)$; the function VaR_bounds_hom(..., method="Wang.Par") makes use of (5).
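A usage sketch for the Pareto-specific method could look as follows; that the Pareto parameter is passed as shape is an assumption based on the package documentation (check ?VaR_bounds_hom).

```r
library(qrmtools)
## Hedged usage sketch: worst VaR_0.99 for d = 8 Par(2) margins; "shape" as
## the name of the Pareto parameter is an assumption, see ?VaR_bounds_hom
VaR_bounds_hom(0.99, d = 8, method = "Wang.Par", shape = 2)
```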

The following proposition shows that the root-finding problem in Step (2) of Algorithm 2.4 is well defined in the case of Pareto margins for all $\theta > 0$ (including the infinite-mean case); for other distributions, under more restrictive assumptions, see [1].

Proposition 2.5.

Let $F(x) = 1 - (1+x)^{-\theta}$, $\theta > 0$, be the distribution function of the $\mathrm{Par}(\theta)$ distribution. Then $h$ in Step (2) of Algorithm 2.4 has a unique root on $(0, (1-\alpha)/d)$ for all $\alpha \in (0,1)$ and $d > 2$.

Proof.

First consider $\theta \ne 1$. Using (5), one can rewrite $h(c)$ as

$h(c) = c^{-1/\theta+1}\,\frac{\theta}{1-\theta}\,\frac{1 - \bigl(\frac{1-\alpha}{c} - (d-1)\bigr)^{-1/\theta+1}}{1 - \alpha - dc} - \frac{(d-1)\bigl(\frac{1-\alpha}{c} - (d-1)\bigr)^{-1/\theta} + 1}{c^{1/\theta}\,d}.$

Multiplying by $c^{1/\theta}d$ and rewriting the expression, one sees that $h(c) = 0$ is equivalent to $h_1(x_c) = 0$, where

(7) $x_c = \frac{1-\alpha}{c} - (d-1)$

(which is in $(1,\infty)$ for $c \in (0, (1-\alpha)/d)$) and $h_1(x) = d\,\frac{\theta}{1-\theta}\,\frac{1 - x^{-1/\theta+1}}{x-1} - \bigl((d-1)x^{-1/\theta} + 1\bigr)$. It is easy to see that $h_1(x) = 0$ if and only if $h_2(x) = 0$, where

(8) $h_2(x) = \Bigl(\frac{d}{1-\theta} - 1\Bigr)x^{-1/\theta+1} - (d-1)x^{-1/\theta} + x - \Bigl(\frac{d\theta}{1-\theta} + 1\Bigr), \quad x \in (1,\infty).$

We are done for $\theta \ne 1$ if we show that $h_2$ has a unique root on $(1,\infty)$. To this end, note that $h_2(1) = 0$ and $\lim_{x \to \infty} h_2(x) = \infty$. Furthermore,

$h_2'(x) = \Bigl(1 - \frac{1}{\theta}\Bigr)\Bigl(\frac{d}{1-\theta} - 1\Bigr)x^{-1/\theta} + \frac{d-1}{\theta}x^{-1/\theta-1} + 1,$
$h_2''(x) = \frac{d+\theta-1}{\theta^2}x^{-1/\theta-1} - \Bigl(\frac{1}{\theta} + 1\Bigr)\frac{d-1}{\theta}x^{-1/\theta-2}.$

It is not difficult to check that $h_2''(x) = 0$ if and only if $x = \frac{(d-1)(1+\theta)}{d+\theta-1}$ (which is greater than $1$ for $d > 2$). Hence, $h_2$ can have at most one root. We are done if we find an $x_0 \in (1,\infty)$ such that $h_2(x_0) < 0$, but this is guaranteed by the fact that $\lim_{x \downarrow 1} h_2'(x) = 0$ and $\lim_{x \downarrow 1} h_2''(x) = -(d-2)/\theta < 0$ for $d > 2$.

The proof for $\theta = 1$ works similarly; in this case, $h_2$ is given by

(9) $h_2(x) = x^2 + x\bigl(-d\log(x) + d - 2\bigr) - (d-1), \quad x \in (1,\infty),$

and the unique point of inflection of $h_2$ is $x = d/2$. ∎
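The uniqueness statement can also be checked numerically. The following sketch does this for the (hypothetical) choice $d = 8$ and $\theta = 1/2$, for which $h_2(x) = 15/x - 7/x^2 + x - 9$; the factorization $x^3 - 9x^2 + 15x - 7 = (x-1)^2(x-7)$ shows that the unique root on $(1,\infty)$ is $x = 7$.

```r
## h_2 from (8) for theta != 1; vanishes at x = 1 and has one root in (1, Inf)
h2 <- function(x, d, theta)
  (d / (1 - theta) - 1) * x^(1 - 1/theta) - (d - 1) * x^(-1/theta) +
  x - (d * theta / (1 - theta) + 1)
uniroot(h2, interval = c(1.01, 1e4), d = 8, theta = 0.5)$root # approx. 7
```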

Practice

Let us now focus on the case of $\mathrm{Par}(\theta)$ margins (see VaR_bounds_hom(..., method="Wang.Par")) and, in particular, on how to choose the initial interval $[c_l, c_u]$ in Algorithm 2.4. We first consider $c_l$. Note that $\bar{I}$ satisfies $\bar{I}(0) = \frac{1}{1-\alpha}\int_\alpha^1 F^-(y)\,dy = \mathrm{ES}_\alpha(L)$, i.e., $\bar{I}(0)$ is the expected shortfall of $L \sim F$ at confidence level $\alpha$. If $L$ has a finite first moment, then $\bar{I}(0)$ is finite. Therefore, $h(0)$ is finite (if $F^-(1) < \infty$) or $-\infty$ (if $F^-(1) = \infty$). Either way, one can take $c_l = 0$. However, if $L \sim F$ has an infinite first moment (see, e.g., [16] or [9] for situations in which this can happen), then $\bar{I}(0) = \infty$ and $F^-(1) = \infty$, so $h(0)$ is not well defined; this happens, e.g., if $F$ is $\mathrm{Par}(\theta)$ with $\theta \in (0,1]$. In such a case, we are forced to choose $c_l \in (0, (1-\alpha)/d)$; see the following proposition for how this can be done theoretically. Concerning $c_u$, note that l'Hôpital's rule implies that $\bar{I}((1-\alpha)/d) = F^-(1 - (1-\alpha)/d)$ and thus $h((1-\alpha)/d) = 0$. We thus face a similar problem (a root at the upper endpoint of the initial interval) as for the computation of the dual bound. However, here we can construct a suitable $c_u < (1-\alpha)/d$; see the following proposition.

Proposition 2.6 (Computing $c_l$, $c_u$ for $\mathrm{Par}(\theta)$ risks).

Let $F$ be the distribution function of a $\mathrm{Par}(\theta)$ distribution, $\theta > 0$. Then $c_l$ and $c_u$ in Algorithm 2.4 can be chosen as

$c_l = \begin{cases} \dfrac{(1-\theta)(1-\alpha)}{d}, & \text{if } \theta \in (0,1),\\[1mm] \dfrac{1-\alpha}{(d+1)^{e/(e-1)} + d - 1}, & \text{if } \theta = 1,\\[1mm] \dfrac{1-\alpha}{(d/(\theta-1)+1)^\theta + d - 1}, & \text{if } \theta \in (1,\infty), \end{cases} \qquad c_u = \begin{cases} \dfrac{(1-\alpha)(d-1+\theta)}{(d-1)(2\theta+d)}, & \text{if } \theta \ne 1,\\[1mm] \dfrac{1-\alpha}{3d/2 - 1}, & \text{if } \theta = 1. \end{cases}$

Proof.

First consider $c_l$ and $\theta \ne 1$. Instead of $h$, formulas (7) and (8) allow us to study

$h_2(x) = \Bigl(\frac{d}{1-\theta} - 1\Bigr)x^{-1/\theta+1} - (d-1)x^{-1/\theta} + x - \Bigl(\frac{d\theta}{1-\theta} + 1\Bigr), \quad x \in [1,\infty).$

Consider the two cases $\theta \in (0,1)$ and $\theta \in (1,\infty)$ separately. If $\theta \in (0,1)$, then $\frac{d}{1-\theta} - 1 > d - 1 \ge 0$ and $x^{-1/\theta+1} \ge x^{-1/\theta}$ for all $x \ge 1$, so

$h_2(x) \ge \Bigl(\Bigl(\frac{d}{1-\theta} - 1\Bigr) - (d-1)\Bigr)x^{-1/\theta} + x - \Bigl(\frac{d\theta}{1-\theta} + 1\Bigr) \ge x - \Bigl(\frac{d\theta}{1-\theta} + 1\Bigr),$

which is $\ge 0$ if and only if $x \ge \frac{d\theta}{1-\theta} + 1$. Setting the latter expression equal to $x_c$ (defined in (7)) and solving for $c$ leads to the $c_l$ as provided. If $\theta \in (1,\infty)$, then using $x^{-1/\theta} \le 1$ leads to $h_2(x) \ge (\frac{d}{1-\theta} - 1)x^{-1/\theta+1} + x$, which is $\ge 0$ for $x \ge 1$ if and only if $x \ge (d/(\theta-1) + 1)^\theta$. Setting this equal to $x_c$ and solving for $c$ leads to the $c_l$ as required.

Now consider $\theta = 1$. As before, we can consider (7) and (9). By using the facts that $\log x \le x^{1/e}$ and $-x \ge -x^{1+1/e}$ for $x \in [1,\infty)$, we obtain $h_2(x) \ge x^2 + x(-dx^{1/e} + d - 2) - (d-1) \ge x^2 - (d+1)x^{1+1/e}$, which is $\ge 0$ if and only if $x \ge (d+1)^{e/(e-1)}$. Setting this equal to $x_c$ and solving for $c$ leads to the $c_l$ as required.

Lastly, consider $c_u$. It is easy to see that the point of inflection of $h_2$ provides a lower bound $x_{c_u}$ on the root of $h_2$. As derived in the proof of Proposition 2.5, the point of inflection is $x = x_{c_u} := (d-1)(1+\theta)/(d+\theta-1)$ for $\theta \ne 1$ and $x = d/2$ for $\theta = 1$. Solving $x_{c_u} = (1-\alpha)/c - (d-1)$ for $c$ then leads to $c_u$ as stated. ∎
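For convenience, the stated initial interval can be transcribed into R as follows (the function name is ours; recall from Example 2.7 below that the implementation additionally halves $c_l$ for $\theta \ne 1$ to guarantee a numerical sign change).

```r
## Direct transcription of Proposition 2.6: theoretically valid [c_l, c_u]
## for Par(theta) margins
initial.interval.Par <- function(alpha, d, theta) {
  e <- exp(1)
  cl <- if (theta < 1) (1 - theta) * (1 - alpha) / d
        else if (theta == 1) (1 - alpha) / ((d + 1)^(e/(e - 1)) + d - 1)
        else (1 - alpha) / ((d / (theta - 1) + 1)^theta + d - 1)
  cu <- if (theta == 1) (1 - alpha) / (3 * d / 2 - 1)
        else (1 - alpha) * (d - 1 + theta) / ((d - 1) * (2 * theta + d))
  c(cl = cl, cu = cu)
}
initial.interval.Par(0.99, d = 8, theta = 2) # example evaluation
```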

In the following example, we briefly address several numerical hurdles we had to overcome when implementing VaR_bounds_hom(..., method="Wang.Par"); see the vignette VaR_bounds for more details.

Example 2.7 (Comparison of the approaches for $\mathrm{Par}(\theta)$ risks).

For obtaining numerically reliable results, one has to be careful when computing the root of $h$ for $c \in (0, (1-\alpha)/d)$. As an example, consider $\mathrm{Par}(\theta)$ risks and the confidence level $\alpha = 0.99$. Figure 1 compares Wang's approach (using numerical integration; see VaR_bounds_hom(..., method="Wang")), Wang's approach (with an analytical formula for the integral $\bar{I}(c)$ but uniroot()'s default tolerance; see the vignette VaR_bounds), Wang's approach (with an analytical formula for the integral $\bar{I}(c)$ and the auxiliary function $h$ transformed to $(1,\infty)$; see the vignette VaR_bounds), Wang's approach (with an analytical formula for the integral $\bar{I}(c)$, a smaller uniroot() tolerance and an adjusted initial interval; see VaR_bounds_hom(..., method="Wang.Par")), and the lower and upper bounds obtained from the RA (with absolute tolerance $0$; see Section 3 and RA()). All of the results are divided by the values obtained from the dual bound approach to facilitate comparison.

As can be seen, choosing a smaller root-finding tolerance is crucial. Figure 1 shows what can happen if this is not done (our procedure chooses MATLAB's default $2.2204 \cdot 10^{-16}$ instead of the much larger uniroot() default .Machine$double.eps^0.25). Furthermore, it turned out to be necessary to adjust the theoretically valid initial interval described in Proposition 2.6 in order to guarantee that $h$ is numerically of opposite sign at the interval endpoints. In particular, VaR_bounds_hom(..., method="Wang.Par") chooses $c_l/2$ as the lower endpoint (with $c_l$ as in Proposition 2.6) in the case $\theta \ne 1$.

These problems are described in detail in Section 1.4 of the vignette VaR_bounds, where we also show that transforming the auxiliary function $h$ to a root-finding problem on $(1,\infty)$ as described in the proof of Proposition 2.6 not only requires a smaller root-finding tolerance but also an extended initial interval and, furthermore, faces a cancellation problem (which can be solved, though); see also the left-hand side of Figure 3, where we compare this approach to VaR_bounds_hom(..., method="Wang.Par") after fixing these numerical issues.

In short, one should exercise caution when implementing analytical solutions for computing $\underline{\mathrm{VaR}}_\alpha(L^+)$ or $\overline{\mathrm{VaR}}_\alpha(L^+)$ in the homogeneous case with $\mathrm{Par}(\theta)$ (and most likely also other) margins.

Let us again stress how important the initial interval $[c_l, c_u]$ is. One could be tempted to simply choose $c_u = (1-\alpha)/d$ and force the auxiliary function $h$ to be of opposite sign at $c_u$, e.g., by setting $h(c_u)$ to .Machine$double.xmin, a positive but small number. Figure 2 (left-hand side) shows how Figure 1 looks in this case (here it is standardized with respect to the upper bound obtained from the RA). Figure 2 (right-hand side) shows the implied $\overline{\mathrm{VaR}}_\alpha(L^+)$. In particular, the computed $\overline{\mathrm{VaR}}_\alpha(L^+)$ values are no longer monotone in $\alpha$ (and thus not correct).

Figure 1. Comparisons of Wang's approach (using numerical integration; see VaR_bounds_hom(..., method="Wang")), Wang's approach (with an analytical formula for the integral $\bar{I}(c)$ but uniroot()'s default tolerance; see the vignette VaR_bounds), Wang's approach (with an analytical formula for the integral $\bar{I}(c)$ and the auxiliary function $h$ transformed to $(1,\infty)$; see the vignette VaR_bounds), Wang's approach (with an analytical formula for the integral $\bar{I}(c)$, a smaller uniroot() tolerance and an adjusted initial interval; see VaR_bounds_hom(..., method="Wang.Par")), and the lower and upper bounds obtained from the RA; all results are divided by the values obtained from the dual bound approach to facilitate comparison. The left-hand side shows the case $d = 8$, the right-hand side the case $d = 100$.

Figure 2. Comparison of various methods for computing $\overline{\mathrm{VaR}}_{0.99}(L^+)$ (left-hand side) and $\overline{\mathrm{VaR}}_\alpha(L^+)$ as a function of $\alpha$ (right-hand side) for $\mathrm{Par}(\theta)$ risks with $h((1-\alpha)/d)$ naively adjusted to .Machine$double.xmin.

After carefully addressing the numerical issues, we can now consider $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$ from a different perspective. The right-hand side of Figure 3 shows $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$ as functions of the dimension $d$. The linearity of $\overline{\mathrm{VaR}}_\alpha(L^+)$ on the log-log scale suggests that $\overline{\mathrm{VaR}}_\alpha(L^+)$ is actually a power function in $d$. To the best of our knowledge, this is not known (nor theoretically justified) yet.

Figure 3. A comparison of $\overline{\mathrm{VaR}}_\alpha(L^+)$ computed with VaR_bounds_hom(..., method="Wang.Par") (solid line) and with the approach based on transforming the auxiliary function $h$ to a root-finding problem on $(1,\infty)$ (dashed line) as described in the proof of Proposition 2.6 (left-hand side). $\underline{\mathrm{VaR}}_\alpha(L^+)$ (dashed line) and $\overline{\mathrm{VaR}}_\alpha(L^+)$ (solid line) as functions of $d$ on log-log scale (right-hand side).

3 The Rearrangement Algorithm

We now consider the RA for computing $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$ in the inhomogeneous case; as before, we mainly focus on $\overline{\mathrm{VaR}}_\alpha(L^+)$ here. The algorithm was proposed by [19]; see also [13]. The idea underlying the RA and the numerical approximation for calculating $\underline{\mathrm{VaR}}_\alpha(L^+)$ and $\overline{\mathrm{VaR}}_\alpha(L^+)$ dates back to [24]. Note that the RA tries to find solutions to maximin (for $\overline{\mathrm{VaR}}_\alpha(L^+)$) and minimax (for $\underline{\mathrm{VaR}}_\alpha(L^+)$) problems and is thus of interest in a wider context, e.g., in Operations Research; an early reference for this is [10]. In the following subsections we look at how the RA works and analyze its performance using various test cases.

3.1 How the Rearrangement Algorithm works

The RA can be applied to approximate the best Value-at-Risk $\underline{\mathrm{VaR}}_\alpha(L^+)$ or the worst Value-at-Risk $\overline{\mathrm{VaR}}_\alpha(L^+)$ for any set of marginals $F_j$, $j \in \{1, \dots, d\}$. In what follows our focus is mainly on $\overline{\mathrm{VaR}}_\alpha(L^+)$; our implementation RA() in the R package qrmtools also addresses $\underline{\mathrm{VaR}}_\alpha(L^+)$. To understand the algorithm, note that two columns $\boldsymbol{a}, \boldsymbol{b} \in \mathbb{R}^N$ are called oppositely ordered if $(a_i - a_j)(b_i - b_j) \le 0$ for all $i, j \in \{1, \dots, N\}$. Given a number $N$ of discretization points of the marginal quantile functions $F_1^-, \dots, F_d^-$ above $\alpha$ (see Steps 2(a) and 3(a) of Algorithm 3.1 below), the RA constructs two $(N,d)$-matrices, denoted by $\underline{X}^\alpha$ and $\overline{X}^\alpha$, respectively; the first matrix aims at constructing an approximation of $\overline{\mathrm{VaR}}_\alpha(L^+)$ from below, the second matrix is used to construct an approximation of $\overline{\mathrm{VaR}}_\alpha(L^+)$ from above. Separately for each of these matrices, the RA iterates over its columns and rearranges each of them so that it is oppositely ordered to the sum of all other columns. This is repeated until the minimal row sum

$s(X) = \min_{1 \le i \le N} \sum_{1 \le j \le d} x_{ij}$

(for $X = (x_{ij})$ being one of the said $(N,d)$-matrices) changes by less than a given (convergence) tolerance $\varepsilon \ge 0$. The RA for $\overline{\mathrm{VaR}}_\alpha(L^+)$ thus aims at approximating the maximin problem. The algorithm reduces the variance of the conditional distribution of $L^+ \mid L^+ > F_{L^+}^-(\alpha)$ so that more of the $1-\alpha$ mass of $F_{L^+}$ can be concentrated in its tail. This pushes $\mathrm{VaR}_\alpha(L^+)$ further up. As [13] state, one then typically ends up with two matrices whose minimal row sums are close to each other and roughly equal to $\overline{\mathrm{VaR}}_\alpha(L^+)$. Note that if one iteration over all columns of one of the matrices does not lead to any change in that matrix, then each column of the matrix is oppositely ordered to the sum of all others and thus there is also no change in the minimal row sum (but the converse is not necessarily true; see below).
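In R, one column rearrangement of this kind can be sketched as follows (a minimal, unoptimized illustration; the package's rearrange() is considerably more refined).

```r
## Sketch of the RA's core step: oppositely order column j of X with respect
## to the row sums of all other columns
oppositely.order <- function(X, j) {
  rs <- rowSums(X[, -j, drop = FALSE]) # sums of all other columns
  ## place the largest entries of column j where the other row sums are smallest
  X[, j] <- sort(X[, j], decreasing = TRUE)[rank(rs, ties.method = "first")]
  X
}
## Small example: column 2 becomes oppositely ordered to column 1
X <- cbind(c(1, 2, 3), c(30, 10, 20))
oppositely.order(X, j = 2) # second column becomes (30, 20, 10)
```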

The version of the RA given below differs slightly from the one stated in [13], e.g., concerning the treatment of infinite quantiles, which, due to the efficiency of the implementation and in particular the sorting algorithm used, cannot be handled by rearrange() at this point (see the source code for an explanation). The actual implementation, see RA() and the underlying workhorse rearrange(), additionally provides more information about the computed quantities. This includes, e.g., a parameter max.ra determining the maximal number of columns to rearrange and the choice $\varepsilon =$ (abstol=)NULL to iterate until each column is oppositely ordered to the sum of all others. The latter is typically (by far) not implied by $\varepsilon = 0$, but does not result in higher accuracy (see, e.g., the application discussed in the vignette VaR_bounds) and is typically very time-consuming (hence the introduction of max.ra in our implementation).

Algorithm 3.1 (RA for computing $\overline{\mathrm{VaR}}_\alpha(L^+)$).

Proceed as follows.

  1. Fix a confidence level $\alpha \in (0,1)$, marginal quantile functions $F_1^-, \dots, F_d^-$, an integer $N$ and the desired (absolute) convergence tolerance $\varepsilon \ge 0$.

  2. Compute the lower bound:

    (a) Define the matrix $\underline{X}^\alpha = (\underline{x}_{ij}^\alpha)$ for $\underline{x}_{ij}^\alpha = F_j^-\bigl(\alpha + (1-\alpha)\frac{i-1}{N}\bigr)$, $i \in \{1, \dots, N\}$, $j \in \{1, \dots, d\}$.

    (b) Randomly permute the elements in each column of $\underline{X}^\alpha$.

    (c) Set $\underline{Y}^\alpha = \underline{X}^\alpha$. For $1 \le j \le d$, rearrange the $j$th column of the matrix $\underline{Y}^\alpha$ so that it becomes oppositely ordered to the sum of all other columns.

    (d) While $s(\underline{Y}^\alpha) - s(\underline{X}^\alpha) > \varepsilon$, set $\underline{X}^\alpha$ to $\underline{Y}^\alpha$ and repeat Step (c).

    (e) Set $\underline{s}_N = s(\underline{Y}^\alpha)$.

  3. Compute the upper bound:

    (a) Define the matrix $\overline{X}^\alpha = (\overline{x}_{ij}^\alpha)$ for $\overline{x}_{ij}^\alpha = F_j^-\bigl(\alpha + (1-\alpha)\frac{i}{N}\bigr)$, $i \in \{1, \dots, N\}$, $j \in \{1, \dots, d\}$. If (for $i = N$ and) for any $j \in \{1, \dots, d\}$, $F_j^-(1) = \infty$, adjust it to $F_j^-\bigl(\alpha + (1-\alpha)\frac{N - 1/2}{N}\bigr)$.

    (b) Randomly permute the elements in each column of $\overline{X}^\alpha$.

    (c) Set $\overline{Y}^\alpha = \overline{X}^\alpha$. For $1 \le j \le d$, rearrange the $j$th column of the matrix $\overline{Y}^\alpha$ so that it becomes oppositely ordered to the sum of all other columns.

    (d) While $s(\overline{Y}^\alpha) - s(\overline{X}^\alpha) > \varepsilon$, set $\overline{X}^\alpha$ to $\overline{Y}^\alpha$ and repeat Step (c).

    (e) Set $\overline{s}_N = s(\overline{Y}^\alpha)$.

  4. Return $(\underline{s}_N, \overline{s}_N)$.
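A usage sketch of the corresponding implementation could look as follows; the argument names are assumptions based on the package documentation at the time of writing (check ?RA).

```r
library(qrmtools)
## Hedged usage sketch of RA(): worst VaR_0.99 bounds for d = 8 Par(2) margins
qF <- rep(list(function(p) (1 - p)^(-1/2) - 1), 8) # marginal quantile functions
set.seed(271) # the RA randomizes the initial matrices
res <- RA(0.99, qF = qF, N = 2^10)
str(res) # returned information, including the bounds (s_N lower/upper)
```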

As mentioned earlier, the main feature of the RA is to iterate over all columns and oppositely order each of them with respect to the sum of all others (see Steps 2(c) and 3(c)). This procedure reduces the variance of the row sums with each rearrangement. Note that it does not necessarily reach an optimal solution of the maximin problem (see, e.g., [15, Lemma 6] for a counter-example) and thus the convergence $|\overline{s}_N - \underline{s}_N| \to 0$ is not guaranteed in this approach. The randomization of the initial input in Steps 2(b) and 3(b) aims at avoiding these situations and also helps reduce the run time by avoiding worst-case sorting. Finally, let us remark that the actual outcome of the algorithm may depend on the underlying sorting algorithm used; see our study in Section 2.3 of the vignette VaR_bounds concerning the possible rearranged output matrices and the influence of the underlying sorting algorithm on the outcome.

3.2 Conceptual and numerical improvements

Besides the confidence level $\alpha$ and the marginal quantile functions $F_1^-, \dots, F_d^-$, the RA relies on the inputs $N$ and $\varepsilon \ge 0$. Concerning $N$, it needs to be "sufficiently large", but a practitioner is left to make a choice about this value.

Another issue is the use of the absolute tolerance $\varepsilon$ in the algorithm. There are two problems. The first problem is that it is more natural to use a relative instead of an absolute tolerance. Without (roughly) knowing the minimal row sum, a pre-specified absolute tolerance does not guarantee that the change in the minimal row sum from $\underline{X}^\alpha$ to $\underline{Y}^\alpha$ is of the right order (and this order depends at least on $d$ and the chosen quantile functions). If $\varepsilon$ is chosen too large, the computed bounds $\underline{s}_N$ and $\overline{s}_N$ carry too much uncertainty, whereas if it is chosen too small, an unnecessarily long run time results; the latter seems to be the case for [13, Table 3], where the chosen $\varepsilon = 0.1$ is roughly 0.000004 % of the computed $\overline{\mathrm{VaR}}_{0.99}(L^+)$.

The second problem is that the absolute tolerance $\varepsilon$ is only used to check individual "convergence" of $\underline{s}_N$ and of $\overline{s}_N$. It does not guarantee that $\underline{s}_N$ and $\overline{s}_N$ are sufficiently close to provide a reasonable approximation of $\overline{\mathrm{VaR}}_\alpha(L^+)$. From a computational point of view, this should still be checked. Also, an implementation should return convergence and other useful information, e.g., the relative rearrangement range $|(\overline{s}_N - \underline{s}_N)/\overline{s}_N|$, the actual individual absolute tolerances reached when computing $\underline{s}_N$ and $\overline{s}_N$, logical variables indicating whether the individual absolute tolerances have been reached, the numbers of column rearrangements used for $\underline{s}_N$ and $\overline{s}_N$, the row sums computed after each column rearrangement, the constructed input matrices $\underline{X}^\alpha$ and $\overline{X}^\alpha$, and the corresponding rearranged, final matrices $\underline{Y}^\alpha$ and $\overline{Y}^\alpha$; see RA() in the R package qrmtools for such information.

A further problem is that the RA iterates over all $d$ columns before checking the termination conditions; see Steps 2(c)-2(d) and 3(c)-3(d) of Algorithm 3.1. Our underlying workhorse rearrange() keeps track of the column rearrangements of the last $d$ considered columns and can thus terminate after rearranging any column (not only the last one); see also Algorithm 4.1 below. This "early termination" criterion saves run time (note that the tracking overhead is rather minimal). We advise the interested reader to have a look at the source code of rearrange() for further numerical and run-time improvements (fast accessing of columns via lists; avoiding having to compute the row sums over all but the current column; an extended tracing feature), some of which are mentioned in the vignette VaR_bounds.
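A minimal sketch of this bookkeeping idea is given below (an illustration under our own simplifications, not the package's optimized rearrange()): after each single column rearrangement, the current minimal row sum is compared with the one from $d$ rearrangements earlier.

```r
## Sketch of the "early termination" idea: rearrange columns cyclically and
## keep a lookback of the minimal row sums of the last d rearrangements so
## that the loop can stop after any column (not only after full sweeps).
rearrange.sketch <- function(X, abstol = 0, max.ra = 10 * ncol(X)) {
  d <- ncol(X)
  m <- numeric(0) # minimal row sums after each column rearrangement
  j <- 0; count <- 0
  repeat {
    j <- j %% d + 1; count <- count + 1
    rs <- rowSums(X[, -j, drop = FALSE])
    X[, j] <- sort(X[, j], decreasing = TRUE)[rank(rs, ties.method = "first")]
    m <- c(m, min(rs + X[, j])) # minimal row sum of the updated matrix
    n <- length(m)
    if (count >= max.ra || (n > d && abs(m[n] - m[n - d]) <= abstol)) break
  }
  list(bound = m[n], X = X, n.ra = count)
}
```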

3.3 Empirical performance under various scenarios

In order to empirically investigate the performance of the RA, we consider two studies, each of which addresses four cases; we thus consider eight scenarios. In terms of studies, we consider the following:

  1. ($N$ running, $d$ fixed): $N \in \{2^7, 2^8, \dots, 2^{17}\}$ and $d = 20$.

  2. ($N$ fixed, $d$ running): $N = 2^8 = 256$ and $d \in \{2^2, 2^3, \dots, 2^{10}\}$.

These choices allow us to investigate the impact of the upper tail discretization parameter $N$ (in Study (1)) and the impact of the number of risk factors $d$ (in Study (2)) on the performance of the RA. In terms of cases, we consider the following different marginal tail behaviors based on the Pareto distribution function $F_j(x) = 1 - (1+x)^{-\theta_j}$ (a sketch of how such margins can be set up in R follows after this list):

  1. (Case HH) $\theta_1, \dots, \theta_d$ form an equidistant sequence from 0.6 to 0.4; this case represents a portfolio with all marginal loss distributions being heavy-tailed (slightly increasing in heavy-tailedness).

  2. (Case LH) $\theta_1, \dots, \theta_d$ form an equidistant sequence from 1.5 to 0.5; this case represents a portfolio with marginals with different tail behaviors, ranging from comparably light-tailed to very heavy-tailed distributions.

  3. (Case LL) $\theta_1, \dots, \theta_d$ form an equidistant sequence from 1.6 to 1.4; this case represents a portfolio with all marginal loss distributions being comparably light-tailed.

  4. (Case LH1) $\theta_1, \dots, \theta_{d-1}$ are chosen as in Case LL and $\theta_d = 0.5$; this case represents a portfolio with all marginal loss distributions being light-tailed except for the last one.
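As the sketch referenced above, the marginal quantile functions for, e.g., Case LH can be constructed as follows.

```r
## Sketch of the Case LH margins with d = 20: Pareto quantile functions
## F_j^-(p) = (1 - p)^(-1/theta_j) - 1
d <- 20
theta <- seq(1.5, 0.5, length.out = d) # equidistant from 1.5 to 0.5
qF <- lapply(theta, function(th) { force(th); function(p) (1 - p)^(-1/th) - 1 })
```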

To keep the studies tractable, we focus on the confidence level $\alpha = 0.99$ and the absolute convergence tolerance $\varepsilon = 0$ in all scenarios. Furthermore, we consider $B = 100$ replicated simulation runs in order to provide empirical 95 % confidence intervals for the estimated quantities; note that some of them are so tight that they are barely visible in the figures presented below. The $B$ replications only differ due to different permutations of the columns in Steps 2(b) and 3(b) of Algorithm 3.1; everything else is deterministic. This allows us to study the effect of these (initial) randomization steps on the (convergence) results of the RA. Concerning the hardware used, all results were produced on an AMD 3.2 GHz Phenom II X4 955 processor with 8 GB RAM. Note that we only present detailed figures for the results of Study (1); the figures related to Study (2) can be obtained from the authors upon request.

Results of Study (1) ($N$ running, $d$ fixed)

The simulation results for Study (1) can be summarized as follows:

  1. As can be seen in Figure 4, the means over all $B$ computed $\underline{s}_N$ and $\overline{s}_N$ converge as $N$ increases.

  2. Figure 5 indicates that as $N$ increases, so does the mean of the elapsed run time (as is to be expected). Overall, the run time does not drastically depend on the case for our choices of Pareto margins, which is a good feature.

  3. Figure 6 shows that the upper limit of the bootstrapped 95 % confidence interval for the number of column rearrangements rarely exceeds $10d$ as $N$ increases; this will be used as the default maximal number of column rearrangements in the ARA (see Section 4).

  4. Figure 7 indicates that the rate at which the number of oppositely ordered columns (based on the final rearranged $\underline{Y}^\alpha$ and $\overline{Y}^\alpha$) decreases depends on the characteristics of the marginal distributions involved, i.e., on the input matrix $X$. The number of oppositely ordered columns seems particularly small (for large $N$) in Case LL, where essentially only the last column is oppositely ordered to the sum of all others.

Figure 4. Study (1): $\overline{\mathrm{VaR}}_{0.99}$ bounds $\underline{s}_N$ and $\overline{s}_N$ for Cases HH, LH, LL and LH1 (from left to right).

Figure 5. Study (1): Run times (in s) for Cases HH, LH, LL and LH1 (from left to right).

Figure 6. Study (1): Number of rearranged columns for Cases HH, LH, LL and LH1 (from left to right).

Figure 7. Study (1): Number of oppositely ordered columns (of $\underline{Y}^\alpha$ and $\overline{Y}^\alpha$) for Cases HH, LH, LL and LH1 (from left to right).

Results of Study (2) ($N$ fixed, $d$ running)

In Study (2) we are interested in analyzing the impact of the number of risk factors $d$ on portfolios which exhibit different marginal tail behaviors. The simulation results can be summarized as follows:

  1. The means over all computed $\underline{s}_N$ and $\overline{s}_N$ as functions of $d$ diverge from one another, especially in the cases where more marginal distributions are heavy-tailed. This is due to the fact that we have kept $N$ the same for all cases in Study (2).

  2. Similar to what we have seen in Study (1), the mean run time of the RA increases as the number of risk factors increases, with Case LL in Study (2) having the smallest run time on average.

  3. As in Study (1), the mean number of rearranged columns is typically below $10d$.

  4. The number of oppositely ordered columns (of $\underline{Y}^\alpha$ and $\overline{Y}^\alpha$) increases as $d$ increases; note that this finding does not contradict what we have seen in Study (1) as we have kept $N$ constant here.

4 The Adaptive Rearrangement Algorithm

4.1 How the Adaptive Rearrangement Algorithm works

In this subsection we present an adaptive version of the RA, termed the Adaptive Rearrangement Algorithm (ARA). This algorithm for computing the bounds $\underline{s}_N$ and $\overline{s}_N$ for $\underline{\mathrm{VaR}}_\alpha(L^+)$ or $\overline{\mathrm{VaR}}_\alpha(L^+)$ (as before, we focus on the latter) provides an algorithmically improved version of the RA, has more meaningful tuning parameters and adaptively chooses the number of discretization points $N$. The ARA is implemented in the R package qrmtools; see the function ARA(). Similar to our RA() implementation, ARA() also relies on the workhorse rearrange() and returns much more information (and can also compute $\underline{\mathrm{VaR}}_\alpha(L^+)$), but the essential part of the algorithm is given below.

Algorithm 4.1 (ARA for computing $\overline{\mathrm{VaR}}_\alpha(L^+)$).

Proceed as follows.

  1. Fix a confidence level $\alpha \in (0,1)$, marginal quantile functions $F_1^-, \dots, F_d^-$, an integer vector $\boldsymbol{K} \in \mathbb{N}^l$, $l \in \mathbb{N}$ (containing the exponents of the numbers $N = 2^k$, $k \in \boldsymbol{K}$, of discretization points which are adaptively used), a bivariate vector of relative convergence tolerances $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2)$ (containing the individual relative tolerance $\varepsilon_1$ and the joint relative tolerance $\varepsilon_2$; see below) and the maximal number of column rearrangements used for each $k \in \boldsymbol{K}$.

  2. For $N = 2^k$, $k \in \boldsymbol{K}$, do:

    (a) Compute the lower bound:

      (i) Define the matrix $\underline{X}^\alpha = (\underline{x}_{ij}^\alpha)$ for

        $\underline{x}_{ij}^\alpha = F_j^-\Bigl(\alpha + (1-\alpha)\frac{i-1}{N}\Bigr), \quad i \in \{1, \dots, N\},\ j \in \{1, \dots, d\}.$

      (ii) Randomly permute the elements in each column of $\underline{X}^\alpha$.

      (iii) Set $\underline{Y}^\alpha = \underline{X}^\alpha$. For $j \in \{1, 2, \dots, d, 1, 2, \dots, d, \dots\}$, rearrange the $j$th column of the matrix $\underline{Y}^\alpha$ so that it becomes oppositely ordered to the sum of all other columns. After having rearranged at least $d+1$ columns, set, after every column rearrangement, the matrix $\underline{X}^\alpha$ to the matrix $\underline{Y}^\alpha$ from $d$ rearrangement steps earlier and stop if the maximal number of column rearrangements is reached or if

        (10) $\Bigl|\frac{s(\underline{Y}^\alpha) - s(\underline{X}^\alpha)}{s(\underline{X}^\alpha)}\Bigr| \le \varepsilon_1.$

      (iv) Set $\underline{s}_N = s(\underline{Y}^\alpha)$.

    (b) Compute the upper bound:

      (i) Define the matrix $\overline{X}^\alpha = (\overline{x}_{ij}^\alpha)$ for

        $\overline{x}_{ij}^\alpha = F_j^-\Bigl(\alpha + (1-\alpha)\frac{i}{N}\Bigr), \quad i \in \{1, \dots, N\},\ j \in \{1, \dots, d\}.$

        If (for $i = N$ and) for any $j \in \{1, \dots, d\}$, $F_j^-(1) = \infty$, adjust it to $F_j^-\bigl(\alpha + (1-\alpha)\frac{N - 1/2}{N}\bigr)$.

      (ii) Randomly permute the elements in each column of $\overline{X}^\alpha$.

      (iii) Set $\overline{Y}^\alpha = \overline{X}^\alpha$. For $j \in \{1, 2, \dots, d, 1, 2, \dots, d, \dots\}$, rearrange the $j$th column of the matrix $\overline{Y}^\alpha$ so that it becomes oppositely ordered to the sum of all other columns. After having rearranged at least $d+1$ columns, set, after every column rearrangement, the matrix $\overline{X}^\alpha$ to the matrix $\overline{Y}^\alpha$ from $d$ rearrangement steps earlier and stop if the maximal number of column rearrangements is reached or if

        (11) $\Bigl|\frac{s(\overline{Y}^\alpha) - s(\overline{X}^\alpha)}{s(\overline{X}^\alpha)}\Bigr| \le \varepsilon_1.$

      (iv) Set $\overline{s}_N = s(\overline{Y}^\alpha)$.

    (c) Determine convergence based on both the individual and the joint relative convergence tolerances: If (10) and (11) hold, and if

      $\Bigl|\frac{\overline{s}_N - \underline{s}_N}{\overline{s}_N}\Bigr| \le \varepsilon_2,$

      break.

  3. Return $(\underline{s}_N, \overline{s}_N)$.

Concerning the choices of tuning parameters in Algorithm 4.1, note that if $\boldsymbol{K} = (k) = (\log_2 N)$, i.e., if we consider a single number of discretization points, then the ARA reduces to the RA, except that it uses the improved relative instead of absolute tolerances and not only checks what we termed the individual (convergence) tolerance, i.e., the tolerance $\varepsilon_1$ for checking "convergence" of $\underline{s}_N$ and of $\overline{s}_N$ individually, but also the joint (convergence) tolerance, i.e., the relative tolerance $\varepsilon_2$ between $\underline{s}_N$ and $\overline{s}_N$; furthermore, termination is checked after each column rearrangement (which is also done by our implementation RA()). As our simulation studies in Section 3.3 suggest, useful (conservative) defaults for $\boldsymbol{K}$ and the maximal number of column rearrangements are $\boldsymbol{K} = (8, 9, \dots, 19)$ and $10d$, respectively. Given the ubiquitous model risk and the (often) rather large values of $\overline{\mathrm{VaR}}_\alpha(L^+)$ (especially in heavy-tailed test cases), a conservative choice for the relative tolerances $\boldsymbol{\varepsilon}$ may be $\boldsymbol{\varepsilon} = (0, 0.01)$; obviously, all these values can be freely chosen in the actual implementation of ARA().
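With these defaults, a usage sketch could look as follows; the argument names may differ across package versions and should be checked against ?ARA.

```r
library(qrmtools)
## Hedged usage sketch of ARA() with the conservative defaults described
## above (K = (8, ..., 19), at most 10d column rearrangements, relative
## tolerances (0, 0.01)); check ?ARA for the exact argument names
qF <- rep(list(function(p) (1 - p)^(-1/2) - 1), 20) # 20 Par(2) margins
set.seed(271)
res <- ARA(0.99, qF = qF)
str(res) # bounds, N used, numbers of column rearrangements, etc.
```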

4.2 Empirical performance under various scenarios

In this subsection, we empirically investigate the performance of the ARA. To this end, we consider $d \in \{20, 100\}$ risk factors, paired with Cases HH, LH, LL and LH1 as in Section 3.3. The considered joint relative tolerances are 0.5 %, 1 % and 2 % and we investigate the results for the individual relative tolerances 0 % and 0.1 %. Therefore, the performance of the ARA is investigated in 48 different scenarios. As before, the results shown are based on $B = 100$ simulation runs and we investigate the $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ and $\overline{s}_N$, the $N$ used on the final column rearrangement of the ARA (i.e., the $N = 2^k$, $k \in \boldsymbol{K}$, for which the algorithm terminates), the run time (in s), the number of column rearrangements (measured for the $N$ used on termination of the algorithm) and the number of oppositely ordered columns after termination of the algorithm.

Results

We first consider the results for the individual relative tolerance fixed at $\varepsilon_1 = 0$.

Our findings (see Figures 8-11) indicate the following:

  1. Although for both $d = 20$ and $d = 100$ the length of the confidence intervals for $\overline{\mathrm{VaR}}_{0.99}(L^+)$ can be checked to be increasing as the joint relative tolerance $\varepsilon_2$ gets larger, the mean and the lower and upper confidence bounds remain fairly close to each other. More importantly, for a fixed individual relative tolerance, as $\varepsilon_2$ increases, we do not observe a drastic shift in either the lower or the upper bounds for the mean across the different examples.

  2. An important observation about the $N$ used in the ARA is that in virtually all examples, the 95 % confidence interval remains the same; this fact can be leveraged in practice for portfolios which exhibit the same marginal tail behavior to reduce the run time of the ARA.

  3. Across all of the 24 scenarios, doubling the joint relative tolerance reduces the run time (measured in s) by more than 50 %.

  4. The number of column rearrangements for the $N$ used remains below $10d$.

  5. Finally, as Figures 8-11 reveal, randomizing the input matrix $X$ has a minimal impact on the various outputs of the ARA. However, this randomization has an interesting effect on the run time in that it seems to avoid the worst case in which a large number of values has to be sorted when oppositely ordering the columns, which would cause the algorithm to take considerably longer.

Concerning the effect of the choice $\varepsilon_1 = 0.001$, the findings are overall very similar. Note, however, Figure 13 (based on the twenty constituents of the SMI from 2011-09-12 to 2012-03-28), which visualizes the inaccuracy that can appear if $\varepsilon_1$ is chosen too large; hence our default $\varepsilon_1 = 0$ for RA() and ARA().

Figure 8. Boxplots of the lower and upper $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ (left-hand side) and $\overline{s}_N$ (right-hand side) computed with the ARA for itol=0 based on $B = 100$ replications.

Figure 9. Boxplots of the actual $N = 2^k$ used for computing the lower and upper $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ (left-hand side) and $\overline{s}_N$ (right-hand side) with the ARA for itol=0 based on $B = 100$ replications.

Figure 10. Boxplots of the run time (in s) for computing the lower and upper $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ (left-hand side) and $\overline{s}_N$ (right-hand side) with the ARA for itol=0 based on $B = 100$ replications.

Figure 11. Boxplots of the number of column rearrangements (measured for the $N$ used) for computing the lower and upper $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ (left-hand side) and $\overline{s}_N$ (right-hand side) with the ARA for itol=0 based on $B = 100$ replications.

Figure 12. Boxplots of the number of oppositely ordered columns for computing the lower and upper $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ (left-hand side) and $\overline{s}_N$ (right-hand side) with the ARA for itol=0 based on $B = 100$ replications.

Figure 13. Computed $\overline{\mathrm{VaR}}_{0.99}(L^+)$ bounds $\underline{s}_N$ and $\overline{s}_N$ (with rearrange()) depending on the chosen relative tolerance $\varepsilon_1$ (the crosses "+" indicate the values for $\varepsilon_1 =$ NULL) for an application to SMI constituents data from 2011-09-12 to 2012-03-28; see the vignette VaR_bounds for more details.

Figure 14. Relative speed-up (in %) of the actual implementation of ARA() over a basic version as given in the vignette VaR_bounds.

4.3 A comparison with an asymptotic result for large d

Another interesting practical question not addressed so far is whether, given the existence of the first moments of all marginal distributions, asymptotically sharp bounds based on the worst expected shortfall $\overline{\mathrm{ES}}_\alpha(L^+)$ can replace the need for a computational tool such as the (A)RA. Since expected shortfall is subadditive and comonotone additive, the worst expected shortfall $\overline{\mathrm{ES}}_\alpha(L^+)$ is attained in the comonotonic case and equals the sum of the marginal expected shortfalls. Under suitable conditions, it is known that

$\lim_{d \to \infty} \frac{\overline{\mathrm{ES}}_\alpha(L^+)}{\overline{\mathrm{VaR}}_\alpha(L^+)} = 1;$

see [20] or [17, Proposition 8.36].

Figure 15 shows $d \mapsto \overline{\mathrm{ES}}_\alpha(L^+)/\overline{\mathrm{VaR}}_\alpha(L^+)$ for different confidence levels $\alpha$ under Cases HH^shifted, LH^shifted, LL^shifted and LH1^shifted, which simply correspond to Cases HH, LH, LL and LH1 where each marginal parameter $\theta_j$ is increased by one; this is done to ensure that the first moment of each considered marginal distribution (and thus expected shortfall) exists. Worst Value-at-Risk $\overline{\mathrm{VaR}}_\alpha(L^+)$ is computed as the mean of the bounds $\underline{s}_N$ and $\overline{s}_N$ obtained from the ARA (with default values). Figure 15 also displays 2 % bounds around 1.
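Since the worst expected shortfall is just the sum of the marginal expected shortfalls, its computation is straightforward. The following sketch computes the numerator of the plotted ratio for Case LH^shifted, using the standard closed form $\mathrm{ES}_\alpha = \frac{\theta}{\theta-1}(1-\alpha)^{-1/\theta} - 1$ for $\mathrm{Par}(\theta)$ with $\theta > 1$; the denominator comes from the ARA.

```r
## Worst ES via comonotone additivity: sum of marginal ES, here for the
## shifted Pareto parameters so that first moments exist
ES.Par <- function(alpha, theta) { # ES_alpha of Par(theta), theta > 1
  stopifnot(theta > 1)
  (theta / (theta - 1)) * (1 - alpha)^(-1/theta) - 1
}
alpha <- 0.99; d <- 20
theta <- seq(1.5, 0.5, length.out = d) + 1 # Case LH^shifted
sum(sapply(theta, ES.Par, alpha = alpha)) # worst ES_0.99(L^+)
```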

Figure 15. Worst expected shortfall $\overline{\mathrm{ES}}_\alpha(L^+)$ over worst Value-at-Risk $\overline{\mathrm{VaR}}_\alpha(L^+)$ as a function of the dimension $d$ for different confidence levels $\alpha$. The considered Cases HH^shifted, LH^shifted, LL^shifted and LH1^shifted (from left to right) are as before, but each parameter is increased by 1 so that $\overline{\mathrm{ES}}_\alpha(L^+)$ exists.

As can be seen from Figure 15, the rate of convergence of $\overline{\mathrm{VaR}}_\alpha(L^+)$ to $\overline{\mathrm{ES}}_\alpha(L^+)$ for large $d$ seems to depend largely on the chosen confidence level $\alpha$. Only for very large $\alpha \in (0,1)$ (as required for credit risk and operational risk within the Basel II framework) does the rate of convergence seem acceptable.

5 Conclusion

This paper presents two contributions to the computation of Value-at-Risk bounds for a sum of losses with given marginals in the context of Quantitative Risk Management.

First, we considered the homogeneous case (i.e., all margins being equal) and addressed two approaches for computing Value-at-Risk bounds. We identified several numerical and computational hurdles in their implementation and overcame them in the R package qrmtools, including the vignette VaR_bounds. We covered several numerical aspects, such as how to compute initial intervals for the root-finding procedures involved, or the fact that care has to be taken when choosing the tolerances of the root-finding procedures; a particular example which highlights the numerical challenges when computing Value-at-Risk bounds in general is the case of equal Pareto margins, for which we also showed uniqueness of the root even in the infinite-mean case.

Overall, there is an important difference between implementing a specific model (say, with Par(2) margins), where initial intervals can be guessed or found by trial and error, and the proper implementation of a result such as [12] in the form of a black-box algorithm; see the source code of qrmtools for the work required to go in this direction and for the technical details not presented here.

Second, we considered the inhomogeneous case. We first investigated the Rearrangement Algorithm, which, by now, has been widely adopted by the industry (see also https://www.sites.google.com/site/rearrangementalgorithm/). Its tuning parameters were shown to impact the algorithm's performance and thus need to be chosen with care. We therefore presented an improved version of the Rearrangement Algorithm, termed the Adaptive Rearrangement Algorithm. The latter improves the former in that it addresses the aforementioned tuning parameters and improves on the underlying algorithmic design. The number of discretization points is chosen automatically in an adaptive way (hence the name of the algorithm). The absolute convergence tolerance is replaced by two relative convergence tolerances. Since they are relative tolerances, their choice is more intuitive. The first relative tolerance is used to determine the individual convergence of the lower bound $\underline{s}_N$ and the upper bound $\overline{s}_N$ for worst (or best) Value-at-Risk. The second relative tolerance is introduced to control how far apart $\underline{s}_N$ and $\overline{s}_N$ are. The Adaptive Rearrangement Algorithm has been implemented in the R package qrmtools, together with conservative defaults. The implementation contains several other improvements as well (e.g., fast accessing of columns via lists; avoiding having to compute the row sums over all but the current column; an extended tracing feature). Both the algorithmic and the numerical improvements may be of interest in any context in which it is required to find columnwise permutations of a matrix such that the minimal (maximal) row sum is maximized (minimized).

There are several interesting questions left to be investigated. Besides the evaluation of the numerical complexity of the algorithm, the theoretical convergence properties of the (Adaptive) Rearrangement Algorithm remain an open problem. Also, it remains unclear how the rows and columns of the input matrices for the (A)RA can be initialized such that the run time is minimal. Another question is whether one can reduce the run time by using the rearranged matrix from the case $N = 2^{k-1}$ to construct the initial matrix for the case $N = 2^k$. It remains an open question whether the overhead of building such initial matrices is compensated by the resulting reduction in run time; at the moment, randomization seems difficult to beat in this regard.

Acknowledgements

We would like to thank Giovanni Puccetti (University of Milano), Ruodu Wang (University of Waterloo), Kurt Hornik (Vienna University of Economics and Business) and the reviewers for valuable comments and feedback on the paper. We would also like to thank Robert Stelzer (Editor-in-Chief) and the Associate Editor for handling our paper. All remaining errors and omissions are ours.

References

[1] C. Bernard, X. Jiang and R. Wang, Risk aggregation with dependence uncertainty, Insurance Math. Econom. 54 (2014), 93–108.

[2] C. Bernard and D. McLeish, Algorithms for finding copulas minimizing convex functions of sums, preprint (2015), http://arxiv.org/abs/1502.02130.

[3] C. Bernard, L. Rüschendorf and S. Vanduffel, Value-at-risk bounds with variance constraints, preprint (2013), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2342068.

[4] C. Bernard, L. Rüschendorf and S. Vanduffel, Value-at-risk bounds with variance constraints, preprint (2015), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2342068.

[5] C. Bernard, L. Rüschendorf, S. Vanduffel and J. Yao, How robust is the value-at-risk of credit risk portfolios?, Eur. J. Finance 23 (2017), no. 6, 507–534.

[6] C. Bernard and S. Vanduffel, A new approach to assessing model risk in high dimensions, J. Banking Finance 58 (2015), 166–178.

[7] C. Bernard and S. Vanduffel, Quantile of a mixture with application to model risk assessment, Depend. Model. 3 (2015), 172–181.

[8] V. Bignozzi, G. Puccetti and L. Rüschendorf, Reducing model risk via positive and negative dependence assumptions, Insurance Math. Econom. 61 (2015), no. 1, 17–26.

[9] V. Chavez-Demoulin, P. Embrechts and M. Hofert, An extreme value approach for modeling operational risk losses depending on covariates, J. Risk Insurance 83 (2016), no. 3, 735–776.

[10] E. G. Coffman and M. Yannakakis, Permuting elements within columns of a matrix in order to minimize maximum row sum, Math. Oper. Res. 9 (1984), no. 3, 384–390.

[11] P. Embrechts and M. Hofert, A note on generalized inverses, Math. Methods Oper. Res. 77 (2013), no. 3, 423–432.

[12] P. Embrechts and G. Puccetti, Bounds for functions of dependent risks, Finance Stoch. 10 (2006), no. 3, 341–352.

[13] P. Embrechts, G. Puccetti and L. Rüschendorf, Model uncertainty and VaR aggregation, J. Banking Finance 37 (2013), no. 8, 2750–2764.

[14] P. Embrechts, G. Puccetti, L. Rüschendorf, R. Wang and A. Beleraj, An academic response to Basel 3.5, Risks 2 (2014), no. 1, 25–48.

[15] U.-U. Haus, Bounding stochastic dependence, complete mixability of matrices, and multidimensional bottleneck assignment problems, preprint (2014), http://arxiv.org/abs/1407.6475.

[16] M. Hofert and M. V. Wüthrich, Statistical review of nuclear power accidents, Asia-Pac. J. Risk Insurance 7 (2012), DOI 10.1515/2153-3792.1157.

[17] A. J. McNeil, R. Frey and P. Embrechts, Quantitative Risk Management: Concepts, Techniques, Tools, 2nd ed., Princeton University Press, Princeton, 2015.

[18] G. Puccetti and L. Rüschendorf, Bounds for joint portfolios of dependent risks, Stat. Risk Model. 29 (2012), no. 2, 107–132.

[19] G. Puccetti and L. Rüschendorf, Computation of sharp bounds on the distribution of a function of dependent risks, J. Comput. Appl. Math. 236 (2012), no. 7, 1833–1840.

[20] G. Puccetti and L. Rüschendorf, Asymptotic equivalence of conservative VaR- and ES-based capital charges, J. Risk 16 (2014), no. 3, 3–22.

[21] G. Puccetti, L. Rüschendorf, D. Small and S. Vanduffel, Reduction of value-at-risk bounds via independence and variance information, Scand. Actuar. J. (2015), DOI 10.1080/03461238.2015.1119717.

[22] G. Puccetti and R. Wang, Extremal dependence concepts, Stat. Sci. 30 (2015), no. 4, 485–517.

[23] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer, Berlin, 1998.

[24] L. Rüschendorf, On the multidimensional assignment problem, Methods Oper. Res. 47 (1983), 107–113.

[25] R. Wang, L. Peng and J. Yang, Bounds for the sum of dependent risks and worst value-at-risk with monotone marginal densities, Finance Stoch. 17 (2013), no. 2, 395–417.

Received: 2015-11-29
Revised: 2016-12-29
Accepted: 2017-1-11
Published Online: 2017-2-3
Published in Print: 2017-6-1

© 2017 Walter de Gruyter GmbH, Berlin/Boston
