Article Open Access

Bayesian Analyses of the Two-Stage CARR-Return Models with Applications to COVID-19 Impact on the Cryptocurrency Market

Published/Copyright: December 22, 2025

Abstract

Volatility is one of the main risk measures as it describes the variability of returns; it thereby drives many investment decisions in the financial markets. Yet the COVID-19 pandemic introduced a new source of market risk, and its impact remains in focus. This paper studies the volatility and return series of Bitcoin and Ethereum to gain a better understanding of the impact of this widespread global event on the financial markets, particularly as central banks’ stimulus measures led to concerns about inflation, boosting interest in Bitcoin as the “digital gold”. We extend the conditional autoregressive range (CARR) volatility model by incorporating efficient realised range-based volatility measures taken at higher frequencies, which capture stronger short-term memory. These conditional volatility estimates are then used as inputs to return models, forming a two-stage model that treats volatility and returns separately. This approach serves as a compelling alternative to the commonly used GARCH models. The key improvement comes from the choice of error distribution: the Generalised Beta type two, which can be specified in the CARR model but not in GARCH models. For the return model, the Student-t distribution is typically the best choice. The two-stage model demonstrates strong out-of-sample forecasting performance, particularly in terms of Value-at-Risk (VaR) and Volatility-at-Risk (VoaR), underscoring its practical effectiveness. Model parameters are estimated using a Bayesian MCMC approach via the RStan package.

JEL Classification: C11; C16; C22; D53; G17

1 Introduction

Volatility in financial assets is one of the main ways to quantify risk and can be thought of as the variability in returns. Relatively accurate predictions of future volatility can improve predictions of future return distributions and hence shed light on portfolio formulation. Another concept, the Value at Risk (VaR), defined as a certain quantile of the return distribution, can reveal the level of tail risk and give the worst loss expected for a portfolio over a certain time period at a given confidence level. VaR originated from portfolio theory and probability in the 1980s. In 1994, J.P. Morgan introduced the RiskMetrics framework, which popularised the use of VaR (Jorion 1996). In 1996, the Basel Committee on Banking Supervision (BCBS) incorporated VaR into its market risk capital framework under the Basel I amendments. Since these amendments, VaR has become the standard measure of financial market risk across major banks and financial institutions, especially for trading desks and portfolio management. After the 2008 financial crisis, VaR faced criticism for underestimating risk during extreme events, leading to discussions on enhancing risk measures and tightening regulations. Expected Shortfall, or Conditional VaR (CVaR), proposed by Artzner et al. (1997, 1999), is the expected loss of a portfolio given that losses exceed VaR. CVaR is considered more useful than VaR because it is a coherent measure of financial portfolio risk. In May 2012, Basel III proposed replacing VaR with CVaR for better capture of tail risks.

Modelling volatility and returns is essential in many applications such as risk management, as companies need accurate market risk measures for returns such as VaR and CVaR. While it is common to look at the lower quantiles of the distribution of returns, it is also important to look at the upper quantiles, as companies may hold short positions in financial assets, meaning that high returns in these assets would cause losses for the company holding the short positions. Other risk measures which are useful in practice are volatility-at-risk (VoaR) and conditional VoaR (CVoaR). In comparison to VaR, VoaR looks at the quantiles of the forecast distribution of volatilities. This is particularly useful when the value of positions in certain financial assets, such as options, is directly related to volatility. Accurate predictions of these market risk measures help companies ensure they have adequate funds to deal with market stress and stay within their risk policies.

However, volatility, together with VaR and CVaR measures, is often affected by extreme events such as the COVID-19 pandemic, which cause periods of high volatility and uncertainty. The study of COVID-19’s impact on the cryptocurrency market (Nitithumbundit and Chan 2022) is particularly important as the market reacted strongly to COVID-19, revealing investors’ sentiment toward risk assets. In recent years, cryptocurrencies have gained much more popularity, particularly Bitcoin (BTC) and Ethereum (ETH), as they have the highest market capitalisations in the cryptocurrency market. They are being used as investments, potentially to diversify existing portfolios, and also for payments. However, they tend to have much higher volatility and hence risk, so it is important to better understand their volatility dynamics when investing in them. As evidenced by the BTC crash in early 2020 alongside the traditional markets, cryptocurrencies such as BTC have proved their connection to global financial trends. The pandemic accelerated institutional interest in cryptocurrencies, for hedge funds and investment portfolios. Moreover, central banks’ stimulus measures during the pandemic led to concerns about inflation, boosting interest in Bitcoin as the “digital gold” for inflation hedging. There is also technological interest in digital finance, blockchain adoption, and regulatory discussions around cryptocurrencies. This shift reflects broader financial market transformations.

It is clear that days of higher volatility tend to cluster together, causing higher autocorrelation in volatility than in returns. Hence, return models assuming constant volatility would not be appropriate for risk management. Popular heteroscedastic return models include the Auto-Regressive Conditionally Heteroskedastic (ARCH) model (Engle 1982) and the generalised ARCH (GARCH) model (Bollerslev 1986), which account for time-varying volatility, capturing the short-term memory and persistence in volatility. However, these models only provide a distribution for returns while volatility is treated as a latent process. Instead, the conditional autoregressive range (CARR) model (Chou 2005) can be used to model volatility directly and obtain estimates of VoaR and CVoaR. The volatility estimates from the CARR model can then be used as input to a return model, where the distribution of returns is modelled to obtain estimates of VaR and CVaR. This process of modelling volatility and returns is referred to as the two-stage model (Shao et al. 2009). Since this model uses two sources of data, volatility and returns, it is usually more efficient than the GARCH model.

There are many different ways to define volatility measures as inputs to the CARR models. A popular measure is the squared return which the GARCH model uses. However, this measure does not capture the price movement of an asset throughout the day. Some more robust measures include the Parkinson measure (PK) (Parkinson 1980) which captures the high and low price of an asset during a day, and a realised volatility (RV) measure which captures the returns from smaller time intervals during a day. These measures can also be combined to create a realised PK (RPK) measure which captures the high and low prices from smaller time intervals during a day. Other features, such as leverage effect, can be incorporated into the mean function of the CARR models apart from the short-term memory and persistence components. The leverage effect is the asymmetric impact of positive and negative returns on the next period volatility. Usually in financial assets such as stocks, a negative return will generally have a higher impact on volatility than a positive return of the same magnitude. Other features include the interaction effects between lagged observed (short-term memory) and fitted (persistence) volatilities.

One of the difficulties in volatility and return models is the modelling of fat tails. For example, extreme events, which result in returns far from the mean, are not captured well by some distributions such as the Normal. So the choice of distributions in these models plays a major role in providing accurate quantile forecasts. For the distribution of volatilities, a distribution defined on the positive real domain must be used, as volatilities can only take non-negative values. A generalised distribution allowing extra flexibility for fatter tails is the Generalised Beta type two (GB2) distribution (McDonald and Xu 1995), which nests many other common distributions, including the Generalised Gamma (GG), as special cases. For returns, a popular distribution used to capture fatter tails is the Student-t (ST) distribution. On the other hand, the Variance-Gamma (VG) distribution (Madan and Seneta 1990) is efficient in modelling high-frequency data with heavy kurtosis, displaying an unbounded peak in density near zero returns.

For parameter inference, the common methods are the classical likelihood and Bayesian approaches. The Bayesian Markov chain Monte Carlo (MCMC) method offers several advantages. It can estimate complex hierarchical or latent-variable models where frequentist optimisation might struggle with high-dimensional integration. Moreover, prior information about the parameters can be incorporated, particularly when data are scarce and expert knowledge is available. More importantly, inferences can be made from the posterior distributions, and the posterior predictive distributions so generated allow easy computation of measures such as VaR and CVaR. These posterior predictive distributions provide a more realistic quantification of uncertainty than a single point estimate from the classical likelihood approach. The Bayesian approach also enjoys asymptotic normality of the parameter estimates, like the classical likelihood approach. Hence, we adopt the Bayesian approach via the user-friendly RStan package.

This paper has three aims. Firstly, we study the COVID-19 impact on the cryptocurrency market by considering the pre-pandemic, early pandemic, and late pandemic periods. We explore how the features of volatility change across these periods by applying BTC and ETH’s volatility measures to CARR models with short-term memory, persistence, their interaction (called the bilinear effect) and the leverage effect. We investigate how the choice of distributions affects model performance. We also study the pandemic’s impact on returns using the two-stage CARR-return model. Secondly, we propose some effective ways of defining the lagged volatility and return regressors for the autoregressive and leverage effects that also improve model performance. Lastly, we perform one-day-ahead volatility and return forecasts as well as other risk metric forecasts including VaR, CVaR, VoaR, and CVoaR. We showcase the enhanced performance of the two-stage CARR-return models relative to the GARCH model over all periods.

The paper is structured as follows. Section 2 introduces three efficient volatility measures. Section 3 outlines the two-stage CARR-return models with their distribution choices and the Bayesian approach to parameter estimation. Section 4 applies the stage-one CARR models to the BTC and ETH data, with model selection, forecasting and extensions to the regressors. Further data analyses using the stage-two return models are reported in Section 5. Lastly, our work is summarised in Section 6 with proposals for further analyses.

2 Measures

Denoting the price of an asset at time t by P_t, a random walk price model can be written as

(1) \ln P_t = \ln P_{t-1} + \sigma z_t

where σ is a volatility constant and z_t is a normally distributed random variable with mean 0 and variance 1. This model is a non-drift discretised version of the geometric Brownian motion. Let H_t, L_t, C_t (= P_t) and O_t be the high, low, closing and opening prices of the asset during time period t. The close-to-close logarithmic return (hereafter referred to as return) from (1) using the closing price P_t is given by

(2) r_{CC,t} = \ln(C_t / C_{t-1}) = \ln C_t - \ln C_{t-1}.

To estimate the volatility constant σ in (1), Parkinson (1980) proposed the extreme value method, which uses the range between the high and low logarithmic prices during the time period t. This range measure of volatility is called the Parkinson (PK) measure. Allowing σ² to change over time, the PK measure at time t is given by

(3) v_{PK,t} = \dfrac{(\ln H_t - \ln L_t)^2}{4 \ln 2}.

Parkinson (1980) also showed that the constant 1/(4 ln 2) ensures that the measure is an unbiased estimate of σ_t². The traditional method of estimating σ_t² uses the rate of return over the unit time interval. Research showed that the PK measure is five times more efficient than the traditional method while also being an unbiased estimate of the return variance.
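As a quick illustration, the PK measure in (3) reduces to a few lines of code; the daily high/low prices below are hypothetical.

```python
import numpy as np

def parkinson(high, low):
    """Parkinson (1980) range measure per period: (ln H_t - ln L_t)^2 / (4 ln 2)."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    return (np.log(high) - np.log(low)) ** 2 / (4.0 * np.log(2.0))

# Two days of hypothetical high/low prices
v_pk = parkinson([105.0, 110.0], [100.0, 102.0])
```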

Andersen and Bollerslev (1998) proposed the realised volatility (RV) measure which uses higher frequency data to capture movements in an asset throughout the day defined as

(4) v_{RV,t} = \sum_{i=1}^{I} \left[\ln C_t(i) - \ln C_t(i-1)\right]^2

where I is the number of intraday intervals in time period t and C_t(i) is the closing price at the end of the i-th interval. Andersen and Bollerslev (1998) noted that using squared returns on fixed horizons provides noisy measurements, so they would not be a good proxy for future volatility when used in models. Instead, they demonstrated that one can obtain much better predictions of future volatility by using high-frequency data and summing the squared intraday returns. They also noted that as the number of intraday returns I increases to infinity, the measure converges to the true volatility.
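A minimal sketch of the RV measure in (4), summing squared intraday log returns over one day; the 5-minute closing prices are hypothetical.

```python
import numpy as np

def realised_volatility(intraday_close):
    """RV measure: sum of squared intraday log returns within one time period."""
    logp = np.log(np.asarray(intraday_close, dtype=float))
    return float(np.sum(np.diff(logp) ** 2))

# Hypothetical 5-minute closing prices over part of a day
v_rv = realised_volatility([100.0, 100.5, 99.8, 100.2])
```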

Martens and van Dijk (2007) combined the ideas of both the PK and RV measures by replacing each squared intraday return in the RV measure with the PK measure. This measure called realised PK (RPK) is written as

(5) v_{RPK,t} = \dfrac{1}{4 \ln 2} \sum_{i=1}^{I} (\ln H_{t,i} - \ln L_{t,i})^2.

They demonstrated that in theory, this combined RPK measure is more efficient than the RV measure using the same sampling frequency. However, it was also noted that in the case of infrequent trading, this was not true for the highest sampling frequencies.
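The RPK measure in (5) applies the PK range to each intraday interval and sums the results; a sketch with hypothetical intraday highs and lows.

```python
import numpy as np

def realised_parkinson(intraday_high, intraday_low):
    """RPK measure: the squared intraday log range summed over intervals, scaled by 1/(4 ln 2)."""
    h = np.log(np.asarray(intraday_high, dtype=float))
    l = np.log(np.asarray(intraday_low, dtype=float))
    return float(np.sum((h - l) ** 2) / (4.0 * np.log(2.0)))

# Hypothetical intraday high/low prices for three intervals
v_rpk = realised_parkinson([101.0, 100.8, 100.9], [100.1, 100.0, 100.3])
```

With a single interval covering the whole day, RPK reduces to the daily PK measure in (3).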

Since range-based volatility measures are very sensitive to outliers, Tan et al. (2019) proposed the quantile PK (QPK) measure and added scaling factors to ensure that the measure is unbiased. The QPK measure can be written as

(6) v_{QPK(\tau,\tau+\nu),t} = (\ln H_{t,\tau+\nu} - \ln L_{t,\tau})^2 \, \varphi_{QPK(\tau,\tau+\nu)}

using the upper 100(τ + ν)-th and lower 100τ-th quantiles of the log price respectively, where ν is the width of the interquantile range and φ_{QPK(τ,τ+ν)} is the scaling factor which ensures that the volatility measure is unbiased, that is, E[v_{QPK(τ,τ+ν),t}] = σ_t², the variance of log returns. Tan et al. (2019) estimated the scaling factors for a range of interquantile ranges using simulation studies. The scaling factor of the (0, 100) interquantile range becomes 1/(4 ln 2), as it coincides with the PK measure. This QPK measure was shown to be more efficient than the PK measure when there are outliers, which can be excluded from the interquantile range (τ, τ + ν). The scaling factors used for the QPK measure can be found in Table 1 of Tan et al. (2019).

3 Two-Stage CARR-Return Model

3.1 CARR Model

Engle and Russell (1998) studied the irregular time intervals between transactions, particularly financial market transactions such as trading stocks. They found that there was clustering of transactions, and hence, a high autocorrelation in the time intervals between trades. The autoregressive conditional duration (ACD) model was then proposed to model the conditional density of the durations of such transactions. Chou (2005) extended this idea by using range-based volatility measures in the ACD model, proposing the Conditional Autoregressive Range (CARR) model. Let V t be the random variable for the volatility process at time t which can be measured by (3) to (6). The CARR ( p , q ) model can be written as

(7) V_t = \lambda_t \varepsilon_t
(8) \lambda_t = \beta_0 + \sum_{i=1}^{p} \beta_{1,i} V_{t-i} + \sum_{i=1}^{q} \beta_{2,i} \lambda_{t-i}

where λ t is the conditional mean of the volatility measure and ɛ t is the error term which has a mean of one and is assumed to follow some distribution defined on the positive real domain. For this process to be stationary, the following conditions must hold

(9) \beta_0, \beta_{1,i}, \beta_{2,i} > 0 \quad \text{and} \quad \sum_{i=1}^{p} \beta_{1,i} + \sum_{i=1}^{q} \beta_{2,i} < 1.

The parameters β_{1,i} are the autoregressive terms which capture short-term volatility, while β_{2,i} measure persistence by giving weight to previous volatility estimates, slowing the rate of change of the estimates. The long-term unconditional mean is captured by β_0.
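The CARR(1,1) mean recursion in (8) can be sketched as follows; the parameter values are hypothetical, and the recursion is initialised at the unconditional mean β_0/(1 − β_1 − β_2).

```python
import numpy as np

def carr_lambda(v, beta0, beta1, beta2):
    """Conditional mean of a CARR(1,1): lambda_t = beta0 + beta1*V_{t-1} + beta2*lambda_{t-1}."""
    assert beta0 > 0 and beta1 > 0 and beta2 > 0 and beta1 + beta2 < 1  # stationarity (9)
    lam = np.empty(len(v))
    lam[0] = beta0 / (1.0 - beta1 - beta2)  # start at the unconditional mean
    for t in range(1, len(v)):
        lam[t] = beta0 + beta1 * v[t - 1] + beta2 * lam[t - 1]
    return lam
```

If the observed measures V_t sit exactly at the unconditional mean, the recursion stays there; a spike in V_{t-1} raises λ_t by β_1 times the spike and then decays at rate β_2.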

Chou (2005) showed that this model can be extended to capture other exogenous variables denoted by Xi,t. The conditional mean in the CARRX model can be written as

\lambda_t = \beta_0 + \sum_{i=1}^{p} \beta_{1,i} V_{t-i} + \sum_{i=1}^{q} \beta_{2,i} \lambda_{t-i} + \sum_{i=1}^{L} \beta_{3,i} X_{t-i}.

Some exogenous variables that could be used include lagged returns, trading volume, or variables which capture seasonal effects.

With volatility, there can be asymmetry in future volatility based on whether the previous day’s return was positive or negative. This is known as the leverage effect. Tan et al. (2019) considered three asymmetric mean functions, denoted LCARR(p, q, w), which can be written as

(10) \lambda_t = \beta_0 + \sum_{i=1}^{p} \beta_{1,i} V_{t-i} + \sum_{i=1}^{q} \beta_{2,i} \lambda_{t-i} + \beta_3 w |R_{t-1}| \mathbf{1}_{R_{t-1}<0} + \beta_4 w R_{t-1} \mathbf{1}_{R_{t-1}>0}

where w = 1, 1/\lambda_{t-1}, 1/V_{t-1} are denoted by a, b and c respectively to capture the leverage effect, and \mathbf{1}_E is an indicator function of event E, equal to 1 when E holds and 0 otherwise.
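A sketch of the LCARR(1, 1, a) mean function in (10) with w = 1; the parameter values and data are hypothetical.

```python
import numpy as np

def lcarr_lambda(v, r, beta0, beta1, beta2, beta3, beta4):
    """LCARR(1,1,a) conditional mean (w = 1): a CARR(1,1) recursion plus asymmetric
    terms driven by the sign of the previous return r_{t-1}."""
    lam = np.empty(len(v))
    lam[0] = beta0 / (1.0 - beta1 - beta2)  # start at the CARR unconditional mean
    for t in range(1, len(v)):
        leverage = (beta3 * abs(r[t - 1]) * (r[t - 1] < 0)
                    + beta4 * r[t - 1] * (r[t - 1] > 0))
        lam[t] = beta0 + beta1 * v[t - 1] + beta2 * lam[t - 1] + leverage
    return lam
```

With all returns zero the leverage terms vanish and the recursion reduces to the CARR(1,1) in (8); a negative return adds β_3|R_{t-1}| to the next period’s conditional mean.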

Storti and Vitale (2003) proposed a GARCH model which includes a bilinear effect, capturing the interaction between past observed and estimated volatilities. This interaction effect is incorporated into the CARR model, denoted BCARR(p, q, s), and written as

(11) \lambda_t = \beta_0 + \sum_{i=1}^{p} \beta_{1,i} V_{t-i} + \sum_{i=1}^{q} \beta_{2,i} \lambda_{t-i} + \sum_{i=1}^{s} \beta_{5,i} V_{t-i} \lambda_{t-i}.

The parameters β_{5,i} are used here to avoid confusion with the leverage-effect parameters β_3 and β_4.

Table 1:

Error distributions used for the CARR models.

Distribution | Density f(ε_t) | Standardisation | Constraint
Wei(a, b) | (a/b)(ε_t/b)^{a−1} exp[−(ε_t/b)^a] | E(ε_t) = b Γ(1 + 1/a) = 1 | b = 1/Γ(1 + 1/a)
GG(a, b, p) | [a/(b Γ(p))](ε_t/b)^{ap−1} exp[−(ε_t/b)^a] | E(ε_t) = b Γ(p + 1/a)/Γ(p) = 1 | b = Γ(p)/Γ(p + 1/a)
GB2(a, b, p, q) | a(ε_t/b)^{ap−1} / {b B(p, q)[1 + (ε_t/b)^a]^{p+q}} | E(ε_t) = b B(p + 1/a, q − 1/a)/B(p, q) = 1 | b = B(p, q)/B(p + 1/a, q − 1/a)
  1. GG(a, b, p) is the limit of GB2(a, b, p, q) as q → ∞ and Wei(a, b) = GG(a, b, 1); Γ(a) is the gamma function and B(a, b) is the beta function.

In the CARR model, the error term ɛ_t is assumed to follow some positive distribution. Previous works on CARR models typically used the Weibull (Wei) distribution. However, the GB2 distribution is more general as it nests many other distributions, such as the Wei and GG distributions, as special cases. See Table 1 for the distributions used in this paper and Figure 1 of Chan et al. (2018) for a list of nested distributions.

3.2 Return Model

In order to obtain VaR estimates of returns using the CARR model, Shao et al. (2009) defined a return process whose conditional volatility takes the input λ_t from the CARR models. Let R_t be the random variable of the return at time t measured by (2). This model is referred to as stage two of the CARR-return model and can be written as

(12) R_t = \mu_t + \sigma_t z_t \quad \text{where} \quad \sigma_t^2 = \rho \lambda_t

where E(z_t) = 0, Var(z_t) = 1, and z_t can follow any distribution defined on the real domain, such as the Normal, Student-t and VG distributions. See Fung and Seneta (2007) for details of the symmetric VG distribution, including its pdf and the fact that it contains the symmetric Laplace distribution as a special case when the shape parameter ν = 1. The quantity λ_t is the fitted volatility from the CARR models in (8), (10) and (11), and ρ is a scaling parameter which ensures that the volatility estimate σ_t² = ρλ_t is unbiased for the true volatility in (1), which is further assumed to change over time. The mean μ_t can follow any model, such as a constant mean or an autoregressive model. An AR(1) model

(13) \mu_t = \mu_0 + \phi_1 R_{t-1}

is used in this analysis, as the autocorrelation of returns (see Figure 7) at the first lag is sometimes significant; however, there is typically not much autocorrelation in returns, making them difficult to forecast. This model provides an alternative way to model returns in comparison to the GARCH model. The volatility is modelled separately in the two-stage model and is assumed to follow some distribution such as the GB2. This is beneficial as the stochastic nature of volatility allows one to estimate VoaR and CVoaR, which cannot be done with the GARCH model. In addition, any volatility measure, such as the RPK measure in (5), can be used for the volatility model, and such measures capture more information about volatility throughout the day. The GARCH model is restricted to using the squared return, although the realised GARCH model allows more flexibility. Because the two-stage model takes in two data sources, it is often more efficient than the GARCH model.
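The stage-two equations (12)–(13) can be illustrated by simulation; the fitted λ_t values, parameters, and seed below are hypothetical, and the Student-t errors are standardised to unit variance so that Var(z_t) = 1.

```python
import numpy as np

def simulate_returns(lam, mu0, phi1, rho, nu, seed=0):
    """Stage-two model sketch: R_t = mu_t + sigma_t z_t with sigma_t^2 = rho*lam_t,
    AR(1) mean mu_t = mu0 + phi1*R_{t-1}, and standardised Student-t errors z_t."""
    rng = np.random.default_rng(seed)
    z = rng.standard_t(nu, size=len(lam)) * np.sqrt((nu - 2.0) / nu)  # unit variance
    r = np.empty(len(lam))
    prev = 0.0
    for t in range(len(lam)):
        r[t] = mu0 + phi1 * prev + np.sqrt(rho * lam[t]) * z[t]
        prev = r[t]
    return r

# Hypothetical constant fitted volatility path of 500 days
returns = simulate_returns(np.full(500, 4e-4), mu0=0.0, phi1=0.1, rho=1.0, nu=5.0)
```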

3.3 Model Evaluation

3.3.1 Deviance Information Criterion for Model Selection

To compare the relative fit between Bayesian models in model selection, Spiegelhalter et al. (2002) proposed the Deviance Information Criterion (DIC). The DIC is similar to Akaike’s Information Criterion (AIC) in that it has a measure of goodness of fit but penalises model complexity by estimating the effective number of parameters. The DIC can be defined as

(14) \text{DIC} = \overline{D(\theta)} + p_D

where \overline{D(\theta)} = E_{\theta \mid y}[-2 \ln f(y \mid \theta)] is the posterior mean of the deviance, a measure of model fit; \overline{D(\theta)} is smaller when the log-likelihood is higher. The term p_D is the effective number of parameters, a measure of model complexity, and can be written as

(15) p_D = \overline{D(\theta)} - D(\bar{\theta}).

So it is the difference between the posterior mean of the deviance and the deviance evaluated at the posterior mean.
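Given MCMC output, (14)–(15) reduce to a few lines; the per-draw log-likelihood values below are hypothetical.

```python
import numpy as np

def dic(loglik_draws, loglik_at_post_mean):
    """DIC = mean deviance + p_D, where p_D = mean deviance - deviance at the posterior mean."""
    d_bar = float(np.mean(-2.0 * np.asarray(loglik_draws)))  # mean of D(theta) over draws
    d_hat = -2.0 * loglik_at_post_mean                        # D(theta_bar)
    p_d = d_bar - d_hat
    return d_bar + p_d

# Hypothetical per-draw log-likelihoods and log-likelihood at the posterior mean
val = dic([-1.0, -3.0], loglik_at_post_mean=-1.5)
```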

3.3.2 Kupiec Test for Risk Forecast Accuracy

3.3.2.1 VaR and CVaR Risk Measures

To evaluate the risk of holding an asset, a commonly used metric is VaR which is the maximum expected loss at a specified probability level α. In this paper, the VaR of the return will be considered for both the lower quantiles α1 ∈ (0, 0.5), and upper quantiles α2 ∈ (0.5, 1) (the specified probability level is essentially 1 − α2). VaR or quantile can be calculated using the inverse of the cumulative distribution function F R 1 of the return distribution being used in the model and can then be written as

(16) VaR t ( α 1 ) = F R , t 1 ( α 1 ) and VaR t ( α 2 ) = F R , t 1 ( α 2 ) .

The expected shortfall for R t , also called conditional VaR (CVaR), is the expected return given a certain VaR level has been exceeded. The CVaR at lower and upper quantiles are defined as

(17) \text{CVaR}_t(\alpha_1) = E[R_t \mid R_t < \text{VaR}_t(\alpha_1)] = \frac{1}{\alpha_1} \int_0^{\alpha_1} \text{VaR}_t(u)\,du \quad \text{and} \quad \text{CVaR}_t(\alpha_2) = E[R_t \mid R_t > \text{VaR}_t(\alpha_2)] = \frac{1}{1-\alpha_2} \int_{\alpha_2}^{1} \text{VaR}_t(u)\,du.

The concept of VaR can also be applied to volatility and is denoted VoaR. While VaR is a quantile of the return distribution (via the inverse CDF), such as the Normal, ST or VG, VoaR (volatility-at-risk) is a quantile of the volatility distribution, such as the Weibull, GG or GB2. The lower quantiles of volatility are usually less of an issue in practice. Given an upper quantile α_2, such as α_2 = 0.95, VoaR and CVoaR can be written as

(18) VoaR t ( α 2 ) = F V , t 1 ( α 2 ) and CVoaR t ( α 2 ) = E V t | V t > VoaR t ( α 2 ) .

In this case, the cumulative distribution function F_{V,t} comes from a distribution defined on the positive real domain, such as the Weibull.
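Given a sample from a forecast distribution (of returns for VaR/CVaR, or of volatilities for VoaR/CVoaR), the quantile and tail-average definitions above translate directly into code; the toy sample below is hypothetical.

```python
import numpy as np

def lower_var_cvar(samples, alpha):
    """Empirical lower-tail VaR (the alpha-quantile) and CVaR (mean of the tail below VaR)."""
    x = np.sort(np.asarray(samples, dtype=float))
    var = float(np.quantile(x, alpha))
    cvar = float(x[x <= var].mean())  # average of the losses beyond VaR
    return var, cvar

# Hypothetical sample of 100 forecast returns
var05, cvar05 = lower_var_cvar(np.arange(1.0, 101.0), alpha=0.05)
```

The upper-tail versions at level α_2 follow by sorting in the other direction (or applying the same function to the negated sample).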

3.3.2.2 Test for Violation Rates

To test the accuracy of VaR estimates, Kupiec (1995) proposed a test on the proportion of failures, or violation rate (VR), which determines whether the VR is significantly different from the VaR level. The VR is defined for the lower quantile level α_1 as

(19) \text{VR} = \frac{N_{\alpha_1}}{T} \quad \text{where} \quad N_{\alpha_1} = \sum_{t=1}^{T} I\left(R_t \le \text{VaR}_t(\alpha_1)\right)

is the number of times the return R_t breaches the VaR level at the lower quantile, I(⋅) is the indicator function, and T is the number of observations. The “≤” sign becomes “≥” for the upper VaR quantile at level α_2. Then a likelihood ratio (LR) statistic for the lower quantile α_1,

(20) \text{LR} = 2 \ln\left[\left(\frac{1-\text{VR}}{1-\alpha_1}\right)^{T-N_{\alpha_1}} \left(\frac{\text{VR}}{\alpha_1}\right)^{N_{\alpha_1}}\right]

can be computed, which is asymptotically chi-squared distributed with one degree of freedom. If the p-value is less than 0.05, the VaR estimate at level α_1 is rejected, meaning that it is not a good estimate.
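A direct transcription of (19)–(20), with guards for the degenerate cases VR = 0 or VR = 1; the statistic can be compared with the χ²(1) 5% critical value of 3.841.

```python
import math

def kupiec_lr(n_viol, T, alpha):
    """Kupiec (1995) proportion-of-failures LR statistic; chi-squared(1) under the null."""
    vr = n_viol / T
    if n_viol == 0:
        return -2.0 * T * math.log(1.0 - alpha)  # limit of (20) as VR -> 0
    if n_viol == T:
        return -2.0 * T * math.log(alpha)        # limit of (20) as VR -> 1
    return 2.0 * ((T - n_viol) * math.log((1.0 - vr) / (1.0 - alpha))
                  + n_viol * math.log(vr / alpha))
```

For example, 5 violations in 100 days at α_1 = 0.05 give LR = 0 (the VR equals the nominal level exactly), while 15 violations give a statistic far above 3.841, rejecting the VaR estimate.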

3.4 Bayesian Inference

3.4.1 Hierarchical Structure

Bayesian inference is an alternative to the classical likelihood approach for estimating parameters, making use of prior knowledge about the data and a hierarchical model structure. Choy and Chan (2008) showed that the ST and VG distributions can be expressed as normal scale mixture distributions

(21) \text{Student-t:} \quad X \mid \mu, \sigma^2, \nu, u \sim N\left(\mu, \frac{\sigma^2}{u}\right) \quad \text{and} \quad u \mid \nu \sim G\left(\frac{\nu}{2}, \frac{\nu}{2}\right)
(22) \text{Variance gamma:} \quad X \mid \mu, \sigma^2, \nu, u \sim N\left(\mu, \frac{\sigma^2}{u}\right) \quad \text{and} \quad u \mid \nu \sim IG\left(\frac{\nu}{2}, \frac{\nu}{2}\right)

where G(a, b) is the gamma distribution, IG(a, b) is the inverse gamma distribution, and ν is the degrees-of-freedom parameter. The mixing variable u in the variance term follows a G(ν/2, ν/2) or IG(ν/2, ν/2) distribution, which adds extra variability to the distribution of X. The probability density function (pdf) of the ST (VG) distribution can be derived by integrating the product of the Normal and Gamma (Inverse Gamma) pdfs with respect to u. For example,

(23) f_{\text{ST}}(x \mid \mu, \sigma, \nu) = \int_0^{\infty} N\left(x \,\middle|\, \mu, \frac{\sigma^2}{u}\right) G\left(u \,\middle|\, \frac{\nu}{2}, \frac{\nu}{2}\right) du.

The pdf of the VG distribution is more peaked near 0 and becomes unbounded at 0 when ν < 0.5, so it can be used to model returns under infrequent trading, where there are many observations near 0. The Normal distribution is a special case of both the ST and VG distributions as ν goes to infinity. The normal scale mixture representations of the ST and VG distributions facilitate the implementation of the models, making use of the hierarchical structure in Bayesian inference.
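The mixture representations in (21)–(22) suggest a two-step sampler: draw the mixing variable u first, then draw X given u. A sketch with hypothetical parameter values and seed, where the inverse-gamma draw for the VG case is taken as the reciprocal of a gamma draw.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_st(mu, sigma, nu, size):
    """Student-t via (21): X|u ~ N(mu, sigma^2/u) with u ~ Gamma(nu/2, rate nu/2)."""
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)  # rate nu/2 <=> scale 2/nu
    return rng.normal(mu, sigma / np.sqrt(u))

def sample_vg(mu, sigma, nu, size):
    """Symmetric VG via (22): u ~ IG(nu/2, nu/2), i.e. 1/u ~ Gamma(nu/2, rate nu/2),
    so the conditional variance sigma^2/u is sigma^2 times a gamma variate."""
    inv_u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)
    return rng.normal(mu, sigma * np.sqrt(inv_u))

x_st = sample_st(0.0, 1.0, nu=5.0, size=200_000)
x_vg = sample_vg(0.0, 1.0, nu=5.0, size=200_000)
```

With ν = 5 the Student-t draws have variance ν/(ν − 2) = 5/3, while the VG draws have variance σ² = 1 since E(1/u) = 1.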

3.4.2 Prior Distributions

After specifying the hierarchical data density, the prior density is combined with the data density, or likelihood, to get the posterior density where inferences can be made from.

f(\theta \mid y) \propto f(\theta) \times f(y \mid \theta), \qquad \text{Posterior} \propto \text{Prior} \times \text{Likelihood}.

When no prior knowledge is available for the model parameters, non-informative priors are adopted so that the priors have minimal effect on the inference; instead, the data dominate the inference, and the approach behaves like a generalised classical likelihood approach.

Specifically, the priors for the return model are mostly noninformative given by

(24) \mu_0 \sim N(0, 1000); \quad \phi_1 \sim N(0, 10); \quad \rho_N \sim N(1, 10), \; \rho_T \sim G(1, 1); \quad \nu \sim G(3, 1)

where ρ ≈ 1 scales the variance λ_t in (12) estimated from the CARR model, ρ_N and ρ_T are the ρ for the Normal and Student-t return distributions respectively, and the degrees of freedom ν has prior mean 3 and variance 3. For the CARR models, more informative priors are assigned to speed up convergence and provide more stable models. The intercept and persistence parameters β_{0:2} > 0 adopt Gamma priors to ensure positivity, while the leverage-effect parameters β_{3:4} take Normal priors as they can be negative. The general choices of priors for the regression and distribution parameters of the CARR models in Tables 4 and 5 are reported in Table 2.

Table 2:

Priors for parameters in the CARR models reported in Table 4 and 5.

Parameter C(1, 1) C(1, 1) C(1, 1) C(2, 1) C(1, 2) LC(1, 1) LC(1, 2)
Wei GG GB2 GB2 GB2 GB2 GB2
β 0 G(1, 1,000) G(1, 1,000) G(1, 1,000) G(1, 1,000) G(1, 1,000) G(1, 1,000) G(1, 1,000)
β1 or β11 G(1, 2) G(1, 2) G(1, 2) G(1, 3) G(1, 3) G(1, 3) G(1, 3)
β2 or β21 G(1, 2) G(1, 2) G(1, 2) G(1, 3) G(1, 3) G(1, 3) G(1, 3)
β 12 G(1, 3)
β 22 G(1, 3) G(1, 3)
β 3 N(1, 10) N(1, 10)
β 4 N(1, 10) N(1, 10)
k or a G(50, 50) G(0.5, 1) G(4, 1) G(4, 1) G(4, 1) G(4, 1) G(4, 1)
α or p G(2, 1) G(1, 1) G(1, 1) G(1, 1) G(1, 1) G(1, 1)
q G(0.5, 1) G(0.5, 1) G(0.5, 1) G(0.5, 1) G(0.5, 1)

3.4.3 MCMC Convergence

The parameters of each model are estimated using the Bayesian MCMC method through the RStan package (Bürkner 2017), which uses the No-U-Turn Sampler (NUTS) proposed by Hoffman and Gelman (2014). NUTS is an extension of the Hamiltonian Monte Carlo (HMC) algorithm (Duane et al. 1987; Neal 1994), performing at least as efficiently while requiring less human intervention. For models run in RStan, there are two convergence diagnostics for each parameter: R̂ and neff. R-hat (R̂), also called the potential scale reduction factor in RStan, checks the consistency, or convergence, of multiple MCMC chains. Specifically, it combines information on the variation within and between chains, assessing whether each chain has converged to the stationary target posterior distribution, while neff is the effective size of the posterior sample. A small neff indicates high autocorrelation within chains, which means the posterior was explored slowly and inefficiently.

Each model is run with 5,000 iterations and four chains, where half of the iterations are used for warm-up and the other half for sampling. This means that there are K = 10,000 posterior samples for each parameter. We check history plots, R̂ and neff carefully to ensure that all parameters meet the convergence conditions. When there is non-convergence or slow convergence for individual parameters, indicating an unstable model, the priors in (24) and Table 2 are re-adjusted to lie around plausible regions by trial searches, ensuring a high enough neff (usually between 4,000 and 8,000) and R̂ within 1 ± 0.001 for each parameter. These priors narrow the parameter space, helping the MCMC to converge more quickly and stably by avoiding implausible regions. Figure 13 in Appendix B presents the model diagnostics for the first model in Table 4. Figure 13a and b compare the prior and posterior distributions, demonstrating the asymptotic normality of θ|y, whereas Figure 13c shows good convergence and consistency across the four chains for all parameters. Lastly, Figure 13d depicts the posterior predictive distribution of the first volatility forecast V_{T+1}, as discussed in Section 4.5.

3.4.4 Risk Measure Forecast

Bayesian inference of risk forecasts, including VoaR, CVoaR, VaR, and CVaR, can be obtained from the quantiles of the posterior predictive distribution from MCMC with no extra estimation. Suppose a one-step-ahead forecast is performed for the last 100 days of a pandemic period with sample size T*. One hundred moving windows W_s = (V, R)_{(s+1):(s+T)}, s = 0, …, 99, each of length T = T* − 100 days, are used to train each model and forecast the next day’s (day s + T + 1) volatility and return. The procedures for estimating VoaR, CVoaR, VaR, and CVaR are:

  • Step 1: perform posterior sampling of \hat{V}_{s+T+1}^{(k)} and \hat{R}_{s+T+1}^{(k)} from the respective data distributions (say (25) and (30));

  • Step 2: locate the α_j-level quantile, that is, the order statistic \text{VoaR}_{s+T+1}(\alpha_j) = \hat{V}_{s+T+1}^{(\lceil K\alpha_j \rceil)} in (26) from the posterior predictive distribution of \hat{V}_{s+T+1}^{(1:K)} sorted in ascending order, and similarly \text{VaR}_{s+T+1}(\alpha_j) = \hat{R}_{s+T+1}^{(\lceil K\alpha_j \rceil)} in (31) from the posterior predictive distribution of \hat{R}_{s+T+1}^{(1:K)} sorted in ascending order; and

  • Step 3: evaluate averages of tail values from VoaR and VaR as estimates of CVoaR and CVaR (in (26) and (31)).

Note that these predictive forecasts, say VoaR, are based on quantiles of the predictive distributions, say $\hat{V}_{s+T+1}^{(1:K)}$, and differ from the quantile of a fitted parametric distribution, say GB2 with $VoaR_{s+T+1}(\alpha_j) = F_{\mathrm{GB2}}^{-1}(\alpha_j; \hat{\lambda}_{s+T+1}, \hat{a}, \hat{p}, \hat{q})$. The latter estimates exclude model uncertainty because the parameter estimates are aggregated over the MCMC iterations before the quantile is evaluated.
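Steps 2 and 3 above can be sketched directly on a sorted posterior predictive sample. This is a minimal illustration with a toy sample (my own sketch; `tail_risk` is a hypothetical helper name):

```python
import numpy as np

def tail_risk(draws, alpha):
    """VoaR (or VaR) as the alpha-level order statistic of the posterior
    predictive sample, and CVoaR (or CVaR) as the mean of the tail beyond it."""
    x = np.sort(np.asarray(draws))          # Step 2: arrange draws in ascending order
    j = int(np.ceil(alpha * len(x))) - 1    # index of the alpha-quantile draw
    return x[j], x[j:].mean()               # Step 3: average of the tail values

draws = np.arange(1, 101) / 100.0           # toy predictive sample 0.01, ..., 1.00
voar, cvoar = tail_risk(draws, 0.9)         # 0.9-level VoaR and CVoaR
```

With this toy sample the 0.9-level VoaR is 0.90 and the CVoaR is the mean of the tail values 0.90, 0.91, …, 1.00, namely 0.95.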

4 Application to Stage One Volatility Model

4.1 Volatility Data

The data used for the analysis comes from the cryptocurrency exchange Binance and is extracted using the Python package binance.client. Data is collected in the time intervals 1 day, 1 h, 5 min, and 1 min for the two cryptocurrencies BTC and ETH. All timestamps are in Coordinated Universal Time (UTC) time zone. The time period that will be considered in the analysis is 2019-01-01 to 2022-12-31.

To compare how the features of volatility and returns have changed throughout the pandemic, three time periods:

  1. Period 1: 2019-01-01 to 2019-12-31 (pre-pandemic; T* = 365)

  2. Period 2: 2020-01-01 to 2020-12-31 (early pandemic; T* = 366)

  3. Period 3: 2021-01-01 to 2022-07-01 (late pandemic; T* = 547)

are considered, where each model is run on each period. Period 1 covers the pre-pandemic period and will be used as a comparison to see whether certain features of volatility and returns have changed since the COVID-19 pandemic began. Period 2 covers the initial outbreak and gives an idea of the impact that the early stage of the pandemic had on volatility and returns. During this period, there were many lockdowns around the world as well as travel restrictions, which had major impacts on the financial markets. Finally, a late pandemic period, Period 3, which covers the development of COVID-19 including variants such as Delta and Omicron, is also analysed. There were many more cases of COVID-19 during this period; however, the world had adjusted to the pandemic: new economic and support policies were implemented and many lockdowns and travel restrictions were eased.

Table 3 reports the summary statistics for the volatility measures PK, RPK and QPK defined in (3), (5) and (6), respectively, for both BTC and ETH and for each time period. The RPK measures usually have the lowest variance in period 1 and the QPK measures in period 3, demonstrating their relative efficiencies in the pre-pandemic and late-pandemic periods respectively. Moreover, the QPK measures have lower means because they exclude outliers. Across periods, period 2 has the highest variance, skewness and kurtosis, followed by period 3. This is due to more frequent market shocks, which caused short periods of heightened measures during the pandemic, particularly the early period.

Table 3:

Summary statistics of selected volatility measures for BTC and ETH across pandemic periods.

Period  Statistic | BTC: PK, RPK (1 h), RPK (1 min), QPK (1, 99), QPK (5, 95) | ETH: PK, RPK (1 h), RPK (1 min), QPK (1, 99), QPK (5, 95)
1 Mean* 1.331 1.387 1.310 1.115 1.104 1.936 1.953 1.791 1.551 1.530
Variance* 0.006 0.004 0.003 0.005 0.005 0.010 0.006 0.004 0.007 0.008
Min* 0.024 0.078 0.119 0.013 0.010 0.078 0.161 0.244 0.038 0.021
Max 0.022 0.021 0.016 0.022 0.023 0.033 0.019 0.016 0.024 0.026
Skewness 4.52 4.73 4.56 4.60 4.74 4.73 3.60 3.57 4.10 4.49
Excess kurtosis 27.0 31.1 29.1 29.0 29.8 32.7 16.7 16.3 21.7 25.7
LB Q10 97.6 268.4 355.1 110.7 91.7 28.6 143.5 206.3 26.4 12.5
2 Mean* 1.721 1.800 1.667 1.404 1.124 2.787 2.757 2.434 2.326 2.026
Variance* 0.061 0.046 0.035 0.045 0.017 0.093 0.067 0.058 0.080 0.039
Min* 0.034 0.061 0.060 0.017 0.012 0.067 0.124 0.136 0.028 0.022
Max 0.126 0.104 0.089 0.114 0.068 0.156 0.123 0.109 0.155 0.106
Skewness 13.43 12.23 11.93 14.18 12.89 12.90 11.65 11.62 14.38 13.43
Excess kurtosis 196.3 164.6 158.7 220.9 197.1 188.8 152.5 147.0 231.8 215.3
LB Q10 90.2 144.0 174.6 73.4 78.6 83.5 122.0 130.8 60.2 53.8
3 Mean* 2.059 2.233 2.105 1.709 1.630 3.429 3.572 3.226 2.925 2.780
Variance* 0.012 0.012 0.013 0.007 0.006 0.053 0.049 0.041 0.033 0.024
Min* 0.091 0.181 0.137 0.059 0.029 0.126 0.264 0.210 0.081 0.030
Max 0.050 0.056 0.059 0.030 0.021 0.130 0.127 0.117 0.092 0.058
Skewness 6.77 8.22 8.93 4.70 3.65 10.95 11.28 11.21 8.82 5.64
Excess kurtosis 73.9 104.9 121.1 33.4 17.2 171.5 181.3 178.2 114.5 43.8
LB Q10 181.3 302.4 336.9 224.5 184.8 227.7 317.9 341.2 294.6 344.9
  1. *Re-scaled by multiplying by 10³; LB Q10: Ljung-Box test statistic for lag 10. The test statistic in boldface is not significant at the 5 % level.

The LB Q10 statistic tests whether the autocorrelations up to lag 10 are jointly different from 0; a higher test statistic indicates that the autocorrelations are further from 0. All test statistics are significant at the 5 % level except that of the QPK(5, 95) measure in Period 1 for ETH, which is highlighted in boldface. Hence, it is important to model the short- and long-term memory in volatility as it is clearly present. The RPK, 1 m (1 min) measure always has the highest LB Q10 statistic, followed by the RPK, 1 h (1 h) measure, which could mean that realised volatility measures have stronger short-term memory; this is explored later in the analysis.
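The LB Q10 statistic can be computed directly from the sample autocorrelations. Below is a minimal sketch of the Ljung-Box statistic (my own illustration on simulated series, not the paper's data; `ljung_box_q` is a hypothetical helper name):

```python
import numpy as np

CHI2_10_CRIT = 18.307  # 5 % critical value of the chi-squared distribution, 10 df

def ljung_box_q(x, lags=10):
    """Ljung-Box Q statistic for autocorrelations up to `lags`."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        rk = np.sum(xc[k:] * xc[:-k]) / denom   # lag-k sample autocorrelation
        q += rk ** 2 / (n - k)
    return n * (n + 2) * q

# one year of simulated daily observations, as in period 1
rng = np.random.default_rng(7)
noise = rng.normal(size=365)          # memoryless series
ar = np.empty(365)                    # persistent AR(1) series, phi = 0.9
ar[0] = 0.0
for t in range(1, 365):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
```

A persistent series such as the AR(1) produces a Q statistic far above the 18.307 critical value, mirroring the strongly significant LB Q10 values for the volatility measures in Table 3.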

Figure 1 visualises the data characteristics. The time series plot (column 1) for period 2 shows a period of extremely high volatility in early 2020, particularly on the 12th and 13th of March. This was just after the World Health Organisation (WHO) declared COVID-19 a pandemic on the 11th of March, which highlights the major impact it had on the cryptocurrency market. The x-axis of the density plots (column 2) is capped at 0.02 to exclude these extreme outliers, as it would otherwise be hard to see the shape of the distribution. The robustness of the QPK measure to outliers is also visible: in period 2, the maximum of the QPK(5, 95) measure is 0.068 (see Table 3), which is less than the maximum of 0.126 for the PK measure, because the outliers are excluded. From the ACF plots (column 3) in Figure 1, there appears to be more autocorrelation for the RPK measures, as seen from the higher spikes in the ACF plots and the higher Ljung-Box test statistics in Table 3. Moreover, the ACF plots do not display periodic persistence. Summary plots for ETH can be found in Appendix A; the behaviour of ETH is relatively similar to that of BTC.

Figure 1: BTC time series, density, and ACF plots for selected volatility measures.

4.2 Comparison of Pandemic Periods, Error Distributions and Volatility Measures

Different error distributions, Wei, GG, and GB2, are compared using the CARR(1, 1) model in (7) and (8) with order p = q = 1, where the PK, RPK and QPK volatility measures are used. For each period, they are compared using both BTC and ETH. DIC is used to evaluate the model fit and determine the best choice of error distribution, where a lower DIC is preferred. Table 4 reports the regression parameter estimates for each combination of volatility measure, error distribution, period and coin. We do not report the shape parameters (a, p, q) as they do not provide a direct interpretation of distribution properties such as moments; instead we focus on evaluating persistence effects. In terms of computational speed, the models took roughly 0.5, 1.5 and 5 min to run using the Wei, GG and GB2 distributions respectively. These times are about 50 % longer when running the models on period 3, reflecting model complexity and data size. The log density is manually specified for the GG and GB2 distributions as no inbuilt functions are available in RStan.
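The manually specified GB2 log density can be illustrated as follows. This is my own Python sketch of a common GB2 parameterisation (shapes a, p, q and scale b); the paper's exact parameterisation, which ties the scale to the conditional mean λ, may differ:

```python
import math

def gb2_lpdf(x, a, b, p, q):
    """Log-density of the Generalised Beta type 2 (GB2) distribution:
    f(x) = a * x^(a*p - 1) / (b^(a*p) * B(p, q) * (1 + (x/b)^a)^(p + q))."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)  # log B(p, q)
    return (math.log(a) + (a * p - 1.0) * math.log(x) - a * p * math.log(b)
            - log_beta - (p + q) * math.log1p((x / b) ** a))
```

In Stan, the analogous expression would sit in a user-defined `_lpdf` function in the `functions` block; a sanity check is that exponentiating and integrating the log density over a fine grid recovers 1.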

Table 4:

Parameter estimates and DIC for the CARR(1, 1) model fitted to different volatility measures and error distributions (Weibull, Generalised Gamma, Generalised Beta type 2) across periods for BTC and ETH.

Period  Dist.  Par. | BTC: PK, RPK (1 h), RPK (1 min), QPK (1, 99), QPK (5, 95) | ETH: PK, RPK (1 h), RPK (1 min), QPK (1, 99), QPK (5, 95)
1 Wei β 0 * 0.354 0.332 0.228 0.278 0.264 0.874 0.590 0.496 0.681 0.729
β 1 0.419 0.657 0.822 0.419 0.336 0.308 0.536 0.674 0.276 0.238
β 2 0.334 0.145 0.052 0.346 0.423 0.247 0.195 0.083 0.284 0.276
DIC −4,233 −4,232 −4,350 −4,425 −4,466 −3,876 −3,921 −4,041 −4,058 −4,092
GG β 0 * 0.182 0.190 0.186 0.140 0.132 0.528 0.400 0.391 0.404 0.462
β 1 0.348 0.567 0.730 0.352 0.293 0.293 0.491 0.642 0.264 0.222
β 2 0.516 0.307 0.144 0.529 0.592 0.416 0.307 0.149 0.455 0.445
DIC −4,317 −4,360 −4,485 −4,499 −4,533 −3,969 −4,043 −4,181 −4,142 −4,173
GB2 β 0 * 0.090 0.075 0.075 0.059 0.060 0.286 0.220 0.207 0.284 0.385
β 1 0.265 0.377 0.417 0.242 0.214 0.298 0.424 0.484 0.306 0.284
β 2 0.707 0.595 0.541 0.737 0.767 0.648 0.513 0.433 0.631 0.641
DIC −4,354 −4,445 −4,588 −4,529 −4,558 −4,023 −4,127 −4,275 −4,186 −4,214
2 Wei β 0 * 0.135 0.142 0.098 0.120 0.108 0.311 0.353 0.186 0.272 0.282
β 1 0.306 0.652 0.652 0.262 0.184 0.364 0.828 0.699 0.306 0.232
β 2 0.640 0.321 0.328 0.665 0.719 0.554 0.126 0.276 0.594 0.633
DIC −4,193 −4,200 −4,277 −4,369 −4,463 −3,748 −3,827 −3,934 −3,889 −3,950
GG β 0 * 0.082 0.119 0.088 0.066 0.057 0.295 0.334 0.218 0.249 0.267
β 1 0.232 0.584 0.613 0.208 0.171 0.290 0.657 0.690 0.258 0.227
β 2 0.707 0.368 0.353 0.732 0.770 0.586 0.241 0.246 0.616 0.626
DIC −4,335 −4,370 −4,464 −4,492 −4,559 −3,857 −3,963 −4,114 −3,986 −4,026
GB2 β 0 * 0.067 0.063 0.053 0.056 0.054 0.294 0.315 0.242 0.254 0.307
β 1 0.229 0.335 0.391 0.220 0.196 0.321 0.538 0.600 0.314 0.305
β 2 0.751 0.641 0.586 0.761 0.783 0.625 0.381 0.322 0.632 0.625
DIC −4,393 −4,452 −4,553 −4,537 −4,588 −3,896 −4,025 −4,199 −4,013 −4,042
3 Wei β 0 * 0.368 0.355 0.337 0.280 0.272 0.411 0.384 0.328 0.348 0.370
β 1 0.294 0.574 0.654 0.255 0.206 0.339 0.588 0.666 0.341 0.299
β 2 0.537 0.302 0.226 0.586 0.627 0.557 0.338 0.273 0.555 0.573
DIC −5,783 −5,833 −5,939 −5,984 −6,021 −5,308 −5,412 −5,565 −5,483 −5,508
GG β 0 * 0.185 0.297 0.265 0.139 0.127 0.322 0.365 0.307 0.268 0.252
β 1 0.194 0.470 0.555 0.182 0.156 0.290 0.564 0.646 0.277 0.248
β 2 0.704 0.389 0.318 0.730 0.762 0.606 0.335 0.267 0.625 0.660
DIC −5,928 −6,054 −6,170 −6,090 −6,107 −5,468 −5,660 −5,831 −5,598 −5,596
GB2 β 0 * 0.123 0.197 0.193 0.111 0.109 0.238 0.295 0.281 0.210 0.202
β 1 0.182 0.325 0.395 0.188 0.168 0.269 0.425 0.530 0.263 0.239
β 2 0.788 0.596 0.525 0.778 0.799 0.687 0.499 0.394 0.688 0.715
DIC −5,986 −6,178 −6,298 −6,116 −6,122 −5,515 −5,775 −5,959 −5,620 −5,607
  1. *Re-scaled by multiplying by 10³; lowest DIC for each period is in bold.

Comparing error distributions, the models using GB2 always have the lowest DIC, followed by the GG distribution, as expected. Comparing persistence parameter estimates across more generalised distributions, the short-term memory β1 drops, the persistence β2 increases and their sum (β1 + β2, or total persistence) also increases. For example, β2 for the RPK, 1 m measure using BTC in period 1 rises from 0.052 for the Wei to 0.541 for the GB2 distribution. The persistence and total persistence averaged across coins, periods and models for the three distributions in Table 4 are (0.31, 0.48, 0.63) and (0.76, 0.87, 0.95), showing very clear increases. This is because a more generalised distribution captures outliers better with a heavier tail, so the persistence becomes clearer. Across periods, the persistence and total persistence averaged across models for BTC, (0.45, 0.52, 0.58) and (0.88, 0.87, 0.90), and ETH, (0.38, 0.40, 0.52) and (0.79, 0.84, 0.91), also generally increase, revealing increasing persistence across the pandemic. The reverse trend is displayed for the short-term memory β1.

For the volatility measures fitted to the CARR models, four sampling frequencies, 1 h, 15 min, 5 min, and 1 min, of the RPK measures were originally considered, but only the results for 1 h and 1 min are reported in Table 4 because the trends are roughly monotone in sampling frequency. As the sampling frequency increases, the information about short-term memory also increases. For example, the RPK, 1 m measure uses 1,440 intervals during the day compared with 24 intervals for the RPK, 1 h measure, so RPK, 1 m captures more information about the volatility, allowing better prediction of the short-term memory in volatility. This can also be seen from the ACF plots in Figure 1, where the RPK, 1 m measure has higher autocorrelation in all periods. Andersen and Bollerslev (1998) noted that as the number of intervals in a realised volatility measure goes to infinity, it approaches the true volatility. Hence, as the sampling frequency increases, the short-term memory parameter β1 becomes higher, which explains why β1 for RPK, 1 m is higher relative to RPK, 1 h.
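The aggregation of intraday ranges into a realised measure can be sketched as follows. This is my own illustration of a realised Parkinson-type measure built from intraday high/low pairs; the paper's exact RPK scaling may differ:

```python
import math

def realised_parkinson(highs, lows):
    """Realised Parkinson (RPK) volatility sketch: sum the Parkinson variance
    estimator ln(H/L)^2 / (4 ln 2) over the intraday intervals of one day,
    then take the square root to obtain a daily volatility measure."""
    rv = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows)) / (4.0 * math.log(2.0))
    return math.sqrt(rv)

# e.g. 24 hourly high/low pairs give RPK, 1 h; 1,440 one-minute pairs give RPK, 1 min
one_interval = realised_parkinson([math.exp(0.02)], [1.0])
```

Each extra interval contributes its own squared log-range, which is how the finer 1 min sampling accumulates more information about intraday movement than the 1 h sampling.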

Similarly, the symmetric quantile ranges (1, 99), (2, 98), (3, 97), (4, 96) and (5, 95) are applied to the CARR models, and only the results for (1, 99) and (5, 95) are reported. Symmetric ranges are more efficient, as suggested by Tan et al. (2019). Between (1, 99) and (5, 95), the persistence (short-term memory) when using (5, 95) appears to be slightly higher (lower) as more outliers are dropped. Comparing across the three types of volatility measures PK, RPK and QPK, the persistence and total persistence averaged over periods, distributions and models (1 h, 1 min for RPK; (1, 99), (5, 95) for QPK) are (0.60, 0.35, 0.64) and (0.87, 0.89, 0.87) for BTC and (0.51, 0.29, 0.53) and (0.82, 0.89, 0.81) for ETH in Table 4, showing a clear drop in persistence for the RPK measures for both coins, as expected, and a clear increase in total persistence for the RPK measures for ETH. Table 14 in Appendix C reports the 95 % credible intervals for all parameters, providing an indication of the precision of the estimates.

Figure 2 visualises the impact of the RPK sampling interval length and the error distribution on short-term memory and persistence, using the data from period 1. As can be seen on the left-hand side of the graph, β1 (dotted lines) is highest when the interval size is 1 min and declines as the interval size increases. Simultaneously, the persistence term β2 increases, which is expected as there is a trade-off between long-term persistence and short-term memory. However, beyond an interval size of about 120 min (2 h) there is little difference in short-term memory, so in practice a sampling interval shorter than 2 h is needed to capture the extra information in a realised volatility measure. Also, when using a more generalised distribution, β1 is always lower and β2 higher, as explained before.

Figure 2: RPK interval length and distribution impact on short-term memory and persistence for BTC in period 1.

Figure 3a and b compare the observed and theoretical error densities from the CARR(1, 1) model for each error distribution and period, using the RPK, 1 m volatility for BTC and ETH respectively. It is clear that the GB2 distribution (column 3) captures the observed error density better. The QQ plots for each model using the different error distributions for BTC and ETH across periods can be seen in Figure 4. While the observed quantiles from 0 to 2 are relatively close to the theoretical quantiles in each graph, the right tails are usually far from the theoretical quantiles when using the Wei and GG error distributions. Models using the GB2 distribution capture these heavy tails better, though not perfectly, which highlights the difficulty of modelling the tails of volatility. There is one extreme observation for BTC in period 2, visible in the middle row, that none of the distributions could capture. This observation was on March 12th, just after COVID-19 was declared a pandemic on March 11th. Similar observations are seen for ETH.

Figure 3: Theoretical versus actual density for the CARR(1, 1) model errors using RPK, 1 m volatility measure.

Figure 4: QQ plots for the CARR(1, 1) model error using RPK, 1 m volatility measure.

In summary, the results highlight the importance of exploring different error distributions when modelling volatility: a more generalised distribution such as GB2 provides better model fit in terms of DIC in Table 4, the density plots in Figure 3 and the QQ plots in Figure 4. This is because its flexible tail can downweight the effect of outliers on the model fit, making it better able to reveal the true, lower short-term memory and to capture the stronger persistence. Conversely, a potential reason for the higher short-term memory under the Weibull is that the error terms take high values with lower probability due to the thinner tails of that distribution.

Moving forward, only the GB2 error distribution will be used for the CARR models as GB2 distribution models the error term the best allowing for more accurate parameter estimates. Also, the RPK, 1 m measure will be used for the two-stage model in Section 5 as this measure captures more information about the true volatility.

4.3 Comparison of Mean Functions and Time Periods

Using the GB2 error distribution and the RPK, 1 m measure, CARR models with different mean functions are compared using different orders as well as the leverage and bilinear effects. The models compared are the CARR(p, q), LCARR(p, q, a) and BCARR(p, q, s) as defined in (8), (10) and (11), respectively. We also consider the leverage effect choices a, b and c; the choice a, in which w = 1, often gives a better DIC. Hence, only the LCARR(p, q, a) model is reported in Table 5 and considered in subsequent analyses. Moreover, the BCARR(1, 1, 1) model gives very similar results to the CARR(1, 1) model, so it is reported in Table 16 in Appendix D together with the LCARR(1, 1, b) and LCARR(1, 1, c) models for both coins. In Table 5, we drop the parameter a from the model names without ambiguity and report parameter estimates as well as DICs. Again, the 95 % credible intervals, showing the precision of all parameter estimates, are reported in Table 15.

Table 5:

Parameter estimates and DIC for CARR models across periods, model orders and model types using the RPK, 1 m measure and the GB2 error distribution for BTC and ETH.

Period  Par. | BTC: C(1, 1), C(2, 1), C(1, 2), LC(1, 1), LC(1, 2) | ETH: C(1, 1), C(2, 1), C(1, 2), LC(1, 1), LC(1, 2)
1 β 0 * 0.075 0.081 0.080 0.040 0.044 0.207 0.231 0.201 0.105 0.097
β 11 0.417 0.414 0.475 0.310 0.361 0.484 0.484 0.513 0.347 0.373
β 12 0.026 0.040
β 21 0.541 0.516 0.233 0.582 0.297 0.433 0.383 0.210 0.502 0.282
β 22 0.242 0.223 0.191 0.186
β 3 0.445 0.483 0.688 0.757
β 4 0.569 0.599 0.853 0.875
DIC −4,588 −4,586 −4,595 −4,595 −4,602 −4,275 −4,273 −4,281 −4,286 −4,291
2 β 0 * 0.053 0.057 0.056 0.006 0.008 0.242 0.266 0.225 0.023 0.037
β 11 0.391 0.393 0.464 0.245 0.310 0.600 0.613 0.607 0.197 0.281
β 12 0.020 0.041
β 21 0.586 0.564 0.205 0.669 0.352 0.322 0.264 0.078 0.646 0.317
β 22 0.304 0.241 0.236 0.227
β 3 0.191 0.260 0.427 0.540
β 4 0.777 0.818 1.208 1.267
DIC −4,553 −4,550 −4,566 −4,588 −4,593 −4,199 −4,196 −4,213 −4,243 −4,245
3 β 0 * 0.192 0.207 0.195 0.151 0.148 0.282 0.308 0.259 0.209 0.179
β 11 0.394 0.398 0.433 0.363 0.397 0.531 0.544 0.539 0.485 0.489
β 12 0.019 0.024
β 21 0.525 0.498 0.273 0.536 0.278 0.393 0.349 0.174 0.403 0.189
β 22 0.211 0.220 0.213 0.214
β 3 0.409 0.481 0.586 0.611
β 4 0.132 0.139 0.318 0.324
DIC −6,298 −6,295 −6,307 −6,299 −6,309 −5,959 −5,956 −5,975 −5,963 −5,979
  1. *Re-scaled by multiplying by 10³; ⋆ re-scaled by multiplying by 10²; lowest DIC for each period is in bold.

As shown in Tables 5 and 16, the LCARR(1, 2, a) model performs best, having the lowest DIC for each period. This model has second-order persistence and the leverage effect. The LCARR(1, 2, a) model improves the DIC relative to CARR(1, 1) from −4,588 to −4,602. Clearly, this improvement is smaller than the improvement from −4,350 to −4,588 when the error distribution changes from Wei to GB2 in the CARR(1, 1) model.

Comparing between models, the CARR(2, 1) models have higher DIC than CARR(1, 2) (as β12 is often tiny), showing that the higher-order terms should be applied to long-term persistence, as expected. On the other hand, the sum β21 + β22 in CARR(1, 2) is close to β2 in CARR(2, 1) or CARR(1, 1), showing no obvious overall increase in persistence for higher-order models. Comparing CARR(1, 2) with LCARR(1, 2, a), the latter, with the leverage effect, displays lower short-term memory and stronger persistence, hence demonstrating better model fit. Another observation concerns parameters β3 and β4 of the LCARR(1, 2, a) model, which are always positive, indicating that a larger previous-day return in either direction results in higher volatility.

Comparing effects across periods, β4 (0.00818) is much higher than β3 (0.00260) in period 2, the early pandemic, meaning that a positive return had a higher impact on volatility than a negative return of the same magnitude. This trend differs in period 3, where β3 is higher than β4, meaning that negative returns had a higher impact on volatility, as expected. This indicates that the stress of the early pandemic reversed the usual leverage effect in the market. Moreover, there is lower short-term memory and higher persistence during the early pandemic period (period 2). For example, for the LCARR(1, 2, a) model, β1 is 0.310 in period 2 in comparison to 0.361 and 0.397 in the other two periods. This lower short-term memory could be due to more uncertainty in the market, with sudden shocks from the two days of extremely high volatility in period 2 affecting the parameter estimates, as discussed in Section 4.4.

Similar observations are seen for ETH: for example, the LCARR(1, 2, a) is the best model in each period and, likewise, the usual leverage effect is reversed in period 2. Similar results are also observed when fitting the RPK, 5 m measure to the CARR models of Table 5; again, the best model is LCARR(1, 2, a) for all periods and both coins according to DIC. As the trends are similar, this table is omitted.

4.4 Model Robustness – Removing Outliers in Period 2

In period 2, there are two days of extremely high volatility and large negative and positive returns, on March 12th and 13th, 2020. The RPK, 1 m measure on these two days is 0.06 and 0.09, approximately 10 and 15 standard deviations above the mean volatility. The returns on these two days also have high magnitudes, being 12 and 3.5 standard deviations from the mean return on the 12th and 13th of March, respectively. This was just after the WHO declared COVID-19 a pandemic. Outliers can have an impact on the parameter estimates, so it is important to check the sensitivity of the estimates when the outliers are removed. To remove the outliers, the volatility on the 12th and 13th of March was set equal to the volatility on the 14th of March, and the returns on the 12th and 13th were set to 0.
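The outlier treatment just described can be sketched as a simple data transformation (my own illustration with toy values; `neutralise_outliers` is a hypothetical helper name):

```python
def neutralise_outliers(vol, ret, outlier_days, anchor_day):
    """Sketch of the outlier treatment described above: overwrite the
    volatility on the outlier days with that of a nearby calm day, and
    set the returns on those days to zero."""
    vol, ret = list(vol), list(ret)
    for d in outlier_days:
        vol[d] = vol[anchor_day]   # borrow the 14th of March's volatility
        ret[d] = 0.0               # remove the extreme returns
    return vol, ret

# toy series: days 1 and 2 are the March 12th/13th outliers, day 3 is March 14th
vol, ret = neutralise_outliers([0.010, 0.060, 0.090, 0.012],
                               [0.001, -0.300, 0.090, 0.002],
                               outlier_days=[1, 2], anchor_day=3)
```

After the transformation, the two extreme volatility values are replaced by the calm-day value and the corresponding returns are zeroed, exactly mirroring the robustness check above.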

Table 6 shows the effect of removing the outliers for the LCARR(1, 2, a) model in period 2 for both BTC and ETH. It is clear that the short-term memory β1 is now higher for both coins; with the outliers present, β1 is lower because the outliers mask the autocorrelation pattern (short-term memory) in the data. After removing the outliers, there is clearly an increase in short-term memory and a drop in persistence for period 2. Between periods, the β1 (β21 + β22) value of 0.393 (0.533) for BTC after removing the outliers is closer to the β1 (β21 + β22) values of 0.361 (0.520) and 0.397 (0.498) for the other two periods using the LCARR(1, 2, a) model (see Table 5). Hence, it appears that COVID-19 produced a mild gradual increase in short-term memory and no obvious change in persistence once the shocks to the market on the 12th and 13th of March, 2020 are discounted. In terms of the leverage effect, both β3 and β4 become smaller, but not by a large amount. Parameter β4 also remains larger than β3, meaning that a positive return has a larger impact on volatility than a negative return of the same magnitude. Lastly, the DICs for both coins improve after removing the outliers.

Table 6:

Effect of removing outliers for LCARR(1, 2, a ) model fitted to RPK, 1 m measure in period 2 for BTC and ETH.

Param. BTC ETH
With outliers Outliers removed With outliers Outliers removed
β 0 * 0.0084 0.0114 0.0367 0.0487
β 1 0.311 0.393 0.282 0.401
β 21 0.350 0.320 0.315 0.277
β 22 0.243 0.213 0.229 0.190
β 3 0.0026 0.0022 0.0053 0.0037
β 4 0.0081 0.0071 0.0126 0.0105
DIC −4,593 −4,607 −4,245 −4,280
  1. *Re-scaled by multiplying by 10³.

In summary, these modelling results show that capturing different effects in volatility, such as the leverage effect and second-order persistence, does improve the in-sample model fit, as indicated by the lower DICs in Table 5. However, the choice of error distribution has the biggest impact on the DIC value. Hence, it is more important to ensure that an accurate error distribution is used when modelling volatility, as capturing the fatter tail of the distribution is more effective than fitting a more complicated conditional volatility mean function.

Comparing the COVID-19 impact on volatility across periods, there did appear to be lower short-term memory in period 2, given the lower β1 estimates in Table 5. However, once the two outliers on the 12th and 13th of March, 2020 were replaced with the much lower volatility of the 14th of March, β1 became higher, as seen in Table 6, and closer to the β1 values of the other two periods. Hence, it appears that COVID-19 did not have a major impact on short-term memory and persistence after discounting the large shocks. The outliers in period 2 may violate a thinner-tail assumption such that their effect contaminates the short-term memory and persistence effects in the mean function. This could be managed either by adopting more general distributions with flexible tails or by extending the mean function to capture a possible jump.

Comparing the leverage effect across periods, it did appear to change. During the early pandemic period (period 2), β4 (0.00818) is the highest and larger than β3, indicating that a positive return had a larger impact on the next day's volatility. But for the late pandemic (period 3), we observe the usual phenomenon that β3 is much larger than β4, indicating higher impacts on volatility from negative returns; the market is thus starting to behave more like the usual financial asset markets.

While these analyses compare the features of volatility across the three periods, it is important to note that differences between periods may not be due to the COVID-19 pandemic alone. For example, the changing popularity of investing in cryptocurrencies as well as regulation may also be strong drivers of the differences between periods.

4.5 Out-Of-Sample Volatility Forecast Performance

So far, the models have only been compared on their in-sample performance. Forecast performance comparison is important to evaluate a model's ability to generalise to unseen data. Section 3.4.4 details the one-step-ahead risk measure forecasts based on the 100 moving windows $W_s = (V, R)_{(s+1):(s+T)}$, $s = 0, \ldots, 99$, with window size $T = T^* - 100$ days for model training. Taking period 1 with T* = 365 days as an example, the window size is 265 and windows (1, 265), (2, 266), …, (100, 364) are used to train 100 models and forecast days 266, 267, …, 365 (2019-09-23 to 2019-12-31) respectively. For periods 2 and 3 with sample sizes T* = 366 and 547, the forecast periods are 2020-09-23 to 2020-12-31 and 2022-03-24 to 2022-07-01, with window sizes (days for model training) of 266 and 447 respectively. These forecast periods can be seen in Figure 5.
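The moving-window indexing above can be sketched directly (my own illustration; `rolling_windows` is a hypothetical helper name):

```python
def rolling_windows(T_star, n_forecasts=100):
    """1-based inclusive training-window ranges W_s = (s+1):(s+T) with
    T = T* - 100, paired with the day s + T + 1 each window forecasts."""
    T = T_star - n_forecasts
    return [((s + 1, s + T), s + T + 1) for s in range(n_forecasts)]

windows = rolling_windows(365)  # period 1: T* = 365, window size T = 265
```

For period 1 this reproduces the scheme described above: the first window (1, 265) forecasts day 266 and the last window (100, 364) forecasts day 365.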

Figure 5: One-day-ahead forecasts of VoaR and CVoaR using the LCARR(1, 2, a) model for the RPK, 1 m measure across periods for BTC and ETH.

With Bayesian inference, posterior predictive distributions $\hat{V}_{s+T+1}^{(1:K)}$ (size K = 10,000) for the volatility forecasts after burn-in (see Section 3.4.4) can be simulated from the model trained on data window $W_s$, $s = 0, \ldots, 99$. Taking the best LCARR(1, 2, a) model with the GB2 distribution as an example, the one-step-ahead forecast of $V_{s+T+1}$ from window $W_s$ at MCMC iteration $k \in 1{:}K$ is given by

(25) $\hat{V}_{s+T+1}^{(k)} \sim \operatorname{Wei}\!\left(a^{(s,k)},\; \lambda_{s+T+1}^{(k)} \Big/ \Gamma\!\left(1 + \tfrac{1}{a^{(s,k)}}\right)\right)$

(see Table 1) where $\lambda_{s+T+1}^{(k)}$ is given by (10) using the data window $W_s$, $p = 1$, $q = 2$, and parameters $\beta_{b,i} = \beta_{b,i}^{(s,k)}$, $b = 0, 1, 2$, at iteration k. In this way, we obtain the posterior predictive sample $\{\hat{V}_{s+T+1}^{(k)} : k = 1, \ldots, K\}$ for the forecast day $s + T + 1$, $s = 0, \ldots, 99$, that is, the last 100 days of each pandemic period. For the VoaR and CVoaR forecasts, only the upper quantile levels $\alpha_2$ are of interest, and they can be estimated from the quantiles and conditional tail mean of the posterior sample, that is,

(26) $VoaR_{s+T+1}(\alpha_2) = \hat{V}_{s+T+1}^{(K\alpha_2)} \quad \text{and} \quad CVoaR_{s+T+1}(\alpha_2) = \frac{1}{K(1-\alpha_2)} \sum_{k \geq K\alpha_2} \hat{V}_{s+T+1}^{(k)}$

where $\hat{V}_{s+T+1}^{(K\alpha_2)}$ denotes the $(K\alpha_2)$-th posterior volatility forecast sample when $\hat{V}_{s+T+1}^{(1:K)}$ defined in (25) is arranged in ascending order. As a demonstration, Figure 13d presents the posterior predictive distribution of $\hat{V}_{T+1}^{(1:K)}$ ($s = 0$) in (25) for the first one-step-ahead forecast, fitting $V_{1:T}$ for BTC to the first model of Table 4, that is, CARR(1, 1) with the Weibull distribution. The window size is T = 265 in period 1 and the posterior sample size is K = 10,000. Hence, the estimate $\hat{V}_{266}$, indicated by the red dashed line, is the median $\hat{V}_{266}^{(5{,}000)}$ of the distribution $\hat{V}_{266}^{(1:10{,}000)}$ in ascending order, whereas $VoaR_{266}(0.9) = \hat{V}_{266}^{(9{,}000)}$ takes the 9,000-th value and $CVoaR_{266}(0.9)$ takes the average of the 9,000-th to 10,000-th values.
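The construction of (25) and (26) can be sketched as follows. For simplicity this illustration draws all K samples from a single Weibull with fixed shape a and conditional mean λ (both values illustrative), whereas in the paper each draw k uses its own parameters $(a^{(s,k)}, \lambda_{s+T+1}^{(k)})$ from the MCMC chain:

```python
import math
import numpy as np

rng = np.random.default_rng(42)
K = 10_000                               # posterior predictive sample size
a, lam = 2.0, 0.012                      # illustrative shape and conditional mean
scale = lam / math.gamma(1.0 + 1.0 / a)  # Weibull scale so that the mean equals lam

# posterior predictive draws of the next day's volatility, in ascending order
v = np.sort(rng.weibull(a, size=K) * scale)

alpha2 = 0.9
voar = v[int(K * alpha2) - 1]            # the (K * alpha2)-th order statistic
cvoar = v[int(K * alpha2) - 1:].mean()   # average of the 9,000-th to 10,000-th values
```

As in the text, the sample mean recovers the conditional mean λ, the 0.9-level VoaR sits in the upper tail, and CVoaR exceeds VoaR because it averages the draws beyond the quantile.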

4.5.1 Accuracy for Volatility Forecasts

The out-of-sample volatility forecast performance is compared between the CARR(1, 1) and LCARR(1, 2, a ) model using the root mean square error (RMSE) reported in Table 7. RMSE is a common forecast performance measure for forecast accuracy. In Table 7, it is clear that the LCARR(1, 2, a ) model often has a lower RMSE when including the leverage effect and second order persistence. So the LCARR(1, 2, a ) model generally has better in-sample model fit and out-of-sample forecast accuracy. However, the only exception is for BTC in period 3 as the RMSE of 0.171 using the LCARR(1, 2, a ) model is marginally higher than 0.170 when using the CARR(1, 1) model.

Table 7:

RMSE (multiplied by 102) between CARR(1, 1) and LCARR(1, 2, a ) across periods for BTC and ETH.

Period BTC ETH
CARR(1, 1) LCARR(1, 2, a ) CARR(1, 1) LCARR(1, 2, a )
1 0.173 0.165 0.194 0.192
2 0.118 0.109 0.169 0.151
3 0.170 0.171 0.326 0.322

To visualise the performance, Figure 5 plots the volatility forecast using the RPK, 1 m measure fitted to the LCARR(1, 2, a ) model and is compared to the actual RPK, 1 m measure in red. The volatility forecasts are relatively close to the obversed volatilities, adapting to periods of higher or lower volatility. Volatility clustering is also more clear from the plots as days of high volatility tend to cluster together. For example, after a day of high volatility, the prediction would become higher from the autoregressive components in the mean function.

4.5.2 Kupiec Tests for VoaR Forecasts

To test the accuracy of the VoaR forecasts, the Kupiec test in (20) is used. The violation rate (VR) in (19) is the proportion of times the volatility exceeded a VoaR level, so in general a VR close to 1 − α2 for upper quantiles indicates good VoaR forecasts.

Table 8 reports the VR and the p-value of the Kupiec test. As all the p-values are greater than 0.05, we conclude that the LCARR(1, 2, a) model using the RPK, 1 m measure provides relatively accurate estimates of VoaR across periods for both BTC and ETH. This means that the heavy tails in volatility are indeed captured by the model, even during times of market distress. Hence, the LCARR(1, 2, a) model would be useful in practice. However, since each out-of-sample test only covers 100 days, a longer forecast period could be used to test more extreme VoaR levels (α1 < 0.01 and α2 > 0.99) to assess the model performance in more extreme cases.
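As a sketch of how the p-values in Table 8 arise, the Kupiec unconditional-coverage likelihood-ratio test can be computed as follows. This is an illustrative implementation (the function name is ours); for one degree of freedom, the chi-square survival function reduces to `erfc(sqrt(LR/2))`, so no external library is needed.

```python
import math

def kupiec_pvalue(violations, n, alpha):
    """Kupiec unconditional-coverage LR test.
    violations: number of exceedances observed in n forecasts;
    alpha: nominal exceedance probability (e.g. 0.10 for the 0.9 VoaR level)."""
    x = violations
    phat = x / n

    def loglik(q):
        # Binomial log-likelihood of x exceedances under exceedance probability q
        out = (n - x) * math.log(1 - q) if x < n else 0.0
        if x > 0:
            out += x * math.log(q)
        return out

    lr = -2.0 * (loglik(alpha) - loglik(phat))
    # chi-square(1) survival function: P(X > lr) = erfc(sqrt(lr / 2))
    return math.erfc(math.sqrt(lr / 2.0))

# 9 exceedances of the 0.9 VoaR level in 100 days (VR = 0.09)
print(round(kupiec_pvalue(9, 100, 0.10), 3))   # ≈ 0.735, as in Table 8
# No exceedances of the 0.99 level in 100 days (VR = 0.00)
print(round(kupiec_pvalue(0, 100, 0.01), 3))   # ≈ 0.156
```

A large p-value means the observed violation rate is statistically consistent with the nominal level, so the VoaR forecast is not rejected.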

Table 8:

Violation rate and Kupiec test p-value using RPK, 1 m measure fitted to LCARR(1, 2, a ) model across periods for BTC and ETH.

Period Metric VoaR level
BTC ETH
0.9 0.95 0.975 0.99 0.9 0.95 0.975 0.99
1 VR 0.13 0.07 0.05 0.03 0.1 0.07 0.04 0.01
p-value 0.337 0.386 0.158 0.105 1.000 0.386 0.376 1.000
2 VR 0.09 0.04 0.03 0.01 0.09 0.05 0.01 0.00
p-value 0.735 0.635 0.756 1.000 0.735 1.000 0.275 0.156
3 VR 0.09 0.04 0.02 0.00 0.11 0.08 0.03 0.02
p-value 0.735 0.635 0.740 0.156 0.742 0.204 0.756 0.376
  1. All p-values are greater than 0.05 and so the VoaR estimates are accepted.

Figure 5 also reveals the performance of the different VoaR (quantile) levels, shaded in decreasing grey levels in the left column. For the higher VoaR levels, the actual volatilities rarely exceed these levels, as expected. The plots on the right display the CVoaR estimates, which give the expected volatility given that the corresponding VoaR level is exceeded. It is clear that the CVoaR levels sit above the VoaR levels for upper quantiles and produce more extreme forecasts, although the patterns are similar.

4.6 Extension of Lagged Predictors to Varying Interval Length to Identify Pandemic Impacts

In the current time series models, variables such as return or volatility are regressed on their lagged measures, which capture information from the whole previous day. For the LCARR(1, 2, a) model with the best in-sample and out-of-sample performance, the volatility is regressed on the previous day's volatility, and the leverage effect uses the previous day's return. This model carries an implicit assumption that the previous day's volatility is the best predictor of the next day's volatility for the autoregressive component, and that this previous day's volatility is essentially uniform within the day. Now consider two different days within a time series whose volatility measures are almost equal. On one of these days the volatility is increasing, and on the other it is decreasing. Clearly, we would expect the day with increasing volatility to be followed by higher volatility the next day on average. However, current models would treat these two days the same, as they only measure volatility on a daily basis.

This section covers some potential improvements to the CARR model; these improvements can also be applied to other models such as the GARCH model. We focus only on the LCARR(1, 2, a) model in this analysis as it is the best performing volatility model in Table 5.

4.6.1 Varying Period Length of Lagged Volatility for Short Memory

In the LCARR(1, 2, a) model, the previous day's volatility Vt−1 in the conditional mean function λ t could be replaced by a measure which only captures volatility from, say, the second half of the previous day when predicting the next day's total volatility. This section explores the effect of regressing on previous volatility captured over intervals of different lengths, e.g. the last 12, 16, and 20 h of the previous day.

To explore the effect, attempts are made to regress on previous volatilities captured from time intervals of different lengths. The idea can also be extended to regress on multiple (m) non-overlapping previous volatilities, for example, treating the volatilities from the first and second halves of the previous day as two separate variables (m = 2) in the model. In theory, this extended model should perform at least as well as the model which only uses the previous day's volatility, since the latter is a special case of the extended model in which the parameter estimates are equal.

To define this model, take the conditional mean of the LCARR(p, q, a) model given by (10) with w = 1 and replace the term $\sum_{i=1}^{p} \beta_{1,i} V_{t-i}$ by $\sum_{i=1}^{m} \alpha_i v_{t-1,i}$, where m is the specified number of variables to regress on and the vt−1,i are the set of non-overlapping volatilities captured from intervals of different lengths within day t − 1. Note that it is not required that p = m. The new model, denoted by LCARR*(m, q, a), can then be written as

(27) $\lambda_t = \beta_0 + \sum_{i=1}^{m} \alpha_i v_{t-1,i} + \sum_{i=1}^{q} \beta_{2,i} \lambda_{t-i} + \beta_3 |R_{t-1}| 1_{R_{t-1}<0} + \beta_4 R_{t-1} 1_{R_{t-1}>0}.$

The stationarity conditions in (9) become more complicated when we regress on varying time lengths. For example, if $v^*_{t-1,i}$ comes from a time interval smaller than a day, then $\alpha_i$ can take a value larger than 1. Since $E(v^*_{t-1,i}) = p\,E(V_{t-1})$, where p is the length of the time interval as a proportion of one day, the volatility measures are rescaled by 1/p, that is $v_{t-1,i} = v^*_{t-1,i}/p$, to ensure that $E(v_{t-1,i}) = E(V_{t-1})$. Then the same stationarity conditions

(28) $\beta_0,\ \alpha_i,\ \beta_{2,j} > 0 \quad \text{and} \quad \sum_{i=1}^{m} \alpha_i + \sum_{i=1}^{q} \beta_{2,i} < 1$

apply.
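The rescaling and the stationarity check in (28) are simple to verify programmatically. The sketch below is illustrative only (the function names are ours): one helper applies the 1/p rescaling of a sub-daily volatility measure, the other checks the positivity and persistence-sum conditions.

```python
def rescale_subdaily(v_star, hours):
    """Rescale a volatility measure from a `hours`-long interval so that
    E(v) = E(V), where V is the full-day measure: v = v* / p with p = hours/24."""
    p = hours / 24.0
    return v_star / p

def is_stationary(beta0, alphas, beta2s):
    """Check conditions (28): all coefficients positive and
    sum(alpha_i) + sum(beta_{2,i}) < 1."""
    positive = beta0 > 0 and all(a > 0 for a in alphas) and all(b > 0 for b in beta2s)
    return positive and (sum(alphas) + sum(beta2s)) < 1

print(is_stationary(0.01, [0.3], [0.4, 0.2]))   # True
print(is_stationary(0.01, [0.5], [0.4, 0.2]))   # False: persistence sums to 1.1
print(rescale_subdaily(0.006, 12))               # 0.012 (half-day measure doubled)
```

After rescaling, an estimated α i larger than 1 on the raw sub-daily measure maps back into the usual parameter region.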

To show how the DIC of the LCARR*(1, 2, a) model changes with the length of the interval that captures previous volatility, Figure 6a plots the DIC against the last n hours of the day (n = 4, 6, 8, …, 22) used to measure vt−1, which replaces the lagged volatility term Vt−1 (n = 24). This was run for each time period using the RPK, 1 m measure for BTC. In the figure, the red line shows the DIC from the LCARR(1, 2, a) model using Vt−1 (n = 24) as the benchmark, while the black line shows the DIC of the LCARR*(1, 2, a) model for different interval lengths of vt−1. For period 1, regressing on the previous 18 h results in the lowest DIC of −4,606, which is lower than the DIC of −4,602 for the LCARR(1, 2, a) model using 24 h. However, a longer period length has a lower DIC for periods 2 and 3, which indicates that regressing on volatility measures captured from shorter periods generally offers no obvious improvement. Hence, the results suggest that using the whole previous day's volatility measure is generally a good choice.
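Since model selection throughout this section rests on the DIC, it may help to recall one standard way it is computed from MCMC output: DIC = D̄ + p_D with p_D = D̄ − D(θ̄), i.e. DIC = 2D̄ − D(θ̄). The sketch below assumes a vector of deviance values evaluated at each posterior draw and the deviance at the posterior mean; it is a generic illustration, not the authors' RStan code.

```python
import numpy as np

def dic(deviance_draws, deviance_at_posterior_mean):
    """Deviance information criterion: DIC = 2 * mean(D) - D(theta_bar),
    i.e. mean deviance plus the effective number of parameters p_D."""
    d_bar = float(np.mean(deviance_draws))
    return 2.0 * d_bar - deviance_at_posterior_mean

# Toy example: three deviance samples and the deviance at the posterior mean
draws = np.array([10.0, 12.0, 14.0])
print(dic(draws, 11.0))  # 2 * 12 - 11 = 13.0
```

Lower DIC indicates a better trade-off between fit and effective model complexity, which is why the black line dropping below the red benchmark in Figure 6 signals an improvement.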

Figure 6: 
DIC against interval length in the previous day volatility for the autoregressive effect (subfigures (a) and (b)) and in the previous day returns for the leverage effect (subfigure (c)) across the three periods.

Another model, LCARR*(2, 2, a), is run where the previous day's volatility Vt−1 is replaced by two volatility measures, vt−1,1 and vt−1,2, computed from non-overlapping intervals of the previous day. For example, one volatility measure is computed from the last 8 h of the previous day and denoted vt−1,1, and the other from the first 16 h and denoted vt−1,2. This model can be written as (27) with m = 2. This setting aims to see whether each of the two intervals from the previous day has a different correlation with the next day's volatility and should therefore be given its own weight or coefficient, α1 and α2 respectively.

Figure 6b plots the DIC against the length n in hours for the volatility measure vt−1,1 computed from the last n hours of the previous day. The other volatility measure vt−1,2 is computed from the first 24 − n hours of the previous day. In the figure, we can see that splitting the previous day's volatility into two measures improves the LCARR*(2, 2, a) model for periods 1 and 3: the black line, which indicates the DICs for the new model, drops below the benchmark red line, which is the DIC for the LCARR(1, 2, a) model. In period 3, when vt−1,1 is computed from the last 4 h of the previous day and vt−1,2 from the first 20 h, the DIC is about −6,327 compared to −6,309 for the LCARR(1, 2, a) model. This is quite a large improvement and shows that in period 3, the most recent hours of the previous day have higher predictive power, so vt−1,1 should be treated as a separate variable. Moreover, for period 1, it is evident that regressing on the previous volatility computed from a short interval at the end of the previous day improves the DIC. However, the new model performed worse for period 2, as seen from the higher DICs. Overall, these results indicate that in some cases it may be appropriate to split the previous day's volatility into two measures from two non-overlapping intervals and regress on them separately.

4.6.2 Varying Interval Length of Lagged Returns for Leverage Effect

Instead of changing the time intervals of the lagged volatilities for the autoregressive effect, the time intervals used to capture the leverage effect from lagged returns are now changed. The aim is to check whether the previous day's return is most appropriate for capturing the leverage effect or whether the return should be computed from a different time interval.

The Rt−1 term in the LCARR(1, 2, a) model is replaced by a return captured from the last n hours, where n = 4, 6, 8, …, 48. This model is denoted LCARR**(1, 2, a) and its conditional mean can be written as

(29) $\lambda_t = \beta_0 + \beta_1 V_{t-1} + \sum_{i=1}^{2} \beta_{2,i} \lambda_{t-i} + \beta_3 |R^*_{t-1}| 1_{R^*_{t-1}<0} + \beta_4 R^*_{t-1} 1_{R^*_{t-1}>0}$

where $R^*_{t-1}$ is the modified return computed from an interval of different length ending at the close of the previous day. Figure 6c plots the DIC from each model against the last n hours used to compute $R^*_{t-1}$. Again, the benchmark red line is the DIC from the LCARR(1, 2, a) model, which is why it passes through the observation at n = 24: using the last 24 h to capture the return is equivalent to the LCARR(1, 2, a) model.

There are a few interesting observations from Figure 6c. In period 2, using a larger time interval to capture the leverage effect from returns clearly results in a lower DIC. For example, using the return from the previous two days instead of just the previous day gives a DIC of −4,613, which is even lower than the DIC using just the previous day with the outliers removed (−4,607), as seen in Table 6. This could mean the market reacts to large changes in returns over longer periods of time. Another potential reason is that there are periods of very high volatility where the return alternated between positive and negative, so the return over the last two days would be less sensitive to a single day's shock.

For period 3, using the return from the last 6 h resulted in the lowest DIC. This result closely aligns with the autoregressive effect in Figure 6b, which also favours a short, most recent interval. Lastly, there does not seem to be much difference in performance for period 1 once an interval longer than 24 h is used to compute the previous return for the leverage effect.

4.6.3 Discussion

This section explores a more general definition of lagged/previous volatility for the autoregressive effect and of lagged returns for the leverage effect in the CARR model. This definition allows the variables to be computed from intervals of different lengths, relaxing the requirement that these lagged volatility and return regressors come from a one-day period. Regressing on previous volatility from a shorter time interval for the autoregressive effect shows no improvement in model fit over regressing on the previous one-day volatility. However, splitting the previous day's volatility into two measures yields some improvement, which means that volatility from different intervals of the previous day has different correlations with the next day's volatility. Also, for the leverage effect, using a different interval to compute the previous return may improve model performance.

Overall, while the new lagged variables computed from different intervals of previous days sometimes improve model fit, there is no consistent pattern across the three periods. However, every period shows improvement through at least one of the modifications. So the appropriate modification, and the interval lengths used to compute the corresponding lagged variables, can be searched case by case as we have done here. This area should be explored further. The techniques in this section for changing the interval length of the lagged predictors can also be applied to other time series models such as GARCH or ARMA models for returns.

5 Application to Stage Two Return Model

Return modelling and forecasting are also important for understanding the pandemic impact on the cryptocurrency market. In this section, return models with an AR(1) mean function and various error distributions are applied across periods to identify the pandemic impact.

5.1 Return Data

The daily close-to-close log return is the input to the return models. Table 9 summarises and compares this return data across the three periods for both BTC and ETH. From the table, the mean return is clearly highest in period 2 for both coins: the price of BTC went from 7,201 to 28,924 USD in 2020 and ETH went from 131 to 736 USD. The variance of returns is also highest in period 2, which suggests that the market uncertainty due to COVID-19, interacting with the increasing popularity of cryptocurrencies, may have raised volatility. For periods 1 and 3, the skewness and kurtosis are relatively low in comparison to period 2. The LB Q10 tests for autocorrelations up to lag 10 are significant, with the test statistics highlighted in boldface, except in periods 1 and 3 for BTC. The LB test statistics are also higher for ETH across all periods.

Table 9:

Summary statistics of returns from each period for BTC and ETH.

Coin BTC ETH
Period 1 2 3 1 2 3
Mean* 1.820 3.801 −0.681 −0.048 4.756 0.686
Variance* 1.273 1.795 1.669 1.801 3.039 2.890
Min −0.145 −0.503 −0.167 −0.194 −0.591 −0.325
Max 0.159 0.150 0.178 0.145 0.218 0.234
Skewness 0.141 −4.399 −0.129 −0.508 −3.270 −0.404
Excess kurtosis 4.11 54.91 1.93 3.79 36.74 3.80
LB test for autocorrelation Q10 13.83 28.74 12.17 18.90 30.07 25.19
KS test for normality p-value <0.001 <0.001 0.037 <0.001 <0.001 0.141
  1. *Re-scaled by multiplying by 103; Test statistics in boldface are significant at the 5 % level.
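The statistics in Table 9 can be reproduced from a price series along these lines. This is a minimal sketch with illustrative names: log returns follow the close-to-close definition, and the Ljung-Box Q statistic is implemented directly from its formula, Q = n(n+2) Σ_{k=1}^{h} ρ̂_k²/(n−k).

```python
import numpy as np

def log_returns(prices):
    """Daily close-to-close log returns R_t = ln(P_t / P_{t-1})."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def ljung_box_q(x, h=10):
    """Ljung-Box Q statistic for autocorrelations up to lag h."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    q = 0.0
    for k in range(1, h + 1):
        rho_k = np.sum(xc[:-k] * xc[k:]) / denom  # sample autocorrelation at lag k
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

# Strongly anti-persistent toy series (alternating +/-1): rho_1 = -0.9, rho_2 = 0.8
x = np.array([1.0, -1.0] * 5)
print(round(ljung_box_q(x, h=2), 1))  # 20.4
```

Under the null of no autocorrelation, Q is compared against a chi-square distribution with h degrees of freedom, which is how the boldface significance in Table 9 is determined.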

The bottom row of Table 9 reports the p-values from the Kolmogorov-Smirnov (KS) test for normality. Most of the p-values are below 0.05, indicating that the returns do not come from a normal distribution and highlighting the need for heavy-tailed distributions, as hinted by the high kurtoses. The one exception is ETH in period 3, and the p-value for BTC in period 3 is only marginally significant. This closer-to-normality observation for both coins in period 3 is all the more notable because period 3 is, in fact, longer than the other periods, so the test has more power against normality. Relatedly, Cont (2001) proposed the idea of aggregational normality: as one increases the time scale Δt (decreases the sampling frequency) over which returns are calculated, their distribution looks more and more like a normal distribution.

Figure 7 displays the time series, density, and ACF plots for the returns of BTC. The plots for ETH are similar and so are omitted. The time series plot for period 2 shows the large outlier on the 12th of March, after COVID-19 was declared a pandemic; on that date the return was about −0.5, compared to the mean of 0.0038. The ACF plot for period 2 shows a significant negative autocorrelation at the first lag, which is not surprising given the high LB Q10 test statistic in Table 9 and explains why an AR(1) in (13) is used for the conditional mean of the return model.

Figure 7: 
BTC time series, density, and ACF plots for returns.

5.2 Return Models

In this section, the return models in (12) and (13) for stage two of the CARR-return models are analysed and compared across periods. The volatility inputs from the stage one CARR model to the stage two return model are the volatility estimates from the LCARR(1, 2, a) model using the RPK, 1 m measure, as this volatility model has the lowest DIC and better out-of-sample forecast accuracy. Table 10 reports the parameter estimates and DIC of the return model for each period for BTC and ETH. The parameter μ0 in (13) is the unconditional return, ϕ1 is the autoregressive term for the AR(1) process of the return, and ρ in (12) is the scaling parameter that ensures the volatility estimate ρλ t , where λ t is given in (10), is unbiased for the true return variance. Unlike the shape parameters a, p, q for the volatility distributions, we report ν in Table 10 as it is directly related to the kurtosis of the return distributions, thereby revealing the effect of outliers. The lowest DIC in each period for each coin and the parameters significant at the 5 % level are highlighted in boldface.

Table 10:

Parameter estimates and DIC for the two-stage CARR-return model using the fitted RPK, 1 m volatilities from the LCARR(1, 2, a ) model for BTC and ETH.

Period Param BTC ETH
N ST VG N ST VG
1 μ 0 * 0.573 0.378 0.234 0.048 −0.325 −0.069
ϕ 1 0.032 −0.037 −0.056 −0.031 −0.108 −0.122
ν 2.359 1.614 2.488 1.693
ρ 1.119 0.325 1.011 1.110 0.369 1.033
DIC −1,396.82 −1,537.47 −1,533.21 −1,249.68 −1,355.48 −1,356.90
2 μ 0 * 1.013 1.575 1.537 1.117 1.123 0.962
ϕ 1 −0.047 −0.115 −0.132 −0.047 −0.110 −0.095
ν 2.526 1.709 3.215 2.098
ρ 1.296 0.377 1.055 1.300 0.593 1.148
DIC −1,301.23 −1,523.80 −1,507.01 −1,121.28 −1,245.34 −1,233.37
3 μ 0 * −0.519 −0.273 −0.145 −0.246 0.065 0.198
ϕ 1 −0.007 −0.028 −0.048 0.005 −0.019 −0.026
ν 4.205 2.298 5.584 2.985
ρ 0.988 0.545 1.010 1.064 0.712 1.083
DIC −1,940.92 −1,986.64 −1,988.90 −1,700.83 −1,730.19 −1,722.04
  1. *Re-scaled by multiplying by 103. Parameters in boldface are significant at the 5 % level. Lowest DIC for each period is in boldface.
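Since ν in Table 10 is reported for its direct link to kurtosis, it may be worth recalling that for a Student-t distribution the excess kurtosis is 6/(ν − 4), finite only for ν > 4. The sketch below (function name ours) illustrates why the period 3 estimates, the largest ν values in the table, still imply very heavy tails.

```python
def st_excess_kurtosis(nu):
    """Excess kurtosis of a Student-t distribution: 6/(nu - 4).
    For 2 < nu <= 4 the kurtosis is infinite; below 2 the variance itself diverges."""
    if nu <= 4:
        return float("inf")
    return 6.0 / (nu - 4.0)

# BTC period 3 (Table 10): nu = 4.205 still implies very heavy tails
print(round(st_excess_kurtosis(4.205), 1))   # ≈ 29.3
# Larger nu -> closer to Normal (excess kurtosis 0)
print(st_excess_kurtosis(10.0))              # 1.0
```

Note that the ν estimates below 4 in periods 1 and 2 correspond to theoretically infinite kurtosis, consistent with the extreme sample kurtoses in Table 9.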

5.2.1 Comparison of Error Distributions

For the error distribution, three distributions are considered, namely the Normal (N), Student-t (ST), and Variance-Gamma (VG) distributions. While the Normal distribution is a special case of the ST and VG distributions as ν goes to infinity, the ST and VG distributions are not special cases of each other and have different shapes. Relative to the ST distribution, the VG distribution has lower density in the mid-regions, defined as the regions between the peak and the tails on each side (Fung and Seneta 2007). Hence, it is important to compare these two distributions. In terms of computational time, the return models run faster than the volatility models, likely because they have fewer parameters and less complexity. The running times are around 15 s, 30 s, and 1 min for the Normal, ST, and VG distributions respectively.

As shown by the DICs, the ST distribution is the best for BTC in periods 1 and 2 and for ETH in periods 2 and 3, while the VG distribution models returns best in the remaining cases. Where the VG distribution outperforms the ST distribution, the improvement is only marginal. For example, in period 3 for BTC, the difference in DIC between ST and VG is only 2.26, whereas the difference is much larger whenever the ST distribution performs better. Hence, the ST distribution is better at modelling this return data overall.

The residual density plots in Figure 8 visualise the comparison across distributions. The density plot for period 2 is cropped on the left tail to exclude the extreme negative return on March 12th, which would otherwise obscure the shape of the densities. Overall, it is clear that the ST and VG distributions model the error term well, though the shapes of the two distributions differ. The VG distribution is more peaked around 0, so it can be good for modelling returns when infrequent trading produces many near-zero returns; this is likely for high-frequency data, which often come with high kurtosis. Between the ST and VG distributions, the daily returns for BTC and ETH are usually better modelled by the ST distribution, and in the cases where the VG distribution fits better, it is only marginally better. Also, for period 3, the Normal distribution clearly models the error term better than in the other two periods, supporting the observation that returns are closer to a Normal distribution in period 3 relative to periods 1 and 2.

Figure 8: 
Theoretical versus actual density for the return errors across distributions and periods.

Furthermore, Figure 9 shows the QQ plots for the error terms from the return model across distributions and periods for both coins. The QQ plots highlight how the ST and VG distributions model the tails of the return distribution better than the Normal distribution. However, there are still some outliers in the tails which cannot be captured by the distributions, highlighting the difficulty of modelling the extreme ends of the tails. For period 1, the VG error distribution appears to model the error terms near 0 better than the ST distribution, as the observed quantiles are closer to the theoretical quantiles in the middle, showing how the peak of the VG distribution is better at capturing the central portion of data with high kurtosis.

Figure 9: 
QQ plots for return error across distributions and periods.

5.2.2 Comparison of Pandemic Periods via Parameter Estimates

Looking at the parameter estimates, the unconditional mean μ0 is highest in period 2 for both coins; however, it is only significant for BTC using the ST and VG distributions. This is likely because BTC went from 7,201 to 28,924 USD in 2020. During the early pandemic period (period 2), the AR term ϕ1 is most negative and significant under the ST and VG distributions, indicating the highest anti-persistence of the three periods in Table 10. This higher anti-persistence could result from returns alternating between positive and negative more frequently during times of high volatility and uncertainty. For example, ϕ1 for BTC in period 2 was −0.115 compared to −0.037 and −0.028 in the other two periods. This is expected, as the ACF of BTC returns for period 2 is negative and significant at lag one in Figure 7. Parameter ϕ1 is also significant for ETH during the pre and early pandemic periods. However, during the late pandemic period, ϕ1 is not significant for either coin, and in these cases the mean function should be reduced to a constant.

Even after removing some of the most extreme outliers, this higher anti-persistence in period 2 remains significant, as seen in Table 11.

Table 11:

Effect of removing outliers on two-stage return model using Student-t distribution and RPK, 1 m measure fitted to LCARR(1, 2, a ) model for period 2 for BTC and ETH.

Parameter BTC ETH
With outliers Outliers removed With outliers Outliers removed
μ 0 * 1.575 1.543 1.123 1.082
ϕ 1 −0.115 −0.102 −0.110 −0.096
ν 2.526 2.865 3.215 3.700
ρ 0.377 0.411 0.593 0.637
DIC −1,524 −1,544 −1,245 −1,264
  1. *Re-scaled by multiplying by 103; Bold numbers are significant at the 5 % level.

Regarding the shape parameter ν for the ST distribution, it is highest in period 3 for both coins, consistent with the KS test for normality being marginally significant or insignificant for the returns in period 3, as reported in Table 9. Moreover, the improvement in DIC of the ST or VG over the Normal distribution in period 3 is much smaller. For BTC, the drop in DIC when using the ST relative to the Normal distribution is only 45.72 for period 3, in comparison to 222.57 for period 2. This could be due to fewer shocks in the cryptocurrency market during the late pandemic period.

For the scaling parameter ρ which ensures unbiasedness of the volatility input, it is expected to be near 1 when using the Normal and VG distributions since $\mathrm{Var}(R_t) = \sigma_t^2$, meaning that the volatility estimates are close to the true volatility. However, for the ST distribution, $\mathrm{Var}(R_t) = \frac{\nu}{\nu-2}\sigma_t^2$, so that $\rho \approx \frac{\nu-2}{\nu}$ is always lower than 1 to ensure unbiasedness. For example, ρ = 0.545 for BTC in period 3 because $\frac{4.205-2}{4.205} = 0.524$. Hence, the volatility estimate is essentially scaled by 0.545/0.524 = 1.04, which is approximately 1.
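The arithmetic behind this ρ check can be sketched as follows (function name ours, values from Table 10):

```python
def st_unbiased_rho(nu):
    """Scaling needed so that rho * lambda_t is unbiased for the return variance
    under Student-t errors: Var = nu/(nu-2) * sigma^2, hence rho ~ (nu-2)/nu."""
    return (nu - 2.0) / nu

# BTC, period 3: nu = 4.205
rho_theory = st_unbiased_rho(4.205)
print(round(rho_theory, 3))          # 0.524
print(round(0.545 / rho_theory, 2))  # ≈ 1.04: fitted rho close to theory
```

The fitted ρ = 0.545 being within a few percent of the theoretical (ν − 2)/ν value indicates the two-stage volatility input is roughly correctly calibrated.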

5.2.3 Removing Outliers in Period 2

Similar to the analysis in Section 4.4, the effect of removing the outliers in period 2 (12th and 13th of March) is examined. The outliers in the RPK, 1 m measure are modified to equal the volatility on the 14th of March, and the returns on the 12th and 13th of March are set to 0. The modified series is then fitted to the LCARR(1, 2, a) volatility model, and the conditional mean volatility estimates λ t are input into the return model. It is important to note that after the returns for these days were set to 0, the skewness became 0.256 in comparison to −4.417 with the outliers. Hence, if a skewed ST distribution were used, it might try to capture the outliers better, resulting in a worse fit for the rest of the error terms.
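The dramatic effect of a single crash day on skewness is easy to reproduce. The sketch below is a toy illustration with simulated returns (not the BTC data): one extreme negative observation stands in for the 12 March 2020 crash, and setting it to 0 restores near-symmetry, mirroring the move from −4.417 to 0.256 described above.

```python
import numpy as np

def skewness(x):
    """Sample skewness using population-style moments: m3 / m2^(3/2)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return float(np.mean(xc ** 3) / np.mean(xc ** 2) ** 1.5)

# Toy daily returns with one extreme negative outlier
rng = np.random.default_rng(42)
returns = rng.normal(0.0, 0.02, size=250)
returns[100] = -0.50                  # stand-in for the pandemic-declaration crash
print(skewness(returns) < -3)         # heavily left-skewed: True
returns[100] = 0.0                    # outlier set to 0, as in this section
print(abs(skewness(returns)) < 1)     # near-symmetric: True
```

A single observation dominating the third moment is exactly why a skewed error distribution fitted with the outliers present can distort the fit for the bulk of the data.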

Table 11 compares the parameter estimates from the return model with and without the outliers. After removing the outliers, the AR term ϕ1 becomes slightly less negative for both BTC and ETH, but the anti-persistence remains significant. The shape parameter ν becomes higher, indicating lower kurtosis, as expected since the tails become thinner after removing the outliers. The DIC also becomes smaller after removing the outliers since the remaining returns can be captured with a higher likelihood.

5.3 Out-Of-Sample Forecast Performance

The return models are also evaluated in terms of out-of-sample forecasts to test their generalisation capacity to unseen data.

5.3.1 Return, VaR and CVaR Forecasts

For the out-of-sample return forecast, the same method used for volatility is adopted: a window of T = 250 days is used to train the model and forecast the next day's return. The return model uses the ST distribution for the error terms and the AR(1) conditional mean function. This is performed for the last 100 days of each period. The one-day-ahead VaR forecasts are estimated from the posterior predictive distributions. Similar to Section 4.5, the posterior predictive distribution of return forecasts $\{\hat{R}_{s+T+1}^{(k)}: k = 1, \dots, K\}$ for each forecast time T + 1 + s, s + 1 = 1, …, 100 can be obtained by posterior sampling. The one-step-ahead forecast of Rs+T+1 at MCMC iteration k using window W s is

(30) $\hat{R}_{s+T+1}^{(k)} \sim \mathrm{ST}\!\left(\mu_{s+T+1}^{(k)},\ \rho^{(s,k)} \lambda_{s+T+1}^{(k)},\ \nu^{(s,k)}\right)$

where $\lambda_{s+T+1}^{(k)}$ is the volatility mean forecast using window W s and $\mu_{s+T+1}^{(k)}$ is given in (13) with t = s + T + 1 and parameters $\mu_0 = \mu_0^{(s,k)}$, $\phi_1 = \phi_1^{(s,k)}$ and $\rho = \rho^{(s,k)}$ at iteration k using W s . The VaR and CVaR forecasts can then be estimated from the quantiles and conditional means of the posterior predictive sample, that is,

(31) $VaR_{s+T+1}(\alpha_j) = \hat{R}_{s+T+1}^{(\lceil K\alpha_j \rceil)}, \quad j = 1, 2, \qquad CVaR_{s+T+1}(\alpha_j) = \begin{cases} \dfrac{1}{K\alpha_j} \displaystyle\sum_{k \le K\alpha_j} \hat{R}_{s+T+1}^{(k)}, & j = 1, \\[2ex] \dfrac{1}{K(1-\alpha_j)} \displaystyle\sum_{k > K\alpha_j} \hat{R}_{s+T+1}^{(k)}, & j = 2, \end{cases}$

where $\hat{R}_{s+T+1}^{(\lceil K\alpha_j \rceil)}$ denotes the $\lceil K\alpha_j \rceil$-th posterior sample arranged in ascending order from $\hat{R}_{s+T+1}^{(1:K)}$ defined in (30).
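The order-statistic construction in (31) can be sketched directly on a posterior predictive sample. This is an illustrative implementation (names are ours): VaR is the ⌈Kα⌉-th sorted draw, and CVaR averages the draws beyond it on the relevant side.

```python
import math
import numpy as np

def var_cvar(draws, alpha, lower_tail=True):
    """VaR and CVaR from a posterior predictive sample, in the spirit of (31).
    lower_tail=True gives the alpha lower-quantile (loss) side;
    lower_tail=False treats alpha as an upper level (e.g. 0.95)."""
    srt = np.sort(np.asarray(draws, dtype=float))
    K = len(srt)
    idx = math.ceil(K * alpha)        # ceil(K * alpha)-th order statistic
    var = srt[idx - 1]                # convert 1-based rank to 0-based index
    if lower_tail:
        cvar = srt[:idx].mean()       # average of the idx smallest draws
    else:
        cvar = srt[idx:].mean()       # average of the draws above the VaR rank
    return var, cvar

# Toy posterior sample of K = 100 return draws: 1, 2, ..., 100
draws = np.arange(1.0, 101.0)
print(var_cvar(draws, 0.05, lower_tail=True))    # (5.0, 3.0)
print(var_cvar(draws, 0.95, lower_tail=False))   # (95.0, 98.0)
```

With actual MCMC output, `draws` would be the K sampled returns $\hat{R}_{s+T+1}^{(k)}$ for a given forecast day, and the same routine applies unchanged to the volatility draws for VoaR/CVoaR.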

Figure 10 displays the one-day-ahead forecasts for returns and prices, together with several confidence intervals (CIs) based on the posterior samples $\hat{R}_{s+T+1}^{(k)}$, s = 0, …, 99, and the VaR in (31). The (closing) price forecasts are computed from the return forecasts as

(32) $E(P_{t+1}) = P_t\, e^{E(R_{t+1})}$

where $E(P_{t+1})$ is the estimated price at time t + 1 and returns are defined as close-to-close log returns in (2).
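The transform in (32) is a one-liner; the sketch below (illustrative values, not from the paper's data) also notes that the same exponential map carries return-CI endpoints into price-CI endpoints, since it is monotone.

```python
import math

def price_forecast(price_today, expected_log_return):
    """One-day-ahead price forecast from (32): P_{t+1} = P_t * exp(E(R_{t+1}))."""
    return price_today * math.exp(expected_log_return)

# A hypothetical 1 % expected log return on a 28,924 USD close
print(round(price_forecast(28924.0, 0.01), 2))   # ≈ 29214.69

# Monotonicity: return-CI endpoints map directly to price-CI endpoints
lo = price_forecast(28924.0, -0.05)
hi = price_forecast(28924.0, 0.05)
print(lo < 28924.0 < hi)   # True
```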

Figure 10: 
One-day-ahead forecasts and confidence intervals (CIs) using VaR for returns (left column) and prices (right column) using the two-stage CARR-return model with the ST distribution.

The return forecasts in Figure 10 are close to zero, as expected given the mean reversion and low autocorrelation in returns. However, they display more variability in period 2, likely due to the larger magnitude of the AR term ϕ1 in the return mean function. The estimated CIs appear relatively accurate, as the returns and prices rarely fall outside the wider CIs for either BTC or ETH. Between coins, the observed variances and the unconditional mean β0 in Table 5 are higher for ETH, and Figure 10 accordingly shows higher variability for ETH relative to BTC. Across periods, period 2 (early pandemic) reveals a more turbulent market throughout and a clear increasing price trend, which is followed by a clear decreasing price trend in period 3 (late pandemic).

5.3.2 Kupiec Tests for VaR Forecasts

The Kupiec test is again used to check whether the one-day-ahead VaR forecasts are accurate; both the lower and upper quantiles are tested. Table 12 reports the VR and p-value of the test. Some p-values in periods 2 and 3 are less than 0.05 and are highlighted in boldface, so the corresponding VaR estimates must be rejected. For period 2, the VRs for the lower quantiles of BTC are always smaller than the VaR level, so the model is underestimating the lower VaR levels for the last 100 days of period 2. On the other hand, the VR is generally higher than expected for the upper VaR levels for BTC in period 2, with a 0.19 violation rate at the 0.9 VaR level. A potential reason for a higher VR on one side would be high skewness; however, the skewness for the last 100 days of period 2 for BTC is only 0.26, which is relatively small.

Table 12:

Kupiec Test Results for the two-stage return model using the Student-t distribution and the RPK, 1 m measure fitted to LCARR(1, 2, a ) model. Both lower and upper quantile exceedances are tested.

Coin Period Metric VaR level
Lower quantiles Upper quantiles
0.1 0.05 0.025 0.01 0.9 0.95 0.975 0.99
BTC 1 VR 0.15 0.06 0.03 0.01 0.07 0.04 0.02 0.01
p-value 0.118 0.656 0.756 1.000 0.293 0.635 0.740 1.000
2 VR 0.04 0.02 0.01 0.00 0.19 0.10 0.04 0.01
p-value 0.024 0.119 0.275 0.156 0.007 0.042 0.376 1.000
3 VR 0.17 0.08 0.05 0.02 0.06 0.04 0.01 0.00
p-value 0.032 0.204 0.158 0.376 0.153 0.635 0.275 0.156
ETH 1 VR 0.12 0.07 0.04 0.02 0.07 0.03 0.01 0.01
p-value 0.517 0.386 0.376 0.376 0.293 0.323 0.275 1.000
2 VR 0.08 0.04 0.00 0.00 0.14 0.05 0.01 0.00
p-value 0.491 0.635 0.024 0.156 0.206 1.000 0.275 0.156
3 VR 0.13 0.11 0.05 0.02 0.06 0.02 0.01 0.00
p-value 0.337 0.017 0.158 0.376 0.153 0.119 0.275 0.156
  1. p-Values < 0.05 are highlighted in boldface. In these cases, the VaR estimates are rejected.

Another possible reason is the autoregressive term ϕ1. From Table 10, ϕ1 is negative and significant; however, the autocorrelation of returns for the last 100 days of period 2 is near zero (−0.018) and not significant. Since returns are mostly positive over this stretch, as shown in the price plot for period 2 in Figure 10, and ϕ1 is negative, the estimated returns and quantiles are pulled lower, producing higher VRs on the upper quantiles and lower VRs on the lower quantiles. Hence, a constant mean model for the estimated returns might be more appropriate. It is also worth noting that if a skewed ST distribution had been used, the estimated quantiles might be even lower, as the skewness is negative for early 2020, resulting in lower VRs at the lower VaR levels and higher VRs at the upper VaR levels.
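
To make this mechanism concrete, a toy calculation (using hypothetical values for ϕ0, ϕ1, the return and the volatility, not the fitted estimates from Table 10) shows how a negative AR(1) coefficient shifts all forecast quantiles downward after a positive return:

```python
# Hypothetical AR(1) mean-function parameters (illustrative only).
phi0, phi1 = 0.0, -0.15
r_today = 0.03            # a positive daily return, as in the period-2 uptrend

# One-step-ahead conditional mean of the return: negative after a gain.
r_hat = phi0 + phi1 * r_today

# Each forecast quantile is the conditional mean plus a scaled error
# quantile, so a negative r_hat lowers both VaR bounds: more upper-quantile
# violations, fewer lower-quantile violations.
sigma, q95 = 0.02, 1.645  # illustrative volatility and Normal 95% quantile
upper_var = r_hat + sigma * q95
lower_var = r_hat - sigma * q95
```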

For period 3, there is a significant p-value on the lower quantiles for both BTC and ETH in Table 12. This results from a higher VR in the lower quantiles, contrary to period 2. As seen in Figure 10, the price of BTC was moving down, so there may be an effect opposite to that in period 2. Also, the last 100 days of period 3 for BTC and ETH do show negative skewness, −1.0 and −0.5 respectively. Since a symmetric ST distribution is used, the VaR estimates may be too high, resulting in higher VRs on the lower quantiles.

On the other hand, most of the Kupiec tests are insignificant, particularly at the low VaR levels such as 0.01 that are used most in practice. Still, the low p-values highlight some limitations of out-of-sample return forecasting: when the price is consistently increasing or decreasing, the model generally performs worse in forecasting the next-day VaR. Indeed, in period 1, where the price does not move consistently in one direction over the last 100 days, all p-values from the Kupiec tests are high and insignificant, indicating that the model provides relatively accurate VaR forecasts. In summary, if the features of the return series keep changing, out-of-sample forecasts may become inaccurate and a simpler model may be more appropriate.
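
The Kupiec statistic behind Table 12 is straightforward to reproduce. A minimal sketch of the proportion-of-failures likelihood-ratio test (the function name `kupiec_pof` is ours; only the Python standard library is used), checked against the BTC period-2 entry at the 0.1 lower VaR level (VR = 0.04 over 100 days, reported p-value 0.024):

```python
import math

def kupiec_pof(x, n, p):
    """Kupiec (1995) proportion-of-failures test.
    x: number of VaR violations observed in n forecasts
    p: nominal violation probability (the VaR level)
    Returns (LR statistic, p-value) against chi-square with 1 df."""
    def loglik(q):
        # Bernoulli log-likelihood, using the 0*log(0) = 0 convention
        a = (n - x) * math.log(1.0 - q) if x < n else 0.0
        b = x * math.log(q) if x > 0 else 0.0
        return a + b
    lr = -2.0 * (loglik(p) - loglik(x / n))
    pval = math.erfc(math.sqrt(lr / 2.0))  # chi-square(1) survival function
    return lr, pval

# BTC, period 2, 0.1 lower VaR level: 4 violations in 100 days
lr, pval = kupiec_pof(4, 100, 0.10)   # pval is approximately 0.024
```

The same routine applied to zero violations at the 0.01 level, `kupiec_pof(0, 100, 0.01)`, reproduces the p-value 0.156 that recurs in Table 12.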

5.4 Comparison with GARCH

In this section, the GARCH model is compared with the two-stage CARR-return model using the DIC. For a fair comparison, the GARCH model is run with the same error distributions, and the leverage effect and second-order persistence are also applied. The differences between the GARCH and the two-stage CARR-return models are therefore that, in the two-stage model, the volatility measured by RPK,1m is modelled separately using the GB2 distribution instead of being treated as latent and deterministic, and that the short-memory term uses the lagged RPK,1m measure instead of the squared return R_t^2 in the GARCH model.

Table 13 shows that the model with the lowest DIC (highlighted in boldface) is the two-stage CARR-return model in five of the six period-by-coin cases. Among these five cases, the ST distribution wins three times, consistent with Table 10. The only exception, where the GARCH model performs best, is ETH in period 2 using the ST distribution; even there the DIC is only marginally lower, −1,246 for the GARCH model against −1,245 for the two-stage model. In conclusion, the two-stage model has advantages over the GARCH model as it utilises two data sources, returns and an efficient volatility measure, allowing a separate volatility model with a flexible distribution such as the GB2. By inputting the volatility estimates into the return model, it also reduces the burden of estimating both mean and volatility parameters in a single GARCH model.

Table 13:

DIC of the GARCH and two-stage CARR-return models across error distributions and periods for BTC and ETH.

Period Model BTC ETH
N ST VG N ST VG
1 GARCH −1,420.16 −1,518.70 −1,523.27 −1,268.06 −1,356.52 −1,346.88
2-stage −1,396.82 −1,537.47 −1,533.21 −1,249.68 −1,355.48 −1,356.90
2 GARCH −1,302.38 −1,507.44 −1,483.99 −1,132.32 −1,246.24 −1,226.87
2-stage −1,301.23 −1,523.80 −1,507.01 −1,121.28 −1,245.34 −1,233.37
3 GARCH −1,936.44 −1,978.54 −1,982.99 −1,650.21 −1,708.68 −1,696.93
2-stage −1,940.92 −1,986.64 −1,988.90 −1,700.83 −1,730.19 −1,722.04
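
The DIC used for these comparisons can be computed directly from posterior draws (Spiegelhalter et al. 2002). A minimal sketch of the standard computation, illustrated with a conjugate Normal-mean toy model rather than the paper's CARR or GARCH likelihoods (all names are ours):

```python
import numpy as np

def dic(loglik_draws, loglik_at_post_mean):
    """DIC = Dbar + pD, where Dbar is the posterior mean deviance and
    pD = Dbar - D(theta_bar) is the effective number of parameters."""
    d_bar = -2.0 * np.mean(loglik_draws)   # posterior mean deviance
    d_hat = -2.0 * loglik_at_post_mean     # deviance at the posterior mean
    p_d = d_bar - d_hat
    return d_hat + 2.0 * p_d, p_d

# Toy check: Normal data with known unit variance and a flat prior on mu,
# so the posterior of mu is N(ybar, 1/n) and pD should be close to 1.
rng = np.random.default_rng(1)
y = rng.standard_normal(100)
mu_draws = y.mean() + rng.standard_normal(20_000) / np.sqrt(len(y))

def loglik(mu):
    return -0.5 * np.sum((y - mu) ** 2) - 0.5 * len(y) * np.log(2 * np.pi)

ll_draws = np.array([loglik(m) for m in mu_draws])
dic_value, p_d = dic(ll_draws, loglik(mu_draws.mean()))  # p_d is close to 1
```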

6 Conclusions

This paper covers several aspects of volatility and return modelling, focusing on applications to BTC and ETH to study the pandemic's impact on the cryptocurrency market. The data are split into three periods to facilitate comparisons between the pre-pandemic, early pandemic, and late pandemic periods and to help detect consistent patterns across periods.

Firstly, in terms of modelling, Section 4 applies the CARR model, exploring how the choice of error distribution (Wei, GG and GB2) and volatility measure (PK, RPK and QPK with different frequencies and quantile ranges) affects performance using CARR(1, 1) models. The section also explores features of volatility such as the leverage (LCARR model) and bilinear (BCARR model) effects and checks whether they improve the model fit using the DIC. Model robustness is assessed by comparing the best LCARR(1, 2, a) model using the RPK,1m measure with and without capping the extreme outliers in period 2. The models' ability to forecast volatility, VoaR and CVoaR is also compared across periods and coins using the best model. Moreover, extensions of the autoregressive and leverage effect terms in the CARR models are proposed by allowing the lagged volatility and return regressors to be computed from intervals of different lengths; however, no extended predictor shows a consistent improvement across periods.

Section 5 then applies the stage-two return model to the returns of BTC and ETH. Volatility estimates from the CARR model are integrated into the return model with three error distribution choices (Normal, ST and VG) and an AR(1) mean function. Model performance is compared across error distributions, across periods and against the GARCH model, and forecast performance for returns, VaR and CVaR is assessed similarly to the CARR models. Results show that the ST distribution is mostly favoured and that the two-stage CARR-return model performs consistently better than the GARCH model. Lastly, computing these Bayesian two-stage CARR-return models is feasible, as the processing time is under 10 min per model; CPU usage and memory requirements are also manageable, since the models do not have a large number of parameters.

Secondly, in terms of model and forecast performance, results show that the biggest improvement comes from the choice of error distribution. While one can build more complicated models by adding features such as leverage and bilinear effects to the mean function, these additions matter little if the error distribution is not properly chosen. By downweighting the effect of outliers, flexible error distributions reduce contamination of the regression parameters and improve the accuracy of volatility persistence estimates. This also highlights the importance of examining density and QQ plots to check how well the model captures the heavy tails. In addition, the Bayesian two-stage CARR-return models allow easy computation of (VoaR, CVoaR) and (VaR, CVaR) from the posterior predictive samples of the volatility and return forecasts respectively, an advantage of the Bayesian approach. The VoaR and VaR forecast performance can then be tested using the Kupiec test: all VoaR forecasts have insignificant results, indicating that the best LCARR(1, 2, a) model provides good VoaR forecasts, whereas some VaR forecasts in periods 2 and 3 are significant, indicating that those VaR forecasts should be rejected.
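
This posterior predictive route to the risk measures is direct. A minimal sketch (function name ours; standard Normal draws stand in for the model's actual predictive samples):

```python
import numpy as np

def var_cvar(pred_draws, level=0.05):
    """Lower-tail VaR and CVaR from posterior predictive return draws.
    VaR is the level-quantile of the draws; CVaR is the mean of the
    draws at or below the VaR."""
    var = np.quantile(pred_draws, level)
    cvar = pred_draws[pred_draws <= var].mean()
    return var, cvar

# With standard Normal draws the 5% VaR is about -1.645 and the
# 5% CVaR about -2.06, which the sketch recovers.
rng = np.random.default_rng(0)
draws = rng.standard_normal(200_000)
var5, cvar5 = var_cvar(draws, 0.05)
```

Applying the same function to posterior predictive volatility draws yields VoaR and CVoaR.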

Thirdly, in terms of pandemic effects on cryptocurrencies, there are a few observations. The first comes from the fact that the early pandemic period (period 2) contains two days of extremely high volatility, the result of the WHO declaring COVID-19 a pandemic. These outliers lower the short-term memory in volatility; once they are capped, however, the short-term memory is similar to that of the other periods. In practice, one may consider more advanced models that capture such sudden jumps probabilistically, to avoid contamination of estimates by outliers. The second observation stems from the increase in volatility persistence in the early and then late pandemic periods, showing slower information absorption. For returns, the AR(1) term is significant and exhibits anti-persistence, most pronounced during the early pandemic. This suggests that during periods of uncertainty the market experiences more frequent shifts between positive and negative returns; these fluctuations may result from market overreactions to certain events followed by rapid corrections, causing returns to reverse direction.

The third observation stems from the leverage effect in volatility, which appears to change across periods. During the early pandemic, the asymmetric impact of a positive return is higher than that of a negative return, even after capping the outliers. In the later pandemic period, the asymmetric impact of negative returns is higher, which is more consistent with the usual market behaviour. This change in the leverage effect could suggest reduced predictability of volatility during periods of market distress. The fourth observation comes from the increase in the degrees of freedom ν of the ST return distribution from early to late pandemic, demonstrating less market stress and a gradual return to normal market conditions. Lastly, we observe the inaccuracy of VaR forecasts in periods 2 and 3, which hints at heightened market risk. The increasing price trend in period 2 for both coins leads to an underestimated VR for the lower quantiles and an overestimated VR for the upper quantiles, particularly for BTC. The price trends reverse in period 3, resulting in over- and underestimation of the VR for the lower and upper quantiles respectively. This pandemic impact on VaR should be a cause for concern.

Lastly, in terms of the economic interpretation of pandemic impact, we focus our discussion on volatility clustering and market efficiency. The distribution and persistence parameters of the volatility and return models offer different interpretations of these two properties. The preference for the GB2 volatility distribution, with thicker right tails than the Weibull and GG distributions, in all periods indicates a riskier, panic-prone market with extreme market stress and stronger volatility clustering in the crypto market. In addition, the increasingly high volatility persistence across pandemic periods, with β1 + β2 ≈ 0.9 and β2 ≈ 0.6 in the late pandemic (paragraph 2, Section 4.2), reflects a growing pandemic impact that induces irrational investor behaviour and responses to pandemic policies. These signals further indicate economic fragility and prolonged market and economic uncertainty. To mitigate these pandemic impacts, central banks and regulators should use non-Gaussian scenarios in stress tests, monitor systemic risk, and set up a volatility risk premium to protect investors.

The other economic property is market efficiency. This is not just a financial concept but a critical enabler of economic development and stability. Acting like a “mirror of the economy”, efficient markets guide investment and consumption decisions wisely, allow efficient capital allocation, and hence facilitate lower transaction costs and more informed decisions. When markets are not efficient, they become a distorted lens, leading to waste, risk, and instability. The efficient market hypothesis states that all available information is fully reflected in prices; hence, returns should be random, memoryless, and unpredictable. As the AR(1) terms in the return models are only significant during the early pandemic, they suggest some market inefficiency, with friction and delayed information diffusion. Moreover, the best ST return distributions having a low estimated degrees of freedom parameter ν during the early pandemic also reflect more frequent extreme returns, slow information absorption and herding behaviour.

Despite these observations, there are some limitations of this research suggesting directions for further exploration. For example, the heavy tails of the ST and VG distributions may not be thick enough to capture some very extreme outliers, such as those on March 12th and 13th, 2020, which negatively skewed the distribution despite the increasing price trend. More flexible distributions, including skewed distributions, could be explored. These distributions can be expressed as a mean scale mixture of Normals (MSMN), extending the scale mixtures of Normals in (21) and (22) for the ST and VG distributions respectively, to simplify the Bayesian inference. As an example, the MSMN form of the ST distribution can be expressed as

(33) X | μ, σ², γ, ν, u ∼ N(μ + uγ, σ²u) and u | ν ∼ Ga(ν/2, ν/2),

where γ is the skewness parameter. Alternatively, quantile regression (Koenker and Bassett Jr 1978), which uses an asymmetric loss function, could potentially model the extreme quantiles better. A jump model (Chan et al. 2014) could also be used to capture discontinuous behaviour in prices. The jump model includes a jump term J_t q_t in the mean function, with jump size J_t ∼ Ga(a, b) and jump probability q_t ∼ Bernoulli(π). This component describes a price process that has normal variation, as modelled by geometric Brownian motion, plus a jump component caused by information shocks.
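
The MSMN representation in (33) also gives a direct way to simulate skewed heavy-tailed returns, e.g. for prior predictive checks. A minimal sketch under the parameterisation written above (the function name and the example parameter values are ours):

```python
import numpy as np

def rmsmn_st(n, mu=0.0, sigma=1.0, gamma=0.5, nu=8.0, seed=0):
    """Draws from the MSMN form in (33):
    u ~ Ga(nu/2, nu/2) (rate form, so mean 1) and
    X | u ~ N(mu + u*gamma, sigma**2 * u)."""
    rng = np.random.default_rng(seed)
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    return mu + u * gamma + sigma * np.sqrt(u) * rng.standard_normal(n)

# gamma > 0 produces right skew; gamma = 0 recovers a symmetric mixture.
x = rmsmn_st(200_000, gamma=0.5, nu=8.0)
skew = np.mean((x - x.mean()) ** 3) / np.std(x) ** 3  # sample skewness > 0
```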

Moreover, many factors affected crypto markets during the pandemic period (regulatory policies including lockdown and quarantine, remote work and schooling, testing and isolation, border closures and travel bans, vaccines and booster shots, stimulus payments and business support, etc.), and these could in principle be captured by dummy indicators/covariates in the volatility and return mean functions. However, our experience shows that it is nearly impossible to capture these individual effects by adopting many covariates. Covariates in the return mean function are often not significant because the mean is close to zero with large variance, making their effects hard to estimate precisely, while including many covariates in the volatility model makes it unstable. Better approaches for studying pandemic policy effects may be neural networks, as they can accommodate many covariates and their higher-order interactions; this is a promising area to pursue in the future. Our aim here, instead, is to study how these factors affect the persistence of crypto markets across pandemic periods, as persistence is an important indicator of price movement related to market performance.

Lastly, the current study focuses on the cryptocurrency market. To evaluate the impact of COVID-19 on the financial markets and, further, on the global economy, it would be interesting to extend the application to energy, gold and stock markets such as the S&P 500, as they represent different financial sectors. Such a study could evaluate how features such as short-term memory (liquidity and market frictions), leverage effect (market reaction), kurtosis (shocks) and VaR (market risk) change across pandemic periods. Finally, multivariate extensions of these models could be explored, as the dependency between assets is important in risk measurement when managing portfolios. This would include the multivariate CARR model of Tan et al. (2022); alternatively, one could use the Wishart distribution, the multivariate generalisation of the Gamma distribution, to model the covariance matrices. In stage two, a vector autoregressive model could be used for the returns with multivariate ST or VG distributions.


Corresponding author: Jennifer S.K. Chan, School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia, E-mail: 

A Volatility Summary Plots for ETH

Figure 11: ETH time series, density, and ACF plots for selected volatility measures.

Figure 12: ETH time series, density, and ACF plots for selected volatility measures.

B Bayesian Model Diagnostics

Figure 13: Bayesian model diagnostics for the CARR(1, 1) model using the Weibull distribution applied to BTC in period 1 (the first model in Table 4). Subplot (d) is the posterior predictive distribution of VT+1 in (25), the first one-step-ahead forecast, fitting the model to V1:T where the window size T = 265 in period 1. The red dashed line is the estimate V̂266, the median of the posterior predictive distribution.

C Credible Intervals for the CARR Models in Tables 4 and 5

Table 14:

Ninety-five percent credible intervals for the CARR models in Table 4.

Dist. P BTC ETH
PK RPK,1h RPK,1min QPK(1,99) QPK(5,95) PK RPK,1h RPK,1min QPK(1,99) QPK(5,95)
1 Wei β 0 * 0.191, 0.562 0.200, 0.477 0.147, 0.323 0.145, 0.444 0.127, 0.446 0.457, 1.289 0.315, 0.890 0.301, 0.698 0.313, 1.039 0.324, 1.114
β 1 0.244, 0.634 0.449, 0.859 0.674, 0.954 0.234, 0.648 0.171, 0.562 0.167, 0.477 0.353, 0.735 0.507, 0.844 0.139, 0.454 0.105, 0.418
β 2 0.106, 0.568 0.008, 0.373 0.002, 0.173 0.109, 0.584 0.160, 0.660 0.024, 0.543 0.013, 0.439 0.003, 0.248 0.027, 0.611 0.020, 0.622
GG β 0 * 0.079, 0.562 0.095, 0.315 0.104, 0.274 0.053, 0.278 0.042, 0.278 0.238, 0.949 0.211, 0.641 0.237, 0.560 0.169, 0.764 0.179, 0.906
β 1 0.204, 0.540 0.377, 0.775 0.549, 0.897 0.191, 0.577 0.150, 0.505 0.167, 0.454 0.335, 0.673 0.485, 0.798 0.147, 0.423 0.110, 0.377
β 2 0.279, 0.703 0.077, 0.525 0.010, 0.344 0.265, 0.733 0.324, 0.788 0.121, 0.659 0.087, 0.511 0.014, 0.332 0.138, 0.689 0.094, 0.713
GB2 β 0 * 0.015, 0.209 0.033, 0.136 0.041, 0.122 0.006, 0.015 0.004, 0.175 0.124, 0.560 0.119, 0.359 0.121, 0.317 0.117, 0.577 0.146, 0.875
β 1 0.129, 0.428 0.266, 0.517 0.318, 0.538 0.105, 0.412 0.088, 0.394 0.194, 0.432 0.311, 0.557 0.365, 0.626 0.190, 0.462 0.160, 0.462
β 2 0.533, 0.860 0.449, 0.713 0.414, 0.641 0.553, 0.887 0.570, 0.906 0.494, 0.758 0.375, 0.626 0.283, 0.558 0.455, 0.758 0.429, 0.780
2 Wei β 0 * 0.057, 0.271 0.066, 0.260 0.044, 0.174 0.050, 0.246 0.039, 0.246 0.121, 0.711 0.168, 0.547 0.076, 0.348 0.099, 0.573 0.097, 0.570
β 1 0.184, 0.465 0.459, 0.904 0.485, 0.864 0.144, 0.427 0.092, 0.320 0.214, 0.591 0.548, 0.979 0.472, 0.968 0.177, 0.499 0.124, 0.388
β 2 0.458, 0.766 0.059, 0.520 0.114, 0.497 0.474, 0.793 0.519, 0.842 0.243, 0.731 0.003, 0.418 0.011, 0.508 0.331, 0.764 0.396, 0.802
GG β 0 * 0.034, 0.163 0.054, 0.212 0.041, 0.152 0.026, 0.135 0.019, 0.125 0.122, 0.584 0.165, 0.545 0.110, 0.358 0.096, 0.487 0.099, 0.540
β 1 0.142, 0.355 0.385, 0.821 0.432, 0.821 0.123, 0.324 0.100, 0.273 0.177, 0.444 0.443, 0.884 0.490, 0.917 0.152, 0.401 0.126, 0.371
β 2 0.566, 0.805 0.129, 0.572 0.138, 0.535 0.593, 0.825 0.637, 0.855 0.365, 0.736 0.029, 0.473 0.027, 0.447 0.412, 0.763 0.392, 0.786
GB2 β 0 * 0.030, 0.125 0.031, 0.111 0.025, 0.092 0.023, 0.113 0.019, 0.116 0.133, 0.562 0.172, 0.509 0.139, 0.373 0.105, 0.515 0.116, 0.666
β 1 0.160, 0.318 0.233, 0.477 0.277, 0.543 0.150, 0.312 0.128, 0.292 0.204, 0.477 0.368, 0.735 0.430, 0.794 0.194, 0.471 0.173, 0.490
β 2 0.660, 0.820 0.493, 0.743 0.431, 0.702 0.662, 0.830 0.683, 0.852 0.462, 0.746 0.179, 0.558 0.130, 0.496 0.460, 0.756 0.414, 0.771
3 Wei β 0 * 0.185, 0.623 0.218, 0.517 0.211, 0.479 0.124, 0.507 0.107, 0.538 0.223, 0.655 0.227, 0.571 0.197, 0.491 0.192, 0.542 0.208, 0.597
β 1 0.179, 0.444 0.418, 0.751 0.479, 0.842 0.149, 0.402 0.119, 0.340 0.243, 0.454 0.453, 0.735 0.515, 0.825 0.244, 0.456 0.211, 0.402
β 2 0.328, 0.709 0.138, 0.462 0.057, 0.402 0.361, 0.760 0.374, 0.797 0.426, 0.672 0.190, 0.481 0.114, 0.426 0.430, 0.665 0.440, 0.685
GG β 0 * 0.085, 0.335 0.177, 0.442 0.162, 0.385 0.059, 0.258 0.050, 0.258 0.172, 0.516 0.221, 0.529 0.188, 0.448 0.135, 0.446 0.116, 0.442
β 1 0.127, 0.281 0.344, 0.615 0.416, 0.709 0.120, 0.262 0.100, 0.233 0.207, 0.392 0.439, 0.701 0.510, 0.793 0.195, 0.377 0.169, 0.346
β 2 0.564, 0.807 0.223, 0.542 0.155, 0.476 0.603, 0.823 0.628, 0.852 0.480, 0.710 0.196, 0.468 0.125, 0.406 0.502, 0.731 0.530, 0.766
GB2 β 0 * 0.056, 0.222 0.125, 0.294 0.123, 0.284 0.047, 0.205 0.043, 0.212 0.120, 0.402 0.186, 0.438 0.176, 0.411 0.102, 0.362 0.091, 0.364
β 1 0.123, 0.255 0.244, 0.424 0.297, 0.518 0.127, 0.264 0.110, 0.243 0.190, 0.368 0.325, 0.548 0.402, 0.680 0.182, 0.363 0.160, 0.336
β 2 0.708, 0.853 0.486, 0.687 0.391, 0.629 0.692, 0.847 0.710, 0.866 0.581, 0.771 0.363, 0.607 0.237, 0.530 0.579, 0.778 0.601, 0.805
  1. *Re-scaled by multiplying by 10³.

Table 15:

Ninety-five percent credible intervals for the CARR models in Table 5.

P BTC ETH
C(1, 1) C(2, 1) C(1, 2) LC(1, 1) LC(1, 2) C(1, 1) C(2, 1) C(1, 2) LC(1, 1) LC(1, 2)
1 β 0 * 0.042, 0.121 0.043, 0.131 0.044, 0.125 0.007, 0.084 0.001, 0.009 0.122, 0.318 0.131, 0.363 0.119, 0.303 0.020, 0.209 0.016, 0.205
β 11 0.318, 0.538 0.311, 0.537 0.368, 0.595 0.205, 0.439 0.246, 0.498 0.364, 0.625 0.360, 0.631 0.397, 0.642 0.225, 0.488 0.253, 0.522
β 12 0.001, 0.093 0.001, 0.138
β 21 0.410, 0.642 0.374, 0.627 0.070, 0.436 0.466, 0.678 0.128, 0.493 0.283, 0.558 0.192, 0.532 0.049, 0.404 0.372, 0.620 0.109, 0.481
β 22 0.095, 0.384 0.083, 0.358 0.053, 0.333 0.043, 0.325
β 3 0.099, 0.838 0.089, 0.904 0.187, 1.209 0.210, 1.279
β 4 2.269, 9.193 0.247, 0.985 0.400, 1.325 0.387, 1.371
2 β 0 * 0.025, 0.091 0.027, 0.100 0.028, 0.093 0.000, 0.023 0.000, 0.028 0.139, 0.381 0.152, 0.418 0.135, 0.332 0.001, 0.075 0.001, 0.117
β 11 0.275, 0.542 0.273, 0.552 0.351, 0.587 0.162, 0.357 0.207, 0.438 0.431, 0.796 0.436, 0.820 0.467, 0.759 0.110, 0.339 0.156, 0.465
β 12 0.000, 0.074 0.001, 0.142
β 21 0.433, 0.704 0.387, 0.691 0.055, 0.398 0.569, 0.747 0.172, 0.561 0.124, 0.497 0.041, 0.459 0.003, 0.227 0.508, 0.738 0.102, 0.567
β 22 0.156, 0.446 0.085, 0.385 0.107, 0.361 0.067, 0.377
β 3 0.012, 0.473 0.016, 0.600 0.094, 0.834 0.108, 1.007
β 4 0.512, 1.093 0.524, 1.140 0.894, 1.591 0.877, 1.670
3 β 0 * 0.120, 0.285 0.127, 0.312 0.127, 0.279 0.074, 0.244 0.072, 0.233 0.177, 0.417 0.192, 0.467 0.170, 0.363 0.096, 0.344 0.078, 0.295
β 11 0.299, 0.515 0.296, 0.526 0.339, 0.540 0.268, 0.481 0.304, 0.501 0.402, 0.681 0.408, 0.705 0.434, 0.653 0.365, 0.631 0.390, 0.602
β 12 0.001, 0.071 0.001, 0.094
β 21 0.389, 0.630 0.337, 0.614 0.138, 0.443 0.410, 0.638 0.145, 0.436 0.236, 0.530 0.148, 0.504 0.062, 0.308 0.259, 0.528 0.078, 0.322
β 22 0.083, 0.330 0.096, 0.336 0.106, 0.317 0.110, 0.317
β 3 0.091, 0.762 0.035, 0.871 0.175, 1.026 0.188, 1.045
β 4 0.004, 0.374 0.005, 0.394 0.034, 0.671 0.038, 0.682
  1. *Re-scaled by multiplying by 10³; ⋆ re-scaled by multiplying by 10².

D LCARR and BCARR Model Results

Table 16:

Parameter estimates and DIC for LCARR and BCARR models compared across periods using the RPK, 1 m for BTC and ETH.

Period Parameter BTC ETH
LC(1, 1, b ) LC(1, 1, c ) BC(1, 1, 1) LC(1, 1, b ) LC(1, 1, c ) BC(1, 1, 1)
1 β 0 * 0.053 0.034 0.075 0.093 0.139 0.207
β 11 0.396 0.412 0.414 0.402 0.486 0.481
β 21 0.556 0.548 0.542 0.514 0.446 0.434
β 3 0.121 0.168 0.511 0.168
β 4 0.153 0.165 0.663 0.165
β 5 0.513 0.481
DIC −4,584.65 −4,588.20 −4,587.69 −4,278.49 −4,273.27 −4,275.24
2 β 0 * 0.011 0.018 0.053 0.082 0.154 0.245
β 11 0.322 0.378 0.392 0.390 0.572 0.603
β 21 0.653 0.605 0.584 0.528 0.367 0.318
β 3 0.065 0.030 0.173 0.095
β 4 0.222* 0.092* 0.808 0.244
β 5 0.376 0.382
DIC −4,566.65 −4,557.60 −4,552.35 −4,210.14 −4,196.19 −4,198.09
3 β 0 * 0.152 0.157 0.194 0.167 0.209 0.285
β 11 0.388 0.407 0.394 0.503 0.543 0.529
β 21 0.533 0.518 0.523 0.425 0.389 0.392
β 3 0.332 0.214 0.773 0.486
β 4 0.117 0.060 0.411 0.117
β 5 0.455 0.373
DIC −6,296.48 −6,295.07 −6,297.65 −5,962.18 −5,958.83 −5,958.49
  1. *Re-scaled by multiplying by 10³; ⋆ re-scaled by multiplying by 10².

References

Andersen, T. G., and T. Bollerslev. 1998. “Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts.” International Economic Review 39 (4): 885–905, https://doi.org/10.2307/2527343.

Artzner, P., F. Delbaen, J. M. Eber, and D. Heath. 1999. “Coherent Measures of Risk.” Mathematical Finance 9 (3): 203–28, https://doi.org/10.1111/1467-9965.00068.

Bollerslev, T. 1986. “Generalized Autoregressive Conditional Heteroskedasticity.” Journal of Econometrics 31 (3): 307–27, https://doi.org/10.1016/0304-4076(86)90063-1.

Bürkner, P.-C. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28, https://doi.org/10.18637/jss.v080.i01.

Chan, J. S., S. B. Choy, and C. P. Lam. 2014. “Modeling Electricity Price Using a Threshold Conditional Autoregressive Geometric Process Jump Model.” Communications in Statistics-Theory and Methods 43 (10–12): 2505–15, https://doi.org/10.1080/03610926.2013.788714.

Chan, J., S. Choy, U. Makov, and Z. Landsman. 2018. “Modelling Insurance Losses Using Contaminated Generalised Beta Type-II Distribution.” ASTIN Bulletin 48 (2): 871–904, https://doi.org/10.1017/asb.2017.37.

Chou, R. Y. T. 2005. “Forecasting Financial Volatilities with Extreme Values: The Conditional Autoregressive Range (CARR) Model.” Journal of Money, Credit, and Banking 37 (3): 561–82, https://doi.org/10.1353/mcb.2005.0027.

Choy, B. S., and J. S. Chan. 2008. “Scale Mixtures Distributions in Statistical Modelling.” Australian and New Zealand Journal of Statistics 50 (2): 135–46, https://doi.org/10.1111/j.1467-842x.2008.00504.x.

Cont, R. 2001. “Empirical Properties of Asset Returns: Stylized Facts and Statistical Issues.” Quantitative Finance 1 (2): 223–36, https://doi.org/10.1080/713665670.

Duane, S., A. D. Kennedy, B. J. Pendleton, and D. Roweth. 1987. “Hybrid Monte Carlo.” Physics Letters B 195 (2): 216–22, https://doi.org/10.1016/0370-2693(87)91197-x.

Engle, R. F. 1982. “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation.” Econometrica 50 (4): 987–1007, https://doi.org/10.2307/1912773.

Engle, R. F., and J. R. Russell. 1998. “Autoregressive Conditional Duration: A New Model for Irregularly Spaced Transaction Data.” Econometrica 66 (5): 1127–62, https://doi.org/10.2307/2999632.

Fung, T., and E. Seneta. 2007. “Tailweight, Quantiles and Kurtosis: A Study of Competing Distributions.” Operations Research Letters 35 (4): 448–54, https://doi.org/10.1016/j.orl.2006.07.003.

Hoffman, M. D., and A. Gelman. 2014. “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo.” The Journal of Machine Learning Research 15 (1): 1593–623.

Jorion, P. 1996. “Risk²: Measuring the Risk in Value at Risk.” Financial Analysts Journal 52 (6): 47–56, https://doi.org/10.2469/faj.v52.n6.2039.

Koenker, R., and G. Bassett Jr. 1978. “Regression Quantiles.” Econometrica 46 (1): 33–50, https://doi.org/10.2307/1913643.

Kupiec, P. H. 1995. “Techniques for Verifying the Accuracy of Risk Measurement Models.” The Journal of Derivatives 3 (2): 73–84, https://doi.org/10.3905/jod.1995.407942.

Madan, D. B., and E. Seneta. 1990. “The Variance Gamma (V.G.) Model for Share Market Returns.” The Journal of Business 63: 511–24, https://doi.org/10.1086/296519.

Martens, M., and D. van Dijk. 2007. “Measuring Volatility with the Realized Range.” Journal of Econometrics 138 (1): 181–207, https://doi.org/10.1016/j.jeconom.2006.05.019.

McDonald, J. B., and Y. J. Xu. 1995. “A Generalization of the Beta Distribution with Applications.” Journal of Econometrics 66 (1–2): 133–52, https://doi.org/10.1016/0304-4076(94)01612-4.

Neal, R. M. 1994. “An Improved Acceptance Procedure for the Hybrid Monte Carlo Algorithm.” Journal of Computational Physics 111 (1): 194–203, https://doi.org/10.1006/jcph.1994.1054.

Nitithumbundit, T., and J. S. Chan. 2022. “COVID-19 Impact on Cryptocurrencies Market Using Multivariate Time Series Models.” The Quarterly Review of Economics and Finance 86: 365–75, https://doi.org/10.1016/j.qref.2022.08.006.

Parkinson, M. 1980. “The Extreme Value Method for Estimating the Variance of the Rate of Return.” The Journal of Business 53: 61–5, https://doi.org/10.1086/296071.

Shao, X. D., Y. J. Lian, and L. Q. Yin. 2009. “Forecasting Value-at-Risk Using High Frequency Data: The Realized Range Model.” Global Finance Journal 20 (2): 128–36, https://doi.org/10.1016/j.gfj.2008.11.003.

Spiegelhalter, D. J., N. G. Best, B. P. Carlin, and A. van der Linde. 2002. “Bayesian Measures of Model Complexity and Fit.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (4): 583–639, https://doi.org/10.1111/1467-9868.00353.

Storti, G., and C. Vitale. 2003. “BL-GARCH Models and Asymmetries in Volatility.” Statistical Methods and Applications 12 (1): 19–39, https://doi.org/10.1007/bf02511581.

Tan, S. K., K. H. Ng, and J. S. K. Chan. 2022. “Predicting Returns, Volatilities and Correlations of Stock Indices Using Multivariate Conditional Autoregressive Range and Return Models.” Mathematics 11 (1): 13, https://doi.org/10.3390/math11010013.

Tan, S. K., K. H. Ng, J. S. K. Chan, and I. Mohamed. 2019. “Quantile Range-Based Volatility Measure for Modelling and Forecasting Volatility Using High Frequency Data.” The North American Journal of Economics and Finance 47: 537–51, https://doi.org/10.1016/j.najef.2018.06.010.

Received: 2025-03-05
Accepted: 2025-11-23
Published Online: 2025-12-22

© 2025 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
