Integrated variance of irregularly spaced high-frequency data: A state space approach based on pre-averaging

Vitali Alexeev; Jun Chen; Katja Ignatieva

doi:10.1515/snde-2021-0093

Published/Copyright: March 20, 2023

From the journal Studies in Nonlinear Dynamics & Econometrics Volume 27 Issue 5

Abstract

We propose a new state space model to estimate the Integrated Variance (IV) in the presence of microstructure noise. Applying the pre-averaging sampling scheme to the irregularly spaced high-frequency data, we derive equidistant efficient price approximations to calculate the noise-contaminated realised variance (NCRV), which is used as an IV estimator. The theoretical properties of the new volatility estimator are illustrated and compared with those of the realised volatility. We highlight the robustness of the new estimator to market microstructure noise (MMN). The pre-averaging sampling effectively eliminates the influence of the MMN component on the NCRV series. The empirical illustration features the EUR/USD exchange rate and provides evidence of a superior performance in volatility forecasting at very high sampling frequencies.

Keywords: high-frequency data; integrated variance; pre-averaging; sampling scheme

Corresponding author: Jun Chen, School of Risk and Actuarial Studies, Business School, UNSW Sydney, Sydney, NSW 2052, Australia, E-mail: chen.jun@unsw.edu.au

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Appendix A. Proofs

A.1 Proof of Lemma 3.1

Given the equidistant partition in Eq. (3.1), for any two adjacent true prices $x_{i,j-1}$ and $x_{i,j}$ within a sub-interval $(\tau_i, \tau_{i+1}]$ on day $T$, the true return is defined as

$$r_{i,j} = x_{i,j} - x_{i,j-1}.$$

To prove Lemma 3.1, we require the values of $\operatorname{Var}(r_{i,j}^2)$, $\operatorname{Var}(r_{i,p} r_{i,q})$ for $p \neq q$, and $\operatorname{Cov}(r_{i,p}^2, r_{i,p} r_{i,q})$ for $p \neq q$.

The definition of the RV shows that

$$r_{i,j}^2 = (x_{i,j} - x_{i,j-1})^2 = \sigma_{i,j}^2 + d_{i,j}, \tag{A.1}$$

where $\sigma_{i,j}^2$ is the true value of the IV over the period $(t_{i,j-1}, t_{i,j}]$, $d_{i,j}$ is the discretised error, and $\delta_{i,j} = t_{i,j} - t_{i,j-1}$. The expectation of the discretised error is zero, that is, $E[d_{i,j}] = 0$. Since the IV $\sigma_T^2$ is assumed to be constant throughout day $T$, it holds that

$$\sigma_{i,j}^2 = \delta_{i,j}\,\sigma_T^2.$$

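Given the assumption that the price process follows a one-factor SR-SARV model, we can use the results in Barndorff-Nielsen and Shephard (2002) to calculate the variance as

$$\operatorname{Var}(d_{i,j}) = 2\left[\operatorname{Var}(\sigma_{i,j}^2) + \left(E[\sigma_{i,j}^2]\right)^2\right] = 2\left(\sigma_T^2\,\delta_{i,j}\right)^2 + \frac{4\omega_1^2}{\lambda_1^2}\,\delta_{i,j}\left[\exp(-\lambda_1\delta_{i,j}) - 1 + \lambda_1\delta_{i,j}\right],$$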

where $\lambda_1$ and $\omega_1$ are parameters of the SR-SARV model. Under the assumption that the IV is constant throughout a trading day, the price process within a trading day follows a Brownian motion with $\lambda_1 = 0$ and $\omega_1 = 0$. Hence,

$$\operatorname{Var}(d_{i,j}) = 2\left(\sigma_T^2\,\delta_{i,j}\right)^2.$$

The result above is the conditional variance of the discretised error $d_{i,j}$ for a given $\delta_{i,j}$, i.e., $\operatorname{Var}(d_{i,j} \mid \delta_{i,j})$. Since the arrival of observations within $(\tau_i, \tau_{i+1}]$ is assumed to follow a Poisson process (Assumption 3), when $l_i$, the number of observations within $(\tau_i, \tau_{i+1}]$, is known, $\delta_{i,j}$, measured relative to the sub-interval length $1/m$, follows a beta distribution with $\alpha = 1$ and $\beta = l_i - 1$. The law of total variance then shows that

$$\operatorname{Var}(d_{i,j}) = E\left[\operatorname{Var}(d_{i,j} \mid \delta_{i,j})\right] + \operatorname{Var}\left(E[d_{i,j} \mid \delta_{i,j}]\right) = E\left[\operatorname{Var}(d_{i,j} \mid \delta_{i,j})\right] + 0 = 2\left(\frac{\sigma_T^2}{m}\right)^2 \frac{2}{l_i^2 + l_i},$$

where $m$ is the number of sub-intervals in the partition of Eq. (3.1). Based on Eq. (A.1) and under the assumption of a constant IV throughout trading day $T$, we have

$$\operatorname{Var}(r_{i,j}^2) = \operatorname{Var}(d_{i,j}) = 2\left(\frac{\sigma_T^2}{m}\right)^2 \frac{2}{l_i^2 + l_i}.$$
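The step from the conditional to the unconditional variance only needs the first two raw moments of a Beta(1, $l_i - 1$) variable. The following is a minimal sketch of that calculation in exact rational arithmetic; the function names and the example values of `l_i` and `m` are illustrative assumptions, not part of the original derivation.

```python
from fractions import Fraction


def beta_raw_moment(alpha: int, beta: int, k: int) -> Fraction:
    """k-th raw moment of Beta(alpha, beta): prod_{r<k} (alpha + r) / (alpha + beta + r)."""
    moment = Fraction(1)
    for r in range(k):
        moment *= Fraction(alpha + r, alpha + beta + r)
    return moment


def var_squared_return(sigma2_T: Fraction, m: int, l_i: int) -> Fraction:
    """Var(r_ij^2) = E[2 (sigma_T^2 * delta_ij)^2], assuming delta_ij = B / m with B ~ Beta(1, l_i - 1)."""
    second_moment = beta_raw_moment(1, l_i - 1, 2) / m**2  # E[delta_ij^2]
    return 2 * sigma2_T**2 * second_moment


l_i, m = 12, 288  # hypothetical counts, chosen only for illustration
assert beta_raw_moment(1, l_i - 1, 1) == Fraction(1, l_i)            # E[B]   = 1 / l_i
assert beta_raw_moment(1, l_i - 1, 2) == Fraction(2, l_i**2 + l_i)   # E[B^2] = 2 / (l_i^2 + l_i)
```

The first moment gives $E[\delta_{i,j}] = 1/(m l_i)$ and the second gives the factor $2/(l_i^2 + l_i)$ that appears in $\operatorname{Var}(r_{i,j}^2)$.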

To find the value of $\operatorname{Var}(r_{i,p} r_{i,q})$ for $p \neq q$, we use the unconditional expectation of $r_{i,j}^2$ computed in Section 3.2:

$$\operatorname{Var}(r_{i,j}) = E[r_{i,j}^2] = \frac{\sigma_T^2}{m\,l_i}.$$

On day $T$, using the property of Brownian motion that $r_{i,p}$ is independent of $r_{i,q}$ when $p \neq q$, we have

$$\operatorname{Var}(r_{i,p} r_{i,q}) = E[r_{i,p}^2]\,E[r_{i,q}^2] = \left(\frac{\sigma_T^2}{m\,l_i}\right)^2$$

and

$$\operatorname{Cov}(r_{i,p}^2, r_{i,p} r_{i,q}) = 0 \quad \text{for } p \neq q.$$

In Eq. (3.9), we defined the discretised error $d_T$ when the pre-averaging sampling scheme is employed. Since $\sigma_T$ and $\sigma_{T-1}$ are known and constant on the corresponding days, it holds that

$$\operatorname{Var}(d_T) = \operatorname{Var}\left(\overline{RV}_T^{*}\right).$$

To find the variance of $d_T$, the proof focuses on the elements of the $i$th sub-interval when expanding the variance of $\overline{RV}_T^{*}$. Similar results can be obtained for the other sub-intervals.
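For orientation, the quantity whose variance is being expanded can be sketched in a few lines: prices observed at irregular times are averaged within equidistant sub-intervals, and the realised variance is the sum of squared differences of consecutive averages. This is a minimal sketch under stated assumptions: the day is rescaled to $[0, 1]$, sub-intervals are $(\tau_i, \tau_{i+1}]$ with $\tau_i = i/m$, every sub-interval contains at least one observation, and the names `pre_averaged_prices`, `pre_averaged_rv` and `prev_bar` are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def pre_averaged_prices(times: Sequence[float], prices: Sequence[float], m: int) -> List[float]:
    """Mean price in each of m equal sub-intervals (i/m, (i+1)/m] of the day [0, 1]."""
    sums = [0.0] * m
    counts = [0] * m
    for t, x in zip(times, prices):
        i = min(max(math.ceil(t * m) - 1, 0), m - 1)  # sub-interval containing t
        sums[i] += x
        counts[i] += 1
    if 0 in counts:
        raise ValueError("every sub-interval needs at least one observation")
    return [s / c for s, c in zip(sums, counts)]


def pre_averaged_rv(bars: Sequence[float], prev_bar: Optional[float] = None) -> float:
    """Sum of squared differences of consecutive pre-averaged prices.

    prev_bar, if given, is the average of the last sub-interval of the previous day,
    so that the first return of the day spans the day boundary.
    """
    series = list(bars) if prev_bar is None else [prev_bar, *bars]
    return sum((b - a) ** 2 for a, b in zip(series, series[1:]))


bars = pre_averaged_prices([0.1, 0.2, 0.3, 0.4, 0.6, 0.9], [1.0, 3.0, 2.0, 4.0, 5.0, 7.0], m=2)
rv = pre_averaged_rv(bars, prev_bar=2.0)
```

In practice the inputs would be log prices; the toy numbers above are chosen only so that the averages are easy to follow by hand.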

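Under the pre-averaging sampling scheme, the elements of the $i$th sub-interval appear only in the term $(\bar{x}_i - \bar{x}_{i-1})^2 + (\bar{x}_{i+1} - \bar{x}_i)^2$. Expanding this term results in

$$\begin{aligned}
(\bar{x}_i - \bar{x}_{i-1})^2 + (\bar{x}_{i+1} - \bar{x}_i)^2
&= \left[\frac{1}{l_i} r_{i,l_i} + \cdots + \frac{l_i}{l_i} r_{i,1} + \frac{l_{i-1}-1}{l_{i-1}} r_{i-1,l_{i-1}} + \cdots + \frac{1}{l_{i-1}} r_{i-1,2} + \frac{0}{l_{i-1}} r_{i-1,1}\right]^2 \\
&\quad + \left[\frac{1}{l_{i+1}} r_{i+1,l_{i+1}} + \cdots + \frac{l_{i+1}}{l_{i+1}} r_{i+1,1} + \frac{l_i-1}{l_i} r_{i,l_i} + \cdots + \frac{1}{l_i} r_{i,2} + \frac{0}{l_i} r_{i,1}\right]^2 \\
&= A^2 + \underbrace{2A\left[\frac{1}{l_i} r_{i,l_i} + \cdots + \frac{l_i}{l_i} r_{i,1}\right]}_{\text{①}} + \underbrace{\left[\frac{1}{l_i} r_{i,l_i} + \cdots + \frac{l_i}{l_i} r_{i,1}\right]^2}_{\text{②}} \\
&\quad + \underbrace{\left[\frac{l_i-1}{l_i} r_{i,l_i} + \cdots + \frac{1}{l_i} r_{i,2} + \frac{0}{l_i} r_{i,1}\right]^2}_{\text{③}} + \underbrace{2B\left[\frac{l_i-1}{l_i} r_{i,l_i} + \cdots + \frac{1}{l_i} r_{i,2} + \frac{0}{l_i} r_{i,1}\right]}_{\text{④}} + B^2,
\end{aligned}$$

where $A$ and $B$ collect the elements that are not contained in the $i$th sub-interval. In the above equation,

$$\text{②} + \text{③} = \sum_{p=1}^{l_i}\left[\left(\frac{l_i-p+1}{l_i}\right)^2 + \left(\frac{p-1}{l_i}\right)^2\right] r_{i,p}^2 + 4\sum_{p=1}^{l_i}\sum_{q=p}^{l_i}\left[\frac{l_i-p+1}{l_i}\,\frac{l_i-q+1}{l_i} + \frac{p-1}{l_i}\,\frac{q-1}{l_i}\right] r_{i,p}\,r_{i,q}.$$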
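The weights in the expansion come from writing a difference of two sub-interval means as a weighted sum of tick-by-tick returns. The sketch below checks this identity exactly on a toy example; it assumes that $r_{i,1}$ is the return from the last price of sub-interval $i-1$ to the first price of sub-interval $i$, and the function names are hypothetical.

```python
from fractions import Fraction


def mean(xs):
    """Exact arithmetic mean of a list of integers or Fractions."""
    return sum(xs, Fraction(0)) / len(xs)


def weighted_return_form(prev_block, block):
    """x̄_i - x̄_{i-1} rebuilt from returns; prev_block and block hold the prices of sub-intervals i-1 and i."""
    l_prev, l_cur = len(prev_block), len(block)
    # r_{i,p}, p = 1..l_i; r_{i,1} spans the boundary from the last price of sub-interval i-1
    r_cur = [block[0] - prev_block[-1]] + [b - a for a, b in zip(block, block[1:])]
    # r_{i-1,q}, q = 2..l_{i-1}; r_{i-1,1} carries weight 0 and is omitted
    r_prev = [b - a for a, b in zip(prev_block, prev_block[1:])]
    cur_part = sum(Fraction(l_cur - p + 1, l_cur) * r for p, r in enumerate(r_cur, start=1))
    prev_part = sum(Fraction(q - 1, l_prev) * r for q, r in enumerate(r_prev, start=2))
    return cur_part + prev_part


prev_block = [3, 5, 4]      # toy prices in sub-interval i-1
block = [7, 6, 10, 9]       # toy prices in sub-interval i
assert weighted_return_form(prev_block, block) == mean(block) - mean(prev_block)
```

The return $r_{i,p}$ enters with weight $(l_i - p + 1)/l_i$ and $r_{i-1,q}$ with weight $(q-1)/l_{i-1}$, which are the coefficients appearing in the first bracket of the expansion.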

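Given the equations

$$\operatorname{Var}(r_{i,j}^2) = 2\left(\frac{\sigma_T^2}{m}\right)^2 \frac{2}{l_i^2 + l_i}, \qquad \operatorname{Var}(r_{i,p} r_{i,q}) = \left(\frac{\sigma_T^2}{m\,l_i}\right)^2, \qquad \operatorname{Cov}(r_{i,p}^2, r_{i,p} r_{i,q}) = 0 \ \text{ for } p \neq q,$$

we can calculate the variance of ② + ③ separately:

$$\operatorname{Var}(\text{②} + \text{③}) = \frac{7l_i^4 + 10l_i^2 - 2}{15\,l_i^3} \times 2\left(\frac{\sigma_T^2}{m}\right)^2 \frac{2}{l_i^2 + l_i} + \left[\frac{5l_i^4 + 11l_i^2 + 2}{9\,l_i^2} - 2\,\frac{7l_i^4 + 10l_i^2 - 2}{15\,l_i^3}\right] \times \left(\frac{\sigma_T^2}{m\,l_i}\right)^2.$$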
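The rational coefficients in this variance can be checked directly. The sketch below, with hypothetical function names, verifies in exact arithmetic that the coefficient of $\operatorname{Var}(r_{i,j}^2)$ is the sum of the squared weights on $r_{i,p}^2$, and that the coefficient of $\operatorname{Var}(r_{i,p} r_{i,q})$ collapses to the single fifth-degree polynomial ratio that appears in the final expression for $\operatorname{Var}(d_T)$.

```python
from fractions import Fraction


def sum_sq_diag_weights(l: int) -> Fraction:
    """sum_p a_p^2 with a_p = ((l-p+1)/l)^2 + ((p-1)/l)^2, the weight on r_{i,p}^2 in (2)+(3)."""
    return sum(
        (Fraction(l - p + 1, l) ** 2 + Fraction(p - 1, l) ** 2) ** 2
        for p in range(1, l + 1)
    )


def diag_closed_form(l: int) -> Fraction:
    """Closed form (7 l^4 + 10 l^2 - 2) / (15 l^3)."""
    return Fraction(7 * l**4 + 10 * l**2 - 2, 15 * l**3)


def cross_coefficient(l: int) -> Fraction:
    """Coefficient multiplying Var(r_ip r_iq), written as a difference of two ratios."""
    return Fraction(5 * l**4 + 11 * l**2 + 2, 9 * l**2) - 2 * diag_closed_form(l)


def cross_polynomial(l: int) -> Fraction:
    """The same coefficient written as a single ratio."""
    return Fraction(12 + 10 * l - 60 * l**2 + 55 * l**3 - 42 * l**4 + 25 * l**5, 45 * l**3)


for l in range(1, 30):
    assert sum_sq_diag_weights(l) == diag_closed_form(l)
    assert cross_coefficient(l) == cross_polynomial(l)
```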

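We can expand $A^2$ and $B^2$ to obtain similar results.

After expanding ①, we obtain

$$2A\left[\frac{1}{l_i} r_{i,l_i} + \cdots + \frac{l_i}{l_i} r_{i,1}\right] = 2\sum_{p=1}^{l_i}\sum_{q=1}^{l_{i-1}} \frac{l_i-p+1}{l_i}\,\frac{q-1}{l_{i-1}}\; r_{i,p}\, r_{i-1,q}.$$

Then,

$$\begin{aligned}
\operatorname{Var}\left(2A\left[\frac{1}{l_i} r_{i,l_i} + \cdots + \frac{l_i}{l_i} r_{i,1}\right]\right)
&= 4\operatorname{Var}\left(\sum_{p=1}^{l_i}\sum_{q=1}^{l_{i-1}} \frac{l_i-p+1}{l_i}\,\frac{q-1}{l_{i-1}}\; r_{i,p}\, r_{i-1,q}\right) \\
&= 4\sum_{p=1}^{l_i}\sum_{q=1}^{l_{i-1}} \left(\frac{l_i-p+1}{l_i}\right)^2\left(\frac{q-1}{l_{i-1}}\right)^2 \operatorname{Var}(r_{i,p}\, r_{i-1,q}) \\
&= \frac{(2l_i^2 + 3l_i + 1)(2l_{i-1}^2 - 3l_{i-1} + 1)}{9\,l_i\,l_{i-1}}\;\frac{\sigma_T^2}{m\,l_i}\;\frac{\sigma_T^2}{m\,l_{i-1}}.
\end{aligned}$$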
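Because the variance of each product of returns from adjacent sub-intervals is the same constant, the double sum factorises into two sums of squared weights. A minimal sketch of that check in exact arithmetic follows; the function names are hypothetical.

```python
from fractions import Fraction


def cross_interval_coefficient(l_i: int, l_prev: int) -> Fraction:
    """4 * sum_p ((l_i-p+1)/l_i)^2 * sum_q ((q-1)/l_prev)^2."""
    own = sum(Fraction(l_i - p + 1, l_i) ** 2 for p in range(1, l_i + 1))
    prev = sum(Fraction(q - 1, l_prev) ** 2 for q in range(1, l_prev + 1))
    return 4 * own * prev


def cross_interval_closed_form(l_i: int, l_prev: int) -> Fraction:
    """(2 l_i^2 + 3 l_i + 1)(2 l_prev^2 - 3 l_prev + 1) / (9 l_i l_prev)."""
    return Fraction(
        (2 * l_i**2 + 3 * l_i + 1) * (2 * l_prev**2 - 3 * l_prev + 1),
        9 * l_i * l_prev,
    )


for l_i in range(1, 15):
    for l_prev in range(1, 15):
        assert cross_interval_coefficient(l_i, l_prev) == cross_interval_closed_form(l_i, l_prev)
```

The coefficient vanishes when the previous sub-interval holds a single observation, since the only return in that sub-interval then carries weight zero.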

The same result can be obtained for ④. Finally, the variance of the discretised error can be expressed as

$$\begin{aligned}
\operatorname{Var}(d_T) &= \frac{-1 + 10\,l_{m-1}'^{2} - 15\,l_{m-1}'^{3} + 6\,l_{m-1}'^{4}}{30\,l_{m-1}'^{3}} \cdot 2\left(\frac{\sigma_{T-1}^2}{m}\right)^2 \frac{2}{l_{m-1}'^{2} + l_{m-1}'} \\
&\quad + \frac{6 + 5\,l_{m-1}' - 90\,l_{m-1}'^{2} + 155\,l_{m-1}'^{3} - 96\,l_{m-1}'^{4} + 20\,l_{m-1}'^{5}}{90\,l_{m-1}'^{3}} \left(\frac{\sigma_{T-1}^2}{m\,l_{m-1}'}\right)^2 \\
&\quad + \frac{(1 - 3\,l_{m-1}' + 2\,l_{m-1}'^{2})(1 + 3\,l_0 + 2\,l_0^2)}{18\,l_{m-1}'\,l_0}\;\frac{\sigma_{T-1}^2}{m\,l_{m-1}'}\;\frac{\sigma_T^2}{m\,l_0} \\
&\quad + \sum_{i=0}^{m-2} \frac{(1 - 3\,l_{i-1} + 2\,l_{i-1}^2)(1 + 3\,l_i + 2\,l_i^2)}{18\,l_{i-1}\,l_i}\;\frac{\sigma_T^2(i)}{m\,l_{i-1}}\;\frac{\sigma_T^2}{m\,l_i} \\
&\quad + \sum_{i=0}^{m-2} \frac{-2 + 10\,l_i^2 + 7\,l_i^4}{15\,l_i^3} \cdot 2\left(\frac{\sigma_T^2}{m}\right)^2 \frac{2}{l_i^2 + l_i} \\
&\quad + \sum_{i=0}^{m-2} \frac{12 + 10\,l_i - 60\,l_i^2 + 55\,l_i^3 - 42\,l_i^4 + 25\,l_i^5}{45\,l_i^3} \left(\frac{\sigma_T^2}{m\,l_i}\right)^2 \\
&\quad + \sum_{i=0}^{m-2} \frac{(1 - 3\,l_i + 2\,l_i^2)(1 + 3\,l_{i+1} + 2\,l_{i+1}^2)}{18\,l_i\,l_{i+1}}\;\frac{\sigma_T^2}{m\,l_i}\;\frac{\sigma_T^2}{m\,l_{i+1}} \\
&\quad + \frac{(1 - 3\,l_{m-2} + 2\,l_{m-2}^2)(1 + 3\,l_{m-1} + 2\,l_{m-1}^2)}{18\,l_{m-2}\,l_{m-1}}\;\frac{\sigma_T^2}{m\,l_{m-2}}\;\frac{\sigma_T^2}{m\,l_{m-1}} \\
&\quad + \frac{-1 + 10\,l_{m-1}^2 + 15\,l_{m-1}^3 + 6\,l_{m-1}^4}{30\,l_{m-1}^3} \cdot 2\left(\frac{\sigma_T^2}{m}\right)^2 \frac{2}{l_{m-1}^2 + l_{m-1}} \\
&\quad + \frac{6 + 5\,l_{m-1} - 30\,l_{m-1}^2 - 25\,l_{m-1}^3 + 24\,l_{m-1}^4 + 20\,l_{m-1}^5}{90\,l_{m-1}^3} \left(\frac{\sigma_T^2}{m\,l_{m-1}}\right)^2.
\end{aligned}$$

We note that $\sigma_T^2(i) = \sigma_{T-1}^2$ when $i = 0$, and that the number of observations within the last sub-interval on day $T-1$ is $l_{0-1} = l_{m-1}'$.

A.2 Proof of Lemma 3.2

Let $\theta_i = \frac{1}{l_{i-1}}\sum_{j=1}^{l_{i-1}} \epsilon_{i-1,j}$. Since the $\epsilon_{i-1,j}$ are i.i.d. random variables under Assumption 2, $\theta_i$ has the following properties:

$$E[\theta_i] = \frac{1}{l_{i-1}}\sum_{j=1}^{l_{i-1}} E[\epsilon_{i-1,j}] = 0, \qquad \operatorname{Var}[\theta_i] = \frac{1}{l_{i-1}^2}\operatorname{Var}\left(\sum_{j=1}^{l_{i-1}} \epsilon_{i-1,j}\right) = \frac{\sigma_\epsilon^2}{l_{i-1}}, \tag{A.2}$$

$$\operatorname{Cov}(\theta_i, \theta_{i-1}) = E\left[(\theta_i - E[\theta_i])(\theta_{i-1} - E[\theta_{i-1}])\right] = E\left[\left(\frac{1}{l_{i-1}}\sum_{j=1}^{l_{i-1}} \epsilon_{i-1,j}\right)\left(\frac{1}{l_{i-2}}\sum_{j=1}^{l_{i-2}} \epsilon_{i-2,j}\right)\right] = 0, \tag{A.3}$$

in other words, the $\theta_i$ are mutually independent zero-mean random variables. Given the definition $e_i = \theta_i - \theta_{i-1}$ in Eq. (3.5) and the properties of $\theta_i$ in Eqs. (A.2) and (A.3), we can complete the proof of Lemma 3.2 as follows:

$$\begin{aligned}
E(e_i) &= E[\theta_i - \theta_{i-1}] = 0, \\
\operatorname{Var}(e_i) &= \operatorname{Var}(\theta_i - \theta_{i-1}) = \operatorname{Var}(\theta_i) + \operatorname{Var}(\theta_{i-1}) = \left(\frac{1}{l_{i-1}} + \frac{1}{l_{i-2}}\right)\sigma_\epsilon^2, \\
\operatorname{Cov}(e_i, e_{i-1}) &= E\left[(e_i - E(e_i))(e_{i-1} - E(e_{i-1}))\right] = E\left[(\theta_i - \theta_{i-1})(\theta_{i-1} - \theta_{i-2})\right] = -E[\theta_{i-1}^2] = -\frac{1}{l_{i-2}}\sigma_\epsilon^2.
\end{aligned}$$
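The covariance structure of the noise increments follows from bilinearity alone, which the following minimal sketch makes explicit. It assumes the $\theta_k$ are independent with $\operatorname{Var}(\theta_k) = \sigma_\epsilon^2 / l_{k-1}$; the function name, the list `counts` (holding $l_0, l_1, \ldots$) and the example values are hypothetical.

```python
from fractions import Fraction


def noise_increment_cov(i: int, j: int, counts, sigma2_eps: Fraction) -> Fraction:
    """Cov(e_i, e_j) for e_i = theta_i - theta_{i-1}, with independent theta_k (requires i, j >= 2)."""

    def var_theta(k: int) -> Fraction:
        # theta_k averages the l_{k-1} noise terms of sub-interval k-1
        return sigma2_eps / counts[k - 1]

    coef_i = {i: 1, i - 1: -1}  # e_i as a linear combination of the theta_k
    coef_j = {j: 1, j - 1: -1}
    return sum(
        (c * coef_j[k] * var_theta(k) for k, c in coef_i.items() if k in coef_j),
        Fraction(0),
    )


counts = [4, 2, 5, 3]          # hypothetical l_0, ..., l_3
sigma2 = Fraction(1)
var_e3 = noise_increment_cov(3, 3, counts, sigma2)   # (1/l_2 + 1/l_1) * sigma2
cov_e3_e2 = noise_increment_cov(3, 2, counts, sigma2)  # -sigma2 / l_1
cov_e4_e2 = noise_increment_cov(4, 2, counts, sigma2)  # no shared theta, hence zero
```

Only adjacent increments share a $\theta$ term, so the sequence $e_i$ has a first-order moving-average covariance structure.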

A.3 Proof of Lemma 3.3

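Since the MMN is independent of the price process, we can write

$$E(u_T) = E\left[2\sum_{i=1}^{m} \bar{r}_i^{*} e_i\right] + E\left[\sum_{i=1}^{m} e_i^2\right] = \sum_{i=1}^{m} E[e_i^2].$$

Lemma 3.2 shows that $E(e_i) = 0$ and $\operatorname{Var}(e_i) = E[e_i^2] = \left(\frac{1}{l_{i-1}} + \frac{1}{l_{i-2}}\right)\sigma_\epsilon^2$, such that

$$E(u_T) = \sum_{i=1}^{m}\left(\frac{1}{l_{i-1}} + \frac{1}{l_{i-2}}\right)\sigma_\epsilon^2 = \left(\frac{1}{l_{m-1}'} + \sum_{i=1}^{m-1}\frac{2}{l_{i-1}} + \frac{1}{l_{m-1}}\right)\sigma_\epsilon^2.$$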
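The second equality is a re-indexing of the sum in which every interior sub-interval is counted twice and the two boundary sub-intervals once. A minimal sketch that checks the two forms against each other, assuming the convention $l_{-1} = l_{m-1}'$ for the last sub-interval of day $T-1$ (function and argument names are hypothetical):

```python
from fractions import Fraction


def expected_noise_direct(counts, l_prev_last: int, sigma2_eps: Fraction) -> Fraction:
    """sum_{i=1}^{m} (1/l_{i-1} + 1/l_{i-2}) * sigma_eps^2, with l_{-1} = l'_{m-1}."""
    m = len(counts)

    def l(k: int) -> int:
        return l_prev_last if k < 0 else counts[k]

    total = sum(Fraction(1, l(i - 1)) + Fraction(1, l(i - 2)) for i in range(1, m + 1))
    return total * sigma2_eps


def expected_noise_collapsed(counts, l_prev_last: int, sigma2_eps: Fraction) -> Fraction:
    """(1/l'_{m-1} + sum_{i=1}^{m-1} 2/l_{i-1} + 1/l_{m-1}) * sigma_eps^2."""
    m = len(counts)
    inner = sum((Fraction(2, counts[i - 1]) for i in range(1, m)), Fraction(0))
    return (Fraction(1, l_prev_last) + inner + Fraction(1, counts[m - 1])) * sigma2_eps


counts = [4, 2, 5, 3]   # hypothetical l_0, ..., l_{m-1} on day T
assert expected_noise_direct(counts, 6, Fraction(1)) == expected_noise_collapsed(counts, 6, Fraction(1))
```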

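Again, using the independence between the MMN and the price process, we have

$$\begin{aligned}
\operatorname{Cov}(u_T, u_T) &= \operatorname{Cov}\left(2\sum_{i=1}^{m} \bar{r}_i^{*} e_i + \sum_{i=1}^{m} e_i^2,\; 2\sum_{i=1}^{m} \bar{r}_i^{*} e_i + \sum_{i=1}^{m} e_i^2\right) \\
&= \operatorname{Cov}\left(2\sum_{i=1}^{m} \bar{r}_i^{*} e_i,\; 2\sum_{i=1}^{m} \bar{r}_i^{*} e_i\right) + \operatorname{Cov}\left(\sum_{i=1}^{m} e_i^2,\; \sum_{i=1}^{m} e_i^2\right) \\
&= \underbrace{4\sum_{i=1}^{m} \operatorname{Cov}\left(\bar{r}_i^{*} e_i,\; \bar{r}_i^{*} e_i\right)}_{\text{①}} + \underbrace{8\sum_{i=2}^{m} \operatorname{Cov}\left(\bar{r}_i^{*} e_i,\; \bar{r}_{i-1}^{*} e_{i-1}\right)}_{\text{②}}
\end{aligned} \tag{A.4}$$

$$\qquad\qquad + \underbrace{\sum_{i=1}^{m} \operatorname{Cov}\left(e_i^2,\; e_i^2\right)}_{\text{③}} + \underbrace{2\sum_{i=2}^{m} \operatorname{Cov}\left(e_i^2,\; e_{i-1}^2\right)}_{\text{④}}. \tag{A.5}$$

We proceed by calculating each part of Eqs. (A.4) and (A.5) separately. For ①, since $E(e_i) = 0$, we have

$$\operatorname{Cov}\left(\bar{r}_i^{*} e_i,\; \bar{r}_i^{*} e_i\right) = E\left[\bar{r}_i^{*} e_i\, \bar{r}_i^{*} e_i\right] = E\left[\bar{r}_i^{*2}\right] E\left[e_i^2\right].$$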

Using the methodology outlined in Alexeev, Chen, and Ignatieva (2021), $E[\bar{r}_i^{*2}]$ computed for day $T$ has the following expression:

$$E\left[\bar{r}_i^{*2}\right] = E\left[\left(\frac{1}{l_{i-1}}\sum_{j=1}^{l_{i-1}} x_{i-1,j} - \frac{1}{l_{i-2}}\sum_{j=1}^{l_{i-2}} x_{i-2,j}\right)^2\right] = \left[\frac{2l_{i-1}^2 + 3l_{i-1} + 1}{6\,l_{i-1}^2} + \frac{2l_{i-2}^2 - 3l_{i-2} + 1}{6\,l_{i-2}^2}\right]\frac{\sigma_T^2}{m},$$

and

$$E\left[\bar{r}_1^{*2}\right] = E\left[\left(\frac{1}{l_0}\sum_{j=1}^{l_0} x_{0,j} - \frac{1}{l_{m-1}'}\sum_{j=1}^{l_{m-1}'} x_{m-1,j}^{(T-1)}\right)^2\right] = \frac{2l_0^2 + 3l_0 + 1}{6\,l_0^2}\,\frac{\sigma_T^2}{m} + \frac{2\,l_{m-1}'^{2} - 3\,l_{m-1}' + 1}{6\,l_{m-1}'^{2}}\,\frac{\sigma_{T-1}^2}{m},$$

where $x_{m-1,j}^{(T-1)}$ is the $j$th true price in the sub-interval $(\tau_{m-1}, \tau_m]$ on day $T-1$.

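Combining this result with Lemma 3.2, we obtain

$$\begin{aligned}
\text{①} &= 4\left[\frac{2l_0^2 + 3l_0 + 1}{6\,l_0^2}\,\frac{\sigma_T^2}{m} + \frac{2\,l_{m-1}'^{2} - 3\,l_{m-1}' + 1}{6\,l_{m-1}'^{2}}\,\frac{\sigma_{T-1}^2}{m}\right]\left(\frac{1}{l_0} + \frac{1}{l_{m-1}'}\right)\sigma_\epsilon^2 \\
&\quad + 4\sum_{i=2}^{m}\left[\frac{2l_{i-1}^2 + 3l_{i-1} + 1}{6\,l_{i-1}^2} + \frac{2l_{i-2}^2 - 3l_{i-2} + 1}{6\,l_{i-2}^2}\right]\left(\frac{1}{l_{i-1}} + \frac{1}{l_{i-2}}\right)\frac{\sigma_T^2\,\sigma_\epsilon^2}{m}.
\end{aligned}$$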

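For part ②, using the independence of the random variables $\theta_i$ established in the proof of Lemma 3.2, we obtain

$$\operatorname{Cov}\left(\bar{r}_i^{*} e_i,\; \bar{r}_{i-1}^{*} e_{i-1}\right) = E\left[\bar{r}_i^{*}\,\bar{r}_{i-1}^{*}\, e_i\, e_{i-1}\right] = -E\left[\bar{r}_i^{*}\,\bar{r}_{i-1}^{*}\,\theta_{i-1}^2\right] = -\operatorname{Cov}\left(\bar{r}_i^{*}, \bar{r}_{i-1}^{*}\right)\operatorname{Var}\left(\theta_{i-1}\right). \tag{A.6}$$

In Eq. (A.6),

$$\operatorname{Cov}\left(\bar{r}_i^{*}, \bar{r}_{i-1}^{*}\right) = E\left[\left(\frac{1}{l_{i-1}}\sum_{j=1}^{l_{i-1}} x_{i-1,j} - \frac{1}{l_{i-2}}\sum_{j=1}^{l_{i-2}} x_{i-2,j}\right)\left(\frac{1}{l_{i-2}}\sum_{j=1}^{l_{i-2}} x_{i-2,j} - \frac{1}{l_{i-3}}\sum_{j=1}^{l_{i-3}} x_{i-3,j}\right)\right] = \frac{l_{i-2}^2 - 1}{6\,l_{i-2}^2}\,\frac{\sigma_T^2}{m}.$$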
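The rational coefficients in the second moment and the lag-one covariance of the pre-averaged returns can be recovered from the return weights. The sketch below is a check under stated assumptions rather than the authors' derivation: it treats the tick returns of a sub-interval with $l$ observations as uncorrelated with variance $\sigma_T^2/(m\,l)$ and a constant $\sigma_T^2$, and the function names are hypothetical. Both results are in units of $\sigma_T^2/m$.

```python
from fractions import Fraction


def second_moment_coeff(l_cur: int, l_prev: int) -> Fraction:
    """Coefficient of sigma_T^2 / m in E[r̄*^2]; l_cur, l_prev stand for l_{i-1}, l_{i-2}."""
    own = sum(Fraction(l_cur - p + 1, l_cur) ** 2 for p in range(1, l_cur + 1)) / l_cur
    prev = sum(Fraction(q - 1, l_prev) ** 2 for q in range(1, l_prev + 1)) / l_prev
    return own + prev


def lag_one_cov_coeff(l_shared: int) -> Fraction:
    """Coefficient of sigma_T^2 / m in Cov(r̄*_i, r̄*_{i-1}); l_shared stands for l_{i-2}."""
    l = l_shared
    return sum(Fraction(l - p + 1, l) * Fraction(p - 1, l) for p in range(1, l + 1)) / l


for a in range(1, 15):
    assert lag_one_cov_coeff(a) == Fraction(a**2 - 1, 6 * a**2)
    for b in range(1, 15):
        closed = Fraction(2 * a**2 + 3 * a + 1, 6 * a**2) + Fraction(2 * b**2 - 3 * b + 1, 6 * b**2)
        assert second_moment_coeff(a, b) == closed
```

The lag-one covariance arises only from the sub-interval that two consecutive pre-averaged returns share, where each tick return carries the weight $(l - p + 1)/l$ in one return and $(p - 1)/l$ in the other.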

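Therefore,

$$\operatorname{Cov}\left(\bar{r}_i^{*} e_i,\; \bar{r}_{i-1}^{*} e_{i-1}\right) = -\frac{l_{i-2}^2 - 1}{6\,l_{i-2}^2}\,\frac{\sigma_T^2}{m}\,\frac{\sigma_\epsilon^2}{l_{i-2}}$$

and

$$\text{②} = -8\sum_{i=2}^{m} \frac{l_{i-2}^2 - 1}{6\,l_{i-2}^2}\,\frac{\sigma_T^2}{m}\,\frac{\sigma_\epsilon^2}{l_{i-2}}.$$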

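For part ③,

$$\begin{aligned}
\operatorname{Cov}\left(e_i^2, e_i^2\right) = \operatorname{Var}\left(e_i^2\right) &= \operatorname{Var}\left[(\theta_i - \theta_{i-1})^2\right] \\
&= \operatorname{Var}\left(\theta_i^2\right) + 4\operatorname{Var}\left(\theta_i\theta_{i-1}\right) + \operatorname{Var}\left(\theta_{i-1}^2\right) \\
&\quad + 2\operatorname{Cov}\left(\theta_i^2, -2\theta_i\theta_{i-1}\right) + 2\operatorname{Cov}\left(\theta_i^2, \theta_{i-1}^2\right) + 2\operatorname{Cov}\left(\theta_{i-1}^2, -2\theta_i\theta_{i-1}\right),
\end{aligned}$$

where

$$\begin{aligned}
\operatorname{Var}\left(\theta_i^2\right) &= \operatorname{Var}\left[\left(\frac{1}{l_{i-1}}\sum_{j=1}^{l_{i-1}} \epsilon_{i-1,j}\right)^2\right] = \frac{1}{l_{i-1}^4}\operatorname{Var}\left[\left(\sum_{j=1}^{l_{i-1}} \epsilon_{i-1,j}\right)^2\right] \\
&= \frac{1}{l_{i-1}^4}\operatorname{Var}\left[\sum_{j=1}^{l_{i-1}} \epsilon_{i-1,j}^2 + 2\sum_{p=1}^{l_{i-1}-1}\sum_{q=p+1}^{l_{i-1}} \epsilon_{i-1,p}\,\epsilon_{i-1,q}\right] \\
&= \frac{1}{l_{i-1}^4}\sum_{j=1}^{l_{i-1}} \operatorname{Var}\left(\epsilon_{i-1,j}^2\right) + \frac{4}{l_{i-1}^4}\sum_{p=1}^{l_{i-1}-1}\sum_{q=p+1}^{l_{i-1}} \operatorname{Var}(\epsilon_{i-1,p})\operatorname{Var}(\epsilon_{i-1,q}) \\
&= \frac{1}{l_{i-1}^3}\,\omega_\epsilon^2 + \frac{2(l_{i-1} - 1)}{l_{i-1}^3}\,\sigma_\epsilon^4
\end{aligned}$$

and

$$\operatorname{Var}\left(\theta_i\theta_{i-1}\right) = E\left[\theta_i^2\right] E\left[\theta_{i-1}^2\right] = \frac{\sigma_\epsilon^4}{l_{i-1}\,l_{i-2}},$$

using the independence of the $\theta_i$. Since the three covariance terms vanish by independence and $E[\theta_i] = 0$, we have

$$\begin{aligned}
\operatorname{Cov}\left(e_i^2, e_i^2\right) &= \operatorname{Var}\left(\theta_i^2\right) + 4\operatorname{Var}\left(\theta_i\theta_{i-1}\right) + \operatorname{Var}\left(\theta_{i-1}^2\right) \\
&= \left(\frac{1}{l_{i-1}^3} + \frac{1}{l_{i-2}^3}\right)\omega_\epsilon^2 + \left(\frac{2(l_{i-1} - 1)}{l_{i-1}^3} + \frac{2(l_{i-2} - 1)}{l_{i-2}^3}\right)\sigma_\epsilon^4 + \frac{4\,\sigma_\epsilon^4}{l_{i-1}\,l_{i-2}}
\end{aligned}$$

and

$$\text{③} = \sum_{i=1}^{m}\left[\left(\frac{1}{l_{i-1}^3} + \frac{1}{l_{i-2}^3}\right)\omega_\epsilon^2 + \left(\frac{2(l_{i-1} - 1)}{l_{i-1}^3} + \frac{2(l_{i-2} - 1)}{l_{i-2}^3}\right)\sigma_\epsilon^4 + \frac{4\,\sigma_\epsilon^4}{l_{i-1}\,l_{i-2}}\right].$$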
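As a sanity check on the expression for the variance of $\theta_i^2$, consider the Gaussian special case. The worked equation below assumes that $\omega_\epsilon^2$ denotes $\operatorname{Var}(\epsilon^2)$, which is how it enters the derivation above but is not defined in this appendix, and that $\epsilon \sim N(0, \sigma_\epsilon^2)$; both are assumptions made only for this illustration.

```latex
% Assumption: \omega_\epsilon^2 = Var(\epsilon^2) and \epsilon ~ N(0, \sigma_\epsilon^2),
% so that \omega_\epsilon^2 = E[\epsilon^4] - (E[\epsilon^2])^2 = 3\sigma_\epsilon^4 - \sigma_\epsilon^4 = 2\sigma_\epsilon^4.
\begin{aligned}
\operatorname{Var}(\theta_i^2)
  &= \frac{1}{l_{i-1}^3}\,\omega_\epsilon^2 + \frac{2(l_{i-1}-1)}{l_{i-1}^3}\,\sigma_\epsilon^4
   = \frac{2\sigma_\epsilon^4}{l_{i-1}^3} + \frac{2(l_{i-1}-1)\,\sigma_\epsilon^4}{l_{i-1}^3}
   = \frac{2\sigma_\epsilon^4}{l_{i-1}^2}, \\
% Direct route: \theta_i ~ N(0, \sigma_\epsilon^2 / l_{i-1}), and Var(Z^2) = 2 s^4 for Z ~ N(0, s^2).
\operatorname{Var}(\theta_i^2)
  &= 2\left(\frac{\sigma_\epsilon^2}{l_{i-1}}\right)^2 = \frac{2\sigma_\epsilon^4}{l_{i-1}^2}.
\end{aligned}
```

Both routes agree, which is consistent with the general formula in the Gaussian case.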

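For part ④, we have

$$\begin{aligned}
\operatorname{Cov}\left(e_i^2, e_{i-1}^2\right) &= E\left[\left(e_i^2 - E[e_i^2]\right)\left(e_{i-1}^2 - E[e_{i-1}^2]\right)\right] \\
&= E\left[\left(e_i^2 - \left(\tfrac{1}{l_{i-1}} + \tfrac{1}{l_{i-2}}\right)\sigma_\epsilon^2\right)\left(e_{i-1}^2 - \left(\tfrac{1}{l_{i-2}} + \tfrac{1}{l_{i-3}}\right)\sigma_\epsilon^2\right)\right] \\
&= E\left[e_i^2 e_{i-1}^2 - \left(\tfrac{1}{l_{i-1}} + \tfrac{1}{l_{i-2}}\right)\sigma_\epsilon^2\, e_{i-1}^2 - \left(\tfrac{1}{l_{i-2}} + \tfrac{1}{l_{i-3}}\right)\sigma_\epsilon^2\, e_i^2 + \left(\tfrac{1}{l_{i-1}} + \tfrac{1}{l_{i-2}}\right)\left(\tfrac{1}{l_{i-2}} + \tfrac{1}{l_{i-3}}\right)\sigma_\epsilon^4\right] \\
&= E\left[e_i^2 e_{i-1}^2\right] - \underbrace{\left(\tfrac{1}{l_{i-1}} + \tfrac{1}{l_{i-2}}\right)\left(\tfrac{1}{l_{i-2}} + \tfrac{1}{l_{i-3}}\right)\sigma_\epsilon^4}_{\alpha} \\
&= E\left[(\theta_i - \theta_{i-1})^2 e_{i-1}^2\right] - \alpha \\
&= E\left[\left(\theta_i^2 - 2\theta_i\theta_{i-1} + \theta_{i-1}^2\right) e_{i-1}^2\right] - \alpha \\
&= E\left[\theta_i^2 e_{i-1}^2\right] - 2E\left[\theta_i\theta_{i-1} e_{i-1}^2\right] + E\left[\theta_{i-1}^2 e_{i-1}^2\right] - \alpha \\
&= E\left[\theta_i^2\right] E\left[e_{i-1}^2\right] + E\left[\theta_{i-1}^2\left(\theta_{i-1}^2 - 2\theta_{i-1}\theta_{i-2} + \theta_{i-2}^2\right)\right] - \alpha \\
&= \frac{1}{l_{i-1}}\left(\frac{1}{l_{i-2}} + \frac{1}{l_{i-3}}\right)\sigma_\epsilon^4 + E\left[\theta_{i-1}^4\right] + E\left[\theta_{i-1}^2\theta_{i-2}^2\right] - \alpha \\
&= \frac{1}{l_{i-1}}\left(\frac{1}{l_{i-2}} + \frac{1}{l_{i-3}}\right)\sigma_\epsilon^4 + \operatorname{Var}\left(\theta_{i-1}^2\right) + \left(E\left[\theta_{i-1}^2\right]\right)^2 + E\left[\theta_{i-1}^2\theta_{i-2}^2\right] - \alpha \\
&= \frac{1}{l_{i-2}^3}\,\omega_\epsilon^2 + \frac{2(l_{i-2} - 1)}{l_{i-2}^3}\,\sigma_\epsilon^4.
\end{aligned}$$

Thus,

$$\text{④} = 2\sum_{i=2}^{m}\left[\frac{1}{l_{i-2}^3}\,\omega_\epsilon^2 + \frac{2(l_{i-2} - 1)}{l_{i-2}^3}\,\sigma_\epsilon^4\right].$$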
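The last step of the calculation for part ④ relies on every term other than $\operatorname{Var}(\theta_{i-1}^2)$ cancelling against $\alpha$. The sketch below checks that cancellation in exact arithmetic, in units of $\sigma_\epsilon^4$, using $E[\theta_k^2] = \sigma_\epsilon^2 / l_{k-1}$ and the independence of the $\theta_k$ from Lemma 3.2; the function name and argument names are hypothetical.

```python
from fractions import Fraction


def residual_after_alpha(l1: int, l2: int, l3: int) -> Fraction:
    """Non-variance terms of Cov(e_i^2, e_{i-1}^2) in units of sigma_eps^4.

    l1, l2, l3 stand for l_{i-1}, l_{i-2}, l_{i-3}.
    """
    a, b, c = Fraction(1, l1), Fraction(1, l2), Fraction(1, l3)
    first = a * (b + c)        # E[theta_i^2] * E[e_{i-1}^2]
    square = b * b             # (E[theta_{i-1}^2])^2
    mixed = b * c              # E[theta_{i-1}^2 * theta_{i-2}^2]
    alpha = (a + b) * (b + c)  # E[e_i^2] * E[e_{i-1}^2]
    return first + square + mixed - alpha


for l1 in range(1, 8):
    for l2 in range(1, 8):
        for l3 in range(1, 8):
            assert residual_after_alpha(l1, l2, l3) == 0
```

Since the residual is zero for all counts, only $\operatorname{Var}(\theta_{i-1}^2)$ survives, which depends on the shared sub-interval count $l_{i-2}$ alone.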

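Finally,

$$\begin{aligned}
\operatorname{Cov}(u_T, u_T) &= 4\left[\frac{2l_0^2 + 3l_0 + 1}{6\,l_0^2}\,\frac{\sigma_T^2}{m} + \frac{2\,l_{m-1}'^{2} - 3\,l_{m-1}' + 1}{6\,l_{m-1}'^{2}}\,\frac{\sigma_{T-1}^2}{m}\right]\left(\frac{1}{l_0} + \frac{1}{l_{m-1}'}\right)\sigma_\epsilon^2 \\
&\quad + 4\sum_{i=2}^{m}\left[\frac{2l_{i-1}^2 + 3l_{i-1} + 1}{6\,l_{i-1}^2} + \frac{2l_{i-2}^2 - 3l_{i-2} + 1}{6\,l_{i-2}^2}\right]\left(\frac{1}{l_{i-1}} + \frac{1}{l_{i-2}}\right)\frac{\sigma_T^2\,\sigma_\epsilon^2}{m} \\
&\quad - 8\sum_{i=2}^{m} \frac{l_{i-2}^2 - 1}{6\,l_{i-2}^2}\,\frac{\sigma_T^2}{m}\,\frac{\sigma_\epsilon^2}{l_{i-2}} \\
&\quad + \sum_{i=1}^{m}\left[\left(\frac{1}{l_{i-1}^3} + \frac{1}{l_{i-2}^3}\right)\omega_\epsilon^2 + \left(\frac{2(l_{i-1} - 1)}{l_{i-1}^3} + \frac{2(l_{i-2} - 1)}{l_{i-2}^3}\right)\sigma_\epsilon^4 + \frac{4\,\sigma_\epsilon^4}{l_{i-1}\,l_{i-2}}\right] \\
&\quad + 2\sum_{i=2}^{m}\left[\frac{1}{l_{i-2}^3}\,\omega_\epsilon^2 + \frac{2(l_{i-2} - 1)}{l_{i-2}^3}\,\sigma_\epsilon^4\right].
\end{aligned}$$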

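Then, considering the covariance between $u_T$ and $u_{T-1}$, we obtain

$$\operatorname{Cov}(u_T, u_{T-1}) = \frac{2}{3}\,\frac{l_{m-2}'^{2} - 1}{l_{m-2}'^{2}}\,\frac{\sigma_T^2}{m} + \frac{1}{l_{m-2}'^{3}}\,\omega_\epsilon^2 + \frac{2\left(l_{m-2}' - 1\right)}{l_{m-2}'^{3}}\,\sigma_\epsilon^4,$$

and

$$\operatorname{Cov}(u_T, u_{T+S}) = 0 \quad \text{for } S \geq 2.$$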

Appendix B. Simulation study

In this simulation study, we compare the finite-sample performance of two methods: the one developed in this paper and that of Nagakura and Watanabe (2015). Following Alexeev, Chen, and Ignatieva (2021), we choose the Heston (1993) model as the data-generating process. We assume that $X_t$, the true log price of an asset, follows the Heston stochastic volatility model

$$dX_t = \left(\mu - \frac{\nu_t}{2}\right)dt + \sqrt{\nu_t}\,dW_{1,t}, \qquad d\nu_t = \kappa(\theta - \nu_t)\,dt + \xi\sqrt{\nu_t}\,dW_{2,t}.$$

Here, $\nu_t$ denotes the stochastic variance. We use the same parameters as Jacod et al. (2009), that is, $\mu = 0.05/252$, $\kappa = 5/252$, $\theta = 0.04/252$, $\xi = 0.05/252$ and $\rho = -0.5$, where $\rho = \operatorname{corr}(W_{1,t}, W_{2,t})$ is the correlation between the two Brownian motions. The microstructure noise is $\epsilon \sim N(0, \eta^2)$ with $\eta^2 = 0.0005^2$. The simulation covers 10,000 days.
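To make the data-generating process concrete, the following minimal sketch simulates one day of noisy log prices with the parameter values stated above. Everything beyond those values is an assumption made for illustration: the full-truncation Euler discretisation, the equidistant simulation grid with `n_steps` points per day (time measured in days), the starting values, the seed and the function names. The section does not specify how the paths were actually discretised or how irregular observation times were drawn.

```python
import math
import random


def simulate_heston_day(nu0, x0, n_steps, rng,
                        mu=0.05 / 252, kappa=5 / 252, theta=0.04 / 252,
                        xi=0.05 / 252, rho=-0.5):
    """One day of true log prices via a full-truncation Euler scheme (an assumed discretisation)."""
    dt = 1.0 / n_steps
    x, nu = x0, nu0
    path = [x]
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # z2 has correlation rho with z1
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        nu_pos = max(nu, 0.0)  # truncate negative variance before using it
        x += (mu - nu_pos / 2.0) * dt + math.sqrt(nu_pos * dt) * z1
        nu += kappa * (theta - nu_pos) * dt + xi * math.sqrt(nu_pos * dt) * z2
        path.append(x)
    return path, nu


def add_noise(path, rng, eta=0.0005):
    """Observed log prices: true log price plus i.i.d. N(0, eta^2) microstructure noise."""
    return [x + rng.gauss(0.0, eta) for x in path]


rng = random.Random(0)                       # fixed seed, for reproducibility of the sketch only
true_path, nu_end = simulate_heston_day(nu0=0.04 / 252, x0=0.0, n_steps=100, rng=rng)
observed = add_noise(true_path, rng)
```

Carrying `nu_end` over as the next day's `nu0` would chain days together; irregular sampling could then be imposed by drawing observation times from a Poisson process, in line with Assumption 3.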

To estimate the parameters of the two models given the generated data, we set the initial values for each model as follows:

Model 1: $(\kappa_1, \sigma^2, \omega_1^2, \sigma_\epsilon^2 \times 100, \omega_\epsilon^2 \times 1000) = (0.6, 0.1, 0.1, 0.1, 0.1)$
Model 2: $(\kappa_1, \sigma^2, \omega_1^2, \sigma_\nu^2) = (0.6, 0.1, 0.1, 0.1)$

Estimation results are reported in Tables B.7 and B.8 below.

Table B.7: Estimated parameters for the SV model.

M	4320	2880	1440	288
Panel A: previous tick sampling scheme
$\Delta_k$	20 s	30 s	1 min	5 mins
$\hat{\kappa}_1$	0.3032	0.3355	0.3289	0.9328
$\hat{\sigma}^2$	0.0908	0.0950	0.1046	0.1392
$\hat{\omega}_1^2$	0.0031	0.0026	0.0012	0.0006
$\hat{\sigma}_\epsilon^2 \times 100$	0.0822	0.0955	0.1191	0.3201
$\hat{\omega}_\epsilon^2 \times 1000$	0.0069	0.0082	0.0112	0.0757
$L$	−4145.4951	−3046.3032	−1219.7400	−2826.5891
Panel B: pre-averaging sampling scheme
$\Delta_k$	20 s	30 s	1 min	5 mins
$\hat{\kappa}_1$	0.9999	0.9997	0.9998	1.0000
$\hat{\sigma}^2$	0.0684	0.0323	0.0136	0.0330
$\hat{\omega}_1^2$	6.0309	3.1934	0.2093	0.0545
$\hat{\sigma}_\upsilon^2$	0.0684	0.0323	0.0136	0.0330
$L$	−148.2576	1244.3137	7422.0095	2861.3558

Table B.8: Estimates of parameters in the state space model.

M	4320	2880	1440	288
Panel A: previous tick sampling scheme
$\Delta_k$	20 s	30 s	1 min	5 mins
$\hat{\kappa}_1$	0.3032	0.3355	0.3289	0.9328
$\hat{c}_{IV}$	0.0633	0.0632	0.0702	0.0093
$\hat{\theta}_1$	0.2478	0.2508	0.2502	0.2679
$\hat{\sigma}_\eta^2$	0.0016	0.0013	0.0006	0.0000
$\hat{c}_u$	7.1040	5.5000	3.4289	1.8440
$\hat{\theta}_u$	0.0001	0.0001	0.0002	0.0007
$\hat{\sigma}_\xi^2$	0.1320	0.1058	0.0739	0.1024
$\hat{\sigma}_d^2$	0.0000	0.0000	0.0000	0.0001
Panel B: pre-averaging sampling scheme
$\Delta_k$	20 s	30 s	1 min	5 mins
$\hat{\kappa}_1$	0.9999	0.9997	0.9998	1.0000
$\hat{c}_{IV}$	0.0000	0.0001	0.0001	0.0000
$\hat{\theta}_1$	0.2681	0.2679	0.2680	0.3595
$\hat{\sigma}_\eta^2$	0.0011	0.0013	0.0001	0.0000
$\hat{\sigma}_\upsilon^2$	0.0684	0.0323	0.0136	0.0330

From the tables, we observe that $\kappa_1$ is less than one in Model 1, which indicates that the IV can be modelled as an AR(1) process. In Model 2, $\kappa_1$ is close to 1 and $c_{IV}$ is close to 0, which means that the IV follows a random walk. Furthermore, $c_u$ decreases as the sampling frequency decreases. In addition, Table B.9 reports the MSE and QLIKE loss functions computed for the two models based on NCRV^lobs (Panel A) and NCRV^avg (Panel B) at four sampling frequencies. Based on QLIKE, Model 2, which is based on the pre-averaging sampling scheme, outperforms the model based on the previous tick sampling scheme at all four sampling frequencies. Based on MSE, it does so at the 30 s, 1 min and 5 mins frequencies, but not at 20 s (4.7524 versus 2.8771).
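For reference, the two loss functions can be sketched as follows. The section does not state the exact definitions or the volatility proxy used, so this is an assumption: the QLIKE form below is the normalised version commonly associated with Patton and Sheppard (2009), which is zero when the forecast equals the proxy, and the function names are hypothetical.

```python
import math
from typing import Sequence


def mse(proxy: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean squared error between a volatility proxy and its forecast."""
    return sum((p - f) ** 2 for p, f in zip(proxy, forecast)) / len(proxy)


def qlike(proxy: Sequence[float], forecast: Sequence[float]) -> float:
    """Average QLIKE loss, p/f - log(p/f) - 1, for strictly positive inputs (assumed form)."""
    return sum(p / f - math.log(p / f) - 1.0 for p, f in zip(proxy, forecast)) / len(proxy)


perfect = qlike([1.0, 2.0], [1.0, 2.0])   # a perfect forecast incurs zero QLIKE loss
error = mse([1.0, 2.0], [1.0, 4.0])       # (0^2 + 2^2) / 2
```

QLIKE penalises under-prediction of variance more heavily than over-prediction, which is one reason the two criteria can rank models differently.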

Table B.9: In-sample forecast evaluation. MSE and QLIKE loss functions are computed for the two models based on NCRV^lobs (Panel A) and NCRV^avg (Panel B) at four sampling frequencies in the in-sample study.

Panel A: previous tick sampling scheme
$\Delta_k$	MSE	QLIKE
20 s	2.8771	14.4314
30 s	2.8639	13.6612
1 min	2.8339	12.1419
5 mins	2.7281	8.5102

Panel B: pre-averaging sampling scheme
$\Delta_k$	MSE	QLIKE
20 s	4.7524	0.33730
30 s	1.9162	0.2081
1 min	0.4381	0.0822
5 mins	0.6592	0.1462

References

Alexeev, V., J. Chen, and K. Ignatieva. 2021. “From Previous-Tick to Pre-averaging: Spectra of Equidistant Transformations for Unevenly Spaced High Frequency Data.” PhD Thesis, School of Risk and Actuarial Studies, UNSW Business School, UNSW Sydney.

Andersen, T. G. 1994. “Stochastic Autoregressive Volatility: A Framework for Volatility Modeling.” Mathematical Finance 4: 75–102.

Andersen, T. G., and T. Bollerslev. 1997. “Intraday Periodicity and Volatility Persistence in Financial Markets.” Journal of Empirical Finance 4: 115–58. https://doi.org/10.1016/s0927-5398(97)00004-2.

Andersen, T. G., and T. Bollerslev. 1998. “Deutsche Mark–Dollar Volatility: Intraday Activity Patterns, Macroeconomic Announcements, and Longer Run Dependencies.” The Journal of Finance 53: 219–65. https://doi.org/10.1111/0022-1082.85732.

Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens. 2001. “The Distribution of Realized Stock Return Volatility.” Journal of Financial Economics 61: 43–76. https://doi.org/10.1016/s0304-405x(01)00055-1.

Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys. 2003. “Modeling and Forecasting Realized Volatility.” Econometrica 71: 579–625. https://doi.org/10.1111/1468-0262.00418.

Andreou, E., and E. Ghysels. 2002. “Detecting Multiple Breaks in Financial Market Volatility Dynamics.” Journal of Applied Econometrics 17: 579–600. https://doi.org/10.1002/jae.684.

Bandi, F. M., and J. R. Russell. 2008. “Microstructure Noise, Realized Variance, and Optimal Sampling.” The Review of Economic Studies 75: 339–69. https://doi.org/10.1111/j.1467-937x.2008.00474.x.

Barndorff-Nielsen, O. E., and N. Shephard. 2002. “Econometric Analysis of Realized Volatility and its Use in Estimating Stochastic Volatility Models.” Journal of the Royal Statistical Society: Series B 64: 253–80. https://doi.org/10.1111/1467-9868.00336.

Beine, M., J. Lahaye, S. Laurent, C. J. Neely, and F. C. Palm. 2007. “Central Bank Intervention and Exchange Rate Volatility, its Continuous and Jump Components.” International Journal of Finance & Economics 12: 201–23. https://doi.org/10.1002/ijfe.330.

Campbell, J. Y., A. W. Lo, and A. MacKinlay. 1997. The Econometrics of Financial Markets, 2nd ed. Princeton University Press. https://doi.org/10.1515/9781400830213.

Caporin, M. 2022. “The Role of Jumps in Realized Volatility Modeling and Forecasting.” Journal of Financial Econometrics 1–26. https://doi.org/10.1093/jjfinec/nbab030.

Fukasawa, M. 2010. “Realized Volatility with Stochastic Sampling.” Stochastic Processes and their Applications 120: 829–52. https://doi.org/10.1016/j.spa.2010.02.006.

Heston, S. L. 1993. “A Closed-form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options.” Review of Financial Studies 6: 327–43. https://doi.org/10.1093/rfs/6.2.327.

Jacod, J., Y. Li, P. A. Mykland, M. Podolskij, and M. Vetter. 2009. “Microstructure Noise in the Continuous Case: The Pre-averaging Approach.” Stochastic Processes and their Applications 119: 2249–76. https://doi.org/10.1016/j.spa.2008.11.004.

Jacod, J., and P. Protter. 1998. “Asymptotic Error Distributions for the Euler Method for Stochastic Differential Equations.” Annals of Probability 26: 267–307. https://doi.org/10.1214/aop/1022855419.

Meddahi, N. 2001. “An Eigenfunction Approach for Volatility Modeling.” Working Paper. Université de Montréal, Département de sciences économiques.

Meddahi, N. 2003. “ARMA Representation of Integrated and Realized Variances.” The Econometrics Journal 6: 335–56. https://doi.org/10.1111/1368-423x.t01-1-00112.

Mykland, P. A., and L. Zhang. 2016. “Between Data Cleaning and Inference: Pre-averaging and Robust Estimators of the Efficient Price.” Journal of Econometrics 194: 242–62. https://doi.org/10.1016/j.jeconom.2016.05.005.

Nagakura, D., and T. Watanabe. 2015. “A State Space Approach to Estimating the Integrated Variance under the Existence of Market Microstructure Noise.” Journal of Financial Econometrics 13: 45–82. https://doi.org/10.1093/jjfinec/nbt015.

Nelson, D. B. 1990. “Arch Models as Diffusion Approximations.” Journal of Econometrics 45: 7–38. https://doi.org/10.1016/0304-4076(90)90092-8.

Owens, J. P., and D. G. Steigerwald. 2006. “Noise Reduced Realized Volatility: A Kalman Filter Approach.” In Econometric Analysis of Financial and Economic Time Series, 211–27. Emerald Group Publishing Limited. https://doi.org/10.1016/S0731-9053(05)20008-7.

Patton, A. J., and K. Sheppard. 2009. “Evaluating Volatility and Correlation Forecasts.” In Handbook of Financial Time Series, 801–38. Springer. https://doi.org/10.1007/978-3-540-71297-8_36.

Wasserfallen, W., and H. Zimmermann. 1985. “The Behavior of Intra-daily Exchange Rates.” Journal of Banking & Finance 9: 55–72. https://doi.org/10.1016/0378-4266(85)90062-7.

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/snde-2021-0093).

Received: 2021-10-19

Revised: 2023-02-16

Accepted: 2023-02-22

Published Online: 2023-03-20
