A Nonparametric Model for High-Frequency Energy Prices

Nikolay Gudkov; Katja Ignatieva

doi:10.1515/snde-2022-0113

Published/Copyright: December 24, 2024

From the journal Studies in Nonlinear Dynamics & Econometrics

Abstract

This paper proposes an efficient approach for modelling a high frequency continuous time diffusion process for the dynamics of crude oil. While various applications of continuous time models are considered in the literature, the results on choosing the right model are mixed. We employ a very general non-parametric approach to capture the dynamics of the crude oil market proxied by United States Oil (USO) exchange traded fund. This approach is purely data driven and does not require specification of the drift or the diffusion coefficient function. The proposed nonparametric kernel-based estimation procedure relies on the local polynomial kernel regression, where the choice of a bandwidth parameter plays a significant role. We demonstrate that besides offering a convenient way of estimating the continuous-time models for energy prices, our estimation procedure performs well when dealing with predicting USO prices out-of-sample. The analysis is extended by incorporating possible jump diffusion, where the assumption of continuity of the stochastic process is relaxed and a jump component is added to the diffusion process. In addition, we extend our model by adding possible seasonalities in the underlying dynamics, which requires decomposing the price by means of the Maximum Overlap Discrete Wavelet Transform (MODWT) algorithm and applying nonparametric kernel-based estimation procedure to modelling of the deseasonalized prices.

Keywords: high frequency data; model specification; nonparametric estimation; wavelet analysis

JEL Classification: C00; C01; C02; C10; C14; C58

Corresponding author: Nikolay Gudkov, RiskLab, Department of Mathematics, ETH Zurich, Zurich, Switzerland, E-mail: nikola.gudkov@gmail.com

Appendix A: Automated Bandwidth Parameter Algorithm Testing

A.1 Numerical Experiment Bandwidth Selection, Regression

In this subsection we test the automated bandwidth parameter procedures based on Eqs. (2.11) and (2.12), outlined in Section 2.2. For that purpose we use Examples 6.2 and 6.5 from Racine (2019), which study the regression function g(x) = E[Y|X] = sin(2πx). Similar numerical experiments are performed in Köhler, Schindler, and Sperlich (2014), who note the difficulty of estimating trigonometric functions.

We simulate N = 2,000 i.i.d. samples {X _t, Z _t} using the model Z _t = sin(2πX _t) + ϵ _t, with X _t ∼ U[0, 1] and the noise term, ϵ ∼ N(0, σ ²), for 0 ≤ t ≤ N − 1. From Eq. (2.10), one can find the theoretically “optimal” bandwidth parameter for the local linear kernel regression, $h^{*} = 1.719 \cdot \left[\sigma^{2}/\left(8\pi^{4}\right)\right]^{1/5} \cdot N^{-1/5}$, assuming that the Epanechnikov kernel function is used. For the generated dataset {X _t, Z _t} we find the bandwidth parameters using the automated procedures in Eqs. (2.11) and (2.12), i.e., based on the Generalised Cross-Validation and the Information Theoretic criteria. In addition, we compare the performance of these algorithms with the methods discussed in Section 3.1 of Köhler, Schindler, and Sperlich (2014). Thereby, we use the corrected Average Squared Error (ASE) defined by

(A.1) $h_{\mathrm{ASE}} = \underset{\bar{h} \le h}{\arg\min} \sum_{t=1}^{N-1} \left(Z_t - \hat{g}_h(X_t)\right)^{2} \, \Omega\!\left(\tfrac{1}{N} W_t\right) \omega(X_t),$

where

(A.2) $W_t = \dfrac{K(0)\, S_2(X_t)}{\sum_{l=0}^{N-1} K\!\left(\frac{X_l - X_t}{h}\right) \left[ S_2(X_t) - (X_l - X_t)\, S_1(X_t) \right]}$

with $S_j(x) = \sum_{l=0}^{N-1} K\!\left(\frac{X_l - x}{h}\right) (X_l - x)^{j}$; and the penalty terms defined according to Shibata (1981) and Rice (1984) are given by $\Omega(u) = 1 + 2u$ and $\Omega(u) = (1 - 2u)^{-1}$, respectively. The weight function ω(u) returns 1 if u falls between the 2.5 % and 97.5 % sample quantiles of {X _t}, and 0 otherwise.
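The selection rule in Eqs. (A.1) and (A.2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the bandwidth grid, the seed, and the reduced sample size (n = 500 instead of N = 2,000, to keep the run fast) are all hypothetical choices. We also assume that the argument of the penalty Ω is the diagonal element of the local linear smoother matrix, which is how we read (1/N)·W_t here; a different normalisation of W_t would change the scale of the penalty.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_linear(x, z, h):
    """Local linear fit at the sample points.

    Returns the fitted values g_hat_h(X_t) and the smoother-matrix diagonal
    K(0) S_2(X_t) / sum_l K(.) [S_2(X_t) - (X_l - X_t) S_1(X_t)], cf. Eq. (A.2).
    """
    d = x[None, :] - x[:, None]              # d[t, l] = X_l - X_t
    k = epanechnikov(d / h)
    s1 = (k * d).sum(axis=1)                 # S_1(X_t)
    s2 = (k * d**2).sum(axis=1)              # S_2(X_t)
    w = k * (s2[:, None] - d * s1[:, None])  # unnormalised local linear weights
    denom = w.sum(axis=1)
    return (w @ z) / denom, 0.75 * s2 / denom

def corrected_ase(x, z, h, penalty):
    """Penalised sum of squared residuals in the spirit of Eq. (A.1)."""
    g_hat, w_diag = local_linear(x, z, h)
    lo, hi = np.quantile(x, [0.025, 0.975])
    omega = (x >= lo) & (x <= hi)            # weight function: trims both tails
    u = w_diag                               # ASSUMPTION: penalty argument, see lead-in
    pen = 1.0 + 2.0 * u if penalty == "shibata" else 1.0 / (1.0 - 2.0 * u)
    return float(np.sum(((z - g_hat) ** 2 * pen)[omega]))

def select_bandwidth(x, z, grid, penalty):
    """Grid search for the bandwidth minimising the corrected ASE."""
    scores = [corrected_ase(x, z, h, penalty) for h in grid]
    return float(grid[int(np.argmin(scores))])

rng = np.random.default_rng(0)
n, sigma = 500, 0.5                          # hypothetical demo settings
x = rng.uniform(0.0, 1.0, n)
z = np.sin(2.0 * np.pi * x) + sigma * rng.normal(size=n)

# Theoretically "optimal" bandwidth for the Epanechnikov kernel
h_star = 1.719 * (sigma**2 / (8.0 * np.pi**4)) ** 0.2 * n ** (-0.2)
grid = np.linspace(0.03, 0.30, 28)
h_shibata = select_bandwidth(x, z, grid, "shibata")
h_rice = select_bandwidth(x, z, grid, "rice")
print(f"h* = {h_star:.4f}, Shibata = {h_shibata:.4f}, Rice = {h_rice:.4f}")
```

The grid search is deliberately crude; a finer grid or a scalar optimiser could replace it without changing the criterion.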

Figure 8 shows h* together with the automatically selected bandwidth parameters for varying volatility parameter of the noise term, σ. One can make two observations from this plot. Firstly, the bandwidth parameters selected by the automated procedures are very close to each other. Secondly, the automatically selected bandwidth parameters are close to the theoretically “optimal” bandwidth, h*, for most of the values of σ. Unsurprisingly, as the noise, ϵ, becomes more volatile, i.e., the volatility parameter, σ, increases, the automated bandwidth selection procedures become less accurate.

Figure 8:

Theoretically optimal bandwidth parameter, $h^{*} = 1.719 \cdot \left[\sigma^{2}/\left(8\pi^{4}\right)\right]^{1/5} \cdot N^{-1/5}$, together with the automatically selected bandwidth parameters obtained using the generalized cross-validation, information-theoretic criteria, and average squared error with Rice (1984) and Shibata (1981) penalty terms, for varying volatility parameter of the noise term, σ.

A.2 Numerical Experiments for Bandwidth Selection Parameter

In this subsection, we investigate the accuracy of the automated methods for selecting the bandwidth parameter in the density estimator presented in Section 2.3. For this purpose, we simulate artificial datasets using random distributions with a known density function, f, and compute theoretically optimal bandwidth parameters from Eq. (2.14) together with the data-driven bandwidths in Eqs. (2.15)–(2.17).

First, we use the normal distribution with mean μ and standard deviation σ to simulate artificial datasets. For the Gaussian family, it can be shown that $\int f''(x)^{2}\,dx = \frac{3}{8\sqrt{\pi}\,\sigma^{5}}$. We select σ = 1.5 and N = 1,000. Then, using Eq. (2.14), we obtain the theoretically optimal bandwidth parameter h* = 0.8835. The automated bandwidth selection based on the least squares cross-validation and the likelihood cross-validation, given in Eqs. (2.15)–(2.17), results in $\hat{h}_{\mathrm{LSCV}} = 0.9131$ and $\hat{h}_{\mathrm{LCV}} = 0.9605$, which are computed as averages across the bandwidths selected for 50 artificially simulated datasets of size N = 1,000, with the mean squared errors (MSEs) being equal to 0.038 and 0.0974, respectively.
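The value of h* for the Gaussian case can be checked numerically. Eq. (2.14) is not reproduced in this appendix, so the sketch below assumes it is the standard AMISE-optimal bandwidth, h* = [R(K) / (μ₂(K)² · ∫f″(x)² dx · N)]^{1/5}, evaluated with the Epanechnikov constants R(K) = 3/5 and μ₂(K) = 1/5; the function names are hypothetical.

```python
import math

def amise_bandwidth(curvature, n, roughness_k=3.0 / 5.0, mu2_k=1.0 / 5.0):
    """AMISE-optimal bandwidth h* = [R(K) / (mu_2(K)^2 * R(f'') * n)]^(1/5).

    ASSUMPTION: this is the form of Eq. (2.14); the defaults are the
    Epanechnikov kernel constants R(K) = 3/5 and mu_2(K) = 1/5.
    """
    return (roughness_k / (mu2_k**2 * curvature * n)) ** 0.2

def gaussian_curvature(sigma):
    """Integrated squared second derivative of the N(mu, sigma^2) density."""
    return 3.0 / (8.0 * math.sqrt(math.pi) * sigma**5)

h_star_normal = amise_bandwidth(gaussian_curvature(1.5), 1000)
print(f"h* = {h_star_normal:.4f}")
```

Under these assumptions the result agrees with the reported h* = 0.8835 to the displayed precision, which suggests that the Epanechnikov kernel is also the one used for the density experiments.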

We repeat a similar numerical experiment with the artificial data simulated from the chi-square distribution with the number of degrees of freedom k. One can show that the integrated squared curvature of the density function is given by

(A.3) $\int f''(x)^{2}\,dx = \dfrac{\Gamma(k-1) + \left(\varsigma_1^{2} + 2\varsigma_2\right)\Gamma(k-3) + \varsigma_2^{2}\,\Gamma(k-5) + 2\varsigma_1\Gamma(k-2) + 2\varsigma_1\varsigma_2\,\Gamma(k-4)}{2^{k+4}\,\Gamma(k/2)^{2}},$

where $\varsigma_1 = 4 - 2k$ and $\varsigma_2 = k^{2} - 6k + 8$. For k = 7.5 and N = 1,000, one computes from Eq. (2.14) the optimal bandwidth parameter h* = 1.6175. The automatically selected bandwidth parameters are $\hat{h}_{\mathrm{LSCV}} = 1.6356$ and $\hat{h}_{\mathrm{LCV}} = 2.0146$ with the MSEs corresponding to 0.1946 and 0.5621, computed as averages across the bandwidths selected for 50 artificial datasets.
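As a cross-check of Eq. (A.3), the closed form can be compared with a direct numerical integration of the squared second derivative of the chi-square density. The sketch below is illustrative only: the function names, the integration range and the grid size are hypothetical, and the bandwidth formula is again assumed to be the standard AMISE expression with Epanechnikov constants, since Eq. (2.14) is not reproduced here.

```python
import math
import numpy as np

def chi2_curvature(k):
    """Closed form of Eq. (A.3); requires k > 5 so that all Gamma terms are finite."""
    s1 = 4.0 - 2.0 * k
    s2 = k**2 - 6.0 * k + 8.0
    num = (math.gamma(k - 1)
           + (s1**2 + 2.0 * s2) * math.gamma(k - 3)
           + s2**2 * math.gamma(k - 5)
           + 2.0 * s1 * math.gamma(k - 2)
           + 2.0 * s1 * s2 * math.gamma(k - 4))
    return num / (2.0 ** (k + 4) * math.gamma(k / 2.0) ** 2)

def chi2_curvature_numeric(k, upper=200.0, n_grid=400_001):
    """Trapezoidal integration of f''(x)^2 for the chi-square(k) density."""
    x = np.linspace(0.0, upper, n_grid)
    c = 1.0 / (2.0 ** (k / 2.0) * math.gamma(k / 2.0))
    s1, s2 = 4.0 - 2.0 * k, k**2 - 6.0 * k + 8.0
    # f''(x) = (c / 4) * exp(-x / 2) * x^(k/2 - 3) * (x^2 + s1 * x + s2)
    f2 = 0.25 * c * np.exp(-x / 2.0) * x ** (k / 2.0 - 3.0) * (x**2 + s1 * x + s2)
    y = f2**2
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

k, n = 7.5, 1000
closed, numeric = chi2_curvature(k), chi2_curvature_numeric(k)
# ASSUMPTION: AMISE bandwidth with Epanechnikov constants, R(K)/mu_2(K)^2 = 15
h_star_chi2 = (15.0 / (closed * n)) ** 0.2
print(f"closed form = {closed:.6e}, numeric = {numeric:.6e}, h* = {h_star_chi2:.4f}")
```

Under these assumptions the bandwidth comes out close to the reported h* = 1.6175.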

We notice that in both numerical experiments, the least-squares cross-validation procedure results in the bandwidth parameter selection closer to the optimal value as measured by the lower MSE.

Appendix B: Estimation of the Drift and Diffusion Coefficients from an Artificial Dataset

In this section we present the results of the drift and diffusion coefficients estimation through the nonparametric kernel-based numerical procedure outlined in Section 2. For that purpose, we generate a time series {X _t} for 0 ≤ t ≤ N − 1 with N = 2,000 and Δt = 0.004, using the CIR-model^[7] given by

(B.1) $dX_t = (\kappa - \alpha X_t)\,dt + \eta X_t^{\beta}\,dW_t,$

where the set of parameters: κ = 0.035, α = 0.5, η = 0.1, and β = 0.5 is chosen in line with the numerical experiment in Stanton (1997).
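A path of the model in Eq. (B.1) can be generated as follows. The text does not state which discretisation scheme is used, so this sketch assumes a plain Euler–Maruyama step; the starting value x0 = κ/α (the long-run mean), the seed and the function name are hypothetical choices. The last lines run a crude sanity check: for β = 0.5 the average of (ΔX)² / (Δt · X_t) should be close to η².

```python
import numpy as np

def simulate_cir(kappa, alpha, eta, beta, x0, n, dt, seed=0):
    """Euler-Maruyama discretisation of dX = (kappa - alpha X) dt + eta X^beta dW.

    ASSUMPTION: the scheme, the start value and the seed are illustrative;
    the source only specifies the model, N and dt.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = x0
    for t in range(n - 1):
        level = max(x[t], 0.0)               # guard against a negative base in X^beta
        shock = rng.normal() * np.sqrt(dt)   # Brownian increment over dt
        x[t + 1] = x[t] + (kappa - alpha * x[t]) * dt + eta * level**beta * shock
    return x

kappa, alpha, eta, beta = 0.035, 0.5, 0.1, 0.5
n_obs, dt = 2000, 0.004
path = simulate_cir(kappa, alpha, eta, beta, x0=kappa / alpha, n=n_obs, dt=dt)

# Crude check of the diffusion scale: E[(dX)^2 | X] is approximately eta^2 * X * dt
dx = np.diff(path)
eta2_hat = float(np.mean(dx**2 / (dt * path[:-1])))
print(f"eta^2 = {eta**2:.4f}, crude estimate = {eta2_hat:.4f}")
```

The nonparametric estimators of Section 2 would then be applied to `path` in place of this crude moment check.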

For the simulated time series, {X _t}, (see top panel in Figure 9), we select an evenly spaced set of grid points, [x ₁, …, x _m], where m = 250 with x ₁ = min{X _t} and x _m = max{X _t}. At each of these points, we obtain estimates of the drift function with the procedures outlined in Section 2.4. These estimates constitute a set of values $\{\bar{\mu}_i\}$, one for each grid point, x _i, 1 ≤ i ≤ m. Using the least squares procedure we fit the function κ + α ⋅ x to the set $\{\bar{\mu}_i\}$ and obtain the fitted drift function $\tilde{\mu}(x) = \hat{\kappa} + \hat{\alpha} \cdot x$. This function is shown in the middle panel of Figure 9.

Figure 9:

In the top subplot, we show the simulated path of the CIR-process, $X_t$, with drift μ(x) = 0.04 − 0.5 ⋅ x and diffusion $\sigma(x) = 0.1 \cdot x^{0.5}$. The middle subplot depicts the “true” drift function, μ(x), and the fitted drift function $\hat{\mu}(x) = \hat{\kappa} + \hat{\alpha} \cdot x$ together with the 95 % bootstrap confidence bands. The bottom subplot depicts the “true” diffusion function, σ(x), and the diffusion functions $\hat{\sigma}(x) = \hat{\eta} \cdot x^{\hat{\beta}}$ fitted to three sets of pointwise diffusion coefficient estimates, obtained with three different algorithms: conditional moments (CM), Chen, Cheng, and Peng (2009), and Fan and Yao (1998).

Furthermore, we use the three estimation algorithms to obtain three sets of estimates $\{\bar{\sigma}_i\}$, for 1 ≤ i ≤ m.^[8] These algorithms include the approach given in Eqs. (2.2) and (2.3), as well as the methods of Fan and Yao (1998) and Chen, Cheng, and Peng (2009), also introduced in the same subsection. Using the non-linear least squares procedure, we fit the function $\eta \cdot x^{\beta}$ to the three sets $\{\bar{\sigma}_i\}$ and obtain the fitted diffusion function $\tilde{\sigma}(x) = \hat{\eta} \cdot x^{\hat{\beta}}$ for each of the sets. These fitted functions are shown in the bottom panel of Figure 9.

For all four datasets, i.e., one $\{\bar{\mu}_i\}$ and three $\{\bar{\sigma}_i\}$, we obtain a bandwidth parameter using the automated selection procedure based on Eq. (2.11) (shown in parentheses in the middle and bottom panels).

From Figure 9 we notice that the estimated parameters $\{\hat{\kappa}, \hat{\alpha}, \hat{\eta}, \hat{\beta}\}$ are very close to the ones used to generate the original dataset. The most accurate estimates of the diffusion function parameters are obtained with the approach given in Eqs. (2.2) and (2.3), although the estimates obtained via the other two methods are very similar.

We notice that the bandwidth parameters selected using the automated procedure in Eq. (2.11) are different for all four regressions performed in this numerical experiment. This observation highlights the importance of selecting a bandwidth parameter using some data-driven methodology and avoiding “plug-in” approaches.

Appendix C: Jump Diffusion Modelling

We rewrite the diffusion model in Eq. (2.1) as

(C.1) $dX_t = \mu(X_{t-})\,dt + \sigma(X_{t-})\,dW_t + Y_t\,dJ_t,$

where, similarly to Eq. (2.1), $W_t$ is a one-dimensional standard Brownian motion, and μ and σ are the drift and the diffusion coefficients, respectively. Furthermore, $J_t$ is a compensated jump process with intensity $\lambda(X_t) \ge 0$, and Y is a random jump size with stationary distribution $p_Y$, independent of $W_t$ and $J_t$.

Under this setting, the conditional moments in Eq. (2.2), $m_k(x) = E\left[(X_{t+\Delta} - X_t)^{k} \mid X_t = x\right]$, k ≥ 1, can be connected with the functions μ, σ, λ, and the random variable Y as follows (see Figa-Talamanaca (2015)):

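(C.2) $\mu(x) = \lim_{\Delta \to 0} \frac{1}{\Delta}\, m_1(x) + O(\Delta),$

(C.3) $\sigma^{2}(x) + \lambda(x)\, E\left[Y^{2}\right] = \lim_{\Delta \to 0} \frac{1}{\Delta}\, m_2(x) + O(\Delta),$

(C.4) $\lambda(x)\, E\left[Y^{k}\right] = \lim_{\Delta \to 0} \frac{1}{\Delta}\, m_k(x) + O(\Delta),$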

for k ≥ 3.

The estimation of $m_k(x) = E\left[(X_{t+\Delta} - X_t)^{k} \mid X_t = x\right]$, k ≥ 1, can be performed using the local linear polynomial regression outlined in Section 2.1. That is, by substituting $(X_{t+\Delta} - X_t)^{k}$ for {Z _t}, for k ≥ 1 and 1 ≤ t ≤ N − 1, in Eq. (2.8), we obtain the local linear estimator $\hat{m}_k$. In addition, we utilise the automated selection procedure based on the Information Theoretic criteria in Eq. (2.12) to obtain the bandwidth that is optimal for each particular local linear regression. Table 3 reports the bandwidths obtained for the estimates $\hat{m}_k$, for 1 ≤ k ≤ 6, computed from the log-price of the USO time series. Figure 10 shows the nonparametric estimates of the moments $\hat{m}_k$ for 1 ≤ k ≤ 6 together with the 95 % confidence bands obtained via bootstrap simulation. Thereby, we followed the wild bootstrap procedure in which artificial data are sampled from the vector of observations uniformly with replacement, and moments are estimated using these artificial datasets. As expected, we observe narrower bounds in the middle, where we have more data and thus the chance of sampling these observations is higher.
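The two steps described above, local linear estimation of the conditional moments and pointwise bands from resampled datasets, can be sketched as follows. This is a simplified illustration and not the authors' code: the path is a synthetic mean-reverting series rather than USO log-prices, the bandwidth is set by a hypothetical rule of thumb instead of Eq. (2.12), and the resampling step implements only what the text describes (drawing observations uniformly with replacement).

```python
import numpy as np

def local_linear_at(grid, x, y, h):
    """Local linear estimate of E[y | x = g] at each g in grid (Epanechnikov kernel)."""
    d = x[None, :] - grid[:, None]                       # d[i, l] = X_l - g_i
    k = np.where(np.abs(d / h) <= 1.0, 0.75 * (1.0 - (d / h) ** 2), 0.0)
    s1 = (k * d).sum(axis=1, keepdims=True)
    s2 = (k * d**2).sum(axis=1, keepdims=True)
    w = k * (s2 - d * s1)                                # local linear weights
    return (w @ y) / w.sum(axis=1)

def moment_estimate(path, k, grid, h):
    """m_hat_k on the grid: regress (X_{t+1} - X_t)^k on X_t."""
    return local_linear_at(grid, path[:-1], np.diff(path) ** k, h)

def resampling_band(path, k, grid, h, n_boot=100, level=0.95, seed=0):
    """Pointwise band from datasets resampled uniformly with replacement."""
    rng = np.random.default_rng(seed)
    x, y = path[:-1], np.diff(path) ** k
    draws = np.empty((n_boot, grid.size))
    for b in range(n_boot):
        idx = rng.integers(0, x.size, x.size)            # resample (X_t, increment) pairs
        draws[b] = local_linear_at(grid, x[idx], y[idx], h)
    tail = (1.0 - level) / 2.0
    return np.quantile(draws, [tail, 1.0 - tail], axis=0)

# Synthetic mean-reverting path (hypothetical stand-in for the log-price series)
rng = np.random.default_rng(1)
path = np.empty(1000)
path[0] = 2.5
for t in range(path.size - 1):
    path[t + 1] = path[t] - 0.02 * (path[t] - 2.5) + 0.002 * rng.normal()

grid = np.linspace(np.quantile(path, 0.25), np.quantile(path, 0.75), 15)
h = path.std()                                           # hypothetical bandwidth choice
m2_hat = moment_estimate(path, 2, grid, h)
band = resampling_band(path, 2, grid, h)
print(m2_hat.mean(), band[0].mean(), band[1].mean())
```

Restricting the grid to the central half of the data keeps every kernel window well populated; in the tails the bands widen because fewer observations fall inside each window.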

Table 3:

Optimal bandwidth parameter, $h_k$, selected using the automated procedure based on the Information-Theoretic criteria, for the local linear estimator, $\hat{m}_k$, for 1 ≤ k ≤ 6.

k	1	2	3	4	5	6
h _k	2.3152	0.1980	0.1814	0.1806	0.1798	0.1794

Figure 10:

Nonparametric estimates of the moments, $\hat{m}_k(x)$, for 1 ≤ k ≤ 6, using the local linear kernel regression. Optimal bandwidth parameters are selected using the information theoretic criteria. The red dotted curves represent the 95 % confidence bands obtained with the wild bootstrap.

Let us further assume that the stationary distribution of the random jump size, $p_Y$, is a centred Gaussian with volatility parameter $\sigma_Y$. Following the derivations in Figa-Talamanaca (2015), one can show that

(C.5) $\hat{m}_2(x) = \sigma^{2}(x) + \lambda(x)\,\sigma_Y^{2},$

(C.6) $\hat{m}_4(x) = 3\,\lambda(x)\,\sigma_Y^{4},$

(C.7) $\hat{m}_6(x) = 15\,\lambda(x)\,\sigma_Y^{6}.$

From the equations above, one obtains the estimate of the jump volatility parameter as

$\hat{\sigma}_Y = \sqrt{\frac{1}{n} \sum_{t=1}^{N-1} \frac{\hat{m}_6(X_t)}{5\,\hat{m}_4(X_t)}} \approx 0.017.$

Using the daily observations of the oil price, Figa-Talamanaca (2015) estimate the volatility parameters of the random jump size to be approximately 0.0066, which is 2.5 times lower than our estimate obtained using the 5-min observations. This indicates that random jumps estimated from high-frequency data experience higher volatility compared to those estimated using lower frequency data.

Furthermore, from Eqs. (C.5)–(C.7) one can obtain the estimate of the random jump intensity, which corresponds to $\hat{\lambda}(x) = \frac{\hat{m}_4(x)}{3\,\hat{\sigma}_Y^{4}}$, while the estimate of the diffusion coefficient function is given by $\hat{\sigma}^{2}(x) = \hat{m}_2(x) - \hat{\lambda}(x)\,\hat{\sigma}_Y^{2}$. These estimates are shown in Figure 11 together with the estimate of the drift function $\hat{\mu}(x) = \hat{m}_1(x)$. As in Figure 10, we show the 95 % confidence bands obtained via bootstrap simulation using the red dotted lines.
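The inversion of Eqs. (C.5)–(C.7) is purely algebraic and can be illustrated with a short sketch. Dividing (C.7) by (C.6) gives m₆/(5 m₄) = σ_Y², which then yields λ from (C.6) and σ² from (C.5). The function name and the synthetic inputs below are hypothetical; we assume that the moment estimates are already normalised by Δ, and that the jump volatility is obtained as the square root of the averaged ratio.

```python
import numpy as np

def jump_diffusion_from_moments(m2, m4, m6):
    """Recover (sigma_Y, lambda(x), sigma^2(x)) from moments under Gaussian jumps.

    ASSUMPTION: inputs are Delta-normalised moment estimates on a common grid.
    """
    sigma_y2 = float(np.mean(m6 / (5.0 * m4)))   # (C.7) / (C.6) gives 5 * sigma_Y^2
    lam = m4 / (3.0 * sigma_y2**2)               # from (C.6)
    sigma2 = m2 - lam * sigma_y2                 # from (C.5)
    return np.sqrt(sigma_y2), lam, sigma2

# Round-trip check on synthetic inputs (illustrative values, not estimates from data)
x = np.linspace(2.0, 3.0, 11)
lam_true = 5.0 + 2.0 * x
sigma2_true = 0.04 * x
sigma_y_true = 0.017

m2 = sigma2_true + lam_true * sigma_y_true**2
m4 = 3.0 * lam_true * sigma_y_true**4
m6 = 15.0 * lam_true * sigma_y_true**6

sigma_y_hat, lam_hat, sigma2_hat = jump_diffusion_from_moments(m2, m4, m6)
print(sigma_y_hat, lam_hat[:3], sigma2_hat[:3])
```

With noisy moment estimates the ratio m̂₆/m̂₄ can be unstable where m̂₄ is close to zero, which is one reason to average it over the evaluation points before inverting the remaining equations.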

Figure 11:

Nonparametric estimates of the drift, the diffusion and the jump intensity functions. The red dotted curves represent the 95 % confidence bands obtained with the wild bootstrap.

By relaxing the assumption of constant volatility of the random jumps, we get $\hat{\sigma}_Y(x) = \sqrt{\frac{\hat{m}_6(x)}{5\,\hat{m}_4(x)}}$; the estimate of this parameter is shown in Figure 12. Furthermore, one can fit a polynomial function to the estimated drift coefficient, $\tilde{\mu}(x) = \alpha_0^{\mu} + \alpha_1^{\mu} x + \alpha_2^{\mu} x^{\beta_{\mu}}$.

Figure 12:

Nonparametric estimate for the drift, the diffusion, the jump intensity and the jump volatility function.

Appendix D: Seasonality Modelling

We denote by $P = \{P_t,\ t \in [0, N-1]\}$ a collection of observations from the USO price process. It can be decomposed into two parts, a slow-varying component, $S_t$, and a fast-varying component, $X_t$:

(D.1) $P_t = S_t + X_t$

for t ∈ [0, N − 1]. The slow-varying component, $S_t$, is associated with medium- and long-term variations in the price process. In the literature, it is sometimes referred to as a “deterministic” or “seasonal” component. The term “deterministic” is used because deterministic functions of time, t, are employed to describe the dynamics of this component; see, for example, Brix, Lunde, and Wei (2018), Janczura et al. (2013), and Nowotarski, Tomczyk, and Weron (2013). Popular choices of these functions are: (i) piecewise constant functions and dummy variables; (ii) Fourier-based sine or cosine functions, which may be coupled with an exponentially weighted moving average of the price process. Although these methods are easy to implement and interpret, we do not use them in our analysis due to several shortcomings. Specifically, piecewise constant functions and dummy variables are not smooth and could introduce artefacts such as jumps and kinks in the transformed data. In addition, using Fourier-based functions results in a parametric approach to seasonality modelling, which contradicts our goal of developing a fully nonparametric methodology for modelling crude oil prices. Moreover, the adoption of sine or cosine functions to represent variations in the seasonal component is based on the assumption that the frequencies of these variations are constant. While this assumption could be partially justified for some time series (e.g., electricity prices, see Ignatieva 2014), it is less reasonable for the oil market in our setting.

Instead, we use a smooth component of the Maximum Overlap Discrete Wavelet Transform (MODWT) to model the slow-varying component of the price process. MODWT’s ability to decompose time series into various frequency components without losing information makes it particularly adept at capturing both the slow-varying trends and the more volatile components of energy prices. This technique is fundamental in time series modelling and financial econometrics, as analysed by Percival and Walden (2000) and Gencay, Selcuk, and Whitcher (2002). The wavelet approach offers distinct advantages in capturing the complexities of energy price movements. Recognised for its flexibility, it models the nonlinear and non-stationary characteristics of energy prices by decomposing time series data into multiple frequency components. This allows the approach to capture both long-term trends and short-term fluctuations without adhering to a predetermined functional form. By breaking down the price series into stationary sub-series, each modelled separately, the method yields more precise forecasts. Studies such as Pindoriya, Singh, and Singh (2008) demonstrate that wavelet-based models outperform traditional forecasting techniques such as ARIMA, MLP, and RBF neural networks, including fuzzy neural networks (FNN), in predicting energy prices. This superior performance is largely attributed to the wavelet method’s effectiveness in addressing the market’s nonlinear dynamics.

Furthermore, the integration of the wavelet transform with ARIMA and GARCH models, as explored by Tan et al. (2010), presents a novel and more effective way to address the inherent volatility in energy markets compared to conventional methods. By segmenting historical price series into approximation and detail series, the wavelet transform enables more accurate predictions of volatile components, enhancing overall forecasting accuracy. Additionally, the combination of wavelet transforms with other predictive techniques, such as Particle Swarm Optimisation (PSO) and Adaptive-Network-Based Fuzzy Inference System (ANFIS), has been shown to improve forecasting accuracy significantly by efficiently capturing both linear and non-linear patterns in energy prices (Catalão, Pousinho, and Mendes 2011). The efficacy of the wavelet approach extends further, as demonstrated by Qiao and Yang (2020), through a hybrid model that integrates the wavelet transform and long short-term memory (LSTM) networks for U.S. electricity price forecasting. This model surpasses other AI models in accuracy, showcasing the wavelet transform’s ability to optimise forecasting through parameter selection and to manage the non-linear and volatile nature of energy markets. Additionally, the combination of wavelet decomposition with machine learning models, as explored by Risse (2019), underlines the effectiveness of wavelet-based approaches in forecasting gold prices. This research shows how wavelet decomposition can disentangle predictors according to their time and frequency domains, leading to enhanced forecasting performance. While not directly focused on energy prices, the principles and findings are readily applicable to energy markets, evidencing the wavelet transform’s broad applicability and its potential to refine predictive models by capturing both short-term and long-term trends.

This suite of advantages, from flexibility and enhanced accuracy to improved handling of non-stationarity and volatility, underscores the suitability of wavelet-based models for forecasting energy prices, offering considerable improvements over existing methodologies in terms of adaptability and forecasting precision. The following subsection (Appendix D.1) describes the methodology for the price decomposition by the MODWT algorithm.

D.1 Price Decomposition via Maximum Overlap Discrete Wavelet Transform

Based on the frequency of variations in the data-generating process, the set of price observations in Eq. (D.1) is decomposed with the aid of the Maximum Overlap Discrete Wavelet Transform (MODWT); see Chapter 5 in Percival and Walden (2000) for a detailed description of the procedure. In this subsection, we briefly review the key aspects of this methodology.

Let $J \in \mathbb{Z}^{+}$ be the level of decomposition; then the vector of prices, $\mathbf{P} = (P_0, \ldots, P_{N-1})'$, can be decomposed as

(D.2) $\mathbf{P} = \sum_{j=1}^{J} \mathcal{W}^{(j)\prime}\, \mathbf{W}^{(j)} + \mathcal{V}^{(J)\prime}\, \mathbf{V}^{(J)},$

where $\mathcal{W}^{(j)}, \mathcal{V}^{(J)} \in \mathbb{R}^{N \times N}$, $\mathbf{W}^{(j)} \in \mathbb{R}^{N}$, for 1 ≤ j ≤ J, and $\mathbf{V}^{(J)} \in \mathbb{R}^{N}$. The vectors $\mathbf{W}^{(j)} = \mathcal{W}^{(j)} \mathbf{P}$ contain wavelet coefficients that are associated with changes in $\mathbf{P}$ on scale $2^{j-1}$, while the vector $\mathbf{V}^{(J)} = \mathcal{V}^{(J)} \mathbf{P}$ is filled with scaling coefficients associated with averages on scale $2^{J}$.

Furthermore, at level 1 ≤ j ≤ J one defines $\mathbf{D}^{(j)} = \mathcal{W}^{(j)\prime}\, \mathbf{W}^{(j)}$ to be a detail component, associated with frequencies within the pass-band $\left[\frac{1}{2^{j+1}}, \frac{1}{2^{j}}\right]$. Similarly, one defines $\mathbf{S}^{(J)} = \mathcal{V}^{(J)\prime}\, \mathbf{V}^{(J)}$ to be a smooth component at level J, associated with frequencies in the interval $\left[0, \frac{1}{2^{J+1}}\right]$. Intuitively, the detail components, $\mathbf{D}^{(j)}$, 1 ≤ j ≤ J, represent “high-frequency” variations in the original time series, while the smooth component $\mathbf{S}^{(J)}$ provides “low-frequency” behaviour. Therefore, Eq. (D.2) can be written in the form of a MODWT-based multiresolution analysis (MRA) of the price vector $\mathbf{P}$ in terms of the detail and smooth components: $\mathbf{P} = \sum_{j=1}^{J} \mathbf{D}^{(j)} + \mathbf{S}^{(J)}$.

In practice, the wavelet and scaling coefficients in the vectors $\mathbf{W}^{(j)}$ and $\mathbf{V}^{(j)}$, for 1 ≤ j ≤ J, are obtained through the so-called “pyramid” algorithm. This algorithm is formulated as a multi-step iterative procedure. Namely, let the vector of scaling coefficients at level 0 be $\mathbf{V}^{(0)} = \mathbf{P}$. Then, for the levels of decomposition j = 1, …, J and the time instances t = 0, …, N − 1, the elements of the wavelet coefficient vectors, $\mathbf{W}^{(j)} = (W_0^{(j)}, \ldots, W_{N-1}^{(j)})'$, and the elements of the scaling coefficient vectors, $\mathbf{V}^{(j)} = (V_0^{(j)}, \ldots, V_{N-1}^{(j)})'$, are obtained through the circular convolution of the scaling coefficients in $\mathbf{V}^{(j-1)}$ with the MODWT wavelet and scaling filters, $\{\tilde{h}_l\}$ and $\{\tilde{g}_l\}$, for 0 ≤ l ≤ L − 1:

(D.3) $W_t^{(j)} = \sum_{l=0}^{L-1} \tilde{h}_l\, V^{(j-1)}_{t - 2^{j-1} l \bmod N}$ and $V_t^{(j)} = \sum_{l=0}^{L-1} \tilde{g}_l\, V^{(j-1)}_{t - 2^{j-1} l \bmod N}.$

This procedure results in the set of (J + 1) vectors $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}, \ldots, \mathbf{W}^{(J)}, \mathbf{V}^{(J)}$. The vectors $\mathbf{V}^{(j)}$, for 1 ≤ j ≤ J − 1, are by-products of the pyramid algorithm.

The MODWT wavelet and scaling filters $\{\tilde{h}_l\}$ and $\{\tilde{g}_l\}$, for 0 ≤ l ≤ L − 1, are linear filters of even length L and represent rescaled versions of the discrete wavelet transform (DWT) wavelet and scaling filters, i.e., $\tilde{h}_l \equiv h_l/\sqrt{2}$ and $\tilde{g}_l \equiv g_l/\sqrt{2}$. An even-length filter $\{h_l\}$, for 0 ≤ l ≤ L − 1, is called a (DWT) wavelet filter if it satisfies the following conditions:

(D.4) $\sum_{l=0}^{L-1} h_l = 0$ and $\sum_{l} h_l h_{l+2n} = \begin{cases} 1, & \text{if } n = 0, \\ 0, & \text{if } n \text{ is a nonzero integer}. \end{cases}$

The latter condition represents the orthogonality of the wavelet filter to its even shifts. Furthermore, the (DWT) scaling filter $\{g_l\}$, for 0 ≤ l ≤ L − 1, is defined through the “quadrature mirror” relationship: $g_l = (-1)^{l+1} h_{L-1-l}$. In general, the conditions in Eq. (D.4) do not imply DWT coefficients that can be interpreted in terms of changes in the original time series on particular scales. Thus, one has to apply additional regularity conditions, which result in very diverse groups of wavelet filters that approximate high-pass filters, i.e., filters with nominal pass-band in $\left[\frac{1}{4}, \frac{1}{2}\right]$, and scaling filters that approximate low-pass filters, i.e., filters with nominal pass-band in $\left[0, \frac{1}{4}\right]$. The most frequently used filters are the Haar filter (refer to Haar 1910), which is the shortest wavelet filter with L = 2; the D(L) filters, having an extremal phase and minimum delay property; the LA(L) filters, possessing the least asymmetric property, i.e., with the smallest maximum deviation in frequency from the best-fitting linear phase function; and the coiflets, C(L), satisfying vanishing moment conditions. We refer to Daubechies (1992) and Chapters 4.8–4.9 in Percival and Walden (2000) for further discussion of various wavelet filters of length L ≥ 4.
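The conditions in Eq. (D.4) and the quadrature mirror relationship are easy to verify numerically. The sketch below checks them for the Haar filter and for the D(4) filter (whose coefficients are the standard Daubechies values); the function names are hypothetical and the code is meant only as an illustration of the definitions above.

```python
import numpy as np

def is_wavelet_filter(h, tol=1e-12):
    """Check Eq. (D.4): zero sum, unit energy, orthogonality to even shifts."""
    h = np.asarray(h, dtype=float)
    L = h.size
    if L % 2 != 0 or abs(h.sum()) > tol:
        return False
    for n in range(L // 2):                  # larger shifts have no overlap
        inner = np.dot(h[: L - 2 * n], h[2 * n:])
        target = 1.0 if n == 0 else 0.0
        if abs(inner - target) > tol:
            return False
    return True

def scaling_from_wavelet(h):
    """Quadrature mirror relationship g_l = (-1)^(l+1) * h_(L-1-l)."""
    h = np.asarray(h, dtype=float)
    l = np.arange(h.size)
    return (-1.0) ** (l + 1) * h[h.size - 1 - l]

r2, r3 = np.sqrt(2.0), np.sqrt(3.0)
haar = np.array([1.0, -1.0]) / r2
d4 = np.array([1.0 - r3, -3.0 + r3, 3.0 + r3, -1.0 - r3]) / (4.0 * r2)

for name, h in [("Haar", haar), ("D(4)", d4)]:
    g = scaling_from_wavelet(h)
    h_tilde, g_tilde = h / r2, g / r2        # MODWT filters: DWT filters divided by sqrt(2)
    print(name, is_wavelet_filter(h), g.sum(), h_tilde, g_tilde)
```

For both filters the scaling coefficients sum to √2, so the rescaled MODWT scaling filter sums to one and acts as a local weighted average.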

It should be noted that the MODWT pyramid algorithm has a higher computational complexity, O(N log₂(N)),^[9] than the standard DWT pyramid algorithm, whose complexity is O(N). Nevertheless, using the MODWT has two key advantages. First, unlike the DWT, which restricts the sample size to an integer multiple of $2^{J}$, the MODWT algorithm can be applied to a time series of any length N. Second, the detail and smooth components of the MODWT algorithm are associated with zero-phase filters, which makes it easy in practice to align the variations in the MRA series, D and S, with the changes in the original series, P.

The inverse of the pyramid MODWT algorithm provided in Eq. (D.3) is given by

(D.5) $V_t^{(j-1)} = \sum_{l=0}^{L-1} \tilde{h}_l\, W^{(j)}_{t + 2^{j-1} l \bmod N} + \sum_{l=0}^{L-1} \tilde{g}_l\, V^{(j)}_{t + 2^{j-1} l \bmod N},$

for t = 0, …, N − 1 and j = 1, …, J. This algorithm, applied to the set of (J + 1) vectors of wavelet and scaling coefficients, $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}, \ldots, \mathbf{W}^{(J)}, \mathbf{V}^{(J)}$, reconstructs the original time series, $\mathbf{P} = \mathbf{V}^{(0)}$.

Moreover, the detail components, $\mathbf{D}^{(j)}$, 1 ≤ j ≤ J, can be reconstructed by applying the inverse MODWT to the set of vectors $\mathbf{0}, \mathbf{0}, \ldots, \mathbf{W}^{(j)}, \ldots, \mathbf{0}, \mathbf{0}$, where all vectors, except for the j-th one, are zero vectors of length N. Similarly, the smooth component, $\mathbf{S}^{(J)}$, can be reconstructed by applying the inverse pyramid algorithm to the set of vectors $\mathbf{0}, \ldots, \mathbf{0}, \mathbf{V}^{(J)}$, where the first J vectors are zero vectors of length N.
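The forward and inverse pyramid recursions, together with the reconstruction of the detail and smooth components, can be sketched directly from Eqs. (D.3) and (D.5). The example is an illustration under simplifying assumptions: it uses the Haar MODWT filters rather than the LA(8) filter employed for the USO data, a synthetic random-walk series, and hypothetical function names.

```python
import numpy as np

def modwt(p, h_tilde, g_tilde, J):
    """Forward pyramid algorithm, Eq. (D.3). Returns ([W^(1), ..., W^(J)], V^(J))."""
    v = np.asarray(p, dtype=float)
    n = v.size
    t = np.arange(n)
    ws = []
    for j in range(1, J + 1):
        w_new, v_new = np.zeros(n), np.zeros(n)
        for l, (hl, gl) in enumerate(zip(h_tilde, g_tilde)):
            idx = (t - 2 ** (j - 1) * l) % n     # circular index t - 2^(j-1) l mod N
            w_new += hl * v[idx]
            v_new += gl * v[idx]
        ws.append(w_new)
        v = v_new
    return ws, v

def imodwt(ws, v_top, h_tilde, g_tilde):
    """Inverse pyramid algorithm, Eq. (D.5). Returns V^(0)."""
    v = np.asarray(v_top, dtype=float)
    n = v.size
    t = np.arange(n)
    for j in range(len(ws), 0, -1):
        v_prev = np.zeros(n)
        for l, (hl, gl) in enumerate(zip(h_tilde, g_tilde)):
            idx = (t + 2 ** (j - 1) * l) % n     # circular index t + 2^(j-1) l mod N
            v_prev += hl * ws[j - 1][idx] + gl * v[idx]
        v = v_prev
    return v

def mra(p, h_tilde, g_tilde, J):
    """Detail components D^(1..J) and smooth S^(J) via zeroed-out inverse transforms."""
    ws, v_top = modwt(p, h_tilde, g_tilde, J)
    zero = np.zeros_like(v_top)
    details = []
    for j in range(J):
        only_j = [ws[i] if i == j else zero for i in range(J)]
        details.append(imodwt(only_j, zero, h_tilde, g_tilde))
    smooth = imodwt([zero] * J, v_top, h_tilde, g_tilde)
    return details, smooth

# Haar MODWT filters (DWT Haar filters divided by sqrt(2))
h_haar = np.array([0.5, -0.5])
g_haar = np.array([0.5, 0.5])

rng = np.random.default_rng(0)
prices = 12.0 + np.cumsum(0.01 * rng.normal(size=100))   # synthetic series, N = 100

coeffs, v_J = modwt(prices, h_haar, g_haar, J=3)
rebuilt = imodwt(coeffs, v_J, h_haar, g_haar)
details, smooth = mra(prices, h_haar, g_haar, J=3)
energy = sum(np.sum(w**2) for w in coeffs) + np.sum(v_J**2)
print(np.max(np.abs(rebuilt - prices)), np.max(np.abs(sum(details) + smooth - prices)))
```

The series length N = 100 is deliberately not a power of two, which the MODWT handles without padding. The deseasonalised series would then be obtained by subtracting `smooth` from `prices`.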

Figure 13 shows the MODWT smooth components, $\mathbf{S}^{(J)}$, at levels J = {5, 6, 7}, obtained from the USO price time series with the least asymmetric wavelet filter of length L = 8; see LA(8) in Table 109 in Percival and Walden (2000). Since the frequency of observations in the original USO time series is equal to 5 min, the periods of $\mathbf{S}^{(J)}$, for J = {5, 6, 7}, are approximately 0.8, 1.6 and 3.2 days, respectively.
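The quoted periods follow from the frequency band of the smooth component: $\mathbf{S}^{(J)}$ retains frequencies up to $1/2^{J+1}$ cycles per observation, i.e., periods of at least $2^{J+1}$ observations. The short sketch below converts this to trading days. The length of the trading day is not stated in the text, so the 390-minute regular session used here is an assumption, and the constant and function names are hypothetical.

```python
SAMPLING_MINUTES = 5          # observation frequency stated in the text
TRADING_DAY_MINUTES = 390     # ASSUMPTION: 6.5-hour regular trading session

def smooth_min_period_days(J, sampling=SAMPLING_MINUTES, day=TRADING_DAY_MINUTES):
    """Shortest period retained by S^(J): 2^(J+1) observations, in trading days."""
    return 2 ** (J + 1) * sampling / day

periods = {J: smooth_min_period_days(J) for J in (5, 6, 7)}
for J, days in periods.items():
    print(f"J = {J}: about {days:.2f} trading days")
```

Under this assumption the values are close to the 0.8, 1.6 and 3.2 days reported above; a slightly longer trading day would match them more closely.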

Figure 13:

Original USO price data for the period 03–07/06/2019 (blue) and its MODWT smooth components at levels J = {5, 6, 7} obtained with the aid of the LA(8) DWT filter.

As we are interested in analysing the high-frequency dynamics of the USO price data, we remove the seasonal variations with periods of one day and larger. Thus, in what follows, we set the slow-varying (“seasonal”) component of the USO price, S in Eq. (D.1), to be equal to $\mathbf{S}^{(5)}$, and by subtracting it from the original time series we obtain the “stochastic” (deseasonalised) component of the USO price, $X_t = P_t - S_t^{(5)}$ for t = 0, …, N − 1.

For modelling the stochastic component we utilise the non-parametric kernel-based statistical methodology outlined in Section 2. Figure 14 shows the estimated drift and diffusion coefficients in the dynamics of the stochastic component. The nonparametrically estimated drift coefficient function (top panel) is almost linear and decreases with increasing values of the deseasonalised prices, while the diffusion function (bottom panel) is concave. For low values of the USO prices, there is only a very slight mean reversion. The sharp decline in the drift estimated at high values of deseasonalised prices has the effect of preventing crude oil prices from exploding towards infinity, despite the increase in volatility shown in the bottom panel of Figure 14. The shapes of the drift and the diffusion coefficient functions are largely consistent with the results reported in Figa-Talamanaca (2015) as well as Ignatieva (2014) for the nonparametric estimates of the drift and diffusion coefficients for energy and commodity prices. Along with the estimates we plot the pointwise 95 % confidence bands (red dotted lines) for the drift and the diffusion estimates. As expected, the confidence bands are tight for the central values of deseasonalised prices and widen in the tails, where the data become sparse.

Figure 14:

Nonparametric estimates of the drift and diffusion coefficients, $\hat{\mu}(x)$ and $\hat{\sigma}(x)$, in the dynamics of the stochastic component of USO price. The estimates are obtained with the local linear kernel regression where the bandwidth parameter is selected using the information-theoretic procedure of Hurvich, Simonoff, and Tsai (1998), h = 0.531. The red dotted curves represent the 95 % confidence bands obtained via the bootstrap method.

References

Aït-Sahalia, Y., and J. Park. 2016. “Bandwidth Selection and Asymptotic Properties of Local Nonparametric Estimators in Possibly Nonstationary Continuous-Time Models.” Journal of Econometrics 192: 119–38. https://doi.org/10.1016/j.jeconom.2015.11.002.

Arismendi, J. C., J. Back, M. Prokopczuk, R. Paschke, and M. Rudolf. 2016. “Seasonal Stochastic Volatility: Implications for the Pricing of Commodity Options.” Journal of Banking & Finance 66: 53–65. https://doi.org/10.1016/j.jbankfin.2016.02.001.

Askari, H., and N. Krichene. 2008. “Oil Price Dynamics (2002–2006).” Energy Economics 30 (5): 2134–53. https://doi.org/10.1016/j.eneco.2007.12.004.

Bandi, F., and T. Nguyen. 1999. “Fully Nonparametric Estimators for Diffusions: A Small Sample Analysis.” Working paper. University of Chicago.

Bandi, F., and T. Nguyen. 2003. “On the Functional Estimation of Jump-Diffusion Process.” Journal of Econometrics 116: 292–328. https://doi.org/10.1016/S0304-4076(03)00110-6.

Bandi, F., and P. Phillips. 2003. “Fully Nonparametric Estimation of Scalar Diffusion Models.” Econometrica 71: 241–83. https://doi.org/10.1111/1468-0262.00395.

Barndorff-Nielsen, O. E., F. E. Benth, and A. E. D. Veraart. 2013. “Modelling Energy Spot Prices by Volatility Modulated Lévy-Driven Volterra Processes.” Bernoulli 19 (3): 803–45. https://doi.org/10.3150/12-bej476.

Baum, C. F., and P. Zerilli. 2016. “Jumps and Stochastic Volatility in Crude Oil Futures Prices Using Conditional Moments of Integrated Volatility.” Energy Economics 53: 175–81. https://doi.org/10.1016/j.eneco.2014.10.007.

Benth, F., J. Benth, and S. Koekebakker. 2008. “Statistical Modelling of Electricity and Related Markets.” In Advanced Series on Statistical Science and Applied Probability. Singapore: World Scientific. https://doi.org/10.1142/9789812812315.

Bowman, A. 1984. “An Alternative Method of Cross-Validation for the Smoothing of Density Estimate.” Biometrika 71: 353–60. https://doi.org/10.1093/biomet/71.2.353.

Brix, A. F., A. Lunde, and W. Wei. 2018. “A Generalized Schwartz Model for Energy Spot Prices – Estimation Using a Particle MCMC Method.” Energy Economics 72: 560–82. https://doi.org/10.1016/j.eneco.2018.03.037.

Cai, Z., and Y. Hong. 2009. “Some Recent Developments in Nonparametric Finance.” In Nonparametric Econometric Methods, Vol. 25, 379–432. Leeds, England: Emerald Group Publishing Limited. https://doi.org/10.1108/S0731-9053(2009)0000025015.

Catalão, J., H. Pousinho, and V. Mendes. 2011. “Hybrid Wavelet-Pso-Anfis Approach for Short-Term Electricity Prices Forecasting.” IEEE Transactions on Power Systems 26 (1): 137–44. https://doi.org/10.1109/TPWRS.2010.2049385.

Chapman, D., and N. Pearson. 2000. “Is the Short Rate Drift Actually Nonlinear?” The Journal of Finance 55 (1): 355–88. https://doi.org/10.1111/0022-1082.00208.

Chen, L., M. Cheng, and L. Peng. 2009. “Conditional Variance Estimation in Heteroscedastic Regression Models.” Journal of Statistical Planning and Inference 139 (2): 236–45. https://doi.org/10.1016/j.jspi.2008.04.020.

Cheng, B., C. S. Nikitopoulos, and E. Schlögl. 2018. “Pricing of Long-Dated Commodity Derivatives: Do Stochastic Interest Rates Matter?” Journal of Banking & Finance 95: 148–66. https://doi.org/10.1016/j.jbankfin.2017.05.012.

Clark, R. 1977. “Non-Parametric Estimation of a Smooth Regression Function.” Journal of the Royal Statistical Society: Series B 39: 107–13. https://doi.org/10.1111/j.2517-6161.1977.tb01611.x.

Cortazar, G., M. Lopez, and L. Naranjo. 2017. “A Multifactor Stochastic Volatility Model of Commodity Prices.” Energy Economics 67: 182–201. https://doi.org/10.1016/j.eneco.2017.08.007.

Cox, J., J. Ingersoll, and S. Ross. 1985. “A Theory of the Term Structure of Interest Rates.” Econometrica 53: 385–407. https://doi.org/10.2307/1911242.

Craven, P., and G. Wahba. 1979. “Smoothing Noisy Data with Spline Functions.” Numerische Mathematik 13: 377–403. https://doi.org/10.1007/bf01404567.

Daubechies, I. 1992. Ten Lectures on Wavelets. Philadelphia, Pennsylvania, United States: SIAM - Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9781611970104.

Epanechnikov, V. 1969. “Non-Parametric Estimation of a Multivariate Probability Density.” Theory of Probability and Its Applications 14 (1): 153–8. https://doi.org/10.1137/1114019.

Fan, J., and I. Gijbels. 1996. Local Polynomial Modelling and its Applications. London: Chapman & Hall.

Fan, J., and Q. Yao. 1998. “Efficient Estimation of Conditional Variance Functions in Stochastic Regression.” Biometrika 85: 645–60. https://doi.org/10.1093/biomet/85.3.645.

Fan, J., and Q. Yao. 2005. Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer-Verlag.

Fan, J., and C. Zhang. 2003. “A Re-Examination of Diffusion Estimators with Applications to Financial Model Validation.” Journal of the American Statistical Association 98: 118–34. https://doi.org/10.1198/016214503388619157.

Figa-Talamanaca, G. 2015. Handbook of Multi-Commodity Markets and Products: Structuring, Trading and Risk Management, Chapter Nonparametric Estimation of Energy Commodity Price Processes, 659–72. New Jersey, United States: John Wiley & Sons Ltd. https://doi.org/10.1002/9781119011590.ch13.

Geisser, S. 1975. “A Predictive Sample Reuse Method with Application.” Journal of the American Statistical Association 70: 320–8. https://doi.org/10.2307/2285815.

Gencay, R., F. Selcuk, and B. Whitcher. 2002. An Introduction to Wavelets and Other Filtering Methods in Finance and Economics. Cambridge, Massachusetts, United States: Academic Press. https://doi.org/10.1016/B978-012279670-8.50004-5.

Gudkov, N., and K. Ignatieva. 2021. “Electricity Price Modelling with Stochastic Volatility and Jumps: An Empirical Investigation.” Energy Economics 98. https://doi.org/10.1016/j.eneco.2021.105260.

Haar, A. 1910. “Zur Theorie der Orthogonalen Funktionensysteme.” Mathematische Annalen 69: 331–71. https://doi.org/10.1007/bf01456326.

Hall, P., and J. Racine. 2015. “Infinite Order Cross-Validated Local Polynomial Regression.” Journal of Econometrics 185 (2): 510–25. https://doi.org/10.1016/j.jeconom.2014.06.003.

Hamilton, J. D. 2009. “Causes and Consequences of the Oil Shock of 2007–08.” Brookings Papers on Economic Activity 40 (1): 215–83. https://doi.org/10.1353/eca.0.0047.

Hart, J., and S. Yi. 1998. “One-Sided Cross Validation.” Journal of the American Statistical Association 93: 620–31. https://doi.org/10.1080/01621459.1998.10473715.

Heath, D. 2019. “Macroeconomic Factors in Oil Futures Markets.” Management Science 65 (9): 3949–4450. https://doi.org/10.1287/mnsc.2017.3008.

Hilliard, J. E., and J. Reis. 1998. “Valuation of Commodity Futures and Options under Stochastic Convenience Yields, Interest Rates, and Jump Diffusions in the Spot.” Journal of Financial and Quantitative Analysis 33 (1): 61–86. https://doi.org/10.2307/2331378.

Hurvich, C., J. Simonoff, and C. Tsai. 1998. “Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion.” Journal of the Royal Statistical Society: Series B 60: 271–93. https://doi.org/10.1111/1467-9868.00125.

Ignatieva, K. 2014. “A Nonparametric Model for Spot Price Dynamics and Pricing of Futures Contracts in Electricity Markets.” Studies in Nonlinear Dynamics and Econometrics 18 (5): 483–505. https://doi.org/10.1515/snde-2012-0001.

Ignatieva, K., and N. Ponomareva. 2017. “Commodity Currencies and Commodity Prices: Modelling Static and Time-Varying Dependence.” Applied Economics 49 (43): 4337–59. https://doi.org/10.1080/00036846.2017.1284970.

Ignatieva, K., and P. Wong. 2022. “Modelling High Frequency Crude Oil Dynamics Using Affine and Non-Affine Jump-Diffusion Models.” Energy Economics 108. https://doi.org/10.1016/j.eneco.2022.105873.

Janczura, J., S. Trück, R. Weron, and R. Wolff. 2013. “Identifying Spikes and Seasonal Components in Electricity Spot Price Data: A Guide to Robust Modeling.” Energy Economics 38: 96–110. https://doi.org/10.1016/j.eneco.2013.03.013.

Kang, B., C. S. Nikitopoulos, and M. Prokopczuk. 2020. “Economic Determinants of Oil Futures Volatility: A Term Structure Perspective.” Energy Economics 88: 104743. https://doi.org/10.1016/j.eneco.2020.104743.

Kilian, L. 2009. “Not All Oil Price Shocks are Alike: Disentangling Demand and Supply Shocks in the Crude Oil Market.” The American Economic Review 99 (3): 1053–69. https://doi.org/10.1257/aer.99.3.1053.

Köhler, M., A. Schindler, and S. Sperlich. 2014. “A Review and Comparison of Bandwidth Selection Methods for Kernel Regression.” International Statistical Review 82 (2): 243–74. https://doi.org/10.1111/insr.12039.

Kyriakou, I., P. Pouliasis, and N. C. Papapostolou. 2016. “Jumps and Stochastic Volatility in Crude Oil Prices and Advances in Average Option Pricing.” Quantitative Finance 16 (12): 1859–73. https://doi.org/10.1080/14697688.2016.1211798.

Larsson, K., and M. Nossman. 2011. “Jumps and Stochastic Volatility in Oil Prices: Time Series Evidence.” Energy Economics 33: 504–14. https://doi.org/10.1016/j.eneco.2010.12.016.

Li, Q., and J. Racine. 2007. Nonparametric Econometrics: Theory and Practice. Princeton: Princeton University Press.

Loader, C. 1999. “Bandwidth Selection: Classical or Plug-In?” Annals of Statistics 27 (2): 415–38. https://doi.org/10.1214/aos/1018031201.

Mammen, M., M. Martinez-Miranda, J. Nielsen, and S. Sperlich. 2011. “Do-Validation for Kernel Density Estimation.” Journal of the American Statistical Association 20: 91–113. https://doi.org/10.1198/jasa.2011.tm08687.

Mancini, C., and R. Reno. 2011. “Threshold Estimation of Markov Models with Jumps and Interest Rate Modelling.” Journal of Econometrics 160: 77–92. https://doi.org/10.1016/j.jeconom.2010.03.019.

Nadaraya, E. 1964. “Some New Estimates for Distribution Functions.” Theory of Probability and Its Applications 9: 497–500. https://doi.org/10.1137/1109069.

Nadaraya, E. 1965. “On Nonparametric Estimates of Density Functions and Regression Curves.” Theory of Applied Probability 10: 186–90. https://doi.org/10.1137/1110024.

Nowotarski, J., J. Tomczyk, and R. Weron. 2013. “Robust Estimation and Forecasting of the Long-Term Seasonal Component of Electricity Spot Prices.” Energy Economics 39: 13–27. https://doi.org/10.1016/j.eneco.2013.04.004.

Park, J., and B. Wang. 2021. “Nonparametric Estimation of Jump Diffusion Models.” Journal of Econometrics 222: 688–715. https://doi.org/10.1016/j.jeconom.2020.07.020.

Percival, D., and A. Walden. 2000. Wavelet Methods for Time Series Analysis. England: Cambridge University Press. https://doi.org/10.1017/CBO9780511841040.

Pindoriya, N., S. N. Singh, and S. Singh. 2008. “An Adaptive Wavelet Neural Network-Based Energy Price Forecasting in Electricity Markets.” IEEE Transactions on Power Systems 23 (3): 1423–32. https://doi.org/10.1109/TPWRS.2008.922251.

Pindyck, R. 2004. “Volatility and Commodity Price Dynamics.” Journal of Futures Markets 24: 1029–47. https://doi.org/10.1002/fut.20120.

Qiao, W., and Z. Yang. 2020. “Forecast the Electricity Price of U.S. Using a Wavelet Transform-Based Hybrid Model.” Energy 193: 116704. https://doi.org/10.1016/j.energy.2019.116704.

Racine, J. 2019. An Introduction to the Advanced Theory and Practice of Nonparametric Econometrics. A Replicable Approach Using R. England: Cambridge University Press. https://doi.org/10.1017/9781108649841.

Rice, J. 1984. “Bandwidth Choice for Nonparametric Regression.” Annals of Statistics 12: 1215–30. https://doi.org/10.1214/aos/1176346788.

Risse, M. 2019. “Combining Wavelet Decomposition with Machine Learning to Forecast Gold Returns.” International Journal of Forecasting 35: 601–15. https://doi.org/10.1016/j.ijforecast.2018.11.008.

Rudemo, M. 1984. “Empirical Choice of Histograms and Kernel Density Estimators.” Scandinavian Journal of Statistics 9: 65–78.

Ruppert, D., S. Sheather, and M. Wand. 1995. “An Effective Bandwidth Selector for Local Least Squared Regression.” Journal of the American Statistical Association 90: 1257–70. https://doi.org/10.2307/2291516.

Shibata, R. 1981. “An Optimal Selection of Regression Variables.” Biometrika 68: 45–54. https://doi.org/10.2307/2335804.

Stanton, R. 1997. “A Nonparametric Model of Term Structure Dynamics and the Market Price of Interest Rate Risk.” The Journal of Finance 52 (5): 1973–2002. https://doi.org/10.1111/j.1540-6261.1997.tb02748.x.

Stone, C. 1974. “Cross-Validatory Choice and Assessment of Statistical Predictions (With Discussion).” Journal of the Royal Statistical Society 36: 111–47. https://doi.org/10.1111/j.2517-6161.1974.tb00994.x.

Stroud, J. R., and M. S. Johannes. 2014. “Bayesian Modeling and Forecasting of 24-hour High-Frequency Volatility.” Journal of the American Statistical Association 109 (508): 1368–84. https://doi.org/10.1080/01621459.2014.937003.

Tan, Z., J. Zhang, J. Wang, and J. Xu. 2010. “Day-Ahead Electricity Price Forecasting Using Wavelet Transform Combined with Arima and Garch Models.” Applied Energy 87 (12): 3606–10. https://doi.org/10.1016/J.APENERGY.2010.05.012.

Trolle, A. B., and E. S. Schwartz. 2009. “Unspanned Stochastic Volatility and the Pricing of Commodity Derivatives.” Review of Financial Studies 22 (11): 4423–61. https://doi.org/10.1093/rfs/hhp036.

Watson, G. 1964. “Smooth Regression Analysis.” Sankhya 26 (15): 175–84.

Yang, L. 2019. “Connectedness of Economic Policy Uncertainty and Oil Price Shocks in a Time Domain Perspective.” Energy Economics 80: 219–33. https://doi.org/10.1016/j.eneco.2019.01.006.

Yang, L. 2023. “Oil Price Bubbles: The Role of Network Centrality on Idiosyncratic Sovereign Risk.” Resources Policy 82: 103493. https://doi.org/10.1016/j.resourpol.2023.103493.

Received: 2022-12-01

Accepted: 2024-10-25

Published Online: 2024-12-24

https://doi.org/10.1515/snde-2022-0113
