Generative modelling of financial time series with structured noise and MMD-based signature learning

  • Lu Chung I and Julian Sester
Published/Copyright: November 12, 2025

Abstract

Generating synthetic financial time series data that accurately reflects real-world market dynamics holds tremendous potential for various applications, including portfolio optimization, risk management, and large-scale machine learning. We present an approach that uses structured noise to train generative models for financial time series. The signature transform, used in the form of a signature kernel to train generative models, has been shown to be expressive enough to capture the complex dependencies and temporal structures inherent in financial data. We employ a moving average model for the variance of the noise input, enhancing the model's ability to reproduce stylized facts such as volatility clustering. Through empirical experiments on S&P 500 index data, we demonstrate that our model effectively captures key characteristics of financial time series and outperforms comparable approaches. In addition, we explore using the generated synthetic data to train a reinforcement learning agent for portfolio management, achieving promising results. Finally, we propose a method to add robustness to the generative model by tweaking the noise input, so that the generated sequences can be adjusted to different market environments with minimal data.

MSC 2020: 91G80

A Heston model experiment

The Heston model is defined by the following stochastic differential equations:

$$dS_t = \mu S_t \, dt + \sqrt{v_t}\, S_t \, dW_t^1,$$
$$dv_t = \kappa(\theta - v_t)\, dt + \sigma \sqrt{v_t}\, dW_t^2,$$

where $S_t$ is the asset price, $v_t$ is the variance of the asset price, $\mu$ is the drift, $\kappa$ is the mean reversion rate, $\theta$ is the long-term variance, $\sigma$ is the volatility of the variance, and $W_t^1$ and $W_t^2$ are Brownian motions with $dW_t^1 \, dW_t^2 = \rho \, dt$ for some correlation parameter $\rho \in [-1, 1]$.

We set the parameters $\mu = 0.2$, $\kappa = 1$, $\theta = 0.25$, $\sigma = 0.7$, $\rho = -0.7$ and the initial variance $v_0 = 0.09$. Using the QuantLib Python library (https://quantlib-python-docs.readthedocs.io/en/latest/), we generate 6,400 samples of length 250 with a time step of 1/252. The QuantLib implementation uses a first-order Euler discretisation to simulate trajectories of the Heston model.
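For concreteness, a minimal sketch of this simulation step with the QuantLib Python bindings might look as follows. The flat-curve setup used to obtain the drift $\mu = 0.2$ (a flat risk-free rate with zero dividends), the random seed, and all variable names are our assumptions, not the authors' code:

```python
import numpy as np
import QuantLib as ql

# Heston parameters from the experiment
mu, kappa, theta, sigma, rho, v0 = 0.2, 1.0, 0.25, 0.7, -0.7, 0.09
n_paths, n_steps, dt = 6400, 250, 1 / 252

today = ql.Date.todaysDate()
day_count = ql.Actual365Fixed()
# The drift mu enters via a flat risk-free curve with zero dividends (our assumption)
rate_ts = ql.YieldTermStructureHandle(ql.FlatForward(today, mu, day_count))
div_ts = ql.YieldTermStructureHandle(ql.FlatForward(today, 0.0, day_count))
spot = ql.QuoteHandle(ql.SimpleQuote(1.0))
process = ql.HestonProcess(rate_ts, div_ts, spot, v0, kappa, theta, sigma, rho)

times = ql.TimeGrid(n_steps * dt, n_steps)
# The Heston process has two factors, so each path needs 2 * n_steps Gaussians
rng = ql.GaussianRandomSequenceGenerator(
    ql.UniformRandomSequenceGenerator(process.factors() * n_steps,
                                      ql.UniformRandomGenerator(42)))
path_gen = ql.GaussianMultiPathGenerator(process, list(times), rng, False)

paths = np.empty((n_paths, n_steps + 1))
for i in range(n_paths):
    multipath = path_gen.next().value()
    # Component 0 of the multipath is the asset price; component 1 is the variance
    paths[i] = [multipath[0][j] for j in range(n_steps + 1)]
```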

The models are trained using a simplified version of Algorithm 1, where the conditioning of the LSTM cell state and hidden state is omitted, and only the time augmentation is used instead of both lead-lag and time augmentation.
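To illustrate the time augmentation referred to here: the path is augmented with its time stamps as an additional channel before signatures are computed. A minimal sketch, where the function name and array conventions are our assumptions:

```python
import numpy as np

def time_augment(x: np.ndarray, times: np.ndarray) -> np.ndarray:
    """Prepend the time stamps as an extra channel of the path.

    x     : (n_steps, d) array of path values
    times : (n_steps,) array of increasing time stamps
    returns an (n_steps, d + 1) augmented path
    """
    return np.concatenate([times[:, None], x], axis=1)

# Example: augment a 1-d path sampled on an equidistant grid over one trading year
path = np.random.randn(250, 1).cumsum(axis=0)
augmented = time_augment(path, np.linspace(0, 250 / 252, 250))
```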

To simulate noise, we generate independent paths using the same parameters for the Heston process, except with $\mu = 0$, and then calculate

$$z_{t_i} = \frac{\log S_{t_i} - \log S_{t_{i-1}}}{\sqrt{(t_i - t_{i-1})\, v_0}}$$

to obtain the noise at time $t_i$. For the i.i.d. standard Gaussian noise, we use the same QuantLib Heston implementation but set $\kappa = \sigma = 10^{-9}$ and $\theta = v_0$, which effectively results in the noise having a mean of zero and a variance of one. This way of simulating the i.i.d. Gaussian noise allows the use of the same random seed to generate noise sequences that differ only in variance, which provides a fair comparison with the noise that has changing variance. We use a noise dimension of 2 as the Heston model has two Brownian motions.
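Continuing the simulation sketch above, the normalisation defining $z_{t_i}$ might be computed as follows. This is a sketch under our assumed array layout; only the price-based noise channel defined by the formula is shown, although the paper uses a noise dimension of 2:

```python
# paths has shape (n_paths, n_steps + 1); dt and v0 as defined above
log_prices = np.log(paths)
# z_{t_i} = (log S_{t_i} - log S_{t_{i-1}}) / sqrt((t_i - t_{i-1}) * v0)
noise = np.diff(log_prices, axis=1) / np.sqrt(dt * v0)  # shape (n_paths, n_steps)
```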

The truncation level was set to $m = 5$ and the static kernel was a rational quadratic kernel with $\alpha = 1$ and $l = 0.1$.
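For reference, the rational quadratic kernel is commonly parametrised as $k(x, y) = \bigl(1 + \lVert x - y \rVert^2 / (2\alpha l^2)\bigr)^{-\alpha}$; whether the authors use exactly this parametrisation is our assumption. A minimal sketch with the stated parameters:

```python
import numpy as np

def rational_quadratic(x: np.ndarray, y: np.ndarray,
                       alpha: float = 1.0, length_scale: float = 0.1) -> float:
    """Rational quadratic kernel k(x, y) = (1 + ||x - y||^2 / (2 alpha l^2))^(-alpha)."""
    sq_dist = np.sum((x - y) ** 2)
    return (1.0 + sq_dist / (2.0 * alpha * length_scale ** 2)) ** (-alpha)
```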

B Moments of returns

We briefly present the formulas used to calculate the empirical estimates of the moments of the returns in Section 5.6. Let $(r_1, r_2, \ldots, r_n)$ be a sample of log returns; we define $\bar{r} := \frac{1}{n}\sum_{i=1}^n r_i$ and $m_i := \frac{1}{n}\sum_{j=1}^n (r_j - \bar{r})^i$. We will assume that there are 252 business days in a year.

The first four moments of the log returns are named and calculated as follows (a small computational sketch follows the list):

  1. The mean is annualised by business day convention, which we calculate as

     $$\textbf{Ann. returns} = 252 \times \bar{r}.$$

  2. Similarly, volatility is also annualised by convention and calculated as

     $$\textbf{Volatility} = \sqrt{252 \times m_2}.$$

  3. The skew is calculated as

     $$\textbf{Skew} = \frac{m_3}{m_2^{3/2}}.$$

  4. The kurtosis is calculated as

     $$\textbf{Kurtosis} = \frac{m_4}{m_2^2} - 3.$$
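A minimal sketch of these four estimators, translating the formulas above directly (function and variable names are our assumptions):

```python
import numpy as np

def return_moments(r: np.ndarray) -> dict:
    """Empirical annualised moments of a sample of log returns r."""
    r_bar = r.mean()
    m = lambda k: np.mean((r - r_bar) ** k)  # central moment m_k
    return {
        "ann_returns": 252 * r_bar,
        "volatility": np.sqrt(252 * m(2)),
        "skew": m(3) / m(2) ** 1.5,
        "kurtosis": m(4) / m(2) ** 2 - 3,  # excess kurtosis
    }
```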


Received: 2025-01-17
Revised: 2025-10-21
Accepted: 2025-10-31
Published Online: 2025-11-12

© 2025 Walter de Gruyter GmbH, Berlin/Boston
