On the approximation of high-order binary Markov chains by parsimonious models

Yuriy S. Kharin; Valeriy A. Voloshko

doi:10.1515/dma-2024-0007

Published/Copyright: April 15, 2024

From the journal Discrete Mathematics and Applications Volume 34 Issue 2

Abstract

We consider two parsimonious models of binary high-order Markov chains and study their ability to approximate arbitrary high-order Markov chains. Two types of global measures of approximation accuracy are introduced, and theoretical and experimental results are obtained for these measures and for the considered parsimonious models. A new consistent statistical parameter estimator is constructed for the parsimonious model based on a two-layer artificial neural network.

Keywords: high-order Markov chain; parsimonious model; approximation; artificial neural network; statistical estimation

Originally published in Diskretnaya Matematika (2022) 34, №3, 114–135 (in Russian).

Funding statement: This work was supported by the State scientific research program of the Republic of Belarus, project No. 20211983.

Appendix

Proof of Lemma 7

In view of (53),

$$e_k=\sum_{J\in\mathcal{J}(T)}\hat{u}(J)\,F_k(A_k'J),$$

this proves (56). According to (51) and (53),

$$d_{qr}=\sum_{J\in\mathcal{J}(T)}F_q(A_q'J)\,F_r(A_r'J),$$

and, by analogy with the above, we obtain (57).
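The differentiation step behind (56) and (57) can be sketched as follows. This is a reconstruction from the way the two formulas are used in (63) and (64), not a quotation of the lemma: it assumes that $f_k$ denotes the derivative of $F_k$, that $A_k=(a_{k,1},\dots,a_{k,s})'$, that $J=(j_1,\dots,j_s)'$, and that $\hat{u}(J)$ does not depend on the coefficients $a_{i,\nu}$. Under these assumptions the chain rule gives

$$\frac{\partial}{\partial a_{i,\nu}}F_k(A_k'J)=\delta_{ik}\,j_\nu\,f_i(A_i'J),$$

and therefore

$$\frac{\partial e_k}{\partial a_{i,\nu}}=\delta_{ik}\sum_{J\in\mathcal{J}(T)}\hat{u}(J)\,f_i(A_i'J)\,j_\nu,\qquad
\frac{\partial d_{qr}}{\partial a_{i,\nu}}=\sum_{J\in\mathcal{J}(T)}j_\nu\bigl(\delta_{iq}F_r(A_r'J)+\delta_{ir}F_q(A_q'J)\bigr)f_i(A_i'J).$$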

Equation (58) follows from [28, (A.9.5)]:

$$\frac{\partial D^{-1}}{\partial d_{qr}}=-D^{-1}K_{qr}D^{-1},$$

where $K_{qr}$ is the $(m\times m)$-matrix whose only nonzero element is a 1 at position $(q,r)$; alternatively, it follows from [29, (1.4.21)]:

$$\left(\frac{\partial D^{-1}}{\partial D}\right)_{(k,l),(q,r)}=-(D^{-1})_{kq}\,((D^{-1})')_{lr}=-\bar{d}_{kq}\,\bar{d}_{rl}.$$

□
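The identity for the derivative of the inverse matrix can be checked numerically. The sketch below compares $-D^{-1}K_{qr}D^{-1}$ with a central finite difference of $D^{-1}$ in the single entry $d_{qr}$. The $3\times 3$ matrix `D` and all helper names are illustrative assumptions and are not taken from the paper; the entries of `D` are perturbed independently, as in the identity itself.

```python
# Finite-difference check of dD^{-1}/dd_qr = -D^{-1} K_qr D^{-1} (a sketch).
# The matrix D below is an arbitrary illustrative example, not from the paper.

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_inv(a):
    """Inverse via Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def analytic_derivative(d, q, r):
    """-D^{-1} K_qr D^{-1}, where K_qr has a single 1 at position (q, r)."""
    n = len(d)
    d_inv = mat_inv(d)
    k_qr = [[1.0 if (i, j) == (q, r) else 0.0 for j in range(n)]
            for i in range(n)]
    prod = mat_mul(mat_mul(d_inv, k_qr), d_inv)
    return [[-x for x in row] for row in prod]

def numeric_derivative(d, q, r, h=1e-6):
    """Central difference of D^{-1} with respect to the single entry d_qr."""
    n = len(d)
    plus = [row[:] for row in d]
    minus = [row[:] for row in d]
    plus[q][r] += h
    minus[q][r] -= h
    inv_p, inv_m = mat_inv(plus), mat_inv(minus)
    return [[(inv_p[i][j] - inv_m[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]

def max_abs_diff(a, b):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

D = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]

# Worst disagreement between the two derivatives over all positions (q, r).
worst = max(max_abs_diff(analytic_derivative(D, q, r),
                         numeric_derivative(D, q, r))
            for q in range(3) for r in range(3))
print(worst < 1e-6)
```

The $(k,l)$ entry of `analytic_derivative(D, q, r)` equals $-\bar{d}_{kq}\bar{d}_{rl}$, which is the elementwise form of the same identity.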

Proof of Lemma 8

In view of (55) we have:

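$$\frac{\partial w_1}{\partial a_{i,\nu}}=-\frac{\partial}{\partial a_{i,\nu}}\sum_{k,l=1}^{m}e_ke_l\bar{d}_{kl}=-\sum_{k,l=1}^{m}\left(e_ke_l\frac{\partial\bar{d}_{kl}}{\partial a_{i,\nu}}+\bar{d}_{kl}\left(\frac{\partial e_k}{\partial a_{i,\nu}}e_l+\frac{\partial e_l}{\partial a_{i,\nu}}e_k\right)\right).\tag{62}$$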

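Using relation (56) and Lemma 7, we find the expression for the second group of summands in (62), taking into account (53) and the symmetry of $D$:

$$\begin{aligned}
\sum_{k,l=1}^{m}\bar{d}_{kl}\left(\frac{\partial e_k}{\partial a_{i,\nu}}e_l+\frac{\partial e_l}{\partial a_{i,\nu}}e_k\right)
&=\sum_{J\in\mathcal{J}(T)}\hat{u}(J)\,f_i(A_i'J)\,j_\nu\sum_{k,l=1}^{m}\bar{d}_{kl}\left(\delta_{ik}e_l+\delta_{il}e_k\right)\\
&=\sum_{J\in\mathcal{J}(T)}\hat{u}(J)\,f_i(A_i'J)\,j_\nu\left(\sum_{l=1}^{m}e_l\bar{d}_{il}+\sum_{k=1}^{m}e_k\bar{d}_{ki}\right)\\
&=2\hat{b}_i\sum_{J=(j_1,\dots,j_s)'\in\mathcal{J}(T)}j_\nu\,\hat{u}(J)\,f_i(A_i'J).
\end{aligned}\tag{63}$$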

Now consider the first group of summands in (62). We have

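$$\frac{\partial\bar{d}_{kl}}{\partial a_{i,\nu}}=\sum_{q,r=1}^{m}\frac{\partial\bar{d}_{kl}}{\partial d_{qr}}\cdot\frac{\partial d_{qr}}{\partial a_{i,\nu}}.$$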

From (57), (58) and Lemma 7 we obtain, accounting for (51):

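$$\begin{aligned}
\frac{\partial\bar{d}_{kl}}{\partial a_{i,\nu}}
&=-\sum_{q,r=1}^{m}\bar{d}_{kq}\bar{d}_{rl}\sum_{J\in\mathcal{J}(T)}j_\nu\bigl(\delta_{iq}F_r(A_r'J)+\delta_{ir}F_q(A_q'J)\bigr)f_i(A_i'J)\\
&=-\sum_{J\in\mathcal{J}(T)}j_\nu\,f_i(A_i'J)\left(\bar{d}_{ki}\sum_{r=1}^{m}\bar{d}_{rl}F_r(A_r'J)+\bar{d}_{il}\sum_{q=1}^{m}\bar{d}_{kq}F_q(A_q'J)\right)\\
&=-\sum_{J\in\mathcal{J}(T)}j_\nu\,f_i(A_i'J)\left(\bar{d}_{ik}(D^{-1}G)_l+\bar{d}_{il}(D^{-1}G)_k\right).
\end{aligned}\tag{64}$$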

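Substituting (64) and (63) into (62) and transforming the result with (53) taken into account, we get

$$\begin{aligned}
\frac{\partial w_1}{\partial a_{i,\nu}}
&=-2\hat{b}_i\sum_{J\in\mathcal{J}(T)}j_\nu\,f_i(A_i'J)\,\hat{u}(J)+\sum_{J\in\mathcal{J}(T)}j_\nu\,f_i(A_i'J)\sum_{k,l=1}^{m}e_ke_l\left(\bar{d}_{ik}(D^{-1}G)_l+\bar{d}_{il}(D^{-1}G)_k\right)\\
&=-2\sum_{J\in\mathcal{J}(T)}j_\nu\,f_i(A_i'J)\left(\hat{b}_i\hat{u}(J)-(D^{-1}E)_i\,G'D^{-1}E\right),
\end{aligned}$$

which proves (59). □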
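The closed-form derivative obtained in this proof can be compared with a finite difference on a toy instance. The sketch below is only an illustration under explicit assumptions that are not taken from the paper: $m=2$, $s=3$, all $F_k$ equal to the logistic sigmoid, $\mathcal{J}(T)$ equal to all binary vectors of length 3 with components in $\{0,1\}$, hypothetical values of $\hat{u}(J)$, $\hat{b}=D^{-1}E$, and $w_1$ taken as $-E'D^{-1}E$ up to an additive constant, as suggested by (62).

```python
import math
from itertools import product

def F(x):
    """Logistic sigmoid: an assumed activation, the same for every k."""
    return 1.0 / (1.0 + math.exp(-x))

def f(x):
    """Derivative of F."""
    y = F(x)
    return y * (1.0 - y)

def dot(a, j):
    """Inner product A_k' J."""
    return sum(x * y for x, y in zip(a, j))

def pieces(A, u_hat):
    """Return E and b_hat = D^{-1} E for m = 2, with e_k and d_qr as sums over J."""
    E = [sum(u * F(dot(A[k], J)) for J, u in u_hat.items()) for k in range(2)]
    D = [[sum(F(dot(A[q], J)) * F(dot(A[r], J)) for J in u_hat)
          for r in range(2)] for q in range(2)]
    det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    D_inv = [[D[1][1] / det, -D[0][1] / det],
             [-D[1][0] / det, D[0][0] / det]]
    b_hat = [sum(D_inv[i][l] * E[l] for l in range(2)) for i in range(2)]
    return E, b_hat

def w1(A, u_hat):
    """w_1 up to an additive constant (assumption): -E' D^{-1} E."""
    E, b_hat = pieces(A, u_hat)
    return -sum(e * b for e, b in zip(E, b_hat))

def grad_w1(A, u_hat, i, nu):
    """Closed-form derivative of w_1 with respect to a_{i,nu}."""
    _, b_hat = pieces(A, u_hat)
    total = 0.0
    for J, u in u_hat.items():
        g_b = sum(F(dot(A[k], J)) * b_hat[k] for k in range(2))  # G' D^{-1} E
        total += J[nu] * f(dot(A[i], J)) * (b_hat[i] * u - b_hat[i] * g_b)
    return -2.0 * total

def numeric_grad(A, u_hat, i, nu, h=1e-6):
    """Central finite difference of w_1 in the single coefficient a_{i,nu}."""
    up = [row[:] for row in A]
    down = [row[:] for row in A]
    up[i][nu] += h
    down[i][nu] -= h
    return (w1(up, u_hat) - w1(down, u_hat)) / (2 * h)

# Hypothetical coefficients and u_hat values, chosen only for illustration.
A = [[0.3, -0.7, 1.1], [-0.5, 0.4, 0.2]]
u_hat = {J: 0.1 + 0.2 * J[0] - 0.15 * J[1] + 0.3 * J[0] * J[2]
         for J in product((0, 1), repeat=3)}

# Largest gap between the closed form and the finite difference over all (i, nu).
max_err = max(abs(grad_w1(A, u_hat, i, nu) - numeric_grad(A, u_hat, i, nu))
              for i in range(2) for nu in range(3))
print(max_err < 1e-6)
```

The explicit $2\times 2$ inverse keeps the sketch short; for general $m$ a linear solver would replace it.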

The authors are grateful to A. M. Zubkov for comments and recommendations that helped to improve the paper.

References

[1] Lütkepohl H., New Introduction to Multiple Time Series Analysis, Springer-Verlag, Berlin, 2005, xxi+764 pp.

[2] Kedem B., Fokianos K., Regression Models for Time Series Analysis, Wiley, Hoboken, 2002, 360 pp.

[3] Weiss C. H., An Introduction to Discrete-Valued Time Series, Wiley, Hoboken, 2018, 304 pp.

[4] MacDonald I. L., Zucchini W., Hidden Markov and Other Models for Discrete-valued Time Series, Chapman and Hall, N.-Y., 1997, 256 pp.

[5] Billingsley P., “Statistical methods in Markov chains”, Ann. Math. Statist., 32:1 (1961), 12–40.

[6] Fokianos K., Fried R., Kharin Yu. S., Voloshko V., “Statistical Analysis of Multivariate Discrete-Valued Time Series”, J. Multivar. Anal., 188:C (2022).

[7] Kharin Yu. S., Voloshko V. A., Dernakova O. V., Malyugin V. I., Kharin A. Yu., “Statistical forecasting of the dynamics of epidemiological indicators of the incidence of COVID-19 in the Republic of Belarus”, Zh. Belorus. gos. un-ta Matem. Inform., 3 (2020), 36–50 (in Russian).

[8] Doob J. L., Stochastic processes, John Wiley & Sons, N.-Y., 1953, 654 pp.

[9] Kharin Yu. S., “Markov chains with r-partial connections and their statistical evaluation”, Doklady of the NAS of Belarus, 48:1 (2004), 40–44 (in Russian).

[10] Kharin Yu. S., Petlitskiĭ A. I., “A Markov chain of order s with r partial connections and statistical inference on its parameters”, Discrete Math. Appl., 17:3 (2007), 295–317.

[11] Kharin Yu. S., Maltsew M. V., “Statistical analysis of high-order dependencies”, Acta Comment. Univ. Tartuensis Mathem., 21:1 (2017), 79–91.

[12] Bühlmann P., Wyner A., “Variable length Markov chains”, Ann. Statist., 27:2 (1999), 480–513.

[13] Jacobs P. A., Lewis P. A. W., “Stationary discrete autoregressive-moving average time series generated by mixtures”, J. Time Series Anal., 4:1 (1983), 19–36.

[14] Raftery A., Tavaré S., “Estimation and modelling repeated patterns in high order Markov chains with the mixture transition distribution model”, J. Appl. Statist., 43:1 (1994), 179–199.

[15] Maksimov Yu. I., “On Markov chains associated with binary shift registers with random elements”, Trudy po Diskretnoi Matematike, 1 (1997), 203–220 (in Russian).

[16] Alzaid A. A., Al-Osh M., “An integer-valued p-th order autoregressive structure (INAR(p)) process”, J. Appl. Probab., 27:2 (1990), 314–324.

[17] Kharin Yu. S., Voloshko V. A., Medved E. A., “Statistical estimation of parameters for binary conditionally nonlinear autoregressive time series”, Math. Meth. Statist., 27:2 (2018), 103–118.

[18] Kharin Yu., Voloshko V., “Robust estimation for binomial conditionally nonlinear autoregressive time series based on multivariate conditional frequencies”, J. Multivar. Anal., 185 (2021), 11–27.

[19] Kharin Yu. S., Voloshko V. A., “Binomial conditionally nonlinear autoregressive model of a discrete time series and its probabilistic and statistical properties”, Proc. Inst. Math. NAS Belarus, 26:1 (2019), 95–105 (in Russian).

[20] Kharin Yu. S., “Neural network-based models of binomial time series in data analysis problems”, Doklady of the NAS of Belarus, 65:6 (2021), 654–660 (in Russian).

[21] Zubkov A. M., Serov A. A., “Testing the NIST Statistical Test Suite on artificial pseudorandom sequences”, Matematicheskie voprosy kriptografii, 10:2 (2019), 89–96.

[22] Kolmogorov A. N., “On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition”, Dokl. Akad. Nauk SSSR, 114:5 (1957), 953–956 (in Russian).

[23] Cybenko G., “Approximation by superpositions of sigmoidal functions”, Math. of Control, Signals, and Systems, 2 (1989), 303–314.

[24] Amari S., Nagaoka H., Methods of Information Geometry, Oxford University Press, 2000, 206 pp.

[25] Schläfli L., Gesammelte mathematische Abhandlungen: Band 1, Birkhäuser, Basel, 1850, 392 pp.

[26] Zuev Yu. A., “Methods of geometry and probabilistic combinatorics in threshold logic”, Discrete Math. Appl., 2:4 (1992), 427–438.

[27] Gantmacher F. R., The Theory of Matrices, Chelsea Publ. Co., N.-Y., 1959, vol. 1: x+374 pp., vol. 2: x+277 pp.

[28] Mardia K. V., Kent J. T., Bibby J. M., Multivariate Analysis, Academic Press, N.-Y., 1979, 521 pp.

[29] Kollo T., Rosen D., Multivariate Statistics and Matrices, Springer, Dordrecht, 2005, 506 pp.

Received: 2022-04-19

Published Online: 2024-04-15

Published in Print: 2024-04-25
