Discrete-time compartmental models with partially observed data: a comparison among frequentist and Bayesian approaches for addressing likelihood intractability
Abstract
Compartmental models have emerged as useful tools in epidemiology due to their mechanistic nature. They provide insights into complex dynamic systems and allow predictions under different scenarios. However, despite their widespread use, there is still a gap in the literature concerning their statistical formalization and a systematic discussion of the statistical methods suitable for both inference and forecasting. In this work, starting from the fundamental distinction between deterministic and stochastic compartmental models, we focus on how formulating the likelihood function becomes a necessary and challenging step in the transition from a deterministic to a stochastic framework. We then analyse the difficulties encountered in evaluating the likelihood function associated with discrete-time stochastic models. We distinguish two reasons for the intractability of the likelihood function, the high dimension of the missing data and the complexity of the model structure, and discuss suitable methods for addressing the problem from both a frequentist and a Bayesian perspective. We overview likelihood-based methods and explore the use of likelihood-free approaches in this framework, namely approximate Bayesian computation algorithms and a method that combines model calibration with a parametric bootstrap procedure. We emphasize their ability to make inferences from data that are partially observed, or observed only in some aggregated form. To showcase their feasibility and reliability, we compare the likelihood-free and likelihood-based methods on a toy example based on the Susceptible-Infected-Removed (SIR) model. Finally, we explore the relevance of likelihood-free methods in a real-world framework through an example of a complex compartmental model developed to study smoking dynamics in Tuscany (Italy).
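As a rough illustration of the likelihood-free idea summarized in the abstract, the sketch below runs an approximate Bayesian computation rejection sampler on a discrete-time stochastic SIR model whose new infections are observed only as aggregated counts. It is not the authors' implementation. The chain-binomial transition probabilities, the uniform prior on `beta`, the fixed `gamma`, the Euclidean distance, the tolerance, and all function names are assumptions made for this example.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_sir(beta, gamma, n_pop, i0, n_steps, rng):
    """Discrete-time stochastic SIR in chain-binomial form (an assumed formulation).

    Returns the number of new infections at each time step.
    """
    s, i = n_pop - i0, i0
    new_cases = []
    for _ in range(n_steps):
        p_inf = 1.0 - math.exp(-beta * i / n_pop)  # per-susceptible infection probability
        p_rem = 1.0 - math.exp(-gamma)             # per-infected removal probability
        infections = binomial(rng, s, p_inf)
        removals = binomial(rng, i, p_rem)
        s -= infections
        i += infections - removals
        new_cases.append(infections)
    return new_cases


def aggregate(series, width):
    """Sum consecutive blocks of `width` steps, mimicking aggregated reporting."""
    return [sum(series[k:k + width]) for k in range(0, len(series), width)]


def distance(a, b):
    """Euclidean distance between two aggregated series (an illustrative choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def abc_rejection(observed, n_draws, tol, rng,
                  gamma=0.3, n_pop=100, i0=2, n_steps=28, width=7):
    """Keep prior draws of beta whose simulated aggregates fall within `tol` of the data."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, 2.0)  # illustrative uniform prior, not from the paper
        sim = aggregate(simulate_sir(beta, gamma, n_pop, i0, n_steps, rng), width)
        if distance(sim, observed) <= tol:
            accepted.append(beta)
    return accepted


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic "observed" data generated with beta = 0.8 (hypothetical value).
    observed = aggregate(simulate_sir(0.8, 0.3, 100, 2, 28, rng), 7)
    posterior = abc_rejection(observed, n_draws=500, tol=15.0, rng=rng)
    if posterior:
        print(len(posterior), sum(posterior) / len(posterior))
```

The accepted values of `beta` form a sample from an approximation to the posterior. Shrinking `tol` tightens the approximation at the cost of a lower acceptance rate, which is the trade-off that adaptive ABC schemes aim to manage.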
Funding source: Regione Toscana
Funding source: Dipartimenti di Eccellenza
- Research ethics: Not applicable.
- Informed consent: Not applicable.
- Author contributions: C.V. and M.B. conceptualized the project; A.L. wrote the codes and conducted the statistical analysis under the statistical supervision of C.V. and the epidemiological supervision of M.B.; C.V. and A.L. wrote the first version of the paper; M.B. provided critical feedback. All the authors read and approved the final version of the paper.
- Use of Large Language Models, AI and Machine Learning Tools: None declared.
- Conflict of interest: The authors state no conflict of interest.
- Research funding: The authors acknowledge the financial support provided by the Attributable Cancer Burden in Tuscany (ACAB) project funded by Regione Toscana and the “Dipartimenti Eccellenti 2018–2022” ministerial funds.
- Data availability: Not applicable.
Supplementary Material
This article contains supplementary material (https://doi.org/10.1515/em-2024-0032).
© 2025 Walter de Gruyter GmbH, Berlin/Boston