Extending the scope of the capture-recapture experiment: a multilevel approach with random effects to provide reliable estimates at national level

Eric Janssen; Michael Vuolo

doi:10.1515/em-2025-0011

Published/Copyright: October 6, 2025

From the journal Epidemiologic Methods Volume 14 Issue 1

Abstract

Objectives

This study investigates the use of multilevel models in capture-recapture experiments, a common procedure for estimating the size of an elusive population, with random effects taken into account so that the probabilities of being observed are calculated accurately.

Methods

We review single data-source capture-recapture estimators. We provide a general framework that accounts for random effects when calculating the probabilities of being observed, and apply the method to estimate the number of powder cocaine users, crack cocaine users, heroin users and people who inject drugs in France in 2019. Using two-level, random-intercepts logistic regression models that account for both individual and center factors, we conduct the calculations twice: once with fixed effects only, and once with predicted center-level random effects added.
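
The contrast between the two sets of calculations can be illustrated with a minimal Python sketch. The abstract does not state which population-size estimator is used, so the sketch assumes a Horvitz-Thompson-type estimator in which each observed individual is weighted by the inverse of a predicted probability of being observed. The function names, coefficients, center labels and random intercepts are hypothetical values chosen for illustration, not quantities from the study.

```python
import math


def inv_logit(eta):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def ht_estimate(records, beta0, beta1, center_effects=None):
    """Horvitz-Thompson-type population size (an assumed estimator).

    records: list of (center_label, covariate) pairs for observed individuals.
    beta0, beta1: fixed intercept and slope (hypothetical values).
    center_effects: optional dict of predicted center-level random intercepts;
        when None, predictions use the fixed effects only.
    """
    total = 0.0
    for center, x in records:
        eta = beta0 + beta1 * x
        if center_effects is not None:
            eta += center_effects[center]  # add the center's random intercept
        total += 1.0 / inv_logit(eta)  # each person stands for 1/p people
    return total


# Hypothetical observed individuals from two treatment centers.
records = [("A", 0.0), ("A", 1.0), ("B", 0.0), ("B", 1.0)]
beta0, beta1 = -0.5, 0.8          # hypothetical fixed effects
u = {"A": 0.6, "B": -0.6}         # hypothetical predicted random intercepts

n_fixed = ht_estimate(records, beta0, beta1)
n_combined = ht_estimate(records, beta0, beta1, center_effects=u)
print(f"fixed effects only:        {n_fixed:.2f}")
print(f"fixed and random effects:  {n_combined:.2f}")
```

Under these assumptions the two calls differ only in whether the center-level intercept enters the linear predictor, which is why centers with low predicted observation probabilities can shift the total noticeably.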

Results

There are substantial differences between estimates based on fixed-effects-only predictions and those based on fixed and random effects combined, reflecting the influence of the treatment-center level. Applying multilevel modelling in capture-recapture allows the researcher to include both individual and contextual information and to provide estimates at a large geographical scale.

Conclusions

Multilevel capture-recapture modelling offers a credible alternative for estimating the size of elusive populations in large geographical settings. Future capture-recapture studies using multilevel modelling should explicitly consider the influence of higher-level clusters.

Keywords: capture-recapture; heterogeneity; illicit substances; multilevel model; nationwide estimates; random effects

Corresponding author: Eric Janssen, French Monitoring Centre on Drugs and Drug Addiction (Observatoire Français des Drogues et des Tendances Addictives – OFDT), 69 rue de Varenne, 75007 Paris, France, E-mail: eric.janssen@ofdt.fr

Acknowledgments

The authors wish to thank Christophe Palle and Léo Bouthier (OFDT) for updating the data and Thomas Seyler (EUDA) for providing the opportunity to present a first draft of our research during the PDU Expert Meeting held in Lisbon, May 2019.

Research ethics: The survey was approved by an internal steering committee which acts as the equivalent of an Institutional Review Board, and by the National Data Protection Authority (CNIL, authorization #04-1059).
Informed consent: Informed consent was obtained from all individuals included in this study, or their legal guardians or wards.
Author contributions: EJ: conceptualization, data curation and first draft of the manuscript. MV: conceptualization, methodological assessment and major contribution in writing the final version of the manuscript. All authors read and approved the final manuscript.
Use of Large Language Models, AI and Machine Learning Tools: None to declare.
Conflict of interest: None to declare.
Research funding: This study was supported by the French Monitoring Centre for Drugs and Drug Addictions (OFDT), which provided financial support for conducting the survey and writing this paper. The authors received no financial support for the research, authorship, and/or publication of this article.
Data availability: The datasets generated and/or analysed during the current study are not publicly available. The data contain sensitive information that allows the identification of individuals. They are therefore protected, and access can only be granted with special permission.

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/em-2025-0011).

Received: 2025-03-10

Accepted: 2025-09-24

Published Online: 2025-10-06
