Abstract
Objectives
In clinical and epidemiological studies, modified Poisson and least-squares regression analyses for binary outcomes have been used as standard multivariate analysis methods to estimate risk ratios and risk differences. However, their ordinary Wald-type confidence intervals can suffer from finite-sample bias in the robust variance estimators, so that the coverage probabilities for the true effect measures can fall substantially below the nominal level (usually 95%). New, more accurate inference methods are needed to address this issue.
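As a minimal sketch of the Wald-type interval discussed above, consider the special case of a single binary covariate, where the modified Poisson estimate of the risk ratio is the ratio of observed risks and the robust variance of its logarithm has a closed form. The function names, the two-group setting, and the handling of zero-event samples are assumptions made for illustration; this simplified setting need not reproduce the undercoverage reported for multivariable models.

```python
import math
import random

Z_975 = 1.959964  # 97.5% quantile of the standard normal distribution


def wald_ci_rr(y1, n1, y0, n0):
    """Wald-type 95% CI for the risk ratio in a two-group comparison.

    With one binary covariate, the robust (sandwich) variance of log(RR)
    reduces to (1 - p1) / (n1 * p1) + (1 - p0) / (n0 * p0).
    Requires y1 > 0 and y0 > 0.
    """
    p1, p0 = y1 / n1, y0 / n0
    log_rr = math.log(p1 / p0)
    se = math.sqrt((1 - p1) / (n1 * p1) + (1 - p0) / (n0 * p0))
    return math.exp(log_rr - Z_975 * se), math.exp(log_rr + Z_975 * se)


def rbinom(rng, n, p):
    """Draw one binomial(n, p) count as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))


def wald_coverage(p0, rr, n_per_group, n_sim=2000, seed=1):
    """Empirical coverage of the Wald interval for a true risk ratio rr."""
    rng = random.Random(seed)
    covered = valid = 0
    for _ in range(n_sim):
        y1 = rbinom(rng, n_per_group, p0 * rr)
        y0 = rbinom(rng, n_per_group, p0)
        if y1 == 0 or y0 == 0:
            continue  # log(RR) is undefined; dropped in this sketch
        lo, hi = wald_ci_rr(y1, n_per_group, y0, n_per_group)
        valid += 1
        covered += lo <= rr <= hi
    return covered / valid if valid else float("nan")


if __name__ == "__main__":
    print(wald_ci_rr(10, 20, 5, 20))
    print(wald_coverage(p0=0.3, rr=1.5, n_per_group=20))
```

Calling `wald_coverage` with different sample sizes gives a rough check of how far the empirical coverage is from 95% in a given scenario.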
Methods
We propose two accurate inference methods for these regression models based on the estimating equation theory. A notable feature of these regression models is that the correct models to be estimated are known: they are the conventional binomial regression models with log and identity links. Using this modeling information, we first derive quasi-score statistics whose robust variances are estimated using the correct model information, and then propose a confidence interval based on the regression coefficient test using the χ² approximation. To further improve on the large-sample approximation, we propose adapting a parametric bootstrap method to estimate the sampling distribution of the quasi-score statistics using the correct model information. In addition, we developed an R package, rqlm (https://doi.org/10.32614/CRAN.package.rqlm), that implements the new methods via simple commands.
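The following is a rough sketch of the two ideas in the simplest possible setting, a risk ratio comparing two groups. It is not the rqlm implementation: the function names are hypothetical, and the statistic reflects one reading of "robust variances estimated using the correct model information", namely that the variance of the score is computed from the binomial variance at the fit constrained by the null hypothesis. The exact form used in the paper may differ.

```python
import math
import random

CHI2_1_95 = 3.841459  # 95% quantile of the chi-square distribution, 1 df


def constrained_fit(y1, n1, y0, n0, b):
    """Risks (m0, m1) solving sum(y - mu) = 0 under H0: log(RR) = b."""
    m0 = (y1 + y0) / (n0 + n1 * math.exp(b))
    return m0, m0 * math.exp(b)


def quasi_score_stat(y1, n1, y0, n0, b):
    """Quasi-score statistic for H0: log(RR) = b (two groups, log link)."""
    m0, m1 = constrained_fit(y1, n1, y0, n0, b)
    if not (0 < m0 < 1 and 0 < m1 < 1):
        return None  # constrained fit lies outside the binomial model
    u1 = y1 - n1 * m1                      # score for the log(RR) parameter
    c = n1 * m1 / (n0 * m0 + n1 * m1)      # adjustment for the fitted intercept
    v0 = n0 * m0 * (1 - m0)                # binomial variance, reference group
    v1 = n1 * m1 * (1 - m1)                # binomial variance, exposed group
    var = (1 - c) ** 2 * v1 + c ** 2 * v0  # model-based variance of the score
    return u1 * u1 / var


def chi2_pvalue(t):
    """Upper tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(t / 2.0))


def bootstrap_pvalue(y1, n1, y0, n0, b, n_boot=1000, seed=1):
    """Parametric bootstrap p-value: resample from the constrained binomial fit."""
    t_obs = quasi_score_stat(y1, n1, y0, n0, b)
    if t_obs is None:
        raise ValueError("null value is incompatible with the binomial model")
    m0, m1 = constrained_fit(y1, n1, y0, n0, b)
    rng = random.Random(seed)
    exceed = valid = 0
    for _ in range(n_boot):
        s1 = sum(rng.random() < m1 for _ in range(n1))
        s0 = sum(rng.random() < m0 for _ in range(n0))
        t = quasi_score_stat(s1, n1, s0, n0, b)
        if t is None:
            continue  # degenerate resample; skipped in this sketch
        valid += 1
        exceed += t >= t_obs
    return exceed / valid if valid else float("nan")


def chi2_score_ci(y1, n1, y0, n0, span=10.0, iters=60):
    """Invert the chi-square test by bisection (assumes y1, y0 > 0 and
    that the accepted values of log(RR) form a single interval)."""
    b_hat = math.log((y1 / n1) / (y0 / n0))

    def excluded(b):
        t = quasi_score_stat(y1, n1, y0, n0, b)
        return t is None or t > CHI2_1_95

    def limit(direction):
        inside, outside = b_hat, b_hat + direction * span
        for _ in range(iters):
            mid = (inside + outside) / 2.0
            if excluded(mid):
                outside = mid
            else:
                inside = mid
        return inside

    return math.exp(limit(-1.0)), math.exp(limit(1.0))


if __name__ == "__main__":
    print(chi2_score_ci(10, 20, 5, 20))
    print(chi2_pvalue(quasi_score_stat(10, 20, 5, 20, 0.0)))
    print(bootstrap_pvalue(10, 20, 5, 20, 0.0))
```

A bootstrap-based interval could be obtained in the same way, by collecting the null values whose bootstrap p-value exceeds 0.05. In a regression with several covariates, the constrained fit would require iterative estimation rather than the closed form used here.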
Results
In extensive simulation studies, the two new methods clearly outperformed the ordinary Wald-type confidence interval in terms of coverage probability when the regression function assumptions were correctly specified, especially in small and moderate sample settings. We also illustrated the proposed methods by applying them to an epidemiological study of epilepsy. The proposed methods provided wider confidence intervals, reflecting the statistical uncertainty.
Conclusions
The current standard Wald-type confidence intervals may provide misleading evidence. Erroneous evidence can potentially influence clinical practice, public health, and policymaking. Such potentially inaccurate results should be avoided by using effective statistical methods. The new inference methods would provide more accurate evidence for future medical studies.
Funding source: Japan Society for the Promotion of Science
Award Identifier / Grant number: JP23K11931, JP22H03554, JP24K21306, and JP23H03063
Acknowledgments
The authors would like to thank Dr. Y. Arai (Tottori University) for permission to use the valuable data and K. Nakazono (The Institute of Statistical Mathematics) for his helpful comments on an earlier draft.
- Research ethics: Not applicable.
- Informed consent: Not applicable.
- Author contributions: The authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Use of Large Language Models, AI and Machine Learning Tools: None declared.
- Conflict of interest: The authors state no conflict of interest.
- Research funding: This study was supported by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (grant numbers: JP23K11931, JP22H03554, JP24K21306, and JP23H03063).
- Data availability: Supplementary Materials are available at Epidemiologic Methods online, and an R package for implementing the proposed methods is available at CRAN (https://cran.r-project.org/web/packages/rqlm).
Supplementary Material
This article contains supplementary material (https://doi.org/10.1515/em-2024-0030).
© 2025 Walter de Gruyter GmbH, Berlin/Boston