
Hypothesis testing for detecting outlier evaluators

Li Xu, David M. Zucker and Molin Wang
Published/Copyright: November 4, 2024

Abstract

In epidemiological studies, the measurements of disease outcomes are carried out by different evaluators. In this paper, we propose a two-stage procedure for detecting outlier evaluators. In the first stage, a regression model is fitted to obtain the evaluators’ effects; outlier evaluators are those whose effects differ from those of normal evaluators. In the second stage, stepwise hypothesis tests are performed to detect the outlier evaluators. The true positive rate and true negative rate of the proposed procedure are assessed in a simulation study. We apply the proposed method to detect potential outlier audiologists among those who measured hearing threshold levels of the participants in the Audiology Assessment Arm of the Conservation of Hearing Study, an epidemiological study examining risk factors for hearing loss.


Corresponding author: Molin Wang, Department of Epidemiology and Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; and Channing Division of Network Medicine, Harvard Medical School, Brigham and Women’s Hospital, Boston, MA, USA.

Award Identifier / Grant number: R01DC017717

  1. Research Ethics: Not applicable.

  2. Informed consent: Not applicable.

  3. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  4. Use of Large Language Models, AI and Machine Learning Tools: None declared.

  5. Conflict of interest: All other authors state no conflict of interest.

  6. Research funding: This work is supported by NIH grant R01DC017717.

  7. Data availability: The code is available at https://github.com/tgh1122334/ESDGen.

Appendix: Technical details on deriving the critical values

The critical values are simulated from

$$1 - \alpha = \Pr\left( \bigcap_{l=t}^{k} \{ R_l \le \lambda_l \} \,\middle|\, H_{t-1} \right), \quad t = 1, \ldots, k,$$

where $H_{t-1}$ is the hypothesis that, after removing $t - 1$ outliers, the remaining $M - t + 1$ evaluators are not outliers. The pseudo-code to obtain $\lambda_1, \ldots, \lambda_k$ through simulation is in Algorithm 2 below.

Algorithm 2. Pseudo-code for finding the critical values using simulation.

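We now derive approximate critical values $\lambda_t$, $t = 1, \ldots, k$. From Equation (2), we have

$$
\begin{aligned}
1 - \alpha &= \Pr\left( \bigcap_{l=t}^{k} \{ R_l \le \lambda_l \} \,\middle|\, H_{t-1} \right) \approx \Pr\left( R_t \le \lambda_t \mid H_{t-1} \right) \\
&= \Pr\left( \max_{m \in I_t} \frac{\left( L_{m,t}^{T} \hat{\beta}_{I_t} \right)^{2}}{L_{m,t}^{T} \Omega_{\beta_{I_t}} L_{m,t}} \le \lambda_t \,\middle|\, H_{t-1} \right) \\
&= \Pr\left( \bigcap_{m \in I_t} \left\{ \frac{\left( L_{m,t}^{T} \hat{\beta}_{I_t} \right)^{2}}{L_{m,t}^{T} \Omega_{\beta_{I_t}} L_{m,t}} \le \lambda_t \right\} \,\middle|\, H_{t-1} \right), \quad \text{for } t = 1, \ldots, k.
\end{aligned}
$$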

Since $\hat{\beta}_{I_t}$ follows the multivariate normal distribution when $N$ is large,

$$\hat{\beta}_{I_t} \sim N\left( \beta_{I_t}, \Omega_{\beta_{I_t}} \right).$$

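Define $Z = (Z_1, \ldots, Z_{M-t+1})^{T}$ as $Z = A_t \hat{\beta}_{I_t}$, where $A_t$ is the matrix whose rows are $L_{m,t}^{T} \big/ \sqrt{L_{m,t}^{T} \Omega_{\beta_{I_t}} L_{m,t}}$ for $m \in I_t$. Then

$$Z \mid H_{t-1} \sim N\left( 0, A_t \Omega_{\beta_{I_t}} A_t^{T} \right), \qquad R_t = \max_{m \in I_t} \frac{\left( L_{m,t}^{T} \hat{\beta}_{I_t} \right)^{2}}{L_{m,t}^{T} \Omega_{\beta_{I_t}} L_{m,t}} = \max_{b = 1, \ldots, M-t+1} Z_b^{2}.$$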
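The standardization above can be illustrated with a minimal sketch in plain Python. The covariance matrix `omega`, the contrast vectors in `contrasts`, and the function names are hypothetical choices made for illustration only; they are not taken from the paper or from the authors' code. The sketch scales each contrast by the square root of its variance and forms $A_t \Omega A_t^{T}$, whose diagonal entries should all equal 1.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def standardized_rows(contrasts, omega):
    """Scale each contrast L_m to L_m^T / sqrt(L_m^T Omega L_m)."""
    rows = []
    for l_m in contrasts:
        variance = sum(a * b for a, b in zip(l_m, mat_vec(omega, l_m)))
        rows.append([a / math.sqrt(variance) for a in l_m])
    return rows


def covariance_of_z(a_rows, omega):
    """Return A Omega A^T, the covariance of Z = A beta_hat."""
    omega_a = [mat_vec(omega, row) for row in a_rows]  # Omega a_j for each row a_j
    return [[sum(x * y for x, y in zip(a_i, oa_j)) for oa_j in omega_a]
            for a_i in a_rows]


# Hypothetical covariance of three estimated evaluator effects (assumption).
omega = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
# Hypothetical contrasts comparing neighbouring evaluators (assumption).
contrasts = [[1.0, -1.0, 0.0],
             [0.0, 1.0, -1.0]]

a_t = standardized_rows(contrasts, omega)
cov_z = covariance_of_z(a_t, omega)
print(cov_z)
```

Because every row of `a_t` is scaled by its own standard deviation, `cov_z` is a correlation matrix, which is why each $Z_b$ is marginally standard normal under $H_{t-1}$.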

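To determine $\lambda_t$,

$$1 - \alpha \approx \Pr\left( R_t \le \lambda_t \mid H_{t-1} \right) = \Pr\left( \max_{b = 1, \ldots, M-t+1} Z_b^{2} \le \lambda_t \right) = \Pr\left( \bigcap_{b=1}^{M-t+1} \left\{ |Z_b| \le \sqrt{\lambda_t} \right\} \right),$$

so $\sqrt{\lambda_t}$ is the $1 - \alpha$ two-sided equicoordinate quantile of the distribution $N\left( 0, A_t \Omega_{\beta_{I_t}} A_t^{T} \right)$, which we obtain using the function qmvnorm in the R package mvtnorm; $\lambda_t$ is its square.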
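As a rough cross-check of this step, the same quantity can be approximated by Monte Carlo. The sketch below is a plain-Python stand-in, not the qmvnorm routine and not the authors' Algorithm 2. It draws $Z$ from a zero-mean multivariate normal with a given covariance and returns the empirical $1 - \alpha$ quantile of $\max_b Z_b^{2}$, which is already on the scale of $R_t$. The function names, the example correlation matrix, the number of draws, and the seed are all assumptions, and the hand-written Cholesky factorization requires a positive definite covariance.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def simulate_lambda(cov, alpha=0.05, n_draws=20000, seed=1):
    """Empirical (1 - alpha) quantile of max_b Z_b^2 for Z ~ N(0, cov)."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    n = len(cov)
    max_sq = []
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Z = lower @ e has covariance lower @ lower^T = cov.
        z = [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        max_sq.append(max(v * v for v in z))
    max_sq.sort()
    index = min(n_draws - 1, math.ceil((1 - alpha) * n_draws) - 1)
    return max_sq[index]


# Hypothetical correlation matrix standing in for A_t Omega A_t^T (assumption).
example_cov = [[1.0, 0.3, 0.3],
               [0.3, 1.0, 0.3],
               [0.3, 0.3, 1.0]]
lambda_t = simulate_lambda(example_cov, alpha=0.05)
print(lambda_t)
```

With a single coordinate and unit variance, the result should be close to 3.84 (that is, 1.96 squared) up to Monte Carlo error, which gives a quick sanity check. A numerical routine such as qmvnorm avoids that simulation error, so this sketch is best used only for checking.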


Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/ijb-2023-0004).


Received: 2023-01-06
Accepted: 2024-09-13
Published Online: 2024-11-04

© 2024 Walter de Gruyter GmbH, Berlin/Boston
