Abstract
Background
In the laboratory setting, evaluating the agreement between two measurement methods is very common practice. Unfortunately, the guidelines available for this purpose are open to criticism from a statistical and methodological point of view. We reviewed the Clinical and Laboratory Standards Institute (CLSI) guideline EP09c, 3rd ed., pointing out some drawbacks and some aspects that are not well defined and that leave room for uncertainty and/or excessive subjectivity in judgement.
Content
We stress the need for replicates in order to estimate the systematic and proportional biases of the measurement methods being compared. Indeed, when the two methods have unequal variances, the slope and intercept of the regression of the difference between the paired values on their mean can be calculated exactly from the means, the variances and the correlation coefficient of the two methods. Without replicates, it is therefore not possible to disentangle true from spurious biases. For laboratory professionals, we have developed a worked example of an agreement assessment.
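The dependence of the difference-versus-mean regression on summary moments can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the sign convention D = Y − X, and the simulated data (two unbiased methods with error standard deviations of 2 and 10) are assumptions chosen for illustration. With M = (X + Y)/2, the slope is 2(s_y² − s_x²)/(s_x² + s_y² + 2·r·s_x·s_y) and the intercept is (mean_y − mean_x) − slope·(mean_x + mean_y)/2.

```python
import numpy as np


def ba_regression_from_moments(x, y):
    """Slope and intercept of (y - x) regressed on (x + y)/2,
    computed only from means, variances and the correlation coefficient."""
    mx, my = np.mean(x), np.mean(y)
    vx, vy = np.var(x, ddof=1), np.var(y, ddof=1)
    r = np.corrcoef(x, y)[0, 1]
    # Cov(D, M) = (vy - vx) / 2 and Var(M) = (vx + vy + 2*r*sx*sy) / 4
    slope = 2.0 * (vy - vx) / (vx + vy + 2.0 * r * np.sqrt(vx * vy))
    intercept = (my - mx) - slope * (mx + my) / 2.0
    return slope, intercept


def ba_regression_direct(x, y):
    """Same regression, fitted by ordinary least squares on the paired data."""
    d = y - x
    m = (x + y) / 2.0
    slope, intercept = np.polyfit(m, d, 1)
    return slope, intercept


# Hypothetical data: both methods are unbiased for the same true value,
# but method y is much noisier than method x.
rng = np.random.default_rng(0)
true_value = rng.uniform(50.0, 150.0, size=2000)
x = true_value + rng.normal(0.0, 2.0, size=2000)
y = true_value + rng.normal(0.0, 10.0, size=2000)

print("from moments:", ba_regression_from_moments(x, y))
print("direct fit:  ", ba_regression_direct(x, y))
```

In this simulation neither method has a systematic or proportional bias, yet the fitted slope is non-zero simply because the error variances differ. This is the kind of spurious bias that single measurements per sample cannot separate from a true one.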
Summary
We stress the need for approaches other than the classic Bland and Altman method to calculate the systematic and proportional biases of two measurement methods compared for agreement in a study with replicates.
- Research ethics: Not applicable.
- Informed consent: Not applicable.
- Author contributions: Bruno Mario Cesana wrote the first draft; Paolo Antonelli was the main reviewer; Simona Ferraro contributed to the first draft.
- Competing interests: The authors state no conflict of interest.
- Research funding: None declared.
- Data availability: Not applicable.
© 2024 Walter de Gruyter GmbH, Berlin/Boston
Articles in this issue
- Frontmatter
- Editorials
- Multi-cancer early detection: searching for evidence
- High sensitivity cardiac troponin assays, rapid myocardial infarction rule-out algorithms, and assay performance
- Reviews
- Consensus statement on extracellular vesicles in liquid biopsy for advancing laboratory medicine
- Copeptin as a diagnostic and prognostic biomarker in pediatric diseases
- Opinion Papers
- The Unholy Grail of cancer screening: or is it just about the Benjamins?
- Critical appraisal of the CLSI guideline EP09c “measurement procedure comparison and bias estimation using patient samples”
- Tumor markers determination in malignant pleural effusion: pearls and pitfalls
- Contribution of laboratory medicine and emerging technologies to cardiovascular risk reduction via exposome analysis: an opinion of the IFCC Division on Emerging Technologies
- Guidelines and Recommendations
- Recommendations for European laboratories based on the KDIGO 2024 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease
- Genetics and Molecular Diagnostics
- Expanded carrier screening for 224 monogenic disease genes in 1,499 Chinese couples: a single-center study
- General Clinical Chemistry and Laboratory Medicine
- How do experts determine where to intervene on test ordering? An interview study
- New concept for control material in glucose point-of-care-testing for external quality assessment schemes
- Vitamin B12 deficiency in newborns: impact on individual’s health status and healthcare costs
- Analytical evaluation of eight qualitative FIT for haemoglobin products, for professional use in the UK
- Colorimetric correcting for sample concentration in stool samples
- Reference Values and Biological Variations
- Assessment of canonical diurnal variations in plasma glucose using quantile regression modelling and Chronomaps
- Inconsistency in ferritin reference intervals across laboratories: a major concern for clinical decision making
- Establishing the TSH reference intervals for healthy adults aged over 70 years: the Australian ASPREE cohort study
- Hematology and Coagulation
- The EuroFlow PIDOT external quality assurance scheme: enhancing laboratory performance evaluation in immunophenotyping of rare lymphoid immunodeficiencies
- Clinical value of smear review of flagged samples analyzed with the Sysmex XN hematology analyzer
- Cardiovascular Diseases
- Evidence for stability of cardiac troponin T concentrations measured with a high sensitivity TnT test in serum and lithium heparin plasma after six-year storage at −80 °C and multiple freeze-thaw cycles
- Letters to the Editor
- Impact of high-sensitivity cardiac troponin I assay imprecision on the safety of a single-sample rule-out approach for myocardial infarction
- Why is single sample rule out of non-ST elevation myocardial infarction using high-sensitivity cardiac troponin T safe when analytical imprecision is so high? A joint statistical and clinical demonstration
- Iron deficiency and iron deficiency anemia in transgender populations: what’s different?
- The information about the metrological traceability pedigree of the in vitro diagnostic calibrators should be improved: the case of plasma ethanol
- Time to refresh and integrate the JCTLM database entries for total bilirubin: the way forward
- Navigation between EQA and sustainability
- C-terminal alpha-1-antitrypsin peptides as novel predictor of hospital mortality in critically ill COVID-19 patients
- Neutralizing antibodies against KP.2 and KP.3: why the current vaccine needs an update
- A simple gatekeeping intervention improves the appropriateness of blood urea nitrogen testing
- Congress Abstracts
- 16ª Reunião Científica da Sociedade Portuguesa de Medicina Laboratorial - SPML