A simple transformation independent method for outlier definition

  • Martin Berg Johansen and Peter Astrup Christensen
Published/Copyright: April 10, 2018

Abstract

Background:

Defining and eliminating outliers is a key step for medical laboratories establishing or verifying reference intervals (RIs), especially because including just a few outlying observations may seriously affect the determination of the reference limits. Many methods have been developed for defining outliers. Several of these assume a normal distribution, so data often require transformation before outlier elimination.

Methods:

We have developed a non-parametric, transformation-independent outlier definition. The new method relies on drawing reproducible histograms, which is done by using defined bin sizes above and below the median. The method is compared with the method recommended by CLSI/IFCC, which uses Box-Cox transformation (BCT) and Tukey’s fences for outlier definition. The comparison is made on eight simulated distributions and an indirect clinical dataset.
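The comparison approach named above (Box-Cox transformation followed by Tukey’s fences) can be sketched roughly as follows. This is an illustrative sketch only, not the CLSI/IFCC procedure or the authors' code: the function names, the grid search for lambda (from -2 to 2 in steps of 0.01), the fence multiplier k = 1.5, and the quartile convention are all assumptions. The proposed histogram-based method is not sketched here, because the abstract does not state its bin-size rules.

```python
import math
import statistics


def boxcox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        # The limit of (v**lam - 1) / lam as lam -> 0 is the natural log.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def boxcox_lambda(values, grid=None):
    """Choose lambda by maximising the Box-Cox profile log-likelihood.

    A plain grid search is used here (assumption); statistical packages
    typically use a numerical optimiser instead.
    """
    if grid is None:
        grid = [i / 100.0 for i in range(-200, 201)]
    n = len(values)
    sum_log = sum(math.log(v) for v in values)

    def loglik(lam):
        # Profile log-likelihood up to an additive constant.
        var = statistics.pvariance(boxcox(values, lam))
        return (lam - 1.0) * sum_log - 0.5 * n * math.log(var)

    return max(grid, key=loglik)


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, lam=None, k=1.5):
    """Flag values whose Box-Cox transform falls outside Tukey's fences.

    If lam is None it is estimated from the data, outliers included.
    """
    if lam is None:
        lam = boxcox_lambda(values)
    transformed = boxcox(values, lam)
    low, high = tukey_fences(transformed, k)
    return [v for v, t in zip(values, transformed) if t < low or t > high]


if __name__ == "__main__":
    sample = [8, 9, 10, 10, 11, 11, 12, 12, 13, 14, 15, 200]
    # With lambda fixed at 0 (log transform), only the value 200 is flagged.
    print(flag_outliers(sample, lam=0.0))
    # With lambda estimated from the contaminated sample, the result may
    # differ, because the outlier itself influences the chosen lambda.
    print(boxcox_lambda(sample), flag_outliers(sample))
```

Passing a fixed `lam` separates the fence step from the transformation step, which makes it easier to see how much of the result depends on the lambda estimate.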

Results:

The comparison on simulated distributions shows that, when no outliers are added, the recommended method generally defines fewer outliers. However, when outliers are added on one side, the proposed method often produces better results. With outliers on both sides, the two methods perform equally well. Furthermore, the presence of outliers affects the BCT and subsequently the limits determined by the currently recommended methods. This is seen especially in skewed distributions. The proposed outlier definition reproduced current RI limits on clinical data containing outliers.

Conclusions:

We find our simple transformation-independent outlier detection method to be as good as or better than the currently recommended methods.

  1. Author contributions: PAC conceived the idea and study. PAC performed the calculations and simulations. MBJ and PAC refined the idea and simulations. PAC reviewed the literature and wrote the first draft. Both authors contributed to subsequent drafts and approval of the final version. Both authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Employment or leadership: None declared.

  4. Honorarium: None declared.

  5. Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.

Supplementary Material:

The online version of this article offers supplementary material (https://doi.org/10.1515/cclm-2018-0025).


Received: 2018-01-09
Accepted: 2018-02-22
Published Online: 2018-04-10
Published in Print: 2018-08-28

©2018 Walter de Gruyter GmbH, Berlin/Boston
