Comparison of three indirect methods for verification and validation of reference intervals at eight medical laboratories: a European multicenter study

Anne Meyer, Robert Müller, Markus Hoffmann, Øyvind Skadberg, Aurélie Ladang, Benjamin Dieplinger, Wolfgang Huf, Sanja Stankovic, Georgia Kapoula and Matthias Orth
Published/Copyright: June 1, 2023

Abstract

Objectives

Indirect methods for the estimation of reference intervals are increasingly being used, especially for the validation of reference intervals, as they can be applied to routine patient data. In this study, we compare three statistically different indirect methods for the verification and validation of reference intervals in eight laboratories distributed throughout Europe.

Methods

The RefLim method is a fast and simple approach that calculates reference intervals by extrapolating the theoretical 95% of non-pathological values from the central linear part of a quantile-quantile plot. The Truncated Maximum Likelihood (TML) method estimates a smoothed kernel density function for the distribution of the mixed data, assuming that the “central” part of the distribution represents the healthy population. The refineR method uses an inverse modelling approach: the algorithm identifies a model that best explains the observed data before transforming the data with the Box-Cox transformation.

Results

We show that the different indirect methods each have their advantages but can also lead to inaccurate or ambiguous results depending on the approximation of the mathematical model to real-world data. A combination of different methodologies can improve the informative value and thus the reliability of results.

Conclusions

Based on routine measurements of four enzymes, alkaline phosphatase (ALP), total amylase (AMY), cholinesterase (CHE) and gamma-glutamyl transferase (GGT), in adult women and men, we demonstrate that some reference limits taken from the literature need to be adapted to the laboratory’s particular local and population characteristics.

Introduction

Verification and validation of reference intervals are among the most important regulatory requirements for medical laboratories [1]. Although both terms refer to similar procedures with respect to assessing the validity of reference limits, there are slight differences regarding the depth of the analysis [2]. Whereas “verification” means the confirmation or rejection of a given reference interval from a package insert, handbook, or original literature, the term “validation” encompasses the collection of reliable and valid data for a careful comparison of this interval with empirical reference limits obtained under appropriate real-world conditions by direct or indirect statistical methods [3]. A given reference limit can be accepted if the empirical estimate lies within defined tolerance limits [4].

While the use of direct methods, which require recruitment of presumably healthy reference individuals [1], has been established for more than half a century, the indirect methods, which are much simpler in organizational terms, were recommended by the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) only in 2019 [5]. For this reason, only a few objective comparisons of indirect methods are available to date, and these have typically been based on data from a single laboratory [3].

The aim of the present work is therefore to compare three different indirect methods in eight laboratories distributed throughout Europe from Norway to Greece (Table 1). We also attempt to assess the influence of methodological vs. regional differences on the validation of reference intervals from the literature (Table 2).

Table 1:

List of participating laboratories sorted from north to south and east to west.

Laboratory | Location | Specimen type | Platform
A | Stavanger, Norway | Serum | ARCHITECT c
B | Liège, Belgium | Serum | Alinity c
C | Stuttgart, Germany | Serum | Alinity c
D | Linz, Austria | Lithium heparin plasma | Alinity c
E | Vienna, Austria | Serum | Alinity c
F | Goldach, Switzerland | Serum + lithium heparin plasma | Alinity c
G | Belgrade, Serbia | Serum | Alinity c
H | Lamia, Greece | Serum | Alinity c
Pooled | Pooled | Serum + lithium heparin plasma | ARCHITECT c + Alinity c

Table 2:

Assays and respective published reference intervals.

Assay | Manufacturer | Reference interval
ALP | Abbott GmbH, Wiesbaden, Germany | 33–98 U/L (F); 43–115 U/L (M) [6]
AMY | Abbott GmbH, Wiesbaden, Germany | 31–107 U/L [7]
CHE | SENTINEL CH. S.p.A., Milano, Italy | 3,930–10,800 U/L (F); 4,620–11,500 U/L (M) [8]
GGT | Abbott GmbH, Wiesbaden, Germany | 6.4–39.5 U/L (F); 11.7–67.5 U/L (M) [9]

Materials and methods

The analyzed data were derived from routine laboratory reports using serum and lithium heparin plasma specimens (Table 1), both of which were verified by Abbott for use with the investigated assays. The results of ALP, AMY, CHE, and GGT were generated with commercial assay kits from Abbott (Table 2) using Alinity c systems, except in laboratory A (Stavanger, Norway, see Table 1), where analyses were performed on an ARCHITECT c analyzer, which shares the same reagent formulation. The four enzymes were selected from a larger study panel of proteins, substrates, and electrolytes to illustrate typical challenges in using indirect methods to validate reference intervals (see Discussion).

The data were collected over a period of five years (October 2017 to October 2022). Laboratory A (Stavanger, Norway) provided data for the whole study period, whereas the other laboratories provided data that were collected in shorter study periods. In addition to the individual evaluations, all available data for the four analytes were pooled for combined statistical analyses.

Additional data cleaning steps, described in Supplementary Figure 1 [10, 11], were performed to reduce the number of pathological results in the data set and to ensure statistical independence of the data. The study population comprised adults aged 18 to 90 years, and the data were examined to check whether assay performance was stable over the study period [6]. Reference intervals were estimated separately for women and men using three different statistical approaches (RefLim, Truncated Maximum Likelihood (TML), and refineR), provided that at least 150 individual results were available.

The RefLim method is a so-called “modified Hoffmann” approach [12] that is characterized primarily by simplicity and short computing time. Modifications of the original probability paper method [13] include the selection of an underlying normal or lognormal distribution based on Bowley’s quartile skewness [14] and log transformation of the original values if needed, followed by truncation of the central 95% of presumably non-pathological values with the iBoxplot95 algorithm [15], and calculation of reference limits from a normal quantile-quantile plot [8]. A tutorial with a complete coding example in R is available at https://reflim.github.io/Epub/ (accessed March 10th, 2024).
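The quantile-quantile idea behind this family of methods can be illustrated with a minimal Python sketch. It is not the RefLim implementation (the published tutorial is in R): the iBoxplot95 truncation is replaced here by a simple, assumed cut to the central 50% of the ranked values, and the function names, plotting positions, and simulated data are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, quantiles


def bowley_skewness(values):
    """Quartile skewness: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    q1, q2, q3 = quantiles(values, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)


def qq_reference_limits(values, log_scale, central=(0.25, 0.75)):
    """Fit a line to the central part of a normal Q-Q plot and
    extrapolate it to the 2.5th and 97.5th percentiles.

    `central` is an assumed stand-in for the iBoxplot95 truncation.
    """
    xs = sorted(math.log(v) if log_scale else v for v in values)
    n = len(xs)
    std_normal = NormalDist()
    # Pair each ranked value with its theoretical normal quantile,
    # keeping only the central part of the ranks.
    points = [
        (std_normal.inv_cdf((i + 0.5) / n), x)
        for i, x in enumerate(xs)
        if central[0] <= (i + 0.5) / n <= central[1]
    ]
    z_mean = sum(z for z, _ in points) / len(points)
    x_mean = sum(x for _, x in points) / len(points)
    # Ordinary least squares: the slope estimates the SD and the
    # intercept the mean of the presumably non-pathological fraction.
    slope = sum((z - z_mean) * (x - x_mean) for z, x in points) / sum(
        (z - z_mean) ** 2 for z, _ in points
    )
    intercept = x_mean - slope * z_mean
    lower, upper = intercept - 1.96 * slope, intercept + 1.96 * slope
    return (math.exp(lower), math.exp(upper)) if log_scale else (lower, upper)


# Simulated mixture (not study data): 95% "healthy" lognormal values
# plus 5% elevated values.
random.seed(1)
healthy = [random.lognormvariate(math.log(60), 0.3) for _ in range(4750)]
elevated = [random.lognormvariate(math.log(200), 0.3) for _ in range(250)]
mixed = healthy + elevated

print(round(bowley_skewness(mixed), 3))            # positive for right-skewed data
print(qq_reference_limits(mixed, log_scale=True))  # estimated (LRL, URL)
```

Because only the central ranks enter the regression, the elevated values in the upper tail do not take part in the fit directly, although they still shift the ranks of the healthy values slightly.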

The Truncated Maximum Likelihood (TML) method of Arzideh et al. was chosen for this study because it is a well-established and widely used algorithm that has been applied in several publications to establish reference intervals. The program can be downloaded from the website of the German Society for Clinical Chemistry and Laboratory Medicine (DGKL) with an Excel user interface (www.dgkl.de/fileadmin/Verbandsarbeit/Entscheidungsgrenzen/RLE49.zip). The TML derives the lambda exponent for the Box-Cox transformation from an iterative minimization of the Kolmogorov distance between the theoretical and empirical cumulative distributions. Permissible lambda values are limited to the range from 0 to 1, which covers a continuum of distributions between lognormal (lambda = 0) and normal (lambda = 1) rather than forcing a single decision between these two distribution types [16, 17]. The program for this study was provided by the DGKL in the form of an R function.
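The role of lambda and of the Kolmogorov distance can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the TML program: it omits the truncation step entirely, replaces the iterative minimization with a coarse grid search over lambda, fits the normal distribution by the plain mean and standard deviation, and uses simulated data. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def boxcox(x, lam):
    """Box-Cox transform; lambda = 0 is the log transform."""
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam


def inv_boxcox(y, lam):
    """Back-transform from the Box-Cox scale to the original scale."""
    if lam == 0:
        return math.exp(y)
    base = lam * y + 1
    if base <= 0:
        raise ValueError("value lies outside the range of the Box-Cox transform")
    return base ** (1 / lam)


def kolmogorov_distance(sample, dist):
    """Largest gap between the empirical CDF and the CDF of `dist`."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = dist.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        gap = max(gap, abs(f - i / n), abs((i + 1) / n - f))
    return gap


def fit_lambda(values, grid):
    """Return (lambda, fitted normal) with the smallest Kolmogorov distance."""
    best = None
    for lam in grid:
        transformed = [boxcox(v, lam) for v in values]
        dist = NormalDist(fmean(transformed), stdev(transformed))
        d = kolmogorov_distance(transformed, dist)
        if best is None or d < best[0]:
            best = (d, lam, dist)
    return best[1], best[2]


def reference_limits(dist, lam):
    """Central 95% interval, back-transformed to the original scale."""
    return (
        inv_boxcox(dist.mean - 1.96 * dist.stdev, lam),
        inv_boxcox(dist.mean + 1.96 * dist.stdev, lam),
    )


# Simulated lognormal data (not study data); lambda is searched on 0, 0.1, ..., 1.
random.seed(7)
data = [random.lognormvariate(math.log(60), 0.6) for _ in range(2000)]
lam, fitted = fit_lambda(data, [i / 10 for i in range(11)])
print(lam, reference_limits(fitted, lam))
```

For data that are truly lognormal, the selected lambda should fall near 0, and the back-transformed limits are then asymmetric around the median, as expected for a right-skewed analyte.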

The refineR method of Ammer et al. is the most recent of the three algorithms. It starts by constructing a histogram from which it isolates a sub-fraction that most likely represents the non-pathological values. The bins of the histogram are balanced for width and counts and contribute to the calculation of a maximum-likelihood-based cost function. Similar to the TML method, a Box-Cox transformed normal distribution is generated for the calculation of the reference limits, but refineR extends the range of lambda values to fit left-skewed distributions as well [18]. The program is available as an open-source R package (https://cran.r-project.org/web/packages/refineR/; version 1.6.0).

To facilitate automated processing of the datasets, the data cleaning and pre-processing routines and the three indirect methods were programmed into a single R function.

Reference limits based on recommended analytical methods of the IFCC and/or the German Society for Clinical Chemistry (DGKC) (see Table 2) were used as target values for this study. The respective publications were chosen based on the closeness of the investigated patient collection to a European population. Further, the reference intervals derived from the indirect methods were compared to calculated permissible difference (pD) limits of the literature reference limits [4, 19]. To obtain a standardized view of all empirical reference limits at a glance, we transformed the absolute values into zlog values based on the target reference intervals. The zlog transformation projects the original values onto a scale ranging from approximately −10 to +10 with a standardized reference interval of −1.96 to +1.96. This scale is independent of the analyte, method, measuring unit, etc. [20]. The zlog algorithm is also available as an open-source R package (https://cran.r-project.org/web/packages/zlog/).
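The zlog transformation can be written out in a few lines. The sketch below is a plain Python illustration rather than the zlog R package: it assumes the commonly published form of the transformation, in which the logarithms of the two target limits define the mean and the standard deviation of a normal distribution on the log scale. The estimated limit and the tolerance band used in the example are hypothetical values chosen only to show the mechanics; they are not pD limits from this study.

```python
import math


def zlog(x, lower, upper):
    """zlog value of x relative to the target reference interval [lower, upper]."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    mean_log = (log_lower + log_upper) / 2
    sd_log = (log_upper - log_lower) / 3.92  # interval spans +/- 1.96 SD
    return (math.log(x) - mean_log) / sd_log


# Target interval for ALP in women from Table 2: 33-98 U/L.
target = (33.0, 98.0)
print(round(zlog(33.0, *target), 2))  # -1.96
print(round(zlog(98.0, *target), 2))  # 1.96

# Hypothetical empirical upper limit and hypothetical tolerance band (U/L).
estimated_url = 110.0
tolerance_band = (90.0, 106.0)
band_zlog = tuple(zlog(v, *target) for v in tolerance_band)
url_zlog = zlog(estimated_url, *target)
print(band_zlog[0] <= url_zlog <= band_zlog[1])  # False: outside the band
```

Because the logarithm is monotonic, a limit lies inside a tolerance band on the zlog scale exactly when it does so on the original scale; the transformation only makes limits of different analytes comparable on one axis.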

Results

Figures 1, 2, 3, and 4 provide detailed graphical summaries of all results obtained by the three methods in the eight laboratories and show the pooled reference intervals of the combined data across all regions. The graphs demonstrate the observed variability between the indirect methods and regions.

Figure 1: 
Overview of estimated reference intervals for ALP with the indirect methods. The vertical dashed lines represent the reference limits from Table 2. The grey areas around the dashed lines represent the pD. The literature reference intervals are confirmed by a method if the respective colored bar ends within the grey area.

Figure 2: 
Overview of estimated reference intervals for AMY. Values for laboratory E are missing since this laboratory routinely measures pancreatic instead of total amylase. The vertical dashed lines represent the reference limits from Table 2. The grey areas around the dashed lines represent the pD. The literature reference intervals are confirmed by a method if the respective colored bar ends within the grey area. The transparent bars indicate insufficient sample size for TML and refineR.

Figure 3: 
Overview of estimated reference intervals for CHE in five of eight laboratories. This analyte was not provided by laboratories A and B, and the sample size was too small in laboratories F and H (male). The vertical dashed lines represent the reference limits from Table 2. The grey areas around the dashed lines represent the pD. The literature reference intervals are confirmed by a method if the respective colored bar ends within the grey area. The transparent bars indicate insufficient sample size for TML and refineR.

Figure 4: 
Overview of estimated reference intervals for GGT. The vertical dashed lines represent the reference limits from Table 2. The grey areas around the dashed lines represent the pD. The literature reference intervals are confirmed by a method if the respective colored bar ends within the grey area.

According to a recent method comparison, RefLim needs a minimum of 200 individual values for the definition of the underlying distribution model [21], whereas TML requires a minimum of 1,000, with more than 4,000 preferred [13]. For refineR, a minimum sample size has not been reported, although a recent method comparison recommends a minimum of 1,000 [17]. In Figures 2 and 3, reference limits based on fewer than 200 individual values for RefLim and fewer than 1,000 for TML and refineR are depicted in transparent colors. The transparent bars suggest that sample sizes of at least 150 measurements may be sufficient for rough estimates derived from the three methods.
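The sample-size rules described above can be summarized as a small lookup. The following Python sketch only restates the thresholds quoted in the text (150 for any estimate, 200 for RefLim, 1,000 for TML and refineR); the function name and the status labels are hypothetical, and the preferred TML size of more than 4,000 is noted in a comment but not modeled.

```python
# Thresholds quoted in the text; names and labels are illustrative.
MINIMUM_FOR_ESTIMATION = 150
RECOMMENDED_MINIMUM = {"RefLim": 200, "TML": 1000, "refineR": 1000}
# Note: for TML, more than 4,000 values are preferred (not modeled here).


def sample_size_status(method, n_results):
    """Classify a sex-specific data set by the number of individual results."""
    if n_results < MINIMUM_FOR_ESTIMATION:
        return "not estimated"
    if n_results < RECOMMENDED_MINIMUM[method]:
        return "rough estimate (transparent bar)"
    return "meets recommended minimum"


for method in RECOMMENDED_MINIMUM:
    print(method, sample_size_status(method, 450))
```

With 450 results, for example, RefLim would be shown as a regular bar, whereas TML and refineR would be drawn as transparent bars.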

The upper limits for ALP (Figure 1) differed less between the sexes than those stated in the literature, so that a target value of 98 U/L for women appears too low. Regionally, we found lower upper reference limits (URLs) for laboratory G (Belgrade, Serbia), and methodological differences were small overall. The greatest difference between the methods was observed for males from laboratory G (Belgrade, Serbia) and laboratory H (Lamia, Greece). For these laboratories, the minimum number of 1,000 individual values was exceeded, but the preferred number of 4,000 for TML was not available. For the pooled data, the three methods provided almost identical results. For males, the lower reference limit (LRL) calculated with the three methods was exactly the same as the one stated in the literature, and the calculated URLs were within the pD. For women, the LRLs and URLs exceeded the pDs.

For AMY (Figure 2), the estimated URLs were slightly elevated in both sexes in comparison to the literature; the URLs from some methods, regardless of the region, were within the pD. Slightly more LRLs were within the pD or at least close to it. The methodological differences were small, but more pronounced than for ALP.

A large discrepancy in comparison to the literature values was seen for CHE (Figure 3). The reference intervals from the different regions and the pooled data were markedly shifted upwards for women when compared to the literature values. For men, the same upward shift was observed, except for laboratory C (Stuttgart, Germany), for which an upward shift was seen only with the TML method. With this single exception, the results for CHE showed good methodological agreement.

GGT (Figure 4) was the only enzyme in this study that showed very poor agreement between the three methods. The methodological differences were seen for both sexes but were more pronounced in men. RefLim generally provided wider, and refineR predominantly narrower, reference intervals.

Figure 5 summarizes the detailed results of our multicenter study in a boxplot diagram. To make all results comparable, zlog values are shown instead of absolute values. In this plot, the target reference limits are uniformly −1.96 and +1.96. The pD values are transformed to zlog values accordingly and represent the tolerance ranges of the literature reference limits.

Figure 5: 
Boxplots for zlog values of all analytes. The grey bands represent the tolerance ranges of the standardized upper and lower reference limits according to Table 2.

The main findings derived from Figure 5 are:

  1. The three methods yield comparable results, except for the URL of GGT, where RefLim consistently provided a higher and refineR a lower URL than TML. The smallest methodological differences were observed for ALP.

  2. The calculated LRL and URL of ALP for women and CHE for both sexes and the LRL of GGT for men exceeded the tolerance range of the target value of −1.96 and/or +1.96 throughout all three methods.

  3. The calculated LRL and URL of AMY for both sexes and the LRL of GGT for women were partially within the tolerance range of the target values of −1.96 and/or +1.96 throughout all three methods.

  4. Due to the methodological differences that were seen for the URL of GGT, RefLim for women and TML for men met the target value, whereas the respective other methods were partially within or exceeded the tolerance range of the target value of −1.96 and/or +1.96.

  5. The lengths of the boxplots showed that the smallest regional differences were observed for ALP, while AMY showed moderate regional differences. For CHE, the regional differences were small for the URL but large for the LRL. The length of the boxplots of GGT differed between the three indirect methods, especially for the URL.

Although the specific reasons for the discrepancies between the three methods were beyond the scope of our study, it is reasonable to suspect that the heterogeneity of real-world data poses certain problems in choosing the best statistical model. The RefLim method distinguishes only between a normal and a lognormal distribution and log-transforms the data to a normal distribution if needed [10]. Across all regions and for the pooled data, RefLim predicted a lognormal distribution for all enzymes except for CHE in males, where a normal distribution was consistently selected for all regions (data not shown). While the TML method utilizes a forward modelling approach by first transforming the data and then fitting a model, refineR uses an inverse modelling approach: the algorithm tries to find a model that best explains the observed data and applies the Box-Cox transformation to it [14]. Despite the difference between the approaches, one would expect TML and refineR to arrive at similar distribution types, i.e. lambda values, when using identical data. However, our analyses showed that the lambda values of the two methods were only moderately correlated with each other (Figure 6). Furthermore, TML and refineR determined a large variety of lambda values for the same analyte across the different regions. This was particularly evident for AMY and CHE (Figure 6).
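Agreement between two sets of lambda estimates of this kind is typically summarized with a correlation coefficient. As a minimal sketch, assuming the Pearson coefficient is meant, the calculation looks as follows in Python; the paired lambda values are hypothetical and are not the estimates shown in Figure 6.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical lambda estimates for the same data sets from two methods.
lambda_method_a = [0.0, 0.2, 0.5, 0.9, 1.0]
lambda_method_b = [0.1, 0.0, 0.7, 0.6, 1.2]
print(round(pearson_r(lambda_method_a, lambda_method_b), 3))
```

A coefficient well below 1 indicates that the two methods often choose different distribution shapes for the same data, even when the resulting reference limits are similar.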

Figure 6: 
Lambda estimates for all analytes (female, male) of the eight individual laboratories and pooled data obtained from the TML and the refineR algorithm, respectively. For AMY and CHE, the values scatter over a wide range and are only moderately correlated (r=0.629).

Discussion

In 2017, the IFCC Committee on Reference Intervals and Decision Limits (C-RIDL) conducted an international multicenter study with a total recruitment of more than 13,000 apparently healthy individuals to investigate the feasibility of harmonizing reference intervals across countries with direct approaches [22]. In our study, we showed that by using indirect methods, such surveys can be conducted with similar or even larger numbers of cases with far less organizational effort.

We demonstrated that the three methods examined here provided broadly comparable results for ALP, AMY and CHE determined in a European population. For ALP, a harmonized reference interval throughout Europe can be considered, whereas minor regional differences were observed for AMY and CHE. However, we cannot estimate the effect of pathological data, as neither the applied data cleaning steps nor the indirect reference algorithms may be capable of excluding all pathological data, especially in scenarios with limited sample sizes (for example, AMY in laboratory G, Belgrade, Serbia). Since the reference intervals in this study were derived from a large number of real-world data, they closely fit the respective patient populations of the laboratories with their patient characteristics. Our study reinforces the call of guidelines and recommendations that reference intervals reported in the literature or by a manufacturer should be compared with results from the laboratory’s own subject population [1, 2]. The selection of a population-relevant reference interval can optimize the assessment of patients and, in some cases, reduce unnecessary investigations of patients with suspect results by ensuring that the reference interval more accurately reflects the population. In comparison to the literature, our study indicated that the reference limit of ALP for women should be reviewed and adjusted if necessary.

For GGT, we observed poor methodological agreement. A high proportion of mildly elevated results, potentially caused by alcohol consumption, overweight, and medication use, could make it difficult for all three methods to separate the pathological fraction clearly from the healthy distribution [23]. This effect might be enhanced by the fact that GGT concentration increases with age [24, 25]. Only sex-specific but not age-specific reference intervals were calculated, which might be one cause of the observed poor agreement [26].

The observed methodological differences for GGT showed the importance of using more than one indirect approach with distinct statistical techniques to identify problems of the methods associated with the heterogeneity of real-world data. The different methods show limitations in defining the healthy population due to the data structure of real-world results. It remains to be shown whether future algorithms will be capable of overcoming these limitations. Similar results can indicate that the indirect methods found comparable models that reflect the reference population, whereas larger differences between the methods can indicate that overlapping subfractions or slightly elevated results make it difficult for the indirect methods to correctly identify and separate the pathological fraction from the healthy results in a mixed population. In those cases, the use of only one indirect method could result in the implementation of false reference limits.

As described above, the specific reasons for the observed methodological discrepancies were beyond the scope of this study, but it is reasonable to suspect that the heterogeneity of real-world data poses certain problems in choosing the best statistical model. For the determination of the distribution type of the data, we have not made any specific attempts to distinguish whether the uncertainty of the lambda estimation lies more in the algorithms or also in the complex composition of the mixed data sets. Some distributions might be described equally well by a normal or a lognormal distribution, and the difference between the two distributions can be so small that statistical tests may not be able to distinguish between them, as for CHE in the current study [26]. Interestingly, however, when we tried to reproduce the theoretical lambda of 1 using refineR with a simulated data set for a clean normal distribution, we obtained fluctuating values between ∼0.1 and ∼1.5 (data not shown). The higher degree of flexibility of TML and refineR provides interesting opportunities for improved curve fitting. As we showed in Figure 6, the TML and refineR methods determined variable lambdas, especially for AMY and CHE, but with a few exceptions they yielded comparable reference intervals for the same regions.

Although the indirect methods require far less organizational effort than direct approaches, the different indirect methods have potential pitfalls and differ in their ease of use. This might result in incorrect estimation of reference limits if the required data cleaning and pre-processing steps are not properly implemented due to a lack of expert knowledge or limitations of the datasets. In contrast to the other methods, TML allows the application of clinical knowledge to define where in the distribution pathological values are to be expected. If used incorrectly, this can lead to falsely high or low reference limits.

For refineR, we observed that the method had difficulties in calculating the histogram if the data were composed of results with different numbers of decimal places. Previous versions of refineR failed to process the more complex pooled data of ALP and GGT, and the analyses terminated with an error. This problem was solved by an update of the refineR package, although the updated version still showed some abnormalities in the histograms if the data were not rounded to the same number of decimal places.

The findings of this study have to be seen in light of some limitations. Firstly, the study focused only on Abbott reagents, so the results might not be transferable to other manufacturers, although this limitation can also be seen as a strength of this study, since it eliminates differences due to the manufacturer and allows differences that might be attributed to other factors (e.g. regional) to be observed. However, as the ALP, AMY and GGT methods are traceable to IFCC reference procedures, the results could in principle be adopted by laboratories employing IFCC-traceable methods, whereas no IFCC method exists for CHE. Importantly, we identified some profound differences between the different sites; it is, however, unclear whether these are due to ethnic differences or to other modifying factors such as diet, lifestyle, overall health status, or preanalytical differences between the laboratories. In addition, different specimen types were used for the calculation of the reference intervals, namely serum and lithium heparin plasma. This should, however, have a negligible effect, as serum and lithium heparin plasma have been shown to be equivalent by Abbott and by external studies [27].

In summary, our study demonstrated that indirect methods offer a major advance for routine verification and validation of reference intervals taken from the literature. In the present study, we identified several reference limits that need adaptation to the analytical and preanalytical real-world conditions, despite the fact that the target values were taken from IFCC/DGKC recommendations, and the analytical measurements were made with IFCC methods. This surprisingly high percentage needs to be confirmed by further studies, which are currently ongoing.


Corresponding author: Anne Meyer, Abbott GmbH, Wiesbaden, Germany, Phone: +49(0)6122 58 2515, E-mail:

  1. Research funding: None declared.

  2. Author contributions: AM design, analysis, writing of manuscript, MO design, review of manuscript, RM and MH cowriting, review of manuscript, All other authors review of manuscript. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Competing interests: AM, RM, MH are employees of Abbott. The other authors state no conflict of interest.

  4. Informed consent: Not applicable.

  5. Ethical approval: Research involving human subjects complied with all relevant national regulations, institutional policies and is in accordance with the tenets of the Helsinki Declaration (as revised in 2013), and has been approved by the authors’ Institutional Review Board or equivalent committee. (2020-839R).

References

1. CLSI. Defining, establishing, and verifying reference intervals in the clinical laboratory; approved guideline, 3rd ed. CLSI EPC28-A3c. Wayne, PA: Clinical and Laboratory Standards Institute; 2010.Search in Google Scholar

2. Ozarda, Y, Higgins, V, Adeli, K. Verification of reference intervals in routine clinical laboratories: practical challenges and recommendations. Clin Chem Lab Med 2018;57:30–7. https://doi.org/10.1515/cclm-2018-0059.Search in Google Scholar PubMed

3. Ozarda, Y, Ichihara, K, Jones, G, Streichert, T, Ahmadian, R. Comparison of reference intervals derived by direct and indirect methods based on compatible datasets in Turkey. Clin Chim Acta 2021;520:186–95. https://doi.org/10.1016/j.cca.2021.05.030.Search in Google Scholar PubMed

4. Haeckel, R, Wosniok, W, Arzideh, F. Equivalence limits of reference intervals for partitioning of population data. Relevant differences of reference limits. J Lab Med 2016;40:199–205. https://doi.org/10.1515/labmed-2016-0002.Search in Google Scholar

5. Jones, G, Haeckel, R, Loh, T, Sikaris, K, Streichert, T, Katayev, A, et al.. Indirect methods for reference interval determination: review and recommendations. Clin Chem Lab Med 2019;57:20–9. https://doi.org/10.1515/cclm-2018-0073.Search in Google Scholar PubMed

6. Schumann, G, Klauke, R, Canalias, F, Bossert-Reuter, S, Franck, PFH, Gella, FJ, et al.. IFCC primary reference procedures for the measurement of catalytic activity concentrations of enzymes at 37 °C. Clin Chem Lab Med 2011;49:1439–46. https://doi.org/10.1515/CCLM.2011.621.Search in Google Scholar PubMed

7. Schumann, G, Aoki, R, Ferrero, CA, Ehlers, G, Ferard, G, Gella, FJ, et al.. IFCC primary reference procedures for the measurement of catalytic activity concentrations of enzymes at 37 °C. Reference procedure for the measurement of catalytic concentration of α-amylase. Clin Chem Lab Med 2006;44:1146–55. https://doi.org/10.1515/CCLM.2006.212.Search in Google Scholar PubMed

8. German Society for Clinical Chemistry. Proposal of standard methods for the determination of enzyme catalytic concentrations in serum and plasma at 37 °C. II. Cholinesterase. Eur J Clin Chem Clin Biochem 1992;30:163–70.Search in Google Scholar

9. Ceriotti, F, Henny, J, Queralto, JM, Ziyu, S, Ilcol, Y, Chen, B, et al.. Common reference intervals for alanine aminotransferase (ALT) and γ-glutamyl transferase (GGT) in serum: results from an IFCC multicenter study. Clin Chem Lab Med 2010;48:1593–601. https://doi.org/10.1515/cclm.2010.315.Search in Google Scholar

10. Farrell, CL, Nguyen, L. Indirect reference intervals: harnessing the power of stored laboratory data. Clin Biochem Rev 2019;40:99–111. https://doi.org/10.33176/AACB-19-00022.Search in Google Scholar PubMed PubMed Central

11. Arzideh, F, Özcürümez, M, Albers, E, Haeckel, R, Streichert, T. Indirect estimation of reference intervals using first or last results and results from patients without repeated measurements. J Lab Med 2021;45:103–9. https://doi.org/10.1515/labmed-2020-0149.Search in Google Scholar

12. Hoffmann, G, Lichtinghagen, R, Wosniok, W. Simple estimation of reference intervals from routine laboratory data. J Lab Med 2016;39:1–13. https://doi.org/10.1515/labmed-2015-0104.Search in Google Scholar

13. Hoffmann, R. Statistics in the practice of medicine. J Am Med Assoc 1963;185:864–73. https://doi.org/10.1001/jama.1963.03060110068020.

14. Klawonn, F, Hoffmann, G, Orth, M. Quantitative laboratory results: normal or lognormal distribution? J Lab Med 2020;44:143–50. https://doi.org/10.1515/labmed-2020-0005.

15. Klawonn, F, Hoffmann, G. Using fuzzy cluster analysis to find interesting clusters. In: Garcia-Escudero, LA, Gordaliza, A, Mayo, A, Lubiano Gomez, AM, Angeles Gil, M, Grzegorzewski, P, et al., editors. Building bridges between soft and statistical methodologies for data science. Cham: Springer; 2023:231–9 pp. https://doi.org/10.1007/978-3-031-15509-3_31.

16. Arzideh, F, Wosniok, W, Gurr, E, Hinsch, W, Schumann, G, Weinstock, N, et al. A plea for intra-laboratory reference limits. Part 2. A bimodal retrospective concept for determining reference limits from intra-laboratory databases demonstrated by catalytic activity concentrations of enzymes. Clin Chem Lab Med 2007;45:1043–57. https://doi.org/10.1515/cclm.2007.250.

17. Arzideh, F, Brandhorst, G, Gurr, E, Hinsch, Hoff, T, Roggenburck, L, et al. An improved indirect approach for determining reference limits from intra-laboratory data bases exemplified by concentrations of electrolytes. J Lab Med 2009;33:52–66. https://doi.org/10.1515/jlm.2009.015.

18. Ammer, T, Schützenmeister, A, Prokosch, H-U, Rauh, M, Rank, C, Zierk, J. refineR: a novel algorithm for reference interval estimation from real-world data. Nat Sci Rep 2021;12:16023. https://doi.org/10.1038/s41598-021-95301-2.

19. Haeckel, R, Wosniok, W. A new concept to derive permissible limits for analytical imprecision and bias considering diagnostic requirements and technical state-of-the-art. Clin Chem Lab Med 2011;49:623–35. https://doi.org/10.1515/cclm.2011.116.

20. Hoffmann, G, Klawonn, F, Lichtinghagen, R, Orth, M. The zlog value as a basis for the standardization of laboratory results. J Lab Med 2017;41:23–32. https://doi.org/10.1515/labmed-2017-0135.

21. Anker, S, Morgenstern, J, Adler, J, Brune, M, Brings, S, Fleming, E, et al. Verification of sex- and age-specific reference intervals for 13 serum steroids determined by mass spectrometry: evaluation of an indirect statistical approach. Clin Chem Lab Med 2022;61:452–63. https://doi.org/10.1515/cclm-2022-0603.

22. Ichihara, K, Ozarda, Y, Barth, JH, Klee, G, Qiu, L, Erasmus, R, et al. A global multicenter study on reference values: 1. Assessment of methods for derivation and comparison of reference intervals. Clin Chim Acta 2017;467:70–82. https://doi.org/10.1016/j.cca.2016.09.016.

23. Ferris, H, O’Flynn, AM, Kearney, P. Double trouble: the effect of obesity and alcohol consumption on serum GGT in Irish middle-aged adults. Rev Epidemiol Sante Publique 2018;66:S362. https://doi.org/10.1016/j.respe.2018.05.343.

24. Petroff, D, Bätz, O, Jedrysiak, K, Kramer, J, Berg, T, Wiegand, J. Age dependence of liver enzymes: an analysis of over 1,300,000 consecutive blood samples. Clin Gastroenterol Hepatol 2021;20:641–50. https://doi.org/10.1016/j.cgh.2021.01.039.

25. Puukka, K, Hietala, J, Pohjasniemi, H, Anttila, P, Bloigu, R, Niemelä, O. Age-related changes on serum GGT activity and the assessment of ethanol intake. Alcohol Alcohol 2007;41:522–7. https://doi.org/10.1093/alcalc/agl052.

26. Haeckel, R, Wosniok, W, Streichert, T. Review of potentials and limitations of indirect approaches for estimating reference limits/intervals of quantitative procedures in laboratory medicine. J Lab Med 2021;45:35–53. https://doi.org/10.1515/labmed-2020-0131.

27. Ercan, S. Comparison of test results obtained from lithium heparin gel tubes and serum gel tubes. Turk J Biochem 2020;45:575–86. https://doi.org/10.1515/tjb-2019-0117.


Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/labmed-2023-0042).


Received: 2023-04-06
Accepted: 2023-05-02
Published Online: 2023-06-01
Published in Print: 2023-08-28

© 2023 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
