Abstract
Objectives
To determine whether the diagnostic accuracy of the liver fibrosis marker FIB-4 and the likelihood ratios (LRs) of specific FIB-4 values vary with age.
Methods
We used a published dataset of 540 patients diagnosed with non-alcoholic fatty liver disease (NAFLD) or non-alcoholic steatohepatitis. Liver biopsy showed no or early fibrosis in 391 patients, and advanced fibrosis in 149. For each group we established the relation between the mean of the natural logarithm of FIB-4 (log(FIB-4)) and age, and the standard deviation (SD) of log(FIB-4) and age. Using a parametric method, we calculated the area under the ROC curve of FIB-4 as a function of the difference in mean values of log(FIB-4) between the two groups, and the ratio of the SDs of log(FIB-4). LRs were calculated as functions of age at 35, 50 and 65 years, using a parametric method.
Results
The mean log(FIB-4) increased with age in both groups. The SDs of log(FIB-4) did not change with age. There was a trend towards decreasing diagnostic accuracy with age, but this finding did not reach statistical significance. The area under the ROC curve was 0.73 using a parametric method that accounted for age, and 0.79 using a traditional non-parametric method that did not account for age. The LRs of specific FIB-4 values decreased with age.
Conclusions
Age distorts the estimation of both diagnostic accuracy and LRs.
Introduction
The Fibrosis-4 index (FIB-4) is recommended for estimating the risk of serious liver fibrosis in patients with type 2 diabetes or obesity with at least one cardiometabolic risk factor or persistently elevated liver enzyme levels [1]. FIB-4 is calculated as (age × S-AST)/(B-PLT × S-ALT^0.5), where age is given in years, S-AST is the serum enzymatic activity of aspartate aminotransferase in U/L, B-PLT is the blood platelet count in 10^9/L, and S-ALT is the serum enzymatic activity of alanine aminotransferase in U/L. Higher FIB-4 values indicate a higher risk of advanced liver fibrosis. According to European guidelines [1], the cut-off values of FIB-4 should be 1.3 and 2.67, with the exception of patients older than 65 years, where the lower cut-off value is 2.0 [1]. The guidelines also caution against using FIB-4 in patients younger than 35 years, due to low diagnostic accuracy [1, 2]. The two cut-off values define three groups, each with a different follow-up regime.
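The FIB-4 formula above can be sketched in a few lines of Python. The function name, argument names, and the laboratory values in the example are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
import math


def fib4(age_years: float, ast_u_per_l: float, alt_u_per_l: float,
         platelets_1e9_per_l: float) -> float:
    """FIB-4 = (age x S-AST) / (B-PLT x S-ALT^0.5)."""
    if min(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


# Hypothetical, identical laboratory results evaluated at three ages.
same_labs = dict(ast_u_per_l=30.0, alt_u_per_l=36.0, platelets_1e9_per_l=250.0)
for age in (35, 50, 65):
    print(age, round(fib4(age, **same_labs), 2))
```

With identical laboratory results, FIB-4 scales in direct proportion to age, so in this example the 65-year-old reaches the lower cut-off of 1.3 on age alone.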
Since age is included in the numerator of the FIB-4 equation, the isolated effect of age is an increase in FIB-4 with age, regardless of whether the patient has advanced liver fibrosis. If FIB-4 increases with age at different rates in patients with and without advanced liver fibrosis, the diagnostic accuracy of FIB-4 may change with age, because the mean difference between the two groups will change. A complicating factor is whether the standard deviations also change with age. Nevertheless, the likelihood ratio (LR) of a specific FIB-4 value must change with age. This is very unfortunate, because LRs can be used to revise the probability of disease [3]: Posttest probability = (pre × LR)/(pre × LR + 1 − pre), where pre is the pretest probability. For instance, the FIB-4 value with an LR of 1, often corresponding to the upper left “corner” of the receiver operating characteristic (ROC) curve, must be higher in patients aged 65 years than in those aged 35 years, simply because older patients operate with higher FIB-4 values in both groups. Thus, there can be no common list of LRs corresponding to specific FIB-4 values for all patients, and common cut-off values cannot represent common LRs. We used a published data set [4] to study these issues.
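As a small worked illustration of the posttest probability formula above, the following Python sketch applies it to a pretest probability of 20 %. The function name and the example probabilities are assumptions made for illustration.

```python
def posttest_probability(pretest: float, lr: float) -> float:
    """Posttest probability = (pre x LR) / (pre x LR + 1 - pre)."""
    if not 0.0 <= pretest <= 1.0:
        raise ValueError("pretest probability must lie between 0 and 1")
    if lr < 0.0:
        raise ValueError("a likelihood ratio cannot be negative")
    return (pretest * lr) / (pretest * lr + 1.0 - pretest)


# Hypothetical pretest probability of 0.20, revised with three different LRs.
for lr in (0.2, 1.0, 5.0):
    print(lr, round(posttest_probability(0.20, lr), 3))
```

An LR of 1 leaves the probability unchanged, which is why the FIB-4 value carrying an LR of 1 is a natural reference point when comparing ages.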
Materials and methods
In 2021, Sang et al. published an article [4] based on data from 540 patients diagnosed with non-alcoholic fatty liver disease (NAFLD) or non-alcoholic steatohepatitis (NASH), conditions that in Europe are now named metabolic dysfunction-associated steatotic liver disease (MASLD) and metabolic dysfunction-associated steatohepatitis (MASH), respectively [1]. Patients with a history of cancer, alcohol abuse, or other causes of chronic liver disease were excluded. All patients underwent liver biopsy, which showed no or early fibrosis in 391 patients and advanced fibrosis in 149. S-ALT, S-AST, and B-PLT levels were measured using instruments from Hitachi, Tokyo, Japan. Various other laboratory tests were also performed, as Sang et al. used the data to estimate a better fibrosis indicator than FIB-4. Further details are provided in [4]. This dataset (and two others) is freely available at https://github.com/chentianlu/LiveFbr.
We used the variables “group”, “Age”, “Sex”, and “FIB4” in the file discovery_set.csv with 540 records. The “group” variable contained two values, “S0_2” in 391 records and “S34” in 149. We took for granted that these correspond to “no or early fibrosis”, and “advanced fibrosis”, respectively. The mean (range) age was 53.0 (22–73) and 44.4 (17–77) years, respectively, in patients with and without advanced liver fibrosis. For FIB-4, the corresponding figures were 1.85 (0.476–6.30) and 1.05 (0.140–3.67).
We calculated the natural logarithm of FIB-4 (log(FIB-4)) and used these values for all the calculations. First, we modelled log(FIB-4) as a function of log(age), a variable coding for the diagnostic groups (0=“no or early fibrosis”, 1=“advanced fibrosis”) and a variable for the interaction between group and log(age). A reasonable assumption for the association would be a straight linear relation, as (by definition) log(FIB-4)=log(age) + log(S-AST) − log(B-PLT) − log(S-ALT^0.5). However, we allowed for a non-straight linear relation using fractional polynomial least squares regression. We also used ordinary least squares linear regression for the entire group and separately for each group. We then tested for heteroscedasticity, and if the test was not statistically significant, the root mean square value was taken as the standard deviation (SD).
The area under the ROC curve (AUCROC) was calculated as normal(z_auc), where normal is the cumulative normal distribution function and z_auc is the z-value for the AUCROC [5]. In turn, z_auc was calculated as a/(1 + b^2)^0.5, where a=[mean_log(FIB-4_1) − mean_log(FIB-4_0)]/SD_log(FIB-4_1), and b=SD_log(FIB-4_0)/SD_log(FIB-4_1) [5]. Subscripts 1 and 0 indicate groups with and without advanced liver fibrosis, respectively.
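The parametric AUCROC calculation described above can be sketched as follows. This is a Python illustration and not the Stata code used in the study; the function name is hypothetical, and the numbers in the example are the group difference and SDs reported in the Results.

```python
import math
from statistics import NormalDist


def binormal_auc(mean0: float, sd0: float, mean1: float, sd1: float) -> float:
    """AUCROC = normal(a / sqrt(1 + b^2)), with a and b as defined in the text.

    Subscript 1 = advanced fibrosis, subscript 0 = no or early fibrosis.
    """
    a = (mean1 - mean0) / sd1
    b = sd0 / sd1
    z_auc = a / math.sqrt(1.0 + b * b)
    return NormalDist().cdf(z_auc)


# Only the difference between the means matters, so mean0 is set to 0 here.
# 0.3458 is the reported group difference; 0.3689 and 0.4138 are the reported SDs.
auc = binormal_auc(mean0=0.0, sd0=0.3689, mean1=0.3458, sd1=0.4138)
print(round(auc, 2))
```

Because only the difference between the means enters the formula, a constant difference and constant SDs give the same AUCROC at every age.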
The LR of a specific biomarker result is the ratio of two probabilities: the probability of obtaining the result if one has the disease in question divided by the probability of obtaining the same result if one does not have the disease. For each of the ages 35, 50, and 65 years we estimated LR as
LR = normalden(log(FIB-4), mean_log(FIB-4_1), SD_log(FIB-4_1)) / normalden(log(FIB-4), mean_log(FIB-4_0), SD_log(FIB-4_0)),
where normalden is the normal density function for the respective mean and SD at a specific age and log(FIB-4) is the log value of the specific FIB-4 result. In each case, the calculations were limited to the range of mean_log(FIB-4_0) − 1.96 × SD_log(FIB-4_0) to mean_log(FIB-4_1) + 1.96 × SD_log(FIB-4_1).
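The density-ratio definition of the LR, including the restriction to the stated range, can be sketched in Python as follows. The function name is hypothetical; the means in the example are computed for a 50-year-old from the regression coefficients reported in the Results, and the SDs are those reported there.

```python
import math
from statistics import NormalDist


def likelihood_ratio(fib4_value, mean0, sd0, mean1, sd1):
    """Ratio of two normal densities evaluated at log(FIB-4).

    Returns None outside the evaluated range
    mean0 - 1.96 * sd0  to  mean1 + 1.96 * sd1.
    """
    x = math.log(fib4_value)
    if not (mean0 - 1.96 * sd0) <= x <= (mean1 + 1.96 * sd1):
        return None
    return NormalDist(mean1, sd1).pdf(x) / NormalDist(mean0, sd0).pdf(x)


# Age 50, using the reported intercept (-4.424), slope (1.157), and group shift (0.3458).
mean0_age50 = -4.424 + 1.157 * math.log(50)
mean1_age50 = mean0_age50 + 0.3458
for value in (1.3, 1.37, 2.67):
    print(value, likelihood_ratio(value, mean0_age50, 0.3689, mean1_age50, 0.4138))
```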
The Stata software, version 16 (StataCorp, College Station, TX 77845, USA) was used for estimation, calculation, simulation, and graphical work. Statistical significance was set at p<0.05.
Results
In the fractional polynomial regression model with log(FIB-4) as the dependent variable, and sex, log(age), diagnostic group, and a variable of interaction between log(age) and diagnostic group as independent variables, sex was statistically insignificant (p=0.66). After eliminating sex, the coefficient of the interaction variable was estimated to be −0.002072, with a 95 % CI from −0.008139 to 0.003995, i.e. it was statistically insignificant (p=0.50). After eliminating the interaction variable from the model, the association with log(age) was straight linear.
In the least squares linear regression model of log(FIB-4) as a function of log(age) and diagnostic group, the coefficient of the diagnostic group variable was estimated to be 0.3458 (95 % CI from 0.2707 to 0.4209). This was interpreted as the constant difference between mean log(FIB-4_1) and mean log(FIB-4_0). The coefficient for the log(age) variable was 1.157 (95 % CI from 1.053 to 1.261). The constant was −4.424 (95 % CI from −4.815 to −4.032). Thus the mean log(FIB-4_1) was estimated to be −4.424 + 0.3458 + 1.157 × log(age) and the mean log(FIB-4_0) to be −4.424 + 1.157 × log(age).
In the least squares linear regression model of log(FIB-4) as a function of log(age) for each diagnostic group, the p-values for the null hypothesis of constant variance with log(age) were 0.52 and 0.48, respectively, for the groups with and without advanced liver fibrosis. Accordingly, we took the root mean square values of 0.4138 and 0.3689 as the SDs of log(FIB-4) for all ages for the groups with and without advanced liver fibrosis, respectively.
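To make the fitted model concrete, the sketch below evaluates the two regression equations at the three ages studied. It is a Python illustration using only the coefficients and SDs reported above; the constant and function names are hypothetical.

```python
import math

# Estimates reported in the text (log = natural logarithm).
INTERCEPT = -4.424
SLOPE_LOG_AGE = 1.157
GROUP_SHIFT = 0.3458          # advanced fibrosis minus no or early fibrosis
SD_LOG_FIB4 = {0: 0.3689, 1: 0.4138}


def mean_log_fib4(age_years: float, advanced: bool) -> float:
    """Predicted mean of log(FIB-4) at a given age for one diagnostic group."""
    shift = GROUP_SHIFT if advanced else 0.0
    return INTERCEPT + shift + SLOPE_LOG_AGE * math.log(age_years)


for age in (35, 50, 65):
    m0 = mean_log_fib4(age, advanced=False)
    m1 = mean_log_fib4(age, advanced=True)
    # exp() of a mean on the log scale gives the geometric mean of FIB-4.
    print(age, round(math.exp(m0), 2), round(math.exp(m1), 2))
```

Both groups shift upwards by the same amount on the log scale as age increases, so the ratio between the two geometric means stays fixed at exp(0.3458) while the absolute FIB-4 values rise.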
The AUCROC was 0.73 for all ages, as neither the difference between the mean log(FIB-4) values nor their SDs were found to vary significantly with age. Using the traditional non-parametric approach, the AUCROC was 0.79 (95 % CI from 0.75 to 0.83) for the whole population. For the 269 individuals <50 years of age, the AUCROC was 0.80 (95 % CI from 0.73 to 0.87), and for the 271 individuals ≥50 years of age, it was 0.71 (95 % CI from 0.65 to 0.77). The difference did not reach statistical significance (p=0.07).
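For comparison with the parametric value, a non-parametric AUCROC can be computed as the proportion of diseased/non-diseased pairs in which the diseased patient has the higher result, with ties counted as one half. The text does not state which non-parametric estimator was used, so the following Python sketch is an assumption about the general approach, and the FIB-4 values in it are toy numbers, not data from the study.

```python
def empirical_auc(values_without, values_with):
    """Mann-Whitney form of the AUCROC: share of pairs ranked correctly."""
    if not values_without or not values_with:
        raise ValueError("both groups must contain at least one value")
    score = 0.0
    for diseased in values_with:
        for healthy in values_without:
            if diseased > healthy:
                score += 1.0
            elif diseased == healthy:
                score += 0.5      # ties count as half a correct ranking
    return score / (len(values_with) * len(values_without))


# Toy FIB-4 values, for illustration only.
toy_without = [0.6, 0.9, 1.1, 1.4]
toy_with = [1.0, 1.5, 2.2]
print(round(empirical_auc(toy_without, toy_with), 3))
```

Because this estimator pools patients of all ages, any age difference between the groups contributes to the ranking in addition to fibrosis itself.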
The LRs of various values of FIB-4 are shown in Figure 1 for patients aged 35 (green curve), 50 (red), and 65 (blue) years. The vertical lines indicate the FIB-4 values associated with LRs 1 and 5 at 35 (green), 50 (red), and 65 (blue) years of age. The respective values are shown in the legend to Figure 1. The traditional cut-off values of 1.3 and 2.67 are also indicated (black lines).

Figure 1: Point estimates of the likelihood ratio (LR) plotted as a function of FIB-4 values for patients at 35 (green), 50 (red), and 65 (blue) years of age. The colored vertical lines indicate the FIB-4 values associated with LR=1 and LR=5. For patients at 35 years of age (green), the FIB-4 values were 0.91 and 1.62. For patients at 50 years of age (red), the values were 1.37 and 2.45, and for those at 65 years of age (blue), the FIB-4 values were 1.86 and 3.31. The black vertical lines represent the traditional cut-off values 1.3 and 2.67.
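The FIB-4 values associated with LR=1 and LR=5 can be recovered from the reported regression coefficients and SDs. The Python sketch below does this by bisection; it is one possible way to solve for the values and not necessarily the procedure used in the study. The helper names are hypothetical, and the search assumes that the LR increases steadily with FIB-4 inside the evaluated range.

```python
import math
from statistics import NormalDist

# Estimates reported in the Results (log = natural logarithm).
INTERCEPT, SLOPE_LOG_AGE, GROUP_SHIFT = -4.424, 1.157, 0.3458
SD0, SD1 = 0.3689, 0.4138     # without / with advanced fibrosis


def log_lr(x: float, age: float) -> float:
    """Natural log of the LR at x = log(FIB-4) for a patient of the given age."""
    mean0 = INTERCEPT + SLOPE_LOG_AGE * math.log(age)
    mean1 = mean0 + GROUP_SHIFT
    return (math.log(NormalDist(mean1, SD1).pdf(x))
            - math.log(NormalDist(mean0, SD0).pdf(x)))


def fib4_at_lr(target_lr: float, age: float) -> float:
    """FIB-4 value whose LR equals target_lr, searched within the evaluated range."""
    mean0 = INTERCEPT + SLOPE_LOG_AGE * math.log(age)
    lo = mean0 - 1.96 * SD0
    hi = mean0 + GROUP_SHIFT + 1.96 * SD1
    goal = math.log(target_lr)
    if not log_lr(lo, age) < goal < log_lr(hi, age):
        raise ValueError("target LR is not reached inside the evaluated range")
    for _ in range(60):       # bisection; assumes LR rises with FIB-4 in this range
        mid = (lo + hi) / 2.0
        if log_lr(mid, age) < goal:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2.0)


for age in (35, 50, 65):
    print(age, round(fib4_at_lr(1.0, age), 2), round(fib4_at_lr(5.0, age), 2))
```

Under these assumptions the sketch gives values that agree with those listed in the figure legend to within rounding.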
Discussion
By including age in the FIB-4 equation, the basis was laid for two types of problems.
First, there is a question of diagnostic accuracy. We found an AUCROC of 0.73 for all ages. In that analysis, the log(FIB-4) values in the two groups were compared for patients at an equal age. In contrast, using the same data with a non-parametric method and not accounting for the effect of age, we obtained the value of 0.79, with a 95 % CI from 0.75 to 0.83, i.e. not including 0.73. As the group with advanced liver fibrosis was older than the group without (mean age was 8.6 years higher), both age and liver fibrosis contributed to the higher FIB-4 values in the group with advanced liver fibrosis. In fact, at each specific age there may be a specific ROC curve, but we were not able to prove that the AUCROC varied with age. Had it been statistically significant, the coefficient of the interaction variable would have been added to the coefficient of the log(age) variable in the group with advanced liver fibrosis. Thus, its negative point estimate indicates a trend towards a smaller group difference and lower diagnostic accuracy with age. A similar and clearer trend was found when comparing the AUCROC for patients below and above 50 years of age using the non-parametric method (0.80 vs. 0.71). This is partly at odds with a study by McPherson et al. [2]. They found an equal diagnostic accuracy of FIB-4 for all ages, with the exception of lower accuracy in patients younger than 35 years of age.
Second, as mentioned in the Introduction, the ROC curves in older patients must operate at higher FIB-4 values than those in younger patients, with consequences for the LRs of specific FIB-4 values. This prior, qualitative knowledge was confirmed and quantified by the findings shown in Figure 1, where the LRs of specific FIB-4 values decreased with age. If the lower cut-off value of FIB-4 is associated with an LR of 1, as is sometimes the case for biomarkers with only one cut-off value, the lower cut-off values would be 0.91, 1.37, and 1.86 in patients at 35, 50, and 65 years of age, respectively (Figure 1). An LR of 1 does not change the probability of disease. Higher values increase the probability, and if the higher cut-off value of FIB-4 is associated with an LR of, say, 5, then these values would be 1.62, 2.45, and 3.31 for patients at 35, 50, and 65 years of age, respectively. However, LRs are not the only determinants of the optimal decision limits; pretest probability and the benefits and costs of the various consequences of the follow-up regimes are also important [6].
It is difficult to see the criteria behind the traditional cut-off values of 1.3 and 2.67. In any case, they cannot have been selected to represent the same LRs for all ages. The choice of 2.0 as the lower cut-off value for patients ≥65 years of age was made to increase the specificity [2].
A weakness of this work is the lack of confidence intervals for the LR estimates. They could have been estimated by bootstrap techniques. However, we did not see the point of doing this, as the age dependency of the LRs is obvious.
Another weakness is our limited knowledge of the clinical conditions of the patients in the dataset. We do not know how representative they are of a clinically relevant population in Europe. In addition to different clinical settings, ethnicity may play a role [4]. Also, we have no data on analytical quality or on whether AST and ALT were measured using IFCC methods. However, even if the generalizability of our results is unknown, we believe that our methods are generally applicable. We hope that others will test the age dependency of FIB-4, both its diagnostic accuracy and LRs, using their own clinical datasets. If the results of the current study are confirmed, our previous attempt to estimate common LRs for patients of all ages (i.e. not adjusted for age) was in vain [7]. Ideally, FIB-4 should be replaced with a marker less affected by age. In the meantime, clinicians would benefit from lists or nomograms of age-differentiated cut-off values, in which the criteria for setting the cut-off values are clearly stated.
Conclusions
For FIB-4, age distorts both the estimation of its diagnostic accuracy and the estimation of LRs.
- Research ethics: Not applicable.
- Informed consent: Not applicable.
- Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Use of Large Language Models, AI and Machine Learning Tools: None declared.
- Conflict of interest: The authors state no conflict of interest.
- Research funding: None declared.
- Data availability: Not applicable.
References
1. European Association for the Study of the Liver (EASL); European Association for the Study of Diabetes (EASD); European Association for the Study of Obesity (EASO). EASL-EASD-EASO Clinical practice guidelines on the management of metabolic dysfunction-associated steatotic liver disease (MASLD). J Hepatol 2024;81:492–542. https://doi.org/10.1016/j.jhep.2024.04.031.
2. McPherson, S, Hardy, T, Dufour, JF, Petta, S, Romero-Gomez, M, Allison, M, et al. Age as a confounding factor for the accurate non-invasive diagnosis of advanced NAFLD fibrosis. Am J Gastroenterol 2017;112:740–51. https://doi.org/10.1038/ajg.2016.453.
3. Grimes, DA, Schulz, KF. Refining clinical diagnosis with likelihood ratios. Lancet 2005;365:1500–5. https://doi.org/10.1016/s0140-6736(05)66422-7.
4. Sang, C, Yan, H, Chan, WK, Zhu, X, Sun, T, Chang, X, et al. Diagnosis of fibrosis using blood markers and logistic regression in Southeast Asian patients with non-alcoholic fatty liver disease. Front Med (Lausanne) 2021;8:637652. https://doi.org/10.3389/fmed.2021.637652.
5. Krzanowski, WJ, Hand, DJ. ROC curves for continuous data. London: CRC Press, Taylor & Francis Group; 2009:17–35 pp. https://doi.org/10.1201/9781439800225.
6. Åsberg, A, Bolann, BJ. The optimal cut-off value. Clin Chim Acta 2025;565:119953. https://doi.org/10.1016/j.cca.2024.119953.
7. Åsberg, A, Løfblad, L, Hov, GG. The likelihood ratios of FIB-4-values for diagnosing advanced liver fibrosis in patients with NAFLD. Clin Chem Lab Med 2023;61:e233–4. https://doi.org/10.1515/cclm-2023-0177.
© 2025 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.