Article Open Access

Assessment of machine-learning techniques on large pathology data sets to address assay redundancy in routine liver function test profiles

  • Brett A. Lidbury, Alice M. Richardson and Tony Badrick
Published/Copyright: January 15, 2015

Abstract

Background: Routine liver function tests (LFTs) are central to serum testing profiles, particularly in community medicine. However, there is concern about the redundancy of information provided to requesting clinicians. Large quantities of clinical laboratory data and advances in computational knowledge discovery methods provide opportunities to re-examine the value of the individual routine laboratory results that combine to form LFT profiles.

Methods: The machine learning methods recursive partitioning (decision trees) and support vector machines (SVMs) were applied to aggregate clinical chemistry data that included elevated LFT profiles. Response categories for γ-glutamyl transferase (GGT) were established based on whether the patient results were within or above the sex-specific reference interval. Single decision trees and SVMs were applied to test the accuracy of GGT prediction by the highest ranked predictors of GGT response, alkaline phosphatase (ALP) and alanine aminotransferase (ALT).

Results: Through interrogating more than 20,000 individual cases comprising both sexes and all ages, decision trees predicted GGT category at 90% accuracy using only ALP and ALT, with an SVM prediction accuracy of 82.6% after 10-fold training and testing. Bilirubin, lactate dehydrogenase (LD) and albumin either did not enhance prediction or reduced accuracy. Comparison of abnormal (elevated) GGT categories also supported the primacy of ALP and ALT as screening markers, with serum urate and cholesterol also useful.

Conclusions: Machine-learning interrogation of massive clinical chemistry data sets demonstrated a strategy to address redundancy in routine LFT screening by identifying ALT and ALP in tandem as able to accurately predict GGT elevation, suggesting that GGT can be removed from routine LFT screening.

Introduction

Traditionally, pathology laboratories have grouped individual tests performed on high throughput biochemistry analysers into profiles associated with different organs [e.g., liver function tests (LFT) and renal function tests (RFT)] to enhance laboratory screening for disease. As well as being used in health care settings, pathology test profiles are central to screening community patients. In the community setting, the lower prevalence of abnormalities contributes to reduced test specificity, leading to many false positives that are expensive to investigate and heighten patient anxiety. A new assessment of blood test profiles, particularly for community patient screening, is therefore required.

In 2012, an Association of Clinical Biochemists (UK) Clinical Practice Section publication suggested an initial standardisation of common biochemistry profiles, and the eventual harmonisation of profiles across the UK [1]. The content of the LFT and other profiles is largely historically dictated. Traditionally, when a new test was developed and found to be abnormal in a disease associated with a particular organ, it was included in the profile without regard to whether it provided any additional diagnostic information. This study presents an evidence-based strategy via mass laboratory data access and sophisticated machine learning algorithms. The strategy presented herein was simply to predict the diagnostic value of an existing routine LFT marker, γ-glutamyl transferase (GGT), via other routine LFT marker results; the implication being that if patterns from other markers accurately predict GGT elevation, GGT is not required for routine screening. After initial screening, however, GGT does retain value for laboratory diagnosis, for example in cases of alcohol and drug abuse. GGT has been found to have very good diagnostic value for monitoring alcoholism, the differentiation of drinking level and liver cirrhosis, particularly when considered with serum urate, carbohydrate-deficient transferrin (CDT) and other routine and novel biomarkers [2–4]. When considering only abnormal GGT responses for this study, serum total cholesterol was also identified as having potential diagnostic value in this context; increased serum GGT has been identified as a predictor of cardiovascular disease [5, 6].

The most common routine serum LFT profile consists of GGT, alkaline phosphatase (ALP), alanine transaminase (ALT), aspartate aminotransferase (AST), sometimes lactate dehydrogenase (LD), serum albumin, and total serum bilirubin [7]. This multicomponent biochemical LFT has poor sensitivity and specificity for liver disease, and it has been suggested that it is time to consider whether some of the components should be removed [1]. The Association of Clinical Biochemists has advocated a four-component profile (albumin, bilirubin, ALT and ALP), but no supporting evidence was provided for reducing the profile to this group. In excess of 6000 papers dealing with LFTs have been published since 1990 [8]; however, most have been based on hospital practice rather than community practice, and they have been mostly retrospective and concerned with probabilities given a disease state, not the predictive probability of disease. The BALLETS study [9] was a prospective study based in primary care practice where patients were fully investigated following at least one abnormal analyte from a full LFT panel. The BALLETS study showed that the traditional LFT profile has low specificity for liver disease and concluded that “the routine LFT profile contain fewer components and provide the same or greater value”, particularly if it is used as a screening procedure [8–10]. Benefits from reducing the number of components of the routine LFT profile while retaining predictive power include reduced likelihood of over-investigation due to the low specificity/sensitivity, reduced costs to the health system and less patient anxiety [11].

Understanding the deeper interactions between routine clinical chemistry serum biomarkers (Table 1) for a given disease condition can be achieved today through access to enormous datasets and the application of pattern recognition strategies via machine learning methods. This study explored a retrospective large data set of individual pathology test results (>25,000 cases), which included thousands of cases with deranged liver enzymes elevated above the specific reference interval, obtained from screened community patients after a medical consultation (Table 2). Two machine learning techniques were combined to examine predictions of normal or elevated GGT response, as defined by laboratory reference interval (Table 1); (i) Recursive partitioning based decision models (decision trees), which have been applied to medical knowledge domains [12, 13] and provide advantages of applicability to diverse data regardless of distribution, as well as multiple decision boundaries [14] and (ii) Support vector machines (SVM), which provide powerful classification and regression tools through kernel analyses without high computational cost [15]. SVMs have been successfully developed previously to separate diagnosed nonalcoholic steatohepatitis (NASH) patients from healthy controls, with 3 biomarkers from 17 identified with high predictive value after algorithm training and testing [16].

Table 1

Routine clinical chemistry serum markers available for machine-learning analyses of elevated γ-glutamyl transferase (GGT) response categories.

| Full analyte name | Abbreviation | Reference range (adult male) | Reference range (adult female) |
|---|---|---|---|
| Response variable | | | |
| γ-Glutamyl transferase | GGT | 5–50 U/L | 5–35 U/L |
| Predictor variables | | | |
| Liver function: | | | |
| Total bilirubin | Tbili | 4–20 μmol/L | 3–15 μmol/L |
| Alkaline phosphatase | ALP | 60–200 U/L (17–19 years) | 20–105 U/L (19–49 years) |
| Alkaline phosphatase | ALP | 35–110 U/L (20+ years) | 30–115 U/L (50+ years) |
| Aspartate aminotransferase | AST | 10–40 U/L | 10–35 U/L |
| Alanine aminotransferase | ALT | 5–40 U/L | 5–30 U/L |
| Lactate dehydrogenase | LD | 120–250 U/L | 120–250 U/L |
| AST:ALT ratio | AST:ALT | | |
| LDH:AST ratio | LD:AST | | |
| Serum proteins: | | | |
| Total protein | TP | 61–83 g/L | 61–83 g/L |
| Albumin | ALB | 34–50 g/L | 34–50 g/L |
| Globulin | GLOB | 23–39 g/L | 23–39 g/L |
| Urea, electrolytes, creatinine: | | | |
| Sodium | Na+ | 135–145 mmol/L | 135–145 mmol/L |
| Potassium | K+ | 3.5–5.5 mmol/L | 3.5–5.5 mmol/L |
| Chloride | Cl | 95–110 mmol/L | 95–110 mmol/L |
| Bicarbonate | HCO3 | 20–32 mmol/L | 20–32 mmol/L |
| Anion gap | AG | 5–15 mmol/L | 5–15 mmol/L |
| Creatinine | Creat | 60–120 μmol/L | 45–85 μmol/L |
| Urea | Urea | 3–10 mmol/L | 3–10 mmol/L |
| Urate | Urate | 0.20–0.50 mmol/L | 0.15–0.40 mmol/L |
| Estimated glomerular filtration rate | EGFR | ≥60 mL/min/1.73 m² | ≥60 mL/min/1.73 m² |
| Osmolality | Osmol | 285–300 mmol/kg | 285–300 mmol/kg |
| Bone | | | |
| Calcium (total) | Ca2+ | 2.15–2.55 mmol/L | 2.15–2.55 mmol/L |
| Corrected calcium (corrected for albumin binding) | Ca_Corr | 2.15–2.55 mmol/L | 2.15–2.55 mmol/L |
| Phosphate | PO4 | 0.8–1.5 mmol/L | 0.8–1.5 mmol/L |
| Other | | | |
| Cholesterol | CHOL | Varies with lab and population | |
| Age | | Years | |
| Sex | | Female (1), Male (2) | |

Table 2

Summary of GGT response categories investigated by machine learning to examine GGT redundancy for routine LFT profiles through tandem recursive partitioning (decision tree) and support vector machine (SVM) prediction.


For the prediction of serum GGT response, single decision trees were used initially to identify individual LFT assay importance for GGT category prediction, and to ascertain the decision thresholds (e.g., ALT >30 U/L) that guide the most accurate GGT category prediction. With the top predictors (ALT and ALP) and predictor thresholds identified by single decision trees, final prediction accuracy for GGT category was obtained via SVM training/testing protocols run as 10-fold cross-validation analyses. An additional aim was to use machine learning to interrogate cases of elevated serum GGT (>50 U/L), to assess the capacity of ALT/ALP with other clinical chemistry biomarkers (e.g., serum urate, cholesterol) to enhance the detection of liver pathology, as reflected by varying degrees of GGT elevation above the reference interval.

Materials and methods

Data

Non-identifiable patient clinical chemistry data from community patients were obtained from Sullivan Nicolaides Pathology (Brisbane, Australia) in a spreadsheet format, with data access and analysis approved by the Human Research Ethics Committees at both Bond University and The Australian National University. The original data set comprised 26,065 cases, collected via centres and laboratories throughout the state of Queensland (Australia), over January 2012. The clinical chemistry biomarkers associated with the analyses of GGT responses are summarised in Table 1, and include male and female reference intervals for all analytes. All serum chemistry analyses were performed on the Roche Modular Chemistry Analyser (Roche Diagnostics Limited, Rotkreuz, Switzerland).

Data pre-processing assigned each case to a GGT response category based on specific reference intervals for males and females (Table 1), with category 0 comprising cases within the reference interval, and category 1 comprising GGT cases elevated above the upper limit of the male or female reference intervals. Further data pre-processing removed all patient cases <18 years of age (Table 2B), and then split the remaining cases into separate female and male data sets for additional analysis.
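The category assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pre-processing code: the upper limits come from Table 1, the sex coding (1 = female, 2 = male) follows Table 1, and the dictionary keys `age`, `sex` and `ggt` are hypothetical field names. Values below the lower reference limit are treated here as category 0, which is also an assumption.

```python
# Upper reference limits for GGT (U/L), taken from Table 1.
# Sex coding follows Table 1: 1 = female, 2 = male.
GGT_UPPER_LIMIT = {1: 35.0, 2: 50.0}


def ggt_category(ggt_u_per_l: float, sex: int) -> int:
    """Return 1 if GGT exceeds the sex-specific upper limit, otherwise 0."""
    return 1 if ggt_u_per_l > GGT_UPPER_LIMIT[sex] else 0


def preprocess(cases, min_age=18):
    """Drop cases younger than min_age and attach a GGT category.

    `cases` is an iterable of dicts with the hypothetical keys
    'age' (years), 'sex' (1 or 2) and 'ggt' (U/L).
    """
    kept = []
    for case in cases:
        if case["age"] < min_age:
            continue  # paediatric cases are excluded from the adult analyses
        kept.append({**case, "ggt_cat": ggt_category(case["ggt"], case["sex"])})
    return kept
```

A GGT of 40 U/L therefore falls in category 1 for a female case but category 0 for a male case, which is why the category must be assigned before the sexes are pooled.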

For separate SVM experiments, GGT cases above the upper limit of the reference intervals (GGT >50 U/L) were stratified into low, medium and high GGT elevated response categories for SVM pattern analysis using ALT/ALP, as well as serum urate and total cholesterol predictors (Table 2A and C).

To determine GGT predictive profiles, the response variable for all analyses was the level of activity for serum GGT (U/L), represented as a response variable category of 0 or 1. The distribution patterns of the LFT markers and other variables were analysed by frequency plot and Kolmogorov-Smirnov (K-S) statistics. Apart from age and serum albumin, the LFT variables considered did not follow a normal distribution. Therefore, medians of all predictor variables for category 0 versus category 1 were compared by the Mann-Whitney U-test, with data dispersion for each median reported as the 25%–75% quartiles. Pearson or Spearman correlations were also performed to detect strength of association for combinations of GGT, ALP, AST and ALT for normal and elevated GGT categories (Table 2) (SPSS version 21).
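For readers unfamiliar with the Mann-Whitney U-test, the statistic itself can be computed by counting pairs, as in the sketch below. The study used SPSS; this pure-Python version is only illustrative, it does not compute a p value, and the ALT values in the example are invented rather than study data.

```python
from statistics import median


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the (x_i, y_j) pairs in which x_i > y_j, with ties
    contributing 0.5; U_y is the complementary count.
    """
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Invented ALT values (U/L) for illustration only.
alt_cat0 = [18, 22, 25, 27, 31]
alt_cat1 = [29, 35, 41, 52, 60]
u0, u1 = mann_whitney_u(alt_cat0, alt_cat1)
print(median(alt_cat0), median(alt_cat1), u0, u1)
```

The smaller of the two U values is the one usually compared against critical values or converted to a p value; a value close to zero indicates that one group sits almost entirely above the other.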

Machine learning and analysis

All decision tree and SVM analyses were performed using the R packages rpart [17] and e1071 [18], respectively. Single decision tree rules based on calculated predictor variable thresholds (e.g., ALT <30.5 U/L) at decision nodes were determined for each GGT response category (0 or 1). Single decision trees were run using default settings, which included a cp value of 0.01 and a minimum bucket size defined as “minbucket=round(minsplit/3)”. Single decision tree analyses also provided screening for predictor variable importance rankings, with the top 2–3 GGT category predictors ultimately applied to tuned SVM models. SVM predictions were based on 10-fold cross comparison of training and testing data to arrive at an overall mean percentage of class (category 0 or 1) prediction (70% data for training, and 30% for testing prediction accuracy). The SVM investigations included a pre-analysis tuning phase where the best cost and γ-coefficients were identified for the combination of the response category (normal or elevated GGT) and the predictor variables of interest [15]. As a result of tuning, all SVM models used a gamma of 0.1 and a cost coefficient of 10 or 100. Unpruned decision tree models containing all LFT predictor variables were fitted to identify the best GGT predictors for 10-fold training/testing SVM analysis of prediction accuracy. Four to six predictor variables were tested for initial decision tree models (ALT, AST, ALP, LD, total bilirubin, serum albumin), with only ALT plus ALP applied to subsequent SVM prediction models. Age was initially included in SVM models but, except for one example (Figure 1), did not aid prediction accuracy. Confidence intervals were calculated for each SVM testing phase percentage accuracy rate using a traditional method based on the Central Limit Theorem, which justifies a normal approximation to the distribution of a proportion [19].
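The normal-approximation confidence interval for an accuracy rate can be sketched as follows. The 95% level (z = 1.96) is an assumption, since the text does not state the level used, and the function name is hypothetical. The example applies the formula to the 794/840 count quoted in the Figure 1 caption.

```python
import math


def proportion_ci(correct: int, total: int, z: float = 1.96):
    """Normal-approximation (Wald) interval for a proportion.

    Returns (p, lower, upper), with the bounds clipped to [0, 1].
    z = 1.96 corresponds to an assumed 95% confidence level.
    """
    p = correct / total
    se = math.sqrt(p * (1.0 - p) / total)  # standard error of the proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


p, lower, upper = proportion_ci(794, 840)
print(f"accuracy {p:.1%}, interval {lower:.1%} to {upper:.1%}")
```

The approximation is reasonable here because both the number of correct and incorrect predictions are large; for accuracies very close to 0% or 100%, or for small test sets, an exact or Wilson interval would be a safer choice.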

Figure 1

Prediction of GGT category (category 0, within laboratory reference interval; category 1, GGT elevated above reference interval) by ALP and ALT for the total case sample provided (male and female cases; Age 0–106 years; n=25,420).

(A) Single decision tree analysis of the combined ALP and ALT rules to predict GGT category, including prediction accuracy and ALP/ALT concentration thresholds. (B) GGT category prediction accuracy calculated from 10-fold training and testing cross validation via support vector machines (70% data training and 30% data for testing). The reported result is the testing phase SVM prediction accuracy. * Adding Age <36.5 years increased the prediction accuracy for Cat 0 to 94.5% (794/840 cases correct). Age did not improve category 1 prediction.

Results

Descriptive statistics and correlation analyses for GGT categories

Table 2 summarises the arrangement of data into GGT response categories for interrogation by machine learning. Analyses were performed to determine whether GGT could be removed from routine LFT screening because other LFT markers accurately predict normal or elevated serum GGT. Firstly, descriptive statistics and correlation analyses were conducted to assess the basic characteristics of the GGT categories for subsequent decision tree/SVM interrogation (Table 2). Single decision trees were applied to identify the most important predictors, determine ALT and ALP concentration thresholds and assess prediction accuracy for normal versus abnormal GGT categories prior to SVM training plus testing. Another analysis focussed only on abnormal GGT cases elevated above the specific reference interval, which were split into categories depending upon the size of the serum concentration increase above the upper limit of the reference interval (Table 2A and C).

Reflecting the expected co-elevation of other LFT markers with GGT, category 1 LFT enzyme and bilirubin medians were significantly higher compared to category 0 (p=0.01–p<0.001, Mann-Whitney U-test) in general (Table 2B). This pattern was also found for the comparisons between elevated GGT categories (Table 2A and C).

The correlation (r) of GGT and ALP was moderately strong for category 1 in each of the three GGT comparisons presented (Table 2). Correlation of ALT with AST was very strong (r≥0.90) for both GGT categories of the “Abnormal LFT” and “Elevated GGT/ALP” examples (Table 2A and C), with a decreased strength of association for both GGT categories when comparing normal serum GGT levels with elevated GGT (Table 2B). This suggests that AST has the least utility when investigating abnormal GGT cases, with ALT alone sufficient. Correlations between GGT and AST or ALT were weak or null for the “Abnormal LFT” and “Elevated GGT/ALP” conditions examined. GGT and AST/ALT associations were stronger for the categories comparing the normal within-reference-interval response with elevated GGT (Table 2B). The GGT relationship with the serum transaminases was weakest at very high GGT serum concentrations, which may reflect the relatively rare occurrence and hence wide dispersal of strongly elevated GGT results for screened community patients.

Single decision trees, SVM and predictions of elevated GGT responses by ALP and ALT

Preliminary unpruned single decision trees were produced to identify the leading predictors of GGT category for normal versus abnormal responses (Table 2B). To start, the entire sample of 25,420 cases, female and male with an age range from 0 to 106 years, was investigated by including all LFT markers in the decision tree model to predict GGT response category. Of the original 25,420 cases, 18,439 were within the GGT laboratory reference interval (category 0), and 6981 cases were elevated above the upper limit of the GGT reference interval (category 1) (Table 1). The preliminary trees identified only ALP and ALT (and occasionally age) as important to the classification of GGT response (which was confirmed by a multiple tree analysis, the random forest: results not shown). Additional tree analysis utilised only ALP+ALT+Age, or ALP+ALT as predictor variables for GGT response classification.
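As a quick arithmetic check on the counts reported above, and to give the prediction accuracies some context, the class proportions can be computed directly. The majority-class baseline shown here is not reported by the authors; it is simply the accuracy that would result from always predicting category 0, offered as a reference point. The adult sample sizes are those given in the captions of Figures 2–4.

```python
# Case counts reported in the text for the full sample.
n_cat0 = 18_439   # GGT within the reference interval
n_cat1 = 6_981    # GGT above the upper reference limit
n_total = n_cat0 + n_cat1

prevalence_elevated = n_cat1 / n_total              # share of category 1 cases
majority_baseline = max(n_cat0, n_cat1) / n_total   # always predict category 0

# Adult sample sizes from the captions of Figures 2-4.
n_adult_female = 11_154
n_adult_male = 13_658
n_adult = n_adult_female + n_adult_male
n_under_18 = n_total - n_adult                      # cases removed as <18 years

print(n_total, round(prevalence_elevated, 3), round(majority_baseline, 3))
print(n_adult, n_under_18)
```

Because roughly a quarter of cases are in category 1, an overall accuracy should be read alongside the per-category accuracies, which is how the results are reported in the figures.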

Figure 1 shows the single decision tree rules identified to predict and differentiate an elevated GGT response (category 1) from a GGT response within the reference interval (category 0). The crucial ALT threshold calculated by the tree was greater or less than (> or <) 30.5 U/L, with an associated ALP threshold of <90.5 U/L for category 0 (85.8% prediction accuracy), and ALP >121.5 U/L for category 1 (90.2% accuracy). The addition of age to the model improved the prediction accuracy for category 0 to 94.5% (<36.5 years), but age did not influence category 1 prediction. Cases <18 years of age were removed and another decision tree analysis conducted on the same sample (Figure 2). Similar prediction accuracies and ALT/ALP thresholds were found, except that the ALP threshold for a category 0 prediction increased to 109.5 U/L.
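The thresholds quoted for the full sample can be written as a simple rule function. This is one reading of the text, not a reproduction of the fitted rpart tree, whose exact structure is shown only in Figure 1. Cases that match neither quoted rule are returned as undetermined (`None`), which is an assumption made for this sketch.

```python
from typing import Optional

# Thresholds (U/L) quoted in the text for the full sample (Figure 1).
ALT_THRESHOLD = 30.5
ALP_CAT0_MAX = 90.5    # category 0 rule: ALP below this value
ALP_CAT1_MIN = 121.5   # category 1 rule: ALP above this value


def predict_ggt_category(alt: float, alp: float) -> Optional[int]:
    """Apply the two quoted rules; return None when neither applies."""
    if alt < ALT_THRESHOLD and alp < ALP_CAT0_MAX:
        return 0   # GGT expected within the reference interval
    if alt > ALT_THRESHOLD and alp > ALP_CAT1_MIN:
        return 1   # GGT expected above the reference interval
    return None    # outside the two rules quoted in the text


print(predict_ggt_category(alt=25.0, alp=80.0))
print(predict_ggt_category(alt=45.0, alp=150.0))
print(predict_ggt_category(alt=45.0, alp=100.0))
```

For the adult-only, female and male samples the text reports different thresholds, so the constants would need to be changed accordingly.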

Figure 2

Prediction of GGT category (category 0, within laboratory reference interval; category 1, GGT elevated above reference interval) by ALP and ALT for the adult case sample provided (male and female cases; Age 18–106 years; n=24,812).

(A) Single decision tree analysis of the combined ALP and ALT rules to predict GGT category, including prediction accuracy and ALP/ALT concentration thresholds. (B) GGT category prediction accuracy calculated from 10-fold training and testing cross validation via support vector machines (70% data training and 30% data for testing). The reported result is the testing phase SVM prediction accuracy.

The influence of sex on the prediction of elevated serum GGT response by ALT and ALP was investigated. For females, a prediction of category 0 was similar to the combined sample of 18 years or older (Figure 2), with ALT of <29.5 U/L and ALP of <109.5 U/L required (Figure 3). However, for the prediction of category 1, ALT >29.5 U/L alone provided the best accuracy. Compared to the combined sample including both sexes, female prediction accuracies were low, at 73.6% for category 0 and 67.3% for category 1. Male category 0/1 prediction accuracies were >90%, with an ALP decision threshold at 124.5 U/L (Figure 4). ALT was also required for the highest prediction percentage, with a category 0 prediction needing a threshold of <30.5 U/L, while the category 1 ALT threshold was lower at >27.5 U/L.

Figure 3

Prediction of GGT category (category 0, within laboratory reference interval; category 1, GGT elevated above reference interval) by ALP and ALT for female adult cases (Age 18–103 years; n=11,154).

(A) Single decision tree analysis of the combined ALP and ALT rules to predict GGT category, including prediction accuracy and ALP/ALT concentration thresholds. (B) GGT category prediction accuracy calculated from 10-fold training and testing cross validation via support vector machines (70% data training and 30% data for testing). The reported result is the testing phase SVM prediction accuracy.

Figure 4

Prediction of GGT category (category 0, within laboratory reference interval; category 1, GGT elevated above reference interval) by ALP and ALT for male adult cases (Age 18–106 years; n=13,658).

(A) Single decision tree analysis of the combined ALP and ALT rules to predict GGT category, including prediction accuracy and ALP/ALT concentration thresholds. (B) GGT category prediction accuracy calculated from 10-fold training and testing cross validation via support vector machines (70% data training and 30% data for testing). The reported result is the testing phase SVM prediction accuracy.

To validate the accuracy of GGT class prediction reported in Figures 1–4 for single decision trees, the same data were used to train and test SVM ensembles (10-fold cross validation, 70/30 train/test data ratio). For the entire sample (Figure 1B) and the combined female/male sample of 18 years or older (Figure 2B), the category prediction accuracies ranged between 75% and 80%, lower than the accuracy percentages estimated by single decision trees. The disparity in GGT category prediction accuracy by ALP and ALT when comparing adult females and males was also detected by SVM ensembles (Figures 3B and 4B), with the overall category prediction trends via SVM the same as those found by single decision trees, albeit marginally lower.
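The validation scheme described here (ten rounds, each with 70% of cases for training and 30% for testing, averaged into one accuracy) can be sketched as repeated random hold-out. Reading "10-fold" with a 70/30 ratio as ten independent random splits is an assumption. The study fitted SVMs with the R package e1071; the sketch below substitutes a trivial majority-class stand-in so that it runs with the Python standard library alone, and the toy data are invented.

```python
import random
from collections import Counter
from statistics import mean


def fit_majority(train):
    """Stand-in 'model' (not an SVM): the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def predict_majority(model, features):
    """Predict the stored majority label regardless of the features."""
    return model


def repeated_holdout(cases, fit, predict, repeats=10, train_frac=0.7, seed=0):
    """Mean test accuracy over `repeats` random train/test splits.

    `cases` is a list of (features, label) pairs.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        shuffled = list(cases)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_frac)
        train, test = shuffled[:cut], shuffled[cut:]
        model = fit(train)
        correct = sum(predict(model, x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return mean(accuracies), accuracies


# Invented toy data: (ALT, ALP) features with made-up category labels.
toy = [((20.0 + i % 30, 80.0 + i % 60), 1 if i % 4 == 0 else 0) for i in range(100)]
mean_acc, accs = repeated_holdout(toy, fit_majority, predict_majority)
print(len(accs), round(mean_acc, 3))
```

Any classifier exposing a fit step and a predict step could be passed in place of the stand-in; only the testing-phase accuracy of each round is averaged, which matches how the figure captions describe the reported result.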

The addition of AST to SVM ensembles did not increase prediction accuracy percentages produced by ALP and ALT, and for some models decreased prediction accuracy by 2%–3%. SVM models using total serum bilirubin, LD and serum albumin as GGT predictor variables achieved category 0 prediction of 78%, but prediction of category 1 by this model was the lowest of any model tested at 63%. The addition of bilirubin, LD or albumin to the ALP/ALT models did not enhance GGT category prediction for any analysis, and often reduced accuracy (results not shown).

Serum cholesterol and ALP interaction for abnormal LFT cases

Abnormal LFT cases (Table 2A) were examined by SVM to further assess the utility of ALT plus ALP for elevated GGT prediction, as well as consider non-LFT clinical chemistry markers (Table 1), for example, serum urate (which has been previously found as a laboratory marker for alcoholism, a cause of elevated serum GGT).

The GGT response categories were both above the upper limit of the GGT reference interval at 40–180 U/L (Cat 0) and 181–2613 U/L (Cat 1) (female and male cases). While serum urate was found to be a useful predictor of elevated GGT, preliminary analysis indicated that total serum cholesterol interacted strongly with ALP to produce a clear discrimination between the two elevated GGT categories. Figure 5A shows an SVM plot describing the relationship between ALP and cholesterol, with a serum urate concentration of 0.50 mmol/L included as a constant in the SVM model. The model found an inverse relationship for ALP and cholesterol, with a decrease in ALP associated with an increase of serum cholesterol when predicting elevated GGT category (overall prediction accuracy of 75%). This analysis was repeated with 3 elevated GGT response categories (Figure 5B), with an intermediate category (GGT 119–293 U/L) introduced to examine the ALP-cholesterol relationship further. The introduction of the extra category emphasised a strong ALP-cholesterol relationship for the intermediate GGT category (119–293 U/L), which was also inverse. Within the GGT response range of 119–293 U/L, a serum ALP concentration of 500 U/L corresponded to a serum total cholesterol concentration of approximately 3.0 mmol/L, while an ALP concentration of approximately 100 U/L was associated with a cholesterol level of 7.5–8.0 mmol/L (overall prediction accuracy of 60%). Without clinical notes or patient history, the significance of cholesterol and its interaction with ALP cannot be ascertained with confidence, but the analysis does highlight that interactions with other serum markers besides the traditional LFT profile assays may add value to the laboratory diagnosis of abnormal liver function.

Figure 5

Support vector machine (SVM) analysis of serum chemistry predictors of elevated GGT response within an abnormal liver function test (LFT) profile (Table 2A). ALP and serum cholesterol were the y- and x-axis variables respectively, with a constant serum urate concentration of 0.50 mmol/L added to the model to define the SVM hyper-plane separating the GGT response categories.

(A) The two response variable categories investigated were elevated GGT category 0 (40–180 U/L: n=344) or 1 (181–2613 U/L: n=347). The replacement of ALP, cholesterol or urate by ALT did not alter the classification plot represented above. (B) The three response variable categories investigated were elevated GGT category 0 (40–118 U/L: n=233), category 1 (297–2613 U/L: n=230) and intermediate category 2 (119–293 U/L: n=228). The replacement of ALP, cholesterol or urate by ALT did not alter the classification plot represented above.

Discussion

The results of this study provide a quantitative, evidence-based strategy to assess the value of routine pathology tests in the context of reducing the complexity of the diagnostic information conduit that laboratories report to requesting clinicians. This approach will be of particular value to community practice where tests such as LFTs are used for screening to identify underlying liver disease. GGT is a serum enzyme marker routinely included in the liver function test profile, within a broader clinical chemistry (Table 1) and routine diagnostic pathology-testing regime [7, 10]. In Australia, a full pathology (blood test) profile also includes a full/complete blood count (FBC/CBC), with special tests also requested if indicated (e.g., immunoassay, drug testing). This study focussed on the assessment of GGT as a routine LFT marker through machine learning prediction by other test biomarker patterns contained in large clinical chemistry datasets. The results reported here showed that normal (not elevated) and abnormal (elevated) GGT responses can be predicted with high accuracy by ALP and ALT; AST was not required (particularly for cases with highly elevated, abnormal GGT responses – Table 2A and C). In comparison to ALP and ALT alone, total serum bilirubin, LD and serum albumin were not effective for the prediction of GGT response, and the addition of these markers individually or together to ALP and ALT did not augment prediction accuracy. This evidence, therefore, suggests that ALP and ALT are sufficient for routine LFT screening of community patients, with their combined activity sufficient to also predict normal or abnormal GGT results. The decision tree results added value via the calculation of ALP and ALT concentration thresholds to guide decisions on whether to suspect elevated GGT.

In general, an ALT level >30 U/L combined with an ALP level >100–125 U/L suggested an elevated GGT at accuracies of >80% (with prediction variation associated with sex, but not age). If a patient presents with an ALT >30 U/L and ALP >125 U/L, and a history that indicates alcohol or drug abuse, GGT could then be considered as a second tier test for future monitoring, rather than a primary marker (additional discernment would be required if the patient is a 17–19 year old male, since the upper limit of the ALP reference interval is 200 U/L: Table 1). In the current configuration, routine LFTs have sensitivity and specificity limitations, leading to inappropriate investigations and patient anxiety if results are misinterpreted or over-interpreted. Reducing this potential confusion is difficult because LFTs have become a cornerstone of community medicine testing. Therefore, robust evidence must be produced to demonstrate that LFT profile modification will not compromise existing diagnostic efficacy; as described here, evidence of test redundancy can be identified and evaluated by machine learning analyses of large pathology data sets.
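The thresholds quoted above can be read as a simple screening rule. The Python sketch below is illustrative only and is not the fitted decision tree from this study. The function name `suspect_elevated_ggt`, the default ALP cut-off of 125 U/L (the upper end of the quoted 100–125 U/L range) and the handling of 17–19 year old males (raising the ALP cut-off to the 200 U/L reference limit) are assumptions made for the example.

```python
from typing import Optional

ALT_CUTOFF_U_L = 30.0            # from the text: ALT >30 U/L
ALP_CUTOFF_U_L = 125.0           # assumption: upper end of the quoted 100-125 U/L range
ALP_ADOLESCENT_MALE_U_L = 200.0  # upper ALP reference limit, 17-19 year old males (Table 1)


def suspect_elevated_ggt(
    alt: float,
    alp: float,
    sex: Optional[str] = None,
    age: Optional[int] = None,
    alp_cutoff: float = ALP_CUTOFF_U_L,
) -> bool:
    """Return True when ALT and ALP (both in U/L) jointly suggest elevated GGT."""
    # Assumption for illustration: for 17-19 year old males the ALP cut-off is
    # raised to the higher upper reference limit, so that a physiologically
    # high ALP alone does not trigger the rule.
    if sex == "M" and age is not None and 17 <= age <= 19:
        alp_cutoff = max(alp_cutoff, ALP_ADOLESCENT_MALE_U_L)
    return alt > ALT_CUTOFF_U_L and alp > alp_cutoff


# Hypothetical patients, not study data.
print(suspect_elevated_ggt(alt=45, alp=140))
print(suspect_elevated_ggt(alt=45, alp=150, sex="M", age=18))
```

Passing `alp_cutoff=100` applies the lower end of the quoted range instead. A rule of this kind only flags cases for possible second tier GGT testing; it does not replace clinical history.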

The BALLETS trial [9] investigated a large number of patients who were identified as having at least one abnormal component of their LFT, after which they were extensively investigated to determine what might have caused that abnormality. The BALLETS study found that 45% of patients with an abnormal component of the LFT did not have an identifiable disease and 52% had non-alcoholic fatty liver disease (NAFLD) or at-risk alcohol intake. Lilford et al. [8], reporting on the findings of the BALLETS trial, suggested that the LFT should comprise albumin, bilirubin, ALT and ALP, and this was independently suggested by the Association of Clinical Biochemistry (UK) Clinical Practice Section [1]. The BALLETS conclusions and the recommendations of the Association of Clinical Biochemistry (UK), which did not include GGT, have been partly confirmed here by identifying ALP and ALT as leading predictors of GGT levels, with added value from the decision boundaries for ALP/ALT provided by machine learning.

Single decision tree rules produced classification predictions of high accuracy, based on subsets of cases taken from the hundreds to thousands available. To increase sampling and prediction accuracy for this method, decision tree ensembles can also be employed [13, 20], or a tandem decision tree plus SVM approach [21]. This study combined single decision trees with 10-fold cross-validation SVM ensembles to identify prediction rules and percentages of prediction accuracy. In this context decision trees are very effective for dimension reduction, with ALP and ALT consistently the two highest ranked predictors of GGT category on variable importance scales. Once leading candidate predictor variables are identified, SVM provides a powerful assessment of classifier performance for the response of interest. Running the 10-fold SVM ensembles was also important to explore the risk of over-fitting the single decision tree prediction models, which was suggested by the generally higher prediction accuracies calculated for decision trees (Figures 1–4).
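The over-fitting check described above amounts to comparing an accuracy computed on the same cases used to fit a model with an accuracy estimated by 10-fold cross-validation. The plain Python sketch below illustrates that comparison only. It is not the analysis performed in this study, for which the reference list points to the R packages rpart and e1071 [17, 18]. Here a one-split decision stump stands in for both the decision tree and the SVM, the data are synthetic, and all names (`fit_stump`, `accuracy`, `k_fold_accuracy`) are hypothetical.

```python
import random


def accuracy(threshold, xs, ys):
    """Fraction of cases where the rule 'x > threshold' matches the label."""
    return sum((x > threshold) == y for x, y in zip(xs, ys)) / len(xs)


def fit_stump(xs, ys):
    """Choose the single threshold that maximises accuracy on the given cases."""
    return max(sorted(set(xs)), key=lambda t: accuracy(t, xs, ys))


def k_fold_accuracy(xs, ys, k=10, seed=0):
    """Mean held-out accuracy over k folds; the stump is refitted for each fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        t = fit_stump([xs[i] for i in train], [ys[i] for i in train])
        scores.append(accuracy(t, [xs[i] for i in held_out],
                               [ys[i] for i in held_out]))
    return sum(scores) / k


# Synthetic data (not study data): one enzyme-like predictor whose mean is
# higher in the "elevated" class (label True) than in the "normal" class.
rng = random.Random(42)
ys = [rng.random() < 0.5 for _ in range(400)]
xs = [rng.gauss(140 if y else 90, 30) for y in ys]

resub = accuracy(fit_stump(xs, ys), xs, ys)  # fitted and scored on the same cases
cv = k_fold_accuracy(xs, ys, k=10)           # scored only on held-out cases
print(f"resubstitution accuracy: {resub:.3f}")
print(f"10-fold CV accuracy:     {cv:.3f}")
```

A resubstitution accuracy that is clearly higher than the cross-validated estimate is the pattern that would suggest over-fitting. With a model as simple as a stump the gap is usually small, whereas a deep tree can show a much larger one.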

For the SVM investigation of elevated GGT responses (abnormal LFT profile – Table 2A), a third predictor was added, which, like cholesterol, had been identified by preliminary decision trees applied to non-LFT markers available in the data set: namely serum urate, which was added to the SVM model as a “slice” at a concentration of 0.50 mmol/L. Urate has been associated with chronic alcohol abuse [22, 23], although this has been debated for a Japanese male population [24]. GGT, AST, mean corpuscular volume (MCV) and carbohydrate-deficient transferrin (CDT), with and without reference to other clinical data, have been nominated for the diagnosis and monitoring of alcoholism, with prediction accuracies of 75% (0.75) claimed from this profile [2, 25]. Less expected as an elevated GGT predictor was serum cholesterol, although GGT has been identified as a predictor of cardiovascular disease (CVD) and also of subsequent mortality. This relationship of GGT to CVD included, as part of the risk profile, a role for dyslipidaemia and positive associations with high and low density lipoprotein (HDL/LDL) cholesterol [5]. In addition, studies on cardiac disease risk and GGT have also observed associations with arterial stiffness and coronary artery disease (CAD) [26], as well as between coronary artery calcification (CAC) and elevated GGT [27].

Whether GGT elevation is due to drug/alcohol abuse, heart disease or extra-hepatic problems, of course, requires direct interaction with the patient and additional testing. However, the models and predictions reported here address an imperative for laboratory diagnostics, namely an evidence-based method for enquiry into LFT redundancy, and possible removal of traditional LFT markers like GGT without loss of efficacy. The findings reported here support the expert advice given by the Association of Clinical Biochemists (Clinical Practice group) [1] and the outcomes of the BALLETS study [9].


Corresponding author: Brett A. Lidbury, Associate Professor, Genomics and Predictive Medicine, Department of Genome Biology, The John Curtin School of Medical Research, The Australian National University, ACT 2601, Australia, Phone: +61 2 6125 6137, E-mail:

Acknowledgments

The authors wish to thank Mr Ashley Arnott and staff of Sullivan Nicolaides Pathology (Brisbane, Australia) for the extraction and provision of data used for this study.

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: The Quality Use of Pathology Programme (QUPP), The Commonwealth Department of Health and Ageing, Canberra Australia, supported this study. BAL was partly funded by a fellowship from The Medical Advances Without Animals Trust (MAWA).

  3. Employment or leadership: Dr. Tony Badrick was previously employed by Sullivan Nicolaides Pathology (Brisbane) and maintains a role with this laboratory as a Research Associate.

  4. Honorarium: None declared.

  5. Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.

References

1. Smellie WS. Time to harmonise common laboratory test profiles. Brit Med J 2012;344:e1169. doi:10.1136/bmj.e1169

2. Chen J, Conigrave KM, Macaskill P, Whitfield JB, Irwig L. Combining carbohydrate-deficient transferrin and gamma-glutamyltransferase to increase diagnostic accuracy for problem drinking. Alcohol Alcohol 2003;38:574–82. doi:10.1093/alcalc/agg113

3. Afzali A, Weiss NS, Boyko EJ, Ioannou GN. Association between serum uric acid level and chronic liver disease in the United States. Hepatology 2010;52:578–89. doi:10.1002/hep.23717

4. Whitfield JB, Heath AC, Madden PA, Pergadia ML, Montgomery GW, Martin NG. Metabolic and biochemical effects of low-to-moderate alcohol consumption. Alcohol Clin Exp Res 2013;37:575–86. doi:10.1111/acer.12015

5. Lee DS, Evans JC, Robins SJ, Wilson PW, Albano I, Fox CS, et al. Gamma glutamyl transferase and metabolic syndrome, cardiovascular disease, and mortality risk: the Framingham Heart Study. Arterioscler Thromb Vasc Biol 2007;27:127–33. doi:10.1161/01.ATV.0000251993.20372.40

6. Kim KM, Kim BT, Lee DJ, Park SB, Joo NS, Kim KN. Serum gamma-glutamyl transferase as a risk factor for general cardiovascular disease prediction in Koreans. J Investig Med 2012;60:1199–203. doi:10.2310/JIM.0b013e3182746752

7. Powell LW, Bassett ML, Cooksley WG. Liver function tests. In: Kellerman G, editor. Abnormal laboratory results, 3rd ed. Australian Prescriber and McGraw-Hill Education (Australia & New Zealand), 2011:115.

8. Lilford RJ, Bentham LM, Armstrong MJ, Neuberger J, Girling AJ. What is the best strategy for investigating abnormal liver function tests in primary care? Implications from a prospective study. Brit Med J Open 2013;3. doi:10.1136/bmjopen-2013-003099

9. Lilford RJ, Bentham L, Girling A, Litchfield I, Lancashire R, Armstrong D, et al. Birmingham and Lambeth Liver Evaluation Testing Strategies (BALLETS): a prospective cohort study. Health Technol Assess 2013;17:i–xiv, 1–307. doi:10.3310/hta17280

10. Lee TH, Kim WR, Poterucha JJ. Evaluation of elevated liver enzymes. Clin Liver Dis 2012;16:183–98. doi:10.1016/j.cld.2012.03.006

11. Moynihan R, Doust J, Henry D. Preventing overdiagnosis: how to stop harming the healthy. Brit Med J 2012;344:e3502. doi:10.1136/bmj.e3502

12. Crowley S, Tognarini D, Desmond P, Lees M, Saal G. Introduction of lamivudine for the treatment of chronic hepatitis B: expected clinical and economic outcomes based on 4-year clinical trial data. J Gastroenterol Hepatol 2002;17:153–64. doi:10.1046/j.1440-1746.2002.02673.x

13. Richardson AM, Lidbury BA. Infection status outcome, machine learning method and virus type interact to affect the optimised prediction of hepatitis virus immunoassay results from routine pathology laboratory assays in unbalanced data. BMC Bioinformatics 2013;14:206. doi:10.1186/1471-2105-14-206

14. Kingsford C, Salzberg SL. What are decision trees? Nat Biotechnol 2008:1011–3. doi:10.1038/nbt0908-1011

15. Karatzoglou A, Meyer D, Hornik K. Support vector machines in R. J Statistical Software 2006;15:1–29. doi:10.18637/jss.v015.i09

16. Yilmaz Y, Eren F. Identification of a support vector machine-based biomarker panel with high sensitivity and specificity for nonalcoholic steatohepatitis. Clin Chim Acta 2012;414:154–7. doi:10.1016/j.cca.2012.08.005

17. Therneau TM, Atkinson B. R port by Brian Ripley. rpart: recursive partitioning. Oxford, UK, 2012. [Recursive partitioning for classification, regression and survival trees. An implementation of most of the functionality of the 1984 book by Breiman, Friedman, Olshen and Stone]. Available from: http://cran.r-project.org/web/packages/rpart/index.html. Accessed 18 March 2014.

18. Dimitriadou E, Hornik K, Leisch F, Meyer D, Weingessel A. e1071: Misc functions of the department of statistics (e1071), TU Wien. R package version 16 2011. (http://CRAN.R-project.org/package=e1071). Accessed 18 March 2014.

19. Altman DG, Machin D, Bryant TN, Gardner MJ, editors. Statistics with confidence: confidence intervals and statistical guidelines, 2nd ed. Bristol: BMJ Books, 2000:46.

20. Richardson A, Shadabi F, Lidbury BA. Learning from pathology databases to improve the laboratory diagnosis of infectious diseases. In: Takeda H, editor. E-Health – First IMIA/IFIP Joint Symposium, E-Health 2010, Held as Part of WCC 2010, Brisbane, Australia, September 20–23, 2010 Proceedings: Springer; 2010:226–7.

21. Lidbury BA, Richardson AM. A Pattern Recognition Bioinformatics Alternative System to Rodent Models in Fundamental Research. World Congress in Animal Alternatives in the Life Sciences (WC8); 2012; Montreal, Canada: ALTEX Proceedings.

22. Whitfield JB, Martin NG. Inheritance and alcohol as factors influencing plasma uric acid levels. Acta Genet Med Gemellol (Roma) 1983;32:117–26. doi:10.1017/S0001566000006401

23. Nakamura K, Sakurai M, Miura K, Morikawa Y, Yoshita K, Ishizaki M, et al. Alcohol intake and the risk of hyperuricaemia: a 6-year prospective study in Japanese men. Nutr Metab Cardiovasc Dis 2012;22:989–96. doi:10.1016/j.numecd.2011.01.003

24. Nakamura K, Sakurai M, Miura K, Morikawa Y, Nagasawa S, Ishizaki M, et al. Serum gamma-glutamyltransferase and the risk of hyperuricemia: a 6-year prospective study in Japanese men. Horm Metab Res 2012;44:966–74. doi:10.1055/s-0032-1321788

25. Korzec A, Bar M, Koeter MW, de Kieviet W. Diagnosing alcoholism in high-risk drinking drivers: comparing different diagnostic procedures with estimated prevalence of hazardous alcohol use. Alcohol Alcohol 2001;36:594–602. doi:10.1093/alcalc/36.6.594

26. Zhu C, Xiong Z, Zheng Z, Chen Y, Qian X, Chen X. Association of serum gamma glutamyl transferase with arterial stiffness in established coronary artery disease. Angiology 2013;64:15–20. doi:10.1177/0003319712459799

27. Atar AI, Yilmaz OC, Akin K, Selcoki Y, Er O, Eryonucu B. Association between gamma-glutamyltransferase and coronary artery calcification. Int J Cardiol 2013;167:1264–7. doi:10.1016/j.ijcard.2012.03.157

Received: 2014-10-9
Accepted: 2014-12-4
Published Online: 2015-1-15
Published in Print: 2015-2-1

©2015, Brett A. Lidbury et al., published by De Gruyter

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.

Downloaded on 25.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/dx-2014-0063/html