Validation of amplicon-based next generation sequencing panel for second-tier test in newborn screening for inborn errors of metabolism

Kwok Yeung Tsang; Toby Chun Hei Chan; Matthew Chun Wing Yeung; Tsz Ki Wong; Wan Ting Lau; Chloe Miu Mak

doi:10.1515/labmed-2021-0115

Published/Copyright: October 25, 2021

From the journal Journal of Laboratory Medicine Volume 45 Issue 6

Abstract

Objectives

Next generation sequencing (NGS) technology has enabled cost-effective massively parallel DNA sequencing. To evaluate the utility of NGS for newborn screening (NBS) of inborn errors of metabolism (IEM), a custom panel was designed to target 87 disease-related genes. The pilot study was primarily intended for second-tier testing under the NBSIEM program in Hong Kong.

Methods

The panel was validated with two reference genomes and an external quality assurance (EQA) sample. Sequencing libraries were prepared with an amplicon-based approach. The libraries were pooled and spiked with 2% PhiX DNA as a technical control for 16-plex sequencing runs. Sequenced reads were analyzed using a commercially available pipeline.

Results

The average target region coverage was 208×, the fraction of the target region with depth ≥20× was 95.7%, and the sensitivity was 91.2%. Of the 87 genes, 85 had acceptable coverage, and the EQA result was satisfactory. The turnaround time from DNA extraction to completion of variant calling and quality control (QC) procedures was 2.5 days.

Conclusions

The NGS approach with the amplicon-based panel has been validated for analytical performance and is suitable as a second-tier NBSIEM test.

Keywords: dried blood spot (DBS); inborn errors of metabolism (IEM); newborn screening (NBS); next generation sequencing (NGS); second-tier test

Introduction

A newborn screening (NBS) program is an effective means of identifying conditions that are not clinically evident on physical examination in the neonatal period, allowing early diagnosis and intervention to prevent disability or death. Many countries run screening programs for inborn errors of metabolism (IEM) using dried blood spots (DBS) [1]. DBS analysis offers the advantages of a very small blood volume, minimal sample preparation, and relatively long-term stability of analytes through drying [2]. Although each individual IEM is rare, the collective incidence could be as high as one in 500 to 4,000, posing a serious public health problem [3]. During an 18-month pilot study of expanded NBS for 26 IEM conditions conducted in Hong Kong since 2015, nine confirmed IEM cases were found among 15,138 babies screened, a collective incidence of one in 1,682 newborns [4], compared with incidences reported elsewhere such as 1/2,920 in Germany [5], 1/3,165 in Singapore [6], and 1/901 in Qatar [7]. Incidence rates may vary for reasons such as panel selection, ethnic composition and scope of the study.
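As a quick check of the arithmetic behind the Hong Kong figure, a collective incidence can be expressed as "one in N" by dividing the number of babies screened by the number of confirmed cases. The sketch below is illustrative only; the function name is hypothetical and the inputs are the pilot-study counts quoted above.

```python
def one_in_n(cases: int, screened: int) -> int:
    """Return N such that the incidence is approximately 'one in N'."""
    if cases <= 0:
        raise ValueError("cases must be a positive integer")
    return round(screened / cases)


# Pilot-study counts quoted in the text: 9 confirmed cases among 15,138 babies.
print(f"Collective incidence: one in {one_in_n(9, 15_138):,}")
```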

Tandem mass spectrometry (MS/MS) screening for IEM is well suited to early detection and diagnosis. Although MS/MS allows rapid, cost-effective, and simultaneous detection of analytes related to IEM, it still has limitations such as a high false-positive rate, imprecision, and false-negative results. False-negative rates are high for several IEM diseases such as citrin deficiency and carnitine uptake defect, delaying diagnosis and treatment in patients [8]. False-positive results, which in some conditions such as congenital adrenal hyperplasia can be as frequent as 13 false-positives for every true-positive [9], can lead to unnecessary follow-up tests with negative psychosocial effects on children and parents, as well as high follow-up medical costs. These drawbacks have created a need for rapid and effective second-tier tests by alternative methods, including next generation sequencing (NGS; also called massively parallel sequencing).

NGS technology has revolutionized the study of human genomic variation and disease diagnosis. It has been shown to be compatible with DBS samples [10] and suitable for NBS genes [11]. In practice, targeted NGS has an advantage in clinical diagnostics over whole genome sequencing (WGS) or whole-exome sequencing (WES) because of its speed and cost-effectiveness. Targeted NGS using hybridization-based capture was found to give better sequencing complexity and uniformity, while the amplicon-based method had the advantages of shorter preparation time and smaller DNA input [12]. Although the amplicon-based method was more vulnerable to false-positive and false-negative results, these problems are mainly caused by insufficient read coverage and the minimum variant frequency setting, and can therefore be mitigated by modifying the analysis, for example by adjusting filter parameters. Because of the limited DNA quantity obtainable from DBS and the urgency of prompt diagnosis, the amplicon-based approach would be more suitable for applications in NBS.

NGS could be implemented in NBS in three ways: 1) for follow-up diagnosis and family screening; 2) as a first-tier screening test; and 3) as a second-tier test. As a second-tier confirmation test, NGS is particularly useful for borderline biochemical screening results. First-tier MS/MS screening is susceptible to high false-positive and false-negative rates, especially in several IEM diseases. A second-tier NGS test could help to rule out ambiguous screening results caused by nutritional effects or common heterozygous carrier status, while ruling in IEM cases for which MS/MS has unsatisfactory sensitivity, such as citrin deficiency and carnitine uptake defect. Given the local statistics in Hong Kong and the resources available for NBS, a targeted NGS panel for second-tier confirmation testing would be a suitable choice for implementation. In this study, we validated an 87-gene amplicon-based NGS panel designed for second-tier testing in the NBS program and for other IEM diseases or clinical purposes. Reference DNA materials and DBS samples were used for DNA extraction and library preparation. A commercial pipeline was employed for bioinformatic analysis and variant calling. The whole workflow can be finished in 2.5 days and is compatible with the five-working-day turnaround time of the current NBS program.

Materials and methods

Validation samples

The validation samples consisted of two reference genomes, RM 8398 (ethnic Utah/Mormon) and RM 8393 (ethnic Chinese), purchased from NIST (Gaithersburg, USA). Three validation experiments were performed, each comprising a 16-plex sequencing run spiked with 2% PhiX DNA as a technical control. One sample was run in triplicate in each run to assess within-run and between-run repeatability. A total of 3 × 16 = 48 sequencing results were obtained. The reference genome samples were used to assess sensitivity and specificity.

Panel design

A custom amplicon-based AmpliSeq panel (Illumina, San Diego, USA) was designed for this study, targeting the exons and ±20 base pairs (bp) of the intron-exon boundaries of 87 genes, covering 289,264 bp of the human genome with 2,263 amplicons (panel design summarized in Table 1). The primers were split into two pools for library amplification. The panel can detect genetic variants in 87 IEM-related genes whose associated diseases are potential targets of NBS or of other clinical interest (Table 2). For reference, a Browser Extensible Data (BED) file containing the genomic positions of the missed bases in exons is provided as Supplementary Data 1, and the exons containing missed bases in each gene are summarized in Supplementary Data 2.
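To illustrate how a reportable region of "coding exon ± 20 bp" can be derived from exon coordinates, the following minimal sketch pads BED-style intervals (0-based, half-open), merges any that overlap after padding, and counts the resulting bases. The function names and the toy coordinates are assumptions made for illustration; they are not taken from the panel's actual BED file.

```python
from typing import List, Tuple

# (chromosome, start, end) using BED conventions: 0-based start, exclusive end.
Interval = Tuple[str, int, int]


def pad_and_merge(exons: List[Interval], flank: int = 20) -> List[Interval]:
    """Extend each exon by `flank` bp on both sides and merge overlaps."""
    padded = sorted((c, max(0, s - flank), e + flank) for c, s, e in exons)
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def total_bases(intervals: List[Interval]) -> int:
    """Number of bases covered by non-overlapping half-open intervals."""
    return sum(end - start for _, start, end in intervals)


# Toy exons (hypothetical coordinates): the first two merge once padded.
toy_exons = [("chr1", 100, 200), ("chr1", 230, 300), ("chr2", 50, 80)]
regions = pad_and_merge(toy_exons)
print(regions, total_bases(regions))
```

Merging before counting avoids double-counting flanks shared by closely spaced exons.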

Table 1:

Summary of panel design and validation.

Design
Genome	GRCh37 (hg19)
Total number of genes	87
Total number of exons	1,139
Number of bases covered by panel	289,264
Number of bases in reportable region (coding exon ± 20 bp)	171,226
Total number of amplicons	2,263
Number of primer pools	2
Number of bases covered by amplicons	405,603
Average amplicon size, bp	179
Performance
Target region mean depth	208×
Fraction of regions target depth ≥20×	95.7%
Sensitivity (SNV and indel combined)	91.2%
Specificity	99.95%

SNV, single nucleotide variant.

Table 2:

List of the panel genes with individual >20× coverage and the associated disease.

Gene		≥20× coverage	Condition	OMIM entry
1.	ABCD1	89.5%	X-linked adrenoleukodystrophy	300100
2.	ACAD8	90.9%	Isobutyryl-CoA dehydrogenase deficiency	611283
3.	ACAD9	93.4%	Mitochondrial complex I deficiency, nuclear type 20	611126
4.	ACADM	100%	Medium-chain acyl-CoA dehydrogenase deficiency^b	201450
5.	ACADS	93.5%	Short-chain acyl-CoA dehydrogenase deficiency	201470
6.	ACADSB	99.9%	2-methylbutyrylglycinuria	610006
7.	ACADVL	92.0%	Very long-chain acyl-CoA dehydrogenase deficiency^b	201475
8.	ACAT1	99.0%	Alpha-methylacetoacetic aciduria (alternative titles: Beta-ketothiolase deficiency/2-methyl-3-hydroxybutyric acidemia/Mitochondrial acetoacetyl-CoA thiolase deficiency)^b	203750
9.	ACAT2	97.6%	Acetyl-CoA acetyltransferase-2 deficiency	614055
10.	ACSF3	99.1%	Combined malonic and methylmalonic aciduria^b	614265
11.	ADA	98.2%	Severe combined immunodeficiency due to ADA deficiency	102700
12.	AHCY	91.3%	Hypermethioninemia with deficiency of S-adenosylhomocysteine hydrolase	613752
13.	ALDH4A1	90.1%	Hyperprolinemia, type II	239510
14.	ALDH6A1	93.4%	Methylmalonate semialdehyde dehydrogenase deficiency^b	614105
15.	AMT	99.5%	Glycine encephalopathy (non-ketotic hyperglycinemia)	605899
16.	ARG1	98.6%	Argininemia^b	207800
17.	ASL	87.0%	Argininosuccinic acidemia^b	207900
18.	ASS1	98.6%	Citrullinemia type I^b	215700
19.	ATP7B	95.4%	Wilson disease	277900
20.	AUH	96.8%	3-methylglutaconic aciduria, type I	250950
21.	BCKDHA	97.0%	Maple syrup urine disease, type Ia^b	248600
22.	BCKDHB	99.8%	Maple syrup urine disease, type Ib^b	248600
23.	BTD	98.0%	Biotinidase deficiency^b	253260
24.	CBS	94.5%	Homocystinuria^b	236200
25.	CD320	81.3%	Methylmalonic aciduria, transient, due to transcobalamin receptor defect^b	613646
26.	CFTR	96.7%	Cystic fibrosis	219700
27.	CPS1	98.0%	Carbamoylphosphate synthetase I deficiency	608307
28.	CPT1A	99.8%	Carnitine palmitoyltransferase type I deficiency	255120
29.	CPT2	99.7%	Carnitine palmitoyltransferase II deficiency^b	600649
30.	CYP21A2 ^a (invalid)	NA	Congenital adrenal hyperplasia^b	201910
31.	DBT	97.1%	Maple syrup urine disease, type II^b	248600
32.	DECR1	100%	2,4-dienoyl-CoA reductase deficiency	616034
33.	DLD	99.9%	Maple syrup urine disease, type III (dihydrolipoamide dehydrogenase deficiency)^b	246900
34.	ETFA	100%	Glutaric acidemia type II^b	231680
35.	ETFB	88.1%	Glutaric acidemia type II^b	231680
36.	ETFDH	99.8%	Glutaric acidemia type II^b	231680
37.	ETHE1	92.5%	Ethylmalonic encephalopathy	602473
38.	FAH	99.3%	Tyrosinaemia type I^b	276700
39.	GAA	91.0%	Glycogen storage disease II (Pompe’s disease)	232300
40.	GALE	93.8%	Galactose epimerase deficiency	230350
41.	GALK1	90.7%	Galactokinase deficiency with cataracts	230200
42.	GALT	97.5%	Classical galactosemia^b	230400
43.	GCDH	93.9%	Glutaric acidemia type I^b	231670
44.	GCH1	92.7%	Dystonia, DOPA-responsive, with or without hyperphenylalaninemia; Hyperphenylalaninemia, BH4-deficient, B	128230; 233910
45.	GCSH	70.5%	Glycine encephalopathy (non-ketotic hyperglycinemia)	605899
46.	GLDC	93.1%	Glycine encephalopathy (non-ketotic hyperglycinemia)	605899
47.	GLUD1	96.8%	Hyperinsulinism-hyperammonemia syndrome	606762
48.	GNMT	83.8%	Glycine N-methyltransferase deficiency	606664
49.	HADH	89.9%	3-hydroxyacyl-CoA dehydrogenase deficiency (SCHAD deficiency, formerly)	231530
50.	HADHA	100%	Long-chain 3-hydroxyacyl-CoA dehydrogenase deficiency; Mitochondrial trifunctional protein deficiency	609016; 609015
51.	HADHB	98.5%	Trifunctional protein deficiency	609015
52.	HLCS	98.8%	Multiple carboxylase deficiency/Holocarboxylase synthetase deficiency^b	253270
53.	HMGCL	99.9%	3-hydroxy-3-methylglutaryl-CoA lyase deficiency^b	246450
54.	HMGCS2	97.0%	HMG-CoA synthase-2 deficiency	605911
55.	HPD	98.3%	Tyrosinemia type III	276710
56.	HSD17B4	99.8%	D-bifunctional protein deficiency	261515
57.	IDUA ^a (invalid)	55.9%	Mucopolysaccharidosis Ih/Hurler syndrome	607014
58.	IVD	94.6%	Isovaleric acidemia^b	243500
59.	MAT1A	99.7%	Methionine adenosyltransferase I/III deficiency	250850
60.	MCCC1	99.0%	3-Methylcrotonyl-CoA carboxylase 1 deficiency	210200
61.	MCCC2	99.9%	3-Methylcrotonyl-CoA carboxylase 2 deficiency	210210
62.	MCEE	98.7%	Methylmalonyl-CoA epimerase deficiency^b	251120
63.	MLYCD	83.3%	Malonyl-CoA decarboxylase deficiency^b	248360
64.	MMAA	100%	Methylmalonic aciduria, type cblA, vitamin B12-responsive^b	251100
65.	MMAB	98.3%	Methylmalonic aciduria, vitamin B12-responsive, due to defect in synthesis of adenosylcobalamin, cblB complementation type^b	251110
66.	MMACHC	96.1%	Methylmalonic aciduria and homocystinuria, cblC type^b	277400
67.	MMADHC	100%	Methylmalonic aciduria and homocystinuria, cblD type (alternative gene symbol C2orf25) ^b	277410
68.	MMUT	93.9%	Methylmalonic aciduria, mut(0) type^b	251000
69.	NADK2	99.0%	2,4-dienoyl-CoA reductase deficiency	616034
70.	NAGS	70.2%	N-acetylglutamate synthase deficiency	237310
71.	OTC	95.7%	Ornithine transcarbamylase deficiency	311250
72.	OXCT1	99.5%	Succinyl CoA:3-oxoacid CoA transferase deficiency	245050
73.	PAH	100%	Phenylketonuria due to phenylalanine hydroxylase deficiency^b	261600
74.	PCBD1	95.2%	Pterin-4α-carbinolamine dehydratase deficiency	264070
75.	PCCA	96.5%	Propionic acidemia^b	606054
76.	PCCB	100%	Propionic acidemia^b	606054
77.	PPM1K	99.8%	Maple syrup urine disease, mild variant^b	615135
78.	PRODH	84.1%	Hyperprolinemia, type I	239500
79.	PTS	95.1%	6-pyruvoyl-tetrahydropterin synthase deficiency^b	261640
80.	QDPR	97.7%	Dihydropteridine reductase deficiency	261630
81.	SLC22A5	96.0%	Carnitine uptake deficiency^b	212140
82.	SLC25A13	97.5%	Neonatal-onset type II citrullinemia/Neonatal intrahepatic cholestasis caused by citrin deficiency (NICCD)^b	605814
83.	SLC25A15	100%	Hyperornithinemia-hyperammonemia-homocitrullinemia syndrome	238970
84.	SLC25A20	100%	Carnitine-acylcarnitine translocase deficiency^b	212138
85.	SUCLA2	93.8%	Mitochondrial DNA depletion syndrome 5 (encephalomyopathic with or without methylmalonic aciduria)^b	612073
86.	SUCLG1	95.8%	Mitochondrial DNA depletion syndrome 9 (encephalomyopathic type with methylmalonic aciduria)^b	245400
87.	TAT	97.5%	Tyrosinemia type II	276600

^aIDUA and CYP21A2 are invalid due to poor coverage and presence of pseudogene, respectively. ^bThe IEM conditions included in the Hong Kong NBS program.

DNA extraction for DBS

For DBS collected on Whatman 903 Proteinsaver Cards (Cytiva, Vancouver, Canada), six 3.2 mm diameter DBS discs were used for DNA extraction. Extraction was performed manually using the column-based QIAamp DNA Mini Kit (QIAGEN, Germantown, USA) with modifications, using the manufacturer-supplied buffers ATL, AL, AW1, AW2, EB, and proteinase K. First, 360 μL of buffer ATL for tissue lysis was added to the punched DBS discs, followed by incubation at 85 °C for 10 min; 40 μL of proteinase K stock solution was then added and the mixture incubated at 56 °C for 1 h; finally, 400 μL of buffer AL was added and the mixture incubated at 70 °C for 10 min. Then 400 μL of ethanol (96–100%) was added to the mixture, which was applied to the QIAamp Mini spin column and centrifuged at 6,000×g for 1 min; the filtrate was discarded and the column washed with AW1 and AW2. DNA was eluted from the column twice with the same 50 μL of buffer EB (QIAGEN, Germantown, USA), each time with 1 min incubation at room temperature followed by centrifugation at 6,000×g for 1 min. The quantity and quality of the extracted DNA were assessed with the Qubit DNA High Sensitivity (HS) Assay Kit on a Qubit 2.0 Fluorometer and with a NanoDrop spectrophotometer (ThermoFisher Scientific, Waltham, USA), respectively.

Library preparation and targeted sequencing

Library preparation was carried out using the AmpliSeq Plus Library Prep Kit (Illumina) following the manufacturer’s protocol. About 7.5 ng of DNA was used to prepare each library. AmpliSeq UD indexes were used for indexing. At the end, 27 μL of eluted library DNA was collected in a DNA LoBind tube (Eppendorf, Hamburg, Germany). NGS libraries were quantified with the Qubit dsDNA HS assay. Library size distribution was checked with the Agilent High Sensitivity DNA Kit on a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, USA); a distinct peak between 300 and 400 bp should be observed, with negligible levels of contaminating DNA species. Each sequencing library was diluted to a concentration of 1 nM with buffer EB, and 16 libraries were pooled. The pooled library was then diluted to 90 pM and spiked with 2% 90 pM PhiX DNA as a technical control. The pooled library was then loaded into the sequencing cartridge and subjected to massively parallel sequencing to generate 150 bp paired-end reads on the Illumina iSeq-100 System.
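Diluting each library to 1 nM requires converting a mass concentration (ng/μL, from Qubit) and a mean fragment size (bp, from the Bioanalyzer) into a molar concentration. The sketch below shows one common way to do this, assuming an average mass of about 660 g/mol per base pair of double-stranded DNA; the function names and the example library (5.0 ng/μL, 375 bp) are hypothetical and are not measurements from this study.

```python
from typing import Tuple

AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert ng/uL and mean fragment length (bp) to nmol/L."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_size_bp)


def dilution_volumes(stock_nm: float, target_nm: float,
                     final_ul: float) -> Tuple[float, float]:
    """Return (stock volume, diluent volume) in uL using C1*V1 = C2*V2."""
    if target_nm > stock_nm:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = final_ul * target_nm / stock_nm
    return stock_ul, final_ul - stock_ul


# Hypothetical library within the size range reported for this panel.
molarity = library_molarity_nm(5.0, 375)
stock_ul, buffer_ul = dilution_volumes(molarity, 1.0, 100.0)
print(f"{molarity:.1f} nM; mix {stock_ul:.2f} uL library with {buffer_ul:.2f} uL buffer")
```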

Data analysis

The iSeq-100 built-in software, DNA Amplicon, generated FASTQ output files from the raw reads, performed initial alignment and determined QC metrics. Sequencing Analysis Viewer (SAV; Illumina) and FastQC (Bioinformatics Group, Babraham Institute, Cambridge, UK) were also used to monitor the sequencing QC metrics of the pooled library and of individual libraries, respectively.

We employed a commercial pipeline, the NextGENe software (v2.4.2.3; Softgenetics, State College, USA), to perform read alignment against the reference genome (GRCh37/hg19), filtering and variant calling. Data analyses targeted all coding exons ±20 bp so as to include the splice-site intronic regions. For variant calling, the criteria were >20% mutation frequency, >3 allele count and >5 total coverage count; homozygous variants were exempted from these cutoffs; in-read phasing (merging) of adjacent variants with a maximum gap of 1 bp and a phaseable read percentage of >50% was allowed. Additional custom filtering criteria were imposed to minimize false-positive rates, using a proprietary quality score (cutoff = 4) composed of an adjusted coverage score, a mismatch score and a wrong allele score. A variant was regarded as valid only if at least 20 reads were observed at its position. A mutation report (VCF file) annotating all variants was created. Human Gene Mutation Database (HGMD) Professional (QIAGEN, Germantown, USA), Alamut Visual (Sophia Genetics SA, Saint-Sulpice, Switzerland) and Integrative Genomics Viewer (IGV) [13] were used to further characterize the variants and raw reads. Single nucleotide variants (SNVs) and small indels were targeted for curation.
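The numeric calling cutoffs described above can be expressed as a simple filter. The following is a minimal sketch under stated assumptions, not the NextGENe implementation: the `Call` record, its field names and the `homozygous` flag are hypothetical, and the proprietary quality score (cutoff = 4) and in-read phasing are deliberately left out because their definitions are not given in the text.

```python
from dataclasses import dataclass


@dataclass
class Call:
    depth: int                # total reads covering the position
    alt_reads: int            # reads supporting the variant allele
    homozygous: bool = False  # assumed to be supplied by the caller


def passes_calling_cutoffs(call: Call, min_freq: float = 0.20,
                           min_alt: int = 3, min_cov: int = 5) -> bool:
    """>20% mutation frequency, >3 allele count, >5 total coverage.

    Homozygous variants are exempted from these cutoffs, as in the text.
    """
    if call.homozygous:
        return True
    if call.depth == 0:
        return False
    return (call.alt_reads / call.depth > min_freq
            and call.alt_reads > min_alt
            and call.depth > min_cov)


def is_reportable(call: Call, min_depth: int = 20) -> bool:
    """A variant is regarded as valid only with at least 20 reads at its position."""
    return passes_calling_cutoffs(call) and call.depth >= min_depth


for example in (Call(100, 45), Call(100, 15), Call(12, 6), Call(30, 30, True)):
    print(example, is_reportable(example))
```

Keeping the calling cutoffs and the minimum-depth rule in separate functions mirrors the two stages in the description: a variant may be called at low depth but is only treated as valid once the 20-read requirement is met.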

External quality assurance (EQA) program

We joined the European Molecular Genetics Quality Network (EMQN) pilot EQA scheme for NGS (Germline; 2020). The provided DNA sample was sequenced in a separate 16-plex run, and the BED file of the original panel design, covering the whole panel rather than coding exons only, was submitted to the EQA program for data analysis.

Results

DNA extraction, library preparation, and sequencing metrics

All libraries showed satisfactory yield and quality: concentrations ranged from 2.67 to 11.6 ng/μL and average sizes from 359 to 390 bp, with a distinct and dominant peak between 300 and 400 bp. The total yield of each run ranged from 1.74 to 1.92 gigabases. The overall sequencing quality of the three runs was good, with %Q30 between 91 and 92% (Table 3). The PhiX alignment rate was between 1.3 and 1.4% (compared with the expected 2%; Table 3). All individual libraries passed the FastQC metrics except one, which showed a distinctly high duplication level. This sample also showed low uniformity (defined as the percentage of bases with >0.2× mean coverage) and was therefore regarded as a failure.

Table 3:

Summary of quality control (QC) metrics of the three batches of 16-plex run.

Sample ID	Coverage (×)	Reads on targets	Uniformity
	Batch 1: Q30 = 91.1%; PhiX alignment rate = 1.3%
RM8398	211	644,933	98.1%
RM8393	192	583,892	97.7%
	Batch 2: Q30 = 91.6%; PhiX alignment rate = 1.4%
RM8398	191	582,074	96.8%
RM8393	221	676,909	97.1%
	Batch 3: Q30 = 91.8%; PhiX alignment rate = 1.4%
RM8398	220	678,307	94.4%
RM8393	176	540,091	96.1%
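The uniformity metric reported in Table 3 is defined in the text as the percentage of bases with more than 0.2× the mean coverage. A minimal sketch of that calculation is shown below; the function name and the per-base depths are made up for illustration and are not data from this study.

```python
from typing import Sequence


def uniformity_pct(depths: Sequence[int], fraction: float = 0.2) -> float:
    """Percentage of positions whose depth exceeds `fraction` x mean depth."""
    if not depths:
        raise ValueError("no depth values supplied")
    mean_depth = sum(depths) / len(depths)
    above = sum(1 for d in depths if d > fraction * mean_depth)
    return 100.0 * above / len(depths)


# Hypothetical per-base depths: three positions fall below 0.2 x mean.
toy_depths = [200, 210, 190, 25, 220, 5, 205, 215, 0, 225]
print(f"Uniformity: {uniformity_pct(toy_depths):.1f}%")
```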

Coverage and repeatability

After alignment against the reference genome (GRCh37/hg19), each library yielded on average about 640,000 reads on target, with an average coverage of 208× ± 20 (1 SD) (Table 3). For intra-run repeatability, the triplicated sample had mean coverages of 195×, 206×, and 223× in the three runs, with CVs of 9.1, 10.5, and 5.2% respectively. Combining these to calculate inter-run repeatability gave an average coverage of 208× with a CV of 10.1%.
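The coefficient of variation (CV) used for repeatability is the sample standard deviation divided by the mean. The sketch below shows the calculation with a hypothetical helper function. Because the individual replicate coverages are not listed in the text, only the three per-run means are used here, so the example reproduces the overall mean of 208× but not the reported CVs, which were computed from the replicates themselves.

```python
from statistics import mean, stdev
from typing import Sequence


def cv_percent(values: Sequence[float]) -> float:
    """Coefficient of variation: sample SD as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Mean coverage of the triplicated sample in each of the three runs (from the text).
run_means = [195, 206, 223]
print(f"Mean of run means: {mean(run_means)}x")
print(f"CV of the run means only: {cv_percent(run_means):.1f}%")
```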

For each of the 87 genes, we examined the percentage of the reportable region with >20× coverage across the three runs (Table 2). Seventy-five genes achieved an average of ≥90% >20× coverage, of which eight achieved 100% >20× coverage in all three runs. Another eight genes had between 80 and 90% >20× coverage and were also usable for reporting. CYP21A2 was disqualified because of its very high homology with its pseudogene CYP21A1P. Another gene, IDUA, had low >20× coverage at 55.9% and will not be used for reporting. NAGS and GCSH had the next lowest >20× coverage at around 70% but were still considered usable for reporting.

Sensitivity

We used the variants in the reference genomes RM8398 (ethnic Utah/Mormon) and RM8393 (ethnic Chinese) to evaluate the sensitivity of the panel, defined as the proportion of true-positive variants called. The sensitivities were 90.9% (442 out of 486 variants; SNV and indel combined) and 91.4% (367 out of 401 variants; SNV and indel combined) respectively, which were considered acceptable. Specificity was calculated as one minus the proportion of false-positives over the total number of bases in the reporting region achieving >20× coverage, giving specificities of 99.94% (98 false-positives out of 163,932 bases) and 99.95% (87 false-positives out of 163,932 bases) for RM8398 and RM8393 respectively. Many of the false-positives were found near the 3′ end of the mapped reads, with lower coverage and quality scores, probably because sequencing quality drops near the 3′ end. These assay-specific artefacts were observed recurrently in all samples and could be easily identified and excluded using a genome browser during NGS data reporting.
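The two definitions used in this section can be written out explicitly: sensitivity is the share of expected variants that were called, and specificity is one minus the share of assessed bases that gave a false-positive call. The sketch below uses hypothetical function names with the counts quoted above; pooling the variant counts of both reference samples gives a combined sensitivity in line with the 91.2% in Table 1.

```python
def sensitivity_pct(true_positives: int, expected_variants: int) -> float:
    """Percentage of expected (reference) variants that were called."""
    return 100.0 * true_positives / expected_variants


def specificity_pct(false_positives: int, assessed_bases: int) -> float:
    """Percentage of assessed bases that did not give a false-positive call."""
    return 100.0 * (1.0 - false_positives / assessed_bases)


# Counts quoted in the text.
print(f"RM8398 sensitivity: {sensitivity_pct(442, 486):.1f}%")
print(f"RM8398 specificity: {specificity_pct(98, 163_932):.2f}%")
print(f"RM8393 specificity: {specificity_pct(87, 163_932):.2f}%")
print(f"Pooled sensitivity: {sensitivity_pct(442 + 367, 486 + 401):.1f}%")
```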

EQA

The results returned from the EMQN pilot EQA scheme for NGS (2020) were satisfactory (Table 4), with a median depth at called variants of 178×. The sensitivities for SNV and indel were 97.89 and 81.58% respectively. The precisions were lower for both SNV and indel, at 77.44 and 53.4% respectively, because the whole-panel BED file was submitted for analysis, which includes more non-coding regions and the 3′ ends of reads. The latter are known to be more likely to exhibit lower sequencing quality and PCR artefacts. Artefacts at the 3′ end are also more likely to occur in tandem, making false indel calls more likely than false SNP calls. Fortunately, these artefacts can be easily identified using a genome browser. Since we were the only participant using the iSeq and the custom AmpliSeq panel (out of 293 participating labs, 77.6% of which used one of the Illumina NGS platforms), our results cannot be directly compared with the peer group. Our EQA results show good sensitivity and acceptable precision, both of which could be improved if the filtering criteria are optimized and only coding exons ±20 bp are analyzed.

Table 4:

Result summary of European Molecular Genetics Quality Network (EMQN) pilot external quality assurance (EQA) scheme for next generation sequencing (NGS) (germline 2020).

Metric	Value
FASTQ files ^a
Bases above Q30 quality score	94.1%
Average base quality (Phred scale)	35.9
BAM file ^a
Uniformity^b	97.4%
Off target	2.4%
Error rate on target	0.0032
Coverage at 20×	97.4%
VCF file ^a
Depth at calls (median)	178×
SNP detection ^a
True-positives	278
False-positives	81
False-negatives	6
Sensitivity^c	97.9%
Precision^d	77.4%
F-score	86.5%
Indel detection ^a
True-positives	31
False-positives	27
False-negatives	7
Sensitivity^c	81.6%
Precision^d	53.5%
F-score	64.6%

^aWe submitted the FASTQ sequence data files together with BAM and VCF result files to EMQN, which then evaluated the SNP and indel detection performance. ^bThe percentage of bases on target covered at 0.1 × median coverage. ^cProportion of actual positives that are correctly identified as such. ^dProportion of actual positives among all reported positives. EMQN, European Molecular Genetics Quality Network; EQA, External quality assurance; SNP, single nucleotide polymorphism; NGS, next generation sequencing.
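The sensitivity, precision and F-score in Table 4 follow directly from the true-positive, false-positive and false-negative counts. The sketch below shows the standard formulas with a hypothetical helper function; the counts are those listed in Table 4. Computed this way, the indel precision is 31/58, which rounds to 53.4%, the figure quoted in the text.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, sensitivity/recall, F-score) as fractions."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Counts from Table 4 (EMQN pilot EQA scheme, germline 2020).
for label, counts in (("SNP", (278, 81, 6)), ("Indel", (31, 27, 7))):
    p, r, f = precision_recall_f(*counts)
    print(f"{label}: sensitivity {100 * r:.1f}%, "
          f"precision {100 * p:.1f}%, F-score {100 * f:.1f}%")
```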

Turnaround time (TAT)

For a 16-plex run, DNA preparation from DBS can be finished in half a day (Day 1). Library synthesis and pooling can be accomplished within 1.5 days (Days 1 and 2), followed by overnight sequencing for about 22 h. Bioinformatic analysis and QC procedures can be done within 3 h on Day 3. Since normally only a few variants per gene are discovered and only one to two genes per sample need to be interpreted, an optimized workflow should allow reports to be ready in about 2.5 working days.

Discussion

In this study, we have validated the analytical performance of 16-plex iSeq sequencing using an 87-IEM-gene AmpliSeq panel. The panel is intended for second-tier NBS and other IEM diseases, targeting variants in coding exons and splice sites. Of the 87 genes, 85 (all except IDUA and CYP21A2) showed satisfactory performance for clinical reporting. The assay achieved a sensitivity of 91.15%, a specificity of 99.95%, and satisfactory repeatability and EQA results. Seventy-five genes had over 90% per-gene >20× coverage of the reportable region. The whole workflow can be completed with a 2.5-day TAT. Together with the two-day TAT for first-tier MS/MS reporting, a total TAT of five days could be achieved for the whole NBS system.

Optimizing sample preparation and scrutiny

DNA extraction was performed manually using a column-based method; in our experience, the minimum yield was about 60 ng, which is more than the amount required for quantitation and library construction (about 12 ng). The excess DNA should be enough for repeat testing or other tests such as independent single nucleotide polymorphism (SNP) genotyping assays. Considering the urgency of NBS reporting, automated DNA extraction with magnetic particles could be an alternative, with the advantages of better speed, yield, purity and consistency, as well as a reduced chance of sample swapping. The complicated sample and library preparation workflow in targeted NGS introduces the possibility of sample mix-up or cross-contamination [14, 15]. Procedures to verify sample identity are therefore important when implementing the test, in order to minimize the chance of misreporting. One possible approach is an in silico identity check that looks up a set of highly heterogeneous SNP genotypes and predicts sex from the sequencing reads; the combined SNP genotype should not be identical between any samples in a batch of 16, and the predicted sex should match the reported sex. Furthermore, independent PCR-based SNP genotyping assays can be performed and the results compared with those from the in silico SNP identification pipeline, to further minimize the chance of sample mix-up or swapping.
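The in silico identity check described above amounts to two comparisons per batch: no two samples should share the same combined SNP genotype, and each predicted sex should match the reported sex. The sketch below illustrates that logic only; the function names, the sample IDs and the three-SNP genotypes are hypothetical, and how genotypes and sex are derived from the reads is outside its scope.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_fingerprints(genotypes: Dict[str, Tuple[str, ...]]) -> List[List[str]]:
    """Return groups of samples whose combined SNP genotypes are identical."""
    by_fingerprint: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for sample, calls in genotypes.items():
        by_fingerprint[calls].append(sample)
    return sorted(sorted(g) for g in by_fingerprint.values() if len(g) > 1)


def sex_mismatches(predicted: Dict[str, str], reported: Dict[str, str]) -> List[str]:
    """Return samples whose predicted sex differs from the reported sex."""
    return sorted(s for s, sex in reported.items() if predicted.get(s) != sex)


# Hypothetical mini-batch: S01 and S03 share a fingerprint, S02 has a sex mismatch.
batch = {
    "S01": ("AG", "CC", "TT"),
    "S02": ("AA", "CT", "TT"),
    "S03": ("AG", "CC", "TT"),
}
print(shared_fingerprints(batch))
print(sex_mismatches({"S01": "F", "S02": "M", "S03": "F"},
                     {"S01": "F", "S02": "F", "S03": "F"}))
```

Any sample flagged by either check would be a candidate for the independent PCR-based SNP genotyping mentioned above.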

Second-tier NBS test through targeted NGS

The introduction of NGS would benefit current NBS practice. First-tier MS/MS and biochemical approaches allow detection of multiple IEM, including amino acid, organic acid and fatty acid disorders. However, these tests can have relatively high false-positive or false-negative rates. This may be due to factors such as prematurity, total parenteral nutrition, maternal defects and the age of disease onset, which affect metabolite concentrations and lead to false results. Although two-tier testing might help to screen out suspected cases for specific IEM, few applications are available. To complement the current NBS approach, targeted NGS could be implemented as a second-tier test. Integrating targeted NGS could address the drawbacks of first-tier biochemical tests by minimizing false-positive or false-negative results caused by borderline metabolite concentrations.

According to Hong Kong NBS statistics since 2015, several conditions have a relatively high incidence of false-positive or false-negative results [4]; these conditions would therefore be excellent candidates for second-tier testing using our assay. For example, our NGS assay could be used as a second-tier test for samples with a borderline free carnitine level (e.g., 5.0–6.5 μmol/L). The absence of pathogenic variants in the SLC22A5 gene, the causative gene of systemic primary carnitine deficiency [16], could reduce false-positive cases.

On the other hand, false-negative cases were frequently reported for citrin deficiency in Hong Kong [4] and elsewhere [17, 18]. By adopting second-tier genetic testing, we can extend detection to patients with citrin deficiency who have relatively mild citrulline elevations, above the upper reference limit but below the NBS reporting cutoff (e.g., citrulline level of 25–35 μmol/L).
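The two examples above suggest a simple reflex rule: a first-tier result falling in a borderline window triggers review of the corresponding gene on the panel. The sketch below is a hypothetical illustration of such a rule, not a validated protocol; the windows are the example ranges quoted in the text (given there only as "e.g."), the analyte keys and function name are assumptions, and SLC25A13 is the citrin deficiency gene listed in Table 2.

```python
from typing import Dict, List, Tuple

# analyte -> (lower bound, upper bound in umol/L, gene to review)
BORDERLINE_RULES: Dict[str, Tuple[float, float, str]] = {
    "free_carnitine": (5.0, 6.5, "SLC22A5"),
    "citrulline": (25.0, 35.0, "SLC25A13"),
}


def genes_for_second_tier(analytes: Dict[str, float]) -> List[str]:
    """Return the genes to review for analytes inside a borderline window."""
    genes: List[str] = []
    for analyte, (low, high, gene) in BORDERLINE_RULES.items():
        value = analytes.get(analyte)
        if value is not None and low <= value <= high:
            genes.append(gene)
    return genes


print(genes_for_second_tier({"free_carnitine": 5.8, "citrulline": 18.0}))
print(genes_for_second_tier({"free_carnitine": 20.0, "citrulline": 30.0}))
```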

Limitations

Short-read mapping is not suitable for genes with highly homologous genomic regions such as pseudogenes [19]; in this case, CYP21A2 is interfered with by its pseudogene CYP21A1P. While the assay can pick up most pathogenic variants in the exonic and splice regions, variants within promoter regions, deep intronic regions, or regulatory elements outside the targeted regions cannot be detected. Gross insertions and deletions within the target regions also could not be detected by this assay at the time of reporting. These variants could possibly be detected using other optimized post-analytical pipelines specifically designed for copy number variant (CNV) detection. An amplicon-based panel might also be susceptible to allele drop-out due to SNVs in primer binding sites that cause defective primer annealing [20, 21] or to secondary structure formation at non-primer sites in the amplicon [22], resulting in insufficient coverage and false-negative results.

Conclusions

In conclusion, we have successfully validated a custom AmpliSeq panel for IEM. This assay should be practical for detecting SNVs and small indels as a second-tier test in NBS. Further work will be required to optimize its performance and workflow for application in the Hong Kong NBS program.

Corresponding author: Chloe Miu Mak, Newborn Screening for Inborn Errors of Metabolism Laboratory, Hong Kong Children’s Hospital, Hong Kong SAR, P.R. China; and Department of Pathology, Division of Chemical Pathology, Hong Kong Children’s Hospital, Hong Kong SAR, P.R. China, E-mail: makm@ha.org.hk

Acknowledgments

We would like to thank the Genetics and Genomics Division, Department of Pathology, Hong Kong Children’s Hospital for providing expert opinion and technical support for this work.

Research funding: None declared.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: Authors state no conflict of interest.
Informed consent: Not applicable.
Ethical approval: Not applicable.

References

1. Therrell, BL, Padilla, CD, Loeber, JG, Kneisser, I, Saadallah, A, Borrajo, GJ, et al.. Current status of newborn screening worldwide: 2015. Semin Perinatol 2015;39:171–87. https://doi.org/10.1053/j.semperi.2015.03.002.Search in Google Scholar PubMed

2. Lehmann, S, Delaby, C, Vialaret, J, Ducos, J, Hirtz, C. Current and future use of “dried blood spot” analyses in clinical chemistry. Clin Chem Lab Med 2013;51:1897–909. https://doi.org/10.1515/cclm-2013-0228.Search in Google Scholar PubMed

3. Feuchtbaum, L, Carter, J, Dowray, S, Currier, RJ, Lorey, F. Birth prevalence of disorders detectable through newborn screening by race/ethnicity. Genet Med 2012;14:937–45. https://doi.org/10.1038/gim.2012.76.

4. The Task Force on the Pilot Study of Newborn Screening for Inborn Errors of Metabolism, Hong Kong SAR. Evaluation of the 18-month “Pilot Study of Newborn Screening for Inborn Errors of Metabolism” in Hong Kong. HK J Paediatr (New Ser) 2020;25:16–22.

5. Lindner, M, Gramer, G, Haege, G, Fang-Hoffmann, J, Schwab, KO, Tacke, U, et al. Efficacy and outcome of expanded newborn screening for metabolic diseases--report of 10 years from South-West Germany. Orphanet J Rare Dis 2011;6:44. https://doi.org/10.1186/1750-1172-6-44.

6. Lim, JS, Tan, ES, John, CM, Poh, S, Yeo, SJ, Ang, JS, et al. Inborn Error of Metabolism (IEM) screening in Singapore by electrospray ionization-tandem mass spectrometry (ESI/MS/MS): an 8 year journey from pilot to current program. Mol Genet Metabol 2014;113:53–61. https://doi.org/10.1016/j.ymgme.2014.07.018.

7. Lindner, M, Abdoh, G, Fang-Hoffmann, J, Shabeck, N, Al-Sayrafi, M, Al-Janahi, M, et al. Implementation of extended neonatal screening and a metabolic unit in the State of Qatar: developing and optimizing strategies in cooperation with the Neonatal Screening Center in Heidelberg. J Inherit Metab Dis 2007;30:522–9. https://doi.org/10.1007/s10545-007-0553-7.

8. Wang, LY, Chen, NI, Chen, PW, Chiang, SC, Hwu, WL, Lee, NC, et al. Newborn screening for citrin deficiency and carnitine uptake defect using second-tier molecular tests. BMC Med Genet 2013;14:24. https://doi.org/10.1186/1471-2350-14-24.

9. Matern, D, Tortorelli, S, Oglesbee, D, Gavrilov, D, Rinaldo, P. Reduction of the false-positive rate in newborn screening by implementation of MS/MS-based second-tier tests: the Mayo Clinic experience (2004-2007). J Inherit Metab Dis 2007;30:585–92. https://doi.org/10.1007/s10545-007-0691-y.

10. Hollegaard, MV, Grauholm, J, Nielsen, R, Grove, J, Mandrup, S, Hougaard, DM. Archived neonatal dried blood spot samples can be used for accurate whole genome and exome-targeted next-generation sequencing. Mol Genet Metabol 2013;110:65–72. https://doi.org/10.1016/j.ymgme.2013.06.004.

11. Bodian, DL, Klein, E, Iyer, RK, Wong, WS, Kothiyal, P, Stauffer, D, et al. Utility of whole-genome sequencing for detection of newborn screening disorders in a population cohort of 1,696 neonates. Genet Med 2016;18:221–30. https://doi.org/10.1038/gim.2015.111.

12. Samorodnitsky, E, Jewell, BM, Hagopian, R, Miya, J, Wing, MR, Lyon, E, et al. Evaluation of hybridization capture versus amplicon-based methods for whole-exome sequencing. Hum Mutat 2015;36:903–14. https://doi.org/10.1002/humu.22825.

13. Robinson, JT, Thorvaldsdóttir, H, Wenger, AM, Zehir, A, Mesirov, JP. Variant review with the integrative genomics viewer. Cancer Res 2017;77:e31–4. https://doi.org/10.1158/0008-5472.can-17-0337.

14. Koboldt, DC, Ding, L, Mardis, ER, Wilson, RK. Challenges of sequencing human genomes. Briefings Bioinf 2010;11:484–98. https://doi.org/10.1093/bib/bbq016.

15. Wang, PP, Parker, WT, Branford, S, Schreiber, AW. BAM-matcher: a tool for rapid NGS sample matching. Bioinformatics 2016;32:2699–701. https://doi.org/10.1093/bioinformatics/btw239.

16. Lee, NC, Tang, NL, Chien, YH, Chen, CA, Lin, SJ, Chiu, PC, et al. Diagnoses of newborns and mothers with carnitine uptake defects through newborn screening. Mol Genet Metabol 2010;100:46–50. https://doi.org/10.1016/j.ymgme.2009.12.015.

17. Ohura, T, Kobayashi, K, Tazawa, Y, Abukawa, D, Sakamoto, O, Tsuchiya, S, et al. Clinical pictures of 75 patients with neonatal intrahepatic cholestasis caused by citrin deficiency (NICCD). J Inherit Metab Dis 2007;30:139–44. https://doi.org/10.1007/s10545-007-0506-1.

18. Shigetomi, H, Tanaka, T, Nagao, M, Tsutsumi, H. Early detection and diagnosis of neonatal intrahepatic cholestasis caused by citrin deficiency missed by newborn screening using tandem mass spectrometry. Int J Neonatal Screen 2018;4:5. https://doi.org/10.3390/ijns4010005.

19. Trier, C, Fournous, G, Strand, JM, Stray-Pedersen, A, Pettersen, RD, Rowe, AD. Next-generation sequencing of newborn screening genes: the accuracy of short-read mapping. NPJ Genom Med 2020;5:36. https://doi.org/10.1038/s41525-020-00142-z.

20. Shestak, AG, Bukaeva, AA, Saber, S, Zaklyazminskaya, EV. Allelic dropout is a common phenomenon that reduces the diagnostic yield of PCR-based sequencing of targeted gene panels. Front Genet 2021;12:620337. https://doi.org/10.3389/fgene.2021.620337.

21. Zucca, S, Villaraggia, M, Gagliardi, S, Grieco, GS, Valente, M, Cereda, C, et al. Analysis of amplicon-based NGS data from neurological disease gene panels: a new method for allele drop-out management. BMC Bioinf 2016;17:339. https://doi.org/10.1186/s12859-016-1189-0.

22. Lam, CW, Mak, CM. Allele dropout caused by a non-primer-site SNV affecting PCR amplification--a call for next-generation primer design algorithm. Clin Chim Acta 2013;421:208–12. https://doi.org/10.1016/j.cca.2013.03.014.

Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/labmed-2021-0115).

Received: 2021-09-02

Accepted: 2021-09-29

Published Online: 2021-10-25

Published in Print: 2021-12-20

This work is licensed under the Creative Commons Attribution 4.0 International License.

Keywords for this article

dried blood spot (DBS); inborn errors of metabolism (IEM); newborn screening (NBS); next generation sequencing (NGS); second-tier test
