Article Open Access

Validation of amplicon-based next generation sequencing panel for second-tier test in newborn screening for inborn errors of metabolism

  • Kwok Yeung Tsang, Toby Chun Hei Chan, Matthew Chun Wing Yeung, Tsz Ki Wong, Wan Ting Lau and Chloe Miu Mak
Published/Copyright: October 25, 2021

Abstract

Objectives

Next generation sequencing (NGS) technology has enabled cost-effective, massively parallel DNA sequencing. To evaluate the utility of NGS for newborn screening (NBS) of inborn errors of metabolism (IEM), a custom panel was designed to target 87 disease-related genes. The pilot study was primarily intended for second-tier testing under the NBSIEM program in Hong Kong.

Methods

The panel was validated with two reference genomes and an external quality assurance (EQA) sample. Sequencing libraries were prepared using an amplicon-based approach. The libraries were pooled and spiked with 2% PhiX DNA as a technical control for 16-plex sequencing runs. Sequenced reads were analyzed using a commercially available pipeline.

Results

The average target region coverage was 208×, and 95.7% of the target region reached a depth of ≥20×; sensitivity was 91.2%. Eighty-five of the 87 genes had acceptable coverage, and the EQA result was satisfactory. The turnaround time from DNA extraction to completion of variant calling and quality control (QC) procedures was 2.5 days.

Conclusions

The NGS approach with the amplicon-based panel has been validated for analytical performance and is suitable for second-tier testing in the NBSIEM program.

Introduction

A newborn screening (NBS) program is an effective means of identifying conditions that are not clinically evident on physical examination in the neonatal period, allowing early diagnosis and intervention to prevent disability or death. Many countries carry out screening programs for inborn errors of metabolism (IEM) using dried blood spots (DBS) [1]. DBS analysis offers the advantages of a very small blood volume, minimal sample preparation, and relatively long-term stability of analytes through drying [2]. Although each individual IEM is rare, the collective incidence could be as high as one in 500 to one in 4,000, posing a severe public health problem [3]. During an 18-month pilot study of expanded NBS for 26 IEM conditions conducted in Hong Kong from 2015, nine confirmed IEM cases were found among 15,138 babies screened, a collective incidence of one in 1,682 newborns [4], compared with incidences reported elsewhere such as 1/2,920 in Germany [5], 1/3,165 in Singapore [6], and 1/901 in Qatar [7]. Incidence rates may vary because of factors such as panel selection, ethnic composition and the scope of the study.

Tandem mass spectrometry (MS/MS) screening for IEM is well suited to early detection and diagnosis. Although MS/MS has allowed rapid, cost-effective, and simultaneous detection of analytes related to IEM, it still has limitations such as a high false-positive rate, imprecision, and false-negative results. False-negative rates are high for several IEM diseases such as citrin deficiency and carnitine uptake defect, delaying diagnosis and treatment [8]. False-positive results, which in some conditions such as congenital adrenal hyperplasia can reach 13 false-positives for every true-positive [9], could lead to unnecessary follow-up tests, with negative psychosocial effects on children and parents as well as high follow-up medical costs. These drawbacks have created a need for rapid and effective second-tier tests by alternative methods, including next generation sequencing (NGS; also called massively parallel sequencing).

NGS technology has revolutionized the study of human genomic variation and disease diagnosis. It has been shown to be compatible with DBS samples [10] and suitable for NBS genes [11]. In practice, targeted NGS has advantages in clinical diagnostics over whole-genome sequencing (WGS) or whole-exome sequencing (WES) because of its speed and cost-effectiveness. Targeted NGS using hybridization-based capture was found to give better sequencing complexity and uniformity, while the amplicon-based method had the advantages of shorter preparation time and smaller DNA input [12]. Although the amplicon-based method is more vulnerable to false-positive and false-negative results, these problems are mainly caused by insufficient read coverage and the minimum variant frequency setting, and can therefore be mitigated by adjusting the analysis algorithm, for example its filter parameters. Because of the limited DNA quantity from DBS and the urgency of prompt diagnosis, the amplicon-based approach is more suitable for applications in NBS.

NGS could be implemented in NBS in three ways: 1) for follow-up diagnosis and family screening; 2) as a first-tier screening test; and 3) as a second-tier test. As a second-tier confirmation test, NGS is particularly useful for borderline biochemical screening results. The first-tier MS/MS screening method is susceptible to high false-positive and false-negative rates, especially for several IEM diseases. A second-tier NGS test could help to rule out ambiguous screening results due to nutritional effects or common heterozygous carrier status, while ruling in IEM cases for which MS/MS has unsatisfactory sensitivity, such as citrin deficiency and carnitine uptake defect. Given the local Hong Kong statistics and the resources available for NBS, a targeted NGS panel for second-tier confirmation testing would be a suitable choice for implementation. In this study, we validated an 87-gene amplicon-based NGS panel designed for second-tier testing in the NBS program and for other IEM diseases or clinical purposes. Reference DNA materials and DBS samples were used for DNA extraction and library preparation. A commercial pipeline was employed for bioinformatic analysis and variant calling. The whole workflow can be finished in 2.5 days and is compatible with the five-working-day turnaround time of the current NBS program.

Materials and methods

Validation samples

The validation samples consisted of two reference genomes, RM 8398 (ethnic Utah/Mormon) and RM 8393 (ethnic Chinese), purchased from NIST (Gaithersburg, USA). Three validation experiments were performed, each consisting of a 16-plex sequencing run spiked with 2% PhiX DNA as a technical control. One sample was run in triplicate in each run to assess within-run and between-run repeatability. In total, 3 × 16 = 48 sequencing results were obtained. The reference genome samples were used to assess sensitivity and specificity.

Panel design

A custom amplicon-based AmpliSeq panel (Illumina, San Diego, USA) was designed for this study. It targets the exons and ±20 base pairs (bp) of the intron-exon boundaries of 87 genes, covering 289,264 bp of the human genome with 2,263 amplicons (panel design summarized in Table 1). The primers were separated into two pools for library amplification. The panel can detect genetic variants in 87 IEM genes associated with diseases that are potential targets of NBS or relevant to other clinical purposes (Table 2). For reference, a Browser Extensible Data (BED) file containing the genomic positions of the missed bases in exons is provided as Supplementary Data 1, and the exons containing missed bases in each gene are summarized in Supplementary Data 2.

Table 1:

Summary of panel design and validation.

Design
Genome GRCh37 (hg19)
Total number of genes 87
Total number of exons 1,139
Number of bases covered by panel 289,264
Number of bases in reportable region (coding exon ± 20 bp) 171,226
Total number of amplicons 2,263
Number of primer pools 2
Number of bases covered by amplicons 405,603
Average amplicon size, bp 179
Performance
Target region mean depth 208×
Fraction of regions with target depth ≥20× 95.7%
Sensitivity (SNV and indel combined) 91.2%
Specificity 99.95%
SNV, single nucleotide variant.

Table 2:

List of the panel genes with individual >20× coverage and the associated disease.

Gene ≥20× coverage Condition OMIM entry
1. ABCD1 89.5% X-linked adrenoleukodystrophy 300100
2. ACAD8 90.9% Isobutyryl-CoA dehydrogenase deficiency 611283
3. ACAD9 93.4% Mitochondrial complex I deficiency, nuclear type 20 611126
4. ACADM 100% Medium-chain acyl-CoA dehydrogenase deficiency (b) 201450
5. ACADS 93.5% Short-chain acyl-CoA dehydrogenase deficiency 201470
6. ACADSB 99.9% 2-methylbutyrylglycinuria 610006
7. ACADVL 92.0% Very long-chain acyl-CoA dehydrogenase deficiency (b) 201475
8. ACAT1 99.0% Alpha-methylacetoacetic aciduria (alternative titles: Beta-ketothiolase deficiency/2-methyl-3-hydroxybutyric acidemia/Mitochondrial acetoacetyl-CoA thiolase deficiency) (b) 203750
9. ACAT2 97.6% Acetyl-CoA acetyltransferase-2 deficiency 614055
10. ACSF3 99.1% Combined malonic and methylmalonic aciduria (b) 614265
11. ADA 98.2% Severe combined immunodeficiency due to ADA deficiency 102700
12. AHCY 91.3% Hypermethioninemia with deficiency of S-adenosylhomocysteine hydrolase 613752
13. ALDH4A1 90.1% Hyperprolinemia, type II 239510
14. ALDH6A1 93.4% Methylmalonate semialdehyde dehydrogenase deficiency (b) 614105
15. AMT 99.5% Glycine encephalopathy (non-ketotic hyperglycinemia) 605899
16. ARG1 98.6% Argininemia (b) 207800
17. ASL 87.0% Argininosuccinic acidemia (b) 207900
18. ASS1 98.6% Citrullinemia type I (b) 215700
19. ATP7B 95.4% Wilson disease 277900
20. AUH 96.8% 3-methylglutaconic aciduria, type I 250950
21. BCKDHA 97.0% Maple syrup urine disease, type Ia (b) 248600
22. BCKDHB 99.8% Maple syrup urine disease, type Ib (b) 248600
23. BTD 98.0% Biotinidase deficiency (b) 253260
24. CBS 94.5% Homocystinuria (b) 236200
25. CD320 81.3% Methylmalonic aciduria, transient, due to transcobalamin receptor defect (b) 613646
26. CFTR 96.7% Cystic fibrosis 219700
27. CPS1 98.0% Carbamoylphosphate synthetase I deficiency 608307
28. CPT1A 99.8% Carnitine palmitoyltransferase type I deficiency 255120
29. CPT2 99.7% Carnitine palmitoyltransferase II deficiency (b) 600649
30. CYP21A2 (a) (invalid) NA Congenital adrenal hyperplasia (b) 201910
31. DBT 97.1% Maple syrup urine disease, type II (b) 248600
32. DECR1 100% 2,4-dienoyl-CoA reductase deficiency 616034
33. DLD 99.9% Maple syrup urine disease, type III (dihydrolipoamide dehydrogenase deficiency) (b) 246900
34. ETFA 100% Glutaric acidemia type II (b) 231680
35. ETFB 88.1% Glutaric acidemia type II (b) 231680
36. ETFDH 99.8% Glutaric acidemia type II (b) 231680
37. ETHE1 92.5% Ethylmalonic encephalopathy 602473
38. FAH 99.3% Tyrosinaemia type I (b) 276700
39. GAA 91.0% Glycogen storage disease II (Pompe’s disease) 232300
40. GALE 93.8% Galactose epimerase deficiency 230350
41. GALK1 90.7% Galactokinase deficiency with cataracts 230200
42. GALT 97.5% Classical galactosemia (b) 230400
43. GCDH 93.9% Glutaric acidemia type I (b) 231670
44. GCH1 92.7% Dystonia, DOPA-responsive, with or without hyperphenylalaninemia (OMIM 128230); Hyperphenylalaninemia, BH4-deficient, B (OMIM 233910)
45. GCSH 70.5% Glycine encephalopathy (non-ketotic hyperglycinemia) 605899
46. GLDC 93.1% Glycine encephalopathy (non-ketotic hyperglycinemia) 605899
47. GLUD1 96.8% Hyperinsulinism-hyperammonemia syndrome 606762
48. GNMT 83.8% Glycine N-methyltransferase deficiency 606664
49. HADH 89.9% 3-hydroxyacyl-CoA dehydrogenase deficiency (SCHAD deficiency, formerly) 231530
50. HADHA 100% Long-chain 3-hydroxyacyl-CoA dehydrogenase deficiency (OMIM 609016); Mitochondrial trifunctional protein deficiency (OMIM 609015)
51. HADHB 98.5% Trifunctional protein deficiency 609015
52. HLCS 98.8% Multiple carboxylase deficiency/Holocarboxylase synthetase deficiency (b) 253270
53. HMGCL 99.9% 3-hydroxy-3-methylglutaryl-CoA lyase deficiency (b) 246450
54. HMGCS2 97.0% HMG-CoA synthase-2 deficiency 605911
55. HPD 98.3% Tyrosinemia type III 276710
56. HSD17B4 99.8% D-bifunctional protein deficiency 261515
57. IDUA (a) (invalid) 55.9% Mucopolysaccharidosis Ih/Hurler syndrome 607014
58. IVD 94.6% Isovaleric acidemia (b) 243500
59. MAT1A 99.7% Methionine adenosyltransferase I/III deficiency 250850
60. MCCC1 99.0% 3-Methylcrotonyl-CoA carboxylase 1 deficiency 210200
61. MCCC2 99.9% 3-Methylcrotonyl-CoA carboxylase 1 deficiency 210200
62. MCEE 98.7% Methylmalonyl-CoA epimerase deficiency (b) 251120
63. MLYCD 83.3% Malonyl-CoA decarboxylase deficiency (b) 248360
64. MMAA 100% Methylmalonic aciduria, type cblA, vitamin B12-responsive (b) 251100
65. MMAB 98.3% Methylmalonic aciduria, vitamin B12-responsive, due to defect in synthesis of adenosylcobalamin, cblB complementation type (b) 251110
66. MMACHC 96.1% Methylmalonic aciduria and homocystinuria, cblC type (b) 277400
67. MMADHC 100% Methylmalonic aciduria and homocystinuria, cblD type (alternative gene symbol C2orf25) (b) 277410
68. MMUT 93.9% Methylmalonic aciduria, mut(0) type (b) 251000
69. NADK2 99.0% 2,4-dienoyl-CoA reductase deficiency 616034
70. NAGS 70.2% N-acetylglutamate synthase deficiency 237310
71. OTC 95.7% Ornithine transcarbamylase deficiency 311250
72. OXCT1 99.5% Succinyl CoA:3-oxoacid CoA transferase deficiency 245050
73. PAH 100% Phenylketonuria due to phenylalanine hydroxylase deficiency (b) 261600
74. PCBD1 95.2% Pterin-4α-carbinolamine dehydratase deficiency 264070
75. PCCA 96.5% Propionic acidemia (b) 606054
76. PCCB 100% Propionic acidemia (b) 606054
77. PPM1K 99.8% Maple syrup urine disease, mild variant (b) 615135
78. PRODH 84.1% Hyperprolinemia, type I 239500
79. PTS 95.1% 6-pyruvoyl-tetrahydropterin synthase deficiency (b) 261640
80. QDPR 97.7% Dihydropteridine reductase deficiency 261630
81. SLC22A5 96.0% Carnitine uptake deficiency (b) 212140
82. SLC25A13 97.5% Neonatal-onset type II citrullinemia/Neonatal intrahepatic cholestasis caused by citrin deficiency (NICCD) (b) 605814
83. SLC25A15 100% Hyperornithinemia-hyperammonemia-homocitrullinemia syndrome 238970
84. SLC25A20 100% Carnitine-acylcarnitine translocase deficiency (b) 212138
85. SUCLA2 93.8% Mitochondrial DNA depletion syndrome 5 (encephalomyopathic with or without methylmalonic aciduria) (b) 612073
86. SUCLG1 95.8% Mitochondrial DNA depletion syndrome 9 (encephalomyopathic type with methylmalonic aciduria) (b) 245400
87. TAT 97.5% Tyrosinemia type II 276600
(a) IDUA and CYP21A2 are invalid due to poor coverage and the presence of a pseudogene, respectively. (b) IEM conditions included in the Hong Kong NBS program.

DNA extraction for DBS

For DBS collected on Whatman 903 Proteinsaver Cards (Cytiva, Vancouver, Canada), six DBS discs of 3.2 mm diameter were used for DNA extraction. Extraction was performed manually using the column-based QIAamp DNA Mini Kit (QIAGEN, Germantown, USA), with modifications, using the manufacturer-supplied buffers ATL, AL, AW1, AW2, EB, and proteinase K. First, 360 μL of buffer ATL for tissue lysis was added to the punched DBS discs, followed by incubation at 85 °C for 10 min; 40 μL of proteinase K stock solution was then added and incubated at 56 °C for 1 h; finally, 400 μL of buffer AL was added and incubated at 70 °C for 10 min. Next, 400 μL of ethanol (96–100%) was added to the mixture, which was applied to the QIAamp Mini spin column and centrifuged at 6,000×g for 1 min. The filtrate was discarded and the column was washed with AW1 and AW2. DNA was eluted from the column twice with the same 50 μL of buffer EB (QIAGEN, Germantown, USA), each time with a 1 min incubation at room temperature followed by centrifugation at 6,000×g for 1 min. The quantity and quality of the extracted DNA were assessed with the Qubit DNA High Sensitivity (HS) Assay Kit on a Qubit 2.0 Fluorometer and with a NanoDrop spectrophotometer (ThermoFisher Scientific, Waltham, USA), respectively.

Library preparation and targeted sequencing

Library preparation was carried out using the AmpliSeq Plus Library Prep Kit (Illumina) following the manufacturer’s protocol. About 7.5 ng of DNA was used to prepare each library. AmpliSeq UD indexes were used for indexing. At the end, 27 μL of eluted library DNA was collected into a DNA LoBind tube (Eppendorf, Hamburg, Germany). NGS libraries were quantified with the Qubit dsDNA HS assay. Library size distribution was checked with the Agilent High Sensitivity DNA Kit on a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, USA); a distinct peak between 300 and 400 bp should be observed, with a negligible level of contaminating DNA species. Each sequencing library was diluted to a concentration of 1 nM with buffer EB, and 16 libraries were pooled together. The pooled library was then diluted to 90 pM and spiked with 2% of 90 pM PhiX DNA as a technical control. The pooled library was then loaded into the sequencing cartridge and subjected to massively parallel sequencing on the Illumina iSeq-100 System to generate 150 bp paired-end reads.
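The dilution of each library to 1 nM relies on converting a mass concentration (ng/μL, from Qubit) and a mean fragment size (bp, from the Bioanalyzer) into a molar concentration. The following is a minimal sketch of that conversion, assuming the widely used approximation of 660 g/mol per base pair of double-stranded DNA. The function names and the example values are illustrative and are not taken from the study protocol.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def library_nM(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA concentration in ng/uL to molarity in nM."""
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_size_bp)


def dilution_volumes(stock_nM: float, target_nM: float, final_ul: float):
    """Return (library_ul, diluent_ul) needed to reach target_nM in final_ul."""
    if target_nM > stock_nM:
        raise ValueError("cannot dilute to a higher concentration")
    library_ul = final_ul * target_nM / stock_nM
    return library_ul, final_ul - library_ul


if __name__ == "__main__":
    # Hypothetical library: 5.0 ng/uL with a 375 bp mean fragment size.
    stock = library_nM(5.0, 375)
    lib_ul, eb_ul = dilution_volumes(stock, target_nM=1.0, final_ul=100.0)
    print(f"stock = {stock:.1f} nM; mix {lib_ul:.2f} uL library + {eb_ul:.2f} uL buffer")
```

In practice the volumes would be rounded to what can be pipetted accurately, and very concentrated libraries may need an intermediate dilution step.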

Data analysis

The iSeq-100 built-in software, DNA Amplicon, generated FASTQ output files from the raw reads, performed initial alignment and determined QC metrics. Sequencing Analysis Viewer (SAV; Illumina) and FastQC (Bioinformatics Group, Babraham Institute, Cambridge, UK) were also used to monitor the sequencing QC metrics of the pooled library and the individual libraries, respectively.

We employed a commercial pipeline, the NextGENe software (v2.4.2.3; Softgenetics, State College, USA), to perform read alignment against the reference genome (GRCh37/hg19), filtering and variant calling. Data analyses targeted all coding exons ±20 bp in order to include the intronic splice site regions. For variant calling, the criteria were >20% mutation frequency, >3 allele count and >5 total coverage count; homozygous variants were exempted from these cutoffs. In-read phasing (merging) of adjacent variants was allowed with a maximum gap of 1 bp and a phaseable read percentage of >50%. Additional custom filtering criteria were imposed to minimize the false-positive rate, using a proprietary quality score (cutoff = 4) composed of an adjusted coverage score, a mismatch score and a wrong allele score. A variant was regarded as valid only if a minimum of 20 reads was observed at its position. A mutation report (VCF file) annotating all variants was created. Human Gene Mutation Database (HGMD) Professional (QIAGEN, Germantown, USA), Alamut Visual (Sophia Genetics SA, Saint-Sulpice, Switzerland) and Integrative Genomics Viewer (IGV) [13] were used to further characterize the variants and raw reads. Single nucleotide variants (SNVs) and small indels were targeted for curation.
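The numeric calling thresholds described above can be written out as a simple filter. This is only an illustrative sketch and not the NextGENe implementation: the `Call` class and function names are hypothetical, the proprietary quality score is not modelled, and it is an assumption here that the 20-read minimum applies to every variant, including homozygous ones.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """A candidate variant call (hypothetical structure for illustration)."""
    alt_reads: int            # reads supporting the variant allele
    total_reads: int          # total reads covering the position
    homozygous: bool = False  # assumed to be flagged upstream by the caller

    @property
    def alt_fraction(self) -> float:
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def passes_calling(call: Call, min_frac=0.20, min_alt=3, min_total=5) -> bool:
    """Apply the >20% frequency, >3 allele count and >5 coverage cutoffs.

    Homozygous calls are exempted from these three cutoffs, as in the text.
    """
    if call.homozygous:
        return True
    return (call.alt_fraction > min_frac
            and call.alt_reads > min_alt
            and call.total_reads > min_total)


def is_reportable(call: Call, min_depth=20) -> bool:
    """A call is treated as valid only with at least `min_depth` reads."""
    return passes_calling(call) and call.total_reads >= min_depth


if __name__ == "__main__":
    for c in (Call(30, 100), Call(10, 100), Call(6, 12), Call(18, 18, True)):
        print(c, "reportable:", is_reportable(c))
```

Separating the calling cutoffs from the depth requirement keeps the two rules independently adjustable, which is convenient when tuning filter parameters against reference materials.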

External quality assurance (EQA) program

We joined the European Molecular Genetics Quality Network (EMQN) pilot EQA scheme for NGS (Germline; 2020). The provided DNA sample was sequenced in a separate 16-plex run, and the BED file of the original panel design, covering the whole panel rather than the coding exons only, was submitted to the EQA program for data analysis.

Results

DNA extraction, library preparation, and sequencing metrics

All libraries showed satisfactory yield and quality: concentrations ranged from 2.67 to 11.6 ng/μL and average sizes ranged from 359 to 390 bp, with a distinct and dominant peak between 300 and 400 bp. The total yield of each run ranged from 1.74 to 1.92 Gbp. The overall sequencing quality of the three runs was good, with %Q30 between 91 and 92% (Table 3). The PhiX alignment rate was between 1.3 and 1.4% (compared with the expected 2%; Table 3). All individual libraries passed the FastQC metrics except one, which showed a distinctly high duplication level. This sample also showed low uniformity (defined as the percentage of bases with >0.2× mean coverage) and was therefore regarded as failed.
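The two coverage metrics used in this study, uniformity (percentage of bases with >0.2× mean coverage) and the fraction of bases at ≥20× depth, can be computed directly from per-base depths. The sketch below assumes the depths are already available as a list of integers (for example, exported from an alignment viewer or a depth tool); the function names and the toy depth values are illustrative only.

```python
def uniformity(depths, fraction=0.2):
    """Percentage of bases whose depth exceeds `fraction` x the mean depth."""
    if not depths:
        raise ValueError("no depth values supplied")
    threshold = fraction * (sum(depths) / len(depths))
    return 100.0 * sum(d > threshold for d in depths) / len(depths)


def percent_at_depth(depths, min_depth=20):
    """Percentage of bases covered by at least `min_depth` reads."""
    if not depths:
        raise ValueError("no depth values supplied")
    return 100.0 * sum(d >= min_depth for d in depths) / len(depths)


if __name__ == "__main__":
    # Toy example: nine well-covered bases and one poorly covered base.
    toy_depths = [200] * 9 + [10]
    print(f"uniformity = {uniformity(toy_depths):.1f}%")
    print(f">=20x      = {percent_at_depth(toy_depths):.1f}%")
```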

Table 3:

Summary of quality control (QC) metrics of the three batches of 16-plex run.

Sample ID Coverage (×) Reads on targets Uniformity
Batch 1 (Q30 = 91.1%; PhiX alignment rate = 1.3%)
RM8398 211 644,933 98.1%
RM8393 192 583,892 97.7%
Batch 2 (Q30 = 91.6%; PhiX alignment rate = 1.4%)
RM8398 191 582,074 96.8%
RM8393 221 676,909 97.1%
Batch 3 (Q30 = 91.8%; PhiX alignment rate = 1.4%)
RM8398 220 678,307 94.4%
RM8393 176 540,091 96.1%

Coverage and repeatability

After alignment against the reference genome (GRCh37/hg19), each library yielded about 640,000 reads on target on average, with an average coverage of 208× ± 20 (1 SD) (Table 3). For intra-run repeatability, the triplicate sample had a mean coverage of 195×, 206×, and 223× in the three runs, with corresponding coefficients of variation (CV) of 9.1, 10.5, and 5.2%, respectively. Combining these data to calculate inter-run repeatability, the average coverage was 208× with a CV of 10.1%.
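The repeatability figures above are coefficients of variation, i.e. the standard deviation expressed as a percentage of the mean. A minimal sketch of the calculation follows, assuming the sample standard deviation is used (the study does not state which estimator was applied). Because the individual triplicate coverages are not listed, the example uses the six reference-library coverages from Table 3 purely as input data, so its output is not expected to reproduce the CVs reported for the triplicate sample.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample SD divided by the mean, times 100."""
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


if __name__ == "__main__":
    # Mean coverage (x) of the reference libraries listed in Table 3.
    table3_coverage = [211, 192, 191, 221, 220, 176]
    print(f"mean = {mean(table3_coverage):.1f}x, "
          f"CV = {coefficient_of_variation(table3_coverage):.1f}%")
```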

For each of the 87 genes, we examined the percentage of the reportable region with >20× coverage across the three runs (Table 2). Seventy-five genes achieved an average of ≥90% >20× coverage, of which eight achieved 100% >20× coverage in all three runs. Another eight genes had between 80 and 90% >20× coverage and were also usable for reporting. CYP21A2 was disqualified because of its very high homology with its pseudogene CYP21A1P. Another gene, IDUA, had low >20× coverage at 55.9% and will not be used for reporting. NAGS and GCSH had the next lowest >20× coverage at around 70% but were still considered usable for reporting.

Sensitivity

We used the variants in the reference genomes RM8398 (ethnic Utah/Mormon) and RM8393 (ethnic Chinese) to evaluate the sensitivity of the panel, defined as the proportion of known variants that were called (true-positives). The sensitivities were 90.9% (442 out of 486 variants; SNV and indel combined) and 91.4% (367 out of 401 variants; SNV and indel combined), respectively, which were considered acceptable. Specificity was calculated as one minus the ratio of false-positives to the total number of bases in the reporting region achieving >20× coverage, giving specificities of 99.94% (98 false-positives out of 163,932 bases) and 99.95% (87 false-positives out of 163,932 bases) for RM8398 and RM8393, respectively. Many of the false-positives were found near the 3′ end of the mapped reads, with lower coverage and quality scores, probably due to a drop in sequencing quality near the 3′ end. These assay-specific artefacts were observed recurrently in all samples and could be easily identified and excluded using a genome browser during NGS data reporting.
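The sensitivity and specificity definitions used here can be checked by recomputing them from the published counts. The short sketch below does this with hypothetical helper names; it treats every base in the reporting region with >20× coverage as a potential negative, as described in the text. Values recomputed from the counts may differ from the reported percentages in the last digit, depending on how rounding was applied.

```python
def sensitivity(true_positives: int, known_variants: int) -> float:
    """Percentage of known reference variants that were called."""
    return 100.0 * true_positives / known_variants


def specificity(false_positives: int, assessed_bases: int) -> float:
    """Percentage of assessed bases without a false-positive call."""
    return 100.0 * (1 - false_positives / assessed_bases)


if __name__ == "__main__":
    counts = {
        # sample: (true-positives, known variants, false-positives, bases)
        "RM8398": (442, 486, 98, 163_932),
        "RM8393": (367, 401, 87, 163_932),
    }
    for sample, (tp, known, fp, bases) in counts.items():
        print(f"{sample}: sensitivity {sensitivity(tp, known):.2f}%, "
              f"specificity {specificity(fp, bases):.2f}%")
    # Pooling the counts of both reference genomes gives a combined figure.
    print(f"combined sensitivity {sensitivity(442 + 367, 486 + 401):.1f}%")
```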

EQA

The results returned from the EMQN pilot EQA scheme for NGS (2020) were satisfactory (Table 4), with a median depth at called variants of 178×. The sensitivities for SNVs and indels were 97.89 and 81.58%, respectively. The precisions were lower for both SNVs and indels, at 77.44 and 53.4%, respectively, because the whole-panel BED file was submitted for analysis, which includes more non-coding regions and the 3′ ends of reads. The latter are known to be more likely to exhibit lower sequencing quality and PCR artefacts. Artefacts at the 3′ end are also more likely to occur in tandem, so they more often appear as indels than as SNVs. Fortunately, these artefacts can be easily identified using a genome browser. Since we were the only participant using the iSeq and the custom AmpliSeq panel (out of 293 participating laboratories, 77.6% of which used one of the Illumina NGS platforms), our results cannot be directly compared with the peer group. Our EQA results show good sensitivity and acceptable precision, both of which can be improved if the filtering criteria are optimized and only coding exons ±20 bp are analyzed.

Table 4:

Result summary of European Molecular Genetics Quality Network (EMQN) pilot external quality assurance (EQA) scheme for next generation sequencing (NGS) (germline 2020).

Metric Value
FASTQ files (a)
Bases above Q30 quality score 94.1%
Average base quality (Phred scale) 35.9
BAM file (a)
Uniformity (b) 97.4%
Off target 2.4%
Error rate on target 0.0032
Coverage at 20× 97.4%
VCF file (a)
Depth at calls (median) 178×
SNP detection (a)
True-positives 278
False-positives 81
False-negatives 6
Sensitivity (c) 97.9%
Precision (d) 77.4%
F-score 86.5%
Indel detection (a)
True-positives 31
False-positives 27
False-negatives 7
Sensitivity (c) 81.6%
Precision (d) 53.4%
F-score 64.6%
(a) We submitted the FASTQ sequence data files together with the BAM and VCF result files to EMQN, which then evaluated the SNP and indel detection performance. (b) The percentage of bases on target covered at 0.1 × median coverage. (c) Proportion of actual positives that are correctly identified as such. (d) Proportion of actual positives among all reported positives. EMQN, European Molecular Genetics Quality Network; EQA, external quality assurance; SNP, single nucleotide polymorphism; NGS, next generation sequencing.
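The sensitivity, precision and F-score in Table 4 follow from the true-positive, false-positive and false-negative counts. As a worked check, the sketch below recomputes them, assuming the F-score is the usual F1 (harmonic mean of precision and sensitivity); the function name is illustrative. The recomputed values agree with the table to within rounding.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Return sensitivity, precision and F1 score as percentages."""
    sens = tp / (tp + fn)
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    return {
        "sensitivity": 100.0 * sens,
        "precision": 100.0 * prec,
        "f_score": 100.0 * f1,
    }


if __name__ == "__main__":
    eqa_counts = {"SNP": (278, 81, 6), "indel": (31, 27, 7)}  # from Table 4
    for variant_type, (tp, fp, fn) in eqa_counts.items():
        m = detection_metrics(tp, fp, fn)
        print(f"{variant_type}: sensitivity {m['sensitivity']:.1f}%, "
              f"precision {m['precision']:.1f}%, F-score {m['f_score']:.1f}%")
```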

Turnaround time (TAT)

For a 16-plex run, DNA preparation from DBS can be finished in half a day (Day 1). Library synthesis and pooling can be accomplished within 1.5 days (Days 1 and 2), followed by overnight sequencing for about 22 h. Bioinformatic analysis and QC procedures can be done within 3 h on Day 3. Since normally only a few variants per gene are discovered and only one or two genes per sample need to be interpreted, an optimized workflow should allow reports to be ready in about 2.5 working days.

Discussion

In this study, we validated the analytical performance of 16-plex iSeq sequencing using an 87-gene IEM AmpliSeq panel. The panel is intended for second-tier NBS and other IEM diseases, targeting variants in coding exons and splice sites. Of the 87 genes, 85 (all except IDUA and CYP21A2) showed satisfactory performance for clinical reporting. The assay achieved a sensitivity of 91.15%, a specificity of 99.95%, and satisfactory repeatability and EQA results. Seventy-five genes had over 90% per-gene >20× coverage of the reportable region. The whole workflow can be completed with a 2.5-day TAT. Together with the two-day TAT for first-tier MS/MS reporting, a total TAT of five days could be achieved for the whole NBS system.

Optimizing sample preparation and scrutiny

DNA extraction was performed manually using a column-based method; in our experience, the minimum yield was about 60 ng, which is more than required for quantitation and library construction (about 12 ng). The excess DNA should be enough for repeat testing or other tests such as independent single nucleotide polymorphism (SNP) genotyping assays. Considering the urgency of NBS reporting, automated DNA extraction with magnetic particles could be an alternative, with the advantages of better speed, yield, purity and consistency, as well as a reduced chance of sample swapping. The complicated sample and library preparation workflow in targeted NGS introduces the possibility of sample mix-up or cross-contamination [14, 15]. Procedures to verify sample identity are therefore important when implementing the test, in order to minimize the chance of misreporting. One possible approach is an in silico identity check that looks up a set of highly heterogeneous SNP genotypes and predicts sex from the sequencing reads; the combined SNP genotype should not be identical between any two samples in a batch of 16, and the predicted sex should match the reported sex. Furthermore, independent PCR-based SNP genotyping assays can be performed and the results compared with those from the in silico SNP identification pipeline, to further minimize the chance of sample mix-up or swapping.
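The in silico identity check described above can be sketched as two simple comparisons within a batch: no two samples should share the same combined SNP genotype, and the predicted sex should agree with the reported sex. The code below is a hypothetical illustration only. The sample identifiers, the genotype encoding and the upstream steps that produce genotypes and predicted sex are all assumptions, and deliberate replicates of the same sample would need to be excluded from the duplicate check.

```python
from itertools import combinations


def identical_fingerprints(genotypes: dict) -> list:
    """Return pairs of sample IDs whose combined SNP genotypes are identical.

    `genotypes` maps a sample ID to a tuple of genotype strings, one per SNP,
    in the same SNP order for every sample.
    """
    return [(a, b) for a, b in combinations(sorted(genotypes), 2)
            if genotypes[a] == genotypes[b]]


def sex_mismatches(predicted: dict, reported: dict) -> list:
    """Return sample IDs whose predicted sex differs from the reported sex."""
    return sorted(s for s in reported if predicted.get(s) != reported[s])


if __name__ == "__main__":
    batch = {
        "S01": ("A/G", "C/C", "T/T", "G/A"),
        "S02": ("A/A", "C/T", "T/T", "G/G"),
        "S03": ("A/G", "C/C", "T/T", "G/A"),  # same fingerprint as S01
    }
    print("possible mix-up:", identical_fingerprints(batch))
    print("sex mismatch:", sex_mismatches(
        predicted={"S01": "F", "S02": "M", "S03": "M"},
        reported={"S01": "F", "S02": "F", "S03": "M"},
    ))
```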

Second-tier NBS test through targeted NGS

The introduction of NGS would benefit current NBS practice. First-tier MS/MS and biochemical approaches allow detection of multiple IEM, including amino acid, organic acid and fatty acid disorders. However, these tests may have relatively high false-positive or false-negative rates. This may be due to factors such as prematurity, total parenteral nutrition, maternal defects and the age of disease onset, which affect metabolite concentrations and lead to false results. Although two-tier testing might help to screen out suspected cases for specific IEM, few such applications are available. To complement the current NBS approach, targeted NGS could be implemented as a second-tier test. Integrating targeted NGS could address the drawbacks of first-tier biochemical tests by minimizing the false-positive and false-negative results caused by borderline metabolite concentrations.

According to Hong Kong NBS statistics since 2015, several conditions have a relatively high incidence of false-positive or false-negative results [4]; these conditions would therefore be excellent candidates for second-tier testing with our assay. For example, our NGS assay could be used as a second-tier test for samples with a borderline free carnitine level (e.g., 5.0–6.5 μmol/L). The absence of pathogenic variants in SLC22A5, the causative gene of systemic primary carnitine deficiency [16], could reduce the number of false-positive cases.

On the other hand, false-negative cases have frequently been reported for citrin deficiency in Hong Kong [4] and elsewhere [17, 18]. By adopting second-tier genetic testing, we can extend detection to patients with citrin deficiency who have a relatively mild citrulline elevation, higher than the upper reference limit but below the NBS reporting cutoff (e.g., a citrulline level of 25–35 μmol/L).
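The two examples above amount to a reflex rule: a first-tier result in a borderline range triggers second-tier sequencing review of the relevant gene. The sketch below illustrates this logic using the example ranges quoted in the text (free carnitine 5.0–6.5 μmol/L for SLC22A5, citrulline 25–35 μmol/L for SLC25A13). The rule table, the analyte keys and the inclusive range boundaries are assumptions for illustration and do not represent validated cutoffs.

```python
# Illustrative borderline ranges in umol/L, taken from the examples in the text.
BORDERLINE_RULES = {
    "free_carnitine": (5.0, 6.5, "SLC22A5"),
    "citrulline": (25.0, 35.0, "SLC25A13"),
}


def second_tier_genes(first_tier_results: dict) -> list:
    """Return the genes to review for analytes falling in a borderline range.

    `first_tier_results` maps an analyte name to its concentration in umol/L.
    Range boundaries are treated as inclusive (an assumption).
    """
    genes = []
    for analyte, (low, high, gene) in BORDERLINE_RULES.items():
        value = first_tier_results.get(analyte)
        if value is not None and low <= value <= high:
            genes.append(gene)
    return genes


if __name__ == "__main__":
    print(second_tier_genes({"free_carnitine": 5.8, "citrulline": 12.0}))
    print(second_tier_genes({"citrulline": 30.0}))
    print(second_tier_genes({"free_carnitine": 20.0}))
```

Keeping the ranges in a single table makes it straightforward to add further analyte and gene pairs as local screening statistics accumulate.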

Limitations

Short-read mapping is not suitable for genes with highly homologous genomic regions such as pseudogenes [19]; in this panel, analysis of CYP21A2 is confounded by its pseudogene CYP21A1P. While the assay can pick up most pathogenic variants in the exonic and splice regions, variants within promoter regions, deep intronic regions, or regulatory elements outside the targeted regions cannot be detected. In addition, gross insertions and deletions within the target regions could not be detected by this assay at the time of reporting. These variants could possibly be detected using other post-analytical pipelines specifically designed and optimized for copy number variant (CNV) detection. Amplicon-based panels might also be susceptible to allele drop-out due to SNVs in primer binding sites that impair primer annealing [20, 21] or to secondary structure formation in non-primer regions of the amplicon [22], resulting in insufficient coverage and false-negative results.

Conclusions

In conclusion, we have successfully validated a custom AmpliSeq panel for IEM. This assay should be practical for detecting SNVs and small indels in second-tier NBS testing. Further work is required to optimize its performance and workflow for application in the Hong Kong NBS program.


Corresponding author: Chloe Miu Mak, Newborn Screening for Inborn Errors of Metabolism Laboratory, Hong Kong Children’s Hospital, Hong Kong SAR, P.R. China; and Department of Pathology, Division of Chemical Pathology, Hong Kong Children’s Hospital, Hong Kong SAR, P.R. China

Acknowledgments

We would like to thank the Genetics and Genomics Division, Department of Pathology, Hong Kong Children’s Hospital for providing expert opinion and technical support for this work.

  1. Research funding: None declared.

  2. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Competing interests: Authors state no conflict of interest.

  4. Informed consent: Not applicable.

  5. Ethical approval: Not applicable.

References

1. Therrell, BL, Padilla, CD, Loeber, JG, Kneisser, I, Saadallah, A, Borrajo, GJ, et al.. Current status of newborn screening worldwide: 2015. Semin Perinatol 2015;39:171–87. https://doi.org/10.1053/j.semperi.2015.03.002.Suche in Google Scholar PubMed

2. Lehmann, S, Delaby, C, Vialaret, J, Ducos, J, Hirtz, C. Current and future use of “dried blood spot” analyses in clinical chemistry. Clin Chem Lab Med 2013;51:1897–909. https://doi.org/10.1515/cclm-2013-0228.Suche in Google Scholar PubMed

3. Feuchtbaum, L, Carter, J, Dowray, S, Currier, RJ, Lorey, F. Birth prevalence of disorders detectable through newborn screening by race/ethnicity. Genet Med 2012;14:937–45. https://doi.org/10.1038/gim.2012.76.Suche in Google Scholar PubMed

4. The Task Force on the Pilot Study of Newborn Screening for Inborn Errors of Metabolism, Hong Kong SAR. Evaluation of the 18-month “Pilot Study of Newborn Screening for Inborn Errors of Metabolism” in Hong Kong. HK J Paediatr (New Ser) 2020;25:16-22.Suche in Google Scholar

5. Lindner, M, Gramer, G, Haege, G, Fang-Hoffmann, J, Schwab, KO, Tacke, U, et al.. Efficacy and outcome of expanded newborn screening for metabolic diseases--report of 10 years from South-West Germany. Orphanet J Rare Dis 2011;6:44. https://doi.org/10.1186/1750-1172-6-44.Suche in Google Scholar PubMed PubMed Central

6. Lim, JS, Tan, ES, John, CM, Poh, S, Yeo, SJ, Ang, JS, et al.. Inborn Error of Metabolism (IEM) screening in Singapore by electrospray ionization-tandem mass spectrometry (ESI/MS/MS): an 8 year journey from pilot to current program. Mol Genet Metabol 2014;113:53–61. https://doi.org/10.1016/j.ymgme.2014.07.018.Suche in Google Scholar PubMed

7. Lindner, M, Abdoh, G, Fang-Hoffmann, J, Shabeck, N, Al-Sayrafi, M, Al-Janahi, M, et al.. Implementation of extended neonatal screening and a metabolic unit in the State of Qatar: developing and optimizing strategies in cooperation with the Neonatal Screening Center in Heidelberg. J Inherit Metab Dis 2007;30:522–9. https://doi.org/10.1007/s10545-007-0553-7.Suche in Google Scholar PubMed

8. Wang, LY, Chen, NI, Chen, PW, Chiang, SC, Hwu, WL, Lee, NC, et al.. Newborn screening for citrin deficiency and carnitine uptake defect using second-tier molecular tests. BMC Med Genet 2013;14:24. https://doi.org/10.1186/1471-2350-14-24.Suche in Google Scholar PubMed PubMed Central

9. Matern, D, Tortorelli, S, Oglesbee, D, Gavrilov, D, Rinaldo, P. Reduction of the false-positive rate in newborn screening by implementation of MS/MS-based second-tier tests: the Mayo Clinic experience (2004-2007). J Inherit Metab Dis 2007;30:585–92. https://doi.org/10.1007/s10545-007-0691-y.Suche in Google Scholar PubMed

10. Hollegaard, MV, Grauholm, J, Nielsen, R, Grove, J, Mandrup, S, Hougaard, DM. Archived neonatal dried blood spot samples can be used for accurate whole genome and exome-targeted next-generation sequencing. Mol Genet Metabol 2013;110:65–72. https://doi.org/10.1016/j.ymgme.2013.06.004.Suche in Google Scholar PubMed

11. Bodian, DL, Klein, E, Iyer, RK, Wong, WS, Kothiyal, P, Stauffer, D, et al.. Utility of whole-genome sequencing for detection of newborn screening disorders in a population cohort of 1,696 neonates. Genet Med 2016;18:221–30. https://doi.org/10.1038/gim.2015.111.Suche in Google Scholar PubMed

12. Samorodnitsky, E, Jewell, BM, Hagopian, R, Miya, J, Wing, MR, Lyon, E, et al.. Evaluation of hybridization capture versus amplicon-based methods for whole-exome sequencing. Hum Mutat 2015;36:903–14. https://doi.org/10.1002/humu.22825.Suche in Google Scholar PubMed PubMed Central

13. Robinson, JT, Thorvaldsdóttir, H, Wenger, AM, Zehir, A, Mesirov, JP. Variant review with the integrative genomics viewer. Cancer Res 2017;77:e31–4. https://doi.org/10.1158/0008-5472.can-17-0337.Suche in Google Scholar PubMed PubMed Central

14. Koboldt, DC, Ding, L, Mardis, ER, Wilson, RK. Challenges of sequencing human genomes. Briefings Bioinf 2010;11:484–98. https://doi.org/10.1093/bib/bbq016.Suche in Google Scholar PubMed PubMed Central

15. Wang, PP, Parker, WT, Branford, S, Schreiber, AW. BAM-matcher: a tool for rapid NGS sample matching. Bioinformatics 2016;32:2699–701. https://doi.org/10.1093/bioinformatics/btw239.Suche in Google Scholar PubMed

16. Lee, NC, Tang, NL, Chien, YH, Chen, CA, Lin, SJ, Chiu, PC, et al.. Diagnoses of newborns and mothers with carnitine uptake defects through newborn screening. Mol Genet Metabol 2010;100:46–50. https://doi.org/10.1016/j.ymgme.2009.12.015.Suche in Google Scholar PubMed

17. Ohura, T, Kobayashi, K, Tazawa, Y, Abukawa, D, Sakamoto, O, Tsuchiya, S, et al.. Clinical pictures of 75 patients with neonatal intrahepatic cholestasis caused by citrin deficiency (NICCD). J Inherit Metab Dis 2007;30:139–44. https://doi.org/10.1007/s10545-007-0506-1.Suche in Google Scholar PubMed

18. Shigetomi, H, Tanaka, T, Nagao, M, Tsutsumi, H. Early detection and diagnosis of neonatal intrahepatic cholestasis caused by citrin deficiency missed by newborn screening using tandem mass spectrometry. Int J Neonatal Screen 2018;4:5. https://doi.org/10.3390/ijns4010005.Suche in Google Scholar PubMed PubMed Central

19. Trier, C, Fournous, G, Strand, JM, Stray-Pedersen, A, Pettersen, RD, Rowe, AD. Next-generation sequencing of newborn screening genes: the accuracy of short-read mapping. NPJ Genom Med 2020;5:36. https://doi.org/10.1038/s41525-020-00142-z.Suche in Google Scholar PubMed PubMed Central

20. Shestak, AG, Bukaeva, AA, Saber, S, Zaklyazminskaya, EV. Allelic dropout is a common phenomenon that reduces the diagnostic yield of PCR-based sequencing of targeted gene panels. Front Genet 2021;12:620337. https://doi.org/10.3389/fgene.2021.620337.Suche in Google Scholar PubMed PubMed Central

21. Zucca, S, Villaraggia, M, Gagliardi, S, Grieco, GS, Valente, M, Cereda, C, et al.. Analysis of amplicon-based NGS data from neurological disease gene panels: a new method for allele drop-out management. BMC Bioinf 2016;17:339. https://doi.org/10.1186/s12859-016-1189-0.Suche in Google Scholar PubMed PubMed Central

22. Lam, CW, Mak, CM. Allele dropout caused by a non-primer-site SNV affecting PCR amplification--a call for next-generation primer design algorithm. Clin Chim Acta 2013;421:208–12. https://doi.org/10.1016/j.cca.2013.03.014.Suche in Google Scholar PubMed


Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/labmed-2021-0115).


Received: 2021-09-02
Accepted: 2021-09-29
Published Online: 2021-10-25
Published in Print: 2021-12-20

© 2021 Kwok Yeung Tsang et al., published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
