Validation of amplicon-based next generation sequencing panel for second-tier test in newborn screening for inborn errors of metabolism
Kwok Yeung Tsang, Toby Chun Hei Chan, Matthew Chun Wing Yeung, Tsz Ki Wong, Wan Ting Lau and Chloe Miu Mak
Abstract
Objectives
Next generation sequencing (NGS) technology has allowed cost-effective, massively parallel DNA sequencing. To evaluate the utility of NGS for newborn screening (NBS) of inborn errors of metabolism (IEM), a custom panel was designed to target 87 disease-related genes. The pilot study was primarily proposed for second-tier testing under the NBSIEM program in Hong Kong.
Methods
The panel was validated with two reference genomes and an external quality assurance (EQA) sample. Sequencing libraries were synthesized with an amplicon-based approach. The libraries were pooled and spiked with 2% PhiX DNA as a technical control for 16-plex sequencing runs. Sequenced reads were analyzed using a commercially available pipeline.
Results
The average target region coverage was 208×, the fraction of the target region with depth ≥20× was 95.7%, and the sensitivity was 91.2%. Coverage was acceptable for 85 of the 87 genes, and the EQA result was satisfactory. The turnaround time from DNA extraction to completion of variant calling and quality control (QC) procedures was 2.5 days.
Conclusions
The NGS approach with the amplicon-based panel has been validated for analytical performance and is suitable as a second-tier NBSIEM test.
Introduction
A newborn screening (NBS) program is an effective means of identifying conditions that are not clinically evident on physical examination in the neonatal period, allowing early diagnosis and intervention to prevent disability or death. Many countries carry out screening programs for inborn errors of metabolism (IEM) using dried blood spots (DBS) [1]. DBS analysis offers the advantages of a small blood volume, minimal sample preparation, and relatively long-term stability of analytes through drying [2]. Although each individual IEM is rare, the collective incidence could be as high as one in 500 to 4,000, posing a serious public health problem [3]. During an 18-month pilot study of expanded NBS for 26 IEM conditions conducted in Hong Kong from 2015, nine confirmed IEM cases were found among 15,138 babies screened, a collective incidence of one in 1,682 newborns [4], compared with reported incidences of 1/2,920 in Germany [5], 1/3,165 in Singapore [6], and 1/901 in Qatar [7]. Incidence rates may vary for reasons such as panel selection, ethnic composition and the scope of the study.
A tandem mass spectrometry (MS/MS) program for IEM is well suited to early detection and diagnosis. Although MS/MS allows rapid, cost-effective, and simultaneous detection of IEM-related analytes, it still has limitations such as high false-positive rates, imprecision, and false-negative results. False-negative rates are high for several IEM diseases such as citrin deficiency and carnitine uptake defect, delaying diagnosis and treatment in patients [8]. False-positives, which in some conditions such as congenital adrenal hyperplasia can be as frequent as 13 for every true-positive [9], could trigger unnecessary follow-up tests, with negative psychosocial effects on children and parents as well as high follow-up medical costs. These drawbacks have raised a need for rapid and effective second-tier tests using alternative methods, including next generation sequencing (NGS; also called massively parallel sequencing).
NGS technology has revolutionized the study of human genomic variation and disease diagnosis. It has been shown to be compatible with DBS samples [10] and suitable for NBS genes [11]. In practice, targeted NGS has an advantage in clinical diagnostics because of its speed and cost-effectiveness compared with whole-genome sequencing (WGS) or whole-exome sequencing (WES). Targeted NGS using hybridization-based capture was found to give better sequencing complexity and uniformity, while the amplicon-based method had the advantages of shorter preparation time and smaller DNA input [12]. Although the amplicon-based method was more vulnerable to false-positive and false-negative results, these problems are mainly caused by insufficient read coverage and the minimum variant frequency setting, and could therefore be compensated for by modifying the algorithms, for example by adjusting filter parameters. Because of the limited DNA quantity available from DBS and the urgency of prompt diagnosis, an amplicon-based approach would be more suitable for applications in NBS.
NGS has the potential to be implemented in NBS in three ways: 1) for follow-up diagnosis and family screening; 2) as a first-tier screening test; and 3) as a second-tier test. NGS could be applied in second-tier confirmation testing for NBS, particularly for borderline biochemical screening results. The first-tier MS/MS screening method is susceptible to high false-positive and false-negative rates, especially in several IEM diseases. A second-tier NGS test could help to rule out ambiguous screening results caused by nutritional effects or common heterozygous carriers, while ruling in IEM cases for which MS/MS has unsatisfactory sensitivity, such as citrin deficiency and carnitine uptake defect. Given the local statistics in Hong Kong and the resources available for NBS, a targeted NGS panel for second-tier confirmation testing would be a suitable choice for implementation. In this study, we validated an 87-gene amplicon-based NGS panel designed as a second-tier test for the NBS program and for other IEM diseases and clinical purposes. Reference DNA materials and DBS samples were used for DNA extraction and library preparation. A commercial pipeline was employed for bioinformatic analysis and variant calling. The whole workflow can be finished in 2.5 days and is compatible with the five-working-day turnaround time of the current NBS program.
Materials and methods
Validation samples
The validation samples consisted of two reference genomes, RM 8398 (ethnic Utah/Mormon) and RM 8393 (ethnic Chinese) (purchased from NIST, Gaithersburg, USA). A total of three validation experiments were performed, each of which was a 16-plex sequencing run spiked with 2% PhiX DNA as a technical control. One sample was run in triplicate in each run to validate technical performance in terms of within-run and between-run repeatability. A total of 3 × 16 = 48 sequencing results were obtained. The reference genome samples were used for assessment of sensitivity and specificity.
Panel design
A custom AmpliSeq panel (Illumina, San Diego, USA) was designed for this study. The amplicon-based panel targeted the exons and ±20 base pairs (bp) of the intron-exon boundaries of 87 genes, covering 289,264 bp of the human genome with 2,263 amplicons (panel design summarized in Table 1). The primers were separated into two pools for library amplification. The panel could detect genetic variants in 87 genes associated with IEM diseases that are potential targets of NBS or of other clinical interest (Table 2). For reference, a Browser Extensible Data (BED) file containing the genomic positions of the missed bases in exons is provided as Supplementary Data 1, and the exons containing missed bases in each gene are summarized in Supplementary Data 2.
Table 1. Summary of panel design and validation.
| Parameter | Value |
|---|---|
| Design | |
| Genome | GRCh37 (hg19) |
| Total number of genes | 87 |
| Total number of exons | 1,139 |
| Number of bases covered by panel | 289,264 |
| Number of bases in reportable region (coding exon ± 20 bp) | 171,226 |
| Total number of amplicons | 2,263 |
| Number of primer pools | 2 |
| Number of bases covered by amplicons | 405,603 |
| Average amplicon size, bp | 179 |
| Performance | |
| Target region mean depth | 208× |
| Fraction of target regions with depth ≥20× | 95.7% |
| Sensitivity (SNV and indel combined) | 91.2% |
| Specificity | 99.95% |

SNV, single nucleotide variant.
Table 2. List of the panel genes, the percentage of each gene with >20× coverage, and the associated conditions.
| No. | Gene | ≥20× coverage | Condition | OMIM entry |
|---|---|---|---|---|
| 1. | ABCD1 | 89.5% | X-linked adrenoleukodystrophy | 300100 |
| 2. | ACAD8 | 90.9% | Isobutyryl-CoA dehydrogenase deficiency | 611283 |
| 3. | ACAD9 | 93.4% | Mitochondrial complex I deficiency, nuclear type 20 | 611126 |
| 4. | ACADM | 100% | Medium-chain acyl-CoA dehydrogenase deficiencyᵇ | 201450 |
| 5. | ACADS | 93.5% | Short-chain acyl-CoA dehydrogenase deficiency | 201470 |
| 6. | ACADSB | 99.9% | 2-methylbutyrylglycinuria | 610006 |
| 7. | ACADVL | 92.0% | Very long-chain acyl-CoA dehydrogenase deficiencyᵇ | 201475 |
| 8. | ACAT1 | 99.0% | Alpha-methylacetoacetic aciduria (alternative titles: Beta-ketothiolase deficiency/2-methyl-3-hydroxybutyric acidemia/Mitochondrial acetoacetyl-CoA thiolase deficiency)ᵇ | 203750 |
| 9. | ACAT2 | 97.6% | Acetyl-CoA acetyltransferase-2 deficiency | 614055 |
| 10. | ACSF3 | 99.1% | Combined malonic and methylmalonic aciduriaᵇ | 614265 |
| 11. | ADA | 98.2% | Severe combined immunodeficiency due to ADA deficiency | 102700 |
| 12. | AHCY | 91.3% | Hypermethioninemia with deficiency of S-adenosylhomocysteine hydrolase | 613752 |
| 13. | ALDH4A1 | 90.1% | Hyperprolinemia, type II | 239510 |
| 14. | ALDH6A1 | 93.4% | Methylmalonate semialdehyde dehydrogenase deficiencyᵇ | 614105 |
| 15. | AMT | 99.5% | Glycine encephalopathy (non-ketotic hyperglycinemia) | 605899 |
| 16. | ARG1 | 98.6% | Argininemiaᵇ | 207800 |
| 17. | ASL | 87.0% | Argininosuccinic acidemiaᵇ | 207900 |
| 18. | ASS1 | 98.6% | Citrullinemia type Iᵇ | 215700 |
| 19. | ATP7B | 95.4% | Wilson disease | 277900 |
| 20. | AUH | 96.8% | 3-methylglutaconic aciduria, type I | 250950 |
| 21. | BCKDHA | 97.0% | Maple syrup urine disease, type Iaᵇ | 248600 |
| 22. | BCKDHB | 99.8% | Maple syrup urine disease, type Ibᵇ | 248600 |
| 23. | BTD | 98.0% | Biotinidase deficiencyᵇ | 253260 |
| 24. | CBS | 94.5% | Homocystinuriaᵇ | 236200 |
| 25. | CD320 | 81.3% | Methylmalonic aciduria, transient, due to transcobalamin receptor defectᵇ | 613646 |
| 26. | CFTR | 96.7% | Cystic fibrosis | 219700 |
| 27. | CPS1 | 98.0% | Carbamoylphosphate synthetase I deficiency | 608307 |
| 28. | CPT1A | 99.8% | Carnitine palmitoyltransferase type I deficiency | 255120 |
| 29. | CPT2 | 99.7% | Carnitine palmitoyltransferase II deficiencyᵇ | 600649 |
| 30. | CYP21A2ᵃ (invalid) | NA | Congenital adrenal hyperplasiaᵇ | 201910 |
| 31. | DBT | 97.1% | Maple syrup urine disease, type IIᵇ | 248600 |
| 32. | DECR1 | 100% | 2,4-dienoyl-CoA reductase deficiency | 616034 |
| 33. | DLD | 99.9% | Maple syrup urine disease, type III (dihydrolipoamide dehydrogenase deficiency)ᵇ | 246900 |
| 34. | ETFA | 100% | Glutaric acidemia type IIᵇ | 231680 |
| 35. | ETFB | 88.1% | Glutaric acidemia type IIᵇ | 231680 |
| 36. | ETFDH | 99.8% | Glutaric acidemia type IIᵇ | 231680 |
| 37. | ETHE1 | 92.5% | Ethylmalonic encephalopathy | 602473 |
| 38. | FAH | 99.3% | Tyrosinaemia type Iᵇ | 276700 |
| 39. | GAA | 91.0% | Glycogen storage disease II (Pompe’s disease) | 232300 |
| 40. | GALE | 93.8% | Galactose epimerase deficiency | 230350 |
| 41. | GALK1 | 90.7% | Galactokinase deficiency with cataracts | 230200 |
| 42. | GALT | 97.5% | Classical galactosemiaᵇ | 230400 |
| 43. | GCDH | 93.9% | Glutaric acidemia type Iᵇ | 231670 |
| 44. | GCH1 | 92.7% | Dystonia, DOPA-responsive, with or without hyperphenylalaninemia; Hyperphenylalaninemia, BH4-deficient, B | 128230; 233910 |
| 45. | GCSH | 70.5% | Glycine encephalopathy (non-ketotic hyperglycinemia) | 605899 |
| 46. | GLDC | 93.1% | Glycine encephalopathy (non-ketotic hyperglycinemia) | 605899 |
| 47. | GLUD1 | 96.8% | Hyperinsulinism-hyperammonemia syndrome | 606762 |
| 48. | GNMT | 83.8% | Glycine N-methyltransferase deficiency | 606664 |
| 49. | HADH | 89.9% | 3-hydroxyacyl-CoA dehydrogenase deficiency (SCHAD deficiency, formerly) | 231530 |
| 50. | HADHA | 100% | Long-chain 3-hydroxyacyl-CoA dehydrogenase deficiency; Mitochondrial trifunctional protein deficiency | 609016; 609015 |
| 51. | HADHB | 98.5% | Trifunctional protein deficiency | 609015 |
| 52. | HLCS | 98.8% | Multiple carboxylase deficiency/Holocarboxylase synthetase deficiencyᵇ | 253270 |
| 53. | HMGCL | 99.9% | 3-hydroxy-3-methylglutaryl-CoA lyase deficiencyᵇ | 246450 |
| 54. | HMGCS2 | 97.0% | HMG-CoA synthase-2 deficiency | 605911 |
| 55. | HPD | 98.3% | Tyrosinemia type III | 276710 |
| 56. | HSD17B4 | 99.8% | D-bifunctional protein deficiency | 261515 |
| 57. | IDUAᵃ (invalid) | 55.9% | Mucopolysaccharidosis Ih/Hurler syndrome | 607014 |
| 58. | IVD | 94.6% | Isovaleric acidemiaᵇ | 243500 |
| 59. | MAT1A | 99.7% | Methionine adenosyltransferase I/III deficiency | 250850 |
| 60. | MCCC1 | 99.0% | 3-Methylcrotonyl-CoA carboxylase 1 deficiency | 210200 |
| 61. | MCCC2 | 99.9% | 3-Methylcrotonyl-CoA carboxylase 1 deficiency | 210200 |
| 62. | MCEE | 98.7% | Methylmalonyl-CoA epimerase deficiencyᵇ | 251120 |
| 63. | MLYCD | 83.3% | Malonyl-CoA decarboxylase deficiencyᵇ | 248360 |
| 64. | MMAA | 100% | Methylmalonic aciduria, type cblA, vitamin B12-responsiveᵇ | 251100 |
| 65. | MMAB | 98.3% | Methylmalonic aciduria, vitamin B12-responsive, due to defect in synthesis of adenosylcobalamin, cblB complementation typeᵇ | 251110 |
| 66. | MMACHC | 96.1% | Methylmalonic aciduria and homocystinuria, cblC typeᵇ | 277400 |
| 67. | MMADHC | 100% | Methylmalonic aciduria and homocystinuria, cblD type (alternative gene symbol C2orf25)ᵇ | 277410 |
| 68. | MMUT | 93.9% | Methylmalonic aciduria, mut(0) typeᵇ | 251000 |
| 69. | NADK2 | 99.0% | 2,4-dienoyl-CoA reductase deficiency | 616034 |
| 70. | NAGS | 70.2% | N-acetylglutamate synthase deficiency | 237310 |
| 71. | OTC | 95.7% | Ornithine transcarbamylase deficiency | 311250 |
| 72. | OXCT1 | 99.5% | Succinyl CoA:3-oxoacid CoA transferase deficiency | 245050 |
| 73. | PAH | 100% | Phenylketonuria due to phenylalanine hydroxylase deficiencyᵇ | 261600 |
| 74. | PCBD1 | 95.2% | Pterin-4α-carbinolamine dehydratase deficiency | 264070 |
| 75. | PCCA | 96.5% | Propionic acidemiaᵇ | 606054 |
| 76. | PCCB | 100% | Propionic acidemiaᵇ | 606054 |
| 77. | PPM1K | 99.8% | Maple syrup urine disease, mild variantᵇ | 615135 |
| 78. | PRODH | 84.1% | Hyperprolinemia, type I | 239500 |
| 79. | PTS | 95.1% | 6-pyruvoyl-tetrahydropterin synthase deficiencyᵇ | 261640 |
| 80. | QDPR | 97.7% | Dihydropteridine reductase deficiency | 261630 |
| 81. | SLC22A5 | 96.0% | Carnitine uptake deficiencyᵇ | 212140 |
| 82. | SLC25A13 | 97.5% | Neonatal-onset type II citrullinemia/Neonatal intrahepatic cholestasis caused by citrin deficiency (NICCD)ᵇ | 605814 |
| 83. | SLC25A15 | 100% | Hyperornithinemia-hyperammonemia-homocitrullinemia syndrome | 238970 |
| 84. | SLC25A20 | 100% | Carnitine-acylcarnitine translocase deficiencyᵇ | 212138 |
| 85. | SUCLA2 | 93.8% | Mitochondrial DNA depletion syndrome 5 (encephalomyopathic with or without methylmalonic aciduria)ᵇ | 612073 |
| 86. | SUCLG1 | 95.8% | Mitochondrial DNA depletion syndrome 9 (encephalomyopathic type with methylmalonic aciduria)ᵇ | 245400 |
| 87. | TAT | 97.5% | Tyrosinemia type II | 276600 |

ᵃ IDUA and CYP21A2 are invalid due to poor coverage and presence of a pseudogene, respectively. ᵇ IEM conditions included in the Hong Kong NBS program.
DNA extraction for DBS
For DBS collected on Whatman 903 Proteinsaver Cards (Cytiva, Vancouver, Canada), six 3.2 mm diameter DBS discs were used for DNA extraction. Extraction was performed manually using the column-based QIAamp DNA Mini Kit (QIAGEN, Germantown, USA), with modifications, using the manufacturer-supplied buffers ATL, AL, AW1, AW2 and EB, and proteinase K. First, 360 μL of buffer ATL for tissue lysis was added to the punched DBS discs, followed by incubation at 85 °C for 10 min; 40 μL of proteinase K stock solution was then added and incubated at 56 °C for 1 h; finally, 400 μL of buffer AL was added and incubated at 70 °C for 10 min. Next, 400 μL of ethanol (96–100%) was added to the mixture, which was applied to the QIAamp Mini spin column and centrifuged at 6,000×g for 1 min, with the filtrate discarded, followed by washing with AW1 and AW2. DNA was eluted from the column twice with the same 50 μL of buffer EB (QIAGEN, Germantown, USA), each time with 1 min of incubation at room temperature followed by centrifugation at 6,000×g for 1 min. The quantity and quality of the extracted DNA were assessed with the Qubit DNA High Sensitivity (HS) Assay Kit on a Qubit 2.0 Fluorometer and with a NanoDrop spectrophotometer (ThermoFisher Scientific, Waltham, USA), respectively.
Library preparation and targeted sequencing
Library preparation was carried out using the AmpliSeq Plus Library Prep Kit (Illumina) following the manufacturer’s protocol. About 7.5 ng of DNA was used to synthesize each library. AmpliSeq UD indexes were used for indexing. At the end, 27 μL of eluted library DNA was collected in a DNA LoBind tube (Eppendorf, Hamburg, Germany). NGS libraries were quantitated with the Qubit dsDNA HS assay. Library size distribution was checked with the Agilent High Sensitivity DNA Kit on a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, USA); a distinct peak between 300 and 400 bp should be observed, with a negligible level of contaminating DNA species. Each sequencing library was diluted to a concentration of 1 nM with buffer EB, and 16 libraries were pooled together. The pooled library was then diluted to 90 pM and spiked with 2% 90 pM PhiX DNA as a technical control. The pooled library was then loaded into the sequencing cartridge and subjected to massively parallel sequencing to generate 150 bp paired-end reads on the Illumina iSeq-100 System.
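The dilution to 1 nM described above requires converting the fluorometric concentration (ng/μL) and the mean fragment size into molarity. The following is a minimal sketch of that arithmetic, assuming the commonly used average mass of about 660 g/mol per double-stranded base pair; the function names and the example inputs are illustrative and are not taken from the kit protocol.

```python
# Sketch only: converts a dsDNA library concentration to molarity and
# computes a simple C1*V1 = C2*V2 dilution. Names are hypothetical.

AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Return the library molarity in nM from ng/uL and mean fragment size (bp)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)


def dilution_volumes(stock_nm: float, target_nm: float, final_volume_ul: float):
    """Return (library volume, diluent volume) in uL to reach target_nm."""
    if target_nm > stock_nm:
        raise ValueError("target concentration exceeds stock concentration")
    library_ul = target_nm * final_volume_ul / stock_nm
    return library_ul, final_volume_ul - library_ul


if __name__ == "__main__":
    # Illustrative library: 5.0 ng/uL with a 375 bp mean fragment size.
    stock = library_molarity_nm(5.0, 375)
    lib_ul, buffer_ul = dilution_volumes(stock, target_nm=1.0, final_volume_ul=20.0)
    print(f"stock = {stock:.1f} nM; mix {lib_ul:.2f} uL library + {buffer_ul:.2f} uL buffer EB")
```

In practice very small pipetting volumes would usually be avoided by using an intermediate dilution; that step is left out of the sketch.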
Data analysis
The iSeq-100 built-in software, DNA Amplicon, generated FASTQ output files from the raw reads, performed the initial alignment and determined QC metrics. The Sequencing Analysis Viewer (SAV; Illumina) and FastQC (Bioinformatics Group, Babraham Institute, Cambridge, UK) applications were also used to monitor the sequencing QC metrics of the pooled and individual libraries, respectively.
We employed a commercial pipeline, the NextGENe software (v2.4.2.3; Softgenetics, State College, USA), to perform read alignment against the reference genome (GRCh37/hg19), filtering and variant calling. Data analyses targeted all the coding exons ±20 bp to include the splice-site intronic regions. For variant calling, the criteria were >20% mutation frequency, >3 allele count and >5 total coverage count; homozygous variants were exempted from these cutoffs; in-read phasing (merging) of adjacent variants with a maximum gap of 1 bp and a phaseable read percentage of >50% was allowed. Additional custom filtering criteria were imposed to minimize false-positive rates, using a proprietary quality score (cutoff = 4) composed of an adjusted coverage score, a mismatch score and a wrong allele score. A variant was regarded as valid only if a minimum of 20 reads was observed at its position. A mutation report (VCF file) was created annotating all variants. Human Gene Mutation Database (HGMD) Professional (QIAGEN, Germantown, USA), Alamut Visual (Sophia Genetics SA, Saint-Sulpice, Switzerland) and Integrative Genomics Viewer (IGV) [13] were used for further characterization of the variants and raw reads. Single nucleotide variants (SNVs) and small indels were targeted for curation.
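The read-count cutoffs described above can be expressed as a simple filter. This is only a sketch of the stated thresholds, not the NextGENe implementation: the `VariantCall` record and its `homozygous` flag are hypothetical, and the proprietary quality score and in-read phasing are deliberately left out because their details are not given.

```python
from dataclasses import dataclass

# Thresholds as stated in the text; everything else here is an assumption.
MIN_ALT_FRACTION = 0.20   # ">20% mutation frequency"
MIN_ALT_READS = 3         # ">3 allele count"
MIN_TOTAL_READS = 5       # ">5 total coverage count"
MIN_VALID_DEPTH = 20      # a variant is valid only with at least 20 reads


@dataclass
class VariantCall:
    """Hypothetical minimal record for one called variant."""
    total_reads: int
    alt_reads: int
    homozygous: bool = False


def passes_calling_cutoffs(v: VariantCall) -> bool:
    """Apply the frequency/count cutoffs; homozygous calls are exempt."""
    if v.homozygous:
        return True
    if v.total_reads <= MIN_TOTAL_READS:
        return False
    alt_fraction = v.alt_reads / v.total_reads
    return v.alt_reads > MIN_ALT_READS and alt_fraction > MIN_ALT_FRACTION


def is_reportable(v: VariantCall) -> bool:
    """A call is kept only if it passes the cutoffs and has >= 20 reads."""
    return passes_calling_cutoffs(v) and v.total_reads >= MIN_VALID_DEPTH


if __name__ == "__main__":
    for call in (VariantCall(100, 45), VariantCall(100, 15), VariantCall(15, 8)):
        print(call, "->", is_reportable(call))
```

Keeping the calling cutoffs and the 20-read validity rule in separate functions mirrors the two-stage description in the text and makes it easy to see which rule rejected a call.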
External quality assurance (EQA) program
We joined the European Molecular Genetics Quality Network (EMQN) pilot EQA scheme for NGS (Germline; 2020) as EQA. The provided DNA sample was sequenced in another 16-plex run, and the BED file of the original panel design covering the whole-panel, instead of coding exons only, was given to the EQA program to analyze the data.
Results
DNA extraction, library preparation, and sequencing metrics
All libraries showed satisfactory yield and quality: concentrations ranged from 2.67 to 11.6 ng/μL and average sizes ranged from 359 to 390 bp, with a distinct and dominant peak between 300 and 400 bp. The total yield of each run ranged from 1.74 to 1.92 gigabases. The overall sequencing quality of the three runs was good, with %Q30 between 91 and 92% (Table 3). The PhiX alignment rate was between 1.3 and 1.4% (compared with the expected 2%; Table 3). All individual libraries passed the FastQC metrics except one, which showed a distinctly high duplication level. This sample also showed low uniformity (defined as the percentage of bases with >0.2× mean coverage) and was therefore regarded as a failure.
Table 3. Summary of quality control (QC) metrics of the three batches of 16-plex runs.
| Sample ID | Coverage (×) | Reads on targets | Uniformity |
|---|---|---|---|
| Batch 1 (Q30 = 91.1%; PhiX alignment rate = 1.3%) | | | |
| RM8398 | 211 | 644,933 | 98.1% |
| RM8393 | 192 | 583,892 | 97.7% |
| Batch 2 (Q30 = 91.6%; PhiX alignment rate = 1.4%) | | | |
| RM8398 | 191 | 582,074 | 96.8% |
| RM8393 | 221 | 676,909 | 97.1% |
| Batch 3 (Q30 = 91.8%; PhiX alignment rate = 1.4%) | | | |
| RM8398 | 220 | 678,307 | 94.4% |
| RM8393 | 176 | 540,091 | 96.1% |
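The uniformity metric used in this section (percentage of bases with >0.2× mean coverage) and the fraction of bases at ≥20× can both be derived from a per-base depth list. The sketch below assumes such a list is already available, for example exported from an alignment viewer; the function names and the toy depth values are illustrative only.

```python
from statistics import mean


def fraction_at_depth(depths, min_depth=20):
    """Fraction of positions whose read depth is at least min_depth."""
    return sum(d >= min_depth for d in depths) / len(depths)


def uniformity(depths, fraction_of_mean=0.2):
    """Fraction of positions with depth above fraction_of_mean x mean depth."""
    threshold = fraction_of_mean * mean(depths)
    return sum(d > threshold for d in depths) / len(depths)


if __name__ == "__main__":
    # Toy per-base depths (not study data).
    depths = [200, 210, 190, 30, 10, 0, 220, 205, 195, 240]
    print(f"mean depth      : {mean(depths):.0f}x")
    print(f">=20x fraction  : {fraction_at_depth(depths):.1%}")
    print(f"uniformity (0.2): {uniformity(depths):.1%}")
```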
Coverage and repeatability
After alignment against the reference genome (GRCh37/hg19), each library yielded on average about 640,000 reads on target, with an average coverage of 208× ± 20 (1 SD) (Table 3). For intra-run repeatability, the triplicated sample had mean coverages of 195×, 206×, and 223× in the three runs, corresponding to coefficients of variation (CV) of 9.1, 10.5, and 5.2%, respectively. Combining these to calculate inter-run repeatability, the average coverage was 208× with a CV of 10.1%.
For each of the 87 genes, we looked at the percentage of >20× coverage in the reportable region across the three runs (Table 2). Seventy-five genes achieved an average of ≥90% >20× coverage, of which 8 genes achieved 100% >20× coverage in all three runs. Another eight genes had between 80 and 90% >20× coverage and were also usable for reporting. CYP21A2 was disqualified because of its very high homology with its pseudogene CYP21A1P. Another gene, IDUA, had low >20× coverage at 55.9% and will not be used for reporting. NAGS and GCSH had the next lowest >20× coverage at around 70% but were still considered usable for reporting.
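The repeatability figures above are coefficients of variation (CV = SD / mean × 100%). A minimal sketch of the calculation follows, assuming the sample standard deviation is used; the individual replicate coverages are not listed in the text, so the triplicate values in the example are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent, using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0


if __name__ == "__main__":
    # Hypothetical triplicate coverages from one run (not study data).
    replicates = [180, 195, 210]
    print(f"intra-run CV: {coefficient_of_variation(replicates):.1f}%")

    # The three per-run means reported in the text average to the overall mean.
    run_means = [195, 206, 223]
    print(f"mean of run means: {mean(run_means):.0f}x")
```

Note that a CV computed from the three run means alone would not necessarily match the reported inter-run CV, which may have been derived from the individual replicates.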
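The per-gene grouping described above can be reproduced by binning the coverage percentages from Table 2. The sketch below uses a handful of values copied from Table 2; the tier labels and function name are illustrative. It only bins genes by coverage and does not encode the reporting decision, which in the text also depended on judgement (for example, NAGS and GCSH were kept while IDUA and CYP21A2 were excluded).

```python
from collections import defaultdict


def coverage_tier(pct_at_20x: float) -> str:
    """Bin a gene by the percentage of its reportable region at >20x."""
    if pct_at_20x >= 90.0:
        return ">=90%"
    if pct_at_20x >= 80.0:
        return "80-90%"
    return "<80%"


def group_by_tier(gene_coverage):
    """Return {tier: sorted list of genes} for a {gene: percent} mapping."""
    tiers = defaultdict(list)
    for gene, pct in gene_coverage.items():
        tiers[coverage_tier(pct)].append(gene)
    return {tier: sorted(genes) for tier, genes in tiers.items()}


if __name__ == "__main__":
    # Subset of Table 2 values.
    subset = {"ACADM": 100.0, "PAH": 100.0, "ASL": 87.0, "CD320": 81.3,
              "GCSH": 70.5, "NAGS": 70.2, "IDUA": 55.9}
    for tier, genes in group_by_tier(subset).items():
        print(tier, genes)
```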
Sensitivity
We used the variants in the reference genomes RM8398 (ethnic Utah/Mormon) and RM8393 (ethnic Chinese) to evaluate the sensitivity of the panel, defined as the proportion of true-positive variants called. The sensitivities were 90.9% (442 out of 486 variants; SNV and indel combined) and 91.4% (367 out of 401 variants; SNV and indel combined), respectively, which were considered acceptable. Specificity was calculated as one minus the proportion of false-positive calls over the total number of bases in the reporting region achieving >20× coverage, giving specificities of 99.94% (98 false-positives out of 163,932 bases) and 99.95% (87 false-positives out of 163,932 bases) for RM8398 and RM8393, respectively. Many of the false-positives were found near the 3′ end of the mapped reads, with lower coverage and quality scores, probably because of a drop in sequencing quality near the 3′ end. These assay-specific artefacts were observed recurrently in all samples and could be easily identified and excluded using a genome browser during NGS data reporting.
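The two metrics above reduce to simple ratios. The following sketch recomputes them from the counts quoted in the text; the function names are illustrative, and the per-base definition of specificity (one minus false-positives over evaluated bases) is the reading of the text that reproduces the reported percentages.

```python
def sensitivity(true_positives: int, expected_variants: int) -> float:
    """Proportion of known reference variants that were called."""
    return true_positives / expected_variants


def per_base_specificity(false_positives: int, evaluated_bases: int) -> float:
    """One minus the false-positive calls per evaluated base."""
    return 1.0 - false_positives / evaluated_bases


if __name__ == "__main__":
    # Counts quoted in the text.
    print(f"RM8398 sensitivity: {sensitivity(442, 486):.1%}")
    print(f"pooled sensitivity: {sensitivity(442 + 367, 486 + 401):.1%}")
    print(f"RM8398 specificity: {per_base_specificity(98, 163_932):.2%}")
    print(f"RM8393 specificity: {per_base_specificity(87, 163_932):.2%}")
```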
EQA
The results returned from the EMQN pilot EQA scheme for NGS (2020) were satisfactory (Table 4), with a median depth at called variants of 178×. The sensitivities for SNVs and indels were 97.89 and 81.58%, respectively. The precisions were lower for both SNVs and indels, at 77.44 and 53.4%, respectively, because the whole-panel BED file was submitted for analysis, which includes more non-coding regions and the 3′ ends of reads. The latter are known to be more likely to exhibit lower sequencing quality and PCR artefacts. Artefacts at the 3′ end are also more likely to occur in tandem, making false indel calls more likely than false SNP calls. Fortunately, these artefacts can be easily identified using a genome browser. Since we were the only participant using the iSeq and the custom AmpliSeq panel (out of 293 participating laboratories, 77.6% of which used one of the Illumina NGS platforms), our results cannot be directly compared with the peer group. Our EQA results show good sensitivity and acceptable precision, both of which can be improved if the filtering criteria are optimized and only coding exons ±20 bp are analyzed.
Table 4. Result summary of the European Molecular Genetics Quality Network (EMQN) pilot external quality assurance (EQA) scheme for next generation sequencing (NGS) (germline 2020).
| Metric | Value |
|---|---|
| FASTQ filesᵃ | |
| Bases above Q30 quality score | 94.1% |
| Average base quality (Phred scale) | 35.9 |
| BAM fileᵃ | |
| Uniformityᵇ | 97.4% |
| Off target | 2.4% |
| Error rate on target | 0.0032 |
| Coverage at 20× | 97.4% |
| VCF fileᵃ | |
| Depth at calls (median) | 178× |
| SNP detectionᵃ | |
| True-positives | 278 |
| False-positives | 81 |
| False-negatives | 6 |
| Sensitivityᶜ | 97.9% |
| Precisionᵈ | 77.4% |
| F-score | 86.5% |
| Indel detectionᵃ | |
| True-positives | 31 |
| False-positives | 27 |
| False-negatives | 7 |
| Sensitivityᶜ | 81.6% |
| Precisionᵈ | 53.5% |
| F-score | 64.6% |

ᵃ We submitted the FASTQ sequence data files together with the BAM and VCF result files to EMQN, which then evaluated the SNP and indel detection performance. ᵇ The percentage of bases on target covered at 0.1 × median coverage. ᶜ Proportion of actual positives that are correctly identified as such. ᵈ Proportion of actual positives among all reported positives. EMQN, European Molecular Genetics Quality Network; EQA, external quality assurance; SNP, single nucleotide polymorphism; NGS, next generation sequencing.
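The sensitivity, precision and F-score rows of the EQA summary follow from the true-positive, false-positive and false-negative counts. The sketch below recomputes them using the standard definitions (the F-score taken as the harmonic mean of precision and sensitivity, which is an assumption about how EMQN derived it); the function name is illustrative.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Return sensitivity, precision and F-score from confusion counts."""
    sens = tp / (tp + fn)
    prec = tp / (tp + fp)
    f_score = 2 * prec * sens / (prec + sens)
    return {"sensitivity": sens, "precision": prec, "f_score": f_score}


if __name__ == "__main__":
    # Counts from the EQA summary table.
    for label, counts in (("SNP", (278, 81, 6)), ("indel", (31, 27, 7))):
        m = detection_metrics(*counts)
        print(label, {name: f"{value:.1%}" for name, value in m.items()})
```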
Turnaround time (TAT)
For a 16-plex run, DNA preparation from DBS can be finished in half a day (Day 1). Library synthesis and pooling can be accomplished within 1.5 days (Days 1 and 2), followed by overnight sequencing for about 22 h. Bioinformatic analysis and QC procedures can be done within 3 h on Day 3. With normally only a few variants per gene being discovered and one to two genes per sample needing to be interpreted, an optimized workflow should allow reports to be ready in about 2.5 working days.
Discussion
In this study, we have validated the analytical performance of 16-plex iSeq sequencing using an 87-IEM-gene AmpliSeq panel. The panel was intended for second-tier NBS and other IEM diseases, targeting variants in coding exons and splice sites. Among the 87 genes, 85 (all except IDUA and CYP21A2) were evaluated as having satisfactory performance for clinical reporting. The assay achieved a sensitivity of 91.15%, a specificity of 99.95%, and satisfactory repeatability and EQA results. Seventy-five genes had over 90% per-gene >20× coverage based on the reportable region. The whole workflow can be completed with a 2.5-day TAT. Together with the two-day TAT for first-tier MS/MS reporting, a total TAT of five days could be achieved for the whole NBS system.
Optimizing sample preparation and scrutiny
DNA extraction was performed manually using a column-based method; in our experience, the minimum yield was about 60 ng, which is more than required for quantitation and library construction (about 12 ng). The excess DNA should be enough for repeat testing or other tests such as independent single nucleotide polymorphism (SNP) genotyping assays. Considering the urgency of NBS reporting, automated DNA extraction with magnetic particles could be an alternative method with the advantages of better speed, yield, purity and consistency, as well as a reduced chance of sample swapping. The complicated sample and library preparation workflow in targeted NGS introduces the possibility of sample mix-up or cross-contamination [14, 15]. Therefore, procedures to verify sample identity are important when implementing the test, in order to minimize the chance of misreporting. Possible approaches include an in silico identity check that looks up a set of highly heterogeneous SNP genotypes and predicts sex from the sequencing reads; the combined SNP genotype should not be identical for any two samples in a batch of 16, and the predicted sex should match the reported sex. Furthermore, independent PCR-based SNP genotyping assays can be performed and the result compared with that from the in silico SNP identification pipeline, in order to minimize the chance of sample mix-up or swapping.
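The in silico identity check outlined above amounts to two comparisons per batch: no two samples should share the same combined SNP genotype, and each predicted sex should agree with the reported sex. The following is a minimal sketch under the assumption that genotypes and sex predictions have already been extracted from the reads; the data structures, sample names and function names are hypothetical.

```python
from collections import defaultdict


def find_identical_profiles(profiles):
    """Return groups of sample IDs that share an identical SNP genotype profile.

    profiles maps sample ID -> tuple of genotype strings at a fixed SNP set.
    """
    groups = defaultdict(list)
    for sample_id, genotypes in profiles.items():
        groups[tuple(genotypes)].append(sample_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


def sex_mismatches(predicted, reported):
    """Return sample IDs whose predicted sex differs from the reported sex."""
    return sorted(s for s in predicted if predicted[s] != reported.get(s))


if __name__ == "__main__":
    # Hypothetical batch excerpt (not study data).
    profiles = {
        "S01": ("A/G", "C/C", "T/C"),
        "S02": ("A/A", "C/T", "T/T"),
        "S03": ("A/G", "C/C", "T/C"),  # same profile as S01: needs review
    }
    predicted = {"S01": "F", "S02": "M", "S03": "F"}
    reported = {"S01": "F", "S02": "F", "S03": "F"}
    print("identical profiles:", find_identical_profiles(profiles))
    print("sex mismatches    :", sex_mismatches(predicted, reported))
```

Intentional replicates of the same sample within a run would naturally share a profile and would need to be excluded from the duplicate check.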
Second-tier NBS test through targeted NGS
The introduction of NGS would be beneficial to current NBS practice. First-tier MS/MS and biochemical approaches allow detection of multiple IEM, including amino acid, organic acid and fatty acid disorders. However, these tests may suffer from relatively high false-positive or false-negative rates. These may be due to factors such as prematurity, total parenteral nutrition, maternal defects and the age of disease onset, which affect metabolite concentrations and lead to false results. Although two-tier testing might help to screen out suspected cases for specific IEM, only limited applications are available. To complement the current NBS approach, targeted NGS could be implemented as a second-tier test. The drawbacks of first-tier biochemical tests could be addressed through the integration of targeted NGS, minimizing the false-positive or false-negative results caused by borderline metabolite concentrations.
According to the statistics of Hong Kong NBS since 2015, several conditions have a relatively high incidence of false-positive or false-negative results [4]; these conditions would therefore be excellent candidates for second-tier testing using our assay. For example, our NGS assay could be used as a second-tier test for samples with a borderline free carnitine level (e.g., 5.0–6.5 μmol/L). The absence of pathogenic variants in the SLC22A5 gene, the causative gene of systemic primary carnitine deficiency [16], could reduce false-positive cases.
On the other hand, false-negative cases have frequently been reported for citrin deficiency in Hong Kong [4] and elsewhere [17, 18]. By adopting second-tier genetic testing, we can extend detection to patients with citrin deficiency who have a relatively mild citrulline elevation, higher than the upper reference limit but below the NBS reporting cutoff (e.g., a citrulline level of 25–35 μmol/L).
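The two examples above suggest a simple reflex rule: a first-tier result falling in a borderline range triggers second-tier sequencing of the corresponding gene. The sketch below illustrates that idea only. The ranges are the example values quoted in the text rather than validated cutoffs, treating the range boundaries as inclusive is an assumption, and the mapping and function names are hypothetical (SLC25A13 is the citrin deficiency gene listed in Table 2).

```python
# Example borderline ranges (umol/L) quoted in the text; not validated cutoffs.
BORDERLINE_RANGES_UMOL_L = {
    "free_carnitine": (5.0, 6.5),
    "citrulline": (25.0, 35.0),
}

# Gene to review when the analyte is borderline.
GENE_FOR_ANALYTE = {
    "free_carnitine": "SLC22A5",
    "citrulline": "SLC25A13",
}


def second_tier_genes(first_tier_results):
    """Return genes to review for analytes in a borderline range (inclusive)."""
    genes = []
    for analyte, value in first_tier_results.items():
        bounds = BORDERLINE_RANGES_UMOL_L.get(analyte)
        if bounds is not None and bounds[0] <= value <= bounds[1]:
            genes.append(GENE_FOR_ANALYTE[analyte])
    return genes


if __name__ == "__main__":
    print(second_tier_genes({"free_carnitine": 5.8, "citrulline": 12.0}))
    print(second_tier_genes({"free_carnitine": 20.0, "citrulline": 30.0}))
```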
Limitations
Short-read mapping is not suitable for genes with highly homologous genomic regions such as pseudogenes [19]; in this panel, analysis of CYP21A2 is confounded by its pseudogene CYP21A1P. While the assay can pick up most of the pathogenic variants in the exonic and splice regions, variants within promoter regions, deep intronic regions, or regulatory elements outside the targeted regions cannot be detected. Also, gross insertions and deletions within the target regions could not be detected by this assay at the time of reporting. These variants could possibly be detected using another optimized post-analytical pipeline specifically designed for copy number variant (CNV) detection. An amplicon-based panel might also be susceptible to allele drop-out, caused by SNVs in primer binding sites that impair primer annealing [20, 21] or by secondary structure formation in non-primer regions of the amplicon [22], resulting in insufficient coverage and false-negative results.
Conclusions
In conclusion, we have successfully validated a custom AmpliSeq panel for IEM. This assay should be practical for detecting SNVs and small indels as a second-tier test in NBS. Further work will be required to optimize the performance and workflow for its application in the Hong Kong NBS program.
Acknowledgments
We would like to thank the Genetics and Genomics Division, Department of Pathology, Hong Kong Children’s Hospital for providing expert opinion and technical support for this work.
Research funding: None declared.
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: Authors state no conflict of interest.
Informed consent: Not applicable.
Ethical approval: Not applicable.
References
1. Therrell, BL, Padilla, CD, Loeber, JG, Kneisser, I, Saadallah, A, Borrajo, GJ, et al. Current status of newborn screening worldwide: 2015. Semin Perinatol 2015;39:171–87. https://doi.org/10.1053/j.semperi.2015.03.002.
2. Lehmann, S, Delaby, C, Vialaret, J, Ducos, J, Hirtz, C. Current and future use of “dried blood spot” analyses in clinical chemistry. Clin Chem Lab Med 2013;51:1897–909. https://doi.org/10.1515/cclm-2013-0228.
3. Feuchtbaum, L, Carter, J, Dowray, S, Currier, RJ, Lorey, F. Birth prevalence of disorders detectable through newborn screening by race/ethnicity. Genet Med 2012;14:937–45. https://doi.org/10.1038/gim.2012.76.
4. The Task Force on the Pilot Study of Newborn Screening for Inborn Errors of Metabolism, Hong Kong SAR. Evaluation of the 18-month “Pilot Study of Newborn Screening for Inborn Errors of Metabolism” in Hong Kong. HK J Paediatr (New Ser) 2020;25:16–22.
5. Lindner, M, Gramer, G, Haege, G, Fang-Hoffmann, J, Schwab, KO, Tacke, U, et al. Efficacy and outcome of expanded newborn screening for metabolic diseases--report of 10 years from South-West Germany. Orphanet J Rare Dis 2011;6:44. https://doi.org/10.1186/1750-1172-6-44.
6. Lim, JS, Tan, ES, John, CM, Poh, S, Yeo, SJ, Ang, JS, et al. Inborn Error of Metabolism (IEM) screening in Singapore by electrospray ionization-tandem mass spectrometry (ESI/MS/MS): an 8 year journey from pilot to current program. Mol Genet Metabol 2014;113:53–61. https://doi.org/10.1016/j.ymgme.2014.07.018.
7. Lindner, M, Abdoh, G, Fang-Hoffmann, J, Shabeck, N, Al-Sayrafi, M, Al-Janahi, M, et al. Implementation of extended neonatal screening and a metabolic unit in the State of Qatar: developing and optimizing strategies in cooperation with the Neonatal Screening Center in Heidelberg. J Inherit Metab Dis 2007;30:522–9. https://doi.org/10.1007/s10545-007-0553-7.
8. Wang, LY, Chen, NI, Chen, PW, Chiang, SC, Hwu, WL, Lee, NC, et al. Newborn screening for citrin deficiency and carnitine uptake defect using second-tier molecular tests. BMC Med Genet 2013;14:24. https://doi.org/10.1186/1471-2350-14-24.
9. Matern, D, Tortorelli, S, Oglesbee, D, Gavrilov, D, Rinaldo, P. Reduction of the false-positive rate in newborn screening by implementation of MS/MS-based second-tier tests: the Mayo Clinic experience (2004-2007). J Inherit Metab Dis 2007;30:585–92. https://doi.org/10.1007/s10545-007-0691-y.
10. Hollegaard, MV, Grauholm, J, Nielsen, R, Grove, J, Mandrup, S, Hougaard, DM. Archived neonatal dried blood spot samples can be used for accurate whole genome and exome-targeted next-generation sequencing. Mol Genet Metabol 2013;110:65–72. https://doi.org/10.1016/j.ymgme.2013.06.004.
11. Bodian, DL, Klein, E, Iyer, RK, Wong, WS, Kothiyal, P, Stauffer, D, et al. Utility of whole-genome sequencing for detection of newborn screening disorders in a population cohort of 1,696 neonates. Genet Med 2016;18:221–30. https://doi.org/10.1038/gim.2015.111.
12. Samorodnitsky, E, Jewell, BM, Hagopian, R, Miya, J, Wing, MR, Lyon, E, et al. Evaluation of hybridization capture versus amplicon-based methods for whole-exome sequencing. Hum Mutat 2015;36:903–14. https://doi.org/10.1002/humu.22825.
13. Robinson, JT, Thorvaldsdóttir, H, Wenger, AM, Zehir, A, Mesirov, JP. Variant review with the integrative genomics viewer. Cancer Res 2017;77:e31–4. https://doi.org/10.1158/0008-5472.can-17-0337.
14. Koboldt, DC, Ding, L, Mardis, ER, Wilson, RK. Challenges of sequencing human genomes. Briefings Bioinf 2010;11:484–98. https://doi.org/10.1093/bib/bbq016.
15. Wang, PP, Parker, WT, Branford, S, Schreiber, AW. BAM-matcher: a tool for rapid NGS sample matching. Bioinformatics 2016;32:2699–701. https://doi.org/10.1093/bioinformatics/btw239.
16. Lee, NC, Tang, NL, Chien, YH, Chen, CA, Lin, SJ, Chiu, PC, et al. Diagnoses of newborns and mothers with carnitine uptake defects through newborn screening. Mol Genet Metabol 2010;100:46–50. https://doi.org/10.1016/j.ymgme.2009.12.015.
17. Ohura, T, Kobayashi, K, Tazawa, Y, Abukawa, D, Sakamoto, O, Tsuchiya, S, et al. Clinical pictures of 75 patients with neonatal intrahepatic cholestasis caused by citrin deficiency (NICCD). J Inherit Metab Dis 2007;30:139–44. https://doi.org/10.1007/s10545-007-0506-1.
18. Shigetomi, H, Tanaka, T, Nagao, M, Tsutsumi, H. Early detection and diagnosis of neonatal intrahepatic cholestasis caused by citrin deficiency missed by newborn screening using tandem mass spectrometry. Int J Neonatal Screen 2018;4:5. https://doi.org/10.3390/ijns4010005.
19. Trier, C, Fournous, G, Strand, JM, Stray-Pedersen, A, Pettersen, RD, Rowe, AD. Next-generation sequencing of newborn screening genes: the accuracy of short-read mapping. NPJ Genom Med 2020;5:36. https://doi.org/10.1038/s41525-020-00142-z.
20. Shestak, AG, Bukaeva, AA, Saber, S, Zaklyazminskaya, EV. Allelic dropout is a common phenomenon that reduces the diagnostic yield of PCR-based sequencing of targeted gene panels. Front Genet 2021;12:620337. https://doi.org/10.3389/fgene.2021.620337.
21. Zucca, S, Villaraggia, M, Gagliardi, S, Grieco, GS, Valente, M, Cereda, C, et al. Analysis of amplicon-based NGS data from neurological disease gene panels: a new method for allele drop-out management. BMC Bioinf 2016;17:339. https://doi.org/10.1186/s12859-016-1189-0.
22. Lam, CW, Mak, CM. Allele dropout caused by a non-primer-site SNV affecting PCR amplification--a call for next-generation primer design algorithm. Clin Chim Acta 2013;421:208–12. https://doi.org/10.1016/j.cca.2013.03.014.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/labmed-2021-0115).
© 2021 Kwok Yeung Tsang et al., published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.