

ORIGINAL ARTICLE

Implementation of Amplicon Parallel Sequencing Leads to Improvement of Diagnosis and Therapy of Lung Cancer Patients

Katharina König, PhD,*†‡§ Martin Peifer,§‖ Jana Fassunke, PhD,*†‡ Michaela A. Ihle, Dipl-Biol,*†‡ Helen Künstlinger, Dipl-Biol,*†‡ Carina Heydt, MSc,*†‡ Katrin Stamm, PhD,¶ Frank Ueckeroth,*†‡ Claudia Vollbrecht, BSc,†‡ Marc Bos, MD,†‡ Masyar Gardizi, MD,†‡ Matthias Scheffler, MD,†‡ Lucia Nogova, MD,†‡ Frauke Leenders, MD,# Kerstin Albus, MSc,*†‡ Lydia Meder, MSc,*†‡ Kerstin Becker, PhD,** Alexandra Florin, MA,*†‡ Ursula Rommerscheidt-Fuss,*†‡ Janine Altmüller, PhD,**‖ Michael Kloth, MD,*†‡ Peter Nürnberg, Prof,** Thomas Henkel, PhD,¶ Sven-Ernö Bikár, PhD,†† Martin L. Sos, PhD,† William J. Geese, PhD,‡‡ Lewis Strauss,‡‡ Yon-Dschun Ko, MD,‡§ Ulrich Gerigk, MD,‡‖ Margarete Odenthal, PhD,*†‡ Thomas Zander, MD,†‡ Jürgen Wolf, Prof,†‡ Sabine Merkelbach-Bruse, MD,*†‡ Reinhard Buettner, MD,*†‡ and Lukas C. Heukamp, PhD, MD*†‡#***

*Institute of Pathology, University Hospital of Cologne, Cologne, Germany; †Center for Integrated Oncology Cologne/Bonn, University Hospital of Cologne, Cologne, Germany; ‡Lung Cancer Group Cologne, Department I for Internal Medicine, University Hospital of Cologne, Cologne, Germany; §Labor Dr. Quade und Kollegen GmbH, Cologne, Germany; ║Institute of Human Genetics, University of Cologne, Cologne, Germany; ¶Targos Molecular Pathology GmbH, Kassel, Germany; #Department of Translational Genomics, University of Cologne, Cologne, Germany; **Cologne Center for Genomics, Cologne, Germany; ††GENterprise GENOMICS GmbH, Mainz, Germany; ‡‡Bristol-Myers Squibb R&D, Princeton, New Jersey; §§Evangelische Kliniken Johanniter, Bonn, Germany; ║║Malteser Krankenhaus, Lung Cancer Center, Bonn, Germany; ¶¶Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany; ##Institute for Hematopathology Hamburg, Hamburg, Germany; and ***New Oncology, Cologne, Germany.

König, Peifer, Fassunke, Buettner, and Heukamp contributed equally to this article.

Disclosure: The authors declare no conflict of interest.

Address for correspondence: Katharina König, Institute of Pathology, University Hospital Cologne, Kerpenerstraße 62, Köln 50937, Germany. E-mail: [email protected]

DOI: 10.1097/JTO.0000000000000570

Copyright © 2015 by the International Association for the Study of Lung Cancer

ISSN: 1556-0864/15/1007-1049

Journal of Thoracic Oncology® • Volume 10, Number 7, July 2015

Introduction: The Network Genomic Medicine Lung Cancer was set up to rapidly translate scientific advances into early clinical trials of targeted therapies in lung cancer, performing molecular analyses of more than 3500 patients annually. Because sequential analysis of the relevant driver mutations on fixated samples is challenging in terms of workload, tissue availability, and cost, we established multiplex parallel sequencing in routine diagnostics. The aim was to analyze all therapeutically relevant mutations in lung cancer samples in a high-throughput fashion while significantly reducing turnaround time and amount of input DNA compared with conventional dideoxy sequencing of single polymerase chain reaction amplicons.

Methods: In this study, we demonstrate the feasibility of a 102 amplicon multiplex polymerase chain reaction followed by sequencing on an Illumina sequencer on formalin-fixed paraffin-embedded tissue in routine diagnostics. Analysis of a validation cohort of 180 samples showed this approach to require significantly less input material and to be more reliable, robust, and cost-effective than conventional dideoxy sequencing. Subsequently, 2657 lung cancer patients were analyzed.

Results: We observed that comprehensive biomarker testing provided novel information in addition to histological diagnosis and clinical staging. In 2657 consecutively analyzed lung cancer samples, we identified driver mutations at the expected prevalence. Furthermore, we found potentially targetable DDR2 mutations at a frequency of 3% in both adenocarcinomas and squamous cell carcinomas.

Conclusion: Overall, our data demonstrate the utility of systematic sequencing analysis in a clinical routine setting and highlight the dramatic impact of such an approach on the availability of therapeutic strategies for the targeted treatment of individual cancer patients.

Key Words: Lung cancer, Next-generation sequencing, Routine diagnostic, Formalin-fixed, Amplicon.

(J Thorac Oncol. 2015;10: 1049–1057)

Lung cancer is histologically divided into small-cell lung carcinomas and non–small-cell lung carcinomas (NSCLCs). NSCLC comprises approximately 85% of newly diagnosed lung cancers, subclassified into adenocarcinoma (AD; ~50%), squamous cell carcinoma (SCC; ~30%), and large cell carcinoma (20%).1 Recently, treatment paradigms shifted from those based on mere morphology to one that incorporates actionable genomic alterations.2

The discovery of actionable mutations in the Epidermal Growth Factor Receptor (EGFR) gene and rearrangements of the Anaplastic Lymphoma Kinase (ALK) that are primarily found in AD patients have led to a remarkable improvement in overall survival of genetically selected patients.3,4 Also in SCC and small-cell lung carcinomas, several novel potential driver mutations are currently being evaluated as potential actionable targets,5,6 for example, amplification of Fibroblast Growth Factor Receptor 1 (FGFR1).7–9 However, the translation of these scientific advances is currently hampered by the limited availability of genomic information on tumor material in clinical routine diagnostics.

Due to whole genome sequencing efforts in lung cancer,8,10–12 the number of newly identified potential driver mutations,13 for example, Fibroblast Growth Factor 3, RET, and ROS,14 in lung cancer has increased steadily.15 The recently identified subgroups are small, comprising 0.5% to 3% of the lung cancer patients16,17 compared with the most frequently activated oncogene mutation in the V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog18 gene found in 15% to 30% of NSCLC patients.19 Thus, routine molecular diagnostics for all actionable and potentially actionable driver mutations remains a growing challenge.

At present, mutation analysis of formalin-fixed paraffin-embedded (FFPE) tumor tissue is mainly based on single sequential polymerase chain reaction (PCR) followed by conventional dideoxy or Sanger sequencing (SS), which is both labor- and cost-intensive and limited by sensitivity and the amount of available DNA. Since we aimed to sequence the entire 3.2-kb coding region of Discoidin domain-containing receptor 2 (DDR2) to recruit patients into a clinical trial with Dasatinib entitled “Trial of Dasatinib in Subjects With Advanced Cancers Harboring DDR2 Mutation or Inactivating B-RAF Mutation” (NCT01514864, clinicaltrials.gov), we opted to replace mutational analysis by SS with amplicon-based massive parallel sequencing.20

With next-generation sequencing (NGS) benchtop devices becoming available, we aimed to reduce the time required for comprehensive molecular diagnostics and minimize the amount of required FFPE-derived input DNA while at the same time increasing the number of target regions analyzed.

MATERIALS AND METHODS

Sample Cohorts

Samples were analyzed as part of the standard diagnostic procedures in agreement with guidelines, with approval of the local ethics committee (Ref Number: 10–242) and diagnosed based on the 2004 World Health Organization classification of lung tumors21 and the International Multidisciplinary Classification of Lung Adenocarcinoma.22

For DNA extraction, tumor areas were marked on a hematoxylin and eosin-stained slide by a senior pathologist. After deparaffination, tumor tissue was macrodissected from six unstained 10-μm sections and samples were lysed overnight with proteinase K at 37°C. Extraction was carried out automatically using a BioRobot M48 (Qiagen, Hilden, Germany) or a Maxwell 16 System (Promega, Madison, WI). Tumor content varied between 10% and 100%, with a mean value of 36.5% and a median of 30% (Fig. S3A, Supplemental Digital Content 2, http://links.lww.com/JTO/A840), and we found a significant correlation between tumor content and allelic frequencies of the identified variants (Fig. S3B, Supplemental Digital Content 2, http://links.lww.com/JTO/A840). Molecular diagnostics was performed at the accredited central molecular pathology laboratory using high-resolution melting prescreening and fragment length analysis as appropriate followed by SS.21,23

Ion AmpliSeq Custom DNA Panels and Library Preparation

DNA concentration was determined with the Qubit dsDNA HS Assay Kit (Life Technologies, Darmstadt, Germany) on the Qubit 2.0 Fluorometer. Up to 50 ng of FFPE DNA was amplified for 30 cycles with Ion AmpliSeq Custom DNA Panels (Life Technologies) covering 102 amplicons of 12 different genes (Table S1, Supplemental Digital Content 1, http://links.lww.com/JTO/A839) split in two primer pools with the 5× Ion AmpliSeq HiFi Master Mix, which is part of Ion AmpliSeq DNA Library Kit 2.0. After treatment with the FuPa reagent, PCR products of both primer pools from the same patient were pooled. PCR products were purified with 1.6× Agencourt AMPure XP beads (Beckman Coulter, Brea, CA) incubated for 5 minutes at room temperature, washed twice with 80% ethanol, and eluted in DNase/RNase free water. PCR products were incubated with NEXTflex DNA Adenylation Mix (Bioo Scientific, Austin, TX). NEXTflex DNA Barcode adapters (Bioo Scientific) were ligated to the fragments using Switch Solution and T4 DNA Ligase. Ligation products were cleaned up with 1.8× Agencourt AMPure XP beads incubated for 5 minutes at room temperature, washed twice with 80% ethanol, and eluted in DNase/RNase free water followed by size selection with 0.8× Agencourt AMPure XP beads incubated for 5 minutes at room temperature; the supernatant was collected and incubated with 0.2× Agencourt AMPure XP beads for 5 minutes at room temperature, washed twice with 80% ethanol, and eluted in DNase/RNase free water. The NEXTflex Primer Mix together with Platinum PCR SuperMix HiFi was used for the final PCR amplification with 10 cycles and a final clean-up with 1× Agencourt AMPure XP beads for 5 minutes at room temperature, washed twice with 80% ethanol, and eluted in DNase/RNase free water.

Sequencing Using the Illumina MiSeq Platform

Library concentration was determined with the Qubit dsDNA HS Assay Kit (Life Technologies) on the Qubit 2.0 Fluorometer, which varied from 5 to 25 ng/μl. Libraries were diluted to 10 nM and pooled in equimolar amounts. Pooled libraries (5 ng/μl) were spiked with 5% PhiX DNA (Illumina, San Diego, CA) and sequenced paired end with the “MiSeq reagent Kit V2 (300-cycles)” (Illumina). FastQ files generated by the MiSeq Reporter were used as data output.
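The normalization of a library to 10 nM can be illustrated with a short calculation. The following is a minimal sketch, not the laboratory's own procedure: it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the mean fragment length (250 bp), the final volume (20 μl), and all function names are hypothetical values chosen for illustration only.

```python
# Sketch: convert a Qubit mass concentration (ng/ul) into molarity (nM)
# and derive the volumes needed to dilute a library to a target molarity.
# Assumption: ~660 g/mol per base pair of dsDNA (a standard approximation).

def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Molarity in nM of a dsDNA library of given concentration and mean length."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def dilution_volumes(conc_ng_per_ul: float, mean_fragment_bp: float,
                     target_nM: float = 10.0, final_volume_ul: float = 20.0):
    """Return (library_ul, diluent_ul) so that the mix has the target molarity."""
    stock_nM = library_molarity_nM(conc_ng_per_ul, mean_fragment_bp)
    if stock_nM < target_nM:
        raise ValueError("library is already below the target molarity")
    library_ul = final_volume_ul * target_nM / stock_nM
    return library_ul, final_volume_ul - library_ul


if __name__ == "__main__":
    # Hypothetical library: 6.6 ng/ul with a mean fragment length of 250 bp.
    print(library_molarity_nM(6.6, 250))
    print(dilution_volumes(6.6, 250))
```

The fragment length must be measured or estimated for the actual library; the molarity scales inversely with it, so an error in the assumed length propagates directly into the pooling ratio.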

Generation of Variant Lists and Alignment of Raw Sequencing Reads

Briefly, after removing adapter sequences, raw reads (FastQ files) were aligned against reference NCBI build 37 (human genome 19, hg19) and only against the chromosomal regions covered by our Custom Panel (CP) using Burrows Wheeler Alignment. Remaining unmapped reads were then realigned using BLAT24,25 to detect longer insertions or deletions. After combining all alignments into a single BAM file, genomic variants were called using a modified version of our previously described method8 and adapted for targeted sequencing. Variants were then imported into a FileMaker (FileMaker GmbH, Germany) database for further analysis, annotation, and reporting. BAM files were viewed in the Integrative Genomics Viewer (http://www.broadinstitute.org/igv/).26,27

RESULTS

The Network Genomic Medicine Lung Cancer was founded at the Center for Integrated Oncology Cologne/Bonn by the Lung Cancer Group Cologne, aiming at maximizing molecular testing in lung cancer to stratify approximately 4000 lung cancer patients per annum for clinically approved targeted therapies or early phase clinical trials.2 FFPE samples of 2124 consecutive AD patients were received over a period of 21 months, of which 1616 specimens were adequate for molecular analysis using the algorithm summarized in Figure S1A (Supplemental Digital Content 2, http://links.lww.com/JTO/A840). Although 76.1% of samples had sufficient tumor content to be successfully analyzed for EGFR and ALK aberrations, the amount of material was too small for completion of all our analyses in the extended diagnostic panel in more than 30% of the samples. In 42% of these AD patients, we identified at least one genomic lesion (Fig. S1B, Supplemental Digital Content 2, http://links.lww.com/JTO/A840). The median turnaround time was 15 days, or 21 days if ALK fluorescence in situ hybridization was performed (Fig. S4, Supplemental Digital Content 2, http://links.lww.com/JTO/A840).

Procedure of Amplicon Parallel Sequencing on the Illumina Platform

The AmpliSeq Cancer panel by Life Technologies seemed to be optimally suited to our requirements with the flexibility of the CP design. Initial test samples indicated that variants previously determined in routine diagnostics could be detected (Fig. S1A, Supplemental Digital Content 2, http://links.lww.com/JTO/A840). To establish a specific lung cancer panel, we designed our own CP tailored to the genes of interest that show therapeutic relevance (Table S1, Supplemental Digital Content 1, http://links.lww.com/JTO/A839) for multiple ongoing and future clinical studies (http://www.lungcancergroup.de).28–33 We opted to sequence the multiplex PCR product on a MiSeq Benchtop sequencer (Illumina) (Fig. 1A, B). The modified library preparation protocol is depicted in Figure 1C.

Validation of the Ampliseq CP for Routine Diagnostics

Having shown that a combined approach of Ampliseq CPs Multiplex PCR with parallel sequencing on a MiSeq was feasible in terms of data quality, coverage of amplicons, and variant identification for a few test samples, we aimed to validate this specific panel for our routine diagnostic workflow. One hundred eighty cases including lung, melanoma, colon, and gastrointestinal stromal tumor cancer samples previously analyzed in our routine diagnostic laboratory served as a validation cohort for the NGS pipeline, including samples with known mutations in EGFR, KRAS, BRAF, CTNNB1, TP53, NRAS, and ALK (Table 1; a complete list can be found in Table S2, Supplemental Digital Content 1, http://links.lww.com/JTO/A839). For 169 cases, we asserted concordance of the SS and NGS analyses. Although eight samples had a concentration below the input DNA concentration recommended by the AmpliSeq library preparation protocol (minimum of 10 ng per reaction), the previously determined mutations could be clearly detected (Sample No. 21, 70, 88, 101, 110, 114, 125, 127). Approximately 30% of the FFPE samples in our institution do not fulfill the criteria for DNA quantity and quality. It should be mentioned that samples with low DNA content are prone to fixation artifacts and have a higher failure rate34,35 (data not shown).

Initially, in 30 cases, the NGS analytic pipeline could not detect the previously determined mutation because the relevant amplicons had a coverage of less than 100-fold. However, all these cases were successfully analyzed by resequencing with a higher DNA library input and/or lower number of samples per sequencing run, leading to a higher coverage per amplicon.

FIGURE 1. Schematic presentation of the library preparation. A and C, Genomic DNA (up to 50 ng) is amplified with the AmpliSeq Custom Panels. Primers are digested with FuPa, an “A” is added at the 3´ end to allow adapter ligation. After bead clean up and size selection, the successfully ligated products are amplified by PCR with a final bead clean up. Samples are quantified by Qubit, normalized to 10 nM, and pooled to be then sequenced on v2 MiSeq Cartridge. A table outline (A) and a cartoon (C) are depicted. B, Mutational analysis can be completed within 4 working days starting with the DNA extraction until data analysis. PCR, polymerase chain reaction.

Based on these observations, we determined a coverage of 100 as the cutoff value for evaluating every specific amplicon in the further routine diagnostic procedure. Furthermore, the CP shows marked heterogeneity in the representation of amplicons, with a coverage ranging from 100- to 10,000-fold (Fig. S5, Supplemental Digital Content 2, http://links.lww.com/JTO/A840). We therefore consider the minimum coverage rather than the average coverage to be a relevant quality parameter.
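The reasoning behind a per-amplicon cutoff can be sketched in a few lines. The example below is illustrative only and is not the authors' pipeline: the amplicon names, the coverage values, and the function names are hypothetical, and only the cutoff of 100 is taken from the text. It shows how a sample can have a high average coverage while one amplicon still falls below the cutoff.

```python
# Sketch: flag amplicons whose coverage falls below a fixed cutoff and
# contrast the minimum with the mean coverage of a sample.
from typing import Dict, List

MIN_COVERAGE = 100  # cutoff stated in the text


def amplicons_below_cutoff(coverage: Dict[str, int],
                           cutoff: int = MIN_COVERAGE) -> List[str]:
    """Return the names of amplicons that cannot be evaluated at this cutoff."""
    return sorted(name for name, depth in coverage.items() if depth < cutoff)


def coverage_summary(coverage: Dict[str, int]) -> Dict[str, float]:
    """Minimum and mean coverage across all amplicons of one sample."""
    depths = list(coverage.values())
    return {"min": float(min(depths)), "mean": sum(depths) / len(depths)}


if __name__ == "__main__":
    # Hypothetical per-amplicon coverage for one sample.
    sample = {"EGFR_ex19": 4200, "KRAS_ex2": 9800, "PTEN_ex9": 45}
    print(coverage_summary(sample))        # the mean looks fine, the minimum does not
    print(amplicons_below_cutoff(sample))  # amplicons that need resequencing
```

A check on the mean alone would pass this hypothetical sample, which is why the minimum is the more informative quality parameter.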

To check for robustness and reproducibility of our NGS pipeline, we performed inter- and intra-run reproducibility tests. We chose three samples with a previously determined mutation status. First, we performed the library preparation in triplicate on two different days; second, we processed a highly diluted sample; and third, we used AmpliSeq Library Kit and Adenylation mix with different lot numbers under otherwise constant conditions. All mutations could be confidently identified in all samples with similar allelic frequencies. Furthermore, no additional variants were detected (Table S3, Supplemental Digital Content 1, http://links.lww.com/JTO/A839).

TABLE 1. Summary of the Validation Procedure of 180 Validation Cases

                                No. of Cases
Validation cohort               180
Concordant with SS              169
Discrepancies                   11
Additional variant identified   75
Suboptimal input DNA            8

SS, Sanger sequencing.

FIGURE 2. Discrepancies between SS and NGS. A–C, Three of 11 cases with discrepancies between SS and NGS due to human errors in the mutation calling of the initial Sanger sequence are shown with 25 reads and two amino acids surrounding the affected amino acid are depicted. In addition, the allelic frequency (%) as well as coverage (x) is given. A, The peak representing the mutation from the electropherogram could not be assigned to its correct position for PIK3CA c.3140A>T p.H1047L because the primers are too close to the position of the mutation. B, A clear C to A substitution in the Sanger sequence is visible but was misread as a G to C substitution leading to the wrong amino acid substitution, that is, c.34_35GG>CC p.G12L instead of c.34_35delinsTT p.G12F. C, In one patient, the correct base substitution was noted but the wrong amino acid substitution reported. D and E, Representative case in which a T790M resistance mutation was found in EGFR exon 20, which was not screened for by the SS pipeline (D). The pathogenic co-occurring mutation in the same sample in EGFR exon 21: c.2573T>G p.L858R is shown in (E). NGS, next-generation sequencing; SS, Sanger sequencing.

Moreover, we performed NGS with three samples at two separate time points of extractions from the same block. Although DNA concentration differed between the two extractions and the resulting library concentration (Table S4, Supplemental Digital Content 1, http://links.lww.com/JTO/A839), the sequencing results were congruent with respect to identifying the correct mutation.

Discrepancies between SS and NGS

We identified 11 cases with discrepancies between SS and NGS. Going back to the original Sanger electropherograms, we found errors in the mutation calling of the initial Sanger sequence and not in the NGS sequences in 11 cases (Sample No. 66, 133, 158, 169, and 171–177) (Fig. 2A–C). Three mutations (Sample No. 178–180) could not be detected by NGS. Repeating the SS in two samples revealed a sample annotation error. The third sample involved a rare EGFR mutation in Exon 18 (c.2125G>A p.E709K) that was called by the initial Sanger-based analysis. This mutation was considered to represent a fixation artifact resulting from deamination in the template DNA. Thus, we could identify all the mutations that were previously determined by SS and any discrepancies were unambiguously resolved. In the majority of cases this was due to human errors that occurred during initial Sanger analysis.

In 75 cases, we identified a total of 95 additional variants. Most of these occurred in the TP53 gene (more than 50%) not previously tested for by SS. In three patients of our validation cohort, we observed a T790M mutation by NGS, not initially tested for in the SS pipeline, in addition to the activating EGFR mutations.36–38 The allele frequencies of the T790M mutations were 15.45%, 64.73%, and 77.49%, respectively (Fig. 2D, E and data not shown).

Sample Preparation and Turnaround Time Is Reduced while Augmenting the Number of Targets

After successful validation, we implemented NGS in the routine diagnostic workflow and analyzed 4000 patients per annum by multiplexing 44 samples in equimolar concentrations, including one negative control per sequencing run (Fig. 1A). With an optimal cluster density of around 1000 K/mm2, we are able to achieve a mean coverage of 4000× and a Q30 quality score of over 95% (Fig. 3A) and a minimal coverage of 500 in 99% of the amplicons (Fig. S5, Supplemental Digital Content 2, http://links.lww.com/JTO/A840). The most critical amplicon was Phosphatase and tensin homolog (PTEN) exon 9 (Table S1, Supplemental Digital Content 1, http://links.lww.com/JTO/A839), never reaching a coverage of higher than 50.
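The relationship between multiplexing level and coverage can be made explicit with simple arithmetic. The sketch below is a rough idealization, not a description of the actual run yield: it assumes that every usable read (or read pair) covers exactly one amplicon and that reads are spread evenly over samples and amplicons, ignoring the PhiX spike-in, off-target reads, and the amplicon heterogeneity described above. The function names are hypothetical; the figures 44, 102, and 4000 are taken from the text.

```python
# Sketch: idealized read budget for one multiplexed amplicon sequencing run.

def reads_needed_per_run(samples: int, amplicons: int, mean_coverage: int) -> int:
    """Amplicon-covering reads required to reach the mean coverage in all samples."""
    return samples * amplicons * mean_coverage


def expected_mean_coverage(usable_reads: int, samples: int, amplicons: int) -> float:
    """Mean coverage per amplicon if reads are spread evenly (an idealization)."""
    return usable_reads / (samples * amplicons)


if __name__ == "__main__":
    budget = reads_needed_per_run(samples=44, amplicons=102, mean_coverage=4000)
    print(budget)
    # Halving the number of samples on the same run doubles the mean coverage.
    print(expected_mean_coverage(budget, samples=22, amplicons=102))
```

This is the same trade-off used during validation, where low-coverage cases were rescued by loading fewer samples per run.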

Starting from isolated DNA, we are in principle able to obtain data ready for evaluation within 4 days (Fig. 1B). In the context of a routine diagnostic setting with 15,000 single analyses per year for various assays, the turnaround time was somewhat longer. In comparison to the SS approach, the overall time for sample preparation was reduced by 4.5 hours (Table 2), and the mean turnaround time was reduced from 12 days to 10 days for the entire NGS panel of 102 amplicons (Fig. 3B). However, it should be emphasized that the previous SS approach covered just four genes (KRAS, BRAF, EGFR, and PIK3CA), whereas the NGS approach includes 102 amplicons spanning 14 genes. Additionally, as the integration of NGS is more practicable in the course of routine diagnostics, we increased the number of samples for analysis by including patients diagnosed as SCC.

FIGURE 3. Multiplexing leads to reduced turnaround time. A, The mean coverage for six runs on the MiSeq in a routine diagnostic is shown including the number of samples loaded (n), cluster density,49 and the Q30 score. B, The number of days to complete full analysis by SS (left panel) and NGS (right panel). n is the number of patients analyzed. NGS, next-generation sequencing; SS, Sanger sequencing.

NGS in Routine Diagnostics Reveals a More Comprehensive Picture of Occurring Mutation Rates in ADs and SCCs

Since introducing NGS into the routine diagnostic laboratory, we analyzed 1971 ADs and 686 SCCs and observed a different composition of the mutation spectrum between AD and SCC (Fig. 4A). The most prevalent mutations found in ADs were TP53 (35%), KRAS (26%), and EGFR (8%). The ranking of the most frequent mutations detected in SCC is different: TP53 (57%), PTEN (8%), and PIK3CA (6%). In 13% of the AD samples and 12% of the SCC samples, we could not detect any mutation with our CP; these samples were reported as wild type. In approximately 3% of all the cases, we were not able to obtain a diagnostic result due to low sample quality. TP53 mutations were a common additional genomic event in 35% of the ADs and 57% of the SCC samples. The distribution of mutations in SCC and AD was similar in the TP53 wild-type and mutant cases (Fig. 4B). TP53 mutations occurred in ADs most commonly together with KRAS mutations (46%) and in SCC together with PTEN mutations (31%).

In 2011, Hammerman et al39 reported somatic mutations in DDR2 at a frequency of 3.8% in cell lines and lung SCC samples, which could be confirmed in following reports.40,41 We detected DDR2 mutations with a frequency of 3% in ADs (n = 67) and 3% in SCC (n = 25), suggesting that treatment with dasatinib may be of interest for study in clinical trials in both AD and SCC.

Additional Mutations Found by NGS Have a Positive Clinical Impact

We found two EGFR mutations in the validation cohort of 180 cases and seven in a total of 315 samples that had been sent for molecular diagnostics, all with an initial diagnosis of SCC by an external pathologist. Performing immunohistochemical staining for cytokeratin 7, thyroid transcription factor-1 (TTF1), p63/p40, and Napsin A42 (Fig. S7, Supplemental Digital Content 2, http://links.lww.com/JTO/A840), we showed that three cases had been misdiagnosed as SCC but in fact were ADs. Only one sample was confirmed as a pure SCC with a rare EGFR exon 19 deletion mutation (Fig. S8, Supplemental Digital Content 2, http://links.lww.com/JTO/A840). Previously, this EGFR mutation would not have been screened for in a SCC. However, because the NGS approach now screens for the complete mutation spectrum in all NSCLC, the exact histopathological differentiation of AD versus SCC in determining what molecular analysis needs to be performed is becoming less important.

NGS Pipeline Optimizations

Even though NGS replaced SS in our routine diagnostic laboratory, several changes needed to be made to the original workflow based on the experience gained from the first 1000 samples analyzed. The following adaptations were implemented: up to 44 samples are multiplexed during a MiSeq run, including one negative control. We avoid using adapters containing the same barcode in consecutive runs. Furthermore, we determine the number of reads harboring barcodes not used in the library preparation to detect contamination.
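The two barcode checks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the representation of barcodes as plain strings, and the toy read counts are hypothetical, and the text does not state a threshold above which a run would be considered contaminated.

```python
from collections import Counter


def unexpected_barcode_reads(read_barcodes, used_barcodes):
    """Count reads carrying barcodes that were not used in library preparation.

    read_barcodes: iterable of barcode strings, one per read
                   (hypothetical input format).
    used_barcodes: barcodes assigned to samples and controls in this run.
    Returns (per-barcode counts of unexpected reads, their fraction of all reads).
    """
    used = set(used_barcodes)
    counts = Counter(read_barcodes)
    unexpected = {bc: n for bc, n in counts.items() if bc not in used}
    total = sum(counts.values())
    fraction = sum(unexpected.values()) / total if total else 0.0
    return unexpected, fraction


def barcodes_reused(previous_run, current_run):
    """Return barcodes planned for the current run that were used in the previous run."""
    return sorted(set(previous_run) & set(current_run))


if __name__ == "__main__":
    # Toy data: 98 reads with expected barcodes, 2 reads with an unused barcode.
    reads = ["BC01"] * 95 + ["BC02"] * 3 + ["BC07"] * 2
    unexpected, fraction = unexpected_barcode_reads(reads, ["BC01", "BC02"])
    print(unexpected, fraction)
    print(barcodes_reused(["BC01", "BC02"], ["BC02", "BC03"]))
```

Reads assigned to a barcode that was never ligated in the current run point to carry-over from an earlier run or to cross-contamination, which is why the check is combined with avoiding identical barcodes in consecutive runs.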

The mutation and coverage files generated by the bioinformatic pipeline are imported into our evaluation and annotation database, which was developed in-house. Variants are always checked visually in the Integrative Genomics Viewer. Annotations of every newly occurring variant are stored, after intensive literature research, in a central annotation database that is updated on a regular basis.

A major issue in analyzing FFPE material is the occurrence of fixation artifacts, which appear as base transitions. When a sample shows an increased and implausible number of G>A or C>T transitions (or vice versa), that is, usually more than 15 variants per sample, only those with an allelic frequency above 10% are reported.34,43
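The reporting rule for suspected fixation artifacts can be written as a small filter. The sketch below is one possible reading of the rule, not the authors' implementation: it assumes that the count of 15 refers to transition variants per sample and that non-transition variants are always retained. The variant representation (dictionaries with `ref`, `alt`, and `af` keys) and the function names are hypothetical.

```python
# Transition substitutions: G>A, C>T, and the reverse directions.
TRANSITIONS = {("G", "A"), ("A", "G"), ("C", "T"), ("T", "C")}


def is_transition(variant):
    """True if the variant is a single-base transition."""
    return (variant["ref"], variant["alt"]) in TRANSITIONS


def filter_fixation_artifacts(variants, max_transitions=15, min_allele_freq=0.10):
    """Apply the artifact rule to the variants of one sample.

    If the sample carries more than `max_transitions` transition variants,
    only transitions with an allelic frequency above `min_allele_freq` are
    kept. Assumption: all other variant types are reported unchanged.
    """
    n_transitions = sum(1 for v in variants if is_transition(v))
    if n_transitions <= max_transitions:
        return list(variants)
    return [
        v for v in variants
        if not is_transition(v) or v["af"] > min_allele_freq
    ]


if __name__ == "__main__":
    # Toy sample: 16 low-frequency C>T calls plus one high-frequency G>A call.
    sample = [{"ref": "C", "alt": "T", "af": 0.04} for _ in range(16)]
    sample.append({"ref": "G", "alt": "A", "af": 0.30})
    print(filter_fixation_artifacts(sample))
```

Both cut-offs are exposed as parameters because the text describes them as the usual values rather than fixed constants.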

DISCUSSION

The identification of novel targetable driver mutations in cancer makes accurate and rapid molecular diagnostics a requirement for most cancer patients.8,10,12 Until recently, detection of somatic mutations in routine FFPE samples relied on single, often sequentially performed assays for each potential driver event. This is time consuming, cost and labor intensive, and limited by the availability of small amounts of tumor material, particularly in lung cancer, where often only small biopsies or cytologies can be obtained.35 To overcome these limitations, we implemented a multiplex PCR for clinically relevant driver mutations followed by parallel sequencing. The Ion AmpliSeq Multiplex PCR was ideally suited for use in routine diagnostics because the content is highly customizable and optimized for applications on FFPE. In combination with the higher sequencing accuracy of the Illumina sequencing platform, we established a highly reliable clinical diagnostic test.44–46

TABLE 2.  Comparison of the Laboratory Turnaround Time for 48 Samples by AmpliSeq Custom Panel Sequenced on the MiSeq Versus the Former Routine Workflow, Including Hands-On Time

Step                                        Time Duration    Hands-On Time
Sanger and HRM (for 48 samples; min 4 to max 10 analyses)
  Quantification NanoDrop and Agarose       30 min           30 min
  BRAF and KRAS HRM                         6 hr             2 hr
    Sanger (15 samples)                     20 hr            2 hr
  EGFR FA and Sanger (32 samples)           18.5 hr          1.5 hr
  PIK3CA HRM                                4 hr             2 hr
    Sanger (5 samples)                      18 hr            1 hr
  Data analysis                             4 hr             4 hr
  Total                                     71 hr            13 hr
NGS (for 48 samples; 102 targets)
  Quantification Qubit                      30 min           30 min
  Multiplex PCR                             4 hr             1.5 hr
  Library preparation                       7 hr             4.5 hr
  Quantification, normalization, pooling    1 hr             1 hr
  MiSeq                                     26 hr            30 min
  Data analysis                             28 hr            4 hr
  Total                                     66.5 hr          12 hr

HRM, high-resolution melting; FA, fragment length analysis; NGS, next-generation sequencing; PCR, polymerase chain reaction.
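The totals reported in Table 2 can be cross-checked with a short calculation. The sketch below only transcribes the per-step values from the table and sums them; the variable and function names are hypothetical, and the fold increase in targets is computed against the maximum of 10 analyses stated for the former workflow.

```python
# (duration_hr, hands_on_hr) per step, transcribed from Table 2.
SANGER_HRM_STEPS = {
    "Quantification NanoDrop and Agarose": (0.5, 0.5),
    "BRAF and KRAS HRM": (6.0, 2.0),
    "Sanger (15 samples)": (20.0, 2.0),
    "EGFR FA and Sanger (32 samples)": (18.5, 1.5),
    "PIK3CA HRM": (4.0, 2.0),
    "Sanger (5 samples)": (18.0, 1.0),
    "Data analysis": (4.0, 4.0),
}

NGS_STEPS = {
    "Quantification Qubit": (0.5, 0.5),
    "Multiplex PCR": (4.0, 1.5),
    "Library preparation": (7.0, 4.5),
    "Quantification, normalization, pooling": (1.0, 1.0),
    "MiSeq": (26.0, 0.5),
    "Data analysis": (28.0, 4.0),
}

NGS_TARGETS = 102          # targets covered by the NGS panel (Table 2)
FORMER_MAX_ANALYSES = 10   # maximum number of analyses in the former workflow


def workflow_totals(steps):
    """Sum total duration and hands-on time (both in hours) over all steps."""
    duration = sum(d for d, _ in steps.values())
    hands_on = sum(h for _, h in steps.values())
    return duration, hands_on


def target_fold_increase():
    """Ratio of NGS targets to the maximum number of former analyses."""
    return NGS_TARGETS / FORMER_MAX_ANALYSES


if __name__ == "__main__":
    print("Sanger and HRM:", workflow_totals(SANGER_HRM_STEPS))
    print("NGS:", workflow_totals(NGS_STEPS))
    print("Fold increase in targets:", target_fold_increase())
```

The sums reproduce the totals given in the table, and the ratio of 102 targets to at most 10 former analyses corresponds to the more than 10-fold increase in target regions.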

FIGURE 4.  NGS in routine diagnostics reveals expected mutation frequencies for EGFR, KRAS, and BRAF as well as an equal mutation rate for DDR2 in ADs and SCC. A and B, One thousand nine hundred seventy-one consecutive ADs and 686 SCCs were analyzed by NGS, which detected driver mutations in 87% of the samples with frequencies depicted here. DDR2 mutations were found with a frequency of 3% both in ADs and SCCs. C and D, Thirty-five percent of the AD and 57% of the SCC samples harbored TP53 mutations. All TP53 mutated cases and the frequency of co-occurring driver mutations are depicted. ADs, adenocarcinomas; SCC, squamous cell carcinoma.

This approach required some modifications to the library preparation protocol: after end-repair of the PCR products, an adenosine overhang was generated to allow for efficient adapter ligation.

By analyzing a broad array of different driver mutations in lung cancer in parallel, we were able to gather a comprehensive overview of the mutation distribution in our local patient cohort (Fig. 4). These data are consistent with previously published data.11,33,47

Previously, it was not always possible to screen patients for rare mutations by SS because of lack of material and workload constraints. Using a multiplex PCR approach to analyze all NSCLC patients for a broad array of targetable driver mutations, we identified several patients with an initial diagnosis of SCC who harbored actionable EGFR mutations that otherwise would not have been evaluated. Retrospective analysis of this cohort revealed a false histological classification as SCC instead of AD in three patients at the time of initial diagnosis. In addition, we identified one SCC with a rare targetable EGFR mutation (Fig. 3D, E; Figs. S7 and S8, Supplemental Digital Content 2, http://links.lww.com/JTO/A840).

These findings imply that comprehensive testing of all NSCLC patients reduces the need for extensive immunohistochemical subtyping of small, poorly differentiated tumor specimens.

A further challenge in the application of precision medicine in cancer patients is the occurrence of primary and secondary resistance.48 Thus, NGS-based diagnostic platforms will have broad implications for the detection of resistance mechanisms already at the time of the initial molecular analysis and may guide novel studies to overcome primary and secondary resistance mechanisms. In this study, we demonstrate that our multiplex PCR and NGS-based platform is sensitive and robust and requires less input material, thus reducing the number of patients in whom input material is limiting. At the same time, the turnaround time was reduced (from 71 to 66.5 hr for 48 samples; Table 2) and the number of target regions was increased by more than 10-fold compared with the previous SS approach. Furthermore, this alternative approach minimized the difficulty of mutation calling inherent to SS. Thus, NGS technologies, complemented with a customized bioinformatics pipeline optimized for a high-throughput diagnostic setting, provide a powerful tool to screen lung cancer patients in routine diagnostics.

CONCLUSION

In conclusion, NGS-based diagnostic approaches have now been developed into robust, reliable, fast, and cost-effective assays that will increasingly replace single sequential PCR-based or SS-based approaches.

ACKNOWLEDGMENTS

This work was supported by the German Cancer Aid as part of the Interdisciplinary Oncology Centers of Excellence program to the Center for Integrated Oncology Köln Bonn through the collaborative project “Comprehensive molecular and histopathological characterization of small cell lung cancer.” Further support was provided by the Federal German Ministry of Science and Education as part of the National Genome Research Network program (01GS08101 to J.W. and P.N.). L.C.H. and R.B. are supported by the SFB832 funded by the Deutsche Forschungsgemeinschaft (DFG). M.L.S. received support from the International Association for the Study of Lung Cancer (Young Investigator Award).

REFERENCES

1. König K, Meder L, Kröger C, et al. Loss of the keratin cytoskeleton is not sufficient to induce epithelial mesenchymal transition in a novel KRAS driven sporadic lung cancer mouse model. PLoS One 2013;8:e57996.

2. Clinical Lung Cancer Genome Project (CLCGP); Network Genomic Medicine (NGM). A genomics-based classification of human lung tumors. Sci Transl Med 2013;5:209ra153.

3. Giaccone G. Epidermal growth factor receptor inhibitors in the treatment of non-small-cell lung cancer. J Clin Oncol 2005;23:3235–3242.

4. Kwak EL, Bang YJ, Camidge DR, et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N Engl J Med 2010;363:1693–1703.

5. Sos ML, Fischer S, Ullrich R, et al. Identifying genotype-dependent efficacy of single and combined PI3K- and MAPK-pathway inhibition in cancer. Proc Natl Acad Sci U S A 2009;106:18351–18356.

6. Kim HS, Mitsudomi T, Soo RA, Cho BC. Personalized therapy on the horizon for squamous cell carcinoma of the lung. Lung Cancer 2013;80:249–255.

7. Weiss J, Sos ML, Seidel D, et al. Frequent and focal FGFR1 amplification associates with therapeutically tractable FGFR1 dependency in squamous cell lung cancer. Sci Transl Med 2010;2:62ra93.

8. Peifer M, Fernández-Cuesta L, Sos ML, et al. Integrative genome analyses identify key somatic driver mutations of small-cell lung cancer. Nat Genet 2012;44:1104–1110.

9. Sos ML, Dietlein F, Peifer M, et al. A framework for identification of actionable cancer genome dependencies in small cell lung cancer. Proc Natl Acad Sci U S A 2012;109:17034–17039.

10. Imielinski M, Berger AH, Hammerman PS, et al. Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell 2012;150:1107–1120.

11. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 2012;489:519–525.

12. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 2014;511:543–550.

13. Lawrence MS, Stojanov P, Polak P, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 2013;499:214–218.

14. Liao RG, Jung J, Tchaicha J, et al. Inhibitor-sensitive FGFR2 and FGFR3 mutations in lung squamous cell carcinoma. Cancer Res 2013;73:5195–5205.

15. Doebele RC, Oton AB, Peled N, Camidge DR, Bunn PA Jr. New strategies to overcome limitations of reversible EGFR tyrosine kinase inhibitor therapy in non-small cell lung cancer. Lung Cancer 2010;69:1–12.

16. D’Arcangelo M, D’Incecco A, Cappuzzo F. Rare mutations in non-small-cell lung cancer. Future Oncol 2013;9:699–711.

17. Oxnard GR, Binder A, Jänne PA. New targetable oncogenes in non-small-cell lung cancer. J Clin Oncol 2013;31:1097–1104.

18. Navin N, Krasnitz A, Rodgers L, et al. Inferring tumor progression from genomic heterogeneity. Genome Res 2010;20:68–80.

19. Sun JM, Hwang DW, Ahn JS, Ahn MJ, Park K. Prognostic and predictive value of KRAS mutations in advanced non-small cell lung cancer. PLoS One 2013;8:e64816.

20. Chang F, Li MM. Clinical application of amplicon-based next-generation sequencing in cancer. Cancer Genet 2013;206:413–419.

21. Ney JT, Froehner S, Roesler A, Buettner R, Merkelbach-Bruse S. High-resolution melting analysis as a sensitive prescreening diagnostic tool to detect KRAS, BRAF, PIK3CA, and AKT1 mutations in formalin-fixed, paraffin-embedded tissues. Arch Pathol Lab Med 2012;136:983–992.

22. Travis WD, Brambilla E, Noguchi M, et al. International association for the study of lung cancer/American Thoracic Society/European Respiratory Society International Multidisciplinary Classification of Lung Adenocarcinoma. J Thorac Oncol 2011;6:244–285.

23. Zimmer S, Kahl P, Buhl TM, et al. Epidermal growth factor receptor mutations in non-small cell lung cancer influence downstream Akt, MAPK and Stat3 signaling. J Cancer Res Clin Oncol 2009;135:723–730.

24. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res 2002;12:656–664.

25. Bhagwat M, Young L, Robison RR. Using BLAT to find sequence similarity in closely related genomes. Curr Protoc Bioinformatics 2012;Chapter 10:Unit10.8.

26. Robinson JT, Thorvaldsdóttir H, Winckler W, et al. Integrative genomics viewer. Nat Biotechnol 2011;29:24–26.

27. Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 2013;14:178–192.

28. Viglietto G. Activating E17K mutation in the gene encoding the protein kinase AKT1 in a subset of squamous cell carcinoma of the lung. Cell Cycle 2009;8:2869–2870.

29. Brose MS, Volpe P, Feldman M, et al. BRAF and RAS mutations in human lung cancer and melanoma. Cancer Res 2002;62:6997–7000.

30. Sharma SV, Bell DW, Settleman J, Haber DA. Epidermal growth factor receptor mutations in lung cancer. Nat Rev Cancer 2007;7:169–181.

31. Kawano O, Sasaki H, Endo K, et al. PIK3CA mutation status in Japanese lung cancer patients. Lung Cancer 2006;54:209–215.

32. Jin G, Kim MJ, Jeon HS, et al. PTEN mutations and relationship to EGFR, ERBB2, KRAS, and TP53 mutations in non-small cell lung cancers. Lung Cancer 2010;69:279–283.

33. Kong-Beltran M, Seshagiri S, Zha J, et al. Somatic mutations lead to an oncogenic deletion of met in lung cancer. Cancer Res 2006;66:283–289.

34. Kerick M, Isau M, Timmermann B, et al. Targeted high throughput sequencing in clinical cancer settings: formaldehyde fixed-paraffin embedded (FFPE) tumor tissues, input amount and tumor heterogeneity. BMC Med Genomics 2011;4:68.

35. Spencer DH, Sehn JK, Abel HJ, Watson MA, Pfeifer JD, Duncavage EJ. Comparison of clinical targeted next-generation sequence data from formalin-fixed and fresh-frozen tissue specimens. J Mol Diagn 2013;15:623–633.

36. Kobayashi S, Boggon TJ, Dayaram T, et al. EGFR mutation and resistance of non-small-cell lung cancer to gefitinib. N Engl J Med 2005;352:786–792.

37. Mok TS, Wu YL, Thongprasert S, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med 2009;361:947–957.

38. Pao W, Miller VA, Politi KA, et al. Acquired resistance of lung adenocarcinomas to gefitinib or erlotinib is associated with a second mutation in the EGFR kinase domain. PLoS Med 2005;2:e73.

39. Hammerman PS, Sos ML, Ramos AH, et al. Mutations in the DDR2 kinase gene identify a novel therapeutic target in squamous cell lung cancer. Cancer Discov 2011;1:78–89.

40. Miao L, Wang Y, Zhu S, et al. Identification of novel driver mutations of the discoidin domain receptor 2 (DDR2) gene in squamous cell lung cancer of Chinese patients. BMC Cancer 2014;14:369.

41. An SJ, Chen ZH, Su J, et al. Identification of enriched driver gene alterations in subgroups of non-small cell lung cancer patients based on histology and smoking status. PLoS One 2012;7:e40109.

42. Travis WD. Classification of lung cancer. Semin Roentgenol 2011;46:178–186.

43. Wong SQ, Li J, Tan AY, et al; CANCER 2015 Cohort. Sequence artefacts in a prospective series of formalin-fixed tumours tested for mutations in hotspot regions by massively parallel sequencing. BMC Med Genomics 2014;7:23.

44. Wagle N, Berger MF, Davis MJ, et al. High-throughput detection of actionable genomic alterations in clinical tumor samples by targeted, massively parallel sequencing. Cancer Discov 2012;2:82–93.

45. Loman NJ, Misra RV, Dallman TJ, et al. Performance comparison of benchtop high-throughput sequencing platforms. Nat Biotechnol 2012;30:434–439.

46. Koshimizu E, Miyatake S, Okamoto N, et al. Performance comparison of bench-top next generation sequencers using microdroplet PCR-based enrichment for targeted sequencing in patients with autism spectrum disorder. PLoS One 2013;8:e74167.

47. Akamatsu H, Koh Y, Kenmotsu H, et al. Multiplexed molecular profiling of lung cancer using pleural effusion. J Thorac Oncol 2014;9:1048–1052.

48. Niederst MJ, Engelman JA. Bypass mechanisms of resistance to receptor tyrosine kinase inhibition in lung cancer. Sci Signal 2013;6:re6.

49. Govindan R, Ding L, Griffith M, et al. Genomic landscape of non-small cell lung cancer in smokers and never-smokers. Cell 2012;150:1121–1134.