  • Analysis

Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity

Abstract

Mutational hotspots indicate selective pressure across a population of tumor samples, but their prevalence within and across cancer types is incompletely characterized. An approach to detect significantly mutated residues, rather than methods that identify recurrently mutated genes, may uncover new biologically and therapeutically relevant driver mutations. Here, we developed a statistical algorithm to identify recurrently mutated residues in tumor samples. We applied the algorithm to 11,119 human tumors, spanning 41 cancer types, and identified 470 somatic substitution hotspots in 275 genes. We find that half of all human tumors possess one or more mutational hotspots with widespread lineage-, position- and mutant allele–specific differences, many of which are likely functional. In total, 243 hotspots were novel and appeared to affect a broad spectrum of molecular function, including hotspots at paralogous residues of Ras-related small GTPases RAC1 and RRAS2. Redefining hotspots at mutant amino acid resolution will help elucidate the allele-specific differences in their function and could have important therapeutic implications.

Figure 1: Mutational data and hotspot detection.
Figure 2: Lineage landscape of hotspot mutations.
Figure 3: Lineage diversity and mutant allele specificity.
Figure 4: Candidate Ras-related small GTPase driver mutations in the long tail.

Acknowledgements

We would like to thank the members of the Taylor and Schultz laboratories and members of the Marie-Josée and Henry R. Kravis Center for Molecular Oncology for useful discussions throughout this work. M.T.C. was supported by the US National Institutes of Health (NIH) training grant T32 GM007175 and A.B.O. was supported by NIH grant P30 CA82103. This work has been further supported by an NIH Core Grant (P30 CA008748, PI: Thompson), the Josie Robertson and Prostate Cancer Foundations (N.S. and B.S.T.), and the Sontag Foundation and Cycle for Survival (B.S.T.).

Author information

Contributions

M.T.C., S.A. and B.S.T. conceived the study. N.S. and B.S.T. supervised analyses. M.T.C., S.A., N.D.S., A.B.O. and B.S.T. developed methods and algorithms. M.T.C., J.S.C., J.G., C.K. and N.S. acquired data. M.T.C., S.A., J.S.C., S.P.G., B.H.L., J.G. and D.B.S. performed experiments and analyses. M.T.C., N.S. and B.S.T. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Barry S Taylor.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Hotspot detection components and workflow.

a) A schematic of the hotspot detection methodology employed here is shown. b) The steps involved in filtering mutation calls, samples, and genes, as well as generating and curating the final hotspot list.

Supplementary Figure 2 Facets of underlying mutational processes incorporated into hotspot discovery model.

a) We incorporated the intrinsic mutability of individual trinucleotides genome-wide, as this varies widely and is driven by diverse endogenous and exogenous sources. Shown are all mutations in the study cohort, the subset of mutations detected in two mutagen-associated tumor types (tobacco-associated lung adenocarcinoma and UV-associated cutaneous melanoma), and two subsets of tumors driven by aberrant DNA repair processes (microsatellite instability and POLE exonuclease domain mutations). b) We truncate the standard binomial model based on a priori understanding of the ways in which hotspots can emerge in candidate cancer genes (Supplementary Fig. 1a) to allow for the identification of warmspot mutations, most often in genes with one or a few hotspots of exceptional frequency that would otherwise suppress other less common alleles. Shown are all hotspots that achieved significance only after model truncation (x vs. y-axis significance values). The color reflects increasing truncated significance. Inset, the pattern of significance among all hotspots found after truncation.
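The two ideas in this caption, a per-position background rate and a truncated binomial model, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `cap` parameter, and the particular way dominant positions are capped are assumptions made for illustration only, and the background weights stand in for trinucleotide-derived mutability.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def hotspot_pvalues(counts, mutability, cap=None):
    """Score each mutated position of one gene with a binomial tail test.

    counts:     position -> observed mutation count (hypothetical input)
    mutability: position -> relative background weight for every position
                in the gene, e.g. derived from trinucleotide context
    cap:        assumed truncation rule; other positions contribute at most
                `cap` mutations to the gene total seen by a given position
    """
    total_weight = sum(mutability.values())
    capped = {
        pos: (min(c, cap) if cap is not None else c) for pos, c in counts.items()
    }
    pvals = {}
    for pos, k in counts.items():
        # Expected share of the gene's mutations landing on this position.
        p = mutability[pos] / total_weight
        # Gene total: this position's own count plus the (capped) others.
        n = k + sum(c for other, c in capped.items() if other != pos)
        pvals[pos] = binomial_tail(k, n, p)
    return pvals


# Toy gene of 180 positions with uniform background weights.
weights = {pos: 1.0 for pos in range(1, 181)}
observed = {12: 200, 61: 8, 100: 1, 101: 1}
plain = hotspot_pvalues(observed, weights)
truncated = hotspot_pvalues(observed, weights, cap=10)
```

In this toy gene, capping the contribution of the dominant position 12 shrinks the gene total seen by position 61, so its 8 mutations look more surprising under the truncated model than under the plain one. A real analysis would also correct the resulting p-values for multiple testing.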

Supplementary Figure 3 Global features of significant hotspots.

a) The number of hotspots across the genes identified here (inset: distribution of hotspot type). b) The frequency of specific hotspots across the 41 tumor types analyzed here. c) The number of mutant alleles distributed among the hotspots detected (inset: number of mutant alleles at a given hotspot increases with the number of tumor types affected, dot is the average number of mutant alleles across the hotspots identified in each of the indicated number of tumor types, bars are the 95% confidence interval). d) The fraction of total mutational burden present in the hotspot (positional specificity) of each affected gene.

Supplementary Figure 4 RNA expression in tumors with known oncogenic hotspots.

The mRNA expression of the indicated gene is shown for all tumors (gray bars) across cancer types in which one or more tumors harbor the indicated hotspot (the count of tumor types plotted is indicated; expression is a Z-score of log2 RSEM normalized count data inferred from level-3 TCGA RNA sequencing data). Tumors harboring the oncogenic hotspot are indicated with red tick marks (x-axis), the density distribution of which is shown in blue. The top row shows genes with no association between the level of expression and the presence of the hotspot. The middle row shows genes whose expression is elevated in tumors bearing the hotspot. The bottom row shows genes and hotspots with variable patterns of expression. Multiple hotspots in the same gene with different patterns of expression (ERBB2 and PIK3CA) are shown for reference.
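The expression measure named in this caption, a Z-score of log2-transformed normalized counts, can be sketched as follows. The pseudocount and the use of the population standard deviation are assumptions for illustration; the caption does not state how the authors handled zero counts or which variance estimator they used.

```python
from math import log2
from statistics import mean, pstdev


def expression_zscores(counts, pseudocount=1.0):
    """Z-score log2-transformed normalized counts for one gene across tumors.

    counts:      normalized expression counts, one per tumor (hypothetical input)
    pseudocount: assumed offset so that zero counts can be log-transformed
    """
    logged = [log2(c + pseudocount) for c in counts]
    mu = mean(logged)
    sd = pstdev(logged)  # population standard deviation (an assumption)
    if sd == 0:
        # All tumors have identical expression; no tumor deviates from the mean.
        return [0.0 for _ in logged]
    return [(x - mu) / sd for x in logged]


# Counts chosen so that log2(count + 1) is exactly 1, 2 and 3.
z = expression_zscores([1, 3, 7])
```

With Z-scores computed per gene, tumors carrying a hotspot can be compared against the distribution of all tumors on a common scale, which is what the red tick marks and blue density in the figure convey.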

Supplementary Figure 5 Lineage map of hotspots in common tumor suppressors.

As in main text Fig. 2A, shown here are all hotspots detected in excluded tumor suppressor genes that harbor at least one hotspot affecting >5% of tumors of one or more tumor types. Frequencies are indicated, and genes, hotspots, and tumor types are ordered as described in main text Fig. 2A. These included 14 hotspots from Arg213-1450 of the N terminus of APC, the mutational cluster region (MCR), affecting between 6 and 37 tumors, nearly all of which were colorectal cancers.

Supplementary Figure 6 Squamous cell type-specific hotspots.

a) The enrichment of hotspots in squamous cell tumors (by frequency and significance, as indicated). b) The distribution of tumor types among cases mutated for either MAPK1 E322 (top) or EP300 D1399 (bottom).

Supplementary Figure 7 Impact of unconventional hotspots.

a) Significant splice site hotspots are shown and have diverse effects on transcript sequence and structure. In blue is the coverage and splicing pattern inferred from RNA sequencing of a representative tumor harboring each hotspot. The impact of each is summarized (rightmost column) and includes in-frame and frame-shift events resulting from exon skipping, intron retention, and deletions. Highlighted in yellow are splice site hotspots at opposite ends of the same intron with both similar and dissimilar impact on transcript structure. SMTNL2 was not assessable due to little detectable expression in E244 (e3+1)-mutant tumors. b) The spectrum of nonsense mutations in hotspots indicates that a subset consists exclusively of nonsense mutations. c) Shown is the impact of nonsense hotspots on transcript expression in CDKN2A, TP53, and APC, the three genes affected by the greatest number of nonsense mutations. As expected, the expression (inferred from RNA sequencing of affected cases in TCGA cohorts) of all three genes was significantly lower in tumors with candidate loss-of-function (LOF) mutations of any kind, including nonsense hotspots, than in tumors with missense mutations. Nevertheless, whereas no difference in TP53 or APC expression existed between tumors with non-hotspot LOF mutations (labeled Misc. LOF) and those carrying nonsense hotspots, the tumors bearing a nonsense hotspot in CDKN2A expressed significantly less p16INK4A mRNA than did tumors with non-hotspot LOF mutations.

Supplementary Figure 8 Additional candidate long tail hotspots.

a) Two hotspot mutations were detected in the N terminus of NUP93 (E14K and Q15*), a constitutively essential gene. The E14K hotspot was recurrently mutated in breast cancer (55% of all E14K mutants), making it among the most common hotspots in breast cancers (right). b) Two somatic missense hotspots, H28R and R60Q, affect the bHLHz domain of MAX in a diversity of tumor types (indicated at the bottom of panel d). These hotspots differ in type and position from the germline nonsense mutations present in sporadic pheochromocytomas and paragangliomas (bottom). c) The MYC:MAX heterodimer bound to DNA, in which the DNA binding domain of MAX is highlighted (in blue), indicates the position of the R60Q and H28R hotspots in highly conserved residues at the 5′ and 3′ ends of the canonical E-box CACGTG motif, respectively. An H374R mutation in MYC (annotated), also present in a uterine endometrial tumor, like MAX H28R mutations, is at a site equivalent to MAX H28R, extending the affected subset of cases in this tumor type. d) MAX hotspot mutations are mutually exclusive with mutations and amplification of MYC in affected tumor types, irrespective of hypermutation status.

Supplementary Figure 9 GQ60GK dinucleotide mutations are a single genomic event.

Shown are aligned reads from whole-exome sequencing of the tumor and matched normal DNA and RNA sequencing of the tumor from representative affected cases indicating that the GQ60GK dinucleotide mutation is a single event expressed in cis.

Cite this article

Chang, M., Asthana, S., Gao, S. et al. Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity. Nat Biotechnol 34, 155–163 (2016). https://doi.org/10.1038/nbt.3391

  • DOI: https://doi.org/10.1038/nbt.3391
