LiquidCNA: Tracking subclonal evolution from longitudinal liquid biopsies using somatic copy number alterations

Eszter Lakatos¹, Helen Hockings²˒³, Maximilian Mossner¹, Weini Huang⁴, Michelle Lockley²˒⁵, Trevor A Graham¹

Affiliations

¹ Centre for Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London, UK.
² Centre for Cancer Cell and Molecular Biology, Barts Cancer Institute, Queen Mary University of London, London, UK.
³ Barts Health NHS Trust, St Bartholomew's Hospital, West Smithfield, London, UK.
⁴ School of Mathematical Sciences, Queen Mary University of London, London, UK.
⁵ Department of Gynaecological Oncology, Cancer Services, University College London Hospital, London, UK.

PMID: 34401670
PMCID: PMC8350516
DOI: 10.1016/j.isci.2021.102889

Eszter Lakatos et al. iScience. 2021 Jul 21;24(8):102889. doi: 10.1016/j.isci.2021.102889. eCollection 2021 Aug 20.

Abstract

Cell-free DNA (cfDNA) measured via liquid biopsies enables minimally invasive monitoring of tumor evolutionary dynamics during therapy. Here we present liquidCNA, a method to track subclonal evolution from longitudinally collected cfDNA samples sequenced through cost-effective low-pass whole-genome sequencing. LiquidCNA utilizes somatic copy number alterations (SCNAs) to simultaneously genotype and quantify the size of the dominant subclone without requiring B-allele frequency information, matched-normal samples, or prior knowledge of the genetic identity of the emerging clone. We demonstrate the accuracy of liquidCNA in synthetically generated sample sets and in vitro mixtures of cancer cell lines. In vivo application in patients with metastatic lung cancer reveals the progressive emergence of a novel tumor subpopulation. LiquidCNA is straightforward to use, is computationally inexpensive, and enables continuous monitoring of subclonal evolution to understand and control therapy-induced resistance.

Keywords: Cancer; Cancer systems biology; Genomics.

Conflict of interest statement

The authors declare no competing interest.

Figures

**Figure 1**
Schematic of copy number measurements. The first panel shows the SCNA profile of ancestral (in yellow) and subclonal (in red) tumor cells. At different sampling time points, the overall tumor SCNA profile is a mixture of these profiles (second panel), influenced by the composition of tumor-derived DNA depicted on the pie charts. Clonal, subclonal, and unstable segments are indicated in yellow, red, and blue, respectively. Note that the CN of clonal segments remains the same. In the liquid biopsies taken at each time point, contamination from normal cells leads to "flattened" measured SCNA profiles (last panel) due to normal cells having a diploid karyotype. This contamination affects the CN of each segment. Our aim is to estimate purity (p_i) and subclonal ratio (r_i) based on clonal and subclonal SCNAs.
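The mixing described in this caption can be illustrated with a minimal Python sketch. It assumes a simple linear mixture of ancestral, subclonal, and diploid normal DNA; the function names `measured_cn` and `purity_corrected_cn` are hypothetical and are not taken from the liquidCNA code.

```python
def measured_cn(cn_ancestral, cn_subclonal, purity, subclonal_ratio, normal_cn=2.0):
    """Expected segment CN in a liquid biopsy under an assumed linear mixture.

    purity          -- fraction of tumor-derived DNA in the sample (p_i)
    subclonal_ratio -- fraction of tumor DNA coming from the subclone (r_i)
    """
    # Tumor-only CN: ancestral and subclonal cells weighted by the subclonal ratio.
    tumor_cn = (1.0 - subclonal_ratio) * cn_ancestral + subclonal_ratio * cn_subclonal
    # Diploid normal DNA pulls every segment toward CN = 2 ("flattening").
    return purity * tumor_cn + (1.0 - purity) * normal_cn


def purity_corrected_cn(cn_measured, purity, normal_cn=2.0):
    """Invert the normal-cell contamination to recover the tumor-only CN."""
    return (cn_measured - (1.0 - purity) * normal_cn) / purity


# A clonal segment (same CN in both populations) does not depend on the subclonal ratio.
clonal = measured_cn(3, 3, purity=0.5, subclonal_ratio=0.3)
# A subclonal segment (CN 2 in ancestral cells, 4 in the subclone) does.
subclonal = measured_cn(2, 4, purity=0.5, subclonal_ratio=0.5)
```

Under this assumption, a clonal segment shifts only with purity, whereas a subclonal segment shifts with both purity and subclonal ratio, which is why the two parameters can be estimated from different segment classes.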

**Figure 2**
Illustration of the estimation algorithm. (A) Outline of the estimation algorithm, with estimation outputs highlighted in color. Dashed arrows separate independent modules. (B) Purity estimation based on the peaks of the distribution of segment CNs. Green lines show the peaks expected at an example purity of 0.21. (C) The error of a range of purity estimates, computed from the distance between observed and estimated peaks in (B). Each line corresponds to a smoothing kernel applied to the raw segment CN distribution. The optimal purity is indicated with arrows. (D) Change in segment CN values (ΔCNs) plotted according to an example sample order. The number of subclonal segments computed in (E) is indicated below. (E) Classification of segments based on the sample order in (D). Segments with low variance are classified as clonal (gray). Nonclonal segments are evaluated for whether they follow a quasi-monotone pattern (indicated by the shaded regions) and are classified as unstable (outside of the shaded region, blue) or subclonal (red). (F) ΔCN values plotted according to the optimal sample order, which maximizes the number of subclonal segments. Line colors indicate the class of each segment as in (E). (G) Relative subclonal ratio estimation compared with the maximal subclonal ratio sample (rightmost in (F)). Points show individual segment-wise estimates, with an example segment highlighted in black. Black line shows the median. (H and I) Subclonal ratios and confidence intervals inferred by fitting a Gaussian mixture model to the ΔCN distribution of subclonal segments. The components of the best fit with means −r and r are shown in green and magenta, respectively, in (H).
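The peak-based purity estimation in panels (B) and (C) can be sketched roughly as follows. This is an illustration under stated assumptions, not the published implementation: it assumes integer tumor CN states mixed with diploid normal DNA, takes the peak positions as already extracted (the kernel smoothing step is omitted), and breaks ties by choosing the largest purity with minimal error. All function names, the grid, and `max_cn` are hypothetical.

```python
import numpy as np


def expected_peaks(purity, max_cn=5):
    """Peak positions expected if tumor segments have integer CN 0..max_cn."""
    states = np.arange(0, max_cn + 1)
    return 2.0 + purity * (states - 2.0)


def purity_error(observed_peaks, purity, max_cn=5):
    """Mean distance from each observed peak to its nearest expected peak."""
    observed = np.asarray(observed_peaks, dtype=float)
    expected = expected_peaks(purity, max_cn)
    distances = np.abs(observed[:, None] - expected[None, :])
    return float(distances.min(axis=1).mean())


def estimate_purity(observed_peaks, grid=None, tol=1e-9):
    """Grid search over candidate purities (assumed grid: 0.05 to 1.00, step 0.01)."""
    if grid is None:
        grid = np.round(np.arange(0.05, 1.0001, 0.01), 2)
    errors = np.array([purity_error(observed_peaks, p) for p in grid])
    # Assumed tie-break: a finer peak grid (lower purity) can fit equally well,
    # so keep the largest purity among candidates with minimal error.
    return float(grid[errors <= errors.min() + tol].max())


# Peaks for tumor CN states 1-4 at the example purity of 0.21 from panel (B).
observed = expected_peaks(0.21)[1:5]
estimated = estimate_purity(observed)
```

With noise-free peaks the search recovers the purity used to generate them; with real data the error curve would have a minimum rather than an exact zero, as drawn in panel (C).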

**Figure 3**
Estimation of mixtures of synthetic cell populations. (A) Parameters used to randomly sample synthetic datasets including simulated measurement noise. The font size of copy number states indicates their probability. (B) A randomly generated sample. The heatmap depicts the distribution of segment CNs in ancestral and subclonal cells, and the proportion of cell populations is shown on the pie chart (red: subclonal, yellow: ancestral, gray: normal). (C) Copy number profile of the sample in (B), with raw bin-wise and segmented copy number values shown in black and red, respectively. (D) Estimated purity of 1,000 synthetic samples with varying levels of noise (σ), plotted against the true theoretical purity. The y = x line is indicated with dashes. (E) Error of purity estimation (absolute difference to true purity) for samples with noise level indicated on the x axis. (F) True and estimated subclonal ratios of 200 synthetic datasets (1,000 samples) with varying levels of noise (σ). (G) Error in subclonal ratio estimation for datasets with increasing noise level. Box plot elements in (E) and (G) stand for the following: center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers. R in (D) and (F) indicates the Pearson correlation coefficient; p < 10⁻⁸ for all panels. See also Figures S1–S3.
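A synthetic dataset of the kind described here could be generated along the following lines. Every numeric value below (CN state probabilities, number of segments, number of subclonal segments, purity, subclonal ratios, noise level) is a hypothetical placeholder chosen for illustration, not a parameter reported by the authors; only the five-samples-per-dataset layout follows from the 200 datasets and 1,000 samples mentioned in the caption.

```python
import numpy as np


def simulate_sample(cn_ancestral, cn_subclonal, purity, subclonal_ratio, sigma, rng):
    """Noisy segment CNs for one sample under an assumed linear mixture."""
    tumor = (1.0 - subclonal_ratio) * cn_ancestral + subclonal_ratio * cn_subclonal
    mixed = purity * tumor + (1.0 - purity) * 2.0      # diploid normal contamination
    return mixed + rng.normal(0.0, sigma, size=mixed.shape)  # measurement noise


rng = np.random.default_rng(0)
n_segments = 80                                  # hypothetical
states = np.array([1, 2, 3, 4])                  # hypothetical CN states
probs = np.array([0.2, 0.5, 0.2, 0.1])           # hypothetical state probabilities

# Ancestral profile, then a subclone that differs by +/-1 in ten random segments.
cn_anc = rng.choice(states, size=n_segments, p=probs).astype(float)
cn_sub = cn_anc.copy()
changed = rng.choice(n_segments, size=10, replace=False)
cn_sub[changed] += rng.choice([-1.0, 1.0], size=10)

# One dataset: five longitudinal samples with increasing subclonal ratio.
ratios = [0.0, 0.1, 0.3, 0.5, 0.7]
dataset = np.stack(
    [simulate_sample(cn_anc, cn_sub, 0.3, r, 0.02, rng) for r in ratios]
)
```

Each row of `dataset` is one sample and each column one segment, so the segments listed in `changed` drift across rows while the remaining clonal segments vary only through noise.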

**Figure 4**
Estimation of mixtures of high-grade serous ovarian cancer cell lines. (A) Copy number profile of the ancestral/sensitive and subclonal/resistant HGSOC cell lines. Raw bin-wise and segmented copy number values are shown in black and red, respectively. Resistant-specific subclonal SCNAs are highlighted. (B) Purity estimates of samples S0-S5. Corrected values are computed using the linear fit in (C). Theoretical purity values are indicated by maroon diamonds. (C) True (theoretical) and estimated tumor purity of 120 *in silico* HGSOC cell line mixtures. y = x and the linear fit of the estimates (y = 0.81x) are shown with dashed and solid lines, respectively. Point shape and shade indicate the total number of reads per sample. (D) Subclonal ratio estimates for samples S1-S5. Shaded and empty bars indicate estimates derived using direct (Gaussian fit) and two-step (from relative ratios in (F)) methods, respectively. Error bars show the 95% confidence interval of the direct estimate, and maroon diamonds indicate theoretical values. (E) True and estimated subclonal ratios of 50 *in silico* datasets constructed of samples from (C) with 50 million reads. (F) Relative subclonal ratio estimates for samples S1-S4, compared to S5. Estimates from each subclonal segment are shown with dots, the median estimates are indicated by black lines, and true values are indicated by maroon diamonds. (G) True and estimated relative subclonal ratios in the 50 datasets shown in (E). See also Figures S5–S8.

**Figure 5**
Estimation in cfDNA samples from patient data. Subclone-specific copy number changes and subclonal ratio in lung cancer patients 1306 (A), 3209 (B), and 2760 (C) from the study by Chen et al., 2019. Left: purity-corrected SCNA profiles. Yellow bars show the CN of each segment in the baseline sample, and red bars indicate subclonal deviations from this value in nonbaseline samples. Regions of subclone-specific CNAs are also indicated by darker colors. Shaded regions indicate the location of putatively therapy-associated cancer genes identified in the original study with CN losses (in red) and CN gains (in blue) and newly identified in liquidCNA (in green, see also Figure S10). A bar of CN > 8 on chromosome 3 (indicated by asterisk) has been omitted from (B) for better visualization. Right: estimated subclonal proportion of each sample with 95% confidence intervals. Note that only samples with >10% purity were analyzed (see also Figure S9) and patient 2760 had no gene annotation in the study by Chen et al., 2019.

References

1. Adalsteinsson V.A., Ha G., Freeman S.S., Choudhury A.D., Stover D.G., Parsons H.A., Gydush G., Reed S.C., Rotem D., Rhoades J. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 2017;8:1324. doi: 10.1038/s41467-017-00965-y.
2. Beroukhim R., Mermel C.H., Porter D., Wei G., Raychaudhuri S., Donovan J., Barretina J., Boehm J.S., Dobson J., Urashima M. The landscape of somatic copy-number alteration across human cancers. Nature. 2010;463:899–905. doi: 10.1038/nature08822.
3. Bettegowda C., Sausen M., Leary R.J., Kinde I., Wang Y., Agrawal N., Bartlett B.R., Wang H., Luber B., Alani R.M. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci. Transl. Med. 2014;6:224ra24. doi: 10.1126/scitranslmed.3007094.
4. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature. 2011;474:609–615. doi: 10.1038/nature10166.
5. Chen X., Chang C.-W., Spoerke J.M., Yoh K.E., Kapoor V., Baudo C., Aimi J., Yu M., Liang-Chu M.M., Suttmann R. Low-pass whole-genome sequencing of circulating cell-free DNA demonstrates dynamic changes in genomic copy number in a squamous lung cancer clinical cohort. Clin. Cancer Res. 2019;25:2254–2263. doi: 10.1158/1078-0432.CCR-18-1593.
