Ultrasensitive plasma-based monitoring of tumor burden using machine-learning-guided signal enrichment

Abstract

In solid tumor oncology, circulating tumor DNA (ctDNA) is poised to transform care through accurate assessment of minimal residual disease (MRD) and therapeutic response monitoring. To overcome the sparsity of ctDNA fragments in low tumor fraction (TF) settings and increase MRD sensitivity, we previously leveraged genome-wide mutational integration through plasma whole-genome sequencing (WGS). Here we now introduce MRD-EDGE, a machine-learning-guided WGS ctDNA single-nucleotide variant (SNV) and copy-number variant (CNV) detection platform designed to increase signal enrichment. MRD-EDGESNV uses deep learning and a ctDNA-specific feature space to increase SNV signal-to-noise enrichment in WGS by ~300× compared to previous WGS error suppression. MRD-EDGECNV also reduces the degree of aneuploidy needed for ultrasensitive CNV detection through WGS from 1 Gb to 200 Mb, vastly expanding its applicability within solid tumors. We harness the improved performance to identify MRD following surgery in multiple cancer types, track changes in TF in response to neoadjuvant immunotherapy in lung cancer and demonstrate ctDNA shedding in precancerous colorectal adenomas. Finally, the radical signal-to-noise enrichment in MRD-EDGESNV enables plasma-only (non-tumor-informed) disease monitoring in advanced melanoma and lung cancer, yielding clinically informative TF monitoring for patients on immune-checkpoint inhibition.

Fig. 1: MRD-EDGESNV deep-learning classifier distinguishes ctDNA SNV fragments from cfDNA artifacts.
Fig. 2: Machine-learning-based error suppression and additional features enhance plasma WGS-based CNV detection sensitivity.
Fig. 3: Tumor-informed monitoring of minimal residual disease in perioperative, neoadjuvant and recurrent disease settings.
Fig. 4: MRD-EDGE tumor-informed detection of ctDNA from screen-detected adenomas and pT1 lesions.
Fig. 5: ctDNA detection in melanoma plasma WGS without matched tumor.
Fig. 6: Serial monitoring of clinical response to immunotherapy with MRD-EDGESNV.

Data availability

Clinical samples sequenced at the New York Genome Center are available from the European Genome-Phenome Archive (EGA) under accession codes EGAS00001007306 and EGAS00001007545. Sequence data from previous work (ref. 14) are available from the EGA under accession codes EGAS00001004406 and EGAS00001007451. For clinical samples from Aarhus University, to protect the privacy and confidentiality of patients in this study, personal data including clinical and sequence data are not made publicly available in a repository or the supplementary material of the article. These data can be requested at any time from Claus Lindbjerg Andersen’s laboratory (cla@clin.au.dk). Any requests will be reviewed within a time frame of 2–3 weeks by the data assessment committee to verify whether the request is subject to any intellectual property or confidentiality obligations. All data shared will be de-identified. Clinical information and sequencing metrics pertinent to clinical samples from Aarhus University are therefore withheld from supplementary tables and present in restricted tables as indicated in Supplementary Tables 1–16. Access to clinical data and processed sequencing data output files (Mutect2 v.4.2.4.1, Strelka2 v.2.9.10 and FACETS v.0.6.2) used in the article requires that the data requestor (legal entity) enter into Collaboration and Data Processing Agreements with the Central Denmark Region (the legal entity controlling and responsible for the data). Requests for access to raw sequencing data furthermore require that the purpose of the data re-analysis is approved by The Danish National Committee on Health Research Ethics. Upon a reasonable request, the authors, on behalf of the Central Denmark Region, will enter into a collaboration with the data requestor to apply for approval. Additional information can be found at https://genome.au.dk/library/GDK000010/.

Code availability

Computer code and computational models for reproducing results from this study are included in revision materials and will be available upon request from the Center for Technology Licensing at Cornell University for academic (non-commercial) groups. The code can be accessed at ctl.cornell.edu/industry/mrdedge-license-request/.

References

  1. Powles, T. et al. ctDNA guiding adjuvant immunotherapy in urothelial carcinoma. Nature https://doi.org/10.1038/s41586-021-03642-9 (2021).

  2. Bratman, S. V. et al. Personalized circulating tumor DNA analysis as a predictive biomarker in solid tumor patients treated with pembrolizumab. Nat. Cancer 1, 873–881 (2020).

  3. Tie, J. et al. Circulating tumor DNA analysis guiding adjuvant therapy in stage II colon cancer. N. Engl. J. Med. 386, 2261–2272 (2022).

  4. Phallen, J. et al. Direct detection of early-stage cancers using circulating tumor DNA. Sci. Transl. Med. https://doi.org/10.1126/scitranslmed.aan2415 (2017).

  5. Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20, 548–554 (2014).

  6. Nabet, B. Y. et al. Noninvasive early identification of therapeutic benefit from immune checkpoint inhibition. Cell 183, 363–376 (2020).

  7. Rose Brannon, A. et al. Enhanced specificity of clinical high-sensitivity tumor mutation profiling in cell-free DNA via paired normal sequencing using MSK-ACCESS. Nat. Commun. 12, 3770 (2021).

  8. Magbanua, M. J. M. et al. Circulating tumor DNA in neoadjuvant-treated breast cancer reflects response and survival. Ann. Oncol. 32, 229–239 (2021).

  9. Henriksen, T. V. et al. Circulating tumor DNA in stage III colorectal cancer, beyond minimal residual disease detection, towards assessment of adjuvant therapy efficacy and clinical behavior of recurrences. Clin. Cancer Res. https://doi.org/10.1158/1078-0432.CCR-21-2404 (2021).

  10. Kotani, D. et al. Molecular residual disease and efficacy of adjuvant chemotherapy in patients with colorectal cancer. Nat. Med. 29, 127–134 (2023).

  11. Kurtz, D. M. et al. Enhanced detection of minimal residual disease by targeted sequencing of phased variants in circulating tumor DNA. Nat. Biotechnol. https://doi.org/10.1038/s41587-021-00981-w (2021).

  12. Haque, I. S. & Elemento, O. Challenges in using ctDNA to achieve early detection of cancer. Preprint at bioRxiv https://doi.org/10.1101/237578 (2017).

  13. Avanzini, S. et al. A mathematical model of ctDNA shedding predicts tumor detection size. Sci. Adv. 6, eabc4308 (2020).

  14. Zviran, A. et al. Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring. Nat. Med. 26, 1114–1124 (2020).

  15. Newman, A. M. et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat. Biotechnol. 34, 547–555 (2016).

  16. Wan, J. C. M. et al. ctDNA monitoring using patient-specific sequencing and integration of variant reads. Sci. Transl. Med. https://doi.org/10.1126/scitranslmed.aaz8084 (2020).

  17. Gydush, G. et al. Massively parallel enrichment of low-frequency alleles enables duplex sequencing at low depth. Nat. Biomed. Eng. 6, 257–266 (2022).

  18. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).

  19. Alexandrov, L. B. et al. Mutational signatures associated with tobacco smoking in human cancer. Science 354, 618–622 (2016).

  20. Underhill, H. R. et al. Fragment length of circulating tumor DNA. PLoS Genet. 12, e1006162 (2016).

  21. Mouliere, F. et al. Enhanced detection of circulating tumor DNA by fragment size analysis. Sci. Transl. Med. 10, eaat4921 (2018).

  22. Guo, J. et al. Quantitative characterization of tumor cell-free DNA shortening. BMC Genomics 21, 473 (2020).

  23. Gonzalez-Perez, A., Sabarinathan, R. & Lopez-Bigas, N. Local determinants of the mutational landscape of the human genome. Cell 177, 101–114 (2019).

  24. Woo, Y. H. & Li, W.-H. DNA replication timing and selection shape the landscape of nucleotide variation in cancer genomes. Nat. Commun. 3, 1004 (2012).

  25. Haradhvala, N. J. et al. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell 164, 538–549 (2016).

  26. Donley, N. & Thayer, M. J. DNA replication timing, genome stability and cancer: late and/or delayed DNA replication timing is associated with increased genomic instability. Semin. Cancer Biol. 23, 80–89 (2013).

  27. Polak, P. et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer. Nature 518, 360–364 (2015).

  28. Bruhm, D. C. et al. Single-molecule genome-wide mutation profiles of cell-free DNA for non-invasive detection of cancer. Nat. Genet. 55, 1301–1310 (2023).

  29. Taylor, A. M. et al. Genomic and functional approaches to understanding cancer aneuploidy. Cancer Cell 33, 676–689.e3 (2018).

  30. Deshpande, A., Walradt, T., Hu, Y., Koren, A. & Imielinski, M. Robust foreground detection in somatic copy number data. Preprint at bioRxiv https://doi.org/10.1101/847681 (2019).

  31. Raine, K. M. et al. AscatNgs: identifying somatically acquired copy-number alterations from whole-genome sequencing data. Curr. Protoc. Bioinform. 56, 15.9.1–15.9.17 (2016).

  32. Carter, S. L. et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421 (2012).

  33. Cristiano, S. et al. Genome-wide cell-free DNA fragmentation in patients with cancer. Nature 570, 385–389 (2019).

  34. Snyder, M. W., Kircher, M., Hill, A. J., Daza, R. M. & Shendure, J. Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164, 57–68 (2016).

  35. Jiang, P. et al. Preferred end coordinates and somatic variants as signatures of circulating tumor DNA associated with hepatocellular carcinoma. Proc. Natl Acad. Sci. USA 115, E10925–E10933 (2018).

  36. Renaud, G. et al. Unsupervised detection of fragment length signatures of circulating tumor DNA using non-negative matrix factorization. eLife https://doi.org/10.7554/eLife.71569 (2022).

  37. Zack, T. I. et al. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45, 1134–1140 (2013).

  38. Reinert, T. et al. Analysis of plasma cell-free DNA by ultradeep sequencing in patients with stages I to III colorectal cancer. JAMA Oncol. 5, 1124–1131 (2019).

  39. Tan, A. C. et al. Abstract 5114: ultra-sensitive detection of minimal residual disease (MRD) through whole genome sequencing (WGS) using an AI-based error suppression model in resected early-stage non-small cell lung cancer (NSCLC). Cancer Res. 82, 5114 (2022).

  40. Tie, J. et al. Circulating tumor DNA analyses as markers of recurrence risk and benefit of adjuvant therapy for stage III colon cancer. JAMA Oncol. 5, 1710–1717 (2019).

  41. Altorki, N. K. et al. Neoadjuvant durvalumab with or without stereotactic body radiotherapy in patients with early-stage non-small-cell lung cancer: a single-centre, randomised phase 2 trial. Lancet Oncol. 22, 824–835 (2021).

  42. Kageyama, S.-I. et al. Radiotherapy increases plasma levels of tumoral cell-free DNA in non-small cell lung cancer patients. Oncotarget 9, 19368–19378 (2018).

  43. Shaw, J. et al. Serial postoperative ctDNA monitoring of breast cancer recurrence. J. Clin. Oncol. 40, 562 (2022).

  44. Myint, N. N. M. et al. Circulating tumor DNA in patients with colorectal adenomas: assessment of detectability and genetic heterogeneity. Cell Death Dis. 9, 894 (2018).

  45. Junca, A. et al. Detection of colorectal cancer and advanced adenoma by liquid biopsy (Decalib Study): the ddPCR challenge. Cancers https://doi.org/10.3390/cancers12061482 (2020).

  46. Galanopoulos, M. et al. Comparative study of mutations in single nucleotide polymorphism loci of KRAS and BRAF genes in patients who underwent screening colonoscopy, with and without premalignant intestinal polyps. Anticancer Res. 37, 651–657 (2017).

  47. Rasmussen, L. et al. Protocol outlines for parts 1 and 2 of the prospective endoscopy III study for the early detection of colorectal cancer: validation of a concept based on blood biomarkers. JMIR Res. Protoc. 5, e182 (2016).

  48. Alcántara Torres, M. et al. DNA aneuploidy in colorectal adenomas. Role in the adenoma-carcinoma sequence. Rev. Esp. Enferm. Dig. 97, 7–15 (2005).

  49. Lin, Y. et al. Intensity-modulated radiation therapy for definitive treatment of cervical cancer: a meta-analysis. Radiat. Oncol. 13, 177 (2018).

  50. Wolff, R. K. et al. Mutation analysis of adenomas and carcinomas of the colon: early and late drivers. Genes Chromosomes Cancer 57, 366–376 (2018).

  51. Cindy Yang, S. Y. et al. Pan-cancer analysis of longitudinal metastatic tumors reveals genomic alterations and immune landscape dynamics associated with pembrolizumab sensitivity. Nat. Commun. 12, 5137 (2021).

  52. Postow, M. A. et al. Adaptive dosing of nivolumab + ipilimumab immunotherapy based upon early, interim radiographic assessment in advanced melanoma (The ADAPT-IT Study). J. Clin. Oncol. https://doi.org/10.1200/JCO.21.01570 (2021).

  53. Adalsteinsson, V. A. et al. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 8, 1324 (2017).

  54. Weber, S. et al. Dynamic changes of circulating tumor DNA predict clinical outcome in patients with advanced non–small-cell lung cancer treated with immune checkpoint inhibitors. JCO Precis. Oncol. https://doi.org/10.1200/PO.21.00182 (2021).

  55. Zhang, Q. et al. Prognostic and predictive impact of circulating tumor DNA in patients with advanced cancers treated with immune checkpoint blockade. Cancer Discov. https://doi.org/10.1158/2159-8290.CD-20-0047 (2020).

  56. Wolchok, J. D. et al. Overall survival with combined nivolumab and ipilimumab in advanced melanoma. N. Engl. J. Med. 377, 1345–1356 (2017).

  57. Bai, X. et al. Early use of high-dose glucocorticoid for the management of irAE is associated with poorer survival in patients with advanced melanoma treated with anti-PD-1 monotherapy. Clin. Cancer Res. 27, 5993–6000 (2021).

  58. Almogy, G. et al. Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform. Preprint at bioRxiv https://doi.org/10.1101/2022.05.29.493900 (2022).

  59. Chowell, D. et al. Improved prediction of immune checkpoint blockade efficacy across multiple cancer types. Nat. Biotechnol. https://doi.org/10.1038/s41587-021-01070-8 (2021).

  60. Gerstung, M. et al. The evolutionary history of 2,658 cancers. Nature 578, 122–128 (2020).

  61. Illumina. TruSeq DNA PCR-Free Reference Guide (Illumina, 2017).

  62. Reinert, T. et al. Analysis of circulating tumour DNA to monitor disease burden following colorectal cancer surgery. Gut 65, 625–634 (2016).

  63. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  64. Jiang, H., Lei, R., Ding, S.-W. & Zhu, S. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinform. 15, 182 (2014).

  65. Bergmann, E. A., Chen, B.-J., Arora, K., Vacic, V. & Zody, M. C. Conpair: concordance and contamination estimator for matched tumor-normal pairs. Bioinformatics 32, 3196–3198 (2016).

  66. Arora, K. et al. Deep whole-genome sequencing of 3 cancer cell lines on 2 sequencing platforms. Sci. Rep. 9, 19123 (2019).

  67. Favero, F. et al. Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data. Ann. Oncol. 26, 64–70 (2015).

  68. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  69. Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).

  70. Benjamin, D. et al. Calling somatic SNVs and indels with Mutect2. Preprint at bioRxiv https://doi.org/10.1101/861054 (2019).

  71. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  72. Rozowsky, J. et al. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat. Biotechnol. 27, 66–75 (2009).

  73. Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).

  74. Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).

  75. Xiong, K. & Ma, J. Revealing Hi-C subcompartments by imputing inter-chromosomal chromatin interactions. Nat. Commun. 10, 5069 (2019).

  76. Imielinski, M. et al. fragCounter: GC and mappability corrected fragment coverage for paired end whole genome sequencing. GitHub https://github.com/mskilab-org/fragCounter (2018).

  77. van de Geijn, B., McVicker, G., Gilad, Y. & Pritchard, J. K. WASP: allele-specific software for robust molecular quantitative trait locus discovery. Nat. Methods 12, 1061–1063 (2015).

  78. Dentro, S. C., Wedge, D. C. & Van Loo, P. Principles of reconstructing the subclonal architecture of cancers. Cold Spring Harb. Perspect. Med. https://doi.org/10.1101/cshperspect.a026625 (2017).

  79. Henriksen, T. V. et al. Error characterization and statistical modeling improves circulating tumor DNA detection by droplet digital PCR. Clin. Chem. 68, 657–667 (2022).

  80. Henriksen, T. V. et al. Comparing single-target and multitarget approaches for postoperative circulating tumour DNA detection in stage II-III colorectal cancer patients. Mol. Oncol. 16, 3654–3665 (2022).

  81. Cheng, D. T. et al. Memorial Sloan Kettering-integrated mutation profiling of actionable cancer targets (MSK-IMPACT): a hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J. Mol. Diagn. 17, 251–264 (2015).

  82. Shen, R. & Seshan, V. E. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res. 44, e131 (2016).

  83. Davidson-Pilon, C. lifelines, survival analysis in Python. Zenodo https://doi.org/10.5281/zenodo.5512044 (2021).

  84. Zivich, P., Davidson-Pilon, C., Reger, D., Diong, J. & The Gitter Badger. pzivich/zEpid: v.0.9.0. Zenodo https://doi.org/10.5281/zenodo.7234506 (2020).

Acknowledgements

We thank the patients who contributed plasma and tissue to this project as well as their families. We thank and acknowledge the Danish Cancer Biobank and the Endoscopy III study team (L. Nannestad Jørgensen and M. Rasmussen (Bispebjerg Hospital, Copenhagen), M. R. Madsen and A. H. Madsen (Herning Hospital, Herning), L. Ferm and E. Rømer (Hvidovre Hospital, Hvidovre), T. Boest and B. Andersen (Randers Hospital, Randers) and A. Khalid (Viborg Hospital, Viborg)) for providing access to blood and tissue materials. We also thank the Landau laboratory and the NYGC computational biology and sequencing teams for help and feedback throughout this work. MGH investigators and sample collection are supported by the Adelson Medical Research Foundation. The Endoscopy III cohort was established by support from The Andersen Foundation, The Augustinus Foundation, The Beckett Foundation, The Inger Bonnéns Fund, The Hans & Nora Buchards Fund, The Walter Christensen Fund, The P.M. Christiansen Fund, The Kong Chr. X’s Fund, The Aase & Ejnar Danielsens Fund, The Family Erichsens Fund, The Knud & Edith Eriksens Fund, The Svend Espersens Fund, The Elna & Jørgen Fagerholts Cancer Research Foundation, The Sofus Carl Emil Friis Scholarship, The Torben & Alice Frimodts Fund, The Eva & Henry Frænkels Fund, The Gangsted Foundation, Thora & Viggo Groves Memorial Scholarship, The H-Foundation, Erna Hamiltons Scholarship, Søren & Helene Hempels Scholarship, The Sven & Ina Hansens Fund, The Henrik Henriksen Fund, Carl & Ellen Hertz’ Scholarship, Jørgen Holm & Elisa F. Hansen Memorial Scholarship, The Jochum Foundation, The KID Foundation, The Kornerup Foundation, The Linex Foundation, The Dagmar Marshalls Fund, The Midtjyske Fund, The Axel Muusfeldts Fund, The Børge Nielsens Fund, Michael Hermann Nielsens Memorial Scholarship, The Arvid Nilssons Fund, The Obelske Family Fund, The Krista & Viggo Petersens Fund, The Willy & Ingeborg Reinhards Fund, The Kathrine & Vigo Skovgaards Fund, The Toyota Foundation, The Vissing Foundation, The Danish Cancer Research Foundation and Hvidovre Hospitals Research Pool. This work was supported by the Mark Foundation Aspire Award (D.A.L. and C.L.A.); Novo Nordisk Foundation (grant no. NNF17OC0025052 to C.L.A.); the Danish Cancer Society (grant numbers R146-A9466-16-S2 (C.L.A.); R231-A13845 (C.L.A.); and R257-A14700 (C.L.A.)). MSKCC investigators are supported by Cancer Center Support Grant P30 CA08748 from the National Institutes of Health/National Cancer Institute. A.J.W. received support from the Conquer Cancer Foundation Young Investigator Award, the Melanoma Research Alliance Young Investigator Award and the NCI K08 Mentored Career Scientist Award. D.A.L. is supported by the Burroughs Wellcome Fund Career Award for Medical Scientists, the National Cancer Institute (R01 CA266619), the Valle Scholar Award, Mark Foundation Emerging Leader Award and the Melanoma Research Alliance Established Investigator Award. D.A.L. is a Scholar of the Leukemia & Lymphoma Society. This work was made possible by the MacMillan Family Foundation and the MacMillan Center for the Study of the Non-Coding Cancer Genome at the New York Genome Center. The opinions, results and conclusions reported in this paper are those of the authors and are independent from these funding sources.

Author information

Contributions

D.A.L., A.J.W., M.S. and C.L.A. conceived and designed the project. S.A. and K.G. designed the TNBC recurrence monitoring aspect of the project. D.A.L., N.R., C.L.A., S.A. and K.G. served as lead principal investigators. A.J.W., M.S., A.F., N.Ø., C.M., L.S., D.T.F., K.K., M.S.M., M.M., D.M., L.W., W.d.B., C.L., T.S., C.S., D.V., A.J.M., R.M., V.C., E.K., D.L., K.G., J.S., M.K.C., G.B., J.D.W., A.S., S.T., N.K.A., M.A.P., C.T., J.N., S.Ø.J., M.H.R., C.L.A. and D.A.L. recruited patients, performed patient selection, collected and curated clinical information, prepared samples for sequencing and performed next-generation sequencing and data storage. T.V.H., A.F., M.H.R. and C.L.A. performed tumor-informed ctDNA analysis by digital PCR. A.J.W., M.S., D.H., R.B., C.C.K., L.A., J.Q., S.R., A.A., A.D., W.F.H., M.Z., J.B., T.L., M.I., E.Z., S.A., N.R., M.F.B., R.H.S. and D.A.L. performed the computational genomics analyses. A.J.W. and D.A.L. wrote the paper with comments and contributions from all authors. S.A., C.L.A., C.P., J.S.S., A.P.C., E.Z., C.T., M.A.P. and N.R. reviewed and edited the manuscript.

Corresponding authors

Correspondence to Adam J. Widman or Dan A. Landau.

Ethics declarations

Competing interests

D.A.L., A.J.W., C.C.K. and J.B. are listed as inventors on a pending patent application (WO2023018791A1), filed by Cornell University, which is directed to methods of detecting SNVs for the purposes of MRD detection and other plasma-based cancer monitoring. D.A.L., A.J.W. and M.S. are listed as inventors on a pending patent application (WO2023133093A1), filed by Cornell University, which is directed to methods of detecting CNVs for the purposes of MRD detection and other plasma-based cancer monitoring. A.P.C. is listed as an inventor on submitted patents pertaining to cell-free DNA (US patent applications 63/237,367, 63/056,249, 63/015,095 and 16/500,929) and receives consulting fees from Eurofins Viracor. A.S. receives research funding from AstraZeneca, has served on Advisory Boards for AstraZeneca, Blueprint Medicines and Jazz Pharmaceuticals, and has been a consultant for Genentech. M.A.P. has received consulting fees from BMS, Chugai, Cancer Expert Now, Intellisphere, Merck, MJH Associates, Nektar, Pfizer, Uptodate, WebMD and Erasca, and received institutional support from RGenix, Infinity, BMS, Merck, Genentech and Novartis. C.L.A. reports collaborations with C2i Genomics and Natera. C.S. has received honoraria for advisory board participation from Pfizer, Novartis, Knight, Bayer, Merck, Roche and Lilly within the past 2 years. None of the honoraria has been in excess of CAD$5,000 and total for honoraria received is less than CAD$25,000. M.K.C. has received consulting fees from BMS, Merck, InCyte, Moderna, ImmunoCore and AstraZeneca and receives institutional support from BMS. S.T. is funded by Cancer Research UK (grant ref. no. A29911); the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC10988), the UK Medical Research Council (FC10988) and the Wellcome Trust (FC10988); the National Institute for Health Research Biomedical Research Centre at the Royal Marsden Hospital and Institute of Cancer Research (grant ref. no. A109), the Royal Marsden Cancer Charity, The Rosetrees Trust (grant ref. no. A2204), Ventana Medical Systems (grant reference nos. 10467 and 10530), the National Institute of Health (U01 CA247439) and Melanoma Research Alliance (award ref. no. 686061). S.T. has received speaking fees from Roche, AstraZeneca, Novartis and Ipsen. S.T. has the following patents filed: Indel mutations as a therapeutic target and predictive biomarker PCTGB2018/051892 and PCTGB2018/051893. G.B. has sponsored research agreements through her institution with: Olink Proteomics, Teiko Bio, InterVenn Biosciences and Palleon Pharmaceuticals. G.B. served on advisory boards for Iovance, Merck, Nektar Therapeutics, Novartis and Ankyra Therapeutics. G.B. consults for Merck, InterVenn Biosciences, Iovance and Ankyra Therapeutics. G.B. holds equity in Ankyra Therapeutics. M.F.B. reports consulting for AstraZeneca, Eli Lilly, Paige.AI, research support from Boundless Bio and intellectual property rights for SOPHiA Genetics. J.D.W. is a consultant for Apricity, Ascentage Pharma, AstraZeneca, BeiGene, Bicara Therapeutics, Bristol Myers Squibb, Daiichi Sankyo, Dragonfly, Imvaq, Larkspur, Psioxus, Recepta, Takeda, Tizona, Trishula Therapeutics and Sellas. J.D.W. received grant/research support from Bristol Myers Squibb and Enterome. J.D.W. has equity in Apricity, Arsenal IO/CellCarta, Ascentage, Imvaq, Linneaus, Larkspur, Georgiamune, Maverick, Tizona Therapeutics and Xenimmune. D.A.L. received research support from Illumina. D.A.L. participated in advisory boards for Pangea, Mission Bio and Alethiomics. D.A.L. is a scientific co-founder of C2i Genomics. The other authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Ulrike Harjes, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 MRD-EDGESNV feature selection, model architecture and performance.

a) Feature density plots for ctDNA and cfDNA SNV artifacts used in the MRD-EDGESNV NSCLC model. In this comparison, ctDNA SNV fragments are identified from consensus mutation calls in high-burden NSCLC plasma samples and compared to cfDNA SNV fragments (sequencing errors) drawn from within the same plasma sample to preclude sample-specific biases when establishing the predictive ability of individual features. b) SNV classification performance for different machine-learning models. F1 score was assessed on tumor-confirmed melanoma ctDNA SNV fragments vs. cfDNA artifacts from healthy controls. Random subsamplings (n = 6) were drawn from the held-out melanoma validation set, which was split into tenths for this analysis. We compared performance between MRD-EDGESNV and its separate components (left), as well as to other ML architectures (right). c) Fragment-level ROC analysis for the MRD-EDGESNV classifier for different cancer types. Performance is assessed on filtered fragments (~90% of low-quality cfDNA artifacts are excluded by quality filters) in held-out validation sets (Supplementary Table 1) for melanoma (blue), CRC (green) and NSCLC (red). Colored dots on curves indicate the tumor-informed decision threshold (0.5) used in each tumor type to classify individual SNV fragments as ctDNA or cfDNA artifact. d) Signal-to-noise enrichment analysis for MRDetect and for each step of the MRD-EDGESNV tumor-informed pipeline using the same in silico mixing replicates as in Fig. 1e (n = 20 replicates). Final pipeline enrichment is 118-fold for MRD-EDGESNV vs. 8.3-fold for MRDetectSNV in the same datasets.
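
The fragment-level quantities named in panels b to d can be illustrated with a minimal sketch. It assumes each candidate SNV fragment carries a classifier score between 0 and 1 and a known label (ctDNA or artifact); the function names and the toy data are hypothetical, and the enrichment definition used here (fraction of ctDNA fragments retained divided by fraction of artifact fragments retained) is an assumption for illustration, not necessarily the definition used in the study.

```python
from typing import List, Sequence


def classify_fragments(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Call a fragment ctDNA (True) when its score meets the decision threshold."""
    return [s >= threshold for s in scores]


def f1_score(labels: Sequence[bool], preds: Sequence[bool]) -> float:
    """F1 = 2TP / (2TP + FP + FN), with ctDNA as the positive class."""
    tp = sum(1 for l, p in zip(labels, preds) if l and p)
    fp = sum(1 for l, p in zip(labels, preds) if not l and p)
    fn = sum(1 for l, p in zip(labels, preds) if l and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def enrichment(labels: Sequence[bool], preds: Sequence[bool]) -> float:
    """Assumed definition: retained-signal fraction over retained-noise fraction."""
    n_signal = sum(1 for l in labels if l)
    n_noise = len(labels) - n_signal
    if n_signal == 0 or n_noise == 0:
        raise ValueError("need at least one ctDNA and one artifact fragment")
    kept_signal = sum(1 for l, p in zip(labels, preds) if l and p)
    kept_noise = sum(1 for l, p in zip(labels, preds) if not l and p)
    if kept_noise == 0:
        return float("inf")  # every artifact was filtered out
    return (kept_signal / n_signal) / (kept_noise / n_noise)


if __name__ == "__main__":
    # Toy data: four ctDNA fragments followed by four artifacts (hypothetical scores).
    scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.3, 0.6]
    labels = [True, True, True, True, False, False, False, False]
    preds = classify_fragments(scores)
    print(f1_score(labels, preds), enrichment(labels, preds))
```

With the toy data, three of four ctDNA fragments and one of four artifacts pass the 0.5 threshold, so the sketch reports an F1 of 0.75 and a 3-fold enrichment under the assumed definition.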

Extended Data Fig. 2 Lower limit of detection studies with MRD-EDGESNV.

a) In silico studies of cfDNA from the metastatic colorectal cancer sample CRC-863 mixed into cfDNA from a healthy plasma sample (CTRL-335) at mixing fractions TF = 10⁻⁶–10⁻³ at 29X coverage depth, performed in 30 technical replicates with independent sampling seeds. Tumor-informed MRD-EDGESNV enables sensitive detection of TF as low as 1 × 10⁻⁵ (AUC 0.80), measured by Z score of SNV detection rates against unmixed control plasma (TF = 0). b) In silico studies of cfDNA from the metastatic small cell lung cancer sample SC-128_0w mixed into cfDNA from a healthy plasma sample (CTRL-216) at mixing fractions TF = 10⁻⁶–10⁻³ at 25X coverage depth, performed in 20 technical replicates with independent sampling seeds. Tumor-informed MRD-EDGESNV enables sensitive detection of TF as low as 5 × 10⁻⁶ (AUC 0.86), measured by Z score of SNV detection rates against unmixed control plasma (TF = 0). Box plots represent median, lower and upper quartiles; whiskers correspond to 1.5 × interquartile range. An AUC heatmap measures detection vs. TF = 0 at different mixed TFs. c) Sensitivity at 95% specificity for tumor-informed MRD-EDGESNV in silico studies in CRC (green), SCLC (red), and melanoma (blue). Mixed TF replicates were compared to TF = 0 replicates by sample-level MRD-EDGESNV Z score. Error bars indicate 95% binomial confidence interval for empiric sensitivity based on number of technical replicates (n = 30 CRC, n = 20 SCLC, n = 20 melanoma). d–f) Detection performance vs. TF = 0 at different mixed TFs for MRD-EDGESNV (blue) and MRDetectSNV SVM (gray). The AUC is measured by a sample Z score (positive label) compared to TF = 0 distribution (negative label) for each replicate at each TF. Error bars represent 95% CI (DeLong AUC variance). (bottom) Normalized error for a subset of mixed TFs between MRD-EDGESNV and MRDetectSNV. Error bars represent 95% CI. Normalized error is shown for TFs where AUC is less than 1 and is measured as (TF_estimated − TF_mixed)/TF_mixed. d) In silico CRC studies as defined in (a). e) In silico SCLC studies as defined in (b). f) In silico studies of cfDNA from the metastatic cutaneous melanoma sample MEL-100 mixed into cfDNA from a healthy plasma sample (CTRL-216) at mixing fractions TF = 10⁻⁷–10⁻⁴ at 16X coverage depth, performed in 20 technical replicates with independent sampling seeds.
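The three summary statistics in this legend (Z score against the TF = 0 distribution, AUC, and normalized error) can be sketched as follows. This is a minimal illustration with hypothetical detection rates. The AUC is computed as a plain rank statistic, and the DeLong variance mentioned in the legend is not reproduced here.

```python
from statistics import mean, stdev


def z_scores(rates, control_rates):
    """Z score of each detection rate against the TF = 0 (control) distribution."""
    mu = mean(control_rates)
    sd = stdev(control_rates)
    return [(r - mu) / sd for r in rates]


def auc(positive, negative):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positive
        for n in negative
    )
    return wins / (len(positive) * len(negative))


def normalized_error(tf_estimated, tf_mixed):
    """(TF_estimated - TF_mixed) / TF_mixed, as defined in the legend."""
    return (tf_estimated - tf_mixed) / tf_mixed


# Hypothetical SNV detection rates for TF = 0 replicates and mixed replicates.
control_rates = [1.0e-6, 1.2e-6, 0.8e-6, 1.1e-6, 0.9e-6]
mixed_rates = [1.3e-6, 1.15e-6, 1.4e-6, 0.95e-6]

mixed_z = z_scores(mixed_rates, control_rates)
control_z = z_scores(control_rates, control_rates)

print("AUC vs. TF = 0:", auc(mixed_z, control_z))
print("Normalized error:", normalized_error(1.2e-5, 1.0e-5))
```

Because the Z transform is monotonic, the AUC on Z scores equals the AUC on raw detection rates; the Z score matters when a fixed detection threshold is applied across samples.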

Extended Data Fig. 3 Experimental mixing studies with MRD-EDGESNV.

a) Plasma TF inference with MRD-EDGESNV using genome-wide SNV integration for in vitro dilutions of the pretreatment melanoma plasma MEL-137_A in expired plasma harvested through plasmapheresis from a donor without known cancer. Dilutions were performed in 2 replicates, and a mean noise rate for the patient-specific mutation profile was drawn from n = 17 concurrently sequenced SCLC plasma samples. b) MRD-EDGESNV (left) and MRDetectSNV (right) Z score discrimination between ctDNA detected in experimental plasma replicates (blue dots, replicate 1, and green dots, replicate 2) from the patient MEL-137 and downsampled TF = 0 replicates (white boxes, n = 30, 15 downsampled alignment files from 2 TF = 0 replicates). Signal is measured from SNV detection rates on patient plasma and the downsampled TF = 0 plasma samples using the patient-specific SNV profile for MEL-137. Positive ctDNA detection (dotted blue line) was defined as patient plasma MRD-EDGESNV or MRDetectSNV Z score above a detection threshold of 95% specificity against downsampled TF = 0 plasma in the ROC for each platform. Sample-level Z scores were capped at 10 to allow greater visibility of Z scores around the detection threshold.
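A detection threshold at 95% specificity and the display cap at Z = 10 can be sketched as below. The control Z scores are hypothetical, and choosing the threshold as an order statistic of the TF = 0 replicates is one simple way to meet the specificity target, not necessarily the procedure used for the figure.

```python
import math


def specificity_threshold(control_z, specificity=0.95):
    """Pick a threshold from the control Z scores so that at most
    (1 - specificity) of controls lie strictly above it."""
    ordered = sorted(control_z)
    # Small epsilon guards against floating-point error in (1 - specificity).
    allowed_fp = math.floor(len(ordered) * (1 - specificity) + 1e-9)
    return ordered[len(ordered) - allowed_fp - 1]


def is_detected(z, threshold):
    """Positive ctDNA detection when the sample Z score exceeds the threshold."""
    return z > threshold


def cap_for_display(z, cap=10.0):
    """Cap Z scores so values near the detection threshold stay visible on a plot."""
    return min(z, cap)


# Hypothetical Z scores for 30 downsampled TF = 0 replicates.
control_z = [(i - 15) / 10 for i in range(30)]
threshold = specificity_threshold(control_z)

# Hypothetical patient plasma Z scores.
patient_z = [25.0, 3.2, 0.4]
detected = [is_detected(z, threshold) for z in patient_z]
capped = [cap_for_display(z) for z in patient_z]

print("threshold:", threshold)
print("detected:", detected)
print("capped for display:", capped)
```

With 30 controls, one false positive is allowed, so the threshold is the second-largest control Z score and the achieved specificity is 29/30.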

Extended Data Fig. 4 In silico mixing studies of MRD-EDGECNV in CRC, NSCLC, and melanoma.

a) In silico mixing studies in which high TF plasma samples were admixed into non-cancer plasma (left, right) or low TF plasma samples (middle). Admixtures (n = 25 technical replicates per mix fraction) model tumor fractions of 10⁻⁶–10⁻³. Box plots represent median, lower and upper quartiles; whiskers correspond to 1.5 × interquartile range. An AUC heatmap demonstrates detection performance vs. TF = 0 at different mixed TFs as measured by a sample Z score (derived from summed read-depth skews for read-depth classifier, BAF score for BAF classifier, summed fragment length entropy for fragment length entropy classifier, Methods) compared to TF = 0 distribution for each replicate. (left, right) Pretreatment NSCLC plasma from the patient NSCLC-45 was mixed into non-cancer control plasma from the patient CTRL-206 in 25 technical replicates. The read depth (left) and fragment length entropy (right) classifiers demonstrate similar performance in pretreatment NSCLC admixtures compared to CRC admixtures (Fig. 2b–d). (middle) Pretreatment melanoma plasma from the patient MEL-12 was mixed into post-treatment plasma following a major response to immunotherapy in 25 technical replicates. The BAF classifier demonstrates similar performance compared to CRC admixtures (Fig. 2c) and accounts for bias that may be encountered when mixing plasma into matched peripheral blood mononuclear cell (PBMC) normal, as performed in CRC. b) Z scores for the read-depth classifier in neutral regions (no copy number gain or loss in the matched tumor WGS data) for NSCLC demonstrate the expected absence of directional read-depth skew in copy-neutral regions. c) Assessment of preoperative plasma, post-adjuvant plasma, and matched normal (from PBMCs) BAF in SNPs before (left) and after (right) SNP quality filters in CRC (patient CRC-465). Filters include mapping bias correction and outlier exclusion criteria.
To demonstrate the relationship between signal and phased SNPs, the major allele in plasma is randomly permuted to be in phase or out of phase at the percentage specified along the x axis. Quality filters enable appropriate signal inference for preoperative plasma (highest signal), postoperative MRD (intermediate signal), and PBMC BAF (minimal signal).

Extended Data Fig. 5 Clinical performance of tumor-informed MRD-EDGE in stage III perioperative colorectal cancer.

a) (left) ROC analysis on MRD-EDGE (blue) and MRDetect (gray) in preoperative stage III CRC. Preoperative plasma samples with matched tumor mutation profiles (n = 15) are compared with control plasma samples assessed against all unmatched stage III CRC tumor mutation profiles (n = 15 tumor profiles assessed across 25 control samples from Aarhus controls cohort, n = 375 control-comparisons). Twenty control samples included in SNV model training and/or used in the MRD-EDGECNV read-depth PON were withheld from this analysis. (middle) ROC analysis with MRD-EDGESNV (blue) and MRDetectSNV (gray). Preoperative plasma samples with matched tumor mutation profiles (n = 15) are compared with unmatched control plasma samples assessed against all unmatched stage III CRC tumor mutation profiles (n = 15 tumor profiles assessed across 40 control samples from Aarhus controls cohort, n = 600 control-comparisons). Five control samples included in SNV model training were withheld from this analysis. (right) ROC analysis with MRD-EDGECNV (blue) and MRDetectCNV (gray). Preoperative plasma sample CNV-based Z scores (n = 15) are compared against control plasma samples assessed against all unmatched stage III CRC tumor mutation profiles (n = 15 tumor profiles assessed across 25 control samples from Aarhus controls cohort, n = 375 control-comparisons). Twenty control samples included in the read-depth PON were withheld from this analysis. b) Cross-patient ROC analysis on preoperative stage III CRC plasma samples for MRD-EDGESNV demonstrates similar performance to control (non-cancer) plasma. Preoperative plasma samples with matched tumor profiles (n = 15) are compared with stage III CRC plasma samples assessed against all unmatched stage III CRC tumor profiles (n = 15 tumor profiles assessed across 14 cross-patient samples, n = 210 cross-comparisons).
c) ROC analysis performed on CNV-based Z score values for read depth (left), BAF (middle), and fragment length entropy (right) CNV classifiers in preoperative stage III CRC. Preoperative plasma samples with matched tumor profiles (n = 15) are compared with control plasma samples assessed against all unmatched tumor profiles (n = 375 comparisons for read depth, 15 tumor profiles assessed across 25 control samples; n = 675 comparisons for BAF and fragment length entropy, 15 tumor profiles assessed across 45 control samples). Twenty control samples included in the read-depth panel of normal samples were withheld from read-depth analysis.
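The comparison counts quoted in this legend follow from pairing every tumor mutation profile with every unmatched plasma sample. A minimal sketch of that pairing, using hypothetical sample identifiers, is:

```python
from itertools import product


def control_comparisons(profiles, controls):
    """Every tumor profile evaluated in every control plasma sample."""
    return list(product(profiles, controls))


def cross_patient_comparisons(patients):
    """Every tumor profile evaluated in every other patient's plasma."""
    return [
        (profile, plasma)
        for profile in patients
        for plasma in patients
        if profile != plasma
    ]


# Hypothetical identifiers; only the counts are taken from the legend.
profiles = [f"CRC-profile-{i:02d}" for i in range(1, 16)]   # 15 tumor profiles
controls_25 = [f"CTRL-{i:02d}" for i in range(1, 26)]       # 25 control samples
controls_40 = [f"CTRL-{i:02d}" for i in range(1, 41)]       # 40 control samples
controls_45 = [f"CTRL-{i:02d}" for i in range(1, 46)]       # 45 control samples

print(len(control_comparisons(profiles, controls_25)))   # 15 x 25
print(len(control_comparisons(profiles, controls_40)))   # 15 x 40
print(len(control_comparisons(profiles, controls_45)))   # 15 x 45
print(len(cross_patient_comparisons(profiles)))          # 15 x 14
```

Each unmatched pair contributes one negative-label Z score to the ROC, while the 15 matched profile and plasma pairs supply the positive labels.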

Extended Data Fig. 6 Comparison of MRD-EDGE and MRDetect in preoperative, pretreatment NSCLC.

a) (left) ROC analysis of NSCLC plasma samples for MRD-EDGE (blue) and MRDetect (gray). NSCLC plasma samples with matched tumor profiles (n = 22 samples) are compared with control plasma samples assessed against all unmatched NSCLC tumor mutation profiles (n = 22 tumor profiles assessed across 20 control samples from NYGC controls cohort, n = 440 control-comparisons). (middle) ROC analysis of NSCLC plasma samples for MRD-EDGESNV (blue) and MRDetectSNV (gray). NSCLC plasma samples with matched tumor profiles (n = 22, Supplementary Table 5) are compared with control plasma samples assessed against all unmatched NSCLC tumor mutation profiles (n = 22 tumor profiles assessed across 40 control samples from NYGC controls cohort, n = 660 control-comparisons). Five patients used in MRD-EDGESNV NSCLC model training were excluded from downstream analysis. (right) ROC analysis of NSCLC plasma samples for MRD-EDGECNV (blue) and MRDetectCNV (gray). NSCLC plasma samples with matched tumor profiles (n = 22) are compared against control plasma samples assessed against all unmatched NSCLC tumor mutation profiles (n = 22 tumor profiles assessed across 20 control samples from NYGC controls cohort, n = 440 control-comparisons). Fifteen patients used in the read-depth PON samples were withheld from downstream analysis. b) Cross-patient ROC analysis on pretreatment NSCLC tumor profiles for MRD-EDGESNV demonstrates similar performance to control (non-cancer) plasma. Preoperative plasma samples with matched tumor profiles (n = 22) are compared with NSCLC plasma samples assessed against all unmatched NSCLC tumor profiles (n = 22 tumor profiles assessed across 21 cross-patient samples, n = 462 cross-comparisons). c) ROC analysis performed on CNV-based Z score values for read depth (left), BAF (middle), and fragment length entropy (right) CNV classifiers in preoperative, pretreatment NSCLC.
Preoperative plasma samples with matched tumor profiles (n = 22) are compared with control plasma samples assessed against all unmatched tumor profiles (n = 440 comparisons for read depth, 22 tumor profiles assessed across 20 control samples; n = 770 comparisons for BAF and fragment length entropy, 22 tumor profiles assessed across 35 control samples). Twenty control samples included in the read-depth panel of normal samples were withheld from read-depth analysis.

Extended Data Fig. 7 MRD-EDGE detection of ctDNA from colorectal pT1 carcinomas and adenomas.

a) Cross-patient ROC analysis for MRD-EDGESNV in screen-detected pT1 lesions (left) and adenomas (right). Preoperative plasma samples with matched tumor mutation profiles are compared with a cross-patient panel of plasma samples assessed against all unmatched cross-patient tumor profiles (n = 44, including 29 pT1 and adenoma cross patients and 15 stage III preoperative patients). b) Tumor resection volume for adenoma samples in which ctDNA was detected (orange, n = 7) and not detected (blue, n = 13). Box plots represent median, lower and upper quartiles; whiskers correspond to 1.5 × interquartile range.

Extended Data Fig. 8 Use of MRD-EDGESNV in acral melanoma and monitoring response to immunotherapy with MRD-EDGESNV.

a) ctDNA detection rates for pretreatment cutaneous melanoma samples from the adaptive dosing cohort (n = 26, orange, detection rate was capped at 0.0005) compared to acral melanoma samples (n = 3, blue, pre- and post-treatment time points from one patient with acral melanoma) sequenced within the same batch and flow cell, and to detection rates in healthy control plasma (n = 30, gray). ctDNA is not detected from acral melanoma plasma, demonstrating absence of batch effect and the specificity of MRD-EDGESNV for the UV signatures associated specifically with cutaneous melanoma. b) Forest plot demonstrating relationship between ctDNA TF trend (increase or decrease) and progression-free survival (PFS) and overall survival (OS) at serial post-treatment time points. MRD-EDGESNV TF estimates are measured as a detection rate normalized to the pretreatment sample (normalized detection rate, nDR). Each post-treatment time point is prognostic of PFS outcomes. HR, hazard ratio. c) (left) Kaplan–Meier overall survival analysis for Week 6 RECIST response (n = 10 partial response, ‘PR’, n = 8 stable disease, ‘SD’, n = 6 progressive disease, ‘PD’) in the adaptive dosing melanoma cohort (n = 26 patients) where CT imaging was available at Week 6 shows no significant relationship with OS (multivariate log-rank test). (right) Kaplan–Meier OS analysis for Week 6 ctDNA trend in adaptive dosing melanoma patients with decreased (n = 17) or increased (n = 5) nDR compared to pretreatment time point as measured by MRD-EDGESNV. Patients with undetectable pretreatment ctDNA (n = 2) were excluded from the analysis, as were 2 patients where Week 6 plasma was not available for analysis. Increased nDR at Week 6 was associated with shorter overall survival (two-sided log-rank test).
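The normalized detection rate (nDR) and the resulting trend groups can be sketched as follows. The patient labels and detection rates are hypothetical. Treating a pretreatment detection rate of zero as "undetectable" is a simplifying assumption here, since the legend does not state how detectability was decided for this exclusion.

```python
def normalized_detection_rate(rate, pretreatment_rate):
    """Detection rate at a time point divided by the pretreatment detection rate.

    Returns None when the pretreatment rate is zero (assumed stand-in for
    undetectable pretreatment ctDNA), so the patient can be excluded.
    """
    if pretreatment_rate <= 0:
        return None
    return rate / pretreatment_rate


def ctdna_trend(ndr):
    """Group a patient by nDR relative to the pretreatment value of 1."""
    if ndr is None:
        return "excluded"
    if ndr > 1.0:
        return "increased"
    if ndr < 1.0:
        return "decreased"
    return "unchanged"


# Hypothetical (pretreatment, Week 6) detection rates per patient.
patients = {
    "patient_A": (4e-5, 1e-5),
    "patient_B": (2e-5, 5e-5),
    "patient_C": (0.0, 0.0),
}

trends = {}
for name, (pre, week6) in patients.items():
    ndr = normalized_detection_rate(week6, pre)
    trends[name] = ctdna_trend(ndr)
    print(name, ndr, trends[name])
```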

Extended Data Fig. 9 Use of MRD-EDGESNV to monitor response to ICI in small cell lung cancer.

a) In silico studies of cfDNA from the SCLC sample SC-128 (pretreatment TF = 22.9%) mixed in n = 20 replicates against cfDNA from a healthy plasma sample (TF = 0) at mix fractions 10⁻⁵–10⁻² at 25X coverage depth. MRD-EDGESNV enables sensitive detection of TF as low as TF = 5 × 10⁻⁴ (AUC 0.72), measured by Z score of SNV fragment detection rate against unmixed control plasma (TF = 0, n = 20 randomly chosen replicates), without matched tumor tissue to guide SNV identification. Box plots represent median, lower and upper quartiles; whiskers correspond to 1.5 × interquartile range. An AUC heatmap measures detection vs. TF = 0 at different mixed TFs. b) ROC analysis on detection rates for MRD-EDGESNV (blue) and TF estimation with ichorCNA (gray) in pretreatment SCLC plasma samples (Supplementary Table 7). Fragment detection rates in SCLC plasma samples (n = 16 plasma samples) were compared with fragment detection rates in control plasma samples (n = 30). c) Kaplan–Meier progression-free survival analysis for Week 3 ctDNA trend in SCLC patients with decreased (n = 7) or increased (n = 3) normalized detection rate (nDR) as measured by MRD-EDGESNV. Increased nDR at Week 3 was associated with shorter progression-free survival (two-sided log-rank test).
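For readers unfamiliar with the Kaplan–Meier estimate used in panel c, a minimal sketch is given below. The follow-up times and event flags are hypothetical and do not come from the study, and the log-rank test is not implemented here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up duration for each patient
    events: True if progression was observed, False if the patient was censored
    Returns a list of (time, survival probability) at each time with an event.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time.
        while i < len(order) and order[i][0] == t:
            n_events += 1 if order[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in weeks for one ctDNA trend group.
times = [3, 5, 5, 8, 12]
events = [True, True, False, True, False]

for t, s in kaplan_meier(times, events):
    print(f"week {t}: estimated progression-free fraction {s:.2f}")
```

In practice one curve would be estimated per group (decreased vs. increased nDR) and the groups compared with a log-rank test, as stated in the legend.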

Supplementary information

Supplementary Information

Supplementary Note and Figs. 1–19.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–16.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Widman, A.J., Shah, M., Frydendahl, A. et al. Ultrasensitive plasma-based monitoring of tumor burden using machine-learning-guided signal enrichment. Nat Med 30, 1655–1666 (2024). https://doi.org/10.1038/s41591-024-03040-4

  • DOI: https://doi.org/10.1038/s41591-024-03040-4
