CellPhy: accurate and fast probabilistic inference of single-cell phylogenies from scDNA-seq data
Preprint 2020 en
Authors
AK
Alexey M. Kozlov
JA
João M. Alves
AS
Alexandros Stamatakis
Abstract
1 min read
Abstract We introduce a maximum likelihood framework called CellPhy for inferring phylogenetic trees from single-cell DNA sequencing (scDNA-seq) data. CellPhy leverages a finite-site Markov genotype substitution model with 16 diploid states, akin to those typically used in statistical phylogenetics. It includes a dedicated error function for single cells that incorporates amplification/sequencing error and allelic dropout (ADO). Moreover, it can explicitly consider the uncertainty of the variant calling process by using genotype likelihoods as input. We implemented CellPhy in a widely used open-source phylogenetic inference package (RAxML-NG) that provides statistical confidence measurements on the estimated tree and scales particularly well on large scDNA-seq datasets with hundreds or thousands of cells. To benchmark CellPhy, we carried out 19,400 coalescent simulations of cell samples from exponentially-growing tumors for which the true phylogeny was known. We evolved single-cell diploid DNA genotypes along the simulated genealogies under different scenarios, including infinite- and finite-sites nucleotide mutation models, trinucleotide mutational signatures, sequencing, and amplification errors, allele dropouts, and cell doublets. Our simulations suggest that CellPhy is robust to amplification/sequencing errors and ADO and outperforms state-of-the-art methods under realistic scDNA-seq scenarios both in terms of accuracy and speed. Also, we sequenced 24 single-cell whole-genomes from a colorectal tumor. Together with three published scDNA-seq data sets, we analyzed these empirical data to illustrate how CellPhy can provide more reliable biological insights than most competing methods. CellPhy is freely available at https://github.com/amkozlov/cellphy .
Senbai Kang, Nico Borgsmüller, Monica Valecha, Jack Kuipers, João M. Alves, Sonia Prado‐Lòpez, Débora Chantada, Niko Beerenwinkel, David Posada, Ewa Szczurek
Discussion(0)
No comments yet. Be the first to comment.