Publications

Much Ado About Nothing: Accelerating Maximum Likelihood Phylogenetic Inference via Early Stopping to evade (Over-)optimization

Computational Molecular Evolution

Abstract

Maximum Likelihood (ML) based phylogenetic inference constitutes a challenging optimization problem. Given a set of aligned input sequences, phylogenetic inference tools strive to determine the tree …

Authors: Anastasis Togkousidis, Alexandros Stamatakis, Olivier Gascuel

Date Published: 8th Jul 2024

Publication Type: Journal

DOI: 10.1101/2024.07.04.602058

Citation: biorxiv;2024.07.04.602058v1,[Preprint]

Created: 16th Oct 2024 at 13:28, Last updated: 16th Oct 2024 at 13:28

Complexity of avian evolution revealed by family-level genomes

Computational Molecular Evolution

Abstract

Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species …

Authors: Josefin Stiller, Shaohong Feng, Al-Aabid Chowdhury, Iker Rivas-González, David A. Duchêne, Qi Fang, Yuan Deng, Alexey Kozlov, Alexandros Stamatakis, Santiago Claramunt, Jacqueline M. T. Nguyen, Simon Y. W. Ho, Brant C. Faircloth, Julia Haag, Peter Houde, Joel Cracraft, Metin Balaban, Uyen Mai, Guangji Chen, Rongsheng Gao, Chengran Zhou, Yulong Xie, Zijian Huang, Zhen Cao, Zhi Yan, Huw A. Ogilvie, Luay Nakhleh, Bent Lindow, Benoit Morel, Jon Fjeldså, Peter A. Hosner, Rute R. da Fonseca, Bent Petersen, Joseph A. Tobias, Tamás Székely, Jonathan David Kennedy, Andrew Hart Reeve, Andras Liker, Martin Stervander, Agostinho Antunes, Dieter Thomas Tietze, Mads Bertelsen, Fumin Lei, Carsten Rahbek, Gary R. Graves, Mikkel H. Schierup, Tandy Warnow, Edward L. Braun, M. Thomas P. Gilbert, Erich D. Jarvis, Siavash Mirarab, Guojie Zhang

Date Published: 1st Apr 2024

Publication Type: Journal

DOI: 10.1038/s41586-024-07323-1

Citation: Nature

Created: 23rd Apr 2024 at 11:12, Last updated: 23rd Apr 2024 at 11:13

Pandora: A Tool to Estimate Dimensionality Reduction Stability of Genotype Data

Computational Molecular Evolution

Abstract

Motivation: Genotype datasets typically contain a large number of single nucleotide polymorphisms for a comparatively small number of individuals. To identify similarities between individuals and to …

Authors: Julia Haag, Alexander I. Jordan, Alexandros Stamatakis

Date Published: 15th Mar 2024

Publication Type: Journal

DOI: 10.1101/2024.03.14.584962

Citation: biorxiv;2024.03.14.584962v1,[Preprint]

Created: 23rd Apr 2024 at 11:14, Last updated: 23rd Apr 2024 at 11:15

Predicting Phylogenetic Bootstrap Values via Machine Learning

Computational Molecular Evolution

Abstract

Estimating the statistical robustness of the inferred tree(s) constitutes an integral part of most phylogenetic analyses. Commonly, one computes and assigns a branch support value to each inner branch …

Authors: Julius Wiegert, Dimitri Höhler, Julia Haag, Alexandros Stamatakis

Date Published: 6th Mar 2024

Publication Type: Journal

DOI: 10.1101/2024.03.04.583288

Citation: biorxiv;2024.03.04.583288v1,[Preprint]

Created: 23rd Apr 2024 at 11:16, Last updated: 23rd Apr 2024 at 11:17

AleRax: A tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss

Computational Molecular Evolution

Abstract

Motivation: Genomes are a rich source of information on the pattern and process of evolution across biological scales. How best to make use of that information is an active area of research in phylogenetics. Ideally, phylogenetic methods should not only model substitutions along gene trees, which explain differences between homologous gene sequences, but also the processes that generate the gene trees themselves along a shared species tree. To conduct accurate inferences, one needs to account for uncertainty at both levels, that is, in gene trees estimated from inherently short sequences and in their diverse evolutionary histories along a shared species tree. Results: We present AleRax, a software that can infer reconciled gene trees together with a shared species tree using a simple, yet powerful, probabilistic model of gene duplication, transfer, and loss. A key feature of AleRax is its ability to account for uncertainty in the gene tree and its reconciliation by using an efficient approximation to calculate the joint phylogenetic-reconciliation likelihood and sample reconciled gene trees accordingly. Simulations and analyses of empirical data show that AleRax is one order of magnitude faster than competing gene tree inference tools while attaining the same accuracy. It is consistently more robust than species tree inference methods such as SpeciesRax and ASTRAL-Pro 2 under gene tree uncertainty. Finally, AleRax can process multiple gene families in parallel thereby allowing users to compare competing phylogenetic hypotheses and estimate model parameters, such as DTL probabilities for genome-scale datasets with hundreds of taxa. Availability and Implementation: GNU GPL at https://github.com/BenoitMorel/AleRax and data are made available at https://cme.h-its.org/exelixis/material/alerax_data.tar.gz . Contact: Benoit.Morel@h-its.org Supplementary information: Supplementary material is available.

Authors: Benoit Morel, Tom A. Williams, Alexandros Stamatakis, Gergely J. Szöllősi

Date Published: 7th Oct 2023

Publication Type: Journal

DOI: 10.1101/2023.10.06.561091

Citation: biorxiv;2023.10.06.561091v2,[Preprint]

Created: 2nd Jan 2024 at 18:29, Last updated: 5th Mar 2024 at 21:25

Adaptive RAxML-NG: Accelerating Phylogenetic Inference under Maximum Likelihood using Dataset Difficulty

Computational Molecular Evolution

Abstract

Phylogenetic inferences under the maximum likelihood criterion deploy heuristic tree search strategies to explore the vast search space. Depending on the input dataset, searches from different starting trees might all converge to a single tree topology. Often, though, distinct searches infer multiple topologies with large log-likelihood score differences or yield topologically highly distinct, yet almost equally likely, trees. Recently, Haag et al. introduced an approach to quantify, and implemented machine learning methods to predict, the dataset difficulty with respect to phylogenetic inference. Easy multiple sequence alignments (MSAs) exhibit a single likelihood peak on their likelihood surface, associated with a single tree topology to which most, if not all, independent searches rapidly converge. As difficulty increases, multiple locally optimal likelihood peaks emerge, yet from highly distinct topologies. To make use of this information, we introduce and implement an adaptive tree search heuristic in RAxML-NG, which modifies the thoroughness of the tree search strategy as a function of the predicted difficulty. Our adaptive strategy is based upon three observations. First, on easy datasets, searches converge rapidly and can hence be terminated at an earlier stage. Second, overanalyzing difficult datasets is hopeless, and thus it suffices to quickly infer only one of the numerous almost equally likely topologies to reduce overall execution time. Third, more extensive searches are justified and required on datasets with intermediate difficulty. While the likelihood surface exhibits multiple locally optimal peaks in this case, a small proportion of them is significantly better.
Our experimental results for the adaptive heuristic on 9,515 empirical and 5,000 simulated datasets with varying difficulty exhibit substantial speedups, especially on easy and difficult datasets (53% of total MSAs), where we observe average speedups of more than 10×. Further, approximately 94% of the inferred trees using the adaptive strategy are statistically indistinguishable from the trees inferred under the standard strategy (RAxML-NG).

Authors: Anastasis Togkousidis, Oleksiy M Kozlov, Julia Haag, Dimitri Höhler, Alexandros Stamatakis

Date Published: 1st Oct 2023

Publication Type: Journal

DOI: 10.1093/molbev/msad227

Citation: Molecular Biology and Evolution 40(10),msad227

Created: 2nd Jan 2024 at 18:28, Last updated: 5th Mar 2024 at 21:25

Interpreting phylogenetic placements for taxonomic assignment of environmental DNA

Computational Molecular Evolution

Abstract

Taxonomic assignment of operational taxonomic units (OTUs) is an important bioinformatics step in analyzing environmental sequencing data. Pairwise alignment and phylogenetic-placement methods …

Authors: Isabelle Ewers, Lubomír Rajter, Lucas Czech, Frédéric Mahé, Alexandros Stamatakis, Micah Dunthorn

Date Published: 1st Sep 2023

Publication Type: Journal

DOI: 10.1111/jeu.12990

Citation: J Eukaryotic Microbiology 70(5),e12990

Created: 2nd Jan 2024 at 18:26, Last updated: 5th Mar 2024 at 21:25
