Publications


Abstract

In this thesis, Spearfish, a new method for distance-based inference of gene trees, is developed and tested. Spearfish uses the pairwise distances between the gene sequences, together with the distances between the corresponding species in the species tree, in a clustering procedure to reconstruct 10 gene trees. The best one is then selected using a statistical evaluation procedure. Across all tested simulated datasets, the trees inferred by Spearfish have an average distance of 0.213 to the true gene tree. This makes it 2.18 times more accurate than methods such as RAxML-NG, which do not take the species tree into account. Spearfish is 25.85% less accurate but 49.63% faster than GeneRax, one of the leading methods that correct gene trees using their species tree. Spearfish can therefore be used to reconstruct starting trees for GeneRax or, for large datasets, even to replace it.

Authors: Lukas Knirsch, Benoit Morel, Alexandros Stamatakis

Date Published: 2nd Oct 2025

Publication Type: Bachelor's Thesis

Abstract

Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method, and the choice of genomic regions [1–3]. Here, we address these issues by analyzing genomes of 363 bird species [4] (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a remarkable degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous–Paleogene (K–Pg) boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that challenge modeling due to extreme GC content, variable substitution rates, incomplete lineage sorting, or complex evolutionary events such as ancient hybridization. Assessment of the impacts of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates, and relative brain size following the K–Pg extinction event, supporting the hypothesis that emerging ecological opportunities catalyzed the diversification of modern birds. The resulting phylogenetic estimate offers novel insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.

Authors: Josefin Stiller, Shaohong Feng, Al-Aabid Chowdhury, Iker Rivas-González, David A. Duchêne, Qi Fang, Yuan Deng, Alexey Kozlov, Alexandros Stamatakis, Santiago Claramunt, Jacqueline M. T. Nguyen, Simon Y. W. Ho, Brant C. Faircloth, Julia Haag, Peter Houde, Joel Cracraft, Metin Balaban, Uyen Mai, Guangji Chen, Rongsheng Gao, Chengran Zhou, Yulong Xie, Zijian Huang, Zhen Cao, Zhi Yan, Huw A. Ogilvie, Luay Nakhleh, Bent Lindow, Benoit Morel, Jon Fjeldså, Peter A. Hosner, Rute R. da Fonseca, Bent Petersen, Joseph A. Tobias, Tamás Székely, Jonathan David Kennedy, Andrew Hart Reeve, Andras Liker, Martin Stervander, Agostinho Antunes, Dieter Thomas Tietze, Mads Bertelsen, Fumin Lei, Carsten Rahbek, Gary R. Graves, Mikkel H. Schierup, Tandy Warnow, Edward L. Braun, M. Thomas P. Gilbert, Erich D. Jarvis, Siavash Mirarab, Guojie Zhang

Date Published: 1st Apr 2024

Publication Type: Journal

Abstract

Accurately reconstructing the evolutionary history of a group of organisms is a complex task. Current state-of-the-art tools produce phylogenetic tree distributions with Markov chain Monte Carlo (MCMC) methods by sampling the posterior tree distribution under a given model to reflect uncertainties in the underlying models and data. While these distributions offer very good insight into the phylogenetic history, they are very compute-intensive. In this thesis we present and evaluate multiple heuristics to approximate these distributions with distance-based methods. To judge the quality of our heuristics, we compare our distribution against a reference MCMC-based distribution with split and frequency-based metrics. We show that our method works well for some types of data, but not all, compared to other tools, and that further information about the data needs to be incorporated to make this viable in practice. Our most successful method is characterized by the use of pairwise distance distributions to apply likelihood-supported perturbation to the input distances for the Neighbor Joining algorithm. Because this ignores the interdependencies between distances, we need to add parsimony filtering as a post-processing step to eliminate unlikely trees from our distributions, which significantly improves the results. Finally, we also discuss the shortcomings and future potential of our heuristics to more accurately estimate pairwise distances and their interdependencies, which should lead to more competitive results.

Authors: Noah Wahl, Benoit Morel, Alexandros Stamatakis

Date Published: 1st Dec 2023

Publication Type: Master's Thesis

Abstract

Motivation: Genomes are a rich source of information on the pattern and process of evolution across biological scales. How best to make use of that information is an active area of research in phylogenetics. Ideally, phylogenetic methods should not only model substitutions along gene trees, which explain differences between homologous gene sequences, but also the processes that generate the gene trees themselves along a shared species tree. To conduct accurate inferences, one needs to account for uncertainty at both levels, that is, in gene trees estimated from inherently short sequences and in their diverse evolutionary histories along a shared species tree. Results: We present AleRax, a software that can infer reconciled gene trees together with a shared species tree using a simple, yet powerful, probabilistic model of gene duplication, transfer, and loss. A key feature of AleRax is its ability to account for uncertainty in the gene tree and its reconciliation by using an efficient approximation to calculate the joint phylogenetic-reconciliation likelihood and sample reconciled gene trees accordingly. Simulations and analyses of empirical data show that AleRax is one order of magnitude faster than competing gene tree inference tools while attaining the same accuracy. It is consistently more robust than species tree inference methods such as SpeciesRax and ASTRAL-Pro 2 under gene tree uncertainty. Finally, AleRax can process multiple gene families in parallel, thereby allowing users to compare competing phylogenetic hypotheses and estimate model parameters, such as DTL probabilities, for genome-scale datasets with hundreds of taxa. Availability and Implementation: GNU GPL at https://github.com/BenoitMorel/AleRax and data are made available at https://cme.h-its.org/exelixis/material/alerax_data.tar.gz. Contact: Benoit.Morel@h-its.org. Supplementary information: Supplementary material is available.

Authors: Benoit Morel, Tom A. Williams, Alexandros Stamatakis, Gergely J. Szöllősi

Date Published: 7th Oct 2023

Publication Type: Journal

Abstract

Species tree-aware phylogenetic methods model how gene trees are generated along the species tree by a series of evolutionary events, including the duplication, transfer and loss of genes. Over the past ten years these methods have emerged as a powerful tool for inferring and rooting gene and species trees, inferring ancestral gene repertoires, and studying the processes of gene and genome evolution. However, these methods are complex and can be more difficult to use than traditional phylogenetic approaches. Method development is rapid, and it can be difficult to decide between approaches and interpret results. Here, we review ALE and GeneRax, two popular packages for reconciling gene and species trees, explaining how they work, how results can be interpreted, and providing a tutorial for practical analysis. It was recently suggested that reconciliation-based estimates of duplication and transfer frequencies are unreliable. We evaluate this criticism and find that, provided parameters are estimated from the data rather than being fixed based on prior assumptions, reconciliation-based inferences are in good agreement with the literature, recovering variation in gene duplication and transfer frequencies across lineages consistent with the known biology of studied clades. For example, published datasets support the view that transfers greatly outnumber duplications in most prokaryotic lineages. We conclude by discussing some limitations of current models and prospects for future progress. Significance statement: Evolutionary trees provide a framework for understanding the history of life and organising biodiversity. In this review, we discuss some recent progress on statistical methods that allow us to combine information from many different genes within the framework of an overarching phylogenetic species tree. We review the advantages and uses of these methods and discuss case studies where they have been used to resolve deep branches within the tree of life. We conclude with the limitations of current methods and suggest how they might be overcome in the future.

Authors: Tom A. Williams, Adrian A. Davin, Benoit Morel, Lénárd L. Szánthó, Anja Spang, Alexandros Stamatakis, Philip Hugenholtz, Gergely J. Szöllősi

Date Published: 17th Mar 2023

Publication Type: Journal

Abstract

Motivation: Missing data and incomplete lineage sorting (ILS) are two major obstacles to accurate species tree inference. Gene tree summary methods such as ASTRAL and ASTRID have been developed to account for ILS. However, they can be severely affected by high levels of missing data. Results: We present Asteroid, a novel algorithm that infers an unrooted species tree from a set of unrooted gene trees. We show on both empirical and simulated datasets that Asteroid is substantially more accurate than ASTRAL and ASTRID for very high proportions (>80%) of missing data. Asteroid is several orders of magnitude faster than ASTRAL for datasets that contain thousands of genes. It offers advanced features such as parallelization, support value computation and support for multi-copy and multifurcating gene trees. Availability and implementation: Asteroid is freely available at https://github.com/BenoitMorel/Asteroid. Supplementary information: Supplementary data are available at Bioinformatics online.

Authors: Benoit Morel, Tom A Williams, Alexandros Stamatakis

Date Published: 2023

Publication Type: Journal

Abstract

Not specified

Authors: Sarah Lutteropp, Céline Scornavacca, Alexey M. Kozlov, Benoit Morel, Alexandros Stamatakis

Date Published: 31st Aug 2021

Publication Type: Journal
