Use Cases of Predictive Modeling for Phylogenetic Inference and Placements

Abstract:

In this work, we present two distinct applications of predictive modeling within the domain of phylogenetic inference and placement. Phylogenetic placement aims to place new entities into a given phylogenetic tree. While efficient implementations for producing phylogenetic placements exist, the underlying reasons why particular placements are more difficult to perform than others are unknown. In the first use case, we focus on predicting the difficulty of those phylogenetic placements. We developed Bold Assertor of Difficulty (BAD), which predicts the placement difficulty on a scale from 0 (easy) to 1 (hard) with high accuracy. On a set of 3000 metagenomic placements, we obtain a mean absolute error of 0.13. Using SHapley Additive exPlanations (SHAP), BAD can help biologists understand the challenges associated with placing specific sequences into a reference phylogeny during metagenomic studies. Estimating the statistical robustness of the inferred phylogenetic tree constitutes an integral part of most phylogenetic analyses. Commonly, one computes and assigns a branch support value to each inner branch of the inferred phylogeny. The most widely used method for calculating branch support on trees inferred under maximum likelihood is the Standard, non-parametric Felsenstein Bootstrap Support (SBS). The SBS method is computationally costly, which has led to the development of alternative approaches such as Rapid Bootstrap and UltraFast Bootstrap 2 (UFBoot2). The second use case of this work is concerned with the fast machine learning-based approximation of those SBS values. Our SBS predictor, Educated Bootstrap Guesser (EBG), is on average 9.4 (σ = 5.5) times faster than the major competitor UFBoot2 and provides an SBS estimate with a median absolute error of 5 when predicting SBS values between 0 and 10

SEEK ID: https://publications.h-its.org/publications/1912

Filename: masterJulius.pdf 

Format: PDF document

Size: 2.98 MB

Research Groups: Computational Molecular Evolution

Publication type: Master's Thesis

Date Published: 7th Apr 2024

Registered Mode: manually

Authors: Julius Wiegert, Julia Haag, Dimitri Höhler, Alexandros Stamatakis

Created: 9th Jan 2025 at 13:09

Last updated: 9th Jan 2025 at 13:09
