Phylogenetic trees represent hypothetical evolutionary relationships between organisms. Approaches for inferring phylogenetic trees include the Maximum Likelihood (ML) method. This method relies on numerical optimization routines that use internal numerical thresholds. We analyze the influence of these thresholds on the likelihood scores and runtimes of tree inferences for the ML inference tools RAxML-NG, IQ-Tree, and FastTree. We analyze 22 empirical datasets and show that we can speed up the tree inference in RAxML-NG and IQ-Tree by changing the default values of two such numerical thresholds. Using 15 additional simulated datasets, we show that these changes do not affect the accuracy of the inferred phylogenetic trees. For RAxML-NG, increasing the likelihood thresholds lh_epsilon and spr_lh_epsilon to 10 and 10^3, respectively, results in an average speedup of 1.9 ± 0.6. Increasing the likelihood threshold lh_epsilon in IQ-Tree results in an average speedup of 1.3 ± 0.4. In addition to the numerical analysis, we attempt to predict the difficulty of datasets, with the aim of preventing an unnecessarily large number of tree inferences for datasets that are easy to analyze. We present our prediction experiments and discuss why this task proved to be more challenging than anticipated.
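To illustrate why a likelihood threshold affects runtime, the following is a minimal sketch of an iterative optimizer that stops once a step improves the log-likelihood by less than a threshold. It is not code from RAxML-NG, IQ-Tree, or FastTree: the parameter name lh_epsilon is borrowed from the abstract, while the function names, the toy likelihood surface, and the update rule are all assumptions made for this example.

```python
def optimize(log_likelihood, step, x0, lh_epsilon, max_iters=10_000):
    """Repeat `step` until one step gains less than lh_epsilon log-likelihood units.

    Returns (parameter value, log-likelihood, number of steps evaluated).
    """
    x = x0
    current = log_likelihood(x)
    for iteration in range(1, max_iters + 1):
        candidate = step(x)
        score = log_likelihood(candidate)
        if score - current < lh_epsilon:
            # Improvement is below the threshold: treat as converged.
            return x, current, iteration
        x, current = candidate, score
    return x, current, max_iters


def toy_log_likelihood(x):
    # Hypothetical stand-in for a likelihood surface: concave, maximum at x = 3.
    return -1000.0 * (x - 3.0) ** 2


def toy_step(x):
    # Hypothetical update rule: move 10% of the remaining distance to the optimum.
    return x + 0.1 * (3.0 - x)


strict = optimize(toy_log_likelihood, toy_step, 0.0, lh_epsilon=0.1)
loose = optimize(toy_log_likelihood, toy_step, 0.0, lh_epsilon=10.0)

print("strict threshold:", strict)
print("loose threshold: ", loose)
```

In this toy setting the looser threshold stops after fewer steps and returns a slightly lower log-likelihood. Whether that loss matters for the final tree is the empirical question the thesis examines for the real tools.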
SEEK ID: https://publications.h-its.org/publications/1435
Research Groups: Computational Molecular Evolution
Publication type: Master's Thesis
URL: https://cme.h-its.org/exelixis/pubs/masterJulia.pdf
Created: 18th Jan 2022 at 08:53
Last updated: 5th Mar 2024 at 21:24