The Free Lunch is not over yet—systematic exploration of numerical thresholds in maximum likelihood phylogenetic inference

Abstract:
Summary
Maximum likelihood (ML) is a widely used phylogenetic inference method. ML implementations heavily rely on numerical optimization routines that use internal numerical thresholds to determine convergence. We systematically analyze the impact of these threshold settings on the log-likelihood and runtimes for ML tree inferences with RAxML-NG, IQ-TREE, and FastTree on empirical datasets. We provide empirical evidence that we can substantially accelerate tree inferences with RAxML-NG and IQ-TREE by changing the default values of two such numerical thresholds. At the same time, altering these settings does not significantly impact the quality of the inferred trees. We further show that increasing both thresholds accelerates the RAxML-NG bootstrap without influencing the resulting support values. For RAxML-NG, increasing the likelihood thresholds ϵ_LnL and ϵ_brlen to 10 and 10^3, respectively, results in an average tree inference speedup of 1.9 ± 0.6 on Data collection 1, 1.8 ± 1.1 on Data collection 2, and 1.9 ± 0.8 on Data collection 2 for the RAxML-NG bootstrap compared to the runtime under the current default setting. Increasing the likelihood threshold ϵ_LnL to 10 in IQ-TREE results in an average tree inference speedup of 1.3 ± 0.4 on Data collection 1 and 1.3 ± 0.9 on Data collection 2.
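To illustrate what a convergence threshold on the log-likelihood does in general, here is a minimal toy sketch. It is not code from RAxML-NG, IQ-TREE, or FastTree and it does not reproduce the paper's experiments. It assumes a single branch length between two sequences under the Jukes-Cantor (JC69) model, optimized by plain gradient ascent; the function names, the step size, the dataset size, and the threshold values are all hypothetical choices made for the example.

```python
import math


def jc69_lnl(t, n_sites, n_diff):
    """JC69 log-likelihood of branch length t for two aligned sequences
    (constant terms dropped). n_diff is the number of differing sites."""
    e = math.exp(-4.0 * t / 3.0)
    return (n_diff * math.log(0.75 - 0.75 * e)
            + (n_sites - n_diff) * math.log(0.25 + 0.75 * e))


def jc69_gradient(t, n_sites, n_diff):
    """Derivative of jc69_lnl with respect to t."""
    e = math.exp(-4.0 * t / 3.0)
    return e * (n_diff / (0.75 - 0.75 * e)
                - (n_sites - n_diff) / (0.25 + 0.75 * e))


def optimize_branch(n_sites, n_diff, eps_lnl, t=0.5, step=1e-6,
                    max_iter=100_000):
    """Gradient ascent on t that stops as soon as one iteration improves
    the log-likelihood by less than eps_lnl (the toy convergence threshold).
    Returns (branch length, log-likelihood, iterations used)."""
    lnl = jc69_lnl(t, n_sites, n_diff)
    for iteration in range(1, max_iter + 1):
        # Keep the branch length strictly positive.
        t_new = max(t + step * jc69_gradient(t, n_sites, n_diff), 1e-8)
        lnl_new = jc69_lnl(t_new, n_sites, n_diff)
        if lnl_new - lnl < eps_lnl:
            # Converged by the threshold; keep the last step only if it helped.
            if lnl_new > lnl:
                t, lnl = t_new, lnl_new
            return t, lnl, iteration
        t, lnl = t_new, lnl_new
    return t, lnl, max_iter


# Hypothetical data: 10,000 sites, 1,000 of which differ.
for eps in (10.0, 0.1, 1e-3):
    t_hat, lnl_hat, n_iter = optimize_branch(10_000, 1_000, eps_lnl=eps)
    print(f"eps={eps:g}  t={t_hat:.5f}  lnL={lnl_hat:.3f}  iterations={n_iter}")
```

In this toy setting a larger threshold stops the optimizer earlier at a slightly lower log-likelihood, which is the kind of runtime versus accuracy trade-off the summary describes. How large the effect is in real tree inference is an empirical question that only the reported measurements can answer.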
        
        
Availability and implementation
All MSAs we used for our analyses, as well as all results, are available for download at https://cme.h-its.org/exelixis/material/freeLunch_data.tar.gz. Our data generation scripts are available at https://github.com/tschuelia/ml-numerical-analysis.

SEEK ID: https://publications.h-its.org/publications/1749

DOI: 10.1093/bioadv/vbad124

Research Groups: Computational Molecular Evolution

Publication type: Journal

Journal: Bioinformatics Advances

Editors: Aida Ouangraoua

Citation: Bioinformatics Advances 3(1), vbad124

Date Published: 2023

Citation
Haag, J., Hübner, L., Kozlov, A. M., & Stamatakis, A. (2023). The Free Lunch is not over yet—systematic exploration of numerical thresholds in maximum likelihood phylogenetic inference. In A. Ouangraoua (Ed.), Bioinformatics Advances (Vol. 3, Issue 1). Oxford University Press (OUP). https://doi.org/10.1093/bioadv/vbad124