Publications

Abstract

Phylogenetic inferences under the maximum likelihood criterion deploy heuristic tree search strategies to explore the vast search space. Depending on the input dataset, searches from different starting trees might all converge to a single tree topology. Often, though, distinct searches infer multiple topologies with large log-likelihood score differences or yield topologically highly distinct, yet almost equally likely, trees. Recently, Haag et al. introduced an approach to quantify, and implemented machine learning methods to predict, the dataset difficulty with respect to phylogenetic inference. Easy multiple sequence alignments (MSAs) exhibit a single likelihood peak on their likelihood surface, associated with a single tree topology to which most, if not all, independent searches rapidly converge. As difficulty increases, multiple locally optimal likelihood peaks emerge, yet from highly distinct topologies. To make use of this information, we introduce and implement an adaptive tree search heuristic in RAxML-NG, which modifies the thoroughness of the tree search strategy as a function of the predicted difficulty. Our adaptive strategy is based upon three observations. First, on easy datasets, searches converge rapidly and can hence be terminated at an earlier stage. Second, overanalyzing difficult datasets is hopeless, and thus it suffices to quickly infer only one of the numerous almost equally likely topologies to reduce overall execution time. Third, more extensive searches are justified and required on datasets with intermediate difficulty. While the likelihood surface exhibits multiple locally optimal peaks in this case, a small proportion of them is significantly better.
Our experimental results for the adaptive heuristic on 9,515 empirical and 5,000 simulated datasets with varying difficulty exhibit substantial speedups, especially on easy and difficult datasets (53% of total MSAs), where we observe average speedups of more than 10×. Further, approximately 94% of the inferred trees using the adaptive strategy are statistically indistinguishable from the trees inferred under the standard strategy (RAxML-NG).

Authors: Anastasis Togkousidis, Oleksiy M Kozlov, Julia Haag, Dimitri Höhler, Alexandros Stamatakis

Date Published: 1st Oct 2023

Publication Type: Journal

Abstract

Not specified

Authors: Johannes Bracher, Lotta Rüter, Fabian Krüger, Sebastian Lerch, Melanie Schienle

Date Published: 19th Sep 2023

Publication Type: Journal

Abstract

Not specified

Authors: Wei Zhao, Federico López, J. Maxwell Riestenberg, Michael Strube, Diaaeldin Taha, Steve Trettel

Date Published: 18th Sep 2023

Publication Type: InProceedings

Abstract

Not specified

Authors: K. Ertini, G. Folatelli, L. Martinez, M. C. Bersten, J. P. Anderson, C. Ashall, E. Baron, S. Bose, P. J. Brown, C. Burns, J. M. DerKacy, L. Ferrari, L. Galbany, E. Hsiao, S. Kumar, J. Lu, P. Mazzali, N. Morrell, M. Orellana, P. J. Pessi, M. M. Phillips, A. L. Piro, A. Polin, M. Shahbandeh, B. J. Shappee, M. Stritzinger, N. B. Suntzeff, M. Tucker, N. Elias-Rosa, H. Kuncarayakti, C. P. Gutiérrez, A. Kozyreva, T. E. Müller-Bravo, T. -W. Chen, J. T. Hinkle, A. V. Payne, P. Székely, T. Szalai, B. Barna, R. Könyves-Tóth, D. Bánhidi, I. B. Bı́ró, I. Csányi, L. Kriskovits, A. Pál, Zs Szabó, R. Szakáts, K. Vida, J. Vinkó, M. Gromadzki, L. Harvey, M. Nicholl, E. Paraskeva, D. R. Young, B. Englert

Date Published: 8th Sep 2023

Publication Type: Journal

Abstract

Health data collected in clinical trials and epidemiological as well as public health studies cannot be freely published, but are valuable datasets whose subsequent use is of high importance for health research. The National Research Data Infrastructure for Personal Health Data (NFDI4Health) aims to promote the publication of such health data without compromising privacy. Based on existing international standards, NFDI4Health has established a generic information model for the description and preservation of high-level metadata describing health-related studies, covering both clinical and epidemiological studies. As an infrastructure for publishing such preservation metadata as well as more detailed representation information of study data (e.g. questionnaires and data dictionaries), NFDI4Health has developed the German Central Health Study Hub. Content is either harvested from existing distributed sources or entered directly via a user interface. This metadata makes health studies more discoverable, and researchers can use the published metadata to evaluate the content of data collections, learn about access conditions and how and where to request data access. The goal of NFDI4Health is to establish interoperable and internationally accepted standards and processes for the publication of health data sets to make health data FAIR.

Authors: Juliane Fluck, Martin Golebiewski, Johannes Darms

Date Published: 7th Sep 2023

Publication Type: Proceedings

Abstract

To support federated data structuring and sharing for sensitive health data from clinical trial, epidemiological and public health studies in the context of the German National Research Data Infrastructure for Personal Health Data (NFDI4Health), we have developed Local Data Hubs (LDHs) based on the FAIRDOM-SEEK platform. Those LDHs connect to the German Central Health Study Hub (CSH) to make the health data searchable and findable. This decentralised approach supports researchers to make health studies with their data FAIR (Findable, Accessible, Interoperable and Reusable), and at the same time fully preserves data protection for sensitive data.

Authors: Frank Meineke, Martin Golebiewski, Xiaoming Hu, Toralf Kirsten, Matthias Löbe, Sebastian Klammt, Ulrich Sax, Wolfgang Müller

Date Published: 7th Sep 2023

Publication Type: Proceedings

Abstract

The exchange, dissemination, and reuse of biological specimens and data have become essential for life sciences research. This requires standards that enable cross-organizational documentation, traceability, and tracking of data and its corresponding metadata. Thus, data provenance, or the lineage of data, is an important aspect of data management in any information system integrating data from different sources [1]. It provides crucial information about the origin, transformation, and accountability of data, which is essential for ensuring trustworthiness, transparency, and quality of healthcare data [2]. For biological material and derived data, a novel ISO standard was recently introduced that specifies a general concept for a provenance information model for biological material and data and requirements for provenance data interoperability and serialization [3,4]. However, a specific standard for health data provenance is currently missing. In recent years, there has been a growing need for developing a minimal core data set for representing provenance information in health information systems. This paper presents a Provenance Core Data Set (PCDS), a generalized data model that aims to provide a set of attributes for describing data provenance in health information systems and beyond.

Authors: Ulrich Sax, Christian Henke, Christian Dräger, Theresa Bender, Alessandra Kuntz, Martin Golebiewski, Hannes Ulrich, Matthias Löbe

Date Published: 7th Sep 2023

Publication Type: Journal
