Publications


Abstract

Not specified

Authors: Cristina Cornelio, Jan Stuehmer, Shell Xu Hu, Timothy Hospedales

Date Published: 2023

Publication Type: InProceedings

Abstract

Not specified

Authors: Julian Schröter, Tal Dattner, Jennifer Hüllein, Alejandra Jayme, Vincent Heuveline, Georg F. Hoffmann, Stefan Kölker, Dominic Lenz, Thomas Opladen, Bernt Popp, Christian P. Schaaf, Christian Staufner, Steffen Syrbe, Sebastian Uhrig, Daniel Hübschmann, Heiko Brennenstuhl

Date Published: 2023

Publication Type: Journal

Abstract

The BioCreative National Library of Medicine (NLM)-Chem track calls for a community effort to fine-tune automated recognition of chemical names in the biomedical literature. Chemicals are one of the most searched biomedical entities in PubMed, and, as highlighted during the coronavirus disease 2019 pandemic, their identification may significantly advance research in multiple biomedical subfields. While previous community challenges focused on identifying chemical names mentioned in titles and abstracts, the full text contains valuable additional detail. We, therefore, organized the BioCreative NLM-Chem track as a community effort to address automated chemical entity recognition in full-text articles. The track consisted of two tasks: (i) chemical identification and (ii) chemical indexing. The chemical identification task required predicting all chemicals mentioned in recently published full-text articles, both span [i.e. named entity recognition (NER)] and normalization (i.e. entity linking), using Medical Subject Headings (MeSH). The chemical indexing task required identifying which chemicals reflect topics for each article and should therefore appear in the listing of MeSH terms for the document in the MEDLINE article indexing. This manuscript summarizes the BioCreative NLM-Chem track and post-challenge experiments. We received a total of 85 submissions from 17 teams worldwide. The highest performance achieved for the chemical identification task was 0.8672 F-score (0.8759 precision and 0.8587 recall) for strict NER performance and 0.8136 F-score (0.8621 precision and 0.7702 recall) for strict normalization performance. The highest performance achieved for the chemical indexing task was 0.6073 F-score (0.7417 precision and 0.5141 recall).
This community challenge demonstrated that (i) the current substantial achievements in deep learning technologies can be utilized to improve automated prediction accuracy further and (ii) the chemical indexing task is substantially more challenging. We look forward to further developing biomedical text-mining methods to respond to the rapid growth of biomedical literature. The NLM-Chem track dataset and other challenge materials are publicly available at https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/. Database URL: https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/

Authors: Robert Leaman, Rezarta Islamaj, Virginia Adams, Mohammed A Alliheedi, João Rafael Almeida, Rui Antunes, Robert Bevan, Yung-Chun Chang, Arslan Erdengasileng, Matthew Hodgskiss, Ryuki Ida, Hyunjae Kim, Keqiao Li, Robert E Mercer, Lukrécia Mertová, Ghadeer Mobasher, Hoo-Chang Shin, Mujeen Sung, Tomoki Tsujimura, Wen-Chao Yeh, Zhiyong Lu

Date Published: 2023

Publication Type: Journal

Abstract

Not specified

Authors: Ruchika Chavhan, Henry Gouk, Jan Stuehmer, Calum Heggan, Mehrdad Yaghoobi, Timothy Hospedales

Date Published: 2023

Publication Type: Journal

Abstract

Fine-tuning biomedical pre-trained language models (BioPLMs) such as BioBERT has become a common practice dominating leaderboards across various natural language processing tasks. Despite their success and wide adoption, prevailing fine-tuning approaches for named entity recognition (NER) naively train BioPLMs on targeted datasets without considering class distributions. This is problematic especially when dealing with imbalanced biomedical gold-standard datasets for NER in which most biomedical entities are underrepresented. In this paper, we address the class imbalance problem and propose WeLT, a cost-sensitive fine-tuning approach based on new re-scaled class weights for the task of biomedical NER. We evaluate WeLT’s fine-tuning performance on mixed-domain and domain-specific BioPLMs using eight biomedical gold-standard datasets. We compare our approach against vanilla fine-tuning and three other existing re-weighting schemes. Our results show the positive impact of handling the class imbalance problem. WeLT outperforms all the vanilla fine-tuned models. Furthermore, our method demonstrates advantages over other existing weighting schemes in most experiments.

Authors: Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, Michael Gertz

Date Published: 2023

Publication Type: Proceedings

Abstract

Not specified

Authors: Dilek Koptekin, Eren Yüncü, Ricardo Rodríguez-Varela, N. Ezgi Altınışık, Nikolaos Psonis, Natalia Kashuba, Sevgi Yorulmaz, Robert George, Duygu Deniz Kazancı, Damla Kaptan, Kanat Gürün, Kıvılcım Başak Vural, Hasan Can Gemici, Despoina Vassou, Evangelia Daskalaki, Cansu Karamurat, Vendela K. Lagerholm, Ömür Dilek Erdal, Emrah Kırdök, Aurelio Marangoni, Andreas Schachner, Handan Üstündağ, Ramaz Shengelia, Liana Bitadze, Mikheil Elashvili, Eleni Stravopodi, Mihriban Özbaşaran, Güneş Duru, Argyro Nafplioti, C. Brian Rose, Tuğba Gencer, Gareth Darbyshire, Alexander Gavashelishvili, Konstantine Pitskhelauri, Özlem Çevik, Osman Vuruşkan, Nina Kyparissi-Apostolika, Ali Metin Büyükkarakaya, Umay Oğuzhanoğlu, Sevinç Günel, Eugenia Tabakaki, Akper Aliev, Anar Ibrahimov, Vaqif Shadlinski, Adamantios Sampson, Gülşah Merve Kılınç, Çiğdem Atakuman, Alexandros Stamatakis, Nikos Poulakakis, Yılmaz Selim Erdal, Pavlos Pavlidis, Jan Storå, Füsun Özer, Anders Götherström, Mehmet Somel

Date Published: 2023

Publication Type: Journal

Abstract

Motivation: Missing data and incomplete lineage sorting (ILS) are two major obstacles to accurate species tree inference. Gene tree summary methods such as ASTRAL and ASTRID have been developed to account for ILS. However, they can be severely affected by high levels of missing data. Results: We present Asteroid, a novel algorithm that infers an unrooted species tree from a set of unrooted gene trees. We show on both empirical and simulated datasets that Asteroid is substantially more accurate than ASTRAL and ASTRID for very high proportions (>80%) of missing data. Asteroid is several orders of magnitude faster than ASTRAL for datasets that contain thousands of genes. It offers advanced features such as parallelization, support value computation and support for multi-copy and multifurcating gene trees. Availability and implementation: Asteroid is freely available at https://github.com/BenoitMorel/Asteroid. Supplementary information: Supplementary data are available at Bioinformatics online.

Authors: Benoit Morel, Tom A Williams, Alexandros Stamatakis

Date Published: 2023

Publication Type: Journal

Copyright © 2008 - 2023 The University of Manchester and HITS gGmbH