Publications


Abstract

Fine-tuning biomedical pre-trained language models (BioPLMs) such as BioBERT has become common practice and dominates leaderboards across various natural language processing tasks. Despite their success and wide adoption, prevailing fine-tuning approaches for named entity recognition (NER) naively train BioPLMs on targeted datasets without considering class distributions. This is especially problematic when dealing with imbalanced biomedical gold-standard datasets for NER, in which most biomedical entities are underrepresented. In this paper, we address the class imbalance problem and propose WeLT, a cost-sensitive fine-tuning approach based on new re-scaled class weights for the task of biomedical NER. We evaluate WeLT’s fine-tuning performance on mixed-domain and domain-specific BioPLMs using eight biomedical gold-standard datasets. We compare our approach against vanilla fine-tuning and three other existing re-weighting schemes. Our results show the positive impact of handling the class imbalance problem. WeLT outperforms all the vanilla fine-tuned models. Furthermore, our method demonstrates advantages over other existing weighting schemes in most experiments.
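
The general idea of cost-sensitive training with class weights can be illustrated with a minimal sketch. This is not the WeLT re-scaling, whose formula is not given in the abstract; it assumes plain inverse-frequency weights, and the function names and toy labels are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Assumption: plain inverse-frequency weighting, normalised so that the
    weights sum to the number of classes. This is NOT the WeLT re-scaling.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    raw = {cls: total / n for cls, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {cls: w * scale for cls, w in raw.items()}


def weighted_cross_entropy(probs, gold, weights):
    """Weighted mean negative log-likelihood over tokens.

    probs: one dict per token, mapping class -> predicted probability
    gold:  one gold label per token
    """
    weighted_losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, gold)]
    return sum(weighted_losses) / sum(weights[y] for y in gold)


# Toy, made-up tag sequence: the "O" tag dominates, entity tags are rare.
labels = ["O"] * 8 + ["B-Chemical", "I-Chemical"]
weights = inverse_frequency_weights(labels)
```

With these weights, an error on a rare entity tag contributes more to the loss than an error on the frequent "O" tag, which is the effect a cost-sensitive scheme aims for.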

Authors: Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, Michael Gertz

Date Published: 2023

Publication Type: Proceedings

Abstract

Chemical named entity recognition (NER) is a significant step for many downstream applications, such as entity linking in chemical text-mining pipelines. However, identifying chemical entities in biomedical text is a challenging task due to the diverse morphology of chemical entities and the different types of chemical nomenclature. In this work, we describe the approach we submitted to the BioCreative VII challenge, Track 2, focusing on the ‘Chemical Identification’ task of identifying chemical entities and linking them to MeSH. For this purpose, we applied a two-stage approach: (a) use of a fine-tuned BioBERT model to identify chemical entities; (b) semantic approximate search in the MeSH and PubChem databases for entity linking. There was some friction between the two stages, as our rule-based approach did not harmonise optimally with the partially recognized words forwarded by the BERT component. In future work, we aim to resolve the artefacts arising from BERT tokenizers, develop joint learning of chemical named entity recognition and entity linking using pre-trained transformer-based models, and compare its performance with our preliminary approach. Next, we will improve the efficiency of our approximate search in reference databases during entity linking. This task is non-trivial, as it entails determining similarity scores of large sets of trees with respect to a query tree. Ideally, this will enable flexible parametrization and rule selection for the entity linking search.

Authors: Ghadeer Mobasher, Lukrécia Mertová, Sucheta Ghosh, Olga Krebs, Bettina Heinlein, Wolfgang Müller

Date Published: 11th Nov 2021

Publication Type: Proceedings

Abstract

This paper presents a report on outcomes of the 10th Computational Modeling in Biology Network (COMBINE) meeting that was held in Heidelberg, Germany, in July of 2019. The annual event brings together researchers, biocurators and software engineers to present recent results and discuss future work in the area of standards for systems and synthetic biology. The COMBINE initiative coordinates the development of various community standards and formats for computational models in the life sciences. Over the past 10 years, COMBINE has brought together standard communities that have further developed and harmonized their standards for better interoperability of models and data. COMBINE 2019 was co-located with a stakeholder workshop of the European EU-STANDS4PM initiative that aims at harmonized data and model standardization for in silico models in the field of personalized medicine, as well as with the FAIRDOM PALs meeting to discuss findable, accessible, interoperable and reusable (FAIR) data sharing. This report briefly describes the work discussed in invited and contributed talks as well as during breakout sessions. It also highlights recent advancements in data, model, and annotation standardization efforts. Finally, this report concludes with some challenges and opportunities that this community will face during the next 10 years.

Authors: Dagmar Waltemath, Martin Golebiewski, Michael L Blinov, Padraig Gleeson, Henning Hermjakob, Michael Hucka, Esther Thea Inau, Sarah M Keating, Matthias König, Olga Krebs, Rahuman S Malik-Sheriff, David Nickerson, Ernst Oberortner, Herbert M Sauro, Falk Schreiber, Lucian Smith, Melanie I Stefan, Ulrike Wittig, Chris J Myers

Date Published: 24th Aug 2020

Publication Type: Journal

Abstract

Not specified

Authors: Katherine Wolstencroft, Olga Krebs, Jacky L. Snoep, Natalie J. Stanford, Finn Bacall, Martin Golebiewski, Rostyk Kuzyakiv, Quyen Nguyen, Stuart Owen, Stian Soiland-Reyes, Jakub Straszewski, David D. van Niekerk, Alan R. Williams, Lars Malmström, Bernd Rinn, Wolfgang Müller, Carole Goble

Date Published: 3rd Jan 2017

Publication Type: Journal

Abstract

Not specified

Authors: Maksim Zakhartsev, Irina Medvedeva, Yury Orlov, Ilya Akberdin, Olga Krebs, Waltraud X. Schulze

Date Published: 1st Dec 2016

Publication Type: Journal

Abstract

Not specified

Authors: O. Krebs, K. Wolstencroft, NJ. Stanford, N. Morrison, M. Golebiewski, S. Owen, Q. Nguyen, JL. Snoep, W. Müller, C. Goble

Date Published: 2016

Publication Type: Journal

Abstract

Systems biology research typically involves the integration and analysis of heterogeneous data types in order to model and predict biological processes. Researchers therefore require tools and resources to facilitate the sharing and integration of data, and for linking of data to systems biology models.

Authors: Katherine Wolstencroft, Stuart Owen, Olga Krebs, Quyen Nguyen, Natalie J Stanford, Martin Golebiewski, Andreas Weidemann, Meik Bittkowski, Lihua An, David Shockley, Jacky L. Snoep, Wolfgang Mueller, Carole Goble

Date Published: 1st Dec 2015

Publication Type: Journal
