WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning

Abstract:

Fine-tuning biomedical pre-trained language models (BioPLMs) such as BioBERT has become a common practice dominating leaderboards across various natural language processing tasks. Despite their success and wide adoption, prevailing fine-tuning approaches for named entity recognition (NER) naively train BioPLMs on targeted datasets without considering class distributions. This is especially problematic when dealing with imbalanced biomedical gold-standard datasets for NER, in which most biomedical entities are underrepresented. In this paper, we address the class imbalance problem and propose WeLT, a cost-sensitive fine-tuning approach based on new re-scaled class weights for the task of biomedical NER. We evaluate WeLT’s fine-tuning performance on mixed-domain and domain-specific BioPLMs using eight biomedical gold-standard datasets. We compare our approach against vanilla fine-tuning and three other existing re-weighting schemes. Our results show the positive impact of handling the class imbalance problem. WeLT outperforms all the vanilla fine-tuned models. Furthermore, our method demonstrates advantages over other existing weighting schemes in most experiments.

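The abstract describes cost-sensitive fine-tuning via re-scaled class weights; the exact WeLT weighting scheme is defined in the paper itself. As a rough, hedged illustration of the general idea, the sketch below plugs inverse-frequency ("balanced") class weights into a weighted cross-entropy loss for BIO-tagged token classification. The label set and token counts are purely hypothetical and the weighting formula is a common stand-in, not the paper's WeLT re-scaling.

```python
import torch
import torch.nn as nn

# Hypothetical BIO label set and token counts from an imbalanced biomedical
# NER corpus: the "O" class dominates, entity tags are rare.
labels = ["O", "B-Chemical", "I-Chemical", "B-Disease", "I-Disease"]
counts = torch.tensor([90_000.0, 2_500.0, 1_800.0, 3_200.0, 2_500.0])

# Inverse-frequency ("balanced") weights: total_tokens / (num_classes * class_count).
# This is a generic stand-in for WeLT's re-scaled weights, which the paper defines.
weights = counts.sum() / (len(counts) * counts)

# Weighted cross-entropy penalises mistakes on rare entity classes more heavily
# than mistakes on the majority "O" class; ignore_index skips padded tokens.
loss_fn = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)

# Dummy token-classification logits from a BioPLM head: (batch, seq_len, num_labels).
logits = torch.randn(8, 128, len(labels))
gold = torch.randint(0, len(labels), (8, 128))      # dummy gold tag ids
loss = loss_fn(logits.view(-1, len(labels)), gold.view(-1))
print(weights, loss.item())
```

In an actual fine-tuning run, this weighted loss would replace the unweighted cross-entropy applied to the token-classification head of a BioPLM such as BioBERT, leaving the rest of the training loop unchanged.
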
SEEK ID: https://publications.h-its.org/publications/1684

DOI: 10.18653/v1/2023.bionlp-1.40

Research Groups: Scientific Databases and Visualisation

Publication type: Proceedings

Journal: The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

Publisher: Association for Computational Linguistics

Citation: Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, and Michael Gertz. 2023. WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 427–438, Toronto, Canada. Association for Computational Linguistics.

Date Published: 2023

URL: https://aclanthology.org/2023.bionlp-1.40/

Registered Mode: manually

Authors: Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, Michael Gertz

Activity

Created: 17th Jul 2023 at 21:42

Last updated: 5th Mar 2024 at 21:25
