WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning

Abstract:

Fine-tuning biomedical pre-trained language models (BioPLMs) such as BioBERT has become common practice and dominates leaderboards across various natural language processing tasks. Despite their success and wide adoption, prevailing fine-tuning approaches for named entity recognition (NER) naively train BioPLMs on targeted datasets without considering class distributions. This is especially problematic for imbalanced biomedical gold-standard NER datasets, in which most biomedical entities are underrepresented. In this paper, we address the class imbalance problem and propose WeLT, a cost-sensitive fine-tuning approach based on new re-scaled class weights for biomedical NER. We evaluate WeLT’s fine-tuning performance on mixed-domain and domain-specific BioPLMs using eight biomedical gold-standard datasets. We compare our approach against vanilla fine-tuning and three existing re-weighting schemes. Our results show the positive impact of handling class imbalance: WeLT outperforms all vanilla fine-tuned models and demonstrates advantages over the other weighting schemes in most experiments.
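
Illustrative note: the abstract does not state WeLT's re-scaled weight formula, so the sketch below is not the authors' method. It only shows the general idea behind cost-sensitive learning for NER, using a common inverse-frequency weighting as a stand-in. The function names, toy labels, and probabilities are all hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed stand-in scheme (not WeLT): weight = total / (num_classes * count).

    Rare classes receive larger weights than frequent ones.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {cls: total / (num_classes * n) for cls, n in counts.items()}


def weighted_cross_entropy(probs, gold, weights):
    """Weighted mean of per-token negative log-likelihoods.

    probs: one dict per token, mapping class label -> predicted probability.
    gold: gold class label per token.
    weights: dict mapping class label -> class weight.
    """
    numerator = sum(weights[g] * -math.log(p[g]) for p, g in zip(probs, gold))
    denominator = sum(weights[g] for g in gold)
    return numerator / denominator


# Toy imbalanced sequence: most tokens are outside any entity ("O").
gold = ["O"] * 8 + ["B-Chemical", "I-Chemical"]
weights = inverse_frequency_weights(gold)

# Hypothetical model that is confident on "O" tokens but weak on entity tokens.
probs = [{"O": 0.9}] * 8 + [{"B-Chemical": 0.1}, {"I-Chemical": 0.1}]

uniform = {cls: 1.0 for cls in weights}
plain_loss = weighted_cross_entropy(probs, gold, uniform)
cost_sensitive_loss = weighted_cross_entropy(probs, gold, weights)
print(plain_loss, cost_sensitive_loss)
```

With these toy numbers the cost-sensitive loss is larger than the unweighted one, because errors on the underrepresented entity tokens count for more. That is the pressure a re-weighting scheme puts on the model during fine-tuning.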

SEEK ID: https://publications.h-its.org/publications/1684

DOI: 10.18653/v1/2023.bionlp-1.40

Research Groups: Scientific Databases and Visualisation

Publication type: Proceedings

Journal: The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

Publisher: Association for Computational Linguistics

Citation: Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, and Michael Gertz. 2023. WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 427–438, Toronto, Canada. Association for Computational Linguistics.

Date Published: 2023

URL: https://aclanthology.org/2023.bionlp-1.40/

Registered Mode: manually

Authors: Ghadeer Mobasher, Wolfgang Müller, Olga Krebs, Michael Gertz

Citation
Mobasher, G., Müller, W., Krebs, O., & Gertz, M. (2023). WeLT: Improving Biomedical Fine-tuned Pre-trained Language Models with Cost-sensitive Learning. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks (pp. 427–438). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.bionlp-1.40
Activity

Created: 17th Jul 2023 at 21:42

Last updated: 5th Mar 2024 at 21:25
