Publications

What is a Publication?
4 Publications visible to you, out of a total of 4

Abstract (Expand)

In this paper, we propose an entity-based neural local coherence model which is linguistically more sound than previously proposed neural coherence models. Recent neural coherence models encode the input document using large-scale pretrained language models. Hence their basis for computing local coherence are words and even sub-words. The analysis of their output shows that these models frequently compute coherence on the basis of connections between (sub-)words which, from a linguistic perspective, should not play a role. Still, these models achieve state-of-the-art performance in several end applications. In contrast to these models, we compute coherence on the basis of entities by constraining the input to noun phrases and proper names. This provides us with an explicit representation of the most important items in sentences leading to the notion of focus. This brings our model linguistically in line with pre-neural models of computing coherence. It also gives us better insight into the behaviour of the model thus leading to better explainability. Our approach is also in accord with a recent study (O’Connor and Andreas, 2021), which shows that most usable information is captured by nouns and verbs in transformer-based language models. We evaluate our model on three downstream tasks showing that it is not only linguistically more sound than previous models but also that it outperforms them in end applications.

Authors: Sungho Jeon, Michael Strube

Date Published: 22nd May 2022

Publication Type: InProceedings

Abstract

Not specified

Authors: Sungho Jeon, Michael Strube

Date Published: 10th Nov 2021

Publication Type: InProceedings

Abstract (Expand)

Pretrained language models, neural models pretrained on massive amounts of data, have established the state of the art in a range of NLP tasks. They are based on a modern machine-learning technique, the Transformer which relates all items simultaneously to capture semantic relations in sequences. However, it differs from what humans do. Humans read sentences one-by-one, incrementally. Can neural models benefit by interpreting texts incrementally as humans do? We investigate this question in coherence modeling. We propose a coherence model which interprets sentences incrementally to capture lexical relations between them. We compare the state of the art in each task, simple neural models relying on a pretrained language model, and our model in two downstream tasks. Our findings suggest that interpreting texts incrementally as humans could be useful to design more advanced models.

Authors: Sungho Jeon, Michael Strube

Date Published: 1st Dec 2020

Publication Type: InProceedings

Abstract

Not specified

Authors: Sungho Jeon, Michael Strube

Date Published: 2020

Publication Type: InProceedings

Powered by
(v.1.15.2)
Copyright © 2008 - 2024 The University of Manchester and HITS gGmbH