Publications

Abstract

Not specified

Authors: Chloé Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loáiciga, Michael Strube, Amir Zeldes

Date Published: 13th Jul 2023

Publication Type: Proceedings

Abstract

Motivation: Simulating Multiple Sequence Alignments (MSAs) using probabilistic models of sequence evolution plays an important role in the evaluation of phylogenetic inference tools, and is crucial to the development of novel learning-based approaches for phylogenetic reconstruction, for instance, neural networks. These models and the resulting simulated data need to be as realistic as possible to be indicative of the performance of the developed tools on empirical data and to ensure that neural networks trained on simulations perform well on empirical data. Over the years, numerous models of evolution have been published with the goal to represent as faithfully as possible the sequence evolution process and thus simulate empirical-like data. In this study, we simulated DNA and protein MSAs under increasingly complex models of evolution with and without insertion/deletion (indel) events using a state-of-the-art sequence simulator. We assessed their realism by quantifying how accurately supervised learning methods are able to predict whether a given MSA is simulated or empirical. Results: Our results show that we can distinguish between empirical and simulated MSAs with high accuracy using two distinct and independently developed classification approaches across all tested models of sequence evolution. Our findings suggest that the current state-of-the-art models fail to accurately replicate several aspects of empirical MSAs, including site-wise rates as well as amino acid and nucleotide composition. Data and Code Availability: All simulated and empirical MSAs, as well as all analysis results, are available at https://cme.h-its.org/exelixis/material/simulation_study.tar.gz. All scripts required to reproduce our results are available at https://github.com/tschuelia/SimulationStudy and https://github.com/JohannaTrost/seqsharp. Contact: julia.haag@h-its.org

Authors: Johanna Trost, Julia Haag, Dimitri Höhler, Laurent Jacob, Alexandros Stamatakis, Bastien Boussau

Date Published: 12th Jul 2023

Publication Type: Journal

Abstract

Automating Cross-lingual Science Journalism (CSJ) aims to generate popular science summaries from English scientific texts for non-expert readers in their local language. We introduce CSJ as a downstream task of text simplification and cross-lingual scientific summarization to facilitate science journalists’ work. We analyze the performance of possible existing solutions as baselines for the CSJ task. Based on these findings, we propose to combine three components, SELECT, SIMPLIFY and REWRITE (SSR), to produce cross-lingual simplified science summaries for non-expert readers. Our empirical evaluation on the WIKIPEDIA dataset shows that SSR significantly outperforms the baselines for the CSJ task and can serve as a strong baseline for future work. We also perform an ablation study investigating the impact of individual components of SSR. Further, we analyze the performance of SSR on a high-quality, real-world CSJ dataset with human evaluation and in-depth analysis, demonstrating the superior performance of SSR for CSJ.

Authors: Mehwish Fatima, Michael Strube

Date Published: 8th Jul 2023

Publication Type: InProceedings

Abstract

Coherence is an important aspect of text quality, and various approaches have been applied to coherence modeling. However, existing methods solely focus on a single document’s coherence patterns, ignoring the underlying correlation between documents. We investigate a GCN-based coherence model that is capable of capturing structural similarities between documents. Our model first identifies the graph structure of each document, from where we mine different sub-graph patterns. We then construct a heterogeneous graph for the training corpus, connecting documents based on their shared subgraphs. Finally, a GCN is applied to the heterogeneous graph to model the connectivity relationships. We evaluate our method on two tasks, assessing discourse coherence and automated essay scoring. Results show that our GCN-based model outperforms baselines, achieving a new state-of-the-art on both tasks.

Authors: Wei Liu, Xiyan Fu, Michael Strube

Date Published: 8th Jul 2023

Publication Type: InProceedings

Abstract

Implicit discourse relation classification is a challenging task due to the absence of discourse connectives. To overcome this issue, we design an end-to-end neural model to explicitly generate discourse connectives for the task, inspired by the annotation process of PDTB. Specifically, our model jointly learns to generate discourse connectives between arguments and predict discourse relations based on the arguments and the generated connectives. To prevent our relation classifier from being misled by poor connectives generated at the early stage of training while alleviating the discrepancy between training and inference, we apply Scheduled Sampling to the joint learning. We evaluate our method on three benchmarks, PDTB 2.0, PDTB 3.0, and PCC. Results show that our joint model significantly outperforms various baselines on all three datasets, demonstrating its superiority for the task.

Authors: Wei Liu, Michael Strube

Date Published: 8th Jul 2023

Publication Type: InProceedings

Abstract

Not specified

Authors: Evan L. Ray, Logan C. Brooks, Jacob Bien, Matthew Biggerstaff, Nikos I. Bosse, Johannes Bracher, Estee Y. Cramer, Sebastian Funk, Aaron Gerding, Michael A. Johansson, Aaron Rumack, Yijin Wang, Martha Zorn, Ryan J. Tibshirani, Nicholas G. Reich

Date Published: 1st Jul 2023

Publication Type: Journal

Abstract

Observations of individual massive stars, super-luminous supernovae, gamma-ray bursts, and gravitational wave events involving spectacular black hole mergers indicate that the low-metallicity Universe is fundamentally different from our own Galaxy. Many transient phenomena will remain enigmatic until we achieve a firm understanding of the physics and evolution of massive stars at low metallicity (Z). The Hubble Space Telescope has devoted 500 orbits to observing ∼250 massive stars at low Z in the ultraviolet (UV) with the COS and STIS spectrographs under the ULLYSES programme. The complementary X-Shooting ULLYSES (XShootU) project provides an enhanced legacy value with high-quality optical and near-infrared spectra obtained with the wide-wavelength coverage X-shooter spectrograph at ESO’s Very Large Telescope. We present an overview of the XShootU project, showing that combining ULLYSES UV and XShootU optical spectra is critical for the uniform determination of stellar parameters such as effective temperature, surface gravity, luminosity, and abundances, as well as wind properties such as mass-loss rates as a function of Z. As uncertainties in stellar and wind parameters percolate into many adjacent areas of astrophysics, the data and modelling of the XShootU project are expected to be a game changer for our physical understanding of massive stars at low Z. To be able to confidently interpret James Webb Space Telescope spectra of the first stellar generations, the individual spectra of low-Z stars need to be understood, which is exactly where XShootU can deliver. Table B.1 and full Table B.2 are available at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (ftp://130.79.128.5) or via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/675/A154. Based on observations collected at the European Southern Observatory under ESO programme 106.211Z.001.

Authors: Jorick S. Vink, A. Mehner, P. A. Crowther, A. Fullerton, M. Garcia, F. Martins, N. Morrell, L. M. Oskinova, N. St-Louis, A. ud-Doula, A. A. C. Sander, H. Sana, J. -C. Bouret, B. Kubátová, P. Marchant, L. P. Martins, A. Wofford, J. Th. van Loon, O. Grace Telford, Y. Götberg, D. M. Bowman, C. Erba, V. M. Kalari, M. Abdul-Masih, T. Alkousa, F. Backs, C. L. Barbosa, S. R. Berlanas, M. Bernini-Peron, J. M. Bestenlehner, R. Blomme, J. Bodensteiner, S. A. Brands, C. J. Evans, A. David-Uraz, F. A. Driessen, K. Dsilva, S. Geen, V. M. A. Gómez-González, L. Grassitelli, W. -R. Hamann, C. Hawcroft, A. Herrero, E. R. Higgins, D. John Hillier, R. Ignace, A. G. Istrate, L. Kaper, N. D. Kee, C. Kehrig, Z. Keszthelyi, J. Klencki, A. de Koter, R. Kuiper, E. Laplace, C. J. K. Larkin, R. R. Lefever, C. Leitherer, D. J. Lennon, L. Mahy, J. Maíz Apellániz, G. Maravelias, W. Marcolino, A. F. McLeod, S. E. de Mink, F. Najarro, M. S. Oey, T. N. Parsons, D. Pauli, M. G. Pedersen, R. K. Prinja, V. Ramachandran, M. C. Ramírez-Tannus, G. N. Sabhahit, A. Schootemeijer, S. Reyero Serantes, T. Shenar, G. S. Stringfellow, N. Sudnik, F. Tramper, L. Wang

Date Published: 1st Jul 2023

Publication Type: Journal
