Publications
A contextualised protein language model reveals the functional syntax of bacterial evolution
Bacteria have evolved a vast diversity of functions and behaviours which are currently incompletely understood and poorly predicted from DNA sequence alone. To understand the syntax of bacterial evolution and discover genome-to-phenotype relationships, we curated over 1.3 million genomes spanning bacterial phylogenetic space, representing each as an ordered sequence of proteins which collectively were used to train a transformer-based, contextualised protein language model, Bacformer. […]
Wiatrak, M., Viñas Torné, R., Ntemourtsidou, M., Dinan, A., Abelson, D.C., Arora, D., Brbić, M., Weimann, A., Floto, R. A.
In bioRxiv, 2025.
Sequence-based modelling of bacterial genomes enables accurate antibiotic resistance prediction
Rapid detection of antibiotic-resistant bacteria and understanding the mechanisms underlying antimicrobial resistance (AMR) are major unsolved problems that pose significant threats to global public health. […]
Wiatrak, M., Weimann, A., Dinan, A., Brbić, M., Floto, R. A.
In bioRxiv, 2024.
On Masked Language Models for Contextual Link Prediction
In the real world, many relational facts require context; for instance, a politician holds a given elected position only for a particular timespan. […]
Brayne, A., Wiatrak, M., Corneil, D.
In ACL 2022 workshop on Deep Learning Inside Out (DeeLIO).
Directed graph embeddings in pseudo-Riemannian manifolds
The inductive biases of graph representation learning algorithms are often encoded in the background geometry of their embedding space. […]
Sim, A., Wiatrak, M., Brayne, A., Creed, P., Paliwal, S.
In ICML 2021.
Simple hierarchical multi-task neural end-to-end entity linking for biomedical text
Recognising and linking entities is a crucial first step to many tasks in biomedical text analysis, such as relation extraction and target identification. […]
Wiatrak, M., Iso-Sipilä, J.
In EMNLP 2021 workshop on Health Text Mining and Information Analysis (LOUHI).
Stabilizing generative adversarial networks: A survey
Generative Adversarial Networks (GANs) are a type of generative model which have received much attention due to their ability to model complex real-world data. Despite their recent successes, the process of training GANs remains challenging […]
Wiatrak, M., Albrecht, S.V., Nystrom, A.
In arXiv, 2020.