Customer-obsessed science
Research areas
-
November 20, 2025 · 4 min read
A new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
-
September 2, 2025 · 3 min read
Featured news
-
2024
Foundation models (FMs) learn from large volumes of unlabeled data to demonstrate superior performance across a wide range of tasks. However, FMs developed for biomedical domains have largely remained unimodal, i.e., independently trained and used for tasks on protein sequences alone, small-molecule structures alone, or clinical data alone. To overcome this limitation, we present BioBRIDGE, a parameter-efficient
-
EACL 2024 Workshop on Linguistic Annotation, 2024
Recent developments in active learning algorithms for NLP tasks show promising results in reducing labelling complexity. In this paper we extend this effort to imbalanced datasets: we bridge the active learning approach of obtaining diverse and informative examples and the class-balancing heuristic used for imbalanced datasets. We develop a novel tune-free weighting technique that can
-
ICASSP 2024 Workshop on Self-supervision in Audio, Speech and Beyond, 2024
Voice conversion (VC) systems are widely used for several applications, from speaker anonymisation to personalised speech synthesis. Supervised approaches learn a mapping between different speakers using parallel data, which is expensive to produce. Unsupervised approaches are typically trained to reconstruct the input signal, which is composed of the content and the speaker information. Disentangling
-
2024
Cross-triggering is a critical problem for applications of audio event detection (AED), particularly in low-resource settings. However, little if any attention has been paid to this problem in the AED research community. In this work, we tackle this problem via a regularization approach. We propose a regularizer, the mutual exclusivity regularizer, that is able to enforce pairwise exclusivity
-
arXiv, 2024
We introduce a text-to-speech (TTS) model called BASE TTS, which stands for Big Adaptive Streamable TTS with Emergent abilities. BASE TTS is the largest TTS model to date, trained on 100K hours of public domain speech data, achieving a new state of the art in speech naturalness. It deploys a 1-billion-parameter autoregressive Transformer that converts raw texts into discrete codes ("speechcodes") followed
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.