-
ECIR 2023
Graph Convolutional Networks have recently shown state-of-the-art performance for collaborative filtering-based recommender systems. However, many systems use a pure user-item bipartite interaction graph, ignoring available additional information about the items and users. This paper proposes an effective and general method, TextGCN, that utilizes rich textual information about the graph nodes, specifically …
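The entry above describes enriching a GCN over the user-item bipartite graph with textual information about the nodes. Below is a minimal sketch of that general idea, assuming a PyTorch setup in which node features are initialized from a text encoder; the class name TextAwareGCNLayer and all shapes are illustrative assumptions, not TextGCN's actual code.

```python
# Hypothetical sketch (names and shapes assumed, not the paper's code):
# initialize user/item node features from a text encoder, then propagate
# them over the normalized user-item bipartite graph with one GCN layer.
import torch
import torch.nn as nn

class TextAwareGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, node_feats, norm_adj):
        # node_feats: (num_users + num_items, dim), e.g. frozen text embeddings
        # norm_adj:   sparse normalized adjacency over users and items
        return torch.relu(self.proj(torch.sparse.mm(norm_adj, node_feats)))

# Upstream (assumed): text_emb = sentence_encoder(item_descriptions)
# Downstream (assumed): scores = user_out @ item_out.T for recommendation
```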
-
ACL Findings 2023
There has been great progress in unifying various table-to-text tasks using a single encoder-decoder model trained via multi-task learning (Xie et al., 2022). However, existing methods typically encode task information with a simple dataset name as a prefix to the encoder. This not only limits the effectiveness of multi-task learning, but also hinders the model's ability to generalize to new domains or tasks …
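The abstract notes that existing multi-task methods encode task information simply by prefixing the dataset name to the encoder input. The snippet below illustrates that prefixing scheme only; it assumes a T5-style encoder-decoder loaded via Hugging Face transformers, and the "totto" prefix and linearized-table string are made-up examples, not the paper's data format.

```python
# Illustrative only: prepending a dataset-name prefix to the encoder input,
# the common recipe for multi-task table-to-text training (identifiers assumed).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def encode_example(dataset_name, linearized_table):
    # e.g. "totto: <table> player | points | 31 </table>"
    source = f"{dataset_name}: {linearized_table}"
    return tokenizer(source, return_tensors="pt", truncation=True)

batch = encode_example("totto", "<table> player | points | 31 </table>")
out = model.generate(**batch, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```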
-
ACL Findings 2023
Code-mixing is ubiquitous in multilingual societies, which makes it vital to build models for code-mixed data to power human language interfaces. Existing multilingual transformer models trained on pure corpora lack the ability to intermix words of one language into the structure of another. These models are also not robust to orthographic variations. We propose CoMix, a pre-training approach to improve …
-
ACL 2023
Recent work has shown that large-scale annotated datasets are essential for training state-of-the-art Question Answering (QA) models. Unfortunately, creating this data is expensive and requires a huge amount of annotation work. An alternative and cheaper source of supervision is feedback data collected from deployed QA systems. This data can be collected from tens of millions of users with no additional …
-
Quantization-aware and tensor-compressed training of transformers for natural language understanding
Interspeech 2023
Fine-tuned transformer models have shown superior performance on many natural language tasks. However, the large model size prohibits deploying high-performance transformer models on resource-constrained devices. This paper proposes a quantization-aware tensor-compressed training approach to reduce the model size, arithmetic operations, and ultimately runtime latency of transformer-based models. We compress …
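As a rough illustration of combining tensor-style compression with quantization-aware training, the sketch below replaces a dense linear layer with a low-rank factorization and applies fake quantization with a straight-through estimator during training. All names, ranks, and bit widths are assumptions for exposition, not the paper's implementation, which uses a more general tensor compression.

```python
# Minimal sketch (assumed names, not the paper's code): a linear layer is
# factorized into two small matrices, and both factors are fake-quantized in
# the forward pass so training accounts for the eventual rounding error.
import torch
import torch.nn as nn

def fake_quant(w, bits=8):
    # symmetric uniform fake quantization with a straight-through estimator
    scale = w.detach().abs().max() / (2 ** (bits - 1) - 1) + 1e-8
    q = torch.clamp(torch.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return w + (q * scale - w).detach()  # forward: quantized, backward: identity

class QuantLowRankLinear(nn.Module):
    def __init__(self, d_in, d_out, rank=32, bits=8):
        super().__init__()
        self.a = nn.Parameter(torch.randn(d_in, rank) * 0.02)
        self.b = nn.Parameter(torch.randn(rank, d_out) * 0.02)
        self.bits = bits

    def forward(self, x):
        w = fake_quant(self.a, self.bits) @ fake_quant(self.b, self.bits)
        return x @ w
```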
Related content
-
February 15, 2024
In addition to its practical implications, recent work on “meaning representations” could shed light on some old philosophical questions.
-
January 25, 2024
Amazon IIT–Bombay AI-ML Initiative seeks to advance artificial intelligence and machine learning research within speech, language, and multimodal-AI domains.
-
January 17, 2024
Representing facts using knowledge triplets rather than natural language enables finer-grained judgments.
-
December 20, 2023
Novel architectures and carefully prepared training data enable state-of-the-art performance.
-
December 19, 2023
Four professors awarded for research in machine learning and robotics; two doctoral candidates awarded fellowships.
-
December 11, 2023
Amazon senior principal engineer Luu Tran is helping the Alexa team innovate by collaborating closely with scientist colleagues.