Customer-obsessed science
Research areas
-
January 13, 20267 min readLeveraging existing environment simulators and reward functions based on verifiable ground truth boosts task success rate, even with small models and small training datasets.
-
January 8, 20264 min read
-
December 29, 20256 min read
-
December 29, 20259 min read
-
December 10, 20255 min read
Featured news
-
PQC Standardization Conference 20242024The QC-MDPC code-based KEM BIKE is an alternative candidate for standardization for the NIST Post-Quantum Cryptography Standardization Project. Per NIST’s report [2] “The BIKE cryptosystem was initially designed for ephemeral key use but has now been claimed to also support static key use”. BIKE uses the BGF decoder of [9] where its Decoding Failure Rate (DFR) is estimated by means of an extrapolation method
-
ICLR 2024 Workshop on Secure and Trustworthy Large Language Models (SET LLM)2024In this work, we propose sequence-level certainty as a common theme over hallucination in Knowledge Grounded Dialogue Generation (KGDG). We explore the correlation between the level of hallucination in model responses and two types of sequence-level certainty: probabilistic certainty and semantic certainty. Empirical results reveal that higher levels of both types of certainty in model responses are correlated
-
2024Document translation poses a challenge for Neural Machine Translation (NMT) systems. Most document-level NMT systems rely on meticulously curated sentence-level parallel data, assuming flawless extraction of text from documents along with their precise reading order. These systems also tend to disregard additional visual cues such as the document layout, deeming it irrelevant. However, real-world documents
-
MLSys 20242024The Mixture-of-Expert (MoE) technique plays a crucial role in expanding the size of DNN model parameters. However, it faces the challenge of extended all-to-all communication latency during the training process. Existing methods attempt to mitigate this issue by overlapping all-to-all with expert computation. Yet, these methods frequently fall short of achieving sufficient overlap, consequently restricting
-
MLSys 20242024Diffusion models have emerged as dominant performers for image generation. To support training large diffusion models, this paper studies pipeline parallel training of diffusion models and proposes DiffusionPipe, a synchronous pipeline training system that advocates innovative pipeline bubble filling technique, catering to structural char-acteristics of diffusion models. State-of-the-art diffusion models
Collaborations
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all