-
2025: Accurate mapping of queries to product categories is crucial for efficient retrieval and ranking of relevant products in e-commerce search. Conventionally, such query classification models rely on supervised learning using historical user interactions, but their effectiveness diminishes in cold-start scenarios, where new categories or products lack sufficient training data. This results in poor query-to-category
-
2025: Large Language Models (LLMs) enable natural language to SQL conversion, allowing users to query databases without SQL expertise. However, generating accurate, efficient queries is challenging due to ambiguous intent, domain knowledge requirements, and database constraints. Extensive reasoning improves SQL quality but increases computational costs and latency. We propose SQLGenie, a practical system for
-
EMNLP 2024 Workshop on Customizable NLP, Transactions on Machine Learning Research, 2025: Precise estimation of downstream performance in large language models (LLMs) prior to training is essential for guiding their development process. Scaling laws analysis utilizes the statistics of a series of significantly smaller sampling language models (LMs) to predict the performance of the target LLM. For downstream performance prediction, the critical challenge lies in the emergent abilities in LLMs
-
Amazon Technical Reports, 2025: We present Amazon Nova Premier, our most capable multimodal foundation model and teacher for model distillation. Nova Premier processes text, images, and videos with a one-million token context window enabling analysis of large codebases, long documents, and long videos in a single prompt. It also enables customers to use Amazon Bedrock to create customized variants of Amazon Nova Pro, Nova Lite, and Nova
-
2025: Toxicity text detectors can be vulnerable to adversarial examples - small perturbations to input text that fool the systems into wrong detection. Existing attack algorithms are time-consuming and often produce invalid or ambiguous adversarial examples, making them less useful for evaluating or improving real-world toxicity content moderators. This paper proposes an annotation pipeline for quality control
Related content
-
November 11, 2020: With a new machine learning system, Alexa can infer that an initial question implies a subsequent request.
-
November 10, 2020: Alexa senior applied scientist provides career advice to graduate students considering a research role in industry.
-
November 9, 2020: Watch a recording of the EMNLP 2020 session featuring a discussion with Amazon scholars and academics on the state of conversational AI.
-
November 6, 2020: Work aims to improve accuracy of models both on- and off-device.
-
November 3, 2020: Fourth challenge features four new teams.
-
October 30, 2020: Prosody transfer technique addresses the problem of “source speaker leakage,” while prosody selection model better matches prosody to semantic content.