-
As the demand for online A/B testing continues to rises for tech companies, the opportunity cost of conducting these experiments becomes increasingly significant. Consequently, there is a rising need for an efficient continuous monitoring system capable of early terminating experiments when necessary. Existing literature and tools primarily focuses on early terminating experiments with evidently significant
-
Tabular data is one of the most common data formats found in the web and used in domains like finance, banking, e-commerce and medical. Although deep neural networks (DNNs) have demonstrated outstanding performance on homogeneous data such as visual, audio, and textual data, tree ensemble methods such as Gradient Boosted Decision Trees (GBDTs) are often the go-to choice for supervised machine learning problems
-
2025Retrieval-Augmented Generation (RAG) systems have shown promise in enhancing the performance of Large Language Models (LLMs). However, these systems face challenges in effectively integrating external knowledge with the LLM’s internal knowledge, often leading to issues with misleading or unhelpful information. This work aims to provide a systematic study on knowledge checking in RAG systems. We conduct
-
2025The data on user behaviors is sparse given the vast array of user-item combinations. Attributes related to users (e.g., age), items (e.g., brand), and behaviors (e.g., co-purchase) serve as crucial input sources for item-item transitions of user’s behavior prediction. While recent Transformer-based sequential recommender systems learn the attention matrix for each attribute to update item representations
-
2025Audio-Visual Speech-to-Speech Translation (AVS2S) typically prioritizes improving translation quality and naturalness. However, an equally critical aspect in audio-visual content is lip-synchrony—ensuring that the movements of the lips match the spoken content—essential for maintaining realism in dubbed videos. Despite its importance, the inclusion of lip-synchrony constraints in AVS2S models has been largely
Related content
-
September 19, 2024“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.
-
July 17, 2024Learning algorithms and reinforcement learning are areas of focus, while LLM-related research — on topics such as continual learning, hallucination mitigation, and privacy — remains well represented.
-
May 31, 2024Novel loss term that can be added to any loss function regularizes interclass and intraclass distances.
-
May 17, 2024A novel loss function and a way to aggregate multimodal input data are key to dramatic improvements on some test data.
-
March 25, 2024Automated method that uses gradients to identify salient layers prevents regression on previously seen data.
-
March 21, 2024The principal economist and his team address unique challenges using techniques at the intersection of microeconomics, statistics, and machine learning.