Customer-obsessed science
- November 20, 2025 (4 min read): A new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
- October 2, 2025 (3 min read)
- September 2, 2025 (3 min read)
Featured news
- ICML 2023 (2023): Minimax-fair machine learning minimizes the error for the worst-off group. However, empirical evidence suggests that when sophisticated models are trained with standard empirical risk minimization (ERM), they often have the same performance on the worst-off group as a minimax-trained model. Our work makes this counterintuitive observation concrete. We prove that if the hypothesis class is sufficiently expressive
- UAI 2023 (2023): Statistical prediction models are often trained on data that is drawn from different probability distributions than their eventual use cases. One approach to proactively prepare for these shifts harnesses the intuition that causal mechanisms should remain invariant between environments. Here we focus on a challenging setting in which the causal and anticausal variables of the target are unobserved. Leaning
- ICML 2023 Workshop on Data-centric Machine Learning Research (DMLR) (2023): Despite recent advances in synthetic data generation, the scientific community still lacks a unified consensus on its usefulness. It is commonly believed that synthetic data can be used for both data exchange and boosting machine learning (ML) training. Privacy-preserving synthetic data generation can accelerate data exchange for downstream tasks, but there is not enough evidence to show how or why synthetic
- Applied Marketing Analytics (AMA) (2023): Video ads are increasingly popular in digital marketing, but advertisers are unsure about how much they improve performance over static ads and which consumer response, such as unmuting or watching through the end, matters most. Using data from the online retail site Amazon.com, we apply causal inference methods to both a monthlong and yearlong time horizon and find support for our hypotheses.
- ACL Findings 2023, NeurIPS 2022 Workshop on SyntheticData4ML (2023): There has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed two shortcomings: illogical synthetic SQL queries from independent column sampling and
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.