Customer-obsessed science
Research areas
- November 20, 2025 (4 min read): A new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
- October 2, 2025 (3 min read)
- September 2, 2025 (3 min read)
Featured news
- ACL Findings 2023: Code-mixing is ubiquitous in multilingual societies, which makes it vital to build models for code-mixed data to power human language interfaces. Existing multilingual transformer models trained on pure corpora lack the ability to intermix words of one language into the structure of another. These models are also not robust to orthographic variations. We propose CoMix, a pre-training approach to improve …
- CVPR 2023: Model selection is essential for reducing the cost of searching a large-scale model zoo for the best pre-trained model for a downstream task. After analyzing recent hand-designed model selection criteria with 400+ ImageNet pre-trained models and 40 downstream tasks, we find that they can fail due to invalid assumptions and intrinsic limitations. Prior knowledge of model capacity and dataset also cannot …
- ASEE 2023: The majority of students who choose to major in engineering do so to become part of the community of practice of professional engineers (Johri & Olds, 2011), meaning that they want their college experience to include adequate exposure to what a career as a professional engineer could be. However, according to Jonassen (2014), engineering graduates are not well trained to contribute …
- UAI 2023: We study the problem of best-arm identification (BAI) in the fixed-budget setting with heterogeneous reward variances. We propose two variance-adaptive BAI algorithms for this setting: SHVar for known reward variances and SHAdaVar for unknown reward variances. The key idea in our algorithms is to adaptively allocate more budget to arms with higher reward variances (see the sketch after this list). The main algorithmic novelty is in the …
- ACL 2023: Recent work has shown that large-scale annotated datasets are essential for training state-of-the-art Question Answering (QA) models. Unfortunately, creating this data is expensive and requires a huge amount of annotation work. An alternative and cheaper source of supervision is feedback data collected from deployed QA systems. This data can be collected from tens of millions of users with no additional …
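The key idea in the UAI 2023 item above, spending more of a fixed pull budget on higher-variance arms inside a successive-halving loop, can be illustrated with a short sketch. This is a minimal illustration under the assumption of known reward variances, not the paper's SHVar or SHAdaVar algorithms; the function name `variance_adaptive_sh` and its exact allocation and elimination rules are choices made for this example.

```python
import numpy as np

def variance_adaptive_sh(pull, variances, n_arms, budget):
    """Successive halving with variance-proportional budget allocation.

    pull(i)   -> one stochastic reward sample from arm i
    variances -> assumed-known per-arm reward variances
    budget    -> total number of arm pulls allowed across all rounds
    """
    variances = np.asarray(variances, dtype=float)
    active = list(range(n_arms))
    n_rounds = max(1, int(np.ceil(np.log2(n_arms))))
    per_round = budget // n_rounds
    sums = np.zeros(n_arms)
    counts = np.zeros(n_arms)

    for _ in range(n_rounds):
        if len(active) == 1:
            break
        # Give noisier arms more of this round's budget: pulls are split in
        # proportion to the (known) reward variances of the surviving arms.
        var = variances[active]
        alloc = np.maximum(1, np.floor(per_round * var / var.sum())).astype(int)
        for arm, n_pulls in zip(active, alloc):
            sums[arm] += sum(pull(arm) for _ in range(n_pulls))
            counts[arm] += n_pulls
        # Eliminate the worse half of the surviving arms by empirical mean.
        means = sums[active] / np.maximum(counts[active], 1)
        order = np.argsort(-means)
        active = [active[j] for j in order[: max(1, len(active) // 2)]]

    return active[0]


# Example: 8 Gaussian arms with heterogeneous noise levels; the high-variance
# arms receive more pulls, which sharpens their mean estimates.
rng = np.random.default_rng(0)
mus = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
sigmas = np.array([0.1, 2.0, 0.5, 1.5, 0.2, 1.0, 0.3, 2.5])
best = variance_adaptive_sh(lambda i: rng.normal(mus[i], sigmas[i]),
                            variances=sigmas ** 2, n_arms=8, budget=4000)
print("identified best arm:", best)  # arm 7 has the highest mean reward
```

Splitting each round's budget in proportion to arm variance is one simple way to act on the teaser's idea; an unknown-variance variant would instead allocate pulls using variance estimates updated from the samples collected so far.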
Collaborations
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.