- ICLR 2023: We present HumanEvalX and MBXP, execution-based code completion benchmarks in 10+ programming languages. These datasets are generated by our conversion framework that transpiles prompts and test cases from original datasets (HumanEval and MBPP) to the corresponding data in a target language. Based on these benchmarks, we are able to evaluate code generation models in a multilingual fashion, and in particular
- ICASSP 2023: GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models. However, most of their architectures require dozens of billions of floating-point operations per second (GFLOPS) to generate speech waveforms in a sample-wise manner. This makes GAN vocoders still challenging to run on normal CPUs without accelerators or parallel computers. In this work,
- ICLR 2023: In this paper, we study how to use masked signal modeling in vision and language (V+L) representation learning. Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose to build joint masked vision and language modeling, where the masked signal of one modality is reconstructed with help from another modality. This is motivated by the nature of image-text
- ICLR 2023: Anomaly detection in time-series has a wide range of practical applications. While numerous anomaly detection methods have been proposed in the literature, a recent survey concluded that no single method is the most accurate across various datasets. To make matters worse, anomaly labels are scarce and rarely available in practice. The practical problem of selecting the most accurate model for a given dataset
- ICLR 2023: Empirical studies suggest that machine learning models trained with empirical risk minimization (ERM) often rely on attributes that may be spuriously correlated with the class labels. Such models typically lead to poor performance during inference for data lacking such correlations. In this work, we explicitly consider a situation where potential spurious correlations are present in the majority of training
Related content
- July 13, 2021: Innovative faculty proposals will explore various aspects of trustworthy machine learning.
- July 07, 2021: James Hensman joins an effort to expand machine learning talent for UN sustainability goals.
- June 29, 2021: How Amazon’s Delivery Experience team acts as a concierge for customers.
- June 28, 2021: Didn't get the opportunity to attend the summit earlier this month? Now available on demand: presentations on the science of machine learning by leading scholars, a fireside chat with Andrew Ng, and more career-growth content.
- June 22, 2021: Scientists describe the use of privacy-preserving machine learning to address privacy challenges in XGBoost training and prediction.
- June 21, 2021: Özer’s paper published in INFORMS’ Management Science 2021 explores the dynamics behind “cheap-talk” communications.