Beyond grey-box assumptions: Uncertainty-guided example selection for black-box language models
2026
In-context learning (ICL) with large language models (LLMs) has proven effective, but performance depends heavily on demonstration quality while annotation budgets remain constrained. Existing uncertainty-based selection methods such as Cover-ICL achieve strong performance through logit-based uncertainty estimation, yet most production LLMs operate as black-box APIs whose internal states are inaccessible. This paper investigates whether effective uncertainty-guided example selection can be maintained under black-box constraints by developing a consistency-based uncertainty estimator that relies only on output observations. We evaluate five active learning methods (random, hardest, VoteK, fast-VoteK, and Cover-ICL) across six benchmark datasets under both grey-box and black-box settings. Experiments reveal that the best strategy depends on the access paradigm: the grey-box setting performs best with Cover-ICL (62.34% average accuracy), while the black-box setting favors hardest selection (68.71% average accuracy), though no single method dominates across all datasets. Our framework supports choosing an uncertainty estimation strategy that matches the model accessibility constraints of practical deployment scenarios.
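As a rough illustration of the black-box setting described above, the sketch below scores an example's uncertainty purely from agreement among repeated sampled outputs, then uses that score for hardest-first selection. This is a minimal sketch under stated assumptions: the query_model interface, the majority-vote disagreement score, and the select_hardest helper are illustrative choices, not the paper's exact estimator or selection pipeline.

```python
from collections import Counter

def consistency_uncertainty(query_model, prompt, n_samples=10, temperature=0.7):
    """Estimate uncertainty for a black-box model from output agreement alone.

    query_model is an assumed callable that sends `prompt` to the model API
    with the given sampling temperature and returns one answer string;
    black-box APIs expose only such text outputs, not logits.
    """
    # Draw several stochastic completions for the same prompt.
    answers = [query_model(prompt, temperature=temperature) for _ in range(n_samples)]

    # Agreement rate of the most frequent answer; low agreement means
    # high disagreement across samples, i.e. high estimated uncertainty.
    counts = Counter(a.strip().lower() for a in answers)
    top_count = counts.most_common(1)[0][1]
    return 1.0 - top_count / n_samples

def select_hardest(query_model, pool, budget):
    """Pick the `budget` most uncertain pool examples to annotate
    (a plain rendering of the 'hardest' selection strategy)."""
    scored = sorted(pool,
                    key=lambda p: consistency_uncertainty(query_model, p),
                    reverse=True)
    return scored[:budget]
```

The key design point is that the score needs nothing beyond sampled text, so the same selection loop can run against a grey-box model (replacing the score with a logit-based one) or a black-box API without other changes.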
Research areas