We tackle the adversarial best arm identification problem, in its extension where different arms are associated with different costs. This extension is a natural abstraction of the cost-aware multi-fidelity hyperparameter optimization (CAMF-HPO) problem, where the costs, e.g. monetary or wall-clock time, of evaluations with different hyperparameter configurations are not uniform. We provide an algorithm with rigorous guarantees, matching a provable lower bound up to logarithmic factors. To the best of our knowledge, this is the first algorithm for this important problem that provides provable guarantees. The provided analysis has an application to the stochastic best arm identification problem. The core of our algorithm extends the Sequential Halving (SH) algorithm, and our novel view proves that information does not have to be discarded between rungs, closing a gap between practical implementations of SH and the theory behind it.
Research areas