Avoiding local minima in variational quantum eigensolvers with the natural gradient optimizer
We compare the BFGS optimizer, ADAM and natural gradient descent (NatGrad) in the context of variational quantum eigensolvers (VQEs). We systematically analyze their performance on the QAOA ansatz for the transverse field Ising model and the XXZ model, as well as on overparametrized circuits with the ability to break the symmetry of the Hamiltonian. The BFGS algorithm is frequently unable to find a global minimum for systems beyond about 20 spins, and ADAM easily gets trapped in local minima or exhibits infeasible optimization durations. NatGrad, on the other hand, shows stable performance on all considered system sizes, compensating for its higher cost per epoch with reliability and competitive total run times. In sharp contrast to most classical gradient-based learning, the performance of all optimizers decreases upon seemingly benign overparametrization of the ansatz class, with BFGS and ADAM failing more often and more severely than NatGrad. This not only stresses the necessity for good ansatz circuits but also means that overparametrization, an established remedy for avoiding local minima in machine learning, does not seem to be a viable option in the context of VQEs. The behavior in both investigated spin chains is similar; in particular, the problems of BFGS and ADAM surface in both systems, even though their effective Hilbert space dimensions differ significantly. Overall, our observations stress the importance of avoiding redundant degrees of freedom in ansatz circuits and of putting established optimization algorithms and attached heuristics to the test on larger system sizes. Natural gradient descent emerges as a promising choice to optimize large VQEs.