In structured prediction, a predictor optimizes an objective function over a combinatorial search space, such as the set of all image segmentations, or the set of all part-of-speech taggings. Unfortunately, finding the optimal structured labeling—sometimes referred to as maximum a posteriori (MAP) inference—is, in general, NP-hard, due to the combinatorial structure of the problem. Many inference approximations have been proposed, some of which are based on linear programming (LP) relaxations [e.g., 5, 1, 15, 16, 4, 13], which “relax” the combinatorial search space to a convex polytope with a polynomial number of constraints. These LP relaxations can be solved efficiently, but may result in fractional solutions, i.e., non-integral labelings. The approximation quality of an LP relaxation is traditionally measured by a quantity known as the integrality gap, defined as the difference between the objective values obtained at the optima of the relaxed and exact problems. When the integrality gap is zero, the relaxation is said to be tight.
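As a concrete illustration of these notions (a standard textbook example, not taken from this paper's setting), consider minimum vertex cover on a triangle. The integer program requires $x_i + x_j \ge 1$ for each edge with $x_i \in \{0,1\}$; the LP relaxation allows $x_i \in [0,1]$ and admits the fractional point $(1/2, 1/2, 1/2)$, which is in fact the LP optimum, so the relaxation is not tight. The sketch below brute-forces the integral optimum and computes the resulting integrality gap, measured as a difference of objective values as in the text above:

```python
from itertools import product

# Triangle graph: three vertices, all pairs connected.
edges = [(0, 1), (1, 2), (0, 2)]

def feasible(x):
    """Covering constraints x_i + x_j >= 1 for every edge."""
    return all(x[i] + x[j] >= 1 for i, j in edges)

# Exact (integral) optimum: brute-force over all 0/1 labelings.
ilp_opt = min(sum(x) for x in product([0, 1], repeat=3) if feasible(x))

# The LP relaxation admits the fractional point (1/2, 1/2, 1/2),
# which is feasible and achieves a strictly smaller objective value;
# for the triangle this is the LP optimum.
x_frac = (0.5, 0.5, 0.5)
assert feasible(x_frac)
lp_opt = sum(x_frac)

integrality_gap = ilp_opt - lp_opt  # difference of the two optima
print(ilp_opt, lp_opt, integrality_gap)  # 2 1.5 0.5
```

Here the gap is nonzero, so solving the LP yields a non-integral labeling that must somehow be rounded back to a valid cover; a tight relaxation would instead return an integral optimum directly.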