We consider the task of predicting subjective fashion traits from images. Specifically, we are interested in understanding which outfit better suits the user. Since these traits are highly subjective, their labels tend to be noisy. One solution is to annotate each example several times, but this makes it hard to collect large amounts of data. For practical reasons, large datasets therefore have only a few human annotations per example. This introduces sampling uncertainty, since labels are estimated from only a small set of human annotations. In this paper, we provide a closed-form expression to model the label uncertainty induced by sub-sampling. We show that, for fashion-related traits, our model can quantify the ability of a learning algorithm to learn from noisy data. We further use this model to construct a custom neural network loss function that learns fashion traits better.
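To make the notion of sampling uncertainty concrete, the sketch below uses a standard binomial model as an illustration. It is not the closed-form expression of the paper. It assumes each annotator independently votes for an outfit with a fixed probability `p`, so that a label estimated from `n` votes has standard error sqrt(p(1 - p)/n). The function name `label_standard_error` and the example value `p = 0.7` are hypothetical.

```python
import math


def label_standard_error(p: float, n: int) -> float:
    """Standard error of a label estimated as the mean of n binary votes.

    Assumption (not from the paper): votes are independent Bernoulli(p)
    draws, so the estimated label k/n has variance p * (1 - p) / n.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of annotations")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Hypothetical preference rate: 70% of annotators prefer outfit A.
    for n in (1, 3, 5, 25):
        print(f"n={n:2d}  standard error={label_standard_error(0.7, n):.3f}")
```

Under this assumption the uncertainty shrinks only as 1/sqrt(n), which illustrates the trade-off described above: halving the label noise requires roughly four times as many annotations per example.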