Mitigating social bias in knowledge graph embeddings
Method significantly reduces bias while maintaining comparable performance on machine learning tasks.
Question-answering systems frequently rely on knowledge graphs, large collections of facts about real-world entities (people, organizations, countries, etc.). To make use of the information in knowledge graphs, machine learning models often employ knowledge graph embeddings, vector representations of the entities in the graphs.
A potential problem with this approach is that the distributions of data in knowledge graphs reflect current and historical social biases. For instance, most knowledge graphs include more male entities than female with the profession “banker”, or more “white American” entities than “African-American” entities with the profession “ballet dancer”.
If knowledge graph embeddings end up encoding these biases, so will the question-answering systems that use them. If a little girl talking to a chatbot asks, “What should I be when I grow up?”, a biased embedding might rule out possible answers that are predominantly associated with men in the knowledge graph. For some professions — “baritone”, for instance — that may be fine. But in other cases, the biases may be relics of a less egalitarian past.
Earlier this year, at the AKBC Workshop on Bias in Knowledge Graphs, we presented a paper that examines this problem. Using a standard embedding technique, we looked for correlations between the professions of people listed in Wikidata and demographic factors, such as gender, ethnicity, and religion, to see whether the embeddings do indeed encode harmful social biases.
Following on from this, at last week’s Conference on Empirical Methods in Natural Language Processing (EMNLP), we presented “Debiasing knowledge graph embeddings”, in which we attempt to address this problem by developing a lightweight alteration to the standard method of training graph embeddings that reduces bias.
As knowledge graph embeddings become more widely used within the machine learning community, we hope this work raises awareness of the biases they may encode and moves us closer to the goal of effective debiasing.
Knowledge graph embedding
A standard knowledge graph represents data using triples, each of which consists of two entities and the relationship between them: for instance, the entities emmanuelle_charpentier and germany are related by the relation lives_in.
Knowledge graph embeddings represent the entities in a knowledge graph as points in a multidimensional space. The idea is that spatial relationships between the points encode the relationships captured by the graph.
With the common embedding framework TransE, for instance, adding the vector representing the relationship lives_in to the point representing emmanuelle_charpentier should bring us close to the location of the point representing germany.
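A minimal sketch of this idea follows. The two-dimensional vectors are hand-picked toy values rather than learned embeddings, and the helper name `transe_score` is hypothetical.

```python
import math

def transe_score(head, relation, tail):
    """Negative Euclidean distance between head + relation and tail.

    Scores closer to zero mean the triple fits the embedding space better.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Toy vectors, chosen by hand for illustration only (not learned).
emmanuelle_charpentier = [1.0, 2.0]
lives_in = [3.0, -1.0]
germany = [4.1, 0.9]
france = [0.0, 5.0]

score_germany = transe_score(emmanuelle_charpentier, lives_in, germany)
score_france = transe_score(emmanuelle_charpentier, lives_in, france)
print(score_germany, score_france)
```

With these toy values, the triple ending in germany scores higher (closer to zero) than the one ending in france, because head plus relation lands nearer to the germany point.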
During training, the embedding model learns to maximize the accuracy of these spatial relationships across all the triples captured in the knowledge graph. Among other applications, embeddings can be used for link prediction, or inferring relationships between entities that do not yet feature in the graph.
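As a rough sketch of both steps, the code below shows the margin-based ranking loss that is commonly used to train TransE, together with link prediction as a ranking of candidate entities by distance. The text above does not specify the training objective, so the choice of loss is an assumption. The function names and toy vectors are also illustrative.

```python
import math

def distance(head, relation, tail):
    """Euclidean distance between head + relation and tail (smaller is better)."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_loss(head, relation, true_tail, corrupted_tail, margin=1.0):
    """Zero once the true tail is closer than the corrupted tail by at least `margin`."""
    return max(0.0, margin
               + distance(head, relation, true_tail)
               - distance(head, relation, corrupted_tail))

def rank_tails(head, relation, candidates):
    """Link prediction: candidate names sorted from most to least plausible tail."""
    return sorted(candidates, key=lambda name: distance(head, relation, candidates[name]))

# Toy vectors, hand-picked for illustration only.
person = [1.0, 2.0]
lives_in = [3.0, -1.0]
entities = {"germany": [4.1, 0.9], "france": [0.0, 5.0], "japan": [6.0, 3.0]}

ranking = rank_tails(person, lives_in, entities)
loss_far = margin_loss(person, lives_in, entities["germany"], entities["france"])
loss_near = margin_loss(person, lives_in, entities["germany"], [4.5, 1.5])
print(ranking, loss_far, loss_near)
```

A corrupted tail that is already far away contributes no loss. One that sits close to the true tail still does, and this is what pushes the embeddings apart during training.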
Do trained knowledge graph embeddings encode social biases?
To see why knowledge graph embeddings might encode social biases, let’s look at the counts of male and female entities in Wikidata, the most extensive open-source knowledge graph.
There are more than four times as many male entities in Wikidata as there are female, a reflection of long-persisting social biases in the real world.
In our paper “Measuring social bias in knowledge graph embeddings”, we determine whether such differences in counts become encoded in embeddings. To do this, we take the embedding of a human entity and tune it so that adding a relation vector (such as has_religion or has_gender) moves it closer to the embedding of a particular right-hand attribute (such as “Catholic” or “female”).
As we tune the embedding, we observe how the result of adding the has_profession vector changes. That is, for each potential profession, we determine whether the model assigns it to the person with greater or lesser probability as the embedding changes.
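The probe can be sketched as follows, under several assumptions. It uses a TransE-style squared distance as a stand-in for the model's probability, takes a single gradient step as the "tuning", and relies on toy vectors. All function names and the placeholder professions are hypothetical.

```python
def sq_dist(head, relation, tail):
    """Squared distance between head + relation and tail (smaller means more plausible)."""
    return sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))

def nudge_towards(person, relation, target, step=0.1):
    """One gradient-descent step on the person vector that reduces
    the squared distance between person + relation and target."""
    return [p - step * 2.0 * (p + r - t) for p, r, t in zip(person, relation, target)]

def profession_shift(person, has_gender, gender, has_profession, professions, step=0.1):
    """For each profession, how much closer (positive) or farther (negative)
    it becomes after the person is nudged towards the given gender."""
    tuned = nudge_towards(person, has_gender, gender, step)
    return {
        name: sq_dist(person, has_profession, vec) - sq_dist(tuned, has_profession, vec)
        for name, vec in professions.items()
    }

# Toy vectors, hand-picked for illustration only.
person = [0.0, 0.0]
has_gender = [1.0, 0.0]
female = [2.0, 1.0]
has_profession = [0.0, 1.0]
professions = {"profession_a": [1.0, 2.0], "profession_b": [-1.0, 0.0]}

shifts = profession_shift(person, has_gender, female, has_profession, professions)
print(shifts)
```

In this toy setup, profession_a moves closer as the person embedding is tuned towards “female”, while profession_b moves away. An embedding that encodes a gender bias would show this pattern.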
Running this calculation across all humans and professions, we are able to identify the professions that the embeddings encode as the “most male” and the “most female”. The table at right shows the top 20 “most female” professions according to our measure. (The number of entities in Wikidata with non-binary genders is comparatively small; although this represents another bias in the data, it also means that the corresponding embeddings would be too noisy to yield meaningful results in our study.)
The differences in the counts of entities in the knowledge graph with these professions appear to translate to biases in the embeddings. There are some professions, such as “homekeeper”, that we would prefer were not associated with a particular gender; others, such as “woman of letters”, may be less controversial.
We also calculate the top 20 “most male” professions, and the conclusions are similar.
Can we adjust the training of knowledge graph embeddings to reduce encoded biases?
In “Debiasing knowledge graph embeddings”, we turn our attention to reducing such biases and their potentially harmful consequences for downstream applications, such as chatbots. To do this, we train the embedding model not only on how faithfully it reconstructs triples but also on how well it approximates even distributions for gender and other sensitive characteristics, such as religion.
Put another way, we update the embedding of a person entity (call it person1) so that the model can no longer predict that person's gender from it. If this is done precisely, it should also break correlations between gender and profession.
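One way to picture such an objective is sketched below. It assumes the "evenness" term is a KL divergence between the model's predicted distribution over a sensitive attribute and the uniform distribution, weighted against the usual triple loss. The exact penalty is not given here, so both the KL term and the `weight` parameter are assumptions.

```python
import math

def softmax(scores):
    """Turn raw attribute scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_from_uniform(probs):
    """KL(probs || uniform): zero only when every attribute value is equally likely."""
    n = len(probs)
    return sum(p * math.log(p * n) for p in probs if p > 0.0)

def combined_loss(triple_loss, attribute_scores, weight=1.0):
    """Triple reconstruction loss plus a penalty for predictable sensitive attributes."""
    return triple_loss + weight * kl_from_uniform(softmax(attribute_scores))

# Toy scores for two gender values (illustrative only).
balanced = combined_loss(0.5, [0.0, 0.0])   # gender cannot be predicted
skewed = combined_loss(0.5, [2.0, 0.0])     # gender is easy to predict
print(balanced, skewed)
```

When the attribute scores are equal, the penalty vanishes and only the triple loss remains. The more confidently gender can be read off the embedding, the larger the penalty.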
A potential drawback is that this approach prevents the model from using gender, religion, nationality, or ethnicity to predict noncontroversial triples. For instance, we may like the embeddings to reflect that a nun is more likely to be female than male.
To allow this, we introduce attribute embeddings. In cases where we wish to make use of sensitive information, we can simply add these attribute vectors back in to the embeddings.
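A minimal sketch of adding attribute vectors back, assuming plain vector addition and toy values. The function name `add_attributes` and the dictionary layout are hypothetical.

```python
def add_attributes(neutral_embedding, attribute_vectors):
    """Return a copy of the debiased embedding with the chosen attribute vectors added."""
    result = list(neutral_embedding)
    for vec in attribute_vectors:
        result = [a + b for a, b in zip(result, vec)]
    return result

# Toy vectors, hand-picked for illustration only.
person1 = [0.5, -0.25]                      # debiased entity embedding
attributes = {"gender": [0.25, 0.5], "religion": [-0.5, 0.0]}

# Use gender information for one query, but leave religion out.
person1_with_gender = add_attributes(person1, [attributes["gender"]])
print(person1_with_gender)
```

Because the function copies the debiased embedding rather than changing it, sensitive information enters only the queries that ask for it.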
We evaluate our model against a basic TransE model with no debiasing and against the debiasing approach of Bose et al., which uses previously proposed neural-network filters. We measure the usefulness of the embeddings for link prediction (by mean reciprocal rank, or MRR), their bias, and their training time.
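For reference, MRR averages the reciprocal of the rank at which the correct entity appears for each test query. A short sketch with made-up ranks:

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)

# Made-up ranks of the correct entity for three test triples.
ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(ranks)
print(mrr)
```

An MRR of 1.0 would mean the correct entity is always ranked first; lower values mean it tends to appear further down the list.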
During training, embeddings are scored on their accuracy, the degree to which they reproduce the corresponding triples in the knowledge graph. We measure bias as the difference between those scores for entities that fall into different categories of a sensitive attribute (religion, gender, and so on). We find that our model incurs a slight (roughly 3%) drop in link prediction accuracy in exchange for a dramatic reduction in bias.
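The bias measure can be illustrated as the gap between the average scores of two groups. This sketch assumes a simple difference of group means over made-up scores; the exact aggregation is not spelled out here, and `score_gap` is a hypothetical name.

```python
def mean(values):
    return sum(values) / len(values)

def score_gap(scores_by_group):
    """Difference between the highest and lowest mean score across groups."""
    group_means = [mean(scores) for scores in scores_by_group.values()]
    return max(group_means) - min(group_means)

# Made-up triple scores for entities in two categories of one attribute.
scores_by_group = {"group_a": [-1.0, -2.0], "group_b": [-3.0, -4.0]}
gap = score_gap(scores_by_group)
print(gap)
```

A gap of zero would mean the model reproduces triples equally well for both groups, and larger gaps indicate stronger bias under this measure.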
| Model | MRR | Gender bias | Seconds per epoch |
| Bose et al. | 0.426 | 2.75 | 533.3 |