Using hyperboloids to embed knowledge graphs
Novel embedding scheme enables a 7% to 33% improvement over its best-performing predecessors in handling graph queries.
Knowledge graphs are an extremely efficient way to represent information, but the standard techniques for analyzing them, which involve tracing connections through a graph one hop at a time, don’t scale well.
Recently, knowledge graph embeddings, which represent elements of the graph as points in a multidimensional space, have provided a way to analyze knowledge graphs more efficiently. But they give up much of the graphs’ informational richness.
Hyperboloids are bounded curved surfaces — curved analogues of rectangles — and characteristics of hyperbolic space enable them to capture hierarchical information that conventional knowledge graph embeddings lose.
Our paper describes a neural-network architecture for learning hyperboloid embeddings that enables us to logically compose knowledge graph queries. We could, for instance, search a product graph for all footwear from both brand A and brand B, a query that can be represented logically as the union of two intersections: brand A's embedding with the footwear embedding, and brand B's embedding with the footwear embedding.
In experiments, we compared our approach to four other graph embedding schemes, using five different datasets. On each dataset, we posed nine different query types, for a total of 45 queries. Our approach outperformed its predecessors on 44 of those queries. Sometimes the improvements were quite dramatic: on individual queries, improvements of 20% to 50% over the best-performing predecessor were common.
A knowledge graph consists of nodes, which represent entities, connected by edges, which represent relationships between entities. A typical graph embedding would represent nodes and edges as vectors in the embedding space. With a good embedding, summing the vectors for a node and one of its edges should approximate the vector for the node at the other end of that edge.
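This translational property can be sketched with a toy Euclidean example in the style of translation-based embeddings such as TransE; the entities, relation, and vectors below are illustrative, not taken from the paper:

```python
import numpy as np

# Toy translational embedding: head + relation ≈ tail.
# These vectors are hand-picked for illustration; a real model learns them.
embeddings = {
    "shoe":     np.array([0.9, 0.1]),
    "footwear": np.array([1.0, 1.0]),
}
relations = {
    "is_a": np.array([0.1, 0.9]),  # "shoe" --is_a--> "footwear"
}

def score(head, relation, tail):
    """Lower is better: distance between (head + relation) and tail."""
    return np.linalg.norm(embeddings[head] + relations[relation] - embeddings[tail])

print(score("shoe", "is_a", "footwear"))  # → 0.0, so the triple fits perfectly
```

Analyzing the graph then reduces to vector arithmetic and nearest-neighbor lookups, which is what makes embedding methods so much faster than hop-by-hop traversal.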
Our embedding scheme, which we call HypE, instead embeds nodes and edges as hyperboloids on a Poincaré hypersphere. More specifically, each hyperboloid is defined by the intersections of two pairs of parallel arc-aligned horocycles.
An arc-aligned horocycle is a partial circle that is parallel to the diameter of a Poincaré hypersphere and intersects the hypersphere boundary orthogonally. (On a Poincaré hypersphere, the notion of parallelism is that of a curved space; parallel lines may exhibit different degrees of curvature; what defines them is that they never intersect.)
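For intuition about why this geometry suits hierarchical data, the snippet below computes the standard geodesic distance in the Poincaré ball model (a well-known formula, not code from the paper). Distances grow explosively near the boundary, which is what lets hyperbolic space pack tree-like hierarchies into few dimensions:

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincaré ball."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_dist / denom)

origin = np.array([0.0, 0.0])
mid    = np.array([0.5, 0.0])   # halfway to the boundary
edge   = np.array([0.99, 0.0])  # almost at the boundary

print(poincare_distance(origin, mid))   # ≈ 1.10
print(poincare_distance(origin, edge))  # ≈ 5.29, though the Euclidean gap is < 1
```

A root concept placed near the center stays close to everything, while leaf concepts pushed toward the boundary spread far apart from one another, mirroring the branching of a taxonomy.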
The hyperboloid determined by the intersection of horocycles can be described by the location of its center and the limit of its boundary (see figure above).
Because hyperboloid embeddings are extended in space, HypE can learn embeddings whose spatial overlap represents the logical intersection of categories encoded in the graph. To extend the example above, the embedding for brand A footwear would be the intersection of the embedding for brand A and the embedding for footwear.
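The region-overlap idea can be sketched with Euclidean axis-aligned boxes standing in for hyperboloids; HypE's actual regions live in hyperbolic space, so this is only an analogy, and all names are illustrative:

```python
import numpy as np

# Region embeddings as axis-aligned boxes (center, offset), a Euclidean
# stand-in for HypE's hyperboloids (center, boundary limit).
def make_box(center, offset):
    center, offset = np.asarray(center, float), np.asarray(offset, float)
    return center - offset, center + offset  # (lower corner, upper corner)

def intersect(box_a, box_b):
    """Logical AND of two categories = spatial overlap of their regions."""
    lo = np.maximum(box_a[0], box_b[0])
    hi = np.minimum(box_a[1], box_b[1])
    return (lo, hi) if np.all(lo <= hi) else None  # None: empty intersection

brand_a  = make_box(center=[0.0, 0.0], offset=[2.0, 2.0])
footwear = make_box(center=[1.0, 1.0], offset=[2.0, 2.0])

# "Brand A footwear" is simply the overlap of the two regions.
print(intersect(brand_a, footwear))
```

Because every category is an extended region rather than a point, set operations on categories become geometric operations on their embeddings, with no need to walk the graph.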
We train a neural network that takes as inputs an entity, one of its relations, an arbitrary number of additional entities, and a control signal that indicates one of three operations: translation, union, and intersection. Union and intersection are the standard logical operations; translation simply means the traversal of some number of hops (in our experiments, one to three) through the graph.
The control signal sets a series of switches within the network that determine which inputs contribute to the output in what capacity. The network thus explicitly learns embeddings that encode information about particular types of logical relationships.
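The routing idea can be sketched as follows; the operator implementations here are Euclidean placeholders (the real network learns operators on hyperboloids in hyperbolic space), and all names are illustrative:

```python
import numpy as np

# Sketch of operator switching: one network, three operations, selected
# by a control signal that routes the inputs through different paths.
def translate(entity, relation, _others):
    return entity + relation                       # one hop through the graph

def intersection(entity, _relation, others):
    return np.minimum.reduce([entity] + others)    # placeholder logical AND

def union(entity, _relation, others):
    return np.maximum.reduce([entity] + others)    # placeholder logical OR

OPERATORS = {"translate": translate, "intersect": intersection, "union": union}

def query_network(entity, relation, others, control):
    """The control signal picks which operator path the inputs flow through."""
    return OPERATORS[control](entity, relation, others)

e = np.array([0.2, 0.8])
r = np.array([0.5, -0.1])
print(query_network(e, r, [], "translate"))                        # [0.7, 0.7]
print(query_network(e, None, [np.array([0.6, 0.3])], "intersect"))  # [0.2, 0.3]
```

Training one shared network across all three operations, rather than one model per query type, is what lets the learned embeddings compose: the output of a translation can feed straight into an intersection or a union.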
Our experiments compared the performance of embedding schemes on nine different queries: one-hop, two-hop, and three-hop translations; two- and three-entity intersections; two-entity unions; and three different combinations of translation and the logical operators.
Across five different datasets, HypE was the top performer on all but one test, a combination of union and translation; on that test, it finished second. On the five datasets, its average improvements over the second-best embedding schemes were 7% to 33%.