Learning effective representations of data is an important task in machine learning. Existing methods typically compute representations, or embeddings, in Euclidean space, which has shortcomings in representing the hierarchical structure of the underlying data. Hyperbolic geometry, by contrast, offers a representation scheme well suited to robust, high-fidelity representations of tree-structured data. In this paper, we explore hyperbolic graph convolutional models for learning hyperbolic representations of source code, which exhibits natural hierarchies. We leverage the abstract syntax tree (AST) of source code and learn a graph-based representation of it to predict a function's name from its body. We compare the Lorentz and Poincaré disk models of hyperbolic geometry with Euclidean geometry. We also propose several readout schemes for computing graph-level representations and apply them to the method name prediction task. Using a Lorentz hyperbolic model, we establish a new state-of-the-art result on the ogbg-code2 benchmark for this task.
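
To make the two hyperbolic models named above concrete, the following minimal sketch computes geodesic distances in the Poincaré disk and in the Lorentz (hyperboloid) model and maps points from one to the other. It uses the standard textbook formulas for curvature -1 and is not taken from the paper; the function names (`poincare_distance`, `lorentz_distance`, `poincare_to_lorentz`) and the fixed curvature are illustrative assumptions.

```python
import math

# Assumption: curvature fixed at -1; all names here are illustrative,
# not the paper's implementation.

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit disk/ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(max(1.0, arg))  # clamp guards against rounding below 1

def poincare_to_lorentz(u):
    """Map a Poincaré-ball point to the hyperboloid <x, x>_L = -1, x0 > 0."""
    sq_u = sum(a * a for a in u)
    denom = 1.0 - sq_u
    return [(1.0 + sq_u) / denom] + [2.0 * a / denom for a in u]

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lorentz_distance(x, y):
    """Geodesic distance between two points on the hyperboloid."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

if __name__ == "__main__":
    u, v = (0.1, 0.2), (-0.3, 0.4)
    d_p = poincare_distance(u, v)
    d_l = lorentz_distance(poincare_to_lorentz(u), poincare_to_lorentz(v))
    print(f"Poincaré: {d_p:.6f}  Lorentz: {d_l:.6f}")  # the two values agree
```

The two models are isometric, so the distances coincide. Distances also grow rapidly toward the boundary of the disk, which is the property that lets tree-like data embed with low distortion.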