Hyperbolic Embedding for Finding Syntax in BERT

Temirlan Auyespek, Thomas Mach, Zhenisbek Assylbekov

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recent advances in natural language processing have improved our understanding of what kind of linguistic knowledge is encoded in modern word representations. For example, Hewitt and Manning (2019) developed methods for testing whether syntax trees can be extracted from a language model architecture: they project word vectors into a Euclidean subspace in such a way that the squared Euclidean distance between projected vectors approximates the tree distance between the corresponding words in the syntax tree. This work proposes a method for assessing whether embedding word representations in hyperbolic space can better reflect the graph structure of syntax trees. We show that the tree distance between words in a syntax tree can be approximated well by the hyperbolic distance between the corresponding word vectors.
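
The two distance measures contrasted in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the projection matrix `B`, and the use of the exponential map at the origin to place vectors inside the Poincaré ball are assumptions made for the example, since the abstract does not state how vectors are mapped into hyperbolic space.

```python
import numpy as np

def squared_euclidean_probe_distance(B, h_i, h_j):
    """Squared Euclidean distance between two word vectors after a linear
    projection B, in the style of the Hewitt and Manning (2019) probe."""
    diff = B @ (h_i - h_j)
    return float(diff @ diff)

def to_poincare_ball(x):
    """Map a Euclidean vector into the open unit ball with the exponential
    map at the origin (curvature -1). Assumed here for illustration only."""
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return x
    return np.tanh(norm) * x / norm

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points of the Poincare ball model:
    arccosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))."""
    sq_u = float(u @ u)
    sq_v = float(v @ v)
    sq_diff = float((u - v) @ (u - v))
    # Guard against division by zero for points numerically on the boundary.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h_i, h_j = rng.normal(size=8), rng.normal(size=8)  # stand-ins for word vectors
    B = 0.1 * rng.normal(size=(4, 8))                  # hypothetical probe matrix
    print(squared_euclidean_probe_distance(B, h_i, h_j))
    print(poincare_distance(to_poincare_ball(B @ h_i), to_poincare_ball(B @ h_j)))
```

In a probe of this kind, `B` would be trained so that the chosen distance between projected word vectors matches the number of edges between the words in the syntax tree. Hyperbolic distance grows rapidly near the boundary of the ball, which is the usual motivation for expecting it to fit tree-like structure well.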

Original language: English
Pages (from-to): 58-64
Number of pages: 7
Journal: CEUR Workshop Proceedings
Volume: 3078
Publication status: Published - 2022
Event: 2021 International Conference of the Italian Association for Artificial Intelligence, AIxIA 2021 DP - Virtual, Online
Duration: Dec 1 2021 to Dec 3 2021

Keywords

  • BERT
  • Poincaré ball
  • Structural probe

ASJC Scopus subject areas

  • General Computer Science
