TY - GEN
T1 - Context vectors are reflections of word vectors in half the dimensions
AU - Assylbekov, Zhenisbek
AU - Takhanov, Rustem
N1 - Funding Information:
This work is supported by the Nazarbayev University Collaborative Research Program 091019CRP2109.
Publisher Copyright:
© 2020 Inst. Sci. inf., Univ. Defence in Belgrade. All rights reserved.
PY - 2020
Y1 - 2020
N2 - This paper takes a step towards the theoretical analysis of the relationship between word embeddings and context embeddings in models such as word2vec. We start from basic probabilistic assumptions on the nature of word vectors, context vectors, and text generation. These assumptions are supported either empirically or theoretically by the existing literature. Next, we show that under these assumptions the widely used word-word PMI matrix is approximately a random symmetric Gaussian ensemble. This, in turn, implies that context vectors are reflections of word vectors in approximately half the dimensions. As a direct application of our result, we suggest a theoretically grounded way of tying weights in the SGNS model.
UR - http://www.scopus.com/inward/record.url?scp=85097356985&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097356985&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85097356985
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 5115
EP - 5119
BT - Proceedings of the 29th International Joint Conference on Artificial Intelligence, IJCAI 2020
A2 - Bessiere, Christian
PB - International Joint Conferences on Artificial Intelligence
T2 - 29th International Joint Conference on Artificial Intelligence, IJCAI 2020
Y2 - 1 January 2021
ER -