Reusing Weights in Subword-aware Neural Language Models

Research output: Contribution to journal › Article


Abstract

We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models. The proposed techniques do not benefit a competitive character-aware model, but some of them improve the performance of syllable- and morpheme-aware models while showing significant reductions in model sizes. We discover a simple hands-on principle: in a multi-layer input embedding model, layers should be tied consecutively bottom-up if reused at output. Our best morpheme-aware model with properly reused weights beats the competitive word-level model by a large margin across multiple languages and has 20%-87% fewer parameters.
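The following is a minimal illustrative sketch (not the authors' code) of the "tie consecutively bottom-up" principle from the abstract, written in PyTorch with a simple additive subword composition; all class and variable names here are hypothetical.

```python
import torch
import torch.nn as nn

class SubwordWordEmbedder(nn.Module):
    """Builds word embeddings from subword (e.g. morpheme) embeddings
    via a small stack of layers (hypothetical composition)."""
    def __init__(self, n_subwords, d_sub, d_word):
        super().__init__()
        self.sub_emb = nn.Embedding(n_subwords, d_sub)  # layer 1 (bottom)
        self.proj = nn.Linear(d_sub, d_word)            # layer 2
        self.mix = nn.Linear(d_word, d_word)            # layer 3 (top)

    def forward(self, subword_ids):
        # subword_ids: (batch, max_subwords_per_word)
        e = self.sub_emb(subword_ids).sum(dim=1)        # additive composition
        return torch.tanh(self.mix(torch.tanh(self.proj(e))))

# Input-side and output-side embedders over the same subword vocabulary.
input_embedder = SubwordWordEmbedder(n_subwords=10000, d_sub=64, d_word=256)
output_embedder = SubwordWordEmbedder(n_subwords=10000, d_sub=64, d_word=256)

# Reuse (tie) weights consecutively from the bottom up: tying layer 1 alone,
# or layers 1 and 2 together, follows the stated principle; tying layer 3
# while leaving layers 1 or 2 untied would not.
output_embedder.sub_emb.weight = input_embedder.sub_emb.weight  # tie layer 1
output_embedder.proj.weight = input_embedder.proj.weight        # tie layer 2
output_embedder.proj.bias = input_embedder.proj.bias
# layer 3 (mix) is left untied in this sketch
```

Tying parameters this way removes the duplicated subword embedding and projection matrices from the parameter count, which is the source of the model-size reductions the abstract reports.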
Original language: Undefined/Unknown
Journal: Proceedings of NAACL-HLT 2018
Publication status: Published - Feb 23 2018

Keywords

  • cs.CL
  • cs.NE
  • stat.ML
  • 68T50
  • I.2.7
