Reusing weights in subword-aware neural language models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We propose several ways of reusing subword embeddings and other weights in subword-aware neural language models. The proposed techniques do not benefit a competitive character-aware model, but some of them improve the performance of syllable- and morpheme-aware models while showing significant reductions in model sizes. We discover a simple hands-on principle: in a multilayer input embedding model, layers should be tied consecutively bottom-up if reused at output. Our best morpheme-aware model with properly reused weights beats the competitive word-level model by a large margin across multiple languages and has 20%–87% fewer parameters.
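The core idea of reusing input weights at the output layer can be illustrated with a minimal sketch. This is not the authors' implementation; it is a hypothetical single-matrix example (names, dimensions, and the toy hidden state are all illustrative) showing how one embedding matrix can serve both as the input lookup table and as the output projection, which is where the parameter savings come from.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 4  # toy vocabulary of subword units, embedding dimension (illustrative)

# One shared matrix E: used for the input lookup AND reused at the output
# to score the next unit, so the model stores one matrix instead of two.
E = rng.normal(size=(V, d))
E /= np.linalg.norm(E, axis=1, keepdims=True)  # unit-norm rows for the demo

def input_embed(idx):
    """Input side: look up the embedding of a subword unit."""
    return E[idx]

def output_logits(h):
    """Output side: reuse the same matrix to score all units (logits = h @ E.T)."""
    return h @ E.T

# Pretend the hidden state equals the input embedding (a degenerate model);
# with tied, unit-norm rows, the same unit then scores highest at the output.
h = input_embed(3)
logits = output_logits(h)
print(logits.shape)        # (V,) — one score per subword unit
print(int(np.argmax(logits)))
```

In the paper's multilayer setting, the finding is that when several input-embedding layers are reused at output, they should be tied consecutively from the bottom up rather than in an arbitrary order; the sketch above only shows the one-matrix base case of such tying.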

Original language: English
Title of host publication: Long Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 1413-1423
Number of pages: 11
ISBN (Electronic): 9781948087278
Publication status: Published - Jan 1 2018
Event: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - New Orleans, United States
Duration: Jun 1 2018 - Jun 6 2018

Publication series

Name: NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume: 1

Conference

Conference: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
Country: United States
City: New Orleans
Period: 6/1/18 - 6/6/18

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Computer Science Applications

