Fun fact: @woj_zaremba wanted to completely remove that paper from existence a year or so after. I forgot why though -- Woj?

@OfirPress: I started doing language modeling research in Jan 2016, using the LSTM from the "RNN Regularization" paper by @woj_zaremba @OriolVinyalsML and @ilyasut, trying to improve its perplexity on Penn Treebank (1M toks). This led to the weight tying method, sometimes still used today.