
Entropy, Vol. 20, Pages 839: Cross Entropy of Neural Language Models at Infinity--A New Bound of the Entropy Rate (Entropy)

 
 

6 November 2018 19:00:09

 
 




Neural language models have drawn a lot of attention for their strong ability to predict natural language text. In this paper, we estimate the entropy rate of natural language with state-of-the-art neural language models. To obtain the estimate, we consider the cross entropy, a measure of the prediction accuracy of neural language models, under the theoretically ideal conditions that they are trained with an infinitely large dataset and receive an infinitely long context for prediction. We empirically verify that the effects of the two parameters, the training data size and context length, on the cross entropy consistently obey a power-law decay with a positive constant for two different state-of-the-art neural language models with different language datasets. Based on the verification, we obtained 1.12 bits per character for English by extrapolating the two parameters to infinity. This result suggests that the upper bound of the entropy rate of natural language is potentially smaller than the previously reported values.
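
As a rough illustration of the extrapolation idea described in the abstract, the sketch below fits a curve of the form f(x) = a * x^(-b) + c to synthetic data and reads off c as the value the curve approaches as x goes to infinity. The functional form is an assumption about what "power-law decay with a positive constant" means here. The data are made up, the function names are hypothetical, and the fitting procedure (a coarse grid over b followed by a golden-section refinement) is a stand-in, not the authors' method.

```python
import math


def fit_linear(u, y):
    """Ordinary least squares for y ~ a*u + c. Returns (a, c, sse)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    suu = sum((ui - mean_u) ** 2 for ui in u)
    suy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    a = suy / suu
    c = mean_y - a * mean_u
    sse = sum((a * ui + c - yi) ** 2 for ui, yi in zip(u, y))
    return a, c, sse


def fit_power_law_with_constant(x, y, b_step=0.05, b_max=3.0, iters=60):
    """Fit y ~ a * x**(-b) + c (assumed form; hypothetical helper).

    For a fixed exponent b the model is linear in (a, c), so only b
    needs a one-dimensional search.
    """
    def sse_for(b):
        return fit_linear([xi ** (-b) for xi in x], y)[2]

    # Coarse grid over b to locate a bracket around the best exponent.
    grid = [b_step * k for k in range(1, int(round(b_max / b_step)) + 1)]
    b0 = min(grid, key=sse_for)
    lo, hi = max(b0 - b_step, 1e-6), b0 + b_step

    # Golden-section search to refine b inside the bracket.
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        m1 = hi - inv_phi * (hi - lo)
        m2 = lo + inv_phi * (hi - lo)
        if sse_for(m1) < sse_for(m2):
            hi = m2
        else:
            lo = m1
    b = (lo + hi) / 2
    a, c, _ = fit_linear([xi ** (-b) for xi in x], y)
    return a, b, c


# Synthetic, noise-free example: x plays the role of training data size
# (or context length) and y the role of cross entropy. Values are invented.
xs = [2.0 ** k for k in range(4, 20)]
ys = [2.0 * x ** -0.5 + 1.0 for x in xs]

a, b, c = fit_power_law_with_constant(xs, ys)
print(f"a={a:.4f}, b={b:.4f}, extrapolated limit c={c:.4f}")
```

In this sketch the fitted constant c is the quantity of interest: it is the value the curve would reach with unlimited x. Real measurements are noisy, so a fit on actual data would need error estimates as well.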


 
Category: Informatics, Physics
 
 
 

