
Personalized recurrent neural network language model based on transfer learning




Xu Ping1, Wu Chao2, Hu Fengjun1, Wu Fan1, Lin Jianwei1, Liu Jingjing1
1. College of Information and Science Technology, Zhejiang Shuren University, Hangzhou 310015, China; 2. College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
Keywords: language model; personalization; recurrent neural network; transfer learning; small dataset; pre-trained word vector
Developing personalized language models on small data sets is difficult. A personalized recurrent neural network language model based on transfer learning is proposed. A novel transfer learning training pattern is designed that combines pre-trained word vectors, pre-training on external data, parameter fine-tuning, and a feature extraction classifier. With it, a personalized language model with a high degree of recognition is built on small data sets, reducing perplexity and improving model performance. The experiments are conducted on characters of the TV series Seinfeld. The results show that the perplexity of a character's model on that character's own test set is 17.65% lower than on the test sets of other characters, which shows that the model has learned the character's personal style. Transfer learning reduces the minimum perplexity of the model by 36.38% on average, which shows that the proposed model overcomes the difficulty described above.
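The two percentages in the abstract are relative reductions in perplexity. The following is a minimal sketch of how such figures could be computed, assuming the standard definition of perplexity as the exponential of the mean negative log-probability per token. The function names and the per-token probabilities are hypothetical illustrations and are not taken from the paper.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def relative_reduction(baseline, improved):
    """Fractional drop from baseline to improved (0.2 means 20% lower)."""
    return (baseline - improved) / baseline


# Hypothetical probabilities a character's model assigns to each token
# of two test sets (illustrative values only, not the paper's data).
own_character_probs = [0.25, 0.25, 0.25, 0.25]
other_character_probs = [0.2, 0.2, 0.2, 0.2]

ppl_own = perplexity(own_character_probs)
ppl_other = perplexity(other_character_probs)

# How much lower the perplexity is on the character's own test set.
drop = relative_reduction(ppl_other, ppl_own)
print(f"own: {ppl_own:.2f}, other: {ppl_other:.2f}, reduction: {drop:.2%}")
```

A lower perplexity on a character's own lines than on other characters' lines means the model assigns higher probability to that character's wording, which is the sense in which the abstract speaks of learning a personal style.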




Last Update: 2018-08-30