Xu Ping, Wu Chao, Hu Fengjun, et al. Personalized recurrent neural network language model based on transfer learning[J]. Journal of Nanjing University of Science and Technology, 2018, 42(04): 401. [doi:10.14177/j.cnki.32-1397n.2018.42.04.003]

Personalized recurrent neural network language model based on transfer learning

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN:1005-9830/CN:32-1397/N]

Volume:
42
Issue:
2018(04)
Pages:
401
Publication date:
2018-08-30

Article Info

Title:
Personalized recurrent neural network language model based on transfer learning
Article number:
1005-9830(2018)04-0401-08
Author(s):
Xu Ping¹, Wu Chao², Hu Fengjun¹, Wu Fan¹, Lin Jianwei¹, Liu Jingjing¹
1. College of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China; 2. College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
Keywords:
language model; personalization; recurrent neural network; transfer learning; small dataset; pre-trained word vector
CLC number:
TP391
DOI:
10.14177/j.cnki.32-1397n.2018.42.04.003
Abstract:
Obstacles remain in developing personalized language models on small datasets. This paper proposes a personalized recurrent neural network language model based on transfer learning. By designing a transfer-learning training scheme that combines pre-trained word vectors, a pre-trained external movie-script dataset, parameter fine-tuning, and a feature-extraction classifier, a personalized language model with a high degree of distinctiveness is built on small datasets, reducing perplexity and improving model performance. Experiments are conducted on characters from the TV series Seinfeld. The results show that the model's perplexity on the target character's test set is on average 17.65% lower than on other characters' datasets, indicating that the model has learned that character's personalized style. Transfer learning reduces the model's minimum perplexity by 36.38% on average, effectively addressing the obstacles to developing personalized language models on small datasets.
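
The abstract describes a two-stage training pattern: pretrain an RNN language model on a large external corpus with pre-trained word vectors, then fine-tune it on a small character-specific dataset and compare perplexities across characters. The paper's implementation details are not reproduced on this page, so the following is a minimal PyTorch sketch of that pattern under assumed settings; the CharacterLM class, the toy_batches helper, and all hyperparameters are hypothetical placeholders rather than the authors' code, and the paper's feature-extraction classifier is omitted.

```python
# Hypothetical sketch of the two-stage transfer-learning pattern described
# in the abstract; toy random data stands in for the real corpora.
import math
import torch
import torch.nn as nn

vocab_size = 1000
glove_matrix = torch.randn(vocab_size, 300)  # stand-in for real GloVe vectors

class CharacterLM(nn.Module):
    """LSTM language model with optionally frozen pre-trained embeddings."""
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512, pretrained=None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        if pretrained is not None:
            # Initialize from pre-trained word vectors and freeze them
            # for the pretraining stage.
            self.embedding.weight.data.copy_(pretrained)
            self.embedding.weight.requires_grad = False
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq_len) ids
        out, _ = self.lstm(self.embedding(tokens))
        return self.decoder(out)               # (batch, seq_len, vocab_size)

def toy_batches(n=4):
    """Random (input, target) id pairs standing in for a tokenized corpus."""
    return [(torch.randint(0, vocab_size, (8, 20)),
             torch.randint(0, vocab_size, (8, 20))) for _ in range(n)]

def run_epoch(model, batches, optimizer=None):
    """One pass over the batches; returns perplexity = exp(mean NLL)."""
    criterion = nn.CrossEntropyLoss()
    total, n = 0.0, 0
    for inputs, targets in batches:
        logits = model(inputs)
        loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        if optimizer is not None:
            optimizer.zero_grad()
            loss.backward()
            # Gradient clipping is standard practice for RNN training.
            torch.nn.utils.clip_grad_norm_(model.parameters(), 5.0)
            optimizer.step()
        total, n = total + loss.item(), n + 1
    return math.exp(total / n)

# Stage 1: pretrain on the large external (movie-script) corpus.
model = CharacterLM(vocab_size, pretrained=glove_matrix)
opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)
for _ in range(2):
    run_epoch(model, toy_batches(), opt)

# Stage 2: unfreeze all parameters and fine-tune on the small
# character-specific dataset with a lower learning rate.
for p in model.parameters():
    p.requires_grad = True
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for _ in range(2):
    run_epoch(model, toy_batches(), opt)

# Evaluation: a lower perplexity on the target character's test set than on
# other characters' sets would indicate a learned personal style.
with torch.no_grad():
    print("test perplexity:", run_epoch(model, toy_batches()))
```

Freezing the pre-trained embeddings during pretraining and unfreezing them at a lower learning rate for fine-tuning is one common way to realize the parameter fine-tuning the abstract mentions; the authors' exact choice of which layers to fine-tune may differ.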



Memo:

Received: 2018-04-26; Revised: 2018-07-20
Funding: Natural Science Foundation of Zhejiang Province (LY14F020008); Research Program of the Education Department of Zhejiang Province (Y201329701); National Natural Science Foundation of China (51675490); Zhejiang Provincial Public Welfare Technology Application Research Program (2016C31116, 2017C31050); Young and Middle-aged Academic Team Project of Zhejiang Shuren University
Biography: Xu Ping (1977-), female, PhD, lecturer; research interests include natural language processing and data analysis; E-mail: xpcs2007@sina.com.
Citation format: Xu Ping, Wu Chao, Hu Fengjun, et al. Personalized recurrent neural network language model based on transfer learning[J]. Journal of Nanjing University of Science and Technology, 2018, 42(4): 401-408. Submission website: http://zrxuebao.njust.edu.cn
Last Update: 2018-08-30