Tian Yuan, Yuan Ye, Liu Haibin, et al. BERT pre-trained language model for defective text classification of power grid equipment[J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2020, 44(04): 446-453. [doi:10.14177/j.cnki.32-1397n.2020.44.04.009]

BERT Pre-trained Language Model for Defective Text Classification of Power Grid Equipment

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN: 1005-9830 / CN: 32-1397/N]

Volume:
Vol. 44
Issue:
No. 4, 2020
Pages:
446-453
Publication Date:
2020-08-30

Article Info

Title:
BERT pre-trained language model for defective text classification of power grid equipment
Article Number:
1005-9830(2020)04-0446-08
Author(s):
Tian Yuan 1, Yuan Ye 1, Liu Haibin 1, Man Zhibo 2, Mao Cunli 2
1. Information Center, Yunnan Power Grid Co., Ltd., Kunming 650000, China; 2. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Keywords:
power grid equipment; pre-trained language model; bi-directional long short-term memory network; bidirectional encoder representation from transformers; attention mechanism; defect location; text classification
CLC Number:
TP391.1
DOI:
10.14177/j.cnki.32-1397n.2020.44.04.009
Abstract:
The identification of defect locations in power grid equipment is a key step in equipment failure analysis. This paper proposes a defect text classification method for power grid equipment based on the bidirectional encoder representation from transformers (BERT) pre-trained language model. BERT is used to pre-train on the defect texts of power grid equipment and to generate word embedding vectors with contextual features as the model input. A bi-directional long short-term memory (BiLSTM) network then encodes the defect text vectors in both directions to extract a semantic representation of the defect text, and an attention mechanism strengthens the feature weights of domain words related to defect locations, yielding semantic feature vectors that support the classification of defect locations in power grid equipment. Finally, the softmax layer of the model classifies the defect locations. The experimental results show that the proposed method improves the F1 value of the BiLSTM-Attention baseline by 2.77% and 2.95% on the defect data sets of the main transformer and the SF6 vacuum circuit breaker, respectively.
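
To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of a BERT + BiLSTM + attention classifier, assuming the HuggingFace transformers library and the bert-base-chinese checkpoint; all layer sizes, names, and the example sentence are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertBiLSTMAttention(nn.Module):
    def __init__(self, num_classes, lstm_hidden=128):
        super().__init__()
        # BERT supplies contextual word embeddings for the defect text.
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # A BiLSTM encodes the embedding sequence in both directions.
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        # Attention scores each time step, so defect-related terms can
        # receive larger weights in the pooled representation.
        self.attn = nn.Linear(2 * lstm_hidden, 1)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        emb = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state
        h, _ = self.lstm(emb)                                   # (B, T, 2H)
        scores = self.attn(h).squeeze(-1)                       # (B, T)
        scores = scores.masked_fill(attention_mask == 0, -1e9)  # ignore padding
        weights = torch.softmax(scores, dim=-1)                 # attention weights
        context = (weights.unsqueeze(-1) * h).sum(dim=1)        # weighted sum
        return self.classifier(context)  # logits; softmax is applied in the loss

# Hypothetical usage on one defect sentence ("main transformer bushing oil leak").
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
batch = tokenizer(["主变压器套管渗油"], return_tensors="pt", padding=True)
model = BertBiLSTMAttention(num_classes=10)  # class count is an assumption
logits = model(batch["input_ids"], batch["attention_mask"])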

References:

[1] Ju Ping, Zhou Xiaoxin, Chen Weijiang, et al. Summary of "Smart Grid+" research[J]. Electric Power Automation Equipment, 2018(5): 2-11.
[2] Han Bowen. Big data association analysis model of distribution network operation and maintenance based on Apriori correlation algorithm[J]. Journal of Shanghai University of Electric Power, 2018(2): 20-26.
[3] Liu Keyan, Wu Xinzhong, Shi Chen, et al. Fault risk early warning of distribution network based on data mining[J]. Electric Power Automation Equipment, 2018, 38(5): 148-153.
[4] Hong Cui, Fu Yuze, Guo Moufa, et al. Identification method of distribution network faults based on improved multi-classification support vector machine[J]. Journal of Electronic Measurement and Instrument, 2019(1): 7-15.
[5] Zhang Bin, Zhuang Chijie, Hu Jun, et al. Ensemble clustering algorithm combined with dimension reduction techniques for power load profiles[J]. Proceedings of the CSEE, 2015, 35(15): 3741-3749.
[6] Sun Kang, Li Qianmu, Li Deqiang. Face detection algorithm based on cascaded convolutional neural network[J]. Journal of Nanjing University of Science and Technology, 2018, 42(1): 40-47.
[7] Wang Lin, Dong Nan. Human silhouette identification based on Gabor feature and convolutional neural network[J]. Journal of Nanjing University of Science and Technology, 2018, 42(1): 89-95.
[8] Xu Ping, Wu Chao, Hu Fengjun, et al. Personalized recurrent neural network language model based on transfer learning[J]. Journal of Nanjing University of Science and Technology, 2018, 42(4): 401-409.
[9] Zhu Yuanzhen, Liu Yutian. Fast search for high-risk cascading failures based on deep learning DC blocking judgment[J]. Automation of Electric Power Systems, 2019, 43(22): 59-67.
[10] Sun Yuyan, Cai Zexiang, Guo Caishan, et al. Fault diagnosis and positioning for communication network in intelligent substation based on deep learning[J]. Power System Technology, 2019, 43(12): 4306-4314.
[11] Xu S. Bayesian Naïve Bayes classifiers to text classification[J]. Journal of Information Science, 2018, 44(1): 48-59.
[12] You Ronghui, Dai Suyang, Zhang Zihan, et al. AttentionXML: extreme multi-label text classification with multi-label attention based recurrent neural networks[J]. Computing Research Repository, 2018, 18(1): 17-27.
[13] Chen J, Hu Y, Liu J, et al. Deep short text classification with knowledge powered attention[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Hawaii, USA: IOA Press, 2019, 33: 6252-6259.
[14] Devlin J, Chang M, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis, USA: IOA Press, 2019: 4171-4186.
[15] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems. California, USA: IOA Press, 2017: 5998-6008.
[16] Mikolov T, Chen K, Corrado G, et al. Efficient estimation of word representations in vector space[J]. Computer Science, 2013, 32(2): 123-129.
[17] Joulin A, Grave E, Bojanowski P, et al. Bag of tricks for efficient text classification[J]. Computing Research Repository, 2016, 7(1): 17-59.
[18] Zhang M, Ai X, Hu Y. Chinese text classification system on regulatory information based on SVM[J]. IOP Conference Series: Earth and Environmental Science, 2019, 252(2): 22-28.
[19] Anderson J. Fully convolutional networks for text classification[J]. Computing Research Repository, 2019, 15(1): 11-17.
[20] Gers F. Long short-term memory in recurrent neural networks[D]. Switzerland: Swiss Artificial Intelligence Laboratory, 2001.
[21] Pennington J, Socher R, Manning C D. GloVe: Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha, Qatar: IOA Press, 2014: 1532-1543.

Memo:
Received: 2020-04-08; Revised: 2020-06-06
Funding: Key Project of the Yunnan Provincial Natural Science Foundation (2019FA023); Yunnan Provincial Reserve Talents Project for Young and Middle-aged Academic and Technical Leaders (2019HB006)
About the authors: Tian Yuan (1989-), male, engineer; research interests: power grid digitalization, natural language processing; E-mail: 270004294@qq.com. Corresponding author: Yuan Ye (1992-), male, assistant engineer; research interests: power grid digitalization, natural language processing; E-mail: wenma_2009@163.com.
Citation format: Tian Yuan, Yuan Ye, Liu Haibin, et al. BERT pre-trained language model for defective text classification of power grid equipment[J]. Journal of Nanjing University of Science and Technology, 2020, 44(4): 446-453.
Submission website: http://zrxuebao.njust.edu.cn
Last Update: 2020-08-30