[1] Liu Guanghui, Hu Jun, Yu Dongjun. Predicting GPCR-drug interactions with multi-view feature combination and random forest[J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2016, 40(01): 1.

Predicting GPCR-drug interactions with multi-view feature combination and random forest

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN: 1005-9830 / CN: 32-1397/N]

Volume:
40
Issue:
2016, No. 01
Page:
1
Column:
Publication date:
2016-02-29

Article Info

Title:
Predicting GPCR-drug interactions with multi-view feature combination and random forest
Author(s):
Liu Guanghui (刘光徽) 1,2; Hu Jun (胡俊) 1; Yu Dongjun (於东军) 1
1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; 2. School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210046, China
Keywords:
coupled receptors; G-protein-coupled receptors; drugs; multi-view features; amino acid composition; sequence features; molecular fingerprint; random forest
Classification number:
TP391.4
Abstract:
To improve the accuracy of predicting interactions between G-protein-coupled receptors (GPCRs) and drugs, this paper proposes a new GPCR-drug interaction prediction method based on multi-view feature combination and random forest. The method first extracts GPCR sequence features from two views, amino acid composition and protein evolution, and extracts drug features from the molecular fingerprint view. The multi-view features are then serially combined to form the feature representation of each GPCR-drug pair. Finally, a prediction model is built on this representation with the random forest algorithm. Cross-validation and independent tests on benchmark datasets demonstrate the effectiveness of the proposed method.
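The serial combination step described in the abstract can be illustrated with a minimal sketch. It covers only the amino acid composition view and a drug fingerprint given as a list of bits; the protein evolution view is omitted, and the function names, the toy sequence, and the 8-bit fingerprint are hypothetical illustrations, not the authors' implementation or data.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional residue frequency vector of a protein sequence."""
    sequence = sequence.upper()
    counts = Counter(residue for residue in sequence if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def combine_views(gpcr_sequence, drug_fingerprint):
    """Serially combine (concatenate) the GPCR view and the drug view."""
    if any(bit not in (0, 1) for bit in drug_fingerprint):
        raise ValueError("fingerprint must be a sequence of 0/1 bits")
    drug_view = [float(bit) for bit in drug_fingerprint]
    return amino_acid_composition(gpcr_sequence) + drug_view


# Hypothetical toy inputs: a short peptide and an 8-bit fingerprint.
toy_sequence = "MNGTEGPNFYVPFSNKTGVV"
toy_fingerprint = [1, 0, 0, 1, 1, 0, 1, 0]
pair_vector = combine_views(toy_sequence, toy_fingerprint)
print(len(pair_vector))  # 20 composition values followed by 8 fingerprint bits
```

Vectors built this way, one per GPCR-drug pair with an interaction label, could then be passed to any random forest implementation to train a classifier.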


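Memo

Memo:
Received: 2015-08-17; revised: 2015-11-13
Funding: National Natural Science Foundation of China (61373062); Natural Science Foundation of Jiangsu Province (BK20141403); Jiangsu Province "Six Talent Peaks" Project (2013-XXRJ-022)
About the authors: Liu Guanghui (b. 1971), male, PhD candidate; research interests: bioinformatics and pattern recognition; E-mail: lgh025@163.com. Corresponding author: Yu Dongjun (b. 1975), male, PhD, professor, doctoral supervisor; research interests: bioinformatics and pattern recognition; E-mail: njyudj@njust.edu.cn.
Cite as: Liu Guanghui, Hu Jun, Yu Dongjun. Predicting GPCR-drug interactions with multi-view feature combination and random forest[J]. Journal of Nanjing University of Science and Technology, 2016, 40(1): 1-9.
Submission website: http://zrxuebao.njust.edu.cn
DOI: 10.14177/j.cnki.32-1397n.2016.40.01.001
Last Update: 2016-02-29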