Distributed constraints consistency Gaussian mixture model

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN: 1005-9830 / CN: 32-1397/N]

Issue:
2013, Issue 06
Page:
799-806
Research Field:
Publishing date:

Info

Title:
Distributed constraints consistency Gaussian mixture model
Author(s):
Yu Yuecheng 1, Liu Caisheng 2, Sheng Jiagen 2
1. College of Computer Science and Engineering;
2. College of Nanxu, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Keywords:
constraints consistency; Gaussian mixture model; distributed clustering; regularization operator
PACS:
TP391.4
DOI:
-
Abstract:
To improve the clustering quality of non-spherical, horizontally distributed data sets, a distributed constraints consistency Gaussian mixture model (DCCGMM) is proposed. In the DCCGMM, the data sets are described by a Gaussian mixture model (GMM), and constraint information is introduced into the GMM through constraints-consistent regularization operators. The estimated parameters of the DCCGMM therefore reflect both the underlying probability distribution of the sample data and the a priori knowledge supplied by users, and each parameter can be estimated by a closed-form solution. By designing the communication parameters exchanged between user sites, the DCCGMM can be used for distributed clustering. Experimental results show that, compared with distributed clustering algorithms based on K-means, the proposed algorithm has considerable flexibility in clustering non-spherical data sets. Its clustering quality is also better than that of the distributed expectation maximization (EM) algorithm without constraint information, with the global average clustering accuracy increasing by 9%-20%.
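
The abstract does not give the form of the regularization operators or the exact communication parameters, so the following is only a minimal sketch of the unconstrained baseline it mentions: distributed EM for a GMM over horizontally partitioned data. It assumes each site sends per-component sufficient statistics (responsibility sums, weighted sums, and weighted outer products) rather than raw samples, and that a coordinator merges them. The function names `local_stats`, `global_update`, and `distributed_em`, and the small `reg` ridge term, are illustrative assumptions and not part of the paper; the constraint terms of the DCCGMM are deliberately omitted.

```python
import numpy as np

def local_stats(X, weights, means, covs):
    """E-step on one site: return sufficient statistics, never raw samples."""
    K, d = means.shape
    log_r = np.empty((X.shape[0], K))
    for k in range(K):
        diff = X - means[k]
        inv = np.linalg.inv(covs[k])
        _, logdet = np.linalg.slogdet(covs[k])
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
        log_r[:, k] = np.log(weights[k]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
    # Normalize responsibilities in a numerically stable way.
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    n_k = r.sum(axis=0)                          # (K,)      soft counts
    s_k = r.T @ X                                # (K, d)    weighted sums
    ss_k = np.einsum("nk,ni,nj->kij", r, X, X)   # (K, d, d) weighted outer products
    return n_k, s_k, ss_k

def global_update(stats, reg=1e-6):
    """M-step at the coordinator: merge site statistics in closed form."""
    n_k = sum(s[0] for s in stats)
    s_k = sum(s[1] for s in stats)
    ss_k = sum(s[2] for s in stats)
    weights = n_k / n_k.sum()
    means = s_k / n_k[:, None]
    covs = ss_k / n_k[:, None, None] - np.einsum("ki,kj->kij", means, means)
    covs += reg * np.eye(means.shape[1])         # assumed ridge term to keep covariances invertible
    return weights, means, covs

def distributed_em(sites, init_means, iters=30):
    """Alternate local E-steps and a global M-step; full covariances allow non-spherical clusters."""
    K, d = init_means.shape
    weights = np.full(K, 1.0 / K)
    means = init_means.astype(float).copy()
    covs = np.stack([np.eye(d)] * K)
    for _ in range(iters):
        stats = [local_stats(X, weights, means, covs) for X in sites]
        weights, means, covs = global_update(stats)
    return weights, means, covs
```

Because the merged statistics are plain sums over sites, one round of this scheme yields the same parameter update as centralized EM on the pooled data, while each site communicates only O(K d^2) numbers per iteration. A constrained variant along the lines of the abstract would add a regularization term to the objective; its closed-form updates would have to be taken from the paper itself.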

Last Update: 2013-12-31