[1] Liu Zhiyi, Chang Rui. Fast algorithm of frequent itemset mining based on matrix from uncertain data[J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2015, 39(04): 420-425.

Fast algorithm of frequent itemset mining based on matrix from uncertain data

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN:1005-9830/CN:32-1397/N]

Volume:
39
Issue:
2015, No. 04
Pages:
420-425
Column:
Publication date:
2015-08-31

Article Info

Title:
Fast algorithm of frequent itemset mining based on matrix from uncertain data
Author(s):
Liu Zhiyi 1, Chang Rui 2
1. Department of Computer Science and Information Engineering; 2. Department of Project Finance, Changzhou Institute of Technology, Changzhou 213002, Jiangsu, China
Keywords:
uncertain data; frequent itemsets; expected support; fast mining
CLC number:
TP311
Abstract:
The CUF-growth algorithm gives an upper bound on the expected support of itemsets, but this estimate is too high. The algorithm must also recursively build conditional CUF-trees during mining to obtain candidate itemsets, which lowers mining efficiency. To address these shortcomings, the UFIM-Matrix (uncertain frequent itemset mining-matrix) algorithm is proposed. Instead of building a tree structure, it combines a matrix structure with a new method of estimating the expected support of itemsets to generate smaller candidate sets. This reduces computational cost to some extent and improves mining efficiency. Experimental results also indicate that the new algorithm performs better.
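For readers unfamiliar with the term, the following is a minimal sketch of the standard definition of expected support in an uncertain database: the sum, over all transactions, of the product of the existence probabilities of the itemset's items, assuming the items are independent. It is not the UFIM-Matrix algorithm or its upper-bound estimate, which this page does not describe. The toy database `transactions` and the function name `expected_support` are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical toy uncertain database (an assumption for illustration):
# each transaction maps an item to its existence probability.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "c": 0.7},
    {"b": 0.4, "c": 1.0},
]


def expected_support(itemset, db):
    """Expected support of `itemset` in `db`.

    For each transaction containing every item of the itemset, multiply
    the items' existence probabilities (independence is assumed), then
    sum these products over all transactions.
    """
    total = 0.0
    for transaction in db:
        if all(item in transaction for item in itemset):
            total += prod(transaction[item] for item in itemset)
    return total


if __name__ == "__main__":
    for candidate in ({"a"}, {"c"}, {"a", "c"}):
        print(sorted(candidate), expected_support(candidate, transactions))
```

An itemset is usually called frequent when this value reaches a user-chosen minimum expected support. Because every probability is at most 1, the expected support of an itemset never exceeds that of any of its subsets, which is what lets mining algorithms prune candidates using an upper bound.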


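Memo

Received: 2014-11-07; Revised: 2015-07-06
Funding: Natural Science Foundation of Jiangsu Province (BK20130245)
About the author: Liu Zhiyi (born 1977), female, lecturer; main research interests: knowledge discovery and data mining. E-mail: liuzy@czu.cn.
Citation format: Liu Zhiyi, Chang Rui. Fast algorithm of frequent itemset mining based on matrix from uncertain data[J]. Journal of Nanjing University of Science and Technology, 2015, 39(4): 420-425.
Submission website: http://zrxuebao.njust.edu.cn
Last Update: 2015-08-31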