[1] Sun Jiongning. Anonymization of big data based on hybrid tree[J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2015, 39(05): 609.
Anonymization of big data based on hybrid tree

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN: 1005-9830 / CN: 32-1397/N]

Volume:
39
Issue:
No. 05, 2015
Pages:
609
Column:
Publication date:
2015-10-31

Article Info

Title:
Anonymization of big data based on hybrid tree
Author(s):
Sun Jiongning
Department of Information Engineering, Jiangsu Maritime Institute, Nanjing 211100, China
Keywords:
big data; cloud computing; data anonymization; privacy preservation; MapReduce
CLC number:
TP301.6
Abstract:
Top-down specialization (TDS) and bottom-up generalization (BUG) are the two main approaches to sub-tree anonymization. However, existing sub-tree anonymization approaches lack parallelization capability and therefore do not scale when handling big data on the cloud. Moreover, both TDS and BUG perform poorly for certain values of the K-anonymity parameter when used individually. To address this, a hybrid approach combining TDS and BUG is proposed for efficient sub-tree anonymization over big data. MapReduce-based algorithms are designed for the two components (TDS and BUG) to gain high scalability by exploiting the computation capability of the cloud. Experimental evaluations show that the hybrid approach significantly improves the scalability and efficiency of sub-tree anonymization compared with existing approaches.
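To make the K-anonymity parameter and the role of generalization concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a single quasi-identifier attribute, a hypothetical two-level taxonomy (`TAXONOMY`), and a simplified uniform generalization in which every value is raised by the same number of levels, rather than true sub-tree generalization. The map and reduce steps are plain Python generators and counters standing in for MapReduce jobs; all names are illustrative.

```python
from collections import Counter

# Hypothetical taxonomy (an assumption for illustration): child -> parent.
TAXONOMY = {
    "Engineer": "Professional",
    "Lawyer": "Professional",
    "Dancer": "Artist",
    "Writer": "Artist",
    "Professional": "Any",
    "Artist": "Any",
}

# Toy records with one quasi-identifier value each (illustrative data only).
RECORDS = ["Engineer", "Engineer", "Lawyer", "Dancer", "Writer", "Writer"]


def generalize(value, levels):
    """Climb `levels` steps up the taxonomy, stopping at the root."""
    for _ in range(levels):
        value = TAXONOMY.get(value, value)
    return value


def map_phase(records, levels):
    """Map step: emit (generalized quasi-identifier, 1) pairs."""
    for value in records:
        yield generalize(value, levels), 1


def reduce_phase(pairs):
    """Reduce step: sum counts per key to get equivalence-class sizes."""
    sizes = Counter()
    for key, count in pairs:
        sizes[key] += count
    return sizes


def is_k_anonymous(records, levels, k):
    """True if every equivalence class contains at least k records."""
    sizes = reduce_phase(map_phase(records, levels))
    return all(size >= k for size in sizes.values())


def min_levels_for_k(records, k, max_levels=2):
    """Bottom-up search: fewest generalization steps that reach k-anonymity."""
    for levels in range(max_levels + 1):
        if is_k_anonymous(records, levels, k):
            return levels
    return None  # not achievable within max_levels
```

In this toy data, a larger K forces the search further up the taxonomy, which illustrates why the amount of work done by a bottom-up or a top-down strategy depends on K.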

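Memo

Received: 2015-04-21; Revised: 2015-06-18
Funding: Jiangsu Province Domestic Senior Visiting Scholar Program for Higher Vocational Colleges (2014FX021)
About the author: Sun Jiongning (b. 1977), female, master's student, associate professor; main research interests: software construction, big data processing; E-mail: sjn913@163.com.
Citation format: Sun Jiongning. Anonymization of big data based on hybrid tree[J]. Journal of Nanjing University of Science and Technology, 2015, 39(5): 609-613.
Submission website: http://zrxuebao.njust.edu.cn
DOI: 10.14177/j.cnki.32-1397n.2015.39.05.016
Last Update: 2015-10-31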