[1] Wang Yongli, Wang Chuan, Jiang Xiaohui, et al. RFID duplicate removing algorithm based on temporal-spatial Bloom filter[J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2015, 39(03): 253.

RFID duplicate removing algorithm based on temporal-spatial Bloom filter
Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN: 1005-9830 / CN: 32-1397/N]

Volume:
39
Issue:
2015, No. 03
Page:
253
Publication date:
2015-06-30

Article Information

Title:
RFID duplicate removing algorithm based on temporal-spatial Bloom filter
Author(s):
Wang Yongli 1, Wang Chuan 1, Jiang Xiaohui 1, Zhang Gongxuan 1, Sun Shujie 2
1. School of Computer Science and Engineering, NUST, Nanjing 210094, China; 2. Third Mine Pusilian, Seventh Oil Extraction Plant, Daqing 163517, China
Keywords:
Bloom filter; radio frequency identification; redundant data; integer array; bit array; memory space; utilization; false positive errors; false negative errors
Classification number:
TP391
Abstract:
To address the massive volume of duplicate data caused by the inherent unreliability of radio frequency identification (RFID), this paper proposes a redundant-data elimination algorithm based on a temporal-spatial Bloom filter. The algorithm processes massive data in a single pass using limited memory. It uses an integer array in place of a bit array, so memory consumption grows by a factor equal to the length of the tag ID relative to the bit-array version. Compared with the traditional Bloom filter, the algorithm still achieves good space utilization. It overcomes the traditional Bloom filter's inability to handle massive real-time data streams and eliminates false positive errors. False negative errors are minimized by setting appropriate parameters; their number depends on the specific application scenario and the filter settings. Experimental results verify the effectiveness of the algorithm.
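The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of swapping a Bloom filter's bit array for an integer array. It assumes each cell stores the timestamp of the most recent reading hashed to it, and that a reading counts as a duplicate when all of its cells were written within a time window. The names `TimeBloomFilter`, `observe`, `clean`, and the parameters `num_cells`, `num_hashes`, and `window` are hypothetical and do not come from the paper. Unlike the algorithm described above, this simplified variant can still report false positives when different tags collide on every cell.

```python
import hashlib


class TimeBloomFilter:
    """Bloom-filter variant whose cells hold integer timestamps, not bits.

    Illustrative sketch only; this is not the paper's algorithm.
    """

    def __init__(self, num_cells, num_hashes, window):
        self.num_cells = num_cells
        self.num_hashes = num_hashes
        self.window = window  # readings closer together than this are duplicates
        self.cells = [None] * num_cells  # None means "never written"

    def _positions(self, tag_id):
        # Derive the cell indexes by salting SHA-256 with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{tag_id}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_cells

    def observe(self, tag_id, timestamp):
        """Return True if the reading looks like a duplicate, then record it."""
        positions = list(self._positions(tag_id))
        duplicate = all(
            self.cells[p] is not None and timestamp - self.cells[p] < self.window
            for p in positions
        )
        # Refresh the cells so the window slides with the latest reading.
        for p in positions:
            self.cells[p] = timestamp
        return duplicate


def clean(readings, num_cells=1024, num_hashes=3, window=5):
    """Keep only the (tag_id, timestamp) readings not flagged as duplicates."""
    flt = TimeBloomFilter(num_cells, num_hashes, window)
    return [(tag, t) for tag, t in readings if not flt.observe(tag, t)]
```

For a single tag read at times 0, 1, 2 and 10 with `window=5`, `clean` keeps the readings at 0 and 10: the readings at 1 and 2 fall inside the window, while the one at 10 arrives after it has expired. Because old timestamps simply age out, the filter never needs to be reset, which is what lets this kind of structure run over an unbounded stream in fixed memory.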

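Memo:
Received: 2014-09-15; Revised: 2014-11-17
Funding: National Natural Science Foundation of China (61170035); Fundamental Research Funds for the Central Universities (30920130112006); Jiangsu Province "973" Program (BK2011022); Natural Science Foundation of Jiangsu Province (BK2011702); Jiangsu Province Special Fund for the Transformation of Scientific and Technological Achievements (BA2013047)
About the author: Wang Yongli (born 1974), male, Ph.D., professor. Main research interests: database technology, big data analysis, human-machine-things integration, pattern recognition, etc. E-mail: yongliwang@njust.edu.cn.
Citation format: Wang Yongli, Wang Chuan, Jiang Xiaohui, et al. RFID duplicate removing algorithm based on temporal-spatial Bloom filter[J]. Journal of Nanjing University of Science and Technology, 2015, 39(3): 253-259.
Submission website: http://zrxuebao.njust.edu.cn
Last Update: 2015-06-30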