Research on Data Filtering Technology Based on the RFID Middleware in the Internet of Things

Kun Huang 1, Bing-wu Liu 2, Jun-tao Li 3

1 Graduate Division, Beijing Wuzi University, Beijing, 101149, China
2 Graduate Division, Beijing Wuzi University, Beijing, 101149, China
3 School of Information, Beijing Wuzi University, Beijing, 101149, China
(1 huang__kun@126.com, 2 Liubingwu@bwu.edu.cn, 3 Lijuntao@bwu.edu.cn)

Abstract-Middleware technology is becoming a hot research topic because of its important position in applications of radio frequency identification (RFID). This paper first introduces the architecture of the Internet of Things and the important position of RFID middleware within it, and then summarizes the existing data filtering methods and the characteristics of several filters. Finally, it builds an initial combination of filters, taking smart shelves as an example.

Keywords-RFID middleware, Internet of Things, Data filtering, Smart shelves

I. INTERNET OF THINGS

The Internet of Things, abbreviated as IoT, is a network that connects articles to the Internet through radio frequency identification (RFID), infrared sensors, global positioning systems, laser scanners, and other information-sensing devices, so that the articles exchange information and communicate according to agreed protocols in order to achieve intelligent identification, location, tracking, monitoring, and management.

The Internet of Things is "the Internet that connects material objects". This has two meanings [1]. First, the core and foundation of the Internet of Things is still the Internet; it is an extension and expansion of the Internet. Second, its client side extends to any goods and articles, which exchange information and communicate. Applications of the Internet of Things are bound to produce vast amounts of sensory data, and how to collect, store, and process this mass of data in real time is a problem.
In addition, sensory devices are provided by many different hardware vendors, and China has not yet formed uniform interface and protocol standards for the perception layer. How can complex equipment be seamlessly integrated into existing systems? How can sensory devices be monitored and managed in a unified way? How can new equipment be seamlessly integrated into the Internet? These are major challenges for the development of the Internet, and meeting them is the responsibility that Internet of Things application middleware has to shoulder. In this paper, Internet of Things middleware mainly refers to RFID middleware [2].

Financially supported by: Beijing Natural Science Foundation Project (Class B) (Key Project of Beijing Municipal Education Commission Science and Technology Development Plan), Research on Intelligent Logistics System Based on Internet of Things Technology (No. KJ201210037037)

II. RFID MIDDLEWARE

RFID is the abbreviation for "radio frequency identification", a non-contact automatic identification and data collection technology [3]. The application of RFID technology has expanded rapidly since the 1990s. From 2000 to the present, the range of RFID products has been significantly enriched, costs have fallen steadily, and a variety of new applications are emerging.

An RFID system includes RFID hardware and application support software. The hardware consists of electronic tags and readers. Electronic tags are data carriers and are divided into passive, semi-passive, and active RFID tags. A passive tag extracts the radio frequency energy radiated by the reader as its working power and transmits the tag information to the reader; semi-passive and active tags are powered by a battery.
RFID middleware, known as the nerve center of RFID systems [4], is the most important part of the RFID software system. It directly faces the mass of data collected by the hardware, filters that data, and submits it to the high-level application software after effective packaging. Current research on RFID middleware concentrates mainly on how to filter the vast amounts of data, remove the redundancy, and extract the useful information [5]. The functions of RFID middleware are shown in Figure 1.

Figure 1. RFID middleware diagram (layers, top to bottom: application software systems; interfaces and protocols; RFID middleware; interfaces and protocols; RFID readers; radio frequency identification tags)

III. DATA FILTERING

The raw data collected from the underlying hardware is enormous, yet only a small part of it is truly meaningful to the user. If redundant data is not filtered out, it imposes three kinds of burden:
(1) a burden on network bandwidth, because large amounts of data must be transferred;
(2) a burden on the data processor, because large amounts of data must be handled;
(3) a burden on data storage, because the database must store large amounts of extra data.

The data that the middleware receives from RFID readers contains redundant information and also erroneous information, so filtering the data is necessary and is one of the middleware's important functions. The purpose of a filter is to eliminate redundant data and "useless" information and to pass "useful" information on to the application. The redundant data that the middleware needs to filter out includes the following.

(1) The same RFID reader reports the same data repeatedly within a short time. When node locations are being detected, the information of a fixed node is reported repeatedly, and a node is read repeatedly as it moves into and out of an area.

(2) Neighboring readers report the same data. Readers have a miss rate, which depends on the placement of the antenna, the distance from the reader, and the material of the article.
Typically, to ensure the read rate, more than one reader may be placed at the same location. When several readers report the same monitored article, repeated data is generated.

(3) In addition to the above issues, many users may want only the information for a particular node, for new and disappeared nodes, or for certain special nodes. Users expect minimal redundancy and accurate data that closely matches their needs, and it is up to the middleware to solve this problem.

The solution for redundant information is to set up filters in the middleware. Different systems require different types of filters; based on the kinds of redundant data listed above, current filters can be summed up in the following three types.

(1) Weight filter (duplicate elimination)

The collected data often contains a significant amount of redundancy, which this filter eliminates [6]. We adopt the following filtering algorithm. Assume that the data acquired by the middleware can be expressed as (ReaderID, NodeID, Timestamp), where ReaderID is the ID of the RFID reader, NodeID is the ID of the RFID node, and Timestamp is the time at which the node was read. During filtering, the data is put into a hash table that uses NodeID as the key, and a time interval TimeInterval is defined. When the reader reads new node data, check whether a node with the same data exists in the hash table. If it exists and the lag between the two read times is less than TimeInterval, the reading is considered a repeat and is filtered out, and the node's read time in the hash table is updated. If it exists and the lag is greater than TimeInterval, the reading is considered new node data that needs to be output, and the node's read time in the hash table is likewise updated. If it does not exist, insert it into the hash table and output the node data.
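The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the names Reading, DuplicateFilter, and accept are hypothetical, timestamps are assumed to be in seconds, and a lag exactly equal to TimeInterval (a case the description leaves open) is treated here as new data.

```python
from typing import Dict, NamedTuple


class Reading(NamedTuple):
    """One report in the form (ReaderID, NodeID, Timestamp)."""
    reader_id: str
    node_id: str
    timestamp: float  # seconds; the unit is an assumption


class DuplicateFilter:
    """Drops a reading when the same node was read less than time_interval ago."""

    def __init__(self, time_interval: float) -> None:
        self.time_interval = time_interval
        # Hash table keyed by NodeID, holding the most recent read time.
        self.last_seen: Dict[str, float] = {}

    def accept(self, reading: Reading) -> bool:
        """Return True if the reading should be output, False if filtered."""
        previous = self.last_seen.get(reading.node_id)
        # The stored read time is refreshed in every case, as the text prescribes.
        self.last_seen[reading.node_id] = reading.timestamp
        if previous is None:
            return True  # node not in the table yet: insert and output
        # A lag below the interval marks a repeat reading. Treating a lag
        # exactly equal to the interval as new data is an assumption.
        return reading.timestamp - previous >= self.time_interval
```

Because the table is keyed by NodeID alone, repeats reported by a neighboring reader are dropped as well. For example, with time_interval=2.0, readings of the same node at 0.0, 1.0, 2.5, and 5.0 seconds are output only at 0.0 and 5.0, since each filtered repeat still refreshes the stored read time.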
(2) Event filter

In this paper, the middleware mainly filters for three kinds of nodes: new nodes, left nodes, and currently active nodes. A new node is a node that appears now and has never appeared before. A left node is a node that has appeared before but has not been read within PersistTime. A currently active node is either a new node or a node that appeared previously and has been read again within PersistTime. For the event filter [7], we adopt the following algorithm. Assume that node data has the format (ReaderID, NodeID, TimeInterval). During filtering, the currently active node data is kept in a hash table that uses NodeID as the key, and two additional queues hold the new node data and the data of nodes that have already left. When a node is read, check whether the same node exists in the hash table. If it does not exist, insert it into the hash table and into the new-node queue. If it exists, update the node's read time in the hash table. Then traverse the hash table, move every node that has not been updated within PersistTime into the queue of left nodes, and remove it from the hash table [8].

(3) Invalid RFID data filter

In practical applications, data filters have requirements beyond removing redundancy. Because of unstable signals or other interference, the RFID tags of items on a shelf cannot be detected in every reader cycle; and when a customer pushes a cart past a shelf, the merchandise in the cart may be read by the shelf's readers. Such readings are invalid RFID data. The key to suppressing invalid RFID data is to identify these occasional readings programmatically and erase them. The algorithm presented here uses thresholds [9]: each report of a tag adds a certain amount of weight to that tag, and the weight of tags that do not appear is reduced. When a tag's weight rises above or falls below the corresponding threshold, the corresponding tag event is triggered [10].
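Both the event filter in (2) and the threshold scheme introduced in (3) keep only a small amount of state per tag. The following Python sketch shows one possible reading of each and is not the authors' implementation. The class and method names, the explicit sweep(now) call, the per-cycle set of tag IDs, and the initial weight of 0 are assumptions; value_step, f_app, f_dis, and detect_status correspond to the paper's valueStep, fapp, fdis, and detectStatus.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple


class EventFilter:
    """Tracks new, currently active, and left nodes."""

    def __init__(self, persist_time: float) -> None:
        self.persist_time = persist_time
        self.active: Dict[str, float] = {}    # hash table: NodeID -> last read time
        self.new_queue: Deque[str] = deque()  # nodes seen for the first time
        self.left_queue: Deque[str] = deque() # nodes that have left

    def read(self, node_id: str, read_time: float) -> None:
        """Record one reading of a node."""
        if node_id not in self.active:
            self.new_queue.append(node_id)
        self.active[node_id] = read_time  # insert, or refresh the read time

    def sweep(self, now: float) -> None:
        """Move nodes not read within persist_time to the left queue."""
        # Using a strict comparison at the boundary is an assumption.
        expired = [n for n, t in self.active.items() if now - t > self.persist_time]
        for node_id in expired:
            self.left_queue.append(node_id)
            del self.active[node_id]


class ThresholdFilter:
    """Suppresses occasional readings with a per-tag weight and two thresholds."""

    def __init__(self, value_step: int, f_app: int, f_dis: int) -> None:
        self.value_step = value_step
        self.f_app = f_app
        self.f_dis = f_dis
        self.weight: Dict[str, int] = {}
        self.detect_status: Dict[str, bool] = {}

    def cycle(self, seen: Set[str]) -> List[Tuple[str, str]]:
        """Process one reader cycle and return the (event, tag) records raised."""
        for tag in seen:
            # Starting a new tag at weight 0 and status False is an assumption.
            self.weight.setdefault(tag, 0)
            self.detect_status.setdefault(tag, False)
        events: List[Tuple[str, str]] = []
        for tag in sorted(self.weight):  # sorted only for a stable event order
            # Add value_step when the tag appears, subtract 1 when it does not.
            self.weight[tag] += self.value_step if tag in seen else -1
            if self.weight[tag] >= self.f_app and not self.detect_status[tag]:
                self.detect_status[tag] = True
                events.append(("appear", tag))
            elif self.weight[tag] <= self.f_dis and self.detect_status[tag]:
                self.detect_status[tag] = False
                events.append(("disappear", tag))
        return events
```

With value_step=2, f_app=4, and f_dis=0, a tag read in only one cycle never reaches f_app and therefore produces no event, while a tag read in two consecutive cycles triggers an appearance and, after four cycles without a read, a disappearance. The sketch leaves the weight unbounded; whether to cap it is a design choice a real deployment would have to make.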
The threshold algorithm is described as follows:
1) Define the amount added to a tag's weight after each appearance as valueStep;
2) Define the threshold that triggers the tag's appearance as fapp;
3) Define the threshold that triggers the tag's disappearance as fdis;
4) Define the tag's status field as detectStatus;
5) If the tag appears, its weight increases by valueStep;
6) If the tag does not appear, its weight decreases by 1;
7) If the tag's weight is greater than or equal to fapp and detectStatus = false, the tag appearance event is triggered: generate a tag-appearance record and set detectStatus to true;
8) If the tag's weight is less than or equal to fdis and detectStatus = true, the tag disappearance event is triggered: generate a tag-disappearance record and set detectStatus to false.

The thresholds in the algorithm above can be set as needed. Because invalid RFID data occurs only occasionally, it rarely drives a tag's weight above fapp or below fdis, so the algorithm effectively suppresses invalid RFID data.

IV. FURTHER ANALYSIS OF THE DATA FILTER

In a specific application, to ensure the validity of the information uploaded to the upper-layer service interfaces, the three types of filter above are used in combination to improve filtering accuracy. The following takes the intelligent supermarket shelf, which is popular in intelligent logistics systems, as an example to illustrate the combination of filters into a smart filter. Figure 2 shows the filtering process of the smart filter.

Figure 2. Flow chart of data filtering (elements: reader; tag data stream; redundancy-elimination filter; invalid RFID filter; event filter; discard outputs; feedback; supermarket application software system; supermarket management database)

With smart shelves, an administrator can monitor the articles on a shelf in real time. First, when articles already on the shelf are read repeatedly, the weight filter eliminates the redundant information.
Second, administrators need to know when a new product is put on the shelf and which merchandise has been bought by customers; this is the job of the event filter. Third, when a customer carries an article taken from another shelf past this shelf, the RFID reader may read the article's information and regard it as a new product, which is an error; this is where the invalid RFID data filter has its place. Thus, the RFID middleware uses a combination of three filters to filter out redundant information effectively before the tag information is passed to the upper-layer application software, which reduces the burden on the system.

V. CONCLUSION

This paper studies the basic knowledge of RFID middleware in the Internet of Things and points out the importance of RFID middleware within it. Through research on the different features of existing filtering technologies and the example of smart shelves, the paper discusses the feasibility of combining filters. We hope that this paper can contribute to the further development of data filtering technology based on RFID middleware.

REFERENCES

[1] Shen Subin. Study on the architecture of the Internet and related technologies[J]. Journal of Nanjing University of Posts and Telecommunications, 2009, 6.
[2] Ding Zhenhua, Li Jintao, Feng Bo, Guo Junbo. Advances in RFID middleware[J]. Computer Engineering, 2006, 11.
[3] You Zhanqing, Li Sujian. Radio frequency identification technology (RFID) theory and application[M]. Beijing: Publishing House of Electronics Industry, 2004: 8-21.
[4] Zhong Huian. RFID technical operational nerve centre: RFID middleware[J]. Electronic Commerce Pilot, 2004, 6(14): 1.
[5] Ding Zhenhua, Li Jintao, Feng Bo. RFID middleware advances[J]. Computer Engineering, 2006, 32(21): 9-11.
[6] Jiang Shaogang, Tan Jie. RFID middleware research on data processing and filtering methods[J]. Computer Applications, 2008, 28(10): 2613-2615.
[7] Li Li, Zhu Fresh, Wang Fang. EPC middleware research system[J]. Computer Engineering and Design, 2006, 27(18): 3360-3363.
[8] Wendy Zhao. Design and realization of an RFID middleware event management system[D]. Wuhan: Huazhong University, 2006.
[9] Alfonsi B J. Privacy debate centers on radio frequency identification[J]. Security & Privacy Magazine, 2004, 2(2): 12.
[10] Palmer M. Seven principles of effective RFID data management[EB/OL]. [2008-04-10]. http://www.objectstore.com/docs/articles