Mobile Big Data Fault-Tolerant
Processing for eHealth Networks
Kun Wang, Yun Shao, Lei Shu, Chunsheng Zhu, and Yan Zhang
Abstract
In daily life, people tend to use mobile networks to obtain more accurate and comprehensive data.
With intelligent mobile devices, almost all kinds of data can be collected automatically, which contributes directly to the growth of eHealth. However, large
amounts of data are also leading us into the era of big data, in which new data
collection, transmission, and processing techniques are required. To ensure ubiquitous data collection, the scale of mobile eHealth networks has to be expanded.
Also, networks will face more pressure to transmit large amounts of eHealth data.
In addition, because the processing time increases with data volume, even powerful processors cannot always be regarded as efficient for big data. To solve these
problems, in this article, an interests-based reduced variable neighborhood search
(RVNS) queue architecture (IRQA) is proposed. In this three-layer architecture, a
fault-tolerant mechanism based on interests matching is designed to ensure the
completeness of eHealth data in the data gathering layer. Then the data integrating
layer checks the accuracy of data, and also prepares for data processing. In the
end, an RVNS queue is adopted for rapid data processing in the data analyzing
layer. After processing with relevant rules, only valuable data will be reported to
health care providers, which saves their effort to identify these data. Simulation
shows that IRQA is steady and fast enough to process large amounts of data.
Traditional health care services can hardly meet the needs
of a growing population because hospital capacity and
medical workers are limited relative to the continuously
increasing number of treatment requests. Against this background, a new
kind of eHealth service that uses intelligent devices to monitor
people’s lives is developing rapidly with the benefit of big data
techniques. In this era of big data, large amounts of data are
transmitted over the Internet, stored in servers and clouds,
and even collected around people’s lives by mobile networks
[1]. Characterized by volume, velocity, variability, and
veracity, the term mobile big data describes those
extremely large or complex data sets in the network for which
traditional processing methods are inadequate.
Specifically, data collected by mobile eHealth networks are
becoming more ubiquitous with the development of hardware,
and a group of collecting nodes is required to obtain more comprehensive data [2]. Meanwhile, the number of collecting nodes in a
network tends to increase, and each collecting node tends to
have a higher sampling frequency. Some typical types of sensors
(e.g., brain sensors) generate huge amounts of data, the
size of which can even grow to the terabyte level. Also, the
size of the records of a clinical agency easily grows to the exabyte
level. All the factors mentioned above increase the scale of a
network, as well as the data volume transmitted in an eHealth
network.
Kun Wang is with Nanjing University of Posts and Telecommunications.
Yun Shao is with the University of Southern California.
Lei Shu is the corresponding author for this article, and is with Guangdong
University of Petrochemical Technology, Dalian University of Technology,
and Guangdong Provincial Key Laboratory of Petrochemical Equipment
Fault Analysis.
Chunsheng Zhu is with the University of British Columbia.
Yan Zhang is with Simula Research Laboratory and the University of Oslo.
0890-8044/16/$25.00 © 2016 IEEE
On the other hand, due to the limited energy and functionality of mobile nodes, data has to be converged and processed
in a central server. As shown in Fig. 1, the monitoring of people’s status is always completed by the combination of body
sensors and environmental sensors. The reason that sensor
data (e.g., blood pressure, blood sugar, heart rate) from body
sensors and environmental sensors should be combined in
analysis is that some environmental factors (e.g., temperature,
pressure) can strongly affect an elderly person’s physiological condition. In addition, we also show status sensors in Fig. 1,
because a person’s movement status may cause inaccurate analysis.
For example, the movement of some kinds of transportation
(e.g., cars and elevators) can cause fall detection to malfunction
in some cases. However, with the expansion of network scale caused by more powerful functionalities, it is highly
possible that some interruptive nodes or network congestion
block the routing so that it becomes difficult for data to reach
the server. Accordingly, the server cannot obtain complete data
for analysis, and it will be interested in the value of the lost data.
Based on this idea, the foundation of interests matching is
formed: let a server generate interest requests according to lost
data, and nodes match the received requests in their buffer;
then the lost data could be recollected.
In addition, on the server side, the increasing amount of data inevitably slows down processing. With the development of a network, it is highly possible that the processing rate will be lower than the data receiving rate, which means that much of the newest data arrives while old data is still being processed because of the increase in data processing time. Under this circumstance, if data is still organized in sequence as is traditional, a processor will analyze the earlier arriving data first, and the queue used to store the newest data will grow without bound, so that the newest data will never be processed promptly. Therefore, how to process these collected data efficiently and return valuable information deserves further exploration.
Figure 1. Problem statement of mobile big data network. (Labels in the figure: IRQA, status sensors, body sensors, environmental sensors, vision, hearing, brain, heart rate, blood pressure, toxins, falling, congestion, health care provider, interruptive node, normal node.)
IEEE Network • January/February 2016
To this end, we mainly focus on reliable data transmission
in a mobile eHealth network and rapid data processing in a
central server under a big data scenario, and propose an interests-based Reduced Variable Neighborhood Search (RVNS)
queue architecture (IRQA) to solve unreliable transmission
and data processing problems caused by the expansion of a
network. The contributions of our article are summarized as
follows:
• A new fault-tolerant mechanism is proposed using interests
matching to ensure the reliability of data transmission.
• An RVNS queue is designed based on the RVNS algorithm
[3], which is commonly used for solving combinatorial optimization problems. The RVNS queue improves processing speed
dramatically under a big data scenario.
Big data is an active research topic, and much of the work is still in the
exploration phase. Currently, research on big data can be
divided into two fields.
Many research works focus on algorithms and strategies
for big data mining. Zhang et al. designed a novel community-centric framework to predict community activities [4]. The
framework consists of community detection and community activity modeling. It extracts community activity patterns
from big data collected physically and virtually. Kuang et al.
proposed a unified tensor model [5]. This model can represent unstructured, semistructured, and structured data with
a tensor model. In detail, each kind of data is represented
by a subtensor, which is finally merged with a unified tensor.
Also, a small but valuable core tensor is extracted using an
incremental high order singular value decomposition. Yu et al.
designed an RTIC-C system to handle the huge data volume
based on cloud computing [6]. RTIC-C includes a distributed
data management service to ensure large-scale data storage,
a parallel distributed framework to run various mining applications based on the MapReduce mechanism, and a RESTful
web service interface for third-party mining services. Gu et al.
proposed a cost minimization method [7]. In this method, a
2D Markov chain is adopted to generate an efficient solution
to linearize mixed-integer nonlinear programming problems
about average task completion time.
The other track concentrates on methodologies for big
data storage. Dou et al. proposed a privacy-aware cross-cloud
service composition method [8]. In the method, evaluations
of services are promoted by quality of service (QoS) history
records to enhance credibility. Also, a k-means algorithm is
adopted as the filter for representative historical records. Yin
et al. proposed an efficient storage model based on erasure coding [9]. In the proposed model, the large amount of data is separated into various storage nodes, and different users can set
different coding parameters to improve the robustness of the
overall storage system. Yang et al. proposed a novel approach
for error detection in big sensor data [10]. The approach first
classifies typical data errors. Then clustered wireless sensor
networking (WSN) is introduced for fast error detection based
on scale-free network topology. Most detection operations are
carried out in spatial data blocks, which ensures timeliness.
To sum up, even though much research in these two directions has been carried out, there are few comprehensive
designs to solve problems in ambient assisted living (AAL).
The necessity of big data analysis was already pointed out by
Mao et al. [11]. A universal method for big data abstraction
was also presented in the article. Forkan et al. also proposed
a context-aware monitoring framework based on big data techniques [12].
However, most traditional data processing algorithms use first-in first-out (FIFO) queues to carry data [13]. Such a queue is
effective with powerful processors, but a processor may not meet the required speed in some scenarios. Under
this circumstance, massive data are left in the queue and cannot be processed quickly, so a fundamentally different approach is necessary to solve the problem.
We have previously studied data processing in eHealth networks, and proposed a local data processing architecture in [14]. In that article, we also adopted the
RVNS algorithm as a methodology for data processing, but
here we explore more essential characteristics of RVNS
and propose the concept of an RVNS queue. In addition, we
consider big data transmission in mobile eHealth
networks in more depth, and propose an interests-matching mechanism to
ensure the reliability of data transmission. Based on the above
description, an interests-matching RVNS queue architecture is
introduced in this article to help health care providers identify
valuable data more efficiently.
The rest of the article is organized as follows. We first present the main idea
of RVNS. We then demonstrate how IRQA ensures
reliable and rapid processing through a three-layer architecture,
followed by simulation results. Finally, we present a conclusion.
Overview of RVNS Algorithm
In this section, we introduce RVNS and its original algorithm,
Variable Neighborhood Search (VNS).
VNS is a meta-heuristic algorithm for solving combinatorial optimization problems [3]. Normally, for a combinatorial optimization problem, there is always a solution space S = {x1, x2, x3, …, xn}, and an optimal solution selected from S based on some specific utility function f. However, in VNS, there are two kinds of optimal solutions, called global and local optimal solutions, denoted as x and x′, respectively.
A global optimal solution is the currently found x ∈ S that yields the maximum value of f(x). For every global optimal solution x, there is a neighborhood structure N(x) constructed based on x. N(x) is a subset of S. Also, the neighborhood structure consists of kmax neighborhoods. These specific neighborhoods are denoted as Nk, where k ∈ [1, kmax]. Particularly, when x is a feasible global optimal solution, these specific neighborhoods are denoted as Nk(x). Nk is reconstructed continuously through a series of predefined rules when a new feasible global optimal solution is found.
Figure 2. Overall idea of RVNS. (Labels in the figure: solution space, global optimal solution X, neighborhoods, randomly generated local optimal solution X′, and the two outcomes X′ > X and X′ < X.)
A new feasible global optimal solution is always generated from local optimal solutions. In VNS, if we cannot find another xi in a specific neighborhood satisfying f(x′) < f(xi) using a specific subroutine, x′ will be regarded as a local optimal solution of those neighborhoods, where i ∈ [1, n]. However, RVNS omits the subroutine, and simply takes an element of the current neighborhood at random as the local optimal solution. Considering the necessity of rapid processing, we use RVNS for guidance in reorganizing received data in this article. After the local optimal solution is determined, the original global optimal solution x is replaced by the local optimal solution x′ if f(x) < f(x′), so a new feasible global optimal solution is generated.
At the beginning of RVNS, a global optimal solution x is
generated randomly, and a corresponding neighborhood structure Nk(x) is constructed. The first local optimal solution x′ is
generated from N1(x). If the original global optimal solution
x is replaced by the current local optimal solution x′, a new
neighborhood structure is constructed based on the new feasible global optimal solution. The second local optimal solution
is found in N2(x), and the above workflow is repeated. The
procedure of RVNS is presented in Fig. 2.
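The workflow above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ implementation: the function and parameter names, the toy integer solution space, and the utility function are all assumptions made for the example, and returning to the first neighborhood after an improvement follows the usual formulation of RVNS in [3].

```python
import random


def rvns(space, utility, neighborhood, k_max, rounds, rng):
    """Reduced VNS: draw one random candidate per neighborhood, no local search."""
    x = rng.choice(space)  # random initial global optimal solution
    k = 1
    for _ in range(rounds):
        candidates = neighborhood(x, k)  # N_k(x), rebuilt around the current x
        if candidates:
            x_new = rng.choice(candidates)  # random local optimal solution x'
            if utility(x) < utility(x_new):
                x, k = x_new, 1  # accept x' and restart from N_1
                continue
        k = k + 1 if k < k_max else 1  # otherwise move to the next neighborhood
    return x


# Toy problem (hypothetical): integers 0..99, utility peaks at 70.
space = list(range(100))
utility = lambda v: -(v - 70) ** 2
# N_k(x) is taken here as the values at distance k from x that stay in range.
neighborhood = lambda x, k: [v for v in (x - k, x + k) if 0 <= v < 100]

best = rvns(space, utility, neighborhood, k_max=5, rounds=2000, rng=random.Random(0))
print(best)
```

Because each round evaluates only one randomly drawn candidate, the cost per round is a single call to the utility function, which is the property that makes RVNS attractive when processing time matters more than an exhaustive local search.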
To sum up, the whole execution process of VNS mainly
includes two parts: constructing changeable neighborhoods
systematically and searching for local optimal solutions. The
merits of this algorithm are to reduce the calculation complexity through local search and to avoid local optima through
changing neighborhood structure systematically. In this article,
we rearrange the data processing sequence based on the neighborhood structure in RVNS, which differs from the traditional
processing pattern and makes real-time
processing possible in a big data setting.
Interests-Based RVNS Queue Architecture
In this section, we introduce how IRQA works in detail. As
shown in Fig. 3, IRQA can be divided into three layers: data
gathering layer (DGL), data integrating layer (DIL), and data
analyzing layer (DAL). DGL takes charge of collecting and
storing relevant collected data. Also, it checks the status of
mobile nodes, and reports interruptive nodes with a fault-tolerant mechanism. Converged data enter DIL, which is used to
check the validity of data. Because working status and environment influence the data, a learning machine is adopted to
optimize the checking process. Also, data is divided into different levels in DIL. In the end, DAL puts data into an RVNS
queue and reports valuable data.
Data Gathering Layer (DGL)
In the proposed architecture, data collected by deployed
mobile nodes will be converged to a local server and wait for
processing. The server interacts with people directly.
Usually, in mobile big data networks, nodes are portable,
and they can be carried by people, carried by vehicles, or move by themselves. Battery power is used for monitoring and moving. The continuous movement of these nodes
serves as the basis of interests matching.
Each node is assigned a unique index during initialization
as its identification in the system. After data collection, nodes
package the data into packets and use their node indexes to identify the packets. The local server can then
retrieve the source node of a packet by looking up the
node index on that packet. In addition, considering the difference between the sending sequence and the receiving sequence in
the real world, a timestamp is attached before a packet is sent.
Based on this timestamp, a server can sort the collected data
in time sequence.
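As a minimal sketch of the packet format described above, a packet can carry its node index and timestamp so that the server can recover both the source and the time order. The `Packet` class, its field names, and the sample payloads are illustrative assumptions, not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    node_index: int   # unique index assigned to the source node at initialization
    timestamp: int    # attached by the node just before sending
    payload: bytes    # collected sensor data (format is hypothetical)


def sort_by_time(packets):
    """Order received packets by timestamp, breaking ties by node index."""
    return sorted(packets, key=lambda p: (p.timestamp, p.node_index))


# Packets may arrive out of order; the timestamp restores the time sequence.
received = [Packet(2, 51, b"hr=74"), Packet(1, 50, b"hr=80"), Packet(2, 50, b"hr=73")]
ordered = sort_by_time(received)
print([(p.timestamp, p.node_index) for p in ordered])
```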
Initially, nodes relay packets according to the original routing algorithm. However, due to unreliable network conditions, some packets are delayed. Meanwhile,
because of interruptive nodes, a server cannot obtain complete data. Accordingly, an effective fault-tolerant mechanism
is necessary. In this article, a new fault-tolerant mechanism is
designed based on the idea of interests matching, as shown in
Fig. 4.
Lost packets can be found by referring to the node indexes
when sorting packets by timestamp. Then an index interest is generated and broadcast for each lost packet according
to the identified timestamp and node index.
Nodes search for the relevant packets in their buffers once
interests are received. If those packets are not found,
nodes take no action except to help rebroadcast the interests. Otherwise, the node resends the packets that match the
interests.
The reason that the interests matching mechanism works
is the existence of movable nodes. Initially, the routing algorithm sets up a routing path for packet transmission, but the
server does not receive the packets because of a routing break.
Then an interest is generated and broadcast. Note that there
is an interval between the original relay of a lost packet and
the reception of the interest. During this interval, the routing path changes because nodes are moving continuously. Accordingly, the retransmitted packet is more likely to reach the server.
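The interests-matching steps above can be illustrated with a short sketch. It assumes, purely for the example, that every node emits one packet per integer timestamp, so that the server knows which (node index, timestamp) pairs to expect; the article does not specify this, and the names `find_lost`, `Node`, and `on_interest` are hypothetical.

```python
def find_lost(received, node_indexes, first_ts, last_ts):
    """Return one interest (node_index, timestamp) per expected packet not received."""
    seen = set(received)
    return [
        (node, ts)
        for node in node_indexes
        for ts in range(first_ts, last_ts + 1)
        if (node, ts) not in seen
    ]


class Node:
    def __init__(self, buffer):
        # Buffered packets keyed by (node_index, timestamp).
        self.buffer = dict(buffer)

    def on_interest(self, interest):
        """Resend a matching buffered packet, or return None to only rebroadcast."""
        payload = self.buffer.get(interest)
        return None if payload is None else (interest, payload)


# Server side: packet A50 (source A, timestamp 50) never arrived.
received = [("A", 49), ("A", 51), ("B", 49), ("B", 50), ("B", 51)]
interests = find_lost(received, ["A", "B"], 49, 51)

# Node side: only the node that still buffers A50 answers the interest.
relay = Node({("A", 50): b"hr=72"})
bystander = Node({})
replies = [n.on_interest(interests[0]) for n in (relay, bystander)]
print(interests, replies)
```

In a real network the resent packet would be routed along whatever path exists at that moment, which is where node mobility helps; the sketch only covers the matching logic.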
Data Integrating Layer (DIL)
Filter: Even though DGL ensures the completeness of
packets, the correctness of data has not been checked. In the
real world, it is hard to keep all the data unchanged due to the
influence of the network environment. Accordingly, a filter is
designed in this section. To provide a reference, a relevant data
set is introduced. This relevant data set stores massive historical data collected by various nodes.
When packets enter DIL, they are first decapsulated. At this
moment, a filter checks representative data according to the
range retrieved from a relevant data set, based on which some
invalid data can be found. Also, a learning machine is adopted in the filter to accumulate common errors so that similar
errors can be detected faster.
Corresponding error packets are returned to DGL with their
node indexes and timestamps, where relevant
interests are generated and the data are recollected.
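A minimal sketch of such a filter is given below. The article does not describe the learning machine, so a simple set of previously seen errors stands in for it here; the class name `RangeFilter`, the field name `heart_rate`, and the historical values are assumptions made only for illustration.

```python
class RangeFilter:
    def __init__(self, history):
        # history maps a field name to historical values (the relevant data set);
        # the valid range is taken as the observed minimum and maximum.
        self.ranges = {name: (min(vals), max(vals)) for name, vals in history.items()}
        # Stand-in for the learning machine: remember errors already seen.
        self.known_errors = set()

    def is_valid(self, name, value):
        if (name, value) in self.known_errors:
            return False  # fast path for an error seen before
        low, high = self.ranges[name]
        if low <= value <= high:
            return True
        self.known_errors.add((name, value))
        return False


f = RangeFilter({"heart_rate": [48, 60, 75, 110, 180]})
print(f.is_valid("heart_rate", 72))  # inside the historical range
print(f.is_valid("heart_rate", 0))   # outside the range, recorded as a known error
```

A packet that fails the check would then be handed back to DGL with its node index and timestamp so that an interest can be generated for it.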
Level Division: The mechanisms above
ensure data completeness and accuracy. Then
the data with various timestamps from different
nodes can be organized as a matrix, which can
be regarded as a solution space S of RVNS.
Data inside the matrix are put in an RVNS
queue later.
In the matrix, each column holds the
data collected by different nodes at the same
time, and each row holds the data collected by one node at different times. Afterward, data in S are
divided into several levels randomly. Note
that we do not design a complex algorithm for
level division. That is because in a big data scenario, the data volume can be enormous and
has already increased the processing time. If
an algorithm with higher complexity were
adopted, the processing time would increase further
with the algorithm complexity, and it would become
impossible to control the processing time. Also,
by the law of large numbers, the size of each level tends to be equal when
the data volume becomes extremely large.
As a result, all the data in S are divided into
corresponding levels, which serve as the foundation of neighborhood construction, or what
we call RVNS queue construction.
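The random level division can be sketched as follows, together with the grouping rule k = |L(temp) – L(opt)| that the Data Analyzing Layer uses to turn levels into neighborhoods. The function names and the sample sizes are assumptions for illustration. Note that columns on the same level as the chosen column get k = 0 in this sketch, a case the article does not discuss.

```python
import random
from collections import Counter


def divide_levels(num_columns, num_levels, rng):
    """Assign each matrix column a level in 1..num_levels uniformly at random."""
    return [rng.randint(1, num_levels) for _ in range(num_columns)]


def build_neighborhoods(levels, opt):
    """Group the other columns by k = |L(temp) - L(opt)| relative to column opt."""
    hoods = {}
    for col, level in enumerate(levels):
        if col != opt:
            hoods.setdefault(abs(level - levels[opt]), []).append(col)
    return hoods


# With many columns, the four levels end up with nearly equal sizes.
levels = divide_levels(40000, 4, random.Random(1))
sizes = Counter(levels)
print(sorted(sizes.items()))

# Small worked example: column 0 (level 2) is the current global optimal solution.
hoods = build_neighborhoods([2, 1, 4, 2, 3], opt=0)
print(hoods)  # {1: [1, 4], 2: [2], 0: [3]}
```

The assignment costs one random draw per column, which is consistent with the argument above that level division should not add noticeable processing time.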
Figure 3. Structure of IRQA. (Labels in the figure: health care provider, valuable data, RVNS queue, processing unit, buffer, levels, filter, interests, packet, sensor index, timestamp, and the data analyzing, data integrating, and data gathering layers.)
Data Analyzing Layer (DAL)
In this section, we first introduce the idea of risk function. Then we analyze how an RVNS queue is used for rapid data processing as well as its detailed design.
RVNS Queue Construction: After level division, a neighborhood structure is constructed according to the current global optimal solution. Specifically, an initial global optimal solution (the current global optimal solution) is selected randomly from S, that is, one column from the matrix, denoted as X. The neighborhood structure is constructed according to the level of X, and is denoted as N(X). Suppose X ⊆ L(opt) (opt ∈ [1, max]), and the level of another column from the matrix is denoted as L(temp) (temp ∈ [1, max]); then that column belongs to Nk(X), where k = |L(temp) – L(opt)|. When all the data are placed in their corresponding neighborhoods, the original RVNS queue construction is finished. Unlike other queues, an RVNS queue is modified during the running of RVNS, which is called neighborhood reconstruction.
Before the start of RVNS, every column is assigned to a specific neighborhood according to X. At this moment, data in the solution space are no longer carried by a traditional FIFO queue, but by the RVNS queue. Since the analyzing rule of every application scenario has a given threshold, the initial global optimal solution is processed by adopting the corresponding rule, and the result of the initial global optimal solution is recorded. When a new RVNS turn is launched, a function is called to generate a local optimal solution X′ from the first neighborhood randomly. If the result of X′ is larger than that of X after being processed by the rule, the relevant data of X′ are more valuable. In this case, the current global optimal solution X is replaced by the local optimal solution X′, and the RVNS queue is reconstructed according to the new global optimal solution, which is the neighborhood reconstruction mentioned above. Specifically, when the result is large enough, its relevant data are reported. On the other hand, if the result of the current local optimal solution X′ is smaller than that of the current global optimal solution, RVNS continues to search in the next neighborhood.
Detailed Design of an RVNS Queue: There are two factors that can influence the performance of an RVNS queue. The first factor is the size of the solution space. Because RVNS randomly picks data in the RVNS queue to process, the possibility of selection risk decreases with the expansion of the solution space. On the contrary, if the solution space is too small, the RVNS queue is essentially close to a traditional FIFO queue. Another factor is the interval of neighborhood reconstruction. Since newly generated data enter DAL continuously, the newest data cannot be processed promptly if the interval is too large. On the other hand, data in the solution space do not have enough processing time if the interval is too small. In addition, cross impact may exist between these two factors. For example, assuming that the solution space is relatively small, the time needed for analysis would definitely be shorter than the time needed for analysis with a larger solution space. Thus, without considering the current size of the solution space, a proper interval of neighborhood reconstruction cannot be achieved. Based on this, a group of simulations is designed to demonstrate their optimal values.
In the end, we need to pay attention to the terminal condition of RVNS. There are two conventional terminal conditions: maximum time of neighborhood reconstruction and maximum CPU running time. If we adopt the former as the terminal condition, RVNS will stop once the maximum time is achieved. At this moment, because the newest data still need to be analyzed, the former larger results and their relevant data are invalid, which increases the amount of neighborhood reconstruction. On the other hand, if the maximum CPU running time is adopted as the terminal condition, this problem is still unsolved. In this case, it is notable that once a result is larger than a given threshold, IRQA sends a report, and the result must be at its peak at this time. However, it cannot ensure that there is no result larger than the threshold and smaller than the current highest risk factor, which means other smaller results are ignored in the following analysis. In this case, relevant data are ignored without a report. Accordingly, we need to change the conventional terminal conditions, and as described above, the terminal condition should be set as the sending of a report, because the report signal is still generated in a new RVNS queue with a result larger than the threshold and smaller than the former peak.
Figure 4. Fault-tolerant mechanism: a) original transmission process; b) fault-tolerant transmission process. (Labels in the figure: nodes A to E, interruptive node, blocked, node buffer, data packet A50 with source A and timestamp 50, interest A50.)
Performance Evaluation
The performance evaluation is divided into two parts. The first part evaluates the effectiveness of the interests-based mechanism by virtually setting up a network environment. The second part evaluates the performance of an RVNS queue.
Interests-Based Mechanism Evaluation: In this section, we test the improvement resulting from the interests-based mechanism by observing the delivery ratio, which is the ratio of received packets to total generated packets. We first set up a network using C++. In the simulated network, there are 100 moveable nodes, each of which represents a set of sensors carried by an elderly person. These nodes move in a 100 m × 100 m area with speed varying from 3 m/s to 5 m/s. To assist the transmission, we also set 10 fixed nodes as the trunk nodes. The communication radius of both the moveable nodes and the trunk nodes is 7 m. To simulate the interruption, we gradually reduce the number of available trunk nodes from 10 to 5, and compare the delivery ratio of the original scenario and the interests-based scenario in 50,000 s. The results are presented in Fig. 5.
Figure 5. Delivery ratio of original scenario and interests-based scenario. (Axes: number of faulted trunk nodes, 0 to 5, versus delivery ratio in percent; series: original and interests-based.)
As we can see, the interests-based mechanism can obviously improve the delivery ratio. The reason for the improvement is that when a node carrying lost packets receives corresponding interests, the lost packets are resent. By then, the node has moved for a while, and its network environment is more likely to be favorable.
RVNS Queue Evaluation: The performance of an RVNS queue is evaluated through simulation in this section. In this article, a server is simulated using C++. Some of the simulation data come from the Arrhythmia Data Set in the UCI Machine Learning Repository [15]. In simulation, for every 5 s, the server receives a group of packets, the number of which reaches the factorial of 20 (20!). Each group of data enters an RVNS queue with four levels and a FIFO queue, respectively. Data is processed within an assumed rule with complexity of O(2^n), and the processing time of each data group varies from 5 s to 12 s (integer). The maximum analyzing result is 100, and the given threshold is 90. Simulation length is 50,000 s.
We designed three groups of simulation. The first group finds the optimal parameters of an RVNS queue by changing the size of the solution space and the interval of neighborhood reconstruction, and the results are shown in Fig. 6a. The second group compares the increment of maximum results with time between an RVNS queue and a FIFO queue, and the results are shown in Fig. 6b. The third group compares the waiting time between an RVNS queue and a FIFO queue when average data processing time changes, and the results are shown in Fig. 6c. Specifically, in the third group, the average processing time changes from 5 s to 15 s (integer).
Based on the analysis above, due to the existence of cross impact, two parameters change together in Fig. 6a. Results show that the optimums of reconstruction interval and solution space size are 100 s and 100, respectively. Thus, these values are used in the following simulation. In addition, Figs. 6b and 6c reveal that the RVNS queue is stable enough to generate quick response when processing time increases with the amount of data. The most important reason is that the sequence of data processing changes because of the RVNS queue so that the potential valuable data can be analyzed and reported early.
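To see why a FIFO queue falls behind under these settings, consider a simplified worked example. It is not the authors’ simulator: it assumes a single processor and a fixed processing time per group, and the function name `fifo_waits` is hypothetical. With one group arriving every 5 s and 8 s of processing per group (a value inside the 5 s to 12 s range above), every new group waits 3 s longer than the previous one.

```python
def fifo_waits(arrival_interval, service_time, groups):
    """Waiting time of each group in a single-server FIFO queue."""
    waits = []
    free_at = 0  # time at which the processor finishes its current group
    for i in range(groups):
        arrive = i * arrival_interval
        start = max(arrive, free_at)  # wait until the processor is free
        waits.append(start - arrive)
        free_at = start + service_time
    return waits


print(fifo_waits(5, 8, 4))  # [0, 3, 6, 9]: the backlog grows by 3 s per group
print(fifo_waits(5, 5, 3))  # [0, 0, 0]: no backlog when processing keeps pace
```

The linear growth of the waiting time is the behavior the RVNS queue is meant to avoid, since it changes the processing order instead of serving groups strictly by arrival time.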
14000
13000
12000
11000
10000
9000
8000
7000
6000
5000
1000
Conclusion
References
[1] C. Jardak, P. Mahon, and J. Riihijar, “Spatial Big Data and Wireless
Networks: Experiences, Applications, and Research Challenges,” IEEE Network, vol. 28, no. 4, 2014, pp. 26–31.
[2] A. Hristova, A. M. Bernardos, and J. R. Casar, “Context-Aware Services
for Ambient Assisted Living: A Case-Study,” 1st Int’l. Symp. Applied Sciences on Biomedical and Commun. Technologies, 2008, pp. 1–5.
[3] P. Hansen and N. Mladenovic, “Variable Neighborhood Search: Principles
and Applications,” Euro. J. Oper. Res., vol. 130, no. 3, 2001, pp. 449–67.
[4] Y. Zhang et al., “Cap: Community Activity Prediction Based on Big Data
Analysis,” IEEE Network, vol. 28, no. 4, 2014, pp. 52–57.
[5] L. Kuang et al., “A tensor-Based Approach for Big Data Representation and
Dimensionality Reduction,” IEEE Trans. Emerg. Topics Comp., vol. 2, no.
3, 2014, pp. 1–10.
[6] J. Yu, F. Jiang, and T. Zhu, “RTIC-C: A Big Data System for Massive Traffic
Information Mining,” IEEE Int’l. Conf. Cloud Computing and Big Data,
2013, pp. 395–402.
[7] L. Gu et al., “Cost Minimization for Big Data Processing in Geo-Distributed Data
Centers,” IEEE Trans. Emerg. Topics Comp., vol. 2, no. 3, 2014, pp. 314–23.
[8] W. Dou et al., “Hiresome-ii: Towards Privacy-Aware Cross-Cloud Service
Composition for Big Data Applications,” IEEE Trans. Parallel Distrib. Sys.,
vol. 26, no. 2, 2013, pp. 455–66.
[9] C. Yin et al., “Robot: An Efficient Model for Big Data Storage Systems Based
on Erasure Coding,” 2013 IEEE Int’l. Conf. Big Data, 2013, pp. 163–68.
[10] C. Yang et al., “A Time Efficient Approach for Detecting Errors in Big
Sensor Data on Cloud,” IEEE Trans. Parallel Distrib. Sys., vol. 26, no. 2,
2015, pp. 329–39.
IEEE Network • January/February 2016
0
1000
80
60
40
20
0
Acknowledgments
RVNS
FIFO
4
8
16
32
64
128 256 512 1024 2048 4096
Time (s)
(b)
214
212
Waiting time (s)
This work is supported by NSFC (61572262, 61100213,
61401107); SFDPH (20113223120007); NSF of Jiangsu Province (BK20141427), NUPT (NY214097, XJKY14011); Open
Research Fund of Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University
of Posts and Telecommunications), Ministry of Education
(NYKL201507); Educational Commission of Guangdong Province (2013KJCX0131); Guangdong High-Tech Development
Fund (2013B010401035); the 2014 Guangdong Province Outstanding Young Professor Project and 2013 Top Level Talents
Project in the Sailing Plan of Guangdong Province; projects
240079F/20 funded by the Research Council of Norway, and
the European Commission FP7 Project CROWN (grant no.
PIRSES-GA-2013-627490).
0
800
600
400
ize
s
e
c
200
spa
tion
Solu
(a)
Result values
In the big data era, ubiquitous sensing is not always a hard
problem. However, the real problems are how to transfer
collected data and analyze them efficiently. In this article,
a three-layer architecture is proposed for a mobile eHealth
network, where a fault-tolerant mechanism is proposed using
interests matching to ensure the reliability of network transmission. In addition, filtered data are set in an RVNS queue for
rapid processing in the data analyzing layer, and only valuable
data are returned to health care providers, which saves people
effort in identifying them. In further research, we will focus
more on the specific design of the fault-tolerant mechanism to
meet the needs of more universal networks.
Figure 6. Simulation results: a) processing time with different
reconstruction intervals and solution space sizes; b) change of
the highest result found over time; and c) waiting time of data
as the average data processing time changes.
[11] R. Mao et al., “Overcoming the Challenge of Variety: Big Data Abstraction, the Next Evolution of Data Management for AAL Communication
Systems,” IEEE Commun. Mag., vol. 53, no. 1, Jan. 2015, pp. 42–47.
[12] A. R. M. Forkan et al., “BDCaM: Big Data for Context-aware Monitoring–
A Personalized Knowledge Discovery Framework for Assisted Healthcare,”
IEEE Trans. Cloud Comp., vol. PP, no. 99, 2015, pp. 1–14.
[13] S. Prakash, H. L. Yann, and T. Johnson, “A Nonblocking Algorithm for
Shared Queues Using Compare-and-Swap,” IEEE Trans. Comp., vol. 43,
no. 5, 1994, pp. 548–59.
[14] K. Wang et al., “LDPA: A Local Data Processing Architecture in Ambient
Assisted Living Communications,” IEEE Commun. Mag., vol. 53, no. 1,
Jan. 2015, pp. 56–63.
[15] H. A. Guvenir, B. Acar, and H. Muderrisoglu, “Arrhythmia Data Set in
UCI Machine Learning Repository,” UC Irvine, 1998;
http://archive.ics.uci.edu/ml/datasets/Arrhythmia.
Biographies
Kun Wang [M] is an associate professor in the School of Internet of Things,
Nanjing University of Posts and Telecommunications, China. He received his
Ph.D. degree from the School of Computers, Nanjing University of Posts and
Telecommunications in 2009 and was a postdoctoral fellow in the Electrical
Engineering Department, University of California, Los Angeles (UCLA) from
2013 to 2014. He has published over 50 papers in related international
conferences and journals, including IEEE Transactions on Industrial Informatics,
the IEEE Sensors Journal, IEEE Communications Magazine, IEEE GLOBECOM, and IEEE ICC. His current research interests include wireless sensor
networks, delay-tolerant networks, stream computing, ubiquitous computing,
mobile cloud computing, and information security technologies. He is a member of ACM.
Yun Shao is a postgraduate student in the Viterbi School of Engineering,
Department of Computer Science, University of Southern California (USC). He
received his Bachelor’s degree at Nanjing University of Posts and Telecommunications. His current research interests include wireless sensor networks,
delay-tolerant networks, and stream computing.
Lei Shu [M] (lei.shu@live.ie) received his Ph.D. degree from the Digital Enterprise Research Institute, National University of Ireland, Galway, in 2010. Until
March 2012, he was a specially assigned researcher in the Department of
Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University, Japan. In October 2012, he joined Guangdong
University of Petrochemical Technology, China, as a full professor. In 2013,
he started to serve at Dalian University of Technology as a Ph.D. supervisor.
Meanwhile, he is also working as vice director of the Guangdong Provincial
Key Laboratory of Petrochemical Equipment Fault Diagnosis, China. He is
the founder of the Industrial Security and Wireless Sensor Networks Lab. His
research interests include wireless sensor networks, multimedia communication,
middleware, and security. He was awarded the IEEE GLOBECOM 2010 and
ICC 2013 Best Paper Awards. He has been serving as Editor in Chief for IEEE
CommSoft E-letter and EAI Endorsed Transactions on Industrial Networks and
Intelligent Systems.
Chunsheng Zhu received his B.E. degree in network engineering from Dalian
University of Technology in June 2010 and his M.Sc. degree in computer
science from St. Francis Xavier University, Antigonish, Nova Scotia, Canada,
in May 2012. He has been working toward his Ph.D. degree in the Department of Electrical and Computer Engineering, University of British Columbia,
Vancouver, Canada, since September 2012. He has around 40 papers published or accepted by refereed international journals (e.g., IEEE Transactions
on Industrial Electronics, IEEE Systems Journal) and conferences (e.g., IEEE
ICC). His current research interests are mainly in the areas of wireless sensor
networks and mobile cloud computing.
Yan Zhang [SM] is currently head of the Department of Networks at Simula Research Laboratory, Norway, and an associate professor (part-time)
at the Department of Informatics, University of Oslo, Norway. He received
a Ph.D. degree from the School of Electrical and Electronics Engineering,
Nanyang Technological University, Singapore. He is an Associate Editor or
on the Editorial Boards of a number of well-established scientific international
journals, such as Wiley Wireless Communications and Mobile Computing.
He has also served as a Guest Editor for IEEE Transactions on Smart Grid,
IEEE Transactions on Industrial Informatics, IEEE Communications Magazine,
IEEE Wireless Communications, and IEEE Transactions on Dependable and
Secure Computing. He has served or is serving as Chair of a number of conferences, including IEEE PIMRC 2016, IEEE CCNC 2016, WICON 2016,
IEEE SmartGridComm 2015, and IEEE CloudCom 2015. He has served as a
TPC member for numerous international conferences including IEEE INFOCOM,
IEEE ICC, IEEE GLOBECOM, and IEEE WCNC. His current research interests
include wireless networks and reliable and secure cyber-physical systems (e.g.,
healthcare, transport, smart grid). He has received seven Best Paper Awards.