An Integrated and Flexible Scheduler for Sensor Grids

Hock Beng Lim (1) and Danny Lee (2)

(1) Intelligent Systems Center, Nanyang Technological University
limhb@ntu.edu.sg
(2) School of Computing, National University of Singapore

Abstract. The integration of wireless sensor networks and grid computing is an emerging and promising area of research. Sensor grids extend the grid computing paradigm to the sharing of sensor resources in wireless sensor networks. By combining the complementary strengths of sensor networks and grid computing, sensor grids can support applications that require real-time information from the physical environment and vast amounts of computational and storage resources. One of the major challenges in the design of sensor grids is how to efficiently schedule sensor jobs across the collection of sensor resources in a sensor grid. In this paper, we design an integrated and flexible scheduler for a sensor grid testbed. The scheduler makes use of several scheduling and load balancing algorithms to suit the characteristics of sensor jobs. We performed extensive evaluations to characterize the behaviour and performance of these scheduling and load balancing algorithms. Our performance results indicate that the sensor grid scheduler can provide a good tradeoff between performance objectives such as sensor resource utilization, sensor job throughput, and sensor grid lifetime.

1 Introduction

Wireless Sensor Networks (WSNs) consist of groups of small, inexpensive, low-power, and self-contained sensor nodes with sensing, data processing, and wireless communication capabilities [1, 2]. They enable the direct monitoring of the physical environment, and the seamless coupling of real-time sensor data with the digital world. The sensor nodes are resource-constrained since they have limited sensing capability, processing power, and communication bandwidth.
However, with the aggregation of a large number of such nodes over a wide area, a wireless sensor network has substantial data acquisition and processing capabilities. Thus, wireless sensor networks can be regarded as distributed computing resources that can be shared by multiple users and applications.

Grid computing is based on the concept of the coordinated sharing of distributed and heterogeneous resources to solve large-scale problems in dynamic virtual organizations [3]. The shared resources can be computational servers, storage, or even sensors. They are owned by institutions or individuals in virtual organizations (VOs). Rules exist to ensure fairness and quality of service for users of grid resources, and to maximize the utility of these resources. The grid computing paradigm can therefore be extended to include the sharing of sensor resources in WSNs. A sensor grid [4] is formed by the integration of WSNs with the conventional wired grid fabric.

A sensor grid processes multiple sensor job requests from different users. In a typical workflow, sensor jobs arriving at a sensor grid should be handled automatically by the system. This raises the issues of how the sensor grid efficiently schedules incoming jobs to be executed, and how it allocates sensor resources to these jobs.

The scheduling of jobs in distributed, parallel, and grid computing systems has been extensively studied in the past. However, this previous work deals with the scheduling of computational jobs, and there are some important differences between sensor jobs and computational jobs. Thus, existing scheduling algorithms and schedulers for traditional distributed and parallel systems may not work well in the context of sensor grids.

Previously, we proposed a sensor grid architecture called the Scalable Proxy-based aRchItecture for seNsor Grid (SPRING) [4]. In this paper, we design an integrated and flexible scheduler for a sensor grid testbed based on the SPRING framework.
Several scheduling and load balancing algorithms were implemented within this scheduler to suit the unique characteristics of sensor jobs. The scheduler can use an appropriate scheduling or load balancing algorithm to suit the requirements of the resource owner and users. We performed extensive evaluations to measure the impact of these scheduling and load balancing algorithms using several performance metrics. Our performance results indicate the suitability of the scheduling and load balancing algorithms to satisfy performance objectives such as sensor resource utilization, sensor job throughput, and sensor grid lifetime. With these algorithms, our scheduler can provide a good tradeoff between the performance objectives of a sensor grid.

The rest of this paper is organized as follows. Section 2 provides an outline of our SPRING architecture and the sensor grid testbed system implemented. We discuss the design of our sensor grid scheduler in Section 3, including the scheduling and load balancing algorithms used. Sections 4 and 5 present the evaluation of the scheduling and the load balancing algorithms respectively, and also discuss the significance of the results. Finally, Section 6 concludes this paper.

2 Sensor Grid Architecture and Testbed

The key idea of the SPRING framework [4] is to use proxy systems as interfaces between WSNs and the grid fabric. This enables the functionality of the sensor nodes to be exposed to the grid although they are resource-constrained. As shown in Figure 1, SPRING is based on a layered-architecture approach. The layers represent the main middleware components that connect different types of systems to the grid network. Each layer defines services that are accessible via Application Programming Interfaces (APIs) for the application or other layers. The WSN Proxy comprises various Proxy components, which are shown in Figure 2. Please refer to [4] for a detailed description of the SPRING framework and the WSN Proxy components.
We have developed a sensor grid testbed prototype to implement the SPRING framework [4]. For this testbed, the WSN is based on the Crossbow motes platform.

[Fig. 1. The SPRING Framework. Labels in the figure: User System, User Access, Grid Meta-scheduler, Grid Interface, Grid Network, WSN Proxy, Proxy Components, Resource Scheduler, WSN Scheduler, Resource Mgmt, Compute Node, WSN Management, Compute Resource, WSN.]

[Fig. 2. The Proxy Component Architecture. Labels in the figure: Grid Interface, Quality of Service, Availability, Security, Power Management, WSN Connectivity, Information Services, Data Management, Proxy Components, WSN Scheduler, WSN Management.]

3 Sensor Grid Scheduler Design

3.1 Characteristics of Sensor Jobs

The design of a sensor grid scheduler is influenced by some important differences between sensor jobs and computational jobs. Unlike computational jobs, sensor jobs are not multitasking in nature. A sensor node can execute only one sensor job at a time, and it cannot execute multiple sensor jobs via multitasking. Thus, sensor jobs are not preemptible. While computational jobs automatically terminate upon completion, the durations of sensor jobs have to be explicitly specified in advance. Sensor jobs are also more likely than general computational jobs to require specific time slots for execution.

In addition, sensor jobs are not only time-sensitive, but may also be location-sensitive and resource-sensitive in some cases. Certain jobs can only run on certain sensor nodes in the correct location. Another aspect of resource sensitivity is that a single sensor job can run on more than one sensor node simultaneously, which makes the load balancer more significant in allocating these jobs.

3.2 Sensor Job Scheduling Algorithms

In this work, we adapted four existing multiprocessor scheduling algorithms to suit the sensor grid scenario. The scheduling algorithms we have chosen to implement are relatively simple and deterministic, in line with the consideration that the scheduler should be lightweight and provide reliable performance. One modification common to all four is that none of the scheduling algorithms supports preemptive scheduling.

1. Earliest Deadline First (EDF): EDF is a scheduling algorithm that allocates higher priorities to jobs closer to their deadline. Created as an alternative to static priority schedulers, it can guarantee the schedulability of preemptible jobs on a single processor even at full utilization [5]. In the case where deadlines are equal, we modified the algorithm to use the duration of the job as a tie breaker, with shorter duration jobs given a higher priority.

2. First Come First Served (FCFS): FCFS, or First In First Out (FIFO), is the simplest of the four scheduling algorithms. Using a queue structure, this algorithm simply adds new jobs to the end of the queue as they arrive, picking jobs for execution only from the front of the queue.

3. Least Laxity First (LLF): A derivative of EDF, LLF is a scheduling algorithm that allocates priorities based on the amount of laxity a job has: the lower the laxity, the higher the priority. The laxity, or slack time, of a job is the time that would remain until its deadline after the job completes, assuming that the job could be executed immediately [6]. In the case when laxity is equal for two jobs, the algorithm was modified to allocate a higher priority to the job with the nearer deadline.

4. Shortest Job Next (SJN): SJN allocates priorities statically depending on the duration of jobs. Shorter jobs are favoured, and are given higher priorities. Similar to LLF, the modification made to the algorithm was to allocate a higher priority to the job with the nearer deadline when durations are equal.

3.3 Load Balancing Algorithms

Load balancing algorithms are important for resource owners to manage their sensor resources.
By spreading out the allocation of jobs to sensor nodes according to different criteria, resource constraints of WSNs such as sensor node battery life and wireless network bandwidth can be taken into account. For example, resource owners may want to spread out sensor jobs to different sensor nodes in a WSN to even out the power consumption across nodes and prolong the WSN lifetime, provided the jobs do not have affinity to specific nodes.

Our scheduler provides the flexibility for the user to request specific resource allocation, such as in the case when the job is location-sensitive. Alternatively, the user may request automatic resource allocation, in which all nodes in a WSN are eligible to run the job, and thus the user simply needs to specify how many nodes the job should execute on.

1. No Priority (Earliest Slot First): If there are no criteria for load balancing, we choose to let the job finish as soon as possible, without any specific priority consideration. This form of load balancing considers the combinations of the required number of nodes and picks the one with the earliest available time slot into which the sensor job can fit, which gives it the alternate name of Earliest Slot First. In the case of tied earliest times, the algorithm randomly selects a combination of nodes to execute the job.

2. Battery Priority: In our testbed, we assume that we have the facility to query the amount of battery life remaining in the sensor nodes (motes). When a job arrives at the load balancer, the battery levels of the motes are queried. Depending on the number of motes specified by the user, the system examines the average instantaneous battery levels for all combinations of motes, then applies the battery threshold level. The threshold, varying from 0.0 to 1.0, indicates the fraction of combinations with the highest battery levels that the load balancer will consider. That is, a threshold of 0.5 will cause the combinations with the top 50% average battery levels to be considered.
Out of this subset of combinations, the load balancer will then choose the combination with the earliest time. Theoretically, a threshold of 1.0 should be no different from the “No Priority” load balancing algorithm. However, in the case of a tie between average battery and time, the choice of which combination to use will be random.

3. Load Time Priority: One property of sensor jobs is that job durations are clearly specified before execution, which we leverage here to provide a more balanced indicator of load. The load balancer maintains a table of mote information internally, including the number of minutes of jobs scheduled to be executed, i.e. the Load Time for each mote. Similar to Battery Priority load balancing, when a job arrives to be allocated to the motes, the load balancer checks the Load Times and generates a list of average load times for all possible combinations of motes. Like Battery Priority load balancing, it applies the threshold when selecting a combination of motes for allocation, selecting the combination with the earliest timeslot within the threshold level.

3.4 Dropping Jobs

After the phases of job scheduling and load balancing, there are two simple criteria that dictate whether a job will be executed or not: (1) whether the job will exceed its specified deadline, if it is a “hard” deadline, and (2) whether the motes allocated to the job have run out of battery.

3.5 Implementation of the Scheduler

The scheduler was implemented in Java, which is a language compatible with the Globus API. We developed a grid service that runs in the Globus container, accepts job details, and interfaces with the scheduler to schedule the job.

[Fig. 3. % Scheduled vs Load Factor. Series: EDF, FCFS, LLF, SJN.]

As mentioned earlier, ideally we should be able to query the battery life of the motes dynamically.
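If battery levels could be queried in this way, the Battery Priority selection of Section 3.3 might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the Java code of the scheduler. The function name `pick_motes` and its parameters are hypothetical, battery levels are taken as percentages, a combination's start time is simplified to the latest next-free time among its motes (gaps between already scheduled jobs are ignored), at least one combination is always kept even at a threshold of 0.0, and ties on start time are broken at random.

```python
import math
import random
from itertools import combinations


def pick_motes(battery, free_at, k, threshold, rng=random):
    """Pick k motes for a job (hypothetical helper, not the authors' code).

    battery:   mote id -> remaining battery, in percent
    free_at:   mote id -> minute at which the mote next becomes free
    threshold: fraction (0.0 to 1.0) of combinations, ranked by average
               battery, that remain eligible
    """
    def average_battery(combo):
        return sum(battery[m] for m in combo) / k

    def start_time(combo):
        # Assumption: the job starts once every mote in the combination is free.
        return max(free_at[m] for m in combo)

    # Rank every combination of k motes by average battery, highest first.
    combos = sorted(combinations(sorted(battery), k),
                    key=average_battery, reverse=True)

    # Keep the top fraction; rounding guards against float noise such as
    # 0.7 * 10 evaluating to slightly more than 7.
    # Assumption: never keep fewer than one combination.
    keep = max(1, math.ceil(round(threshold * len(combos), 9)))
    shortlist = combos[:keep]

    # Among the shortlisted combinations, take the earliest start time.
    earliest = min(start_time(c) for c in shortlist)
    tied = [c for c in shortlist if start_time(c) == earliest]
    return rng.choice(tied)
```

With a threshold of 1.0 every combination is shortlisted, so the choice reduces to the earliest start time, which matches the description of “No Priority”. Replacing the battery table with a per-mote Load Time table, ranked lowest first, would give a corresponding sketch of Load Time Priority.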
However, due to the limitations of the Crossbow motes platform, we cannot obtain actual data on battery usage in the motes. Thus, we simulate the level of battery usage in our scheduler. We add an extra field in the job parameters called Battery Usage, which indicates the percentage of the total battery life that the job will consume per minute of execution. This simulates the different rates of battery consumption that different jobs have. Studies have shown that it is possible to predict the power usage of a given mote executable via simulation, and thus a predicted rate of battery consumption can be obtained [7].

4 Evaluation of Scheduling Algorithms

In the evaluation of scheduling algorithms, we show how the different execution orders of jobs affect the job turnaround time, the number of jobs that meet their deadlines, and the effective utilization of the motes. We also study how the scheduling algorithms behave as the job load increases, i.e. when timing constraints become tighter.

4.1 Experimental Methodology

To achieve this, we primarily varied the scheduling algorithm and did not use any load balancing. In addition, we varied the Load Factor, which is defined as:

\[ \text{Load Factor},\ F = \frac{\text{sum of all generated job durations}}{\text{total execution time period}} \tag{1} \]

We used the soft deadline type here because we wanted to prevent jobs dropped due to timing constraints from affecting the overall times.

[Fig. 4. Turnaround Time vs Load Factor. Y-axis: Turnaround Time (min). Series: EDF, FCFS, LLF, SJN.]

We developed a workload generator to create 25 batches of 50 jobs each to simulate user input for each of the 10 Load Factor levels. We used this same set of input jobs to test each scheduling algorithm to ensure that the results are comparable with each other.
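As a small worked instance of Equation (1), the Python sketch below computes F for a handful of jobs. The function name `load_factor` and the sample numbers are illustrative assumptions. In particular, the text does not spell out how the total execution time period is measured, so it is simply passed in as a number of minutes.

```python
def load_factor(durations_min, period_min):
    """Equation (1): sum of job durations over the total execution period."""
    if period_min <= 0:
        raise ValueError("the execution period must be positive")
    return sum(durations_min) / period_min


# Four hypothetical jobs totalling 150 minutes in a 300-minute period.
example = load_factor([30, 45, 15, 60], 300)
```

Here the four jobs occupy half of the period, so `example` evaluates to 0.5.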
Each batch consists of jobs that have start times that follow a Poisson distribution over the first hour, and have deadlines 24 hours after the start time. Job durations were randomly specified, to simulate the unpredictable nature of user requirements.

We use a sensor grid testbed with a WSN consisting of 5 motes. To minimize the case of uneven resource allocation, we let the system fully decide the way motes are allocated by using the automatic allocation type. A uniform random distribution was used for the number of motes required per job.

We chose the parameters such that jobs were not constrained by resources, i.e. the nominal battery usage values ensure that no jobs are dropped due to lack of battery, and such that load balancing has as little impact as possible on time allocation. This creates the best possible situation in terms of turnaround and utilization, because jobs can be packed more closely.

4.2 Performance Metrics

i. % Scheduled. This is the ratio of the number of jobs that successfully finish before their deadlines to the total number of jobs. The resource owner would prefer a value that is as high as possible for this metric.

ii. Turnaround Time. This is the average time that a job takes from submission until completion. Users would like this to be as low as possible, as it means that they can get their data back sooner.

iii. % Unutilized Time. This is the percentage of time left unused after the jobs are scheduled. A lower value is desired as it indicates better utilization of the resource.

[Fig. 5. % Unutilized Time vs Load Factor. Series: EDF, FCFS, LLF, SJN.]

4.3 Results and Discussion

The graphs shown in Figures 3 to 5 are based on the averaged values of each set of 25 trials.
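To make the four priority rules of Section 3.2 and the first two metrics of Section 4.2 concrete, the following minimal Python sketch runs a toy job set on a single node without preemption. It is not the Java scheduler used in the experiments. The names `Job`, `simulate`, `metrics`, and the key functions are hypothetical, and % Unutilized Time is left out because it depends on how jobs are packed across several motes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    arrival: int   # submission time, in minutes
    duration: int  # declared run time, in minutes
    deadline: int  # absolute deadline, in minutes


# Priority keys: the ready job with the smallest key runs next.
def edf_key(job, now):
    return (job.deadline, job.duration)   # tie: shorter job first


def fcfs_key(job, now):
    return (job.arrival,)                 # tie: submission order


def llf_key(job, now):
    laxity = job.deadline - now - job.duration
    return (laxity, job.deadline)         # tie: nearer deadline first


def sjn_key(job, now):
    return (job.duration, job.deadline)   # tie: nearer deadline first


POLICIES = {"EDF": edf_key, "FCFS": fcfs_key, "LLF": llf_key, "SJN": sjn_key}


def simulate(jobs, key):
    """Run jobs one at a time on one node; return name -> finish time."""
    pending = sorted(jobs, key=lambda j: j.arrival)  # stable sort
    now = 0
    finish = {}
    while pending:
        ready = [j for j in pending if j.arrival <= now]
        if not ready:
            now = pending[0].arrival  # idle until the next arrival
            continue
        job = min(ready, key=lambda j: key(j, now))
        now += job.duration           # no preemption: run to completion
        finish[job.name] = now
        pending.remove(job)
    return finish


def metrics(jobs, finish):
    """% Scheduled and average Turnaround Time for one trial."""
    met = sum(1 for j in jobs if finish[j.name] <= j.deadline)
    turnaround = sum(finish[j.name] - j.arrival for j in jobs) / len(jobs)
    return {"pct_scheduled": 100.0 * met / len(jobs),
            "turnaround_min": turnaround}
```

As a toy example, take three jobs of 30, 60, and 20 minutes that arrive together and share a deadline of 100 minutes. SJN and EDF run the 20-minute job first, whereas LLF starts with the 60-minute job because its laxity is lowest, so under LLF it is the short job that finishes late.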
From this experiment, we can see that users would prefer the SJN scheduling algorithm, since it provides the shortest turnaround times amongst all the scheduling algorithms. However, SJN performs poorly for utilization and missed time, which resource owners would be concerned about. SJN executes short jobs first, which means that long jobs are constantly delayed, up to the point where they begin to miss their deadlines.

The most suitable choice for resource owners is EDF, which also satisfies users with a reasonable turnaround time. It drops the fewest jobs, maintains a consistent % Unutilized Time of about 12%, and misses the fewest deadlines, each by a relatively short amount. In this case, EDF cannot ensure full schedulability even when the Load Factor is less than or equal to 1, because of the resource requirements of the jobs. This is a major difference between computational and sensor jobs that has been demonstrated in this experiment.

LLF provides the fewest gaps between jobs, but otherwise performs poorly. We noticed that this algorithm tended to schedule long jobs first, since given the same deadline as a short job, the laxity of a long job is lower. As a result, short jobs wait a long time before executing, which affects turnaround time. An additional side effect is that, given the same data set, more short jobs tend to miss their deadlines than under algorithms like SJN, so the % Scheduled is lower.

[Fig. 6. Sensor Grid Lifetime vs Battery Threshold. Y-axis: Sensor Grid Lifetime (min). Series: EDF, FCFS, LLF, SJN.]

It is interesting to note that FCFS is average in all metrics, which was unexpected, given its unpredictable nature. Upon examining the individual trial results, we found that there were many large variations, but these were evened out by the averaging function.
This is the benefit of taking a comparatively larger set of samples, suggesting that FCFS may be a “good enough” algorithm as the number of trials gets large.

5 Evaluation of Load Balancing Algorithms

The evaluation of load balancing algorithms aims to show the effect of load balancing, in combination with the different scheduling algorithms, on the resource lifetimes and the number of schedulable jobs. We also wanted to explore the effect of increasing the threshold on the respective load balancing algorithms, to find the optimum threshold value that gives the highest utilization and at the same time the longest resource lifetime. Lastly, we want to compare the two load balancing algorithms to find out which one produces a more consistent output.

5.1 Experimental Methodology

We vary the threshold values over 6 steps from 0.0 to 1.0 for each of the scheduling algorithms, and repeat the experiment for each of the load balancing types. We picked a Load Factor of 0.5 to ensure that the jobs are guaranteed to meet their deadlines regardless of scheduling algorithm, and to avoid extreme values that were observed to cause the scheduling algorithms to behave erratically.

For each batch of 50 jobs, the job parameters are generally the same as the ones from the previous section with the same rationale, except for the amount of battery use. As mentioned in Section 3.5, we had to simulate the battery usage in our program.

[Fig. 7. % Above Optimal vs Battery Threshold. Series: EDF, FCFS, LLF, SJN.]

In order to test the minimum network lifetime, we had to create a situation in which the batteries would be exhausted. Therefore, the battery usage values were generated with the goal of depleting the battery life of all the motes.

5.2 Performance Metrics

i. Sensor Grid Lifetime. This metric refers to the minimum time until a mote becomes inactive due to lack of battery. When this happens, the sensor grid is unable to service all possible jobs, reducing its usefulness. The resource owner would want to keep this value as high as possible.

ii. % Above Optimal. For this experiment, we assume that the case of no load balancing produces the optimal time. By defining the “completion time” as the time from the start of the first job to the end of the last job, we can define this metric as the completion time for the trial with the given load balancing scheme relative to the completion time for no load balancing, expressed as the percentage by which the former exceeds the latter. Therefore, for the best utilization, the resource owner would want this value to be as close to zero as possible.

5.3 Results and Discussion

Comparing the two load balancing algorithms from the results shown in Figures 6 to 9, we can see that in general, the Load Time Priority scheme produces more consistent results as the threshold increases for all scheduling algorithms. Battery Priority, on the other hand, experiences a drastic change in all metrics as the threshold goes from 0.2 to 0.4. Generally, we can conclude that when the rate of battery usage is not known, Load Time Priority can provide an adequate approximation of balanced battery load that prolongs sensor grid lifetime.

[Fig. 8. Sensor Grid Lifetime vs Load Time Threshold. Y-axis: Sensor Grid Lifetime (min). Series: EDF, FCFS, LLF, SJN.]

Comparing the scheduling algorithms, it seems that SJN in conjunction with the Load Time Priority load balancer produces the best and most consistent results for both users and resource owners in this experiment, that is, when the battery usage is high.
SJN provides the longest sensor grid lifetimes, up to 80 minutes more than LLF when the threshold is 1.0, and is able to schedule the largest percentage of jobs. However, it has the drawback that it consistently extends the completion time for a set of jobs by the largest amount. This is mitigated when the chosen threshold value is about, such that the sensor grid lifetime is balanced by the corresponding increase in completion time.

EDF is an average performer when load balancing is introduced, but continues to produce consistent results. LLF would probably not be the choice of anyone but the resource owner most concerned with maximizing utilization. FCFS again produces average values for both load balancing algorithms, suggesting again that it may be “good enough” for general cases.

[Fig. 9. % Above Optimal vs Load Time Threshold. Series: EDF, FCFS, LLF, SJN.]

6 Conclusion

Sensor grids are a new and emerging field of research, with many important unresolved issues, one of which is the scheduling of sensor jobs. In this paper, we addressed the design of an integrated and flexible sensor grid scheduler. We adapted scheduling algorithms to suit the characteristics of sensor jobs, and these modified algorithms perform differently from their conventional counterparts. We also incorporated a load balancing scheme that caters to the limitations of sensor grids, and found that it has an impact on the sensor grid utilization and useful lifetime.

In conclusion, we have designed a flexible and tunable sensor grid scheduler and implemented it on a testbed that we have created. The various scheduling and load balancing algorithms were also characterized, so that the scheduler can readily select one that suits performance objectives such as sensor resource utilization, sensor job throughput, and sensor grid lifetime.
We plan to develop a dynamic scheduler that can adaptively select the appropriate scheduling and load balancing algorithms according to the sensor job requirements and the sensor grid’s resource status.

References

1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: A survey. Computer Networks 38(4) (2002) 393–422
2. Culler, D., Estrin, D., Srivastava, M.: Overview of sensor networks. IEEE Computer (2004) 41–49
3. Foster, I., Kesselman, C., Tuecke, S.: The anatomy of the Grid: Enabling scalable virtual organizations. Intl Journal of Supercomputer Applications 15(3) (2001) 200–222
4. Lim, H., Teo, Y., Mukherjee, P., Lam, V., Wong, W., See, S.: Sensor grid: Integration of wireless sensor networks and the grid. In: Proc. of the IEEE Conf on Local Computer Networks (LCN 2005), Sydney, Australia (2005) 91–99
5. Liu, C., Layland, J.: Scheduling algorithms for multiprogramming in a hard-real-time environment. Journal of the ACM (JACM) 20(1) (1973) 46–61
6. Oh, S.H., Yang, S.M.: A modified least-laxity-first scheduling algorithm for real-time tasks. In: The Fifth International Conference on Real-Time Computing Systems and Applications, Hiroshima, Japan (1998) 31–36
7. Shnayder, V., Hempstead, M., Chen, B.R., Allen, G.W., Welsh, M.: Simulating the power consumption of large-scale sensor network applications. In: 2nd International Conference on Embedded Networked Sensor Systems, Baltimore, MD, USA (2004) 188–200