INEN 689 – 602 PROJECT Nurse Scheduling Problem Elif Ilke Gokce Industrial Engineering Texas A&M University elifg@tamu.edu 1 Abstract Every hospital needs to produce repeatedly duty rosters for its nursing staff. Properly scheduling the nursing staff has a great impact on the quality of health care, the recruitment of nursing staff, the development of nursing budgets, staff and patient safety and satisfaction, and administrative workload. The scheduling of hospital personnel is particularly challenging because of different staffing needs on different days and shifts. The planned personnel schedule usually has to be changed to deal with unforeseen circumstances such as staff sickness and emergencies. In this project, a mixed- integer stochastic programming approach to the solution of the nurse scheduling problem is proposed. The problem is formulated as two-stage recourse model. The objective is to minimize unsatisfied staff in the first stage and the expected overtime cost in the second stage. The nurse scheduling problem will be solved by L2 algorithm. The aim of this study is to help health institutions to determine nurse schedules which would increase nursing staff, efficiency and satisfaction. Keywords: L2 algorithm, nurse scheduling, stochastic programming 2 1 INTRODUCTION Every hospital needs to repeatedly produce duty rosters for its nursing staff. The problem is of critical importance for a variety of reasons. Properly scheduling the nursing staff has a great impact on the quality of health care, the recruitment of nursing personnel, the development of nursing budgets, staff and patient safety, staff and patient satisfaction and administrative workload. The scheduling of hospital personnel is particularly challenging because of different staffing needs on different days and shifts. Unlike many other organizations, healthcare institutions need to be staffed 24 hours a day over seven days a week. In addition, in many hospitals, nurses are allowed to request preset shifts, while other nurses are scheduled around these pre-set shifts. Usually, nursing officers spend a substantial amount of time developing rosters especially when there are many staff requests. Because of time-consuming manual scheduling, and for various other reasons, the nurse scheduling problem (NSP) has attracted much attention. Personnel scheduling problems have been studied extensively over the past three decades (see survey papers by [3] [22] [7] [5] [12]). Personnel scheduling is the problem of assigning employees to shifts or duties over a scheduling period so that certain constraints (organizational and personal) are satisfied. It involves constructing a schedule for each employee within an organization in order for a set of tasks to be fulfilled. In the domain of healthcare, this is particularly challenging because of the presence of a range of different staff requirements on different days and shifts. In addition, unlike many other organizations, healthcare institutions work twenty-four hours a day for every single day of the year. Most nurse scheduling problems are extremely difficult and complex. [22], for example, say nurse scheduling is more complex than the traveling salesman problem. A general overview of various approaches for nurse scheduling can be found in [20] [10] and [8]. Early research ([25] [15][23]) concentrated on the development of mathematical programming models. To reduce computational complexity, researchers had to restrict the problem dimensions and consider a smaller size of constraints in their models, resulting in solutions that are too simple to be applied in real hospital situations. 3 The above observations have led to other attempts, trying to solve the real nurse scheduling problems within reasonable time. Besides heuristic approaches (e.g., [21]; [6]), artificial intelligence approaches such as constraint programming ([16]), expert systems ([10]) and knowledge based systems ([18]) were investigated with some success. Since the 1990’s, most papers have tackled the problem with metaheuristic approaches, including simulated annealing ([17]), tabu search ([11]) and genetic algorithms ([1]). A key feature of real NSP which is in the subset of the personnel planning problem is that the planned personnel schedule usually has to be changed (often at very short notice) to deal with unforeseen circumstances such as staff sickness and emergencies. Very little study dealing with the uncertainty that is inherent in the real world has been done and they are mostly based on fuzzy methodologies. However, stochastic optimization techniques will provide solutions that can be applied to the real world NSPs. The NSP involves producing a periodic (weekly, fortnightly, or monthly) duty roster for nursing staff, subject to a variety of hard/soft constraints such as legal regulations, personnel policies, nurses preferences and many other requirements that may be hospital-specific. These constraints can vary from one hospital to another while the objectives in scheduling can also vary. In this project, first part gives brief introduction to workforce assignment and studies in the literature. In the second part, nurse scheduling problem modeling methods are argued, and then Solution approaches to the nurse scheduling problem which are optimization approaches, heuristic approaches and AI approaches are described. Third part contains nurse scheduling problem definition and modeling formulation using stochastic integer programming formulation. In the fourth part, L2-Binary first stage algorithm which is used for solving stochastic integer programming model is defined. Part five gives brief description of exponential design. Data sets and implemented program code are also explained. In the last part, conclusion and the future work is found. 4 2 2.1 THE NURSE SCHEDULING PROBLEM Modeling the Nurse Scheduling Problem Parameters in the NSP may include the working shifts per week if night shifts are worked, preference costs of particular nurses working on particular shift pattern, working shifts per schedule if day shifts are worked, working shifts per schedule if both day and night shifts are worked, demand for certain grade of nurses on day and on night shifts. The NSP is commonly described by a nurse-day, a nurse-task and a nurse-shift pattern view ([9]). In a nurse-day view, the decision variables are defined in two different ways. In the first case; vij is defined for each nurse on each day, where i=1…N indexes the nurses and j=1…M indexes the days within a scheduling period. The domains of these variables consist of on-duty shifts and free shifts. On-duty shifts include any number of shifts per day. It is common to use only a morning shift (A) of eight working hours, an afternoon shift (P) of eight working hours, and a night shift (N) of eight working hours. Free shifts include day-off (O), vacation leave (VL), unpaid leave (UL), etc. In the general, when there are Z shifts per day, xij can take Z+1 possible values as the following: ⎧0 nurse i is off duty on day j ⎪1 nurse i works shift 1 on day j ⎪ xij = ⎨ ⎪# # ⎪⎩Z nurse i works shift Z on day j For 0–1 models, the decision variables are modified as xijk , where i, j are the same index as before and k=1… Z indexes the Z possible shifts in a day. ⎧1 nurse i works shift k on day j xijk = ⎨ otherwise ⎩0 In a nurse-task view, the decision variable is defined for each nurse in each shift as xis, where i=1…N indexes the nurses and s=1…Z indexes the tasks within a scheduling period. The shift defined in nurse-task view may not necessarily correspond to a day. 5 ⎧1 nurse i receives task s xis = ⎨ otherwise ⎩0 In a nurse-shift pattern view, the decision variable is defined for each nurse and for each shift pattern as xip, where i=1…N indexes the nurses and p=1…M indexes the shift patterns. ⎧1 nurse i works shift pattern p xip = ⎨ otherwise ⎩0 In literature, three types of objective function are used with NSPs. The first type is ∑ j∈F ( i ) pij xij → min!, where pij is the penalty cost of nurse i working on shift pattern j, xij is the decision variable with a nurse- shift pattern view and F(i) is the set of feasible shift patterns for nurse i. The objective is to minimize the total penalty cost for all nurses. Second type purposes to minimize the number of uncovered shifts. Last type purposes to minimize the cost of schedules and the penalty for violating shift balance. Two types of constraints commonly occur in NSPs: hard constraints and soft constraints. Hard constraints include coverage requirements; such as nurse demand per day per shift type per skill category while soft constraints are usually those involved with time requirements on personal schedules. The goal is always to schedule resources to meet the hard constraints while aiming at a high quality result with respect to soft constraints. Commonly used constraints are as follows: 1. Minimum/maximum nurse workload 2. Consecutive same working shift (minimum/maximum/exact number) 3. Consecutive working shift/days (minimum/maximum/exact number) 4. Nurse skill levels and categories 5. Nurses’ preferences or requirements 6. Nurses free days (minimum/maximum/consecutive free days) 7. Free time between working shifts 8. Shift type(s) assignments (maximum shift type, requirements for each shift types). 6 9. Constraints among groups/types of nurses, such as nurses not allowed to work together or nurses who must work together 10. Shift patterns 11. Other requirements in a shorter or longer time period other than the planning time period; such as every day in a shift must be assigned 12. Constraints among shifts; such as shifts cannot be assigned to a person at the same time 13. Requirements of different types of nurses or staff demand for any shift It is important to note that constraints 1, 3, 5, 6, 8, 11 and 13 are common in NSPs. Soft constraints can include balance in workload, assigning complete weekends and patterns enabling specific cyclic constraints. 2.2 Solution Approaches to the Nurse Scheduling Problem Solution approaches that have been proposed to solve NSPs are classified in three main categories: optimization-mathematical programming (MP), heuristics and artificial intelligence. 2.2.1 Optimization Approaches Optimization approaches are usually based on mathematical programming. In general, optimization using mathematical programming can be classified in three categories: single-objective mathematical programming, multi-objective mathematical programming, and mathematical programming-based near-optimal approaches. Single-objective MP involves maximizing a goal which is preferred by the decision-maker. The following studies have been done in terms of single-objective MP: 1. An algorithm with three stages is presented for NSP: generate a set of possible schedules which are seven-tuples of 0–1 depending on whether the day is off or on, formulate the problem as an IP, and produce a solution ([17]). 2. [26] proposed a model which aims to maximize nurses’ preferences, by considering the length of a work, rotation patterns, and requests for days off, and minimum numbers of nursing personnel of each skill class to be assigned to each day and a 4 to 6-week scheduling period. The problem is solved by a modified 7 Balintfy and Blackburns’ algorithm. In this two-phase algorithm, Phase I finds a feasible solution to meet various constraints, and Phase II improve the Phase I solution by maximizing individual preferences for various schedule patterns while maintaining the Phase I solution. 3. NSP is modeled as a large-scale mixed integer quadratic programming problem to minimize a shortage cost of nursing services for a period of three to four days subject to nursing skill class requirements, total personnel capacity constraints, integral assignment, minimum nurse requirements throughout the scheduling period and other relevant constraints. The problem is decomposed by a primal resource-directive approach into a 0–1 LP master problem, with smaller quadratic programming sub-problems ([25]). 4. [15] formulated the problem to minimize an objective function that balances the trade-off between staffing coverage and schedule preferences of individual nurses, subject to certain constraints on the nurses schedules. The constraints are divided into hard and soft constraints. The hard constraints define sets of feasible nurse schedules, while violation of soft constraints results in a penalty cost that appears in the objective function. A coordinate descent algorithm was proposed to find near-optimal solutions. 5. [13] presented a generalized 0–1 column generation model with a resource constrained shortest path auxiliary problem for NSP. The master problem finds a configuration of individual schedules to satisfy the demand coverage constraints while minimizing salary costs and maximizing both nurse preferences and team balance. A feasible solution of the auxiliary problem is an acceptable schedule for a given nurse, with respect to collective agreement requirements such as seniority, workload, rotations and days off. A new resource structure was defined in the auxiliary problem in order to satisfy complex collective agreement rules specific to the problem. 6. NSP which considers the case of two consecutive days off per week for each person. 7. NSP is modeled as an integer programming with 0–1 constraint matrix, and the IP was solved parametrically as a bounded series of network problems. 8 8. NSP which considers the case of 10 working days in a 14-day period with variable demands. However, multi-objective models appear to be more realistic. [2] proposed a goal programming model. Goals are minimizing staffing requirements, minimizing desired staffing requirements, meeting nurses’ preferences, and nurses’ special requests. This approach works in two phases. In the first phase, the nurses are assigned their dayon/day-off pattern for the two-week scheduling horizon by a goal programming model that allows for consideration of the multiple conflicting objectives inherent in the scheduling of a staff of nurses. The second phase makes specific shift assignment through the use of a heuristic procedure. The major advantage of the goal programming formulation of the nurse scheduling problem is the flexibility it permits in choosing priorities; it can take into account such factors as nurses' preferences and desired staffing requirements. In MP-based near-optimal methods is aimed to combine the MP and AI approaches. The problem was formulated as an approximate IP model. The IP problem is first solved and then its solution improved. 2.2.2 Heuristic Approaches For combinatorial problems, exact optimization usually requires large computational times to produce optimal solutions. In contrast, heuristic approaches can produce satisfactory results in reasonably short times. Heuristics used to solve NSPs is divided into two categories: classical heuristics and metaheuristics. In the recent years, metaheuristics such as Tabu Search (TS), Genetic Algorithm (GA) and Simulated Annealing (SA), have been proved to be very efficient in obtaining near-optimal solutions for a variety of hard combinatorial problems including the NRP ([3]). Classical heuristic approaches which have been widely studied in nursing literature were straightforward automation of manual practices. For instance, Greedy Shuffling. First of all, the problem is solved for the worst schedule and then it is improved by exchanging a part of this schedule with a part from another persons’ schedule. Many human-inspired approaches can be found in Greedy Shuffling type algorithms which work by calculating all the shuffles for all personnel and listing them with the highest cost benefit first. This is repeated as many times as possible. 9 Classical heuristic approaches have been proposed to help decision-makers select work patterns to provide the needed coverage for given skilled nurse classes on each shift, develop a basic pattern to meet shift and coverage constraints and meet required staffing levels. TS approaches have been proposed to solve the NRP. TS is a search that moves iteratively from one solution to another in a neighborhood space with the assistance of an adaptive memory that forbids solution attribute changes recorded in the short-term memory to be reused. In TS, a move, for example, can take on an assigned shift type from one nurse to another on the same day and a move not allowed if the person does not belong to the skill category required or if there is already an assignment for that shift type. In TS, hard constraints remained fulfilled, while solutions move. TS approaches have been proposed to ensure enough nurses are on duty at all times while taking account of individual preferences and requests for days off. GAs, which are stochastic meta-heuristics, have also been used to solve the NRP. In GA, the basic idea is to find a genetic representation of the problem. Starting with a population of randomly created solutions, better solutions are more likely to be selected for recombination into new solutions. In addition, new solutions may be formed by mutating or randomly changing old ones. For example, in NRP, for crossover and mutation, the best personal schedule from each of the parents can be selected, a random selection from the personal schedule of parents can be selected, or we can select the best events in a schedule. Some of the best solutions in each generation are kept while others are replaced by newly formed solutions. [13] used GA for a problem with multiple criteria where the concept of a Pareto optimality scheme is used for the evaluation of the multi-criteria objective function. [1] developed a GA approach to solve an NRP. Instead of working directly with populations of potential solutions and handling the constraints using penalty functions or repairs, they proposed an indirect approach in which the task of balancing optimization and constraint satisfaction is shared between a greedy heuristic and the GA. Individuals are represented by permutations of the available nurses and the heuristic is used to build schedules by allocating the nurses to their shifts in the given order. Memetic algorithms, which are viewed as hybrid GA, are a population-based approach for heuristic search in optimization problems. Basically, they combine local 10 search heuristics with crossover operators. [8] described a memetic algorithm that incorporates TS into a GA, using steepest descent for each individual. The results reported for the NRP are better than those obtained by a hybrid TS approach by [8]. This work has gone further in combining hybrid TS with evolutionary approaches. There has been some use of simulated annealing techniques for the NRP. For example, [22] presented a SA heuristic for shift-scheduling using non-continuously available employees. 2.2.3. AI approaches AI techniques have been used to solve NRPs modelled as a CSP [10] modeled the NRP as a CSP which was solved by a combined approach of look-ahead and intelligent scoring which determines which nurse is to be scheduled next and which shift satis.es most of the soft constraints. [1] adopt a PCSP model for the NRP. INTERDIP, which is their prototype system, supports semi-automatic creation of duty rosters and imitates certain aspects of manual planning to improve on the theoretical complexity of the problem, using a constraint package based on CHIP. The package includes linear equations, constraints over definite domains and boolean constraints. [15] modeled the NRP as a HCSP, where legal regulations are hard constraints and nurses_ preferences are usually lower level soft constraints. [15] reported a commercial system ORBIS which models the NRP as a HCSP with fuzzy constraints and inferred control strategies. ORBIS uses a B&B algorithm with constraint propagation and variable/value-ordering techniques to solve problems involving 250–1200 variables with on few minutes [10] Constraint logic programming languages have the advantage of describing constraint logic easily. [12] presented a non-cyclic scheduling system, namely Horoplan, whose algorithm is a constraint-based arti.cial intelligence approach implemented with Charme, which is a constraint-based programming language. [16] discussed an approach, which takes advantage of the declarative ability of Prolog language for the description of constraints, for incorporating the constraints to generalize the NRP. [18] combined constraint logic programming with case-based reasoning to reduce the search spaces further. As a commercial constraint-based package for the powerful C++ programming language, ILOG SOLVER has been widely used to solve the NRP, with the help of 11 heuristic techniques It should be noted that [10] used redundant modeling which increases constraint propagation through cooperation among different models for the same problem via channeling constraints. Knowledge-based search approaches have also been used to solve the NRP by Lukman et al. [14] and [10]. 12 3 PROBLEM STATEMENT AND MODEL FORMULATION Nurse scheduling problem is assigning nurse workforce to the shifts subject to a number of constraints such as time, demand and preference constrains. The objective is to minimize the total cost which consists of preference cost, overtime cost and unsatisfied demand cost. Inputs of the problem are the number of nurse types, number of each type of nurses. In this project, it is assumed that there are three types of nurses, and the number of these nurses are N1, N2 and N3 respectively. There are 3 kinds of shift on a day, 7am-3pm, 3pm-11pm and 11pm-7am. Each nurse has to take one day off in a week. This obligation requires that each nurse can work at 6 shifts in a week. Demand is in terms of number of nurses. The output is a feasible schedule which minimizes the total cost. The problem is formulated as two-stage recourse model as follows: Indices: k = 1, 2… N1 (nurse grade 1 index) l = 1, 2… N2 (nurse grade 2 index) m = 1, 2… N3 (nurse grade 3 index) j = 1, 2…21 (shift index) j = 1, 4, 7, 10, 13, 16, 19 corresponds to 7am-3pm shifts j = 2, 5, 8, 11, 14, 17, 20 corresponds to 3pm-11pm shifts j = 3, 6, 9, 12, 15, 18, 21 corresponds to 11pm-7am shifts g = 1, 2, 3 (nurse grade index) First Stage Parameters: N1: number of first grade nurses N2: number of second grade nurses N3: number of third grade nurses P1kj: preference cost of first grade nurse k working at shift j P2lj: preference cost of second grade nurse l working at shift j P3mj: preference cost of third grade nurse m working at shift j ADjg: average demand for g th grade nurses at shift j 13 First Stage Decision Variable: ⎧1 if the first grade nurse k is assigned to shift j as a grade g nurse x kjg = ⎨ o/w ⎩0 ⎧1 if the second grade nurse l is assigned to shift j as a grade g nurse y ljg = ⎨ o/w ⎩0 ⎧1 if the third grade nurse i is assigned to shift j z mj = ⎨ o/w ⎩0 Second Stage Parameters: ocg: over time cost for nurse grade type g or higher in a shift ucg: unsatisfied demand cost for nurse grade type g or higher in a shift ~ D jg : demand realization for nurse grade type g or higher at shift j Second Stage Decision Variable: ⎧1 if the first grade nurse i is assigned to shift j as a grade g nurse ox kjg = ⎨ o/w ⎩0 ⎧1 if the second grade nurse i is assigned to shift j as a grade g nurse oy ljg = ⎨ o/w ⎩0 ⎧1 if the third grade nurse i is assigned to shift j oz mj = ⎨ o/w ⎩0 ujg: unsatisfied demand for nurse grade type g at shift j 14 N1 21 N2 3 N3 21 ~ Min ∑∑ ∑ P1kj x kjg + ∑∑ ∑ P 2 lj y ljg + ∑∑ P3 mj z mj + E ( f ( x, y, z , D)) k =1 j =1 g =1 21 3 l =1 j =1 g = 2 m =1 j =1 Subject to: N1 ∑x k =1 kj1 ≥ AD j1 ∀ j = 1,2...21 (1) ∀ j = 1,2...21 (2) ∀ j = 1,2...21 (3) =6 ∀ k = 1,2...N 1 (4) =6 ∀ l = 1,2...N 2 (5) ∀ m = 1,2...N 3 (6) N1 N2 k =1 l =1 ∑ xkj 2 + ∑ ylj 2 ≥ AD j 2 N1 ∑x k =1 21 kj 3 j =1 g =1 l =1 m =1 kjg 3 ∑∑ y j =1 g = 2 21 ∑z j =1 N3 3 ∑∑ x 21 N2 + ∑ y lj 3 + ∑ z mj ≥ AD j 3 mj ljg =6 x kj1 + x k ( j +1)1 + x k ( j + 2 )1 + x kj 2 + x k ( j +1) 2 + x k ( j + 2 ) 2 + x kj 3 + x k ( j +1) 3 + x k ( j + 2 ) 3 ≤ 1 ∀ k = 1,2...N 1 j = 1, 4, 7,10,13,16,19 (7) ∀ l = 1,2...N 2 j = 1, 4, 7, 10, 13, 16, 19 (8) j = 1, 4, 7, 10, 13, 16, 19 (9) y lj 2 + y l ( j +1) 2 + y l ( j + 2 ) 2 + y lj 3 + y l ( j +1)3 + y l ( j + 2 )3 ≤ 1 z mj + z m ( j +1) + z m ( j + 2 ) ≤ 1 ∀ m = 1,2...N 2 x kjg ∈ {0,1} ∀ k = 1,2...N 1 ∀ j = 1,2,...21 g = 1,2,3 (10) y mjg ∈ {0,1} ∀ m = 1,2...N 2 ∀ j = 1,2,...21 g = 2,3 (11) z mj ∈ {0,1} ∀ m = 1,2...N 3 (12) ∀ j = 1,2,...21 Where for each scenario: N 3 21 N1 21 3 N 2 21 3 21 3 ~ E ( f ( x, D) = Min ∑∑ ∑ oc1ox kjg + ∑∑ ∑ oc 2 oy ljg + ∑∑ oc3 oz mj + ∑∑ uc g u jg k =1 j =1 g =1 l =1 j =1 g = 2 Subject to: 15 m =1 j =1 j =1 g =1 N1 N1 k =1 k =1 N1 N2 N1 N2 k =1 l =1 k =1 l =1 N1 N2 N3 N1 N2 N3 k =1 l =1 m =1 k =1 l =1 m =1 ∑ xkj1 + ∑ oxkj1 + u j1 ≥ D j1 ∀ j = 1,2...21 ∑ xkj 2 + ∑ ylj 2 + ∑ oxkj 2 + ∑ oylj 2 + u j 2 ≥ D j 2 (13) ∀ j = 1,2...21 ∑ xkj 3 + ∑ ylj 3 + ∑ z mj + ∑ oxkj 3 + ∑ oylj 3 + ∑ oz mj + u j 3 ≥ D j 3 (14) ∀ j = 1,2...21 (15) ox kj1 + ox kj 2 + ox kj 3 − x k ( j −1)1 − x k ( j −1) 2 − x k ( j −1) 3 ≤ 0 ∀ k = 1,2...N 1 ∀ j = 2, 3...21 (16) oy lj 2 + oy lj 3 − y l ( j −1) 2 + y l ( j −1)3 ≤ 0 ∀ l = 1,2...N 2 oz mj − z m ( j −1) ≤ 0 ∀ j = 2, 3...21 21 3 ∑∑ ox j =1 g =1 21 j =1 g = 2 21 ∑ oz j =1 mj (17) (18) kjg ≤1 ∀ k = 1,2...N 1 (19) ljg ≤1 ∀ l = 1,2...N 2 (20) 3 ∑∑ oy ∀ m = 1,2...N 3 ∀ j = 2, 3...21 ≤1 (21) ∀ m = 1,2...N 3 ox kjg ∈ {0,1} ∀ k = 1,2...N 1 ∀ j = 1,2,...21 g = 1,2,3 (22) oy mjg ∈ {0,1} ∀ m = 1,2...N 2 ∀ j = 1,2,...21 g = 2,3 (23) oz mj ∈ {0,1} ∀ m = 1,2...N 3 u jg ≥ 0, int ∀ j = 1, 2...21 ∀ j = 1,2,...21 (24) (25) g = 1,2,3 (1), (2) and (3) are demand constraints. (4), (5) and (6) provides that each nurse has to assign 6 shifts in a week. (7), (8) and (9) provides that each nurse has to assign as one type of nurse and one shift in a day. (13), (14) and (15) are demand constraints. (16), (17) and (18) provides that a nurse has to work on previous shift in order to assign as an overtime nurse. (19), (20) and (21) are the constraints of assigning each nurse only once as overtime nurse. (10), (11), (12), (22), (23) and (24) are binary constraints. 16 4 ALGORITHM 4.1. Definition L2-Binary first stage algorithm is used to solve the stochastic integer programming problem. The algorithm is as the following; Step 0: Initialization • Let ε ; 0 given • x 1 = {min c T x | Ax = b, x ∈ {0,1}n1 } • LB ← −∞ , UB ← ∞ , and k ← 1 • Compute L: o Alternative 1: At point x1 solve the linear relaxation of the second stage and take the expected value of linear relaxation objective values o Alternative 2: Observe the objective function of the second stage to come up with a valid lower bound for the expected value of the recourse. Step 1: Solve subproblems for all w ∈ Ω • ~ )] Solve all subproblems and obtain F(xk)= E[ f ( x k , w • ~ )] soln= c T x k + E[ f ( x k , w • Update upper bound by UB=max(soln, UB) Step 2: Update and solve master problem • Use L, xk and F(xk) to derive L2 optimality cut which is: η ≥ ( F ( x k ) − L)( ∑ x j − ∑ x j − S k + 1) + L where Sk is the set of variables j∈S k j∉S k in solution xk whose value is 1. Compute α k and β k such that the above cut is represented as βkT x +η ≥ αk • Add this cut to the master problem and solve master again and obtain 17 x k +1 = arg min{c T x + η | Ax = b, β t T x + η ≥ α t , t = 1,2...k x ∈ {0,1} } Step 3: Termination If UB − LB ≤ ε STOP, Else set k=k+1 and go back to step 1. 4.2. L2 Code Validation The code is tested with three data instance from literature. The instances are SSLP_5_25_50, SSLP_5_25_100 and SSLP_15_45_5. In these calculations, L value is taken as -50,000. Solution CPU Time (sec) Number of Iterations SSLP_5_25_50 -121.6 1.625000 33 SSLP_5_25_100 -127.37 5.969000 33 SSLP_15_45_5 -262.4 199.167 327 18 START INITIALIZATION SELECT ALTERNATIVE Alternative 1 Alternative 2 SOLVE SUBPROBLEMS UPDATE AND SOLVE MASTER PROBLEM N UB − LB ≤ ε ? Y STOP 19 5 5.1 COMPUTATIONAL RESULTS Experimental Design Experimental methods are widely used in research as well as in industrial settings, however, sometimes for very different purposes. The primary goal in scientific research is usually to show the statistical significance of an effect that a particular factor exerts on the dependent variable of interest. A well designed experiment will have the following properties; 1) A well defined objective, 2) The ability to estimate error, 3) Sufficient precision, 4) The ability to distinguish various effects by randomization and factorial design. In this work, the objective is to determine the change in total cost under different parameters. If the proposed stochastic programming model works right, as the parameters in the model change, the solution always becomes consistent. In addition to this, at the end of the experiments for different parameters, it is found that which parameter has the vital effect on the cost. Based on this result, modifications for better design can be done on the hospital system. The experimental design has the ability to estimate error. Experimental error is defined as observed – expected (find using model). In this work, the expected results such as total cost, nurse schedules are found as a result of stochastic model solution. If the model is applied on the real hospital, the observed values can be easily determined. Therefore, error estimate is calculated easily. The most common experimental design employed in this study is called a completely randomized design. This experiment involves a comparison of the means for a number, say k, of treatments, based on independent random samples of n1, n2, . . ., nk observations, drawn from populations associated with treatments 1, 2, k, respectively. After collecting the data from a completely randomized design, our goal is to make inferences about k population means where m i is the mean of the population of measurements associated with treatment i, for i = 1, 2. . . k. The null hypothesis to be tested is that the k treatment means are equal, i.e., 20 H0: m 1 = m 2 = . . . = m k and the alternative hypothesis is that at least two of the treatment means differ. An analysis of variance provides an easy way to analyze the data from a completely randomized design. The analysis partitions SS (Total) into two components, SST and SSE. These two quantities are defined in general term as follows: Recall that the quantity SST denotes the sum of squares for treatments and measures the variation explained by the differences between the treatment means. The sum of squares for error, SSE, is a measure of the unexplained variability, obtained by calculating a pooled measure of the variability within the k samples. If the treatment means truly differ, then SSE should be substantially smaller than SST. We compare the two sources of variability by forming an F statistic: where n is the total number of measurements. Under certain conditions, the F statistic has a repeated sampling distribution known as the F distribution that the F distribution depends on n 1 numerator degrees of freedom and n 2, denominator degrees of freedom. For the completely randomized design, F is based on n 1 = (k - 1) and n 2 = (n - k) degrees of freedom. If the computed value of F exceeds the upper critical value, F¥ we reject H0 and conclude that at least two of the treatment means differ. 21 5.1.1 Test to Compare k Population Means for a Completely Randomized Design H0: m 1 = m 2 . . . = m k [i.e., there is no difference in the treatment (population) means] Ha: At least two treatment means differ Test statistic: F = MST/MSE Rejection region: F > Fa where the distribution of F is based on (k - 1) numerator df and (n - k) denominator df, and Fa is the F value found in the table such that P(F > Fa ) = a . Assumptions: 1. All k population probability distributions are normal. 2. The k population variances are equal. 3. The samples from each population are random and independent. The results of an analysis of variance are usually summarized and presented in an analysis of variance (ANOVA) table. Such a table shows the sources of variation, their respective degrees of freedom, sums of squares, mean squares, and computed F statistic. Source of Variation Sum of Squares Degrees of Freedom Mean of Squares F F = MST/MSE Between groups SST k-1 MST/(k - 1) Within groups SSE n-k SSE/(n - k) Total SS(Total) n -1 Analysis of Variance Table for Completely Random Design 22 5.1.2 Data Used in Experimental Design As it stated above, the data used in experimental design is created randomly. There are three kinds of data instances. Each data set has two different kinds of scenarios. In the first data set, the total number of nurses in each data set is 20, 25, and 30 respectively. Classification of each nurses in each data set are as the flowing; N1 N2 N3 Total Data set 1 6 8 6 20 Data set 2 8 9 8 25 Data set 3 9 12 9 30 Preference costs for each nurse is an integer between 0 and 10. These values are generates randomly. For each of these values, one independent random variable is generated and it is multiplied by 10 and rounded to the nearest integer. The formula is as the following; | Pikj | = ⎡Rand() *10⎤ Data sets also contain weekly data for average demand which is denoted by ADjg. This is a 3x21 matrix, each column denote nurse type, and each row denotes data for a shift. As it is known, each day contains three shifts. Therefore there are 21 rows for weekly data. Average demand data is created randomly considering feasible constraints, such as each nurse should work 6 days, total number of nurse for each type should not be exceeded. Over time cost is determined as 15 for the first grade, 10 for the second grade and 6 for the third grade. Unsatisfied demand cost is the same as overtime cost for each nurse type, and these two cost values are fixed for each the data set. As it is mentioned above, for each three data set type, two scenarios sets designed. In the first of these scenarios, there are three kinds of scenarios. Second one contains five kinds of scenarios. Each of these scenarios are directly created from AD, such that: First of all, uniformly distributed random number between 0 and 1 are generated. If this 23 number is between 0-0.3, the value is decreased by 1. If it is between 0.3-0.7, the number stays the same. If it is between 0-7-1.0, it is increased by one. Each data set is named as nurse_a_b, a denotes the total number of nurse, b denotes number of scenarios. (See appendix for the data used in experiments) 5.2 Computational Results The experiments are run on P4 2.8Mhz 512MB ram computer. L value of the L2 algorithm is taken as the LP relaxation solution of the problem. First of all, each three problem with only one scenario is solved both using L2 Algoruthm and Cplex interactive Optimizer 9.0.0. Solutions are as the following; Total Number of Cplex Cost Cplex Time (sec) L2 Nurse L2 Time (Sec) 20 3.5200000000e+002 0.09 3.52 0.226000 25 3.3300000000e+002 0.03 3.33 0.452000 30 4.8300000000e+002 0.05 4.83 0.567000 The first graph shows that as the nurse number increases cost value found by Cplex and L2 stays the same. Second graph shows that L2 Computational time is higher than Cplex computational time. This difference comes from the data set and algorithm properties. Problem data are not dense, so Cplex can solve it. Cost 6.00E+02 5.00E+02 $ 4.00E+02 L2 3.00E+02 Cplex Cost 2.00E+02 1.00E+02 0.00E+00 20 25 Nurse 24 30 Time 0.7 0.6 Sec 0.5 0.4 L2 Time (Sec) 0.3 Cplex Time (sec) 0.2 0.1 0 20 25 30 Nurse In general, it can be seen hat L2 gives solution which is very close to the optimal. In LP Relaxation master problem is solved as MIP problem, and subproblems are solved as LP Problem. Total Number of L2 L2 # of L2 CPU LP_RELAX LP_RELAX LP_RELAX Iterations Time Nurse # of Iterations CPU Time nurse_20_3 244 2 0.265000 244 2 0.195000 nurse_20_5 259 2 0.328000 259 2 0.268000 nurse_25_3 414 3 0.563000 412.4 2 0.392000 nurse_25_5 438 3 0.875000 418.8 3 0.568000 nurse_30_3 523 2 0.687000 484 2 0.386 nurse_30_5 560 3 0.735000 558.4 2 0.534 It is important to note that, in L2 algorithm, C++ or Cplex sometimes takes 0.99999999 instead of 1. This affects the solution. Also, in L2 algorithm lower bound is important. If you choose different lower bound, you can find different solutions. 25 Cost 600 $ 500 400 L2 300 200 LP_RELAX nu rs e_ 20 _3 nu rs e_ 20 _5 nu rs e_ 25 _3 nu rs e_ 25 _5 nu rs e_ 30 _3 nu rs e_ 30 _5 100 0 Data set CPU Time 1 Sec 0.8 L2 CPU Time 0.6 LP_RELAX CPU Time 0.4 0.2 nu rs e_ 20 _3 nu rs e_ 20 _5 nu rs e_ 25 _3 nu rs e_ 25 _5 nu rs e_ 30 _3 nu rs e_ 30 _5 0 Data For nurse number 20, for these two scenario scenarios sets, L2 and LP_RELAX costs are the same, and CPU time is very close to each other. When the nurse number increases, difference between L2 and LP_RELAX CPU times increases. In addition this, cost values start to increase. The same result can be found for nurse number 30. However, increment rate of the CPU times stays the same. Each scenario is examined in the inconsistent environment. When the scenario number increases, the total cost value and CPU times increase. From the second graph above it can also be determined that if the nurse number increases, the algorithm becomes more complex. 26 6 CONCLUSION AND FUTURE RESEARCH In this project, a mixed- integer stochastic programming approach to the solution of the nurse scheduling problem is proposed. The problem is formulated as two-stage recourse model. The objective is to minimize unsatisfied staff in the first stage and the expected overtime cost in the second stage. The nurse scheduling problem will be solved by L2 algorithm. The nurse scheduling problem with one scenario is solved by both Cplex and L2 algorithms. The same solution is obtained, but as the problem is not complex, Cplex can solve it easily. Six data sets are designed and solved by both L2 algorithm and LP Relaxation. It is observed that the solution of L2 algorithm is closely related to the lower bound chosen. If a good lower bound is chosen, the solution close to the optimal solution can be found in a very short time. As the number of nurse increases the CPU time requires for solving problem is increases. In addition, when the number of scenarios increases, the problem becomes more complex. Therefore CPU time also increases. The following future studies can be done; • Finding the best lowerbound for L2 algorithm, • Solve the problem with the data instances from real world, • Add more constraints to the model, • The cut in L2 algorithm can be developed. 27 7 REFERENCES 1. Aickelin U., Dowsland K. A. , An Indirect Genetic Algorithm for a Nurse-Scheduling Problem, Computers and Operations Research, 31, 2004, pages 761-778 2. Arthur, J.L., and Ravidran, A., A Multiple Objective Nurse Scheduling Model, AIIE Transactions, 1981, 13, pages 55–60 3. Baker K., Workforce allocation in cyclical scheduling problems: a survey, Operational Research, 1976, 27: 155–167 4. Bard J.F.,and Purnomo H.W., Hospital-wide Reactive Scheduling of Nurses with Preference Considerations, IIE Transactions, 2005, 37, pages 589-608 5. Bechtold SE, and Brusco MJ, Showalter MJ, A comparative evaluation of labor tour scheduling methods, Decision Sciences, 1991, 22: 683–699 6. Blau R, and Sear A., Nurse scheduling with a microcomputer, Journal of Ambulance Care Management, 1983, 6: 1–13 7. Bradley D, and Martin J., Continuous personnel scheduling algorithms: a literature review, Journal of the Society for Health Systems, 1990, 2: 8-23 8. Burke EK, De Causmaecker P., Vanden Berghe G., and Van Landeghem H., The state of the art of nurse rostering, Journal of Scheduling, 2004, 7: 441-499 9. Cheang B, Li H, Lim A, and Rodrigues B., Nurse rostering problems – a bibliographic survey, European Journal of Operational Research, 2003 151: 447-460 10. Chen JG, and Yeung T., Hybrid expert system approach to nurse scheduling, Computers in Nursing, 11: 183–192 28 11. Dowsland K, and Thompson J.M., Nurse scheduling with knapsacks, networks and tabu search, Journal of Operational Research Society , 2000,51: 825-833 12. Ernst AT, Jiang H, Krishnamoorthy M, and Sier D., Staff scheduling and rostering: a review of applications, methods and models, European Journal of Operational Research ,2004, 153: 3-27 13. Jaumard, B., Semet, F., and T. Vovor, A Generalized Linear Programming Model For Nurse Scheduling, European Journal of Operational Research, 1998, 107, pages 1– 18. 14. Li L., Benton W.C., Hospital Capacity Management Decisions: Emphasis on Cost Control and Quality Enhancement, European Journal of Operational Research, 146, 2003, pages 596-614 15. Miller HE, Pierskalla W, and Rath G., Nurse scheduling using mathematical programming, Operations Research,1976, 24: 857–870 16. Okada M, Okada M .Prolog-based system for nursing staff scheduling implemented on a personal computer. Computers and Biomedical Research 21: 53–63 17. Rosenbloom, E.S. and N.F. Goertzen, Cyclic Nurse Scheduling, European Journal of Operational Research, 1987, 31, pages 19–23 19. Siferd S.P., and Benton W.C , Workforce Staffing and Scheduling: Hospital Nursing Specific Models, European Journal of Operational Research,1992,60, pages 233-246 20. Sitompul D, and Randhawa S., Nurse scheduling models: a state-of-the-art review, Journal of the Society of Health Systems, 1990, 2: 62-72 22. Tien JM, and Kamiyama A., On manpower scheduling algorithms, Society for Industrial and Applied Mathematics,1982, 24: 275–287 29 23. Trivedi VM, and Warner M A branch and bound algorithm for optimum allocation of float nurses, Management Science,1976, 22: 972-981 24. Vries T. D., and Beekman R. E, Applying Simple Dynamic Modeling for Decision Support in Planning Regional Health Care, European Journal of Operational Research, 5, 1998, pages 277-284 25. Warner M, and Prawda J.,A mathematical programming model for scheduling nursing personnel in a hospital, Management Science,1972, 19: 411–422. 26. Warner, D.M., Scheduling Nursing Personnel according to Nursing Preference: a Mathematical Programming Approach, Operations Research 1976, 24 (5), pages 842– 856 30 8 APPENDIX *.mod file param N1; param N2; param N3; set K:={k in 1..N1 }; set L:={l in 1..N2}; set M:={l in 1..n3}; set J:= {j in 1..21}; set G;#:={1,2,3} set H; #nurse grade 1 index #nurse grade 2 index #nurse grade 3 index #shift index #grade index param P1{K,J}; param P2{L,J}; param P3{M,J}; param AD{J,{1,2,3}}; param oc{G}; param uc{H}; param D{J,G}; var x{k in K, j in J, g in {1,2,3}}>=0,<=1, integer; var y{l in L, j in J, g in {2,3}}>=0,<=1, integer; var z{m in M, j in J}>=0,<=1, integer; var ox{k in K, j in J, g in {1,2,3}}>=0,<=1, integer; var oy{l in L, j in J, g in {2,3}}>=0,<=1, integer; var oz{m in M, j in J}>=0,<=1, integer; minimize PenaltyCost: sum{k in K, j in J, g in {1,2,3}} P1[k,j]*x[k,j,g]+sum{l in L, j in J, g in{ 2,3}} P2[l,j]*y[l,j,g]+sum{m in M, j in J} P3[m,j]*z[m,j]+sum{k in K, j in J, g in {1,2,3}} oc[1]*ox[k,j,g]+sum{l in L, j in J, g in {2,3}} oc[2]*oy[l,j,g]+sum{m in M, j in J} oc[3]*oz[m,j]; subject to AverageFirstNurseDemand {j in J}: sum{k in K} x[k,j,1]>=AD[j,1]; subject to AverageSecondNurseDemand {j in J}: sum{k in K} x[k,j,2]+sum{l in L} y[l,j,2]>=AD[j,2]; subject to AverageThirdNurseDemand {j in J}: sum{k in K} x[k,j,3]+sum{l in L} y[l,j,3]+sum{m in M} z[m,j]>=AD[j,3]; 31 subject to WorkingDayForFirstNurse {k in K}: sum{j in J , g in {1,2,3}} x[k,j,g]=6; subject to WorkingDayForSecondNUrse {l in L}: sum{j in J , g in {2,3}} y[l,j,g]=6; subject to WorkingDayForThirdNurse {m in M}: sum{j in J} z[m,j]=6; subject to shiftPerDayConstraintForFirstNurse {k in K, j in {1,4,7,10,13,16,19}}: x[k,j,1]+x[k,j+1,1]+x[k,j+2,1]+x[k,j,2]+x[k,j+1,2]+x[k,j+2,2]+x[k,j,3]+x[k,j+1,3] +x[k,j+2,3]<=1; subject to shiftPerDayConstraintForSecondNurse {l in L, j in {1,4,7,10,13,16,19}}: y[l,j,2]+y[l,j+1,2]+y[l,j+2,2]+y[l,j,3]+y[l,j+1,3]+y[l,j+2,3]<=1; subject to shiftPerDayConstraintForThirdNurse {m in M, j in {1,4,7,10,13,16,19}}: z[m,j]+z[m,j+1]+z[m,j+2]<=1; subject to ExcessFirstNurseDemand {j in J}: sum{k in K} x[k,j,1]+sum{k in K} ox[k,j,1]>=D[j,1]; subject to ExcessSecondNurseDemand {j in J}: sum{k in K} x[k,j,2]+sum{l in L} y[l,j,2]+sum{k in K} ox[k,j,2]+sum{l in L} oy[l,j,2]>=D[j,2]; subject to ExcessThirdNurseDemand {j in J}: sum{k in K} x[k,j,3]+sum{l in L} y[l,j,3]+sum{m in M} z[m,j]+sum{k in K} ox[k,j,3]+sum{l in L} oy[l,j,3]+sum{m in M} oz[m,j]>=D[j,3]; subject to OverTimeRequirementForFirstNurse {k in K, j in {2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21}}: ox[k,j,1]+ox[k,j,2]+ox[k,j,3]-x[k,j-1,1]-x[k,j-1,2]-x[k,j-1,3]<=0; subject to OverTimeRequirementForSecondNurse {l in L, j in {2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21}}: oy[l,j,2]+oy[l,j,3]-y[l,j-1,2]-y[l,j-1,3]<=0; subject to OverTimeRequirementForThirdNurse {m in M, j in {2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21}}: oz[m,j]-z[m,j-1]<=0; 32 subject to OvertimeDayForFirstNurse {k in K}: sum{j in J, g in {1,2,3}} ox[k,j,g]<=1; subject to OvertimeDayForSecondNurse {l in L}: sum{j in J, g in {2,3}} oy[l,j,g]<=1; subject to OvertimeDayForThirdNurse {m in M}: sum{j in J} oz[m,j]<=1; *.dat param N1:=9; param N2:=12; param N3:=9; param P1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 := 1 0 5 0 8 7 2 4 3 9 9 9 5 7 8 1 3 3 9 3 4 9 2 7 3 9 1 5 4 5 2 7 0 8 6 0 8 8 3 2 2 3 7 7 3 8 2 4 9 9 1 2 5 2 7 6 1 5 2 1 1 8 4 10 7 8 4 1 6 3 4 0 7 2 1 9 5 7 10 7 10 7 4 8 7 3 1 1 5 8 3 2 3 1 8 5 7 4 6 4 4 8 1 2 0 1 5 6 1 5 6 6 4 5 9 9 9 6 1 9 1 9 6 4 2 7 3 1 10 9 9 5 7 5 8 5 6 2 1 5 2 10 7 6 8 8 8 2 4 9 7 1 6 5 33 8 8 8 1 5 5 9 1 8 9 7 2 8 2 7 5 5 8 3 2 1 2; 9 7 3 9 9 6 3 9 8 0 7 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 := 1 5 8 9 7 9 4 6 0 6 9 4 1 7 4 3 1 4 0 3 2 6 2 1 7 1 8 7 9 8 8 7 2 10 5 5 2 1 10 1 6 1 3 8 3 8 1 6 8 6 3 1 8 3 0 1 7 5 0 3 10 8 1 5 1 5 4 0 9 8 3 2 5 4 8 6 8 7 9 3 3 6 5 9 5 0 4 0 5 5 10 7 2 5 5 1 9 8 4 6 1 4 7 4 4 10 10 4 0 0 6 6 1 3 2 0 3 7 6 4 8 1 2 10 0 10 5 1 0 1 2 6 7 5 0 1 3 8 7 2 3 1 1 1 9 3 1 2 4 3 6 3 6 10 8 2 8 0 5 5 9 9 0 5 2 5 5 2 10 2 6 7 10 10 3 9 9 5 8 6 1 9 7 6 3 5 3 1 8 5 5 1 4 2 2 6 5 8; 10 7 10 3 0 9 9 1 7 2 2 11 7 5 2 3 6 6 1 6 7 6 12 2 3 9 5 8 8 6 0 3 6 param P2: 34 param P3: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 := 1 3 5 3 3 2 6 1 2 0 2 1 0 5 3 2 6 3 8 4 2 0 2 3 6 5 10 2 2 0 1 0 3 10 1 4 2 6 3 3 1 2 4 3 3 8 7 2 5 2 1 1 5 8 1 4 10 2 9 2 3 9 5 2 7 7 4 7 1 2 5 9 0 6 9 10 10 7 8 3 3 1 4 8 6 2 5 7 5 2 5 4 3 7 9 6 5 3 0 6 4 8 2 9 3 3 6 10 2 3 6 9 7 9 8 3 5 7 1 8 6 8 4 9 10 2 2 4 5 8 7 4 7 8 10 7 5 7 5 9 2 3 8 3 3 2 7 2 1 9 2 4 1 5 8 0 5 4 9 0 9 9 9 s 4 8 6 9 9 1 5 3 3 8 3 2; 9 4 0 4 0 8 3 8 6 3 9 1 2 3:= 1 2 4 2 2 2 3 3 3 1 2 2 param AD: 35 4 3 4 3 5 3 4 2 6 2 3 1 7 3 4 2 8 3 4 3 9 3 3 2 10 3 4 3 11 3 4 2 12 2 3 2 13 3 4 2 14 3 3 2 15 2 3 3 16 3 3 3 17 2 3 3 18 3 3 3 19 2 4 3 20 2 3 2 21 2 2 3; param: G: oc := 1 15 2 10 3 6; 36 param: H: uc := 1 15 2 10 3 6; param D: 1 2 3:= 1 2 4 3 2 3 3 2 3 1 4 3 4 3 4 3 5 4 3 1 6 2 2 2 7 3 4 4 8 3 4 3 9 2 3 2 10 4 5 4 11 3 3 3 12 2 4 2 13 4 4 3 14 3 3 2 15 2 5 2 16 4 4 4 17 2 3 2 18 2 2 3 19 2 3 3 20 2 2 2 21 1 2 1; 37