“Follow The Sun” Workflow In Global Software Development: Theory, Modeling And Quasi-Experiment To Explore Its Feasibility Erran Carmel, Yael Dubinsky, and J. Mark Johnston Erran Carmel, Univ. of Maryland, Univ. Coll. USA ecarmel@umuc.edu Yael Dubinsky IBM Haifa Research Lab Israel dubinsky@il.ibm.com corresponding author: Erran Carmel submitted on Dec 17 2008 to The Impacts of Global IS Sourcing on Engineering, Technology and Innovation Management 22-25 March 2009 Keystone, CO, USA Keywords: Follow the Sun 24-hour development Quasi-experiment Time to market Global software development Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 2 of 21 “Follow The Sun” Workflow In Global Software Development: Theory, Modeling And Quasi-Experiment To Explore Its Feasibility Abstract Follow The Sun (FTS) is a special case of global software development. It is a special case in that there is a hand-off of unfinished work every day from one development site to the next -- many time zones away. This is markedly different from the majority of global technology work / offshoring in which coordination is necessary, but rarely at the level of absolute dependency on a daily basis. The objective of FTS is reduction in development duration (also known as time-to-market reduction, or cycle-time reduction). Time-to-market is most important in industries where products become outmoded quickly, Unfortunately, there have been relatively few successful cases of FTS in the field due to coordination difficulties. The first global software team specifically established to take advantage of FTS was at IBM in the mid-1990s. However, FTS was difficult to achieve and managers quickly resorted to traditional forms of offshoring to capitalize, instead, on wage differentials. In this article we present a foundation for understanding FTS including: a definition, a description of its place in the life cycle, a discussion of choice of methodologies that are likely to make it successful. We describe a multi-method approach to the study of FTS and present our findings from modeling and from quasi-experiments. First we present and develop a mathematical model of FTS benefits. The data suggest that unless the hand-off efficiencies are high then the duration reduction payoff for a third and fourth site of FTS is relatively small. We also present the outcomes of a set of exploratory quasi-experiments designed to test FTS and measure the speed of software work. The experiments were conducted with software engineering students performing semester projects. The FTS teams completed their work in similar or better duration relative to the control teams. This is in contrast to previous studies in which the software work of distributed teams takes longer in duration relative to co-located teams. We suggest that this happened-- in our case-- because of Agile implementation-- specifically the tight iteration and deadlines involved. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 3 of 21 Introduction Follow the Sun (FTS from hereon) is a rather intuitive idea: Hand-off work every day from one site to the next as the world turns (for example, USA to India). The impact is theoretically profound. One can reduce the development duration by 50% if there are two sites and by 67% if there are three sites. In spite of these tempting numbers, FTS has had few industry success cases and there has been little academic research into FTS. This article fills that void by presenting a multi-method examination of FTS. FTS (also called: 24-hour development and round-the-clock development) is a subset of global software development and shares with global software development the many issues and challenges of coordination, culture, and communication. However, in our understanding of FTS, it is uniquely focused on speed in that the project team configuration is designed to reduce cycle-time (also known as time-tomarket reduction, or duration reduction). We position our study as a foundation for studying the acceleration of software work in order to reduce time to market. Multi-method research strategy to investigate FTS Given the paucity of knowledge in both theory and practice, we have designed a multimethod research program to study FTS. In this paper we describe our first steps using methodologies #1 and #2 and in the future we plan to make use of the latter two. Our multi-method approach highlights areas that a single method can reveal relatively little about. 1. Modeling. Develop research frameworks for time separation and create mathematical and simulation models to better understand how to optimally architect FTS and how to evaluate its costs and benefits. 2. Quasi-experiments allow for refinement of research techniques and present lessons for practical FTS techniques. 3. Controlled lab experiments. See, for example, similar research in [15]. 4. Field-study of FTS. Highlight and learn from successes and failures in industry. We caution that there are very few of these. Why is time-to-market important? FTS is enticing to business in order to achieve time-to-market reduction.1 Time-to-market is the length of time it takes from a product conception until the product is available for use or sale [30]. Figure 1 depicts this construct. Time-to-market is most important in industries where products become outmoded quickly, such as, these days, mobile telephone handsets. Therefore time-to-market is critical for handset software. Time-to-market is also important for strategic information systems such as competitive e-commerce systems or innovative Supply Chain Management systems. There can be other considerations in producing systems more quickly such as avoiding contract creep. 1 Time-to-market reduction needs to be seen within the larger context of other competitive factors and project goals: One needs to make possible trade-offs between features, project cost, and quality [28]. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 4 of 21 Time-to-market Define Design Make Distribute Figure 1. Timeline of time-to-market, measured from inception to use/sale A desire for rapid development -- a sense of urgency -- is shared by most firms and projects in a competitive marketplace, But most efforts to reduce project duration are reactive: hurry-up tactics, such as overtime; work speed-up; setting aggressive deadlines; and adding personnel. Hurry-up, speeding up, or "stepping on the gas," means doing what needs to be done -- only faster. This is generally not recommended because of fatigue [26]. Adding personnel is of little interest in software because of the wisdom gained from the seminal Brooks' Law [3] ("adding manpower to a late software project makes it later"). FTS represents a globally dispersed proactive way to reduce time-to-market. It is centered on a high awareness of achieving this goal [30]by rethinking development processes. In summary, time-to-market reduction achieved by using FTS requires a deliberate design around the objective of speed. Time-to-market is an important area of inquiry because it is relatively understudied in the disciplines of MIS and software engineering. In software engineering/ global software development there has been some tangential interest such as [19] where multi-site software teams took longer than in co-located teams. Meanwhile, our Management Information Systems (MIS) literature has devoted some attention to investigating the time domain but has largely focused on subjective perceptions of time [29] rather than approaches to increasing speed. Why is FTS so difficult in practice? Globally distributed development adds difficulties to productive work. The frenetic pace of development will tend to pull apart teams by Carmel’s [7] five centrifugal forces of global software teams: loss of communication richness, coordination breakdown, loss of ‘teamness’, cultural differences, and geographic dispersion. FTS is even more difficult and quite uncommon because the production teams are sequentially handing off work-in-progress (unfinished objects). The production objects require daily “packaging” so that the task is understood by the next production site without synchronous interaction. There are times when the next production site needs more information and cannot proceed fully without clarification. When clarification is required then an entire day may be wasted because the previous site has already gone home. Sometimes a misunderstanding leads to re-work (classified vulnerability cost [14]). FTS literature The first global software team specifically set-up to take advantage of FTS was at IBM in the mid-1990s and is documented extensively in [7]. This team was set up from inception to work in FTS. It was spread Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 5 of 21 out in 5 sites across the globe. However, FTS was unsuccessful. It was uncommon to move the software artifacts daily as had been hoped. Since FTS was not working the managers gave up on that objective though they continued global distributed software development. We note that some have claimed successful FTS, but on closer inspection, while these projects were indeed dispersed, they did not practice FTS [35][34]. Cameron [4] has claimed some limited success at FTS at the global American firm EDS (now HP). Hawryszkiewycz & Gorton were the first to examine FTS in a series of small controlled experiments in the 1990s [16] -- though they did not continue their line of inquiry. In recent years others have written about the promise and failures of FTS [17][18][11][2]. Contrary to myth, Indian offshore firms do little, if any, FTS [8]. With rather limited progress in the last decade, recent FTS literature has moved in another interesting trajectory: mathematical modeling of FTS [13][24][31][32][33]. We will survey to two week each one of these papers in the next section. It is worthwhile noting at the outset the each of these studies uses somewhat different approaches and assumptions. All of them focus, to some extent, on the issue of speed, while making critical assumptions that deal with the hand-over and coordination issues. By expertise Functional expertise B Site 1 Site 2 Integration Functional expertise A Site 2 Integration Site 1 By product Component A Component B time time time handoff handoff handoff handoff Site 2 handoff Site 1 handoff Site 2 Follow The Sun handoff Site 1 handoff By phase time Figure 2. FTS compared to other globally distributed configurations. The common globally distributed configurations involve decomposing tasks by expertise, by product (or subproduct or module), and by phase (see Figure 2)[7][19]. While central to FTS is the sequential handoff of work from one site to the next, paradoxically the vast majority of global teams do the utmost to do the exact opposite - to reduce the number of hand-offs as much as possible. Note that the other three approaches attempt to minimize hand-offs. FTS definition In light of the above, we propose a formal definition of FTS. We claim that software development FTS means satisfying all 4 of these criteria: 1. Production sites are far apart in time zones. 2. One of the main objectives is duration reduction/ time-to-market reduction. 3. At every point of time there is only one site that owns the product. (We must specify this criterion in order to make the dependency unambiguous). Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 6 of 21 4. Hand-offs are conducted daily, where a hand-off is defined as a check-in of a work unit that the next site is dependent upon in order to continue. (We specify this in order to make the dependency unambiguous). Therefore, FTS is defined as: A type of global knowledge workflow, designed in order to reduce project duration, in which the knowledge product is owned and evolved by a production site and is handed-off daily to the next production site that is many time-zones apart to continue that work. We also state one key assumption and several other points. The key assumption is that each production site works during their day because most people around the world tend to work during the day and sleep at night. Related to this is an assumption that each production site works as a team and coordinates inside the team. Some other foundational points and assumptions: In software work there is one common repository. This allows all sites to “commit” the code/objects at the end of the workday. Exception condition: the unit of work can be sometimes = zero, in cases of holidays etc. While work is usually passed on daily, this cannot happen for every team every day. A team can consist of one or more members. Since Time To Market is a major goal then it is measured using objective units and possibly benchmarked. Our definition is consistent with other, broader definitions of global software development. For example, it assumes full cooperation between production sites, as defined by [21] . Our careful definition allows us to expand our thinking of FTS. We can all envision FTS with 2 or 3 sites. Assuming 6 hours of intensive software development per day, it is possible to do FTS with even 4 sites spread out across time zones of the globe. It may even be beneficial. We assume that each team will work intensively for 6 hours and during the beginning and end of day will perform other admin/work tasks. FTS is the least common of the four global work configurations. Partially because of its rarity, we have noticed confusion about FTS in industry and we can correct it with our definition. The two most common misconceptions are that FTS is: a. Global knowledge work in which knowledge workers are working and coordinating around the world. However in most cases these knowledge workers have little dependency and do not pass on work in order to reduce duration. Global knowledge work only satisfies criterion #1. Therefore this is not FTS. b. 24h work/ shift work (3 shifts a day). For example, a call center system routes calls to workers as the earth revolves. However in most cases these knowledge workers have little dependency with each other and do not pass on work in order to reduce duration. Shift work only satisfies criterion #1 and #3. Whom Therefore this is not FTS. What methodology is best for FTS? Those who have examined FTS closely have recognized the importance of selecting and adapting a software development methodology for the special needs of daily hand-offs. FTS requires very careful Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 7 of 21 considerations of methodology and process. Cameron at EDS crafted a special methodological adaptation [4]. Similarly, IBM’s classic FTS team of the 1990s constructed a unique organization structure and process [7]. Some practitioners have told us that waterfall-like approaches seem best for FTS. Others have proposed other iterative models. We examine each of these. Let’s begin by examining the generic software development lifecycle (SDLC) phases. While some phases can work easily for FTS, at least over short periods (see Figure 3), others do not. Testing has worked well in FTS for many companies. One team searches for bugs, documents these in the bug database, which is then accessed and worked on by the software team at another site many time-zones west. Testing works well with FTS because the hand-off is structured, granular-- and with trained staff-- will usually not suffer from miscommunications. Prototyping is another specific phase that suits FTS. Rapid Prototyping Define Design Test & Fix Code Test •Maintenance •Helpdesk Integrate Figure 3. Generic waterfall SDLC. Notation above the arrows points to specific phases that are a good fit for FTS The work peculiarities in each of these phases mean that the theoretical advantages of FTS (reducing time-to-market by 50%), tend to show up in isolated phases. If one manages to perform FTS in one specific phase, it does not mean that those techniques are transferable to the other phases. Thus, if only testing used FTS successfully then overall project duration is reduced by only 12.5% rather than the ideal 50% (see Figure 4). Define Design Code 10% 30% 25% Typical percent of schedule Test Integrate 25% 10% Figure 4. Typical percent of project schedule duration in each SDLC stage We considered several iterative development methods such as RUP and Agile [1][20] and selected the Agile approach as most promising for FTS. We note that other iterative approaches can handle daily hand-offs, e.g., RUP and specifically Agile RUP. Later, we describe our first FTS exploratory quasiexperiments using an Agile approach. There has been some research on synthesis of distributed global teams and the Agile approach [e.g., [11][36]]. In the case of FTS, examining the SDLC when using the Agile approach shows that in every agile iteration all software development activities (define, design, code, test, integration) are performed for a smaller scope which is usually feature-based (conceptually described in [1][[20]]). The iteration length varies from two weeks to four weeks depending on the specific agile method that is used. We argue that in FTS agile environment, testing of small portions along the way should lead to a duration reduction of at least 12.5% relative to waterfall-like approaches and possibly more if we take into account Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 8 of 21 the accumulative effect of testing on the development process and software quality (stability, maintainability, etc.). The Agile approach has a key characteristics that assists in structuring FTS settings: Support daily handoffs. Continuously integrating using an automated integration environment enables each team to develop in its own code-base in its own time period. Yet, each team maintains an updated testable code base to be used by the next production site. The policy of keeping the integration green (all test pass) at the end of the work day is common in Agile teams and nicely fits with FTS requirements. In summary, we note that once we move beyond broad generalities about researching FTS we must make choices regarding configuration and methodology. In our research we try to remain neutral to the extent possible and when we choose configurations and methodologies and we try to make these clear. In the case of our theoretical FTS model, our model was largely neutral to configurations and methodologies. In the case of our FTS quasi-experiment -- this is based on agile programming approaches. Research Method #1: Modeling The research of FTS lends itself to modeling because of the intensive workflow nature of the problem domain. Modeling (and its close cousin simulation) is a research method that enables us to surface tradeoffs and investigate different design configurations; furthermore, modeling allows the investigation of feasibility. As we shall see here, a basic model of FTS quickly surfaces the difficulties in harvesting its benefits. Researchers have begun to model FTS only recently, so at the end of this section we summarize and compare similar research efforts by others. There are several ways to define how efficient production and operations are, and so far FTS investigators have used many of them. Our key contribution is that our model standardizes the language describing duration, cost, and quality by using non-dimensional parameters. Main model assumptions As with any modeling approach, simplifying assumptions must be made. These can be relaxed at a later point by making the model more complex. Notice that as we relax any of these assumptions, however, FTS becomes less attractive. Some of the other models that we survey later were not explicit about simplifying assumptions and may have presented a somewhat misleading picture of their findings. And a note on terminology: here we use the term “shift” – which when it is time zone separated is usually referred to as a “site.” Using the term shift in the model is more precise since location becomes irrelevant. Our FTS assumptions: Personnel are interchangeable. This is an assumption common to most other models. This assumption is not as far-fetched as one would first think; there are quite a number of software development organizations that have set up mirror sites. Mirror sites means that groups with similar distributions of programming skills are created in time-dispersed locations (similar to assumptions of [21]). Another approach that is similar in spirit is Agile paired programmers. No absenteeism. If one of the nodes is operating at less than full capacity because of sick days or vacation, then everything gets delayed. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 9 of 21 No parallel work, just sequential. Recall that the essence of FTS is sequential work, whereas parallel work incorporates elements of expertise/ product configurations (illustrated in Figure 2). No phase-based work. Agile-like teams are involved in all SDLC phases including testing. There is no separation by lifecycle stage. There are no hand-off failures. A hand-off failure occurs when the receiving site fails to understand the work-in-progress and instructions sent by the previous site. Handoff overhead is fixed. In reality, in FTS, as one goes to 3 or even 4 sites separated by time zones, there is increased amount of information that needs to be conveyed, because there is a needs to transfer the knowledge of what all the other previous sites have done. Hand-off effort is equal for sender and receiver. In other words, it takes one unit of time to prepare and process for the sender. And this time unit is identical for the receiver to process and comprehend. The core model There are several ways to define how efficient operations are, and so far FTS investigators have used many of them. In this section we present the main parameters of our model focusing on time and coding efficiencies. We are proposing to standardize the language describing duration, cost, and quality by using non-dimensional parameters. Also, understanding the relative effectiveness of time utilization is easier when using well-defined non-dimensional comparisons. These comparisons can be thought of as comparing ratios of similar qualities (“apples to apples”). The primary non-dimensional comparison is overall time efficiency. It is defined here as the total time of coding work, Tcoding, divided by total actual time, Tactual, passed from the start of the project. (Duration) Time Efficiency ≡ (1.0) In order to be able to compare coding efficiency, which is what is being paid for, coding time efficiency also needs to be computed. This allows for comparisons of different sites or groups. Coding Efficiency ≡ (1.1) The times can be related by: Tactual = Twork + Taway Twork = Tcoding + Toverhead (1.2) There are two kinds of overhead. There is non-task related overhead, such as filling out time sheets, or fixing the copier. This overhead is labeled as Toverhead. There is also overhead inherent to the task which is included in Tcoding. This includes design, testing, meetings or conversations to discuss the project, as well as time wasted on misunderstandings or work stoppages due to lack of direction or coordination. The two efficiencies are related: Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 10 of 21 (1.3) This ratio has encouraged FTS projects since FTS is a sustainable way to increase time at work and thus increase the overall efficiency of time use. And by comparing the overall efficiency of two different approaches the relative merit of each can also be expressed in a non-dimensional format. The relative overall effectiveness of two approaches is the ratio of one to the other. (1.4) is analogous to Cameron’s compression or improvement factor [4]. The effectiveness parameter’s subscript indicates which ratio is being taken, with the first letter indicating total time (T) or coding time (C) and the next two numbers indicating the numerator and denominator respectively. The effectiveness, , will be greater than one when the first approach has a higher efficiency. This result is analogous to the ratio of work-hours that other researchers have used (see Table 2). Coding effectiveness may also be computed. (1.5) The effectiveness is greater than one. when the first approach is a single site and an FTS team is the second approach The overhead costs of the coding tasks are included in the overall time so, is contained in . When equals one then, behaves as the simple predictor of using additional effort (person-hours) to increase Tcoding. This change is illustrated by comparing a first approach utilizing one shift of Tcoding of 8 hours to a second approach utilizing three shifts (= three sites) to obtain a Tcoding of 24 hours. This implies that is a third and this means that only a third of the elapsed time is used for coding. There are several sources of inefficiency that have been attributed to FTS: (1.6) Here Carmel’s five Cs [7] are assumed to be uncorrelated. Given these factors, and the data in the literature, it is unlikely that will equal one for FTS. Our modeling technique can be refined to relate individual FTS shift programming efficiencies and overall project time estimates, but for this discussion we are presenting the overall relationship of FTS and single site and so we are going to bundle these effects. One major piece of the detailed model represents the FTS coding inefficiencies due to handoff losses. These losses are due to communication during the shift overlap and time spent during coding time to understand the previous work. For simplicity, in this discussion, we bundle the hand-offs, the variation in site skills and hand-off variations into an average hand-off efficiency, . It is expected that projects with more sites, less trust, or larger cultural differences would have lower efficiencies than fewer sites, more trust, or smaller cultural differences [31]. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 11 of 21 Working through an example Let us now use the model to inspect sensitivity vis-à-vis number of sites/shifts, speed, and hand-off efficiency (Figure 5). The simple pure case of a single shift of co-located work is shown by the square. The Y axis shows the baseline duration of 1. The Y axis indicates the relative speed-up in duration. Thus, the Y horizontal line of 2 indicates that project duration is cut by half. Next, let’s examine the point marked with a circle. Assume the following topology: the project is made up of four work sites spread equally away from each other (6 hours apart). Each shift is eight hours in which there is a net 6 hours of coding time per shift. This is 2 hours of overhead activities, toverhead, (project time is denoted with a capital ‘T’ and shift time is denoted with a lower case ‘t’). Next, suppose that it takes 5% of the work duration, 18 minutes, to exchange information updating the next shift about the new work since their last shift. Each programmer would need to learn the work of three other shifts comprising 18 coding opportunity hours. If only information exchange losses are considered, then coding time would be the total shift work time, twork equals 6 hours, less 18 minutes receiving information and also less 18 minutes disseminating information for a total handoff time, thandoff, of 36 minutes. The average hand-off efficiency for shifts in is 90%, as 5% of each shift is lost to each of the two information transfers per shift. The overall efficiency, , of the four shifts is 90%. If the project were to require 1,000 coding hours for a single site approach (ηC=1.0, no handoff losses), The FTS team would require 1,111.1 hours of effort (more than single site due to communication inefficiencies). But the duration time for the FTS team would be 46.3 days while the single site, producing 6 hours of coding time per day, would require 167 days. The comparison ratios indicate the single site coding efficiency is somewhat higher, εC41=97%, but that the time effectiveness is higher for the FTS approach, εT41=3.6 (or in 28% of elapsed clock time). The person-hours cost premium for the speed premium is 11%, or for each 1% increase in speed, the cost increases by 0.04%. 4 3 εTx1 - Effectiveness Shift s 1 2 2 3 4 1 0 70 80 90 Handoff Efficiency (communication losses) % 100 Fig 5: FTS time effectiveness relative to single site work, shown by the square. The four site example is indicated by the circle. Note, the horizontal axis starts at 70%. All sites/shifts have two hours of overhead time. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 12 of 21 Comparing FTS with other ways to accelerate projects Several methodologies / management tactics may now be compared in order to investigate time to market reduction. The seven we considered are briefly described. Approaches 1,2,3 are common approaches. Approach 6 is basic “pure” FTS. Approaches 4, 5 and 7 we made up to illustrates fuller utilization of the clock time. 1. Baseline: single site effort. Here developers work eight hour days and are assumed to be effectively programming six hours a day. This is a sustainable work level for the programmers. 2. “Hurry up” = Baseline + overtime. This is a doubling of task-related work by working extra hours per day + one weekend day + neglecting some non-task overheard. This is a short term “hurry up” approach, that for long term efforts is not sustainable. 3. Paired programming: Two programmers are assigned to each task. They work on the code together and check-in only occurs when both agree that the task is complete. (Presumably productivity is higher, but this is not measured here). 4. Shifted paired programming: Similar to paired programming, but organized to eliminate time away from work—by working 7 days per week. Here, the two programmers have some overlapping time but cover for each for planned time off: weekends and vacations. For instance one programmer works Sunday through Thursday while the other programmer works Tuesday through Saturday.2 5. Shifted paired programming + overtime. 6. FTS. This approach is consistent with our FTS description previously 7. FTS with shifted paired programming: This combines single site reduced time away from work and potentially complete day utilization. Utilizing the non-dimensional framework the efficiencies of these tactics are listed in Table 1. Assumptions of table 1 are follows: Handoff efficiency is set to 100%. The annual time efficiency assumes fifteen vacation days and ten holidays. The percentage of days where work actually occurs is listed in the last column (this assumes that two of the sites have mutually exclusive holidays, for instance the US and India). In summary, Overall Time Efficiency when assuming: C 100% ,coding overhead 0 Inspecting Table 1 we find the following: The baseline approach (Row 1) uses a dismal 16% of the time available over 52 weeks. Row 2 illustrates doubling the duration efficiency. Shifted paired programming (Row 4) produces more coding time than the baseline but less than a rather traditional “Hurry up” approach. The FTS cases (Rows 6 and 7) are the most effective, with the combined FTS and shared programming approach achieving a weekly (no holiday) time efficiency of 100%, which is equivalent to a 91% annual time efficiency. This approach provides the benefit of at least one shift of work occurring every day of the year, without any overtime. Two approaches, the shifted paired programming and four site FTS approach reduce duration by nearly six times. The Hurry-up approach is quite effective, even relative to simple 2-site FTS approach. Doubling the FTS complexity to four sites does provide double the benefit relative to the Hurry-up 2 It might also be possible to time shift the overlap during a single day, but that is not assumed for this study. There is no literature on this approach, so no direct data exists to indicate that expert programmers would find this more acceptable and thus the approach would be sustainable. However, it does start to approach the open source model in the combination of individual and collective work modes. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 13 of 21 Overtime Tcoding Total Hours per shift hours per shift Approach 1 Baseline 2 “Hurry up “ - Baseline with overtime 3 Paired Programming 4 Shifted Paired Programming 5 Shifted paired programming with overtime 6 FTS Toverhead Days, per week approach. However, once we introduce hand-off overhead, as we did in Figure 5, the extra benefits of FTS are not as pronounced, especially when one considers the presumable large increase in cost and complexity. The old-fashioned Hurry-up approach is still relatively successful. Num of sites NS T T weekly (%) annual (%) Days when work occurs (%) Duration efficiency 0 3 2 1 6 10 8 11 5 6 1 1 18 36 16 32 63 77 0 0 2 2 6 6 8 8 5 7 1 1 18 25 16 23 63 86 3 1 10 11 7 1 42 38 97 5 5 5 7 7 2 3 4 3 4 36 54 71 75 100 32 47 63 69 91 0 2 6 8 0 2 6 8 0 2 6 8 7 FTS with shifted pair 0 2 6 8 programming 0 2 6 8 Table 1. Time efficiencies of seven development approaches. Comparison to other FTS models in the literature In Table 2 we summarize and compare somewhat similar modeling efforts by others. As we noted before, researchers have begun to model FTS only recently. Interestingly, the five studies described here are complementary in a number of respects: they use different methodologies, assumptions, and objectives. Of these studies, only studies #3 and #4 are similar in inquiring, as we have done, whether FTS is beneficial. All these studies seem to be either in their early stages or have not been continued in the broader context of studying global software development. 71 100 1) Jalote & Jain 2006 [24] 2) Sooraj. P. & Mohapatra 2008 [32] Objective Method Outcome measures Key model characteristics Testing the model Results: derived equivalent Shift How to allocate tasks to individual nodes in FTS/GSD. Critical path, network optimization Minimize project duration * Parallel and serial tasks * Three site, 8 hour shifts Applied the model to 2 projects at Infosys in a very limited way Efficiency And Duration efficiency ηT FTS reduced duration by 8-12% (check this). Based on small data set, retrospective analysis. εT = 90%. How to allocate geographical sites in FTS.. General sequential model; graphical and computer model. Minimize project duration Hypothetical analysis, no test data; N/A Mathematical model Minimize project duration * Inputs include detailed site data (geog location, work hours, productivity, wages, cost); general site factors. * Assumes overlap between sites for handoff. * Ignores hand-off problems. * Assumes variable overlap factors. *General factors can be input: single site development time, sites available, communication times, number of developers, concurrency level * Assumes overlap between sites for handoff. *Allows for sequential and parallel tasks * Both sequential and parallel tasks * 3 types of overhead Graphical sensitivity analysis Test of Model * Resources are interchangeable * Distance * Time Zone * Culture * Language * Distribution overhead * Member Familiarity * Team meeting Model indicates that FTS communication problems can make FTS slower than single site. Model not calibrated to test data but consistent with previous global software development factors Dyad only using three elements: Production costs, coordination costs, and Vulnerability costs Simple simulation to examine feasibility 5) Espinosa & 4) Setamanit et al Carmel [14]l 2007 [31] 2003 3) Taweel & Brereton 2006 [33] Explore FTS sensitivity to overhead and handoffs. Evaluate several GSD strategies, including FTS Explore feasibility of GSD modeling Discrete Event Simulation of system dynamics with interaction effects Mathematical model Minimize project cost Comparison of duration for four GDS approaches: single site, module based, phase based, and FTS N/A Small quasi-experiment: Ns ηT 1 33% 2 53% 3 84% 4 76% Empirical Study of 2 sites showed ηT of 51% Ideal Realistic ηT ηT Model Ns fts 2 47% 24% single 1 33% 33% 1.67 41% 38% 1 34% 27% module phase N/A Table 2: Brief review of similar research in modeling FTS and global software development strategies. GSD= Global Software Development. Table includes derived efficiencies when possible. Research Method #2: Quasi-experiments In this section we describe our two FTS quasi-experiments. A quasi-experiment is also called a field experiment. A quasi-experiment is a research design having some -- but not all -- of the characteristics of a laboratory experiment (cf.[4]). Quasi-experimental approaches have been used in other studies of global software development. For example Yadav, et al [35] set up paired teams of students in the US and India on a requirements and development task in order to assess, somewhat similar to our underlying theme, the difficulty of teaming on software across distance and time. Our experiment is focused on measuring speed in software development work conducted by distributed teams working according to the Agile software development approach. We describe our design in a manner consistent with [21]. Phase 1 of this stream is described in [10]. Experiment goals: investigate FTS in an Agile environment. Specifically, measure duration in order to test the central premise of time-to-market reduction. Experimental design Experiment participants and setting: Subjects were experienced computer science and electrical engineering students at an elite university, all between ages 20-30. Most of the students were working part time during their studies as developers and testers in large-scale enterprises. The experiments were conducted as part of practicum course in Computer Science in two successive semesters of 14 weeks each. This course is a project-based capstone course in which the project is the principal component of the grading, so students are motivated to do well. During the entire process of development there was an active electronic website for course and team communication. The following Agile practices were integral to the project and student learning: time boxing, testing, continuous integration, and constant customer availability for requirements clarifications. The general approach of a semester project, including use of Agile methods, was not new [12]. It had been guided by one of the authors every year since 2002. The only new dimension, this time, was to impose a FTS approach on one of the student teams. Experimental Design: In each experiment, there were two project teams which were measured: one FTS team and one control team (CO from hereon)3. Thus, we had two data points each time.4 Experimental task: In the first experiment the teams were required to build a simulator of processes and threads scheduler in the Unix operating system. In the second semester the teams were required to build a monitor and recommender system that identify problems with the system resources (memory, CPU and IO devices), alerts users and recommend possible actions. Thus, in each of the semesters, both experiment and control teams developed a software project on the same subject and same functionalities. 3 In the first experiment, there were 8 students in the FTS team and 7 students in the CO team. In the second experiment, there were 4 students in the FTS team and 4 students in the CO team. 4 In both experiments we examined a specific Agile iteration that was not the first Agile iteration in the course (the first iteration is dedicated to learning the subject and the development environment). Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 16 of 21 In both semesters, there was no programming language that was enforced. However, there was a requirement to use a CVS5 (first experiment) or a SVN6 (second experiment) server for the integration build. Special Rules for the FTS team: In both experiments, while the CO team worked in a typical student manner without any interaction constraint, the FTS teams were split into two subteams to simulate working in two non-overlapped time zones (subteam A and subteam B from hereon). The FTS team had very strict communication rules in order to simulate the time differences of FTS. Direct communication was prohibited between the two subteams: no voice, no IM, no SMS. We also wanted to enforce work during their prescribed periods as much as we could, so access to the CVS/SVN integration server and to the electronic forums was only allowed in prescribed working hours for each student depending on whether he/she was in subteam A or subteam B. In both experiments we used the integration server and the electronic forums to see if the FTS communication rules were obeyed. They were. In the first experiment, subteam A worked between 21:30 and 11:00 while subteam B worked between 11:30 and 21:00. The 30-minute gap permits overtime of 30 minutes in each subteam. Most students also had demanding part-time jobs, so the odd hours were not a hardship. The FTS team was required to perform the task using the same effort (in person-hours) in 2.5 weeks instead of the 5 weeks that the CO team had. The academic supervisor operated in the same time zone as subteam A for the entire semester. The instructor’s weekly meetings were planned to be face-toface with subteam A and over phone with subteam B. In the second experiment each subteam was given a series of non-consecutive days i.e., one subteam worked on Sundays, Tuesdays, and Thursdays while the other subteam worked on Mondays, Wednesdays, and Fridays. This way, both FTS and CO teams had the same duration to complete the development of the iteration scope-- in this case it was two weeks. Measuring duration in FTS: Our main objective was to measure differences in project duration between FTS and CO. In the first experiment, we artificially imposed a deadline at exactly the halfway point for the FTS team (see Figure 6a). The logic is that an FTS team would theoretically reduce duration by half = 50%. Note that both teams, CO and FTS, had the same amount of development hours (effort) and the same functionalities to develop. In the second experiment, both FTS and CO teams had the same duration but FTS subteams did regular handoffs as per their assigned working days (see Figure 6b). Thus, each subteam had only half of the duration. FTS Team Control Team (CO) t+2.5 t+5 (a) 1st experiment (b) 2nd experiment Figure 6. Planned duration for measured iteration Other measures and data: The experimental data were generated by the CVS/SVN server, e.g. number of check-in operations and number of revisions per file. Further, we examined the electronic forums for work logs compiled by students and we analyzed students’ reflections during the semester. 5 6 See http://sourceforge.net/cvs/?group_id=3630. See http://subversion.tigris.org/ Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 17 of 21 Experimental outcomes Examining development duration: The FTS teams performed roughly equally or better than the CO teams. Both teams (CO and FTS) in both experiments met the project’s requirements with respect to the functionality and level of quality. In the first experiment, we concluded that there was an approximate 10% reduction in development duration -- rather than the theoretical 50% of FTS. That is, instead of completing the development work at t+2.5 weeks, it was completed at t+4.5 weeks. In this, our first experiment, we discovered that the FTS team had no incentive to stop at the 2.5 week FTS deadline, since the grading deadline was still a few weeks away. So, with most functionality done by the 2.5 week milestone, the FTS team kept on tweaking and improving the software for much of the “extra” 2.5 weeks. While this was a weakness in the experimental design, it points to the relative success of the team in duration reduction. Improving in the second experiment, both teams (FTS and CO) had the same deadline and the constraint, as explained above, was on the dependency on the FTS handoffs according to each subteam’s working days. Both CO and FTS completed their tasks in two week duration with each subteam of the FTS team worked according to the handoff sequence. Thus, if the subteams had been working in FTS, around the clock, there would have been a 50% time reduction. Examining the integration process: In both experiments we used the integration server and the electronic forums to learn more about the special workflow that we set up in the FTS team. In the first experiment, we examined the last half week before the final presentation, The CO team performed 385 check-in operations with average of revision level of 8.8, while FTS team performed 336 check-in operations (13% less) with average of revision level of 18 (~100% more). Meaning the FTS team changed less files with much higher refinements level. In other words, the FTS team was tweaking, while the CO team was performing coarser, more fundamental tasks. Figure 7 shows revision information during the critical last week. The revision level indicates the level of refinement, where a high number is a very fine revision. A low number is an early revision. As can be observed, the levels of revisions in FTS are higher, meaning time was invested to revisit further issues and make advanced refinements. All together CO had 515 check-in operations in a lower level of revision whereas FTS had 400 check-in operations (28% less). Revision level 100 80 FTS 60 CO 40 20 0 Revisions of last week Figure 7: Revision level of last development week In the second experiment, examining the SVN server, the CO and FTS teams had roughly the same number of revisions (186 and 178 respectively), meaning both teams had a similar Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 18 of 21 rhythm of work. Examining revisions of the FTS team in the experimental iteration we can see the handoffs in action (see Figure 8). We can see that there were 7 handoffs all together in this 2 week iteration (A to B, or B to A). Figure 8: Handoffs of the FTS team (Metaphorically, we like to call this the passing of the baton) Development log documentation: In the first experiment, when asked to reflect on the process, the FTS students described the special way they handed off the work at the end of their working hours. When we checked the electronic forum we found that the level of development log documentation was markedly better for FTS. This is to be expected since each subteam needed to inform the other subteam about their main changes (Carmel et al 2009). In the second experiment, examining the number of documentations lines that were entered to explain the revisions, the FTS team wrote about 140% more (336 lines for the FTS compared to 144 lines for CO). In addition, examining the total number of messages on the course website we find that the FTS wrote about 25% more (1272 entries for the FTS compared to 946 entries for CO). Limitations As we noted before, quasi-experiments are conducted as stepping stones for learning and refinement and naturally have many limitations to generalization. The main limitations were, first, that we had a small sample. Second, our measures were imperfect. We forced a different duration on the FTS team and had to derive a number for the duration reduction (in the first experiment), and we forced time constraint of non–consecutive working days (in the second experiment). We list additional secondary limitations: students self-selected to the experimental condition. Students could have “cheated” (although we monitored them and are rather certain that they did not). In a real world situation, there are about 8 iterations in an Agile release before producing a deliverable. Since the experiment was in an academic setting we had to ask for a deliverable after only 1 measurable iteration. This compression of the project life-cycle led to the situation where the main time span we measured, of the students’ iteration, included additional tasks which are not usually done in a typical “real” iteration. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 19 of 21 Conclusions from Quasi-Experiments We point out three important conclusions that emerged: The first conclusion relates to our empirical finding that FTS completed their work in similar or better duration. This is in contrast to previous results like in [19] in which the software work of distributed teams takes longer in duration relative to co-located teams. We suggest that this happened-- in our case-because of Agile implementation-- specifically the tight iteration and deadlines involved. The second conclusion deals with the continuous integration practice. This is probably necessary to make FTS feasible. In other words it might be impossible to do FTS without this. And it has other positive outcomes: our data suggests that implementing a work procedure that includes developing small code components (code and test) with continuous check-in every day may be significant to the professional FTS hand-offs. The third conclusion is that when communication is limited to zero between FTS subteams, the dependency on high level documentation -- like the commit logs -- increases. In software there is general consensus that more documentation is useful for downstream activities such as system enhancement and maintenance. Overall Conclusions Here we integrate the observations and conclusions from each of our FTS studies. Here we discuss the somewhat contradictory results that we are getting from our theoretical model and our quasi-experiment. Our theoretical model presents a somewhat idealized picture. For example the 4-site example that we presented of Figure 5 indicates an effectiveness of 3.6 relative to the one site model. This means that the duration would be only 28% - a truly astounding improvement-- and probably an unlikely one. A more realistic picture, if we relax the various assumptions that we listed, would be lower than the plot-lines of Figure 5. Thus, the theoretical model is not very promising in its findings about FTS. The big multiplier impact of FTS will be moderated by coordination and personnel parameters. On the other hand, our quasi-experiments are somewhat promising. The FTS teams performed roughly equally, or better, than the control teams, while maintaining equivalency in functionality and level of quality. How do we explain this contradiction? We assumed that the efficiency of all teams is equal in our basic theoretical model. Rephrased, we assumed that all types of teams had equal productivity. But we have indications from the quasi-experiment that the productivity of the FTS team changes: it increases. What we found in the quasi-experiment is that because of the different nature of structure -- the FTS team behaves differently. The team needs to be more disciplined. And we know that when time is managed correctly this impact outcomes: the fast and agile movements of recent decades have advocated and shown this [23]. This is the real contribution of quasi-experiment over theoretical models. We are able to get some early indications that our simplifying assumptions in the model-- are perhaps incorrect. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 20 of 21 Implications for practice: From our quasi-experiments we learned about potential advantages of the continuous integration practice. We found that implementing a work procedure that includes developing small pieces of code and test and continuous check-in every day into one common repository may be significant to the professional FTS hand-offs. Implications for research: We delineated a multi-method research plan (using 4 methods) at the beginning of this paper. The authors, as well as other researchers, will follow down this path in order to better understand the promise of Follow The Sun/24-hour software development. The FTS promise may prove to be a myth, but we don't know enough about it yet to make any strong conclusions. References [1] Beck, K. Extreme Programming Explained. Addison-Wesley, 2000. [2] Betts, M. “24/7 global application development? Sounds good, doesn't work”, Computerworld. September 16th, 2005. [3] Brooks, F.P. The mythical man-month: readings in software engineering. Reading, MA: Addison-Wesley, 1975. [4] Cameron, Alex ”A Novel Approach to Distributed Concurrent Software Development using a “Follow-the-Sun” Technique.” Unpublished EDS working paper, 2004. [5] Campbell, D.T. and Stanley, J.C. 1966. Experimental and quasi-experimental designs for research Rand McNally and co (USA). [6] Carmel, E. “Cycle-Time In Packaged Software Firms”, Journal of Product Innovation Management, 12(2), March, 1995. [7] Carmel, E. Global Software Teams: collaborating across borders and time zones, 1999. Published by Prentice Hall-PTR. [8] Carmel, E. Building your Information Systems From the Other Side of the World: How Infosys manages time differences. MIS Quarterly Executive, 5(1), 2006. [9] Carmel, E., Dubinsky, Y. and Espinosa, A. Follow The Sun Software Development: New Perspectives, Conceptual Foundation, and Exploratory Field Study. 42nd Hawaii International Conference on Systems Sciences, January, 2009. [10] Carmel, E., Dubinsky, Y. and Espinosa, A. Follow The Sun Software Development: New Perspectives, Conceptual Foundation, and Exploratory Field Study. 42nd Hawaii International Conference on Systems Sciences, January, 2009. [11] Denny, Nathan, Igor Crk, and Ravi Sheshu Nadella, Agile Software Processes for the 24-Hour Knowledge Factory Environment, in Gupta, A., Outsourcing and Offshoring of Professional Services: Business Optimization in a Global Economy, Publisher: IGI Global, Hershey Pennsylvania, 2008. [12] Dubinsky, Yael and O. Hazzan, “ The construction process of a framework for teaching software development methods”, Computer Science Education, 15:4, 2005, pp. 275–296. [13] Dubinsky, Yael and O. Hazzan, “Using a Role Scheme to Derive Software Project Metrics”, Journal of Systems Architecture, 52, 2006, pp. 693–699. [14] Espinosa, J.A. and E. Carmel, “Modeling Coordination Costs Due to Time Separation in Global Software Teams”, International Workshop on Global Software Development, part of the International Conference on Software Engineering, Portland, Oregon, USA, May, 2003. [15] Espinosa, J.A., N. Nan, and E. Carmel “Do Gradations of Time Zone Separation Make a Difference in Performance? A First Laboratory Study”, International Conference on Global Software Engineering, Munich, Germany, August 27-30, 2007. [16] I. Gorton, I. Hawryszkiewycz, L. Fung: “Enabling Software Shift Work with Groupware: A Case Study”, HICSS (3) 1996: 72-81. [17] A. Gupta, S. Seshasai, and R. Aron, "Research Commentary: Toward the 24-Hour Knowledge Factory - A Prognosis of Practice and a Call for Concerted Research" (November 19, 2006). Eller College of Management Working Paper No. 1038-06 Available at SSRN: http://ssrn.com/abstract=946012 [18] A. Gupta, Deriving Mutual Benefits from Offshore Outsourcing: The 24-Hour Knowledge Factory Scenario” Communications of the ACM, 2008. Follow The Sun paper for Keystone conference printed on 3/10/2016 Page 21 of 21 [19] J. Herbsleb, J., et al. “An Empirical Study of Global Software Development: Distance and Speed”, in 23rd. International Conference on Software Engineering (ICSE). 2001. Toronto, Canada: IEEE Computer Society Press. [20] Highsmith, J. Agile Software Development Ecosystems. Addison Wesley, 2002. [21] T. Hildenbrand, M. Geisser, T. Kude, D. Bruch, T. Acker, "Agile Methodologies for Distributed Collaborative Development of Enterprise Applications", In workshop on Engineering Complex Distributed Systems (ECDS), Complex, Intelligent, and Software Intensive Systems (CISIS), 2008, pp. 540-545. [22] Hildenbrand, T. , F. Rothlauf, M. Geisser, A. Heinzl, T. Kude, "Approaches to Collaborative Software Development", In workshop on Engineering Complex Distributed Systems (ECDS), Complex, Intelligent, and Software Intensive Systems (CISIS), 2008, pp. 523-528. [23] Jalote, P, A Palit, P Kurien, VT Peethamber Timeboxing: a process model for iterative software development The Journal of Systems & Software, 2004 [24] Jalote, P. and G. Jain, “Assigning tasks in a 24-h software development model”, Journal of Systems and Software 79(7): 904-911 (2006). [25] B.A. Kitchenham, S.L. Pfleeger, L.M. Pickard, P.W. Jones, D.C. Hoaglin, K. El-Emam, J. Rosenberg, “Preliminary Guidelines for Empirical Research in Software Engineering”, IEEE Transaction on Software Engineering, 2004. [26] M.R., Millson, S.P. Raj, and D. A Wilemon, “Survey of Major Approaches for Accelerating New Product Development”, Journal of Product Innovation Management, 9(1): 53-69 (March, 1992). [27] D. Redmiles, A. van der Hoek, B. Al-Ani, T. Hildenbrand, S. Quirk, A. Sarma, S. Silveira, R. Filho, C. de Souza, E. Trainer, "Continuous Coordination: A New Paradigm to Support Globally Distributed Software Development Projects". In Wirtschafts Informatik, (49) , 2007, pp. S28-S38. [28] M.D. Jr. Rosenau, “Schedule emphasis of new product development personnel”, Journal of Product Innovation Management, 6(4): 282-8 (December, 1989). [29] C.S. Saunders, C. Van Slyke, and D. Vogel, ”My Time or Yours? Managing Time Visions in Global Virtual Teams,” Academy of Management Executive, 2004, Vol. 18, No. 1, 19-31. [30] Smith, P.G. and Reinersten, D.G. Developing products in half the time. New York: Van Nostrand Reinhold, 1991. [31] Setamanit, Siri-on, Wayne W. Wakeland, David Raffo. Using simulation to evaluate global software development task allocation strategies. Software Process: Improvement and Practice, Vol. 12, Num 5, September/October 2007, 491-503. [32] Sooraj. P. and P.K.J. Mohapatra, Modeling the 24-hour software development process, Strategic Outsourcing: An International Journal 1(2), 122 – 141. (2008). [33] A. Taweel and P. Brereton, “Modelling Software Development across Time Zones”, Information and Software Technology, Volume 48, Issue 1 , January, 2006. [34] J.J. Treinen, S.L. Miller-Frost, “Following the Sun: Case Studies in Global Software Development”, IBM Systems Journal, 45(4), October 2006. [35] Yadav, Vanita; Adya, Monica; Sridhar, Varadharajan; Nath, Dhruv Flexible Global Software Development (GSD): Antecedents of Success in Requirements Analysis: Journal of Global Information Management, Vol. 17, Issue 1 [36] M. Yap, “Follow the sun: distributed extreme programming development,” Proceedings of Agile Conference, 2005.