Author: Peter Benedek Balog
MSc in International Economic Consulting
Academic Supervisor: Dr. Philipp Schröder

THE EFFECTS OF THE ECONOMIC CRISIS ON THE SOFTWARE INDUSTRY WITH SPECIAL ATTENTION TO THE OPEN SOURCE SECTOR

Aarhus School of Business, University of Aarhus
2010.11.30.

Statement of originality

This work has not previously been submitted for a degree or a diploma in any university. To the best of my knowledge and belief, the thesis contains no material previously published or written by any other person except where due reference is made in the thesis itself.

Abstract

In recent years the Open Source Software (OSS henceforward) development movement has gained considerable attention from various academic research fields. Because the OSS development process involves developers at many different locations and organizations sharing code to develop and reform programs, it requires an interdisciplinary understanding. Building on a review of the existing literature, in this dissertation I develop an econometric model to investigate the macroeconomic factors that might influence OSS activity. By analyzing the software industry as a whole I put the results into an exploratory framework that aids their interpretation. The panel dataset used in the analysis consists of quarterly macroeconomic figures for 25 countries from 1998 Q1 until 2010 Q2. It also includes data on four OSS project hosting sites. The model analyzes the aggregate OSS activity as well as the differences across the “forges”. The research concludes that the software industry was able to overcome the negative effects of the recent crisis fairly quickly. The reason might be that the industry is a major driver of the R&D sector; therefore the ICT budgets that were cut are expected to be refilled quickly as the economy stabilizes. The results show that changes in the real economy have a limited effect on OSS development activity in general.
However, there are considerable differences among the “forges” in terms of the magnitude and type of the different effects. There is a significant positive relationship between GDP growth and the activity of the individual communities.

Keywords: open source software, productivity, economic crisis, software industry, business cycle theory

Table of Contents

Abstract
1. Introduction
  1.1. Problem Statement
  1.2. Methodology
  1.3. Delimitation
  1.4. Structure
2. Literature Overview
  2.1. Open Source Phenomenon
  2.2. Business Cycle Theory
3. Analysis of the software industry
  3.1. Data
    3.1.1. FORBES Global 2000
    3.1.2. Truffle 100
    3.1.3. Global Software 100
    3.1.4. Software 500
    3.1.5. Demand side
  3.2. Conclusion of the analysis
4. Analysis of the Forges
  4.1. Data
    4.1.1. OSS dataset
    4.1.2. Productivity and performance measure
    4.1.2. Macroeconomic data
  4.1. Methodology
  4.2. Model
  4.3. Results
  4.4. Conclusions of the analyses
5. Conclusion
List of References
Appendix

List of Figures

Figure 1.1. Growth in New of Projects in each Repository by Year
Figure 3.1. Number of Software Companies on Forbes2000 list
Figure 3.2. Key Figures of Software Companies on Forbes2000 list (billion $)
Figure 3.3. Revenue of the top 100 European vendors from software activity (billion €)
Figure 3.4. Revenue of the Software 500 list’s companies (billion $)
Figure 3.5. Change in IT budget by company size, 2009
Figure 3.6. IT expenditure by countries (billion $)
Figure 4.1. Number of Shared Names across each Repository
Figure 4.2. Number of new projects registered in a month to Rubyforge
Figure 4.3. Number of new projects registered in a month to Freshmeat and Rubyforge
Figure 4.4. Number of new projects registered in a month to Sourceforge

List of Tables

Table 3.1. Number of Employees by Software 500 companies
Table 3.2. Top ten companies based on Revenue / Employee
Table 4.1. Variables and their descriptions in the OSS dataset
Table 4.2. Software Productivity (ESLOC/SM) by selected Application Domains
Table 4.3. Selected record from the analysis’ dataset
Table 4.4. Macroeconomic variables used in the analysis
Table 4.5. RE Estimation Results for Model 1-4 (robust standard errors)
Table 4.6. RE Estimation Results for Model 1-4 (normal standard errors)
Table 4.6. RE Estimation Results for Model 4-10 (robust standard errors)
Table 4.7. Results for Forge wise RE estimation with (M4) regressors
Table 4.8. Results for Forge wise RE estimation with (M5) regressors
Table 4.9. Results for Forge wise RE estimation with (M6) regressors
Table 4.9. Results for Forge wise RE estimation with (M7) regressors
Table 4.9. Results for Forge wise RE estimation with (M8) regressors
Table 4.9. Results for Forge wise RE estimation with (M9) regressors
Table 4.9. Results for Forge wise RE estimation with (M10) regressors
Appendix A – Table of countries weighted by the number of active developers
Appendix B – Table of calculations of the effects’ size

1. Introduction

Over the past decade, the Open Source Software phenomenon has had a global impact on the way organizations and individuals create, distribute, acquire and use software and software-based services. OSS has challenged the conventional wisdom of the software engineering and software business communities, has been instrumental for educators and researchers, and has become an important aspect of e-government and information society initiatives. OSS is a complex phenomenon and requires an interdisciplinary understanding of its engineering, technical, economic, legal and socio-cultural dynamics. The open source movement has attracted the interest of many academic researchers due to the success of famous OSS products like the ‘Linux’ operating system, the ‘Apache Web Server’ or the ‘Firefox’ web browser.

1.1. Problem Statement

The purpose of this thesis is to provide an answer to the following question: How is the Software Industry affected by the 2008 Economic Crisis? It will be answered by an analysis of the software industry's performance with special attention to the OSS sector. During fall 2008 the world economy experienced dramatic falls in GDP, leading to a deep recession due to various reasons such as the collapse of the US real estate bubble. Major economies like the American or the European even experienced negative growth rates, while slow but positive growth was fueled by the emerging economies of China and India.
According to the FLOSSmole project, an internet-based collaborative collection and analysis of free/libre/open source project data (Howison et al., 2006), the number of projects registered on the OSS hosting sites has decreased significantly since the economic crisis started in late 2008, as Figure 1.1 shows.

Figure 1.1. Growth in New of Projects in each Repository by Year
Source: flossmole.org

1.2. Methodology

In my thesis I will attempt to find the factors that can explain the recent changes in the output of the OSS sector. I consider those micro- and macroeconomic factors that can affect the performance of the sector and that are themselves heavily affected by the recent economic recession, for instance stock prices, GDP, unemployment or changes in retail sales. I will also try to find alternative explanations, not originating in the ongoing crisis, for the shrinking number of OSS projects. In contrast, I will also analyze the proprietary software sector to provide a clearer picture of the prospects of the software industry as a whole. I strongly believe that the topic is related to the existing academic literature in macroeconomics; more narrowly, it connects to the business cycle theories, which concern the sources and the nature of macroeconomic fluctuations.

It is critical to specify what software industry means in this paper. Due to the exponential growth of computer usage in various sectors and for different purposes, there are tens of thousands of different software products. The term ‘software’ in this analysis will mean application software, system software and all the necessary tools.

1.3. Delimitation

It proved very hard to gain access to reliable data on the software industry as a whole, and it was even harder in the case of the OSS sector.
The author is aware of the fact that the activity level of the OSS is usually measured by the number of messages generated in a given period of time, and that the output in the software sector is generally measured by lines of code. Instead, a different approach has been used due to resource constraints. The detailed discussion of data constraints and dataset building can be found in section 4.1.

1.4. Structure

The rest of the thesis is structured as follows: Section 2 describes the evolution of the Open Source Phenomenon and provides an insight into the existing academic literature on business cycle theory. Section 3 presents an analysis of the software industry in the light of the recent economic crisis. It provides various data about the sector and presents the results as well as the conclusions of the analysis. Section 4 consists of the analysis of the willingness to start an OSS project. It presents the model used in the thesis; the dataset and the variables used are also discussed in detail. Section 5 concludes on the findings of the thesis.

2. Literature Overview

2.1. Open Source Phenomenon

The Free/Open Source Software (F/OSS) phenomenon has attracted an increasing amount of attention in the academic literature in recent years. The term “open source” refers to the fact that the program source code is accessible, available and thus alterable by its user, in contrast to the source code of proprietary software, which is only distributed in compiled machine code (Bitzer and Schröder, 2006). Due to the growing number of contributions from various academic fields, a wide range of interesting issues has emerged. This section provides an account of these contributions in order to understand the state of the phenomenon. Among the research conducted on F/OSS, individual incentives and motivations have received by far the most attention.
According to a widely accepted classification of the results (Osterloh et al., 2001; Hars and Ou, 2002; Lakhani and Wolf, 2005), individuals' motivations can be grouped under two headings: individuals are driven either by intrinsic or by extrinsic motivations. In the first case, according to Deci and Ryan (1985), intrinsic motivation means doing an activity for its inherent satisfactions rather than for some separable consequence. The individual is moved to act for fun or challenge rather than for rewards. On the other hand, Lerner and Tirole (2002) state that a programmer only engages in a project, whether commercial or F/OSS, if she derives a net benefit. This means that the source of motivation depends on external factors: reputation, user needs, learning and performance improvement.

The success of OSS development was rather surprising to some scholars, since it seems to contradict “Brooks' Law”, which declares that adding manpower to a late software project makes it later. OSS development, on the other hand, seems to be driven precisely by the high number of skilled developers. Raymond (1998) summarizes the main features of the OSS production mode, which generated a large amount of academic commentary: the advantages of parallel code development (Feller and Fitzgerald, 2002) and the integration of users into the production of software code, with von Hippel and von Krogh (2003) describing F/OSS as a private-collective model of innovation.

Numerous studies turned their attention to the governance and coordination structure of OSS projects. Rossi (2005) summarizes the principal findings of these studies: according to Krishnamurthy (2002) the median number of developers per project is four, while it is only one for Healy and Schussman (2003).
The success of the Linux operating system, which accounts for a 38% share of the server operating systems market (Bitzer 2004) despite the seemingly huge superiority of Microsoft’s Windows in many fields, is referenced by an extensive number of authors. The issue of competition between open and closed software production has been discussed widely (Gaudeul 2004, Economides and Katsamakas 2005, Harison and Koski 2008). An increasing number of for-profit firms base their services on the OSS phenomenon, to mention a few of the biggest: Red Hat, Cygnus, VA Linux. Meanwhile established hardware and software producers such as IBM, Hewlett Packard and Oracle have turned their attention to the OSS sector. The studies analyze the competition in two dimensions: the quality differences and the dynamics of innovation between the competing sectors. The models suggest that the driver of innovation is meeting users' needs, which leads to a quality increase (Kuan, 2001; Bessen, 2002). Bonaccorsi and Rossi (2003) take into account the network effects and externalities in the competition. Casadesus-Masanell and Ghemawat’s (2003) model of mixed-duopoly competition focuses on the influence of strategic pricing decisions on consumers' valuation, while Bitzer (2004) looks at the effects of product differentiation on the competition.

So far a very important aspect of the F/OSS phenomenon has been overlooked: the OSS licenses. These licenses play a huge role in all previously mentioned issues: the motivation of developers, the coordination of projects, the effectiveness of the OSS development model and the competition with the commercial sector. The Open Source Definition (OSD)[1] defines the “rights that a software license must grant you to be certified as Open Source” (Perens, 1999).

[1] I refer to the OSD v1.9, the latest version available at http://opensource.org/osd.html (accessed November 26, 2010).
The main principles of the OSD are the following:

Free Redistribution: the license may not restrict any party from selling or giving away the software.
Availability of the source code: the license allows modifications and derived works and allows the distribution of these under the same terms as the license of the original software.
Integrity of the source code: the license may restrict source code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time.

The question of whether F/OSS belongs to the public domain has been addressed by a wide range of scholars: Lee (1999), Perens (1999), Lanzi (2005). There is a difference between F/OSS and software that is simply put into the public domain: although F/OSS is free and available to all, developers do not surrender their rights to their software creations. Rather, they retain copyright over their work and adopt licenses to ensure free access to and modification of the source code, as Rossi (2005) summarizes the findings of the above-mentioned papers. Lerner and Tirole (2005) found in their empirical research that restrictive licenses are more likely to be adopted when the software is directed at end-users, whereas less restrictive licenses are more frequently adopted for projects aimed at developers, the Internet or proprietary operating systems. As Rossi (2005) summarizes, numerous studies suggest the need for rethinking the current state of intellectual property protection for software programs (Moglen, 1999; Osterloh et al., 2001; Benkler, 2002). OSS seems to represent a “new intellectual property paradigm” (Maurer & Scotchmer, 2006), i.e. a new type of ownership concept that leads to different allocations of intellectual property rights and different modes of organization as compared to so-called proprietary software.

2.2. Business Cycle Theory

According to Moore (1983), a business cycle is identified by the behavior of aggregate economic activity, which is measured by a wide variety of series including output, sales, employment, and income. The analysis conducted in this thesis fits into this context. I will try to shed light on how OSS activity[2] conforms to the business cycle, using the approach of business cycle theory, that is, explaining the behavior of OSS development activity through macroeconomic series. A brief overview of the evolution of the theories follows.

[2] OSS activity: I refer to OSS activity as the number of projects registered to the “forges”.

The early theories: The classical theory dominated macroeconomics from the late 18th century until the 1930s. The theory, although discredited during the Keynesian revolution, provides the foundation for several modern theories of the business cycle such as the monetarist and real business cycle models. The model itself focuses on supply-side effects in the economy. It views positive shocks as causing expansions and negative shocks as causing recessions.

Keynes, in The General Theory of Employment, Interest and Money (1936), points out that there is something truly wrong with a theory (the classical one) that cannot explain the severe unemployment of the 1930s. His theory focuses on the nature of investment, the variable found to be the most unstable. He therefore shifted the focus of the analysis from the exogenous (supply-side) variables to the endogenous (demand-side) ones.

Monetarism: Friedman and Schwartz (1963) examined changes in monetary growth and argued that they are the main source of economic instability. The evidence showed that growth of the money supply indeed leads to GNP growth and, because prices are slow to adjust, generates business cycles (Hall 1990).

The rational expectations theory (Miller, 1994) provides insight into the behavior of economic agents.
It argues for policy ineffectiveness (Sargent and Wallace, 1975), that is, that discretionary monetary and fiscal policy may be unable to alter aggregate output.

The New Keynesian school further improved Keynes’ original theory, answering many of its critics. One of its major contributions is the prediction that unstable aggregate demand and supply are important determinants of the business cycle. Aggregate demand causes cycles because wages and prices are not flexible in the short run; on the supply side the cause is that real changes in the labour market and/or the production function alter output (Hall 1990).

Real business cycle theory is the first since Keynes discredited the classical theory to approach the nature of cycles from the supply side again. It states that the dominant causes of instability are external shocks to supply (Barro 1989, Plosser 1989). This theory can serve as the basis of the research conducted in this thesis, since the assumption is that the macroeconomic environment affects the sector indirectly, through the developers, who provide the supply for the sector.

Modern theories: The above-mentioned major competing theories of the business cycle identified the four major variables of the theory: the role of price and wage adjustment, the nature of cyclical unemployment, and the relative roles of aggregate demand and supply. However, each school differs in its judgment of these factors. This moved the scope to new research areas. Considerable attention is devoted to the analysis of stock market fluctuations (Chauvet 2001, Chordia and Shivakumar 2001, Casarin and Trecroci 2007). Due to the nature of the field, the subject of forecasting cycles is also addressed (Zimmermann 2001). Rötheli (2006) showed that business practitioners' forecasting of the business cycle was ahead of business cycle theorizing for many years.
Theorists of the 1930s and 1940s were not yet ready to incorporate into their theories the business forecasting methods that had become widespread entrepreneurial practice. Instead, theorists either speculated on business psychology or built tractable dynamic mathematical models with accelerator-type investment. However, by the 1940s, a model of the cycle with forward-looking investors might actually have been developed.

Finally, as Hall (1990) summarizes, the common method of characterizing the behavior of economic variables is in terms of their cyclical nature during business cycles:

procyclical: the variable typically moves in the same direction as aggregate economic activity;
countercyclical: the variable usually moves in the opposite direction to aggregate economic activity;
acyclical: the variable shows no consistent pattern in its movement over business cycles.

3. Analysis of the software industry

Interaction between the real economy and the virtual world

The above statement may sound odd, but with two simple examples I show that there is a serious level of interaction, or more correctly dependence, of the virtual world on the real economy. For example, the resource that keeps the virtual world “alive” is electricity, which is produced in the real economy and is therefore affected by all sorts of economic or ecological effects, to mention the most obvious ones. According to one calculation a ‘Second Life’ avatar consumes 1 752 kWh annually, almost as much as the average Brazilian, who consumes 1 884 kWh. Furthermore, Google alone has been shown to consume 2.1 TWh of electricity a year, which equals the production of two average nuclear reactors.

The financial crisis that started in 2008 has had a huge impact on the worldwide real economy, leading to growing unemployment rates, shrinking consumption, and decreased or even negative GDP growth rates. This explains the massive decrease in IT budgets around the world.
There was a huge pressure on CEOs to lower costs to be able to stay in the competition in any sector. And here comes the OSS sector into the picture. By 2009 FLOSS has been recognized by CEOs as an opportunity to lower costs. The following section will provide a software market analysis. We can say that among the different OSS business models those of which had 50-60% revenue from subscription fees and 40-50% from the services are gained on the crisis. 10 3.1. Data During the analysis and the data collection period I had to realize that it is very hard if not impossible to find aggregate data on the industry. Therefore a productivity analysis is seems impossible to conduct with the resources available to this thesis. On the other hand there are numerous organizations – professional journal, market researcher – that conduct some kind of analysis or data collection, mostly survey based. Thus I chose to present the data from these sources and conduct an analysis on it. 3.1.1. FORBES Global 2000 The Forbes Global 2000 is a comprehensive list of the world’s biggest and most powerful companies, as measured by a composite ranking for sales, profits, assets, and market value.3 It is possible to search based on industries in the list, the Software and services industry included on the list since 2004. Figure 3.1. Number of Software Companies on Forbes2000 list As the graph shows the number of companies between 28 and 35 since 2004. It is constantly growing since 2006 from an all time low 28. As of the 2010 list the number is the highest – 35 – since the introduction of the sector to the list. Based on the shape of the number of companies the sector is growing despite the global Source: Author’s own calculation, Forbes2000 economic crisis. The Forbes 2000 list also contains figures about key financial indicators the following figure is picturing the behavior of those. 
3 Forbes 2000 sources: Exshare, FT Ineractive Data, Reuters Fundamentals, and Worldscope via FactSet Research Systems, Bloomberg Financial Markets. 11 Figure 3.2. Key Figures of Software Companies on Forbes2000 list (billion $)4 Source: Author’s own calculations, Forbes2000 The figure captures very well the effects of the crisis. The market value of the companies’ grew from 2004 to 2007 by 48.7%. The following year brought halt to this massive growth; while during 2008 a significant 31.9% decrease took place. The sector seems to be recovered very fast by reaching an all time high 1159.05 billion USD. The value of the sales and the assets shows very similar behavior. Very slow growth until 2006. Then the growth accelerated until 2008, followed by a slow decrease in the values which may continue over 2010. 3.1.2. Truffle 1005 The Truffle 100 ranks the top European software companies. With the inclusion of the Truffle 100 list’s analysis it is possible to draw a picture about the European software industry which can increase the resolution of the global picture. According to the list’s authors the software industry is characterized by periodic technological disruptions that pave the way for new arrivals; for that purpose its results, research and statistics are released on a yearly basis. 4 Market value is as of Februar 28, 2010 The Truffle 100 is compiled from survey & research conducted by IDC & CXP for the purpose of the Truffle 100 ranking. Europe is defined as: EU + Switzerland + Norway. A company is defined as a European software vendor if its headquarters and R&D management are based in Europe 5 12 According to the results the revenue of the top European Figure 3.3. Revenue of the top 100 European vendors from software activity (billion €) vendors is constantly growing despite the ongoing economic slowdown. The European market is very concentrated, 79% of the revenue comes from the top 25 companies up from 70% from 2008. 
And the vendors are facing very heavy competition with huge global Source: Author’s own calculations, Truffle 100 actors like Microsoft, whose revenue in 2009 was alone €40.9 billion, 52% higher than the top 100 European vendor’s aggregate revenue. This clearly shows that there is room for improvement. The report “emphasizes the impressive dynamismand exceptional resilience of the European software industry. In a challenging environment, software vendors have demonstrated their ability to bounce back quickly (with 8.4% year-on-year growth) and remain profitable (€3.7 billion), while maintaining a heavy level of investment in R&D (€3.8 billion).” (Kroes, 2010). 3.1.3. Global Software 1006 The Global Software Top 100 list is very similar to the Truffle list but the focus is on the worldwide software industry. The companies are ranked according to their revenues7. The authors gathered data from SEC filings, annual reports and corporate websites. Similarly to the European market the global software market is also very concentrated. The top ten vendor accounts for nearly 60% of the top 100’s revenue, which is over $220 billion in 2009. 46 vendors out of the 100 reported lower revenues than the previous year; still the average growth was 3.2% 6 http://www.softwaretop100.org/methodology (accessed October 27, 2010) Revenue contains 'prepackaged' software sales; subscription and support activities; certain service activities are excluded; hosted software solutions (Software as a Service) are included. 7 13 among the listed companies. The report concludes that the major effect of the crisis was that many companies cut back their IT budgets, but since the sector seems to be recovered very fast and the industry is one of the R&D drivers, the budgets will increase in the future again and will grow bigger than before. The other effect is that the „credit crisis abruptly stopped acquisitions by private equity firms”. 3.1.4. 
Software 500 Software Magazine’s Software 500 list is a comprehensive look at the software industry targeting enterprise IT organizations with software and services. The authors collect data based on the magazine’s survey. The rankings are based on total worldwide software and service revenue 8. Total 2009 Software 500 revenue is $491.2 billion, an 8.7% increase over last year's total. Once again an analysis that seems to report about an industry, healed very quickly after the global economic crisis started. Figure 3.4. Revenue of the Software 500 list’s companies (billion $) Source: Author’s own calculation, Software500 8 Iincludes revenues from software licenses, maintenance and support, training, and softwarerelated services and consulting. www.softwaremag.com (accessed October26, 2010) 14 The graph above confirms the assumptions; the industry seems to grow despite the ongoing crisis. Another interesting aspect of the Software 500 list is that data on the number of employees was also collected. The total number of employees in the Software 500 is up a healthy 26% to 3 707 957, compared to 2 953 016. It suggests that in 2007 many companies were cutting costs and conserving cash. The big increase in 2009 is primarily due to the addition of Hitachi to the list, with 389 752 employees, and Emerson Electric, with 140 700 employees. The fact that the actors on the list may change from year to year can cause biases in case a time series analysis therefore caution is advised interpreting the data. Table 3.1. Number of Employees by Software 500 companies Employees Growth 2008 3 707 957 25.6% 2007 2 953 016 1.3% 2006 2 914 480 14.7% 2005 2 539 872 -4.6% 2004 2 660 023 13.7% Source: Software 500 It is however possible to calculate the labour productivity and compare the agents in the industry. Table 3.2. shows the top ten company based on the revenue / employee. 15 Table 3.2. 
Top ten companies based on Revenue / Employee
Average for 500 = $ 191 417
Rank | Company | 2008 Employees | 2008 Revenue (thousand $) | Revenue / Employee ($)
236 | Innodata Isogen, Inc. | 45 | 75 001 | 1 666 689
197 | Lighthouse Computer Services, Inc. | 95 | 119 700 | 1 260 000
78 | ePlus inc. | 658 | 727 159 | 1 105 105
138 | Technology Integration Group | 300 | 274 500 | 915 000
498 | JangoMail | 6 | 5 288 | 881 313
215 | GreenPages Technology Solutions | 150 | 100 000 | 666 667
2 | Microsoft Corp. | 91 000 | 52 280 000 | 574 505
70 | Akamai Technologies, Inc. | 1 500 | 790 924 | 527 283
106 | SolidWorks Corp. | 787 | 407 300 | 517 535
27 | Juniper Networks, Inc. | 7 014 | 3 572 376 | 509 321
Source: Software 500

The table captures the productivity differences among the companies very well: the first company on the list realizes revenue per employee more than 8 times the average of the list. This variance can be explained by the different activities performed.

3.1.5. Demand side

It seems that the software and related service providers were able to overcome a very challenging situation and to grow their revenues. The major driver of these revenues, however, was not new orders but already existing maintenance contracts. Thus it is worth inspecting the demand side as well. The reports referred to decreasing ICT expenditure in every sector due to heavy cost-cutting pressure. According to IDC’s survey-based research, the share of companies planning to increase or not to cut back their ICT spending exceeds the share of those planning to decrease it. Only one third of the small businesses plan to decrease the budget, while it is about 50% in the case of medium-sized ones, and the large ones are planning to cut back less.

Figure 3.5. Change in IT budget by company size, 2009
Source: IDC

Overall, 42% of the companies planned to decrease their spending, 27% planned no change and 31% planned an increase. On country level a growing tendency is taking shape.
However, only part of the ICT expenditure is spent on software and related services, and the covered period ends in 2008, which makes the graph less informative.

Figure 3.6. IT expenditure by countries [9] (billion $)
Source: Author’s own calculation, OECD, USCB

[9] Europe equals to EU27, Switzerland, Norway, Turkey; exchange rate as of Oct 25, EUR/USD = 1.4031 (http://www.x-rates.com/d/USD/EUR/graph120.html)

3.2. Conclusion of the analysis

Above I have presented four different analyses of the software industry. The lists, created by various organizations, include only the biggest actors of the sector. It is possible, however, to draw a comprehensive picture of the effects of the crisis on the industry. Across the different studies the revenue figures share a theme: the sector experienced minor but positive growth last year, which has recently started to accelerate again. The market is concentrated; therefore an unforeseen economic effect on the major players can reshape the landscape of the software industry. Many companies cut back their ICT budgets to stay competitive, but as the studies suggest, the sector is one of the key players in R&D and is moreover necessary to increase productivity and reach optimal resource allocation; therefore the budgets are expected to be replenished as the economy stabilizes.

4. Analysis of the Forges

In the following section I will attempt to answer the initial research problem. As Figure 1.1 in the introduction shows, the number of projects started to decrease significantly at the same time as the economic crisis began. It is very surprising that the OSS sector seems to experience a massive decrease in “output”. In this case output loss means a decrease in the number of projects registered in each repository since the beginning of the economic crisis. In case of an economic crisis most of the real economy agents are under pressure to decrease their costs.
OSS is in many cases a cheap alternative to the available proprietary software. The combination of the cost constraint and the cheap alternative should result in an increase in demand and thus in output as well, or output should at least stay steady. That is why the decrease in the number of projects in the observed time period is surprising. The volume of the decrease seems to be the same across the forges; however, it is not, since the y axis of the graph (Number of Projects) has a logarithmic scale. Therefore the volatility in the graphs does not represent the true values relative to each other. The number of projects hosted at Objectweb is in the 10-30 range, at Rubyforge it is around 1000 and at Sourceforge it is in the 10000 range. If the graph used a linear scale instead of the logarithmic one, the Objectweb line would appear as a flat line on the x axis.

On the one hand there is a massive output decrease in the OSS sector, which can be the result of productivity deterioration. On the other hand a deep economic recession started at the same time, which can also be the cause. My assumption is that the real economy has an effect on the OSS productivity level. The purpose of the analysis therefore is to shed light on the relationship between the real economy and OSS productivity.

The following chapter can be divided into four parts. The first part describes a number of variables that can capture the performance of the OSS sector. The second part discusses the factors that can affect these performance measures. In the third part an econometric model is presented to determine the effects of the different, previously defined factors on the performance of the Forges. Finally, a detailed discussion of the results follows.

4.1. Data

The building of the dataset is presented in two subsections: the first describes the OSS dataset, the second provides an overview of the macroeconomic dataset.

4.1.1.
OSS dataset

The OSS dataset contains data about projects hosted by the Freshmeat, Objectweb, Rubyforge and Sourceforge forges. Table 4.1. summarizes the variables selected from Flossmole (Howison, 2006) and their descriptions.

Table 4.1. Variables and their descriptions in the OSS dataset
Variable | Description
proj_unixname | Name under which the project is registered at the forge.
datasource_id | Identification number of the Flossmole dataset into which the record was collected. Differs across forge and time.
url | URL address of the project on the hosting site.
real_url | The project’s own html address if it has one in addition to the above, for example its own website or a mailing list different from the one on the host.
date_registered | Date the project was registered at the hosting site.
proj_long_name | Full name of the project.
proj_id | Identification number of the project. Identifies a project with a single number instead of the name.
dev_count | Total number of developers of the project.
date_collected | Date the data was collected into the Flossmole dataset.

4.1.2. Productivity and performance measure

Productivity is usually defined as the ratio of the output of a production process to the inputs used. In the case of the software industry it can be defined as the rate of producing some output using a set of inputs in a given unit of time. Thus the first step is to measure the output, which in our case is the software itself. Unfortunately, due to the complexity of the product there are several different software size measures. According to Chemuturi and Kaligotla these are the following:

Function Points: A unit of measurement to express the amount of business functionality an information system provides to the end user. The cost (in dollars or hours) of a single unit is calculated from past projects. (Cutting, 2009).
This unit of measure is used by the IFPUG Functional Size Measurement Method, which is one of the five ISO-recognized software metrics for sizing an information system. It is based on the functionality that is perceived by the user of the information system, independent of the technology used to implement the information system [10].

Object Points: According to Guiliano et al. (1999), object points are a method for estimating the size of object oriented (OO) software development projects. The experience that has been obtained with function points in traditional software development is carried over into the OO paradigm. To adapt function points to OO, function point concepts have to be mapped to object oriented concepts, and OO-specific concepts must be handled.

Equivalent Source Lines of Code (ESLOC): Source Lines of Code means the number of lines in a software’s source code. There are two different approaches. One is counting the physical lines, where every ‘enter’ or hard line break stands for one line. The other is counting the logical executable statements as lines. Equivalent SLOC takes into account the differences in effort required to incorporate new vs. inherited code into a delivered system, as well as the additional effort required to modify reused/adapted code for inclusion into the software product (Hihn, 2004).

Test Points: A measure of the size of testing a project, where a Test Point is equivalent to a normalized test case. It is common knowledge that test cases differ widely in terms of complexity and the activities necessary to execute them. Therefore, the test cases need to be normalized, just the way Function Points are normalized into one common measure using weighting factors. Now there are no uniformly
[10] Wikipedia contributors. “Function point.” Wikipedia, The Free Encyclopedia. 16 September 2010, 18:54 UTC. http://en.wikipedia.org/w/index.php?title=Function_point&oldid=385215816.
(accessed October 20, 2010)
agreed measures of normalizing the test cases to a common size. [11]

Use Case Points: A measurement of how much effort is required to write software, based on how much work the software is intended to do. The method was created by Gustav Karner of Rational Software Corporation in the mid-1990s. The method was based on a study of about 200 projects with an average size of 5 man-years of effort. The use case point method of estimation was found to be within 10% of the actual results for over 95% of the projects (Blain, 2007).

Feature Points: In 1986, Software Productivity Research, Inc. developed an experimental method for applying Function Point logic to system software such as operating systems, telephone switching systems, and the like. To avoid confusion with the IBM Function Point method this experimental alternative was called ‘Feature Points’. When Function Points are applied to such systems, they of course generate counts. However, the counts appear to be misleading for software that is high in algorithmic complexity, but sparse in inputs and outputs. From both a psychological and practical vantage point, these kinds of systems software seem to require a counting method that is equivalent to Function Points, but sensitive to the difficulties brought on by high algorithmic complexity. [12]

Etc.

[11] aralikatte “Test Point Estimation.” Scribd http://www.scribd.com/doc/4939959/Test-PointEstimation. (accessed October 20, 2010)
[12] “What are Feature Points?” Software Productivity Research. http://www.spr.com/featurepoints.html. (accessed October 20, 2010)

Beyond the large number of different software size measures, there is no generally accepted way of converting from one to another. Therefore it is possible that the sizes of software products will change relative to each other depending on the measurement system.
The fact that there is no single clear way to apply the Lines of Code methodology (whether to count the logical or the physical statements, how to treat inline documentation) makes the situation even more complicated.

However, the above mentioned issues are only one side of the coin. The other major problem with regard to software productivity measurement is that we want to explain a rather complex process with one single empirical figure. The developers have to work in an ever-changing, continuously evolving environment where, focusing on new technology in the telecommunications industry, software complexity increased by a factor of 10 in the early 2000s (Groth, 2004). Almost simultaneously, Web-based services built on Java were introduced. Moreover, a revolution took place in software outsourcing and organizational structure, which further increased the complexity of software development by a wide margin. Needless to say, the skill levels required by these activities are different, and the tools, inputs and outputs are all different. In my opinion, lumping them together, calling the result “Software Development” and then assigning a single productivity figure to it can at best result in an admittedly rough estimate. There are attempts to give a range for the productivity figures, such as 10 hours per Function Point, but this could vary from 2 to 135 depending on the product (Chemuturi and Kaligotla), or from 45 to 975 ESLOC/SM (lines of code/staff month) depending on the application domain (Reifer, 2004). Table 4.2. summarizes selected results from Reifer’s (2004) productivity calculations.

Table 4.2.
Software Productivity (ESLOC/SM) by selected Application Domains
Application Domain | Range (ESLOC/SM)
Banking | 155 to 550
Command and Control | 95 to 350
Data processing | 165 to 500
Scientific | 130 to 360
Telecommunications | 175 to 440
Web Business | 190 to 985
Source: Reifer (2004)

The description of Reifer’s analysis shows how complex a procedure it is to measure the size of software. The original calculation was conducted on 600 projects taken from Reifer’s database of more than 1,800 projects. These projects were completed between 1997 and 2004 by any of 40 organizations. A project is defined as the delivery of software to system integration. Projects include builds and products that are delivered externally, not internally. Both delivery of a product to market and a build to integration fit this definition. The scope of all projects starts with software requirements analysis and finishes with completion of software testing. The average number of hours per staff month was 152 in the original analysis, taking holidays, vacation, etc. into account. SLOC is defined as logical source lines of code using the conventions of Florac and Carleton (1999). ESLOC are defined to take into account reworked and reused code (Boehm, 1981). Reifer defined function point sizes as specified by the International Function Point Users Group (IFPUG). Function point sizes were converted to SLOC using backfiring factors published by IFPUG in 2000, as available on their web site.

As Table 4.2. shows, the variance of the results across the domains is quite large, which does not allow us to obtain anything close to a precise sector-wide estimate. To sum up, three factors were common themes among the vendors: the lack of an industry-wide standard definition for software productivity, the increasing complexity of software applications, and the need for more formalized processes in the industry as a whole.
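To make the spread in Table 4.2. more tangible, the ranges can be converted from ESLOC per staff month into ESLOC per staff hour using the 152 hours per staff month reported for the original analysis. The following is only an illustrative sketch: the function names `per_hour` and `spread` are my own and are not part of Reifer's (2004) method.

```python
# Ranges from Table 4.2 (ESLOC per staff month), as reported by Reifer (2004).
RANGES_ESLOC_PER_SM = {
    "Banking": (155, 550),
    "Command and Control": (95, 350),
    "Data processing": (165, 500),
    "Scientific": (130, 360),
    "Telecommunications": (175, 440),
    "Web Business": (190, 985),
}

# Average hours per staff month stated for the original analysis.
HOURS_PER_STAFF_MONTH = 152


def per_hour(esloc_per_staff_month):
    """Convert ESLOC per staff month into ESLOC per staff hour."""
    return esloc_per_staff_month / HOURS_PER_STAFF_MONTH


def spread(low, high):
    """Ratio between the upper and lower end of a productivity range."""
    return high / low


for domain, (low, high) in RANGES_ESLOC_PER_SM.items():
    print(
        f"{domain}: {per_hour(low):.2f} to {per_hour(high):.2f} ESLOC/hour, "
        f"upper/lower ratio {spread(low, high):.1f}"
    )
```

Even within a single application domain the upper end of the range is several times the lower end, which illustrates why a single sector-wide productivity figure would be a very rough estimate.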
As a solution for these complications, Chemuturi and Kaligotla suggest to “shift focus from macro productivity to micro productivity”. This means that the software development process should be divided into sub-segments, each of which should be treated differently and described with a different productivity estimate. This is beyond the scope of this thesis and the available resources. Therefore a different approach has been used.

The research will not study the productivity of the OSS sector, first and foremost because, as section 1.4. describes, access to data is limited and it was not possible to retrieve any data about the number of lines coded in any given period. Instead the study will analyze the activity of the community. The activity is proxied by counting the number of projects registered in a given time period. The steps of the dataset transformation are shown through the Rubyforge dataset. The same transformations were made in each of the four OSS hosting site datasets; the only difference is the number of observations.

The Rubyforge dataset contains data about 9000 projects hosted on its site. 40 Flossmole datasets with collection dates since July 2006 were combined to obtain a historical table. The original dataset contains 211633 observations. Throughout the transformation the aim was to create a dataset with information about each project and its registration date. Variable ‘project_id’ [13] was used to distinguish the projects from each other. The reason behind this is the problem of shared names across hosts. The following analysis discusses this problem, then the description of the Rubyforge dataset transformation continues.

[13] For descriptions see: Table 4.1.

OSS development environment – Forges: With the birth of the Forges, the creation and maturation of a large-scale market of users and developers of open source became possible.
They achieved this by providing a basic, no-cost infrastructure for the fundamental necessities of a project, such as the mailing list, which greatly increases the effectiveness of communication within the community around the project, or free file storage space. On the other hand, the widespread use of forges has a downside. Three of the most important disadvantages are the following. First and foremost is information dissemination: information about a project gets lost between bug tracking, mailing lists and forums, so it is not clear what happens in a project; projects find it difficult to interact with each other, and it is very hard, almost impossible, to track evolution between projects. The second negative effect of the forges is distributed development, which at first sight might seem a big advantage. It enables large-scale development by individual groups through distributed version control. It can lead to huge success (Linux Kernel), but combined with the first problem, the lack of information, it can often lead to confusion: it is hard to keep track of whether a certain problem has already been solved, which can lead to double coding. Finally, there is the shared names problem, namely that different projects can run under the same name on different forges. According to the FLOSSmole project there were 1367 projects with shared names on Rubyforge and Sourceforge in June 2009. For instance, starfish is a project listed on both Sourceforge and Rubyforge. On Rubyforge, it is described as a “tool to make programming ridiculously easy”, but on Sourceforge the starfish project is described as a password management application (Howison, 2006). Figure 4.1. shows the number of shared names across the largest project hosting sites.

Figure 4.1. Number of Shared Names across each Repository
Source: flossmole.org

The description of the Rubyforge dataset transformation continues here. As the first step, observations with no project id were dropped, which amounted to 31643 observations.
The second step was to remove the repeated observations of projects that were observed multiple times during the collection period. This resulted in a dataset with 9181 different projects. Then the projects with no registration date were removed, together with the ones that were registered in the first and last observed month, because the collection was not conducted on the last day of the month. As the final step the projects were counted and aggregated on a monthly level. The final dataset contains data about 8696 projects before the aggregation. The same dataset contains 231581, 55247 and 146 observations in the case of Sourceforge, Freshmeat and Objectweb respectively. Figure 4.2. shows the results with regard to Rubyforge.

Figure 4.2. Number of new projects registered in a month to Rubyforge
Source: Author’s own calculation

Figure 4.2. supports the Flossmole observation from Figure 1.1. very well: since the beginning of the 2008 depression the number of new projects has been decreasing, and the growth of the forges has been slowing down. The next figure, representing the Rubyforge dataset complemented with the Freshmeat data, shows similar results. The growth of the total number is decreasing; however, this decline starts earlier, around mid-2004, than at the Rubyforge site.

Figure 4.3. Number of new projects registered in a month to Freshmeat and Rubyforge
Source: Author’s own calculation

Interestingly enough, the Sourceforge community shows different trends over the same observation period. The first big and quick increase occurred at the end of 2000, when the until then steady number of new projects tripled in just two months. That might be a delayed result of the collapse of the so-called dot-com bubble (the “IT bubble” collapsed in May 10, 2000). The activity level stabilized at 1500-2000 for the next five years. Following this period, until mid-2008, the level of activity grew along with an increase in the variance of the number of newly registered projects.
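The transformation steps described above for the Rubyforge dataset (dropping observations without a project id, keeping one observation per project, dropping projects without a registration date or registered in the first and last observed month, and counting per month) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for the thesis: the field names `proj_id` and `date_registered` follow Table 4.1., dates are assumed to be ISO-formatted strings, the first observation of each project is kept, and the function name `monthly_registrations` is hypothetical.

```python
from collections import Counter


def monthly_registrations(records):
    """Count newly registered projects per month.

    `records` is an iterable of dicts with the keys 'proj_id' and
    'date_registered' (assumed to be 'YYYY-MM-DD' strings or None).
    """
    # Step 1: drop observations without a project id.
    with_id = [r for r in records if r.get("proj_id") is not None]

    # Step 2: keep a single observation per project (assumption: the first one).
    unique = {}
    for r in with_id:
        unique.setdefault(r["proj_id"], r)

    # Step 3: drop projects without a registration date.
    months = [
        r["date_registered"][:7]  # 'YYYY-MM'
        for r in unique.values()
        if r.get("date_registered")
    ]
    if not months:
        return {}

    # Step 4: drop the first and last observed month, which are incomplete.
    first, last = min(months), max(months)
    kept = [m for m in months if m not in (first, last)]

    # Step 5: aggregate on a monthly level.
    return dict(sorted(Counter(kept).items()))


# Toy example with made-up records (not taken from the Flossmole data).
sample = [
    {"proj_id": 1, "date_registered": "2008-01-15"},
    {"proj_id": 1, "date_registered": "2008-01-15"},  # repeated observation
    {"proj_id": None, "date_registered": "2008-02-01"},  # no project id
    {"proj_id": 2, "date_registered": "2008-02-03"},
    {"proj_id": 3, "date_registered": "2008-02-20"},
    {"proj_id": 4, "date_registered": None},  # no registration date
    {"proj_id": 5, "date_registered": "2008-03-09"},
    {"proj_id": 6, "date_registered": "2008-04-01"},
]
print(monthly_registrations(sample))
```

In the toy example the first and last months (January and April 2008) are discarded, so only the complete months in between contribute to the monthly counts.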
From 2008 until mid-2009 a massive decline of 26%-30% took place. Afterwards the number of new projects took off, doubling and reaching an all-time high of 4500.

Figure 4.4. Number of new projects registered in a month to Sourceforge
Source: Author’s own calculation

Given the wide variety of patterns, the above observations neither rule out nor confirm any relationship between the real economy and the level of OSS activity. To shed light on the relationship, in the following I present an econometric model together with a discussion of the related methodology and the results of the evaluation.

4.1.2. Macroeconomic data

The dataset consists of panel data, with quarters and countries being the two dimensions. Below, the choice of timeframe, number of countries and data sources is described. The period from 1998-Q1 to 2010-Q2 was chosen as the timeframe of the analysis. The period was limited to the last twelve years since reliable figures on OSS projects are available for it. With the use of quarterly data an adequate number of 50 time periods is obtained for each country.

To determine the countries included in the analysis I first searched for the origin of OSS development. Engelhardt and Freytag (2009) state that studies indicate that, firstly, OSS developers are well-educated software engineers (or ICT students), which is needed in order to be able to write software code (i.e. programming), and secondly, one must be able to think in abstract terms and logic. Additionally, most programming languages are based on English, and the whole communication and coordination of OSS projects is done in English. Therefore macroeconomic data has been collected for the OECD member countries and Brazil, Indonesia, the Russian Federation and South Africa. At this stage I face a problem, however: I have the same number of programs registered for all the different countries at the different times. Table 4.3.
provides a visualization of the problem, presenting selected rows and columns from the dataset. Records of the dependent variable (1F) [14] are constant over countries.

Table 4.3. Selected records from the analysis’ dataset
year | quarter | country | country id | unemp level | unemp rate (%) | 1F
1999 | 1 | Austria | 2 | 152667 | 4.0 | 471
1998 | 4 | Austria | 2 | 160333 | 4.2 | 324
1998 | 3 | Austria | 2 | 168667 | 4.4 | 313
1999 | 1 | Canada | 4 | 1224600 | 7.9 | 471
1998 | 4 | Canada | 4 | 1245467 | 8.1 | 324
1998 | 3 | Canada | 4 | 1258833 | 8.2 | 313
Source: STATA dataset

It is beyond the scope and resources of this thesis to explore the geographical origin of each project. Therefore I used Engelhardt and Freytag’s (2009) top 30 list of countries by active users to weight the dependent variables. As a result, the number of countries included in the analysis shrank to 25 [15]. In order to bring consistency to the analysis my aim was to collect data from as few sources as possible. The only source of the data is SourceOECD. Table 4.4. provides the description of each macroeconomic variable used in the analysis.

[14] See description of 1F in p. 36.
[15] In alphabetical order the countries included are: Australia, Austria, Belgium, Brazil, Canada, Czech Republic, Denmark, Finland, France, Germany, Israel, Italy, Japan, Mexico, Netherlands, New Zealand, Norway, Poland, Russian Federation, South Africa, Spain, Sweden, Switzerland, United Kingdom, United States

Table 4.4. Macroeconomic variables used in the analysis
Variable | Description
emp | Employment: People in civilian employment above a specified age who during the reference period were either paid employees, employers and self-employed, unpaid family workers or students with a temporary paid job (household survey based).
h_unem_r | Harmonized unemployment rate: Gives the number of unemployed people as a percentage of the civilian labour force. The civilian labour force consists of the employed and unemployed people.
g_indprod | Industrial production is an index covering production in mining, manufacturing and public utilities (electricity, gas and water), but excluding construction (growth since last quarter, seasonally adjusted).
g_retsales_vol | Retail trade volume index: Calculated by dividing total retail trade turnover in current prices by an appropriate price deflator (growth since last quarter, seasonally adjusted).
g_unit_labour_costs | Unit labour costs: Measure the average cost of labour per unit of output and are calculated as the ratio of total labour costs to real output (growth since last quarter, seasonally adjusted).
g_cons_prices | Consumer prices: Measure changes over time in the general level of prices of goods and services that a reference population acquires, uses or pays for consumption (growth since last quarter).
g_gdp | Gross Domestic Product at constant prices, seasonally adjusted (growth since last quarter).
b_money | Broad Money supply: In addition to currency in circulation plus sight deposits held by domestic non-banks, also includes time deposits as well as savings deposits at short notice held by domestic non-banks (growth since last quarter, seasonally adjusted).
share_p | Share Prices: Prices of common shares of companies traded on national or foreign stock exchanges (growth since last quarter).
exp | Export of Goods: Consists of exports of national products, exports without transformation of goods and exports from bonded warehouses which have not been transformed since import (in billions of USD, growth since last quarter, seasonally adjusted).
imp | Import of Goods: Consists of imports for direct domestic consumption; withdrawals from bonded warehouses and free zones for domestic consumption; and imports into bonded warehouses and free zones (in billions of USD, growth since last quarter, seasonally adjusted).
s_exp | Service Exports: Economic flows streaming into the economy from the rest of the world.
(in USD, growth since last quarter, seasonally adjusted).
s_imp | Service Import: Economic flows streaming from the economy to the rest of the world (in USD, growth since last quarter, seasonally adjusted).

4.1. Methodology

A linear regression attempts to model the relationship between the dependent variable and the independent (or explanatory) variable(s) (Wooldridge, 2008). The linear relationship can be visualized through a scatter plot. Equation (4.1) is the equation of the linear regression.

\[ Y_{it} = \beta_0 + X_{it}\beta + \varepsilon_{it} \tag{4.1} \]

Where: $Y_{it}$ is the dependent variable, $\beta_0$ is the constant term, $X_{it}$ contains the explanatory variables with coefficient vector $\beta$, and $\varepsilon_{it}$ is the error term.

In the case of panel data, however, the model has to be modified. Panel data is a dataset containing observations on the same topic from different times and from different places (Wooldridge, 2008), for instance the growth of GDP between 1998 and 2010 in the different OECD countries. Panel data models are used as a way of controlling for cross-sectional heterogeneity, which means that there is something “different” about the observed units, but it is not possible to reduce these differences completely to the observable data. Equation (4.1) is therefore extended to equation (4.2), which can be estimated by a fixed or a random effect model.

\[ Y_{it} = \beta_0 + X_{it}\beta + Z_i\gamma + \alpha_i + \varepsilon_{it} \tag{4.2} \]

Where: $Y_{it}$ is the dependent variable, $\beta_0$ is the constant term, $X_{it}$ contains the explanatory variables with coefficient vector $\beta$, $\varepsilon_{it}$ is the error term, $Z_i\gamma$ is the effect of observed time-invariant factors, which cannot be estimated through the fixed effect model, and $\alpha_i$ is the unobserved individual specific effect, a fixed effect for each individual across time.

The difference between the two models is in the underlying assumptions (Wooldridge, 2008):

By using the fixed effect model we assume that the individual specific effect is correlated with the independent variables. In this case the time-invariant factors – such as gender, name, etc.
– will be excluded from the equation by taking the difference between each observation and the within-group mean values, in order to get rid of the individual specific effect term $\alpha_i$.

\[ E(\alpha_i \mid X_{it}, Z_i) \neq 0 \]

On the other hand, in the case of the random effect model we assume that the individual specific effects are uncorrelated with the explanatory variables. All the coefficients will be estimated, whether they belong to time-variant or time-invariant variables. Since in this case there is no fixed individual specific effect, $\alpha_i$ and $\varepsilon_{it}$ can be combined to form a new error term $\xi_{it}$. Therefore we do not need to take differences and all variables will be included.

\[ E(\alpha_i \mid X_{it}, Z_i) = 0 \]

4.2. Model

After taking the above discussion into consideration, I have decided to use the random effect model to estimate the relationship between the number of projects registered and the changes in certain macroeconomic factors. The final model is expressed in equation (4.3).

\[
\begin{aligned}
\mathit{g\_forges}_{it} = {} & \beta_0 + \beta_1\,\mathit{emp}_{it} + \beta_2\,\mathit{h\_unem\_r}_{it} + \beta_3\,\mathit{g\_indprod}_{it} + \beta_4\,\mathit{g\_retsales\_vol}_{it} \\
& + \beta_5\,\mathit{g\_unit\_labour\_costs}_{it} + \beta_6\,\mathit{g\_cons\_prices}_{it} + \beta_7\,\mathit{g\_gdp}_{it} + \beta_8\,\mathit{b\_money}_{it} \\
& + \beta_9\,\mathit{share\_p}_{it} + \beta_{10}\,\mathit{exp}_{it} + \beta_{11}\,\mathit{imp}_{it} + \beta_{12}\,\mathit{s\_exp}_{it} + \beta_{13}\,\mathit{s\_imp}_{it} \\
& + \gamma_t + \delta_i + \xi_{it}
\end{aligned}
\tag{4.3}
\]

Where: $i$ represents the given country, $t$ represents the given quarter, $\gamma_t$ captures the time variant factors by quarter dummies, and $\delta_i$ captures the country variant factors by country dummies.

4.3. Results

Altogether eight different regressions have been conducted. The eight models can be separated into two groups: the first uses the robust standard error calculation method (Model 1-4), while the second computes the standard errors in the usual way (Model 1B-4B). The models differ in the number of quarters observed and in the dependent variables used, rather than in the independent variables.
Three different datasets and two dependent variables have been constructed in order to capture differences across the forges: the first dataset (1DS) contains data on all the 50 observed quarters (from 1998-Q1 to 2010-Q2); the second (2DS) keeps information on the variables from 2003-Q3 to 2009-Q4; finally, the third dataset (3DS) has data on 40 quarters from 2000-Q1 to 2009-Q4. The first dependent variable (1F) aggregates the four forges’ records; the second (2F) sums the Sourceforge and Freshmeat data, which account for 97% of the observed number of projects.

Table 4.5. and 4.6. summarize the results of the equations; the description of the model specifications follows after the output tables.

Table 4.5. RE Estimation Results for Model 1-4 (robust standard errors)
Variable | Model 1 | Model 2 | Model 3 | Model 4
Employment | -0.000* (0.000) | -0.000 (0.000) | 0.000 (0.000) | -0.000** (0.000)
Unemployment Rate | -0.012 (0.034) | 0.007 (0.044) | 0.014 (0.011) | 0.019* (0.011)
Industrial Production | -0.003 (0.012) | -0.002 (0.013) | 0.007* (0.003) | 0.006 (0.005)
Retail Sales Volume | 0.004 (0.032) | -0.025 (0.043) | 0.006 (0.006) | 0.006 (0.008)
Unit Labour Costs | 0.013 (0.068) | 0.059 (0.092) | -0.008 (0.012) | 0.014 (0.014)
Consumer Prices | -0.044 (0.040) | -0.065 (0.051) | 0.002 (0.011) | 0.012 (0.014)
GDP Growth | 0.035 (0.034) | 0.064 (0.047) | -0.022** (0.009) | 0.003 (0.013)
Broad Money Supply | -0.001 (0.018) | -0.004 (0.019) | -0.005 (0.005) | -0.011* (0.006)
Share Prices | -0.002 (0.004) | -0.005 (0.005) | 0.001 (0.001) | 0.000 (0.001)
Export of Goods | 0.008 (0.008) | 0.009 (0.009) | -0.002 (0.002) | 0.002 (0.002)
Import of Goods | -0.023** (0.010) | -0.024** (0.012) | -0.001 (0.003) | -0.007** (0.003)
Service Exports | 0.011 (0.008) | 0.009 (0.009) | 0.002** (0.001) | 0.001 (0.001)
Service Imports | -0.008 (0.008) | -0.007 (0.010) | -0.003** (0.002) | -0.004* (0.002)
Constant | 1.664* (0.910) | 2.233 (1.381) | 0.061 (0.103) | 0.304*** (0.107)
overall R2 | 0.140 | 0.139 | 0.393 | 0.421
Number of obs. | 516 | 516 | 276 | 416
The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

Table 4.6. RE Estimation Results for Model 1B-4B (normal standard errors)

                        Model 1B            Model 2B            Model 3B            Model 4B
Employment              -0.000*** (0.000)   -0.000*** (0.000)   0.000 (0.000)       -0.000** (0.000)
Unemployment Rate       -0.012 (0.037)      0.007 (0.048)       0.014* (0.009)      0.019** (0.009)
Industrial Production   -0.003 (0.017)      -0.002 (0.022)      0.007* (0.003)      0.006 (0.004)
Retail Sales Volume     0.004 (0.028)       -0.025 (0.036)      0.006 (0.006)       0.006 (0.007)
Unit Labour Costs       0.013 (0.054)       0.059 (0.071)       -0.008 (0.012)      0.014 (0.013)
Consumer Prices         -0.044 (0.054)      -0.065 (0.071)      0.002 (0.011)       0.012 (0.012)
GDP Growth              0.035 (0.049)       0.064 (0.064)       -0.022** (0.010)    0.003 (0.012)
Broad Money Supply      -0.001 (0.023)      -0.004 (0.030)      -0.005 (0.004)      -0.011** (0.005)
Share Prices            -0.002 (0.004)      -0.005 (0.005)      0.001 (0.001)       0.000 (0.001)
Export of Goods         0.008 (0.009)       0.009 (0.012)       -0.002 (0.002)      0.002 (0.002)
Import of Goods         -0.023** (0.010)    -0.024* (0.014)     -0.001 (0.002)      -0.007*** (0.003)
Service Exports         0.011 (0.007)       0.009 (0.009)       0.002* (0.001)      0.001 (0.002)
Service Imports         -0.008 (0.008)      -0.007 (0.011)      -0.003* (0.002)     -0.004* (0.002)
Constant                1.664*** (0.359)    2.233*** (0.471)    0.061 (0.101)       0.304*** (0.095)
Overall R2              0.140               0.139               0.393               0.421
Number of obs.          516                 516                 276                 416

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

Both Model 1 (M1) and Model 2 (M2) are based on the first dataset (1DS), but while (1F) is the dependent variable in (M1), in (M2) it is (2F). The difference between the two dependent variables is that (2F) excludes the Rubyforge data. Since Rubyforge accounts for only about 3% of the total number of new projects, there is no major difference in the results.
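The link between a coefficient, its standard error and the significance stars in Tables 4.5 and 4.6 can be sketched as follows. This is an illustration only, assuming two-sided tests with the usual large-sample normal critical values (1.645, 1.960, 2.576); the helper name `stars` is hypothetical, and because the tables report values rounded to three decimals, borderline cases (for example the Employment row, printed as -0.000) cannot be reproduced from the printed figures.

```python
def stars(coef, se):
    """Significance stars from |coef / se|, assuming normal critical values."""
    z = abs(coef / se)
    if z >= 2.576:   # 1% level
        return "***"
    if z >= 1.960:   # 5% level
        return "**"
    if z >= 1.645:   # 10% level
        return "*"
    return ""


# Coefficient and standard error pairs copied from Tables 4.5 and 4.6.
examples = {
    "Constant, Model 1 (robust SE)": (1.664, 0.910),
    "Constant, Model 1B (normal SE)": (1.664, 0.359),
    "Import of Goods, Model 2 (robust SE)": (-0.024, 0.012),
    "Import of Goods, Model 2B (normal SE)": (-0.024, 0.014),
}

for label, (coef, se) in examples.items():
    print(f"{label}: z = {coef / se:.2f} {stars(coef, se)}")
```

The point estimates are identical in the two tables; only the standard errors differ, which is why the same coefficient can carry a different number of stars under the robust and the normal calculation.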
The overall R2 is fairly low in both cases: around 14% of the variance in the dependent variable is explained by the independent variables. Only the Import of Goods is found to be significant at the 5% level. This means that a 1% change in the growth of imports results in a -0.023% change in the growth of OSS activity. Employment is significant at the 10% level in (M1); however, the size of the impact is negligible. The low R2 of (M1) and (M2) leaves room for improvement.

The regression in Model 3 (M3) is conducted on (2DS) with (1F) as the dependent variable. In this case the model explains considerably more of the variance in OSS activity: the R2 is 0.393. This model finds GDP growth, Service Exports and Service Imports to be significant at the 5% level, with point estimates of (-0.022), 0.002 and (-0.003) respectively.

Model 4 (M4) regresses (2F) on data from (3DS). This model shows the most convincing R2 so far, 0.421. Furthermore, it is the model with the largest number of significant regressors: Employment and Import of Goods at the 5% level, and Unemployment, Money Supply and Service Imports at the 10% level. The effect of employment is once again negligibly small. Meanwhile, a 1% change in the growth of imported goods leads to a -0.007% change in the growth of OSS activity. Unemployment has a positive effect of 0.019%, while both money supply and service imports have negative effects, of 0.011% and 0.004% respectively. (M4) seems to provide a good basis for the next step of the analysis, where I create different models using the (2DS) and (2F) specifications but vary the regressors.

Table 4.6. RE Estimation Results for Model 4-10 (robust standard errors)

                        Model 4             Model 5             Model 6             Model 7             Model 8             Model 9             Model 10
Employment              -0.000** (0.000)    -                   -                   -                   -                   -                   -
Unemployment            0.019* (0.011)      -                   -                   -                   -                   -                   0.019* (0.010)
Industrial Prod.        0.006 (0.005)       -                   -                   0.003 (0.004)       0.004 (0.004)       -                   0.008** (0.004)
Retail Sales            0.006 (0.008)       -                   -0.010* (0.006)     -                   -0.012** (0.005)    -                   0.002 (0.008)
Unit Labour Costs       0.014 (0.014)       -                   -                   0.006 (0.010)       0.005 (0.010)       -                   0.018 (0.014)
Consumer Prices         0.012 (0.014)       -                   0.003 (0.008)       -                   0.012 (0.011)       -                   0.017 (0.014)
GDP Growth              0.003 (0.013)       0.007 (0.006)       0.028*** (0.007)    0.011 (0.011)       0.023** (0.011)     0.012* (0.006)      -
Money Supply            -0.011* (0.006)     -0.010** (0.004)    -                   -                   -                   -0.010** (0.005)    -0.011** (0.005)
Share Prices            0.000 (0.001)       0.001 (0.001)       -                   -                   -                   0.001 (0.001)       0.001 (0.001)
Exp. of Goods           0.002 (0.002)       -                   0.001 (0.001)       -                   0.002 (0.002)       -                   0.002 (0.002)
Imp. of Goods           -0.007** (0.003)    -                   -0.005*** (0.002)   -                   -0.008*** (0.002)   -                   -0.007** (0.003)
Service Exports         0.001 (0.001)       -                   -0.000 (0.001)      -                   -                   -0.001 (0.001)      -0.000 (0.000)
Service Imports         -0.004* (0.002)     -                   -0.001 (0.001)      -                   -                   -0.002* (0.001)     -0.002 (0.002)
Constant                0.304*** (0.107)    0.277*** (0.034)    0.251*** (0.031)    0.246*** (0.032)    0.252*** (0.033)    0.279*** (0.035)    0.170** (0.076)
R2                      0.421               0.365               0.378               0.366               0.393               0.371               0.408
Number of obs.          416                 627                 898                 796                 796                 587                 436

The values in parentheses are the standard errors. A dash marks a regressor that is not included in the model. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

Models 5-10 capture the differences between the effects of the different macroeconomic factors on OSS activity. In general, the regressors explain 36.5% to 42.1% of the variation in the response. Model 5 pictures the effects of changes in the money supply and in share prices; GDP growth is included to capture the state of the given country's economy. Of the three variables, only money supply is found to be significant, at the 5% level. The coefficient has a negative sign; therefore OSS activity growth increases by 0.011% along with a 1% decrease in money supply growth. Model 9 is an extension of (M5): it includes the balance of accounts by following the inward and outward economic flows.
It seems that OSS activity is only affected by the inward flows; however, the effect is very small, with a point estimate of (-0.002), and significant only at the 10% level. The money supply is still significantly negative, but the coefficient increased to (-0.010). Finally, GDP growth becomes significant at the 10% level with a point estimate of 0.012.

Model 6 focuses on the consumption side of the economy. The growth of retail sales is significant only at the 10% level, with a (-0.010%) effect on OSS activity. Consumer prices, the export of goods and service exports have no significant effect, which is a common pattern for these three variables across the models. The import of goods has a small but highly significant (at the 1% level) point estimate of (-0.005). In this model, unlike in (M9), service imports have no significant effect. GDP growth has a significant effect of 0.028% on OSS activity growth, the largest GDP growth effect across the models.

Model 7 focuses on the production side of the economy; neither industrial production, GDP growth nor labour productivity (proxied by unit labour costs) is found to be significant.

Model 8 combines the consumption (M6) and production (M7) sides. The results are very similar to the previous ones: none of (M7)'s variables becomes significant. The previously significant variables from (M6) remain significant, although the levels change. Retail sales is significant at the 5% level, up from the previous 10%, while GDP growth drops to the 5% level from the previous 1%. The import of goods is significant at the 1% level once again. The point estimates are (-0.012), 0.023 and (-0.008) respectively.

Model 10 includes all the variables except GDP and Employment. This is the only model where industrial production is significant, with a point estimate of 0.008 at the 5% level. Money supply and import of goods are also significant, as in all other models in which these variables were included.
The point estimates are the same as in the base model (M4). Unemployment also has the same effect and the same significance level. The only difference is that industrial production becomes significant.

In addition to the aggregate OSS sector analysis, I have conducted estimations on the individual forges. The (M4)-(M10) specifications were used with each forge's activity growth as the dependent variable. Objectweb was excluded from the analysis due to the limited number of observations.

Table 4.7. Results for Forge wise RE estimation with (M4) regressors

                        Model 4             Model 4sf           Model 4fm           Model 4rf
Employment              -0.000** (0.000)    -0.000** (0.000)    -0.000** (0.000)    -0.000 (0.000)
Unemployment            0.019* (0.011)      0.024 (0.023)       0.004 (0.030)       -0.078*** (0.026)
Industrial Production   0.006 (0.005)       0.007 (0.011)       -0.005 (0.010)      -0.016 (0.010)
Retail Sales            0.006 (0.008)       0.003 (0.018)       -0.003 (0.028)      0.037** (0.015)
Unit Labour Costs       0.014 (0.014)       0.031 (0.027)       -0.060 (0.045)      0.059 (0.038)
Consumer Prices         0.012 (0.014)       0.019 (0.031)       -0.024 (0.027)      -0.003 (0.026)
GDP Growth              0.003 (0.013)       0.051 (0.036)       0.009 (0.028)       0.024 (0.030)
Broad Money             -0.011* (0.006)     -0.019 (0.015)      -0.000 (0.016)      -0.016 (0.019)
Share Prices            0.000 (0.001)       0.002 (0.003)       -0.000 (0.003)      0.003* (0.002)
Export of Goods         0.002 (0.002)       0.011* (0.006)      0.006 (0.007)       -0.004 (0.004)
Import of Goods         -0.007** (0.003)    -0.016** (0.007)    -0.008 (0.009)      0.010** (0.005)
Service Exports         0.001 (0.001)       -0.002 (0.003)      0.008 (0.008)       0.004 (0.003)
Service Imports         -0.004* (0.002)     -0.006 (0.004)      -0.005 (0.007)      -0.007** (0.003)
Constant                0.304*** (0.107)    0.670** (0.274)     0.616** (0.287)     0.515* (0.300)
R2                      0.421               0.297               0.106               0.193
Number of obs.          416                 416                 506                 272

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output

In the general model with all of the independent variables there is a fairly large variance in R2 across the forges. None of the variables is significant in the case of Freshmeat except employment, but the size of the effect is negligible. The aggregate-level analysis and Sourceforge show similar results: in both models the import of goods is significant at the 5% level, with a point estimate of (-0.016) in the latter case, more than double the effect in the aggregate model. The export of goods also has a significant effect on the Sourceforge community's activity, with a coefficient of 0.011. In the case of Rubyforge, unemployment has the most significant and also the largest effect, at (-0.078%). Retail sales, import of goods and service imports are also significant at the 5% level, with point estimates of 0.037, 0.010 and (-0.007) respectively.

Table 4.8. Results for Forge wise RE estimation with (M5) regressors

                        Model 5             Model 5sf           Model 5fm           Model 5rf
GDP Growth              0.007 (0.006)       0.034** (0.016)     0.021 (0.018)       0.006 (0.009)
Broad Money             -0.010** (0.004)    -0.016 (0.011)      0.000 (0.010)       0.020 (0.015)
Share Prices            0.001 (0.001)       0.002 (0.002)       0.001 (0.001)       0.002** (0.001)
Constant                0.277*** (0.034)    0.483*** (0.085)    0.100 (0.078)       -0.002 (0.066)
R2                      0.365               0.256               0.079               0.030
Number of obs.          627                 627                 764                 407

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

GDP, money supply and share prices once again cannot explain any variance significantly in the case of Freshmeat. Rubyforge's R2 dropped to almost 0 from nearly 0.2 under the (M4) regressors, even though changes in share prices are significant and account for a 0.002% effect on activity. GDP is the only significant variable in the case of Sourceforge, with a point estimate of 0.034.
Broad money is not significant for any of the forges, although it is at the aggregate level.

Table 4.9. Results for Forge wise RE estimation with (M6) regressors

                        Model 6             Model 6sf           Model 6fm           Model 6rf
Retail Sales            -0.010* (0.006)     -0.027* (0.014)     0.007 (0.015)       0.003 (0.011)
Consumer Prices         0.003 (0.008)       0.006 (0.017)       0.000 (0.016)       0.000 (0.017)
GDP Growth              0.028*** (0.007)    0.077*** (0.017)    0.030* (0.018)      0.012 (0.012)
Export of Goods         0.001 (0.001)       0.007** (0.004)     0.002 (0.004)       0.000 (0.003)
Import of Goods         -0.005*** (0.002)   -0.010*** (0.004)   -0.006 (0.005)      0.002 (0.003)
Service Exports         -0.000 (0.001)      -0.001 (0.001)      0.000 (0.003)       -0.000 (0.002)
Service Imports         -0.001 (0.001)      -0.002 (0.002)      0.001 (0.002)       0.003* (0.002)
Constant                0.251*** (0.031)    0.447*** (0.076)    0.086 (0.073)       0.042 (0.050)
R2                      0.378               0.272               0.079               0.025
Number of obs.          898                 898                 1080                589

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

The consumption variables have an almost nonexistent effect on activity at both Rubyforge and Freshmeat. Service imports are significant for Rubyforge only at the 10% level, with a very low point estimate of 0.003, while GDP growth is significant at the same level for Freshmeat, with a point estimate of 0.030. Sourceforge once again has very similar results to the aggregate level. The only differences are the size of the effects and that the export of goods is significant, with a 0.007 coefficient.

Table 4.10. Results for Forge wise RE estimation with (M7) regressors

                        Model 7             Model 7sf           Model 7fm           Model 7rf
Industrial Production   0.003 (0.004)       -0.005 (0.008)      -0.013** (0.007)    -0.016* (0.009)
Unit Labour Costs       0.006 (0.010)       0.022 (0.021)       -0.013 (0.029)      0.094*** (0.023)
GDP Growth              0.011 (0.011)       0.076*** (0.027)    0.075*** (0.026)    0.084*** (0.024)
Constant                0.246*** (0.032)    0.409*** (0.078)    0.072 (0.078)       -0.058 (0.054)
R2                      0.366               0.265               0.084               0.077
Number of obs.          796                 796                 974                 511

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

The model focusing on the production side of the economy has no significant variables at the aggregate level; at the forge level, however, 1% GDP growth is associated with 0.076%, 0.075% and 0.084% growth in activity at Sourceforge, Freshmeat and Rubyforge respectively. Furthermore, changes in industrial production have a negative effect on Freshmeat, with a (-0.013) coefficient. According to this specification, Rubyforge's activity growth increases by 0.094% along with a 1% decrease in labour productivity.

Table 4.11. Results for Forge wise RE estimation with (M8) regressors

                        Model 8             Model 8sf           Model 8fm           Model 8rf
Industrial Production   0.004 (0.004)       -0.003 (0.008)      -0.013* (0.007)     -0.017** (0.008)
Retail Sales            -0.012** (0.005)    -0.038*** (0.014)   0.017 (0.016)       0.008 (0.012)
Unit Labour Costs       0.005 (0.010)       0.017 (0.020)       -0.016 (0.029)      0.094*** (0.023)
Consumer Prices         0.012 (0.011)       0.029 (0.024)       0.004 (0.018)       -0.016 (0.019)
GDP Growth              0.023** (0.011)     0.097*** (0.027)    0.071*** (0.027)    0.074*** (0.026)
Export of Goods         0.002 (0.002)       0.010** (0.005)     0.002 (0.005)       -0.000 (0.003)
Import of Goods         -0.008*** (0.002)   -0.015*** (0.006)   -0.005 (0.007)      0.005 (0.004)
Constant                0.252*** (0.033)    0.423*** (0.080)    0.058 (0.076)       -0.062 (0.054)
R2                      0.393               0.294               0.086               0.083
Number of obs.          796                 796                 974                 511

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

Model 8 combines the consumption (M6) and production (M7) sides.
In the case of Sourceforge, in addition to the aggregate model's significant variables, the export of goods is found to be significant with a 0.010 point estimate. Freshmeat is affected by GDP growth at the 1% level, the same as Rubyforge, and the estimates are also very close. Labour productivity and industrial production once again significantly affect the activity growth of Rubyforge.

Table 4.12. Results for Forge wise RE estimation with (M9) regressors

                        Model 9             Model 9sf           Model 9fm           Model 9rf
GDP Growth              0.012* (0.006)      0.045*** (0.017)    0.016 (0.019)       0.005 (0.010)
Broad Money             -0.010** (0.005)    -0.015 (0.011)      -0.005 (0.011)      0.017 (0.015)
Share Prices            0.001 (0.001)       0.003 (0.002)       0.000 (0.001)       0.002* (0.001)
Service Exports         -0.001 (0.001)      -0.003 (0.002)      0.001 (0.003)       -0.002 (0.002)
Service Imports         -0.002* (0.001)     -0.005** (0.002)    0.001 (0.002)       0.002 (0.002)
Constant                0.279*** (0.035)    0.495*** (0.087)    0.107 (0.079)       0.008 (0.067)
R2                      0.371               0.267               0.077               0.031
Number of obs.          587                 587                 711                 377

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

Model 9 has no variables significant at the 5% level for the two small forges; only share prices reach the 10% level at Rubyforge. Only GDP growth and service imports affect Sourceforge significantly. The estimates are 0.045 and (-0.005) respectively. Interestingly enough, the effect of the money supply became insignificant in all three cases, while it is significant at the aggregate level.

Table 4.13. Results for Forge wise RE estimation with (M10) regressors

                        Model 10            Model 10sf          Model 10fm          Model 10rf
Unemployment            0.019* (0.010)      0.018 (0.021)       0.011 (0.029)       -0.078*** (0.023)
Industrial Production   0.008** (0.004)     0.019** (0.008)     0.001 (0.009)       -0.011 (0.007)
Retail Sales            0.002 (0.008)       0.002 (0.017)       0.010 (0.024)       0.041*** (0.014)
Unit Labour Costs       0.018 (0.014)       0.025 (0.027)       -0.030 (0.043)      0.051 (0.036)
Consumer Prices         0.017 (0.014)       0.030 (0.032)       0.009 (0.023)       -0.005 (0.026)
Broad Money             -0.011** (0.005)    -0.016 (0.015)      0.006 (0.016)       -0.013 (0.019)
Share Prices            0.001 (0.001)       0.003 (0.003)       -0.000 (0.003)      0.003* (0.002)
Export of Goods         0.002 (0.002)       0.009 (0.006)       0.006 (0.007)       -0.004 (0.004)
Import of Goods         -0.007** (0.003)    -0.015** (0.007)    -0.006 (0.009)      0.011** (0.005)
Service Exports         -0.000 (0.001)      -0.002 (0.003)      0.004 (0.007)       0.004 (0.003)
Service Imports         -0.002 (0.002)      -0.006 (0.004)      -0.000 (0.007)      -0.008** (0.003)
Constant                0.170** (0.076)     0.398** (0.167)     0.025 (0.180)       0.420** (0.172)
R2                      0.408               0.282               0.078               0.189
Number of obs.          436                 436                 533                 273

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%. All models have been estimated using country dummies and quarter dummies (not shown). Source: STATA output

Finally, Model 10. Freshmeat is once again explained very poorly by the model, with no significant variable. In the case of Sourceforge, the only significant variables are industrial production, 0.019, and import of goods, (-0.015). Rubyforge is affected by the largest number of factors. Unemployment (-0.078) and retail sales (0.041) are both significant at the 1% level, and import of goods and service imports at the 5% level, with point estimates of 0.011 and (-0.008). Finally, share prices are significant only at the 10% level, with a low effect of 0.003%.

4.4. Conclusions of the analyses

The results found in section 4 can be divided into two parts.
The first part describes the behaviour of the forges' aggregate activity in response to changes in different macroeconomic factors. The second part compares the effects of the same variables on the activity growth of the individual forges.

At the aggregate level seven different models, (M4)-(M10), were built in order to estimate the effects of different macroeconomic factors. Unemployment, money supply and import of goods were found to be significant in all models in which they were included. GDP growth, industrial production, retail sales and service imports were significant only in certain models. The consistently significant factors show limited or no variation across the models. Unemployment has the largest effect, with a point estimate of 0.019; it is also the only one of the three variables that is positively related to the dependent variable. Money supply changes have an estimated effect of (-0.010) to (-0.011). The variation in coefficients is largest for imported goods, ranging between (-0.005) and (-0.008). These effects seem very small, but put into a different context it turns out that a 1% growth in the unemployment rate can result in 349 [16] new projects registered on the sites compared to the previous quarter's figure. For money supply and imports, the same size of growth can result in 184 and 92-147 fewer projects registered on the forges respectively.

[16] Details of the calculation can be found in Appendix A.

The regressions on the change in each individual forge's activity were specified in the same way.

Sourceforge: GDP consistently has a significant effect on Sourceforge's activity. The estimates are always positive, which means that the activity moves along with the business cycle. The size of the effect varies between 0.034 and 0.097 depending on the model specification.
The import of goods also seems to have a significant effect on Sourceforge; however, the relationship is reversed in this case, with negative estimates that vary from (-0.010) to (-0.015). This equals approximately 370-555 [17] fewer projects registered in the case of a 1% increase in the growth of the value of imported goods.

Freshmeat follows a totally different activity path; all the models had a low R2, meaning that there are other, unobserved factors causing the variance in the activity changes. Industrial production and GDP have significant effects in some cases. While the former has a negative estimate of (-0.013), the latter has a larger, positive impact with coefficients of 0.071 and 0.075. This equals 1491-1575 [18] more projects in the case of a 1% increase in the growth rate of GDP.

Rubyforge: Interestingly enough, the smallest forge is significantly sensitive to changes in unit labour costs; the relationship is positive, with a 0.094 estimate. This means that a one-unit decline in labour productivity, that is, an increase in unit labour costs (ULC), leads to approximately 846 [19] more projects registered with the host. The other significant variable is GDP growth, with a point estimate of 0.074 to 0.084.

The above results suggest that there are differences between the forges, each being affected differently by the same macroeconomic factors. This can be the result of the different types of projects hosted, or of differences in the composition of the developer community. Unfortunately it is not possible to determine which factor affects the activity, and how, based on the available dataset.

[17], [18], [19] For details see Appendix B.

5. Conclusion

The economic crisis that started in late 2008 resulted in huge changes in the world economy.
Numerous developed countries had to face slowing or even negative GDP growth; Ireland, the former "Celtic tiger", and Greece received enormous aid packages to be able to keep financing their debt. These hard times, according to Flossmole, seem to affect the OSS sector as well, and through it the whole software industry.

According to the analysis conducted in section 3, the software industry was indeed affected by the crisis, but the data suggest that the effect was minor in terms of revenue lost compared, for example, to the construction industry. Both the European and the global market were able to grow; however, this growth lagged behind the high rates experienced previously. There were losers and winners; in general, among the different business models, those with 50-60% of revenue from subscription fees and 40-50% from services gained from the crisis (Desmond J. P., 2010). The industry is a major employer in the R&D sector, and despite the depression it employs more and more people from year to year.

Section 4 analyzed the behaviour of OSS activity in response to macroeconomic factors. The models show that the effect on activity is limited; in general, import of goods, the unemployment rate and money supply affect the sector. However, there are differences across the forges. In general, due to the crisis many entrepreneurs turned to the OSS sector because of cost-cutting incentives.

As the second section shows, OSS is a very complex phenomenon with numerous research possibilities. The activity of the community is affected by many different factors such as motivation, private investment in the sector and types of licenses; macroeconomic factors are only a small part of it. According to the results, changes in the real economy have a limited effect on the OSS sector; there are numerous unobserved factors not covered by this study.
For further research it could be interesting to expand the dataset to the lines of code generated in the forges. This would result in a more precise dataset, because the database I built during the study has its limitations. To name only one, the activity is weighted only by country and the weights are therefore constant over time, whereas the number of active developers changes dynamically over time. Furthermore, the low R2 suggests room for improvement by including new variables.

List of References

Barro, R. J. (1989), Modern Business Cycle Theory, Harvard University Press, Cambridge, Mass.
Benkler, Y. (2002), Coase's penguin or Linux and the nature of the firm. Yale Law Journal, 112(3): 369-446.
Bessen, J. (2002), What good is free software? In Hahn (ed.) (2002), Government policy towards Open Source Software, pp. 12-33. AEI-Brookings Joint Center for Regulatory Studies, Washington DC, USA.
Bitzer, J. (2004), Commercial versus open source software: The role of product heterogeneity in competition. Economic Systems, 28(4): 369-381.
Bitzer, J. and Schröder, P. (eds) (2006), The economics of open source software development, Elsevier, Amsterdam.
Blain, T. "Software Cost Estimation With Use Case Points – Introduction." Tyner Blain Blog. February 12, 2007. http://tynerblain.com/blog/2007/02/12/software-cost-estimation-ucp-1/ (accessed October 20, 2010)
Bonaccorsi, A. and Rossi, C. (2003), Why open source can succeed, Research Policy, 32(7): 1243-1258.
Boehm, B. W. (1981), Software Engineering Economics, Prentice-Hall.
Casadesus-Masanell, R. and Ghemawat, P. (2003), Dynamic Mixed Duopoly: A Model Motivated by Linux vs. Windows. Harvard Business School working paper.
Casarin, R. and Trecroci, C. (2007), Business Cycle and Stock Market Volatility: Are They Related?, working paper, available at SSRN: http://ssrn.com/abstract=888524 (accessed: 5 June, 2010)
Chemuturi, M. and Kaligotla, S. "Productivity for Software Estimators", http://www.metricssoftware.com/Productivity%20for%20Software%20Estimators.pdf (accessed October 19, 2010)
Chauvet, M. (2001), Stock Market Fluctuations And The Business Cycle, working paper, available at SSRN: http://ssrn.com/abstract=283793 (accessed: 3 June, 2010)
Chordia, T. and Shivakumar, L. (2001), Earnings, Business Cycle And Stock Returns, working paper, available at SSRN: http://ssrn.com/abstract=281491 (accessed: 3 June, 2010)
Cutting, T. (2009), "Estimating Lessons Learned in Project Management Traditional." http://www.pmhut.com/estimating-lessons-learned-in-project-management-traditional (accessed October 20, 2010)
Deci, E. L. and Ryan, R. M. (1985), Intrinsic motivation and Self-determination in Human Behaviour, Plenum Press, New York.
Economides, N. and Katsamakas, E. (2005), Two-sided competition of proprietary vs. open source technology platforms and the implications for the software industry, working paper, Stern School of Business, New York University.
Engelhardt, S. and Freytag, A. (2009), Geographic Allocation of OSS Contributions: The Role of Institutions and Culture, Jena Economic Research Papers, Friedrich Schiller University and the Max Planck Institute of Economics, Jena, Germany.
Feller, J. and Fitzgerald, B. (2002), Understanding Open Source Software Development, Addison Wesley, Boston, MA.
Florac, W. A. and Carleton, A. D. (eds) (1999), Measuring the Software Process, Addison Wesley, Boston, MA.
Friedman, M. and Schwartz, A. (1963), A Monetary History of the United States, 1867-1960, Princeton University Press, Princeton, NJ.
Gaudeul, A. (2004), Competition between open-source and proprietary software: the (La)TeX case study, EconWPA, Industrial Organization.
Groth, R. (2004), Is the Software Industry's Productivity Declining? IEEE Software, 21(6): 92-94.
Guiliano, A. et al. (1999), A Function Point-Like Measure for Object-Oriented Software, Empirical Software Engineering, 4(3): 263–287.
Hall, T. E. (1990), Business cycles: the nature and causes of economic fluctuations, Praeger Publishers, New York. Harison, E. and Koski, H. (2008), Does open innovation foster productivity? Evidence from open source software (OSS) firms, pp. 23. Discussion Papers, Keskusteluaiheita, ETLA, The Research Institute of the Finnish Economy, Elinkeinoelämän Tutkimuslaitos. Hars, A. and Ou S. (2002) Working for Free? Motivations of participating in open source projects, International Journal of Electronic commerce, 6(3): 2540. Healy, K. and Schussman, A. (2003), The ecology of open source software development, working paper, www.kieranhealy.org/files/drafts/oss-activity.pdf available (accessed: 14 at: September, 2010) Hihn, J. (2006), Sizing the System, Quantitative Software Management: Cost Estimation & Sizing, Power Point. http://software.gsfc.nasa.gov/docs/QSM-class/Day%201-Cost/04a-Size.ppt. (accessed October 20, 2010.) Howison, J., Conklin, M., & Crowston, K. (2006). FLOSSmole: A collaborative repository for FLOSS research data and analyses, International Journal of Information Technology and Web Engineering, 1(3): 17–26. Keynes, J. M. (1936), The General theory of Employment, Interest and Money, Macmillan and Co., London. Krishnamurthy, C. (2002), Cave or community? an empirical examination of 100 mature open source projects, First Monday 7(6). 57 Kroes, N. et al. (eds) (2010), Truffle100: The top 100 European Software vendors, www.truffle100.com, Truffle Capital Kuan, J. (2001), Open source software as consumer integration into production. working paper, available at: SSRN: http://ssrn.com/abstract=259648 or doi:10.2139/ssrn.259648 (accessed: 19 March, 2010) Lakhani, K. and Wolf, R. G. (2005) Why hackers do what they do: understanding motivation efforts in free/open source projects, In Hissam et al (eds), Perspectives in Free and Open Source Software, MIT Press, Cambridge, USA pp. 3-21. Lanzi, D. (2005), Copyleft vs. 
Copyright: some competitive effects of Open Source, Working Papers 541, Dipartimento Scienze Economiche, Universita' di Bologna. Lee, S. H. (1999) Open source software licensing. Working paper, available at: http://cyber.law.harvard.edu/openlaw/gpl.pdf (accessed 1 May, 2010) Lerner J., and Tirole, J. (2002) Some simple economics of open source. Journal of Industrial Economics, 50(2): 197-234. Lerner J., and Tirole, J. (2005) The scope of open source licensing, Journal of Law, Economics and Organization, 21(1): 20-56. Osterloh, M. et al. (2001), Open Source - New Rules in Software Development. University of Zurich, working paper. www.iou.uzh.ch/orga/downloads/opensourceaom.pdf (accessed: 16 April, 2010) Maurer, S. and Scotchmer, S. (2006), Open Source Software: The New Intellectual Property Paradigm, In T. Hendershott (ed.), Handbook on Information Systems, Elsevier, Amsterdam Miller, P. J., editor (1994), The Rational Expectations Revolution, The MIT Press, Cambridge, London. Moglen, E. (1999), Anarchism triumphant: Free software and the death of copyright. First Monday, 4(8). 58 Moore, G. H. (1983), Business Cycles, Inflation and Forecasting, Ballinger Publishing Company, Cambridge, USA. Perens, B. (1999), The open source definition. In Di Bona, C. et al. (eds), (1999), Voices from the Open Source Revolution pp. 171-188, O’Reilly & Associates, Sebastopol, CA. Plosser, C. (1989), Understanding the Real Business Cycles, Journal of Economic Perspectives 3(Summer): 51-78. Raymond, E. S. (1998) The cathedral and the bazaar, First Monday, 3(3) Reifer, and D. J. (2004), Productivity Industry Benchmarks, Software Reifer Cost, Quality Consultants http://www.compaid.com/caiinternet/ezine/Reifer-Benchmarks.pdf. Inc. (accessed October 19, 2010) Romer, D. (1996), Advanced Macroeconomics, McGraw-Hill Companies, Inc., New York. Rossi, M. A. (2005), Decoding the free/open source software puzzle: a survey of theoretical and empirical contributions, In Bitzer J. 
and Schröder, P. (eds) (2006), The economics of open source software development, Elsevier, Amsterdam.
Rötheli, T. F. (2006), Business Forecasting and the Development of Business Cycle Theory, History of Political Economy, forthcoming. Working paper, available at SSRN: http://ssrn.com/abstract=917121 (accessed: 7 May, 2010)
Sargent, T. J. and Wallace, N. (1975), Rational expectations, the Optimal Monetary Instrument, and the Optimal Money Supply Rule, Journal of Political Economy, 83(April 1975): 241-254.
Chordia, T. and Shivakumar, L. (2001), Earnings, Business Cycle and Stock Returns.
von Hippel, E. and von Krogh, G. (2003), Open source software and the "private-collective" innovation model: issues for organization science, Organization Science, 14(2): 208-223.
Wooldridge, J. M. (2008), Introductory Econometrics: A Modern Approach, 4th ed., South-Western College Publishing, London.
Zimmermann, C. (2001), Forecasting with Real Business Cycle Models, Indian Economic Review, 36(1).

Appendix

Appendix A – Table of countries weighted by the number of active developers

Country id  Country             Active developers  Weight
1           Australia                     7945.00  0.033534
2           Austria                       2549.00  0.010759
3           Belgium                       3034.00  0.012806
4           Canada                       11524.00  0.048640
6           Czech Republic                1443.00  0.006091
7           Denmark                       2314.00  0.009767
8           Finland                       1842.00  0.007775
9           France                       10987.00  0.046374
10          Germany                      24197.00  0.102130
15          Israel                        1467.00  0.006192
16          Italy                         6200.00  0.026169
17          Japan                         1357.00  0.005728
20          Mexico                        1401.00  0.005913
21          Netherlands                   6687.00  0.028224
22          New Zealand                   1635.00  0.006901
23          Norway                        1883.00  0.007948
24          Poland                        2520.00  0.010636
28          Spain                         4760.00  0.020091
29          Sweden                        4642.00  0.019593
30          Switzerland                   3033.00  0.012802
32          United Kingdom               14051.00  0.059306
33          United States               112981.00  0.476868
34          Brazil                        4038.00  0.017044
37          Russian Federation            3217.00  0.013578
38          South Africa                  1216.00  0.005132

Source: Author’s own calculation, Engelhardt & Freytag (2009)

Appendix B – Table Calculations of the effects’ size

Forge        Average of the Weighted  Effect of a 0.001  Size of the Effect in
             Activity Growth          Coefficient        Project Numbers
Sourceforge  0.124752                 0.000125           37
Freshmeat    0.071226                 7.12E-05           21
Rubyforge    0.029117                 2.91E-05           9
Aggregate    0.062124                 6.21E-05           18

Source: Author’s own calculation
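
The arithmetic behind the two appendix tables can be retraced with a short sketch. It is an illustrative check and not part of the original calculations. It rests on two assumptions inferred from the figures themselves: that each weight in Appendix A is the country's share of all active developers in the sample, and that the second column of Appendix B is the average weighted activity growth multiplied by a coefficient of 0.001. The variable names are hypothetical. The project-number column of Appendix B is not reproduced, because the underlying project counts are not reported in the table.

```python
# Active developers per country, copied from Appendix A.
active_developers = {
    "Australia": 7945, "Austria": 2549, "Belgium": 3034, "Canada": 11524,
    "Czech Republic": 1443, "Denmark": 2314, "Finland": 1842, "France": 10987,
    "Germany": 24197, "Israel": 1467, "Italy": 6200, "Japan": 1357,
    "Mexico": 1401, "Netherlands": 6687, "New Zealand": 1635, "Norway": 1883,
    "Poland": 2520, "Spain": 4760, "Sweden": 4642, "Switzerland": 3033,
    "United Kingdom": 14051, "United States": 112981, "Brazil": 4038,
    "Russian Federation": 3217, "South Africa": 1216,
}

# Assumption: weight = country's share of all active developers in the sample.
total_developers = sum(active_developers.values())
weights = {
    country: count / total_developers
    for country, count in active_developers.items()
}

# Average of the weighted activity growth per forge, copied from Appendix B.
avg_weighted_activity_growth = {
    "Sourceforge": 0.124752,
    "Freshmeat": 0.071226,
    "Rubyforge": 0.029117,
    "Aggregate": 0.062124,
}

# Assumption: the "Effect of a 0.001 Coefficient" column is growth * 0.001.
COEFFICIENT = 0.001
effect_of_coefficient = {
    forge: growth * COEFFICIENT
    for forge, growth in avg_weighted_activity_growth.items()
}

if __name__ == "__main__":
    for country, weight in weights.items():
        print(f"{country:<20}{weight:.6f}")
    for forge, effect in effect_of_coefficient.items():
        print(f"{forge:<12}{effect:.3g}")
```

Under these assumptions the computed weights agree with the Weight column of Appendix A to six decimal places, and the products agree with the second column of Appendix B at the precision reported there.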