The effects of the economic crisis on the software industry

Author: Peter Benedek Balog
MSc in International Economic Consulting
Academic Supervisor: Dr. Philipp Schröder
THE EFFECTS OF THE ECONOMIC CRISIS ON THE
SOFTWARE INDUSTRY
WITH SPECIAL ATTENTION TO THE OPEN SOURCE SECTOR
Aarhus School of Business, University of Aarhus
2010.11.30.
Statement of originality
This work has not previously been submitted for a degree or a diploma in any
university. To the best of my knowledge and belief, the thesis contains no
material previously published or written by any other person except where due
reference is made in the thesis itself.
Abstract
In recent years the Open Source Software (OSS henceforward) development
movement has gained considerable attention from various academic research
fields. The nature of the OSS development process, namely that it involves
developers at many different locations and organizations sharing code to
develop and reform programs, requires an interdisciplinary understanding.
Through a review of the existing literature, in this dissertation I develop an
econometric model in order to investigate different macroeconomic factors that
might influence the behavior of OSS activity. By analyzing the software
industry as a whole I put the results into an exploratory framework which aids
their interpretation.
The panel dataset used in the analysis consists of quarterly macroeconomic
figures on 25 countries from 1998 Q1 until 2010 Q2. It also includes data on
four OSS project hosting sites. The model analyzes the aggregate OSS activity
as well as the differences across the “forges”.
The research concluded that the software industry was able to overcome the
negative effects of the recent crisis fairly quickly. The reason might be that
the industry is a major driver of the R&D sector; therefore ICT budgets, once
cut, are likely to be replenished quickly as the economy stabilizes.
The results show that changes in the real economy have a limited effect on
OSS development activity in general. However, there are considerable
differences among the “forges” in terms of the magnitude and type of the
different effects. There is a significant positive relationship between GDP
growth and the activity of the individual communities.
Keywords: open source software, productivity, economic crisis, software
industry, business cycle theory
Table of Contents
Abstract
1. Introduction
1.1. Problem Statement
1.2. Methodology
1.3. Delimitation
1.4. Structure
2. Literature Overview
2.1. Open Source Phenomenon
2.2. Business Cycle Theory
3. Analysis of the software industry
3.1. Data
3.1.1. FORBES Global 2000
3.1.2. Truffle 100
3.1.3. Global Software 100
3.1.4. Software 500
3.1.5. Demand side
3.2. Conclusion of the analysis
4. Analysis of the Forges
4.1. Data
4.1.1. OSS dataset
4.1.2. Productivity and performance measure
4.1.2. Macroeconomic data
4.1. Methodology
4.2. Model
4.3. Results
4.4. Conclusions of the analyses
5. Conclusion
List of References
Appendix
List of Figures
Figure 1.1. Growth in New of Projects in each Repository by Year
Figure 3.1. Number of Software Companies on Forbes2000 list
Figure 3.2. Key Figures of Software Companies on Forbes2000 list (billion $)
Figure 3.3. Revenue of the top 100 European vendors from software activity (billion €)
Figure 3.4. Revenue of the Software 500 list’s companies (billion $)
Figure 3.5. Change in IT budget by company size, 2009
Figure 3.6. IT expenditure by countries (billion $)
Figure 4.1. Number of Shared Names across each Repository
Figure 4.2. Number of new projects registered in a month to Rubyforge
Figure 4.3. Number of new projects registered in a month to Freshmeat and Rubyforge
Figure 4.4. Number of new projects registered in a month to Sourceforge
List of Tables
Table 3.1. Number of Employees by Software 500 companies
Table 3.2. Top ten companies based on Revenue / Employee
Table 4.1. Variables and their descriptions in the OSS dataset
Table 4.2. Software Productivity (ESLOC/SM) by selected Application Domains
Table 4.3. Selected record from the analysis’ dataset
Table 4.4. Macroeconomic variables used in the analysis
Table 4.5. RE Estimation Results for Model 1-4 (robust standard errors)
Table 4.6. RE Estimation Results for Model 1-4 (normal standard errors)
Table 4.6. RE Estimation Results for Model 4-10 (robust standard errors)
Table 4.7. Results for Forge wise RE estimation with (M4) regressors
Table 4.8. Results for Forge wise RE estimation with (M5) regressors
Table 4.9. Results for Forge wise RE estimation with (M6) regressors
Table 4.9. Results for Forge wise RE estimation with (M7) regressors
Table 4.9. Results for Forge wise RE estimation with (M8) regressors
Table 4.9. Results for Forge wise RE estimation with (M9) regressors
Table 4.9. Results for Forge wise RE estimation with (M10) regressors
Appendix A – Table of countries weighted by the number of active developers
Appendix B – Table of calculations of the effects’ size
1. Introduction
Over the past decade, the Open Source Software phenomenon has had a
global impact on the way organizations and individuals create, distribute,
acquire and use software and software-based services. OSS has challenged
the conventional wisdom of the software engineering and software business
communities, has been instrumental for educators and researchers, and has
become an important aspect of e-government and information society initiatives.
OSS is a complex phenomenon and requires an interdisciplinary understanding
of its engineering, technical, economic, legal and socio-cultural dynamics. The
open source movement has attracted the interest of many academic
researchers due to the success of famous OSS products like the ‘Linux’
operating system, the ‘Apache Web Server’ or the ‘Firefox’ web browser.
1.1. Problem Statement
The purpose of this thesis is to provide an answer to the following question:
How is the Software Industry affected by the 2008 Economic Crisis?
It will be answered by an analysis of the software industry's performance, with
special attention to the OSS sector.
During fall 2008 the world economy experienced dramatic falls in GDP, leading
to a deep recession, for various reasons such as the collapse of the US real
estate bubble. Major economies like the American or the European even
experienced negative growth rates, while slow but positive growth was fueled
by the emerging economies of China and India.
According to the FLOSSmole project, an internet-based collaborative collection
and analysis of free/libre/open source project data (Howison et al., 2006), the
number of projects registered on the OSS hosting sites has decreased
significantly since the economic crisis started in late 2008, as Figure 1.1
shows.
Figure 1.1. Growth in New of Projects in each Repository by Year
Source: flossmole.org
1.2. Methodology
In my thesis I will attempt to find the factors that can explain the recent
changes in the output of the OSS sector. I focus on those micro- and
macroeconomic factors that can affect the performance of the sector and that
were themselves heavily affected by the recent economic recession, for
instance stock prices, GDP, the level of unemployment or changes in the volume
of retail sales. I will also try to find alternative explanations for the
shrinking number of OSS projects that do not originate in the ongoing crisis.
By way of contrast I will also analyze the proprietary software sector to
provide a clearer picture of the prospects of the software industry as a whole.
I strongly believe that the topic is related to the existing academic
literature in macroeconomics; more narrowly, it connects to the business cycle
theories, which concern the sources and the nature of macroeconomic
fluctuations.
It is critical to specify what software industry means in this paper. Due to
the exponential growth of computer usage in various sectors and for different
purposes there are tens of thousands of different software products. The term
‘software’ in this analysis will mean application software, system software and
all the necessary tools.
1.3. Delimitation
It proved very hard to gain access to reliable data on the software industry
as a whole, and even harder in the case of the OSS sector. The author is aware
that the activity level of OSS is usually measured by the number of messages
generated in a given period of time, and that output in the software sector is
generally measured in lines of code. A different approach has been used
instead due to resource constraints. A detailed discussion of the data
constraints and the construction of the dataset can be found in section 4.1.
1.4. Structure
The rest of the thesis is structured as follows:
Section 2 describes the evolution of the Open Source Phenomenon and
provides an insight to the existing academic literature on business cycle theory.
Section 3 presents an analysis of the software industry in the light of the recent
economic crisis. It provides various data about the sector and presents the
results as well as the conclusions of the analysis.
Section 4 consists of the analysis of the willingness to start an OSS project.
It presents the model used in the thesis and discusses the dataset and the
variables in detail.
Section 5 presents the conclusions of the thesis.
2. Literature Overview
2.1. Open Source Phenomenon
The Free/Open Source Software (F/OSS) phenomenon has attracted an
increasing amount of attention in the academic literature in recent years. The
term “open source” refers to the fact that the program source code is
accessible, available and thus alterable by its user, in contrast to
proprietary software, which is distributed only as compiled machine code
(Bitzer and Schröder, 2006). Due to the growing number of contributions from
various academic fields a wide range of interesting issues has emerged. This
section provides an account of these contributions in order to understand the
state of the phenomenon.
Among the research conducted on F/OSS, individual incentives and motivations
have received by far the most attention. According to a widely accepted
classification of the results (Osterloh et al., 2001; Hars and Ou, 2002;
Lakhani and Wolf, 2005) individuals' motivation can be grouped under two
headings: individuals are driven by either intrinsic or extrinsic motivation.
In the first case, following Deci and Ryan (1985), intrinsic motivation is
defined as doing an activity for its inherent satisfactions rather than for
some separable consequence. The individual is moved to act for the fun or
challenge rather than for rewards. On the other hand, Lerner and Tirole (2002)
state that a programmer only engages in a project, whether commercial or
F/OSS, if she derives a net benefit from it. This means that the source of
motivation depends on external factors: reputation, user needs, learning and
performance improvement.
The success of OSS development came as rather a surprise to some scholars,
since it seems to contradict “Brooks’ Law”, which declares that adding
manpower to a late software project makes it later. OSS development, on the
other hand, seems to be driven precisely by the high number of skilled
developers. Raymond (1998) summarizes the main features of the OSS production
mode, which generated a large amount of academic commentary, for example on
the advantages of parallel code development (Feller and Fitzgerald, 2002) and
on the integration of users into the production of software code, with von
Hippel and von Krogh (2003) describing F/OSS as a private-collective model of
innovation. Numerous studies turned their attention to the governance and
coordination structure of OSS projects. Rossi (2005) summarizes the
principal findings of these studies: according to Krishnamurthy (2002) the
median number of developers per project is four, while it is only one
according to Healy and Sussmann (2003).
The success of the Linux operating system, which accounts for a 38% share of
the server operating systems market (Bitzer 2004) despite the seemingly huge
superiority of Microsoft’s Windows in many fields, is referenced by an
extensive number of authors. The issue of the competition between open and
closed software production has been discussed widely (Gaudeul 2004, Economides
and Katsamakas 2005, Harison and Koski 2008). An increasing number of
for-profit firms base their services on the OSS phenomenon, to mention a few
of the biggest: Red Hat, Cygnus, VA Linux. Meanwhile established hardware and
software producers such as IBM, Hewlett Packard and Oracle have turned their
attention to the OSS sector. The studies analyze the competition in two
dimensions: the quality differences and the dynamics of innovation between the
competing sectors. The models suggest that the driver of innovation is meeting
users’ needs, which leads to a quality increase (Kuan, 2001; Bessen, 2002).
Bonaccorsi and Rossi (2003) take into account the network effects and
externalities in the competition. Casadesus-Masanell and Ghemawat’s (2003)
model of mixed-duopoly competition focuses on the influence of strategic
pricing decisions on consumers’ valuation, while Bitzer (2004) looks at the
effects of product differentiation on the competition.
So far a very important aspect of the F/OSS phenomenon has been overlooked:
the OSS licenses. These licenses play a huge role in all the previously
mentioned issues: the motivation of developers, the coordination of projects,
the effectiveness of the OSS development model and the competition with the
commercial sector. The Open Source Definition (OSD)¹ defines the “rights that
a software license must grant you to be certified as Open Source” (Perens,
1999). The main principles of the OSD are the following:
• Free Redistribution: the license may not restrict any party from selling or
giving away the software.
• Availability of the source code: the license allows modifications and
derived works and allows the distribution of these under the same terms
as the license of the original software.
• Integrity of the source code: the license may restrict source code from
being distributed in modified form only if the license allows the
distribution of "patch files" with the source code for the purpose of
modifying the program at build time.
¹ I refer to the OSD v1.9, the latest version available at http://opensource.org/osd.html (accessed
November 26, 2010)
The confusion over whether F/OSS belongs to the public domain has been
addressed by a wide range of scholars: Lee (1999), Perens (1999), Lanzi (2005).
There is a difference between F/OSS and software that is simply put into the
public domain: although F/OSS is free and available to all, developers do not
surrender their rights to their software creations. Rather, they retain
copyright over their work and adopt licenses to ensure free access to and
modification of the source code, as Rossi (2005) summarizes the findings of
the above-mentioned papers. Lerner and Tirole (2005) found in their empirical
research that restrictive licenses are more likely to be adopted when the
software is directed at end-users, whereas less restrictive licenses are more
frequently adopted for projects aimed at developers, the Internet or
proprietary operating systems. As Rossi (2005) summarizes, numerous studies
suggest the need to rethink the current state of intellectual property
protection for software programs (Moglen, 1999; Osterloh et al., 2001;
Benkler, 2002). OSS seems to represent a “new intellectual property paradigm”
(Maurer & Scotchmer, 2006), i.e. a new type of ownership concept that leads to
different allocations of intellectual property rights and different modes of
organization as compared to so-called proprietary software.
2.2. Business Cycle Theory
According to Moore (1983), a business cycle is identified by the behavior of
aggregate economic activity, which is measured by a wide variety of series
including output, sales, employment, and income.
The analysis conducted in this thesis fits into this context. I will try to
shed light on the conformation of OSS activity² using the approach of business
cycle theory, that is, by explaining the behavior of OSS development activity
through macroeconomic series. A brief overview of the evolution of the
theories follows.
The early theories: The classical theory dominated macroeconomics from the
late 18th century until the 1930s. Although it was discredited during the
Keynesian revolution, the theory provides the foundation for several modern
theories of the business cycle, such as the monetarist and real business
cycle models. The model itself focuses on supply-side effects in the economy:
it views positive shocks as causing expansions and negative shocks as causing
recessions.
Keynes, in The General Theory of Employment, Interest and Money (1936), points
out that there is something fundamentally wrong with a theory, the classical
one, that cannot explain the severe unemployment of the 1930s. His theory
focuses on the nature of investment, the variable found to be the most
unstable. He thereby shifted the focus of the analysis from the exogenous
(supply-side) variables to the endogenous (demand-side) ones.
Monetarism: Friedman and Schwartz (1963) examined changes in monetary growth
and argued that they are the main source of economic instability. The
evidence showed that growth of the money supply indeed leads to GNP growth
and, because prices are slow to adjust, generates business cycles (Hall
1990).
² OSS activity: I refer to OSS activity as the number of projects registered to the “forges”.
The rational expectations theory (Miller, 1994) provides insight into the
behavior of economic agents. It argues for policy ineffectiveness (Sargent and
Wallace, 1975): discretionary monetary and fiscal policy may be unable to
alter aggregate output.
The New-Keynesian school further improved Keynes’ original theory, answering
many of its critics. One of its major contributions is the prediction that
both unstable aggregate demand and unstable aggregate supply are important
determinants of the business cycle. Aggregate demand causes cycles because
wages and prices are not flexible in the short run; on the supply side, real
changes in the labour market and/or the production function alter output
(Hall 1990).
Real business cycle theory is the first to approach the nature of cycles from
the supply side again since Keynes discredited the classical theory. It
states that the dominant causes of instability are external shocks to supply
(Barro 1989, Plosser 1989). This theory can serve as the basis of the research
conducted in this thesis, since the assumption is that the macroeconomic
environment affects the sector indirectly, through the developers, who provide
the supply for the sector.
Modern theories: The major competing theories of the business cycle mentioned
above identified the four major variables of the theory: the role of price
and wage adjustment, the nature of cyclical unemployment, and the relative
roles of aggregate demand and supply. However, each school judges these
factors differently. This has moved the scope towards new research areas.
Considerable attention is devoted to the analysis of stock market fluctuations
(Chauvet 2001, Chordia and Shivakumar 2001, Casarin and Trecroci 2007).
Due to the nature of the field, the subject of forecasting cycles is also
addressed (Zimmermann 2001). Rötheli (2006) showed that business
practitioners' forecasting of the business cycle was ahead of business cycle
theorizing for many years. Theorists of the 1930s and 1940s were not yet ready
to incorporate into their theories the business forecasting methods that had
become widespread entrepreneurial practice. Instead, theorists either
speculated on business psychology or built tractable dynamic mathematical
models with accelerator type investment. However, by the 1940s, a model of
the cycle with forward-looking investors might actually have been developed.
Finally, as Hall (1990) summarizes, the common way to characterize the
behavior of economic variables is in terms of their cyclical nature during
business cycles:
• procyclical: the variable typically moves in the same direction as
aggregate economic activity,
• countercyclical: the variable usually moves in the opposite direction to
aggregate economic activity,
• acyclical: the variable shows no consistent pattern in terms of its
movement over business cycles.
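The three categories above can be illustrated with a short sketch. It is not the method used in this thesis: it simply labels a series by the sign and size of its correlation with a measure of aggregate activity. The function name, the cut-off value of 0.2 and all the numbers are assumptions made for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def classify_cyclicality(series, activity, threshold=0.2):
    """Label a series by its co-movement with aggregate activity.

    `threshold` is an arbitrary illustrative cut-off, not a value
    taken from the literature or from this thesis.
    """
    r = pearson(series, activity)
    if r > threshold:
        return "procyclical"
    if r < -threshold:
        return "countercyclical"
    return "acyclical"

# Hypothetical growth rates in percent, for illustration only.
gdp_growth = [2.1, 1.8, -0.5, -2.3, 0.4, 1.9]
retail_sales = [2.5, 2.0, -1.0, -3.0, 0.2, 2.2]
unemployment_change = [-0.2, -0.1, 0.6, 1.4, 0.3, -0.3]

print(classify_cyclicality(retail_sales, gdp_growth))
print(classify_cyclicality(unemployment_change, gdp_growth))
```

In practice the classification is usually made on detrended series and over several cycles; the raw correlation used here is only the simplest possible stand-in.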
3. Analysis of the software industry
Interaction between the real economy and the virtual world
The above statement may sound odd, but two simple examples show that there is
a serious level of interaction, or more precisely dependence, of the virtual
world on the real economy. For example, the resource that keeps the virtual
world “alive” is electricity, which is produced in the real economy and is
therefore affected by all sorts of economic or ecological effects, to mention
the most obvious ones. According to one calculation a ‘Second Life’ avatar
consumes 1,752 kWh annually, almost as much as the average Brazilian, who
consumes 1,884 kWh. Furthermore, it has been reported that Google alone
consumes 2.1 TWh of electricity in a year, which equals the production of two
average nuclear reactors. The financial crisis that started in 2008 had a huge
impact on the worldwide real economy, leading to growing unemployment rates,
shrinking consumption, and decreased or even negative GDP growth rates.
This explains the massive decrease in IT budgets around the world. There
was huge pressure on CEOs to lower costs in order to stay competitive in any
sector. This is where the OSS sector comes into the picture: by 2009 FLOSS had
been recognized by CEOs as an opportunity to lower costs. The following
section provides a software market analysis.
We can say that among the different OSS business models, those that earned
50-60% of their revenue from subscription fees and 40-50% from services gained
from the crisis.
3.1. Data
During the analysis and the data collection period I had to realize that it is
very hard, if not impossible, to find aggregate data on the industry. A
productivity analysis therefore seems impossible to conduct with the resources
available to this thesis. On the other hand, there are numerous organizations,
such as professional journals and market researchers, that conduct some kind
of analysis or data collection, mostly survey based. Thus I chose to present
the data from these sources and conduct an analysis on it.
3.1.1. FORBES Global 2000
The Forbes Global 2000 is a comprehensive list of the world’s biggest and most
powerful companies, as measured by a composite ranking for sales, profits,
assets, and market value.³ The list can be searched by industry; the Software
and Services industry has been included on the list since 2004.
Figure 3.1. Number of Software Companies on Forbes2000 list
Source: Author’s own calculation, Forbes2000
As the graph shows, the number of companies has been between 28 and 35 since
2004. It has grown constantly since 2006 from an all-time low of 28. As of the
2010 list the number is the highest, 35, since the introduction of the sector
to the list. Judging by the trend in the number of companies, the sector is
growing despite the global economic crisis. The Forbes 2000 list also contains
key financial indicators; the following figure pictures their behavior.
³ Forbes 2000 sources: Exshare, FT Interactive Data, Reuters Fundamentals, and Worldscope
via FactSet Research Systems, Bloomberg Financial Markets.
Figure 3.2. Key Figures of Software Companies on Forbes2000 list (billion $)⁴
Source: Author’s own calculations, Forbes2000
The figure captures the effects of the crisis very well. The market value of
the companies grew by 48.7% from 2004 to 2007. The following year brought a
halt to this massive growth, and during 2008 a significant 31.9% decrease took
place. The sector seems to have recovered very fast, reaching an all-time high
of 1159.05 billion USD.
The value of sales and assets shows very similar behavior: very slow growth
until 2006, then accelerating growth until 2008, followed by a slow decrease
in the values which may continue over 2010.
3.1.2. Truffle 100⁵
The Truffle 100 ranks the top European software companies. Including an
analysis of the Truffle 100 list makes it possible to draw a picture of the
European software industry, which can increase the resolution of the global
picture. According to the list’s authors the software industry is characterized
by periodic technological disruptions that pave the way for new arrivals; for
that reason its results, research and statistics are released on a yearly
basis.
⁴ Market value is as of February 28, 2010.
⁵ The Truffle 100 is compiled from survey & research conducted by IDC & CXP for the purpose
of the Truffle 100 ranking. Europe is defined as: EU + Switzerland + Norway. A company is
defined as a European software vendor if its headquarters and R&D management are based in
Europe.
Figure 3.3. Revenue of the top 100 European vendors from software activity (billion €)
Source: Author’s own calculations, Truffle 100
According to the results, the revenue of the top European vendors is
constantly growing despite the ongoing economic slowdown. The European market
is very concentrated: 79% of the revenue comes from the top 25 companies, up
from 70% in 2008. The vendors also face very heavy competition from huge
global actors like Microsoft, whose revenue in 2009 alone was €40.9 billion,
52% higher than the aggregate revenue of the top 100 European vendors. This
clearly shows that there is room for improvement. The report “emphasizes the
impressive dynamism and exceptional resilience of the European software
industry. In a challenging environment, software vendors have demonstrated
their ability to bounce back quickly (with 8.4% year-on-year growth) and
remain profitable (€3.7 billion), while maintaining a heavy level of
investment in R&D (€3.8 billion).” (Kroes, 2010).
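The concentration figure quoted above (the share of total revenue earned by the 25 largest vendors) is a simple concentration ratio. The sketch below shows how such a ratio could be computed; the vendor revenues in it are hypothetical, since the underlying Truffle 100 data are not reproduced here, and the function name is an assumption of this illustration. The last lines take the 52% comparison with Microsoft at face value to back out the implied aggregate revenue of the top 100 European vendors.

```python
def concentration_ratio(revenues, k):
    """Share of total revenue earned by the k largest vendors."""
    ranked = sorted(revenues, reverse=True)
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical vendor revenues (billion EUR), for illustration only.
revenues = [8.0, 4.0, 2.0, 1.0, 1.0]
print(f"Top-2 share: {concentration_ratio(revenues, 2):.0%}")

# If Microsoft's EUR 40.9 billion is 52% higher than the top 100 aggregate,
# the aggregate implied by the text is 40.9 / 1.52.
implied_top100 = 40.9 / 1.52
print(f"Implied top 100 aggregate: about EUR {implied_top100:.1f} billion")
```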
3.1.3. Global Software 100⁶
The Global Software Top 100 list is very similar to the Truffle list, but the
focus is on the worldwide software industry. The companies are ranked according
to their revenues⁷. The authors gathered data from SEC filings, annual reports
and corporate websites.
Similarly to the European market, the global software market is also very
concentrated. The top ten vendors account for nearly 60% of the top 100’s
revenue, which was over $220 billion in 2009. 46 vendors out of the 100
reported lower revenues than in the previous year; still, the average growth
was 3.2% among the listed companies. The report concludes that the major
effect of the crisis was that many companies cut back their IT budgets, but
since the sector seems to have recovered very fast and the industry is one of
the R&D drivers, the budgets will increase again in the future and will grow
bigger than before. The other effect is that the “credit crisis abruptly
stopped acquisitions by private equity firms”.
⁶ http://www.softwaretop100.org/methodology (accessed October 27, 2010)
⁷ Revenue contains 'prepackaged' software sales; subscription and support activities; certain
service activities are excluded; hosted software solutions (Software as a Service) are included.
3.1.4. Software 500
Software Magazine’s Software 500 list is a comprehensive look at the software
industry targeting enterprise IT organizations with software and services. The
authors collect data based on the magazine’s survey.
The rankings are based on total worldwide software and service revenue.⁸ Total
2009 Software 500 revenue is $491.2 billion, an 8.7% increase over last year's
total. Once again, the analysis seems to describe an industry that recovered
very quickly after the global economic crisis started.
Figure 3.4. Revenue of the Software 500 list’s companies (billion $)
Source: Author’s own calculation, Software500
⁸ Includes revenues from software licenses, maintenance and support, training, and software-related services and consulting. www.softwaremag.com (accessed October 26, 2010)
The graph above confirms the assumptions; the industry seems to grow despite
the ongoing crisis. Another interesting aspect of the Software 500 list is that
data on the number of employees was also collected. The total number of
employees in the Software 500 is up a healthy 26% to 3 707 957, compared to
2 953 016. It suggests that in 2007 many companies were cutting costs and
conserving cash. The big increase in 2009 is primarily due to the addition of
Hitachi to the list, with 389 752 employees, and Emerson Electric, with 140 700
employees. The fact that the companies on the list may change from year to
year can bias a time series analysis; therefore caution is advised when
interpreting the data.
Table 3.1. Number of Employees by Software 500 companies
Year    Employees    Growth
2008    3 707 957    25.6%
2007    2 953 016    1.3%
2006    2 914 480    14.7%
2005    2 539 872    -4.6%
2004    2 660 023    13.7%
Source: Software 500
It is, however, possible to calculate labour productivity and compare the
companies in the industry. Table 3.2. shows the top ten companies by revenue
per employee.
Table 3.2. Top ten companies by Revenue / Employee
Average for the Software 500 = $ 191 417
Rank   Company                               2008 Employees   2008 Revenue (thousand $)   Revenue / Employee ($)
236    Innodata Isogen, Inc.                 45               75 001                      1 666 689
197    Lighthouse Computer Services, Inc.    95               119 700                     1 260 000
78     ePlus inc.                            658              727 159                     1 105 105
138    Technology Integration Group          300              274 500                     915 000
498    JangoMail                             6                5 288                       881 313
215    GreenPages Technology Solutions       150              100 000                     666 667
2      Microsoft Corp.                       91 000           52 280 000                  574 505
70     Akamai Technologies, Inc.             1 500            790 924                     527 283
106    SolidWorks Corp.                      787              407 300                     517 535
27     Juniper Networks, Inc.                7 014            3 572 376                   509 321
Source: Software 500
The table captures the productivity differences among the companies very well:
the first company on the list realizes a revenue per employee that is roughly
8.7 times the list average of $ 191 417. This variance can be explained by the
different activities performed.
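The revenue per employee figures of Table 3.2. can be recomputed with a few lines of code. The following is only a sketch: the three companies are a subset copied from the table, and the function and variable names are illustrative, not part of the Software 500 data.

```python
# Figures copied from Table 3.2: (employees, revenue in thousand USD).
companies = {
    "Innodata Isogen, Inc.": (45, 75_001),
    "Microsoft Corp.": (91_000, 52_280_000),
    "Juniper Networks, Inc.": (7_014, 3_572_376),
}
AVERAGE_500 = 191_417  # USD per employee, list average stated in Table 3.2


def revenue_per_employee(employees, revenue_thousand_usd):
    """Return revenue per employee in USD."""
    return revenue_thousand_usd * 1000 / employees


for name, (emp, rev) in companies.items():
    rpe = revenue_per_employee(emp, rev)
    # Ratio to the list average shows how far a company is from the typical firm.
    print(f"{name}: {rpe:,.0f} USD per employee, "
          f"{rpe / AVERAGE_500:.1f} times the list average")
```

Note that a company with very few employees, such as the first one, can reach a high ratio with a modest absolute revenue, which is one reason to treat the ranking with caution.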
3.1.5. Demand side
It seems that the software and related service providers were able to overcome
a very challenging situation and to grow their revenues. The major driver of
these revenues, however, was not new orders but already existing maintenance
contracts. Thus it is worth inspecting the demand side as well. Decreasing ICT
expenditure in every sector, due to heavy cost cutting pressure, has already
been mentioned. According to IDC’s survey-based research, the companies
planning to increase or maintain their ICT spending outnumber the ones planning
to decrease it. Only one third of the small businesses plan to decrease their
budget, while the share is about 50% for medium-sized companies; the large ones
are planning to cut back less.
Figure 3.5. Change in IT budget by company size, 2009
Source: IDC
Overall, 42% of the companies planned to decrease their spending, 27% planned
no change and 31% planned an increase. On country level a growing tendency is
taking shape. However, only part of ICT expenditure is spent on software and
related services, and the covered period ends in 2008, which limits the
usefulness of the graph.
Figure 3.6. IT expenditure by countries 9 (billion $)
Source: Author’s own calculation, OECD, USCB
3.2. Conclusion of the analysis
Above I have presented four different analyses of the software industry. The
lists created by the various organizations include only the biggest actors of
the sector. It is nevertheless possible to draw a comprehensive picture of the
effects of the crisis on the industry.

Across the different studies the revenue figures share a theme: the sector
experienced minor but positive growth last year, which has recently started
to accelerate again.

The market is concentrated; therefore an unforeseen economic effect on the
major players can reshape the landscape of the software industry.

Many companies cut back their ICT budgets to stay competitive. However, as
the studies suggest, the sector is one of the key players in R&D and is
necessary to increase productivity and reach optimal resource allocation;
the budgets are therefore expected to be refilled along with the
stabilization.
9 Europe comprises the EU27, Switzerland, Norway and Turkey; exchange rate as of Oct 25: EUR/USD = 1.4031 (http://www.x-rates.com/d/USD/EUR/graph120.html)
4. Analysis of the Forges
In the following section I attempt to answer the problem that motivated this
research. As Figure 1.1 in the introduction shows, the number of projects
started to decrease significantly in the forges at the same time. It is very
surprising that the OSS sector seems to experience a massive decrease in
“output”. In this case output loss means a decrease in the number of projects
registered in each repository since the beginning of the economic crisis.
During an economic crisis most agents of the real economy are under pressure to
decrease their costs. OSS is in many cases a cheap alternative to the available
proprietary software. The combination of the cost constraint and the cheap
alternative should result in an increase in demand and thus in output as well,
or output should at least stay steady. That is why the decrease in the number
of projects in the observed time period is surprising.
The volume of the decrease seems to be the same across the forges. It is not,
however, since the y axis of the graph (Number of Projects) has a logarithmic
scale. Therefore the volatility of the lines does not represent the true values
relative to each other. The number of projects hosted at Objectweb is in the
10-30 range, at Rubyforge it is around 1000 and at Sourceforge in the 10000
range. If the graph used a linear scale instead, the Objectweb line would
appear as a flat line on the x axis.
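Why equal-looking drops hide very different absolute losses can be illustrated with a small calculation. The sketch assumes that the y axis of Figure 1.1 is a base-10 logarithmic axis; the project counts are rounded orders of magnitude taken from the paragraph above (20 stands in for the 10-30 range), not actual observations.

```python
import math

# Illustrative orders of magnitude from the text, not observed data.
projects = {"Objectweb": 20, "Rubyforge": 1_000, "Sourceforge": 10_000}


def log_axis_position(value):
    """Vertical position of a value on a base-10 logarithmic axis."""
    return math.log10(value)


# Halving any series moves its line down by the same distance on a log axis ...
drop_on_log_axis = {
    name: log_axis_position(n) - log_axis_position(n / 2)
    for name, n in projects.items()
}
# ... although the absolute number of projects lost differs by orders of magnitude.
absolute_drop = {name: n / 2 for name, n in projects.items()}

for name in projects:
    print(f"{name}: visual drop {drop_on_log_axis[name]:.3f}, "
          f"projects lost {absolute_drop[name]:.0f}")
```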
On the one hand there is a massive output decrease in the OSS sector, which can
be the result of productivity deterioration. On the other hand a deep economic
recession started at the same time, which can also be the cause. My assumption
is that the real economy has an effect on the OSS productivity level. The
purpose of the analysis is therefore to shed light on the relationship between
the real economy and OSS productivity.
The following chapter is divided into four parts. The first part describes a
number of variables that can describe the performance of the OSS sector. The
second part discusses the factors that can affect these performance measures.
In the third part an econometric model is presented to determine the effects of
the different, previously defined factors on the performance of the Forges.
Finally a detailed discussion of the results follows.
4.1. Data
The construction of the dataset is presented in two subsections: the first
describes the OSS dataset, the second provides an overview of the
macroeconomic dataset.
4.1.1. OSS dataset
The OSS dataset contains data about projects hosted by the Freshmeat,
Objectweb, Rubyforge and Sourceforge forges. Table 4.1. summarizes the
variables selected from Flossmole (Howison, 2006) and their descriptions.
Table 4.1. Variables and their descriptions in the OSS dataset
Variable           Description
proj_unixname      Name under which the project is registered at the forge.
datasource_id      Identification number of the Flossmole dataset into which the record was collected. Differs across forge and time.
url                URL address of the project on the hosting site.
real_url           The project’s own HTML address if it has one in addition to the above, for example its own website or a mailing list different from the one on the host.
date_registered    Date the project was registered at the hosting site.
proj_long_name     Full name of the project.
proj_id            Identification number of the project. Identifies a project with a single number instead of the name.
dev_count          Total number of developers for the project.
date_collected     Date the data was collected into the Flossmole dataset.
4.1.2. Productivity and performance measure
Productivity is usually defined as the ratio of the output of a production
process to its inputs. In the case of the software industry it can be defined
as the rate of producing some output using a set of inputs in a given time
unit.
Thus the first step is to measure the output, which in our case is the software
itself. Unfortunately, due to the complexity of the product there are several
different software size measures. According to Chemuturi and Kaligotla these
are the following:

Function Points: A unit of measurement to express the amount of business
functionality an information system provides to the end user. The cost (in
dollars or hours) of a single unit is calculated from past projects (Cutting,
2009). This unit of measure is used by the IFPUG Functional Size Measurement
Method, which is one of the five ISO-recognized software metrics for sizing an
information system. It is based on the functionality that is perceived by the
user of the information system, independent of the technology used to
implement the information system 10.

Object Points: According to Guiliano et al. (1999), object points are a method
for estimating the size of object-oriented (OO) software development projects.
The experience that has been obtained with function points in traditional
software development is carried over into the OO paradigm. To adapt function
points to OO, function point concepts must be mapped to object-oriented
concepts, and OO-specific concepts must be handled.

Equivalent Source Lines of Code (ESLOC): Source Lines of Code means the number
of lines in the source code of a piece of software. There are two different
approaches. One is counting the physical lines, where every ‘enter’ or hard
line break stands for one line. The other is counting the logical executable
statements as lines. Equivalent SLOC takes into account the differences in
effort required to incorporate new vs. inherited code into a delivered system,
as well as the additional effort required to modify reused/adapted code for
inclusion into the software product (Hihn, 2004).

Test Points: A size measure for the testing of a project, where a Test Point is
equivalent to a normalized test case. It is common knowledge that test cases
differ widely in terms of complexity and the activities necessary to execute
them. Therefore, the test cases need to be normalized, just the way Function
Points are normalized into one common measure using weighting factors. At
present there are no uniformly agreed measures for normalizing the test cases
to a common size 11.

Use Case Points: A measurement of how much effort is required to write
software based on how much work the software is intended to do. The method was
created by Gustav Karner of Rational Software Corporation in the mid-1990s.
The method was based on a study of about 200 projects with an average size of
5 man-years of effort. The use case point method of estimation was found to be
within 10% of the actual results for over 95% of the projects (Blain, 2007).

Feature Points: In 1986, Software Productivity Research, Inc. developed an
experimental method for applying Function Point logic to system software such
as operating systems, telephone switching systems, and the like. To avoid
confusion with the IBM Function Point method this experimental alternative was
called ‘Feature Points’. When Function Points are applied to such systems,
they of course generate counts. However, the counts appear to be misleading
for software that is high in algorithmic complexity, but sparse in inputs and
outputs. From both a psychological and practical vantage point, these kinds of
systems software seem to require a counting method that is equivalent to
Function Points, but sensitive to the difficulties brought on by high
algorithmic complexity 12.

Etc.
10 Wikipedia contributors. “Function point.” Wikipedia, The Free Encyclopedia. 16 September 2010, 18:54 UTC. http://en.wikipedia.org/w/index.php?title=Function_point&oldid=385215816 (accessed October 20, 2010)
11 aralikatte. “Test Point Estimation.” Scribd. http://www.scribd.com/doc/4939959/Test-PointEstimation (accessed October 20, 2010)
12 “What are Feature Points?” Software Productivity Research. http://www.spr.com/featurepoints.html (accessed October 20, 2010)
Beyond the large number of different software size measures, there is no
generally accepted way of converting one into another. Therefore it is possible
that the sizes of software products change relative to each other depending on
the measurement system. The fact that there is no single way to apply the Lines
of Code methodology (whether to count the logical or the physical statements,
how to treat inline documentation) makes the situation even more complicated.
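How much the counting convention matters can be seen on a toy example. The sketch below is only an illustration under stated assumptions: the sample source is invented, and counting statement nodes of Python's `ast` module is one possible reading of "logical statements", not a standard SLOC definition.

```python
import ast

# Invented sample source; it contains a comment, a blank line and a line
# holding two statements, so the three counts below all differ.
SAMPLE = '''\
# compute a total
def total(values):
    result = 0; count = 0
    for v in values:
        result += v
        count += 1

    return result, count
'''


def physical_lines(source):
    """Every line ended by a line break, blank and comment lines included."""
    return len(source.splitlines())


def code_lines(source):
    """Physical lines that carry code (blank and pure comment lines skipped)."""
    return sum(1 for line in source.splitlines()
               if line.strip() and not line.strip().startswith("#"))


def logical_statements(source):
    """Number of statement nodes in the parsed syntax tree."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


print(physical_lines(SAMPLE), code_lines(SAMPLE), logical_statements(SAMPLE))
```

The same eight-line file is counted as 8, 6 or 7 "lines" depending on the convention, which is the ambiguity described above.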
However, the issues mentioned above are only one side of the coin. The other
major problem with software productivity measurement is that we want to
explain a rather complex process with one single empirical figure. Developers
have to work in an ever-changing, continuously evolving environment where,
taking new technology in the telecommunications industry as an example,
software complexity increased by a factor of 10 in the early 2000s (Groth,
2004). Almost simultaneously, Web-based services with Java were introduced.
Moreover, a revolution took place in software outsourcing and organizational
structure, which further increased the complexity of software development by a
wide margin. Needless to say, the skill levels required by these activities
differ, as do the tools, inputs and outputs. In my opinion, lumping them
together, calling the result “Software Development” and then assigning a single
productivity figure to it can at best result in an undeniably rough estimate.
There are attempts to give a range for productivity figures, such as 10 hours
per Function Point, which could vary from 2 to 135 depending on the product
(Chemuturi and Kaligotla), or from 45 to 975 ESLOC/SM (equivalent source lines
of code per staff month) depending on the application domain (Reifer, 2004).
Table 4.2. summarizes selected results from Reifer’s (2004) productivity
calculations.
Table 4.2. Software Productivity (ESLOC/SM) by selected Application Domains
Application Domain      Range (ESLOC/SM)
Banking                 155 to 550
Command and Control     95 to 350
Data processing         165 to 500
Scientific              130 to 360
Telecommunications      175 to 440
Web Business            190 to 985
Source: Reifer (2004)
Reifer’s analysis illustrates how complex a procedure it is to measure the
size of software. The original calculation was conducted on 600 projects taken
from Reifer’s database of more than 1,800 projects. These projects were
completed between 1997 and 2004 by any of 40 organizations. A project is
defined as the delivery of software to system integration. Projects include
builds and products that are delivered externally, not internally. Both
delivery of a product to market and a build to integration fit this
definition. The scope of all projects starts with software requirements
analysis and finishes with completion of software testing. The average number
of hours per staff month was 152 in the original analysis, after taking
holidays, vacation, etc. into account.
SLOC is defined as a logical source line of code using the conventions of
Florac and Carleton (1999). ESLOC is defined to take into account reworked and
reused code (Boehm, 1981).
Reifer defined function point sizes using the conventions of the International
Function Point Users Group (IFPUG). Function point sizes were converted to SLOC
using backfiring factors published by IFPUG in 2000, as available on their web
site.
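The arithmetic behind an ESLOC/SM figure can be sketched as follows. This is a minimal illustration, not Reifer’s procedure: the project size and the staff hours are hypothetical, and only the 152 hours per staff month and the Banking range come from the text and Table 4.2.

```python
HOURS_PER_STAFF_MONTH = 152  # average reported for Reifer's analysis
BANKING_RANGE = (155, 550)   # ESLOC/SM range for Banking from Table 4.2


def esloc_per_staff_month(esloc, staff_hours):
    """Productivity in equivalent source lines of code per staff month."""
    staff_months = staff_hours / HOURS_PER_STAFF_MONTH
    return esloc / staff_months


# Hypothetical project: 12 000 ESLOC delivered with 6 080 staff hours.
productivity = esloc_per_staff_month(12_000, 6_080)
low, high = BANKING_RANGE
print(f"{productivity:.0f} ESLOC/SM, within Banking range: {low <= productivity <= high}")
```

With these assumed inputs the project spans 40 staff months and yields 300 ESLOC/SM.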
As Table 4.2. shows, the variance of the results across the domains is quite
large, which does not allow a reasonably precise estimate for the sector as a
whole.
To sum up, three factors were common themes among the vendors:

the lack of an industry-wide standard definition for software productivity,

software applications’ increasing complexity,

need for more formalized processes in the industry as a whole.
As a solution to these complications Chemuturi and Kaligotla suggest to “shift
focus from macro productivity to micro productivity”.
This means that the software development process should be divided into
sub-segments, each of which should be treated differently and described with a
different productivity estimate. This is beyond the scope of this thesis, not
to mention the available resources.
Therefore a different approach has been used. The research will not study the
productivity of the OSS sector, first and foremost because, as section 1.4.
describes, access to data is limited and it was not possible to retrieve any
data on the number of lines coded in any given period. Instead the study
analyzes the activity of the community. The activity is proxied by the number
of projects registered in a given time period. The steps of the dataset
transformation are illustrated with the Rubyforge dataset. The same
transformations were made in each of the four OSS hosting site datasets; the
only difference is the number of observations.
The Rubyforge dataset contains data about 9000 projects hosted on its site. 40
Flossmole datasets with collection dates since July 2006 were combined to
obtain a historical table. The original dataset contains 211633 observations.
Throughout the transformation the aim was to create a dataset with information
about each project and its registration date. The variable ‘proj_id’ 13 was
used to distinguish the projects from each other. The reason behind this is the
problem of shared names across hosts. The following analysis discusses this
problem; then the description of the Rubyforge dataset transformation
continues.
13 For descriptions see: Table 4.1.
OSS development environment (Forges): With the birth of Forges, the maturation
and creation of a large-scale market of users and developers of open source
became possible. Forges provide a basic, no-cost infrastructure for the
fundamental necessities of a project, such as the mailing list, which greatly
increases the effectiveness of communication within the community around the
project, or free file storage space. On the other hand, the widespread use of
forges has a downside.
Three of the most important disadvantages are the following.

First and foremost, information dissemination: it is not clear what happens in
a project, because information is lost between bug tracking, mailing lists and
forums; it is difficult for projects to interact with each other and very
hard, almost impossible, to track evolution between projects.

Second, distributed development, which at first sight might seem a big
advantage. It enables large-scale development by individual groups through
distributed version control. It can lead to huge success (Linux Kernel), but
combined with the first problem, the lack of information, it often leads to
confusion: it is hard to keep track of whether a certain problem has already
been solved, which can lead to double coding.

Finally, the shared names problem, namely that different projects can run
under the same name across the different forges. According to the FLOSSmole
project there were 1367 projects with shared names on Rubyforge and
Sourceforge in June 2009. For instance, starfish is a project listed on both
Sourceforge and Rubyforge. On Rubyforge, it is described as a “tool to make
programming ridiculously easy”, but on Sourceforge the starfish project is
described as a password management application (Howison, 2006). Figure 4.1.
shows the number of shared names across the largest project hosting sites.
Figure 4.1. Number of Shared Names across each Repository
Source: flossmole.org
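Detecting shared names amounts to intersecting the sets of project names of each pair of forges. The following sketch shows the idea on invented data: apart from “starfish”, the project names and the assignment of names to forges are hypothetical, and the function name is illustrative rather than part of FLOSSmole.

```python
from itertools import combinations

# Hypothetical project names per forge; only "starfish" is taken from the text.
forge_projects = {
    "Rubyforge": {"starfish", "demo-a", "demo-b"},
    "Sourceforge": {"starfish", "demo-b", "demo-c"},
    "Freshmeat": {"demo-c", "demo-a"},
}


def shared_names(projects_by_forge):
    """Map each pair of forges to the project names registered on both."""
    return {
        (a, b): projects_by_forge[a] & projects_by_forge[b]
        for a, b in combinations(sorted(projects_by_forge), 2)
    }


overlap = shared_names(forge_projects)
for pair, names in overlap.items():
    print(pair, sorted(names))
```

A shared name does not imply a shared project, which is why the analysis identifies projects by their id within a forge rather than by name.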
The description of the Rubyforge dataset transformation continues here. As the
first step, observations with no project id were dropped, which amounted to
31643 observations. The second step was to remove the repeated observations of
projects that were observed multiple times during the collection period. This
resulted in a dataset with 9181 different projects. Then the projects with no
registration date and the ones registered in the first and last observed month
were removed, the latter because the collection was not conducted on the last
day of the month. As the final step the projects were counted and aggregated
on a monthly level. The final dataset contains data about 8696 projects before
the aggregation. The corresponding datasets contain 231581, 55247 and 146
observations in the case of Sourceforge, Freshmeat and Objectweb,
respectively. Figure 4.2. shows the results for Rubyforge.
Figure 4.2. Number of new projects registered in a month to Rubyforge
Source: Author’s own calculation
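The transformation steps described above can be sketched in code. This is not the procedure actually run for the thesis: the rows are invented, the tuple layout is a simplification of the Flossmole variables in Table 4.1., and reading “first and last observed month” as the first and last month among the registration dates is an assumption.

```python
from collections import Counter
from datetime import date

# Invented snapshot rows: (proj_id, date_registered, date_collected).
rows = [
    (1, date(2006, 7, 3), date(2006, 7, 20)),
    (1, date(2006, 7, 3), date(2006, 8, 20)),     # same project, later snapshot
    (None, date(2006, 8, 1), date(2006, 8, 20)),  # no project id
    (2, date(2006, 8, 9), date(2006, 8, 20)),
    (3, None, date(2006, 9, 20)),                 # no registration date
    (4, date(2006, 8, 30), date(2006, 9, 20)),
    (5, date(2006, 9, 5), date(2006, 9, 20)),
]


def monthly_registrations(rows):
    """Count newly registered projects per (year, month)."""
    # Step 1: drop observations without a project id.
    rows = [r for r in rows if r[0] is not None]
    # Step 2: keep one observation per project.
    registered_on = {}
    for proj_id, registered, _collected in rows:
        registered_on.setdefault(proj_id, registered)
    # Step 3: drop projects without a registration date.
    dated = {p: d for p, d in registered_on.items() if d is not None}
    # Step 4: drop the first and last month, which are only partly observed.
    months = sorted({(d.year, d.month) for d in dated.values()})
    edge_months = {months[0], months[-1]}
    # Step 5: aggregate on a monthly level.
    return Counter((d.year, d.month) for d in dated.values()
                   if (d.year, d.month) not in edge_months)


counts = monthly_registrations(rows)
print(dict(counts))
```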
Figure 4.2. supports very well the Flossmole observation from Figure 1.1.:
since the beginning of the 2008 recession the number of new projects has been
decreasing; the growth of the forges is slowing down.
The next figure, representing the Rubyforge dataset complemented with the
Freshmeat data, shows similar results. The growth of the total number is
decreasing; however, this decline starts earlier, around mid-2004, compared
to the Rubyforge site.
Figure 4.3. Number of new projects registered in a month to Freshmeat and Rubyforge
Source: Author’s own calculation
Interestingly enough, the Sourceforge community shows different trends over
the same observation period. The first big and quick increase occurred at the
end of 2000, when the so far steady number of new projects tripled in just two
months. That might be a delayed result of the collapse of the so-called
dot-com bubble (the “IT bubble” collapsed on March 10, 2000). The activity
level stabilized at 1500 to 2000 for the next five years. From then until
mid-2008 the level of activity grew, along with the variance of the number of
newly registered projects. From 2008 until mid-2009 a massive 26% - 30%
decline took place, only for activity to take off again, double the number of
projects and reach an all-time high of 4500.
Figure 4.4. Number of new projects registered in a month to Sourceforge
Source: Author’s own calculation
Given the wide variety of patterns, the observations above neither rule out
nor confirm any relationship between the real economy and the level of OSS
activity.
To shed light on this relationship, in the following I present an econometric
model, together with a discussion of the related methodology and the results
of the estimation.
4.1.2. Macroeconomic data
The dataset consists of panel data with quarters and countries as the two
dimensions. Below, the choice of timeframe, number of countries and data
sources is described.
The period from 1998-Q1 to 2010-Q2 was chosen as the timeframe of the
analysis. The period was limited to the last twelve years since reliable
figures on OSS projects are available only for these years. With the use of
quarterly data an adequate number of 50 time periods is obtained for each
country.
To determine the countries included in the analysis I first looked at the
origin of OSS development. Engelhardt and Freytag (2009) state that, according
to studies, OSS developers are firstly well-educated software engineers (or ICT
students), since they must be able to write software code (i.e. programming),
and secondly must be able to think in abstract terms and logic. Additionally,
most programming languages are based on English and the whole communication
and coordination of OSS projects is done in English. Therefore macroeconomic
data has been collected for the OECD member countries and Brazil, Indonesia,
the Russian Federation and South Africa.
At this stage I face a problem, however: the same number of registered
projects is recorded for every country in a given period. Table 4.3.
illustrates the problem by presenting selected rows and columns from the
dataset. The records of the dependent variable (1F) 14 are constant across
countries.
Table 4.3. Selected records from the dataset of the analysis
year    quarter    country    country id    unemp level    unemp rate (%)    1F
1999    1          Austria    2             152667         4.0               471
1998    4          Austria    2             160333         4.2               324
1998    3          Austria    2             168667         4.4               313
1999    1          Canada     4             1224600        7.9               471
1998    4          Canada     4             1245467        8.1               324
1998    3          Canada     4             1258833        8.2               313
Source: STATA dataset
It is beyond the scope and resources of this thesis to explore the
geographical origin of each project. Therefore I used Engelhardt and Freytag’s
(2009) top 30 list of countries by active users to weight the dependent
variables. As a result, the analysis is limited to 25 countries 15.
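One way such a weighting could work is to distribute each quarter’s aggregate project count over the countries in proportion to their share of active users. The sketch below is an assumption about the mechanics, not the thesis’ documented procedure: the shares are invented placeholders, and only the 1F counts are taken from Table 4.3.

```python
# Invented shares of active OSS users by country (placeholders, not real data).
user_share = {"Austria": 0.02, "Canada": 0.05}

# Aggregate number of newly registered projects (1F) per (year, quarter),
# copied from Table 4.3.
projects_1f = {(1998, 3): 313, (1998, 4): 324, (1999, 1): 471}


def weight_by_country(totals, shares):
    """Spread each period's aggregate count over countries by user share."""
    return {
        (country, period): count * share
        for country, share in shares.items()
        for period, count in totals.items()
    }


weighted = weight_by_country(projects_1f, user_share)
print(weighted[("Canada", (1999, 1))])
```

After weighting, the dependent variable differs across countries within a quarter, which is what the panel estimation requires.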
In order to bring consistency to the analysis my aim was to collect data from
as few sources as possible. The only source of the macroeconomic data is
SourceOECD. Table 4.4. describes each macroeconomic variable used in the
analysis.
14 See the description of 1F on p. 36.
15 In alphabetical order the countries included are: Australia, Austria, Belgium, Brazil, Canada, Czech Republic, Denmark, Finland, France, Germany, Israel, Italy, Japan, Mexico, Netherlands, New Zealand, Norway, Poland, Russian Federation, South Africa, Spain, Sweden, Switzerland, United Kingdom, United States
Table 4.4. Macroeconomic variables used in the analysis
Variable               Description
emp                    Employment: People in civilian employment above a specified age who during the reference period were either paid employees, employers and self-employed, unpaid family workers or students with a temporary paid job (household survey based).
h_unem_r               Harmonized unemployment rate: Gives the number of unemployed people as a percentage of the civilian labour force. The civilian labour force consists of the employed and unemployed people.
g_indprod              Industrial production: An index covering production in mining, manufacturing and public utilities (electricity, gas and water), but excluding construction (growth since last quarter, seasonally adjusted).
g_retsales_vol         Retail trade volume index: Calculated by dividing total retail trade turnover in current prices by an appropriate price deflator (growth since last quarter, seasonally adjusted).
g_unit_labour_costs    Unit labour costs: Measure the average cost of labour per unit of output and are calculated as the ratio of total labour costs to real output (growth since last quarter, seasonally adjusted).
g_cons_prices          Consumer prices: Measure changes over time in the general level of prices of goods and services that a reference population acquires, uses or pays for consumption (growth since last quarter).
g_gdp                  Gross Domestic Product at constant prices, seasonally adjusted (growth since last quarter).
b_money                Broad Money supply: In addition to currency in circulation plus sight deposits held by domestic non-banks, also includes time deposits as well as savings deposits at short notice held by domestic non-banks (growth since last quarter, seasonally adjusted).
share_p                Share Prices: Prices of common shares of companies traded on national or foreign stock exchanges (growth since last quarter).
exp                    Export of Goods: Consists of exports of national products, exports without transformation of goods and exports from bonded warehouses which have not been transformed since import (in billions of USD, growth since last quarter, seasonally adjusted).
imp                    Import of Goods: Consists of imports for direct domestic consumption; withdrawals from bonded warehouses and free zones for domestic consumption; and imports into bonded warehouses and free zones (in billions of USD, growth since last quarter, seasonally adjusted).
s_exp                  Service Exports: Economic flows streaming into the economy from the rest of the world (in USD, growth since last quarter, seasonally adjusted).
s_imp                  Service Imports: Economic flows streaming from the economy to the rest of the world (in USD, growth since last quarter, seasonally adjusted).
4.1. Methodology
A linear regression attempts to model the relationship between the dependent
variable and the independent (or explanatory) variable(s) (Wooldridge, 2008).
A linear relationship can be visualized with a scatter plot.
Equation (4.1) is the equation of the linear regression.
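\[ Y_{it} = \beta_0 + X_{it}\beta + \varepsilon_{it} \qquad (4.1) \]
Where:

$Y_{it}$ is the dependent variable,

$\beta_0$ is the constant term,

$X_{it}$ is the vector of explanatory variables with coefficient vector $\beta$,

$\varepsilon_{it}$ is the error term.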
In the case of panel data, however, the model has to be modified. Panel data is
a dataset containing observations of the same subject at different times and in
different places (Wooldridge, 2008), for instance the growth of GDP between
1998 and 2010 in the different OECD countries. Panel data models are used as a
way of controlling for cross-sectional heterogeneity, which means that there is
something “different” about the observed units that cannot be reduced
completely to the observable data. Equation (4.1) can be extended to a fixed
or a random effects model:
\[ Y_{it} = \beta_0 + X_{it}\beta + Z_i\gamma + \alpha_i + \varepsilon_{it} \qquad (4.2) \]
Where:

$Y_{it}$ is the dependent variable,

$\beta_0$ is the constant term,

$X_{it}$ is the vector of explanatory variables with coefficient vector $\beta$,

$\varepsilon_{it}$ is the error term,

$Z_i\gamma$ is the observed effect of the time-invariant factors, which cannot
be estimated with the fixed effects model,

$\alpha_i$ is the unobserved individual-specific effect, fixed for each
individual across time.
The difference between the two models is in the underlying assumptions
(Wooldridge, 2008):

By using the fixed effects model we assume that the individual-specific effect
is correlated with the independent variables. In this case the time-invariant
factors (such as gender, name, etc.) are excluded from the equation by
subtracting the within-group mean from each observation, which removes the
individual-specific effect term $\alpha_i$.
\[ E(\alpha_i \mid X_{it}, Z_i\gamma) \neq 0 \]

In the random effects model, on the other hand, we assume that the
individual-specific effects are uncorrelated with the explanatory variables.
All coefficients are estimated, whether they belong to time-variant or
time-invariant variables. Since in this case the individual-specific effect is
not treated as fixed, $\alpha_i$ and $\varepsilon_{it}$ can be combined to form
a new error term $\xi_{it}$. Therefore we do not need to take differences and
all variables are included.
\[ E(\alpha_i \mid X_{it}, Z_i\gamma) = 0 \]
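The within transformation used by the fixed effects model can be sketched for a single regressor. The panel below is invented: it is constructed so that the dependent variable equals twice the regressor plus a unit-specific intercept, and the function name is illustrative. The sketch is not the STATA routine used for the estimations.

```python
from collections import defaultdict

# Invented panel: (unit, x, y) with y = 2 * x + unit-specific intercept.
panel = [
    ("A", 1.0, 12.0), ("A", 2.0, 14.0), ("A", 3.0, 16.0),  # intercept 10
    ("B", 1.0, -3.0), ("B", 2.0, -1.0), ("B", 4.0, 3.0),   # intercept -5
]


def within_slope(rows):
    """Fixed effects (within) estimate of the slope for one regressor."""
    groups = defaultdict(list)
    for unit, x, y in rows:
        groups[unit].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        # Subtracting the group means removes the unit-specific effect.
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


beta_hat = within_slope(panel)
print(round(beta_hat, 6))
```

Because the unit-specific intercepts drop out after demeaning, the slope of 2 is recovered although the two units have very different levels.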
4.2. Model
Taking the above into consideration, I decided to use the random effects model
to estimate the relationship between the number of projects registered and the
changes in certain macroeconomic factors. The final model is expressed in
equation (4.3).
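\[
\begin{aligned}
\mathit{g\_forges}_{it} = {} & \beta_0 + \beta_1 \mathit{emp}_{it} + \beta_2 \mathit{h\_unem\_r}_{it} + \beta_3 \mathit{g\_indprod}_{it} \\
& + \beta_4 \mathit{g\_retsales\_vol}_{it} + \beta_5 \mathit{g\_unit\_labour\_costs}_{it} + \beta_6 \mathit{g\_cons\_prices}_{it} \\
& + \beta_7 \mathit{g\_gdp}_{it} + \beta_8 \mathit{b\_money}_{it} + \beta_9 \mathit{share\_p}_{it} + \beta_{10} \mathit{exp}_{it} + \beta_{11} \mathit{imp}_{it} \\
& + \beta_{12} \mathit{s\_exp}_{it} + \beta_{13} \mathit{s\_imp}_{it} + \gamma_t + \delta_i + \xi_{it}
\end{aligned}
\qquad (4.3)
\]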
Where:

$i$ represents the given country,

$t$ represents the given quarter,

$\gamma_t$ captures the quarter-specific factors through quarter dummies,

$\delta_i$ captures the country-specific factors through country dummies.
4.3. Results
Altogether eight different regressions have been conducted. The eight models
can be separated into two groups in the following way:

the first group uses the robust standard error calculation method: Model 1-4,

the second group computes the standard errors in the usual way: Model 1B-4B.
The models differ in the number of quarters observed and in the dependent
variable used, rather than in the independent variables. Three different
datasets and two dependent variables have been constructed in order to capture
differences across the forges:

the first dataset (1DS) contains data on all 50 observed quarters (from
1998-Q1 to 2010-Q2),

the second (2DS) keeps information on the variables from 2003-Q3 to 2009-Q4,

finally the third dataset (3DS) has data on 40 quarters from 2000-Q1 to
2009-Q4,

the first dependent variable (1F) aggregates the records of the four forges,

the second (2F) sums the Sourceforge and Freshmeat data, which account for
97% of the observed number of projects.
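The number of quarters covered by each dataset follows from its start and end quarter. A minimal sketch of the calculation, with an illustrative function name, is:

```python
def quarters_in_span(start_year, start_quarter, end_year, end_quarter):
    """Number of quarters from start to end, both endpoints included."""
    return (end_year - start_year) * 4 + (end_quarter - start_quarter) + 1


datasets = {
    "1DS": quarters_in_span(1998, 1, 2010, 2),
    "2DS": quarters_in_span(2003, 3, 2009, 4),
    "3DS": quarters_in_span(2000, 1, 2009, 4),
}
print(datasets)
```

The calculation reproduces the 50 and 40 quarters stated for 1DS and 3DS and gives 26 quarters for 2DS.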
Tables 4.5. and 4.6. summarize the results of the estimations; the description
of the model specifications follows after the output tables.
Table 4.5. RE Estimation Results for Models 1-4 (robust standard errors)

                        Model 1     Model 2     Model 3     Model 4
Employment              -0.000*     -0.000      0.000       -0.000**
                        (0.000)     (0.000)     (0.000)     (0.000)
Unemployment Rate       -0.012      0.007       0.014       0.019*
                        (0.034)     (0.044)     (0.011)     (0.011)
Industrial Production   -0.003      -0.002      0.007*      0.006
                        (0.012)     (0.013)     (0.003)     (0.005)
Retail Sales Volume     0.004       -0.025      0.006       0.006
                        (0.032)     (0.043)     (0.006)     (0.008)
Unit Labour Costs       0.013       0.059       -0.008      0.014
                        (0.068)     (0.092)     (0.012)     (0.014)
Consumer Prices         -0.044      -0.065      0.002       0.012
                        (0.040)     (0.051)     (0.011)     (0.014)
GDP Growth              0.035       0.064       -0.022**    0.003
                        (0.034)     (0.047)     (0.009)     (0.013)
Broad Money Supply      -0.001      -0.004      -0.005      -0.011*
                        (0.018)     (0.019)     (0.005)     (0.006)
Share Prices            -0.002      -0.005      0.001       0.000
                        (0.004)     (0.005)     (0.001)     (0.001)
Export of Goods         0.008       0.009       -0.002      0.002
                        (0.008)     (0.009)     (0.002)     (0.002)
Import of Goods         -0.023**    -0.024**    -0.001      -0.007**
                        (0.010)     (0.012)     (0.003)     (0.003)
Service Exports         0.011       0.009       0.002**     0.001
                        (0.008)     (0.009)     (0.001)     (0.001)
Service Imports         -0.008      -0.007      -0.003**    -0.004*
                        (0.008)     (0.010)     (0.002)     (0.002)
Constant                1.664*      2.233       0.061       0.304***
                        (0.910)     (1.381)     (0.103)     (0.107)
Overall R2              0.140       0.139       0.393       0.421
Number of obs.          516         516         276         416

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%,
and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
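The star notation in Table 4.5 can be related to the reported coefficients and standard errors. The following is a minimal sketch, assuming a standard normal reference distribution for the ratio of coefficient to standard error; the function names are hypothetical, and because the table values are rounded to three decimals, borderline cases cannot be reproduced exactly.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value of coef/se under a standard normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def stars(p):
    """Map a p-value to the star notation used in the table notes."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# (coefficient, robust standard error) pairs copied from Table 4.5.
examples = {
    "Import of Goods, Model 1": (-0.023, 0.010),
    "Constant, Model 1": (1.664, 0.910),
    "Constant, Model 4": (0.304, 0.107),
    "GDP Growth, Model 1": (0.035, 0.034),
}

for label, (coef, se) in examples.items():
    p = two_sided_p(coef, se)
    print(f"{label}: z = {coef / se:.2f}, p = {p:.3f} {stars(p)}")
```

For these four entries the approximation yields the same stars as the table. Since the coefficients are identical in the robust and non-robust estimations, any difference in stars between the two comes only from the standard errors.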
Table 4.6. RE Estimation Results for Models 1B-4B (normal standard errors)

                        Model 1B    Model 2B    Model 3B    Model 4B
Employment              -0.000***   -0.000***   0.000       -0.000**
                        (0.000)     (0.000)     (0.000)     (0.000)
Unemployment Rate       -0.012      0.007       0.014*      0.019**
                        (0.037)     (0.048)     (0.009)     (0.009)
Industrial Production   -0.003      -0.002      0.007*      0.006
                        (0.017)     (0.022)     (0.003)     (0.004)
Retail Sales Volume     0.004       -0.025      0.006       0.006
                        (0.028)     (0.036)     (0.006)     (0.007)
Unit Labour Costs       0.013       0.059       -0.008      0.014
                        (0.054)     (0.071)     (0.012)     (0.013)
Consumer Prices         -0.044      -0.065      0.002       0.012
                        (0.054)     (0.071)     (0.011)     (0.012)
GDP Growth              0.035       0.064       -0.022**    0.003
                        (0.049)     (0.064)     (0.010)     (0.012)
Broad Money Supply      -0.001      -0.004      -0.005      -0.011**
                        (0.023)     (0.030)     (0.004)     (0.005)
Share Prices            -0.002      -0.005      0.001       0.000
                        (0.004)     (0.005)     (0.001)     (0.001)
Export of Goods         0.008       0.009       -0.002      0.002
                        (0.009)     (0.012)     (0.002)     (0.002)
Import of Goods         -0.023**    -0.024*     -0.001      -0.007***
                        (0.010)     (0.014)     (0.002)     (0.003)
Service Exports         0.011       0.009       0.002*      0.001
                        (0.007)     (0.009)     (0.001)     (0.002)
Service Imports         -0.008      -0.007      -0.003*     -0.004*
                        (0.008)     (0.011)     (0.002)     (0.002)
Constant                1.664***    2.233***    0.061       0.304***
                        (0.359)     (0.471)     (0.101)     (0.095)
Overall R2              0.140       0.139       0.393       0.421
Number of obs.          516         516         276         416

The values in parentheses are the standard errors. Significance levels are denoted by *** = 1%, ** = 5%,
and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
Both Model 1 (M1) and Model 2 (M2) are based on the first dataset (1DS), but
while (1F) is the dependent variable in (M1), in (M2) it is (2F). The
difference between the two dependent variables is that (2F) excludes the
Rubyforge data. Since Rubyforge accounts for only about 3% of the total number
of new projects, there is no major difference in the results. The overall R2
is fairly low in both cases: around 14% of the variance in the dependent
variable is explained by the independent variables. Only Import of Goods is
significant at the 5% level: a 1% change in the growth of imports is
associated with a -0.023% change in the growth of OSS activity. Employment is
significant at the 10% level in (M1), but the size of its impact is
negligible.
The low R2 of (M1) and (M2) leaves room for improvement. Model 3 (M3) is
estimated on (2DS) with (1F) as the dependent variable. In this case the model
explains considerably more of the variance in OSS activity: the R2 is 0.393.
This model finds GDP growth, Service Exports and Service Imports to be
significant at the 5% level, with point estimates of (-0.022), 0.002 and
(-0.003) respectively.
Model 4 (M4) regresses (2F) on data from (3DS). This model shows the highest
R2 so far, 0.421. It is also the model with the largest number of significant
regressors: Employment and Import of Goods at the 5% level, and Unemployment,
Money Supply and Service Imports at the 10% level. The effect of employment is
again negligibly small. Meanwhile, a 1% change in the growth of imported goods
leads to a -0.007% change in the growth of OSS activity. Unemployment has a
positive effect of 0.019%, while money supply and service imports have
negative effects of (-0.011%) and (-0.004%) respectively.
(M4) seems to provide a good basis for the next step of the analysis, where I
create different models using the (2DS) and (2F) specifications but vary the
set of regressors.
Table 4.6. RE Estimation Results for Models 4-10 (robust standard errors)

                    Model 4    Model 5    Model 6    Model 7    Model 8    Model 9    Model 10
Employment          -0.000**
                    (0.000)
Unemployment        0.019*                                                            0.019*
                    (0.011)                                                           (0.010)
Industrial Prod.    0.006                            0.003      0.004                 0.008**
                    (0.005)                          (0.004)    (0.004)               (0.004)
Retail Sales        0.006                 -0.010*               -0.012**              0.002
                    (0.008)               (0.006)               (0.005)               (0.008)
Unit Labour Costs   0.014                            0.006      0.005                 0.018
                    (0.014)                          (0.010)    (0.010)               (0.014)
Consumer Prices     0.012                 0.003                 0.012                 0.017
                    (0.014)               (0.008)               (0.011)               (0.014)
GDP Growth          0.003      0.007      0.028***   0.011      0.023**    0.012*
                    (0.013)    (0.006)    (0.007)    (0.011)    (0.011)    (0.006)
Money Supply        -0.011*    -0.010**                                    -0.010**   -0.011**
                    (0.006)    (0.004)                                     (0.005)    (0.005)
Share Prices        0.000      0.001                                       0.001      0.001
                    (0.001)    (0.001)                                     (0.001)    (0.001)
Exp. of Goods       0.002                 0.001                 0.002                 0.002
                    (0.002)               (0.001)               (0.002)               (0.002)
Imp. of Goods       -0.007**              -0.005***             -0.008***             -0.007**
                    (0.003)               (0.002)               (0.002)               (0.003)
Service Exports     0.001                 -0.000                           -0.001     -0.000
                    (0.001)               (0.001)                          (0.001)    (0.000)
Service Imports     -0.004*               -0.001                           -0.002*    -0.002
                    (0.002)               (0.001)                          (0.001)    (0.002)
Constant            0.304***   0.277***   0.251***   0.246***   0.252***   0.279***   0.170**
                    (0.107)    (0.034)    (0.031)    (0.032)    (0.033)    (0.035)    (0.076)
R2                  0.421      0.365      0.378      0.366      0.393      0.371      0.408
Number of obs.      416        627        898        796        796        587        436

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
Models 5-10 capture the differences between the effects of the various
macroeconomic factors on OSS activity. In general, the regressors in Models
4-10 explain 36.5% to 42.1% of the variation in the response.
• Model 5 captures the effects of changes in the money supply and share
prices; GDP growth is included to capture the state of the given
country's economy. Of the three variables, only money supply is
significant, at the 5% level. The coefficient has a negative sign,
so OSS activity growth increases by 0.011% with a 1% decrease in
money supply growth.
• Model 9 extends (M5) with service exports and service imports, following
the inward and outward economic flows. OSS activity appears to be
affected only by the inward flows (service imports); the effect is very
small, with a point estimate of (-0.002), and significant only at the
10% level. Money supply is still significantly negative, but the
coefficient increased to (-0.010). Finally, GDP growth becomes
significant at the 10% level, with a point estimate of 0.012.
• Model 6 focuses on the consumption side of the economy. The growth of
retail sales is significant only at the 10% level, with a (-0.010%)
effect on OSS activity. Consumer prices, export of goods and service
exports have no significant effect, which is a common pattern for these
three variables across the models. Import of goods has a small but
highly significant (1% level) point estimate of (-0.005). In this model,
unlike in (M9), service imports have no significant effect. GDP growth
has a significant effect of 0.028% on OSS activity growth, the largest
GDP effect across the models.
• Model 7 focuses on the production side of the economy; neither
industrial production, GDP growth nor labour productivity (proxied by
unit labour costs) is significant.
• Model 8 combines the consumption (M6) and production (M7) sides. The
results are very similar to the previous ones: none of (M7)'s variables
becomes significant. The previously significant variables from (M6)
remain significant, although the levels change. Retail sales is now
significant at the 5% level, up from 10%, while the significance of GDP
growth weakens from the 1% to the 5% level. Import of goods is again
significant at the 1% level. The point estimates are (-0.012), 0.023 and
(-0.008) respectively.
• Model 10 includes all the variables except GDP and Employment. This is
the only model in which industrial production is significant, with a
point estimate of 0.008 at the 5% level. Money supply and Import of
goods are also significant, as in all other models that include them,
and their point estimates are the same as in the base model (M4).
Unemployment also has the same effect and significance level as in (M4).
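The way the seven specifications relate to each other can be summarized as sets of regressors. The sketch below is illustrative only: the short variable labels are hypothetical, and the set memberships are read off the output tables rather than taken from the original STATA code.

```python
# Short labels for the thirteen regressors (hypothetical names for this sketch).
ALL = {
    "employment", "unemployment", "industrial_production", "retail_sales",
    "unit_labour_costs", "consumer_prices", "gdp_growth", "money_supply",
    "share_prices", "export_goods", "import_goods",
    "service_exports", "service_imports",
}

SERVICES = {"service_exports", "service_imports"}

MODELS = {
    "M4": set(ALL),                                          # base model
    "M5": {"gdp_growth", "money_supply", "share_prices"},    # monetary side
    "M6": {"retail_sales", "consumer_prices", "gdp_growth",
           "export_goods", "import_goods"} | SERVICES,       # consumption side
    "M7": {"industrial_production", "unit_labour_costs",
           "gdp_growth"},                                    # production side
}
# As read off the tables: M8 merges M6 and M7 without the service flows.
MODELS["M8"] = (MODELS["M6"] | MODELS["M7"]) - SERVICES
MODELS["M9"] = MODELS["M5"] | SERVICES
MODELS["M10"] = ALL - {"gdp_growth", "employment"}

for name, regressors in MODELS.items():
    print(name, len(regressors), sorted(regressors))
```

Writing the specifications this way makes it easy to see which variables are shared between models, for example that GDP growth enters every model except Model 10.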
In addition to the aggregate OSS sector analysis, I estimated the models for
the individual forges. The (M4)-(M10) specifications were used with each
forge's activity growth as the dependent variable. Objectweb was excluded from
the analysis due to its limited number of observations.
Table 4.7. Results for forge-wise RE estimation with (M4) regressors

                        Model 4     Model 4sf   Model 4fm   Model 4rf
Employment              -0.000**    -0.000**    -0.000**    -0.000
                        (0.000)     (0.000)     (0.000)     (0.000)
Unemployment            0.019*      0.024       0.004       -0.078***
                        (0.011)     (0.023)     (0.030)     (0.026)
Industrial Production   0.006       0.007       -0.005      -0.016
                        (0.005)     (0.011)     (0.010)     (0.010)
Retail Sales            0.006       0.003       -0.003      0.037**
                        (0.008)     (0.018)     (0.028)     (0.015)
Unit Labour Costs       0.014       0.031       -0.060      0.059
                        (0.014)     (0.027)     (0.045)     (0.038)
Consumer Prices         0.012       0.019       -0.024      -0.003
                        (0.014)     (0.031)     (0.027)     (0.026)
GDP Growth              0.003       0.051       0.009       0.024
                        (0.013)     (0.036)     (0.028)     (0.030)
Broad Money             -0.011*     -0.019      -0.000      -0.016
                        (0.006)     (0.015)     (0.016)     (0.019)
Share Prices            0.000       0.002       -0.000      0.003*
                        (0.001)     (0.003)     (0.003)     (0.002)
Export of Goods         0.002       0.011*      0.006       -0.004
                        (0.002)     (0.006)     (0.007)     (0.004)
Import of Goods         -0.007**    -0.016**    -0.008      0.010**
                        (0.003)     (0.007)     (0.009)     (0.005)
Service Exports         0.001       -0.002      0.008       0.004
                        (0.001)     (0.003)     (0.008)     (0.003)
Service Imports         -0.004*     -0.006      -0.005      -0.007**
                        (0.002)     (0.004)     (0.007)     (0.003)
Constant                0.304***    0.670**     0.616**     0.515*
                        (0.107)     (0.274)     (0.287)     (0.300)
R2                      0.421       0.297       0.106       0.193
Number of obs.          416         416         506         272

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
In the general model with all of the independent variables, R2 varies
considerably across forges. None of the variables is significant for Freshmeat
except employment, whose effect is negligibly small. The aggregate-level
analysis and Sourceforge show similar results: in both models import of goods
is significant at the 5% level, with a point estimate of (-0.016) for
Sourceforge, more than double the effect in the aggregate model. Export of
goods also has a significant effect (at the 10% level) on the Sourceforge
community's activity, with a coefficient of 0.011. For Rubyforge, unemployment
has both the most significant and the largest effect, (-0.078%). Retail sales,
import of goods and service imports are also significant at the 5% level, with
point estimates of 0.037, 0.010 and (-0.007) respectively.
Table 4.8. Results for forge-wise RE estimation with (M5) regressors

                        Model 5     Model 5sf   Model 5fm   Model 5rf
GDP Growth              0.007       0.034**     0.021       0.006
                        (0.006)     (0.016)     (0.018)     (0.009)
Broad Money             -0.010**    -0.016      0.000       0.020
                        (0.004)     (0.011)     (0.010)     (0.015)
Share Prices            0.001       0.002       0.001       0.002**
                        (0.001)     (0.002)     (0.001)     (0.001)
Constant                0.277***    0.483***    0.100       -0.002
                        (0.034)     (0.085)     (0.078)     (0.066)
R2                      0.365       0.256       0.079       0.030
Number of obs.          627         627         764         407

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
Once again, GDP, money supply and share prices cannot explain any of the
variance significantly for Freshmeat. Rubyforge's R2 drops to almost 0, from
nearly 0.2 under the general (M4) specification, even though changes in share
prices are significant and account for a 0.002% effect on activity. GDP is the
only significant variable for Sourceforge, with a point estimate of 0.034.
Broad money is not significant for any of the forges, although it is at the
aggregate level.
Table 4.9. Results for forge-wise RE estimation with (M6) regressors

                        Model 6     Model 6sf   Model 6fm   Model 6rf
Retail Sales            -0.010*     -0.027*     0.007       0.003
                        (0.006)     (0.014)     (0.015)     (0.011)
Consumer Prices         0.003       0.006       0.000       0.000
                        (0.008)     (0.017)     (0.016)     (0.017)
GDP Growth              0.028***    0.077***    0.030*      0.012
                        (0.007)     (0.017)     (0.018)     (0.012)
Export of Goods         0.001       0.007**     0.002       0.000
                        (0.001)     (0.004)     (0.004)     (0.003)
Import of Goods         -0.005***   -0.010***   -0.006      0.002
                        (0.002)     (0.004)     (0.005)     (0.003)
Service Exports         -0.000      -0.001      0.000       -0.000
                        (0.001)     (0.001)     (0.003)     (0.002)
Service Imports         -0.001      -0.002      0.001       0.003*
                        (0.001)     (0.002)     (0.002)     (0.002)
Constant                0.251***    0.447***    0.086       0.042
                        (0.031)     (0.076)     (0.073)     (0.050)
R2                      0.378       0.272       0.079       0.025
Number of obs.          898         898         1080        589

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
The consumption variables have almost no effect on activity at either
Rubyforge or Freshmeat. For Rubyforge, service imports are significant only at
the 10% level, with a very low point estimate of 0.003; for Freshmeat, GDP
growth is likewise significant only at the 10% level, with a point estimate of
0.030. Sourceforge once again shows results very similar to the aggregate
level. The only differences are the size of the effects and that export of
goods is significant, with a coefficient of 0.007.
Table 4.10. Results for forge-wise RE estimation with (M7) regressors

                        Model 7     Model 7sf   Model 7fm   Model 7rf
Industrial Production   0.003       -0.005      -0.013**    -0.016*
                        (0.004)     (0.008)     (0.007)     (0.009)
Unit Labour Costs       0.006       0.022       -0.013      0.094***
                        (0.010)     (0.021)     (0.029)     (0.023)
GDP Growth              0.011       0.076***    0.075***    0.084***
                        (0.011)     (0.027)     (0.026)     (0.024)
Constant                0.246***    0.409***    0.072       -0.058
                        (0.032)     (0.078)     (0.078)     (0.054)
R2                      0.366       0.265       0.084       0.077
Number of obs.          796         796         974         511

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
The model focusing on the production side of the economy has no significant
variables at the aggregate level; at the forge level, however, a 1% increase
in GDP growth is associated with 0.076%, 0.075% and 0.084% growth in activity
at Sourceforge, Freshmeat and Rubyforge respectively. Furthermore, changes in
industrial production have a negative effect on Freshmeat, with a coefficient
of (-0.013). According to this specification, Rubyforge's activity growth
increases by 0.094% with a 1% decrease in labour productivity (that is, a 1%
increase in unit labour costs).
Table 4.11. Results for forge-wise RE estimation with (M8) regressors

                        Model 8     Model 8sf   Model 8fm   Model 8rf
Industrial Production   0.004       -0.003      -0.013*     -0.017**
                        (0.004)     (0.008)     (0.007)     (0.008)
Retail Sales            -0.012**    -0.038***   0.017       0.008
                        (0.005)     (0.014)     (0.016)     (0.012)
Unit Labour Costs       0.005       0.017       -0.016      0.094***
                        (0.010)     (0.020)     (0.029)     (0.023)
Consumer Prices         0.012       0.029       0.004       -0.016
                        (0.011)     (0.024)     (0.018)     (0.019)
GDP Growth              0.023**     0.097***    0.071***    0.074***
                        (0.011)     (0.027)     (0.027)     (0.026)
Export of Goods         0.002       0.010**     0.002       -0.000
                        (0.002)     (0.005)     (0.005)     (0.003)
Import of Goods         -0.008***   -0.015***   -0.005      0.005
                        (0.002)     (0.006)     (0.007)     (0.004)
Constant                0.252***    0.423***    0.058       -0.062
                        (0.033)     (0.080)     (0.076)     (0.054)
R2                      0.393       0.294       0.086       0.083
Number of obs.          796         796         974         511

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
Model 8 combines the consumption (M6) and production (M7) sides. For
Sourceforge, in addition to the variables significant in the aggregate model,
export of goods is significant, with a point estimate of 0.010. Freshmeat is
affected by GDP growth at the 1% level, as is Rubyforge, and the estimates are
very close (0.071 and 0.074). Labour productivity and industrial production
once again significantly affect the activity growth of Rubyforge.
Table 4.12. Results for forge-wise RE estimation with (M9) regressors

                        Model 9     Model 9sf   Model 9fm   Model 9rf
GDP Growth              0.012*      0.045***    0.016       0.005
                        (0.006)     (0.017)     (0.019)     (0.010)
Broad Money             -0.010**    -0.015      -0.005      0.017
                        (0.005)     (0.011)     (0.011)     (0.015)
Share Prices            0.001       0.003       0.000       0.002*
                        (0.001)     (0.002)     (0.001)     (0.001)
Service Exports         -0.001      -0.003      0.001       -0.002
                        (0.001)     (0.002)     (0.003)     (0.002)
Service Imports         -0.002*     -0.005**    0.001       0.002
                        (0.001)     (0.002)     (0.002)     (0.002)
Constant                0.279***    0.495***    0.107       0.008
                        (0.035)     (0.087)     (0.079)     (0.067)
R2                      0.371       0.267       0.077       0.031
Number of obs.          587         587         711         377

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
Model 9 has no variables significant at the 5% level for the two small forges;
only share prices are significant for Rubyforge, at the 10% level. Only GDP
growth and service imports affect Sourceforge significantly, with estimates of
0.045 and (-0.005) respectively. Interestingly, the effect of money supply is
insignificant for all three forges, although it is significant at the
aggregate level.
Table 4.13. Results for forge-wise RE estimation with (M10) regressors

                        Model 10    Model 10sf  Model 10fm  Model 10rf
Unemployment            0.019*      0.018       0.011       -0.078***
                        (0.010)     (0.021)     (0.029)     (0.023)
Industrial Production   0.008**     0.019**     0.001       -0.011
                        (0.004)     (0.008)     (0.009)     (0.007)
Retail Sales            0.002       0.002       0.010       0.041***
                        (0.008)     (0.017)     (0.024)     (0.014)
Unit Labour Costs       0.018       0.025       -0.030      0.051
                        (0.014)     (0.027)     (0.043)     (0.036)
Consumer Prices         0.017       0.030       0.009       -0.005
                        (0.014)     (0.032)     (0.023)     (0.026)
Broad Money             -0.011**    -0.016      0.006       -0.013
                        (0.005)     (0.015)     (0.016)     (0.019)
Share Prices            0.001       0.003       -0.000      0.003*
                        (0.001)     (0.003)     (0.003)     (0.002)
Export of Goods         0.002       0.009       0.006       -0.004
                        (0.002)     (0.006)     (0.007)     (0.004)
Import of Goods         -0.007**    -0.015**    -0.006      0.011**
                        (0.003)     (0.007)     (0.009)     (0.005)
Service Exports         -0.000      -0.002      0.004       0.004
                        (0.001)     (0.003)     (0.007)     (0.003)
Service Imports         -0.002      -0.006      -0.000      -0.008**
                        (0.002)     (0.004)     (0.007)     (0.003)
Constant                0.170**     0.398**     0.025       0.420**
                        (0.076)     (0.167)     (0.180)     (0.172)
R2                      0.408       0.282       0.078       0.189
Number of obs.          436         436         533         273

The values in parentheses are the standard errors.
Significance levels are denoted by *** = 1%, ** = 5%, and * = 10%.
All models have been estimated using country dummies and quarter dummies (not shown).
Source: STATA output
Finally, Model 10. Freshmeat is once again explained very poorly by the model,
with no significant variable. For Sourceforge the only significant variables
are industrial production (0.019) and import of goods (-0.015). Rubyforge is
affected by the most factors: unemployment (-0.078) and retail sales (0.041)
are both significant at the 1% level, and import of goods and service imports
at the 5% level, with point estimates of 0.011 and (-0.008). Finally, share
prices are significant only at the 10% level, with a small effect of 0.003%.
4.4. Conclusions of the analyses
The results found in section 4 can be divided into two parts. The first part
describes how the forges' aggregate activity responds to changes in different
macroeconomic factors. The second part compares the effects of the same
variables on the activity growth of the individual forges.
At the aggregate level, seven models, (M4)-(M10), were built in order to
estimate the effects of different macroeconomic factors. Unemployment, money
supply and import of goods were significant in all models in which they were
included. GDP growth, industrial production, retail sales and service imports
were significant only in certain models. The estimates of the consistently
significant factors vary little or not at all across the models. Unemployment
has the largest effect, with a point estimate of 0.019; it is also the only
one of the three variables with a positive relation to the dependent variable.
The estimated effect of money supply changes is between (-0.010) and (-0.011).
The variation in coefficients is largest for imported goods, ranging between
(-0.005) and (-0.008). These effects seem very small, but put into a different
context, a 1% growth in the unemployment rate can result in 349 [16] new
projects registered on the sites compared to the previous quarter's figure.
For money supply and imports, growth of the same size can result in 184 and
92-147 fewer projects registered on the forges respectively.
[16] Details of the calculation can be found in Appendix A.
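The project counts quoted above can be cross-checked against the coefficients. The sketch below does not reproduce the calculation in Appendix A; it only assumes, for illustration, that the number of projects scales linearly with the coefficient, and it derives the scale from the unemployment figures given in the text. The function name `implied_projects` is hypothetical.

```python
# Figures quoted in the text: a coefficient of 0.019 corresponds to 349 projects.
UNEMPLOYMENT_COEF = 0.019
UNEMPLOYMENT_PROJECTS = 349

# Assumption of this sketch: projects scale linearly with the coefficient.
projects_per_unit_coef = UNEMPLOYMENT_PROJECTS / UNEMPLOYMENT_COEF


def implied_projects(coef):
    """Projects implied by a coefficient under the linear-scaling assumption."""
    return round(abs(coef) * projects_per_unit_coef)


print(implied_projects(-0.010))  # money supply
print(implied_projects(-0.005))  # import of goods, smallest estimate
print(implied_projects(-0.008))  # import of goods, largest estimate
```

Under this assumption the three calls give 184, 92 and 147 projects, in line with the figures reported for money supply and imports.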
The regressions on the change in the individual forges' activity are specified
in the same way as the aggregate ones.
• Sourceforge: GDP consistently has a significant effect on Sourceforge's
activity. The estimates are always positive, which means that activity
moves along with the business cycle. The size of the effect varies
between 0.034 and 0.097 depending on the model specification. Import of
goods also seems to have a significant effect on Sourceforge; however,
the relationship is reversed in this case, with negative estimates that
vary from (-0.010) to (-0.015). This equals approximately 37-555 [17]
fewer projects registered in case of a 1% increase in the growth of the
value of imported goods.
• Freshmeat follows a totally different activity path; all the models have
a low R2, meaning that other, unobserved factors cause the variance in
the activity changes. Industrial production and GDP have significant
effects in some cases. While the former has a negative estimate of
(-0.013), the latter has a larger, positive impact, with coefficients of
0.071 and 0.075. This equals 1491-1575 [18] more projects in case of a
1% increase in the growth rate of GDP.
• Rubyforge: Interestingly, the smallest forge is significantly sensitive
to changes in unit labour costs; the relationship is positive, with an
estimate of 0.094. This means that a one-unit decline in labour
productivity, i.e. an increase in unit labour costs (ULC), leads to
approximately 846 [19] more projects registered on the host. The other
significant variable is GDP growth, with a point estimate of 0.074 to
0.084.
The above results suggest that the forges differ: each is affected differently
by the same macroeconomic factors. This can be the result of the different
types of projects hosted or of the different composition of the developer
community. Unfortunately, it is not possible to determine from the available
dataset which of these affects the activity, and how.
[17] For details see Appendix B, p. 62.
[18] For details see Appendix B, p. 62.
[19] For details see Appendix B, p. 62.
5. Conclusion
The economic crisis that started in late 2008 brought huge changes to the
world economy. Numerous developed countries had to face slowing or even
negative GDP growth; Ireland, the former "Celtic tiger", and Greece
received enormous aid packages to be able to keep financing their debt.
According to Flossmole, these hard times seem to have affected the OSS
sector as well, and through it the whole software industry.
According to the analysis conducted in section 3, the software industry was
indeed affected by the crisis, but the data suggest that the effect was minor
in terms of revenue lost compared to, for example, the construction
industry. Both the European and the global market were able to grow;
however, this growth lagged behind the high rates experienced previously.
There were losers and winners; in general, among the different business
models, those with 50-60% of revenue from subscription fees and 40-50%
from services gained from the crisis (Desmond J. P., 2010). The industry is
a major employer in the R&D sector, and despite the depression it employs
more and more people from year to year.
Section 4 analyzed how OSS activity responds to macroeconomic factors.
The models show that the effect on activity is limited in general: import of
goods, the unemployment rate and money supply have an effect on the
sector. However, there are differences across the forges. In general, due to
the crisis many entrepreneurs turned to the OSS sector because of
cost-cutting incentives.
As the second section shows, OSS is a very complex phenomenon with
numerous research possibilities. The activity of the community is affected
by many different factors, such as motivation, private investment in the
sector and types of licenses; macroeconomic factors are only a small part of
the picture. According to the results, changes in the real economy have a
limited effect on the OSS sector, and there are numerous unobserved factors
not covered by this study.
For further research it could be interesting to expand the dataset to the
lines of code generated in the forges. This would result in a more precise
dataset, because the database I built during the study has its limitations.
To name only one, the activity is weighted only by country and the weights
are therefore constant over time, whereas the number of active developers
changes dynamically. Furthermore, the low R2 suggests room for improvement
by including new variables.
List of References
Barro, R. J. (1989), Modern Business Cycle Theory, Harvard University
Press, Cambridge, Mass.
Benkler, Y. (2002), Coase’s penguin or Linux and the nature of the firm.
Yale Law Journal, 112(3) 369-446.
Bessen, J. (2002), What good is free software? In Hahn, editor, (2002),
Government policy towards Open Source Software, pp. 12-33. AEI Brooking
Joint Center for Regulatory Studies, Washington DC, USA.
Bitzer, J. (2004), Commercial versus open source software: The role of
product heterogeneity in competition. Economic Systems, 28(4): 369-381.
Bitzer J. ,Schröder P., editors (2006), The economics of open source
software development, Elsevier, Amsterdam.
Blain, T. "Software Cost Estimation With Use Case Points - Introduction."
Tyner Blain Blog. February 12, 2007.
http://tynerblain.com/blog/2007/02/12/software-cost-estimation-ucp-1/.
(accessed October 20, 2010.)
Bonaccorsi, A. and Rossi, C. (2003), Why open source can succeed,
Research Policy, 32(7): 1243-1258.
Boehm, B. W. (1981), Software Engineering Economics, Prentice-Hall.
Casadeus – Masanell, R. and Ghemwhat, P. (2003), Dynamic Mixed
Duopoly: A Model Motivated by Linux vs. Windows. Harvard Business School
working paper.
Casarin, R. and Trecroci, C. (2007), Business Cycle and Stock Market
Volatility: Are They Related?, working paper, available at SSRN:
http://ssrn.com/abstract=888524 (accessed: 5 June, 2010)
Chemuturi, M. and Kaligotla, S. "Productivity for Software Estimators"
http://www.metricssoftware.com/Productivity%20for%20Software%20Estimators.pdf
(accessed October 19, 2010)
Chauvet, M., (2001), Stock Market Fluctuations And The Business Cycle,
working paper, available at SSRN: http://ssrn.com/abstract=283793 (accessed:
3 June, 2010)
Chordia, T. and Shivakumar, L. (2001), Earnings, Business Cycle And
Stock Returns, working paper, available at SSRN:
http://ssrn.com/abstract=281491 (accessed: 3 June, 2010).
Cutting, T. (2009). "Estimating Lessons Learned in Project Management Traditional."
http://www.pmhut.com/estimating-lessons-learned-in-project-management-traditional.
(accessed October 20, 2010)
Deci, E. L. and Ryan R. M. (1985), Intrinsic motivation and
Self-determination in Human Behaviour, Plenum Press, New York.
Economides, N. and Katsamakas, E. (2005), Two-sided competition of
proprietary vs. open source technology platforms and the implications for the
software industry, working paper, Stern School of Business, New York
University
Engelhardt, S. and Freytag A. (2009), Geographic Allocation of OSS
Contributions: The Role of Institutions and Culture, Jena Economic Research
Papers, Friedrich Schiller University and the Max Planck Institute of Economics,
Jena, Germany
Feller, J and Fitzgerald, B. (2002), Understanding Open Source Software
Development, Addison Wesley, Boston, MA.
Florac, W. A. and Carleton, A.D, editors (1999), Measuring the Software
Process, Addison Wesley, Boston, MA.
Friedman, M. and Schwartz, A. (1963), A Monetary History of the United
States, 1867-1960, Princeton University Press, Princeton, NJ.
Gaudeul, A. (2004), Competition between open-source and proprietary
software: the (La)TeX case study, EconWPA, Industrial Organization.
Groth, R. (2004), Is the Software Industry's Productivity Declining? IEEE
Software. 21(6): 92-94
Guiliano, A. et al. (1999), A Function Point-Like Measure for
Object-Oriented Software, Empirical Software Engineering. 4(3): 263–287.
Hall, T. E. (1990), Business cycles: the nature and causes of economic
fluctuations, Praeger Publishers, New York.
Harison, E. and Koski, H. (2008), Does open innovation foster
productivity? Evidence from open source software (OSS) firms, pp. 23.
Discussion Papers, Keskusteluaiheita, ETLA, The Research Institute of the
Finnish Economy, Elinkeinoelämän Tutkimuslaitos.
Hars, A. and Ou S. (2002) Working for Free? Motivations of participating
in open source projects, International Journal of Electronic commerce, 6(3):
25-40.
Healy, K. and Schussman, A. (2003), The ecology of open source
software development, working paper, available at:
www.kieranhealy.org/files/drafts/oss-activity.pdf (accessed: 14 September,
2010)
Hihn, J. (2006), Sizing the System, Quantitative Software Management:
Cost Estimation & Sizing, Power Point.
http://software.gsfc.nasa.gov/docs/QSM-class/Day%201-Cost/04a-Size.ppt.
(accessed October 20, 2010.)
Howison, J., Conklin, M., & Crowston, K. (2006). FLOSSmole: A
collaborative repository for FLOSS research data and analyses, International
Journal of Information Technology and Web Engineering, 1(3): 17–26.
Keynes, J. M. (1936), The General theory of Employment, Interest and
Money, Macmillan and Co., London.
Krishnamurthy, C. (2002), Cave or community? an empirical examination
of 100 mature open source projects, First Monday 7(6).
Kroes, N. et al. (eds) (2010), Truffle100: The top 100 European Software
vendors, www.truffle100.com, Truffle Capital
Kuan, J. (2001), Open source software as consumer integration into
production. working paper, available at: SSRN: http://ssrn.com/abstract=259648
or doi:10.2139/ssrn.259648 (accessed: 19 March, 2010)
Lakhani, K. and Wolf, R. G. (2005) Why hackers do what they do:
understanding motivation efforts in free/open source projects, In Hissam et al
(eds), Perspectives in Free and Open Source Software, MIT Press, Cambridge,
USA pp. 3-21.
Lanzi, D. (2005), Copyleft vs. Copyright: some competitive effects of
Open Source, Working Papers 541, Dipartimento Scienze Economiche,
Universita' di Bologna.
Lee, S. H. (1999) Open source software licensing. Working paper,
available at: http://cyber.law.harvard.edu/openlaw/gpl.pdf (accessed 1 May,
2010)
Lerner J., and Tirole, J. (2002) Some simple economics of open source.
Journal of Industrial Economics, 50(2): 197-234.
Lerner J., and Tirole, J. (2005) The scope of open source licensing,
Journal of Law, Economics and Organization, 21(1): 20-56.
Osterloh, M. et al. (2001), Open Source - New Rules in Software
Development. University of Zurich, working paper.
www.iou.uzh.ch/orga/downloads/opensourceaom.pdf (accessed: 16 April, 2010)
Maurer, S. and Scotchmer, S. (2006), Open Source Software: The New
Intellectual Property Paradigm, In T. Hendershott (ed.), Handbook on
Information Systems, Elsevier, Amsterdam
Miller, P. J., editor (1994), The Rational Expectations Revolution, The
MIT Press, Cambridge, London.
Moglen, E. (1999), Anarchism triumphant: Free software and the death of
copyright. First Monday, 4(8).
Moore, G. H. (1983), Business Cycles, Inflation and Forecasting,
Ballinger Publishing Company, Cambridge, USA.
Perens, B. (1999), The open source definition. In Di Bona, C. et al. (eds)
(1999), Voices from the Open Source Revolution, pp. 171-188, O’Reilly &
Associates, Sebastopol, CA.
Plosser, C. (1989), Understanding the Real Business Cycles, Journal of
Economic Perspectives 3(Summer): 51-78.
Raymond, E. S. (1998) The cathedral and the bazaar, First Monday, 3(3)
Reifer, D. J. (2004), Industry Software Cost, Quality and Productivity
Benchmarks, Reifer Consultants Inc.
http://www.compaid.com/caiinternet/ezine/Reifer-Benchmarks.pdf.
(accessed October 19, 2010)
Romer, D. (1996), Advanced Macroeconomics, McGraw-Hill Companies,
Inc., New York.
Rossi, M. A. (2005), Decoding the free/open source software puzzle: a
survey of theoretical and empirical contributions, In Bitzer, J. and Schröder, P.
(eds) (2006), The economics of open source software development, Elsevier,
Amsterdam.
Rötheli, T. F. (2006), Business Forecasting and the Development of
Business Cycle Theory. History of Political Economy, forthcoming. Working
paper, available at SSRN: http://ssrn.com/abstract=917121 (accessed: 7 May,
2010)
Sargent, T. J. and Wallace, N. (1975), Rational expectations, the Optimal
Monetary Instrument, and the Optimal Money Supply Rule, Journal of Political
Economy, 83(April 1975): 241-254.
Chordia, T. and Shivakumar, L. (2001), Earnings, Business Cycle and
Stock Returns.
von Hippel, E. and von Krogh, G. (2003) Open source software and the
"private-collective" innovation model: issues for organization science,
Organization Science, 14(2): 208-223.
Wooldridge, J. M. (2008, 4th ed.), Introductory Econometrics: A Modern
Approach, South-Western College Publishing, London.
Zimmermann, C. (2001), Forecasting with Real Business Cycle Models. Indian
Economic Review, 36(1)
Appendix
Appendix A – Table of countries weighted by the number of active developers
Country id  Country             Active developers  Weight
1           Australia                     7945.00  0.033534
2           Austria                       2549.00  0.010759
3           Belgium                       3034.00  0.012806
4           Canada                       11524.00  0.048640
6           Czech Republic                1443.00  0.006091
7           Denmark                       2314.00  0.009767
8           Finland                       1842.00  0.007775
9           France                       10987.00  0.046374
10          Germany                      24197.00  0.102130
15          Israel                        1467.00  0.006192
16          Italy                         6200.00  0.026169
17          Japan                         1357.00  0.005728
20          Mexico                        1401.00  0.005913
21          Netherlands                   6687.00  0.028224
22          New Zealand                   1635.00  0.006901
23          Norway                        1883.00  0.007948
24          Poland                        2520.00  0.010636
28          Spain                         4760.00  0.020091
29          Sweden                        4642.00  0.019593
30          Switzerland                   3033.00  0.012802
32          United Kingdom               14051.00  0.059306
33          United States               112981.00  0.476868
34          Brazil                        4038.00  0.017044
37          Russian Federation            3217.00  0.013578
38          South Africa                  1216.00  0.005132
Source: Author’s own calculation, Engelhardt & Freytag (2009)
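
The weights in Appendix A are consistent with each country's share of the total number of active developers across the 25 countries. The computation itself is not shown in this appendix, so the following Python sketch is only an illustration under that assumption. The names active_developers and developer_weights are hypothetical.

```python
# Active developer counts per country, copied from Appendix A.
active_developers = {
    "Australia": 7945, "Austria": 2549, "Belgium": 3034, "Canada": 11524,
    "Czech Republic": 1443, "Denmark": 2314, "Finland": 1842,
    "France": 10987, "Germany": 24197, "Israel": 1467, "Italy": 6200,
    "Japan": 1357, "Mexico": 1401, "Netherlands": 6687,
    "New Zealand": 1635, "Norway": 1883, "Poland": 2520, "Spain": 4760,
    "Sweden": 4642, "Switzerland": 3033, "United Kingdom": 14051,
    "United States": 112981, "Brazil": 4038, "Russian Federation": 3217,
    "South Africa": 1216,
}


def developer_weights(counts):
    """Return each country's share of the total developer count.

    Assumption: weight = country count / sum of all counts.
    """
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()}


weights = developer_weights(active_developers)

# Print the countries in descending order of weight, six decimals as in the table.
for country, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country:<20} {w:.6f}")
```

Under this assumption the shares sum to one by construction, which makes them usable directly as weights when aggregating country-level series.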
Appendix B – Table of calculations of the effects’ size
Forge        Average of the Weighted       Effect of a 0.001  Size of the Effect
             Activity Growth Coefficient                      in Project Numbers
Sourceforge  0.124752                      0.000125           37
Freshmeat    0.071226                      7.12E-05           21
Rubyforge    0.029117                      2.91E-05           9
Aggregate    0.062124                      6.21E-05           18
Source: Author’s own calculation
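
The first two numeric columns of Appendix B are related by a simple scaling: each value in the second column equals the corresponding value in the first column multiplied by 0.001, rounded to three significant digits. The Python sketch below reproduces only that step. The names coefficients, STEP and scaled_effects are hypothetical. The conversion to project numbers in the last column is not reproduced, because the base it is applied to is not stated in this appendix.

```python
# Values from the first numeric column of Appendix B.
coefficients = {
    "Sourceforge": 0.124752,
    "Freshmeat": 0.071226,
    "Rubyforge": 0.029117,
    "Aggregate": 0.062124,
}

# The 0.001 step named in the table header.
STEP = 0.001


def scaled_effects(coefs, step=STEP):
    """Scale each coefficient by the step size (assumed relation: coefficient * step)."""
    return {forge: c * step for forge, c in coefs.items()}


effects = scaled_effects(coefficients)

# Print with three significant digits, matching the precision used in the table.
for forge, effect in effects.items():
    print(f"{forge:<12} {effect:.2E}")
```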