Open-source technologies for realizing social networks: a multiple descriptive case study

Jose Teixeira
School of Economics at University of Turku
Jose.Teixeira@utu.fi

Abstract. This article describes the role of the open-source software phenomenon within high-tech corporations providing social networks and applications. Taking a multiple case study approach, I address which open-source software technological components are embedded by leading social networking players, and provide a rich description of how those players collaborate with the open-source community. The findings, based on a population of three commercial providers of social networks and applications, suggest that open-source plays an important role in the technological development of their social networking platforms. An open-source technological stack for realizing social networks is proposed, and several managerial issues dealing with collaboration with open-source communities are explored.

Keywords: open-source, social networks, entrepreneurship, facebook, spotify, netlog

1 Introduction

This article develops a deeper understanding of how providers of popular social networking Internet sites employ open-source technologies, which are freely available on the Internet and within the public domain, in the inner technological operations that realize social network services targeting a global community of Internet users.

Online social networks are in vogue these days. Facebook is the prime example: it is currently the biggest social network within our WEIRD society (Western, Educated, Industrialized, Rich, and Democratic). Probably benefiting from being a USA- and California-based company, Facebook is also the online social network that has captured the most attention from the media; there is even a Hollywood movie, “The Social Network”, dedicated to it.
Practitioners, such as corporate brand marketers, quickly discovered that targeting advertising based on personal profiles and the corresponding social graph works quite well. In the last two years, marketing communication budgets have flowed from traditional media, Internet portals and search engines to social networks, as claimed by industrial market research players such as comScore, Forrester and Kenshoo (Steel 2011; Kenshoo 2012).

Within academia, even though studies on social networks have been conducted in fields like sociology and anthropology for decades (Oinas-kukkonen et al. 2010), only more recently have they captured massive attention from computer scientists and information systems researchers. Prominent examples include the work of Horowitz and Kamvar (2010), drafting the anatomy of a large-scale social search engine; Mislove et al. (2007), who performed a large-scale measurement study and analysis of the structure of multiple online social networks; Putzke et al. (2010), who studied social behaviours in online game environments; and finally Agarwal et al. (2008), who addressed the design and use of information technology within social contexts and their impact on organizations.

In this paper, I cross the social networking phenomenon with the open-source phenomenon by assessing how social networking providers employ open-source technological components in their in-house software development. The open-source phenomenon has also gathered extensive research attention in the last decades across many disciplines. I would like to highlight the works of Stallman (1993) and Raymond (2001) on conceptualizing and coining the open-source phenomenon from a computer-science perspective, together with the work of Lerner and Tirole (2005), who applied different economic perspectives to it.
In this research, I engaged closely with three different social networking operators, assessing what role the open-source software phenomenon plays as an enabler of social networks and corresponding applications. The significant implications are mostly empirical and can be addressed by practitioners, such as technology developers wishing to integrate social networking capabilities in their products and services.

Having introduced the topic and its empirical relevance, the article continues by outlining key contributions bridging the open-source and social networking phenomena, and areas for further development that this research seeks to address. Afterwards, I proceed to an in-depth multiple descriptive case study conducted with three providers of popular social networking websites. In the last sections, I discuss possible contributions triggered by continuous reasoning on the phenomena being studied, with the support of a comprehensive analysis of the collected data.

2 Literature review

The existence of recent literature reviews on social networks and applications across different disciplines such as entrepreneurship (Hoang & Antoncic 2003), marketing (Cooke & Buckley 2008), computer science (Mislove et al. 2007) and information systems (Parameswaran & Whinston 2007; Oinas-kukkonen 2010) facilitated the process of identifying relevant literature that guided this research. By reviewing existing literature bridging the social networking and the open-source phenomena, I identified two research streams.

A first stream of research addresses the topology of networks of open-source developers. Valverde and Solé (2007) suggest that the overall goals of the open-source community and its underlying hierarchy shape the open-source community network dynamics. Using the sourceforge.net open-source network, Madey et al. (2002) identified interesting characteristics of the open-source community social graph, such as the existence of preferential attachment for new nodes. Xu et al.
(2005), using larger data sets from sourceforge.net, claim certain topological properties that may potentially explain the success and efficiency of OSS development practices.

The second research stream, rather than looking at the structure of open-source social networks, addresses social aspects such as communication, socialization and motivation within open-source social networks. Ethnographic methods were employed by Ducheneaut (2005) for describing socialization processes in the open-source community developing the Python programming language. Barcellini et al. (2008) performed a socio-cognitive analysis of online design discussions in an open-source software community by analyzing the same Python open-source community. By looking at bug-fixing tasks, Crowston and Howison (2005) found that open-source networks tend towards more decentralized communication patterns than traditional, highly centralized corporate software projects.

In both streams of research, the researchers point their lenses at social networks of open-source software developers. In this paper, however, I turn the lenses to a completely different perspective. I look at organizations developing digital technology that realizes social networks and applications, describing how they use and benefit from public domain software artefacts developed by the open-source community. From this perspective, little knowledge seems to exist; I did not find published research addressing how organizations providing social networks and applications integrate open-source technological components within their research and development (R&D) operations. Still, I believe that it matters to both academia and practitioners to know to what extent those young and innovative organizations use open-source software, why and how.
In the succeeding sections, I report a rich qualitative description of how three leading players in the social networking industry use open-source technologies for realizing social networks and applications.

3 Methodology

The research question guiding the preliminary research efforts was: “what role the open-source software phenomenon plays as an enabler of the social networks and correspondent applications”. In this paper I address first, which open-source software technological components are embedded by social networking players; and second, how those players collaborate with the open-source community.

Regarding the first research question, I simply report which technological components were taken from the open-source community and integrated by the investigated social networking players. I paid attention to what kind of technological components are integrated, their functional purpose and their legal software distribution licensing. Addressing the second research question, I investigated the interactions between the studied social networking players and the open-source community, while assessing the motivations and pursued competitive advantages driving the collaboration. Some emphasis was placed on assessing how social networking players work upstream, meaning how they contribute back to the open-source community.

These research efforts took the form of a multiple descriptive case study in the mould of Eisenhardt (1989), Miles and Huberman (1994) and Yin (2002). As author, I had little or no control over networked behavioural events within the social networking and open-source phenomena being studied. It is important to note that my research efforts point the lenses at a contemporary phenomenon in a real-life context, where the boundaries between the phenomenon being studied and its context are not obvious.

In Table 1, I present the three units of analysis of this multiple descriptive case study.
By interviewing staff from those three social networking providers, I searched for consistent patterns of evidence across the three units, each taking a recognized role within the same phenomenon being studied. It is worth mentioning that, as evidenced by technology review periodicals such as Byte and Techcrunch.com, those three organizations are great examples of startups that registered exponential growth over a short period of time.

Organization  Description                                            Country
Facebook      Biggest and most studied social network                USA
Spotify       The leading peer-assisted music streaming system       Sweden
Netlog        One of the most global social networks for the youth   Belgium

Table 1: The multiple case-study organizational unit of analysis

Although this research was guided by the case-study process proposed by Eisenhardt (1989), for this paper I simply and modestly aim at providing a rich description of the observed phenomenon. Also methodologically inspired by Dyer and Wilkins (1991), I seek to provide a good and rich phenomenological description, emphasizing contemporary relevance over rigor. Therefore, this paper refrains from any generalization reasoning, and rather invites readers to address it thereafter. In the following sub-sections, I provide more detail on methodological issues embedded in the design and execution of this research.

3.1 Preparation and pilot study

This research was partially driven by an event organized by the Canada-Norway partnership program in higher education (CANOE) and hosted by the Networks and Distributed Systems Group from the University of Oslo between 22 and 26 August 2011 in Sundvolden, Norway. This event was a rare opportunity for researchers with interests in social networking topics to meet industry practitioners from major providers of social networks and services.
In an attempt to make the most of the previously mentioned event, a questionnaire was developed for guiding semi-structured interviews with practitioners from the emergent social networking industry. One pilot case study was conducted locally, with a Finnish social network provider with expertise in video broadcasting. The pilot study confirmed the relevance of studying open-source technologies realizing social networks and led to a more focused questionnaire for further developments.

3.2 Fieldwork strategies

Even though this research directly addressed technology developers from three organizations offering social networking services, it is very important to note that we are dealing with complex inter-organizational platforms rather than simpler products developed by a single organization. For instance, from a very early stage, Facebook publicly exposed on the Internet an open and well-documented Application Programming Interface (API) that allowed any third-party software developer to access the Facebook social graph and develop the so-called Facebook apps. Today many third-party organizations operate by complementing the Facebook core platform with complementary products and services that, under network effects, add value both to Facebook and its users.

Both the case study protocol as described by Yin (2002) and phenomenological interviewing by Thompson et al. (1989) guided the author's semi-structured interviews during the fieldwork phase of the study. Thompson et al. (1989) argue that phenomenological interviews are “the most powerful means of attaining an in-depth understanding of another person's experiences” (1989: 138). Individuals from the organizations providing social networks and applications, with computer engineering backgrounds and system development responsibilities, were interviewed in a very informal setting. A total of five interviews were conducted by the author.
In most cases the interviews took more than one hour; I perceived that the interviewees took them as both interesting and informal conversations. I started by seeking information on which open-source technologies the organizations use for realizing social networks and applications, which software artifacts and for what purpose; and also how each organization deals with version control, bug fixing, security and software licensing issues. At a later stage, I addressed the interactions of the organizations with the open-source community: how collaboration takes place, the existence of contracts or any other legal agreements between parties, how technology support takes place, and whether the organizations were contributing back to the open-source community. Often the conversation reached other emergent topics such as cloud computing and development on mobile technology.

During the interviews, small pauses were requested by the interviewer to transcribe important parts of the conversation. After each interview, the author rapidly produced several textual notes capturing information he considered relevant. In two cases, the interviewees shared complementary documentation with the author. In the following section, I describe how all this collected information was then digitized, classified and carefully analyzed.

4 Data analysis

Demonstrating rigor through a careful and comprehensive articulation of data analysis is a critical issue in improving the robustness of qualitative research. The qualitative inquiry presented by Eisenhardt (1989) and Miles and Huberman (1994) guided the data categorization and analysis within this research. Popular and widely available software tools facilitated the data categorization: a text editor, a spreadsheet processor and some mind-mapping software were used. Different theories and empirical perspectives were applied to the collected data.
As this is a new phenomenon not previously covered by the literature, the data neither match nor falsify existing theories in open-source software research. The author, due to his educational background, employed mostly the computer science and information systems perspectives to reason from data. However, the collected data might have value for other research communities. The outcomes of this analytical process are developed in the following findings section. Findings addressing the first research question required little analysis and can be considered findings from the interviewees compiled by this paper's author. The second research question was more challenging, requiring considerable effort both while interviewing and while performing a comprehensive data analysis.

5 Findings

Resulting from informal technical discussions with the interviewees and directly addressing the first research question, the following Table 2 presents a stack of open-source technological components used by Facebook, Spotify and Netlog. The different open-source projects providing software artifacts integrated by the studied organizations are grouped and presented by functional characteristics. Due to informal non-disclosure agreements with the interviewees, I do not reveal which technologies are used specifically by each organization, but only by the overall set of three organizations.
Technological function              Integrated open-source software packages
Client-side programming languages   C, C++, Java
Server-side programming languages   Python, Java, Scala, Ruby, PHP
Database/Persistence                Mysql, ext3 file-system
Server operating system             GNU Linux kernel
Web server                          Apache, nginx, php-fpm, HipHop
Load balancer                       haproxy
Object cache                        jemalloc, memcached
Search and indexing                 ubersearch, unicorn, sphinxsearch
Configuration management            Puppetlabs
Process orchestration               cron, gearman
Network monitoring                  Zabbix
Backup systems                      Bacula
Version control                     CVS, SVN, GIT
Statistics/BI/DW                    hadoop, hbase, HIVE, Sqlite
Testing                             phpunit, seleniumhq, jenkins-ci

Table 2: Technological stack realizing social networks

Addressing the second research question, even though the collected data was consistent with existing knowledge, I could observe some unexpected findings evidenced by patterns in the collected multi-organizational data. A rich set of descriptive data was obtained thanks to the extremely collaborative attitude of the interviewees; this led to a considerable number of issues that can be further explored. Below I report three descriptive findings with the potential to raise debate among this paper's readership.

First, the satisfaction of the studied organizations with open-source technologies seems quite high, especially among the R&D teams. It is important to note that the studied organizations attained extremely high growth since their start-up times, often resulting in mutating ownership structures. It was observable that some of those organizations' ownership and governance changes led to pressure on the R&D staff to move from open-source software to proprietary technology from well-known traditional software houses. This has the potential to create issues between R&D teams and the organization's leadership. Many vendors of proprietary technology do not seek sales per se, but rather to associate their technologies with the brands of the studied organizations.
“we been told several times to embrace cloud-computing technologies from a particular vendor, we tried and failed several times” … “Many proprietary, expensive and complex solutions are designed as if one would fit all” … “Vendors are focused in attracting user base over our specific needs”

Second, collaboration with the open-source communities seems to take place more at a personal level than at an institutional level. As reported by one of the interviewees, the support provided by the open-source community is more ad hoc, and solutions to problems become available earlier. The procedure seems unorganized and chaotic, but the interviewee claimed that it works better than the organized technological support from big companies, where whoever tries to help often does not know much about the organization in general or about the developer looking for help in particular.

“we have very good contacts with the open-source community, this enable us to fix complex problems just by chatting with key developers of the project” … “In our experience in dealing with cloud computing vendors, bug reporting was tedious, passing over slow and complex processes, often resulting in nothing” … “we went back to control our own servers because unstable infrastructure often froze our operations”

Finally, from an entrepreneurship perspective, open-source was present from the beginning of the organizations' ventures. Some of the founders had software development skills and pushed for open-source software development in-house, not just because it provides low entry barriers and agility to the organization, but also as a cultural aspect. Some of the interviewees expressed their appreciation for open-source hacker culture, public domain ethics and the pursuit of meritocracy.

“We use a lot of open-source stuff.
That's what made sense” … “We never got together and discuss about open-source vs proprietary, it just came naturally” … “startups need to get used to the idea of rapid-prototyping cycles … open-source software development tools are friendly for rapid interactions”.

Below I discuss the implications of the previously reported findings, encompassing a set of open-source technological components and three descriptions regarding the collaboration of the social networking industry with the open-source community.

6 Discussion

This research clearly distinguishes itself from previous studies on how organizations use open-source software. First, because the research concentrates its lenses on high-tech organizations widely recognized as global and innovative. Second, because the studied phenomena deal with complex inter-organizational platforms rather than products developed by a single organization. The social networking platforms offered by the studied organizations are complemented by third-party components that, under network effects, add value both to the core social networking vendors and to their users (Shapiro and Varian 1999).

6.1 Theoretical implications

The theory-testing approach did not falsify any open-source theoretical proposition referred to in the literature review. The consensus with the established body of theoretical knowledge can be explained by the novelty of the phenomenon being studied. Moreover, as inspired by Dyer and Wilkins (1991), I focus more on providing a good description of the phenomena being studied, leaving space for refined theoretical contributions.

6.2 Practical implications

From the practical point of view, players or would-be players in the social networking industry can benefit from the suggested technological stack realizing social networks and applications.
More general adopters of different software technologies now have a reference on what kind of technologies high-profile organizations in the social networking arena are exploiting from the open-source community. Moreover, this limited but in-depth description raises managerial awareness of issues that might arise when collaborating with the open-source community.

6.3 Policy and support implications

I would conclude by emphasizing that open-source plays a very important role in innovation. In the cases studied, it lowered the entry barriers for start-ups that are now recognized as global and innovative. Government initiatives should avoid protectionist actions aimed at securing revenues for the established traditional engineering houses. The role of open-source in fostering entrepreneurial ventures is per se a strong argument for governments to protect, and even stimulate, the phenomenon.

6.4 Limitations of the study

The limitations of the sample do not allow me to make any substantial assertions, but these initial findings certainly point to the value of examining this unexplored issue further. Moreover, certain personal biases may be present, as I am a sole researcher already familiar with the open-source phenomenon, even though several methodological design issues were considered to minimize them.

6.5 Areas for future research

As a researcher with a strong computer science and information systems background, I did not apply enough perspectives in the analysis. It matters to apply other theoretical lenses covering fields such as marketing, entrepreneurship and social science disciplines that have already dealt with social networks for decades. This will require collaboration with researchers who work outside my comfort zone. This research was built on a comprehensive data set, of which some findings have been reported in this paper.
I believe that, both by applying more effort to the analysis of data and by triangulating findings with other research efforts, more empirical contributions could come: not in the consensual rich description provided here, but by challenging existing knowledge on how organizations use open-source software.

7 Conclusions

In our sample, the satisfaction of social networking technology developers with the open-source phenomenon is extremely high. The use of open-source technological components started from the beginning, as early as the company founders developed their first software pieces. After the organizational startup phase, the use of open-source software remains strong. For the future, and according to the interviewees, all sample organizations intend to continue using open-source software, as well as to collaborate with and contribute back to the open-source community.

This research contributes a technological stack realizing social networks and applications, as proposed by the sample organizations. In addition, and perhaps more likely to foster future research, I provide a simple and rich description of how three popular and innovative organizations integrate technological components from the open-source community into their social networking platforms.

Acknowledgment

I would like to thank the interviewees, who work on realizing social networks, for their intense and enthusiastic collaboration that made this research possible. Moreover, I would like to thank Reima Suomi from Turku School of Economics for early comments on this paper.

References

Avison, D., 1998. Rigour in Action Research: Some Observations and a Plea. Scandinavian Journal of Information Systems, 10(1&2), pp.119-124.
Agarwal, R., Gupta, A.K. & Kraut, R., 2008. The Interplay between Digital and Social Networks. Information Systems Research, 19(3), pp.243–252.
Anon, Facebook’s Uphill Battle for Big-Brand Advertisers - WSJ.com.
Available at: http://online.wsj.com/article/SB10001424052970204294504576613232804554362.html [Accessed March 14, 2012].
Barcellini, F. et al., 2008. A socio-cognitive analysis of online design discussions in Open Source Software community. Interacting with Computers, 20(1), pp.141-165.
Cooke, M. & Buckley, N., 2008. Web 2.0, social networks and the future of market research. International Journal of Market Research, 50(2), pp.267-292.
Crowston, K. & Howison, J., 2005. The social structure of free and open source software development. First Monday, 10(2).
Ducheneaut, N., 2005. Socialization in an Open Source Software Community: A Socio-Technical Analysis. Comput. Supported Coop. Work, 14(4), pp.323–368.
Dyer, W.G. & Wilkins, A.L., 1991. Better Stories, Not Better Constructs, to Generate Better Theory: A Rejoinder to Eisenhardt. The Academy of Management Review, 16(3), pp.613-619.
Eisenhardt, K.M., 1989. Building Theories from Case Study Research. The Academy of Management Review, 14(4), pp.532-550.
Hoang, H. & Antoncic, B., 2003. Network-based research in entrepreneurship: A critical review. Journal of Business Venturing, 18(2), pp.165-187.
Horowitz, D. & Kamvar, S.D., 2010. The anatomy of a large-scale social search engine. In Proceedings of the 19th international conference on World wide web. Raleigh, North Carolina, USA: ACM, pp. 431-440.
Xu, J. et al., 2005. A Topological Analysis of the Open Souce Software Development Community. In Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS ’05). p. 198a.
Kenshoo, 2012. Kenshoo Press Releases: Online Marketing News. Available at: http://www.kenshoo.com/Facebook_Advertising_Budget_Growth_Media_Alert/ [Accessed March 14, 2012].
Lerner, J. & Tirole, J., 2005. The Economics of Technology Sharing: Open Source and Beyond. Journal of Economic Perspectives, 19(2), pp.99-120.
Madey, G., Freeh, V. & Tynan, R., 2002. The open source software development phenomenon: An analysis based on social network theory. In Americas Conference on Information Systems AMCIS2002. Citeseer, pp. 1806–1813. Available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.138.1547&rep=rep1&type=pdf.
Miles, M.B. & Huberman, A.M., 1994. Qualitative data analysis: an expanded sourcebook, Sage Publications.
Mislove, A. et al., 2007. Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement. San Diego, California, USA: ACM, pp. 29-42.
Oinas-kukkonen, H., 2010. Social Networks and Information Systems: Ongoing and Future Research Streams. Journal of the Association for Information Systems, 11(2), pp.61–68.
Parameswaran, M. & Whinston, A.B., 2007. Research Issues in Social Computing. Journal of AIS, 8, pp.336–350.
Putzke, J. et al., 2010. The Evolution of Interaction Networks in Massively Multiplayer Online Games. Journal of the Association for Information Systems, 11(2), pp.69–94.
Raymond, E.S., 2001. The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary, O’Reilly & Associates, Inc.
Shapiro, C. & Varian, H.R., 1999. Information rules, Harvard Business School Press Boston, Mass.
Stallman, R., 1993. The GNU Manifesto - GNU Project - Free Software Foundation (FSF). Available at: http://www.gnu.org/gnu/manifesto.html [Accessed March 14, 2012].
Steel, E., 2011. Facebook’s Uphill Battle for Big-Brand Advertisers. Available at: http://online.wsj.com/article/SB10001424052970204294504576613232804554362.html [Accessed March 14, 2012].
Thompson, C.J., Locander, W.B. & Pollio, H.R., 1989. Putting Consumer Experience Back into Consumer Research: The Philosophy and Method of Existential-Phenomenology. Journal of Consumer Research, 16(2), pp.133-146.
Valverde, S. & Solé, R.V., 2007.
Self-organization versus hierarchy in open-source social networks. Phys. Rev. E, 76(4), p.046118.
Yin, R., 2002. Case Study Research: Design and Methods, Sage Publications.