Investigating the role of social media in open source software communities

Stream 10: Technology, eLearning and Virtual HRD
Submission type: Working Paper

Authors:
Zeta Dooly, Research Engineer/DBA Candidate, TSSG, Waterford Institute of Technology. Tel: +353(0)51 302943. Email: zdooly@tssg.org
Kenny Doyle, Research Engineer/PhD Candidate, TSSG, Waterford Institute of Technology. Tel: +353(0)51 302943. Email: kdoyle@tssg.org

Abstract

Since its advent in the 1980s, Open Source Software (OSS) has become widespread and is used by an ever increasing number of software users and developers, to the point that it has become a core component in many global technological architectures. The shift towards open source has had noticeable effects on organisational thinking and on how collective efforts are managed, organised and executed. Typically, open source software development teams have a common interest and develop software using information and communication technologies (ICT) as their primary mode of communication. The OSS can be distributed, viewed, used and modified within specific license constraints. These communities produce vast amounts of data, information and knowledge, which become publicly archived data stores, often leveraged by future OSS developers (Guzzi et al 2013, Wang et al 2013, Correa and Sureka 2013, Galster and Tofan 2013, Pagano and Maalej 2012, Yang et al 2013, Sharif 2012). Recent evidence shows that OSS communities worldwide have adopted social networking software and media as part of their overall ICT toolbox, establishing them as key communication channels for information sharing, project updating, motivating developers and crowdsourcing. However, how open source communities leverage social media, for example microblogging, is not yet well understood (Wang et al 2013).
This paper will look at OSS as an organisational form, paying particular attention to how developments in the use of social media have facilitated it, and how these virtual modes of communication have contributed to the apparent unbundling of some organisational practices.

Introduction

The motivation for this research stems from an interest in ongoing changes within the software development environment, where technological advancement has revolutionised the practices and behaviours of software developers. OS components form critical infrastructure in global technical architectures, yet the different practices in relation to communication, knowledge management and software development activity are not very well understood. Indeed, the emergence of virtual communities of practice (VCOP) and networks of practice (NOP) depends heavily on the existence of historic data in searchable formats to socialise new entrants and troubleshoot issues. With vast amounts of data generated by OSS communities, it is relevant to question the usefulness of social media activity and explore the trends for different communities.

Historically, structured software development methods such as waterfall formed the backbone of software development, applied with varying degrees of formality in a process-orientated fashion. Development has now moved beyond distributed software development in global offices through to hybrid development models outside the core business organisational structure. These new hybrid structures include volunteers and Open Source (OS) communities, as well as the integration of existing OS components into new products and services. There is much diversity in the products and services developed by OS communities, and it is clear that their working methods differ from those of traditional closed source practices.
One of the key differentials in OSS working methods is the temporal and spatial dispersion of the workforce; this is facilitated by the use of the internet in general and social media in particular for purposes of communication. OS products and services are developed by a diverse array of actors, including individuals working in their spare time as well as profit and non-profit orientated organisations.

What is Open Source?

The movement for Open Source software was spearheaded in the early 1980s by copyright campaigner Richard Stallman. Free (Libre) and Open Source Software (F/OSS, OSS, FLOSS) is software that is licensed to grant users the right to use, study, change, and improve its design through the availability of its source code. To clarify, even though closed source software (CSS) can be developed in a similar way to OSS, a differentiating factor is that the OSS development team is often both geographically and organisationally distributed; this is described as community based development (Joo et al 2012; Von Hippel and Von Krogh 2001). Prior to the open source movement, software was developed within organisations, and the programming source code, which was deemed to be intellectual property, was closely guarded by its proprietors. For this reason software was released as an executable file only and the source code was kept as a trade secret. The name open source is taken from the primary request of the original movement, namely that software packages released to the market should also release the source code so that programmers could modify it to suit their own needs and requirements. Some of the benefits of OSS include a rapid turnaround time for the resolution of bugs or problems with software, and flexibility for users to enhance the software and tailor it to their own specifications.
However, one of the main benefits cited during the first wave of open source software implementations was the cost; in contrast to closed source software, an initial up-front license payment was not required. As a result, the volume of research in this area grew rapidly, particularly research exploring the business and economic impact (Morgan and Finnegan, 2007). Early versions of OSS followed this pattern, where established software developers publicly released source code along with their programs and allowed users to modify and debug the existing program.

Evolution of Open Source

As the OSS movement matured, the parameters which defined it broadened to include development projects which were open source from the outset. One of the key drivers in this move was the widespread adoption of the internet, as it became far easier for communities to gather and marshal their resources in the one direction necessary for creating a purposeful and communal effort. While the original type of OSS was initially developed behind the closed doors of a software company and then released along with its source code, allowing users to modify it, newer OSS projects originated on the internet. They began in various chat rooms, message boards and forums where a project was outlined and technical contributions were sought from potential collaborators. The outputs of these projects were usually given away for free, and with the increasing prevalence of OSS came an ever expanding repository of software which was free and available to use as building blocks for other developments. These developments included other new OS projects, but crucially they also included software development projects which, although carried out by established organisations, integrated components from OS repositories into their new products.
How OS software is integrated as a component into core products is an interesting area; there needs to be confidence that the component adopted meets the overall product acceptance criteria in terms of various predefined elements such as reliability, quality, and trustworthiness.

Communication in Open Source Networks

As has been outlined above, OSS networks are spatially and temporally dispersed, with participants in the development process rarely meeting each other in a face to face context. This bears some resemblance to a typical globally dispersed software development team, but the complexity lies in the organisational management aspects of being part of an organisation and also part of a community of developers. OSS contributors are spatially dispersed in as far as they are often physically located in geographically different places, and they are temporally dispersed in as far as they rarely work on the OSS project at the same time as each other. Working practices in most traditional organisations mean that people mostly work in the same place at the same time: there are set business hours during which work is done, and it is done in a defined geographical location. Communication in such instances is face to face, with supporting documentation used to keep records. OS networks, which are by their nature dispersed across time zones, must rely on written communication between members. In the era before the widespread adoption of social networking sites, such networks predominantly used message boards, emails and newsletters to communicate. The advent of social networking, however, means that there is a range of real-time options now available for communicating in OSS networks, with the mode of communication being determined by a number of factors such as target audience, message complexity, and size.
Awareness of what other programmers are doing, access to knowledge, issue management and coordination of tasks are some of the scenarios where communication is critical. Yet in cases where the OSS project is designed in a modular fashion, people can take their section of the work and go it alone until they need assistance or have work to submit. Indeed it is where critical junctures arise that developers refer to online resources to overcome challenges (Dabbish et al 2012). Dabbish et al (2012) report that people generally seemed to work independently until certain events brought them together and made the dependency more salient, such as when a potentially problematic change would show up in the feed, or when a pull request would create problems for other aspects of the code.

Studies have highlighted the challenges that developers face while programming. In some cases there is a requirement for implementation specific detail, and while programming there is a need to know what others are doing in the same module. Version control tools such as CVS (Concurrent Versions System) and SVN (Subversion) can support code management, but design issues and architecture considerations are more complex to support. These activities are closely tied to innovation. There is sufficient evidence to show that access to online resources such as information repositories and communication tools can alleviate some of the issues (Correa & Sureka 2013, Dabbish et al 2013, Muller et al 2012). Studies referring to coordination tools are not common in the literature (Crowston, 2012). The coordination of team activities can be managed through issue management tools, but this risks the prioritisation of short-term fixes over system design. Singh et al 2011 (cited in Yang et al 2013) argue that allocating knowledge resources through digital social network techniques helps developers to design better software.
This research seeks further evidence to illustrate that social media communication can impact software development. Further, Yang et al (2013) contend that OSS project success is mainly dependent on its measurements and determinants, and that the impacts of digital communication are still fragmented. Media choice for the different types of communication activity can be a key factor in gaining the momentum required to ensure project success, increase productivity levels, or support innovation, which are often the metrics under scrutiny in such scenarios. It is clear from the literature that media choice is a determinant factor: its transparency can affect innovation (Dabbish 2012, Fiji and Saunders, 2011), interpretation of signals can affect decision-making (Tsay 2013) and it can help manage teams effectively (Figi and Saunders 2011). Figure 1 below illustrates the research problem and contextual environmental considerations, and alludes to possible solution areas.

Social media channels:
• Twitter
• Issue repositories
• LinkedIn
• Facebook
• Code websites, e.g. GitHub
• Wiki
• Forums/Stack Overflow

Communication activities:
• Community building
• Info/knowledge sharing
• Project management/allocation of tasks
• Innovation
• Issue management

OSS developer challenges (problem domain):
• Access to online information and resources while programming
• Specific implementation detail
• Team awareness of others’ programming activities

Emerging research:
• Plugins to existing IDE
• Aggregated SM platforms
• Integrated SM and development environments
• Traceability links for communications across platforms

Figure 1: Influencing factors in open source communication

The literature suggests that there has been a shift in the communication habits of OSS developer communities towards issue repositories, and that there is a need for improved communication tools, as project developers often have difficulty maintaining awareness of their peers’ developments (Guzzi et al 2013 and Tsay et al 2013).
Automatically recovering traceability links among communication repositories would free developers from the difficult task of recovering scattered traces of historic communication, and would help researchers by giving them a more complete picture of the development process. Thus, it might follow that tools for maintaining awareness could improve developers’ productivity. Figure 1 above shows the inputs and challenges of this research problem, along with some of the solution areas proposed in the literature.

There are a number of different social media platforms which are used by OSS developers, and the choice of which one is used, and when, is mostly determined by function. Repositories such as GitHub allow projects to be modified and shared by a large community of developers; this resource would therefore typically be used at the development and coding phase. At other phases of an OSS project there may be a need for community building and publicity drives which aim to create awareness of the project in order to attract developers and users. At this stage social networking sites (SNS) such as Twitter and Facebook are used as a means of publicizing the OSS project. Other, non-SNS modes of communication such as electronic mailing lists are also used as a means of keeping developers informed of developments and updates in the overall project, and message board style methods of communication are also used for this purpose. In a traditional style organization, the defined structures of collaboration and communication, and the norms associated with working within an organization, ensure that developers are well informed and are aware of the status and priorities of the project. In the absence of these organizational structures, there need to be other methods of bundling together the myriad strands of communication which are typical in an OSS project.
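One way to picture the recovery of traceability links across channels is the minimal Python sketch below. It is an illustration under stated assumptions rather than a method taken from the literature reviewed here: the `Message` record, the channel names and the convention that issues are referenced as `#<number>` are all hypothetical, and real projects would need project-specific patterns.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record type: the paper does not define a message schema.
@dataclass(frozen=True)
class Message:
    channel: str  # e.g. "twitter", "mailing-list", "forum"
    author: str
    text: str


# Assumption: issues are referenced as "#<number>". This is a common
# convention but not a universal one, and numeric hashtags would also match.
ISSUE_REF = re.compile(r"#(\d+)\b")


def link_by_issue(messages):
    """Group messages from any channel under the issue numbers they mention."""
    links = defaultdict(list)
    for msg in messages:
        # A set avoids linking a message twice if it repeats a reference.
        issue_ids = {int(n) for n in ISSUE_REF.findall(msg.text)}
        for issue_id in sorted(issue_ids):
            links[issue_id].append(msg)
    return dict(links)


def channels_per_issue(links):
    """Summarise which channels discussed each issue."""
    return {
        issue: sorted({m.channel for m in msgs})
        for issue, msgs in links.items()
    }
```

Given messages gathered from several platforms, `link_by_issue` returns a dictionary keyed by issue number, and `channels_per_issue` shows at a glance which strands of communication touched each issue. Matching on explicit references is the simplest possible linking rule; it would miss discussions that never name the issue.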
One of the key differences between an organizationally run project and an OSS project is that of organizational social norms. Norms developed around wage labour relations mean that there are a number of highly codified and regulated standards which must be adhered to by all members of a development team. These standards include those relating to working hours, rates of payment, management structures, work standards and relations of authority, to name but a few. These standards are mostly absent in OSS projects because they are executed outside of institutional or organizational structures and often do not involve the exchange of money for labour. There is, however, some evidence to suggest that the volunteer status of OSS developers is changing, as OS companies such as RedHat emerge in the market place and other large industry players regularly participate in OSS development activities. These factors combine to make OSS projects potentially amorphous and lacking in definitive structures of authority, as they are geographically dispersed, self-organizing communities. In practice, however, this form of organisation in OSS communities is uncommon and is rarely associated with successful projects. Instead, OSS projects usually have leaders who emerge according to a number of factors, including their level of interaction with the project and how successful their work has been. In other instances, leaders are simply the originators of the project. The important factor to be considered is that leaderless, self-organising projects are less likely to succeed, and effective leadership, however it emerges and whatever form it takes, is a prerequisite for successful projects. The role of social media is crucial as a communicative enabler in the emergence of leadership and ideas, and in the ‘training’ and motivation of members.
The following section will describe how communities of practice, which are typically found in structured organisations, evolve in OSS communities into networks of practice which utilise social media and other forms of networked communications as a means of organising.

Theoretical grounding

Social software is used in collaborative work environments (CWEs), and open source software developers are one of the user groups. CWEs are software applications and platforms that facilitate diverse interactions between users, machines and collaborative services. Socialisation in organisations has been researched for many decades (Van Maanen and Schein, 1979) and can be applied to the OSS community to further our understanding of OSS development and practices among contributors. Crowston et al. (2012) argue that researchers need to draw on theoretical foundations that have been utilized in prior research on social interaction and software development, as well as other theoretical bases that are relevant to the OSS phenomenon, to develop a more theoretically grounded understanding of OSS development. What is undoubtedly of interest are the means by which members are socialized into the OSS community; the comparison between traditional forms of organization and OSS communities throws up a number of interesting questions.

In typical organisational forms, knowledge is transmitted via communities of practice. A community of practice ‘binds together a group of people who share a concern, a set of problems, an expertise or a passion about a topic’ (Wenger et al. 2002). Communities of practice can also be based on a common goal. They can be formal or informal, and are crucial within organisations and communities for the transmission of tacit knowledge, the socialisation of new members into work practices, and getting things done effectively and efficiently. This type of knowledge transmission is referred to by Lave and Wenger as ‘situated learning’.
Formal COPs can be seen in working groups within larger organisations; informal COPs can be as simple as a group of people within an organisation who have lunch together every day. Thus while some COPs are institutionally sanctioned and organised, others are informal and operate on an interpersonal basis outside of official institutional organisation. Similarly, some COPs can be spontaneous and originate organically, while others are purposefully created with a goal or set of goals in mind. The COP is defined by its structuring characteristics, which relate to decisions about the specific roles which are to be played and who is to play them. COPs are also defined within organisations as being facilitators of cross departmental knowledge sharing and creation.

The contemporary organisational paradigm of flexibilisation, combined with advances in information and communications technology (ICT), means that communities of practice are no longer dependent on geographical co-location of participants. Thus virtual communities of practice (VCOPs), which are geographically dispersed and operate via the medium of digital communications, have become increasingly common. OSS communities are more similar to networks of practice (NOP) than they are to VCOPs. A network of practice can vary in size from the small to the very large, with limited if any barriers to entry. It usually does not have formalised control structures, and membership of the network can be high in terms of numbers, with most members being unknown to each other. Networks of practice also differ from COPs with respect to the methods of communication employed; in networks, communication happens via indirect means such as message boards and other virtual media in a one to many fashion. Networks of practice tend to be large and sprawling in scale, and because of both this and the geographic dispersal of members there is no need for complex socialisation to the network.
Communities of practice are often structured in terms of a mentor/mentee relationship, where an experienced member will socialise new members via training and integrate them into the formal and informal aspects of the community. In the virtual community of practice this embodied mentor can arguably be seen to have been replaced by virtual records of conversations and interactions. In a virtual community of practice there must be more of a reliance on what Lave and Wenger (1991) call ‘legitimate peripheral participation’. The repositories of historical communications between community members can serve as a resource which new users can use to familiarise themselves with the rules, methods, and standards of participation. Thus there is a strong element of self-directed learning in VCOPs as well as in NOPs, as the mentoring relationship is not a social person to person interaction and instead is an individualised process of self-learning using community resources.

With this in mind, however, it is worth noting that it is seen to be good practice in OSS communities to be forgiving with people who do not immediately conform to given standards. This is because the OSS norm is that contributors take part in a communitarian capacity, as people working towards a shared goal, and do so outside of typical market or wage relationships. Where no one is receiving payment or another form of gratuity in return for their work, it is necessary to instil a collegiate atmosphere of cooperation and appreciation for work done. However, there are complexities emerging with this assumption: as the environment changes, many organisations are paying their employees to contribute to OSS development, and it cannot be assumed that OSS contributors are non-rewarded volunteers.
Large scale actions within networks of practice are usually taken on in a modular fashion, with workloads broken down into smaller parts which can be combined and reconfigured with other parts to make larger works possible. Thus networks of practice are best suited to the development of modular systems. These systems can be designed by groups, but for ease of development it is preferable for there to be a decision maker or project owner who can decide which aspects of the design are to be integrated and which are to be discarded. Having considered the environment in which open source communities utilise social software within their NOP, it is also relevant to note that some authors propose that the emergence of new domains such as virtual societies is insufficiently addressed in organizational and management theory, and suggest it is in need of theorizing (Corley and Gioia, 2011).

Possible use of this research/Limitations

From a practitioner’s perspective, this research could give package managers an insight into how OSS developers use social media based ICT tools to highlight updates and changes, prioritise effort allocation and propose design suggestions to the community. The research could affect community behaviours, collaboration and productivity, and could extend existing theoretical frameworks on media choice and organizational/community practices. It could create knowledge that may contribute toward future collaborative software development environments, and could provide evidence that social media (SM) contributes to productivity in OSS development. Black et al. (2010) argue that social media provides an outlet for communication in relation to new ideas, specifications, code design and related issues in global software development. To investigate this specifically within an OSS community environment would help to further our understanding of how social media supports OS software system development activities.
This paper has conducted a literature review, aiming to distil the current state of the art; however, its limitations in relation to data collection and analysis mean that the study is incomplete and further stages of research are required. The researchers have collected some data from Twitter representing the OSS community and plan to analyse this data over the coming periods.

Conclusions

Large scale open source software projects adopt communication software within their overall technology management toolkit to address the particular communication challenges that they encounter. These challenges relate to real-time data access, task/team awareness and implementation detail; currently, platforms to support these are fragmented. In parallel, organisational practices are challenged to address the complexity of NOP and communities of developers. Social media channels are widely used in these communities; target audiences, message complexity, message size and team climate affect the media choice and potential impact. Further research into social media use within open source communities can extend our understanding of the impact social media has on OSS development.

References

Black, S., Harrison, R. and Baldwin, M. (2010) 'A survey of social media use in software systems development', Proceedings of the 1st Workshop on Web 2.0 for Software Engineering, ACM.

Corley, K. G. and Gioia, D. A. (2011) 'Building theory about theory building: what constitutes a theoretical contribution?', Academy of Management Review, Vol. 36, No. 1, pp. 12-32.

Correa, D. and Sureka, A. (2013) 'Fit or unfit: analysis and prediction of closed questions on stack overflow', Proceedings of the first ACM conference on Online social networks.

Crowston, K., Wei, K., Howison, J. and Wiggins, A. (2012) 'Free/Libre open-source software development', ACM Computing Surveys, Vol. 44, No. 2, pp. 1-35.

Galster, M. and Tofan, D. (2013) 'Exploring possibilities to analyse microblogs for dependability information in variability-intensive open source software systems', Software Reliability Engineering Workshops (ISSREW), 2013 IEEE International Symposium on.

Giuffrida and Dittrich (2013) 'Empirical studies on the use of social software in global software development – A systematic mapping study', Information and Software Technology, Vol. 55, No. 7, pp. 1143-1164.

Guzzi, A., Bacchelli, A., Lanza, M., Pinzger, M. and Deursen, A. v. (2013) 'Communication in open source software development mailing lists', Proceedings of the Tenth International Workshop on Mining Software Repositories.

Kaplan, M. and Haenlein, M. (2010) 'Users of the world, unite! The challenges and opportunities of Social Media', Business Horizons, Vol. 53, No. 1.

Lave, J. and Wenger, E. (1991) Situated learning: Legitimate peripheral participation, Cambridge University Press.

Morgan, L. and Finnegan, P. (2007) 'How perceptions of open source software influence adoption: an exploratory study'.

Muller, M., Ehrlich, K., Matthews, T., Perer, A., Ronen, I. and Guy, I. (2012) 'Diversity among enterprise online communities: collaborating, teaming, and innovating through social media', Proceedings of the 2012 ACM annual conference on Human Factors in Computing Systems, ACM.

Pagano, D. and Maalej, W. (2012) 'How do open source communities blog?', Empirical Software Engineering, Springer.

Sharif, K. Y. (2012) Open source programmers’ information seeking, unpublished thesis (PhD), Limerick University.

Skeels, M. M. and Grudin, J. (2009) 'When social networks cross boundaries: a case study of workplace use of facebook and LinkedIn', Proceedings of the ACM 2009 international conference on Supporting group work.

Tsay, J., Dabbish, L. and Herbsleb, J. D. (2013) 'Social media in transparent work environments', Cooperative and Human Aspects of Software Engineering (CHASE), 2013 6th International Workshop on, May 2013.

Van Maanen, J. and Schein, E. H. (1979) Toward a theory of organizational socialization, Citeseer.

Wang, X., Kuzmickaja, I., Stol, K., Abrahamsson, P. and Fitzgerald, B. (2013) 'Microblogging in Open Source Software Development: The Case of Drupal Using Twitter', IEEE Software.

Yang, X., Hu, D. and Robert, D. M. (2013) 'How Microblogging Networks Affect Project Success of Open Source Software Development', System Sciences (HICSS), 2013 46th Hawaii International Conference on.