1. Introduction
Social coding (Trockman, Zhou, Kästner, & Vasilescu, 2018) is a modern approach to software development that emphasizes formal and informal collaboration. With the emergence of social coding platforms (SCP, for short) such as GitHub (Github, Inc., 2008) and Bitbucket (Atlassian, Inc., 2008), social coding is now widely adopted not only in open-source projects but also in commercial software development.
In the SCP, software is developed and maintained collaboratively by a community of developers and users. The SCP provides a space where people can create projects, communicate, experiment, and build whatever they want. Moreover, any developer can freely modify the code by forking the project. The developer can then ask the project owner to merge the modification into the original code. If the owner finds it useful, the owner commits the merge, which creates a new version of the software. In GitHub, these three activities are implemented as fork, pull request, and merge, respectively.
From the perspective of software evolution (Lehman, 1980), this process of code modification is distinctive. Traditionally, when users or customers request new features or better solutions, a project manager designs tasks and directs developers in a top-down manner. The developers then modify the code according to the requirements. In social coding, however, code modifications can be proposed spontaneously by altruistic developers in the community. Thus, software evolution can be triggered in a bottom-up way. We call this new type of software evolution spontaneous software evolution (Matsumoto et al., 2017).
Spontaneous software evolution through social coding is a powerful means of achieving fast, flexible, and sustainable evolution of software. It also often yields innovative ideas that the project owner never imagined, leading to advanced features and products. For example, microsoft/vscode (Microsoft, Inc., 2015) is a GitHub project for a powerful code editor, in which many people take part in discussions, developers actively contribute code, and many innovative features have been added. Thus, microsoft/vscode can be regarded as a successful project in the context of spontaneous software evolution.
However, we do not know how such a successful project has been managed, or how such an active community has been formed and governed. To clarify these issues, we need to investigate the project's history, from its creation to its current shape. Fortunately, since the SCP accumulates various event logs, we can look back at how the project stood at any past time. More specifically, for a given project p and timestamp t, consider the state of p at t. Observing this state while varying t yields a sequence of states, which characterizes how p has evolved. The challenge here is how to define the state so that it characterizes the spontaneous software evolution within p well.
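As a rough illustration of the idea of a state sequence, the following Python sketch derives one from an event log. Everything in it is an assumption made for illustration: the event records, the names `state_at` and `state_sequence`, and above all the choice of cumulative event counts as the state, which is only a placeholder and not the definition proposed in this paper.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event log for one project p: (timestamp, event type) pairs.
# The dates and event names are made up for this sketch.
EVENTS = [
    (datetime(2020, 1, 5), "fork"),
    (datetime(2020, 1, 9), "pull_request"),
    (datetime(2020, 1, 12), "merge"),
    (datetime(2020, 2, 3), "fork"),
    (datetime(2020, 2, 20), "pull_request"),
]


def state_at(events, t):
    """Placeholder state of the project at time t.

    Assumption: the state is the number of events of each type
    logged up to and including t.
    """
    return dict(Counter(kind for ts, kind in events if ts <= t))


def state_sequence(events, timestamps):
    """Evaluate the placeholder state at each timestamp, in the order given."""
    return [state_at(events, t) for t in timestamps]


# Observe the project at the end of January and the end of February 2020.
SEQUENCE = state_sequence(
    EVENTS, [datetime(2020, 1, 31), datetime(2020, 2, 29)]
)
```

Simple counts like these ignore who performed each event and how the community responded, which is part of why the choice of state definition is the real challenge.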
The goal of this paper is to propose a method that can address the following research questions: