Data Mining Reveals How Social Coding Succeeds (And Fails)


Emerging Technology From the arXiv : “Collaborative software development can be hugely successful or fail spectacularly. An analysis of the metadata associated with these projects is teasing apart the difference….
The process of developing software has undergone huge transformation in the last decade or so. One of the key changes has been the evolution of social coding websites, such as GitHub and BitBucket.
These allow anyone to start a collaborative software project that other developers can contribute to on a voluntary basis. Millions of people have used these sites to build software, sometimes with extraordinary success.
Of course, some projects are more successful than others. And that raises an interesting question: what are the differences between successful and unsuccessful projects on these sites?
Today, we get an answer from Yuya Yoshikawa at the Nara Institute of Science and Technology in Japan and a couple of pals at the NTT Laboratories, also in Japan.  These guys have analysed the characteristics of over 300,000 collaborative software projects on GitHub to tease apart the factors that contribute to success. Their results provide the first insights into social coding success from this kind of data mining.
A social coding project begins when a group of developers outline a project and begin work on it. These are the “internal developers” and have the power to update the software in a process known as a “commit”. The number of commits is a measure of the activity on the project.
External developers can follow the progress of the project by “starring” it, a form of bookmarking on GitHub. The number of stars is a measure of the project’s popularity. These external developers can also request changes, such as additional features and so on, in a process known as a pull request.
Yoshikawa and co begin by downloading the data associated with over 300,000 projects from the GitHub website. This includes the number of internal developers, the number of stars a project receives over time and the number of pull requests it gets.
The team then analyse the effectiveness of the project by calculating factors such as the number of commits per internal team member, the popularity of the project over time, the number of pull requests that are fulfilled and so on.
The results provide a fascinating insight into the nature of social coding. Yoshikawa and co say the number of internal developers on a project plays a significant role in its success. “Projects with larger numbers of internal members have higher activity, popularity and sociality,” they say….
Ref: arxiv.org/abs/1408.6012 : Collaboration on Social Media: Analyzing Successful Projects on Social Coding”