Debian vs. SourceForge – Round 1

Posted on February 3, 2009


We all know about SourceForge and Debian. Although they serve different purposes, both act as repositories of free software, and most practitioners will know that Debian hosts what are considered to be the best projects, those judged most worthy by its army of package maintainers. Conversely, many (but by no means all) SourceForge projects languish in obscurity; these are, at best, of little interest to anyone other than the developers who run them, or, at worst, completely stalled. Conventional wisdom therefore holds that Debian projects receive much more developer activity than those hosted on communities like SourceForge.

So today’s research question is: How true is this? How much more activity (if any) do projects in Debian actually receive than their counterparts in SourceForge? To answer it, we pose two quantifiable questions:

  1. Are the evolutionary characteristics of Debian projects significantly different from those in SourceForge? (In other words, do Debian projects receive so much more activity that the difference cannot be attributed to random statistical noise?)
  2. Does Debian act as a “catalyst”, so that when a project enters Debian’s repository, the activity around it increases?
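
As a rough illustration of what "significantly different" could mean for the first question, here is a minimal sketch of a two-sided permutation test on the difference of means. This is only one possible approach and not necessarily the test used in this study; the function name `permutation_test` and the commit counts in the usage example are invented purely for illustration.

```python
import random
from statistics import mean


def permutation_test(sample_a, sample_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns an estimated p-value: the share of random relabellings whose
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values and split them into two new groups
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add-one correction so the estimate is never exactly zero
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Invented commit counts, NOT real data from either repository
    debian_commits = [100, 120, 130, 110, 140]
    sourceforge_commits = [1, 2, 3, 4, 5]
    p = permutation_test(debian_commits, sourceforge_commits)
    print(f"estimated p-value: {p:.4f}")
```

A small p-value suggests that the gap between the two groups is unlikely to come from chance alone. Activity measures such as commit counts tend to be heavily skewed, so comparing medians or ranks instead of means may be a more robust choice in practice.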

To answer the questions, we need to measure proxies of evolutionary activity. We chose:

  • Project age
  • Project size
  • Number of developers
  • Number of commits

How these attributes were measured, and how they helped to answer the questions, will be addressed in the follow-up post.

Posted in: Research