James E. Bromberger (JEB), a contributor to Perl CPAN and Debian, has estimated the cost of developing Debian Wheezy (7.0) from scratch based on the number of lines of code (LOC) counted with the SLOCCount tool, the Constructive Cost Model (COCOMO), and an average developer wage of 72,533 USD (using median estimates from Salary.com and PayScale.com for 2011).
He found 419,776,604 lines of code in 31 programming languages, giving an estimated cost of producing Debian Wheezy in February 2012 of 19 billion US dollars (14.4 billion euros), which makes the source code of each of the 17,141 packages worth an average of 1,112,547.56 USD to produce.
He also estimated that the Linux 3.1.8 kernel, with almost 10 million lines of source code, would be worth 540 million USD at standard complexity, or 1.877 billion USD when rated as ‘complex’.
I don’t know which tool he used for the calculation (maybe his own), but there are two simple COCOMO calculators on the internet that estimate the number of man-hours required for a particular project based on the LOC and the code complexity:
- The Little COCOMO Calculator (COCOMO 81 model – 1st version)
- NASA basic COCOMO online calculator (COCOMO II model – 2nd version)
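
The basic COCOMO 81 equation behind such calculators is short enough to try by hand. The snippet below is only a minimal sketch, not necessarily Bromberger's method: it assumes the organic-mode coefficients (2.4 and 1.05) and the 2.4 overhead multiplier that SLOCCount uses by default, takes the 72,533 USD salary from his post, and the function names are purely illustrative.

```python
def cocomo_basic_effort(sloc, a=2.4, b=1.05):
    """Estimated effort in person-months (basic COCOMO 81).

    a, b default to the organic-mode coefficients (an assumption here).
    """
    ksloc = sloc / 1000  # COCOMO works in thousands of lines of code
    return a * ksloc ** b


def cocomo_basic_cost(sloc, salary=72_533, overhead=2.4, a=2.4, b=1.05):
    """Estimated cost in USD: person-years x annual salary x overhead.

    overhead=2.4 is SLOCCount's default multiplier (assumed, not confirmed
    as the value used in the original estimate).
    """
    person_years = cocomo_basic_effort(sloc, a, b) / 12
    return person_years * salary * overhead


if __name__ == "__main__":
    kernel_sloc = 10_000_000  # "almost 10 million" lines, rounded up
    print(f"{cocomo_basic_effort(kernel_sloc):,.0f} person-months")
    print(f"{cocomo_basic_cost(kernel_sloc):,.0f} USD")
```

With a round 10 million lines, this lands in the region of 550 million USD, the same ballpark as the 540 million USD figure quoted for the kernel at standard complexity.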
Bromberger also estimated the cost of several individual projects used in Debian and found that developing Apache 2.2.9 would cost 33.5 million USD and MySQL 64.2 million USD at today’s salaries.
Of course, there are some caveats to such cost estimates. First, the costs assume software engineers based in the US; taking outsourcing into account would significantly reduce them. Second, he also tried another source code analysis tool (Ohcount), which found far fewer lines of code in Debian Wheezy. Finally, you also need to estimate the code complexity in the COCOMO model, which may lead to great variation in the final costs.
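
To get a feel for how much the complexity setting matters, here is a rough sketch comparing the three project classes of basic COCOMO 81 (organic, semi-detached, embedded) using their textbook coefficients. The salary comes from the post; the 2.4 overhead multiplier is SLOCCount's default and is an assumption here, as are the names `MODES` and `cost_by_mode`. The results will not necessarily match Bromberger's ‘complex’ figure, since his exact parameters are not given here.

```python
# Basic COCOMO 81 coefficients (a, b) per project class.
MODES = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def cost_by_mode(sloc, salary=72_533, overhead=2.4):
    """Return the estimated cost in USD for each COCOMO project class."""
    ksloc = sloc / 1000
    return {
        # person-months -> person-years -> salary cost with overhead
        mode: a * ksloc ** b / 12 * salary * overhead
        for mode, (a, b) in MODES.items()
    }


for mode, usd in cost_by_mode(10_000_000).items():
    print(f"{mode:>13}: {usd / 1e9:.2f} billion USD")
```

For 10 million lines, the embedded estimate comes out roughly six times the organic one, so the choice of complexity class can easily dominate the result.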
In the last section of his post, Bromberger also analyzed the 31 programming languages that were used to develop the Debian software (see his chart below). It shows that C and C++ still rule the (programming) world, with 40% and 20% of the code respectively written in those two languages. Java comes a distant third at 8%.