June 19, 2006
LOC

Thanks, Ash.

This is completely wrong for all I know, but it made me curious:

Lest those of you who wrote 5,000 lines of code last weekend pass a kidney stone at the thought of Windows developers writing only a thousand lines of code a year, realize that the average software developer in the US only produces around (brace yourself) 6200 lines a year.

Montezuma, something I work on in my spare time, for free, consists of 13605 lines of Lisp (I subtracted 1000 from the total line count to account for code someone else wrote) written in 182 days, which is 80.2 lines per day. Which is 29273 lines of code per year, if I can keep up that rate of code writing for another six months.

Some C++ code I wrote by myself, for pay, full-time, consisted of 10901 lines of code after 165 days. Which is a rate of 66.1 lines per day, or 24126 lines per year.

I don't really remember when the C++ code began to approach the high part of the completion curve, but at least 1000 lines were added in the last month of the interval I looked at. Looking at just the first 73 days, I wrote 6070 lines of code, which is 83.2 lines per day or 30368 lines per year. Compare to the rate for the first 73 days of Montezuma development, which was 108.8 lines/day, for 39730 lines/year.
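As a quick check on the arithmetic above, here is a minimal Python sketch. The function name `loc_rate` and the 365-day year are assumptions made for illustration, not something taken from the original calculation. Run on the stated totals, it reproduces the C++ daily rates to within rounding, while 13605 lines over 182 days comes out nearer 74.8 lines per day; the 80.2 figure corresponds to 14605 lines, so one of the two Montezuma numbers may reflect the count before the 1000-line subtraction.

```python
def loc_rate(lines, days, days_per_year=365):
    """Return (lines per calendar day, lines per year at that rate)."""
    per_day = lines / days
    return per_day, per_day * days_per_year


# Totals and intervals as stated in the post.
projects = [
    ("Montezuma (Lisp), as stated", 13605, 182),
    ("Montezuma (Lisp), before subtracting 1000", 14605, 182),
    ("C++ project, full interval", 10901, 165),
    ("C++ project, first 73 days", 6070, 73),
]

for label, lines, days in projects:
    per_day, per_year = loc_rate(lines, days)
    print(f"{label}: {per_day:.1f} lines/day, {per_year:.0f} lines/year")
```

The yearly figures printed here differ slightly from the ones quoted above because the post annualizes the already-rounded daily rate.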

Posted by jjwiseman at June 19, 2006 11:53 AM
Comments

I'd expect the lines of code per day to be very high for a 'new' project or 'new' feature and very low for 'integration' or 'enhancements' of existing code. Some days my line count is negative, when I delete something incorrect.

A 'list of features' would be pretty interesting, especially when compared with the LOC stat.

Posted by: Quillian on June 19, 2006 12:53 PM

There's also a very large range of productivities. Good programmers are at least an order of magnitude more productive than average programmers. If Microsoft has such low productivity, it may be a sign they're not attracting or retaining the really high performers. It could also be that their products and/or development process have reached a state of terminal entropy. In any case, it's not a good sign.

Posted by: Paul Dietz on June 19, 2006 01:31 PM

I think your calculations are overlooking something. When measuring the commercial productivity of a programmer you have to calculate based on working days, not actual days. I'm guessing you spent time on the weekend working on the code you wrote for yourself. If not, you actually need to bump up your productivity metrics.

Assuming 50 working weeks a year (52 weeks minus 2 weeks vacation per year) and five working days per week, the Microsoft number works out to 24.8 LOC per day. This sounds low, but honestly I don't think I'm doing any better, because I also have to spend time on other matters, such as user support, meetings, communication of design decisions, meticulously crafted but ultimately tiny test cases designed to trigger specific defects, release builds, test failure investigation, and a whole host of other non-code artifacts. When I do produce new code it tends to be in short bursts that exceed the 24.8 per day rate but aren't sustained.
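The working-day conversion above can be sketched in a few lines of Python. The function name `loc_per_working_day` and its default arguments are hypothetical, chosen only to mirror the 50-week, five-day assumption stated in the paragraph.

```python
def loc_per_working_day(lines_per_year, working_weeks=50, days_per_week=5):
    """Spread a yearly line count over working days instead of calendar days."""
    return lines_per_year / (working_weeks * days_per_week)


# The 6200 lines/year figure quoted in the post, over 250 working days.
print(f"{loc_per_working_day(6200):.1f} LOC per working day")

# The same yearly figure spread over all 365 calendar days, for comparison.
print(f"{6200 / 365:.1f} LOC per calendar day")
```

The gap between the two results is the whole point: the same yearly total looks noticeably better once weekends and vacation are excluded.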

When I'm working on projects where I'm the sole developer, much of this overhead doesn't exist, and I can radically exceed this rate. However, no project that I work on personally approaches the conceptual complexity of what I do at work, and that's not (completely) due to unnecessary abstraction. (If it is, it's Paul's fault :-)

Posted by: Brian Mastenbrook on June 19, 2006 02:46 PM

I agree with Brian. Plus, I wonder how you measure lines. Do comments count? What about boilerplate code (if you're doing Java!)?

I think my lines-per-day rate can be pretty low on bug-shooting days, but on other days I'm pretty sure I can crank out a couple hundred.

I'm also not sure how lines of code correlate to productivity. For example, I could spend more time writing 5 lines of really good code than 20 lines of not-so-good code that did the same thing (bad example, I know).

Posted by: Jon Philpott on June 19, 2006 03:10 PM

LOC produced are not the same as lines edited. Bug fixes, refactoring, etc. may or may not count. For instance, let's say you make an exact port of Lucene from Java to Lisp, and create Java bindings to make it look completely identical to Java (i.e. not exposing any special Lisp functionality at all). Let's say you make a complete and accurate account of time spent. How many lines of code is that per week?

You could easily argue that the answer is zero (i.e. no new features). You might also argue that the answer is the difference between the LOC in Lucene and the LOC in the new project (which punishes brevity and reuse). Or you could take the integral of lines edited (which may or may not be meaningful). And yes, you could argue that the number of LOC in the new project is the true LOC (which also penalizes reuse).

The point: all of these answers are incomplete. (Where is the original source for the MS number?)

Posted by: Dave Christianson on June 19, 2006 03:50 PM

I take some weekends off from personal coding, and have (sadly) spent many weekends on the job. I haven't spent time writing manuals or tutorials or slides for Montezuma, like I did for the C++ code. I also haven't had to negotiate with project leaders or testers, or fix bugs in others' code that I relied on. Basically, they're completely different scenarios other than the fact that in each case I'm writing all the code.

16.9 or 24.8 LOC/day, both of those sound plausible to me, and not particularly low due to all the other things that are part of just about any programming job.

I used wc -l, because I just don't think it matters. LOC is such an arbitrary measure that it just is what it is, and I don't think tweaking for comments (and of course comments should count) or boilerplate or anything else will really gain you much.
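For anyone who wants to see how much the counting rule changes the number, here is a rough Python sketch, not the method used for the figures above. The function name `count_lines` is hypothetical, and it assumes Lisp source where only whole-line `;` comments are treated as comments; trailing comments and `#| ... |#` blocks are ignored.

```python
def count_lines(text):
    """Count lines three ways: raw (like wc -l), non-blank, and code-only."""
    lines = text.splitlines()
    nonblank = [line for line in lines if line.strip()]
    # Assumption: a comment line is one whose first non-space character is ';'.
    code = [line for line in nonblank if not line.lstrip().startswith(";")]
    return {
        "raw": text.count("\n"),  # wc -l counts newline characters
        "nonblank": len(nonblank),
        "code": len(code),
    }


sample = "(defun f (x)\n  ;; add one\n  (+ x 1))\n\n"
print(count_lines(sample))
```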

Posted by: John Wiseman on June 19, 2006 03:56 PM

I think I'll class LOC in the same category of usefulness as MIPS then.

Posted by: Jon Philpott on June 19, 2006 04:13 PM

I think the comments so far provide a decent overview of LOC as a metric. =) Everything said here mirrors what I understand to be current thinking, more or less.

For myself, I think the first poster nailed it. Sure, I wrote a thousand lines of Java in a day(*), when I was writing a new program from scratch and I knew exactly what I wanted it to look like because it was a toy version of a program I'd helped write last summer when consulting. When it comes time to edit a sufficiently large program, though, the amount of LOC changed tends to be very small, unless entire new features are being added which can't be based on existing code -- in which case, you're back to something like the new project experience.

I've definitely had days where I've added 10 lines of (unit) test code for each line of real code. Test code doesn't count, though.


(*) Okay, the NetBeans GUI builder helped too. A lot. But I wrote at least half the code myself by hand.

Posted by: Michael Hannemann on June 19, 2006 05:57 PM

I remember a once popular rule of thumb from the late seventies and early eighties. The rule was that, for a large project with a large number of software engineers, the project manager could expect the team to produce 20 debugged lines of code per day per engineer. Interestingly enough, this works out to about 5000 lines of code per year.

At the time, the caveats were 1) this only made sense as a team metric, and *productive* developers would both under- and outperform this number, and 2) debugged lines meant measuring from final spec to released code.

I don't think the circumstances are the same. We were writing code in 36-bit macro assembler. There were no IDEs, and cut-and-paste programming was seen as highly wasteful (of space or compute). So I would expect the number to be substantially higher now than it was then.

But I think the caveats were universal. This sort of metric only makes sense for large teams working on big systems. Creating the code takes a small amount of time compared to the process of taking all that code and delivering it as part of a large low-defect system. And most importantly, it's meaningless for looking at individuals.

Posted by: R Hayes on June 19, 2006 07:58 PM

Maybe, in lisp, you should count your productivity in s-exps per day.

Posted by: E. Fischer on June 20, 2006 03:28 AM

Maybe, in lisp, you should count your productivity after macroexpansion.

Posted by: R. Smith on June 20, 2006 08:41 AM

Maybe, in lisp, you should count your productivity in s-exps per day.

One, but it's a big one?

Posted by: Michael Hannemann on June 21, 2006 07:22 PM

I just checked at work, and since starting a new job 3 months ago, I've averaged 27 lines of code per work day. That's at a Large Company (8000+ employees) with a large pre-existing codebase (over 10 years old), and lots of meetings intruding on 'getting in the zone'.

Posted by: David Mercer on June 21, 2006 08:54 PM