www.fgks.org   »   [go: up one dir, main page]


Archive for the ‘Coders at Work’ Category

A Tale of Two, Three, … N Rewrites

March 7, 2010

In an earlier post about the various rewrites Netscape’s browser underwent I quoted both Jamie Zawinski and Brendan Eich about their experience with the programmers from companies acquired by Netscape coming in and rewriting sections of browser code. I quoted Eich saying:

There was an imperative from Netscape to make the acquisition that waved the Design Patterns book around [i.e. Collabra] feel like they were winners by using their new rendering engine, which was like My First Object-Oriented Rendering Engine.

Eich soon wrote me to clarify and I’ve been meaning to post something about it for quite some time. Here’s what he said:

I wasn’t referring to Collabra, rather to Digital Styles. Different Netscape startup-acquisition lottery winner, same sad rewrite story, but with the rendering engine the rewrite target instead of the mail/news (“groupware”) code the target this time.

You quoted jwz saying: “They [Collaborators] didn’t start from scratch with a blank disk but they eventually replaced every line of code.” The “every line of code” is about the mail/news code only, not the browser front ends, and specifically not the layout (rendering) engine. The Collaborators weren’t that good, they had no one who could hack a web-compatible replacement layout engine or a JavaScript engine. They were Windows Professional ™ C++ programmers, mostly.

Netscape 4 was not a total rewrite. (No Netscape release, not even Netscape 6, was, although 6 and the Mozilla code it was based on came closest by far.) I rewrote my first JS engine (“Mocha”) for it, but reused a lot of code with name changes (big change: a real GC instead of ref-counting and arenas). Eric Bina’s layout engine remained pretty much as it was in Netscape 3. Lou Montulli’s hacked over netlib, ditto. More hands on board at Netscape meant more incremental evolution, but the original Netscape 1 or 2 era browser and rendering code was still there.

Feel free to quote me, hope this helps.

From there Eich and Zawinski had a short back and forth that I got cc’d on about what exactly got rewritten by whom. Unfortunately the final answer, which could be pulled from the old CVS repository may be lost to history – Eich tried to get AOL to release the old CVS repsitory in 2003 but didn’t get anywhere with it, in part because, as he said in one of his emails, “I’m not sure what had become of the old CVS server at that point.”

Anyway a bit of ancient history that I should have posted about far sooner.

C++ in Coders at Work

October 16, 2009

One of the topics I asked most of my Coders at Work interviewees about was C++. I am not an expert, or even a competent C++ programmer and recognize that my own opinions about C++ are not well-informed enough to be worth much.1 But C++ fascinates me—it’s obviously a hugely successful language: most “serious” desktop apps are still written in C++ despite the recent inroads made by Objective C on OS X and perhaps some C# on Windows; the core of Google’s search engine is written in C++; and C++ dominates the games industry. Yet C++ is also frequently reviled both by those who never use and by those who use it all the time.

That was certainly reflected in the responses I got from my Coders interviewees when I asked them about it. Jamie Zawinski, as I’ve discussed recently, fought tooth and nail to keep C++ out of the Netscape code base (and eventually lost). Some of that was due to the immaturity of C++ compilers and libraries at the time, circa 1994, but it seems also to have to do with his estimation of the language as a language:

C++ is just an abomination. Everything is wrong with it in every way. So I really tried to avoid using that as much as I could and do everything in C at Netscape.

Part of Zawinski’s issue with C++ is that it is simply too complex:

When you’re programming C++ no one can ever agree on which ten percent of the language is safe to use. There’s going to be one guy who decides, “I have to use templates.” And then you discover that there are no two compilers that implement templates the same way.

Note that Zawinski had started his career as a Lisp programmer but also used C for many years while working on Netscape. And he later enjoyed working in Java. So it’s not that C++ was either too high-level or too low-level for him or that he couldn’t wrap his head around object orientation.

Joshua Bloch, who also hacked low level C code for many years before becoming a big-time Java head, told me that he didn’t get into object-oriented programming until quite late: “Java was the first object-oriented language I used with any seriousness, in part because I couldn’t exactly bring myself to use C++.” He echoed Zawinski’s point about how C++ forces programmers to subset the language:

I think C++ was pushed well beyond its complexity threshold and yet there are a lot of people programming it. But what you do is you force people to subset it. So almost every shop that I know of that uses C++ says, “Yes, we’re using C++ but we’re not doing multiple-implementation inheritance and we’re not using operator overloading.” There are just a bunch of features that you’re not going to use because the complexity of the resulting code is too high. And I don’t think it’s good when you have to start doing that. You lose this programmer portability where everyone can read everyone else’s code, which I think is such a good thing.

Ken Thompson, who still mostly uses C despite working at Google which is largely a C++ shop, has had as long an exposure to C++ as just about anyone, having worked with with Bjarne Stroustrup, C++’s inventor, at Bell Labs:

I would try out the language as it was being developed and make comments on it. It was part of the work atmosphere there. And you’d write something and then the next day it wouldn’t work because the language changed. It was very unstable for a very long period of time. At some point I said, no, no more.

In an interview I said exactly that, that I didn’t use it just because it wouldn’t stay still for two days in a row. When Stroustrup read the interview he came screaming into my room about how I was undermining him and what I said mattered and I said it was a bad language. I never said it was a bad language. On and on and on. Since then I kind of avoid that kind of stuff.

At that point in the interview I almost changed the topic. Luckily I took one more try at asking for his actual opinion of C++. His reply:

It certainly has its good points. But by and large I think it’s a bad language. It does a lot of things half well and it’s just a garbage heap of ideas that are mutually exclusive. Everybody I know, whether it’s personal or corporate, selects a subset and these subsets are different. So it’s not a good language to transport an algorithm—to say, “I wrote it; here, take it.” It’s way too big, way too complex. And it’s obviously built by a committee.

Stroustrup campaigned for years and years and years, way beyond any sort of technical contributions he made to the language, to get it adopted and used. And he sort of ran all the standards committees with a whip and a chair. And he said “no” to no one. He put every feature in that language that ever existed. It wasn’t cleanly designed—it was just the union of everything that came along. And I think it suffered drastically from that.

Brendan Eich, the CTO of the Mozilla Corporation, whose Mozilla browser is written almost entirely in C++, talks about “toe loss due to C and C++’s foot guns” and when I asked him if there are any parts of programming that he doesn’t enjoy as much as he used to, he replied:

I don’t know. C++. We’re able to use most of its features—there are too many of them. It’s probably got a better type system than Java. But we’re still screwing around with ’70s debuggers and linkers, and it’s stupid. I don’t know why we put up with it.

At least among my interviewees, even the most positive comments about C++ tended to fall in the category of “damning with faint praise”. I asked Brad Fitzpatrick, who used C++ in college and again now that he’s at Google, whether he likes it:

I don’t mind it. The syntax is terrible and totally inconsistent and the error messages, at least from GCC, are ridiculous. You can get 40 pages of error spew because you forgot some semicolon. But—like anything else—you quickly memorize all the patterns. You don’t even read the words; you just see the structure and think, “Oh, yeah, I probably forgot to close the namespace in a header file.” I think the new C++ spec, even though it adds so much complexity, has a lot of stuff that’ll make it less painful to type—as far as number of keystrokes. The auto variables and the for loops. It’s more like Python style. And the lambdas. It’s enough that I could delude myself into thinking I’m writing in Python, even though it’s C++.

Dan Ingalls, who helped invent modern object oriented programming as part of Alan Kay’s team that developed Smalltalk, never found C++ compelling enough to use but isn’t totally adverse to using it:

I didn’t get that much into it. It seemed like a step forward in various ways from C, but it seemed to be not yet what the promise was, which we were already experiencing. If I had been forced to do another bottom-up implementation, instead of using machine code I would’ve maybe started with C++. And I know a couple of people who are masters of C++ and I love to see how they do things because I think they don’t rely on it for the stuff that it’s not really that good at but totally use it as almost a metaprogramming language.

Joe Armstrong, similarly, has never felt the need to learn C++:

No, C++, I can hardly read or write it. I don’t like C++; it doesn’t feel right. It’s just complicated. I like small simple languages. It didn’t feel small and simple.

And finally Guy Steele, who probably knows more about more languages than anyone I interviewed (or possibly anyone, period), has also not been drawn to C++. But he did go out of his way to try to say something nice about Stroustrup’s effort:

I have not been attracted to C++. I have written some C++ code. Anything I think I might want to write in C++ now could be done about as well and more easily in Java. Unless efficiency were the primary concern.

But I don’t want to be seen as a detractor of Bjarne Stroustrup’s effort. He set himself up a particular goal, which was to make an object-oriented language that would be fully backwards-compatible with C. That was a difficult task to set himself. And given that constraint, I think he came up with an admirable design and it has held up well. But given the kinds of goals that I have in programming, I think the decision to be backwards-compatible with C is a fatal flaw. It’s just a set of difficulties that can’t be overcome.

Obviously with only fifteen interviewees in my book I have only a sampling of possible opinions. There are great programmers who have done great work with C++ and presumably at least some of them would have had more enthusiastic things to say about it if I had spoken with them. But this is what I heard from the people I spoke with.


1. I think I once managed to read all the way through Stroustrup’s The C++ Programming Language and have looked at at least parts of The Design and Evolution of C++. But I have never done any serious programming in it. I have made a couple attempts to learn it just because I felt I should but in recent years I’ve mostly given up, thinking that perhaps Erik Naggum, scourge of Usenet, was right when he said: “life is too long to know C++ well.”

Unit testing in Coders at Work

October 5, 2009

In his now infamous blog post “The Duct Tape Programmer”, Joel Spolsky quoted Jamie Zawinski from my interview with him in Coders at Work talking about how they didn’t use many unit tests when developing Netscape. “Uncle Bob” Martin, chiming in to say that Spolsky posting was right in general but wrong in almost all his specific claims and criticisms, was particularly riled by Spolsky’s implication that maybe unit tests aren’t 100% necessary at all times:

As for Joel’s consistent dismissal of unit testing, he’s just wrong about that. Unit testing (done TDD style) does not slow you down, it speeds you up. One day I hope Joel eventually realizes this. Programmers who say they don’t have time to write tests are living in the stone age. They might as well be saying that man wasn’t meant to fly.

Tim Bray also jumped in to strongly agree with Uncle Bob on the importance of unit tests, though he couldn’t bring himself to actually agree with much else Uncle Bob said.

Joel is wrong to piss on unit testing, and buys into the common fantasy that it slows you down. It doesn’t slow you down, it speeds you up. It’s been a while since I’ve run a dev team, but it could happen again. If it does, the developers will use TDD or they’ll be looking for another job.

Since this all started from the Zawinski interview in Coders at Work and since there were other people interviewed for the book, I figured it might be interesting to see what some of the other folks I talked to had to say about unit testing and things like TDD (“test driven development” or sometimes “test driven design”, for those of you behind on your acronyms.)

To start with, here’s a bit more context from the Zawinski interview:

Seibel: What about developer-level tests like unit tests?

Zawinski: Nah. We never did any of that. I did occasionally for some things. The date parser for mail headers had a gigantic set of test cases. Back then, at least, no one really paid a whole lot of attention to the standards. So you got all kinds of crap in the headers. And whatever you’re throwing at us, people are going to be annoyed if their mail sorts wrong. So I collected a whole bunch of examples online and just made stuff up and had this giant list of crappily formatted dates and the number I thought that should turn into. And every time I’d change the code I’d run through the tests and some of them would flip. Well, do I agree with that or not?

Seibel: Did that kind of thing get folded into any kind of automated testing?

Zawinski: No, when I was writing unit tests like that for my code they would basically only run when I ran them. We did a little bit of that later with Grendel, the Java rewrite, because it was just so much easier to write a unit test when you write a new class.

Seibel: In retrospect, do you think you suffered at all because of that? Would development have been easier or faster if you guys had been more disciplined about testing?

Zawinski: I don’t think so. I think it would have just slowed us down. There’s a lot to be said for just getting it right the first time. In the early days we were so focused on speed. We had to ship the thing even if it wasn’t perfect. We can ship it later and it would be higher quality but someone else might have eaten our lunch by then.

There’s bound to be stuff where this would have gone faster if we’d had unit tests or smaller modules or whatever. That all sounds great in principle. Given a leisurely development pace, that’s certainly the way to go. But when you’re looking at, “We’ve got to go from zero to done in six weeks,” well, I can’t do that unless I cut something out. And what I’m going to cut out is the stuff that’s not absolutely critical. And unit tests are not critical. If there’s no unit test the customer isn’t going to complain about that. That’s an upstream issue.

I hope I don’t sound like I’m saying, “Testing is for chumps.” It’s not. It’s a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can’t do both. One of the jokes we made at Netscape a lot was, “We’re absolutely 100 percent committed to quality. We’re going to ship the highest-quality product we can on March 31st.”

So Zawinski says unit testing would have slowed them down. Uncle Bob and Tim Bray both say that unit testing, “doesn’t slow you down, it speeds you up.” Did Zawinski and the rest of the Netscape gang just blow it? They were going all out to develop their software as fast as they could; could they have sped things up with more unit testing? Maybe they were just living in the stone age.

Now, if Uncle Bob and Bray wanted to make a less radical claim than that unit testing always speeds you up, they could point out that unit tests can help you go faster over the longer term, and it’s not clear even Zawinski would disagree. And they’d definitely get some strong support for that claim from the subject of chapter two of Coders, Brad Fitzpatrick. I asked him about any big differences between his early and later programming style:

Fitzpatrick: I’ve also done a lot of testing since LiveJournal. Once I started working with other people especially. And once I realized that code I write never fucking goes away and I’m going to be a maintainer for life. I get comments about blog posts that are almost 10 years old. “Hey, I found this code. I found a bug,” and I’m suddenly maintaining code.

I now maintain so much code, and there’s other people working with it, if there’s anything halfway clever at all, I just assume that somebody else is going to not understand some invariants I have. So basically anytime I do something clever, I make sure I have a test in there to break really loudly and to tell them that they messed up. I had to force a lot of people to write tests, mostly people who were working for me. I would write tests to guard against my own code breaking, and then once they wrote code, I was like, “Are you even sure that works? Write a test. Prove it to me.” At a certain point, people realize, “Holy crap, it does pay off,” especially maintenance costs later.

Another interviewee, Joshua Bloch described how he designs code by starting with the APIs. He claimed this is a sort of “test-first programming and refactoring applied to APIs” since the first thing he does with a newly designed is test whether it would support the use cases that had lead to creating the API in the first place. But since he does all that writing any runnable code, that could also be called old-fashioned, “thinking about what you’re going to do before you do it” programming. Bloch did dispute the claim of those TDD advocates who say the tests produced by TDD can function as a spec for the code under test:

I don’t think tests are even remotely an acceptable substitute for documentation. Once you’re trying to write something that other people can code to, you need precise specs, and the tests should test that the code conforms to those specs.

Elsewhere Bloch described how he used both system and unit testing when he was working on an implementation of transactional shared-memory:

To test the code, I wrote a monstrous “basher.” It ran lots of transactions, each of which contained nested transactions, recursively up to some maximum nesting depth. Each of the nested transactions would lock and read several elements of a shared array in ascending order and add something to each element, preserving the invariant that the sum of all the elements in the array was zero. Each subtransaction was either committed or aborted—90 percent commits, 10 percent aborts, or whatever. Multiple threads ran these transactions concurrently and beat on the array for a prolonged period. Since it was a shared-memory facility that I was testing, I ran multiple multithreaded bashers concurrently, each in its own process.

At reasonable concurrency levels, the basher passed with flying colors. But when I really cranked up the concurrency, I found that occasionally, just occasionally, the basher would fail its consistency check. I had no idea what was going on. Of course I assumed it was my fault because I had written all of this new code.

After the system test demonstrated the presence of a bug he turned to unit tests to find it:

I spent a week or so writing painfully thorough unit tests of each component, and all the tests passed. Then I wrote detailed consistency checks for each internal data structure, so I could call the consistency checks after every mutation until a test failed. Finally I caught a low-level consistency check failing—not repeatably, but in a way that allowed me to analyze what was going on. And I came to the inescapable conclusion that my locks weren’t working. I had concurrent read-modify-write sequences taking place in which two transactions locked, read, and wrote the same value and the last write was clobbering the first.

I had written my own lock manager, so of course I suspected it. But the lock manager was passing its unit tests with flying colors. In the end, I determined that what was broken wasn’t the lock manager, but the underlying mutex implementation! This was before the days when operating systems supported threads, so we had to write our own threading package. It turned out that the engineer responsible for the mutex code had accidentally exchanged the labels on the lock and try-lock routines in the assembly code for our Solaris threading implementation. So every time you thought you were calling lock, you were actually calling try-lock, and vice versa. Which means that when there was actual contention—rare in those days—the second thread just sailed into the critical section as if the first thread didn’t have the lock. The funny thing was that that this meant the whole company had been running without mutexes for a couple weeks, and nobody noticed.

I asked him if he though the author of the mutex code that had been the cause of his problems could or even should have caught the bug with his own unit tests:

I think a good automated unit test of the mutex facility could have saved me from this particular agony, but keep in mind that this was in the early ’90s. It never even occurred to me to blame the engineer involved for not writing good enough unit tests. Even today, writing unit tests for concurrency utilities is an art form.

Donald Knuth, who is also a fan of after-the-fact torture tests, described an approach to coding about as far away from TDD as you can imagine, which he used when originally developing his typesetting system, TeX:

Knuth: When I wrote TeX originally in 1977 and ’78, of course I didn’t have literate programming but I did have structured programming. I wrote it in a big notebook in longhand, in pencil.

Six months later, after I had gone through the whole project, I started typing into the computer. And did the debugging in March of ’78 while I had started writing the program in October of ’77. The code for that is in the Stanford archives—it’s all in pencil—and of course I would come back and change a subroutine as I learned what it should be.

This was a first-generation system, so lots of different architectures were possible and had to be discarded until I’d lived with it for a while and knew what was there. And it was a chicken-and-egg problem—you couldn’t typeset until you had fonts but then you couldn’t have fonts until you could typeset.

But structured programming gave me the idea of invariants and knowing how to make black boxes that I could understand. So I had the confidence that the code would work when I finally would debug it. I felt that I would be saving a lot of time if I waited six months before testing anything. I had enough confidence that the code was approximately right.

Seibel: And the time savings would be because you wouldn’t spend time building scaffolding and stubs to test incomplete code?

Knuth: Right.

So Knuth too disagrees with the notion that unit testing always makes you go faster. Maybe he too is living in the stone age.

Joe Armstrong, on the other hand, says he has moved toward a test-first development style recently:

Seibel: At the point that you start typing code, do you code top-down or bottom-up or middle-out?

Armstrong: Bottom up. I write a little bit and test it, write a little bit and test it. I’ve gone over to this writing test cases first, now. Unit testing. Just write the test cases and then write the code. I feel fairly confident that it works.

The only interviewee who touched directly on TDD versus other approaches was Peter Norvig. He said he does more unit testing than he used to and even said some nice things about TDD but pointed out:

It’s also important to know what you’re doing. When I wrote my Sudoku solver, some bloggers commented on that. They said, “Look at the contrast—here’s Norvig’s Sudoku thing and then there’s this other guy,” whose name I’ve forgotten, one of these test-driven design gurus. He starts off and he says, “Well, I’m going to do Sudoku and I’m going to have this class and first thing I’m going to do is write a bunch of tests.” But then he never got anywhere. He had five different blog posts and in each one he wrote a little bit more and wrote lots of tests but he never got anything working because he didn’t know how to solve the problem.

A bit later Norvig said:

Then bloggers were arguing back and forth about what this means. I don’t think it means much of anything—I think test-driven design is great. I do that a lot more than I used to do. But you can test all you want and if you don’t know how to approach the problem, you’re not going to get a solution.

Ignoring Norvig’s suggestion that the difference between the two attempts doesn’t mean much of anything, it is instructive (or at least interesting, in a rubber-necking kind of way) to look at the two writeups. The “other guy” turns out to be Ron Jeffries, one of the inventors of Extreme Programming, the author of two books on XP, and according to his website an “experienced XP author, trainer, coach, and practitioner”.

Norvig’s writeup is a short essay explaining about 100 lines of Python that can solve any Sudoku. Jeffries writeup, by contrast, is spread over five lengthy blog postings here, here, here, here, and here and ends without coming anywhere close to actually producing a program that can solve any but a tiny subset of all Sudoku problems.

At some level the difference between the two simply boils down—as Norvig suggests—to knowledge: Norvig knew how to solve the problem because it’s a specific instance of a kind of problem he already knew how to solve. Jeffries, obviously, did not. But he did choose to tackle this particular problem using TDD, a technique in which he is supposed to be the expert? Why did he have so little success?

One thing I noticed, reading through Jeffries’s blog posts, was that he got fixated on the problem of how to represent a Sudoku board. He immediately started writing tests of the low-level details of a few functions for manipulating a data structure representing the 9×9 Sudoku board and a few functions for getting at the rows, columns, and boxes of the board. (“Boxes” are what Sudoku players call the 3×3 squares subsquares of the 9×9 board.)

Then he basically wandered around for the rest of his five blog postings fiddling with the representation, making it more “object oriented” and then fixing up the tests to work with the new representation and so on until eventually, it seems, he just got bored and gave up, having made only one minor stab at the problem of actually solving puzzles.

I suspect, having done a small amount of TDD myself, that this is actually a pattern that arises when a programmer tries to apply TDD to a problem they just don’t know how to solve. If I was a high-priced consultant/trainer like Jeffries, I’d probably give this pattern a pithy name like “Going in Circles Means You Don’t Know What You’re Doing”. Because he had no idea how to tackle the real problem, the only kinds of tests he could think of were either the very high-level “the program works” kind which were obviously too much of a leap or low-level tests of nitty-gritty code that is necessary but not at all sufficient for a working solver.

However, since most of what Jeffries spent his time on was the code for representing a Sudoku board and determining which row, column, and box a given square on the board is in, let’s look at that part of Norvig’s code.

Norvig’s basic strategy is to represent a board using a hash table with the keys being row-column pairs like A1, A2, and so on up to I9. After seeing the mess Jeffries makes of trying represent a board this is a refreshingly simple choice. It also seems to me that it requires a bit of creativity: given that a Sudoku board is a 9×9 board, I suspect I’m not the only programmer in the world who might be inclined to start with a 2d array. In a language without true 2d arrays, I might then be tempted, as Jeffries was, to then use an 81-element array and then get all wrapped around the axle, as Jeffries did, making sure I haven’t screwed up the finicky math for converting between 1d and 2d indices. And for all we know Norvig fell into the same trap and only later realized that all he really needed was the easy random access provided by a hash table. But pretty clearly Jeffries’s approach of testing the heck out of an array-based implementation wasn’t sufficient to lead him to the much better hash table-based one.

Given his choice to use a hash table, Norvig needs a list of the keys, i.e. the Cartesian product of the row labels (A-I) and the column labels (1-9). So the first bit of code he shows is a function cross which implements a Cartesian product that combines the pairs of elements by concatenation. The standard mathematical definition of the Cartesian product is:

A × B = {(a,b) | a ∈ A and b ∈ B }
        

Python’s list comprehensions let Norvig express this function in essentially the same notation as the standard mathematical definition:

def cross(A, B):
    return [a+b for a in A for b in B]
        

Could he or should he have developed this function via a test-first strategy? Dunno. Would it have been faster? Possibly, if he had any problems getting it right. But given that he’s just transcribing a mathematical definition, maybe a quick check at Python’s REPL would be sufficient to make sure he hadn’t screwed anything up.

Next he defines two variables to hold the row and column labels:

rows = 'ABCDEFGHI'
cols = '123456789'
        

Do TDD people unit test their data? I don’t know. Should they? I’m not even sure what that would mean, at least in a case like this. At any rate, he then feeds these two untested values to the cross function to produce the list of all 81 squares:

squares  = cross(rows, cols)
        

Now he’s basically done with the representation of the board. Any hash table, with the elements of squares as its keys represents a Sudoku board. Later in his code Norvig will use a hash table with strings containing the possible digits that could be put in each square as values.

However there is one other bit of work to be done: to solve a Sudoku you’re going to need to be able to map from a square to the other squares in the same row, column, and box. Having read a bit about Sudoku solving, Norvig has discovered that people use the term ‘units’ to refer to the rows, columns, and boxes and ‘peers’ to refer to all the squares that are in one of the three ‘units’ of another square. Norvig, unlike Jeffries, realized that this is better represented in data than in code and proceeds to compute the data once and for all.

First he makes a list of all the units, i.e. all rows, columns, and boxes, using list comprehensions and his cross function:

unitlist = ([cross(rows, c) for c in cols] +
            [cross(r, cols) for r in rows] +
            [cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')])
        

Now, he can use unitlist to generate a dictionary, units, that maps each square name to a list of its three units. He completely brute-forces this, linearly scanning unitlist for each square, selecting the units containing the square which itself requires a linear scan of each element of unitlist. But why write something more clever when unitlist is only 27 elements long and it’s elements are each 9 elements long and this whole computation is only going to happen once anyway? Here’s the code:

units = dict((s, [u for u in unitlist if s in u]) 
             for s in squares)
        

Once he’s got units, which will also be used in its own right later, he can compute peers, a hash table that maps from square names to the set of peer squares:

peers = dict((s, set(s2 for u in units[s] for s2 in u if s2 != s))
             for s in squares)
        

And that’s it: 7 definitions in 12 lines of code and he’s done with data representation. I’m not sure how much code Jeffries ended up with. In his fourth installment he had about 81 lines devoted to providing slightly less functionality than Norvig provided in the code we just looked at. In the fifth (and mercifully final) installment, he started adding classes and subclasses and moving things around but never presented all the code again. Safe to say it ended up quite a lot more than 12 lines; if he’s lucky it stayed under 120.

I’m not a proponent (or particularly a detractor) of TDD. If I was, I’d be pretty strongly tempted to throw Jeffries under the bus—maybe TDD isn’t quite as bad as he makes it look in this exercise. It certainly seems that, within the constraints of TDD, he could have done a much better job. Perhaps if he had stopped to think a bit about what he was doing he could have, using TDD, ended up with code as simple as Norvig’s. For instance, if he had started a bit closer to the problem domain he could have started by writing tests of his ability to map from squares to peers and units and then implemented something to provide that functionality. So maybe Norvig is right—maybe there’s not much to learn from this episode except that Fred Brooks is still right and there are still no silver bullets.

Anyway, those are some of the highlights of, and some of the context around, what the folks I interviewed for Coders had to say about testing. There are probably some other good bits I’m forgetting at the moment. Feel free to buy a copy and look for them yourself.

Duct tape context: A tale of two rewrites

September 28, 2009

Last Thursday Joel Spolsky posted an article on his blog, The Duct Tape Programmer, based on my interview with Jamie Zawinski in Coders at Work. It—as Joel’s posts often do—sparked quite a bit of commentary on the programming web, eliciting responses from Uncle Bob Martin and Tim Bray and hundreds of comments on sites like the programming reddit and hackernews.

It was probably fortunate for me that Spolsky used the title he did—praising someone while calling them a “duct-tape programmer” is provocative and the provocation probably helped drive interest which, no doubt, led to a few people buying the book. So, that’s awesome. Thanks, Joel!

On the other hand, being provocative sometimes leaves little room for nuance. Since I dragged Zawinski into this by asking him to be in the book, I thought maybe I could try to unpack some of the context for folks who haven’t yet read the interview.

The first thing to remember is that the first version of Netscape was written in 1994. When Zawinski talks about the choice to not use C++ and the problems with templates, he’s not talking about C++ and templates in 2009; he’s talking about 15 years ago, four years before the language was officially standardized and a time when different compilers didn’t necessarily implement all the nooks and crannies of the language the same way. And Netscape had to run on Windows, Unix, and Mac, so reliable portability was especially important.

The second thing to keep in mind is that the Netscape team in 1994 was under tremendous, if self-imposed, time pressure. I asked Zawinski how it was that the original Netscape team, many of whom had worked on Mosaic at the NSCA and were thus getting a second crack at the same problem, avoided falling into second-system syndrome:

Zawinski: We were so focused on deadline it was like religion. We were shipping a finished product in six months or we were going to die trying.

Seibel: How did you come up with that deadline?

Zawinski: Well, we looked around at the rest of the world and decided, if we’re not done in six months, someone’s going to beat us to it so we’re going to be done in six months.

Could they have made a different trade-off in the old “fast, good, cheap—pick two” space? Possibly. But history certainly testifies that they produced a product quickly enough that no one beat them to the punch and good enough that they changed the history of the web.

But wait! Didn’t Netscape eventually collapse under the burden of accumulated cruft? Didn’t the Mozilla project have to do a massive, ground-up rewrite because the old code base was so covered in duct tape that it was impossible to work with? Doesn’t that completely disprove Spolsky’s point?

Not clear. I haven’t done an extensive investigation of the history of Netscape’s code but I do have Zawinski’s version of events and some corroboration from Brendan Eich, who was also there at the time.

Certainly they paid a price for their rapid development:

Seibel: After this relentless pace, at some point that has to start to catch up with you in terms of the quality of the code. How did you guys deal with that?

Zawinski: Well, the way we dealt with that was badly. There’s never a time to start over and rewrite it. And it’s never a good idea to start over and rewrite it.

Yet we all know that Netscape famously was rewritten after it was spun out into the Mozilla project. So case closed, right? Not quite. What many people don’t know is that—at least according to Zawinski—the Mozilla rewrite was the second rewrite. Here’s Zawinski’s version, from the part of his Coders interview where we were talking about his work with Terry Weissman on the mail reader that was added to Netscape in version 2.0:

Zawinski: So basically [Netscape] acquired this company, Collabra, and hired this whole management structure above me and Terry. Collabra has a product that they had shipped that was similar to what we had done in a lot of ways except it was Windows-only and it had utterly failed in the marketplace.

Then they won the start-up lottery and they got acquired by Netscape. And, basically, Netscape turned over the reins of the company to this company. So rather than just taking over the mail reader they ended up taking over the entire client division. Terry and I had been working on Netscape 2.1 when the Collabra acquisition happened and then the rewrite started. Then clearly their Netscape 3.0 was going to be extremely late and our 2.1 turned into 3.0 because it was time to ship something and we needed it to be a major version.

So the 3.0 that they had begun working on became 4.0 which, as you know, is one of the biggest software disasters there has ever been. It basically killed the company. It took a long time to die, but that was it: the rewrite helmed by this company we’d acquired, who’d never accomplished much of anything, who disregarded all of our work and all of our success, went straight into second-system syndrome and brought us down.

They thought just by virtue of being here, they were bound for glory doing it their way. But when they were doing it their way, at their company, they failed. So when the people who had been successful said to them, “Look, really, don’t use C++; don’t use threads,” they said, “What are you talking about? You don’t know anything.”

Well, it was decisions like not using C++ and not using threads that made us ship the product on time. The other big thing was we always shipped all platforms simultaneously; that was another thing they thought was just stupid. “Oh, 90 percent of people are using Windows, so we’ll focus on the Windows side of things and then we’ll port it later.” Which is what many other failed companies have done. If you’re trying to ship a cross-platform product, history really shows that’s how you don’t do it. If you want it to really be cross-platform, you have to do them simultaneously. The porting thing results in a crappy product on the second platform.

Seibel: Was the 4.0 rewrite from scratch?

Zawinski: They didn’t start from scratch with a blank disk but they eventually replaced every line of code. And they used C++ from the beginning. Which I fought against so hard and, dammit, I was right. It bloated everything; it introduced all these compatibility problems because when you’re programming C++ no one can ever agree on which ten percent of the language is safe to use. There’s going to be one guy who decides, “I have to use templates.” And then you discover that there are no two compilers that implement templates the same way.

And when your background, your entire background, is writing code where multiplatform means both Windows 3.1 and Windows 95, you have no concept how big a deal that is. So it made the Unix side of things—which thankfully was no longer my problem—a disaster. It made the Mac side of things a disaster. It meant it was no longer possible to ship on low-end Windows boxes like Win16. We had to start cutting platforms out. Maybe it was time to do that, but it was a bad reason. It was unnecessary.

Brendan Eich, who was also at Netscape at this time, confirms at least part of Zawinski’s account:

There was an imperative from Netscape to make the acquisition that waved the Design Patterns book around [i.e. Collabra] feel like they were winners by using their new rendering engine, which was like My First Object-Oriented Rendering Engine. From a high level it sounded good; it used C++ and design patterns. But it had a lot of problems.

He does, however, go on to say:

But the second reason we did the big rewrite—I was in mozilla.org and I really was kind of pissed at Netscape, like Jamie, who was getting ready to quit. I thought, you know, we need to open up homesteading space to new contributors. We can’t do it with this old hairball of student code from 1994. Or my fine Unix kernel-style interpreter code.

So Zawinski and Eich differ a bit on the extent to which the Collabra folks had completely replaced the old code in 4.0. If Zawinski’s recollection is accurate and the 4.0 rewrite had really replaced all the old code, then the fact the Mozilla project felt compelled to do a big rewrite is only evidence of what a mess the first rewrite had made of things and says nothing about the quality of the pre-4.0 code. Or maybe they hadn’t quite replaced everything but had still messed things up enough that it was easier to start over. Or maybe it really was the legacy of the 1994 code—too much duct tape—that was the real problem. Ironically, if Zawinski had had his way it’d be much easier to know what really happened—he tells me in a recent email that he tried, in 1998, to get Netscape to open source both the 4.0 and 3.0 code bases but they wouldn’t go for it.

Enough women in Coders at Work?

September 22, 2009

When I started work on my book Coders at Work, the first thing I had to do was come up with a list of people I wanted to interview. I came up with a lot of names myself and then put up some web pages where people could suggest names and sort them in various ways in order to show me which folks they were most interested in seeing interviewed. Thanks to some front-page play on the programming Reddit, I received quite a few suggestions and lots of people tried their hand at generating a list of fifteen or sixteen subjects.

Pretty soon the issue arose of whether I would interview any women and, if so, how many? How did it arise? Well, my mom, for one, called me to ask. And a friend pointed me to a blog post by Shelley Powers on burningbird.net criticizing the at-that-time recently released book Beautiful Code, for having only one woman among the thirty-eight authors of its thirty-three essays. (Powers’s post, unfortunately, is no longer online.)

In the end, I ended up interviewing one woman, Fran Allen, and fourteen men. Is one enough? I don’t know. Here are a few things I do know:

The issue of sexism, in the field and in the world at large, is a huge ball of worms. Why are there relatively few women programmers? Is it because all male programmers are irredeemably sexist? Because there are sex linked characteristics that make it more likely that a randomly chosen man will turn out to be a good programmer than that a randomly chosen woman will? Because girls disproportionally drop off the science and math track way back in grammar school? Or because once the field became, for whatever reason, largely male, it became less appealing for women to enter it? Take your pick. Or make up your own reason.

And does it even matter that programmers are disproportionally men? If you read the comments on sites like the programming Reddit whenever the issue comes up, there’s a strong contingent of folks who claim that the field is already an essentially perfect meritocracy and therefore if there aren’t many women programmers it’s simply because women are less interested in being programmers or are somehow less suited to it – no big deal. Obviously if that’s your point of view, you’re not going to be worried about whether there are any women in a book like Coders. However …

I was actively interested in having women in the book. I do believe there’s enough sexism left, both in the world and in the field, that a girl who finds computers fascinating and who wants to be a programmer when she grows up will face obstacles that a similarly interested and talented boy would not. One of those obstacles, I believe, is the lack of women role models. Yeah, yeah, I can hear the “perfect meritocracy” advocates now – why should a woman be a better role model than a man for an aspiring girl programmer when we’re talking about something as purely logical and rational as computer programming. They can ask that if they want. My experience is that people, starting from when we are kids, do notice gender and that the path of least resistance is to do what other people of your gender are doing. So if you’re a girl interested in programming and you see very few other girls and women programming, that’s going to be a obstacle for you to overcome.

Given that, I didn’t want to write a book that was going to be another vector for the implicit message that all programmers are men. But …

A policy of diversity for diversity’s sake is tricky to implement. Assuming you decide you want to achieve a certain level of gender diversity, your problems are just starting. No matter who I selected, man or woman, for a book like Coders, some people are going to ask, “Did they really deserve to be included in that group?” Fair enough. But if people answer that question, “Oh, he just got in because they wanted a Java programmer,” that’s unlikely to cause them to think badly of all Java programmers. But if the answer is, “She was just included because she’s a woman,” it may cause a classic “affirmative action” backlash. Plus, as either Shelly Powers or one of the commenters on her blog pointed out, it can be insulting for a woman to be invited to do something “just because” she’s a woman.

So how did it happen that I ended up with only one of my fifteen subjects being a woman?

At the end of my name gathering, which included posting a comment on the burningbird.net article asking for suggestions of names of women coders and doing my own search for likely women candidates, I had 284 names, of which ten were women. (Or nine if your brand of feminism and/or chauvinism doesn’t accept transgendered people identifying as women as “real” women.) That’s not many: 3.5%. On the other hand, it’s better than the percentage of female Beautiful Code authors (2.6%) and Turing Award winners at that time (2%). (With Barbara Liskov winning the Turing in 2008, that percentage is now 3.6%.)

So if I were to assume that my list of 284 was a good sampling of the kind of programmers I wanted to interview, then in a book of fifteen interviews, I would expect to have .53 women. Since I knew for sure I wanted at least one woman – and since it is certainly possible that despite my best efforts, my list of 284 is not an unbiased sample – I rounded up rather than down and chose Fran Allen. And I’m glad I did. Anyone who wants to claim Allen got in just because she’s a woman can come talk to me after they’ve won the Turing Award. And she provided another important bit of diversity, representing IBM from the heyday of Big Blue.

Do I have any regrets? In retrospect, I wish I had tried to interview Amy Fowler, one of the lead developers of Java’s Swing GUI toolkit. I certainly seriously considered her and she would have added not only a younger woman’s perspective to the mix but also a GUI programmer’s point of view, both of which could have made it an even better book.

Did my own sexism ever lead me astray? Possibly. Early on, Alan Kay recommended that I interview L Peter Deutsch and Dan Ingalls and I signed them up. But what about Adele Goldberg, who was also part of Kay’s group at Xerox PARC? Did some bit of sexism lurking in my heart convince me that she wasn’t really a great hacker like Deutsch and Ingalls? Or was it really, as I told myself, that I just didn’t want to overload the book with Smalltalkers? I really don’t know. I’d like to think it was purely the latter but have to admit it’s possible the former played a part too.

At any rate, I’m pleased with how the book came out and glad that I had a chance to interview Fran Allen, whose historical perspective, technical opinions, and comments on being a woman in the field were all equally interesting and valuable. I hope that if my daughter – to whom the book is dedicated – decides she wants to be a programmer when she grows up, she will find something to inspire her in Allen’s chapter and in all the other chapters of the book.

Coders at Work on Kindle?

September 17, 2009

A number of people have emailed me to ask whether Coders at Work will be available on the Kindle. Short answer: Yes.

Longer answer: for reasons I don’t really understand, it may be a while. My publisher has, they assure me, sent the necessary files to Amazon and now Amazon has to do something – not clear to me what – before the Kindle edition is actually available. However that something can apparently take months. My editor was optimistic that Coders was selling well enough that it might be more like one month than the six months that some titles have spent in the Amazon queue but there’s no way to know.

I really don’t understand what Amazon could be possibly doing that could take 1/4 the time it took me to actually write the book but there it is. If you’re eager to get Coders on your Kindle your best bet is, I guess, to click the “I’d like to read this book on Kindle” link on the Amazon Coders at Work page. Or maybe email Amazon directly since the “I’d like to read this book on my Kindle” is ostensibly for letting the publisher know you want it and the publisher, in this case, is already is happy to oblige.

Yet more Coders at Work reviews

September 9, 2009

Since I last did a rollup, Coders at Work has received three more reviews on the web, bringing the grand total to nine:

The reviews have continued to be quite positive and the Slashdot review turned out to be quite a coup, shooting Coders‘s Amazon Sales rank, for one glorious hour, up to #176 across all books. While perhaps nothing can complete with the Slashdot Effect for driving an instantaneous spike in interest, I am hoping that it will help to actually get the book into peoples’ hands, as should happen later this week.

Another person who got a sneak peek at the book, Andy Mulholland, the Chief Technology Officer at Capgemini, an $8-billion global consulting firm, sent me this review:

Just had a “sneak peek” at Coders at Work, a book by Peter Seibel due out in September that features interviews with 15 of the industry’s most important programmers.

Through in-depth interviews with the likes of Peter Norvig, formerly VP of Search at Google and now head of Google Labs, and Brad Fitzpatrick, founder of Live Journal, as well as elder statesmen such as Donald Knuth and Ken Thompson, Peter manages to capture the thinking behind these coders’ approach to their work. The book documents an older generation of software – much of it still in use – that coexists with today’s new and different types of applications. It’s as though you have multiple generations of cars on the road at the same time and everything is in flux – the cars themselves, the quality of the roads, the skill of the drivers, and the very purpose for which cars are used.

This book can help programmers sort through all this complexity. It would be a good read for all those studying software today because it provides a fundamental understanding of the world they are about to enter. I highly recommend it.

I guess if everyone–from Slashdot readers to someone advising the highest levels of big-company corporate IT–likes Coders, it’ll probably do okay.

Coders at Work continues to get good reviews

August 24, 2009

Since its first review, Coders at Work has been reviewed five more times. Here are all the reviews to date:

Some highlights:

“Absolutely amazing! A page turner, just like Harry Potter for the technically minded.” —Tobias Svensson

“What books would you recommend to help a developer learn programming? For me, this book joins my short list.” —Marc Hedlund

“This book is dead sexy. When it comes out, you should definitely get a copy.” —Joseph F. Miklojcik III

“Superb book!” —Prakash Swaminathan

“Read it, because then you will know the greatest coding brains.” —Amit Shaw

First review of Coders at Work

August 9, 2009

My offer of sneak peeks at Coders at Work for would-be reviewers, has yielded its first fruit. I can only hope that all reviewers are as kind. (Though the author does accuse me of “subversive behavior” for dwelling on my subjects’ Emacs use.)

Meanwhile, I’m basically unable to function since I spend all my time waiting for the next update of my Amazon Sales Rank and checking to see whether the link to the review is being upvoted on Reddit. Ooops, gotta go; new Sales Rank coming in 48 seconds.

Coders at Work in the Endgame

May 20, 2009

It’s now almost exactly two years since I started work on my book Coders at Work and I’m finally in the endgame. Once I get all of my chapters turned in I’ll probably do some work on the website and blog a bit about behind the scenes, making of the book kind of stuff.