My wife was working on a project where in some cases it was taking several minutes for a web page to appear. The project is pretty complicated and involves many pieces. I remembered that I had a book which talked about performance tuning so I pulled down my copy of Programming Pearls from the bookshelf. While it’s more geared towards writing code than integrating pieces together (which is closer to her project), the methodology still applies. Most of what I was able to cite from the book ends up sounding like common sense but is a good reminder: if you need a small speedup, consider all the levels and work at the one which will give the most gain for the least effort; if you need a large speedup, work at many levels since in some cases those speedups can multiply.
I did a little research so I could write this post about the book and to my surprise, a second edition has been released. While I haven’t had a chance to take a look at the new edition, I’m definitely going to seek it out. Although it doesn’t appear in the actual book, Jon posted a section from a draft preface which explains why he wrote the second edition; from that you can get a feel for his style.
When I first read the book, the chapter about estimation (”The Back of the Envelope”, Chapter 7 in the second edition) resonated with me. I remember as a child going through the toy store with my uncle while he bought gifts for my cousins, and he would ask me to keep track of how much he was spending. I spent a lot of effort adding down to the penny, and one year took a calculator with me. Now I do generous rounding and still get close enough that I can tell when something is wrong. OK, I admit that when we have one of those “$5 off $25″ coupons, I round down very generously and we end up spending closer to $35 than $25, but that beats having to go back and pick up more items.
Estimating applies to more than totaling bills. My son seemed mortified that the people around us were listening so intently when I was having him work through how many people were ahead of us waiting for a ride at Disneyland.
But I digress. This kind of estimation gets applied many times around here, from determining how much disk space will be used by a service to how early one should start when sending large updates to hosts in order to be finished by a certain time. No exact math is needed for many of these calculations, just an estimation with an appropriate safety factor.
Another concept one picks up from the book is the idea that it’s better to test early and test often. Early chapters examine sorting algorithms, and show how using tools such as assertions can help catch both coding and logic errors, especially when one tests small sections and has debugging output. The follow-up book, More Programming Pearls, formalizes this concept as scaffolding.
Speaking of More Programming Pearls, that book includes the priceless chapter “Bumper-Sticker Computer Science” with sayings that are serious (”Testing can show the presence of bugs, but not their absence”, Edsger Dijkstra), philosophical (”Good judgement comes from experience, and experience comes from bad judgement”, Fred Brooks), and humorous (”When you turn an ordinary page of code into just a handful of instructions for speed, expand the comments to keep the number of course lines constant”, Mike Morton).
Coming full circle to performance, the book talks more about the different levels of tuning, and many of them involve thinking a little “outside the box”. Many of these techniques are done for you by the compiler nowadays, such as loop unrolling, but the big wins still involve finding the hot spots of the code and tuning them, especially if you can reduce the “big-O” of an algorithm. Changing from code which runs in O(N2) (proportional to the square of the number of data input) to O(N log N), or even better O(N), isn’t easy, but is a win that scales.
Changing an algorithm like that may seem like a lot of work for just a few lines, but as Tom Cargill says (from More Programming Pearls), “The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time.” Too true!





