21 December 2008



Another dose of truth from XKCD.

20 December 2008

Seen after the ice storm

So, during the aftermath of the storm, as I was sitting in the parking lot of the local giant hardware store, I observed the following: guy parks his minivan in front of me. A few seconds later a woman in another minivan parks next to him. Her van has the kids in it; these two are obviously husband and wife or otherwise involved.

So, after the husband gets done talking to the wife, he finds a cart in the parking lot and proceeds to the back of his van. The entire time he was walking around I was thinking to myself "gosh, this guy has an 'I am doing something stupid' look on his face".

Anyways, the guy gets to the back of the van, opens the back, and starts unloading various heaters from the back of the van into the cart. He unloaded at least six electric and propane heaters from the back of the van into the cart. Each one appeared to be used, because he had to place each of them back into their original box. This whole operation took a while. He had so much stuff he had to make two trips!

I couldn't believe this guy. What a jerk! The previous night had been very cold, so this guy went to the hardware store, bought a bunch of heaters, USED THEM, and now he was returning them (he must have gotten his power back). In effect, this guy was making everything in the store more expensive for all of the people (like me) who had bothered to prepare for events like ice storms and power outages.

Gheesh.

Reflections on the Great Ice Storm of 2008

In our house we just lived for a week without electricity (from the power company, anyways). It was an interesting week.

I'd been preparing for this event for a while, in fact, ever since the last big ice storm a decade ago. I bought a generator a long time ago, and I have been faithfully maintaining it ever since. I've changed the oil in this thing every fall, and I keep things like gas, oil, flashlights, extension cords, batteries, etc. on hand. I also happen to keep a five-gallon bucket of water (covered) in the basement that I use for toilets. This bucket is always handy for the occasional but short power outages that we experience at our house.

The event that I had been planning for was a little different than what actually occurred. I had been preparing for a three-day power outage in the middle of winter. What actually happened was a nearly seven-day power outage.

As far as heat was concerned, I planned on running a couple of 1500 watt electric heaters during the outage. I thought that this would do the trick. It didn't take me too long to add another 1500 watt heater to my house during the outage. All of this "worked" in the sense that my pipes never froze, but things were somewhat chilly in our living space.

The generator also powered things like our refrigerator, freezer, lights, microwave, and rechargeable flashlights. We ran extension cords throughout the house to get the power where it was needed.

Notably, our generator was not hooked up to our house circuit panel, so we had no oil boiler, no hot water, and no water at all from the well pump. I bought the drinking water that we needed from the store, and as for water for toilets, I used water from a nearby river.

We survived the week just fine. We are especially appreciative to all of the workers who have been working really long shifts in order to restore the power to everybody.

Now that the event is over for us, I have had a chance to think about how I can make the next power outage a little more comfortable for us. The one thing that comes to mind is that we are definitely going to be calling an electrician to get an estimate on getting a generator cutoff switch installed to the house circuit panel. Yeah, yeah, I know how to hook the generator up to the house circuit panel, but I just decided for myself that I didn't want to go there this time around. During our next outage I'd really like to be able to use our well pump and oil boiler.

22 November 2008

“Human beings are very quirky and individualistic, and wonderfully idiosyncratic,” Hastings says. “And while I love that about human beings, it makes it hard to figure out what they like.”

More here.

19 November 2008

Typing is the most important class a CS student could ever take -- not!

Every once in a while somebody who I think is very smart makes a pretty boffo comment. Case in point: Jeff Atwood over at Coding Horror concurs with the idea:


....the most important computer science course a CS student could ever take [...is...] Typing 101.


Wow. I really disagree with this.

I have worked on any number of projects over the years where it has been obvious to me that one or more of the engineers who work on the project can type....and by this I mean TYPE A LOT. I've seen projects that were comprised of a lot of code....piles of code...reams of code...tons of code. I am not exaggerating even a little bit when I tell you that I even knew of a project that involved a lot of highly paid engineers doing a lot of typing ("typing" and not "coding"). The project manager of that project cheerfully reported the progress of the project by saying "at this rate, we've got another 7 months of typing ahead of us". In the end, by any reasonable measure, THAT project turned out to be a total failure. Why was this project a failure? Because the engineers weren't typing the right things!

For the record, I think that it is a good idea that programmers can type reasonably well, but on the other hand the things that I value in a programmer are problem solving abilities, communication abilities, technical skills, and overall professionalism. I've met programmers who have all of these abilities and aren't the greatest typists. Would I hire these programmers to implement a project? You bet I would.

If Jeff Atwood is correct and typing is the most important course that a CS student could ever take, computer science departments all over should immediately adjust their curriculum. Also, the process of hiring software engineers needs to change. First and foremost, every software engineer must change the format of their resume to prominently mention how many words per minute they can type. The first step to any software engineering interview must be changed to include a rigorous typing test.

I think that I'll end the reductio ad absurdum now...

For a software engineer, it's not how fast you type, it's what you type that is important.

28 October 2008

Beware of the Leopard (or, "In praise of #error")

One day at $DAYJOB, whilst all of the engineers were working on the large C++-based project that many people reading this blog have actually used, somebody discovered a bug. Actually, this was less of a bug and more of a software misconfiguration. The crux of the problem came down to the fact that in our software product, it was possible to build/compile the product with "Feature X", but "Feature X" had three mutually incompatible variants called "aaa", "bbb", and "ccc" -- and somebody had b0rked the configuration of these things in the software build. This problem resulted in a bit of lost time and money (and some teeth gnashing too...).

Well, somebody in management heard about this problem, and so it was decided that we needed to implement a process to ensure that this problem didn't occur again. Somehow, the result of all of this was committee meetings and from these somebody actually started constructing a software tool that would ensure that there were no misconfigurations in the software build.

I heard about the construction of this software tool right after it started to get implemented. When I heard the news, I was puzzled because nearly everything in the codebase was reasonably delimited with preprocessor conditional blocks. For example, in the code, everything related to "Feature X" looked like this:

#ifdef CONFIG_FEATURE_X

....code for feature x....

#endif


So, after hearing that somebody was actually going through the trouble of creating a tool to detect software misconfiguration in the build, I contacted the person who was in charge of that project and suggested that all that we really needed to do was add some code that looked like this in one of our header files:

   #ifdef CONFIG_FEATURE_X

   /* Feature X is enabled, so at least one variant must be selected. */
   #if    !defined(CONFIG_FEATURE_X_AAA) \
       && !defined(CONFIG_FEATURE_X_BBB) \
       && !defined(CONFIG_FEATURE_X_CCC)

   #error If you are going to enable Feature X, you must choose one of its variants:  AAA, BBB, or CCC.  \
          It is non-sensical to enable Feature X and not enable one of these variants.

   #endif


   /* No two variants may be selected at the same time. */
   #if    (defined(CONFIG_FEATURE_X_AAA) && defined(CONFIG_FEATURE_X_BBB)) \
       || (defined(CONFIG_FEATURE_X_AAA) && defined(CONFIG_FEATURE_X_CCC)) \
       || (defined(CONFIG_FEATURE_X_BBB) && defined(CONFIG_FEATURE_X_CCC))

   #error Feature X variants AAA, BBB, and CCC are all mutually incompatible!

   #endif

   #endif /* CONFIG_FEATURE_X */


I think that the thing that I really like about this code is that the people who maintained Feature X and its variants could make the decision as to what represented a misconfiguration -- not some third-party or some complex tool that would theoretically check for ALL misconfigurations. I also really like the fact that this solution solves this problem at nearly the cheapest place in the development process -- at compile time. From my understanding of the tool that was being written to solve this problem, this tool would not have this property.
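For what it's worth, here is a more compact sketch of the same idea. It relies on the fact that inside an #if expression, defined(NAME) evaluates to 1 or 0, so the variants can simply be counted. The two #define lines at the top, the enum, and the function selected_feature_x_variant() are hypothetical stand-ins that I made up for this illustration; they are not from the real codebase.

```c
/* Hypothetical build configuration for this sketch: Feature X, variant BBB. */
#define CONFIG_FEATURE_X
#define CONFIG_FEATURE_X_BBB

#ifdef CONFIG_FEATURE_X
/* defined(NAME) is 1 or 0 inside #if, so the sum counts enabled variants. */
#if (defined(CONFIG_FEATURE_X_AAA) \
     + defined(CONFIG_FEATURE_X_BBB) \
     + defined(CONFIG_FEATURE_X_CCC)) != 1
#error "Feature X requires exactly one of its variants: AAA, BBB, or CCC."
#endif
#endif /* CONFIG_FEATURE_X */

/* Hypothetical enum naming the possible build configurations. */
enum feature_x_variant {
    FEATURE_X_OFF,
    FEATURE_X_AAA,
    FEATURE_X_BBB,
    FEATURE_X_CCC
};

/* Reports which variant this translation unit was compiled with.
 * The #error check above guarantees that exactly one branch applies
 * whenever CONFIG_FEATURE_X is defined. */
enum feature_x_variant selected_feature_x_variant(void)
{
#if !defined(CONFIG_FEATURE_X)
    return FEATURE_X_OFF;
#elif defined(CONFIG_FEATURE_X_AAA)
    return FEATURE_X_AAA;
#elif defined(CONFIG_FEATURE_X_BBB)
    return FEATURE_X_BBB;
#else
    return FEATURE_X_CCC;
#endif
}
```

Adding a second variant #define at the top (or removing the only one) should stop the compile cold at the #error line, which is the whole point.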

The manager/engineer who I contacted with this suggestion liked my idea, and thought there were several places in the code that could be updated with code in this style. For some reason that I never really understood, the decision was made to continue on the development of the "misconfiguration detector tool" -- I transitioned to another project soon after and lost track of what was going on in that project.

Why am I telling you this story today? Because I just spent my morning trying to track down an esoteric problem in a well-known open-source product, and in the course of my work, somewhere in the reams and reams of compilation output, I saw a message like this:

   Warning:  this file is known to be mis-compiled with this version of compiler!


But notice this is merely a warning and not a hard error. This infuriates me. Am I really expected to crawl through hundreds of thousands of lines of compilation output to find "warnings" like this?

I am a big fan of clear error messages and not needlessly dragging programmers through painful debugging sessions.
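To make this concrete, here is a rough sketch of what I would rather see in a file like that. I don't know what check the project in question actually uses, so everything here is an assumption: KNOWN_BAD_GCC_MAJOR is a made-up placeholder (999 is deliberately not a real release), and build_guard_passed() exists only so that the snippet has something to compile. The __GNUC__ macro is the real one that gcc predefines with its major version number.

```c
/* Hypothetical: pretend that gcc major version 999 is known to
 * miscompile this file.  Substitute the real bad version(s). */
#define KNOWN_BAD_GCC_MAJOR 999

#if defined(__GNUC__) && (__GNUC__ == KNOWN_BAD_GCC_MAJOR)
/* A hard stop instead of a warning buried in the build output. */
#error "This file is known to be miscompiled by this compiler version."
#endif

/* Only reachable if the guard above did not fire. */
int build_guard_passed(void)
{
    return 1;
}
```

With a guard like this, nobody has to crawl through the compilation output; the build simply refuses to produce a broken binary.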

13 October 2008

It's All Text!

My new favorite Firefox extension is It's All Text!

Now I can edit text in text-fields with my preferred text editor: XEmacs.

Life is good.

02 October 2008

24 August 2008

Leadville 100 -- Wiens and Armstrong -- Incredible

This post has been deleted because I just don't feel like giving Mr. Armstrong any free publicity on this blog.

11 August 2008

Report from BlackHat 2008

I was pretty psyched that my employer sent me out to Black Hat last week. It was nice to hang out with a bunch of people who are enthusiastic about discussing computer security issues.

I wish I could have spent the entire week at the conference. Maybe next year. I got a good taste for the issues at hand, but I found myself yearning for more technical content.

The marquee talk that was presented at Black Hat was given by Dan Kaminsky on the subject of DNS security. I didn't actually attend this talk because it was mobbed and I am already very familiar with this issue. Basically, the problem Kaminsky has brought to light has to do with the low-level details of how the DNS protocol works. A sufficiently skilled attacker can poison a DNS server using faults in how the DNS protocol is specified and implemented. Kaminsky presented a significant new attack in this space.

After having some time to reflect on this attack, I am struck by how similar Kaminsky's attack is to a previous attack -- the attack first documented by Morris and made famous by Mitnick (who attacked Shimomura). This style of attack has been well-understood for decades now. When you get down to it, if an attacker decides to attack a protocol that is protected with easily guessable sequence numbers (or else the attacker can flood the host that he/she wishes to attack), the security of the protocol will soon be compromised.
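Some back-of-the-envelope arithmetic shows why guessable identifiers are such a problem. The sketch below is a deliberately simplified model of my own, not Kaminsky's actual analysis: it assumes the attacker sends some number of distinct forged replies per race, that the identifier is uniformly random, and that every race is independent. The function name and parameters are made up for this illustration. The one hard fact in here is that the DNS transaction ID is 16 bits wide; the 32-bit figure in the usage note is just an illustrative "what if there were more randomness" case.

```c
/* Chance that at least one forged reply is accepted.
 *
 *   id_bits  - number of unpredictable bits the attacker must guess
 *   guesses  - distinct forged replies sent during one race
 *   races    - number of independent races the attacker gets to run
 *
 * Simplified model: uniform random identifier, independent races. */
double spoof_success_probability(unsigned id_bits, double guesses,
                                 unsigned races)
{
    double space = 1.0;          /* size of the identifier space, 2^id_bits */
    double p_single;             /* chance of winning one race */
    double p_all_fail = 1.0;     /* chance of losing every race */
    unsigned i;

    for (i = 0; i < id_bits; i++)
        space *= 2.0;

    p_single = (guesses >= space) ? 1.0 : guesses / space;

    for (i = 0; i < races; i++)
        p_all_fail *= 1.0 - p_single;

    return 1.0 - p_all_fail;
}
```

Under these assumptions, 100 guesses per race against a 16-bit identifier wins a single race only about 0.15% of the time, but an attacker who can run 1000 races succeeds most of the time. Calling the function with 32 bits instead of 16 makes the same effort close to hopeless, which is why the amount of unpredictability matters so much.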

Look how far we (haven't) come...

26 May 2008

I am a fan of Programming by Contract and gcc's -Wcast-qual

Late one Friday afternoon at $DAYJOB I was nearly finished implementing a new feature for the product that I worked on. This was the culmination of several long days' effort, and I was looking forward to finishing my task and going home for the weekend.

As I was hooking everything up to enable the new feature, I spotted a bug. This was a strange bug, but I felt confident that I'd find it quickly. I called my wife and let her know that I'd be a little late for dinner.

Like I said, this was a strange bug. I noticed this bug because the unit test I was cobbling together acted strangely in the corner case that I was trying to get right.

Basically, the problem came down to the fact that at some point in time in my C program's execution, I had a (char *) variable that pointed to a particular string, but, later in the program's execution, for some unknown reason the contents of the string changed. This was very unexpected, and in fact this change corrupted a larger data-structure that my code was maintaining.

So, I looked at the code that maintained the (char *) variable with all of the concentration that I could muster on a Friday afternoon. This exercise proved to be fruitless -- I soon concluded that the code that I was looking at was correct.

I called my wife and told her that I was going to be a bit later...

Since the code seemed to be correct, I decided to pull out the big guns -- I decided to run the program through the memory debugger. I guessed that there might have been an improper memory access in the code, and this was the thing that was corrupting my string.

Fire up $MEMORY_DEBUGGER. Instrument. Wait.... Wait some more.... Run.

Running $MEMORY_DEBUGGER yielded absolutely nothing: no bad memory accesses.

Now I was getting irked. I called my wife and told her that I wasn't sure when I would be home. Luckily we were having leftovers that night...

So, now I focused on the problem again:

  1. my string was getting modified strangely.
  2. the code appeared to be correct.
  3. there didn't appear to be any memory access errors in the code.

At this point, I did what I probably should have done earlier: I fired up the debugger and added a watchpoint to the contents of the string. When I re-ran my program, I had my smoking gun: the string was being changed by some library code written by somebody else at $DAYJOB.

I was still a little bit confused though, because, like I said, the code that I had looked at was technically correct. But I hadn't looked at the library code...

The library code looked like this:


void do_something(const char *s)
{
    char *s2 = (char *)s;
    s2[0] = 'a';
}


I am hand-waving a little bit here, because I can't remember the exact circumstances. All I really remember is that the situation was a lot more complicated than this code snippet, with many levels of indirection.

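When I saw this, especially the first line of do_something(), the problem was obvious: whoever wrote this function broke a fundamental rule of C -- you are not allowed to modify the thing being pointed at through a (const T *) pointer, and casting the "const" away just to write through the pointer breaks the promise that the function signature makes to its callers.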

So, I figured out who wrote this function, sent $COWORKER an email asking them to please fix their code to adhere to the rules, and then I packed up and went home. I couldn't check my change in with this bug still in the system, so I decided to wait until Monday.

You might think that the conclusion to this story might be boring, cut-and-dry, etc., but it wasn't for me.

Monday morning came. My $COWORKER who wrote the buggy function read my email and then responded via email. To sum up his email: (1) he was not going to fix this function and (2) the problem was mine because (paraphrasing) "'const' does not mean what the C standards bodies have defined it to mean; instead, 'const' means what they defined it to be at $SOME_PREVIOUS_COMPANY_THAT_HE_WORKED_AT".

I was flabbergasted at this response, so I went over to talk with $COWORKER. He amazed me with his tenacity. There was no line of argument that I could employ that would change his deeply held belief here. I never really could pin down an exact definition of what "const" meant at his previous company. He did further clarify his position here by telling me that, in his experience, "most programmers are too stupid to know what the standards bodies say about 'const'". I protested that I thought that I understood pretty clearly what "const" meant, and he agreed with this, but he held fast to his "programmers are too stupid" point.

At this point I even pointed out that GCC had a "-Wcast-qual" flag that would catch errors like this.

"So what?" he responded.

"Do you think that they would add this check into the compiler if it wasn't, like, important?" I asked.

"I don't care.", he responded.

"Do you understand that I stayed late on Friday night because of this bug?" I asked.

"That's unfortunate.", he responded.

We continued this fun interplay for a few minutes, but I eventually had to give up. Clearly, he wasn't going to change his code, and I simply could not fix all of the places in the codebase that used "const" in this non-standard way.

The maddening thing for me here was that $COWORKER was a very smart engineer, and he certainly understood the concept of Programming by Contract, but he was basically asking me to enter into a contract that said "no matter what other legal mumbo-jumbo is in this document, it is OK if my code does anything whatsoever, even if it is wrong or goes against the spirit of everything else in this contract". This isn't a very useful or meaningful contract.

I eventually learned that anytime I saw the keyword "const" in certain subsystems in the code, I should attribute no meaning to this keyword. Seriously, in those subsystems, "const" meant whatever the author meant on that day, and tomorrow the meaning might (and in fact did) change. I started to write my code in a very defensive manner, making copies of important data-structures and sending these copies off to these Alice-in-const-Wonderland subsystems.
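The defensive style looked roughly like the sketch below. I am reconstructing this from memory, so treat all of it as an assumption: untrusted_do_something() is a hypothetical stand-in for the library routine, and call_defensively() is a name I made up for this post.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library routine that claims "const"
 * but scribbles on its argument anyway. */
void untrusted_do_something(const char *s)
{
    char *s2 = (char *)s;   /* the cast that breaks the contract */
    s2[0] = 'a';
}

/* Hands the routine a private copy so the caller's string survives.
 * Returns 0 on success, -1 if the copy could not be allocated. */
int call_defensively(const char *s)
{
    size_t len = strlen(s) + 1;      /* include the terminating NUL */
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;

    memcpy(copy, s, len);
    untrusted_do_something(copy);    /* any damage lands on the copy */
    free(copy);
    return 0;
}
```

The cost is an allocation and a copy on every call, which is exactly the kind of lost efficiency that a trustworthy "const" would have made unnecessary.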

I did have enough control over the system to ensure that all of my code compiled with gcc's "-Wcast-qual", and this did keep me out of a bit of trouble a couple of times. I knew that my code was bulletproof and correct.
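For the record, here is a small sketch of what I mean by code that stays clean under that flag. The two functions are hypothetical examples written for this post, not code from $DAYJOB. The idea is simply that a function which modifies its argument says so in its signature, and a function which takes a (const char *) never casts the qualifier away, so compiling with something like "gcc -Wcast-qual -c file.c" has nothing to complain about.

```c
#include <stddef.h>

/* Mutating version with an honest signature: the caller can see
 * from the (char *) parameter that the buffer may be changed. */
void set_first_char(char *s, char c)
{
    if (s != NULL && s[0] != '\0')
        s[0] = c;
}

/* Read-only version: keeps the promise that "const" makes.
 * There is no cast here, so -Wcast-qual stays quiet. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}
```

Had the library function been written like set_first_char(), my call site would have needed a (char *) and I would have known up front that my string was fair game.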

One day after I had given up on getting $COWORKER's code to be "const correct" I was struck with an idea, so I made a modest proposal to $COWORKER: since he seemed to want to use a keyword to attribute some meaning to variables (whatever meaning "const" meant at $SOME_PREVIOUS_COMPANY_THAT_HE_WORKED_AT), I suggested that we could migrate his code away from using "const" and we could instead define a new keyword for his code with the C preprocessor. I suggested that we could do something like this:


/* This is some header file */


/* For an exact definition of this keyword, please ask $COWORKER */
#define MEANINGLESS_CONST


...and then we could have updated the original function that caused me to stay late at work like so:


void do_something(MEANINGLESS_CONST char *s)
{
    s[0] = 'a'; /* much more streamlined!!! */
}


I even offered to make all of the changes for him...

So, now our system would have been improved! We'd still have the use of the "const" keyword as the standards bodies had intended, but we'd also have MEANINGLESS_CONST too, and our code would be more streamlined and definitely a lot more understandable. We'd even be able to use "gcc -Wcast-qual" too!

Anyways, I presented this idea to $COWORKER. I give him credit for sticking to his guns -- "no" was his simple flat response to my proposal.

I wish I had some neat way to wrap up this story, but I don't. The Alice-in-const-Wonderland code continued to exist in the product until I left. This probably cost the company a bit of money in terms of bugs and lost efficiency, but that's the way things work sometimes. I just had to learn to live with this behavior in the code.

To sum things up, I'm a big fan of Programming by Contract and "gcc -Wcast-qual". I'm also a big fan of sticking to reasonable standards, and assuming that my co-workers are smart until they go out of their way to prove otherwise.

09 May 2008

One of the best sports stories I have ever heard


Despite the awesomeness of playoff hockey, the best story that I've heard all week is this one:

Holtman and shortstop Liz Wallace lifted Tucholsky off the ground and supported her weight between them as they began a slow trip around the bases, stopping at each one so Tucholsky's left foot could secure her passage onward. Even with Tucholsky feeling the pain of what trainers subsequently came to believe was a torn ACL (she was scheduled for tests to confirm the injury on Monday), the surreal quality of perhaps the longest and most crowded home run trot in the game's history hit all three players.
This is simply an awesome story, one that I will never forget.

06 May 2008

Tour de Cure -- success

The ride was a success. Seventy-five miles in semi-tough conditions -- lots of rain and it wasn't warm either. A friend of mine even joined me for the ride.

Of course, to get the whole writeup, you'd have to pledge money, which I believe you can still do, here.

From what I read in the newspapers, this event was supposed to raise over $200k. Nice!

30 April 2008

TopoZone transmogrifies into a less-useful thing

One of my favorite sites on the web was TopoZone dot Com but now I have learned that they have changed their business model completely -- now they want $50 a year for me to use the site. I have no problem paying for something that is useful but I am struggling to see how a subscription would be money well spent.

I am mostly interested in NH maps. When I tried out the "new" Trails dot Com (which purchased TopoZone), the sample NH trail map that I ended up with turned out to be content from a book that I already own...

I am fairly certain that everything that Trails dot Com would give me for my subscription could also be easily obtained from the books and maps that I already own. I think that a lot of potential customers will reach similar conclusions.

Thus, I think that Trails dot Com made a mistake, and that they are in for a pretty rough ride.

Oh, by the way, if you are looking for a good NH topo map, I think that the ones from Map Adventures are very very nice.

23 April 2008

Notes on accessing Subversion repositories via a custom SSH tunnel

I just spent a little while lost in the weeds as I was updating my server to support my custom svn+ssh setup. Here are my notes, just to help others along. I am trying to keep things simple, so I am running svnserve. I am also running sshd on a non-standard port -- this fact perhaps contributed to the way that I set all of this up. I want my setup to be simple to use on a day-to-day basis.

1. Suppose we have two machines, CLIENT and SERVER.

2. On CLIENT, generate a new ssh-key (KEY) and load it via ssh-agent.

3. I assume that on SERVER you have created a "subversion" user and created the repository in /home/subversion/repo .

4. On CLIENT, in your $HOME/.subversion/config file add the following to the [tunnels] stanza:


custssh = ssh -p your-port-number -l subversion \
-i /your/home/directory/.ssh/id_your_new_KEY


5. On SERVER add your new key to the ~subversion/.ssh/authorized_keys file, but add it in a special way (the entry is wrapped here for readability; in the actual file it must all be on a single line):


command="svnserve --root=/home/subversion \
--tunnel-user=your-loginid \
--tunnel",no-port-forwarding,\
no-agent-forwarding,no-X11-forwarding,no-pty ssh-dss #$#$#$#$#key-stuff-goes-here-lBB you@somedomain.org


Tip: to prevent you from going off into the weeds, I strongly suggest that you familiarize yourself with the format of this file and ensure that no stray characters end up in this file...

6. And now you can access your Subversion repository like so:


svn co svn+custssh://SERVER/repo/trunk/top-secret-project


This is a very handy way to have things setup.

22 April 2008

Happy Earth Day!

We get our milk delivered to us, the old fashioned way, in glass jars via a milkman. The milk comes from Sherman Farm and gets delivered by Catamount Farm. Honestly, it is a little more expensive, but we really enjoy the overall experience. There is nothing better than a cold glass of milk from a glass jar!

Anyways, our milk arrived today, and along with the delivery itself, we got a sticker that read:

Did you know? By recycling glass milk bottles, Catamount Farm customers have helped save an estimated 40,000lbs of plastic over the past 6 years!

It is Earth Day, and we find this to be very very cool.

16 April 2008

All I need is a programmer

This excellent blog posting reminds me of the time that one of the executives at $DAYJOB referred to programmers as "monkeys". It was clear to me at the time that he was simultaneously trying to express that we were all interchangeable and somehow he was trying to motivate us too.

His comment was actually wonderfully honest; in ten seconds I learned exactly what he thought of us and our work.

13 April 2008

The Human Footprint

We just got done watching The Human Footprint on TV. It was pretty interesting. The sheer magnitude of stuff that each of us consumes in a lifetime is pretty staggering.

We thought that some of the numbers cited by the show were probably accurate, but for other numbers and figures we were a little bit confused. For example, the show tells us that the average American consumes around 563 cans of soda per year. Yow! That's a huge amount, and significantly more than an order-of-magnitude more than what we consume.

The show was a little bit monotonous at times, but overall, it was a good reminder that anything we can do to reduce, reuse, or recycle is a very good thing. Actually, it is more than a "good thing" because if everybody on the planet consumed the way the average American does, we'd need four planet Earths for all of the resources that would be consumed. Clearly, this level of consumption isn't sustainable.

09 April 2008

Get up, boy! I'm not done with you yet...

Yay! Gary Roberts is back. He scored two goals tonight.

On nearly the last occasion we saw Roberts (age 41), he took Ben Eager (age 23) to task because of a cheap elbow Eager had thrown at one of Roberts' teammates. So, Roberts decided to teach Eager a lesson:



My favorite part of the fight was when Eager was just about to fall down and Roberts pulled him back up, almost as if he was saying "I'm not done with you quite yet!".

As for Eager, after getting beat up by a man that could have been his father, his team traded him soon after.

It's so nice to see a middle-aged man like Roberts be such a good....educator.

01 April 2008

Startup Life and Death -- April Fools

I have an odd story about April Fools jokes...or lack thereof.

One April 1st, I got to work at the startup that I was working at, did my usual morning ritual of checking the builds and answering my morning email. Then I decided to troll the Interweb to see what amusing new IETF RFC had come out that day.

Right at that moment our CEO called for an all-hands meeting, so I went to the meeting. The CEO came in and said "sorry, we're going out of business". I kept on waiting for him to scream "April Fools!" but he never did -- he was completely serious. The company's board had decided to go out of business the night before.

I spent the rest of the day packing my boxes and drinking beer that one of our field techs obtained from the local brewery. Eventually somebody in management came by and told us we really had to be leaving and somebody chimed in "whaddaya going to do -- fire us?". To be clear, this was said in jest -- most of the people left in the company at this point got along just fine. It was just an unfortunate situation. We'd all been through so much together and now it was ending for good.

I never did get to see any April Fools jokes that day. It's not like the day itself brings me down -- I've long since gotten over that place. I guess the day was just an odd reminder that there is never a dull moment in high-tech.

29 March 2008

VESA driver works for me with very old Dell Latitude C800 ATI M4 32MB

I wanted to upgrade my ancient Dell Latitude C800 from Fedora Core 4 to something newer. So, I decided to try Fedora 8. During the install the video was garbled so I did a text install. After everything installed, I tried to get X running. I didn't have a lot of success. Everything was weird and still garbled. The install itself detected the ATI card so I continued to work with the ATI driver.

It was late and I still wasn't having any success so I decided to try out Ubuntu 6.x (I happened to have the CD handy). This didn't work either, giving me basically the same problem as the Fedora 8 install. So, I decided to call it a night.

In the morning, in an attempt to get something to work, I tried out the VESA driver. I discovered that this worked fine. I was even able to configure things so that I would be able to display at 1450x1050.

So, my conclusion is that the newer Xorg ATI drivers don't work very well with my ancient hardware, but the VESA drivers work fine. We're not talking about high-performance hardware here, so this is good enough for me.

27 March 2008

Tour de Cure -- American Diabetes Association -- 4-may-2008

In honor of a few people I know who suffer from diabetes, I will be riding my bike 75 miles in this year's Tour de Cure for the American Diabetes Association. I believe that this is the fourteenth straight time that I have participated in this ride.

If you would like to support me in this endeavour, please visit my TdC page.

I have ridden the century TdC ride (actually ~107 miles) many many times in the past, but, to be honest, 107 miles this early in the season is pretty difficult for me to train for. It seems prudent to do the 75 mile ride so I am doing this.

This is a good cause, and I would appreciate any support.

26 March 2008

Multiple Function Return Points Considered Harmful

I prefer to see functions written in such a manner that there is one consistent return point. I prefer this:


int f(int x)
{
    int result;

    if (x < 3)
        result = 1;
    else
        result = 0;

    return result;
}


...over this:


int f(int x)
{
    if (x < 3)
        return 1;

    return 0;
}


This is a religious issue (to some extent). Clearly, a good optimizer is going to render the same code in either case, so the latter code isn't faster -- it is just shorter. I prefer my way because it goes along with my conservative coding style -- I prefer to make the code so simple that I can even understand it when I am tired.

By the way, I'm not so religious about this matter that I always follow my own advice. Every rule has exceptions.

However, I can think of one case in which I believe that my methodology (having a consistent return point) is a clear winner. I will illustrate this with a story.

....

One day at $DAYJOB, my $MANAGER informed me that I'd been assigned to work on a new project. I'd been assigned to add a $FEATURE to a $BIG_SUBSYSTEM in the product. I didn't know anything about this subsystem. $MANAGER told me that this was fine -- $COWORKER was the expert on this $BIG_SUBSYSTEM, I could use $COWORKER as a resource.

I'd never even heard of $COWORKER at this point. After exchanging a couple of terse emails with $COWORKER I figured out that $COWORKER worked really strange hours and actually worked in a cubicle a couple of rows away from me.

Later that day I managed to find $COWORKER in his cubicle, so I stopped by and tried to introduce myself:

Hi, I'm Kevin. I've been assigned by $MANAGER to work on $FEATURE in the $BIG_SUBSYSTEM. $MANAGER tells me that I can use you as a resource for this project. I'm trying to come up to speed with $BIG_SUBSYSTEM; I've started reading the available documentation, but could you perhaps give me an overview of $BIG_SUBSYSTEM so that I have a better idea of what is going on in this subsystem?

$COWORKER looked at me with disdain and said "I'm sure that you'll figure it out." Then $COWORKER put his headphones back on and turned back to his computer.

$COWORKER also managed to turn down my offer of a friendly handshake.

"Great...", I thought, "...$COWORKER is an unhelpful jerk". I knew what this meant too, because complaining to $MANAGER wasn't going to help me one single bit. I would have to familiarize myself with $BIG_SUBSYSTEM and implement $FEATURE on my own.

Weeks passed. I worked my ass off to come up to speed on $BIG_SUBSYSTEM. I barely had any interactions at all with $COWORKER. $MANAGER asked if $COWORKER was being helpful and I honestly answered "no, he seems to be working on his own work". "Oh, he must be busy" was the response.

Eventually, I was done. I tested my code and even showed it to $COWORKER. He made a bunch of comments that ranged from being somewhat helpful (things that I wished I had known before I started the project) to comments that just reflected his opinions about the code.

So, I tested one more time and checked my changes in.

There are four things that you need to know about my $DAYJOB before I continue: the project was written in C, it was very large, it was heavily multithreaded, and it made use of a huge number of branches in the source control system.

Soon after I checked my changes in $MANAGER told me that my changes were needed in a different branch in the source control system. So, I performed the arduous process of merging my changes into the new branch.

Let's just say that I performed this merge process into N different branches...

Each merge really was arduous. Due to the way that the software organization used the source control subsystem at $DAYJOB, it was very difficult to use the tools that came with the source control system to perform the merge. I soon concluded that the best way for me to do merges was via patch, ediff/Emacs, and a huge amount of attention to detail. Each merge took well over a day of solid effort, sometimes quite a bit more.

I repeated this merge several times, as needed. While I was doing all of this work, I had lots of time to think to myself "how could this be made better?" But more on that some other time...

Anyways, one day somebody in SQA contacted me and told me that he wasn't sure what was going wrong, but one of our internal software releases was dying strangely, and since I was the last person who made a major modification to the branch, he thought I might have something to do with this.

This was the start of the badness. As soon as the problem was found, $MANAGER was informed of the problem, and now $MANAGER was pretty insistent that I find the problem...and quickly. $COWORKER even came by and reminded me of the importance of finding the problem quickly. This was the most contact I had had with $COWORKER in months! Great...

So, under a bit of pressure, I started looking for the bug. $COWORKER started looking for the bug too, independently (of course). I felt some pressure to try to find the bug before $COWORKER, because technically the bug was probably mine.

Many hours passed. I had a difficult time trying to find this bug because I couldn't figure out what was different about my work on this branch versus all of the other branches that I had worked on. My work on all of those other branches worked fine and had passed SQA testing. This was a tough bug to isolate...

Eventually, $COWORKER found the bug. I was pretty unhappy. The bug was in my changes for $FEATURE, and $COWORKER was more than happy to rub my nose in the problem. The problem was a thread synchronization problem. My coworker pointed at the code and then at me and said "You need to pay more attention to details when you write code". My head was swimming. I ruefully noted that this was the most interaction with $COWORKER that I had ever had and that it had gone very badly. My code had a bug. I was responsible for the problem. Where did I go wrong?

...

I've been in the software engineering business a long time. Sometimes I make mistakes. Really, I try to learn from everything that I do. So, after the bugfix got checked in and everybody stopped freaking out, I tried to track down what went wrong.

A little while later, I figured out what went wrong.

When I originally implemented $FEATURE, I had to integrate my changes into the already large codebase. When I made my original implementation, my code changes mimicked the style that I found in the rest of the codebase -- of course.

Here is where we get back to the subject at hand: having a consistent place in each function where a function returns.

The codebase at $DAYJOB didn't adhere to my preferred pattern here. And, like I mentioned, it was heavily multithreaded (and this implies that it used a lot of synchronization primitives). So, part of my changes modified a function that looked like this:


void f(int x)
{
    LOCK(&mutex);

    switch (x) {

    case FOO:
        if (someFunc() == ERROR) {
            UNLOCK(&mutex);
            return;
        }
        /* do something else */
        break;


    /* HERE IS MY CODE */
    case BAR:
        if (someFunc() == ERROR) {
            UNLOCK(&mutex);
            return;
        }
        /* do something else */
        break;

    /* ...100 more cases... */


    }

    UNLOCK(&mutex);
}


But, when I merged my code into the new branch with patch, the patch miraculously managed to apply cleanly on this file, and the resulting/buggy file looked like this:

void f(int x)
{
    LOCK(&mutex);
    LOCK(&some_other_mutex);

    switch (x) {

    case FOO:
        if (someFunc() == ERROR) {
            UNLOCK(&some_other_mutex);
            UNLOCK(&mutex);
            return;
        }
        /* do something else */
        break;


    /* HERE IS MY CODE */
    case BAR:
        if (someFunc() == ERROR) {
            UNLOCK(&mutex);   /* BUG: some_other_mutex is still held */
            return;
        }
        /* do something else */
        break;

    /* ...100 more cases... */


    }

    UNLOCK(&some_other_mutex);
    UNLOCK(&mutex);
}

See the problem? In this new branch of code, which I had never worked with before, somebody had introduced a new mutex (some_other_mutex), and my code wasn't cleaning up properly.

This is where I want to make my point: this code could have been written a lot more cleanly, like this:

void f(int x)
{
    LOCK(&mutex);
    LOCK(&some_other_mutex);

    switch (x) {

    case FOO:
        if (someFunc() == ERROR) {
            ....
        }
        else {
            ....
        }
        break;


    /* HERE IS MY CODE */
    case BAR:
        if (someFunc() == ERROR) {
            ....
        }
        else {
            ....
        }
        break;

    /* ...100 more cases... */


    }

    UNLOCK(&some_other_mutex);
    UNLOCK(&mutex);
}

...and, if it were, not only would my changes have applied cleanly with patch, but we would have avoided a long, gory, multi-hour, "let's find the bug" session.
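For anyone who wants to play with the idea, here is a minimal, compilable sketch of the single-return-point version. Everything that is not in the snippets above is made up for illustration: the toy_mutex type and its counting LOCK/UNLOCK macros stand in for real synchronization primitives, fail_next is a test hook, the default case is added for completeness, and f() returns a status code instead of void so that the error path has something to report.

```c
#include <stdio.h>

/* Toy stand-ins for the LOCK/UNLOCK macros in the post. They only count
 * how many times each lock is held so that balance can be checked; real
 * code would call an actual mutex API here. */
typedef struct { int held; } toy_mutex;
#define LOCK(m)   ((m)->held++)
#define UNLOCK(m) ((m)->held--)

enum { FOO, BAR };            /* hypothetical case labels */
enum { OK = 0, ERROR = -1 };  /* hypothetical status codes */

static toy_mutex mutex, some_other_mutex;
static int fail_next;         /* hypothetical test hook: makes someFunc() fail */

static int someFunc(void) { return fail_next ? ERROR : OK; }

/* Every path through f() falls out of the switch and reaches the one
 * place where the locks are released. */
int f(int x)
{
    int result = OK;

    LOCK(&mutex);
    LOCK(&some_other_mutex);

    switch (x) {
    case FOO:
        if (someFunc() == ERROR)
            result = ERROR;
        else
            puts("FOO: do something else");
        break;

    case BAR:
        if (someFunc() == ERROR)
            result = ERROR;
        else
            puts("BAR: do something else");
        break;

    default:
        result = ERROR;
        break;
    }

    UNLOCK(&some_other_mutex);
    UNLOCK(&mutex);
    return result;
}
```

With this shape, somebody adding a third lock only has to touch the top and the bottom of the function; none of the hundred-odd cases needs to know that the lock exists.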

I never did get to tell $COWORKER about what the true root cause of the problem was here. As you can see, this is a long story, and $COWORKER never had any patience for me. In fact, $COWORKER probably thinks I am a moron to this day. But I was glad to figure out what the root cause of the problem was, and this is one of the reasons why I really try to write functions in such a way that there is a single consistent return point -- especially in multithreaded code that utilizes synchronization primitives.

Postscript: several months after this bug reared its ugly head, I found two places in $COWORKER's code that suffered from the same bug. I was tempted to treat $COWORKER as rudely as he had treated me, but in the end I just decided to be polite about it.

24 March 2008

Interesting article on TCP congestion control

Interesting article on the subject of fixing TCP's congestion control algorithm:

http://blogs.zdnet.com/Ou/?p=1078

A lot of reasonable things are presented here, at a very high level. The graphs are especially interesting too. From what I have read of SCTP, this very nice protocol also has the potential to suffer from this problem.

Interesting thought question: what do you suppose an ISP is going to do when a large number of its customers begin hammering on their network with RTP traffic?

....

On the other hand, here is something that is very funny and very unreasonable:



This was originally here: http://i7.tinypic.com/5z6vt4n.jpg

20 March 2008

The VLAN Koan

I wrote the following parody/koan one day to help a tech-writer friend of mine understand VLANs. He liked it. I hope you do too.

Goso said: ``When a packet goes out of its egress to the edge of the abyss, its header and its payload all pass through, but why can't the VLAN tag also pass?''

Kevin's comment: If anyone can open one eye at this point and say a word of layer-2 bridging, he or she is qualified to discuss the intricacies of ``one-armed routers'', and, not only that, he or she can save all sentient beings under them from broadcast storms caused by Windoze boxen. But if he or she cannot say such a word of true IEEE 802.1Q VLANs, he or she should turn back to their tag.

If the packet is transmitted, it will fall into the ether;
If it remains in the queue, it will exceed its TTL,
That little VLAN tag
Is a very strange thing.


(apologies to Mumon... (-: )

13 March 2008

A conversation with Eliza

M-x doctor RET

I am the psychotherapist. Please, describe your problems. Each time you are finished talking, type RET twice.

I am upset. Joseph Weizenbaum has passed away.

Weizenbaum? Hah! I would appreciate it if you would continue.

I think that the curse of being middle-aged is that you see all of the
greats that preceded you eventually pass away.

Is it because the curse of being middle aged is that I see all of the greats that preceded me eventually pass away that you came to me?

Yes. I mean, how could I argue with that bit of insight?

Does it bother you that could you argue with that bit of insight?

Nope.

Can you elaborate on that?

Yow!

Yow! Are we interactive yet?

...

12 March 2008

09 March 2008

Ack is my favorite new tool

Ack is my favorite new tool. I highly recommend it. It is even written in my favorite programming language -- you know, the one with the really supercharged and usable regexp library.

Over the years, I have written some handy tools in this area....like for example, my "txtfind" shell alias:

 
   # put this in your .bashrc/.kshrc/etc.
  txtfind () {
    if [ $# -eq 0 ] ; then
       txtfind .
    else
       perl -MFile::Find -e 'find(sub{ print "$File::Find::name\n" if (-f && -T); }, @ARGV);' "${@}"
    fi
  }


This alias takes an (optional) list of directory names and searches underneath them -- when a "text" file is found, the filename is printed. To a shell hacker like me, it is very common to type things like:
txtfind /etc | xargs grep 192.168.9.37
...if, for example, I wanted to figure out why a machine was configured to use some strange IP address. This beats typing something like:
find /etc -type f -print | xargs grep 192.168.9.37
...because, of course, this will likely cause grep to troll through some binary files. In the end, this might hose your terminal.

I even have shell aliases like "binfind" and "dostxtfind". I have another alias called txtfind0 that allows me to use it in this manner:
txtfind0 /usr | xargs -0 grep foo
...but with this new tool, ack, I'll be able to eliminate a lot of hassle by simply typing something like:
ack --type=text foo /usr
That's not to say that things like my txtfind0 alias are now obsolete. Let's say, for example, I wanted to find a file that contained both the phrases "weapons of" and "mass destruction" -- in this case it would be as easy as:
txtfind0 /usr/secret | \
xargs -0 perl -l -0777 -ne 'print $ARGV
if (/weapons of/i && /mass destruction/i)'
But, just the same, I am enthusiastic to have a great new tool in my arsenal. Ack has a bunch of other features that I haven't really touched on here (my favorite is its intelligent ability to skip files in .svn directories) -- I encourage everybody to check ack out.


06 March 2008

Development Tip: Multiple Build Areas

Here is a code development tip that I nearly always employ in any workplace. I have employed this strategy for years, and several of my colleagues have told me "wow! that's a really good idea!" so I thought that I would share this.

I always set up multiple build areas to go along with the source control system that I am using. At the very least, I always have a "-work" directory (where I work on my current task), but (and here is the important bit) I always have a second "-clean" build area. I never modify any files in the "-clean" area! Ever. The only thing that I ever do with the -clean area is (0) update this build area from source control, (1) run a build in this area and (2) run a regression test on this build area. Again, I never modify any files in this directory.

Having a "-clean" directory is terribly useful. For example, if I am making a big change that modifies ten files and my changes also depend on the addition of two files to the build tree, when I am done with my work, I will check in my changes (under my "-work" area), and then I will immediately update my "-clean" area to run a build and a regression test. If I somehow forgot to add those two source files to the source tree, the build will fail -- but I will immediately notice this. It is much better for me to notice this immediately than for my co-workers to notice it later.

If you are a professional software engineer, the problem that I have just cited here has probably plagued you, what? -- several dozen or hundred times in your career? Yes? How much of your time has been wasted due to this problem? If only everybody employed this technique.

Like I mentioned, I always set up multiple build areas. In fact, I usually have at least a half dozen build areas going at the same time. I usually have a "-clean" area going for every source code branch that I work with, and I usually have a build area going for every task that I work on as well. This latter use of build areas seems to be particularly useful: colleagues who were dead-set on creating a new source control branch to do their work have told me, after I explained my multiple-build-areas methodology to them, that this trick saved them a lot of grief. Let's not forget, every time your organization creates a new branch, this costs your organization time and money. Sometimes you need a new branch, but many times you do not. This trick costs a modest amount of disk space, and disk space is cheap. Branches are never cheap.

I have used this trick with dynamic views under ClearCase, static views under ClearCase, and directories under Subversion too. This trick can be used anywhere.


Update: yeah, yeah, yeah, I realize that folks who use a DVCS will probably look at this post quizzically. Let me issue the following reminder: not all shops use DVCS.

04 March 2008

A DIALOGUE WITH SARAH, AGED 3: IN WHICH IT IS SHOWN THAT IF YOUR DAD IS A CHEMISTRY PROFESSOR, ASKING “WHY” CAN BE DANGEROUS

Can you say ‘hydrophilic’?

Recursive Make Considered Harmful

One of the happiest days of my life was when I typed "make print" and I watched make invoke LaTeX and then my master's thesis started spewing out of my laser printer. Make is a dependable tool that, by definition, knows how to handle dependencies and is flexible enough to handle complex tasks (Towers of Hanoi, anybody?).

People who are fans of make like myself will probably like Recursive Make Considered Harmful.

03 March 2008

Debugging War Story 2

At one of the projects I worked on in the past I got to work on some protocol design and implementation. This was actually one of my favorite projects ever; it was a project where I had a lot of responsibility, I got to work with a lot of interesting people on some hard problems, and I got to work on a project that allowed me to be creative and technical at the same time.

Anyways, I was in charge of the protocol implementation. I am a very careful and a very conservative programmer, and a lot of the protocol implementation was coded in the style that I prefer.

The protocol itself was binary (out of necessity). One day as we continued the design of the protocol we concluded that we needed to add a 64-bit integer to one of the message fields. After agreeing on the design, I updated one of the structs that I had created to store the message fields:

#if defined(SOME_COMPILER) || defined(SOME_OTHER_COMPILER)
#pragma pack (1)
#endif
struct SomeMsgStruct {
    uint32_t field1;
    uint32_t field2;
    ....
    uint64_t fieldN; /* NEW FIELD */
}
#if defined(__GNUC__)
__attribute__ ((__packed__))
#endif
;
#if defined(SOME_COMPILER) || defined(SOME_OTHER_COMPILER)
#pragma pack (0)
#endif


The key point you need to understand here is that I wanted to make sure this struct was packed, with none of the padding that the C standard otherwise allows a compiler to insert. You also need to understand that I was supporting several different compilers and target architectures, some of which I did not have easy access to.

So, in my conservative programming style, I also added the following code to one of the protocol's initialization functions:


struct SomeMsgStruct x;
assert(sizeof(x) == (sizeof(x.field1) + sizeof(x.field2) + sizeof(x.fieldN)));


After testing my code, I checked it in, announced my changes, and then moved on to my next tasks.

....

Several days later, one of my colleagues, who, shall we say, was not a detail-oriented engineer, complained to me that something was wrong with the protocol stack and that there was garbage being transmitted on the wire. I knew this problem had "wild goose chase" written all over it, but I didn't have a magic wand to fix this problem. So, at around 2pm, we sat down to debug the problem.

We analyzed logfiles. We looked at protocol traces. I tried to reproduce the problem on my setup, but the problem only reared its ugly head on my colleague's setup. My colleague's setup included a different target processor than what I had in my setup, so I was quickly forced to try to understand this problem on my colleague's foreign setup (which seemed to include a very tedious compile/link/load-the-binary-onto-the-target phase).

Eventually, in one of the logfiles, I noticed that something seemed to be wrong with "struct SomeMsgStruct". Ah! So I had my colleague add this to the code:

printf("sizeof SomeMsgStruct: %lu\n", (unsigned long) sizeof(struct SomeMsgStruct));

And, sure enough, the output was not what I was expecting.

Now I was really confused. At around 7:30pm, I wondered aloud to my colleague "How could this possibly be happening?! How could the compiler be doing this? I even put an assert() in the code to make sure that everything was right!".

At this point my co-worker blurted out:

assert()? Oh, what's that? When I first started working with your new code this morning, I kept on getting this `assertion failed' message. But I just wanted to get the code going, so I commented out that pesky line of code.

After this revelation, I excused myself from the noisy lab where I had just spent the entire afternoon, went outside to the parking lot, and had a good scream. After I composed myself, I went back inside and re-wrote the code so that the structure would be packed correctly on my colleague's target architecture. I also told my colleague, in no uncertain terms, to never modify my code again and to absolutely positively never ever remove an assertion from the code again unless he knew what he was doing and was prepared to deal with the consequences.

I am saddened to inform you, kind reader, that this wasn't the last time in my life that a colleague removed an assertion from my code. I guess that I am better able to deal with this now, but I am no less surprised each time I see it.
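In hindsight, one way to make a check like this harder to comment out is to let the compiler enforce it, so that the build fails instead of the program. What follows is only a sketch, not what the original code did: it assumes a compiler that accepts C11's _Static_assert and the GCC/Clang packed attribute, and struct ExampleMsg is a made-up three-field stand-in for the real (much longer) message struct.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical three-field message; the real struct had more fields. */
struct ExampleMsg {
    uint32_t field1;
    uint32_t field2;
    uint64_t fieldN;
} __attribute__((__packed__));   /* GCC/Clang spelling; an assumption */

/* Compile-time checks: the build stops here if any padding sneaks in. */
_Static_assert(sizeof(struct ExampleMsg) ==
                   sizeof(uint32_t) + sizeof(uint32_t) + sizeof(uint64_t),
               "ExampleMsg must not contain padding");
_Static_assert(offsetof(struct ExampleMsg, fieldN) == 8,
               "fieldN must start right after field2");

/* The runtime check from the story, kept as a second line of defense
 * for compilers that lack static assertions. */
void check_msg_layout(void)
{
    struct ExampleMsg x;
    assert(sizeof(x) ==
           sizeof(x.field1) + sizeof(x.field2) + sizeof(x.fieldN));
}
```

Of course, a sufficiently determined colleague can comment this out too, but at least he has to do it in order to get a build at all.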

02 March 2008

Tom and Atticus

I am a fan of Tom and Atticus. Maybe you would like them too.

01 March 2008

Gear Review: Rudy Project Horus Cycling Eyeglasses

My old cycling glasses (a cheapo pair of prescription sunglasses) died after a decade of abuse. I have been doing more and more cycling lately, including some interesting night rides. Because I have had several incidents over the years in which things have pinged off of my eyeglasses as I have been riding, and because ${employer} was chipping in in terms of employee benefits, I decided to buy something that would last a long time.

The thing that complicated this purchase is that I wear prescription lenses. I do not wear contacts, and I am not interested in Lasik. In fact, I like wearing glasses.

Again, my constraints were: (1) must be bombproof, (2) must be prescription-friendly, and (3) would be really nice if I could use these in wildly varying light conditions (bright sun to glare/fog to complete darkness).

It turns out that a product that satisfies all of these constraints is difficult to find. I definitely couldn't find anything locally.

Eventually, I ended up at a website called www DOT bicyclerx DOT com . I talked to a sales guy there on the phone and eventually I ordered a set of Rudy Project Horus cycling eyeglasses. I ordered these with two sets of detachable lenses, one clear and one tinted. The tinted lenses are polycarbonate, which is an upgrade. How much? Are you sitting down? $342. This is waaaay more than I wanted to spend, but again, my ${employer} paid for most of this. I also rationalized this by thinking that these would last a long time.

So, what is my review? These glasses are nice, but certainly not $342 nice. These glasses have one almost fatal flaw -- the lenses sometimes pop out.

I sometimes ride in cold weather, and when I do I wear a hat. There is something about the added resistance of wearing a hat on your head that occasionally doesn't interact well with the motion of putting these glasses on your head. I have been in the following situation twice: in complete darkness, with my bicycle, looking for one of my clear lenses with only the light from my bicycle light to help me. This sucks. In both cases I was able to find my lens, but this was a bad situation to be in.

I emailed Bicyclerx and Rudy Project telling them of my experiences. The salesman from Bicyclerx offered to swap frames in hopes that this would help, but I declined -- I'm just certain that this wouldn't help. I even suggested in my email how they might make the lens mounting mechanism more reliable but this generated no response. Whoever answers email at info@rudyproject.com didn't think that my email deserved a reply.

I am happy that I have my new glasses. Eye protection is very important on the bike. I just don't think that these glasses are worth $342, not with the flaw that I have mentioned.

I have probably made these glasses quite a bit more reliable with the following trivial modification: I placed a tiny strip of clear package sealing tape on the bottom edge of the glasses. I haven't had any problems with these glasses since I made this modification.

Oh well. That's my review.

Debugging War Story

One day, one of my colleagues updated me about one of the problems that he was trying to fix in one of the older products that he maintained.

My colleague informed me:

We can't fix the problem because we can't even produce a build with the current codebase that doesn't crash instantly on the board. Something in the code changed. I've tracked it down; it is a compiler bug. We'll have to call the compiler vendor.

You have to realize, this problem report pushed so many wrong buttons in my engineer's brain that my head immediately started to hurt. I got a cup of coffee and prepared for battle.

When I got back to my colleague's desk I asked "when was the last time you produced a working build for this product?". Let's just say that the answer was "a lot longer than 6 months and many code re-orgs ago".

Great. The pain in my head became a dull throb.

I thought for a moment and then asked "You mentioned that you had tracked this down to being a compiler bug -- how do you know this?"

A few minutes later my colleague was showing me the assembly language output generated by the compiler. I was a bit out of my element here; I was not familiar with the target processor or its assembly language.

"So, what exactly is the bug here?" I asked. My co-worker explained to me that the compiler was dealing with some code that was working with uint32_t values, but in one particular case it just decided to deal with a uint32_t value using the processor's 16-bit instructions. So, a value in a register was getting "shaved", and this was the root cause of the fatal error the product was experiencing.

Again, I was not familiar with the target processor, but I did manage to look through a reference book on my colleague's desk and I did verify that, sure enough, the assembly language output was using 16-bit instructions in a sea of other code that treated the value properly as a 32-bit value.

At this point I learned a little bit more about the compiler. It wasn't GCC -- this compiler was provided by the chip vendor. The whole compiler seemed to be tightly integrated to the vendor's IDE, some win32 app that seemed a little flaky at best. I'd never used this compiler before in my life.

At this point I had two conflicting thoughts going on in my brain: (1) my co-worker was telling me that there was a compiler bug and (2) I hadn't seen an actual compiler bug in a C compiler in over a decade, especially for code as simple as this.

So, I decided to look at the C code in question a little more carefully. It turned out that the problem description of "the compiler is generating code that uses 16-bit instructions to work with 32-bit values" was a bit of an oversimplification; rather, the problem could more accurately be described as "the compiler was emitting 16-bit instructions to move a 32-bit return value (returned from a function call) off of the stack". Let's call the function in question foo().

Oh. I was starting to get a hunch about the problem.

"Is there a prototype for this function that returns a uint32_t?" I asked my colleague. "Yes" was his response. Sure enough, he showed me the prototype in a header file. Damn, this was a minor setback to my hunch. It looked like this in the code, of course:
extern uint32_t foo(uint32_t some_param);
So, at this point I directed my colleague to utilize one of my favorite debugging techniques -- I asked him to run the compiler on the source file in question, but to only run the C preprocessor on the file. This is usually as simple as invoking the compiler like "cc -E" or "gcc -E". After a few minutes of futzing around with the win32 IDE that controlled the compiler, we were eventually able to generate the preprocessed output, all dumped to a file.

As soon as we generated the file, I had my smoking gun.

We imported the file into a text editor and I immediately asked my colleague to look for "foo" in the file. Sure enough, the first occurrence of this string in the file was at the place where this function was invoked. Let me be really clear here: yes, there was a prototype for this function, and this existed in some header file, but in the .c file that we were looking at, this header was never #included!

I asked my colleague one more question, but I knew what the response would be before I even asked:

"What size are ints on this processor?"

"16 bits." was his response.

I started doing a little jig in his office....problem solved!

There was no compiler bug. The problem was that the compiler was being asked to generate some code to invoke a function called foo() but it had never heard of that function before. But this is C, and (under the C89 rules, at least) this is legal. So, the compiler generated the code to pop the return value off of the stack using the default that C uses -- int -- and on this particular target, ints were 16 bits wide.
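The effect is easy to simulate on a desktop machine. The sketch below does not reproduce the implicit declaration itself (calling through a mismatched declaration is undefined behavior, and what really happens depends on the target's calling convention); it just mimics the "shaving" with an explicit cast. The body of foo() and the name call_foo_as_16_bit() are invented for the illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real foo(): returns a value that needs
 * more than 16 bits to represent. */
uint32_t foo(uint32_t some_param)
{
    return some_param + 0x00010000u;
}

/* Simulates a caller that believes foo() returns a 16-bit int: only the
 * low 16 bits of the result survive. */
uint32_t call_foo_as_16_bit(uint32_t some_param)
{
    uint16_t shaved = (uint16_t)foo(some_param);
    return shaved;
}
```

Here foo(0x1234) really returns 0x00011234, but the caller that thinks it is dealing with a 16-bit value ends up with 0x1234. With GCC or Clang, -Wall includes -Wimplicit-function-declaration, which points straight at a call like this; many current compilers refuse to compile it at all.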

What are the lessons from all of this? I would humbly suggest that there are three:

1: Quality code is built in an environment in which compiler warnings are copiously enabled and paid attention to.

2: If you have a product and you're not building and testing the build output frequently, you're doing something wrong.

3: Occasionally, it is handy to have an engineer who can debug issues like these on staff...