kdc-blog: code


26 January 2011

ascii2keyboardscancode

Here is a short, goofy script that others might find useful: ascii2keyboardscancode.

This is a quick hack that I threw together for a specific and cool little project I have been working on. In very high-level terms, it is possible to use this script to drive VirtualBox's "headless" interface through this command:

    VBoxManage controlvm some-machine keyboardputscancode ...
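The script itself is not reproduced on this page, so what follows is only a rough sketch in C of the general idea, not the author's code. It assumes the common PS/2 "set 1" scancodes, covers just lowercase letters, digits, space and Enter, and the names make_code and text_to_scancodes are made up for illustration. Each key is emitted as a press code followed by a release code (the press code with the high bit set).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return the assumed "set 1" press code for c, or 0 if this sketch
   does not cover the character. */
static unsigned char make_code(char c)
{
    static const char row1[]   = "qwertyuiop";  /* 0x10..0x19 */
    static const char row2[]   = "asdfghjkl";   /* 0x1e..0x26 */
    static const char row3[]   = "zxcvbnm";     /* 0x2c..0x32 */
    static const char digits[] = "1234567890";  /* 0x02..0x0b */
    const char *p;

    if (c == '\0') return 0;   /* strchr would match the terminator */
    if ((p = strchr(row1, c)) != NULL)   return (unsigned char)(0x10 + (p - row1));
    if ((p = strchr(row2, c)) != NULL)   return (unsigned char)(0x1e + (p - row2));
    if ((p = strchr(row3, c)) != NULL)   return (unsigned char)(0x2c + (p - row3));
    if ((p = strchr(digits, c)) != NULL) return (unsigned char)(0x02 + (p - digits));
    if (c == ' ')  return 0x39;
    if (c == '\n') return 0x1c;   /* Enter */
    return 0;
}

/* Write "press release press release ..." as space-separated hex into out.
   Returns 0 on success, -1 on an unsupported character or a short buffer. */
int text_to_scancodes(const char *text, char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0) return -1;
    out[0] = '\0';

    for (; *text != '\0'; text++) {
        unsigned char make = make_code(*text);
        int n;

        if (make == 0) return -1;
        n = snprintf(out + used, outlen - used, "%s%02x %02x",
                     used ? " " : "", (unsigned)make, (unsigned)(make | 0x80));
        if (n < 0 || (size_t)n >= outlen - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

For example, the text "ls" followed by a newline would come out as 26 a6 1f 9f 1c 9c, which could then be passed as the arguments to the VBoxManage command above.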

Using all of this, it is possible to create a new VM after some nightly builds run, use Kickstart to create a new OS install, shut down the VM, transmogrify the VM into various VM formats (.ovf, etc.), and otherwise automatically deploy the shiny new VM into a test harness.

The very best part of all of this is that everything happens unattended in the wee hours of the morning, when CPU time is cheap. By the time everybody arrives at work in the morning, all of this tedious work is done.

I love it when a plan comes together!

02 February 2009

Patch for Unix Network Programming Interface Code

I am a big fan of the book _UNIX Network Programming_ by Rich Stevens. Maybe if you are reading this blog you are too.

Anyways, I have discovered that some of the code in this book is suffering from some bit-rot. Specifically, in section 17.6 of the third edition, some code is presented that allows the programmer to traverse the list of network interfaces on the local machine. This code uses ioctl(...., SIOCGIFCONF, ...) to get the information.

The source of the bit-rot is that Stevens' code made some assumptions about the size of (struct ifreq), and these assumptions are no longer valid, at least not for me, running a relatively modern Linux 2.6 box.

So, I've come up with a patch. I'm pretty pleased with my patch, because with it:

the code can be fixed.
the code can be simplified.
the code can be made to be more portable and to still work even if this union is changed in the future.

I thought that other people would find this to be useful as well, so here it is.

26 May 2008

I am a fan of Programming by Contract and gcc's -Wcast-qual

Late one Friday afternoon at $DAYJOB I was nearly finished implementing a new feature for the product that I worked on. This was the culmination of several long days' effort, and I was looking forward to finishing my task and going home for the weekend.

As I was hooking everything up to enable the new feature, I spotted a bug. This was a strange bug, but I felt confident that I'd find it quickly. I called my wife and let her know that I'd be a little late for dinner.

Like I said, this was a strange bug. I noticed this bug because the unit test I was cobbling together acted strangely in the corner case that I was trying to get right.

Basically, the problem came down to the fact that at some point in time in my C program's execution, I had a (char *) variable that pointed to a particular string, but, later in the program's execution, for some unknown reason the contents of the string changed. This was very unexpected, and in fact this change corrupted a larger data-structure that my code was maintaining.

So, I looked at the code that maintained the (char *) variable with all of the concentration that I could muster on a Friday afternoon. This exercise proved to be fruitless -- I soon concluded that the code that I was looking at was correct.

I called my wife and told her that I was going to be a bit later...

Since the code seemed to be correct, I decided to pull out the big guns -- I decided to run the program through the memory debugger. I guessed that there might have been an improper memory access in the code, and this was the thing that was corrupting my string.

Fire up $MEMORY_DEBUGGER. Instrument. Wait.... Wait some more.... Run.

Running $MEMORY_DEBUGGER yielded absolutely nothing: no bad memory accesses.

Now I was getting irked. I called my wife and told her that I wasn't sure when I would be home. Luckily we were having leftovers that night...

So, now I focused on the problem again:

my string was getting modified strangely.
the code appeared to be correct.
there didn't appear to be any memory access errors in the code.

At this point, I did what I probably should have done earlier: I fired up the debugger and added a watchpoint to the contents of the string. When I re-ran my program, I had my smoking gun: the string was being changed by some library code written by somebody else at $DAYJOB.

I was still a little bit confused though, because, like I said, the code that I had looked at was technically correct. But I hadn't looked at the library code...

The library code looked like this:


  void do_something(const char *s)
  {
     char *s2 = (char *)s;
     s2[0] = 'a';
  }

I am hand-waving a little bit here, because I can't remember the exact circumstances. All I really remember is that the situation was a lot more complicated than this code snippet, with many levels of indirection.

When I saw this, especially the first line of do_something(), the problem was obvious: whoever wrote this function broke a fundamental rule of C -- you are not allowed to modify the thing being pointed at through a (const T *) pointer.
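For contrast, here is a minimal sketch of how such a function could keep its promise: it leaves the caller's string alone and hands back a modified copy instead. This is not the library's code; the name do_something_copy and the choice to return a heap copy are assumptions made for illustration. Compiling the original version with gcc's -Wcast-qual should draw a warning on the (char *) cast, while this version needs no cast at all.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical const-correct variant: never writes through s.
   Returns a modified heap copy that the caller must free(),
   or NULL if the allocation fails. */
char *do_something_copy(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len + 1);   /* includes the terminating NUL */
    if (len > 0)
        copy[0] = 'a';          /* the edit happens on the copy only */

    return copy;
}
```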

So, I figured out who wrote this function, sent $COWORKER an email asking them to please fix their code to adhere to the rules, and then I packed up and went home. I couldn't check my change in with this bug still in the system, so I decided to wait until Monday.

You might think that the conclusion to this story would be boring, cut-and-dried, etc., but it wasn't for me.

Monday morning came. My $COWORKER who wrote the buggy function read my email and then responded via email. To sum up his email: (1) he was not going to fix this function and (2) the problem was mine because (paraphrasing) "'const' does not mean what the C standards bodies have defined it to mean; instead, 'const' means what they defined it to be at $SOME_PREVIOUS_COMPANY_THAT_HE_WORKED_AT".

I was flabbergasted at this response, so I went over to talk with $COWORKER. He amazed me with his tenacity. There was no line of argument that I could employ that would change his deeply held belief here. I never really could pin down an exact definition of what "const" meant at his previous company. He did further clarify his position here by telling me that, in his experience, "most programmers are too stupid to know what the standards bodies say about 'const'". I protested that I thought that I understood pretty clearly what "const" meant, and he agreed with this, but he held fast to his "programmers are too stupid" point.

At this point I even pointed out that GCC had a "-Wcast-qual" flag that would catch errors like this.

"So what?" he responded.

"Do you think that they would add this check into the compiler if it wasn't, like, important?" I asked.

"I don't care.", he responded.

"Do you understand that I stayed late on Friday night because of this bug?" I asked.

"That's unfortunate.", he responded.

We continued this fun interplay for a few minutes, but I eventually had to give up. Clearly, he wasn't going to change his code, and I simply could not fix all of the places in the codebase that used "const" in this non-standard way.

The maddening thing for me here was that $COWORKER was a very smart engineer, and he certainly understood the concept of Programming by Contract, but he was basically asking me to enter into a contract that said "no matter what other legal mumbo-jumbo is in this document, it is OK if my code does anything whatsoever, even if it is wrong or goes against the spirit of everything else in this contract". This isn't a very useful or meaningful contract.

I eventually learned that anytime I saw the keyword "const" in certain subsystems of the code, I should attribute no meaning to it. Seriously, in those subsystems, "const" meant whatever the author meant on that day, and tomorrow the meaning might (and in fact did) change. I started to write my code in a very defensive manner, making copies of important data-structures and sending these copies off to these Alice-in-const-Wonderland subsystems.

I did have enough control over the system to ensure that all of my code compiled with gcc's "-Wcast-qual", and this did keep me out of a bit of trouble a couple of times. I knew that my code was bulletproof and correct.
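The defensive style described above might look roughly like the following sketch. The do_something() here is only a stand-in that misbehaves the way the post's example does, and the wrapper name do_something_safely is made up; the real code was more involved.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a library routine that writes through its const
   parameter, as in the example earlier in this post. */
void do_something(const char *s)
{
    char *s2 = (char *)s;
    s2[0] = 'a';
}

/* Hand the library a scratch copy so that the caller's data cannot
   be changed behind its back.
   Returns 0 on success, -1 if the copy could not be allocated. */
int do_something_safely(const char *important)
{
    size_t size = strlen(important) + 1;   /* include the NUL */
    char *scratch = malloc(size);

    if (scratch == NULL)
        return -1;

    memcpy(scratch, important, size);
    do_something(scratch);   /* any scribbling lands on the copy */
    free(scratch);
    return 0;
}
```

The cost is an allocation and a copy per call, which is the price of not being able to trust the function's signature.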

One day after I had given up on getting $COWORKER's code to be "const correct" I was struck with an idea, so I made a modest proposal to $COWORKER: since he seemed to want to use a keyword to attribute some meaning to variables (whatever meaning "const" meant at $SOME_PREVIOUS_COMPANY_THAT_HE_WORKED_AT), I suggested that we could migrate his code away from using "const" and we could instead define a new keyword for his code with the C preprocessor. I suggested that we could do something like this:


  /*  This is some header file */


 /* For an exact definition of this keyword, please ask $COWORKER */
 #define MEANINGLESS_CONST

...and then we could have updated the original function that caused me to stay late at work like so:


  void do_something(MEANINGLESS_CONST char *s)
  {
     s[0] = 'a';   /* much more streamlined!!! */
  }

I even offered to make all of the changes for him...

So, now our system would have been improved! We'd still have the use of the "const" keyword as the standards bodies had intended, but we'd also have MEANINGLESS_CONST too, and our code would be more streamlined and definitely a lot more understandable. We'd even be able to use "gcc -Wcast-qual" too!

Anyways, I presented this idea to $COWORKER. I give him credit for sticking to his guns -- "no" was his simple flat response to my proposal.

I wish I had some neat way to wrap up this story, but I don't. The Alice-in-const-Wonderland code continued to exist in the product until I left. This probably cost the company a bit of money in terms of bugs and lost efficiency, but that's the way things work sometimes. I just had to learn to live with this behavior in the code.

To sum things up, I'm a big fan of Programming by Contract and "gcc -Wcast-qual". I'm also a big fan of sticking to reasonable standards, and assuming that my co-workers are smart until they go out of their way to prove otherwise.

26 March 2008

Multiple Function Return Points Considered Harmful

I prefer to see functions written in such a manner that there is one consistent return point. I prefer this:


   int f(int x)
   {
     int result;
  
     if (x < 3)
       result = 1;
     else
       result = 0;
  
     return result;
   }

...over this:


   int f(int x)
   {
     if (x < 3)
       return 1;
  
     return 0;
   }

This is a religious issue (to some extent). Clearly, a good optimizer is going to render the same code in either case, so the latter code isn't faster -- it is just shorter. I prefer my way because it goes along with my conservative coding style -- I prefer to make the code so simple that I can even understand it when I am tired.

By the way, I'm not so religious about this matter that I always follow my own advice. Every rule has exceptions.

However, I can think of one case in which I believe that my methodology (having a consistent return point) is a clear winner. I will illustrate this with a story.

....

One day at $DAYJOB, my $MANAGER informed me that I'd been assigned to work on a new project. I'd been assigned to add a $FEATURE to a $BIG_SUBSYSTEM in the product. I didn't know anything about this subsystem. $MANAGER told me that this was fine -- $COWORKER was the expert on this $BIG_SUBSYSTEM, and I could use $COWORKER as a resource.

I'd never even heard of $COWORKER at this point. After exchanging a couple of terse emails with $COWORKER I figured out that $COWORKER worked really strange hours and actually worked in a cubicle a couple of rows away from me.

Later that day I managed to find $COWORKER in his cubicle, so I stopped by and tried to introduce myself:

Hi, I'm Kevin. I've been assigned by $MANAGER to work on $FEATURE in the $BIG_SUBSYSTEM. $MANAGER tells me that I can use you as a resource for this project. I'm trying to come up to speed with $BIG_SUBSYSTEM; I've started reading the available documentation, but could you perhaps give me an overview of $BIG_SUBSYSTEM so that I have a better idea of what is going on in this subsystem?

$COWORKER looked at me with disdain and said "I'm sure that you'll figure it out." Then $COWORKER put his headphones back on and turned back to his computer.

$COWORKER also managed to turn down my offer of a friendly handshake.

"Great...", I thought, "...$COWORKER is an unhelpful jerk". I knew what this meant too, because complaining to $MANAGER wasn't going to help me one single bit. I would have to familiarize myself with $BIG_SUBSYSTEM and implement $FEATURE on my own.

Weeks passed. I worked my ass off to come up to speed on $BIG_SUBSYSTEM. I barely had any interactions at all with $COWORKER. $MANAGER asked if $COWORKER was being helpful and I honestly answered "no, he seems to be working on his own work". "Oh, he must be busy" was the response.

Eventually, I was done. I tested my code and even showed it to $COWORKER. He made a bunch of comments that ranged from being somewhat helpful (things that I wished I had known before I started the project) to comments that just reflected his opinions about the code.

So, I tested one more time and checked my changes in.

There are four things that you need to know about my $DAYJOB before I continue: the project was written in C, it was very large, it was heavily multithreaded, and it made use of a huge number of branches in the source control system.

Soon after I checked my changes in, $MANAGER told me that my changes were needed in a different branch in the source control system. So, I performed the arduous process of merging my changes into the new branch.

Let's just say that I performed this merge process into N different branches...

Each merge really was arduous. Due to the way that the software organization used the source control subsystem at $DAYJOB, it was very difficult to use the tools that came with the source control system to perform the merge. I soon concluded that the best way for me to do merges was via patch, ediff/Emacs, and a huge amount of attention to detail. Each merge took well over a day of solid effort, sometimes quite a bit more.

I repeated this merge several times, as needed. While I was doing all of this work, I had lots of time to think to myself "how could this be made better?" But more on that some other time...

Anyways, one day somebody in SQA contacted me and told me that he wasn't sure what was going wrong, but one of our internal software releases was dying strangely, and since I was the last person who made a major modification to the branch, he thought I might have something to do with this.

This was the start of the badness. As soon as the problem was found, $MANAGER was informed of the problem, and now $MANAGER was pretty insistent that I find the problem...and quickly. $COWORKER even came by and reminded me of the importance of finding the problem quickly. This was the most contact I had had with $COWORKER in months! Great...

So, under a bit of pressure, I started looking for the bug. $COWORKER started looking for the bug too, independently (of course). I felt some pressure to try to find the bug before $COWORKER, because technically the bug was probably mine.

Many hours passed. I had a difficult time trying to find this bug because I couldn't figure out what was different about my work on this branch versus all of the other branches that I had worked on. My work on all of those other branches worked fine and had passed SQA testing. This was a tough bug to isolate...

Eventually, $COWORKER found the bug. I was pretty unhappy. The bug was in my changes for $FEATURE, and $COWORKER was more than happy to rub my nose in the problem. The problem was a thread synchronization problem. My coworker pointed at the code and then at me and said "You need to pay more attention to details when you write code". My head was swimming. I ruefully noted that this was the most interaction with $COWORKER that I had ever had and that it had gone very badly. My code had a bug. I was responsible for the problem. Where did I go wrong?

...

I've been in the software engineering business a long time. Sometimes I make mistakes. Really, I try to learn from everything that I do. So, after the bugfix got checked in and everybody stopped freaking out, I tried to track down what went wrong.

A little while later, I figured out what went wrong.

When I originally implemented $FEATURE, I had to integrate my changes into the already large codebase. When I made my original implementation, my code changes mimicked the style that I found in the rest of the codebase -- of course.

Here is where we get back to the subject at hand: having a consistent place in each function where a function returns.

The codebase at $DAYJOB didn't adhere to my preferred pattern here. And, like I mentioned, it was heavily multithreaded (and this implies that it used a lot of synchronization primitives). So, part of my changes modified a function that looked like this:


   void f(int x)
   {
     LOCK(&mutex);
  
     switch (x) {
  
       case FOO:
         if (someFunc() == ERROR) {
           UNLOCK(&mutex);
           return;
         }
         /* do something else */
       break;
  
  
       /* HERE IS MY CODE */
       case BAR:
         if (someFunc() == ERROR) {
           UNLOCK(&mutex);
           return;
         }
         /* do something else */
       break;
  
       /* ...100 more cases... */
  
  
     }
  
     UNLOCK(&mutex);
   }

But, when I merged my code into the new branch with patch, the patch miraculously managed to apply cleanly on this file, and the resulting/buggy file looked like this:


   void f(int x)
   {
     LOCK(&mutex);
     LOCK(&some_other_mutex);
  
     switch (x) {
  
       case FOO:
         if (someFunc() == ERROR) {
           UNLOCK(&some_other_mutex);
           UNLOCK(&mutex);
           return;
         }
         /* do something else */
       break;
  
  
       /* HERE IS MY CODE */
       case BAR:
         if (someFunc() == ERROR) {
           UNLOCK(&mutex);
           return;
         }
         /* do something else */
       break;
  
       /* ...100 more cases... */
  
  
     }
  
     UNLOCK(&some_other_mutex);
     UNLOCK(&mutex);
   }

See the problem? In this new branch of code, which I had never worked with before, somebody had introduced a new mutex (some_other_mutex), and my code wasn't cleaning up properly.

This is where I want to make my point: this code could have been written a lot more cleanly, like this:


   void f(int x)
   {
     LOCK(&mutex);
     LOCK(&some_other_mutex);
  
     switch (x) {
  
       case FOO:
         if (someFunc() == ERROR) {
           ....
         }
         else {
           ....
         }
       break;
  
  
       /* HERE IS MY CODE */
       case BAR:
         if (someFunc() == ERROR) {
           ....
         }
         else {
           ....
         }
       break;
  
       /* ...100 more cases... */
  
  
     }
  
     UNLOCK(&some_other_mutex);
     UNLOCK(&mutex);
   }

...and, if it were, my changes would not only have applied cleanly with patch, the merged result would also have released both mutexes on every path, and we would have avoided a long, gory, multi-hour, "let's find the bug" session.
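To make the single-exit shape concrete, here is a small self-contained sketch with the elided parts filled in by placeholders. The lock type, the LOCK/UNLOCK functions, the fail_next knob and the int return value are all stand-ins invented for illustration (the real codebase's primitives are not shown in this post); the stand-in lock only records whether it is held, so that the cleanup can be checked without a threading library.

```c
#include <assert.h>

/* Stand-ins for the codebase's LOCK/UNLOCK: they only track state. */
struct lock { int held; };
static void LOCK(struct lock *l)   { assert(!l->held); l->held = 1; }
static void UNLOCK(struct lock *l) { assert(l->held);  l->held = 0; }

enum { FOO, BAR };
enum { OK, ERROR };

static struct lock mutex, some_other_mutex;
static int fail_next;   /* knob: make someFunc() report an error */
static int someFunc(void) { return fail_next ? ERROR : OK; }

/* Single exit point: every path falls through to the same cleanup,
   so a lock added at the top only needs one matching unlock at the
   bottom, no matter how many cases there are. */
int f(int x)
{
    int status = OK;

    LOCK(&mutex);
    LOCK(&some_other_mutex);

    switch (x) {
    case FOO:
        if (someFunc() == ERROR) {
            status = ERROR;
        } else {
            /* do something else */
        }
        break;

    case BAR:
        if (someFunc() == ERROR) {
            status = ERROR;
        } else {
            /* do something else */
        }
        break;

    default:
        break;
    }

    UNLOCK(&some_other_mutex);
    UNLOCK(&mutex);
    return status;
}
```

Because no case returns early, a merge that adds a third lock at the top and bottom of f() would not need to touch any of the cases.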

I never did get to tell $COWORKER about what the true root cause of the problem was here. As you can see, this is a long story, and $COWORKER never had any patience for me. In fact, $COWORKER probably thinks I am a moron to this day. But I was glad to figure out what the root cause of the problem was, and this is one of the reasons why I really try to write functions in such a way that there is a single consistent return point -- especially in multithreaded code that utilizes synchronization primitives.

Postscript: several months after this bug reared its ugly head, I found two places in $COWORKER's code that suffered from the same bug. I was tempted to treat $COWORKER as rudely as he had treated me, but in the end I just decided to be polite about it.

06 March 2008

Development Tip: Multiple Build Areas

Here is a code development tip that I nearly always employ in any workplace. I have employed this strategy for years, and several of my colleagues have told me "wow! that's a really good idea!" so I thought that I would share this.

I always set up multiple build areas to go along with the source control system that I am using. At the very least, I always have a "-work" directory (where I work on my current task), but (and here is the important bit) I always have a second "-clean" build area. I never modify any files in the "-clean" area! Ever. The only things that I ever do with the "-clean" area are (0) update this build area from source control, (1) run a build in this area and (2) run a regression test on this build area. Again, I never modify any files in this directory.

Having a "-clean" directory is terribly useful. For example, if I am making a big change that modifies ten files and my changes also depend on the addition of two files to the build tree, when I am done with my work, I will check in my changes (under my "-work" area), and then I will immediately update my "-clean" area to run a build and a regression test. If I somehow forgot to add those two source files to the source tree, the build will fail -- but I will immediately notice this. It is much better for me to notice this immediately than for my co-workers to notice it later.

If you are a professional software engineer, the problem that I have just cited here has probably plagued you, what? -- several dozen or hundred times in your career? Yes? How much of your time has been wasted due to this problem? If only everybody employed this technique.

Like I mentioned, I always set up multiple build areas. In fact, I usually have at least a half dozen build areas going at the same time. I usually have a "-clean" area going for every source code branch that I work with, and I usually have a build area going for every task that I work on as well. This latter use of build areas seems to be particularly useful: colleagues who were dead-set on creating a new source control branch in the codebase to do their work have told me, after I explained my multiple-build-areas methodology to them, that this trick saved them a lot of grief. Let's not forget, every time your organization creates a new branch, this costs your organization time and money. Sometimes you need a new branch, but many times you do not. This trick costs a modest amount of disk space, and disk space is cheap. Branches are never cheap.

I have used this trick with dynamic views under ClearCase, static views under ClearCase, and directories under Subversion too. This trick can be used anywhere.

Update: yeah, yeah, yeah, I realize that folks who use a DVCS will probably look at this post quizzically. Let me issue the following reminder: not all shops use DVCS.
