28 October 2008

Beware of the Leopard (or, "In praise of #error")

One day at $DAYJOB, whilst all of the engineers were working on the large C++-based project that many people reading this blog have actually used, somebody discovered a bug. Actually, this was less of a bug and more of a software misconfiguration. The crux of the problem came down to the fact that in our software product, it was possible to build/compile the product with "Feature X", but "Feature X" had three mutually incompatable variants called "aaa", "bbb", and "ccc" -- and somebody had b0rked the configuration of these things in the software build. This problem resulted in a bit of lost time and money (and some teeth gnashing too...).

Well, somebody in management heard about this problem, and so it was decided that we needed to implement a process to ensure that this problem didn't occur again. Somehow, the result of all of this was committee meetings and from these somebody actually started constructing a software tool that would ensure that there were no misconfigurations in the software build.

I heard about the construction of this software tool right after it started to get implemented. When I heard the news, I was puzzled because nearly everything in the codebase was reasonably delimited with preprocessor conditional blocks. For example, in the code, everything related to "Feature X" looked like this:

#ifdef CONFIG_FEATURE_X

....code for feature x....

#endif


So, after hearing that somebody was actually going through the trouble of creating a tool to detect software misconfiguration in the build, I contacted the person who was in charge of that project and suggested that all that we really needed to do was add some code that looked like this in one of our header files:

   #ifdef CONFIG_FEATURE_X
   
   #if    !defined(CONFIG_FEATURE_X_AAA) \
        && !defined(CONFIG_FEATURE_X_BBB) \
        && !defined(CONFIG_FEATURE_X_CCC)
   
   #error If you are going to enable Feature X, you must choose one of its variants:  AAA, BBB, or CCC.  \
          It is non-sensical to enable Feature X and not enable one of these variants.
   
   #endif
   
   
   #if       (defined(CONFIG_FEATURE_X_AAA) \
           && (   defined(CONFIG_FEATURE_X_BBB \
                       || defined(CONFIG_FEATURE_X_CCC))) \
        || \
\   
      (   defined(CONFIG_FEATURE_X_BBB) \
       && (   defined(CONFIG_FEATURE_X_AAA \
                   || defined(CONFIG_FEATURE_X_CCC))) \
        || \
\   
      (   defined(CONFIG_FEATURE_X_CCC) \
       && (   defined(CONFIG_FEATURE_X_AAA \
                   || defined(CONFIG_FEATURE_X_BBB)))
   
   #error Feature X variants AAA, BBB, and CCC are all mutually incompatible!
   
   #endif


I think that the thing that I really like about this code is that the people who maintained Feature X and its variants could make the decision as to what represented a misconfiguration -- not some third-party or some complex tool that would theoretically check for ALL misconfigurations. I also really like the fact that this solution solves this problem at nearly the cheapest place in the development process -- at compile time. From my understanding of the tool that was being written to solve this problem, this tool would not have this property.

The manager/engineer who I contacted with this suggestion liked my idea, and thought there were several places in the code that could be updated with code in this style. For some reason that I never really understood, the decision was made to continue on the development of the "misconfiguration detector tool" -- I transitioned to another project soon after and lost track of what was going on in that project.

Why am I telling you this story today? Because I just spent my morning trying to track down an esoteric problem in a well-known open-source product, and in the course of my work, somewhere in the reams and reams of compilation output, I saw a message like this:

   Warning:  this file is known to be mis-compiled with this version of compiler!


But notice this is merely a warning and not a hard error. This infuriates me. Am I really expected to crawl through hundreds of thousands of lines of compilation output to find "warnings" like this?

I am a big fan of clear error messages and not needlessly dragging programmers through painful debugging sessions.

No comments: