24 December 2011

Engineering Documentation Rant

B. sent me a link to "Institutional memory and reverse smuggling".  When I got to this paragraph:

Oh, and as an external consultant, I'm not allowed to know some of the trade secrets in the documents. The internal side of the team needs to handle the sensitive process information, and be careful about how that information crosses boundaries when talking to the external consultants. Unfortunately, the internal team doesn't know what the secrets are, while I do. I even invented a few of them, and have my name on some related patents. Nonetheless, I need to smuggle these trade secrets back into the company, so that the internal side can handle them. They just have to make sure they don't accidentally repeat them back to me.

....I didn't know whether to laugh or to cry.


Here is a related and possibly semi-amusing story about some of the work that I used to do at a previous job:

I used to work at a certain networking company.  I did a fair amount of work with SNMP and something called "SNMP MIBs".  A "MIB" (Management Information Base) is like a database schema -- the MIB isn't the management data itself; rather, the MIB just describes how the management data is organized.
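To make the schema-versus-data distinction concrete, here is a minimal sketch in Python.  It is not how any real SNMP library works; the class, the function names, and the sample device values are all made up for illustration.  The two object names and OIDs are the standard interface-table ones (ifDescr and ifOperStatus), and the description strings are my own paraphrases, not text copied from any RFC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MibObject:
    """One entry in the schema: describes a kind of data, holds no values."""
    name: str
    oid: str
    syntax: str
    description: str  # human-readable design notes (paraphrased, illustrative)


# The "MIB": a schema keyed by object OID.  No device data lives here.
MIB = {
    obj.oid: obj
    for obj in (
        MibObject("ifDescr", "1.3.6.1.2.1.2.2.1.2", "string",
                  "Text describing the interface."),
        MibObject("ifOperStatus", "1.3.6.1.2.1.2.2.1.8", "enum",
                  "Operational state: 1=up, 2=down, 3=testing."),
    )
}

# The management data: instance OID -> value, as a device might report it.
# These values are invented for the example.
DEVICE_DATA = {
    "1.3.6.1.2.1.2.2.1.2.1": "eth0",
    "1.3.6.1.2.1.2.2.1.8.1": 1,
}


def lookup(instance_oid):
    """Return (schema object, instance index) for an instance OID, or None."""
    for oid, obj in MIB.items():
        if instance_oid.startswith(oid + "."):
            return obj, instance_oid[len(oid) + 1:]
    return None


def render(instance_oid, value):
    """Use the schema to turn a raw OID/value pair into something readable."""
    found = lookup(instance_oid)
    if found is None:
        return f"{instance_oid} = {value!r} (not described by this MIB)"
    obj, index = found
    return f"{obj.name}.{index} = {value!r}"


if __name__ == "__main__":
    for instance_oid, value in DEVICE_DATA.items():
        print(render(instance_oid, value))
```

Without the schema, the device data is just dotted numbers mapped to values.  Without the description field, the schema still parses, but a human has to guess what each object means.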

Well, as I cut my teeth at this company, I had to work with MIBs that were designed in-house.  This was frequently a tedious chore, because our internal MIBs nearly always came without any documentation whatsoever.  The implementation of these MIBs also varied from product family to product family -- probably because the design was poorly understood.  I was at the end of the line, trying to make sense out of all of this stuff.  Frequently my job was basically a reverse-engineering job -- I'd have to make sense out of some new MIB, and then I'd have to make sense out of how our own products actually implemented the MIB.  This entailed a lot of work in the lab, as well as a lot of email and phone tag.

Here was the first big problem:  it was totally crazy that I had to reverse engineer this stuff.  Everybody involved with the production of this stuff worked at the same company, during the same time period.  I shouldn't have had to reverse engineer anything.  All of the engineering staff should have understood what the design was at the time that all of this stuff was produced.  But this simply did not happen.

Vendors aren't the only ones who write MIBs.  Standards organizations write most of the useful MIBs.  Here, take a look at one of my favorite MIBs:  RFC1573.  This is actually a very technical document:  there is a bunch of text in here that describes how the data is structured and how things work.  There is also a machine-readable section in this document.

Whenever I had to do my work at this company, it was always a lot easier for me to deal with a MIB written by a standards organization rather than a MIB developed in-house.  Why?  Because somebody bothered to document the design of the MIB.  There is no reverse engineering required to deal with such a MIB.

Eventually, during my career at this company, I volunteered to help create new MIBs.  I had a few reasons for doing this, but one of the most straightforward reasons was that I thought that I could help make the company more efficient by ensuring that newly designed MIBs came with some documentation.  I made it very clear from the outset that I didn't want to spend a huge amount of time ensuring that the MIB was as...perfect...as an IETF MIB.  I just wanted to get the design into the MIB so that there would be no more reverse engineering required.

Here is the second, even crazier part of my story.  I started helping with the design of a certain MIB.  My work was done as a part of some sub-committee work.  I jumped into my work, and I produced a MIB that I think pretty elegantly solved a certain problem that was at hand.  I wrote everything up.  Nobody on the sub-committee besides me really had any input.  At this point, I gave my work to the committee chair for discussion and approval by the full committee.  I fully expected my new MIB to be quickly approved by the overall committee.

I was so naive.  I did not anticipate what happened next.

Eventually, the committee chair forwarded my new MIB to all of the other committee members.  The fireworks started immediately.  I got at least a dozen confused queries, all in the form of "how does this work?".  I really did not understand these questions.  I tried to explain how things worked, but I always referred the members of the committee back to my original document.  I told them things like "see section 3 of the MIB document; it gives an example that precisely answers your question".  Eventually, there was so much acrimony over the design of my new MIB that a meeting was called.  I remember driving to this meeting, thinking to myself "I am traveling to a very hostile meeting, and I do not understand where the hostility is coming from".  I was totally confused.

The meeting itself was everything I expected -- there was lots of hostility and yelling.  I kept on telling people "I don't understand why you are confused about any of this; this is all clearly explained in the document".  I spent nearly the whole meeting re-explaining things that were clearly described in the document.

After the meeting was over, I followed one of the committee members back to his office to try to understand what had happened.  I asked "why was there so much confusion over this?".  Eventually, I figured out what went wrong.

Take a look at RFC1573 again.  My document had a similar format (but it was a lot shorter).  Do you see how in RFC1573 there is some text, and then in section 6 appears the machine-readable part of the document?  Well, the committee chair had never before seen a MIB that had any sort of descriptive text in it.  This confused him, so, before he sent out my MIB to the whole committee, he edited the MIB and took out all of the descriptive text.  The other committee members didn't understand the design of my MIB because they were all looking at some bastardized version of my document, one that didn't contain any of the actual design.  Let me drive the point home:  which do you think would be easier for a human to understand:  the full text of RFC1573, or just the text in section 6 of RFC1573?

When I learned of what had happened to my document, I got pretty angry.  I had just endured a hostile, two-hour meeting, and a lot of the acrimony during the meeting would not have been present if the committee chair had simply provided my document to the other committee members without modification.

So, I walked over to the committee chair's office at this point and asked him why he had done this really stupid thing.  His response?  "I never thought about it."  When he said this I completely believed him.

As I drove home from my meeting, I was pretty wrecked.  After some time, I realized that I worked for a company with some pretty real problems.  We not only had the problem of "the internals of our product are really poorly documented", but we also had another problem:  it was basically codified into the workings of this committee that "design documents should not exist".


A lot of other things went wrong with my involvement with this committee too.  I soon lost my enthusiasm for working on this committee.  I moved on to other things soon after.

I do continue to believe that a little bit of documentation, in just the right places, has a huge positive effect on the overall efficiency of an organization.
