22 May 2014

My Take on Aaron Maxwell's "Use the Unofficial Bash Strict Mode (Unless You Looove Debugging)"

I enjoyed Aaron Maxwell's "Use the Unofficial Bash Strict Mode (Unless You Looove Debugging)".  Read it -- seriously.  Mr. Maxwell clearly knows what he is talking about.

I'm a shell hacker too.  My fascination with the shell has...evolved over the years.  I no longer reach for the shell as my first tool of choice.  Instead, I keep the following heuristics in my head when I approach a task:
  • If a program can be written in under a page of code, then a shell script is a perfectly reasonable solution to a problem.
  • If a program can't be written in under a page or two of code, it is time to think about writing the program in Python/Perl.
  • If a program basically has to fork+exec a bunch of other external programs, then even if the program gets to be big, it might just be easier to keep it as a shell script.
  • But...if a shell script grows to more than five (5) lines of code (or maybe even fewer...), then the shell script HAS TO be written according to the following template that I've come up with.

My Template

Here is my template for a reasonable shell script:


#!/bin/bash

#################################################
function usage() {

cat <<EOF
Usage:  $0 [OPTIONS]

OPTIONS

  --blah
        Allows you to specify the "blah" value.

EXAMPLE

  $0 --blah 1.2.3.4

EOF

}

########################################################
function cleanup() {
    echo "Cleaning up"
    rm -rf "$TMPFILE"
}

########################################################
function errHandler()  {
    echo "$0: some unexpected error happened." 1>&2
    cleanup
    exit 1
}

########################################################
main() {    
    IFS=$'\n\t'   # split words only on newlines and tabs, not on spaces

    set -E    # errtrace: functions and subshells inherit the ERR trap set below
    set -u   # exit if a reference is made to an undefined variable
    set -o pipefail   # a pipeline fails if any command in it fails (which fires the ERR trap)
  
    trap errHandler ERR



    # Let's say that somewhere in this program the code needs to
    # use a temporary file.  Well, let's have a consistent name for
    # this file, so we can ensure that it gets removed if this program
    # fails for some reason.
    TMPFILE=/tmp/some-temp-file.$$

    echo "blah" >"$TMPFILE"

  

    # Just for the sake of an example, here we're going to
    # introduce the chance for an error to occur.

    echo "The next operation may or may not fail.  Stay tuned..."
    echo

    [ $(expr $(date '+%s') % 2) -eq 0 ]


    cleanup
    return 0
}

main "${@}"
exit $?



Let me explain why I write shell scripts in this way:

#!/bin/bash

I know that some old-time Unix programmers might scoff at this.  Somebody might say "but the Bourne shell (/bin/sh) is the One True Shell".  My response: I'll be glad to make my shell script ultra-portable when it needs to be.  Until then, nearly all of my shell-script code simply has to be portable to Linux (and the occasional Mac) systems, and these all have Bash available.

I'm explicitly using the GNU Bourne-Again SHell here.  I do not care to write code that invokes /bin/sh but actually depends on /bin/sh being the Bourne-Again SHell.  Code like this dies when /bin/sh is actually Dash.  If somebody thinks that I am making a mountain out of a mole-hill here, I would remind them that Dash is the default /bin/sh on Debian/Ubuntu systems, and I don't want code that I produce to die strangely on such platforms.

I have no problems with using the original /bin/sh, "ash", or "dash", and I happily will use these if the situation is appropriate.

usage()

Putting usage information at the top of the file seems to allow me to provide documentation and useful code commentary with a minimum of duplication.
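The template documents a "--blah" option but does not show how it would be parsed.  Here is a minimal sketch of one way to wire usage() to option parsing.  The parse_args function and the BLAH variable are hypothetical names that I am introducing for illustration; they are not part of the template above.

```shell
#!/bin/bash
# Sketch: wiring usage() to option parsing. Names here are illustrative.

usage() {
    cat <<EOF
Usage:  $0 [OPTIONS]

OPTIONS

  --blah VALUE
        Allows you to specify the "blah" value.
EOF
}

# parse_args is a hypothetical helper. It stores the option value in BLAH.
# On invalid arguments it prints the usage text to stderr and returns 2.
parse_args() {
    BLAH=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --blah)
                # The option needs a value after it.
                if [ $# -lt 2 ]; then
                    usage 1>&2
                    return 2
                fi
                BLAH=$2
                shift 2
                ;;
            *)
                usage 1>&2
                return 2
                ;;
        esac
    done
}

parse_args --blah 1.2.3.4
echo "blah is $BLAH"
```

In a script that also installs the ERR trap, I would probably call this as "parse_args "$@" || exit 2" so that a bad command line produces the usage text rather than the generic error message.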

cleanup()

Many programs like this create temporary files, or some other thing that might need to be cleaned up.   I try to put all of my cleanup-related code in this function, as opposed to sprinkling code like this everywhere in the code.
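Since cleanup() can be called from more than one place (the normal path and the error handler), it helps if it is safe to call at any time.  Here is a sketch of that idea.  Note two assumptions that differ from the template: the make_tmpfile helper is a hypothetical name, and mktemp is swapped in for the fixed "/tmp/some-temp-file.$$" name.

```shell
#!/bin/bash
# Sketch: a cleanup() that is safe to call at any point in the script.

TMPFILE=""    # declared up front so cleanup() never sees an unset variable

# make_tmpfile is a hypothetical helper. mktemp is swapped in for the
# fixed /tmp/some-temp-file.$$ name; it creates the file with a unique name.
make_tmpfile() {
    TMPFILE=$(mktemp "${TMPDIR:-/tmp}/some-temp-file.XXXXXX")
}

cleanup() {
    # Nothing to do if no temporary file was created, or if it was
    # already removed by an earlier call.
    if [ -n "$TMPFILE" ]; then
        rm -f -- "$TMPFILE"
        TMPFILE=""
    fi
}

make_tmpfile
echo "blah" >"$TMPFILE"
created=$TMPFILE      # remember the name so it can be checked afterwards
cleanup
cleanup               # a second call is harmless
```

Initializing TMPFILE before anything can fail also keeps "set -u" from complaining if the error handler runs before the temporary file exists.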

errHandler() / set -E / trap errHandler ERR

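This is how this program handles unanticipated errors.

"set -e" and "set -E" are related but not the same thing.  "set -e" makes the shell exit when a command fails.  "set -E" makes the ERR trap apply inside functions and subshells as well as in top-level code, and it is errHandler() (installed by "trap errHandler ERR") that does the exiting.  In my opinion this combination is better, because the "exit if something goes wrong" behavior runs my own handler first.  I prefer to have Bash executing in this mode as much as possible -- I don't want the error handler to (silently...) not run just because my code happens to be running in a function.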

Notice how the errHandler() code properly calls cleanup() before it exits.
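To make the role of "set -E" concrete, here is a small sketch that runs the same failing function twice, once with errtrace on and once with it off.  The failing_step and run_demo names are hypothetical; each run happens in its own subshell so that the trap's "exit 1" does not end the whole script.

```shell
#!/bin/bash
# Sketch: how "set -E" (errtrace) changes where an ERR trap fires.

# A function whose first command fails.
failing_step() {
    false
    echo "failing_step kept running"
}

# run_demo is a hypothetical helper; $1 is "-E" (errtrace on) or "+E" (off).
# The body is wrapped in ( ... ) so the trap and its "exit 1" stay local.
run_demo() (
    set "$1"
    trap 'echo "ERR trap fired"; exit 1' ERR
    failing_step
    echo "main code kept running"
)

# Capture the output and the exit status of each run.
with_errtrace=$(run_demo -E; echo "exit status: $?")
without_errtrace=$(run_demo +E; echo "exit status: $?")

echo "--- with set -E ---"
echo "$with_errtrace"
echo "--- without set -E ---"
echo "$without_errtrace"
```

With "set -E" the handler runs at the failing command inside the function and the run stops there.  Without it, the failure inside the function goes unnoticed: the function carries on, returns the status of its last command, and the calling code keeps running as if nothing had happened.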

The most important thing I can say about Bash scripts that are written in this manner is that they exit if some unexpected error occurs.  Writing code in this manner is sort of like employing a poor man's exception handler.  In my opinion, the following two points are very important to understand about code that executes in this configuration:

  • It does take some more effort to produce code that executes in this context.  The code needs to be written in such a way that every command that can legitimately return a non-zero exit value is handled explicitly.
  • On the other hand, code that is written in this manner is much more reliable than code that is written in the default mode.  As a programmer who is only interested in producing reliable code, I spend nearly zero time tracking down bizarre problems in code that is written in this style.   If the shell is not configured in this way (the default mode...) then you can pretty much guarantee having to track down a bizarro problem at some point in time.
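The extra effort mentioned above mostly comes down to telling the shell which non-zero exit codes are expected.  Here is a sketch of two common ways to do that, using grep (which exits 1 when nothing matches) as the example.  The lookup function and the items list are hypothetical names for illustration only.

```shell
#!/bin/bash
# Sketch: letting "expected" non-zero exit codes through an ERR trap.

# lookup is a hypothetical helper; the subshell body ( ... ) keeps the trap
# and its "exit 1" local to this one call.
lookup() (
    set -E
    trap 'echo "unexpected error" 1>&2; exit 1' ERR

    items=$'alpha\nbeta'

    # grep exits 1 when nothing matches. Used as an "if" condition, that
    # status is a normal answer and does not fire the ERR trap.
    if grep -q -- "$1" <<<"$items"; then
        echo "found $1"
    else
        echo "no $1"
    fi

    # Same idea inside a command substitution: "|| true" marks the failure
    # as acceptable. grep -c still prints 0 when nothing matches.
    count=$(grep -c -- "$1" <<<"$items" || true)
    echo "count=$count"
)

result_hit=$(lookup beta)
result_miss=$(lookup gamma)

echo "$result_hit"
echo "$result_miss"
```

One caveat worth knowing: the ERR trap is also suppressed for everything a function does when the function itself is called as an "if" condition or on the left side of "||", so these escape hatches are best applied to individual commands rather than to whole functions.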

main()

This is my invention.  I write code in this way because I've worked with hundreds of shell scripts over the years that started out as some small bit of code...and then grew...and grew...and grew...and eventually became a steaming pile of top-level code plus functions plus more top-level code.  I've worked with some 20,000-line shell scripts in the past in which I had a hard time even figuring out where the code started executing.  Working with code like this causes me to want to wash my eyeballs in bleach afterwards.  I have little patience for this sort of thing: the code that I produce isn't going to suffer from this problem.

Please also notice that this code uses functions to group together related bits of code.  I have seen lots of shell-script code over the years that is just page after page after page of code...with no functions.  Code written in this "style" ignores basic software-engineering practices.  Yuck.

Epilogue

This shell-script template is my personal style.  I don't always write shell-script code (I'm more of a fan of higher-level languages...), but when I do write shell-script code I try to employ every reasonable technique that I can to help ensure that the code is clear and reliable.

In my opinion, the "errHandler() / set -E / trap errHandler ERR" part of this template is critically important.  I can think of several large shell scripts that I have been called upon over the years to maintain that have benefited greatly from this scheme.  When I first started working with these large programs, the programs were HUGELY UNRELIABLE, regularly spewing errors and leaving systems in many inconsistent states.  It cost LARGE AMOUNTS OF TIME AND MONEY to fix systems that were left in an inconsistent state because of these problems.  After I started maintaining these scripts, it did take me a good deal of effort to update these unreliable programs to be written in my preferred style.  But, after this effort was over, the enhanced reliability has paid HUGE DIVIDENDS...I no longer have to chase bizarro problems in these areas.

It is seductively easy to write shell-scripts.  But, the results are a lot better if some basic, common-sense rules are followed.