Shell scripts, if, test, and portability

7/25/2016

About a year and a half ago I decided to take the plunge and switched from Windows to Linux on all of my computers. Where once I viewed the command line as an intimidating totem of geekdom, whose promptings could only be appeased with a prememorized sequence of ritualistic keymashing, I now wonder how I ever got anything done without it. Gone are the days when I’d throw out a “Darmok and Jalad at Tanagra” reference to test the geek-depth of newly met friends; “Do you even shebang, bro?” is my new challenge of choice.

Imagine the nerdistential crisis, then, when I ran an old shell script I wrote to automate backups and was met with utter failure in my new (Arch Linux) environment. Thus began the first of many lessons in writing portable shell programs, the gist of which is that there is no such thing as shell “programming”, only the manipulation of core programs through the use of shell commands.

To unpack that a little, consider the following sequence, inspired by Pat Brisbin’s excellent “The Unix Shell’s Humble If”:

if [ "Darmok at Tanagra" != "Jalad at Tanagra" ]; then
  echo "Shaka, when the walls fell."
else
  echo "His arms wide open."
fi

Brisbin highlights that the opening bracket of the test condition for the above if block “is just another command”: technically, it’s just a second form of the test utility, similar in usage to test except that it requires “a trailing ]”. Very same; so copy. But why? As Brisbin notes, “this bit of cleverness leads to an intuitive and familiar form when the [ command is paired with if”, but it also leads to improper usage among peeps who expect the Frankensteined block to parse intuitively. Having repeatedly mashed my brain against such cases, I did much head nodding when I read Rich Felker’s preamble to the excellent POSIX shell tricks: “I am a strong believer that Bourne-derived languages are extremely bad, on the same order of badness as Perl, for programming, and consider programming sh for any purpose other than as a super-portable, lowest-common-denominator platform for build or bootstrap scripts and the like, as an extremely misguided endeavor.” Take the following code block, which does not do what you want it to do:

if [ grep -q 'Tanagra' /log/darmoks_trips.log ]; then
  echo "Shaka, when the walls fell."
fi

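The block fails because [ does not run commands: it evaluates its arguments as a test expression and reports the result through its exit code. Here the words grep, -q, 'Tanagra', and the log path are handed to [ as plain strings that don't form a valid expression, so grep never runs and the shell typically complains about the arguments instead. Remember, the first line is equivalent to: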

if test grep -q 'Tanagra' /log/darmoks_trips.log; ...

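…and a tempting “fix” is to quote the command:

if test "grep -q 'Tanagra' /log/darmoks_trips.log"; ...

That silences the error, but only because test treats a single non-empty string as true. grep still never runs, so the condition succeeds whether or not the log mentions Tanagra. There’s no need for any of this anyway, because the if command already runs the command list that follows it and branches on its exit status. So all we need is: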

if grep -q 'Tanagra' /log/darmoks_trips.log; ...
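To see the difference in action, here is a minimal sketch. It assumes a throwaway file created with mktemp (widely available, though not strictly POSIX) as a stand-in for /log/darmoks_trips.log, and it wraps grep in a hypothetical helper named visited, which is not part of my original script.

```shell
#!/bin/sh
# visited PLACE FILE: succeeds if PLACE appears in FILE.
# (Hypothetical helper, introduced only for this illustration.)
visited() {
  grep -q "$1" "$2"
}

# Throwaway stand-in for /log/darmoks_trips.log.
log=$(mktemp)
printf '%s\n' 'Darmok and Jalad at Tanagra' > "$log"

# if runs the command and branches on its exit status; no [ needed.
if visited 'Tanagra' "$log"; then
  echo "Shaka, when the walls fell."
fi

# An absent pattern makes grep exit non-zero, so the else branch runs.
if visited 'El-Adrel' "$log"; then
  echo "found El-Adrel"
else
  echo "no El-Adrel"
fi

# A quoted command is only a non-empty string to test: this is true
# even though grep is never run and the pattern is not in the file.
if test "grep -q 'El-Adrel' $log"; then
  echo "test saw a non-empty string"
fi

rm -f "$log"
```

Run with sh, this should print the “Shaka” line, then “no El-Adrel”, then “test saw a non-empty string”, which is exactly the false positive that quoting the command gives you.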

Which leads me to the second thing I’ve learned about writing portable shell scripts: Read The F^&k!ng Manual. Without getting into a digression on Bashisms, POSIX-compliance, and the many subtle differences in shells, if you want to ensure that your scripts will work as expected, you should be aware of what is and is not part of the POSIX specification. So what does POSIX say about ifs? Glad you asked:

THE IF CONDITIONAL CONSTRUCT

The if command shall execute a compound-list and use its exit status to determine whether to execute another compound-list.

The format for the if construct is as follows:

if compound-list
then
 compound-list
[elif compound-list
then
 compound-list] ...
[else
 compound-list]
fi

The if compound-list shall be executed; if its exit status is zero, the then compound-list shall be executed and the command shall complete. Otherwise, each elif compound-list shall be executed, in turn, and if its exit status is zero, the then compound-list shall be executed and the command shall complete. Otherwise, the else compound-list shall be executed.

EXIT STATUS

The exit status of the if command shall be the exit status of the then or else compound-list that was executed, or zero, if none was executed.
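The exit status rules quoted above are easy to check by hand. The following is a small sketch, not anything from the spec itself; the function names no_branch, then_branch, and pick_branch are made up for the demonstration.

```shell
#!/bin/sh
# The condition fails and there is no else, so no compound-list runs
# and the if command (and thus the function) exits with status 0.
no_branch() {
  if false; then
    echo "never printed"
  fi
}

# The then branch runs; the status of its last command, here 3,
# becomes the exit status of the if command.
then_branch() {
  if true; then
    (exit 3)
  fi
}

# elif is only tried after the first condition fails.
pick_branch() {
  if false; then
    echo "first"
  elif true; then
    echo "second"
  else
    echo "third"
  fi
}

no_branch
echo "no branch ran: $?"

then_branch
echo "then branch ran: $?"

pick_branch
```

This should report 0 for the first case and 3 for the second, and print “second” for the last one.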

Obviously, that's just one gotcha among many, and if you need your scripts to work for a wide user base, it's definitely worth investigating the common pitfalls. Luckily, there's a linter for that: ShellCheck (source on GitHub). Not only does it highlight Bashisms, it often provides POSIX alternatives. Well worth a look!