The landscape
I'm helping someone with a scripting problem on an old system and an old shell. How old? Try IBM AIX 6.1, first released in 2007, and ksh93 "e" released in ... 1993. At least the AIX is a version of 6.1 from 2014! (Kudos to IBM for treating long-term support seriously.)
A second point to ponder: the goal is to improve remote scripting, that is, running scripts on a remote machine. In this environment, ssh exists but is not used. The remote execution tool chosen is rexec, considered one of the most dangerous tools possible. But my remit is not to address the insecurity, just to improve the scripting. (They know this is bad, and are actively working to resolve it eventually.)
So, given these constraints, what problem am I solving?
Example problem
This environment makes extensive use of remotely executed scripts to
wire together a distributed, locally-hosted system. Current scripts
duplicate the same approach, each implemented as a one-off: Copy a
script to a remote machine with rcp; use
rexec to invoke the script, capturing the output to a
file on the remote host; copy the captured file back to the local host;
process the output file; sometimes clean up the remote host afterwards.
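The one-off pattern described above looks roughly like the sketch below. It is an illustration, not the environment's actual code: the function name run_remote_oneoff, the REMOTE_SHELL override, and the scratch file names under /tmp are all assumptions made for the example.

```shell
# run_remote_oneoff (hypothetical name): copy, run, copy back, clean up.
# usage: run_remote_oneoff HOST LOCAL-SCRIPT LOCAL-OUTPUT
run_remote_oneoff() {
    host=$1 ; script=$2 ; out=$3
    remote_shell=${REMOTE_SHELL:-/usr/bin/ksh93}   # assumed interpreter path
    remote_script=/tmp/${script##*/}.$$            # assumed scratch names
    remote_out=$remote_script.out

    # 1. Copy the script to the remote machine
    rcp "$script" "$host:$remote_script" || return 1
    # 2. Run it, capturing the output in a file on the remote host
    rexec "$host" "$remote_shell $remote_script >$remote_out 2>&1"
    # 3. Copy the captured output back for local processing
    rcp "$host:$remote_out" "$out" || return 1
    # 4. Clean up the remote host (the step that is sometimes forgotten)
    rexec "$host" "rm -f $remote_script $remote_out"
}
```

Every script that repeats these four steps also repeats their failure modes, such as leftover files when step 4 is skipped.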
Some gotchas to watch out for with ksh93e or rexec:
- Function tracing - Using the standard xtrace setting to trace script execution in ksh93 has problems with tracing functions, and requires using old-style function syntax
- Variable scope - To keep variables local to a function in ksh93, you must use the new-style function syntax (note the conflict with tracing)
- Exit broken with trap - When calling exit to quit a remote script, trap does not get a correct $? variable (it is always 0, as exit succeeded in returning a non-0 exit status). Instead one must "set" $? with the code of a failing command, and then leave with a plain call to exit
- No pipefail - Release "e" of ksh93 just does not know anything about set -o pipefail, and there is no unintrusive workaround. This now common feature showed up in release "g"
- No exit code - Would you believe rexec does not itself exit with the exit code of the remote command, never has, and never will? It always exits 0 if the remote command could be started.
- Buffered stderr - Empirically, rexec (at least the version with this AIX) buffers the stderr stream of remote commands, and only flushes when rexec exits, so the sense of ordering between stdout, stderr and the command-line prompt is even worse than usual (the actual handling is unspecified)
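The "No pipefail" item is easy to reproduce with any pipeline whose first command fails. The sketch below shows the lost status and one intrusive workaround that parks the producer's status in a scratch file. The names lost_status and run_piped are hypothetical, cat stands in for a real consumer, and none of this comes from the original scripts.

```shell
# A pipeline reports only its last command's status, so the failure vanishes
lost_status() {
    false | cat
    echo "$?"
}

# run_piped (hypothetical name): save the producer's status in a scratch
# file before the pipe discards it, then return it to the caller.
# usage: run_piped STATUS-FILE PRODUCER [ARGS]...
run_piped() {
    status_file=$1 ; shift
    {
        producer_rc=0
        "$@" || producer_rc=$?
        echo "$producer_rc" >"$status_file"
    } | cat                      # stand-in for the real consumer
    read -r producer_rc <"$status_file"
    rm -f "$status_file"
    return "$producer_rc"
}
```

The scratch file is exactly the kind of detritus that makes this workaround intrusive: every pipeline needs its own file and its own cleanup.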
This problem and environment trigger a memory: the last time I worked on AIX was in 1994, and it was almost the same problem! I really thought I had escaped those days.
A solution
So I refactored. I couldn't change the use of rexec (this environment is not ready for SSH key management), and I couldn't replace KSH93 with BASH or replace AIX with Linux, but I could do something about the imperfect duplication and random detritus files.
The solution
Note the need to call a fail function instead of
exit directly because of poor interaction with
trap.
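The fail-then-plain-exit pattern can be tried in isolation. This is a minimal sketch under assumptions: demo_exit_via_fail is a hypothetical name, and the function body runs in a subshell only so that the EXIT trap fires without ending the calling shell. On shells without the ksh93e problem a direct exit with a code would also work; the pattern is harmless there.

```shell
# fail: set $? to the given code without calling exit with an argument
fail() {
    return "$1"
}

# demo_exit_via_fail (hypothetical name): the subshell body keeps the
# EXIT trap and the exit local to this function call
demo_exit_via_fail() (
    trap 'echo "trap saw rc=$?"' EXIT
    fail "$1"
    exit        # plain exit: reuses the $? that fail just set
)
```

Calling demo_exit_via_fail 3 lets the trap report the status that fail set, and the same status becomes the exit status of the call.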
Assuming some help, such as a global progname variable
(which could simply be $0), and avoiding remote
temporary files:
_transfer_exit_code() {
    # Pass output lines through; a "^N" marker line becomes our return code
    while IFS= read -r line
    do
        case $line in
            ^[0-9] | ^[1-9][0-9] | ^1[01][0-9] | ^12[0-7] ) return ${line#^} ;;
            * ) printf '%s\n' "$line" ;;
        esac
    done
    return 1 # ksh93e lacks pipefail; we get here when 'rscript' failed
}
rscript() {
    case $# in
        0 | 1 )
            echo "$progname: BUG: Usage: rscript SCRIPT-NAME HOSTNAME [ARGS]..." >&2
            return 2 ;;
        * ) script_name=$1 ; shift
            hostname=$1 ; shift ;;
    esac
    # Trace the caller's script if we ourselves are being traced
    _set_x=
    case $- in
        *x* ) _set_x='set -x' ;;
    esac
    rexec $hostname /usr/bin/ksh93 -s "$@" <<EOS | _transfer_exit_code
set - "$@" # Only reasonable way to pass through function arguments
# Work around AIX ksh93 return code of exit ignored by trap
fail() {
return \$1
}
# Our hook to capture the exit code for rexec who dumbly swallows it
trap 'rc=\$?; echo ^\$rc; exit \$rc' EXIT
PS4='+$script_name:\$(( LINENO - 14 )) (\$SECONDS) '
$_set_x
# The callers script
$(cat)
EOS
}
Example use
#!/usr/bin/ksh93
progname=${0##*/}
PS4='+$progname:$LINENO ($SECONDS) '
usage() {
    echo "Usage: $0 [-d] HOSTNAME"
}
. rexec.ksh
debug=false
while getopts :d opt
do
    case $opt in
        d ) debug=true ;;
        * ) usage >&2 ; exit 2 ;;
    esac
done
shift $(( OPTIND - 1 ))
case $# in
    1 ) hostname=$1 ;;
    * ) usage >&2 ; exit 2 ;;
esac
$debug && set -x
script_name=My-Remote-Script
tmp=${TMPDIR-/tmp}/$progname.$RANDOM
trap 'rm -f $tmp' EXIT
rscript $script_name $hostname Katy <<'EOS' >$tmp
echo $#: $1
fail 3
EOS
case $? in
    3 ) ;;
    * ) echo "$0: Did not pass through exit code" >&2 ; exit 1 ;;
esac
case "$(<$tmp)" in
    '1: Katy' ) ;;
    * ) echo "$0: Did not pass through arguments" >&2 ; exit 1 ;;
esac
Source
The code is on GitHub.