Shell Scripts: Safe Short/Long Option Parsing

I’ve written a ton of Bash scripts over the years - haven't we all?
To be honest, for a long time I handled command-line arguments in the most primitive way possible: checking for $1, $2, and so on.
It was ugly, fragile and a real pain to maintain. If the user (i.e. I myself in many cases 😂) put the flags in the "wrong" order? Script broken. Needed to add a new optional flag? More if-else spaghetti 🍝

THAT, my dear reader, is the problem getopt solves elegantly. It’s a powerful utility for bringing civilised, robust argument parsing to your shell scripts.

An open oyster shell revealing a single, perfect pearl.
Inside every good shell script is a pearl of wisdom. For command-line options, that pearl is getopt.


TL;DR

Stop manually parsing $1, $2, etc. in your shell scripts! Use getopt to define and parse arguments like a command-line pro 😎

Need Long Options or Optional Arguments?

Use GNU getopt.

The safe way to do it is:

  1. Run getopt
  2. Run eval set -- "$PARSED".
    1. The reason is that getopt returns a shell‑quoted string that must be re‑tokenised.
    2. Here, the eval is intentional and safe.

Only Short Options?

Use the POSIX getopts builtin (available in sh, bash, zsh and friends).

No external dependencies, no eval.


The Problem I Want to Solve

I want a clean and robust shell script that can handle:

  • Optional arguments with default: -a, --arga[=VAL]
  • Flags: -b, --argb
  • Required arguments: -c, --argc VAL
  • Help: --help.
  • Non-option arguments: -- this and that.

The Core Idea: The "Option String" Menu

The only way to tell getopt what to expect is by giving it an "option string", one for short options (-o) and one for long options (--long).

You can look at the option string as the menu you hand to getopt. It tells getopt what "meals" (options) are available and which ones go with a "side" (an argument) 😋

Short Option String (-o)

  • Each character is an option.
    • abc means you accept -a, -b, and -c.
  • A colon : after a character means it requires an argument.
    • c: means -c must be followed by a value (e.g., -c my-value).
  • Two colons :: after a character means it has an optional argument.
    • a:: means -a can be used alone or with a value attached directly to it (e.g., -a or -amy-value). A separate word, as in -a my-value, is not taken as the value.

Example:

Consider -o f:gh::i::

  • -f requires an argument.
  • -g is a simple flag (no argument).
  • -h has an optional argument.
  • -i also has an optional argument (note the two colons).
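
To see that menu in action, here is a minimal sketch that hands the same option string to getopt and prints the normalised result. It assumes the getopt on your PATH is the GNU one (util-linux); the helper name show_short and the sample values are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU getopt (util-linux) is the 'getopt' on PATH.
# show_short is a hypothetical helper name, not part of getopt.
show_short() {
  getopt --options 'f:gh::i::' --name demo -- "$@"
}

# -f consumes the next word, -g takes nothing,
# -h gets the value attached to it, -i gets an empty placeholder.
show_short -f in.txt -g -hfast -i
```

With util-linux getopt this should print -f 'in.txt' -g -h 'fast' -i '' -- (with a leading space). Notice the empty '' after -i: that is how an omitted optional value shows up.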

Long Option String (--long)

The rules are the same, but the options are full words separated by commas.

Example:

Consider --long file:,verbose,user::,help.

  • --file requires an argument (--file /path/to/it).
  • --verbose is a simple flag.
  • --user has an optional argument (--user or --user=bahman).
  • --help is a simple flag.
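
The same kind of experiment works for long options. This is only a sketch under the same assumption (GNU getopt on PATH), and show_long is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU getopt (util-linux) is the 'getopt' on PATH.
# show_long is a hypothetical helper name.
show_long() {
  # An explicit (here: empty) short option string is still needed. Without it,
  # getopt uses the first parameter after '--' as the short option string.
  getopt --options '' --long 'file:,verbose,user::,help' --name demo -- "$@"
}

show_long --file /path/to/it --user --verbose
show_long --user=bahman --help
```

The first call should come back with an empty '' after --user, the second with 'bahman' in that slot.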

Rich Pattern: GNU getopt (Long + Short Options)

This is a safe, complete script you can use in any scenario - obviously after renaming the options 😁

#!/usr/bin/env bash

# Strict mode on
set -Eeuo pipefail

script_name="${0##*/}"

usage() {
  cat <<EOF
Usage: $script_name [OPTIONS] [--] [ARGS...]

Options:
  -a, --arga[=VAL]   Optional argument (default: "some default value")
  -b, --argb         Flag (off by default)
  -c, --argc VAL     Required argument
  -h, --help         Show this help and exit
  -v, --version      Show version and exit

Examples:
  $script_name -b -c hello
  $script_name --arga=foo --argc "hi there" -- file1 file2
  $script_name -a -cval -- some-positional
EOF
}

version() { echo "$script_name 1.0.0"; }

# Defaults
ARG_A="some default value"
ARG_B=0
ARG_C=""

# Resolve getopt binary first, then test it. GNU getopt (util-linux) returns exit status 4 for -T/--test.
GETOPT_BIN="${GETOPT:-getopt}"
# Capture the status on the right-hand side of '||': a bare failing command
# would abort under 'set -e', and '|| true' would reset $? to 0.
getopt_status=0
"$GETOPT_BIN" -T >/dev/null 2>&1 || getopt_status=$?
if [[ $getopt_status -ne 4 ]]; then
  echo "Error: This script requires GNU 'getopt' (util-linux)." >&2
  echo "On macOS: brew install gnu-getopt && brew link --force gnu-getopt" >&2
  echo "Or export GETOPT=/<path>/getopt to point at the GNU binary." >&2
  # The \$ keeps the command substitution literal so it is printed, not executed.
  echo "Homebrew path (incl. Apple Silicon): export GETOPT=\"\$(brew --prefix gnu-getopt)/bin/getopt\"" >&2
  exit 1
fi

# Define short and long options:
# - Short: a:: (optional), b (flag), c: (required), h (help), v (version)
# - Long:  arga::,  argb,  argc:,  help,  version
PARSED="$("$GETOPT_BIN" \
  --options=a::bc:hv \
  --long=arga::,argb,argc:,help,version \
  --name "$script_name" -- "$@")" || {
  # getopt already printed an error; exit non-zero.
  exit 2
}

# Why 'eval set -- "$PARSED"'?  GNU getopt returns a shell-escaped string
# representing the normalised argv. This safely re-tokenises it so "$1", "$2"...
# are correct. It's safe because $PARSED is generated by getopt, not user input.
eval set -- "$PARSED"

while true; do
  case "$1" in
    -a|--arga)
      # Optional argument placeholder: util-linux inserts an empty string if omitted,
      # so there are always two words to shift. An empty value keeps the default.
      if [[ -n "${2:-}" ]]; then
        ARG_A="$2"
      fi
      shift 2
      ;;
    -b|--argb)
      ARG_B=1
      shift
      ;;
    -c|--argc)
      ARG_C="$2"
      shift 2
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    -v|--version)
      version
      exit 0
      ;;
    --)
      shift
      break
      ;;
    *)
      echo "Internal error: unexpected option '$1'" >&2
      exit 3
      ;;
  esac
done

# Remaining args after '--' are positional arguments:
POSITIONAL=("$@")

# Validate required options
if [[ -z "$ARG_C" ]]; then
  echo "Error: --argc is required." >&2
  echo >&2
  usage
  exit 2
fi

# Do something useful with parsed values
printf 'ARG_A = %s\n' "$ARG_A"
printf 'ARG_B = %s\n' "$ARG_B"
printf 'ARG_C = %s\n' "$ARG_C"
printf 'POSITIONAL (%d): %s\n' "${#POSITIONAL[@]}" "${POSITIONAL[*]-}"

Niceties to Note About The Script

  • Uses strict mode (set -Eeuo pipefail) for fail‑fast behaviour.
  • Explicitly identifies GNU getopt using -T (exit status 4), avoiding silent portability traps.
  • Handles optional arguments safely and predictably.
  • Uses -- to separate options from positional arguments.
  • Safely uses eval set -- "$PARSED". That is the correct pattern for GNU getopt.
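
If you want to reuse the GNU detection elsewhere, it can be wrapped in a small function. This is a sketch, not the one true way; is_gnu_getopt is a name I made up.

```shell
#!/usr/bin/env bash
# Sketch only: is_gnu_getopt is a hypothetical helper name.
# Returns 0 when the given binary behaves like GNU getopt (-T exits with 4).
is_gnu_getopt() {
  local bin="${1:-getopt}" status=0
  # The '||' keeps 'set -e' from aborting and records the real exit status.
  "$bin" -T >/dev/null 2>&1 || status=$?
  [[ $status -eq 4 ]]
}

if is_gnu_getopt "${GETOPT:-getopt}"; then
  echo "GNU getopt found"
else
  echo "GNU getopt not found" >&2
fi
```

Anything that does not exit with exactly 4 (a BSD getopt, a missing binary, a typo in GETOPT) is treated as "not GNU".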

See It In Action

A terminal cast of getopts-demo.sh in action.



Portable Alternative: Pure POSIX getopts (Short Options Only)

Occasionally, when I don’t need long options, I reach for getopts.

The good thing is that it’s built‑in and portable with no external dependencies, but that comes at the cost of long options and optional arguments. Ugh!

#!/usr/bin/env bash

# Strict mode on.
set -Eeuo pipefail

script_name="${0##*/}"

usage() {
  cat <<EOF
Usage: $script_name [-a VAL] [-b] -c VAL [--] [ARGS...]

Options:
  -a VAL    Optional argument (simulate optional by defaulting if omitted)
  -b        Flag (off by default)
  -c VAL    Required argument
  -h        Show help

Note: getopts does not support long options or true optional arguments.
EOF
}

ARG_A="some default value"  # Simulate optional by default
ARG_B=0
ARG_C=""

# Leading colon (:) -> silent error reporting; we handle errors in the ':' and '?' cases
while getopts ":a:bc:h" opt; do
  case "$opt" in
    a) ARG_A="$OPTARG" ;;
    b) ARG_B=1 ;;
    c) ARG_C="$OPTARG" ;;
    h) usage; exit 0 ;;
    :)
      # Missing required argument for option
      echo "Error: Option -$OPTARG requires an argument." >&2
      usage
      exit 2
      ;;
    \?)
      echo "Error: Invalid option: -$OPTARG" >&2
      usage
      exit 2
      ;;
  esac
done
shift $((OPTIND - 1))

POSITIONAL=("$@")

if [[ -z "$ARG_C" ]]; then
  echo "Error: -c is required." >&2
  usage
  exit 2
fi

printf 'ARG_A = %s\n' "$ARG_A"
printf 'ARG_B = %s\n' "$ARG_B"
printf 'ARG_C = %s\n' "$ARG_C"
printf 'POSITIONAL (%d): %s\n' "${#POSITIONAL[@]}" "${POSITIONAL[*]-}"

Optional Arguments: Handle the Ambiguity

Optional arguments are tricky.

When you write --user file.txt, does file.txt belong to --user or is it a positional argument?

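GNU getopt resolves this with a fixed rule: an optional value only counts when it is attached to the option (--user=VAL or -uVAL). When the value is omitted, getopt inserts an empty string as a placeholder, and a separate word such as file.txt becomes a positional argument:

# Input:  --user file.txt
# Output: --user '' -- 'file.txt'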

Best practice: Use = for long options and attach values directly to short options:

script --user=bahman     # Clear
script -ubahman          # Clear
script --user file.txt   # Ambiguous - avoid
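
You can check all three spellings yourself with a few lines. A minimal sketch, assuming GNU getopt on PATH; normalise_user is a hypothetical helper and -u / --user are just the example option from above.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU getopt (util-linux) is the 'getopt' on PATH.
# normalise_user is a hypothetical helper name.
normalise_user() {
  getopt --options 'u::' --long 'user::' --name demo -- "$@"
}

normalise_user --user=bahman    # value attached with '='
normalise_user -ubahman         # value attached to the short option
normalise_user --user file.txt  # file.txt lands after '--' as a positional
```

In the last call --user gets an empty '' and file.txt shows up after the -- separator, which is rarely what the person typing it had in mind.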

Notes

  • At its core, GNU getopt is just a “normaliser”: it takes messy user input and returns a clean, shell‑escaped “recipe” for the final argv. The eval set -- "$PARSED" turns that recipe into a “pre‑cooked meal”: a proper $1 $2 ... list your script can consume.
  • The -- separator is a shield: everything after it is treated as data, not options.
  • Using --name "$script_name" makes getopt print script‑specific error messages.
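
Here is a small sketch of that round trip, to show that the eval only re‑tokenises and does not execute anything hiding in the user's input. It assumes GNU getopt on PATH; roundtrip and the -c / --argc option are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU getopt (util-linux) is the 'getopt' on PATH.
# roundtrip is a hypothetical helper name.
roundtrip() {
  local parsed
  parsed="$(getopt --options 'c:' --long 'argc:' --name demo -- "$@")" || return 2
  # getopt single-quotes every word, so eval re-tokenises without expanding it.
  eval set -- "$parsed"
  printf '<%s>\n' "$@"
}

# The value contains a space, a single quote and a command substitution.
roundtrip -c "it's \$(date)" -- -x
```

Each word comes back on its own line in angle brackets: the value survives intact with $(date) still literal, and -x stays a positional argument because it follows the -- separator.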

Common Gotchas

  • Always try to validate required flags early in your script and fail fast with friendly usage messages.
  • macOS ships a BSD getopt that’s not compatible with GNU getopt 🤦
    • Install GNU getopt via Homebrew: brew install gnu-getopt && brew link --force gnu-getopt.
    • Then point at it explicitly: export GETOPT="$(brew --prefix gnu-getopt)/bin/getopt". On Linux, you almost always have GNU already.
  • Optional arguments are a nasty creature:

    • Prefer --opt=value for optional arguments; it avoids ambiguity.
    • For short options, -oVALUE is okay; avoid -o VALUE when optional.
  • Don’t drop the -- before positional arguments:

    • script -- -not-an-option file.txt ensures -not-an-option isn’t parsed.
  • Do not eval user input. I repeat: Do. Not.
    • Only eval the string produced by GNU getopt. That output is already shell‑escaped and safe to re‑tokenise.
  • set -e interactions:
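    • getopt -T exits with status 4 on purpose, so run it where set -e ignores the failure (an if test or the left‑hand side of ||) and read the status from there. Note that $? after || true is always 0.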
  • getopts limitations:
    • No long options; no true optional arguments. Simulate optional by setting defaults.
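
The -- gotcha is easy to try without any external tool, because the getopts builtin honours the separator too. A minimal sketch; split_args and the -n flag are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch only: split_args is a hypothetical helper and -n is a made-up flag.
split_args() {
  local OPTIND=1 opt flags=0
  while getopts ':n' opt; do
    case "$opt" in
      n) flags=$((flags + 1)) ;;
      *) return 2 ;;
    esac
  done
  # Drop everything getopts consumed, including the '--' separator.
  shift $((OPTIND - 1))
  printf 'flags=%d positional=%s\n' "$flags" "$*"
}

split_args -n -- -n file.txt   # second -n is data
split_args -n -n file.txt      # second -n is parsed as a flag
```

The first call reports one flag and keeps -n file.txt as positional arguments; the second counts two flags and leaves only file.txt.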

Next Steps

  • Skim the GNU getopt man page for deeper tricks: man 1 getopt (util‑linux).
  • Do you wish for rich, pure‑Bash long options without getopt? Check out getoptions (a modern Bash library) or code generators like Argbash.
  • Related challenge: add subcommands (./script fetch ..., ./script sync ...) and parse them before options. I may get around to that soon 😎

Comments

  1. I did quite a bit of searching for introductions to getopt, and everything I found was packed with waaay too much advanced information without even explaining things (why do programmers tend to go with so much da** complication and not even explain things??) until I found this, which was simple, to the point and actually explained things. Thank you.

  2. Line 29 with the "Internal Error!" message doesn't work. No matter what wrong option you write, the message doesn't appear. Only the "invalid option --" lines appear. This is a problem I face also with my getopt script.

    Replies
    1. That's expected 😄 Any "invalid" option will be caught by getopt on line #11 before it reaches line #29.

      The usecase for line #29 is when the options on line #11 are more general than your program may accept (for backward compatibility for example.) Then you would reject the unsupported options on line #29.

      If you want to silence the "invalid option" error, add `--quiet` to the getopt options on line #11.

      Hope this helps.

    2. Thank you for your explanation, sir, but what I want in my script is that when it detects that the user has entered an invalid option like "-o", the program ends at that moment without showing anything else except the error message.
      Sorry for bothering you with a question about a script from more than 6 years ago. The thing is that achieving the result I'm talking about has kept me obsessed.
      And sorry for my bad English; it's not my first language.

    3. I'm afraid that's not something you can achieve with the snippet I provided.

      That said, you *could* try redirecting getopt's stderr and exiting the program if its exit code is not 0. I haven't tried it, but on paper it may work.
