Rob van der Woude's Scripting Pages

Batch How To ...

Validate Input From SET /P

The September 2014 disclosure of a command insertion vulnerability for command-shell scripts (batch files), by The Security Factory, made it painfully clear that batch files can be vulnerable to exploits.

Batch files are extremely "weakly typed" (to use the understatement of the millennium): everything is a string, and a string can be anything, command as well as data, or even both, and there is no way to distinguish between the two.
This makes batch code insertion more than just a potential threat.

It all comes down to input validation, and the good news is: input validation turns out to be remarkably simple, once you know how.

Command line validation, and parameter files as an alternative, are discussed on their own dedicated page.

Safely using SET /P to prompt for input

The Problem: Code Insertion

A convenient way to prompt for user input is Windows' native SET command:

SET /P "Input=Please type anything and press Enter: "

will prompt for input (Please type anything and press Enter: ), accept all keyboard input until Enter is pressed, and store the input in environment variable Input.

No risk so far...

Try it: when prompted, type abc&ping ::1 and press the Enter key.

Nothing much happened so far...

Enter the command SET Input to see if your input was stored in the environment variable Input:

Input=abc&ping ::1

However, when you enter the command ECHO %Input% the output will look like this:

abc

Pinging ::1 with 32 bytes of data:
Reply from ::1: time<1ms
Reply from ::1: time<1ms
Reply from ::1: time<1ms
Reply from ::1: time<1ms

Ping statistics for ::1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms

Why?

The explanation is that the ampersand acts as a "command separator": everything following a (single) ampersand will be interpreted as a new command.
ECHO %Input% is evaluated to ECHO abc&ping ::1 which is in turn interpreted as:

ECHO abc
ping ::1

Note: This issue is similar to the unquoted %CD% code insertion vulnerability (though less likely to be exploited, so far).

For the unquoted %CD% vulnerability, the solution is to put the variable between doublequotes: "%CD%"; this works because %CD% will never contain doublequotes itself (unless of course this dynamic variable has been set to a static value).
With SET /P however, the user can type anything when prompted, including ampersands and stray doublequotes.

Suppose we were to place %Input% between doublequotes; try:

ECHO "%Input%"

and the output will be:

"abc&ping ::1"

Looking good so far...
But now, repeat the SET /P command, and enter abc"&ping ::1&echo "oops and see what happens with ECHO "%Input%":

"abc"

Pinging ::1 with 32 bytes of data:
Reply from ::1: time<1ms
Reply from ::1: time<1ms
Reply from ::1: time<1ms
Reply from ::1: time<1ms

Ping statistics for ::1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms
"oops"

Oops indeed!

The Solution: Delayed Variable Expansion

Until recently I believed that safely parsing the input is a cat and mouse game we were never going to win.

Then Kang-Che Sung provided an ingenious and remarkably simple solution to the problem: use delayed variable expansion!

I read your page about SET /P validation and info about command injection vulnerability (which I have also discovered during my coding).
My personal solution is to just use Delayed Expansion for that variable, and (according to my test) nothing will be executed or wrongly parsed, period.

I actually believe Delayed Expansion is the ultimate solution to this SET /P problem, and with it you don't need any echo-piped-to-findstr tricks.

SETLOCAL EnableDelayedExpansion
SET /P var="Type anything here:"
ECHO .!var!.
ECHO strip quotes: .!var:"=!.
REM If you really want to reject the variable with any quotation mark...
ECHO !var! | FIND """" >NUL && SET var=
REM That's it.

Try it, try to insert code using ampersand, percent signs, exclamation marks, doublequotes... I could not make this code fail, so far.

To understand why it works, we need to understand how the command interpreter handles command lines.

Quoting Timothy Hill in his book Windows NT Shell Scripting (text marked red added by yours truly):

  1. All parameter and variable references are resolved [...].
  2. Compound commands are split into individual commands and each is then individually processed according to the following steps [...]. Continuation lines are also processed at this step.
  3. Delayed variable references are resolved. (1)
  4. The command is split into the command name and any arguments.
  5. If the command name does not specify a path, the shell attempts to match the command name against the list of internal shell commands. If a match is found, the internal command executes. Otherwise, the shell continues to step 6.
  6. If the command name specifies a path, the shell searches the specified path for an executable file matching the command name. If a match is found, the external command (the executable file) executes. If no match is found, the shell reports an error and command processing completes.
  7. If the command name does not specify a path, the shell searches the current directory for an executable file matching the command name. If a match is found, the external command (the executable file) executes. If no match is found, the shell continues to step 8.
  8. The shell now searches each directory specified by the PATH environment variable, in the order listed, for an executable file matching the command name. If a match is found, the external command (the executable file) executes. If no match is found, the shell reports an error and command processing completes.

Delayed variable expansion takes place in step 3 (marked red), whereas code insertion can only take place in step 2.

Notes: 1: Delayed variable expansion (step 3, marked red) is not mentioned in Tim Hill's book; this step was explained to me by Kang-Che Sung.
  2: Steps 5..8 describe how the executable is found for the specified command; these steps are not relevant for this explanation of code insertion.

Quoting from Kang-Che Sung's explanation:

  1. Percent expansion %var% and %1 ... %9
    (This step briefly explains why FOR tokens need double-percent signs in scripts: this step is handled differently if CMD is in interactive mode, i.e. the mode in which you type commands.)
  2. Command splitting. Handles "(" ")" "&" "|" "<" ">" (and newline) delimiters. (This is where command injection takes place.)
  3. Delayed Expansion. At this time ( ) & | < > are no longer special and so won't split the line into additional commands or pipes.
  4. Word splitting. Technically it just identifies which part is the command (%0) and which part is the arguments (%*). Unlike Unix scripting, where all pieces of arguments ($1 $2 $3 ...) are split at this point, Windows shells just pass the whole string of arguments to the external (callee) program and let it handle the splitting by itself. (This explains why Unix shell scripts allow "$@" in addition to "$*".)
  5. The rest, steps 5 to 8, are about resolving paths to command, and so are irrelevant for us.
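
The difference in parsing order can be seen in a minimal sketch like the one below. The hard-coded value of Input is an assumption that stands in for hostile SET /P input, and ECHO INJECTED is just a harmless placeholder for the inserted command.

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion
REM Assumption: this hard-coded value stands in for hostile SET /P input;
REM inside the doublequotes of the SET command the ampersand is stored as plain text
SET "Input=abc&ECHO INJECTED"
REM Percent expansion comes before command splitting, so the next line
REM is split into two commands, and the second one is the inserted ECHO
ECHO Percent: %Input%
REM Delayed expansion comes after command splitting, so the ampersand
REM is no longer special and the value is displayed as plain text
ECHO Delayed: !Input!
ENDLOCAL
```

The first ECHO should display only "Percent: abc", followed by the output of the inserted command; the second should display the stored value unchanged, ampersand included.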

Before processing the acquired input, it may be wise to reject input that contains "questionable" characters.

The following code will check if %Input% contains doublequotes, and if so, wipe the input:

SET Input | FIND """" >NUL
IF NOT ERRORLEVEL 1 SET Input=

Likewise, you can use FIND "&" to test for the occurrence of ampersands, or FINDSTR /R /C:"[&""|()]" to test for all "questionable" characters:

(
	SET Input | FINDSTR /R /C:"[&""|()]"
	IF NOT ERRORLEVEL 1 SET Input=
) >NUL

Notes: 3: Scott Sumner noted that, when redirecting FINDSTR's output to NUL in a batch file, the return code ("ErrorLevel") will always be 0.
That is why, for FINDSTR, redirection to NUL is done after checking the return code first.
  4: The shorter code SET Input | FINDSTR /R /C:"[&""|()]" && SET Input= does not work in Windows 7 (not tested in other Windows versions).
Though (SET Input | FINDSTR /R /C:"[&""|()]") && SET Input= does work, the nested parentheses required for redirection to NUL might lead to new issues.
My advice is to keep it simple and safe, and use the IF NOT ERRORLEVEL 1 test.
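
As an alternative to the pipes above, the same check can be sketched using nothing but delayed expansion. This is not the method described above, just a minimal sketch under a few assumptions: delayed expansion is enabled, the variable name Clean and the label NoInput are made up for this example, and the rejected characters are the same ones used in the FINDSTR expression.

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET "Input="
SET /P "Input=Please type anything and press Enter: "
IF NOT DEFINED Input GOTO NoInput
REM Remove each questionable character from a copy of the input;
REM Clean is a hypothetical helper variable
SET "Clean=!Input:"=!"
IF DEFINED Clean SET "Clean=!Clean:&=!"
IF DEFINED Clean SET "Clean=!Clean:|=!"
IF DEFINED Clean SET "Clean=!Clean:(=!"
IF DEFINED Clean SET "Clean=!Clean:)=!"
REM If anything was removed, the copy differs from the original, so wipe the input
IF NOT "!Input!"=="!Clean!" SET "Input="

:NoInput
IF DEFINED Input (ECHO Accepted: !Input!) ELSE (ECHO Input rejected or empty)
ENDLOCAL
```

Because the input is only ever referenced as !Input! or !Clean!, it is never subject to command splitting, and no FIND or FINDSTR return code is involved. The line that strips the doublequotes is kept outside parentheses on purpose: its odd number of doublequotes could otherwise hide a closing parenthesis from the command interpreter.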

Alternatives for SET /P

You don't always need "free" input; often you only want the user to select from a list of choices.
In that case, consider using CHOICE instead of SET /P.
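
A minimal sketch of such a menu, assuming the CHOICE command that ships with Windows Vista and later; the keys A, B and Q, the prompt text and the labels are made up for this example:

```batch
@ECHO OFF
REM Hypothetical menu: the keys A, B and Q and the labels are placeholders
CHOICE /C ABQ /N /M "Press A for apples, B for bananas or Q to quit: "
REM CHOICE returns the position of the pressed key in the /C list as ErrorLevel;
REM IF ERRORLEVEL n means n or higher, so test the highest value first
IF ERRORLEVEL 3 GOTO End
IF ERRORLEVEL 2 GOTO Bananas
IF ERRORLEVEL 1 GOTO Apples
GOTO End

:Apples
ECHO You selected apples
GOTO End

:Bananas
ECHO You selected bananas

:End
```

Since CHOICE only accepts the keys listed after /C, and the script only ever looks at the return code, there is no typed string that could be interpreted as a command.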

If you do need "free" input, either use the routine above, or a different scripting language, or my InputBox.exe (which removes doublequotes, ampersands and redirection characters from the input).

page last modified: 2016-09-19