are you bringing in user input from a web form into your shell scripts?

jerf · on May 7, 2021

"are you bringing in user input from a web form into your shell scripts?"

This is shell scripting. We have literally decades of experience with these. We know for a positive fact that simple concatenation is dangerous. The contents of files, the names of files on the file system, and all the other things that shell scripts normally encounter are perfectly sufficient to wreck your day if you are too casual about them. Should you reply with something incredulous about this claim, be prepared for dozens of people to jump in with their war stories about how a single space in a filename in an unexpected place cost their business millions of dollars because of their shell script screwup. (Obviously, they are not that consequential on average, but the outliers are pretty rough in this case.)
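To make the "single space in a filename" point concrete, here is a minimal bash sketch. The `count_args` helper and the file name are made up for illustration; the only thing being shown is how an unquoted expansion gets split into two words.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

file="my report.txt"   # one file name that happens to contain a space

unquoted=$(count_args $file)     # word splitting turns this into two arguments
quoted=$(count_args "$file")     # quoting keeps it as one argument

echo "unquoted: $unquoted, quoted: $quoted"
```

Swap `count_args` for `rm` or `mv` and the unquoted form operates on two paths that were never meant to exist.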

Shell scripting is frankly dangerous enough just with what is on your system already; actually hooking up user input to it is borderline insane. Shell scripting would have to level up quite a bit to only be something to be concerned about when "user input" was being fed into it.

About half of learning to do half-decent shell scripting in most shells is learning the correct way to do things like pass arguments through from a parent to a child script, because for backwards compatibility reasons, all the easy ways are the wrong ways from the 1970s. It's nice when a more modern take on shell scripting is nicer than that.
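As a rough illustration of the argument pass-through problem, assuming bash: the `child` function below is a hypothetical stand-in for a child script, and the two wrappers differ only in how they forward their arguments.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a child script: reports its argument count.
child() { echo "$#"; }

# The easy way: unquoted $* re-splits every argument on whitespace.
wrapper_wrong() { child $*; }
# The correct way: "$@" expands to each original argument as its own word.
wrapper_right() { child "$@"; }

wrong=$(wrapper_wrong "file one.txt" "file two.txt")
right=$(wrapper_right "file one.txt" "file two.txt")
echo "wrong: $wrong, right: $right"
```

The child sees four arguments through the first wrapper and the intended two through the second.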

I will also say when I'm evaluating libraries for things like shell scripting, I look for things like this, and it definitely doesn't score any points when I see stuff like this.

chrisfinazzo · on May 7, 2021

Just using `set -euo pipefail` will prevent many stupid things, but then again, conventional wisdom these days just seems to be to not use a shell if you can help it.

https://sipb.mit.edu/doc/safe-shell/
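A small sketch of what those flags change, assuming bash is on the PATH. Each case runs in its own `bash -c` child so the abort does not take the demo down with it; the variable name `not_set_anywhere` is arbitrary and assumed to be unset.

```bash
#!/usr/bin/env bash
# Without pipefail, a pipeline's status is that of its last command,
# so the failure of `false` goes unnoticed.
lenient=$(bash -c 'false | true; echo "status=$?"')

# With the strict flags, the failed pipeline aborts before echo runs.
strict=$(bash -c 'set -euo pipefail; false | true; echo "status=$?"' || echo "aborted")

# With -u, expanding an unset variable is an error instead of an empty string.
nounset=$(bash -c 'set -u; echo "value=$not_set_anywhere"' 2>/dev/null || echo "aborted")

echo "lenient: $lenient"
echo "strict: $strict"
echo "nounset: $nounset"
```

This only covers the cases shown; `set -e` in particular has well-known exceptions (conditions, `&&`/`||` chains), so it is a safety net rather than a guarantee.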

jerf · on May 7, 2021

That's another example of what I mean: learning the magic invocations that amount to "Oh, please shell, act halfway like this is the 21st century, please?" It's like how I still have "#!/usr/bin/perl NL use strict; NL use warnings;" burned into my fingers for Perl. (AIUI that's obsolete now but I never got to upgrade to the versions where that became obsolete, and now I'm just out of it.)

lhorie · on May 7, 2021

> We have literally decades of experience with these. We know for a positive fact that simple concatenation is dangerous.

Yes and no. I agree with the overall premise that the footguns are well documented, but at the same time, projects like this show that there are still large segments of developers who will gleefully shoot themselves in the foot because they never took the time to learn shell, or they just never had the opportunity to earn the battle scars.

At least Google has a bug bounty program.

gnfargbl · on May 7, 2021

The developers of a not-insignificant portion of IoT firmware absolutely are bringing user input in from web forms and chucking it into shell scripts. And unfortunately, it's that same class of developers that are disproportionately likely to pick up zx and run with it.

The point is, at this stage in the evolution of internet security, we pretty much know where the bugs come from. Injection attacks are still a huge practical problem. It would be nice if new scripting languages reduced that attack surface rather than increasing it.

pwdisswordfish8 · on May 7, 2021

Who knows, maybe? Or maybe I just want to process file names with spaces in them? Maybe I don’t want to worry about apostrophes in people’s surnames?

dotancohen · on May 8, 2021

> apostrophes in people’s surnames

In English, my daughter's given name has an optional apostrophe (Ma'ayan). I've seen systems that escape last-name apostrophes but not in the first name.

tutfbhuf · on May 7, 2021

How do you deal with it, when you are writing bash scripts?

pwdisswordfish8 · on May 7, 2021

By not using eval (and sh -c, and everything equivalent) unless it’s absolutely unavoidable, and by always quoting variables. The $`...` construct acts basically like (the shell’s) eval. Proper escaping is an absolute must here.
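A minimal bash sketch of the difference, with a made-up hostile "file name". The array form is one common alternative to eval for building a command; it is not the only one.

```bash
#!/usr/bin/env bash
# Hypothetical hostile input: a "file name" that contains shell syntax.
name='notes.txt; echo INJECTED'

# Unsafe: eval re-parses the string, so the part after ';' runs as a command.
unsafe=$(eval "printf '%s\n' $name")

# Safer: build the command as an array and expand it quoted; nothing is re-parsed.
cmd=(printf '%s\n' "$name")
safe=$("${cmd[@]}")

echo "unsafe: $unsafe"
echo "safe: $safe"
```

In the unsafe case the injected `echo` actually executes; in the array case the whole string reaches `printf` as a single argument.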

40four · on May 7, 2021

Indeed, I’m not a bash expert, but I’ve heard multiple times that using eval is a bad idea. Your point is a good reminder of why.

TechBro8615 · on May 7, 2021

I'm pretty sure that "eval is a bad idea" was mentioned in the "thought terminating cliches" topic on Ask HN a few days ago :)

In bash, `eval` is a footgun like any other. You can use it, but you just need to be aware of where your toes are when it shoots.

It's usually a "bad idea" in the sense that if you think you need to use it, you probably don't, and 90% of the time, there is an easier way to accomplish what you want to do. The next 5% of the time, using `eval` might be easier but will also create maintenance debt with its overgeneralization. And the final 5% of the time might be actual legit use cases for `eval`.

I just grepped my codebase for `eval` and I almost never use it. One example of the "overgeneralized" 5% might be when I realized I could use `eval` to set "variable variables" (i.e. the name of the variable is itself a variable, taken from a function argument). It was cool, but I ended up deleting it in favor of a more concrete solution.
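For what it's worth, the "variable variables" case can usually be handled without `eval` in bash. This is only a sketch: `set_var` is a hypothetical helper, and it relies on the bash builtins `printf -v` and `${!name}` indirect expansion.

```bash
#!/usr/bin/env bash
# Hypothetical helper: assign $2 to the variable whose name is given in $1.
# Caveat: the target must not be called "name" or "value", which are locals here.
set_var() {
  local name=$1 value=$2
  # Refuse anything that is not a plain identifier before using it as a name.
  [[ $name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1
  printf -v "$name" '%s' "$value"
}

set_var greeting 'hello; echo INJECTED'
echo "greeting=$greeting"

var_name=greeting
echo "indirect=${!var_name}"   # indirect expansion reads the value back by name
```

The value is stored verbatim, shell syntax and all, because nothing ever re-parses it.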

Personally, if I'm hesitating to use `eval`, it's usually not for any security reasons. In general, my bash scripts only exist in dev machines and CI runners, and I don't copy them into the application containers that are exposed to a live runtime environment with untrusted users. So for CI/dev scripts, I can safely assume the code will only run in CI/dev, and therefore I can trust arbitrary user input (which I can of course still validate).

40four · on May 10, 2021

Thanks for the detailed response! I’ve been trying to level up my bash lately, and I definitely have seen a lot of ‘poor’ examples where they bailed out and used eval plus something else for something native bash can in fact handle just fine, if you dig deep enough!

selfhoster11 · on May 7, 2021

This is usually answered with "poorly", or "with great difficulty".

DonHopkins · on May 7, 2021

By immediately invoking Python and getting the hell out of bash.

Spivak · on May 7, 2021

Python is my absolute favorite language but it's not suitable for the kinds of things you would use bash for.

This is the real code that Ansible uses to run a shell command correctly. It is 350 lines and still covers only a small subset of the features of a single line of bash. https://github.com/ansible/ansible/blob/a2776443017718f6bbd8...

The Python code to do what a single mv invocation does is 120 lines https://github.com/ansible/ansible/blob/a2776443017718f6bbd8...

People always focus on the footguns that exist in Bash the language but ignore how much systems programming is abstracted away from you in the shell environment.

In Bash you can enter a Linux namespace with a single nsenter invocation. If you want to do the same in Python you have to use ctypes and call libc.setns manually.

nonameiguess · on May 7, 2021

How is that a remotely real comparison? The ansible mv function deals with preserving SELinux context, which mv doesn't do, and it automatically deals with a whole lot of common error conditions, whereas mv just fails. If you just want to replicate mv, Python has shutil.move, one line of code. Ansible is trying to do a lot more.

By the way, I don't know if this is the canonical implementation, but FreeBSD mv is 481 lines of C: https://github.com/freebsd/freebsd-src/blob/master/bin/mv/mv...

BiteCode_dev · on May 7, 2021

The code you are showing does a lot more than mv:

- it's portable across OSes (and keeps flags if the OS supports it, deals with encoding, etc.)

- it ensures selinux context is saved if there is such a thing

- it has proper and rich error communication with the calling code

- it includes documentation and comments

- it outputs JSON for parsing and storage

Not to say "mv" is not awesome, because it is. There is much more boilerplate in Python, which is why I'll often do subprocess.check_call(['mv', 'src', 'dst']) if my script is Linux only.

But you are pushing it

overtomanu · on May 7, 2021

I think it's not a fair comparison. The mv command's implementation in C might have more lines of code. Maybe we should complain that there are no OOTB library functions in Python to move the file.

Spivak · on May 7, 2021

I don't really care how many lines of code there are in an implementation. I care how many lines of code I actually have to write.

Python has shutil.move and os.rename but the Ansible example is to illustrate that there's a lot of code that needs to surround those calls to make them useful and they're not 1-1.

FranchuFranchu · on May 7, 2021

Or xonsh

judofyr · on May 7, 2021

It also means that accidentally adding a space somewhere (“$HOME/go” -> “$HOME /go”) can have catastrophic effect. I wouldn’t dare write a single “rm” if I’m not 100% sure the argument is being quoted.
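One defensive pattern for that kind of `rm`, sketched in bash under the assumption of a throwaway sandbox; `target_home` is a hypothetical stand-in for `$HOME` so the demo cannot touch real files.

```bash
#!/usr/bin/env bash
# Work in a temporary directory so nothing outside it is affected.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/home/go" "$sandbox/home/keep"

target_home="$sandbox/home"   # hypothetical stand-in for $HOME

# The whole path is one quoted word, and ${var:?} aborts the script if the
# variable is unset or empty, so this cannot collapse to a bare "/go".
rm -rf -- "${target_home:?}/go"

remaining=$(ls "$sandbox/home")
echo "remaining: $remaining"

rm -rf -- "${sandbox:?}"      # clean up the sandbox itself
```

With the argument quoted, a stray space inside the quotes merely names a path that does not exist, instead of handing `rm` two separate targets.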

mirekrusin · on May 7, 2021

It's called shell microservice, you don't need b/e developers anymore.