The thing that surprised me was that you can't write an interpreter in an interpreted language, at least not on OpenBSD. It is possible if you jump through a few hoops, but you can't call it directly.
An example, if you made a language in Python:
/bin/my_lang (does nothing but pretend it does):

    #!/usr/local/bin/python3
    import sys
    print('my_lang args', sys.argv)
    for line in sys.stdin:
        print('invalid_line:', line)

my_script:

    #!/bin/my_lang
    line of stuff
    another line of stuff

Then:

    chmod u+x my_script
    ./my_script

Probably for the best, but I was a bit sad that my recursive interpreter scheme was not going to work.
Update: it looks like Linux does allow nested interpreters, good for them.
https://www.in-ulm.de/~mascheck/various/shebang/#interpreter...
Really, that whole document is a delightful read.
Worked for me, but the way you described it has issues:
1. You chmod my_script twice.
2. Did you chmod u+x /bin/my_lang too? Since you put it in /bin, are you sure the owner isn't root? If it is, your user wouldn't have execute permission. Try +x instead of u+x.
3. Do you have python in that path? Try `/usr/bin/env python` instead.
4. In case you expected otherwise, my_script wouldn't be passed through stdin. It's just provided as an argument to my_lang.
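To make point 4 concrete, here is a minimal sketch of the same pretend interpreter reading the script from its first argument instead of stdin. The helper name `run_script` is made up for illustration; the rest mirrors the original example.

```python
import sys

def run_script(path):
    """Pretend-interpret the script at `path`, returning one message per line."""
    messages = []
    with open(path) as script:
        for number, line in enumerate(script):
            # The first line is the shebang that got us here; skip it.
            if number == 0 and line.startswith("#!"):
                continue
            messages.append("invalid_line: " + line.rstrip("\n"))
    return messages

if __name__ == "__main__" and len(sys.argv) > 1:
    # With a shebang, the script path arrives as an argument, not on stdin.
    print("my_lang args", sys.argv)
    for message in run_script(sys.argv[1]):
        print(message)
```

Stdin stays connected to whatever the caller had, so the script can still read user input if it wants to.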
I am on OpenBSD, which does not allow it. It looks like nested interpreters are supported on Linux, so my loss there.
Wait... your interpreter reads from stdin. Shouldn't it read its first arg, instead?
I think that was what I was trying to figure out: how the program was passed. But OpenBSD does not do nested interpreters. It looks like if I had tried Linux, it would have worked.
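For anyone else wondering how the program gets passed when interpreters nest, here is a rough model of the argv rewriting as I understand it on Linux: each shebang level prepends its interpreter and keeps the script path as the next argument. The function names, the `first_lines` dict (a stand-in for reading each file's first line), and `max_depth` are all assumptions of this sketch, and it ignores the optional argument a shebang line can carry.

```python
def interpreter_of(first_line):
    """Return the interpreter named on a '#!' line, or None for a non-script."""
    if not first_line.startswith("#!"):
        return None
    fields = first_line[2:].split(None, 1)
    return fields[0] if fields else None

def expand_argv(argv, first_lines, max_depth=4):
    """Model how nested shebangs rewrite argv.

    first_lines maps a path to that file's first line (hypothetical stand-in
    for opening the file). max_depth is an arbitrary guard for this sketch,
    not a documented kernel limit.
    """
    path = argv[0]
    for _ in range(max_depth):
        interp = interpreter_of(first_lines.get(path, ""))
        if interp is None:
            return argv  # reached something that is not a script
        # The interpreter goes first; the script path replaces the old argv[0].
        argv = [interp, path] + argv[1:]
        path = interp
    raise RecursionError("too many nested interpreters")
```

With the files from the example, `expand_argv(["./my_script"], ...)` ends up as python3, then /bin/my_lang, then ./my_script, which is why my_lang sees the script as an argument.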
Huh. I wish I had known this before.
NixOS is annoying because everything is weird and symlinked and so I find myself fairly frequently making the mistake of writing `#!/bin/bash`, only to be told it can't find it, and I have to replace the path with `/run/current-system/sw/bin/bash`.
Or at least I thought I did; apparently I could have just done `#!bash`. I just tested this, and it worked fine. You learn something new every day, I guess.
Anything other than "#!/usr/bin/env bash" is doomed to fail at some time.
> Anything other than "#!/usr/bin/env bash" is doomed to fail at some time.
If you have /usr/bin/env, that is.
And is this shebang guaranteed to work always? Why isn't it more common?
Because /bin is the standard location for bash. The only one that breaks that expectation is NixOS (and maybe GuixSD?), apparently. I'm surprised they didn't symlink /bin or put a stub. Last time I tried NixOS was like 10 years ago. I thought there was a /bin/bash, but maybe it was just a /bin/sh?
Other interpreters like Python, Ruby, etc. are more likely to be used with "virtual environments", so it's more common to use /usr/bin/env with them.
That will not work on OpenBSD, where the shell that comes with the system is ksh (at /bin/ksh and /bin/sh). If you want bash, it is a third-party package and correspondingly gets installed as /usr/local/bin/bash.
It does get awkward, especially when porting: all your third-party libraries and includes are in /usr/local/lib and /usr/local/include. But at least it is better than FreeBSD, which also insists on putting all third-party configs under /usr/local/etc/.
They do symlink /bin/sh to be fair, and that's very often good enough for a lot of scripts. That's what I usually do if I don't need anything bash offers.
Thing is, a few years ago when Debian changed its default sh from bash to ... either ash or dash, I forget which, I got into the habit of always writing `#!/bin/bash` at the top of my scripts, in case I didn't realize that something I was using was a bashism not found in classic /bin/sh. So if I used Nix (I don't, since for my particular use cases the juice isn't worth the squeeze), I would get seriously messed up by that.
It's guaranteed to work provided that Bash is in the path.
It's very common for Python. Less so for Bash for two reasons: because the person who writes the script references /bin/sh instead (which is required to be there) even when they are writing bash-isms, or because the person who writes the script assumes that Bash is universally available as /bin/bash.
It’s quite common, although I probably see it used more frequently to invoke other (non-shell) scripting languages.
> apparently I can just have done `#!bash`
I think you're mixing two concepts: relative paths (which are allowed after #! but not very useful at all) and file lookup through $PATH (which is not done by the kernel; maybe it's some shell trickery).
You can use `#!/usr/bin/env bash` on NixOS
I didn't know that actually. I'll start using that from this point forward.
`#!/usr/bin/env bash` is the most portable form for executing it from $PATH
Is this meaningfully more portable than #!bash though?
In a sibling thread someone pointed out that #!bash doesn't actually work if you're calling it from bash, and appears to only work with zsh.
I just tried it and they were absolutely right, so `#!/usr/bin/env bash` is definitely more portable in that it consistently works.
This mechanism doesn't do a PATH lookup: #!bash would only work if bash was located in your current working directory.
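A small model of the difference, in case it helps: the first function resolves a shebang interpreter the way described above (relative to the current directory, no $PATH search), the second walks $PATH roughly the way `env` does for a bare name. Both function names are made up, and this sketch only checks that a file exists, not that it is executable.

```python
import os

def kernel_style(interp, cwd):
    """Resolve a shebang interpreter with no $PATH search, relative to cwd."""
    candidate = interp if os.path.isabs(interp) else os.path.join(cwd, interp)
    return candidate if os.path.isfile(candidate) else None

def env_style(name, path):
    """Resolve a bare name by walking $PATH entries, roughly what `env` does."""
    for directory in path.split(os.pathsep):
        # An empty $PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

So `kernel_style("bash", cwd)` only finds something if there is a file called bash sitting in `cwd`, while `env_style("bash", os.environ["PATH"])` finds the one your shell would run.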
Seems like it only works in zsh, not bash or fish
Is this UNIX?
This is NixOS, so no, it's Linux. I guess I just hoped it would work on Linux as well.
The kernel interprets the shebang line, not the shell.
It is possible for the shell to handle it. From zshall(1):
> If the program is a file beginning with '#!', the remainder of the first line specifies an interpreter for the program. The shell will execute the specified interpreter on operating systems that do not handle this executable format in the kernel.
Taking a quick look at the source in Src/exec.c:
I guess at some point someone added that `|| eno == ENOENT` and the docs weren't updated.
I did a little digging and found that the `|| eno == ENOENT` was added quite a bit earlier[1] than the actual pathprog lookup[2]. While I could find the "issue discussion" for the pathprog change[3], I wasn't able to find it for the ENOENT addition, which was kind of interesting and frustrating: [4] is the `X-Seq` mentioned in the commit, but that seems to be inconsistent or incorrect for the actual cross-reference, and looking nearby in time wasn't helpful either.
[1] https://sourceforge.net/p/zsh/code/ci/29ed6c7e3ab32da20f528a...
[2] https://sourceforge.net/p/zsh/code/ci/29ed6c7e3ab32da20f528a...
[3] https://www.zsh.org/mla/workers/2010/msg00522.html
[4] https://www.zsh.org/mla/workers/2000/msg01168.html
I'm not sure the reason then, but they're definitely right; it works fine with zsh, doesn't work with bash. I wrote a test script to try it myself.
I don't have fish installed and can't be bothered to go that far, but I suspect they're right about that as well.
It is strange; cursory digging for an explanation was a little more complex than I bargained for...
https://github.com/torvalds/linux/blob/v6.17/fs/binfmt_scrip...
I think it makes it to calling open_exec but there's a test for BINPRM_FLAGS_PATH_INACCESSIBLE, which doesn't seem relevant since 'bash' isn't like '/dev/fd/<fd>/..', but does provoke an ENOENT.
https://github.com/torvalds/linux/blob/v6.17/fs/exec.c#L1445
Maybe someone else can explain it. I'd enjoy the details, but I ran out of steam.
env bash is all well and good for normies, but if you're already on NixOS did you know you can have nix-shell be your interpreter and back flip into any reproducible interpreted environment you like?
https://nixos.wiki/wiki/Nix-shell_shebang
> Although this is probably the easiest way to implement '#!' inside the kernel, I'm a little bit surprised that it survived in Linux (in a completely independent implementation) and in OpenBSD (where the security people might have had a double-take at some point). But given Hyrum's Law there are probably people out there who are depending on this behavior so we're now stuck with it.
I don't see what there would be to gain in disallowing the program path on the shebang line to be relative. The person that wrote the shebang can also write relative paths in some other part of the file.
Or, like, if you aren't reading and caring about what the interpreter is--as that's the only time this can burn you: it isn't doing a PATH lookup, so you can't walk into this one on accident--then it could literally be something like /bin/rm on some key file. This entire article is based on an assumption that this is somehow so obviously bad that there doesn't even need to be an explanation or defense of any kind of that idea.
> This entire article is based on an assumption that this is somehow so obviously bad that there doesn't even need to be an explanation or defense of any kind of that idea.
I'm not reading it like that. The tone is just one of surprise, since this isn't something that one typically sees. Since it's obscure, it leads one to wonder if it can be bad, and I don't see how it could be.
I think it survived in the independent Linux because it's the simple and obvious way to do things, and it doesn't lead to any exceptional power of misuse one didn't already have with writing the rest of the file.
Right: I agree with you. I'm saying the article is making an unfounded assumption and am providing more reasoning for why you are correct.
I wonder what the reason was for having the kernel handle this, instead of the shell? To allow programs besides the shell to execute interpreted scripts as if they were actual binaries?
This is of course in stark contrast to dynamic linking, which is performed by a userspace program instead of the kernel, and much like the #!, this "interpreter"'s path is also hardcoded in dynamically linked binaries.
In a call `execve("/my/script", ...)`, of course the kernel has to figure out how to run it; there is no shell involved.
As for scripts vs ELF executables, there's not much of a difference between the shebang line and PT_INTERP, just that parsing shebang lines is simpler.
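To show how similar the two are, here is a sketch that pulls the PT_INTERP path out of an ELF image, the counterpart of reading the first line of a script. Assumptions: the function name `elf_interp` is made up, the field offsets are the standard ELF64 header layout, and the sketch only handles 64-bit little-endian files.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path

def elf_interp(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF":
        return None  # not ELF at all (a '#!' script, for instance)
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        # Program header: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz.
        p_type, _, p_offset, _, _, p_filesz = struct.unpack_from("<IIQQQQ", data, offset)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked, or no interpreter requested
```

On a typical dynamically linked binary, something like `elf_interp(open("/bin/ls", "rb").read())` should give the path of the dynamic loader, which is hardcoded in the file just like a shebang path.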
Hit the link expecting to read about UTF-8 Byte Order Marks at the top of the file, so that the first few bytes aren't actually #! but 0xEF 0xBB 0xBF #! instead. Ran into this one just a few months ago when a coworker who uses Windows had checked a Bash script into the Git repo. His editor was configured to save files as "UTF-8 with BOM" and so we were getting errors that looked like "./doit.sh: line 1: #!/bin/bash: No such file or directory". Can you see the invisible BOMb in that line? It's there, I promise you.
That's not what the article was actually about, as it turned out. The surprise in the article was about relative paths for script shebang lines. Which was useful to learn about, of course, but I was actually surprised by the surprise.
tbh it is lame for any program reading a text file to not support BOM. It's just one if.
There isn't really any one "text file" format, though: the kernel just checks whether the first two bytes match what "#!" corresponds to in ASCII.
https://www.youtube.com/watch?v=J8nblo6BawU is some great watching on how "Plain text isn't that simple"
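The two-byte check is easy to reproduce, which also makes it easy to see why the BOM breaks it. A minimal sketch, with made-up helper names, that could be adapted into a pre-commit check:

```python
import codecs

def has_shebang(head):
    """True if the data starts with the two ASCII bytes '#!'."""
    return head[:2] == b"#!"

def strip_bom(data):
    """Drop a leading UTF-8 byte order mark (0xEF 0xBB 0xBF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

A file saved as "UTF-8 with BOM" fails `has_shebang` until it goes through `strip_bom`, so it is not recognized as a script and the shell ends up reading the `#!` line as a command.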
There's no security issue here. Certainly the OP hasn't explained why there is one.
There is no security issue here. The file with the '#!' needs to be executable, and at that point it doesn't matter what it invokes because you made it executable. It could have shellcode in it or it could call python3 which can also execute shellcode. Or more likely, it would just be a malware binary which you deliberately gave permissions to and executed.