How to Lose Control of your Shell

2024-03-09 10:09:32

A few weeks ago I was hacking on language server support in Zed, trying to get Zed to detect when a given language server binary, such as gopls, is already present in $PATH. If so, it should use that instead of downloading a new binary.

The problem: $PATH is often dynamically modified by tools such as direnv, asdf, mise and others, which allow you to set a specific $PATH in a given folder. (Why do these tools do that? Because it gives you the ability to, say, prepend ./my_custom_binaries to $PATH when you’re in my-cool-project.) So we can’t just use the $PATH associated with the Zed process, we need the $PATH as it is when you cd into your project directory.

Easy, I thought. Just launch a $SHELL, cd into the project to trigger direnv and whathaveyou, run env, store the environment, pick out $PATH, find binaries in there.

And easy it was. Here’s some of the code, the part that launches $SHELL, cds and gets the env:

fn load_shell_environment(dir: &Path) -> Result<HashMap<String, String>> {
    // Get the $SHELL
    let shell = std::env::var("SHELL")?;

    // Construct the command we want the $SHELL to execute
    let command = format!("cd {:?}; /usr/bin/env -0;", dir);

    // Launch the $SHELL as an interactive shell (so the user's rc files are used)
    // and execute `command`:
    let output = std::process::Command::new(&shell)
        .args(["-i", "-c", &command])
        .output()?;

    // [... check exit code, get stdout, turn stdout into HashMap, etc. ...]
}
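
The elided step at the end, turning the output of `env -0` into a map and then looking for a binary in its $PATH, might look roughly like the following. This is a minimal sketch, not Zed's actual code: `parse_env_output` and `find_in_path` are hypothetical names, and the sample bytes in `main` are made up for illustration.

```rust
use std::collections::HashMap;
use std::env;
use std::path::PathBuf;

/// Parse the NUL-separated `KEY=VALUE` records printed by `env -0`.
/// (Hypothetical helper, not taken from the Zed codebase.)
fn parse_env_output(stdout: &[u8]) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for record in stdout.split(|byte| *byte == 0) {
        // Lossy conversion: invalid UTF-8 becomes U+FFFD instead of failing.
        let record = String::from_utf8_lossy(record);
        // Split on the first `=` only; values may contain `=` themselves.
        if let Some((key, value)) = record.split_once('=') {
            vars.insert(key.to_string(), value.to_string());
        }
    }
    vars
}

/// Return the first regular file called `binary` in the `PATH` of `vars`.
/// A real lookup would also check the executable bit.
fn find_in_path(vars: &HashMap<String, String>, binary: &str) -> Option<PathBuf> {
    let path = vars.get("PATH")?;
    env::split_paths(path)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}

fn main() {
    // Made-up sample of what `env -0` could print.
    let stdout = b"PATH=/usr/local/bin:/usr/bin:/bin\0EDITOR=zed\0";
    let vars = parse_env_output(stdout);
    println!("PATH = {:?}", vars.get("PATH"));
    println!("gopls = {:?}", find_in_path(&vars, "gopls"));
}
```

Splitting on NUL rather than on newlines is the reason for passing `-0` to env: values may contain newlines, but never a NUL byte.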

Except for one thing: after starting a Zed instance in my terminal that executed this function, I could no longer kill Zed by hitting Ctrl-C.

What?

I could spam the terminal with ^C and nothing happened. Lines and lines of desperate ^Cs that never hear their own echo.

How? Why? … What?

After saying “What?” 20 times and hitting Ctrl-c even more, I asked Piotr for help, because I wasn’t 100% confident in how Rust spawns processes and he’s a Rust wizard. What I did know was that there have to be fork and exec syscalls somewhere inside std::process::Command but I wasn’t sure whether Rust does something clever with the signal handlers or has default signal handlers set up that mess with Ctrl-c. Because Ctrl-c should result in an interrupt signal being sent to the process, which should cause it to terminate, but clearly that stopped working.

We started to poke at all kinds of things to test all kinds of hypotheses, as outlandish as they might be.

Are we sure that the shell is not running anymore? Yes, we are, because .output() up there only returns once the command has finished running.

Is this about cd? Do direnv or asdf or other tools fire some hooks that take control of the terminal? No, turns out when we just run /usr/bin/env -0; without cd it also takes control over the shell.

So is it the -0 that we pass to env? It shouldn’t be, obviously, because that’s just formatting. But: desperate times breed desperate debugging attempts. So we tried it and it wasn’t -0 either.

Wait, is it env? Does it do something weird with my terminal? Huh.

So we modified the command from

let command = format!("/usr/bin/env;");

to

let command = format!("echo lol");

… and guess what? Ctrl-c worked again.

What?

Okay, another attempt. What if we do both?

let command = format!("/usr/bin/env; echo lol");

That also worked. WHAT!

Okay, wait a second… my gut is telling me something. /usr/bin/env isn’t a shell built-in, is it? But echo is. Is that a clue?

Let’s try this one:

let command = format!("ls");

Good old ls. Probably the command I’ve run the most in my life. It’s always there when I need it and on every machine I gain access to I immediately run ls just to see that it works. I’d trust ls with my life.

And yet: after running ls in that subshell, Ctrl-c stopped working. Et tu, ls?

Next hypothesis: is it something in Zed? Do we set up some signal handlers? Let’s find out. We copied the function to a new, bare-bones Rust project, ran it and… it reproduced. Ctrl-c stopped working in that project too.

Okay, is it Rust then? I rewrote the function in Go and in Go too Ctrl-c lost control.

At this point we had spent nearly 2 hours on this and couldn’t figure it out. But we did have a workaround:

let command = format!("/usr/bin/env; exit 0;");

exit is a built-in in all the different shells, so it’s safe to run and it fixes the problem. Okay, fair enough. We slapped one hell of a comment above that line to let the next person to come along know that the exit 0 is now load-bearing and moved on.

But this puzzle got to me. I asked fellow shell-nerds whether they know what’s going on but nobody had an answer ready. So in my mornings I started to investigate.

I set up a repository in which a small Rust program reproduced the problem: it spawns a shell process, waits for it to exit, then idles for 5 seconds so I can test whether Ctrl-c still works. The hunt was on.
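
A reproduction program along those lines could be sketched as follows. This is not the code from that repository: `interactive_shell` and `run_repro` are hypothetical names, the fallback to /bin/sh is an assumption, and it uses `.status()` rather than `.output()` so the shell's output shows up directly in the terminal.

```rust
use std::env;
use std::io;
use std::process::{Command, ExitStatus};
use std::thread;
use std::time::Duration;

/// Build `<shell> -i -c <script>`, the same shape as the call in the post.
fn interactive_shell(shell: &str, script: &str) -> Command {
    let mut cmd = Command::new(shell);
    cmd.args(["-i", "-c", script]);
    cmd
}

/// Run the script in the shell, wait for it, then idle so Ctrl-C can be tried.
fn run_repro(shell: &str, script: &str, idle: Duration) -> io::Result<ExitStatus> {
    // `status()` inherits stdin/stdout/stderr, so the shell sees our terminal.
    let status = interactive_shell(shell, script).status()?;
    println!("shell exited with status: {status}");
    println!("idling for {idle:?}; try Ctrl-C now");
    thread::sleep(idle);
    Ok(status)
}

fn main() -> io::Result<()> {
    // Assumption: fall back to /bin/sh when $SHELL is not set.
    let shell = env::var("SHELL").unwrap_or_else(|_| "/bin/sh".to_string());
    run_repro(&shell, "ls", Duration::from_secs(5))?;
    println!("survived the idle period without being interrupted");
    Ok(())
}
```

Whether Ctrl-C is lost during the idle period depends on the shell and on the script passed to it; swapping "ls" for "ls; exit 0" is the quickest way to compare the two cases described above.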

The first big light bulb moment came when I realized that I don’t have to send a signal via Ctrl-c: I can use the kill command. And, alas, it’s not the signal handling that’s borked! When I used kill -INT the signal arrived and the process stopped. It’s not that my process doesn’t react to signals anymore, but rather that Ctrl-c doesn’t send the right signals after launching the shell process.

Next attempt: is the terminal state borked after launching the shell? Okay, so something about the terminal state. Somebody in the tweet replies did point me to stty, which lets you set options on your terminal device, such as the baud rate (yes) and other things. I modified my program to run stty -a before and after the shell process. No luck: no changes in the output.

Desperate, I also used Ghostty’s terminal inspector to see whether some state changes in the terminal that results in Ctrl-c going up in smoke. But no luck there either.

After days of going back and forth on this with ChatGPT (which I wrote about the last time) it finally gave me a clue:

The spawned shell inherits the terminal (TTY) control, and since it’s an interactive shell (-i flag), it sets itself as the foreground process group leader for the terminal. This changes how signals, especially SIGINT generated by Ctrl-C, are handled.

Huh. Foreground process group leader. Interesting. Hmmm. Here’s what Advanced Programming in the Unix Environment (APUE), which I pulled out today while writing this, says on process groups:

A process group is a collection of one or more processes, usually associated with the same job (job control is discussed in Section 9.8), that can receive signals from the same terminal. Each process group has a unique process group ID. Process group IDs are similar to process IDs: they are positive integers and can be stored in a pid_t data type. The function getpgrp returns the process group ID of the calling process.

The important part: “that can receive signals from the same terminal.”

It’s been a while since I last looked up something in a physical book. Pictured: Advanced Programming in the Unix Environment. A fantastic book.

APUE has more clues:

It’s possible for a process group leader to create a process group, create processes in the group, and then terminate.

So is that what happens? The shell spawns, claims it’s the process group leader when it doesn’t run a built-in command, exits, and then doesn’t restore the previous process group leader?

It felt like I was getting closer. So I kept asking ChatGPT to confirm this and it led me to tcgetpgrp:

The function tcgetpgrp() returns the process group ID of the foreground process group on the terminal associated to fd, which must be the controlling terminal of the calling process.

Okay, now we’re talking, this sounds like it could lead us somewhere. I asked ChatGPT to generate me some Rust code for that tcgetpgrp call:

use std::io;

fn get_process_group_id(fd: i32) -> io::Result<libc::pid_t> {
    // tcgetpgrp returns -1 on failure and sets errno
    let pgid = unsafe { libc::tcgetpgrp(fd) };
    if pgid == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(pgid)
    }
}

I plugged that into my program so it would print the process group ID associated with STDIN (file descriptor 0) before and after the $SHELL process has run. This is what it printed:

process group before: 54530
shell exited with status: exit status: 0
process group after: 54571

Well, hello there! This really looks like the murder weapon. How can I confirm that it is what kills my Ctrl-c though? Is there a way I could stop the shell from taking over as process group leader? ChatGPT said that I could use the pre_exec hook on std::process::Command to put the shell process in a new, separate process session, which will put it in a new process group, which in turn means it won’t be able to become the process group leader of the group associated with STDIN. Like this:

use std::os::unix::process::CommandExt; // provides `pre_exec`

// `mut` is required: `args` and `pre_exec` take `&mut self`
let mut cmd = std::process::Command::new("/bin/zsh");
cmd.args(["-i", "-c", "/usr/bin/env"]);

// Set a hook that will be executed right after `fork`, but before `exec`:
unsafe {
    cmd.pre_exec(|| {
        // setsid returns -1 on failure and sets errno
        if libc::setsid() == -1 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    });
}

// Run the command
let output = cmd.output().unwrap();

Right there, in the middle: setsid. That’s called right after we create a new process with fork but before that process is turned into $SHELL.

APUE on what happens when a process calls setsid:

  1. The process becomes the session leader of this new session. […]

  2. The process becomes the process group leader of a new process group. […]

  3. The process has no controlling terminal. […] If the process had a controlling terminal before calling setsid, that association is broken.

That makes sense. Calling setsid would break any association the newly-spawned shell process has with the terminal, and that could help me confirm whether the shell mucking with the process group leader is the problem.

And (boom! fireworks! loud noises! a small child saying: “ta-da!”) with the pre_exec hook this is what the program printed:

process group before: 54530
shell exited with status: exit status: 0
process group after: 54530

And Ctrl-C still worked!

The foreground process group ID is the murder weapon. At this point it was clear what happens: the shell that’s spawned takes control of the terminal, by setting the foreground process group ID, which means the signal resulting from Ctrl-C is sent to the shell process. But if the shell runs a non-built-in command as its last command, it doesn’t clean up after itself and its process ID stays associated with the terminal, leading to all of our Ctrl-Cs ending up in the void.

With the What? answered, the next question is: why?

Why does ZSH (the shell with which this happened for me) not reset the foreground process group leader when it runs a non-built-in command?

On my Linux machine I ran strace -f to see which syscalls my process and, more importantly, its child processes (including the spawned shell) were making. What I could figure out was this:

When zsh is run with -c and the last command in that passed command is a non-built-in, such as ls or env, then ZSH execves into that last process. Meaning: it doesn’t create a child process to run ls. No, instead it turns itself into that command. That means at the point in time when ls is run in zsh -c 'echo lol; ls' the zsh process is gone and turned into ls and there’s nobody left to reset the foreground process group leader.

But when you run zsh -c '/usr/bin/env; echo lol', i.e.: first non-built-in, then built-in, then ZSH doesn’t disappear. It forks and execs /usr/bin/env and then executes the echo lol and, somewhere in there, cleans up the foreground process group leader.
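
One way to observe the difference without strace is to let the shell run ps against its own PID: `$$` is expanded by the shell before the command runs, so if the shell has replaced itself with ps, ps reports its own name for that PID. The following is a rough sketch of that probe, not something from the original investigation. The helper names are hypothetical, it assumes a ps that supports `-o comm=` and `-p`, it runs the shell without -i, and the result will vary between shells and versions.

```rust
use std::env;
use std::io;
use std::process::Command;

/// Probe command: print the name of the process that owns the shell's PID.
const PROBE: &str = "ps -o comm= -p $$";

/// True if the probe reported `ps` itself, i.e. the shell replaced itself
/// with `ps` instead of forking a child for it.
fn shell_was_replaced(reported_name: &str) -> bool {
    let name = reported_name.trim();
    name == "ps" || name.ends_with("/ps")
}

/// Run `script` with `<shell> -c` and return the first line of its output.
fn first_output_line(shell: &str, script: &str) -> io::Result<String> {
    let output = Command::new(shell).args(["-c", script]).output()?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    Ok(stdout.lines().next().unwrap_or("").to_string())
}

fn main() -> io::Result<()> {
    // Assumption: fall back to /bin/sh when $SHELL is not set.
    let shell = env::var("SHELL").unwrap_or_else(|_| "/bin/sh".to_string());
    // First script ends with the external command, second with a built-in.
    let scripts = [PROBE.to_string(), format!("{}; echo lol", PROBE)];
    for script in &scripts {
        let name = first_output_line(&shell, script)?;
        println!(
            "{shell} -c '{script}': PID $$ belongs to {name:?}, shell replaced: {}",
            shell_was_replaced(&name)
        );
    }
    Ok(())
}
```

If the shell behaves as described above, the first script should report ps and the second should report the shell itself.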

Now, listen. I wish I could continue here and end with “… and this is why ZSH does it that way!” and someone would finally PayPal me $100 with the message “thanks for your newsletter”, but I have to disappoint you.

I don’t know how and why exactly ZSH does what it does. I cloned the repo, I compiled it, I tried to run it from source, but somehow failed and man cmake is a lot and also the folders have names like Src and Doc and who the hell capitalizes the first letter in a folder name and there’s also a ./configure you have to run and then you need to make sure that it doesn’t use your system library and… You see this shell investigation stuff isn’t easy and I gave up, sorry.

What I did find though is that ZSH does actively set the process group id for job control. And it also remembers the original one and resets it. But I gave up when I saw this part here that does job control stuff in ZSH and realized that I’m not getting paid for this.

I await your letters with the explanation.
