The Terminal Escape Sequences Ocean is Deep and Dark: Debugging a Virtual Terminal
01-16-2023 5:00PM (ET) 01-17-2023 9:12PM (ET) (edited)
A few months ago I got to debug and fix an interesting bug in the C# virtual terminal library VtNetCore.
We use VtNetCore at BastionZero to extract terminal commands in real time from user terminal sessions as they flow over the wire. This requires feeding the standard out and standard error of a terminal session into VtNetCore, then detecting when a user has entered a command and finding that command in VtNetCore's rendered screen buffer. We started noticing that sometimes commands would stop being extracted from a session, and that after this the command extraction thread would start taking increasingly long times to complete.
I started off trying to replicate this behavior in a dev instance of our service. This was the hardest part of the debugging; it involved a lot of guesswork. I was eventually able to replicate the issue by running tmux and then running top in one of the tmux panes. Once I could replicate the behavior I recorded the terminal session into an asciinema file using BastionZero's session recording feature. I then wrote a unit test which replayed the asciinema file into VtNetCore. This was the biggest "aha!" moment because it meant I could do a repeatable clean-room replication of the bug.
Unfortunately the recorded terminal session was far too large to read by hand and look for issues. To find the offending chunk of the session recording, I wrote an automated test which would repeatedly rerun the session recording against VtNetCore, but on each rerun would drop one more byte from the front of the asciinema session recording and then check whether the bug was replicated. Once I found the offset at which the bug stopped appearing, I knew exactly which bytes in the session recording triggered the bug.
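The front-trimming search described above can be sketched in a few lines. This is a minimal illustration in Python, not the actual test: `first_clean_offset`, `toy_reproduces`, and the sample `recording` are all hypothetical names and data, and the predicate is a stand-in for "replay the bytes into the terminal and check whether it hangs".

```python
from typing import Callable


def first_clean_offset(recording: bytes, reproduces: Callable[[bytes], bool]) -> int:
    """Drop one more byte from the front on each rerun.

    Returns the first offset at which the bug no longer reproduces,
    or -1 if it reproduces for every suffix.
    """
    for offset in range(len(recording) + 1):
        if not reproduces(recording[offset:]):
            return offset
    return -1


def toy_reproduces(data: bytes) -> bool:
    # Hypothetical stand-in for replaying `data` and detecting the hang.
    return b"\x1b]112\x07" in data


# Made-up recording: some shell output, the trigger, then more output.
recording = b"$ tmux\r\n" + b"\x1b]112\x07" + b"top - 12:00:00 up 1 day\r\n"

offset = first_clean_offset(recording, toy_reproduces)
trigger_start = offset - 1  # last offset at which the bug still reproduced
print(trigger_start, recording[trigger_start:trigger_start + 5])
```

The byte just before the first clean offset is where the trigger begins, since removing that one byte is what made the bug disappear.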
It was: '\u001b', ']', '1', '1', '2', ...
A little bit of googling and I discovered that this is the terminal escape sequence for OSC-112, "reset text cursor color".
Terminal escape sequences are ways that the server in a terminal session can signal special instructions to the terminal running on the client. They can let a server change the color of the text, move the cursor to a different position, rewrite what is shown in the title bar of the terminal, etc. Typically they start with the byte \u001b, called the ASCII ESC character. ESC is ASCII character 27 (0x1B). It is used primarily for signaling that a terminal should treat the next few bytes as an escape sequence.
The terminal escape sequences ocean is deep and dark. There are a great many kinds of escape sequences. I have yet to find even a complete listing of all of them. No terminal supports all of them. They have just been layered on decade after decade after decade. For instance, OSC-1337 is used to copy files to a remote server, and DECALN, which is triggered by the byte sequence ESC, #, 8, will print a screen alignment test on your terminal (https://vt100.net/docs/vt510-rm/DECALN.html). In this case OSC-112 is an OSC (Operating System Command) terminal escape sequence.
OSC terminal sequences are documented in xterm as having the following format:
OSC Ps ; Pt BEL
Ps A single (usually optional) numeric parameter, composed of one or more digits.
Pt A text parameter composed of printable characters.
If no parameters are given, this control has no effect.
Where OSC is the escape sequence ESC ], a.k.a. \u001b].
ECMA-48 gives the following description of OSC:
OSC - OPERATING SYSTEM COMMAND
Notation: (C1)
Representation: 09/13 or ESC 05/13
OSC is used as the opening delimiter of a control string for operating system use. The command string
following may consist of a sequence of bit combinations in the range 00/08 to 00/13 and 02/00 to 07/14.
The control string is closed by the terminating delimiter STRING TERMINATOR (ST). The
interpretation of the command string depends on the relevant operating system.
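ECMA-48 writes byte values in column/row notation, where xx/yy means column xx and row yy of the code table, so the byte value is 16 * xx + yy. The short sketch below converts the values quoted above; `ecma48_code` is a helper name made up for this illustration.

```python
def ecma48_code(notation: str) -> int:
    """Convert ECMA-48 column/row notation such as '09/13' to a byte value."""
    column, row = (int(part) for part in notation.split("/"))
    return column * 16 + row


c1_osc = ecma48_code("09/13")           # single-byte (C1) form of OSC
esc_second_byte = ecma48_code("05/13")  # byte that follows ESC in the 7-bit form

print(hex(c1_osc), chr(esc_second_byte))

# Bytes that ECMA-48 allows inside the command string.
allowed = set(range(ecma48_code("00/08"), ecma48_code("00/13") + 1))
allowed |= set(range(ecma48_code("02/00"), ecma48_code("07/14") + 1))

print(0x07 in allowed)  # is BEL a legal command-string byte?
```

So 09/13 is 0x9D and ESC 05/13 is ESC followed by ]. BEL (0x07) falls outside both permitted ranges, which means it cannot appear as content of the command string; it only shows up as the terminator in the xterm format quoted earlier.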
OSC 112 – reset cursor color: resets the color of the text cursor. OSC-112 does not have a Pt (text parameter) and thus can be specified as either:
* ESC ]112 BELL (\u001b]112\u0007)
or
* ESC ]112; BELL (\u001b]112;\u0007)
Tmux uses the version without the ;. For instance, in tmux/tty-features.c tmux defines OSC-112 as:
/* Terminal supports cursor colours. */
static const char *const tty_feature_ccolour_capabilities[] = {
	"Cs=\\E]12;%p1%s\\a",
	"Cr=\\E]112\\a",
	NULL
};
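To make the two spellings concrete, here is a small Python sketch that builds both byte strings. `build_osc` is a hypothetical helper written for this post, not part of tmux or VtNetCore. In the tmux capability string, \E stands for ESC and \a for BEL, so Cr corresponds to the first form.

```python
from typing import Optional

ESC = b"\x1b"  # ASCII 27
BEL = b"\x07"  # ASCII 7


def build_osc(ps: int, pt: Optional[str] = None) -> bytes:
    """Build ESC ] Ps [; Pt] BEL. Passing pt="" emits the bare ';' form."""
    body = str(ps) if pt is None else f"{ps};{pt}"
    return ESC + b"]" + body.encode("ascii") + BEL


without_semicolon = build_osc(112)   # the form tmux emits for Cr
with_semicolon = build_osc(112, "")  # the alternative form

print(without_semicolon, with_semicolon)
```

A parser that wants to handle OSC-112 has to accept both byte strings.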
To find the root cause I hooked a debugger up to my replication test and stepped through it line by line. After an hour of investigation I discovered that the root cause was that VtNetCore had a parsing bug in how it handled the terminal control sequence OSC-112 (an Operating System Command sequence). The OSC parser (ConsumeOSC) in VtNetCore assumes that OSC control sequences always match the pattern:
\u001b + ] + <numeric parameters> + <command> + \u0007
This assumption is incorrect because the OSC-112 control sequence does not always have a letter after the numeric parameter 112. This means that VtNetCore's virtual terminal misses the \u0007 BELL character that should end the control sequence and assumes that all following text is actually part of the control sequence. Since \u0007 (0x07, the BELL character) is very uncommon, it will reread all the input it has seen on each new value sent to the data consumer, waiting for the control sequence to end.
For instance, if the unprocessed terminal output was \u001b, ], 1, 1, 2, \u0007, A, it would read until it got an error because there were no bytes after A. It would then assume the full escape sequence is not in the terminal yet and move \u001b, ], 1, 1, 2, \u0007, A back to the unprocessed buffer. When it gets another character, say B, it would append it to the unprocessed buffer, read \u001b, ], 1, 1, 2, \u0007, A, B, run out of data to process, and move \u001b, ], 1, 1, 2, \u0007, A, B back to the unprocessed buffer. It will never make progress and the unprocessed buffer will grow in size with each new value. Each time new output comes in, it will linearly read across the growing buffer. It will treat any string of any length as an "unfinished escape sequence" until it finds a letter followed by the \u0007 BELL character. For the truly curious I provide a detailed step-through of what is happening internally in VtNetCore at the end of this blog entry.
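The cost of this behavior is easy to see in a toy model. The Python below is not VtNetCore's code; it is a simplified re-creation of the behavior described above, with made-up names (`buggy_consume_osc`, `replay`), and it assumes the buffer always starts with ESC ]. The first non-digit after the numeric parameter is taken as the start of the command, even when that character is the BEL itself.

```python
BEL = "\x07"


def buggy_consume_osc(buf: str) -> int:
    """Toy parser assuming ESC ] <digits> <command> BEL.

    Returns the number of characters consumed, or raises IndexError
    when the buffer ends before the sequence is considered complete.
    """
    i = 2  # skip ESC ]
    reading_command = False
    while True:
        ch = buf[i]  # IndexError here means "need more input"
        i += 1
        if not reading_command:
            if ch.isdigit():
                continue
            reading_command = True  # bug: a bare BEL lands here and is swallowed
        elif ch == BEL:
            return i


def replay(chunks):
    """Push chunks one at a time; return (leftover buffer size, chars scanned)."""
    buffer = ""
    chars_scanned = 0
    for chunk in chunks:
        buffer += chunk
        try:
            consumed = buggy_consume_osc(buffer)
        except IndexError:
            chars_scanned += len(buffer)  # whole buffer read, then rewound
            continue
        chars_scanned += consumed
        buffer = buffer[consumed:]
    return len(buffer), chars_scanned


print(replay(["\x1b]0;title\x07"]))            # sequence with a command: consumed
print(replay(["\x1b]112\x07"] + ["x"] * 100))  # bare OSC-112, then 100 more pushes
```

In this model the second call never consumes anything: 100 one-character pushes leave a 106-character buffer and cost 5,656 character reads in total, growing quadratically with the number of pushes.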
This is a big problem. Tmux will generate this control sequence on startup and thus bring down any VtNetCore virtual terminal that is reading tmux output. For instance, in one tmux session I recorded, the InputBuffer was 60KB and was being completely reread on each new DataConsumer.Push. It would take 7 seconds for each read of the buffer to complete and the virtual terminal would never make progress.
The irony of the solution being a literal byte intended to ring an actual, electro-mechanical bell on an old teletype machine was not lost on me.
To fix this issue I added code to end an OSC escape sequence when it detects the BELL character \u0007, even if no letters have been encountered. I then put together and submitted a PR to the VtNetCore project. This PR included my fix and unit tests to replicate the issue and show that my fix resolved the bug.
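The idea behind the fix can be sketched as follows. This is an illustrative Python version under simplifying assumptions, not the C# code from the PR: `consume_osc_fixed` is a made-up name, the buffer is assumed to start with ESC ], and only BEL is treated as a terminator.

```python
from typing import Tuple

BEL = "\x07"


def consume_osc_fixed(buf: str) -> Tuple[str, str, int]:
    """Parse ESC ] <digits> [<rest>] BEL from the start of buf.

    Returns (numeric parameter, remaining text before BEL, chars consumed).
    Raises IndexError if buf ends before a BEL is seen.
    """
    i = 2  # skip ESC ]
    digits = ""
    while buf[i].isdigit():
        digits += buf[i]
        i += 1
    rest = ""
    while buf[i] != BEL:  # BEL ends the sequence even when rest is empty
        rest += buf[i]
        i += 1
    return digits, rest, i + 1


print(consume_osc_fixed("\x1b]112\x07ABC"))   # bare form: stops at the BEL
print(consume_osc_fixed("\x1b]112;\x07"))     # form with the trailing ';'
print(consume_osc_fixed("\x1b]0;title\x07"))  # sequence with a text parameter
```

With the BEL check applied at every position, the bare OSC-112 consumes exactly 6 characters and the trailing ABC is left for normal processing.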
Darrenstarr, the maintainer of VtNetCore, merged my PR with kind words. I have never seen a maintainer react to a bug report with this much grace and encouragement. It made me more likely to submit PRs to open source software in the future.
For the truly curious, I quote from my PR and provide a step-by-step description of what happens internally in VtNetCore when the virtual terminal attempts to process [\u001b, ], 1, 1, 2, \u0007, A, B, C]:
- The InputBuffer is empty, contains 0 elements, position is 0, remainder is 0.
- \u001b]112\u0007ABC is pushed to VtNetCore's DataConsumer, i.e., DataConsumer.Push("\u001b]112\u0007ABC"). Push adds this to the InputBuffer.
- The InputBuffer is [\u001b, ], 1, 1, 2, \u0007, A, B, C], contains 9 elements, position is 0, remainder is 9,
- VtNetCore reads \u001b and determines it is entering an escape sequence,
- VtNetCore reads ] and determines the escape sequence is an OSC sequence and uses ConsumeOSC to parse the OSC sequence,
- ConsumeOSC reads 1, 1, 2 as numeric parameters,
- ConsumeOSC reads the bell character \u0007 and sets readingCommand = true;
- ConsumeOSC reads A assuming it is part of the command,
- ConsumeOSC reads B assuming it is part of the command,
- ConsumeOSC reads C assuming it is part of the command,
- When it runs to the end of the InputBuffer it throws IndexOutOfRangeException,
- This exception is handled in the DataConsumer L:102,
- The InputBuffer is [\u001b, ], 1, 1, 2, \u0007, A, B, C], contains 9 elements, position is 9, remainder is 0,
- The code handling the IndexOutOfRangeException in DataConsumer calls InputBuffer.PopAllStates(),
- The InputBuffer is [\u001b, ], 1, 1, 2, \u0007, A, B, C], contains 9 elements, position is 0, remainder is 9,
- DataConsumer.Push("\u001b]112\u0007ABC") returns,
- The user then calls DataConsumer.Push("DEFG"),
- Push adds this to the InputBuffer. The InputBuffer is [\u001b, ], 1, 1, 2, \u0007, A, B, C, D, E, F, G], contains 13 elements, position is 0, remainder is 13,
- The previous steps repeat with ConsumeOSC now believing ABCDEFG is the beginning of the OSC command,
- As before, when it runs to the end of the InputBuffer it throws IndexOutOfRangeException and resets the position,
- The InputBuffer is [\u001b, ], 1, 1, 2, \u0007, A, B, C, D, E, F, G], contains 13 elements, position is 0, remainder is 13,
- This continues with VtNetCore scanning over the ever-growing input buffer on every Push but never making any progress.