Learn x86-64 assembly by writing a GUI from scratch
Most people think assembly is only used to write toy programs for learning purposes, or to write a highly optimized version of a specific function inside a codebase written in a high-level language.
Well, what if we wrote a whole program in assembly that opens a GUI window? It will be the hello world of the GUI world, but that still counts. Here is what we are working towards:
I wanted to expand my knowledge of assembly by doing something fun and motivating. It all originated from the observation that so many program binaries today are very big, sometimes over 30 MiB (!), and I asked myself: how small can a binary be for a (very simplistic) GUI? Well, it turns out, very small. Spoiler alert: around 1 KiB!
I am by no means an expert in assembly or in X11. I just hope to provide an entertaining, approachable article, something a beginner can understand. Something I wished I had found when I was learning these topics. If you spot an error, please open a Github issue!
What do we need?
I will be using the nasm assembler which is simple, cross-platform, fast, and has quite a readable syntax.
For the GUI, I will be using X11 since I am based on Linux and it has some interesting properties that make it easy to do without external libraries. If you are running Wayland, it should work with XWayland out of the box, and perhaps also on macOS with XQuartz, but I have not tested these.
Note that the only difference between *nix operating systems in the context of this program is the system call values. Since I am based on Linux I will be using the Linux system call values, but ‘porting’ this program to, say, FreeBSD, would only require changing these values, possibly using the nasm macros:
%ifdef linux
%define SYSCALL_EXIT 60
%elifdef freebsd
%define SYSCALL_EXIT 1
%endif
%define and its variants are part of the macro system in nasm, which is powerful but we will only use it here to define constants, just like in C: #define FOO 3.
No need for additional tooling to cross-compile, issues with dynamic libraries, libc versions, etc. Just compile on Linux by defining the right variable on the command line, send the binary to your friend on FreeBSD, and it just works(tm). That’s refreshing.
So let’s dive in!
X11 basics
X11 is a server accessible over the network that handles windowing and rendering inside those windows. A client opens a socket, connects to the server, and sends commands in a specific format to open a window, draw shapes, text, etc. The server sends messages about errors or events to the client.
Most applications will want to use libX11 or libxcb which offer a C API, but we want to do that ourselves.
Where the server lives is actually not relevant for a client, it might run on the same machine or in a datacenter far far away. Of course, in the context of a desktop computer in 2023, it will be running on the same machine, but that’s a detail.
The official documentation is pretty good, so when in doubt we can refer to it.
Main in x64 assembly
Let’s start slow with a minimal program that simply exits with 0, and build from there.
First, we tell nasm we are writing a 64 bit program and that we target x86_64. Then, we need a main function, which we call _start and which needs to be visible since this is the entry point of our program (hence the global keyword):
; Comments start with a semicolon!
BITS 64 ; 64 bits.
CPU X64 ; Target the x86_64 family of CPUs.
section .text
global _start
_start:
xor rax, rax ; Set rax to 0. Not actually needed, it's just to avoid having an empty body.
section .text is telling nasm and the linker that what follows is code that should be placed in the text section of the executable.
We will soon have a section .data for our global variables.
Note that these sections usually get mapped by the OS to different pages in memory with different permissions (visible with readelf -l) so that the text section is not writable and the data section is not executable, but that varies from OS to OS.
The _start function has a body that does nothing for now, but not for long. The actual name of the main function is up to us, it’s just that start or _start is usual.
We build and run our little program like this:
$ nasm -f elf64 -g main.nasm && ld main.o -static -o main
nasm actually only produces an object file, so to get an executable out of it, we need to invoke the linker ld. The flag -g is telling nasm to produce debugging information which is immensely useful when writing raw assembly, since firing up the debugger is often our only recourse in the face of a bug.
To remove the debugging information, we can pass -s to the linker, for example when we are about to ship our program and want to save a few KiB.
We finally have an executable:
$ file ./main
main: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, with debug_info, not stripped
We can see the different sections with readelf -a ./main, and it tells us that the .text section, which contains our code, is only 3 bytes long.
Now, if we try to run our program, it will segfault. That’s because we are expected by the operating system to exit (using the exit system call) ourselves; otherwise the CPU simply keeps executing whatever bytes come after our code. That’s what libc does for us in C programs, so let’s handle that:
%define SYSCALL_EXIT 60
global _start
_start:
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
nasm uses the Intel syntax: <instruction> <destination>, <source>, so mov rdi, 0 puts 0 into the register rdi. Other assemblers use the AT&T syntax which swaps the source and destination. My advice: pick one syntax and one assembler and stick to it, both syntaxes are fine and most tools have some support for both.
Following the System V ABI, which is required on Linux and other Unices for system calls, invoking a system call requires us to put the system call code in the register rax, the parameters to the syscall (up to 6) in the registers rdi, rsi, rdx, rcx, r8, r9, and additional parameters, if any, on the stack (which will not happen in this program so we can forget about it).
We then use the instruction syscall and check rax for the return value, 0 usually meaning: no error.
Note that Linux has a ‘fun’ difference, which is that the fourth parameter of a system call is actually passed using the register r10 instead of rcx. That’s because the syscall instruction itself overwrites rcx (and r11).
Note that the System V ABI is required when making system calls and when interfacing with C but we are free to use whatever conventions we want in our own assembly code. For a long time, Go was using a different calling convention than the System V ABI, for example, when calling functions (passing arguments on the stack). Most tools (debuggers, profilers) expect the System V ABI though, so I recommend sticking to it.
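For comparison, here is a minimal C sketch of the same convention, assuming Linux and glibc's generic syscall(2) wrapper. raw_write is a hypothetical helper name used only for illustration; the wrapper is what places the system call number in rax and the arguments in rdi, rsi and rdx before executing the syscall instruction.

```c
#define _GNU_SOURCE
#include <assert.h>      /* assert(), handy when trying the helper out */
#include <sys/syscall.h> /* SYS_write */
#include <unistd.h>      /* syscall() */

/* Hypothetical helper: write(2) issued through the generic syscall(2) wrapper.
 * Returns the number of bytes written, or -1 with errno set on error. */
static long raw_write(int fd, const void *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "hello", 5) should behave like the write system call we are about to issue by hand in assembly.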
Back to our program: when we run it, we see…nothing. That’s because everything went well, true to the UNIX philosophy!
We can check the exit code, for example with echo $? in the shell right after running the program: it prints 0.
Changing mov rdi, 0 to mov rdi, 8 will now result in an exit code of 8.
Another way to observe system calls made by a program is with strace, which will also prove very useful when troubleshooting. On some BSDs, its equivalent is truss or dtruss.
$ strace ./main
execve("./main", ["./main"], 0x7ffc60e6bf10 /* 60 vars */) = 0
exit(8) = ?
+++ exited with 8 +++
Let’s change it back to 0 and proceed.
A stack primer
Before we can continue, we need to know the basics of how the stack works in assembly since we have no friendly compiler to do that for us.
The three most important things about the stack are:
- It grows downwards: to reserve more space on the stack, we decrease the value of rsp.
- A function must restore the stack pointer to its original value before the function returns, meaning, either remember the original value and set rsp to this, or, match every decrement by an increment of the same value.
- Before a function call, the stack pointer needs to be 16 bytes aligned, according to the System V ABI. Also, at the very beginning of a function, the stack pointer value is: 16*N + 8. That’s because before the function call, its value was 16 byte aligned, i.e. 16*N, and the call instruction pushes on the stack the current location (the register rip, which is 8 bytes long), to know where to jump when the called function returns.
Not abiding by these rules will result in nasty crashes, so be warned. That’s because the location of where to jump when the function returns will likely be overwritten and the program will jump to the wrong location. That, or the stack content will be overwritten and the program will operate on wrong values. Bad either way.
A small stack example
Let’s write a function that prints hello to the standard out, using the stack, to learn the ropes.
We need to reserve (at least) 5 bytes on the stack, since that’s the length in bytes of hello.
The stack looks like this:
And rsp points to the bottom of it.
Here’s how we access each element:
Memory location (example) | Assembly code | Stack element |
---|---|---|
0x1016 | … | |
0x1015 | rsp + 5 | rbp |
0x1014 | rsp + 4 | o |
0x1013 | rsp + 3 | l |
0x1012 | rsp + 2 | l |
0x1011 | rsp + 1 | e |
0x1010 | rsp + 0 | h |
We then pass the address on the stack of the beginning of the string to the write syscall, as well as its length:
%define SYSCALL_WRITE 1
%define STDOUT 1
print_hello:
push rbp ; Save rbp on the stack to be able to restore it at the end of the function.
mov rbp, rsp ; Set rbp to rsp
sub rsp, 5 ; Reserve 5 bytes of space on the stack.
mov BYTE [rsp + 0], 'h' ; Set each byte on the stack to a string character.
mov BYTE [rsp + 1], 'e'
mov BYTE [rsp + 2], 'l'
mov BYTE [rsp + 3], 'l'
mov BYTE [rsp + 4], 'o'
; Make the write syscall
mov rax, SYSCALL_WRITE
mov rdi, STDOUT ; Write to stdout.
lea rsi, [rsp] ; Address on the stack of the string.
mov rdx, 5 ; Pass the length of the string which is 5.
syscall
add rsp, 5 ; Restore the stack to its original value.
pop rbp ; Restore rbp
ret
lea destination, source loads the effective address of the source into the destination, which is how C pointers are implemented. To dereference a memory location we use square brackets. So, assuming we just have loaded an address into rdi with lea, e.g. lea rdi, [hello_world], and we want to store the value at the address into rax, we do: mov rax, [rdi]. When the size cannot be deduced from a register operand, we have to tell nasm how many bytes to dereference with BYTE, WORD, DWORD, QWORD, e.g. mov DWORD [rdi], 0 (the size can also be spelled out for clarity, as in mov eax, DWORD [rdi]), because nasm does not keep track of the sizes of each variable. That’s also what the C compiler does when we dereference an int8_t, int16_t, int32_t, and int64_t pointer, respectively.
There’s a lot to unpack here.
First, what is rbp? That’s a register like any other. But, you can choose to follow the convention of not using this register like the other registers, to store arbitrary values, and instead, use it to store a linked list of call frames. That’s a lot of words.
Basically, at the very beginning of a function, the value of rbp is saved on the stack (that’s push rbp). Since rbp stores an address (the address of the frame that called us), we are storing on the stack the address of the caller frame in a known location.
Immediately after that, we set rbp to rsp, that is, to the stack pointer at the beginning of the function. push rbp and mov rbp, rsp are thus known as the function prolog.
For the rest of the function body, we treat rbp as a constant and only decrease rsp if we need to reserve space on the stack.
So if function A calls function B which in turn calls function C, and each function stores on the stack the address of the caller frame, we know where to find on the stack the address of each. Thus, we can print a stack trace in any location of our program simply by inspecting the stack. Pretty nifty. That’s already very useful to profilers and other similar tools.
We must not forget of course, just before we exit the function, to restore rbp to its original value (which is still on the stack at that point): that’s pop rbp. This is also known as the function epilog. Another way to look at it is that we remove the last element of the linked list of call frames, since we are exiting the leaf function.
Don’t worry if you have not fully understood everything, just remember to always have the function epilogs and prologs and you’ll be fine:
my_function:
push rbp
mov rbp, rsp
sub rsp, N
[...]
add rsp, N
pop rbp
ret
Note: There is an optimization method that uses rbp as a standard register (with a C compiler, that’s the flag -fomit-frame-pointer), which means we lose the information about the call stack. My advice is: never do this, it’s not worth it.
Wait, but didn’t you say the stack needs to be 16 byte aligned (that is, a multiple of 16)? Last time I checked, 5 is not really a multiple of 16!
Good catch! The only reason why this program works, is that print_hello is a leaf function, meaning it does not call another function. Remember, the stack needs to be 16 bytes aligned when we do a call!
So the correct way would be:
print_hello:
push rbp
mov rbp, rsp
sub rsp, 16
mov BYTE [rsp + 0], 'h'
mov BYTE [rsp + 1], 'e'
mov BYTE [rsp + 2], 'l'
mov BYTE [rsp + 3], 'l'
mov BYTE [rsp + 4], 'o'
mov rax, SYSCALL_WRITE
mov rdi, STDOUT
lea rsi, [rsp]
mov rdx, 5
syscall
call print_world
add rsp, 16
pop rbp
ret
Since when we enter the function, the value of rsp is 16*N+8, and pushing rbp decrements it by 8, the stack pointer is 16 bytes aligned at the point of sub rsp, 16. Decrementing it by 16 (or a multiple of 16) keeps it 16 bytes aligned.
We can now safely call another function from within print_hello:
print_world:
push rbp
mov rbp, rsp
sub rsp, 16
mov BYTE [rsp + 0], ' '
mov BYTE [rsp + 1], 'w'
mov BYTE [rsp + 2], 'o'
mov BYTE [rsp + 3], 'r'
mov BYTE [rsp + 4], 'l'
mov BYTE [rsp + 5], 'd'
mov rax, SYSCALL_WRITE
mov rdi, STDOUT
lea rsi, [rsp]
mov rdx, 6
syscall
add rsp, 16
pop rbp
ret
print_hello:
push rbp
mov rbp, rsp
sub rsp, 16
mov BYTE [rsp + 0], 'h'
mov BYTE [rsp + 1], 'e'
mov BYTE [rsp + 2], 'l'
mov BYTE [rsp + 3], 'l'
mov BYTE [rsp + 4], 'o'
mov rax, SYSCALL_WRITE
mov rdi, STDOUT
lea rsi, [rsp]
mov rdx, 5
syscall
call print_world
add rsp, 16
pop rbp
ret
And we get hello world as an output.
Now, try to do sub rsp, 5 in print_hello, and your program may crash. There is no guarantee, that’s what makes it hard to track down.
My advice is:
- Always use the standard function prologs and epilogs
- Always increment/decrement rsp by (a multiple of) 16
- If you have to decrement rsp by a value that’s unknown at compile time (similar to how alloca() works in C), you can and rsp, -16 to 16 bytes align it.
- Address items on the stack relative to rsp, i.e. mov BYTE [rsp + 4], 'o'
And you’ll be safe.
The and rsp, -16 trick is interesting, see for yourself:
(gdb) p -100 & -16
$1 = -112
(gdb) p -112 & -16
$2 = -112
Which translates in assembly to:
sub rsp, 100
and rsp, -16
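The same arithmetic can be sketched in C. This is only an illustration of the bit trick, with hypothetical helper names (align_down_16, reserve_aligned) that do not appear in the program: -16 has its low 4 bits clear in two's complement, so the AND rounds an address down to the previous multiple of 16.

```c
#include <assert.h>
#include <stdint.h>

/* Round `addr` down to the previous multiple of 16, like `and rsp, -16`. */
static uint64_t align_down_16(uint64_t addr) {
    return addr & ~(uint64_t)15; /* ~15 has the same bit pattern as -16 */
}

/* Hypothetical helper mirroring `sub rsp, n` followed by `and rsp, -16`. */
static uint64_t reserve_aligned(uint64_t rsp, uint64_t n) {
    return align_down_16(rsp - n);
}
```

Because the stack grows downwards, rounding down can only reserve a few more bytes than requested, never fewer.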
Finally, following these conventions means that our assembly functions can be safely called from C or other languages following the System V ABI, without any modification, which is great.
I have not talked about the red zone which is a 128 byte region just below the stack pointer which our program is free to use as it pleases without having to change the stack pointer. In my opinion, it is not helpful and creates hard to track bugs, so I do not recommend using it. To disable it entirely, run: nasm -f elf64 -g main.nasm && cc main.o -static -o main -mno-red-zone -nostdlib.
Opening a socket
We are now ready to open a socket with the socket(2) syscall, so we add a few constants, taken from the libc headers (note that these values might actually be different on a different Unix, I have not checked. Again, a few %ifdef can easily remedy this discrepancy):
%define AF_UNIX 1
%define SOCK_STREAM 1
%define SYSCALL_SOCKET 41
The AF_UNIX constant means we want a Unix domain socket, and SOCK_STREAM means a reliable, connection-oriented byte stream (TCP-like). We use a domain socket since we know that our server is running on the same machine and it should be faster, but we could change it to AF_INET to connect to a remote IPv4 address for example.
We then fill the relevant registers with these values and invoke the system call:
mov rax, SYSCALL_SOCKET
mov rdi, AF_UNIX ; Unix socket.
mov rsi, SOCK_STREAM ; Tcp-like.
mov rdx, 0 ; Automatic protocol.
syscall
The C equivalent would be: socket(AF_UNIX, SOCK_STREAM, 0);. So you see that if we fill the registers in the same order as the C function parameters, we stay close to what C code would do.
The whole program now looks like this:
BITS 64 ; 64 bits.
CPU X64 ; Target the x86_64 family of CPUs.
section .text
%define AF_UNIX 1
%define SOCK_STREAM 1
%define SYSCALL_SOCKET 41
%define SYSCALL_EXIT 60
global _start
_start:
; open a unix socket.
mov rax, SYSCALL_SOCKET
mov rdi, AF_UNIX ; Unix socket.
mov rsi, SOCK_STREAM ; Tcp-like.
mov rdx, 0 ; Automatic protocol.
syscall
; The end.
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
Building and running it under strace shows that it works and we get a socket with the file descriptor 3 (in this case, it might be different for you if you are following at home):
$ nasm -f elf64 -g main.nasm && ld main.o -static -o main
$ strace ./main
execve("./main", ["./main"], 0x7ffe54dfe550 /* 60 vars */) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 3
exit(0) = ?
+++ exited with 0 +++
Connecting to the server
Now that we have created a socket, we can connect to the server with the connect(2) system call.
It’s a good time to extract that logic in its own little function, just like in any other high-level language.
x11_connect_to_server:
; TODO
In assembly, a function is simply a label we can jump to. But for clarity, both for readers of the code and tools, we can add a hint that this is a real function we can call, like this: call x11_connect_to_server. This will improve the call stack for example when using strace -k. This hint has the form (in nasm): static <name of the function>:function.
Of course, we also need to add our standard function prolog and epilog:
x11_connect_to_server:
static x11_connect_to_server:function
push rbp
mov rbp, rsp
pop rbp
ret
An additional help when reading functions in assembly code is adding comments describing what parameters they accept and what the return value is, if any. Since there is no language level feature for this, we resort to comments:
; Create a UNIX domain socket and connect to the X11 server.
; @returns The socket file descriptor.
x11_connect_to_server:
static x11_connect_to_server:function
push rbp
mov rbp, rsp
pop rbp
ret
First, let’s move the socket creation logic to our function and call it in the program:
; Create a UNIX domain socket and connect to the X11 server.
; @returns The socket file descriptor.
x11_connect_to_server:
static x11_connect_to_server:function
push rbp
mov rbp, rsp
; Open a Unix socket: socket(2).
mov rax, SYSCALL_SOCKET
mov rdi, AF_UNIX ; Unix socket.
mov rsi, SOCK_STREAM ; Tcp-like.
mov rdx, 0 ; Automatic protocol.
syscall
cmp rax, 0
jle die
mov rdi, rax ; Store socket fd in `rdi` for the remainder of the function.
pop rbp
ret
die:
mov rax, SYSCALL_EXIT
mov rdi, 1
syscall
_start:
global _start:function
call x11_connect_to_server
; The end.
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall
The error checking is very simplistic: we only check that the return value of the system call (in rax) is what we expect, otherwise we exit the program with a non-zero code by jumping to the die section.
jle is a conditional jump, which inspects global flags, hopefully set just before with cmp or test, and jumps to a label if the condition is true. Here, we compare the returned value with 0, and if it is lower or equal to 0, we jump to the error label. That’s how we implement conditionals and loops.
Ok, we can finally connect to the server now. The connect(2) system call takes the address of a sockaddr_un structure as the second argument. This structure is too big to fit in a register.
This is the first syscall we encounter that needs to be passed a pointer, in other words, the address of a region in memory. That region can be on the stack or on the heap, or even be our own executable mapped in memory. That’s assembly, we get to do whatever we want.
Since we want to keep things simple and fast, we will store everything in this program on the stack. And since we have 8 MiB of it (according to limit, on my machine, that is), it’ll be plenty enough. Actually, the most space we will need on the stack in this program will be 32 KiB.
The size of the sockaddr_un structure is 110 bytes, so we reserve 112 to align rsp to 16 bytes.
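These numbers can be double-checked from C. The sketch below assumes the Linux/glibc definition of struct sockaddr_un (a 2 byte sun_family followed by a 108 byte sun_path); other Unices may use a different layout. sockaddr_un_stack_size is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>     /* offsetof, size_t */
#include <sys/socket.h>
#include <sys/un.h>     /* struct sockaddr_un */

/* Bytes to reserve on the stack for the structure: its size rounded up
 * to the next multiple of 16 so that rsp stays 16 bytes aligned. */
static size_t sockaddr_un_stack_size(void) {
    return (sizeof(struct sockaddr_un) + 15) & ~(size_t)15;
}
```

On Linux this should give sizeof(struct sockaddr_un) == 110, with sun_path starting at offset 2, and a reservation of 112 bytes.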
Nasm does have structs, but they are rather a way to define offsets with a name, than structures like in C with a specific syntax to address a specific field. For the sake of simplicity, I will use the manual way, without nasm structs.
We set the first 2 bytes of this structure to AF_UNIX since this is a domain socket. Then comes the path of the Unix domain socket which X11 expects to be in a certain format. We want to display our window on the first monitor starting at 0, so the string is: /tmp/.X11-unix/X0.
In C, we would do:
const struct sockaddr_un addr = {.sun_family = AF_UNIX,
.sun_path = "/tmp/.X11-unix/X0"};
const int res =
connect(x11_socket_fd, (const struct sockaddr *)&addr, sizeof(addr));
How do we translate that to assembly, especially the string part?
We could set each byte to each character of the string in the structure, on the stack, manually, one by one. Another way to do it is to use the rep movsb idiom, which instructs the CPU to copy a character from a string A to another string B, N times. This is exactly what we need!
The way it works is:
- We put the string in the .rodata section (same as the data section but read-only)
- We load its address in rsi (it’s the source)
- We load the address of the string in the structure on the stack in rdi (it’s the destination)
- We set rcx to the number of bytes to be copied
- We use cld to clear the DF flag to ensure the copy is done forwards (since it can also be done backwards)
- We call rep movsb and voila
It’s basically memcpy from C.
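To make the comparison concrete, here is a rough C model of what cld followed by rep movsb does, one byte at a time. It is a sketch of the semantics, not of how the CPU implements it, and rep_movsb is just an illustrative function name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Forward byte copy: `src` plays the role of rsi, `dst` of rdi, `count` of rcx. */
static void rep_movsb(unsigned char *dst, const unsigned char *src, size_t count) {
    while (count != 0) {  /* rep: repeat until rcx reaches 0 */
        *dst++ = *src++;  /* movsb: copy one byte, then advance rsi and rdi (DF = 0) */
        count--;          /* rep decrements rcx after each iteration */
    }
}
```

Note that after the copy all three "registers" have changed, which is why the assembly version has to save anything it still needs from rdi, rsi or rcx beforehand.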
This is an interesting case: we can see that some instructions expect some of their operands to be in certain registers and there is no way around it. So, we have to plan ahead and expect these registers to be overwritten. If we need to keep their original values around, we have to store these values elsewhere, for example on the stack (that’s called spilling) or in other registers. This is a broader topic of register allocation which is NP-hard! In small functions, it’s manageable though.
First, the .rodata section:
section .rodata
sun_path: db "/tmp/.X11-unix/X0", 0
static sun_path:data
Then we copy the string:
mov WORD [rsp], AF_UNIX ; Set sockaddr_un.sun_family to AF_UNIX
; Fill sockaddr_un.sun_path with: "/tmp/.X11-unix/X0".
lea rsi, [sun_path]
mov r12, rdi ; Save the socket file descriptor in `rdi` in `r12`.
lea rdi, [rsp + 2]
cld ; Move forward
mov ecx, 18 ; Length is 18 with the null terminator (17 characters + 1).
rep movsb ; Copy.
ecx is the 32 bit form of the register rcx. This handy table lists all of the forms for all of the registers. On x86-64, writing to a 32 bit register such as ecx automatically zeroes the upper 32 bits of the full 64 bit register, so the mov ecx above sets the whole of rcx. But be wary of the pitfall of only setting a value in the 8 or 16 bit part of a register (cl, ch, cx), and then using the whole register later. In that case, the rest of the bits that have not been set will contain some past value, which is hard to troubleshoot. The solution is to use movzx to zero extend, meaning setting the rest of the bits to 0. A good way to visualize this is to use info registers inside gdb, and that will display for each register the value for each of its forms, e.g. for rcx, it will display the value for rcx, ecx, cx, ch, cl.
Then, we do the syscall, check the returned value, exit the program if the value is not 0, and finally return the socket file descriptor, which will be used every time in the rest of the program when talking to the X11 server.
Everything together, it looks like:
; Create a UNIX domain socket and connect to the X11 server.
; @returns The socket file descriptor.
x11_connect_to_server:
static x11_connect_to_server:function
push rbp
mov rbp, rsp
; Open a Unix socket: socket(2).
mov rax, SYSCALL_SOCKET
mov rdi, AF_UNIX ; Unix socket.
mov rsi, SOCK_STREAM ; Tcp-like.
mov rdx, 0 ; Automatic protocol.
syscall
cmp rax, 0
jle die
mov rdi, rax ; Store socket fd in `rdi` for the remainder of the function.
sub rsp, 112 ; Store struct sockaddr_un on the stack.
mov WORD [rsp], AF_UNIX ; Set sockaddr_un.sun_family to AF_UNIX
; Fill sockaddr_un.sun_path with: "/tmp/.X11-unix/X0".
lea rsi, [sun_path]
mov r12, rdi ; Save the socket file descriptor in `rdi` in `r12`.
lea rdi, [rsp + 2]
cld ; Move forward
mov ecx, 18 ; Length is 18 with the null terminator (17 characters + 1).
rep movsb ; Copy.
; Connect to the server: connect(2).
%define SYSCALL_CONNECT 42 ; Linux x86_64 value.
mov rax, SYSCALL_CONNECT
mov rdi, r12
lea rsi, [rsp]
%define SIZEOF_SOCKADDR_UN 2+108
mov rdx, SIZEOF_SOCKADDR_UN
syscall
cmp rax, 0
jne die
mov rax, rdi ; Return the socket fd.
add rsp, 112
pop rbp
ret
We are ready to talk to the X11 server!
Sending data over the socket
There is the send(2) syscall to do this, but we can keep it simple and use the generic write(2) syscall instead. Either way works.
%define SYSCALL_WRITE 1
The C structure for the handshake looks like this:
typedef struct {
u8 order;
u8 pad1;
u16 major, minor;
u16 auth_proto, auth_data;
u16 pad2;
} x11_connection_req_t;
pad* fields can be ignored since they are padding and their value is not read by the server.
For our handshake, we need to set the order to be l, that is, little-endian, since X11 can be told to interpret messages as big or little endian. Since x64 is little-endian, we do not want to have an endianness translation layer and so we stick to little-endian.
We also need to set the major field, which is the version, to 11. I will leave it to the reader to guess why.
In C, we would do:
x11_connection_req_t req = {.order = 'l', .major = 11};
This structure is only 12 bytes long, but since we will have to read the response from the server which is quite big (around 14 KiB during my testing), we will right away reserve a lot of space on the stack, 32 KiB, to be safe:
sub rsp, 1<<15
mov BYTE [rsp + 0], 'l' ; Set order to 'l'.
mov WORD [rsp + 2], 11 ; Set major version to 11.
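As a cross-check of the offsets used above (order at byte 0, major at byte 2), the structure can be spelled out in standard C with fixed-width types. This is a sketch: the u8/u16 aliases from the article are replaced with stdint.h types, and x11_make_handshake is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  order;                 /* 'l' for little-endian, 'B' for big-endian */
    uint8_t  pad1;
    uint16_t major, minor;
    uint16_t auth_proto, auth_data; /* lengths of the auth name and data, 0 here */
    uint16_t pad2;
} x11_connection_req_t;

/* The handshake must be exactly 12 bytes: that is the byte count to write. */
_Static_assert(sizeof(x11_connection_req_t) == 12, "handshake is 12 bytes");

static x11_connection_req_t x11_make_handshake(void) {
    x11_connection_req_t req = {.order = 'l', .major = 11};
    return req; /* every other field is zero-initialized */
}
```

Unlike the assembly version, the C initializer zeroes the fields we do not set, so no leftover stack contents can end up in the message.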
Then we send it to the server:
; Send the handshake to the server: write(2).
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 12 ; The handshake is 12 bytes long.
syscall
cmp rax, 12 ; Check that all bytes were written.
jnz die
After that, we read the server response, which should be at first 8 bytes:
%define SYSCALL_READ 0 ; Linux x86_64 value.
; Read the server response: read(2).
; Use the stack for the read buffer.
; The X11 server first replies with 8 bytes. Once these are read, it replies with a much bigger message.
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 8
syscall
cmp rax, 8 ; Check that the server replied with 8 bytes.
jnz die
cmp BYTE [rsp], 1 ; Check that the server sent 'success' (first byte is 1).
jnz die
The first byte in the server response is 0 for failure and 1 for success (and 2 for authentication but we will not need it here).
The server then sends a big message with a lot of general information, which we will need for later, so we store certain fields in global variables located in the data section.
First we add those variables, each 4 bytes big:
section .data
id: dd 0
static id:data
id_base: dd 0
static id_base:data
id_mask: dd 0
static id_mask:data
root_visual_id: dd 0
static root_visual_id:data
Then we read the server response, and skip over the parts we are not interested in. This boils down to incrementing a pointer by a dynamic value, a few times. Note that since we do not do any checks here, that would be a great attack vector to trigger a stack overflow or such in our program.
; Read the rest of the server response: read(2).
; Use the stack for the read buffer.
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 1<<15
syscall
cmp rax, 0 ; Check that the server replied with something.
jle die
; Set id_base globally.
mov edx, DWORD [rsp + 4]
mov DWORD [id_base], edx
; Set id_mask globally.
mov edx, DWORD [rsp + 8]
mov DWORD [id_mask], edx
; Read the information we need, skip over the rest.
lea rdi, [rsp] ; Pointer that will skip over some data.
mov cx, WORD [rsp + 16] ; Vendor length (v).
movzx rcx, cx
mov al, BYTE [rsp + 21] ; Number of formats (n).
movzx rax, al ; Fill the rest of the register with zeroes to avoid garbage values.
imul rax, 8 ; sizeof(format) == 8
add rdi, 32 ; Skip the connection setup
add rdi, rcx ; Skip over the vendor information (v).
add rdi, rax ; Skip over the format information (n*8).
mov eax, DWORD [rdi] ; Store (and return) the window root id.
; Set the root_visual_id globally.
mov edx, DWORD [rdi + 32]
mov DWORD [root_visual_id], edx
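The pointer arithmetic above can also be sketched in C, this time with the bounds checks the assembly skips. The offsets (4, 8, 16, 21, 32) are the ones used in the assembly; x11_setup_info_t and x11_parse_setup are hypothetical names, and the host is assumed to be little-endian like the 'l' handshake. One deliberate difference: this sketch rounds the vendor length up to a multiple of 4, which is my reading of the padding rule in the X11 protocol specification, whereas the assembly uses the raw length.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t id_base, id_mask, root_window_id, root_visual_id;
} x11_setup_info_t; /* hypothetical type, for illustration only */

/* Unaligned-safe loads in host byte order. */
static uint16_t read_u16(const uint8_t *p) { uint16_t v; memcpy(&v, p, sizeof v); return v; }
static uint32_t read_u32(const uint8_t *p) { uint32_t v; memcpy(&v, p, sizeof v); return v; }

/* `buf` holds the second read, i.e. everything after the first 8 bytes.
 * Returns 0 on success, -1 if `len` is too small for the advertised sizes. */
static int x11_parse_setup(const uint8_t *buf, size_t len, x11_setup_info_t *out) {
    assert(buf != NULL && out != NULL);
    if (len < 32) return -1;                               /* fixed-size part */
    size_t vendor_len = read_u16(buf + 16);                /* v */
    size_t format_count = buf[21];                         /* n */
    size_t vendor_padded = (vendor_len + 3) & ~(size_t)3;  /* pad v to 4 bytes */
    size_t screen = 32 + vendor_padded + 8 * format_count; /* first screen */
    if (len < screen + 36) return -1;  /* root id at +0, root visual at +32 */
    out->id_base = read_u32(buf + 4);
    out->id_mask = read_u32(buf + 8);
    out->root_window_id = read_u32(buf + screen);
    out->root_visual_id = read_u32(buf + screen + 32);
    return 0;
}
```

Passing the number of bytes actually returned by read(2) as len is what turns a malformed or truncated reply into a clean error instead of an out-of-bounds read.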
All together:
; Send the handshake to the X11 server and read the returned system information.
; @param rdi The socket file descriptor
; @returns The window root id (uint32_t) in rax.
x11_send_handshake:
static x11_send_handshake:function
push rbp
mov rbp, rsp
sub rsp, 1<<15
mov BYTE [rsp + 0], 'l' ; Set order to 'l'.
mov WORD [rsp + 2], 11 ; Set major version to 11.
; Send the handshake to the server: write(2).
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 12 ; The handshake is 12 bytes long.
syscall
cmp rax, 12 ; Check that all bytes were written.
jnz die
; Read the server response: read(2).
; Use the stack for the read buffer.
; The X11 server first replies with 8 bytes. Once these are read, it replies with a much bigger message.
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 8
syscall
cmp rax, 8 ; Check that the server replied with 8 bytes.
jnz die
cmp BYTE [rsp], 1 ; Check that the server sent 'success' (first byte is 1).
jnz die
; Read the rest of the server response: read(2).
; Use the stack for the read buffer.
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 1<<15
syscall
cmp rax, 0 ; Check that the server replied with something.
jle die
; Set id_base globally.
mov edx, DWORD [rsp + 4]
mov DWORD [id_base], edx
; Set id_mask globally.
mov edx, DWORD [rsp + 8]
mov DWORD [id_mask], edx
; Learn the knowledge we'd like, skip over the remaining.
lea rdi, [rsp] ; Pointer that may skip over some information.
mov cx, WORD [rsp + 16] ; Vendor length (v).
movzx rcx, cx
add rcx, 3 ; Round v up to a multiple of 4: the vendor string is padded.
and rcx, -4
mov al, BYTE [rsp + 21]; Number of formats (n).
movzx rax, al ; Fill the rest of the register with zeroes to avoid garbage values.
imul rax, 8 ; sizeof(format) == 8
add rdi, 32 ; Skip the connection setup
add rdi, rcx ; Skip over the vendor information (v).
add rdi, rax ; Skip over the format information (n*8).
mov eax, DWORD [rdi] ; Store (and return) the window root id.
; Set the root_visual_id globally.
mov edx, DWORD [rdi + 32]
mov DWORD [root_visual_id], edx
add rsp, 1<<15
pop rbp
ret
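For readers who find this pointer walk easier to follow in a higher-level language, here is the same parsing step as a C sketch. It is only an illustration: the names `x11_setup` and `x11_parse_setup` are made up for this example, and it assumes the first 8 bytes of the reply have already been consumed, so `buf` points at the second, larger part. It rounds the vendor length up to a multiple of 4, because the X11 protocol pads the vendor string to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the values the handshake extracts. */
struct x11_setup {
    uint32_t id_base;
    uint32_t id_mask;
    uint32_t root_id;
    uint32_t root_visual_id;
};

/* Decode little-endian integers byte by byte, so the host byte order does not matter. */
static uint16_t read_u16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* `buf` points at the reply data that follows the first 8 bytes. */
static struct x11_setup x11_parse_setup(const uint8_t *buf) {
    assert(buf != NULL);
    struct x11_setup s;
    s.id_base = read_u32(buf + 4);
    s.id_mask = read_u32(buf + 8);

    size_t vendor_len = read_u16(buf + 16);                 /* v */
    size_t format_count = buf[21];                          /* n */
    size_t vendor_padded = (vendor_len + 3) & ~(size_t)3;   /* round up to 4 */

    /* 32 bytes of fixed fields, then the vendor string, then n formats of 8 bytes. */
    const uint8_t *screen = buf + 32 + vendor_padded + format_count * 8;
    s.root_id = read_u32(screen);             /* first field of the first screen */
    s.root_visual_id = read_u32(screen + 32); /* root visual, 32 bytes further */
    return s;
}
```

The offsets (4, 8, 16, 21, 32) are the same ones used in the assembly; only the rounding of `v` is spelled out explicitly.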
From this point on, I’ll assume you’re familiar with the basics of assembly and X11 and will not go as much into details.
Generating ids
When creating resources on the server side, we usually first generate an id on the client side, and send that id to the server when creating the resource.
We store the current id in a global variable and increment it each time a new id is generated.
This is how we do it:
; Increment the global id.
; @return The new id.
x11_next_id:
static x11_next_id:function
push rbp
mov rbp, rsp
mov eax, DWORD [id] ; Load global id.
mov edi, DWORD [id_base] ; Load global id_base.
mov edx, DWORD [id_mask] ; Load global id_mask.
; Return: id_mask & (id) | id_base
and eax, edx
or eax, edi
add DWORD [id], 1 ; Increment id.
pop rbp
ret
Opening a font
To open a font, which is a prerequisite to draw text, we send a message to the server specifying (part of) the name of the font we want, and the server will select a matching font.
To play with another font, you can use xfontsel, which displays all the font names that the X11 server knows about.
First, we generate an id for the font locally, and then we send it alongside the font name.
; Open the font on the server side.
; @param rdi The socket file descriptor.
; @param esi The font id.
x11_open_font:
static x11_open_font:function
push rbp
mov rbp, rsp
%define OPEN_FONT_NAME_BYTE_COUNT 5
%define OPEN_FONT_PADDING ((4 - (OPEN_FONT_NAME_BYTE_COUNT % 4)) % 4)
%define OPEN_FONT_PACKET_U32_COUNT (3 + (OPEN_FONT_NAME_BYTE_COUNT + OPEN_FONT_PADDING) / 4)
%define X11_OP_REQ_OPEN_FONT 0x2d
sub rsp, 6*8
mov DWORD [rsp + 0*4], X11_OP_REQ_OPEN_FONT | (OPEN_FONT_PACKET_U32_COUNT << 16) ; The length field counts 4-byte units (also 5 here, by coincidence).
mov DWORD [rsp + 1*4], esi
mov DWORD [rsp + 2*4], OPEN_FONT_NAME_BYTE_COUNT
mov BYTE [rsp + 3*4 + 0], 'f'
mov BYTE [rsp + 3*4 + 1], 'i'
mov BYTE [rsp + 3*4 + 2], 'x'
mov BYTE [rsp + 3*4 + 3], 'e'
mov BYTE [rsp + 3*4 + 4], 'd'
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, OPEN_FONT_PACKET_U32_COUNT*4
syscall
cmp rax, OPEN_FONT_PACKET_U32_COUNT*4
jnz die
add rsp, 6*8
pop rbp
ret
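To see how the padding and the packet length fall out for an arbitrary font name, here is a C sketch that builds the same request into a byte buffer. The function names `x11_pad` and `x11_build_open_font` are invented for this example, and it writes bytes in little-endian order to match the `'l'` byte order chosen in the handshake.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X11_OP_REQ_OPEN_FONT 0x2d

/* Number of padding bytes needed to reach the next multiple of 4. */
static size_t x11_pad(size_t n) {
    return (4 - (n % 4)) % 4;
}

/* Write an OpenFont request into `out` and return its size in bytes.
   `out` must have room for 12 + strlen(name) + 3 bytes. */
static size_t x11_build_open_font(uint8_t *out, uint32_t font_id, const char *name) {
    assert(out != NULL && name != NULL);
    size_t name_len = strlen(name);
    size_t padding = x11_pad(name_len);
    size_t u32_count = 3 + (name_len + padding) / 4;
    assert(name_len <= 0xffff && u32_count <= 0xffff);

    memset(out, 0, u32_count * 4);               /* also zeroes the padding */
    out[0] = X11_OP_REQ_OPEN_FONT;               /* opcode */
    out[2] = (uint8_t)(u32_count & 0xff);        /* request length in 4-byte units */
    out[3] = (uint8_t)(u32_count >> 8);
    out[4] = (uint8_t)(font_id & 0xff);          /* font id */
    out[5] = (uint8_t)((font_id >> 8) & 0xff);
    out[6] = (uint8_t)((font_id >> 16) & 0xff);
    out[7] = (uint8_t)(font_id >> 24);
    out[8] = (uint8_t)(name_len & 0xff);         /* name length in bytes */
    out[9] = (uint8_t)(name_len >> 8);
    memcpy(out + 12, name, name_len);            /* name, followed by the padding */
    return u32_count * 4;
}
```

For the name `fixed` (5 bytes), the padding is 3 bytes and the packet is 5 words, i.e. 20 bytes, which matches the constants in the assembly version.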
Creating a graphical context
Since an application in X11 can have multiple windows, we first need to create a graphical context containing the general information. When we create a window, we refer to this graphical context by id.
Again, we need to generate an id for the graphical context to be.
X11 stores a hierarchy of windows, so when creating the graphical context, we also need to give it the root window id (i.e. the parent id).
; Create a X11 graphical context.
; @param rdi The socket file descriptor.
; @param esi The graphical context id.
; @param edx The window root id.
; @param ecx The font id.
x11_create_gc:
static x11_create_gc:function
push rbp
mov rbp, rsp
sub rsp, 8*8
%define X11_OP_REQ_CREATE_GC 0x37
%define X11_FLAG_GC_FG 0x00000004
%define X11_FLAG_GC_BG 0x00000008
%define X11_FLAG_GC_FONT 0x00004000
%define X11_FLAG_GC_EXPOSE 0x00010000
%define CREATE_GC_FLAGS X11_FLAG_GC_BG | X11_FLAG_GC_FG | X11_FLAG_GC_FONT
%define CREATE_GC_PACKET_FLAG_COUNT 3
%define CREATE_GC_PACKET_U32_COUNT (4 + CREATE_GC_PACKET_FLAG_COUNT)
%define MY_COLOR_RGB 0x0000ffff
mov DWORD [rsp + 0*4], X11_OP_REQ_CREATE_GC | (CREATE_GC_PACKET_U32_COUNT<<16)
mov DWORD [rsp + 1*4], esi
mov DWORD [rsp + 2*4], edx
mov DWORD [rsp + 3*4], CREATE_GC_FLAGS
mov DWORD [rsp + 4*4], MY_COLOR_RGB
mov DWORD [rsp + 5*4], 0
mov DWORD [rsp + 6*4], ecx
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, CREATE_GC_PACKET_U32_COUNT*4
syscall
cmp rax, CREATE_GC_PACKET_U32_COUNT*4
jnz die
add rsp, 8*8
pop rbp
ret
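The one subtle rule in this request is that the values following the flags must appear in ascending order of their flag bit. The C sketch below lays out the same 7-word packet to make that ordering visible. All names are invented for this example; in the X11 protocol, bit `0x4` of the CreateGC value mask selects the foreground and bit `0x8` the background.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X11_OP_REQ_CREATE_GC 0x37
#define GC_MASK_FOREGROUND   0x00000004u
#define GC_MASK_BACKGROUND   0x00000008u
#define GC_MASK_FONT         0x00004000u

/* Store a 32-bit value in little-endian order. */
static void write_u32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

/* Write a CreateGC request into `out` (at least 28 bytes) and return its size. */
static size_t x11_build_create_gc(uint8_t *out, uint32_t gc_id, uint32_t drawable_id,
                                  uint32_t foreground, uint32_t background,
                                  uint32_t font_id) {
    assert(out != NULL);
    const uint32_t value_count = 3;
    const uint32_t u32_count = 4 + value_count;

    write_u32(out + 0, X11_OP_REQ_CREATE_GC | (u32_count << 16));
    write_u32(out + 4, gc_id);
    write_u32(out + 8, drawable_id);
    write_u32(out + 12, GC_MASK_FOREGROUND | GC_MASK_BACKGROUND | GC_MASK_FONT);
    /* Values follow in ascending order of their mask bit. */
    write_u32(out + 16, foreground);
    write_u32(out + 20, background);
    write_u32(out + 24, font_id);
    return u32_count * 4;
}
```

Adding another attribute would mean setting one more mask bit, inserting its value at the position given by the bit order, and increasing the word count by one.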
Creating the window
We can now create the window, which refers to the freshly created graphical context.
We also provide the desired x and y coordinates of the window, as well as the desired dimensions (width and height).
Note that these are merely hints and the resulting window may well have different coordinates and dimensions, for example when using a tiling window manager, or when resizing the window.
; Create the X11 window.
; @param rdi The socket file descriptor.
; @param esi The new window id.
; @param edx The window root id.
; @param ecx The root visual id.
; @param r8d Packed x and y.
; @param r9d Packed w and h.
x11_create_window:
static x11_create_window:function
push rbp
mov rbp, rsp
%define X11_OP_REQ_CREATE_WINDOW 0x01
%define X11_FLAG_WIN_BG_COLOR 0x00000002
%define X11_EVENT_FLAG_KEY_RELEASE 0x0002
%define X11_EVENT_FLAG_EXPOSURE 0x8000
%define X11_FLAG_WIN_EVENT 0x00000800
%define CREATE_WINDOW_FLAG_COUNT 2
%define CREATE_WINDOW_PACKET_U32_COUNT (8 + CREATE_WINDOW_FLAG_COUNT)
%define CREATE_WINDOW_BORDER 1
%define CREATE_WINDOW_GROUP 1
sub rsp, 12*8
mov DWORD [rsp + 0*4], X11_OP_REQ_CREATE_WINDOW | (CREATE_WINDOW_PACKET_U32_COUNT << 16)
mov DWORD [rsp + 1*4], esi
mov DWORD [rsp + 2*4], edx
mov DWORD [rsp + 3*4], r8d
mov DWORD [rsp + 4*4], r9d
mov DWORD [rsp + 5*4], CREATE_WINDOW_GROUP | (CREATE_WINDOW_BORDER << 16)
mov DWORD [rsp + 6*4], ecx
mov DWORD [rsp + 7*4], X11_FLAG_WIN_BG_COLOR | X11_FLAG_WIN_EVENT
mov DWORD [rsp + 8*4], 0
mov DWORD [rsp + 9*4], X11_EVENT_FLAG_KEY_RELEASE | X11_EVENT_FLAG_EXPOSURE
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, CREATE_WINDOW_PACKET_U32_COUNT*4
syscall
cmp rax, CREATE_WINDOW_PACKET_U32_COUNT*4
jnz die
add rsp, 12*8
pop rbp
ret
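The "packed" parameters deserve a short illustration: two 16-bit fields that sit next to each other in the packet are passed as a single 32-bit value, so that one `mov DWORD` writes both. Here is a small C sketch of that packing. The helper names are hypothetical; the only protocol fact relied upon is that X11 rejects windows with a zero width or height.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 16-bit values into one 32-bit word: `first` in the low half,
   `second` in the high half. Stored in little-endian memory, `first`
   ends up first on the wire. */
static uint32_t pack_u16_pair(uint16_t first, uint16_t second) {
    return (uint32_t)first | ((uint32_t)second << 16);
}

/* Pack a window size; width and height must be non-zero. */
static uint32_t pack_size(uint16_t width, uint16_t height) {
    assert(width > 0 && height > 0);
    return pack_u16_pair(width, height);
}
```

For instance, a position of (200, 200) packs to `0x00c800c8` and a size of 800 by 600 packs to `0x02580320`.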
Mapping the window
If you’re following along at home, and just ran the program, you will have noticed that nothing is displayed.
That’s because X11 does not show the window until we have mapped it. This is a simple message to send:
; Map a X11 window.
; @param rdi The socket file descriptor.
; @param esi The window id.
x11_map_window:
static x11_map_window:function
push rbp
mov rbp, rsp
sub rsp, 16
%define X11_OP_REQ_MAP_WINDOW 0x08
mov DWORD [rsp + 0*4], X11_OP_REQ_MAP_WINDOW | (2<<16)
mov DWORD [rsp + 1*4], esi
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 2*4
syscall
cmp rax, 2*4
jnz die
add rsp, 16
pop rbp
ret
We now have a black window:
Yay!
Polling for server messages
We would like to draw text in the window now, but we have to wait for the Expose event to be sent to us, which signifies that the window is visible, to be able to start drawing on it.
We actually want to listen for all server messages, be it errors or events, for example when the user presses a key on the keyboard.
If we do a simple blocking read(2), but the server sends nothing, the program will appear unresponsive. Not good.
The solution is to use the poll(2) system call to be awoken by the operating system whenever there is data to be read on the socket, a la NodeJS or Nginx.
First, we need to mark the socket as ‘non-blocking’ since it is by default in blocking mode:
; Set a file descriptor in non-blocking mode.
; @param rdi The file descriptor.
set_fd_non_blocking:
static set_fd_non_blocking:function
push rbp
mov rbp, rsp
mov rax, SYSCALL_FCNTL
mov rdi, rdi
mov rsi, F_GETFL
mov rdx, 0
syscall
cmp rax, 0
jl die
; `or` the current file status flags with O_NONBLOCK.
mov rdx, rax
or rdx, O_NONBLOCK
mov rax, SYSCALL_FCNTL
mov rdi, rdi
mov rsi, F_SETFL
mov rdx, rdx
syscall
cmp rax, 0
jl die
pop rbp
ret
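For comparison, the same read-modify-write of the file status flags looks like this in C. This sketch goes through the libc `fcntl` wrapper rather than the raw system call, and it returns an error code instead of exiting, which is a choice made for this example only.

```c
#include <assert.h>
#include <fcntl.h>

/* Put `fd` in non-blocking mode. Returns 0 on success, -1 on failure. */
static int set_fd_non_blocking(int fd) {
    assert(fd >= 0);

    /* Read the current file status flags. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }

    /* Write them back with O_NONBLOCK added, keeping the other flags. */
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        return -1;
    }
    return 0;
}
```

Reading the flags first matters: calling `F_SETFL` with only `O_NONBLOCK` would clear any other status flag that was already set on the descriptor.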
Then, we write a small function to read data on the socket. For simplicity, we only read 32 bytes of data, because most messages from X11 are of this size. We also return the first byte, which contains the event type.
; Read the X11 server reply.
; @return The message code in al.
x11_read_reply:
static x11_read_reply:function
push rbp
mov rbp, rsp
sub rsp, 32
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 32
syscall
cmp rax, 1
jle die
mov al, BYTE [rsp]
add rsp, 32
pop rbp
ret
We can now poll. If an error occurs or the other side has closed their end of the socket, we exit the program.
; Poll indefinitely messages from the X11 server with poll(2).
; @param rdi The socket file descriptor.
; @param esi The window id.
; @param edx The gc id.
poll_messages:
static poll_messages:function
push rbp
mov rbp, rsp
sub rsp, 32
%define POLLIN 0x001
%define POLLPRI 0x002
%define POLLOUT 0x004
%define POLLERR 0x008
%define POLLHUP 0x010
%define POLLNVAL 0x020
mov DWORD [rsp + 0*4], edi
mov DWORD [rsp + 1*4], POLLIN
mov DWORD [rsp + 16], esi ; window id
mov DWORD [rsp + 20], edx ; gc id
.loop:
mov rax, SYSCALL_POLL
lea rdi, [rsp]
mov rsi, 1
mov rdx, -1
syscall
cmp rax, 0
jle die
test WORD [rsp + 6], POLLERR | POLLHUP ; pollfd.revents is the 16-bit field at offset 6.
jnz die
mov rdi, [rsp + 0*4]
call x11_read_reply
jmp .loop
add rsp, 32
pop rbp
ret
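One iteration of this loop can be sketched in C, which also makes the layout of `struct pollfd` explicit: a 32-bit `fd`, then the 16-bit `events` we ask for, then the 16-bit `revents` the kernel fills in. The function name `x11_wait_and_read` is made up for this example, and it returns -1 instead of exiting the program.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Block until `fd` is readable, then read one 32-byte message.
   Returns the first byte (the message code), or -1 on error or hang-up. */
static int x11_wait_and_read(int fd) {
    assert(fd >= 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    /* One descriptor, and a timeout of -1, meaning: wait forever. */
    if (poll(&pfd, 1, -1) <= 0) {
        return -1;
    }

    /* revents is a bit set, so test the bits rather than compare for equality. */
    if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL)) {
        return -1;
    }

    uint8_t msg[32];
    ssize_t n = read(fd, msg, sizeof msg);
    if (n <= 1) {
        return -1;
    }
    return msg[0];
}
```

A caller would invoke this in a loop and dispatch on the returned code, exactly like the `.loop` label does in the assembly.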
Drawing text
At last, we can draw text. The small difficulty here is that the text is of unknown length in the general case, so we have to compute the size of the X11 message, including the padding at the end. So far, we only had messages of fixed size.
The official documentation has formulas to compute these values.
; Draw text in a X11 window with server-side text rendering.
; @param rdi The socket file descriptor.
; @param rsi The text string.
; @param edx The text string length in bytes.
; @param ecx The window id.
; @param r8d The gc id.
; @param r9d Packed x and y.
x11_draw_text:
static x11_draw_text:function
push rbp
mov rbp, rsp
sub rsp, 1024
mov DWORD [rsp + 1*4], ecx ; Store the window id directly in the packet data on the stack.
mov DWORD [rsp + 2*4], r8d ; Store the gc id directly in the packet data on the stack.
mov DWORD [rsp + 3*4], r9d ; Store x, y directly in the packet data on the stack.
mov r8d, edx ; Store the string length in r8 since edx will be overwritten next.
mov QWORD [rsp + 1024 - 8], rdi ; Store the socket file descriptor on the stack to free the register.
; Compute padding and packet u32 count with division and modulo 4.
mov eax, edx ; Put dividend in eax.
mov ecx, 4 ; Put divisor in ecx.
cdq ; Sign extend.
idiv ecx ; Compute eax / ecx, and put the remainder (i.e. modulo) in edx.
; LLVM optimizer magic: `(4-x)%4 == -x & 3`, for some reason.
neg edx
and edx, 3
mov r9d, edx ; Store padding in r9.
mov eax, r8d
add eax, r9d
shr eax, 2 ; Compute: eax /= 4
add eax, 4 ; eax now contains the packet u32 count.
%define X11_OP_REQ_IMAGE_TEXT8 0x4c
mov DWORD [rsp + 0*4], r8d
shl DWORD [rsp + 0*4], 8
or DWORD [rsp + 0*4], X11_OP_REQ_IMAGE_TEXT8
mov ecx, eax
shl ecx, 16
or [rsp + 0*4], ecx
; Copy the text string into the packet data on the stack.
mov rsi, rsi ; Source string in rsi.
lea rdi, [rsp + 4*4] ; Destination
cld ; Move forward
mov ecx, r8d ; String length.
rep movsb ; Copy.
mov rdx, rax ; packet u32 count
imul rdx, 4
mov rax, SYSCALL_WRITE
mov rdi, QWORD [rsp + 1024 - 8] ; fd
lea rsi, [rsp]
syscall
cmp rax, rdx
jnz die
add rsp, 1024
pop rbp
ret
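The size computation is the part that is easiest to get wrong, so here it is again as a C sketch. The function names are invented for this example. It also shows why the `neg` / `and 3` trick works: `& 3` keeps a value modulo 4, so `(-n) & 3` is the distance from `n` to the next multiple of 4, which is the same number as `(4 - n % 4) % 4`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X11_OP_REQ_IMAGE_TEXT8 0x4c

/* Store a 32-bit value in little-endian order. */
static void write_u32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

/* Padding needed to round n up to a multiple of 4, computed as (-n) & 3. */
static size_t x11_pad(size_t n) {
    return (0 - n) & 3;
}

/* Write an ImageText8 request into `out` and return its size in bytes.
   `out` needs 16 + text_len + 3 bytes; text_len must fit in one byte. */
static size_t x11_build_image_text8(uint8_t *out, uint32_t window_id, uint32_t gc_id,
                                    uint16_t x, uint16_t y,
                                    const char *text, size_t text_len) {
    assert(out != NULL && text != NULL && text_len <= 255);
    size_t padding = x11_pad(text_len);
    size_t u32_count = 4 + (text_len + padding) / 4;

    memset(out, 0, u32_count * 4);   /* also zeroes the padding bytes */
    /* Byte 0: opcode, byte 1: string length, bytes 2-3: packet length in words. */
    write_u32(out + 0, X11_OP_REQ_IMAGE_TEXT8 | ((uint32_t)text_len << 8) |
                       ((uint32_t)u32_count << 16));
    write_u32(out + 4, window_id);
    write_u32(out + 8, gc_id);
    write_u32(out + 12, (uint32_t)x | ((uint32_t)y << 16));
    memcpy(out + 16, text, text_len);
    return u32_count * 4;
}
```

For a 13-byte string, the padding is 3 bytes and the packet is 4 + 16/4 = 8 words, i.e. 32 bytes. Because the string length is stored in a single byte, one request can carry at most 255 characters.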
We then call this function inside the polling loop, and we store the ‘exposed’ state in a boolean on the stack to know whether we should render the text or not:
%define X11_EVENT_EXPOSURE 0xc
cmp eax, X11_EVENT_EXPOSURE
jnz .received_other_event
.received_exposed_event:
mov BYTE [rsp + 24], 1 ; Mark as exposed.
.received_other_event:
cmp BYTE [rsp + 24], 1 ; exposed?
jnz .loop
.draw_text:
mov rdi, [rsp + 0*4] ; socket fd
lea rsi, [hello_world] ; string
mov edx, 13 ; length
mov ecx, [rsp + 16] ; window id
mov r8d, [rsp + 20] ; gc id
mov r9d, 100 ; x
shl r9d, 16
or r9d, 100 ; y
call x11_draw_text
Finally, we see our Hello, world! text displayed inside the window:
The end
Wow, that was a lot. But we did it! We wrote a (albeit simplistic) GUI program in pure assembly, no dependencies, and that’s just 600 lines of code in the end.
How did we fare on the executable size part?
- With debug information: 10744 bytes (10 KiB)
- Without debug information (stripped): 8592 bytes (8 KiB)
- Stripped and OMAGIC (--omagic linker flag, from the man page: "Set the text and data sections to be readable and writable. Also, do not page-align the data segment"): 1776 bytes (1 KiB)
Not too shabby, a GUI program in 1 KiB.
Where to go from there?
- We could move text rendering client-side. Doing it server-side has several limitations.
- We could add shape rendering, such as quads and circles
- We could listen to keyboard and mouse events (the polling loop is easy to extend to do that)
I hope that you had as much fun as I did!
Addendum: the full code
; Build with: nasm -f elf64 -g main.nasm && ld main.o -static -o main
BITS 64 ; 64 bits.
CPU X64 ; Target the x86_64 family of CPUs.
section .rodata
sun_path: db "/tmp/.X11-unix/X0", 0
static sun_path:data
hello_world: db "Hello, world!"
static hello_world:data
section .data
id: dd 0
static id:data
id_base: dd 0
static id_base:data
id_mask: dd 0
static id_mask:data
root_visual_id: dd 0
static root_visual_id:data
section .text
%define AF_UNIX 1
%define SOCK_STREAM 1
%define SYSCALL_READ 0
%define SYSCALL_WRITE 1
%define SYSCALL_POLL 7
%define SYSCALL_SOCKET 41
%define SYSCALL_CONNECT 42
%define SYSCALL_EXIT 60
%define SYSCALL_FCNTL 72
; Create a UNIX domain socket and connect to the X11 server.
; @returns The socket file descriptor.
x11_connect_to_server:
static x11_connect_to_server:function
push rbp
mov rbp, rsp
; Open a Unix socket: socket(2).
mov rax, SYSCALL_SOCKET
mov rdi, AF_UNIX ; Unix socket.
mov rsi, SOCK_STREAM ; Tcp-like.
mov rdx, 0 ; Automatic protocol.
syscall
cmp rax, 0
jle die
mov rdi, rax ; Store socket fd in `rdi` for the remainder of the function.
sub rsp, 112 ; Store struct sockaddr_un on the stack.
mov WORD [rsp], AF_UNIX ; Set sockaddr_un.sun_family to AF_UNIX
; Fill sockaddr_un.sun_path with: "/tmp/.X11-unix/X0".
lea rsi, sun_path
mov r12, rdi ; Save the socket file descriptor in `rdi` in `r12`.
lea rdi, [rsp + 2]
cld ; Move forward
mov ecx, 18 ; Length is 18 with the null terminator.
rep movsb ; Copy.
; Connect to the server: connect(2).
mov rax, SYSCALL_CONNECT
mov rdi, r12
lea rsi, [rsp]
%define SIZEOF_SOCKADDR_UN 2+108
mov rdx, SIZEOF_SOCKADDR_UN
syscall
cmp rax, 0
jne die
mov rax, rdi ; Return the socket fd.
add rsp, 112
pop rbp
ret
; Send the handshake to the X11 server and read the returned system information.
; @param rdi The socket file descriptor
; @returns The window root id (uint32_t) in rax.
x11_send_handshake:
static x11_send_handshake:function
push rbp
mov rbp, rsp
sub rsp, 1<<15
mov BYTE [rsp + 0], 'l' ; Set order to 'l'.
mov WORD [rsp + 2], 11 ; Set major version to 11.
; Send the handshake to the server: write(2).
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 12*8
syscall
cmp rax, 12*8 ; Check that all bytes were written.
jnz die
; Read the server response: read(2).
; Use the stack for the read buffer.
; The X11 server first replies with 8 bytes. Once these are read, it replies with a much bigger message.
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 8
syscall
cmp rax, 8 ; Check that the server replied with 8 bytes.
jnz die
cmp BYTE [rsp], 1 ; Check that the server sent 'success' (first byte is 1).
jnz die
; Read the rest of the server response: read(2).
; Use the stack for the read buffer.
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 1<<15
syscall
cmp rax, 0 ; Check that the server replied with something.
jle die
; Set id_base globally.
mov edx, DWORD [rsp + 4]
mov DWORD [id_base], edx
; Set id_mask globally.
mov edx, DWORD [rsp + 8]
mov DWORD [id_mask], edx
; Read the information we need, skip over the rest.
lea rdi, [rsp] ; Pointer that will skip over some data.
mov cx, WORD [rsp + 16] ; Vendor length (v).
movzx rcx, cx
add rcx, 3 ; Round v up to a multiple of 4: the vendor string is padded.
and rcx, -4
mov al, BYTE [rsp + 21]; Number of formats (n).
movzx rax, al ; Fill the rest of the register with zeroes to avoid garbage values.
imul rax, 8 ; sizeof(format) == 8
add rdi, 32 ; Skip the connection setup
add rdi, rcx ; Skip over the vendor information (v).
add rdi, rax ; Skip over the format information (n*8).
mov eax, DWORD [rdi] ; Store (and return) the window root id.
; Set the root_visual_id globally.
mov edx, DWORD [rdi + 32]
mov DWORD [root_visual_id], edx
add rsp, 1<<15
pop rbp
ret
; Increment the global id.
; @return The new id.
x11_next_id:
static x11_next_id:function
push rbp
mov rbp, rsp
mov eax, DWORD [id] ; Load global id.
mov edi, DWORD [id_base] ; Load global id_base.
mov edx, DWORD [id_mask] ; Load global id_mask.
; Return: id_mask & (id) | id_base
and eax, edx
or eax, edi
add DWORD [id], 1 ; Increment id.
pop rbp
ret
; Open the font on the server side.
; @param rdi The socket file descriptor.
; @param esi The font id.
x11_open_font:
static x11_open_font:function
push rbp
mov rbp, rsp
%define OPEN_FONT_NAME_BYTE_COUNT 5
%define OPEN_FONT_PADDING ((4 - (OPEN_FONT_NAME_BYTE_COUNT % 4)) % 4)
%define OPEN_FONT_PACKET_U32_COUNT (3 + (OPEN_FONT_NAME_BYTE_COUNT + OPEN_FONT_PADDING) / 4)
%define X11_OP_REQ_OPEN_FONT 0x2d
sub rsp, 6*8
mov DWORD [rsp + 0*4], X11_OP_REQ_OPEN_FONT | (OPEN_FONT_PACKET_U32_COUNT << 16) ; The length field counts 4-byte units (also 5 here, by coincidence).
mov DWORD [rsp + 1*4], esi
mov DWORD [rsp + 2*4], OPEN_FONT_NAME_BYTE_COUNT
mov BYTE [rsp + 3*4 + 0], 'f'
mov BYTE [rsp + 3*4 + 1], 'i'
mov BYTE [rsp + 3*4 + 2], 'x'
mov BYTE [rsp + 3*4 + 3], 'e'
mov BYTE [rsp + 3*4 + 4], 'd'
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, OPEN_FONT_PACKET_U32_COUNT*4
syscall
cmp rax, OPEN_FONT_PACKET_U32_COUNT*4
jnz die
add rsp, 6*8
pop rbp
ret
; Create a X11 graphical context.
; @param rdi The socket file descriptor.
; @param esi The graphical context id.
; @param edx The window root id.
; @param ecx The font id.
x11_create_gc:
static x11_create_gc:function
push rbp
mov rbp, rsp
sub rsp, 8*8
%define X11_OP_REQ_CREATE_GC 0x37
%define X11_FLAG_GC_FG 0x00000004
%define X11_FLAG_GC_BG 0x00000008
%define X11_FLAG_GC_FONT 0x00004000
%define X11_FLAG_GC_EXPOSE 0x00010000
%define CREATE_GC_FLAGS X11_FLAG_GC_BG | X11_FLAG_GC_FG | X11_FLAG_GC_FONT
%define CREATE_GC_PACKET_FLAG_COUNT 3
%define CREATE_GC_PACKET_U32_COUNT (4 + CREATE_GC_PACKET_FLAG_COUNT)
%define MY_COLOR_RGB 0x0000ffff
mov DWORD [rsp + 0*4], X11_OP_REQ_CREATE_GC | (CREATE_GC_PACKET_U32_COUNT<<16)
mov DWORD [rsp + 1*4], esi
mov DWORD [rsp + 2*4], edx
mov DWORD [rsp + 3*4], CREATE_GC_FLAGS
mov DWORD [rsp + 4*4], MY_COLOR_RGB
mov DWORD [rsp + 5*4], 0
mov DWORD [rsp + 6*4], ecx
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, CREATE_GC_PACKET_U32_COUNT*4
syscall
cmp rax, CREATE_GC_PACKET_U32_COUNT*4
jnz die
add rsp, 8*8
pop rbp
ret
; Create the X11 window.
; @param rdi The socket file descriptor.
; @param esi The new window id.
; @param edx The window root id.
; @param ecx The root visual id.
; @param r8d Packed x and y.
; @param r9d Packed w and h.
x11_create_window:
static x11_create_window:function
push rbp
mov rbp, rsp
%define X11_OP_REQ_CREATE_WINDOW 0x01
%define X11_FLAG_WIN_BG_COLOR 0x00000002
%define X11_EVENT_FLAG_KEY_RELEASE 0x0002
%define X11_EVENT_FLAG_EXPOSURE 0x8000
%define X11_FLAG_WIN_EVENT 0x00000800
%define CREATE_WINDOW_FLAG_COUNT 2
%define CREATE_WINDOW_PACKET_U32_COUNT (8 + CREATE_WINDOW_FLAG_COUNT)
%define CREATE_WINDOW_BORDER 1
%define CREATE_WINDOW_GROUP 1
sub rsp, 12*8
mov DWORD [rsp + 0*4], X11_OP_REQ_CREATE_WINDOW | (CREATE_WINDOW_PACKET_U32_COUNT << 16)
mov DWORD [rsp + 1*4], esi
mov DWORD [rsp + 2*4], edx
mov DWORD [rsp + 3*4], r8d
mov DWORD [rsp + 4*4], r9d
mov DWORD [rsp + 5*4], CREATE_WINDOW_GROUP | (CREATE_WINDOW_BORDER << 16)
mov DWORD [rsp + 6*4], ecx
mov DWORD [rsp + 7*4], X11_FLAG_WIN_BG_COLOR | X11_FLAG_WIN_EVENT
mov DWORD [rsp + 8*4], 0
mov DWORD [rsp + 9*4], X11_EVENT_FLAG_KEY_RELEASE | X11_EVENT_FLAG_EXPOSURE
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, CREATE_WINDOW_PACKET_U32_COUNT*4
syscall
cmp rax, CREATE_WINDOW_PACKET_U32_COUNT*4
jnz die
add rsp, 12*8
pop rbp
ret
; Map a X11 window.
; @param rdi The socket file descriptor.
; @param esi The window id.
x11_map_window:
static x11_map_window:function
push rbp
mov rbp, rsp
sub rsp, 16
%define X11_OP_REQ_MAP_WINDOW 0x08
mov DWORD [rsp + 0*4], X11_OP_REQ_MAP_WINDOW | (2<<16)
mov DWORD [rsp + 1*4], esi
mov rax, SYSCALL_WRITE
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 2*4
syscall
cmp rax, 2*4
jnz die
add rsp, 16
pop rbp
ret
; Read the X11 server reply.
; @return The message code in al.
x11_read_reply:
static x11_read_reply:function
push rbp
mov rbp, rsp
sub rsp, 32
mov rax, SYSCALL_READ
mov rdi, rdi
lea rsi, [rsp]
mov rdx, 32
syscall
cmp rax, 1
jle die
mov al, BYTE [rsp]
add rsp, 32
pop rbp
ret
die:
mov rax, SYSCALL_EXIT
mov rdi, 1
syscall
; Set a file descriptor in non-blocking mode.
; @param rdi The file descriptor.
set_fd_non_blocking:
static set_fd_non_blocking:function
push rbp
mov rbp, rsp
%define F_GETFL 3
%define F_SETFL 4
%define O_NONBLOCK 2048
mov rax, SYSCALL_FCNTL
mov rdi, rdi
mov rsi, F_GETFL
mov rdx, 0
syscall
cmp rax, 0
jl die
; `or` the current file status flags with O_NONBLOCK.
mov rdx, rax
or rdx, O_NONBLOCK
mov rax, SYSCALL_FCNTL
mov rdi, rdi
mov rsi, F_SETFL
mov rdx, rdx
syscall
cmp rax, 0
jl die
pop rbp
ret
; Poll indefinitely messages from the X11 server with poll(2).
; @param rdi The socket file descriptor.
; @param esi The window id.
; @param edx The gc id.
poll_messages:
static poll_messages:function
push rbp
mov rbp, rsp
sub rsp, 32
%define POLLIN 0x001
%define POLLPRI 0x002
%define POLLOUT 0x004
%define POLLERR 0x008
%define POLLHUP 0x010
%define POLLNVAL 0x020
mov DWORD [rsp + 0*4], edi
mov DWORD [rsp + 1*4], POLLIN
mov DWORD [rsp + 16], esi ; window id
mov DWORD [rsp + 20], edx ; gc id
mov BYTE [rsp + 24], 0 ; exposed? (boolean)
.loop:
mov rax, SYSCALL_POLL
lea rdi, [rsp]
mov rsi, 1
mov rdx, -1
syscall
cmp rax, 0
jle die
test WORD [rsp + 6], POLLERR | POLLHUP ; pollfd.revents is the 16-bit field at offset 6.
jnz die
mov rdi, [rsp + 0*4]
call x11_read_reply
%define X11_EVENT_EXPOSURE 0xc
cmp eax, X11_EVENT_EXPOSURE
jnz .received_other_event
.received_exposed_event:
mov BYTE [rsp + 24], 1 ; Mark as exposed.
.received_other_event:
cmp BYTE [rsp + 24], 1 ; exposed?
jnz .loop
.draw_text:
mov rdi, [rsp + 0*4] ; socket fd
lea rsi, [hello_world] ; string
mov edx, 13 ; length
mov ecx, [rsp + 16] ; window id
mov r8d, [rsp + 20] ; gc id
mov r9d, 100 ; x
shl r9d, 16
or r9d, 100 ; y
call x11_draw_text
jmp .loop
add rsp, 32
pop rbp
ret
; Draw text in a X11 window with server-side text rendering.
; @param rdi The socket file descriptor.
; @param rsi The text string.
; @param edx The text string length in bytes.
; @param ecx The window id.
; @param r8d The gc id.
; @param r9d Packed x and y.
x11_draw_text:
static x11_draw_text:function
push rbp
mov rbp, rsp
sub rsp, 1024
mov DWORD [rsp + 1*4], ecx ; Store the window id directly in the packet data on the stack.
mov DWORD [rsp + 2*4], r8d ; Store the gc id directly in the packet data on the stack.
mov DWORD [rsp + 3*4], r9d ; Store x, y directly in the packet data on the stack.
mov r8d, edx ; Store the string length in r8 since edx will be overwritten next.
mov QWORD [rsp + 1024 - 8], rdi ; Store the socket file descriptor on the stack to free the register.
; Compute padding and packet u32 count with division and modulo 4.
mov eax, edx ; Put dividend in eax.
mov ecx, 4 ; Put divisor in ecx.
cdq ; Sign extend.
idiv ecx ; Compute eax / ecx, and put the remainder (i.e. modulo) in edx.
; LLVM optimizer magic: `(4-x)%4 == -x & 3`, for some reason.
neg edx
and edx, 3
mov r9d, edx ; Store padding in r9.
mov eax, r8d
add eax, r9d
shr eax, 2 ; Compute: eax /= 4
add eax, 4 ; eax now contains the packet u32 count.
%define X11_OP_REQ_IMAGE_TEXT8 0x4c
mov DWORD [rsp + 0*4], r8d
shl DWORD [rsp + 0*4], 8
or DWORD [rsp + 0*4], X11_OP_REQ_IMAGE_TEXT8
mov ecx, eax
shl ecx, 16
or [rsp + 0*4], ecx
; Copy the text string into the packet data on the stack.
mov rsi, rsi ; Source string in rsi.
lea rdi, [rsp + 4*4] ; Destination
cld ; Move forward
mov ecx, r8d ; String length.
rep movsb ; Copy.
mov rdx, rax ; packet u32 count
imul rdx, 4
mov rax, SYSCALL_WRITE
mov rdi, QWORD [rsp + 1024 - 8] ; fd
lea rsi, [rsp]
syscall
cmp rax, rdx
jnz die
add rsp, 1024
pop rbp
ret
_start:
global _start:function
call x11_connect_to_server
mov r15, rax ; Store the socket file descriptor in r15.
mov rdi, rax
call x11_send_handshake
mov r12d, eax ; Store the window root id in r12.
call x11_next_id
mov r13d, eax ; Store the gc_id in r13.
call x11_next_id
mov r14d, eax ; Store the font_id in r14.
mov rdi, r15
mov esi, r14d
call x11_open_font
mov rdi, r15
mov esi, r13d
mov edx, r12d
mov ecx, r14d
call x11_create_gc
call x11_next_id
mov ebx, eax ; Store the window id in ebx.
mov rdi, r15 ; socket fd
mov esi, eax
mov edx, r12d
mov ecx, [root_visual_id]
mov r8d, 200 | (200 << 16) ; x and y are 200
%define WINDOW_W 800
%define WINDOW_H 600
mov r9d, WINDOW_W | (WINDOW_H << 16)
call x11_create_window
mov rdi, r15 ; socket fd
mov esi, ebx
call x11_map_window
mov rdi, r15 ; socket fd
call set_fd_non_blocking
mov rdi, r15 ; socket fd
mov esi, ebx ; window id
mov edx, r13d ; gc id
call poll_messages
; The end.
mov rax, SYSCALL_EXIT
mov rdi, 0
syscall