TCC RISC-V Compiler runs in the Web Browser (thanks to Zig Compiler)
📝 4 Feb 2024
TCC is a Tiny C Compiler for 64-bit RISC-V (and other platforms)…
Can we run TCC Compiler in a Web Browser?
Let’s do it! We’ll compile TCC (Tiny C Compiler) from C to WebAssembly with Zig Compiler.
In this article, we talk about the tricky bits of our TCC port to WebAssembly…
- We compiled TCC to WebAssembly with one tiny fix
- But we hit some Missing POSIX Functions
- So we built minimal File Input and Output
- Hacked up a simple workaround for fprintf and friends
- And TCC produces a RISC-V Binary that runs OK
(After some fiddling and meddling in RISC-V Assembly)
Why are we doing this?
Today we’re running Apache NuttX RTOS inside a Web Browser, with WebAssembly + Emscripten + 64-bit RISC-V.
(Real-Time Operating System in a Web Browser on a General-Purpose Operating System!)
What if we could Build and Test NuttX Apps in the Web Browser…
- We type a C Program into our Web Browser (pic below)
- Compile it into an ELF Executable with TCC
- Copy the ELF Executable to the NuttX Filesystem
- And NuttX Emulator runs our ELF Executable inside the Web Browser
Learning NuttX becomes so cool! This is how we made it happen…
(Not to be confused with TTC Compiler)
Online Demo of TCC Compiler in WebAssembly
Click this link to try TCC Compiler in our Web Browser (pic above)
This C Program appears…
// Demo Program for TCC Compiler
int main(int argc, char *argv[]) {
  printf("Hello, World!!\n");
  return 0;
}
Click the “Compile” button. Our Web Browser calls TCC to compile the above program…
## Compile to RISC-V ELF
tcc -c hello.c
And it downloads the compiled RISC-V ELF a.out. We inspect the Compiled Output…
## Dump the RISC-V Disassembly
## of TCC Output
$ riscv64-unknown-elf-objdump \
    --syms --source --reloc --demangle \
    --line-numbers --wide --debugging \
    a.out
main():
## Prepare the Stack
0: fe010113 addi sp, sp, -32
4: 00113c23 sd ra, 24(sp)
8: 00813823 sd s0, 16(sp)
c: 02010413 addi s0, sp, 32
10: 00000013 nop
## Load to Register A0: "Hello World"
14: fea43423 sd a0, -24(s0)
18: feb43023 sd a1, -32(s0)
1c: 00000517 auipc a0, 0x0
1c: R_RISCV_PCREL_HI20 L.0
20: 00050513 mv a0, a0
20: R_RISCV_PCREL_LO12_I .text
## Call printf()
24: 00000097 auipc ra, 0x0
24: R_RISCV_CALL_PLT printf
28: 000080e7 jalr ra ## 24 <main+0x24>
## Clean up the Stack and
## return 0 to Caller
2c: 0000051b sext.w a0, zero
30: 01813083 ld ra, 24(sp)
34: 01013403 ld s0, 16(sp)
38: 02010113 addi sp, sp, 32
3c: 00008067 ret
Yep the 64-bit RISC-V Code looks legit! Much like our NuttX App. (So it will probably run on NuttX)
What just happened? We go behind the scenes…
(About the RISC-V Instructions)
Will Zig Compiler happily compile TCC to WebAssembly?
Amazingly, yes! (Pic above)
## Zig Compiler compiles TCC Compiler
## from C to WebAssembly. Produces `tcc.o`
zig cc \
  -c \
  -target wasm32-freestanding \
  -dynamic \
  -rdynamic \
  -lc \
  -DTCC_TARGET_RISCV64 \
  -DCONFIG_TCC_CROSSPREFIX="\"riscv64-\"" \
  -DCONFIG_TCC_CRTPREFIX="\"/usr/riscv64-linux-gnu/lib\"" \
  -DCONFIG_TCC_LIBPATHS="\"{B}:/usr/riscv64-linux-gnu/lib\"" \
  -DCONFIG_TCC_SYSINCLUDEPATHS="\"{B}/include:/usr/riscv64-linux-gnu/include\"" \
  -DTCC_GITHASH="\"main:b3d10a35\"" \
  -Wall \
  -O2 \
  -Wdeclaration-after-statement \
  -fno-strict-aliasing \
  -Wno-pointer-sign \
  -Wno-sign-compare \
  -Wno-unused-result \
  -Wno-format-truncation \
  -Wno-stringop-truncation \
  -I. \
  tcc.c
(About the Zig Compiler Options)
We link the TCC WebAssembly with our Zig Wrapper (that exports the TCC Compiler to JavaScript)…
## Compile our Zig Wrapper `tcc-wasm.zig` for WebAssembly
## and link it with TCC compiled for WebAssembly `tcc.o`
## Generates `tcc-wasm.wasm`
zig build-exe \
  -target wasm32-freestanding \
  -rdynamic \
  -lc \
  -fno-entry \
  -freference-trace \
  --verbose-cimport \
  --export=compile_program \
  zig/tcc-wasm.zig \
  tcc.o
## Test everything with Web Browser
## or Node.js
node zig/test.js
(See the Zig Wrapper tcc-wasm.zig)
(See the Test JavaScript test.js)
What’s inside our Zig Wrapper?
Our Zig Wrapper will…
- Receive the C Program from JavaScript
- Receive the TCC Compiler Options from JavaScript
- Call TCC Compiler to compile our program
- Return the compiled RISC-V ELF to JavaScript
Like so: tcc-wasm.zig
/// Call TCC Compiler to compile a
/// C Program to RISC-V ELF
pub export fn compile_program(
    options_ptr: [*:0]const u8, // Options for TCC Compiler (Pointer to JSON Array: ["-c", "hello.c"])
    code_ptr: [*:0]const u8, // C Program to be compiled (Pointer to String)
) [*]const u8 { // Returns a pointer to the `a.out` Compiled Code (Size in first 4 bytes)
    // Receive the C Program from
    // JavaScript and set our Read Buffer
    // https://blog.battlefy.com/zig-made-it-easy-to-pass-strings-back-and-forth-with-webassembly
    const code: []const u8 = std.mem.span(code_ptr);
    read_buf = code;
    // Omitted: Receive the TCC Compiler
    // Options from JavaScript
    // (JSON containing String Array: ["-c", "hello.c"])
    ...
    // Call the TCC Compiler
    _ = main(@intCast(argc), &args_ptrs);
    // Return pointer of `a.out` to
    // JavaScript. First 4 bytes: Size of
    // `a.out`. Followed by `a.out` data.
    const slice = std.heap.page_allocator.alloc(u8, write_buflen + 4)
        catch @panic("Failed to allocate memory");
    const size_ptr: *u32 = @alignCast(@ptrCast(slice.ptr));
    size_ptr.* = write_buflen;
    @memcpy(slice[4 .. write_buflen + 4], write_buf[0..write_buflen]);
    return slice.ptr; // TODO: Deallocate this memory
}
Plus a couple of Magical Bits that we’ll cover in the next section.
(How JavaScript calls our Zig Wrapper)
Zig Compiler compiles TCC without any code changes?
Inside TCC, we stubbed out the setjmp / longjmp to make it compile with Zig Compiler.
Everything else compiles OK!
Is it really OK to stub them out?
setjmp / longjmp are called to Handle Errors during TCC Compilation. Assuming everything goes hunky dory, we won’t need them.
Later we’ll find a better way to express our outrage. (Instead of jumping around)
We probe the Magical Bits inside our Zig Wrapper…
What’s this POSIX?
TCC Compiler was created as a Command-Line App. So it calls the usual POSIX Functions like fopen, fprintf, strncpy, malloc, …
But WebAssembly running in a Web Browser ain’t No Command Line! (Pic above)
(WebAssembly doesn’t have a C Standard Library libc)
Is POSIX a problem for WebAssembly?
We counted 72 POSIX Functions needed by TCC Compiler, but missing from WebAssembly.
Thus we fill in the Missing Functions ourselves.
(About the Missing POSIX Functions)
Surely other Zig Devs will have the same problem?
Luckily we can borrow the POSIX Code from other Zig Libraries…
72 POSIX Functions? Sounds like a lot of work…
We might not need all 72 POSIX Functions. We stubbed out many of the functions to identify the ones that are called: tcc-wasm.zig
// Stub Out the Missing POSIX
// Functions. If TCC calls them,
// we'll see a Zig Panic. Then we
// implement them. The Types don't
// matter because we'll halt anyway.
pub export fn atoi(_: c_int) c_int {
    @panic("TODO: atoi");
}
pub export fn exit(_: c_int) c_int {
    @panic("TODO: exit");
}
pub export fn fopen(_: c_int) c_int {
    @panic("TODO: fopen");
}
// And many more functions...
Some of these functions are particularly troubling for WebAssembly…
Why no #include in TCC for WebAssembly? And no C Libraries?
WebAssembly runs in a Secure Sandbox. No File Access allowed, sorry! (Like for Header and Library Files)
That’s why our Zig Wrapper Emulates File Access for the bare minimum 2 files…
Reading a Source File hello.c is extremely simplistic: tcc-wasm.zig
/// Emulate the POSIX Function `read()`
/// We copy from One Single Read Buffer
/// that contains our C Program
export fn read(fd0: c_int, buf: [*:0]u8, nbyte: size_t) isize {
    // TODO: Support more than one file
    const len = read_buf.len;
    assert(len < nbyte);
    @memcpy(buf[0..len], read_buf[0..len]);
    buf[len] = 0;
    read_buf.len = 0;
    return @intCast(len);
}
/// Read Buffer for read
var read_buf: []const u8 = undefined;
(read_buf is populated at startup)
Writing the Compiled Output a.out is just as barebones: tcc-wasm.zig
/// Emulate the C Function `fwrite()`
/// We write to One Single Memory
/// Buffer that will be returned to
/// JavaScript as `a.out`
export fn fwrite(ptr: [*:0]const u8, size: usize, nmemb: usize, stream: *FILE) usize {
    // TODO: Support more than one `stream`
    const len = size * nmemb;
    @memcpy(write_buf[write_buflen .. write_buflen + len], ptr[0..]);
    write_buflen += len;
    return nmemb;
}
/// Write Buffer for fputc and fwrite
var write_buf = std.mem.zeroes([8192]u8);
var write_buflen: usize = 0;
(write_buf will be returned to JavaScript)
Can we handle Multiple Files?
Right now we’re trying to embed the simple ROM FS Filesystem into our Zig Wrapper.
The ROM FS Filesystem will be preloaded with the Header and Library Files needed by TCC.
(See the updates for ROM FS Filesystem)
Why is fprintf particularly problematic?
Here’s the fearsome thing about fprintf and friends: sprintf, snprintf, vsnprintf…
Hence we hacked up an implementation of String Formatting that’s safer, simpler and so-barebones-you-can-make-soup-tulang.
Soup tulang? Tell me more…
Our Zig Wrapper uses Pattern Matching to match the C Formats and substitute the Zig Equivalent (pic above): tcc-wasm.zig
// Format a Single `%d`
// like `#define __TINYC__ %d`
FormatPattern{
    // If the C Format String contains this...
    .c_spec = "%d",
    // Then we apply this Zig Format...
    .zig_spec = "{}",
    // And extract these Argument Types
    // from the Varargs...
    .type0 = c_int,
    .type1 = null
}
This works OK (for now) because TCC Compiler only uses 5 Patterns for C Format Strings: tcc-wasm.zig
/// Pattern Matching for C String Formatting:
/// We'll match these patterns when
/// formatting strings
const format_patterns = [_]FormatPattern{
    // Format a Single `%d`, like `#define __TINYC__ %d`
    FormatPattern{
        .c_spec = "%d", .zig_spec = "{}",
        .type0 = c_int, .type1 = null
    },
    // Format a Single `%u`, like `L.%u`
    FormatPattern{
        .c_spec = "%u", .zig_spec = "{}",
        .type0 = c_int, .type1 = null
    },
    // Format a Single `%s`, like `.rela%s`
    // Or `#define __BASE_FILE__ "%s"`
    FormatPattern{
        .c_spec = "%s", .zig_spec = "{s}",
        .type0 = [*:0]const u8, .type1 = null
    },
    // Format Two `%s`, like `#define %s%s\n`
    FormatPattern{
        .c_spec = "%s%s", .zig_spec = "{s}{s}",
        .type0 = [*:0]const u8, .type1 = [*:0]const u8
    },
    // Format `%s:%d`, like `%s:%d: `
    // (File Name and Line Number)
    FormatPattern{
        .c_spec = "%s:%d", .zig_spec = "{s}:{}",
        .type0 = [*:0]const u8, .type1 = c_int
    },
};
That’s our quick hack for fprintf and friends!
So simple? Unbelievable!
Actually we’ll hit more Format Patterns as TCC Compiler emits various Error and Warning Messages. But it’s a good start!
Later our Zig Wrapper shall cautiously and meticulously parse all kinds of C Format Strings. Or we do the parsing in C, compiled to WebAssembly. (160 lines of C!)
(Funny how printf is the first thing we learn about C. Yet it’s incredibly difficult to implement!)
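The Pattern Matching idea can be sketched in a few lines of JavaScript. This is an illustration only, not the code in tcc-wasm.zig: the names formatPatterns, countPercent and formatString are hypothetical, and the arguments arrive as a plain array instead of C Varargs.

```javascript
// Hypothetical sketch of Pattern Matching for C Format Strings.
// Each pattern pairs a C Format Specifier with a formatter for its arguments.
const formatPatterns = [
  { cSpec: "%d", format: (args) => String(Math.trunc(args[0])) },
  { cSpec: "%u", format: (args) => String(args[0] >>> 0) },
  { cSpec: "%s", format: (args) => String(args[0]) },
  { cSpec: "%s%s", format: (args) => String(args[0]) + String(args[1]) },
  { cSpec: "%s:%d", format: (args) => `${args[0]}:${Math.trunc(args[1])}` },
];

// Count the `%` characters in a string
function countPercent(s) {
  return s.split("%").length - 1;
}

// Format `format` with `args` by matching it against the patterns
function formatString(format, args) {
  // No Format Specifiers: Return the Format String as-is
  if (countPercent(format) === 0) { return format; }
  for (const pattern of formatPatterns) {
    // The number of specifiers must agree, and the pattern must appear
    if (countPercent(format) !== countPercent(pattern.cSpec)) { continue; }
    if (!format.includes(pattern.cSpec)) { continue; }
    // Substitute the formatted arguments for the C Format Specifier
    return format.replace(pattern.cSpec, () => pattern.format(args));
  }
  // No pattern matched: Return the Format String unchanged
  return format;
}
```

With this sketch, formatString("%s:%d: ", ["hello.c", 3]) gives "hello.c:3: ", while an unrecognised format like "%x" falls through unchanged. That fall-through is exactly why more patterns will be needed once TCC starts emitting Errors and Warnings.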
TCC in WebAssembly has compiled our C Program to RISC-V ELF…
Will the ELF run on NuttX?
Apache NuttX RTOS is a tiny operating system for 64-bit RISC-V that runs on QEMU Emulator. (And many other devices)
We build NuttX for QEMU and copy our RISC-V ELF a.out to the NuttX Apps Filesystem (pic above)…
## Copy RISC-V ELF `a.out`
## to NuttX Apps Filesystem
cp a.out apps/bin/
chmod +x apps/bin/a.out
Then we boot NuttX and run a.out…
## Boot NuttX on QEMU 64-bit RISC-V
$ qemu-system-riscv64 \
    -semihosting \
    -M virt,aclint=on \
    -cpu rv64 \
    -smp 8 \
    -bios none \
    -kernel nuttx \
    -nographic
## Run `a.out` in NuttX Shell
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
Loading /system/bin/a.out
Exported symbol "printf" not found
Failed to load program 'a.out'
NuttX politely accepts the RISC-V ELF (produced by TCC). And says that printf is missing.
Which makes sense: We haven’t linked our C Program with the C Library!
(Loading a RISC-V ELF should look like this)
How else can we print something in NuttX?
To print something, we can make a System Call (ECALL) directly to NuttX Kernel (bypassing the POSIX Functions)…
// NuttX System Call that prints
// something. System Call Number
// is 61 (SYS_write). Works exactly
// like POSIX `write()`
ssize_t write(
  int fd, // File Descriptor (1 for Standard Output)
  const char *buf, // Buffer to be printed
  size_t buflen // Buffer Length
);
// Which makes an ECALL with these Parameters...
// Register A0 is 61 (SYS_write)
// Register A1 is the File Descriptor (1 for Standard Output)
// Register A2 points to the String Buffer to be printed
// Register A3 is the Buffer Length
That’s the same NuttX System Call that printf executes internally.
One last chance to say hello to NuttX…
We’re making a System Call (ECALL) to NuttX Kernel to print something…
How do we code this in C?
We execute the ECALL in RISC-V Assembly like this: test-nuttx.js
int main(int argc, char *argv[]) {
  // Make NuttX System Call
  // to write(fd, buf, buflen)
  const unsigned int nbr = 61; // SYS_write
  const void *parm1 = 1; // File Descriptor (stdout)
  const void *parm2 = "Hello, World!!\n"; // Buffer
  const void *parm3 = 15; // Buffer Length
  // Load the Parameters into
  // Registers A0 to A3
  // Note: This doesn't work with TCC,
  // so we load again below
  register long r0 asm("a0") = (long)(nbr);
  register long r1 asm("a1") = (long)(parm1);
  register long r2 asm("a2") = (long)(parm2);
  register long r3 asm("a3") = (long)(parm3);
  // Execute ECALL for System Call
  // to NuttX Kernel. Again: Load the
  // Parameters into Registers A0 to A3
  asm volatile (
    // Load 61 to Register A0 (SYS_write)
    "addi a0, zero, 61 \n"
    // Load 1 to Register A1 (File Descriptor)
    "addi a1, zero, 1 \n"
    // Load 0xc0101000 to Register A2 (Buffer)
    "lui a2, 0xc0 \n"
    "addiw a2, a2, 257 \n"
    "slli a2, a2, 0xc \n"
    // Load 15 to Register A3 (Buffer Length)
    "addi a3, zero, 15 \n"
    // ECALL for System Call to NuttX Kernel
    "ecall \n"
    // NuttX needs NOP after ECALL
    ".word 0x0001 \n"
    // Input+Output Registers: None
    // Input-Only Registers: A0 to A3
    // Clobbers the Memory
    :
    : "r"(r0), "r"(r1), "r"(r2), "r"(r3)
    : "memory"
  );
  // Loop Forever
  for(;;) {}
  return 0;
}
We copy this into our Web Browser and compile it. (Pic above)
(Why so complicated? Explained here)
(Caution: SYS_write 61 may change)
Does it work?
TCC in WebAssembly compiles the code above to RISC-V ELF a.out. When we copy it to NuttX and run it…
NuttShell (NSH) NuttX-12.4.0-RC0
nsh> a.out
...
## NuttX System Call for SYS_write (61)
riscv_swint:
cmd: 61
A0: 3d ## SYS_write (61)
A1: 01 ## File Descriptor (Standard Output)
A2: c0101000 ## Buffer
A3: 0f ## Buffer Length
...
## NuttX Kernel says hello
Hello, World!!
NuttX Kernel prints “Hello World” yay!
Indeed we’ve created a C Compiler in a Web Browser, that produces proper NuttX Apps!
OK so we can build NuttX Apps in a Web Browser… But can we run them in a Web Browser?
Yep, a NuttX App built in the Web Browser… Now runs OK with NuttX Emulator in the Web Browser! 🎉 (Pic below)
TLDR: We called JavaScript Local Storage to copy the RISC-V ELF a.out from TCC WebAssembly to NuttX Emulator… Then we patched a.out into the ROM FS Filesystem for NuttX Emulator. Nifty!
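Here’s a minimal sketch of the receiving side, assuming the ELF Data sits in Local Storage under the key elf_data as a string like %7f%45%4c%46... The helper names decodeElfData and loadElf are hypothetical, and the actual NuttX Emulator may well decode it differently.

```javascript
// Hypothetical helper: Decode `%7f%45%4c%46...` back into bytes
function decodeElfData(encoded) {
  // Every byte is encoded as 3 characters: `%` plus 2 hex digits
  if (encoded.length % 3 !== 0) { throw new Error("Malformed ELF Data"); }
  const bytes = new Uint8Array(encoded.length / 3);
  for (let i = 0; i < bytes.length; i++) {
    const chunk = encoded.slice(i * 3, i * 3 + 3);
    if (!/^%[0-9a-fA-F]{2}$/.test(chunk)) {
      throw new Error(`Bad byte at index ${i}`);
    }
    bytes[i] = parseInt(chunk.slice(1), 16);
  }
  return bytes;
}

// Hypothetical helper: Fetch the ELF Data from a storage object.
// `storage` is anything with getItem(), like `window.localStorage`.
// Returns null if nothing has been compiled yet.
function loadElf(storage) {
  const encoded = storage.getItem("elf_data");
  return (encoded === null) ? null : decodeElfData(encoded);
}
```

In a Web Browser we would call loadElf(window.localStorage). Passing the storage object as a parameter keeps the sketch testable in Node.js, where Local Storage isn’t available by default.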
NuttX App built in a Web Browser… Runs inside the Web Browser!
Thanks to the TCC Team, we have a 64-bit RISC-V Compiler that runs in the Web Browser…
- Zig Compiler compiles TCC to WebAssembly with one tiny fix
- But POSIX Functions are missing in WebAssembly
- So we did the bare minimum for File Input and Output
- And cooked up the simplest workaround for fprintf and friends
- Finally TCC produces a RISC-V Binary that runs OK on Apache NuttX RTOS
- Now we can Build and Test NuttX Apps all within a Web Browser!
How will you use TCC in a Web Browser? Please lemme know 🙏
(Build and run RISC-V Apps on iPhone?)
Many Thanks to my GitHub Sponsors (and the awesome NuttX and Zig Communities) for supporting my work! This article wouldn’t have been possible without your support.
Got a question, comment or suggestion? Create an Issue or submit a Pull Request here…
Online Demo of TCC Compiler in WebAssembly
This is how we run Zig Compiler to compile TCC Compiler from C to WebAssembly (pic below)…
## Download the (slightly) Modified TCC Source Code.
## Configure the build for 64-bit RISC-V.
git clone https://github.com/lupyuen/tcc-riscv32-wasm
cd tcc-riscv32-wasm
./configure
make cross-riscv64
## Call Zig Compiler to compile TCC Compiler
## from C to WebAssembly. Produces `tcc.o`
## Omitted: Run the `zig cc` command from earlier...
## https://lupyuen.codeberg.page/articles/tcc#zig-compiles-tcc-to-webassembly
zig cc ...
## Compile our Zig Wrapper `tcc-wasm.zig` for WebAssembly
## and link it with TCC compiled for WebAssembly `tcc.o`
## Generates `tcc-wasm.wasm`
## Omitted: Run the `zig build-exe` command from earlier...
## https://lupyuen.codeberg.page/articles/tcc#zig-compiles-tcc-to-webassembly
zig build-exe ...
How did we figure out the “zig cc” options?
Earlier we saw a long list of Zig Compiler Options…
## Zig Compiler Options for TCC Compiler
zig cc \
  tcc.c \
  -DTCC_TARGET_RISCV64 \
  -DCONFIG_TCC_CROSSPREFIX="\"riscv64-\"" \
  -DCONFIG_TCC_CRTPREFIX="\"/usr/riscv64-linux-gnu/lib\"" \
  -DCONFIG_TCC_LIBPATHS="\"{B}:/usr/riscv64-linux-gnu/lib\"" \
  -DCONFIG_TCC_SYSINCLUDEPATHS="\"{B}/include:/usr/riscv64-linux-gnu/include\"" \
  ...
We got them from “make --trace”, which shows the GCC Compiler Options…
## Show the GCC Options for compiling TCC
$ make --trace cross-riscv64
gcc \
  -o riscv64-tcc.o \
  -c \
  tcc.c \
  -DTCC_TARGET_RISCV64 \
  -DCONFIG_TCC_CROSSPREFIX="\"riscv64-\"" \
  -DCONFIG_TCC_CRTPREFIX="\"/usr/riscv64-linux-gnu/lib\"" \
  -DCONFIG_TCC_LIBPATHS="\"{B}:/usr/riscv64-linux-gnu/lib\"" \
  -DCONFIG_TCC_SYSINCLUDEPATHS="\"{B}/include:/usr/riscv64-linux-gnu/include\"" \
  -DTCC_GITHASH="\"main:b3d10a35\"" \
  -Wall \
  -O2 \
  -Wdeclaration-after-statement \
  -fno-strict-aliasing \
  -Wno-pointer-sign \
  -Wno-sign-compare \
  -Wno-unused-result \
  -Wno-format-truncation \
  -Wno-stringop-truncation \
  -I.
And we copied the above GCC Options to become our Zig Compiler Options.
Previously we saw some JavaScript (Web Browser and Node.js) calling our TCC Compiler in WebAssembly (pic above)…
This is how we test the TCC WebAssembly in a Web Browser with a Local Web Server…
## Download the (slightly) Modified TCC Source Code
git clone https://github.com/lupyuen/tcc-riscv32-wasm
cd tcc-riscv32-wasm
## Start the Web Server
cargo install simple-http-server
simple-http-server ./docs &
## Whenever we rebuild TCC WebAssembly...
## Copy it to the Web Server
cp tcc-wasm.wasm docs/
Browse to this URL and our TCC WebAssembly will appear…
## Test TCC WebAssembly with Web Browser
http://localhost:8000/index.html
Check the JavaScript Console for Debug Messages.
How does it work?
On clicking the Compile Button, our JavaScript loads the TCC WebAssembly: tcc.js
// Load the WebAssembly Module and start the Main Function.
// Called by the Compile Button.
async function bootstrap() {
  // Load the WebAssembly Module `tcc-wasm.wasm`
  // https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/instantiateStreaming
  const result = await WebAssembly.instantiateStreaming(
    fetch("tcc-wasm.wasm"),
    importObject
  );
  // Store references to WebAssembly Functions
  // and Memory exported by Zig
  wasm.init(result);
  // Start the Main Function
  window.requestAnimationFrame(main);
}
(importObject exports our JavaScript Logger to Zig)
(wasm is our WebAssembly Helper)
Which triggers the Main Function and calls our Zig Function compile_program: tcc.js
// Main Function
function main() {
  // Allocate a String for passing the Compiler Options to Zig
  // `options` is a JSON Array: ["-c", "hello.c"]
  const options = read_options();
  const options_ptr = allocateString(JSON.stringify(options));
  // Allocate a String for passing the Program Code to Zig
  const code = document.getElementById("code").value;
  const code_ptr = allocateString(code);
  // Call TCC to compile the program
  const ptr = wasm.instance.exports
    .compile_program(options_ptr, code_ptr);
  // Get the `a.out` size from first 4 bytes returned
  const memory = wasm.instance.exports.memory;
  const data_len = new Uint8Array(memory.buffer, ptr, 4);
  const len = data_len[0] | data_len[1] << 8 | data_len[2] << 16 | data_len[3] << 24;
  if (len <= 0) { return; }
  // Encode the `a.out` data from the rest of the bytes returned
  // `encoded_data` looks like %7f%45%4c%46...
  const data = new Uint8Array(memory.buffer, ptr + 4, len);
  let encoded_data = "";
  for (const i in data) {
    const hex = Number(data[i]).toString(16).padStart(2, "0");
    encoded_data += `%${hex}`;
  }
  // Download the `a.out` data into the Web Browser
  download("a.out", encoded_data);
  // Save the ELF Data to Local Storage for loading by NuttX Emulator
  localStorage.setItem("elf_data", encoded_data);
};
Our Main Function then downloads the a.out file returned by our Zig Function.
(allocateString allocates a String from Zig Memory)
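The two fiddly steps (reading the 4-byte size and encoding the bytes) can be pulled out into small helpers. This is a sketch under assumptions: the names readCompiledOutput and encodeElfData are hypothetical, and we assume the size is stored Little Endian, as it is in WebAssembly Memory.

```javascript
// Hypothetical helper: Read the length-prefixed `a.out`
// that starts at byte offset `ptr` inside `buffer` (an ArrayBuffer)
function readCompiledOutput(buffer, ptr) {
  // First 4 bytes: Size of `a.out` (Little Endian, unsigned).
  // DataView doesn't need `ptr` to be aligned.
  const len = new DataView(buffer).getUint32(ptr, true);
  // Remaining bytes: The `a.out` data (a view, not a copy)
  return new Uint8Array(buffer, ptr + 4, len);
}

// Hypothetical helper: Encode bytes as `%7f%45%4c%46...`
function encodeElfData(data) {
  return Array.from(
    data,
    (b) => "%" + b.toString(16).padStart(2, "0")
  ).join("");
}
```

In the Web Browser we might call readCompiledOutput(memory.buffer, ptr) and pass the result to encodeElfData. Reading the size with getUint32 keeps it unsigned, so the shift-and-OR can never yield a negative length.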
What about Node.js calling TCC WebAssembly?
## Test TCC WebAssembly with Node.js
node zig/test.js
For Easier Testing (via Command-Line): We copied the JavaScript above into a Node.js Script: test.js
// Allocate a String for passing the Compiler Options to Zig
const options = ["-c", "hello.c"];
const options_ptr = allocateString(JSON.stringify(options));
// Allocate a String for passing Program Code to Zig
const code_ptr = allocateString(`
  int main(int argc, char *argv[]) {
    printf("Hello, World!!\\n");
    return 0;
  }
`);
// Call TCC to compile a program
const ptr = wasm.instance.exports
  .compile_program(options_ptr, code_ptr);
(Test Script for NuttX QEMU: test-nuttx.js)
(Test Log for NuttX QEMU: test-nuttx.log)
A while back we saw our Zig Wrapper doing Pattern Matching for Formatting C Strings…
How It Works: We search for Format Patterns in the C Format Strings and substitute the Zig Equivalent (pic above): tcc-wasm.zig
// Format a Single `%d`
// like `#define __TINYC__ %d`
FormatPattern{
    // If the C Format String contains this...
    .c_spec = "%d",
    // Then we apply this Zig Format...
    .zig_spec = "{}",
    // And extract these Argument Types
    // from the Varargs...
    .type0 = c_int,
    .type1 = null
}
(FormatPattern is defined here)
To implement this, we call comptime Functions in Zig: tcc-wasm.zig
/// CompTime Function to format a string by Pattern Matching.
/// Format a Single Specifier, like `#define __TINYC__ %d\n`
/// If the Spec matches the Format: Return the number of bytes written to `str`, excluding terminating null.
/// Else return 0.
fn format_string1(
    ap: *std.builtin.VaList, // Varargs passed from C
    str: [*]u8, // Buffer for returning Formatted String
    size: size_t, // Buffer Size
    format: []const u8, // C Format String, like `#define __TINYC__ %d\n`
    comptime c_spec: []const u8, // C Format Pattern, like `%d`
    comptime zig_spec: []const u8, // Zig Equivalent, like `{}`
    comptime T0: type, // Type of First Vararg, like `c_int`
) usize { // Return the number of bytes written to `str`, excluding terminating null
    // Count the Format Specifiers: `%`
    const spec_cnt = std.mem.count(u8, c_spec, "%");
    const format_cnt = std.mem.count(u8, format, "%");
    // Check the Format Specifiers: `%`
    // Quit if the number of specifiers are different
    // Or if the specifiers are not found
    if (format_cnt != spec_cnt or
        !std.mem.containsAtLeast(u8, format, 1, c_spec)) {
        return 0;
    }
    // Fetch the First Argument from the C Varargs
    const a = @cVaArg(ap, T0);
    // Format the Argument
    var buf: [512]u8 = undefined;
    const buf_slice = std.fmt.bufPrint(&buf, zig_spec, .{a}) catch {
        @panic("format_string1 error: buf too small");
    };
    // Replace the C Format Pattern by the Zig Equivalent
    var buf2 = std.mem.zeroes([512]u8);
    _ = std.mem.replace(u8, format, c_spec, buf_slice, &buf2);
    // Return the Formatted String and Length
    const len = std.mem.indexOfScalar(u8, &buf2, 0).?;
    assert(len < size);
    @memcpy(str[0..len], buf2[0..len]);
    str[len] = 0;
    return len;
}
// Omitted: Function `format_string2` looks similar,
// but for 2 Varargs (instead of 1)
The function above is called by a comptime Inline Loop that applies all the Format Patterns that we saw earlier: tcc-wasm.zig
/// Runtime Function to format a string by Pattern Matching.
/// Return the number of bytes written to `str`, excluding terminating null.
fn format_string(
    ap: *std.builtin.VaList, // Varargs passed from C
    str: [*]u8, // Buffer for returning Formatted String
    size: size_t, // Buffer Size
    format: []const u8, // C Format String, like `#define __TINYC__ %d\n`
) usize { // Return the number of bytes written to `str`, excluding terminating null
    // If no Format Specifiers: Return the Format, like `warning: `
    const len = format_string0(str, size, format);
    if (len > 0) { return len; }
    // For every Format Pattern...
    inline for (format_patterns) |pattern| {
        // Try formatting the string with the pattern...
        const len2 =
            if (pattern.type1) |t1|
                // Pattern has 2 parameters
                format_string2(ap, str, size, format, // Output String and Format String
                    pattern.c_spec, pattern.zig_spec, // Format Specifiers for C and Zig
                    pattern.type0, t1 // Types of the Parameters
                )
            else
                // Pattern has 1 parameter
                format_string1(ap, str, size, format, // Output String and Format String
                    pattern.c_spec, pattern.zig_spec, // Format Specifiers for C and Zig
                    pattern.type0 // Type of the Parameter
                );
        // Loop until we find a matching pattern
        if (len2 > 0) { return len2; }
    }
    // Format String doesn't match any Format Pattern.
    // We return the Format String and Length.
    const len3 = format.len;
    assert(len3 < size);
    @memcpy(str[0..len3], format[0..len3]);
    str[len3] = 0;
    return len3;
}
And the above function is called by fprintf and friends: tcc-wasm.zig
/// Implement the POSIX Function `fprintf`
export fn fprintf(stream: *FILE, format: [*:0]const u8, ...) c_int {
    // Prepare the varargs
    var ap = @cVaStart();
    defer @cVaEnd(&ap);
    // Format the string
    var buf = std.mem.zeroes([512]u8);
    const format_slice = std.mem.span(format);
    const len = format_string(&ap, &buf, buf.len, format_slice);
    // TODO: Print to other File Streams.
    // Right now we assume it's stderr (File Descriptor 2)
    return @intCast(len);
}
// Do the same for sprintf, snprintf, vsnprintf
(Without comptime: Our code gets super tedious)
Just now we saw a huge chunk of C Code that makes a NuttX System Call…
Why so complicated?
We refer to the Sample Code for NuttX System Calls (ECALL). Rightfully this shorter version should work…
// Make NuttX System Call to write(fd, buf, buflen)
const unsigned int nbr = 61; // SYS_write
const void *parm1 = 1; // File Descriptor (stdout)
const void *parm2 = "Hello, World!!\n"; // Buffer
const void *parm3 = 15; // Buffer Length
// Execute ECALL for System Call to NuttX Kernel
register long r0 asm("a0") = (long)(nbr);
register long r1 asm("a1") = (long)(parm1);
register long r2 asm("a2") = (long)(parm2);
register long r3 asm("a3") = (long)(parm3);
asm volatile (
  // ECALL for System Call to NuttX Kernel
  "ecall \n"
  // NuttX needs NOP after ECALL
  ".word 0x0001 \n"
  // Input+Output Registers: None
  // Input-Only Registers: A0 to A3
  // Clobbers the Memory
  :
  : "r"(r0), "r"(r1), "r"(r2), "r"(r3)
  : "memory"
);
Strangely TCC generates mysterious RISC-V Machine Code that mashes up the RISC-V Registers…
main():
// Prepare the Stack
0: fc010113 add sp, sp, -64
4: 02113c23 sd ra, 56(sp)
8: 02813823 sd s0, 48(sp)
c: 04010413 add s0, sp, 64
10: 00000013 nop
14: fea43423 sd a0, -24(s0)
18: feb43023 sd a1, -32(s0)
// Correct: Load Register A0 with 61 (SYS_write)
1c: 03d0051b addw a0, zero, 61
20: fca43c23 sd a0, -40(s0)
// Nope: Load Register A0 with 1?
// Mixed up with Register A1! (Value 1)
24: 0010051b addw a0, zero, 1
28: fca43823 sd a0, -48(s0)
// Nope: Load Register A0 with "Hello World"?
// Mixed up with Register A2!
2c: 00000517 auipc a0,0x0 2c: R_RISCV_PCREL_HI20 L.0
30: 00050513 mv a0,a0 30: R_RISCV_PCREL_LO12_I .text
34: fca43423 sd a0, -56(s0)
// Nope: Load Register A0 with 15?
// Mixed up with Register A3! (Value 15)
38: 00f0051b addw a0, zero, 15
3c: fca43023 sd a0, -64(s0)
// Execute ECALL with Register A0 set to 15.
// Nope: A0 should be 61!
40: 00000073 ecall
44: 0001 nop
Thus we hardcode Registers A0 to A3 in RISC-V Assembly: test-nuttx.js
// Load 61 to Register A0 (SYS_write)
addi a0, zero, 61
// Load 1 to Register A1 (File Descriptor)
addi a1, zero, 1
// Load 0xc0101000 to Register A2 (Buffer)
lui a2, 0xc0
addiw a2, a2, 257
slli a2, a2, 0xc
// Load 15 to Register A3 (Buffer Length)
addi a3, zero, 15
// ECALL for System Call to NuttX Kernel
ecall
// NuttX needs NOP after ECALL
.word 0x0001
And it prints “Hello World”!
TODO: Is there a workaround? Do we paste the ECALL Assembly Code ourselves? NuttX Libraries won’t link with TCC
What’s with the addi and nop?
TCC won’t assemble the “li” and “nop” instructions.
So we used this RISC-V Online Assembler to assemble the code above.
“addi” above is the longer form of “li”, which TCC won’t assemble…
// Load 61 to Register A0 (SYS_write)
// But TCC won't assemble `li a0, 61`
// So we do this...
// Add 0 to 61 and save to Register A0
addi a0, zero, 61
“lui / addiw / slli” above is our expansion of “li a2, 0xc0101000”, which TCC won’t assemble…
// Load 0xC010_1000 to Register A2 (Buffer)
// But TCC won't assemble `li a2, 0xc0101000`
// So we do this...
// Load 0xC0 << 12 into Register A2 (0xC0000)
lui a2, 0xc0
// Add 257 to Register A2 (0xC0101)
addiw a2, a2, 257
// Shift Left by 12 Bits (0xC010_1000)
slli a2, a2, 0xc
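We can double-check the arithmetic with a tiny JavaScript sketch. It assumes the usual RV64 behaviour: lui and addiw sign-extend their 32-bit result into the 64-bit register, while slli shifts the full 64 bits. The function names here are hypothetical.

```javascript
// Sign-extend the low 32 bits (RV64 `lui` and `addiw` do this)
function signExtend32(x) {
  return BigInt.asIntN(32, x);
}

// Wrap a value into an unsigned 64-bit register
function toReg64(x) {
  return BigInt.asUintN(64, x);
}

// Emulate `lui rd, imm20` on RV64
function lui(imm20) {
  return toReg64(signExtend32(imm20 << 12n));
}

// Emulate our expansion of `li a2, 0xc0101000`
function loadBufferAddress() {
  // lui a2, 0xc0: Load 0xC0 << 12 (0xC0000)
  let a2 = lui(0xc0n);
  // addiw a2, a2, 257: 32-bit add, then sign-extend (0xC0101)
  a2 = toReg64(signExtend32(a2 + 257n));
  // slli a2, a2, 0xc: 64-bit shift left by 12 bits (0xC010_1000)
  a2 = toReg64(a2 << 12n);
  return a2;
}

// Prints 0xc0101000
console.log("0x" + loadBufferAddress().toString(16));
```

This also hints at why the shift comes last: under the same assumptions, a lone lui(0xc0101n) would set Bit 31 and sign-extend to 0xffffffffc0101000, which is not the address we want.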
How did we figure out that the buffer is at 0xC010_1000?
We saw this in our ELF Loader Log…
NuttShell (NSH) NuttX-12.4.0
nsh> a.out
...
Read 576 bytes from offset 512
Read 154 bytes from offset 64
1. 00000000->c0000000
Read 0 bytes from offset 224
2. 00000000->c0101000
Read 16 bytes from offset 224
3. 00000000->c0101000
4. 00000000->c0101010
Which says that the NuttX ELF Loader copied 16 bytes from our NuttX App Data Section (.data.ro) to 0xC010_1000.
That’s all 15 bytes of “Hello, World!!\n”, plus the terminating null.
Thus our buffer in NuttX QEMU should be at 0xC010_1000.
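A quick sanity check of those numbers, as a sketch: we count the bytes of the string and add one for the terminating null that C appends. The helper name cStringSize is hypothetical.

```javascript
// Hypothetical helper: Size of a C String in bytes,
// including the terminating null
function cStringSize(s) {
  // TextEncoder produces UTF-8 bytes, one per ASCII character
  return new TextEncoder().encode(s).length + 1;
}

// 15 bytes of text (the Buffer Length passed in Register A3)
const textLength = cStringSize("Hello, World!!\n") - 1;
// 16 bytes copied by the ELF Loader
const loadedSize = cStringSize("Hello, World!!\n");
```

So 15 is what we pass as the Buffer Length, and 16 is what the ELF Loader reports.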
(NuttX WebAssembly Emulator uses 0x8010_1000 instead)
(More about the NuttX ELF Loader)
Why do we Loop Forever?
// Omitted: Execute ECALL for System Call to NuttX Kernel
asm volatile ( ... );
// Loop Forever
for(;;) {}
That’s because NuttX Apps are not supposed to Return to NuttX Kernel.
We should call the NuttX System Call __exit to terminate peacefully.
Online Demo of Apache NuttX RTOS
Here are the steps to build and run NuttX for QEMU 64-bit RISC-V (Kernel Mode)
- Install the Build Prerequisites, skip the RISC-V Toolchain…
- Download the RISC-V Toolchain for riscv64-unknown-elf…
- Download and configure NuttX…
## Download NuttX Source Code
mkdir nuttx
cd nuttx
git clone https://github.com/apache/nuttx nuttx
git clone https://github.com/apache/nuttx-apps apps
## Configure NuttX for QEMU RISC-V 64-bit (Kernel Mode)
cd nuttx
tools/configure.sh rv-virt:knsh64
make menuconfig
We use Kernel Mode because it allows loading of NuttX Apps as ELF Files.
(Instead of Statically Linking the NuttX Apps into NuttX Kernel)
- (Optional) To enable ELF Loader Logging, select…
Build Setup > Debug Options > Binary Loader Debug Features:
- Enable “Binary Loader Error, Warnings and Info”
- (Optional) To enable System Call Logging, select…
Build Setup > Debug Options > SYSCALL Debug Features:
- Enable “SYSCALL Error, Warnings and Info”
- Save and exit menuconfig.
- Build the NuttX Kernel and NuttX Apps…
## Build NuttX Kernel
make -j 8
## Build NuttX Apps
make -j 8 export
pushd ../apps
./tools/mkimport.sh -z -x ../nuttx/nuttx-export-*.tar.gz
make -j 8 import
popd
This produces the NuttX ELF Image nuttx that we may boot on QEMU RISC-V Emulator…
## For macOS: Install QEMU
brew install qemu
## For Debian and Ubuntu: Install QEMU
sudo apt install qemu-system-riscv64
## Boot NuttX on QEMU 64-bit RISC-V
qemu-system-riscv64 \
  -semihosting \
  -M virt,aclint=on \
  -cpu rv64 \
  -smp 8 \
  -bios none \
  -kernel nuttx \
  -nographic
NuttX Apps are located in apps/bin.
We may copy our RISC-V ELF a.out to that folder and run it…
NuttShell (NSH) NuttX-12.4.0-RC0
nsh> a.out
Hello, World!!
Remember we said that POSIX Functions aren’t supported in WebAssembly? (Pic above)
We dump the Compiled WebAssembly of TCC Compiler, and we discover that it calls 72 POSIX Functions…
## Dump the Compiled WebAssembly
## for TCC Compiler `tcc.o`
$ sudo apt install wabt
$ wasm-objdump -x tcc.o
Import:
- func[0] sig=1 <env.strcmp> <- env.strcmp
- func[1] sig=12 <env.memset> <- env.memset
- func[2] sig=1 <env.getcwd> <- env.getcwd
...
- func[69] sig=2 <env.localtime> <- env.localtime
- func[70] sig=13 <env.qsort> <- env.qsort
- func[71] sig=19 <env.strtoll> <- env.strtoll
Do we need all 72 POSIX Functions? We scrutinise the list…
Filesystem Functions
We’ll simulate these functions for WebAssembly, by embedding the simple ROM FS Filesystem into our Zig Wrapper…
(See the updates for ROM FS Filesystem)
Varargs Functions
As discussed earlier, Varargs will be tricky to implement in Zig. Probably we should do it in C.
Right now we’re doing simple Pattern Matching. But it might not be sufficient when TCC compiles Real Programs…
String Functions
We’ll borrow the String Functions from ziglibc…
Semaphore Functions
Not sure why TCC uses Semaphores? Maybe we’ll understand when we support #include files.
(Where can we borrow the Semaphore Functions?)
Standard Library
qsort isn’t used right now. Maybe for the Linker later?
(Borrow qsort from where? We can probably implement exit)
Time and Math Functions
Not used right now, maybe later.
(Anyone can lend us ldexp? How will we do the Time Functions? Call out to JavaScript to fetch the time?)
Outstanding Functions
We’ve implemented (fully or partially) 48 POSIX Functions from above.
The ones that we haven’t implemented? These 24 POSIX Functions will Halt when TCC WebAssembly calls them…