Using Zig to Unit Test a C Application · mtlynch.io
Zig is a new, independently developed low-level programming language. It’s a modern reimagining of C that attempts to retain C’s performance while embracing improvements from the last 30 years of tooling and language design.
Zig makes calling into C code easier than any other language I’ve used. Zig also treats unit testing as a first-class feature, which the C language certainly does not.
These two properties of Zig create an interesting opportunity: Zig allows you to add unit tests to existing C code. You can do this without rewriting any of your C code or build logic.
To demonstrate how to use Zig to test existing C code, I added unit tests to a real-world C application that I use every day.
The real-world C application: uStreamer
For the past three years, I’ve been working on TinyPilot, an open-source KVM over IP. TinyPilot allows you to plug a Raspberry Pi into any computer and then control that computer remotely.
To stream the target computer’s display, TinyPilot uses uStreamer, a video streaming utility that’s optimized for Raspberry Pi’s hardware.
I’ve been working with uStreamer for several years, but I find the codebase difficult to approach. It’s implemented in C, and it doesn’t have any automated tests.
I learn best by tinkering with code, so exercising uStreamer’s C code through Zig seems like a good way to learn more about both uStreamer and Zig.
Getting the uStreamer source code
To begin, I’ll grab the uStreamer source code. The latest release as of this writing is v5.45, so I’ll grab that version:
USTREAMER_VERSION='v5.45'
git clone \
  --branch "${USTREAMER_VERSION}" \
  https://github.com/pikvm/ustreamer.git
What’s the simplest C function in uStreamer?
For this exercise, the challenge is going to be using Zig, so I want the C part to be as simple as possible.
I want to find a dead simple function in uStreamer’s C code: something that I can feed some input, and it gives me some output that I can inspect easily.
Scanning through the filenames, I noticed base64.c. That sounded promising. I know that base64 is a scheme for encoding arbitrary data as a printable string.
For example, if I read 10 bytes from /dev/random into my terminal, I get some unprintable bytes:
$ head -c 10 /dev/random > /tmp/output && cat /tmp/output
V�1A�����b
If I encode the data as base64, I get clean, printable characters:
$ base64 < /tmp/output
Vo8xQbWmnsLQYg==
Here’s the signature of uStreamer’s base64 function:
// src/libs/base64.h
void us_base64_encode(const uint8_t *data, size_t size, char **encoded, size_t *allocated);
From inspecting the function’s implementation in base64.c, here’s what I deduce about the semantics of us_base64_encode:
- data is input data to encode with the base64 encoding scheme.
- size is the length of the data buffer (in bytes).
- encoded is a pointer to an output buffer in which us_base64_encode stores the base64-encoded string.
  - us_base64_encode allocates memory for the output, and the caller is responsible for freeing the memory when they’re done with it.
  - Technically, us_base64_encode allows the caller to allocate the buffer for encoded, but, for simplicity, I’m ignoring that functionality.
- allocated is a pointer that us_base64_encode populates with the number of bytes it allocated into encoded.
Here’s a simple test program to call this function from C:
// src/test.c
#include <stdio.h>
#include "libs/base64.h"
void main(void) {
  char *input = "hello, world!";
  char *encoded = NULL;
  size_t encoded_bytes = 0;
  us_base64_encode((uint8_t *)input, strlen(input), &encoded, &encoded_bytes);
  printf("input: %s\n", input);
  printf("output: %s\n", encoded);
  printf("output bytes: %lu\n", encoded_bytes);
  free(encoded);
}
I’ll compile it with gcc, a popular C compiler:
$ gcc src/test.c src/libs/base64.c -o /tmp/b64test
In file included from src/libs/base64.h:31,
                 from src/test.c:3:
src/libs/tools.h: In function ‘us_signum_to_string’:
src/libs/tools.h:194:34: warning: implicit declaration of function ‘sigabbrev_np’ [-Wimplicit-function-declaration]
  194 |   const char *const name = sigabbrev_np(signum);
      |                            ^~~~~~~~~~~~
Hmm, the code compiles, but I’m getting a lot of compiler warnings about a tools.h header that the uStreamer code includes.
If I look into src/libs/tools.h, I see that all the warnings are around a single function: us_signum_to_string. Let me see if I can just comment out that function to clear away the irrelevant warnings.
/*
DEBUG: Temporarily delete this function to get the build working again.
INLINE char *us_signum_to_string(int signum) {
  ...
  return buf;
}
*/
With the pesky us_signum_to_string function removed, I’ll try the build again:
$ gcc src/test.c src/libs/base64.c -o /tmp/b64test && /tmp/b64test
input: hello, world!
output: aGVsbG8sIHdvcmxkIQ==
output bytes: 21
Hooray, no more compiler warnings.
If I were trying to compile all of uStreamer, I’d have to figure out how to get us_signum_to_string to compile. For this exercise, I’m just calling us_base64_encode from Zig, so I don’t need us_signum_to_string.
If I compare my test.c program’s output to my system’s built-in base64 utility, I can verify that I’m producing the correct result:
$ printf 'hello, world!' | base64
aGVsbG8sIHdvcmxkIQ==
The complete example at this stage is on GitHub.
Adding Zig to my uStreamer project environment
My favorite way of installing Zig is with Nix, as it allows me to switch Zig versions easily. Feel free to install Zig any way you prefer.
I added the following flake.nix file to my project, which pulls Zig 0.11.0 into my environment:
{
  description = "Dev environment for zig-c-simple";
  inputs = {
    flake-utils.url = "github:numtide/flake-utils";
    # 0.11.0
    zig_dep.url = "github:NixOS/nixpkgs/46688f8eb5cd6f1298d873d4d2b9cf245e09e88e";
  };
  outputs = { self, flake-utils, zig_dep }@inputs :
    flake-utils.lib.eachDefaultSystem (system:
    let
      zig_dep = inputs.zig_dep.legacyPackages.${system};
    in
    {
      devShells.default = zig_dep.mkShell {
        packages = [
          zig_dep.zig
        ];
        shellHook = ''
          echo "zig" "$(zig version)"
        '';
      };
    });
}
From here, I can run nix develop, and I see that Zig 0.11.0 is available in my project environment:
# There's a weird quirk of Nix flakes that they must be added to your git
# repo.
$ git add flake.nix
$ nix develop
zig 0.11.0
Creating a Zig executable
The Zig compiler’s init-exe command creates a boilerplate Zig application, so I’ll use it to create a simple Zig app within the uStreamer source tree:
$ zig init-exe
info: Created build.zig
info: Created src/main.zig
info: Next, try `zig build --help` or `zig build run`
If I try compiling and running the boilerplate Zig application, I see that everything works:
$ zig build run
All your codebase are belong to us.
Run `zig build test` to run the tests.
The uStreamer C file I want to call depends on the C standard library, so I need to make a small adjustment to my build.zig file to link against that library. While I’m adjusting, I’ll also replace the boilerplate binary name with base64-encoder:
const exe = b.addExecutable(.{
    .name = "base64-encoder", // Change binary name.
    .root_source_file = .{ .path = "src/main.zig" },
    .target = target,
    .optimize = optimize,
});
exe.linkLibC(); // Link against C standard library.
exe.addIncludePath(.{ .path = "src" }); // Search src path for includes.
Calling uStreamer code from Zig
Now, I want to call the us_base64_encode C function from Zig.
As a reminder, here’s the C function I’m trying to call from Zig, which I explained above:
// src/libs/base64.h
void us_base64_encode(const uint8_t *data, size_t size, char **encoded, size_t *allocated);
Figuring out how to translate between C types and Zig types turned out to be the hardest part of this process, as I’m still a Zig novice.
Here was my first attempt:
// src/main.zig
const ustreamer = @cImport({
    @cInclude("libs/base64.c");
});
pub fn main() !void {
    const input = "hello, world!";
    var cEncoded: *u8 = undefined;
    var allocatedSize: usize = 0;
    // WRONG: This doesn't compile.
    ustreamer.us_base64_encode(&input, input.len, &cEncoded, &allocatedSize);
}
That yielded this compiler error:
$ zig build run
zig build-exe b64 Debug native: error: the following command failed with 1 compilation errors:
...
src/main.zig:17:32: error: expected type '[*c]const u8', found '*const *const [13:0]u8'
    ustreamer.us_base64_encode(&input, input.len, &cEncoded, &allocatedSize);
                               ^~~~~~
src/main.zig:17:32: note: pointer type child '*const [13:0]u8' cannot cast into pointer type child 'u8'
/home/mike/ustreamer/zig-cache/o/9599bf4c636d23e50eddd1a55dd088ff/cimport.zig:1796:43: note: parameter type declared here
pub export fn us_base64_encode(arg_data: [*c]const u8, arg_size: usize, arg_encoded: [*c][*c]u8, arg_allocated: [*c]usize) void {
I had trouble understanding this error at first because so much of it was unfamiliar.
The important bit of the compiler error above is error: expected type '[*c]const u8', found '*const *const [13:0]u8'. It’s telling me that I tried to pass in a *const *const [13:0]u8, but Zig needs me to pass in [*c]const u8.
What does that mean?
Understanding the type I used
According to the Zig compiler, I passed in a parameter of type *const *const [13:0]u8. To understand what this means, I’ll go from right to left:
- u8 is an unsigned byte, which is how Zig represents characters in a string.
- [13:0] means a null-terminated array. The 13 is the length of the array, which Zig calculates at compile-time. :0 means that the array has an extra byte with a value of 0 to indicate the end of the string. For more details about the mechanics of null-terminated strings in Zig, see my previous post.
- *const means a pointer to constant data. A pointer is an address in memory, and the const means that code may not modify the value it points to through that pointer.
- *const *const means a constant pointer to a constant pointer. In other words, input is a pointer to a constant string, and input itself is declared const, so &input is a constant pointer to a constant pointer.
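The breakdown above can be checked in a small Zig test. This is only a minimal sketch, not part of the original example code: the test name and the raw variable are my own, and it assumes nothing beyond Zig's standard library.

```zig
const std = @import("std");

test "types of a string literal and its address" {
    const input = "hello, world!";

    // A string literal is a pointer to a null-terminated array of 13 bytes.
    try std.testing.expect(@TypeOf(input) == *const [13:0]u8);
    // Taking the address of the const variable adds another pointer level.
    try std.testing.expect(@TypeOf(&input) == *const *const [13:0]u8);

    // The length excludes the terminator, but the 0 byte is still stored
    // right after the last character.
    try std.testing.expectEqual(@as(usize, 13), input.len);
    const raw: [*:0]const u8 = input;
    try std.testing.expectEqual(@as(u8, 0), raw[13]);
}
```

Saving this to a file and running zig test on it should pass without needing any C sources.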
Converting a Zig type to a C type
Okay, now I understand how Zig views the string that I passed. What did Zig want me to pass as the input type?
expected type '[*c]const u8'
What the heck does [*c] mean?
This was surprisingly hard to figure out. I eventually pieced it together from a few different sources.
Here’s the official Zig documentation:
C Pointers
This type is to be avoided whenever possible. The only valid reason for using a C pointer is in auto-generated code from translating C code.
When importing C header files, it is ambiguous whether pointers should be translated as single-item pointers (*T) or many-item pointers ([*]T). C pointers are a compromise so that Zig code can utilize translated header files directly.
I didn’t understand the documentation, as it seemed to be warning against using C pointers rather than explaining what they are.
More Kagi’ing led me to this explanation on Reddit, which I found more accessible:
[*c]T is just a C pointer to type T, it says that it doesn’t know whether there are multiple elements in that pointer or not. There could be, there could not be. We also don’t know the length of it (it’s not a slice which has pointer+length, it’s just a pointer). And if there are multiple elements, we don’t know if it is say null-terminated or not.
Okay, that makes more sense.
In C, a pointer is just a memory address and a data type. A C type of char* could point to a single character like 'A', or it could point to the first character in a sequence like "ABCD".
In Zig, a pointer to an array is a different type than a pointer to a single element. When Zig has to infer a data type from C code, Zig can’t tell whether the C code is referring to a single element or an array, so the C pointer type ([*c]T) is Zig’s way of saying, “I don’t know. I got this from C.”
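To make the distinction concrete, here is a small sketch in pure Zig. The firstByte function is a hypothetical stand-in for a translated C function (it is not part of uStreamer), and the variable names are my own.

```zig
const std = @import("std");

// Hypothetical stand-in for a translated C function. A [*c] parameter accepts
// either kind of pointer, so the function cannot know how many items are valid.
fn firstByte(ptr: [*c]const u8) u8 {
    return ptr[0];
}

test "single-item and many-item pointers both coerce to a C pointer" {
    const letter: u8 = 'A';
    const letters = [_]u8{ 'A', 'B', 'C', 'D' };

    const single: *const u8 = &letter; // points to exactly one item
    const many: [*]const u8 = &letters; // points to an unknown number of items

    try std.testing.expectEqual(@as(u8, 'A'), firstByte(single));
    try std.testing.expectEqual(@as(u8, 'A'), firstByte(many));
    try std.testing.expectEqual(@as(u8, 'D'), many[3]);
}
```

In ordinary Zig code, the two pointer types stay distinct, which is what lets the compiler reject indexing into a single-item pointer.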
Through trial and error, I found that Zig wanted me to get a pointer to input by referencing input.ptr rather than using the address-of operator &.
This Zig snippet shows the difference between .ptr and &:
const input = "hello, world!";
std.debug.print("input is type {s}\n", .{@typeName(@TypeOf(input))});
std.debug.print("&input is type {s}\n", .{@typeName(@TypeOf(&input))});
std.debug.print("input.ptr is type {s}\n", .{@typeName(@TypeOf(input.ptr))});
input is type *const [13:0]u8
&input is type *const *const [13:0]u8
input.ptr is type [*]const u8
Recall that Zig wants me to pass us_base64_encode a parameter of type [*c]const u8, so it looks like it can convert [*]const u8 to that type.
Okay, let me try calling us_base64_encode again:
const input = "hello, world!";
var cEncoded: *u8 = undefined;
var allocatedSize: usize = 0;
ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
That gives me:
$ zig build run
zig build-exe b64 Debug native: error: the following command failed with 1 compilation errors:
...
src/main.zig:12:54: error: expected type '[*c][*c]u8', found '**u8'
    ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
                                                     ^~~~~~~~~
Progress!
The code still doesn’t compile, but Zig is now complaining about the third parameter instead of the first. That at least tells me that I’ve supplied the expected types for the first two parameters.
Translating the output parameters into Zig
The compiler error also contains a helpful bit of information for calling into the C implementation of us_base64_encode:
pub export fn us_base64_encode(arg_data: [*c]const u8, arg_size: usize, arg_encoded: [*c][*c]u8, arg_allocated: [*c]usize) void {
That’s the signature of the C function translated into Zig, so Zig is telling me exactly the types I need to pass in to call the function.
Alternatively, I can use the zig translate-c utility to translate this C function signature into Zig. This effectively gives the same results as the compiler error above, but it preserves the original parameter names, whereas the compiler error prefixes them with arg_.
# We add --library c to let Zig know the code depends on libc.
$ zig translate-c src/libs/base64.h --library c | grep us_base64
pub extern fn us_base64_encode(data: [*c]const u8, size: usize, encoded: [*c][*c]u8, allocated: [*c]usize) void;
From more trial and error, I eventually guessed my way to these semantics for calling us_base64_encode from Zig:
const input = "hello, world!";
var cEncoded: [*c]u8 = null;
var allocatedSize: usize = 0;
ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
And it compiles successfully!
Can I do better than C pointers?
Recall what the Zig documentation said about C pointers:
The only valid reason for using a C pointer is in auto-generated code…
I’m writing this code by hand, so I guess I shouldn’t be using a type reserved for auto-generated code.
I know that the third parameter to us_base64_encode is a pointer to a null-terminated string. How do I represent that in Zig?
My first thought was to do this:
var cEncoded: [*:0]u8 = undefined;
ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
That seemed reasonable. I know that us_base64_encode will populate cEncoded with a string, and [*:0]u8 represents a null-terminated string of unknown length. But when I compile, Zig said no:
error: expected type '[*c][*c]u8', found '*[*:0]u8'
I was stumped, so I asked for help on Ziggit, a Zig discussion forum. Within an hour, another user showed me a solution:
var cEncoded: ?[*:0]u8 = null;
ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
The issue was that in C, the char* that encoded points to can be NULL, whereas a Zig type of [*:0]u8 cannot be null. That’s why Zig refused to let me pass in my previous attempt.
Breaking down the correct type of ?[*:0]u8, I see that it’s:
- a null-terminated sequence of bytes (:0]u8)
- of unknown length ([*)
- that might be null (?)
The new type allows me to compile the code, but if I try to print the value of cEncoded, I get what appears to be a memory address rather than a string:
$ zig build run
input: hello, world!
output: u8@2b12a0 # << whoops, not what I expected
output size: 21
In order to convert cEncoded back to a printable string, I have to unwrap it from its optional variable by verifying in code that its value is non-null:
var cEncoded: ?[*:0]u8 = null;
ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
const output: [*:0]u8 = cEncoded orelse return error.UnexpectedNull;
...
std.debug.print("output: {s}\n", .{output});
And then it prints the correct result:
$ zig build run
input: hello, world!
output: aGVsbG8sIHdvcmxkIQ==
output size: 21
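The unwrapping pattern can be tried in isolation, without any C code. In this sketch, lookup is a hypothetical stand-in for a C function that fills an out-parameter, and lookupString is an illustrative wrapper; neither name comes from uStreamer.

```zig
const std = @import("std");

// Hypothetical stand-in for a C function that reports its result through an
// out-parameter and leaves it null when it has nothing to return.
fn lookup(found: bool, out: *?[*:0]const u8) void {
    if (found) {
        out.* = "aGVsbG8=";
    } else {
        out.* = null;
    }
}

// Unwrap the optional pointer, then recover the length from the terminator.
fn lookupString(found: bool) ![]const u8 {
    var maybe: ?[*:0]const u8 = null;
    lookup(found, &maybe);
    const ptr = maybe orelse return error.UnexpectedNull;
    return std.mem.span(ptr);
}

test "unwrap an optional null-terminated pointer" {
    try std.testing.expectEqualStrings("aGVsbG8=", try lookupString(true));
    try std.testing.expectError(error.UnexpectedNull, lookupString(false));
}
```

The orelse branch is the only place where the null case has to be handled; after it, the rest of the function works with a pointer that the type system knows is non-null.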
Completing the call to C from Zig
At this point, I now have complete working code for calling the C us_base64_encode function from Zig. Here’s the full src/main.zig file:
// src/main.zig
const std = @import("std");
// Import the base64 implementation from uStreamer's C source file.
const ustreamer = @cImport({
    @cInclude("libs/base64.c");
});
pub fn main() !void {
    // Create a standard Zig string.
    const input = "hello, world!";
    // Create variables to store the output parameters of us_base64_encode.
    var cEncoded: ?[*:0]u8 = null;
    var allocatedSize: usize = 0;
    // Call the uStreamer C function from Zig.
    ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
    // Get the output as a non-optional type.
    const output: [*:0]u8 = cEncoded orelse return error.UnexpectedNull;
    // Free the memory that the C function allocated when this function exits.
    defer std.c.free(cEncoded);
    // Print the input and output of the base64 encode operation.
    std.debug.print("input: {s}\n", .{input});
    std.debug.print("output: {s}\n", .{output});
    std.debug.print("output size: {d}\n", .{allocatedSize});
}
$ zig build run
input: hello, world!
output: aGVsbG8sIHdvcmxkIQ==
output size: 21
Great! That worked. And the results are identical to my C implementation above.
The complete example at this stage is on GitHub.
Creating a Zig wrapper for the native C implementation
At this point, I can successfully call the C us_base64_encode function from Zig, but the code is a bit messy. Most of my main() function is dealing with translating values to and from C code.
One way to improve the code is to add a Zig wrapper function for us_base64_encode. That way, I could encapsulate all the Zig to C interop logic, and callers of my wrapper wouldn’t have to know or care that I’m calling C.
What should my wrapper function look like?
It should accept arbitrary bytes and return a null-terminated string, so the function signature should look something like this:
fn base64Encode(data: []const u8) [:0]u8 {...}
I already have the first few lines of my implementation based on my main() function above:
fn base64Encode(data: []const u8) [:0]u8 {
    var cEncoded: ?[*:0]u8 = null;
    var allocatedSize: usize = 0;
    ustreamer.us_base64_encode(data.ptr, data.len, &cEncoded, &allocatedSize);
    // TODO: Complete the implementation.
Who’s responsible for freeing the memory C allocated?
There’s a problem I haven’t addressed yet. us_base64_encode allocated memory into the cEncoded pointer. The caller is responsible for either freeing that memory or passing off that responsibility to its callers.
Normally, it’s fine for a function to declare that the caller is responsible for freeing an output value, but this case is a little trickier. This isn’t a normal Zig-allocated memory buffer; it’s a C-allocated buffer that requires a special free function (std.c.free).
I want to abstract away the C implementation details, so callers shouldn’t have to use a C-specific memory freeing function.
That tells me what I need to do to complete the implementation of my Zig wrapper. I use defer std.c.free to free the C-allocated memory, and then I’ll need to copy it into a Zig-managed slice:
fn base64Encode(data: []const u8) ![:0]u8 {
    var cEncodedOptional: ?[*:0]u8 = null;
    var allocatedSize: usize = 0;
    ustreamer.us_base64_encode(data.ptr, data.len, &cEncodedOptional, &allocatedSize);
    // Get the output as a non-optional type.
    const cEncoded: [*:0]u8 = cEncodedOptional orelse return error.UnexpectedNull;
    // Free the C-allocated memory buffer before exiting the function.
    defer std.c.free(cEncoded);
    // TODO: Copy the contents of cEncoded into a [:0]u8 buffer.
}
Converting a C string to a Zig string
At this point, I’ve got the string as a [*:0]u8 (a zero-terminated pointer of unknown length), but I want to return [:0]u8 (a length-aware, null-terminated Zig slice). How do I convert a C-style string to a Zig slice?
In my previous post, I converted a C string to a Zig string with this process:
1. Create a Zig slice of the C string using std.mem.span.
2. Use allocator.dupeZ to copy the contents of the slice into a newly allocated Zig slice.
That process would work here, but I’d be doing useless work in step (1). std.mem.span has to iterate the string to find the null terminator. In this code, I already know where the null terminator is because us_base64_encode stores that information in the allocatedSize parameter.
Instead, I create a length-aware Zig slice from the cEncoded pointer like this:
// The allocatedSize includes the null terminator, so subtract 1 to get the
// number of non-null characters in the string.
const cEncodedLength = allocatedSize - 1;
// Convert cEncoded (unknown length pointer) to a length-aware slice.
const outputLengthAware: [:0]u8 = cEncoded[0..cEncodedLength :0];
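The two approaches can be compared side by side in a self-contained sketch. The sliceKnownLength helper and its parameter names are illustrative only, and the string literal stands in for the buffer that us_base64_encode would return.

```zig
const std = @import("std");

// Slice a null-terminated pointer using a size that includes the terminator,
// mirroring how allocatedSize is used above. No scan for the 0 byte is needed.
fn sliceKnownLength(ptr: [*:0]const u8, size_with_terminator: usize) [:0]const u8 {
    const len = size_with_terminator - 1;
    return ptr[0..len :0];
}

test "known-length slice matches the result of std.mem.span" {
    const raw: [*:0]const u8 = "aGVsbG8sIHdvcmxkIQ==";

    const sliced = sliceKnownLength(raw, 21); // 20 characters + terminator
    const scanned = std.mem.span(raw); // walks the bytes to find the 0

    try std.testing.expectEqual(@as(usize, 20), sliced.len);
    try std.testing.expectEqualStrings(scanned, sliced);
}
```

In safe build modes, the :0 in the slice expression also makes Zig check that the byte at the end of the slice really is 0, so a wrong size is caught at runtime instead of silently producing a bad slice.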
At this point, I can complete the implementation of my wrapper function:
fn base64Encode(allocator: std.mem.Allocator, data: []const u8) ![:0]u8 {
    var cEncodedOptional: ?[*:0]u8 = null;
    var allocatedSize: usize = 0;
    ustreamer.us_base64_encode(data.ptr, data.len, &cEncodedOptional, &allocatedSize);
    // Get the output as a non-optional type.
    const cEncoded: [*:0]u8 = cEncodedOptional orelse return error.UnexpectedNull;
    // Free the C-allocated memory buffer before exiting the function.
    defer std.c.free(cEncoded);
    // The allocatedSize includes the null terminator, so subtract 1 to get the
    // number of non-null characters in the string.
    const cEncodedLength = allocatedSize - 1;
    return allocator.dupeZ(u8, cEncoded[0..cEncodedLength :0]);
}
To call dupeZ, I need a Zig allocator, so I adjusted the semantics of my base64Encode wrapper to accept a std.mem.Allocator type.
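The ownership handoff can be sketched without the C dependency. Here, copyCString is a hypothetical helper of my own, and std.testing.allocator is used because it fails the test if the copy is never freed.

```zig
const std = @import("std");

// Copy a C-style string into memory owned by a Zig allocator. The caller
// frees the result with the same allocator, not with std.c.free.
fn copyCString(allocator: std.mem.Allocator, c_string: [*:0]const u8) ![:0]u8 {
    return allocator.dupeZ(u8, std.mem.span(c_string));
}

test "caller frees the copy with the allocator it passed in" {
    const allocator = std.testing.allocator;

    const copy = try copyCString(allocator, "aGVsbG8=");
    defer allocator.free(copy);

    try std.testing.expectEqualStrings("aGVsbG8=", copy);
    // dupeZ appends a terminator after the last character.
    try std.testing.expectEqual(@as(u8, 0), copy[copy.len]);
}
```

Passing the allocator as a parameter follows the usual Zig convention of letting the caller decide where memory comes from.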
Tying it all together
With my Zig wrapper in place, it’s now trivial to exercise the C us_base64_encode function from Zig.
Recall that my previous code looked like this:
const input = "hello, world!";
var cEncoded: ?[*:0]u8 = null;
var allocatedSize: usize = 0;
ustreamer.us_base64_encode(input.ptr, input.len, &cEncoded, &allocatedSize);
const output: [*:0]u8 = cEncoded orelse return error.UnexpectedNull;
defer std.c.free(cEncoded);
With my Zig wrapper, the semantics simplify to two lines:
const output = try base64Encode(allocator, "hello, world!");
defer allocator.free(output);
Here’s the full example:
const std = @import("std");
// Import the base64 implementation from uStreamer's C source file.
const ustreamer = @cImport({
    @cInclude("libs/base64.c");
});
fn base64Encode(allocator: std.mem.Allocator, data: []const u8) ![:0]u8 {
    var cEncodedOptional: ?[*:0]u8 = null;
    var allocatedSize: usize = 0;
    ustreamer.us_base64_encode(data.ptr, data.len, &cEncodedOptional, &allocatedSize);
    const cEncoded: [*:0]u8 = cEncodedOptional orelse return error.UnexpectedNull;
    defer std.c.free(cEncodedOptional);
    const cEncodedLength = allocatedSize - 1;
    return allocator.dupeZ(u8, cEncoded[0..cEncodedLength :0]);
}
pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();
    const input = "hello, world!";
    const output = try base64Encode(allocator, input);
    defer allocator.free(output);
    // Print the input and output of the base64 encode operation.
    std.debug.print("input: {s}\n", .{input});
    std.debug.print("output: {s}\n", .{output});
    std.debug.print("output size: {d}\n", .{output.len});
}
$ zig build run
input: hello, world!
output: aGVsbG8sIHdvcmxkIQ==
output size: 20
The output size is now 20 instead of 21 because the underlying data type changed. Previously, I was printing the output size parameter that us_base64_encode populated, which included the null terminator. Now, I’m using the .len property of the output string, which doesn’t include the null terminator.
The complete example at this stage is on GitHub.
Creating the first unit test
Now that I can call the C us_base64_encode function through a convenient Zig wrapper, I’m ready to start writing unit tests to verify that the C implementation is correct.
The first thing I need to do is make a few small adjustments to my build.zig file so that the unit tests can access libc and uStreamer’s C source files:
// build.zig
const unit_tests = b.addTest(.{
    .root_source_file = .{ .path = "src/main.zig" },
    .target = target,
    .optimize = optimize,
});
unit_tests.linkLibC(); // Link against libc.
unit_tests.addIncludePath(.{ .path = "src" }); // Search src path for includes.
I’ve already done the heavy lifting here by writing my Zig wrapper function, so writing my first unit test is easy:
// src/main.zig
test "encode simple string as base64" {
    const allocator = std.testing.allocator;
    const actual = try base64Encode(allocator, "hello, world!");
    defer allocator.free(actual);
    try std.testing.expectEqualStrings("aGVsbG8sIHdvcmxkIQ==", actual);
}
The zig build test command runs my unit test:
$ zig build test --summary all
Build Summary: 3/3 steps succeeded; 1/1 tests passed
test success
└─ run test 1 passed 1ms MaxRSS:1M
   └─ zig test Debug native success 2s MaxRSS:211M
Success! My first unit test is working and exercising the C code.
The complete example at this stage is on GitHub.
Checking for false positive test results
My unit test is succeeding, but I want to make sure that the test is truly executing the C code and not just returning a false positive. I can verify this by intentionally introducing a bug into the C code.
This is a snippet from the implementation of base64.c:
# define OCTET(_name) unsigned _name = (data_index < size ? (uint8_t)data[data_index++] : 0)
OCTET(octet_a);
OCTET(octet_b);
OCTET(octet_c);
# undef OCTET
Let me try swapping the order of these two lines:
OCTET(octet_a);
OCTET(octet_c); // I've swapped these
OCTET(octet_b); // two lines.
And here’s what happens when I try re-running my unit test on the C function after my tampering:
$ zig build test --summary all
run test: error: 'test.encode simple string as base64' failed: ====== expected this output: =========
aGVsbG8sIHdvcmxkIQ==␃
======== instead found this: =========
aGxlbCxvIG93cmRsIQ==␃
Cool, the test works!
When I introduced a bug into us_base64_encode, my test failed and revealed the bug.
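To see why swapping those two lines changes the output, here is a simplified sketch of how base64 packs one group of three octets into four characters. It is not uStreamer's implementation: encodeTriple and alphabet are names I made up, and padding is left out.

```zig
const std = @import("std");

const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Pack three octets into 24 bits, then emit four 6-bit base64 characters.
fn encodeTriple(a: u8, b: u8, c: u8) [4]u8 {
    const triple: usize = (@as(usize, a) << 16) | (@as(usize, b) << 8) | @as(usize, c);
    return [4]u8{
        alphabet[(triple >> 18) & 0x3F],
        alphabet[(triple >> 12) & 0x3F],
        alphabet[(triple >> 6) & 0x3F],
        alphabet[triple & 0x3F],
    };
}

test "octet order determines the encoded characters" {
    // Correct order: the first three bytes of the test string.
    const correct = encodeTriple('h', 'e', 'l');
    // Second and third octets swapped, as in the tampered C code.
    const swapped = encodeTriple('h', 'l', 'e');

    try std.testing.expectEqualStrings("aGVs", &correct);
    try std.testing.expectEqualStrings("aGxl", &swapped);
}
```

The swapped result, aGxl, matches the first four characters of the failing test output above, which suggests the failure really does come from the reordered octets.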
Adding multiple unit tests
I’d like to extend my single test case into many test cases to increase my confidence that I’m exercising more of the C function’s logic.
Half of the lines in my first unit test were boilerplate around managing memory, so I’d like to avoid repeating that for each test. I wrote a utility function to capture the boilerplate:
fn testBase64Encode(
    input: []const u8,
    expected: [:0]const u8,
) !void {
    const allocator = std.testing.allocator;
    const actual = try base64Encode(allocator, input);
    defer allocator.free(actual);
    try std.testing.expectEqualStrings(expected, actual);
}
My test utility function allows me to add new tests easily:
test "encode strings as base64" {
    try testBase64Encode("", "");
    try testBase64Encode("h", "aA==");
    try testBase64Encode("he", "aGU=");
    try testBase64Encode("hel", "aGVs");
    try testBase64Encode("hell", "aGVsbA==");
    try testBase64Encode("hello, world!", "aGVsbG8sIHdvcmxkIQ==");
}
test "encode raw bytes as base64" {
    try testBase64Encode(&[_]u8{0}, "AA==");
    try testBase64Encode(&[_]u8{ 0, 0 }, "AAA=");
    try testBase64Encode(&[_]u8{ 0, 0, 0 }, "AAAA");
    try testBase64Encode(&[_]u8{255}, "/w==");
    try testBase64Encode(&[_]u8{ 255, 255 }, "//8=");
    try testBase64Encode(&[_]u8{ 255, 255, 255 }, "////");
}
$ zig build test --summary all
Build Summary: 3/3 steps succeeded; 2/2 tests passed
test success
└─ run test 2 passed 2ms MaxRSS:2M
   └─ zig test Debug native success 2s MaxRSS:195M
The complete example at this stage is on GitHub.
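One optional way to gain confidence in hard-coded expected values like the ones above is to cross-check them against an independent encoder. The sketch below uses Zig's own std.base64 for that; referenceEncode is a hypothetical helper and is not part of the example repository.

```zig
const std = @import("std");

// Encode with Zig's standard library to get an independent expected value.
// The caller supplies a buffer large enough to hold the encoded output.
fn referenceEncode(buffer: []u8, data: []const u8) []const u8 {
    return std.base64.standard.Encoder.encode(buffer, data);
}

test "standard library agrees with the hard-coded expected values" {
    var buffer: [32]u8 = undefined;

    try std.testing.expectEqualStrings(
        "aGVsbG8sIHdvcmxkIQ==",
        referenceEncode(&buffer, "hello, world!"),
    );
    try std.testing.expectEqualStrings("/w==", referenceEncode(&buffer, &[_]u8{255}));
    try std.testing.expectEqualStrings("AAA=", referenceEncode(&buffer, &[_]u8{ 0, 0 }));
}
```

A reference encoder like this could also drive randomized comparisons against the C function, though that goes beyond the tests shown in this post.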
Thanks to Zig’s excellent interoperability with C, it’s possible to add unit tests to an existing C application without modifying any of the C code or build process.
In the example I showed, the C code doesn’t know about Zig at all, and it continues to work as-is with no changes to its existing Makefile.
I found this exercise a useful way of learning more about both the Zig language and the C code I’m testing.
Thanks to the Ziggit community for their help with this blog post. Excerpts from uStreamer are used under the GPLv3 license.