Factor is faster than Zig! – Re: Factor

2023-11-10 14:25:03

Recently, I was looking at the Zig programming language.
As I often do, I started implementing a few typical things in new languages
to learn them. Well, one of them was super slow and Zig is supposed to be super
fast, so I was trying to understand where the disconnect was and compare it to
Factor!

I was able to reduce the issue to a small test case and it turns out that there
is a behavioral issue in their implementation of HashMap that makes their
HashMaps get slow over time. The test case performs these steps:

  • creates a HashMap of 2 million items
  • decrements the map values, removing an item every third loop
  • inserts a replacement new item to maintain 2 million items in the HashMap
  • for a total of 250 million actions, then
  • deletes the remaining items from the HashMap
We record the total time each block of 1 million actions takes:

Something is very wrong!

Zig

This is the simple test case implemented in Zig using the std.HashMap:

const std = @import("std");

pub fn main() !void {
    var map = std.AutoHashMap(u64, u64).init(std.heap.page_allocator);
    defer map.deinit();

    var list = std.ArrayList(u64).init(std.heap.page_allocator);
    defer list.deinit();

    var prng = std.rand.DefaultPrng.init(0);
    const random = prng.random();

    var start = std.time.milliTimestamp();

    var i: u64 = 0;
    while (i < 2_000_000) : (i += 1) {
        try map.put(i, 3);
        try list.append(i);
    }

    while (i < 250_000_000) : (i += 1) {
        var index = random.uintLessThan(usize, list.items.len);
        var j = list.items[index];
        var k = map.get(j).?;
        if (k == 1) {
            _ = map.remove(j);
            try map.put(i, 3);
            list.items[index] = i;
        } else {
            try map.put(j, k - 1);
        }

        if (i % 1_000_000 == 0) {
            var end = std.time.milliTimestamp();
            std.debug.print("{} block took {} ms\n", .{ i, end - start });
            start = std.time.milliTimestamp();
        }
    }

    while (list.items.len > 0) {
        var j = list.pop();
        _ = map.remove(j);
    }
}

We can run it using ReleaseFast to get the best performance and see
that, over time, it gets super slow – so slow that it isn't even able to
really finish the test case:

$ zig version
0.11.0

$ zig run -O ReleaseFast maptest.zig
2000000 block took 156 ms
3000000 block took 122 ms
4000000 block took 127 ms
5000000 block took 133 ms
6000000 block took 138 ms
7000000 block took 141 ms
8000000 block took 143 ms
9000000 block took 145 ms
10000000 block took 147 ms
11000000 block took 148 ms
12000000 block took 151 ms
13000000 block took 153 ms
14000000 block took 155 ms
15000000 block took 157 ms
16000000 block took 159 ms
17000000 block took 164 ms
18000000 block took 167 ms
19000000 block took 171 ms
20000000 block took 173 ms
21000000 block took 180 ms
22000000 block took 186 ms
23000000 block took 190 ms
24000000 block took 195 ms
25000000 block took 205 ms
26000000 block took 213 ms
27000000 block took 221 ms
28000000 block took 234 ms
29000000 block took 247 ms
30000000 block took 264 ms
31000000 block took 282 ms
32000000 block took 301 ms
33000000 block took 320 ms
34000000 block took 346 ms
35000000 block took 377 ms
36000000 block took 409 ms
37000000 block took 448 ms
38000000 block took 502 ms
39000000 block took 550 ms
40000000 block took 614 ms
41000000 block took 694 ms
42000000 block took 767 ms
43000000 block took 853 ms
44000000 block took 961 ms
45000000 block took 1088 ms
46000000 block took 1250 ms
47000000 block took 1420 ms
48000000 block took 1612 ms
49000000 block took 1826 ms
50000000 block took 2056 ms
51000000 block took 2320 ms
52000000 block took 2688 ms
53000000 block took 3015 ms
54000000 block took 3467 ms
55000000 block took 3971 ms
56000000 block took 4618 ms
57000000 block took 5377 ms
58000000 block took 6172 ms
59000000 block took 7094 ms
60000000 block took 8173 ms
61000000 block took 9469 ms
62000000 block took 11083 ms
63000000 block took 12737 ms
64000000 block took 14000 ms
65000000 block took 16243 ms
66000000 block took 17912 ms
67000000 block took 20452 ms
68000000 block took 24356 ms
...

We can swap the example above to use std.ArrayHashMap which doesn't have
this problem, although it's a bit slower with each block taking roughly 250
milliseconds.

Factor

We can compare that to a simple implementation in Factor:

USING: assocs calendar formatting io kernel math random
sequences ;

:: maptest ( -- )
    H{ } clone :> m
    V{ } clone :> l

    now :> start!
    0   :> i!

    [ i 2,000,000 < ] [
        3 i m set-at
        i l push
        i 1 + i!
    ] while

    [ i 250,000,000 < ] [
        l length random :> j
        j l nth :> k
        k m at 1 - [
            k m delete-at
            3 i m set-at
            i j l set-nth
        ] [
            k m set-at
        ] if-zero

        i 1,000,000 mod zero? [
            i now start time- duration>milliseconds
            "%d block took %d msn" printf flush
            now start!
        ] when
        i 1 + i!
    ] while

    [ l empty? ] [
        l pop m delete-at
    ] until ;

We can run it in Factor and see how long it takes. There are notably some long
delays in the first few blocks which I'd like to understand better – possibly
due to excessive rehashing or some allocation pattern with the Factor garbage
collector – after which it quickly reaches a steady state where each block takes
about 250 milliseconds.

$ factor maptest.factor
2000000 block took 855 ms
3000000 block took 198 ms
4000000 block took 205 ms
5000000 block took 3579 ms
6000000 block took 4438 ms
7000000 block took 3624 ms
8000000 block took 2996 ms
9000000 block took 232 ms
10000000 block took 243 ms
11000000 block took 248 ms
12000000 block took 298 ms
13000000 block took 233 ms
14000000 block took 238 ms
15000000 block took 298 ms
16000000 block took 233 ms
17000000 block took 521 ms
18000000 block took 231 ms
19000000 block took 236 ms
20000000 block took 280 ms
21000000 block took 235 ms
22000000 block took 235 ms
23000000 block took 281 ms
24000000 block took 231 ms
25000000 block took 236 ms
26000000 block took 294 ms
27000000 block took 231 ms
28000000 block took 236 ms
29000000 block took 506 ms
30000000 block took 234 ms
31000000 block took 237 ms
32000000 block took 277 ms
33000000 block took 232 ms
34000000 block took 239 ms
35000000 block took 279 ms
36000000 block took 235 ms
37000000 block took 239 ms
38000000 block took 275 ms
39000000 block took 234 ms
40000000 block took 514 ms
41000000 block took 231 ms
42000000 block took 236 ms
43000000 block took 282 ms
44000000 block took 235 ms
45000000 block took 235 ms
46000000 block took 282 ms
47000000 block took 231 ms
48000000 block took 233 ms
49000000 block took 280 ms
50000000 block took 234 ms
51000000 block took 238 ms
52000000 block took 507 ms
53000000 block took 231 ms
54000000 block took 236 ms
55000000 block took 276 ms
56000000 block took 231 ms
57000000 block took 238 ms
58000000 block took 278 ms
59000000 block took 234 ms
60000000 block took 235 ms
61000000 block took 278 ms
62000000 block took 237 ms
63000000 block took 239 ms
64000000 block took 510 ms
65000000 block took 234 ms
66000000 block took 284 ms
...

Not bad!

What’s the Bug?

So, Zig may be super fast, but the default std.HashMap implementation
uses tombstone buckets to mark slots as deleted, and over time these
tombstone buckets create fragmentation in the HashMap, which causes its
linear probing to trend towards the worst case of examining every bucket in
the HashMap when looking for a key.
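This degradation is easy to reproduce with a toy model. The Python sketch below (entirely my own construction, not Zig's actual code) implements a small linear-probing table where deletes leave tombstones. Under the same delete-and-replace churn, the average probe length keeps growing; periodically rebuilding the table from scratch, which is roughly what a rehash would do, keeps probes short:

```python
import random

def churn(cap, live, rounds, rebuild_every=None):
    """Toy linear-probing hash table. Each slot holds a key, FREE, or
    TOMB (a tombstone left by a delete). Runs a delete-and-replace
    churn and returns the average number of probes per operation."""
    FREE, TOMB = -1, -2
    mask = cap - 1
    slots = [FREE] * cap

    def find(key):
        # Probe linearly from the key's home slot, remembering the
        # first tombstone so an insert can reuse it.
        idx = (key * 0x9E3779B1) & mask  # simple multiplicative hash
        tomb, steps = None, 0
        while steps < cap:
            steps += 1
            s = slots[idx]
            if s == FREE:
                return (idx if tomb is None else tomb), steps, False
            if s == TOMB:
                tomb = idx if tomb is None else tomb
            elif s == key:
                return idx, steps, True
            idx = (idx + 1) & mask
        return tomb, steps, False  # no FREE slot left in the table

    keys = list(range(live))
    for k in keys:
        slots[find(k)[0]] = k

    rng = random.Random(0)
    next_key, probes = live, 0
    for r in range(rounds):
        i = rng.randrange(live)
        idx, p, _ = find(keys[i])
        slots[idx] = TOMB              # delete leaves a tombstone behind
        probes += p
        idx, p, _ = find(next_key)
        slots[idx] = next_key          # insert a fresh replacement key
        probes += p
        keys[i] = next_key
        next_key += 1
        if rebuild_every and (r + 1) % rebuild_every == 0:
            slots = [FREE] * cap       # rebuild: all tombstones vanish
            for k in keys:
                slots[find(k)[0]] = k
    return probes / (2 * rounds)

print(churn(4096, 1024, 20_000))                       # tombstones pile up
print(churn(4096, 1024, 20_000, rebuild_every=1_000))  # periodic rebuild
```

The first run never reclaims a tombstone slot, so free slots only get scarcer and every probe chain lengthens; the second shows how clearing tombstones restores short chains.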

We can implement a rehash() method on the HashMap that performs an in-place
rehashing of all the elements, without allocations. Ideally, this would be done
when the number of filled and deleted slots reaches some capacity threshold.
But, for now, we can just run map.rehash() once per block, and see how that
improves performance:

diff --git a/lib/std/hash_map.zig b/lib/std/hash_map.zig
index 8a3d78283..7192ba733 100644
--- a/lib/std/hash_map.zig
+++ b/lib/std/hash_map.zig
@@ -681,6 +681,11 @@ pub fn HashMap(
             self.unmanaged = .{};
             return result;
         }
+
+         /// Rehash the map, in-place
+         pub fn rehash(self: *Self) void {
+             self.unmanaged.rehash(self.ctx);
+         }
     };
 }
 
@@ -1505,6 +1510,92 @@ pub fn HashMapUnmanaged(
             return result;
         }
 
+        /// Rehash the map, in-place
+        pub fn rehash(self: *Self, ctx: anytype) void {
+            const mask = self.capacity() - 1;
+
+            var metadata = self.metadata.?;
+            var keys_ptr = self.keys();
+            var values_ptr = self.values();
+            var curr: Size = 0;
+
+            // While we are re-hashing every slot, we will use the
+            // fingerprint to mark used buckets as being used and either free
+            // (needing to be rehashed) or tombstone (already rehashed).
+
+            while (curr < self.capacity()) : (curr += 1) {
+                metadata[curr].fingerprint = Metadata.free;
+            }
+
+            // Now iterate over all the buckets, rehashing them
+
+            curr = 0;
+            while (curr < self.capacity()) {
+                if (!metadata[curr].isUsed()) {
+                    assert(metadata[curr].isFree());
+                    curr += 1;
+                    continue;
+                }
+
+                var hash = ctx.hash(keys_ptr[curr]);
+                var fingerprint = Metadata.takeFingerprint(hash);
+                var idx = @as(usize, @truncate(hash & mask));
+
+                // For each bucket, rehash to an index:
+                // 1) before the cursor, probed into a free slot, or
+                // 2) equal to the cursor, no need to move, or
+                // 3) ahead of the cursor, probing over already rehashed
+
+                while ((idx < curr and metadata[idx].isUsed()) or
+                    (idx > curr and metadata[idx].fingerprint == Metadata.tombstone))
+                {
+                    idx = (idx + 1) & mask;
+                }
+
+                if (idx < curr) {
+                    assert(metadata[idx].isFree());
+                    metadata[idx].fingerprint = fingerprint;
+                    metadata[idx].used = 1;
+                    keys_ptr[idx] = keys_ptr[curr];
+                    values_ptr[idx] = values_ptr[curr];
+
+                    metadata[curr].used = 0;
+                    assert(metadata[curr].isFree());
+                    keys_ptr[curr] = undefined;
+                    values_ptr[curr] = undefined;
+
+                    curr += 1;
+                } else if (idx == curr) {
+                    metadata[idx].fingerprint = fingerprint;
+                    curr += 1;
+                } else {
+                    assert(metadata[idx].fingerprint != Metadata.tombstone);
+                    metadata[idx].fingerprint = Metadata.tombstone;
+                    if (metadata[idx].isUsed()) {
+                        var tmpkey = keys_ptr[idx];
+                        var tmpvalue = values_ptr[idx];
+
+                        keys_ptr[idx] = keys_ptr[curr];
+                        values_ptr[idx] = values_ptr[curr];
+
+                        keys_ptr[curr] = tmpkey;
+                        values_ptr[curr] = tmpvalue;
+                    } else {
+                        metadata[idx].used = 1;
+                        keys_ptr[idx] = keys_ptr[curr];
+                        values_ptr[idx] = values_ptr[curr];
+
+                        metadata[curr].fingerprint = Metadata.free;
+                        metadata[curr].used = 0;
+                        keys_ptr[curr] = undefined;
+                        values_ptr[curr] = undefined;
+
+                        curr += 1;
+                    }
+                }
+            }
+        }
+
         fn grow(self: *Self, allocator: Allocator, new_capacity: Size, ctx: Context) Allocator.Error!void {
             @setCold(true);
             const new_cap = @max(new_capacity, minimal_capacity);
@@ -2218,3 +2309,35 @@ test "std.hash_map repeat fetchRemove" {
     try testing.expect(map.get(2) != null);
     try testing.expect(map.get(3) != null);
 }
+
+test "std.hash_map rehash" {
+    var map = AutoHashMap(u32, u32).init(std.testing.allocator);
+    defer map.deinit();
+
+    var prng = std.rand.DefaultPrng.init(0);
+    const random = prng.random();
+
+    const count = 6 * random.intRangeLessThan(u32, 100_000, 500_000);
+
+    var i: u32 = 0;
+    while (i < count) : (i += 1) {
+        try map.put(i, i);
+        if (i % 3 == 0) {
+            try expectEqual(map.remove(i), true);
+        }
+    }
+
+    map.rehash();
+
+    try expectEqual(map.count(), count * 2 / 3);
+
+    i = 0;
+    while (i < count) : (i += 1) {
+        if (i % 3 == 0) {
+            try expectEqual(map.get(i), null);
+        } else {
+            try expectEqual(map.get(i).?, i);
+        }
+    }
+}

We can apply that diff to lib/std/hash_map.zig and try again, now taking
about 165 milliseconds per block including the time for map.rehash():

$ zig run -O ReleaseFast maptest.zig --zig-lib-dir ~/Dev/zig/lib
2000000 block took 155 ms
3000000 block took 147 ms
4000000 block took 154 ms
5000000 block took 160 ms
6000000 block took 163 ms
7000000 block took 164 ms
8000000 block took 165 ms
9000000 block took 166 ms
10000000 block took 166 ms
11000000 block took 165 ms
12000000 block took 166 ms
13000000 block took 165 ms
14000000 block took 166 ms
15000000 block took 172 ms
16000000 block took 165 ms
17000000 block took 167 ms
18000000 block took 165 ms
19000000 block took 167 ms
20000000 block took 169 ms
21000000 block took 168 ms
22000000 block took 167 ms
23000000 block took 166 ms
24000000 block took 167 ms
25000000 block took 167 ms
26000000 block took 165 ms
27000000 block took 166 ms
28000000 block took 166 ms
29000000 block took 165 ms
30000000 block took 165 ms
31000000 block took 165 ms
32000000 block took 166 ms
33000000 block took 165 ms
34000000 block took 167 ms
35000000 block took 170 ms
36000000 block took 165 ms
37000000 block took 166 ms
38000000 block took 166 ms
39000000 block took 164 ms
40000000 block took 165 ms
41000000 block took 167 ms
42000000 block took 166 ms
43000000 block took 167 ms
44000000 block took 169 ms
45000000 block took 166 ms
46000000 block took 165 ms
47000000 block took 166 ms
48000000 block took 166 ms
49000000 block took 166 ms
50000000 block took 166 ms
51000000 block took 166 ms
52000000 block took 164 ms
53000000 block took 165 ms
54000000 block took 167 ms
55000000 block took 165 ms
56000000 block took 166 ms
57000000 block took 166 ms
58000000 block took 165 ms
59000000 block took 166 ms
60000000 block took 169 ms
61000000 block took 165 ms
62000000 block took 165 ms
63000000 block took 166 ms
64000000 block took 166 ms
65000000 block took 165 ms
66000000 block took 166 ms
67000000 block took 176 ms
68000000 block took 166 ms
...

Well now, Zig is fast and all is right with the world again – and
Factor takes only about 50% more time than Zig's std.HashMap with
rehash() and about the same as std.ArrayHashMap, which is pretty
good for a dynamic language.

I submitted a pull request adding a rehash() method to
HashMap and hopefully it will get into
the upcoming Zig 0.12 release. And maybe for Zig 0.13 they'll adjust it to
automatically rehash when it gets sufficiently fragmented, consider using
quadratic probing instead of linear probing, or perhaps switch to using a
completely different HashMap algorithm like Facebook's F14 hash table, which
doesn't have this issue.

Perhaps we should consider some of these improvements for Factor as well!

Open source is fun!
