4 billion if statements | Blabbin’

2023-12-28 00:11:31

I recently stumbled upon this screenshot while researching social media on the train. Of course, it was followed by a cascade of spiteful comments, criticizing this fresh programmer’s attempt to solve a classical problem in computer science. The modulus operation.

TikTok screenshot

Now, in a world where AI is replacing programmers by the minute, taking their jobs and revolutionizing the way we think about code, maybe we should be more open to the thoughts of the fresh new blood of the industry? In fact, the above code is a perfect example of a time-memory tradeoff. You’re trading off your time and, at the same time, the computer’s memory and time as well! Truly a marvelous algorithm!

So I went to work to explore this idea of checking if a number is odd or even by only using comparisons, to see how well it works in a real world scenario. Since I’m a great believer in performant code I decided to implement this in the C programming language as it’s by far the fastest language on the planet to this day (thanks to the visionary genius Dennis Richie).

So I started coding

/* Copyright 2023. All unauthorized distribution of this source code 
   will be persecuted to the fullest extent of the law*/
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
int main(int argc, char* argv[])
{
    uint8_t number = atoi(argv[1]); // No problems here
    if (number == 0)
        printf("even\n");
    if (number == 1)
        printf("odd\n");
    if (number == 2)
        printf("even\n");
    if (number == 3)
        printf("odd\n");
    if (number == 4)
        printf("even\n");
    if (number == 5)
        printf("odd\n");
    if (number == 6)
        printf("even\n");
    if (number == 7)
        printf("odd\n");
    if (number == 8)
        printf("even\n");
    if (number == 9)
        printf("odd\n");
    if (number == 10)
        printf("even\n");
}

Beautiful! Let’s compile the code, disabling optimizations with /Od to make sure that the pesky compiler doesn’t interfere with our algorithm. After compiling we can do a quick test of the program and get some positive results:

PS > cl.exe /Od program.c
PS > .\program.exe 0
even
PS > .\program.exe 4
even
PS > .\program.exe 3
odd
PS > .\program.exe 7
odd

However, after doing some further testing I found some problems:

PS > .\program.exe 50
PS > .\program.exe 11
PS > .\program.exe 99

No output! It seems the program only works for numbers under 11! Going back to the code we can find the issue right after the last if statement: we need more if statements!

Now, this is a time-memory tradeoff, but my time on this earth is limited so I decided to meta-program the if statements using a programmer program in a different programming language. To compensate for this cheating I decided to use the slowest language on the planet, Python (thanks to the visionary genius of Ross van der Gussom).

print("/* Copyright 2023. All unauthorized distribution of this source code")
print("   will be persecuted to the fullest extent of the law*/")

print("#include <stdio.h>")
print("#include <stdint.h>")
print("#include <stdlib.h>")

print("int main(int argc, char* argv[])")
print("{")
print("    uint8_t number = atoi(argv[1]); // No problems here")

for i in range(2**8):
    print("    if (number == "+str(i)+")")
    if i % 2 == 0:
        print("        printf(\"even\\n\");")
    else:
        print("        printf(\"odd\\n\");")

print("}")

Nice! Now we can generate a program that solves the even-odd problem for all 8-bit integers!

PS > python programmer.py > program.c
PS > cl.exe /Od program.c
PS > .\program.exe 99
odd
PS > .\program.exe 50
even
PS > .\program.exe 240
even
PS > .\program.exe 241
odd

Would you look at that! It works flawlessly! Now, let’s scale it up to 16 bit!

print("    uint16_t number = atoi(argv[1]); // No problems here")

for i in range(2**16):

This gives a nice and thick C file of around 130k lines. Nothing really, when looking back at some of the code bases I’ve worked on over the years. Let’s compile!

PS > python programmer.py > program.c
PS > cl.exe /Od program.c
PS > .\program.exe 21000
even
PS > .\program.exe 3475
odd
PS > .\program.exe 3
odd
PS > .\program.exe 65001
odd
PS > .\program.exe 65532
even

Beautiful! Our algorithm seems to scale with the data! The executable is around 2 MB, but that’s no match for my beefy gaming rig with a whopping 31.8 GB of memory.
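
A quick back-of-the-envelope check of how the generated source grows with the bit width. This is only a sketch: `FIXED_LINES` and `generated_line_count` are names made up here, and the count of 9 fixed lines assumes the generator script prints exactly the header, `main` signature, declaration and closing brace shown above.

```python
# Lines the generator prints outside the loop (assumed from the script above):
# 2 comment lines, 3 includes, the main signature, "{", the declaration, "}".
FIXED_LINES = 9


def generated_line_count(bits: int) -> int:
    """Number of lines in the generated C file for a given bit width."""
    # Every value contributes one `if` line and one `printf` line.
    return FIXED_LINES + 2 * 2**bits


for bits in (8, 16, 32):
    print(f"{bits:>2} bit: {generated_line_count(bits):,} lines")
```

The 16 bit count lands at roughly the 130k lines mentioned above, and every additional 16 bits multiplies the loop part by 65536.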

Now, 16 bit is a very cool bitwidth, but as we all know, 32 bit is the holy grail of computing and is the final bitwidth that we need to solve all practical engineering and scientific problems. After all, IPv4 is still standing stronger than ever, 60 years after it was deemed deprecated due to so-called “address exhaustion”.

So without further ado, let’s scale to our final size. 32 bit is only 65536 times as many numbers as 16 bit, what could go wrong?

print("    uint32_t number = atoi(argv[1]); // No problems here")

for i in range(2**32):

So I let the mighty snake do its work and, after getting a cup of coffee and getting back to check on the program 48 hours later, I was left with a beautiful C file, almost 330 GB in size! Almost certainly among the largest C files in history. My fingers were trembling when I entered the next command, surely MSVC had never before encountered such powerful source code. After abusing the pagefile of my poor, powerful computer for half an hour the following was spat out:

PS > cl /Od program.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.32.31329 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

program.c
program.c(134397076): warning C4049: compiler limit: terminating line number emission
program.c(134397076): note: Compiler limit for line number is 16777215
program.c(41133672): fatal error C1060: compiler is out of heap space

Pathetic!

And not only did the compiler fail us, but when looking into the limits of the Portable Executable format (.exe) for Windows, I discovered that it cannot handle more than a measly 4 GB! With more than 4 billion comparisons needing to be encoded into the executable, this is a major obstacle for implementing our algorithm. Even if each comparison would use less than a single byte we’d still be too heavy.

However, bad compilers and file formats should not stop us from achieving our dream. After all, all a compiler does is write some fancy machine code into a file, and the file format is just some structure telling the OS how to put the binary code into memory. Really, we can do that ourselves.

Let’s start by writing an IsEven function in x86-64 assembly as it’s the native language of my Intel powered machine. It looks something like this:

; Argument is stored in ECX, return value in EAX
XOR EAX, EAX ; Set eax to zero (return value for odd number)
CMP ECX, 0h ; Compare arg to 0 
JNE 3h ; Skip next two instructions if it wasn't equal
INC EAX ; It was even, set even return value (1)
RET ; Return
CMP ECX, 1h ; Compare arg to 1
JNE 2 ; Skip next instruction if not equal
RET ; Odd return value already in EAX, just RET
; add the next 2...2^32-1 comparisons here
RET ; Fallback return

Not really correct asm, but it doesn’t matter much, because we’re gonna compile it into machine code manually.

How did I do this? Well, I jumped online, using a mix of my early life experience coding emulators and hacking, and looked into the x86(-64) architecture manuals to figure out the correct opcodes and format for each instruction.

… Just kidding, that’s horrible. I asked ChatGPT what the correct opcode was for each instruction and lucky for us it didn’t hallucinate any new extensions to x86-64.

So now we just write a “compiler” to output this code. Note that we’ll write the opcodes we got from the AI for the instructions directly. Here’s how it looks in our friend Python:

import struct

with open('isEven.bin', 'wb') as file:
   
    file.write(b"\x31\xC0")                     # XOR EAX, EAX

    for i in range(2**32):
        ib = struct.pack("<I", i)               # Encode i as 32 bit little endian integer

        file.write(b"\x81\xF9" + ib)            # CMP ECX, i

        if i%2 == 0:
            file.write(b"\x75\x03")             # JNE +3
            file.write(b"\xFF\xC0")             # INC EAX
            file.write(b"\xC3")                 # RET
        else:
            file.write(b"\x75\x01")             # JNE +1
            file.write(b"\xC3")                 # RET

    file.write(b"\xC3")                         # Fallback RET

While we somewhat deviated from the original vision of the TikTok post, the essence remains the same. We create a long, long, long list of if-statements to determine if any number is even or odd, ignoring any arithmetic operation that would help out.
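
Since nobody wants to wait for 4 billion entries just to check the jump offsets, here is a minimal sketch that builds the same byte sequence for a small bit width in memory and steps through it with a toy interpreter. Both `build_blob` and `run` are hypothetical helpers, not part of the original program, and the interpreter understands only the five encodings used by the generator above. It is in no way a real x86-64 emulator.

```python
import struct


def build_blob(bits: int) -> bytes:
    """Same byte layout as the generator above, for 2**bits entries, in memory."""
    out = bytearray(b"\x31\xC0")                      # XOR EAX, EAX
    for i in range(2**bits):
        out += b"\x81\xF9" + struct.pack("<I", i)     # CMP ECX, i
        if i % 2 == 0:
            out += b"\x75\x03" + b"\xFF\xC0" + b"\xC3"  # JNE +3; INC EAX; RET
        else:
            out += b"\x75\x01" + b"\xC3"                # JNE +1; RET
    out += b"\xC3"                                    # Fallback RET
    return bytes(out)


def run(blob: bytes, ecx: int) -> int:
    """Step through the blob and return the value of EAX at the first RET."""
    eax = 0
    equal = False  # result of the most recent CMP
    pc = 0
    while True:
        op = blob[pc]
        if op == 0x31:                                # XOR EAX, EAX (2 bytes)
            eax = 0
            pc += 2
        elif op == 0x81:                              # CMP ECX, imm32 (6 bytes)
            (imm,) = struct.unpack_from("<I", blob, pc + 2)
            equal = ecx == imm
            pc += 6
        elif op == 0x75:                              # JNE rel8 (2 bytes)
            rel = blob[pc + 1]
            pc += 2 if equal else 2 + rel
        elif op == 0xFF:                              # INC EAX (2 bytes)
            eax += 1
            pc += 2
        elif op == 0xC3:                              # RET
            return eax
        else:
            raise ValueError(f"unexpected byte {op:#04x} at offset {pc}")


blob = build_blob(4)
print([run(blob, n) for n in range(6)])  # → [1, 0, 1, 0, 1, 0]
```

A return value of 1 means even and 0 means odd, matching the assembly listing. Anything outside the generated range falls through to the fallback RET and comes back as 0.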

Running this gives us a nice 40 GB file which contains all 4.2 billion comparisons needed to determine if any 32 bit number is even or odd! Now we just need to write our host program that can load and use these instructions. For added performance (it’s very important), I decided to map the file into the address space instead of reading all of it. By doing this, we can just pretend that the entire file is already in memory and let the poor OS deal with fitting a 40 GB blob into virtual memory. After mapping the file with READ and EXECUTE permissions we can call into the code by using a function pointer. It looks like this:

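#include <stdio.h>
#include <Windows.h>
#include <stdint.h>

int main(int argc, char* argv[])
{
    uint32_t number = atoi(argv[1]); // No problems here

    // Open code file
    HANDLE binFile = CreateFileA(
                        "isEven.bin",
                        GENERIC_READ | GENERIC_EXECUTE, FILE_SHARE_READ,
                        NULL,
                        OPEN_EXISTING,
                        FILE_ATTRIBUTE_NORMAL,
                        NULL);
   
    // Get 64 bit size of file
    LARGE_INTEGER codeSize;
    GetFileSizeEx(binFile, &codeSize);

    // Create memory map of the file
    HANDLE mapping = CreateFileMapping(
                        binFile,
                        NULL,
                        PAGE_EXECUTE_READ,
                        0,
                        0,
                        NULL);

    // Get a pointer to the code
    LPVOID code = MapViewOfFile(
                    mapping, FILE_MAP_EXECUTE | FILE_MAP_READ,
                    0,
                    0,
                    codeSize.QuadPart);

    // Create a function pointer that points to the code
    int (*isEven)(int) = (int(*)(int))code;

    if (isEven(number))
        printf("even\n");
    else
        printf("odd\n");

    CloseHandle(binFile);
}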

And there we go! We now have everything we need to check if any 32 bit number is even or odd. Let’s take it for a spin:

PS > .\program.exe 300
even
PS > .\program.exe 0
even
PS > .\program.exe 1000000
even
PS > .\program.exe 100000007
odd
PS > .\program.exe 400000000
even
PS > .\program.exe 400000001
odd
PS > .\program.exe 400000006
even
PS > .\program.exe 4200000000
odd <---- WRONG!

Almost! Seems like the algorithm has some issues with signedness, any value over 2^31 seems to give random results. Sad!

Let’s repair the ultimate bug.

It turns out that atoi cannot deal with unsigned pureness, so it failed to parse our big boy numbers. Replacing it with strtoul fixes everything.

uint32_t number = strtoul(argv[1], NULL, 10); // No problems here

PS > .\program.exe 4200000000
even
PS > .\program.exe 4200000001
odd
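
One plausible explanation for the earlier wrong answer, sketched in Python. The C standard leaves the result of atoi undefined when the value does not fit in an int, and the sketch assumes the behavior Microsoft documents for its C runtime, where the result is clamped to INT_MAX or INT_MIN. Under that assumption every input above 2^31 - 1 collapses to 2147483647, which happens to be odd. The helper names are stand-ins invented here, not the real C functions.

```python
INT_MAX = 2**31 - 1
INT_MIN = -(2**31)
ULONG_MAX = 2**32 - 1  # unsigned long is 32 bits wide with MSVC


def atoi_saturating(text: str) -> int:
    """Stand-in for atoi, ASSUMING out-of-range values clamp to INT_MIN/INT_MAX."""
    return max(INT_MIN, min(INT_MAX, int(text)))


def strtoul_like(text: str) -> int:
    """Stand-in for strtoul(text, NULL, 10) on non-negative decimal input."""
    return min(ULONG_MAX, int(text))


for text in ("4200000000", "4200000001"):
    print(text, "->", atoi_saturating(text), "vs", strtoul_like(text))
```

With the clamped value, the generated code faithfully reports 2147483647 as odd, so the bug sits in the parsing and not in the 40 GB of if-statements.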

As a side note, the program is amazingly performant. For small numbers the results are instantaneous, and for large numbers close to the 2^32 limit the result is still returned in around 10 seconds. Considering the computer has to read 40 GB of data from disk, map it to physical memory and then let the CPU have a rip at it without many chances of caching, that is honestly quite mind blowing. For reference, the computer is a Core i5 12600K with 32 GB memory and the files are living on an M.2 SSD. While calculating, the peak read speed I saw from the SSD was around 800 MB/s (which doesn’t really make sense as that should give execution times of 40+ seconds, but computers are magical so who knows what is going on).
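
The numbers above can be checked with a little arithmetic. The sketch below derives the file size from the instruction encodings in the generator and estimates the worst-case read time, assuming MB means 10^6 bytes and that the whole file is read once at the observed peak speed. The names are made up for illustration.

```python
ENTRIES = 2**32
EVEN_BYTES = 6 + 2 + 2 + 1      # CMP ECX, imm32; JNE; INC EAX; RET
ODD_BYTES = 6 + 2 + 1           # CMP ECX, imm32; JNE; RET
PROLOGUE_AND_FALLBACK = 2 + 1   # XOR EAX, EAX at the start, RET at the end


def blob_size() -> int:
    """Size in bytes of isEven.bin: half the entries are even, half are odd."""
    return PROLOGUE_AND_FALLBACK + (ENTRIES // 2) * (EVEN_BYTES + ODD_BYTES)


def seconds_to_read(size_bytes: int, bytes_per_second: float) -> float:
    """Time to stream size_bytes once at a constant rate."""
    return size_bytes / bytes_per_second


size = blob_size()
print(f"file size: {size:,} bytes ({size / 2**30:.2f} GiB)")
print(f"read time at 800 MB/s: {seconds_to_read(size, 800e6):.1f} s")
print(f"throughput needed for 10 s: {size / 10 / 1e9:.2f} GB/s")
```

So a full pass at 800 MB/s should take closer to a minute, and finishing in 10 seconds would need several GB/s, which is why the observed peak read speed looks too low.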

And there we have it! The Internet proved wrong once again: not only can you actually write a fully functioning and performant program in the manner of the TikTok post, but it’s also very fun.
