Now Reading
Malloc tutorial

Malloc tutorial

2023-11-25 19:28:57

Let’s write a malloc and see the way it works with current packages!

This tutorial goes to imagine that you understand what pointers are, and that you understand sufficient C to know that *ptr dereferences a pointer, ptr->foo means (*ptr).foo, that malloc is used to dynamically allocate space, and that you simply’re accustomed to the idea of a linked record. For a fundamental intro to C, Pointers on C is one in all my favourite books. If you wish to have a look at all of this code without delay, it is out there here.

Preliminaries apart, malloc’s operate signature is

void *malloc(size_t measurement);

It takes as enter numerous bytes and returns a pointer to a block of reminiscence of that measurement.

There are a selection of how we will implement this. We will arbitrarily select to make use of sbrk. The OS reserves stack and heap house for processes and sbrk lets us manipulate the heap. sbrk(0) returns a pointer to the present high of the heap. sbrk(foo) increments the heap measurement by foo and returns a pointer to the earlier high of the heap.

Diagram of linux memory layout, courtesy of Gustavo Duarte.

If we need to implement a extremely easy malloc, we will do one thing like

#embody <assert.h>
#embody <string.h>
#embody <sys/sorts.h>
#embody <unistd.h>


void *malloc(size_t measurement) {
  void *p = sbrk(0);
  void *request = sbrk(measurement);
  if (request == (void*) -1) {
    return NULL; // sbrk failed.
  } else {
    assert(p == request); // Not thread protected.
    return p;
  }
}

When a program asks malloc for house, malloc asks sbrk to increment the heap measurement and returns a pointer to the beginning of the brand new area on the heap. That is lacking a technicality, that malloc(0) ought to both return NULL or one other pointer that may be handed to free with out inflicting havoc, however it mainly works.

However talking of free, how does free work? Free’s prototype is

void free(void *ptr);

When free is handed a pointer that was beforehand returned from malloc, it is presupposed to free the house. However given a pointer to one thing allotted by our malloc, we do not know what measurement block is related to it. The place can we retailer that? If we had a working malloc, we might malloc some house and retailer it there, however we will run into hassle if we have to name malloc to order house every time we name malloc to order house.

A typical trick to work round that is to retailer meta-information a few reminiscence area in some house that we squirrel away slightly below the pointer that we return. Say the highest of the heap is at present at 0x1000 and we ask for 0x400 bytes. Our present malloc will request 0x400 bytes from sbrk and return a pointer to 0x1000. If we as an alternative save, say, 0x10 bytes to retailer details about the block, our malloc would request 0x410 bytes from sbrk and return a pointer to 0x1010, hiding our 0x10 byte block of meta-information from the code that is calling malloc.

That lets us free a block, however then what? The heap area we get from the OS must be contiguous, so we will not return a block of reminiscence within the center to the OS. Even when we have been keen to repeat every little thing above the newly freed area right down to fill the outlet, so we might return house on the finish, there isn’t any solution to notify all the code with tips to the heap that these pointers should be adjusted.

As an alternative, we will mark that the block has been freed with out returning it to the OS, in order that future calls to malloc can use re-use the block. However to do this we’ll want be capable to entry the meta info for every block. There are numerous attainable options to that. We’ll arbitrarily select to make use of a single linked record for simplicity.

So, for every block, we’ll need to have one thing like

struct block_meta {
  size_t measurement;
  struct block_meta *subsequent;
  int free;
  int magic; // For debugging solely. TODO: take away this in non-debug mode.
};

#outline META_SIZE sizeof(struct block_meta)

We have to know the scale of the block, whether or not or not it is free, and what the subsequent block is. There is a magic quantity right here for debugging functions, however it’s probably not needed; we’ll set it to arbitrary values, which can allow us to simply see which code modified the struct final.

We’ll additionally want a head for our linked record:

void *global_base = NULL;

For our malloc, we’ll need to re-use free house if attainable, allocating house after we cannot re-use current house. On condition that we now have this linked record construction, checking if we now have a free block and returning it’s simple. After we get a request of some measurement, we iterate via our linked record to see if there is a free block that is giant sufficient.

struct block_meta *find_free_block(struct block_meta **final, size_t measurement) {
  struct block_meta *present = global_base;
  whereas (present && !(current->free && current->measurement >= measurement)) {
    *final = present;
    present = current->subsequent;
  }
  return present;
}

If we do not discover a free block, we’ll need to request house from the OS utilizing sbrk and add our new block to the top of the linked record.

struct block_meta *request_space(struct block_meta* final, size_t measurement) {
  struct block_meta *block;
  block = sbrk(0);
  void *request = sbrk(measurement + META_SIZE);
  assert((void*)block == request); // Not thread protected.
  if (request == (void*) -1) {
    return NULL; // sbrk failed.
  }

  if (final) { // NULL on first request.
    last->subsequent = block;
  }
  block->measurement = measurement;
  block->subsequent = NULL;
  block->free = 0;
  block->magic = 0x12345678;
  return block;
}

As with our unique implementation, we request house utilizing sbrk. However we add a bit of additional house to retailer our struct, after which set the fields of the struct appropriately.

Now that we now have helper capabilities to test if we now have current free house and to request house, our malloc is easy. If our international base pointer is NULL, we have to request house and set the bottom pointer to our new block. If it isn’t NULL, we test to see if we will re-use any current house. If we will, then we do; if we will not, then we request house and use the brand new house.

void *malloc(size_t measurement) {
  struct block_meta *block;
  // TODO: align measurement?

  if (measurement <= 0) {
    return NULL;
  }

  if (!global_base) { // First name.
    block = request_space(NULL, measurement);
    if (!block) {
      return NULL;
    }
    global_base = block;
  } else {
    struct block_meta *final = global_base;
    block = find_free_block(&final, measurement);
    if (!block) { // Failed to search out free block.
      block = request_space(final, measurement);
      if (!block) {
        return NULL;
      }
    } else {      // Discovered free block
      // TODO: take into account splitting block right here.
      block->free = 0;
      block->magic = 0x77777777;
    }
  }

  return(block+1);
}

For anybody who is not accustomed to C, we return block+1 as a result of we need to return a pointer to the area after block_meta. Since block is a pointer of sort struct block_meta, +1 increments the handle by one sizeof(struct block_meta).

If we simply needed a malloc with no free, we might have used our unique, a lot easier malloc. So let’s write free! The principle factor free must do is about ->free.

As a result of we’ll must get the handle of our struct in a number of locations in our code, let’s outline this operate.

struct block_meta *get_block_ptr(void *ptr) {
  return (struct block_meta*)ptr - 1;
}

Now that we now have that, here is free:

void free(void *ptr) {
  if (!ptr) {
    return;
  }

  // TODO: take into account merging blocks as soon as splitting blocks is carried out.
  struct block_meta* block_ptr = get_block_ptr(ptr);
  assert(block_ptr->free == 0);
  assert(block_ptr->magic == 0x77777777 || block_ptr->magic == 0x12345678);
  block_ptr->free = 1;
  block_ptr->magic = 0x55555555;
}

Along with setting ->free, it is legitimate to name free with a NULL ptr, so we have to test for NULL. Since free should not be known as on arbitrary addresses or on blocks which might be already freed, we will assert that these issues by no means occur.

You by no means really want to say something, however it typically makes debugging quite a bit simpler. In truth, after I wrote this code, I had a bug that may have resulted in silent knowledge corruption if these asserts weren’t there. As an alternative, the code failed on the assert, which make it trivial to debug.

Now that we have got malloc and free, we will write packages utilizing our customized reminiscence allocator! However earlier than we will drop our allocator into current code, we’ll must implement a pair extra widespread capabilities, realloc and calloc. Calloc is simply malloc that initializes the reminiscence to 0, so let us take a look at realloc first. Realloc is meant to regulate the scale of a block of reminiscence that we have gotten from malloc, calloc, or realloc.

Realloc’s operate prototype is

void *realloc(void *ptr, size_t measurement)

If we move realloc a NULL pointer, it is presupposed to act similar to malloc. If we move it a beforehand malloced pointer, it ought to release house if the scale is smaller than the earlier measurement, and allocate more room and replica the prevailing knowledge over if the scale is bigger than the earlier measurement.

All the pieces will nonetheless work if we do not resize when the scale is decreased and we do not free something, however we completely need to allocate more room if the scale is elevated, so let’s begin with that.

void *realloc(void *ptr, size_t measurement) {
  if (!ptr) {
    // NULL ptr. realloc ought to act like malloc.
    return malloc(measurement);
  }

  struct block_meta* block_ptr = get_block_ptr(ptr);
  if (block_ptr->measurement >= measurement) {
    // Now we have sufficient house. May free some as soon as we implement cut up.
    return ptr;
  }

  // Want to essentially realloc. Malloc new house and free previous house.
  // Then copy previous knowledge to new house.
  void *new_ptr;
  new_ptr = malloc(measurement);
  if (!new_ptr) {
    return NULL; // TODO: set errno on failure.
  }
  memcpy(new_ptr, ptr, block_ptr->measurement);
  free(ptr);
  return new_ptr;
}

And now for calloc, which simply clears the reminiscence earlier than returning a pointer.

void *calloc(size_t nelem, size_t elsize) {
  size_t measurement = nelem * elsize; // TODO: test for overflow.
  void *ptr = malloc(measurement);
  memset(ptr, 0, measurement);
  return ptr;
}

Be aware that this does not test for overflow in nelem * elsize, which is definitely required by the spec. All the code right here is simply sufficient to get one thing that kinda sorta works.

Now that we now have one thing that kinda works, we will use our with current packages (and we do not even must recompile the packages)!

First, we have to compile our code. On linux, one thing like

clang -O0 -g -W -Wall -Wextra -shared -fPIC malloc.c -o malloc.so

ought to work.

-g provides debug symbols, so we will have a look at our code with gdb or lldb. -O0 will assist with debugging, by stopping particular person variables from getting optimized out. -W -Wall -Wextra provides further warnings. -shared -fPIC will allow us to dynamically hyperlink our code, which is what lets us use our code with existing binaries!

On macs, we’ll need one thing like

clang -O0 -g -W -Wall -Wextra -dynamiclib malloc.c -o malloc.dylib

Be aware that sbrk is deprecated on current variations of OS X. Apple makes use of an unorthodox definition of deprecated — some deprecated syscalls are badly damaged. I did not actually take a look at this on a Mac, so it is attainable that this may trigger bizarre failures or or simply not work on a mac.

Now, to make use of get a binary to make use of our malloc on linux, we’ll must set the LD_PRELOAD atmosphere variable. For those who’re utilizing bash, you are able to do that with

export LD_PRELOAD=/absolute/path/right here/malloc.so

For those who’ve received a mac, you may need

export DYLD_INSERT_LIBRARIES=/absolute/path/right here/malloc.so

If every little thing works, you may run some arbitrary binary and it’ll run as regular (besides that will probably be a bit slower).

$ ls
Makefile  malloc.c  malloc.so  README.md  take a look at  test-0  test-1  test-2  test-3  test-4

If there is a bug, you would possibly get one thing like

$ ls
Segmentation fault (core dumped)

Debugging

Let’s discuss debugging! For those who’re accustomed to utilizing a debugger to set breakpoints, examine reminiscence, and step via code, you may skip this part and go straight to the exercises.

This part assumes you may work out learn how to set up gdb in your system. For those who’re on a mac, chances are you’ll need to simply use lldb and translate the instructions appropriately. Since I do not know what bugs you would possibly run into, I’ll introduce a few bugs and present how I would observe them down.

First, we have to work out learn how to run gdb with out having it segfault. If ls segfaults, and we attempt to run gdb ls, gdb is nearly definitely going to segfault, too. We might write a wrapper to do that, however gdb additionally helps this. If we begin gdb after which run set atmosphere LD_PRELOAD=./malloc.so earlier than working this system, LD_PRELOAD will work as regular.

$ gdb /bin/ls
(gdb) set atmosphere LD_PRELOAD=./malloc.so
(gdb) run
Program obtained sign SIGSEGV, Segmentation fault.
0x00007ffff7bd7dbd in free (ptr=0x0) at malloc.c:113
113       assert(block_ptr->free == 0);

As anticipated, we get a segfault. We are able to go searching with record to see the code close to the segfault.

(gdb) record
108     }
109
110     void free(void *ptr) 

After which we will use p (for print) to see what is going on on with the variables right here:

(gdb) p ptr
$6 = (void *) 0x0
(gdb) p block_ptr
$7 = (struct block_meta *) 0xffffffffffffffe8

ptr is 0, i.e., NULL, which is the reason for the issue: we forgot to test for NULL.

See Also

Now that we have figured that out, let’s strive a barely tougher bug. To illustrate that we determined to exchange our struct with

struct block_meta {
  size_t measurement;
  struct block_meta *subsequent;
  int free;
  int magic;    // For debugging solely. TODO: take away this in non-debug mode.
  char knowledge[1];
};

after which return block->knowledge as an alternative of block+1 from malloc, with no different modifications. This appears fairly much like what we’re already doing — we simply outline a member that factors to the top of the struct, and return a pointer to that.

However here is what occurs if we attempt to use our new malloc:

$ /bin/ls
Segmentation fault (core dumped)
gdb /bin/ls
(gdb) set atmosphere LD_PRELOAD=./malloc.so
(gdb) run

Program obtained sign SIGSEGV, Segmentation fault.
_IO_vfprintf_internal (s=s@entry=0x7fffff7ff5f0, format=format@entry=0x7ffff7567370 "%spercentspercents:%u: %spercentsAssertion `%s' failed.npercentn", ap=ap@entry=0x7fffff7ff718) at vfprintf.c:1332
1332    vfprintf.c: No such file or listing.
1327    in vfprintf.c

This is not as good as our final error — we will see that one in all our asserts failed, however gdb drops us into some print operate that is being known as when the assert fails. However that print operate makes use of our buggy malloc and blows up!

One factor we might do from right here can be to examine ap to see what assert was making an attempt to print:

(gdb) p *ap
$4 = {gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7fffff7ff7f0, reg_save_area = 0x7fffff7ff730}

That might work nice; we might poke round till we work out what’s presupposed to get printed and work out the fail that approach. Another options can be to put in writing our personal customized assert or to make use of the suitable hooks to forestall assert from utilizing our malloc.

However on this case, we all know there are just a few asserts in our code. The one in malloc checking that we do not attempt to use this in a multithreaded program and the 2 in free checking that we’re not releasing one thing we should not. Let us take a look at free first, by setting a breakpoint.

$ gdb /bin/ls
(gdb) set atmosphere LD_PRELOAD=./malloc.so
(gdb) break away
Breakpoint 1 at 0x400530
(gdb) run /bin/ls

Breakpoint 1, free (ptr=0x61c270) at malloc.c:112
112       if (!ptr) {

block_ptr is not set but, but when we use s just a few occasions to step ahead to after it is set, we will see what the worth is:

(gdb) s
(gdb) s
(gdb) s
free (ptr=0x61c270) at malloc.c:118
118       assert(block_ptr->free == 0);
(gdb) p/x *block_ptr
$11 = {measurement = 0, subsequent = 0x78, free = 0, magic = 0, knowledge = ""}

I am utilizing p/x as an alternative of p so we will see it in hex. The magic subject is 0, which must be not possible for a sound struct that we’re making an attempt to free. Perhaps get_block_ptr is returning a foul offset? Now we have ptr out there to us, so we will simply examine totally different offsets. Since it is a void *, we’ll need to solid it in order that gdb is aware of learn how to interpret the outcomes.

(gdb) p sizeof(struct block_meta)
$12 = 32
(gdb) p/x *(struct block_meta*)(ptr-32)
$13 = {measurement = 0x0, subsequent = 0x78, free = 0x0, magic = 0x0, knowledge = {0x0}}
(gdb) p/x *(struct block_meta*)(ptr-28)
$14 = {measurement = 0x7800000000, subsequent = 0x0, free = 0x0, magic = 0x0, knowledge = {0x78}}
(gdb) p/x *(struct block_meta*)(ptr-24)
$15 = {measurement = 0x78, subsequent = 0x0, free = 0x0, magic = 0x12345678, knowledge = {0x6e}}

If we again off a bit from the handle we’re utilizing, we will see that the right offset is 24 and never 32. What’s taking place right here is that structs get padded, in order that sizeof(struct block_meta) is 32, though the final legitimate member is at 24. If we need to reduce out that further house, we have to repair get_block_ptr.

That is it for debugging!

Workouts

Personally, this type of factor by no means sticks with me until I work via some workouts, so I will go away a pair workouts right here for anybody who’s .

  1. malloc is meant to return a pointer “which is suitably aligned for any built-in sort”. Does our malloc try this? If that’s the case, why? If not, repair the alignment. Be aware that “any built-in sort” is mainly as much as 8 bytes for C as a result of SSE/AVX sorts aren’t built-in sorts.

  2. Our malloc is admittedly wasteful if we attempt to re-use an current block and we do not want all the house. Implement a operate that may cut up up blocks in order that they use the minimal quantity of house needed

  3. After doing 2, if we name malloc and free a lot of occasions with random sizes, we’ll find yourself with a bunch of small blocks that may solely be re-used after we ask for small quantities of house. Implement a mechanism to merge adjoining free blocks collectively in order that any consecutive free blocks will get merged right into a single block.

  4. Discover bugs within the current code! I have never examined this a lot, so I am certain there are bugs, even when this mainly kinda sorta works.

Sources

I learn this tutorial by Marwan Burelle earlier than sitting down and making an attempt to put in writing my very own implementation, so it is fairly comparable. The principle implementation variations are that my model is easier, however extra susceptible to reminiscence fragmentation. When it comes to exposition, my model is much more informal. In order for you one thing a bit extra formal, Dr. Burelle has you lined.

For extra on how Linux offers with reminiscence administration, see this post by Gustavo Duarte.

For extra on how real-world malloc implementations work, dlmalloc and tcmalloc are each nice studying. I have never learn the code for jemalloc, and I’ve heard that it is a bit extra extra obscure, however it’s additionally probably the most broadly used high-performance malloc implementation round.

For assist debugging, Address Sanitizer is superb. If you wish to write a thread-safe model, Thread Sanitizer can be a fantastic software.

There is a Spanish translation of this post here due to Matias Garcia Isaia.

Acknowledgements

Due to Gustavo Duarte for letting me use one in all his photos for instance sbrk, and to Ian Whitlock, Danielle Sucher, Nathan Kurz, “tedu”, @chozu@fedi.absturztau.be, and David Farrel for feedback/corrections/dialogue. Please let me know in the event you discover different bugs on this submit (whether or not they’re within the writing or the code).



Source Link

What's Your Reaction?
Excited
0
Happy
0
In Love
0
Not Sure
0
Silly
0
View Comments (0)

Leave a Reply

Your email address will not be published.

2022 Blinking Robots.
WordPress by Doejo

Scroll To Top