Now Reading
Understanding glibc malloc – sploitF-U-N

Understanding glibc malloc – sploitF-U-N

2023-03-24 00:53:18

I at all times received fascinated by heap reminiscence. Questions resembling

How heap reminiscence is obtained from kernel?
How effectively reminiscence is managed?
Is it managed by kernel or by library or by utility itself?
Can heap reminiscence be exploited?

have been in my thoughts for fairly a while. However solely just lately I received time to know about it. So right here I want to share my fascination turned information!! On the market within the wild, many reminiscence allocators can be found:

  • dlmalloc – Normal objective allocator
  • ptmalloc2 – glibc
  • jemalloc – FreeBSD and Firefox
  • tcmalloc – Google
  • libumem – Solaris

Each reminiscence allocator claims they’re quick, scalable and reminiscence environment friendly!! However not all allocators could be suited properly for our utility. Reminiscence hungry utility’s efficiency largely depends upon reminiscence allocator efficiency. On this publish, I’ll solely speak about ‘glibc malloc’ reminiscence allocator. In future, I hope to cowl up different reminiscence allocators. All through this publish, for higher understanding of ‘glibc malloc’, I’ll hyperlink its recent supply code. So buckle up, lets get began with glibc malloc!!

Historical pastptmalloc2 was forked from dlmalloc. After fork, threading assist was added to it and received launched in 2006. After its official launch, ptmalloc2 received built-in into glibc supply code. As soon as its integration, code modifications have been made on to glibc malloc supply code itself. Therefore there might be lot of modifications between ptmalloc2 and glibc’s malloc implementation.

System Calls: As seen in this publish malloc internally invokes both brk or mmap syscall.

Threading: Throughout early days of linux, dlmalloc was used because the default reminiscence allocator. However later as a result of ptmalloc2’s threading assist, it grew to become the default reminiscence allocator for linux. Threading assist helps in enhancing reminiscence allocator efficiency and therefore utility efficiency. In dlmalloc when two threads name malloc on the identical time ONLY one thread can enter the important part, since freelist knowledge construction is shared amongst all of the obtainable threads. Therefore reminiscence allocation takes time in multi threaded purposes, leading to efficiency degradation. Whereas in ptmalloc2, when two threads name malloc on the identical time reminiscence is allotted instantly since every thread maintains a separate heap phase and therefore freelist knowledge constructions sustaining these heaps are additionally separate. This act of sustaining separate heap and freelist knowledge constructions for every thread is known as per thread enviornment.

Instance:

/* Per thread enviornment instance. */
#embody <stdio.h>
#embody <stdlib.h>
#embody <pthread.h>
#embody <unistd.h>
#embody <sys/varieties.h>

void* threadFunc(void* arg) {
        printf("Earlier than malloc in thread 1n");
        getchar();
        char* addr = (char*) malloc(1000);
        printf("After malloc and earlier than free in thread 1n");
        getchar();
        free(addr);
        printf("After free in thread 1n");
        getchar();
}

int fundamental() {
        pthread_t t1;
        void* s;
        int ret;
        char* addr;

        printf("Welcome to per thread enviornment instance::%dn",getpid());
        printf("Earlier than malloc in fundamental threadn");
        getchar();
        addr = (char*) malloc(1000);
        printf("After malloc and earlier than free in fundamental threadn");
        getchar();
        free(addr);
        printf("After free in fundamental threadn");
        getchar();
        ret = pthread_create(&t1, NULL, threadFunc, NULL);
        if(ret)
        {
                printf("Thread creation errorn");
                return -1;
        }
        ret = pthread_join(t1, &s);
        if(ret)
        {
                printf("Thread be a part of errorn");
                return -1;
        }
        return 0;
}

Output Evaluation:

Earlier than malloc in fundamental thread: Within the beneath output we will see that there’s NO heap phase but and no per thread stack too since thread1 will not be but created.

sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ ./mthread 
Welcome to per thread enviornment instance::6501
Earlier than malloc in fundamental thread
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ cat /proc/6501/maps
08048000-08049000 r-xp 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
08049000-0804a000 r--p 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804a000-0804b000 rw-p 00001000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
b7e05000-b7e07000 rw-p 00000000 00:00 0 
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$

After malloc in fundamental thread: Within the beneath output we will see that heap phase is created and its lies simply above the information phase (0804b000-0806c000), this exhibits heap reminiscence is created by growing program break location ( ie) utilizing brk syscall). Additionally do notice that eventhough person requested solely 1000 bytes, heap reminiscence of dimension 132 KB is created. This contiguous area of heap reminiscence is known as enviornment. Since this enviornment is created by fundamental thread its known as main arena. Additional allocation requests retains utilizing this enviornment till it runs out of free area. When arena runs out of free space, it might grow by growing program break location (After rising high chunk’s size is adjusted to incorporate the additional area). Equally enviornment may also shrink when there may be lot of free area on high chunk.

NOTE: Prime chunk is the highest most chunk of an enviornment. For additional particulars about it, see “Prime Chunk” part beneath.

sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ ./mthread 
Welcome to per thread enviornment instance::6501
Earlier than malloc in fundamental thread
After malloc and earlier than free in fundamental thread
...
sploitfun@sploitfun-VirtualBox:~/lsploits/hof/ptmalloc.ppt/mthread$ cat /proc/6501/maps
08048000-08049000 r-xp 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
08049000-0804a000 r--p 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804a000-0804b000 rw-p 00001000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804b000-0806c000 rw-p 00000000 00:00 0          [heap]
b7e05000-b7e07000 rw-p 00000000 00:00 0 
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$

After free in fundamental thread: Within the beneath output we will see that when allotted reminiscence area is freed, reminiscence behind it doesnt get launched to the working system instantly. Allotted reminiscence area (of dimension 1000 bytes) is released solely to ‘glibc malloc’ library, which provides this freed block to fundamental arenas bin (In glibc malloc, freelist datastructures are referred as bins). Later when person requests reminiscence, ‘glibc malloc’ doesnt get new heap reminiscence from kernel, as an alternative it would attempt to find a free block in bin. And solely when no free block exists, it obtains reminiscence from kernel.

sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ ./mthread 
Welcome to per thread enviornment instance::6501
Earlier than malloc in fundamental thread
After malloc and earlier than free in fundamental thread
After free in fundamental thread
...
sploitfun@sploitfun-VirtualBox:~/lsploits/hof/ptmalloc.ppt/mthread$ cat /proc/6501/maps
08048000-08049000 r-xp 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
08049000-0804a000 r--p 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804a000-0804b000 rw-p 00001000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804b000-0806c000 rw-p 00000000 00:00 0          [heap]
b7e05000-b7e07000 rw-p 00000000 00:00 0 
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$

Earlier than malloc in thread1: Within the beneath output we will see that there’s NO thread1 heap phase however now thread1’s per thread stack is created.

sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ ./mthread 
Welcome to per thread enviornment instance::6501
Earlier than malloc in fundamental thread
After malloc and earlier than free in fundamental thread
After free in fundamental thread
Earlier than malloc in thread 1
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ cat /proc/6501/maps
08048000-08049000 r-xp 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
08049000-0804a000 r--p 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804a000-0804b000 rw-p 00001000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804b000-0806c000 rw-p 00000000 00:00 0          [heap]
b7604000-b7605000 ---p 00000000 00:00 0 
b7605000-b7e07000 rw-p 00000000 00:00 0          [stack:6594]
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$

After malloc in thread1: Within the beneath output we will see that thread1’s heap phase is created. And its lies in reminiscence mapping phase area (b7500000-b7521000 whose dimension is 132 KB) and therefore this exhibits heap reminiscence is created utilizing mmap syscall in contrast to fundamental thread (which makes use of sbrk). Once more right here, eventhough person requested solely 1000 bytes, heap reminiscence of dimension 1 MB is mapped to course of deal with area. Out of those 1 MB, just for 132KB read-write permission is about and this turns into the heap reminiscence for this thread. This contiguous area of reminiscence (132 KB) is known as thread arena.

NOTE: When person request dimension is greater than 128 KB ( let’s imagine malloc(132*1024)) and when there may be not sufficient area in an enviornment to fulfill person request, reminiscence is allotted utilizing mmap syscall (and NOT utilizing sbrk) regardless of whether or not a request is created from fundamental enviornment or thread enviornment.

sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ ./mthread 
Welcome to per thread enviornment instance::6501
Earlier than malloc in fundamental thread
After malloc and earlier than free in fundamental thread
After free in fundamental thread
Earlier than malloc in thread 1
After malloc and earlier than free in thread 1
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ cat /proc/6501/maps
08048000-08049000 r-xp 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
08049000-0804a000 r--p 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804a000-0804b000 rw-p 00001000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804b000-0806c000 rw-p 00000000 00:00 0          [heap]
b7500000-b7521000 rw-p 00000000 00:00 0 
b7521000-b7600000 ---p 00000000 00:00 0 
b7604000-b7605000 ---p 00000000 00:00 0 
b7605000-b7e07000 rw-p 00000000 00:00 0          [stack:6594]
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$

After free in thread1: Within the beneath output we will see that liberating allotted reminiscence area doesnt launch heap reminiscence to the working system. As an alternative allotted reminiscence area (of dimension 1000 bytes) is released to ‘glibc malloc’, which provides this freed block to its thread arenas bin.

sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ ./mthread 
Welcome to per thread enviornment instance::6501
Earlier than malloc in fundamental thread
After malloc and earlier than free in fundamental thread
After free in fundamental thread
Earlier than malloc in thread 1
After malloc and earlier than free in thread 1
After free in thread 1
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$ cat /proc/6501/maps
08048000-08049000 r-xp 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
08049000-0804a000 r--p 00000000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804a000-0804b000 rw-p 00001000 08:01 539625     /house/sploitfun/ptmalloc.ppt/mthread/mthread
0804b000-0806c000 rw-p 00000000 00:00 0          [heap]
b7500000-b7521000 rw-p 00000000 00:00 0 
b7521000-b7600000 ---p 00000000 00:00 0 
b7604000-b7605000 ---p 00000000 00:00 0 
b7605000-b7e07000 rw-p 00000000 00:00 0          [stack:6594]
...
sploitfun@sploitfun-VirtualBox:~/ptmalloc.ppt/mthread$

Area:

Variety of enviornment’s: In above instance, we noticed fundamental thread incorporates fundamental enviornment and thread 1 incorporates its personal thread enviornment. So can there be a one to 1 mapping between threads and enviornment, regardless of variety of threads? Actually not. An insane utility can comprise extra variety of threads (than variety of cores), in such a case, having one enviornment per thread turns into bit costly and ineffective. Therefore because of this, utility’s arena limit is based on number of cores current within the system.

For 32 bit programs:
     Variety of enviornment = 2 * variety of cores.
For 64 bit programs:
     Variety of enviornment = 8 * variety of cores.

A number of Area:

Instance: Shall we say a multithreaded utility (4 threads – Essential thread + 3 person threads) runs on a 32 bit system which incorporates 1 core. Right here no of threads (4) > 2*no of cores (2). Therefore in such a case, ‘glibc malloc’ makes positive that a number of arenas are shared amongst all obtainable threads. However how its shared?

  • When fundamental thread, calls malloc for the primary time already created main arena is used with none competition.
  • When thread 1 and thread 2 calls malloc for the primary time, a new arena is created for them and its used with none competition. Till this level threads and enviornment have one-to-one mapping.
  • When thread 3 calls malloc for the primary time, variety of enviornment restrict is calculated. Right here enviornment restrict is crossed, therefore strive reusing current enviornment’s (Essential enviornment or Area 1 or Area 2)
  •  Reuse:
    • Once loop over the obtainable arenas, whereas looping try to lock that enviornment.
    • If locked efficiently (let’s imagine fundamental enviornment is locked efficiently), return that enviornment to the person.
    • If no enviornment is discovered free, block for the sector subsequent in line.
  • Now when thread 3 calls malloc (second time), malloc will try to use last accessed arena (fundamental enviornment). If fundamental enviornment is free its used else thread3 is blocked till fundamental enviornment will get freed. Thus now fundamental enviornment is shared amongst fundamental thread and thread 3.

A number of Heaps:

Primarily beneath three datastructures are present in ‘glibc malloc’ supply code:

heap_info – Heap Header – A single thread enviornment can have a number of heaps. Every heap has its personal header. Why a number of heaps wanted? To start with each thread enviornment incorporates ONLY one heap, however when this heap phase runs out of area, new heap (non contiguous area) will get mmap’d to this enviornment.
malloc_state – Area Header – A single thread enviornment can have a number of heaps, however for all these heaps solely a single enviornment header exists. Area header incorporates details about bins, high chunk, final the rest chunk…
malloc_chunk – Chunk Header – A heap is split into many chunks primarily based on person requests. Every of these chunks has its personal chunk header.

NOTE:

  • Essential enviornment dont have a number of heaps and therefore no heap_info construction. When fundamental enviornment runs out of area, sbrk’d heap phase is prolonged (contiguous area) till it bumps into reminiscence mapping phase.
  • In contrast to thread enviornment, fundamental enviornment’s enviornment header isnt a part of sbrk’d heap phase. Its a global variable and therefore its present in libc.so’s knowledge phase.

Pictorial view of fundamental enviornment and thread enviornment (single heap phase):

Pictorial view of thread enviornment (a number of heap phase’s):

Chunk: A piece discovered inside a heap phase could be one of many beneath varieties:

  • Allotted chunk
  • Free chunk
  • Prime chunk
  • Final The rest chunk

Allotted chunk:

prev_size: If the earlier chunk is free, this area incorporates the scale of earlier chunk. Else if earlier chunk is allotted, this area incorporates earlier chunk’s person knowledge.
size: This area incorporates the scale of this allotted chunk. Final 3 bits of this area incorporates flag data.

  • PREV_INUSE (P) – This bit is about when earlier chunk is allotted.
  • IS_MMAPPED (M) – This bit is about when chunk is mmap’d.
  • NON_MAIN_ARENA (N) – This bit is about when this chunk belongs to a thread enviornment.

NOTE:

  • Different fields of malloc_chunk (like fd, bk) is NOT used for allotted chunk. Therefore instead of these fields person knowledge is saved.
  • Consumer requested dimension is converted into usable dimension (inner illustration dimension) since some additional area is required for storing malloc_chunk and likewise for alignment functions. Conversion takes place in such a means that final 3 bits of usable dimension isn’t set and therefore its used for storing flag data.

Free Chunk:

prev_size: No two free chunks could be adjoining collectively. When each the chunks are free, its will get mixed into one single free chunk. Therefore at all times earlier chunk to this freed chunk could be allotted and subsequently prev_size incorporates earlier chunk’s person knowledge.
size: This area incorporates the scale of this free chunk.
fd: Ahead pointer – Factors to subsequent chunk in the identical bin (and NOT to the following chunk current in bodily reminiscence).
bk: Backward pointer – Factors to earlier chunk in the identical bin (and NOT to the earlier chunk current in bodily reminiscence).

See Also

Bins: Bins are the freelist datastructures. They are used to carry free chunks. Primarily based on chunk sizes, totally different bins can be found:

  • Quick bin
  • Unsorted bin
  • Small bin
  • Massive bin

Datastructures used to carry these bins are:

fastbinsY: This array maintain quick bins.
bins: This array maintain unsorted, small and enormous bins. Completely there are 126 bins and its divided as follows:

  • Bin 1 – Unsorted bin
  • Bin 2 to Bin 63 – Small bin
  • Bin 64 to Bin 126 – Massive bin

Quick Bin: Chunks of dimension 16 to 80 bytes is known as a quick chunk. Bins holding quick chunks are known as quick bins. Amongst all of the bins, quick bins are quicker in reminiscence allocation and deallocation.

  • Variety of bins – 10
    • Every quick bin incorporates a single linked record (a.okay.a binlist) of free chunks. Single linked record is used since in quick bins chunks aren’t faraway from the center of the record. Each addition and deletion occurs on the entrance finish of the record – LIFO.
  • Chunk dimension – 8 bytes apart
    • Quick bins comprise a binlist of chunks whose sizes are 8 bytes aside. ie) First quick bin (index 0) incorporates binlist of chunks of dimension 16 bytes, second quick bin (index 1) incorporates binlist of chunks of dimension  24 bytes and so forth…
    • Chunks inside a selected quick bin are of identical sizes.
  • Throughout malloc initialization, most quick bin dimension is set to 64 (!80) bytes. Therefore by default chunks of dimension 16 to 64 is categorized as quick chunks.
  • No Coalescing – Two chunks that are free could be adjoining to one another, it doesnt get mixed into single free chunk. No coalescing might lead to exterior fragmentation however it accelerates free!!
  • malloc(quick chunk) –
  • free(quick chunk) –

Unsorted Bin: When small or massive chunk will get freed as an alternative of including them in to their respective bins, its will get added into unsorted bin. This strategy provides ‘glibc malloc’ a second likelihood to reuse the just lately freed chunks. Therefore reminiscence allocation and deallocation accelerates a bit (due to unsorted bin) since time taken to search for applicable bin is eradicated.

  • Variety of bins – 1
    • Unsorted bin incorporates a round double linked record (a.okay.a binlist) of free chunks.
  • Chunk dimension – There isn’t any dimension restriction, chunks of any dimension belongs to this bin.

Small Bin: Chunks of dimension less than 512 bytes is known as as small chunk. Bins holding small chunks are known as small bins. Small bins are quicker than massive bins (however slower than quick bins) in reminiscence allocation and deallocation.

  • Variety of bins – 62
    • Every small bin incorporates a round double linked record (a.okay.a binlist) of free chunks. Double linked record is used since in small bins chunks are unlinked from the center of the record. Right here addition occurs on the entrance finish and deletion occurs on the rear finish of the record – FIFO.
  • Chunk Measurement – 8 bytes apart
    • Small bin incorporates a binlist of chunks whose sizes are 8 bytes aside. ie) First Small bin (Bin 2) incorporates binlist of chunks of dimension 16 bytes, second small bin (Bin 3) incorporates binlist of chunks of dimension 24 bytes and so forth…
    • Chunks inside a small bin are of identical sizes and therefore it doesnt must be sorted.
  • Coalescing – Two chunks that are free cant be adjoining to one another, it will get combined into single free chunk. Coalescing eliminates exterior fragmentation however it slows up free!!
  • malloc(small chunk) –
    • Initially all small bins could be NULL and therefore eventhough person requested a small chunk, as an alternative of small bin codeunsorted bin code tries to service it.
    • Additionally through the first name to malloc, small bin and enormous bin datastructures (bins) present in malloc_state is initialized ie) bins would point to itself signifying they’re empty.
    • Later when small bin is non empty, final chunk from its corresponding binlist is removed and returned to the person.
  • free(small chunk) –
    • Whereas liberating this chunk, verify if its previous or next chunk is free, if that’s the case coalesce ie) unlink these chunks from their respective linked lists after which add the brand new consolidated chunk into the beginning of unsorted bin’s linked record.

Massive Bin: Chunks of dimension larger than equal to 512 is known as a big chunk. Bins holding massive chunks are known as massive bins. Massive bins are slower than small bins in reminiscence allocation and deallocation.

  • Variety of bins – 63
    • Every massive bin incorporates a round double linked record (a.okay.a binlist) of free chunks. Double linked record is used since in massive bins chunks are added and eliminated at any place (entrance or center or rear).
    • Out of those 63 bins:
      • 32 bins comprise binlist of chunks of dimension that are 64 bytes apart. ie) First massive bin (Bin 65) incorporates binlist of chunks of dimension 512 bytes to 568 bytes, second massive bin (Bin 66) incorporates binlist of chunks of dimension 576 bytes to 632 bytes and so forth…
      • 16 bins comprise binlist of chunks of dimension that are 512 bytes apart.
      • 8 bins comprise binlist of chunks of dimension that are 4096 bytes apart.
      • 4 bins comprise binlist of chunks of dimension that are 32768 bytes apart.
      • 2 bins comprise binlist of chunks of dimension that are 262144 bytes apart.
      • 1 bin incorporates a bit of remaining dimension.
    • In contrast to small bin, chunks inside a big bin are NOT of identical dimension. Therefore they’re saved in lowering order. Largest chunk is saved within the entrance finish whereas the smallest chunk is saved within the rear finish of its binlist.
  • Coalescing – Two chunks that are free cant be adjoining to one another, it will get combined into single free chunk.
  • malloc(massive chunk) –
  • free(massive chunk) – Its process is just like free(small chunk).

Prime Chunk: Chunk which is on the high border of an enviornment is known as top chunk. It doesnt belong to any bin. Prime chunk is used to service person request when there may be NO free blocks, in any of the bins. If top chunk size is greater than user requested size high chunk is break up into two:

  • Consumer chunk (of person requested dimension)
  • The rest chunk (of remaining dimension)

The remainder chunk becomes the new top. If high chunk dimension is lesser than person requested dimension, high chunk is extended utilizing sbrk (fundamental enviornment) or mmap (thread enviornment) syscall.

Final The rest Chunk: The rest from the newest break up of a small request. Final the rest chunk helps to enhance locality of reference ie) consecutive malloc request of small chunks may find yourself being allotted shut to one another.

However out of the various chunks obtainable in an enviornment, which chunk qualifies to be final the rest chunk?

When a person request of small chunk, can’t be served by a small bin and unsorted bin, binmaps are scanned to search out subsequent largest (non empty) bin. As stated earlier, on discovering the following largest (non empty) bin, its break up into two, person chunk will get returned to the person and the rest chunk will get added to the unsorted bin. Along with it, it becomes the new last remainder chunk.

How locality of reference is achieved?

Now when user subsequently request’s a small chunk and if the last remainder chunk is the only chunk in unsorted bin, final the rest chunk is split into two, person chunk will get returned to the person and the rest chunk will get added to the unsorted bin. Along with it, it becomes the new last remainder chunk. Thus subsequent reminiscence allocations find yourself being subsequent to one another.

Source Link

What's Your Reaction?
Excited
0
Happy
0
In Love
0
Not Sure
0
Silly
0
View Comments (0)

Leave a Reply

Your email address will not be published.

2022 Blinking Robots.
WordPress by Doejo

Scroll To Top