
Experimenting with GC-less (heap-less) Java

2024-02-29 15:37:14


February 2024

What

https://github.com/xonixx/gc_less – in this repository I did a series of experiments in GC-less (heap-less) Java using sun.misc.Unsafe and also the newer alternative java.lang.foreign.MemorySegment.

GC-less (heap-less) means we allocate data structures directly in native memory (outside the JVM heap). Such structures are not visible to the GC. But this means we are responsible for their (explicit) de-allocation (i.e., manual memory management).

Why

For fun and out of curiosity.

Basically, it started when I wanted to check whether the well-known sun.misc.Unsafe approach is still viable with the latest Java.

Also, I wanted to try programming in Java as if I were programming in C and see how it goes.

How

sun.misc.Unsafe

It was a surprise to find that this class is still supported by the latest (at the time of this writing) Java version 23.

The sun.misc.Unsafe class is unsafe indeed. You can easily crash your JVM with just a couple of lines of code:

JVM crash with sun.misc.Unsafe
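A minimal sketch of what such a crash looks like (illustrative, not the article's exact snippet; the class name is mine). The usual trick is to grab the Unsafe singleton via reflection and then dereference an arbitrary address:

```java
import java.lang.reflect.Field;

public class CrashDemo {
    // Obtain the sun.misc.Unsafe singleton via reflection
    // (the classic "theUnsafe" field trick).
    static sun.misc.Unsafe getUnsafe() {
        try {
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (sun.misc.Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        sun.misc.Unsafe unsafe = getUnsafe();
        System.out.println("Unsafe obtained");
        // Dereferencing an arbitrary address takes the whole JVM down
        // with SIGSEGV -- no exception, no stack trace to catch:
        // unsafe.getInt(0);
    }
}
```

The crashing line is commented out so the snippet stays runnable; uncommenting it typically produces a native crash report (hs_err_pid*.log) rather than a Java exception.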

But still, it appears to be pretty widely used.

Anyway, of interest to us is the family of methods in this class related to off-heap memory management.

Data structures implementation

As a practical application of the GC-less (heap-less) style of programming I decided to implement some basic data structures:

  • array
  • array list
  • stack
  • hash table

Not only was I interested in how easy it is, or whether it's feasible at all. I also wondered how the performance would compare to standard Java data structures.

Generating code from a template

Instead of writing separate implementations for each Java type (int, long, double, and so on) I decided to generate them from templates.

This way, from a single template (for example, TemplateArrayList.java) the set of specialized implementations is generated:

Interesting to note: the template itself is runnable code. In fact, implementation-wise it corresponds to the long-specialized version.
This is convenient, because we can test the template, and this ensures all the specialized implementations are correct too.

The generation is implemented in the form of a script gen.awk (Why AWK?).

We use the annotation @Type and the class Tpl to denote the patterns to be replaced by the generator:

Epsilon GC

To make sure that we indeed don't (accidentally) consume heap, I enabled the Epsilon GC setting.

Epsilon GC is "a GC that handles memory allocation but does not implement any actual memory reclamation mechanism. Once the available Java heap is exhausted, the JVM will shut down."
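For reference, a run with Epsilon GC can be configured roughly like this (the flags are the ones documented for Epsilon GC; `Main` and the 20 MB heap are placeholders for whatever the experiment uses):

```shell
# Epsilon is experimental, so it must be unlocked first.
# With a fixed heap, any accidental heap allocation beyond the
# limit terminates the JVM with OutOfMemoryError.
java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC \
     -Xms20m -Xmx20m -Xlog:gc Main
```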

Some implementation details

I implemented the data structures in a slightly "unusual" OOP-resembling way.

So, for example, take a look at IntArrayList.

All the methods of this class are static, because we don't want to allocate objects on the heap.

Also, all the methods (apart from allocate, which plays the role of the constructor) accept a long address as the first parameter. It plays the role of this (in a sense, it's somewhat analogous to Python's self parameter).

So the usage looks like:

    long address = IntArray.allocate(cleaner, 10);
    IntArray.set(address, 7, 222);
    System.out.println(IntArray.get(address, 7));

So this long address plays the role of a reference. Obviously, with this you can't have inheritance / overriding, but we don't need it.
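To make the shape of such a class concrete, here is a minimal sketch of what an Unsafe-backed int array could look like (illustrative layout and names, not the repository's exact code; the Cleaner parameter is omitted for brevity):

```java
import java.lang.reflect.Field;

public class OffHeapIntArray {
    private static final sun.misc.Unsafe U;
    static {
        try {
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            U = (sun.misc.Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // "Constructor": allocates a 4-byte length header plus the data,
    // and returns the raw address, which plays the role of `this`.
    static long allocate(int length) {
        long addr = U.allocateMemory(4L + 4L * length);
        U.putInt(addr, length); // store the length in the header
        return addr;
    }
    static int getLength(long addr)           { return U.getInt(addr); }
    static void set(long addr, int i, int v)  { U.putInt(addr + 4L + 4L * i, v); }
    static int  get(long addr, int i)         { return U.getInt(addr + 4L + 4L * i); }
    static void free(long addr)               { U.freeMemory(addr); } // manual de-allocation

    public static void main(String[] args) {
        long a = allocate(10);
        set(a, 7, 222);
        System.out.println(get(a, 7)); // prints 222
        free(a);
    }
}
```

Note there are no bounds checks here: a wrong index silently corrupts native memory, which is exactly the kind of hazard the safer API discussed below addresses.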

try-with-resources + Cleaner and Ref classes

The idea of the Cleaner class is that you pass an instance of it into the allocation procedure of a (GC-less) class, such that the Cleaner becomes responsible for freeing the memory of the allocated instance. This plays best with try-with-resources.

    try (Cleaner cleaner = new Cleaner()) {
      long array = IntArray.allocate(cleaner, 10);
      IntArray.set(array, 1, 1);
      long map = IntHashtable.allocate(cleaner, 10, .75f);
      map = IntHashtable.put(map, 1, 1);
      /*
        Why do we re-assign the map?
        It's because when the data structure grows,
        it can eventually outgrow its initially-allocated memory region.
        The algorithm will re-allocate a bigger internal storage, which will
        cause the address to change.
      */
      map = IntHashtable.put(map, 2, 22);
      System.out.println(IntArray.getLength(array));
      System.out.println(IntHashtable.getSize(map));
    }
    // both array and map above are de-allocated, memory reclaimed

The Ref class is needed when we want to register some object for cleanup. But we can't simply register its address, because the object can be relocated due to the memory reallocation caused by growing. So the GC-less object registers for cleanup with the Cleaner via a Ref, and then the Ref's pointer gets updated on every object relocation.
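A minimal sketch of how such a Cleaner/Ref pair could be shaped (an assumed shape, not the repository's exact code; the freeing function is injected as a parameter here so the sketch can be exercised without Unsafe):

```java
import java.util.ArrayList;
import java.util.List;

// A mutable address holder: a relocated structure updates `address`
// so the Cleaner always frees the current location.
final class Ref {
    long address;
    Ref(long address) { this.address = address; }
}

final class Cleaner implements AutoCloseable {
    private final List<Ref> refs = new ArrayList<>();
    private final java.util.function.LongConsumer free; // e.g. Unsafe::freeMemory

    Cleaner(java.util.function.LongConsumer free) { this.free = free; }

    // Called from a structure's allocate(): the Cleaner now owns the memory.
    Ref register(long address) {
        Ref ref = new Ref(address);
        refs.add(ref);
        return ref;
    }

    @Override public void close() {
        for (Ref r : refs) free.accept(r.address); // frees the *current* address
        refs.clear();
    }
}
```

A growing structure would then update `ref.address` after every re-allocation, which is exactly why registering a bare long address would not work.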

Memory leak detection

I had an idea to implement memory leak detection. This seemed rather simple to achieve. The idea: on each memory allocation we remember the place (we instantiate new Exception() to capture a stack trace). On each corresponding memory free() we discard it.
Thus, at the end we check what's left.
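The idea can be sketched like this (illustrative, not the repository's exact code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Track each allocation's site; whatever is still registered at the end
// is a leak, and its Exception carries the allocating stack trace.
final class LeakDetector {
    private final Map<Long, Exception> allocations = new HashMap<>();

    void onAlloc(long address) {
        // new Exception() captures the current stack trace cheaply enough for tests
        allocations.put(address, new Exception("allocated here"));
    }
    void onFree(long address) {
        allocations.remove(address);
    }
    /** Addresses that were allocated but never freed. */
    Set<Long> leaks() {
        return allocations.keySet();
    }
}
```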

Memory leak detection demo

I implemented a base test class such that a test that extends it can automatically ensure the absence of memory leaks.

Visual demonstration

So we can say that one of the benefits of manual off-heap memory management is deterministic memory usage.

The class Main4 provides a visual demonstration of allocation and de-allocation in a loop, as seen through the memory graph of Task Manager:

Visual demonstration

java.lang.foreign.MemorySegment

It turns out that very recently this JEP emerged: "Deprecate Memory-Access Methods in sun.misc.Unsafe for Removal".

This means that although using the memory-access methods of sun.misc.Unsafe is fun and sometimes useful, we can no longer rely on this functionality.

The JEP happens to offer safer alternatives. I decided to take a deeper look at the updated API and understand how it compares to the now-deprecated sun.misc.Unsafe.

As a replacement we now have the java.lang.foreign.MemorySegment class. It's worth mentioning that this class is part of a bigger API, dedicated to "invoking foreign functions (i.e., code outside the JVM)", which enables "Java programs to call native libraries and process native data without the brittleness and danger of JNI".

Overall I found that the new API provides a "managed" layer over native memory. That is, where previously we accessed memory via a long that represented a raw memory address, now we have a java.lang.foreign.MemorySegment object (which represents both a pointer and a memory region).

This is, obviously, good. Using the dedicated get/set methods of this class gives much stronger safety guarantees, for example built-in out-of-bounds access checks, control of properly aligned memory access, and so on.
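For example, the earlier IntArray usage translates to the new API roughly as follows (requires a JDK where the FFM API is final, i.e. JDK 22+):

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class SegmentDemo {
    public static void main(String[] args) {
        // A confined Arena frees all its segments deterministically on close().
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(10 * ValueLayout.JAVA_INT.byteSize());
            seg.setAtIndex(ValueLayout.JAVA_INT, 7, 222);
            // Unlike Unsafe, an out-of-bounds index throws
            // IndexOutOfBoundsException instead of corrupting memory:
            // seg.setAtIndex(ValueLayout.JAVA_INT, 100, 1);
            System.out.println(seg.getAtIndex(ValueLayout.JAVA_INT, 7)); // prints 222
        } // native memory reclaimed here
    }
}
```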

It may seem that this API is much heavier. For every off-heap allocation we need an on-heap handle in the form of a MemorySegment instance. So we might expect this to render algorithms with lots of small allocations non-viable.


Surprisingly, this is not the case!

My sun.misc.Unsafe-based hashtable implementation is an example of an algorithm that makes many allocations. It's a well-known hashtable algorithm (analogous to standard Java's HashMap), with an array of buckets filled based on the hash code, and using linked lists of nodes for the elements.

For the sake of the experiment I converted it to MemorySegment. I was expecting that, because of the many MemorySegment objects allocated, its JVM heap consumption would be proportional to the hashtable size.

Instead, the experiment shows that the heap memory consumption is effectively O(1).

How is this possible?

It turns out that:

  1. MemorySegment is a value-based class, as stated by its javadoc. Value-based classes represent the first step toward the inline classes planned by Project Valhalla.
  2. It looks like the implementation of inlining value objects is already in place, at least for some JDK classes, like our MemorySegment. This is implemented by extensive use of the @jdk.internal.vm.annotation.ForceInline annotation:

My take on this is that during JIT compilation all allocations of MemorySegments are completely eliminated by inlining this class.

Python-like hashtable implementation

While doing these experiments I was lucky enough to come across this article: Internals of sets and dicts. It describes the implementation details of sets and maps in the Python programming language.

It turns out that the dict (that's the name for a hashtable in Python) algorithm in Python doesn't use many allocations. Instead, it allocates one piece of memory and lays out the key, value, and hash data inside it. When the collection grows, it re-allocates this contiguous piece of memory and re-distributes the data. This is exactly what we need!

So I came up with this MemorySegment-based, Python-inspired off-heap hashtable implementation.
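To illustrate the single-allocation idea, here is a simplified sketch (my own illustration under stated assumptions, not the repository's actual implementation: int keys/values, linear probing, Integer.MIN_VALUE reserved as the "empty" sentinel, and the MemorySegment handle kept inside an object for brevity):

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class OpenAddressingIntMap {
    private static final ValueLayout.OfInt I = ValueLayout.JAVA_INT;
    private static final int EMPTY = Integer.MIN_VALUE; // keys may not equal this

    private final Arena arena = Arena.ofConfined(); // a real version would close it
    private MemorySegment slots; // one contiguous region of (key, value) int pairs
    private int capacity = 8, size = 0;

    public OpenAddressingIntMap() { slots = newTable(capacity); }

    private MemorySegment newTable(int cap) {
        MemorySegment s = arena.allocate(cap * 2L * I.byteSize());
        for (int i = 0; i < cap; i++) s.setAtIndex(I, 2L * i, EMPTY);
        return s;
    }

    public void put(int key, int value) {
        if (size * 4 >= capacity * 3) grow(); // keep load factor <= 0.75
        int i = Math.floorMod(Integer.hashCode(key), capacity);
        while (true) {
            int k = slots.getAtIndex(I, 2L * i);
            if (k == EMPTY || k == key) {
                if (k == EMPTY) size++;
                slots.setAtIndex(I, 2L * i, key);
                slots.setAtIndex(I, 2L * i + 1, value);
                return;
            }
            i = (i + 1) % capacity; // linear probing
        }
    }

    public Integer get(int key) {
        int i = Math.floorMod(Integer.hashCode(key), capacity);
        while (true) {
            int k = slots.getAtIndex(I, 2L * i);
            if (k == EMPTY) return null;
            if (k == key) return slots.getAtIndex(I, 2L * i + 1);
            i = (i + 1) % capacity;
        }
    }

    // Growth is one re-allocation plus re-distribution, as in CPython's dict --
    // no per-entry node allocations at all.
    private void grow() {
        int oldCap = capacity;
        MemorySegment old = slots;
        capacity *= 2;
        size = 0;
        slots = newTable(capacity);
        for (int i = 0; i < oldCap; i++) {
            int k = old.getAtIndex(I, 2L * i);
            if (k != EMPTY) put(k, old.getAtIndex(I, 2L * i + 1));
        }
    }
}
```

The key property: however many entries the map holds, the JVM heap sees only the single MemorySegment handle per table, not one node object per entry.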

Stress test with limited heap

With Java it's inevitable that even if you implement an off-heap algorithm, some heap will be used. So we can only consider an off-heap algorithm practical if it consumes a small/fixed amount of heap. Otherwise, if its additional heap consumption is proportional to the problem size, off-heap doesn't make much sense.

I've created this small experiment – a program that allocates a hashtable and fills it with 50 million elements, for each of the three implementations:

  • Unsafe-based hashtable
  • Python-based hashtable
  • MemorySegment-based hashtable

The experiment runs with Epsilon GC and the heap size fixed at 20 MB.

We can verify that all three implementations pass the test with this heap limit.

[0.002s][info][gc] Using Epsilon
[0.003s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
[0.011s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 1060K (5.18%) used

===== Unsafe-based hashtable =====
Unsafe-based: 0
Unsafe-based: 1000000
Unsafe-based: 2000000
Unsafe-based: 3000000
...
Unsafe-based: 47000000
Unsafe-based: 48000000
Unsafe-based: 49000000
Freeing...
Freeing native addr 140590889037840...
Freeing native ref  140596292148752...
Freeing locals      140596292147328...

===== MemorySegment-based hashtable =====
[16.869s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 2106K (10.28%) used
[16.941s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 3159K (15.43%) used
[16.980s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 4262K (20.82%) used
[17.019s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 5438K (26.55%) used
MemorySegment-based: 0
[17.084s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 6665K (32.55%) used
[17.129s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 7894K (38.55%) used
[17.165s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 9123K (44.55%) used
[17.186s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 10352K (50.55%) used
[17.203s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 11581K (56.55%) used
MemorySegment-based: 1000000
MemorySegment-based: 2000000
MemorySegment-based: 3000000
...
MemorySegment-based: 47000000
MemorySegment-based: 48000000
MemorySegment-based: 49000000

===== Python-based hashtable =====
Python-based: 0
Python-based: 1000000
Python-based: 2000000
Python-based: 3000000
...
Python-based: 47000000
Python-based: 48000000
Python-based: 49000000
[33.169s][info   ][gc     ] Heap: 20480K reserved, 20480K (100.00%) committed, 11993K (58.56%) used

Benchmarks

Finally, I would like to present a couple of benchmarks I've created.

AccessSpeedTest.java measures read/write speed (lower = better):

                   int[]   Unsafe   MemorySegment
  Read test, ms     1885     1883            1879
  Write test, ms    2177     2213            2214

Surprisingly, we can hardly see any noticeable difference in speed here, even though int[] and MemorySegment have bounds checking, so I expected them to be slower than Unsafe.


Another speed test is IntHashtableSpeedTest (lower = better):

What do we see here?

  1. It turns out that GC-less algorithms are generally slower than conventional GC-full ones
    • The built-in HashMap is 3+ times faster than the Unsafe-based one with a similar algorithm
    • The Python-like on-heap version is almost 4 times faster than the off-heap one.
  2. It also turns out that for the particular case of map<int,int> it's possible to come up with an implementation (Python-like) that is 8 times faster than the built-in one (HashMap)

Takeaways

  1. It's possible to use manual memory management in Java, for example if you need deterministic memory consumption.
  2. Speed is not a good reason for using off-heap algorithms. We've seen that the same algorithms are 3+ times slower off-heap, probably due to better optimization opportunities for the JIT compiler on conventional Java code.
  3. For your implementations it's advised to prefer java.lang.foreign.MemorySegment over sun.misc.Unsafe.

If you noticed a typo or have other feedback, please email me at xonixx@gmail.com
