Writing a Profiler in 240 Lines of Pure Java

2023-03-27 01:53:52

A few months back, I started writing a profiler from scratch, and the code has since become the base of my profiler validation tools. The only problem with this project: I wanted to write a proper non-safepoint-biased profiler from scratch. This is a noble effort, but it requires lots of C/C++/Unix programming, which is finicky, and not everyone can read C/C++ code.

For people unfamiliar with safepoint bias: A safepoint is a point in time where the JVM has a known, defined state and all threads have stopped. The JVM itself needs safepoints to do major garbage collections, class definitions, method deoptimizations, and more. Threads regularly check whether they should enter a safepoint, for example, at method entry, method exit, or loop backjumps. A profiler that only samples at safepoints has an inherent bias because it only sees frames from the locations inside methods where threads check for a safepoint. The only advantage is that stack-walking at safepoints is slightly less error-prone, as there are fewer mutations of heap and stack. For more information, consider reading the excellent article Java Safepoint and Async Profiling by Seetha Wenner, the more technical one by JP Bempel, or the classic article Safepoints: Meaning, Side Effects and Overheads by Nitsan Wakart. To conclude: Safepoint-biased profilers don’t give you a holistic view of your application, but they can still be helpful for analyzing major performance issues where you look at the bigger picture.

This blog post aims to develop a tiny Java profiler in pure Java code that everyone can understand. Profilers aren’t rocket science, and ignoring safepoint bias, we can write a usable profiler that outputs a flame graph in just 240 lines of code.

You can find the whole project on GitHub. Feel free to use it as a base for your own adventures.

We implement the profiler in a daemon thread started by a Java agent. This allows us to start and run the profiler alongside the Java program we want to profile. The main parts of the profiler are:

  • Main: Entry point of the Java agent and starter of the profiling thread
  • Options: Parses and stores the agent options
  • Profiler: Contains the profiling loop
  • Store: Stores and outputs the collected results

Main Class

We start by implementing the agent entry points:

public class Main {
    // called when the agent is attached to a running JVM
    public static void agentmain(String agentArgs) {
        premain(agentArgs);
    }

    // called when the JVM is started with -javaagent
    public static void premain(String agentArgs) {
        Main main = new Main();
        main.run(new Options(agentArgs));
    }

    private void run(Options options) {
        Thread t = new Thread(new Profiler(options));
        t.setDaemon(true);
        t.setName("Profiler");
        t.start();
    }
}

The premain method is called when the agent is attached to the JVM at startup. This is the typical case, where the user passes the -javaagent option to the JVM. In our example, this means the user runs Java with

java -javaagent:./target/tiny_profiler.jar=agentArgs ...

But there is also the possibility that the user attaches the agent at runtime. In this case, the JVM calls the method agentmain. To learn more about Java agents, visit the JDK documentation.

Please be aware that we have to set the Premain-Class and the Agent-Class attributes in the MANIFEST file of our resulting JAR file.
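To make the two manifest attributes concrete, here is a minimal sketch that builds such a manifest with the JDK's java.util.jar API. In a real build, a tool like Maven normally writes these entries. The class name AgentManifestSketch and the fully qualified agent class me.bechberger.Main are assumptions for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the profiler itself
class AgentManifestSketch {

    /** Builds a manifest that names the agent class for both attach modes. */
    static Manifest agentManifest(String agentClass) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version has to be present for the main section to be written
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.putValue("Premain-Class", agentClass); // used with -javaagent
        attrs.putValue("Agent-Class", agentClass);   // used when attaching at runtime
        return manifest;
    }

    /** Renders the manifest the way it would appear in META-INF/MANIFEST.MF. */
    static String render(Manifest manifest) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        return out.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "me.bechberger.Main" is an assumed class name
        System.out.print(render(agentManifest("me.bechberger.Main")));
    }
}
```

Both attributes point to the same class here because agentmain simply delegates to premain.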

Our Java agent parses the agent arguments to get the options. The options are modeled and parsed by the Options class:

import java.nio.file.Path;
import java.time.Duration;
import java.util.Optional;

public class Options {
    /** interval option */
    private Duration interval = Duration.ofMillis(10);

    /** flamegraph option */
    private Optional<Path> flamePath;

    /** table option */
    private boolean printMethodTable = true;
    ...
}
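The parsing code itself is elided above. As a rough sketch of how such a parser could look, the following class accepts a comma-separated list of key=value pairs. The option names interval, flamegraph, and table follow the comments above. The separator, the plain-millisecond interval format, and the class name OptionsSketch are assumptions, not necessarily what the real Options class does.

```java
import java.nio.file.Path;
import java.time.Duration;
import java.util.Optional;

// Hypothetical stand-in for the elided parts of Options
class OptionsSketch {
    Duration interval = Duration.ofMillis(10);
    Optional<Path> flamePath = Optional.empty();
    boolean printMethodTable = true;

    /** Parses a string such as "interval=5,flamegraph=flame.html,table=false". */
    static OptionsSketch parse(String agentArgs) {
        OptionsSketch options = new OptionsSketch();
        if (agentArgs == null || agentArgs.isBlank()) {
            return options; // no arguments: keep the defaults
        }
        for (String part : agentArgs.split(",")) {
            String[] keyValue = part.split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Expected key=value, got: " + part);
            }
            String value = keyValue[1].trim();
            switch (keyValue[0].trim()) {
                // assumption: the interval is given in plain milliseconds
                case "interval" -> options.interval = Duration.ofMillis(Long.parseLong(value));
                case "flamegraph" -> options.flamePath = Optional.of(Path.of(value));
                case "table" -> options.printMethodTable = Boolean.parseBoolean(value);
                default -> throw new IllegalArgumentException("Unknown option: " + keyValue[0]);
            }
        }
        return options;
    }

    public static void main(String[] args) {
        OptionsSketch options = parse("interval=5,flamegraph=flame.html");
        System.out.println(options.interval + " " + options.flamePath + " " + options.printMethodTable);
    }
}
```

Initializing flamePath with Optional.empty() avoids a null Optional when no flamegraph option is passed.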

The exciting part of the Main class is its run method: The Profiler class implements the Runnable interface so that we can create a thread directly:

Thread t = new Thread(new Profiler(options));

We then mark the profiler thread as a daemon thread; this means the JVM terminates at the end of the profiled application even if the profiler thread is still running:

t.setDaemon(true);

Now we’re almost finished; we only have to start the thread. Before we do this, we name the thread. This isn’t required, but it makes debugging easier.

t.setName("Profiler");
t.start();

Profiler Class

The actual sampling takes place in the Profiler class:

import java.time.Duration;

public class Profiler implements Runnable {
    private final Options options;
    private final Store store;

    public Profiler(Options options) {
        this.options = options;
        this.store = new Store(options.getFlamePath());
        // print the results when the JVM shuts down
        Runtime.getRuntime().addShutdownHook(new Thread(this::onEnd));
    }

    private static void sleep(Duration duration) {
        // ...
    }

    @Override
    public void run() {
        while (true) {
            Duration start = Duration.ofNanos(System.nanoTime());
            sample();
            Duration duration = Duration.ofNanos(System.nanoTime())
                                        .minus(start);
            // sleep for the rest of the interval
            Duration sleep = options.getInterval().minus(duration);
            sleep(sleep);
        }
    }

    private void sample() {
        Thread.getAllStackTraces().forEach(
          (thread, stackTraceElements) -> {
            if (!thread.isDaemon()) {
                // exclude daemon threads
                store.addSample(stackTraceElements);
            }
        });
    }

    private void onEnd() {
        if (options.printMethodTable()) {
            store.printMethodTable();
        }
        store.storeFlameGraphIfNeeded();
    }
}

We start by looking at the constructor. The interesting part is

Runtime.getRuntime().addShutdownHook(new Thread(this::onEnd));

which causes the JVM to call Profiler::onEnd when it shuts down. This is important because the profiler thread is silently aborted at shutdown, and we still want to print the captured results. You can read more on shutdown hooks in the Java documentation.
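A minimal, standalone sketch of the same mechanism may help to see the ordering: the hook runs only after the application's work has finished. The class name ShutdownHookSketch, the helper register, and the thread name are made up for this illustration.

```java
// Hypothetical demo class, not part of the profiler
class ShutdownHookSketch {

    /** Registers the given action as a JVM shutdown hook and returns the hook thread. */
    static Thread register(Runnable onEnd) {
        Thread hook = new Thread(onEnd, "profiler-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        register(() -> System.out.println("hook: printing results"));
        System.out.println("main: done");
        // "main: done" is printed first; the hook runs afterwards,
        // while the JVM is shutting down
    }
}
```

Returning the hook thread lets a caller unregister it again with Runtime.getRuntime().removeShutdownHook(hook), which the profiler does not need but which is handy in tests.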

After this, we take a look at the profiling loop in the run method:

while (true) {
    Duration start = Duration.ofNanos(System.nanoTime());
    sample();
    Duration duration = Duration.ofNanos(System.nanoTime())
                                .minus(start);
    Duration sleep = options.getInterval().minus(duration);
    sleep(sleep);
}

This calls the sample method and sleeps for the remaining time afterward, to ensure that the sample method is called once every interval (typically 10 ms).

The core sampling takes place in this sample method:
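The body of the sleep helper is elided in the listing. One possible way to write it is sketched below; it is an assumption, not the code from the project. The point worth noting is that sampling can take longer than the interval, in which case the remaining time is negative and has to be clamped to zero, because Thread.sleep rejects negative values.

```java
import java.time.Duration;

// Hypothetical sketch of the elided sleep logic
class SleepSketch {

    /** Time left in the current interval, never negative. */
    static Duration remaining(Duration interval, Duration elapsed) {
        Duration rest = interval.minus(elapsed);
        return rest.isNegative() ? Duration.ZERO : rest;
    }

    /** Sleeps for the given duration; returns false if the thread was interrupted. */
    static boolean sleep(Duration duration) {
        if (duration.isNegative() || duration.isZero()) {
            return true; // nothing to wait for
        }
        try {
            // split into whole milliseconds and the sub-millisecond nanoseconds
            Thread.sleep(duration.toMillis(), duration.toNanosPart() % 1_000_000);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofMillis(10);
        System.out.println(remaining(interval, Duration.ofMillis(3)));  // PT0.007S
        System.out.println(remaining(interval, Duration.ofMillis(25))); // PT0S
        sleep(remaining(interval, Duration.ofMillis(3)));
    }
}
```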

Thread.getAllStackTraces().forEach(
  (thread, stackTraceElements) -> {
    if (!thread.isDaemon()) {
        // exclude daemon threads
        store.addSample(stackTraceElements);
    }
});

Here we use the Thread::getAllStackTraces method to obtain the stack traces of all threads. This triggers a safepoint and is the reason why this profiler is safepoint-biased. Sampling only a subset of threads is not practical, as the JDK offers no single method for this. Calling Thread::getStackTrace on each thread of a subset would trigger many safepoints, not just one, resulting in a more significant performance penalty than obtaining the traces for all threads.

The result of Thread::getAllStackTraces is filtered so that we don’t include daemon threads (like the Profiler thread or unused Fork-Join-Pool threads). We pass the appropriate traces to the Store, which deals with the post-processing.
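To see the daemon filter in isolation, the following sketch starts one non-daemon and one daemon thread and takes a single sample. Only Thread::getAllStackTraces and Thread::isDaemon are taken from the profiler; the class SampleSketch, its helpers, and the thread names are invented for this demonstration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of the profiler
class SampleSketch {

    /** Takes one sample and returns the names of all non-daemon threads in it. */
    static List<String> sampleNonDaemonThreadNames() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        return traces.keySet().stream()
                .filter(thread -> !thread.isDaemon())
                .map(Thread::getName)
                .toList();
    }

    /** Starts a thread that blocks until the latch is released. */
    static Thread startBlocked(String name, boolean daemon, CountDownLatch release) {
        CountDownLatch started = new CountDownLatch(1);
        Thread thread = new Thread(() -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        thread.setDaemon(daemon);
        thread.start();
        try {
            started.await(); // make sure the thread is running before we sample
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return thread;
    }

    public static void main(String[] args) {
        CountDownLatch release = new CountDownLatch(1);
        startBlocked("worker", false, release);
        startBlocked("helper", true, release);
        // the list contains "worker" but not the daemon thread "helper"
        System.out.println(sampleNonDaemonThreadNames());
        release.countDown(); // let both threads finish
    }
}
```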

Store Class

This is the last class of the profiler and by far the most significant one. It handles post-processing, storing, and outputting the collected information:

package me.bechberger;

import java.io.BufferedOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Stream;

/**
 * store of the traces
 */
public class Store {

    /** too large and browsers can't display it anymore */
    private final int MAX_FLAMEGRAPH_DEPTH = 100;

    private static class Node {
        // ...
    }

    private final Optional<Path> flamePath;
    private final Map<String, Long> methodOnTopSampleCount = 
        new HashMap<>();
    private final Map<String, Long> methodSampleCount = 
        new HashMap<>();

    private long totalSampleCount = 0;

    /**
     * trace tree node, only populated if flamePath is present
     */
    private final Node rootNode = new Node("root");

    public Store(Optional<Path> flamePath) {
        this.flamePath = flamePath;
    }

    private String flattenStackTraceElement(
      StackTraceElement stackTraceElement) {
        // call intern to save some memory
        return (stackTraceElement.getClassName() + "." + 
            stackTraceElement.getMethodName()).intern();
    }

    private void updateMethodTables(String method, boolean onTop) {
        methodSampleCount.put(method, 
            methodSampleCount.getOrDefault(method, 0L) + 1);
        if (onTop) {
            methodOnTopSampleCount.put(method, 
                methodOnTopSampleCount.getOrDefault(method, 0L) + 1);
        }
    }

    private void updateMethodTables(List<String> trace) {
        for (int i = 0; i < trace.size(); i++) {
            String method = trace.get(i);
            // index 0 is the top frame of the trace
            updateMethodTables(method, i == 0);
        }
    }

    public void addSample(StackTraceElement[] stackTraceElements) {
        List<String> trace = 
            Stream.of(stackTraceElements)
                   .map(this::flattenStackTraceElement)
                   .toList();
        updateMethodTables(trace);
        if (flamePath.isPresent()) {
            rootNode.addTrace(trace);
        }
        totalSampleCount++;
    }

    // the only reason this requires Java 17 :P
    private record MethodTableEntry(
        String method, 
        long sampleCount, 
        long onTopSampleCount) {
    }

    private void printMethodTable(PrintStream s, 
      List<MethodTableEntry> sortedEntries) {
        // ...
    }

    public void printMethodTable() {
        // sort methods by sample count
        // then print a table
        // ...
    }

    public void storeFlameGraphIfNeeded() {
        // ...
    }
}

The Profiler calls the addSample method, which flattens the stack trace elements, stores them in the trace tree (for the flame graph), and counts the traces that any method is part of.
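The counting logic can be tried out on its own. The sketch below mirrors updateMethodTables with hand-written traces instead of real stack traces, using Map.merge as a compact alternative to getOrDefault plus put. The class name MethodTableSketch and the method names A.a, B.b, and C.c are made up.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, stripped-down version of the method tables in Store
class MethodTableSketch {
    final Map<String, Long> sampleCount = new HashMap<>();
    final Map<String, Long> onTopCount = new HashMap<>();

    /** Counts every frame of one trace; index 0 is the top frame. */
    void add(List<String> trace) {
        for (int i = 0; i < trace.size(); i++) {
            sampleCount.merge(trace.get(i), 1L, Long::sum);
            if (i == 0) {
                onTopCount.merge(trace.get(i), 1L, Long::sum);
            }
        }
    }

    public static void main(String[] args) {
        MethodTableSketch table = new MethodTableSketch();
        table.add(List.of("C.c", "B.b", "A.a")); // A.a calls B.b, B.b calls C.c
        table.add(List.of("B.b", "B.b", "A.a")); // B.b calls itself once
        System.out.println(table.sampleCount.get("B.b")); // 3
        System.out.println(table.onTopCount.get("B.b"));  // 1
    }
}
```

Note that a recursive method is counted once per frame, so its sample count can exceed the number of traces it appears in.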

The interesting part is the trace tree modeled by the Node class. The idea is that every trace A -> B -> C (A calls B, B calls C; returned by the JVM as [C, B, A]) can be represented as a root node with a child node A, which has a child B, which has a child C, so that every captured trace is a path from the root node to a leaf. We count how many times a node is part of a trace. This can then be used to output the tree data structure for d3-flame-graph, which we use to create nice flame graphs like:

Flame graph produced by the profiler for the renaissance dotty benchmark

Keep in mind that the actual Node class is as follows:

private static class Node {
    private final String method;
    private final Map<String, Node> children = new HashMap<>();
    private long samples = 0;

    public Node(String method) {
        this.method = method;
    }

    private Node getChild(String method) {
        return children.computeIfAbsent(method, Node::new);
    }

    private void addTrace(List<String> trace, int end) {
        samples++;
        // end >= 0 so that the top frame (index 0) also becomes a node
        if (end >= 0) {
            getChild(trace.get(end)).addTrace(trace, end - 1);
        }
    }

    public void addTrace(List<String> trace) {
        // start with the bottom frame, which is the last element
        addTrace(trace, trace.size() - 1);
    }

    /**
     * Write in d3-flamegraph format
     */
    private void writeAsJson(PrintStream s, int maxDepth) {
        s.printf("{ \"name\": \"%s\", \"value\": %d, \"children\": [", 
                 method, samples);
        if (maxDepth > 1) {
            for (Node child : children.values()) {
                child.writeAsJson(s, maxDepth - 1);
                // trailing comma is fine, the output is embedded as JavaScript
                s.print(",");
            }
        }
        s.print("]}");
    }

    public void writeAsHTML(PrintStream s, int maxDepth) {
        s.print("""
                <head>
                  <link rel="stylesheet" 
                   type="text/css" 
                   href="https://cdn.jsdelivr.net/npm/d3-flame-graph@4.1.3/dist/d3-flamegraph.css">
                </head>
                <body>
                  <div id="chart"></div>
                  <script type="text/javascript" 
                   src="https://d3js.org/d3.v7.js"></script>
                  <script type="text/javascript" 
                   src="https://cdn.jsdelivr.net/npm/d3-flame-graph@4.1.3/dist/d3-flamegraph.min.js"></script>
                  <script type="text/javascript">
                  var chart = flamegraph().width(window.innerWidth);
                  d3.select("#chart").datum(""");
        writeAsJson(s, maxDepth);
        s.print("""
                ).call(chart);
                  window.onresize = 
                      () => chart.width(window.innerWidth);
                  </script>
                </body>
                """);
    }
}
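To get a feeling for the resulting data structure, here is a small standalone variant of the trace tree that can be run directly. It is a sketch with an invented class name (TraceTreeSketch), an iterative addTrace that adds every frame including the top one, and a plain-text render method instead of the JSON and HTML output. A TreeMap is used only to make the printed order deterministic.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified trace tree for experimentation
class TraceTreeSketch {
    final String method;
    final Map<String, TraceTreeSketch> children = new TreeMap<>();
    long samples = 0;

    TraceTreeSketch(String method) {
        this.method = method;
    }

    /** Adds a trace given top frame first, as the JVM returns it. */
    void addTrace(List<String> trace) {
        TraceTreeSketch node = this;
        node.samples++;
        // walk from the bottom frame (last index) up to the top frame (index 0)
        for (int i = trace.size() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(trace.get(i), TraceTreeSketch::new);
            node.samples++;
        }
    }

    /** Renders the tree as indented text, one node per line. */
    String render(String indent) {
        StringBuilder sb = new StringBuilder(indent + method + " (" + samples + ")\n");
        for (TraceTreeSketch child : children.values()) {
            sb.append(child.render(indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TraceTreeSketch root = new TraceTreeSketch("root");
        root.addTrace(List.of("C", "B", "A")); // A -> B -> C
        root.addTrace(List.of("D", "B", "A")); // A -> B -> D
        System.out.print(root.render(""));
        // root (2)
        //   A (2)
        //     B (2)
        //       C (1)
        //       D (1)
    }
}
```

The two traces share the prefix A -> B, so they share the corresponding nodes, and the sample counts of A and B are the sums over both traces. This is exactly the shape a flame graph visualizes.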
                                                                                                                                                                                                                         

Tiny-Profiler

I named the final profiler tiny-profiler, and its sources are on GitHub (MIT licensed). The profiler should work on any platform with JDK 17 or newer. The usage is fairly simple:

# build it
mvn package

# run your program and print the table of methods sorted by their sample count
# and the flame graph, taking a sample every 10ms
java -javaagent:target/tiny-profiler.jar=flamegraph=flame.html ...

You can easily run it on the renaissance benchmark and create the flame graph shown earlier:

# download a benchmark
> test -e renaissance.jar || wget https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.14.2/renaissance-gpl-0.14.2.jar -O renaissance.jar

> java -javaagent:./target/tiny_profiler.jar=flamegraph=flame.html -jar renaissance.jar dotty
...
===== method table ======
Total samples: 11217
Method                                      Samples Percentage  On top Percentage
dotty.tools.dotc.typer.Typer.typed            59499     530.44       2       0.02
dotty.tools.dotc.typer.Typer.typedUnadapted   31050     276.81       7       0.06
scala.runtime.function.JProcedure1.apply      24283     216.48      13       0.12
dotty.tools.dotc.Driver.process               19012     169.49       0       0.00
dotty.tools.dotc.typer.Typer.typedUnnamed$1   18774     167.37       7       0.06
dotty.tools.dotc.typer.Typer.typedExpr        18072     161.11       0       0.00
scala.collection.immutable.List.foreach       16271     145.06       3       0.03
...
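Percentages above 100 may look odd at first. The numbers in the table are consistent with dividing a method's count by the total number of samples, and because addSample counts every frame of a trace, a recursive method like Typer.typed can appear several times in a single sample. The following sketch reproduces two of the values; the class PercentageSketch and the exact formatting are assumptions, as the real printMethodTable code is not shown here.

```java
import java.util.Locale;

// Hypothetical helper that recomputes the percentage columns
class PercentageSketch {

    /** Share of the count relative to all samples, with two decimal places. */
    static String percentage(long count, long totalSamples) {
        return String.format(Locale.ROOT, "%.2f", 100.0 * count / totalSamples);
    }

    public static void main(String[] args) {
        long totalSamples = 11217;
        System.out.println(percentage(59499, totalSamples)); // 530.44
        System.out.println(percentage(2, totalSamples));     // 0.02
    }
}
```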

The overhead for this example is around 2% on my MacBook Pro 13″ for a 10ms interval, which makes the profiler usable when you ignore the safepoint bias.

Conclusion

Writing a Java profiler in 240 lines of pure Java is possible, and the resulting profiler can even be used to analyze performance problems. This profiler isn’t designed to replace real profilers like async-profiler, but it demystifies the inner workings of simple profilers.

I hope you enjoyed this code-heavy blog post. As always, I’m happy about any feedback, issue, or PR.

This blog post is part of my work in the SapMachine team at SAP, making profiling easier for everyone. Significant parts of this post have been written under the English Channel…
