Debugging Compilers
This is (I hope) part of a series of blog posts with some ideas on using FlowStorm, a Clojure omniscient and time travel debugger, to help us reason about Clojure systems.
This first post is going to be half a FlowStorm overview and half a tour of the ClojureScript compiler internals, since we're going to use it as an example of a non-trivial system we want to explore/debug/understand.
Following this post doesn't require any particular knowledge of compilers. Since the compilation unit of most Lisps is a form instead of a file, as in most other languages, the core of the ClojureScript compiler can be seen as a program that takes a string representing a Clojure form as input, reads it, recursively parses it into a tree of expressions, also known as an AST (abstract syntax tree), and then walks down the tree emitting strings containing JavaScript code.
You can think of it like:
(-> "(defn sum [a b] (+ a b))"
    read
    analyze
    emit)
"function sum(a, b) { return a + b;}"
which is of course an over-simplification, but should be enough for following the post.
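To make that pipeline concrete, here is a minimal, runnable sketch in plain Clojure. It is not how the ClojureScript compiler is implemented: `toy-analyze`, `toy-emit` and the node shapes are invented for illustration, `read-string` from `clojure.core` stands in for the real reader, and only the forms needed for `sum` are supported.

```clojure
(require '[clojure.string :as str])

(defn toy-analyze
  "Turn a form into a tiny AST made of plain maps.
  The node shapes are hypothetical, not the ones cljs.analyzer produces."
  [form]
  (cond
    (symbol? form)
    {:op :local :name form}

    (and (seq? form) (= 'defn (first form)))
    (let [[_ fn-name params body] form]
      {:op :defn :name fn-name :params params :body (toy-analyze body)})

    (seq? form)
    (let [[operator & args] form]
      {:op :infix :operator operator :args (mapv toy-analyze args)})

    :else
    (throw (ex-info "Unsupported form" {:form form}))))

(defn toy-emit
  "Walk the toy AST and return a string of JavaScript."
  [{:keys [op] :as node}]
  (case op
    :local (str (:name node))
    :infix (str "("
                (str/join (str " " (:operator node) " ")
                          (map toy-emit (:args node)))
                ")")
    :defn  (str "function " (:name node)
                "(" (str/join ", " (:params node)) ")"
                " { return " (toy-emit (:body node)) "; }")))

(-> "(defn sum [a b] (+ a b))"
    read-string
    toy-analyze
    toy-emit)
;; => "function sum(a, b) { return (a + b); }"
```

The three stages are kept as separate functions on purpose, so each intermediate value (form, AST, string) can be inspected on its own, which is the same shape of data we will be looking at in the recordings.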
If you like, you can follow along by copying and pasting the commands into your terminal. It doesn't depend on any particular IDE or pre-installed tooling, other than the Clojure CLI and git.
If this is your first encounter with FlowStorm: we call it a debugger, since a lot of its features are those found in debuggers, but its capabilities can be used for much more than chasing bugs.
FlowStorm was designed as a tool for visualizing what is happening inside our Clojure programs as they run, during development, and was specifically designed with interactive programming and immutability in mind.
For people new to Clojure and Lisps in general, interactive programming is about developing a program by interacting with its running process. This is quite different from the more traditional way of writing programs, which most of the time implies modifying your files, re-compiling everything (hopefully incrementally), running the process, stopping it, then rinse and repeat. Immutability is about our programs dealing mostly with immutable values, instead of references to mutable objects or places in memory.
Interactivity and immutability alone are already pretty powerful tools when trying to understand a system, given you can poke at it by calling different functions, inspect the data, modify and recompile specific parts, all without losing your application state or having to deal with long compilation times.
But poking at systems this way, via manual function calling, println (or the more modern tap>), scope capturing, single stepping, etc, works best when you already know most of the system and are confident that specific points in the execution will be enough to reveal the answers to your questions. So on top of that, FlowStorm provides an easy way of recording and visualizing our programs' execution on demand.
OK, enough preamble, let's jump into it.
For working with the ClojureScript compiler we first need its sources, so let's start by cloning the official repo's master branch:
$ git clone https://github.com/clojure/clojurescript
$ cd clojurescript
We can now set up FlowStorm by editing the deps.edn of the project we just cloned, like this:
{...
 :aliases
 {...
  :storm
  {:classpath-overrides {org.clojure/clojure nil} ;; for disabling the official compiler
   :extra-deps {com.github.flow-storm/clojure {:mvn/version "1.11.1-11"}
                com.github.flow-storm/flow-storm-dbg {:mvn/version "3.8.2"}}
   :jvm-opts ["-Dclojure.storm.instrumentEnable=true"
              "-Dclojure.storm.instrumentOnlyPrefixes=cljs"
              "-Dflowstorm.startRecording=false"
              "-Dclojure.server.repl={:port 5555 :accept clojure.core.server/repl}"]}}}
There is a lot going on there; luckily we only need to do this once.
We added a new alias, :storm, so we can easily start a repl with everything we need.
The important parts are:
- we disabled the official Clojure compiler, since we're going to replace it with ClojureStorm, our dev compiler
- added the FlowStorm and ClojureStorm dependencies
- told ClojureStorm that:
  - we want instrumentation enabled at startup
  - to instrument all namespaces under cljs.*
  - and that recording should be paused at startup
- we also start a socket repl so we can have two terminals, one with a ClojureScript repl and another with a Clojure one, both on the same JVM process
FlowStorm can be used with just the official Clojure compiler, but by swapping it for the ClojureStorm dev compiler we get automatic instrumentation, which gives us a much nicer experience for the kind of things we're going to do next.
Now we can finally run a Clojure repl with the :storm alias:
$ clj -A:storm
ClojureStorm 1.11.1-11
Evaluate the :help keyword for more info or :tut/basics for a beginners tour.
As soon as the repl starts, the first thing we'll do is start the FlowStorm UI, which we accomplish by evaluating the :dbg keyword in this ClojureStorm repl.
user=> :dbg
user=> (require 'cljs.main)
user=> (cljs.main/-main "--repl")
ClojureScript 0.0.249695361
cljs.user=>
Right after, we require the main ClojureScript compiler namespace, and then run the main function with the --repl arg (for people new to Clojure, the ClojureScript compiler is just a function we can invoke from our repl), which tells it to start a browser repl. This should have opened a browser window and replaced our Clojure repl with a ClojureScript one. Everything we type in this new repl will be read, compiled into JavaScript and sent to the browser for execution.
Note: Starting the repl may take a few seconds since we're using ClojureScript from source, so it needs to compile all the clj files.
And last but not least, we'll connect another Clojure repl to the same process our compiler is running in, by using telnet to connect to the socket repl we started earlier.
$ telnet 127.0.0.1 5555
user=>
And that's all the setup we need. At this point we should have:
- a terminal with a Clojure repl
- a terminal with a ClojureScript repl connected to the browser
- the FlowStorm UI
FlowStorm UI basics
When you have just started FlowStorm and still don't have any recordings, you see:
- The clear button, to discard any recordings
- The Start/Stop recording button
- A text field to quickly jump into stepping any recorded function
- The main tool tabs
- And finally how much heap you have available, so you can keep an eye on it and clear your recordings when you are running out of heap space
Recording a form compilation
Let's click the Start recording button. You should see the icon change to a Stop, which is how you can tell you are currently recording.
Now that we're recording, let's go to our ClojureScript repl terminal and eval a simple function, like:
cljs.person=> (defn sum [a b] (+ a b))
After you hit enter, ClojureScript will read and compile that function, which will generate a bunch of recordings.
Once the function is defined we can safely stop recording, since we now have all the data from a compilation of our function. It's good practice to stop recording when we don't need it, since some applications can have threads constantly polling, for example, which will waste our heap and pollute our recordings.
As soon as we have recordings, we'll see a list of threads appear on the left.
This means activity was recorded for those threads. This is already interesting, since we get to see that typing that expression in the repl runs code in multiple threads.
Double clicking on a thread will open the thread recordings. We're just going to focus on the main thread, since it's where all the compilation happens, but feel free to explore the activity of other threads.
As soon as we open a thread we'll be presented with the call tree, an expandable tree of all the function calls recorded. Time flows from top to bottom. As you can see in the picture above, just by expanding some nodes we already see some familiar functions related to reading, analyzing, macroexpanding, parsing, and emitting. By clicking on the emit-str function we also get to see the input of emit-str, which is an AST node, and its output, which is a string containing JavaScript code.
There are also 3 important tools there:
- The call tree (the one we just talked about)
- The code stepping tool, where we can step over the code forwards and backwards in time.
- The functions list, where we get to see all the functions and their calls.
Since this isn't a full FlowStorm tutorial I won't go into the details of any of these tools. You can find the user guide here for more info.
Now let's look at how we can use FlowStorm to try to understand the different phases of the compilation.
In terms of compilation, the first step performed by the ClojureScript compiler is reading.
Read takes a string as input and outputs a form, which is a nested structure of Clojure lists, vectors, maps, etc, representing the structure of the input string as Clojure data.
ClojureScript accomplishes this step by calling clojure.tools.reader/read, which is part of the clojure.tools.reader library.
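To get a feel for what "string in, form out" means, here is a small sketch you can paste into any Clojure repl. It uses `read-string` from `clojure.core` as a stand-in, purely so it runs without extra dependencies; the ClojureScript compiler itself goes through clojure.tools.reader, which also tracks things like source positions.

```clojure
;; clojure.core/read-string stands in for clojure.tools.reader/read here.
(def form (read-string "(defn sum [a b] (+ a b))"))

(list? form)            ;; => true, the whole form is a list
(count form)            ;; => 4
(first form)            ;; => defn, a symbol
(vector? (nth form 2))  ;; => true, the [a b] parameter vector
(last form)             ;; => (+ a b), a nested list
```

Nothing has been evaluated or compiled at this point; the reader only turns text into data, which is what the analyzer receives next.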
If we want to take a quick look at the read code we can use the Quick jump tool to search for a read function, like in the picture above. After hitting enter or clicking on it, it will move the stepper to the first call of this function. We can tell it was only called once from the number right next to the name.
Now we can step over it using the controls at the top or by clicking around the highlighted expressions. A non-highlighted expression means it didn't execute as part of this function frame.
On the right panel we can see the value for the current expression, the one in green.
As we can see from the read source code, it's calling read*. Looking at the read* arguments and return value we can see it goes from a SourceLoggingPushbackReader and some options into a Clojure form. If we want to step into read*, we place the debugger on any expression before the call, and then step next until we jump into its source code.
Since read* has been called multiple times there is a better way of looking at it.
We can move to the functions list tool (the last one of the bottom tabs) and filter the list, so it only shows read functions, like in the picture above.
Double clicking any of them will make the panel on the right list all the calls with their arguments. Single clicking on a call will show the return value in the bottom panel, while double clicking it will take us to step that call.
If you are following along with your own setup, take some time to explore around these read functions.
Now let's move on and explore the analysis of those forms.
Analyze is a recursive process that takes a form as input and produces a tree of AST nodes, which in this case are plain Clojure maps.
The picture above shows the call tree expansion for the beginning of the analyze call stack, where we can already see how analyze, macroexpand-1 and parse are being called.
As the analysis process walks down its input, it will macroexpand, parse and then run multiple passes on the resulting nodes for things like optimizations, type inference, etc, as we can see in the picture above. You can jump into the analyze* function body using the quick jump, in case you want to take a look.
We can take the same approach as we took for the reader, to get a better idea of these analysis functions, by going to the functions list and filtering it with analyzer/analyze, like in the picture above.
Let's take a look at analyze-seq. This time we're going to mute arguments 1, 3 and 4 using the checkboxes at the top and just look at the second, which contains the form to be analyzed.
As we can see on the right, there are calls to analyze-seq on many forms, including the one we typed in the repl.
Clojure developers will notice that the call right after the one we typed is the same expression but after a macroexpand-1 call. This is because analyze-seq also deals with macroexpansion. You can double click on any of these calls to jump into the analyze-seq body, and as you will see there, it's responsible for macroexpanding forms, and will also call itself recursively if the macroexpansion changed the form.
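The "expand once, and go again if something changed" idea can be sketched in a few lines of plain Clojure. This is only an illustration of the pattern, not the code in analyze-seq: `expansion-steps` is a hypothetical helper, and it uses the JVM Clojure `macroexpand-1`, whereas the ClojureScript analyzer has its own macroexpansion and does much more work at each step.

```clojure
(defn expansion-steps
  "Return the form followed by each successive macroexpand-1 result,
  stopping once expanding no longer changes the form."
  [form]
  (let [expanded (macroexpand-1 form)]
    (if (identical? expanded form)
      [form]
      (into [form] (expansion-steps expanded)))))

;; defn is a macro that expands into a def, and def is a special form,
;; so the expansion stops after one step.
(map first (expansion-steps '(defn sum [a b] (+ a b))))
;; => (defn def)
```

This mirrors what shows up in the functions list: one call with the `defn` form we typed, followed by a call with its expansion.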
Now let's click once on (defn sum [...] ...), the top level expression we typed in the repl.
The panel at the bottom shows the return of analyze-seq, a pretty print of the AST node built by it. Since it's a nested data structure, it's pretty inconvenient to look at in pretty print form.
Luckily FlowStorm comes with a data inspector. Let's click on the INS button to open it, which allows us to navigate these nested data structures.
Each AST node contains an :op key as the type discriminator for the node, a :children key with a vector of keys for the sub-parts of the node, plus information related to each type of node.
We can click around to navigate deeper, and then use the breadcrumbs at the top to navigate backwards.
Even with the inspector, trying to get a sense of the structure of this tree is kind of hard, so let's pull another trick.
We can take any value back to our repl by giving it a name. While having the inspector at the root of our value, click on the DEF button and then give it a name, for example def-op.
Now we can go to our Clojure repl (not the ClojureScript one) and use that value. Unless we assigned a namespace to it, it will be defined under user/def-op.
We are going to write a little helper function here to walk down the tree and print the :op with some indentation, which hopefully will help us understand more about the structure of this tree.
(defn print-ast-node
  "Recursively print ast-node ops with indentation"
  ([ast-node] (print-ast-node ast-node 0))
  ([{:keys [op children] :as ast-node} indent-level]
   (println (apply str (repeat (* 2 indent-level) " "))
            (str "op: " op))
   (doseq [ch-key children]
     (let [ch-node (get ast-node ch-key)]
       (if (vector? ch-node)
         (doseq [ch ch-node]
           (print-ast-node ch (inc indent-level)))
         (print-ast-node ch-node (inc indent-level)))))))
So we first copy and paste the above function into our Clojure repl.
And now we call it with our def-op node:
user=> (print-ast-node def-op)
 op: :def
   op: :var
   op: :fn
     op: :binding
     op: :fn-method
       op: :binding
       op: :binding
       op: :do
         op: :js
           op: :local
           op: :local
Nice, this will hopefully help us understand the tree structure better.
Moving forward, if you are following along, let's go back to the inspector and dig into this def-op node until we reach the one with :op :js. You'll have to dig into [:init :methods 0 :body :ret] by clicking on each key value.
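Since the AST is just nested maps and vectors, the same navigation can be done from the Clojure repl with `get-in`. The sketch below uses a hand-written stub, `stub-def-op`, that only contains the keys along that path so it can run anywhere; with the real value you would pass the `def-op` we defined from the inspector instead.

```clojure
;; A hand-written stub with just the keys on the path; the real node
;; carries much more information at every level.
(def stub-def-op
  {:op :def
   :init {:op :fn
          :methods [{:op :fn-method
                     :body {:op :do
                            :ret {:op :js}}}]}})

(def js-node (get-in stub-def-op [:init :methods 0 :body :ret]))

(:op js-node)
;; => :js

;; Every prefix of the path is itself a node we can look at:
(map #(:op (get-in stub-def-op %))
     [[] [:init] [:init :methods 0] [:init :methods 0 :body]])
;; => (:def :fn :fn-method :do)
```

Clicking through the inspector and calling `get-in` are two views of the same thing, which is one of the perks of the AST being plain data.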
You should now have a map with the :op :js focused in your inspector's right pane.
Let's say we want to understand the emission for this particular node.
A trick we can use is, while keeping the inspector open and focused on our value, go to the main FlowStorm window and then:
- go to the code stepping tool tab
- move the debugger to the last step by hitting the >| button.
- and now, back in the inspector, hit "Find the prev expression that contains this value", which is the left arrow at the top.
What this will do is search for the inspector's focused value starting from the end and going backwards, and if it finds it, move the stepper to that point in time, which should leave us in the emit code for the AST node we're interested in.
We should now be positioned right before calling (emit* ast), with ast being our node, inside the cljs.compiler/emit function.
To get some more context and see how we got here, we can look at the current stack in the bottom right panel. It looks like we're in the path of emitting our :def, :fn and :do AST nodes, which makes sense.
Same as cljs.analyzer/parse, which creates AST nodes based on the different Clojure special forms, cljs.compiler/emit* is also a multimethod, which emits different JavaScript for different AST nodes.
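For readers who haven't used multimethods, here is a tiny sketch of the pattern: one function name, with the implementation chosen by a value computed from the argument. `emit-node` and its node shapes are made up for this example, and dispatching on the node's :op is an assumption suggested by :op being the type discriminator; the real emit* methods are far more involved and write to *out* instead of returning strings.

```clojure
;; Hypothetical emitter: dispatch on the :op key of the node.
(defmulti emit-node :op)

(defmethod emit-node :const [{:keys [val]}]
  (pr-str val))

(defmethod emit-node :if [{:keys [test then else]}]
  (str "(" (emit-node test) " ? " (emit-node then) " : " (emit-node else) ")"))

(defmethod emit-node :default [{:keys [op]}]
  (throw (ex-info "No emitter for this op" {:op op})))

(emit-node {:op   :if
            :test {:op :const :val true}
            :then {:op :const :val 1}
            :else {:op :const :val 2}})
;; => "(true ? 1 : 2)"
```

Supporting a new kind of node is then a matter of adding one more `defmethod`, without touching the existing ones.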
It also looks like we're emitting some other code, something wrapped in a try, which we haven't typed in our repl. Some readers may have noticed this in earlier steps too. This is because the ClojureScript compiler is wrapping our code in some extra code, but we'll come to that in a minute.
Now stepping forward a couple of times should take us to the emits function, which is the function that will finally emit JavaScript, as in write a string.
Inspecting the value of s tells us that, for our node, it starts by emitting the return keyword.
We can also see that the ClojureScript compiler emits JavaScript code by writing to *out*. Since this is the place that writes JavaScript strings, it would be interesting to see all the values being written to *out* along the entire execution, not just this time.
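If writing to *out* as a way of building a string looks unusual, this small sketch shows the general technique in plain Clojure: rebind *out* to a StringWriter, let the code print, then read the accumulated text. `emits-sketch` is a hypothetical stand-in for illustration, not the compiler's emits function.

```clojure
(import 'java.io.StringWriter)

(defn emits-sketch
  "Hypothetical stand-in for an emitter: writes each piece to *out*."
  [& pieces]
  (doseq [piece pieces]
    (print piece)))

(def captured
  (let [writer (StringWriter.)]
    ;; Everything printed inside this binding goes to our writer
    ;; instead of the terminal.
    (binding [*out* writer]
      (emits-sketch "return " "(a + b)" ";"))
    (str writer)))

captured
;; => "return (a + b);"
```

Each small write on its own says little, which is why looking at all of them in sequence is more informative.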
For this we're going to introduce one more tool, the Printer. For inspecting the value of s along the execution, we start by right clicking the s expression and then Add to prints. A dialog should pop up asking for a message. We can write whatever we want here; this is the kind of message we always add to our println debugging. We can do this on as many expressions as we want, but this one should be enough for our purpose.
We can find the Printer tool by clicking on the last tab on the left.
At the top we should see a thread selector, and right after it, all our defined printers. For running the printers we need to select a thread, in this case main, and then click the refresh button next to the thread selector.
We should now see all our prints for the entire execution of the main thread, like in the picture above.
We can use the text field at the top to filter our prints if we're looking for any particular emitted JavaScript.
If we're interested in any particular print we can double click it, which will take us to the expression and time that generated that print, where we can then use the rest of the tools to keep inspecting.
Go ahead and double click on any of them, so it takes us back to our emits function.
Using the printer we get to see all the single writes to *out*, which is also all the emitted JavaScript, but it would be interesting to see it a bit nicer, so we can copy and paste it into our editor for example.
We can use the fact that *out* is bound to the same mutable value (a StringWriter) the whole time, which means that whatever reference we get to it contains the final state, the one it had after the execution.
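The following sketch shows why any reference to a mutable object reflects its final state, no matter when the reference was taken. The names `writer` and `early-ref` are just for illustration.

```clojure
(import 'java.io.StringWriter)

(def ^StringWriter writer (StringWriter.))

(.write writer "return ")
(def early-ref writer)       ; grab a reference "in the middle of the execution"
(.write writer "(a + b);")   ; the execution keeps writing afterwards

;; The early reference points at the same object, so it sees everything:
(identical? early-ref writer) ;; => true
(str early-ref)               ;; => "return (a + b);"
```

With an immutable value the early reference would have kept showing only the first write, which is the behavior we normally rely on when going back in time through a recording.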
Most of the time this isn't what we want, and FlowStorm provides a way of dealing with mutable objects; if you are interested, have a look here.
But in this case we're going to take advantage of it, and just give this StringWriter reference a name, so we can take it to our repl, the same as we did before for our def-op node. So we click on the *out* expression and then on the DEF button on the right panel. Let's call it out.
Now let's go to our Clojure repl where we can print a string representation of it.
user=> (println (.toString out))
(function() {
    try {
        return cljs.core.pr_str.call(null, (function() {
            var ret__13392__auto__ = (function() {
                cljs.user.sum = (function cljs$user$sum(a, b) {
                    return (a + b);
                });
                return (
                    new cljs.core.Var(function() {
                        return cljs.user.sum;
                    }, new cljs.core.Symbol("cljs.user", "sum", "cljs.user/sum", 1580982348, null), cljs.core.PersistentHashMap.fromArrays([new cljs.core.Keyword(null, "ns", "ns", 441598760), new cljs.core.Keyword(null, "name", "name", 1843675177), new cljs.core.Keyword(null, "file", "file", -1269645878), new cljs.core.Keyword(null, "end-column", "end-column", 1425389514), new cljs.core.Keyword(null, "source", "source", -433931539), new cljs.core.Keyword(null, "column", "column", 2078222095), new cljs.core.Keyword(null, "line", "line", 212345235), new cljs.core.Keyword(null, "end-line", "end-line", 1837326455), new cljs.core.Keyword(null, "arglists", "arglists", 1661989754), new cljs.core.Keyword(null, "doc", "doc", 1913296891), new cljs.core.Keyword(null, "test", "test", 577538877)], [new cljs.core.Symbol(null, "cljs.user", "cljs.user", 877795071, null), new cljs.core.Symbol(null, "sum", "sum", 1777518341, null), "<cljs repl>", 10, "sum", 1, 1, 1, cljs.core.list(new cljs.core.PersistentVector(null, 2, 5, cljs.core.PersistentVector.EMPTY_NODE, [new cljs.core.Symbol(null, "a", "a", -482876059, null), new cljs.core.Symbol(null, "b", "b", -1172211299, null)], null)), null, (cljs.core.truth_(cljs.user.sum) ? cljs.user.sum.cljs$lang$test : null)])));
            })();
            (cljs.core._STAR_3 = cljs.core._STAR_2);
            (cljs.core._STAR_2 = cljs.core._STAR_1);
            (cljs.core._STAR_1 = ret__13392__auto__);
            return ret__13392__auto__;
        })());
    } catch (e24083) {
        var e__13393__auto__ = e24083;
        (cljs.core._STAR_e = e__13393__auto__);
        throw e__13393__auto__;
    }
})()
And there we have it, a quick way of looking at the JavaScript generated for the entire form. Note that it is the full form that was sent to the browser, not just what we typed in the repl. It contains our sum function definition, but wrapped in some extra code. This extra code prints the result and also sets up the ClojureScript repl *1, *2 and *3 vars, and in case of an exception also *e.
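The bookkeeping done by that wrapper can be paraphrased in a few lines of Clojure. This is a loose sketch of the idea only: `history` and `eval-with-history!` are invented names, the result history lives in an atom under :r1, :r2 and :r3 instead of the real *1, *2 and *3 vars, and printing of the result is left out.

```clojure
;; :r1/:r2/:r3 play the role of *1/*2/*3, and :e the role of *e.
(def history (atom {:r1 nil :r2 nil :r3 nil :e nil}))

(defn eval-with-history!
  "Run thunk, shift the previous results down one slot and store the
  new one. On an exception, remember it and rethrow."
  [thunk]
  (try
    (let [ret (thunk)]
      (swap! history
             (fn [h] (assoc h :r3 (:r2 h) :r2 (:r1 h) :r1 ret)))
      ret)
    (catch Exception e
      (swap! history assoc :e e)
      (throw e))))

(eval-with-history! #(+ 1 2))
(eval-with-history! #(str "a" "b"))

(select-keys @history [:r1 :r2 :r3])
;; => {:r1 "ab", :r2 3, :r3 nil}
```

The order of the shifts matters in the same way as in the generated JavaScript: the oldest slot is overwritten first, so no result is lost before it has been moved.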
Another way to look at the final string is to jump to the emit-str function using the Quick jump. That function should be called only once to generate the final string.
We're reaching the end of the post, so wrapping up, we saw a bunch of tricks we can use to debug and understand the ClojureScript compiler. If we are developing it, this gives us a nice workflow:
- Start recording
- Eval some cljs code on our repl so it compiles it
- Stop recording
- Look around
- Modify/fix and re-eval our compiler code
- Repeat
All these steps should be pretty fast, since we're re-evaluating just what we changed in the ClojureScript compiler, and also compiling and recording one form at a time.
Also, all the tricks described here are by no means specific to compiler development; they can be used on any Clojure application.
And that's it. I hope you had fun reading, found it interesting or learned something.
Until next time! Happy hacking!
Published: 2023-10-19