Announcing Topiary – Tweag

2023-05-19 18:58:49

Topiary aims to be a universal formatter engine within the
Tree-sitter ecosystem. Named after the art of clipping or trimming
trees into fantastic shapes, it is designed for formatter authors and
formatter users:

  • Authors can create a formatter for a language without having to write
    their own formatting engine, or even their own parser.

  • Users benefit from uniform, comparable code style, across multiple
    languages, with the convenience of a single formatter tool.

The core of Topiary is written in Rust, with declarative formatting
rules for bundled languages written in the Tree-sitter query
language. In this first release, we have concentrated on formatting
OCaml code, capitalising on the OCaml expertise within the Topiary
Team and our colleague, Nicolas Jeannerod.

All development and releases happen over in the Topiary GitHub
repository.





Motivation

Coding style has historically been a matter of personal choice. This is
inherently subjective, leading to bikeshedding over formatting choices,
rather than meaningful discussion during review. Prescribed style
guides, linters and ultimately automated formatters (popularised by
gofmt, whose developers had the insight to impose “good enough” uniform
formatting on a codebase) have helped solve these issues.

This motivated research into developing a formatter for our Nickel
language. However, its internal parser did not provide a syntax
tree that retained enough context to allow the original program to be
reconstructed after parsing. After creating a Tree-sitter grammar for
Nickel, for syntax highlighting, we concluded that it would be possible
to leverage Tree-sitter for formatting as well.

But why stop at Nickel? Topiary generalises this approach for any
language that doesn’t employ semantic whitespace (for which
specialised formatters, such as our Haskell formatter Ormolu, are
required) by expressing formatting style rules in the Tree-sitter
query language. It thus aspires to be a “universal formatter engine”
for such languages, enabling the fast development of formatters,
provided a Tree-sitter grammar is available.

Design Principles

To that end, Topiary has been created with the following goals in mind:

  • Use Tree-sitter for parsing, to avoid writing yet another engine for
    a formatter.
  • Expect idempotency. That is, formatting of already-formatted code
    doesn’t change anything.
  • For bundled formatting styles to meet the following constraints:
    • Compatible with attested formatting styles used for that language in
      the wild.
    • Faithful to the author’s intent: if code has been written such that
      it spans multiple lines, that decision is preserved.
    • Minimise changes between commits such that diffs focus mainly on the
      code that’s changed, rather than superficial artefacts.
    • Be well-tested and robust, such that they can be trusted on large
      projects.
  • For end users, the formatter should run efficiently and integrate with
    other developer tools, such as editors and language servers.
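
The idempotency goal can be read as a simple property: formatting the
formatter’s own output should give back the same text. The Rust sketch
below illustrates such a check. Note that `toy_format` and
`is_idempotent_on` are hypothetical names invented for this
illustration (the toy formatter only collapses whitespace); they are
not part of Topiary’s API.

```rust
/// Hypothetical stand-in for a formatter: collapses every run of
/// whitespace into a single space and trims both ends.
fn toy_format(input: &str) -> String {
    input.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Returns true if formatting the already-formatted output changes
/// nothing, i.e. `format(format(x)) == format(x)` for this input.
fn is_idempotent_on<F>(format: F, input: &str) -> bool
where
    F: Fn(&str) -> String,
{
    let once = format(input);
    let twice = format(&once);
    once == twice
}
```

A call such as `is_idempotent_on(toy_format, "let  x =   1")` holds for
the toy formatter, whereas a formatter that always appends a trailing
space would fail the check.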

How it Works

As long as a Tree-sitter grammar is defined for a language,
Tree-sitter can parse it and build a concrete syntax tree.
Tree-sitter also allows us to run queries against this tree. We can make
use of these to target interesting subtrees (e.g., an if block or a
loop), to which we can apply formatting rules. These cohere into a
declarative definition of how that language should be formatted.

For instance:

(
  [
    (infix_operator)
    "if"
    ":"
  ] @append_space
  .
  (_)
)

This will match any node that the grammar has identified as an
infix_operator, or the anonymous nodes containing if or : tokens,
immediately followed by any named node (represented by the (_)
wildcard pattern). The query matches on subtrees of the same shape,
where the annotated node within it will be “captured” with the name
@append_space; one of many formatting rules we
have defined. Our formatter runs through all matches and captures, and
when we process any capture called @append_space, we append a space
after the annotated node.

Before rendering the output, Topiary does some post-processing, such as
squashing consecutive spaces and newlines, trimming extraneous
whitespace, and ordering indentation and newline instructions
consistently. This means that you can, for example, prepend and append
spaces to if and true, and Topiary will still output if true with
just one space between the words.
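
The squashing and trimming steps can be sketched in Rust as follows.
This is a simplified illustration under stated assumptions, not
Topiary’s implementation: the `Atom` type and the `post_process` and
`render_atoms` functions are hypothetical, indentation is left out, and
every run of newlines is reduced to a single one.

```rust
/// Hypothetical output atoms, invented for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Atom {
    Text(String),
    Space,
    Newline,
}

/// Squashes runs of spaces and newlines, drops spaces adjacent to a
/// newline, and trims leading and trailing whitespace atoms.
fn post_process(atoms: &[Atom]) -> Vec<Atom> {
    let mut out: Vec<Atom> = Vec::new();
    for atom in atoms {
        match atom {
            Atom::Text(_) => out.push(atom.clone()),
            Atom::Space => {
                // Keep a space only directly after text.
                if matches!(out.last(), Some(Atom::Text(_))) {
                    out.push(Atom::Space);
                }
            }
            Atom::Newline => {
                // A newline absorbs the space before it, if any.
                if matches!(out.last(), Some(Atom::Space)) {
                    out.pop();
                }
                // Keep a newline only directly after text.
                if matches!(out.last(), Some(Atom::Text(_))) {
                    out.push(Atom::Newline);
                }
            }
        }
    }
    // Trim trailing whitespace atoms.
    while matches!(out.last(), Some(Atom::Space | Atom::Newline)) {
        out.pop();
    }
    out
}

/// Renders atoms to a string.
fn render_atoms(atoms: &[Atom]) -> String {
    atoms
        .iter()
        .map(|a| match a {
            Atom::Text(t) => t.as_str(),
            Atom::Space => " ",
            Atom::Newline => "\n",
        })
        .collect()
}
```

Feeding it the atoms for a space, if, two spaces, true and a trailing
space yields if true with a single space, matching the behaviour
described above.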

To make this more concrete, consider the expression 1+2. This has the
following syntax tree, if it’s interpreted as OCaml, where the match
described by the above query is highlighted in purple:

Syntax tree, with the match highlighted

The @append_space capture instructs Topiary to append a space after
the infix_operator, rendering 1+ 2. Repeating this process for every
syntactic structure we care about, making judicious generalisations
wherever possible, leads us to an overall formatting style for a
language.
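
To illustrate the 1+2 example, here is a small Rust sketch that mimics
the query on a flat list of sibling leaves. It is a toy model, not
Tree-sitter’s or Topiary’s API: the `Leaf` type, the function names and
the `number` node kind are assumptions made for this illustration.

```rust
/// A leaf of a toy syntax tree, invented for this sketch.
struct Leaf {
    kind: &'static str,
    text: &'static str,
    /// Named nodes come from grammar rules; anonymous ones are literal tokens.
    named: bool,
}

/// Mimics the example query on a flat list of siblings: a leaf is
/// "captured" as @append_space if its kind is one of the listed
/// alternatives and it is immediately followed by a named sibling.
fn append_space_captures(siblings: &[Leaf]) -> Vec<usize> {
    const KINDS: [&str; 3] = ["infix_operator", "if", ":"];
    siblings
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| KINDS.contains(&pair[0].kind) && pair[1].named)
        .map(|(i, _)| i)
        .collect()
}

/// Renders the leaves, appending a space after each captured one.
fn render_leaves(siblings: &[Leaf]) -> String {
    let captured = append_space_captures(siblings);
    let mut out = String::new();
    for (i, leaf) in siblings.iter().enumerate() {
        out.push_str(leaf.text);
        if captured.contains(&i) {
            out.push(' ');
        }
    }
    out
}

/// The expression `1+2` as three sibling leaves (kind names assumed).
fn one_plus_two() -> Vec<Leaf> {
    vec![
        Leaf { kind: "number", text: "1", named: true },
        Leaf { kind: "infix_operator", text: "+", named: true },
        Leaf { kind: "number", text: "2", named: true },
    ]
}
```

Here `render_leaves(&one_plus_two())` produces 1+ 2: only the operator
is captured, so only it gains a trailing space. A second rule that
prepends a space to the operator would be needed to reach 1 + 2.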

As a formatter author, defining a style for a language is just a matter
of building up these queries. End users can then apply them to their
codebase with Topiary, to render their code in this style.

Related Tools

Topiary is not the first tool to use Tree-sitter beyond its original
scope, nor is it the first tool that attempts to be a formatter for
multiple languages (e.g., Prettier). This section contains some tools
that we drew inspiration from, or used during the development of
Topiary.

Tree-sitter Specific

Meta-Formatters

  • treefmt: A general formatter orchestrator, which unifies formatters
    under a common interface.
  • format-all: A formatter orchestrator for Emacs.
  • null-ls.nvim: An LSP framework for Neovim that facilitates formatter
    orchestration.

Getting Started

We’re really excited about Topiary and the potential it has in this
space.

This first release concentrates on formatting support for OCaml, as well
as simple languages, such as JSON and TOML. Experimental formatting
support is also available for Nickel, Bash, Rust, and Tree-sitter’s
own query language; these are under active development or serve a
pedagogical end for formatter authors.

We’d highly encourage you to try Topiary and invite you to check out
the Topiary GitHub repository to see for yourself.
Information on installing and using Topiary can be found in this
repository, where we’d also welcome contributions,
feature requests, and bug reports.
