
2023-05-02 04:11:20

A blog about computer science, programming, and whatnot.



Like most of my colleagues, I use LaTeX to write papers, reports, notes, or what have you.
In fact, I think all the places that I regularly write support some variable subset of LaTeX.
Also like most of my colleagues, I’m not a TeXnician.
I’m not proud to be ignorant in this regard, but there are only so many hours in a day, and
the gains from properly learning a huge ecosystem like LaTeX seem minuscule compared
to the initial buy-in cost.

Nonetheless, I was curious.

LaTeX and TeX, tomato tomato?
Here’s how I see it.
If LaTeX is like C++20 (large, complex, complicated, full of cruft, but still very popular),
then TeX is like C89 (small, simpler, complicated, a child of its time, and often neglected).

There’s a certain joy in going far enough down the stack that the systems you are using become simple enough to reason about on a deep level.
It’s the feeling you might get sitting down one afternoon trying to write some assembly after
a long week of debugging consistency errors in your sharded database across multiple Kubernetes clusters.
No magic, no need to constantly search StackOverflow for other people who’ve had the same problems you’re dealing with.
It’s just you and the CPU, and likely the Intel Instruction Set Manual or something as big and scary.
I wanted that, but with typesetting.

This was my romantic motivation to dig into TeX and try to see whether it really is rewarding to
step back a few decades to avoid the complexity of newer and bigger typesetting systems.
I bought the TeXbook, and read it from start to finish.
Well, some paragraphs are marked with “dangerous bends”, signalling that the content covered or the background assumed
for those paragraphs is more advanced. I read the single bends, but skipped the double bends, at least most of the time.

Somewhere in the book I found the definition of \newif, a macro that’s used to define conditionals,
which you can later query and branch on. Booleans, in other words.
I read it, and really didn’t understand a single thing,
and I figured that if I could manage to sit down and figure out what on earth this macro is doing and why, then
I’d have had a taste of what it’s like digging down this low in the world of TeX.

This post is the result of that process.

How Do I Write TeX?

This isn’t really as obvious as it might sound. After all, TeX produces a document, but when playing with macros
we really want to see what forms expand to, which macros are defined, and so on.
I have to say upfront that the approach I used here probably wasn’t the best,
because I just started tex (or sometimes pdftex; for the purposes of this post they seem to be exactly the same),
and started writing. The REPL doesn’t support readline bindings or arrow keys, or clicking to move the cursor,
so if I wanted to add something in the middle of a line, I had to hold backspace all the way back to where I wanted to go
and write out the rest of the expression. Sometimes I pasted back and forth from a text editor, which worked okay.

Here’s exactly how I got started.

/h/martin$ tex
This is TeX, Version 3.141592653 (TeX Live 2021/Arch Linux) (preloaded format=tex)
**\relax  % don't read input from a file

*\tracingall=1                 % Give us lots of output
{vertical mode: \tracingstats}
{the character =}
{horizontal mode: the character =}
{blank space  }

*\message{This will show somewhere}    % some sample message
This will show somewhere               % here's the stuff you wrote above
{blank space  }

*\def\mymacro{from the macro}  % Make a new macro
{blank space  }

*\message{\mymacro}         % \message will expand the macro

\mymacro ->from the macro   % \mymacro is expanded to `from the macro`
from the macro              % ... and we get the fully expanded form out.
{blank space  }


Input lines start with a *.
It’s very useful to set \tracingall=1, which makes TeX output a bunch of things, some of which you care about.
Note that I’ve changed up the formatting of the output throughout this post so that it’s easier to see what’s going on.

Another quick note: I didn’t want to spend hours writing an intro to TeX in addition to whatever this is, so
if you have never written a line of TeX or LaTeX, this might be difficult to follow. If you’ve written
some LaTeX, and maybe defined your own simple macros, I think you’ll be fine.

The Goal

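This is the definition we’ll unravel, copied verbatim from The TeXbook.

\outer\def\newif#1{\count@=\escapechar \escapechar=-1
  \expandafter\expandafter\expandafter
  \def\@if#1{true}{\let#1=\iftrue}%
  \expandafter\expandafter\expandafter
  \def\@if#1{false}{\let#1=\iffalse}%
  \@if#1{false}\escapechar=\count@} % the condition starts out false
\def\@if#1#2{\csname\expandafter\if@\string#1#2\endcsname}
{\uccode`1=`i \uccode`2=`f \uppercase{\gdef\if@12{}}} % `if` is required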

Don’t despair if this is nonsense:
the whole point of this post is to explain what’s going on, and to get a
better idea of how real and (somewhat) involved TeX macros work.

How TeX Reads Tokens

To start on the right foot, let’s make sure that we properly understand how TeX reads tokens.
A token is the input “unit” that TeX reads when it reads a document.
For instance, if you were to write Let $n=\numb$ be a number. then this will be transformed into a queue of tokens from which
we will read one at a time. Exactly how the tokens are split up is not crucial to understanding, but in this example
it looks something like this:

tokens = ['L', 'e', 't', ' ', $, 'n', '=', \numb, $, ...]

Notice three things.
First, a letter is a token in and of itself; we do not have one “word” be a token.
Second, $ is not the character '$', but the special begin/end math mode token.
If we were to write \$ we would get the character token '$'.
Third, the whole macro \numb is one single token.
When you hear “token”, think “input unit”.

So how does TeX read the tokens? One mental model is like this:

while tokens is not empty
    t <- pop(tokens)
    if shouldexpand(t)
        exp <- expand(t)
        pushfront(tokens, exp)
    else
        process(t)

Some tokens, like the \newif token we will figure out in this post, expand,
and the expansion is another list of tokens, some of which might be regular character tokens,
and some of which might be other tokens that also expand. Therefore, when we expand a token
we push the result back onto the front of the queue.

Note that when we expand a macro that takes arguments, like \def\paren#1{(#1)}, the expansion of \paren will
pop more tokens from the queue, and then push the tokens of the expanded form back onto the queue.

What does it mean to “process” a token? For a character, this basically means to write that character
at the current position on the page.
For a macro definition like \def\bob{123} it means making the definition and storing it somewhere in memory,
so that if you ever encounter a \bob token you know that it expands to the three tokens 1, 2, 3.

A Short Example

Let \def\A{a} \def\B{\A b} \def\C{\B\B} and the input token queue be [\C].
To make sure we understand how this works, let’s manually expand this whole thing.
The left column is the token queue, and the left side of the queue is the front, which is where
we will be working.
The right column explains what we’re about to do.

Tokens      Current action
[\C]        take \C out of the front of the queue
[]          \C expands to \B\B, which we push back
[\B \B]     take the first \B out
[\B]        \B expands to \A b
[\A b \B]   \A is taken out, expanded to a and pushed back
[a b \B]    a is taken out and processed, because it doesn’t expand.
[b \B]      b is taken out and processed.
[\B]        you get the idea…
[\A b]
[a b]
[b]

The end result of this execution is that we have sent the tokens a, b, a, b to the processing part of TeX.
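
The same walk-through can be replayed mechanically. What follows is only a rough sketch in Python (not TeX) of the mental model above: the MACROS table and the run function are made-up names for illustration, macros take no arguments, and "processing" a token just means collecting it in a list.

```python
from collections import deque

# Hypothetical macro table mirroring \def\A{a} \def\B{\A b} \def\C{\B\B}.
# Tokens are plain strings; control sequences keep their leading backslash.
MACROS = {
    "\\A": ["a"],
    "\\B": ["\\A", "b"],
    "\\C": ["\\B", "\\B"],
}

def run(tokens):
    """Expand tokens from the front of the queue; return what gets processed."""
    queue = deque(tokens)
    processed = []
    while queue:
        t = queue.popleft()
        if t in MACROS:
            # Push the expansion back onto the front, preserving its order.
            queue.extendleft(reversed(MACROS[t]))
        else:
            # A plain character: "process" it.
            processed.append(t)
    return processed

print(run(["\\C"]))  # ['a', 'b', 'a', 'b']
```

Real TeX does much more than this (arguments, modes, grouping), but the push-back-to-the-front behaviour is the part that matters for the rest of the post.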

A Primer on Catcodes

We need to know one more thing about tokens, or rather how the characters of your input are split into them.
Each character has a category code, or catcode for short. Catcodes decide how to group and split characters into
tokens. There’s a category code for letters (11), a code for space (10), and one for math shift (3) (there are also others).
This way TeX knows that the input let $ consists of three letters, one space, and one “math shift”.
This is also how TeX figures out when the name of a macro ends and new tokens begin, as in \hey3:
here we have one character with catcode 0 (the escape character \), three of catcode 11, and one of catcode 12 (“others”, which include digits).
The name of a macro is only letters, so this way TeX knows that \hey is a macro and 3 is just the next token in the queue.

But catcodes can be changed. Why is this useful? Well, if we want to make some macros that another user wouldn’t accidentally
redefine, we have the name include a character that, by default, isn’t allowed to be in a name, like @.
The catcode of @ is 12, and so the input \h@ will be read as two tokens \h and '@'. However, if we change the catcode of @ to 11
it’s as if @ is just a regular letter, and \h@ will be read as a single token \h@.

This is how we change the catcode of @ to 11 and then back to 12:

*\catcode`@=11  % Category 11 consists of regular letters
*\catcode`@=12  % Category 12 consists of "other characters"
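
To see why the catcode of @ changes how \h@ is split, here is a small Python sketch of a tokenizer. It is a heavy simplification under stated assumptions: only catcodes 0, 10, 11 and 12 are modelled, the names catcode_of and tokenize are invented for this illustration, and details such as TeX skipping spaces after a control word are left out.

```python
# Simplified catcodes: 0 = escape, 10 = space, 11 = letter, 12 = other.
def catcode_of(ch, overrides):
    """Look up a character's catcode, honouring any user overrides."""
    if ch in overrides:
        return overrides[ch]
    if ch == "\\":
        return 0
    if ch == " ":
        return 10
    if ch.isalpha():
        return 11
    return 12

def tokenize(text, overrides=None):
    """Split text into tokens; a control word is an escape char plus a run of letters."""
    overrides = overrides or {}
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if catcode_of(ch, overrides) == 0:
            j = i + 1
            while j < len(text) and catcode_of(text[j], overrides) == 11:
                j += 1
            if j == i + 1 and j < len(text):
                j += 1  # control symbol: escape char followed by one non-letter
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(ch)
            i += 1
    return tokens

print(tokenize("\\h@"))             # ['\\h', '@']
print(tokenize("\\h@", {"@": 11}))  # ['\\h@']
print(tokenize("\\hey3"))           # ['\\hey', '3']
```

The second call plays the role of \catcode`@=11: once @ counts as a letter, the run of letters after the escape character simply keeps going.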

Some Not So Bad Macros

We need to know about a few other macros that \newif uses internally. Most of these are
pretty straightforward.


\string

Takes an argument and replaces it with the non-expanded token list.
\string\foo expands to the four tokens \ f o o, no matter what the macro \foo would expand to.
A crucial detail which we will come back to is that the tokens \string produces get catcode 12 (unless it’s a space).


\escapechar

The character which is used when a control sequence is output as text. Usually set to \.
If this is set to, for instance, @, then \string\foo would expand to the four tokens @ f o o instead.


\uccode

Short for uppercase code. This allows one to set the uppercase character code for another character.
Usually this would be \uccode`x=`X \uccode`X=`X and so on, but this, like most things in TeX, can be changed,
and changes, like most things in TeX, are local to the current group.

\csname and \endcsname

Read and expand everything up until the matching \endcsname.
The expansion result should be a list of character tokens,
and this list will be made into a single control sequence token.
If this is currently not defined it will be defined to \relax.

For instance \csname hello\endcsname will expand to the single token \hello and make the macro \hello expand to \relax.
More interestingly, \def\inner{hello}\csname\inner\endcsname will do the same:
Here the \inner macro expands to the list of tokens h e l l o, and the \csname pair of macros
expand this macro, effectively replacing it with \csname hello\endcsname.


\gdef

Usually definitions made with \def are local to your scope, just like in most programming languages.
However, sometimes we want to define global macros, and \gdef does exactly this.
When a macro is defined with \gdef it’s as if it was defined in the top level scope.

{
    \def\inner{hello}
    \inner  % expands to  h e l l o
}
\inner   % this doesn't work, because \inner is no longer defined

{
    \gdef\inner{hello}
    \inner  % expands to  h e l l o
}
\inner % also expands to  h e l l o


\outer

This is a safety measure that you put before a \def which ensures that this macro
is not allowed to appear in an argument, in the parameter text, or in the replacement text of another macro.

The \expandafter Macro

Now that we’ve seen a few simple macros we turn to one that’s slightly less simple.
The \expandafter macro first reads the very next token in the queue without expanding it.
Then, it will read and expand the next token after that.
Last, it will put the first token back in front, without expanding it.
Here’s a small example of how it runs:


*\def\first{FIRST}
*\def\second{SECOND}
*\expandafter\first\second

\second ->SECOND

\first ->FIRST
{the letter F}

Here the output shows that \second is expanded before \first, and that the first token that we process is F.
Note that the second form is only expanded and not actually processed, so the following
does not work:

*\expandafter\first\def\first{another first!}

The second term, the \def, will be “expanded” (which does nothing to it), but it will not “run”, so when
\first is expanded afterwards it will still have the same meaning as before,
for instance to not be defined.

Due to how TeX expansion rules work, a macro doesn’t have to have all of
its arguments in place when you use it; currying is in a sense possible.
We can use \expandafter to make use of this fact if the first token expands to a
curried macro, and the first token in the expansion of the second token is
the argument we want to give to the curried form.

Here’s an example. Say we have a macro \twoarray that takes two things and wraps them in square
brackets separated by a comma, as well as a macro \tuple that expands to two tokens 4 and 5.
If we want to have \twoarray wrap the two tokens from \tuple, it doesn’t work out of the box:

*\def\twoarray#1#2{[ #1 , #2 ]}
*\def\tuple{4 5}
*\twoarray\tuple X  % X is just a placeholder for whatever's next; we don't want it.
[ 4 5 , X ]
% This doesn't work because `\twoarray` will read two tokens, `\tuple` and `X`

*\expandafter\twoarray\tuple X
[ 4 , 5 ] X
% This does work because `\tuple` is expanded before `\twoarray`, and so the token
% queue when we process `\twoarray` is  `4 5 X`
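
The two sessions above can be imitated with a toy expander. This is a hedged Python sketch, not how TeX is implemented: each macro is described by a hypothetical (argument count, body function) pair, arguments are always single tokens, and spaces are ignored entirely.

```python
from collections import deque

EXPANDAFTER = "\\expandafter"

# Hypothetical model: macro name -> (number of arguments, function building the body).
MACROS = {
    "\\twoarray": (2, lambda a, b: ["[", a, ",", b, "]"]),
    "\\tuple": (0, lambda: ["4", "5"]),
}

def expandable(token):
    return token == EXPANDAFTER or token in MACROS

def expand_front(queue):
    """Expand the token at the front of the queue exactly once."""
    name = queue.popleft()
    if name == EXPANDAFTER:
        saved = queue.popleft()        # skip this token without expanding it
        if queue and expandable(queue[0]):
            expand_front(queue)        # expand the token after it, once
        queue.appendleft(saved)        # put the skipped token back in front
        return
    nargs, body = MACROS[name]
    args = [queue.popleft() for _ in range(nargs)]  # arguments are popped unexpanded
    queue.extendleft(reversed(body(*args)))

def run(tokens):
    queue = deque(tokens)
    out = []
    while queue:
        if expandable(queue[0]):
            expand_front(queue)
        else:
            out.append(queue.popleft())
    return " ".join(out)

print(run(["\\twoarray", "\\tuple", "X"]))               # [ 4 5 , X ]
print(run([EXPANDAFTER, "\\twoarray", "\\tuple", "X"]))  # [ 4 , 5 ] X
```

In the first call \twoarray grabs \tuple and X as its two arguments; in the second, \tuple has already been replaced by 4 5 by the time \twoarray looks for arguments.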


So what happens when we chain multiple \expandafters together?
Let’s work it out with some notation:
dashes under a line mean \expandafter is skipping that token,
and it’s expanding the token above the hat ^.
Primed letters like a' mean expanded.

*\expandafter a  b  c  d ...
%             -  ^
% token list: a  b' c  d

With two \expandafters this becomes

*\expandafter \expandafter a  b  c  d ...
%             ------------ ^
% token list:  \expandafter a' b  c  d
*\expandafter  a' b  c  d ...
%              -  ^
% token list:  a' b' c  d

It undid itself! The expansion order was a and then b.
Let’s try three expands in a row. Now we’re getting somewhere, because when expanding the second token that \expandafter finds,
we might end up reading extra tokens, if that token takes arguments. In this
case this token is \expandafter, which does indeed take two arguments!

*\expandafter \expandafter \expandafter a  b  c  d ...
%             ------------ ^^^^^^^^^^^^
%                          [eat 2 arguments]
*             \expandafter              a  b' c  d ...
% This is just the first example again.
% token list:  a  b'' c  d ...

and we’re again back to having the expansion order of a and b flipped.
Despite this though, they are not identical, because \expandafter doesn’t expand a form until it only expands to itself, but only once.
We can think of regular expansion as taking out the next token in the queue
and, if it is expandable, pushing the expansion back onto the queue.

Let’s get concrete.
As a warm up, here is the simple case where the two forms are identical, namely when expanding once is fully expanded.
The list of \A ->a under each input line is the evaluation sequence, where \A ->a means that the macro \A expands to the token a.


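*\A\B\C
\A ->a   \B ->b   \C ->c
*\expandafter\A\B\C
\B ->b   \A ->a   \C ->c
*\expandafter\expandafter\A\B\C
\A ->a   \B ->b   \C ->c
*\expandafter\expandafter\expandafter\A\B\C
\B ->b   \A ->a   \C ->c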

Note that just like we said above, the first and third expansion orders are the same, and the second and fourth are the same.

Next we make it slightly more interesting by expanding macros whose body is another macro:


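*\AA\BB\CC
\AA ->\A   \A ->a     \BB ->\B   \B ->b   \CC ->\C   \C ->c
*\expandafter\AA\BB\CC
\BB ->\B   \AA ->\A   \A ->a     \B ->b   \CC ->\C   \C ->c
*\expandafter\expandafter\AA\BB\CC
\AA ->\A   \BB ->\B   \A ->a     \B ->b   \CC ->\C   \C ->c
*\expandafter\expandafter\expandafter\AA\BB\CC
\BB ->\B   \B ->b     \AA ->\A   \A ->a   \CC ->\C   \C ->c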

All four expansion orders are distinct, in contrast with the last example.
With four \expandafters we’re back to as if we had none.

What if we had \AAA and friends?

The TeX tracing output is getting pretty big, so I’ve compressed it down to the following table,
where the left column is the number of \expandafters before \AAA\BBB\CCC,
and each row is the order in which macros were expanded.
For instance, in the first row we first expanded \AAA, then \AA, then \A and so on.

0     AAA     AA      A    BBB     BB      B    CCC     CC      C
1     BBB    AAA     AA      A     BB      B    CCC     CC      C 
2     AAA    BBB     AA      A     BB      B    CCC     CC      C 
3     BBB     BB    AAA     AA      A      B    CCC     CC      C 
4     AAA     AA    BBB      A     BB      B    CCC     CC      C
5     BBB    AAA     BB     AA      A      B    CCC     CC      C 
6     AAA    BBB     BB     AA      A      B    CCC     CC      C 
7     BBB     BB      B    AAA     AA      A    CCC     CC      C 
8     AAA     AA      A    BBB     BB      B    CCC     CC      C 

After 8 of them we’re back to where we started. Also note that the \CCCs never change.
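
The table can be cross-checked with a toy model of \expandafter. The Python below is a sketch under assumptions, not TeX itself: macros take no arguments, the names make_macros, expand_once and expansion_order are invented for the illustration, and the only primitive modelled is \expandafter.

```python
from collections import deque

EXPANDAFTER = "\\expandafter"

def make_macros():
    # Hypothetical chains: \AAA -> \AA -> \A -> a, and likewise for B and C.
    macros = {}
    for letter in "ABC":
        macros["\\" + letter * 3] = ["\\" + letter * 2]
        macros["\\" + letter * 2] = ["\\" + letter]
        macros["\\" + letter] = [letter.lower()]
    return macros

MACROS = make_macros()

def expand_once(queue, order):
    """Expand the token at the front of the queue exactly once."""
    t = queue.popleft()
    if t == EXPANDAFTER:
        saved = queue.popleft()          # skip this token without expanding it
        if queue and (queue[0] == EXPANDAFTER or queue[0] in MACROS):
            expand_once(queue, order)    # expand the token after it, once
        queue.appendleft(saved)          # put the skipped token back in front
    else:
        order.append(t.lstrip("\\"))     # record which macro was expanded
        queue.extendleft(reversed(MACROS[t]))

def expansion_order(n):
    """Order of macro expansions with n copies of the primitive before the three macros."""
    queue = deque([EXPANDAFTER] * n + ["\\AAA", "\\BBB", "\\CCC"])
    order = []
    while queue:
        if queue[0] == EXPANDAFTER or queue[0] in MACROS:
            expand_once(queue, order)
        else:
            queue.popleft()              # a plain character: just "process" it
    return order

for n in range(9):
    print(n, " ".join(expansion_order(n)))
```

Under this model the nine printed rows should line up with the table above, including row 8 matching row 0.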


Start Actually Expanding \newif

If you’ve made it this far, good job! I realize this is a fair amount of prerequisites before
getting to the point of the post.

Here’s the definition of \newif again, but formatted a little differently:

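\outer\def\newif#1{%
    \count@=\escapechar
    \escapechar=-1
    \expandafter\expandafter\expandafter \def\@if#1{true}{\let#1=\iftrue}%
    \expandafter\expandafter\expandafter \def\@if#1{false}{\let#1=\iffalse}%
    \@if#1{false} % the condition starts out false
    \escapechar=\count@
}

\def\@if#1#2{\csname\expandafter\if@\string#1#2\endcsname}

{
    \uccode`1=`i
    \uccode`2=`f
    \uppercase{\gdef\if@12{}}
} % `if` is required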

Let’s do this in parts, starting with the bottom group, then the middle \def, and then move on to the actual \newif.
Note that only the first form is the actual body of \newif and that the bottom group and the \def in the middle
are just part of the one-time setup.
We’ll start with the bottom group.

The Bottom Group

{\uccode`1=`i \uccode`2=`f \uppercase{\gdef\if@12{}}} % `if` is required

Recall from before that the \uccode macro sets the character code of the uppercase version of a character,
so we can for instance change the uppercase of g to be H by writing \uccode`g=`H.
In our snippet we’re setting the uppercase version of the digits 1 and 2 to be i and f. Yes really.
Also recall that the change is local to the current group, so this change will be undone when the group ends, right after the third macro.

So we’ve changed the uppercase of 1 and 2, and next we’re uppercasing a \gdef of \if@12.

Let’s make this slightly easier by only having one character we uppercase:


*{\uccode`1=`M \uppercase{\gdef\bob1{bob}}}
\bob M->BOB

Notice that the name of the macro is just \bob, not \bob1 or \bobM.

A note about more advanced parameter texts

TeX allows us to require that there are other tokens in the argument list of a macro expansion, or that the arguments are delimited by
certain tokens.
For instance, consider the following:

*\def\commasep#1,#2{(#1, #2)}
*\message{\commasep 1 2 3 , 9 8 7}
(1 2 3 ,9) 8 7

We see that the first argument was not really just the first token, but all tokens up until we hit the , which
we had after the #1 in the parameter text.
The last argument, however, was just the next token.

We can also do this:

*\def\mfirst m#1{(#1)}
*\message{\mfirst a a}
! Use of \mfirst doesn't match its definition.
<*> \mfirst a
*\message{\mfirst m a}

Here we’ve said that we need an m before we get the next token as the first argument to the macro.
If the next token is not an m, like in the first attempt, we get an error.
It’s basically a very simple version of pattern matching.

Again to Bob

In our definition of \bob we have ensured that the parameter text should end with an uppercase 1, which was M.
There’s a problem though:

*\bob M
! Use of \bob doesn't match its definition.
<*> \bob M


The reason this doesn’t work is that while the uppercase of 1 is temporarily set to M
and the macro really does expect to be called as \bob M, the M we send in now has
the wrong category code: it’s a letter (catcode 11) and not an “other” character like a digit (catcode 12).
We can temporarily change this in a group, and it will work.

*{\catcode`M=12 \bob M}
{begin-group character {}
{entering simple group (level 1)}
{changing \catcode77=11}
{into \catcode77=12}

\bob M->BOB
{the letter B}
{end-group character }}
{restoring \catcode77=11}
{leaving simple group (level 1)}
{blank space  }


Now we’re ready to understand the current snippet:

{\uccode`1=`i \uccode`2=`f \uppercase{\gdef\if@12{}}} % `if` is required

This will define a macro \if@ that ensures that the first two tokens after it are i and f with category code 12.
Also note that it will expand to nothing, but it will eat the matched tokens in the parameter list.
In other words:

*\def\eat h{H} \message{\eat hello}
Hello

The h is eaten and replaced with the body of the macro, H, and the rest of the tokens ello are just
characters so nothing is done to them, and the result is Hello.

To summarize, we’ve now globally defined a macro \if@ which ensures that, when used, the next two tokens in the
token list are i and f with catcode 12, and these tokens will be taken out of the token list.

The Middle \def

Moving on to this part:

\def\@if#1#2{\csname\expandafter\if@\string#1#2\endcsname}


Let’s peel the onion. We’ve got a \csname/\endcsname pair, so the output of the macro
will be a control sequence name, which will, unless already defined, be defined to expand to \relax.
The name will be the result of \expandafter\if@\string#1#2;
the arguments passed to \@if (the \def we’re looking at) will thus be sent to \if@,
but the first argument will be eaten by \string first.
We just learned that the only thing that \if@ does is to ensure that the first two tokens given
are i f of catcode 12. And it just so happens that the tokens that we get from expanding \string
are exactly of catcode 12!

Let’s try to expand \@if{\ifeven}{true}:

\csname \expandafter\if@\string\ifeven{t r u e}\endcsname
\csname \if@ i f e v e n {t r u e}\endcsname
\csname e v e n {t r u e}\endcsname
\csname e v e n t r u e\endcsname   % \csname doesn't care about grouping

The result is a single control sequence token with the name \eventrue.
That’s it! As long as the \string expansion of the first argument starts with i f
we will get a control sequence token that is the concatenation of the two arguments, minus the leading if.
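
Stripped of the token juggling, \@if is a small string operation. The Python sketch below models only that: string_of and at_if are invented names, escapechar=None stands in for \escapechar=-1, and a ValueError stands in for TeX's "doesn't match its definition" error.

```python
def string_of(control_sequence, escapechar="\\"):
    """Model \\string: the name as characters, prefixed by the escape
    character, or by nothing when escapechar is None."""
    name = control_sequence.lstrip("\\")
    return (escapechar or "") + name

def at_if(control_sequence, suffix, escapechar=None):
    """Model \\@if#1#2: strip a mandatory leading "if" and append the suffix."""
    chars = string_of(control_sequence, escapechar)
    if not chars.startswith("if"):
        # Stands in for the error TeX raises when the delimiters don't match.
        raise ValueError("name must start with 'if': " + chars)
    return "\\" + chars[2:] + suffix

print(at_if("\\ifeven", "true"))  # \eventrue
print(at_if("\\ifred", "false"))  # \redfalse
```

Calling at_if("\\ifred", "true", escapechar="\\") raises the error, because the stringified name then starts with a backslash rather than with if.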

The First \def

Phew, back at the top. Here it is, once more:

\outer\def\newif#1{%
    \count@=\escapechar
    \escapechar=-1
    \expandafter\expandafter\expandafter \def\@if#1{true}{\let#1=\iftrue}%
    \expandafter\expandafter\expandafter \def\@if#1{false}{\let#1=\iffalse}%
    \@if#1{false} % the condition starts out false
    \escapechar=\count@
}

We’re almost there; it’s just a matter of piecing together some of the parts that we’ve already seen.
First we can note that we’re temporarily setting \escapechar to be -1 and then restoring it
at the end. There are two questions we can answer here: (1) why do we set it, and (2) why can’t we group it instead?

  1. We want the argument to \newif to be a control sequence, like \newif\ifred,
    and we also need to check that the given control sequence starts with if,
    which we do in \if@ via the \string macro. If naively applied, \string\ifred would
    expand to \ i f r e d, but we need it to be i f r e d. By setting \escapechar=-1
    we make \string output nothing for \, and we’re good.
  2. Had we used grouping, the \defs we have inside would be local to the group and effectively destroyed
    by the time we’re done expanding \newif. If we were to use \gdef then all macros defined with \newif would have
    to be global. This way we can have the user define \newifs that are local to their groups.

That only leaves three lines in the macro body, and two of them are of the same form.
From earlier we remember that three \expandafters would expand the second token in the token list twice.
Let’s assume #1 = \ifred. With the complete form

\expandafter\expandafter\expandafter \def \@if \ifred {true} {\let \ifred = \iftrue}

we would first expand \@if, which will eat two arguments, #1 and {true}, and be replaced with the
body of the macro, as seen above. Then we need a second expansion to expand the \csname pair,
and this will expand to the control sequence token \redtrue. This would be put back in the token queue,

\expandafter \def \csname \expandafter\if@\string\ifred{true}\endcsname{\let \ifred = \iftrue}
\def \redtrue{\let \ifred = \iftrue}

and at the end we have a familiar form. The same happens with the false variant.
The next line is then run:

\@if\ifred{false} % expand:
\csname \expandafter\if@\string\ifred{false}\endcsname  % eval the \csname pair
\redfalse  % we just defined this macro
\let\ifred=\iffalse  % run this

Finally, we restore \escapechar to whatever it was originally.

In Conclusion

Taking it all together, running \newif\ifred expands to this:

% In the preamble we have the forms
\def\@if#1#2{\csname\expandafter\if@\string#1#2\endcsname}
{\uccode`1=`i \uccode`2=`f \uppercase{\gdef\if@12{}}} % `if` is required

% The user writes
\newif\ifred
% .. which expands to
\expandafter\expandafter\expandafter \def\@if\ifred{true}{\let\ifred=\iftrue}
\expandafter\expandafter\expandafter \def\@if\ifred{false}{\let\ifred=\iffalse}
\@if\ifred{false}
% ... which is basically the same as
\def\redtrue{\let\ifred=\iftrue}
\def\redfalse{\let\ifred=\iffalse}
\let\ifred=\iffalse

and that’s it!
So hey, we had to peel a few onions, but in the end we managed to unravel the mystery and
really understand what’s going on in \newif; it turns out it’s quite a lot, even though the main
functionality seems to be that we don’t have to write these three lines every time we want to define a new conditional,
but that only one suffices.

If you want to know more “exact” definitions and edge cases, check out this site;
I went back and forth between that and the TeXbook when writing this post, and having a searchable index of basically
the entire language is, well, indispensable. Of course, if you don’t know much about TeX from before
I can only assume that the reference will be hard to dig into.

Notes, comments, questions, and tomatoes can be sent to my public inbox.

Hope you learned something, and thanks for reading.

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License
