The Case of the Missing 4th Commodore BASIC Variable (and the 5th Byte)

2023-03-15 04:14:39

One more detective story.

Title illustration

We’ve met them already, back in the happier days of ’20, when things still seemed right, before that cloud of gloom settled over the town: the jolly bunch known as the Commodore BASIC variables. To common knowledge, there are 3 of them, Float, Integer and String, and if you browse the gazettes and stories distributed over the main street counters of the Internet, this may be all you know about them. Each of them is known by their signature grin and each of them comes with a purpose.

Let’s have them rounded up for a quick identification:

Mug   Memory Signature    Business                Stature, Personal Traits
A1    A 1  (0x41 0x31)    Floating Point Number   5 bytes: exponent/sign, 4 bytes mantissa
I2%   I̅ 2̅  (0xC9 0xB2)    Integer Number          5 bytes: 2 bytes binary value, 3 zero-bytes (unused)
S3$   S 3̅  (0x53 0xB3)    String                  5 bytes: length, 2-byte memory pointer, 2 zero-bytes

Each of them is 7 bytes in memory: 2 bytes for the name, followed by a 5-byte variable body, which does the actual business. The name also encodes their type, so that you already know who you’re dealing with as soon as they come around. They don’t make much of a secret of their business, as they proudly show it off, right in their face, by sign marks sprinkled all over them.

Specifically, Float comes with a clean baby face, with no marks at all, fresh ASCII characters all over. Integer, however, is quite another character, marked by signs on both cheeks, and ol’ String is known by a single sign mark on the second, right-hand side of his signature grin.

- Commodore BASIC variables by sign-bit -

0 0   Float
1 1   Integer
0 1   String
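
The sign-bit table above can be read as a small lookup. The following Python sketch is only an illustration of that table, not anything taken from the BASIC ROM; the function name `classify` and the label "unknown" for unlisted bit patterns are assumptions made for this example.

```python
def classify(name: bytes) -> tuple[str, str]:
    """Return (type, plain name) for a 2-byte variable name field."""
    first, second = name
    # Sign bits (bit 7) of the first and second name byte, as in the table.
    kinds = {
        (0, 0): "float",
        (1, 1): "integer",
        (0, 1): "string",
    }
    bits = (first >> 7, second >> 7)
    # Strip the sign bits to recover the ASCII letters; skip an empty 2nd byte.
    plain = "".join(chr(b & 0x7F) for b in name if b & 0x7F)
    return kinds.get(bits, "unknown"), plain


for raw in ((0x41, 0x31), (0xC9, 0xB2), (0x53, 0xB3)):
    print(classify(bytes(raw)))
```

Any combination that is not in the table is reported as "unknown" here, which leaves exactly one pattern of sign bits unaccounted for.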

If you have been around for some time in the backyards and alleys they call the Binaries, you eventually develop a feel for this. Something was telling me that this may not be all, that there may still be some in hiding. A little something we hadn’t seen yet. Who knows, maybe a damsel in distress?

Just to put you in the picture, I interrogated them with my trusty PET 2001 emulator, which now comes with a quick tool for disassembling variables as in memory. (That’s another story, stay tuned.) There’s no hiding anymore and here is what they look like without their pretty listing clothes:

Screenshot of an emulated PET screen showing a BASIC program
No-gloves investigation into Float, Integer, and String.
### COMMODORE BASIC ###

 15359 BYTES FREE

READY.
10 A1 =2.345
20 I2%=258
30 S3$="BLA"

RUN

READY.
█

→ Utils/Export → Disassemble Variables

                         .[simple BASIC variables]

042B  41 31               A1
042D  82 16 14 7A E2      =  2.345
0432  C9 B2               I2%
0434  01 02 00 00 00      =  258
0439  53 B3               S3$
043B  03 24 04 00 00      len: 3, @ $0424

                         .[end of BASIC variables]
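
To make the three variable bodies in this read-out a bit more tangible, here is a minimal Python sketch that decodes them. It assumes the commonly documented 5-byte float layout of Microsoft BASIC on the 6502 (exponent with a bias of 128, mantissa normalized to 0.5 ≤ m < 1, the mantissa's top bit replaced by the sign); this layout is an assumption of the sketch and not spelled out in the text above. The function names are made up for the example.

```python
def decode_float(body: bytes) -> float:
    """Decode a 5-byte float body (assumed layout: exponent, 4 mantissa bytes)."""
    exp, m1, m2, m3, m4 = body
    if exp == 0:  # an exponent of zero means the value is 0
        return 0.0
    sign = -1.0 if m1 & 0x80 else 1.0
    # Restore the implied leading 1 bit that the sign bit replaced.
    mantissa = ((m1 | 0x80) << 24) | (m2 << 16) | (m3 << 8) | m4
    return sign * (mantissa / 2**32) * 2.0 ** (exp - 128)


def decode_integer(body: bytes) -> int:
    """First 2 bytes hold the value, high byte first (01 02 -> 258)."""
    return int.from_bytes(body[:2], "big", signed=True)


def decode_string(body: bytes) -> tuple[int, int]:
    """Return (length, address); the pointer is stored low byte first."""
    return body[0], body[1] | (body[2] << 8)


print(round(decode_float(bytes.fromhex("82 16 14 7A E2")), 3))
print(decode_integer(bytes.fromhex("01 02 00 00 00")))
length, address = decode_string(bytes.fromhex("03 24 04 00 00"))
print(length, f"${address:04X}")
```

Note that the string pointer refers right into the program text: with the program starting at $0401, $0424 is where the characters of "BLA" sit in line 30.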

(Mind that they look a bit different when they come in a flock and identify only by subscript.)

But where is the story in that, and what about the damsel?

Another Type?

It wasn’t before a friend of mine came around with an old source of his that I caught a first glimpse of her: (Fancy talk aside, of which we may have had enough by now, this was Jason Cook, who became an invaluable beta tester for the new version of the emulator. Check out his new PET game!)

1C0A  D2 00 B4 0A 13 1C B5  ;var: "R" + sign-bit, 0

There she was, shyly revealing the sign-bit that adorned her first byte!

So there really are,

- Commodore BASIC variables by sign-bit -

0 0   Float
1 1   Integer
0 1   String
1 0   damsel in distress?

But who was she, and was she actually in distress?

This is an even deeper mystery, since Commodore never made much of a mystery of variable formats, right from the beginning. The PET manuals clearly describe how BASIC interacts with memory and provide some examples of in-memory formats, but they only mention 3 types: floating point, integer, and string. So what could this 4th variable type be, and what mysteries are lurking behind this?

I knew some already, namely that she was known by the single letter “R”. So it wasn’t that hard to trace her down to the origins, hidden in a bunch of densely formatted BASIC statements:

150 DEFFNR(X)=INT(X*RND(U)):GOSUB8010:A1$="NLTSMR"

(STARTREK1978.PRG by Jason Cook)

It’s a DEFFN variable! It actually makes some sense that these user-defined functions should be stored as variables, in order to look them up by name.

So let’s have a closer look at her (*blush*) anatomy…

In order to do so, let’s come up with a much simpler example that lends itself a bit more easily to investigation:

10 DEFFNR(X)=1+X*X
20 PRINT FNR(3)

RUN
 10

Now let’s have a look at the variable as in memory:

→ Utils/Export → Disassemble Variables

                         .[simple BASIC variables]

0420  D2 00               FNR()
0422  0C 04 29 04 31      – ??? –
0427  58 00               X
0429  00 00 00 00 00      =  0

                         .[end of BASIC variables]

And, as we’re at it, let’s inspect the tokenized program as in memory, as well:

→ Utils/Export → Disassemble Program

                         .[tokenized BASIC text]

0401  12 04              link: $0412
0403  0A 00              line# 10
0405  96                 token DEF
0406  A5                 token FN
0407  52 28 58 29        ascii «R(X)»
040B  B2                 token =
040C  31                 ascii «1»
040D  AA                 token +
040E  58                 ascii «X»
040F  AC                 token *
0410  58                 ascii «X»
0411  00                 -EOL-
0412  1E 04              link: $041E
0414  14 00              line# 20
0416  99                 token PRINT
0417  20                 ascii « »
0418  A5                 token FN
0419  52 28 33 29        ascii «R(3)»
041D  00                 -EOL-
041E  00 00              -EOP- (link: null)

                         .[end of BASIC text]

A versed investigator of BASIC affairs may have spotted it right away: the first four bytes of the variable body are two pointers into memory, as given away by their second (high) byte of 04, pointing at addresses in the 0x0400–0x04FF range. On the PET, BASIC memory starts at 0x0401 and is populated by the tokenized BASIC text, followed by simple variables and then arrays, if there are any.
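
For orientation, the tokenized program can be walked line by line by following its link pointers. The Python sketch below does this for the two-line example; it assumes the standard line layout of a 2-byte link, a 2-byte line number (both low byte first), the tokenized text, and a terminating zero byte. The names `BASE`, `PROGRAM` and `walk_lines` are made up for this illustration.

```python
BASE = 0x0401  # start of BASIC text on the PET

# Bytes of the example program; line numbers are assumed to be stored
# as 2-byte values, low byte first (10 -> 0A 00, 20 -> 14 00).
PROGRAM = bytes.fromhex(
    "12 04 0A 00 96 A5 52 28 58 29 B2 31 AA 58 AC 58 00 "  # 10 DEFFNR(X)=1+X*X
    "1E 04 14 00 99 20 A5 52 28 33 29 00 "                 # 20 PRINT FNR(3)
    "00 00"                                                # end of program
)


def walk_lines(mem: bytes, base: int):
    """Yield (address, line number, tokenized text) for each BASIC line."""
    addr = base
    while True:
        off = addr - base
        link = mem[off] | (mem[off + 1] << 8)
        if link == 0:  # a null link marks the end of the program
            return
        number = mem[off + 2] | (mem[off + 3] << 8)
        text = mem[off + 4 : link - base - 1]  # without the end-of-line zero
        yield addr, number, text
        addr = link


lines = list(walk_lines(PROGRAM, BASE))
for addr, number, text in lines:
    print(f"${addr:04X}  line {number}: {text.hex(' ')}")
```

Everything after the final null link, here from $0420 on, belongs to the simple variables.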

Let’s rearrange this:

0401  12 04              link: $0412
0403  0A 00              line# 10
0405  96                 token DEF
0406  A5                 token FN
0407  52 28 58 29        ascii «R(X)»
040B  B2                 token =
040C  31                 ascii «1»
040D  AA                 token +
040E  58                 ascii «X»
040F  AC                 token *
0410  58                 ascii «X»
0411  00                 -EOL-

      (...)

0420  D2 00               FNR()
0422  0C 04               pointer to $040C (low, high)
0424  29 04               pointer to $0429 (low, high)
0426  31                  – ??? –
0427  58 00               X
0429  00 00 00 00 00      =  0

The first pointer taps directly into the function body, right after the assignment operator of the function definition.
The second pointer taps directly into the variable body of the argument “X”, which is actually a global variable. (Which does make some sense, as there are only global variables in BASIC.)
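
Splitting the FN variable body into its parts can be sketched in a few lines of Python. This is merely a reading aid for the dump above; the function name `decode_fn` and the dictionary keys are hypothetical.

```python
def decode_fn(body: bytes) -> dict:
    """Split a 5-byte FN variable body into two pointers and a last byte."""
    return {
        "definition": body[0] | (body[1] << 8),  # low byte first
        "argument": body[2] | (body[3] << 8),
        "fifth": body[4],
    }


fn = decode_fn(bytes.fromhex("0C 04 29 04 31"))
print(f"definition @ ${fn['definition']:04X}")
print(f"argument   @ ${fn['argument']:04X}")

# The variable entry for X starts at $0427 with its 2-byte name,
# so the argument pointer refers to the body, 2 bytes further on.
X_ENTRY = 0x0427
print(fn["argument"] - X_ENTRY)
```

The difference of 2 is the name field being skipped, which is why the pointer alone carries no information about the name or type of the argument.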

This already promises some speedy and optimized execution at run-time, as the pointers refer directly to memory as needed. Moreover, we can see why only floating point values are allowed as an argument: the pointer to the argument skips past any notion of the name and type of that variable, assuming right away that it’s a float.

The Mystery of the 5th Byte

So, what could the 5th byte be about? Some of this may remind us of how strings are stored, by a first byte storing the length and then a pointer to the in-memory location at which the string starts. Is it a length of sorts? (This may seem even more plausible, as the code for executing “DEFFN” borrows some from the code for string handling.)

This was actually my first assumption, nourished by some coincidence. However, this, of course, it is not. The execution at run-time just stops at the first colon (“:”) or the first end of line, whichever comes first, extending over a single BASIC statement. No lengths required for that. Is it related to the variable name? But this was just another coincidence in my early investigations, as can be clearly seen in the above example, where 0x31 is the ASCII code for “1”, which bears no relation to “R”. So, what is it?

Let’s expand on our little experiment:

10 DEFFNR(X)=1+X*X
20 DEFFNG(Y)=3*Y+4

Which (after RUN) provides the following variable read-out:

0425  D2 00               FNR()
0427  0C 04 2E 04 31      @ $040C, arg @ $042E, ??
042C  58 00               X
042E  00 00 00 00 00      =  0
0433  C7 00               FNG()
0435  1D 04 3C 04 33      @ $041D, arg @ $043C, ??
043A  59 00               Y
043C  00 00 00 00 00      =  0

So, the first variable has a 5th byte of 0x31 and the second variable one of 0x33. Is it some counter? (This also shows, once again, that this isn’t related to any names, since nothing in either “R”, “G”, “X”, or “Y” translates to a difference of two.)

So let’s add another DEFFN definition to this, just to verify:

10 DEFFNR(X)=1+X*X
20 DEFFNG(Y)=3*Y+4
30 DEFFNI(T)=3*T-2

0436  D2 00               FNR()
0438  0C 04 3F 04 31      @ $040C, arg @ $043F, ??
043D  58 00               X
043F  00 00 00 00 00      =  0
0444  C7 00               FNG()
0446  1D 04 4D 04 33      @ $041D, arg @ $044D, ??
044B  59 00               Y
044D  00 00 00 00 00      =  0
0452  C9 00               FNI()
0454  2E 04 5B 04 33      @ $042E, arg @ $045B, ??
0459  54 00               T
045B  00 00 00 00 00      =  0

Hum, this is somewhat disappointing: both the second and the third FN variable have 0x33 as their last byte. So it isn’t a counter at all. Moreover, adding other variables to our short program or changing any of the names doesn’t show any effect on this 5th byte of the variable body.

However, if we change the very first character of the function body, we finally do make a difference:

30 DEFFNI(T)=4*T-2

0452  C9 00               FNI()
0454  2E 04 5B 04 34      @ $042E, arg @ $045B, ??

Let’s make this

30 DEFFNI(T)=T-2

0450  C9 00               FNI()
0452  2E 04 59 04 54      @ $042E, arg @ $0459, ??

As the eagle-eyed may have spotted already, 0x34 is the ASCII code for “4” and 0x54 is ASCII “T”.
It’s the literal first byte of our DEFFN function body!
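
The observation can be cross-checked against all read-outs so far. The Python sketch below pairs each function body from the listings with the 5th byte seen in the corresponding dump; it only covers bodies that start with a plain ASCII character, which is an assumption that holds for these five examples. The name `READOUTS` is made up for the illustration.

```python
# (function body as listed, 5th byte of the FN variable as dumped)
READOUTS = [
    ("1+X*X", 0x31),
    ("3*Y+4", 0x33),
    ("3*T-2", 0x33),
    ("4*T-2", 0x34),
    ("T-2", 0x54),
]

for body, fifth in READOUTS:
    first = ord(body[0])  # plain characters are stored as their ASCII code
    verdict = "match" if first == fifth else "differs"
    print(f"{body:<6} first byte ${first:02X}, 5th byte ${fifth:02X}: {verdict}")
```

This also explains why the second and third function shared 0x33: both bodies happen to begin with a “3”.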

Let’s check this with a token in the first position:

30 DEFFNI(T)=INT(T)

0451  C9 00               FNI()
0453  2E 04 5A 04 B5      @ $042E, arg @ $045A, ??

Yes, 0xB5 has the sign-bit set, giving away the BASIC token, and it is indeed the BASIC token for INT:

0425  1E 00              line# 30
0427  96                 token DEF
0428  A5                 token FN
0429  49 28 54 29        ascii «I(T)»
042D  B2                 token =
042E  B5                 token INT
042F  28 54 29           ascii «(T)»
0432  00                 -EOL-

Well, that’s that mystery solved.

But does this 5th byte matter?

10 DEFFNR(X)=1+X*X
20 DEFFNG(Y)=3*Y+4
30 DEFFNI(T)=INT(T)
40 POKE 1160,32 : REM DEC 1160 = $0488
50 PRINT FNI(4.1)

RUN
 4

READY.

It doesn’t seem so. The result is still what we’d expect from the BASIC function INT. It’s not what we would have expected if we had replaced the token INT in the BASIC text by 32, which is a simple space/blank, giving “ (T)”.

And we really modified that final byte:

0482  C9 00               FNI()
0484  2E 04 8B 04 20      @ $042E, arg @ $048B, « »

Let’s have another go at this, this time replacing the “1” in FNR by the ASCII code for “2”:

10 DEFFNR(X)=1+X*X
20 DEFFNG(Y)=3*Y+4
30 DEFFNI(T)=INT(T)
40 POKE 1130,50 : REM DEC 1130 = $046A
50 PRINT FNR(2)

RUN
 5

0464  D2 00               FNR()
0466  0C 04 6D 04 32      @ $040C, arg @ $046D, «2»

This didn’t make a difference, either.
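
The POKE addresses used in these two experiments follow directly from the variable read-outs: the 5th byte sits 4 bytes past the start of the variable body. A small Python sketch of that arithmetic, with the hypothetical helper `fifth_byte_address`:

```python
def fifth_byte_address(body_addr: int) -> int:
    """Address of the 5th byte of a 5-byte variable body."""
    return body_addr + 4


# Body addresses as shown in the two read-outs (FNI at $0484, FNR at $0466).
for name, body_addr, value in (("FNI", 0x0484, 32), ("FNR", 0x0466, 50)):
    addr = fifth_byte_address(body_addr)
    print(f"{name}: POKE {addr},{value}  (${addr:04X}, {chr(value)!r})")
```

Mind that the variable addresses shift whenever the program text changes in length, so the POKE address has to be taken from a read-out of the very program that contains the POKE.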

A more thorough investigation into the literature on the matter produced a sole source, namely “Programming the PET/CBM” by Raeto West.
Here, we find FN variables actually described as a special type, on p. 9, where the last byte is described as “INITIAL OF VAR.”

Facsimile: Raeto West, Programming the PET/CBM; p.9

The exact meaning of “INITIAL OF VAR.” may not be that clear, as it’s provided without further context, but, as we’ve established already, this is fair and correct if we are meant to understand “the initial byte of the function body referred to by the variable.” (As opposed to, e.g., “the first character of the variable identifier,” or similar.) The descriptive text goes as follows:

A function definition has two pointers; one to the definition within the body of the
BASIC program, and one to the floating-point dependent variable. They point just
after the ‘=’ sign and to the exponent byte respectively. The final byte is garbage,
generated when the definition is set up, and is not used.

Well, I guess, that’s it. Especially as (as already mentioned) the code uses some resources dedicated to string handling. However, it’s still a bit strange that this 5th byte isn’t just set to 0, like the surplus bytes in integer and string variables.

What is this FN damsel hiding? What causes her such distress that she should exhibit the most intimate secrets of her build like this in broad daylight?

I guess that’s another story. Which also brings this true detective story to an end.

Anyhow, if you want to have a closer look at the new version of the PET emulator, here it is running all the latest demos:

Edit/Update

While it may be correct to speak of the function parameter (argument) as a global variable, in the sense that it’s created together with the FN variable and stored alongside it in the global variable memory, it doesn’t behave like one:

10 X=1
20 DEFFNR(X)=1+X*X
30 PRINT X
40 PRINT FNR(2)
50 PRINT X
RUN
 1
 5
 1

READY.
█

Moreover, even if there is no conflict, the function parameter isn’t accessible from outside:

10 DEFFNR(X)=1+X*X
20 PRINT FNR(2)
30 PRINT X
RUN
 5
 0

READY.
█

As may be inferred from this, the value of the variable (1 byte exponent and 4 bytes mantissa) is saved before the variable is accessed as a parameter/argument and then restored again. As user-defined functions are callable from within user-defined functions, this can’t be just a buffer in the zero-page (meaning, there may be more than one value to be saved at any given time). Rather, the contents of the variable body are pushed to the processor stack and then restored from there. So, while they may be defined as global variables, these are actually behaving like local variables.
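
The save-and-restore behavior described here can be modeled in a few lines. The following Python sketch is a simplified model under stated assumptions, not the ROM routine: a dictionary stands in for the global variable memory, a list for the processor stack, and the nested function FNS is a hypothetical addition to show why a single fixed buffer would not suffice.

```python
variables = {"X": 1.0}  # global variable memory (X=1, as in the listing)
stack = []              # stands in for the processor stack


def call_fn(param: str, body, arg: float) -> float:
    """Evaluate a user function with `param` temporarily bound to `arg`."""
    stack.append(variables.get(param, 0.0))  # save the current value
    variables[param] = arg
    try:
        return body()
    finally:
        variables[param] = stack.pop()       # restore it afterwards


def fnr(x):
    # DEFFNR(X)=1+X*X
    return call_fn("X", lambda: 1 + variables["X"] * variables["X"], x)


def fns(x):
    # Hypothetical DEFFNS(X)=FNR(X+1)*2, a function calling a function
    return call_fn("X", lambda: fnr(variables["X"] + 1) * 2, x)


print(variables["X"], fnr(2), variables["X"])
print(fns(2), variables["X"], len(stack))
```

During the nested call, two saved values of X are on the stack at the same time, which is the case a single buffer could not handle.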

We may note that the effort taken (5 pushes and 5 pulls to and from the stack, in addition to the reads and writes that go with this) to make these behave like local variables somewhat counteracts the efficiency suggested by the argument pointer tapping directly into the variable body.

Wait, there may be extra: don’t miss the Bonus Episode!
