Porting the Slint UI Toolkit to a Microcontroller with 264K RAM - Slint Blog
Our vision with Slint as a cross-platform native UI toolkit is to provide user interfaces for any device.
Initially we focused on running on desktop-class machines and embedded devices that support OpenGL ES.
A few months ago we started porting Slint to microcontrollers (MCUs), and this blog post describes how we achieved that.
Goal: Our Printer Demo on a Raspberry Pi Pico
We chose the Raspberry Pi Pico as the first board to support. It is equipped with Raspberry Pi's first
self-made microcontroller, the RP2040, a Cortex-M0 class processor, with 264KB of RAM and 2MB of flash.
This board is open-source-friendly and it costs less than 4€.
We attached the Waveshare Pico-ResTouch-LCD-2.8 to the Raspberry Pi Pico board.
It is a 320×240 pixel 2.8″ screen that connects the display and the touch screen over the same SPI bus.
This hardware combination costs less than 20€ online, which provides a low barrier of entry
for enthusiasts and hobbyists.
We acknowledge that this is a fairly low-end MCU for GUIs, with constraints. We see this combination as a proof of viability for Slint:
if our demo runs on this board, then it can run on any MCU.
Based on this video, it's safe to say that we succeeded.
#![no_std]
We are working on what is called bare metal: there is no operating system between us
and the hardware. Our runtime is written entirely in Rust, which requires using the
#![no_std] attribute to run on bare metal. This attribute disables the Rust Standard Library,
which requires an operating system. Even if we tried to enable it, the Standard
Library is not available for the thumbv6m-none-eabi target, which the RP2040 uses.
The first step was to compile our core library for thumbv6m-none-eabi.
We needed to gate anything that used the Standard Library behind a "std" feature gate.
Luckily, we are not using much of the Standard Library, and we had already established the habit of using
core:: instead of std:: for types from the Rust Core Library. All types
that require memory allocation are in the alloc crate, so we had to introduce that in a few places.
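As a minimal sketch of that habit, the following helper uses only core and alloc, so it compiles with or without the Standard Library. The function name is hypothetical and chosen for illustration; it is not taken from the Slint sources.

```rust
// Hypothetical helper, for illustration only; not part of Slint.
// The `alloc` crate provides heap-backed types without the Standard Library.
extern crate alloc;

use alloc::string::String;
use alloc::vec::Vec;
use core::fmt::Write; // `core::fmt::Write` is available without `std`

/// Formats a list of numbers as "1, 2, 3" using only `core` and `alloc`.
pub fn format_counts(counts: &[u32]) -> String {
    let mut parts: Vec<String> = Vec::new();
    for count in counts {
        let mut text = String::new();
        // Writing into a `String` cannot fail, so the result is ignored.
        let _ = write!(text, "{}", count);
        parts.push(text);
    }
    parts.join(", ")
}
```

On a bare metal target, the alloc types additionally need a global allocator to be registered, which this sketch leaves out.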
The most complicated part was code that requires the thread_local! macro.
With the help of this macro we store some global variables. We had to resort to an unsafe
static instead when not using std. It's a bit annoying, but unfortunately we couldn't come up with a better alternative.
Then we hit the problem that the architecture has neither atomic compare and swap instructions nor
a hardware floating point unit. This means that the Rust Core Library disables some features of the
atomic module, which are used by crates we depend on. So in addition to making sure that each of our
dependencies supports #![no_std], we also needed to make some of them use the
atomic_polyfill crate to work around the missing atomic functionality.
For the lack of hardware floating point support, we use the libm crate
(via num-traits), which provides software floating point support for
the operations that we use.
Hardware Abstraction
So far we've dealt with general issues that apply to all constrained embedded environments. The RP2040 microcontroller
that we are starting with is located on the Pico board, but there are other boards available as well with the same chip.
Some are equipped with additional peripherals such as LEDs or more flash storage. An operating system typically provides
drivers for the boards and peripherals, and provides an abstraction over these combinations to applications. We don't
have an operating system to do that for us, but we don't have to start from scratch either: fortunately the Rust
ecosystem provides crates that make it relatively easy to get started in these bare metal environments. We are using crates
from the RP2040 rp2040-hal project, as well as helper crates from the
Rust Embedded project.
Framebuffer?
264KB of RAM is not much memory. Typically, when a screen is connected directly
to the memory, a framebuffer is used to store one full screen in memory. We would
render into the framebuffer and blend the UI before passing it on to the display. Actually, we would allocate two of them so
that we can render into one while the other one is being displayed.
But with 16-bit colors (two bytes per pixel), that's 320 × 240 × 2 = 153.6kB per framebuffer.
A single framebuffer would take more than half of our available memory, leaving very little space for the actual UI,
and double-buffering, which needs two framebuffers, would not fit at all.
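The arithmetic can be checked with a few constants. This is a small sketch using the 320×240 resolution of the display described earlier and treating 264KB as 264 × 1024 bytes; the function names are made up for illustration.

```rust
// Figures from the text: a 320×240 display, 16-bit color, 264KB of RAM.
const WIDTH: usize = 320;
const HEIGHT: usize = 240;
const BYTES_PER_PIXEL: usize = 2;
const RAM_BYTES: usize = 264 * 1024;

/// Bytes needed to hold one full screen of pixels.
pub const fn framebuffer_bytes() -> usize {
    WIDTH * HEIGHT * BYTES_PER_PIXEL
}

/// Whether `count` full framebuffers would fit into RAM at all,
/// ignoring every other use of memory.
pub const fn framebuffers_fit(count: usize) -> bool {
    count * framebuffer_bytes() <= RAM_BYTES
}
```

One framebuffer takes 153,600 of the 270,336 available bytes; two of them need 307,200 bytes and cannot fit.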
We solved this with a different approach: the display has its own on-chip memory, which is large
enough to represent the screen. We can write into that memory by sending commands over the SPI
bus that contain the address and the color. So, instead of rendering the whole screen into our own framebuffer,
we designed our software renderer to render a smaller part into a smaller buffer. We decided to render
one line at a time, so we allocate two buffers, each the size of one line only. We can render one line into
one buffer on the CPU and then use DMA
to upload the pixels to the display. Indeed, the RP2040 has several DMA units, capable of reading
main memory and transmitting it over the SPI bus to peripherals, like the screen. This happens entirely
without any CPU involvement. We can instruct the CPU to render the next line while the first is still being
sent to the screen.
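The alternation between the two line buffers can be sketched as follows. This is not Slint's renderer: the function and parameter names are hypothetical, the 320 pixel line width is an assumption about the screen orientation, and the transfer is a plain callback so that the sketch also runs on a desktop machine.

```rust
const WIDTH: usize = 320; // assumed pixels per line
const HEIGHT: usize = 240; // assumed number of lines

/// One line of 16-bit pixels.
pub type LineBuffer = [u16; WIDTH];

/// Renders a frame line by line, alternating between two line buffers.
/// `render_line` fills the buffer for line `y`; `send_line` stands in for
/// the transfer of that buffer to the display.
pub fn render_frame(
    mut render_line: impl FnMut(usize, &mut LineBuffer),
    mut send_line: impl FnMut(usize, &LineBuffer),
) {
    // Two buffers of one line each: 2 * 320 * 2 bytes = 1280 bytes in total.
    let mut buffers = [[0u16; WIDTH]; 2];
    for y in 0..HEIGHT {
        let current = y % 2;
        render_line(y, &mut buffers[current]);
        // With DMA, this call would only start the transfer and return, so
        // the next iteration renders into the other buffer in the meantime.
        send_line(y, &buffers[current]);
    }
}
```

With a real asynchronous transfer, the loop would also have to wait until the previous transfer out of a buffer has finished before rendering into that buffer again.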
Unfortunately, we don't support DMA yet. This is
tricky, because the DMA support for the RP2040 in Rust is still work in progress,
and support requires changes deep down in the driver. Currently, the driver we use for the screen only supports passing an iterator of pixels to draw. But
if we want to support DMA, we need to give it a specific &'static buffer. Since the screen driver owns the SPI pins,
we would need to fork it, or at least re-implement the part that sends control commands. This is totally doable, but not something
we wanted to spend time on at this point.
Update: DMA support has now been implemented.
The screen updates are now much faster, see this updated video.
Line By Line Software Rendering
Microcontrollers typically don't come with a GPU that supports OpenGL. We had to create a software renderer
that runs entirely on the CPU, dividing the rendering into the following steps:
- Visit the scene and create one or more primitive rendering commands for each item. For example, we have
a primitive command to fill a rectangle with a color, or to blend a portion of an image to a certain location on
the screen.
- Sort the commands by their y position.
- For each line, collect the commands that affect this line. This is done by merging the commands from the previous
line that span into the new line with the new commands in the y-sorted list of commands computed in the
previous step.
- Draw each command into the buffer for the current line, back to front.
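The last three steps can be sketched as follows. This is a simplified illustration, not Slint's renderer: it knows only one kind of command (a filled rectangle), the type and function names are hypothetical, and it uses Vec from the Standard Library for brevity.

```rust
/// A primitive rendering command: fill a rectangle with a color.
#[derive(Clone, Copy)]
pub struct FillRect {
    pub x: usize,
    pub y: usize,
    pub width: usize,
    pub height: usize,
    pub color: u16,
    /// Stacking order: lower values are further back.
    pub z: usize,
}

/// Renders `commands` one line at a time and hands each finished line to `emit`.
pub fn render_lines(
    mut commands: Vec<FillRect>,
    screen_width: usize,
    screen_height: usize,
    mut emit: impl FnMut(usize, &[u16]),
) {
    // Sort the commands by the first line they touch.
    commands.sort_by_key(|c| c.y);
    let mut pending = commands.into_iter().peekable();
    let mut active: Vec<FillRect> = Vec::new();
    let mut line = vec![0u16; screen_width];

    for y in 0..screen_height {
        // Keep the commands from the previous line that still span this line,
        // then merge in the commands that start on this line.
        active.retain(|c| y < c.y + c.height);
        while let Some(c) = pending.next_if(|c| c.y <= y) {
            if y < c.y + c.height {
                active.push(c);
            }
        }
        // Draw back to front so that items in front cover items behind them.
        active.sort_by_key(|c| c.z);
        line.fill(0);
        for c in &active {
            let start = c.x.min(screen_width);
            let end = (c.x + c.width).min(screen_width);
            line[start..end].fill(c.color);
        }
        emit(y, &line);
    }
}
```

Because the commands are sorted by y once per frame, each line only looks at the few commands that are currently active instead of the whole scene.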
On the Pico board we don't have a file system from which to load images or fonts. Therefore, the Slint compiler loads and pre-renders
images and glyphs at compile time, and embeds them raw into the program binary.
Partial Rendering
Here's a breakdown of the cumulative times the individual steps take to render the entire scene of the printer demo:
Step | Duration |
Visit scene, create rendering commands & sort by y position | 55ms |
Collect commands for current line | 7ms |
Draw commands for current line into buffer | 59ms |
Send pixels to the screen over SPI | 111ms |
The above table shows that most of the time is spent uploading pixels over the SPI bus to the screen.
Consequently, the fewer pixels we upload, the faster the screen updates. When an animation runs in one part of the screen, we do
not want to re-render other parts of the screen that remain unchanged. Our property
system allows us to find out, at a very detailed level, which items in the scene are changing. We determine their
location on the screen and collect them as dirty regions. When drawing a new frame,
we can then limit our efforts to the dirty regions and send only newly rendered pixels to the screen.
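A minimal sketch of collecting dirty regions is shown below. It makes a simplifying assumption: all changed items are merged into a single bounding rectangle, which may be coarser than what Slint actually tracks. The type names are hypothetical.

```rust
/// An axis-aligned rectangle in screen coordinates.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect {
    pub x: i32,
    pub y: i32,
    pub width: i32,
    pub height: i32,
}

impl Rect {
    /// Smallest rectangle that contains both `self` and `other`.
    pub fn union(self, other: Rect) -> Rect {
        let x = self.x.min(other.x);
        let y = self.y.min(other.y);
        let right = (self.x + self.width).max(other.x + other.width);
        let bottom = (self.y + self.height).max(other.y + other.height);
        Rect { x, y, width: right - x, height: bottom - y }
    }
}

/// Accumulates the geometry of changed items into one dirty rectangle.
#[derive(Default)]
pub struct DirtyRegion {
    bounds: Option<Rect>,
}

impl DirtyRegion {
    /// Records that the item occupying `item` has changed.
    pub fn mark(&mut self, item: Rect) {
        self.bounds = Some(match self.bounds {
            Some(existing) => existing.union(item),
            None => item,
        });
    }

    /// Returns the area to redraw and resets the region for the next frame.
    pub fn take(&mut self) -> Option<Rect> {
        self.bounds.take()
    }
}
```

If take() returns None, nothing changed and the frame can be skipped; otherwise only the lines and columns inside the returned rectangle need to be rendered and sent over SPI.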
Panic Handler
When Rust code panics, the default behavior is to stop the program with a message printed to the console explaining the reason.
With no_std there is no console and you need to manually set a panic handler function. Custom panic handlers are available on crates.io,
which can send the panic message over UART or defmt to your development machine. Since
we are connected to a screen, we thought we'd use it and print the panic message there.
Our panic handler steals the Peripherals object and resets the screen state, before using a custom implementation of the
Write trait to display the message on screen. Then we spin in an infinite loop until the user resets the device.
We used our Slint blue color as a background. Any resemblance with crash screens that you may have seen in other operating systems
is purely coincidental.
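The Write part of such a handler can be sketched like this. It is an illustration under assumptions, not our actual handler: the screen is replaced by a small grid of characters (TextScreen, COLUMNS, and ROWS are made-up names and sizes), and the #[panic_handler] function itself is left out so that the sketch also runs on a desktop machine.

```rust
use core::fmt::{self, Write};

const COLUMNS: usize = 20; // hypothetical number of characters per text line
const ROWS: usize = 4; // hypothetical number of text lines on screen

/// A stand-in for the screen: a fixed grid of characters.
pub struct TextScreen {
    cells: [[char; COLUMNS]; ROWS],
    column: usize,
    row: usize,
}

impl TextScreen {
    pub fn new() -> Self {
        TextScreen { cells: [[' '; COLUMNS]; ROWS], column: 0, row: 0 }
    }

    /// Desktop-only helper for inspecting the sketch: one row as text,
    /// with trailing spaces removed.
    pub fn row_text(&self, row: usize) -> String {
        self.cells[row].iter().collect::<String>().trim_end().to_string()
    }
}

impl Write for TextScreen {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.chars() {
            if self.row >= ROWS {
                break; // screen is full: drop the rest of the message
            }
            if c == '\n' {
                self.column = 0;
                self.row += 1;
                continue;
            }
            if self.column == COLUMNS {
                // Wrap long lines onto the next row.
                self.column = 0;
                self.row += 1;
                if self.row >= ROWS {
                    break;
                }
            }
            self.cells[self.row][self.column] = c;
            self.column += 1;
        }
        Ok(())
    }
}

/// Formats a message with `write!`, the way a panic handler could.
pub fn show_panic(screen: &mut TextScreen, location: &str, message: &str) {
    // Errors are ignored: there is nothing left to report them to.
    let _ = write!(screen, "panicked at {}\n{}", location, message);
}
```

Implementing write_str is enough to make the write! macro work, so the handler can format the panic information without allocating.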
Board Support and Customization
We try to keep our examples, as well as your application code, as simple as possible, regardless of whether
the target platform is a desktop system, a web browser, or now an MCU. This requires abstracting over
various differences such as the application entry point (fn main()), how the build system is invoked,
or for example combinations of supported peripherals installed on the board the MCU runs on.
For the boards that we support directly, we collect as much of the board specific code as possible in our MCU crate. This is usually boilerplate code to initialize pins, power on
peripherals, or screen drivers, which would otherwise be duplicated into each of our examples. However, we recognize the need for customization.
You might want to use Slint with a board that we are not familiar with. We want to provide a low-level API with traits, so that you
can combine your own low-level device specific code with Slint. With these traits you can then provide the buffers for line-by-line pixel
rendering, report the system time, and other hardware capabilities that we need.
Run it yourself
If you have a Pico with the screen that we are using, check out our
README
for instructions on how to run the code yourself.
What's next
At this point, our printer demo works on the Raspberry Pi Pico. This is proof that Slint runs on low-end devices. Next, we'll port to
another board, polish our APIs, and release a version of Slint that can be used directly from crates.io with MCUs.