
The Linux graphics stack in a nutshell, part 1 [LWN.net]

2023-12-19 15:59:18



December 19, 2023

This article was contributed by Thomas Zimmermann

Linux graphics developers often speak of modern Linux graphics
when they refer to a number of individual software components and how they
interact with each other.
Among other things, it is a combination of kernel-managed display resources,
Wayland for compositing, accelerated 3D rendering, and decidedly not X11.
In a two-part series, we will take a fast-paced journey
through the graphics code to see how it converts application data
to pixel data and displays it on the screen. In this installment, we look
at application rendering, Mesa internals, and the
important kernel features.

Application rendering

Graphics output begins in the application, which processes and
stores formatted data that is to be visualized.
The common data structure for visualization is the
scene graph, which
is
a tree where each node stores either a model in 3D space or its
attributes. Model nodes contain the data to be visualized, such as a
game's scenery or elements of a scientific simulation. Attribute
nodes set the orientation or location of the models. Each attribute
node applies to the nodes below it. To render its scene graph into an
on-screen image, an application walks the tree from top to
bottom and left to right, sets or clears the attributes, and renders
the 3D models accordingly.

In the example scene graph shown below, rendering begins at the root node,
which prepares the renderer and sets the output location. The application
first takes the branch to the left and renders "Rectangle 1" at
position (0, 0) with the surface pattern stored in texture 1.
The application then moves back to the root node
and takes the branch to the right, where it enters the attribute node named
"Transform". Applications describe transformations,
such as positioning or scaling, in 4×4 matrices that the algorithm
applies during
the rendering process. In the example, the transform node
scales all of its child nodes by a factor of 0.5. So rendering
"Rectangle 2"
and "Rectangle 3" displays them at half their original sizes, with their
positions adjusted to (10, 10) and (15, 15), respectively. Both
rectangles
use different textures: 2 and 3, respectively.

[Scene graph]

To simplify rendering and make use of hardware acceleration, most
applications utilize one of the standard APIs, such as
OpenGL
or
Vulkan.
Details vary among the individual APIs, but each provides interfaces
to manage graphics memory, fill it with data, and render the
stored information. The result is an image that the application can
either display as-is or use as input data for further processing.

All graphics data is held in buffer objects, each of which is a
range of graphics memory with a handle or ID attached. For example,
each 3D model is stored in a
vertex-buffer object,
each texture is stored in a texture-buffer object, each object's surface
normals
are stored in a buffer object, and so on. The output image
is itself stored in a buffer object. So graphics rendering is, in large
part, an exercise in memory management.

The application can provide input data in any format, as long as
the graphics shader can process it. A
shader
is a program that contains the instructions to transform the input
data into an output image. It is provided by the application and executed by the
graphics card.

Real-world shader programs can implement
complex, multi-pass algorithms, but for this example we break it
down to the minimum required. Probably the two most common operations
in shaders are vertex transformations and texture lookups. We can think
of a vertex as a corner of a polygon. Written in the
OpenGL
Shading Language
(GLSL),
transforming a vertex looks like this:

    uniform mat4 Matrix;   // same for all of a rectangle's vertices
    in vec4 inVertexCoord; // contains a different vertex coordinate on each invocation

    void main() {
        gl_Position = Matrix * inVertexCoord;
    }

The variable inVertexCoord is an input coordinate coming from the
application's scene graph. The variable gl_Position
is the coordinate within the application's output buffer. In broad terms,
the former coordinate is within the displayed scenery, while the latter is
within the application window.
Matrix is the 4×4 matrix that describes the transformation between
the two coordinate systems. This shader operation runs for each vertex in
the scene graph. In the example of the application's scene graph of
rectangles above, inVertexCoord contains each vertex of each rectangle
at least once. The matrix Matrix then contains that vertex's
transformation, such as moving it to the correct position or scaling it by
the factor of 0.5 as specified in the transform node.

Once the vertices are transformed to the output coordinate system, the
shader program computes the values of the covered "fragments",
which is graphics jargon for an output pixel with a depth value along the Z
axis and
possibly other information. Each fragment requires a color. In GLSL, the
shader's texture() function retrieves the color from a texture
like this:

    uniform sampler2D Tex; // the texture object of the current rectangle
    in vec2 vsTexCoord;    // interpolated texture coordinate for the fragment
    out vec4 Color;

    void main() {
        Color = texture(Tex, vsTexCoord);
    }

Here, Tex represents a texture buffer. The value vsTexCoord
is the texture coordinate; the location where to read within the texture.
Using both values, texture() returns a
color value. Assigning it to Color writes a colored pixel to the
output buffer. To fill the output buffer with pixel data, this shader code runs
for each individual fragment. The texture buffer is designated by the model
that is being drawn; the texture coordinate is provided by OpenGL's
internal computation. For the example scene graph, the application invokes
this code for each of the rectangles using that rectangle's texture buffer.

Running these shader instructions on the whole scene graph generates the
full output image for the application.

Mesa

Nothing we have discussed so far is specific to Linux, but it gives us the
framework to look at how it's all implemented. On Linux, the
Mesa 3D
library, Mesa for short, implements 3D rendering interfaces and support
for various graphics hardware. To applications, it provides OpenGL or
Vulkan for desktop graphics,
OpenGL ES
for mobile systems, and
OpenCL
for computation. On the hardware side, Mesa implements drivers for most
of today's graphics hardware.

Mesa drivers usually don't implement these application interfaces by
themselves, as Mesa contains plenty of helpers and abstractions.
For stateful interfaces, such as OpenGL, Mesa's
Gallium3D
framework connects interfaces and drivers
with each other. This is called a state tracker. Mesa contains
state trackers for various versions of OpenGL, OpenGL ES, and OpenCL. When the
application uses
the API, it modifies the state tracker for the given interface.

A hardware driver within Mesa further converts the state-tracker information
to hardware state and rendering instructions.
For example, calling OpenGL's
glBindTexture()
selects the current texture buffer within the OpenGL
state tracker. The hardware driver then installs the texture-buffer
object in graphics memory and links the active shader program to
refer to the buffer object as its texture. In our example above, the
texture becomes available as Tex in the
shader program.

Vulkan is a stateless interface, so Gallium3D isn't helpful
for these drivers; Mesa instead offers the Vulkan runtime to help
with their
implementation. If there is a Vulkan driver available, though, there
might not be a need for Gallium3D-based OpenGL support at all for
that hardware.
Zink
is a Mesa driver that maps Gallium3D to Vulkan. With Zink, OpenGL state
becomes Gallium3D state, which is then forwarded to hardware via standard
Vulkan interfaces. In principle, this works with any hardware's Vulkan
driver. One can imagine that future drivers within Mesa only implement
Vulkan and rely on Zink for OpenGL compatibility.

Besides Gallium3D, Mesa provides additional helpers, such as winsys or GBM, to
its hardware
drivers. Winsys wraps the details of the
window system. GBM, the Generic Buffer Manager, simplifies buffer-object
allocation.
There are also a number of shader languages, such as
GLSL
or
SPIR-V,
available to the application. Mesa compiles the
application-provided shader code to the "New Intermediate Representation", or
NIR, which Mesa drivers turn into
hardware instructions. To get the shader instructions and the related
data processed by Mesa's hardware acceleration, their buffer objects have
to be stored in memory locations accessible to the graphics card.

Kernel memory management

Any memory accessible to the graphics hardware is usually subsumed under
the umbrella term of graphics memory; it is the graphics
stack's central resource, as all the stack's components interact with
it. On the hardware side, graphics memory comes in a variety of
configurations that range from dedicated memory on discrete graphics
adapters to regular system memory of system-on-chip (SoC) boards. In between are
graphics chips with DMA-able or
shared graphics memory,
graphics
address remapping table
(GART) memory
of discrete devices, or the so-called stolen graphics memory
of on-board graphics.

Being a system-wide resource, graphics memory is maintained by the kernel's
Direct
Rendering Manager
(DRM)
subsystem. To access DRM functionality, Mesa opens
the graphics card's device file under /dev/dri, such as
/dev/dri/renderD128. As required by its
user-space counterparts, DRM exposes graphics memory in the form
of buffer objects, where each buffer object represents a slice of the
available memory.


The DRM framework provides a number of memory managers for the most
common cases. The DRM drivers for the discrete graphics cards
from AMD, NVIDIA, and (soon) Intel use the Translation Table Manager
(TTM). It supports discrete graphics memory, GART memory, and system memory.
TTM can move buffer objects between these areas, so if the device's
discrete memory fills up, unused buffer objects can be swapped out to
system memory.

Drivers for simple framebuffer devices often use the
SHMEM helpers, which allocate buffer objects in shared memory. Here,
regular system memory acts as a shadow buffer for the device's
limited resources. The graphics driver maintains the device's
graphics memory internally, but exposes buffer objects in system memory
to the outside. This also makes it
possible to memory-map buffer objects of devices on the USB or I2C bus,
even though these buses don't support page mappings of
device memory; the shadow buffer can be mapped instead.

The other common
allocator, the DMA helper, manages buffer
objects in DMA-able areas of the physical memory. This design is often used
in SoC boards, where graphics chips fetch and store data via DMA operations.
Of course, DRM drivers with additional requirements have the option of extending
one of the existing memory managers or implementing their own.

The ioctl() interface for managing buffer objects is called the Graphics
Execution Manager
(GEM). Each DRM driver implements GEM according to its
hardware's
features and requirements.
The GEM interface allows mapping a buffer object's memory pages to user-space
or kernel address space, allows pinning the pages at a certain location, or
exporting them to other drivers. For example, an application in user space
can get access to a buffer object's memory pages by invoking mmap()
with the
right offset on the DRM device file's file descriptor. The call will eventually
end up in the DRM driver's GEM code, which sets up the mapping. As we will see
below, this is a useful feature for software rendering.

The one common operation that GEM does not provide is buffer allocation.
Each buffer object has a specific use case, which affects and is affected by
the object's allocation parameters, memory location, or hardware constraints.
Hence, each DRM driver offers a dedicated ioctl() operation for buffer-object
allocation that captures these hardware-specific settings. The DRM driver's
counterpart in Mesa invokes said ioctl() operation accordingly.

Rendering operations

Just having buffer objects for storing the output image, the input data, and
the shader programs isn't enough. To start rendering, Mesa instructs DRM to
place all necessary buffer objects in graphics memory and invokes the
active shader program. This is again all specific to the hardware and provided
as ioctl() operations by each DRM driver individually. As with buffer allocation,
the hardware driver within Mesa invokes the DRM driver's ioctl() operations.
For Mesa drivers based on Gallium3D, this happens when the driver converts the
state-tracker information into hardware state.

The graphics driver ideally acts only as a proxy between the application
in user space and the hardware. The hardware renderer runs asynchronously to
the rest of the graphics stack and reports back to the driver only in
case of an error or on successful completion; much like the system CPU
informs the operating system of page faults or illegal instructions. As long
as there is nothing to report, driver overhead is minimal. There are
exceptions; for example, older models of Intel graphics chips don't
support
vertex transformations, so the driver within Mesa has to implement them in
software. On the Raspberry Pi, the kernel's DRM driver has to validate each
shader's memory access, as the VideoCore 4 chip does not contain
an I/O MMU to isolate the shader from the system.

Software rendering

So far, we have assumed hardware support for graphics rendering. What if
there is no such support, or the user-space application cannot use it? For
example, a user-space GUI toolkit might prefer rendering in
software because hardware-centric interfaces like OpenGL don't fit its needs.
And there is Plymouth, the program that displays the boot-up logo and prompts
for disk-encryption passwords during boot, which usually doesn't have a full
graphics stack available. For these scenarios, DRM offers the dumb-buffer
ioctl() interface.

Using dumb buffers, an application allocates
buffer objects in graphics memory, but without support for hardware
acceleration. So any returned buffer object is only usable with software
rendering. The application in user space, such as a GUI
toolkit or Plymouth, maps the buffer object's pages into its address space
and copies over the output image. Mesa's software renderer works similarly:
the input buffer objects all live in system memory and the system CPU
processes the shader instructions. The output buffer is a dumb-buffer
object that stores the rendered image. While this is neither fast nor fancy,
it is good enough to run a modern desktop environment on simple hardware that
does not support accelerated rendering.

We have now gone through the application's graphics stack for rendering. After
having completed the traversal of the scene graph, the application's output
buffer object contains the visualized scenery or data that it wants to
display. But the buffer is not
yet on the screen. Whether accelerated or dumb, putting the buffer on the
screen requires compositing and mode setting, which form the other half of
the graphics stack. In part 2, we will look at Wayland
compositing, setting display modes with DRM, and a few other features of the
graphics stack.



