A Mirror for Rust: Compile-Time Reflection Report

With a strong trait system, compile-time constants, and “where”- and “:”-style bounding for types and constants, Rust’s take on generic functions has been a refreshing departure from the anything-goes, Wild Wild West, errors-only-when-called template system of C++. Moreover, its macro system has been a much-needed replacement for C’s underpowered macro system, letting users actually generate code in a consistent and reliable manner at compile-time, with the ever-powerful Procedural Macros (“proc macros”) taking care of some of the heaviest language extension tasks. However…
Much like C, Rust perhaps relies a bit too heavily on (proc) macros and code generation techniques to avoid dealing with honest deficiencies in the language, producing much heavier compile times by frontloading work at inappropriate compilation stages to offset the lack of language and library features. To this end, we have begun working on the specification, formalization, and potential integration (which may not be completed fully as part of this work) of a set of core language primitives we are bikeshed-naming introwospection, which we hope to make available under the core::introwospection and std::introwospection modules in Rust.
This article will spend a perhaps obscene amount of time reasoning, designing, critiquing, evaluating, pontificating, and yearning for this or that feature, functionality, idiom, or capability.
As a general-purpose disclaimer, while we have spoken with a number of individuals in particular places about how to achieve what we are going to write about and report on here, it should be expressly noted that no individual mentioned here has any recorded support position for this work. Whatever assertions are contained below, it is critical to recognize that we have one and only one goal in mind for this project, and that goal was developed solely under our (Shepherd’s Oasis’s) own beliefs and efforts. Our opinions are our own and do not reflect any other entity, individual, project and/or organization mentioned in this report.
One of the biggest things Shepherd’s Oasis wants to enable is infinitely valuable and scalable code. For example, in the primary serialization crate featured across the entire Rust ecosystem, serde, we want to enable David Tolnay (or any one of the 159/future contributors)’s code to be able to handle this:
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };

    // Convert the Point to a JSON string.
    // This still works.
    let serialized = serde_json::to_string(&point).unwrap();

    // Prints serialized = {"x":1,"y":2}
    println!("serialized = {}", serialized);
}
This is the flagship example from serde, but with a few critical changes made. The code above does not require:
- #[derive(…)];
- impl Serialize for Point { … };
- and, no build.rs, “newtype” idioms, or other shenanigans to generate code or implement interfaces.
In order to do this, we want serde to be able to write something like the (non-compiling, using-unimplemented-syntax, not-at-all thoroughly checked) generic code just below as a base implementation for Serialize, its flagship serialization trait, for any given struct or enum:
use std::introwospection::*;
use serde::ser::{
    Serialize, Serializer,
    SerializeTupleStruct, SerializeStruct,
    SerializeTupleVariant, SerializeStructVariant,
    Error
};

struct DefaultSerializeVisitor<S, T>
where
    S: Serializer,
    T: Serialize + ?Sized
{
    serializer: &mut S,
    value: &T,
}

pub trait Serialize {
    fn serialize<S, T>(
        &self, serializer:
        S, value: &T
    ) -> Result<S::Ok, S::Error>
    where S: Serializer,
          T: Serialize + ?Sized,
    {
        let mut visitor = DefaultSerializeVisitor{
            serializer,
            value: self
        };
        introwospect(Self, visitor)
    }
}

struct DefaultStructSerializeVisitor<S, T>
where
    S: Serializer,
    T: Serialize + ?Sized
{
    serializer: &mut S,
    value: &T,
    newtype_idiom: bool,
    tuple_idiom: Option<(&mut S::SerializeTupleStruct)>,
    normal_idiom: Option<(&mut S::SerializeStruct)>,
    maybe_error_index: Option<usize>
}

struct DefaultEnumSerializeVisitor<S, T>
{
    serializer: &mut S,
    value: &T,
    variant_info: Option<(&'static str, bool, usize)>,
    tuple_idiom: Option<(&mut S::SerializeTupleVariant)>,
    normal_idiom: Option<(&mut S::SerializeStructVariant)>,
    maybe_found_index: Option<usize>,
    maybe_error_index: Option<usize>
}

impl<S: Serializer, T: Serialize + ?Sized> EnumDescriptorVisitor
for DefaultSerializeVisitor<S, T>
{
    // Drop into the `enum`eration-style serialization and methods by
    // creating, specifically, that visitor type. This gives
    // context to the `FieldVisitor`-using methods so we know that at
    // the top-level we're working with an `enum`eration.
    type Output = Result<S::Ok, S::Error>;

    fn visit_enum_mut<Descriptor: 'static>(&mut self) -> Self::Output
    where Descriptor: EnumDescriptor
    {
        let mut visitor = DefaultEnumSerializeVisitor{
            serializer: self.serializer,
            value: self.value,
            variant_info: None,
            tuple_idiom: None,
            normal_idiom: None,
            maybe_found_index: None,
            maybe_error_index: None
        };
        introwospect(T, visitor)
    }
}

impl<S: Serializer, T: Serialize + ?Sized> StructDescriptorVisitor
for DefaultSerializeVisitor<S, T>
{
    // Drop into the `struct`-style serialization and methods by
    // creating, specifically, that visitor type. This gives
    // context to the `FieldVisitor`-using methods so we know
    // that at the top-level we're working with a `struct`.
    type Output = Result<S::Ok, S::Error>;

    fn visit_struct_mut<Descriptor: 'static>(&mut self) -> Self::Output
    where Descriptor: StructDescriptor
    {
        let mut visitor = DefaultStructSerializeVisitor{
            serializer: self.serializer,
            value: self.value,
            newtype_idiom: false,
            tuple_idiom: None,
            normal_idiom: None,
            maybe_error_index: None
        };
        introwospect(T, visitor)
    }
}

// … and so much more.
and have it work, in perpetuity, for the rest of their life for the vast majority of Rust types.
Right now, absolutely none of the above code makes sense to the vast majority of people. And that is fine. But this is the ultimate goal of this combination article-report; we will be going through the constructs above, explaining what they do and how they enable less boilerplate, less markup, less “new type idiom” usage, and more doing exactly what you expect and want by default out of the vast majority of code.
Let’s get started.
The name is a placeholder. Originally, we wanted to simply call it “reflect” and “reflection”, but both are already reserved in Rust, including in the compiler as a symbol. It was then changed to “uwuflection”, but we are not doing code generation (in the macro sense) with this feature like other reflection facilities in other languages. Thus, it was changed to “introspection”. However, it was requested we make the name at least 11% sillier so there was no question that we honestly do not care what the final name will be, and to avoid what would undoubtedly be an entirely worthless bikeshed session. Enter: introwospection.
Introwospection’s core beliefs are as follows. It:
- does not force the user to pay for what they do not use (if a type is not reflected on, then no information about it whatsoever should show up in the final artifacts);
- will not produce run-time, dynamic allocations, nor will it require them under any circumstances (making it suitable for constrained and resource-starved environments);
- will produce information that can be acted upon by the type system or at compile-time (i.e. const fn time) without exception;
- can be used to inspect types and values within generic functions, including values and types of values not owned by the current crate;
- and, cannot be used to inspect private or hidden properties that are invisible to code at the current scope, module, or crate.
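For a sense of scale, here is a minimal sketch of the compile-time type information that stable Rust already exposes today, which is roughly size and alignment and not much else. The Point type, the POINT_SIZE and POINT_ALIGN constants, and the fits_in helper are names made up for this illustration; none of them are part of the proposed introwospection API.

```rust
use std::mem::{align_of, size_of};

// Hypothetical example type; not part of any proposed API.
#[allow(dead_code)]
struct Point {
    x: i32,
    y: i32,
}

// Evaluated entirely at compile time: no run-time cost, no allocation.
const POINT_SIZE: usize = size_of::<Point>();
const POINT_ALIGN: usize = align_of::<Point>();

// A `const fn` can act on that information for any sized type,
// including types that come from other crates.
const fn fits_in<T>(budget: usize) -> bool {
    size_of::<T>() <= budget
}

// Compile-time assertion: the build fails if `Point` outgrows 16 bytes.
const _: () = assert!(fits_in::<Point>(16));

fn main() {
    println!("size = {POINT_SIZE}, align = {POINT_ALIGN}");
}
```

What this cannot do is name the fields of Point, count them, or list their types. That gap is what the beliefs above are aimed at.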
These core beliefs shape what we want out of the API, and how it differs, rather strongly, from current attempts at this in Rust. In particular, current attempts at reflection and similar utilize (procedural) macro programming, defer entirely to run-time/type-erased entities, or use some combination of the two in conjunction with hand-crafted wrapper types and methods. Introwospection wants to be able to work with any type whatsoever, much like in the above beefy default Serialize implementation, rather than only types that provide a specific implementation or types that we explicitly own. Explaining some of this requires us to go back and talk about the way things work today, including the fact that Rust does not have compile-time reflection like folks originally explained to us when we first started the language several years ago.
We held this belief upon first seeing Rust some time ago. Consider, briefly, this code from the front page of the rocket.rs project as of April 13th, 2023:
#[macro_use] extern crate rocket;

#[get("/hello/<name>/<age>")]
fn hello(name: &str, age: u8) -> String {
    format!("Hello, {} year old named {}!", age, name)
}

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![hello])
}
This code (apparently, with magic) is capable of knowing that we want a string (&str, in this case, which is a reference to an existing blob of string memory) to designate a name, and an 8-bit integer for the age. It parses that automatically from the <name> and <age> portions of this get attribute-labeled route. It also returns a string, which can then be transported in a basic HTTP-valid form all the way to the user’s browser. This means that, somehow, Rocket understands this hello function, its parameters, its return types, and, more importantly, the negotiation of these properties to and from a form understood by all web browsers. Is that not reflection? The ability to gaze into normal Rust code as written, and automatically generate the boilerplate and interop? Clearly, Rust had achieved something that C users only dream of, and that they can only produce with externally-orchestrated tools connected by make (or CMake, or meson, or Bazel, or any of the other dozens of build systems holding up a Mt. Everest of code).
Unfortunately, this is only half of the story.
What is happening here is not reflection, which dismayed us greatly as we actually learned the language. It is actually the machinations of a separate system that has been built on top of Rust’s actual programming language: a separate shadow world whose job it is to do the immense heavy lifting that makes code like this possible. And that entire shadow world powering the most elegant Rust code begins with a single crate, maintained and propelled forward by David Tolnay and his devoted helpers, poetically named syn.
syn
The syn crate (said like the first part of the word “syntax” and exactly like the word “sin”) is the Big Mover and Shaker of Rust. When things get unwieldy to express and complicated to keep typing out in C, one falls back to the preprocessor system or code generators. Similarly, Rust programmers fall back to their own token expansion / generation system, termed “macros”. Exactly like the C counterpart, Rust’s macro programming model does not actually understand anything about the program itself. It receives a stream of tokens, similar to the way invoking a function macro in C (e.g., DO_CALL(A, B, 234, str)) gets a list of tokens to interact with. Macros are allowed to generate a new sequence of tokens (subject to a handful of rules and hygiene requirements that C macros do not have). But the fun does not stop there for Rust macros; they can be supercharged with even more capabilities that allow them to hook into the “custom attribute” and “custom derive” settings, as well as entirely repurpose Rust tokens to create their own languages (delimited within the usual my_macro!(…) or my_macro![…] or my_macro!{ … } invocations).
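To make the “stream of tokens” point concrete, here is a small sketch using a declarative macro. The describe_struct macro and its FIELDS constant are invented for this illustration. The macro can re-emit the tokens it is handed and turn them into text, but it has no way to ask what i32 actually is, or to look at any struct that was not written inside its own invocation.

```rust
// A declarative macro only sees the tokens handed to it. It can echo them
// back as text or re-emit them, but it cannot ask the compiler what `i32`
// is, nor inspect a type defined anywhere outside its own invocation.
macro_rules! describe_struct {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[allow(dead_code)]
        struct $name { $($field: $ty),* }

        impl $name {
            // Field names and type spellings, captured purely as token text.
            const FIELDS: &'static [(&'static str, &'static str)] =
                &[$((stringify!($field), stringify!($ty))),*];
        }
    };
}

// The struct must be written *inside* the macro call to be visible to it.
describe_struct! {
    struct Point {
        x: i32,
        y: i32,
    }
}

fn main() {
    for (name, ty) in Point::FIELDS {
        println!("{name}: {ty}");
    }
}
```

Note that the type information here is only the spelling of the type as written; a type alias or a re-export would show up under whatever name appeared in the token stream.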
These enhanced macros, called procedural macros, can do whatever they want by acting as a compiler plugin over that token stream. While C object and function macros are extremely limited in scope and power, despite operating at the same conceptual level as Rust macros, Rust macros are so fully-featured that one can reimplement the entire Rust frontend in a Rust macro. Others new to Rust but well-aged in many programming languages (save for the older Lisp veterans) would scoff. If we suggested implementing a C frontend out of token parsing in the preprocessor for C, we would be laughed out of the room for even wanting a preprocessor that powerful. But, in this brave new Rust world, not only is Rust’s preprocessor theoretically powerful enough to do that work as an academic on-paper exercise, it is also in-practice exactly that powerful.
syn is the culmination of that very idea. It is a library that parses a stream of Rust tokens into Rust-understood and intimately recognizable constructs such as a syn::Expr, a syn::Type, a syn::AssocType, a syn::DataUnion, and so much more.
The special #[launch] and #[get(…)] macros from the Rocket example are how syn is deployed. Libraries use these attributes as the hooking points and jumping-off points for their macros and procedural macros. Then, they dip their hands into syn to parse and handle these Rust constructs in order to generate code. This, effectively, means that every macro and procedural macro is re-doing the work of the Rust frontend (up to its AST generation), and then acting on that in order to generate code for constructs (types, associated items, names, traits, etc.) it recognizes. This is how Rocket knows to generate a main function for us with #[launch], knows how to generate the boilerplate that connects a fully-received HTTP Request into something that can talk to our hello function, etc. etc.! It is an amazing feat of engineering. Make no mistake, David Tolnay plus the 80+ contributors to this crate are one of several critical pillars of why Rust is a major systems programming language worth taking seriously today. It bought a serious amount of time for the language designers and the numerous compiler engineers to focus on other parts of Rust, while use cases involving generating code and similar could be wrapped up in (proc) macros.
As with most engineering approaches born out of necessity, macros that use syn to parse Rust source code in a preprocessing language come with interesting consequences, exacerbated by Rust’s programming model. In particular, Rust’s strong ownership rules (not just for resources, but for code concepts) mean that macros very quickly hit very particular limits. Sometimes, these limits are billed as advantages, but with our work we have begun to see them as more of a hindrance than a benefit. We will take a slight detour to explain code ownership, particularly in relation to the C model, and expound upon how it relates to compile-time reflection.
Strong Ownership: Not Just for Resources
One of the things that makes C and C++ brutal for code, in conjunction with a source-text #include-based model of programming, is how ownership becomes very hard to define. Multiple translation units may end up with a function, structure, or similar that have identical fully qualified names and namespaces. How C and C++ handle this is effectively a shrug of the shoulders, combined with a rule called the One Definition Rule. This rule states that multiple translation units that do not opt into certain modifying keywords (e.g., static, anonymous top-level namespaces namespace { … }, or extern) promise that any code with identically-named entities shall have the same content, Or Else™. That “Or Else™” is not always enforced, as it is placed under either implicit undefined behavior (C) or Ill-formed, No Diagnostic Required (IFNDR, C++).
These days, some compilers can check these assumptions; for example, GCC or Clang with -Wodr AND Link-Time Optimizations/Link-Time Code Generation turned on can warn on some One Definition Rule collisions. But for the most part, if the code from two or more translation units has identical names, compilers just play a quick game of Russian Roulette and eliminate all but one version of the code. It should not matter if there are multiple versions because we promised that all versions would be the same. This promise often gets violated (because different translation unit compilations use different macro definitions, or produce odd compile-time values that go into functions to provoke differing behavior), and it results in occasional hilarity as well as brutal, tear-filled 3 AM debugging sessions.
Rust sidesteps this problem entirely by basing its code concepts not on translation units that get blended into an executable later, but instead on conceptual modules and crates. We will not regale everyone with the details here, but effectively functions, structures, unions, traits, and more all belong to the modules and crates that define them. Code is included through use importations and similar, and this allows strong ownership of code that belongs to one logical entity (the crate, and within a crate, the module) rather than every single file (translation unit) claiming total control over every single piece of code that gets #included in. This eliminates quite a few obvious flaws from C, chief among them the need to manically mark every header-implemented function with inline and then spend billions in compute cycles deduplicating the mess.
The drawbacks that start showing up, particularly in relation to the macro system and generic programming systems in Rust, are about this strong ownership property.
Traits and Ownership
Consider a trait in a given crate cats, such as:
pub trait Purr {
    fn do_purr(&self);
}
In our own library (which is its own crate), we have a structure called Lion. We are allowed to implement the trait Purr for our Lion like so, in our own library:
use cats::Purr;

pub struct Lion {}

impl Purr for Lion {
    fn do_purr(&self) {
        // big cat, bigger purrs
        println!("p u r r")
    }
}
This works fine for our own code in our own programs/libraries. However, consider a different structure named Puma that exists in another library, big_animals. It does not have an implementation of Purr on it, but a Puma is a (large) cat, so we would like to add one. So, in our own library again, we try to import the Puma structure and add the necessary Purr implementation:
use big_animals::Puma;
use cats::Purr;

impl Purr for Puma {
    fn do_purr(&self) {
        // long, sleek purrs
        println!("purrrrrr")
    }
}
This does not compile. It runs afoul of what is called the Orphan Rule, which is part of a broader Rust property called coherence. The short definition of the Orphan Rule is as follows:
You cannot provide implementations of a trait for a struct unless you are either the crate that defines the struct, or the crate that defines the trait.
Knowing this, big_animals cannot have someone outside of it add the cats::Purr trait to its types, as that runs afoul of both coherence and, specifically, the Orphan Rule. This presents one of the biggest problems that Rocket, rlua, clap, and so many other codebases have to wrestle with when they do #[derive(…)]-based or trait-based programming. There are many ways to get around such an issue, but almost every solution requires more wrappers and special types. For example, for the problem with Puma above, there are 2 ways to go about this: create a new trait, or create a new type. The latter is the solution that is normally used, by way of the “new type idiom”. It works, fundamentally, like this:
use cats::Purr;

struct MyPuma(big_animals::Puma);

impl Purr for MyPuma {
    fn do_purr(&self) {
        // long, sleek purrs
        println!("purrrrrr")
    }
}
This is, ostensibly, not very workable if we have to pass a Puma into a function to do some work, or receive a Puma back. In either case, we need to “unpack” (easy enough by doing my_puma_thing.0) or we have to construct a MyPuma every time we get one back (e.g., MyPuma(thing_that_makes_a_regular_puma())). This requires a bit of manual labor, and it introduces some compatibility issues in that there is now a (purely syntactic) barrier between the functionality that someone may want/need and the data type it is implemented over.
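The wrapping and unwrapping friction can be sketched in a single file as follows. The cats and big_animals modules here are local stand-ins for the two external crates, and the find_puma and describe functions, the name field, and the String return type on do_purr are all assumptions added so the example runs on its own. Because everything lives in one crate, the Orphan Rule is not actually triggered; only the shape of the workaround is shown.

```rust
// Stand-ins for the external crates from the text. In a real project these
// would be separate crates, and the Orphan Rule would forbid writing
// `impl cats::Purr for big_animals::Puma` in our own crate.
mod cats {
    pub trait Purr {
        // Returns a String (an assumption for this sketch) so the result
        // can be inspected instead of only printed.
        fn do_purr(&self) -> String;
    }
}

mod big_animals {
    pub struct Puma {
        pub name: String,
    }

    // Hypothetical upstream function that hands back a plain `Puma`.
    pub fn find_puma() -> Puma {
        Puma { name: String::from("Cougar") }
    }

    // Hypothetical upstream function that only accepts a plain `Puma`.
    pub fn describe(puma: &Puma) -> String {
        format!("a puma named {}", puma.name)
    }
}

use cats::Purr;

// The "new type idiom": a local wrapper we are allowed to implement on.
struct MyPuma(big_animals::Puma);

impl Purr for MyPuma {
    fn do_purr(&self) -> String {
        format!("{} says purrrrrr", self.0.name)
    }
}

fn main() {
    // Wrap on the way in...
    let my_puma = MyPuma(big_animals::find_puma());
    println!("{}", my_puma.do_purr());
    // ...and unwrap with `.0` on the way back out.
    println!("{}", big_animals::describe(&my_puma.0));
}
```

Every boundary between our code and big_animals needs one of those two conversions, which is the purely syntactic barrier described above.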
Viral Proliferation
Because it is impossible to fully anticipate the myriad of traits that one may need to implement on a structure, union, enumeration, or other type in Rust, many crates need to add “features” onto themselves to accommodate external crates. Going through a handful of crates will reveal {name}-serde features or similar addendums sticking off many of them. This is where “baseline” or otherwise important (but still entirely external/orthogonal) traits are given their needed implementations by the owners of whatever {name} crate. Many of the serde-prefixed and -suffixed features exist just to pull in the foundational serialization crate serde and then provide baseline definitions, without the new type/new trait idiom, on the types provided by the crate that added the feature. This may be surprising to some folks, but is actually fairly normal for those of us who participated in the C# and .NET ecosystems for a long time.
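A feature-gated implementation of this kind often looks something like the sketch below. The Puma type, its fields, the serde_support_enabled helper, and the feature name “serde” are assumptions for illustration; a real crate would also declare the feature and an optional serde dependency in its Cargo.toml. When the feature is off, the cfg_attr line disappears entirely and serde is never referenced.

```rust
// Hypothetical library type. The serde derives only exist when the crate is
// built with its (assumed) `serde` Cargo feature, which would also pull in
// the optional `serde` dependency. Without the feature, that line vanishes.
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone, PartialEq)]
pub struct Puma {
    pub name: String,
    pub weight_kg: u32,
}

// Reports whether this particular build opted into serialization support.
pub fn serde_support_enabled() -> bool {
    cfg!(feature = "serde")
}

fn main() {
    let puma = Puma { name: String::from("Cougar"), weight_kg: 62 };
    println!("{} weighs {} kg", puma.name, puma.weight_kg);
    println!("serde support: {}", serde_support_enabled());
}
```

The cost is that the owning crate has to know about, and maintain, every such integration itself.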
That is, C# encounters this same base-level need for its interfaces (and exactly the interfaces defined by a handful of libraries) to be shared all over the ecosystem. Because interfaces are unique to the “assembly” (the closest .NET parallel to a crate), even if they have the same members/methods/properties, one has to implement exactly the interface from that specific assembly. It results in very much the same kind of viral need to continually pull in oftentimes unrelated assemblies and dependencies, expressly for the purpose of implementing the interfaces on a given type “properly” so it can be used seamlessly with a given library. Things are generally okay when we are writing generic functions in C# and working with IEnumerable or similar interface specifications that come from the standard library, but it gets very dicey when we have to start combining multiple pieces of code that want a function Foo(), and there are 3 different IFoo interfaces in the different libraries. There are a number of ways to solve this problem in Rust, but none are without drawbacks or unfavorable tradeoffs.
This is where the generic system of C++ becomes much better for end-users and, in some cases, the very librarians writing the code themselves. Even with its objectionable template syntax and extremely depressing usage experience (SFINAE, as a starter, and deeply nested error explanations with oft-times missing information), it is incredibly easy to take code that is 5, 10, or even 20 years old and make it work seamlessly with other code. The lack of such strong ownership semantics means that templated functions and templated structures can be modified, specialized, and hacked up to cover a wide variety of types that the library author or application developer has no ownership rights to. Function overloads can be added for new data without impacting any existing overload choices. 13 year old gap buffer implementations sitting on GitHub can have the dust blown off of them and be made to work with the existing and future Standard Library algorithms with little to no effort. No IEnumerable interfaces need to be added, nor Iterator traits opted into.
C++ only gets away with this by deeply abusing inline/header-written code, and its lack of code ownership. It toes the very line of what its One Definition Rule means. However, this ultimately means that generic code written in C++ can provide, both in theory and in significant industry practice, infinite value without actually being fully rewritten. This is in stark contrast to interface-based and trait-based generics, where code must be constantly updated, tweaked, or opted into to provide the full range of benefits afforded by the generic system. Combined with the ownership rules for Rust-style trait implementations, a type must not only be opted in, but has to be wrapped over and over again if the original crate author is not willing to add a feature or addendum to their library, as they may not consider our use case worthwhile or foundational.
As with most complex Computer Science topics, the real answer is “it depends”.
There exists a mixed bag of approaches. Many developers use the basic approach with “newtype” and just commit to lots of unwrapping and unboxing. Often, the implementations are so bog-standard that they either rewrite with newtype idioms or just write wrapper functions that call into the “real” functions to do work, hooking them up to the actual implementation. For a small number (say, up to even 100) of functions and types, such hand-tailored work is not impossible. If the work follows a specific pattern, macros and procedural macros can often wrap up the work. There is still boilerplate to declare, but a lot of that can be somewhat automated since Rust macros are expressly designed to generate and emit code. Much like C macros, however, debugging such expansions and working through any troubles from the generated code turns out to be a challenge, though generally the tools are much better than the ones that come with most C compilers.
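As a rough sketch of that macro-driven approach, the declarative macro below stamps out a newtype wrapper, the trait implementation, and a From conversion in one go. The purring_newtype macro, the Lynx type, and the local Purr trait returning a String are all invented for this example; they only mirror the earlier cats examples and do not come from any real crate.

```rust
// Local stand-in for the `cats::Purr` trait from the earlier examples.
// `do_purr` returns a String here (an assumption for this sketch) so the
// result is easy to inspect.
trait Purr {
    fn do_purr(&self) -> String;
}

// Stand-ins for types we pretend to import from other crates.
struct Puma;
struct Lynx;

// Hypothetical helper macro: generates the wrapper type, the trait
// implementation, and a `From` conversion for wrapping a plain value.
macro_rules! purring_newtype {
    ($wrapper:ident, $inner:ty, $sound:expr) => {
        #[allow(dead_code)]
        struct $wrapper($inner);

        impl Purr for $wrapper {
            fn do_purr(&self) -> String {
                String::from($sound)
            }
        }

        impl From<$inner> for $wrapper {
            fn from(inner: $inner) -> Self {
                $wrapper(inner)
            }
        }
    };
}

// One line per foreign type instead of a hand-written wrapper each time.
purring_newtype!(MyPuma, Puma, "purrrrrr");
purring_newtype!(MyLynx, Lynx, "prrt");

fn main() {
    println!("{}", MyPuma::from(Puma).do_purr());
    println!("{}", MyLynx::from(Lynx).do_purr());
}
```

The boilerplate shrinks, but the wrapper types themselves do not go away: callers still have to convert in and out of MyPuma and MyLynx.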
Still, these are not the only approaches possible. Some shift the burden entirely to procedural macros and try to take it to its logical maximum, such as with reflect.
reflect
David Tolnay’s reflect, at the outset, operates in macro / procedural macro space and wants to solve the problem of robustness when using low/mid-level abstractions in syn and friends. That means that all of the above criticisms about #[derive(…)], custom attributes, and ownership issues will always apply to whatever comes out of the effort in this space. There will always be some amount of needing to own the code explicitly so it can be appropriately marked up; or of needing to create new types/traits so they can be marked up; or of trying to duplicate enough of rustc’s behavior so that it can look up crates, walk the file system, and attempt to do more of what both cargo and rustc do at (procedural) macro time, which has obvious drawbacks for compilation time and build system sanity.
The proof of concept and the philosophy do make for a very attractive metaprogramming story; namely, it provides an interface that is superficially similar to Zig’s comptime T: type style of generic programming. You handle reflected values similarly to the way you would handle a type value at comptime in Zig. We do not think everything we do with introwospection will be able to cover all of what dtolnay/reflect (or, indeed, (procedural) macro programming in Rust) can do, but we do aim to take a significant chunk of the use cases present here and instead move them into something the Rust language can handle itself. This leaves the other approach to this problem: run-time registration and reflection.
bevy_reflect
The bevy_reflect crate is much more fully featured and production-ready, focusing entirely on the ability to hold onto, transport, and interact with types and methods at run-time. It has advantages in how it interacts with code, in that it provides a fixed set of interfaces that allow folks to register types, functions, and even potential trait information at run-time. It forsakes perfect code generation by no longer being compile-time (and, thus, giving up a meager amount of performance, binary code size, and disk/RAM space) and leans deeply into a well-defined, deeply polished system.
With bevy_reflect, we describe the behavior of a type using special traits and registration functions and root the entire system in a handful of traits (Reflect being one of them) before using copious amounts of downcast and downcast_ref to get the information we want at specific junctures in the system. This means we only register what we want to use, and it is not limited solely to types we explicitly own either. Even the examples in bevy_reflect’s tutorials are exceptionally cool and very powerful, from modifying existing fields on a struct hidden behind a run-time type description to iterating over a select list of structures that match a certain casting pattern. This reflection solution feels the closest to a Java or C# style of reflection built entirely out of a library, and likely feels familiar to folks who have built similar systems out of void* (C) or BaseClass* (C++).
Unfortunately, because it goes about this at run-time, it violates our fairly explicit goal of not making the user pay for things they may not use. Run-time registration that must be cast down and cast up, a (slight) run-time cost to the operations being performed, needing to check constantly to ensure things are the type or shape that we need them to be: all of these things are both programmatic and performance overhead for our stated goals. Many of these operations are things the language could make safer if it had the ability to look at its own types and functions, so that it did not need constant checking, casting, and unwrap()-ing to get work done.
The std::any::* Module and dyn Trait
std::any::Any
, std::any::TypeId
, and comparable constructs within the std::any
module present methods of dealing with values via what’s successfully a wrapped-up interface. That is additional difficult by the Rust language itself, utilizing dyn SomeTrait
and comparable constructs to permit boxing and digital table-ing of many entities. This falls in a lot the identical class as bevy_reflect
does, as these are run-time powered mechanisms. The main target of this module and for dyn SomeTrait
is the power so as to add a degree of indirection (digital tables, oblique perform calls, and comparable methods) to permit for “hiding” the supply of knowledge whereas permitting a downstream shopper to work with it with out strictly typing the interface. A lot of this additionally powers most of the implementation methods in bevy_reflect
, so it’s protected to name this a subset of what bevy_reflect
presents to end-users with extra general-purpose instruments.
Is That Sufficient?
Well, not really. As our goal is compile-time, no-overhead introspection on existing program types, we come to a bit of an impasse. While dtolnay/reflect can theoretically provide us with no-overhead introspection, its tie-in to the procedural macro system prevents us from using it in the normal Rust language, or on types we do not own (e.g., outside our program/library crate). It also does not necessarily improve the generic programming situation in Rust, nor give us better primitives to work with in the language itself.
Having seen the myriad of approaches, we now turn to our own attempt, which strays from the procedural macro-heavy approach of Tolnay’s reflect and turns away from the run-time powers of bevy_reflect. We may not be able to solve every problem that procedural macros or run-time reflection registration can, but we believe that the examples below will illustrate many of the ways in which we can promote a better way of performing compile-time computation with the goal of looking back on code. We will start by explaining the API and compiler interface itself, then dive into the reasons for design decisions, how it avoids some of the pitfalls of the alternatives, and the challenges we would have to face down in order to make it happen. We will also discuss shortcuts which can be used to sidestep the lack of many unstable, missing, and/or incomplete Rust features.
At its core, introwospection attempts to invert the #[derive(MakeSomeTrait)]-style of code generation and instead replace it with true compile-time, generic-capable introspection, utilizing information the compiler already has. Here is a basic example of the API in a program crate called cats_exec:
#![feature(introwospection)]
use std::introwospection::{FieldDescriptor, StructDescriptor};
use std::mem::size_of;
use std::any::type_name;

pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
    destruction_of_all_curtains_trigger: u32
}

pub fn main () {
    type KittyDesc = introwospect_type<Kitty>;
    println!("struct {}, with {} fields {{",
        <KittyDesc as StructDescriptor>::NAME,
        <KittyDesc as StructDescriptor>::FIELD_COUNT);
    println!("\t{} ({}, size {}, at {})",
        <KittyDesc::Fields as FieldDescriptor<0>>::NAME,
        type_name::<<KittyDesc::Fields as FieldDescriptor<0>>::Type>(),
        size_of::<<KittyDesc::Fields as FieldDescriptor<0>>::Type>(),
        <KittyDesc::Fields as FieldDescriptor<0>>::BYTE_OFFSET);
    println!("\t{} ({}, size {}, at {})",
        <KittyDesc::Fields as FieldDescriptor<1>>::NAME,
        type_name::<<KittyDesc::Fields as FieldDescriptor<1>>::Type>(),
        size_of::<<KittyDesc::Fields as FieldDescriptor<1>>::Type>(),
        <KittyDesc::Fields as FieldDescriptor<1>>::BYTE_OFFSET);
    println!("\t{} ({}, size {}, at {})",
        <KittyDesc::Fields as FieldDescriptor<2>>::NAME,
        type_name::<<KittyDesc::Fields as FieldDescriptor<2>>::Type>(),
        size_of::<<KittyDesc::Fields as FieldDescriptor<2>>::Type>(),
        <KittyDesc::Fields as FieldDescriptor<2>>::BYTE_OFFSET);
    println!("}}");
}
The integer constant I for FieldDescriptor<I> is the declaration (source code) index. We are accessing each field explicitly, one at a time. In general, every entity in Rust is folded into a trait which has associated types and associated const items, and the collection of these traits is called Descriptors. All of the information can be carried through const fns and/or the type system: in this example, we use introwospect_type<…>, which takes a single type argument and yields its descriptor type. The program will ultimately print:
struct cats_exec::Kitty, with 3 fields {
    is_soft (bool, size 1, at 12)
    meows (i64, size 8, at 0)
    destruction_of_all_curtains_trigger (u32, size 4, at 8)
}
Those not used to Rust will note that, when not annotated by a specific kind of #[repr(…)] attribute, byte offsets do not correspond to a linear sequence within the structure. Fields can be rearranged at will, which is why the API uses the 0/1/…/up-to-::FIELD_COUNT - 1 values that correspond to the unique declaration index rather than any other metric. Each FieldDescriptor<I> trait implementation yields the information about that field on the structure. This API also does not allow access to e.g. crate-hidden or private fields. For example, if Kitty is moved out of cats_exec and is instead in a crate named felines, the code would be unable to access “Field 2” (destruction_of_all_curtains_trigger). So, with felines/src/Kitty.rs:
pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
    destruction_of_all_curtains_trigger: u32
}
And cats_exec/src/main.rs:
#![feature(introwospection)]
use std::introwospection::{FieldDescriptor, StructDescriptor};
use std::mem::size_of;
use std::any::type_name;
use felines::Kitty;

pub fn main () {
    type KittyDesc = introwospect_type<Kitty>;
    println!("struct {}, with {} fields {{",
        <KittyDesc as StructDescriptor>::NAME,
        <KittyDesc as StructDescriptor>::FIELD_COUNT);
    println!("\t{} ({}, size {}, at {})",
        <KittyDesc::Fields as FieldDescriptor<0>>::NAME,
        type_name::<<KittyDesc::Fields as FieldDescriptor<0>>::Type>(),
        size_of::<<KittyDesc::Fields as FieldDescriptor<0>>::Type>(),
        <KittyDesc::Fields as FieldDescriptor<0>>::BYTE_OFFSET);
    println!("\t{} ({}, size {}, at {})",
        <KittyDesc::Fields as FieldDescriptor<1>>::NAME,
        type_name::<<KittyDesc::Fields as FieldDescriptor<1>>::Type>(),
        size_of::<<KittyDesc::Fields as FieldDescriptor<1>>::Type>(),
        <KittyDesc::Fields as FieldDescriptor<1>>::BYTE_OFFSET);
    // Compile-time error
    // <KittyDesc::Fields as FieldDescriptor<2>>
    // would error!
    //
    //println!("\t{} ({}, size {}, at {})",
    //    <KittyDesc::Fields as FieldDescriptor<2>>::NAME,
    //    type_name::<<KittyDesc::Fields as FieldDescriptor<2>>::Type>(),
    //    size_of::<<KittyDesc::Fields as FieldDescriptor<2>>::Type>(),
    //    <KittyDesc::Fields as FieldDescriptor<2>>::BYTE_OFFSET);
    //
    println!("}}");
}
Which would produce:
struct felines::Kitty, with 2 fields {
    is_soft (bool, size 1, at 12)
    meows (i64, size 8, at 0)
}
This means we have a privacy-respecting way of looking at the data for a type which is not part of a crate we authored! There is just one small problem with the above code, unfortunately.
It. Is. U G L Y.
It does not take an opinionated code beautician or a grizzled Staff Engineer to look at the above code and tell us that this has to be the world’s ugliest API in existence. Nobody wants to write the above, nobody wants to hard-code source-level indices, nobody wants to constantly cast some type to a given field descriptor. Rust also does not allow us to have a local type declaration or something similar, like so:
#![feature(introwospection)]
use std::introwospection::{FieldDescriptor, StructDescriptor};
use std::mem::size_of;
use std::any::type_name;
use felines::Kitty;

pub fn main () {
    type KittyDesc = introwospect_type<Kitty>;
    type trait KittyField0 = <KittyDesc as FieldDescriptor<0>>;
    type trait KittyField1 = <KittyDesc as FieldDescriptor<1>>;
    /* rest of the code… */
}
This means every line has to contain the trait cast to provide compile-time access to a specific, declaration-indexed field on something. It is genuinely a terrible way to program for reflection, being both non-ergonomic and brittle. The question, then, is why the interface is like this in the first place. Can we not simply use a natural collection for fields or enumeration variants or function arguments or any other thing that boils down to “list of types and other related information”? It turns out that this is much harder than it seems without compromising the goals of introwospection. First, let’s start with using arrays and slices.
What’s wrong with [T; N] or [T]?
One might ask why we are not using arrays (or slices) to provide this information in a consistent, easy-to-use manner. The simple answer is that arrays require everything to coalesce to a single type. Doing so would mean that it would be impossible to provide type information in a compile-time, non-type-erased, non-potentially-allocating manner with e.g. Field_at_I::Type. This means arrays are unsuitable for the task at hand; we want no erasure to take place and no run-time checking of types as is done with bevy_reflect. Alternatively, we would have to wrap every list of field descriptors up into an enum of compiler-generated structures that each contain the necessary FieldDescriptor<I> implementation, but that also imposes a compile-time overhead because one would only be able to access the type within a match clause. This is further inhibiting because match clauses over such types do not allow us to perform different kinds of implementations for different Ts within the case arm. The way around this is with something gory like transmute_copy, and praying that it optimizes out. Reportedly, power users of Rust have done something like this and achieved success by depending on the compiler optimizing out unused match arms for a given function and directly accessing the correct memory through the transmute_copy, but this once again relies on the good graces of the compiler. There is no guarantee a Rust compiler will not just, on a whim one day, fumble the code and start producing some of the most gnarly jumps and conditionals we have seen up until now.
Ultimately, we decided that such an approach was likely to be brittle, and representing lists of types such as function arguments, struct fields, enumeration variants (and the fields on each variant) in this manner was too error prone.
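As a small illustration of the erasure problem, here is a sketch on stable Rust of what an array-based field list might look like. ErasedField, kitty_fields, and type_label are hypothetical names invented for this example; the field data is written by hand rather than produced by any compiler feature.

```rust
use std::any::TypeId;

/// Hypothetical type-erased field record: because an array needs one element
/// type, every field collapses to this struct and the real type is lost.
pub struct ErasedField {
    pub name: &'static str,
    pub type_id: TypeId,
}

/// Hand-written field list for a struct with `is_soft: bool` and `meows: i64`.
pub fn kitty_fields() -> [ErasedField; 2] {
    [
        ErasedField { name: "is_soft", type_id: TypeId::of::<bool>() },
        ErasedField { name: "meows", type_id: TypeId::of::<i64>() },
    ]
}

/// The field's type can only be recovered by comparing identifiers at run-time.
pub fn type_label(field: &ErasedField) -> &'static str {
    if field.type_id == TypeId::of::<bool>() {
        "bool"
    } else if field.type_id == TypeId::of::<i64>() {
        "i64"
    } else {
        "unknown"
    }
}

fn main() {
    for field in kitty_fields().iter() {
        println!("{}: {}", field.name, type_label(field));
    }
}
```

There is no way to write something like field.Type here: the array can carry an identifier for the type, but not the type itself, so generic code cannot be instantiated per field.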
What About (T, U, V, R, S, …)? They’re Heterogenous Collections, Right?
Another potential approach is providing a tuple of types, from 0 to Description::FIELD_COUNT - 1. Unfortunately, tuple-based programming is unsophisticated in Rust. To compare, C has no tuples to speak of other than directly making a structure. C++ has the grungy std::tuple, which is usable (including at compile-time) but deeply costly (to build times) without compiler implementations pouring a LOT of energy into writing heuristics to speed up their access, instantiation, and general usage. Rust has built-in tuples, which is an enormous win for compilation time at instantiation and usage time, but robbed itself of most usability. Accessing a tuple cannot be done programmatically with constant expressions, because the my_tuple.0 syntax is a hardcoded, explicit allowance of the syntax that literally requires a number after the dot.
We cannot calculate a const INDEX: usize = /* some computation here */; and then do my_tuple.(INDEX), my_tuple[INDEX], my_tuple[std::num::type_num<INDEX>], or std::tuple::get::<INDEX>(my_tuple). We honestly do not care what the syntax would look like, and maybe std::tuple::get::<INDEX>(my_tuple) could be implemented, today, in Rust, but it does not exist. There is no way to program over a tuple other than hardcoding not only every tuple from size 0 to N into multiple traits, but then doing the combinatoric manual spam to get every value. This contributes to Rust compile-time issues, in the same way that using the excellent P99 library for most macro metaprogramming in C contributes to increased compile times with a high count for the P99 FOR_EACH or similar macros. Even C++ standard libraries used to implement variadic std::function with macros, and they suffered compile-time loss until they moved to true variadics and compile-time programmability. It turns out that “fake it until you make it” is universally an expensive thing to do no matter what language architecture we build up for ourselves. Needing to parse 0-to-8, 0-to-12, or 0-to-25 faked up things will always be more expensive than having true variadic and tuple support. The language also does not support operations like .map(…)-ing over a tuple to produce a tuple of different types, although as the code from frunk proves we can ostensibly fake it for ourselves. Most attempted faux-implementations, especially naïve ones that take a “Cons and List” style approach, can result in quadratic time algorithms, further exacerbating compile time costs.
Given this, we have decided not to rely directly on tuples to meet the metaprogramming needs in conjunction with our desired API for compile-time introspection. This is in the interest of keeping compile times as low as possible from the outset, and also in the interest of not saddling the user with a subpar, map-less API that requires external crates or serious workarounds to handle. In general, type-heterogenous programming and support is fairly weak in Rust, which is why we have come up with the following API that is much more manageable. It is not a perfect API and it cannot handle truly generic, type-heterogenous collections without compiler help, but given how the introwospect_type, introwospect, and introwospect_over keywords would be implemented, it should provide us with the maximum amount of flexibility for the jobs we want to pursue.
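For readers who have not run into it, the “hardcode every tuple size” workaround described above typically looks something like the following on stable Rust. TupleLen and impl_tuple_len are hypothetical names made up for this sketch; real crates generate many more arities and many more traits in the same fashion.

```rust
/// Hypothetical helper trait: reports how many elements a tuple type has.
pub trait TupleLen {
    const LEN: usize;
}

/// Stamps out one `TupleLen` implementation for a single tuple arity.
macro_rules! impl_tuple_len {
    ($len:expr; $($name:ident),*) => {
        impl<$($name),*> TupleLen for ($($name,)*) {
            const LEN: usize = $len;
        }
    };
}

// Every supported arity has to be spelled out by hand (or by yet another macro).
impl_tuple_len!(0;);
impl_tuple_len!(1; A);
impl_tuple_len!(2; A, B);
impl_tuple_len!(3; A, B, C);

fn main() {
    println!("{}", <(bool, i64, u32) as TupleLen>::LEN);
    // A four-element tuple does not implement `TupleLen` here; supporting it
    // would require adding another `impl_tuple_len!` line above.
}
```

Each additional arity is another implementation the compiler has to parse and type-check, whether or not the program ever uses it.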
To make things easier, we just have to implement a basic visitor type:
#![feature(introwospection)]
use std::introwospection::{
    FieldDescriptor, FieldDescriptorVisitor,
    StructDescriptor, StructDescriptorVisitor,
};
use std::mem::size_of;
use std::any::type_name;
use felines::Kitty;

struct DescriptorPrinter;

impl FieldDescriptorVisitor for DescriptorPrinter {

    type Output = ();

    fn visit_field<Descriptor: 'static, const INDEX: usize>(&self) -> Self::Output
        where Descriptor : FieldDescriptor<INDEX>
    {
        let type_name = type_name::<Descriptor::Type>();
        let member_size = size_of::<Descriptor::Type>();
        println!("\t{} ({}, size {}, at {})",
            Descriptor::NAME,
            type_name,
            member_size,
            Descriptor::BYTE_OFFSET);
    }
}

impl StructDescriptorVisitor for DescriptorPrinter {

    type Output = ();

    fn visit_struct<Descriptor>(&self) -> Self::Output
        where Descriptor : StructDescriptor
    {
        println!("struct {}, with {} fields {{",
            Descriptor::NAME,
            Descriptor::FIELD_COUNT);
        // now, introspect over the fields of this type.
        ( introwospect_over(Descriptor::Type, Descriptor::Fields, self) );
        println!("}}");
    }
}

pub fn main () {
    let visitor = DescriptorPrinter;
    introwospect(Kitty, visitor);
}
Which would produce output identical to the one above:
struct felines::Kitty, with 2 fields {
    is_soft (bool, size 1, at 12)
    meows (i64, size 8, at 0)
}
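The shape of this visitor can be approximated on stable Rust today if the descriptors are written by hand. The following is only a sketch under that assumption: the FieldDescriptor and FieldDescriptorVisitor traits below are our own simplified stand-ins, Collector and visit_kitty_fields are hypothetical names, and visit_kitty_fields plays the role that introwospect_over would play if the compiler generated it.

```rust
use std::any::type_name;
use std::mem::{offset_of, size_of};

pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
}

/// Simplified stand-in for the proposed descriptor trait.
pub trait FieldDescriptor<const INDEX: usize> {
    type Type: 'static;
    const NAME: &'static str;
    const BYTE_OFFSET: usize;
}

// Hand-written versions of what the compiler would be expected to generate.
impl FieldDescriptor<0> for Kitty {
    type Type = bool;
    const NAME: &'static str = "is_soft";
    const BYTE_OFFSET: usize = offset_of!(Kitty, is_soft);
}
impl FieldDescriptor<1> for Kitty {
    type Type = i64;
    const NAME: &'static str = "meows";
    const BYTE_OFFSET: usize = offset_of!(Kitty, meows);
}

/// Simplified stand-in for the proposed visitor trait.
pub trait FieldDescriptorVisitor {
    type Output;
    fn visit_field<D, const INDEX: usize>(&mut self) -> Self::Output
    where
        D: FieldDescriptor<INDEX>;
}

/// Collects one description line per visited field.
pub struct Collector {
    pub lines: Vec<String>,
}

impl FieldDescriptorVisitor for Collector {
    type Output = ();
    fn visit_field<D, const INDEX: usize>(&mut self) -> Self::Output
    where
        D: FieldDescriptor<INDEX>,
    {
        self.lines.push(format!(
            "{} ({}, size {}, at {})",
            D::NAME,
            type_name::<D::Type>(),
            size_of::<D::Type>(),
            D::BYTE_OFFSET
        ));
    }
}

/// Manual expansion of the iteration that `introwospect_over` would perform.
pub fn visit_kitty_fields<V: FieldDescriptorVisitor>(visitor: &mut V) {
    visitor.visit_field::<Kitty, 0>();
    visitor.visit_field::<Kitty, 1>();
}

fn main() {
    let mut collector = Collector { lines: Vec::new() };
    visit_kitty_fields(&mut collector);
    for line in &collector.lines {
        println!("{line}");
    }
}
```

Everything the visitor touches (names, types, sizes, offsets) is resolved at compile-time; the only hand-maintained parts are the descriptor implementations and the per-index calls, which is precisely the boilerplate the proposed keywords are meant to remove.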
Visitors are meant to be the mid-level API. They offer a large amount of flexibility and, in conjunction with the introwospect and introwospect_over keywords, allow for easily iterating through the visible fields on structs, unions, enums, fns, and other types. This level of compile-time introspection allows us to perform algorithms in a systematic manner over any type and any field. The visitors that come with the API are as follows. All INDEX names are constant usize expressions that refer to the declaration (source order) index of the entity within whatever context it is used.
- StructDescriptorVisitor: for observing compile-time information through a generically supplied StructDescriptor-implementing type parameter describing either a struct or union type.
- EnumDescriptorVisitor: for observing compile-time information through a generically supplied EnumDescriptor-implementing type parameter describing an enum type. If the enumeration has no variants, then the Variants is NoType. Otherwise, it has from INDEX = 0 to INDEX = VARIANT_COUNT - 1 implementations of the trait VariantDescriptor<INDEX>.
- TupleDescriptorVisitor: for observing compile-time information through a generically supplied TupleDescriptor-implementing type parameter describing the built-in type (…). The unit type () is covered by a TupleDescriptor whose Fields is NoType (i.e., there are no fields on the tuple). Otherwise, the Fields has from INDEX = 0 to INDEX = FIELD_COUNT - 1 implementations of the trait FieldDescriptor<INDEX>.
- ArrayDescriptorVisitor: for observing compile-time information through a generically supplied ArrayDescriptor-implementing type parameter describing the built-in type [T; N]. Uses type ElementType; and const ELEMENT_COUNT: usize; to describe the compile-time size.
- SliceDescriptorVisitor: for observing compile-time information through a generically supplied SliceDescriptor-implementing type parameter describing the built-in type [T]. Uses type ElementType; to describe the T of the slice.
- FunctionDescriptorVisitor: for observing compile-time information through a generically supplied FunctionDescriptor-implementing type parameter describing the built-in type fn name (Args…); and potentially closure objects as well. Uses type ReturnType; to describe the return type, and const PARAMETER_COUNT: usize; to describe the number of parameters the function has. If it is greater than 0, the type Parameters has implementations from INDEX = 0 to INDEX = PARAMETER_COUNT - 1 of ParameterDescriptor<INDEX>.
- Two versions for visiting fields at compile-time:
  - FieldDescriptorVisitor: for observing compile-time information through a generically supplied FieldDescriptor<INDEX>-implementing type parameter. Use with the std::introwospection::get_field::<…>(…) function and an object whose type matches type Owner to get the field of the tuple, union, enumeration, or structure.
  - FieldDescriptorVisitorAt<INDEX>: identical to FieldDescriptorVisitor, but provides the information for the specified INDEX directly into the *Visitor implementation itself. This allows a different associated type to be chosen for each field on a structure descriptor, enabling heterogenous programming over a visited collection.
- Two versions for visiting variants at compile-time:
  - VariantDescriptorVisitor: for observing compile-time information through a generically supplied VariantDescriptor<INDEX>-implementing type parameter. Variants are interesting in that a variant on an enumeration does not represent a type by itself, causing all sorts of problems for the reflection API as a whole. It often requires bizarre workarounds to get at the data optionally contained within a variant. A type Fields; is available on each variant descriptor which has its own FieldDescriptor<INDEX> implementations from INDEX = 0 to INDEX = FIELD_COUNT - 1, provided the variant descriptor’s FIELD_COUNT is greater than 0. Programmatically, one can iterate over each variant and use the DISCRIMINANT field to check if a given enumeration object has the same variant set, and then use std::introwospection::get_field::<…>(…).
  - VariantDescriptorVisitorAt<INDEX>: identical to VariantDescriptorVisitor, but provides the information for the specified INDEX directly into the *Visitor implementation itself. This allows a different associated type to be chosen for each variant on an enum’s descriptor, enabling heterogenous programming over a visited collection.
- Two versions for visiting function parameters at compile-time:
  - ParameterDescriptorVisitor: for observing compile-time information through a generically supplied ParameterDescriptor<INDEX>-implementing type parameter. Use with the std::introwospection::get_field::<…>(…) function and an object whose type matches type Owner to get the field of the tuple, union, enumeration, or structure.
  - ParameterDescriptorVisitorAt<INDEX>: identical to ParameterDescriptorVisitor, but provides the information for the specified INDEX directly into the *Visitor implementation itself. This allows a different associated type to be chosen for each parameter on a function descriptor, enabling heterogenous programming over a visited collection.
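To illustrate why the *VisitorAt<INDEX> variations matter, here is a sketch on stable Rust in which the visitor’s Output type differs per field index. The traits are our own simplified stand-ins with hand-written descriptor implementations, and Inspector and inspect_kitty are hypothetical names; only the idea of moving INDEX onto the visitor trait itself is taken from the description above.

```rust
use std::mem::size_of;

pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
}

/// Simplified stand-in for the proposed field descriptor trait.
pub trait FieldDescriptor<const INDEX: usize> {
    type Type: 'static;
    const NAME: &'static str;
}
impl FieldDescriptor<0> for Kitty {
    type Type = bool;
    const NAME: &'static str = "is_soft";
}
impl FieldDescriptor<1> for Kitty {
    type Type = i64;
    const NAME: &'static str = "meows";
}

/// Per-index visitor: because INDEX is a parameter of the trait itself,
/// each index gets its own implementation and its own `Output` type.
pub trait FieldDescriptorVisitorAt<const INDEX: usize> {
    type Output;
    fn visit_field<D: FieldDescriptor<INDEX>>(&self) -> Self::Output;
}

pub struct Inspector;

// For field 0 the visitor yields the field's name...
impl FieldDescriptorVisitorAt<0> for Inspector {
    type Output = &'static str;
    fn visit_field<D: FieldDescriptor<0>>(&self) -> Self::Output {
        D::NAME
    }
}
// ...while for field 1 it yields the field's size in bytes.
impl FieldDescriptorVisitorAt<1> for Inspector {
    type Output = usize;
    fn visit_field<D: FieldDescriptor<1>>(&self) -> Self::Output {
        size_of::<D::Type>()
    }
}

/// Visits both fields and returns a heterogenous result.
pub fn inspect_kitty() -> (&'static str, usize) {
    let inspector = Inspector;
    (
        <Inspector as FieldDescriptorVisitorAt<0>>::visit_field::<Kitty>(&inspector),
        <Inspector as FieldDescriptorVisitorAt<1>>::visit_field::<Kitty>(&inspector),
    )
}

fn main() {
    let (name, size) = inspect_kitty();
    println!("field 0 is named {name}, field 1 is {size} bytes");
}
```

A single visit_field with one Output could not return a string for one field and an integer for another without erasing the types; splitting the visitor by index keeps each result fully typed.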
Each of the highest level *Descriptor types is also paired with an AdtDescriptor trait, which provides the const NAME: &'static str; string name of the type, a const ID: AdtId; identification enumeration describing what kind of type it is, and const ATTRIBUTES: &'static [AttributeDescriptor];, which contains the names (and potentially, values) of anything found within a #[introwospect(…)] attribute.
Up until now, we have kind of hand-waved how a lot of this works. This is for good reason: while we were able to create mockups that achieved basic versions of what we have above using existing Rust Compiler Nightly builds with unstable features, we very swiftly ran up against various problems both theoretical and practical with how this could be implemented or provided by Rust. At each point, we had to work around various issues until we hit some of our penultimate blockers which have prevented us from having a working version of this in unstable, nightly Rust. We will start from the top with our various keywords and what they are meant to do, in conjunction with the semantics they would have if “written out” by hand. Then, we will drill down into the individual challenges (many already alluded to) and home in on those issues.
introwospect_type is the first keyword. It signals to the compiler that the type fed into it should be reflected upon. The type that comes out is not well-specified to be any particular type, but it will be something that implements one of the above Descriptor types based on what kind of type was fed into it (a structure, union, enumeration, function (pointer), and so on). For example, using an enumeration type CatHours in conjunction with Kitty for a program crate called cat_time:
#![feature(introwospection)]
use std::time::SystemTime;
use std::introwospection::*;

#[non_exhaustive]
pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
    destruction_of_all_curtains_trigger: u32
}

pub enum CatHours {
    ZeroOfThem,
    OneOfThem(Kitty),
    TwoOfThem{one: Kitty, two: Kitty},
    LotsOfThem(&'static [Kitty]),
    #[introwospection(lossy)]
    LostCountOfThem{ last_known: usize, when: SystemTime }
}

pub fn main () {
    type CatHoursDesc = introwospect_type<CatHours>;
    // .. other things here
}
The compiler would produce something like the following:
#![feature(const_discriminant)]
use std::time::SystemTime;
use std::introwospection::*;
use std::mem::{offset_of, Discriminant, discriminant_at}; // for impl
use std::option::Option; // for impl

#[introwospect(are_little_precious_babies = "yes!!")]
#[non_exhaustive]
pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
    destruction_of_all_curtains_trigger: u32
}

pub enum CatHours {
    ZeroOfThem,
    OneOfThem(Kitty),
    TwoOfThem{one: Kitty, two: Kitty},
    LotsOfThem(&'static [Kitty]),
    #[introwospection(lossy)]
    LostCountOfThem{ last_known: usize, when: SystemTime }
}

/* COMPILER GENERATION BEGINS HERE! */
// struct Kitty
unsafe impl AdtDescriptor for Kitty {
    const ID: AdtId = AdtId::Struct;
    const NAME: &'static str = "cat_time::Kitty";
    const ATTRIBUTES: &'static [AttributeDescriptor]
        = &[
            AttributeDescriptor{
                name: "are_little_precious_babies",
                value: Some("yes!!")
            }
        ];
}
unsafe impl FieldDescriptor<0> for Kitty {
    type Owner = Kitty;
    type Type = bool;
    const NAME: &'static str = "is_soft";
    const BYTE_OFFSET: usize = offset_of!(Kitty, is_soft);
}
unsafe impl FieldDescriptor<1> for Kitty {
    type Owner = Kitty;
    type Type = i64;
    const NAME: &'static str = "meows";
    const BYTE_OFFSET: usize = offset_of!(Kitty, meows);
}
unsafe impl FieldDescriptor<2> for Kitty {
    type Owner = Kitty;
    type Type = u32;
    const NAME: &'static str = "destruction_of_all_curtains_trigger";
    const BYTE_OFFSET: usize
        = offset_of!(Kitty, destruction_of_all_curtains_trigger);
}
unsafe impl StructDescriptor for Kitty {
    type Fields
        : FieldDescriptor<0> + FieldDescriptor<1> + FieldDescriptor<2>
        = Kitty;
    const FIELD_COUNT: usize = 3;
    // This is NOT a tuple struct (e.g., `Kitty(u64, i32, …)`).
    const IS_TUPLE_STRUCT: bool = false;
    // private fields are visible to the usage in this context
    const HAS_NON_VISIBLE_FIELDS: bool = false;
}
// CatHours
unsafe impl AdtDescriptor for CatHours {
    const ID: AdtId = AdtId::Enum;
    const NAME: &'static str = "cat_time::CatHours";
}
unsafe impl VariantDescriptor<0> for CatHours {
    type Owner = CatHours;
    const NAME: &'static str = "ZeroOfThem";
    const DISCRIMINANT: &'static Discriminant<CatHours>
        = &discriminant_at::<CatHours>(0).unwrap();
}
struct CatHours_Variant1_FieldsType;
unsafe impl FieldDescriptor<0> for CatHours_Variant1_FieldsType {
    type Owner = CatHours;
    type Type = Kitty;
    const NAME: &'static str = "0";
    const BYTE_OFFSET: usize = offset_of!(CatHours, OneOfThem.0);
}
unsafe impl VariantDescriptor<1> for CatHours {
    type Owner = CatHours;
    type Fields
        : FieldDescriptor<0>
        = CatHours_Variant1_FieldsType;
    const NAME: &'static str = "OneOfThem";
    const DISCRIMINANT: &'static Discriminant<CatHours>
        = &discriminant_at::<CatHours>(1).unwrap();
}
struct CatHours_Variant2_FieldsType;
unsafe impl FieldDescriptor<0> for CatHours_Variant2_FieldsType {
    type Owner = CatHours;
    type Type = Kitty;
    const NAME: &'static str = "one";
    const BYTE_OFFSET: usize = offset_of!(CatHours, TwoOfThem.one);
}
unsafe impl FieldDescriptor<1> for CatHours_Variant2_FieldsType {
    type Owner = CatHours;
    type Type = Kitty;
    const NAME: &'static str = "two";
    const BYTE_OFFSET: usize = offset_of!(CatHours, TwoOfThem.two);
}
unsafe impl VariantDescriptor<2> for CatHours {
    type Owner = CatHours;
    type Fields
        : FieldDescriptor<0> + FieldDescriptor<1>
        = CatHours_Variant2_FieldsType;
    const NAME: &'static str = "TwoOfThem";
    const DISCRIMINANT: &'static Discriminant<CatHours>
        = &discriminant_at::<CatHours>(2).unwrap();
}
struct CatHours_Variant3_FieldsType;
unsafe impl FieldDescriptor<0> for CatHours_Variant3_FieldsType {
    type Owner = CatHours;
    type Type = &'static [Kitty];
    const NAME: &'static str = "0";
    const BYTE_OFFSET: usize = offset_of!(CatHours, LotsOfThem.0);
}
unsafe impl VariantDescriptor<3> for CatHours {
    type Owner = CatHours;
    type Fields
        : FieldDescriptor<0>
        = CatHours_Variant3_FieldsType;
    const NAME: &'static str = "LotsOfThem";
    const DISCRIMINANT: &'static Discriminant<CatHours>
        = &discriminant_at::<CatHours>(3).unwrap();
}
struct CatHours_Variant4_FieldsType;
unsafe impl FieldDescriptor<0> for CatHours_Variant4_FieldsType {
    type Owner = CatHours;
    type Type = usize;
    const NAME: &'static str = "last_known";
    const BYTE_OFFSET: usize = offset_of!(CatHours, LostCountOfThem.last_known);
}
unsafe impl FieldDescriptor<1> for CatHours_Variant4_FieldsType {
    type Owner = CatHours;
    type Type = SystemTime;
    const NAME: &'static str = "when";
    const BYTE_OFFSET: usize = offset_of!(CatHours, LostCountOfThem.when);
}
unsafe impl VariantDescriptor<4> for CatHours {
    type Owner = CatHours;
    type Fields
        : FieldDescriptor<0> + FieldDescriptor<1>
        = CatHours_Variant4_FieldsType;
    const NAME: &'static str = "LostCountOfThem";
    const DISCRIMINANT: &'static Discriminant<CatHours>
        = &discriminant_at::<CatHours>(4).unwrap();
    const ATTRIBUTES: &'static [AttributeDescriptor] = &[
        AttributeDescriptor{
            name: "lossy",
            value: None
        }
    ];
}
/* COMPILER GENERATION ENDS HERE! */

pub fn main () {
    // Description type, in this case, is just
    // the type itself!
    type CatHoursDesc = CatHours;
    // .. other things here
}
This is a lot of boilerplate. introwospect_type gives us a type that implements the appropriate StructDescriptor (for structs), EnumDescriptor (for enums), VariantDescriptor<INDEX> for each variant at source-code index INDEX, and FieldDescriptor<INDEX> for each field within an enum’s variant or a struct at source-code index INDEX. In some cases, the type given is identical to the type itself. At other times, it is just a compiler-generated and un-nameable struct that is decorated with the appropriate implementations to enable the introspection.
There are a few things that should be highlighted about the expected semantics / magic we are employing in this generated code.
std::mem::offset_of! is Magic™
std::mem::offset_of! has only just landed for Rust and will eventually need to be stabilized. As of April 25th, 2023, it also does not have plans for how to get the offsets of the fields of a variant within an enumeration. So, the fanciful std::mem::offset_of!(CatHours, LostCountOfThem.when) syntax we are using above is entirely fictitious. We doubt that the official Language, Compiler, or Library teams take blog posts as a sincere form of feedback, so at some point we will likely have to participate in a future issue that asks for std::mem::offset_of! to work on some form of Rust’s enumerations. This could also be fixed by making CatHours::LostCountOfThem a real type in Rust, rather than a magical entity that can only be named specifically within a match clause, directly (see further below).
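For plain struct fields, the macro can be exercised directly. The sketch below assumes a toolchain on which std::mem::offset_of! is usable without a feature gate (our understanding is that this holds for recent stable releases, but treat that as an assumption); KittyC, kitty_offsets, and kitty_c_offsets are hypothetical names for this example. It deliberately avoids the enum-variant form, which is the fictitious part.

```rust
use std::mem::{offset_of, size_of};

/// Default representation: the compiler is free to reorder fields.
pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
    pub curtains: u32,
}

/// `repr(C)`: fields are laid out in declaration order.
#[repr(C)]
pub struct KittyC {
    pub is_soft: bool,
    pub meows: i64,
    pub curtains: u32,
}

/// Byte offsets in declaration order for the default-representation struct.
pub fn kitty_offsets() -> [usize; 3] {
    [
        offset_of!(Kitty, is_soft),
        offset_of!(Kitty, meows),
        offset_of!(Kitty, curtains),
    ]
}

/// Byte offsets in declaration order for the `repr(C)` struct.
pub fn kitty_c_offsets() -> [usize; 3] {
    [
        offset_of!(KittyC, is_soft),
        offset_of!(KittyC, meows),
        offset_of!(KittyC, curtains),
    ]
}

fn main() {
    // The default-repr offsets are unspecified and may not be increasing.
    println!("default repr: {:?} (size {})", kitty_offsets(), size_of::<Kitty>());
    println!("repr(C):      {:?} (size {})", kitty_c_offsets(), size_of::<KittyC>());
}
```

Only the repr(C) offsets are predictable from the declaration; the default-representation offsets are whatever the compiler chose, which is why a descriptor has to report them rather than have the user compute them.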
std::mem::discriminant_at::<EnumType>(some_usize_index) is Not Real
This function call is not a real intrinsic available in Rust. It is also awkward to program. Because enumerations in Rust cannot directly speak of a given variant within itself, it is impossible to make a perfectly safe version of this for the end-user that does not work off an existing enumeration value. This is why the only way of getting a std::mem::Discriminant<EnumType> is by calling std::mem::discriminant(&enum_type_object) with an existing enum_type_object that is already one of the existing variants; it is the only way to safely get at an enumeration type’s discriminant without returning an Option.
For the desired core function to exist in Rust,
pub fn discriminant_at<EnumType>(
    declaration_variant_index: usize
) -> Option<Discriminant<EnumType>>;
it must return a std::option::Option type, because the declaration_variant_index may be outside of the range 0 to VARIANT_COUNT - 1 for an enumeration, or greater than 0 for structures and unions. There could also be an unsafe version, which would make returning and working with the data simpler, but the performance for that use case is generally covered by just returning an Option and simply performing an unwrap_unchecked() within an unsafe { … } block.
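To show what this limitation means in practice, here is a sketch on stable Rust of a hand-rolled, single-enum version of the function. Mood and mood_discriminant_at are hypothetical names for this example; the only real API used is std::mem::discriminant. Note that it has to build a throwaway value of each variant just to ask for its discriminant.

```rust
use std::mem::{discriminant, Discriminant};

pub enum Mood {
    Sleepy,
    Hungry(u8),
    Zoomies { laps: u32 },
}

/// Hand-written, `Mood`-only approximation of the fictional `discriminant_at`.
/// A sample value of the variant must be constructed first, because a
/// discriminant can only be obtained from an existing value.
pub fn mood_discriminant_at(declaration_variant_index: usize) -> Option<Discriminant<Mood>> {
    let sample = match declaration_variant_index {
        0 => Mood::Sleepy,
        1 => Mood::Hungry(0),
        2 => Mood::Zoomies { laps: 0 },
        _ => return None, // index is past the last variant
    };
    Some(discriminant(&sample))
}

fn main() {
    let current = Mood::Hungry(3);
    for index in 0..4 {
        match mood_discriminant_at(index) {
            Some(d) => println!(
                "variant {index}: same as current value = {}",
                d == discriminant(&current)
            ),
            None => println!("variant {index}: out of range"),
        }
    }
}
```

This only works because every variant of Mood is cheap to construct with dummy data; for a variant holding something that cannot be conjured up (a reference with a particular lifetime, a non-constructible type), the trick falls apart, which is part of why a compiler-provided function is wanted.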
All of these options are strictly worse than being able to guarantee with the type system that both the enumeration and the variant type are part of one another:
pub fn discriminant_for<EnumType, VariantType>() -> Discriminant<EnumType>;
This could only happen if we could write e.g. std::mem::discriminant_for::<CatHours, CatHours::ZeroOfThem>(). But, as stated many times in this article, variant types are not real or touchable outside of match statements.
Enumeration Variants are not Nameable Types
In the generated compiler code, we have to add fictitious CatHours_Variant{INDEX}_FieldsType types for a Fields inside the VariantDescriptor<INDEX> itself, rather than being able to just make each VariantDescriptor<INDEX> also contain a StructDescriptor implementation. This is because variants cannot be named as real types. This adds an extra level of API complexity that is there only because enumeration variants, even if they look and smell like other tuples or structures in the language, are completely different. For example:
struct StructMeow {
    pub a: i64,
    pub b: i32,
    pub c: f64,
}

enum EnumMeow {
    A(StructMeow),
    B { a: i64, b: i32, c: f64 },
    C((i64, i32, f64)),
    D(i64, i32, f64),
}
StructMeow does not require the same alignment, layout, field ordering, or byte offsets as EnumMeow::B. Similarly, the tuple in EnumMeow::C does not need to have the same alignment, layout, field ordering, or byte offsets as EnumMeow::D. Imagine, briefly, that EnumMeow uses a u8 to store the discriminant that tells between all four variants. The layout for EnumMeow could place the discriminant at a byte offset of 0. It can then place EnumMeow::B.b at byte offset 4, then place a and c at byte offsets 8 and 16 respectively. Similarly, D can have its .0, .1, and .2 fields rearranged in a way that does not match the contained tuple in C.
Effectively, there is a special type for the Abstract Data Type’s variant that is different from a data type that has the same physical/source code look and appearance. But, it can only be mentioned directly in match expressions and pattern matching contexts. It would be helpful to be able to clearly refer to such a type, for both std::mem::discriminant_at’s API and also to solidify the spelling of access for std::mem::offset_of!.
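As a small illustration of that restriction, the sketch below reads the i32 out of each of the four look-alike variants. The helper name second_field is hypothetical; the point is that patterns are the only place where a variant such as EnumMeow::B can be spelled.

```rust
#[allow(dead_code)]
pub struct StructMeow {
    pub a: i64,
    pub b: i32,
    pub c: f64,
}

#[allow(dead_code)]
pub enum EnumMeow {
    A(StructMeow),
    B { a: i64, b: i32, c: f64 },
    C((i64, i32, f64)),
    D(i64, i32, f64),
}

// Hypothetical helper: each variant can only be reached through a pattern.
pub fn second_field(value: &EnumMeow) -> i32 {
    match value {
        EnumMeow::A(inner) => inner.b,
        EnumMeow::B { b, .. } => *b,
        EnumMeow::C((_, b, _)) => *b,
        EnumMeow::D(_, b, _) => *b,
    }
}

// By contrast, a signature such as `fn takes_b(value: EnumMeow::B)` is
// rejected by the compiler, because `EnumMeow::B` is a variant, not a type.
```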
Attribute Introspection is Limited to #[introwospection(…)] Attributes
The attribute #[non_exhaustive] does not appear in the ATTRIBUTES listing for Kitty’s StructDescriptor implementation. This is because we cannot expose any and all kinds of attributes on a type, field, or function: it would be too invasive. Therefore, only the values provided by an #[introwospection(…)] attribute will be provided on the Descriptor::ATTRIBUTES associated const item.
This ensures backwards compatibility with existing attributes. It additionally strongly scopes what attribute-based metaprogramming can do. There may be a future where more attributes are put into the scope, but for now we want to avoid any potential issues with ascribing meaning to preexisting attributes. If we move in a direction where more attributes are introspectable, we would likely need to consider an ADDITIONAL_ATTRIBUTES variable with all of the various attributes on a given type.
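A rough sketch of how such an attribute listing could be queried at compile time in today's Rust follows. The layout of AttributeDescriptor, the inherent ATTRIBUTES constant on a unit Kitty, and both helper functions are assumptions for illustration; the real definitions in the introwospection module may look quite different.

```rust
// Assumed shape of an attribute entry; not the repository's definition.
#[derive(Debug, Clone, Copy)]
pub struct AttributeDescriptor {
    pub name: &'static str,
    pub value: Option<&'static str>,
}

// `==` on `str` is not usable in `const fn` on stable, so compare bytes.
pub const fn str_equals(a: &str, b: &str) -> bool {
    let a = a.as_bytes();
    let b = b.as_bytes();
    if a.len() != b.len() {
        return false;
    }
    let mut i = 0;
    while i < a.len() {
        if a[i] != b[i] {
            return false;
        }
        i += 1;
    }
    true
}

// Linear search over the attribute list, usable in constant evaluation.
pub const fn contains_attribute(name: &str, attributes: &[AttributeDescriptor]) -> bool {
    let mut i = 0;
    while i < attributes.len() {
        if str_equals(attributes[i].name, name) {
            return true;
        }
        i += 1;
    }
    false
}

// Stand-in for `Descriptor::ATTRIBUTES`: only the `introwospection`
// attribute is listed, `#[non_exhaustive]` is deliberately absent.
pub struct Kitty;
impl Kitty {
    pub const ATTRIBUTES: &'static [AttributeDescriptor] = &[AttributeDescriptor {
        name: "are_little_precious_babies",
        value: Some("yes!!"),
    }];
}

pub const KITTY_IS_PRECIOUS: bool =
    contains_attribute("are_little_precious_babies", Kitty::ATTRIBUTES);
```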
Fields and Variants are both Difficult to Specify Fully
For the EnumDescriptor (with type Variants) and StructDescriptor/VariantDescriptor (with type Fields), it turns out to be incredibly difficult to specify, generically, a working trait bound for the Fields/Variants. To explain, here is the full definition of the pub unsafe trait StructDescriptor that is part of the current repository, with comments:
/// A description of a `struct` type.
pub unsafe trait StructDescriptor: AdtDescriptor {
    /// The type of the `struct` that was described.
    type Type;
    /// A type that represents the fields of this `struct`. If this is
    /// `core::introwospection::NoType`, then it has no fields and no field
    /// implementations on it.
    ///
    /// NOTE
    /// TODO(thephd) Enable a succinct way to describe all of the constraints on this type:
    /// ```
    /// type Fields :
    ///     (for <const I: usize = 0..Self::FIELD_COUNT> FieldDescriptor<I>)
    /// = NoType;
    /// ```
    /// to specify the proper boundaries to make this type usable in
    /// generic contexts. (This is bikeshed syntax and subject to change,
    /// as there is already a `for <T>` trait bounds feature in Rust.)
    type Fields = NoType;
    /// The number of fields for this `struct` type.
    const FIELD_COUNT: usize = 0;
    /// What kind of syntax was used to encapsulate the fields on this `struct` type.
    const FIELD_SYNTAX: FieldSyntax = FieldSyntax::Nothing;
    /// Whether or not there are any fields which are not visible for this type.
    const NON_VISIBLE_FIELDS: bool = false;
}
Effectively, we need to create bounds that are composed of 0 to Self::FIELD_COUNT - 1 FieldDescriptors. The reason these bounds are necessary is so that, in generic code (such as with the *Visitor types in the mid-level API), we have the ability to use these Fields and Variants in those methods. Remember, Rust’s Trait system is not like C++ templates, C’s preprocessor, or Rust macros: traits require that you specify up-front all of the necessary actions that can be taken on a given input parameter. We are not allowed to “figure it out” later at usage time (C++ templates) or hack the “step before actually doing the language” so much that it spits out something the base language can understand (C preprocessor, Rust macros). For:
- type Fields on StructDescriptors and VariantDescriptor<INDEX>;
- type Variants on EnumDescriptor;
- and, type Parameters on FunctionDescriptor;
we have to be able to use some form of currently-unknown syntax and specification. We are told that the currently-incomplete feature called Generic const Expressions (GCEs), which still needs much more time and investment, will make this possible/plausible in some fashion, but going through the const repository and documents does not make it immediately obvious to us how we would program such a thing using GCEs. Nevertheless, what this does mean is that within the current trait system, it is completely and utterly inexpressible to do the kind of things we want to do for Rust’s compile-time introspection. So, even though we have finished what will be the library definitions of all of these traits and structures in the library/core/src/introwospection.rs source file for our repository, we will need lots more language improvements to reach our goals.
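To show what “specify up-front” means in practice, here is a small sketch in stable Rust. The FieldDescriptor trait below is a simplified stand-in carrying only a NAME constant, and KittyFields, field_name, and three_field_names are hypothetical names. A bound on a single index is easy to write; a bound covering every index from 0 to FIELD_COUNT - 1 can only be written out by hand for a fixed count.

```rust
// Simplified stand-in for the real descriptor trait.
pub trait FieldDescriptor<const INDEX: usize> {
    const NAME: &'static str;
}

// Hypothetical marker type playing the role of `Kitty`'s `Fields`.
pub struct KittyFields;

impl FieldDescriptor<0> for KittyFields {
    const NAME: &'static str = "is_soft";
}
impl FieldDescriptor<1> for KittyFields {
    const NAME: &'static str = "meows";
}
impl FieldDescriptor<2> for KittyFields {
    const NAME: &'static str = "destruction_of_all_curtains_trigger";
}

// One index at a time is expressible as an ordinary bound.
pub fn field_name<F, const INDEX: usize>() -> &'static str
where
    F: FieldDescriptor<INDEX>,
{
    <F as FieldDescriptor<INDEX>>::NAME
}

// "All indices" has to be spelled out manually, and only for exactly three
// fields; there is no way to write this for an arbitrary FIELD_COUNT.
pub fn three_field_names<F>() -> [&'static str; 3]
where
    F: FieldDescriptor<0> + FieldDescriptor<1> + FieldDescriptor<2>,
{
    [
        <F as FieldDescriptor<0>>::NAME,
        <F as FieldDescriptor<1>>::NAME,
        <F as FieldDescriptor<2>>::NAME,
    ]
}
```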
Variadics Do Not Exist in Rust
This is one of the biggest issues and causes some of the worst problems when trying to create an API of this caliber. While Rust has a built-in tuple type (far better than its counterpart in many other languages), tuples are not compile-time or generically programmable, as explained in the above section on tuple syntaxes. Access is hard-coded, and while smart folks can make partial solutions (thanks, Alyssa Haroldsen), they are not flexible enough to provide the ergonomics necessary for end users. This creates an ongoing tension: if variadics were a real feature, where there was a language construct representing “0 or more” types or “0 or more values” (both logically consistent with tuples) with a way of accessing those types or values, we would be able to program over the fields of a struct or the variants (and their fields) of an enum. But we do not have any concept of “0 or more” of something in Rust, and therefore it is patently impossible to program over what is effectively a list of “0 or more” fields, variants, types, etc. that come with the territory of performing compile-time introspection.
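The usual workaround today is to implement a trait once per tuple arity, typically through a macro. The sketch below is one such pattern under assumed names (FieldSizes and impl_field_sizes are made up for this example); it is the kind of implementation spam the lack of variadics forces, and it stops working at whatever arity the author stopped listing.

```rust
// Hypothetical trait: report the size of every element type of a tuple.
pub trait FieldSizes {
    fn sizes() -> Vec<usize>;
}

// One implementation per arity, generated by listing type parameters by hand.
macro_rules! impl_field_sizes {
    ($($name:ident),*) => {
        impl<$($name),*> FieldSizes for ($($name,)*) {
            fn sizes() -> Vec<usize> {
                vec![$(std::mem::size_of::<$name>()),*]
            }
        }
    };
}

impl_field_sizes!();
impl_field_sizes!(A);
impl_field_sizes!(A, B);
impl_field_sizes!(A, B, C);
// A four-element tuple has no implementation until another line is added.
```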
This is not just a compile-time introspection problem, either: the Rust SIMD Working Group faced similar problems of “how do I work with 0 or more of the same type” when trying to create typical vector types that matched hardware SIMD/AVX/RISC-V/PowerPC instruction sets. They had to eventually compromise on some of their original design for this, as well as extract a few concessions from the Rust core language in order to finally achieve the current working product of std::simd (but with notable restrictions, still).
This is one of the primary drivers of the introwospect_over keyword that is likely to be introduced with the compiler work associated with these changes. Since we both cannot express the bounds on e.g. type Fields;, and we cannot write an algorithm which works over what is effectively a heterogeneous collection without falling down to writing fairly involved tuple-trait-implementation spam, we instead introduced this keyword. It has a 3-piece syntax. Taking from the example of automatic enumeration and structure serialization from a potentially-future serde, and modifying it for simplicity, we can see introwospect_over’s intended behavior:
use std::introwospection::*;
use serde::{Serializer, Serialize};
use serde::ser::{Error, SerializeStruct};

pub struct GeneralStructSerializer<S, T>
    where S: Serializer, T: Serialize + ?Sized
{
    // value being serialized
    value: &T,
    normal_struct_state: Option<&mut S::SerializeStruct>,
    maybe_field_error_index: Option<usize>
}

// Serialization routine for a `struct` type.
impl<S: Serializer, T: Serialize + ?Sized> StructDescriptorVisitor
    for GeneralStructSerializer<S, T>
{
    type Output = Result<S::Ok, S::Error>;

    fn visit_struct_mut<Descriptor: 'static>(&mut self) -> Self::Output
        where Descriptor: StructDescriptor
    {
        // general structure serialization
        let mut state = serializer.serialize_struct(
            Descriptor::NAME, Descriptor::FIELD_COUNT
        )?;
        self.normal_struct_state = Some(&mut state);
        let results = [
            // !! USED HERE !!
            introwospect_over(Descriptor::Type, Descriptor::Fields, self)
        ];
        self.normal_struct_state = None;
        if let Some(error_index) = self.maybe_field_error_index {
            return results[error_index];
        }
        return state.end();
    }
}

// Serialization routine for the fields of a `struct`.
impl<S: Serializer, T: Serialize + ?Sized> FieldDescriptorVisitor
    for GeneralStructSerializer<S, T>
{
    type Output = Result<S::Ok, S::Error>;

    fn visit_field_mut<Descriptor: 'static, const INDEX: usize>(
        &mut self
    ) -> Self::Output
        where Descriptor: FieldDescriptor<INDEX>
    {
        if self.maybe_field_error_index.is_some() {
            return Err(S::Error::custom(
                "no use: previous field serialization already failed"
            ));
        }
        let mut state = self.normal_struct_state.unwrap();
        // normal structure serializing:
        // just serialize the field!
        let result = state.serialize_field(
            Descriptor::NAME,
            get_field::<Descriptor, INDEX>(self.value)
        );
        if result.is_err() {
            self.maybe_field_error_index = Some(INDEX);
        }
        return result;
    }
}
What we are doing here is using introwospect_over to cheat our way into achieving what variadic and const generics could do for us instead. introwospect_over’s entire point is to take something which identifies the owning type (Descriptor::Type), tell us what we want to iterate over (in this case the fields of a structure, so Descriptor::Fields), and finally the object (here self, the visitor) that the calls are made on. Because we asked for the Fields, we will get a write-out of a comma-delimited list of function calls to the visit_…_mut call that every …DescriptorVisitor-implementing type has on it. For example, using the Kitty type defined earlier in this post:
#[introwospect(are_little_precious_babies = "yes!!")]
#[non_exhaustive]
pub struct Kitty {
    pub is_soft: bool,
    pub meows: i64,
    destruction_of_all_curtains_trigger: u32
}
calling introwospect_over on it in the above context would produce a comma-delimited list of visit_…_mut calls that take this form:
// … code from above
let results = [
    // !! USED HERE !!
    self.visit_field_mut::<<Kitty as StructDescriptor>::Fields, 0>(),
    self.visit_field_mut::<<Kitty as StructDescriptor>::Fields, 1>(),
    self.visit_field_mut::<<Kitty as StructDescriptor>::Fields, 2>()
];
// … code from above
The INDEX values passed into the visit_…_mut calls go from 0 to FIELD_COUNT - 1. This allows us to get around the lack of variadic capabilities by just having the compiler expand the list of types out for us and do the function calls we want. introwospect_over also behaves much the same for enums with their variants, just performing visitor.visit_variant_mut::<<Type as EnumDescriptor>::Variants, 0>() from 0 up to VARIANT_COUNT - 1. If there are no fields, then nothing is produced in that spot. This is similar for functions, which produce visitor.visit_parameter_mut::<<Type as FunctionDescriptor>::Parameters, 0>(), all the way up to PARAMETER_COUNT - 1.
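The expansion can be imitated in current Rust with a declarative macro, as long as the indices are written by hand. The sketch below is only that imitation: FieldVisitor is a cut-down stand-in with no descriptor parameter, and visit_fields, IndexCollector, and visit_kitty_fields are hypothetical names. The keyword's job would be to produce the index list from FIELD_COUNT instead of from the programmer.

```rust
// Cut-down stand-in for a `…DescriptorVisitor` trait.
pub trait FieldVisitor {
    type Output;
    fn visit_field_mut<const INDEX: usize>(&mut self) -> Self::Output;
}

// Imitates the comma-delimited write-out; the caller must list the indices.
macro_rules! visit_fields {
    ($visitor:expr; $($index:expr),* $(,)?) => {
        [$($visitor.visit_field_mut::<{ $index }>()),*]
    };
}

// A visitor that only records which indices it was called with.
pub struct IndexCollector {
    pub seen: Vec<usize>,
}

impl FieldVisitor for IndexCollector {
    type Output = usize;

    fn visit_field_mut<const INDEX: usize>(&mut self) -> usize {
        self.seen.push(INDEX);
        INDEX
    }
}

// Visits three fields, as the expansion for `Kitty` would.
pub fn visit_kitty_fields() -> ([usize; 3], Vec<usize>) {
    let mut collector = IndexCollector { seen: Vec::new() };
    let results = visit_fields!(collector; 0, 1, 2);
    (results, collector.seen)
}
```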
Note that the expansion we are performing here is a spiritual equivalence, not an exact or perfect semantic equivalence. type Fields, type Variants, and type Parameters do not have the appropriate trait bounds on them. This regular-looking Rust code is inexpressible in the literal “this is real compiled source code” sense. But, with introwospect_over, we can commit any action we want to and it is perfectly legal because we are the compiler. As long as both the bounds are inexpressible and variadic programming is inexpressible in Rust, we will always need a keyword to basically cover the lack of programmatic access to various aspects of Rust.
introwospect(Type, visitor) works in a similar way. Based on what Type is, introwospect takes the following actions:
// `union`?
visitor.visit_union_mut::<<Type as UnionDescriptor>>()
// `struct`?
visitor.visit_struct_mut::<<Type as StructDescriptor>>()
// `enum`?
visitor.visit_enum_mut::<<Type as EnumDescriptor>>()
// `fn`/function?
visitor.visit_function_mut::<<Type as FunctionDescriptor>>()
One of the reasons this also needs to be a keyword is because there is no way to express “do this for a union, do that for an enum, do another thing for a function, and do this for a struct” in Rust generic code. So, switching based on what kind of entity we are visiting is paramount to ensuring code compiles. If the core language ever adopts a way to do this in an elegant, Rusty way, we would absolutely welcome it.
The long list of issues above, even with a hacked-in language feature, is not confidence-inspiring in how simple this task will be. When considering reflection (at compile-time) for C++, Reflection Study Group 7 (SG7), despite having not yet finished a full reflection specification suitable for ISO standardization, never really had the problem of “what we are doing is totally inexpressible in the language”. Templates were so fully featured that one could do reflection in them given the right language primitives since C++03 (2003). The interface would be absolutely horrendous, but the language was effectively ready for compile-time reflection in one weirdly-shaped way or another and has been for 20 years. That C++ does not have it is a result of competing designs and the emergence of constexpr, pushing the boundaries for what is possible in compile-time contexts in C++. And rightly so: template metaprogramming one’s way to compile-time reflection is grody and disgusting, and this comes from a group (us) who have done far too much of it to great success.
With Rust, the task is far more intimidating. Both the language and the library are wholly incapable of expressing the necessary concepts to do work on these things with normal Rust. The Trait system in Rust is not powerful enough to express necessary constraints in the slightest.
For example, one can recursively express a Fields type in C++ by creating a template, template <typename Type, size_t INDEX> struct FieldDescriptor;, that takes the typename Type and a size_t INDEX. One can perform recursion with C++ templates, creating what is effectively a stack of types describing each field by doing FieldDescriptor<Type, INDEX - 1>. The stop condition for that template is writing a specialization (something deeply frowned upon in Rust because of soundness issues) and stopping for FieldDescriptor<Type, 0> to prevent blowing out the compiler’s memory through infinite recursion.
With Rust traits, not only is INDEX - 1 not exactly a kosher operation in const generics, but there is no way to write a stop condition for FieldDescriptor<0> to tell it to stop its recursion, meaning it would eventually blow out the compiler’s memory stack or just be flagged as a straight up illegal operation. This means it is impossible to do recursive compile-time programming with type information or integer constant expressions in an even remotely normal way in Rust.
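For contrast, the one form of compile-time recursion that stable Rust does accept is recursion over the structure of a type, where the stop condition is a separate type rather than a specialization on an integer. The sketch below uses a hand-built type-level list (Nil, Cons, FieldList, and KittyFieldTypes are all illustrative names, not part of the introwospection work). It is arguably not the “remotely normal” integer-indexed recursion described above, and it still cannot index into the list by a const INDEX.

```rust
// A type-level list: `Nil` is the stop condition, `Cons` adds one element.
pub struct Nil;
pub struct Cons<Head, Tail>(pub Head, pub Tail);

// Recursion happens through the trait implementations, not through INDEX - 1.
pub trait FieldList {
    const COUNT: usize;
}

impl FieldList for Nil {
    const COUNT: usize = 0;
}

impl<Head, Tail: FieldList> FieldList for Cons<Head, Tail> {
    const COUNT: usize = 1 + Tail::COUNT;
}

// The field types of `Kitty`, spelled out by hand as a nested list.
pub type KittyFieldTypes = Cons<bool, Cons<i64, Cons<u32, Nil>>>;

pub const KITTY_FIELD_COUNT: usize = <KittyFieldTypes as FieldList>::COUNT;
```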
This creates an enormous number of problems for the Trait-based world of generics that Rust has built up for itself. Because we cannot properly constrain a type to have exactly all the field descriptors necessary, generic code cannot write predictable, compile-time computation over an actually generic set of things. Rust even lacks the ability to select for a specific subset of types that match a specific trait, because it will error if you use more than one bounding trait in the collection of implementations for any given trait.
Things that are trivially expressible in C++, Zig, and other languages become infinitely impossible in Rust’s traits and generics.
This, unfortunately, is a strong driver of how a lot of the mid-level API can and must work in Rust. The keywords introwospect, introwospect_type, and introwospect_over are all driven by deeply-seated inadequacies in Rust’s core syntaxes and language capabilities, each one a fundamental acknowledgement of a sincere issue that must be worked around in order to achieve the end goal of proper compile-time reflection. And even when we take shortcuts with things like introwospect_type and introwospect_over, doing so introduces its own severe issues with the ergonomics of the programming models offered to end users.
The Problem of Visitor-based Programming
The reason we describe the introwospect(...) and introwospect_over(...) APIs as “mid level” is because:
- there is a better way to program for these constructs, and doing so with unconditionally linear calls using 0 or more fields results in needing to remember state from previous function calls to early-exit from other function calls;
- and, it really does fit the slang term “mid”, as in subpar quality.
For example, consider the generic structure serialization we made above that was derived from the full serde implementation prior. In particular, consider this bit of code at the top of the visit_field_mut function call:
// … rest of the code here!!
if self.maybe_field_error_index.is_some() {
    return Err(S::Error::custom(
        "no use: previous field serialization already failed"
    ));
}
// … rest of the code here!!
Here, we have to book-keep that an error has occurred on a prior field and inject that state into this FieldDescriptorVisitor implementation. This is because introwospect_over lays out a comma-delimited list of calls, and ultimately every visit_field_mut function for that visitor must return the same type since we put the results into an array:
// … rest of the code here!!
let results = [
    // !! USED HERE !!
    introwospect_over(Descriptor::Type, Descriptor::Fields, self)
];
// … rest of the code here!!
This is the problem with function-based and closure-based code in Rust: by having completely different functions and, effectively, totally unrelated scopes, objects themselves have to become responsible for book-keeping around function calls, including early cancellation (e.g., stopping if the first field fails to serialize properly).
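The book-keeping burden can be reproduced without any of the proposed features. In the sketch below, BookkeepingVisitor and visit_all are hypothetical names and the “fields” are just pre-computed Result values; the array of calls mirrors the expansion described above, so every call runs even after a failure and the visitor itself has to remember that one happened.

```rust
// Hypothetical visitor: must carry the first failure across separate calls.
pub struct BookkeepingVisitor<'a> {
    pub fields: &'a [Result<i64, String>; 3],
    pub first_error_index: Option<usize>,
    pub visited: usize,
}

impl<'a> BookkeepingVisitor<'a> {
    pub fn visit_field<const INDEX: usize>(&mut self) -> Result<i64, String> {
        // Early cancellation has to be simulated with stored state.
        if self.first_error_index.is_some() {
            return Err("no use: previous field already failed".to_string());
        }
        self.visited += 1;
        let result = self.fields[INDEX].clone();
        if result.is_err() {
            self.first_error_index = Some(INDEX);
        }
        result
    }
}

// All three calls are made unconditionally, just like the expanded list.
pub fn visit_all(fields: &[Result<i64, String>; 3]) -> (Option<usize>, usize) {
    let mut visitor = BookkeepingVisitor {
        fields,
        first_error_index: None,
        visited: 0,
    };
    let _results = [
        visitor.visit_field::<0>(),
        visitor.visit_field::<1>(),
        visitor.visit_field::<2>(),
    ];
    (visitor.first_error_index, visitor.visited)
}
```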
Can We Do Better?
The better way to fix this is with a programmatic construct that allows you to access each element at compile-time in a generic fashion, such as with a compile-time for loop. For example, a (made-up, not-at-all-real) for const could solve this problem:
// … below is rewritten code to use fanciful, not-real `for const` flow control

// Serialization routine for a `struct` type.
impl<S: Serializer, T: Serialize + ?Sized> StructDescriptorVisitor
    for GeneralStructSerializer<S, T>
{
    type Output = Result<S::Ok, S::Error>;

    fn visit_struct_mut<Descriptor: 'static>(&mut self) -> Self::Output
        where Descriptor: StructDescriptor
    {
        // general structure serialization
        let mut state = serializer.serialize_struct(
            Descriptor::NAME, Descriptor::FIELD_COUNT
        )?;
        for const INDEX in 0..Descriptor::FIELD_COUNT {
            // normal structure serializing:
            // just serialize the field!
            state.serialize_field(
                Descriptor::NAME,
                get_field::<Descriptor, INDEX>(self.value)
            )?;
            // no "result" storage
            // no "normal_struct_state" storage
            // none of that nonsense!!!
        }
        return state.end();
    }
}
Here, we are saying “this for loop runs at compile time”. It is equivalent to effectively performing a compile-time unrolling of the loop, with each iteration of the loop effectively its own scoped body within the function. The ? usage in here just bails if the serialization fails, without us having to save any intermediate state so that the next function call of a visit_field_mut has to handle it. This is far better than the *DescriptorVisitor-based programming. This is what we would consider a proper “high level” API, one that is high both in terms of abstraction level/ease of use and in terms of quality of the usage experience.
It allows us to stay in our scope, but grasp compile-time values much better. As stated previously, this syntax is not real. It would make everything much simpler, however, and thus might be worth investigating in the long term for doing better const programming in Rust. In general, providing more constructs which allow this seamless transition between associated const items, types, and behavior would enable not just compile-time programming use cases, but make code for the SIMD project and several other use cases much more elegant, readable, and tractable.
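The “unrolled loop with its own scoped body” idea can be approximated today with a declarative macro, provided the indices are written out by hand. This is a rough sketch, not the for const proposal: the macro name unroll and the helpers field_value and sum_fields are invented for the example. What it does show is that, because every iteration lives in the same function, ? exits early without any stored state.

```rust
// Repeats `$body` once per listed value, binding it as a `const` each time.
macro_rules! unroll {
    ($index:ident in [$($value:expr),*] $body:block) => {
        $( { const $index: usize = $value; $body } )*
    };
}

// Stand-in for reading one field by compile-time index.
fn field_value<const INDEX: usize>(
    fields: &[Result<i64, String>; 3],
) -> Result<i64, String> {
    fields[INDEX].clone()
}

// Sums all fields; the first failing field ends the function immediately.
pub fn sum_fields(fields: &[Result<i64, String>; 3]) -> Result<i64, String> {
    let mut sum = 0;
    unroll!(INDEX in [0, 1, 2] {
        sum += field_value::<INDEX>(fields)?;
    });
    Ok(sum)
}
```

A real for const would take the range from an associated constant such as FIELD_COUNT, which is exactly what the macro cannot do.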
There is a lot more that we could talk about insofar as Rust’s strengths and weaknesses for its generic programming. We have not even gotten to talk about the sneaky way one can introduce post-monomorphization errors to stop someone from serializing a struct with fields that are non-visible to someone outside of your crate. From the full default-serialization serde example:
// … ELIDED CODE ABOVE

// Private trait to trigger assertion at post-monomorphization time.
trait PostMonomorphizationValidityCheck {
    const TRIGGER: ();
}

/// This function takes a list of attributes, and the boolean about whether or not this
/// type has private fields, and tells whether or not we can serialize this using
/// the default serializer at compile-time.
const fn is_default_serializable(
    has_non_visible_fields: bool,
    attributes: &[AttributeDescriptor],
) -> bool {
    if !has_non_visible_fields {
        return true;
    }
    std::introwospection::contains_attribute("allow_private", attributes)
}

// Serialization routine for a `struct` type.
impl<S: Serializer, T: Serialize + ?Sized> StructDescriptorVisitor
    for DefaultStructSerializeVisitor<S, T>
{
    type Output = Result<S::Ok, S::Error>;

    fn visit_struct_mut<Descriptor: 'static>(&mut self) -> Self::Output
        where Descriptor: StructDescriptor
    {
        // Implementing an associated constant that matches the trait requirements
        // allows us to bypass the original trait checks, but defer
        // the actual compile-time trigger to post-monomorphization time, much
        // like a C++ template second-stage usage check.
        struct C<CheckedDescriptor: StructDescriptor>(
            core::marker::PhantomData<CheckedDescriptor>
        );
        impl<CheckedDescriptor: StructDescriptor> PostMonomorphizationValidityCheck
            for C<CheckedDescriptor>
        {
            const TRIGGER: () = assert!(
                is_default_serializable(
                    CheckedDescriptor::NON_VISIBLE_FIELDS,
                    CheckedDescriptor::ATTRIBUTES
                ),
                concat!(
                    "We cannot serialize a structure with ",
                    "non-visible private fields and no ",
                    "`#[introwospection(allow_private)]` attribute."
                )
            );
        }
        // Trigger the check upon post-monomorphization of this function.
        let _no_inaccessible_fields: () =
            <C<Descriptor> as PostMonomorphizationValidityCheck>::TRIGGER;
        // … MORE ELIDED CODE BELOW IN FUNCTION
    }
}
// … MORE ELIDED CODE BELOW
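The same trick can be tried in stable Rust without any of the proposed modules. In the sketch below, the Describe trait and its two boolean constants are simplified assumptions standing in for StructDescriptor and the attribute lookup, and Check, default_serialize, PublicOnly, and OptedIn are hypothetical names. The assertion is only evaluated when default_serialize is instantiated for a concrete type, which is what makes it a post-monomorphization error.

```rust
use std::marker::PhantomData;

// Simplified stand-in for the descriptor information used by the check.
pub trait Describe {
    const NON_VISIBLE_FIELDS: bool;
    const ALLOW_PRIVATE: bool;
}

// Carrier type for the deferred compile-time assertion.
#[allow(dead_code)]
struct Check<D: Describe>(PhantomData<D>);

impl<D: Describe> Check<D> {
    const TRIGGER: () = assert!(
        !D::NON_VISIBLE_FIELDS || D::ALLOW_PRIVATE,
        "cannot serialize a structure with non-visible fields and no opt-in"
    );
}

pub fn default_serialize<D: Describe>() -> &'static str {
    // Referencing the constant forces its evaluation once this function is
    // monomorphized for a concrete `D`.
    let () = Check::<D>::TRIGGER;
    "serialized"
}

pub struct PublicOnly;
impl Describe for PublicOnly {
    const NON_VISIBLE_FIELDS: bool = false;
    const ALLOW_PRIVATE: bool = false;
}

pub struct OptedIn;
impl Describe for OptedIn {
    const NON_VISIBLE_FIELDS: bool = true;
    const ALLOW_PRIVATE: bool = true;
}

// A type with NON_VISIBLE_FIELDS = true and ALLOW_PRIVATE = false still
// implements `Describe` without complaint; the build only fails where
// `default_serialize` is actually instantiated with it.
```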
There are genuinely cool things that can be done in Rust with generics, and genuinely awesome things that a Trait-based system like Rust allows, but at the moment compile-time reflection is an incredibly tall order.
There is a lot that can and should be discussed. For the moment, what Shepherd’s Oasis is going to focus on are the simple library things mentioned above such as
pub fn std::mem::discriminant_at<EnumType>(
    index: usize
) -> std::option::Option<std::mem::Discriminant<EnumType>>;
and a few other low-level utilities that will be in service of the code we want to work on. Implementing the keywords and the rest of the functionality in time for the end of this Grant period (another 2 months and 1 week) seems excessively beyond our capabilities as first-time rustc contributors. Shepherd’s Oasis will also look into aiding Waffle and DrMeepster in pushing and stabilizing std::mem::offset_of! for enumerations, unions, and more to make sure the syntax is solid.
As part of that discussion, one of the things we want to address up-front is the notion of increased breakable API surface area. As a consequence of providing more information at compile-time, there exists a chance that individuals may:
- rely on compile-time computations that feed into other types (but not the same type, as that would be errored by the cycle checker);
- rely on the ordering of fields for internal and external code;
- and, rely on the number of fields available or accessible from an external crate.
This is a very real concern, and is especially powerful when dealing with external crates from e.g. crates.io. While crate-internal and private data types and functions can always be handled with grace, changes to external crates could have strong ripple effects in day-to-day code. If we were working in the C or C++ ecosystem, this is something we would have to consider very seriously as part of a proposal to add such features to the language. Thankfully, for Rust, this is entirely a non-issue. Unlike C and C++, which still rely on (in all good reality, honesty, and fairness) grotesque Perl programs, Python scripts, and willful amnesia-inducing autoconf scripts mixed in with makefiles, Rust has a robust package management system built into its very core. Cargo.toml has strict semantic versioning and feature specifications, and is paired with a language that was developed in an age where source control and repository control is both ubiquitous and personally catered to by crates.io, with an enormously responsive community.
Sage Griffin did not previously co-lead an entire team of the most fearless Ferris-lovers at crates.io to blaze a trail of glory for the entire Rust ecosystem just to have folks suddenly be afraid of change.
If a crate changes a structure’s public fields, or changes function parameters in a major semantic versioning upgrade, or changes the type of a particular enumeration’s variant, the APIs presented here give you all the power to properly warn/error at compile-time about such changes. And, if you are too scared to make such a change, nothing stops the user from modifying their Cargo.toml to roll back to an older version if an upgrade is not feasible. We are past the stage where we have to hear horror stories about someone who is building against 25 year old object files because they blew up the only copy of the source code back in the day, and therefore have to maintain exact compatibility with the world’s oldest and worst bundle of static libraries. We have the infrastructure and capability to preserve compatibility locally while pushing for better improvements globally. Most of these tools, from cargo to crates.io, were provided at great burnout and personal cost to elder Rustaceans, battle-hardened software developers, weary SREs, and dedicated computer scientists.
We strongly believe individuals afraid of such scenarios should use the tools made available to them for their own personal advantage and comfort.
We also believe that we have achieved the right shape for the API we want. It is type-based, and all the values exist at compile-time through the use of associated const items. We may also chime in with the team working on const generic expressions and general-purpose Rust const-eval to provide some feedback from time to time as we work on things. We hope that what we wrote here can be useful inspiration for them, and help guide the language to provide APIs and language features that make compile-time programming not just on par with other languages in their space, but far better than they could ever hope to be.
We would like to thank the following people, who put up with a lot of our inquiries and, in some cases, got us into this trouble in the first place (this is a #[non_exhaustive] list).
- Manish Goregaokar, for getting us involved in this titanic endeavor in the first place and encouraging us to try for better generic programming and serialization in Rust (Website, GitHub).
- Miguel Young de la Sota, for also getting us involved in this titanic endeavor in the first place and sowing the seeds of chaotic improvement (Website, GitHub, Art).
- Waffle, for helping us get off the ground with building rustc and fixing strange errors. Also, for helping DrMeepster merge std::mem::offset_of! even as we were writing this article (Website, GitHub).
- boxy, for being an expert on const generics and putting up with a million questions from us, many of which were very, very basic; she also aided in getting us off the ground for rustc builds (GitHub).
- compiler-errors, for also putting up with a lot of our worst and most boring questions and helping us stop erroring the rustc build (GitHub).
- Callie, for providing a ton of ideas for the implementation and shape of the Descriptor types, working through how Discriminant<T> may be used, helping write a more elegant str_equals in const fn Rust (that still compiles fine!) and so much more.
- Jubilee, for hearing out random ideas, spell checking a few of our initial pull requests and docs, and pinging us in Zulip (GitHub).
- Nilstrieb, for fielding questions and helping us work through many of the implications of the Trait system and const fn/const generics (GitHub).
- Alyssa Haroldsen, who put a monumental amount of effort into explaining post-monomorphization error techniques, special generic handling, tuple map implementations (and their various compile-time pitfalls), and more (Website, GitHub, Twitter, Fediverse).
- oli-obk, for guidance and encouragement with kicking off with the Rust compiler (GitHub).
Congratulations on making it to the end of this very long report. We hope reading this was as helpful as it was for those of us who wrote it.