Is coding in Rust as bad as in C++?

A practical comparison of build and test speed between C++ and Rust.

Written by strager

2023-01-05

C++ is infamous for its slow build times. “My code’s
compiling” is a meme in the programming world, and C++ keeps this
joke alive.

“Compiling” by Randall Munroe, edited, CC BY-NC 2.5

Projects like Google Chromium take
an hour to build
on brand new hardware and
6 hours to build
on older hardware. There are tons of
documented tweaks
to make builds faster, and
error-prone shortcuts
to compile less stuff. Even with thousands of dollars of cloud
computing power, Chromium build times are still on the order of half
a dozen minutes. That is completely unacceptable to me. How can people
work like this every day?

I’ve heard the same thing about Rust: build times are a huge problem.
But is it really a problem in Rust, or is this anti-Rust
propaganda? How does it compare to C++’s build time problem?

I deeply care about build speed and runtime performance. Fast build-test
cycles make me a productive, happy programmer, and I bend over backwards
to make my software fast so my customers are happy too. So I decided to
see for myself whether Rust build times were as bad as they claim. Here
is the plan:

  1. Find an open source C++ project.
  2. Isolate part of the project into its own mini project.
  3. Rewrite the C++ code line-by-line into Rust.
  4. Optimize the build for both the C++ project and the Rust project.
  5. Compare compile+test times between the two projects.

My hypotheses (educated guesses, not conclusions):

  1. The Rust port will have slightly fewer lines of code than the C++
    version.

    Most functions and methods need to be declared twice in C++ (once in
    the header, and once in the implementation file). This isn't needed
    in Rust, reducing the line count.

  2. For full builds, C++ will take longer to compile than Rust (i.e.
    Rust wins).

    This is because of C++’s #include feature and C++
    templates, which need to be compiled once per .cpp file. This
    compilation happens in parallel, but parallelism is imperfect.

  3. For incremental builds, Rust will take longer to compile than
    C++ (i.e. C++ wins).

    This is because Rust compiles one crate at a time, rather than
    one file at a time like C++, so Rust has to look at more code
    after each small change.

[charts: headline results from the article’s benchmarks — Linux: linkers perform about the same; Rust toolchains: custom toolchain with PGO+BOLT is fastest; Linux: custom Clang is fastest C++ toolchain; macOS: Xcode is fastest C++ toolchain; Linux: Rust sometimes builds faster than C++; macOS: C++ usually builds faster than Rust; C++ full and incremental builds scale better than Rust; macOS: linkers perform about the same; Rust backend: LLVM (default) beats Cranelift; rustc flags: quick build beats debug build; Rust full builds: workspace layout is fastest; Rust incremental builds: best layout is unclear; disabling libc features makes no difference; Linux: cargo-nextest slows down testing; macOS: cargo-nextest speeds up build+test]

What do you think? I polled my audience to get their opinion:

my poll on Twitter

42% of people think that C++ will win the race.
35% of people agree with me that “it depends™”.
And 17% of people think Rust will prove us all wrong.

Check out the
optimizing Rust build times section
if you just want to make your Rust project build faster.

Check out the
C++ vs Rust build times section if
you just want the C++ vs Rust comparisons.

Let’s get started!

Making the C++ and Rust test subjects

Finding a project

If I’m going to spend a month rewriting code, what code should I port? I
chose several criteria:

  • Few or no third-party dependencies. (Standard library is okay.)
  • Works on Linux and macOS. (I don’t care much about build times on
    Windows.)
  • Extensive test suite. (Without one, I wouldn’t know if my Rust code
    was correct.)
  • A little bit of everything: FFI; pointers; standard and custom
    containers; utility classes and functions; I/O; concurrency; generics;
    macros; SIMD; inheritance

The choice is easy: port the project I’ve been working on for the past
couple of years! I’ll port the JavaScript lexer in the
quick-lint-js project.

Dusty, the quick-lint-js mascot

Trimming the C++ code

The C++ portion of quick-lint-js contains over 100k
SLOC. I’m not going to port that much code to Rust; that would
take me half a year! Let’s instead focus on just the JavaScript lexer.
This pulls in other parts of the project:

  • Diagnostic system
  • Translation system (used for diagnostics)
  • Various memory allocators and containers (e.g. bump allocator;
    SIMD-friendly string)
  • Various utility functions (e.g. UTF-8 decoder; SIMD intrinsic
    wrappers)
  • Test helper code (e.g. custom assertion macros)
  • C API

Unfortunately, this subset doesn’t include any concurrency or I/O. That
means I can’t test the compile-time overhead of Rust’s
async/await. But that’s a small part of
quick-lint-js, so I’m not too concerned.

I started the project by copying all the C++ code, then deleting code I
knew was not relevant to the lexer, such as the parser and LSP server. I
actually ended up deleting too much code and had to add some back. I
kept trimming and trimming until I couldn’t trim any more. Throughout the
process, I kept the C++ tests passing.

After stripping the quick-lint-js code down to the lexer (and everything
the lexer needs), I ended up with about 17k SLOC of C++:

C++ project size

                   C++ SLOC
src                9.3k
test               7.3k
subtotal           16.6k
dep: Google Test   69.7k

The rewrite

How am I going to rewrite thousands of lines of messy C++ code? One file
at a time. Here’s the process:

  1. Find a good module to convert.
  2. Copy-paste the code and tests, search-replace to fix some syntax, then
    keep running cargo test until the build and tests pass.
  3. If it turns out I needed another module first, go to step 2 for that
    module, then come back to this one.
  4. If I’m not done converting everything, go to step 1.

There’s one major difference between the Rust and C++ projects which
might affect build times. In C++, the diagnostics system is implemented
with lots of code generation, macros, and constexpr. In
the Rust port, I use code generation, proc macros, normal macros, and a
dash of const. I’ve heard claims that proc macros are
slow, and other claims that proc macros are only slow because they’re
usually poorly written. I hope I did a good job with my proc macros.

The Rust project turns out to be slightly larger than the C++ project:
17.1k SLOC of Rust compared to 16.6k SLOC of C++:

Project sizes

                     C++ SLOC   Rust SLOC   C++ vs Rust SLOC
src                  9.3k       9.5k        +0.2k (+1.6%)
test                 7.3k       7.6k        +0.3k (+4.3%)
subtotal             16.6k      17.1k       +0.4k (+2.7%)
dep: Google Test     69.7k
dep: autocfg                    0.6k
dep: lazy_static                0.4k
dep: libc                       88.6k
dep: memoffset                  0.6k

Optimizing the Rust build

I care a lot about build times. Therefore, I had already optimized build
times for the C++ project (before trimming it down). I need to put a
similar amount of effort into optimizing build times for the Rust
project.

Let’s try these things which might improve Rust build times:

Faster linker

My first step is to profile the build. Let’s first profile using the
-Zself-profile rustc flag. In my project, this flag outputs two different files. In one of the
files, the run_linker phase stands out:

-Zself-profile results (round 1)

Item                           Self time   % of total time
run_linker                     129.20ms    60.326
LLVM_module_codegen_emit_obj   23.58ms     11.009
LLVM_passes                    13.63ms     6.365

In the past, I successfully improved C++ build times by switching to the
Mold linker. Let’s try it
with my Rust project:


Shame; the improvement, if any, is barely noticeable.
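For reference, here is a minimal sketch of how Mold can be swapped in for a Rust build via Cargo config. The clang-driver approach and flag are common practice, not config taken from the article:

```toml
# .cargo/config.toml — use clang as the linker driver and ask it for mold.
# Requires mold to be installed; applies to rustc builds on Linux.
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
```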

That was Linux. macOS also has alternatives to the default linker: lld
and zld. Let’s try those:


On macOS, I also see little to no improvement by switching away from the
default linker. I suspect that the default linkers on Linux and macOS
are doing a good enough job with my small project. The optimized linkers
(Mold, lld, zld) shine for big projects.

Cranelift backend

Let’s look at the
-Zself-profile
profiles again. In another file, the
LLVM_module_codegen_emit_obj and
LLVM_passes phases stood out:

-Zself-profile results (round 2)

Item                           Self time   % of total time
LLVM_module_codegen_emit_obj   171.83ms    24.274
typeck                         57.50ms     8.123
eval_to_allocation_raw         54.56ms     7.708
LLVM_passes                    50.03ms     7.068
codegen_module                 40.58ms     5.733
mir_borrowck                   36.94ms     5.218
I heard talk of alternative rustc backends to LLVM, namely Cranelift.
If I build with the
rustc Cranelift backend, -Zself-profile looks promising:

-Zself-profile results (round 2, with Cranelift)

Item                     Self time   % of total time
define function          69.21ms     12.307
typeck                   57.94ms     10.303
eval_to_allocation_raw   55.77ms     9.917
mir_borrowck             37.44ms     6.657

Unfortunately, actual build times are worse with Cranelift than with
LLVM:


Compiler and linker flags

Compilers have a bunch of knobs to speed up builds (or slow them down).
Let’s try a bunch:

  • -Zshare-generics=y (rustc) (Nightly only)
  • -Clink-args=-Wl,-s (rustc)
  • debug = false (Cargo)
  • debug-assertions = false (Cargo)
  • incremental = true and incremental = false (Cargo)
  • overflow-checks = false (Cargo)
  • panic="abort" (Cargo)
  • lib.doctest = false (Cargo)
  • lib.test = false (Cargo)

Note: quick, -Zshare-generics is the same as
quick, incremental=true but with the
-Zshare-generics=y flag enabled. Other bars exclude
-Zshare-generics=y because that flag is not stable (and thus
requires the nightly Rust compiler).

Most of these knobs are documented elsewhere, but I haven’t seen anyone
mention linking with -s. -s strips debug information,
including debug information from the statically-linked Rust standard library.
This means the linker needs to do less work, reducing link times.
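As a sketch, the “quick” settings above map onto Cargo and rustc configuration roughly like this. The exact benchmark configs aren't in the article, so treat the values as illustrative:

```toml
# Cargo.toml — a "quick" development profile assembled from the knobs above
[profile.dev]
debug = false
debug-assertions = false
overflow-checks = false
incremental = true

# .cargo/config.toml — strip debug info at link time so the linker does less work
[build]
rustflags = ["-C", "link-args=-Wl,-s"]
```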

Workspace and test layouts

Rust and Cargo have some flexibility in how you place your files on
disk. For this project, there are three reasonable layouts:

single crate
  • Cargo.toml
  • src/
    • lib.rs
    • fe/
    • i18n/
    • test/ (test helpers)
    • util/

2 crates

workspace
  • Cargo.toml
  • libs/
    • fe/
    • i18n/
    • test/ (test helpers)
    • util/

In theory, if you split your code into multiple crates, Cargo can
parallelize rustc invocations. Because I have a 32-thread CPU on my
Linux machine and a 10-thread CPU on my macOS machine, I expect
unlocking parallelization to reduce build times.
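The workspace layout above can be sketched with a top-level manifest like this (member names mirror the directory tree; crate names are illustrative):

```toml
# Cargo.toml at the repository root: each libs/* directory is its own
# crate, so Cargo can run rustc for independent crates in parallel.
[workspace]
members = [
    "libs/fe",
    "libs/i18n",
    "libs/test",
    "libs/util",
]
```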

For a given crate, there are also multiple places for your tests in a
Rust project:

many test exes
  • Cargo.toml
  • src/
  • tests/
    • test_a.rs
    • test_b.rs
    • test_c.rs

1 test exe
  • Cargo.toml
  • src/
  • tests/
    • test.rs
    • t/
      • mod.rs
      • test_a.rs
      • test_b.rs
      • test_c.rs

tests in lib
  • Cargo.toml
  • src/
    • a.rs
    • b.rs
    • c.rs
    • lib.rs
    • test_a.rs
    • test_b.rs
    • test_c.rs

unittests
  • Cargo.toml
  • src/
    • a.rs (+tests)
    • b.rs (+tests)
    • c.rs (+tests)
    • lib.rs

Because of dependency cycles, I couldn’t benchmark the
tests inside src files layout. But I did benchmark the other
layouts in some combinations:
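The “tests in lib” and “unittests” styles keep tests inside the library crate behind #[cfg(test)]. A minimal sketch (the function is illustrative, not actual quick-lint-js code):

```rust
// src/lib.rs — unit tests compiled into the library crate itself.
// `cargo test` builds and runs the #[cfg(test)] module;
// `cargo build` skips it entirely.
pub fn is_identifier_start(c: u8) -> bool {
    // JavaScript identifiers may start with a letter, '_', or '$'.
    c.is_ascii_alphabetic() || c == b'_' || c == b'$'
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn dollar_and_underscore_start_identifiers() {
        assert!(is_identifier_start(b'$'));
        assert!(is_identifier_start(b'_'));
        assert!(!is_identifier_start(b'1'));
    }
}
```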



The workspace configurations (with either separate test
executables (many test exes) or one merged test executable (1 test exe)) seem to be the all-around winner. Let’s stick with the
workspace; many test exes configuration from here on.

Minimize dependency features

Many crates support optional features. Often, optional features are
enabled by default. Let’s see what features are enabled using the
cargo tree command:

$ cargo tree --edges features
cpp_vs_rust v0.1.0
├── cpp_vs_rust_proc feature "default"
│   └── cpp_vs_rust_proc v0.1.0 (proc-macro)
├── lazy_static feature "default"
│   └── lazy_static v1.4.0
└── libc feature "default"
    ├── libc v0.2.138
    └── libc feature "std"
        └── libc v0.2.138
[dev-dependencies]
└── memoffset feature "default"
    └── memoffset v0.7.1
        [build-dependencies]
        └── autocfg feature "default"
            └── autocfg v1.1.0

The libc crate has a feature called std. Let’s
turn it off, test that everything still works, and see if build times improve:

Cargo.toml
 [dependencies]

libc = { version = "0.2.138", default-features = false }

Build times are no better. Maybe the std feature
doesn’t actually do anything meaningful? Oh well. On to the next tweak.

cargo-nextest

cargo-nextest is a tool which claims to
be “up to 60% faster than cargo test”. My Rust code base is
44% tests, so maybe cargo-nextest is just what I need. Let’s try it and
compare build+test times:


On my Linux machine, cargo-nextest either doesn’t help or makes things
worse. The output does look pretty, though…

sample cargo-nextest test output
PASS [   0.002s]        cpp_vs_rust::test_locale no_match
PASS [   0.002s]     cpp_vs_rust::test_offset_of fields_have_different_offsets
PASS [   0.002s]     cpp_vs_rust::test_offset_of matches_memoffset_for_primitive_fields
PASS [   0.002s] cpp_vs_rust::test_padded_string as_slice_excludes_padding_bytes
PASS [   0.002s]     cpp_vs_rust::test_offset_of matches_memoffset_for_reference_fields
PASS [   0.004s] cpp_vs_rust::test_linked_vector push_seven

How about on macOS?


cargo-nextest does slightly speed up builds+tests on my MacBook Pro. I
wonder why the speedup is OS-dependent. Perhaps it’s actually
hardware-dependent?

From here on, on macOS I will use cargo-nextest, but on Linux I will
not.

Custom-built toolchain with PGO

For C++ builds, I found that building the compiler myself with
profile-guided optimization (PGO, also known as
FDO) gave
significant performance wins. Let’s try PGO with the Rust toolchain.
Let’s also try
LLVM BOLT
to further optimize rustc. And -Ctarget-cpu=native as well.


Compared to C++ compilers, it looks like the Rust toolchain published
via rustup is already well-optimized. PGO+BOLT gave us less than a 10%
performance boost. But a perf win is a perf win, so let’s use this
faster toolchain in the fight against C++.

When I first tried building a custom Rust toolchain, it was slower than
Nightly by about 2%. I struggled for days to at least reach parity,
tweaking all sorts of knobs in Rust’s config.toml, and
cross-checking Rust’s CI build scripts against my own. As I was putting the
finishing touches on this article, I decided to
rustup update, git pull, and re-build the
toolchain from scratch. Then my custom toolchain was faster! I guess
that was what I needed; perhaps I had accidentally been on the wrong commit
in the Rust repo.

Optimizing the C++ build

When working on the original C++ project, quick-lint-js, I had already
optimized build times using common techniques, such as using
PCH, disabling exceptions and
RTTI, tweaking build
flags, removing unnecessary #includes, moving code out of
headers, and externing template instantiations. But there
are several C++ compilers and linkers to choose from. Let’s compare them
and pick the best before comparing C++ with Rust:


On Linux, GCC is a clear outlier. Clang fares much better. My
custom-built Clang (which is built with PGO and BOLT, like my custom
Rust toolchain) really improves build times compared to Ubuntu’s Clang.
libstdc++ builds slightly faster on average than libc++. Let’s use my
custom Clang with libstdc++ in my C++ vs Rust comparison.


On macOS, the Clang toolchain which comes with Xcode seems to be
better-optimized than the Clang toolchain from LLVM’s website. I’ll use
the Xcode Clang for my C++ vs Rust comparison.
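One of the header tricks mentioned above, externing template instantiations, looks roughly like this. A self-contained sketch; the template is illustrative, not quick-lint-js code:

```cpp
#include <cassert>
#include <string>

// A function template used from many translation units (illustrative):
template <typename T>
T doubled(T x) { return x + x; }

// In each consumer of the header: promise that the int instantiation is
// compiled elsewhere, so this translation unit does not re-instantiate it.
extern template int doubled<int>(int);

// In exactly one .cpp file: compile the instantiation once for everyone.
template int doubled<int>(int);
```

Each instantiation is then compiled and optimized once instead of once per including .cpp file, shrinking object files and compile times.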

C++20 modules

My C++ code uses #include. But what about
import, introduced in C++20? Aren’t C++20 modules supposed
to make compilation super fast?

I tried to use C++20 modules for this project. As of this writing,
CMake support for modules on Linux is so
experimental that even
‘hello world’ doesn’t work.

Maybe 2023 will be the year of C++20 modules. As someone who cares a lot
about build times, I really hope so! But for now, I’ll pit Rust
against classic C++ #includes.

C++ vs Rust build times

I ported the C++ project to Rust and optimized the Rust build times as
much as I could. Which one compiles faster: C++ or Rust?

Unfortunately, the answer is: it depends!


On my Linux machine, Rust builds are sometimes faster than C++ builds,
but sometimes slower or the same speed. In the
incremental lex benchmark, which modifies the largest src file,
Clang was faster than rustc. But for the other incremental benchmarks,
rustc came out on top.


On my macOS machine, however, the story is very different. C++ builds
are usually much faster than Rust builds. In the
incremental test-utf-8 benchmark, which modifies a medium-sized
test file, rustc compiled slightly faster than Clang. But for the other
incremental benchmarks, and for the full build benchmark, Clang clearly
came out on top.

Scaling beyond 17k SLOC

I benchmarked a 17k SLOC project, but that was a small project.
How do build times compare for a larger project of, say, 100k SLOC or
more?

To test how well the C++ and Rust compilers scale, I took the biggest
module (the lexer) and copy-pasted its code and tests, making 8, 16, and
24 copies.

Because my benchmarks also include the time it takes to run tests, I
expect times to increase linearly, even with instant build times.

Scaled project sizes

      C++ SLOC         Rust SLOC
1x    16.6k            17.1k
8x    52.3k (+215%)    43.7k (+156%)
16x   93.1k (+460%)    74.0k (+334%)
24x   133.8k (+705%)   104.4k (+512%)


Each Rust and Clang scaled linearly, which is nice to see.

For C++, altering a header file (incremental diag-types) result in
the most important change in construct time, as anticipated. Construct time scaled with a
low issue for the opposite incremental benchmarks, principally due to the
Mould linker.

I’m disenchanted with how poorly Rust’s construct scales, even with the
incremental test-utf-8 benchmark which should not be affected that
a lot by including unrelated information. This check makes use of the
workspace; many check exes crate structure, which implies test-utf-8
ought to get its personal executable which ought to compile independently.

Conclusion

Are compilation times a problem with Rust? Yes. There
are some tips and tricks to speed up builds, but I didn’t find the
magical order-of-magnitude improvements which would make me happy
developing in Rust.

Are build times as bad with Rust as with C++? Yes. And
for bigger projects, development compile times are worse with Rust than
with C++, at least with my code style.

Looking back at my hypotheses, I was wrong on all counts:

  1. The Rust port had more lines than the C++ version, not fewer.
  2. For full builds, compared to Rust, C++ builds took about the same
    amount of time (17k SLOC) or less time (100k+ SLOC), not longer.
  3. For incremental builds, compared to C++, Rust builds were sometimes
    shorter and sometimes longer (17k SLOC) or much longer (100k+ SLOC),
    not always longer.

Am I sad? Yes. During the porting process, I learned to like some
aspects of Rust. For example, proc macros would let me replace three
different code generators, simplifying the build pipeline and making
life easier for new contributors. I don’t miss header files at all. And
I appreciate Rust’s tooling (especially Cargo, rustup, and miri).

I decided not to port the rest of quick-lint-js to Rust.
But… if build times improve significantly, I may change my
mind! (Unless I become enchanted by
Zig first.)

Appendix

Source code

Source code for
the trimmed C++ project, the Rust port (including different project
layouts), code generation scripts, and benchmarking scripts.
GPL-3.0-or-later.

Linux machine

name: strapurp
CPU: AMD Ryzen 9 5950X (PBO; stock clocks) (32 threads) (x86_64)
RAM: G.SKILL F4-4000C19-16GTZR 2×16 GiB (overclocked to 3800 MT/s)
OS: Linux Mint 21.1
Kernel: Linux strapurp 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Linux performance governor: schedutil
CMake: version 3.19.1
Ninja: version 1.10.2
GCC: version 12.1.0-2ubuntu1~22.04
Clang (Ubuntu): version 14.0.0-1ubuntu1
Clang (custom): version 15.0.6 (Rust fork; commit 3dfd4d93fa013e1c0578d3ceac5c8f4ebba4b6ec)
libstdc++ for Clang: version 11.3.0-1ubuntu1~22.04
Rust Stable: 1.66.0 (69f9c33d7 2022-12-12)
Rust Nightly: version 1.68.0-nightly (c7572670a 2023-01-03)
Rust (custom): version 1.68.0-dev (c7572670a 2023-01-03)
Mold: version 0.9.3 (ec3319b37f653dccfa4d1a859a5c687565ab722d)
binutils: version 2.38

macOS machine

name: strammer
CPU: Apple M1 Max (10 threads) (AArch64)
RAM: Apple 64 GiB
OS: macOS Monterey 12.6
CMake: version 3.19.1
Ninja: version 1.10.2
Xcode Clang: Apple clang version 14.0.0 (clang-1400.0.29.202) (Xcode 14.2)
Clang 15: version 15.0.6 (LLVM.org website)
Rust Stable: 1.66.0 (69f9c33d7 2022-12-12)
Rust Nightly: version 1.68.0-nightly (c7572670a 2023-01-03)
Rust (custom): version 1.68.0-dev (c7572670a 2023-01-03)
lld: version 15.0.6
zld: commit d50a975a5fe6576ba0fd2863897c6d016eaeac41

Benchmarks

build+test w/ deps
  C++: cmake -S . -B build -G Ninja && ninja -C build quick-lint-js-test && build/test/quick-lint-js-test timed
  Rust: cargo fetch untimed, then cargo test timed
build+test w/o deps
  C++: cmake -S . -B build -G Ninja && ninja -C build gmock gmock_main gtest untimed, then ninja -C build quick-lint-js-test && build/test/quick-lint-js-test timed
  Rust: cargo build --package lazy_static --package libc --package memoffset untimed, then cargo test timed
incremental diag-types
  C++: build+test untimed, then modify diagnostic-types.h, then ninja -C build quick-lint-js-test && build/test/quick-lint-js-test
  Rust: build+test untimed, then modify diagnostic_types.rs, then cargo test
incremental lex
  Like incremental diag-types, but with lex.cpp/lex.rs
incremental test-utf-8
  Like incremental diag-types, but with test-utf-8.cpp/test_utf_8.rs

For each executed benchmark, 12 samples were taken. The first two were
discarded. Bars show the average of the last 10 samples. Error bars show
the minimum and maximum sample.


