Honey, I shrunk the npm package · Jamie Magee
Have you ever wondered what lies beneath the surface of an npm package? At its heart, it's nothing more than a gzipped tarball. Across software development, source code and binary artifacts are almost always shipped as `.tar.gz` or `.tgz` files. And gzip compression is supported by every HTTP server and web browser out there. caniuse.com doesn't even give statistics for support, it just says "supported in effectively all browsers". But here's the kicker: gzip is starting to show its age, making way for newer, more modern compression algorithms like Brotli and Zstandard. Now, imagine a world where npm embraces one of these new algorithms. In this blog post, I'll dive into the realm of compression and explore the possibilities of modernising npm's compression strategy.
What's the competition?
The two major players in this space are Brotli and Zstandard (or zstd for short). Brotli was released by Google in 2013 and zstd was released by Facebook in 2016. They've since been standardised, in RFC 7932 and RFC 8478 respectively, and have seen widespread use across the software industry. It was actually the announcement by Arch Linux that they were going to start compressing their packages with zstd by default that made me think about this in the first place. Arch Linux was by no means the first project, nor is it the only one. But to find out whether it makes sense for the Node ecosystem, I need to do some benchmarks. And that means breaking out `tar`.
Benchmarking part 1
I'm going to start with `tar` and see what sort of comparisons I can get by switching between gzip, Brotli, and zstd. I'll test with the npm package of npm itself, since it's a pretty popular package, averaging over 4 million downloads per week, while also being fairly large at around 11MB unpacked.
```shell
$ curl --remote-name https://registry.npmjs.org/npm/-/npm-9.7.1.tgz
$ ls -l --human npm-9.7.1.tgz
-rw-r--r-- 1 jamie users 2.6M Jun 16 20:30 npm-9.7.1.tgz
$ tar --extract --gzip --file npm-9.7.1.tgz
$ du --summarize --human --apparent-size package
11M package
```
gzip is already giving good results, compressing 11MB down to 2.6MB for a compression ratio of around 0.24. But what can the contenders do? I'm going to stick with the default options for now:
```shell
$ brotli --version
brotli 1.0.9
$ tar --use-compress-program brotli --create --file npm-9.7.1.tar.br package
$ zstd --version
*** Zstandard CLI (64-bit) v1.5.5, by Yann Collet ***
$ tar --use-compress-program zstd --create --file npm-9.7.1.tar.zst package
$ ls -l --human npm-9.7.1.tgz npm-9.7.1.tar.br npm-9.7.1.tar.zst
-rw-r--r-- 1 jamie users 1.6M Jun 16 21:14 npm-9.7.1.tar.br
-rw-r--r-- 1 jamie users 2.3M Jun 16 21:14 npm-9.7.1.tar.zst
-rw-r--r-- 1 jamie users 2.6M Jun 16 20:30 npm-9.7.1.tgz
```
Wow! With no configuration both Brotli and zstd come out ahead of gzip, but Brotli is the clear winner here. It manages a compression ratio of 0.15 versus zstd's 0.21. In real terms that means a saving of around 1MB. That doesn't sound like much, but at 4 million weekly downloads, it would save around 4TB of bandwidth per week.
Benchmarking part 2: Electric boogaloo
The compression ratio is only telling half of the story. Actually, it's a third of the story, but compression speed isn't really a concern. Compression of a package only happens once, when the package is published, but decompression happens every time you run `npm install`. So any time saved decompressing packages means quicker install or build steps.
To test this, I'm going to use hyperfine, a command-line benchmarking tool. Decompressing each of the packages I created earlier 100 times should give me a good idea of the relative decompression speed.
```shell
$ hyperfine --runs 100 --export-markdown hyperfine.md \
    'tar --use-compress-program brotli --extract --file npm-9.7.1.tar.br --overwrite' \
    'tar --use-compress-program zstd --extract --file npm-9.7.1.tar.zst --overwrite' \
    'tar --use-compress-program gzip --extract --file npm-9.7.1.tgz --overwrite'
```
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
`tar --use-compress-program brotli --extract --file npm-9.7.1.tar.br --overwrite` | 51.6 ± 3.0 | 47.9 | 57.3 | 1.31 ± 0.12 |
`tar --use-compress-program zstd --extract --file npm-9.7.1.tar.zst --overwrite` | 39.5 ± 3.0 | 33.5 | 51.8 | 1.00 |
`tar --use-compress-program gzip --extract --file npm-9.7.1.tgz --overwrite` | 47.0 ± 1.7 | 44.0 | 54.9 | 1.19 ± 0.10 |
This time zstd comes out in front, followed by gzip and Brotli. This makes sense, as "real-time compression" is one of the big features touted in zstd's documentation. While Brotli is 31% slower compared to zstd, in real terms that's only 12ms. And compared to gzip, it's only 5ms slower. To put that into context, you'd need a better than 1Gbps connection before the 5ms Brotli loses in decompression outweighs the 1MB it saves in package size.
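That break-even claim is easy to sanity-check with a little arithmetic, using the rough deltas measured above:

```javascript
// How fast would a connection need to be before Brotli's ~1MB size saving
// is cancelled out by its ~5ms decompression penalty versus gzip?
const savedBytes = 1 * 1024 * 1024; // ~1MB smaller Brotli artifact
const extraSeconds = 0.005;         // ~5ms slower decompression
const breakEvenBitsPerSecond = (savedBytes * 8) / extraSeconds;
console.log((breakEvenBitsPerSecond / 1e9).toFixed(2) + ' Gbps'); // → 1.68 Gbps
```

In other words, unless your download runs at roughly 1.7Gbps or faster, the smaller Brotli artifact wins overall despite the slower decompression.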
Benchmarking part 3: This time it's serious
Up until now I've just been using Brotli and zstd's default settings, but both have a range of knobs and dials you can adjust to trade compression ratio against compression or decompression speed. Luckily, the industry-standard lzbench has got me covered. It can run through all of the different quality levels for each compressor, and spit out a nice table with all the data at the end.
But before I dive in, there are a few caveats I should point out. The first is that lzbench isn't able to compress an entire directory like `tar` can, so I opted to use `lib/npm.js` for this test. The second is that lzbench doesn't include the gzip tool; instead it uses zlib, the underlying gzip library. The last is that the versions of each compressor aren't quite current. The latest version of zstd is 1.5.5, released on April 4th 2023, while lzbench uses version 1.4.5, released on May 22nd 2020. The latest version of Brotli is 1.0.9, released on August 27th 2020, while lzbench uses a version released on October 1st 2019.
```shell
$ lzbench -o1 -ezlib/zstd/brotli package/lib/npm.js
```
Compressor name | Compression | Decompress. | Compr. size | Ratio | Filename |
---|---|---|---|---|---|
memcpy | 117330 MB/s | 121675 MB/s | 13141 | 100.00 | package/lib/npm.js |
zlib 1.2.11 -1 | 332 MB/s | 950 MB/s | 5000 | 38.05 | package/lib/npm.js |
zlib 1.2.11 -2 | 382 MB/s | 965 MB/s | 4876 | 37.11 | package/lib/npm.js |
zlib 1.2.11 -3 | 304 MB/s | 986 MB/s | 4774 | 36.33 | package/lib/npm.js |
zlib 1.2.11 -4 | 270 MB/s | 1009 MB/s | 4539 | 34.54 | package/lib/npm.js |
zlib 1.2.11 -5 | 204 MB/s | 982 MB/s | 4452 | 33.88 | package/lib/npm.js |
zlib 1.2.11 -6 | 150 MB/s | 983 MB/s | 4425 | 33.67 | package/lib/npm.js |
zlib 1.2.11 -7 | 125 MB/s | 983 MB/s | 4421 | 33.64 | package/lib/npm.js |
zlib 1.2.11 -8 | 92 MB/s | 989 MB/s | 4419 | 33.63 | package/lib/npm.js |
zlib 1.2.11 -9 | 95 MB/s | 986 MB/s | 4419 | 33.63 | package/lib/npm.js |
zstd 1.4.5 -1 | 594 MB/s | 1619 MB/s | 4793 | 36.47 | package/lib/npm.js |
zstd 1.4.5 -2 | 556 MB/s | 1423 MB/s | 4881 | 37.14 | package/lib/npm.js |
zstd 1.4.5 -3 | 510 MB/s | 1560 MB/s | 4686 | 35.66 | package/lib/npm.js |
zstd 1.4.5 -4 | 338 MB/s | 1584 MB/s | 4510 | 34.32 | package/lib/npm.js |
zstd 1.4.5 -5 | 275 MB/s | 1647 MB/s | 4455 | 33.90 | package/lib/npm.js |
zstd 1.4.5 -6 | 216 MB/s | 1656 MB/s | 4439 | 33.78 | package/lib/npm.js |
zstd 1.4.5 -7 | 140 MB/s | 1665 MB/s | 4422 | 33.65 | package/lib/npm.js |
zstd 1.4.5 -8 | 101 MB/s | 1714 MB/s | 4416 | 33.60 | package/lib/npm.js |
zstd 1.4.5 -9 | 97 MB/s | 1673 MB/s | 4410 | 33.56 | package/lib/npm.js |
zstd 1.4.5 -10 | 97 MB/s | 1672 MB/s | 4410 | 33.56 | package/lib/npm.js |
zstd 1.4.5 -11 | 37 MB/s | 1665 MB/s | 4371 | 33.26 | package/lib/npm.js |
zstd 1.4.5 -12 | 27 MB/s | 1637 MB/s | 4336 | 33.00 | package/lib/npm.js |
zstd 1.4.5 -13 | 20 MB/s | 1601 MB/s | 4310 | 32.80 | package/lib/npm.js |
zstd 1.4.5 -14 | 18 MB/s | 1582 MB/s | 4309 | 32.79 | package/lib/npm.js |
zstd 1.4.5 -15 | 18 MB/s | 1582 MB/s | 4309 | 32.79 | package/lib/npm.js |
zstd 1.4.5 -16 | 9.03 MB/s | 1556 MB/s | 4305 | 32.76 | package/lib/npm.js |
zstd 1.4.5 -17 | 8.86 MB/s | 1559 MB/s | 4305 | 32.76 | package/lib/npm.js |
zstd 1.4.5 -18 | 8.86 MB/s | 1558 MB/s | 4305 | 32.76 | package/lib/npm.js |
zstd 1.4.5 -19 | 8.86 MB/s | 1559 MB/s | 4305 | 32.76 | package/lib/npm.js |
zstd 1.4.5 -20 | 8.85 MB/s | 1558 MB/s | 4305 | 32.76 | package/lib/npm.js |
zstd 1.4.5 -21 | 8.86 MB/s | 1559 MB/s | 4305 | 32.76 | package/lib/npm.js |
zstd 1.4.5 -22 | 8.86 MB/s | 1589 MB/s | 4305 | 32.76 | package/lib/npm.js |
brotli 2019-10-01 -0 | 604 MB/s | 813 MB/s | 5182 | 39.43 | package/lib/npm.js |
brotli 2019-10-01 -1 | 445 MB/s | 775 MB/s | 5148 | 39.18 | package/lib/npm.js |
brotli 2019-10-01 -2 | 347 MB/s | 947 MB/s | 4727 | 35.97 | package/lib/npm.js |
brotli 2019-10-01 -3 | 266 MB/s | 936 MB/s | 4645 | 35.35 | package/lib/npm.js |
brotli 2019-10-01 -4 | 164 MB/s | 930 MB/s | 4559 | 34.69 | package/lib/npm.js |
brotli 2019-10-01 -5 | 135 MB/s | 944 MB/s | 4276 | 32.54 | package/lib/npm.js |
brotli 2019-10-01 -6 | 129 MB/s | 949 MB/s | 4257 | 32.39 | package/lib/npm.js |
brotli 2019-10-01 -7 | 103 MB/s | 953 MB/s | 4244 | 32.30 | package/lib/npm.js |
brotli 2019-10-01 -8 | 84 MB/s | 919 MB/s | 4240 | 32.27 | package/lib/npm.js |
brotli 2019-10-01 -9 | 7.74 MB/s | 958 MB/s | 4237 | 32.24 | package/lib/npm.js |
brotli 2019-10-01 -10 | 4.35 MB/s | 690 MB/s | 3916 | 29.80 | package/lib/npm.js |
brotli 2019-10-01 -11 | 1.59 MB/s | 761 MB/s | 3808 | 28.98 | package/lib/npm.js |
This pretty much confirms what I've shown so far. zstd is able to provide faster decompression speeds than either gzip or Brotli, and to slightly edge out gzip in compression ratio. Brotli, on the other hand, has decompression speeds and compression ratios comparable to gzip at lower quality levels, but at levels 10 and 11 it's able to edge out both gzip and zstd on compression ratio.
Everything is derivative
Now that I'm done with benchmarking, I need to step back and look at my original idea of replacing gzip as npm's compression standard. As it turns out, Evan Hahn had a similar idea in 2022 and proposed an npm RFC. He proposed using Zopfli, a backwards-compatible gzip compression library, and Brotli's older (and cooler) sibling. Zopfli is able to produce smaller artifacts, with the trade-off of a much slower compression speed. In theory, an easy win for the npm ecosystem. And if you watch the RFC meeting recording or read the meeting notes, everyone seems hugely in favour of the proposal. However, the one big roadblock that stops this RFC from being immediately accepted, and ultimately results in it being abandoned, is the lack of a native JavaScript implementation.
Learning from this earlier RFC and from my results benchmarking Brotli and zstd, what would it take to build a strong RFC of my own?
Putting it all together
Both Brotli and zstd's reference implementations are written in C. And while there are a number of ports on the npm registry using Emscripten or WASM, Brotli has an implementation in Node.js's zlib module, and has done since Node.js 10.16.0, released in May 2019. I opened an issue in Node.js's GitHub repo to add support for zstd, but it'll take a long time to make its way into an LTS release, never mind the rest of npm's dependency chain. I was already leaning towards Brotli, but this just seals the deal.
Settling on an algorithm is one thing, but implementing it is another. npm's current support for gzip compression ultimately comes from Node.js itself. But the dependency chain between npm and Node.js is long, and slightly different depending on whether you're packing or unpacking a package.
The dependency chain for packing, as in `npm pack` or `npm publish`, is:

npm → libnpmpack → pacote → tar → minizlib → zlib (Node.js)
But the dependency chain for unpacking (or 'reifying' as npm calls it), as in `npm install` or `npm ci`, is:

npm → @npmcli/arborist → pacote → tar → minizlib → zlib (Node.js)
That's quite a few packages that need to be updated, but luckily the first steps have already been taken. Support for Brotli was added to minizlib 1.3.0 back in September 2019. I built on top of that and contributed Brotli support to `tar`. That's now available in version 6.2.0. It might take a while, but I can see a clear path forward.
The final issue is backwards compatibility. This wasn't a concern with Evan Hahn's RFC, as Zopfli generates backwards-compatible gzip files. However, Brotli is an entirely new compression format, so I'll have to propose a very careful adoption plan. The approach I can see is:
- Support for packing and unpacking is added in a minor release of the current version of npm
- Unpacking using Brotli is handled transparently
- Packing using Brotli is disabled by default, and only enabled if one of the following is true:
  - The `engines` field in `package.json` is set to a version of npm that supports Brotli
  - The `engines` field in `package.json` is set to a version of node that bundles a version of npm that supports Brotli
  - Brotli support is explicitly enabled in `.npmrc`
- Packing using Brotli is enabled by default in the next major release of npm after the LTS version of Node.js that bundles it goes out of support
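To make the first opt-in condition concrete, an early-adopting package author might declare the requirement like this (the npm version number is hypothetical, purely for illustration):

```json
{
  "engines": {
    "npm": ">=11.0.0"
  }
}
```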
Let's say that Node.js 22 ships with npm 10, which has Brotli support. Node.js 22 will stop getting LTS updates in April 2027. Then the next major version of npm after that date should enable Brotli packing by default.

I admit this is an incredibly long transition period. However, it guarantees that if you're using a version of Node.js that's still supported, there will be no visible impact on you. And it still allows early adopters to opt in to Brotli support. But if anyone has other ideas about how to handle this transition, I'm open to suggestions.
What's next?
As I wrap up my exploration into npm compression, I have to admit that my journey has only just begun. To push the boundaries further, there are a few more steps. First and foremost, I need to do some more extensive benchmarking with the top 250 most downloaded npm packages, instead of focusing on a single package. Once that's complete, I need to draft an npm RFC and seek feedback from the wider community. If you're interested in helping out, or just want to see how it's going, you can follow me on Mastodon at @[email protected], or on Twitter at @Jamie_Magee.