Merging bcachefs [LWN.net]
The bcachefs filesystem, and the
process for getting it upstream, were the topics
of a session led remotely by
Kent Overstreet, creator of bcachefs, at the
2023 Linux Storage, Filesystem,
Memory-Management and BPF Summit. He has also discussed bcachefs in
earlier editions of the summit, first
in 2018 and at last year’s event;
in both of those cases, the question of getting bcachefs merged
into the mainline kernel came up, but that merge has not happened yet.
This time
around, though, Overstreet seemed
closer than ever to being ready to actually start that process.
He began his talk by noting that he had been saying bcachefs is almost
ready for merging for some time now; “now I am saying, let’s finally do
it”. He wanted to report on the status of the filesystem and on why it is
ready now for upstreaming, but he wanted to use the bulk of the session to
discuss
the process of doing so. “It is a large, 90,000-lines-of-code
beast” that needs to get reviewed, so there is a need to figure out the
process to do that review.
His goal with bcachefs is to have the “performance, reliability,
scalability, and robustness of XFS with modern features”. That is a high
bar, and one that bcachefs has not yet reached, but “I think we’re pretty
far along”. People are running bcachefs on 100TB filesystems “without any
issues or complaints”; he is waiting for the first 1PB filesystem.
“Snapshots scale beautifully”, which is not true for Btrfs, based on user
complaints,
he said.
Status
In the last year, there has been a lot of scalability work done, much of
which required deep rewrites, including for the allocator, which dates back
to bcache. There is a new
“no copy-on-write” (nocow) mode and snapshots have been implemented. People
are using the snapshots to do backups of MySQL databases, he said, which is
a test of the robustness of the feature.
Erasure coding is
the last really big feature that he would like to get into bcachefs before
upstreaming it. But he thinks “it’s time to draw a line in the sand”, so
that will wait for a bit. There is still a lot of work to do, but “the big
feature work is lessening”; he will be able to work on being a maintainer
without having to disappear for a month to work on something, as he did for
snapshots, for example.
The bcachefs team is growing; Brian Foster at Red Hat has been doing a lot
of great work on bug fixes, Overstreet said. Eric Sandeen has helped in
attracting interest in bcachefs at Red Hat as well. There is a bi-weekly
call on bcachefs development. There is automated testing infrastructure
that has been added and it is “making my life a lot simpler”, Overstreet
said. The test system runs in about half an hour and includes multiple
runs of fstests as well as the “big test suite” for bcachefs.
Rust is something that he has been evangelizing about to “anyone who will
listen”; he thinks “writing code in C, when we finally have a better option
available, is madness”. He loves to write code, but not to debug it;
writing in Rust “just means a lot less time debugging”. He intends to
slowly rewrite bcachefs in Rust, which will be a ten-plus-year project, but
the use of Rust in bcachefs has already begun. Some of the user-space
tools have been rewritten in Rust and someone is looking at moving some of
that work into the kernel.
Upstreaming
That morning he had posted 32
preliminary patches adding infrastructure that bcachefs will need; those
patches were already being reviewed, he said. The
rest is 90,000 lines of code in 2,500 patches that he did not
post; he did include a link to his Git repository, where
those patches live in a bcachefs-for-upstream
branch. He then opened up the floor to discuss how those patches would
be reviewed and, eventually, merged.
Josef Bacik said that he thinks the reaction will be much the same as last
year; filesystem developers are “really excited” to see bcachefs get
merged. He does not plan to review the implementation of the filesystem
itself and suspects that is generally true of others. The people who are working on
it will review it; “trust yourselves for that part”. The “generic stuff is
what we need to review”; once that is done, the rest of the filesystem code
can be merged as far as he is concerned. That is, of course, up to Linus
Torvalds.
Overstreet said that one of his questions is: “what can we take to Linus?”
He has spent the last year on process and infrastructure, getting a team
together, working with Red Hat, putting together an automated test suite,
and so on. Mike Snitzer remotely pointed out that a patch set that had
recently been rejected contained two enormous patches that were essentially
impossible to review; he contrasted that with the 2,500 fine-grained
patches that make up bcachefs, which are much easier to digest.
While Snitzer is
not sure that having everyone go through them one-by-one in review is the right
approach, the obvious effort that went into that patch series makes it
easier to trust the code and the process that went into developing it.
“You’ve done the heavy lifting by doing all of that work to split up
patches.” Overstreet said that it was a lot of work to rebase nearly the
entire history, but that it came in handy around six months ago when Red
Hat noticed some big performance regressions. He was able to use that
history to do automated bisection and got almost all of the performance back.
Bacik said that Torvalds is the “maintainer” in charge of merging a new
filesystem, so it will be up to him to decide if he is willing to pull the
full history into the mainline. It would be Bacik’s preference to do so,
because the history is “super useful”, but that is not something that the
people in the room can decide. He suggested that the pull request be more
of a question about whether the full history was acceptable and, if not,
what would be.
One concern is that once bcachefs gets merged, it will be difficult for
anyone besides Overstreet to deal with the bug reports, Amir Goldstein
said. It is important that this be addressed in the pull request; “I want to
merge this and I have a team that can support this”. Getting more help
was one of the criteria before upstreaming, Overstreet said. He knew that
if it was a one-man show and he got deluged with bug reports, he would “go
insane and run away to South America”; Foster has been “a huge help”, which
is one of the things that makes him feel comfortable about merging at this
point.
Paradoxically, the recent push to remove some
filesystems (e.g. ReiserFS)
from the kernel is actually going to make it easier to add new ones, Ted
Ts’o said. He can remember Hans Reiser being enthusiastic about his new
filesystem, with a team to support it, but that all fell into disrepair
over time. The kernel project now has a path for removing filesystems
after a deprecation cycle. The idea that “accepting a filesystem is not
forever, makes it a whole lot easier” to merge new ones.
He also suggested breaking up the patch series into smaller, more reviewable
chunks that collect up a small number of related patches. That would make
it easier
for people to review, say, all of the lockdep patches in one chunk. It
would mean relaxing the general guideline about not merging infrastructure
until
its first caller is merged, which he is in favor of; he would amend that
guideline to allow merging
when it includes a pointer to the Git tree of the first caller.
Overstreet thinks that the preliminaries that he posted earlier that day
will not be too controversial and, other than perhaps one or two, “will just
sail through”. He noted that Christoph Hellwig had objected to the vmalloc_exec()
patch, though that functionality is needed for bcachefs, Overstreet
said. Since the
talk, Mike Rapoport has proposed the JIT allocator, which might
solve the
underlying problem.
A remote participant said that Foster’s experience had shown that the code
base is approachable; once bcachefs is out there, developers will be
able to come up to speed and start working on it with few difficulties.
Christian Brauner asked that there be a clear delineation of who else
could step in and merge patches if Overstreet is unavailable. Brauner noted
that the NTFS/NTFS3 maintainer disappeared and, though there were people
who were contributing to the filesystem, it was not clear “who could route
patches upstream”. Overstreet said that he would trust Foster in that role
if “he is willing to step up to that”.
Brauner said that he thinks bcachefs is in “wonderful shape to be
upstreamed”, but he is concerned with the number of filesystems in the
kernel; he is glad to see that there are efforts to remove some of them.
Changes that affect all of the filesystems in the tree “get painful very
very fast” and, in some cases, there is no one available to review the
changes. He would like the acceptance process to be more conservative;
accepting NTFS/NTFS3 was “a huge mistake”, for example. Brauner said that
none of that was directed at bcachefs, but was a more general concern;
filesystem acceptance and deprecation
was taken up in a lightning talk (YouTube video) later
that day.
Darrick Wong said that he had already started doing what Ts’o suggested in
his patches
for XFS online repair. He has a set
of infrastructure patches that refer to callers that are coming soon; he
has convinced Dave Chinner that there is value in reviewing the
infrastructure pieces while also looking at the bigger picture of where it
is all leading. That helps him because he can stop “rebasing things
repeatedly and having to play code golf, like moving small helper functions
up and down in the patch set”. Putting all of that stuff in a separate set
of infrastructure patches helped him; it did cause some complaints
from reviewers, but there is now some precedent for that approach, he said.
Overstreet said that he is not particularly concerned about the 30 or
so “relatively uncomplicated” infrastructure patches that he needs to
land. He is going to wait for the Acked-by and Reviewed-by tags to come
in, but if they do not, then he will use the suggested approach “as a Plan
B”. With that, the
session came to a close.