Are You Sure You Want to Use MMAP in Your DBMS?
A paper with the above provocative title began making the rounds again in 2022. While we initially commented on it over Twitter, and a few of our colleagues wrote longer responses on their blogs, we never wrote a long-form response before. But since the paper keeps resurfacing from time to time, it seemed like a good idea to address it in depth, once and for all.
The paper’s abstract gets off to a strident start: “There are, however, severe correctness and performance issues with mmap that are not immediately apparent. Such problems make it difficult, if not impossible, to use mmap correctly and efficiently in a modern DBMS.” The irony of this statement is that multiple research projects have shown that LMDB is one of the only DB engines that consistently proves to have perfectly correct crash resistance, while other DB engines using more traditional buffer pool management schemes have shown a variety of failure and corruption cases.
In section 1 the paper’s introduction again makes some ridiculous claims: “Unfortunately, mmap has a hidden dark side with many sordid problems that make it undesirable for file I/O in a DBMS. As we describe in this paper, these problems involve both data safety and system performance concerns. We contend that the engineering steps required to overcome them negate the purported simplicity of working with mmap. For these reasons, we believe that mmap adds too much complexity with no commensurate performance benefit and strongly urge DBMS developers to avoid using mmap as a replacement for a traditional buffer pool.”
LMDB safely addresses all of these problems and is still the smallest, most reliable database engine in the world, coming in at under 64KB of object code. Meanwhile, taking the traditional approach gives you DB engines that require orders of magnitude more code just to attempt to be correct, and that still fail. Their assessment of complexity and correctness is completely wrong.
In section 2 their overview of mmap is basically correct. In 2.3 “MMAP Gone Wrong” they list a number of DBMSs that have successfully used mmap (including LMDB) and then they cite a number of well-known examples of DBMSs that used mmap and got it wrong, including MongoDB and others. The whole section is difficult to take seriously; they claim it is nearly impossible to get it right and then give a list of projects that got it right. Those that got it wrong are irrelevant, because correct solutions obviously exist for all the potential problems.
In section 3 “Problems with mmap” we should be getting to the heart of the matter, but I find very little of interest here since LMDB has none of these problems. Their discussion of Shadow Paging that explicitly describes LMDB is not even relevant, since by default LMDB does not use a writable mmap. As such their discussion of partial updates and msync does not even apply. These aspects of LMDB’s behavior are well documented, so the fact that they get this wrong reflects poorly on their research efforts.
Section 3.2, “I/O stalls”, is again a non-issue; no matter how your DBMS handles I/O internally, synchronously or asynchronously, the calling application cannot make any progress until the I/O completes, and if the data is not already in memory then the application must wait.
In section 3.3 “Error handling” they correctly note “pointer errors may corrupt pages in memory” but that is why LMDB uses a read-only mmap by default. So again, non-issue. These problems are obvious to anyone contemplating such a design, and the solution is trivial, quite the opposite of the insurmountable near-impossibility they claim.
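To make the read-only mapping idea concrete, here is a minimal sketch in Python rather than LMDB's C. It is not LMDB's code: the function name `demo_readonly_map` and the scratch file are made up for illustration, and it assumes a POSIX system with a unified page cache (as on Linux), where data written through the file descriptor becomes visible through a shared mapping of the same file. The point is only that stores through the read-only map are refused, while updates made with explicit write calls still show up in it.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def demo_readonly_map():
    """Map a scratch file read-only, then update it through the descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * PAGE)

        # ACCESS_READ asks for a read-only mapping of the file.
        view = mmap.mmap(fd, PAGE, access=mmap.ACCESS_READ)

        # A stray store through the map is rejected. Python raises TypeError
        # at the wrapper level; in C, a store to a PROT_READ mapping would
        # fault instead of silently corrupting the page.
        try:
            view[0:4] = b"oops"
            store_blocked = False
        except TypeError:
            store_blocked = True

        # Updates go through an explicit write call on the descriptor.
        # Assumption: unified page cache, so the mapping sees the new bytes.
        os.pwrite(fd, b"BBBB", 0)
        visible = view[0:4]

        view.close()
        return store_blocked, visible
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_readonly_map())
```

The design choice being illustrated is the separation of paths: readers only ever touch the map, and the only way to change the file is a deliberate write call.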
In section 3.4 “Performance issues” – every benchmark shows that LMDB always massively outperforms every other DB engine on reads, so there is really nothing of substance here either. Page eviction is explicitly not an issue since LMDB uses a read-only mmap. That means all map pages are always clean; whenever memory pressure causes the OS to need to reclaim a page it can just do so immediately, without having to flush the page out first. (Also the RavenDB blog post directly addressed the other points so I will not re-tread that ground.)
Section 4 “Experimental Analysis” really takes the cake – they test using fio, a filesystem benchmarking tool. They do not actually compare DBMS implementations, so none of their analysis takes into account the complexity and performance costs for a DBMS to not use mmap. This is the heart of their paper and its comparisons are completely invalid.
Section 6 Conclusion is just completely wrong:
“When you should not use mmap:”
- “you need to perform updates in a transactionally safe fashion” – LMDB’s ACID transactions are 100% perfectly safe.
- “you need explicit control over what data is in memory” – on a machine with virtual memory, i.e., all modern operating systems, you never have this at application level.
- “You care about error handling and need to return correct results” – LMDB always returns correct results.
“When you should maybe use mmap in your DBMS”
- “Your working set (or the entire database) fits in memory and the workload is read-only” – LMDB beats others for DBs much larger than memory, on read/write workloads http://www.lmdb.tech/bench/ondisk/
- “You need to rush a product to the market and do not care about data consistency or long-term engineering headaches.” LMDB is an open source project; we do not care about the market because we do not have to pay back any vulture capitalists. We took the time to do things right – which actually was a lot faster than doing things the traditional way.
Aside from the tremendous advantage in simplicity and robustness LMDB enjoys, there are other benefits not even touched on here, such as the fact that mmap only uses the filesystem page cache, which means you can easily support multi-process concurrency, as well as multi-thread concurrency, without any additional memory overhead. No other DBMS engine can do that.
For as long as operating systems and database management systems have existed, there has been a rivalry between OS and DBMS developers. The DBMS guys always claim that because they have more intimate knowledge of the intricacies of the application workload, they can fine tune to deliver better performance. But all of that fine tuning comes at a tremendous cost in complexity, and the reality is, on a multiuser machine, they’re dead wrong. Even on a dedicated single-user machine, it is more complex and expensive for an application to collect all of the measurements and statistics needed to properly profile its workload than it is to gather that information inside the kernel. But on a multiuser machine, where the DBMS shares the machine with other processes and other applications, it is impossible. No single process can obtain an accurate view of all system resource usage and demands; in fact it is the OS’s job to hide such details from the application level. When you’re sharing a machine with multiple other tasks, only the OS can ever truly know what is going on in the I/O subsystem, in memory pressure, etc.
The DBMS folks claim that knowing the intimate details of the workload can allow them to do more efficient caching. With a great deal of work that could be true on a dedicated machine, but on a shared machine, where all of your carefully managed buffers could get paged out at any time to satisfy other demands, the proposition is ludicrous. Also, there really aren’t a lot of ways to beat a Least Recently Used (LRU) caching strategy. There are more efficient implementations (like CLOCK) but the overall strategy remains the same. And again LMDB leverages all of that with zero additional effort, because its B+tree design is naturally optimal with an LRU cache. Even when used with multiple tables, for separate indices and other metadata, because LMDB handles multiple tables as a tree of trees, the application does not need to care about which tables are used more frequently than others. They all start from the root of the LMDB B+tree, and an LRU mechanism will naturally sort their accesses out in order of recency.
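The interaction between a B+tree and an LRU cache can be illustrated with a toy simulation. This is only a sketch under stated assumptions: a three-level tree (one root, 16 branch pages, 256 leaf pages), uniformly random lookups, and a 64-page cache. The names `LRUCache` and `simulate` are hypothetical, and the model reflects neither LMDB's actual page layout nor the kernel's real page replacement. It only shows that pages nearer the root are touched on more lookups and therefore stay resident without any tuning.

```python
from collections import OrderedDict
import random


class LRUCache:
    """Tiny LRU page cache that counts hits and misses per tree level."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # key order doubles as recency order
        self.hits = {}
        self.misses = {}

    def touch(self, page):
        level = page[0]
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            self.hits[level] = self.hits.get(level, 0) + 1
        else:
            self.misses[level] = self.misses.get(level, 0) + 1
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used

    def hit_rate(self, level):
        hits = self.hits.get(level, 0)
        total = hits + self.misses.get(level, 0)
        return hits / total if total else 0.0


def simulate(lookups=10000, fanout=16, capacity=64, seed=1):
    """Run random key lookups; each one walks root -> branch -> leaf."""
    rng = random.Random(seed)
    cache = LRUCache(capacity)
    for _ in range(lookups):
        leaf = rng.randrange(fanout * fanout)
        cache.touch(("root", 0))
        cache.touch(("branch", leaf // fanout))
        cache.touch(("leaf", leaf))
    return cache


if __name__ == "__main__":
    cache = simulate()
    for level in ("root", "branch", "leaf"):
        print(level, round(cache.hit_rate(level), 3))
```

In this model the root page misses exactly once, on the first lookup, because every later lookup touches it again before it can age out. Branch pages hit more often than leaf pages for the same reason, even though nothing in the cache knows anything about the tree structure.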
Ultimately, the answer to the question “are you sure you want to use mmap in your DBMS?” should be rephrased – do you really want to reimplement everything the OS already does for you? Do you really believe you can do it correctly, better than the OS already does? The DBMS world is littered with projects whose authors believed, incorrectly, that they could.