~5 Minutes Of Coding Yields A 6%+ Boost To Linux I/O Performance

2024-01-16 11:53:17

LINUX STORAGE

IO_uring creator and Linux block subsystem maintainer Jens Axboe spent about five minutes working on two patches that implement caching for issue-side time querying within the block layer and that can yield 6% or better I/O performance.

Axboe shared his latest interesting Linux I/O performance optimization: “Something I’ve had in the back of my mind for years, and finally did it today. Which is kind of sad, since it was literally a 5 min job, yielding a more than 6% improvement. Would likely be even larger on a full scale distro style kernel config.”

Axboe explained that he typically disables iostats when testing because of the performance overhead of the time querying it does by default. With some basic caching for the issue-side time querying, however, he is seeing around a 6% boost to IOPS, and on a more bloated Linux distribution vendor kernel the gains are likely to be more significant.

Intel Optane storage

He explained in the RFC patch series:

“Querying the current time is the most costly thing we do in the block layer per IO, and depending on kernel config settings, we may do it many times per IO.

None of the callers actually need nsec granularity. Take advantage of that by caching the current time in the plug, with the assumption here being that any time checking will be temporally close enough that the slight loss of precision doesn't matter.

If the block plug gets flushed, eg on preempt or schedule out, then we invalidate the cached clock.

[...] which is more than a 6% improvement in performance. Looking at perf diff, we can see a huge reduction in time overhead:

10.55% -9.88% [kernel.vmlinux] [k] read_tsc
1.31% -1.22% [kernel.vmlinux] [k] ktime_get

Note that since this relies on blk_plug for the caching, it's only applicable to the issue side. But this is where most of the time calls happen anyway. It's also worth noting that the above testing doesn't enable any of the higher cost CPU items on the block layer side, like wbt, cgroups, iocost, etc, which all would add additional time querying. IOW, results would likely look even better in comparison with those enabled, as distros would do.”
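The idea Axboe describes (cache the timestamp in the plug, reuse it for nearby queries, and drop it when the plug is flushed) can be illustrated with a small userspace sketch. This is not the kernel code, which is written in C and is not reproduced in this article; the names `Plug`, `now_ns`, `flush` and `submit_batch` are hypothetical and exist only for illustration, and `time.monotonic_ns` stands in for the kernel's clock read.

```python
import time
from typing import Callable, List, Optional


class Plug:
    """Hypothetical stand-in for a per-task batching context (loosely modeled on the idea of blk_plug)."""

    def __init__(self, clock: Callable[[], int] = time.monotonic_ns) -> None:
        self._clock = clock                     # the "expensive" time source
        self._cached_ns: Optional[int] = None   # None means no valid cached time
        self.clock_reads = 0                    # how often the real clock was queried

    def now_ns(self) -> int:
        """Return the cached timestamp, reading the real clock only on a cache miss."""
        if self._cached_ns is None:
            self._cached_ns = self._clock()
            self.clock_reads += 1
        return self._cached_ns

    def flush(self) -> None:
        """Invalidate the cached time, as the quote says happens on preempt or schedule out."""
        self._cached_ns = None


def submit_batch(plug: Plug, n_requests: int, stamps_per_request: int = 3) -> List[int]:
    """Take several timestamps per request, the way per-IO accounting might (assumed workload)."""
    stamps = []
    for _ in range(n_requests):
        for _ in range(stamps_per_request):
            stamps.append(plug.now_ns())
    return stamps


if __name__ == "__main__":
    plug = Plug()
    submit_batch(plug, n_requests=32)
    print(plug.clock_reads)  # 1: a single real clock read served all 96 timestamps
    plug.flush()
    submit_batch(plug, n_requests=32)
    print(plug.clock_reads)  # 2: the flush forced one fresh read
```

The trade-off is the one stated in the quote: every timestamp taken between two flushes is identical, so precision is lost, but the callers do not need nanosecond granularity and the number of real clock reads drops sharply.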

A nice win, and hopefully this continues to pan out and proves useful for upstreaming with the Linux v6.9 cycle in a few months.


