A very in-depth article from our friends at ScyllaDB about how they fought an I/O latency bug in the Linux CFQ scheduler.
I remember struggling, about 4 years ago, to understand how a select on a non-blocking socket could take up to 15 seconds to return on AWS. My issue was not related to the scheduler, but the quest and persistence of the ScyllaDB team is just another reminder that there are people who really care. That is something I miss a lot nowadays, since I deal with big corporations and men in suits™.
The article itself is a very informative read, but I personally was completely unaware of the existence of eBPF filters and the awesome BPF framework. Next time I am stuck with something weird in the kernel, I know which gun to bring to the party.
Hats off to Glauber - my favorite Brazilian guy since 2008.