A performance review of io_uring vs. epoll for standard/streamed socket traffic

samsquire · on Nov 6, 2022

This is interesting. Thank you for this.

I wrote an epoll echo server which multiplexes multiple clients over each thread. The idea is that each core can scale the number of clients it serves.

It's kind of similar to libuv. I use IO threads to handle IO. It's incomplete, but it works as a proof of concept.

https://GitHub.com/samsquire/epoll-server

It is based on a multi-producer, multi-consumer ring buffer by Alexander Krizhanovsky.

https://www.linuxjournal.com/content/lock-free-multi-produce...
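To make the per-thread multiplexing idea concrete, here is a rough sketch. It is not the code from the repository above and it leaves out the ring buffer entirely. It assumes Python's `selectors` module (which uses epoll on Linux), and the names `io_thread`, `echo` and `start_server` are made up for illustration. Each IO thread owns its own selector and serves every client it happens to accept.

```python
import selectors
import socket
import threading

READ, WRITE = selectors.EVENT_READ, selectors.EVENT_WRITE

def echo(sel, key, events):
    """Read what is available and write back as much as the socket accepts."""
    conn, pending = key.fileobj, key.data  # pending: bytearray of unsent bytes
    closed = False
    if events & READ:
        try:
            chunk = conn.recv(4096)
            if chunk:
                pending += chunk  # in-place, so key.data sees it too
            else:
                closed = True     # peer closed the connection
        except BlockingIOError:
            pass
        except ConnectionError:
            closed = True
    if pending and not closed:
        try:
            sent = conn.send(pending)
            del pending[:sent]
        except BlockingIOError:
            pass
        except ConnectionError:
            closed = True
    if closed:
        sel.unregister(conn)
        conn.close()
        return
    wanted = READ | (WRITE if pending else 0)
    if wanted != key.events:
        sel.modify(conn, wanted, data=pending)

def io_thread(listener, stop):
    """One IO thread: a private selector multiplexes all of its clients."""
    sel = selectors.DefaultSelector()  # epoll-backed on Linux
    sel.register(listener, READ, data=None)
    while not stop.is_set():
        for key, events in sel.select(timeout=0.1):
            if key.data is None:  # the shared listening socket
                try:
                    conn, _addr = listener.accept()
                except BlockingIOError:
                    continue      # another IO thread accepted this client
                conn.setblocking(False)
                sel.register(conn, READ, data=bytearray())
            else:
                echo(sel, key, events)
    for key in list(sel.get_map().values()):
        if key.data is not None:
            key.fileobj.close()
    sel.close()

def start_server(n_threads=2, host="127.0.0.1", port=0):
    """Start n_threads IO threads that share one non-blocking listener."""
    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    stop = threading.Event()
    threads = [threading.Thread(target=io_thread, args=(listener, stop), daemon=True)
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    return listener.getsockname()[1], stop, threads, listener
```

Sharing one listener across threads means several threads can wake up for the same connection, which is why a failed `accept` is simply ignored. Python's GIL also means this only shows the structure, not the per-core scaling.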

I also wrote a userspace 1:M:N lightweight thread scheduler, which should be integrated with the epoll server. This is an alternative to coroutines. I multiplex multiple lightweight threads on a kernel thread and switch between them quickly. The scheduler thread preempts hot for and while loops by setting the loop variable to its limit, so the loop exits as soon as the code finishes the current iteration. This is why I call it userspace preemption.

https://GitHub.com/samsquire/preemptible-thread
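The loop-variable trick can be sketched roughly as follows. This is not the code from the repository above. The names `PreemptibleLoop` and `run_all` are hypothetical, and the lock around the loop variable is my own addition so that a preemption cannot be lost to a concurrent increment. The scheduler thread overwrites the shared loop variable with the limit, the loop exits after its current iteration, and the next iteration to run is remembered so the loop can resume later.

```python
import threading
from collections import deque

class PreemptibleLoop:
    """A `for i in range(limit)` loop whose index the scheduler can overwrite."""

    def __init__(self, limit, body):
        self.limit = limit
        self.body = body
        self.index = 0       # shared loop variable
        self.resume_at = 0   # first iteration that has not run yet
        self.lock = threading.Lock()

    def preempt(self):
        """Scheduler side: set the loop variable to the limit."""
        with self.lock:
            self.index = self.limit

    def run_slice(self):
        """Run until finished or preempted. Returns True once the loop is done."""
        with self.lock:
            self.index = self.resume_at
        while True:
            with self.lock:
                i = self.index
                if i >= self.limit:
                    break                 # finished, or preempted
            self.body(i)
            with self.lock:
                self.resume_at = i + 1
                if self.index < self.limit:
                    self.index = i + 1    # normal i++ when not preempted
        return self.resume_at >= self.limit

def run_all(loops, timeslice=0.001):
    """Round-robin the loops on this thread; a scheduler thread preempts them."""
    pending = deque(loops)
    current = [None]
    finished = threading.Event()

    def scheduler():
        # Wake up once per timeslice and preempt whatever is running.
        while not finished.wait(timeslice):
            loop = current[0]
            if loop is not None:
                loop.preempt()

    sched = threading.Thread(target=scheduler)
    sched.start()
    slices = 0
    while pending:
        loop = pending.popleft()
        current[0] = loop
        slices += 1
        if not loop.run_slice():
            pending.append(loop)          # preempted: requeue for later
    finished.set()
    sched.join()
    return slices
```

Each loop body still runs every iteration exactly once and in order. Preemption only changes how the iterations are interleaved across loops.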

One idea I have for even higher performance is to split sending and receiving to their own threads and multiplex sending and receiving across threads. This means you can scale sending and receiving.
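A minimal sketch of that split, assuming a single connection and a plain `queue.Queue` as the handoff between the two sides (the names `receiver`, `sender` and `start_echo` are made up for illustration). One thread only receives, the other only sends. A fuller version would multiplex many sockets inside each thread.

```python
import queue
import socket
import threading

def receiver(conn, outbox):
    """Receive-only thread: read from the socket and hand data to the sender."""
    while True:
        try:
            data = conn.recv(4096)
        except ConnectionError:
            data = b""
        if not data:
            outbox.put(None)  # tell the sender that the peer is gone
            return
        outbox.put(data)

def sender(conn, outbox):
    """Send-only thread: write whatever the receiver queued up."""
    while True:
        data = outbox.get()
        if data is None:
            return
        try:
            conn.sendall(data)
        except ConnectionError:
            return

def start_echo(conn):
    """Start one receiver and one sender thread for an echo connection."""
    outbox = queue.Queue()
    threads = [
        threading.Thread(target=receiver, args=(conn, outbox), daemon=True),
        threading.Thread(target=sender, args=(conn, outbox), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

Because the two directions share nothing except the queue, the number of sending threads and receiving threads could be tuned independently.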

I want to add io_uring; maybe I can learn it from this repository.

gavinray · on Nov 6, 2022

Benchmark code is linked at the footer of the parent comment, but for ease of access, the URL to it is:

https://github.com/alibaba/PhotonLibOS/blob/main/examples/pe...

beef9999 · on Nov 8, 2022