Discussion:
[asio-users] [proposal] Multi-threaded polling
Tatsuyuki Ishi
2016-12-09 09:59:56 UTC
Relying on too many mutexes isn't a good idea. Some atomic mechanisms should
be involved.

1. epoll, kqueue, IOCP, and /dev/poll are all edge-triggered, and AFAIK they
are thread-safe.
Side notes:
Time to ditch select(). Who uses it?
Switching to GetQueuedCompletionStatusEx allows us to use the same scheme
for UNIX and Windows (IOCP). It's more important to reduce system calls
than to reduce mutex locks. XP? Nobody updates software on an unsupported
OS.

2. Switch to atomic MPMC queue.
This one <https://github.com/cameron314/concurrentqueue> seems handy.

3. Polling method: poll a bunch, pick one, push the others to queue.
Order: poll -> queue -> block
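
A minimal sketch of item 3, under one reading of "poll -> queue -> block": each worker first tries a non-blocking poll, then the shared queue, and only then blocks in the poller. `Event`, `PollFn`, `SharedQueue` and `next_event` are hypothetical names and not part of Asio, and the queue is mutex-based only to keep the sketch short; the proposal would put an atomic MPMC queue in its place.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical event type; a real reactor would carry a handler or descriptor.
using Event = int;

// Hypothetical poll hook: returns the ready events and blocks only when
// `block` is true. It stands in for epoll_wait, kevent and friends.
using PollFn = std::function<std::vector<Event>(bool block)>;

// Stand-in for the shared work queue (mutex-based here for brevity).
class SharedQueue {
 public:
  void push(Event e) {
    std::lock_guard<std::mutex> lock(mutex_);
    events_.push_back(e);
  }

  bool try_pop(Event& out) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (events_.empty()) return false;
    out = events_.front();
    events_.pop_front();
    return true;
  }

 private:
  std::mutex mutex_;
  std::deque<Event> events_;
};

// One scheduling step for a worker thread. Returns false when even the
// blocking poll produced nothing (for example, a wake-up for shutdown).
bool next_event(SharedQueue& queue, const PollFn& poll, Event& out) {
  assert(static_cast<bool>(poll));
  std::vector<Event> batch = poll(false);  // 1. non-blocking poll
  if (batch.empty()) {
    if (queue.try_pop(out)) return true;   // 2. shared queue
    batch = poll(true);                    // 3. blocking poll
    if (batch.empty()) return false;
  }
  // Keep one event for this thread and publish the rest for other workers.
  for (std::size_t i = 1; i < batch.size(); ++i) queue.push(batch[i]);
  out = batch.front();
  return true;
}
```

A worker thread would call `next_event` in a loop and run the handler for each event it gets back.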

Conclusion:
Most system calls synchronize at kernel level, which is way cheaper
(they cheat using privileged instructions). A global lock is not a good
idea on a highly concurrent system, and atomic instructions are better
anyway. We should minimize userland mutex usage and enable parallel polling
instead.
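
To make "atomic MPMC queue" concrete, here is a minimal sketch of a bounded multi-producer/multi-consumer queue built only on `std::atomic`. It follows the per-cell sequence-number design popularized by Dmitry Vyukov; it is not how concurrentqueue or Boost.LockFree is implemented, and the class name `BoundedMpmcQueue` is made up for illustration. It assumes the capacity is a power of two (at least 2) and that `T` is default-constructible.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

template <typename T>
class BoundedMpmcQueue {
 public:
  // Assumption: capacity is a power of two and at least 2.
  explicit BoundedMpmcQueue(std::size_t capacity)
      : cells_(new Cell[capacity]), mask_(capacity - 1) {
    assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
    for (std::size_t i = 0; i < capacity; ++i)
      cells_[i].seq.store(i, std::memory_order_relaxed);
  }

  // Returns false if the queue is full.
  bool try_push(T value) {
    std::size_t pos = tail_.load(std::memory_order_relaxed);
    for (;;) {
      Cell& cell = cells_[pos & mask_];
      std::size_t seq = cell.seq.load(std::memory_order_acquire);
      std::intptr_t diff = static_cast<std::intptr_t>(seq) -
                           static_cast<std::intptr_t>(pos);
      if (diff == 0) {
        // Cell is free for this position; try to claim it.
        if (tail_.compare_exchange_weak(pos, pos + 1,
                                        std::memory_order_relaxed)) {
          cell.value = std::move(value);
          cell.seq.store(pos + 1, std::memory_order_release);
          return true;
        }
        // On failure, pos now holds the current tail; retry.
      } else if (diff < 0) {
        return false;  // cell still holds an unconsumed value: full
      } else {
        pos = tail_.load(std::memory_order_relaxed);  // lost a race
      }
    }
  }

  // Returns false if the queue is empty.
  bool try_pop(T& out) {
    std::size_t pos = head_.load(std::memory_order_relaxed);
    for (;;) {
      Cell& cell = cells_[pos & mask_];
      std::size_t seq = cell.seq.load(std::memory_order_acquire);
      std::intptr_t diff = static_cast<std::intptr_t>(seq) -
                           static_cast<std::intptr_t>(pos + 1);
      if (diff == 0) {
        // Cell holds the value for this position; try to claim it.
        if (head_.compare_exchange_weak(pos, pos + 1,
                                        std::memory_order_relaxed)) {
          out = std::move(cell.value);
          // Mark the cell free for the producer one lap ahead.
          cell.seq.store(pos + mask_ + 1, std::memory_order_release);
          return true;
        }
      } else if (diff < 0) {
        return false;  // nothing published at this position yet: empty
      } else {
        pos = head_.load(std::memory_order_relaxed);  // lost a race
      }
    }
  }

 private:
  struct Cell {
    std::atomic<std::size_t> seq;
    T value;
  };

  std::unique_ptr<Cell[]> cells_;
  const std::size_t mask_;
  std::atomic<std::size_t> head_{0};
  std::atomic<std::size_t> tail_{0};
};
```

The queue never takes a mutex: producers and consumers each claim a slot with one compare-and-swap and then publish through the slot's sequence number. The fixed capacity is a limitation of this sketch, not a property of the queues discussed in this thread.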
Marat Abrarov
2016-12-09 10:44:22 UTC
Hi Tatsuyuki Ishi,
Post by Tatsuyuki Ishi
Time to ditch select(). Who uses it?
Linux Kernel 2.4
QNX Neutrino
AIX
HP-UX
Tru64
Post by Tatsuyuki Ishi
XP? Nobody updates software on an unsupported OS
I'd like to keep support for Windows XP for sure. Boost.Asio is used in the real world (in production), where outdated OSes are a reality. It's again a kind of tradeoff, but I don't see much benefit in dropping support for Windows XP (and other OSes which are no longer supported by their vendors).
Post by Tatsuyuki Ishi
Switching to GetQueuedCompletionStatusEx allows us to use the same scheme for UNIX and Windows (IOCP).
Refer to https://sourceforge.net/p/asio/mailman/message/28396716. I believe GetQueuedCompletionStatus helps the OS manage IOCP-bound threads better.
Post by Tatsuyuki Ishi
Switch to atomic MPMC queue.
It would be great. Just note that lock-free algorithms are hard to prove correct. I'd prefer to use something from Boost.LockFree (http://www.boost.org/doc/libs/1_62_0/doc/html/lockfree.html), but it has no MPMC queue (why? Could someone start by contributing an MPMC queue to Boost.LockFree?)

Regards,
Marat Abrarov (https://github.com/mabrarov/asio_samples).




------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today.http://sdm.link/xeonphi
_______________________________________________
asio-users mailing list
asio-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/asio-users
_______________________________________________
Using Asio? List your project at
http://think-async.com/Asio/WhoIsUsingAsio
Bjorn Reese
2016-12-09 11:20:52 UTC
Post by Marat Abrarov
It would be great. Just note that lock-free algorithms are hard to prove correct. I'd prefer to use something from Boost.LockFree (http://www.boost.org/doc/libs/1_62_0/doc/html/lockfree.html), but it has no MPMC queue (why? Could someone start by contributing an MPMC queue to Boost.LockFree?)
boost::lockfree::queue is MPMC.


Tatsuyuki Ishi
2016-12-09 13:17:24 UTC
Hi Marat Abrarov,
Post by Marat Abrarov
Linux Kernel 2.4
QNX Neutrino
AIX
HP-UX
Tru64
2.4 is EOL, so we don't care. AIX, HP-UX and Tru64 all have eventport or their
own implementation of edge-triggering, so select is not a good idea. I
have no idea about QNX, since a simple search turns up few resources.
Post by Marat Abrarov
I'd like to keep support of Windows XP for sure. Boost.Asio is used in
real world (in production) where outdated OS is reality. It's again a kind
of tradeoff, but I don't find much profit from dropping support of Windows
XP (and other OSs which are no longer supported by their vendors).
I don't believe people running an outdated OS update their software; since this
is a new release, it's safe to drop legacy support. By the way, Boost has
broken ABI many times, so changing things is not a problem.
Post by Marat Abrarov
I believe GetQueuedCompletionStatus helps OS to manage IOCP-bound threads
in a better way.
My answer is no. Golang and libuv use the Ex version (basically they
have the same process for Windows and UNIX polling), and a mutex (or even an
atomic queue) is way cheaper than making lots of syscalls to get a new job.
Post by Marat Abrarov
I'd prefer to use smth from Boost.LockFree
The biggest concern is that LockFree has a crappy memory concept. The one I
mentioned has 1000+ stars, proving its large user base and stability.

Best Regards,
Tatsuyuki Ishi
Marat Abrarov
2016-12-09 13:37:06 UTC
Post by Marat Abrarov
I believe GetQueuedCompletionStatus helps OS to manage IOCP-bound threads in a better way.
Post by Tatsuyuki Ishi
My answer is no. Golang and libuv use the Ex version (basically they have the same
process for Windows and UNIX polling), and a mutex (or even an atomic queue) is way cheaper
than making lots of syscalls to get a new job.
Do you have any reference I could read about this (I'm really interested)? I only know about the special scheduling algorithm (FIFO) that Windows applies to threads bound to an IOCP (any thread which has used a particular IOCP instance). I believe that a mutex will decrease performance in the "single io_service instance, multiple threads (one per logical CPU) using that io_service instance" scenario.

Regards,
Marat Abrarov.




Marat Abrarov
2016-12-09 14:08:59 UTC
Post by Marat Abrarov
I know just about special scheduling algorithm (FIFO)
Windows applies to threads bound to IOCP (any thread
which used particular instance of IOCP)
It should be LIFO
(https://msdn.microsoft.com/en-us/library/windows/desktop/aa365198(v=vs.85).aspx):

A thread (either one created by the main thread or the main thread itself)
uses the GetQueuedCompletionStatus function to wait for a completion packet
to be queued to the I/O completion port, rather than waiting directly for
the asynchronous I/O to complete. Threads that block their execution on an
I/O completion port are released in last-in-first-out (LIFO) order, and the
next completion packet is pulled from the I/O completion port's FIFO queue
for that thread. This means that, when a completion packet is released to a
thread, the system releases the last (most recent) thread associated with
that port, passing it the completion information for the oldest I/O
completion.

...

The most efficient scenario occurs when there are completion packets waiting
in the queue, but no waits can be satisfied because the port has reached its
concurrency limit. Consider what happens with a concurrency value of one and
multiple threads waiting in the GetQueuedCompletionStatus function call. In
this case, if the queue always has completion packets waiting, when the
running thread calls GetQueuedCompletionStatus, it will not block execution
because, as mentioned earlier, the thread queue is LIFO. Instead, this
thread will immediately pick up the next queued completion packet. No thread
context switches will occur, because the running thread is continually
picking up completion packets and the other threads are unable to run



Gruenke,Matt
2016-12-11 09:25:19 UTC
As the author of concurrentqueue states (in the README.md):

Note that lock-free programming is a patent minefield, and this code may very well violate a pending patent (I haven't looked), though it does not to my present knowledge. I did design and implement this queue from scratch.

So, even though it uses the Boost license, there still might be reasons to prefer Boost.LockFree’s implementation. Furthermore, we should avoid adding more non-Boost library dependencies to Boost.Asio, and importing this code into Boost.Asio’s repo creates a potential maintenance burden.

If there are issues with Boost.LockFree, those should really be taken up with its maintainer(s). Fixing or improving it would benefit more than just Boost.Asio.


Matt

