Discussion:
[asio-users] [strand] Bug: Handlers execute on the wrong strand.
Greg Barron
2013-10-24 01:37:25 UTC
Hi All,

I would like to refer you to this boost::asio bug report:
https://svn.boost.org/trac/boost/ticket/9203

I won't paste the full contents of the bug report into this message.
The executive summary is that the allocation algorithm behind the
implementation of asio::strand can, and does, hand the implementation
of an existing strand to strands that are subsequently allocated.

This can have the effect that handlers wrapped by strand (b) will
actually be executed on strand (a). If these handlers are related in
some way (for example via mutexes) and must not be executed on the
same strand if deadlocks are to be avoided, then there is trouble!
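
If it helps, here is a minimal sketch of the symptom (my own, not the
reproducer attached to the ticket). It creates enough strands that two of
them are likely to be handed the same pooled implementation, so a handler
posted to one strand can end up waiting behind a blocking handler posted
to a completely different strand:

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <cstddef>
#include <iostream>
#include <vector>

typedef boost::asio::io_service::strand strand_type;

// A deliberately long-running handler on one strand.
void blocker()
{
    boost::this_thread::sleep(boost::posix_time::seconds(2));
}

void canary() { std::cout << "canary ran\n"; }

int main()
{
    boost::asio::io_service io;

    // Many strands make a pooled-implementation collision likely.
    std::vector<boost::shared_ptr<strand_type> > strands;
    for (int i = 0; i < 1000; ++i)
        strands.push_back(boost::shared_ptr<strand_type>(new strand_type(io)));

    strands[0]->post(&blocker);

    // Any strands[j] that shares strands[0]'s implementation will not
    // run its canary until blocker() returns, despite idle threads.
    for (std::size_t j = 1; j < strands.size(); ++j)
        strands[j]->post(&canary);

    boost::thread_group pool;
    for (int i = 0; i < 4; ++i)
        pool.create_thread(boost::bind(&boost::asio::io_service::run, &io));
    pool.join_all();
    return 0;
}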

Note that this is a change in behaviour from earlier versions of asio.
The behaviour changed with version 1.4.4.

I consider this to be a bug but if there is something that I'm
misunderstanding about the operation of strands it would be great if
you could help me clear it up.

I've implemented a workaround which could only be described as a hack,
and I would be embarrassed to show it to anybody. But it gets me over
my hump :).

Thanks!
--
Greg
Roger Austin (Australia)
2013-10-24 03:41:04 UTC
Hi Greg,

It may be a change of behaviour and a problem for you, but strictly speaking I don't think it is a bug. The strand concept guarantees that handlers wrapped on the _same_ strand will _not_ execute concurrently. It does not guarantee that handlers wrapped on _different_ strands _will_ be allowed to execute concurrently.

Clearly it would be nice to think that if a handler blocks one thread in a multi-threaded io_service, all the other handlers will eventually be executed on one of the other threads, but that is not the case in the current version of asio. The fact that two strands sometimes share the same implementation does not violate the strand guarantee. I would guess the reason it is done like this is because the alternative does not scale well; if every strand has its own implementation then you are continually creating and destroying mutexes and might also hit OS constraints on the number of mutexes if the number of strands is large.
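
Roughly, the scheme looks like this (an illustrative sketch of the idea
only, not asio's actual code; the pool size of 193 comes up later in this
thread). Every strand object is mapped onto one slot in a fixed pool of
shared implementations, so creating strands never creates more than a
bounded number of mutexes, at the price that two distinct strands can land
in the same slot:

#include <boost/thread/mutex.hpp>
#include <cstddef>

// Sketch of a pooled strand implementation scheme.
struct strand_impl
{
    boost::mutex mutex;
    // ... queue of pending handlers, "locked" flag, etc. ...
};

enum { num_implementations = 193 };
static strand_impl* implementations[num_implementations];

strand_impl* allocate_impl(const void* strand_address)
{
    // Hash the strand object's address into a fixed-size table (the real
    // library also salts the hash, and guards this table with a lock).
    // Either way, a collision means two strands share one implementation.
    std::size_t index =
        reinterpret_cast<std::size_t>(strand_address) % num_implementations;
    if (!implementations[index])                   // created once, never
        implementations[index] = new strand_impl;  // destroyed: no mutex churn
    return implementations[index];
}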

The asio philosophy (as I understand it) is that blocking or long-running activities should not be carried out in handlers, but should be handed off to worker threads, in order to keep the io part of the system responsive. I will admit I don't always follow this advice but it has yet to be a problem for me.
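
A minimal sketch of that hand-off pattern, with invented names and a second
io_service standing in for the worker pool:

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/thread.hpp>
#include <iostream>

void deliver_result(boost::asio::io_service* io)
{
    // Back on the I/O side: write the reply, then end the demo.
    std::cout << "result delivered on the I/O side\n";
    io->stop();
}

void do_blocking_work(boost::asio::io_service* io)
{
    // Runs on the worker thread, so it is free to block.
    boost::this_thread::sleep(boost::posix_time::seconds(1));
    io->post(boost::bind(&deliver_result, io)); // hop back when done
}

void on_request(boost::asio::io_service* io,
                boost::asio::io_service* workers)
{
    // Runs on an I/O thread: hand the slow part off instead of blocking.
    workers->post(boost::bind(&do_blocking_work, io));
}

int main()
{
    boost::asio::io_service io, workers;
    boost::asio::io_service::work io_work(io), worker_work(workers);
    boost::thread worker(boost::bind(&boost::asio::io_service::run, &workers));

    io.post(boost::bind(&on_request, &io, &workers));
    io.run();       // returns when deliver_result() calls io.stop()
    workers.stop();
    worker.join();
    return 0;
}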

Cheers,
Roger

Marat Abrarov
2013-10-24 05:54:04 UTC
Post by Roger Austin (Australia)
It may be a change of behaviour and a problem for you, but strictly
speaking I don't think it is a bug. The strand concept guarantees that
handlers wrapped on the _same_ strand will _not_ execute concurrently. It
does not guarantee that handlers wrapped on _different_ strands _will_ be
allowed to execute concurrently.
I think this has to be noted in documentation to prevent misunderstanding.
Post by Roger Austin (Australia)
I would guess the reason it is done like this is because the alternative
does not scale well; if every strand has its own implementation then you
are continually creating and destroying mutexes and might also hit OS
constraints on the number of mutexes if the number of strands is large.
Yes. This is the only reason. It was discussed on this mailing list some
years ago.

The best solution I see is to add an additional API for creating exclusive
strands, something like an optional parameter on the strand constructor:

asio::io_service::strand(bool exclusive = false) {...}
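
Hypothetical usage would then be (to be clear, this second parameter does
not exist in asio today):

boost::asio::io_service::strand pooled(io);          // today's behaviour
boost::asio::io_service::strand exclusive(io, true); // own private implementation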

Regards,
Marat Abrarov.
Greg Barron
2013-10-25 02:46:10 UTC
Hi All,

Thanks for your responses. I wish I had posted here earlier rather than to
the boost lists.
Post by Marat Abrarov
Post by Roger Austin (Australia)
It may be a change of behaviour and a problem for you, but strictly
speaking I don't think it is a bug. The strand concept guarantees that
handlers wrapped on the _same_ strand will _not_ execute concurrently. It
does not guarantee that handlers wrapped on _different_ strands _will_ be
allowed to execute concurrently.
I think this has to be noted in documentation to prevent misunderstanding.
Yes, I agree with this. When I realised what was happening I looked at the
documentation very closely for any hints as to whether the behaviour was
expected. The documentation doesn't fully explain the current operation:

"An boost::asio::strand guarantees that, for those handlers that are
dispatched through it, an executing handler will be allowed to complete
before the next one is started. This is guaranteed irrespective of the
number of threads that are calling
io_service::run()<http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference/io_service/run.html>.
Of course, the handlers may still execute concurrently with other handlers
that were not dispatched through an boost::asio::strand, or were dispatched
through a different boost::asio::strand object."

The following snippet is true: "Of course, the handlers may still execute
concurrently with other handlers that were... ... dispatched through a
different boost::asio::strand object". What it doesn't say is that handlers
may execute sequentially with handlers dispatched through a different
boost::asio::strand object. There are no guarantees.
Greg Barron
2013-10-25 03:00:21 UTC
On 24 October 2013 14:41, Roger Austin (Australia) wrote:
Post by Roger Austin (Australia)
Hi Greg,
It may be a change of behaviour and a problem for you, but strictly
speaking I don't think it is a bug. The strand concept guarantees that
handlers wrapped on the _same_ strand will _not_ execute concurrently. It
does not guarantee that handlers wrapped on _different_ strands _will_ be
allowed to execute concurrently.
Clearly it would be nice to think that if a handler blocks one thread in a
multi-threaded io_service, all the other handlers will eventually be
executed on one of the other threads, but that is not the case in the
current version of asio. The fact that two strands sometimes share the same
implementation does not violate the strand guarantee. I would guess the
reason it is done like this is because the alternative does not scale well;
if every strand has its own implementation then you are continually
creating and destroying mutexes and might also hit OS constraints on the
number of mutexes if the number of strands is large.
Our particular situation is a bit unusual. We have auto-generated code
(produced by gsoap) sending soap messages inside a strand. This code uses
raw sockets and blocks waiting for a response to the soap message. In the
same process we have a soap server creating a connection for each received
message. This connection uses a strand to synchronise the async_read and
async_write calls. Regularly the connection's strand is assigned the same
implementation as the strand used by the soap sender. Hence the async_read
handlers never run, so the sender never gets its response. The sender
eventually gives up and the reads happen, but by then it's all too late.
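
In rough code terms the shape is something like this (all names invented,
gsoap details elided):

#include <boost/asio.hpp>
#include <cstddef>

// The sender blocks inside its strand waiting for a soap reply.
void send_soap_and_block_for_reply() { /* gsoap call on a raw socket */ }

void on_read(const boost::system::error_code&, std::size_t) {}

void demo(boost::asio::ip::tcp::socket& socket,
          boost::asio::streambuf& buf,
          boost::asio::io_service::strand& sender_strand,
          boost::asio::io_service::strand& connection_strand)
{
    sender_strand.post(&send_soap_and_block_for_reply);

    // If connection_strand was handed the same pooled implementation as
    // sender_strand, this completion handler sits behind the blocking
    // send above, and the reply never arrives in time.
    boost::asio::async_read(socket, buf,
        connection_strand.wrap(&on_read));
}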

Post by Roger Austin (Australia)
The asio philosophy (as I understand it) is that blocking or long-running
activities should not be carried out in handlers, but should be handed off
to worker threads, in order to keep the io part of the system responsive. I
will admit I don't always follow this advice but it has yet to be a problem
for me.
I've got no doubt we can rework our code to work successfully with the
current operation of strand. I wonder how much code out there is hitting
similar issues without even realising it!

Thanking you.

Cheers,
--
Greg.
Igor R
2013-10-24 05:21:18 UTC
Post by Greg Barron
https://svn.boost.org/trac/boost/ticket/9203
Does it affect the implicit io_service strand? In other words, if 2
io_service's are running in 2 different threads (like in
io_service-per-core approach), is it still safe to assume that their
completion handlers will never "share" the same thread?
Marat Abrarov
2013-10-24 06:13:43 UTC
Post by Igor R
Does it affect the implicit io_service strand? In other words, if 2
io_service's are running in 2 different threads (like in io_service-per-
core approach), is it still safe to assume that their completion handlers
will never "share" the same thread?
For now (Boost version <= 1.54) it is implemented so that strands belonging
to different io_services don't share their implementations (each instance
of io_service creates its own set of shared strand implementations). This
is good enough, because it is the only way to create "independent" strands.

As for sharing the same thread with the "implicit io_service strand": the
io_service guarantees that handlers will not be executed on any threads
other than those the user specifies (by calling
io_service::run/run_one/poll/poll_one, and ~io_service() for the
destructors of handlers).

http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/overview/core/threads.html
http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference/io_service.html#boost_asio.reference.io_service.synchronous_and_asynchronous_operations
http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference/io_service/_io_service.html
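
A sketch of that io_service-per-core layout (sizes and names are mine):
since each io_service is driven by exactly one thread, its handlers can
never run on any other thread, whatever the strand pooling does.

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/make_shared.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <cstddef>
#include <vector>

int main()
{
    std::size_t n = boost::thread::hardware_concurrency();
    if (n == 0) n = 2; // hardware_concurrency() may return 0

    std::vector<boost::shared_ptr<boost::asio::io_service> > services;
    std::vector<boost::shared_ptr<boost::asio::io_service::work> > work;
    boost::thread_group threads;

    for (std::size_t i = 0; i < n; ++i)
    {
        services.push_back(boost::make_shared<boost::asio::io_service>());
        work.push_back(boost::make_shared<boost::asio::io_service::work>(
            boost::ref(*services[i])));
        // One dedicated thread per io_service: handlers queued on
        // services[i] only ever run on this thread.
        threads.create_thread(boost::bind(
            &boost::asio::io_service::run, services[i].get()));
    }

    // ... assign each new socket to one of the services, round-robin ...

    for (std::size_t i = 0; i < n; ++i)
        services[i]->stop();
    threads.join_all();
    return 0;
}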

Regards,
Marat Abrarov.
Vinnie Falco
2013-10-25 03:23:53 UTC
Post by Marat Abrarov
I think this has to be noted in documentation to prevent misunderstanding.
The documentation needs to state this clearly in some manner in both the
tutorial and reference sections.
I am interested in this subject as well. Upon inspection of the 1.54
boost code for strand, I find:

// Number of implementations shared between all strand objects.
enum { num_implementations = 193 };

Why the number 193? This means that a server which processes SSL
connections will have at most 193 distinct strands. I don't know what
the implications are, but it is surprising to see such a small number,
and one that is small regardless of platform or environment.

What about someone who is implementing some sort of peer-to-peer
server? For example, a Bittorrent client, Bitcoin daemon, or Ripple
payment server? If the peer connections use SSL (which most of them
do) then any broadcast of messages to all connected peers over
ssl-enabled sockets (a common use case for p2p apps) will have to be
accomplished by posting a handler to each connection's strand, in
order to protect the socket. There's no public interface for manually
acquiring the mutex associated with a strand. The requirement to post
a handler for each connection instead of simply acquiring a lock is
onerous and inefficient. Or am I missing something?
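
Concretely, the broadcast I have in mind looks roughly like this (types
and names invented):

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/shared_ptr.hpp>
#include <cstddef>
#include <string>
#include <vector>

struct connection
{
    explicit connection(boost::asio::io_service& io) : strand(io) {}

    boost::asio::io_service::strand strand;
    // ssl stream, write queue, etc. omitted

    void do_send(std::string msg)
    {
        // Safe to touch the ssl stream here: the strand serialises this
        // against the connection's own read/write completion handlers.
    }
};

void broadcast(std::vector<boost::shared_ptr<connection> >& peers,
               const std::string& msg)
{
    // One posted handler per peer; there is no way to simply lock the
    // strand's mutex and write directly.
    for (std::size_t j = 0; j < peers.size(); ++j)
        peers[j]->strand.post(
            boost::bind(&connection::do_send, peers[j], msg));
}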

Thanks
--
Follow me on Github: https://github.com/vinniefalco
vf
2013-10-25 05:50:17 UTC
Post by Vinnie Falco
Why the number 193? This means that a server which processes SSL
connections will have at most 193 distinct strands. I don't know what
the implications are
No, the number of strands is not limited to 193, but the number of
simultaneous async operations (and, hence, threads) is.
Considering that the typical number of cores on an average machine rarely
exceeds 16, 193 threads per service ought to be enough for everyone,
provided handlers *do not block*.
