Discussion:
[asio-users] boost asio: boost::asio::deadline_timer blocking ?
Avi Bahra
2012-11-09 11:07:01 UTC
Permalink
Using boost asio 1.47, suse linux 11.3, gcc 4.5

I am using boost::asio::deadline_timer, where the timer fires every second.
When the timer expires, I check responses from the clients.
I use the same timer to check when I have reach the minute boundary,
so that server can do some other work.

Now 99.9% of the time this works perfectly. However occasionally
the timer does not fire every second, in fact it can delay/block for
several minutes!

This is what I use:

timer_.expires_from_now( boost::posix_time::seconds( 1 ) );
timer_.async_wait( server_->io_service_.wrap( boost::bind(
&NodeTreeTraverser::traverse,this,boost::asio::placeholders::error ) )
);

When I run strace on the process I see:

epoll_wait(4, {{EPOLLIN, {u32=15957244, u64=15957244}}}, 128, 4294967295) = 1
timerfd_settime(5, 0, {it_interval={0, 0}, it_value={16, 944525000}},
{it_interval={16048400, 15957984}, it_value={140733847085872,
6025966}}) = 0
timerfd_settime(5, 0, {it_interval={0, 0}, it_value={0, 999992000}},
{it_interval={15958092, 6011901}, it_value={17195917584,
140733847086024}}) = 0

I have several questions:

o Are there known problems with boost::asio::deadline_timer accuracy
I realise I need soft real time, with one minute resolution.
o Should I have two separate timers, one 1 second & for 1 minute
o Is using timer_.expires_at(...) more reliable, I assume this affects
how it configures timerfd_settime
o Now I suspect this issue may be related to slow disk system
Could a slow disk system affect timerfd_settime system call,
since it is based on file descriptors.
o Are there any well known asio based design patterns/idioms
for ensuring reliable timer resolutions

Any help appreciated.

Ta,
Avi
Gruenke, Matt
2012-11-10 01:49:59 UTC
Permalink
Answers:

1. I doubt this is really an issue is with deadline_timer accuracy.

2. Are you double-scheduling the same timer? Or are you just doing some
extra work, every time a single expiry event crosses a minute boundary?
I wouldn't double-schedule the same timer, if that's what you're doing.

3. The timer may be susceptible to time changes on your system. It
needn't be, but don't assume that means it's not. It might be worth
looking at the implementation, to make sure.

4. Slow disk could be an issue, but only if your io_service thread(s)
are blocking on disk activity (either direct disk access or page
faults). The file descriptors you're talking about have nothing to do
with disk files, in this case.

5. Depends on your accuracy needs and the load on the rest of the
system. On a system under moderate loads, I'd expect no problems using
ASIO to perform processing on a one-second granularity. Just be sure to
compute the next expiry based the current time, rather than sleeping for
1 second. If you want to make this resilient to time change, you can
use a monotonic timebase.


Matt


-----Original Message-----
From: Avi Bahra [mailto:***@gmail.com]
Sent: November 09, 2012 06:07
To: asio-***@lists.sourceforge.net
Subject: [asio-users] boost asio: boost::asio::deadline_timer blocking ?

Using boost asio 1.47, suse linux 11.3, gcc 4.5

I am using boost::asio::deadline_timer, where the timer fires every
second.
When the timer expires, I check responses from the clients.
I use the same timer to check when I have reach the minute boundary, so
that server can do some other work.

Now 99.9% of the time this works perfectly. However occasionally the
timer does not fire every second, in fact it can delay/block for several
minutes!

This is what I use:

timer_.expires_from_now( boost::posix_time::seconds( 1 ) );
timer_.async_wait( server_->io_service_.wrap( boost::bind(
&NodeTreeTraverser::traverse,this,boost::asio::placeholders::error ) )
);

When I run strace on the process I see:

epoll_wait(4, {{EPOLLIN, {u32=15957244, u64=15957244}}}, 128,
4294967295) = 1 timerfd_settime(5, 0, {it_interval={0, 0}, it_value={16,
944525000}}, {it_interval={16048400, 15957984},
it_value={140733847085872,
6025966}}) = 0
timerfd_settime(5, 0, {it_interval={0, 0}, it_value={0, 999992000}},
{it_interval={15958092, 6011901}, it_value={17195917584,
140733847086024}}) = 0

I have several questions:

o Are there known problems with boost::asio::deadline_timer accuracy
I realise I need soft real time, with one minute resolution.
o Should I have two separate timers, one 1 second & for 1 minute
o Is using timer_.expires_at(...) more reliable, I assume this affects
how it configures timerfd_settime
o Now I suspect this issue may be related to slow disk system
Could a slow disk system affect timerfd_settime system call,
since it is based on file descriptors.
o Are there any well known asio based design patterns/idioms
for ensuring reliable timer resolutions

Any help appreciated.

Ta,
Avi

------------------------------------------------------------------------
------
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics Download AppDynamics Lite for
free today:
http://p.sf.net/sfu/appdyn_d2d_nov
_______________________________________________
asio-users mailing list
asio-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/asio-users
_______________________________________________
Using Asio? List your project at
http://think-async.com/Asio/WhoIsUsingAsio
Avi Bahra
2012-11-14 14:18:11 UTC
Permalink
Answers:

1. I doubt this is really an issue is with deadline_timer accuracy.

2. Are you double-scheduling the same timer? Or are you just doing some
extra work, every time a single expiry event crosses a minute boundary?
I wouldn't double-schedule the same timer, if that's what you're doing.
No. I record the next minute boundary, when the timer fires, I check the current
time against the minute boundary. So no double scheduling.
3. The timer may be susceptible to time changes on your system. It
needn't be, but don't assume that means it's not. It might be worth
looking at the implementation, to make sure.
The problem occurs independent of the time change.
4. Slow disk could be an issue, but only if your io_service thread(s)
are blocking on disk activity (either direct disk access or page
faults). The file descriptors you're talking about have nothing to do
with disk files, in this case.
The current server is single threaded, so my this was my first suspect.
I log every client request to a log file. I had suspected that a slow
file system, would block when logging the client request, and affect the timer.
I added extra instrumentation to time the logging. However this was not the cause
5. Depends on your accuracy needs and the load on the rest of the
system. On a system under moderate loads, I'd expect no problems using
ASIO to perform processing on a one-second granularity. Just be sure to
compute the next expiry based the current time, rather than sleeping for
1 second. If you want to make this resilient to time change, you can
use a monotonic timebase.


Matt

================================================================

The client is designed to cancel long running connection to the
server.. ( > 20seconds)
This timeout is based on boost asio example:
http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/example/timeouts/async_tcp_client.cpp

What I have noticed is that the 1 second timer in the server
does not fire, *when* client request have been cancelled.
It appears that some of chain of async handlers
that are being called in the server, during this process, cause the
timer to hang ?

For example if the client cancels after 20 seconds, then something in the server
must been blocking, for that period ?
When the client cancels, the server::handle_write(..), closes
both end of the socket. So I see the following message logged:

shutdown: Transport endpoint is not connected

Here is the async' functions in the server:

void server::start_accept()
{
connection_ptr new_conn( new connection( io_service_ ) );
acceptor_.async_accept( new_conn->socket(),
boost::bind( &server::handle_accept, this,
boost::asio::placeholders::error,
new_conn ) );
}

void server::handle_accept( const boost::system::error_code& e,
connection_ptr conn )
{
if (!acceptor_.is_open()) {
if (serverEnv_.debug()) cout << " server::handle_accept:
acceptor is closed, returning\n";
return;
}

if ( !e ) {
conn->async_read( inbound_request_,
boost::bind( &server::handle_read, this,
boost::asio::placeholders::error,conn ) );
}
else {
if (e != boost::asio::error::operation_aborted) {
// An error occurred. Log it
LogToCout toCoutAsWell;
LOG(Log::ERR, " server::handle_accept error occurred " <<
e.message());
}
}
start_accept();
}

void server::handle_read( const boost::system::error_code&
e,connection_ptr conn )
{
if ( !e ) {
try {
// Service the in bound request, handling the request will
populate the outbound_response_
// Note:: Handle request will first authenticate
outbound_response_.set_cmd( inbound_request_.handleRequest( this ) );
}
catch (exception& e) {
outbound_response_.set_cmd( PreAllocatedReply::error_cmd( e.what() ));
}

// *Reply* back to the client:
conn->async_write( outbound_response_,
boost::bind(&server::handle_write,
this,
boost::asio::placeholders::error,
conn ) );
}
else {
LOG(Log::ERR, "server::handle_read error occurred : " << e.message());
}
}

void server::handle_write( const boost::system::error_code& e,
connection_ptr conn )
{
boost::system::error_code ec;
conn->socket().shutdown(boost::asio::ip::tcp::socket::shutdown_both,ec);
if (ec) {
ecf::LogToCout logToCout;
LOG(Log::ERR,"server::handle_write: socket shutdown both failed:
" << ec.message());
return;
}
}

I had noticed that I was not checking error code in the server::handle_write(..)
but I'm not sure if this would account, for why the timer does not
fire temporarily.


Ta,
Avi

Loading...