Discussion:
[asio-users] Can I use deadline_timer together with async_read when keeping thousands of TCP connections?
陈抒
2014-04-01 12:46:44 UTC
Permalink
Hi,
My server needs to handle thousands of TCP connections, but I find that many
connections are never closed because the async_read operation never completes,
which is strange! Today my server could not work because there were too many
open TCP connections, and I had to restart the OS.
I have read a few articles that describe how to use deadline_timer with
async_read, but in my experience thousands of timers will slow my server down.
See my question on stackoverflow:
http://stackoverflow.com/questions/22777835/is-boostasio-asyn-read-with-timer-a-good-idea

My questions are:
Why does my async_read never complete?
Can I use a deadline_timer in this case?

Dean Chen
Best regards
http://blog.csdn.net/csfreebird
Slav
2014-04-01 13:03:09 UTC
Permalink
I had the same problem. Each day ~15000 users were connecting to the server,
but some of the connections were never closed, so once ~500000 connections
were hanging (some of them still live), the server had to be restarted - that
happened roughly every month or two.
Timers will indeed slow your server down significantly. As I was told on this
mailing list, it should be solved by periodically iterating over all
connections and killing any that have been hanging too long, according to
your server's own logic.
On my server I kick everyone who has not sent anything for an hour. The
kicking process runs every 5 minutes. Iterating over 10000 connections is
lightning fast: it takes less time than a human can perceive (well under 80
milliseconds), so it could even run every second, but that is not needed.
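
A minimal sketch of that approach with Boost.Asio, using one recurring
deadline_timer for the whole server rather than one timer per connection (the
names Server, sweep_timer_, clients_, last_activity and idle_limit_ are only
illustrative, not taken from any real server in this thread):

//sweep_timer_ is a boost::asio::deadline_timer member constructed from the
//same io_service that runs the connections
void Server::StartSweep()
{
    sweep_timer_.expires_from_now( boost::posix_time::minutes( 5 ) );
    sweep_timer_.async_wait(
        boost::bind( &Server::Sweep, this, boost::asio::placeholders::error ) );
}

void Server::Sweep( const boost::system::error_code& error )
{
    if ( error )
        return; //the timer was cancelled, e.g. during shutdown

    const std::time_t now = std::time( 0 );
    for ( ClientList::iterator it = clients_.begin(); it != clients_.end(); ++it )
    {
        if ( now - ( *it )->last_activity > idle_limit_ )
        {
            boost::system::error_code ignored;
            ( *it )->socket.shutdown( boost::asio::ip::tcp::socket::shutdown_both, ignored );
            ( *it )->socket.close( ignored );
        }
    }

    StartSweep(); //schedule the next sweep
}

One timer for the whole server keeps the timer overhead constant no matter how
many connections are open.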
Dean Chen
2014-04-01 13:51:41 UTC
Permalink
Hi, Slav:
Thanks for your response. I used to have a separate thread that closed a
connection when it timed out, but that made my process crash, because I
cannot close a socket from one thread while another thread has an
async_write or async_read operation outstanding on it.
According to this post:
http://web.archiveorange.com/archive/v/Q0J4VefPMc2v8QYvcVr3

Most likely you have a threading issue in your program where you close a
socket from one thread while simultaneously starting another async
operation on the same socket from another thread. If you are sure this is
not the case, please attach a small, complete program that exhibits the
problem. Thanks.

Could you show me your approach - how do you close the socket safely?
Slav
2014-04-01 16:02:38 UTC
Permalink
To work with connections safely, an asio::strand can be used:

//created once, when the io_service is created; it must be the same strand
//that wraps the connections' async handlers:
asio::strand strand( io_service );

...

//at any moment the connections can be "touched":
//no parallel async_read/async_write or handle_accept() can run, since we are
//within the strand; but if your server also uses the clients pool from other
//threads, lock it in that case:
clientsPoolMutex.lock();

for ( ClientList::iterator it = clients.begin(); it != clients.end(); ++it )
{
    Client* client = *it;

    //multiple harmless errors can arise (the connection was already closed,
    //the connection was forced to close, and so on) - just ignoring them is
    //fine, but the overloads which do not take an error_code argument throw
    //exceptions, which can kill your server if you do not catch them:
    asio::error_code shutdownErrorCode;
    asio::error_code closeErrorCode;

    //should be kicked due to being idle too long:
    if ( client->lastMessageReceived < currentTime - ( 60 * 60 ) )
    {
        client->socket.shutdown( asio::socket_base::shutdown_both, shutdownErrorCode );
        client->socket.close( closeErrorCode );
    }
}

clientsPoolMutex.unlock();
陈抒
2014-04-02 01:40:20 UTC
Permalink
Hi, Slav:
From your code, I guess I should do the following:
1. In one thread, use a for(;;) loop to iterate over all connected client objects, check whether each one has timed out, and close its socket if it has.
2. I had tried this before, but I ran into a threading issue that made my app crash.
3. So you suggest using a global mutex to make sure that, while I am closing one socket, no async_read or async_write operation is started on it from another thread.
4. I need to apply this mutex to all my read/write operations, like this (Sign is one connection object in my server app):

void Sign::DoWrite(boost::shared_ptr<Message> msg) {
  write_msgs_.push_back(msg);
  if (write_msgs_.size() == 1) {
    boost::shared_ptr<vector<char> > data(new vector<char>(400, 0));
    msg->Write(*data);

    clientsPoolMutex.lock();
    async_write(socket,
                buffer(&(*data)[0], data->size()),
                strand_.wrap(bind(&Sign::AfterSend, shared_from_this(), _1, data)));
  }
}

Dean Chen
Best regards
http://blog.csdn.net/csfreebird
陈抒
2014-04-02 01:51:41 UTC
Permalink
But if all read/write operations need to acquire the global mutex, that might
become another performance issue.

Dean Chen
Best regards
http://blog.csdn.net/csfreebird
虫草君
2014-04-02 03:18:23 UTC
Permalink
No global mutex is needed; a strand should be enough.

Just push all function calls for the same socket through the same strand.

But if you access those connections/sessions from multiple threads, you should use Slav's global mutex.
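
In Boost.Asio terms that means wrapping the socket's completion handlers with
the connection's strand and posting calls that arrive from other threads
through the same strand. A minimal sketch with illustrative names (Connection,
strand_, socket_ and read_buffer_ are not from this thread); Dean's reply
below applies the same idea to his Sign class:

//completion handlers are wrapped by the connection's strand:
async_read(socket_, buffer(read_buffer_),
           strand_.wrap(bind(&Connection::AfterRead, shared_from_this(), _1, _2)));

//calls arriving from other threads are posted through the same strand,
//so they never run concurrently with the wrapped handlers:
void Connection::Close() {
  strand_.post(bind(&Connection::DoClose, shared_from_this()));
}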


Slav
2014-04-02 10:17:55 UTC
Permalink
The mutex is only needed to protect your data in case your server is
multithreaded - that is beyond your question, so I probably should not have
added it.
The real work here is done by the strand, which ensures that the function is
never invoked concurrently with any of the other async_read/write/accept
handlers wrapped in the same strand.
Now I see that I forgot to show that the sweep code must be issued through the
strand, sorry - I introduced the strand but never used it :). So it should
look like this:

Server::Server()
    //the strand is created once, when the io_service is created; it must be
    //the same strand object that wraps the connections' async handlers, so in
    //practice it is a member rather than a local variable:
    : strand_( io_service )
{
    ...

    //at any moment the connections can be "touched":
    strand_.dispatch( boost::bind( &Server::ExpelIdlers, this ) );
}

void Server::ExpelIdlers()
{
    for ( ClientList::iterator it = clients.begin(); it != clients.end(); ++it )
    {
        Client* client = *it;

        //multiple harmless errors can arise (the connection was already
        //closed, the connection was forced to close, and so on) - just
        //ignoring them is fine, but the overloads which do not take an
        //error_code argument throw exceptions, which can kill your server
        //if you do not catch them:
        asio::error_code shutdownErrorCode;
        asio::error_code closeErrorCode;

        //should be kicked due to being idle too long:
        if ( client->lastMessageReceived < currentTime - ( 60 * 60 ) )
        {
            client->socket.shutdown( asio::socket_base::shutdown_both, shutdownErrorCode );
            client->socket.close( closeErrorCode );
        }
    }
}
Dean Chen
2014-04-02 14:14:34 UTC
Permalink
Thanks for Slav's help and 虫草君's reminder. The key is the strand; no mutex is needed.
My approach is a bit different. Because my io_service object runs in a
thread pool, my Sign (connection object) already has a strand_ member
variable, and I use strand_ to guarantee that all async_read/async_write
operations are serialized correctly. For example:

async_read(socket, buffer(body_buffer_, length - 5),
strand_.wrap(bind(&Sign::AfterReadBody, shared_from_this(), _1, length - 5)));

I recalled that I had already implemented a method, Sign::Send, which lets
other threads use a Sign object to send a message to the remote client.

void Sign::Send(boost::shared_ptr<Message> msg) {
  if (is_closed_) {
    BOOST_LOG_SEV(my_logger::get(), debug) << "object id: " << id_ << " never sends bytes because the socket is flagged to close";
    return;
  }
  strand_.post(bind(&Sign::DoWrite, shared_from_this(), msg));
}

void Sign::DoWrite(boost::shared_ptr<Message> msg) {
  write_msgs_.push_back(msg);
  if (write_msgs_.size() == 1) {
    boost::shared_ptr<vector<char> > data(new vector<char>(400, 0));
    msg->Write(*data);
    async_write(socket,
                buffer(&(*data)[0], data->size()),
                strand_.wrap(bind(&Sign::AfterSend, shared_from_this(), _1, data)));
  }
}

All I need to do is reuse this design and add two functions like the ones below:

void Sign::ToClose() {
  strand_.post(bind(&Sign::DoClose, shared_from_this()));
}

void Sign::DoClose() {
  BOOST_LOG_SEV(my_logger::get(), info) << "object id: " << id_ << " DoClose";
  Connection::CloseSocket();
  is_closed_ = true;
}

Then in another thread, I call it like this:

void Signs::CheckTimeout() {
  .....

  for (; itor != values_.end(); ++itor) {
    if (itor->second->Timeout()) {
      string display_id = itor->first;
      BOOST_LOG_SEV(my_logger::get(), debug) << "object id: " << itor->second->id_ << " is timeout, address:" << itor->second->address;
      itor->second->ToClose();
      values_.erase(itor);
      DisplayService::Logout(display_id);
      // only remove the first timed-out sign here, because erase() invalidates
      // itor and the subsequent ++ on it would crash
      return;
    } else {
      BOOST_LOG_SEV(my_logger::get(), debug) << "object id: " << itor->second->id_ << " no timeout, address:" << itor->second->address;
    }
  }
}
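
If values_ is a std::map (or another node-based associative container) - an
assumption, since its type is not shown above - the loop can also erase while
it iterates, so every timed-out Sign is removed in one pass instead of one per
call. A sketch using the same member names as above:

void Signs::CheckTimeout() {
  .....

  for (; itor != values_.end(); ) {
    if (itor->second->Timeout()) {
      string display_id = itor->first;
      itor->second->ToClose();
      DisplayService::Logout(display_id);
      // post-increment: the old position is passed to erase() while itor has
      // already moved on to the next element, which stays valid
      values_.erase(itor++);
    } else {
      ++itor;
    }
  }
}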

I have watched my log files for a few minutes and it works fine; many
sockets are being closed correctly. I hope this fixes the problem - I will
keep an eye on it for a few days.

BTW, since many people need to learn how to handle a large number of TCP
connections in a server-side app, it would be very helpful if someone added
good boost::asio examples of this case.