George van Venrooij
2013-05-08 11:28:03 UTC
Hi,
I have been trying to write a simple HTTP server and I based its
code on the example HTTP Server 3. The server works fine under
functional testing conditions, but when testing load with Gatling, I
see a strange phenomenon:
When attempting to stress-test the server, Gatling tries to connect
with many clients simultaneously, then sends a message and expects a
certain reply. What I saw with the server is that it consistently
fails to accept all connections. About 1-3% are never accepted and
Gatling reports they have been remotely closed, or they have timed out
waiting for a reply. I've seen this behavior with 3000 connections, as
well as with 100 connections.
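For anyone who wants to reproduce the test pattern without Gatling, here is a rough Python sketch of what the load test does: many clients connect at about the same moment, each sends one message and checks the reply. This is not the actual Gatling scenario. The names run_clients, StubServer, ReplyHandler, REQUEST and REPLY are made up for illustration, and the stub server only stands in for the real server so the sketch runs on its own.

```python
import socket
import socketserver
import threading
from collections import Counter

# Hypothetical request/reply pair; the real test uses its own messages.
REQUEST = b"GET / HTTP/1.0\r\n\r\n"
REPLY = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"


def run_clients(host, port, n_clients, request, expected, timeout=5.0):
    """Connect n_clients at roughly the same time and count each outcome."""
    outcomes = Counter()
    lock = threading.Lock()
    start = threading.Barrier(n_clients)  # releases all clients together

    def client():
        try:
            start.wait()
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.sendall(request)
                reply = b""
                while len(reply) < len(expected):
                    chunk = sock.recv(4096)
                    if not chunk:  # peer closed the connection
                        break
                    reply += chunk
            if reply == expected:
                result = "ok"
            elif not reply:
                result = "closed"
            else:
                result = "bad_reply"
        except socket.timeout:
            result = "timeout"
        except (ConnectionResetError, BrokenPipeError):
            result = "closed"
        except OSError:
            result = "error"
        with lock:
            outcomes[result] += 1

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes


class ReplyHandler(socketserver.BaseRequestHandler):
    """Stand-in for the server under test: read one request, send REPLY."""

    def handle(self):
        self.request.recv(4096)
        self.request.sendall(REPLY)


class StubServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True
    request_queue_size = 128  # listen backlog used by the stub


if __name__ == "__main__":
    server = StubServer(("127.0.0.1", 0), ReplyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(run_clients("127.0.0.1", server.server_address[1], 100, REQUEST, REPLY))
    server.shutdown()
    server.server_close()
```

Pointing run_clients at the real server's host and port instead of the stub gives a per-outcome count (ok, closed, timeout, error) comparable to what Gatling reports. One thread per client is fine for a few hundred connections but becomes heavy at 3000.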
In the server code, everything is perfectly fine. None of the
asynchronous operations ever returns any error and if I keep track of
the number of connections, it's just as if the failed ones never
happened from the server's point of view. Adding threads to the thread
pool doesn't help either. The only way I can successfully accept all
connections in a test is by NOT communicating with connections
already established (i.e. no reading or writing on a connection, just
keeping it open during the test until the client times out).
This behavior occurs on both Windows 7 (64-bit) and
Ubuntu 12.04 (64-bit) and I'm baffled by it. Both HTTP Server 2
and HTTP Server 3 from the examples exhibit this behavior. I've tried
various approaches to handling this, but found no suitable one.
So my question would be this: does this sound familiar? And is
there a solution or work-around for this problem?
Kind regards,
George van Venrooij