[OPENDS-3913] OpenDS stops responding after sending TCP "ping" packets
Created: 03/Apr/09  Updated: 17/Apr/09  Resolved: 17/Apr/09

Status: Resolved
Project: opends
Component/s: Security/Other
Affects Version/s: pre-2.0
Fix Version/s: 2.0
Type: Bug
Priority: Critical
Reporter: rmetrich
Assignee: boli
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Operating System: All, Platform: All
Attachments: after.jstack, after.pfiles, before.jstack, before.pfiles, snoop
Issuezilla Id: 3,913

Description

Some hardware load balancers send the following TCP sequence to "ping" a service: SYN, ACK, ACK-RST. When doing so every second on admin port of LDAP ports, OpenDS stops responding ("Connection refused" is returned to the client). Additionally, a file descriptor leak occurs. This leads to a Denial of Service.

Comments

Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=420)
jstack before sending killer sequence

Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=421)
jstack after sending killer sequence

Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=422)
pfiles before sending killer sequence

Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=423)
pfiles after sending killer sequence

Comment by rmetrich [ 03/Apr/09 ]
Sorry, read "on admin port OR LDAP ports" instead of "of"

Comment by ludovicp [ 03/Apr/09 ]
The same problem was reported to me directly a couple of weeks ago by a French ISV. The load balancer is a software one: Keepalived on Linux.

Comment by matthew_swift [ 07/Apr/09 ]
We should investigate this for 2.0 - could it be an issue in Java's TCP stack implementation?

Comment by boli [ 07/Apr/09 ]
I'm unable to reproduce the problem with OpenDS rev 5186 on 1.6.0_12-b04. I used hping to generate a SYN, SYN-ACK, and RST sequence to the listening LDAP port on 11389. I have attached the snoop output. Checking pfiles showed no leaked file descriptors.
It looks like the file descriptor leak was the result of half-open connections. It could be because the SYN-ACK from OpenDS was never followed by a RST from the client or load balancer. Could you provide the snoop output from OpenDS' perspective?

Comment by boli [ 07/Apr/09 ]
Created an attachment (id=427)
Snoop output of TCP ping from OpenDS' perspective

Comment by ludovicp [ 07/Apr/09 ]
According to Renaud, OpenDS 1.2.0 completely stops after some time... An OpenDS user reported a complete hang.
OpenDS 1.3.0-builds00x leaks the file descriptor but continues to work until it runs out of FDs.
He has a machine set up on our network with a test client to reproduce. I believe Matt has the info to access that machine and run the test program.

Comment by rmetrich [ 08/Apr/09 ]
Your hping sequence is not identical, since the 3-step handshake is not finished in your case.
The sequence to send to reproduce is the following:
send SYN
receive SYN-ACK
send ACK
send ACK-RST

As Ludovic stated, I have a tool in our internal network to reproduce.

Comment by matthew_swift [ 08/Apr/09 ]
Forgot to target 2.0.

Comment by boli [ 09/Apr/09 ]
This issue is caused by a race condition from a known issue:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6378870

Basically, the remote sends the ACK-RST and closes the connection before OpenDS calls setKeepAlive and setTcpNoDelay on the new socket. A SocketException is thrown when those methods are called on the now closed socket. After two consecutive errors, OpenDS shuts down the connection handler. After all the connection handlers are shut down, all further connections will be rejected by the OS.

Since this type of TCP ping fully opens the connection and then resets it, it could cause CONNECT and Connection Reset messages in the OpenDS access/error logs. It would be more efficient if the load balancer sent an RST right after receiving the SYN-ACK from the server. That half-open connection attempt never makes it up to Java.
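
For illustration, the race described above can be sketched in Java. This is a minimal sketch under stated assumptions, not the change committed for this issue: the class TolerantAcceptSketch and the helpers configure, closeQuietly, and tcpPing are hypothetical names, and the idea shown (treat a SocketException from the per-socket setters as a failure of that one connection, close it so the file descriptor is released, and keep the listener running) is only one plausible way to handle the case. Whether the setters actually fail on a reset socket depends on the platform, so the sketch does not rely on that.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import java.net.SocketException;

class TolerantAcceptSketch {

    /**
     * Applies the per-connection options named in this issue.
     * Returns false, and closes the socket so its file descriptor is
     * released, if the socket is already unusable (for example because
     * the peer reset it right after the handshake).
     */
    static boolean configure(Socket socket) {
        try {
            socket.setKeepAlive(true);
            socket.setTcpNoDelay(true);
            return true;
        } catch (SocketException e) {
            // Assumption: a failure here concerns this connection only,
            // so it should not count as an error of the listener itself.
            closeQuietly(socket);
            return false;
        }
    }

    /** Closes a socket, ignoring any error raised while closing. */
    static void closeQuietly(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close itself fails.
        }
    }

    /**
     * Hypothetical client-side "ping": completes the three-way handshake,
     * then closes with SO_LINGER set to 0, which makes most TCP stacks
     * send a reset instead of a normal FIN.
     */
    static void tcpPing(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        socket.setSoLinger(true, 0);
        socket.close();
    }
}
```

An accept loop using this sketch would call configure on each accepted socket and simply move on to the next accept when it returns false, rather than counting the failure toward shutting down the connection handler.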

Fixed in revision 5210

Comment by rmetrich [ 17/Apr/09 ]
Hardware sending such sequence is: Cisco CSS 11501 - Content Services Switch

Generated at Mon Mar 07 08:09:39 UTC 2016 using JIRA 6.2.3#6260sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.