[OPENDS-3913] OpenDS stops responding after sending TCP "ping" packets
Created: 03/Apr/09 Updated: 17/Apr/09 Resolved: 17/Apr/09
Status: Resolved
Project: opends
Component/s: Security/Other
Affects Version/s: pre-2.0
Fix Version/s: 2.0
Type: Bug
Priority: Critical
Reporter: rmetrich
Assignee: boli
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Operating System: All, Platform: All
Attachments: after.jstack, after.pfiles, before.jstack, before.pfiles, snoop
Issuezilla Id: 3,913
Description
Some hardware load balancers send the following TCP sequence to "ping" a
service: SYN, ACK, ACK-RST.
When this is done every second on admin port of LDAP ports, OpenDS stops responding
(the client gets "Connection refused").
Additionally, a file descriptor leak occurs.
This leads to a Denial of Service.
Comments
Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=420)
jstack before sending killer sequence
Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=421)
jstack after sending killer sequence
Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=422)
pfiles before sending killer sequence
Comment by rmetrich [ 03/Apr/09 ]
Created an attachment (id=423)
pfiles after sending killer sequence
Comment by rmetrich [ 03/Apr/09 ]
Sorry, read "on admin port OR LDAP ports" instead of "of"
Comment by ludovicp [ 03/Apr/09 ]
The same problem was reported to me directly a couple of weeks ago by a French ISV.
Their load balancer is a software one: Keepalived on Linux.
Comment by matthew_swift [ 07/Apr/09 ]
We should investigate this for 2.0 - could it be an issue in Java's TCP stack
implementation?
Comment by boli [ 07/Apr/09 ]
I'm unable to reproduce the problem with OpenDS rev 5186 on 1.6.0_12-b04. I used
hping to generate a SYN, SYN-ACK, and RST sequence to the listening LDAP port on
11389. I have attached the snoop output. Checking pfiles showed no leaked file
descriptors.
It looks like the file descriptor leak was the result of half-open
connections. It could be because the SYN-ACK from OpenDS was never followed by a
RST from the client or load balancer. Could you provide the snoop output from
OpenDS' perspective?
Comment by boli [ 07/Apr/09 ]
Created an attachment (id=427)
Snoop output of TCP ping from OpenDS' perspective
Comment by ludovicp [ 07/Apr/09 ]
According to Renaud,
OpenDS 1.2.0 completely stops after some time... An OpenDS user reported a complete hang.
OpenDS 1.3.0-builds00x leaks file descriptors but continues to work until it runs out of them.
He has a machine set up on our network with a test client to reproduce. I believe Matt has the
info to access that machine and run the test program.
Comment by rmetrich [ 08/Apr/09 ]
Your hping sequence is not identical, since the 3-step handshake is not finished
in your case.
The sequence to send to reproduce is the following:
send SYN
receive SYN-ACK
send ACK
send ACK-RST
As Ludovic stated, I have a tool in our internal network to reproduce.
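For readers without access to that internal tool, the sequence above can be approximated from plain Java. This is only a sketch under assumptions: the class name `TcpPingProbe` and its `ping` method are hypothetical, the default port 11389 is taken from the earlier hping test, and the exact flags on the final segment depend on the client OS. It relies on the fact that `connect()` completes the three-way handshake and that closing a socket with `SO_LINGER` set to zero aborts the connection with a reset instead of a normal FIN.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TcpPingProbe {
    /**
     * Completes the TCP handshake (SYN, SYN-ACK, ACK), then aborts the
     * connection. Returns true if the connect succeeded.
     */
    static boolean ping(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            // A zero linger timeout makes close() send a reset rather than a FIN.
            socket.setSoLinger(true, 0);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or reset during setup.
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 11389; // assumed test port
        int count = args.length > 2 ? Integer.parseInt(args[2]) : 3;
        for (int i = 0; i < count; i++) {
            System.out.println(ping(host, port, 1000) ? "up" : "refused or unreachable");
            Thread.sleep(1000); // one probe per second, as in the report
        }
    }
}
```

Running it with a larger count against an affected build should eventually switch the output from "up" to "refused or unreachable" if the behaviour described in this issue is present.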
Comment by matthew_swift [ 08/Apr/09 ]
Forgot to target 2.0.
Comment by boli [ 09/Apr/09 ]
This issue is caused by a race condition from a known issue:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6378870
Basically, the remote sends the ACK-RST and closes the connection before OpenDS
calls setKeepAlive and setTcpNoDelay on the new socket. A SocketException
is thrown when those methods are called on the now closed socket. After two
consecutive errors, OpenDS shuts down the connection handler. After all the
connection handlers are shut down, all further connections will be rejected by
the OS.
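The failure mode described above can be illustrated with a small accept loop that treats a per-connection SocketException separately from a failure of the listener itself. This is a minimal sketch, not the change made in revision 5210: the class `TolerantAcceptLoop`, its methods, and the use of a `Consumer<Socket>` as the connection handler are all hypothetical, and closing the socket on failure is an assumption about how to avoid leaking descriptors.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.util.function.Consumer;

public class TolerantAcceptLoop {
    // Mirrors the "two consecutive errors" threshold mentioned in the comment.
    static final int MAX_CONSECUTIVE_ACCEPT_FAILURES = 2;

    /**
     * Applies per-connection options. If the peer has already reset the
     * connection, closes the socket and returns false instead of propagating.
     */
    static boolean configure(Socket client) {
        try {
            client.setKeepAlive(true);
            client.setTcpNoDelay(true);
            return true;
        } catch (SocketException e) {
            closeQuietly(client); // release the file descriptor
            return false;
        }
    }

    static void closeQuietly(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful to do if close itself fails.
        }
    }

    /** Accepts connections until the listener is closed or accept() keeps failing. */
    static void serve(ServerSocket listener, Consumer<Socket> handler) {
        int failures = 0;
        while (!listener.isClosed()) {
            Socket client;
            try {
                client = listener.accept();
                failures = 0; // only accept() failures count against the listener
            } catch (IOException e) {
                if (listener.isClosed() || ++failures >= MAX_CONSECUTIVE_ACCEPT_FAILURES) {
                    return;
                }
                continue;
            }
            if (configure(client)) {
                handler.accept(client); // the handler now owns the socket
            }
        }
    }
}
```

The design choice here is that a client which disappears right after the handshake only costs one dropped connection; it never counts toward the threshold that shuts the listener down.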
Since this type of TCP ping fully opens the connection then resets it, it could
cause CONNECT and Connection Reset messages in OpenDS access/error logs. It
would be more efficient if the load balancer sent an RST right after receiving
the SYN-ACK from the server. Such a half-open connection attempt never makes it up to Java.
Fixed in revision 5210
Comment by rmetrich [ 17/Apr/09 ]
The hardware sending this sequence is:
Cisco CSS 11501 - Content Services Switch
Generated at Mon Mar 07 08:09:39 UTC 2016 using JIRA 6.2.3#6260sha1:63ef1d6dac3f4f4d7db4c1effd405ba38ccdc558.