We are trying to investigate performance problems with our LDAP server.
We are moving from OpenLDAP on Linux to a proprietary server (DirX) on Windows 2008.
The LDAP server is the user store for WebLogic applications and is configured there as an
Authentication Provider.
Some of the apps are now experiencing performance problems.
Hardware resources are fine, and querying the (new) LDAP server with other LDAP clients
and via scripts could not reproduce the performance problems.
Then we looked at the network. For the affected applications we found a large number of LDAP "abandon request" operations emanating from the client. We looked at the behavior with
the original OpenLDAP server and found some, but far fewer, 'abandon request' operations.
We don't (yet) know under what circumstances the client decides to send the abandon request.
The scenario below is fairly typical. After the client sends an Abandon Request, a pair of ACKs
appears: first from the client, then from the server, and the one sent by the LDAP server takes 0.1 - 0.4 seconds.
It is this delay in the ACK that seems to be costing us...
Not being that well versed in the TCP protocol, I was hoping someone could help clarify/confirm my interpretation:
Packet 664 is the ACK from the server to packet 662 (abandon request, length 10).
Packet 663 is the ACK from the client to packet 661 (searchResDone).
If my interpretation is correct, I would now investigate:
- why the ACK to an abandon request is taking so long, i.e. 0.20 s as opposed to, say, 0.04 s
  (or: why is the server taking so long to ACK the abandon?)
- why the client is sending the abandons in the first place
Any other ideas?
thanks in advance,
Michael
...
657 9.2943 ldapserver ldapclient LDAP 170 searchResDone(57)
658 9.2948 ldapclient ldapserver TCP 66 18367 > 389 [ACK] Seq=10007 Ack=19799 Len=0
659 9.2954 ldapclient ldapserver LDAP 1009 searchRequest(134)
660 9.2972 ldapserver ldapclient LDAP 630 searchResEntry(134)
661 9.2973 ldapserver ldapclient LDAP 172 searchResDone(134)
Sequence number:45106, len=106
662 9.2978 ldapclient ldapserver LDAP 76 abandonRequest(134)
Sequence number: 46818, len=10
663 9.3378 ldapclient ldapserver TCP 66 18368 > 389 [ACK] Seq=46828 Ack=45212 Len=0
(The RTT to ACK the segment was: 0.040492000 seconds)
664 9.5062 ldapserver ldapclient TCP 66 389 > 18368 [ACK] Seq=45212 Ack=46828 Len=0
(The RTT to ACK the segment was: 0.208397000 seconds)
665 9.5068 ldapclient ldapserver LDAP 183 searchRequest(136)
666 9.5229 ldapserver ldapclient LDAP 163 searchResEntry(136)
667 9.5229 ldapserver ldapclient LDAP 170 searchResDone(136)
....
Related
I'd like to reopen a previous issue that was incorrectly classified as a network engineering problem; after more tests, I think it's a real issue for programmers.
My application streams mp3 files from a server. I can't modify the server. The client reads data from the server as needed, at 160 kbit/s, and feeds it to a DAC. Let's use a file of 3.5 MB as the example.
When the server is done sending the last byte, it closes the connection, so it sends a FIN; that seems normal practice.
The problem is that the kernel, especially on Windows, seems to buffer 1 to 3 MB of data; I assume the TCP window has fully opened.
After a few seconds, the server has sent the whole 3.5 MB and about 3 MB sit inside the kernel buffer. At this point the server has sent its FIN, which is ACKed in due time.
From the client's point of view, it keeps reading data in chunks of 20 kB and will do so for the next 3 MB / 20 kB ≈ 150 s before it sees the EOF.
Meanwhile the server is in FIN_WAIT_2 (and not TIME_WAIT as I initially wrote; thanks to Steffen for correcting me). Some OSes, Windows among them, seem to have a half-closed-socket timer that starts when they send their FIN and can be as short as 120 s, regardless of the actual TCP window size, by the way. Of course, after 120 s the server considers that it should have received the client's FIN, so it sends a RST. That RST causes the client's entire kernel buffer to be discarded and the application fails.
Since code is required, here it is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <winsock2.h>   /* the original mixes BSD and Winsock calls (Sleep, closesocket); this version assumes Winsock */
#include <windows.h>

int main(void)
{
    WSADATA wsa;
    WSAStartup(MAKEWORD(2, 2), &wsa);

    SOCKET sock = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(80);
    if (connect(sock, (const struct sockaddr *) &addr, sizeof(addr)) != 0) {
        printf("connect failed\n");
        exit(1);
    }

    /* HTTP line endings are CRLF ("\r\n"), not "\n\r" */
    const char *get = "GET /data-3 HTTP/1.0\r\n"
                      "User-Agent: mine\r\n"
                      "Host: localhost\r\n"
                      "Connection: close\r\n"
                      "\r\n";
    int bytes = send(sock, get, (int) strlen(get), 0);
    printf("send %d\n", bytes);

    char *buf = malloc(20000);
    bytes = 0;                                 /* now count received bytes */
    while (1) {
        int n = recv(sock, buf, 20000, 0);
        if (n == 0) {                          /* orderly EOF: the server's FIN has reached us */
            printf("normal eof at %d\n", bytes);
            break;
        }
        if (n < 0) {                           /* typically the RST described above */
            printf("error at %d\n", bytes);
            exit(1);
        }
        bytes += n;
        Sleep(n * 1000 / (160000 / 8));        /* consume at 160 kbit/s = 20 kB/s */
    }
    free(buf);
    closesocket(sock);
    return 0;
}
It can be tested with any HTTP server.
I know there are solutions, such as having a handshake with the server before it closes the socket (but the server is just an HTTP server), but the kernel level of buffering makes this a systematic failure whenever the buffered data takes longer to consume than the server's half-close timeout.
The client is perfectly real-time in absorbing data. Having a larger client buffer, or no buffer at all, does not change the issue, which looks to me like a system design flaw, unless there is a way either to control the kernel buffers at the application level (not for the whole OS) or to detect a FIN reception at the client level before recv() returns EOF. I've tried changing SO_RCVBUF, but it does not seem to influence this level of kernel buffering.
Here is a capture of one successful and one failed exchange
success
3684 381.383533 192.168.6.15 192.168.6.194 TCP 54 [TCP Retransmission] 9000 → 52422 [FIN, ACK] Seq=9305427 Ack=54 Win=262656 Len=0
3685 381.387417 192.168.6.194 192.168.6.15 TCP 60 52422 → 9000 [ACK] Seq=54 Ack=9305428 Win=131328 Len=0
3686 381.387417 192.168.6.194 192.168.6.15 TCP 60 52422 → 9000 [FIN, ACK] Seq=54 Ack=9305428 Win=131328 Len=0
3687 381.387526 192.168.6.15 192.168.6.194 TCP 54 9000 → 52422 [ACK] Seq=9305428 Ack=55 Win=262656 Len=0
failed
5375 508.721495 192.168.6.15 192.168.6.194 TCP 54 [TCP Retransmission] 9000 → 52436 [FIN, ACK] Seq=5584802 Ack=54 Win=262656 Len=0
5376 508.724054 192.168.6.194 192.168.6.15 TCP 60 52436 → 9000 [ACK] Seq=54 Ack=5584803 Win=961024 Len=0
6039 628.728483 192.168.6.15 192.168.6.194 TCP 54 9000 → 52436 [RST, ACK] Seq=5584803 Ack=54 Win=0 Len=0
Here is what I think is the cause, thanks very much to Steffen for putting me on the right track.
an mp3 file is 3.5 MB, encoded at 160 kbit/s = 20 kB/s
a client reads it at exactly the required speed, 20 kB/s; let's say one recv() of 20 kB per second, with no pre-buffering for simplicity
some OSes, like Windows, can have very large TCP kernel buffers (about 3 MB or more), and with a fast connection the TCP window opens wide
in a matter of seconds, the whole file is sent to the client; let's say about 3 MB of it sits in the kernel buffers
as far as the server is concerned, all of it has been sent and acknowledged, so it does a close()
the close() sends a FIN to the client, which responds with an ACK, and the server enters the FIN_WAIT_2 state
BUT at that point, from the client's point of view, recv() will have plenty to read for the next 150 s before it sees the EOF!
so the client will not do a close() and thus will not send a FIN
the server is in FIN_WAIT_2 state and according to the TCP specs, it should stay like that forever
now, various OSes (Windows at least) start a timer similar to TIME_WAIT (120 s), either when calling close() or when receiving the ACK of their FIN, I don't know which (in fact Windows has a specific registry entry for that, AFAIK). This is to deal more aggressively with half-closed sockets.
of course, after 120s, the server has not seen a client's FIN and sends a RST
that RST is received by the client and causes an error there and all the remaining data in the TCP buffers to be discarded and lost
of course, none of that happens with high-bitrate formats, as the client consumes data fast enough that the kernel TCP buffers never sit idle for 120 s, and it may not happen at low bitrates when the application's own buffering reads everything up front. It takes the bad combination of bitrate, file size and kernel buffers... hence it does not happen all the time.
That's it. It can be reproduced with a few lines of code and any HTTP server. This can be debated, but I see it as a systemic OS issue. Now, the solution that seems to work is to force the client's receive buffer (SO_RCVBUF) to a lower level, so that the server has little chance of having sent all the data and having it sit in the client's kernel buffers for too long. Note that this can still happen if the buffer is 20 kB and the client consumes it at 1 B/s... hence I call it a systemic failure instead. Now, I agree that some will see it as an application issue.
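For what it's worth, here is a minimal sketch of that workaround applied to the client code above (the 64 kB value is only an illustration; in practice you would size it from the bitrate and the server's half-close timeout). SO_RCVBUF generally has to be set before connect() to reliably influence the advertised window:
/* illustrative sketch only: cap the client's receive buffer so the server
   cannot park megabytes of data in the local kernel before it closes the socket */
int rcvbuf = 64 * 1024;
setsockopt(sock, SOL_SOCKET, SO_RCVBUF, (const char *) &rcvbuf, sizeof(rcvbuf));
/* ...then connect() and stream exactly as before */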
I am trying to estimate the bandwidth usage of an XMPP application.
The application receives a 150-byte ping each second and answers with a ping of the same size. (*)
However, when I measure the data usage, I get something like 900 bytes per ping (and not the 300 expected).
I have a suspicion this might relate to something in a layer below (TCP? IP?) and datagram sizes. But, so far, reading the TCP/IP Guide has not led me anywhere.
Another hypothesis would be that this overhead comes from XMPP itself, somehow.
Can anyone enlighten me?
(*) To get this "150 bytes" I counted the number of chars in the <iq> (the XML representation of the ping).
I am using TLS, but not BOSH (actually, BOSH is on the other connection: I am measuring results on the Android client, and the pings are coming from a web application, but I think that should not matter).
The client is Xabber, running on Android.
Let's try to calculate the worst-case overhead down to the IP level.
For TLS we have:
With TLS 1.1 and up, in CBC mode: an IV of 16 bytes.
Again, in CBC mode: TLS padding. TLS uses blocks of 16 bytes, so it may need to add 15 bytes of padding.
(Technically TLS allows for up to 255 bytes of padding, but in practice I think that's rare.)
With SHA-384: A MAC of 48 bytes.
TLS header of 5 bytes.
That's 84 extra bytes.
TCP and IP headers are 40 bytes (from this answer) if no extra options are used, but for IPv6 this would be 60 bytes.
So you could be seeing 84 + 60 + 150 = 294 bytes per ping.
However, on the TCP level we also need ACKs. If you are pinging a different client (especially over BOSH), then the pong will likely be too late to piggyback the TCP ACK for the ping. So the server must send a 60 byte ACK for the ping and the client also needs to send a 60 byte ACK for the pong.
That brings us to:
294 + 60 + 294 + 60 = 708
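Just to make that arithmetic explicit, here is a tiny self-contained sketch; the per-layer numbers (CBC IV, worst-case padding, SHA-384 MAC, record header, IPv6 + TCP headers, bare ACKs) are the worst-case assumptions listed above, not measured values:
#include <stdio.h>

int main(void)
{
    const int payload  = 150;              /* the <iq> stanza, as counted in the question */
    const int tls_ovh  = 16 + 15 + 48 + 5; /* IV + max CBC padding + SHA-384 MAC + record header = 84 */
    const int ip_tcp   = 60;               /* IPv6 + TCP headers, no options */
    const int bare_ack = 60;               /* an empty ACK segment in each direction */

    int per_stanza   = payload + tls_ovh + ip_tcp;     /* 294 bytes */
    int per_exchange = 2 * per_stanza + 2 * bare_ack;  /* 708 bytes */

    printf("per stanza: %d bytes, per ping/pong exchange: %d bytes\n",
           per_stanza, per_exchange);
    return 0;
}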
900 still sounds a lot too large. Are you sure the ping and the pong are both 150 bytes?
I'm trying to understand how transmission and ACKs work in TCP. In this figure, when A retransmits seq 100 after it receives three duplicate ACKs, will B answer with ACK 121 or ACK 158?
B would only be issuing the ACK of 100 because it didn't receive the segment starting at SEQ 121. There's no evidence that it received the subsequent segments either, but even if it did, it isn't required to keep them. So the answer depends on things that aren't specified in your question: if B did receive and buffer the out-of-order segments up to byte 157, its cumulative ACK after the retransmission arrives would be 158; if it didn't, it would be 121.
I am using tcpdump/wireshark to capture TCP packets while a TCP client sends data to a TCP server. The client simply sends 4096 bytes to the server in one send() call. And I get different TCP packets on the two sides: two packets on the sender side seem to be "compacted" into one on the receiver side. This conflicts with how I understand the TCP protocol, and I have been stuck on this issue for a few days and really need some help.
Please notice the packet length in following packets:
The client (sender) sends 2 packets, 0xbcac (4) and 0xbcae (5), i.e. 2896 + 1200 = 4096 bytes in all.
(0xbcac) 4 14:31:33.838305 192.168.91.194 192.168.91.193 TCP 2962 59750 > 9877 [ACK] Seq=1 Ack=1 Win=14720 **Len=2896** TSval=260728 TSecr=3464603 0
(0xbcae) 5 14:31:33.838427 192.168.91.194 192.168.91.193 TCP 1266 59750 > 9877 [PSH, ACK] Seq=2897 Ack=1 Win=14720 **Len=1200** TSval=260728 TSecr=3464603 0
However, on the server (receiver) side, only one packet is present, with ip.id=0xbcac and length = 4096 (receiver.packet.0xbcac = sender.packet.0xbcac + 0xbcae):
(0xbcac) 4 14:31:33.286296 192.168.91.194 192.168.91.193 TCP 4162 59750 > 9877 [PSH, ACK] Seq=1 Ack=1 Win=14720 **Len=4096** TSval=260728 TSecr=3464603 0
I'm aware that TCP is a stream protocol and that the data sent can be divided into packets according to the MSS (or MTU), but I guessed the division happens before packets are handed to the NIC, and thus before they are captured. I'm also aware that the PSH flag in packet 0xbcae leads to flushing data from the buffer to the NIC, but that cannot explain the "compacted" packet. I also tried sending 999999 bytes in one send() call in the client; the data are divided into small packets and sent, but they still don't match the packets captured on the server side. Finally, I disabled Nagle, got the same result, and ruled that out.
So my question is: is the mismatch I encountered normal? If it is, what causes it? If not (I'm using Ubuntu 12.04 and Ubuntu 13.10 on a LAN), what is the possible reason for this "compacted" packet?
Thanks in advance for any help!
two packets on the sender side seem to be "compacted" on the receiver side
It looks like a case of generic receive offload (GRO) or large receive offload (LRO). Long story short: the receiving network card does some smart stuff and coalesces segments before they hit the kernel, which improves performance.
To check if this is the case you can try to disable it using:
$ ethtool -K eth0 gro off
$ ethtool -K eth0 lro off
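If you first want to verify whether these offloads are currently enabled at all, ethtool -k eth0 (with a lowercase k) lists the interface's feature settings; look for the generic-receive-offload and large-receive-offload lines.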
Something complementary happens on the sending side: tcp segmentation offload or generic segmentation offload.
After disabling these, don't forget to re-enable them: they seriously improve performance.
On the client side, an SFTP application sends packets to the SSH server on port 22.
The SFTP application hands the packets to TCP; from the Ethereal capture we can see that
the SFTP packets go from the application to TCP and from TCP to the server, but TCP does not receive a TCP ACK from the server, so TCP retransmits the packet after a few seconds, still with no response from the server.
It seems that the server never received the packet from the client.
The client SFTP application waits in select() on the TCP recv with a timeout of 120 seconds;
after 120 seconds the application gets a timeout from select() and closes the SFTP operation
with a timeout error.
In the capture I can see TCP retransmitting the packet many times but never receiving a TCP ACK from the server.
Scenario:
1. The timeout happens only sometimes.
2. After this issue, the next SFTP operation [upload] succeeds with the same server.
3. It seems the network has no issue, because the next upload works fine.
4. Both client and server run Solaris.
5. We are unable to reproduce this in our lab environment.
6. This issue happens only in the real customer network.
7. The application is written in C. The SSH server is OpenSSH.
I want to know:
1. How can we find the reason for TCP not receiving an ACK reply from the server?
2. Is there any TCP system setting in Solaris that could cause this issue?
3. Please provide any information that could help us resolve this issue.
I assume your topology looks like this:
10.25.190.12 10.10.10.10
[e1000g0] [eth0]
SFTP_Client--------------Network------------OpenSSH_Server
There are two things you need to do:
Establish whether there is regular significant packet loss between your client and server. TCP tolerates some packet loss, but if you start dropping a lot (which is honestly hard to quantify) it's going to just give up in some circumstances. I would suggest two ways of detecting packet loss... the first is mtr, the second is ping. mtr is far preferable, since you get loss statistics per-hop (see below). Run mtr 10.10.10.10 from the client and mtr 10.25.190.12 from the server. Occasionally, packet loss is path-dependent, so it's useful to do it from both sides when you really want to nail down the source of it. If you see packet loss, work with your network administrators to fix it first; you're wasting your time otherwise. In the process of fixing the packet loss, it's possible you will fix this TCP ACK problem as well.
If there is no significant packet loss, you need to sniff both sides of the connection simultaneously with snoop or tshark (you can get tshark from SunFreeware) until you see the problem again. When you find this situation with missing TCP ACKs, figure out whether: A) the OpenSSH_Server sent the ACK, and B) whether the SFTP_Client received it. If the Client gets the ACK on its ethernet interface, then you probably need to start looking in your software for clues. You should be restricting your sniffs to the IP addresses of the client and server. In my experience, this kind of issue is possible, but not a common problem; 90+% of the time, it's just network packet loss.
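For example, a sketch of the capture commands, using the interface names and addresses from the diagram above (adjust them to your environment); on the client:
snoop -d e1000g0 -o /tmp/sftp_client.snoop host 10.10.10.10 and port 22
and on the server:
tshark -i eth0 -w /tmp/sftp_server.pcap -f "host 10.25.190.12 and port 22"
Comparing the two traces tells you whether the missing ACK was never sent by the server, or was sent but lost on the way to the client.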
Sample output from mtr:
mpenning@mpenning-T61:~$ mtr -n 4.2.2.4
HOST: mpenning-T61 Loss% Snt Last Avg Best Wrst StDev
1. 10.239.84.1 0.0% 407 8.8 9.1 7.7 11.0 1.0
2. 66.68.3.223 0.0% 407 11.5 9.2 7.1 11.5 1.3
3. 66.68.0.8 0.0% 407 19.9 16.7 11.2 21.4 3.5
4. 72.179.205.58 0.0% 407 18.5 23.7 18.5 28.9 4.0
5. 66.109.6.108 5.2% 407 16.6 17.3 15.5 20.7 1.5 <----
6. 66.109.6.181 4.8% 407 18.2 19.1 16.8 23.6 2.3
7. 4.59.32.21 6.3% 407 20.5 26.1 19.5 68.2 14.9
8. 4.69.145.195 6.4% 406 21.4 27.6 19.8 79.1 18.1
9. 4.2.2.4 6.8% 406 22.3 23.3 19.4 32.1 3.7