I am working on a high-load TCP application with Java Netty, which is expected to handle 300k concurrent TCP connections.
It works perfectly on the test server and reaches 300k connections, but when deployed to the production server it can only support 65387 connections. After reaching this number, the client throws a "java.io.IOException: Connection reset by peer" exception. I have tried many times; every time, once the connection count reaches 65387, the client can no longer create connections.
The network capture is below; 10.95.196.27 is the server and 10.95.196.29 is the client:
16822 12:26:12.480238 10.95.196.29 10.95.196.27 TCP 74 can-ferret > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=872641174 TSecr=0 WS=128
16823 12:26:12.480267 10.95.196.27 10.95.196.29 TCP 66 http > can-ferret [SYN, ACK] Seq=0 Ack=1 Win=2920 Len=0 MSS=1460 SACK_PERM=1 WS=1024
16824 12:26:12.480414 10.95.196.29 10.95.196.27 TCP 60 can-ferret > http [ACK] Seq=1 Ack=1 Win=14720 Len=0
16825 12:26:12.480612 10.95.196.27 10.95.196.29 TCP 54 http > can-ferret [FIN, ACK] Seq=1 Ack=1 Win=3072 Len=0
16826 12:26:12.480675 10.95.196.29 10.95.196.27 HTTP 94 Continuation or non-HTTP traffic
16827 12:26:12.480697 10.95.196.27 10.95.196.29 TCP 54 http > can-ferret [RST] Seq=1 Win=0 Len=0
The exception is caused by the server sending a RST packet to the client right after the client completes the three-way handshake, which breaks the new connection.
The client-side exception stack is below:
16:42:05.826 [nioEventLoopGroup-1-15] WARN i.n.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the end of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_25]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_25]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225) ~[na:1.7.0_25]
at sun.nio.ch.IOUtil.read(IOUtil.java:193) ~[na:1.7.0_25]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375) ~[na:1.7.0_25]
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:259) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:885) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:226) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:72) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:460) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:424) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:360) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:103) ~[netty-all-4.0.0.Beta3.jar:na]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
The server side shows no exceptions.
I have tried tuning some sysctl settings, shown below, to support a huge number of connections, but it did not help:
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 4096 33554432
net.ipv4.tcp_wmem = 4096 4096 33554432
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.tcp_max_tw_buckets = 360000
net.core.netdev_max_backlog = 4096
vm.min_free_kbytes = 65536
vm.swappiness = 0
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_max_syn_backlog = 4096
net.netfilter.nf_conntrack_max = 3000000
net.nf_conntrack_max = 3000000
net.core.somaxconn = 327680
The maximum number of open fds is already set to 999999:
linux-152k:~ # ulimit -n
999999
The OS release is SUSE Linux Enterprise Server 11 SP2 with 3.0.13 kernel:
linux-152k:~ # cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2
linux-152k:~ # uname -a
Linux linux-152k 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64 x86_64 x86_64 GNU/Linux.
dmesg shows no error information, CPU and memory usage stay low, and everything looks fine; the server just resets connections from the client.
We have a test server running SUSE Linux Enterprise Server 11 SP1 with a 2.6.32 kernel; it works well and supports up to 300k connections.
I think some kernel or security limit may be causing this, but I can't find it. Any suggestions, or any way to get debug information on why the server sends the RST? Thanks.
Santal, I've just come across the following link, and it seems it can give an answer to your question:
What is the theoretical maximum number of open TCP connections that a modern Linux box can have
Finally found the root cause. Simply put, it was a JDK bug that causes an NPE when fd > 64 * 1024; please refer to http://mail.openjdk.java.net/pipermail/nio-dev/2013-September/002284.html
After upgrading to JDK7_45, everything works great now.
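As a rough sanity check on that explanation, the observed ceiling of 65387 connections sits just below 64 * 1024 = 65536 descriptors. The sketch below only does that arithmetic and prints the open-file limits of the current process. The idea that the remaining descriptors were held by the JVM itself (jars, log files, the listening socket and so on) is an assumption, not something measured in the question.

```python
import resource  # Unix only

FD_THRESHOLD = 64 * 1024          # descriptor number at which the JDK bug is reported to trigger
OBSERVED_MAX_CONNECTIONS = 65387  # ceiling reported in the question


def descriptors_unaccounted_for(threshold: int, connections: int) -> int:
    """Descriptors below the threshold that were not client connections."""
    return threshold - connections


def nofile_limits() -> tuple:
    """Soft and hard open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


gap = descriptors_unaccounted_for(FD_THRESHOLD, OBSERVED_MAX_CONNECTIONS)
soft, hard = nofile_limits()
print(f"descriptors not used by connections: {gap}")
print(f"this process: soft nofile={soft}, hard nofile={hard}")
```

The gap works out to 149 descriptors, a plausible count for everything else a JVM keeps open, which fits the fd > 64 * 1024 explanation better than a kernel limit would.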
I have an environment with a PSTN > GATEWAY (CME) setup, and I would like to know how to configure Asterisk to understand the dial-peer from the Cisco. Has anybody done this?
I tried setting sip.conf to:
[4000]
allowguest=yes
defaultuser=4000
nsecure=port,invite
bindport=5060
type=peer ; I did try change to "friend" as well, but the same problem.
port=5060
host=172.16.101.25
context=Plan1
insecure=yes
canreinvite=yes
qualify=yes
I do get a response from the Cisco:
<--- SIP read from UDP:172.16.101.25:53054 --->
SIP/2.0 200 OK
Via: SIP/2.0/UDP 172.16.101.43:5060;branch=z9hG4bK0bfec330
From: "asterisk" <sip:asterisk@172.16.101.43>;tag=as26306981
To: <sip:172.16.101.25>;tag=65E9F470-228A
Date: Thu, 22 Oct 2015 16:05:13 GMT
Call-ID: 100fe7696c5adbf3170c02ff799d7539@172.16.101.43:5060
Server: Cisco-SIPGateway/IOS-12.x
CSeq: 102 OPTIONS
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER
Allow-Events: telephone-event
Accept: application/sdp
Supported: 100rel,timer,resource-priority,replaces,sdp-anat
Content-Type: application/sdp
Content-Length: 170
v=0
o=CiscoSystemsSIP-GW-UserAgent 8815 8188 IN IP4 172.16.101.25
s=SIP Call
c=IN IP4 172.16.101.25
t=0 0
m=audio 0 RTP/AVP 18 0 8 9 4 2 15
c=IN IP4 172.16.101.25
<------------->
--- (14 headers 7 lines) ---
Really destroying SIP dialog '100fe7696c5adbf3170c02ff799d7539@172.16.101.43:5060' Method: OPTIONS
The Cisco gateway sends the REGISTER, but Asterisk doesn't receive it.
There are no firewalls or filtering rules in between.
Sorry for my poor English.
I'm not sure the problem is on the Asterisk side; it might be missing config on the Cisco side.
If you want phone calls delivered from the Cisco gateway to Asterisk, here is the config you need on the Cisco side:
dial-peer voice 6000 voip
destination-pattern 6...
session protocol sipv2
session target ipv4:172.16.101.43
This will send all 4-digit numbers starting with 6 to Asterisk, which should be enough to get traffic flowing from the Cisco gateway to Asterisk. Change destination-pattern to match the dial plan on Asterisk for the numbers you want delivered there. A dot is a single-digit wildcard, and [] matches a set of digits.
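To make the wildcard rules concrete, here is a minimal sketch that translates a small subset of destination-pattern syntax into a regular expression and tests dialed numbers against it. The function names are made up for illustration. Only the dot and [] forms described above are handled, and matching the whole dialed string is a simplification of how IOS actually selects a dial-peer.

```python
import re


def destination_pattern_to_regex(pattern: str):
    """Translate '.' and '[...]' from a destination-pattern into a regex.

    Other IOS wildcards (for example 'T') are deliberately not handled.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == ".":
            parts.append("[0-9]")             # dot: exactly one digit
            i += 1
        elif ch == "[":
            end = pattern.index("]", i)       # copy the bracket set as-is
            parts.append(pattern[i:end + 1])
            i = end + 1
        else:
            parts.append(re.escape(ch))       # literal digit
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, dialed: str) -> bool:
    """True if the whole dialed string fits the pattern."""
    return destination_pattern_to_regex(pattern).fullmatch(dialed) is not None


for number in ("6001", "5001", "60011"):
    print(number, matches("6...", number))
```

With the pattern 6... only 6001 matches: 5001 starts with the wrong digit and 60011 has one digit too many.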
If the gateway IOS is version 15 or later, you will also need to add the Asterisk IP to the IP address trusted list:
voice service voip
ip address trusted list
ipv4 172.16.101.43
If you want or need to point the Cisco gateway's sip-ua at Asterisk, you would do so under sip-ua in the Cisco gateway config. This is where you would specify the credentials used to register on Asterisk. Here is a list of commands that may be needed there:
sip-ua
credentials username <username> password <password> realm <sip domain>
authentication username <username> password <password> realm <sip domain>
registrar ipv4:172.16.101.43
sip-server ipv4:172.16.101.43
I'm trying to get Asterisk working with Bahnhof, an IP telephony provider here in Sweden. They don't officially support third-party solutions, but that has never stopped me before.
Anyhow, to the nitty-gritty: the VoIP Linux box (Debian 7) running Asterisk 11.5.1 has the public IP on interface eth1, and the firewall (iptables) is fully open between the box and the Bahnhof VoIP servers. Incoming calls are working. Outgoing calls are not, as shown in the log below. I've searched the web from top to bottom and found zip about this issue.
I have another provider (CellIP) which works just fine with these same settings; I've never received any 404 messages like the one below from them.
sip.conf
...
[bahnhof]
type=peer
defaultuser=55500XXXXXX
fromuser=55500XXXXXX
context=default
secret=mypassword
host=bahnhof-lda.soho1.voip.bahnhof.net
fromdomain=bahnhof-lda.soho1.voip.bahnhof.net
insecure=port,invite
qualify=yes
canreinvite=no
dtmfmode=rfc2833
...
extensions.conf
...
[010XXXXXXX-out]
exten => _0X.,1,Set(CALLERID(num)=55500XXXXXX)
exten => _0X.,n,Dial(SIP/${EXTEN}@bahnhof,30,trg)
exten => _0X.,n,Hangup()
...
SIP DEBUG
<------------>
Audio is at 10060
Adding codec 0x100 (g729) to SDP
Adding codec 0x2 (gsm) to SDP
Adding codec 0x4 (ulaw) to SDP
Adding codec 0x8 (alaw) to SDP
Adding non-codec 0x1 (telephone-event) to SDP
Reliably Transmitting (no NAT) to 77.240.208.105:5060:
INVITE sip:0700XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net SIP/2.0
Via: SIP/2.0/UDP 81.170.XXX.XXX:5060;branch=z9hG4bK4db8a605
Max-Forwards: 70
From: "0700XXXXXX" <sip:55500XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net>;tag=as5eaad2ef
To: <sip:0700XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net>
Contact: <sip:55500XXXXXX@81.170.XXX.XXX:5060>
Call-ID: 498e8f9616aadc0a658b301909fe58d7@bahnhof-lda.soho1.voip.bahnhof.net
CSeq: 102 INVITE
User-Agent: Asterisk PBX 1.8.11.0
Date: Wed, 16 Oct 2013 15:18:57 GMT
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO, PUBLISH
Supported: replaces, timer
Content-Type: application/sdp
Content-Length: 360
v=0
o=root 495353330 495353330 IN IP4 81.170.195.158
s=Asterisk PBX 1.8.11.0
c=IN IP4 81.170.XXX.XXX
t=0 0
m=audio 10060 RTP/AVP 18 3 0 8 101
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:3 GSM/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=silenceSupp:off - - - -
a=ptime:20
a=sendrecv
---
<--- SIP read from UDP:77.240.208.105:5060 --->
SIP/2.0 100 Trying
User-Agent: Centile-Supra/1
CSeq: 102 INVITE
Via: SIP/2.0/UDP 81.170.XXX.XXX:5060;branch=z9hG4bK4db8a605
Content-Length: 0
From: "0700XXXXXX" <sip:55500XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net>;tag=as5eaad2ef
To: <sip:0700XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net>
Call-ID: 498e8f9616aadc0a658b301909fe58d7@bahnhof-lda.soho1.voip.bahnhof.net
<------------->
--- (8 headers 0 lines) ---
<--- SIP read from UDP:77.240.208.105:5060 --->
SIP/2.0 404 No domain-name found in requestURI:0700XXXXXX
User-Agent: Intraswitch/7.5.6.4.SR4-SNAPSHOT_SCF-8
CSeq: 102 INVITE
Via: SIP/2.0/UDP 81.170.XXX.XXX:5060;branch=z9hG4bK4db8a605
Content-Length: 0
Record-Route: <sip:77.240.208.105;lr>
From: "0700XXXXXX" <sip:55500XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net>;tag=as5eaad2ef
To: <sip:0700XXXXXX@bahnhof-lda.soho1.voip.bahnhof.net>;tag=e67fec39-f749-59b4-4b8f-c798e694a64b
Supported: replaces, 100rel, timer
Server: Intraswitch/7.5.6.4.SR4-SNAPSHOT_SCF-8
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, MESSAGE, PRACK, REFER, INFO, SUBSCRIBE, NOTIFY
Contact: <sip:0700XXXXXX@77.240.208.19:5061>
Call-ID: 498e8f9616aadc0a658b301909fe58d7@bahnhof-lda.soho1.voip.bahnhof.net
The obvious error is "SIP/2.0 404 No domain-name found in requestURI:0700XXXXXX", however the underlying condition causing this is not at all obvious to me :]
Note, all XXX's are real values masked for obvious reasons.
Any help or ideas to try is greatly appreciated!
SOLVED - UPDATE 2013-10-19
The issue is now solved. The operator Bahnhof (in Sweden) has a setup similar to Sipgate's, which means there is very little flexibility in how you configure the REGISTER and trunk. A small mistake that would not matter with a typical SIP provider has a major impact here; perhaps this is a high-security strategy from Bahnhof and Sipgate. In short, a bad register line will prevent you from making outbound calls.
It came down to this in sip.conf,
Bad register: register => 55500XXXXXX:mypassword@bahnhof/2000
OK register: register => 55500XXXXXX:mypassword@bahnhof/55500XXXXXX
Then in extensions.conf,
[default]
exten => 55500XXXXXX,1,Goto(010XXXXXXX-in,s,1)
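To make the difference between the two register lines concrete, here is a minimal sketch that splits a register value into its parts. It assumes only the simple user:secret@host/extension form with the usual @ separator before the host; the real sip.conf syntax allows more optional fields, and the RegisterLine type and parse_register function are hypothetical names for illustration.

```python
from typing import NamedTuple, Optional


class RegisterLine(NamedTuple):
    user: str
    secret: str
    host: str
    extension: Optional[str]


def parse_register(value: str) -> RegisterLine:
    """Parse the simple 'user:secret@host/extension' form of a register value."""
    credentials, _, target = value.rpartition("@")
    user, _, secret = credentials.partition(":")
    host, slash, extension = target.partition("/")
    return RegisterLine(user, secret, host, extension if slash else None)


bad = parse_register("55500XXXXXX:mypassword@bahnhof/2000")
ok = parse_register("55500XXXXXX:mypassword@bahnhof/55500XXXXXX")

# The only difference is the contact extension after the slash.
print(bad.extension == bad.user)  # False
print(ok.extension == ok.user)    # True
```

In the working line the extension after the slash equals the username, which is also the extension matched under [default] in extensions.conf.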
Now outbound calls work. However, here is where Bahnhof is actually even trickier than Sipgate: Bahnhof has a cluster of SIP servers, and an inbound call can originate from any of them. When Asterisk resolves bahnhof-lda.soho1.voip.bahnhof.net, it simply takes the first IP address and considers this the SIP peer. The problem is that an inbound SIP call can come from any of the six addresses, not only the address you registered at.
The solution would be to add [bahnhof-peer-01], [bahnhof-peer-02] ... entries for each IP address:
[bahnhof-peer-01]
type=peer
context=default
host=77.240.208.19
insecure=port,invite
canreinvite=no
dtmfmode=rfc2833
[bahnhof-peer-02]
type=peer
context=default
host=77.240.208.20
insecure=port,invite
canreinvite=no
dtmfmode=rfc2833
...
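Writing one section per address by hand is repetitive, so here is a minimal sketch that renders the peer sections from a list of addresses. The helper names are hypothetical, and resolve_ipv4 assumes that a plain DNS lookup of the provider hostname returns every cluster address, which may not hold if the provider rotates or filters its records.

```python
import socket

PEER_TEMPLATE = """[bahnhof-peer-{index:02d}]
type=peer
context=default
host={ip}
insecure=port,invite
canreinvite=no
dtmfmode=rfc2833
"""


def resolve_ipv4(hostname: str) -> list:
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_DGRAM)
    return sorted({info[4][0] for info in infos})


def render_peers(ips: list) -> str:
    """Render one sip.conf peer section per IP address, numbered from 01."""
    return "\n".join(
        PEER_TEMPLATE.format(index=i, ip=ip) for i, ip in enumerate(ips, start=1)
    )


# Example with the two addresses shown above; pass the result of
# resolve_ipv4("bahnhof-lda.soho1.voip.bahnhof.net") to cover the whole cluster.
print(render_peers(["77.240.208.19", "77.240.208.20"]))
```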
OR
Simply set in sip.conf,
[general]
context=default
allowguest=yes
alwaysauthreject=yes
extensions.conf,
[default]
; Incoming calls on SE line 010XXXXXXX (steered from "register" section)
exten => 55500XXXXXX,1,Goto(010XXXXXXX-in,s,1)
; NOTHING MORE UNDER default
[extensions]
...
This would allow any incoming calls to be accepted, even unauthenticated ones. One might see this as a security risk, but I have all VoIP traffic tightly locked down by firewall rules, so this is the solution I went with. Also, as long as the context is set to default in [general] and nothing sits under the [default] context except the inbound extension, you will be fine. Anonymous inbound calls will have no access to your trunks or extensions, only to inbound extensions, which they obviously need to know (55500XXXXXX) to get through to the inbound rules for your Bahnhof number.
So finally, this solution enabled both outbound and inbound calls via the Bahnhof SIP provider. Technically, what seems to happen when you use the username as the extension on the register line is that it forces Digest Auth towards the SIP server when an outbound call is made. This in turn seems to satisfy the SIP server regarding the domain-name part of the URI. How exactly this is relevant is not completely clear to me, but it obviously plays a part.
I am loading an MP3 stream from an IceCast 2.3.2-kh29 server in an Android app with the MediaPlayer class.
Playback works well, but it sometimes stops. Looking at the server responses in the IcyStreamMeta class (used for ID3 tags), there is a 404 error in this case.
It also happens on Windows 7 in Firefox and other browsers.
Here are normal headers (some data ***ed):
http://***:14534/***.mp3
GET /***.mp3 HTTP/1.1
Host: ***:14534
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Tue, 23 Jul 2013 21:22:00 GMT
Content-Type: audio/mpeg
Transfer-Encoding: chunked
Connection: keep-alive
icy-br: 192
ice-audio-info: bitrate=192;samplerate=44100;channels=2
icy-description: MP3 192 Kbps
icy-genre: ***
icy-name: ***
icy-pub: 1
icy-url: ***
Cache-Control: no-cache
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Pragma: no-cache
So the stream sometimes plays for only about a minute or less, sometimes just seconds, and then stops. What is the possible reason for the 404 error? Tests on other devices ran stably. Internet speed is good. Can a router cause such things? Also, maybe some special HTTP request headers are needed for IceCast (and if they're not present, it gives a 404)? Or is it an internal server error in some cases?
And from Wireshark:
2973 53.630385000 SERVER'S IP 192.168.100.6 TCP 1466 14534 > 59847 [ACK] Seq=1284017 Ack=1 Win=63 Len=1412
2976 53.636352000 SERVER'S IP 192.168.100.6 TCP 1157 14534 > 59847 [PSH, ACK] Seq=1285429 Ack=1 Win=63 Len=1103
2978 53.671606000 SERVER'S IP 192.168.100.6 TCP 60 14534 > 59847 [PSH, ACK] Seq=1286532 Ack=1 Win=63 Len=5
2980 53.678606000 SERVER'S IP 192.168.100.6 TCP 60 14534 > 59847 [FIN, ACK] Seq=1286537 Ack=2 Win=63 Len=0
The issue is your chunked encoding. You're proxying your stream through Nginx, and Nginx is "fixing" the output to be compatible with HTTP/1.1. Don't do that.
You can try turning off chunked encoding in your Nginx config:
chunked_transfer_encoding off;
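For context on why chunked encoding can upset a stream player, the sketch below decodes a chunked body by hand. With chunked transfer encoding the audio bytes are wrapped in hex length-prefixed chunks, so a client that reads the socket as a raw MP3 stream would see those length lines mixed into the audio. Whether that is exactly what breaks playback here is an assumption; the decoder is only an illustration and ignores chunk extensions and trailers.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (chunk extensions and trailers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line is hexadecimal and may carry ';extension' data.
        size = int(raw[pos:line_end].split(b";", 1)[0], 16)
        if size == 0:
            break                      # a zero-length chunk ends the body
        start = line_end + 2
        body += raw[start:start + size]
        pos = start + size + 2         # skip the chunk data and its trailing CRLF
    return bytes(body)


framed = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(framed))  # b'Wikipedia'
```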
I'm writing some code to integrate an in-house app into a DVR to retrieve a video file. This is all reverse engineered as there isn't any official documentation, and I'm having trouble understanding the following sequence of events (captured by playing with the DVR's Android app).
936 72.985204 192.168.0.1 192.168.0.200 HTTP 468 POST /cgi-bin/supervisor/NetworkBk.cgi HTTP/1.1 (application/x-www-form-urlencoded)
937 72.985368 192.168.0.200 192.168.0.1 TCP 54 mit-ml-dev > 41859 [ACK] Seq=1 Ack=415 Win=65535 Len=0
938 73.933676 192.168.0.200 192.168.0.1 HTTP 275 HTTP/1.0 200 OK (video/mpeg4)
939 73.933983 192.168.0.1 192.168.0.200 TCP 54 41859 > mit-ml-dev [ACK] Seq=415 Ack=222 Win=15544 Len=0
940 74.004433 192.168.0.200 192.168.0.1 TCP 74 [TCP segment of a reassembled PDU]
941 74.004887 192.168.0.1 192.168.0.200 TCP 54 41859 > mit-ml-dev [ACK] Seq=415 Ack=242 Win=15544 Len=0
942 74.024669 192.168.0.200 192.168.0.1 HTTP 1346 Continuation or non-HTTP traffic
The HTTP POST requests the video file, which then results in an HTTP OK. I get confused as to what happens next. Isn't the request complete when the HTTP 200 is received? Why then does it continue to receive TCP data and then get an HTTP Continuation or non-HTTP traffic? The subsequent TCP packets contain the video file I'm intending to download. When I manually craft an HTTP POST I get the HTTP OK response and then I'm stumped.
This is the code I use to simulate the HTTP POST.
import requests
dc = {"action":"download", "start_time":"2013 7 1 13 59 00", "end_time":"2013 7 14 3 0", "num":"255", "ch":"5"}
r = requests.post("http://192.168.0.200/cgi-bin/supervisor/NetworkBk.cgi", data=dc, auth=(username, password))
This code gets the HTTP 200 OK response, how do I get to the Continuation or non-HTTP traffic? I'm new to this and so am unsure if I've provided enough details. I can provide the HTTP headers if that will help.
Addendum
This is the RAW response of the HTTP OK reply. As far as I can tell, there is nothing there about expecting extra content.
HTTP/1.0 200 OK
Date: Mon, 01 Jul 2013 15:01:34 GMT
Server: Linux/2.x UPnP/1.0 Avtech/1.0
Expires: 0
Pragma: no-cache
Cache-Control: no-cache
Connection: close
Content-Type: video/mpeg4
Content-Length: 5
0
OK
Why then is it continuing to receive TCP data and then getting a HTTP
Continuation or non-HTTP traffic?
By default, when you make a request, the body of the response is downloaded immediately.
So in this case once the successful POST request is made the DVR will immediately start sending the video data over TCP - most probably as a H.264 Byte Stream. That would account for the non-HTTP traffic you are seeing.
This code gets the HTTP 200 OK response, how do I get to the
Continuation or non-HTTP traffic?
You can override this default behaviour and defer downloading the response body until you access the Response.content attribute by passing the stream parameter. You could then use something like r.iter_content to iterate over the response data in chunks and write them to a file, e.g.:
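import requests

url = "http://192.168.0.200/cgi-bin/supervisor/NetworkBk.cgi"
dc = {"action":"download", "start_time":"2013 7 1 13 59 00", "end_time":"2013 7 14 3 0", "num":"255", "ch":"5"}
# username, password and path are placeholders to be defined by the caller
# stream=True defers downloading the body until it is iterated over
r = requests.post(url, data=dc, auth=(username, password), stream=True)
if r.status_code == 200:
    with open(path, 'wb') as f:
        # read the body in larger chunks rather than the 1-byte default
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)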
I created this small example. I have an EchoServer on port 8080 and a LogServer on port 9090 (the ports are just examples). Both are started on the same machine (by Server, which contains the main method).
Server started on port 8080
Server started on port 9090
As soon as a client connects via telnet, the EchoServer establishes a connection to the LogServer. Now I enter a long text, let's say 5000 characters (see long_text in the example), even if bash cannot handle it:
EchoServer Received: 1024
LogServer Received: 1024
EchoServer Received: 2048
LogServer Received: 2048
EchoServer Received: 1025
LogServer Received: 1025
If I enter the text again, I am getting:
EchoServer Received: 2048
LogServer Received: 2048
EchoServer Received: 2049
LogServer Received: 2049
Let's do it again:
EchoServer Received: 3072
EchoServer Received: 1025
LogServer Received: 3072
LogServer Received: 1025
And again:
EchoServer Received: 4096
EchoServer Received: 1
LogServer Received: 4096
LogServer Received: 1
The last time:
EchoServer Received: 4097
LogServer Received: 4097
My observation:
First of all, the data is fragmented. Additionally, each time the fragments grow by 1024 bytes (1024, 2048, 3072, 4096, ...). I guess the latter behaviour is caused by TCP slow start.
How can I achieve forwarding to the LogServer without fragmentation, so that my text arrives as one single message? I guess the problem is how I connect to the LogServer.
[EDIT1]
I changed the logs. It seems that it is already happening between telnet and the EchoServer. Anyway, I still have the problem in the real environment: the whole message (some kilobytes) arrives via WebSockets, and the forwarding to another connection is fragmented.
[EDIT2]
I did some more research (with Wireshark -- the log). I guess it has nothing to do with TCP slow start. The data (I sent the letter A 4095 times) arrives on the machine as three correct TCP packets:
Frame 1 (1506 bytes) with 1440 bytes TCP data (41 41 41 ... 41 41 41/HEX)
Frame 2 (1506 bytes) with 1440 bytes TCP data (41 41 41 ... 41 41 41/HEX)
Frame 3 (1283 bytes) with 1217 bytes TCP data (41 41 41 ... 41 0d 0a/HEX)
All 4095 A characters + CRLF arrived as expected.
The EchoServer said:
EchoServer Received: 1024
EchoServer Received: 2048
EchoServer Received: 1025
It also received the 4095 characters + CRLF, but fragmented differently from the TCP segments (exactly the same as in the first log above). How can I avoid this Netty behavior?
In non-blocking I/O, there's no practical way to get the number of available bytes in the socket receive buffer. Because of that, Netty predicts the number of available bytes. It starts from 1024 and then increases the prediction depending on the number of bytes read. You can change this behavior by employing a different prediction algorithm.
The default implementation is AdaptiveReceiveBufferSizePredictor and you might want to take a look into its source code to write your own one.
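The read sizes in the question's first log can be reproduced with a toy model of such a predictor. This is not the logic of AdaptiveReceiveBufferSizePredictor. It assumes the whole message is already in the socket buffer, that the guess starts at 1024, and that it grows by a fixed 1024 bytes whenever a read fills the buffer completely; the fixed step is an assumption chosen to match that log.

```python
def simulate_reads(message_length: int, initial: int = 1024, step: int = 1024) -> list:
    """Sizes of successive reads when the guess grows by `step` after each full read."""
    sizes = []
    guess = initial
    remaining = message_length
    while remaining > 0:
        got = min(guess, remaining)   # a read returns at most `guess` bytes
        sizes.append(got)
        remaining -= got
        if got == guess:              # buffer was filled: guess bigger next time
            guess += step
    return sizes


# 4095 'A' characters plus CRLF, as in the question
print(simulate_reads(4095 + 2))  # [1024, 2048, 1025]
```

The read boundaries follow the size of the application buffer, not the 1440-byte TCP segments seen in Wireshark, which is why the two fragmentations differ.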
However, no matter what prediction algorithm you choose, you have to keep in mind that TCP/IP is a streaming protocol, which means you can always get messages in a split or merged form. Please refer to the user guide: http://netty.io/docs/stable/guide/html/ (See the 'Dealing with a Stream-based Transport' section.)
You need a FrameDecoder in your pipeline which can assemble bytes from the network into complete frames. In your case I think you need to combine the StringDecoder and DelimiterBasedFrameDecoder. Take a look at the Telnet example, specifically the TelnetServerPipelineFactory.
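The idea behind delimiter-based framing can be sketched outside Netty in a few lines. This is a hypothetical LineFramer class for illustration, not Netty's DelimiterBasedFrameDecoder: it buffers whatever chunks arrive and hands out a frame only once the CRLF delimiter has been seen, so the read sizes no longer matter.

```python
class LineFramer:
    """Accumulates arbitrary byte chunks and returns complete delimiter-terminated frames."""

    def __init__(self, delimiter: bytes = b"\r\n"):
        self._delimiter = delimiter
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add a chunk and return every frame completed so far (delimiter stripped)."""
        self._buffer += chunk
        frames = []
        while True:
            index = self._buffer.find(self._delimiter)
            if index < 0:
                break                                   # no complete frame yet
            frames.append(bytes(self._buffer[:index]))
            del self._buffer[:index + len(self._delimiter)]
        return frames


# Feed the message in the same 1024 / 2048 / 1025 byte pieces seen in the log.
message = b"A" * 4095 + b"\r\n"
framer = LineFramer()
frames = []
for start, end in ((0, 1024), (1024, 3072), (3072, 4097)):
    frames.extend(framer.feed(message[start:end]))
print(len(frames), len(frames[0]))  # 1 4095
```

The three reads produce a single 4095-byte frame, which can then be forwarded to the LogServer as one message.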