UFW sometimes blocks an allowed IP

I have a server with the IP 172.17.11.101. UFW is enabled, and when I run "ufw status" I see:
Anywhere ALLOW 172.17.11.102
It works fine nearly all day, but sometimes I get errors for 4-5 seconds:
Mar 17 23:59:20 server kernel: [124538.209612] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:a5:26:1a:b2:a3:56:a5:1b:69:08:00 SRC=172.17.11.102 DST=172.17.11.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=13165 DF PROTO=TCP SPT=60410 DPT=27018 WINDOW=29200 RES=0x00 SYN URGP=0
Mar 17 23:59:20 server kernel: [124538.348414] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:a5:26:1a:b2:a3:56:a5:3d:87:08:00 SRC=172.17.11.102 DST=172.17.11.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=23299 DF PROTO=TCP SPT=32052 DPT=27018 WINDOW=29200 RES=0x00 SYN URGP=0
Mar 17 23:59:21 server kernel: [124539.542688] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:a5:26:1a:b2:a3:56:a5:58:76:08:00 SRC=172.17.11.102 DST=172.17.11.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=5814 DF PROTO=TCP SPT=30235 DPT=27018 WINDOW=29200 RES=0x00 SYN URGP=0
Mar 17 23:59:22 server kernel: [124540.320007] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:a5:26:1a:b2:a3:56:a5:28:05:08:00 SRC=172.17.11.102 DST=172.17.11.101 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=57844 DF PROTO=TCP SPT=64450 DPT=27018 WINDOW=29200 RES=0x00 SYN URGP=0
We can see that UFW blocks the IP 172.17.11.102.
I don't know how to debug this problem, which happens 2-3 times a day.
Do you have any ideas?

Running a port lookup for port 27018 indicates that this is caused by Steam running on 172.17.11.102; a quick way to verify is sketched below.
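For instance, on 172.17.11.102 you can check which process owns port 27018 with standard tools (the exact output format varies by distribution):
sudo ss -tunp | grep 27018
sudo lsof -i :27018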
This is confirmed by the Steam website:
Required Ports for Steam
Which ports do I need to open on my router or firewall for Steam?
To log into Steam and download content:
HTTP (TCP port 80) and HTTPS (443)
UDP 27015 through 27030
TCP 27015 through 27030
So you could, for example, deactivate or reconfigure Steam, disable UFW's logging (see below), or add an explicit rule, e.g.:
sudo ufw deny 27018
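If you would rather keep blocking the traffic but silence the [UFW BLOCK] messages, UFW's logging level can be lowered or turned off entirely (a sketch; note this affects all of UFW's logging, not just this one port):
sudo ufw status verbose   # shows the current logging level
sudo ufw logging off      # stop logging blocked packets altogether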

Related

nginx server and ssh stop responding

I have a Flask server running on Gunicorn behind nginx on a Raspberry Pi Zero.
My problem is that the Raspberry Pi sometimes "goes to sleep" for a couple of minutes; the server cannot be reached and SSH no longer works.
So I disabled the Pi's Wi-Fi power saving with: sudo iw dev wlan0 set power_save off.
That made it better, but because I was getting 413 Request Entity Too Large errors, I added client_max_body_size to my nginx config file.
But now it's worse: the "sleep" happens more frequently, and sometimes I have to reboot.
This is my reverse-proxy.conf:
server {
    listen 80;
    listen [::]:80;
    server_name localhost;
    return 301 https://$server_name$request_uri;
}
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    ssl_certificate /etc/ssl/certs/selfsigned.crt;
    ssl_certificate_key /etc/ssl/private/selfsigned.key;
    error_log /var/www/flask/nginx.log debug;
    ssl_dhparam /etc/nginx/dhparam.pem;
    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $http_host;
        proxy_pass http://127.0.0.1:8080;
        proxy_redirect off;
    }
    location /upload {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $http_host;
        proxy_pass http://127.0.0.1:8080;
        proxy_redirect off;
        client_max_body_size 200M; # the upload is just a big image, around 1 MB
    }
    # increase read timeouts (default: 60s)
    fastcgi_read_timeout 1d;
    proxy_read_timeout 1d;
}
These are the last lines in my nginx log file after a "sleep":
2021/03/13 21:44:18 [debug] 8220#8220: *445 reusable connection: 1
2021/03/13 21:44:18 [debug] 8220#8220: *445 event timer add: 3: 65000:7060228
2021/03/13 21:44:38 [debug] 8220#8220: *445 http keepalive handler
2021/03/13 21:44:38 [debug] 8220#8220: *445 malloc: 018D46F0:1024
2021/03/13 21:44:38 [debug] 8220#8220: *445 SSL_read: -1
2021/03/13 21:44:38 [debug] 8220#8220: *445 SSL_get_error: 5
2021/03/13 21:44:38 [debug] 8220#8220: *445 peer shutdown SSL cleanly
2021/03/13 21:44:38 [info] 8220#8220: *445 client 192.168.1.72 closed keepalive connection (104: Connection reset by peer)
2021/03/13 21:44:38 [debug] 8220#8220: *445 close http connection: 3
2021/03/13 21:44:38 [debug] 8220#8220: *445 SSL_shutdown: 1
2021/03/13 21:44:38 [debug] 8220#8220: *445 event timer del: 3: 7060228
2021/03/13 21:44:38 [debug] 8220#8220: *445 reusable connection: 0
2021/03/13 21:44:38 [debug] 8220#8220: *445 free: 018D46F0
2021/03/13 21:44:38 [debug] 8220#8220: *445 free: 00000000
2021/03/13 21:44:38 [debug] 8220#8220: *445 free: 018F56F0, unused: 8
2021/03/13 21:44:38 [debug] 8220#8220: *445 free: 01933360, unused: 120
Kernel log (/var/log/syslog):
Mar 13 22:48:09 raspberrypi rngd[270]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Mar 13 22:58:08 raspberrypi systemd[1]: session-10.scope: Succeeded.
Mar 13 23:17:01 raspberrypi CRON[25098]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Mar 13 23:44:19 raspberrypi dhcpcd[385]: wlan0: hardware address 00:00:00:00:00:00 claims 192.168.1.64
Mar 13 23:44:21 raspberrypi dhcpcd[385]: wlan0: hardware address 00:00:00:00:00:00 claims 192.168.1.64
Mar 13 23:44:21 raspberrypi dhcpcd[385]: wlan0: 10 second defence failed for 192.168.1.64
Mar 13 23:44:21 raspberrypi avahi-daemon[260]: Withdrawing address record for 192.168.1.64 on wlan0.
Mar 13 23:44:21 raspberrypi avahi-daemon[260]: Leaving mDNS multicast group on interface wlan0.IPv4 with address 192.168.1.64.
Mar 13 23:44:21 raspberrypi dhcpcd[385]: wlan0: deleting route to 192.168.1.0/24
Mar 13 23:44:21 raspberrypi dhcpcd[385]: wlan0: deleting default route via 192.168.1.254
Mar 13 23:44:21 raspberrypi avahi-daemon[260]: Interface wlan0.IPv4 no longer relevant for mDNS.
Mar 13 23:44:21 raspberrypi dhcpcd[385]: wlan0: rebinding lease of 192.168.1.64
Mar 13 23:44:21 raspberrypi dhcpcd[385]: wlan0: probing address 192.168.1.64/24
Mar 13 23:44:26 raspberrypi dhcpcd[385]: wlan0: leased 192.168.1.64 for 86400 seconds
Mar 13 23:44:26 raspberrypi avahi-daemon[260]: Joining mDNS multicast group on interface wlan0.IPv4 with address 192.168.1.64.
Mar 13 23:44:26 raspberrypi avahi-daemon[260]: New relevant interface wlan0.IPv4 for mDNS.
Mar 13 23:44:26 raspberrypi avahi-daemon[260]: Registering new address record for 192.168.1.64 on wlan0.IPv4.
Mar 13 23:44:26 raspberrypi dhcpcd[385]: wlan0: adding route to 192.168.1.0/24
Mar 13 23:44:26 raspberrypi dhcpcd[385]: wlan0: adding default route via 192.168.1.254
Mar 13 23:48:09 raspberrypi rngd[270]: stats: bits received from HRNG source: 180064
Mar 13 23:48:09 raspberrypi rngd[270]: stats: bits sent to kernel pool: 123584
Mar 13 23:48:09 raspberrypi rngd[270]: stats: entropy added to kernel pool: 123584
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2 successes: 9
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2 failures: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2(2001-10-10) Monobit: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2(2001-10-10) Poker: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2(2001-10-10) Runs: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2(2001-10-10) Long run: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: HRNG source speed: (min=101.599; avg=254.741; max=920.244)Kibits/s
Mar 13 23:48:09 raspberrypi rngd[270]: stats: FIPS tests speed: (min=924.206; avg=3071.971; max=9096.996)Kibits/s
Mar 13 23:48:09 raspberrypi rngd[270]: stats: Lowest ready-buffers level: 2
Mar 13 23:48:09 raspberrypi rngd[270]: stats: Entropy starvations: 0
Mar 13 23:48:09 raspberrypi rngd[270]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Mar 13 23:57:38 raspberrypi dhcpcd[385]: wlan0: part of Router Advertisement expired
Edit:
It's possible the problem comes from my computer, or from the Pi filtering my computer's IP, because sometimes I can still SSH in or reach the HTTP server from my Android phone on the same network, while there is no internet or firewall (ESET antivirus) problem on my computer.
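For what it's worth, the syslog above points to an IP address conflict rather than power saving: another device claims 192.168.1.64, the Pi's "10 second defence" fails, and dhcpcd tears down the routes for several seconds. A minimal sketch, assuming dhcpcd manages wlan0 and that 192.168.1.64 is reserved for the Pi on the router, pinning a static address in /etc/dhcpcd.conf:
# /etc/dhcpcd.conf -- pin the address so a conflicting lease elsewhere
# on the LAN cannot take it (values taken from the logs above; adjust
# the router/DNS addresses to your network)
interface wlan0
static ip_address=192.168.1.64/24
static routers=192.168.1.254
static domain_name_servers=192.168.1.254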

pfSense 2.5.0 upgrade broke my NordVPN gateway

Ever since I upgraded to pfSense 2.5.0, my NordVPN interface no longer works. Traffic does not get routed to the NordVPN gateway, as pfSense reports it as "down" with 100% packet loss. When checking "Status -> OpenVPN", the connection is reported as UP, but the gateway is DOWN. I don't understand how this is possible, but the log provides some clues, even though I can't tell from it what is going wrong.
OpenVPN Log (private IPs removed):
Feb 19 07:42:59 openvpn 79266 Initialization Sequence Completed
Feb 19 07:43:58 openvpn 79266 Authenticate/Decrypt packet error: missing authentication info
Feb 19 07:44:58 openvpn 79266 Authenticate/Decrypt packet error: missing authentication info
Feb 19 07:45:58 openvpn 79266 [nl852.nordvpn.com] Inactivity timeout (--ping-restart), restarting
Feb 19 07:45:58 openvpn 79266 SIGUSR1[soft,ping-restart] received, process restarting
Feb 19 07:45:58 openvpn 79266 Restart pause, 10 second(s)
Feb 19 07:46:08 openvpn 79266 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Feb 19 07:46:08 openvpn 79266 Outgoing Control Channel Authentication: Using 512 bit message hash 'SHA512' for HMAC authentication
Feb 19 07:46:08 openvpn 79266 Incoming Control Channel Authentication: Using 512 bit message hash 'SHA512' for HMAC authentication
Feb 19 07:46:08 openvpn 79266 TCP/UDP: Preserving recently used remote address: [AF_INET]194.127.172.103:1194
Feb 19 07:46:08 openvpn 79266 Socket Buffers: R=[42080->524288] S=[57344->524288]
Feb 19 07:46:08 openvpn 79266 UDPv4 link local (bound): [AF_INET]x.x.x.x:0
Feb 19 07:46:08 openvpn 79266 UDPv4 link remote: [AF_INET]y.y.y.y:1194
Feb 19 07:46:08 openvpn 79266 TLS: Initial packet from [AF_INET]y.y.y.y.z:1194, sid=2ce7940f f02613d1
Feb 19 07:46:08 openvpn 79266 VERIFY WARNING: depth=0, unable to get certificate CRL: CN=nl852.nordvpn.com
Feb 19 07:46:08 openvpn 79266 VERIFY WARNING: depth=1, unable to get certificate CRL: C=PA, O=NordVPN, CN=NordVPN CA5
Feb 19 07:46:08 openvpn 79266 VERIFY WARNING: depth=2, unable to get certificate CRL: C=PA, O=NordVPN, CN=NordVPN Root CA
Feb 19 07:46:08 openvpn 79266 VERIFY OK: depth=2, C=PA, O=NordVPN, CN=NordVPN Root CA
Feb 19 07:46:08 openvpn 79266 VERIFY OK: depth=1, C=PA, O=NordVPN, CN=NordVPN CA5
Feb 19 07:46:08 openvpn 79266 VERIFY KU OK
Feb 19 07:46:08 openvpn 79266 Validating certificate extended key usage
Feb 19 07:46:08 openvpn 79266 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
Feb 19 07:46:08 openvpn 79266 VERIFY EKU OK
Feb 19 07:46:08 openvpn 79266 VERIFY OK: depth=0, CN=nl852.nordvpn.com
Feb 19 07:46:08 openvpn 79266 WARNING: 'link-mtu' is used inconsistently, local='link-mtu 1582', remote='link-mtu 1634'
Feb 19 07:46:08 openvpn 79266 WARNING: 'auth' is used inconsistently, local='auth [null-digest]', remote='auth SHA512'
Feb 19 07:46:08 openvpn 79266 Control Channel: TLSv1.2, cipher TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384, 4096 bit RSA
Feb 19 07:46:08 openvpn 79266 [nl852.nordvpn.com] Peer Connection Initiated with [AF_INET]194.127.172.103:1194
Feb 19 07:46:09 openvpn 79266 SENT CONTROL [nl852.nordvpn.com]: 'PUSH_REQUEST' (status=1)
Feb 19 07:46:09 openvpn 79266 PUSH: Received control message: 'PUSH_REPLY,redirect-gateway def1,dhcp-option DNS 103.86.96.100,dhcp-option DNS 103.86.99.100,sndbuf 524288,rcvbuf 524288,explicit-exit-notify,comp-lzo no,route-gateway z.z.z.z,topology subnet,ping 60,ping-restart 180,ifconfig g.g.g.g 255.255.255.0,peer-id 3'
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: timers and/or timeouts modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: explicit notify parm(s) modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: compression parms modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: --sndbuf/--rcvbuf options modified
Feb 19 07:46:09 openvpn 79266 Socket Buffers: R=[524288->524288] S=[524288->524288]
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: --ifconfig/up options modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: route options modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: route-related options modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: peer-id set
Feb 19 07:46:09 openvpn 79266 OPTIONS IMPORT: adjusting link_mtu to 1657
Feb 19 07:46:09 openvpn 79266 Using peer cipher 'AES-256-CBC'
Feb 19 07:46:09 openvpn 79266 Data Channel: using negotiated cipher 'AES-256-CBC'
Feb 19 07:46:09 openvpn 79266 Outgoing Data Channel: Cipher 'AES-256-CBC' initialized with 256 bit key
Feb 19 07:46:09 openvpn 79266 Outgoing Data Channel: Using 512 bit message hash 'SHA512' for HMAC authentication
Feb 19 07:46:09 openvpn 79266 Incoming Data Channel: Cipher 'AES-256-CBC' initialized with 256 bit key
Feb 19 07:46:09 openvpn 79266 Incoming Data Channel: Using 512 bit message hash 'SHA512' for HMAC authentication
Feb 19 07:46:09 openvpn 79266 Preserving previous TUN/TAP instance: ovpnc8
Feb 19 07:46:09 openvpn 79266 NOTE: Pulled options changed on restart, will need to close and reopen TUN/TAP device.
Feb 19 07:46:09 openvpn 79266 Closing TUN/TAP interface
Feb 19 07:46:09 openvpn 79266 /usr/local/sbin/ovpn-linkdown ovpnc8 1500 1637 a.b.c.d 255.255.255.0 init
Feb 19 07:46:10 openvpn 79266 ROUTE_GATEWAY a.b.c.d/255.255.254.0 IFACE=re0 HWADDR=00:e2:6c:68:07:be
Feb 19 07:46:10 openvpn 79266 TUN/TAP device ovpnc8 exists previously, keep at program end
Feb 19 07:46:10 openvpn 79266 TUN/TAP device /dev/tun8 opened
Feb 19 07:46:10 openvpn 79266 /sbin/ifconfig ovpnc8 x.x.x.x y.y.y.y mtu 1500 netmask 255.255.255.0 up
Feb 19 07:46:10 openvpn 79266 /sbin/route add -net x.x.x.x x.x.x.x 255.255.255.0
Feb 19 07:46:10 openvpn 79266 /usr/local/sbin/ovpn-linkup ovpnc8 1500 1637 x.x.x.x 255.255.255.0 init
Feb 19 07:46:10 openvpn 79266 Initialization Sequence Completed
And the gateway log:
Feb 19 04:16:02 dpinger 68141 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr x.x.x.x bind_addr x.x.x.x identifier "NORDVPN_VPNV4 "
Feb 19 04:16:04 dpinger 68141 NORDVPN_VPNV4 x.x.x.x: Alarm latency 0us stddev 0us loss 100%
Feb 19 04:19:13 dpinger 16894 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr x.x.x.x bind_addr x.x.x.x identifier "WAN_DHCP "
Feb 19 04:19:13 dpinger 17398 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr x.x.x.x bind_addr x.x.x.x identifier "NORDVPN_VPNV4 "
Feb 19 04:19:15 dpinger 17398 NORDVPN_VPNV4 x.x.x.x: Alarm latency 0us stddev 0us loss 100%
In Firewall -> Rules -> LAN I adjusted the "default allow LAN to any" rule to use the gateway "NordVPN". Outbound NAT is set to manual, with the top rule taking the LAN net as source and the NORDVPN interface.
Any help is appreciated. As said, the current configuration worked fine in 2.4.5 -- the latest release before upgrading to 2.5.0. I'm considering downgrading at this point.
I changed the fallback Data Encryption Algorithm from AES-256-GCM to AES-256-CBC, and it's working fine.
Go to VPN/OpenVPN/Client, and edit the setting "Fallback Data Encryption Algorithm"
NordVPN has posted updated documentation for pfSense 2.5.0, titled: pfSense 2.5 Setup with NordVPN.
As NDK mentioned in their answer, the updated docs show that you need to change the Fallback Data Encryption Algorithm to AES-256-CBC.
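For reference, that GUI field maps roughly to the following OpenVPN 2.5 client options (illustrative only; pfSense generates its own client config from the GUI, so you normally don't edit this by hand):
# OpenVPN 2.5+ data-channel cipher negotiation
data-ciphers AES-256-GCM:AES-256-CBC
# Used when the server does not negotiate a cipher -- this is the
# "Fallback Data Encryption Algorithm" field in the pfSense GUI
data-ciphers-fallback AES-256-CBC
auth SHA512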

UPnP M-SEARCH response does not yield an HTTP GET request. Why?

I am trying to create a MediaServer UPnP program in order to stream video from my phone's camera to my PC.
I used Intel Device Spy to send an M-SEARCH request and used Wireshark to capture the network packets.
Here is the M-SEARCH packet
(Src: 192.168.1.28, Dst: 239.255.255.250; Src Port: 50852, Dst Port: 1900, time 2.09)
M-SEARCH * HTTP/1.1
ST: upnp:rootdevice
MAN: "ssdp:discover"
MX: 5
HOST: 239.255.255.250:1900
Here is the UDP reply
(Src: 192.168.1.23, Dst: 192.168.1.28; Src Port: 53359, Dst Port: 50852)
HTTP/1.1 200 OK
CACHE-CONTROL: max-age=1810
DATE: Wed, 1 Feb 2017 02:07:36 GMT
EXT:
LOCATION: http://192.168.1.23:49156/details.xml
SERVER: Linux/2.x.x, UPnP/1.0, pvConnect UPnP SDK/1.0, TwonkyMedia UPnP SDK/1.1
ST: upnp:rootdevice
USN: uuid:3d64febc-ae6a-4584-853a-85368ca80800::upnp:rootdevice
Content-Length: 0
I do not get a subsequent HTTP GET request to 192.168.1.23. I compared my response to other UPnP devices' responses that worked and could see no difference.
I tried different source ports, but with no success. Any ideas?
@simonc, thank you. I did have a \r\n at the end of my message, but I added another one (to the NOTIFY message as well) and now I can see my device.
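For anyone hitting the same issue: an SSDP response is an HTTP-style message, so the header block must be terminated by an empty line, i.e. a double CRLF. A minimal sketch in plain Java (the addresses and header values are taken from the capture above; adjust them to your device):
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class SsdpReply {
    public static void main(String[] args) throws Exception {
        // Reply unicast to the address/port the M-SEARCH came from.
        InetAddress searcher = InetAddress.getByName("192.168.1.28");
        int searcherPort = 50852;
        String reply =
                "HTTP/1.1 200 OK\r\n"
                + "CACHE-CONTROL: max-age=1810\r\n"
                + "EXT:\r\n"
                + "LOCATION: http://192.168.1.23:49156/details.xml\r\n"
                + "ST: upnp:rootdevice\r\n"
                + "USN: uuid:3d64febc-ae6a-4584-853a-85368ca80800::upnp:rootdevice\r\n"
                + "\r\n"; // final empty line -- without it, control points ignore the reply
        byte[] data = reply.getBytes(StandardCharsets.US_ASCII);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(data, data.length, searcher, searcherPort));
        }
    }
}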

Server sends RST to client when TCP connections exceed ~65000

I am working on a high-load TCP application built with Java and Netty, which is expected to reach 300k concurrent TCP connections.
It works perfectly on the test server, reaching 300k connections, but when deployed to the production server it can only support 65387 connections. After reaching that number, the client throws "java.io.IOException: Connection reset by peer" exceptions. I have tried many times; every time the connection count reaches 65387, the client cannot create new connections.
The network capture is below; 10.95.196.27 is the server, 10.95.196.29 is the client:
16822 12:26:12.480238 10.95.196.29 10.95.196.27 TCP 74 can-ferret > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=872641174 TSecr=0 WS=128
16823 12:26:12.480267 10.95.196.27 10.95.196.29 TCP 66 http > can-ferret [SYN, ACK] Seq=0 Ack=1 Win=2920 Len=0 MSS=1460 SACK_PERM=1 WS=1024
16824 12:26:12.480414 10.95.196.29 10.95.196.27 TCP 60 can-ferret > http [ACK] Seq=1 Ack=1 Win=14720 Len=0
16825 12:26:12.480612 10.95.196.27 10.95.196.29 TCP 54 http > can-ferret [FIN, ACK] Seq=1 Ack=1 Win=3072 Len=0
16826 12:26:12.480675 10.95.196.29 10.95.196.27 HTTP 94 Continuation or non-HTTP traffic
16827 12:26:12.480697 10.95.196.27 10.95.196.29 TCP 54 http > can-ferret [RST] Seq=1 Win=0 Len=0
The exception is caused by the server sending a RST packet to the client right after the three-way handshake completes, which breaks the new connection.
The client-side exception stack is below:
16:42:05.826 [nioEventLoopGroup-1-15] WARN i.n.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the end of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_25]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_25]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225) ~[na:1.7.0_25]
at sun.nio.ch.IOUtil.read(IOUtil.java:193) ~[na:1.7.0_25]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375) ~[na:1.7.0_25]
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:259) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:885) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:226) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:72) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:460) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:424) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:360) ~[netty-all-4.0.0.Beta3.jar:na]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:103) ~[netty-all-4.0.0.Beta3.jar:na]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
The server side shows no exceptions.
I have tried tuning some sysctl items as below to support huge numbers of connections, but it was useless:
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 4096 33554432
net.ipv4.tcp_wmem = 4096 4096 33554432
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.tcp_max_tw_buckets = 360000
net.core.netdev_max_backlog = 4096
vm.min_free_kbytes = 65536
vm.swappiness = 0
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_max_syn_backlog = 4096
net.netfilter.nf_conntrack_max = 3000000
net.nf_conntrack_max = 3000000
net.core.somaxconn = 327680
The max open fd limit is already set to 999999:
linux-152k:~ # ulimit -n
999999
The OS release is SUSE Linux Enterprise Server 11 SP2 with 3.0.13 kernel:
linux-152k:~ # cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2
linux-152k:~ # uname -a
Linux linux-152k 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64 x86_64 x86_64 GNU/Linux.
dmesg shows no error information, CPU and memory stay at low levels, and everything looks good; the server just resets connections from the client.
We have a test server running SUSE Linux Enterprise Server 11 SP1 with a 2.6.32 kernel; it works well and can support up to 300k connections.
I think some kernel or security limit may be causing this, but I can't find it. Any suggestions, or any way to get some debug information on why the server sends RST? Thanks.
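One way to rule out a limits mismatch before suspecting the kernel or JDK is to check what the running server process actually inherited, and the kernel-wide ceilings (a sketch using standard procfs/sysctl interfaces; replace <pid> with the server's PID):
grep 'open files' /proc/<pid>/limits   # limits the running JVM actually has
sysctl fs.file-max                     # system-wide max open file descriptors
sysctl fs.nr_open                      # per-process hard ceiling for ulimit -n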
Santal, I've just come across the following link, and it seems it can answer your question:
What is the theoretical maximum number of open TCP connections that a modern Linux box can have
Finally got the root cause. Simply put, it was a JDK bug; please refer to http://mail.openjdk.java.net/pipermail/nio-dev/2013-September/002284.html,
which causes an NPE when fd > 64 * 1024.
After upgrading to JDK 7u45, everything works great now.

Netty - Framing

I created this small example. I have an EchoServer on port 8080 and a LogServer on port 9090 (just exemplary here). Both are started on the same machine (via Server, which contains the main method).
Server started on port 8080
Server started on port 9090
As soon as a client connects (via telnet), the EchoServer establishes a connection to the LogServer. Now I enter a long text, let's say 5000 characters (see the long_text in the example), even if bash cannot handle it:
EchoServer Received: 1024
LogServer Received: 1024
EchoServer Received: 2048
LogServer Received: 2048
EchoServer Received: 1025
LogServer Received: 1025
If I enter the text again, I am getting:
EchoServer Received: 2048
LogServer Received: 2048
EchoServer Received: 2049
LogServer Received: 2049
Let's do it again:
EchoServer Received: 3072
EchoServer Received: 1025
LogServer Received: 3072
LogServer Received: 1025
And again:
EchoServer Received: 4096
EchoServer Received: 1
LogServer Received: 4096
LogServer Received: 1
The last time:
EchoServer Received: 4097
LogServer Received: 4097
My observation:
First of all, the data is fragmented. Additionally, each time the fragments grow by 1024 bytes (1024, 2048, 3072, 4096, ...). I guess the latter behaviour is because of TCP slow start.
How can I achieve the forwarding to the LogServer without fragmentation, so that my text arrives as one single message? I guess the problem is how I connect to the LogServer.
[EDIT1]
I changed the logs. It seems that it's already happening between telnet and the EchoServer. Anyway, I still have the problem in the real environment: the whole message (a few kilobytes) arrives via WebSockets, and the forwarding to another connection is fragmented.
[EDIT2]
I did some more research (with Wireshark; see the log). I guess it has nothing to do with TCP slow start. The data (I was sending the letter A 4095 times) arrives on the machine as three correct TCP packets:
Frame 1 (1506 bytes) with 1440 bytes TCP data (41 41 41 ... 41 41 41/HEX)
Frame 2 (1506 bytes) with 1440 bytes TCP data (41 41 41 ... 41 41 41/HEX)
Frame 3 (1283 bytes) with 1217 bytes TCP data (41 41 41 ... 41 0d 0a/HEX)
All 4095 A characters + CRLF arrived as expected.
The EchoServer said:
EchoServer Received: 1024
EchoServer Received: 2048
EchoServer Received: 1025
It also received the 4095 characters + CRLF, but it is fragmented differently from the TCP segments (exactly the same as in the first log above). How can I avoid this Netty behavior?
In non-blocking I/O, there's no practical way to get the number of available bytes in the socket receive buffer. Because of that, Netty predicts the number of available bytes. It starts at 1024 and then increases the prediction depending on the number of bytes read. You can change this behavior by employing a different prediction algorithm.
The default implementation is AdaptiveReceiveBufferSizePredictor, and you might want to look at its source code to write your own.
However, no matter what prediction algorithm you choose, you have to keep in mind that TCP/IP is a streaming protocol, which means you can always get messages in a split or merged form. Please refer to the user guide: http://netty.io/docs/stable/guide/html/ (See the 'Dealing with a Stream-based Transport' section.)
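For illustration, here is a sketch of plugging in a fixed-size predictor instead (Netty 3 API, matching the classes this answer names; the 4096-byte size is an arbitrary example):
import java.util.concurrent.Executors;
import org.jboss.netty.bootstrap.ServerBootstrap;
import org.jboss.netty.channel.FixedReceiveBufferSizePredictorFactory;
import org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory;

public class PredictorConfig {
    public static ServerBootstrap newBootstrap() {
        ServerBootstrap bootstrap = new ServerBootstrap(
                new NioServerSocketChannelFactory(
                        Executors.newCachedThreadPool(),   // boss threads
                        Executors.newCachedThreadPool())); // worker threads
        // Read in fixed 4 KiB chunks instead of letting the adaptive
        // predictor grow its estimate from 1024 upwards.
        bootstrap.setOption("child.receiveBufferSizePredictorFactory",
                new FixedReceiveBufferSizePredictorFactory(4096));
        return bootstrap;
    }
}
As noted above, though, no prediction strategy removes the need for proper framing.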
You need a FrameDecoder in your pipeline, which can assemble the bytes from the network into complete frames. In your case I think you need to combine the StringDecoder with a DelimiterBasedFrameDecoder. Take a look at the Telnet example, and specifically the TelnetServerPipelineFactory; a minimal sketch follows.
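A sketch of such a pipeline factory (Netty 3, modeled on the Telnet example; EchoServerHandler is a hypothetical stand-in for your own application handler):
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.frame.DelimiterBasedFrameDecoder;
import org.jboss.netty.handler.codec.frame.Delimiters;
import org.jboss.netty.handler.codec.string.StringDecoder;
import org.jboss.netty.handler.codec.string.StringEncoder;

public class EchoPipelineFactory implements ChannelPipelineFactory {
    public ChannelPipeline getPipeline() {
        ChannelPipeline p = Channels.pipeline();
        // Reassemble the byte stream into complete lines (up to 8192 bytes,
        // delimited by CRLF or LF) before anything else sees the data.
        p.addLast("framer", new DelimiterBasedFrameDecoder(8192, Delimiters.lineDelimiter()));
        p.addLast("decoder", new StringDecoder()); // ChannelBuffer -> String
        p.addLast("encoder", new StringEncoder()); // String -> ChannelBuffer
        p.addLast("handler", new EchoServerHandler()); // your handler, hypothetical name
        return p;
    }
}
With the framer first in the pipeline, your handler's messageReceived sees one complete line per event, regardless of how the bytes arrived on the wire.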
