I am using Facebook's Nifty to run a TCP-based Thrift application. I want to load balance the requests, so I am trying to set up HAProxy as the load balancer. The load balancing part works great, and the basic server check works too: if I take down one server, HAProxy notices and directs traffic to the other.
What does not work is the tcp-check option. Every check it sends causes Netty to throw a java.io.IOException: Connection reset by peer.
So HAProxy is closing the connection while Netty is still talking. What is strange, though, is that the tcp-check expect binary 5550 # UP check itself is passing!
Here is my config for reference:
defaults
mode tcp
timeout connect 5000ms
timeout client 50000ms
listen thrift
bind *:9090
mode tcp
balance roundrobin
option tcpka
retries 3
option tcp-check
tcp-check send-binary 80010001000000135379734f7073536572766963653a636865636b0000000000
# tcp-check expect string UP
tcp-check expect binary 5550 # UP
server thrift01 127.0.0.1:4444 check inter 10s
server thrift02 127.0.0.1:5555 check inter 10s
timeout connect 20s
timeout server 30s
Again, connections work, load balancing works, and the regular SYN/ACK check works. But when I turn on tcp-check I get errors on the server:
[2016-10-06 17:47:17,586] [nifty-server-worker-5] ERROR c.f.nifty.core.NiftyExceptionLogger - Exception triggered on channel connected to /127.0.0.1:39010
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
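For reference, here is a rough Python sketch of what that tcp-check does on the wire against one of the Nifty back-ends (my own approximation, not part of the actual setup): it connects, sends the same hex payload, which is an unframed TBinaryProtocol CALL of SysOpsService:check, reads a bit of the reply looking for "UP" (0x5550), and then simply closes the connection, much like the check does.
import binascii
import socket

# The send-binary payload from the config above: an unframed Thrift
# TBinaryProtocol CALL of "SysOpsService:check" with seqid 0.
CHECK_PAYLOAD = binascii.unhexlify(
    "80010001000000135379734f7073536572766963653a636865636b0000000000"
)

def probe(host="127.0.0.1", port=4444):
    """Approximate the tcp-check: send the call, look for 'UP' (0x5550)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(CHECK_PAYLOAD)
        reply = sock.recv(4096)
        # Closing here, possibly with part of the Thrift response still unread,
        # mirrors how the check connection is torn down and appears to be what
        # Nifty/Netty logs as "Connection reset by peer".
        return b"UP" in reply

print(probe())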
Thanks.
Related
I'm trying to load balance an Atlassian Bitbucket cluster with HAProxy so that SSH endpoints are marked down if the corresponding HTTP status check on the same server fails.
The configuration below used to work fine, but since upgrading to Bitbucket 6.10.5 (which ships a new embedded Tomcat server) I'm now getting the error "Server bitbucket_ssh/bitbucket1 is DOWN, reason: Layer7 invalid response, info: "TCPCHK got an empty response at step 7 comment: 'HTTP Status'".
If I curl http://bitbucket1.mydomain:8200/status, the response comes back {"state":"RUNNING"}, same as before the upgrade.
What could be causing the empty response?
backend bitbucket_ssh
mode tcp
balance roundrobin
option tcp-check
tcp-check comment "SSH Check"
tcp-check connect port 8203
tcp-check expect rstring ^SSH.*$
tcp-check comment "HTTP Status"
tcp-check connect port 8200
tcp-check send GET\ /status\r\n
tcp-check expect string RUNNING
server bitbucket1 bitbucket1.mydomain:8203 check
server bitbucket2 bitbucket2.mydomain:8203 check
server bitbucket3 bitbucket3.mydomain:8203 check
It seems the newer Tomcat server needs an explicit HTTP/1.0 request line plus an extra CR-LF to terminate the request.
Changing:
tcp-check send GET\ /status\r\n
to
tcp-check send GET\ /status\ HTTP/1.0\r\n
tcp-check send \r\n
got it running.
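To see the difference outside HAProxy, here is a small Python sketch that sends both probes as raw bytes (purely illustrative, using the host and port from the question); the second request, with the explicit HTTP/1.0 request line and the terminating blank line, is the one the upgraded Tomcat answers.
import socket

# Raw reproduction of the two health-check probes against the status endpoint
# from the question (bitbucket1.mydomain:8200).
def raw_probe(request_bytes, host="bitbucket1.mydomain", port=8200):
    try:
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(request_bytes)
            return sock.recv(4096)
    except socket.timeout:
        return b""  # nothing came back, i.e. an "empty response"

# Old check: bare request line, no HTTP version and no blank line ending the request.
print(raw_probe(b"GET /status\r\n"))

# New check: explicit HTTP/1.0 request line plus the extra CRLF.
print(raw_probe(b"GET /status HTTP/1.0\r\n\r\n"))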
I have a simple HAProxy setup in front of a back-end server reachable over an IPSec VPN. When I connect directly to the back-end server using curl, the request goes through successfully, but when I go through HAProxy to the same back-end over the VPN, the request is dropped with a 503 error. From the logs it seems the connection is being aborted prematurely, but I cannot work out why. Also, both requests work when I use remote servers available over the Internet as back-ends, where no VPN is involved. Am I missing a specific config or something for HAProxy over a VPN?
Note: I have deliberately not set a health check on the back-end.
HAProxy config:
defaults
mode http
# option httplog
log global #use log set in the global config
log-format \"[Lo:%ci/%cp; Re:%si/%sp] [Proxy - %bi:%bp/%fi:%fp] [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r\"
option dontlognull
option http-keep-alive
option forwardfor except 127.0.0.0/8
option redispatch
retries 2
timeout http-request 10s #maximum allowed time to wait for a complete HTTP request
timeout queue 10s #maximum time to wait in the queue for a connection slot to be free
timeout connect 5s #maximum time to wait for a connection attempt to a server to succeed
timeout client 5s #maximum inactivity time on the client side
timeout server 5s #maximum inactivity time on the server side
timeout http-keep-alive 30s #maximum allowed time to wait for a new HTTP request to appear
timeout check 10s
maxconn 5000
##-----------------------------------------------------
## API Requests
##-----------------------------------------------------
## frontend to proxy HTTP callbacks coming from App servers to VPN Server
frontend api_requests
mode http
bind 10.132.2.2:80
bind 127.0.0.1:80
default_backend testbed
## backend to proxy HTTP requests from App Servers to VPN Server
backend testbed
balance roundrobin
server broker 196.XXX.YYY.136:80
Entry captured in the traffic log for a failed attempt over the VPN:
May 30 09:15:10 localhost haproxy[22844]: [Lo:127.0.0.1/56046; Re:196.XXX.YYY.136/80] [Proxy - :0/127.0.0.1:80] [30/May/2019:09:15:10.285] api_requests testbed/broker 0/0/-1/-1/0 503 212 - - SC-- 1/1/0/0/2 0/0 "POST /request HTTP/1.1"
What could cause a curl request to be accepted but a proxied request from HAProxy to be dropped, specifically over the VPN connection? Has anyone faced a similar issue before?
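For debugging, this is the kind of quick Python probe I can run from the HAProxy host to check TCP reachability of the back-end, once with the default source address (like curl) and once sourced from the HAProxy bind address, in case the VPN only carries traffic from particular local IPs. The 196.XXX.YYY.136 placeholder from the config has to be replaced with the real back-end address.
import socket

# Quick TCP reachability probe; 196.XXX.YYY.136 is the placeholder from the
# config and must be replaced with the real back-end IP.
def tcp_probe(host, port=80, source_ip=None, timeout=5):
    source = (source_ip, 0) if source_ip else None
    try:
        socket.create_connection((host, port), timeout=timeout,
                                 source_address=source).close()
        return "ok"
    except OSError as exc:
        return exc

print(tcp_probe("196.XXX.YYY.136"))                          # default source, like curl
print(tcp_probe("196.XXX.YYY.136", source_ip="10.132.2.2"))  # sourced from the HAProxy bind IP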
I configured an nginx instance as a reverse proxy for a websocket server and established a websocket connection between client and server, following the official tutorial https://www.nginx.com/blog/websocket-nginx/.
Then I ran nginx -s quit to gracefully shut down nginx.
I found that a worker process stays in the 'shutting down' state, and I can still send messages over the established websocket connection; the nginx master and worker processes then hang until the connection times out.
I'd like to know whether nginx can tell both the client and the server to close the socket connection at the transport level and exit normally, instead of waiting for the websocket to time out.
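For what it's worth, a small client like the one below (using the third-party websocket-client package; the URL is a placeholder for my proxied endpoint) is enough to keep a connection open and watch the old worker linger in 'shutting down' after nginx -s quit.
import time
import websocket  # third-party "websocket-client" package

# Placeholder URL for the websocket endpoint proxied by nginx.
ws = websocket.create_connection("ws://localhost/ws")

# After `nginx -s quit`, the connection stays usable and the old worker keeps
# running in "shutting down" until this client disconnects or times out.
for _ in range(10):
    ws.send("ping")
    print(ws.recv())  # assumes the backend echoes, as the tutorial's example server does
    time.sleep(5)

ws.close()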
Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement.
This could be because the pre-login handshake failed or the server was unable to respond back in time.
It seems to be a firewall issue: the firewall appears to be blocking connections to SQL Server on port 1433. You will need to add a rule to allow connections on this port.
You can find more on this here: SQL Server Pre-Login Handshake Acknowledgement Error
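If you want to confirm this before changing any firewall rules, a quick probe along the following lines (the host name is a placeholder for your SQL Server instance) will tell you whether port 1433 is reachable at all.
import socket

# "your-sql-server" is a placeholder for the server in your connection string.
def port_open(host, port=1433, timeout=5):
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:
        return False

print(port_open("your-sql-server"))  # False usually points at a firewall or listener issue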
I have a service which will listen on port 8443 after it is launched.
I have xinetd configured to start my service when a connection is made on port 8443.
So xinetd should launch my application and then let it handle any further incoming connections.
Instead I'm getting repeated "warning: can't get client address: Transport endpoint is not connected" messages, and then xinetd disables my service for 10 seconds.
This only happens when I set wait = yes.
Stopping my application from listening to port 8443 doesn't make a difference.
Is my understanding of the xinetd wait flag correct, or am I doing something wrong with the xinetd configuration?
I've looked at the man pages; wait = yes is usually associated with UDP, but nothing there says you can't use it with TCP.
I searched on SO and everything I found has TCP working with wait = no.
I am getting the following errors when connecting to xinetd:
5786]: warning: can't get client address: Transport endpoint is not connected
5564]: EXIT: MyApplication status=1 pid=5786 duration=0(sec)
5564]: START: MyApplication pid=5787 from=<no address>
5787]: warning: can't get client address: Transport endpoint is not connected
5564]: EXIT: MyApplication status=1 pid=5787 duration=0(sec)
5564]: Deactivating service MyApplication due to excessive incoming connections. Restarting in 10 seconds.
5564]: FAIL: MyApplication connections per second from=<no address>
5564]: Activating service MyApplication
My configuration is:
disable = no
socket_type = stream
protocol = tcp
wait = yes
user = user
server = /usr/bin/MyApplication
port = 8443
type = UNLISTED
flags = IPv4
Just in case anyone else comes across this, I was looking for the same thing. From this comment by one of the maintainers, Steve Grubb, here, he says:
A wait service is one that accepts the connection. telnet does not accept the connection - it expects the connection to be accepted and the socket dup'ed to the stdin/out descriptors.
There is also no way for xinetd to check to see if a child has accepted the connection. Therefore, when you have a program that's configured as a wait service and it doesn't accept the connection, the socket is still readable when xinetd goes back to the select loop. Around and around it goes.
The child server is launched when xinetd forks and then calls exec_server. With wait = yes, as mentioned above, the child process must accept the connection on the socket itself. The listening socket is available on file descriptor 0, which can be passed to accept. I am using Python with the following:
import socket

# xinetd (wait = yes) passes the listening socket as file descriptor 0;
# rebuild a socket object from it and accept the pending connection ourselves.
s = socket.fromfd(0, socket.AF_INET, socket.SOCK_STREAM)
print(s.accept())
which returns (conn, address) where conn is a new socket object usable to send and receive data on the connection.
From the man pages:
wait: This attribute determines if the service is single-threaded or multi-threaded and whether or not xinetd accepts the connection or the server program accepts the connection. If its value is yes, the service is single-threaded; this means that xinetd will start the server and then it will stop handling requests for the service until the server dies and that the server software will accept the connection. If the attribute value is no, the service is multi-threaded and xinetd will keep handling new service requests and xinetd will accept the connection. It should be noted that udp/dgram services normally expect the value to be yes since udp is not connection oriented, while tcp/stream servers normally expect the value to be no.
So with wait = yes the service is single-threaded: once there is a connection, xinetd will not handle anything else for that service until the current session disconnects or the script ends.
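Putting those pieces together, here is a slightly fuller sketch of a wait = yes server that keeps accepting clients itself on the inherited listening socket (the one xinetd bound on port 8443); the echo behaviour is just a placeholder for whatever MyApplication actually does.
import socket

# Fuller sketch of a wait = yes server: xinetd hands this process the listening
# socket on file descriptor 0, and the process then accepts every client itself.
listener = socket.fromfd(0, socket.AF_INET, socket.SOCK_STREAM)

while True:
    conn, address = listener.accept()   # our accept, not xinetd's
    with conn:
        data = conn.recv(1024)
        if data:
            conn.sendall(data)          # placeholder echo for the real service logic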