Health check to detect redis master from google tcp load balancer - networking

I am trying to set up a Google internal TCP load balancer. The instance group behind this LB consists of redis-server processes listening on port 6379. Only one of these Redis instances is the master.
Problem: Add a TCP health check that detects the Redis master, so that the LB diverts all traffic to the master only.
Approach:
Added a TCP Health Check for the port 6379.
In order to send the command role to redis-server process and parse the response, I am using the optional params provided in the health check. Please check the screenshot here.
Result: Health check is failing for all. If I remove the optional request/response params, health check starts passing for all.
Debugging:
1. Connected to the LB using netcat and issued the command role. It sends the response starting with *3 (for master) and *5 (for slave), as expected.
2. Logged into an instance and stopped the redis-server process. Started listening on port 6379 using nc -l -p 6379 to check what exactly is received on the instance's side during the health check. It does receive role\r\n.
3. After step 2, restarted redis-server and ran the MONITOR command in redis-cli to watch the log of commands received by this process. There is no log of role.
This means the instance is receiving the data (role\r\n) over TCP, but either the redis-server process never sees it (as per the MONITOR output) or something else is happening. Please help.

Unfortunately, GCP's TCP health checks are pretty limited in what they can check in the response. From https://cloud.google.com/sdk/gcloud/reference/compute/health-checks/create/tcp:
--response=RESPONSE
An optional string of up to 1024 characters that the health checker expects to receive from the instance. If the response is not received exactly, the health check probe fails. If --response is configured, but not --request, the health checker will wait for a response anyway. Unless your system automatically sends out a message in response to a successful handshake, only configure --response to match an explicit --request.
Note the word "exactly" in the help message. The response has to match the provided string in full. One can't specify a partial string to search for in the response.
As you can see on https://redis.io/commands/role, redis's ROLE command returns a bunch of text. Though the substring "master" is present in the response, it also has a bunch of other text that would vary from setup to setup (based on the number of slaves, their addresses, etc.).
You should definitely raise a feature request with GCP for regex matching on the response. A possible workaround until then is to run a little web app on each host that runs "redis-cli role | grep master" locally and returns the result. The health check can then be configured to monitor this web app.
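A minimal sketch of such a web app, using only the Python standard library. Instead of shelling out to redis-cli it sends ROLE over a socket and looks at the first element of the reply. The address 127.0.0.1:6379, the port 8080, and the names is_master, query_role and make_server are my own assumptions for illustration, and it assumes Redis does not require AUTH.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

REDIS_ADDR = ("127.0.0.1", 6379)  # assumption: redis-server runs on the same host
LISTEN_PORT = 8080                # hypothetical port for the health check to probe


def is_master(reply: bytes) -> bool:
    """Return True if a raw reply to ROLE reports the master role.

    The reply is an array whose first element is the role name, so a
    master's reply starts with: *3 CRLF $6 CRLF master CRLF.
    """
    parts = reply.split(b"\r\n")
    return len(parts) >= 3 and parts[0].startswith(b"*") and parts[2] == b"master"


def query_role(addr=REDIS_ADDR, timeout=1.0) -> bytes:
    """Send ROLE to redis-server and return the first chunk of the reply."""
    with socket.create_connection(addr, timeout=timeout) as sock:
        sock.sendall(b"ROLE\r\n")
        return sock.recv(4096)


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            healthy = is_master(query_role())
        except OSError:
            healthy = False  # Redis down or unreachable counts as unhealthy
        self.send_response(200 if healthy else 503)
        self.end_headers()
        self.wfile.write(b"master\n" if healthy else b"not master\n")


def make_server(port=LISTEN_PORT) -> HTTPServer:
    """Build the HTTP server; call .serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Run it with make_server().serve_forever() on every instance and point an HTTP health check at port 8080. Only the master answers 200, so only the master stays in rotation.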

Related

Ping operation by dynatrace?

The sitescope tool has the functionality for checking the Ping operation, with frequency of pinging the application configurable and email alerts as well.
Does Dynatrace support ping operation and email alerting ?
You can't do a ping from Dynatrace, but that is probably not what you want to do anyway, because it just tells you that the host is up and reachable via ICMP.
What you can do with Dynatrace is execute a synthetic HTTP call against an endpoint on that host to see if your application is up and running.

gRPC client reconnect inside Kubernetes

If we define our microservice inside Kubernetes pods, do we need to instrument a gRPC client reconnection if the service pod is restarting?
When the pod restarts, the host name does not change, but we cannot guarantee that the IP address remains the same. So will the gRPC client still be able to detect the new server and reconnect to it?
When the TCP connection is disconnected (because the old pod stopped) gRPC's channel will attempt to reconnect with exponential backoff. Each reconnect attempt implies resolving the DNS address, although it may not detect the new address immediately because of the TTL (time-to-live) of the old DNS entry. Also, I believe some implementations resolve the address when a failure is detected instead of before an attempt.
This process happens naturally without your application doing anything, although it may experience RPC failures until the connection is re-established. Enabling "wait for ready" on an RPC would reduce the chances the RPC fails during this period of transition, although such an RPC generally implies you don't care about response latency.
If the DNS address is not (eventually) re-resolved, then that would be a bug and you should file an issue.
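To make the reconnect behaviour concrete, here is a plain-socket sketch of the same idea: back off exponentially between attempts and resolve the host name again on each attempt. This is not gRPC code (the channel does this for you), and the constants are the defaults as I remember them from gRPC's connection backoff document, so treat them as assumptions. The names backoff_delays and connect_with_retry are made up for this example.

```python
import random
import socket
import time

# Assumed defaults, recalled from gRPC's connection backoff document.
INITIAL_BACKOFF = 1.0   # seconds
MULTIPLIER = 1.6
JITTER = 0.2
MAX_BACKOFF = 120.0     # seconds


def backoff_delays(attempts, rng=random.random):
    """Yield the wait (in seconds) that follows each failed attempt."""
    current = INITIAL_BACKOFF
    for _ in range(attempts):
        # Jitter spreads the delay over +/- JITTER of its nominal value.
        yield current * (1 + JITTER * (2 * rng() - 1))
        current = min(current * MULTIPLIER, MAX_BACKOFF)


def connect_with_retry(host, port, attempts=5, sleep=time.sleep):
    """Try to connect, resolving the host name again on every attempt."""
    for delay in backoff_delays(attempts):
        try:
            # create_connection() performs a fresh lookup on each call, so a
            # new pod IP is picked up once the DNS entry has been updated.
            return socket.create_connection((host, port), timeout=2.0)
        except OSError:
            sleep(delay)
    raise ConnectionError(f"could not reach {host}:{port} after {attempts} attempts")
```

While this loop is running, RPCs either fail fast or, with "wait for ready" enabled, queue until the connection is back.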
You need client-side load balancing as described here. You can watch the endpoints of a service with the Kubernetes API. I have created a package for the Go programming language and it is on GitHub. Sorry, but I haven't written any documentation yet. The basic concept is to get the service endpoints at the beginning, then watch the service endpoints for changes.

first test in Kamailio

I have just installed Kamailio SIP Server following the instructions on the official site. Then I started the server listening for SIP messages and added a "test" user. The tutorial ends there, and I have no idea how to test whether it works correctly. Is there some simple "hello world" config to run, or a way to write a simple test and execute it in that environment? All I've found on Google are module and function descriptions. Thanks for any help; "real" examples are vital :)
I assume you have chosen a domain for your SIP server (mysipserver.com in the tutorial). I'm also assuming that you have chosen a domain name that you own.
Step1: check NAPTR & SRV record (optional, but at least SRV is good to have)
In theory, SIP applications will perform NAPTR and SRV lookups to locate the server for your service. This is described in RFC 3263 and means you are supposed to configure your DNS entries so that SIP applications can find the IP of your server. Check this page for an example!
Then, you can test NAPTR for your service (replace antisip.com, with your domain name)
~$ host -t NAPTR antisip.com
antisip.com has NAPTR record 0 0 "s" "SIPS+D2T" "" _sips._tcp.antisip.com.
antisip.com has NAPTR record 2 0 "s" "SIP+D2U" "" _sip._udp.antisip.com.
antisip.com has NAPTR record 1 0 "s" "SIP+D2T" "" _sip._tcp.antisip.com.
Then, use one of the answers to test the SRV queries:
~$ host -t SRV _sips._tcp.antisip.com.
_sips._tcp.antisip.com has SRV record 0 0 5061 sip.antisip.com.
_sips._tcp.antisip.com has SRV record 0 0 5061 sip2.antisip.com.
In the example above sip.antisip.com and sip2.antisip.com are running the sip services for antisip.com
Step2: Without NAPTR/SRV, at least check DNS
To make it simple, if you have one server, just make sure your domain resolves to your server's IP address:
~$ ping antisip.com
PING antisip.com (91.121.78.130) 56(84) bytes of data.
Note that for me, antisip.com is also the sip.antisip.com server.
Step3: Testing from Windows
The easiest from this point is to test on your favorite desktop OS. This will allow you to start a network capture.
You can download this very simple demo. It's a very basic SIP app, but that's easier for testing: VoipByAntisip.exe for Windows
Install Wireshark and start it. Then start a capture and apply the "sip" display filter. Later you may also want the "dns" and "rtp" filters.
Test UDP, TCP and then TLS:
To test UDP, in settings, configure:
Proxy: mysipserver.com
username: test
password: yourpassword
protocol: UDP
To test TCP, in settings, modify:
protocol: TCP
To test TLS (without certificate verification), in settings, modify:
protocol: TLS
After applying the change, the box on the left of REFRESH button should become green with 200 OK written. If not, UDP doesn't work and either the answer code is written, or a 408 Timeout is provided to indicate no answer.
If you have registered correctly, meaning you have received a 200 OK, then the "location" table of your Kamailio database should contain the newly registered user.
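If you would rather check from a script than from a GUI client, a SIP OPTIONS "ping" over UDP is about the smallest possible test. The sketch below uses only the Python standard library. The function names are my own, and whether your Kamailio config answers OPTIONS with 200 or with some other code is an assumption you should verify. Any SIP response at all shows that the server is listening and parsing requests.

```python
import socket
import uuid


def build_options(server, local_ip="127.0.0.1", local_port=5060, user="test"):
    """Build a minimal SIP OPTIONS request (RFC 3261) as bytes."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # branch must start with this cookie
    lines = [
        f"OPTIONS sip:{server} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{server}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{server}>",
        f"Call-ID: {uuid.uuid4().hex}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",  # the two empty items produce the blank line that ends the headers
    ]
    return "\r\n".join(lines).encode()


def parse_status(response):
    """Return (code, reason) from the status line of a SIP response."""
    first = response.split(b"\r\n", 1)[0].decode(errors="replace")
    version, code, reason = first.split(" ", 2)
    if version != "SIP/2.0":
        raise ValueError(f"not a SIP response: {first!r}")
    return int(code), reason


def sip_ping(server, port=5060, timeout=2.0):
    """Send OPTIONS over UDP and return the (code, reason) of the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((server, port))  # fixes the peer so we can learn our own address
        local_ip, local_port = sock.getsockname()
        sock.send(build_options(server, local_ip, local_port))
        return parse_status(sock.recv(4096))
```

Calling sip_ping("mysipserver.com") either returns a status such as (200, "OK") or raises a timeout, which corresponds to the 408 case in the GUI client.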
Test calls:
Of course, you also need to test calls.
The tutorial doesn't mention that you need an RTP relay! But usually, if you wish to make calls between SIP User-Agents, an application that relays RTP, like rtpproxy, needs to be installed and configured to work with Kamailio on your server. Without the relay, you should still be able to call (talk) between two SIP applications running on the same LAN.
In order to test calls, you will need to create a second user (test2?) and configure another PC to use this account. Then, in Voip By Antisip for Windows, use the start call box and enter sip:test2@mysipserver.com. The network capture should show an INVITE being sent to your server. This INVITE should be relayed to the second user and received by the test2 SIP application.
If your SIP server is up and running, then go ahead and use an Android phone to test whether it works fine. You can use the 'csipsimple' client to connect to a SIP server. For more details, check out this tutorial.
There are also other SIP clients available for various platforms: PC, Android, iOS, etc.

Service dies when Nmap is run

I am having a weird problem.
I have a service running on port 8888 on one of my many servers in a cluster.
When I run nmap on my gateway to get all the IPs inside my network, this service miraculously dies. Since nmap does a port scan too, it might have something to do with that. I am not sure.
The nmap command I am using is this:
sudo nmap -oX ${FILE_NAME} ${IP_DOMAIN} -A -O --osscan-guess
Can someone tell me what might be happening?
While Nmap developers try to limit the danger, Nmap scans can still crash services. The most likely culprit for crashing a service (as opposed to crashing an entire machine) is the service version detection scan phase (-sV, implied in your command by -A). This scan sends a series of data packets to the service in an attempt to elicit a response which can be matched against Nmap's database of known services. When a match is found, Nmap stops sending probes. That means that an unknown service can get lots of probes sent to it which contain binary data, command strings, and other data that your service is not expecting.
A well-written network service will not crash on any input; your service has a bug of some sort. Avoiding this sort of crash usually means avoiding scanning that service:
You can use the Exclude directive in your nmap-service-probes data file to instruct Nmap to never send these service probes to port 8888.
You can avoid scanning port 8888 at all by changing the ports you scan with -p. Later versions of Nmap will support the --exclude-ports option, too.
You can make sure you are using the latest version of Nmap. If your service's fingerprint was added to the nmap-service-probes file, then Nmap will stop sending probes when it detects it, which may avoid sending the later probe that crashes it.
You can reduce the intensity of the service scan with the --version-intensity option. This prevents Nmap from sending so many service probes, which may eliminate the one that is crashing your service.
Finally, if this service is a standard one and not something custom to your own network, you can report it to The Network Scanning Watch List so that other users can avoid crashing it as well.
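On the "a well-written service will not crash on any input" point: the usual fix on the service side is to bound how much you read, treat decoding as something that can fail, and answer unknown input with an error instead of an exception. A minimal sketch with Python's socketserver follows. The PING/PONG protocol, the 1024-byte limit, and the names handle_line and make_server are invented for illustration and are not taken from your service.

```python
import socketserver

MAX_LINE = 1024  # hypothetical upper bound on one request line, in bytes


def handle_line(raw: bytes) -> bytes:
    """Parse one request line defensively; always return a reply."""
    if len(raw) > MAX_LINE:
        return b"ERR line too long\n"
    try:
        text = raw.decode("utf-8").strip()
    except UnicodeDecodeError:
        # Binary probes from a scanner end up here instead of crashing us.
        return b"ERR not valid UTF-8\n"
    if text == "PING":
        return b"PONG\n"
    return b"ERR unknown command\n"


class Handler(socketserver.StreamRequestHandler):
    timeout = 5  # seconds; drop clients that connect and then send nothing

    def handle(self):
        try:
            # Read at most one byte past the limit so oversize lines are detectable.
            raw = self.rfile.readline(MAX_LINE + 1)
            self.wfile.write(handle_line(raw))
        except OSError:
            pass  # client vanished or timed out; nothing worth crashing over


def make_server(port=8888):
    """Build the TCP server; call .serve_forever() on the result to run it."""
    return socketserver.ThreadingTCPServer(("0.0.0.0", port), Handler)
```

With a handler shaped like this, Nmap's version probes just get an error line back and the process keeps running.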

Accept INVITE only after REGISTER

I run my own sip server (asterisk). Apparently my sip server allows to perform an INVITE without doing any REGISTER first. This leads to lots of unsuccessful attacks on my server. IS there any way to allow INVITE requests only from a successfully REGISTERed clients? Through asterisk or iptables?
You need to change the allowguest parameter to no in your sip.conf.
Check the link below for more tips about security in asterisk:
http://blogs.digium.com/2009/03/28/sip-security/
My study so far tells me that REGISTER only lets Asterisk know where to reach the client or forward INVITEs to it; it is not used to authenticate an INVITE request. When an INVITE comes in, Asterisk checks the given user name and, if it's a valid one, sends a 407 (Proxy Authentication Required) back to the client. The client then resends the INVITE with a digest computed from the password. The server authenticates the user and, when the credentials match, proceeds with establishing the call.
Conclusion: An INVITE has no relation with REGISTER and so my idea of restricting only REGISTERED clients to send an INVITE is not possible.
As a workaround, I have written my own script. Source is at https://github.com/naidu/JailMe
Consider a real Session Border Controller which pays for itself quickly when you get hacked. However, if you want a "good enough" option then read on:
There is an iptables module called "string" which will search a packet for a given string. In the case of SIP we expect to see "REGISTER" in the first packet from any given address, so combine this with -m state --state NEW or something similar. After that, we would want keep-alive happening to ensure that connection tracking remains open (usually Asterisk sends OPTIONS, but it can send empty UDP). You want that anyway in case the client is behind NAT.
It's not the ideal solution, because iptables cannot figure out whether a registration has been successful, but at least we can insist the other guy makes an attempt at registration. One of the answers linked below shows use of the string module in iptables:
https://security.stackexchange.com/questions/31957/test-firewall-rules-linux
You could also put an AGI script into your dialplan that does some additional checking, potentially looking at IP address and whether the extension is registered... ensure the INVITE comes from the same source IP.
Fail2Ban is an easy way to block unwanted traffic! Fail2Ban checks the system logs for failed attempts. If there are too many failed attempts (exceeding a defined threshold) from some remote IP within a specified time, Fail2Ban considers it an attack and adds that IP address to iptables to block any type of traffic from it. The following links can help:
http://www.voip-info.org/wiki/view/Fail2Ban+(with+iptables)+And+Asterisk
http://www.markinthedark.nl/news/ubuntu-linux-unix/70-configure-fail2ban-for-asterisk-centos-5.html
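The counting rule Fail2Ban applies (too many failures from one IP within a time window) is simple enough to sketch. This is not Fail2Ban's code, only an illustration of the idea. The class name FailureTracker is made up, and max_retry and find_time merely mirror the maxretry and findtime settings from a jail configuration.

```python
from collections import defaultdict, deque


class FailureTracker:
    """Sliding-window count of failed attempts per remote IP."""

    def __init__(self, max_retry=5, find_time=600.0):
        self.max_retry = max_retry    # failures that trigger a ban
        self.find_time = find_time    # window length in seconds
        self._failures = defaultdict(deque)

    def record_failure(self, ip, now):
        """Record one failure at time `now`; return True if `ip` should be banned."""
        window = self._failures[ip]
        window.append(now)
        # Forget failures that have fallen out of the window.
        while window and now - window[0] > self.find_time:
            window.popleft()
        return len(window) >= self.max_retry
```

In a real setup the failure events come from the Asterisk log, and a True result would translate into an iptables rule that drops traffic from that address.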
