Postfix Server - How do I block an IP that is constantly trying to connect - postfix-mta

Firstly, thanks for looking.
I have recently set up a postfix mailserver with dovecot using the following guide:
https://www.linuxbabe.com/mail-server/ubuntu-18-04-iredmail-email-server
I have been monitoring the logs in /var/log/mail.log and the following entry keeps showing up every minute:
May 12 14:09:47 mail postfix/postscreen[32610]: CONNECT from [102.68.24.27]:59165 to [MYIP ADDRESS]:25
May 12 14:09:47 mail postfix/postscreen[32610]: PASS OLD [102.68.24.27]:59165
May 12 14:09:47 mail postfix/smtpd[32613]: warning: hostname yourcommunications.co.za does not resolve to address 102.68.24.27
May 12 14:09:47 mail postfix/smtpd[32613]: connect from unknown[102.68.24.27]
May 12 14:09:47 mail postfix/smtpd[32613]: lost connection after EHLO from unknown[102.68.24.27]
May 12 14:09:47 mail postfix/smtpd[32613]: disconnect from unknown[102.68.24.27] ehlo=1 commands=1
Please could someone help me block this IP from connecting?
I am on Ubuntu 20.04 using iredmail 1.4.0.
Any help would be greatly appreciated.

Look at these postfix anvil parameters:
anvil_rate_time_unit (default: 60s):
The time unit over which client connection rates and other rates are calculated.
anvil_status_update_time (default: 600s):
How frequently the anvil(8) connection and rate limiting server logs peak usage information.
smtpd_client_connection_count_limit (default: 50):
The maximum number of connections that an SMTP client may make simultaneously.
smtpd_client_connection_rate_limit (default: no limit):
The maximum number of connections that an SMTP client may make in the time interval specified with anvil_rate_time_unit (default: 60s).
smtpd_client_message_rate_limit (default: no limit):
The maximum number of message delivery requests that an SMTP client may make in the time interval specified with anvil_rate_time_unit (default: 60s).
smtpd_client_recipient_rate_limit (default: no limit):
The maximum number of recipient addresses that an SMTP client may specify in the time interval specified with anvil_rate_time_unit (default: 60s).
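For example, to slow down a single client that reconnects every minute, you could cap connection counts and rates in /etc/postfix/main.cf and then reload Postfix (postfix reload). The values below are illustrative only, and note that these anvil limits apply to every client, not just the one IP:
anvil_rate_time_unit = 60s
smtpd_client_connection_count_limit = 10
smtpd_client_connection_rate_limit = 5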

Related

Troubleshoot RServe config option keep.alive

I am using RServe 1.7.3 on a headless RHEL 7.9 VM. On the client, I am using RserveCLI2.
On long-running jobs, the TCP/IP connection is blocked by a firewall after 2 hours.
I came across the keep.alive configuration option, which has been available since RServe 1.7.2 (RServe News/Changelog).
The specs read:
added support for keep.alive configuration option - it is global to
all servers and if enabled the client sockets are instructed to keep
the connection alive by periodic messages.
I added the following to /etc/Rserv.conf:
keep.alive enable
but this does not prevent the connection from being blocked.
Unfortunately, I cannot run a network monitoring tool, like Wireshark, to monitor the traffic between client and server.
How could I troubleshoot this?
Some specific questions I have:
Is the path of the config file indeed /etc/Rserv.conf, as specified in the Documentation for Rserve? Note that it does not have a final e, unlike the name Rserve.
Does this behaviour depend on the RServe client in use, or is this handled completely at the socket level?
Can I inspect the runtime settings of RServe, to see if keep.alive is enabled?
We got this to work.
To summarize, we adjusted some kernel settings to make sure keep-alive packets are sent at shorter intervals, to prevent the connection from being deemed dead by network components.
This is how and why.
The keep.alive enable setting is in fact an instruction to the socket layer to periodically emit keep-alive packets from server to client. The client is expected to return an ACK on these packets. The behaviour is governed by three kernel-level settings, as explained in TCP Keepalive HOWTO - Using TCP keepalive under Linux:
tcp_keepalive_time (defaults to 7200 seconds)
tcp_keepalive_intvl (defaults to 75 seconds)
tcp_keepalive_probes (defaults to 9 times)
tcp_keepalive_time is the delay before the first keep-alive packet is sent after the TCP/IP connection is established. tcp_keepalive_intvl is the wait time between subsequent packets, and tcp_keepalive_probes is the number of consecutive unacknowledged packets after which the system decides the connection is dead.
So, the first keep-alive packet was only sent after 2 hours. By that time, some network component had already decided the connection was dead, so the keep-alive packet never made it to the client and no ACK was ever sent.
We lowered both tcp_keepalive_time and tcp_keepalive_intvl to 600 seconds.
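For illustration, the runtime change on Linux can be made with sysctl (a sketch; persistent values would normally go in /etc/sysctl.conf or a file under /etc/sysctl.d/):
sysctl -w net.ipv4.tcp_keepalive_time=600
sysctl -w net.ipv4.tcp_keepalive_intvl=600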
With tcpdump -i [interface] port 6311 we were able to monitor the keep-alive packets.
15:40:11.225941 IP <server>.6311 > <some node>.<port>: Flags [.], ack 1576, win 237, length 0
15:40:11.226196 IP <some node>.<port> > <server>.6311: Flags [.], ack 401, win 511, length 0
This continues until the results are sent back and the connection is closed. At least, I tested this for a duration of 12 hours.
So, we use keep-alive here not to check for dead peers, but to prevent disconnection due to network inactivity, as is discussed in TCP Keepalive HOWTO - 2.2. Why use TCP keepalive?. In that scenario, you want to use low values for keep-alive time and interval.
Note that these are kernel-level settings, and thus are applied system-wide. We use a dedicated server, so this is not an issue for us, but it may be in other cases.
Finally, for completeness, I'll answer my own three questions.
The path of the configuration file is indeed /etc/Rserv.conf, as was confirmed by changing another setting (remote enable to remote disable).
This is handled at the socket level.
I am not sure, but using tcpdump shows that Rserve emits keep-alive packets, which is a more useful way to inspect what's happening.

Why can't I connect more than 8000 clients to MQTT brokers via HAProxy?

I am trying to establish 10k client connections (potentially 100k) to my 2 MQTT brokers using HAProxy as a load balancer.
I have a working simulator (using the Java Paho library) that can simulate 10k clients. On the same machine I run 2 MQTT brokers in Docker. For the LB I'm using another machine running a virtual machine image of Ubuntu 16.04.
When I connect directly to an MQTT broker, those connections are established without a problem. However, when I use HAProxy I only get around 8.8k connections, while the rest throw: Error at client{insert number here}: Connection lost (32109) - java.net.SocketException: Connection reset. When I connect the simulator directly to a broker (same machine), about 20k TCP connections open; when I use the load balancer, only 17k do. This leaves me thinking that the LB is causing the problem.
It is important to add that whenever I run the simulator I'm unable to use the browser (cannot connect to the internet). I haven't tested whether this affects only the browser, but could it mean that I actually run out of ports or something similar, and the real issue here is not in the LB?
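As a quick sanity check of the port-exhaustion theory on a Linux machine, the ephemeral port range and current socket usage can be inspected while the simulator runs (illustrative commands, not output from this setup):
sysctl net.ipv4.ip_local_port_range    # size of the ephemeral port pool
ss -s                                  # summary of sockets currently in use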
Here is my HAProxy configuration:
global
    log /dev/log local0
    log /dev/log local1 notice
    maxconn 500000
    ulimit-n 500000
    maxpipes 500000
defaults
    log global
    mode http
    timeout connect 3h
    timeout client 3h
    timeout server 3h
listen mqtt
    bind *:8080
    mode tcp
    option tcplog
    option clitcpka
    balance leastconn
    server broker_1 address:1883 check
    server broker_2 address:1884 check
listen stats
    bind 0.0.0.0:1936
    mode http
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /
This is what the MQTT broker shows for every successful/unsuccessful connection:
...
//Successful connection
1613382861: New connection from xxx:32850 on port 1883.
1613382861: New client connected from xxx:60974 as 356 (p2, c1, k1200, u'admin').
...
//Unsuccessful connection
1613382699: New connection from xxx:42861 on port 1883.
1613382699: Client <unknown> closed its connection.
...
And this is what ulimit -a shows on the LB machine.
core file size (blocks) (-c) 0
data seg size (kb) (-d) unlimited
scheduling priority (-e) 0
file size (blocks) (-f) unlimited
pending signals (-i) 102355
max locked memory (kb) (-l) 82000
max memory size (kb) (-m) unlimited
open files (-n) 500000
POSIX message queues (bytes) (-q) 819200
real-time priority (-r) 0
stack size (kb) (-s) 8192
cpu time (seconds) (-t) unlimited
max user processes (-u) 500000
virtual memory (kb) (-v) unlimited
file locks (-x) unlimited
Note: The LB process has the same limits.
I followed various tutorials and increased the open file limit as well as the port limit, TCP header size, etc. The number of connected users increased from 2.8k to about 8.5-9k (which is still far lower than the 300k the author of the tutorial achieved). The ss -s command shows roughly 17,000 TCP and inet connections.
Any pointers would greatly help!
Thanks!
You can't do a normal LB of MQTT traffic, because you can't "pin" the connection based on the MQTT topic. If one client sends a SUBSCRIBE for topic "test/blatt/#" to Broker1, but the next client PUBLISHes "test/blatt/foo" to Broker2, then, if the two brokers are not bridged, your first subscriber will never get that message.
If your clients are terminating the TCP connection sometime after the CONNECT, or HAProxy is round-robining the packets between the two brokers, you will get errors like this. You need to somehow persist the connections, and I don't know how you do that with HAProxy. Non-free LBs like A10 Thunder or F5 LTM can persist TCP connections, but you still need the MQTT brokers bridged for it all to work.
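For reference, one approach sometimes used in HAProxy is source-IP affinity via a stick table, sketched below. This is only an illustration (it assumes clients keep stable source addresses), not a verified MQTT setup, and it does not remove the need to bridge the brokers:
listen mqtt
    bind *:8080
    mode tcp
    balance leastconn
    stick-table type ip size 200k expire 30m
    stick on src          # keep connections from the same client IP on the same broker
    server broker_1 address:1883 check
    server broker_2 address:1884 check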
Turns out I was running out of resources on my computer.
I moved the simulator to another machine and managed to get 15k connections running. Due to resource limits I can't get more than that. The computer running the server side uses 20/32 GB of RAM, and the computer running the simulator used 32/32 GB for approximately 15k devices. Now I see why running both on the same computer is not an option.

Solaris tcp_time_wait_interval configuration

On my Solaris server, I have an HTTP server which handles many incoming connections. In my server logic, connections from clients are closed explicitly, so many sockets in the TIME_WAIT state show up when I run netstat -an on the server.
So I changed tcp_time_wait_interval to 10 seconds with the command:
ndd -set /dev/tcp tcp_time_wait_interval 10000
But the user guide says: "Do not set the value lower than 60 seconds".
Does anyone know why Oracle recommend that?
The user guide URL is : http://docs.oracle.com/cd/E19455-01/806-6779/chapter4-51/index.html
I was told by an Oracle engineer that, with a very heavy transaction load (thousands/sec), it can be set as low as 10 ms on Solaris 11/11.1.
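For reference, the current value can be read back on releases where the ndd interface is still available:
ndd /dev/tcp tcp_time_wait_interval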

DHCP lease checkup

I'm currently developing an embedded device which uses TCP/IP and gets its IP address via DHCP.
I saw in examples that, every now and then, I need to check whether the lease has ended, but I didn't find any reference on how often to check it: there are implementations that check once every 8 days and implementations that check every 24 hours.
So basically, in your implementations, how often do you check the DHCP lease? What's the standard regarding this issue?
You actually have to check the "IP Lease time" field in the ACK of the DHCPREQUEST. The RFC specifies that this ACK message MUST contain the lease time. Some clients may also choose to propose a lease time in the DHCPDISCOVER or DHCPREQUEST message (depends on the implementation).
From the client's perspective, at 50% of the lease duration (T1) the client has to send a DHCPREQUEST to the server to ask for a renewal of its lease. When the client receives a DHCPACK from the server, it computes the lease expiration time as the sum of the time at which it sent the DHCPREQUEST message and the lease duration in the DHCPACK message.
If no DHCPACK has arrived by the time 87.5% of the lease time has elapsed (T2), the client sends (via broadcast) a DHCPREQUEST message to extend its lease.
If the lease expires before the client receives a DHCPACK (T3), the client MUST immediately stop any other network processing and request network initialization parameters as if it were uninitialized.
Hence you have to keep in mind T1, T2 and T3.
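As a worked example (numbers for illustration only): with a 24-hour lease, the client would attempt a renewal at T1 = 12 hours (50%), fall back to a broadcast DHCPREQUEST at T2 = 21 hours (87.5%), and treat the lease as expired at T3 = 24 hours if no DHCPACK has arrived by then.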

Linux Syslog Server Format

I am creating a syslog-formatted message according to RFC 3164 and sending it to my Linux default syslog server, which is listening on port 514.
The message I am sending is:
<187>Nov 19 02:58:57 nms-server6 %cgmesh-2-outage: Outage detected on this device
I open a socket, make a datagram packet and send this packet on that socket.
Now, in /var/log/syslog.log, which I have configured to receive all syslog messages with:
*.* /var/log/syslog.log
I am getting an extra hostname inserted by the server automatically, as shown below:
Nov 19 02:58:57 nms-server6 nms-server6 %cgmesh-2-outage: Outage detected on this device
As you can see, nms-server6 is repeated twice while I am sending it just once, so somehow the server is inserting it by default.
Can someone share some knowledge on this?
Are you adding the hostname in your message? If so, I don't think that's necessary as the hostname will be taken from the packet - which would explain the duplication.
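If so, a payload along these lines (illustrative; exact parsing depends on the syslog daemon) lets the receiving server fill in the host itself:
<187>Nov 19 02:58:57 %cgmesh-2-outage: Outage detected on this device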
Also, as a side note - it's nice that you've added the %fac-sev-mnemonic: portion, but that is not a standard; it's used by Cisco devices.
Here's a link to a good whitepaper that covers Cisco Mnemonics (and syslog management):
Building Scalable Syslog Management Solutions:
http://www.cisco.com/en/US/technologies/collateral/tk869/tk769/white_paper_c11-557812.html
