Can anyone give me an idea for nftables rules? - networking

1. Accept incoming SSH connections (TCP port 22), with a rate limit of 30 connections per minute per host and a burst of 5 connections.
2. Log accepted SSH connections.

The first rule in the input chain is usually:
ct state established,related counter accept
So it should be sufficient to add the rule:
ct state new tcp dport 22 limit rate 30/minute burst 5 packets log prefix "[nft accept ssh] " counter accept
Putting it all together:
table inet filter {
    chain input {
        type filter hook input priority filter; policy drop;
        ct state established,related counter accept
        ct state new tcp dport 22 limit rate 30/minute burst 5 packets log prefix "[nft accept ssh] " counter accept
    }
}
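Note that the question asks for a limit per host, while a plain limit statement like the one above is, as far as I can tell, shared by all sources. To make the intended semantics concrete, here is a minimal Python sketch of a per-host token bucket with 30 tokens per minute and a burst of 5. It only models the behaviour described in the question; it is not how nftables implements its limiter, and the class and method names are made up for illustration.

```python
class HostRateLimiter:
    """Hypothetical per-host token bucket: `rate` tokens per `period` seconds, at most `burst` stored."""

    def __init__(self, rate=30, period=60.0, burst=5):
        self.fill_per_sec = rate / period
        self.burst = burst
        self.buckets = {}  # host -> (tokens, time of last update)

    def allow(self, host, now):
        """Return True if a new connection from `host` at time `now` (seconds) is within the limit."""
        # A host seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(host, (float(self.burst), now))
        # Refill for the elapsed time, never beyond the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.fill_per_sec)
        if tokens >= 1.0:
            self.buckets[host] = (tokens - 1.0, now)
            return True
        self.buckets[host] = (tokens, now)
        return False


limiter = HostRateLimiter()
# Seven back-to-back connection attempts from one host at t=0.
attempts = [limiter.allow("10.0.0.1", 0.0) for _ in range(7)]
print(attempts)
```

With these numbers a host can open 5 connections at once and then one more every 2 seconds; a second host is counted separately.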

Related

IPTables which allows inbound connection but disallows inbound traffic

I am currently trying to set up iptables so that a client can connect to the server and listen to a stream of messages over TCP. However, we want to block the client from sending any messages once connected (it is OK if the client is DROP'ed in that case).
Is there a way to allow a client to connect and enforce one-way communication from the server to the client only?
This needs to work purely within iptables (no software proxy-like solution).
The primary problem to be solved here is that we can't just shut down all the traffic from the client after the connection has been established because TCP is a positive acknowledgement protocol. If the server does not receive acks from the client, it will retransmit, and eventually time out. In what follows, I will assume that we are using IPv4.
So what we want to do is to allow the connection to be established, and then only allow acknowledgements from the client, that is, packets that contain no TCP payload.
Unfortunately, the length of the TCP payload is not explicitly represented in the TCP header. We can try to use the overall length in the IP header, but this is complicated by the fact that both the IP header and the TCP header include variable length options fields, so there are many possible overall lengths that correspond to an ack with no payload.
Since IP options are rarely used and commonly filtered, let's simplify things by first dropping all packets that contain options in the IP header (if your firewall is not already doing so). The implications of doing so are discussed at length here.
To do this, we will drop all traffic to our server (here taken to be 10.2.3.4:1234) where the IP header length (bits 4-7 of byte 0 in the IP header) is not 5:
iptables -A INPUT -p tcp -d 10.2.3.4 --dport 1234 \
-m u32 --u32 "0>>24&0xF=6:0xF" -j DROP
This uses the iptables u32 module to grab 4 bytes from the packet starting at byte 0, shift them right by 24 bits, and mask off everything but the lower nibble, which leaves the IP header length in 32-bit words. The packet is dropped if that value is in the range 6-15. Note that 5 is the minimum size of an IP header.
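To see what the expression "0>>24&0xF" evaluates to, here is a small Python sketch that performs the same steps on raw header bytes. The function names and the sample headers are invented for illustration; only the shift and mask mirror the rule above.

```python
import struct


def u32_ihl(packet: bytes) -> int:
    """Mimic the u32 expression 0>>24&0xF on a raw IPv4 packet."""
    # u32 reads 4 bytes at the given offset as one big-endian 32-bit word.
    (word,) = struct.unpack_from("!I", packet, 0)
    # Shifting by 24 leaves byte 0 (version + IHL); the mask keeps the IHL nibble.
    return (word >> 24) & 0xF


def has_ip_options(packet: bytes) -> bool:
    """True when the header is longer than the 5-word minimum (the 6:0xF range)."""
    return 6 <= u32_ihl(packet) <= 0xF


# Hypothetical headers: 0x45 is version 4 with IHL 5, 0x46 is version 4 with IHL 6.
plain_header = bytes([0x45, 0x00, 0x00, 0x28]) + bytes(16)
header_with_options = bytes([0x46, 0x00, 0x00, 0x2C]) + bytes(20)

print(u32_ihl(plain_header), has_ip_options(plain_header))
print(u32_ihl(header_with_options), has_ip_options(header_with_options))
```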
The situation with TCP options is a bit more complicated. In establishing the connections, many different options may be used, e.g., to negotiate window scaling. However, once the connection is established, the only thing we need to worry about is TCP timestamps and selective acks. So let's let the connection be established:
iptables -A INPUT -p tcp -d 10.2.3.4 --dport 1234 \
--tcp-flags SYN SYN -j ACCEPT
Note that it is possible to send a payload in a SYN packet, so here we are not completely meeting your requirement. Most ordinary TCP implementations will not do so, although TCP fast open does. If you want to mitigate this, you could drop SYN packets that are fragments (that could reassemble to something very large) and limit the overall length of the non-fragmented SYN packets to something reasonable that would allow for the usual options to be present in the TCP three way handshake. Note that the above rule was added to the INPUT chain, which is processed after IP fragment reassembly.
OK, so we can get the TCP connection established, and the IP headers are restricted to 5 words (20 bytes).
However the TCP headers may contain selective acks, tcp timestamps, both, or neither. Let's start with TCP headers with no options. An ack with no options and no payload will consist of a 5 word IP header followed by a 5 word TCP header followed by no data. So the total length in the IP header would be 40. If the packet was a fragment, it could conceal a payload in a subsequent fragment, but since we are working with the INPUT chain which is processed after IP fragmentation reassembly, we won't have to worry about this.
iptables -A INPUT -p tcp -d 10.2.3.4 --dport 1234 \
-m u32 --u32 "32>>28=5 && 0&0xFFFF=40" -j ACCEPT
The IP header is 20 bytes and the data offset nibble is in byte 12, so taking the 4 bytes starting at byte 32=20+12, we shift the nibble down and compare it to five, and then make sure that the total length in bytes 2 and 3 of word 0 of the IP header is 40.
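The same two checks can be tried out on synthetic packets. The following Python sketch assumes a 20-byte IP header, as enforced by the earlier rule; build_packet is a made-up helper that fills in only the fields the match looks at, and leaves everything else zero.

```python
import struct


def u32_word(packet: bytes, offset: int) -> int:
    """Read 4 bytes at `offset` as a big-endian word, as the u32 module does."""
    (word,) = struct.unpack_from("!I", packet, offset)
    return word


def is_payload_free(packet: bytes, tcp_words: int = 5) -> bool:
    """Mimic "32>>28=N && 0&0xFFFF=L" where L = 20 + 4*N (20-byte IP header assumed)."""
    data_offset = u32_word(packet, 32) >> 28       # upper nibble of TCP byte 12
    total_length = u32_word(packet, 0) & 0xFFFF    # bytes 2 and 3 of the IP header
    return data_offset == tcp_words and total_length == 20 + 4 * tcp_words


def build_packet(tcp_words: int, payload: bytes = b"") -> bytes:
    """Hypothetical test helper: only IHL, total length and data offset are set."""
    total = 20 + 4 * tcp_words + len(payload)
    ip_header = struct.pack("!BBH", 0x45, 0, total) + bytes(16)
    tcp_header = bytes(12) + bytes([tcp_words << 4]) + bytes(4 * tcp_words - 13)
    return ip_header + tcp_header + payload


print(is_payload_free(build_packet(5)))            # bare ack, no options
print(is_payload_free(build_packet(5, b"data")))   # same header, but carries a payload
```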
If there are TCP timestamps in the TCP header, then there will be an additional 12 bytes (3 words) in the TCP header. We can accept that in a similar manner:
iptables -A INPUT -p tcp -d 10.2.3.4 --dport 1234 \
-m u32 --u32 "32>>28=8 && 0&0xFFFF=52" -j ACCEPT
I'll leave it as an exercise for the reader to work out the other combinations. (Note that dealing with selective ack is several cases as there can be 1-4 selective ack blocks, or 1-3 selective ack blocks with timestamps.)
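As a starting point for that exercise, the remaining combinations can be enumerated with a short Python sketch. It assumes the option layout commonly emitted in practice: timestamps take 12 bytes (two NOPs plus the 10-byte option), and a selective ack option takes 4 + 8n bytes for n blocks (two NOPs, kind, length, then 8 bytes per block). Stacks that pad differently would produce other lengths, so treat the output as a sketch rather than a complete rule set.

```python
IP_HEADER = 20          # bytes; IP options were already rejected by the first rule
TCP_BASE = 20           # bytes; TCP header without options
TIMESTAMPS = 12         # assumed layout: NOP, NOP, 10-byte timestamp option
MAX_TCP_OPTIONS = 40    # the 4-bit data offset allows at most 15 words in total


def payload_free_ack_cases():
    """Return (with_timestamps, sack_blocks, tcp_header_words, ip_total_length) tuples."""
    cases = []
    for with_timestamps in (False, True):
        for sack_blocks in range(5):
            options = TIMESTAMPS if with_timestamps else 0
            if sack_blocks:
                options += 4 + 8 * sack_blocks   # assumed layout: NOP, NOP, kind, len, blocks
            if options > MAX_TCP_OPTIONS:
                continue
            tcp_words = (TCP_BASE + options) // 4
            total_length = IP_HEADER + TCP_BASE + options
            cases.append((with_timestamps, sack_blocks, tcp_words, total_length))
    return cases


def u32_expression(tcp_words: int, total_length: int) -> str:
    """Format one case in the same style as the rules above."""
    return f"32>>28={tcp_words} && 0&0xFFFF={total_length}"


for with_ts, blocks, words, length in payload_free_ack_cases():
    print(f"timestamps={with_ts} sack_blocks={blocks}: {u32_expression(words, length)}")
```

Under these assumptions, one selective ack block without timestamps gives the same header size as timestamps alone, so a single rule covers both.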
DISCLAIMER: I did not actually try this out, so my apologies if there is a typo or if I overlooked something. I believe the strategy to be sound, and if there is any error or omission, let me know and I will correct.

Identify single communication

I have a problem identifying a communication established over TCP.
I have to identify the first completed communication, for example the first complete HTTP exchange.
I have a .pcap dump file with the capture. I know that a communication should start with the three-way handshake (SYN, SYN-ACK, ACK) and end with a FIN from each side.
But there are a lot of communications in that dump file.
So here is the question: what do I need to keep track of to match exactly one communication?
I thought about source IP, destination IP, protocol, and maybe port, but I am not sure.
Thank you for any advice.
And sorry for my English.
You stated that you need:
To identify a particular conversation
To identify the first completed conversation
You can identify a particular TCP or UDP conversation by filtering for
the 5-tuple of the connection:
Source IP
Source Port
Destination IP
Destination Port
Transport (TCP or UDP)
As Shane mentioned, this is protocol dependent e.g. ICMP does not have the concept of
ports like TCP and UDP do.
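If you process the packets in your own code, the 5-tuple can be used as a dictionary key. The Python sketch below assumes each packet has already been parsed into a tuple of (protocol, source IP, source port, destination IP, destination port, ...); that record layout and the function names are hypothetical. The two endpoints are sorted so that both directions of a conversation map to the same key.

```python
from collections import defaultdict


def conversation_key(proto, src_ip, src_port, dst_ip, dst_port):
    """Build a direction-independent key from the 5-tuple."""
    a = (src_ip, src_port)
    b = (dst_ip, dst_port)
    # Order the endpoints so client->server and server->client give the same key.
    return (proto,) + (a + b if a <= b else b + a)


def group_by_conversation(packets):
    """Group parsed packets (hypothetical tuples starting with the 5-tuple) by conversation."""
    groups = defaultdict(list)
    for pkt in packets:
        groups[conversation_key(*pkt[:5])].append(pkt)
    return dict(groups)


# Made-up sample records; a real program would fill these from the pcap file.
sample = [
    ("tcp", "1.1.1.1", 57596, "1.1.1.2", 80, "SYN"),
    ("tcp", "1.1.1.2", 80, "1.1.1.1", 57596, "SYN-ACK"),
    ("udp", "1.1.1.1", 53523, "1.1.1.2", 53, "query"),
]
for key, pkts in group_by_conversation(sample).items():
    print(key, len(pkts))
```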
A libpcap filter like the following would work for TCP (replace tcp with udp for a UDP conversation):
tcp and host 1.1.1.1 and port 53523 and host 1.1.1.2 and port 80
Apply it with tcpdump:
$ tcpdump -nnr myfile.pcap 'tcp and host 1.1.1.1 and port 53523 and host 1.1.1.2 and port 80'
To identify the first completed connection you will have to follow the timestamps.
Using a tool like Bro (since renamed Zeek) to read the PCAP would yield the answer, as it lists each connection
attempt seen (complete or incomplete):
$ bro -r myfile.pcap
$ bro-cut -d < conn.log | head -1
2014-03-14T10:00:09-0500 CPnl844qkZabYchIL7 1.1.1.1 57596 1.1.1.2 80 tcp http 0.271392 248 7775 SF F ShADadfF 14 1240 20 16606 (empty) US US
Use the flag data for TCP (here the SF connection state and the ShADadfF history string) to judge whether there was a successful handshake and teardown.
For other protocols you can make judgements based on byte counts, sent and received.
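If you want to do the TCP flag check yourself rather than rely on a tool, the logic might look like the following Python sketch. It assumes the packets are already parsed into time-ordered tuples of (time, source IP, source port, destination IP, destination port, flags), with flags given as a string of letters such as "S", "SA", "A" or "FA"; that format and the function name are made up for this example. A conversation counts as completed once the handshake has been seen and both sides have sent a FIN; the final ACKs and RST handling are deliberately left out.

```python
def first_completed(packets):
    """Return (client, server, time of second FIN) for the first conversation to finish, or None."""
    state = {}
    for time, src, sport, dst, dport, flags in packets:
        sender = (src, sport)
        key = frozenset((sender, (dst, dport)))   # same key for both directions
        if "S" in flags and "A" not in flags:
            # Initial SYN: remember who opened the connection.
            state[key] = {"client": sender, "synack": False,
                          "established": False, "fins": set()}
            continue
        conn = state.get(key)
        if conn is None:
            continue  # traffic whose SYN was not captured
        if "S" in flags and "A" in flags and sender != conn["client"]:
            conn["synack"] = True
        elif "A" in flags and conn["synack"] and sender == conn["client"]:
            conn["established"] = True
        if "F" in flags and conn["established"]:
            conn["fins"].add(sender)
            if len(conn["fins"]) == 2:
                server = next(iter(key - {conn["client"]}))
                return conn["client"], server, time
    return None
```

Called on a list of such tuples, it returns the endpoints of the first conversation whose teardown is seen, which you can then plug into the filter shown earlier.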
Identifying the first completed communication is highly protocol specific. You are on the right track with your filters. If your protocol is a commonly used one, there are plug-ins called protocol analyzers and filters that can locate "conversations" for you from a pcap data stream. If you know the approximate start and end times, that would help narrow it down too.

Disable ICMP Host unreachable

I'm using a single raw socket to read UDP packets from a local test network with 1024 ports. Each UDP src and dest port is unique and I need access to the IP and UDP header fields. I can stream and process data (in and out) at 100 Mbps on a linux-rt kernel with very low jitter (< 250 usec, 10 usec nominal).
I'd like to prevent kernel from issuing ICMP port unreachable errors back to the sending host, however, I don't want to create 1024 vanilla UDP sockets and bind to each one because of resource constraints. Currently, I'm using iptables to drop the outbound port unreachable messages. Does anyone know of a way (programmatic using C code) to prevent the ICMP unreachable traffic? Perhaps an IOCTL or socket option? I also tried changing /proc/sys/net/ipv4/icmp_ratelimit but that seemed to have no effect. By default the ratemask is set for dest unreachables and a variety of ratelimit values did not change any behavior that I could see.

Advantage of a 64k mtu for lo?

I saw this commit in the Linux kernel and was confused by it:
loopback current mtu of 16436 bytes allows no more than 3 MSS TCP
segments per frame, or 48 Kbytes. Changing mtu to 64K allows TCP
stack to build large frames and significantly reduces stack overhead.
Performance boost on bulk TCP transferts can be up to 30 %, partly
because we now have one ACK message for two 64KB segments, and a lower
probability of hitting /proc/sys/net/ipv4/tcp_reordering default limit.
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
static void loopback_setup(struct net_device *dev)
{
- dev->mtu = (16 * 1024) + 20 + 20 + 12;
+ dev->mtu = 64 * 1024;
What does lo have to do with TCP transfers? Isn't it just a loopback address where you loop Ethernet traffic back for whatever reason?
Since this is a change to the loopback interface, it's a performance boost for transfers on the local interface. Like if you FTP to 127.0.0.1, for example.
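The "3 MSS segments, or 48 Kbytes" figure in the commit message can be checked with a little arithmetic. The Python sketch below assumes that the 20 + 20 + 12 in the old MTU stands for the IP header, the TCP header and the TCP timestamp option, and that one large frame still has to fit the 16-bit IPv4 total length (65535 bytes). Both are my reading of the commit, not something it states outright.

```python
HEADERS = 20 + 20 + 12                    # assumed: IP header + TCP header + timestamps
OLD_LO_MTU = (16 * 1024) + HEADERS        # the value removed by the commit
IPV4_MAX_TOTAL_LENGTH = 65535             # assumed bound on one aggregated frame


def mss(mtu: int) -> int:
    """Payload bytes per TCP segment for a given MTU."""
    return mtu - HEADERS


def full_segments_per_frame(mtu: int) -> int:
    """How many full-MSS segments fit in one maximum-size frame."""
    return (IPV4_MAX_TOTAL_LENGTH - HEADERS) // mss(mtu)


segments = full_segments_per_frame(OLD_LO_MTU)
print(OLD_LO_MTU, mss(OLD_LO_MTU), segments, segments * mss(OLD_LO_MTU) // 1024)
```

So with the old MTU each segment carries 16 KB and only three of them fit in one frame, which matches the 48 Kbytes quoted above.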

Spanning Tree Protocol

How to get switch MAC address while implementing spanning tree protocol?
ARP packets are the way to go. Find the ip address of the switch you want, then send an ARP request to that ipaddress. You will receive a packet back mapping the ip address requested to the MAC address which owns that ip address.
The answer above is more of a how to translate an ip address to a MAC address, as that sounds like the gist of your question. STP generally is implemented using BPDU or bridge protocol data unit. If you haven't already you might want to check out:
http://computer.howstuffworks.com/lan-switch14.htm
http://en.wikipedia.org/wiki/Spanning_tree_protocol
http://wiki.wireshark.org/STP
http://en.wikipedia.org/wiki/Logical_Link_Control
BRIDGE ID: Each bridge is assigned an
ID, called the bridge ID, that is
defined as an 8-byte value split into
two components. The lowest six bytes
are assigned the Ethernet MAC address
of the bridge ports, and the highest
two bytes are a configurable priority,
called the bridge priority.
-Understanding Linux network internals
By Christian Benvenuti
See also
Troubleshooting campus networks
By Priscilla Oppenheimer, Joseph Bardwell
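The layout described in the quote (two priority bytes followed by six MAC bytes) can be sketched in Python as follows. The function name is made up, and the two priority/address pairs are simply the Root ID and Bridge ID values from the Cisco output further down in this thread. Because the lowest bridge ID wins the root election, comparing the packed bytes is enough to pick the root.

```python
def bridge_id(priority: int, mac: str) -> bytes:
    """Pack a 2-byte priority and a 6-byte MAC address into an 8-byte bridge ID."""
    # Accept Cisco (xxxx.xxxx.xxxx), colon and dash notations.
    mac_bytes = bytes.fromhex(mac.replace(".", "").replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    if not 0 <= priority <= 0xFFFF:
        raise ValueError(f"priority out of range: {priority}")
    return priority.to_bytes(2, "big") + mac_bytes


local = bridge_id(32769, "00E0.8F81.C638")
other = bridge_id(32769, "0010.1167.1B9C")
print(local.hex(), other.hex())
# Bytes compare left to right, so min() picks the bridge that would become root.
print(min(local, other).hex())
```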
You should first know that most Cisco switches will assign a unique bridge ID per VLAN based on a mac-address assigned to the switch. You can figure out what the bridge ID will be for a VLAN once you have determined what the assigned mac-address is. It is also good to remember that newer switches can use an extended system ID which is more than just the mac-address (as the other poster noted).
You can determine the base mac address and then calculate what the bridgeID will be for a particular VLAN based on the concept that the bridge ID for a particular VLAN will be the base Bridge ID + the vlan number. Example:
Base MAC = 0000.0001.0800
Bridge ID for VLAN 1 = 0000.0001.0801
Bridge ID for VLAN 300 = 0000.0001.092c
Yes, the addition is done in hex (300 decimal is 0x12c).
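The calculation is easy to get wrong by hand because the VLAN number is decimal and the MAC address is hex. Here is a minimal Python sketch of the "base address plus VLAN number" rule described above; the function name is made up, and whether a given switch really derives its per-VLAN address this way depends on the platform.

```python
def vlan_bridge_mac(base_mac: str, vlan: int) -> str:
    """Add the VLAN number to a base MAC given in Cisco xxxx.xxxx.xxxx notation."""
    # Parse the 48-bit address, add the VLAN number, and stay within 48 bits.
    value = (int(base_mac.replace(".", ""), 16) + vlan) & 0xFFFFFFFFFFFF
    digits = f"{value:012x}"
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


print(vlan_bridge_mac("0000.0001.0800", 1))
print(vlan_bridge_mac("0000.0001.0800", 300))
```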
You could do this on a Cisco switch as follows:
1: show int | i line | address
This will give you your "base" mac address. You will notice all of the SVIs have the same mac address.
Vlan1 is up, line protocol is up
Hardware is EtherSVI, address is 0000.0001.0800 (bia 0000.0001.0800)
2: You could also just check spanning tree for the calculation directly:
Show span vlan 1 | b Bridge ID
Bridge ID Priority 8192
Address 0000.0001.0801
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Aging Time 300
The mac address under the Bridge ID is the one used for spanning tree calculation.
For Cisco switches:
sh spanning-tree
Switch>sh sp
VLAN0001
Spanning tree enabled protocol ieee
Root ID Priority 32769
Address 0010.1167.1B9C
Cost 19
Port 17(FastEthernet0/17)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 32769 (priority 32768 sys-id-ext 1)
Address 00E0.8F81.C638
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Aging Time 20
Interface Role Sts Cost Prio.Nbr Type
Fa0/17 Root LSN 19 128.17 P2p
Switch>
For Huawei switches:
display stp
-------[CIST Global Info][Mode MSTP]-------
CIST Bridge :32768.4c1f-ccfe-181f
Config Times :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
Active Times :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC :32768.4c1f-cc7e-7e4d / 20000
CIST RegRoot/IRPC :32768.4c1f-ccfe-181f / 0
CIST RootPortId :128.10
BPDU-Protection :Disabled
TC or TCN received :2
TC count per hello :0
STP Converge Mode :Normal
Time since last TC :0 days 0h:0m:44s
Number of TC :2
Last TC occurred :GigabitEthernet0/0/10
----[Port1(GigabitEthernet0/0/1)][DOWN]----
