How to capture TCP/IP fragmentation in tcpdump? - tcp

As we all know, the MTU is 1500 and the MSS for TCP is 1460. So when the buffer passed to the recv function is larger than 1460 bytes, the TCP data will be split into many segments.
I wrote a simple echo program and want to use tcpdump to check the fragmentation. However, it does not show any fragmentation when the buffer is small, but it does when the buffer is about 20K.
Here is the code:
Server:
import socket
import sys
import os

addr = ('10.0.0.2', 10086)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(addr)
server.listen(5)
while True:
    connfd, addr = server.accept()
    print 'connection ip:', addr
    data = connfd.recv(8192)
Client:
import socket
import os
import sys

addr = ('10.0.0.2', 10086)
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(addr)
data = ''
for num in range(0, 8192):
    data += '1'
client.sendall(bytes(data))
Here is the tcpdump cmd I used:
sudo tcpdump -i lo port 10086 -s 1514 -v
As you can see from the code, the buffer is 8192 bytes and the MSS is 1460. So, in my opinion, the data should be split into segments of 1460, 1460, 1460, 1460, 1460 and 892 bytes. But the screenshots do not show that.
Also, I am not sure whether this is caused by the [DF] flag. The program is written in Python, so is the [DF] socket option set by default? Heaven knows.

As we all know, the MTU is 1500 and the MSS for TCP is 1460
This is not true.
The MTU depends on the transport medium, and an MTU of 1500 is specific to Ethernet. But based on your tcpdump you are not using an Ethernet interface (i.e. a wired LAN connection between two machines); your client and server are on the same machine, so you are using the lo interface (tcpdump -i lo ...). The MTU for the localhost interface is usually much higher:
$ ifconfig lo
lo: ... mtu 65536
$ ifconfig eth0
eth0: ... mtu 1500
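If you prefer to check this programmatically rather than with ifconfig, here is a minimal sketch (Linux only; the interface names are just examples) that reads the MTU from sysfs:
def mtu(iface):
    # Read the interface MTU from sysfs (Linux only).
    with open('/sys/class/net/%s/mtu' % iface) as f:
        return int(f.read().strip())

print(mtu('lo'))     # typically 65536
print(mtu('eth0'))   # typically 1500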
Apart from that, you will probably not see any fragmentation at all. If the data is larger than the MTU you will see TCP segmentation (not fragmentation), i.e. the OS will split the TCP stream into segments, each no larger than the MSS. Fragmentation instead occurs at the lower layers, for example if an IP packet needs to be split further because somewhere on the path to the target there is a device with a smaller MTU.
The [DF] (don't fragment) flag you see is there to make sure that no IP-level fragmentation occurs; instead the packet gets discarded and the sender notified, so that the Path MTU (the minimal MTU of the path) can be discovered and the TCP segmentation optimized for it, reducing the overhead of delivery. See Path MTU discovery for more information.
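If you want to check whether your Python socket will send with DF, you can query the kernel's Path-MTU-discovery mode for that socket. A minimal sketch, assuming Linux (which exposes IP_MTU_DISCOVER via the socket module) and the server address from the question:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('10.0.0.2', 10086))  # echo server from the question

# On Linux, IP_MTU_DISCOVER controls DF: the IP_PMTUDISC_DO/WANT modes mean the
# kernel sets DF and relies on Path MTU discovery (the default for TCP sockets),
# while IP_PMTUDISC_DONT clears DF and allows fragmentation.
mode = s.getsockopt(socket.IPPROTO_IP, socket.IP_MTU_DISCOVER)
print('PMTU discovery mode: %d' % mode)
s.close()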

I'd like to add that you won't see fragments with your tcpdump filter because you are filtering on a port number. IP fragments don't really have a port number, just the packet ID, the fragment offset and the protocol number (only the first fragment carries the transport header). So you should filter on the IP source or destination address instead.
Or use this filter to see fragments:
tcpdump -i eth1 '((ip[6:2] > 0) and (not ip[6] = 64))'
Credit: https://github.com/SergK/cheatsheat-tcpdump/blob/master/tcpdump_advanced_filters.txt
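To give that filter something to match, you need packets that actually fragment at the IP layer. One rough sketch (the destination address and port are placeholders, and the traffic has to cross a real Ethernet interface rather than lo with its 64K MTU) is to send a UDP datagram larger than the MTU with DF cleared:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Clear DF so the IP layer is allowed to fragment the datagram (Linux-only option).
s.setsockopt(socket.IPPROTO_IP, socket.IP_MTU_DISCOVER, socket.IP_PMTUDISC_DONT)
# A single 4000-byte UDP datagram does not fit into a 1500-byte Ethernet MTU,
# and UDP has no segmentation of its own, so IP has to fragment it.
s.sendto(b'1' * 4000, ('192.168.1.10', 10086))  # placeholder destination
s.close()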

You are sending data from localhost to localhost, inside one host.
So, the MTU of Ethernet (1500) won't limit the MSS, because the data is not going over Ethernet.
Try to repeat the test between two different hosts.
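One way to convince yourself is to ask the kernel for the MSS it actually uses on a connection. A small sketch, assuming Linux (TCP_MAXSEG) and that something is listening on the chosen port:
import socket

def negotiated_mss(host, port):
    # TCP_MAXSEG reports the maximum segment size used on the connection.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, port))
    mss = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
    s.close()
    return mss

# Over lo (MTU 65536) this is far larger than the ~1460 you would see
# between two different hosts connected over Ethernet (MTU 1500).
print(negotiated_mss('127.0.0.1', 10086))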

Related

Terminal not seeing ping messages from TUN port

Hi, I'm working on a project and I have a question involving ping commands and how they interface with network TUN ports.
Basically I'm sending out ping requests which are routed to my TUN port, and the replies are sent back to the TUN port over the VPN. There are no other internet interfaces (i.e. no wifi/ethernet). Using wireshark and tcpdump I can see that the correct reply messages arrive on the tun0 port, but the terminal does not see the replies and instead shows a 100% drop rate. The issue seems to be that the tun0 port is not properly linking back to the kernel? (Total guess, I'm quite new to IP routing.)
The IP address of the TUN is 10.0.0.73 and I am pinging a computer with IP address 10.0.0.28
Below is a snippet from the tcpdump on tun0; this is a request and reply that, to my untrained eye, should work:
23:08:52.257566 IP (tos 0x0, ttl 64, id 11185, offset 0, flags [DF], proto ICMP (1), length 84)
10.0.0.73 > 10.0.0.28: ICMP echo request, id 24667, seq 2, length 64
23:09:11.508002 IP (tos 0x0, ttl 64, id 13315, offset 0, flags [none], proto ICMP (1), length 84)
10.0.0.28 > 10.0.0.73: ICMP echo reply, id 24667, seq 2, length 64
Based on other posts I checked my ip route list and the output is as such
pi@raspberrypi:~$ sudo ip route list
10.0.0.0/24 dev tun0 proto kernel scope link src 10.0.0.73
and the ifconfig is this:
pi@raspberrypi:~$ ifconfig tun0
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:10.0.0.73 P-t-P 10.0.0.73 Mask:255.255.255.0
...
It turns out the issue was that the replies were showing up out of order and greatly delayed; when I fixed the network connections the issue went away without changing any configuration in iptables.

How to remove "IPv4 address parameter" field (optional) in a SCTP packet in Ubuntu

I want to send an SCTP packet to a server over an L2TP VPN in Ubuntu 20.04. For this purpose I have set up my L2TP VPN and I can successfully test the connection using the ping command. My ifconfig output is now as follows:
enp0s31f6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet x.x.x.x netmask 255.255.255.248 broadcast p.p.p.p
...
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
...
ppp0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST> mtu 1400
inet y.y.y.y netmask 255.255.255.255 destination q.q.q.q
...
In this output, x.x.x.x is my LAN IP and y.y.y.y is my VPN IP.
But when I send my SCTP INIT packet, two optional fields, i.e. IPv4 address parameters, appear in the INIT chunk subtree in the Wireshark log as follows. These parameters contain my IPs.
Stream Control Transmission Protocol, Src Port: a (a), Dst Port: b (b)
Source port: a
Destination port: b
Verification tag: 0x00000000
[Association index: 65535]
Checksum: 0x06cf8029 [unverified]
[Checksum Status: Unverified]
INIT chunk (Outbound streams: 3, inbound streams: 3)
Chunk type: INIT (1)
0... .... = Bit: Stop processing of the packet
.0.. .... = Bit: Do not report
Chunk flags: 0x00
Chunk length: 52
Initiate tag: 0xd1d6f19b
Advertised receiver window credit (a_rwnd): 106496
Number of outbound streams: 3
Number of inbound streams: 3
Initial TSN: 1216798565
IPv4 address parameter (Address: x.x.x.x)
IPv4 address parameter (Address: y.y.y.y)
Supported address types parameter (Supported types: IPv4)
ECN parameter
Forward TSN supported parameter
And finally, here are the sent and received packets:
I think that the IPv4 address parameter (Address: x.x.x.x) (my LAN IP) in my INIT packet caused the server to respond with the ABORT packet. When I turn off my VPN, these two optional fields do not appear.
How can I remove these two optional fields in Ubuntu after bringing up my VPN?
Manual client IP assignment is required to remove the "IPv4 address parameter" fields in an SCTP packet. So, the following code is required in C++:
#include <cstring>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int sock = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
if (sock < 0)
{
    // handle error
}
struct sockaddr_in clientAddr;
memset(&clientAddr, 0, sizeof(struct sockaddr_in));
clientAddr.sin_family = AF_INET;
clientAddr.sin_addr.s_addr = inet_addr("y.y.y.y");  // bind explicitly to the VPN IP
clientAddr.sin_port = htons(a);
if (::bind(sock, (struct sockaddr*)&clientAddr, sizeof(struct sockaddr)) < 0)
{
    // handle error
}

Receive specific multicast message on a client connected over VPN

Case:
[ Subnet A , 192.168.2.0/24, Padavan firmware based internet gw ]
[ Subnet B , 192.168.1.0/24, Padavan firmware based internet gw ]
A host in subnet A (2.155) is connected via VPN (possible options: PPTP, OpenVPN, L2TP without IPsec) to subnet B, and receives an address, say 1.245/32.
In subnet B there is a host (1.10/32) which sends multicast datagrams to 224.0.0.50:9898; on the router I see them with
tcpdump -i br0 -c 10 dst host 224.0.0.50 and port 9898 and multicast
13:46:54.345369 IP 192.168.1.10.4321 > 224.0.0.50.9898: UDP, length 135
I am looking for a solution to receive/forward those multicast messages so that they can be seen by the hosts connected via VPN.
On router B, which is Padavan firmware based, I am limited to the udpxy and igmproxy utilities, if needed.
The client host is Debian based and generally not limited in tools.
The datagrams are a proprietary protocol, i.e. not an IPTV or video stream.
Any ideas are welcomed.
[UPD] Additional info - per discussion in comments
That's a very specific hardware device which is not very chatty in Ethernet terms (at most 1-2 datagrams every 5 seconds), so it should certainly be forwardable. Unfortunately, it sends status updates purely via multicast. In subnet A there is a similar device plus control software. Thus I am looking for a way for datagrams sent to 224.0.0.50:9898 in subnet B to re-appear in subnet A, possibly with the help of some tool: maybe smcroute, maybe udpxy, maybe igmproxy.
As I don't like to leave resolved questions unanswered, here is the currently working solution.
In subnet B I have installed an OpenVPN server endpoint, configured as L2 (TAP).
In subnet A, on a control host, I have installed an OpenVPN client that connects to subnet B; the assigned interface is tapz.
20: tapz: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 100
link/ether 0a:da:be:96:78:d9 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.245/24 brd 192.168.1.255 scope global noprefixroute tapz
valid_lft forever preferred_lft forever
inet6 fe80::8da:beff:fe96:78d9/64 scope link
valid_lft forever preferred_lft forever
So now on the control host I have:
the multicast from the local device on the physical Ethernet interface enp5s0
sudo tcpdump -i enp5s0 -c 10 dst host 224.0.0.50 and port 9898 and multicast
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp5s0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:55:05.642963 IP lumi-gateway-v3_miio56591509.4321 > 224.0.0.50.9898: UDP, length 136
and now I also receive the multicast from the remote network device on tapz
sudo tcpdump -i tapz -c 10 dst host 224.0.0.50 and port 9898 and multicast
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapz, link-type EN10MB (Ethernet), capture size 262144 bytes
13:53:49.141751 IP 192.168.1.10.4321 > 224.0.0.50.9898: UDP, length 135
So far this is what I was looking for: I am getting the necessary datagrams on the VPN client. OpenVPN on the remote side can also be tuned to filter which multicast traffic is forwarded.
For those who come here with the same question: once you have the necessary multicast on tap0, you can create a bridge from, say, eth0 and tap0:
ip link add br0 type bridge
ip link set tap0 master br0
ip link set eth0 master br0
POC - both multicasts on a single interface:
sudo tcpdump -i br0 dst host 224.0.0.50 and port 9898
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:09:51.823632 IP 192.168.1.10.4321 > 224.0.0.50.9898: UDP, length 135
21:09:55.045138 IP 192.168.2.214.4321 > 224.0.0.50.9898: UDP, length 136
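To check that an application on the control host (and not just tcpdump) can receive the datagrams from the bridge, here is a minimal receiver sketch (group and port taken from the question, joining on the default interface):
import socket
import struct

MCAST_GRP = '224.0.0.50'
MCAST_PORT = 9898

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('', MCAST_PORT))

# Join the multicast group on the default interface (0.0.0.0 = INADDR_ANY).
mreq = struct.pack('4s4s', socket.inet_aton(MCAST_GRP), socket.inet_aton('0.0.0.0'))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

while True:
    data, addr = sock.recvfrom(2048)
    print('%s: %d bytes' % (addr, len(data)))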

Steps to share internet with BeagleBone Black using USB from OS X

Already tried:
Connect the BBB with USB to iMac
Share internet with the board from System Preferences->Sharing
ssh to the board and then try udhcpc -i usb0
This is what it says:
udhcpc (v1.20.2) started
It gets stuck and I get an error: Write failed: Broken pipe
ssh exits
Any clues?
After some trial and error, here's what worked for me:
1. Watch this video: http://www.youtube.com/watch?v=Cf9hnscbSK8
2. If your BBB was shipped after November 2013, instead of screen /dev/tty.usb*B 115200 use screen /dev/tty.usb* 115200; you actually need to go to the /dev directory, check which tty.usbXXX device is available for your BBB and screen it. In my case it was tty.usb131, for example
3. You continue the steps just like in the video until opkg update, which is the part you actually need the internet for
And that's all there is to it.
Your SSH session is getting stuck because you're connected to usb0 and the udhcpc command changed the IP address for it! At this point there's nothing listening on the other end of your ssh session, so your local computer's ssh client eventually fails with the broken pipe error and exits.
An obvious workaround is to connect via tty.usbserial instead of ssh to the IP address. You'd think the usb port's assigned IP shouldn't be changing though. Read on to understand what's happening.
Most people using a BBB for the first time attach them directly to their Internet connected computer using the supplied USB cable. It's exactly what the BBBs designers intended for you to do, and they've done a fantastic job with the BBBs startup web page.
That host computer shares it connection differently though depending on whether it's Windows, OS X or Linux, and how you do it varies depending on the version of the OS you're running.
Derek Molloy (Exploring BeagleBone) and Jason Kridner (YouTube OS X BeagleBone video) provide some fairly detailed instructions for using host-based Internet sharing with your BBB. The Linux and Windows instructions are still good, but the OS X info needs updating for Yosemite - Apple switched its NAT and firewall software from ipfw and natd to pf. If you try running udhcpc like Jason did in his video, it doesn't work the same way it did for him.
So back to your BBB SSH problem with OS X Yosemite. Here's how to see what's going on: connect to the BBB using a serial/FTDI cable, then check the IP config of usb0 on the BeagleBone.
beaglebone:~# ifconfig -a usb0
usb0 Link encap:Ethernet HWaddr 0e:be:ff:00:ff:00 inet addr:192.168.7.2
Bcast:192.168.7.3 Mask:255.255.255.252
confirm you can ping the host that's sharing its Internet connection
beaglebone:~# ping 192.168.7.1
PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data.
64 bytes from 192.168.7.1: icmp_req=1 ttl=64 time=0.681 ms
64 bytes from 192.168.7.1: icmp_req=2 ttl=64 time=0.533 ms
^C
try reaching an Internet IP (google dns)
beaglebone:~# ping 8.8.8.8
connect: Network is unreachable
check the routes and confirm there's no default route out, which is why the ping above failed (a USB-connected BBB has a 192.168.7.0/30 network set up by default, so it can only reach the 192.168.7.0, .1, .2 and .3 addresses, as the short check after the routing table confirms).
beaglebone:~# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.7.0 0.0.0.0 255.255.255.252 U 0 0 0 usb0
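As a quick aside, the reach of that /30 is easy to confirm with Python's ipaddress module (a small sketch, nothing BBB-specific assumed):
import ipaddress

net = ipaddress.ip_network('192.168.7.0/30')
print(list(net.hosts()))       # the two usable hosts: .1 and .2
print(net.broadcast_address)   # 192.168.7.3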
So if you run udhcpc it will add the missing route for you. You could also just add the route directly, but you would need to set up DNS as well, and with OS X Internet Sharing it won't work without also changing the BBB's IP address (see the links at the end of this post).
beaglebone:~# udhcpc -i usb0
udhcpc (v1.20.2) started
Sending discover...
Sending discover...
and here is where udhcpc changes the IP instead of just re-using 192.168.7.2. The new IP is compatible with the IP range used by OS X Internet Sharing, so that may be why the DHCP server is returning it.
Sending select for 192.168.2.34...
Lease of 192.168.2.34 obtained, lease time 85536
udhcpc then throws an error because there's no default route to delete
/etc/udhcpc/default.script: Resetting default routes
SIOCDELRT: No such process
udhcpc then adds the default route and DNS - note carefully that it's an OS X Internet Sharing 192.168.2 address, not the original 192.168.7.
/etc/udhcpc/default.script: Adding DNS 192.168.2.1
everything worked, so you can see the new route and successfully ping an external IP now
beaglebone:~# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 192.168.2.1 0.0.0.0 UG 0 0 0 usb0
192.168.2.0 0.0.0.0 255.255.255.0 U 0 0 0 usb0
beaglebone:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_req=1 ttl=53 time=4.08 ms
64 bytes from 8.8.8.8: icmp_req=2 ttl=53 time=3.59 ms
^C
There are a couple of blog posts that show how to set this up permanently:
Sharing OS X Internet Connection over USB to BeagleBone Black
and
Changing usb0 IP address on the BeagleBone Black

Sniffing packets using tshark

I have 2 servers (serv1, serv2) that communicate, and I'm trying to sniff packets matching certain criteria that are transferred from serv1 to serv2. Tshark is installed on my desktop (desk1). I have written the following script:
while true; do
tshark -a duration:10 -i eth0 -R '(sip.CSeq.method == "OPTIONS") && (sip.Status-Code) && ip.src eq serv1' -Tfields -e sip.response-time > response.time.`date +%F-%T`
done
This script seems to run fine when run on serv1 (since serv1 is sending packets to serv2). However, when I try to run it on desk1, it can't capture any packets. They are all on the same LAN. What am I missing?
Assuming that either serv1 or serv2 is on the same physical Ethernet switch as desk1, you can sniff transit traffic between serv1 and serv2 by using a feature called SPAN (Switch Port Analyzer).
Assume your server is on FastEthernet4/2 and your desktop is on FastEthernet4/3 of the Cisco switch... you should telnet or ssh into the switch and enter these commands...
4507R#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
4507R(config)#monitor session 1 source interface fastethernet 4/2
!--- This configures interface Fast Ethernet 4/2 as source port.
4507R(config)#monitor session 1 destination interface fastethernet 4/3
!--- This configures interface Fast Ethernet 4/3 as the destination port.
4507R#show monitor session 1
Session 1
---------
Type : Local Session
Source Ports :
Both : Fa4/2
Destination Ports : Fa4/3
4507R#
This feature is not limited to Cisco devices... Juniper / HP / Extreme and other Enterprise ethernet switch vendors also support it.
How about using the misnamed tcpdump, which will capture all traffic off the wire. What I suggest is just capturing packets on the interface and not filtering at the capture level; afterwards you can filter the pcap file. Something like this:
tcpdump -w myfile.pcap -n -nn -i eth0
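For the post-capture filtering step, here is a hedged sketch (file name and source IP are placeholders) that reads the capture back with tshark's display filter, wrapped in Python only to keep the examples in one language:
import subprocess

# Read the pcap back and apply the display filter after the fact (-r/-Y),
# extracting the same field as the original script.
subprocess.call([
    'tshark', '-r', 'myfile.pcap',
    '-Y', '(sip.CSeq.method == "OPTIONS") && (sip.Status-Code) && ip.src == 10.0.0.1',
    '-T', 'fields', '-e', 'sip.response-time',
])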
If your LAN is a switched network (most are) or your desktop NIC doesn't support promiscuous mode, then you won't be able to see any of the packets. Verify both of those things.
