CentOS does not receive packets as well as Ubuntu - using sockets - networking

I have CentOS on my server, and the server has two NICs.
I want to capture every packet arriving on NIC 1 and forward it to NIC 2. I use raw sockets in my C++ application to capture and forward the packets, and I use iperf to test the application. Each NIC is connected to a different computer (Computer A and Computer B), and I send UDP packets from Computer A to B. The server is supposed to capture packets from A and forward them to B, and vice versa.
It works well when I ping Computer B from A, but something goes wrong when I generate more traffic with iperf. iperf generates 100K UDP packets per second (22-byte payload) and sends them from A to B, but Computer B does not receive all of the packets in the same time frame. For example, if iperf on A sends 100K packets in the first second, it takes Computer B about 20 seconds to receive them all. It seems like some caching mechanism on the server is holding the received packets. Let me show you what happens:
[OS: CentOS]
[00:00] Computer A starts sending...
[00:01] Computer A has sent 100,000 packets
[00:01] Finished
[00:00] Computer B starts listening...
[00:01] Computer B receives 300 packets
[00:02] Computer B receives 250 packets
[00:03] Computer B receives 200 packets
[00:04] Computer B receives 190 packets
[00:05] Computer B receives 180 packets
[00:06] Computer B receives 170 packets
[00:07] Computer B receives 180 packets
[00:08] Computer B receives 20 packets
[00:09] Computer B receives 3 packets
[00:10] (the same thing continues until all 100K packets have been received; it takes a long time)
At the 4th second I unplugged the network cable from Computer A to make sure it was no longer sending any packets, yet Computer B kept receiving packets. It seems like something holds the traffic on the server and releases it slowly. I tried turning the firewall off, but nothing changed.
I changed the server OS to Ubuntu to check whether the problem is in my code, but it works fine on Ubuntu. After that I tried changing the socket buffer sizes on CentOS, but it didn't help. Here are the important parts of my code.
How I set up the socket:
int packet_socket = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
if (packet_socket == -1) {
    printf("Can't create AF_PACKET socket\n");
    return -1;
}
// Use TPACKET_V3 so we can read/poll on a per-block basis instead of per packet
int version = TPACKET_V3;
int setsockopt_packet_version = setsockopt(packet_socket, SOL_PACKET, PACKET_VERSION, &version, sizeof(version));
if (setsockopt_packet_version < 0) {
    printf("Can't set packet v3 version\n");
    return -1;
}
// Put the interface into promiscuous mode
struct ifreq ifopts;
memset(&ifopts, 0, sizeof(ifopts));
strncpy(ifopts.ifr_name, interface_name, IFNAMSIZ - 1);
ioctl(packet_socket, SIOCGIFFLAGS, &ifopts);   // read the current flags
ifopts.ifr_flags |= IFF_PROMISC;
ioctl(packet_socket, SIOCSIFFLAGS, &ifopts);   // write the updated flags back
// Bind the socket to a specific interface
struct sockaddr_ll sockll;
bzero((char *)&sockll, sizeof(sockll));
sockll.sll_family = AF_PACKET;
sockll.sll_protocol = htons(ETH_P_ALL);
sockll.sll_ifindex = get_interface_index(interface_name, packet_socket);
bind(packet_socket, (struct sockaddr *)&sockll, sizeof(sockll));
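For context, this is roughly how I intend to map the TPACKET_V3 ring buffer later (a simplified sketch with made-up sizes, not the code I currently run; right now I still read with recv() as shown below):
// Needs <sys/mman.h> and <linux/if_packet.h>
struct tpacket_req3 req;
memset(&req, 0, sizeof(req));
req.tp_block_size = 1 << 22;   // 4 MiB per block (example value)
req.tp_frame_size = 1 << 11;   // 2 KiB per frame (example value)
req.tp_block_nr = 64;
req.tp_frame_nr = (req.tp_block_size / req.tp_frame_size) * req.tp_block_nr;
req.tp_retire_blk_tov = 60;    // retire a partially filled block after 60 ms
if (setsockopt(packet_socket, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req)) < 0) {
    printf("Can't set up PACKET_RX_RING\n");
    return -1;
}
// Map the ring into user space; blocks are then consumed after poll() and
// handed back to the kernel by resetting tp_status in each block header.
void *ring = mmap(NULL, (size_t)req.tp_block_size * req.tp_block_nr,
                  PROT_READ | PROT_WRITE, MAP_SHARED, packet_socket, 0);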
And here I receive packets:
u_char *packet;
packet=new u_char[1600];
*length = recv(packet_socket, packet, 1600, 0);
And here I send packets:
int res = write(packet_socket, packet, PacketSize);
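For reference, this is roughly how I tried increasing the socket buffer size (the values below are just ones I experimented with, not anything I'm sure is correct):
int rcvbuf = 4 * 1024 * 1024;   // 4 MiB, an arbitrary test value
if (setsockopt(packet_socket, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0) {
    printf("Can't set SO_RCVBUF\n");
}
// I also tried raising the system-wide limit, e.g.:
//   sysctl -w net.core.rmem_max=8388608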
I'm confused and I don't know what is going on with CentOS. Can you please help me understand what is happening?
Is CentOS a good choice for this job?
Thank you.

Try disabling SELinux on CentOS:
setenforce 0
Then try again.

Related

How to simulate the exact same network traffic that was previously recorded?

On Unix, how to simulate the exact same network traffic that was previously recorded?
I have a LAN made of 2 machines:
A local PC with interface eth0 and IP 192.168.1.1. On this PC runs a program C that listens on eth0, grabs UDP packets, and produces a result from them.
A piece of remote hardware with IP 192.168.1.10. The hardware needs an initialization step (configuration, handshake, acknowledgment) and must be kept active with a heartbeat. As long as the hardware is active, it sends data (grabbed by the local PC at the other end). The different communications go through different ports.
On the local PC, I plug in the running remote HW, run tcpdump -i eth0 -w dump.pcap & (in the background), and just after that I run the program C, which uses the UDP packets received from the HW (while tcpdump keeps running in parallel). This produces a result R1 on the local PC: R1 is valid and can be post-processed.
Now, after the dump.pcap recording is done, I leave the remote HW running (otherwise eth0 dies - ip a no longer associates an IP with eth0), I run tcpreplay -K --intf1=eth0 dump.pcap & (in the background), and just after that I re-run the program C, which should use the UDP packets received from tcpreplay running in parallel (at least, that's my understanding of what should occur). The traces while C runs look correct (initialization OK, no errors, running/receiving looks OK). Unfortunately, C produces another result R2, which is different from R1: R2 is invalid and cannot even be post-processed. The size of R2 is about the size of R1, but it seems to be filled with zero/uninitialized data.
Is it possible to simulate the exact same traffic as the one that was previously recorded? If yes, what did I miss or what am I doing wrong?
Note: I use a bash script to run tcpdump and C one right after the other when recording, and tcpreplay and C one right after the other when replaying (trying to do things the same way, with similar delays, as much as possible).

Should the kernel pass data from a tap interface to an application listening on INADDR_ANY?

I have an application that creates, listens on and writes to a tap interface. The software will read(tun_fd,...) and perform some action on that data, and it will return data to the system as UDP packets via write(tun_fd,...).
I assign an IP to the interface, 10.10.10.10/24, so that a socket application can bind to it and so that the kernel will pass any packets for the virtual subnet to the tap interface.
The software generates frames containing IP/UDP packets whose destination IP is the one assigned to the interface and whose source IP is another address in the same subnet. The source and destination MAC addresses match those of the tap device. Those frames are written back to the kernel with write(tun_fd,...).
If I open the tap interface in Wireshark I see my frames/packets as expected: properly formatted, with the expected ports, MACs, and IPs. But if I try to read those packets with netcat -lvu 0.0.0.0 ${MY_UDP_PORT}, I don't see anything.
Is this expected behavior?
Update 1
INADDR_ANY is a red herring. I have the problem even when explicitly binding to an interface/port, as in this pseudocode:
#> # make_tap_gen is a fake program that creates a tap interface and pushes UDP packets to 10.10.10.10:1234
#> ./make_tap_gen tun0
#> ip addr add dev tun0 10.10.10.10/24
#> netcat -lvu 10.10.10.10 1234
Update 2
I modified my code to be able to switch to a tun as opposed to a tap and I experience the same issue (well formatted packets in Wireshark but no data in socket applications).
Update 3
In the kernel documentation for tuntap it says
Let's say that you configured IPv6 on the tap0, then whenever
the kernel sends an IPv6 packet to tap0, it is passed to the application
(VTun for example). The application encrypts, compresses and sends it to
the other side over TCP or UDP. The application on the other side decompresses
and decrypts the data received and writes the packet to the TAP device,
the kernel handles the packet like it came from real physical device.
This implies to me that a write(tun_fd,...) where the packet was properly formatted and destined for an IP assigned to some interface on the system should be received by any application listening on 0.0.0.0:${MY_UDP_PORT}.
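For context, make_tap_gen opens the tap and injects frames roughly like this (a simplified sketch, not the exact code):
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <unistd.h>

int open_tap(const char *name) {
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -1;
    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;   // tap (Ethernet frames), no packet-info header
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;   // frames are then injected with write(fd, frame, len)
}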
Yes, data written into the tun/tap device via write(tun_fd,...) should get passed to the kernel protocol stack and delivered to listening sockets with matching packet information, just as if the frame had arrived over a wire attached to a physical Ethernet device.
It requires that the packets be properly formed (IP checksum is good, UDP checksum is good or 0). It requires that the kernel know how to handle the packet (is there an interface on the system with a matching destination IP?). If it's a tap device it may also require that your application is properly ARP'ing (although this might not be necessary for a 'received' packet from the perspective of a socket application listening to an address assigned to the tap device).
In my case the problem was silly: while I had turned on UDP checksum verification in Wireshark, I had forgotten to turn on IP header verification. An extra byte swap was breaking that checksum. After fixing it, I was immediately able to see packets written into the tap device in a socket application listening on the address assigned to that interface.
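For anyone hitting the same thing, the IP header checksum that has to be correct is just the standard ones'-complement sum over the header; here is a generic sketch (not the code from this project) that can be used as a sanity check:
#include <stddef.h>
#include <stdint.h>

// Standard IPv4 header checksum: ones'-complement sum of all 16-bit words
// in the header, with the checksum field set to 0 while summing.
static uint16_t ip_checksum(const void *hdr, size_t len) {
    const uint16_t *p = (const uint16_t *)hdr;
    uint32_t sum = 0;
    while (len > 1) {
        sum += *p++;
        len -= 2;
    }
    if (len == 1)                  // odd trailing byte
        sum += *(const uint8_t *)p;
    while (sum >> 16)              // fold carries back into the low 16 bits
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}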

DPDK l2fwd with multiple RX/TX queue in KVM

I wanted to try out multiple RX/TX queues in KVM (guest: CentOS). I have compiled DPDK (version 18.05.1), inserted the igb_uio driver, and bound two interfaces to it.
I am testing a client-to-server connection (private network):
client (eth1: 10.10.10.2/24) <--> (eth1) CentOS VM (DPDK 18.05.1)
(eth2) <--> server (eth1: 10.10.10.10/24)
The VM manages both interfaces directly in passthrough mode (macvtap passthrough).
<interface type='direct' trustGuestRxFilters='yes'>
  <source dev='ens1' mode='passthrough'/>
  <model type='virtio'/>
  <driver name='vhost' queues='2'/>
</interface>
When the l2fwd application is started with a single RX & TX queue (the default) and no-mac-updating, client-server connectivity works perfectly.
I made some changes to try multiple RX/TX queues with the l2fwd application.
I could see that ARP was not getting resolved at either end, and the VM doesn't receive any packets after that.
Can someone point me to documentation on using multiple RX/TX queues so I can verify my changes? Do multiple RX/TX queues work in a VM environment? I have seen others complaining about this too.
I am a newbie in the DPDK world. Any help would be useful. Thank you.
Edit (adding more details):
I am configuring the Ethernet device with 1 RX queue and 2 TX queues in the l2fwd example.
uint16_t q = 0;
uint16_t no_of_tx_queues = 2;
// Configure the Ethernet device with 1 RX queue and 2 TX queues
rte_eth_dev_configure(portid, 1, no_of_tx_queues, &local_port_conf);
// Configure the RX queue
ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd, rte_eth_dev_socket_id(portid), &rxq_conf, l2fwd_pktmbuf_pool);
// Configure the 2 TX queues
for (q = 0; q < no_of_tx_queues; q++) {
    ret = rte_eth_tx_queue_setup(portid, q, nb_txd, rte_eth_dev_socket_id(portid), &txq_conf);
}
I am reading packets from a single RX queue, queue-id 0 (as set up earlier).
nb_rx = rte_eth_rx_burst(portid, 0, pkts_burst, MAX_PKT_BURST);
I am seeing that some packets arrive and are forwarded to the other interface, but some are not. For ICMP (ping), I can see the ARP is forwarded, but the ICMP echo request is never read by l2fwd.
The solution I found:
I configured 2 RX & 2 TX queues in l2fwd. I can see that the ICMP request is read from the second RX queue (queue-id 1) and forwarded too. With that, client-to-server connectivity works as expected.
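Roughly, the change that made it work looks like this (a simplified sketch reusing the variable names from above; error handling omitted):
uint16_t nb_queues = 2;
// Configure the Ethernet device with 2 RX and 2 TX queues
rte_eth_dev_configure(portid, nb_queues, nb_queues, &local_port_conf);
for (q = 0; q < nb_queues; q++) {
    rte_eth_rx_queue_setup(portid, q, nb_rxd, rte_eth_dev_socket_id(portid), &rxq_conf, l2fwd_pktmbuf_pool);
    rte_eth_tx_queue_setup(portid, q, nb_txd, rte_eth_dev_socket_id(portid), &txq_conf);
}
// Poll both RX queues instead of only queue 0
for (q = 0; q < nb_queues; q++) {
    nb_rx = rte_eth_rx_burst(portid, q, pkts_burst, MAX_PKT_BURST);
    /* ... forward the packets as before ... */
}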
The question here is:
Even though I configured 1 RX queue & 2 TX queues, why are some packets arriving on queue-id 1 (which is not configured and not read by the l2fwd application)?
This is observed in the KVM (running on CentOS) environment. I checked the same setup on ESXi, and there all packets are read from the single queue (queue-id 0) and forwarded.
Why? Please explain. Is there any way to turn off the load balancing (of packets spread across two RX queues) in KVM so that I receive all packets on a single queue?
Here is DPDK's vhost multiple-queue test plan with all the command-line arguments used:
https://doc.dpdk.org/dts/test_plans/vhost_multi_queue_qemu_test_plan.html
There are not many details in the question, so the only suggestion I have is to make sure multiple queues work first and then run l2fwd on top of that. If the guest OS does not work with multiple queues, DPDK won't fix the issue.

Why do I see packets whose source and destination are not my IP address?

I'm new to the networking world, and I'm trying to use Wireshark to get the hang of how packets are sent from my machine. So this question might be a dumb one.
When I open the Wireshark packet analyzer GUI (on Windows 7) there are source and destination columns. They show packets where the source IP is not mine and the destination IP is not mine either. Why is this happening? My network interface card should only be receiving and sending packets addressed to or sent from my IP address, right?
(My IP address is 10.177.255.186.)
Thanks.
On a small LAN, all packets are generally broadcast to everyone; by broadcast I mean that the data is physically sent to everyone. When a packet is received, the network interface determines whether it was sent to you by looking at the address.
With Wireshark, your network interface can be put into promiscuous mode, which means that all packets are captured and passed from the network interface to the CPU. This allows programs like Wireshark to record all of those packets, not just the ones addressed to your computer.
Edit: However, packets don't have to be sent to all computers. A hub can be used to connect multiple computers together and acts as a simple repeater, meaning all packets are always sent everywhere (except back down the wire the packet came from). A switch is similar but smarter.
If three computers A, B, and C are connected to a switch and A sends a packet to B, the packet will first arrive at the switch. If the switch knows which wire B is connected to, it will only send the packet down that wire. If it doesn't know, it sends it everywhere, and later, when B replies to A, the switch figures out which wire B is on. This means that C will generally never see any of the messages sent between A and B once the switch knows which wires A and B are on.

P2P networking when each device is behind its own NAT

I'm working on a mobile project that needs P2P communication between two devices.
And I ran into a problem (since it's rare for a smartphone to have a public IP).
I found an answer: 'UDP hole punching'.
I think I understand UDP hole punching conceptually, and I wrote some code.
But it doesn't work.
This is my situation.
Device A is connected to NAT(A) over Wi-Fi.
Device B is connected to NAT(B) over Wi-Fi.
NAT(A) and NAT(B) are different devices.
Relay server S binds a socket and waits for the devices. (S is a web server, but its network conditions are good.)
First, A and B each send a dummy packet to S. Then S saves a UniqueID (to tell A and B apart), the public IP, and the port.
Then S sends the information to each device, A and B.
Like this:
- IP address and port number of A -> sent to B
- IP address and port number of B -> sent to A
Now A and B send UDP packets to the other device based on the information (IP address and port number) from S.
(15 packets per second, using the same socket that was used for the server-device session.)
But it's not working. (Actually it works intermittently, maybe once in 10 tries, and I can't see any pattern in why it succeeds or fails.)
I don't think it's a NAT type problem. I tested in South Korea, and 90% of NATs in South Korea are not symmetric.
Depending on the implementation of the NAT, it may not work at all. NAT hole punching requires a particular form of NAT behavior:
a) If the NAT recognizes UDP traffic, it may (but sometimes does not) translate by changing the sender port number to some random port number (and changing the sender IP to the public IP address) and then, for some limited period of time, redirect incoming UDP traffic on that port back to the host behind the NAT (changing back the port number and rewriting the receiver IP). That's the case where hole punching works.
b) Another possibility is that the NAT redirects traffic arriving on that opened port only if it comes from the specific host that was originally contacted. That is the case where it will not work.
c) It's not standardized what "refreshes" the timeout for the incoming-traffic rule. The timeout may be prolonged by incoming traffic, but it may also be necessary to have outgoing traffic to the same host (server S) to prolong it.
It also seems the UDP state expires very quickly in some implementations (within 100 ms in some cases). This means you'll either need to keep sending keep-alive packets to your server S, or you need to send UDP packets at intervals shorter than 100 ms (e.g. once every 50 ms or 20 ms).
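As a rough illustration of that keep-alive scheme (a generic sketch; the peer's public address is assumed to have already been learned from server S):
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Punch and keep the mapping alive: keep sending small datagrams to the
// peer's public endpoint faster than the NAT's UDP timeout, e.g. every 50 ms.
void punch_loop(int sock, struct sockaddr_in peer) {
    const char probe[] = "ping";
    for (;;) {
        sendto(sock, probe, sizeof(probe), 0,
               (struct sockaddr *)&peer, sizeof(peer));
        char buf[1500];
        struct sockaddr_in from;
        socklen_t fromlen = sizeof(from);
        // Check for a reply from the peer (the socket is assumed to have a
        // short SO_RCVTIMEO set elsewhere so this does not block forever).
        ssize_t n = recvfrom(sock, buf, sizeof(buf), 0,
                             (struct sockaddr *)&from, &fromlen);
        if (n > 0) {
            /* hole is open: the peer's packets are getting through */
        }
        usleep(50 * 1000);   // 50 ms between probes
    }
}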
