How to forward packets from eth0 to a tun/tap device? - networking

Here is the topology: HostA(eth0) ---- (eth0)HostB
I have created a tun/tap device on HostB, say tun0 or tap0. When eth0 of HostB receives a packet from HostA, maybe an ICMPv6 packet (NS, echo request, etc.) or a UDP/TCP packet (encapsulated in an IPv6 header), I want to forward this packet from eth0 to tap0. After doing something to this packet, I also want to send a reply back to HostA, through tap0 and eth0.
I cannot find a way to do that; can someone help me or give some hints?

This is an extremely basic routing question, probably unsuitable for Stack Overflow.
You need something like this on Host B:
HostB# sysctl -w net.ipv6.conf.all.forwarding=1
HostB# ip -6 addr add 2001:db8:0:0::1/64 dev eth0
HostB# ip -6 addr add 2001:db8:0:1::1/64 dev tun0
Then on Host A:
HostA# ip -6 addr add 2001:db8:0:0::2/64 dev eth0
HostA# ip -6 route add default via 2001:db8:0:0::1 dev eth0
HostA# ping6 2001:db8:0:1::2 # <-- should work if that host exists on tun0
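Note that the setup above assumes tun0 already exists on Host B. A tun/tap device is normally created and held open by the userspace program that processes the packets; for a quick test it can also be created with iproute2 (a minimal sketch, to be adapted to however your program opens the device):
HostB# ip tuntap add mode tun name tun0
HostB# ip link set tun0 up
HostB# tcpdump -i tun0 -n   # once your program holds tun0 open, forwarded packets from HostA should show up here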

Related

TCP push packet not delivered from tun

I set up a simple packet intercept program using two tuns, set up like this:
# ip tuntap add mode tun name tun0
# ip link set tun0 up
# ip addr add 10.0.0.0/31 dev tun0
# ip tuntap add mode tun name tun1
# ip link set tun1 up
# ip addr add 10.0.1.0/31 dev tun1
and redirect the outgoing traffic to the program like this:
# ip rule add fwmark 1 table 1
# ip route add default dev tun0 table 1
# iptables -t mangle -A OUTPUT --source 192.168.1.0 -o enp34s0 -p tcp --dport 9732 -j MARK --set-mark 1
# iptables -t nat -A POSTROUTING --source 10.0.1.1 -o enp34s0 -j MASQUERADE
I enabled ip_forward and disabled rp_filter. Packets received on tun0 are processed, modified, and the IP/TCP checksums are updated.
I can even correctly intercept the TCP handshake (SYN -> SYN,ACK -> ACK) part of the communication, but after that any incoming packet is correctly intercepted, modified and sent out of the tun, yet it is never delivered to the local application.
OK, found the problem: while I did recalculate the checksum, it only came out correct for TCP packets without any payload, so the TCP handshake got through and nothing else.
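A quick way to spot this kind of checksum bug (a hedged aside, not part of the original setup) is to let tcpdump verify the checksums of the packets the program writes back to the tun interfaces; with -v it flags TCP segments whose checksum does not validate, and since packets on a tun come from a userspace program there is no checksum offload to produce false positives:
# tcpdump -n -v -i tun0 tcp
# tcpdump -n -v -i tun1 tcp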

Receive specific multicast message on a client connected over VPN

Case:
[ Subnet A , 192.168.2.0/24, Padavan firmware based internet gw ]
[ Subnet B , 192.168.1.0/24, Padavan firmware based internet gw ]
Host from subnet A (2.155) is connected via VPN (possible options: PPTP, OpenVPN, L2TP w/o IPsec) to subnet B, and receives an address, say 1.245/32.
In subnet B there is a host (1.10/32) which sends multicast datagrams to 224.0.0.50:9898; on the router I see them with
tcpdump -i br0 -c 10 dst host 224.0.0.50 and port 9898 and multicast
13:46:54.345369 IP 192.168.1.10.4321 > 224.0.0.50.9898: UDP, length 135
I am looking for a solution to receive/forward those multicast messages so they can be seen by hosts connected via VPN.
On router B, which is Padavan firmware based, I am limited to the udpxy and igmproxy utilities, if needed.
On the client host I am Debian based, and generally not limited in tools.
The datagrams are a proprietary protocol, i.e. not an IPTV or video stream.
Any ideas are welcome.
[UPD] Additional info - per discussion in comments
It is a very specific hardware device, which is not very chatty in Ethernet terms (at most 1-2 datagrams every 5 seconds), so it should certainly be forwardable. Unfortunately, it sends status updates purely via multicast. In subnet A there is a similar device + control software. Thus I am looking for a way for datagrams multicast to 224.0.0.50:9898 in subnet B to re-appear in subnet A, possibly with the help of some tool: maybe smcroute, maybe udpxy, maybe igmproxy.
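For reference, a routed (L3) alternative to the L2 bridge described below would be to pin a static multicast route with smcroute, one of the tools mentioned above (newer versions ship the smcroutectl client). This is only a hedged sketch: the interface names are illustrative, it has to run on a box that has both the LAN-side and VPN-side interfaces (which may not be possible on the Padavan router itself), and the VPN link must carry multicast:
# smcroutectl join br0 224.0.0.50                    # subscribe on the LAN side so the group is received
# smcroutectl add br0 192.168.1.10 224.0.0.50 tap0   # forward (source, group) towards the VPN interface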
As I don't like to leave resolved questions unanswered, here is the currently working solution.
In subnet B I have installed an OpenVPN server endpoint, configured as L2 (TAP).
In subnet A, on a control host, I have installed an OpenVPN client that connects to subnet B; the assigned interface is tapz.
20: tapz: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 100
link/ether 0a:da:be:96:78:d9 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.245/24 brd 192.168.1.255 scope global noprefixroute tapz
valid_lft forever preferred_lft forever
inet6 fe80::8da:beff:fe96:78d9/64 scope link
valid_lft forever preferred_lft forever
So now on the control host I have:
multicast from the local device on the physical Ethernet enp5s0:
sudo tcpdump -i enp5s0 -c 10 dst host 224.0.0.50 and port 9898 and multicast
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp5s0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:55:05.642963 IP lumi-gateway-v3_miio56591509.4321 > 224.0.0.50.9898: UDP, length 136
and now I also receive multicast from the remote network device on tapz:
sudo tcpdump -i tapz -c 10 dst host 224.0.0.50 and port 9898 and multicast
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapz, link-type EN10MB (Ethernet), capture size 262144 bytes
13:53:49.141751 IP 192.168.1.10.4321 > 224.0.0.50.9898: UDP, length 135
So far that is what I was looking for: I am getting the necessary datagrams on the VPN client. OpenVPN on the remote side can also be tuned to filter which multicast traffic is forwarded.
For those who come here with the same question: once you have the necessary multicast on tap0, you can create a bridge from, say, eth0 and tap0:
ip link add br0 type bridge
ip link set tap0 master br0
ip link set eth0 master br0
PoC - both multicasts on a single interface:
sudo tcpdump -i br0 dst host 224.0.0.50 and port 9898
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:09:51.823632 IP 192.168.1.10.4321 > 224.0.0.50.9898: UDP, length 135
21:09:55.045138 IP 192.168.2.214.4321 > 224.0.0.50.9898: UDP, length 136
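One practical note (an assumption about a typical setup, not stated in the original answer): once eth0 is enslaved to br0, the host's own IP configuration normally has to live on the bridge rather than on eth0, e.g.:
ip link set br0 up
ip addr flush dev eth0                 # eth0 is now just a bridge port
ip addr add 192.168.2.245/24 dev br0   # illustrative address: move the host's LAN address to the bridge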

What is the relation between docker0 and eth0?

I know that by default docker creates a virtual bridge docker0, and all container networks are linked to docker0.
As illustrated above:
container eth0 is paired with vethXXX
vethXXX is linked to docker0, just like a machine linked to a switch
But what is the relation between docker0 and host eth0?
More specifically:
When a packet flows from the container to docker0, how does it know it will be forwarded to eth0, and then to the outside world?
When an external packet arrives at eth0, why is it forwarded to docker0 and then to the container, instead of being processed or dropped?
Question 2 can be a little confusing; I will keep it there and explain a little more:
It is a return packet for a connection initiated by the container (in question 1): since the outside does not know the container network, the packet is sent to the host's eth0. How is it forwarded to the container? I mean, there must be some place where this information is stored; how can I check it?
Thanks in advance!
After reading the answer and the official networking articles, I find the following diagram more accurate: docker0 and eth0 have no direct link; instead, they can forward packets to each other:
http://dockerone.com/uploads/article/20150527/e84946a8e9df0ac6d109c35786ac4833.png
There is no direct link between the default docker0 bridge and the host's Ethernet devices. If you use the --net=host option for a container then the host's network stack will be available in the container.
When a packet flows from container to docker0, how does it know it will be forwarded to eth0, and then to the outside world?
The docker0 bridge has the .1 address of the Docker network assigned to it; this is usually something in 172.17 or 172.18.
$ ip address show dev docker0
8: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:03:47:33:c1 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
Containers are assigned a veth interface which is attached to the docker0 bridge.
$ bridge link
10: vethcece7e5 state UP #(null): <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master docker0 state forwarding priority 32 cost 2
Containers created on the default Docker network receive the .1 address as their default route.
$ docker run busybox ip route show
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 src 172.17.0.3
Docker uses NAT MASQUERADE for outbound traffic from there and it will follow the standard outbound routing on the host, which may or may not involve eth0.
$ iptables -t nat -vnL POSTROUTING
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
iptables handles the connection tracking and return traffic.
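The connection-tracking entries that make this return traffic work can be inspected directly (a hedged aside; requires the conntrack tool from conntrack-tools):
$ conntrack -L -s 172.17.0.0/16   # list tracked connections originating from the Docker network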
When an external packet arrives at eth0, why is it forwarded to docker0 and then to the container, instead of being processed or dropped?
If you are asking about the return path for outbound traffic from the container, see iptables above as the MASQUERADE will map the connection back through.
If you mean new inbound traffic, packets are not forwarded into a container by default. The standard way to achieve this is to set up a port mapping: Docker launches a proxy process that listens on the host on port X and forwards to the container on port Y.
I'm not sure why NAT wasn't used for inbound traffic as well. I've run into some issues trying to map large numbers of ports into containers, which led to mapping real-world interfaces completely into containers.
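As a concrete illustration of such a port mapping (a hedged sketch; the image name and port numbers are arbitrary), publishing a port creates both a host-side listener and a DNAT rule in the DOCKER chain:
$ docker run -d -p 8080:80 nginx
$ iptables -t nat -vnL DOCKER   # DNAT rule mapping host port 8080 to the container's port 80
$ ss -lntp | grep 8080          # the docker proxy process listening on the host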
You can find the relation between a container's interface and its host-side veth by comparing iflink inside the container with ifindex on the host machine.
Get iflink from a container:
$ docker exec ID cat /sys/class/net/eth0/iflink
17253
Then find this ifindex among interfaces on the host machine:
$ grep -l 17253 /sys/class/net/veth*/ifindex
/sys/class/net/veth02455a1/ifindex
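The same number can also be matched against ip link output on the host, where the ifindex is the leading field (a hedged variation on the above; 17253 is the value from the example):
$ ip -o link | grep '^17253:'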

Ping external IPv6 address from a network namespace

I need to reach an external IPv6 address from a network namespace.
On my host I have set up a SIT tunnel (IPv6-in-IPv4) that tunnels IPv6 packets and sends them through the default interface (eth0). The SIT tunnel relies on the Hurricane Electric tunnel broker service. I can ping an external IPv6 address from the host.
$ ping6 ipv6.google.com
PING ipv6.google.com(lis01s13-in-x0e.1e100.net) 56 data bytes
64 bytes from lis01s13-in-x0e.1e100.net: icmp_seq=1 ttl=57 time=98.1 ms
64 bytes from lis01s13-in-x0e.1e100.net: icmp_seq=2 ttl=57 time=98.7 ms
Here are some details about the tunnel:
$ ip -6 route sh
2001:470:1f14:10be::/64 dev he-ipv6 proto kernel metric 256
default dev he-ipv6 metric 1024
Here comes the interesting part. For reasons that are beyond the scope of this question, I need to do the same thing (ping ipv6.google.com) from within a network namespace.
Here is how I create and setup my network namespace:
ns1.sh
#!/bin/bash
set -x
if [[ $EUID -ne 0 ]]; then
    echo "You must run this script as root."
    exit 1
fi
# Create network namespace 'ns1'.
ip netns del ns1 &>/dev/null
ip netns add ns1
# Create veth pair.
ip li add name veth1 type veth peer name vpeer1
# Setup veth1 (host).
ip -6 addr add fc00::1/64 dev veth1
ip li set dev veth1 up
# Setup vpeer1 (network namespace).
ip li set dev vpeer1 netns ns1
ip netns exec ns1 ip li set dev lo up
ip netns exec ns1 ip -6 addr add fc00::2/64 dev vpeer1
ip netns exec ns1 ip li set vpeer1 up
# Make vpeer1 default gw.
ip netns exec ns1 ip -6 route add default dev vpeer1
# Get into ns1.
ip netns exec ns1 /bin/bash --rcfile <(echo "PS1=\"ns1> \"")
Then I run ns1.sh and ping veth1 (fc00::1) and vpeer1 (fc00::2) from 'ns1'.
ns1> ping6 fc00::1
PING fc00::1(fc00::1) 56 data bytes
64 bytes from fc00::1: icmp_seq=1 ttl=64 time=0.075 ms
^C
ns1> ping6 fc00::2
PING fc00::2(fc00::2) 56 data bytes
64 bytes from fc00::2: icmp_seq=1 ttl=64 time=0.056 ms
However, if I try to ping an external IPv6 address:
ns1> ping6 2a00:1450:4004:801::200e
PING 2a00:1450:4004:801::200e(2a00:1450:4004:801::200e) 56 data bytes
^C
--- 2a00:1450:4004:801::200e ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1008ms
All packets are lost.
I opened veth1 with tcpdump and checked what's going on. What I'm seeing is that Neighbor Solicitation packets are reaching the interface. These packets are trying to resolve the MAC address of the IPv6 destination address:
$ sudo tcpdump -qns 0 -e -i veth1
IPv6, length 86: fc00::2 > ff02::1:ff00:200e: ICMP6, neighbor solicitation, who has 2a00:1450:4004:801::200e, length 32
IPv6, length 86: fc00::2 > ff02::1:ff00:200e: ICMP6, neighbor solicitation, who has 2a00:1450:4004:801::200e, length 32
I don't really understand why this is happening. I have enabled IPv6 forwarding in the host too but it had no effect:
$ sudo sysctl -w net.ipv6.conf.default.forwarding=1
Thanks for reading this question. Suggestions welcome :)
EDIT:
Routing table in the host:
2001:470:1f14:10be::/64 dev he-ipv6 proto kernel metric 256
fc00::/64 dev veth1 proto kernel metric 256
default dev he-ipv6 metric 1024
I added an NDP proxy on the host, which answers the NDP solicitations. Still the address is not reachable from the netns (looking into this):
sudo sysctl -w net.ipv6.conf.all.proxy_ndp=1
sudo ip -6 neigh add proxy 2a00:1450:4004:801::200e dev veth1
ULAs are not routable
You have given a Unique Local Address (fc00::2) to the network namespace: this IP address is not routable on the global internet but only on your local network.
When your ISP receives an ICMP packet coming from this address it will drop it. Even if this packet successfully reached ipv6.google.com, it could not possibly send the answer back to you because there is no announced route for this IP address.
Routing table problem (NDP solicitations)
You get NDP solicitations because of this line:
ip netns exec ns1 ip -6 route add default dev vpeer1
which tells the kernel that (in the netns) all IP addresses are directly connected on the vpeer1 interface. The kernel thinks that the destination IP address is present on the Ethernet link: that's why it's trying to resolve its MAC address with NDP.
Instead, you want to say that they are reachable through a given router (in your case, the router is your host):
ip netns exec ns1 ip -6 route add default dev vpeer1 via $myipv6
Solutions
You can either:
associate a public IPv6 address (from your public IPv6 prefix) with your netns and set up an NDP proxy on the host for this address;
subnet your IPv6 prefix and route a subnet to your host (if you can);
use NAT (bad, ugly, don't do that).
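For completeness, the NAT option (explicitly discouraged) would look roughly like this on the host, a hedged sketch using ip6tables MASQUERADE with the ULA prefix and tunnel interface from the question:
sysctl -w net.ipv6.conf.all.forwarding=1
ip6tables -t nat -A POSTROUTING -s fc00::/64 -o he-ipv6 -j MASQUERADE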
You should be able to achieve the first one using something like this:
#!/bin/bash
set -x
myipv6=2001:470:1f14:10be::42
peeripv6=2001:470:1f14:10be::43
# Create the netns:
ip netns add ns1
# Create and configure the local veth:
ip link add name veth1 type veth peer name vpeer1
ip -6 address add $myipv6/128 dev veth1
ip -6 route add $peeripv6/128 dev veth1
ip li set dev veth1 up
# Setup vpeer1 in the netns:
ip link set dev vpeer1 netns ns1
ip netns exec ns1 ip link set dev lo up
ip netns exec ns1 ip -6 address add $peeripv6/128 dev vpeer1
ip netns exec ns1 ip -6 route add $myipv6/128 dev vpeer1
ip netns exec ns1 ip link set vpeer1 up
ip netns exec ns1 ip -6 route add default dev vpeer1 via $peeripv6
# IP forwarding
sysctl -w net.ipv6.conf.default.forwarding=1
# NDP proxy for the netns
ip -6 neigh add proxy $myipv6 dev veth1
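Once the script has run, the result can be sanity-checked from both sides (a quick check, assuming the Hurricane Electric tunnel is up and 2001:470:1f14:10be::/64 is really the routed prefix):
# on the host: the proxy entry must be present and forwarding enabled
ip -6 neigh show proxy
sysctl net.ipv6.conf.all.forwarding net.ipv6.conf.default.forwarding
# from the netns: ping the host side first, then an external address
ip netns exec ns1 ping6 -c 2 2001:470:1f14:10be::42
ip netns exec ns1 ping6 -c 2 ipv6.google.com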

Pinging between two tap devices on the same machine

I have two virtual TAP interfaces tap0 and tap1 on my machine. They have IPs 10.0.0.1 and 10.0.0.2 respectively. They are both connected to each other using socat. Both have netmasks 255.255.255.0 (and hence are on the same subnet). With this setup, I try pinging 10.0.0.2 through tap0 and vice versa. This doesn't seem to work for some reason. Although tcpdump shows ARP packets from tap0 reaching tap1, there are no ARP replies and hence no ICMP requests and hence no ICMP replies. Using a TUN device instead of a TAP device bypasses the ARP request/response cycle, but now the ICMP requests show up at tap1 with no ICMP response coming back.
I have tried a couple of things like enabling ip_forward ( echo 1 > /proc/sys/net/ipv4/ip_forward) and disabling reverse path filtering ( echo 0 > /proc/sys/net/ipv4/conf/tap0/rp_filter and echo 0 > /proc/sys/net/ipv4/conf/tap1/rp_filter ).
Here are the commands to reproduce my problem:
sudo socat TUN:10.0.0.1/24,tun-type=tap TUN:10.0.0.2/24,tun-type=tap
sudo ifconfig tap0 10.0.0.1 netmask 255.255.255.0
sudo ifconfig tap1 10.0.0.2 netmask 255.255.255.0
ping -Itap0 10.0.0.2
tcpdump -i tap0 -n
tcpdump -i tap1 -n
