Will socket.recvfrom always return a valid address? - networking

I'm working on a communication protocol that should support self configuration by broadcasting / multicasting the peers' address over the local network. The intuitive way would be to broadcast the address, but as it turns out, it's pretty hard to reliably figure out the local IP address of the current machine (depending on the configuration, you might get "127.0.0.1" or another useless address).
The alternative is to not include the host address in the broadcast message, but to have the receivers call recvfrom on their socket, which returns not only the received data but also the sender's address. As I see it, that call is available on both Unix and Windows (one of my requirements) and probably on other platforms as well. My question now is: are there situations where this might fail and recvfrom returns an unreachable or otherwise useless address?
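For reference, a minimal sketch of that receiving side in C (assumptions: IPv4 only, a hypothetical discovery port 5000, error handling omitted); recvfrom() fills in the peer's address alongside the payload:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int s = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in local;
        memset(&local, 0, sizeof(local));
        local.sin_family = AF_INET;
        local.sin_addr.s_addr = htonl(INADDR_ANY);
        local.sin_port = htons(5000);            /* hypothetical discovery port */
        bind(s, (struct sockaddr *)&local, sizeof(local));

        char buf[1500];
        struct sockaddr_in peer;
        socklen_t peerlen = sizeof(peer);

        /* recvfrom returns the payload and fills 'peer' with the sender's
         * address and port as seen on the wire */
        ssize_t n = recvfrom(s, buf, sizeof(buf), 0,
                             (struct sockaddr *)&peer, &peerlen);
        if (n >= 0) {
            char ip[INET_ADDRSTRLEN];
            inet_ntop(AF_INET, &peer.sin_addr, ip, sizeof(ip));
            printf("got %zd bytes from %s:%u\n", n, ip, ntohs(peer.sin_port));
        }
        close(s);
        return 0;
    }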

If you limit this technique to broadcast UDP only, you should be fine. The only things that tend to mess with the sender's address are setups like dual NAT or hairpin NAT, and those just aren't applied to broadcasts that stay on the local segment anyway.

Any address is subject to becoming unreachable (and therefore useless by your definition) at any time. Your software should be prepared to deal with that.
You can reliably determine the system's IP addresses (note the plural, more on that in a minute) by enumerating the interfaces. How you do this differs among platforms, as there's nothing in a standard (e.g., POSIX) that specifies how it should be done. Many Unixy systems have a getifaddrs() call; on Windows the equivalent is GetAdaptersAddresses(). Either way, isolating that code should be easy.
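For the Unixy case, a minimal getifaddrs() sketch (IPv4 only, error handling trimmed) that lists the address of every interface that is up:

    #include <arpa/inet.h>
    #include <ifaddrs.h>
    #include <net/if.h>
    #include <netinet/in.h>
    #include <stdio.h>

    int main(void) {
        struct ifaddrs *ifas, *ifa;
        if (getifaddrs(&ifas) == -1)
            return 1;

        for (ifa = ifas; ifa != NULL; ifa = ifa->ifa_next) {
            /* skip interfaces that are down or carry no IPv4 address */
            if (ifa->ifa_addr == NULL || !(ifa->ifa_flags & IFF_UP))
                continue;
            if (ifa->ifa_addr->sa_family != AF_INET)
                continue;

            struct sockaddr_in *sin = (struct sockaddr_in *)ifa->ifa_addr;
            char ip[INET_ADDRSTRLEN];
            inet_ntop(AF_INET, &sin->sin_addr, ip, sizeof(ip));
            printf("%s: %s\n", ifa->ifa_name, ip);
        }
        freeifaddrs(ifas);
        return 0;
    }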
Your software also shouldn't assume that any IP it comes across is "the" address. On a system with more than one interface (which is most of them, if you count the loopback), routing may change, or someone may want to run your protocol on a segment that isn't on the same interface as the default route.
If you're going to broadcast a message, you need to do it once on every interface which is up, loopbacks included, unless you're configured to do otherwise. The broadcast should also happen from each of those interfaces so it has the proper address. You can't assume that other hosts on the same segment as any interface know about any other interface or have a way to route to their addresses.
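A hedged sketch of that "announce once per interface" idea, reusing getifaddrs() (the broadcast address comes from ifa_broadaddr; port 5000 is again a hypothetical discovery port, and loopback/multicast handling is left out):

    #include <arpa/inet.h>
    #include <ifaddrs.h>
    #include <net/if.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Send one announcement out of every broadcast-capable IPv4 interface,
     * bound to that interface's own address so receivers see a source
     * address that is reachable on their segment. */
    static void announce_all(const char *msg, size_t len) {
        struct ifaddrs *ifas, *ifa;
        if (getifaddrs(&ifas) == -1)
            return;

        for (ifa = ifas; ifa != NULL; ifa = ifa->ifa_next) {
            if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
                continue;
            if (!(ifa->ifa_flags & IFF_UP) || !(ifa->ifa_flags & IFF_BROADCAST))
                continue;

            int s = socket(AF_INET, SOCK_DGRAM, 0);
            int yes = 1;
            setsockopt(s, SOL_SOCKET, SO_BROADCAST, &yes, sizeof(yes));

            struct sockaddr_in local;
            memcpy(&local, ifa->ifa_addr, sizeof(local));
            local.sin_port = 0;                       /* any source port */
            bind(s, (struct sockaddr *)&local, sizeof(local));

            struct sockaddr_in dst;
            memcpy(&dst, ifa->ifa_broadaddr, sizeof(dst));
            dst.sin_family = AF_INET;
            dst.sin_port = htons(5000);               /* hypothetical discovery port */
            sendto(s, msg, len, 0, (struct sockaddr *)&dst, sizeof(dst));
            close(s);
        }
        freeifaddrs(ifas);
    }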
If your protocol is intended for use only on connected segments, throwing away data from non-connected subnets would be a reasonable thing to do.

Related

How to detect router?

I am trying to write a program that scans an IP range and detects whether an IP address belongs to a router or not.
Currently I use traceroute from my computer to every host in the network. However, I believe there must be some way to directly "ask" a host at an IP whether it is a router or not?
By the way, do you know of any program or open-source project that already does this?
Routers are supposed to speak a couple of protocols (actually a neat bunch) that regular IP nodes do not, and then there are some that are more common (i.e. ones that even non-router nodes speak).
Router-only protocols:
VRRP
IGRP / EIGRP
OSPF
BGP
RIP
You could do active probing on those, i.e. send a packet (behaving as if you were another router, or an end node) and check what kind of response, if any, the router sends.
Alternatively you could do passive probing, i.e. 'sniffing': watching for the kinds of IP packets sent out by various nodes. Some of them are usually sent by routers only (again, mostly from the above list).
A common protocol that can nonetheless tell you a lot:
SNMP (especially the insecure versions like v1/v2, which are easy to deal with without having to establish a secure session)
Other ways:
Port scanning (which can actually tell you quite a lot); for example, most routers expose some management ports (although these are often locked down due to security concerns)
What you want to do is what much 'network management' software does to "discover" the capabilities/functionality of other nodes in the network. There isn't a single one-size-fits-all solution; they use a bunch of different methods and heuristics to finally figure out what the other node is.
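As a concrete illustration of the port-scanning idea, a small sketch in C (non-blocking connect() with a one-second timeout; 192.0.2.1 is a documentation address standing in for the host you want to probe, and the port list is just an example):

    #include <arpa/inet.h>
    #include <fcntl.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Non-blocking connect with a 1-second timeout; returns 1 if the port
     * accepted the connection. */
    static int port_open(const char *ip, int port) {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        fcntl(s, F_SETFL, O_NONBLOCK);

        struct sockaddr_in dst;
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(port);
        inet_pton(AF_INET, ip, &dst.sin_addr);

        connect(s, (struct sockaddr *)&dst, sizeof(dst));

        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(s, &wfds);
        struct timeval tv = {1, 0};
        int ready = select(s + 1, NULL, &wfds, NULL, &tv);

        int err = 1;
        socklen_t len = sizeof(err);
        if (ready > 0)
            getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len);
        close(s);
        return ready > 0 && err == 0;
    }

    int main(void) {
        /* 192.0.2.1 is a placeholder; substitute the host you want to probe */
        int ports[] = {22, 23, 80, 443};     /* common management ports */
        for (size_t i = 0; i < sizeof(ports) / sizeof(ports[0]); i++)
            printf("port %d: %s\n", ports[i],
                   port_open("192.0.2.1", ports[i]) ? "open" : "closed/filtered");
        return 0;
    }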
Any node that traffic hops through, rather than just terminating at, is a router. However, this doesn't allow you to detect routers with no reachable devices hooked up. (Any input as to whether my answer has merit would be great!)

Pinging a computer through a specific route

I have a network of computers connected in the form of a graph.
I want to ping from one computer (A) to another computer (B). A and B are connected to each other in many different ways, but I want to ping via particular edges only. The information about which edges to follow during the ping is available at both A and B.
How should I do this?
You could source-route the ping, but the return traffic would choose its own path.
Furthermore, source-routed packets are often filtered due to security concerns. (Not always, they are useful and sometimes even required at edge routers.)
If the machines are under your local administrative control, then you could ensure that source-routed packets are permitted. As long as you are able to start a daemon on machine B, you could also easily enough design your own ping protocol that generates source-routed echo returns.
Well, this is actually handled by the routing protocols configured on the equipment between the computers (routers, I expect). I don't think there is a way to say "use that specific route" from the endpoint. The routers run various protocols (OSPF, EIGRP, RIPv2) and do the load balancing. The only way to be sure of one specific route is to use static routing, but that isn't something your computer decides dynamically.
This is normal: if you were able to choose a route yourself, it would be quite easy to mount a DoS attack and kill one specific route.

What was the motivation for adding the IPV6_V6ONLY flag?

In IPv6 networking, the IPV6_V6ONLY flag is used to ensure that a socket will only use IPv6, and in particular that IPv4-to-IPv6 mapping won't be used for that socket. On many OSes, IPV6_V6ONLY is not set by default, but on some (e.g. Windows 7) it is set by default.
My question is: what was the motivation for introducing this flag? Is there something about IPv4-to-IPv6 mapping that was causing problems, and thus people needed a way to disable it? It would seem to me that if someone didn't want to use IPv4-to-IPv6 mapping, they could simply not specify an IPv4-mapped IPv6 address. What am I missing here?
Not all IPv6-capable platforms support dual-stack sockets, so the question becomes: how do applications that need to maximize IPv6 compatibility either know that dual-stack is supported, or bind separately when it's not? The only universal answer is IPV6_V6ONLY.
An application that ignores IPV6_V6ONLY, or that was written before dual-stack IP stacks existed, may find that binding separately to IPv4 fails in a dual-stack environment, because the IPv6 dual-stack socket also binds IPv4 and prevents the IPv4 socket from binding. The application may also not be expecting IPv4 over IPv6 due to protocol- or application-level addressing concerns or IP access controls.
This or similar situations most likely prompted MS et al. to default to 1, even though RFC 3493 declares 0 to be the default; 1 theoretically maximizes backwards compatibility. Specifically, Windows XP/2003 does not support dual-stack sockets.
There is also no shortage of applications that unfortunately need to pass lower-layer information around to operate correctly, so this option can be quite useful for planning an IPv4/IPv6 compatibility strategy that best fits the requirements and existing codebases.
The reason most often mentioned is for the case where the server has some form of ACL (Access Control List). For instance, imagine a server with rules like:
Allow 192.0.2.4
Deny all
It runs on IPv4. Now someone runs it on a machine with IPv6 and, depending on some parameters, IPv4 requests are accepted on the IPv6 socket, appear as the mapped address ::ffff:192.0.2.4, and are no longer matched by the first ACL. Suddenly, access is denied.
Being explicit in your application (using IPV6_V6ONLY) would solve the problem, whatever default the operating system has.
I don't know why it would be the default, but it's the kind of flag I would always set explicitly, no matter what the default is.
As for why it exists in the first place, I guess it allows you to keep existing IPv4-only servers and just run new ones on the same port but only for IPv6 connections. Or maybe the new server can simply proxy clients to the old one, making IPv6 functionality easy and painless to add to old services.
For Linux, when writing a service that listens on both IPv4 and IPv6 sockets on the same service port, e.g. port 2001, you MUST call setsockopt(s, SOL_IPV6, IPV6_V6ONLY, &one, sizeof(one)); on the IPv6 socket. If you do not, the bind() operation for the IPv4 socket fails with "Address already in use".
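A minimal sketch of that dual-listener pattern (Linux; IPPROTO_IPV6 is the portable level constant and SOL_IPV6 also works there; port 2001 as in the example above, error handling omitted):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>

    int main(void) {
        int one = 1;

        /* IPv6 listener, restricted to IPv6 traffic only */
        int s6 = socket(AF_INET6, SOCK_STREAM, 0);
        setsockopt(s6, IPPROTO_IPV6, IPV6_V6ONLY, &one, sizeof(one));

        struct sockaddr_in6 a6;
        memset(&a6, 0, sizeof(a6));
        a6.sin6_family = AF_INET6;
        a6.sin6_addr = in6addr_any;
        a6.sin6_port = htons(2001);
        bind(s6, (struct sockaddr *)&a6, sizeof(a6));

        /* Separate IPv4 listener on the same port; without IPV6_V6ONLY
         * above, this bind() fails with "Address already in use" */
        int s4 = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in a4;
        memset(&a4, 0, sizeof(a4));
        a4.sin_family = AF_INET;
        a4.sin_addr.s_addr = htonl(INADDR_ANY);
        a4.sin_port = htons(2001);
        bind(s4, (struct sockaddr *)&a4, sizeof(a4));

        listen(s6, 16);
        listen(s4, 16);
        return 0;
    }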
There are plausible ways in which the (poorly named) "IPv4-mapped" addresses can be used to circumvent poorly configured systems or bad stacks, or that even on a well-configured system might just require onerous amounts of bug-proofing. A developer might wish to use this flag to make their application more secure by not using that part of the API.
See: http://ipv6samurais.com/ipv6samurais/openbsd-audit/draft-cmetz-v6ops-v4mapped-api-harmful-01.txt
Imagine a protocol that includes a network address in the conversation, e.g. the data channel for FTP. When using IPv6 you are going to send the IPv6 address; if the recipient happens to be an IPv4-mapped address, it will have no way of connecting to that address.
There's one very common example where the duality of behavior is a problem. The standard getaddrinfo() call with the AI_PASSIVE flag offers the possibility of passing a nodename parameter and returns a list of addresses to listen on. A special value, a NULL nodename, is accepted and implies listening on the wildcard addresses.
On some systems 0.0.0.0 and :: are returned in this order. When dual-stack sockets are enabled by default and you don't set IPV6_V6ONLY on the socket, the server binds to 0.0.0.0 and then fails to bind the dual-stack ::, and therefore (1) only works on IPv4 and (2) reports an error.
I would consider the order wrong as IPv6 is expected to be preferred. But even when you first attempt dual-stack :: and then IPv4-only 0.0.0.0, the server still reports an error for the second call.
I personally consider the whole idea of a dual-stack socket a mistake. In my project I would rather always explicitly set IPV6_V6ONLY to avoid that. Some people apparently saw it as a good idea but in that case I would probably explicitly unset IPV6_V6ONLY and translate NULL directly to 0.0.0.0 bypassing the getaddrinfo() mechanism.
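For completeness, a sketch of a wildcard-listening pattern that stays robust regardless of the order in which getaddrinfo() returns the results: bind each entry on its own socket and force IPV6_V6ONLY on the IPv6 one ("2001" is a placeholder service, error handling trimmed):

    #include <netdb.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        struct addrinfo hints, *res, *ai;
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;      /* both 0.0.0.0 and :: */
        hints.ai_socktype = SOCK_STREAM;
        hints.ai_flags = AI_PASSIVE;      /* NULL nodename => wildcard addresses */

        if (getaddrinfo(NULL, "2001", &hints, &res) != 0)
            return 1;

        for (ai = res; ai != NULL; ai = ai->ai_next) {
            int s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
            if (s < 0)
                continue;

            if (ai->ai_family == AF_INET6) {
                /* keep this listener IPv6-only so it cannot collide
                 * with the separate IPv4 wildcard bind */
                int one = 1;
                setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &one, sizeof(one));
            }

            if (bind(s, ai->ai_addr, ai->ai_addrlen) == 0)
                listen(s, 16);
            else
                close(s);
        }
        freeaddrinfo(res);
        return 0;
    }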

A parallel IP address space exclusively for a P2P network?

I would like to do this because it would make peer location much more effective in my p2p network as I would know that all the addresses would be part of this network.
How could I do this while remaining compatible with current transport layer protocols such as SCTP, and the current hardware used on the big wide Internet?
Thanks,
Andreas
I suggest using IPv6.
There is enough address space that you can create up to 2^40 "unique local" /48 ranges, each with 16 bits of subnet and 64 bits of host ID.
Protocols such as UDP, TCP, and SCTP already work on top of it.
It already has major operating system support.
See http://www.rfc-editor.org/rfc/rfc4193.txt
Densely filling the 40-bit unique-id is discouraged. Use the random generation method mentioned in the RFC.
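To make that concrete, a toy sketch of carving out a unique-local prefix: take 40 pseudo-random bits (RFC 4193 suggests mixing a timestamp with a machine identifier and hashing; plain rand() below is only a placeholder), prepend the fd00::/8 prefix, and you get a /48 to subnet however you like:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void) {
        srand((unsigned)time(NULL));          /* placeholder entropy source */

        /* 40-bit pseudo-random global ID, per RFC 4193 section 3.2 */
        uint64_t global_id = 0;
        for (int i = 0; i < 5; i++)
            global_id = (global_id << 8) | (rand() & 0xff);

        /* fd00::/8 prefix + 40-bit global ID = a /48 unique-local prefix */
        printf("fd%02x:%04x:%04x::/48\n",
               (unsigned)((global_id >> 32) & 0xff),
               (unsigned)((global_id >> 16) & 0xffff),
               (unsigned)(global_id & 0xffff));
        return 0;
    }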
Put simply, you can't. IPv4 addresses are distributed by IANA to the five major IP registries: ARIN (North America), RIPE (Europe), APNIC (Asia/Pacific), LACNIC (Latin America/Caribbean), and AfriNIC (Africa). These registries then distribute them to ISPs.
There are blocks reserved for local networks, but those are not routable over the public Internet... they must be encapsulated; this is how VPNs work.
The best way to have this kind of functionality is probably to use a name lookup service, or even a peer discovery service in the protocol itself.
The fact is, no matter what you do, you will likely have to make your application perform extra work on top of the IP protocol anyway: the IP protocol itself supports only one address space, so you need to add another layer to get an independent one.
It looks like you're trying to create a network inside of a P2P "world". So all the users using the P2P app would have a second IP address, say Alice has 10.0.2.40, that could be used by Bob, another user of the app, to get to Alice. Right?
With regard to that, it looks like you'd want to set up a VPN on each client and use some sort of route-table modification so the VPN is only used for the address space allocated by the P2P program (say the 10.x.x.x network).
But there are problems with that. For example, you'll never find an address space that everyone has free to use: home routers use 192.168.x.x, corporate networks and enthusiasts (like myself) use 10.x.x.x, and the 172.16.x.x-172.31.x.x range is used by other sysadmins for something, I'm sure.
Disclaimer: Not a networking genius, I'm speculating here.

Detect another host with the same MAC address

How can I detect if another host is using the same MAC address as the current host, e.g. because the other host is spoofing?
I'm working in an embedded environment, so looking for answers on a protocol level, rather than “use such and such a tool”.
Edit: RARP does not solve this problem. For RARP to get any reply at all, there has to be at least one host on the segment which supports RARP. Since RARP is obsolete, modern operating systems don't support it. Furthermore, all RARP can do is tell you your own IP address - the response won't be any different if there’s another host on the segment with the same MAC, unless that host has itself used a different IP address.
This question is too interesting to put down! After several false starts I started thinking about the essential components of the problem and scoured the RFCs for advice. I haven't found a definitive answer, but here's my thought process, in the hope that it helps:
The original question asks how to detect another device with your MAC address. Assuming you're on an IP network, what's required to accomplish this?
The passive method would be simply to listen to traffic and look for any packets that you didn't transmit but have your MAC address. This may or may not occur, so although it can tell you definitively if a duplicate exists, it cannot tell you definitively that it doesn't.
Any active method requires you to transmit a packet that forces an impostor to respond. This immediately eliminates any methods that depend on optional protocols.
If another device is spoofing you, it must (by definition) respond to packets with your MAC address as the destination. Otherwise it's snooping but not spoofing.
The solution should be independent of IP address and involve only the MAC address.
So the answer, it seems, would be to transmit either a broadcast (ethernet) packet or a packet with your MAC address as its destination, that requires a response. The monkeywrench is that an IP address is usually involved, and you don't know it.
What sort of protocol fits this description?
Easy Answer:
If your network supports BOOTP or DHCP, you're done, because this authoritatively binds a MAC address to an IP address. Send a BOOTP request, get an IP address, and try to talk to it. You may need to be creative to force the packet onto the wire and prevent yourself from responding (I'm thinking judicious use of iptables and NAT).
Not-so-easy Answers:
A protocol that's independent of IP: either one that doesn't use the IP layer, or one that allows broadcasts. None comes to mind.
Send any packet that would normally generate a response from you, prevent yourself from responding, and look for a response from another device. It would seem sensible to use your IP address as the destination, but I'm not convinced of that. Unfortunately, the details (and, therefore the answer) are left as an exercise for the OP ... but I hope the discussion was helpful.
I suspect the final solution will involve a combination of techniques, as no single approach seems to guarantee a dependable determination.
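To make the passive method above concrete, here is a hedged, Linux-only sketch (AF_PACKET raw socket, needs root/CAP_NET_RAW; the interface name and local MAC are placeholders) that flags any incoming frame whose source MAC is yours but which you did not transmit yourself:

    #include <arpa/inet.h>
    #include <linux/if_ether.h>
    #include <linux/if_packet.h>
    #include <net/if.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        /* placeholder values: your interface and its MAC address */
        const char *ifname = "eth0";
        const unsigned char my_mac[6] = {0x02, 0x00, 0x00, 0x00, 0x00, 0x01};

        int s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (s < 0)
            return 1;

        struct sockaddr_ll sll;
        memset(&sll, 0, sizeof(sll));
        sll.sll_family = AF_PACKET;
        sll.sll_protocol = htons(ETH_P_ALL);
        sll.sll_ifindex = if_nametoindex(ifname);
        bind(s, (struct sockaddr *)&sll, sizeof(sll));

        unsigned char frame[ETH_FRAME_LEN];
        for (;;) {
            struct sockaddr_ll from;
            socklen_t fromlen = sizeof(from);
            ssize_t n = recvfrom(s, frame, sizeof(frame), 0,
                                 (struct sockaddr *)&from, &fromlen);
            if (n < 14)
                continue;
            /* sll_pkttype distinguishes frames we sent (PACKET_OUTGOING)
             * from frames received off the wire */
            if (from.sll_pkttype == PACKET_OUTGOING)
                continue;
            /* source MAC sits at offset 6 of the Ethernet header */
            if (memcmp(frame + 6, my_mac, 6) == 0)
                printf("incoming frame claims our source MAC - possible duplicate\n");
        }
    }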
Some information is available at http://en.wikipedia.org/wiki/ARP_spoofing#Defenses
If all else fails, you may enjoy this: http://www.rfc-editor.org/rfc/rfc2321.txt
Please post a follow-up with your solution, as I'm sure it will be helpful to others. Good luck!
You could send an ARP request for each possible IP in the subnet.
Of course the source address of the ARP request must be ff:ff:ff:ff:ff:ff, otherwise you might not see the response.
I forged a packet like this with bittwiste and replayed it with PReplay and all the hosts on the network got the response. (I don't know if these forged ARP packets are legal or not... some OSes might ignore them)
(Screenshots of the forged ARP request and of the reply are omitted here.)
If you watch the responses and see your own MAC address in the sender field of one of the reply packets, then someone has the same MAC address as you do...
Unfortunately I couldn't test the theory fully, because none of my (Windows) machines would let me change the NIC's MAC address...
Two hosts using the same MAC address on a single network segment would probably make the switches go nuts, and you could probably detect it by having an extremely unreliable network connection (the switches would deliver some portion of the packets that belong to your host to the second one, depending on which of you sent the last frame in their direction).
This is very late, and a non-answer, but I wanted to follow up with exactly what I did in case anyone else is interested.
I was working with some very weird embedded hardware that doesn’t have a MAC address assigned at manufacture. That means we needed to assign one in software.
The obvious solution is to have the user pick a MAC address that they know is available on their network, preferably from the locally-administered range, and that’s what I did. However, I wanted to pick a reasonably safe default, and also attempt to warn the user if a conflict occurred.
In the end I resorted to picking a random-ish default in the locally-administered range, chosen by taking some hardware readings that have moderate entropy. I deliberately excluded the beginning and end of the range on the assumption that those are somewhat more likely to be chosen manually. The chances are that there will only be one of these devices on any given network, and certainly fewer than 20, so the chances of a conflict are very low, albeit not as low as they could be given the somewhat predictable random numbers.
Given the low chances of there being a problem, and despite the excellent answers above, I decided to dispense with the conflict detection and make do with a warning to the user to look out for MAC conflict problems.
If I did decide to implement conflict detection, then given that I control the whole network stack, I would probably look out for excessive unknown or missing packets, and then trigger a change of MAC address or warn the user when that happens.
Hopefully that will help someone else somewhere – but probably not!

Resources