Comparing IPv6 addresses

Does anybody have a good idea for how to compare two IPv6 addresses? It looks like the shortening rules make it complicated.
For instance, take the full address
1234:0db8:0000:0000:0000:ff00:ff00:0011
Leading zeros can be removed => 1234:db8:0:0:0:ff00:ff00:11
One run of all-zero groups can be removed => 1234:0db8::ff00:ff00:0011
The last 32 bits can be written as an old-fashioned IPv4 address => 1234:0db8::ff00:255.0.0.17

You can use the standard library function socket.inet_pton to convert the addresses into byte strings for comparison:
>>> import socket
>>> socket.inet_pton(socket.AF_INET6, '1234:0db8::ff00:ff00:0011')
b'\x124\r\xb8\x00\x00\x00\x00\x00\x00\xff\x00\xff\x00\x00\x11'
>>> socket.inet_pton(socket.AF_INET6, '1234:0db8:0000:0000:0000:ff00:ff00:0011')
b'\x124\r\xb8\x00\x00\x00\x00\x00\x00\xff\x00\xff\x00\x00\x11'
This reduces the risk of introducing your own IPv6 parsing bug.
The example above is in Python (the b prefix is how Python 3 prints byte strings), but the inet_pton function is available on many platforms and in many languages:
http://msdn.microsoft.com/en-us/library/windows/desktop/cc805844(v=vs.85).aspx
http://man7.org/linux/man-pages/man3/inet_pton.3.html
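If you are in Python anyway, the standard ipaddress module does the same normalisation and gives you comparable objects. A minimal sketch, assuming the three spellings below are meant to be the same address (the variable names are only for illustration):

```python
import ipaddress

# Three spellings of the same 128-bit address.
full = ipaddress.ip_address("1234:0db8:0000:0000:0000:ff00:ff00:0011")
short = ipaddress.ip_address("1234:db8::ff00:ff00:11")
mixed = ipaddress.ip_address("1234:db8::ff00:255.0.0.17")  # last 32 bits in IPv4 notation

print(full == short == mixed)        # True
print(full.compressed)               # 1234:db8::ff00:ff00:11
print(full.exploded)                 # 1234:0db8:0000:0000:0000:ff00:ff00:0011
print(full.packed == short.packed)   # True: the same 16 bytes in network order
```

The objects also support ordering (<, >), so they can be sorted or used as dictionary keys without any string handling.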

You could just split the address at the colons and then compare the groups one by one.
If you encounter an empty field (the "::") -> insert as many '0000' groups as are needed to bring the address up to eight groups.
If you encounter a field with fewer than 4 digits -> pad it on the left with zeroes.
Additionally, you could give each of the fields a weight to emphasize the values of the fields.
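A rough sketch of that manual approach in Python, in case it helps. The function name expand_ipv6 is made up for this example, and it deliberately ignores the embedded-IPv4 form (1234:db8::ff00:255.0.0.17), so treat it as an illustration rather than a full parser:

```python
import string

def expand_ipv6(text):
    """Return the eight zero-padded, lower-case groups of a hex-only IPv6 address."""
    if text.count("::") > 1:
        raise ValueError("'::' may appear only once")
    if "::" in text:
        head, tail = text.split("::")
        head_groups = head.split(":") if head else []
        tail_groups = tail.split(":") if tail else []
        missing = 8 - len(head_groups) - len(tail_groups)
        if missing < 1:
            raise ValueError("too many groups")
        # The empty field stands for however many zero groups are missing.
        groups = head_groups + ["0"] * missing + tail_groups
    else:
        groups = text.split(":")
    if len(groups) != 8:
        raise ValueError("expected eight groups")
    for group in groups:
        if not 1 <= len(group) <= 4 or any(c not in string.hexdigits for c in group):
            raise ValueError("bad group: %r" % group)
    return [group.lower().zfill(4) for group in groups]

a = expand_ipv6("1234:0db8:0000:0000:0000:ff00:ff00:0011")
b = expand_ipv6("1234:db8::ff00:ff00:11")
print(a == b)  # True
```

Comparing the two lists group by group is then an ordinary list comparison.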

Related

How to write in network address/netmask the following address space?

I'd like to ask how to write in network address/netmask the following address space:
63.39.191.192 - 63.40.192.223
On paper, I couldn't figure any way of doing it, so I tried using a network address calculator to figure it out.
I entered the first IP address and started toying with the netmask.
What I couldn't understand is how the first and last usable address varied based on the netmask.
So, here I am, hoping that you might explain to me how the first and last IP address are determined based on the netmask and how to solve that problem.
There are two things that someone might mean by an address/netmask pair. One option is something that looks like 192.168.0.1/24. This means that the first 24 bits of an acceptable address must match the given address. This is a common way of expressing subnets; however, it is not possible to express your range like this. This means that you will not be able to work out a solution in the calculator you linked, which uses this method as input.
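To see that no single prefix covers the range, here is a small Python check using the standard ipaddress module. It asks for the smallest list of prefix-style networks that covers exactly the range from the question; the variable names are only for illustration:

```python
import ipaddress

first = ipaddress.IPv4Address("63.39.191.192")
last = ipaddress.IPv4Address("63.40.192.223")

# summarize_address_range yields the prefix networks that together
# cover exactly first..last, with no extra addresses.
blocks = list(ipaddress.summarize_address_range(first, last))
for block in blocks:
    print(block)

print(len(blocks) > 1)  # True: the range needs several address/prefix pairs
```

So the range can be written as a list of address/prefix pairs, just not as one.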
The other way is as a pair of dotted quads. The subnet above would be expressed like this: 192.168.0.1/255.255.255.0. Everything which can be expressed in the first way can be expressed in the second way, but the converse is not true.
To understand how to solve your problem using the second format, you have to know something about binary numbers. Each part of the dotted quad is a number 0-255 and can be expressed as a binary number with eight digits (bits). Thus the whole address is a binary number made up of 32 bits, each of which is either 0 or 1.
A network specification is an address, followed by another 32-bit number, expressed as an address. What the second number means is this: each place in that number where the digit is 1, the first address has to match on that digit. Each place where the digit in the netmask is 0, no match is needed. So you see how matching the first 24 bits is the same as matching 255.255.255.0, which is a 32-bit number made up of 24 1's followed by 8 0's.
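In code, the matching rule is a bitwise AND on both sides. A minimal Python sketch (the helper names to_int and matches are made up for this example):

```python
import ipaddress

def to_int(dotted):
    """Turn a dotted quad into its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))

def matches(address, network, mask):
    """True if address agrees with network on every bit where mask has a 1."""
    return (to_int(address) & to_int(mask)) == (to_int(network) & to_int(mask))

print(matches("192.168.0.77", "192.168.0.1", "255.255.255.0"))  # True
print(matches("192.168.1.77", "192.168.0.1", "255.255.255.0"))  # False

# A mask that is not a run of 1's followed by 0's: the third byte is ignored.
print(matches("192.168.5.1", "192.168.0.1", "255.255.0.255"))   # True
```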
You can also see how some netmasks can't be expressed in the first type. Any netmask which isn't one string of repeated 1's followed by the rest 0's, can't be written like this. The reason for the first type is that most real-world networks do have netmasks of this form.
To construct a netmask of the second type, you can work with one byte at a time. The first byte of the address has to match exactly 63. So the address will be 63.x.x.x and the mask will be 255.x.x.x. As before, 255, made up of all 1's, means match every bit. The second byte can be either 39 (00100111 in binary) or 40 (00101000). This pair can't be expressed as any number plus a set of bits to match. Only the first four bits of the two numbers match, but if we try to do something like 63.39.x.x/255.240.x.x (240 is 11110000), we will match any second byte from 32 to 47. You should check your previous question to see if this is right; however, you should hopefully be able to figure some more out if you understand binary.
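The claim about the second byte can be checked by brute force. A short Python sketch (matched_bytes is a made-up helper): it lists the byte values that a value/mask pair accepts, then searches for a pair that accepts exactly 39 and 40.

```python
def matched_bytes(value, mask):
    """All byte values that agree with value on the 1-bits of mask."""
    return [b for b in range(256) if (b & mask) == (value & mask)]

# Matching only the first four bits (mask 11110000 = 240) is too wide.
wide = matched_bytes(39, 0b11110000)
print(wide[0], wide[-1], len(wide))  # 32 47 16

# Any pair that accepts exactly {39, 40} must accept its own value,
# so it is enough to try 39 and 40 as the value with every mask.
exact = [(v, m) for v in (39, 40) for m in range(256)
         if matched_bytes(v, m) == [39, 40]]
print(exact)  # []
```

The search comes back empty because 39 and 40 differ in four bit positions, while a mask with a single "don't care" bit only pairs up values that differ in one.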
If you're not completely sure how binary works, please go away and make sure you really get it before looking into netmasks further. It really will help and it's a very good thing to know about anyway.

IP address long format: Who invented it? Is it internationally standardised?

According to this question IP address representation in the Maxmind Geolitecity database
What is the official name for this format of an ip-address?
Who invented it? Is it internationally standardised?
I mean the format without dots. Never heard of it. I know about the binary octets separated by dots.
It is called a decimal representation. An IP address is a 32-bit binary number which we usually represent in the dot-decimal notation you mentioned. Since it is originally a 32-bit number, it is easier (and faster) to store and manipulate it in its native format (as a number) than to parse the string representation.
For example, take the address 10.0.0.1, whose octets in binary look like:
00001010.00000000.00000000.00000001
When you lose the dots and look at it as a single number you get 167772161, which is what the ip2long() function will return if you pass 10.0.0.1 as an argument.
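The arithmetic behind that number is just four shifts. A small Python illustration (ip_to_int is a made-up name, not a library function), cross-checked against the standard ipaddress module:

```python
import ipaddress

def ip_to_int(dotted):
    """Combine the four octets into one 32-bit integer, first octet highest."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

print(ip_to_int("10.0.0.1"))                   # 167772161
print(int(ipaddress.IPv4Address("10.0.0.1")))  # 167772161
print(ipaddress.IPv4Address(167772161))        # 10.0.0.1
```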

IPv6 zero compression

When using zero-compression on the following IPv6 address
2001:0DB8:0000:CD30:0000:0000:0000:0000/60
Why is this not correct:
2001:DB8::CD30::/60
... while this is:
2001:DB8:0:CD30::/60
Zero compression can only be applied once, because otherwise the IPv6 address would no longer be unique.
Take your example: would 2001:DB8::CD30::/60 expand to
2001:0DB8:0000:0000:0000:CD30:0000:0000/60
or
2001:0DB8:0000:0000:CD30:0000:0000:0000/60
or
2001:0DB8:0000:CD30:0000:0000:0000:0000/60
...?
If only one "::" is used, the result will always be unique, as there is only one possible number of zero groups that can be inserted.
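You can watch a parser enforce this. A quick Python check with the standard ipaddress module (the rejected flag exists only so the result can be printed):

```python
import ipaddress

# The valid form: the parser prints it with a single "::".
net = ipaddress.ip_network("2001:0DB8:0000:CD30:0000:0000:0000:0000/60")
print(net.compressed)  # 2001:db8:0:cd30::/60

# The form with two "::" is refused outright.
rejected = False
try:
    ipaddress.ip_address("2001:DB8::CD30::")
except ValueError as err:
    rejected = True
    print("rejected:", err)
```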
Because it is ambiguous.
The address 2001:DB8::CD30:: could be expanded in any of the following possibilities:
2001:DB8:0:CD30:0:0:0:0
2001:DB8:0:0:CD30:0:0:0
2001:DB8:0:0:0:CD30:0:0
2001:DB8:0:0:0:0:CD30:0
The reason is that :: is used to shorten a run of multiple all-zero 16-bit fields.
In your example 2001:0DB8:0000:CD30:0000:0000:0000:0000/60, the only run of multiple zero fields is at the end; the 0000 in 2001:0DB8:0000:CD30: is a single 16-bit field, so you'd just shorten it to 0.
A more interesting question: how would you shorten this: 2001:0000:0000:CD30:0000:0000:0000:0000/60?
It is defined in the standard:
In addition, Section 2.2 of [RFC4291] notes,
'The "::" can only appear once in an address.'
This means that the address can be written as either:
2001:0:0:CD30::/60 OR 2001::CD30:0:0:0:0/60.
Both are valid, but I'd prefer the first representation, since the purpose of zero compression is to shorten the address and the first representation is shorter.
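For what it's worth, Python's ipaddress module makes the same choice: it accepts either spelling as input and compresses the longest run of zero groups on output. A small check (variable names are only for illustration):

```python
import ipaddress

first = ipaddress.ip_address("2001:0:0:CD30::")
second = ipaddress.ip_address("2001::CD30:0:0:0:0")

print(first == second)    # True: both spellings are the same address
print(second.compressed)  # 2001:0:0:cd30::
```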

Is 192.056.2.01 a valid representation of an v4 ip?

I'm writing some code to convert an IPv4 address stored in a string to a custom data type (a class with 4 integers in this case).
I was wondering if I should accept IPs like the one in the title or only IPs with no leading zeros. Let's see it with an example.
These two IPs mean the same to us (humans), and for example the Windows network configuration accepts them:
192.56.2.1 and 192.056.2.01
But I was wondering if the second one is actually correct or not.
I mean, according to the RFC, is the second IP valid?
Thanks in advance.
Be careful: inet_addr(3) is one of Unix's standard APIs for translating a textual representation of an IPv4 address into an internal representation, and it interprets 056 as an octal number:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/inet_addr.html
All numbers supplied as parts in IPv4 dotted decimal notation may be decimal, octal, or hexadecimal, as specified in the ISO C standard (that is, a leading 0x or 0X implies hexadecimal; otherwise, a leading '0' implies octal; otherwise, the number is interpreted as decimal).
Its younger sibling getaddrinfo(3) accepts the same notation, whereas inet_pton(3) (specified on the same page as inet_ntop(3)) is defined to accept only decimal parts:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/inet_ntop.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html
Summary
Although textual representations of IP addresses like 192.056.2.01 might be accepted on all platforms, different OSes interpret them differently.
This would be enough reason for me to avoid such a way of textual representation.
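To make the difference concrete, here is a Python sketch that applies the C-style radix rule quoted above to each part. parse_c_style is a made-up helper that mimics the rule; it does not call inet_addr itself:

```python
def parse_c_style(part):
    """Interpret one part the way the quoted rule describes."""
    if part.lower().startswith("0x"):
        return int(part, 16)   # leading 0x or 0X: hexadecimal
    if len(part) > 1 and part.startswith("0"):
        return int(part, 8)    # leading 0: octal
    return int(part, 10)       # otherwise: decimal

octal_reading = [parse_c_style(p) for p in "192.056.2.01".split(".")]
decimal_reading = [int(p, 10) for p in "192.056.2.01".split(".")]

print(octal_reading)    # [192, 46, 2, 1]
print(decimal_reading)  # [192, 56, 2, 1]
```

The same string yields 192.46.2.1 under one reading and 192.56.2.1 under the other.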
Pros
In decimal notation 056 is equal to 56, so why not?
Cons
The 0XX format is commonly used for octal notation.
Whatever you decide, just put it in your documentation and it will be OK :)
Whether it is correct or not depends on your implementation.
As you mentioned, Windows considers it correct because it strips any leading zeros when it resolves the IP.
So if your program applies equivalent logic, e.g. storing each part of the IP in your 4-integer class without the leading zeros, it will be correct for your case too.
Textual Representation of IPv4 and IPv6 Addresses is an “Internet-Draft”,
which, I guess, is like an RFC wanna-be. 
(Also, it expired a decade ago, on 2005-08-23,
and, apparently, has not been reissued,
so it’s not even close to being official.) 
Anyway, in Section 2: History it says,
The original IPv4 “dotted octet” format was never fully defined in any RFC,
so it is necessary to look at usage,
rather than merely find an authoritative definition,
to determine what the effective syntax was. 
The first mention of dotted octets in the RFC series is …
four dot-separated parts, each of which consists of
“three digits representing an integer value in the range 0 through 255”.
A few months later, [[IPV4-NUMB][3]] …
used dotted decimal format, zero-filling each encoded octet to three digits.
                ⋮
Meanwhile,
a very popular implementation of IP networking went off in its own direction. 
4.2BSD introduced a function inet_aton(), …
[which] allowed octal and hexadecimal in addition to decimal,
distinguishing these radices by using the C language syntax
involving a prefix “0” or “0x”, and allowed the numbers to be arbitrarily long.
The 4.2BSD inet_aton() has been widely copied and imitated,
and so is a de facto standard
for the textual representation of IPv4 addresses. 
Nevertheless, these alternative syntaxes have now fallen out of use …
[and] All the forms except for decimal octets are seen as non-standard
(despite being quite widely interoperable) and undesirable.
So, even though [POSIX defines the behavior of inet_addr][4]
to interpret leading zero as octal (and leading “0x” as hex),
it may be safest to avoid it.
P.S. [RFC 790][3] has been obsoleted by [RFC 1700][5],
which uses decimal numbers of one, two, or three digits,
without leading zeroes.
[3]: https://www.rfc-editor.org/rfc/rfc790 "the "Assigned Numbers" RFC"
[4]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/inet_addr.html
[5]: https://www.rfc-editor.org/rfc/rfc1700

What is the name for encoding/encrypting with noise padding?

I want code to render n bits with n + x bits, non-sequentially. I'd Google it but my Google-fu isn't working because I don't know the term for it.
For example, the input value in the first column (2 bits) might be encoded as any of the output values in the comma-delimited second column (4 bits) below:
0 1,2,7,9
1 3,8,12,13
2 0,4,6,11
3 5,10,14,15
My goal is to take a list of integer IDs, and transform them in a way they can still be used for persistent URLs, but that can't be iterated/enumerated sequentially, and where a client cannot determine programmatically if a URL in a search result set has been visited previously without visiting it again.
I would term this process "encoding". You'll see something similar done to permit the use of communications channels that have special symbols that are not permitted in data. Examples: uuencoding and base64 encoding.
That said, you still need to ensure (and at first blush you appear to have ensured) that there is only one correct decoding, and to accept the increase in size of the output (in the case above, the output will be double the size of the input, bit for bit).
I think you'd be better off encrypting the number with a cheap cypher + a constant secret key stored on your server(s), adding a random character or four at the end, and a cheap checksum, and simply rejecting any responses that don't have a valid checksum.
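The table from the question can be used directly. A minimal Python sketch (CODES, DECODE, encode and decode are names chosen for this example): each 2-bit input picks one of its four 4-bit codes at random, and the reverse lookup has exactly one answer per code.

```python
import secrets

# The mapping from the question: 2-bit value -> its possible 4-bit codes.
CODES = {
    0: [1, 2, 7, 9],
    1: [3, 8, 12, 13],
    2: [0, 4, 6, 11],
    3: [5, 10, 14, 15],
}

# Reverse table; decoding is unambiguous only if no code appears twice.
DECODE = {code: value for value, codes in CODES.items() for code in codes}

def encode(value):
    """Pick one of the codes for value at random."""
    return secrets.choice(CODES[value])

def decode(code):
    """Map a 4-bit code back to its 2-bit value."""
    return DECODE[code]

print(len(DECODE) == 16)                               # True
print(all(decode(encode(v)) == v for v in range(4)))   # True
```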
<encrypt(secret)>
<integer>+<random nonsense>
</encrypt>
+
<checksum()>
<integer>+<random nonsense>
</checksum>
Then decrypt the first part (remember, cheap == fast), validate the decrypted value using the checksum, throw away the random nonsense, and use the integer you stored.
There are probably some cryptographic no-no's here, but let's face it, the cost of this algorithm being broken is a touch on the low side.
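Here is one way the scheme could look in Python using only the standard library. Everything specific in it is my assumption, not part of the answer above: the "cheap cypher" is stood in for by a 4-round Feistel network with HMAC-SHA256 as the round function, the "cheap checksum" by a truncated HMAC, IDs are assumed to fit in 32 bits, and SECRET_KEY is a placeholder. It is a sketch of the idea, not reviewed cryptography.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"hypothetical-server-secret"  # assumption: constant, kept on the server
ROUNDS = 4
HEX = "0123456789abcdef"

def _keyed(data, size):
    """First `size` bytes of HMAC-SHA256(SECRET_KEY, data), as an integer."""
    digest = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return int.from_bytes(digest[:size], "big")

def _feistel(block, round_order):
    """Feistel network on a 64-bit block; reversing round_order inverts it."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for r in round_order:
        left, right = right, left ^ _keyed(bytes([r]) + right.to_bytes(4, "big"), 4)
    return (right << 32) | left

def _checksum(plain):
    """8 hex digits computed over integer + random nonsense."""
    return "%08x" % _keyed(b"check" + plain.to_bytes(8, "big"), 4)

def encode_id(record_id):
    if not 0 <= record_id < 2**32:
        raise ValueError("ID must fit in 32 bits")
    plain = (record_id << 32) | secrets.randbits(32)  # integer + random nonsense
    return "%016x%s" % (_feistel(plain, range(ROUNDS)), _checksum(plain))

def decode_id(token):
    """Return the ID, or None if the token is malformed or fails the checksum."""
    if len(token) != 24 or any(c not in HEX for c in token):
        return None
    plain = _feistel(int(token[:16], 16), reversed(range(ROUNDS)))
    if _checksum(plain) != token[16:]:
        return None
    return plain >> 32  # throw away the random nonsense

token = encode_id(42)
print(decode_id(token))  # 42
```

Because of the random 32 bits, encoding the same ID twice gives different tokens, and a made-up token is very unlikely to pass the checksum.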

Resources