There’s more than one way to write an IP address (ma.ttias.be)
140 points by Mojah on July 9, 2019 | 48 comments


The different notations tripped us up big time a while ago at my company. A class of our devices used to get their machine name set from the serial number in the BIOS. On the older models these started with letters. Then we got a new model and the serial numbers on these were just long numbers. So far so good.

After a while we noticed that a lot of them (around 30%) didn't work on the network. It took a while to figure out that some of the serial numbers got interpreted as octal IP addresses so pinging them didn't resolve through DNS but instead went straight to the (wrong) IP.

In hindsight this could have been predicted but it just shows that often a design that made sense when it was created can cause huge headaches later on. Now we prefix all serial numbers with some letters to make sure they don't get misinterpreted.
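
A rough sketch of the kind of guard this suggests, assuming the resolver follows the classic inet_aton rules (a part may be decimal, octal with a leading 0, or hex with 0x). The function names and the "dev-" prefix are made up for illustration, not what the company above actually uses:

```javascript
// Matches a label that an inet_aton-style parser could read as a number:
// plain digits (decimal or octal) or 0x-prefixed hex.
const NUMERIC_LABEL = /^(0[xX][0-9a-fA-F]+|[0-9]+)$/;

// Conservative check: if the last label of a name looks numeric, some
// resolvers may try the name as an address instead of asking DNS.
function mayBeReadAsAddress(name) {
  const labels = name.split(".");
  return NUMERIC_LABEL.test(labels[labels.length - 1]);
}

// Hypothetical naming helper: the "dev-" prefix is an assumption.
function hostnameForSerial(serial) {
  return "dev-" + serial;
}

console.log(mayBeReadAsAddress("1234567890"));                    // true
console.log(mayBeReadAsAddress(hostnameForSerial("1234567890"))); // false
```

The check is deliberately stricter than needed: it flags any name whose last label is numeric, even if a real parser would reject it as out of range.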


You should have been able to ping them using an FQDN but I can see how it would be confusing to have to do that.


Why not base64 them?

Do you actually use serial numbers anywhere else?

Why not generate a random name instead?


The serial numbers are on the back of the devices so it’s nice to be able to correlate them with the network name. As I mentioned, we now prefix them with a few letters to avoid being misinterpreted.


Base64 needs 64 unique characters, but domain names don't have 64 unique characters. Also base64 could result in pure numbers as its output, leading to the exact same problem.


I did telephone tech support in an Austin boiler room for a while and used the decimal IP notation I'd learned from my ISP days a lot. Sometimes accessing a customer's router home page is a must and there is no connectivity for name lookups. Elderly people have trouble with dots and numbers, or they add spaces, or any number of things.

Instead of directing them to 192.168.0.1 I'd tell them to enter 3232235521 and it would get to the same place. My supervisor called me into the office because someone heard me doing it while snooping a call and it freaked them out.


One character shorter, so that's somewhat helpful. However, wouldn't having to type the http:// prefix kill the simplification enough to make this not worth bothering with? Otherwise, both Chrome and Firefox will do a Google search for the decimal number.


You could just add a forward slash to the end of it instead of adding the protocol, i.e. 3232235521/


I discovered the hard way that a library I was using would parse an IP address in a URL like 192.168.000.077 as 192.168.0.63. Yeah... it treated leading zeros as octal in individual components of the IP. Lots of fun to track down.
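
One way to avoid this class of surprise is to refuse leading zeros outright instead of guessing the base. A minimal sketch, assuming you only want to accept plain dotted-decimal input (the function name is made up):

```javascript
// Strict dotted-quad parser: exactly four decimal parts, no leading zeros,
// each in 0..255. Returns an array of octets, or null if the input is
// anything else (including octal-looking parts such as "077").
function parseDottedQuadStrict(text) {
  const parts = text.split(".");
  if (parts.length !== 4) return null;
  const octets = [];
  for (const part of parts) {
    if (!/^(0|[1-9][0-9]{0,2})$/.test(part)) return null;
    const value = Number(part);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

console.log(parseDottedQuadStrict("192.168.0.63"));    // [ 192, 168, 0, 63 ]
console.log(parseDottedQuadStrict("192.168.000.077")); // null
console.log(parseInt("077", 8));                       // 63, the octal reading
```

Rejecting the input is a design choice: silently reading "077" as 77 would be just as much of a guess as reading it as 63.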


It’s always annoyed me that IPv4 addresses are valid DNS names. Unnecessary confusion and software workarounds to deal with it.


Back in the dark past of the late 1980s/early 1990s, "[x.x.x.x]" used to be a convention to force parsing as an IP address rather than a DNS name. I have no idea where this convention came from, or where it went to, but it seems to have completely vanished since.

It's possible that it was a UK-specific convention that was born from JANET but I don't know; I do know that PAD and CPAD addresses (from X.25) were more common on JANET than IP addresses, so that seems unlikely.


This is a great point. Square brackets are still used in IPv6 to more easily distinguish between the IP and the port, e.g. [::1]:52


They're not valid DNS names, they're IP addresses. When you subscribe to an Internet Service Provider you are actually subscribing to the network using the IP address space. DNS is an add-on that is overlaid on top of the IP-addressed network. If anything, allowing host resolvers besides DNS should be a more transparent process, like switching out your default search engine.


IPv4 addresses can be parsed as host names per RFC 1123 section 2.1; it is merely convention ("SHOULD") that dotted numbers are checked as IP addresses before being passed through DNS.

This is what the parent meant. IPv4 addresses are also syntactically valid host names.


> IPv4 addresses are also syntactically valid host names.

Curious...from the philosophical intent supporting the normative language[1] (my emphasis added):

However, a valid host name can never have the dotted-decimal form #.#.#.#, since at least the highest-level component label will be alphabetic.

[1] https://tools.ietf.org/html/rfc1123#page-13



I'm not sure how you would go about fixing that...?


At this point? It’s not feasible.


Not according to RFC 1035 they’re not


Yes, I misread that when implementing restrictions on DNS labels as well (answering with a different RFC from a sibling): "The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name." -- https://tools.ietf.org/html/rfc2181#section-11

Also, to get a full view of DNS, the HTML view of the RFC links to all the updates to DNS over the years: https://tools.ietf.org/html/rfc1035

Which is pretty handy.


Parsing IPv6 also requires some thought! Consider 0000:0000:0001:0002:0000:0000:dead:beef

- 0000:0000:1:2::dead:beef

- ::1:2:0000:0000:dead:beef

- ::1:2:0:0:dead:beef

- ::1:2:0:0:222.173.190.239

- 0:0:1:2::222.173.190.239
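
To check that these really are the same address, here is a small sketch that expands each form into eight 16-bit groups. It assumes well-formed input and does very little validation; `expandIPv6` is a made-up name, not a standard API:

```javascript
// Expand an IPv6 text form into an array of eight 16-bit numbers.
// Handles "::" compression and a trailing dotted-quad (IPv4) tail.
function expandIPv6(text) {
  let s = text.toLowerCase();
  // Turn a trailing a.b.c.d into two hex groups.
  const lastColon = s.lastIndexOf(":");
  const tail = s.slice(lastColon + 1);
  if (tail.includes(".")) {
    const [a, b, c, d] = tail.split(".").map(Number);
    s = s.slice(0, lastColon + 1) +
        ((a << 8) | b).toString(16) + ":" + ((c << 8) | d).toString(16);
  }
  const halves = s.split("::");
  if (halves.length > 2) throw new Error("more than one '::'");
  const head = halves[0] ? halves[0].split(":") : [];
  const rest = halves.length === 2 && halves[1] ? halves[1].split(":") : [];
  const missing = 8 - head.length - rest.length;
  const compressed = halves.length === 2;
  if (compressed ? missing < 1 : missing !== 0) {
    throw new Error("wrong number of groups");
  }
  // "::" stands for however many zero groups are needed to reach eight.
  const zeros = new Array(compressed ? missing : 0).fill("0");
  return [...head, ...zeros, ...rest].map((g) => parseInt(g, 16));
}

const full = (groups) =>
  groups.map((g) => g.toString(16).padStart(4, "0")).join(":");

console.log(full(expandIPv6("::1:2:0:0:222.173.190.239")));
// 0000:0000:0001:0002:0000:0000:dead:beef
```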


Digging more about IPv6 representation:

- It's possible to stick "::" in many places

- It's possible to have :0000: before or after "::"

- leading zeros (:0001: is the same as :1:)

- "Text Representation of Special Addresses", so the ::192.168.0.1 representation

- hex uppercase vs hex lowercase

see https://tools.ietf.org/html/rfc5952
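
As I read RFC 5952, the recommended form boils down to: lowercase hex, no leading zeros, and "::" replacing the longest run of two or more zero groups (the first run on a tie). A sketch of just those rules, taking eight numeric groups as input; it skips the RFC's special cases for embedded IPv4 addresses, and `canonicalIPv6` is a made-up name:

```javascript
// Format eight 16-bit groups following the main RFC 5952 rules.
function canonicalIPv6(groups) {
  // Find the longest run of zero groups; the first run wins ties.
  let bestStart = -1;
  let bestLen = 0;
  for (let i = 0; i < 8; ) {
    if (groups[i] !== 0) { i++; continue; }
    let j = i;
    while (j < 8 && groups[j] === 0) j++;
    if (j - i > bestLen) { bestStart = i; bestLen = j - i; }
    i = j;
  }
  // toString(16) gives lowercase hex without leading zeros.
  const hex = groups.map((g) => g.toString(16));
  // A single zero group is not compressed.
  if (bestLen < 2) return hex.join(":");
  const head = hex.slice(0, bestStart).join(":");
  const tail = hex.slice(bestStart + bestLen).join(":");
  return head + "::" + tail;
}

console.log(canonicalIPv6([0, 0, 1, 2, 0, 0, 0xdead, 0xbeef]));
// ::1:2:0:0:dead:beef
```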


Also due to the ambiguity of ports also using a colon delimiter, the IPv6 address may be in brackets:

    [::1:2:0:0:dead:beef]:443
And link-local addresses are mandatory and scoped per interface, so they need a zone id supplied as either an integer or interface name:

    fe80::1:2:0:0:dead:beef%eth0
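
A sketch of pulling those pieces apart, assuming the input is either bracketed, a bare IPv6 address, or a plain host:port pair. `splitHostPort` is a made-up helper, not a standard function:

```javascript
// Split "[addr%zone]:port", "[addr]", "host:port" or a bare host/address
// into { host, zone, port }. Missing pieces come back as null.
function splitHostPort(text) {
  let host = text;
  let port = null;
  if (text.startsWith("[")) {
    const close = text.indexOf("]");
    if (close === -1) throw new Error("missing ']'");
    host = text.slice(1, close);
    const after = text.slice(close + 1);
    if (after.startsWith(":")) port = Number(after.slice(1));
    else if (after !== "") throw new Error("unexpected text after ']'");
  } else if (text.split(":").length === 2) {
    // Exactly one colon: host:port. More colons means a bare IPv6 address.
    const pieces = text.split(":");
    host = pieces[0];
    port = Number(pieces[1]);
  }
  // A zone id (interface name or index) follows a '%' inside the address.
  let zone = null;
  const percent = host.indexOf("%");
  if (percent !== -1) {
    zone = host.slice(percent + 1);
    host = host.slice(0, percent);
  }
  return { host, zone, port };
}

console.log(splitHostPort("[::1:2:0:0:dead:beef]:443"));
// { host: '::1:2:0:0:dead:beef', zone: null, port: 443 }
console.log(splitHostPort("fe80::1:2:0:0:dead:beef%eth0"));
// { host: 'fe80::1:2:0:0:dead:beef', zone: 'eth0', port: null }
```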


That RFC says upper/lower case doesn't matter? Am I wrong?


That's the point: because it doesn't matter it provides two representations for the same semantic thing.


Well, that's just like IPv4, then. They can still be canonicalized, if you need to compare them.


That's why there are library routines for this, inet_pton and inet_ntop. No one should need to parse these addresses themselves.


I like the hex version better than the traditional version, although the traditional one is easily recognised as an IP address, even without context, so it's probably best to stick with that ;)


I believe you can also represent an ipv4 address in base 10, by doing something like

a.b.c.d, (a * 2^24) + (b * 2^16) + (c * 2^8) + (d * 2^0)

(the first time I did this was the first time I understood that raising a constant to the power zero gives one, so I left it there for myself)

The resultant number should be somewhere between 0 and (2^32)-1. It's a neat toy. I'm not sure what value it has in practice.

192 x 2^24 + 168 x 2^16 + 0 x 2^8 + 1 x 2^0 == 3232235521

    ping 3232235521
    PING 3232235521 (192.168.0.1): 56 data bytes
    Request timeout for icmp_seq 0
    ^C


I've used this a handful of times, actually!

I've mostly done it to determine a "next" IP address from a large range - i.e, what IP comes after 10.0.0.255, and how can I bound it between 10.0.0.30 and 10.0.1.128?

It's much easier to increment an integer, compare it to other integers, and then convert it back to a string-format IP than it is to implement those operations by "parsing" IP addresses.
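
Something along these lines, as a sketch (the helper names are made up, and it assumes plain dotted-decimal input):

```javascript
// Dotted-decimal string to number, using arithmetic rather than bit shifts
// so the result stays positive for addresses above 127.255.255.255.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// Number back to dotted-decimal.
function intToIp(n) {
  return [24, 16, 8, 0]
    .map((shift) => Math.floor(n / 2 ** shift) % 256)
    .join(".");
}

// The address after `ip`, or null if that falls outside [low, high].
function nextInRange(ip, low, high) {
  const next = ipToInt(ip) + 1;
  if (next < ipToInt(low) || next > ipToInt(high)) return null;
  return intToIp(next);
}

console.log(nextInRange("10.0.0.255", "10.0.0.30", "10.0.1.128")); // 10.0.1.0
console.log(nextInRange("10.0.1.128", "10.0.0.30", "10.0.1.128")); // null
```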


An IP address like 127.0.0.1 is an integer, written in an odd base 256 notation! What I think you mean is that it's easier to increment an integer written in conventional base 10, because people are used to working with numbers like that.


I'm imagining magmastonealex found it useful for things like working with each address in 10.0.0.0/16 without having to manage the digits of those base-256 numbers:

  for (var i = 0; i < 2 ** 16; i++) {
    work("10.0." + i);
  }
as opposed to

  for (var i = 0; i < 2 ** 16; i++) {
    work("10.0." + Math.floor(i / 256) + "." + i % 256);
  }


This is neat!

Here's a JS version:

  > const ip2dec = ip => ip.split(".").reverse().reduce((acc, n, i) => acc + (parseInt(n) * (2 ** (i * 8))), 0)
  > ip2dec("192.168.0.1")
  3232235521
  > ip2dec("192.168.1.1")
  3232235777
  > ip2dec("0xDE.0xAD.0xBE.0xEF")
  3735928559
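
A possible companion for going the other way, plus a caveat if you are tempted to use bit shifts instead of `2 **`: JS bitwise operators work on signed 32-bit values, so anything from 128.0.0.0 up comes out negative unless you finish with `>>> 0`. The names here are made up to match the style above:

```javascript
// Number to dotted-decimal. `>>>` treats its left operand as unsigned.
const dec2ip = (n) =>
  [n >>> 24, (n >>> 16) & 255, (n >>> 8) & 255, n & 255].join(".");

// Shift-based variant of ip2dec; `>>> 0` converts the signed result of
// `<<` and `|` back into an unsigned 32-bit number.
const ip2decBitwise = (ip) =>
  ip.split(".").reduce((acc, n) => ((acc << 8) | parseInt(n)) >>> 0, 0);

console.log(dec2ip(3232235521));           // 192.168.0.1
console.log(ip2decBitwise("192.168.0.1")); // 3232235521
console.log((192 << 24) | 1);              // negative without >>> 0
```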


There are even more ways to skin this cat. Octal notation works (prefixed with a 0), and each octet can be represented differently (0x7F.1 == 0x7F.0.0.1 == 127.0.0.1).
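
For the curious, a sketch of the classic inet_aton rules as I understand them: one to four parts, each decimal, octal (leading 0) or hex (0x), where every part but the last is one byte and the last part fills whatever bytes remain. This is an illustration, not a drop-in replacement for the C function:

```javascript
// Parse one part in C-style notation; NaN if it is not a number.
function parsePart(s) {
  if (/^0[xX][0-9a-fA-F]+$/.test(s)) return parseInt(s, 16);
  if (/^0[0-7]*$/.test(s)) return parseInt(s, 8);
  if (/^[1-9][0-9]*$/.test(s)) return parseInt(s, 10);
  return NaN;
}

// Returns the address as an unsigned 32-bit number, or null if invalid.
function inetAton(text) {
  const parts = text.split(".");
  if (parts.length > 4) return null;
  const values = parts.map(parsePart);
  if (values.some(Number.isNaN)) return null;
  const last = values.pop();
  // Leading parts are one byte each; the last part fills the rest.
  if (values.some((v) => v > 255)) return null;
  if (last >= 256 ** (4 - values.length)) return null;
  let result = last;
  values.forEach((v, i) => { result += v * 256 ** (3 - i); });
  return result;
}

function toDottedQuad(n) {
  return [24, 16, 8, 0]
    .map((shift) => Math.floor(n / 2 ** shift) % 256)
    .join(".");
}

console.log(toDottedQuad(inetAton("0x7F.1")));        // 127.0.0.1
console.log(toDottedQuad(inetAton("0300.0250.0.1"))); // 192.168.0.1
console.log(toDottedQuad(inetAton("192.168")));       // 192.0.0.168
```

The last line shows the "last part fills the rest" rule: with two parts, 168 is read as a 24-bit value, not as the second octet.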


The bit with `ping 0` is sadly unexplored, as the nature of the "any host in network" and "any host in any network" addresses goes unmentioned these days.

Then come the joys of debugging someone's configuration because a tutorial told them to "connect to 0.0.0.0" instead of localhost...


As a side note, why wouldn't 0.0.0.0 be a valid address if you have a route for it (just like you can set any public IP, and are not restricted to private ranges)?

I recall setting 192.168.0.0 as a valid address in windows XP a while ago.

I usually find people in the networking area (especially teachers) quite stubbornly assume that things are always one way, while almost the entirety of networking is comprised of RFCs.

Sure, default configurations help in that you don't have to change a device's configuration, but I see no reason why I couldn't change my MAC, or set my broadcast address to 10.0.0.42, or the broadcast MAC to 01:02:03:04:05:06, and my gateway to 10.0.0.0 while I enjoy browsing the web on my 0.0.0.0-addressed computer...


0.0.0.0 is a valid address. You also have a route for 0.0.0.0/0, usually called "default" :)

Just not valid for a host, as it is a network address where "network" is specified as the whole internet. It's how you create services listening to connections from any address - technically you can set up a listening socket waiting for connections only from a specific network, but I haven't tested it.

It's also why a minimal compliant IPv4 network has 4 IP addresses: the network address, host 1, host 2, broadcast. There's an unratified RFC for allowing only host 1 and host 2, but I believe it works only on some combinations of gear/software.

Then of course you get systems accepting broken setups, like 192.168.0.0 host address with /24 or /16 netmask.

However, 192.168.1.0/23 is a valid address a host can have.
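
To make the distinction concrete, here is a sketch that computes the network and broadcast addresses for an address/prefix pair. It assumes the conventional rule that the first and last address of a subnet are reserved and a prefix length of 30 or less; the function names are made up:

```javascript
const toInt = (ip) =>
  ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);

const toIp = (n) =>
  [24, 16, 8, 0].map((shift) => Math.floor(n / 2 ** shift) % 256).join(".");

// Describe "a.b.c.d/len": its network address, its broadcast address, and
// whether a.b.c.d itself is neither of the two.
function describe(cidr) {
  const [ip, bits] = cidr.split("/");
  const size = 2 ** (32 - Number(bits)); // number of addresses in the subnet
  const addr = toInt(ip);
  const network = Math.floor(addr / size) * size;
  const broadcast = network + size - 1;
  return {
    network: toIp(network),
    broadcast: toIp(broadcast),
    usableAsHost: addr !== network && addr !== broadcast,
  };
}

console.log(describe("192.168.1.0/23"));
// { network: '192.168.0.0', broadcast: '192.168.1.255', usableAsHost: true }
console.log(describe("192.168.0.0/24").usableAsHost); // false
```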


This was quite interesting thanks!

On a related note, does anyone have a good recommendation or blog post that explains how IP addressing, ports, subnetting, etc. all work?

This is one of these things where I know that I'm just working off of empirical knowledge without knowing the fundamentals.



What's the explanation for why Linux treats 0.0.0.0 as 127.0.0.1?


When I bind a socket to 0.0.0.0 it means to all interfaces on the host machine, of which 127.0.0.1 is one of those interfaces. Not sure if that answers your question.


If you ping 0.0.0.0 on Linux it pings 127.0.0.1. That's the behavior I'm talking about. Binding to 0.0.0.0 is something completely different and not Linux-specific.


I see ping does

connect(5, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("0.0.0.0")}, 16) = 0

and then does

getsockname(5, {sa_family=AF_INET, sin_port=htons(56630), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0

I don't see the reason in the documentation of either syscall.

I was thinking it might be due to its subnet being the biggest one commonly has (/8), but I just set up a (/1), and it still used 127.0.0.1. Even if I set the lo interface down, it still tries 127.0.0.1 and just hangs until I bring it back up. I removed the 127.0.0.1/24 address from lo, and gave lo the address 128.0.0.1/1 with the same options 127.0.0.1/8 had, and that caused `ping 0` to return the error, "connect: Invalid argument". So, I don't know. At least I learned that the behavior seems to really be tied to the address and not the loopback interface, which I thought was supposed to abstract the address.


https://unix.stackexchange.com/questions/99336/how-does-ping...

Because it's special cased in ping. Because that's what a relevant RFC says to do.


Looking at the RFC it seems to say that { 0, 0 } means "this host on the network" and must not be sent except as a source address during initialization in order for the host to learn its own IP address. My interpretation is not that this means trying to ping 0.0.0.0 is supposed to be equivalent to 127.0.0.1, but just that you can't actually set your destination address to 0.0.0.0 when sending IP packets. Linux ping's interpretation of this as meaning to fall back to 127.0.0.1 when trying to ping 0.0.0.0 doesn't seem unreasonable, but nor does it seem to be mandated by the RFC.



Wait, so does this mean 192.168/16 means 192.0.0.168/16 rather than 192.168.0.0/16?


Octal notation. pshyeah



