Only Ones And Zeros

Welcome to my second 'Under The Hood' post! Following on from 'Getting to Google, Part 1', we are now going to see what happens after we have an IP address. Actually, that's a little bit of a fib. You can safely skip to Part 3 to see what usually happens - this post is actually a quick aside, and notes a couple of interesting things!

The packet capture below includes ARP again, and a quick review of one of the basic tasks somebody will carry out if they believe there are connectivity issues (you may have heard many people say 'can you ping X?', and this is covered below).

"ARP! ARP! ARP!"
Packets 1 and 2 in the capture will look very familiar. If you look at packets 9 and 10 in 'Getting to Google, Part 1', you will see that they are very close to identical. Given the purpose of ARP, and the fact that these packets were captured very shortly after the packets seen in that post, why does my virtual host need to send out the same ARP request again? Surely it already has the MAC address for the default gateway! The answer lies in Microsoft's Knowledge Base article 949589, which advises:

"ARP caching behavior has been changed in Windows Vista. The TCP/IP stack implementations in Windows Vista comply with RFC4861 (Neighbor Discovery protocol for IP version 6 [Ipv6]) for both the IPv4 and the IPv6 Neighbor Discovery process."
The ARP cache is essentially a table that records the information learned from previous ARP replies. To see the current ARP cache contents, you can simply open a command prompt and enter 'arp -a'. A quick example is below, and you can see that we have the entry for 192.168.126.2.
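For anyone who would rather read the cache from a script, here is a minimal sketch of parsing that output. The sample text is only an illustration of the usual Windows 'arp -a' layout (the interface index in particular is made up), and parse_arp_table is a hypothetical helper name, not part of any tool.

```python
import re

# Illustrative sample only: the usual layout of Windows `arp -a` output.
# The interface index (0xb) is invented for the example.
SAMPLE_ARP_OUTPUT = """
Interface: 192.168.126.129 --- 0xb
  Internet Address      Physical Address      Type
  192.168.126.2         00-50-56-f6-72-76     dynamic
  192.168.126.255       ff-ff-ff-ff-ff-ff     static
"""

# One cache entry per line: IPv4 address, dash-separated MAC, entry type.
ENTRY = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9a-fA-F]{2}-){5}[0-9a-fA-F]{2})\s+(\w+)\s*$"
)


def parse_arp_table(text):
    """Return {ip: (mac, entry_type)} for every cache entry found in text."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            ip, mac, entry_type = match.groups()
            table[ip] = (mac.replace("-", ":").lower(), entry_type)
    return table


cache = parse_arp_table(SAMPLE_ARP_OUTPUT)
print(cache.get("192.168.126.2"))
```

Header lines and the 'Interface:' line simply fail to match the pattern, so only real entries end up in the table.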

Just two seconds... that KB article is for Vista. My virtual host is Windows 7... As we can see below, the article does appear to apply to Windows 7 also! The Base Reachable Time for the interface in question is indeed 30,000 ms, as per the Knowledge Base article. Actually, checking further down in the article also confirms that this applies to Windows 7.
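To put a number on how long an entry stays 'reachable', here is a rough sketch of the calculation described in RFC 4861: the actual reachable time is a random value between 0.5 and 1.5 times the Base Reachable Time. The function names are my own, and this only models the arithmetic, not what the Windows stack does internally.

```python
import random

BASE_REACHABLE_TIME_MS = 30_000   # value seen on the interface above
MIN_RANDOM_FACTOR = 0.5           # constants defined in RFC 4861
MAX_RANDOM_FACTOR = 1.5


def reachable_time_bounds(base_ms=BASE_REACHABLE_TIME_MS):
    """Lowest and highest reachable time (in ms) for a given base value."""
    return base_ms * MIN_RANDOM_FACTOR, base_ms * MAX_RANDOM_FACTOR


def pick_reachable_time(base_ms=BASE_REACHABLE_TIME_MS, rng=random):
    """Pick one reachable time uniformly between the two bounds."""
    low, high = reachable_time_bounds(base_ms)
    return rng.uniform(low, high)


low, high = reachable_time_bounds()
print(f"An entry stays reachable for between {low / 1000:.0f} and {high / 1000:.0f} seconds")
```

With a base of 30,000 ms that works out at somewhere between 15 and 45 seconds, which fits with a fresh ARP request appearing so soon after the last one.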

Can I ping it? YES YOU CAN! Hopefully...
So... you may have heard the standard IT troubleshooting phrase 'can you ping xxxxx?'. If network connectivity issues are present, this is a simple tool that uses the ICMP protocol to check connectivity to another specific host. In the following screenshot, you can see the use of the ping command.

From the Windows 7 host, we try to ping the default gateway IP (although note that the target IP does not need to be 'local', i.e. on the same network). An ICMP 'echo request' is sent from the virtual host to the default gateway. This is essentially my virtual host saying 'Hello, are you there?'. The default gateway receives the echo request, and responds with an ICMP 'echo reply', confirming that it is there. Four pings, or echo requests, are sent by default, because in certain situations, such as intermittent connectivity, not all of them get a response.

It should be noted that sometimes, even when there are no connectivity issues, pings may not work, as they can be dropped by security applications on the host (trying to hide it) or by any of the devices, or 'hops', in between the source and destination.

Looking at the ICMP packets in more detail, you can see that the ICMP echo request has an ICMP 'type' of 8, which identifies this packet as an echo request packet. The echo reply packet has an ICMP 'type' of 0, which identifies this packet as an echo reply. You can also see in the main packet capture screenshot at the top of this post that the sequence numbers increment. This is so each echo reply can be matched up with its respective echo request.
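As a rough illustration of that matching, the sketch below packs and unpacks the 8-byte ICMP echo header (type, code, checksum, identifier, sequence number) and works out which requests never got a reply. The identifier and sequence numbers are made up, and the checksum is left at zero to keep things short.

```python
import struct

ICMP_ECHO_REPLY = 0
ICMP_ECHO_REQUEST = 8

# type, code, checksum, identifier, sequence number (network byte order)
ECHO_HEADER = struct.Struct("!BBHHH")


def build_echo(icmp_type, identifier, sequence):
    # Checksum is left as 0 to keep the sketch short; a real packet needs one.
    return ECHO_HEADER.pack(icmp_type, 0, 0, identifier, sequence)


def parse_echo(data):
    icmp_type, code, _checksum, identifier, sequence = ECHO_HEADER.unpack_from(data)
    return icmp_type, code, identifier, sequence


def unanswered(requests, replies):
    """Return the sequence numbers of echo requests that got no echo reply."""
    answered = set()
    for raw in replies:
        icmp_type, _code, identifier, sequence = parse_echo(raw)
        if icmp_type == ICMP_ECHO_REPLY:
            answered.add((identifier, sequence))
    missing = []
    for raw in requests:
        icmp_type, _code, identifier, sequence = parse_echo(raw)
        if icmp_type == ICMP_ECHO_REQUEST and (identifier, sequence) not in answered:
            missing.append(sequence)
    return missing


# Hypothetical identifier and sequence numbers: four pings, the third is lost.
requests = [build_echo(ICMP_ECHO_REQUEST, 1, seq) for seq in (10, 11, 12, 13)]
replies = [build_echo(ICMP_ECHO_REPLY, 1, seq) for seq in (10, 11, 13)]
print(unanswered(requests, replies))  # [12]
```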

Q: How did you know about the Microsoft Knowledgebase article?
A: Google is your friend (though some people disagree, that is an entirely different discussion!) - as I knew what I was looking for, a search for 'Windows 7 arp cache timeout' yielded the following:
- second hit was a brief article located at msmvps.com. And that linked to the MS article!
- third hit was support.microsoft.com/kb/949589
- fourth hit was a http://www.petri.co.il/ article. Not directly related, but just including a shout out as there are some fantastic articles on that site!
As a quick aside, although searching usually gets you to answers, there are two great articles on the QA blogs site which can be read here and here*. I have fairly strong opinions on the subject matter myself, and disagree with some points made, but that's what makes the subject interesting. On the other hand, I agree with a lot of what is written in them - I would like to throw 'Plan B' (book) into the equation also, but with any approach there are pros and cons!

Q: How do you know about ICMP 'types' and 'codes'?
A: I have spent a fair bit of time in IT - with time and experience, this knowledge becomes second nature, trust me! Types and codes for ICMP (and indeed types and codes for other IP protocols) are coordinated by the Internet Assigned Numbers Authority (IANA). The current IANA page relating to ICMP types and codes is here

Q: I noted in the ICMP echo request screenshot that the Internet Protocol Version 4 line was red... Red is bad, isn't it?
A: Well caught! In this particular case, it isn't bad. To ensure that transmitted packets have not been corrupted, some protocols include 'checksums', which are fields containing values that are the result of a mathematical calculation based on other protocol fields. In this case, Wireshark has performed the calculation itself, and is advising that the checksum is incorrect. This is actually due to an option known as 'checksum offloading' (which Wireshark handily mentions), meaning that the checksum is calculated by the network card, so the outgoing packet is captured before the checksum is calculated! As this only happens on outgoing packets, we can be sure that no corruption has occurred!
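For the curious, here is a minimal sketch of the standard Internet checksum (the one's complement sum of 16-bit words) that the IPv4 header uses. The example header is a made-up one, not taken from my capture. A header carrying a correct checksum sums to zero when checked, whereas a header captured before the network card has filled the field in does not, which is why Wireshark flags it.

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold any carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical 20-byte IPv4 header; bytes 10-11 hold the checksum (0xb861).
header = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

# What a capture might see before the checksum has been offloaded and filled in.
not_yet_filled = header[:10] + b"\x00\x00" + header[12:]

print(hex(internet_checksum(not_yet_filled)))  # 0xb861, the value to insert
print(internet_checksum(header))               # 0, so the header is valid
```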

*Full disclosure - my wife works for QA :-)

This is my first post under what I am calling the 'Under the hood' series. This series will explore what actually happens out of sight, or in the background, when people are carrying out day to day tasks.

If you are relatively new to IT, or networking, then you may struggle to follow the below, but another of my aims is to explain each of the concepts more fully in my future tutorials. After following the tutorials, this post could be used as a benchmark to see how much understanding you have gained.
If you work in a security field, try to follow the below through and see what possible attack vectors you can think of!

In the 'Getting to Google' posts, I will be showing you what happens out of sight when users carry out the fairly straightforward task of powering on a laptop or desktop, and browsing to http://www.google.co.uk/. Despite the length of these articles, you can actually dig further down; however, I will be touching only on the key points.

To obtain the packet captures I will be going through below, I used a Windows 7 enterprise virtual machine running in VMWare Workstation. VMWare Workstation has been configured to act as the DHCP server. The packets were captured with perhaps one of my favourite tools ever - an open source tool called Wireshark. In some of the tutorial series to come, I will also be showing the use of windump, which is the Windows version of tcpdump.

The 'big' picture
The initial stage of being able to browse to Google involves obtaining an IP address so communication is possible, and then actually knowing where to send the requests. The packets below were all captured during this first stage:

Packet 1 - Please can I use my last address?
Selecting the first packet of the 10 captured, we can see a packet sent out from my Windows 7 host. We can see in the second part of the window that the source MAC address (Media Access Control, also known as 'Burned In Address' or 'physical address') of my virtual host is 00:0c:29:7c:b8:92, which you can also see belongs to VMWare. The destination MAC address is ff:ff:ff:ff:ff:ff - my host doesn't know who is out there, so it's broadcasting (or shouting!). We can also see that the source IP (Internet Protocol) is 0.0.0.0 (my host isn't allowed to use the requested address yet), and the destination is 255.255.255.255 (again, a broadcast address, but not technically used at this moment in time). We can see that the transport protocol used is UDP (User Datagram Protocol), with a source port of 68 and a destination port of 67 (the DHCP server is listening on port 67, the port related to the Bootstrap Protocol).
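Wireshark works out the vendor from the first three bytes of the MAC address (the OUI). A small sketch of the same idea is below. The lookup set is just a hand-picked pair of prefixes registered to VMware, not a complete vendor list, and the helper names are my own.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"

# Hand-picked subset of OUIs registered to VMware; a real lookup would use
# the full IEEE registry.
VMWARE_OUIS = {"00:0c:29", "00:50:56"}


def normalise(mac):
    """Lower-case the address and use colons as the separator."""
    return mac.lower().replace("-", ":")


def is_broadcast(mac):
    return normalise(mac) == BROADCAST_MAC


def is_vmware(mac):
    # The OUI is the first three bytes, i.e. the first 8 characters.
    return normalise(mac)[:8] in VMWARE_OUIS


print(is_vmware("00:0c:29:7c:b8:92"))     # source MAC of my virtual host
print(is_broadcast("ff:ff:ff:ff:ff:ff"))  # destination MAC of packet 1
```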

Now we delve into what is contained within the Bootstrap data. From the below, we can see that this is a DHCP request (asking if we can use the IP address we want to), the requested IP address is 192.168.10.128 and my virtual machine is called 'mytesthost'. Now we have sent the request, we need to wait for the response!

But... hold on a second... There is one more key thing. Make note of the 'transaction ID' in the above screenshot.

Packet 2 - No. You cannot.
Moving on to the second packet, we see an incoming packet from the local DHCP server, with a source MAC of 00:50:56:fa:1e:5d, sent to the broadcast MAC address. This time we see a source IP of 192.168.126.254 (the DHCP server has a static, or pre-configured, IP address), and a broadcast destination IP. Note that the UDP source port is now 67 and the destination UDP port is now 68.

If we then have a look at the Bootstrap data, we can see that the message type is a 'DHCP NAK' or DHCP Negative AcKnowledgement (I acknowledge your request, but the answer is 'no'), and the DHCP server is also kind enough to give a reason - the IP address you have requested is not available!

Hmmm... if the destination MAC address was a broadcast address, and the IP address was a broadcast address, i.e. this was 'shouted out' at everyone, how does my virtual host know that the negative acknowledgement is directed at it? Note the transaction ID!
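In code terms, the idea is roughly as follows: the client remembers the transaction ID it sent, and only pays attention to broadcast replies that carry the same value. This is a simplified sketch with made-up transaction IDs and a hypothetical DhcpReply type, not how any particular DHCP client is written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DhcpReply:
    xid: int            # transaction ID copied from the client's message
    message_type: str   # e.g. "NAK", "OFFER", "ACK"


def replies_for_me(my_xid, broadcast_replies):
    """Keep only the replies that belong to my own conversation."""
    return [reply for reply in broadcast_replies if reply.xid == my_xid]


MY_XID = 0x1A2B3C4D  # hypothetical transaction ID chosen by my host

# Everything 'shouted out' on the network, including someone else's offer.
heard = [
    DhcpReply(0x99999999, "OFFER"),
    DhcpReply(MY_XID, "NAK"),
]
print(replies_for_me(MY_XID, heard))
```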

Packet 3 - ok then... can somebody please offer me an address?
So, my virtual machine has asked if it could use the IP address that it had previously used, and it got a 'no' straight back. This time, it is just going to shout out again, but this time it is willing to accept offers. The MAC and IP addresses are the same as the first packet. The UDP ports used are the same as the first packet. This time, however, the Bootstrap data is slightly different. The message type is now a 'DHCP discover' (my virtual host is asking if there are any DHCP servers out there willing to offer it an IP address). You can see that the hostname is the same as in the first packet.
Note that the transaction ID has now changed. This is now an entirely different conversation than that contained in the first two packets.

Packet 4 - Just a second, I have an address to offer, but just double checking that it isn't already in use...
Now, at this stage the DHCP server believes that the IP address 192.168.126.129 isn't being used. However, there are hosts out there that can have static, or pre-configured, IP addresses. Maybe one of those hosts has already got this address! To check that this is not the case, the DHCP server sends out an ICMP (Internet Control Message Protocol) packet, of 'type 8 code 0', which translates to an 'ICMP echo request' or 'ping'. As IP addresses need to be unique, the DHCP server is shouting out 'is there anybody out there called 192.168.126.129?'. In this case, there is not.
Note the identifier. Although in this packet capture there is no response, if there *was* a response, the response packet would contain that identifier, so the DHCP server would know that this response was part of the same conversation. Note the similarities between the ICMP identifier used here, and the DHCP transaction IDs in the first 3 packets, with respect to how they are used.

Packet 5 - We have another speaker!
Now we see an ARP (Address Resolution Protocol) packet. We will see ARP packets again in captured packets 9 and 10, so we will go into more depth there. For now, however, as soon as the local default gateway sees the ICMP request, it notes that somebody is interested in 192.168.126.129, and also notes that it doesn't know the MAC address for that host. In an attempt to find out the address of that host, the ARP request is sent out - although the phrase does not appear in the actual packet data, Wireshark carries out one of its very handy functions in interpreting the packet for us. As you can see, the interpretation is "Who has 192.168.126.129? Tell 192.168.126.2" (192.168.126.2 is the IP address of the default gateway).

Packet 6 - Right! I have an address I can offer. With conditions...
The DHCP server has sat there in silence for just over 1 second. During this time, if there was another host on the network that already had the IP address of 192.168.126.129, an answer would have been received. As it has not, the DHCP server now offers the IP address to my virtual host.
The source MAC address is the MAC address of the DHCP server, and the destination MAC address is that of my virtual host. The source IP is that of the DHCP server, and the destination IP is the address that has been offered (although note that this is effectively ignored at this point, as my virtual host hasn't yet accepted the IP address offer; the offered address is actually contained in the Bootstrap data).

Now, a closer inspection of the Bootstrap data shows that the DHCP server is giving the following offer: "I offer you the IP address of 192.168.126.129 (Your IP address). The address will be yours for 30 minutes (lease time, after which the IP address will need to be formally requested again). Your default gateway is 192.168.126.2 (router), and your Domain Name System (DNS) server is also 192.168.126.2". You can see that the DHCP message type is indeed 'DHCP Offer', and the transaction ID is the same as that in packet 3, advising that this DHCP Offer is related to the DHCP discover packet sent out just over 1 second ago.
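In practice a client does not wait for the full 30 minutes before asking again. Assuming the server does not override the defaults (DHCP options 58 and 59 can do that), RFC 2131 has the client try to renew at 50% of the lease and fall back to rebinding at 87.5%. A quick sketch of that arithmetic, with lease_timers being my own helper name:

```python
from datetime import timedelta

# Default fractions from RFC 2131; a server may override them.
T1_FRACTION = 0.5     # start renewing with the original server
T2_FRACTION = 0.875   # start rebinding with any server


def lease_timers(lease):
    """Return when renewal, rebinding and expiry happen for a lease."""
    return {
        "renew (T1)": lease * T1_FRACTION,
        "rebind (T2)": lease * T2_FRACTION,
        "expire": lease,
    }


timers = lease_timers(timedelta(minutes=30))
for name, when in timers.items():
    print(f"{name}: {when}")
```

For the 30 minute lease offered here, that means a renewal attempt after 15 minutes and a rebind attempt after 26 minutes and 15 seconds.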

Packet 7 - Well alright then, please may I have that address?
My virtual host now has the offer of an IP address. DHCP replies can be received from multiple DHCP servers (in this case I only have the one!) - in these situations, one offer has got to be accepted and all others rejected. To do this, the destination MAC address is set to the broadcast MAC address, and the destination IP address also set to broadcast. This request is shouted out, so the offer from all other servers is implicitly rejected (the transaction ID is used again!).

Hold on! If this is shouted out to everybody, how does the DHCP server whose offer has been accepted know that this is the case? The IP address of the DHCP server is contained within the Bootstrap data. From the below you can see that the DHCP message type is 'DHCP Request', the requested IP is 192.168.126.129, and the DHCP request is intended for DHCP server 192.168.126.254. Further to this, using 'DHCP option 55', my virtual host has requested information regarding, among other things: the default gateway, the DNS server and the subnet mask to use.
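The Bootstrap options are stored as a simple run of 'code, length, value' entries, finishing with an end marker (code 255). Below is a minimal sketch of walking through them. The sample bytes are hand-built to mirror the fields described above (with only three items in the option 55 list), not copied from the capture, and the magic cookie and fixed BOOTP header that precede the options in a real packet are left out.

```python
import ipaddress

PAD, END = 0, 255
MESSAGE_TYPE, REQUESTED_IP, SERVER_ID, PARAMETER_LIST = 53, 50, 54, 55
SUBNET_MASK, ROUTER, DNS_SERVER = 1, 3, 6


def parse_options(data):
    """Walk the code/length/value entries and return {code: value_bytes}."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == END:
            break
        if code == PAD:          # single-byte padding, no length field
            i += 1
            continue
        length = data[i + 1]
        options[code] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options


# Hand-built sample resembling the DHCP Request in packet 7.
sample = (
    bytes([MESSAGE_TYPE, 1, 3])   # message type 3 = DHCP Request
    + bytes([REQUESTED_IP, 4]) + ipaddress.IPv4Address("192.168.126.129").packed
    + bytes([SERVER_ID, 4]) + ipaddress.IPv4Address("192.168.126.254").packed
    + bytes([PARAMETER_LIST, 3, SUBNET_MASK, ROUTER, DNS_SERVER])
    + bytes([END])
)

options = parse_options(sample)
print("requested IP:", ipaddress.IPv4Address(options[REQUESTED_IP]))
print("intended server:", ipaddress.IPv4Address(options[SERVER_ID]))
print("option 55 asks for codes:", list(options[PARAMETER_LIST]))
```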

Packet 8 - You sure can!
Well, the DHCP server has offered the address, and now received a formal request for it. A DHCP ACK (DHCP ACKnowledgment) is sent from the DHCP server to my virtual host. The source MAC and source IP address are those of the DHCP server, and the destination MAC and IP address are those of my virtual host. Nobody else needs to know about this packet, so broadcasts are not used.

The Bootstrap data contained within the packet lists similar information to the previous offer in packet 6. Note that not all of the options requested in packet 7 are listed; however, the options that are available will allow my virtual host to 'get out' to the Internet!
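Putting packets 1, 2, 3, 6, 7 and 8 together, the DHCP side of the conversation can be replayed with a very small sketch. The message type numbers are the standard DHCP ones, but the client_response logic is a heavy simplification of a real client and the function names are my own.

```python
from enum import IntEnum


class DhcpType(IntEnum):
    DISCOVER = 1
    OFFER = 2
    REQUEST = 3
    DECLINE = 4
    ACK = 5
    NAK = 6
    RELEASE = 7
    INFORM = 8


def client_response(received):
    """What the client sends next after a server message (None = finished)."""
    if received is DhcpType.NAK:
        return DhcpType.DISCOVER   # old address refused, ask for offers
    if received is DhcpType.OFFER:
        return DhcpType.REQUEST    # formally request the offered address
    if received is DhcpType.ACK:
        return None                # address is ours, nothing more to send
    raise ValueError(f"unexpected message from server: {received.name}")


def replay(first_client_message, server_messages):
    conversation = [first_client_message]
    for received in server_messages:
        conversation.append(received)
        reply = client_response(received)
        if reply is not None:
            conversation.append(reply)
    return conversation


conversation = replay(
    DhcpType.REQUEST, [DhcpType.NAK, DhcpType.OFFER, DhcpType.ACK]
)
print(" -> ".join(message.name for message in conversation))
# REQUEST -> NAK -> DISCOVER -> OFFER -> REQUEST -> ACK
```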

Packet 9 - Ready to go! Right, where is that default gateway?
Right! My virtual machine has an IP address. It also has the IP address of the default gateway (192.168.126.2, the next 'hop' towards the Internet!). However, in order for my virtual machine to send packets via the default gateway, it needs to know the MAC address of the default gateway. To obtain the MAC address, it sends an ARP request out. The destination MAC address is set as the broadcast address - my virtual host is shouting "Who has 192.168.126.2? Tell me!"

Packet 10 - I, sir, am the default gateway you are looking for
The default gateway sees the ARP request, and recognising that the request is directed at it, responds with an ARP reply. As this is an ARP response, the destination MAC address is set to that of the host that sent the ARP request, which in this case is my virtual machine. The ARP reply says "You were looking for the MAC address for host 192.168.126.2 - that is me, and I can confirm that my MAC address is 00:50:56:f6:72:76"
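To show how little is actually inside these two packets, here is a sketch that builds the 28-byte ARP payloads for packets 9 and 10 and produces a Wireshark-style summary of each. The addresses come from the capture above. The helper names are my own, the Ethernet header that carries the broadcast destination MAC is not included, and bytes.hex with a separator needs Python 3.8 or later.

```python
import socket
import struct

ARP_REQUEST, ARP_REPLY = 1, 2

# hardware type, protocol type, hardware size, protocol size, opcode,
# sender MAC, sender IP, target MAC, target IP
ARP_PACKET = struct.Struct("!HHBBH6s4s6s4s")


def mac_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def build_arp(opcode, sender_mac, sender_ip, target_mac, target_ip):
    return ARP_PACKET.pack(
        1,        # hardware type 1 = Ethernet
        0x0800,   # protocol type = IPv4
        6, 4,     # MAC addresses are 6 bytes, IPv4 addresses are 4
        opcode,
        mac_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_bytes(target_mac), socket.inet_aton(target_ip),
    )


def summarise(packet):
    """Produce a Wireshark-style one-line description of an ARP packet."""
    *_, opcode, sender_mac, sender_ip, _target_mac, target_ip = ARP_PACKET.unpack(packet)
    if opcode == ARP_REQUEST:
        return f"Who has {socket.inet_ntoa(target_ip)}? Tell {socket.inet_ntoa(sender_ip)}"
    if opcode == ARP_REPLY:
        return f"{socket.inet_ntoa(sender_ip)} is at {sender_mac.hex(':')}"
    return f"Unknown ARP opcode {opcode}"


host_mac, host_ip = "00:0c:29:7c:b8:92", "192.168.126.129"
gateway_mac, gateway_ip = "00:50:56:f6:72:76", "192.168.126.2"

# Packet 9: the target MAC is unknown, so it is left as all zeros.
request = build_arp(ARP_REQUEST, host_mac, host_ip, "00:00:00:00:00:00", gateway_ip)
# Packet 10: the gateway answers my virtual host directly.
reply = build_arp(ARP_REPLY, gateway_mac, gateway_ip, host_mac, host_ip)

print(summarise(request))
print(summarise(reply))
```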

Wow! For something as simple as powering on your laptop and browsing to http://www.google.co.uk/, you can see that a lot happens in the background. And so far we have only covered what happens when the laptop is powered on!

Q: Surely in this day and age you should be focusing on IPv6?
A: Indeed - IPv4 is still hanging on, but the IPv6 swapover is inevitable. However, as an introduction to what is going on, IPv4 is far more easily followed, and I am hoping that this post is referenced as people follow the upcoming tutorials through. Most training courses that I am aware of focus on IPv4 in the introductions and come on to IPv6 as an advanced topic. I have decided to follow!

Q: I don't understand a lot of this article - RFCs? Broadcasts? Bootstrap options???
A: As said above, if you are new or relatively new to this, I am hoping that this post can be used as a benchmark. When up and coming tutorials are followed through, you should find that you start to understand more of what is happening here! We have touched on a lot of things here, and that hopefully highlights how much actually goes on in the background - and we are just scratching the surface!

Saturday, 22 June 2013

Under the hood - Getting to Google Part 2

Sunday, 30 December 2012

Under the hood - Getting to Google Part 1
