If you are relatively new to IT or networking, you may struggle to follow what is below, but one of my aims is to explain each of these concepts more fully in my future tutorials. After following the tutorials, you could use this post as a benchmark to see how much understanding you have gained.
If you work in a security field, try to follow the below through and see what possible attack vectors you can think of!
In the 'Getting to Google' posts, I will be showing you what happens out of sight when users carry out the fairly straightforward task of powering on a laptop or desktop and browsing to http://www.google.co.uk/. Despite the length of these articles, you can actually dig down much further; I will be touching only on the key points.
To obtain the packet captures I will be going through below, I used a Windows 7 Enterprise virtual machine running in VMware Workstation. VMware Workstation was configured to act as the DHCP server. The packets were captured with perhaps one of my favourite tools ever - an open source tool called Wireshark. In some of the tutorial series to come, I will also be showing the use of WinDump, which is the Windows version of tcpdump.
The 'big' picture
The initial stage of being able to browse to Google involves obtaining an IP address so that communication is possible, and then actually knowing where to send the requests. The ten captured packets walked through below all belong to this first stage.
Packet 1 - Please can I use my last address?
Selecting the first of the 10 captured packets, we can see a packet sent out from my Windows 7 host. We can see in the second part of the window that the source MAC address (Media Access Control, also known as 'Burned In Address' or 'physical address') of my virtual host is 00:0c:29:7c:b8:92, whose prefix you can also see belongs to VMware. The destination MAC address is ff:ff:ff:ff:ff:ff - my host doesn't know who is out there, so it's broadcasting (or shouting!). We can also see that the source IP (Internet Protocol) address is 0.0.0.0 (my host isn't allowed to use the requested address yet), and the destination is 255.255.255.255 (again a broadcast address, this time at the IP layer; on the local segment it is the broadcast MAC address that actually gets the frame delivered to every host). We can see that the transport protocol used is UDP (User Datagram Protocol), with a source port of 68 and a destination port of 67 (the DHCP server is listening on port 67, the server port of the Bootstrap Protocol).
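To make the MAC address points a little more concrete, here is a minimal Python sketch of the two checks described above: spotting a broadcast destination and reading the vendor from the first three bytes of an address. The helper names and the two-entry vendor table are my own illustration, not anything Wireshark exposes, and a real lookup would use the full IEEE registry.

```python
# Hypothetical helpers for illustration; KNOWN_OUIS is a tiny sample, not a full registry.
KNOWN_OUIS = {"00:0c:29": "VMware", "00:50:56": "VMware"}

def parse_mac(text: str) -> bytes:
    """Turn 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in text.split(":"))

def is_broadcast(mac: bytes) -> bool:
    """An all-ones destination (ff:ff:ff:ff:ff:ff) is delivered to every host on the segment."""
    return mac == b"\xff" * 6

def vendor(mac: bytes) -> str:
    """The first three bytes of a MAC address identify the manufacturer."""
    return KNOWN_OUIS.get(mac[:3].hex(":"), "unknown")

source = parse_mac("00:0c:29:7c:b8:92")
destination = parse_mac("ff:ff:ff:ff:ff:ff")
print(vendor(source), is_broadcast(destination))
```

Note that `bytes.hex(":")` needs Python 3.8 or later.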
Now we delve into what is contained within the Bootstrap data. Here we can see that this is a DHCP request (asking if we can use the IP address we want to), the requested IP address is 192.168.10.128 and my virtual machine is called 'mytesthost'. Now that we have sent the request, we need to wait for the response!
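If you would like to see roughly what those Bootstrap bytes look like, the sketch below builds a DHCP request payload by hand. It is only an approximation of what Windows sends: the function name is mine, the transaction ID 0x12345678 is made up (the real one is only visible in the capture), the flags are assumed to be zero, and only three options are included.

```python
import socket
import struct

MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the DHCP options

def build_dhcp_request(xid: int, mac: bytes, requested_ip: str, hostname: str) -> bytes:
    """Build a minimal DHCP Request payload (the part carried inside UDP)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,   # op=1 (client request), htype=1 (Ethernet), hlen=6, hops=0
        xid,          # transaction ID chosen by the client
        0,            # seconds elapsed
        0,            # flags (assumed zero here)
        bytes(4),     # ciaddr: 0.0.0.0, the client has no usable address yet
        bytes(4),     # yiaddr: filled in by the server in replies
        bytes(4),     # siaddr
        bytes(4),     # giaddr
        mac,          # chaddr: struct pads the 6 bytes out to 16
        b"",          # sname: 64 zero bytes
        b"",          # file: 128 zero bytes
    )
    name = hostname.encode("ascii")
    options = (
        bytes([53, 1, 3])                                  # option 53: message type 3 = Request
        + bytes([50, 4]) + socket.inet_aton(requested_ip)  # option 50: requested IP address
        + bytes([12, len(name)]) + name                    # option 12: host name
        + bytes([255])                                     # end of options
    )
    return header + MAGIC_COOKIE + options

payload = build_dhcp_request(
    0x12345678,                     # hypothetical transaction ID
    bytes.fromhex("000c297cb892"),  # MAC address of the virtual host
    "192.168.10.128",
    "mytesthost",
)
print(len(payload), "bytes")
```

The fixed header is always 236 bytes, so the options are easy to find: they start straight after the four-byte magic cookie.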
But... hold on a second... There is one more key thing. Make note of the 'transaction ID' in the above screenshot.
Packet 2 - No. You cannot.
Moving on to the second packet, we see an incoming packet from the local DHCP server, with a source MAC of 00:50:56:fa:1e:5d, sent to the broadcast MAC address. This time we see a source IP of 192.168.126.254 (the DHCP server has a static, or pre-configured, IP address), and a broadcast destination IP. Note that the UDP source port is now 67 and the destination UDP port is now 68.
If we then have a look at the Bootstrap data, we can see that the message type is a 'DHCP NAK' or DHCP Negative AcKnowledgement (I acknowledge your request, but the answer is 'no'), and the DHCP server is also kind enough to give a reason - the IP address you have requested is not available!
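The message type and the reason both live in the DHCP options, which are simple code/length/value triples. Below is a rough sketch of how they could be decoded. The parser is my own minimal version with no handling of truncated data, and the reason text is a stand-in, since the exact wording the server sent is only visible in the capture.

```python
MESSAGE_TYPES = {1: "Discover", 2: "Offer", 3: "Request", 4: "Decline",
                 5: "ACK", 6: "NAK", 7: "Release", 8: "Inform"}

def parse_options(data: bytes) -> dict:
    """Walk the option list and return {option code: raw value bytes}."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:   # End option: stop parsing
            break
        if code == 0:     # Pad option: a single byte with no length field
            i += 1
            continue
        length = data[i + 1]
        options[code] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options

reason = b"requested address not available"  # hypothetical wording
raw = bytes([53, 1, 6]) + bytes([56, len(reason)]) + reason + bytes([255])
opts = parse_options(raw)
print(MESSAGE_TYPES[opts[53][0]], "-", opts[56].decode("ascii"))
```

Option 53 carries the message type and option 56 carries the free-text message.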
Hmmm... if the destination MAC address was a broadcast address, and the IP address was a broadcast address, i.e. this was 'shouted out' at everyone, how does my virtual host know that the negative acknowledgement is directed at it? Note the transaction ID!
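Here is a small sketch of that filtering step: every host on the segment receives the broadcast, but only the one whose transaction ID (and hardware address) match treats it as its own. The function names and the transaction ID values are made up for illustration.

```python
import struct

def reply_is_for_me(payload: bytes, my_xid: int, my_mac: bytes) -> bool:
    """Decide whether a broadcast DHCP reply belongs to our conversation."""
    if len(payload) < 236:        # shorter than the fixed Bootstrap header
        return False
    op = payload[0]               # 2 = reply from a server
    (xid,) = struct.unpack_from("!I", payload, 4)
    chaddr = payload[28:34]       # client hardware address echoed back by the server
    return op == 2 and xid == my_xid and chaddr == my_mac

def fake_reply(xid: int, mac: bytes) -> bytes:
    """Build a bare 236-byte reply header, just enough for the demonstration."""
    return struct.pack("!BBBBI", 2, 1, 6, 0, xid) + bytes(20) + mac + bytes(10 + 192)

my_mac = bytes.fromhex("000c297cb892")
print(reply_is_for_me(fake_reply(0x1111, my_mac), 0x1111, my_mac))  # our conversation
print(reply_is_for_me(fake_reply(0x2222, my_mac), 0x1111, my_mac))  # somebody else's
```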
Packet 3 - ok then... can somebody please offer me an address?
So, my virtual machine has asked if it could use the IP address that it had previously used, and it got a 'no' straight back. This time, it is just going to shout out again, but this time it is willing to accept offers. The MAC and IP addresses are the same as the first packet. The UDP ports used are the same as the first packet. This time, however, the Bootstrap data is slightly different. The message type is now a 'DHCP discover' (my virtual host is asking if there are any DHCP servers out there willing to offer it an IP address). You can see that the hostname is the same as in the first packet.
Note that the transaction ID has now changed. This is now an entirely different conversation than that contained in the first two packets.
Packet 4 - Just a second, I have an address to offer, but just double checking that it isn't already in use...
Now, at this stage the DHCP server believes that the IP address 192.168.126.129 isn't being used. However, there are hosts out there that can have static, or pre-configured, IP addresses. Maybe one of those hosts has already got this address! To check that this is not the case, the DHCP server sends out an ICMP (Internet Control Message Protocol) packet, of 'type 8 code 0', which translates to an 'ICMP echo request' or 'ping'. As IP addresses need to be unique, the DHCP server is shouting out 'is there anybody out there called 192.168.126.129?'. In this case, there is not.
Note the identifier. Although in this packet capture there is no response, if there *was* a response, the response packet would contain that identifier, so the DHCP server would know that this response was part of the same conversation. Note the similarities between the ICMP identifier used here, and the DHCP transaction IDs in the first 3 packets, with respect to how they are used.
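As a rough illustration, this is how an ICMP echo request could be put together and how a reply would be matched back to it by identifier. The identifier and sequence values below are arbitrary, and the function names are my own; the real DHCP server picks its own values.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Standard Internet checksum: one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Type 8, code 0 is an echo request. The checksum is computed with its own field set to zero."""
    unchecked = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

def is_matching_reply(reply: bytes, identifier: int) -> bool:
    """Type 0, code 0 is an echo reply. It must carry the identifier we sent."""
    icmp_type, code, _checksum, ident, _seq = struct.unpack("!BBHHH", reply[:8])
    return icmp_type == 0 and code == 0 and ident == identifier

request = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(request.hex())
```

A handy property of this checksum is that running it over a packet that already contains a correct checksum gives zero, which is how a receiver verifies it.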
Packet 5 - We have another speaker!
Now we see an ARP (Address Resolution Protocol) packet. We will see ARP packets in captured packets 9 and 10, so we will go into more depth there. For now, however, as soon as the local default gateway sees the ICMP request, it notes that somebody is interested in 192.168.126.129, and also notes that it doesn't know the MAC address for that host. In an attempt to find out the address of that host, the ARP request is sent out - although the phrase does not appear in the actual packet data, Wireshark carries out one of its very handy functions in interpreting the packet for us. As you can see, the interpretation is "Who has 192.168.126.129? Tell 192.168.126.2" (192.168.126.2 is the IP address of the default gateway).
Packet 6 - Right! I have an address I can offer. With conditions...
The DHCP server has sat there in silence for just over 1 second. During this time, if there was another host on the network that already had the IP address of 192.168.126.129, an answer would have been received. As it has not, the DHCP server now offers the IP address to my virtual host.
The source MAC address is the MAC address of the DHCP server, and the destination MAC address is that of my virtual host. The source IP is that of the DHCP server, and the destination IP is the address that is being offered (although note that my virtual host does not rely on this at this point, as it hasn't yet accepted the offer; the offered address that matters is the one contained in the Bootstrap data).
Now, a closer inspection of the Bootstrap data shows that the DHCP server is giving the following offer: "I offer you the IP address of 192.168.126.129 (Your IP address). The address will be yours for 30 minutes (lease time, after which the IP address will need to be formally requested again). Your default gateway is 192.168.126.2 (router), and your Domain Name System (DNS) server is also 192.168.126.2". You can see that the DHCP message type is indeed 'DHCP Offer', and the transaction ID is the same as that in packet 3, advising that this DHCP Offer is related to the DHCP discover packet sent out just over 1 second ago.
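To show how the offer turns into those human-readable values, here is a small sketch that decodes the relevant fields. It assumes the options have already been split into a dictionary keyed by option code; `describe_offer` is a made-up helper name. Option 51 is the lease time in seconds, option 3 the router and option 6 the DNS server.

```python
import socket
import struct

def describe_offer(yiaddr: bytes, options: dict) -> dict:
    """Turn the raw 'Your IP address' field and option values into readable form."""
    (lease_seconds,) = struct.unpack("!I", options[51])  # 32-bit count of seconds
    return {
        "your_ip": socket.inet_ntoa(yiaddr),
        "lease_minutes": lease_seconds // 60,
        "router": socket.inet_ntoa(options[3][:4]),      # first router in the list
        "dns": socket.inet_ntoa(options[6][:4]),         # first DNS server in the list
    }

# Values taken from the offer described above.
offer_options = {
    51: struct.pack("!I", 30 * 60),
    3: socket.inet_aton("192.168.126.2"),
    6: socket.inet_aton("192.168.126.2"),
}
print(describe_offer(socket.inet_aton("192.168.126.129"), offer_options))
```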
Packet 7 - Well alright then, please may I have that address?
My virtual host now has the offer of an IP address. DHCP offers can be received from multiple DHCP servers (in this case I only have the one!) - in these situations, one offer has got to be accepted and all others rejected. To do this, the destination MAC address is set to the broadcast MAC address, and the destination IP address is also set to broadcast. This request is shouted out, so the offers from all other servers are implicitly rejected (the transaction ID is used again!).
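One detail not covered above is how each server tells whether its own offer was the one accepted. In standard DHCP the broadcast request carries a 'server identifier' option (option 54) naming the chosen server. The sketch below shows the comparison a server could make. The function name is mine, and the second server at 192.168.126.253 is purely hypothetical, as my lab only has the one.

```python
import socket

SERVER_IDENTIFIER = 54  # option code naming the server whose offer was accepted

def server_was_chosen(request_options: dict, my_ip: str) -> bool:
    """A server compares the server identifier in the broadcast request with its own address."""
    return request_options.get(SERVER_IDENTIFIER) == socket.inet_aton(my_ip)

# Options as they might appear in the request, already split by option code.
request_options = {
    53: bytes([3]),                                         # message type: Request
    50: socket.inet_aton("192.168.126.129"),                # the address being requested
    SERVER_IDENTIFIER: socket.inet_aton("192.168.126.254"), # the server that made the offer
}
print(server_was_chosen(request_options, "192.168.126.254"))  # the real DHCP server
print(server_was_chosen(request_options, "192.168.126.253"))  # hypothetical second server
```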
Packet 8 - You sure can!
Well, the DHCP server has offered the address, and now received a formal request for it. A DHCP ACK (DHCP ACKnowledgment) is sent from the DHCP server to my virtual host. The source MAC and source IP address are those of the DHCP server, and the destination MAC and IP address are those of my virtual host. Nobody else needs to know about this packet, so broadcasts are not used.
The Bootstrap data contained within the packet lists similar information to the previous offer in packet 6. Note that not all of the options requested in packet 7 are listed; however, the options that are available will allow my virtual host to 'get out' to the Internet!
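Pulling the DHCP packets together, the whole exchange from the client's point of view can be sketched as a tiny state machine. The state names are borrowed from the DHCP specification (RFC 2131), but this is a heavily simplified model of my own with no timers, retries or renewals.

```python
# Simplified client-side transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("INIT-REBOOT", "send Request"): "REBOOTING",  # packet 1: may I use my last address?
    ("REBOOTING", "recv NAK"): "INIT",             # packet 2: no, start from scratch
    ("INIT", "send Discover"): "SELECTING",        # packet 3: any offers?
    ("SELECTING", "recv Offer"): "SELECTING",      # packet 6: collect offers
    ("SELECTING", "send Request"): "REQUESTING",   # packet 7: accept one offer
    ("REQUESTING", "recv ACK"): "BOUND",           # packet 8: the address is ours
}

def run(start: str, events: list) -> str:
    """Feed a sequence of events through the table and return the final state."""
    state = start
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

captured = ["send Request", "recv NAK", "send Discover",
            "recv Offer", "send Request", "recv ACK"]
print(run("INIT-REBOOT", captured))
```

Packets 4 and 5 (the ping and the ARP request) do not appear here because they are sent by the DHCP server and the gateway, not by the client.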
Packet 9 - Ready to go! Right, where is that default gateway?
Right! My virtual machine has an IP address. It also has the IP address of the default gateway (192.168.126.2, the next 'hop' towards the Internet!). However, in order for my virtual machine to send packets via the default gateway, it needs to know the MAC address of the default gateway. To obtain the MAC address, it sends an ARP request out. The destination MAC address is set as the broadcast address - my virtual host is shouting "Who has 192.168.126.2? Tell me!"
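For the curious, the ARP request itself is a very small frame. The sketch below builds one from the addresses seen in this capture. The function name is my own, and actually sending the frame would need a raw socket, which is left out here.

```python
import socket
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP 'who has' request."""
    ethernet = struct.pack(
        "!6s6sH",
        b"\xff" * 6,   # destination: broadcast, since we don't know who owns the IP yet
        sender_mac,    # source: our own MAC address
        0x0806,        # EtherType for ARP
    )
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,             # hardware type: Ethernet
        0x0800,        # protocol type: IPv4
        6, 4,          # hardware and protocol address lengths
        1,             # operation: 1 = request
        sender_mac, socket.inet_aton(sender_ip),
        bytes(6),      # target MAC: unknown, so all zeros
        socket.inet_aton(target_ip),
    )
    return ethernet + arp

frame = build_arp_request(bytes.fromhex("000c297cb892"), "192.168.126.129", "192.168.126.2")
print(len(frame), "bytes")
```

The 'Who has ...? Tell ...' wording shown by Wireshark is built from the target IP and sender IP fields at the end of this structure.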
Packet 10 - I, sir, am the default gateway you are looking for
The default gateway sees the ARP request, and recognising that the request is directed at it, responds with an ARP reply. As this is an ARP response, the destination MAC address is set to that of the host that sent the ARP request, which in this case is my virtual machine. The ARP reply says "You were looking for the MAC address for host 192.168.126.2 - that is me, and I can confirm that my MAC address is 00:50:56:f6:72:76"
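Once the reply arrives, the host remembers the answer so that it does not have to ask again for every packet. A minimal sketch of that step, using the addresses from this capture, might look like the following. The `arp_cache` dictionary is just a stand-in for the operating system's real ARP table.

```python
import socket
import struct

def parse_arp(arp: bytes) -> dict:
    """Decode the 28-byte ARP body (the part after the Ethernet header)."""
    _htype, _ptype, _hlen, _plen, oper, sha, spa, tha, tpa = struct.unpack(
        "!HHBBH6s4s6s4s", arp[:28]
    )
    return {
        "operation": {1: "request", 2: "reply"}.get(oper, "other"),
        "sender_mac": sha.hex(":"),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": tha.hex(":"),
        "target_ip": socket.inet_ntoa(tpa),
    }

# The reply as described above: the gateway is the sender, my virtual machine the target.
reply = struct.pack(
    "!HHBBH6s4s6s4s",
    1, 0x0800, 6, 4, 2,
    bytes.fromhex("005056f67276"), socket.inet_aton("192.168.126.2"),
    bytes.fromhex("000c297cb892"), socket.inet_aton("192.168.126.129"),
)

arp_cache = {}
fields = parse_arp(reply)
if fields["operation"] == "reply":
    arp_cache[fields["sender_ip"]] = fields["sender_mac"]
print(arp_cache)
```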
Wow! For something as simple as powering on your laptop and browsing to http://www.google.co.uk/, you can see that a lot happens in the background. And so far we have only covered what happens when the laptop is powered on!
Q: Surely in this day and age you should be focusing on IPv6?
A: Indeed - IPv4 is still hanging on, but the IPv6 swapover is inevitable. However, as an introduction to what is going on, IPv4 is far easier to follow, and I am hoping that this post will be referred back to as people work through the upcoming tutorials. Most training courses that I am aware of focus on IPv4 in the introductions and come on to IPv6 as an advanced topic. I have decided to follow suit!
Q: I don't understand a lot of this article - RFCs? Broadcasts? Bootstrap options???
A: As said above, if you are new or relatively new to this, I am hoping that this post can be used as a benchmark. As you work through the upcoming tutorials, you should find that you start to understand more of what is happening here! We have touched on a lot of things here, and that hopefully highlights how much actually goes on in the background - and we are just scratching the surface!