How does a packet travel from one computer to another over the Internet based on OSI model - networking

I am familiar with the basic OSI model but I always get confused how does a packet travel from one machine to another over the Internet and what OSI layers do come into picture? For example, for the following topology:
Machine A<----->Switch<---->Router<---->Router<---->Router<---->Switch<---->Machine B
where the multiple routers are shown to represent the Internet, what happens at the OSI layer level, when Machine A send a packet (say a simple "ls" command over FTP) to Machine B.
The above is just a suggested example, but if any one can explain with any other network topology, that is fine too. All I am looking a very basic explanation of how the packet gets transformed to different OSI layers at each nodes (Machine, Switch, Router, etc.).

Routers use the IP layer (layer 3) and switches use the data-link layer (layer 2). Layer 1 is the physical 1s and 0s that go over a wire, Layer 2 is the data-link layer, which is protocols like Ethernet and Point-To-Point Protocol (PPP), which carries information between adjacent nodes about MAC address from and to and allows for error detection and retransmission. Layer 3 is the IP layer, which carries information about where in the whole network the packet is from and to, not just the current hop.
The transmission would go like this:
Machine A wants to send a packet to Machine B. Machine A knows Machine B's IP address, so it places that in the layer 3 packet. Machine A needs to place the MAC Address of the next hop in the layer 2 packet, however. If it does not know, then it will send something called an ARP request (Address Resolution Protocol, read here: http://www.tildefrugal.net/tech/arp.php ) to the network, with the destination IP. One of a few things will happen here:
The IP is local. The machine with that IP will reply back to the sender with its MAC address.
The IP is non-local. The gateway router will detect this and send its MAC address.
The IP is non-local and Machine A's default gateway and subnet mask are set. Using this information Machine A can determine the non-locality of the IP address and send it to the router's MAC address (ARPing if not known yet).
(If Machine A found this out earlier, it will be in the ARP cache and Machine A will just use that.) Now that the MAC address is sent, the packet can be transferred (the physical layer 1 performing the actual transfer of data on the wire). The next stop will be the switch. The switch knows which outbound port the MAC address listed as the layer 2 destination is on, because it tracks every MAC address it's seen a packet come from and which port it came on - if it does not know, then it will flood it out every single port, guaranteeing it'll arrive.
As such, the packet arrives at the router. The cool thing about the IP model is that it divides every single IP address in the network/world into a hierarchy - Subnets by definition cannot overlap subnets partially, they either wholly contain them or are wholly contained by them. So as long as subnets follow this hierarchy, the router can unambiguously determine where each of the 4 billion possible IP addresses are on the network just by looking at what subnet the IP will fall under in its table! The packet is then sent out that port.
As the packet travels through interconnected ISPs' routers, backbone infrastructure and so on, it arrives at Machine B's router, where the opposite process happens - router B sees that its destined for Machine B and sends it inbound. (Similarly, Router B will have to use a process like ARP to find Machine B's MAC address if not known.) The rest should be trivial from here.
good references:
https://web.archive.org/web/20120129120350/http://www.tildefrugal.net/tech/arp.php
http://en.wikipedia.org/wiki/Data_link_layer
http://en.wikipedia.org/wiki/Network_switch
http://en.wikipedia.org/wiki/Network_layer
http://en.wikipedia.org/wiki/Routing
http://en.wikipedia.org/wiki/Router_(computing)
http://en.wikipedia.org/wiki/Address_Resolution_Protocol

The only thing that can travel over a copper wire are pulses of electricity.
The binary number 1 is represented by a pulse of electricity or no pulse of electricity for 0.
Just keep in mind that real data of any kind cannot be sent over copper wire, fibre optic, or through the air ...only a representation of the data which has previously been converted to a 1 or a 0 and then is reconverted back at the receiving end.

Network layer protocol supervises the transmission of packets from a source machine to a destination. Data is broken down into packets, or datagrams, up to 64 kb long before it is transmitted, with a stamp of destination IP address, and forwarded to the network gateway. A gateway can be router to interconnect networks.

Related

Why only MAC address is used to transfer the packet to a device?

I am sorry if its basics, but I did not find the appealing answer for it over the Internet.
Why only MAC is used to transfer the packet to a device ? MAC address is only obtained by ARP for a specific IP address. So, why not just let the routers maintain IP addresses of the neighbouring routers and route packets using IP addresses of routers instead of MAC addresses ?
Why not redesign the architecture, to only use IP address for routing as well as moving the packet in the data link layer too ?
Why do we need MAC addresses?" Why can't network devices such as the routers just send the packet to the next router using the router's IP address?
Note : I know that MAC address is used to identify the system in a network. But you see the source never knew the MAC address of receiver. All it knew was its IP address and MAC address of next hop.
I'm reading Data Comm and Networking by Forouzan ( Ed 5) and it says that even routers have an IP address. So why use the mac address at all. The router can store the IP address of the source and route it to the next router .
EDIT : The question that I was getting as suggestion to this one does not answer my query. There are multiple counter points and proof that I have presented here which could have been done which is not answered by the one which is suggested. So please read my question before making any assumptions.
What do you think makes more sense: Having one protocol like Ethernet handle all the layer 2 details so that its layer 3 payload doesn't have to care, or force IP, ARP, WoL, IPX, MPLS, SLPP, and dozens more implement it on their own? The whole purpose of OSI layers is that upper layers need not know all the lower layer's details and lower layers need need not support the upper layer's features.
MAC addresses are used for the layer 2 protocol which encapsulates a layer 3 protocol. If all the necessary features were embedded into IP, then you'd be leaving other protocols to re-implement layer 2 routing on their own. This would be wildly inefficient.

What does this IP address stand for in IPv4

Hi I've been learning networks for a while now and I can't seem to find a solution for this doubt online.
I learnt that if you are sending a packet to a class B network's destination 141.49.255.255, it means you are broadcasting the packet to the whole class B network of 141.49.0.0; I also learnt that if you are sending a packet to a class A network's destination address like 10.255.255.255, you are again broadcasting the packet in the network ID 10.0.0.0
My question is, what am I doing when I'm sending a packet to class B 141.255.255.255 ? Am I sending it to a single host or a network?
Forget classes. Everything is being done in CIDR for the last two decades.
Sending to 141.255.255.255/16 is a subnet broadcast to 141.255.0.0/16. The same address may also be
141.255.255.255/8 - subnet broadcast to 141.0.0.0/8
141.255.255.255/6 - unicast(!) to the node 141.255.255.255
Most often, broadcasts are sent to 255.255.255.255. Routers (usually) don't forward broadcasts, so it's limited to the segment anyway.
However, there are some subtle differences: running e.g. 192.168.1.0/24 and 10.1.0.0/16 inside the very same Ethernet segment, a broadcast to 192.168.1.255/24 is physically sent to all nodes (Ethernet broadcast), even the ones with a 10.1.0.0/16 address. These however will the drop and ignore the packet since it doesn't address them.
In contrast, a broadcast to 255.255.255.255 will be sent to all nodes and processed by all nodes.

Transferring data between two computers connected with a switch from a high level language

I'll start with stating that I know very little about networking and the whole OSI model.
My goal is to create a tiny network(for now my laptop and a raspberry Pi) using an unmanaged network switch. On higher layer transmissions(level 3+) I would simply set the destination IP address for a packet. From what I've read on Wikipedia a network switch operates at the data link layer which means it uses MAC addresses.
How does one send data to a device on a local area network when it's connecting with something that only supports MAC addresses. More importantly, how does one do it from a high level language like Java or C#?
TL;DR The the OSI model is about abstraction and programing languages use operating system calls to implement this abstraction. The Rasberry Pi is running a full OS and will send and receive network data addressed to its assigned IP address. You do not need to specify MAC address.
You want to communicate with a Raspberry Pi from your Laptop. To do this you first connect them to the dumb switch and assign both devices an IP address in the same subnet, on physical interfaces connected to the dumb switch. Let say that your laptop's physical ethernet connection is assigned 10.0.0.1/24 and Rasberry Pi's physical ethernet connection is assigned 10.0.0.2/24 (If you do not understand my notation look at CIDR). IP addresses are Layer 3 constructs. Now your application will use an Operating System socket to create a TCP or UDP connection(see UDP java example here) with a layer 4 address (application port). Everything higher than Layer 4 is handled by your application.
Layer 2 and lower is handled by the OS. When your application tries to send data through the socket, the Operating System determines which physical interface to send data from by looking at the destination IP address. This lookup uses the OS Routing Table. Assuming you have a normal routing table, the OS will pick the interface that has ab IP with the same subnet as the destination IP. So if you send data to 10.0.0.2, your OS will send data from 10.0.0.1 because it has the same subnet of 10.0.0. Now the OS has selected an interface, it still does not know what Layer 2 MAC address to send the Layer 3 IP packet to. The main reason the OS does not know this is because IP addresses can change, but Layer 2 MAC addresses should not. Anyhow the OS sends out an ARP request which tries to get the MAC address for an IP address. If the devices are connected properly, the OS gets a MAC address for the desired IP address and begins to send data to that MAC address. The switch (smart or dumb) makes sure the message gets to the desired MAC address. At the receiving end, the OS receives the packet and send the data in the packet to sockets bound to the Layer 4 address (application port).
Side note: it is technically possible to send data to just a MAC address using RAW sockets but it is extremely technical.
Liam Kelly's answer provides great insight on abstraction of data sending. I will try to provide complementary information.
Network switch operation
While most switches operate at data level, there are some that can perform some operation at higher levels:
layer 3: Within the confines of the Ethernet physical layer, a layer-3 switch can perform some or all of the functions normally
performed by a router.
layer 4: [...] capability for network address translation, but then adds some type of load distribution based on TCP sessions.
layer 7: [...] distribute the load based on uniform resource locators (URLs), or by using some installation-specific technique to
recognize application-level transactions.
RAW sockets usage
As already specified, these require fairly advanced programming skills. They are also severely restricted in non-server versions of modern Windows Operating Systems (source) due to security concerns:
TCP data cannot be sent over raw sockets.
UDP datagrams with an invalid source address cannot be sent over raw sockets. The IP source address for any outgoing UDP datagram must
exist on a network interface or the datagram is dropped. This change
was made to limit the ability of malicious code to create
distributed denial-of-service attacks and limits the ability to send
spoofed packets (TCP/IP packets with a forged source IP address).
A call to the bind function with a raw socket for the IPPROTO_TCP protocol is not allowed.
Suggestion
If .NET is a viable option for you, I would take Pcap.Net for a spin, as it allows various operations at packet level using high level programming (including LINQ).

Source MAC address on Ethernet layer 2

My question is pretty basic and my apologized for that. In TCP/IP network, if you take a look on layer 2 (Ethernet), we have some fields like Source and Destination MAC address. If you think in your browser, it is totally understandable that you can type a IP, and then the layer 3 (IP) will know the destination IP, then the frame is sent to layer 2.. However, how the layer two know the destination MAC address ? The source MAC address is stored in the network card and it is understandable but on what part of communication the destination MAC address will be known ?
I will give you the basic of how the routing works. This will clarify your doubts:
1) Assuming that we have a layer 3 IP packet which we want to send to some destination which have a DIP = y.
2) The route lookup in the routing table would give you the next hop and the outgoing interface through which the packet needs to go out.
3) Now we have the next hop. Still we dont have the layer 2 encap information which is the DMAC. Assuming this is the first packet we are sending and we dont have the DMAC vs next hop IP mapping yet.
4) The system will generate a ARP Query which will be broadcasted with a DMAC FF:FF:FF:FF:FF:FF within the subnet. This query asks who has the the next hop IP. (Remember we got this from the route lookup).
5) The router having the next hop IP would respond back to the src router (Unicast reply), thus we would know what is the DMAC that needs to be put on layer 2 header.
6) The L3 packet will now be encapsulated in layer 2 header with the dmac as already known now.
7) Remeber the dip won't change in the layer 3 header. This way the packet would traverse hop by hop. At each hop the layer 2 DMAC would change and not the DIP.
Remember that the MAC address identifies each router uniquely on a hop by hop basis.
Hope that clarifies your doubt.
Your computer determines the destination MAC address of the next hop is typically determined on an Ethernet network by using ARP, an Ethernet broadcast protocol that allows you to ask which device is assigned a particular IP address.
MAC address is resolved using layer-2 devices such as network switches. ARP tables is the tool to resolve IP to MAC.
# arp -an

Why do I see packets that where source and destination are not my IP address

Im new to the networking world and I'm trying to use wireshark to get a hang of how packets are sent from my machine etc. Hence this question might be a dumb one.
When I open the wireshark packet analyzer GUI (on windows 7) there is a source and destination column. It shows packets where source IP is not mine and the destination IP is not mine either. Why is this happening? My network interface card should be receiving and sending only packets addressed to/sent from my IP address, right?
(attaching a screenshot. My IP address is 10.177.255.186)
Thanks.
On a small LAN all packets are generally broadcast to everyone. By broadcast I mean that the data is physically sent to everyone. When received the network interface determines if the packet was sent to you by looking at the address.
Using Wireshark your network interface can be set into promiscuous mode which means that all packets are captured and sent from the network interface to the CPU. This allows programs like Wireshark to record all those packets and not just the ones addressed for your computer.
Edit: However the packets don't have to be sent to all computers. A hub can be used to connect multiple computers together and acts as just a repeater meaning all packets are always sent everywhere (except on the wire where the packet came from). A switch however is similar but smarter.
If three computers A, B and C are connected to a switch and A sends a packet to B then the packet will first arrive at the switch. If the switch knows what wire B is connected to then it will only send it down that wire. If it doesn't know it sends it everywhere and later if B replies to A the switch will figure out what wire B is on. This means that C will generally never get to see any of the messages sent between A and B once the switch knows what wires A and B are on.

Resources