Given that so much of software engineering happens on web servers and clients, one of the most immediately valuable areas of computer science is computer networking. With the advent of the World Wide Web, the global Internet has rapidly become the dominant type of computer network. It now enables people around the world to use the Web for e-commerce and interactive entertainment applications, in addition to email and IP telephony. As a result, the study of computer networking has become almost synonymous with the study of the Internet and its applications.
Recently, I finished an online Stanford course called “Introduction to Computer Networking.” The course focuses on explaining how the Internet works, ranging from how bits are modulated on wires and in wireless to application-level protocols like BitTorrent and HTTP. It also explains the principles of how to design networks and network protocols. I want to share a small bit of the knowledge I acquired from the course here.
The easiest way to understand computer networks is by comparison with computers themselves. Computers are general-purpose machines that mean different things to different people. Some of us just want to do basic tasks like word processing or chatting with Facebook friends and couldn’t care less how that happens under the covers. At the opposite end of the spectrum, some of us like modifying our computers to run faster, fitting quicker processors or more memory, or whatever it might be; for geeks, poking around inside computers is an end in itself. Somewhere in between these extremes are moderately tech-savvy people who use computers to do everyday jobs with a reasonable understanding of how their machines work. Because computers mean different things to different people, it helps to think of them as a stack of layers: hardware at the bottom, the operating system somewhere on top of that, then applications running at the highest level. You can “engage” with a computer at any of these levels without necessarily thinking about any of the other layers. Nevertheless, each layer is made possible by things happening at lower levels, whether you’re aware of that or not. Things that happen at the higher levels could be carried out in many different ways at the lower levels; for example, you can use a web browser like Chrome (an application) on many different operating systems, and you can run various operating systems on a particular laptop, even though the hardware doesn’t change at all.
Computer networks are similar: we all have different ideas about them and care more or less about what they’re doing and why. If you work in a small office with your computer hooked up to other people’s machines and shared printers, probably all you care about is that you can send emails to your colleagues and print out your stuff; you’re not bothered how that actually happens. But if you’re charged with setting up the network in the first place, you have to consider things like how it’s physically linked together, what sort of cables you’re using and how long they can be, what the MAC (media access control) addresses are, and all kinds of other nitty-gritty. Again, just like with computers, we can think about a network in terms of its different layers — and there are 2 popular ways of doing that.
- The OSI (Open Systems Interconnection) model describes a computer network as a stack of 7 layers. It was conceived as a way of making all kinds of different computers and networks talk to one another, which was a major problem back in the 60s, 70s, and 80s — when virtually all computing hardware was proprietary and one manufacturer’s equipment seldom worked with anyone else’s.
- If you’ve never heard of the OSI model, that’s quite probably because a different way of hooking up the world’s computers triumphed over it, delivering the amazing computer network you’re using right now: the Internet. The Internet is based on a 2-part networking system called TCP/IP in which computers hook up over networks (using what’s called the Transmission Control Protocol) to exchange information in packets (using the Internet Protocol).
While the OSI model is quite an abstract and academic concept, rarely encountered outside books and articles about computer networking, the TCP/IP model is a simpler, easier-to-understand, and more practical proposition: it’s the bedrock of the Internet — and the very technology you’re using to read these words now. We can understand TCP/IP using 4 simpler layers, described in detail below:
1 — Link Layer:
The Internet is made up of end-hosts, links, and routers. Data is delivered in packets, hop by hop over each link in turn. A packet consists of the data we want to be delivered, along with a header that tells the network where the packet is to be delivered, where it came from, and so on.
The Link layer’s job is to carry the data over one link at a time. You have probably heard of Ethernet and WiFi — these are 2 examples of different Link layers.
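To make the idea of “data plus a header” concrete, here is a minimal sketch in Python of a toy packet format. The format itself (1-byte addresses, a 2-byte length field) is invented purely for illustration and is not any real protocol:

```python
import struct

# A toy packet format (illustrative only, not a real protocol):
# header = 1-byte source address, 1-byte destination address,
#          2-byte payload length (big-endian), followed by the payload.
HEADER_FORMAT = "!BBH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 bytes

def make_packet(src: int, dst: int, payload: bytes) -> bytes:
    """Prepend a header telling the network where the packet goes."""
    return struct.pack(HEADER_FORMAT, src, dst, len(payload)) + payload

def parse_packet(packet: bytes):
    """Split a packet back into its header fields and payload."""
    src, dst, length = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    return src, dst, packet[HEADER_SIZE:HEADER_SIZE + length]

pkt = make_packet(src=1, dst=2, payload=b"hello")
print(parse_packet(pkt))  # (1, 2, b'hello')
```

Real link-layer frames (Ethernet, WiFi) follow exactly this pattern, just with richer headers: addresses, type fields, and checksums.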
2 — Network Layer:
The most important layer is the Network layer. It delivers packets end-to-end across the Internet from the source to the destination. The packet is a basic building block of networks: a self-contained collection of data, plus a header that describes what the data is, where it is going, and where it came from.
Network layer packets are called datagrams. They consist of some data and a header containing the “To” and “From” addresses — just like we put the “To:” and “From:” addresses on a letter. The Network Layer hands the datagram to the Link Layer below, telling it to send the datagram over the first link. In other words, the Link Layer is providing a service to the Network Layer. Essentially, the Link Layer says: “if you give me a datagram to send, I will transmit it over one link for you.”
At the other end of the link is a router. The Link Layer of the router accepts the datagram from the link, and hands it up to the Network Layer in the router. The Network Layer on the router examines the destination address of the datagram, and is responsible for routing the datagram one hop at a time towards its eventual destination. It does this by sending to the Link Layer again, to carry it over the next link. And so on until it reaches the Network Layer at the destination.
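The hop-by-hop forwarding described above can be sketched in a few lines of Python. The router names, the host name, and the forwarding tables below are all invented for illustration; real routers build their tables with routing protocols and match on address prefixes rather than exact names:

```python
# Toy hop-by-hop forwarding: each router consults its own table to find
# the next hop for the datagram's destination (all names are invented).
tables = {
    "R1": {"H2": "R2"},   # at router R1, datagrams for H2 go next to R2
    "R2": {"H2": "H2"},   # at router R2, deliver directly to host H2
}

def route(destination: str, first_router: str) -> list:
    """Follow a datagram one hop at a time until it reaches destination."""
    path, node = [first_router], first_router
    while node != destination:
        node = tables[node][destination]   # look up the next hop
        path.append(node)
    return path

print(route("H2", "R1"))  # ['R1', 'R2', 'H2']
```

The key point the sketch captures: no router knows the whole path. Each one only knows the next hop, and the datagram still arrives.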
Notice that the Network Layer does not need to concern itself with *how* the Link Layer sends the datagram over the link. In fact, different Link Layers work in very different ways; Ethernet and WiFi are clearly very different. This separation of concerns between the Network Layer and the Link Layer allows each to focus on its job, without worrying about how the other layer works. It also means that a single Network Layer has a common way to talk to many different Link Layers by simply handing them datagrams to send. This separation of concerns is made possible by the modularity of each layer and a common well-defined API to the layer below.
On the Internet, the Network Layer is special: when we send packets into the Internet, we must use the Internet Protocol. It is the Internet Protocol, or IP, that holds the Internet together. IP provides a deliberately simple, dumb, minimal service with four main characteristics: it sends datagrams hop by hop across the Internet; the service is unreliable; it is best-effort; and it keeps no per-flow state, which makes the protocol connectionless.
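Every IP datagram carries those “To” and “From” addresses in a fixed-format header. As a sketch, here is how the fields of a (simplified) 20-byte IPv4 header with no options can be unpacked in Python; the sample header is hand-built for illustration, with the checksum left as zero:

```python
import socket
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"  # 20-byte IPv4 header, no options

def parse_ipv4_header(raw: bytes):
    """Pull the 'To' and 'From' addresses (and a few other fields)
    out of an IPv4 datagram header."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack(IPV4_HEADER, raw[:20])
    return {
        "version": ver_ihl >> 4,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": proto,          # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample: version 4 / header length 5 (0x45), TTL 64,
# protocol 6 (TCP), from 10.0.0.1 to 10.0.0.2, checksum zeroed.
sample = struct.pack(IPV4_HEADER, 0x45, 0, 40, 0, 0, 64, 6, 0,
                     socket.inet_aton("10.0.0.1"),
                     socket.inet_aton("10.0.0.2"))
fields = parse_ipv4_header(sample)
print(fields["src"], "->", fields["dst"])  # 10.0.0.1 -> 10.0.0.2
```

Notice how little the header promises: addresses, a protocol number, and a TTL. Everything about reliability is left to the layer above.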
3 — Transport Layer:
The most common Transport Layer is TCP, or the Transmission Control Protocol.
TCP makes sure that data sent by an application at one end of the Internet is correctly delivered, in the right order, to the application at the other end of the Internet. If the Network Layer drops some datagrams, TCP will retransmit them, multiple times if need be. If the Network Layer delivers them out of order, perhaps because two packets follow different paths to their destination, TCP will put the data back into the right order again.
Applications such as a web client, or an email client, find TCP very useful indeed. By employing TCP to make sure data is delivered correctly, they don’t have to worry about implementing all of the mechanisms inside the application. They can take advantage of the huge effort that developers put into correctly implementing TCP, and reuse it to deliver data correctly. Reuse is another big advantage of layering.
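This reuse is exactly what the standard sockets API offers. As a sketch, the Python program below runs a tiny TCP echo server on localhost and connects a client to it; neither side implements retransmission or reordering, because TCP beneath them already does:

```python
import socket
import threading

# A tiny TCP echo server on localhost: TCP delivers the bytes we send
# reliably and in order, so the echo comes back exactly as sent.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def echo_once():
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)             # send it straight back

threading.Thread(target=echo_once).start()

client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"hello, TCP")
reply = client.recv(1024)
client.close()
server.close()
print(reply)  # b'hello, TCP'
```

All the hard work — sequence numbers, acknowledgments, retransmission timers — happens inside the operating system’s TCP implementation, invisible to this code.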
But not all applications need data to be delivered correctly. For example, if a video conference application is sending a snippet of video in a packet, there may be no point waiting for the packet to be retransmitted multiple times; better to just move on. Some applications just don’t need the TCP service.
If an application doesn’t need reliable delivery, it can use the much simpler UDP — the User Datagram Protocol — instead. UDP just bundles up application data and hands it to the Network Layer for delivery to the other end. UDP offers no delivery guarantees.
In other words, an Application has the choice of at least two different Transport Layer services: TCP and UDP. There are in fact many other choices too, but these are the most commonly used transport layer services.
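The contrast with the TCP example is visible in code. Below is a sketch of the same sockets API used in UDP mode: one `sendto`, one `recvfrom`, no connection. Over the loopback interface the datagram arrives; over a real network, nothing would tell the sender if it didn’t:

```python
import socket

# UDP just wraps application data in a datagram and hands it to the
# Network Layer: no connection, no delivery guarantee.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"a snippet of video", ("127.0.0.1", port))

# On loopback this datagram arrives; across the Internet it might not,
# and neither side would be told.
data, _addr = receiver.recvfrom(1024)
sender.close()
receiver.close()
print(data)  # b'a snippet of video'
```

For a video conference, this trade is often worth it: a lost snippet is stale by the time a retransmission could arrive anyway.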
4 — Application Layer:
There are of course many thousands of applications that use the Internet. While each application is different, it can reuse the Transport Layer by using the well-defined API from the Application Layer to the TCP or UDP service beneath.
Applications typically want a bi-directional, reliable byte stream between two endpoints. They can send whatever bytes they want over it, and each application has a protocol of its own that defines the syntax and semantics of the data flowing between the two endpoints.
For example, when a web client requests a page from a web server, the web client sends a GET request. This is one of the commands of the Hypertext Transfer Protocol, or HTTP. HTTP dictates that the GET command is sent as an ASCII string, along with the URL of the page being requested. As far as the Application Layer is concerned, the GET request is sent directly to its peer at the other end — the web server Application. The Application doesn’t need to know how it got there, or how many times it needed to be retransmitted. At the web client, the Application Layer hands the GET request to the TCP layer, which provides the service of making sure it is reliably delivered. It does this using the services of the Network Layer, which in turn uses the services of the Link Layer.
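You can see the GET request as literal ASCII text by writing it onto a TCP socket yourself. The sketch below starts a minimal local web server (using Python’s standard `http.server`, purely so there is something to talk to) and sends a raw HTTP GET by hand:

```python
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

# A minimal local web server, so we can watch an HTTP GET on the wire.
class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hi"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):      # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Hello)
port = server.server_address[1]
threading.Thread(target=server.handle_request).start()

# The GET request is just ASCII text handed to TCP for reliable delivery.
sock = socket.create_connection(("127.0.0.1", port))
sock.sendall(b"GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n")
response = b""
while chunk := sock.recv(4096):
    response += chunk
sock.close()
server.server_close()
print(response.split(b"\r\n")[0])  # the status line, e.g. b'HTTP/1.0 200 OK'
```

Everything the client sends and receives here is plain text defined by HTTP; TCP underneath neither knows nor cares that it is carrying a web request.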
Putting it all together
Network engineers find it convenient to arrange all the functions that make up the Internet into Layers. At the top is the Application, such as BitTorrent or Skype or the world wide web, which talks to its peer-layer at the destination. When the application has data to send, it hands the data to the Transport layer, which has the job of delivering the data reliably to the other end. The Transport Layer sends data to the other end by handing it to the Network Layer, which has the job of breaking the data into packets, each with the correct destination address. Finally, the packets are handed to the Link Layer, which has the responsibility of delivering the packet from one hop to the next along its path. The data makes its way, hop by hop, from one router to the next. The Network Layer forwards it to the next router, one at a time, until it reaches the destination. There, the data is passed up the layers, until it reaches the Application.
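The journey down the stack described above is called encapsulation: each layer wraps the data from the layer above with its own header. The sketch below is a deliberately cartoonish Python rendering of that idea — the bracketed “headers” are stand-ins for the real TCP, IP, and Ethernet headers:

```python
# A toy sketch of encapsulation: each layer wraps the data handed down
# from the layer above with its own header (all headers are stand-ins).
def app_layer(path: str) -> bytes:
    return b"GET " + path.encode()            # application protocol (HTTP-like)

def transport_layer(segment: bytes) -> bytes:
    return b"[TCP ports]" + segment           # adds port numbers, seq numbers

def network_layer(datagram: bytes) -> bytes:
    return b"[IP to/from]" + datagram         # adds IP addresses

def link_layer(frame: bytes) -> bytes:
    return b"[Eth MACs]" + frame              # adds MAC addresses

wire = link_layer(network_layer(transport_layer(app_layer("/index.html"))))
print(wire)
# b'[Eth MACs][IP to/from][TCP ports]GET /index.html'
```

At the destination, the same nesting is unwrapped in reverse: the Link Layer strips its header and passes the rest up, then the Network Layer, then Transport, until the application sees only its own data.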
Network engineers are responsible for implementing, maintaining, supporting, developing and, in some cases, designing communication networks within an organization or between organizations. Their goal is to ensure the integrity of a highly available network infrastructure that provides maximum performance for its users. A fundamental understanding of concepts such as TCP/IP is absolutely required if you want to become one.
I highly recommend taking the Stanford course if you want to learn more about computer networking. You can get all the course’s lecture slides from my GitHub repository here. Good luck studying!