What Happens When You Click google.com

A beginner's guide to how the web works

The creation of the internet may well be the single invention that changed the world forever. Just think of how much things have changed since Tim Berners-Lee built on existing technologies to create the World Wide Web (WWW) and changed the way we share information. I personally am grateful for his statement: “This is for everyone.” Says a lot, doesn’t it?

While you use the web both to share and receive information, have you ever wondered how it works? How is it that you click the link to a website, and you get to do all those fun things you normally do, from reading the news, to watching videos of somersaulting cats and the p video that’s so readily available these days (picture me shaking my head)?

The Client/Server System Of The Web

Well, the web operates through a client/server system. The client is you, and by “you” I mean your web browser, like Google Chrome. What the client does is make requests to the server. When you click the link to any of your favourite websites, and it automatically begins to load in Chrome (if you have made it your default browser), you’re making a request to the website’s server. As the client, you’re asking the server to give you the information you need.

The server is the computer, or any other device, where the information you need is stored. The word “server” can mean the hardware (the machine itself) or the software running on it that answers requests. Web servers are typically powerful computers, and they are capable of handling many requests from many clients without crashing. Now that I’ve said it, I guess you can suspect why there are sometimes difficulties accessing your favourite website when you click the link. It could be that many people like you were trying to reach the server at the same time, and it just couldn’t take the high demand. What the owners of the site can do is run many servers to serve their many clients, so that if one is down or busy, your request can be sent to another server and you can still get your precious information. They use something called a load balancer to get this done. The concept of load balancing is beyond the scope of this article, but I thought it would be cool to mention it.

So after the server receives the request from you, the client, it processes it and, depending on the outcome, returns either a file, the output of a program, or an error. The file is normally a web page made of HTML, CSS and JavaScript. The program case is when the server has to run some code to produce the response, if that is how the page was built. The error is just the server telling you that the information you needed wasn’t available, or that it didn’t understand your request, or that something else went wrong.
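To make the request and response cycle concrete, here is a minimal sketch in Python using only the standard library. It starts a toy server on your own machine and then plays the client against it. The names `DemoHandler`, `fetch` and `run_demo`, the page content and the `/missing` path are all made up for illustration; a real website's server is far more involved.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Toy server: answers "/" with a tiny HTML page, anything else with 404."""

    def do_GET(self):
        if self.path == "/":
            body = b"<h1>Hello from the server</h1>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo quiet instead of logging every request


def fetch(host, port, path):
    """Play the client: send a GET request, return (status code, body text)."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()


def run_demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        found = fetch("127.0.0.1", port, "/")
        missing = fetch("127.0.0.1", port, "/missing")
        return found, missing
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns two results: the first request gets the page back with status 200, and the second gets a 404 because the toy server has nothing at that path.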

Uniform Resource Locator (URL)

I remember I mentioned something about clicking links earlier on. Behind every link is something called a Uniform Resource Locator (URL), which helps the client and the server communicate and find exactly what you are looking for. I’m sure the links you type normally look like this:

imaginarywebsite.com/thispage/thatpage/this.. (some might have additional gibberish with question marks and stuff!).

This URL is important because it helps uniquely identify the thing you need to get, otherwise the web wouldn’t even know where to look. Think of it like a librarian in a huge public library. She knows where every book is because there’s a system that organizes the books in such a way that you can find books easily, even if there were 10,000 books in the place. Maybe the books would have a unique number based on their category or shelf or colour.

Let’s break down the anatomy of the URL and properly understand what it means.

The URL is made up of four parts:

  • The protocol (that’s the “https” part)

  • The hostname (the something.com)

  • The port (although it’s normally invisible)

  • The path and file name
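If you would like to see those parts pulled out of a real string, Python's standard `urllib.parse` module can split a URL for you. This is only a sketch: the URL below is a made-up one (the page name and the `?lang=en` bit are my own invention), and I wrote the port in on purpose so it shows up.

```python
from urllib.parse import urlsplit

# Hypothetical URL, invented for illustration; the port is spelled out on purpose.
url = "https://imaginarywebsite.com:443/thispage/thatpage/file.html?lang=en"

parts = urlsplit(url)

print(parts.scheme)    # the protocol: https
print(parts.hostname)  # the hostname: imaginarywebsite.com
print(parts.port)      # the port: 443
print(parts.path)      # the path and file name: /thispage/thatpage/file.html
print(parts.query)     # the extra bit after the question mark: lang=en
```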

The protocol

The protocol is just the set of rules that govern how information is exchanged between the client and the server. Think of it like traffic rules that prevent drivers from killing people on the road. Although there are many protocols to use, HTTP and HTTPS are the most popular on the web. HTTP is short for HyperText Transfer Protocol, while the extra “S” at the end just means it’s a more secure form of HTTP. This security means the data is encrypted as it travels between the client and the server, in both directions. Think of encryption as changing the information to a secret code only the client and the server understand, so if someone intercepts the data along the way, they would not be able to read it. When the data gets to the client, the browser decodes it (changes it back to its original form) and presents it to you.

The hostname

The hostname is the name of the unique location where the information is located. You see, there’s something called an IP address. Think of it as the house address of a server on the internet. It’s a set of numbers that specifies the machine you’re trying to reach. Every publicly reachable server needs one (although several websites can share the same address), and addresses are handed out through Internet Service Providers (ISPs) and hosting providers. For example, this is an IP address for the official WWE website, followed by a colon and a port number (more on ports below):

151.101.2.133:443

Yeah, that’s right. With this system it’ll be really easy to remember your favourite website’s web address, won’t it? I can already picture you struggling to keep all the IP addresses of your 50 favourite websites in your head. And that’s where the Domain Name System (DNS) comes to save the day. The DNS maps each easy-to-remember name to an IP address, so when you type the name, it looks it up and finds the matching IP address, and you don’t need to remember that wwe.com lives at 151.101.2.133!
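You can ask for this lookup yourself. Here is a minimal sketch using Python's standard `socket` module; the `resolve` helper is just a name I made up, and the address you get back for any given site depends on when and where you ask.

```python
import socket


def resolve(hostname):
    """Ask the operating system's resolver for an IPv4 address for this name."""
    return socket.gethostbyname(hostname)
```

With an internet connection, `resolve("wwe.com")` returns an IP address as a string. Large sites usually have several addresses, so don't be surprised if yours differs from the one shown above.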

The port

The port is just a numerical identifier that helps tell where internet traffic should be directed to. It’s just like telling the server “Hey, clients would be coming through that door so stand by it to receive them.” There are many ports, from 0 to 65535, and they serve specific purposes.

Remember I said the port was normally invisible in the list above? That’s because there are default ports for different protocols. If you are using the default port, there is no need to type it in the URL. In the example, I used HTTPS, which listens, by default, on port 443 (notice that the WWE website’s address, which uses the HTTPS protocol, has 443 after the colon). HTTP listens on port 80. The combination of an IP address and a port number helps route the client’s request to the appropriate server for a response.
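Here is a small sketch of that default port rule in Python. The `effective_port` function and the `DEFAULT_PORTS` table are names I made up for illustration, and the table only covers the two protocols discussed in this article.

```python
from urllib.parse import urlsplit

# Default ports for the two protocols covered in this article.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port written in the URL, or the protocol's default if none is given."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("https://wwe.com/"))        # 443
print(effective_port("http://wwe.com/"))         # 80
print(effective_port("http://localhost:8080/"))  # 8080
```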

File path

Finally, the file path and file name give the exact location of the specific information you want to retrieve. You see, making a website is somewhat like organizing files and folders and linking them together. It’s kinda like how I know to check my documents folder, and then the textbooks folder in the documents, and then the anatomy folder in the textbooks, for my anatomy textbooks.
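The folder analogy maps neatly onto code. In this sketch the path is a made-up one modelled on my folders above (the file name `chapter1.html` is invented), and Python's standard `pathlib` walks through it piece by piece.

```python
from pathlib import PurePosixPath

# Hypothetical path modelled on the folders described above.
path = PurePosixPath("/documents/textbooks/anatomy/chapter1.html")

print(path.parts)   # ('/', 'documents', 'textbooks', 'anatomy', 'chapter1.html')
print(path.parent)  # /documents/textbooks/anatomy
print(path.name)    # chapter1.html
```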

HTTP Status Codes

Remember I said the server could also return an error if it has some issues with your request? There are different kinds of errors, and you can tell which kind you got by looking at something called the status code.

Depending on the nature of the server’s response, there are HTTP status codes designated, from 100 to 599. While there are a lot of codes, and I don’t expect you to care to know all of them, they’re grouped under different categories. These categories are the 100 level codes, 200 level codes, all the way to 500 level codes, and although the categories give different information, the codes within each category are related. To give you an idea, I’ll use the first digit of the code to categorize them.
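The first digit trick is easy to sketch in Python. The category labels and the `describe` function below are my own naming, while the official short phrase for each code comes from the standard library's `http.HTTPStatus`.

```python
from http import HTTPStatus

# Category labels keyed by the first digit of the status code.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a one-line description: the code, its phrase and its category."""
    category = CATEGORIES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unrecognised code"
    return f"{code} {phrase} ({category})"


print(describe(200))  # 200 OK (success)
print(describe(404))  # 404 Not Found (client error)
print(describe(503))  # 503 Service Unavailable (server error)
```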

1xx — 100 Level Codes

These codes normally give some information about the status of your request. For example:

  • 100 means continue

  • 101 means it’s switching protocols

2xx — 200 Level Codes

These codes mean your request was successfully received and you got the right response.

  • 200 means OK, here’s what you wanted to see

  • 204 means OK, but there’s no content to send back (the response has no body).

3xx — 300 Level Codes

These codes mean the page you asked for now lives at a different location, and you’re being redirected there.

  • 301 means they moved permanently to their new site

  • 307 means a temporary redirect

  • 308 means a permanent redirect too, but one where the request must be repeated exactly as it was first made

4xx — 400 Level Codes

These codes mean there was an error on your end.

  • 400 means that’s a bad request you made.

  • 403 means you were not allowed to make the request (that means the page is not for you to visit)

  • 404 means the server couldn’t find the page you asked for, probably because you spelt something wrong (isn’t this the most popular code?)

  • 451 means the page is unavailable for legal reasons

5xx — 500 Level Codes

These codes mean an error on the server’s end.

  • 500 means an internal server error.

  • 503 means the service is unavailable.

  • 504 means there’s a gateway timeout.

There it is, the preliminaries on understanding how the web works and what happens behind the scenes. The next time you type google.com (or any other website for that matter), perhaps you will appreciate all that goes into displaying the results you see on your screen. I hope you enjoyed reading this as much as I enjoyed writing it. See you soon!