Chapter 1.1.1 - What is the Internet?
Before we delve into complicated questions, we need to answer a more fundamental one: what is the Internet? Generally speaking, it's a network of connected computers just like yours. They're connected in various ways: some with optical fiber (which carries data as pulses of light), some with cellular networks, some with Wi-Fi, etc. What unites them is the languages they speak: protocols for exchanging information. The architecture of the Internet is very complicated, and we will not go in depth on these topics (save that for an advanced undergraduate course). However, we will provide a friendly overview of the components that we believe are crucial to know before you embark on the journey of building your own website.
IP addresses
In most cases, each computer on the Internet has its own address, called an IP address. If you're on a shared Wi-Fi network, then your computer typically shares one public IP address with the other computers connected to the same router, although inside that network each computer still has its own unique local address. Some IP addresses change over time: that's called a dynamic IP address. Others stay the same: that's a static IP address, which is what servers hosting websites usually have. What's the difference? There's no fundamental difference other than the fact that dynamic addresses change. That happens because your IP address is assigned to you by your Internet provider, and sometimes providers recycle addresses for internal purposes.
Client vs. server
There are two main participants in the exchange of information via the Internet: the client and the server. The client is the computer that sends a request for some information to the server, and the server is the computer in charge of returning the requested information to the client. For example, when you go to https://apple.com, you are the client and you are sending a request to one of Apple's servers.
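To make the two roles concrete, here is a minimal sketch that runs both a server and a client on your own computer using Python's standard library. The names `HelloHandler` and `run_demo` are made up for this illustration, and the address 127.0.0.1 simply means "this computer". The port number is picked by the operating system (ports are covered later in this chapter).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a short text message."""

    def do_GET(self):
        body = b"Hello from the server!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet by not printing an access log.
        pass


def run_demo():
    """Start a local server, act as its client once, and return the reply."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: send a request and read the server's response.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        status, text = response.status, response.read().decode()
        conn.close()
        return status, text
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Here the same machine plays both parts, but the exchange is the same one that happens between your browser and a remote server: the client asks, the server answers.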
A note on IP addresses: there are two versions currently in use, IPv4 and IPv6. For the purposes of this course, we will only deal with IPv4 addresses. An IPv4 address is written as four numbers, each from 0 to 255, separated by dots (e.g., 192.168.0.1). Why 255? That is the maximum value that a single byte can hold.
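As a small illustration of the "four bytes" idea, the sketch below uses Python's standard `ipaddress` module to split an IPv4 address into its four byte values and to reject text that is not a valid address. The helper names `ipv4_bytes` and `is_valid_ipv4` are made up for this example.

```python
import ipaddress


def ipv4_bytes(text):
    """Return the four byte values (each 0..255) of an IPv4 address."""
    address = ipaddress.IPv4Address(text)  # raises ValueError if invalid
    return list(address.packed)


def is_valid_ipv4(text):
    """Return True if the text is a well-formed IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    print(ipv4_bytes("192.168.0.1"))     # [192, 168, 0, 1]
    print(is_valid_ipv4("192.168.0.1"))  # True
    print(is_valid_ipv4("256.1.1.1"))    # False: 256 does not fit in one byte
    print(2**8 - 1)                      # 255, the largest value of one byte
```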
Server ports
Think of servers as office buildings. To reach a particular office, you would first need to know the address of its office building (the IP address). Then, you'd need to know the office number. For servers, the equivalent of an office number is a port number: a whole number ranging from 0 to 65535. Why 65535? That is the maximum value that two bytes can hold (though don't worry about this, and you certainly don't need to memorize that number). There are some conventions for port numbers; here are just a few:
- 21 is for FTP servers (File Transfer Protocol)
- 80 is for HTTP servers (Hypertext Transfer Protocol; explained below)
- 443 is for HTTPS servers (HTTP Secure; explained below)
80 and 443 are so common that you don't actually need to type them in your browser: simply use the IP address. For other ports, you would need to write the IP address followed by a colon and the port number (e.g., 192.168.0.1:21). The website you will build will be available at ports 80 and 443 (unless you don't want others to be able to find it).
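The address-plus-port notation can be sketched in a few lines of Python. This is only an illustration: `split_host_port` and `DEFAULT_PORTS` are made-up names, and the function handles only IPv4 addresses and plain host names. It fills in the conventional port when none is written and checks that the port fits in two bytes (at most 2**16 - 1 = 65535).

```python
# Conventional ports for a few protocols (from the list above).
DEFAULT_PORTS = {"ftp": 21, "http": 80, "https": 443}


def split_host_port(text, scheme="http"):
    """Split 'host:port' into (host, port), using the default port if omitted."""
    host, separator, port_text = text.partition(":")
    if not separator:
        # No colon: fall back to the conventional port for the protocol.
        return host, DEFAULT_PORTS[scheme]
    port = int(port_text)
    if not 0 <= port <= 65535:
        raise ValueError(f"port {port} does not fit in two bytes")
    return host, port


if __name__ == "__main__":
    print(split_host_port("192.168.0.1:21"))          # ('192.168.0.1', 21)
    print(split_host_port("192.168.0.1"))             # ('192.168.0.1', 80)
    print(split_host_port("192.168.0.1", "https"))    # ('192.168.0.1', 443)
```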
HTTP/HTTPS
HTTP is the protocol used by practically every website you have ever visited. It is a standard protocol that your computer uses to form a request to the server and that the server uses to compile the necessary information and send it back to you. For example, if you visit https://apple.com, your browser would create something like the HTTP request below and send it to Apple (the details will differ from computer to computer). Note that it contains my operating system, my language, and other metadata that allow Apple to customize their response to me.
GET / HTTP/2
Host: www.apple.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:88.0) Gecko/20100101 Firefox/88.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
DNT: 1
Connection: keep-alive
Cookie: ...
Upgrade-Insecure-Requests: 1
Sec-GPC: 1
In response, Apple would send back the HTML file for https://apple.com. What is HTML? You will learn that in Chapter 2 :)
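In the textual form shown above, a request is just structured text: a first line followed by "Name: value" header lines. As a rough sketch (not how a real server is implemented), the hypothetical function `parse_request` below pulls those pieces apart. `SAMPLE` is a shortened copy of the request above.

```python
def parse_request(raw):
    """Split a textual HTTP request into method, path, version, and headers."""
    lines = raw.strip().splitlines()
    # The first line looks like: GET / HTTP/2
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line marks the end of the headers
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, path, version, headers


# A shortened copy of the request shown above.
SAMPLE = """GET / HTTP/2
Host: www.apple.com
Accept-Language: en-US,en;q=0.5
"""

if __name__ == "__main__":
    method, path, version, headers = parse_request(SAMPLE)
    print(method, path, version)       # GET / HTTP/2
    print(headers["Accept-Language"])  # en-US,en;q=0.5
```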
You don't need to know what each line in the above request means, except the first one: it begins with the method, that is, the kind of HTTP request you are sending. There are quite a few possible methods. Here are the most common ones:
- GET: used to retrieve information
- POST: used to post/update information (usually sent with some payload, i.e., the data that you want to update)
- DELETE: used to delete some information
- OPTIONS: used to retrieve available HTTP methods (for example, not all servers support POST or even GET)
For the purposes of this course you will likely only use GET. If you were to build a complicated server with a database, you would need to support more methods. When you open a website by typing its address or clicking a link, your browser sends a GET request. Other methods are usually sent by a form on a page or from code (you will learn JavaScript in Chapter 6).
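To show what "sending it in code" can look like, here is a small sketch in Python (the course itself uses JavaScript for this later). It only builds request objects with the standard `urllib.request.Request` class and never sends them. The URLs and the helper name `build_request` are placeholders invented for this example.

```python
import urllib.request


def build_request(url, method="GET", payload=None):
    """Build (but do not send) an HTTP request with the given method."""
    # A payload is the data that travels with the request, e.g. for POST.
    data = payload.encode("utf-8") if payload is not None else None
    return urllib.request.Request(url, data=data, method=method)


if __name__ == "__main__":
    # example.com is a domain reserved for documentation and examples.
    get_request = build_request("https://example.com/")
    post_request = build_request("https://example.com/notes", "POST", "title=Hello")

    print(get_request.get_method())    # GET
    print(post_request.get_method())   # POST
    print(post_request.data)           # b'title=Hello'
```

Actually sending a request would take one more call (for example `urllib.request.urlopen`), which is left out here so the sketch works without a network connection.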
HTTPS is just a secure version of HTTP. Essentially, before sending the HTTP request above, your browser encrypts it, and the server decrypts it. That way anyone in between (for example, your Internet provider) cannot read what you are sending and receiving. How exactly that works is beyond the scope of this class, but the encryption behind it was designed by some brilliant people, has been studied extensively, and is widely considered secure when set up correctly.
DNS
The final important concept for understanding websites is DNS, the Domain Name System. As we mentioned above, each computer connected to the Internet has an IP address, but people are not good at memorizing four arbitrary-looking numbers of up to three digits each. So the founders of the Internet created DNS: a way to map easily memorable domain names like apple.com to less memorable addresses like 17.253.144.10 (which is the real address behind that domain, at least for my computer). There are many ways to configure DNS when you get your own domain, but here are the two most relevant record types:
- A: maps a domain name to an IPv4 address
- CNAME: makes one domain name an alias for another (the canonical name), so lookups for the alias are answered with the records of that other name
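The difference between the two record types can be sketched with a toy lookup table. This is an in-memory illustration only, not a real DNS client: the `RECORDS` table, the `resolve` function, and the address 192.0.2.10 (taken from a range reserved for documentation) are all assumptions made for the example.

```python
# A toy record table: each name has either an A record or a CNAME record.
RECORDS = {
    "example.com": ("A", "192.0.2.10"),
    "www.example.com": ("CNAME", "example.com"),
}


def resolve(name, records=RECORDS, max_hops=5):
    """Follow CNAME aliases until an A record gives an IP address."""
    for _ in range(max_hops):
        record = records.get(name)
        if record is None:
            raise KeyError(f"no record for {name}")
        kind, value = record
        if kind == "A":
            return value  # found the address
        name = value      # CNAME: look up the canonical name next
    raise RuntimeError("too many CNAME hops")


if __name__ == "__main__":
    print(resolve("example.com"))      # 192.0.2.10
    print(resolve("www.example.com"))  # 192.0.2.10 (via the alias)
```

The alias never stores an address itself. If the address behind example.com changed in this table, www.example.com would follow automatically, which is the main reason to use a CNAME.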