How Web Browsing Works
What happens when you type a URL in a browser?
10,000-foot View
5,000-foot View[*]
- If the requested page is in the browser cache and is still fresh, render the cached content; otherwise:
- DNS lookup to find the IP address of the requested domain name (e.g. dns.google resolves to 8.8.8.8)
- Connect to the web server via a TCP+TLS handshake
- Browser sends an HTTPS request to the server
- Server sends HTTPS response
- Browser renders the HTML
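
The steps above can be sketched with Python's standard `socket` and `ssl` modules. This is a minimal illustration, not how a real browser is implemented: the function names `build_request` and `fetch` are hypothetical, caching and rendering are left out, and HTTP/1.1 with `Connection: close` is assumed for simplicity.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close after responding
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 443) -> bytes:
    """Walk through DNS lookup, TCP handshake, TLS handshake, request, response."""
    # DNS lookup: domain name -> IPv4 address
    ip = socket.gethostbyname(host)
    # TCP handshake
    with socket.create_connection((ip, port), timeout=10) as tcp:
        # TLS handshake; the certificate is checked against `host`
        context = ssl.create_default_context()
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            # HTTPS request
            tls.sendall(build_request(host, path))
            # HTTPS response: read until the server closes the connection
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("example.com")` needs network access and returns the raw response (status line, headers, and HTML body) as bytes.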
DNS Cache Lookup
The IP address is looked up in the following caches, in order; the first cache that has an entry wins:
- Browser
- OS
- Router
- ISP
- If every cache misses: a full DNS resolution starting at the root servers
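
The lookup order above can be sketched as a chain of caches with a fallback. This is a simplified model under stated assumptions: each cache is a plain dictionary, `resolve` and `recursive_lookup` are hypothetical names, entry expiry (TTL) is ignored, and the addresses come from the documentation-only range 192.0.2.0/24.

```python
from typing import Callable, Optional


def resolve(
    name: str,
    caches: list[tuple[str, dict[str, str]]],
    recursive_lookup: Callable[[str], Optional[str]],
) -> tuple[Optional[str], str]:
    """Return (ip, source), checking each cache in order before falling back."""
    for label, cache in caches:
        if name in cache:
            return cache[name], label  # first hit wins
    # Every cache missed: do the full resolution (stubbed out by the caller)
    ip = recursive_lookup(name)
    if ip is not None:
        for _, cache in caches:
            cache[name] = ip  # remember the answer at every level
    return ip, "recursive"


caches = [
    ("browser", {}),
    ("os", {"a.example": "192.0.2.1"}),
    ("router", {}),
    ("isp", {}),
]
print(resolve("a.example", caches, lambda name: None))
```

A second lookup of a name that previously missed every cache is answered by the browser cache, which is why repeat visits skip most of the chain.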
OSI Model
- Physical layer
- Data link layer
- Network layer (IP)
- Transport layer (TCP, UDP) - divides outbound msg into packets; reassembles inbound packets into msg; flow & error control
- Session layer
- Presentation layer
- Application layer (HTTP, FTP) - the protocol that the client and server applications use to communicate end to end
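
The transport layer's job of dividing a message into packets and reassembling them can be illustrated with a toy sketch. The helper names `segment` and `reassemble` are hypothetical, and real TCP numbers bytes rather than whole segments, so treat this as an illustration of the idea only.

```python
def segment(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence number, chunk) pairs of at most `size` bytes."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the message, tolerating out-of-order arrival."""
    return b"".join(chunk for _, chunk in sorted(segments))


packets = segment(b"hello world", 4)
packets.reverse()  # simulate packets arriving out of order
print(reassemble(packets))
```

The sequence numbers are what let the receiver restore the original order regardless of how the packets arrive.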
What about TLS/SSL?
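- TLS doesn't map cleanly onto a single OSI layer[*]; it runs on top of the transport layer (TCP) and underneath the application protocol (HTTP), so it can be thought of as a "Security" layer between them
- TLS and its predecessor, SSL, are both commonly referred to as "SSL"
- Provides the cryptographic protocol that keeps the communication private, tamper-evident, and authenticated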
Transport Layers
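- TCP
  - Communication pattern: request/response
  - Cast: unicast; 1 sender; 1 receiver (point-to-point)
  - "Guaranteed" delivery?: Yes (lost packets are retransmitted)
- UDP
  - Communication pattern: publish/subscribe, among others
  - Cast: broadcast or multicast (unicast is also possible); 1 sender; many receivers/listeners
  - "Guaranteed" delivery?: No
- QUIC
  - Built on top of UDP
  - Reduced latency & bandwidth
  - Implemented in the application rather than the kernel, so it is easier to change
  - Cast: unicast (each connection is between 1 client and 1 server)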
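
The difference between the connection-oriented TCP and the connectionless UDP shows up directly in the socket API. The sketch below sends a payload over the loopback interface with each protocol; the function names `tcp_roundtrip` and `udp_roundtrip` are hypothetical, and on loopback UDP loss is unlikely, so this shows the API shape rather than the delivery guarantees.

```python
import socket


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send payload over a TCP connection on loopback and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        server.settimeout(5)
        # connect() performs the TCP handshake before any data is sent
        with socket.create_connection(server.getsockname(), timeout=5) as client:
            conn, _ = server.accept()
            with conn:
                conn.settimeout(5)
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # signal end of stream
                received = b""
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    received += chunk
                return received


def udp_roundtrip(payload: bytes) -> bytes:
    """Send payload as one UDP datagram on loopback; no handshake, no connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(5)
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(65535)
        return data
```

TCP needs `listen`, `connect`, and `accept` before the first byte moves, while UDP simply addresses each datagram with `sendto`.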
TCP+TLS Handshake - Overview
TLS Details
- Certificates - TLS uses certificates to verify the identity of endpoints.
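- Cipher suite - The set of algorithms used for key exchange, authentication, and encryption/decryption. Diffie–Hellman and RSA are common choices for the key exchange and authentication parts; they involve public & private keys:
- Public Key - shared openly; basically used to encrypt messages to the key's owner (and to verify the owner's signatures)
- Private Key - NOT shared; basically used to decrypt messages (and to create signatures)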
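
The public/private key relationship can be illustrated with textbook RSA on tiny numbers. This is a toy sketch for intuition only: the primes are far too small to be secure, there is no padding, and real TLS uses such keys mainly to authenticate the server and agree on a symmetric session key rather than to encrypt the page content directly. The names `encrypt` and `decrypt` are hypothetical helpers.

```python
# Toy textbook RSA with tiny primes: for illustration only, NOT secure.
p, q = 61, 53
n = p * q                # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)      # can be handed to anyone
private_key = (d, n)     # stays with the owner


def encrypt(message: int, key: tuple[int, int]) -> int:
    """Encrypt an integer smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple[int, int]) -> int:
    """Recover the integer with the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


ciphertext = encrypt(65, public_key)
print(decrypt(ciphertext, private_key))
```

Anyone holding the public key can produce the ciphertext, but only the holder of the private exponent `d` can turn it back into the original number.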
Load Balancing
- Browser requests are usually handled by a "farm" of web servers.
- Load Balancers (LBs) distribute the workload amongst the web servers.
- NAT - Browsers know about only one IP address, but servers with different IP addresses can process the requests. Network Address Translation (NAT) remaps one IP address to another by rewriting the address fields in the packet headers as the packets pass through
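
The address rewriting described above can be sketched as follows. This is a simplified model, assuming a single public address (`PUBLIC_IP`, a hypothetical value from the documentation range 203.0.113.0/24) in front of private backend addresses; the `Packet` class and both functions are illustrative names, and ports and checksums are ignored.

```python
from dataclasses import dataclass, replace

PUBLIC_IP = "203.0.113.10"  # the one address browsers know about (assumed value)


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


def to_backend(packet: Packet, backend_ip: str) -> Packet:
    """Inbound: rewrite the destination from the public IP to the chosen server."""
    return replace(packet, dst=backend_ip)


def to_client(reply: Packet) -> Packet:
    """Outbound: rewrite the source so the reply appears to come from the public IP."""
    return replace(reply, src=PUBLIC_IP)


request = Packet(src="198.51.100.7", dst=PUBLIC_IP, payload=b"GET /")
forwarded = to_backend(request, "10.0.0.2")
reply = to_client(Packet(src="10.0.0.2", dst="198.51.100.7", payload=b"200 OK"))
print(forwarded.dst, reply.src)
```

From the browser's point of view, both directions of the conversation involve only the public address.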
Load Balancing - Types
- Layer 4 load balancing operates at the intermediate transport layer (TCP), which deals with delivery of messages with no regard to the content of the messages. L4 LBs simply forward network packets to and from the upstream server without inspecting the content of the packets. They can make limited routing decisions by inspecting the first few packets in the TCP stream.
- Layer 7 load balancing operates at the high-level application layer (HTTP), which deals with the actual content of each message. L7 LBs can make sophisticated routing decisions based on the content of the message (the URL or a cookie, for example). L7 LBs can also apply optimizations and changes to the content (such as compression and encryption).
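
The contrast between the two types can be sketched as two routing functions. This is an illustrative model, not any particular load balancer's algorithm: `pick_l4` sees only connection-level information (here a hash of the client address and port, one of several possible strategies), while `pick_l7` inspects the request itself (here the URL path). All names and pool labels are hypothetical.

```python
import zlib


def pick_l4(client_ip: str, client_port: int, backends: list[str]) -> str:
    """L4: choose a server from connection data only, never the message content."""
    key = f"{client_ip}:{client_port}".encode("ascii")
    return backends[zlib.crc32(key) % len(backends)]


def pick_l7(path: str, routes: dict[str, str], default_pool: str) -> str:
    """L7: choose a server pool by inspecting the request's URL path."""
    for prefix, pool in routes.items():
        if path.startswith(prefix):
            return pool
    return default_pool


backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
routes = {"/api/": "api-pool", "/static/": "static-pool"}

print(pick_l4("198.51.100.7", 50123, backends))
print(pick_l7("/api/users", routes, "web-pool"))
```

Hashing the connection keeps every packet of one TCP connection on the same server, while path-based routing lets different kinds of requests go to specialized pools.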