How Browsers Work

Yuyang 前端小白🥬

Users want web experiences that load quickly and respond smoothly to interaction. These are the two goals we want to achieve.

To better achieve these goals, we need to understand how browsers work.

How Browsers Work

Two major issues in web performance are network latency and the fact that, for the most part, browsers are single-threaded.

Navigation is the process of loading a web page. It involves the following steps:

  1. DNS lookup
    The first step in the navigation process is to look up the IP address of the server that hosts the website. This is done using the Domain Name System (DNS). If you navigate to https://example.com, the HTML page is located on a server with an IP address such as 93.184.216.34. If you’ve never visited this site, a DNS lookup must happen.
    After this initial request, the IP address will likely be cached for a time, which speeds up subsequent requests by retrieving the address from the cache instead of contacting a name server again.

DNS lookups can be problematic for performance, particularly on mobile networks, because a lookup is needed for every unique hostname the page references. When a user is on a mobile network, each DNS lookup has to go from the phone to the cell tower to reach an authoritative DNS server. The distance between a phone, a cell tower, and the name server can add significant latency.
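The caching behaviour described above can be sketched as a small lookup table with an expiry time. This is a minimal illustration, not how any particular browser or operating system implements its cache: the `DnsCache` class, the fixed 300-second TTL, and the `fake_lookup` stand-in for a real name-server query are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class DnsCache:
    """Toy resolver cache: hostname -> (ip, expiry time)."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup          # slow path: ask a name server
        self._ttl = ttl_seconds        # hypothetical fixed time-to-live
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.network_lookups = 0       # how many times the slow path was taken

    def resolve(self, hostname: str) -> str:
        now = self._clock()
        cached = self._entries.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]           # cache hit: no network round trip
        self.network_lookups += 1      # cache miss or expired entry
        ip = self._lookup(hostname)
        self._entries[hostname] = (ip, now + self._ttl)
        return ip


def fake_lookup(hostname: str) -> str:
    # Stand-in for a real name-server query; the address is the one quoted in the text.
    return {"example.com": "93.184.216.34"}[hostname]


cache = DnsCache(fake_lookup)
cache.resolve("example.com")   # first visit: goes to the "network"
cache.resolve("example.com")   # repeat visit: answered from the cache
```

Only the first call pays the lookup cost; the second is served locally until the entry expires.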

  2. TCP handshake
    Once the browser has the IP address, it can establish a connection to the server. This is done using the Transmission Control Protocol (TCP). The browser sends a SYN packet to the server, which responds with a SYN-ACK packet, and the browser sends an ACK packet back. This is known as the TCP handshake.
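To make the three messages concrete, here is a minimal sketch that models the handshake as data. The sequence and acknowledgement numbers are not mentioned in the text above; they follow standard TCP behaviour, where each side acknowledges the other's initial sequence number plus one. The `Segment` class and `three_way_handshake` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    sender: str
    flags: str      # "SYN", "SYN-ACK", or "ACK"
    seq: int
    ack: int = 0


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged to open a connection."""
    syn = Segment("client", "SYN", seq=client_isn)
    # The server acknowledges the client's SYN by acking client_isn + 1.
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # The client acknowledges the server's SYN in the same way.
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=500):
    print(segment)
```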


  3. TLS negotiation
    If the website uses HTTPS, the browser and server must negotiate a secure connection. This is done using the Transport Layer Security (TLS) protocol. The browser sends a ClientHello message to the server, which responds with a ServerHello message and its certificate. Once the two sides have agreed on encryption keys, they exchange Finished messages. This is known as the TLS handshake.
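The three steps above all cost network round trips before any content arrives. The rough budget below adds them up. It is a back-of-the-envelope sketch under stated assumptions: one round trip for an uncached DNS lookup, one for the TCP handshake, two for a full TLS 1.2 handshake or one for TLS 1.3 (the TLS versions are not named in the text), and one for the HTTP request itself. It also pretends every round trip takes the same time, which is not true in practice.

```python
def round_trips_before_first_byte(tls_version: str = "1.3", dns_cached: bool = False) -> int:
    """Count network round trips spent before the first response byte arrives."""
    tls_round_trips = {"1.2": 2, "1.3": 1}[tls_version]  # full handshakes, no resumption
    dns = 0 if dns_cached else 1      # simplification: one round trip to a resolver
    tcp = 1                           # SYN out, SYN-ACK back; the final ACK needs no extra wait
    http = 1                          # request out, first byte of the response back
    return dns + tcp + tls_round_trips + http


def time_to_first_byte_ms(rtt_ms: float, **kwargs) -> float:
    """Estimate the delay, assuming every round trip costs rtt_ms."""
    return rtt_ms * round_trips_before_first_byte(**kwargs)


print(time_to_first_byte_ms(100, tls_version="1.2"))
print(time_to_first_byte_ms(100, tls_version="1.3", dns_cached=True))
```

Even with this crude model, it is clear why latency matters so much: on a slow mobile connection, several round trips pass before the browser has a single byte of HTML.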

Response

Once we have established a connection to the server, the browser sends an HTTP request for the HTML page. The server responds with the HTML page, which the browser parses and renders.

Congestion control / TCP slow start

The maximum segment size (MSS) for the connection is announced during the TCP handshake. The sender (for a page load, usually the server) does not transmit at full speed right away: it starts with a small number of segments and increases the amount it sends after each acknowledged round trip until it reaches a threshold or detects packet loss. This is known as TCP slow start.
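A small simulation shows why slow start matters for page weight. This is a sketch under simplifying assumptions that are not in the text: an initial window of 10 segments, a 1460-byte MSS, a window that doubles every round trip, and no packet loss or threshold. Real connections are messier.

```python
def round_trips_to_deliver(total_bytes: int, mss: int = 1460, initial_segments: int = 10) -> int:
    """Round trips needed to send total_bytes if the window doubles every round trip."""
    cwnd = initial_segments   # congestion window, measured in segments
    sent = 0
    round_trips = 0
    while sent < total_bytes:
        sent += cwnd * mss    # send one full window, then wait for acknowledgements
        cwnd *= 2             # slow start: the window doubles after each acknowledged round trip
        round_trips += 1
    return round_trips


for size in (14_000, 50_000, 100_000):
    print(size, "bytes ->", round_trips_to_deliver(size), "round trip(s)")
```

Under these assumptions, only about 14 KB fits in the first round trip, which is one reason to keep the initial HTML small.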

Parsing

Once the browser receives the first chunk of data, it can begin parsing the information. Parsing is the step the browser takes to turn the data it receives over the network into a Document Object Model (DOM) tree and a CSS Object Model (CSSOM), which are used to render the page.

The browser will begin parsing and attempting to render the page as soon as it receives the first chunk of data. This is known as incremental rendering.
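The idea of parsing chunk by chunk can be tried out with Python's standard `html.parser` module, which accepts partial input through `feed()`. The sketch below builds a very simplified tree while the markup arrives in three pieces. The `Node` and `TreeBuilder` classes are hypothetical, and a real browser parser handles far more (error recovery, scripts, and so on).

```python
from html.parser import HTMLParser
from typing import List, Optional


class Node:
    def __init__(self, tag: str, parent: Optional["Node"] = None) -> None:
        self.tag = tag
        self.parent = parent
        self.children: List["Node"] = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a simplified DOM-like tree as chunks arrive."""

    VOID_TAGS = {"br", "img", "link", "meta", "input", "hr"}

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self._current = self.root
        self.node_count = 0

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self._current)
        self._current.children.append(node)
        self.node_count += 1
        if tag not in self.VOID_TAGS:
            self._current = node       # descend into the new element

    def handle_endtag(self, tag):
        if self._current.tag == tag and self._current.parent is not None:
            self._current = self._current.parent

    def handle_data(self, data):
        self._current.text += data     # text may arrive split across chunks


chunks = ["<html><body><h1>Hel", "lo</h1><p>Wor", "ld</p></body></html>"]
builder = TreeBuilder()
progress = []
for chunk in chunks:                   # each chunk stands in for one network read
    builder.feed(chunk)
    progress.append(builder.node_count)
builder.close()
print(progress)
```

The node count grows after each chunk, so part of the tree already exists before the document has finished downloading.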

CRP: Critical Rendering Path

Web Performance includes the following:

  • server requests and responses
  • loading
  • scripting
  • rendering
  • layout
  • painting

The CRP is the sequence of steps the browser goes through to convert the HTML, CSS, and JavaScript into pixels on the screen.
A request for a web page or app starts with an HTTP request. The server sends a response containing the HTML. The browser then begins parsing the HTML, converting the received bytes to the DOM tree. The browser initiates requests every time it finds links to external resources, be it stylesheets, scripts, or embedded image references. Some requests are blocking, which means the parsing of the rest of the HTML is halted until the imported asset is handled. The browser continues to parse the HTML making requests and building the DOM, until it gets to the end, at which point it constructs the CSS object model. With the DOM and CSSOM complete, the browser builds the render tree, computing the styles for all the visible content. After the render tree is complete, layout occurs, defining the location and size of all the render tree elements. Once complete, the page is rendered, or ‘painted’ on the screen.
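To see which requests in a page are likely to block, the sketch below scans markup for external resources and labels each one. The labels rest on assumptions the text does not spell out: stylesheets are treated as render-blocking, classic scripts without `async` or `defer` as parser-blocking, and images as non-blocking. Module scripts, media queries, and preloading are ignored. `ResourceScanner` is a hypothetical helper, not a browser API.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class ResourceScanner(HTMLParser):
    """Collects external resources and a rough guess at how each affects the critical path."""

    def __init__(self) -> None:
        super().__init__()
        self.resources: List[Tuple[str, str]] = []   # (url, effect)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "stylesheet" and "href" in a:
            self.resources.append((a["href"], "render-blocking"))
        elif tag == "script" and "src" in a:
            # A classic script without async/defer pauses the HTML parser.
            deferred = "async" in a or "defer" in a
            self.resources.append((a["src"], "non-blocking" if deferred else "parser-blocking"))
        elif tag == "img" and "src" in a:
            self.resources.append((a["src"], "non-blocking"))


page = (
    '<link rel="stylesheet" href="a.css">'
    '<script src="b.js"></script>'
    '<script src="c.js" defer></script>'
    '<img src="d.png">'
)
scanner = ResourceScanner()
scanner.feed(page)
for url, effect in scanner.resources:
    print(url, effect)
```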

Document Object Model (DOM)

DOM construction is incremental.

CSS Object Model (CSSOM)

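Unlike DOM construction, CSSOM construction is not incremental. CSS is render blocking: the browser holds back rendering until it has received and processed all of the CSS, because later rules can override earlier ones.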
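One reason CSS blocks rendering is that a rule near the end of a stylesheet can override one near the start, so a style computed from half the rules may be wrong. The toy cascade below shows this. It is a deliberately small sketch: it supports only tag selectors and source order, and ignores specificity, inheritance, and `!important`.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, Dict[str, str]]     # (tag selector, declarations)


def computed_style(tag: str, rules: List[Rule]) -> Dict[str, str]:
    """Apply matching rules in source order; later declarations win."""
    style: Dict[str, str] = {}
    for selector, declarations in rules:
        if selector == tag:
            style.update(declarations)
    return style


rules: List[Rule] = [
    ("p", {"color": "black", "font-size": "16px"}),
    ("p", {"color": "red"}),          # arrives later, overrides the colour
]
print(computed_style("p", rules[:1]))  # style if only the first rule had arrived
print(computed_style("p", rules))      # style once all rules are known
```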

Render tree

The render tree is the combination of the DOM and CSSOM. It is used to render the page.
To construct the render tree, the browser will:

  • Traverse the DOM tree
  • Match the CSSOM rules to the DOM nodes
  • Apply the CSSOM rules to the DOM nodes
  • Construct the render tree
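The steps above can be sketched as a recursive walk over the DOM. This is a simplified model: the "CSSOM" here is just a dictionary keyed by tag name, and the only visibility rule is that nodes with `display: none` (and their descendants) are left out of the render tree. `DomNode`, `RenderNode`, and `build_render_tree` are hypothetical names for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomNode:
    tag: str
    children: List["DomNode"] = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    style: Dict[str, str]
    children: List["RenderNode"] = field(default_factory=list)


def build_render_tree(node: DomNode, rules: Dict[str, Dict[str, str]]) -> Optional[RenderNode]:
    style = dict(rules.get(node.tag, {}))      # match and apply the rules for this node
    if style.get("display") == "none":
        return None                            # not visible: leave the whole subtree out
    render_node = RenderNode(node.tag, style)
    for child in node.children:                # traverse the DOM tree depth first
        child_render = build_render_tree(child, rules)
        if child_render is not None:
            render_node.children.append(child_render)
    return render_node


dom = DomNode("body", [DomNode("h1"), DomNode("script"), DomNode("p", [DomNode("span")])])
css = {"script": {"display": "none"}, "h1": {"color": "red"}}
tree = build_render_tree(dom, css)
print([child.tag for child in tree.children])
```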

Layout

Layout is the process of determining the size and position of each element on the page. The browser will:

  • Traverse the render tree
  • Calculate the size and position of each element
  • Determine the flow of the page
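A minimal sketch of these steps, assuming the simplest possible flow: every box is a block that takes the full width of its parent and is stacked below the previous sibling. Leaf heights are given up front as hypothetical numbers; real layout also deals with text wrapping, margins, floats, flexbox, and more.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    name: str
    height: int = 0                 # content height for leaf boxes (hypothetical units)
    children: List["Box"] = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0


def layout(box: Box, x: int, y: int, width: int) -> int:
    """Position box at (x, y) with the given width; return its final height."""
    box.x, box.y, box.width = x, y, width
    if box.children:
        cursor = y
        for child in box.children:          # normal block flow: stack top to bottom
            cursor += layout(child, x, cursor, width)
        box.height = cursor - y             # a container is as tall as its content
    return box.height


page = Box("body", children=[
    Box("h1", height=40),
    Box("p", height=100),
    Box("div", children=[Box("img", height=60)]),
])
total_height = layout(page, x=0, y=0, width=800)
for child in page.children:
    print(child.name, child.y, child.height)
```

Note that a child's position depends on the sizes of everything before it, which is why a change near the top of a page can force the rest of the page to be laid out again.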

Painting

Painting is the process of filling in pixels on the screen. The browser will:

  • Traverse the render tree
  • Paint the pixels on the screen
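As a rough picture of what "filling in pixels" means, the sketch below paints rectangles onto a grid of characters, one character per "pixel". Rectangles are painted in order, so later ones cover earlier ones. The flat list of rectangles stands in for the laid-out render tree, and the `paint` function is a hypothetical illustration, not how a browser's graphics pipeline is structured.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int, str]   # x, y, width, height, fill character


def paint(rects: List[Rect], canvas_width: int, canvas_height: int) -> List[str]:
    """Fill a character grid, painting rectangles in order (later ones on top)."""
    canvas = [[" "] * canvas_width for _ in range(canvas_height)]
    for x, y, w, h, fill in rects:
        # Clip each rectangle to the canvas before filling it.
        for row in range(max(y, 0), min(y + h, canvas_height)):
            for col in range(max(x, 0), min(x + w, canvas_width)):
                canvas[row][col] = fill
    return ["".join(row) for row in canvas]


rows = paint([(0, 0, 4, 2, "#"), (2, 1, 3, 2, "o")], canvas_width=6, canvas_height=3)
for row in rows:
    print(row)
```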