How Web Browsers Work

A breakdown of how typing a URL into an input field turns into colorful pixels on your screen.

No application on your computer is more important or more frequently used than your browser. Yet most people have no idea how it works.

Of course, you don’t need to know how it works to use it. As for me, I am determined to understand the tools I rely on.

Browser Architecture Overview

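At a high level, the rendering engine goes through four steps to turn a page into pixels: it parses the HTML into a DOM tree, builds a render tree, computes the layout, and paints the result.

Figure: Rendering engine basic flow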

There are lots of different components in a browser, but here are the most important ones.

Figure: Browser architecture

User Interface

The user interface (UI) includes all the elements you directly interact with, such as the address bar, back/forward buttons, home button, bookmarks menu, and settings. It is the visible layer of the browser.

Browser Engine

The browser engine acts as an orchestrator or bridge between the User Interface (UI) and the Rendering Engine. It takes inputs from the UI (like a URL entered in the address bar) and relays commands to the Rendering Engine.

Rendering Engine

The rendering engine is responsible for displaying the requested content. If the content is an HTML document, the rendering engine parses the HTML and CSS and displays the parsed content on the screen.

  • Examples: Blink (used in Chrome and Edge), Gecko (used in Firefox), WebKit (used in Safari).

JavaScript Engine

The JavaScript engine parses and executes JavaScript code. This is what makes dynamic content and interactive features on websites possible.

  • Examples: V8 (used in Chrome and Node.js), SpiderMonkey (used in Firefox), JavaScriptCore (used in Safari).

Networking

The networking layer handles all network communications, such as the HTTP/HTTPS requests that fetch pages and other web resources. It manages aspects like internet protocols, security, and data transfer.
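To make this a little more concrete, here is a minimal sketch of the first thing a networking layer has to do with a URL: split it into a host, a port, and the text of an HTTP request. The function name `build_get_request` is made up for illustration, and the request is far simpler than what a real browser sends (no cookies, no compression headers, no HTTP/2). For an https URL, the browser would also set up an encrypted TLS connection before sending anything.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> tuple[str, int, bytes]:
    """Return (host, port, raw HTTP/1.1 GET request) for a URL.

    Illustrative only: a real browser sends many more headers.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname
    if host is None:
        raise ValueError("URL has no host")

    # Default ports: 80 for http, 443 for https.
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # An empty path means the site root.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return host, port, request
```

Calling `build_get_request("https://hungrimind.com/")` gives the host `hungrimind.com`, port 443, and a request that starts with `GET / HTTP/1.1`.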

UI Backend

The UI backend is responsible for drawing the basic widgets that make up the browser’s user interface, like combo boxes, windows, and buttons. Underneath, it relies on the operating system’s user interface methods.

  • Example: Chrome draws with Skia, an open-source 2D graphics library.

Data Persistence / Storage

The data persistence layer allows the browser to store various kinds of data locally. This includes cookies, the cache, local storage, IndexedDB, and older mechanisms such as WebSQL (now deprecated) and FileSystem.

  • Example: Browsers often use database systems like SQLite for managing this data.
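As a rough idea of what SQLite-backed storage can look like, here is a small sketch of a cookie store using Python's built-in `sqlite3` module. The table layout and the function names (`open_cookie_store`, `set_cookie`, `cookies_for`) are assumptions made for this example, not the schema any real browser uses. Real cookie stores also track expiry, paths, and security flags.

```python
import sqlite3


def open_cookie_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a tiny cookie database. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS cookies (
            host  TEXT NOT NULL,
            name  TEXT NOT NULL,
            value TEXT NOT NULL,
            PRIMARY KEY (host, name)
        )
        """
    )
    return conn


def set_cookie(conn: sqlite3.Connection, host: str, name: str, value: str) -> None:
    # INSERT OR REPLACE overwrites an existing cookie with the same host and name.
    conn.execute(
        "INSERT OR REPLACE INTO cookies (host, name, value) VALUES (?, ?, ?)",
        (host, name, value),
    )
    conn.commit()


def cookies_for(conn: sqlite3.Connection, host: str) -> dict[str, str]:
    """Return every cookie stored for one host as a name -> value mapping."""
    rows = conn.execute(
        "SELECT name, value FROM cookies WHERE host = ?", (host,)
    ).fetchall()
    return dict(rows)
```

Using `":memory:"` keeps the database in RAM, which is handy for experimenting. Passing a file path instead would make the data survive a restart, which is the whole point of a persistence layer.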

DNS

When you type a URL in your browser (like hungrimind.com), your browser cannot use that name directly to reach the website. Behind the scenes, it first needs the IP address of the server that hosts the site.

Figure: DNS lookup

You can think of domain names as contact names, and IP addresses as phone numbers. To connect to any website, you need the phone number of that website.

This is where DNS servers come in.

They’re like the phonebooks of the internet. You query the DNS server, asking for the IP that corresponds to hungrimind.com, and the DNS server replies with the IP.

Now, your browser knows the IP address of the site, and it can connect to it and retrieve the HTML.
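You can try this lookup yourself. The sketch below asks the operating system's resolver for the addresses behind a name, using `socket.getaddrinfo` from Python's standard library. The helper name `resolve` is my own. Browsers usually keep their own DNS cache on top of this, so treat it as an approximation of the idea rather than what a browser literally does.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the system resolver reports for a name."""
    # port=None because we only care about addresses, not a specific service.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4, (ip, port, ...) for IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example (needs network access; the result changes over time):
#   resolve("hungrimind.com")
```

The function raises `socket.gaierror` if the name cannot be resolved, which is roughly the moment a browser would show its "site can't be reached" page.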

DOM Tree

After the browser receives the HTML, it needs to turn it into the page that you see. The first step is to create a DOM tree.

DOM stands for Document Object Model, and it defines the logical structure of the web page. The DOM tree represents the HTML elements as a hierarchy of nodes.

For every HTML element, there is a corresponding element node in the DOM tree.

Figure: DOM tree

The root of the DOM tree is the Document node, and its child is the <html> element. As the parser works through the HTML string, it creates new DOM nodes from the HTML elements and appends them as children in the tree.

For example, given this HTML:

<html>
<h1>Welcome!</h1>
</html>

After parsing, the DOM tree would look like this:

Document
└── html
    └── h1
        └── #text "Welcome!"

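Note that the text “Welcome!” is added to the DOM as a text node, which is a child of the h1 element node. (A real browser would also insert the implied <head> and <body> elements; they are left out here to keep the tree small.)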
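Here is a small sketch of that tree-building process, using `HTMLParser` from Python's standard library. The `Node` and `TreeBuilder` classes are invented for this example. Unlike a browser, this parser does not add implied elements or handle void elements like <br>, so it produces exactly the simplified tree shown above.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, name, text=None):
        self.name = name      # tag name, "#document", or "#text"
        self.text = text      # only set for text nodes
        self.children = []

    def dump(self, depth=0):
        """Return the subtree as a list of indented lines."""
        label = f'{self.name} "{self.text}"' if self.text is not None else self.name
        lines = ["  " * depth + label]
        for child in self.children:
            lines.extend(child.dump(depth + 1))
        return lines


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.document = Node("#document")
        # Stack of elements that are currently open; the last one is
        # the parent for whatever comes next.
        self.open_elements = [self.document]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.open_elements[-1].children.append(node)
        self.open_elements.append(node)

    def handle_endtag(self, tag):
        # Close the most recently opened element with this tag name.
        for i in range(len(self.open_elements) - 1, 0, -1):
            if self.open_elements[i].name == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.open_elements[-1].children.append(Node("#text", data))


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.document
```

Calling `parse(...)` on the HTML above and printing `"\n".join(doc.dump())` shows the same nesting: the document, then html, then h1, then the text node.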

There are no stylesheets, scripts, or assets yet; those come in later.

Render Tree

Before the browser can render the web page, it needs to create a “render tree”. The render tree is what the browser uses when it paints the pixels on the screen.

The render tree is a combination of the DOM and the CSSOM (the CSS Object Model, a tree of the style rules parsed from the page’s CSS). For each node in the DOM, the browser checks whether any CSS rules apply to it. If rules exist, the node is added to the render tree with those styles. If none exist, the node is added with the default style. And if the element has display: none, it is omitted from the render tree entirely.
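The three cases described here fit in a few lines of code. This is a heavily simplified sketch: the DOM is a plain dictionary, the "CSSOM" is a mapping from tag name to declarations, and `DEFAULT_STYLE` and `build_render_tree` are names I made up. Real browsers also deal with selectors, specificity, and inheritance, none of which appear here.

```python
DEFAULT_STYLE = {"display": "block", "color": "black"}


def build_render_tree(dom, rules):
    """Combine a DOM node with matching style rules.

    dom:   {"tag": str, "children": [dom, ...]}
    rules: {tag: {property: value}}  (tag selectors only, a big simplification)
    Returns a render node, or None if the element is not rendered.
    """
    # Start from the default style, then let matching rules override it.
    style = {**DEFAULT_STYLE, **rules.get(dom["tag"], {})}

    # display: none removes the element, and everything inside it.
    if style["display"] == "none":
        return None

    children = []
    for child in dom.get("children", []):
        rendered = build_render_tree(child, rules)
        if rendered is not None:
            children.append(rendered)

    return {"tag": dom["tag"], "style": style, "children": children}
```

Note that skipping a display: none element also skips its children, which is why hidden elements cost nothing in the later layout and paint steps.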

Now we have a tree with all the information needed for rendering. The next step, which starts at the root of the tree and works its way down, is to determine the layout and the size of each element.

Layout

Now that the render tree has been constructed, the browser needs to create a layout. The purpose of the layout is to figure out the position of everything: where the header is, where the footer is, and how big each element should be.

During this step, the browser calculates the coordinates of each element. The process starts with the root “block” element, typically the <html> element, and sets its x and y coordinates to (0, 0), which is the top left corner.

It then works through each remaining element in the render tree, one by one.
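To show the idea, here is a toy layout pass. It assumes the simplest possible rules: every box is as wide as its parent, children are stacked top to bottom, and leaf boxes come with a known height. The `Box` class and `layout` function are hypothetical, and real layout (inline text, flexbox, grid, margins) is far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    height: int = 0  # known up front for leaves; computed for containers
    children: list["Box"] = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0


def layout(box: Box, x: int, y: int, width: int) -> int:
    """Position a box at (x, y), lay out its children, and return its height."""
    box.x, box.y, box.width = x, y, width
    if box.children:
        cursor = y  # where the next child starts
        for child in box.children:
            cursor += layout(child, x, cursor, width)
        # A container is as tall as its stacked children.
        box.height = cursor - y
    return box.height
```

For a page with an 80px header, a main area, and a footer, calling `layout(page, 0, 0, 800)` places the root at (0, 0) and pushes each following element down by the height of everything above it.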

The last step of the process is painting.

Paint

Painting is where the browser walks through the laid-out elements and fills in the pixels that each one should display.

This is where the rendering engine and the operating system meet. The rendering engine sends drawing instructions through the operating system’s graphics APIs, so the details depend on the platform. On Windows this is typically done with DirectX, and on a Mac with Metal.
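Talking to a GPU is beyond a short example, but the core idea of painting can be sketched without one. The snippet below "paints" a list of rectangles onto a grid of characters instead of real pixels. The `paint` function and the `(x, y, width, height, fill)` tuple format are assumptions for illustration only.

```python
def paint(display_list, width, height, background="."):
    """Rasterize rectangles onto a character grid.

    display_list: iterable of (x, y, w, h, fill) tuples, painted in order,
    so later rectangles draw on top of earlier ones.
    Returns the canvas as a list of strings, one per row.
    """
    canvas = [[background] * width for _ in range(height)]
    for x, y, w, h, fill in display_list:
        # Clip each rectangle to the canvas so nothing is drawn off screen.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                canvas[row][col] = fill
    return ["".join(row) for row in canvas]
```

The order of the list matters: an element painted later covers whatever was painted before it, which is the same reason a popup appears on top of the page behind it.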

The journey from a simple text input to a full-featured visual webpage is a long one. Hopefully, this process will feel much less magical and mystical the next time you use your browser.
