What You Will Learn
This blog aims to:
- Give you a landscape view of the browser exploitation realm.
- Give you some pointers on how to get started in browser exploitation.
Prerequisites
Before reading this, it helps to:
- Have some low-level knowledge (e.g. you have previously coded in C/C++).
- Know some JavaScript, which is also a plus.
Why Would I Learn This?
Incentives motivate work
So why would you consider browser hacking, why would you even bother to learn how that black box works?
Let me give you a couple of reasons:
- It puts bread on the table, i.e. you can earn bounties of up to $250,000 doing this kind of stuff 💰.
- It can even help with your day-to-day programming (yes, even while you're writing the next JS framework!). Knowing how things work at the bare-metal level really helps, even with the most abstract languages. Case in point: is there a difference between these two loops, performance-wise?
for (let i = 0; i < n; ++i) {
  for (let j = 0; j < n; ++j) {
    for (let k = 0; k < n; ++k) {
      C[i][j] += A[i][k] * B[k][j];
    }
  }
}

for (let k = 0; k < n; ++k) {
  for (let j = 0; j < n; ++j) {
    for (let i = 0; i < n; ++i) {
      C[i][j] += A[i][k] * B[k][j];
    }
  }
}
- It teaches you how to handle complexity, embrace abstraction, and cope with a high probability of failure.
- It is "aesthetically pleasing" (Alisa Esage). Popping a shell is always fun to see, but there is something about popping one just because someone entered a website in the address bar.
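If you want to try the two loop orders from the list above yourself, here is a minimal sketch. The helper names (makeMatrix, multiplyIJK, multiplyKJI, timeIt) and the matrix size are my own assumptions, and the timings depend heavily on the engine and the machine, so treat the numbers as a starting point for poking around rather than a verdict.

```javascript
// Builds an n x n matrix of doubles filled with one value.
function makeMatrix(n, value) {
  return Array.from({ length: n }, () => new Float64Array(n).fill(value));
}

// i-j-k order: C[i][j] stays fixed in the inner loop while B is walked column-wise.
function multiplyIJK(A, B, n) {
  const C = makeMatrix(n, 0);
  for (let i = 0; i < n; ++i) {
    for (let j = 0; j < n; ++j) {
      for (let k = 0; k < n; ++k) {
        C[i][j] += A[i][k] * B[k][j];
      }
    }
  }
  return C;
}

// k-j-i order: the inner loop changes i, so C and A move to a new row on every step.
function multiplyKJI(A, B, n) {
  const C = makeMatrix(n, 0);
  for (let k = 0; k < n; ++k) {
    for (let j = 0; j < n; ++j) {
      for (let i = 0; i < n; ++i) {
        C[i][j] += A[i][k] * B[k][j];
      }
    }
  }
  return C;
}

// Runs fn once and logs the elapsed wall-clock time.
function timeIt(label, fn) {
  const start = performance.now();
  const result = fn();
  console.log(`${label}: ${(performance.now() - start).toFixed(1)} ms`);
  return result;
}

const n = 128;
const A = makeMatrix(n, 1.5);
const B = makeMatrix(n, 2.0);
const c1 = timeIt("i-j-k", () => multiplyIJK(A, B, n));
const c2 = timeIt("k-j-i", () => multiplyKJI(A, B, n));
```

Both functions compute the same product; only the memory access pattern differs, which is exactly the kind of thing that low-level knowledge helps you reason about.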
How Do Browsers Work?
Browsers can be thought of as almost independent machines, i.e. they run programs (aka websites), render UIs, handle different users, etc. This might even have been the motivation behind projects like Chromium OS and Firefox OS (now discontinued), though they didn't succeed. After all, you can't just overtake well-established operating systems overnight 🤷‍♂️.
Nonetheless, browsers are incredibly complex and beautifully engineered pieces of software, and their architecture is worth exploring.
We'll be focusing on the Chromium browser, since I've had some experience with its low-level implementation, particularly its JavaScript engine (but let's not get ahead of ourselves).
A diagram is worth a hundred words, so here's one outlining the major components of Chromium's architecture. I used the three dots to abstract away the scary IPC (inter-process communication, woo woo 👻), which we won't be discussing for now:
┌────────────────────────────────────────────────────────────┐
│                      Chromium Browser                      │
├────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌───────────────────────┐  │
│ │ ┌─────────┐ │ │  Renderer   │ │      GPU Process      │  │
│ │ │   ...   │ │ │  (Blink +   │ │       (Graphics       │  │
│ │ └─────────┘ │ │     V8)     │ │     Acceleration)     │  │
│ └─────────────┘ └─────────────┘ └───────────────────────┘  │
│ ┌──────────────┐ ┌─────────────┐                           │
│ │   Network    │ │   Storage   │                           │
│ │ (Net Stack)  │ │   (Cache,   │                           │
│ └──────────────┘ │  Cookies)   │                           │
│                  └─────────────┘                           │
└────────────────────────────────────────────────────────────┘
We will be discussing the renderer process, in particular, the V8 component.
Now How Does V8 Work?
Why are we looking at the V8 component specifically, you ask? Because it's the most interesting one from an attacker's perspective: it's where a whole language (JavaScript) gets evaluated and JIT-compiled. The thing about JIT compilation is that you can't avoid memory corruption just by employing a memory-safe language (e.g. Rust). It's the code generation logic that might lead to such vulnerabilities, not the compiler's own code (at least not as frequently, cmiiw), which is why the V8 team keeps using C++ as of today.
Here is a diagram illustrating a general view of the V8 pipeline:
┌────────────────────────────────────────────────────────────────┐
│                      V8 JavaScript Engine                      │
├────────────────────────────────────────────────────────────────┤
│  ┌───────────────────────┐     ┌────────────────────────────┐  │
│  │        Parser         │ ──► │    Abstract Syntax Tree    │  │
│  │ (Converts JS to AST)  │     │           (AST)            │  │
│  └───────────────────────┘     └────────────────────────────┘  │
│                                               ▼                │
│  ┌───────────────────────┐     ┌────────────────────────────┐  │
│  │       Ignition        │ ◄── │     Bytecode Generator     │  │
│  │     (Interpreter)     │     │ (Converts AST to Bytecode) │  │
│  └───────────┬───────────┘     └────────────────────────────┘  │
│              └────────────────────────────────┐                │
│                                               ▼                │
│  ┌─── Maglev/Sparkplug ──┐     ┌────────────────────────────┐  │
│  │ Turbo(Fan/Shaft)/...  │ ◄── │    Profiler (Hot Code)     │  │
│  │ (Optimizing Compiler) │     │    (Monitors Execution)    │  │
│  └───────────────────────┘     └────────────────────────────┘  │
│                                                                │
│  ┌───────────────────────┐     ┌────────────────────────────┐  │
│  │        Orinoco        │ ◄── │     Garbage Collector      │  │
│  │  (Memory Management)  │     │       (Heap Cleanup)       │  │
│  └───────────────────────┘     └────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘
Okay, I guess, but as we know is the case with JavaScript, "everything is an object" (at least mostly), so how are objects stored in memory?
Well, as the saying goes, "things are known by their opposites", so let's see how a low-level language (C) differs from an abstract language (JavaScript):
+-----------------------+----------------------------+-------------------------------+
|                       | C (Low-Level)              | JavaScript (Abstract)         |
+-----------------------+----------------------------+-------------------------------+
| Memory Management     | Manual (malloc/free)       | Automatic (garbage collected) |
|                       | Uses glibc heap            | Uses V8 heap                  |
+-----------------------+----------------------------+-------------------------------+
| Object Representation | Structs saved as-is in     | Objects are complex           |
|                       | contiguous memory          | spec-defined[^1] structures   |
|                       | (e.g., `struct {int x;}`)  | (e.g., hidden classes, maps)  |
+-----------------------+----------------------------+-------------------------------+
| Execution Model       | Compiled to native machine | Interpreted and JIT-compiled  |
|                       | code (no runtime)          | (e.g., V8 Ignition/TurboFan)  |
+-----------------------+----------------------------+-------------------------------+
| Type System           | Static (fixed at compile   | Dynamic (types determined     |
|                       | time)                      | at runtime)                   |
+-----------------------+----------------------------+-------------------------------+
[^1]: https://262.ecma-international.org/
Let's look at an example of an array of floats:
┌───────────────────┐         ┌─────────────────────────────────────────┐
│      JS Code      │         │                 V8 Heap                 │
├───────────────────┤         ├─────────────────────────────────────────┤
│ const arr =       │         │ ┌─────────────────────────────────┐     │
│   [1.1, 1.2, 1.3];├────────►│ │ JSArray (arr)                   │     │
└───────────────────┘         │ ├─────────────────────────────────┤     │
                              │ │ - HiddenClass (Map)             │     │
                              │ │ - ElementsKind: DOUBLE_ELEMENTS │     │
                              │ │ - Length: 3                     │     │
                              │ │ - Elements ─────────────────────┼──┐  │
                              │ └─────────────────────────────────┘  │  │
                              │                                      │  │
                              │ ┌─────────────────────────────────┐  │  │
                              │ │ FixedDoubleArray                │◄─┘  │
                              │ ├─────────────────────────────────┤     │
                              │ │ - [0]: 1.1                      │     │
                              │ │ - [1]: 1.2                      │     │
                              │ │ - [2]: 1.3                      │     │
                              │ └─────────────────────────────────┘     │
                              └─────────────────────────────────────────┘
Pointer Compression
Let's talk about pointer compression, because that's what enabled (or, at least in my understanding, helped with) the pointer sandboxing that we will be discussing later.
You might be used to 64-bit addresses, but the V8 heap uses pointer compression (it shipped in Chrome 80, in 2020): each pointer stored on the heap keeps only its lower 32 bits (an offset), while the upper 32 bits, the base, live in a dedicated register (r14 on x86-64) for fast access. So the pseudo-code for "uncompressing" a compressed pointer can be stated as: full_pointer = base_address | (compressed_pointer & 0xFFFFFFFF).
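The formula above can be sketched in a few lines of JavaScript using BigInt, since plain numbers can't hold 64-bit integers exactly. The base value and the function names are hypothetical, picked only so that the result matches the example address used later in this post.

```javascript
// Hypothetical cage base: the upper 32 bits are set, the lower 32 bits are zero.
const CAGE_BASE = 0x158f00000000n;

// full_pointer = base_address | (compressed_pointer & 0xFFFFFFFF)
function decompress(cageBase, compressed) {
  return cageBase | (BigInt(compressed) & 0xffffffffn);
}

// Compression simply keeps the lower 32 bits of the full address.
function compress(fullPointer) {
  return Number(fullPointer & 0xffffffffn);
}

const full = decompress(CAGE_BASE, 0x00040321);
console.log(full.toString(16)); // 158f00040321
console.log(compress(full).toString(16)); // 40321
```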
Pointer Tagging
There is also the concept of pointer tagging, which distinguishes a Smi (an immediate small integer) from a HeapObject (anything allocated in the V8 heap inherits from this). It is very simple: the least significant bit is either 0 (Smi) or 1 (HeapObject).
┌────────────────────────────┐
│ 32-bit Compressed Pointer  │
├────────────────────────────┤
│ [31..............1]   [0]  │
│       Offset          Tag  │
└────────────────────────────┘
The careful reader will notice that this limits the V8 heap to 4GB, which is true, but fortunately V8 already had a 2GB/4GB heap limit even before this mechanism was deployed.
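To make the tag bit concrete, here is a small sketch that treats a 32-bit tagged word as a plain JavaScript number. The function names are hypothetical; the encoding assumed here is the one described above, with a Smi carrying its value shifted left by one (so 31 bits of payload) and a heap reference carrying a 1 in the lowest bit.

```javascript
// A tagged word is a Smi when its lowest bit is 0.
function isSmi(word) {
  return (word & 1) === 0;
}

// A tagged word refers to a HeapObject when its lowest bit is 1.
function isHeapObject(word) {
  return (word & 1) === 1;
}

// Encodes a small integer as a Smi by shifting it past the tag bit.
function intToSmi(value) {
  return value << 1;
}

// Decodes a Smi; the arithmetic shift keeps the sign.
function smiToInt(word) {
  return word >> 1;
}

// Clears the tag bit to recover the offset of a heap reference.
function untag(word) {
  return word & ~1;
}

console.log(isSmi(intToSmi(42)), smiToInt(intToSmi(-3))); // true -3
console.log(isHeapObject(0x40321), untag(0x40321).toString(16)); // true 40320
```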
The Optimization Pipeline
Overview
Before learning how V8 bugs get introduced, it helps to have an idea of its optimization pipeline.
Let's look at an example of how a JavaScript function gets optimized down to machine code:
┌───────────────────────────┐
│     JavaScript Source     │
├───────────────────────────┤
│ function calc(a, b) {     │
│   const sum = a + b;      │
│   const prod = a * b;     │
│   return sum;             │
│ }                         │
└───────────────────────────┘
              ▼
┌──────────────────────────────┐     ┌───────────────────────────────┐
│     Bytecode (Ignition)      │     │   Initial IR[^1] (TurboFan)   │
├──────────────────────────────┤     ├───────────────────────────────┤
│ Ldar a1                      │     │ t1 = Load a                   │
│ Add a0, [0]                  │     │ t2 = Load b                   │
│ Star0                        │     │ t3 = Add t1, t2               │
│ Ldar a1                      │     │ Store sum = t3                │
│ Mul a0, [1]                  │ ──► │ t4 = Load a                   │
│ Star1                        │     │ t5 = Load b                   │
│ Ldar r0                      │     │ t6 = Mul t4, t5               │
│ Return                       │     │ Store prod = t6               │
└──────────────────────────────┘     │ t7 = Load sum                 │
                                     │ Return t7                     │
                                     └───────────────────────────────┘
                   ▼
┌──────────────────────────────────────┐
│     Optimized IR (After DCE[^2])     │
├──────────────────────────────────────┤
│ t1 = Load a                          │
│ t2 = Load b                          │
│ t3 = Add t1, t2                      │
│ Return t3                            │
└──────────────────────────────────────┘
                   ▼
┌──────────────────────────────────────┐
│          Final Machine Code          │
├──────────────────────────────────────┤
│ mov rax, [a]                         │
│ add rax, [b]                         │
│ ret                                  │
└──────────────────────────────────────┘
[^1]: Intermediate representation
[^2]: Dead-code elimination
Ignition's bytecode isn't our main interest right now, but it's quite self-explanatory once you know that Ignition is a register-based machine with an accumulator register: the Lda*/Sta* instructions load into and store from that accumulator, and the a0, a1, ... registers hold the arguments passed to the function.
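To see the accumulator model in action, here is a toy interpreter for the calc bytecode shown earlier. It is only a sketch of the idea, not how Ignition is implemented: the run function and the array-based encoding are made up for illustration, and the feedback slot operands (the [0] and [1] in the listing) are left out.

```javascript
// Toy accumulator machine. Opcode names mirror the listing; feedback slots are omitted.
function run(bytecode, args) {
  const registers = {};
  args.forEach((value, index) => {
    registers[`a${index}`] = value; // a0, a1, ... hold the arguments
  });
  let accumulator;

  for (const [opcode, operand] of bytecode) {
    switch (opcode) {
      case "Ldar": // load a register into the accumulator
        accumulator = registers[operand];
        break;
      case "Star": // store the accumulator into a register
        registers[operand] = accumulator;
        break;
      case "Add": // accumulator = register + accumulator
        accumulator = registers[operand] + accumulator;
        break;
      case "Mul": // accumulator = register * accumulator
        accumulator = registers[operand] * accumulator;
        break;
      case "Return": // the accumulator is the return value
        return accumulator;
      default:
        throw new Error(`Unknown opcode: ${opcode}`);
    }
  }
  throw new Error("Bytecode ended without Return");
}

const calcBytecode = [
  ["Ldar", "a1"], ["Add", "a0"], ["Star", "r0"],
  ["Ldar", "a1"], ["Mul", "a0"], ["Star", "r1"],
  ["Ldar", "r0"], ["Return"],
];

console.log(run(calcBytecode, [2, 3])); // 5
```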
Other forms of optimization can also happen. Let's recall a couple of those:
Common subexpression elimination
┌───────────────────────┐    ┌───────────────────────┐
│      Original IR      │    │     Optimized IR      │
│                       │    │                       │
│ t1 = Load a           │    │ t1 = Load a           │
│ t2 = Load b           │───►│ t2 = Load b           │
│ t3 = Add t1, t2       │    │ t3 = Add t1, t2       │
│ t4 = Load a           │    │ t4 = t1               │ // Reuse t1
│ t5 = Load b           │    │ t5 = t2               │ // Reuse t2
│ t6 = Add t4, t5       │    │ t6 = t3               │ // Reuse t3
└───────────────────────┘    └───────────────────────┘
Inline expansion
┌───────────────────────┐    ┌───────────────────────┐
│    Caller Function    │    │      Inlined IR       │
│                       │    │                       │
│ t1 = Call foo(a, b)   │───►│ t1 = Add a, b         │ // foo() inlined
│ ...                   │    │ ...                   │
└───────────────────────┘    └───────────────────────┘
Loop-invariant code motion
┌───────────────────────┐    ┌───────────────────────┐
│     Original Loop     │    │    Optimized Loop     │
│                       │    │                       │
│ while (i < n) {       │    │ t1 = Load x           │ // Hoisted
│   t1 = Load x         │───►│ while (i < n) {       │
│   t2 = Add t1, i      │    │   t2 = Add t1, i      │
│ }                     │    │ }                     │
└───────────────────────┘    └───────────────────────┘
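Of these passes, common subexpression elimination is probably the easiest one to sketch. The toy pass below works on a flat list of three-address instructions; that IR shape and the function name are hypothetical and far simpler than TurboFan's real graph, and it assumes nothing writes to a or b between the loads.

```javascript
// Toy CSE: remembers which temp already holds each computed expression.
function eliminateCommonSubexpressions(instructions) {
  const available = new Map(); // "op args" -> temp that already computed it
  const alias = new Map(); // temp -> the earlier temp it duplicates
  const resolve = (name) => alias.get(name) ?? name;
  const optimized = [];

  for (const { dest, op, args } of instructions) {
    const canonicalArgs = args.map(resolve);
    const key = `${op} ${canonicalArgs.join(",")}`;
    if (available.has(key)) {
      // Same operation on the same inputs: reuse the earlier result.
      const earlier = available.get(key);
      alias.set(dest, earlier);
      optimized.push({ dest, op: "Copy", args: [earlier] });
    } else {
      available.set(key, dest);
      optimized.push({ dest, op, args: canonicalArgs });
    }
  }
  return optimized;
}

const originalIR = [
  { dest: "t1", op: "Load", args: ["a"] },
  { dest: "t2", op: "Load", args: ["b"] },
  { dest: "t3", op: "Add", args: ["t1", "t2"] },
  { dest: "t4", op: "Load", args: ["a"] },
  { dest: "t5", op: "Load", args: ["b"] },
  { dest: "t6", op: "Add", args: ["t4", "t5"] },
];

for (const { dest, op, args } of eliminateCommonSubexpressions(originalIR)) {
  console.log(`${dest} = ${op} ${args.join(", ")}`);
}
```

The last three lines of output are copies of t1, t2, and t3, matching the "Reuse" rows in the CSE diagram.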
When It Breaks
JIT Typer Bugs
Typer bugs are a class of vulnerabilities that arise from incorrect type or range analysis during JIT optimization. V8 predicts variable types and value ranges to enable optimizations. If it makes wrong assumptions, it can end up eliminating safety checks (such as bounds checks) that were actually needed.
Let's see an example of these typer bugs.
function f(x) {
  var arr = [1.1, 1.2];
  var y = 0;              // y0
  if (x == "foo") y = 1;  // y1
                          // y2 = phi(y0, y1)
  y = y + 1;              // y3 = y2 + 1
  return arr[y];
}
The comments show how the variables are assigned in SSA (static single-assignment) form, which is what TurboFan uses. In that form, each variable can be assigned only once.
The Phi function merges two or more possible values, choosing one of them depending on the control-flow path that was taken.
Below is a CFG (control-flow graph) in SSA form.
              ┌────────────────────┐
              │     x == "foo"     │
              └──────┬──────┬──────┘
                     │      │
                  No │      │ Yes
                     │      │
       ┌─────────────┘      └──────────┐
       │                               │
┌──────┴────────┐              ┌───────┴────────┐
│    y0 = 0     │              │     y1 = 1     │
│  Range(0, 0)  │              │  Range(1, 1)   │
└──────┬────────┘              └───────┬────────┘
       │                               │
       └────────────────┬──────────────┘
                        ▼
            ┌────────────────────────┐
            │    y2 = Phi(y0, y1)    │
            │      Range(0, 1)       │
            └───────────┬────────────┘
                        ▼
           ┌──────────────────────────┐
           │ y3 = SpecSafeAdd(y2, 1)  │
           │       Range(1, 2)        │
           └──────────────────────────┘
y3's type (Range(1, 2)) is perfectly correct, but let's assume the typing rule for SpecSafeAdd is erroneous, i.e. it updates the Range minimum but not the maximum. Below is the updated node:
                       ...
                        ▼
           ┌──────────────────────────┐
           │ y3 = SpecSafeAdd(y2, 1)  │
           │       Range(1, 1)        │
           └──────────────────────────┘
Now we have a problem: y3's maximum is assumed to be 1, which is within the bounds of arr (length 2), so the compiler considers the bounds check on arr[y] redundant and removes it. Once the function is compiled and we invoke it with x == "foo", y will actually have a value of 2, and the return statement reads arr[2], which is effectively an out-of-bounds read.
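The reasoning above can be modeled with a few lines of range arithmetic. This is a toy model of the idea only: the Range class, the buggy rule, and canDropBoundsCheck are hypothetical names, not V8's actual typer code.

```javascript
// An inclusive integer range, as a typer might track for a value.
class Range {
  constructor(min, max) {
    this.min = min;
    this.max = max;
  }
}

// Phi merges the ranges coming from both branches.
function phi(a, b) {
  return new Range(Math.min(a.min, b.min), Math.max(a.max, b.max));
}

// Correct typing rule for "range + constant".
function addConstantCorrect(range, k) {
  return new Range(range.min + k, range.max + k);
}

// The hypothetical bug from the text: the minimum is updated, the maximum is not.
function addConstantBuggy(range, k) {
  return new Range(range.min + k, range.max);
}

// A compiler may drop the bounds check only if every possible index is in bounds.
function canDropBoundsCheck(indexRange, length) {
  return indexRange.min >= 0 && indexRange.max < length;
}

const arrLength = 2;
const y2 = phi(new Range(0, 0), new Range(1, 1)); // Range(0, 1)
const correctY3 = addConstantCorrect(y2, 1); // Range(1, 2)
const buggyY3 = addConstantBuggy(y2, 1); // Range(1, 1)

console.log(canDropBoundsCheck(correctY3, arrLength)); // false: the check stays
console.log(canDropBoundsCheck(buggyY3, arrLength)); // true: the check is removed
```

With the correct rule the index may be 2, so the check has to stay. With the buggy rule the compiler believes the index is always 1, while the real value can still be 2 at runtime.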
What about an out-of-bounds write?
function f(x, w) {
  var arr = [1.1, 1.2];
  var y = 0;
  if (x == "foo") y = 1;
  y = y + 1;
  arr[y] = w;
}
Yeah.
Exploitation
So we have an out-of-bounds r/w, but it's ultimately just access to a byte or a few bytes. We need more reach and more power. To get exactly that, it's very common in browser exploitation, once you have a memory corruption vulnerability, to develop two primitives: addrof and fakeobj.
addrof
The goal of this primitive (practically a function) is that you pass it an object and get back the address of that object inside the V8 heap.
┌───────────────────────────┐               ┌──────────────────────────┐
│        JavaScript         │               │         V8 Heap          │
├───────────────────────────┤               ├──────────────────────────┤
│                           │               │                          │
│ let obj = {x: 42};        ├── 0x158f... ─►│ ┌─────────────────────┐  │
│                           │               │ │ JSObject (obj)      │  │
│ let addr = addrof(obj);   │               │ │ - map: 0x...        │  │
│                           │               │ │ - properties: ...   │  │
│ // addr = 0x158f00040321  │               │ │ - elements: ...     │  │
└───────────────────────────┘               │ └─────────────────────┘  │
                                            └──────────────────────────┘
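Reading memory through a float array hands back doubles, so write-ups usually start with a pair of helpers that reinterpret a double's 64 bits as an integer and back. Below is a minimal sketch using typed arrays that share one buffer. The names ftoi, itof, lower32, and upper32 are my own choice here, and splitting into halves assumes the compressed 32-bit fields described earlier, where two of them fit in one 64-bit slot.

```javascript
// One 8-byte buffer viewed both as a double and as a 64-bit unsigned integer.
const buffer = new ArrayBuffer(8);
const f64 = new Float64Array(buffer);
const u64 = new BigUint64Array(buffer);

// Reinterprets the bits of a double as a 64-bit integer.
function ftoi(value) {
  f64[0] = value;
  return u64[0];
}

// Reinterprets a 64-bit integer as a double.
function itof(bits) {
  u64[0] = bits;
  return f64[0];
}

// Each 64-bit slot can hold two 32-bit compressed fields.
function lower32(bits) {
  return bits & 0xffffffffn;
}

function upper32(bits) {
  return bits >> 32n;
}

const bits = ftoi(1.1);
console.log(bits.toString(16)); // 3ff199999999999a
console.log(upper32(bits).toString(16), lower32(bits).toString(16)); // 3ff19999 9999999a
```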
fakeobj
The goal of this primitive is the reverse: you pass it an address where you have already set up the fields of a valid-looking object (for example, inside an array's elements), and it returns that memory to you as an object in the JavaScript world. You ultimately control that fake object's backing store (e.g. the elements address of an array), and thus where it reads and writes across the whole V8 heap.
┌─────────────────────────────────┐               ┌──────────────────────────────────────┐
│           JavaScript            │               │               V8 Heap                │
├─────────────────────────────────┤               ├──────────────────────────────────────┤
│                                 │               │ ┌───────────────────────────────┐    │
│ var arr = [1.1, 1.1,            ├── 0x158f... ─►│ │ JSArray (arr)                 │    │
│    map<<4|properties,           │               │ │ - map: 0x...                  │    │
│    length<<4|elements]          │               │ │ - properties: ...             │    │
│                                 │               │ │ - elements: ... ──────────────┼─┐  │
│ var addr = addrof(arr);         │               │ └───────────────────────────────┘ │  │
│ // addr = 0x158f00040321        │               │ ┌───────────────────────────────┐ │  │
│ fake_arr = fakeobj(addr +       ├──────────────►│ │ Fake JSArray (arr's elements) │◄┘  │
│                    0x...)       │               │ │ - map: 0x...                  │    │
│                                 │               │ │ - properties: ...             │    │
│                                 │               │ │ - elements: ... ──────────────┼─┐  │
│                                 │               │ └───────────────────────────────┘ │  │
│                                 │               │ ┌───────────────────────────────┐ │  │
│ fake_arr[0]                     ├──────────────►│ │ WasmTrustedInstanceData       │◄┘  │
│                                 │               │ │ - jump_table_start: 0x...     │    │
│                                 │               │ │   (wasm rwx page)             │    │
│                                 │               │ └───────────────────────────────┘    │
└─────────────────────────────────┘               └──────────────────────────────────────┘
With these two primitives in your tool belt, it's just a matter of writing your shellcode to, say, a WASM rwx page, et voilà!
Your Starter Pack
When doing V8 research, debugging is a must. This is my go-to Docker image:
FROM ubuntu:22.04 as build
RUN apt-get update && apt-get -y upgrade
RUN apt-get install -yq --no-install-recommends build-essential git ca-certificates python3-pkgconfig curl python3
RUN git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git /opt/depot_tools
ENV PATH="/opt/depot_tools:${PATH}"
RUN mkdir /build
RUN cd /build && fetch v8 && cd v8 && git checkout 247d42b64689fe03fea5949373b3f0d0daa81375 && gclient sync
RUN cd /build/v8 && gn gen out/release --args='is_debug=false target_cpu="x64" v8_symbol_level=2 v8_enable_backtrace=true v8_enable_disassembler=true v8_enable_object_print=true v8_enable_verify_heap=true v8_enable_sandbox=true' && \
    autoninja -C out/release d8
The V8 Cage

It turns out that making things more complicated can also be a way of solving problems, and that's exactly the case with this security feature introduced by the V8 team. The core idea is to "remove any unboxed pointers in the V8 heap" and instead rely on indices into lookup tables that contain the actual unsandboxed pointers.
Let's see how that is relevant to the two most important components traditionally used in the exploitation chain.
Function-Wise
Previously, JSFunction objects contained relative pointers to CodeDataContainer objects, which in turn held raw pointers to JIT-compiled code, enabling JIT spraying attacks. The sandbox now replaces raw pointers with indices into a trusted code pointer table (outside the V8 heap), effectively eliminating[^3] these kinds of attacks.
Before
┌─────────────────────────┐
│       V8 Sandbox        │
├─────────────────────────┤
│┌───────────────────────┐│       ┌───────────────────────┐
││ JSFunction            ││       │ Code                  │
││ (V8 Heap rw-)         │├──────►│ (JIT-compiled r-x)    │
││                       ││       │                       │
││ - [[Code]]:           ││       │ push rbp              │
││   *Direct pointer*    ││       │ mov rbp, rsp          │
│└───────────────────────┘│       │ sub rsp, 0xc0         │
└─────────────────────────┘       └───────────────────────┘
After
┌─────────────────────────┐
│       V8 Sandbox        │
├─────────────────────────┤
│┌───────────────────────┐│       ┌───────────────────────┐       ┌───────────────────────┐
││ JSFunction            ││       │ Code Pointer Table    │       │ Code                  │
││ (V8 Heap rw-)         ││       │ (in trusted space)    ├──────►│ (JIT-compiled r-x)    │
││                       ││       │                       │       │                       │
││ - [[Code]]:           ││       │ Entry 0: 0xABC123     │       │ push rbp              │
││   *Index into table*  │├──────►│ Entry 1: 0xDEF456     │       │ mov rbp, rsp          │
││   (e.g., "Entry 1")   ││       │ Entry 2: 0xGHI789     │       │ sub rsp, 0xc0         │
│└───────────────────────┘│       └───────────────────────┘       └───────────────────────┘
└─────────────────────────┘
WASM-Wise
WASM instances also used to hold relative pointers to WasmTrustedInstanceData, which in turn held an unsandboxed pointer to the WASM rwx page, but now they hold indices into an external table.
Before
┌──────────────────────────────────────────────────────────────┐
│                          V8 Sandbox                          │
├──────────────────────────────────────────────────────────────┤
│  ┌───────────────────────┐       ┌───────────────────────┐   │
│  │ WASM Module           │       │ WASM Instance         │   │
│  │                       ├──────►│                       │   │
│  │ - Exported Functions  │       │ - CodeTable:          │   │
│  │   (Sandboxed Pointer) │       │   *Raw pointers*      │   │
│  └───────────────────────┘       └───────────┬───────────┘   │
│                                              │               │
└──────────────────────────────────────────────┼───────────────┘
                                               ▼
   ┌───────────────────────────────────────────────────────┐
   │             RWX Memory (Outside Sandbox)              │
   │           - JIT-compiled WASM instructions            │
   └───────────────────────────────────────────────────────┘
After
┌──────────────────────────────────────────────────────────────┐
│                          V8 Sandbox                          │
├──────────────────────────────────────────────────────────────┤
│  ┌───────────────────────┐       ┌───────────────────────┐   │
│  │ WASM Module           │       │ WASM Instance         │   │
│  │                       ├──────►│                       │   │
│  │ - Exported Functions  │       │ - CodeTable:          │   │
│  │   (Sandboxed Pointer) │       │   *Indices*           │   │
│  └───────────────────────┘       └───────────┬───────────┘   │
│                                              │               │
└──────────────────────────────────────────────┼───────────────┘
                                               ▼
   ┌───────────────────────────────────────────────────────┐
   │           Trusted Pointer Table (External)            │
   │        - Entry 0: 0x1B15ABC123 (Tag + Address)        │
   │        - Entry 1: 0x1B15DEF456                        │
   └───────────────────────────────────────────┬───────────┘
                                               │
                                               ▼
   ┌───────────────────────────────────────────────────────┐
   │                      RWX Memory                       │
   │           - JIT-compiled WASM instructions            │
   └───────────────────────────────────────────────────────┘
With these mitigations in place, exploitation is yet another step away.
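The table indirection from this section can be modeled in a few lines. This is a toy sketch under my own assumptions, with a hypothetical CodePointerTable class; the real tables also carry type tags and live in memory that code inside the sandbox cannot write to.

```javascript
// Toy model of a pointer table that lives outside the sandbox.
class CodePointerTable {
  #entries = []; // the real addresses, never stored on the in-sandbox heap

  // Registers an address and hands back an index (handle) for it.
  allocate(address) {
    this.#entries.push(address);
    return this.#entries.length - 1;
  }

  // Resolves a handle; anything out of range is rejected.
  resolve(handle) {
    if (!Number.isInteger(handle) || handle < 0 || handle >= this.#entries.length) {
      throw new RangeError(`Invalid handle: ${handle}`);
    }
    return this.#entries[handle];
  }
}

const table = new CodePointerTable();
const handle = table.allocate(0xabc123n);

// An in-sandbox object now stores only the handle, never the raw address.
const jsFunction = { codeHandle: handle };
console.log(table.resolve(jsFunction.codeHandle).toString(16)); // abc123

// Corrupting the field no longer yields an arbitrary jump target.
jsFunction.codeHandle = 0x41414141;
try {
  table.resolve(jsFunction.codeHandle);
} catch (error) {
  console.log(error.message); // Invalid handle: 1094795585
}
```

An attacker who overwrites the handle can, at best, pick another valid table entry, which is a much weaker position than pointing the function at arbitrary memory.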
DIYs
Hacking TVs?
Most modern TVs come with a browser. For example, the TV in our room has an ancient one (yes, 2016 is ancient in browser terms). So we can practice our newly learned skills on products like these.
Doing Some 💩
Of course, I'm joking, but surprisingly, many headless browsers out there rarely get updated. Case in point: here's the User-Agent of the endpoint bot of a third-party service (we found it being used by one of the multimillion-user platforms out there):
Going Further
We've just scratched the surface. If you want to go further into this area of hacking, try doing this pwn.college dojo (that's how I started). If you're looking for resources, I've been collecting some on V8 that you can check out. Other than that, until next time!
[^3]: "Eliminating" is a stretch.