
May 27, 2020


WEBASSEMBLY illustrated

Takenobu T.

Rev. 0.01.0

exploring some mental models and implementations

NOTE

- Please refer to the official documents for details.
- This information is based on "WebAssembly Specification Release 1.0 (Draft, last updated Oct 31, 2018)".
- This information is current as of Nov, 2018.

Still work in progress.

Contents

1. Introduction
- Overview

2. WebAssembly abstract machine
- Abstract machine
- Store
- Stack
- Computational model
- Type
- Trap
- Thread
- External interface

3. WebAssembly module
- Module
- Binary encoding

4. WebAssembly instructions
- Instructions
- Simple instructions
- Control instructions
- Byte order

Appendix A : Semantics

Appendix B : Implementation
- Implementations
- CLI development utilities
- Test suite
- Desugar examples

Appendix C : Future

References


1. Introduction


Overview

1. Introduction

WebAssembly is a code format

References : [1] Ch.1.1, [2], [3], [6]

WebAssembly is a safe, portable, low-level code format.

[Figure: source code (C/C++, Rust, Go, Haskell, ...) is translated by a compiler (Emscripten, Rust compiler, Go compiler, GHC/Asterius, ...; LLVM, Binaryen) into WebAssembly code, a WebAssembly module (WebAssembly binary), which is executed by a runtime (browser or stand-alone): a web browser (Firefox, Chrome, Safari, Edge, ...) or another environment (Node.js, WAVM, ...).]

WebAssembly code

References : [1] Ch.2, Ch.5, Ch.6, [7]

WebAssembly encodes a low-level, assembly-like programming language.
WebAssembly has multiple concrete representations (its text format and the binary format).

Text format (syntactic sugar):

(module
  (func (export "add7")
    (param $x i64)
    (result i64)
    (i64.add
      (get_local $x)
      (i64.const 7))))

Text format (core syntax):

(module
  (type
    (func (param i64) (result i64)))
  (func (type 0)
    (param i64) (result i64)
    get_local 0
    i64.const 7
    i64.add)
  (export "add7" (func 0)))

Binary format:

0x0061736d010000 ...
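The core-syntax body of "add7" can be read as a sequence of operations on an operand stack. The following is a minimal sketch in Python, not an implementation of the specification; the names `run_add7` and `MASK64` are hypothetical and only illustrate how the three instructions cooperate, assuming i64 addition wraps modulo 2^64.

```python
# Illustrative sketch only: evaluates the body of "add7" on an operand stack.
# run_add7 and MASK64 are made-up names, not part of any WebAssembly API.
MASK64 = (1 << 64) - 1  # i64 values wrap modulo 2**64


def run_add7(x):
    local_vars = [x & MASK64]  # local 0 holds the parameter
    stack = []                 # the operand stack
    body = [("get_local", 0), ("i64.const", 7), ("i64.add", None)]
    for op, arg in body:
        if op == "get_local":
            stack.append(local_vars[arg])
        elif op == "i64.const":
            stack.append(arg & MASK64)
        elif op == "i64.add":
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & MASK64)
    return stack.pop()         # the single i64 result
```

Calling `run_add7(35)` leaves 42 as the only value on the stack, which is returned as the function result.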

Abstract machine is defined

References : [1] Ch.1.1, [2], [3]

WebAssembly is a virtual instruction set architecture (virtual ISA).
Execution behavior is defined in terms of an abstract machine.

[Figure: layered view. Code: WebAssembly code. Abstract machine: the WebAssembly abstract machine (instruction set, binary encoding, validation semantics, execution semantics). Platform, environment: software (web browser or other environment: Firefox, Chrome, Safari, Edge, Node.js, WAVM, ...) on hardware (physical processor: x86, ARM, ...).]

Validation

References : [1] Ch.1, Ch.3, Ch.7, [2]

Validation checks that a WebAssembly module is well-formed.
Validity is defined by a type system.
The type system of WebAssembly is sound, implying both type safety and memory safety with respect to the WebAssembly semantics.

[Figure: the runtime processes the code of a WebAssembly module (WebAssembly binary) in three phases: decode, validation, execution.]


2. WebAssembly abstract machine


Abstract machine

2. WebAssembly abstract machine

WebAssembly abstract machine

References : [1] Ch.2, Ch.4, [2], [4]

The WebAssembly abstract machine is based on a stack machine.
The abstract machine includes a store and an implicit stack.

[Figure: the WebAssembly abstract machine contains a store (functions, table, linear memory, globals; the four components are annotated (immutable) (immutable) (mutable) (mutable)) and a stack (operand stack, control stack, call stack).]


Store

2. WebAssembly abstract machine

Store

References : [1] Ch.2, Ch.4, [2], [4]

The store represents all global state.
The store consists of the functions, table, linear memory, and globals that have been allocated during the lifetime of the abstract machine.

[Figure: the store contains functions, table, linear memory, and globals; the four components are annotated (immutable) (immutable) (mutable) (mutable).]

Functions

References : [1] Ch.2, Ch.4, [2], [4]

The function component of a module defines a vector of functions.
Functions are referenced through function indices.

[Figure: the instruction "call #n" in an instruction sequence selects an entry of the functions vector (function #0, function #1, function #2, ...) by its index (0, 1, 2, ...).]

Table

References : [1] Ch.2, Ch.4, [2], [4]

The table is an array of opaque values of a particular element type.
Currently, the only available element type is an untyped function reference.
This allows emulating function pointers by way of table indices.
Tables are referenced through table indices.

[Figure: the instruction "call_indirect #n" selects a table element by its index (0, 1, 2, ...); the element refers to one of the functions (function #0, function #1, function #2, ...).]
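The way a table emulates function pointers can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the specified behavior in full: the names `functions`, `table`, and `call_indirect` are hypothetical, the signature check that a real indirect call performs is omitted, and a trap is modeled as a plain `RuntimeError`.

```python
# Illustrative sketch only: a table maps element indices to function indices.
functions = [
    lambda x: x + 1,   # function #0
    lambda x: x * 2,   # function #1
    lambda x: -x,      # function #2
]

# Element i of the table holds a function index; None marks an
# uninitialized element.
table = [2, 0, None]


def call_indirect(element_index, arg):
    if not 0 <= element_index < len(table):
        raise RuntimeError("trap: undefined element")
    function_index = table[element_index]
    if function_index is None:
        raise RuntimeError("trap: uninitialized element")
    return functions[function_index](arg)
```

Here `call_indirect(0, 5)` goes through table element 0 to function #2, so the caller only ever handles a table index, never a code address.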

Linear memory

References : [1] Ch.2, Ch.4, [2], [4]

The linear memory is a contiguous, mutable array of raw bytes.
The linear memory can be addressed at byte level (including unaligned).
The size of the memory is a multiple of the WebAssembly page size.

[Figure: linear memory as a byte array of 8 bit cells, accessed by 8, 16, 32, 64-bit loads and 8, 16, 32, 64-bit stores.]
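A byte array with multi-byte accesses at arbitrary addresses can be modeled with Python's `bytearray` and the standard `struct` module. The sketch below is only an illustration: the helper names are hypothetical, only 32-bit and 8-bit accesses are shown, and it assumes the little-endian byte order that WebAssembly uses for memory accesses.

```python
import struct

PAGE_SIZE = 64 * 1024  # WebAssembly page size (64 KiB)

# One page of zero-initialized linear memory (illustrative model).
memory = bytearray(1 * PAGE_SIZE)


def store_i32(addr, value):
    # "<I" = little-endian unsigned 32-bit; addr need not be aligned.
    struct.pack_into("<I", memory, addr, value & 0xFFFFFFFF)


def load_i32(addr):
    return struct.unpack_from("<I", memory, addr)[0]


def load_u8(addr):
    return memory[addr]


store_i32(3, 0x11223344)  # address 3 is not a multiple of 4
```

After the store, `load_i32(3)` reads the same value back, and `load_u8(3)` sees its least significant byte first.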

Globals

References : [1] Ch.2, Ch.4, [2], [4]

The globals component defines a vector of global variables.
The globals are referenced through global indices.
Each global variable holds a value and is either mutable or immutable.

[Figure: the instructions "get_global #n" and "set_global #n" select an entry of the globals vector (global variable #0, global variable #1, global variable #2, ...) by its index (0, 1, 2, ...).]


Stack

2. WebAssembly abstract machine

Stack

References : [1] Ch.2, Ch.4, [2], [4]

Most instructions interact with the implicit stack.
The stack contains values, labels, and frames (activations).

[Figure: the stack comprises the operand stack (values), the control stack (labels), and the call stack (frames).]

Operand stack

References : [1] Ch.2, Ch.4, [2], [4]

Instructions manipulate values on an implicit operand stack.
The layout of the operand stack can be statically determined at any point in the code.

[Figure: instructions such as add, sub, ... push and pop values on the operand stack.]

Control stack

References : [1] Ch.2, Ch.4, [2], [4]

Each structured control instruction introduces an implicit label.
Labels are targets for branch instructions that reference them with label indices.

[Figure: the instructions block, if, loop, and branch push and pop labels on the control stack.]

Call stack

References : [1] Ch.2, Ch.4, [2], [4]

A frame holds the values of its function's local variables (including arguments).
Frames also carry the return arity of the respective function.

[Figure: the instructions call and return push and pop frames on the call stack; each frame holds the local variables (local variable #0, #1, #2, ...), which set_local and get_local set and get, and the return arity.]


Computational model

2. WebAssembly abstract machine

Computational model

References : [1] Ch.2, Ch.4, [2], [4]

[Figure: each instruction of an instruction sequence performs an operation on the WebAssembly abstract machine, which consists of the operand stack, the control stack, the call stack (frame), and the store (functions, table, linear memory, globals); external means are shown beside the machine.]

Computational model

References : [1] Ch.2, Ch.4, [2], [4]

[Figure: the same machine annotated with instructions. Operand stack (values): add, sub, ... Control stack (labels): block, br, ... Call stack (frame): call, return; set and get access the local vars. Globals: set and get access the global vars. Linear memory: load, store. Functions (code): call. Table: call_indirect. External means are shown beside the machine.]


Type

2. WebAssembly abstract machine

Value types

References : [1] Ch.1, Ch.2, Ch.3, Ch.4, [2], [4]

WebAssembly provides only four basic value types.
32 bit integers also serve as Booleans and as memory addresses.

- Integers: 32bit
- Integers: 64bit
- Floating-point numbers: 32bit
- Floating-point numbers: 64bit

Instructions have type annotations

References : [1] Ch.2, Ch.3, Ch.4, [2]

- i32.add: 32bit integer arguments and result
- i64.add: 64bit integer arguments and result
- f32.add: 32bit floating-point arguments and result
- f64.add: 64bit floating-point arguments and result

Some instructions have type annotations.
For example, the instruction i32.add has type [i32 i32] → [i32], consuming two i32 values and producing one.
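Because each instruction has a type of the form [arguments] → [results], the types on the operand stack can be tracked without running the code. The following Python sketch shows the idea for straight-line code only; `SIGNATURES` and `check` are hypothetical names, and real validation covers far more (control flow, locals, memory instructions).

```python
# Illustrative sketch only: track operand types for a straight-line sequence.
SIGNATURES = {
    "i32.const": ([], ["i32"]),
    "i64.const": ([], ["i64"]),
    "i32.add": (["i32", "i32"], ["i32"]),
    "i64.add": (["i64", "i64"], ["i64"]),
    "f32.add": (["f32", "f32"], ["f32"]),
    "f64.add": (["f64", "f64"], ["f64"]),
}


def check(instructions):
    """Return the stack of types left behind, or raise TypeError."""
    stack = []
    for name in instructions:
        params, results = SIGNATURES[name]
        # Arguments are popped in reverse order of the parameter list.
        for expected in reversed(params):
            actual = stack.pop() if stack else None
            if actual != expected:
                raise TypeError(f"{name}: expected {expected}, found {actual}")
        stack.extend(results)
    return stack
```

For example, `check(["i32.const", "i32.const", "i32.add"])` leaves a single `i32`, while replacing one constant with `i64.const` is rejected before anything is executed.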

Functions have type declarations

References : [1] Ch.2, Ch.3, Ch.4, [2]

Each function takes a sequence of WebAssembly values as parameters and returns a sequence of values as results, as defined by its function type.

(func
  (param $x i64)
  (result i64)
  (i64.add
    (get_local $x)
    (i64.const 7)))

[Figure: the arguments are held as local variables in the frame on the call stack; the result is a value on the operand stack.]

Control blocks also have a type declaration

References : [1] Ch.2, Ch.3, Ch.4, [2]

Every control construct is annotated with a function type.

(block (result i64)
  (i64.add
    (get_local $x)
    (i64.const 7)))

[Figure: the result of the block is a value on the operand stack.]


Trap

2. WebAssembly abstract machine

Trap

References : [1] Ch.1, Ch.4, [2], [4]

Certain instructions may produce a trap, which immediately aborts execution.
Traps cannot be handled by WebAssembly code, but are reported to the outside environment, where they typically can be caught.

[Figure: a trap aborts the code running on the WebAssembly abstract machine and is reported to the environment, which handles it.]

Trap

References : [1] Ch.1, Ch.4, [2], [4]

[Figure: the WebAssembly abstract machine (instruction sequence, operand stack, control stack, call stack (frame), store with functions, table, linear memory, globals; external means beside it), annotated with causes of traps:]

- unreachable
- zero division
- invalid conversion
- undefined element
- uninitialized element
- out of bounds

Linear memory

References : [1] Ch.1, Ch.4, [2], [4]

[Figure: a load/store that is in bounds of the linear memory is OK; one that is not in bounds causes a trap.]

A trap occurs if an access is not within the bounds of the current memory size.
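The division of labor between the code, which cannot handle a trap, and the environment, which can catch it, may be sketched as follows. This is a simplified Python model: the `Trap` class and the functions `load_u8` and `run` are hypothetical stand-ins, with a Python exception playing the role of the trap.

```python
# Illustrative sketch only: a trap modeled as a Python exception.
class Trap(Exception):
    """Hypothetical stand-in for a WebAssembly trap."""


def load_u8(memory, addr):
    # The access must lie within the current memory size.
    if addr < 0 or addr >= len(memory):
        raise Trap("out of bounds memory access")
    return memory[addr]


def run(memory, addr):
    # Plays the role of the outside environment: the trap aborts the
    # "WebAssembly" code and is reported here, where it can be caught.
    try:
        return ("ok", load_u8(memory, addr))
    except Trap as trap:
        return ("trap", str(trap))
```

With a one-byte memory, `run(memory, 0)` succeeds, while `run(memory, 1)` reports the trap to the caller instead of returning a value.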


Thread

2. WebAssembly abstract machine

Thread

References : [1] Ch.4, [2], [4]

The current version of WebAssembly is single-threaded, but configurations with multiple threads may be supported in the future.

[Figure: threads, instruction sequence, stack, and store (functions, table, linear memory, globals).]


External interface

2. WebAssembly abstract machine

Import and export

References : [1] Ch.2, Ch.4, [2], [5]

Functions, table, memory and globals may be shared via import/export.

[Figure: in the WebAssembly abstract machine (store and stack), the functions, table, linear memory, and globals of the store are each connected by import and export to the host environment, external means, or other instances.]

Mutation from external

References : [1] Ch.2, Ch.4, [2]

Table, memory and globals can be mutated by external means.

[Figure: in the store of the WebAssembly abstract machine, the table, linear memory, and globals are marked mutable from the host environment, external means, or other instances; the functions and the stack are not.]

Invoke from external

References : [1] Ch.2, Ch.4, [2]

Any exported function can be invoked externally.

[Figure: the host environment, external means, or other instances invoke an exported function among the functions in the store of the WebAssembly abstract machine.]

Foreign call

References : [1] Ch.2, Ch.4, [2]

Call instructions can invoke an imported function.

[Figure: a call in the code targets an imported function, which invokes (FFI) the host environment, external means, or other instances.]


3. WebAssembly module


Module

3. WebAssembly module

WebAssembly module

References : [1] Ch.1, Ch.2, Ch.4, [2], [4], [5]

WebAssembly programs are organized into modules.
Modules are the distributable, loadable, and executable unit of code.
WebAssembly modules are distributed in a binary format.

[Figure: a WebAssembly module containing code.]

WebAssembly module

References : [1] Ch.1, Ch.2, Ch.4, [2], [4], [5]

A module collects definitions for types, functions, table, memory, and globals.
In addition, it can declare imports and exports and provide initialization logic in the form of data and element segments or a start function.

[Figure: a WebAssembly module containing types, functions, table, linear memory, globals, elem, data, start, import, and export.]

WebAssembly module and abstract machine

References : [1] Ch.1, Ch.2, Ch.4, [2], [4], [5]

A module corresponds to the static representation of a program.
A module instance corresponds to a dynamic representation.

[Figure: a WebAssembly module (types, functions, table, linear memory, globals, elem, data, start, import, export) beside the WebAssembly abstract machine (module instance) with its store (functions, table, linear memory, globals) and stack; two components are labeled limit(size).]


Binary encoding

3. WebAssembly module

Binary encoding of modules

References : [1] Ch.5, [7], [5]

The binary encoding of modules is organized into sections.

[Figure: the bytes of a WebAssembly module (WebAssembly binary) form the magic, the version, and a sequence of sections: type section, import section, func section, table section, ..., code section.]

Sections

References : [1] Ch.5, [7], [5]

Section format

Each section consists of
• a one-byte section id,
• the u32 size of the contents, in bytes,
• the actual contents, whose structure depends on the section id.

[Figure: in the binary encoding of modules (magic, version, then sections such as the type section, import section, func section, table section, ..., code section), the bytes of one section are split into id, size, and content.]

Example of WebAssembly module

References : [1] Ch.5, Ch.6, [7], [5]

[text format]

(module
  (func (export "foo") (result i32)
    i32.const 7))

[binary format] (`wat2wasm -v` command)

0000000: 0061 736d   ; WASM_BINARY_MAGIC
0000004: 0100 0000   ; WASM_BINARY_VERSION
; section "Type" (1)
0000008: 01          ; section code
0000009: 05          ; section size
000000a: 01          ; num types
; type 0
000000b: 60          ; func
000000c: 00          ; num params
000000d: 01          ; num results
000000e: 7f          ; i32
; section "Function" (3)
000000f: 03          ; section code
0000010: 02          ; section size
0000011: 01          ; num functions
0000012: 00          ; function 0 signature index
; section "Export" (7)
0000013: 07          ; section code
0000014: 07          ; section size
0000015: 01          ; num exports
0000016: 03          ; string length
0000017: 666f 6f     ; foo ; export name
000001a: 00          ; export kind
000001b: 00          ; export func index
; section "Code" (10)
000001c: 0a          ; section code
000001d: 06          ; section size
000001e: 01          ; num functions
; function body 0
000001f: 04          ; func body size
0000020: 00          ; local decl count
0000021: 41          ; i32.const
0000022: 07          ; i32 literal
0000023: 0b          ; end
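The section layout (id, size, contents) can be checked against the example bytes with a short Python sketch. The code below is an illustration, not a full decoder: `MODULE` is transcribed from the dump above, and the helper names `read_u32` and `sections` are hypothetical. The size is read as an unsigned LEB128 integer, although every size in this example fits in one byte.

```python
# Illustrative sketch only: split the example binary into its sections.
MODULE = bytes.fromhex(
    "0061736d 01000000"            # magic, version
    "01 05 01 60 00 01 7f"         # section "Type" (1)
    "03 02 01 00"                  # section "Function" (3)
    "07 07 01 03 666f6f 00 00"     # section "Export" (7)
    "0a 06 01 04 00 41 07 0b"      # section "Code" (10)
)


def read_u32(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def sections(data):
    """Return a list of (section id, contents) pairs."""
    if data[:4] != b"\x00asm" or data[4:8] != b"\x01\x00\x00\x00":
        raise ValueError("not a version 1 WebAssembly binary")
    pos = 8
    found = []
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_u32(data, pos + 1)
        found.append((section_id, data[pos:pos + size]))
        pos += size
    return found
```

Running `sections(MODULE)` yields four sections with ids 1, 3, 7, and 10, whose content lengths match the "section size" bytes in the dump.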

Integer encoding with LEB128

References : [8], [1] Ch.5, [2]

All integers are encoded using the LEB128 variable-length integer encoding.

32bit integer (8bit x 4): 00000000 00000000 00000100 10100011 = 0x00_00_04_A3

7bit splitting (from the low end; groups that are all zero are dropped): 0001001 0100011

Adding variable bit (1 = more groups follow, 0 = last group): 0 0001001, 1 0100011

Result (compressed, 8bit x 2): 00001001 10100011 = 0x09_A3
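The splitting into 7-bit groups can be reproduced with a small Python function. This is a minimal sketch of unsigned LEB128 encoding for 32-bit values; the function name `encode_u32_leb128` is hypothetical. The bit pattern in the figure corresponds to the value 0x4A3, and the bytes are emitted low group first, so the figure's 0x09_A3 appears in the byte stream as A3 followed by 09.

```python
# Illustrative sketch only: unsigned LEB128 encoding of a 32-bit value.
def encode_u32_leb128(value):
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in an unsigned 32-bit integer")
    out = bytearray()
    while True:
        group = value & 0x7F        # lowest 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # set the bit: more groups follow
        else:
            out.append(group)         # clear bit: last group
            return bytes(out)
```

Values below 128 take a single byte, which is why the sizes and counts in small modules usually look like plain bytes.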


4. WebAssembly instructions


Instructions

4. WebAssembly instructions

Instructions

References : [1] Ch.1, Ch.2, Ch.3, Ch.4, [2], [4]

Instructions fall into two main categories.
Simple instructions perform basic operations on data.
Control instructions alter control flow.

[Figure: simple instructions (push/pop/... operations on the stack) and control instructions (block, loop, conditional).]


Simple instructions

4. WebAssembly instructions

Numeric instructions

References : [1] Ch.2, Ch.4, [2], [4]

Numeric instructions pop arguments from the operand stack and push results back to it.

[Figure: an operation such as i64.add, i64.sub, ... pops its operands (value a, value b) from the operand stack and pushes the result (value c) back.]

Numeric instructions : const

References : [1] Ch.2, Ch.4, [2], [4]

The const instruction pushes its constant value onto the operand stack.

[Figure: i64.const n, i32.const n, ... push the value n onto the operand stack.]

Parametric instructions : drop

References : [1] Ch.2, Ch.4, [2], [4]

The drop instruction simply throws away a single operand.

[Figure: drop pops one value from the operand stack.]

Parametric instructions : select

References : [1] Ch.2, Ch.4, [2], [4]

The select instruction selects one of its first two operands based on whether its third operand is zero or not.

[Figure: select pops three operands from the operand stack and pushes one of the first two back.]
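The behavior of select can be stated compactly in Python. This is a sketch only; the function name and the list used as an operand stack are hypothetical. It assumes that the third operand is the one on top of the stack and that a non-zero condition selects the first operand.

```python
# Illustrative sketch only: select on a Python list used as operand stack.
def select(stack):
    condition = stack.pop()  # third operand (top of stack)
    second = stack.pop()
    first = stack.pop()
    stack.append(first if condition != 0 else second)
```

Starting from the stack `[10, 20, 1]`, select leaves `[10]`; with a condition of 0 it leaves `[20]`.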

Global variable instructions

References : [1] Ch.2, Ch.4, [2], [4]

Global variable instructions get or set the values of global variables.

[Figure: "get_global #n" copies global variable #n (index 0, 1, 2, ...) from the globals to the operand stack; "set_global #n" moves a value from the operand stack to global variable #n.]

Local variable instructions

References : [1] Ch.2, Ch.4, [2], [4]

Local variable instructions get or set the values of local variables (including function arguments).

[Figure: "get_local #n", "set_local #n", and "tee_local #n" transfer values between the operand stack and local variable #n (index 0, 1, 2, ...) of the frame.]

Memory instructions : load, store

References : [1] Ch.2, Ch.4, [2], [4]

Memory is accessed with load and store instructions for the different value types.

[Figure: load transfers a value from linear memory to the operand stack; store transfers a value from the operand stack to linear memory.]

Memory instructions : memory.grow

References : [1] Ch.2, Ch.4, [2], [4]

The memory.grow instruction grows memory by a given delta.
The memory.grow instruction operates in units of page size (64KiB).

[Figure: memory.grow extends linear memory from its current size by a grown region. Page size: 64KiB.]
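Growing in page units can be sketched on a Python `bytearray`. The function `memory_grow` and its `max_pages` parameter are hypothetical names for illustration; the sketch assumes that the instruction yields the previous size in pages on success and -1 on failure, and that 65536 pages is the upper bound for a 32-bit address space.

```python
# Illustrative sketch only: grow a bytearray in units of 64 KiB pages.
PAGE_SIZE = 64 * 1024
HARD_LIMIT_PAGES = 65536  # 4 GiB of 32-bit addressable memory


def memory_grow(memory, delta_pages, max_pages=None):
    """Grow memory in place; return the old size in pages, or -1."""
    old_pages = len(memory) // PAGE_SIZE
    new_pages = old_pages + delta_pages
    limit = HARD_LIMIT_PAGES if max_pages is None else max_pages
    if new_pages > limit:
        return -1                                  # memory is unchanged
    memory.extend(bytes(delta_pages * PAGE_SIZE))  # new pages are zeroed
    return old_pages
```

A caller can therefore use the returned page count, multiplied by the page size, as the start address of the newly added region.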


Control instructions

4. WebAssembly instructions

Control flow is structured

References : [1] Ch.2, Ch.4, [2], [4]

Control flow is expressed with well-nested constructs such as blocks, loops, and conditionals (if-else).
Structured control flow allows simpler and more efficient verification.

[Figure: well-nested constructs: block ... end, loop ... end, block ... end, and if ... else ... end.]

Structured control instructions

References : [1] Ch.2, Ch.4, [2], [4]

The block, loop and if instructions are structured control instructions.

- block construct: block ... end
- loop construct: loop ... end
- if construct: if ... else ... end

Control constructs and branch instruction

References : [1] Ch.2, Ch.4, [2], [4]

Branches can only target control constructs.
Intuitively, a branch targeting a block or if behaves like a break statement, while a branch targeting a loop behaves like a continue statement.

- block construct: "br 0" inside block ... end is a forward jump
- loop construct: "br 0" inside loop ... end is a backward jump
- if construct: "br 0" inside if ... else ... end is a forward jump

Nested constructs and branch instruction

References : [1] Ch.2, Ch.4, [2], [4]

Branches have "label" immediates.
They do not reference program positions in the instruction stream but instead reference outer control constructs by relative nesting depth.

[Figure: three nested block ... end constructs; seen from inside the innermost block, "br 0", "br 1", and "br 2" target the three enclosing blocks by relative nesting depth.]
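Resolving a label by relative nesting depth amounts to indexing the stack of enclosing constructs from its innermost end. The Python sketch below is illustrative only; `branch_target` and the tuples describing constructs are hypothetical. It assumes that depth 0 is the innermost construct and combines this with the break/continue intuition: a branch to a loop continues at its start, a branch to a block or if continues after its end.

```python
# Illustrative sketch only: resolve "br depth" against enclosing constructs.
def branch_target(control_stack, depth):
    """control_stack lists enclosing constructs, outermost first.

    Returns (name, where): where is "start" for a loop (like continue)
    and "end" for a block or if (like break).
    """
    kind, name = control_stack[-1 - depth]  # depth 0 = innermost
    where = "start" if kind == "loop" else "end"
    return name, where


# Hypothetical nesting: block A contains loop B, which contains block C.
nesting = [("block", "A"), ("loop", "B"), ("block", "C")]
```

Inside block C, `branch_target(nesting, 0)` exits C, `branch_target(nesting, 1)` restarts loop B, and `branch_target(nesting, 2)` exits the outermost block A.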

Page 70: WEBASSEMBLY illustrated - GitHub Pages...WebAssembly encodes a low-level, assembly-like programming language. WebAssembly has multiple concrete representations. (its text format and

References : [1] Ch.2, Ch.4 , [2], [4]

Conditional branch instruction

The br_if instruction performs a conditional branch.
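As a minimal sketch of the operand handling, assuming the operand stack is modeled as a Python list whose last element is the top of the stack (the function name `br_if` is just a label for this illustration): the condition is popped, and the branch is taken when it is non-zero.

```python
def br_if(stack):
    """Pop the i32 condition and report whether the branch is taken."""
    condition = stack.pop()
    return condition != 0          # non-zero: taken, zero: not taken
```

The values below the condition stay on the stack in both cases; only the condition operand is consumed in this simplified model.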

    block
      br_if
    end

[Figure: br_if inside a block ... end construct. The condition comes from the operand stack (value, value, value); the two outcomes are labeled taken (then) and not-taken (else).]

[Figure: br_table inside three nested block ... end constructs, with one br arrow per candidate target.]

References : [1] Ch.2, Ch.4 , [2], [4]

Table branch instruction

The br_table instruction performs an indirect branch through an operand, taken from the operand stack, that indexes into the label vector.
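The label selection can be sketched as follows, assuming the operand stack is a Python list and the index is already an unsigned 32-bit value. The parameter names are hypothetical. The default label for out-of-range indices comes from the specification's definition of br_table rather than from this slide.

```python
def br_table(labels, default_label, stack):
    """Pop the index operand and return the label to branch to."""
    index = stack.pop()            # assumed to be an unsigned 32-bit value
    if index < len(labels):
        return labels[index]
    return default_label           # out-of-range indices select the default
```

For example, with `labels = [2, 1, 0]` and a default of 3, an index of 1 selects label 1 and an index of 5 selects label 3.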


References : [1] Ch.2, Ch.4 , [2], [4]

Call instruction

[Figure: call #n selects function #n from the list of Functions (function #0, function #1, function #2, ...).]

The call instruction invokes another function,

consuming the necessary arguments from the stack and returning

the result values of the call.
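A small model of this stack discipline, under the assumption that each function is represented as a pair of parameter count and Python callable returning a list of results (the `FUNCTIONS` list and its entries are made up for illustration):

```python
def call(functions, index, stack):
    """Invoke functions[index], taking arguments from and pushing results to stack."""
    param_count, body = functions[index]
    # Arguments are popped last-first, then restored to declaration order.
    arguments = [stack.pop() for _ in range(param_count)][::-1]
    stack.extend(body(*arguments))


# Hypothetical function list: function #0 subtracts, function #1 returns a constant.
FUNCTIONS = [
    (2, lambda a, b: [a - b]),
    (0, lambda: [42]),
]
```

With the stack `[10, 7, 3]`, `call(FUNCTIONS, 0, stack)` consumes 7 and 3 and leaves `[10, 4]`; values below the arguments are untouched.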


References : [1] Ch.2, Ch.4 , [2], [4]

Indirect call instruction

[Figure: call_indirect indexes into the Table (element #0, element #1, element #2, ...); the selected element refers to one of the Functions (function #0, function #1, function #2, ...).]

The call_indirect instruction calls a function indirectly through

an operand indexing into a table.
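The table lookup can be sketched in the same toy style. Everything named here (`Trap`, `TABLE`, `FUNCTIONS`, `I32_BINARY`) is hypothetical. The traps on an out-of-range or uninitialized element and on a type mismatch follow the specification's description of call_indirect rather than this slide.

```python
class Trap(Exception):
    """Raised where a Wasm engine would trap."""


def call_indirect(table, functions, expected_type, stack):
    """Pop a table index, check the callee, then call it with stack operands."""
    index = stack.pop()
    if index >= len(table) or table[index] is None:
        raise Trap("undefined element")
    func_type, body = functions[table[index]]
    if func_type != expected_type:
        raise Trap("indirect call type mismatch")
    param_types, _ = func_type
    arguments = [stack.pop() for _ in param_types][::-1]
    stack.extend(body(*arguments))


# Hypothetical module state: two functions of the same type and a 3-element table.
I32_BINARY = (("i32", "i32"), ("i32",))
FUNCTIONS = [
    (I32_BINARY, lambda a, b: [a + b]),
    (I32_BINARY, lambda a, b: [a * b]),
]
TABLE = [1, 0, None]
```

With the stack `[6, 7, 0]`, table element #0 refers to function #1, so the call leaves `[42]` on the stack.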

The index operand is taken from the operand stack.

References : [1] Ch.2, Ch.4 , [2], [4]

Return instruction

The return instruction is an unconditional branch to the outermost block,

which implicitly is the body of the current function.

    block
      block
        block
          return
        end
      end
    end

Byte order

4. WebAssembly instructions


References : [1] Ch. 4, [2], [4]

Endian

The WebAssembly abstract machine uses little-endian byte order.

When a number is stored into memory, it is converted into a sequence of

bytes in little endian byte order.
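The byte layout can be checked with a short Python sketch that models linear memory as a `bytearray`. The helper names mirror the instructions but are only illustrative, and the sketch does no bounds checking, whereas a real engine would trap on an out-of-bounds access.

```python
import struct


def i32_store(memory, address, value):
    """Write a 32-bit value as 4 bytes, least significant byte first."""
    memory[address:address + 4] = struct.pack("<I", value & 0xFFFFFFFF)


def i32_load(memory, address):
    """Read 4 bytes starting at address and assemble them as little endian."""
    return struct.unpack("<I", bytes(memory[address:address + 4]))[0]
```

Storing `0x11223344` at address N puts `0x44` at N, `0x33` at N+1, `0x22` at N+2 and `0x11` at N+3.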

Operand stack (32-bit value) :

    MSB                                  LSB
    | N+3    | N+2    | N+1    | N      |      (8 bit each)

i32.store writes the value to linear memory; i32.load reads it back.

Linear memory (byte array) :

    address :  0  1  :  N  N+1  N+2  N+3  :

Appendix A


Semantics

Appendix A


References : [2], [1] Ch.3, Ch.4, Ch.7

Validation and execution semantics

The semantics is derived from the following article:

"Bringing the Web up to Speed with WebAssembly" [2]

Validation semantics: typing rules

Execution semantics: reduction rules

Appendix B


Implementations

Appendix B


References : [C1], [C2], [C3], [C4], [C5]

Implementations

[Figure: overview of implementations and the formats they consume and produce.]

Compilers and assemblers :

    Go source -> Go compiler (Go) -> Wasm binary (.wasm)
    Wasm text (.wat) -> wasm-as (Binaryen) -> Wasm binary (.wasm)
    Wasm text (.wat) -> wat2wasm (WABT) -> Wasm binary (.wasm)

Runtimes :

    web browser (Firefox), node.js
    wasm (Spec), wasm-interp (WABT), wavm-run (WAVM), wasm-shell

References : [C1]

Reference interpreter : spec

    let rec step (c : config) : config =
      let {frame; code = vs, es; _} = c in
      let e = List.hd es in
      let vs', es' =
        match e.it, vs with
        | Plain e', vs ->
          (match e', vs with
          | Unreachable, vs ->
            vs, [Trapping "unreachable executed" @@ e.at]
          | Nop, vs ->
            vs, []
          | Block (ts, es'), vs ->
            vs, [Label (List.length ts, [], ([], List.map plain es')) @@ e.at]
          | Loop (ts, es'), vs ->
            vs, [Label (0, [e' @@ e.at], ([], List.map plain es')) @@ e.at]
          | If (ts, es1, es2), I32 0l :: vs' ->
            vs', [Plain (Block (ts, es2)) @@ e.at]

https://github.com/WebAssembly/spec
[interpreter/exec/eval.ml]

References : [C3]

Interpreter : WABT

    Result Thread::Run(int num_instructions) {
      Result result = Result::Ok;
      const uint8_t* istream = GetIstream();
      const uint8_t* pc = &istream[pc_];
      for (int i = 0; i < num_instructions; ++i) {
        Opcode opcode = ReadOpcode(&pc);
        assert(!opcode.IsInvalid());
        switch (opcode) {
          case Opcode::Select: {
            uint32_t cond = Pop<uint32_t>();
            Value false_ = Pop();
            Value true_ = Pop();
            CHECK_TRAP(Push(cond ? true_ : false_));
            break;
          }
          case Opcode::Br:
            GOTO(ReadU32(&pc));
            break;
          case Opcode::BrIf: {

https://github.com/WebAssembly/wabt
[src/interp/interp.cc]

References : [C4]

Stand-alone VM : WAVM

    // Decode the WebAssembly opcodes and emit LLVM IR for them.
    OperatorDecoderStream decoder(functionDef.code);
    UnreachableOpVisitor unreachableOpVisitor(*this);
    OperatorPrinter operatorPrinter(irModule, functionDef);
    Uptr opIndex = 0;
    while(decoder && controlStack.size())
    {
        irBuilder.SetCurrentDebugLocation(
            llvm::DILocation::get(llvmContext, (unsigned int)opIndex++,
                                  0, diFunction));
        if(ENABLE_LOGGING)
        { logOperator(decoder.decodeOpWithoutConsume(operatorPrinter)); }
        if(controlStack.back().isReachable) { decoder.decodeOp(*this); }
        else
        {
            decoder.decodeOp(unreachableOpVisitor);
        }
    wavmAssert(irBuilder.GetInsertBlock() == returnBlock);
    if(EMIT_ENTER_EXIT_HOOKS)

https://github.com/WAVM/WAVM
[Lib/LLVMJIT/EmitFunction.cpp]

References : [C5]

Web browser : Firefox

    switch (op.b0) {
      case uint16_t(Op::End):
        if (!emitEnd()) {
          return false;
        }
        if (iter_.controlStackEmpty()) {
          if (!deadCode_) {
            doReturn(funcType().ret(), PopStack(false));
          }
          return iter_.readFunctionEnd(iter_.end());
        }
        NEXT();
      // Control opcodes
      case uint16_t(Op::Nop):
        CHECK_NEXT(iter_.readNop());
      case uint16_t(Op::Drop):
        CHECK_NEXT(emitDrop());
      case uint16_t(Op::Block):
        CHECK_NEXT(emitBlock());
      case uint16_t(Op::Loop):

https://github.com/mozilla/gecko-dev
[js/src/wasm/WasmBaselineCompile.cpp]

CLI development utilities

Appendix B


References : [C1], [C2], [C3]

Assemble

Assemble Wasm text format (.wat) to Wasm binary format (.wasm) :

Binaryen :
    $ wasm-as sample.wat

WABT :
    $ wat2wasm sample.wat
    $ wat2wasm -v sample.wat

Spec :
    $ wasm -d sample.wat

References : [C1], [C2], [C3]

Disassemble

Disassemble Wasm binary format (.wasm) to Wasm text format (.wat) :

Binaryen :
    $ wasm-dis sample.wasm

WABT :
    $ wasm2wat sample.wasm
    $ wasm-objdump -d sample.wasm

Spec :
    $ wasm -d sample.wasm

References : [C3]

Desugar

Desugar Wasm text format (.wat) to Wasm text format (.wat) :

WABT :
    $ wat-desugar sample.wat

References : [C1], [C3]

Dump information

Dump Wasm binary format (.wasm) information :

WABT :
    $ wasm-objdump -s sample.wasm
    $ wasm-objdump -x sample.wasm

Spec :
    $ wasm -s sample.wasm

References : [C1], [C3], [C4]

Run

Run Wasm binary format (.wasm) and Wasm text format (.wat) :

WABT : Run Wasm binary format with trace
    $ wasm-interp --run-all-exports --trace sample.wasm

WAVM : Run Wasm text format
    $ wavm-run sample.wat

Spec : Run Wasm binary format
    $ wasm sample.wasm -e '(invoke "XXX")'

References : [C1]

REPL

REPL (Read-Eval-Print-Loop) :

Spec :
    $ wasm -
    $ wasm sample.wasm -

Test suite

Appendix B


References : [C1]

Test suite and Wasm text format examples

    README.md            fac.wast                 names.wast
    address.wast         float_exprs.wast         nop.wast
    align.wast           float_literals.wast      return.wast
    binary.wast          float_memory.wast        run.py*
    block.wast           float_misc.wast          select.wast
    br.wast              forward.wast             set_local.wast
    br_if.wast           func.wast                skip-stack-guard-page.wast
    br_table.wast        func_ptrs.wast           stack.wast
    break-drop.wast      get_local.wast           start.wast
    call.wast            globals.wast             store_retval.wast
    call_indirect.wast   i32.wast                 switch.wast
    comments.wast        i64.wast                 tee_local.wast
    const.wast           if.wast                  token.wast
    conversions.wast     imports.wast             traps.wast
    custom.wast          inline-module.wast       type.wast
    data.wast            int_exprs.wast           typecheck.wast
    elem.wast            int_literals.wast        unreachable.wast
    endianness.wast      labels.wast              unreached-invalid.wast
    exports.wast         left-to-right.wast       unwind.wast
    f32.wast             linking.wast             utf8-custom-section-id.wast
    f32_bitwise.wast     loop.wast                utf8-import-field.wast
    f32_cmp.wast         memory.wast              utf8-import-module.wast
    f64.wast             memory_grow.wast         utf8-invalid-encoding.wast
    f64_bitwise.wast     memory_redundancy.wast
    f64_cmp.wast         memory_trap.wast

https://github.com/WebAssembly/spec

[test/core]

Note: the `.wast` extension denotes script files that contain both script commands and Wasm text format.


Desugar examples

Appendix B


References : [7], [1] Ch.6, Ch.2, Ch.4

Desugar example

Text format, syntactic sugar :

    (func (result i32)
      (i32.add
        (i32.const 1)
        (i32.const 2)))

Text format, core syntax :

    (func (result i32)
      i32.const 1
      i32.const 2
      i32.add)

References : [7], [1] Ch.6, Ch.2, Ch.4

Desugar example

Text format, syntactic sugar :

    (func (result i32)
      (i32.add
        (i32.const 1)
        (i32.mul
          (i32.const 2)
          (i32.const 3))))

Text format, core syntax :

    (func (result i32)
      i32.const 1
      i32.const 2
      i32.const 3
      i32.mul
      i32.add)
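The rewriting of a nested expression into a linear instruction sequence amounts to a post-order traversal: operands first, then the operator. The Python sketch below assumes the nested form has already been parsed into nested lists. It handles plain instructions only (no block, loop or if), and the function name `desugar` is hypothetical rather than part of any tool.

```python
def desugar(expr):
    """Flatten a nested plain instruction: operands first, then the operator."""
    head, *rest = expr
    flat = []
    immediates = []
    for item in rest:
        if isinstance(item, list):
            flat.extend(desugar(item))     # nested operand expression
        else:
            immediates.append(item)        # immediate such as the 1 in i32.const 1
    flat.append(" ".join([head, *immediates]))
    return flat


# The nested expression from the example above, as nested Python lists.
EXAMPLE = ["i32.add",
           ["i32.const", "1"],
           ["i32.mul", ["i32.const", "2"], ["i32.const", "3"]]]
```

Calling `desugar(EXAMPLE)` yields `i32.const 1`, `i32.const 2`, `i32.const 3`, `i32.mul`, `i32.add`, in that order.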


References : [7], [1] Ch.6, Ch.2, Ch.4

Desugar example

Text format, syntactic sugar :

    (func (result i32)
      (block (result i32)
        (i32.add
          (i32.const 1)
          (i32.const 2))))

Text format, core syntax :

    (func (result i32)
      block (result i32)
        i32.const 1
        i32.const 2
        i32.add
      end)

References : [7], [1] Ch.6, Ch.2, Ch.4

Desugar example

Text format, syntactic sugar :

    (func
      (block $label_a
        (block $label_b
          br $label_a)))

Text format, core syntax :

    (func
      block
        block
          br 1
        end
      end)

References : [7], [1] Ch.6, Ch.2, Ch.4

Desugar example

Text format, syntactic sugar :

    (func (result i32)
      (if (result i32) (get_global 0)
        (then (i32.const 1))
        (else (i32.const 2))))

Text format, core syntax :

    (func (result i32)
      get_global 0
      if (result i32)
        i32.const 1
      else
        i32.const 2
      end)

Future

Appendix C


References : [2], [3], [4]

Future directions

* zero-cost exceptions, threads, SIMD
* tail calls, stack switching, coroutines
* garbage collectors

References


[1] WebAssembly Specification Release 1.0 (Draft, last updated Oct 31, 2018)

https://webassembly.github.io/spec/core/

[2] Bringing the Web up to Speed with WebAssembly

https://github.com/WebAssembly/spec/blob/master/papers/pldi2017.pdf

[3] WebAssembly High-Level Goals

https://webassembly.org/docs/high-level-goals/

[4] Design Rationale

https://webassembly.org/docs/rationale/

[5] Modules

https://webassembly.org/docs/modules/

[6] MDN: WebAssembly Concepts

https://developer.mozilla.org/en-US/docs/WebAssembly/Concepts

[7] MDN: Understanding WebAssembly text format

https://developer.mozilla.org/en-US/docs/WebAssembly/Understanding_the_text_format

[8] Wikipedia: LEB128

https://en.wikipedia.org/wiki/LEB128


[C1] spec: WebAssembly specification, reference interpreter, and test suite.

https://github.com/WebAssembly/spec

[C2] Binaryen: Compiler infrastructure and toolchain library for WebAssembly, in C++

https://github.com/WebAssembly/binaryen

[C3] WABT: The WebAssembly Binary Toolkit

https://github.com/WebAssembly/wabt

[C4] WAVM: WebAssembly Virtual Machine

https://github.com/WAVM/WAVM

[C5] mozilla/gecko-dev (Firefox)

https://github.com/mozilla/gecko-dev


Here is the slide: https://github.com/takenobu-hs/WebAssembly-illustrated