WebAssembly Specification

1. Introduction

1.1. Introduction

WebAssembly (abbreviated Wasm [1]) is a safe, portable, low-level code format designed for efficient execution and compact representation. Its main goal is to enable high performance applications on the Web, but it does not make any Web-specific assumptions or provide Web-specific features, so it can be employed in other environments as well.

WebAssembly is an open standard developed by a W3C Community Group.

This document describes version 1.0 of the core WebAssembly standard. It is intended that it will be superseded by new incremental releases with additional features in the future.

1.1.1. Design Goals

The design goals of WebAssembly are the following:

Fast, safe, and portable semantics:
- Fast: executes with near native code performance, taking advantage of capabilities common to all contemporary hardware.
- Safe: code is validated and executes in a memory-safe [2], sandboxed environment preventing data corruption or security breaches.
- Well-defined: fully and precisely defines valid programs and their behavior in a way that is easy to reason about informally and formally.
- Hardware-independent: can be compiled on all modern architectures, desktop or mobile devices and embedded systems alike.
- Language-independent: does not privilege any particular language, programming model, or object model.
- Platform-independent: can be embedded in browsers, run as a stand-alone VM, or integrated in other environments.
- Open: programs can interoperate with their environment in a simple and universal manner.
Efficient and portable representation:
- Compact: has a binary format that is fast to transmit by being smaller than typical text or native code formats.
- Modular: programs can be split up in smaller parts that can be transmitted, cached, and consumed separately.
- Efficient: can be decoded, validated, and compiled in a fast single pass, equally with either just-in-time (JIT) or ahead-of-time (AOT) compilation.
- Streamable: allows decoding, validation, and compilation to begin as soon as possible, before all data has been seen.
- Parallelizable: allows decoding, validation, and compilation to be split into many independent parallel tasks.
- Portable: makes no architectural assumptions that are not broadly supported across modern hardware.

WebAssembly code is also intended to be easy to inspect and debug, especially in environments like web browsers, but such features are beyond the scope of this specification.

[1]	A contraction of “WebAssembly”, not an acronym, hence not using all-caps.

[2]	No program can break WebAssembly’s memory model. Of course, it cannot guarantee that an unsafe language compiling to WebAssembly does not corrupt its own memory layout, e.g. inside WebAssembly’s linear memory.

1.1.2. Scope

At its core, WebAssembly is a virtual instruction set architecture (virtual ISA). As such, it has many use cases and can be embedded in many different environments. To encompass their variety and enable maximum reuse, the WebAssembly specification is split and layered into several documents.

This document is concerned with the core ISA layer of WebAssembly. It defines the instruction set, binary encoding, validation, and execution semantics, as well as a textual representation. It does not, however, define how WebAssembly programs can interact with a specific environment they execute in, nor how they are invoked from such an environment.

Instead, this specification is complemented by additional documents defining interfaces to specific embedding environments such as the Web. These will each define a WebAssembly application programming interface (API) suitable for a given environment.

1.2. Security Considerations

WebAssembly provides no ambient access to the computing environment in which code is executed. Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module. An embedder can establish security policies suitable for a respective environment by controlling or limiting which functional capabilities it makes available for import. Such considerations are an embedder’s responsibility and the subject of API definitions for a specific environment.

Because WebAssembly is designed to be translated into machine code running directly on the host’s hardware, it is potentially vulnerable to side channel attacks on the hardware level. In environments where this is a concern, an embedder may have to put suitable mitigations into place to isolate WebAssembly computations.

1.2.1. Dependencies

WebAssembly depends on two existing standards:

[IEEE-754-2019], for the representation of floating-point data and the semantics of respective numeric operations.
[UNICODE], for the representation of import/export names and the text format.

However, to make this specification self-contained, relevant aspects of the aforementioned standards are defined and formalized as part of this specification, such as the binary representation and rounding of floating-point values, and the value range and UTF-8 encoding of Unicode characters.

Note

The aforementioned standards are the authoritative source of all respective definitions. Formalizations given in this specification are intended to match these definitions. Any discrepancy in the syntax or semantics described is to be considered an error.

1.3. Overview

1.3.1. Concepts

WebAssembly encodes a low-level, assembly-like programming language. This language is structured around the following concepts.

Values: WebAssembly provides only four basic value types. These are integers and [IEEE-754-2019] numbers, each in 32 and 64 bit width. 32 bit integers also serve as Booleans and as memory addresses. The usual operations on these types are available, including the full matrix of conversions between them. There is no distinction between signed and unsigned integer types. Instead, integers are interpreted by respective operations as either unsigned or signed in two’s complement representation.

Instructions: The computational model of WebAssembly is based on a stack machine. Code consists of sequences of instructions that are executed in order. Instructions manipulate values on an implicit operand stack [1] and fall into two main categories. Simple instructions perform basic operations on data. They pop arguments from the operand stack and push results back to it. Control instructions alter control flow. Control flow is structured, meaning it is expressed with well-nested constructs such as blocks, loops, and conditionals. Branches can only target such constructs.

Traps: Under some conditions, certain instructions may produce a trap, which immediately aborts execution. Traps cannot be handled by WebAssembly code, but are reported to the outside environment, where they typically can be caught.

Functions: Code is organized into separate functions. Each function takes a sequence of values as parameters and returns a sequence of values as results. [2] Functions can call each other, including recursively, resulting in an implicit call stack that cannot be accessed directly. Functions may also declare mutable local variables that are usable as virtual registers.

Tables: A table is an array of opaque values of a particular element type. It allows programs to select such values indirectly through a dynamic index operand. Currently, the only available element type is an untyped function reference. Thereby, a program can call functions indirectly through a dynamic index into a table. For example, this allows emulating function pointers by way of table indices.

Linear Memory: A linear memory is a contiguous, mutable array of raw bytes. Such a memory is created with an initial size but can be grown dynamically. A program can load and store values from/to a linear memory at any byte address (including unaligned). Integer loads and stores can specify a storage size which is smaller than the size of the respective value type. A trap occurs if an access is not within the bounds of the current memory size.

Modules: A WebAssembly binary takes the form of a module that contains definitions for functions, tables, and linear memories, as well as mutable or immutable global variables. Definitions can also be imported, specifying a module/name pair and a suitable type. Each definition can optionally be exported under one or more names. In addition to definitions, modules can define initialization data for their memories or tables that takes the form of segments copied to given offsets. They can also define a start function that is automatically executed.

Embedder: A WebAssembly implementation will typically be embedded into a host environment. This environment defines how loading of modules is initiated, how imports are provided (including host-side definitions), and how exports can be accessed. However, the details of any particular embedding are beyond the scope of this specification, and will instead be provided by complementary, environment-specific API definitions.

[1]	In practice, implementations need not maintain an actual operand stack. Instead, the stack can be viewed as a set of anonymous registers that are implicitly referenced by instructions. The type system ensures that the stack height, and thus any referenced register, is always known statically.

[2]	In the current version of WebAssembly, there may be at most one result value.

1.3.2. Semantic Phases

Conceptually, the semantics of WebAssembly is divided into three phases. For each part of the language, the specification describes each of them.

Decoding: WebAssembly modules are distributed in a binary format. Decoding processes that format and converts it into an internal representation of a module. In this specification, this representation is modelled by abstract syntax, but a real implementation could compile directly to machine code instead.

Validation: A decoded module has to be valid. Validation checks a number of well-formedness conditions to guarantee that the module is meaningful and safe. In particular, it performs type checking of functions and the instruction sequences in their bodies, ensuring for example that the operand stack is used consistently.

Execution: Finally, a valid module can be executed. Execution can be further divided into two phases:

Instantiation. A module instance is the dynamic representation of a module, complete with its own state and execution stack. Instantiation executes the module body itself, given definitions for all its imports. It initializes globals, memories and tables and invokes the module’s start function if defined. It returns the instances of the module’s exports.

Invocation. Once instantiated, further WebAssembly computations can be initiated by invoking an exported function on a module instance. Given the required arguments, that executes the respective function and returns its results.

Instantiation and invocation are operations within the embedding environment.

2. Structure

2.1. Conventions

WebAssembly is a programming language that has multiple concrete representations (its binary format and the text format). Both map to a common structure. For conciseness, this structure is described in the form of an abstract syntax. All parts of this specification are defined in terms of this abstract syntax.

2.1.1. Grammar Notation

The following conventions are adopted in defining grammar rules for abstract syntax.

Terminal symbols (atoms) are written in sans-serif font: $\mathsf{i32}, \mathsf{end}$.
Nonterminal symbols are written in italic font: $\mathit{valtype}, \mathit{instr}$.
$A^n$ is a sequence of $n \geq 0$ iterations of $A$.
$A^\ast$ is a possibly empty sequence of iterations of $A$. (This is a shorthand for $A^n$ used where $n$ is not relevant.)
$A^+$ is a non-empty sequence of iterations of $A$. (This is a shorthand for $A^n$ where $n \geq 1$.)
$A^?$ is an optional occurrence of $A$. (This is a shorthand for $A^n$ where $n \leq 1$.)
Productions are written $\mathit{sym} ::= A_1 \mid \dots \mid A_n$.
Large productions may be split into multiple definitions, indicated by ending the first one with explicit ellipses, $\mathit{sym} ::= A_1 \mid \dots$, and starting continuations with ellipses, $\mathit{sym} ::= \dots \mid A_2$.
Some productions are augmented with side conditions in parentheses, “$(\text{if}~\mathit{condition})$”, that provide a shorthand for a combinatorial expansion of the production into many separate cases.

2.1.2. Auxiliary Notation

When dealing with syntactic constructs the following notation is also used:

$\epsilon$ denotes the empty sequence.
$|s|$ denotes the length of a sequence $s$.
$s[i]$ denotes the $i$-th element of a sequence $s$, starting from $0$.
$s[i : n]$ denotes the sub-sequence $s[i] \dots s[i + n - 1]$ of a sequence $s$.
$s~\text{with}~[i] = A$ denotes the same sequence as $s$, except that the $i$-th element is replaced with $A$.
$s~\text{with}~[i : n] = A^n$ denotes the same sequence as $s$, except that the sub-sequence $s[i : n]$ is replaced with $A^n$.
$\mathrm{concat}(s^\ast)$ denotes the flat sequence formed by concatenating all sequences $s_i$ in $s^\ast$.
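
The sequence notation above can be mirrored with ordinary list operations. The following is a minimal sketch, not part of the specification; the helper names `sub`, `with_elem`, `with_sub` and `concat` are chosen here for illustration only. Note that in $s[i : n]$ the second component is a length, not an end index as in Python slices.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def sub(s: Sequence[T], i: int, n: int) -> List[T]:
    """s[i : n]: the n elements s[i] ... s[i+n-1]."""
    assert 0 <= i and 0 <= n and i + n <= len(s)
    return list(s[i:i + n])

def with_elem(s: Sequence[T], i: int, a: T) -> List[T]:
    """s with [i] = A: a copy of s whose i-th element is replaced by a."""
    assert 0 <= i < len(s)
    return list(s[:i]) + [a] + list(s[i + 1:])

def with_sub(s: Sequence[T], i: int, n: int, a_n: Sequence[T]) -> List[T]:
    """s with [i : n] = A^n: a copy of s whose sub-sequence s[i : n] is replaced."""
    assert 0 <= i and i + n <= len(s) and len(a_n) == n
    return list(s[:i]) + list(a_n) + list(s[i + n:])

def concat(ss: Sequence[Sequence[T]]) -> List[T]:
    """concat(s*): the flat sequence of all elements of all sequences in ss."""
    return [x for s in ss for x in s]
```

All helpers return new lists, matching the specification's treatment of sequences as immutable values.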

Moreover, the following conventions are employed:

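The notation $x^n$, where $x$ is a non-terminal symbol, is treated as a meta variable ranging over respective sequences of $x$ (similarly for $x^\ast$, $x^+$, $x^?$).
When given a sequence $x^n$, then the occurrences of $x$ in a sequence written $(A_1~x~A_2)^n$ are assumed to be in point-wise correspondence with $x^n$ (similarly for $x^\ast$, $x^+$, $x^?$). This implicitly expresses a form of mapping syntactic constructions over a sequence.

Productions of the following form are interpreted as records that map a fixed set of fields $\mathsf{field}_i$ to “values” $A_i$, respectively:

$$r ::= \{ \mathsf{field}_1~A_1, \mathsf{field}_2~A_2, \dots \}$$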

The following notation is adopted for manipulating such records:

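$r.\mathsf{field}$ denotes the contents of the $\mathsf{field}$ component of $r$.
$r~\text{with}~\mathsf{field} = A$ denotes the same record as $r$, except that the contents of the $\mathsf{field}$ component is replaced with $A$.
$r_1 \oplus r_2$ denotes the composition of two records with the same fields of sequences by appending each sequence point-wise:

$$\{ \mathsf{field}_1~A_1^\ast, \mathsf{field}_2~A_2^\ast, \dots \} \oplus \{ \mathsf{field}_1~B_1^\ast, \mathsf{field}_2~B_2^\ast, \dots \} = \{ \mathsf{field}_1~A_1^\ast~B_1^\ast, \mathsf{field}_2~A_2^\ast~B_2^\ast, \dots \}$$

$\bigoplus r^\ast$ denotes the composition of a sequence of records, respectively; if the sequence is empty, then all fields of the resulting record are empty.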
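
As an illustration of record composition, the sketch below models a record as a Python dictionary from field names to lists. This representation and the names `compose` and `compose_all` are assumptions made for the example, not constructs defined by the specification.

```python
from typing import Dict, List, Sequence

Record = Dict[str, List[object]]

def compose(r1: Record, r2: Record) -> Record:
    """Composition of two records with the same fields: append point-wise."""
    assert r1.keys() == r2.keys(), "records must have the same fields"
    return {field: r1[field] + r2[field] for field in r1}

def compose_all(records: Sequence[Record], fields: Sequence[str]) -> Record:
    """Composition of a sequence of records; all fields are empty if it is empty."""
    result: Record = {field: [] for field in fields}
    for r in records:
        result = compose(result, r)
    return result
```

The field names must be passed to `compose_all` explicitly because an empty sequence of records carries no information about which fields the result should have.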

The update notation for sequences and records generalizes recursively to nested components accessed by “paths” $\mathit{pth} ::= ([\dots] \mid .\mathsf{field})^+$:

$s~\text{with}~[i]\,\mathit{pth} = A$ is short for $s~\text{with}~[i] = (s[i]~\text{with}~\mathit{pth} = A)$.
$r~\text{with}~\mathsf{field}\,\mathit{pth} = A$ is short for $r~\text{with}~\mathsf{field} = (r.\mathsf{field}~\text{with}~\mathit{pth} = A)$.

where $r~\text{with}~.\mathsf{field} = A$ is shortened to $r~\text{with}~\mathsf{field} = A$.

2.1.3. Vectors

Vectors are bounded sequences of the form $A^n$ (or $A^\ast$), where the $A$ can either be values or complex constructions. A vector can have at most $2^{32} - 1$ elements.

$$\mathit{vec}(A) ::= A^n \quad (\text{if}~n < 2^{32})$$

2.2. Values

WebAssembly programs operate on primitive numeric values. Moreover, in the definition of programs, immutable sequences of values occur to represent more complex data, such as text strings or other vectors.

2.2.1. Bytes

The simplest form of value are raw uninterpreted bytes. In the abstract syntax they are represented as hexadecimal literals.

$$\mathit{byte} ::= \mathtt{0x00} \mid \dots \mid \mathtt{0xFF}$$

2.2.1.1. Conventions

The meta variable $b$ ranges over bytes.
Bytes are sometimes interpreted as natural numbers $n < 256$.

2.2.2. Integers

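Different classes of integers with different value ranges are distinguished by their bit width $N$ and by whether they are unsigned or signed.

$$\begin{array}{lll}
\mathit{uN} & ::= & 0 \mid 1 \mid \dots \mid 2^N - 1 \\
\mathit{sN} & ::= & -2^{N-1} \mid \dots \mid -1 \mid 0 \mid 1 \mid \dots \mid 2^{N-1} - 1 \\
\mathit{iN} & ::= & \mathit{uN}
\end{array}$$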

The latter class defines uninterpreted integers, whose signedness interpretation can vary depending on context. In the abstract syntax, they are represented as unsigned values. However, some operations convert them to signed based on a two’s complement interpretation.
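
The two's complement interpretation mentioned above can be sketched as a pair of conversions between the unsigned representation used in the abstract syntax and its signed reading. The function names `signed` and `unsigned` are hypothetical helpers chosen for this example.

```python
def signed(n_bits: int, i: int) -> int:
    """Interpret an uninterpreted N-bit integer as signed (two's complement)."""
    assert 0 <= i < 2 ** n_bits
    return i - 2 ** n_bits if i >= 2 ** (n_bits - 1) else i

def unsigned(n_bits: int, j: int) -> int:
    """Map a signed N-bit integer back to its unsigned representation."""
    assert -(2 ** (n_bits - 1)) <= j < 2 ** (n_bits - 1)
    # Python's % yields a non-negative result for a positive modulus.
    return j % 2 ** n_bits
```

For example, the 32 bit pattern `0xFFFFFFFF` reads as 4294967295 when unsigned and as -1 when signed; the stored value itself is the same in both cases.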

Note

The main integer types occurring in this specification are $\mathit{u32}$, $\mathit{u64}$, $\mathit{s32}$, $\mathit{s64}$, $\mathit{i8}$, $\mathit{i16}$, $\mathit{i32}$, $\mathit{i64}$. However, other sizes occur as auxiliary constructions, e.g., in the definition of floating-point numbers.

2.2.2.1. Conventions

The meta variables $m, n, i$ range over integers.
Numbers may be denoted by simple arithmetics, as in the grammar above. In order to distinguish arithmetics like $2^N$ from sequences like $(1)^N$, the latter is distinguished with parentheses.

2.2.3. Floating-Point

Floating-point data represents 32 or 64 bit values that correspond to the respective binary formats of the [IEEE-754-2019] standard (Section 3.3).

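Every value has a sign and a magnitude. Magnitudes can either be expressed as normal numbers of the form $m_0.m_1 m_2 \dots m_M \cdot 2^e$, where $e$ is the exponent and $m$ is the significand whose most significant bit $m_0$ is $1$, or as a subnormal number where the exponent is fixed to the smallest possible value and $m_0$ is $0$; among the subnormals are positive and negative zero values. Since the significands are binary values, normals are represented in the form $(1 + m \cdot 2^{-M}) \cdot 2^e$, where $M$ is the bit width of $m$; similarly for subnormals.

Possible magnitudes also include the special values $\infty$ (infinity) and $\mathsf{nan}$ (NaN, not a number). NaN values have a payload that describes the mantissa bits in the underlying binary representation. No distinction is made between signalling and quiet NaNs.

$$\begin{array}{llll}
\mathit{fN} & ::= & {+}\,\mathit{fNmag} \mid {-}\,\mathit{fNmag} \\
\mathit{fNmag} & ::= & (1 + \mathit{uM} \cdot 2^{-M}) \cdot 2^e & (\text{if}~{-2^{E-1}} + 2 \leq e \leq 2^{E-1} - 1) \\
& \mid & (0 + \mathit{uM} \cdot 2^{-M}) \cdot 2^e & (\text{if}~e = -2^{E-1} + 2) \\
& \mid & \infty \\
& \mid & \mathsf{nan}(n) & (\text{if}~1 \leq n < 2^M)
\end{array}$$

where $M = \mathrm{signif}(N)$ and $E = \mathrm{expon}(N)$ with

$$\begin{array}{lclclcl}
\mathrm{signif}(32) & = & 23 & \qquad & \mathrm{expon}(32) & = & 8 \\
\mathrm{signif}(64) & = & 52 & \qquad & \mathrm{expon}(64) & = & 11
\end{array}$$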

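A canonical NaN is a floating-point value $\pm\mathsf{nan}(\mathrm{canon}_N)$ where $\mathrm{canon}_N$ is a payload whose most significant bit is $1$ while all others are $0$:

$$\mathrm{canon}_N = 2^{\mathrm{signif}(N) - 1}$$

An arithmetic NaN is a floating-point value $\pm\mathsf{nan}(n)$ with $n \geq \mathrm{canon}_N$, such that the most significant bit is $1$ while all others are arbitrary.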
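
To make the NaN classes concrete, the sketch below classifies raw bit patterns. It assumes the standard IEEE 754 binary interchange layout (one sign bit, then the exponent field, then the significand field); the function names are hypothetical and the code is an illustration rather than a normative definition.

```python
from typing import Optional

SIGNIF = {32: 23, 64: 52}  # significand bit widths for f32 and f64

def canon(n: int) -> int:
    """Canonical NaN payload: only the most significant payload bit is set."""
    return 1 << (SIGNIF[n] - 1)

def nan_payload(n: int, bits: int) -> Optional[int]:
    """Return the payload if `bits` encodes a NaN of width n, else None."""
    m = SIGNIF[n]
    exp_width = n - 1 - m
    exponent = (bits >> m) & ((1 << exp_width) - 1)
    payload = bits & ((1 << m) - 1)
    # A NaN has an all-ones exponent field and a non-zero payload.
    if exponent == (1 << exp_width) - 1 and payload != 0:
        return payload
    return None

def is_canonical_nan(n: int, bits: int) -> bool:
    return nan_payload(n, bits) == canon(n)

def is_arithmetic_nan(n: int, bits: int) -> bool:
    payload = nan_payload(n, bits)
    return payload is not None and payload >= canon(n)
```

The sign bit is ignored by both predicates, reflecting the $\pm$ in the definitions. Every canonical NaN is also an arithmetic NaN.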

Note

In the abstract syntax, subnormals are distinguished by the leading 0 of the significand. The exponent of subnormals has the same value as the smallest possible exponent of a normal number. Only in the binary representation the exponent of a subnormal is encoded differently than the exponent of any normal number.

2.2.3.1. Conventions

The meta variable $z$ ranges over floating-point values where clear from context.

2.2.4. Names

Names are sequences of characters, which are scalar values as defined by [UNICODE] (Section 2.4).

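$$\begin{array}{llll}
\mathit{name} & ::= & \mathit{char}^\ast & (\text{if}~|\mathrm{utf8}(\mathit{char}^\ast)| < 2^{32}) \\
\mathit{char} & ::= & \mathrm{U{+}00} \mid \dots \mid \mathrm{U{+}D7FF} \mid \mathrm{U{+}E000} \mid \dots \mid \mathrm{U{+}10FFFF}
\end{array}$$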

Due to the limitations of the binary format, the length of a name is bounded by the length of its UTF-8 encoding.
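
The following sketch shows how the well-formedness condition on names could be checked when characters are given as code points. The helper names are hypothetical; the byte counts per code point range follow the UTF-8 encoding defined by [UNICODE].

```python
from typing import Sequence

def is_char(c: int) -> bool:
    """True if c is a Unicode scalar value (any code point except a surrogate)."""
    return 0 <= c <= 0xD7FF or 0xE000 <= c <= 0x10FFFF

def utf8_len(chars: Sequence[int]) -> int:
    """Number of bytes in the UTF-8 encoding of a sequence of scalar values."""
    return sum(1 if c < 0x80 else 2 if c < 0x800 else 3 if c < 0x10000 else 4
               for c in chars)

def is_name(chars: Sequence[int]) -> bool:
    """A name is a sequence of characters whose UTF-8 encoding is shorter than 2^32 bytes."""
    return all(is_char(c) for c in chars) and utf8_len(chars) < 2 ** 32
```

Because a character may occupy up to four bytes, a name can be limited by its encoded length well before its character count approaches the bound.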

2.2.4.1. Convention

Characters (Unicode scalar values) are sometimes used interchangeably with natural numbers $n < 1114112$.

2.3. Types

Various entities in WebAssembly are classified by types. Types are checked during validation, instantiation, and possibly execution.

2.3.1. Value Types

Value types classify the individual values that WebAssembly code can compute with and the values that a variable accepts.

$$\mathit{valtype} ::= \mathsf{i32} \mid \mathsf{i64} \mid \mathsf{f32} \mid \mathsf{f64}$$

The types $\mathsf{i32}$ and $\mathsf{i64}$ classify 32 and 64 bit integers, respectively. Integers are not inherently signed or unsigned; their interpretation is determined by individual operations.

The types $\mathsf{f32}$ and $\mathsf{f64}$ classify 32 and 64 bit floating-point data, respectively. They correspond to the respective binary floating-point representations, also known as single and double precision, as defined by the [IEEE-754-2019] standard (Section 3.3).

2.3.1.1. Conventions

The meta variable $t$ ranges over value types where clear from context.
The notation $|t|$ denotes the bit width of a value type. That is, $|\mathsf{i32}| = |\mathsf{f32}| = 32$ and $|\mathsf{i64}| = |\mathsf{f64}| = 64$.

2.3.2. Result Types

Result types classify the result of executing instructions or blocks, which is a sequence of values written with brackets.

$$\mathit{resulttype} ::= [\mathit{valtype}^?]$$

Note

In the current version of WebAssembly, at most one value is allowed as a result. However, this may be generalized to sequences of values in future versions.

2.3.3. Function Types

Function types classify the signature of functions, mapping a vector of parameters to a vector of results, written as follows.

$$\mathit{functype} ::= [\mathit{vec}(\mathit{valtype})] \to [\mathit{vec}(\mathit{valtype})]$$

Note

In the current version of WebAssembly, the length of the result type vector of a valid function type may be at most $1$. This restriction may be removed in future versions.

2.3.4. Limits

Limits classify the size range of resizeable storage associated with memory types and table types.

$$\mathit{limits} ::= \{ \mathsf{min}~\mathit{u32}, \mathsf{max}~\mathit{u32}^? \}$$

If no maximum is given, the respective storage can grow to any size.

2.3.5. Memory Types

Memory types classify linear memories and their size range.

$$\mathit{memtype} ::= \mathit{limits}$$

The limits constrain the minimum and optionally the maximum size of a memory. The limits are given in units of page size.

2.3.6. Table Types

Table types classify tables over elements of element types within a size range.

$$\begin{array}{lll}
\mathit{tabletype} & ::= & \mathit{limits}~\mathit{elemtype} \\
\mathit{elemtype} & ::= & \mathsf{funcref}
\end{array}$$

Like memories, tables are constrained by limits for their minimum and optionally maximum size. The limits are given in numbers of entries.

The element type $\mathsf{funcref}$ is the infinite union of all function types. A table of that type thus contains references to functions of heterogeneous type.

Note

In future versions of WebAssembly, additional element types may be introduced.

2.3.7. Global Types

Global types classify global variables, which hold a value and can either be mutable or immutable.

$$\begin{array}{lll}
\mathit{globaltype} & ::= & \mathit{mut}~\mathit{valtype} \\
\mathit{mut} & ::= & \mathsf{const} \mid \mathsf{var}
\end{array}$$

2.3.8. External Types

External types classify imports and external values with their respective types.

$$\mathit{externtype} ::= \mathsf{func}~\mathit{functype} \mid \mathsf{table}~\mathit{tabletype} \mid \mathsf{mem}~\mathit{memtype} \mid \mathsf{global}~\mathit{globaltype}$$

2.3.8.1. Conventions

The following auxiliary notation is defined for sequences of external types. It filters out entries of a specific kind in an order-preserving fashion:

$\mathrm{funcs}(\mathit{externtype}^\ast) = [\mathit{functype} \mid (\mathsf{func}~\mathit{functype}) \in \mathit{externtype}^\ast]$
$\mathrm{tables}(\mathit{externtype}^\ast) = [\mathit{tabletype} \mid (\mathsf{table}~\mathit{tabletype}) \in \mathit{externtype}^\ast]$
$\mathrm{mems}(\mathit{externtype}^\ast) = [\mathit{memtype} \mid (\mathsf{mem}~\mathit{memtype}) \in \mathit{externtype}^\ast]$
$\mathrm{globals}(\mathit{externtype}^\ast) = [\mathit{globaltype} \mid (\mathsf{global}~\mathit{globaltype}) \in \mathit{externtype}^\ast]$
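
These filters behave like order-preserving list comprehensions. In the sketch below an external type is modelled as a `(kind, type)` pair, which is an assumption made for the example; `globals_` carries a trailing underscore only to avoid shadowing Python's built-in `globals`.

```python
from typing import List, Sequence, Tuple

ExternType = Tuple[str, object]  # (kind, type), kind in {"func", "table", "mem", "global"}

def _select(kind: str, externtypes: Sequence[ExternType]) -> List[object]:
    """Keep the types of all entries of the given kind, preserving their order."""
    return [t for k, t in externtypes if k == kind]

def funcs(externtypes: Sequence[ExternType]) -> List[object]:
    return _select("func", externtypes)

def tables(externtypes: Sequence[ExternType]) -> List[object]:
    return _select("table", externtypes)

def mems(externtypes: Sequence[ExternType]) -> List[object]:
    return _select("mem", externtypes)

def globals_(externtypes: Sequence[ExternType]) -> List[object]:
    return _select("global", externtypes)
```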

2.4. Instructions

WebAssembly code consists of sequences of instructions. Its computational model is based on a stack machine in that instructions manipulate values on an implicit operand stack, consuming (popping) argument values and producing or returning (pushing) result values.

Note

In the current version of WebAssembly, at most one result value can be pushed by a single instruction. This restriction may be lifted in future versions.

In addition to dynamic operands from the stack, some instructions also have static immediate arguments, typically indices or type annotations, which are part of the instruction itself.

Some instructions are structured in that they bracket nested sequences of instructions.

The following sections group instructions into a number of different categories.

2.4.1. Numeric Instructions

Numeric instructions provide basic operations over numeric values of specific type. These operations closely match respective operations available in hardware.

Numeric instructions are divided by value type. For each type, several subcategories can be distinguished:

Constants: return a static constant.
Unary Operators: consume one operand and produce one result of the respective type.
Binary Operators: consume two operands and produce one result of the respective type.
Tests: consume one operand of the respective type and produce a Boolean integer result.
Comparisons: consume two operands of the respective type and produce a Boolean integer result.
Conversions: consume a value of one type and produce a result of another (the source type of the conversion is the one after the “_”).

Some integer instructions come in two flavors, where a signedness annotation $\mathit{sx}$ distinguishes whether the operands are to be interpreted as unsigned or signed integers. For the other integer instructions, the use of two’s complement for the signed interpretation means that they behave the same regardless of signedness.

2.4.1.1. Conventions

Occasionally, it is convenient to group operators together according to the following grammar shorthands:

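$$\begin{array}{lll}
\mathit{unop} & ::= & \mathit{iunop} \mid \mathit{funop} \\
\mathit{binop} & ::= & \mathit{ibinop} \mid \mathit{fbinop} \\
\mathit{testop} & ::= & \mathit{itestop} \\
\mathit{relop} & ::= & \mathit{irelop} \mid \mathit{frelop} \\
\mathit{cvtop} & ::= & \mathsf{wrap} \mid \mathsf{extend} \mid \mathsf{trunc} \mid \mathsf{convert} \mid \mathsf{demote} \mid \mathsf{promote} \mid \mathsf{reinterpret}
\end{array}$$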

2.4.2. Parametric Instructions

Instructions in this group can operate on operands of any value type.

$$\begin{array}{lll}
\mathit{instr} & ::= & \dots \\
& \mid & \mathsf{drop} \\
& \mid & \mathsf{select}
\end{array}$$

The $\mathsf{drop}$ operator simply throws away a single operand.

The $\mathsf{select}$ operator selects one of its first two operands based on whether its third operand is zero or not.
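
The effect of the two parametric instructions on the operand stack can be sketched as follows, with the stack modelled as a Python list whose last element is the top. Which operand is chosen for a non-zero condition is taken from the execution semantics of the specification, not from the paragraph above: a non-zero third operand selects the first operand, zero selects the second.

```python
from typing import List

def drop(stack: List[object]) -> None:
    """drop: discard the operand on top of the stack."""
    stack.pop()

def select(stack: List[object]) -> None:
    """select: pop a condition and two operands, push back one of the operands."""
    condition = stack.pop()  # third operand, an i32
    second = stack.pop()
    first = stack.pop()
    stack.append(first if condition != 0 else second)
```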

2.4.3. Variable Instructions

Variable instructions are concerned with access to local or global variables.

$$\begin{array}{lll}
\mathit{instr} & ::= & \dots \\
& \mid & \mathsf{local.get}~\mathit{localidx} \\
& \mid & \mathsf{local.set}~\mathit{localidx} \\
& \mid & \mathsf{local.tee}~\mathit{localidx} \\
& \mid & \mathsf{global.get}~\mathit{globalidx} \\
& \mid & \mathsf{global.set}~\mathit{globalidx}
\end{array}$$

These instructions get or set the values of variables, respectively. The $\mathsf{local.tee}$ instruction is like $\mathsf{local.set}$ but also returns its argument.

2.4.4. Memory Instructions

Instructions in this group are concerned with linear memory.

$$\begin{array}{lll}
\mathit{memarg} & ::= & \{ \mathsf{offset}~\mathit{u32}, \mathsf{align}~\mathit{u32} \} \\
\mathit{instr} & ::= & \dots \\
& \mid & \mathsf{i}\mathit{nn}\mathsf{.load}~\mathit{memarg} \mid \mathsf{f}\mathit{nn}\mathsf{.load}~\mathit{memarg} \\
& \mid & \mathsf{i}\mathit{nn}\mathsf{.store}~\mathit{memarg} \mid \mathsf{f}\mathit{nn}\mathsf{.store}~\mathit{memarg} \\
& \mid & \mathsf{i}\mathit{nn}\mathsf{.load8\_}\mathit{sx}~\mathit{memarg} \mid \mathsf{i}\mathit{nn}\mathsf{.load16\_}\mathit{sx}~\mathit{memarg} \mid \mathsf{i64.load32\_}\mathit{sx}~\mathit{memarg} \\
& \mid & \mathsf{i}\mathit{nn}\mathsf{.store8}~\mathit{memarg} \mid \mathsf{i}\mathit{nn}\mathsf{.store16}~\mathit{memarg} \mid \mathsf{i64.store32}~\mathit{memarg} \\
& \mid & \mathsf{memory.size} \\
& \mid & \mathsf{memory.grow}
\end{array}$$

Memory is accessed with $\mathsf{load}$ and $\mathsf{store}$ instructions for the different value types. They all take a memory immediate $\mathit{memarg}$ that contains an address offset and the expected alignment (expressed as the exponent of a power of 2). Integer loads and stores can optionally specify a storage size that is smaller than the bit width of the respective value type. In the case of loads, a sign extension mode $\mathit{sx}$ is then required to select appropriate behavior.

The static address offset is added to the dynamic address operand, yielding a 33 bit effective address that is the zero-based index at which the memory is accessed. All values are read and written in little endian byte order. A trap results if any of the accessed memory bytes lies outside the address range implied by the memory’s current size.
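
The address computation and bounds check described above might be sketched as follows for integer accesses. The memory is modelled as a `bytearray`, a trap as a Python exception, and the names `Trap`, `load` and `store` are hypothetical; the alignment immediate is omitted from this illustration.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap reported to the embedder."""

def effective_address(addr: int, offset: int) -> int:
    """Add the static offset to the dynamic address without 32 bit wrap-around."""
    assert 0 <= addr < 2 ** 32 and 0 <= offset < 2 ** 32
    return addr + offset  # fits in 33 bits

def load(mem: bytearray, addr: int, offset: int, width: int, signed: bool = False) -> int:
    """Read `width` bytes in little endian order, trapping if out of bounds."""
    ea = effective_address(addr, offset)
    if ea + width > len(mem):
        raise Trap("out of bounds memory access")
    return int.from_bytes(mem[ea:ea + width], "little", signed=signed)

def store(mem: bytearray, addr: int, offset: int, width: int, value: int) -> None:
    """Write the low `width` bytes of `value` in little endian order."""
    ea = effective_address(addr, offset)
    if ea + width > len(mem):
        raise Trap("out of bounds memory access")
    mem[ea:ea + width] = (value % (1 << (8 * width))).to_bytes(width, "little")
```

Because the effective address is not truncated to 32 bits, an access such as address `0xFFFFFFFF` with offset `0xFFFFFFFF` traps instead of wrapping around to a low address.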

Note

Future versions of WebAssembly might provide memory instructions with 64 bit address ranges.

The $\mathsf{memory.size}$ instruction returns the current size of a memory. The $\mathsf{memory.grow}$ instruction grows memory by a given delta and returns the previous size, or $-1$ if enough memory cannot be allocated. Both instructions operate in units of page size.
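
A minimal sketch of the two instructions, again modelling memory as a `bytearray`. The page size of 64 KiB and the overall cap of $2^{16}$ pages are taken from elsewhere in the specification rather than from this section; the function names and the `max_pages` parameter are assumptions for the example.

```python
from typing import Optional

PAGE_SIZE = 65536   # 64 KiB, as defined elsewhere in the specification
MAX_PAGES = 65536   # 2^16 pages cover the 32 bit address space

def memory_size(mem: bytearray) -> int:
    """memory.size: current size in pages."""
    return len(mem) // PAGE_SIZE

def memory_grow(mem: bytearray, delta: int, max_pages: Optional[int] = None) -> int:
    """memory.grow: grow by `delta` pages; return the old size in pages, or -1."""
    old = memory_size(mem)
    limit = MAX_PAGES if max_pages is None else min(max_pages, MAX_PAGES)
    if old + delta > limit:
        return -1  # as an i32 result this is the bit pattern 0xFFFFFFFF
    mem.extend(bytes(delta * PAGE_SIZE))  # new pages are zero-initialized
    return old
```

A failed grow leaves the memory unchanged, so a program can test the result and continue with the memory it already has.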

Note

In the current version of WebAssembly, all memory instructions implicitly operate on memory index $0$. This restriction may be lifted in future versions.

2.4.5. Control Instructions

Instructions in this group affect the flow of control.

The $\mathsf{nop}$ instruction does nothing.

The $\mathsf{unreachable}$ instruction causes an unconditional trap.

The $\mathsf{block}$, $\mathsf{loop}$ and $\mathsf{if}$ instructions are structured instructions. They bracket nested sequences of instructions, called blocks, terminated with, or separated by, $\mathsf{end}$ or $\mathsf{else}$ pseudo-instructions. As the grammar prescribes, they must be well-nested. A structured instruction can produce a value as described by the annotated result type.

Each structured control instruction introduces an implicit label. Labels are targets for branch instructions that reference them with label indices. Unlike with other index spaces, indexing of labels is relative by nesting depth, that is, label $0$ refers to the innermost structured control instruction enclosing the referring branch instruction, while increasing indices refer to those farther out. Consequently, labels can only be referenced from within the associated structured control instruction. This also implies that branches can only be directed outwards, “breaking” from the block of the control construct they target. The exact effect depends on that control construct. In case of $\mathsf{block}$ or $\mathsf{if}$ it is a forward jump, resuming execution after the matching $\mathsf{end}$. In case of $\mathsf{loop}$ it is a backward jump to the beginning of the loop.

Note

This enforces structured control flow. Intuitively, a branch targeting a $\mathsf{block}$ or $\mathsf{if}$ behaves like a $\mathtt{break}$ statement in most C-like languages, while a branch targeting a $\mathsf{loop}$ behaves like a $\mathtt{continue}$ statement.
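
Relative label indexing can be illustrated with a list of the structured instructions that enclose a branch, ordered from outermost to innermost. This representation and the function names are assumptions made for the sketch.

```python
from typing import List

def resolve_label(enclosing: List[str], label: int) -> str:
    """Return the construct a label index refers to; label 0 is the innermost."""
    if not 0 <= label < len(enclosing):
        raise IndexError("label index refers to no enclosing construct")
    return enclosing[len(enclosing) - 1 - label]

def branch_effect(construct: str) -> str:
    """Describe where execution resumes after branching to the given construct."""
    if construct == "loop":
        return "backward jump to the beginning of the loop"
    return "forward jump to after the matching end"
```

For a branch nested inside `block`, then `loop`, then `if`, label 0 targets the `if`, label 1 the `loop`, and label 2 the outer `block`.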

Branch instructions come in several flavors: $\mathsf{br}$ performs an unconditional branch, $\mathsf{br\_if}$ performs a conditional branch, and $\mathsf{br\_table}$ performs an indirect branch through an operand indexing into the label vector that is an immediate to the instruction, or to a default target if the operand is out of bounds. The $\mathsf{return}$ instruction is a shortcut for an unconditional branch to the outermost block, which implicitly is the body of the current function. Taking a branch unwinds the operand stack up to the height where the targeted structured control instruction was entered. However, forward branches that target a control instruction with a non-empty result type consume matching operands first and push them back on the operand stack after unwinding, as a result for the terminated structured instruction.
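
The target selection of an indirect branch reduces to a bounds-checked lookup, sketched below. The function name is hypothetical, and the operand is assumed to be the unsigned interpretation of the i32 value taken from the stack.

```python
from typing import Sequence

def br_table_target(labels: Sequence[int], default: int, operand: int) -> int:
    """Pick the label index an indirect branch jumps to."""
    assert 0 <= operand < 2 ** 32  # i32 operand, read as unsigned
    return labels[operand] if operand < len(labels) else default
```

An out-of-bounds operand therefore never traps; it simply selects the default label, which is what makes the instruction suitable for compiling `switch`-like constructs.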

The $\mathsf{call}$ instruction invokes another function, consuming the necessary arguments from the stack and returning the result values of the call. The $\mathsf{call\_indirect}$ instruction calls a function indirectly through an operand indexing into a table. Since tables may contain function elements of heterogeneous type $\mathsf{funcref}$, the callee is dynamically checked against the function type indexed by the instruction’s immediate, and the call aborted with a trap if it does not match.
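
The dynamic check performed by an indirect call might look as follows. Function types are modelled as pairs of tuples and table entries as optional `Func` values; all names here are hypothetical. The traps for an out-of-bounds index and for an uninitialized element come from the execution semantics of the specification rather than from the paragraph above.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple

class Trap(Exception):
    """Stands in for a WebAssembly trap reported to the embedder."""

FuncType = Tuple[Tuple[str, ...], Tuple[str, ...]]  # (parameter types, result types)

class Func(NamedTuple):
    functype: FuncType
    code: Callable[..., object]

def call_indirect(table: List[Optional[Func]], types: Sequence[FuncType],
                  type_idx: int, i: int, args: Sequence[object]) -> object:
    """Call table[i] if its type equals types[type_idx]; trap otherwise."""
    expected = types[type_idx]
    if i >= len(table):
        raise Trap("undefined element")
    callee = table[i]
    if callee is None:
        raise Trap("uninitialized element")
    if callee.functype != expected:
        raise Trap("indirect call type mismatch")
    return callee.code(*args)
```

The comparison is structural: two functions declared with separate but identical type definitions pass the check.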

Note

In the current version of WebAssembly, $\mathsf{call\_indirect}$ implicitly operates on table index $0$. This restriction may be lifted in future versions.

2.4.6. Expressions

Function bodies, initialization values for globals, and offsets of element or data segments are given as expressions, which are sequences of instructions terminated by an $\mathsf{end}$ marker.

$$\mathit{expr} ::= \mathit{instr}^\ast~\mathsf{end}$$

In some places, validation restricts expressions to be constant, which limits the set of allowable instructions.

2.5. Modules

WebAssembly programs are organized into modules, which are the unit of deployment, loading, and compilation. A module collects definitions for types, functions, tables, memories, and globals. In addition, it can declare imports and exports and provide initialization logic in the form of data and element segments or a start function.

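$$\begin{array}{lll}
\mathit{module} & ::= & \{~\mathsf{types}~\mathit{vec}(\mathit{functype}), \\
& & ~~\mathsf{funcs}~\mathit{vec}(\mathit{func}), \\
& & ~~\mathsf{tables}~\mathit{vec}(\mathit{table}), \\
& & ~~\mathsf{mems}~\mathit{vec}(\mathit{mem}), \\
& & ~~\mathsf{globals}~\mathit{vec}(\mathit{global}), \\
& & ~~\mathsf{elem}~\mathit{vec}(\mathit{elem}), \\
& & ~~\mathsf{data}~\mathit{vec}(\mathit{data}), \\
& & ~~\mathsf{start}~\mathit{start}^?, \\
& & ~~\mathsf{imports}~\mathit{vec}(\mathit{import}), \\
& & ~~\mathsf{exports}~\mathit{vec}(\mathit{export})~\}
\end{array}$$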

Each of the vectors – and thus the entire module – may be empty.

2.5.1. Indices

Definitions are referenced with zero-based indices. Each class of definition has its own index space, as distinguished by the following classes.

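$$\begin{array}{lll}
\mathit{typeidx} & ::= & \mathit{u32} \\
\mathit{funcidx} & ::= & \mathit{u32} \\
\mathit{tableidx} & ::= & \mathit{u32} \\
\mathit{memidx} & ::= & \mathit{u32} \\
\mathit{globalidx} & ::= & \mathit{u32} \\
\mathit{localidx} & ::= & \mathit{u32} \\
\mathit{labelidx} & ::= & \mathit{u32}
\end{array}$$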

The index space for functions, tables, memories and globals includes respective imports declared in the same module. The indices of these imports precede the indices of other definitions in the same index space.

The index space for locals is only accessible inside a function and includes the parameters of that function, which precede the local variables.

Label indices reference structured control instructions inside an instruction sequence.
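
The ordering rules for index spaces can be sketched by building the spaces explicitly. Imports are modelled as `(kind, name)` pairs and definitions as plain names; this representation and the function names are assumptions for the example.

```python
from typing import List, Sequence, Tuple

def function_index_space(imports: Sequence[Tuple[str, str]],
                         funcs: Sequence[str]) -> List[str]:
    """Imported functions come first, followed by the module's own functions."""
    imported = [name for kind, name in imports if kind == "func"]
    return imported + list(funcs)

def local_index_space(params: Sequence[str], locals_: Sequence[str]) -> List[str]:
    """Parameters come first, followed by the declared local variables."""
    return list(params) + list(locals_)
```

With one imported function and two defined functions, the first defined function therefore has index 1, not 0. Imports of other kinds do not shift function indices, since each kind has its own index space.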

2.5.1.1. Conventions

The meta variable $l$ ranges over label indices.
The meta variables $x, y$ range over indices in any of the other index spaces.

2.5.2. Types

The $\mathsf{types}$ component of a module defines a vector of function types.

All function types used in a module must be defined in this component. They are referenced by type indices.

Note

Future versions of WebAssembly may add additional forms of type definitions.

2.5.3. Functions

The $\mathsf{funcs}$ component of a module defines a vector of functions with the following structure:

$$\mathit{func} ::= \{\mathsf{type}~\mathit{typeidx}, \mathsf{locals}~\mathit{vec}(\mathit{valtype}), \mathsf{body}~\mathit{expr}\}$$

The $\mathsf{type}$ of a function declares its signature by reference to a type defined in the module. The parameters of the function are referenced through 0-based local indices in the function’s body; they are mutable.

The $\mathsf{locals}$ declare a vector of mutable local variables and their types. These variables are referenced through local indices in the function’s body. The index of the first local is the smallest index not referencing a parameter.

The $\mathsf{body}$ is an instruction sequence that upon termination must produce a stack matching the function type’s result type.

Functions are referenced through function indices, starting with the smallest index not referencing a function import.

2.5.4. Tables

The $\mathsf{tables}$ component of a module defines a vector of tables described by their table type:

$$\mathit{table} ::= \{\mathsf{type}~\mathit{tabletype}\}$$

A table is a vector of opaque values of a particular table element type. The $\mathsf{min}$ size in the limits of the table type specifies the initial size of that table, while its $\mathsf{max}$, if present, restricts the size to which it can grow later.

Tables can be initialized through element segments.

Tables are referenced through table indices, starting with the smallest index not referencing a table import. Most constructs implicitly reference table index $0$.

Note

In the current version of WebAssembly, at most one table may be defined or imported in a single module, and all constructs implicitly reference this table $0$. This restriction may be lifted in future versions.

2.5.5. Memories

The $\mathsf{mems}$ component of a module defines a vector of linear memories (or memories for short) as described by their memory type:

$$\mathit{mem} ::= \{\mathsf{type}~\mathit{memtype}\}$$

A memory is a vector of raw uninterpreted bytes. The $\mathsf{min}$ size in the limits of the memory type specifies the initial size of that memory, while its $\mathsf{max}$, if present, restricts the size to which it can grow later. Both are in units of page size.

Memories can be initialized through data segments.

Memories are referenced through memory indices, starting with the smallest index not referencing a memory import. Most constructs implicitly reference memory index $0$.

Note

In the current version of WebAssembly, at most one memory may be defined or imported in a single module, and all constructs implicitly reference this memory $0$. This restriction may be lifted in future versions.

2.5.6. Globals

The $\mathsf{globals}$ component of a module defines a vector of global variables (or globals for short):

$$\mathit{global} ::= \{\mathsf{type}~\mathit{globaltype}, \mathsf{init}~\mathit{expr}\}$$

Each global stores a single value of the given global type. Its $\mathsf{type}$ also specifies whether a global is immutable or mutable. Moreover, each global is initialized with an $\mathsf{init}$ value given by a constant initializer expression.

Globals are referenced through global indices, starting with the smallest index not referencing a global import.

2.5.7. Element Segments

The initial contents of a table are uninitialized. The $\mathsf{elem}$ component of a module defines a vector of element segments that initialize a subrange of a table, at a given offset, from a static vector of elements.

$$\mathit{elem} ::= \{\mathsf{table}~\mathit{tableidx}, \mathsf{offset}~\mathit{expr}, \mathsf{init}~\mathit{vec}(\mathit{funcidx})\}$$

The $\mathsf{offset}$ is given by a constant expression.

Note

In the current version of WebAssembly, at most one table is allowed in a module. Consequently, the only valid $\mathit{tableidx}$ is $0$.

2.5.8. Data Segments

The initial contents of a memory are zero-valued bytes. The $\mathsf{data}$ component of a module defines a vector of data segments that initialize a range of memory, at a given offset, with a static vector of bytes.

$$\mathit{data} ::= \{\mathsf{data}~\mathit{memidx}, \mathsf{offset}~\mathit{expr}, \mathsf{init}~\mathit{vec}(\mathit{byte})\}$$

The $\mathsf{offset}$ is given by a constant expression.

Note

In the current version of WebAssembly, at most one memory is allowed in a module. Consequently, the only valid $\mathit{memidx}$ is $0$.

2.5.9. Start Function

The $\mathsf{start}$ component of a module declares the function index of a start function that is automatically invoked when the module is instantiated, after tables and memories have been initialized.

$$\mathit{start} ::= \{\mathsf{func}~\mathit{funcidx}\}$$

Note

The start function is intended for initializing the state of a module. The module and its exports are not accessible before this initialization has completed.

2.5.10. Exports

The $\mathsf{exports}$ component of a module defines a set of exports that become accessible to the host environment once the module has been instantiated.

$$\begin{array}{lll}
\mathit{export} & ::= & \{\mathsf{name}~\mathit{name}, \mathsf{desc}~\mathit{exportdesc}\} \\
\mathit{exportdesc} & ::= & \mathsf{func}~\mathit{funcidx} \\
& | & \mathsf{table}~\mathit{tableidx} \\
& | & \mathsf{mem}~\mathit{memidx} \\
& | & \mathsf{global}~\mathit{globalidx}
\end{array}$$

Each export is labeled by a unique name. Exportable definitions are functions, tables, memories, and globals, which are referenced through a respective descriptor.

2.5.10.1. Conventions

The following auxiliary notation is defined for sequences of exports, filtering out indices of a specific kind in an order-preserving fashion:

$\mathrm{funcs}(\mathit{export}^\ast) = [\mathit{funcidx} ~|~ \mathsf{func}~\mathit{funcidx} \in (\mathit{export}.\mathsf{desc})^\ast]$
$\mathrm{tables}(\mathit{export}^\ast) = [\mathit{tableidx} ~|~ \mathsf{table}~\mathit{tableidx} \in (\mathit{export}.\mathsf{desc})^\ast]$
$\mathrm{mems}(\mathit{export}^\ast) = [\mathit{memidx} ~|~ \mathsf{mem}~\mathit{memidx} \in (\mathit{export}.\mathsf{desc})^\ast]$
$\mathrm{globals}(\mathit{export}^\ast) = [\mathit{globalidx} ~|~ \mathsf{global}~\mathit{globalidx} \in (\mathit{export}.\mathsf{desc})^\ast]$
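
The filtering notation above can be read as an order-preserving list comprehension. The following Python sketch shows one possible reading; the `Export` record, the string tags for descriptor kinds, and the helper `indices_of` are hypothetical names chosen for this example.

```python
from typing import List, NamedTuple


class Export(NamedTuple):
    name: str
    kind: str   # descriptor kind: "func", "table", "mem", or "global"
    index: int  # index into the corresponding index space


def indices_of(kind: str, exports: List[Export]) -> List[int]:
    """Order-preserving filter: indices of all exports whose descriptor has the given kind."""
    return [e.index for e in exports if e.kind == kind]


# Hypothetical export list of a module.
exports = [
    Export("memory", "mem", 0),
    Export("run", "func", 3),
    Export("counter", "global", 1),
    Export("init", "func", 1),
]
func_indices = indices_of("func", exports)
mem_indices = indices_of("mem", exports)
```

Note that the result keeps the order in which the exports were declared rather than sorting the indices.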

2.5.11. Imports

The $\mathsf{imports}$ component of a module defines a set of imports that are required for instantiation.

$$\begin{array}{lll}
\mathit{import} & ::= & \{\mathsf{module}~\mathit{name}, \mathsf{name}~\mathit{name}, \mathsf{desc}~\mathit{importdesc}\} \\
\mathit{importdesc} & ::= & \mathsf{func}~\mathit{typeidx} \\
& | & \mathsf{table}~\mathit{tabletype} \\
& | & \mathsf{mem}~\mathit{memtype} \\
& | & \mathsf{global}~\mathit{globaltype}
\end{array}$$

Each import is labeled by a two-level name space, consisting of a $\mathsf{module}$ name and a $\mathsf{name}$ for an entity within that module. Importable definitions are functions, tables, memories, and globals. Each import is specified by a descriptor with a respective type that a definition provided during instantiation is required to match.

Every import defines an index in the respective index space. In each index space, the indices of imports go before the first index of any definition contained in the module itself.

Note

Unlike export names, import names are not necessarily unique. It is possible to import the same $\mathsf{module}$/$\mathsf{name}$ pair multiple times; such imports may even have different type descriptions, including different kinds of entities. A module with such imports can still be instantiated depending on the specifics of how an embedder allows resolving and supplying imports. However, embedders are not required to support such overloading, and a WebAssembly module itself cannot implement an overloaded name.

3. Validation

3.1. Conventions

Validation checks that a WebAssembly module is well-formed. Only valid modules can be instantiated.

Validity is defined by a type system over the abstract syntax of a module and its contents. For each piece of abstract syntax, there is a typing rule that specifies the constraints that apply to it. All rules are given in two equivalent forms:

In prose, describing the meaning in intuitive form.
In formal notation, describing the rule in mathematical form. [1]

Note

The prose and formal rules are equivalent, so that understanding of the formal notation is not required to read this specification. The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.

In both cases, the rules are formulated in a declarative manner. That is, they only formulate the constraints, they do not define an algorithm. The skeleton of a sound and complete algorithm for type-checking instruction sequences according to this specification is provided in the appendix.

3.1.1. Contexts

Validity of an individual definition is specified relative to a context, which collects relevant information about the surrounding module and the definitions in scope:

Types: the list of types defined in the current module.
Functions: the list of functions declared in the current module, represented by their function type.
Tables: the list of tables declared in the current module, represented by their table type.
Memories: the list of memories declared in the current module, represented by their memory type.
Globals: the list of globals declared in the current module, represented by their global type.
Locals: the list of locals declared in the current function (including parameters), represented by their value type.
Labels: the stack of labels accessible from the current position, represented by their result type.
Return: the return type of the current function, represented as an optional result type that is absent when no return is allowed, as in free-standing expressions.

In other words, a context contains a sequence of suitable types for each index space, describing each defined entry in that space. Locals, labels and return type are only used for validating instructions in function bodies, and are left empty elsewhere. The label stack is the only part of the context that changes as validation of an instruction sequence proceeds.

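More concretely, contexts are defined as records $C$ with abstract syntax:

$$\begin{array}{llll}
C & ::= & \{ & \mathsf{types}~\mathit{functype}^\ast, \\
&&& \mathsf{funcs}~\mathit{functype}^\ast, \\
&&& \mathsf{tables}~\mathit{tabletype}^\ast, \\
&&& \mathsf{mems}~\mathit{memtype}^\ast, \\
&&& \mathsf{globals}~\mathit{globaltype}^\ast, \\
&&& \mathsf{locals}~\mathit{valtype}^\ast, \\
&&& \mathsf{labels}~\mathit{resulttype}^\ast, \\
&&& \mathsf{return}~\mathit{resulttype}^? \quad \}
\end{array}$$

In addition to field access written $C.\mathsf{field}$, the following notation is adopted for manipulating contexts:

When spelling out a context, empty fields are omitted.
$C, \mathsf{field}~A^\ast$ denotes the same context as $C$ but with the elements $A^\ast$ prepended to its $\mathsf{field}$ component sequence.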

Note

We use indexing notation like $C.\mathsf{labels}[i]$ to look up indices in their respective index space in the context. Context extension notation $C, \mathsf{field}~A$ is primarily used to locally extend relative index spaces, such as label indices. Accordingly, the notation is defined to append at the front of the respective sequence, introducing a new relative index $0$ and shifting the existing ones.
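
A minimal Python sketch of context extension, restricted to the fields that are only used inside function bodies. The `Context` class, the field name `return_`, and the helper `extend_labels` are assumptions made for this illustration; result types are modeled as tuples holding zero or one value type.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

ResultType = Tuple[str, ...]  # () or a single value type such as ("i32",)


@dataclass(frozen=True)
class Context:
    locals: Tuple[str, ...] = ()
    labels: Tuple[ResultType, ...] = ()
    return_: Optional[ResultType] = None  # None models an absent return type


def extend_labels(c: Context, result: ResultType) -> Context:
    """Model of `C, labels [t?]`: prepend a label so that it gets relative index 0."""
    return replace(c, labels=(result,) + c.labels)


outer = Context(labels=(("i32",),))
inner = extend_labels(outer, ())  # enter a construct whose label has an empty result type
```

Because the record is immutable in this sketch, the original context stays available unchanged once validation of the nested instruction sequence is finished.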

3.1.2. Prose Notation

Validation is specified by stylised rules for each relevant part of the abstract syntax. The rules not only state constraints defining when a phrase is valid, they also classify it with a type. The following conventions are adopted in stating these rules.

A phrase $A$ is said to be “valid with type $T$” if and only if all constraints expressed by the respective rules are met. The form of $T$ depends on what $A$ is.

Note

For example, if $A$ is a function, then $T$ is a function type; for an $A$ that is a global, $T$ is a global type; and so on.

The rules implicitly assume a given context $C$.
In some places, this context is locally extended to a context $C'$ with additional entries. The formulation “Under context $C'$, $\dots$ statement $\dots$” is adopted to express that the following statement must apply under the assumptions embodied in the extended context.

3.1.3. Formal Notation

Note

This section gives a brief explanation of the notation for specifying typing rules formally. For the interested reader, a more thorough introduction can be found in respective text books. [2]

The proposition that a phrase $A$ has a respective type $T$ is written $A : T$. In general, however, typing is dependent on a context $C$. To express this explicitly, the complete form is a judgement $C \vdash A : T$, which says that $A : T$ holds under the assumptions encoded in $C$.

The formal typing rules use a standard approach for specifying type systems, rendering them into deduction rules. Every rule has the following general form:

$$\frac{\mathit{premise}_1 \qquad \mathit{premise}_2 \qquad \dots \qquad \mathit{premise}_n}{\mathit{conclusion}}$$

Such a rule is read as a big implication: if all premises hold, then the conclusion holds. Some rules have no premises; they are axioms whose conclusion holds unconditionally. The conclusion always is a judgement $C \vdash A : T$, and there is one respective rule for each relevant construct $A$ of the abstract syntax.

Note

For example, the typing rule for the $\mathsf{i32.add}$ instruction can be given as an axiom:

$$\frac{}{C \vdash \mathsf{i32.add} : [\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]}$$

The instruction is always valid with type $[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$ (saying that it consumes two $\mathsf{i32}$ values and produces one), independent of any side conditions.

An instruction like $\mathsf{local.get}$ can be typed as follows:

$$\frac{C.\mathsf{locals}[x] = t}{C \vdash \mathsf{local.get}~x : [] \to [t]}$$

Here, the premise enforces that the immediate local index $x$ exists in the context. The instruction produces a value of its respective type $t$ (and does not consume any values). If $C.\mathsf{locals}[x]$ does not exist then the premise does not hold, and the instruction is ill-typed.

Finally, a structured instruction requires a recursive rule, where the premise is itself a typing judgement:

$$\frac{C, \mathsf{labels}~[t^?] \vdash \mathit{instr}^\ast : [] \to [t^?]}{C \vdash \mathsf{block}~[t^?]~\mathit{instr}^\ast~\mathsf{end} : [] \to [t^?]}$$

A $\mathsf{block}$ instruction is only valid when the instruction sequence in its body is. Moreover, the result type must match the block’s annotation $[t^?]$.

[1] The semantics is derived from the following article: Andreas Haas, Andreas Rossberg, Derek Schuff, Ben Titzer, Dan Gohman, Luke Wagner, Alon Zakai, JF Bastien, Michael Holman. Bringing the Web up to Speed with WebAssembly. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM 2017.

[2] For example: Benjamin Pierce. Types and Programming Languages. The MIT Press 2002.

3.2. Types

Most types are universally valid. However, restrictions apply to function types as well as the limits of table types and memory types, which must be checked during validation.

3.2.1. Limits

Limits must have meaningful bounds that are within a given range.

3.2.1.1. $\{\mathsf{min}~n, \mathsf{max}~m^?\}$

The value of $n$ must not be larger than $k$.
If the maximum $m^?$ is not empty, then:

- Its value must not be larger than $k$.
- Its value must not be smaller than $n$.

Then the limit is valid within range $k$.

$$\frac{n \leq k \qquad (m \leq k)^? \qquad (n \leq m)^?}{\vdash \{\mathsf{min}~n, \mathsf{max}~m^?\} : k}$$
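
The three conditions of the limits rule translate directly into a small predicate. The following Python sketch is one possible rendering; the function name `limits_valid` and the use of `None` for an absent maximum are assumptions made for this example.

```python
from typing import Optional


def limits_valid(n: int, m: Optional[int], k: int) -> bool:
    """Check {min n, max m?} against range k; m is None when no maximum is given."""
    if n > k:                  # n must not be larger than k
        return False
    if m is not None:
        if m > k:              # the maximum must not be larger than k
            return False
        if m < n:              # the maximum must not be smaller than n
            return False
    return True


# Example range; which value of k applies depends on the kind of type being checked.
K = 2**16
no_max_ok = limits_valid(1, None, K)
max_at_bound_ok = limits_valid(1, 2**16, K)
min_too_large = limits_valid(2**16 + 1, None, K)
max_below_min = limits_valid(4, 2, K)
```

The optional premises $(m \leq k)^?$ and $(n \leq m)^?$ correspond to the branch that is only taken when a maximum is present.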

3.2.2. Function Types

Function types may not specify more than one result.

3.2.2.1. $[t_1^n] \to [t_2^m]$

The arity $m$ must not be larger than $1$.
Then the function type is valid.

$$\frac{}{\vdash [t_1^\ast] \to [t_2^?]~\mathrm{ok}}$$

Note

The restriction to at most one result may be removed in future versions of WebAssembly.

3.2.3. Table Types

3.2.3.1. $\mathit{limits}~\mathit{elemtype}$

The limits $\mathit{limits}$ must be valid within range $2^{32}$.
Then the table type is valid.

$$\frac{\vdash \mathit{limits} : 2^{32}}{\vdash \mathit{limits}~\mathit{elemtype}~\mathrm{ok}}$$

3.2.4. Memory Types

3.2.4.1. $\mathit{limits}$

The limits $\mathit{limits}$ must be valid within range $2^{16}$.
Then the memory type is valid.

$$\frac{\vdash \mathit{limits} : 2^{16}}{\vdash \mathit{limits}~\mathrm{ok}}$$

3.2.5. Global Types

3.2.5.1. $\mathit{mut}~\mathit{valtype}$

The global type is valid.

$$\frac{}{\vdash \mathit{mut}~\mathit{valtype}~\mathrm{ok}}$$

3.2.6. External Types

3.2.6.1. $\mathsf{func}~\mathit{functype}$

The function type $\mathit{functype}$ must be valid.
Then the external type is valid.

$$\frac{\vdash \mathit{functype}~\mathrm{ok}}{\vdash \mathsf{func}~\mathit{functype}~\mathrm{ok}}$$

3.2.6.2. $\mathsf{table}~\mathit{tabletype}$

The table type $\mathit{tabletype}$ must be valid.
Then the external type is valid.

$$\frac{\vdash \mathit{tabletype}~\mathrm{ok}}{\vdash \mathsf{table}~\mathit{tabletype}~\mathrm{ok}}$$

3.2.6.3. $\mathsf{mem}~\mathit{memtype}$

The memory type $\mathit{memtype}$ must be valid.
Then the external type is valid.

$$\frac{\vdash \mathit{memtype}~\mathrm{ok}}{\vdash \mathsf{mem}~\mathit{memtype}~\mathrm{ok}}$$

3.2.6.4. $\mathsf{global}~\mathit{globaltype}$

The global type $\mathit{globaltype}$ must be valid.
Then the external type is valid.

$$\frac{\vdash \mathit{globaltype}~\mathrm{ok}}{\vdash \mathsf{global}~\mathit{globaltype}~\mathrm{ok}}$$

3.3. Instructions

Instructions are classified by function types $[t_1^\ast] \to [t_2^\ast]$ that describe how they manipulate the operand stack. The types describe the required input stack with argument values of types $t_1^\ast$ that an instruction pops off and the provided output stack with result values of types $t_2^\ast$ that it pushes back.

Note

For example, the instruction $\mathsf{i32.add}$ has type $[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$, consuming two $\mathsf{i32}$ values and producing one.

Typing extends to instruction sequences $\mathit{instr}^\ast$. Such a sequence has a function type $[t_1^\ast] \to [t_2^\ast]$ if the accumulative effect of executing the instructions is consuming values of types $t_1^\ast$ off the operand stack and pushing new values of types $t_2^\ast$.

For some instructions, the typing rules do not fully constrain the type, and therefore allow for multiple types. Such instructions are called polymorphic. Two degrees of polymorphism can be distinguished:

value-polymorphic: the value type $t$ of one or several individual operands is unconstrained. That is the case for all parametric instructions like $\mathsf{drop}$ and $\mathsf{select}$.
stack-polymorphic: the entire (or most of the) function type $[t_1^\ast] \to [t_2^\ast]$ of the instruction is unconstrained. That is the case for all control instructions that perform an unconditional control transfer, such as $\mathsf{unreachable}$, $\mathsf{br}$, $\mathsf{br\_table}$, and $\mathsf{return}$.

In both cases, the unconstrained types or type sequences can be chosen arbitrarily, as long as they meet the constraints imposed for the surrounding parts of the program.

Note

For example, the $\mathsf{select}$ instruction is valid with type $[t~t~\mathsf{i32}] \to [t]$, for any possible value type $t$. Consequently, both instruction sequences

$$(\mathsf{i32.const}~1)~~(\mathsf{i32.const}~2)~~(\mathsf{i32.const}~3)~~\mathsf{select}$$

and

$$(\mathsf{f64.const}~1.0)~~(\mathsf{f64.const}~2.0)~~(\mathsf{i32.const}~3)~~\mathsf{select}$$

are valid, with $t$ in the typing of $\mathsf{select}$ being instantiated to $\mathsf{i32}$ or $\mathsf{f64}$, respectively.

The $\mathsf{unreachable}$ instruction is valid with type $[t_1^\ast] \to [t_2^\ast]$ for any possible sequences of value types $t_1^\ast$ and $t_2^\ast$. Consequently,

$$\mathsf{unreachable}~~\mathsf{i32.add}$$

is valid by assuming type $[] \to [\mathsf{i32}~\mathsf{i32}]$ for the $\mathsf{unreachable}$ instruction. In contrast,

$$\mathsf{unreachable}~~(\mathsf{i64.const}~0)~~\mathsf{i32.add}$$

is invalid, because there is no possible type to pick for the $\mathsf{unreachable}$ instruction that would make the sequence well-typed.
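
The examples above can be replayed with a small checker. The following Python sketch handles only a handful of instructions and a single straight-line sequence; it is a simplified illustration under these assumptions, not the algorithm given in the appendix. The names `check`, `SIGNATURES`, and `UNKNOWN` are hypothetical, and immediates of constant instructions are omitted because they do not affect typing.

```python
UNKNOWN = None  # stands for a type that polymorphism leaves unconstrained

# (parameter types, result types) for the few instructions used in this sketch
SIGNATURES = {
    "i32.const": ([], ["i32"]),
    "i64.const": ([], ["i64"]),
    "f64.const": ([], ["f64"]),
    "i32.add": (["i32", "i32"], ["i32"]),
}


def check(instrs, results):
    """Return True if the sequence can be given type [] -> results in this simplified model."""
    stack, unreachable = [], False

    def pop(expect=UNKNOWN):
        if not stack:
            if unreachable:
                return expect  # after unreachable, the stack supplies any type
            raise TypeError("stack underflow")
        actual = stack.pop()
        if actual is UNKNOWN:
            return expect
        if expect is not UNKNOWN and actual != expect:
            raise TypeError(f"expected {expect}, found {actual}")
        return actual

    try:
        for op in instrs:
            if op == "unreachable":
                stack.clear()          # earlier operands are no longer relevant
                unreachable = True
            elif op == "select":
                pop("i32")             # the condition
                t1 = pop()             # second operand, any value type
                t2 = pop(t1)           # first operand must agree with it
                stack.append(t2)
            else:
                params, res = SIGNATURES[op]
                for t in reversed(params):
                    pop(t)
                stack.extend(res)
        for t in reversed(results):
            pop(t)
        if stack:
            raise TypeError("values left on the stack")
    except TypeError:
        return False
    return True
```

For instance, `check(["unreachable", "i32.add"], ["i32"])` succeeds because both operands of the addition are taken from the unconstrained stack, while inserting `"i64.const"` in between puts a concrete `i64` on top that the addition cannot accept.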

3.3.1. Numeric Instructions

3.3.1.1. $t.\mathsf{const}~c$

The instruction is valid with type $[] \to [t]$.

$$\frac{}{C \vdash t.\mathsf{const}~c : [] \to [t]}$$

3.3.1.2. $t.\mathit{unop}$

The instruction is valid with type $[t] \to [t]$.

$$\frac{}{C \vdash t.\mathit{unop} : [t] \to [t]}$$

3.3.1.3. $t.\mathit{binop}$

The instruction is valid with type $[t~t] \to [t]$.

$$\frac{}{C \vdash t.\mathit{binop} : [t~t] \to [t]}$$

3.3.1.4. $t.\mathit{testop}$

The instruction is valid with type $[t] \to [\mathsf{i32}]$.

$$\frac{}{C \vdash t.\mathit{testop} : [t] \to [\mathsf{i32}]}$$

3.3.1.5. $t.\mathit{relop}$

The instruction is valid with type $[t~t] \to [\mathsf{i32}]$.

$$\frac{}{C \vdash t.\mathit{relop} : [t~t] \to [\mathsf{i32}]}$$

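3.3.1.6. $t_2.\mathit{cvtop}\_t_1\_\mathit{sx}^?$

The instruction is valid with type $[t_1] \to [t_2]$.

$$\frac{}{C \vdash t_2.\mathit{cvtop}\_t_1\_\mathit{sx}^? : [t_1] \to [t_2]}$$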

3.3.2. Parametric Instructions

3.3.2.1. $\mathsf{drop}$

The instruction is valid with type $[t] \to []$, for any value type $t$.

$$\frac{}{C \vdash \mathsf{drop} : [t] \to []}$$

3.3.2.2. $\mathsf{select}$

The instruction is valid with type $[t~t~\mathsf{i32}] \to [t]$, for any value type $t$.

$$\frac{}{C \vdash \mathsf{select} : [t~t~\mathsf{i32}] \to [t]}$$

Note

Both $\mathsf{drop}$ and $\mathsf{select}$ are value-polymorphic instructions.

3.3.3. Variable Instructions

3.3.3.1. $\mathsf{local.get}~x$

The local $C.\mathsf{locals}[x]$ must be defined in the context.
Let $t$ be the value type $C.\mathsf{locals}[x]$.
Then the instruction is valid with type $[] \to [t]$.

$$\frac{C.\mathsf{locals}[x] = t}{C \vdash \mathsf{local.get}~x : [] \to [t]}$$

3.3.3.2. $\mathsf{local.set}~x$

The local $C.\mathsf{locals}[x]$ must be defined in the context.
Let $t$ be the value type $C.\mathsf{locals}[x]$.
Then the instruction is valid with type $[t] \to []$.

$$\frac{C.\mathsf{locals}[x] = t}{C \vdash \mathsf{local.set}~x : [t] \to []}$$

3.3.3.3. $\mathsf{local.tee}~x$

The local $C.\mathsf{locals}[x]$ must be defined in the context.
Let $t$ be the value type $C.\mathsf{locals}[x]$.
Then the instruction is valid with type $[t] \to [t]$.

$$\frac{C.\mathsf{locals}[x] = t}{C \vdash \mathsf{local.tee}~x : [t] \to [t]}$$

3.3.3.4. $\mathsf{global.get}~x$

The global $C.\mathsf{globals}[x]$ must be defined in the context.
Let $\mathit{mut}~t$ be the global type $C.\mathsf{globals}[x]$.
Then the instruction is valid with type $[] \to [t]$.

$$\frac{C.\mathsf{globals}[x] = \mathit{mut}~t}{C \vdash \mathsf{global.get}~x : [] \to [t]}$$

3.3.3.5. $\mathsf{global.set}~x$

The global $C.\mathsf{globals}[x]$ must be defined in the context.
Let $\mathit{mut}~t$ be the global type $C.\mathsf{globals}[x]$.
The mutability $\mathit{mut}$ must be $\mathsf{var}$.
Then the instruction is valid with type $[t] \to []$.

$$\frac{C.\mathsf{globals}[x] = \mathsf{var}~t}{C \vdash \mathsf{global.set}~x : [t] \to []}$$
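
The difference between the two global instructions is the extra mutability premise. A minimal Python sketch, assuming that the globals of a context are modeled as a list of `(mut, t)` pairs with `mut` being the string `"const"` or `"var"`; the function names are hypothetical.

```python
def type_global_get(globals_, x):
    """Type of `global.get x`: [] -> [t], provided C.globals[x] is defined."""
    if not 0 <= x < len(globals_):
        raise IndexError(f"global {x} is not defined in the context")
    _mut, t = globals_[x]
    return ([], [t])


def type_global_set(globals_, x):
    """Type of `global.set x`: [t] -> [], provided C.globals[x] = var t."""
    if not 0 <= x < len(globals_):
        raise IndexError(f"global {x} is not defined in the context")
    mut, t = globals_[x]
    if mut != "var":
        raise TypeError(f"global {x} is immutable")
    return ([t], [])


# Hypothetical context: global 0 is an immutable i32, global 1 a mutable f64.
C_globals = [("const", "i32"), ("var", "f64")]
get_type = type_global_get(C_globals, 0)
set_type = type_global_set(C_globals, 1)
```

Reading an immutable global is accepted, whereas `type_global_set(C_globals, 0)` raises an error because the premise $C.\mathsf{globals}[x] = \mathsf{var}~t$ does not hold.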

3.3.4. Memory Instructions

3.3.4.1. $t.\mathsf{load}~\mathit{memarg}$

The memory $C.\mathsf{mems}[0]$ must be defined in the context.
The alignment $2^{\mathit{memarg}.\mathsf{align}}$ must not be larger than the bit width of $t$ divided by $8$.
Then the instruction is valid with type $[\mathsf{i32}] \to [t]$.

$$\frac{C.\mathsf{mems}[0] = \mathit{memtype} \qquad 2^{\mathit{memarg}.\mathsf{align}} \leq |t|/8}{C \vdash t.\mathsf{load}~\mathit{memarg} : [\mathsf{i32}] \to [t]}$$

3.3.4.2. $t.\mathsf{load}N\_\mathit{sx}~\mathit{memarg}$

The memory $C.\mathsf{mems}[0]$ must be defined in the context.
The alignment $2^{\mathit{memarg}.\mathsf{align}}$ must not be larger than $N/8$.
Then the instruction is valid with type $[\mathsf{i32}] \to [t]$.

$$\frac{C.\mathsf{mems}[0] = \mathit{memtype} \qquad 2^{\mathit{memarg}.\mathsf{align}} \leq N/8}{C \vdash t.\mathsf{load}N\_\mathit{sx}~\mathit{memarg} : [\mathsf{i32}] \to [t]}$$

3.3.4.3. $t.\mathsf{store}~\mathit{memarg}$

The memory $C.\mathsf{mems}[0]$ must be defined in the context.
The alignment $2^{\mathit{memarg}.\mathsf{align}}$ must not be larger than the bit width of $t$ divided by $8$.
Then the instruction is valid with type $[\mathsf{i32}~t] \to []$.

$$\frac{C.\mathsf{mems}[0] = \mathit{memtype} \qquad 2^{\mathit{memarg}.\mathsf{align}} \leq |t|/8}{C \vdash t.\mathsf{store}~\mathit{memarg} : [\mathsf{i32}~t] \to []}$$

3.3.4.4. $t.\mathsf{store}N~\mathit{memarg}$

The memory $C.\mathsf{mems}[0]$ must be defined in the context.
The alignment $2^{\mathit{memarg}.\mathsf{align}}$ must not be larger than $N/8$.
Then the instruction is valid with type $[\mathsf{i32}~t] \to []$.

$$\frac{C.\mathsf{mems}[0] = \mathit{memtype} \qquad 2^{\mathit{memarg}.\mathsf{align}} \leq N/8}{C \vdash t.\mathsf{store}N~\mathit{memarg} : [\mathsf{i32}~t] \to []}$$
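
The alignment premise shared by the load and store rules can be checked with a few lines of arithmetic. In the following Python sketch, `align` is the exponent stored in the memory argument; the function name `alignment_ok` and the optional parameter `n` for the narrow-access width $N$ are assumptions made for this example.

```python
BIT_WIDTH = {"i32": 32, "i64": 64, "f32": 32, "f64": 64}


def alignment_ok(t, align, n=None):
    """Check 2^align <= N/8 for an N-bit narrow access, else 2^align <= |t|/8."""
    width = n if n is not None else BIT_WIDTH[t]
    return 2 ** align <= width // 8


full_i32_natural = alignment_ok("i32", 2)        # 4 bytes <= 4 bytes
full_i32_too_large = alignment_ok("i32", 3)      # 8 bytes >  4 bytes
narrow_i64_8 = alignment_ok("i64", 0, n=8)       # 1 byte  <= 1 byte
narrow_i64_8_too_large = alignment_ok("i64", 1, n=8)
```

For narrow accesses the bound depends on the access width $N$, not on the width of the value type $t$, which is why an `i64` access of 8 bits only admits an alignment exponent of 0.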

3.3.4.5. $\mathsf{memory.size}$

The memory $C.\mathsf{mems}[0]$ must be defined in the context.
Then the instruction is valid with type $[] \to [\mathsf{i32}]$.

$$\frac{C.\mathsf{mems}[0] = \mathit{memtype}}{C \vdash \mathsf{memory.size} : [] \to [\mathsf{i32}]}$$

3.3.4.6. $\mathsf{memory.grow}$

The memory $C.\mathsf{mems}[0]$ must be defined in the context.
Then the instruction is valid with type $[\mathsf{i32}] \to [\mathsf{i32}]$.

$$\frac{C.\mathsf{mems}[0] = \mathit{memtype}}{C \vdash \mathsf{memory.grow} : [\mathsf{i32}] \to [\mathsf{i32}]}$$

3.3.5. Control Instructions

3.3.5.1. $\mathsf{nop}$

The instruction is valid with type $[] \to []$.

$$\frac{}{C \vdash \mathsf{nop} : [] \to []}$$

3.3.5.2. $\mathsf{unreachable}$

The instruction is valid with type $[t_1^\ast] \to [t_2^\ast]$, for any sequences of value types $t_1^\ast$ and $t_2^\ast$.

$$\frac{}{C \vdash \mathsf{unreachable} : [t_1^\ast] \to [t_2^\ast]}$$

Note

The $\mathsf{unreachable}$ instruction is stack-polymorphic.

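3.3.5.3. $\mathsf{block}~[t^?]~\mathit{instr}^\ast~\mathsf{end}$

Let $C'$ be the same context as $C$, but with the result type $[t^?]$ prepended to the $\mathsf{labels}$ vector.
Under context $C'$, the instruction sequence $\mathit{instr}^\ast$ must be valid with type $[] \to [t^?]$.
Then the compound instruction is valid with type $[] \to [t^?]$.

$$\frac{C, \mathsf{labels}~[t^?] \vdash \mathit{instr}^\ast : [] \to [t^?]}{C \vdash \mathsf{block}~[t^?]~\mathit{instr}^\ast~\mathsf{end} : [] \to [t^?]}$$

Note

The notation $C, \mathsf{labels}~[t^?]$ inserts the new label type at index $0$, shifting all others.

The fact that the nested instruction sequence $\mathit{instr}^\ast$ must have type $[] \to [t^?]$ implies that it cannot access operands that have been pushed on the stack before the block was entered. This may be generalized in future versions of WebAssembly.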

3.3.5.4. $\mathsf{loop}~[t^?]~\mathit{instr}^\ast~\mathsf{end}$

Let $C'$ be the same context as $C$, but with the empty result type $[]$ prepended to the $\mathsf{labels}$ vector.
Under context $C'$, the instruction sequence $\mathit{instr}^\ast$ must be valid with type $[] \to [t^?]$.
Then the compound instruction is valid with type $[] \to [t^?]$.

$$\frac{C, \mathsf{labels}~[] \vdash \mathit{instr}^\ast : [] \to [t^?]}{C \vdash \mathsf{loop}~[t^?]~\mathit{instr}^\ast~\mathsf{end} : [] \to [t^?]}$$

Note

The notation $C, \mathsf{labels}~[]$ inserts the new label type at index $0$, shifting all others.

The fact that the nested instruction sequence $\mathit{instr}^\ast$ must have type $[] \to [t^?]$ implies that it cannot access operands that have been pushed on the stack before the loop was entered. This may be generalized in future versions of WebAssembly.

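3.3.5.5. $\mathsf{if}~[t^?]~\mathit{instr}_1^\ast~\mathsf{else}~\mathit{instr}_2^\ast~\mathsf{end}$

Let $C'$ be the same context as $C$, but with the result type $[t^?]$ prepended to the $\mathsf{labels}$ vector.
Under context $C'$, the instruction sequence $\mathit{instr}_1^\ast$ must be valid with type $[] \to [t^?]$.
Under context $C'$, the instruction sequence $\mathit{instr}_2^\ast$ must be valid with type $[] \to [t^?]$.
Then the compound instruction is valid with type $[\mathsf{i32}] \to [t^?]$.

$$\frac{C, \mathsf{labels}~[t^?] \vdash \mathit{instr}_1^\ast : [] \to [t^?] \qquad C, \mathsf{labels}~[t^?] \vdash \mathit{instr}_2^\ast : [] \to [t^?]}{C \vdash \mathsf{if}~[t^?]~\mathit{instr}_1^\ast~\mathsf{else}~\mathit{instr}_2^\ast~\mathsf{end} : [\mathsf{i32}] \to [t^?]}$$

Note

The notation $C, \mathsf{labels}~[t^?]$ inserts the new label type at index $0$, shifting all others.

The fact that the nested instruction sequences $\mathit{instr}_1^\ast$ and $\mathit{instr}_2^\ast$ must have type $[] \to [t^?]$ implies that they cannot access operands that have been pushed on the stack before the conditional was entered. This may be generalized in future versions of WebAssembly.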

3.3.5.6. $\mathsf{br}~l$

The label $C.\mathsf{labels}[l]$ must be defined in the context.
Let $[t^?]$ be the result type $C.\mathsf{labels}[l]$.
Then the instruction is valid with type $[t_1^\ast~t^?] \to [t_2^\ast]$, for any sequences of value types $t_1^\ast$ and $t_2^\ast$.

$$\frac{C.\mathsf{labels}[l] = [t^?]}{C \vdash \mathsf{br}~l : [t_1^\ast~t^?] \to [t_2^\ast]}$$

Note

The label index space in the context $C$ contains the most recent label first, so that $C.\mathsf{labels}[l]$ performs a relative lookup as expected.

The $\mathsf{br}$ instruction is stack-polymorphic.

3.3.5.7. $\mathsf{br\_if}~l$

The label $C.\mathsf{labels}[l]$ must be defined in the context.
Let $[t^?]$ be the result type $C.\mathsf{labels}[l]$.
Then the instruction is valid with type $[t^?~\mathsf{i32}] \to [t^?]$.

$$\frac{C.\mathsf{labels}[l] = [t^?]}{C \vdash \mathsf{br\_if}~l : [t^?~\mathsf{i32}] \to [t^?]}$$

Note

The label index space in the context $C$ contains the most recent label first, so that $C.\mathsf{labels}[l]$ performs a relative lookup as expected.

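3.3.5.8. $\mathsf{br\_table}~l^\ast~l_N$

The label $C.\mathsf{labels}[l_N]$ must be defined in the context.
Let $[t^?]$ be the result type $C.\mathsf{labels}[l_N]$.
For all $l_i$ in $l^\ast$, the label $C.\mathsf{labels}[l_i]$ must be defined in the context.
For all $l_i$ in $l^\ast$, $C.\mathsf{labels}[l_i]$ must be $[t^?]$.
Then the instruction is valid with type $[t_1^\ast~t^?~\mathsf{i32}] \to [t_2^\ast]$, for any sequences of value types $t_1^\ast$ and $t_2^\ast$.

$$\frac{(C.\mathsf{labels}[l] = [t^?])^\ast \qquad C.\mathsf{labels}[l_N] = [t^?]}{C \vdash \mathsf{br\_table}~l^\ast~l_N : [t_1^\ast~t^?~\mathsf{i32}] \to [t_2^\ast]}$$

Note

The label index space in the context $C$ contains the most recent label first, so that $C.\mathsf{labels}[l_i]$ performs a relative lookup as expected.

The $\mathsf{br\_table}$ instruction is stack-polymorphic.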
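
The label conditions for a branch table can be sketched as follows in Python. The list `labels` models the label sequence of a context with the most recent label first, and each entry is a tuple holding zero or one value type; the function name `type_br_table` and this representation are assumptions made for the example.

```python
def type_br_table(labels, targets, default):
    """Return the result type [t?] shared by all targets, or raise if a condition fails."""
    for l in list(targets) + [default]:
        if not 0 <= l < len(labels):
            raise IndexError(f"label {l} is not defined in the context")
    result = labels[default]          # result type of the default label
    for l in targets:
        if labels[l] != result:       # every other target must agree with it
            raise TypeError(f"label {l} has type {labels[l]}, expected {result}")
    return result


# Hypothetical label sequence: labels 0 and 1 yield an i32, label 2 yields nothing.
C_labels = [("i32",), ("i32",), ()]
agreeing = type_br_table(C_labels, [0, 1], 1)
only_default = type_br_table(C_labels, [], 2)
```

A table that mixes labels 0 and 2 is rejected in this sketch, since their result types differ and no single $[t^?]$ satisfies all premises.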

3.3.5.9. $\mathsf{return}$

The return type $C.\mathsf{return}$ must not be absent in the context.
Let $[t^?]$ be the result type of $C.\mathsf{return}$.
Then the instruction is valid with type $[t_1^\ast~t^?] \to [t_2^\ast]$, for any sequences of value types $t_1^\ast$ and $t_2^\ast$.

$$\frac{C.\mathsf{return} = [t^?]}{C \vdash \mathsf{return} : [t_1^\ast~t^?] \to [t_2^\ast]}$$

Note

The $\mathsf{return}$ instruction is stack-polymorphic.

$C.\mathsf{return}$ is absent (set to $\epsilon$) when validating an expression that is not a function body. This differs from it being set to the empty result type ($[\epsilon]$), which is the case for functions not returning anything.

3.3.5.10. $\mathsf{call}~x$

The function $C.\mathsf{funcs}[x]$ must be defined in the context.
Then the instruction is valid with type $C.\mathsf{funcs}[x]$.

$$\frac{C.\mathsf{funcs}[x] = [t_1^\ast] \to [t_2^\ast]}{C \vdash \mathsf{call}~x : [t_1^\ast] \to [t_2^\ast]}$$

3.3.5.11. $\mathsf{call\_indirect}~x$

The table $C.\mathsf{tables}[0]$ must be defined in the context.
Let $\mathit{limits}~\mathit{elemtype}$ be the table type $C.\mathsf{tables}[0]$.
The element type $\mathit{elemtype}$ must be $\mathsf{funcref}$.
The type $C.\mathsf{types}[x]$ must be defined in the context.
Let $[t_1^\ast] \to [t_2^\ast]$ be the function type $C.\mathsf{types}[x]$.
Then the instruction is valid with type $[t_1^\ast~\mathsf{i32}] \to [t_2^\ast]$.

$$\frac{C.\mathsf{tables}[0] = \mathit{limits}~\mathsf{funcref} \qquad C.\mathsf{types}[x] = [t_1^\ast] \to [t_2^\ast]}{C \vdash \mathsf{call\_indirect}~x : [t_1^\ast~\mathsf{i32}] \to [t_2^\ast]}$$

3.3.6. Instruction Sequences

Typing of instruction sequences is defined recursively.

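3.3.6.1. Empty Instruction Sequence: $\epsilon$

The empty instruction sequence is valid with type $[t^\ast] \to [t^\ast]$, for any sequence of value types $t^\ast$.

$$\frac{}{C \vdash \epsilon : [t^\ast] \to [t^\ast]}$$

3.3.6.2. Non-empty Instruction Sequence: $\mathit{instr}^\ast~\mathit{instr}_N$

The instruction sequence $\mathit{instr}^\ast$ must be valid with type $[t_1^\ast] \to [t_2^\ast]$, for some sequences of value types $t_1^\ast$ and $t_2^\ast$.
The instruction $\mathit{instr}_N$ must be valid with type $[t^\ast] \to [t_3^\ast]$, for some sequences of value types $t^\ast$ and $t_3^\ast$.
There must be a sequence of value types $t_0^\ast$, such that $t_2^\ast = t_0^\ast~t^\ast$.
Then the combined instruction sequence is valid with type $[t_1^\ast] \to [t_0^\ast~t_3^\ast]$.

$$\frac{C \vdash \mathit{instr}^\ast : [t_1^\ast] \to [t_0^\ast~t^\ast] \qquad C \vdash \mathit{instr}_N : [t^\ast] \to [t_3^\ast]}{C \vdash \mathit{instr}^\ast~\mathit{instr}_N : [t_1^\ast] \to [t_0^\ast~t_3^\ast]}$$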
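
The composition step of the rule for non-empty sequences can be sketched in Python as follows. Function types are modeled as pairs of lists, and the helper name `compose` is hypothetical. The sketch works with one fixed type per phrase, so it does not model the freedom of choosing among several valid types for polymorphic instructions or for the empty sequence.

```python
def compose(seq_type, instr_type):
    """Combine [t1*] -> [t2*] with [t*] -> [t3*] into [t1*] -> [t0* t3*], where t2* = t0* t*."""
    t1, t2 = seq_type
    t, t3 = instr_type
    split = len(t2) - len(t)
    if split < 0 or t2[split:] != t:
        raise TypeError(f"cannot apply {t} -> {t3} to a stack of {t2}")
    t0 = t2[:split]                   # the untouched lower part of the stack
    return (t1, t0 + t3)


# A sequence leaving [i64 i32 i32], followed by an instruction of type [i32 i32] -> [i32].
combined = compose(([], ["i64", "i32", "i32"]), (["i32", "i32"], ["i32"]))
# The empty stack, followed by an instruction of type [] -> [i32].
from_empty = compose(([], []), ([], ["i32"]))
```

The prefix `t0` plays the role of $t_0^\ast$ in the rule: it is the part of the intermediate stack that the last instruction does not touch.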

3.3.7. Expressions

Expressions $\mathit{expr}$ are classified by result types of the form $[t^?]$.

3.3.7.1. $\mathit{instr}^\ast~\mathsf{end}$

The instruction sequence $\mathit{instr}^\ast$ must be valid with type $[] \to [t^?]$, for some optional value type $t^?$.
Then the expression is valid with result type $[t^?]$.

$$\frac{C \vdash \mathit{instr}^\ast : [] \to [t^?]}{C \vdash \mathit{instr}^\ast~\mathsf{end} : [t^?]}$$

3.3.7.2. Constant Expressions

In a constant expression $\mathit{instr}^\ast~\mathsf{end}$ all instructions in $\mathit{instr}^\ast$ must be constant.
A constant instruction $\mathit{instr}$ must be:

- either of the form $t.\mathsf{const}~c$,
- or of the form $\mathsf{global.get}~x$, in which case $C.\mathsf{globals}[x]$ must be a global type of the form $\mathsf{const}~t$.

$$\frac{(C \vdash \mathit{instr}~\mathrm{const})^\ast}{C \vdash \mathit{instr}^\ast~\mathsf{end}~\mathrm{const}}$$

$$\frac{}{C \vdash t.\mathsf{const}~c~\mathrm{const}} \qquad \frac{C.\mathsf{globals}[x] = \mathsf{const}~t}{C \vdash \mathsf{global.get}~x~\mathrm{const}}$$

Note

Currently, constant expressions occurring as initializers of globals are further constrained in that contained $\mathsf{global.get}$ instructions are only allowed to refer to imported globals. This is enforced in the validation rule for modules by constraining the context $C$ accordingly.

The definition of constant expression may be extended in future versions of WebAssembly.
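
The constancy condition can be checked instruction by instruction. In the following Python sketch, instructions are modeled as tuples whose first element is the opcode name, and the globals of a context as a list of `(mut, t)` pairs; the function name `is_constant` and this representation are assumptions made for the example. The sketch only checks constancy, not the result type of the expression.

```python
CONST_OPS = {"i32.const", "i64.const", "f32.const", "f64.const"}


def is_constant(instrs, globals_):
    """Check that every instruction of an expression (without the final end) is constant."""
    for instr in instrs:
        op = instr[0]
        if op in CONST_OPS:
            continue
        if op == "global.get":
            x = instr[1]
            # The referenced global must exist and be of the form `const t`.
            if 0 <= x < len(globals_) and globals_[x][0] == "const":
                continue
        return False
    return True


# Hypothetical context: global 0 is immutable, global 1 is mutable.
C_globals = [("const", "i32"), ("var", "i32")]
literal_ok = is_constant([("i32.const", 0)], C_globals)
immutable_ok = is_constant([("global.get", 0)], C_globals)
mutable_rejected = is_constant([("global.get", 1)], C_globals)
```

An expression such as two constants followed by `i32.add` is rejected by this check, because arithmetic instructions are not among the constant instructions listed above.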

3.4. Modules

Modules are valid when all the components they contain are valid. Furthermore, most definitions are themselves classified with a suitable type.

3.4.1. Functions

Functions $\mathit{func}$ are classified by function types of the form $[t_1^\ast] \to [t_2^?]$.

3.4.1.1. $\{\mathsf{type}~x, \mathsf{locals}~t^\ast, \mathsf{body}~\mathit{expr}\}$

The type $C.\mathsf{types}[x]$ must be defined in the context.
Let $[t_1^\ast] \to [t_2^?]$ be the function type $C.\mathsf{types}[x]$.
Let $C'$ be the same context as $C$, but with:

- $\mathsf{locals}$ set to the sequence of value types $t_1^\ast~t^\ast$, concatenating parameters and locals,
- $\mathsf{labels}$ set to the singular sequence containing only result type $[t_2^?]$,
- $\mathsf{return}$ set to the result type $[t_2^?]$.

Under the context $C'$, the expression $\mathit{expr}$ must be valid with result type $[t_2^?]$.
Then the function definition is valid with type $[t_1^\ast] \to [t_2^?]$.

$$\frac{C.\mathsf{types}[x] = [t_1^\ast] \to [t_2^?] \qquad C, \mathsf{locals}~t_1^\ast~t^\ast, \mathsf{labels}~[t_2^?], \mathsf{return}~[t_2^?] \vdash \mathit{expr} : [t_2^?]}{C \vdash \{\mathsf{type}~x, \mathsf{locals}~t^\ast, \mathsf{body}~\mathit{expr}\} : [t_1^\ast] \to [t_2^?]}$$

Note

The restriction on the length of the result types $t_2^\ast$ may be lifted in future versions of WebAssembly.
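
The construction of the body context $C'$ can be sketched in Python as follows. The function name `function_body_context`, the dictionary keys, and the representation of function types as pairs of lists are assumptions made for this illustration; checking the body expression itself is outside the scope of the sketch.

```python
def function_body_context(types, x, local_types):
    """Build the locals, labels, and return entries of C' for {type x, locals t*, body expr}."""
    if not 0 <= x < len(types):
        raise IndexError(f"type {x} is not defined in the context")
    params, results = types[x]
    if len(results) > 1:
        raise TypeError("a function type may not specify more than one result")
    return {
        "locals": list(params) + list(local_types),  # parameters precede locals
        "labels": [list(results)],                   # a single label for the function body
        "return": list(results),
    }


# Hypothetical module type section with one type: [i32 i64] -> [i32].
C_types = [(["i32", "i64"], ["i32"])]
body_ctx = function_body_context(C_types, 0, ["f32"])
```

The body expression would then be validated against the result type stored under `"return"`, which is the same type as that of the only label.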

3.4.2. Tables

Tables $\mathit{table}$ are classified by table types.

3.4.2.1. $\{\mathsf{type}~\mathit{tabletype}\}$

The table type $\mathit{tabletype}$ must be valid.
Then the table definition is valid with type $\mathit{tabletype}$.

$$\frac{\vdash \mathit{tabletype}~\mathrm{ok}}{C \vdash \{\mathsf{type}~\mathit{tabletype}\} : \mathit{tabletype}}$$

3.4.3. Memories

Memories $\mathit{mem}$ are classified by memory types.

3.4.3.1. $\{\mathsf{type}~\mathit{memtype}\}$

The memory type $\mathit{memtype}$ must be valid.
Then the memory definition is valid with type $\mathit{memtype}$.

$$\frac{\vdash \mathit{memtype}~\mathrm{ok}}{C \vdash \{\mathsf{type}~\mathit{memtype}\} : \mathit{memtype}}$$

3.4.4. Globals

Globals $\mathit{global}$ are classified by global types of the form $\mathit{mut}~t$.

3.4.4.1. $\{\mathsf{type}~\mathit{mut}~t, \mathsf{init}~\mathit{expr}\}$

The global type $\mathit{mut}~t$ must be valid.
The expression $\mathit{expr}$ must be valid with result type $[t]$.
The expression $\mathit{expr}$ must be constant.
Then the global definition is valid with type $\mathit{mut}~t$.

$$\frac{\vdash \mathit{mut}~t~\mathrm{ok} \qquad C \vdash \mathit{expr} : [t] \qquad C \vdash \mathit{expr}~\mathrm{const}}{C \vdash \{\mathsf{type}~\mathit{mut}~t, \mathsf{init}~\mathit{expr}\} : \mathit{mut}~t}$$

3.4.5. Element Segments

Element segments $\mathit{elem}$ are not classified by a type.

3.4.5.1. $\{\mathsf{table}~x, \mathsf{offset}~\mathit{expr}, \mathsf{init}~y^\ast\}$

The table $C.\mathsf{tables}[x]$ must be defined in the context.
Let $\mathit{limits}~\mathit{elemtype}$ be the table type $C.\mathsf{tables}[x]$.
The element type $\mathit{elemtype}$ must be $\mathsf{funcref}$.
The expression $\mathit{expr}$ must be valid with result type $[\mathsf{i32}]$.
The expression $\mathit{expr}$ must be constant.
For each $y_i$ in $y^\ast$, the function $C.\mathsf{funcs}[y_i]$ must be defined in the context.
Then the element segment is valid.

\frac{C.\mathsf{tables}[x] = \mathit{limits}~\mathsf{funcref} \qquad C \vdash \mathit{expr} : [\mathsf{i32}] \qquad C \vdash \mathit{expr}~\mathrm{const} \qquad (C.\mathsf{funcs}[y] = \mathit{functype})^\ast}{C \vdash \{\mathsf{table}~x, \mathsf{offset}~\mathit{expr}, \mathsf{init}~y^\ast\}~\mathrm{ok}}

3.4.6. Data Segments

Data segments $\mathit{data}$ are not classified by any type.

3.4.6.1. $\{\mathsf{data}~x, \mathsf{offset}~\mathit{expr}, \mathsf{init}~b^\ast\}$

The memory $C.\mathsf{mems}[x]$ must be defined in the context.
The expression $\mathit{expr}$ must be valid with result type $[\mathsf{i32}]$.
The expression $\mathit{expr}$ must be constant.
Then the data segment is valid.

\frac{C.\mathsf{mems}[x] = \mathit{limits} \qquad C \vdash \mathit{expr} : [\mathsf{i32}] \qquad C \vdash \mathit{expr}~\mathrm{const}}{C \vdash \{\mathsf{data}~x, \mathsf{offset}~\mathit{expr}, \mathsf{init}~b^\ast\}~\mathrm{ok}}

3.4.7. Start Function

Start function declarations $\mathit{start}$ are not classified by any type.

3.4.7.1. $\{\mathsf{func}~x\}$

The function $C.\mathsf{funcs}[x]$ must be defined in the context.
The type of $C.\mathsf{funcs}[x]$ must be $[] \to []$.
Then the start function is valid.

\frac{C.\mathsf{funcs}[x] = [] \to []}{C \vdash \{\mathsf{func}~x\}~\mathrm{ok}}

3.4.8. Exports

Exports $\mathit{export}$ and export descriptions $\mathit{exportdesc}$ are classified by their external type.

3.4.8.1. $\{\mathsf{name}~\mathit{name}, \mathsf{desc}~\mathit{exportdesc}\}$

The export description $\mathit{exportdesc}$ must be valid with external type $\mathit{externtype}$.
Then the export is valid with external type $\mathit{externtype}$.

\frac{C \vdash \mathit{exportdesc} : \mathit{externtype}}{C \vdash \{\mathsf{name}~\mathit{name}, \mathsf{desc}~\mathit{exportdesc}\} : \mathit{externtype}}

3.4.8.2. $\mathsf{func}~x$

The function $C.\mathsf{funcs}[x]$ must be defined in the context.
Then the export description is valid with external type $\mathsf{func}~C.\mathsf{funcs}[x]$.

\frac{C.\mathsf{funcs}[x] = \mathit{functype}}{C \vdash \mathsf{func}~x : \mathsf{func}~\mathit{functype}}

3.4.8.3. $\mathsf{table}~x$

The table $C.\mathsf{tables}[x]$ must be defined in the context.
Then the export description is valid with external type $\mathsf{table}~C.\mathsf{tables}[x]$.

\frac{C.\mathsf{tables}[x] = \mathit{tabletype}}{C \vdash \mathsf{table}~x : \mathsf{table}~\mathit{tabletype}}

3.4.8.4. $\mathsf{mem}~x$

The memory $C.\mathsf{mems}[x]$ must be defined in the context.
Then the export description is valid with external type $\mathsf{mem}~C.\mathsf{mems}[x]$.

\frac{C.\mathsf{mems}[x] = \mathit{memtype}}{C \vdash \mathsf{mem}~x : \mathsf{mem}~\mathit{memtype}}

3.4.8.5. $\mathsf{global}~x$

The global $C.\mathsf{globals}[x]$ must be defined in the context.
Then the export description is valid with external type $\mathsf{global}~C.\mathsf{globals}[x]$.

\frac{C.\mathsf{globals}[x] = \mathit{globaltype}}{C \vdash \mathsf{global}~x : \mathsf{global}~\mathit{globaltype}}

3.4.9. Imports

Imports $\mathit{import}$ and import descriptions $\mathit{importdesc}$ are classified by external types.

3.4.9.1. $\{\mathsf{module}~\mathit{name}_1, \mathsf{name}~\mathit{name}_2, \mathsf{desc}~\mathit{importdesc}\}$

The import description $\mathit{importdesc}$ must be valid with type $\mathit{externtype}$.
Then the import is valid with type $\mathit{externtype}$.

\frac{C \vdash \mathit{importdesc} : \mathit{externtype}}{C \vdash \{\mathsf{module}~\mathit{name}_1, \mathsf{name}~\mathit{name}_2, \mathsf{desc}~\mathit{importdesc}\} : \mathit{externtype}}

3.4.9.2. $\mathsf{func}~x$

The type $C.\mathsf{types}[x]$ must be defined in the context.
Let $[t_1^\ast] \to [t_2^\ast]$ be the function type $C.\mathsf{types}[x]$.
Then the import description is valid with type $\mathsf{func}~[t_1^\ast] \to [t_2^\ast]$.

\frac{C.\mathsf{types}[x] = [t_1^\ast] \to [t_2^\ast]}{C \vdash \mathsf{func}~x : \mathsf{func}~[t_1^\ast] \to [t_2^\ast]}

3.4.9.3. $\mathsf{table}~\mathit{tabletype}$

The table type $\mathit{tabletype}$ must be valid.
Then the import description is valid with type $\mathsf{table}~\mathit{tabletype}$.

\frac{\vdash \mathit{tabletype}~\mathrm{ok}}{C \vdash \mathsf{table}~\mathit{tabletype} : \mathsf{table}~\mathit{tabletype}}

3.4.9.4. $\mathsf{mem}~\mathit{memtype}$

The memory type $\mathit{memtype}$ must be valid.
Then the import description is valid with type $\mathsf{mem}~\mathit{memtype}$.

\frac{\vdash \mathit{memtype}~\mathrm{ok}}{C \vdash \mathsf{mem}~\mathit{memtype} : \mathsf{mem}~\mathit{memtype}}

3.4.9.5. $\mathsf{global}~\mathit{globaltype}$

The global type $\mathit{globaltype}$ must be valid.
Then the import description is valid with type $\mathsf{global}~\mathit{globaltype}$.

\frac{\vdash \mathit{globaltype}~\mathrm{ok}}{C \vdash \mathsf{global}~\mathit{globaltype} : \mathsf{global}~\mathit{globaltype}}

3.4.10. Modules

Modules are classified by their mapping from the external types of their imports to those of their exports.

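A module is entirely closed, that is, its components can only refer to definitions that appear in the module itself. Consequently, no initial context is required. Instead, the context $C$ for validation of the module’s content is constructed from the definitions in the module.

Let $\mathit{module}$ be the module to validate.
Let $C$ be a context where:
- $C.\mathsf{types}$ is $\mathit{module}.\mathsf{types}$,
- $C.\mathsf{funcs}$ is $\mathrm{funcs}(\mathit{it}^\ast)$ concatenated with $\mathit{ft}^\ast$, with the imports’ external types $\mathit{it}^\ast$ and the internal function types $\mathit{ft}^\ast$ as determined below,
- $C.\mathsf{tables}$ is $\mathrm{tables}(\mathit{it}^\ast)$ concatenated with $\mathit{tt}^\ast$, with the imports’ external types $\mathit{it}^\ast$ and the internal table types $\mathit{tt}^\ast$ as determined below,
- $C.\mathsf{mems}$ is $\mathrm{mems}(\mathit{it}^\ast)$ concatenated with $\mathit{mt}^\ast$, with the imports’ external types $\mathit{it}^\ast$ and the internal memory types $\mathit{mt}^\ast$ as determined below,
- $C.\mathsf{globals}$ is $\mathrm{globals}(\mathit{it}^\ast)$ concatenated with $\mathit{gt}^\ast$, with the imports’ external types $\mathit{it}^\ast$ and the internal global types $\mathit{gt}^\ast$ as determined below,
- $C.\mathsf{locals}$ is empty,
- $C.\mathsf{labels}$ is empty,
- $C.\mathsf{return}$ is empty.
Let $C'$ be the context where $C'.\mathsf{globals}$ is the sequence $\mathrm{globals}(\mathit{it}^\ast)$ and all other fields are empty.
Under the context $C$:
- For each $\mathit{functype}_i$ in $\mathit{module}.\mathsf{types}$, the function type $\mathit{functype}_i$ must be valid.
- For each $\mathit{func}_i$ in $\mathit{module}.\mathsf{funcs}$, the definition $\mathit{func}_i$ must be valid with a function type $\mathit{ft}_i$.
- For each $\mathit{table}_i$ in $\mathit{module}.\mathsf{tables}$, the definition $\mathit{table}_i$ must be valid with a table type $\mathit{tt}_i$.
- For each $\mathit{mem}_i$ in $\mathit{module}.\mathsf{mems}$, the definition $\mathit{mem}_i$ must be valid with a memory type $\mathit{mt}_i$.
- For each $\mathit{global}_i$ in $\mathit{module}.\mathsf{globals}$: under the context $C'$, the definition $\mathit{global}_i$ must be valid with a global type $\mathit{gt}_i$.
- For each $\mathit{elem}_i$ in $\mathit{module}.\mathsf{elem}$, the segment $\mathit{elem}_i$ must be valid.
- For each $\mathit{data}_i$ in $\mathit{module}.\mathsf{data}$, the segment $\mathit{data}_i$ must be valid.
- If $\mathit{module}.\mathsf{start}$ is non-empty, then $\mathit{module}.\mathsf{start}$ must be valid.
- For each $\mathit{import}_i$ in $\mathit{module}.\mathsf{imports}$, the import $\mathit{import}_i$ must be valid with an external type $\mathit{it}_i$.
- For each $\mathit{export}_i$ in $\mathit{module}.\mathsf{exports}$, the export $\mathit{export}_i$ must be valid with an external type $\mathit{et}_i$.
The length of $C.\mathsf{tables}$ must not be larger than $1$.
The length of $C.\mathsf{mems}$ must not be larger than $1$.
All export names $\mathit{export}_i.\mathsf{name}$ must be different.
Let $\mathit{ft}^\ast$ be the concatenation of the internal function types $\mathit{ft}_i$, in index order.
Let $\mathit{tt}^\ast$ be the concatenation of the internal table types $\mathit{tt}_i$, in index order.
Let $\mathit{mt}^\ast$ be the concatenation of the internal memory types $\mathit{mt}_i$, in index order.
Let $\mathit{gt}^\ast$ be the concatenation of the internal global types $\mathit{gt}_i$, in index order.
Let $\mathit{it}^\ast$ be the concatenation of external types $\mathit{it}_i$ of the imports, in index order.
Let $\mathit{et}^\ast$ be the concatenation of external types $\mathit{et}_i$ of the exports, in index order.
Then the module is valid with external types $\mathit{it}^\ast \to \mathit{et}^\ast$.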

Note

Most definitions in a module – particularly functions – are mutually recursive. Consequently, the definition of the context $C$ in this rule is recursive: it depends on the outcome of validation of the function, table, memory, and global definitions contained in the module, which itself depends on $C$. However, this recursion is just a specification device. All types needed to construct $C$ can easily be determined from a simple pre-pass over the module that does not perform any actual validation.

Globals, however, are not recursive. The effect of defining the limited context $C'$ for validating the module’s globals is that their initialization expressions can only access imported globals and nothing else.

Note

The restriction on the number of tables and memories may be lifted in future versions of WebAssembly.
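The way imports and internal definitions combine into the index spaces of the context can be sketched as follows. This is a minimal illustration under stated assumptions: external types are modelled as hypothetical `(kind, type)` pairs with types written as plain strings, and the helper names `index_space` and `check_limits` are not part of the specification.

```python
from typing import List, Tuple

# Hypothetical encoding of an external type: ("func" | "table" | "mem" | "global", type)
ExternType = Tuple[str, object]

def index_space(kind: str, import_types: List[ExternType],
                internal_types: List[object]) -> List[object]:
    """Imports of the given kind come first, followed by the module's own definitions."""
    imported = [t for (k, t) in import_types if k == kind]
    return imported + list(internal_types)

def check_limits(tables: List[object], mems: List[object]) -> bool:
    """At most one table and at most one memory are allowed in this version."""
    return len(tables) <= 1 and len(mems) <= 1

imports = [("func", "[i32] -> []"), ("global", "const i32"), ("mem", "{min 1}")]
funcs = index_space("func", imports, ["[] -> [i32]"])
mems = index_space("mem", imports, [])
# The limited context for global initializers sees imported globals only.
globals_for_init = index_space("global", imports, [])
```

Because the types of imports and internal definitions can be read off syntactically, such a pre-pass needs no validation of function bodies.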

4. Execution

4.1. Conventions

WebAssembly code is executed when instantiating a module or invoking an exported function on the resulting module instance.

Execution behavior is defined in terms of an abstract machine that models the program state. It includes a stack, which records operand values and control constructs, and an abstract store containing global state.

For each instruction, there is a rule that specifies the effect of its execution on the program state. Furthermore, there are rules describing the instantiation of a module. As with validation, all rules are given in two equivalent forms:

In prose, describing the execution in intuitive form.
In formal notation, describing the rule in mathematical form. [1]

Note

As with validation, the prose and formal rules are equivalent, so that understanding of the formal notation is not required to read this specification. The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.

4.1.1. Prose Notation

Execution is specified by stylised, step-wise rules for each instruction of the abstract syntax. The following conventions are adopted in stating these rules.

The execution rules implicitly assume a given store $S$.
The execution rules also assume the presence of an implicit stack that is modified by pushing or popping values, labels, and frames.
Certain rules require the stack to contain at least one frame. The most recent frame is referred to as the current frame.
Both the store and the current frame are mutated by replacing some of their components. Such replacement is assumed to apply globally.
The execution of an instruction may trap, in which case the entire computation is aborted and no further modifications to the store are performed by it. (Other computations can still be initiated afterwards.)
The execution of an instruction may also end in a jump to a designated target, which defines the next instruction to execute.
Execution can enter and exit instruction sequences that form blocks.
Instruction sequences are implicitly executed in order, unless a trap or jump occurs.
In various places the rules contain assertions expressing crucial invariants about the program state.

4.1.2. Formal Notation

Note

This section gives a brief explanation of the notation for specifying execution formally. For the interested reader, a more thorough introduction can be found in respective text books. [2]

The formal execution rules use a standard approach for specifying operational semantics, rendering them into reduction rules. Every rule has the following general form:

\mathit{configuration} \hookrightarrow \mathit{configuration}

A configuration is a syntactic description of a program state. Each rule specifies one step of execution. As long as there is at most one reduction rule applicable to a given configuration, reduction – and thereby execution – is deterministic. WebAssembly has only very few exceptions to this, which are noted explicitly in this specification.

For WebAssembly, a configuration typically is a tuple $(S; F; \mathit{instr}^\ast)$ consisting of the current store $S$, the call frame $F$ of the current function, and the sequence of instructions that is to be executed.

To avoid unnecessary clutter, the store $S$ and the frame $F$ are omitted from reduction rules that do not touch them.

There is no separate representation of the stack. Instead, it is conveniently represented as part of the configuration’s instruction sequence. In particular, values are defined to coincide with $\mathsf{const}$ instructions, and a sequence of $\mathsf{const}$ instructions can be interpreted as an operand “stack” that grows to the right.

Note

For example, the reduction rule for the $\mathsf{i32.add}$ instruction can be given as follows:

(\mathsf{i32.const}~n_1)~(\mathsf{i32.const}~n_2)~\mathsf{i32.add} \hookrightarrow (\mathsf{i32.const}~(n_1 + n_2) \bmod 2^{32})

Per this rule, two $\mathsf{const}$ instructions and the $\mathsf{add}$ instruction itself are removed from the instruction stream and replaced with one new $\mathsf{const}$ instruction. This can be interpreted as popping two values off the stack and pushing the result.

When no result is produced, an instruction reduces to the empty sequence:

\mathsf{nop} \hookrightarrow \epsilon

Labels and frames are similarly defined to be part of an instruction sequence.

The order of reduction is determined by the definition of an appropriate evaluation context.

Reduction terminates when no more reduction rules are applicable. Soundness of the WebAssembly type system guarantees that this is only the case when the original instruction sequence has either been reduced to a sequence of $\mathsf{const}$ instructions, which can be interpreted as the values of the resulting operand stack, or if a trap occurred.

Note

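For example, the following instruction sequence,

(\mathsf{f64.const}~x_1)~(\mathsf{f64.const}~x_2)~\mathsf{f64.neg}~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}

terminates after three steps:

(\mathsf{f64.const}~x_1)~(\mathsf{f64.const}~x_2)~\mathsf{f64.neg}~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}
\hookrightarrow (\mathsf{f64.const}~x_1)~(\mathsf{f64.const}~x_4)~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}
\hookrightarrow (\mathsf{f64.const}~x_1)~(\mathsf{f64.const}~x_5)~\mathsf{f64.mul}
\hookrightarrow (\mathsf{f64.const}~x_6)

where $x_4 = -x_2$ and $x_5 = -x_2 + x_3$ and $x_6 = x_1 \cdot (-x_2 + x_3)$.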
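The rewriting style of these rules can be sketched in a few lines of code. The following is a toy illustration only, assuming instructions are modelled as hypothetical tuples such as `("i32.const", n)`; it covers just the `i32.add` and `nop` rules, and the function names `step` and `run` are not part of the specification.

```python
def step(instrs):
    """Apply one reduction step to a list of instructions, or return None if none applies."""
    for i, ins in enumerate(instrs):
        op = ins[0]
        if op == "i32.const":
            continue  # values form the operand "stack" to the left of the redex
        if op == "nop":
            # nop reduces to the empty sequence
            return instrs[:i] + instrs[i + 1:]
        if op == "i32.add" and i >= 2:
            # every instruction before position i is a const, so these are the operands
            n1, n2 = instrs[i - 2][1], instrs[i - 1][1]
            result = ("i32.const", (n1 + n2) % 2 ** 32)
            return instrs[:i - 2] + [result] + instrs[i + 1:]
        raise ValueError(f"no reduction rule applies to {ins!r}")
    return None  # only const instructions remain: reduction has terminated

def run(instrs):
    """Reduce until no rule applies and return the remaining values."""
    while True:
        nxt = step(instrs)
        if nxt is None:
            return instrs
        instrs = nxt
```

Scanning past leading `const` instructions plays the role of the evaluation context: the first non-value instruction is the one that gets reduced.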

[1]

[2]	For example: Benjamin Pierce. Types and Programming Languages. The MIT Press 2002

4.2. Runtime Structure

Store, stack, and other runtime structure forming the WebAssembly abstract machine, such as values or module instances, are made precise in terms of additional auxiliary syntax.

4.2.1. Values

WebAssembly computations manipulate values of the four basic value types: integers and floating-point data of 32 or 64 bit width each, respectively.

In most places of the semantics, values of different types can occur. In order to avoid ambiguities, values are therefore represented with an abstract syntax that makes their type explicit. It is convenient to reuse the same notation as for the $\mathsf{const}$ instructions producing them:

\mathit{val} ::= \mathsf{i32.const}~\mathit{i32} \mid \mathsf{i64.const}~\mathit{i64} \mid \mathsf{f32.const}~\mathit{f32} \mid \mathsf{f64.const}~\mathit{f64}

4.2.2. Results

A result is the outcome of a computation. It is either a sequence of values or a trap.

\mathit{result} ::= \mathit{val}^\ast \mid \mathsf{trap}

Note

In the current version of WebAssembly, a result can consist of at most one value.

4.2.3. Store

The store represents all global state that can be manipulated by WebAssembly programs. It consists of the runtime representation of all instances of functions, tables, memories, and globals that have been allocated during the life time of the abstract machine. [1]

Syntactically, the store is defined as a record listing the existing instances of each category:

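\mathit{store} ::= \{\mathsf{funcs}~\mathit{funcinst}^\ast, \mathsf{tables}~\mathit{tableinst}^\ast, \mathsf{mems}~\mathit{meminst}^\ast, \mathsf{globals}~\mathit{globalinst}^\ast\}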

[1]	In practice, implementations may apply techniques like garbage collection to remove objects from the store that are no longer referenced. However, such techniques are not semantically observable, and hence outside the scope of this specification.

4.2.3.1. Convention

The meta variable $S$ ranges over stores where clear from context.

4.2.4. Addresses

Function instances, table instances, memory instances, and global instances in the store are referenced with abstract addresses. These are simply indices into the respective store component.

\mathit{addr} ::= 0 \mid 1 \mid 2 \mid \dots
\mathit{funcaddr} ::= \mathit{addr}
\mathit{tableaddr} ::= \mathit{addr}
\mathit{memaddr} ::= \mathit{addr}
\mathit{globaladdr} ::= \mathit{addr}

An embedder may assign identity to exported store objects corresponding to their addresses, even where this identity is not observable from within WebAssembly code itself (such as for function instances or immutable globals).

Note

Addresses are dynamic, globally unique references to runtime objects, in contrast to indices, which are static, module-local references to their original definitions. A memory address $\mathit{memaddr}$ denotes the abstract address of a memory instance in the store, not an offset inside a memory instance.

There is no specific limit on the number of allocations of store objects, hence logical addresses can be arbitrarily large natural numbers.
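The distinction between store addresses and module-local indices can be illustrated with a small sketch. The `Store` class and the `alloc` helper below are hypothetical names chosen for illustration, and instances are modelled as arbitrary Python objects.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Store:
    funcs: List[object] = field(default_factory=list)
    tables: List[object] = field(default_factory=list)
    mems: List[object] = field(default_factory=list)
    globals: List[object] = field(default_factory=list)

def alloc(component: List[object], instance: object) -> int:
    """Append an instance to a store component; its address is its position there."""
    addr = len(component)
    component.append(instance)
    return addr

store = Store()
# Two modules each define a global with static index 0,
# yet the two instances receive distinct addresses in the shared store.
module_a_globaladdrs = [alloc(store.globals, 10)]
module_b_globaladdrs = [alloc(store.globals, 20)]
```

Each module instance keeps its own list of addresses, so index 0 resolves to a different store entry depending on the module in which it is used.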

4.2.5. Module Instances

A module instance is the runtime representation of a module. It is created by instantiating a module, and collects runtime representations of all entities that are imported, defined, or exported by the module.

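\mathit{moduleinst} ::= \{\mathsf{types}~\mathit{functype}^\ast, \mathsf{funcaddrs}~\mathit{funcaddr}^\ast, \mathsf{tableaddrs}~\mathit{tableaddr}^\ast, \mathsf{memaddrs}~\mathit{memaddr}^\ast, \mathsf{globaladdrs}~\mathit{globaladdr}^\ast, \mathsf{exports}~\mathit{exportinst}^\ast\}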

Each component references runtime instances corresponding to respective declarations from the original module – whether imported or defined – in the order of their static indices. Function instances, table instances, memory instances, and global instances are referenced with an indirection through their respective addresses in the store.

It is an invariant of the semantics that all export instances in a given module instance have different names.

4.2.6. Function Instances

A function instance is the runtime representation of a function. It effectively is a closure of the original function over the runtime module instance of its originating module. The module instance is used to resolve references to other definitions during execution of the function.

\mathit{funcinst} ::= \{\mathsf{type}~\mathit{functype}, \mathsf{module}~\mathit{moduleinst}, \mathsf{code}~\mathit{func}\} \mid \{\mathsf{type}~\mathit{functype}, \mathsf{hostcode}~\mathit{hostfunc}\}
\mathit{hostfunc} ::= \dots

A host function is a function expressed outside WebAssembly but passed to a module as an import. The definition and behavior of host functions are outside the scope of this specification. For the purpose of this specification, it is assumed that when invoked, a host function behaves non-deterministically, but within certain constraints that ensure the integrity of the runtime.

Note

Function instances are immutable, and their identity is not observable by WebAssembly code. However, the embedder might provide implicit or explicit means for distinguishing their addresses.

4.2.7. Table Instances

A table instance is the runtime representation of a table. It holds a vector of function elements and an optional maximum size, if one was specified in the table type at the table’s definition site.

Each function element is either empty, representing an uninitialized table entry, or a function address. Function elements can be mutated through the execution of an element segment or by external means provided by the embedder.

\mathit{tableinst} ::= \{\mathsf{elem}~\mathit{vec}(\mathit{funcelem}), \mathsf{max}~\mathit{u32}^?\}
\mathit{funcelem} ::= \mathit{funcaddr}^?

It is an invariant of the semantics that the length of the element vector never exceeds the maximum size, if present.

Note

Other table elements may be added in future versions of WebAssembly.

4.2.8. Memory Instances

A memory instance is the runtime representation of a linear memory. It holds a vector of bytes and an optional maximum size, if one was specified at the definition site of the memory.

\mathit{meminst} ::= \{\mathsf{data}~\mathit{vec}(\mathit{byte}), \mathsf{max}~\mathit{u32}^?\}

The length of the vector always is a multiple of the WebAssembly page size, which is defined to be the constant $65536$ – abbreviated $64\,\mathrm{Ki}$. Like in a memory type, the maximum size in a memory instance is given in units of this page size.

The bytes can be mutated through memory instructions, the execution of a data segment, or by external means provided by the embedder.

It is an invariant of the semantics that the length of the byte vector, divided by page size, never exceeds the maximum size, if present.
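The two memory invariants stated above can be checked mechanically. The following sketch assumes a hypothetical `MemInst` class mirroring the record above; the `grow` helper is an illustration of how an operation can preserve the invariants and does not model the full semantics of memory growth.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 65536  # 64 Ki bytes

@dataclass
class MemInst:
    data: bytearray
    max: Optional[int] = None  # optional maximum size, in pages

def invariants_hold(mem: MemInst) -> bool:
    """Length is a multiple of the page size and does not exceed the maximum, if present."""
    pages, remainder = divmod(len(mem.data), PAGE_SIZE)
    return remainder == 0 and (mem.max is None or pages <= mem.max)

def grow(mem: MemInst, delta_pages: int) -> bool:
    """Hypothetical helper: append zeroed pages unless the maximum would be exceeded."""
    new_pages = len(mem.data) // PAGE_SIZE + delta_pages
    if mem.max is not None and new_pages > mem.max:
        return False
    mem.data.extend(bytes(delta_pages * PAGE_SIZE))
    return True
```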

4.2.9. Global Instances

A global instance is the runtime representation of a global variable. It holds an individual value and a flag indicating whether it is mutable.

\mathit{globalinst} ::= \{\mathsf{value}~\mathit{val}, \mathsf{mut}~\mathit{mut}\}

The value of mutable globals can be mutated through variable instructions or by external means provided by the embedder.

4.2.10. Export Instances

An export instance is the runtime representation of an export. It defines the export’s name and the associated external value.

\mathit{exportinst} ::= \{\mathsf{name}~\mathit{name}, \mathsf{value}~\mathit{externval}\}

4.2.11. External Values

An external value is the runtime representation of an entity that can be imported or exported. It is an address denoting either a function instance, table instance, memory instance, or global instance in the shared store.

\mathit{externval} ::= \mathsf{func}~\mathit{funcaddr} \mid \mathsf{table}~\mathit{tableaddr} \mid \mathsf{mem}~\mathit{memaddr} \mid \mathsf{global}~\mathit{globaladdr}

4.2.11.1. Conventions

The following auxiliary notation is defined for sequences of external values. It filters out entries of a specific kind in an order-preserving fashion:

$\mathrm{funcs}(\mathit{externval}^\ast) = [\mathit{funcaddr} \mid (\mathsf{func}~\mathit{funcaddr}) \in \mathit{externval}^\ast]$
$\mathrm{tables}(\mathit{externval}^\ast) = [\mathit{tableaddr} \mid (\mathsf{table}~\mathit{tableaddr}) \in \mathit{externval}^\ast]$
$\mathrm{mems}(\mathit{externval}^\ast) = [\mathit{memaddr} \mid (\mathsf{mem}~\mathit{memaddr}) \in \mathit{externval}^\ast]$
$\mathrm{globals}(\mathit{externval}^\ast) = [\mathit{globaladdr} \mid (\mathsf{global}~\mathit{globaladdr}) \in \mathit{externval}^\ast]$
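These filters are ordinary order-preserving list comprehensions. A minimal sketch, assuming external values are modelled as hypothetical `(kind, address)` pairs:

```python
from typing import List, Tuple

ExternVal = Tuple[str, int]  # hypothetical encoding: (kind, address)

def select(kind: str, externvals: List[ExternVal]) -> List[int]:
    """Keep the addresses of all entries of the given kind, preserving their order."""
    return [addr for (k, addr) in externvals if k == kind]

def funcs(externvals: List[ExternVal]) -> List[int]:
    return select("func", externvals)

def tables(externvals: List[ExternVal]) -> List[int]:
    return select("table", externvals)

def mems(externvals: List[ExternVal]) -> List[int]:
    return select("mem", externvals)

def globals_(externvals: List[ExternVal]) -> List[int]:
    # trailing underscore avoids shadowing Python's built-in globals()
    return select("global", externvals)
```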

4.2.12. Stack

Besides the store, most instructions interact with an implicit stack. The stack contains three kinds of entries:

Values: the operands of instructions.
Labels: active structured control instructions that can be targeted by branches.
Activations: the call frames of active function calls.

These entries can occur on the stack in any order during the execution of a program. Stack entries are described by abstract syntax as follows.

Note

It is possible to model the WebAssembly semantics using separate stacks for operands, control constructs, and calls. However, because the stacks are interdependent, additional book keeping about associated stack heights would be required. For the purpose of this specification, an interleaved representation is simpler.

4.2.12.1. Values

Values are represented by themselves.

4.2.12.2. Labels

Labels carry an argument arity $n$ and their associated branch target, which is expressed syntactically as an instruction sequence:

\mathit{label} ::= \mathsf{label}_n\{\mathit{instr}^\ast\}

Intuitively, $\mathit{instr}^\ast$ is the continuation to execute when the branch is taken, in place of the original control construct.

Note

For example, a loop label has the form

\mathsf{label}_n\{\mathsf{loop}~\dots~\mathsf{end}\}

When performing a branch to this label, this executes the loop, effectively restarting it from the beginning. Conversely, a simple block label has the form

\mathsf{label}_n\{\epsilon\}

When branching, the empty continuation ends the targeted block, such that execution can proceed with consecutive instructions.

4.2.12.3. Activations and Frames

Activation frames carry the return arity $n$ of the respective function, hold the values of its locals (including arguments) in the order corresponding to their static local indices, and a reference to the function’s own module instance:

\mathit{activation} ::= \mathsf{frame}_n\{\mathit{frame}\}
\mathit{frame} ::= \{\mathsf{locals}~\mathit{val}^\ast, \mathsf{module}~\mathit{moduleinst}\}

The values of the locals are mutated by respective variable instructions.

4.2.12.4. Conventions

The meta variable $L$ ranges over labels where clear from context.
The meta variable $F$ ranges over frames where clear from context.

Note

In the current version of WebAssembly, the arities of labels and frames cannot be larger than $1$. This may be generalized in future versions.

4.2.13. Administrative Instructions

Note

This section is only relevant for the formal notation.

In order to express the reduction of traps, calls, and control instructions, the syntax of instructions is extended to include the following administrative instructions:

\mathit{instr} ::= \dots \mid \mathsf{trap} \mid \mathsf{invoke}~\mathit{funcaddr} \mid \mathsf{init\_elem}~\mathit{tableaddr}~\mathit{u32}~\mathit{funcidx}^\ast \mid \mathsf{init\_data}~\mathit{memaddr}~\mathit{u32}~\mathit{byte}^\ast \mid \mathsf{label}_n\{\mathit{instr}^\ast\}~\mathit{instr}^\ast~\mathsf{end} \mid \mathsf{frame}_n\{\mathit{frame}\}~\mathit{instr}^\ast~\mathsf{end}

The $\mathsf{trap}$ instruction represents the occurrence of a trap. Traps are bubbled up through nested instruction sequences, ultimately reducing the entire program to a single $\mathsf{trap}$ instruction, signalling abrupt termination.

The $\mathsf{invoke}$ instruction represents the imminent invocation of a function instance, identified by its address. It unifies the handling of different forms of calls.

The $\mathsf{init\_elem}$ and $\mathsf{init\_data}$ instructions perform initialization of element and data segments during module instantiation.

Note

The reason for splitting instantiation into individual reduction steps is to provide a semantics that is compatible with future extensions like threads.

The $\mathsf{label}$ and $\mathsf{frame}$ instructions model labels and frames “on the stack”. Moreover, the administrative syntax maintains the nesting structure of the original structured control instruction or function body and their instruction sequences with an $\mathsf{end}$ marker. That way, the end of the inner instruction sequence is known when part of an outer sequence.

Note

For example, the reduction rule for $\mathsf{block}$ is:

\mathsf{block}~[t^n]~\mathit{instr}^\ast~\mathsf{end} \hookrightarrow \mathsf{label}_n\{\epsilon\}~\mathit{instr}^\ast~\mathsf{end}

This replaces the block with a label instruction, which can be interpreted as “pushing” the label on the stack. When $\mathsf{end}$ is reached, i.e., the inner instruction sequence has been reduced to the empty sequence – or rather, a sequence of $n$ $\mathsf{const}$ instructions representing the resulting values – then the $\mathsf{label}$ instruction is eliminated courtesy of its own reduction rule:

\mathsf{label}_n\{\mathit{instr}^\ast\}~\mathit{val}^n~\mathsf{end} \hookrightarrow \mathit{val}^n

This can be interpreted as removing the label from the stack and only leaving the locally accumulated operand values.

4.2.13.1. Block Contexts

In order to specify the reduction of branches, the following syntax of block contexts is defined, indexed by the count $k$ of labels surrounding a hole $[\_]$ that marks the place where the next step of computation is taking place:

B^0 ::= \mathit{val}^\ast~[\_]~\mathit{instr}^\ast
B^{k+1} ::= \mathit{val}^\ast~\mathsf{label}_n\{\mathit{instr}^\ast\}~B^k~\mathsf{end}~\mathit{instr}^\ast

This definition makes it possible to index the active labels surrounding a branch or return instruction.

Note

For example, the reduction of a simple branch can be defined as follows:

\mathsf{label}_0\{\mathit{instr}^\ast\}~B^l[\mathsf{br}~l]~\mathsf{end} \hookrightarrow \mathit{instr}^\ast

Here, the hole $[\_]$ of the context is instantiated with a branch instruction. When a branch occurs, this rule replaces the targeted label and associated instruction sequence with the label’s continuation. The selected label is identified through the label index $l$, which corresponds to the number of surrounding $\mathsf{label}$ instructions that must be hopped over – which is exactly the count encoded in the index of a block context.

4.2.13.2. Configurations

A configuration consists of the current store and an executing thread.

A thread is a computation over instructions that operates relative to a current frame referring to the module instance in which the computation runs, i.e., where the current function originates from.

\mathit{config} ::= \mathit{store}; \mathit{thread}
\mathit{thread} ::= \mathit{frame}; \mathit{instr}^\ast

Note

The current version of WebAssembly is single-threaded, but configurations with multiple threads may be supported in the future.

4.2.13.3. Evaluation Contexts

Finally, the following definition of evaluation context and associated structural rules enable reduction inside instruction sequences and administrative forms as well as the propagation of traps:

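E ::= [\_] \mid \mathit{val}^\ast~E~\mathit{instr}^\ast \mid \mathsf{label}_n\{\mathit{instr}^\ast\}~E~\mathsf{end}

Reduction terminates when a thread’s instruction sequence has been reduced to a result, that is, either a sequence of values or a $\mathsf{trap}$.

Note

The restriction on evaluation contexts rules out contexts like $[\_]$ and $\epsilon~[\_]~\epsilon$ for which $E[\mathsf{trap}] = \mathsf{trap}$.

For an example of reduction under evaluation contexts, consider the following instruction sequence.

(\mathsf{f64.const}~x_1)~(\mathsf{f64.const}~x_2)~\mathsf{f64.neg}~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}

This can be decomposed into $E[(\mathsf{f64.const}~x_2)~\mathsf{f64.neg}]$ where

E = (\mathsf{f64.const}~x_1)~[\_]~(\mathsf{f64.const}~x_3)~\mathsf{f64.add}~\mathsf{f64.mul}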

Moreover, this is the only possible choice of evaluation context where the contents of the hole matches the left-hand side of a reduction rule.

4.3. Numerics

Numeric primitives are defined in a generic manner, by operators indexed over a bit width $N$.

Some operators are non-deterministic, because they can return one of several possible results (such as different NaN values). Technically, each operator thus returns a set of allowed values. For convenience, deterministic results are expressed as plain values, which are assumed to be identified with a respective singleton set.

Some operators are partial, because they are not defined on certain inputs. Technically, an empty set of results is returned for these inputs.

In formal notation, each operator is defined by equational clauses that apply in decreasing order of precedence. That is, the first clause that is applicable to the given arguments defines the result. In some cases, similar clauses are combined into one by using the notation $\pm$ or $\mp$. When several of these placeholders occur in a single clause, then they must be resolved consistently: either the upper sign is chosen for all of them or the lower sign.

Note

For example, the $\mathrm{fcopysign}$ operator is defined as follows:

\mathrm{fcopysign}_N(\pm p_1, \pm p_2) = \pm p_1
\mathrm{fcopysign}_N(\pm p_1, \mp p_2) = \mp p_1

This definition is to be read as a shorthand for the following expansion of each clause into two separate ones:

\mathrm{fcopysign}_N(+p_1, +p_2) = +p_1
\mathrm{fcopysign}_N(-p_1, -p_2) = -p_1
\mathrm{fcopysign}_N(+p_1, -p_2) = -p_1
\mathrm{fcopysign}_N(-p_1, +p_2) = +p_1

Conventions:

The meta variable $d$ is used to range over single bits.
The meta variable $p$ is used to range over (signless) magnitudes of floating-point values, including $\mathsf{nan}$ and $\infty$.
The meta variable $q$ is used to range over (signless) rational magnitudes, excluding $\mathsf{nan}$ or $\infty$.
The notation $f^{-1}$ denotes the inverse of a bijective function $f$.
Truncation of rational values is written $\mathrm{trunc}(\pm q)$, with the usual mathematical definition:

$\mathrm{trunc}(\pm q) = \pm i \quad (\text{if}~i \in \mathbb{N} \land +q - 1 < i \leq +q)$
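As a quick sanity check, truncation toward zero can be written directly from this definition. The sketch below uses Python's `fractions.Fraction` to stand in for rational values; the function name `trunc` simply mirrors the notation above.

```python
import math
from fractions import Fraction

def trunc(q: Fraction) -> int:
    """Round a rational toward zero."""
    # floor(|q|) is the unique natural number i with |q| - 1 < i <= |q|
    i = math.floor(abs(q))
    return i if q >= 0 else -i
```

For instance, `trunc(Fraction(-7, 2))` drops the fractional part of -3.5 and keeps the sign, which differs from Python's floor division for negative operands.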

4.3.1. Representations

Numbers have an underlying binary representation as a sequence of bits:

\mathrm{bits}_{\mathsf{i}N}(i) = \mathrm{ibits}_N(i)
\mathrm{bits}_{\mathsf{f}N}(z) = \mathrm{fbits}_N(z)

Each of these functions is a bijection, hence they are invertible.

4.3.1.1. Integers

Integers are represented as base two unsigned numbers:

\mathrm{ibits}_N(i) = d_{N-1}~\dots~d_0 \quad (i = 2^{N-1} \cdot d_{N-1} + \dots + 2^0 \cdot d_0)

Boolean operators like $\land$, $\lor$, or $\veebar$ are lifted to bit sequences of equal length by applying them pointwise.
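The base-two representation and the pointwise lifting of Boolean operators can be sketched as follows. Bit sequences are modelled as Python lists with the most significant bit first; the names `ibits_inv` and `lift` are hypothetical helpers chosen for this illustration.

```python
from typing import Callable, List

def ibits(N: int, i: int) -> List[int]:
    """Bits of the unsigned N-bit integer i, most significant bit first."""
    return [(i >> k) & 1 for k in range(N - 1, -1, -1)]

def ibits_inv(bits: List[int]) -> int:
    """Inverse of ibits: interpret a bit sequence as a base-two unsigned number."""
    value = 0
    for d in bits:
        value = 2 * value + d
    return value

def lift(op: Callable[[int, int], int], bits1: List[int], bits2: List[int]) -> List[int]:
    """Apply a Boolean operator pointwise to two bit sequences of equal length."""
    if len(bits1) != len(bits2):
        raise ValueError("bit sequences must have equal length")
    return [op(a, b) for a, b in zip(bits1, bits2)]
```

A bitwise operator on integers is then obtained by converting to bits, lifting, and converting back, for example `ibits_inv(lift(lambda a, b: a ^ b, ibits(8, 12), ibits(8, 10)))`.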

4.3.1.2. Floating-Point

Floating-point values are represented in the respective binary format defined by [IEEE-754-2019] (Section 3.4):

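\mathrm{fbits}_N(\pm(1 + m \cdot 2^{-M}) \cdot 2^e) = \mathrm{fsign}(\pm)~\mathrm{ibits}_E(e + \mathrm{fbias}_N)~\mathrm{ibits}_M(m)
\mathrm{fbits}_N(\pm(0 + m \cdot 2^{-M}) \cdot 2^e) = \mathrm{fsign}(\pm)~(0)^E~\mathrm{ibits}_M(m)
\mathrm{fbits}_N(\pm\infty) = \mathrm{fsign}(\pm)~(1)^E~(0)^M
\mathrm{fbits}_N(\pm\mathsf{nan}(n)) = \mathrm{fsign}(\pm)~(1)^E~\mathrm{ibits}_M(n)
\mathrm{fbias}_N = 2^{E-1} - 1
\mathrm{fsign}(+) = 0
\mathrm{fsign}(-) = 1

where $M = \mathrm{signif}(N)$ and $E = \mathrm{expon}(N)$.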

4.3.1.3. Storage

When a number is stored into memory, it is converted into a sequence of bytes in little endian byte order:

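\mathrm{bytes}_t(i) = \mathrm{littleendian}(\mathrm{bits}_t(i))
\mathrm{littleendian}(\epsilon) = \epsilon
\mathrm{littleendian}(d^8~{d'}^\ast) = \mathrm{littleendian}({d'}^\ast)~\mathrm{bytes}_{\mathsf{i8}}^{-1}(d^8)

Again these functions are invertible bijections.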
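The conversion to little endian byte order can be sketched recursively: the leading eight bits form the most significant byte, which ends up last. This is an illustration under the assumption that bit sequences are lists with the most significant bit first; the recursion shape of `littleendian` is chosen for this sketch.

```python
from typing import List

def ibits(N: int, i: int) -> List[int]:
    """Bits of the unsigned N-bit integer i, most significant bit first."""
    return [(i >> k) & 1 for k in range(N - 1, -1, -1)]

def littleendian(bits: List[int]) -> bytes:
    """Convert a bit sequence (length a multiple of 8) to bytes, least significant byte first."""
    if not bits:
        return b""
    head, rest = bits[:8], bits[8:]
    byte = 0
    for d in head:
        byte = 2 * byte + d
    # the leading (most significant) byte is emitted after all remaining bytes
    return littleendian(rest) + bytes([byte])

def bytes_i(N: int, i: int) -> bytes:
    """Storage representation of an N-bit integer."""
    return littleendian(ibits(N, i))
```

The result agrees with Python's built-in `int.to_bytes(N // 8, "little")`.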

4.3.2. Integer Operations

4.3.2.1. Sign Interpretation

Integer operators are defined on $\mathsf{i}N$ values. Operators that use a signed interpretation convert the value using the following definition, which takes the two’s complement when the value lies in the upper half of the value range (i.e., its most significant bit is $1$):

\mathrm{signed}_N(i) = i \quad (0 \leq i < 2^{N-1})
\mathrm{signed}_N(i) = i - 2^N \quad (2^{N-1} \leq i < 2^N)

This function is bijective, and hence invertible.
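The signed interpretation and its inverse are short enough to state directly in code. A minimal sketch, with `signed_inv` as a hypothetical name for the inverse function:

```python
def signed(N: int, i: int) -> int:
    """Two's complement interpretation of the unsigned N-bit value i."""
    return i if i < 2 ** (N - 1) else i - 2 ** N

def signed_inv(N: int, j: int) -> int:
    """Inverse of signed: map a value in [-2^(N-1), 2^(N-1)) back to its unsigned form."""
    return j % 2 ** N
```

Values in the lower half of the range are unchanged, while values with the most significant bit set become negative.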

4.3.2.2. Boolean Interpretation

The integer result of predicates – i.e., tests and relational operators – is defined with the help of the following auxiliary function producing the value $1$ or $0$ depending on a condition.

\mathrm{bool}(C) = 1 \quad (\text{if}~C)
\mathrm{bool}(C) = 0 \quad (\text{otherwise})

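4.3.2.3. $\mathrm{iadd}_N(i_1, i_2)$

Return the result of adding $i_1$ and $i_2$ modulo $2^N$.

\mathrm{iadd}_N(i_1, i_2) = (i_1 + i_2) \bmod 2^N

4.3.2.4. $\mathrm{isub}_N(i_1, i_2)$

Return the result of subtracting $i_2$ from $i_1$ modulo $2^N$.

\mathrm{isub}_N(i_1, i_2) = (i_1 - i_2 + 2^N) \bmod 2^N

4.3.2.5. $\mathrm{imul}_N(i_1, i_2)$

Return the result of multiplying $i_1$ and $i_2$ modulo $2^N$.

\mathrm{imul}_N(i_1, i_2) = (i_1 \cdot i_2) \bmod 2^N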

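4.3.2.6. $\mathrm{idiv\_u}_N(i_1, i_2)$

If $i_2$ is $0$, then the result is undefined.
Else, return the result of dividing $i_1$ by $i_2$, truncated toward zero.

\mathrm{idiv\_u}_N(i_1, 0) = \{\}
\mathrm{idiv\_u}_N(i_1, i_2) = \mathrm{trunc}(i_1 / i_2)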

Note

This operator is partial.

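4.3.2.7. $\mathrm{idiv\_s}_N(i_1, i_2)$

Let $j_1$ be the signed interpretation of $i_1$.
Let $j_2$ be the signed interpretation of $i_2$.
If $j_2$ is $0$, then the result is undefined.
Else if $j_1$ divided by $j_2$ is $2^{N-1}$, then the result is undefined.
Else, return the result of dividing $j_1$ by $j_2$, truncated toward zero.

\mathrm{idiv\_s}_N(i_1, 0) = \{\}
\mathrm{idiv\_s}_N(i_1, i_2) = \{\} \quad (\text{if}~\mathrm{signed}_N(i_1) / \mathrm{signed}_N(i_2) = 2^{N-1})
\mathrm{idiv\_s}_N(i_1, i_2) = \mathrm{signed}_N^{-1}(\mathrm{trunc}(\mathrm{signed}_N(i_1) / \mathrm{signed}_N(i_2)))

Note

This operator is partial. Besides division by $0$, the result of $(-2^{N-1}) / (-1) = +2^{N-1}$ is not representable as an $N$-bit signed integer.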

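4.3.2.8. $\mathrm{irem\_u}_N(i_1, i_2)$

If $i_2$ is $0$, then the result is undefined.
Else, return the remainder of dividing $i_1$ by $i_2$.

\mathrm{irem\_u}_N(i_1, 0) = \{\}
\mathrm{irem\_u}_N(i_1, i_2) = i_1 - i_2 \cdot \mathrm{trunc}(i_1 / i_2)

Note

This operator is partial.

As long as both operators are defined, it holds that $i_1 = i_2 \cdot \mathrm{idiv\_u}(i_1, i_2) + \mathrm{irem\_u}(i_1, i_2)$.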

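4.3.2.9. $\mathrm{irem\_s}_N(i_1, i_2)$

Let $j_1$ be the signed interpretation of $i_1$.
Let $j_2$ be the signed interpretation of $i_2$.
If $i_2$ is $0$, then the result is undefined.
Else, return the remainder of dividing $j_1$ by $j_2$, with the sign of the dividend $j_1$.

\mathrm{irem\_s}_N(i_1, 0) = \{\}
\mathrm{irem\_s}_N(i_1, i_2) = \mathrm{signed}_N^{-1}(j_1 - j_2 \cdot \mathrm{trunc}(j_1 / j_2)) \quad (\text{where}~j_1 = \mathrm{signed}_N(i_1) \land j_2 = \mathrm{signed}_N(i_2))

Note

This operator is partial.

As long as both operators are defined, it holds that $i_1 = i_2 \cdot \mathrm{idiv\_s}(i_1, i_2) + \mathrm{irem\_s}(i_1, i_2)$.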
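The partial division and remainder operators can be sketched as follows. In this illustration an undefined result (the empty set of results) is modelled as Python's `None`, which is an assumption of the sketch; `trunc_div` is a hypothetical helper, needed because Python's `//` rounds toward negative infinity rather than toward zero.

```python
def signed(N, i):
    """Two's complement interpretation of the unsigned N-bit value i."""
    return i if i < 2 ** (N - 1) else i - 2 ** N

def trunc_div(a, b):
    """Integer division truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

def idiv_u(N, i1, i2):
    return None if i2 == 0 else i1 // i2

def irem_u(N, i1, i2):
    return None if i2 == 0 else i1 - i2 * (i1 // i2)

def idiv_s(N, i1, i2):
    j1, j2 = signed(N, i1), signed(N, i2)
    if j2 == 0:
        return None
    q = trunc_div(j1, j2)
    if q == 2 ** (N - 1):
        return None  # (-2^(N-1)) / (-1) is not representable
    return q % 2 ** N  # back to the unsigned representation

def irem_s(N, i1, i2):
    j1, j2 = signed(N, i1), signed(N, i2)
    if j2 == 0:
        return None
    # the remainder takes the sign of the dividend j1
    return (j1 - j2 * trunc_div(j1, j2)) % 2 ** N
```

Note that the signed remainder is defined even where the signed quotient is not: dividing $-2^{N-1}$ by $-1$ leaves remainder $0$.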

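4.3.2.10. $\mathrm{iand}_N(i_1, i_2)$

Return the bitwise conjunction of $i_1$ and $i_2$.

\mathrm{iand}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(\mathrm{ibits}_N(i_1) \land \mathrm{ibits}_N(i_2))

4.3.2.11. $\mathrm{ior}_N(i_1, i_2)$

Return the bitwise disjunction of $i_1$ and $i_2$.

\mathrm{ior}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(\mathrm{ibits}_N(i_1) \lor \mathrm{ibits}_N(i_2))

4.3.2.12. $\mathrm{ixor}_N(i_1, i_2)$

Return the bitwise exclusive disjunction of $i_1$ and $i_2$.

\mathrm{ixor}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(\mathrm{ibits}_N(i_1) \veebar \mathrm{ibits}_N(i_2))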

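4.3.2.13. $\mathrm{ishl}_N(i_1, i_2)$

Let $k$ be $i_2$ modulo $N$.
Return the result of shifting $i_1$ left by $k$ bits, modulo $2^N$.

\mathrm{ishl}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(d_2^{N-k}~0^k) \quad (\text{if}~\mathrm{ibits}_N(i_1) = d_1^k~d_2^{N-k} \land k = i_2 \bmod N)

4.3.2.14. $\mathrm{ishr\_u}_N(i_1, i_2)$

Let $k$ be $i_2$ modulo $N$.
Return the result of shifting $i_1$ right by $k$ bits, extended with $0$ bits.

\mathrm{ishr\_u}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(0^k~d_1^{N-k}) \quad (\text{if}~\mathrm{ibits}_N(i_1) = d_1^{N-k}~d_2^k \land k = i_2 \bmod N)

4.3.2.15. $\mathrm{ishr\_s}_N(i_1, i_2)$

Let $k$ be $i_2$ modulo $N$.
Return the result of shifting $i_1$ right by $k$ bits, extended with the most significant bit of the original value.

\mathrm{ishr\_s}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(d_0^{k+1}~d_1^{N-k-1}) \quad (\text{if}~\mathrm{ibits}_N(i_1) = d_0~d_1^{N-k-1}~d_2^k \land k = i_2 \bmod N)

4.3.2.16. $\mathrm{irotl}_N(i_1, i_2)$

Let $k$ be $i_2$ modulo $N$.
Return the result of rotating $i_1$ left by $k$ bits.

\mathrm{irotl}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(d_2^{N-k}~d_1^k) \quad (\text{if}~\mathrm{ibits}_N(i_1) = d_1^k~d_2^{N-k} \land k = i_2 \bmod N)

4.3.2.17. $\mathrm{irotr}_N(i_1, i_2)$

Let $k$ be $i_2$ modulo $N$.
Return the result of rotating $i_1$ right by $k$ bits.

\mathrm{irotr}_N(i_1, i_2) = \mathrm{ibits}_N^{-1}(d_2^k~d_1^{N-k}) \quad (\text{if}~\mathrm{ibits}_N(i_1) = d_1^{N-k}~d_2^k \land k = i_2 \bmod N)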
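The shift and rotate operators can be sketched with ordinary integer arithmetic instead of explicit bit sequences. This is an illustrative formulation that assumes the shift count is taken modulo $N$, as is conventional for these operators; it is not the bit-sequence definition itself.

```python
def signed(N, i):
    """Two's complement interpretation of the unsigned N-bit value i."""
    return i if i < 2 ** (N - 1) else i - 2 ** N

def ishl(N, i1, i2):
    k = i2 % N
    return (i1 << k) % 2 ** N

def ishr_u(N, i1, i2):
    k = i2 % N
    return i1 >> k  # vacated high bits are filled with 0

def ishr_s(N, i1, i2):
    k = i2 % N
    # Python's >> on negative integers replicates the sign bit
    return (signed(N, i1) >> k) % 2 ** N

def irotl(N, i1, i2):
    k = i2 % N
    return ((i1 << k) | (i1 >> (N - k))) % 2 ** N

def irotr(N, i1, i2):
    k = i2 % N
    return ((i1 >> k) | (i1 << (N - k))) % 2 ** N
```

Because the count is reduced modulo $N$, shifting a 32-bit value by 33 behaves like shifting by 1.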

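4.3.2.18. $\mathrm{iclz}_N(i)$

Return the count of leading zero bits in $i$; all bits are considered leading zeros if $i$ is $0$.

$$
\mathrm{iclz}_N(i) = k \quad (\text{if } \mathrm{ibits}_N(i) = 0^k\ (1\ d^*)^?)
$$

4.3.2.19. $\mathrm{ictz}_N(i)$

Return the count of trailing zero bits in $i$; all bits are considered trailing zeros if $i$ is $0$.

$$
\mathrm{ictz}_N(i) = k \quad (\text{if } \mathrm{ibits}_N(i) = (d^*\ 1)^?\ 0^k)
$$

4.3.2.20. $\mathrm{ipopcnt}_N(i)$

Return the count of non-zero bits in $i$.

$$
\mathrm{ipopcnt}_N(i) = k \quad (\text{if } \mathrm{ibits}_N(i) = (0^*\ 1)^k\ 0^*)
$$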
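
For illustration, the three bit-counting operators can be written in a few lines of Python using only built-in integer operations. This is a sketch under the assumption that `i` is a non-negative integer below $2^N$; the names are local to the example.

```python
def iclz(n, i):
    """Leading zero bits; int.bit_length() is 0 for i == 0, giving n."""
    return n - i.bit_length()


def ictz(n, i):
    """Trailing zero bits; i & -i isolates the lowest set bit."""
    return n if i == 0 else (i & -i).bit_length() - 1


def ipopcnt(n, i):
    """Number of set bits."""
    return bin(i).count("1")
```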

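4.3.2.21. $\mathrm{ieqz}_N(i)$

Return $1$ if $i$ is zero, $0$ otherwise.

$$
\mathrm{ieqz}_N(i) = \mathrm{bool}(i = 0)
$$

4.3.2.22. $\mathrm{ieq}_N(i_1, i_2)$

Return $1$ if $i_1$ equals $i_2$, $0$ otherwise.

$$
\mathrm{ieq}_N(i_1, i_2) = \mathrm{bool}(i_1 = i_2)
$$

4.3.2.23. $\mathrm{ine}_N(i_1, i_2)$

Return $1$ if $i_1$ does not equal $i_2$, $0$ otherwise.

$$
\mathrm{ine}_N(i_1, i_2) = \mathrm{bool}(i_1 \neq i_2)
$$

4.3.2.24. $\mathrm{ilt\_u}_N(i_1, i_2)$

Return $1$ if $i_1$ is less than $i_2$, $0$ otherwise.

$$
\mathrm{ilt\_u}_N(i_1, i_2) = \mathrm{bool}(i_1 < i_2)
$$

4.3.2.25. $\mathrm{ilt\_s}_N(i_1, i_2)$

Let $j_1$ be the signed interpretation of $i_1$.
Let $j_2$ be the signed interpretation of $i_2$.
Return $1$ if $j_1$ is less than $j_2$, $0$ otherwise.

$$
\mathrm{ilt\_s}_N(i_1, i_2) = \mathrm{bool}(\mathrm{signed}_N(i_1) < \mathrm{signed}_N(i_2))
$$

4.3.2.26. $\mathrm{igt\_u}_N(i_1, i_2)$

Return $1$ if $i_1$ is greater than $i_2$, $0$ otherwise.

$$
\mathrm{igt\_u}_N(i_1, i_2) = \mathrm{bool}(i_1 > i_2)
$$

4.3.2.27. $\mathrm{igt\_s}_N(i_1, i_2)$

Let $j_1$ be the signed interpretation of $i_1$.
Let $j_2$ be the signed interpretation of $i_2$.
Return $1$ if $j_1$ is greater than $j_2$, $0$ otherwise.

$$
\mathrm{igt\_s}_N(i_1, i_2) = \mathrm{bool}(\mathrm{signed}_N(i_1) > \mathrm{signed}_N(i_2))
$$

4.3.2.28. $\mathrm{ile\_u}_N(i_1, i_2)$

Return $1$ if $i_1$ is less than or equal to $i_2$, $0$ otherwise.

$$
\mathrm{ile\_u}_N(i_1, i_2) = \mathrm{bool}(i_1 \leq i_2)
$$

4.3.2.29. $\mathrm{ile\_s}_N(i_1, i_2)$

Let $j_1$ be the signed interpretation of $i_1$.
Let $j_2$ be the signed interpretation of $i_2$.
Return $1$ if $j_1$ is less than or equal to $j_2$, $0$ otherwise.

$$
\mathrm{ile\_s}_N(i_1, i_2) = \mathrm{bool}(\mathrm{signed}_N(i_1) \leq \mathrm{signed}_N(i_2))
$$

4.3.2.30. $\mathrm{ige\_u}_N(i_1, i_2)$

Return $1$ if $i_1$ is greater than or equal to $i_2$, $0$ otherwise.

$$
\mathrm{ige\_u}_N(i_1, i_2) = \mathrm{bool}(i_1 \geq i_2)
$$

4.3.2.31. $\mathrm{ige\_s}_N(i_1, i_2)$

Let $j_1$ be the signed interpretation of $i_1$.
Let $j_2$ be the signed interpretation of $i_2$.
Return $1$ if $j_1$ is greater than or equal to $j_2$, $0$ otherwise.

$$
\mathrm{ige\_s}_N(i_1, i_2) = \mathrm{bool}(\mathrm{signed}_N(i_1) \geq \mathrm{signed}_N(i_2))
$$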

4.3.3. Floating-Point Operations

Floating-point arithmetic follows the [IEEE-754-2019] standard, with the following qualifications:

All operators use round-to-nearest ties-to-even, except where otherwise specified. Non-default directed rounding attributes are not supported.
Following the recommendation that operators propagate NaN payloads from their operands is permitted but not required.
All operators use “non-stop” mode, and floating-point exceptions are not otherwise observable. In particular, neither alternate floating-point exception handling attributes nor operators on status flags are supported. There is no observable difference between quiet and signalling NaNs.

Note

Some of these limitations may be lifted in future versions of WebAssembly.

4.3.3.1. Rounding

Rounding always is round-to-nearest ties-to-even, in correspondence with [IEEE-754-2019] (Section 4.3.1).

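An exact floating-point number is a rational number that is exactly representable as a floating-point number of given bit width $N$.

A limit number for a given floating-point bit width $N$ is a positive or negative number whose magnitude is the smallest power of $2$ that is not exactly representable as a floating-point number of width $N$ (that magnitude is $2^{128}$ for $N = 32$ and $2^{1024}$ for $N = 64$).

A candidate number is either an exact floating-point number or a positive or negative limit number for the given bit width $N$.

A candidate pair is a pair $z_1, z_2$ of candidate numbers, such that no candidate number exists that lies between the two.

A real number $r$ is converted to a floating-point value of bit width $N$ as follows:

If $r$ is $0$, then return $+0$.
Else if $r$ is an exact floating-point number, then return $r$.
Else if $r$ is greater than or equal to the positive limit, then return $+\infty$.
Else if $r$ is less than or equal to the negative limit, then return $-\infty$.
Else if $z_1$ and $z_2$ are a candidate pair such that $z_1 < r < z_2$, then:
  If $|r - z_1| < |r - z_2|$, then let $z$ be $z_1$.
  Else if $|r - z_1| > |r - z_2|$, then let $z$ be $z_2$.
  Else if $|r - z_1| = |r - z_2|$ and the significand of $z_1$ is even, then let $z$ be $z_1$.
  Else, let $z$ be $z_2$.
If $z$ is $0$, then:
  If $r < 0$, then return $-0$.
  Else, return $+0$.
Else if $z$ is a limit number, then:
  If $r < 0$, then return $-\infty$.
  Else, return $+\infty$.
Else, return $z$.

$$
\begin{array}{lll}
\mathrm{float}_N(0) &=& +0 \\
\mathrm{float}_N(r) &=& r \quad (\text{if } r \in \mathrm{exact}_N) \\
\mathrm{float}_N(r) &=& +\infty \quad (\text{if } r \geq +\mathrm{limit}_N) \\
\mathrm{float}_N(r) &=& -\infty \quad (\text{if } r \leq -\mathrm{limit}_N) \\
\mathrm{float}_N(r) &=& \mathrm{closest}_N(r, z_1, z_2) \quad (\text{if } z_1 < r < z_2 \land (z_1, z_2) \in \mathrm{candidatepair}_N) \\
\mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_1) \quad (\text{if } |r - z_1| < |r - z_2|) \\
\mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_2) \quad (\text{if } |r - z_1| > |r - z_2|) \\
\mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_1) \quad (\text{if } |r - z_1| = |r - z_2| \land \mathrm{even}_N(z_1)) \\
\mathrm{closest}_N(r, z_1, z_2) &=& \mathrm{rectify}_N(r, z_2) \quad (\text{if } |r - z_1| = |r - z_2| \land \mathrm{even}_N(z_2)) \\
\mathrm{rectify}_N(r, \pm\mathrm{limit}_N) &=& \pm\infty \\
\mathrm{rectify}_N(r, 0) &=& +0 \quad (\text{if } r \geq 0) \\
\mathrm{rectify}_N(r, 0) &=& -0 \quad (\text{if } r < 0) \\
\mathrm{rectify}_N(r, z) &=& z
\end{array}
$$

where:

$$
\begin{array}{lll}
\mathrm{exact}_N &=& fN \cap \mathbb{Q} \\
\mathrm{limit}_N &=& 2^{2^{\mathrm{expon}(N)-1}} \\
\mathrm{candidate}_N &=& \mathrm{exact}_N \cup \{+\mathrm{limit}_N, -\mathrm{limit}_N\} \\
\mathrm{candidatepair}_N &=& \{(z_1, z_2) \in \mathrm{candidate}_N^2 \mid z_1 < z_2 \land \forall z \in \mathrm{candidate}_N,\ z \leq z_1 \lor z \geq z_2\} \\
\mathrm{even}_N((d + m \cdot 2^{-M}) \cdot 2^e) &\Leftrightarrow& m \bmod 2 = 0 \\
\mathrm{even}_N(\pm\mathrm{limit}_N) &\Leftrightarrow& \mathrm{true}
\end{array}
$$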

4.3.3.2. NaN Propagation

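When the result of a floating-point operator other than $\mathrm{fneg}$, $\mathrm{fabs}$, or $\mathrm{fcopysign}$ is a NaN, then its sign is non-deterministic and the payload is computed as follows:

If the payload of all NaN inputs to the operator is canonical (including the case that there are no NaN inputs), then the payload of the output is canonical as well.
Otherwise the payload is picked non-deterministically among all arithmetic NaNs; that is, its most significant bit is $1$ and all others are unspecified.

This non-deterministic result is expressed by the following auxiliary function producing a set of allowed outputs from a set of inputs:

$$
\begin{array}{lll}
\mathrm{nans}_N\{z^*\} &=& \{+\mathrm{nan}(n), -\mathrm{nan}(n) \mid n = \mathrm{canon}_N\} \quad (\text{if } \forall\, \mathrm{nan}(n) \in z^*,\ n = \mathrm{canon}_N) \\
\mathrm{nans}_N\{z^*\} &=& \{+\mathrm{nan}(n), -\mathrm{nan}(n) \mid n \geq \mathrm{canon}_N\} \quad (\text{otherwise})
\end{array}
$$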
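
To make the two cases concrete, the following Python sketch checks whether a given 32-bit output pattern is an allowed NaN result for a set of 32-bit inputs. It works on raw IEEE 754 binary32 bit patterns; the function names and constants are assumptions of this example, and the sketch covers only $N = 32$.

```python
EXP_MASK = 0x7F800000    # all exponent bits of a binary32 value
FRAC_MASK = 0x007FFFFF   # the 23 fraction (payload) bits
TOP_FRAC_BIT = 0x00400000  # most significant payload bit


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def is_canonical_nan(bits):
    # Payload is exactly the most significant fraction bit; sign is free.
    return is_nan(bits) and (bits & FRAC_MASK) == TOP_FRAC_BIT


def is_arithmetic_nan(bits):
    # Most significant payload bit is 1; remaining bits are unspecified.
    return is_nan(bits) and (bits & TOP_FRAC_BIT) != 0


def nan_output_allowed(inputs, out):
    """True if NaN pattern `out` is a member of nans_32 for `inputs`."""
    nan_inputs = [b for b in inputs if is_nan(b)]
    if all(is_canonical_nan(b) for b in nan_inputs):
        return is_canonical_nan(out)
    return is_arithmetic_nan(out)
```

Because the sign is non-deterministic, both `0x7FC00000` and `0xFFC00000` are accepted when no input carries a non-canonical payload.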

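4.3.3.3. $\mathrm{fadd}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if both $z_1$ and $z_2$ are infinities of opposite signs, then return an element of $\mathrm{nans}_N\{\}$.
Else if both $z_1$ and $z_2$ are infinities of equal sign, then return that infinity.
Else if one of $z_1$ or $z_2$ is an infinity, then return that infinity.
Else if both $z_1$ and $z_2$ are zeroes of opposite sign, then return positive zero.
Else if both $z_1$ and $z_2$ are zeroes of equal sign, then return that zero.
Else if one of $z_1$ or $z_2$ is a zero, then return the other operand.
Else if both $z_1$ and $z_2$ are values with the same magnitude but opposite signs, then return positive zero.
Else return the result of adding $z_1$ and $z_2$, rounded to the nearest representable value.

4.3.3.4. $\mathrm{fsub}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if both $z_1$ and $z_2$ are infinities of equal signs, then return an element of $\mathrm{nans}_N\{\}$.
Else if both $z_1$ and $z_2$ are infinities of opposite sign, then return $z_1$.
Else if $z_1$ is an infinity, then return that infinity.
Else if $z_2$ is an infinity, then return that infinity negated.
Else if both $z_1$ and $z_2$ are zeroes of equal sign, then return positive zero.
Else if both $z_1$ and $z_2$ are zeroes of opposite sign, then return $z_1$.
Else if $z_2$ is a zero, then return $z_1$.
Else if $z_1$ is a zero, then return $z_2$ negated.
Else if both $z_1$ and $z_2$ are the same value, then return positive zero.
Else return the result of subtracting $z_2$ from $z_1$, rounded to the nearest representable value.

Note

Up to the non-determinism regarding NaNs, it always holds that $\mathrm{fsub}_N(z_1, z_2) = \mathrm{fadd}_N(z_1, \mathrm{fneg}_N(z_2))$.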

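4.3.3.5. $\mathrm{fmul}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if one of $z_1$ and $z_2$ is a zero and the other an infinity, then return an element of $\mathrm{nans}_N\{\}$.
Else if both $z_1$ and $z_2$ are infinities of equal sign, then return positive infinity.
Else if both $z_1$ and $z_2$ are infinities of opposite sign, then return negative infinity.
Else if one of $z_1$ or $z_2$ is an infinity and the other a value with equal sign, then return positive infinity.
Else if one of $z_1$ or $z_2$ is an infinity and the other a value with opposite sign, then return negative infinity.
Else if both $z_1$ and $z_2$ are zeroes of equal sign, then return positive zero.
Else if both $z_1$ and $z_2$ are zeroes of opposite sign, then return negative zero.
Else return the result of multiplying $z_1$ and $z_2$, rounded to the nearest representable value.

4.3.3.6. $\mathrm{fdiv}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if both $z_1$ and $z_2$ are infinities, then return an element of $\mathrm{nans}_N\{\}$.
Else if both $z_1$ and $z_2$ are zeroes, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if $z_1$ is an infinity and $z_2$ a value with equal sign, then return positive infinity.
Else if $z_1$ is an infinity and $z_2$ a value with opposite sign, then return negative infinity.
Else if $z_2$ is an infinity and $z_1$ a value with equal sign, then return positive zero.
Else if $z_2$ is an infinity and $z_1$ a value with opposite sign, then return negative zero.
Else if $z_1$ is a zero and $z_2$ a value with equal sign, then return positive zero.
Else if $z_1$ is a zero and $z_2$ a value with opposite sign, then return negative zero.
Else if $z_2$ is a zero and $z_1$ a value with equal sign, then return positive infinity.
Else if $z_2$ is a zero and $z_1$ a value with opposite sign, then return negative infinity.
Else return the result of dividing $z_1$ by $z_2$, rounded to the nearest representable value.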

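4.3.3.7. $\mathrm{fmin}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if one of $z_1$ or $z_2$ is a negative infinity, then return negative infinity.
Else if one of $z_1$ or $z_2$ is a positive infinity, then return the other value.
Else if both $z_1$ and $z_2$ are zeroes of opposite signs, then return negative zero.
Else return the smaller value of $z_1$ and $z_2$.

$$
\begin{array}{lll}
\mathrm{fmin}_N(\pm\mathrm{nan}(n), z_2) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n), z_2\} \\
\mathrm{fmin}_N(z_1, \pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n), z_1\} \\
\mathrm{fmin}_N(+\infty, z_2) &=& z_2 \\
\mathrm{fmin}_N(-\infty, z_2) &=& -\infty \\
\mathrm{fmin}_N(z_1, +\infty) &=& z_1 \\
\mathrm{fmin}_N(z_1, -\infty) &=& -\infty \\
\mathrm{fmin}_N(\mp 0, \pm 0) &=& -0 \\
\mathrm{fmin}_N(z_1, z_2) &=& z_1 \quad (\text{if } z_1 \leq z_2) \\
\mathrm{fmin}_N(z_1, z_2) &=& z_2 \quad (\text{if } z_2 \leq z_1)
\end{array}
$$

4.3.3.8. $\mathrm{fmax}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return an element of $\mathrm{nans}_N\{z_1, z_2\}$.
Else if one of $z_1$ or $z_2$ is a positive infinity, then return positive infinity.
Else if one of $z_1$ or $z_2$ is a negative infinity, then return the other value.
Else if both $z_1$ and $z_2$ are zeroes of opposite signs, then return positive zero.
Else return the larger value of $z_1$ and $z_2$.

$$
\begin{array}{lll}
\mathrm{fmax}_N(\pm\mathrm{nan}(n), z_2) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n), z_2\} \\
\mathrm{fmax}_N(z_1, \pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n), z_1\} \\
\mathrm{fmax}_N(+\infty, z_2) &=& +\infty \\
\mathrm{fmax}_N(-\infty, z_2) &=& z_2 \\
\mathrm{fmax}_N(z_1, +\infty) &=& +\infty \\
\mathrm{fmax}_N(z_1, -\infty) &=& z_1 \\
\mathrm{fmax}_N(\mp 0, \pm 0) &=& +0 \\
\mathrm{fmax}_N(z_1, z_2) &=& z_1 \quad (\text{if } z_1 \geq z_2) \\
\mathrm{fmax}_N(z_1, z_2) &=& z_2 \quad (\text{if } z_2 \geq z_1)
\end{array}
$$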
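
These rules differ from the minimum and maximum functions of many host languages in two respects: a NaN operand always yields a NaN, and zeroes of opposite sign are ordered. The Python sketch below illustrates this for 64-bit values only, since Python floats are IEEE 754 binary64. The names are choices of this example, and the sketch returns `math.nan` without modelling the non-deterministic sign and payload.

```python
import math


def _is_negative(z):
    """True if z carries a negative sign, including -0.0."""
    return math.copysign(1.0, z) < 0


def fmin(z1, z2):
    if math.isnan(z1) or math.isnan(z2):
        return math.nan                       # some element of nans_64
    if z1 == 0 and z2 == 0:                   # covers +0/-0 in any mix
        return -0.0 if _is_negative(z1) or _is_negative(z2) else 0.0
    return z1 if z1 < z2 else z2


def fmax(z1, z2):
    if math.isnan(z1) or math.isnan(z2):
        return math.nan
    if z1 == 0 and z2 == 0:
        return 0.0 if not (_is_negative(z1) and _is_negative(z2)) else -0.0
    return z1 if z1 > z2 else z2
```

By contrast, Python's built-in `min(1.0, math.nan)` returns `1.0`, which would not be a valid result here.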

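4.3.3.9. $\mathrm{fcopysign}_N(z_1, z_2)$

If $z_1$ and $z_2$ have the same sign, then return $z_1$.
Else return $z_1$ with negated sign.

$$
\begin{array}{lll}
\mathrm{fcopysign}_N(\pm p_1, \pm p_2) &=& \pm p_1 \\
\mathrm{fcopysign}_N(\pm p_1, \mp p_2) &=& \mp p_1
\end{array}
$$

4.3.3.10. $\mathrm{fabs}_N(z)$

If $z$ is a NaN, then return $z$ with positive sign.
Else if $z$ is an infinity, then return positive infinity.
Else if $z$ is a zero, then return positive zero.
Else if $z$ is a positive value, then return $z$.
Else return $z$ negated.

$$
\begin{array}{lll}
\mathrm{fabs}_N(\pm\mathrm{nan}(n)) &=& +\mathrm{nan}(n) \\
\mathrm{fabs}_N(\pm\infty) &=& +\infty \\
\mathrm{fabs}_N(\pm 0) &=& +0 \\
\mathrm{fabs}_N(\pm q) &=& +q
\end{array}
$$

4.3.3.11. $\mathrm{fneg}_N(z)$

If $z$ is a NaN, then return $z$ with negated sign.
Else if $z$ is an infinity, then return that infinity negated.
Else if $z$ is a zero, then return that zero negated.
Else return $z$ negated.

$$
\begin{array}{lll}
\mathrm{fneg}_N(\pm\mathrm{nan}(n)) &=& \mp\mathrm{nan}(n) \\
\mathrm{fneg}_N(\pm\infty) &=& \mp\infty \\
\mathrm{fneg}_N(\pm 0) &=& \mp 0 \\
\mathrm{fneg}_N(\pm q) &=& \mp q
\end{array}
$$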

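4.3.3.12. $\mathrm{fsqrt}_N(z)$

If $z$ is a NaN, then return an element of $\mathrm{nans}_N\{z\}$.
Else if $z$ is a zero, then return that zero.
Else if $z$ has a negative sign, then return an element of $\mathrm{nans}_N\{\}$.
Else if $z$ is positive infinity, then return positive infinity.
Else return the square root of $z$.

$$
\begin{array}{lll}
\mathrm{fsqrt}_N(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n)\} \\
\mathrm{fsqrt}_N(-\infty) &=& \mathrm{nans}_N\{\} \\
\mathrm{fsqrt}_N(+\infty) &=& +\infty \\
\mathrm{fsqrt}_N(\pm 0) &=& \pm 0 \\
\mathrm{fsqrt}_N(+q) &=& \mathrm{float}_N(\sqrt{q}) \\
\mathrm{fsqrt}_N(-q) &=& \mathrm{nans}_N\{\}
\end{array}
$$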

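4.3.3.13. $\mathrm{fceil}_N(z)$

If $z$ is a NaN, then return an element of $\mathrm{nans}_N\{z\}$.
Else if $z$ is an infinity, then return $z$.
Else if $z$ is a zero, then return $z$.
Else if $z$ is smaller than $0$ but greater than $-1$, then return negative zero.
Else return the smallest integral value that is not smaller than $z$.

$$
\begin{array}{lll}
\mathrm{fceil}_N(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n)\} \\
\mathrm{fceil}_N(\pm\infty) &=& \pm\infty \\
\mathrm{fceil}_N(\pm 0) &=& \pm 0 \\
\mathrm{fceil}_N(-q) &=& -0 \quad (\text{if } -1 < -q < 0) \\
\mathrm{fceil}_N(\pm q) &=& \mathrm{float}_N(i) \quad (\text{if } \pm q \leq i < \pm q + 1)
\end{array}
$$

4.3.3.14. $\mathrm{ffloor}_N(z)$

If $z$ is a NaN, then return an element of $\mathrm{nans}_N\{z\}$.
Else if $z$ is an infinity, then return $z$.
Else if $z$ is a zero, then return $z$.
Else if $z$ is greater than $0$ but smaller than $1$, then return positive zero.
Else return the largest integral value that is not larger than $z$.

$$
\begin{array}{lll}
\mathrm{ffloor}_N(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n)\} \\
\mathrm{ffloor}_N(\pm\infty) &=& \pm\infty \\
\mathrm{ffloor}_N(\pm 0) &=& \pm 0 \\
\mathrm{ffloor}_N(+q) &=& +0 \quad (\text{if } 0 < +q < 1) \\
\mathrm{ffloor}_N(\pm q) &=& \mathrm{float}_N(i) \quad (\text{if } \pm q - 1 < i \leq \pm q)
\end{array}
$$

4.3.3.15. $\mathrm{ftrunc}_N(z)$

If $z$ is a NaN, then return an element of $\mathrm{nans}_N\{z\}$.
Else if $z$ is an infinity, then return $z$.
Else if $z$ is a zero, then return $z$.
Else if $z$ is greater than $0$ but smaller than $1$, then return positive zero.
Else if $z$ is smaller than $0$ but greater than $-1$, then return negative zero.
Else return the integral value with the same sign as $z$ and the largest magnitude that is not larger than the magnitude of $z$.

$$
\begin{array}{lll}
\mathrm{ftrunc}_N(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n)\} \\
\mathrm{ftrunc}_N(\pm\infty) &=& \pm\infty \\
\mathrm{ftrunc}_N(\pm 0) &=& \pm 0 \\
\mathrm{ftrunc}_N(+q) &=& +0 \quad (\text{if } 0 < +q < 1) \\
\mathrm{ftrunc}_N(-q) &=& -0 \quad (\text{if } -1 < -q < 0) \\
\mathrm{ftrunc}_N(\pm q) &=& \mathrm{float}_N(\pm i) \quad (\text{if } +q - 1 < i \leq +q)
\end{array}
$$

4.3.3.16. $\mathrm{fnearest}_N(z)$

If $z$ is a NaN, then return an element of $\mathrm{nans}_N\{z\}$.
Else if $z$ is an infinity, then return $z$.
Else if $z$ is a zero, then return $z$.
Else if $z$ is greater than $0$ but smaller than or equal to $0.5$, then return positive zero.
Else if $z$ is smaller than $0$ but greater than or equal to $-0.5$, then return negative zero.
Else return the integral value that is nearest to $z$; if two values are equally near, return the even one.

$$
\begin{array}{lll}
\mathrm{fnearest}_N(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\pm\mathrm{nan}(n)\} \\
\mathrm{fnearest}_N(\pm\infty) &=& \pm\infty \\
\mathrm{fnearest}_N(\pm 0) &=& \pm 0 \\
\mathrm{fnearest}_N(+q) &=& +0 \quad (\text{if } 0 < +q \leq 0.5) \\
\mathrm{fnearest}_N(-q) &=& -0 \quad (\text{if } -0.5 \leq -q < 0) \\
\mathrm{fnearest}_N(\pm q) &=& \mathrm{float}_N(\pm i) \quad (\text{if } |i - q| < 0.5) \\
\mathrm{fnearest}_N(\pm q) &=& \mathrm{float}_N(\pm i) \quad (\text{if } |i - q| = 0.5 \land i \text{ even})
\end{array}
$$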
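
The rounding operators all preserve the sign of the operand, so small negative inputs produce negative zero. A brief Python sketch of the nearest-integer case for 64-bit values, relying on the fact that Python's built-in `round` on a float rounds half to even (the function name is a choice of this example; a NaN input is returned unchanged, which is one permitted outcome when its payload is arithmetic):

```python
import math


def fnearest(z):
    if math.isnan(z) or math.isinf(z) or z == 0:
        return z
    # round() yields an exact int with ties going to the even neighbour;
    # copysign restores the sign that is lost when the result is 0.
    return math.copysign(float(round(z)), z)
```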

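4.3.3.17. $\mathrm{feq}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return $0$.
Else if both $z_1$ and $z_2$ are zeroes, then return $1$.
Else if both $z_1$ and $z_2$ are the same value, then return $1$.
Else return $0$.

$$
\begin{array}{lll}
\mathrm{feq}_N(\pm\mathrm{nan}(n), z_2) &=& 0 \\
\mathrm{feq}_N(z_1, \pm\mathrm{nan}(n)) &=& 0 \\
\mathrm{feq}_N(\pm 0, \mp 0) &=& 1 \\
\mathrm{feq}_N(z_1, z_2) &=& \mathrm{bool}(z_1 = z_2)
\end{array}
$$

4.3.3.18. $\mathrm{fne}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return $1$.
Else if both $z_1$ and $z_2$ are zeroes, then return $0$.
Else if both $z_1$ and $z_2$ are the same value, then return $0$.
Else return $1$.

$$
\begin{array}{lll}
\mathrm{fne}_N(\pm\mathrm{nan}(n), z_2) &=& 1 \\
\mathrm{fne}_N(z_1, \pm\mathrm{nan}(n)) &=& 1 \\
\mathrm{fne}_N(\pm 0, \mp 0) &=& 0 \\
\mathrm{fne}_N(z_1, z_2) &=& \mathrm{bool}(z_1 \neq z_2)
\end{array}
$$

4.3.3.19. $\mathrm{flt}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return $0$.
Else if $z_1$ and $z_2$ are the same value, then return $0$.
Else if $z_1$ is positive infinity, then return $0$.
Else if $z_1$ is negative infinity, then return $1$.
Else if $z_2$ is positive infinity, then return $1$.
Else if $z_2$ is negative infinity, then return $0$.
Else if both $z_1$ and $z_2$ are zeroes, then return $0$.
Else if $z_1$ is smaller than $z_2$, then return $1$.
Else return $0$.

$$
\begin{array}{lll}
\mathrm{flt}_N(\pm\mathrm{nan}(n), z_2) &=& 0 \\
\mathrm{flt}_N(z_1, \pm\mathrm{nan}(n)) &=& 0 \\
\mathrm{flt}_N(z, z) &=& 0 \\
\mathrm{flt}_N(+\infty, z_2) &=& 0 \\
\mathrm{flt}_N(-\infty, z_2) &=& 1 \\
\mathrm{flt}_N(z_1, +\infty) &=& 1 \\
\mathrm{flt}_N(z_1, -\infty) &=& 0 \\
\mathrm{flt}_N(\pm 0, \mp 0) &=& 0 \\
\mathrm{flt}_N(z_1, z_2) &=& \mathrm{bool}(z_1 < z_2)
\end{array}
$$

4.3.3.20. $\mathrm{fgt}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return $0$.
Else if $z_1$ and $z_2$ are the same value, then return $0$.
Else if $z_1$ is positive infinity, then return $1$.
Else if $z_1$ is negative infinity, then return $0$.
Else if $z_2$ is positive infinity, then return $0$.
Else if $z_2$ is negative infinity, then return $1$.
Else if both $z_1$ and $z_2$ are zeroes, then return $0$.
Else if $z_1$ is larger than $z_2$, then return $1$.
Else return $0$.

$$
\begin{array}{lll}
\mathrm{fgt}_N(\pm\mathrm{nan}(n), z_2) &=& 0 \\
\mathrm{fgt}_N(z_1, \pm\mathrm{nan}(n)) &=& 0 \\
\mathrm{fgt}_N(z, z) &=& 0 \\
\mathrm{fgt}_N(+\infty, z_2) &=& 1 \\
\mathrm{fgt}_N(-\infty, z_2) &=& 0 \\
\mathrm{fgt}_N(z_1, +\infty) &=& 0 \\
\mathrm{fgt}_N(z_1, -\infty) &=& 1 \\
\mathrm{fgt}_N(\pm 0, \mp 0) &=& 0 \\
\mathrm{fgt}_N(z_1, z_2) &=& \mathrm{bool}(z_1 > z_2)
\end{array}
$$

4.3.3.21. $\mathrm{fle}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return $0$.
Else if $z_1$ and $z_2$ are the same value, then return $1$.
Else if $z_1$ is positive infinity, then return $0$.
Else if $z_1$ is negative infinity, then return $1$.
Else if $z_2$ is positive infinity, then return $1$.
Else if $z_2$ is negative infinity, then return $0$.
Else if both $z_1$ and $z_2$ are zeroes, then return $1$.
Else if $z_1$ is smaller than or equal to $z_2$, then return $1$.
Else return $0$.

$$
\begin{array}{lll}
\mathrm{fle}_N(\pm\mathrm{nan}(n), z_2) &=& 0 \\
\mathrm{fle}_N(z_1, \pm\mathrm{nan}(n)) &=& 0 \\
\mathrm{fle}_N(z, z) &=& 1 \\
\mathrm{fle}_N(+\infty, z_2) &=& 0 \\
\mathrm{fle}_N(-\infty, z_2) &=& 1 \\
\mathrm{fle}_N(z_1, +\infty) &=& 1 \\
\mathrm{fle}_N(z_1, -\infty) &=& 0 \\
\mathrm{fle}_N(\pm 0, \mp 0) &=& 1 \\
\mathrm{fle}_N(z_1, z_2) &=& \mathrm{bool}(z_1 \leq z_2)
\end{array}
$$

4.3.3.22. $\mathrm{fge}_N(z_1, z_2)$

If either $z_1$ or $z_2$ is a NaN, then return $0$.
Else if $z_1$ and $z_2$ are the same value, then return $1$.
Else if $z_1$ is positive infinity, then return $1$.
Else if $z_1$ is negative infinity, then return $0$.
Else if $z_2$ is positive infinity, then return $0$.
Else if $z_2$ is negative infinity, then return $1$.
Else if both $z_1$ and $z_2$ are zeroes, then return $1$.
Else if $z_1$ is larger than or equal to $z_2$, then return $1$.
Else return $0$.

$$
\begin{array}{lll}
\mathrm{fge}_N(\pm\mathrm{nan}(n), z_2) &=& 0 \\
\mathrm{fge}_N(z_1, \pm\mathrm{nan}(n)) &=& 0 \\
\mathrm{fge}_N(z, z) &=& 1 \\
\mathrm{fge}_N(+\infty, z_2) &=& 1 \\
\mathrm{fge}_N(-\infty, z_2) &=& 0 \\
\mathrm{fge}_N(z_1, +\infty) &=& 0 \\
\mathrm{fge}_N(z_1, -\infty) &=& 1 \\
\mathrm{fge}_N(\pm 0, \mp 0) &=& 1 \\
\mathrm{fge}_N(z_1, z_2) &=& \mathrm{bool}(z_1 \geq z_2)
\end{array}
$$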

4.3.4. Conversions

4.3.4.1. $\mathrm{extend}^{u}_{M,N}(i)$

Return $i$.

$$
\mathrm{extend}^{u}_{M,N}(i) = i
$$

Note

In the abstract syntax, unsigned extension just reinterprets the same value.

4.3.4.2. $\mathrm{extend}^{s}_{M,N}(i)$

Let $j$ be the signed interpretation of $i$ of size $M$.
Return the two’s complement of $j$ relative to size $N$.

$$
\mathrm{extend}^{s}_{M,N}(i) = \mathrm{signed}_N^{-1}(\mathrm{signed}_M(i))
$$

4.3.4.3. $\mathrm{wrap}_{M,N}(i)$

Return $i$ modulo $2^N$.

$$
\mathrm{wrap}_{M,N}(i) = i \bmod 2^N
$$
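
As a small worked illustration, signed extension and wrapping can be sketched in Python on non-negative integers. The parameter order `(m, n, i)` mirrors the subscripts $M, N$ used above; the function names are choices of this example.

```python
def extend_s(m, n, i):
    """Sign-extend the m-bit value i to n bits (m <= n)."""
    j = i - (1 << m) if i >= 1 << (m - 1) else i   # signed interpretation
    return j % (1 << n)                            # two's complement at size n


def wrap(m, n, i):
    """Keep only the low n bits of the m-bit value i."""
    return i % (1 << n)
```

For example, the byte `0xFF` denotes $-1$ as a signed 8-bit value and therefore extends to `0xFFFFFFFF` at 32 bits.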

4.3.4.4. $\mathrm{trunc}^{u}_{M,N}(z)$

If $z$ is a NaN, then the result is undefined.
Else if $z$ is an infinity, then the result is undefined.
Else if $z$ is a number and $\mathrm{trunc}(z)$ is a value within range of the target type, then return that value.
Else the result is undefined.

$$
\begin{array}{lll}
\mathrm{trunc}^{u}_{M,N}(\pm\mathrm{nan}(n)) &=& \{\} \\
\mathrm{trunc}^{u}_{M,N}(\pm\infty) &=& \{\} \\
\mathrm{trunc}^{u}_{M,N}(\pm q) &=& \mathrm{trunc}(\pm q) \quad (\text{if } -1 < \mathrm{trunc}(\pm q) < 2^N) \\
\mathrm{trunc}^{u}_{M,N}(\pm q) &=& \{\} \quad (\text{otherwise})
\end{array}
$$

Note

This operator is partial. It is not defined for NaNs, infinities, or values for which the result is out of range.

4.3.4.5. $\mathrm{trunc}^{s}_{M,N}(z)$

If $z$ is a NaN, then the result is undefined.
Else if $z$ is an infinity, then the result is undefined.
Else if $z$ is a number and $\mathrm{trunc}(z)$ is a value within range of the target type, then return that value.
Else the result is undefined.

$$
\begin{array}{lll}
\mathrm{trunc}^{s}_{M,N}(\pm\mathrm{nan}(n)) &=& \{\} \\
\mathrm{trunc}^{s}_{M,N}(\pm\infty) &=& \{\} \\
\mathrm{trunc}^{s}_{M,N}(\pm q) &=& \mathrm{trunc}(\pm q) \quad (\text{if } -2^{N-1} - 1 < \mathrm{trunc}(\pm q) < 2^{N-1}) \\
\mathrm{trunc}^{s}_{M,N}(\pm q) &=& \{\} \quad (\text{otherwise})
\end{array}
$$

Note

This operator is partial. It is not defined for NaNs, infinities, or values for which the result is out of range.
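
The range check applies to the truncated value, not to the operand itself, so for example $-0.9$ converts to $0$ in the unsigned case. The Python sketch below illustrates both operators for a 64-bit floating-point source, modelling an undefined result as `None`; names and the encoding of signed results as two's complement are assumptions of this example.

```python
import math


def trunc_u(n, z):
    if math.isnan(z) or math.isinf(z):
        return None
    q = math.trunc(z)                       # exact integer, rounded toward zero
    return q if 0 <= q < (1 << n) else None


def trunc_s(n, z):
    if math.isnan(z) or math.isinf(z):
        return None
    q = math.trunc(z)
    if -(1 << (n - 1)) <= q < (1 << (n - 1)):
        return q % (1 << n)                 # two's complement encoding
    return None
```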

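4.3.4.6. $\mathrm{promote}_{M,N}(z)$

If $z$ is a canonical NaN, then return an element of $\mathrm{nans}_N\{\}$ (i.e., a canonical NaN of size $N$).
Else if $z$ is a NaN, then return an element of $\mathrm{nans}_N\{\pm\mathrm{nan}(1)\}$ (i.e., any arithmetic NaN of size $N$).
Else, return $z$.

$$
\begin{array}{lll}
\mathrm{promote}_{M,N}(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\} \quad (\text{if } n = \mathrm{canon}_M) \\
\mathrm{promote}_{M,N}(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{+\mathrm{nan}(1)\} \quad (\text{otherwise}) \\
\mathrm{promote}_{M,N}(z) &=& z
\end{array}
$$

4.3.4.7. $\mathrm{demote}_{M,N}(z)$

If $z$ is a canonical NaN, then return an element of $\mathrm{nans}_N\{\}$ (i.e., a canonical NaN of size $N$).
Else if $z$ is a NaN, then return an element of $\mathrm{nans}_N\{\pm\mathrm{nan}(1)\}$ (i.e., any arithmetic NaN of size $N$).
Else if $z$ is an infinity, then return that infinity.
Else if $z$ is a zero, then return that zero.
Else, return $\mathrm{float}_N(z)$.

$$
\begin{array}{lll}
\mathrm{demote}_{M,N}(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{\} \quad (\text{if } n = \mathrm{canon}_M) \\
\mathrm{demote}_{M,N}(\pm\mathrm{nan}(n)) &=& \mathrm{nans}_N\{+\mathrm{nan}(1)\} \quad (\text{otherwise}) \\
\mathrm{demote}_{M,N}(\pm\infty) &=& \pm\infty \\
\mathrm{demote}_{M,N}(\pm 0) &=& \pm 0 \\
\mathrm{demote}_{M,N}(\pm q) &=& \mathrm{float}_N(\pm q)
\end{array}
$$

4.3.4.8. $\mathrm{convert}^{u}_{M,N}(i)$

Return $\mathrm{float}_N(i)$.

$$
\mathrm{convert}^{u}_{M,N}(i) = \mathrm{float}_N(i)
$$

4.3.4.9. $\mathrm{convert}^{s}_{M,N}(i)$

Let $j$ be the signed interpretation of $i$.
Return $\mathrm{float}_N(j)$.

$$
\mathrm{convert}^{s}_{M,N}(i) = \mathrm{float}_N(\mathrm{signed}_M(i))
$$

4.3.4.10. $\mathrm{reinterpret}_{t_1,t_2}(c)$

Let $d^*$ be the bit sequence $\mathrm{bits}_{t_1}(c)$.
Return the constant $c'$ for which $\mathrm{bits}_{t_2}(c') = d^*$.

$$
\mathrm{reinterpret}_{t_1,t_2}(c) = \mathrm{bits}_{t_2}^{-1}(\mathrm{bits}_{t_1}(c))
$$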
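
Reinterpretation leaves the bit pattern untouched and only changes the type under which it is read. A short Python sketch for the 64-bit case, using the standard `struct` module (the function names are choices of this example; the 32-bit case would use the `"<f"` and `"<I"` format codes instead):

```python
import struct


def reinterpret_f64_as_i64(z: float) -> int:
    """Read the 8 bytes of a binary64 value as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", z))[0]


def reinterpret_i64_as_f64(i: int) -> float:
    """Read a 64-bit integer's bytes as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", i))[0]
```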

4.4. Instructions

WebAssembly computation is performed by executing individual instructions.

4.4.1. Numeric Instructions

Numeric instructions are defined in terms of the generic numeric operators. The mapping of numeric instructions to their underlying operators is expressed by the following definition:

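$$
\begin{array}{lll}
\mathrm{op}_{iN}(n_1, \dots, n_k) &=& \mathrm{i}\mathit{op}_N(n_1, \dots, n_k) \\
\mathrm{op}_{fN}(z_1, \dots, z_k) &=& \mathrm{f}\mathit{op}_N(z_1, \dots, z_k)
\end{array}
$$

And for conversion operators:

$$
\mathrm{cvtop}^{sx^?}_{t_1,t_2}(c) = \mathrm{cvtop}^{sx^?}_{|t_1|,|t_2|}(c)
$$

Where the underlying operators are partial, the corresponding instruction will trap when the result is not defined. Where the underlying operators are non-deterministic, because they may return one of multiple possible NaN values, so are the corresponding instructions.

Note

For example, the result of instruction $\mathrm{i32.add}$ applied to operands $i_1, i_2$ invokes $\mathrm{add}_{\mathrm{i32}}(i_1, i_2)$, which maps to the generic $\mathrm{iadd}_{32}(i_1, i_2)$ via the above definition. Similarly, $\mathrm{i64.trunc\_f32\_s}$ applied to $z$ invokes $\mathrm{trunc}^{s}_{\mathrm{f32},\mathrm{i64}}(z)$, which maps to the generic $\mathrm{trunc}^{s}_{32,64}(z)$.

4.4.1.1. $t.\mathrm{const}\ c$

Push the value $t.\mathrm{const}\ c$ to the stack.

Note

No formal reduction rule is required for this instruction, since $\mathrm{const}$ instructions coincide with values.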

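4.4.1.2. $t.\mathit{unop}$

Assert: due to validation, a value of value type $t$ is on the top of the stack.
Pop the value $t.\mathrm{const}\ c_1$ from the stack.
If $\mathit{unop}_t(c_1)$ is defined, then:
  Let $c$ be a possible result of computing $\mathit{unop}_t(c_1)$.
  Push the value $t.\mathrm{const}\ c$ to the stack.
Else:
  Trap.

$$
\begin{array}{lll}
(t.\mathrm{const}\ c_1)\ t.\mathit{unop} &\hookrightarrow& (t.\mathrm{const}\ c) \quad (\text{if } c \in \mathit{unop}_t(c_1)) \\
(t.\mathrm{const}\ c_1)\ t.\mathit{unop} &\hookrightarrow& \mathrm{trap} \quad (\text{if } \mathit{unop}_t(c_1) = \{\})
\end{array}
$$

4.4.1.3. $t.\mathit{binop}$

Assert: due to validation, two values of value type $t$ are on the top of the stack.
Pop the value $t.\mathrm{const}\ c_2$ from the stack.
Pop the value $t.\mathrm{const}\ c_1$ from the stack.
If $\mathit{binop}_t(c_1, c_2)$ is defined, then:
  Let $c$ be a possible result of computing $\mathit{binop}_t(c_1, c_2)$.
  Push the value $t.\mathrm{const}\ c$ to the stack.
Else:
  Trap.

$$
\begin{array}{lll}
(t.\mathrm{const}\ c_1)\ (t.\mathrm{const}\ c_2)\ t.\mathit{binop} &\hookrightarrow& (t.\mathrm{const}\ c) \quad (\text{if } c \in \mathit{binop}_t(c_1, c_2)) \\
(t.\mathrm{const}\ c_1)\ (t.\mathrm{const}\ c_2)\ t.\mathit{binop} &\hookrightarrow& \mathrm{trap} \quad (\text{if } \mathit{binop}_t(c_1, c_2) = \{\})
\end{array}
$$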
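
The interplay between a partial operator and the trapping instruction can be sketched as a tiny stack step in Python. This is only an illustration: the operand stack is a plain list, an undefined operator result is modelled as `None`, and the `Trap` exception and function names are assumptions of the example.

```python
class Trap(Exception):
    """Raised where the steps above say 'Trap'."""


def idiv_u(i1, i2):
    """Partial operator: undefined (None) for a zero divisor."""
    return None if i2 == 0 else i1 // i2


def exec_binop(stack, binop):
    c2 = stack.pop()          # the second operand is on top of the stack
    c1 = stack.pop()
    c = binop(c1, c2)
    if c is None:
        raise Trap("operator result is undefined")
    stack.append(c)
```

Note the pop order: because $c_2$ was pushed last, it is popped first.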

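4.4.1.4. $t.\mathit{testop}$

Assert: due to validation, a value of value type $t$ is on the top of the stack.
Pop the value $t.\mathrm{const}\ c_1$ from the stack.
Let $c$ be the result of computing $\mathit{testop}_t(c_1)$.
Push the value $\mathrm{i32.const}\ c$ to the stack.

$$
(t.\mathrm{const}\ c_1)\ t.\mathit{testop} \hookrightarrow (\mathrm{i32.const}\ c) \quad (\text{if } c = \mathit{testop}_t(c_1))
$$

4.4.1.5. $t.\mathit{relop}$

Assert: due to validation, two values of value type $t$ are on the top of the stack.
Pop the value $t.\mathrm{const}\ c_2$ from the stack.
Pop the value $t.\mathrm{const}\ c_1$ from the stack.
Let $c$ be the result of computing $\mathit{relop}_t(c_1, c_2)$.
Push the value $\mathrm{i32.const}\ c$ to the stack.

$$
(t.\mathrm{const}\ c_1)\ (t.\mathrm{const}\ c_2)\ t.\mathit{relop} \hookrightarrow (\mathrm{i32.const}\ c) \quad (\text{if } c = \mathit{relop}_t(c_1, c_2))
$$

4.4.1.6. $t_2.\mathit{cvtop}\_t_1\_sx^?$

Assert: due to validation, a value of value type $t_1$ is on the top of the stack.
Pop the value $t_1.\mathrm{const}\ c_1$ from the stack.
If $\mathit{cvtop}^{sx^?}_{t_1,t_2}(c_1)$ is defined:
  Let $c_2$ be a possible result of computing $\mathit{cvtop}^{sx^?}_{t_1,t_2}(c_1)$.
  Push the value $t_2.\mathrm{const}\ c_2$ to the stack.
Else:
  Trap.

$$
\begin{array}{lll}
(t_1.\mathrm{const}\ c_1)\ t_2.\mathit{cvtop}\_t_1\_sx^? &\hookrightarrow& (t_2.\mathrm{const}\ c_2) \quad (\text{if } c_2 \in \mathit{cvtop}^{sx^?}_{t_1,t_2}(c_1)) \\
(t_1.\mathrm{const}\ c_1)\ t_2.\mathit{cvtop}\_t_1\_sx^? &\hookrightarrow& \mathrm{trap} \quad (\text{if } \mathit{cvtop}^{sx^?}_{t_1,t_2}(c_1) = \{\})
\end{array}
$$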

4.4.2. Parametric Instructions

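4.4.2.1. $\mathrm{drop}$

Assert: due to validation, a value is on the top of the stack.
Pop the value $\mathit{val}$ from the stack.

$$
\mathit{val}\ \mathrm{drop} \hookrightarrow \epsilon
$$

4.4.2.2. $\mathrm{select}$

Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ c$ from the stack.
Assert: due to validation, two more values (of the same value type) are on the top of the stack.
Pop the value $\mathit{val}_2$ from the stack.
Pop the value $\mathit{val}_1$ from the stack.
If $c$ is not $0$, then:
  Push the value $\mathit{val}_1$ back to the stack.
Else:
  Push the value $\mathit{val}_2$ back to the stack.

$$
\begin{array}{lll}
\mathit{val}_1\ \mathit{val}_2\ (\mathrm{i32.const}\ c)\ \mathrm{select} &\hookrightarrow& \mathit{val}_1 \quad (\text{if } c \neq 0) \\
\mathit{val}_1\ \mathit{val}_2\ (\mathrm{i32.const}\ c)\ \mathrm{select} &\hookrightarrow& \mathit{val}_2 \quad (\text{if } c = 0)
\end{array}
$$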

4.4.3. Variable Instructions

4.4.3.1. $\mathrm{local.get}\ x$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{locals}[x]$ exists.
Let $\mathit{val}$ be the value $F.\mathrm{locals}[x]$.
Push the value $\mathit{val}$ to the stack.

$$
F; (\mathrm{local.get}\ x) \hookrightarrow F; \mathit{val} \quad (\text{if } F.\mathrm{locals}[x] = \mathit{val})
$$

4.4.3.2. $\mathrm{local.set}\ x$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{locals}[x]$ exists.
Assert: due to validation, a value is on the top of the stack.
Pop the value $\mathit{val}$ from the stack.
Replace $F.\mathrm{locals}[x]$ with the value $\mathit{val}$.

$$
F; \mathit{val}\ (\mathrm{local.set}\ x) \hookrightarrow F'; \epsilon \quad (\text{if } F' = F \text{ with } \mathrm{locals}[x] = \mathit{val})
$$

4.4.3.3. $\mathrm{local.tee}\ x$

Assert: due to validation, a value is on the top of the stack.
Pop the value $\mathit{val}$ from the stack.
Push the value $\mathit{val}$ to the stack.
Push the value $\mathit{val}$ to the stack.
Execute the instruction $(\mathrm{local.set}\ x)$.

$$
\mathit{val}\ (\mathrm{local.tee}\ x) \hookrightarrow \mathit{val}\ \mathit{val}\ (\mathrm{local.set}\ x)
$$

4.4.3.4. $\mathrm{global.get}\ x$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{globaladdrs}[x]$ exists.
Let $a$ be the global address $F.\mathrm{module}.\mathrm{globaladdrs}[x]$.
Assert: due to validation, $S.\mathrm{globals}[a]$ exists.
Let $\mathit{glob}$ be the global instance $S.\mathrm{globals}[a]$.
Let $\mathit{val}$ be the value $\mathit{glob}.\mathrm{value}$.
Push the value $\mathit{val}$ to the stack.

$$
S; F; (\mathrm{global.get}\ x) \hookrightarrow S; F; \mathit{val} \quad (\text{if } S.\mathrm{globals}[F.\mathrm{module}.\mathrm{globaladdrs}[x]].\mathrm{value} = \mathit{val})
$$

4.4.3.5. $\mathrm{global.set}\ x$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{globaladdrs}[x]$ exists.
Let $a$ be the global address $F.\mathrm{module}.\mathrm{globaladdrs}[x]$.
Assert: due to validation, $S.\mathrm{globals}[a]$ exists.
Let $\mathit{glob}$ be the global instance $S.\mathrm{globals}[a]$.
Assert: due to validation, a value is on the top of the stack.
Pop the value $\mathit{val}$ from the stack.
Replace $\mathit{glob}.\mathrm{value}$ with the value $\mathit{val}$.

$$
S; F; \mathit{val}\ (\mathrm{global.set}\ x) \hookrightarrow S'; F; \epsilon \quad (\text{if } S' = S \text{ with } \mathrm{globals}[F.\mathrm{module}.\mathrm{globaladdrs}[x]].\mathrm{value} = \mathit{val})
$$

Note

Validation ensures that the global is, in fact, marked as mutable.

4.4.4. Memory Instructions

Note

The alignment $\mathit{memarg}.\mathrm{align}$ in load and store instructions does not affect the semantics. It is an indication that the offset $\mathit{ea}$ at which the memory is accessed is intended to satisfy the property $\mathit{ea} \bmod 2^{\mathit{memarg}.\mathrm{align}} = 0$.

4.4.4.1. $t.\mathrm{load}\ \mathit{memarg}$ and $t.\mathrm{load}N\_sx\ \mathit{memarg}$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{memaddrs}[0]$ exists.
Let $a$ be the memory address $F.\mathrm{module}.\mathrm{memaddrs}[0]$.
Assert: due to validation, $S.\mathrm{mems}[a]$ exists.
Let $\mathit{mem}$ be the memory instance $S.\mathrm{mems}[a]$.
Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ i$ from the stack.
Let $\mathit{ea}$ be the integer $i + \mathit{memarg}.\mathrm{offset}$.
If $N$ is not part of the instruction, then:
  Let $N$ be the bit width $|t|$ of value type $t$.
If $\mathit{ea} + N/8$ is larger than the length of $\mathit{mem}.\mathrm{data}$, then:
  Trap.
Let $b^*$ be the byte sequence $\mathit{mem}.\mathrm{data}[\mathit{ea} : N/8]$.
If $N$ and $sx$ are part of the instruction, then:
  Let $n$ be the integer for which $\mathrm{bytes}_{iN}(n) = b^*$.
  Let $c$ be the result of computing $\mathrm{extend}\_sx_{N,|t|}(n)$.
Else:
  Let $c$ be the constant for which $\mathrm{bytes}_t(c) = b^*$.
Push the value $t.\mathrm{const}\ c$ to the stack.

4.4.4.2. $t.\mathrm{store}\ \mathit{memarg}$ and $t.\mathrm{store}N\ \mathit{memarg}$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{memaddrs}[0]$ exists.
Let $a$ be the memory address $F.\mathrm{module}.\mathrm{memaddrs}[0]$.
Assert: due to validation, $S.\mathrm{mems}[a]$ exists.
Let $\mathit{mem}$ be the memory instance $S.\mathrm{mems}[a]$.
Assert: due to validation, a value of value type $t$ is on the top of the stack.
Pop the value $t.\mathrm{const}\ c$ from the stack.
Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ i$ from the stack.
Let $\mathit{ea}$ be the integer $i + \mathit{memarg}.\mathrm{offset}$.
If $N$ is not part of the instruction, then:
  Let $N$ be the bit width $|t|$ of value type $t$.
If $\mathit{ea} + N/8$ is larger than the length of $\mathit{mem}.\mathrm{data}$, then:
  Trap.
If $N$ is part of the instruction, then:
  Let $n$ be the result of computing $\mathrm{wrap}_{|t|,N}(c)$.
  Let $b^*$ be the byte sequence $\mathrm{bytes}_{iN}(n)$.
Else:
  Let $b^*$ be the byte sequence $\mathrm{bytes}_t(c)$.
Replace the bytes $\mathit{mem}.\mathrm{data}[\mathit{ea} : N/8]$ with $b^*$.
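
The load and store steps can be sketched in Python for integer value types as follows. This is an informal illustration: memory is a `bytearray`, the effective address is passed as a base `i` plus `offset`, byte order is little-endian (as WebAssembly's byte encoding of values prescribes), and the `Trap` exception, function names, and parameter names are assumptions of the example.

```python
class Trap(Exception):
    """Raised where the steps above say 'Trap'."""


def store(mem, i, offset, c, n_bits):
    """Store the low n_bits of integer c at effective address i + offset."""
    ea = i + offset
    size = n_bits // 8
    if ea + size > len(mem):
        raise Trap("out-of-bounds memory access")
    n = c % (1 << n_bits)                 # wrap; a no-op for a full-width store
    mem[ea:ea + size] = n.to_bytes(size, "little")


def load(mem, i, offset, n_bits, t_bits, signed=False):
    """Load n_bits and extend to t_bits, with or without sign extension."""
    ea = i + offset
    size = n_bits // 8
    if ea + size > len(mem):
        raise Trap("out-of-bounds memory access")
    n = int.from_bytes(mem[ea:ea + size], "little", signed=signed)
    return n % (1 << t_bits)              # two's complement at the target width
```

The bounds check compares `ea + size` against the memory length before any byte is touched, so a partially out-of-bounds access has no effect.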

4.4.4.3. $\mathrm{memory.size}$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{memaddrs}[0]$ exists.
Let $a$ be the memory address $F.\mathrm{module}.\mathrm{memaddrs}[0]$.
Assert: due to validation, $S.\mathrm{mems}[a]$ exists.
Let $\mathit{mem}$ be the memory instance $S.\mathrm{mems}[a]$.
Let $\mathit{sz}$ be the length of $\mathit{mem}.\mathrm{data}$ divided by the page size.
Push the value $\mathrm{i32.const}\ \mathit{sz}$ to the stack.

$$
S; F; \mathrm{memory.size} \hookrightarrow S; F; (\mathrm{i32.const}\ \mathit{sz}) \quad (\text{if } |S.\mathrm{mems}[F.\mathrm{module}.\mathrm{memaddrs}[0]].\mathrm{data}| = \mathit{sz} \cdot 64\,\mathrm{Ki})
$$

4.4.4.4. $\mathrm{memory.grow}$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{memaddrs}[0]$ exists.
Let $a$ be the memory address $F.\mathrm{module}.\mathrm{memaddrs}[0]$.
Assert: due to validation, $S.\mathrm{mems}[a]$ exists.
Let $\mathit{mem}$ be the memory instance $S.\mathrm{mems}[a]$.
Let $\mathit{sz}$ be the length of $\mathit{mem}.\mathrm{data}$ divided by the page size.
Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ n$ from the stack.
Either, try growing $\mathit{mem}$ by $n$ pages:
  If it succeeds, push the value $\mathrm{i32.const}\ \mathit{sz}$ to the stack.
  Else, push the value $\mathrm{i32.const}\ (-1)$ to the stack.
Or, push the value $\mathrm{i32.const}\ (-1)$ to the stack.

Note

The $\mathrm{memory.grow}$ instruction is non-deterministic. It may either succeed, returning the old memory size $\mathit{sz}$, or fail, returning $-1$. Failure must occur if the referenced memory instance has a maximum size defined that would be exceeded. However, failure can occur in other cases as well. In practice, the choice depends on the resources available to the embedder.
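
One possible embedder policy for growing memory is sketched below in Python. It is only an example of a conforming choice: the `host_limit` parameter stands in for whatever resource limit an embedder applies and is an assumption of this sketch, as are the function and constant names. The failure value $-1$ is shown in its 32-bit two's complement encoding.

```python
PAGE_SIZE = 65536          # 64 Ki bytes per page
FAILURE = 0xFFFFFFFF       # i32 encoding of -1


def memory_grow(mem: bytearray, n: int, max_pages=None, host_limit=65536):
    """Try to grow mem by n pages; return the old size in pages or FAILURE."""
    sz = len(mem) // PAGE_SIZE
    limit = host_limit if max_pages is None else min(max_pages, host_limit)
    if sz + n > limit:
        return FAILURE                     # must fail if the maximum is exceeded
    mem.extend(bytes(n * PAGE_SIZE))       # new pages are zero-initialized here
    return sz
```

Growing by zero pages succeeds under this policy and simply reports the current size.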

4.4.5. Control Instructions

4.4.5.1. $n o p$

Do nothing.

n o p ↪ ϵ

4.4.5.2. $u n r e a c h a b l e$

Trap.

u n r e a c h a b l e ↪ t r a p

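4.4.5.3. $\mathrm{block}\ [t^?]\ \mathit{instr}^*\ \mathrm{end}$

Let $n$ be the arity $|t^?|$ of the result type $t^?$.
Let $L$ be the label whose arity is $n$ and whose continuation is the end of the block.
Enter the block $\mathit{instr}^*$ with label $L$.

$$
\mathrm{block}\ [t^n]\ \mathit{instr}^*\ \mathrm{end} \hookrightarrow \mathrm{label}_n\{\epsilon\}\ \mathit{instr}^*\ \mathrm{end}
$$

4.4.5.4. $\mathrm{loop}\ [t^?]\ \mathit{instr}^*\ \mathrm{end}$

Let $L$ be the label whose arity is $0$ and whose continuation is the start of the loop.
Enter the block $\mathit{instr}^*$ with label $L$.

$$
\mathrm{loop}\ [t^?]\ \mathit{instr}^*\ \mathrm{end} \hookrightarrow \mathrm{label}_0\{\mathrm{loop}\ [t^?]\ \mathit{instr}^*\ \mathrm{end}\}\ \mathit{instr}^*\ \mathrm{end}
$$

4.4.5.5. $\mathrm{if}\ [t^?]\ \mathit{instr}_1^*\ \mathrm{else}\ \mathit{instr}_2^*\ \mathrm{end}$

Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ c$ from the stack.
Let $n$ be the arity $|t^?|$ of the result type $t^?$.
Let $L$ be the label whose arity is $n$ and whose continuation is the end of the $\mathrm{if}$ instruction.
If $c$ is non-zero, then:
  Enter the block $\mathit{instr}_1^*$ with label $L$.
Else:
  Enter the block $\mathit{instr}_2^*$ with label $L$.

$$
\begin{array}{lll}
(\mathrm{i32.const}\ c)\ \mathrm{if}\ [t^n]\ \mathit{instr}_1^*\ \mathrm{else}\ \mathit{instr}_2^*\ \mathrm{end} &\hookrightarrow& \mathrm{label}_n\{\epsilon\}\ \mathit{instr}_1^*\ \mathrm{end} \quad (\text{if } c \neq 0) \\
(\mathrm{i32.const}\ c)\ \mathrm{if}\ [t^n]\ \mathit{instr}_1^*\ \mathrm{else}\ \mathit{instr}_2^*\ \mathrm{end} &\hookrightarrow& \mathrm{label}_n\{\epsilon\}\ \mathit{instr}_2^*\ \mathrm{end} \quad (\text{if } c = 0)
\end{array}
$$

4.4.5.6. $\mathrm{br}\ l$

Assert: due to validation, the stack contains at least $l + 1$ labels.
Let $L$ be the $l$-th label appearing on the stack, starting from the top and counting from zero.
Let $n$ be the arity of $L$.
Assert: due to validation, there are at least $n$ values on the top of the stack.
Pop the values $\mathit{val}^n$ from the stack.
Repeat $l + 1$ times:
  While the top of the stack is a value, do:
    Pop the value from the stack.
  Assert: due to validation, the top of the stack now is a label.
  Pop the label from the stack.
Push the values $\mathit{val}^n$ to the stack.
Jump to the continuation of $L$.

$$
\mathrm{label}_n\{\mathit{instr}^*\}\ B^l[\mathit{val}^n\ (\mathrm{br}\ l)]\ \mathrm{end} \hookrightarrow \mathit{val}^n\ \mathit{instr}^*
$$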
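
The stack manipulation performed by a branch can be sketched in Python as follows. This is a minimal illustration in which the stack is a list holding plain values and `Label` markers; the `Label` class and the function name `br` are assumptions of the example, and the jump to the continuation is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Label:
    arity: int


def br(stack, l):
    """Unwind the stack for a branch to the l-th enclosing label."""
    labels = [x for x in reversed(stack) if isinstance(x, Label)]
    target = labels[l]                       # l-th label from the top, from 0
    n = target.arity
    vals = stack[len(stack) - n:]            # the n result values on top
    del stack[len(stack) - n:]
    for _ in range(l + 1):
        while not isinstance(stack[-1], Label):
            stack.pop()                      # discard intermediate values
        stack.pop()                          # discard the label itself
    stack.extend(vals)
    return target                            # caller jumps to its continuation
```

Starting from `[Label(0), 1, Label(1), 2, 3]`, a branch with `l = 0` leaves `[Label(0), 1, 3]`: the inner label and the extra value `2` are removed and the single result `3` is kept.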

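4.4.5.7. $\mathrm{br\_if}\ l$

Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ c$ from the stack.
If $c$ is non-zero, then:
  Execute the instruction $(\mathrm{br}\ l)$.
Else:
  Do nothing.

$$
\begin{array}{lll}
(\mathrm{i32.const}\ c)\ (\mathrm{br\_if}\ l) &\hookrightarrow& (\mathrm{br}\ l) \quad (\text{if } c \neq 0) \\
(\mathrm{i32.const}\ c)\ (\mathrm{br\_if}\ l) &\hookrightarrow& \epsilon \quad (\text{if } c = 0)
\end{array}
$$

4.4.5.8. $\mathrm{br\_table}\ l^*\ l_N$

Assert: due to validation, a value of value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ i$ from the stack.
If $i$ is smaller than the length of $l^*$, then:
  Let $l_i$ be the label $l^*[i]$.
  Execute the instruction $(\mathrm{br}\ l_i)$.
Else:
  Execute the instruction $(\mathrm{br}\ l_N)$.

$$
\begin{array}{lll}
(\mathrm{i32.const}\ i)\ (\mathrm{br\_table}\ l^*\ l_N) &\hookrightarrow& (\mathrm{br}\ l_i) \quad (\text{if } l^*[i] = l_i) \\
(\mathrm{i32.const}\ i)\ (\mathrm{br\_table}\ l^*\ l_N) &\hookrightarrow& (\mathrm{br}\ l_N) \quad (\text{if } |l^*| \leq i)
\end{array}
$$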
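
The target selection of a table branch amounts to an indexed lookup with a default. A short Python sketch, with names chosen for this example and `i` taken as an unsigned integer:

```python
def br_table_target(labels, default, i):
    """Return the label index to branch to for operand i."""
    return labels[i] if i < len(labels) else default
```

Any out-of-range operand, however large, selects the default label rather than trapping.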

4.4.5.9. $\mathrm{return}$

Let $F$ be the current frame.
Let $n$ be the arity of $F$.
Assert: due to validation, there are at least $n$ values on the top of the stack.
Pop the results $\mathit{val}^n$ from the stack.
Assert: due to validation, the stack contains at least one frame.
While the top of the stack is not a frame, do:
  Pop the top element from the stack.
Assert: the top of the stack is the frame $F$.
Pop the frame from the stack.
Push $\mathit{val}^n$ to the stack.
Jump to the instruction after the original call that pushed the frame.

$$
\mathrm{frame}_n\{F\}\ B^k[\mathit{val}^n\ \mathrm{return}]\ \mathrm{end} \hookrightarrow \mathit{val}^n
$$

4.4.5.10. $\mathrm{call}\ x$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{funcaddrs}[x]$ exists.
Let $a$ be the function address $F.\mathrm{module}.\mathrm{funcaddrs}[x]$.
Invoke the function instance at address $a$.

$$
F; (\mathrm{call}\ x) \hookrightarrow F; (\mathrm{invoke}\ a) \quad (\text{if } F.\mathrm{module}.\mathrm{funcaddrs}[x] = a)
$$

4.4.5.11. $\mathrm{call\_indirect}\ x$

Let $F$ be the current frame.
Assert: due to validation, $F.\mathrm{module}.\mathrm{tableaddrs}[0]$ exists.
Let $\mathit{ta}$ be the table address $F.\mathrm{module}.\mathrm{tableaddrs}[0]$.
Assert: due to validation, $S.\mathrm{tables}[\mathit{ta}]$ exists.
Let $\mathit{tab}$ be the table instance $S.\mathrm{tables}[\mathit{ta}]$.
Assert: due to validation, $F.\mathrm{module}.\mathrm{types}[x]$ exists.
Let $\mathit{ft}_{\mathrm{expect}}$ be the function type $F.\mathrm{module}.\mathrm{types}[x]$.
Assert: due to validation, a value with value type $\mathrm{i32}$ is on the top of the stack.
Pop the value $\mathrm{i32.const}\ i$ from the stack.
If $i$ is not smaller than the length of $\mathit{tab}.\mathrm{elem}$, then:
  Trap.
If $\mathit{tab}.\mathrm{elem}[i]$ is uninitialized, then:
  Trap.
Let $a$ be the function address $\mathit{tab}.\mathrm{elem}[i]$.
Assert: due to validation, $S.\mathrm{funcs}[a]$ exists.
Let $f$ be the function instance $S.\mathrm{funcs}[a]$.
Let $\mathit{ft}_{\mathrm{actual}}$ be the function type $f.\mathrm{type}$.
If $\mathit{ft}_{\mathrm{actual}}$ and $\mathit{ft}_{\mathrm{expect}}$ differ, then:
  Trap.
Invoke the function instance at address $a$.

$$
\begin{array}{lll}
S; F; (\mathrm{i32.const}\ i)\ (\mathrm{call\_indirect}\ x) &\hookrightarrow& S; F; (\mathrm{invoke}\ a) \quad (\text{if } S.\mathrm{tables}[F.\mathrm{module}.\mathrm{tableaddrs}[0]].\mathrm{elem}[i] = a \land S.\mathrm{funcs}[a] = f \land F.\mathrm{module}.\mathrm{types}[x] = f.\mathrm{type}) \\
S; F; (\mathrm{i32.const}\ i)\ (\mathrm{call\_indirect}\ x) &\hookrightarrow& S; F; \mathrm{trap} \quad (\text{otherwise})
\end{array}
$$

4.4.6. Blocks

The following auxiliary rules define the semantics of executing an instruction sequence that forms a block.

4.4.6.1. Entering $\mathit{instr}^*$ with label $L$

Push $L$ to the stack.
Jump to the start of the instruction sequence $\mathit{instr}^*$.

Note

No formal reduction rule is needed for entering an instruction sequence, because the label $L$ is embedded in the administrative instruction that structured control instructions reduce to directly.

4.4.6.2. Exiting $\mathit{instr}^*$ with label $L$

When the end of a block is reached without a jump or trap aborting it, then the following steps are performed.

Let $m$ be the number of values on the top of the stack.
Pop the values $\mathit{val}^m$ from the stack.
Assert: due to validation, the label $L$ is now on the top of the stack.
Pop the label from the stack.
Push $\mathit{val}^m$ back to the stack.
Jump to the position after the $\mathrm{end}$ of the structured control instruction associated with the label $L$.

$$
\mathrm{label}_n\{\mathit{instr}^*\}\ \mathit{val}^m\ \mathrm{end} \hookrightarrow \mathit{val}^m
$$

Note

This semantics also applies to the instruction sequence contained in a $\mathrm{loop}$ instruction. Therefore, execution of a loop falls off the end, unless a backwards branch is performed explicitly.

4.4.7. Function Calls

The following auxiliary rules define the semantics of invoking a function instance through one of the call instructions and returning from it.

4.4.7.1. Invocation of function address $a$

Assert: due to validation, $S.\mathrm{funcs}[a]$ exists.
Let $f$ be the function instance $S.\mathrm{funcs}[a]$.
Let $[t_1^n] \to [t_2^m]$ be the function type $f.\mathrm{type}$.
Assert: due to validation, $m \leq 1$.
Let $t^*$ be the list of value types $f.\mathrm{code}.\mathrm{locals}$.
Let $\mathit{instr}^*\ \mathrm{end}$ be the expression $f.\mathrm{code}.\mathrm{body}$.
Assert: due to validation, $n$ values are on the top of the stack.
Pop the values $\mathit{val}^n$ from the stack.
Let $\mathit{val}_0^*$ be the list of zero values of types $t^*$.
Let $F$ be the frame $\{\mathrm{module}\ f.\mathrm{module},\ \mathrm{locals}\ \mathit{val}^n\ \mathit{val}_0^*\}$.
Push the activation of $F$ with arity $m$ to the stack.
Execute the instruction $\mathrm{block}\ [t_2^m]\ \mathit{instr}^*\ \mathrm{end}$.

4.4.7.2. Returning from a function

When the end of a function is reached without a jump (i.e., $\mathrm{return}$) or trap aborting it, then the following steps are performed.

Let $F$ be the current frame.
Let $n$ be the arity of the activation of $F$.
Assert: due to validation, there are $n$ values on the top of the stack.
Pop the results $\mathit{val}^n$ from the stack.
Assert: due to validation, the frame $F$ is now on the top of the stack.
Pop the frame from the stack.
Push $\mathit{val}^n$ back to the stack.
Jump to the instruction after the original call.

$$
\mathrm{frame}_n\{F\}\ \mathit{val}^n\ \mathrm{end} \hookrightarrow \mathit{val}^n
$$

4.4.7.3. Host Functions

Invoking a host function has non-deterministic behavior. It may either terminate with a trap or return regularly. However, in the latter case, it must consume and produce the right number and types of WebAssembly values on the stack, according to its function type.

A host function may also modify the store. However, all store modifications must result in an extension of the original store, i.e., they must only modify mutable contents and must not have instances removed. Furthermore, the resulting store must be valid, i.e., all data and code in it is well-typed.

Here, $\mathit{hf}(S; \mathit{val}^n)$ denotes the implementation-defined execution of host function $\mathit{hf}$ in current store $S$ with arguments $\mathit{val}^n$.

For a WebAssembly implementation to be sound in the presence of host functions, every host function instance must be valid, which means that it adheres to suitable pre- and post-conditions: under a valid store $S$, and given arguments $\mathit{val}^n$ matching the ascribed parameter types $t_1^n$, executing the host function must yield a non-empty set of possible outcomes each of which is either divergence or consists of a valid store $S'$ that is an extension of $S$ and a result matching the ascribed return types $t_2^m$.

Note

A host function can call back into WebAssembly by invoking a function exported from a module. However, the effects of any such call are subsumed by the non-deterministic behavior allowed for the host function.

4.4.8. Expressions

An expression is evaluated relative to a current frame pointing to its containing module instance.

Jump to the start of the instruction sequence $\mathit{instr}^*$ of the expression.
Execute the instruction sequence.
Assert: due to validation, the top of the stack contains a value.
Pop the value $\mathit{val}$ from the stack.

The value $\mathit{val}$ is the result of the evaluation.

$$
S; F; \mathit{instr}^* \hookrightarrow S'; F'; \mathit{instr}'^* \quad (\text{if } S; F; \mathit{instr}^*\ \mathrm{end} \hookrightarrow S'; F'; \mathit{instr}'^*\ \mathrm{end})
$$

Note

Evaluation iterates this reduction rule until reaching a value. Expressions constituting function bodies are executed during function invocation.

4.5. Modules

For modules, the execution semantics primarily defines instantiation, which allocates instances for a module and its contained definitions, initializes tables and memories from contained element and data segments, and invokes the start function if present. It also includes invocation of exported functions.

Instantiation depends on a number of auxiliary notions for type-checking imports and allocating instances.

4.5.1. External Typing

For the purpose of checking external values against imports, such values are classified by external types. The following auxiliary typing rules specify this typing relation relative to a store $S$ in which the referenced instances live.

4.5.1.1. $\mathrm{func}\ a$

The store entry $S.\mathrm{funcs}[a]$ must be a function instance $\{\mathrm{type}\ \mathit{functype}, \dots\}$.
Then $\mathrm{func}\ a$ is valid with external type $\mathrm{func}\ \mathit{functype}$.

$$
\frac{S.\mathrm{funcs}[a] = \{\mathrm{type}\ \mathit{functype}, \dots\}}{S \vdash \mathrm{func}\ a : \mathrm{func}\ \mathit{functype}}
$$

4.5.1.2. $\mathrm{table}\ a$

The store entry $S.\mathrm{tables}[a]$ must be a table instance $\{\mathrm{elem}\ (\mathit{fa}^?)^n,\ \mathrm{max}\ m^?\}$.
Then $\mathrm{table}\ a$ is valid with external type $\mathrm{table}\ (\{\mathrm{min}\ n,\ \mathrm{max}\ m^?\}\ \mathrm{funcref})$.

$$
\frac{S.\mathrm{tables}[a] = \{\mathrm{elem}\ (\mathit{fa}^?)^n,\ \mathrm{max}\ m^?\}}{S \vdash \mathrm{table}\ a : \mathrm{table}\ (\{\mathrm{min}\ n,\ \mathrm{max}\ m^?\}\ \mathrm{funcref})}
$$

4.5.1.3. $\mathrm{mem}\ a$

The store entry $S.\mathrm{mems}[a]$ must be a memory instance $\{\mathrm{data}\ b^{n \cdot 64\,\mathrm{Ki}},\ \mathrm{max}\ m^?\}$.
Then $\mathrm{mem}\ a$ is valid with external type $\mathrm{mem}\ (\{\mathrm{min}\ n,\ \mathrm{max}\ m^?\})$.

$$
\frac{S.\mathrm{mems}[a] = \{\mathrm{data}\ b^{n \cdot 64\,\mathrm{Ki}},\ \mathrm{max}\ m^?\}}{S \vdash \mathrm{mem}\ a : \mathrm{mem}\ \{\mathrm{min}\ n,\ \mathrm{max}\ m^?\}}
$$

4.5.1.4. $\mathrm{global}\ a$

The store entry $S.\mathrm{globals}[a]$ must be a global instance $\{\mathrm{value}\ (t.\mathrm{const}\ c),\ \mathrm{mut}\ \mathit{mut}\}$.
Then $\mathrm{global}\ a$ is valid with external type $\mathrm{global}\ (\mathit{mut}\ t)$.

$$
\frac{S.\mathrm{globals}[a] = \{\mathrm{value}\ (t.\mathrm{const}\ c),\ \mathrm{mut}\ \mathit{mut}\}}{S \vdash \mathrm{global}\ a : \mathrm{global}\ (\mathit{mut}\ t)}
$$

4.5.2. Import Matching

When instantiating a module, external values must be provided whose types are matched against the respective external types classifying each import. In some cases, this allows for a simple form of subtyping, as defined below.

4.5.2.1. Limits

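Limits $\{\mathsf{min}~n_1, \mathsf{max}~m_1^?\}$ match limits $\{\mathsf{min}~n_2, \mathsf{max}~m_2^?\}$ if and only if:

$n_1$ is larger than or equal to $n_2$.
Either:
- $m_2^?$ is empty.
Or:
- Both $m_1^?$ and $m_2^?$ are non-empty.
- $m_1$ is smaller than or equal to $m_2$.

$$\frac{n_1 \geq n_2}{\vdash \{\mathsf{min}~n_1, \mathsf{max}~m_1^?\} \leq \{\mathsf{min}~n_2, \mathsf{max}~\epsilon\}} \qquad \frac{n_1 \geq n_2 \qquad m_1 \leq m_2}{\vdash \{\mathsf{min}~n_1, \mathsf{max}~m_1\} \leq \{\mathsf{min}~n_2, \mathsf{max}~m_2\}}$$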
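
As an illustration only, the limits-matching relation can be sketched in Python. The sketch reads the rule as: the provided minimum must be at least the required minimum, and either the required maximum is absent or both maxima are present with the provided one no larger. The names `Limits` and `limits_match` are hypothetical and not part of the specification.

```python
from typing import NamedTuple, Optional


class Limits(NamedTuple):
    """Hypothetical model of abstract limits {min n, max m?}; max is None when absent."""
    min: int
    max: Optional[int] = None


def limits_match(provided: Limits, required: Limits) -> bool:
    """Return True if `provided` matches `required` under the reading described above."""
    if provided.min < required.min:
        return False
    if required.max is None:
        # Any (or no) provided maximum is acceptable.
        return True
    # Both maxima must be present and the provided one must not exceed the required one.
    return provided.max is not None and provided.max <= required.max
```

For instance, under this reading a memory with limits `{min 2, max 4}` can satisfy an import declared as `{min 1}` or `{min 1, max 8}`, whereas one without a maximum cannot satisfy `{min 1, max 8}`.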

4.5.2.2. Functions

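An external type $\mathsf{func}~\mathit{functype}_1$ matches $\mathsf{func}~\mathit{functype}_2$ if and only if:

Both $\mathit{functype}_1$ and $\mathit{functype}_2$ are the same.

$$\frac{}{\vdash \mathsf{func}~\mathit{functype} \leq \mathsf{func}~\mathit{functype}}$$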

4.5.2.4. Memories

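An external type $\mathsf{mem}~\mathit{limits}_1$ matches $\mathsf{mem}~\mathit{limits}_2$ if and only if:

Limits $\mathit{limits}_1$ match $\mathit{limits}_2$.

$$\frac{\vdash \mathit{limits}_1 \leq \mathit{limits}_2}{\vdash \mathsf{mem}~\mathit{limits}_1 \leq \mathsf{mem}~\mathit{limits}_2}$$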

4.5.2.5. Globals

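An external type $\mathsf{global}~\mathit{globaltype}_1$ matches $\mathsf{global}~\mathit{globaltype}_2$ if and only if:

Both $\mathit{globaltype}_1$ and $\mathit{globaltype}_2$ are the same.

$$\frac{}{\vdash \mathsf{global}~\mathit{globaltype} \leq \mathsf{global}~\mathit{globaltype}}$$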

4.5.3. Allocation

New instances of functions, tables, memories, and globals are allocated in a store $S$, as defined by the following auxiliary functions.

4.5.3.1. Functions

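Let $\mathit{func}$ be the function to allocate and $\mathit{moduleinst}$ its module instance.
Let $a$ be the first free function address in $S$.
Let $\mathit{functype}$ be the function type $\mathit{moduleinst}.\mathsf{types}[\mathit{func}.\mathsf{type}]$.
Let $\mathit{funcinst}$ be the function instance $\{\mathsf{type}~\mathit{functype}, \mathsf{module}~\mathit{moduleinst}, \mathsf{code}~\mathit{func}\}$.
Append $\mathit{funcinst}$ to the $\mathsf{funcs}$ of $S$.
Return $a$.

$$\begin{array}{rcl}
\mathrm{allocfunc}(S, \mathit{func}, \mathit{moduleinst}) &=& S', \mathit{funcaddr} \\
\text{where:} \quad \mathit{funcaddr} &=& |S.\mathsf{funcs}| \\
\mathit{functype} &=& \mathit{moduleinst}.\mathsf{types}[\mathit{func}.\mathsf{type}] \\
\mathit{funcinst} &=& \{\mathsf{type}~\mathit{functype}, \mathsf{module}~\mathit{moduleinst}, \mathsf{code}~\mathit{func}\} \\
S' &=& S \oplus \{\mathsf{funcs}~\mathit{funcinst}\}
\end{array}$$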

4.5.3.2. Host Functions

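Let $\mathit{hostfunc}$ be the host function to allocate and $\mathit{functype}$ its function type.
Let $a$ be the first free function address in $S$.
Let $\mathit{funcinst}$ be the function instance $\{\mathsf{type}~\mathit{functype}, \mathsf{hostcode}~\mathit{hostfunc}\}$.
Append $\mathit{funcinst}$ to the $\mathsf{funcs}$ of $S$.
Return $a$.

$$\begin{array}{rcl}
\mathrm{allochostfunc}(S, \mathit{functype}, \mathit{hostfunc}) &=& S', \mathit{funcaddr} \\
\text{where:} \quad \mathit{funcaddr} &=& |S.\mathsf{funcs}| \\
\mathit{funcinst} &=& \{\mathsf{type}~\mathit{functype}, \mathsf{hostcode}~\mathit{hostfunc}\} \\
S' &=& S \oplus \{\mathsf{funcs}~\mathit{funcinst}\}
\end{array}$$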

Note

Host functions are never allocated by the WebAssembly semantics itself, but may be allocated by the embedder.

4.5.3.3. Tables

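Let $\mathit{tabletype}$ be the table type to allocate.
Let $(\{\mathsf{min}~n, \mathsf{max}~m^?\}~\mathit{elemtype})$ be the structure of table type $\mathit{tabletype}$.
Let $a$ be the first free table address in $S$.
Let $\mathit{tableinst}$ be the table instance $\{\mathsf{elem}~(\epsilon)^n, \mathsf{max}~m^?\}$ with $n$ empty elements.
Append $\mathit{tableinst}$ to the $\mathsf{tables}$ of $S$.
Return $a$.

$$\begin{array}{rcl}
\mathrm{alloctable}(S, \mathit{tabletype}) &=& S', \mathit{tableaddr} \\
\text{where:} \quad \mathit{tabletype} &=& \{\mathsf{min}~n, \mathsf{max}~m^?\}~\mathit{elemtype} \\
\mathit{tableaddr} &=& |S.\mathsf{tables}| \\
\mathit{tableinst} &=& \{\mathsf{elem}~(\epsilon)^n, \mathsf{max}~m^?\} \\
S' &=& S \oplus \{\mathsf{tables}~\mathit{tableinst}\}
\end{array}$$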

4.5.3.4. Memories

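Let $\mathit{memtype}$ be the memory type to allocate.
Let $\{\mathsf{min}~n, \mathsf{max}~m^?\}$ be the structure of memory type $\mathit{memtype}$.
Let $a$ be the first free memory address in $S$.
Let $\mathit{meminst}$ be the memory instance $\{\mathsf{data}~(\mathtt{0x00})^{n \cdot 64\,\mathrm{Ki}}, \mathsf{max}~m^?\}$ that contains $n$ pages of zeroed bytes.
Append $\mathit{meminst}$ to the $\mathsf{mems}$ of $S$.
Return $a$.

$$\begin{array}{rcl}
\mathrm{allocmem}(S, \mathit{memtype}) &=& S', \mathit{memaddr} \\
\text{where:} \quad \mathit{memtype} &=& \{\mathsf{min}~n, \mathsf{max}~m^?\} \\
\mathit{memaddr} &=& |S.\mathsf{mems}| \\
\mathit{meminst} &=& \{\mathsf{data}~(\mathtt{0x00})^{n \cdot 64\,\mathrm{Ki}}, \mathsf{max}~m^?\} \\
S' &=& S \oplus \{\mathsf{mems}~\mathit{meminst}\}
\end{array}$$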

4.5.3.5. Globals

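Let $\mathit{globaltype}$ be the global type to allocate and $\mathit{val}$ the value to initialize the global with.
Let $\mathit{mut}~t$ be the structure of global type $\mathit{globaltype}$.
Let $a$ be the first free global address in $S$.
Let $\mathit{globalinst}$ be the global instance $\{\mathsf{value}~\mathit{val}, \mathsf{mut}~\mathit{mut}\}$.
Append $\mathit{globalinst}$ to the $\mathsf{globals}$ of $S$.
Return $a$.

$$\begin{array}{rcl}
\mathrm{allocglobal}(S, \mathit{globaltype}, \mathit{val}) &=& S', \mathit{globaladdr} \\
\text{where:} \quad \mathit{globaltype} &=& \mathit{mut}~t \\
\mathit{globaladdr} &=& |S.\mathsf{globals}| \\
\mathit{globalinst} &=& \{\mathsf{value}~\mathit{val}, \mathsf{mut}~\mathit{mut}\} \\
S' &=& S \oplus \{\mathsf{globals}~\mathit{globalinst}\}
\end{array}$$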
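
The allocation functions for tables, memories, and globals share one pattern: the new address is the current length of the corresponding store component, and the new instance is appended there. The following Python sketch illustrates that pattern only. The dictionary-based store layout and all function names are assumptions made for this example, and unlike the specification's functional style (which returns an extended store) the sketch mutates the store in place.

```python
from typing import Any, Dict, List, Optional

PAGE_SIZE = 64 * 1024  # 64 Ki bytes per memory page


def new_store() -> Dict[str, List[Any]]:
    """Create an empty store with one instance list per kind (hypothetical layout)."""
    return {"funcs": [], "tables": [], "mems": [], "globals": []}


def alloc(store: Dict[str, List[Any]], kind: str, inst: Any) -> int:
    """Append `inst` to store[kind] and return its address."""
    addr = len(store[kind])   # first free address, i.e. the current length of the component
    store[kind].append(inst)  # extend the store with the new instance
    return addr


def alloc_table(store, n: int, m: Optional[int] = None) -> int:
    # n empty elements (modelled as None) and an optional maximum m
    return alloc(store, "tables", {"elem": [None] * n, "max": m})


def alloc_mem(store, n: int, m: Optional[int] = None) -> int:
    # n pages of zeroed bytes and an optional maximum m
    return alloc(store, "mems", {"data": bytearray(n * PAGE_SIZE), "max": m})


def alloc_global(store, mut: str, val: Any) -> int:
    # val carries its own value type, e.g. ("i32", 7)
    return alloc(store, "globals", {"value": val, "mut": mut})
```

Function allocation follows the same shape, with the instance additionally recording the function type and the owning module instance.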

4.5.3.6. Growing tables

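Let $\mathit{tableinst}$ be the table instance to grow and $n$ the number of elements by which to grow it.
Let $\mathit{len}$ be $n$ added to the length of $\mathit{tableinst}.\mathsf{elem}$.
If $\mathit{len}$ is larger than or equal to $2^{32}$, then fail.
If $\mathit{tableinst}.\mathsf{max}$ is not empty and its value is smaller than $\mathit{len}$, then fail.
Append $n$ empty elements to $\mathit{tableinst}.\mathsf{elem}$.

$$\begin{array}{l}
\mathrm{growtable}(\mathit{tableinst}, n) = \mathit{tableinst}~\text{with}~\mathsf{elem} = \mathit{tableinst}.\mathsf{elem}~(\epsilon)^n \\
\quad (\text{if } \mathit{len} = n + |\mathit{tableinst}.\mathsf{elem}| \\
\quad \land~ \mathit{len} < 2^{32} \\
\quad \land~ (\mathit{tableinst}.\mathsf{max} = \epsilon \lor \mathit{len} \leq \mathit{tableinst}.\mathsf{max}))
\end{array}$$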
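
The table growth steps can be sketched as follows. `TableInst` is a hypothetical stand-in for a table instance in which `None` models an empty element, and failure is reported by returning `False` rather than by any mechanism the specification prescribes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TableInst:
    """Hypothetical table instance: element list plus optional maximum size."""
    elem: List[Optional[int]] = field(default_factory=list)
    max: Optional[int] = None


def grow_table(tableinst: TableInst, n: int) -> bool:
    """Grow `tableinst` by n empty elements; return False (and change nothing) on failure."""
    length = n + len(tableinst.elem)
    if length >= 2**32:
        return False
    if tableinst.max is not None and tableinst.max < length:
        return False
    tableinst.elem.extend([None] * n)
    return True
```

Both checks run before the element list is touched, so a failed call leaves the instance unchanged.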

4.5.3.7. Growing memories

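Let $\mathit{meminst}$ be the memory instance to grow and $n$ the number of pages by which to grow it.
Assert: The length of $\mathit{meminst}.\mathsf{data}$ is divisible by the page size $64\,\mathrm{Ki}$.
Let $\mathit{len}$ be $n$ added to the length of $\mathit{meminst}.\mathsf{data}$ divided by the page size $64\,\mathrm{Ki}$.
If $\mathit{len}$ is larger than $2^{16}$, then fail.
If $\mathit{meminst}.\mathsf{max}$ is not empty and its value is smaller than $\mathit{len}$, then fail.
Append $n$ times $64\,\mathrm{Ki}$ bytes with value $\mathtt{0x00}$ to $\mathit{meminst}.\mathsf{data}$.

$$\begin{array}{l}
\mathrm{growmem}(\mathit{meminst}, n) = \mathit{meminst}~\text{with}~\mathsf{data} = \mathit{meminst}.\mathsf{data}~(\mathtt{0x00})^{n \cdot 64\,\mathrm{Ki}} \\
\quad (\text{if } \mathit{len} = n + |\mathit{meminst}.\mathsf{data}| / 64\,\mathrm{Ki} \\
\quad \land~ \mathit{len} \leq 2^{16} \\
\quad \land~ (\mathit{meminst}.\mathsf{max} = \epsilon \lor \mathit{len} \leq \mathit{meminst}.\mathsf{max}))
\end{array}$$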
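
A corresponding sketch for memory growth is given below. `MemInst` is a hypothetical class, lengths are counted in pages of 64 Ki bytes, and failure is again signalled by returning `False`.

```python
from typing import Optional

PAGE_SIZE = 64 * 1024  # 64 Ki bytes


class MemInst:
    """Hypothetical memory instance: zero-initialised byte array plus optional maximum (in pages)."""

    def __init__(self, pages: int, max_pages: Optional[int] = None) -> None:
        self.data = bytearray(pages * PAGE_SIZE)
        self.max = max_pages


def grow_mem(meminst: MemInst, n: int) -> bool:
    """Grow `meminst` by n pages; return False (and change nothing) on failure."""
    assert len(meminst.data) % PAGE_SIZE == 0
    length = n + len(meminst.data) // PAGE_SIZE
    if length > 2**16:
        return False
    if meminst.max is not None and meminst.max < length:
        return False
    meminst.data.extend(bytes(n * PAGE_SIZE))  # n pages of 0x00 bytes
    return True
```

Note the asymmetry with tables: a memory may reach exactly $2^{16}$ pages, while a table length must stay strictly below $2^{32}$.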

4.5.3.8. Modules

The allocation function for modules requires a suitable list of external values that are assumed to match the import vector of the module, and a list of initialization values for the module’s globals.

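Let $\mathit{module}$ be the module to allocate and $\mathit{externval}_{\mathit{im}}^\ast$ the vector of external values providing the module’s imports, and $\mathit{val}^\ast$ the initialization values of the module’s globals.
For each function $\mathit{func}_i$ in $\mathit{module}.\mathsf{funcs}$, do: Let $\mathit{funcaddr}_i$ be the function address resulting from allocating $\mathit{func}_i$ for the module instance $\mathit{moduleinst}$ defined below.
For each table $\mathit{table}_i$ in $\mathit{module}.\mathsf{tables}$, do: Let $\mathit{tableaddr}_i$ be the table address resulting from allocating $\mathit{table}_i.\mathsf{type}$.
For each memory $\mathit{mem}_i$ in $\mathit{module}.\mathsf{mems}$, do: Let $\mathit{memaddr}_i$ be the memory address resulting from allocating $\mathit{mem}_i.\mathsf{type}$.
For each global $\mathit{global}_i$ in $\mathit{module}.\mathsf{globals}$, do: Let $\mathit{globaladdr}_i$ be the global address resulting from allocating $\mathit{global}_i.\mathsf{type}$ with initializer value $\mathit{val}^\ast[i]$.
Let $\mathit{funcaddr}^\ast$ be the concatenation of the function addresses $\mathit{funcaddr}_i$ in index order.
Let $\mathit{tableaddr}^\ast$ be the concatenation of the table addresses $\mathit{tableaddr}_i$ in index order.
Let $\mathit{memaddr}^\ast$ be the concatenation of the memory addresses $\mathit{memaddr}_i$ in index order.
Let $\mathit{globaladdr}^\ast$ be the concatenation of the global addresses $\mathit{globaladdr}_i$ in index order.
Let $\mathit{funcaddr}_{\mathit{mod}}^\ast$ be the list of function addresses extracted from $\mathit{externval}_{\mathit{im}}^\ast$, concatenated with $\mathit{funcaddr}^\ast$.
Let $\mathit{tableaddr}_{\mathit{mod}}^\ast$ be the list of table addresses extracted from $\mathit{externval}_{\mathit{im}}^\ast$, concatenated with $\mathit{tableaddr}^\ast$.
Let $\mathit{memaddr}_{\mathit{mod}}^\ast$ be the list of memory addresses extracted from $\mathit{externval}_{\mathit{im}}^\ast$, concatenated with $\mathit{memaddr}^\ast$.
Let $\mathit{globaladdr}_{\mathit{mod}}^\ast$ be the list of global addresses extracted from $\mathit{externval}_{\mathit{im}}^\ast$, concatenated with $\mathit{globaladdr}^\ast$.
For each export $\mathit{export}_i$ in $\mathit{module}.\mathsf{exports}$, do:
1. If $\mathit{export}_i$ is a function export for function index $x$, then let $\mathit{externval}_i$ be the external value $\mathsf{func}~(\mathit{funcaddr}_{\mathit{mod}}^\ast[x])$.
2. Else, if $\mathit{export}_i$ is a table export for table index $x$, then let $\mathit{externval}_i$ be the external value $\mathsf{table}~(\mathit{tableaddr}_{\mathit{mod}}^\ast[x])$.
3. Else, if $\mathit{export}_i$ is a memory export for memory index $x$, then let $\mathit{externval}_i$ be the external value $\mathsf{mem}~(\mathit{memaddr}_{\mathit{mod}}^\ast[x])$.
4. Else, if $\mathit{export}_i$ is a global export for global index $x$, then let $\mathit{externval}_i$ be the external value $\mathsf{global}~(\mathit{globaladdr}_{\mathit{mod}}^\ast[x])$.
5. Let $\mathit{exportinst}_i$ be the export instance $\{\mathsf{name}~(\mathit{export}_i.\mathsf{name}), \mathsf{value}~\mathit{externval}_i\}$.
Let $\mathit{exportinst}^\ast$ be the concatenation of the export instances $\mathit{exportinst}_i$ in index order.
Let $\mathit{moduleinst}$ be the module instance $\{\mathsf{types}~(\mathit{module}.\mathsf{types}), \mathsf{funcaddrs}~\mathit{funcaddr}_{\mathit{mod}}^\ast, \mathsf{tableaddrs}~\mathit{tableaddr}_{\mathit{mod}}^\ast, \mathsf{memaddrs}~\mathit{memaddr}_{\mathit{mod}}^\ast, \mathsf{globaladdrs}~\mathit{globaladdr}_{\mathit{mod}}^\ast, \mathsf{exports}~\mathit{exportinst}^\ast\}$.
Return $\mathit{moduleinst}$.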

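$$\mathrm{allocmodule}(S, \mathit{module}, \mathit{externval}_{\mathit{im}}^\ast, \mathit{val}^\ast) = S', \mathit{moduleinst}$$

where:

$$\begin{array}{rcl}
\mathit{moduleinst} &=& \{\mathsf{types}~\mathit{module}.\mathsf{types}, \\
&& \mathsf{funcaddrs}~\mathrm{funcs}(\mathit{externval}_{\mathit{im}}^\ast)~\mathit{funcaddr}^\ast, \\
&& \mathsf{tableaddrs}~\mathrm{tables}(\mathit{externval}_{\mathit{im}}^\ast)~\mathit{tableaddr}^\ast, \\
&& \mathsf{memaddrs}~\mathrm{mems}(\mathit{externval}_{\mathit{im}}^\ast)~\mathit{memaddr}^\ast, \\
&& \mathsf{globaladdrs}~\mathrm{globals}(\mathit{externval}_{\mathit{im}}^\ast)~\mathit{globaladdr}^\ast, \\
&& \mathsf{exports}~\mathit{exportinst}^\ast\} \\
S_1, \mathit{funcaddr}^\ast &=& \mathrm{allocfunc}^\ast(S, \mathit{module}.\mathsf{funcs}, \mathit{moduleinst}) \\
S_2, \mathit{tableaddr}^\ast &=& \mathrm{alloctable}^\ast(S_1, (\mathit{table}.\mathsf{type})^\ast) \qquad (\text{where } \mathit{table}^\ast = \mathit{module}.\mathsf{tables}) \\
S_3, \mathit{memaddr}^\ast &=& \mathrm{allocmem}^\ast(S_2, (\mathit{mem}.\mathsf{type})^\ast) \qquad (\text{where } \mathit{mem}^\ast = \mathit{module}.\mathsf{mems}) \\
S', \mathit{globaladdr}^\ast &=& \mathrm{allocglobal}^\ast(S_3, (\mathit{global}.\mathsf{type})^\ast, \mathit{val}^\ast) \qquad (\text{where } \mathit{global}^\ast = \mathit{module}.\mathsf{globals}) \\
\mathit{exportinst}^\ast &=& \{\mathsf{name}~(\mathit{export}.\mathsf{name}), \mathsf{value}~\mathit{externval}_{\mathit{ex}}\}^\ast \qquad (\text{where } \mathit{export}^\ast = \mathit{module}.\mathsf{exports})
\end{array}$$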

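Here, the notation $\mathrm{allocx}^\ast$ is shorthand for multiple allocations of object kind $X$, defined as follows:

$$\begin{array}{rcl}
\mathrm{allocx}^\ast(S_0, X^n, \dots) &=& S_n, a^n \\
\text{where for all } i < n: \quad S_{i+1}, a^n[i] &=& \mathrm{allocx}(S_i, X^n[i], \dots)
\end{array}$$

Moreover, if the dots $\dots$ are a sequence $A^n$ (as for globals), then the elements of this sequence are passed to the allocation function pointwise.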

Note

The definition of module allocation is mutually recursive with the allocation of its associated functions, because the resulting module instance $\mathit{moduleinst}$ is passed to the function allocator as an argument, in order to form the necessary closures. In an implementation, this recursion is easily unraveled by mutating one or the other in a secondary step.

4.5.4. Instantiation

Given a store $S$, a module $\mathit{module}$ is instantiated with a list of external values $\mathit{externval}^n$ supplying the required imports as follows.

Instantiation checks that the module is valid and the provided imports match the declared types, and may fail with an error otherwise. Instantiation can also result in a trap from executing the start function. It is up to the embedder to define how such conditions are reported.

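If $\mathit{module}$ is not valid, then:
1. Fail.
Assert: $\mathit{module}$ is valid with external types $\mathit{externtype}_{\mathit{im}}^m$ classifying its imports.
If the number $m$ of imports is not equal to the number $n$ of provided external values, then:
1. Fail.
For each external value $\mathit{externval}_i$ in $\mathit{externval}^n$ and external type $\mathit{externtype}'_i$ in $\mathit{externtype}_{\mathit{im}}^n$, do:
1. If $\mathit{externval}_i$ is not valid with an external type $\mathit{externtype}_i$ in store $S$, then fail.
2. If $\mathit{externtype}_i$ does not match $\mathit{externtype}'_i$, then fail.
Let $\mathit{val}^\ast$ be the vector of global initialization values determined by $\mathit{module}$ and $\mathit{externval}^n$. These may be calculated as follows.
1. Let $\mathit{moduleinst}_{\mathit{im}}$ be the auxiliary module instance $\{\mathsf{globaladdrs}~\mathrm{globals}(\mathit{externval}^n)\}$ that only consists of the imported globals.
2. Let $F_{\mathit{im}}$ be the auxiliary frame $\{\mathsf{module}~\mathit{moduleinst}_{\mathit{im}}, \mathsf{locals}~\epsilon\}$.
3. Push the frame $F_{\mathit{im}}$ to the stack.
4. For each global $\mathit{global}_i$ in $\mathit{module}.\mathsf{globals}$, do: Let $\mathit{val}_i$ be the result of evaluating the initializer expression $\mathit{global}_i.\mathsf{init}$.
5. Assert: due to validation, the frame $F_{\mathit{im}}$ is now on the top of the stack.
6. Pop the frame $F_{\mathit{im}}$ from the stack.
Let $\mathit{moduleinst}$ be a new module instance allocated from $\mathit{module}$ in store $S$ with imports $\mathit{externval}^n$ and global initializer values $\mathit{val}^\ast$, and let $S'$ be the extended store produced by module allocation.
Let $F$ be the frame $\{\mathsf{module}~\mathit{moduleinst}, \mathsf{locals}~\epsilon\}$.
Push the frame $F$ to the stack.
For each element segment $\mathit{elem}_i$ in $\mathit{module}.\mathsf{elem}$, do:
1. Let $\mathit{eoval}_i$ be the result of evaluating the expression $\mathit{elem}_i.\mathsf{offset}$.
2. Assert: due to validation, $\mathit{eoval}_i$ is of the form $\mathsf{i32}.\mathsf{const}~\mathit{eo}_i$.
3. Let $\mathit{tableidx}_i$ be the table index $\mathit{elem}_i.\mathsf{table}$.
4. Assert: due to validation, $\mathit{moduleinst}.\mathsf{tableaddrs}[\mathit{tableidx}_i]$ exists.
5. Let $\mathit{tableaddr}_i$ be the table address $\mathit{moduleinst}.\mathsf{tableaddrs}[\mathit{tableidx}_i]$.
6. Assert: due to validation, $S'.\mathsf{tables}[\mathit{tableaddr}_i]$ exists.
7. Let $\mathit{tableinst}_i$ be the table instance $S'.\mathsf{tables}[\mathit{tableaddr}_i]$.
8. Let $\mathit{eend}_i$ be $\mathit{eo}_i$ plus the length of $\mathit{elem}_i.\mathsf{init}$.
9. If $\mathit{eend}_i$ is larger than the length of $\mathit{tableinst}_i.\mathsf{elem}$, then fail.
For each data segment $\mathit{data}_i$ in $\mathit{module}.\mathsf{data}$, do:
1. Let $\mathit{doval}_i$ be the result of evaluating the expression $\mathit{data}_i.\mathsf{offset}$.
2. Assert: due to validation, $\mathit{doval}_i$ is of the form $\mathsf{i32}.\mathsf{const}~\mathit{do}_i$.
3. Let $\mathit{memidx}_i$ be the memory index $\mathit{data}_i.\mathsf{data}$.
4. Assert: due to validation, $\mathit{moduleinst}.\mathsf{memaddrs}[\mathit{memidx}_i]$ exists.
5. Let $\mathit{memaddr}_i$ be the memory address $\mathit{moduleinst}.\mathsf{memaddrs}[\mathit{memidx}_i]$.
6. Assert: due to validation, $S'.\mathsf{mems}[\mathit{memaddr}_i]$ exists.
7. Let $\mathit{meminst}_i$ be the memory instance $S'.\mathsf{mems}[\mathit{memaddr}_i]$.
8. Let $\mathit{dend}_i$ be $\mathit{do}_i$ plus the length of $\mathit{data}_i.\mathsf{init}$.
9. If $\mathit{dend}_i$ is larger than the length of $\mathit{meminst}_i.\mathsf{data}$, then fail.
Assert: due to validation, the frame $F$ is now on the top of the stack.
Pop the frame from the stack.
For each element segment $\mathit{elem}_i$ in $\mathit{module}.\mathsf{elem}$, do:
1. For each function index $\mathit{funcidx}_{ij}$ in $\mathit{elem}_i.\mathsf{init}$ (starting with $j = 0$), do: Let $\mathit{funcaddr}_{ij}$ be the function address $\mathit{moduleinst}.\mathsf{funcaddrs}[\mathit{funcidx}_{ij}]$, which exists due to validation, and replace $\mathit{tableinst}_i.\mathsf{elem}[\mathit{eo}_i + j]$ with $\mathit{funcaddr}_{ij}$.
For each data segment $\mathit{data}_i$ in $\mathit{module}.\mathsf{data}$, do:
1. For each byte $b_{ij}$ in $\mathit{data}_i.\mathsf{init}$ (starting with $j = 0$), do: Replace $\mathit{meminst}_i.\mathsf{data}[\mathit{do}_i + j]$ with $b_{ij}$.
If the start function $\mathit{module}.\mathsf{start}$ is not empty, then:
1. Assert: due to validation, $\mathit{moduleinst}.\mathsf{funcaddrs}[\mathit{module}.\mathsf{start}.\mathsf{func}]$ exists.
2. Let $\mathit{funcaddr}$ be the function address $\mathit{moduleinst}.\mathsf{funcaddrs}[\mathit{module}.\mathsf{start}.\mathsf{func}]$.
3. Invoke the function instance at $\mathit{funcaddr}$.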

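$$\begin{array}{l}
\mathrm{instantiate}(S, \mathit{module}, \mathit{externval}^n) = S'; F; (\mathsf{init\_elem}~\mathit{tableaddr}~\mathit{eo}~\mathit{elem}.\mathsf{init})^\ast~(\mathsf{init\_data}~\mathit{memaddr}~\mathit{do}~\mathit{data}.\mathsf{init})^\ast~(\mathsf{invoke}~\mathit{funcaddr})^? \\
\quad (\text{if } \vdash \mathit{module} : \mathit{externtype}_{\mathit{im}}^n \to \mathit{externtype}_{\mathit{ex}}^\ast \\
\quad \land~ (S \vdash \mathit{externval} : \mathit{externtype})^n \\
\quad \land~ (\vdash \mathit{externtype} \leq \mathit{externtype}_{\mathit{im}})^n \\
\quad \land~ \mathit{module}.\mathsf{globals} = \mathit{global}^\ast \land \mathit{module}.\mathsf{elem} = \mathit{elem}^\ast \land \mathit{module}.\mathsf{data} = \mathit{data}^\ast \land \mathit{module}.\mathsf{start} = \mathit{start}^? \\
\quad \land~ S', \mathit{moduleinst} = \mathrm{allocmodule}(S, \mathit{module}, \mathit{externval}^n, \mathit{val}^\ast) \\
\quad \land~ F = \{\mathsf{module}~\mathit{moduleinst}, \mathsf{locals}~\epsilon\} \\
\quad \land~ (S'; F; \mathit{global}.\mathsf{init} \hookrightarrow^\ast S'; F; \mathit{val}~\mathsf{end})^\ast \\
\quad \land~ (S'; F; \mathit{elem}.\mathsf{offset} \hookrightarrow^\ast S'; F; \mathsf{i32}.\mathsf{const}~\mathit{eo}~\mathsf{end})^\ast \\
\quad \land~ (S'; F; \mathit{data}.\mathsf{offset} \hookrightarrow^\ast S'; F; \mathsf{i32}.\mathsf{const}~\mathit{do}~\mathsf{end})^\ast \\
\quad \land~ (\mathit{eo} + |\mathit{elem}.\mathsf{init}| \leq |S'.\mathsf{tables}[\mathit{tableaddr}].\mathsf{elem}|)^\ast \\
\quad \land~ (\mathit{do} + |\mathit{data}.\mathsf{init}| \leq |S'.\mathsf{mems}[\mathit{memaddr}].\mathsf{data}|)^\ast \\
\quad \land~ (\mathit{tableaddr} = \mathit{moduleinst}.\mathsf{tableaddrs}[\mathit{elem}.\mathsf{table}])^\ast \\
\quad \land~ (\mathit{memaddr} = \mathit{moduleinst}.\mathsf{memaddrs}[\mathit{data}.\mathsf{data}])^\ast \\
\quad \land~ (\mathit{funcaddr} = \mathit{moduleinst}.\mathsf{funcaddrs}[\mathit{start}.\mathsf{func}])^?) \\[1ex]
S; F; \mathsf{init\_elem}~a~i~\epsilon \hookrightarrow S; F; \epsilon \\
S; F; \mathsf{init\_elem}~a~i~(x_0~x^\ast) \hookrightarrow S'; F; \mathsf{init\_elem}~a~(i+1)~x^\ast \quad (\text{if } S' = S~\text{with}~\mathsf{tables}[a].\mathsf{elem}[i] = F.\mathsf{module}.\mathsf{funcaddrs}[x_0]) \\
S; F; \mathsf{init\_data}~a~i~\epsilon \hookrightarrow S; F; \epsilon \\
S; F; \mathsf{init\_data}~a~i~(b_0~b^\ast) \hookrightarrow S'; F; \mathsf{init\_data}~a~(i+1)~b^\ast \quad (\text{if } S' = S~\text{with}~\mathsf{mems}[a].\mathsf{data}[i] = b_0)
\end{array}$$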

Note

Module allocation and the evaluation of global initializers are mutually recursive because the global initialization values $\mathit{val}^\ast$ are passed to the module allocator but depend on the store $S'$ and module instance $\mathit{moduleinst}$ returned by allocation. However, this recursion is just a specification device. Due to validation, the initialization values can easily be determined from a simple pre-pass that evaluates global initializers in the initial store.

All failure conditions are checked before any observable mutation of the store takes place. Store mutation is not atomic; it happens in individual steps that may be interleaved with other threads.

Evaluation of constant expressions does not affect the store.

4.5.5. Invocation

Once a module has been instantiated, any exported function can be invoked externally via its function address $\mathit{funcaddr}$ in the store $S$ and an appropriate list $\mathit{val}^\ast$ of argument values.

Invocation may fail with an error if the arguments do not fit the function type. Invocation can also result in a trap. It is up to the embedder to define how such conditions are reported.

Note

If the embedder API performs type checks itself, either statically or dynamically, before performing an invocation, then no failure other than traps can occur.

The following steps are performed:

Assert: $S.\mathsf{funcs}[\mathit{funcaddr}]$ exists.
Let $\mathit{funcinst}$ be the function instance $S.\mathsf{funcs}[\mathit{funcaddr}]$.
Let $[t_1^n] \to [t_2^m]$ be the function type $\mathit{funcinst}.\mathsf{type}$.
If the length $|\mathit{val}^\ast|$ of the provided argument values is different from the number $n$ of expected arguments, then: Fail.
For each value type $t_i$ in $t_1^n$ and corresponding value $\mathit{val}_i$ in $\mathit{val}^\ast$, do: If $\mathit{val}_i$ is not $t_i.\mathsf{const}~c_i$ for some $c_i$, then: Fail.
Let $F$ be the dummy frame $\{\mathsf{module}~\{\}, \mathsf{locals}~\epsilon\}$.
Push the frame $F$ to the stack.
Push the values $\mathit{val}^\ast$ to the stack.
Invoke the function instance at address $\mathit{funcaddr}$.

Once the function has returned, the following steps are executed:

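Assert: due to validation, $m$ values are on the top of the stack.
Pop $\mathit{val}_{\mathit{res}}^m$ from the stack.

The values $\mathit{val}_{\mathit{res}}^m$ are returned as the results of the invocation.

$$\begin{array}{l}
\mathrm{invoke}(S, \mathit{funcaddr}, \mathit{val}^n) = S; F; \mathit{val}^n~(\mathsf{invoke}~\mathit{funcaddr}) \\
\quad (\text{if } S.\mathsf{funcs}[\mathit{funcaddr}].\mathsf{type} = [t_1^n] \to [t_2^m] \\
\quad \land~ \mathit{val}^n = (t_1.\mathsf{const}~c)^n \\
\quad \land~ F = \{\mathsf{module}~\{\}, \mathsf{locals}~\epsilon\})
\end{array}$$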

5. Binary Format

5.1. Conventions

The binary format for WebAssembly modules is a dense linear encoding of their abstract syntax. [1]

The format is defined by an attribute grammar whose only terminal symbols are bytes. A byte sequence is a well-formed encoding of a module if and only if it is generated by the grammar.

Each production of this grammar has exactly one synthesized attribute: the abstract syntax that the respective byte sequence encodes. Thus, the attribute grammar implicitly defines a decoding function (i.e., a parsing function for the binary format).

Except for a few exceptions, the binary grammar closely mirrors the grammar of the abstract syntax.

Note

Some phrases of abstract syntax have multiple possible encodings in the binary format. For example, numbers may be encoded as if they had optional leading zeros. Implementations of decoders must support all possible alternatives; implementations of encoders can pick any allowed encoding.

The recommended extension for files containing WebAssembly modules in binary format is “$\mathtt{.wasm}$” and the recommended Media Type is “$\mathtt{application/wasm}$”.

[1]	Additional encoding layers – for example, introducing compression – may be defined on top of the basic representation defined here. However, such layers are outside the scope of the current specification.

5.1.1. Grammar

The following conventions are adopted in defining grammar rules for the binary format. They mirror the conventions used for abstract syntax. In order to distinguish symbols of the binary syntax from symbols of the abstract syntax, $\mathtt{typewriter}$ font is adopted for the former.

Terminal symbols are bytes expressed in hexadecimal notation: $\mathtt{0x0F}$.
Nonterminal symbols are written in typewriter font: $\mathtt{valtype}$, $\mathtt{instr}$.
$B^n$ is a sequence of $n \geq 0$ iterations of $B$.
$B^\ast$ is a possibly empty sequence of iterations of $B$. (This is a shorthand for $B^n$ used where $n$ is not relevant.)
$B^?$ is an optional occurrence of $B$. (This is a shorthand for $B^n$ where $n \leq 1$.)
$x{:}B$ denotes the same language as the nonterminal $B$, but also binds the variable $x$ to the attribute synthesized for $B$.
Productions are written $\mathtt{sym} ::= B_1 \Rightarrow A_1 \mid \dots \mid B_n \Rightarrow A_n$, where each $A_i$ is the attribute that is synthesized for $\mathtt{sym}$ in the given case, usually from attribute variables bound in $B_i$.
Some productions are augmented by side conditions in parentheses, which restrict the applicability of the production. They provide a shorthand for a combinatorial expansion of the production into many separate cases.

Note

For example, the binary grammar for value types is given as follows:

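$$\begin{array}{lllll}
\mathtt{valtype} &::=& \mathtt{0x7F} &\Rightarrow& \mathsf{i32} \\
&\mid& \mathtt{0x7E} &\Rightarrow& \mathsf{i64} \\
&\mid& \mathtt{0x7D} &\Rightarrow& \mathsf{f32} \\
&\mid& \mathtt{0x7C} &\Rightarrow& \mathsf{f64}
\end{array}$$

Consequently, the byte $\mathtt{0x7F}$ encodes the type $\mathsf{i32}$, $\mathtt{0x7E}$ encodes the type $\mathsf{i64}$, and so forth. No other byte value is allowed as the encoding of a value type.

The binary grammar for limits is defined as follows:

$$\begin{array}{lllll}
\mathtt{limits} &::=& \mathtt{0x00}~~n{:}\mathtt{u32} &\Rightarrow& \{\mathsf{min}~n, \mathsf{max}~\epsilon\} \\
&\mid& \mathtt{0x01}~~n{:}\mathtt{u32}~~m{:}\mathtt{u32} &\Rightarrow& \{\mathsf{min}~n, \mathsf{max}~m\}
\end{array}$$

That is, a limits pair is encoded as either the byte $\mathtt{0x00}$ followed by the encoding of a $\mathtt{u32}$ value, or the byte $\mathtt{0x01}$ followed by two such encodings. The variables $n$ and $m$ name the attributes of the respective $\mathtt{u32}$ nonterminals, which in this case are the actual unsigned integers those decode into. The attribute of the complete production then is the abstract syntax for the limit, expressed in terms of the former values.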

5.1.2. Auxiliary Notation

When dealing with binary encodings the following notation is also used:

$\epsilon$ denotes the empty byte sequence.
$||B||$ is the length of the byte sequence generated from the production $B$ in a derivation.

5.1.3. Vectors

Vectors are encoded with their $\mathtt{u32}$ length followed by the encoding of their element sequence.

$$\mathtt{vec}(B) ~::=~ n{:}\mathtt{u32}~~(x{:}B)^n ~\Rightarrow~ x^n$$
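
A vector decoder can be sketched in Python as a length read followed by that many element reads. The helper names `read_u32` and `read_vec` are hypothetical, and each element reader is assumed to take `(data, pos)` and return `(value, next_pos)`.

```python
def read_u32(data: bytes, pos: int):
    """Decode an unsigned LEB128 u32 at data[pos]; return (value, next_pos)."""
    result = 0
    for i in range(5):  # a u32 occupies at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            if result >= 1 << 32:
                raise ValueError("u32 out of range")
            return result, pos + i + 1
    raise ValueError("u32 encoding too long")


def read_vec(data: bytes, pos: int, read_elem):
    """Decode vec(B): a u32 length n followed by n elements read with `read_elem`."""
    n, pos = read_u32(data, pos)
    items = []
    for _ in range(n):
        item, pos = read_elem(data, pos)
        items.append(item)
    return items, pos
```

For example, a vector of three single-byte elements would be read with `read_vec(data, 0, lambda d, p: (d[p], p + 1))`.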

5.2. Values

5.2.1. Bytes

Bytes encode themselves.

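$$\begin{array}{lllll}
\mathtt{byte} &::=& \mathtt{0x00} &\Rightarrow& \mathtt{0x00} \\
&\mid& \dots \\
&\mid& \mathtt{0xFF} &\Rightarrow& \mathtt{0xFF}
\end{array}$$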

5.2.2. Integers

All integers are encoded using the LEB128 variable-length integer encoding, in either unsigned or signed variant.

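Unsigned integers are encoded in unsigned LEB128 format. As an additional constraint, the total number of bytes encoding a value of type $uN$ must not exceed $\mathrm{ceil}(N/7)$ bytes.

$$\begin{array}{llllll}
\mathtt{u}N &::=& n{:}\mathtt{byte} &\Rightarrow& n & (\text{if } n < 2^7 \land n < 2^N) \\
&\mid& n{:}\mathtt{byte}~~m{:}\mathtt{u}(N-7) &\Rightarrow& 2^7 \cdot m + (n - 2^7) & (\text{if } n \geq 2^7 \land N > 7)
\end{array}$$

Signed integers are encoded in signed LEB128 format, which uses a two’s complement representation. As an additional constraint, the total number of bytes encoding a value of type $sN$ must not exceed $\mathrm{ceil}(N/7)$ bytes.

$$\begin{array}{llllll}
\mathtt{s}N &::=& n{:}\mathtt{byte} &\Rightarrow& n & (\text{if } n < 2^6 \land n < 2^{N-1}) \\
&\mid& n{:}\mathtt{byte} &\Rightarrow& n - 2^7 & (\text{if } 2^6 \leq n < 2^7 \land n \geq 2^7 - 2^{N-1}) \\
&\mid& n{:}\mathtt{byte}~~m{:}\mathtt{s}(N-7) &\Rightarrow& 2^7 \cdot m + (n - 2^7) & (\text{if } n \geq 2^7 \land N > 7)
\end{array}$$

Uninterpreted integers are encoded as signed integers.

$$\mathtt{i}N ~::=~ n{:}\mathtt{s}N ~\Rightarrow~ i \qquad (\text{if } n = \mathrm{signed}_N(i))$$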

Note

The side conditions $N > 7$ in the productions for non-terminal bytes of the $u$ and $s$ encodings restrict the encoding’s length. However, “trailing zeros” are still allowed within these bounds. For example, $\mathtt{0x03}$ and $\mathtt{0x83}~\mathtt{0x00}$ are both well-formed encodings for the value $3$ as a $u8$. Similarly, either of $\mathtt{0x7E}$ and $\mathtt{0xFE}~\mathtt{0x7F}$ and $\mathtt{0xFE}~\mathtt{0xFF}~\mathtt{0x7F}$ are well-formed encodings of the value $-2$ as an $s16$.

The side conditions on the value $n$ of terminal bytes further enforce that any unused bits in these bytes must be $0$ for positive values and $1$ for negative ones. For example, $\mathtt{0x83}~\mathtt{0x10}$ is malformed as a $u8$ encoding. Similarly, both $\mathtt{0x83}~\mathtt{0x3E}$ and $\mathtt{0xFF}~\mathtt{0x7B}$ are malformed as $s8$ encodings.
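
The two constraints discussed above (at most $\mathrm{ceil}(N/7)$ bytes, and no stray unused bits in the final byte) can be sketched in Python as follows. The function names are hypothetical. The sketch enforces the unused-bits condition by checking that the decoded value lies in the range of the target type, which is an equivalent formulation chosen for brevity rather than a literal transcription of the grammar.

```python
import math


def decode_unsigned(data: bytes, pos: int, n_bits: int):
    """Decode a uN value (N = n_bits) at data[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    for i in range(math.ceil(n_bits / 7)):  # length constraint
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:  # final byte: continuation bit clear
            if result >= 1 << n_bits:
                raise ValueError("unused bits set in final byte")
            return result, pos + i + 1
    raise ValueError("integer encoding too long")


def decode_signed(data: bytes, pos: int, n_bits: int):
    """Decode an sN value (N = n_bits) at data[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    for i in range(math.ceil(n_bits / 7)):
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            if byte & 0x40:  # sign bit of the final byte: sign-extend
                result -= 1 << shift
            if not -(1 << (n_bits - 1)) <= result < (1 << (n_bits - 1)):
                raise ValueError("unused bits inconsistent with sign")
            return result, pos + i + 1
    raise ValueError("integer encoding too long")
```

Applied to the byte sequences in the note, the sketch accepts the padded encodings of $3$ and $-2$ and rejects the three sequences described as malformed.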

5.2.3. Floating-Point

Floating-point values are encoded directly by their [IEEE-754-2019] (Section 3.4) bit pattern in little endian byte order:

$$\mathtt{f}N ~::=~ b^\ast{:}\mathtt{byte}^{N/8} ~\Rightarrow~ \mathrm{bytes}_{fN}^{-1}(b^\ast)$$
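
In Python, the little-endian IEEE 754 layout corresponds to the `"<f"` and `"<d"` formats of the standard `struct` module. The sketch below uses hypothetical function names. Converting to a Python `float` is an assumption of convenience and may not preserve every NaN bit pattern, so a decoder that must keep exact bits would retain the raw bytes instead.

```python
import struct


def decode_f32(data: bytes, pos: int = 0):
    """Read 4 bytes at data[pos] as a little-endian binary32; return (value, next_pos)."""
    return struct.unpack_from("<f", data, pos)[0], pos + 4


def decode_f64(data: bytes, pos: int = 0):
    """Read 8 bytes at data[pos] as a little-endian binary64; return (value, next_pos)."""
    return struct.unpack_from("<d", data, pos)[0], pos + 8
```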

5.2.4. Names

Names are encoded as a vector of bytes containing the [UNICODE] (Section 3.9) UTF-8 encoding of the name’s character sequence.

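$$\mathtt{name} ~::=~ b^\ast{:}\mathtt{vec}(\mathtt{byte}) ~\Rightarrow~ \mathit{name} \qquad (\text{if } \mathrm{utf8}(\mathit{name}) = b^\ast)$$

The auxiliary $\mathrm{utf8}$ function expressing this encoding is defined as follows:

$$\begin{array}{lcll}
\mathrm{utf8}(c^\ast) &=& (\mathrm{utf8}(c))^\ast \\
\mathrm{utf8}(c) &=& b & (\text{if } c < \mathrm{U{+}80} \land c = b) \\
\mathrm{utf8}(c) &=& b_1~b_2 & (\text{if } \mathrm{U{+}80} \leq c < \mathrm{U{+}800} \land c = 2^6 (b_1 - \mathtt{0xC0}) + (b_2 - \mathtt{0x80})) \\
\mathrm{utf8}(c) &=& b_1~b_2~b_3 & (\text{if } (\mathrm{U{+}800} \leq c < \mathrm{U{+}D800} \lor \mathrm{U{+}E000} \leq c < \mathrm{U{+}10000}) \land c = 2^{12} (b_1 - \mathtt{0xE0}) + 2^6 (b_2 - \mathtt{0x80}) + (b_3 - \mathtt{0x80})) \\
\mathrm{utf8}(c) &=& b_1~b_2~b_3~b_4 & (\text{if } \mathrm{U{+}10000} \leq c < \mathrm{U{+}110000} \land c = 2^{18} (b_1 - \mathtt{0xF0}) + 2^{12} (b_2 - \mathtt{0x80}) + 2^6 (b_3 - \mathtt{0x80}) + (b_4 - \mathtt{0x80}))
\end{array}$$

where $b_2, b_3, b_4 < \mathtt{0xC0}$.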

Note

Unlike in some other formats, name strings are not 0-terminated.
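
A name decoder can be sketched as a length-prefixed byte vector followed by strict UTF-8 decoding. The helper names are hypothetical. Python's built-in strict `utf-8` codec rejects malformed sequences, including encoded surrogates, and the sketch relies on that behaviour to stand in for the side condition on names.

```python
def read_u32(data: bytes, pos: int):
    """Decode an unsigned LEB128 u32 at data[pos]; return (value, next_pos)."""
    result = 0
    for i in range(5):  # at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            if result >= 1 << 32:
                raise ValueError("u32 out of range")
            return result, pos + i + 1
    raise ValueError("u32 encoding too long")


def decode_name(data: bytes, pos: int = 0):
    """Decode a name: u32 byte length, then that many bytes of UTF-8. No terminator is read."""
    n, pos = read_u32(data, pos)
    raw = data[pos:pos + n]
    if len(raw) != n:
        raise ValueError("unexpected end of input")
    # Raises UnicodeDecodeError (a ValueError subclass) on malformed UTF-8.
    return raw.decode("utf-8"), pos + n
```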

5.3. Types

5.3.1. Value Types

Value types are encoded by a single byte.

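$$\begin{array}{lllll}
\mathtt{valtype} &::=& \mathtt{0x7F} &\Rightarrow& \mathsf{i32} \\
&\mid& \mathtt{0x7E} &\Rightarrow& \mathsf{i64} \\
&\mid& \mathtt{0x7D} &\Rightarrow& \mathsf{f32} \\
&\mid& \mathtt{0x7C} &\Rightarrow& \mathsf{f64}
\end{array}$$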

Note

In future versions of WebAssembly, value types may include types denoted by type indices. Thus, the binary format for types corresponds to the signed LEB128 encoding of small negative $sN$ values, so that they can coexist with (positive) type indices in the future.

5.3.2. Result Types

The only result types occurring in the binary format are the types of blocks. These are encoded in special compressed form, by either the byte $\mathtt{0x40}$ indicating the empty type or as a single value type.

$$\begin{array}{lllll}
\mathtt{blocktype} &::=& \mathtt{0x40} &\Rightarrow& [\,] \\
&\mid& t{:}\mathtt{valtype} &\Rightarrow& [t]
\end{array}$$

Note

In future versions of WebAssembly, this scheme may be extended to support multiple results or more general block types.

5.3.3. Function Types

Function types are encoded by the byte $\mathtt{0x60}$ followed by the respective vectors of parameter and result types.

$$\mathtt{functype} ~::=~ \mathtt{0x60}~~t_1^\ast{:}\mathtt{vec}(\mathtt{valtype})~~t_2^\ast{:}\mathtt{vec}(\mathtt{valtype}) ~\Rightarrow~ [t_1^\ast] \to [t_2^\ast]$$

5.3.4. Limits

Limits are encoded with a preceding flag indicating whether a maximum is present.

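$$\begin{array}{lllll}
\mathtt{limits} &::=& \mathtt{0x00}~~n{:}\mathtt{u32} &\Rightarrow& \{\mathsf{min}~n, \mathsf{max}~\epsilon\} \\
&\mid& \mathtt{0x01}~~n{:}\mathtt{u32}~~m{:}\mathtt{u32} &\Rightarrow& \{\mathsf{min}~n, \mathsf{max}~m\}
\end{array}$$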
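
As a small illustration, the flag-then-values layout of limits can be decoded as sketched below. The function names are hypothetical, and the result is returned as a `(min, max)` pair with `None` for an absent maximum.

```python
def read_u32(data: bytes, pos: int):
    """Decode an unsigned LEB128 u32 at data[pos]; return (value, next_pos)."""
    result = 0
    for i in range(5):  # at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            if result >= 1 << 32:
                raise ValueError("u32 out of range")
            return result, pos + i + 1
    raise ValueError("u32 encoding too long")


def decode_limits(data: bytes, pos: int = 0):
    """Decode limits; return ((min, max_or_None), next_pos)."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    flag = data[pos]
    if flag == 0x00:  # no maximum present
        n, pos = read_u32(data, pos + 1)
        return (n, None), pos
    if flag == 0x01:  # maximum present
        n, pos = read_u32(data, pos + 1)
        m, pos = read_u32(data, pos)
        return (n, m), pos
    raise ValueError("malformed limits flag")
```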

5.3.5. Memory Types

Memory types are encoded with their limits.

$$\mathtt{memtype} ~::=~ \mathit{lim}{:}\mathtt{limits} ~\Rightarrow~ \mathit{lim}$$

5.3.6. Table Types

Table types are encoded with their limits and a constant byte indicating their element type.

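$$\begin{array}{lllll}
\mathtt{tabletype} &::=& \mathit{et}{:}\mathtt{elemtype}~~\mathit{lim}{:}\mathtt{limits} &\Rightarrow& \mathit{lim}~\mathit{et} \\
\mathtt{elemtype} &::=& \mathtt{0x70} &\Rightarrow& \mathsf{funcref}
\end{array}$$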

5.3.7. Global Types

Global types are encoded by their value type and a flag for their mutability.

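$$\begin{array}{lllll}
\mathtt{globaltype} &::=& t{:}\mathtt{valtype}~~m{:}\mathtt{mut} &\Rightarrow& m~t \\
\mathtt{mut} &::=& \mathtt{0x00} &\Rightarrow& \mathsf{const} \\
&\mid& \mathtt{0x01} &\Rightarrow& \mathsf{var}
\end{array}$$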

5.4. Instructions

Instructions are encoded by opcodes. Each opcode is represented by a single byte, and is followed by the instruction’s immediate arguments, where present. The only exception are structured control instructions, which consist of several opcodes bracketing their nested instruction sequences.

Note

Gaps in the byte code ranges for encoding instructions are reserved for future extensions.

5.4.1. Control Instructions

Control instructions have varying encodings. For structured instructions, the instruction sequences forming nested blocks are terminated with explicit opcodes for $\mathsf{end}$ and $\mathsf{else}$.

Note

The $\mathsf{else}$ opcode $\mathtt{0x05}$ in the encoding of an $\mathsf{if}$ instruction can be omitted if the following instruction sequence is empty.

Note

In future versions of WebAssembly, the zero byte occurring in the encoding of the $\mathsf{call\_indirect}$ instruction may be used to index additional tables.

5.4.2. Parametric Instructions

Parametric instructions are represented by single byte codes.

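$$\begin{array}{lllll}
\mathtt{instr} &::=& \dots \\
&\mid& \mathtt{0x1A} &\Rightarrow& \mathsf{drop} \\
&\mid& \mathtt{0x1B} &\Rightarrow& \mathsf{select}
\end{array}$$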

5.4.3. Variable Instructions

Variable instructions are represented by byte codes followed by the encoding of the respective index.

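$$\begin{array}{lllll}
\mathtt{instr} &::=& \dots \\
&\mid& \mathtt{0x20}~~x{:}\mathtt{localidx} &\Rightarrow& \mathsf{local.get}~x \\
&\mid& \mathtt{0x21}~~x{:}\mathtt{localidx} &\Rightarrow& \mathsf{local.set}~x \\
&\mid& \mathtt{0x22}~~x{:}\mathtt{localidx} &\Rightarrow& \mathsf{local.tee}~x \\
&\mid& \mathtt{0x23}~~x{:}\mathtt{globalidx} &\Rightarrow& \mathsf{global.get}~x \\
&\mid& \mathtt{0x24}~~x{:}\mathtt{globalidx} &\Rightarrow& \mathsf{global.set}~x
\end{array}$$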

5.4.4. Memory Instructions

Each variant of memory instruction is encoded with a different byte code. Loads and stores are followed by the encoding of their $\mathit{memarg}$ immediate.

m e m a r g i n s t r : : = : : = ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣ ∣

Note

In future versions of WebAssembly, the additional zero bytes occurring in the encoding of the $\mathsf{memory.size}$ and $\mathsf{memory.grow}$ instructions may be used to index additional memories.

5.4.5. Numeric Instructions

All variants of numeric instructions are represented by separate byte codes.

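The $\mathsf{const}$ instructions are followed by the respective literal.

$$\begin{array}{lllll}
\mathtt{instr} &::=& \dots \\
&\mid& \mathtt{0x41}~~n{:}\mathtt{i32} &\Rightarrow& \mathsf{i32.const}~n \\
&\mid& \mathtt{0x42}~~n{:}\mathtt{i64} &\Rightarrow& \mathsf{i64.const}~n \\
&\mid& \mathtt{0x43}~~z{:}\mathtt{f32} &\Rightarrow& \mathsf{f32.const}~z \\
&\mid& \mathtt{0x44}~~z{:}\mathtt{f64} &\Rightarrow& \mathsf{f64.const}~z
\end{array}$$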

All other numeric instructions are plain opcodes without any immediates.

i n s t r : : = ∣ ∣ ∣ ∣ ∣ ∣ \dots 0 x 5 B 0 x 5 C

i n s t r : : = ∣ ∣ ∣ ∣ ∣ ∣ \dots 0 x 61 0 x 62

5.4.6. Expressions

Expressions are encoded by their instruction sequence terminated with an explicit $\mathtt{0x0B}$ opcode for $\mathsf{end}$.

$$\mathtt{expr} ~::=~ (\mathit{in}{:}\mathtt{instr})^\ast~~\mathtt{0x0B} ~\Rightarrow~ \mathit{in}^\ast~\mathsf{end}$$

5.5. Modules

The binary encoding of modules is organized into sections. Most sections correspond to one component of a module record, except that function definitions are split into two sections, separating their type declarations in the function section from their bodies in the code section.

Note

This separation enables parallel and streaming compilation of the functions in a module.

5.5.1. Indices

All indices are encoded with their respective value.

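$$\begin{array}{lllll}
\mathtt{typeidx} &::=& x{:}\mathtt{u32} &\Rightarrow& x \\
\mathtt{funcidx} &::=& x{:}\mathtt{u32} &\Rightarrow& x \\
\mathtt{tableidx} &::=& x{:}\mathtt{u32} &\Rightarrow& x \\
\mathtt{memidx} &::=& x{:}\mathtt{u32} &\Rightarrow& x \\
\mathtt{globalidx} &::=& x{:}\mathtt{u32} &\Rightarrow& x \\
\mathtt{localidx} &::=& x{:}\mathtt{u32} &\Rightarrow& x \\
\mathtt{labelidx} &::=& l{:}\mathtt{u32} &\Rightarrow& l
\end{array}$$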

5.5.2. Sections

Each section consists of

a one-byte section id,
the $\mathtt{u32}$ size of the contents, in bytes,
the actual contents, whose structure depends on the section id.

Every section is optional; an omitted section is equivalent to the section being present with empty contents.

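The following parameterized grammar rule defines the generic structure of a section with id $N$ and contents described by the grammar $B$.

$$\begin{array}{llllll}
\mathtt{section}_N(B) &::=& N{:}\mathtt{byte}~~\mathit{size}{:}\mathtt{u32}~~\mathit{cont}{:}B &\Rightarrow& \mathit{cont} & (\text{if } \mathit{size} = ||B||) \\
&\mid& \epsilon &\Rightarrow& \epsilon
\end{array}$$

For most sections, the contents $B$ encodes a vector. In these cases, the empty result $\epsilon$ is interpreted as the empty vector.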

Note

Other than for unknown custom sections, the $\mathit{size}$ is not required for decoding, but can be used to skip sections when navigating through a binary. The module is malformed if the size does not match the length of the binary contents $B$.

The following section ids are used:

Id	Section
0	custom section
1	type section
2	import section
3	function section
4	table section
5	memory section
6	global section
7	export section
8	start section
9	element section
10	code section
11	data section
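
Because every section carries its id and byte size up front, a tool can walk a binary without decoding section contents. The Python sketch below illustrates this; the names `read_u32`, `iter_sections`, and `SECTION_NAMES` are hypothetical, and the default start offset of 8 assumes the sections begin immediately after an 8-byte preamble. The sketch does not check section ordering or uniqueness.

```python
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table", 5: "memory",
    6: "global", 7: "export", 8: "start", 9: "element", 10: "code", 11: "data",
}


def read_u32(data: bytes, pos: int):
    """Decode an unsigned LEB128 u32 at data[pos]; return (value, next_pos)."""
    result = 0
    for i in range(5):  # at most ceil(32 / 7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            if result >= 1 << 32:
                raise ValueError("u32 out of range")
            return result, pos + i + 1
    raise ValueError("u32 encoding too long")


def iter_sections(data: bytes, pos: int = 8):
    """Yield (section_id, contents) pairs, using each size field to skip ahead."""
    while pos < len(data):
        section_id = data[pos]
        if section_id not in SECTION_NAMES:
            raise ValueError("unknown section id")
        size, pos = read_u32(data, pos + 1)
        end = pos + size
        if end > len(data):
            raise ValueError("section size exceeds input length")
        yield section_id, data[pos:end]
        pos = end
```

A caller interested only in, say, custom sections can filter on `section_id == 0` and ignore the rest.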

5.5.3. Custom Section

Custom sections have the id 0. They are intended to be used for debugging information or third-party extensions, and are ignored by the WebAssembly semantics. Their contents consist of a name further identifying the custom section, followed by an uninterpreted sequence of bytes for custom use.

$$\begin{array}{lll}
\mathtt{customsec} &::=& \mathtt{section}_0(\mathtt{custom}) \\
\mathtt{custom} &::=& \mathtt{name}~~\mathtt{byte}^\ast
\end{array}$$

Note

If an implementation interprets the contents of a custom section, then errors in that contents, or the placement of the section, must not invalidate the module.

5.5.4. Type Section

The type section has the id 1. It decodes into a vector of function types that represent the $\mathsf{types}$ component of a module.

$$\mathtt{typesec} ~::=~ \mathit{ft}^\ast{:}\mathtt{section}_1(\mathtt{vec}(\mathtt{functype})) ~\Rightarrow~ \mathit{ft}^\ast$$

5.5.5. Import Section

The import section has the id 2. It decodes into a vector of imports that represent the $\mathsf{imports}$ component of a module.

5.5.6. Function Section

The function section has the id 3. It decodes into a vector of type indices that represent the $\mathsf{type}$ fields of the functions in the $\mathsf{funcs}$ component of a module. The $\mathsf{locals}$ and $\mathsf{body}$ fields of the respective functions are encoded separately in the code section.

$$\mathtt{funcsec} ~::=~ x^\ast{:}\mathtt{section}_3(\mathtt{vec}(\mathtt{typeidx})) ~\Rightarrow~ x^\ast$$

5.5.7. Table Section

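The table section has the id 4. It decodes into a vector of tables that represent the $\mathsf{tables}$ component of a module.

$$\begin{array}{lllll}
\mathtt{tablesec} &::=& \mathit{tab}^\ast{:}\mathtt{section}_4(\mathtt{vec}(\mathtt{table})) &\Rightarrow& \mathit{tab}^\ast \\
\mathtt{table} &::=& \mathit{tt}{:}\mathtt{tabletype} &\Rightarrow& \{\mathsf{type}~\mathit{tt}\}
\end{array}$$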

5.5.8. Memory Section

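The memory section has the id 5. It decodes into a vector of memories that represent the $\mathsf{mems}$ component of a module.

$$\begin{array}{lllll}
\mathtt{memsec} &::=& \mathit{mem}^\ast{:}\mathtt{section}_5(\mathtt{vec}(\mathtt{mem})) &\Rightarrow& \mathit{mem}^\ast \\
\mathtt{mem} &::=& \mathit{mt}{:}\mathtt{memtype} &\Rightarrow& \{\mathsf{type}~\mathit{mt}\}
\end{array}$$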

5.5.9. Global Section

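The global section has the id 6. It decodes into a vector of globals that represent the $\mathsf{globals}$ component of a module.

$$\begin{array}{lllll}
\mathtt{globalsec} &::=& \mathit{glob}^\ast{:}\mathtt{section}_6(\mathtt{vec}(\mathtt{global})) &\Rightarrow& \mathit{glob}^\ast \\
\mathtt{global} &::=& \mathit{gt}{:}\mathtt{globaltype}~~e{:}\mathtt{expr} &\Rightarrow& \{\mathsf{type}~\mathit{gt}, \mathsf{init}~e\}
\end{array}$$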

5.5.10. Export Section

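The export section has the id 7. It decodes into a vector of exports that represent the $\mathsf{exports}$ component of a module.

$$\begin{array}{lllll}
\mathtt{exportsec} &::=& \mathit{ex}^\ast{:}\mathtt{section}_7(\mathtt{vec}(\mathtt{export})) &\Rightarrow& \mathit{ex}^\ast \\
\mathtt{export} &::=& \mathit{nm}{:}\mathtt{name}~~d{:}\mathtt{exportdesc} &\Rightarrow& \{\mathsf{name}~\mathit{nm}, \mathsf{desc}~d\} \\
\mathtt{exportdesc} &::=& \mathtt{0x00}~~x{:}\mathtt{funcidx} &\Rightarrow& \mathsf{func}~x \\
&\mid& \mathtt{0x01}~~x{:}\mathtt{tableidx} &\Rightarrow& \mathsf{table}~x \\
&\mid& \mathtt{0x02}~~x{:}\mathtt{memidx} &\Rightarrow& \mathsf{mem}~x \\
&\mid& \mathtt{0x03}~~x{:}\mathtt{globalidx} &\Rightarrow& \mathsf{global}~x
\end{array}$$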

5.5.11. Start Section

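The start section has the id 8. It decodes into an optional start function that represents the $\mathsf{start}$ component of a module.

$$\begin{array}{lllll}
\mathtt{startsec} &::=& \mathit{st}^?{:}\mathtt{section}_8(\mathtt{start}) &\Rightarrow& \mathit{st}^? \\
\mathtt{start} &::=& x{:}\mathtt{funcidx} &\Rightarrow& \{\mathsf{func}~x\}
\end{array}$$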

5.5.12. Element Section

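The element section has the id 9. It decodes into a vector of element segments that represent the $\mathsf{elem}$ component of a module.

$$\begin{array}{lllll}
\mathtt{elemsec} &::=& \mathit{seg}^\ast{:}\mathtt{section}_9(\mathtt{vec}(\mathtt{elem})) &\Rightarrow& \mathit{seg}^\ast \\
\mathtt{elem} &::=& x{:}\mathtt{tableidx}~~e{:}\mathtt{expr}~~y^\ast{:}\mathtt{vec}(\mathtt{funcidx}) &\Rightarrow& \{\mathsf{table}~x, \mathsf{offset}~e, \mathsf{init}~y^\ast\}
\end{array}$$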

5.5.13. Code Section

The code section has the id 10. It decodes into a vector of code entries that are pairs of value type vectors and expressions. They represent the $\mathsf{locals}$ and $\mathsf{body}$ field of the functions in the $\mathsf{funcs}$ component of a module. The $\mathsf{type}$ fields of the respective functions are encoded separately in the function section.

The encoding of each code entry consists of

the $\mathtt{u32}$ size of the function code in bytes,
the actual function code, which in turn consists of
- the declaration of locals,
- the function body as an expression.

Local declarations are compressed into a vector whose entries consist of

a $\mathtt{u32}$ count,
a value type,

denoting count locals of the same value type.

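$$\begin{array}{llllll}
\mathtt{codesec} &::=& \mathit{code}^\ast{:}\mathtt{section}_{10}(\mathtt{vec}(\mathtt{code})) &\Rightarrow& \mathit{code}^\ast \\
\mathtt{code} &::=& \mathit{size}{:}\mathtt{u32}~~\mathit{code}{:}\mathtt{func} &\Rightarrow& \mathit{code} & (\text{if } \mathit{size} = ||\mathtt{func}||) \\
\mathtt{func} &::=& (t^\ast)^\ast{:}\mathtt{vec}(\mathtt{locals})~~e{:}\mathtt{expr} &\Rightarrow& \mathrm{concat}((t^\ast)^\ast), e & (\text{if } |\mathrm{concat}((t^\ast)^\ast)| < 2^{32}) \\
\mathtt{locals} &::=& n{:}\mathtt{u32}~~t{:}\mathtt{valtype} &\Rightarrow& t^n
\end{array}$$

Here, $\mathit{code}$ ranges over pairs $(\mathit{valtype}^\ast, \mathit{expr})$. The meta function $\mathrm{concat}((t^\ast)^\ast)$ concatenates all sequences $t_i^\ast$ in $(t^\ast)^\ast$. Any code for which the length of the resulting sequence is out of bounds of the maximum size of a vector is malformed.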
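
The compressed local declarations expand into a flat list of value types, one entry per local. A minimal Python sketch of that expansion follows; the function name and the use of strings for value types are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def expand_locals(decls: Sequence[Tuple[int, str]]) -> List[str]:
    """Expand (count, valtype) declarations into one value type per local, in order."""
    locals_: List[str] = []
    for count, valtype in decls:
        locals_.extend([valtype] * count)
    return locals_
```

A production decoder would want to bound the total count before materialising the list, since a tiny declaration can describe a very large number of locals.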

Note

Like with sections, the code $\mathit{size}$ is not needed for decoding, but can be used to skip functions when navigating through a binary. The module is malformed if a size does not match the length of the respective function code.

5.5.14. Data Section

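The data section has the id 11. It decodes into a vector of data segments that represent the $\mathsf{data}$ component of a module.

$$\begin{array}{lllll}
\mathtt{datasec} &::=& \mathit{seg}^\ast{:}\mathtt{section}_{11}(\mathtt{vec}(\mathtt{data})) &\Rightarrow& \mathit{seg}^\ast \\
\mathtt{data} &::=& x{:}\mathtt{memidx}~~e{:}\mathtt{expr}~~b^\ast{:}\mathtt{vec}(\mathtt{byte}) &\Rightarrow& \{\mathsf{data}~x, \mathsf{offset}~e, \mathsf{init}~b^\ast\}
\end{array}$$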

5.5.15. Modules

The encoding of a module starts with a preamble containing a 4-byte magic number (the string ‘$\mathtt{\backslash 0asm}$’) and a version field. The current version of the WebAssembly binary format is 1.

The preamble is followed by a sequence of sections. Custom sections may be inserted at any place in this sequence, while other sections must occur at most once and in the prescribed order. All sections can be empty.
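
A preamble check can be sketched in a few lines of Python. The sketch assumes the version field is the four bytes `01 00 00 00` (version 1 as a little-endian 32-bit value); the names `MAGIC`, `VERSION`, and `check_preamble` are hypothetical.

```python
MAGIC = b"\x00asm"                 # the 4-byte magic number
VERSION = b"\x01\x00\x00\x00"      # assumed encoding of version 1


def check_preamble(data: bytes) -> bool:
    """Return True if `data` starts with the expected magic number and version field."""
    return data[:4] == MAGIC and data[4:8] == VERSION
```

Sections, if any, would then start at byte offset 8.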

The lengths of vectors produced by the (possibly empty) function and code section must match up.

where for each $t_i^\ast, e_i$ in $\mathit{code}^n$,

$$\mathit{func}^n[i] = \{\mathsf{type}~\mathit{typeidx}^n[i], \mathsf{locals}~t_i^\ast, \mathsf{body}~e_i\}$$

Note

The version of the WebAssembly binary format may increase in the future if backward-incompatible changes have to be made to the format. However, such changes are expected to occur very infrequently, if ever. The binary format is intended to be forward-compatible, such that future extensions can be made without incrementing its version.

6. Text Format

6.1. Conventions

The textual format for WebAssembly modules is a rendering of their abstract syntax into S-expressions.

Like the binary format, the text format is defined by an attribute grammar. A text string is a well-formed description of a module if and only if it is generated by the grammar. Each production of this grammar has at most one synthesized attribute: the abstract syntax that the respective character sequence expresses. Thus, the attribute grammar implicitly defines a parsing function. Some productions also take a context as an inherited attribute that records bound identifiers.

Except for a few exceptions, the core of the text grammar closely mirrors the grammar of the abstract syntax. However, it also defines a number of abbreviations that are “syntactic sugar” over the core syntax.

The recommended extension for files containing WebAssembly modules in text format is “$\mathtt{.wat}$”. Files with this extension are assumed to be encoded in UTF-8, as per [UNICODE] (Section 2.5).

6.1.1. Grammar

The following conventions are adopted in defining grammar rules of the text format. They mirror the conventions used for abstract syntax and for the binary format. In order to distinguish symbols of the textual syntax from symbols of the abstract syntax, $\mathtt{typewriter}$ font is adopted for the former.

Terminal symbols are either literal strings of characters enclosed in quotes or expressed as [UNICODE] scalar values: ‘$\mathtt{module}$’, $\mathrm{U{+}0A}$. (All characters written literally are unambiguously drawn from the 7-bit ASCII subset of Unicode.)
Nonterminal symbols are written in typewriter font: $\mathtt{valtype}$, $\mathtt{instr}$.
$T^n$ is a sequence of $n \geq 0$ iterations of $T$.
$T^\ast$ is a possibly empty sequence of iterations of $T$. (This is a shorthand for $T^n$ used where $n$ is not relevant.)
$T^+$ is a sequence of one or more iterations of $T$. (This is a shorthand for $T^n$ where $n \geq 1$.)
$T^?$ is an optional occurrence of $T$. (This is a shorthand for $T^n$ where $n \leq 1$.)
$x{:}T$ denotes the same language as the nonterminal $T$, but also binds the variable $x$ to the attribute synthesized for $T$.
Productions are written $\mathtt{sym} ::= T_1 \Rightarrow A_1 \mid \dots \mid T_n \Rightarrow A_n$, where each $A_i$ is the attribute that is synthesized for $\mathtt{sym}$ in the given case, usually from attribute variables bound in $T_i$.
Some productions are augmented by side conditions in parentheses, which restrict the applicability of the production. They provide a shorthand for a combinatorial expansion of the production into many separate cases.

A distinction is made between lexical and syntactic productions. For the latter, arbitrary white space is allowed in any place where the grammar contains spaces. The productions defining lexical syntax and the syntax of values are considered lexical, all others are syntactic.

Note

For example, the textual grammar for value types is given as follows:

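$\begin{array}{llll}
\mathtt{valtype} & ::= & \text{‘i32’} & \Rightarrow \mathsf{i32} \\
& | & \text{‘i64’} & \Rightarrow \mathsf{i64} \\
& | & \text{‘f32’} & \Rightarrow \mathsf{f32} \\
& | & \text{‘f64’} & \Rightarrow \mathsf{f64}
\end{array}$

The textual grammar for limits is defined as follows:

$\begin{array}{llll}
\mathtt{limits} & ::= & n{:}\mathtt{u32} & \Rightarrow \{\mathsf{min}~n, \mathsf{max}~\epsilon\} \\
& | & n{:}\mathtt{u32}~~m{:}\mathtt{u32} & \Rightarrow \{\mathsf{min}~n, \mathsf{max}~m\}
\end{array}$

The variables $n$ and $m$ name the attributes of the respective $\mathtt{u32}$ nonterminals, which in this case are the actual unsigned integers those parse into. The attribute of the complete production then is the abstract syntax for the limit, expressed in terms of the former values.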
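As an illustration of how a production synthesizes its attribute, the following Python sketch parses the two alternatives of the limits production from a list of already separated u32 literals. The function name `parse_limits` and the dictionary used to represent the abstract syntax are assumptions made for this example; they are not part of the specification.

```python
def parse_limits(tokens):
    """Parse `n:u32` or `n:u32 m:u32` into the abstract syntax of limits.

    `tokens` is a list of decimal u32 literals given as strings. This
    sketch ignores hexadecimal notation for brevity.
    """
    if len(tokens) not in (1, 2):
        raise ValueError("limits expects one or two u32 literals")
    values = [int(t) for t in tokens]
    if any(v < 0 or v >= 2**32 for v in values):
        raise ValueError("u32 literal out of range")
    n = values[0]
    m = values[1] if len(values) == 2 else None  # None stands for epsilon
    return {"min": n, "max": m}
```

The variables `n` and `m` play the same role as in the grammar: they name the parsed integers, and the returned record is built from them.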

6.1.2. Abbreviations

In addition to the core grammar, which corresponds directly to the abstract syntax, the textual syntax also defines a number of abbreviations that can be used for convenience and readability.

Abbreviations are defined by rewrite rules specifying their expansion into the core syntax:

$\mathit{abbreviation~syntax} \equiv \mathit{expanded~syntax}$

These expansions are assumed to be applied, recursively and in order of appearance, before applying the core grammar rules to construct the abstract syntax.

6.1.3. Contexts

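The text format allows the use of symbolic identifiers in place of indices. To resolve these identifiers into concrete indices, some grammar productions are indexed by an identifier context $I$ as a synthesized attribute that records the declared identifiers in each index space. In addition, the context records the types defined in the module, so that parameter indices can be computed for functions.

It is convenient to define identifier contexts as records $I$ with abstract syntax as follows:

$\begin{array}{llll}
I & ::= & \{ & \mathsf{types}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{funcs}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{tables}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{mems}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{globals}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{locals}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{labels}~(\mathtt{id}^?)^\ast, \\
& & & \mathsf{typedefs}~\mathit{functype}^\ast ~\}
\end{array}$

For each index space, such a context contains the list of identifiers assigned to the defined indices. Unnamed indices are associated with empty ($\epsilon$) entries in these lists.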

An identifier context is well-formed if no index space contains duplicate identifiers.

6.1.3.1. Conventions

To avoid unnecessary clutter, empty components are omitted when writing out identifier contexts. For example, the record $\{\}$ is shorthand for an identifier context whose components are all empty.

6.1.4. Vectors

Vectors are written as plain sequences, but with a restriction on the length of these sequences.

$\mathtt{vec}(\mathtt{A}) ::= (x{:}\mathtt{A})^n \Rightarrow x^n \quad (\text{if } n < 2^{32})$

6.2. Lexical Format

6.2.1. Characters

The text format assigns meaning to source text, which consists of a sequence of characters. Characters are assumed to be represented as valid [UNICODE] (Section 2.4) scalar values.

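$\begin{array}{lll}
\mathtt{source} & ::= & \mathtt{char}^\ast \\
\mathtt{char} & ::= & \text{U+00} ~|~ \dots ~|~ \text{U+D7FF} ~|~ \text{U+E000} ~|~ \dots ~|~ \text{U+10FFFF}
\end{array}$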

Note

While source text may contain any Unicode character in comments or string literals, the rest of the grammar is formed exclusively from the characters supported by the 7-bit ASCII subset of Unicode.

6.2.2. Tokens

The character stream in the source text is divided, from left to right, into a sequence of tokens, as defined by the following grammar.

$\begin{array}{lll}
\mathtt{token} & ::= & \mathtt{keyword} ~|~ \mathtt{u}N ~|~ \mathtt{s}N ~|~ \mathtt{f}N ~|~ \mathtt{string} ~|~ \mathtt{id} ~|~ \text{‘(’} ~|~ \text{‘)’} ~|~ \mathtt{reserved} \\
\mathtt{keyword} & ::= & (\text{‘a’} ~|~ \dots ~|~ \text{‘z’})~\mathtt{idchar}^\ast \quad (\text{if occurring as a literal terminal in the grammar}) \\
\mathtt{reserved} & ::= & \mathtt{idchar}^+
\end{array}$

Tokens are formed from the input character stream according to the longest match rule. That is, the next token always consists of the longest possible sequence of characters that is recognized by the above lexical grammar. Tokens can be separated by white space, but except for strings, they cannot themselves contain white space.

The set of keyword tokens is defined implicitly, by all occurrences of a terminal symbol in literal form, such as $\text{‘keyword’}$, in a syntactic production of this chapter.

Any token that does not fall into any of the other categories is considered reserved, and cannot occur in source text.

Note

The effect of defining the set of reserved tokens is that all tokens must be separated by either parentheses or white space. For example, $\text{‘0\$x’}$ is a single reserved token. Consequently, it is not recognized as two separate tokens $\text{‘0’}$ and $\text{‘\$x’}$, but instead disallowed. This property of tokenization is not affected by the fact that the definition of reserved tokens overlaps with other token classes.
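The longest match rule and the role of reserved tokens can be sketched in a few lines of Python. This is a simplified illustration only: it ignores strings, comments, and floating-point literals, and because the real keyword set is defined by the grammar, it merely reports whether a token is shaped like a keyword. The names `tokenize` and `classify` are assumptions of this sketch.

```python
import re

# idchar: printable ASCII except space, quotation mark, comma, semicolon,
# and brackets (as described for identifiers in this chapter).
IDCHAR = r"[0-9A-Za-z!#$%&'*+\-./:<=>?@\\^_`|~]"
TOKEN = re.compile(r"\(|\)|" + IDCHAR + r"+")
INT = re.compile(r"[+-]?(\d(_?\d)*|0x[0-9A-Fa-f](_?[0-9A-Fa-f])*)")

def classify(tok):
    """Assign a coarse token class to one maximal character run."""
    if tok in ("(", ")"):
        return "paren"
    if re.fullmatch(r"\$" + IDCHAR + r"+", tok):
        return "id"
    if INT.fullmatch(tok):
        return "int"
    if re.fullmatch(r"[a-z]" + IDCHAR + r"*", tok):
        return "keyword-shaped"
    return "reserved"

def tokenize(src):
    """Split source text into (class, text) pairs using longest match."""
    pos, out = 0, []
    while pos < len(src):
        if src[pos] in " \t\n\r":
            pos += 1
            continue
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character {src[pos]!r}")
        out.append((classify(m.group()), m.group()))
        pos = m.end()
    return out
```

Because the run of identifier characters is always consumed greedily, `0$x` comes out as one reserved token rather than an integer followed by an identifier.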

6.2.3. White Space

White space is any sequence of literal space characters, formatting characters, or comments. The allowed formatting characters correspond to a subset of the ASCII format effectors, namely, horizontal tabulation (U+09), line feed (U+0A), and carriage return (U+0D).

$\begin{array}{lll}
\mathtt{space} & ::= & (\text{‘ ’} ~|~ \mathtt{format} ~|~ \mathtt{comment})^\ast \\
\mathtt{format} & ::= & \text{U+09} ~|~ \text{U+0A} ~|~ \text{U+0D}
\end{array}$

The only relevance of white space is to separate tokens. It is otherwise ignored.

6.2.4. Comments

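A comment can either be a line comment, started with a double semicolon $\text{‘;;’}$ and extending to the end of the line, or a block comment, enclosed in delimiters $\text{‘(;’} \dots \text{‘;)’}$. Block comments can be nested.

$\begin{array}{lll}
\mathtt{comment} & ::= & \mathtt{linecomment} ~|~ \mathtt{blockcomment} \\
\mathtt{linecomment} & ::= & \text{‘;;’}~\mathtt{linechar}^\ast~(\text{U+0A} ~|~ \mathtt{eof}) \\
\mathtt{linechar} & ::= & c{:}\mathtt{char} \quad (\text{if } c \neq \text{U+0A}) \\
\mathtt{blockcomment} & ::= & \text{‘(;’}~\mathtt{blockchar}^\ast~\text{‘;)’} \\
\mathtt{blockchar} & ::= & c{:}\mathtt{char} \quad (\text{if } c \neq \text{‘;’} \land c \neq \text{‘(’}) \\
& | & \text{‘;’} \quad (\text{if the next character is not ‘)’}) \\
& | & \text{‘(’} \quad (\text{if the next character is not ‘;’}) \\
& | & \mathtt{blockcomment}
\end{array}$

Here, the pseudo token $\mathtt{eof}$ indicates the end of the input. The look-ahead restrictions on the productions for $\mathtt{blockchar}$ disambiguate the grammar such that only well-bracketed uses of block comment delimiters are allowed.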
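The nesting of block comments can be handled with a simple depth counter. The following Python sketch replaces each comment by a single space; the function name `strip_comments` is an assumption of this example, and for brevity the sketch does not recognize string literals, so comment delimiters inside strings would be misinterpreted.

```python
def strip_comments(src):
    """Replace line comments and (nested) block comments by one space each."""
    out, i, n = [], 0, len(src)
    while i < n:
        if src.startswith(";;", i):
            # Line comment: skip up to the line feed or the end of input.
            while i < n and src[i] != "\n":
                i += 1
            out.append(" ")
        elif src.startswith("(;", i):
            depth = 0
            while i < n:
                if src.startswith("(;", i):
                    depth += 1
                    i += 2
                elif src.startswith(";)", i):
                    depth -= 1
                    i += 2
                    if depth == 0:
                        break
                else:
                    i += 1
            if depth != 0:
                raise ValueError("unterminated block comment")
            out.append(" ")
        else:
            out.append(src[i])
            i += 1
    return "".join(out)
```

Only well-bracketed delimiters close a comment, so `(; a (; b ;) c ;)` is removed as a whole.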

Note

Any formatting and control characters are allowed inside comments.

6.3. Values

The grammar productions in this section define lexical syntax, hence no white space is allowed.

6.3.1. Integers

All integers can be written in either decimal or hexadecimal notation. In both cases, digits can optionally be separated by underscores.

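$\begin{array}{llll}
\mathtt{sign} & ::= & \epsilon \Rightarrow {+} ~|~ \text{‘+’} \Rightarrow {+} ~|~ \text{‘-’} \Rightarrow {-} \\
\mathtt{digit} & ::= & \text{‘0’} \Rightarrow 0 ~|~ \dots ~|~ \text{‘9’} \Rightarrow 9 \\
\mathtt{hexdigit} & ::= & d{:}\mathtt{digit} \Rightarrow d \\
& | & \text{‘A’} \Rightarrow 10 ~|~ \dots ~|~ \text{‘F’} \Rightarrow 15 \\
& | & \text{‘a’} \Rightarrow 10 ~|~ \dots ~|~ \text{‘f’} \Rightarrow 15 \\
\mathtt{num} & ::= & d{:}\mathtt{digit} \Rightarrow d \\
& | & n{:}\mathtt{num}~\text{‘\_’}^?~d{:}\mathtt{digit} \Rightarrow 10 \cdot n + d \\
\mathtt{hexnum} & ::= & h{:}\mathtt{hexdigit} \Rightarrow h \\
& | & n{:}\mathtt{hexnum}~\text{‘\_’}^?~h{:}\mathtt{hexdigit} \Rightarrow 16 \cdot n + h
\end{array}$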

The allowed syntax for integer literals depends on size and signedness. Moreover, their value must lie within the range of the respective type.

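$\begin{array}{llll}
\mathtt{u}N & ::= & n{:}\mathtt{num} \Rightarrow n & (\text{if } n < 2^N) \\
& | & \text{‘0x’}~n{:}\mathtt{hexnum} \Rightarrow n & (\text{if } n < 2^N) \\
\mathtt{s}N & ::= & {\pm}{:}\mathtt{sign}~n{:}\mathtt{num} \Rightarrow \pm n & (\text{if } -2^{N-1} \leq \pm n < 2^{N-1}) \\
& | & {\pm}{:}\mathtt{sign}~\text{‘0x’}~n{:}\mathtt{hexnum} \Rightarrow \pm n & (\text{if } -2^{N-1} \leq \pm n < 2^{N-1})
\end{array}$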

Uninterpreted integers can be written as either signed or unsigned, and are normalized to unsigned in the abstract syntax.

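$\begin{array}{llll}
\mathtt{i}N & ::= & n{:}\mathtt{u}N \Rightarrow n \\
& | & i{:}\mathtt{s}N \Rightarrow n & (\text{if } i = \mathrm{signed}(n))
\end{array}$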
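The following Python sketch shows one way to read an uninterpreted integer literal and normalize it to its unsigned value, following the prose above: digits may be separated by single underscores, an unsigned literal must be below 2^N, and a signed literal must lie in the signed range before being mapped to its two's complement representation. The function name `parse_iN` is an assumption of this example.

```python
import re

NUM = r"[0-9](?:_?[0-9])*"
HEXNUM = r"[0-9A-Fa-f](?:_?[0-9A-Fa-f])*"
INT = re.compile(rf"([+-]?)(?:0x({HEXNUM})|({NUM}))")

def parse_iN(text, N):
    """Parse an uninterpreted N-bit integer literal; return it as unsigned."""
    m = INT.fullmatch(text)
    if m is None:
        raise ValueError(f"malformed integer literal: {text!r}")
    sign, hexdigits, decdigits = m.groups()
    if hexdigits is not None:
        magnitude = int(hexdigits.replace("_", ""), 16)
    else:
        magnitude = int(decdigits.replace("_", ""), 10)
    if sign == "":
        # Unsigned form: the value must be below 2^N.
        if magnitude >= 2**N:
            raise ValueError("constant out of range")
        return magnitude
    # Signed form: the value must lie in [-2^(N-1), 2^(N-1)).
    value = -magnitude if sign == "-" else magnitude
    if not -(2**(N - 1)) <= value < 2**(N - 1):
        raise ValueError("constant out of range")
    return value % 2**N  # two's complement normalization
```

For example, `-1` and `0xFFFF_FFFF` denote the same 32-bit value in the abstract syntax.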

6.3.2. Floating-Point

Floating-point values can be represented in either decimal or hexadecimal notation.

The value of a literal must not lie outside the representable range of the corresponding [IEEE-754-2019] type (that is, a numeric value must not overflow to $\pm$ infinity), but it may be rounded to the nearest representable value.

Note

Rounding can be prevented by using hexadecimal notation with no more significant bits than supported by the required type.
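The difference between a rounded decimal literal and an exact hexadecimal one can be observed in Python, whose `float` type is an IEEE 754 binary64 value and therefore corresponds to f64. The helper name `is_exact_f64` is an assumption of this sketch. Python's `float.fromhex` accepts a hexadecimal notation that coincides with the text format for the simple literal used here (no underscores).

```python
from fractions import Fraction

def is_exact_f64(decimal_literal):
    """Return True if the decimal literal is representable without rounding."""
    return Fraction(float(decimal_literal)) == Fraction(decimal_literal)

# The f64 value nearest to one tenth, written with 13 hexadecimal
# fraction digits, i.e. no more significant bits than f64 supports.
tenth = float.fromhex("0x1.999999999999ap-4")
```

Here the decimal literal `0.1` is rounded when converted, whereas the hexadecimal literal denotes the resulting f64 value exactly.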

Floating-point values may also be written as constants for infinity or canonical NaN (not a number). Furthermore, arbitrary NaN values may be expressed by providing an explicit payload value.

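$\begin{array}{llll}
\mathtt{f}N & ::= & {\pm}{:}\mathtt{sign}~z{:}\mathtt{f}N\mathtt{mag} \Rightarrow \pm z \\
\mathtt{f}N\mathtt{mag} & ::= & z{:}\mathtt{float} \Rightarrow \mathrm{float}_N(z) & (\text{if } \mathrm{float}_N(z) \neq \pm\infty) \\
& | & z{:}\mathtt{hexfloat} \Rightarrow \mathrm{float}_N(z) & (\text{if } \mathrm{float}_N(z) \neq \pm\infty) \\
& | & \text{‘inf’} \Rightarrow \infty \\
& | & \text{‘nan’} \Rightarrow \mathsf{nan}(2^{\mathrm{signif}(N)-1}) \\
& | & \text{‘nan:0x’}~n{:}\mathtt{hexnum} \Rightarrow \mathsf{nan}(n) & (\text{if } 1 \leq n < 2^{\mathrm{signif}(N)})
\end{array}$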

6.3.3. Strings

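Strings denote sequences of bytes that can represent both textual and binary data. They are enclosed in quotation marks and may contain any character other than ASCII control characters, quotation marks ($\text{‘"’}$), or backslash ($\text{‘\textbackslash’}$), except when expressed with an escape sequence.

$\begin{array}{llll}
\mathtt{string} & ::= & \text{‘"’}~(b^\ast{:}\mathtt{stringelem})^\ast~\text{‘"’} \Rightarrow \mathrm{concat}((b^\ast)^\ast) & (\text{if } |\mathrm{concat}((b^\ast)^\ast)| < 2^{32}) \\
\mathtt{stringelem} & ::= & c{:}\mathtt{stringchar} \Rightarrow \mathrm{utf8}(c) \\
& | & \text{‘\textbackslash’}~n{:}\mathtt{hexdigit}~m{:}\mathtt{hexdigit} \Rightarrow 16 \cdot n + m
\end{array}$

Each character in a string literal represents the byte sequence corresponding to its UTF-8 [UNICODE] (Section 2.5) encoding, except for hexadecimal escape sequences $\text{‘\textbackslash}hh\text{’}$, which represent raw bytes of the respective value.

$\begin{array}{llll}
\mathtt{stringchar} & ::= & c{:}\mathtt{char} \Rightarrow c & (\text{if } c \geq \text{U+20} \land c \neq \text{U+7F} \land c \neq \text{‘"’} \land c \neq \text{‘\textbackslash’}) \\
& | & \text{‘\textbackslash t’} \Rightarrow \text{U+09} \\
& | & \text{‘\textbackslash n’} \Rightarrow \text{U+0A} \\
& | & \text{‘\textbackslash r’} \Rightarrow \text{U+0D} \\
& | & \text{‘\textbackslash "’} \Rightarrow \text{U+22} \\
& | & \text{‘\textbackslash '’} \Rightarrow \text{U+27} \\
& | & \text{‘\textbackslash\textbackslash’} \Rightarrow \text{U+5C} \\
& | & \text{‘\textbackslash u\{’}~n{:}\mathtt{hexnum}~\text{‘\}’} \Rightarrow \text{U+}(n) & (\text{if } n < \mathtt{0xD800} \lor \mathtt{0xE000} \leq n < \mathtt{0x110000})
\end{array}$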
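The mapping from a string literal to bytes can be sketched as follows. The function name `decode_string` is an assumption of this example; it takes the text between the enclosing quotation marks, encodes ordinary characters and Unicode escapes as UTF-8, and emits hexadecimal escapes as raw bytes, as described above.

```python
import re

_ESCAPES = {"t": 0x09, "n": 0x0A, "r": 0x0D, '"': 0x22, "'": 0x27, "\\": 0x5C}
_ELEM = re.compile(r"""
    \\([0-9A-Fa-f]{2})                        # \hh: one raw byte
  | \\u\{([0-9A-Fa-f](?:_?[0-9A-Fa-f])*)\}    # \u{hexnum}: Unicode scalar value
  | \\([tnr"'\\])                             # single-character escapes
  | ([^"\\\x00-\x1f\x7f])                     # any other permitted character
""", re.VERBOSE)

def decode_string(body):
    """Decode the text between the quotation marks of a string literal."""
    out, pos = bytearray(), 0
    while pos < len(body):
        m = _ELEM.match(body, pos)
        if m is None:
            raise ValueError(f"malformed string element at offset {pos}")
        raw, scalar, esc, char = m.groups()
        if raw is not None:
            out.append(int(raw, 16))
        elif scalar is not None:
            n = int(scalar.replace("_", ""), 16)
            if 0xD800 <= n < 0xE000 or n >= 0x110000:
                raise ValueError("not a Unicode scalar value")
            out += chr(n).encode("utf-8")
        elif esc is not None:
            out.append(_ESCAPES[esc])
        else:
            out += char.encode("utf-8")
        pos = m.end()
    return bytes(out)
```

A string containing a raw byte escape such as `\ff` may decode to bytes that are not valid UTF-8, which is why such strings are not necessarily valid names.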

6.3.4. Names

Names are strings denoting a literal character sequence. A name string must form a valid UTF-8 encoding as defined by [UNICODE] (Section 2.5) and is interpreted as a string of Unicode scalar values.

$\mathtt{name} ::= b^\ast{:}\mathtt{string} \Rightarrow c^\ast \quad (\text{if } b^\ast = \mathrm{utf8}(c^\ast))$

Note

Presuming the source text is itself encoded correctly, strings that do not contain any uses of hexadecimal byte escapes are always valid names.

6.3.5. Identifiers

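Indices can be given in both numeric and symbolic form. Symbolic identifiers that stand in lieu of indices start with $\text{‘\$’}$, followed by any sequence of printable ASCII characters that does not contain a space, quotation mark, comma, semicolon, or bracket.

$\begin{array}{lll}
\mathtt{id} & ::= & \text{‘\$’}~\mathtt{idchar}^+ \\
\mathtt{idchar} & ::= & \text{‘0’} ~|~ \dots ~|~ \text{‘9’} ~|~ \text{‘A’} ~|~ \dots ~|~ \text{‘Z’} ~|~ \text{‘a’} ~|~ \dots ~|~ \text{‘z’} \\
& | & \text{‘!’} ~|~ \text{‘\#’} ~|~ \text{‘\$’} ~|~ \text{‘\%’} ~|~ \text{‘\&’} ~|~ \text{‘'’} ~|~ \text{‘*’} ~|~ \text{‘+’} ~|~ \text{‘-’} ~|~ \text{‘.’} ~|~ \text{‘/’} \\
& | & \text{‘:’} ~|~ \text{‘<’} ~|~ \text{‘=’} ~|~ \text{‘>’} ~|~ \text{‘?’} ~|~ \text{‘@’} ~|~ \text{‘\textbackslash’} ~|~ \text{‘\^{}’} ~|~ \text{‘\_’} ~|~ \text{‘`’} ~|~ \text{‘|’} ~|~ \text{‘\~{}’}
\end{array}$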

6.3.5.1. Conventions

The expansion rules of some abbreviations require insertion of a fresh identifier. That may be any syntactically valid identifier that does not already occur in the given source text.

6.4. Types

6.4.1. Value Types

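$\begin{array}{llll}
\mathtt{valtype} & ::= & \text{‘i32’} & \Rightarrow \mathsf{i32} \\
& | & \text{‘i64’} & \Rightarrow \mathsf{i64} \\
& | & \text{‘f32’} & \Rightarrow \mathsf{f32} \\
& | & \text{‘f64’} & \Rightarrow \mathsf{f64}
\end{array}$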

6.4.2. Result Types

$\mathtt{resulttype} ::= (t{:}\mathtt{result})^? \Rightarrow [t^?]$

Note

In future versions of WebAssembly, this scheme may be extended to support multiple results or more general result types.

6.4.3. Function Types

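$\begin{array}{llll}
\mathtt{functype} & ::= & \text{‘(’}~\text{‘func’}~t_1^\ast{:}\mathtt{vec}(\mathtt{param})~t_2^\ast{:}\mathtt{vec}(\mathtt{result})~\text{‘)’} & \Rightarrow [t_1^\ast] \to [t_2^\ast] \\
\mathtt{param} & ::= & \text{‘(’}~\text{‘param’}~\mathtt{id}^?~t{:}\mathtt{valtype}~\text{‘)’} & \Rightarrow t \\
\mathtt{result} & ::= & \text{‘(’}~\text{‘result’}~t{:}\mathtt{valtype}~\text{‘)’} & \Rightarrow t
\end{array}$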

6.4.3.1. Abbreviations

Multiple anonymous parameters or results may be combined into a single declaration:

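$\begin{array}{lll}
\text{‘(’}~\text{‘param’}~\mathtt{valtype}^\ast~\text{‘)’} & \equiv & (\text{‘(’}~\text{‘param’}~\mathtt{valtype}~\text{‘)’})^\ast \\
\text{‘(’}~\text{‘result’}~\mathtt{valtype}^\ast~\text{‘)’} & \equiv & (\text{‘(’}~\text{‘result’}~\mathtt{valtype}~\text{‘)’})^\ast
\end{array}$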

6.4.4. Limits

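$\begin{array}{llll}
\mathtt{limits} & ::= & n{:}\mathtt{u32} & \Rightarrow \{\mathsf{min}~n, \mathsf{max}~\epsilon\} \\
& | & n{:}\mathtt{u32}~~m{:}\mathtt{u32} & \Rightarrow \{\mathsf{min}~n, \mathsf{max}~m\}
\end{array}$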

6.4.5. Memory Types

$\mathtt{memtype} ::= \mathit{lim}{:}\mathtt{limits} \Rightarrow \mathit{lim}$

6.4.6. Table Types

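$\begin{array}{llll}
\mathtt{tabletype} & ::= & \mathit{lim}{:}\mathtt{limits}~~\mathit{et}{:}\mathtt{elemtype} & \Rightarrow \mathit{lim}~\mathit{et} \\
\mathtt{elemtype} & ::= & \text{‘funcref’} & \Rightarrow \mathsf{funcref}
\end{array}$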

Note

Additional element types may be introduced in future versions of WebAssembly.

6.4.7. Global Types

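$\begin{array}{llll}
\mathtt{globaltype} & ::= & t{:}\mathtt{valtype} & \Rightarrow \mathsf{const}~t \\
& | & \text{‘(’}~\text{‘mut’}~t{:}\mathtt{valtype}~\text{‘)’} & \Rightarrow \mathsf{var}~t
\end{array}$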

6.5. Instructions

Instructions are syntactically distinguished into plain and structured instructions.

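$\begin{array}{llll}
\mathtt{instr}_I & ::= & \mathit{in}{:}\mathtt{plaininstr}_I & \Rightarrow \mathit{in} \\
& | & \mathit{in}{:}\mathtt{blockinstr}_I & \Rightarrow \mathit{in}
\end{array}$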

In addition, as a syntactic abbreviation, instructions can be written as S-expressions in folded form, to group them visually.

6.5.1. Labels

Structured control instructions can be annotated with a symbolic label identifier. They are the only symbolic identifiers that can be bound locally in an instruction sequence. The following grammar handles the corresponding update to the identifier context by composing the context with an additional label entry.

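$\begin{array}{llll}
\mathtt{label}_I & ::= & v{:}\mathtt{id} \Rightarrow \{\mathsf{labels}~v\} \oplus I & (\text{if } v \notin I.\mathsf{labels}) \\
& | & \epsilon \Rightarrow \{\mathsf{labels}~(\epsilon)\} \oplus I
\end{array}$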

Note

The new label entry is inserted at the beginning of the label list in the identifier context. This effectively shifts all existing labels up by one, mirroring the fact that control instructions are indexed relatively rather than absolutely.
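The effect of prepending labels can be sketched in Python with the identifier context modeled as a dictionary holding a `labels` list. The function names `bind_label` and `resolve_label` and this representation are assumptions of the example, not definitions from the specification.

```python
def bind_label(context, label):
    """Compose the context with a new innermost label (None if unnamed)."""
    if label is not None and label in context["labels"]:
        raise ValueError(f"duplicate label {label}")
    # The new entry goes to the front, shifting existing labels up by one.
    return {**context, "labels": [label] + context["labels"]}

def resolve_label(context, ref):
    """Resolve a numeric or symbolic label reference to a relative index."""
    if isinstance(ref, int):
        return ref
    return context["labels"].index(ref)
```

With an outer block labeled `$outer` and an inner one labeled `$inner`, a branch inside the inner block resolves `$inner` to 0 and `$outer` to 1.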

6.5.2. Control Instructions

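Structured control instructions can bind an optional symbolic label identifier. The same label identifier may optionally be repeated after the corresponding $\mathtt{end}$ and $\mathtt{else}$ pseudo instructions, to indicate the matching delimiters.

All other control instructions are represented verbatim.

$\begin{array}{llll}
\mathtt{blockinstr}_I & ::= & \text{‘block’}~I'{:}\mathtt{label}_I~\mathit{rt}{:}\mathtt{resulttype}~(\mathit{in}{:}\mathtt{instr}_{I'})^\ast~\text{‘end’}~\mathtt{id}^? & \Rightarrow \mathsf{block}~\mathit{rt}~\mathit{in}^\ast~\mathsf{end} \quad (\text{if } \mathtt{id}^? = \epsilon \lor \mathtt{id}^? = \mathtt{label}) \\
& | & \text{‘loop’}~I'{:}\mathtt{label}_I~\mathit{rt}{:}\mathtt{resulttype}~(\mathit{in}{:}\mathtt{instr}_{I'})^\ast~\text{‘end’}~\mathtt{id}^? & \Rightarrow \mathsf{loop}~\mathit{rt}~\mathit{in}^\ast~\mathsf{end} \quad (\text{if } \mathtt{id}^? = \epsilon \lor \mathtt{id}^? = \mathtt{label}) \\
& | & \text{‘if’}~I'{:}\mathtt{label}_I~\mathit{rt}{:}\mathtt{resulttype}~(\mathit{in}_1{:}\mathtt{instr}_{I'})^\ast~\text{‘else’}~\mathtt{id}_1^?~(\mathit{in}_2{:}\mathtt{instr}_{I'})^\ast~\text{‘end’}~\mathtt{id}_2^? & \Rightarrow \mathsf{if}~\mathit{rt}~\mathit{in}_1^\ast~\mathsf{else}~\mathit{in}_2^\ast~\mathsf{end} \quad (\text{if } \mathtt{id}_1^? = \epsilon \lor \mathtt{id}_1^? = \mathtt{label},~ \mathtt{id}_2^? = \epsilon \lor \mathtt{id}_2^? = \mathtt{label}) \\
\mathtt{plaininstr}_I & ::= & \text{‘unreachable’} & \Rightarrow \mathsf{unreachable} \\
& | & \text{‘nop’} & \Rightarrow \mathsf{nop} \\
& | & \text{‘br’}~l{:}\mathtt{labelidx}_I & \Rightarrow \mathsf{br}~l \\
& | & \text{‘br\_if’}~l{:}\mathtt{labelidx}_I & \Rightarrow \mathsf{br\_if}~l \\
& | & \text{‘br\_table’}~l^\ast{:}\mathtt{vec}(\mathtt{labelidx}_I)~l_N{:}\mathtt{labelidx}_I & \Rightarrow \mathsf{br\_table}~l^\ast~l_N \\
& | & \text{‘return’} & \Rightarrow \mathsf{return} \\
& | & \text{‘call’}~x{:}\mathtt{funcidx}_I & \Rightarrow \mathsf{call}~x \\
& | & \text{‘call\_indirect’}~x, I'{:}\mathtt{typeuse}_I & \Rightarrow \mathsf{call\_indirect}~x \quad (\text{if } I' = \{\})
\end{array}$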

Note

The side condition stating that the identifier context $I'$ must be empty in the rule for $\mathtt{call\_indirect}$ enforces that no identifier can be bound in any $\mathtt{param}$ declaration appearing in the type annotation.

6.5.2.1. Abbreviations

The $\text{‘else’}$ keyword of an $\text{‘if’}$ instruction can be omitted if the following instruction sequence is empty.

$\text{‘if’}~\mathtt{label}~\mathtt{resulttype}~\mathtt{instr}^\ast~\text{‘end’} \equiv \text{‘if’}~\mathtt{label}~\mathtt{resulttype}~\mathtt{instr}^\ast~\text{‘else’}~\text{‘end’}$

6.5.3. Parametric Instructions

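$\begin{array}{llll}
\mathtt{plaininstr}_I & ::= & \dots \\
& | & \text{‘drop’} & \Rightarrow \mathsf{drop} \\
& | & \text{‘select’} & \Rightarrow \mathsf{select}
\end{array}$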

6.5.4. Variable Instructions

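$\begin{array}{llll}
\mathtt{plaininstr}_I & ::= & \dots \\
& | & \text{‘local.get’}~x{:}\mathtt{localidx}_I & \Rightarrow \mathsf{local.get}~x \\
& | & \text{‘local.set’}~x{:}\mathtt{localidx}_I & \Rightarrow \mathsf{local.set}~x \\
& | & \text{‘local.tee’}~x{:}\mathtt{localidx}_I & \Rightarrow \mathsf{local.tee}~x \\
& | & \text{‘global.get’}~x{:}\mathtt{globalidx}_I & \Rightarrow \mathsf{global.get}~x \\
& | & \text{‘global.set’}~x{:}\mathtt{globalidx}_I & \Rightarrow \mathsf{global.set}~x
\end{array}$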

6.5.5. Memory Instructions

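The offset and alignment immediates to memory instructions are optional. The offset defaults to $0$, the alignment to the storage size of the respective memory access, which is its natural alignment. Lexically, an $\mathtt{offset}$ or $\mathtt{align}$ phrase is considered a single keyword token, so no white space is allowed around the $\text{‘=’}$.

$\begin{array}{llll}
\mathtt{memarg}_N & ::= & o{:}\mathtt{offset}~~a{:}\mathtt{align}_N & \Rightarrow \{\mathsf{align}~n, \mathsf{offset}~o\} \quad (\text{if } a = 2^n) \\
\mathtt{offset} & ::= & \text{‘offset=’}~o{:}\mathtt{u32} & \Rightarrow o \\
& | & \epsilon & \Rightarrow 0 \\
\mathtt{align}_N & ::= & \text{‘align=’}~a{:}\mathtt{u32} & \Rightarrow a \\
& | & \epsilon & \Rightarrow N
\end{array}$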

6.5.6. Numeric Instructions

$\mathtt{plaininstr}_I ::= \dots$

6.5.7. Folded Instructions

Instructions can be written as S-expressions by grouping them into folded form. In that notation, an instruction is wrapped in parentheses and optionally includes nested folded instructions to indicate its operands.

In the case of block instructions, the folded form omits the $\text{‘end’}$ delimiter. For $\mathtt{if}$ instructions, both branches have to be wrapped into nested S-expressions, headed by the keywords $\text{‘then’}$ and $\text{‘else’}$.

The set of all phrases defined by the following abbreviations recursively forms the auxiliary syntactic class $\mathtt{foldedinstr}$. Such a folded instruction can appear anywhere a regular instruction can.

Note

For example, the instruction sequence

(local.get $x) (i32.const 2) i32.add (i32.const 3) i32.mul

can be folded into

(i32.mul (i32.add (local.get $x) (i32.const 2)) (i32.const 3))

Folded instructions are solely syntactic sugar; no additional syntactic or type-based checking is implied.
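Unfolding a folded plain instruction amounts to emitting its nested operands first and the instruction itself last. The following Python sketch illustrates this for plain instructions only; it assumes that the S-expression has already been parsed into nested lists, with strings standing for keywords and immediates. The function name `unfold` and this representation are assumptions of the example, and block, loop, and if instructions are not handled.

```python
def unfold(folded):
    """Flatten a folded plain instruction into a linear instruction sequence."""
    head, *rest = folded
    immediates = [r for r in rest if isinstance(r, str)]
    operands = [r for r in rest if isinstance(r, list)]
    flat = []
    for operand in operands:
        flat.extend(unfold(operand))  # operands come first, in order
    flat.append(" ".join([head] + immediates))
    return flat

example = ["i32.mul",
           ["i32.add", ["local.get", "$x"], ["i32.const", "2"]],
           ["i32.const", "3"]]
```

Applied to `example`, the sketch yields the same linear order as the unfolded sequence in the note above.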

6.5.8. Expressions

Expressions are written as instruction sequences. No explicit $\text{‘end’}$ keyword is included, since they only occur in bracketed positions.

$\mathtt{expr} ::= (\mathit{in}{:}\mathtt{instr})^\ast \Rightarrow \mathit{in}^\ast~\mathsf{end}$

6.6. Modules

6.6.1. Indices

Indices can be given either in raw numeric form or as symbolic identifiers when bound by a respective construct. Such identifiers are looked up in the suitable space of the identifier context $I$.

6.6.2. Types

Type definitions can bind a symbolic type identifier.

$\mathtt{type} ::= \text{‘(’}~\text{‘type’}~\mathtt{id}^?~\mathit{ft}{:}\mathtt{functype}~\text{‘)’} \Rightarrow \mathit{ft}$

6.6.3. Type Uses

A type use is a reference to a type definition. It may optionally be augmented by explicit inlined parameter and result declarations. That allows binding symbolic identifiers to name the local indices of parameters. If inline declarations are given, then their types must match the referenced function type.

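$\begin{array}{llll}
\mathtt{typeuse}_I & ::= & \text{‘(’}~\text{‘type’}~x{:}\mathtt{typeidx}_I~\text{‘)’} & \Rightarrow x, I' \quad (\text{if } I.\mathsf{typedefs}[x] = [t_1^n] \to [t_2^\ast] \land I' = \{\mathsf{locals}~(\epsilon)^n\}) \\
& | & \text{‘(’}~\text{‘type’}~x{:}\mathtt{typeidx}_I~\text{‘)’}~(t_1{:}\mathtt{param})^\ast~(t_2{:}\mathtt{result})^\ast & \Rightarrow x, I' \quad (\text{if } I.\mathsf{typedefs}[x] = [t_1^\ast] \to [t_2^\ast] \land I' = \{\mathsf{locals}~\mathrm{id}(\mathtt{param})^\ast\}~\text{well-formed})
\end{array}$

The synthesized attribute of a $\mathtt{typeuse}$ is a pair consisting of both the used type index and the updated identifier context including possible parameter identifiers. The following auxiliary function extracts optional identifiers from parameters:

$\mathrm{id}(\text{‘(’}~\text{‘param’}~\mathtt{id}^?~\dots~\text{‘)’}) = \mathtt{id}^?$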

Note

Both productions overlap for the case that the function type is $[] \to []$. However, in that case, they also produce the same results, so that the choice is immaterial.

The well-formedness condition on $I'$ ensures that the parameters do not contain duplicate identifiers.

6.6.3.1. Abbreviations

A $\mathtt{typeuse}$ may also be replaced entirely by inline parameter and result declarations. In that case, a type index is automatically inserted:

$(t_1{:}\mathtt{param})^\ast~(t_2{:}\mathtt{result})^\ast \equiv \text{‘(’}~\text{‘type’}~x~\text{‘)’}~\mathtt{param}^\ast~\mathtt{result}^\ast$

where $x$ is the smallest existing type index whose definition in the current module is the function type $[t_1^\ast] \to [t_2^\ast]$. If no such index exists, then a new type definition of the form

$\text{‘(’}~\text{‘type’}~\text{‘(’}~\text{‘func’}~\mathtt{param}^\ast~\mathtt{result}^\ast~\text{‘)’}~\text{‘)’}$

is inserted at the end of the module.

Abbreviations are expanded in the order they appear, such that previously inserted type definitions are reused by consecutive expansions.
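The lookup-or-append behavior of this abbreviation can be sketched in Python. The module's type definitions are modeled as a list of `(params, results)` tuples; this representation and the function name `type_index_for` are assumptions of the example.

```python
def type_index_for(typedefs, params, results):
    """Return the index of [params] -> [results], appending it if absent."""
    functype = (tuple(params), tuple(results))
    for index, existing in enumerate(typedefs):
        if existing == functype:
            return index  # the smallest existing index wins
    typedefs.append(functype)  # new definition goes at the end of the module
    return len(typedefs) - 1
```

Because the list is updated in place, a definition inserted by one expansion is found and reused by the next expansion of the same function type.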

6.6.4. Imports

The descriptors in imports can bind a symbolic function, table, memory, or global identifier.

6.6.4.1. Abbreviations

As an abbreviation, imports may also be specified inline with function, table, memory, or global definitions; see the respective sections.

6.6.5. Functions

Function definitions can bind a symbolic function identifier, and local identifiers for its parameters and locals.

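$\begin{array}{llll}
\mathtt{func}_I & ::= & \text{‘(’}~\text{‘func’}~\mathtt{id}^?~x, I'{:}\mathtt{typeuse}_I~(t{:}\mathtt{local})^\ast~(\mathit{in}{:}\mathtt{instr}_{I''})^\ast~\text{‘)’} & \Rightarrow \{\mathsf{type}~x, \mathsf{locals}~t^\ast, \mathsf{body}~\mathit{in}^\ast~\mathsf{end}\} \quad (\text{if } I'' = I \oplus I' \oplus \{\mathsf{locals}~\mathrm{id}(\mathtt{local})^\ast\}~\text{well-formed}) \\
\mathtt{local} & ::= & \text{‘(’}~\text{‘local’}~\mathtt{id}^?~t{:}\mathtt{valtype}~\text{‘)’} & \Rightarrow t
\end{array}$

The definition of the local identifier context $I''$ uses the following auxiliary function to extract optional identifiers from locals:

$\mathrm{id}(\text{‘(’}~\text{‘local’}~\mathtt{id}^?~\dots~\text{‘)’}) = \mathtt{id}^?$

Note

The well-formedness condition on $I''$ ensures that parameters and locals do not contain duplicate identifiers.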

6.6.5.1. Abbreviations

Multiple anonymous locals may be combined into a single declaration:

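$\text{‘(’}~\text{‘local’}~\mathtt{valtype}^\ast~\text{‘)’} \equiv (\text{‘(’}~\text{‘local’}~\mathtt{valtype}~\text{‘)’})^\ast$

Functions can be defined as imports or exports inline:

$\begin{array}{lll}
\text{‘(’}~\text{‘func’}~\mathtt{id}^?~\text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘)’}~\mathtt{typeuse}~\text{‘)’} & \equiv & \text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘(’}~\text{‘func’}~\mathtt{id}^?~\mathtt{typeuse}~\text{‘)’}~\text{‘)’} \\
\text{‘(’}~\text{‘func’}~\mathtt{id}^?~\text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘)’}~\dots~\text{‘)’} & \equiv & \text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘(’}~\text{‘func’}~\mathtt{id}'~\text{‘)’}~\text{‘)’}~~\text{‘(’}~\text{‘func’}~\mathtt{id}'~\dots~\text{‘)’} \quad (\text{if } \mathtt{id}' = \mathtt{id}^? \neq \epsilon \lor \mathtt{id}'~\text{fresh})
\end{array}$

The latter abbreviation can be applied repeatedly, with “$\dots$” containing another import or export.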

6.6.6. Tables

Table definitions can bind a symbolic table identifier.

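$\mathtt{table}_I ::= \text{‘(’}~\text{‘table’}~\mathtt{id}^?~\mathit{tt}{:}\mathtt{tabletype}~\text{‘)’} \Rightarrow \{\mathsf{type}~\mathit{tt}\}$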

6.6.6.1. Abbreviations

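An element segment can be given inline with a table definition, in which case its offset is $0$ and the limits of the table type are inferred from the length of the given segment:

$\text{‘(’}~\text{‘table’}~\mathtt{id}^?~\mathtt{elemtype}~\text{‘(’}~\text{‘elem’}~x^n{:}\mathtt{vec}(\mathtt{funcidx})~\text{‘)’}~\text{‘)’} \equiv \text{‘(’}~\text{‘table’}~\mathtt{id}'~n~n~\mathtt{elemtype}~\text{‘)’}~~\text{‘(’}~\text{‘elem’}~\mathtt{id}'~\text{‘(’}~\text{‘i32.const’}~\text{‘0’}~\text{‘)’}~\mathtt{vec}(\mathtt{funcidx})~\text{‘)’} \quad (\text{if } \mathtt{id}' = \mathtt{id}^? \neq \epsilon \lor \mathtt{id}'~\text{fresh})$

Tables can be defined as imports or exports inline:

$\begin{array}{lll}
\text{‘(’}~\text{‘table’}~\mathtt{id}^?~\text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘)’}~\mathtt{tabletype}~\text{‘)’} & \equiv & \text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘(’}~\text{‘table’}~\mathtt{id}^?~\mathtt{tabletype}~\text{‘)’}~\text{‘)’} \\
\text{‘(’}~\text{‘table’}~\mathtt{id}^?~\text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘)’}~\dots~\text{‘)’} & \equiv & \text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘(’}~\text{‘table’}~\mathtt{id}'~\text{‘)’}~\text{‘)’}~~\text{‘(’}~\text{‘table’}~\mathtt{id}'~\dots~\text{‘)’} \quad (\text{if } \mathtt{id}' = \mathtt{id}^? \neq \epsilon \lor \mathtt{id}'~\text{fresh})
\end{array}$

The latter abbreviation can be applied repeatedly, with “$\dots$” containing another import or export or an inline elements segment.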

6.6.7. Memories

Memory definitions can bind a symbolic memory identifier.

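$\mathtt{mem}_I ::= \text{‘(’}~\text{‘memory’}~\mathtt{id}^?~\mathit{mt}{:}\mathtt{memtype}~\text{‘)’} \Rightarrow \{\mathsf{type}~\mathit{mt}\}$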

6.6.7.1. Abbreviations

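A data segment can be given inline with a memory definition, in which case its offset is $0$ and the limits of the memory type are inferred from the length of the data, rounded up to page size:

$\text{‘(’}~\text{‘memory’}~\mathtt{id}^?~\text{‘(’}~\text{‘data’}~b^n{:}\mathtt{datastring}~\text{‘)’}~\text{‘)’} \equiv \text{‘(’}~\text{‘memory’}~\mathtt{id}'~m~m~\text{‘)’}~~\text{‘(’}~\text{‘data’}~\mathtt{id}'~\text{‘(’}~\text{‘i32.const’}~\text{‘0’}~\text{‘)’}~\mathtt{datastring}~\text{‘)’} \quad (\text{if } \mathtt{id}' = \mathtt{id}^? \neq \epsilon \lor \mathtt{id}'~\text{fresh},~ m = \mathrm{ceil}(n / 64\,\mathrm{Ki}))$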
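The rounding of an inline data segment to whole pages is a ceiling division by the page size of 64 Ki bytes. A minimal Python sketch, in which the function name `inferred_memory_limits` and the dictionary representation of limits are assumptions of the example:

```python
PAGE_SIZE = 64 * 1024  # 64 Ki bytes per page

def inferred_memory_limits(data):
    """Infer the limits of a memory defined with an inline data segment."""
    pages = -(-len(data) // PAGE_SIZE)  # ceiling division
    return {"min": pages, "max": pages}
```

A single byte of data therefore already yields a memory of one page, and exactly 65536 bytes still fit into one page.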

Memories can be defined as imports or exports inline:

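$\begin{array}{lll}
\text{‘(’}~\text{‘memory’}~\mathtt{id}^?~\text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘)’}~\mathtt{memtype}~\text{‘)’} & \equiv & \text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘(’}~\text{‘memory’}~\mathtt{id}^?~\mathtt{memtype}~\text{‘)’}~\text{‘)’} \\
\text{‘(’}~\text{‘memory’}~\mathtt{id}^?~\text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘)’}~\dots~\text{‘)’} & \equiv & \text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘(’}~\text{‘memory’}~\mathtt{id}'~\text{‘)’}~\text{‘)’}~~\text{‘(’}~\text{‘memory’}~\mathtt{id}'~\dots~\text{‘)’} \quad (\text{if } \mathtt{id}' = \mathtt{id}^? \neq \epsilon \lor \mathtt{id}'~\text{fresh})
\end{array}$

The latter abbreviation can be applied repeatedly, with “$\dots$” containing another import or export or an inline data segment.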

6.6.8. Globals

Global definitions can bind a symbolic global identifier.

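$\mathtt{global}_I ::= \text{‘(’}~\text{‘global’}~\mathtt{id}^?~\mathit{gt}{:}\mathtt{globaltype}~e{:}\mathtt{expr}_I~\text{‘)’} \Rightarrow \{\mathsf{type}~\mathit{gt}, \mathsf{init}~e\}$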

6.6.8.1. Abbreviations

Globals can be defined as imports or exports inline:

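$\begin{array}{lll}
\text{‘(’}~\text{‘global’}~\mathtt{id}^?~\text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘)’}~\mathtt{globaltype}~\text{‘)’} & \equiv & \text{‘(’}~\text{‘import’}~\mathtt{name}_1~\mathtt{name}_2~\text{‘(’}~\text{‘global’}~\mathtt{id}^?~\mathtt{globaltype}~\text{‘)’}~\text{‘)’} \\
\text{‘(’}~\text{‘global’}~\mathtt{id}^?~\text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘)’}~\dots~\text{‘)’} & \equiv & \text{‘(’}~\text{‘export’}~\mathtt{name}~\text{‘(’}~\text{‘global’}~\mathtt{id}'~\text{‘)’}~\text{‘)’}~~\text{‘(’}~\text{‘global’}~\mathtt{id}'~\dots~\text{‘)’} \quad (\text{if } \mathtt{id}' = \mathtt{id}^? \neq \epsilon \lor \mathtt{id}'~\text{fresh})
\end{array}$

The latter abbreviation can be applied repeatedly, with “$\dots$” containing another import or export.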

6.6.9. Exports

The syntax for exports mirrors their abstract syntax directly.

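$\begin{array}{llll}
\mathtt{export}_I & ::= & \text{‘(’}~\text{‘export’}~\mathit{nm}{:}\mathtt{name}~d{:}\mathtt{exportdesc}_I~\text{‘)’} & \Rightarrow \{\mathsf{name}~\mathit{nm}, \mathsf{desc}~d\} \\
\mathtt{exportdesc}_I & ::= & \text{‘(’}~\text{‘func’}~x{:}\mathtt{funcidx}_I~\text{‘)’} & \Rightarrow \mathsf{func}~x \\
& | & \text{‘(’}~\text{‘table’}~x{:}\mathtt{tableidx}_I~\text{‘)’} & \Rightarrow \mathsf{table}~x \\
& | & \text{‘(’}~\text{‘memory’}~x{:}\mathtt{memidx}_I~\text{‘)’} & \Rightarrow \mathsf{mem}~x \\
& | & \text{‘(’}~\text{‘global’}~x{:}\mathtt{globalidx}_I~\text{‘)’} & \Rightarrow \mathsf{global}~x
\end{array}$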

6.6.9.1. Abbreviations

As an abbreviation, exports may also be specified inline with function, table, memory, or global definitions; see the respective sections.

6.6.10. Start Function

A start function is defined in terms of its index.

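$\mathtt{start}_I ::= \text{‘(’}~\text{‘start’}~x{:}\mathtt{funcidx}_I~\text{‘)’} \Rightarrow \{\mathsf{func}~x\}$

Note

At most one start function may occur in a module, which is ensured by a suitable side condition on the $\mathtt{module}$ grammar.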

6.6.11. Element Segments

Element segments allow for an optional table index to identify the table to initialize.

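$\mathtt{elem}_I ::= \text{‘(’}~\text{‘elem’}~x{:}\mathtt{tableidx}_I~\text{‘(’}~\text{‘offset’}~e{:}\mathtt{expr}_I~\text{‘)’}~y^\ast{:}\mathtt{vec}(\mathtt{funcidx}_I)~\text{‘)’} \Rightarrow \{\mathsf{table}~x, \mathsf{offset}~e, \mathsf{init}~y^\ast\}$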

Note

In the current version of WebAssembly, the only valid table index is 0 or a symbolic table identifier resolving to the same value.

6.6.11.1. Abbreviations

As an abbreviation, a single instruction may occur in place of the offset:

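$\mathtt{instr} \equiv \text{‘(’}~\text{‘offset’}~\mathtt{instr}~\text{‘)’}$

Also, the table index can be omitted, defaulting to $0$.

$\text{‘(’}~\text{‘elem’}~\text{‘(’}~\text{‘offset’}~\mathtt{expr}_I~\text{‘)’}~\dots~\text{‘)’} \equiv \text{‘(’}~\text{‘elem’}~0~\text{‘(’}~\text{‘offset’}~\mathtt{expr}_I~\text{‘)’}~\dots~\text{‘)’}$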

As another abbreviation, element segments may also be specified inline with table definitions; see the respective section.

6.6.12. Data Segments

Data segments allow for an optional memory index to identify the memory to initialize. The data is written as a string, which may be split up into a possibly empty sequence of individual string literals.

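$\begin{array}{llll}
\mathtt{data}_I & ::= & \text{‘(’}~\text{‘data’}~x{:}\mathtt{memidx}_I~\text{‘(’}~\text{‘offset’}~e{:}\mathtt{expr}_I~\text{‘)’}~b^\ast{:}\mathtt{datastring}~\text{‘)’} & \Rightarrow \{\mathsf{data}~x, \mathsf{offset}~e, \mathsf{init}~b^\ast\} \\
\mathtt{datastring} & ::= & (b^\ast{:}\mathtt{string})^\ast & \Rightarrow \mathrm{concat}((b^\ast)^\ast)
\end{array}$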

Note

In the current version of WebAssembly, the only valid memory index is 0 or a symbolic memory identifier resolving to the same value.

6.6.12.1. Abbreviations

As an abbreviation, a single instruction may occur in place of the offset:

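$\mathtt{instr} \equiv \text{‘(’}~\text{‘offset’}~\mathtt{instr}~\text{‘)’}$

Also, the memory index can be omitted, defaulting to $0$.

$\text{‘(’}~\text{‘data’}~\text{‘(’}~\text{‘offset’}~\mathtt{expr}_I~\text{‘)’}~\dots~\text{‘)’} \equiv \text{‘(’}~\text{‘data’}~0~\text{‘(’}~\text{‘offset’}~\mathtt{expr}_I~\text{‘)’}~\dots~\text{‘)’}$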

As another abbreviation, data segments may also be specified inline with memory definitions; see the respective section.

6.6.13. Modules

A module consists of a sequence of fields that can occur in any order. All definitions and their respective bound identifiers scope over the entire module, including the text preceding them.

A module may optionally bind an identifier that names the module. The name serves a documentary role only.

Note

Tools may include the module name in the name section of the binary format.

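The following restrictions are imposed on the composition of modules: $m_1 \oplus m_2$ is defined if and only if

- $m_1.\mathsf{start} = \epsilon \lor m_2.\mathsf{start} = \epsilon$
- $m_1.\mathsf{funcs} = m_1.\mathsf{tables} = m_1.\mathsf{mems} = m_1.\mathsf{globals} = \epsilon \lor m_2.\mathsf{imports} = \epsilon$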

Note

The first condition ensures that there is at most one start function. The second condition enforces that all imports must occur before any regular definition of a function, table, memory, or global, thereby maintaining the ordering of the respective index spaces.

The well-formedness condition on $I$ in the grammar for $\mathtt{module}$ ensures that no namespace contains duplicate identifiers.

The definition of the initial identifier context $I$ uses the following auxiliary definition which maps each relevant definition to a singular context with one (possibly empty) identifier:

6.6.13.1. Abbreviations

In a source file, the toplevel $\mathtt{(module} \dots \mathtt{)}$ surrounding the module body may be omitted.

$\mathtt{modulefield}^\ast \equiv \text{‘(’}~\text{‘module’}~\mathtt{modulefield}^\ast~\text{‘)’}$

A Appendix

A.1 Embedding

A WebAssembly implementation will typically be embedded into a host environment. An embedder implements the connection between such a host environment and the WebAssembly semantics as defined in the main body of this specification. An embedder is expected to interact with the semantics in well-defined ways.

This section defines a suitable interface to the WebAssembly semantics in the form of entry points through which an embedder can access it. The interface is intended to be complete, in the sense that an embedder does not need to reference other functional parts of the WebAssembly specification directly.

Note

On the other hand, an embedder does not need to provide the host environment with access to all functionality defined in this interface. For example, an implementation may not support parsing of the text format.

Types

In the description of the embedder interface, syntactic classes from the abstract syntax and the runtime’s abstract machine are used as names for variables that range over the possible objects from that class. Hence, these syntactic classes can also be interpreted as types.

For numeric parameters, notation like $n{:}\mathit{u32}$ is used to specify a symbolic name in addition to the respective value range.

Errors

Failure of an interface operation is indicated by an auxiliary syntactic class:

$\mathit{error} ::= \mathsf{error}$

In addition to the error conditions specified explicitly in this section, implementations may also return errors when specific implementation limitations are reached.

Note

Errors are abstract and unspecific with this definition. Implementations can refine it to carry suitable classifications and diagnostic messages.

Pre- and Post-Conditions

Some operations state pre-conditions about their arguments or post-conditions about their results. It is the embedder’s responsibility to meet the pre-conditions. If it does, the post-conditions are guaranteed by the semantics.

In addition to pre- and post-conditions explicitly stated with each operation, the specification adopts the following conventions for runtime objects ($\mathit{store}$, $\mathit{moduleinst}$, $\mathit{externval}$, addresses):

Every runtime object passed as a parameter must be valid per an implicit pre-condition.
Every runtime object returned as a result is valid per an implicit post-condition.

Note

As long as an embedder treats runtime objects as abstract and only creates and manipulates them through the interface defined here, all implicit pre-conditions are automatically met.

Store

$\mathsf{store\_init}() : \mathit{store}$

Return the empty store.

$\mathsf{store\_init}() = \{\mathsf{funcs}~\epsilon, \mathsf{mems}~\epsilon, \mathsf{tables}~\epsilon, \mathsf{globals}~\epsilon\}$

Modules

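$\mathsf{module\_decode}(\mathit{byte}^\ast) : \mathit{module} ~|~ \mathit{error}$

If there exists a derivation for the byte sequence $\mathit{byte}^\ast$ as a $\mathtt{module}$ according to the binary grammar for modules, yielding a module $m$, then return $m$.
Else, return $\mathsf{error}$.

$\begin{array}{lll}
\mathsf{module\_decode}(b^\ast) & = m & (\text{if } \mathtt{module} \Longrightarrow^\ast m{:}b^\ast) \\
\mathsf{module\_decode}(b^\ast) & = \mathsf{error} & (\text{otherwise})
\end{array}$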

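$\mathsf{module\_parse}(\mathit{char}^\ast) : \mathit{module} ~|~ \mathit{error}$

If there exists a derivation for the source $\mathit{char}^\ast$ as a $\mathtt{module}$ according to the text grammar for modules, yielding a module $m$, then return $m$.
Else, return $\mathsf{error}$.

$\begin{array}{lll}
\mathsf{module\_parse}(c^\ast) & = m & (\text{if } \mathtt{module} \Longrightarrow^\ast m{:}c^\ast) \\
\mathsf{module\_parse}(c^\ast) & = \mathsf{error} & (\text{otherwise})
\end{array}$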

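$\mathsf{module\_validate}(\mathit{module}) : \mathit{error}^?$

If $\mathit{module}$ is valid, then return nothing.
Else, return $\mathsf{error}$.

$\begin{array}{lll}
\mathsf{module\_validate}(m) & = \epsilon & (\text{if } \vdash m : \mathit{externtype}^\ast \to {\mathit{externtype}'}^\ast) \\
\mathsf{module\_validate}(m) & = \mathsf{error} & (\text{otherwise})
\end{array}$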

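$\mathsf{module\_instantiate}(\mathit{store}, \mathit{module}, \mathit{externval}^\ast) : (\mathit{store}, \mathit{moduleinst} ~|~ \mathit{error})$

Try instantiating $\mathit{module}$ in $\mathit{store}$ with external values $\mathit{externval}^\ast$ as imports:

If it succeeds with a module instance $\mathit{moduleinst}$, then let $\mathit{result}$ be $\mathit{moduleinst}$.
Else, let $\mathit{result}$ be $\mathsf{error}$.

Return the new store paired with $\mathit{result}$.

$\begin{array}{lll}
\mathsf{module\_instantiate}(S, m, \mathit{ev}^\ast) & = (S', F.\mathsf{module}) & (\text{if } \mathrm{instantiate}(S, m, \mathit{ev}^\ast) \hookrightarrow^\ast S'; F; \epsilon) \\
\mathsf{module\_instantiate}(S, m, \mathit{ev}^\ast) & = (S', \mathsf{error}) & (\text{if } \mathrm{instantiate}(S, m, \mathit{ev}^\ast) \hookrightarrow^\ast S'; F; \mathsf{trap})
\end{array}$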

Note

The store may be modified even in case of an error.

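$\mathsf{module\_imports}(\mathit{module}) : (\mathit{name}, \mathit{name}, \mathit{externtype})^\ast$

Pre-condition: $\mathit{module}$ is valid with external import types $\mathit{externtype}^\ast$ and external export types ${\mathit{externtype}'}^\ast$.
Let $\mathit{import}^\ast$ be the imports $\mathit{module}.\mathsf{imports}$.
Assert: the length of $\mathit{import}^\ast$ equals the length of $\mathit{externtype}^\ast$.
For each $\mathit{import}_i$ in $\mathit{import}^\ast$ and corresponding $\mathit{externtype}_i$ in $\mathit{externtype}^\ast$, do:

Let $\mathit{result}_i$ be the triple $(\mathit{import}_i.\mathsf{module}, \mathit{import}_i.\mathsf{name}, \mathit{externtype}_i)$.

Return the concatenation of all $\mathit{result}_i$, in index order.
Post-condition: each $\mathit{externtype}_i$ is valid.

$\mathsf{module\_imports}(m) = (\mathit{im}.\mathsf{module}, \mathit{im}.\mathsf{name}, \mathit{externtype})^\ast \quad (\text{if } \mathit{im}^\ast = m.\mathsf{imports} \land {} \vdash m : \mathit{externtype}^\ast \to {\mathit{externtype}'}^\ast)$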

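$\mathsf{module\_exports}(\mathit{module}) : (\mathit{name}, \mathit{externtype})^\ast$

Pre-condition: $\mathit{module}$ is valid with external import types $\mathit{externtype}^\ast$ and external export types ${\mathit{externtype}'}^\ast$.
Let $\mathit{export}^\ast$ be the exports $\mathit{module}.\mathsf{exports}$.
Assert: the length of $\mathit{export}^\ast$ equals the length of ${\mathit{externtype}'}^\ast$.
For each $\mathit{export}_i$ in $\mathit{export}^\ast$ and corresponding $\mathit{externtype}'_i$ in ${\mathit{externtype}'}^\ast$, do:

Let $\mathit{result}_i$ be the pair $(\mathit{export}_i.\mathsf{name}, \mathit{externtype}'_i)$.

Return the concatenation of all $\mathit{result}_i$, in index order.
Post-condition: each $\mathit{externtype}'_i$ is valid.

$\mathsf{module\_exports}(m) = (\mathit{ex}.\mathsf{name}, \mathit{externtype}')^\ast \quad (\text{if } \mathit{ex}^\ast = m.\mathsf{exports} \land {} \vdash m : \mathit{externtype}^\ast \to {\mathit{externtype}'}^\ast)$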

Module Instances

$\mathsf{instance\_export}(\mathit{moduleinst}, \mathit{name}) : \mathit{externval} ~|~ \mathit{error}$

Assert: due to validity of the module instance $\mathit{moduleinst}$, all its export names are different.
If there exists an $\mathit{exportinst}_i$ in $\mathit{moduleinst}.\mathsf{exports}$ such that name $\mathit{exportinst}_i.\mathsf{name}$ equals $\mathit{name}$, then return the external value $\mathit{exportinst}_i.\mathsf{value}$.
Else, return $\mathsf{error}$.

$\begin{array}{lll}
\mathsf{instance\_export}(m, \mathit{name}) & = m.\mathsf{exports}[i].\mathsf{value} & (\text{if } m.\mathsf{exports}[i].\mathsf{name} = \mathit{name}) \\
\mathsf{instance\_export}(m, \mathit{name}) & = \mathsf{error} & (\text{otherwise})
\end{array}$
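The lookup described for this operation can be sketched in Python. The module instance is modeled as a dictionary with an `exports` list of `name`/`value` records, and the abstract error result as a sentinel string; both representations are assumptions of this example rather than a prescribed embedder API.

```python
ERROR = "error"  # stand-in for the abstract error result

def instance_export(moduleinst, name):
    """Return the external value exported under `name`, or ERROR if absent."""
    for exportinst in moduleinst["exports"]:
        if exportinst["name"] == name:
            # Export names are unique in a valid instance, so the first
            # match is the only one.
            return exportinst["value"]
    return ERROR
```

A real embedder would typically refine the error into a richer diagnostic, as permitted for errors in this interface.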

Functions

$\mathsf{func\_alloc}(\mathit{store}, \mathit{functype}, \mathit{hostfunc}) : (\mathit{store}, \mathit{funcaddr})$

Pre-condition: $\mathit{functype}$ is valid.
Let $\mathit{funcaddr}$ be the result of allocating a host function in $\mathit{store}$ with function type $\mathit{functype}$ and host function code $\mathit{hostfunc}$.
Return the new store paired with $\mathit{funcaddr}$.

$\mathsf{func\_alloc}(S, \mathit{ft}, \mathit{code}) = (S', a) \quad (\text{if } \mathrm{allochostfunc}(S, \mathit{ft}, \mathit{code}) = S', a)$

Note

This operation assumes that $\mathit{hostfunc}$ satisfies the pre- and post-conditions required for a function instance with type $\mathit{functype}$.

Regular (non-host) function instances can only be created indirectly through module instantiation.

$\mathsf{func\_type}(\mathit{store}, \mathit{funcaddr}) : \mathit{functype}$

Assert: the external value $\mathsf{func}~\mathit{funcaddr}$ is valid with external type $\mathsf{func}~\mathit{functype}$.
Return $\mathit{functype}$.
Post-condition: $\mathit{functype}$ is valid.

$\mathsf{func\_type}(S, a) = \mathit{ft} \quad (\text{if } S \vdash \mathsf{func}~a : \mathsf{func}~\mathit{ft})$

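$\mathsf{func\_invoke}(\mathit{store}, \mathit{funcaddr}, \mathit{val}^\ast) : (\mathit{store}, \mathit{val}^\ast ~|~ \mathit{error})$

Try invoking the function $\mathit{funcaddr}$ in $\mathit{store}$ with values $\mathit{val}^\ast$ as arguments:

If it succeeds with values ${\mathit{val}'}^\ast$ as results, then let $\mathit{result}$ be ${\mathit{val}'}^\ast$.
Else it has trapped, hence let $\mathit{result}$ be $\mathsf{error}$.

Return the new store paired with $\mathit{result}$.

$\begin{array}{lll}
\mathsf{func\_invoke}(S, a, v^\ast) & = (S', {v'}^\ast) & (\text{if } \mathrm{invoke}(S, a, v^\ast) \hookrightarrow^\ast S'; F; {v'}^\ast) \\
\mathsf{func\_invoke}(S, a, v^\ast) & = (S', \mathsf{error}) & (\text{if } \mathrm{invoke}(S, a, v^\ast) \hookrightarrow^\ast S'; F; \mathsf{trap})
\end{array}$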

Note

The store may be modified even in case of an error.

Tables

$\mathsf{table\_alloc}(\mathit{store}, \mathit{tabletype}) : (\mathit{store}, \mathit{tableaddr})$

Pre-condition: $\mathit{tabletype}$ is valid.
Let $\mathit{tableaddr}$ be the result of allocating a table in $\mathit{store}$ with table type $\mathit{tabletype}$.
Return the new store paired with $\mathit{tableaddr}$.

$\mathsf{table\_alloc}(S, \mathit{tt}) = (S', a) \quad (\text{if } \mathrm{alloctable}(S, \mathit{tt}) = S', a)$

$\mathsf{table\_type}(\mathit{store}, \mathit{tableaddr}) : \mathit{tabletype}$

Assert: the external value $\mathsf{table}~\mathit{tableaddr}$ is valid with external type $\mathsf{table}~\mathit{tabletype}$.
Return $\mathit{tabletype}$.
Post-condition: $\mathit{tabletype}$ is valid.

$\mathsf{table\_type}(S, a) = \mathit{tt} \quad (\text{if } S \vdash \mathsf{table}~a : \mathsf{table}~\mathit{tt})$

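$\mathsf{table\_read}(\mathit{store}, \mathit{tableaddr}, i{:}\mathit{u32}) : \mathit{funcaddr}^? ~|~ \mathit{error}$

Let $\mathit{ti}$ be the table instance $\mathit{store}.\mathsf{tables}[\mathit{tableaddr}]$.
If $i$ is larger than or equal to the length of $\mathit{ti}.\mathsf{elem}$, then return $\mathsf{error}$.
Else, return $\mathit{ti}.\mathsf{elem}[i]$.

$\begin{array}{lll}
\mathsf{table\_read}(S, a, i) & = \mathit{fa}^? & (\text{if } S.\mathsf{tables}[a].\mathsf{elem}[i] = \mathit{fa}^?) \\
\mathsf{table\_read}(S, a, i) & = \mathsf{error} & (\text{otherwise})
\end{array}$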

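$\mathsf{table\_write}(\mathit{store}, \mathit{tableaddr}, i{:}\mathit{u32}, \mathit{funcaddr}^?) : \mathit{store} ~|~ \mathit{error}$

Let $\mathit{ti}$ be the table instance $\mathit{store}.\mathsf{tables}[\mathit{tableaddr}]$.
If $i$ is larger than or equal to the length of $\mathit{ti}.\mathsf{elem}$, then return $\mathsf{error}$.
Replace $\mathit{ti}.\mathsf{elem}[i]$ with the optional function address $\mathit{fa}^?$.
Return the updated store.

$\begin{array}{lll}
\mathsf{table\_write}(S, a, i, \mathit{fa}^?) & = S' & (\text{if } S' = S~\text{with}~\mathsf{tables}[a].\mathsf{elem}[i] = \mathit{fa}^?) \\
\mathsf{table\_write}(S, a, i, \mathit{fa}^?) & = \mathsf{error} & (\text{otherwise})
\end{array}$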

$\mathsf{table\_size}(\mathit{store}, \mathit{tableaddr}) : \mathit{u32}$

Return the length of $\mathit{store}.\mathsf{tables}[\mathit{tableaddr}].\mathsf{elem}$.

$\mathsf{table\_size}(S, a) = n \quad (\text{if } |S.\mathsf{tables}[a].\mathsf{elem}| = n)$

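$\mathsf{table\_grow}(\mathit{store}, \mathit{tableaddr}, n{:}\mathit{u32}) : \mathit{store} ~|~ \mathit{error}$

Try growing the table instance $\mathit{store}.\mathsf{tables}[\mathit{tableaddr}]$ by $n$ elements:

If it succeeds, return the updated store.
Else, return $\mathsf{error}$.

$\begin{array}{lll}
\mathsf{table\_grow}(S, a, n) & = S' & (\text{if } S' = S~\text{with}~\mathsf{tables}[a] = \mathrm{growtable}(S.\mathsf{tables}[a], n)) \\
\mathsf{table\_grow}(S, a, n) & = \mathsf{error} & (\text{otherwise})
\end{array}$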

Memories

$\mathsf{mem\_alloc}(\mathit{store}, \mathit{memtype}) : (\mathit{store}, \mathit{memaddr})$

Pre-condition: $\mathit{memtype}$ is valid.
Let $\mathit{memaddr}$ be the result of allocating a memory in $\mathit{store}$ with memory type $\mathit{memtype}$.
Return the new store paired with $\mathit{memaddr}$.

$\mathsf{mem\_alloc}(S, \mathit{mt}) = (S', a) \quad (\text{if } \mathrm{allocmem}(S, \mathit{mt}) = S', a)$

$\mathsf{mem\_type}(\mathit{store}, \mathit{memaddr}) : \mathit{memtype}$

Assert: the external value $\mathsf{mem}~\mathit{memaddr}$ is valid with external type $\mathsf{mem}~\mathit{memtype}$.
Return $\mathit{memtype}$.
Post-condition: $\mathit{memtype}$ is valid.

$\mathsf{mem\_type}(S, a) = \mathit{mt} \quad (\text{if } S \vdash \mathsf{mem}~a : \mathsf{mem}~\mathit{mt})$

$\mathsf{mem\_read}(\mathit{store}, \mathit{memaddr}, i{:}\mathit{u32}) : \mathit{byte} ~|~ \mathit{error}$

Let $\mathit{mi}$ be the memory instance $\mathit{store}.\mathsf{mems}[\mathit{memaddr}]$.
If $i$ is larger than or equal to the length of $\mathit{mi}.\mathsf{data}$, then return $\mathsf{error}$.
Else, return the byte $\mathit{mi}.\mathsf{data}[i]$.

$\begin{array}{lll}
\mathsf{mem\_read}(S, a, i) & = b & (\text{if } S.\mathsf{mems}[a].\mathsf{data}[i] = b) \\
\mathsf{mem\_read}(S, a, i) & = \mathsf{error} & (\text{otherwise})
\end{array}$

$\mathrm{mem\_write}(\mathit{store}, \mathit{memaddr}, i : \mathit{u32}, \mathit{byte}) : \mathit{store} \mid \mathsf{error}$

Let $\mathit{mi}$ be the memory instance $\mathit{store}.\mathsf{mems}[\mathit{memaddr}]$.
If $i$ is larger than or equal to the length of $\mathit{mi}.\mathsf{data}$, then return $\mathsf{error}$.
Replace $\mathit{mi}.\mathsf{data}[i]$ with $\mathit{byte}$.
Return the updated store.

$\mathrm{mem\_write}(S, a, i, b) = S'$ (if $S' = S~\mathrm{with}~\mathsf{mems}[a].\mathsf{data}[i] = b$)
$\mathrm{mem\_write}(S, a, i, b) = \mathsf{error}$ (otherwise)

$\mathrm{mem\_size}(\mathit{store}, \mathit{memaddr}) : \mathit{u32}$

Return the length of $\mathit{store}.\mathsf{mems}[\mathit{memaddr}].\mathsf{data}$ divided by the page size.

$\mathrm{mem\_size}(S, a) = n$ (if $|S.\mathsf{mems}[a].\mathsf{data}| = n \cdot 64\,\mathrm{Ki}$)

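$\mathrm{mem\_grow}(\mathit{store}, \mathit{memaddr}, n : \mathit{u32}) : \mathit{store} \mid \mathsf{error}$

Try growing the memory instance $\mathit{store}.\mathsf{mems}[\mathit{memaddr}]$ by $n$ pages:
If it succeeds, return the updated store.
Else, return $\mathsf{error}$.

$\mathrm{mem\_grow}(S, a, n) = S'$ (if $S' = S~\mathrm{with}~\mathsf{mems}[a] = \mathrm{growmem}(S.\mathsf{mems}[a], n)$)
$\mathrm{mem\_grow}(S, a, n) = \mathsf{error}$ (otherwise)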
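
The relationship between byte indices and page counts in the memory functions above can be illustrated with a small Python sketch. Modelling a memory as a bare `bytearray`, the `ERROR` sentinel, the `max_pages` parameter, and the exact failure conditions of `mem_grow` are assumptions made for illustration; the text above only states that growing may succeed or fail.

```python
PAGE_SIZE = 64 * 1024   # 64 Ki bytes per page
MAX_PAGES = 2 ** 16     # assumption: hard cap for a 32-bit address space
ERROR = "error"         # hypothetical stand-in for the spec's `error` result


def mem_read(data: bytearray, i: int):
    # Byte indices are checked against the length of the data, not the page count.
    return data[i] if i < len(data) else ERROR


def mem_write(data: bytearray, i: int, byte: int):
    if i >= len(data):
        return ERROR
    data[i] = byte
    return data


def mem_size(data: bytearray) -> int:
    # Sizes are reported in pages, i.e. |data| = n * 64 Ki.
    return len(data) // PAGE_SIZE


def mem_grow(data: bytearray, n: int, max_pages=None):
    new_size = mem_size(data) + n
    limit = MAX_PAGES if max_pages is None else min(max_pages, MAX_PAGES)
    if new_size > limit:            # assumed failure condition
        return ERROR
    data.extend(bytes(n * PAGE_SIZE))  # new pages are zero-filled
    return data
```

For example, a one-page memory has valid byte indices 0 through 65535, and `mem_size` reports 1 for it.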

Globals

$\mathrm{global\_alloc}(\mathit{store}, \mathit{globaltype}, \mathit{val}) : (\mathit{store}, \mathit{globaladdr})$

Pre-condition: $\mathit{globaltype}$ is valid.
Let $\mathit{globaladdr}$ be the result of allocating a global in $\mathit{store}$ with global type $\mathit{globaltype}$ and initialization value $\mathit{val}$.
Return the new store paired with $\mathit{globaladdr}$.

$\mathrm{global\_alloc}(S, \mathit{gt}, v) = (S', a)$ (if $\mathrm{allocglobal}(S, \mathit{gt}, v) = S', a$)

$\mathrm{global\_type}(\mathit{store}, \mathit{globaladdr}) : \mathit{globaltype}$

Assert: the external value $\mathsf{global}~\mathit{globaladdr}$ is valid with external type $\mathsf{global}~\mathit{globaltype}$.
Return $\mathit{globaltype}$.
Post-condition: $\mathit{globaltype}$ is valid.

$\mathrm{global\_type}(S, a) = \mathit{gt}$ (if $S \vdash \mathsf{global}~a : \mathsf{global}~\mathit{gt}$)

$\mathrm{global\_read}(\mathit{store}, \mathit{globaladdr}) : \mathit{val}$

Let $\mathit{gi}$ be the global instance $\mathit{store}.\mathsf{globals}[\mathit{globaladdr}]$.
Return the value $\mathit{gi}.\mathsf{value}$.

$\mathrm{global\_read}(S, a) = v$ (if $S.\mathsf{globals}[a].\mathsf{value} = v$)

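$\mathrm{global\_write}(\mathit{store}, \mathit{globaladdr}, \mathit{val}) : \mathit{store} \mid \mathsf{error}$

Let $\mathit{gi}$ be the global instance $\mathit{store}.\mathsf{globals}[\mathit{globaladdr}]$.
If $\mathit{gi}.\mathsf{mut}$ is not $\mathsf{var}$, then return $\mathsf{error}$.
Replace $\mathit{gi}.\mathsf{value}$ with the value $\mathit{val}$.
Return the updated store.

$\mathrm{global\_write}(S, a, v) = S'$ (if $S.\mathsf{globals}[a].\mathsf{mut} = \mathsf{var} \wedge S' = S~\mathrm{with}~\mathsf{globals}[a].\mathsf{value} = v$)
$\mathrm{global\_write}(S, a, v) = \mathsf{error}$ (otherwise)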

A.2 Implementation Limitations

Implementations typically impose additional restrictions on a number of aspects of a WebAssembly module or execution. These may stem from:

physical resource limits,
constraints imposed by the embedder or its environment,
limitations of selected implementation strategies.

This section lists allowed limitations. Where restrictions take the form of numeric limits, no minimum requirements are given, nor are the limits assumed to be concrete, fixed numbers. However, it is expected that all implementations have “reasonably” large limits to enable common applications.

Note

A conforming implementation is not allowed to leave out individual features. However, designated subsets of WebAssembly may be specified in the future.

Syntactic Limits

Structure

An implementation may impose restrictions on the following dimensions of a module:

the number of types in a module
the number of functions in a module, including imports
the number of tables in a module, including imports
the number of memories in a module, including imports
the number of globals in a module, including imports
the number of element segments in a module
the number of data segments in a module
the number of imports to a module
the number of exports from a module
the number of parameters in a function type
the number of results in a function type
the number of locals in a function
the size of a function body
the size of a structured control instruction
the number of structured control instructions in a function
the nesting depth of structured control instructions
the number of label indices in a $\mathsf{br\_table}$ instruction
the length of an element segment
the length of a data segment
the length of a name
the range of characters in a name

If the limits of an implementation are exceeded for a given module, then the implementation may reject the validation, compilation, or instantiation of that module with an embedder-specific error.

Note

The last item allows embedders that operate in limited environments without support for [UNICODE] to limit the names of imports and exports to common subsets like ASCII.

Binary Format

For a module given in binary format, additional limitations may be imposed on the following dimensions:

the size of a module
the size of any section
the size of an individual function’s code
the number of sections

Text Format

For a module given in text format, additional limitations may be imposed on the following dimensions:

the size of the source text
the size of any syntactic element
the size of an individual token
the nesting depth of folded instructions
the length of symbolic identifiers
the range of literal characters allowed in the source text

Validation

An implementation may defer validation of individual functions until they are first invoked.

If a function turns out to be invalid, then the invocation, and every subsequent call to the same function, results in a trap.

Note

This is to allow implementations to use interpretation or just-in-time compilation for functions. The function must still be fully validated before execution of its body begins.
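
A minimal Python sketch of deferred validation, assuming a hypothetical `LazyFunction` wrapper: the function body is modelled as a plain Python callable and `validator` is any predicate supplied by the implementation. Neither name comes from the specification. The validation result is cached, so an invalid function traps on the first and on every later invocation, and its body is never executed.

```python
class Trap(Exception):
    """Raised when execution traps."""


class LazyFunction:
    def __init__(self, body, validator):
        self.body = body            # hypothetical: callable standing in for the code
        self.validator = validator  # hypothetical: returns True iff the body is valid
        self.valid = None           # None means "not validated yet"

    def invoke(self, *args):
        if self.valid is None:
            # Validate the whole function once, before any of its body runs.
            self.valid = bool(self.validator(self.body))
        if not self.valid:
            raise Trap("function failed validation")
        return self.body(*args)
```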

Execution

Restrictions on the following dimensions may be imposed during execution of a WebAssembly program:

the number of allocated module instances
the number of allocated function instances
the number of allocated table instances
the number of allocated memory instances
the number of allocated global instances
the size of a table instance
the size of a memory instance
the number of frames on the stack
the number of labels on the stack
the number of values on the stack

If the runtime limits of an implementation are exceeded during execution of a computation, then it may terminate that computation and report an embedder-specific error to the invoking code.

Some of the above limits may already be verified during instantiation, in which case an implementation may report exceedance in the same manner as for syntactic limits.

Note

Concrete limits are usually not fixed but may be dependent on specifics, interdependent, vary over time, or depend on other implementation- or embedder-specific situations or events.

A.3 Validation Algorithm

The specification of WebAssembly validation is purely declarative. It describes the constraints that must be met by a module or instruction sequence to be valid.

This section sketches the skeleton of a sound and complete algorithm for effectively validating code, i.e., sequences of instructions. (Other aspects of validation are straightforward to implement.)

In fact, the algorithm is expressed over the flat sequence of opcodes as occurring in the binary format, and performs only a single pass over it. Consequently, it can be integrated directly into a decoder.

The algorithm is expressed in typed pseudo code whose semantics is intended to be self-explanatory.

Data Structures

The algorithm uses two separate stacks: the operand stack and the control stack. The former tracks the types of operand values on the stack, the latter surrounding structured control instructions and their associated blocks.

type val_type = I32 | I64 | F32 | F64

type opd_stack = stack(val_type | Unknown)

type ctrl_stack = stack(ctrl_frame)
type ctrl_frame = {
  label_types : list(val_type)
  end_types : list(val_type)
  height : nat
  unreachable : bool
}

For each value, the operand stack records its value type, or Unknown when the type is not known.

For each entered block, the control stack records a control frame with the type of the associated label (used to type-check branches), the result type of the block (used to check its result), the height of the operand stack at the start of the block (used to check that operands do not underflow the current block), and a flag recording whether the remainder of the block is unreachable (used to handle stack-polymorphic typing after branches).

Note

In the presentation of this algorithm, multiple values are supported for the result types classifying blocks and labels. With the current version of WebAssembly, the list could be simplified to an optional value.

For the purpose of presenting the algorithm, the operand and control stacks are simply maintained as global variables:

var opds : opd_stack
var ctrls : ctrl_stack

However, these variables are not manipulated directly by the main checking function, but through a set of auxiliary functions:

func push_opd(type : val_type | Unknown) =
  opds.push(type)

func pop_opd() : val_type | Unknown =
  if (opds.size() = ctrls[0].height && ctrls[0].unreachable) return Unknown
  error_if(opds.size() = ctrls[0].height)
  return opds.pop()

func pop_opd(expect : val_type | Unknown) : val_type | Unknown =
  let actual = pop_opd()
  if (actual = Unknown) return expect
  if (expect = Unknown) return actual
  error_if(actual =/= expect)
  return actual

func push_opds(types : list(val_type)) = foreach (t in types) push_opd(t)
func pop_opds(types : list(val_type)) = foreach (t in reverse(types)) pop_opd(t)

Pushing an operand simply pushes the respective type to the operand stack.

Popping an operand checks that the operand stack does not underflow the current block and then removes one type. But first, a special case is handled where the block contains no known operands, but has been marked as unreachable. That can occur after an unconditional branch, when the stack is typed polymorphically. In that case, an unknown type is returned.

A second function for popping an operand takes an expected type, which the actual operand type is checked against. The types may differ in case one of them is Unknown. The more specific type is returned.

Finally, there are accumulative functions for pushing or popping multiple operand types.

Note

The notation stack[i] is meant to index the stack from the top, so that ctrls[0] accesses the element pushed last.

The control stack is likewise manipulated through auxiliary functions:

func push_ctrl(label : list(val_type), out : list(val_type)) =
  let frame = ctrl_frame(label, out, opds.size(), false)
  ctrls.push(frame)

func pop_ctrl() : list(val_type) =
  error_if(ctrls.is_empty())
  let frame = ctrls[0]
  pop_opds(frame.end_types)
  error_if(opds.size() =/= frame.height)
  ctrls.pop()
  return frame.end_types

func unreachable() =
  opds.resize(ctrls[0].height)
  ctrls[0].unreachable := true

Pushing a control frame takes the types of the label and result values. It allocates a new frame record recording them along with the current height of the operand stack and marks the block as reachable.

Popping a frame first checks that the control stack is not empty. It then verifies that the operand stack contains the right types of values expected at the end of the exited block and pops them off the operand stack. Afterwards, it checks that the stack has shrunk back to its initial height.

Finally, the current frame can be marked as unreachable. In that case, all existing operand types are purged from the operand stack, in order to allow for the stack-polymorphism logic in pop_opd to take effect.

Note

Even with the unreachable flag set, subsequent operands are still pushed to and popped from the operand stack. That is necessary to detect invalid examples like $(\mathsf{unreachable}~(\mathsf{i32.const})~\mathsf{i64.add})$. However, a polymorphic stack cannot underflow, but instead generates Unknown types as needed.

Validation of Opcode Sequences

The following function shows the validation of a number of representative instructions that manipulate the stack. Other instructions are checked in a similar manner.

Note

Various instructions not shown here will additionally require the presence of a validation context for checking uses of indices. That is an easy addition and therefore omitted from this presentation.

func validate(opcode) =
  switch (opcode)
    case (i32.add)
      pop_opd(I32)
      pop_opd(I32)
      push_opd(I32)

    case (drop)
      pop_opd()

    case (select)
      pop_opd(I32)
      let t1 = pop_opd()
      let t2 = pop_opd(t1)
      push_opd(t2)

    case (unreachable)
      unreachable()

    case (block t*)
      push_ctrl([t*], [t*])

    case (loop t*)
      // a branch to a loop label jumps to its start, so the label carries no values
      push_ctrl([], [t*])

    case (if t*)
      pop_opd(I32)
      push_ctrl([t*], [t*])

    case (end)
      let results = pop_ctrl()
      push_opds(results)

    case (else)
      let results = pop_ctrl()
      push_ctrl(results, results)

    case (br n)
      // ctrls[n] exists only if n < ctrls.size()
      error_if(ctrls.size() <= n)
      pop_opds(ctrls[n].label_types)
      unreachable()

    case (br_if n)
      error_if(ctrls.size() <= n)
      pop_opd(I32)
      pop_opds(ctrls[n].label_types)
      push_opds(ctrls[n].label_types)

    case (br_table n* m)
      error_if(ctrls.size() <= m)
      foreach (n in n*)
        error_if(ctrls.size() <= n || ctrls[n].label_types =/= ctrls[m].label_types)
      pop_opd(I32)
      pop_opds(ctrls[m].label_types)
      unreachable()

Note

It is an invariant under the current WebAssembly instruction set that an operand of Unknown type is never duplicated on the stack. This would change if the language were extended with stack operators like dup. Under such an extension, the above algorithm would need to be refined by replacing the Unknown type with proper type variables to ensure that all uses are consistent.
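
The two-stack algorithm can be tried out with the following Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the names `Validator`, `mark_unreachable` and `is_valid`, the encoding of instructions as tuples of strings, and the added `i32.const`/`i64.const` cases are all hypothetical, and only a handful of opcodes are covered. Python lists grow at the end, so `ctrls[-1 - n]` plays the role of `ctrls[n]` in the pseudo code.

```python
from dataclasses import dataclass
from typing import List

UNKNOWN = None  # plays the role of the Unknown operand type


class ValidationError(Exception):
    pass


@dataclass
class CtrlFrame:
    label_types: List[str]
    end_types: List[str]
    height: int
    unreachable: bool = False


class Validator:
    def __init__(self, result_types):
        self.opds = []   # operand stack, top at the end
        self.ctrls = []  # control stack, top at the end
        self.push_ctrl(list(result_types), list(result_types))  # function body

    def top(self):
        if not self.ctrls:
            raise ValidationError("control stack is empty")
        return self.ctrls[-1]

    def push_opd(self, t):
        self.opds.append(t)

    def pop_opd(self, expect=UNKNOWN):
        frame = self.top()
        if len(self.opds) == frame.height:
            if not frame.unreachable:
                raise ValidationError("operand stack underflow")
            actual = UNKNOWN  # a polymorphic stack yields Unknown on demand
        else:
            actual = self.opds.pop()
        if actual is UNKNOWN:
            return expect
        if expect is not UNKNOWN and actual != expect:
            raise ValidationError(f"expected {expect}, found {actual}")
        return actual

    def push_ctrl(self, label, out):
        self.ctrls.append(CtrlFrame(label, out, len(self.opds)))

    def pop_ctrl(self):
        frame = self.top()
        for t in reversed(frame.end_types):
            self.pop_opd(t)
        if len(self.opds) != frame.height:
            raise ValidationError("unconsumed operands at end of block")
        self.ctrls.pop()
        return frame.end_types

    def mark_unreachable(self):
        frame = self.top()
        del self.opds[frame.height:]  # purge operands of the current block
        frame.unreachable = True

    def validate(self, op, *imm):
        if op in ("i32.const", "i64.const"):
            self.push_opd(op[:3])
        elif op in ("i32.add", "i64.add"):
            t = op[:3]
            self.pop_opd(t)
            self.pop_opd(t)
            self.push_opd(t)
        elif op == "drop":
            self.pop_opd()
        elif op == "unreachable":
            self.mark_unreachable()
        elif op == "block":
            self.push_ctrl(list(imm), list(imm))
        elif op == "end":
            for t in self.pop_ctrl():
                self.push_opd(t)
        elif op == "br":
            n = imm[0]
            if n >= len(self.ctrls):
                raise ValidationError("unknown label")
            for t in reversed(self.ctrls[-1 - n].label_types):
                self.pop_opd(t)
            self.mark_unreachable()
        else:
            raise ValidationError(f"unsupported opcode {op}")


def is_valid(result_types, code):
    v = Validator(result_types)
    try:
        for instr in code:
            v.validate(*instr)
    except ValidationError:
        return False
    return not v.ctrls  # every block, including the body, must be closed
```

With this sketch, `is_valid(["i64"], [("unreachable",), ("i64.add",), ("end",)])` is accepted because the polymorphic stack supplies the missing operands, whereas inserting `("i32.const",)` before the addition is rejected, since a concrete `i32` operand is then found where `i64` is expected.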

A.4 Custom Sections

This appendix defines dedicated custom sections for WebAssembly’s binary format. Such sections do not contribute to, or otherwise affect, the WebAssembly semantics, and like any custom section they may be ignored by an implementation. However, they provide useful metadata that implementations can use to improve the user experience or to take compilation hints.

Currently, only one dedicated custom section is defined, the name section.

Name Section

The name section is a custom section whose name string is itself $\text{‘name’}$. The name section should appear only once in a module, and only after the data section.

The purpose of this section is to attach printable names to definitions in a module, which e.g. can be used by a debugger or when parts of the module are to be rendered in text form.

Note

All names are represented in [UNICODE] encoded in UTF-8. Names need not be unique.

Subsections

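The data of a name section consists of a sequence of subsections. Each subsection consists of

a one-byte subsection id,
the $\mathit{u32}$ size of the contents, in bytes,
the actual contents, whose structure depends on the subsection id.

$\mathit{namesec} ::= \mathit{section}_0(\mathit{namedata})$
$\mathit{namedata} ::= n{:}\mathit{name}~~\mathit{modulenamesubsec}^?~~\mathit{funcnamesubsec}^?~~\mathit{localnamesubsec}^?$ (if $n = \text{‘name’}$)
$\mathit{namesubsection}_N(B) ::= N{:}\mathit{byte}~~\mathit{size}{:}\mathit{u32}~~B$ (if $\mathit{size} = ||B||$)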

The following subsection ids are used:

Id	Subsection
0	module name
1	function names
2	local names

Each subsection may occur at most once, and in order of increasing id.
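
The subsection framing described above can be decoded with a short Python sketch. It relies on the binary format's LEB128 encoding of $\mathit{u32}$ values; the function names `read_u32` and `split_subsections` are hypothetical, and raising `ValueError` on malformed input is an assumption (an implementation is free to simply ignore a malformed name section, like any custom section).

```python
def read_u32(data: bytes, pos: int):
    """Decode an unsigned LEB128 u32 starting at data[pos]; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(data):
            raise ValueError("unexpected end of data")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            break
        shift += 7
        if shift > 28:  # a u32 occupies at most five LEB128 bytes
            raise ValueError("u32 encoding too long")
    if result >= 1 << 32:
        raise ValueError("value out of range for u32")
    return result, pos


def split_subsections(data: bytes):
    """Split name section data into (id, contents) pairs, enforcing increasing ids."""
    subsections = []
    pos, last_id = 0, -1
    while pos < len(data):
        sub_id = data[pos]                   # one-byte subsection id
        size, pos = read_u32(data, pos + 1)  # size of the contents in bytes
        if sub_id <= last_id:
            raise ValueError("subsection repeated or out of order")
        if pos + size > len(data):
            raise ValueError("subsection exceeds section data")
        subsections.append((sub_id, data[pos:pos + size]))
        pos += size
        last_id = sub_id
    return subsections
```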

Name Maps

A name map assigns names to indices in a given index space. It consists of a vector of index/name pairs in order of increasing index value. Each index must be unique, but the assigned names need not be.

$\mathit{namemap} ::= \mathrm{vec}(\mathit{nameassoc})$
$\mathit{nameassoc} ::= \mathit{idx}~~\mathit{name}$

An indirect name map assigns names to a two-dimensional index space, where secondary indices are grouped by primary indices. It consists of a vector of primary index/name map pairs in order of increasing index value, where each name map in turn maps secondary indices to names. Each primary index must be unique, and likewise each secondary index per individual name map.

$\mathit{indirectnamemap} ::= \mathrm{vec}(\mathit{indirectnameassoc})$
$\mathit{indirectnameassoc} ::= \mathit{idx}~~\mathit{namemap}$
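
As a rough illustration, both kinds of name map can be decoded as follows in Python. The sketch assumes the binary format's usual encodings (a vector is a $\mathit{u32}$ count followed by its elements, an index is a LEB128 $\mathit{u32}$, and a name is a length-prefixed UTF-8 byte string); the function names and the use of `ValueError` for violations are hypothetical.

```python
def read_u32(data: bytes, pos: int):
    """Decode an unsigned LEB128 u32 starting at data[pos]; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(data):
            raise ValueError("unexpected end of data")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            break
        shift += 7
        if shift > 28:
            raise ValueError("u32 encoding too long")
    if result >= 1 << 32:
        raise ValueError("value out of range for u32")
    return result, pos


def read_name_map(data: bytes, pos: int = 0):
    """Decode vec(idx name); indices must be strictly increasing."""
    count, pos = read_u32(data, pos)
    entries, prev = [], -1
    for _ in range(count):
        idx, pos = read_u32(data, pos)
        length, pos = read_u32(data, pos)
        if pos + length > len(data):
            raise ValueError("name exceeds available data")
        name = data[pos:pos + length].decode("utf-8")
        pos += length
        if idx <= prev:
            raise ValueError("indices must be unique and increasing")
        entries.append((idx, name))
        prev = idx
    return entries, pos


def read_indirect_name_map(data: bytes, pos: int = 0):
    """Decode vec(idx namemap); primary indices must be strictly increasing."""
    count, pos = read_u32(data, pos)
    entries, prev = [], -1
    for _ in range(count):
        idx, pos = read_u32(data, pos)
        if idx <= prev:
            raise ValueError("primary indices must be unique and increasing")
        names, pos = read_name_map(data, pos)
        entries.append((idx, names))
        prev = idx
    return entries, pos
```

Names are allowed to repeat, so only the indices are checked for uniqueness and ordering.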

Module Names

The module name subsection has the id 0. It simply consists of a single name that is assigned to the module itself.

$\mathit{modulenamesubsec} ::= \mathit{namesubsection}_0(\mathit{name})$

Function Names

The function name subsection has the id 1. It consists of a name map assigning function names to function indices.

$\mathit{funcnamesubsec} ::= \mathit{namesubsection}_1(\mathit{namemap})$

Local Names

The local name subsection has the id 2. It consists of an indirect name map assigning local names to local indices grouped by function indices.

$\mathit{localnamesubsec} ::= \mathit{namesubsection}_2(\mathit{indirectnamemap})$

A.5 Soundness

The type system of WebAssembly is sound, implying both type safety and memory safety with respect to the WebAssembly semantics. For example:

All types declared and derived during validation are respected at run time; e.g., every local or global variable will only contain type-correct values, every instruction will only be applied to operands of the expected type, and every function invocation always evaluates to a result of the right type (if it does not trap or diverge).
No memory location will be read or written except those explicitly defined by the program, i.e., as a local, a global, an element in a table, or a location within a linear memory.
There is no undefined behavior, i.e., the execution rules cover all possible cases that can occur in a valid program, and the rules are mutually consistent.

Soundness also is instrumental in ensuring additional properties, most notably, encapsulation of function and module scopes: no locals can be accessed outside their own function and no module components can be accessed outside their own module unless they are explicitly exported or imported.

The typing rules defining WebAssembly validation only cover the static components of a WebAssembly program. In order to state and prove soundness precisely, the typing rules must be extended to the dynamic components of the abstract runtime, that is, the store, configurations, and administrative instructions. [1]

Values and Results

Values and results can be classified by value types and result types as follows.

Values $t.\mathsf{const}~c$

The value is valid with value type $t$.

$\vdash t.\mathsf{const}~c : t$

Results $\mathit{val}^*$

For each value $\mathit{val}_i$ in $\mathit{val}^*$: The value $\mathit{val}_i$ is valid with some value type $t_i$.
Let $t^*$ be the concatenation of all $t_i$.
Then the result is valid with result type $[t^*]$.

$\frac{(\vdash \mathit{val} : t)^*}{\vdash \mathit{val}^* : [t^*]}$

Results $\mathsf{trap}$

The result is valid with result type $[t^*]$, for any sequence $t^*$ of value types.

$\vdash \mathsf{trap} : [t^*]$

Store Validity

The following typing rules specify when a runtime store $S$ is valid. A valid store must consist of function, table, memory, global, and module instances that are themselves valid, relative to $S$.

To that end, each kind of instance is classified by a respective function, table, memory, or global type. Module instances are classified by module contexts, which are regular contexts repurposed as module types describing the index spaces defined by a module.

Function Instances $\{\mathsf{type}~\mathit{functype}, \mathsf{module}~\mathit{moduleinst}, \mathsf{code}~\mathit{func}\}$

The function type $\mathit{functype}$ must be valid.
The module instance $\mathit{moduleinst}$ must be valid with some context $C$.
Under context $C$, the function $\mathit{func}$ must be valid with function type $\mathit{functype}$.
Then the function instance is valid with function type $\mathit{functype}$.

$\frac{\vdash \mathit{functype}~\mathrm{ok} \qquad S \vdash \mathit{moduleinst} : C \qquad C \vdash \mathit{func} : \mathit{functype}}{S \vdash \{\mathsf{type}~\mathit{functype}, \mathsf{module}~\mathit{moduleinst}, \mathsf{code}~\mathit{func}\} : \mathit{functype}}$

Host Function Instances $\{\mathsf{type}~\mathit{functype}, \mathsf{hostcode}~\mathit{hf}\}$

The function type $\mathit{functype}$ must be valid.
Let $[t_1^*] \to [t_2^*]$ be the function type $\mathit{functype}$.
For every valid store $S_1$ extending $S$ and every sequence $\mathit{val}^*$ of values whose types coincide with $t_1^*$:
  Executing $\mathit{hf}$ in store $S_1$ with arguments $\mathit{val}^*$ has a non-empty set of possible outcomes.
  For every element $R$ of this set:
    Either $R$ must be $\bot$ (i.e., divergence).
    Or $R$ consists of a valid store $S_2$ extending $S_1$ and a result $\mathit{result}$ whose type coincides with $[t_2^*]$.
Then the function instance is valid with function type $\mathit{functype}$.

Note

This rule states that, if appropriate pre-conditions about store and arguments are satisfied, then executing the host function must satisfy appropriate post-conditions about store and results. The post-conditions match the ones in the execution rule for invoking host functions.

Any store under which the function is invoked is assumed to be an extension of the current store. That way, the function itself is able to make sufficient assumptions about future stores.

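Table Instances $\{\mathsf{elem}~(\mathit{fa}^?)^n, \mathsf{max}~m^?\}$

For each optional function address $\mathit{fa}^?_i$ in the table elements $(\mathit{fa}^?)^n$:
  Either $\mathit{fa}^?_i$ is empty.
  Or the external value $\mathsf{func}~\mathit{fa}$ must be valid with some external type $\mathsf{func}~\mathit{ft}$.
The limits $\{\mathsf{min}~n, \mathsf{max}~m^?\}$ must be valid.
Then the table instance is valid with table type $\{\mathsf{min}~n, \mathsf{max}~m^?\}~\mathsf{funcref}$.

$\frac{((S \vdash \mathsf{func}~\mathit{fa} : \mathsf{func}~\mathit{ft})^?)^n \qquad \vdash \{\mathsf{min}~n, \mathsf{max}~m^?\}~\mathrm{ok}}{S \vdash \{\mathsf{elem}~(\mathit{fa}^?)^n, \mathsf{max}~m^?\} : \{\mathsf{min}~n, \mathsf{max}~m^?\}~\mathsf{funcref}}$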

Memory Instances $\{\mathsf{data}~b^n, \mathsf{max}~m^?\}$

The limits $\{\mathsf{min}~n, \mathsf{max}~m^?\}$ must be valid.
Then the memory instance is valid with memory type $\{\mathsf{min}~n, \mathsf{max}~m^?\}$.

$\frac{\vdash \{\mathsf{min}~n, \mathsf{max}~m^?\}~\mathrm{ok}}{S \vdash \{\mathsf{data}~b^n, \mathsf{max}~m^?\} : \{\mathsf{min}~n, \mathsf{max}~m^?\}}$

Global Instances $\{\mathsf{value}~(t.\mathsf{const}~c), \mathsf{mut}~\mathit{mut}\}$

The global instance is valid with global type $\mathit{mut}~t$.

$S \vdash \{\mathsf{value}~(t.\mathsf{const}~c), \mathsf{mut}~\mathit{mut}\} : \mathit{mut}~t$

Export Instances $\{\mathsf{name}~\mathit{name}, \mathsf{value}~\mathit{externval}\}$

The external value $\mathit{externval}$ must be valid with some external type $\mathit{externtype}$.
Then the export instance is valid.

$\frac{S \vdash \mathit{externval} : \mathit{externtype}}{S \vdash \{\mathsf{name}~\mathit{name}, \mathsf{value}~\mathit{externval}\}~\mathrm{ok}}$

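Module Instances $\mathit{moduleinst}$

Each function type $\mathit{functype}_i$ in $\mathit{moduleinst}.\mathsf{types}$ must be valid.
For each function address $\mathit{funcaddr}_i$ in $\mathit{moduleinst}.\mathsf{funcaddrs}$, the external value $\mathsf{func}~\mathit{funcaddr}_i$ must be valid with some external type $\mathsf{func}~\mathit{functype}'_i$.
For each table address $\mathit{tableaddr}_i$ in $\mathit{moduleinst}.\mathsf{tableaddrs}$, the external value $\mathsf{table}~\mathit{tableaddr}_i$ must be valid with some external type $\mathsf{table}~\mathit{tabletype}_i$.
For each memory address $\mathit{memaddr}_i$ in $\mathit{moduleinst}.\mathsf{memaddrs}$, the external value $\mathsf{mem}~\mathit{memaddr}_i$ must be valid with some external type $\mathsf{mem}~\mathit{memtype}_i$.
For each global address $\mathit{globaladdr}_i$ in $\mathit{moduleinst}.\mathsf{globaladdrs}$, the external value $\mathsf{global}~\mathit{globaladdr}_i$ must be valid with some external type $\mathsf{global}~\mathit{globaltype}_i$.
Each export instance $\mathit{exportinst}_i$ in $\mathit{moduleinst}.\mathsf{exports}$ must be valid.
For each export instance $\mathit{exportinst}_i$ in $\mathit{moduleinst}.\mathsf{exports}$, the name $\mathit{exportinst}_i.\mathsf{name}$ must be different from any other name occurring in $\mathit{moduleinst}.\mathsf{exports}$.
Let ${\mathit{functype}'}^*$ be the concatenation of all $\mathit{functype}'_i$ in order.
Let $\mathit{tabletype}^*$ be the concatenation of all $\mathit{tabletype}_i$ in order.
Let $\mathit{memtype}^*$ be the concatenation of all $\mathit{memtype}_i$ in order.
Let $\mathit{globaltype}^*$ be the concatenation of all $\mathit{globaltype}_i$ in order.
Then the module instance is valid with context $\{\mathsf{types}~\mathit{functype}^*, \mathsf{funcs}~{\mathit{functype}'}^*, \mathsf{tables}~\mathit{tabletype}^*, \mathsf{mems}~\mathit{memtype}^*, \mathsf{globals}~\mathit{globaltype}^*\}$.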

Configuration Validity

To relate the WebAssembly type system to its execution semantics, the typing rules for instructions must be extended to configurations $S; T$, which relate the store to execution threads.

Configurations and threads are classified by their result type. In addition to the store $S$, threads are typed under a return type $\mathit{resulttype}^?$, which controls whether and with which type a $\mathsf{return}$ instruction is allowed. This type is absent ($\epsilon$) except for instruction sequences inside an administrative $\mathsf{frame}$ instruction.

Finally, frames are classified with frame contexts, which extend the module contexts of a frame’s associated module instance with the locals that the frame contains.

Configurations $S; T$

The store $S$ must be valid.
Under no allowed return type, the thread $T$ must be valid with some result type $[t^?]$.
Then the configuration is valid with the result type $[t^?]$.

$\frac{\vdash S~\mathrm{ok} \qquad S; \epsilon \vdash T : [t^?]}{\vdash S; T : [t^?]}$

Threads $F; \mathit{instr}^*$

Let $\mathit{resulttype}^?$ be the current allowed return type.
The frame $F$ must be valid with a context $C$.
Let $C'$ be the same context as $C$, but with $\mathsf{return}$ set to $\mathit{resulttype}^?$.
Under context $C'$, the instruction sequence $\mathit{instr}^*$ must be valid with some type $[] \to [t^?]$.
Then the thread is valid with the result type $[t^?]$.

$\frac{S \vdash F : C \qquad S; C, \mathsf{return}~\mathit{resulttype}^? \vdash \mathit{instr}^* : [] \to [t^?]}{S; \mathit{resulttype}^? \vdash F; \mathit{instr}^* : [t^?]}$

Frames $\{\mathsf{locals}~\mathit{val}^*, \mathsf{module}~\mathit{moduleinst}\}$

The module instance $\mathit{moduleinst}$ must be valid with some module context $C$.
Each value $\mathit{val}_i$ in $\mathit{val}^*$ must be valid with some value type $t_i$.
Let $t^*$ be the concatenation of all $t_i$ in order.
Let $C'$ be the same context as $C$, but with the value types $t^*$ prepended to the $\mathsf{locals}$ vector.
Then the frame is valid with frame context $C'$.

$\frac{S \vdash \mathit{moduleinst} : C \qquad (\vdash \mathit{val} : t)^*}{S \vdash \{\mathsf{locals}~\mathit{val}^*, \mathsf{module}~\mathit{moduleinst}\} : (C, \mathsf{locals}~t^*)}$

Administrative Instructions

Typing rules for administrative instructions are specified as follows. In addition to the context $C$, typing of these instructions is defined under a given store $S$. To that end, all previous typing judgements $C \vdash \mathit{prop}$ are generalized to include the store, as in $S; C \vdash \mathit{prop}$, by implicitly adding $S$ to all rules; $S$ is never modified by the pre-existing rules, but it is accessed in the extra rules for administrative instructions given below.

$\mathsf{trap}$

The instruction is valid with type $[t_1^*] \to [t_2^*]$, for any sequences of value types $t_1^*$ and $t_2^*$.

$S; C \vdash \mathsf{trap} : [t_1^*] \to [t_2^*]$

$\mathsf{invoke}~\mathit{funcaddr}$

The external function value $\mathsf{func}~\mathit{funcaddr}$ must be valid with external function type $\mathsf{func}~([t_1^*] \to [t_2^*])$.
Then the instruction is valid with type $[t_1^*] \to [t_2^*]$.

$\frac{S \vdash \mathsf{func}~\mathit{funcaddr} : \mathsf{func}~[t_1^*] \to [t_2^*]}{S; C \vdash \mathsf{invoke}~\mathit{funcaddr} : [t_1^*] \to [t_2^*]}$

$\mathsf{init\_elem}~\mathit{tableaddr}~o~x^n$

The external table value $\mathsf{table}~\mathit{tableaddr}$ must be valid with some external table type $\mathsf{table}~\mathit{limits}~\mathsf{funcref}$.
The index $o + n$ must be smaller than or equal to $\mathit{limits}.\mathsf{min}$.
The module instance $\mathit{moduleinst}$ must be valid with some context $C$.
Each function index $x_i$ in $x^n$ must be defined in the context $C$.
Then the instruction is valid.

$\frac{S \vdash \mathsf{table}~\mathit{tableaddr} : \mathsf{table}~\mathit{limits}~\mathsf{funcref} \qquad o + n \leq \mathit{limits}.\mathsf{min} \qquad (C.\mathsf{funcs}[x] = \mathit{functype})^n}{S; C \vdash \mathsf{init\_elem}~\mathit{tableaddr}~o~x^n~\mathrm{ok}}$

$\mathsf{init\_data}~\mathit{memaddr}~o~b^n$

The external memory value $\mathsf{mem}~\mathit{memaddr}$ must be valid with some external memory type $\mathsf{mem}~\mathit{limits}$.
The index $o + n$ must be smaller than or equal to $\mathit{limits}.\mathsf{min}$ multiplied by the page size $64\,\mathrm{Ki}$.
Then the instruction is valid.

$\frac{S \vdash \mathsf{mem}~\mathit{memaddr} : \mathsf{mem}~\mathit{limits} \qquad o + n \leq \mathit{limits}.\mathsf{min} \cdot 64\,\mathrm{Ki}}{S; C \vdash \mathsf{init\_data}~\mathit{memaddr}~o~b^n~\mathrm{ok}}$

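$\mathsf{label}_n\{\mathit{instr}_0^*\}~\mathit{instr}^*~\mathsf{end}$

The instruction sequence $\mathit{instr}_0^*$ must be valid with some type $[t_1^n] \to [t_2^?]$.
Let $C'$ be the same context as $C$, but with the result type $[t_1^n]$ prepended to the $\mathsf{labels}$ vector.
Under context $C'$, the instruction sequence $\mathit{instr}^*$ must be valid with type $[] \to [t_2^?]$.
Then the compound instruction is valid with type $[] \to [t_2^?]$.

$\frac{S; C \vdash \mathit{instr}_0^* : [t_1^n] \to [t_2^?] \qquad S; C, \mathsf{labels}~[t_1^n] \vdash \mathit{instr}^* : [] \to [t_2^?]}{S; C \vdash \mathsf{label}_n\{\mathit{instr}_0^*\}~\mathit{instr}^*~\mathsf{end} : [] \to [t_2^?]}$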

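$\mathsf{frame}_n\{F\}~\mathit{instr}^*~\mathsf{end}$

Under the return type $[t^n]$, the thread $F; \mathit{instr}^*$ must be valid with result type $[t^n]$.
Then the compound instruction is valid with type $[] \to [t^n]$.

$\frac{S; [t^n] \vdash F; \mathit{instr}^* : [t^n]}{S; C \vdash \mathsf{frame}_n\{F\}~\mathit{instr}^*~\mathsf{end} : [] \to [t^n]}$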

Store Extension

Programs can mutate the store and its contained instances. Any such modification must respect certain invariants, such as not removing allocated instances or changing immutable definitions. While these invariants are inherent to the execution semantics of WebAssembly instructions and modules, host functions do not automatically adhere to them. Consequently, the required invariants must be stated as explicit constraints on the invocation of host functions. Soundness only holds when the embedder ensures these constraints.

The necessary constraints are codified by the notion of store extension: a store state $S'$ extends state $S$, written $S \preceq S'$, when the following rules hold.

Note

Extension does not imply that the new store is valid, which is defined separately above.

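Store $S$

The length of $S.\mathsf{funcs}$ must not shrink.
The length of $S.\mathsf{tables}$ must not shrink.
The length of $S.\mathsf{mems}$ must not shrink.
The length of $S.\mathsf{globals}$ must not shrink.
For each function instance $\mathit{funcinst}_i$ in the original $S.\mathsf{funcs}$, the new function instance must be an extension of the old.
For each table instance $\mathit{tableinst}_i$ in the original $S.\mathsf{tables}$, the new table instance must be an extension of the old.
For each memory instance $\mathit{meminst}_i$ in the original $S.\mathsf{mems}$, the new memory instance must be an extension of the old.
For each global instance $\mathit{globalinst}_i$ in the original $S.\mathsf{globals}$, the new global instance must be an extension of the old.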

Function Instance $\mathit{funcinst}$

A function instance must remain unchanged.

$\vdash \mathit{funcinst} \preceq \mathit{funcinst}$

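Table Instance $\mathit{tableinst}$

The length of $\mathit{tableinst}.\mathsf{elem}$ must not shrink.
The value of $\mathit{tableinst}.\mathsf{max}$ must remain unchanged.

$\frac{n_1 \leq n_2}{\vdash \{\mathsf{elem}~(\mathit{fa}_1^?)^{n_1}, \mathsf{max}~m\} \preceq \{\mathsf{elem}~(\mathit{fa}_2^?)^{n_2}, \mathsf{max}~m\}}$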

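Memory Instance $\mathit{meminst}$

The length of $\mathit{meminst}.\mathsf{data}$ must not shrink.
The value of $\mathit{meminst}.\mathsf{max}$ must remain unchanged.

$\frac{n_1 \leq n_2}{\vdash \{\mathsf{data}~b_1^{n_1}, \mathsf{max}~m\} \preceq \{\mathsf{data}~b_2^{n_2}, \mathsf{max}~m\}}$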

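Global Instance $\mathit{globalinst}$

The mutability $\mathit{globalinst}.\mathsf{mut}$ must remain unchanged.
The value type of the value $\mathit{globalinst}.\mathsf{value}$ must remain unchanged.
If $\mathit{globalinst}.\mathsf{mut}$ is $\mathsf{const}$, then the value $\mathit{globalinst}.\mathsf{value}$ must remain unchanged.

$\frac{\mathit{mut} = \mathsf{var} \vee c_1 = c_2}{\vdash \{\mathsf{value}~(t.\mathsf{const}~c_1), \mathsf{mut}~\mathit{mut}\} \preceq \{\mathsf{value}~(t.\mathsf{const}~c_2), \mathsf{mut}~\mathit{mut}\}}$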
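
An embedder that wants to check a host function's effect on the store against these rules could use predicates along the following lines. This is a hedged sketch: the record classes and the `*_extends` function names are hypothetical, and each instance is modelled with only the fields that the extension rules mention.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TableInst:
    elem: Tuple[Optional[int], ...]  # optional function addresses
    max: Optional[int]


@dataclass(frozen=True)
class MemInst:
    data: bytes
    max: Optional[int]


@dataclass(frozen=True)
class GlobalInst:
    valtype: str  # e.g. "i32"
    value: int
    mut: str      # "const" or "var"


def table_extends(old: TableInst, new: TableInst) -> bool:
    # The element vector may grow but not shrink; max is fixed.
    return len(old.elem) <= len(new.elem) and old.max == new.max


def mem_extends(old: MemInst, new: MemInst) -> bool:
    # The data may grow but not shrink; max is fixed.
    return len(old.data) <= len(new.data) and old.max == new.max


def global_extends(old: GlobalInst, new: GlobalInst) -> bool:
    # Mutability and value type are fixed; only var globals may change value.
    return (old.mut == new.mut
            and old.valtype == new.valtype
            and (old.mut == "var" or old.value == new.value))
```

As the note above points out, passing these checks says nothing about whether the new store is valid; validity has to be established separately.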

Theorems

Given the definition of valid configurations, the standard soundness theorems hold. [2]

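Theorem (Preservation). If a configuration $S; T$ is valid with result type $[t^*]$ (i.e., $\vdash S; T : [t^*]$), and steps to $S'; T'$ (i.e., $S; T \hookrightarrow S'; T'$), then $S'; T'$ is a valid configuration with the same result type (i.e., $\vdash S'; T' : [t^*]$). Furthermore, $S'$ is an extension of $S$ (i.e., $\vdash S \preceq S'$).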

A terminal thread is one whose sequence of instructions is a result. A terminal configuration is a configuration whose thread is terminal.

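Theorem (Progress). If a configuration $S; T$ is valid (i.e., $\vdash S; T : [t^*]$ for some result type $[t^*]$), then either it is terminal, or it can step to some configuration $S'; T'$ (i.e., $S; T \hookrightarrow S'; T'$).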

From Preservation and Progress the soundness of the WebAssembly type system follows directly.

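Corollary (Soundness). If a configuration $S; T$ is valid (i.e., $\vdash S; T : [t^*]$ for some result type $[t^*]$), then it either diverges or takes a finite number of steps to reach a terminal configuration $S'; T'$ (i.e., $S; T \hookrightarrow^* S'; T'$) that is valid with the same result type (i.e., $\vdash S'; T' : [t^*]$) and where $S'$ is an extension of $S$ (i.e., $\vdash S \preceq S'$).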

In other words, every thread in a valid configuration either runs forever, traps, or terminates with a result that has the expected type. Consequently, given a valid store, no computation defined by instantiation or invocation of a valid module can “crash” or otherwise (mis)behave in ways not covered by the execution semantics given in this specification.

[1]	The formalization and theorems are derived from the following article: Andreas Haas, Andreas Rossberg, Derek Schuff, Ben Titzer, Dan Gohman, Luke Wagner, Alon Zakai, JF Bastien, Michael Holman. Bringing the Web up to Speed with WebAssembly. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM 2017.

[2]	A machine-verified version of the formalization and soundness proof is described in the following article: Conrad Watt. Mechanising and Verifying the WebAssembly Specification. Proceedings of the 7th ACM SIGPLAN Conference on Certified Programs and Proofs (CPP 2018). ACM 2018.

A.6 Index of Types

Category	Constructor	Binary Opcode
Type index	$x$	(positive number as $\mathit{s32}$ or $\mathit{u32}$)
Value type	$\mathsf{i32}$	$\mathtt{0x7F}$ ($-1$ as $\mathit{s7}$)
Value type	$\mathsf{i64}$	$\mathtt{0x7E}$ ($-2$ as $\mathit{s7}$)
Value type	$\mathsf{f32}$	$\mathtt{0x7D}$ ($-3$ as $\mathit{s7}$)
Value type	$\mathsf{f64}$	$\mathtt{0x7C}$ ($-4$ as $\mathit{s7}$)
(reserved)		$\mathtt{0x7B}$ .. $\mathtt{0x71}$
Element type	$\mathsf{funcref}$	$\mathtt{0x70}$ ($-16$ as $\mathit{s7}$)
(reserved)		$\mathtt{0x6F}$ .. $\mathtt{0x61}$
Function type	$[\mathit{valtype}^*] \to [\mathit{valtype}^*]$	$\mathtt{0x60}$ ($-32$ as $\mathit{s7}$)
(reserved)		$\mathtt{0x5F}$ .. $\mathtt{0x41}$
Result type	$[\epsilon]$	$\mathtt{0x40}$ ($-64$ as $\mathit{s7}$)
Table type	$\mathit{limits}~\mathit{elemtype}$	(none)
Memory type	$\mathit{limits}$	(none)
Global type	$\mathit{mut}~\mathit{valtype}$	(none)
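
The "as $\mathit{s7}$" annotations in the table can be checked with a few lines of Python: each type opcode is a single byte below 0x80 that, read as a 7-bit two's-complement number, gives the listed negative value. The helper name `s7` and the `TYPE_OPCODES` dictionary are illustrative only.

```python
# Opcode bytes from the table above, keyed by byte value.
TYPE_OPCODES = {
    0x7F: "i32", 0x7E: "i64", 0x7D: "f32", 0x7C: "f64",
    0x70: "funcref", 0x60: "func", 0x40: "empty result",
}


def s7(byte: int) -> int:
    """Interpret a single byte below 0x80 as a 7-bit two's-complement integer."""
    if not 0 <= byte <= 0x7F:
        raise ValueError("not a single-byte s7 encoding")
    # Bit 6 is the sign bit of a 7-bit value.
    return byte - 0x80 if byte & 0x40 else byte
```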

A.7 Index of Instructions

Instruction	Binary Opcode	Type	Validation	Execution
$\mathsf{unreachable}$	$\mathtt{0x00}$	$[t_1^{*}] \to [t_2^{*}]$	validation	execution
$\mathsf{nop}$	$\mathtt{0x01}$	$[] \to []$	validation	execution
$\mathsf{block}~[t^{?}]$	$\mathtt{0x02}$	$[] \to [t^{*}]$	validation	execution
$\mathsf{loop}~[t^{?}]$	$\mathtt{0x03}$	$[] \to [t^{*}]$	validation	execution
$\mathsf{if}~[t^{?}]$	$\mathtt{0x04}$	$[\mathsf{i32}] \to [t^{*}]$	validation	execution
$\mathsf{else}$	$\mathtt{0x05}$
(reserved)	$\mathtt{0x06}$
(reserved)	$\mathtt{0x07}$
(reserved)	$\mathtt{0x08}$
(reserved)	$\mathtt{0x09}$
(reserved)	$\mathtt{0x0A}$
$\mathsf{end}$	$\mathtt{0x0B}$
$\mathsf{br}~l$	$\mathtt{0x0C}$	$[t_1^{*}~t^{?}] \to [t_2^{*}]$	validation	execution
$\mathsf{br\_if}~l$	$\mathtt{0x0D}$	$[t^{?}~\mathsf{i32}] \to [t^{?}]$	validation	execution
$\mathsf{br\_table}~l^{*}~l$	$\mathtt{0x0E}$	$[t_1^{*}~t^{?}~\mathsf{i32}] \to [t_2^{*}]$	validation	execution
$\mathsf{return}$	$\mathtt{0x0F}$	$[t_1^{*}~t^{?}] \to [t_2^{*}]$	validation	execution
$\mathsf{call}~x$	$\mathtt{0x10}$	$[t_1^{*}] \to [t_2^{*}]$	validation	execution
$\mathsf{call\_indirect}~x$	$\mathtt{0x11}$	$[t_1^{*}~\mathsf{i32}] \to [t_2^{*}]$	validation	execution
(reserved)	$\mathtt{0x12}$
(reserved)	$\mathtt{0x13}$
(reserved)	$\mathtt{0x14}$
(reserved)	$\mathtt{0x15}$
(reserved)	$\mathtt{0x16}$
(reserved)	$\mathtt{0x17}$
(reserved)	$\mathtt{0x18}$
(reserved)	$\mathtt{0x19}$
$\mathsf{drop}$	$\mathtt{0x1A}$	$[t] \to []$	validation	execution
$\mathsf{select}$	$\mathtt{0x1B}$	$[t~t~\mathsf{i32}] \to [t]$	validation	execution
(reserved)	$\mathtt{0x1C}$
(reserved)	$\mathtt{0x1D}$
(reserved)	$\mathtt{0x1E}$
(reserved)	$\mathtt{0x1F}$
$\mathsf{local.get}~x$	$\mathtt{0x20}$	$[] \to [t]$	validation	execution
$\mathsf{local.set}~x$	$\mathtt{0x21}$	$[t] \to []$	validation	execution
$\mathsf{local.tee}~x$	$\mathtt{0x22}$	$[t] \to [t]$	validation	execution
$\mathsf{global.get}~x$	$\mathtt{0x23}$	$[] \to [t]$	validation	execution
$\mathsf{global.set}~x$	$\mathtt{0x24}$	$[t] \to []$	validation	execution
(reserved)	$\mathtt{0x25}$
(reserved)	$\mathtt{0x26}$
(reserved)	$\mathtt{0x27}$
$\mathsf{i32.load}~\mathit{memarg}$	$\mathtt{0x28}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i64.load}~\mathit{memarg}$	$\mathtt{0x29}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{f32.load}~\mathit{memarg}$	$\mathtt{0x2A}$	$[\mathsf{i32}] \to [\mathsf{f32}]$	validation	execution
$\mathsf{f64.load}~\mathit{memarg}$	$\mathtt{0x2B}$	$[\mathsf{i32}] \to [\mathsf{f64}]$	validation	execution
$\mathsf{i32.load8\_s}~\mathit{memarg}$	$\mathtt{0x2C}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i32.load8\_u}~\mathit{memarg}$	$\mathtt{0x2D}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i32.load16\_s}~\mathit{memarg}$	$\mathtt{0x2E}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i32.load16\_u}~\mathit{memarg}$	$\mathtt{0x2F}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i64.load8\_s}~\mathit{memarg}$	$\mathtt{0x30}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{i64.load8\_u}~\mathit{memarg}$	$\mathtt{0x31}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{i64.load16\_s}~\mathit{memarg}$	$\mathtt{0x32}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{i64.load16\_u}~\mathit{memarg}$	$\mathtt{0x33}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{i64.load32\_s}~\mathit{memarg}$	$\mathtt{0x34}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{i64.load32\_u}~\mathit{memarg}$	$\mathtt{0x35}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution
$\mathsf{i32.store}~\mathit{memarg}$	$\mathtt{0x36}$	$[\mathsf{i32}~\mathsf{i32}] \to []$	validation	execution
$\mathsf{i64.store}~\mathit{memarg}$	$\mathtt{0x37}$	$[\mathsf{i32}~\mathsf{i64}] \to []$	validation	execution
$\mathsf{f32.store}~\mathit{memarg}$	$\mathtt{0x38}$	$[\mathsf{i32}~\mathsf{f32}] \to []$	validation	execution
$\mathsf{f64.store}~\mathit{memarg}$	$\mathtt{0x39}$	$[\mathsf{i32}~\mathsf{f64}] \to []$	validation	execution
$\mathsf{i32.store8}~\mathit{memarg}$	$\mathtt{0x3A}$	$[\mathsf{i32}~\mathsf{i32}] \to []$	validation	execution
$\mathsf{i32.store16}~\mathit{memarg}$	$\mathtt{0x3B}$	$[\mathsf{i32}~\mathsf{i32}] \to []$	validation	execution
$\mathsf{i64.store8}~\mathit{memarg}$	$\mathtt{0x3C}$	$[\mathsf{i32}~\mathsf{i64}] \to []$	validation	execution
$\mathsf{i64.store16}~\mathit{memarg}$	$\mathtt{0x3D}$	$[\mathsf{i32}~\mathsf{i64}] \to []$	validation	execution
$\mathsf{i64.store32}~\mathit{memarg}$	$\mathtt{0x3E}$	$[\mathsf{i32}~\mathsf{i64}] \to []$	validation	execution
$\mathsf{memory.size}$	$\mathtt{0x3F}$	$[] \to [\mathsf{i32}]$	validation	execution
$\mathsf{memory.grow}$	$\mathtt{0x40}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i32.const}~\mathit{i32}$	$\mathtt{0x41}$	$[] \to [\mathsf{i32}]$	validation	execution
$\mathsf{i64.const}~\mathit{i64}$	$\mathtt{0x42}$	$[] \to [\mathsf{i64}]$	validation	execution
$\mathsf{f32.const}~\mathit{f32}$	$\mathtt{0x43}$	$[] \to [\mathsf{f32}]$	validation	execution
$\mathsf{f64.const}~\mathit{f64}$	$\mathtt{0x44}$	$[] \to [\mathsf{f64}]$	validation	execution
$\mathsf{i32.eqz}$	$\mathtt{0x45}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.eq}$	$\mathtt{0x46}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.ne}$	$\mathtt{0x47}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.lt\_s}$	$\mathtt{0x48}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.lt\_u}$	$\mathtt{0x49}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.gt\_s}$	$\mathtt{0x4A}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.gt\_u}$	$\mathtt{0x4B}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.le\_s}$	$\mathtt{0x4C}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.le\_u}$	$\mathtt{0x4D}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.ge\_s}$	$\mathtt{0x4E}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.ge\_u}$	$\mathtt{0x4F}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.eqz}$	$\mathtt{0x50}$	$[\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.eq}$	$\mathtt{0x51}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.ne}$	$\mathtt{0x52}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.lt\_s}$	$\mathtt{0x53}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.lt\_u}$	$\mathtt{0x54}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.gt\_s}$	$\mathtt{0x55}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.gt\_u}$	$\mathtt{0x56}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.le\_s}$	$\mathtt{0x57}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.le\_u}$	$\mathtt{0x58}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.ge\_s}$	$\mathtt{0x59}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.ge\_u}$	$\mathtt{0x5A}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f32.eq}$	$\mathtt{0x5B}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f32.ne}$	$\mathtt{0x5C}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f32.lt}$	$\mathtt{0x5D}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f32.gt}$	$\mathtt{0x5E}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f32.le}$	$\mathtt{0x5F}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f32.ge}$	$\mathtt{0x60}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f64.eq}$	$\mathtt{0x61}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f64.ne}$	$\mathtt{0x62}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f64.lt}$	$\mathtt{0x63}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f64.gt}$	$\mathtt{0x64}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f64.le}$	$\mathtt{0x65}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{f64.ge}$	$\mathtt{0x66}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.clz}$	$\mathtt{0x67}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.ctz}$	$\mathtt{0x68}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.popcnt}$	$\mathtt{0x69}$	$[\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.add}$	$\mathtt{0x6A}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.sub}$	$\mathtt{0x6B}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.mul}$	$\mathtt{0x6C}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.div\_s}$	$\mathtt{0x6D}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.div\_u}$	$\mathtt{0x6E}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.rem\_s}$	$\mathtt{0x6F}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.rem\_u}$	$\mathtt{0x70}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.and}$	$\mathtt{0x71}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.or}$	$\mathtt{0x72}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.xor}$	$\mathtt{0x73}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.shl}$	$\mathtt{0x74}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.shr\_s}$	$\mathtt{0x75}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.shr\_u}$	$\mathtt{0x76}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.rotl}$	$\mathtt{0x77}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.rotr}$	$\mathtt{0x78}$	$[\mathsf{i32}~\mathsf{i32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.clz}$	$\mathtt{0x79}$	$[\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.ctz}$	$\mathtt{0x7A}$	$[\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.popcnt}$	$\mathtt{0x7B}$	$[\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.add}$	$\mathtt{0x7C}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.sub}$	$\mathtt{0x7D}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.mul}$	$\mathtt{0x7E}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.div\_s}$	$\mathtt{0x7F}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.div\_u}$	$\mathtt{0x80}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.rem\_s}$	$\mathtt{0x81}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.rem\_u}$	$\mathtt{0x82}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.and}$	$\mathtt{0x83}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.or}$	$\mathtt{0x84}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.xor}$	$\mathtt{0x85}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.shl}$	$\mathtt{0x86}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.shr\_s}$	$\mathtt{0x87}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.shr\_u}$	$\mathtt{0x88}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.rotl}$	$\mathtt{0x89}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.rotr}$	$\mathtt{0x8A}$	$[\mathsf{i64}~\mathsf{i64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{f32.abs}$	$\mathtt{0x8B}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.neg}$	$\mathtt{0x8C}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.ceil}$	$\mathtt{0x8D}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.floor}$	$\mathtt{0x8E}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.trunc}$	$\mathtt{0x8F}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.nearest}$	$\mathtt{0x90}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.sqrt}$	$\mathtt{0x91}$	$[\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.add}$	$\mathtt{0x92}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.sub}$	$\mathtt{0x93}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.mul}$	$\mathtt{0x94}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.div}$	$\mathtt{0x95}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.min}$	$\mathtt{0x96}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.max}$	$\mathtt{0x97}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.copysign}$	$\mathtt{0x98}$	$[\mathsf{f32}~\mathsf{f32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f64.abs}$	$\mathtt{0x99}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.neg}$	$\mathtt{0x9A}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.ceil}$	$\mathtt{0x9B}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.floor}$	$\mathtt{0x9C}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.trunc}$	$\mathtt{0x9D}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.nearest}$	$\mathtt{0x9E}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.sqrt}$	$\mathtt{0x9F}$	$[\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.add}$	$\mathtt{0xA0}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.sub}$	$\mathtt{0xA1}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.mul}$	$\mathtt{0xA2}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.div}$	$\mathtt{0xA3}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.min}$	$\mathtt{0xA4}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.max}$	$\mathtt{0xA5}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.copysign}$	$\mathtt{0xA6}$	$[\mathsf{f64}~\mathsf{f64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{i32.wrap\_i64}$	$\mathtt{0xA7}$	$[\mathsf{i64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.trunc\_f32\_s}$	$\mathtt{0xA8}$	$[\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.trunc\_f32\_u}$	$\mathtt{0xA9}$	$[\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.trunc\_f64\_s}$	$\mathtt{0xAA}$	$[\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i32.trunc\_f64\_u}$	$\mathtt{0xAB}$	$[\mathsf{f64}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.extend\_i32\_s}$	$\mathtt{0xAC}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.extend\_i32\_u}$	$\mathtt{0xAD}$	$[\mathsf{i32}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.trunc\_f32\_s}$	$\mathtt{0xAE}$	$[\mathsf{f32}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.trunc\_f32\_u}$	$\mathtt{0xAF}$	$[\mathsf{f32}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.trunc\_f64\_s}$	$\mathtt{0xB0}$	$[\mathsf{f64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{i64.trunc\_f64\_u}$	$\mathtt{0xB1}$	$[\mathsf{f64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{f32.convert\_i32\_s}$	$\mathtt{0xB2}$	$[\mathsf{i32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.convert\_i32\_u}$	$\mathtt{0xB3}$	$[\mathsf{i32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.convert\_i64\_s}$	$\mathtt{0xB4}$	$[\mathsf{i64}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.convert\_i64\_u}$	$\mathtt{0xB5}$	$[\mathsf{i64}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f32.demote\_f64}$	$\mathtt{0xB6}$	$[\mathsf{f64}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f64.convert\_i32\_s}$	$\mathtt{0xB7}$	$[\mathsf{i32}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.convert\_i32\_u}$	$\mathtt{0xB8}$	$[\mathsf{i32}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.convert\_i64\_s}$	$\mathtt{0xB9}$	$[\mathsf{i64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.convert\_i64\_u}$	$\mathtt{0xBA}$	$[\mathsf{i64}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{f64.promote\_f32}$	$\mathtt{0xBB}$	$[\mathsf{f32}] \to [\mathsf{f64}]$	validation	execution, operator
$\mathsf{i32.reinterpret\_f32}$	$\mathtt{0xBC}$	$[\mathsf{f32}] \to [\mathsf{i32}]$	validation	execution, operator
$\mathsf{i64.reinterpret\_f64}$	$\mathtt{0xBD}$	$[\mathsf{f64}] \to [\mathsf{i64}]$	validation	execution, operator
$\mathsf{f32.reinterpret\_i32}$	$\mathtt{0xBE}$	$[\mathsf{i32}] \to [\mathsf{f32}]$	validation	execution, operator
$\mathsf{f64.reinterpret\_i64}$	$\mathtt{0xBF}$	$[\mathsf{i64}] \to [\mathsf{f64}]$	validation	execution, operator
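
To illustrate how the Binary Opcode column is used when reading a function body, the following sketch maps a handful of opcodes from the table back to their mnemonics. It is a simplified illustration, not a conforming decoder: the names `OPCODES`, `read_u32`, and `disassemble` are assumptions introduced here, only a small excerpt of the table is covered, and immediates are read as unsigned LEB128 integers without range checks.

```python
from typing import List, Tuple

# Excerpt of the index: opcode -> (mnemonic, number of index immediates).
OPCODES = {
    0x0B: ("end", 0),
    0x1A: ("drop", 0),
    0x20: ("local.get", 1),
    0x21: ("local.set", 1),
    0x22: ("local.tee", 1),
    0x6A: ("i32.add", 0),
    0x6B: ("i32.sub", 0),
    0x6C: ("i32.mul", 0),
}


def read_u32(code: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 integer at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = code[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def disassemble(code: bytes) -> List[str]:
    """Turn a byte sequence into a list of textual instructions."""
    out: List[str] = []
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        pos += 1
        if opcode not in OPCODES:
            raise ValueError(f"opcode 0x{opcode:02X} is not in this excerpt")
        name, n_imm = OPCODES[opcode]
        operands = []
        for _ in range(n_imm):
            value, pos = read_u32(code, pos)
            operands.append(str(value))
        out.append(" ".join([name] + operands))
    return out


print(disassemble(bytes([0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B])))
```

For the byte sequence shown, the sketch yields `local.get 0`, `local.get 1`, `i32.add`, `end`.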

A.8 Index of Semantic Rules

Typing of Static Constructs

Construct	Judgement
Limits	$\vdash \mathit{limits} : k$
Function type	$\vdash \mathit{functype}~\mathrm{ok}$
Table type	$\vdash \mathit{tabletype}~\mathrm{ok}$
Memory type	$\vdash \mathit{memtype}~\mathrm{ok}$
Global type	$\vdash \mathit{globaltype}~\mathrm{ok}$
External type	$\vdash \mathit{externtype}~\mathrm{ok}$
Instruction	$S; C \vdash \mathit{instr} : \mathit{functype}$
Instruction sequence	$S; C \vdash \mathit{instr}^{*} : \mathit{functype}$
Expression	$C \vdash \mathit{expr} : \mathit{resulttype}$
Function	$C \vdash \mathit{func} : \mathit{functype}$
Table	$C \vdash \mathit{table} : \mathit{tabletype}$
Memory	$C \vdash \mathit{mem} : \mathit{memtype}$
Global	$C \vdash \mathit{global} : \mathit{globaltype}$
Element segment	$C \vdash \mathit{elem}~\mathrm{ok}$
Data segment	$C \vdash \mathit{data}~\mathrm{ok}$
Start function	$C \vdash \mathit{start}~\mathrm{ok}$
Export	$C \vdash \mathit{export} : \mathit{externtype}$
Export description	$C \vdash \mathit{exportdesc} : \mathit{externtype}$
Import	$C \vdash \mathit{import} : \mathit{externtype}$
Import description	$C \vdash \mathit{importdesc} : \mathit{externtype}$
Module	$\vdash \mathit{module} : \mathit{externtype}^{*} \to \mathit{externtype}^{*}$
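
As an informal illustration of the instruction-sequence judgement, the following sketch checks a straight-line sequence against the entries of the Type column in the index of instructions by simulating the operand stack. It is a deliberately restricted approximation: the table `TYPES` and the function `check_sequence` are hypothetical names, only a few monomorphic instructions are included, and the context $C$, control instructions, and stack-polymorphic instructions are ignored.

```python
from typing import List

# Instruction -> (operand types popped, result types pushed),
# copied from the Type column for a few monomorphic instructions.
TYPES = {
    "i32.const": ([], ["i32"]),
    "i64.const": ([], ["i64"]),
    "i32.add": (["i32", "i32"], ["i32"]),
    "i32.eqz": (["i32"], ["i32"]),
    "i64.eqz": (["i64"], ["i32"]),
    "i64.extend_i32_s": (["i32"], ["i64"]),
    "i32.wrap_i64": (["i64"], ["i32"]),
}


def check_sequence(instrs: List[str], params: List[str]) -> List[str]:
    """Return the operand stack left by instrs when started on params.

    Raises TypeError if an instruction does not find its operand types
    on top of the stack.
    """
    stack = list(params)
    for instr in instrs:
        pops, pushes = TYPES[instr]
        split = len(stack) - len(pops)
        if split < 0 or stack[split:] != pops:
            raise TypeError(f"{instr}: expected {pops} on top of {stack}")
        del stack[split:]      # consume the operands
        stack.extend(pushes)   # produce the results
    return stack


# The sequence below has type [] -> [i64] in this simplified model.
print(check_sequence(["i32.const", "i32.const", "i32.add", "i64.extend_i32_s"], []))
```

A sequence such as `i64.const` followed by `i32.eqz` is rejected by the sketch, because `i32.eqz` expects an `i32` operand.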

Typing of Runtime Constructs

Construct	Judgement
Value	$\vdash \mathit{val} : \mathit{valtype}$
Result	$\vdash \mathit{result} : \mathit{resulttype}$
External value	$S \vdash \mathit{externval} : \mathit{externtype}$
Function instance	$S \vdash \mathit{funcinst} : \mathit{functype}$
Table instance	$S \vdash \mathit{tableinst} : \mathit{tabletype}$
Memory instance	$S \vdash \mathit{meminst} : \mathit{memtype}$
Global instance	$S \vdash \mathit{globalinst} : \mathit{globaltype}$
Export instance	$S \vdash \mathit{exportinst}~\mathrm{ok}$
Module instance	$S \vdash \mathit{moduleinst} : C$
Store	$\vdash \mathit{store}~\mathrm{ok}$
Configuration	$\vdash \mathit{config}~\mathrm{ok}$
Thread	$S; \mathit{resulttype}^{?} \vdash \mathit{thread} : \mathit{resulttype}$
Frame	$S \vdash \mathit{frame} : C$

Constantness

Construct	Judgement
Constant expression	$C \vdash \mathit{expr}~\mathrm{const}$
Constant instruction	$C \vdash \mathit{instr}~\mathrm{const}$

Import Matching

Construct	Judgement
Limits	$\vdash \mathit{limits}_1 \leq \mathit{limits}_2$
External type	$\vdash \mathit{externtype}_1 \leq \mathit{externtype}_2$
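
For limits, the import-matching rule of this specification requires that the minimum of the provided limits is at least the required minimum, and that either no maximum is required or both maxima are present and the provided one does not exceed the required one. The following sketch restates that rule executably; the class `Limits` and the function `limits_match` are illustrative names assumed here, and an absent maximum is modelled as `None`.

```python
from typing import NamedTuple, Optional


class Limits(NamedTuple):
    min: int
    max: Optional[int] = None  # None models an absent maximum


def limits_match(l1: Limits, l2: Limits) -> bool:
    """Return True if l1 matches l2, i.e. l1 may be supplied where l2 is required."""
    if l1.min < l2.min:
        return False
    if l2.max is None:
        return True
    return l1.max is not None and l1.max <= l2.max


# A memory with limits {min 2, max 4} satisfies an import declared as {min 1, max 8}.
print(limits_match(Limits(2, 4), Limits(1, 8)))
# A memory without a maximum does not satisfy an import that requires one.
print(limits_match(Limits(1), Limits(1, 8)))
```

The asymmetry in the maximum check reflects that an importer may rely on the declared maximum as an upper bound, so an unbounded definition cannot stand in for a bounded one.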

Store Extension

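Construct	Judgement
Function instance	$\vdash \mathit{funcinst}_1 \preceq \mathit{funcinst}_2$
Table instance	$\vdash \mathit{tableinst}_1 \preceq \mathit{tableinst}_2$
Memory instance	$\vdash \mathit{meminst}_1 \preceq \mathit{meminst}_2$
Global instance	$\vdash \mathit{globalinst}_1 \preceq \mathit{globalinst}_2$
Store	$\vdash \mathit{store}_1 \preceq \mathit{store}_2$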

Execution

Construct	Judgement
Instruction	$S; F; \mathit{instr}^{*} \hookrightarrow S'; F'; {\mathit{instr}'}^{*}$
Expression	$S; F; \mathit{expr} \hookrightarrow S'; F'; \mathit{expr}'$

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example".

Informative notes begin with the word “Note” and are set apart from the normative text with class="note".

WebAssembly Core Specification

W3C Recommendation, 5 December 2019

Abstract

Status of this document

WebAssembly Specification

1. Introduction

1.1. Introduction

1.1.1. Design Goals

1.1.2. Scope

1.2. Security Considerations

1.2.1. Dependencies

1.3. Overview

1.3.1. Concepts

1.3.2. Semantic Phases

2. Structure

2.1. Conventions

2.1.1. Grammar Notation

2.1.2. Auxiliary Notation

2.1.3. Vectors

2.2. Values

2.2.1. Bytes

2.2.1.1. Conventions

2.2.2. Integers

2.2.2.1. Conventions

2.2.3. Floating-Point

2.2.3.1. Conventions

2.2.4. Names

2.2.4.1. Convention

2.3. Types

2.3.1. Value Types

2.3.1.1. Conventions

2.3.2. Result Types

2.3.3. Function Types

2.3.4. Limits

2.3.5. Memory Types

2.3.6. Table Types

2.3.7. Global Types

2.3.8. External Types

2.3.8.1. Conventions

2.4. Instructions

2.4.1. Numeric Instructions

2.4.1.1. Conventions

2.4.2. Parametric Instructions

2.4.3. Variable Instructions

2.4.4. Memory Instructions

2.4.5. Control Instructions

2.4.6. Expressions

2.5. Modules

2.5.1. Indices

2.5.1.1. Conventions

2.5.2. Types

2.5.3. Functions

2.5.4. Tables

2.5.5. Memories

2.5.6. Globals

2.5.7. Element Segments

2.5.8. Data Segments

2.5.9. Start Function

2.5.10. Exports

2.5.10.1. Conventions

2.5.11. Imports

3. Validation

3.1. Conventions

3.1.1. Contexts

3.1.2. Prose Notation

3.1.3. Formal Notation

3.2. Types

3.2.1. Limits

3.2.1.1. $\{\mathsf{min}~n, \mathsf{max}~m^{?}\}$

3.2.2. Function Types

3.2.2.1. $[t_1^{n}] \to [t_2^{m}]$

3.2.3. Table Types

3.2.3.1. limits elemtype

3.2.4. Memory Types

3.2.4.1. limits

3.2.5. Global Types

3.2.5.1. mut valtype

3.2.6. External Types

3.2.6.1. func functype

3.2.6.2. table tabletype

3.2.6.3. $\mathsf{mem}~\mathit{memtype}$

3.2.6.4. $\mathsf{global}~\mathit{globaltype}$

3.3.1.1. $t.\mathsf{const}~c$

3.3.1.2. $t.\mathit{unop}$

3.3.1.3. $t.\mathit{binop}$

3.3.1.4. $t.\mathit{testop}$

3.3.1.5. $t.\mathit{relop}$

3.3.1.6. $t_2.\mathit{cvtop}\_t_1\_\mathit{sx}^{?}$

3.3.2.1. $\mathsf{drop}$

3.3.2.2. $\mathsf{select}$

3.3.3.1. $\mathsf{local.get}~x$

3.3.3.2. $\mathsf{local.set}~x$

3.3.3.3. $\mathsf{local.tee}~x$

3.3.3.4. $\mathsf{global.get}~x$

3.3.3.5. $\mathsf{global.set}~x$

3.3.4.1. $t.\mathsf{load}~\mathit{memarg}$

3.3.4.2. $t.\mathsf{load}N\_\mathit{sx}~\mathit{memarg}$

3.3.4.3. $t.\mathsf{store}~\mathit{memarg}$

3.3.4.4. $t.\mathsf{store}N~\mathit{memarg}$

3.3.4.5. $\mathsf{memory.size}$

3.3.4.6. $\mathsf{memory.grow}$

3.3.5.1. $\mathsf{nop}$

3.3.5.2. $\mathsf{unreachable}$

3.3.5.3. $\mathsf{block}~[t^{?}]~\mathit{instr}^{*}~\mathsf{end}$

3.3.5.4. $\mathsf{loop}~[t^{?}]~\mathit{instr}^{*}~\mathsf{end}$

3.3.5.5. $\mathsf{if}~[t^{?}]~\mathit{instr}_1^{*}~\mathsf{else}~\mathit{instr}_2^{*}~\mathsf{end}$

3.3.5.6. $\mathsf{br}~l$

3.3.5.7. $\mathsf{br\_if}~l$

3.3.5.8. $\mathsf{br\_table}~l^{*}~l_N$

3.3.5.9. $\mathsf{return}$

3.3.5.10. $\mathsf{call}~x$

3.3.5.11. $\mathsf{call\_indirect}~x$

3.3.6.1. Empty Instruction Sequence: $\epsilon$

3.3.6.2. Non-empty Instruction Sequence: $\mathit{instr}^{*}~\mathit{instr}_N$

3.3.7.1. $\mathit{instr}^{*}~\mathsf{end}$

3.4.1.1. $\{\mathsf{type}~x, \mathsf{locals}~t^{*}, \mathsf{body}~\mathit{expr}\}$

3.4.2.1. $\{\mathsf{type}~\mathit{tabletype}\}$

3.4.3.1. $\{\mathsf{type}~\mathit{memtype}\}$

3.4.4.1. $\{\mathsf{type}~\mathit{mut}~t, \mathsf{init}~\mathit{expr}\}$

3.4.5.1. $\{\mathsf{table}~x, \mathsf{offset}~\mathit{expr}, \mathsf{init}~y^{*}\}$

3.4.6.1. $\{\mathsf{data}~x, \mathsf{offset}~\mathit{expr}, \mathsf{init}~b^{*}\}$

3.4.7.1. $\{\mathsf{func}~x\}$

3.4.8.1. $\{\mathsf{name}~\mathit{name}, \mathsf{desc}~\mathit{exportdesc}\}$

3.4.8.2. $\mathsf{func}~x$

3.4.8.3. $\mathsf{table}~x$

3.4.8.4. $\mathsf{mem}~x$

3.4.8.5. $\mathsf{global}~x$

3.4.9.1. $\{\mathsf{module}~\mathit{name}_1, \mathsf{name}~\mathit{name}_2, \mathsf{desc}~\mathit{importdesc}\}$

3.4.9.2. $\mathsf{func}~x$

3.4.9.3. $\mathsf{table}~\mathit{tabletype}$