Formal Verification, the Move Language, and the Move Prover
This is the third in a series of posts looking at the Move programming language. By now, we’ve covered an introduction to Move, and offered solutions to common issues we’ve encountered while auditing projects written in Move.
In this follow up piece, we take a deep dive into the formal verification of Move programs. Move has been designed from the ground up with verification in mind. We start by revisiting non-standard language constructs that actually help improve the security of Move programs. Subsequently, we walk through some examples of deductive formal verification in Move and highlight important lessons we learned.
You can follow along with this walkthrough using the code provided in this GitHub repository.
A Primer on the Move Language
Basics of Move
The Move language features many constructs that simplify the task of writing (provably) secure blockchain applications. We covered the basics of the Move language in a previous blog and the advantages it offers for writing secure smart contracts. We saw that, unlike most other smart contract programming languages, Move was designed from the ground up with verification in mind. In this follow up, we take a deep dive into formal verification of Move programs. We first revisit some of Move’s non-standard language constructs that help improve security, then walk through some examples of deductive formal verification in Move and highlight important lessons we learned while developing the examples.
The Move repository on GitHub contains a lightweight tutorial that covers basic setup and a sequence of examples for a simple token smart contract.
Restricted Set of Primitive Data Types
One distinctive feature of Move is its limited set of primitive data types. Unlike most general purpose programming languages, Move does not support signed integers, floating point numbers (IEEE-754), strings, or maps. Move also does not provide direct access to pointers, which makes formal verification significantly easier. This leaves the following list of Move primitive data types:
- Unsigned 8, 64, and 128 bit integers (denoted by
u8,u64, andu128). - Booleans (denoted by
bool) - A datatype
addressto represent addresses on the blockchain. Addresses in Move do not allow address arithmetic, cannot be modified, and cannot be converted to (or from) integers. Another restriction is that one can only use addresses that are obtained at runtime to access resources, not to access modules. These restrictions are highly desirable for efficient formal verification! - Reference types that can either be mutable or immutable. References are closely linked to Move’s ownership model, in that passing references does not transfer ownership.
User-Defined Types
Move is more conventional when it comes to user-defined data types. Data is mostly represented in structures (records) with typed fields as they are known from most programming languages:
For now, do not worry about the key and store attributes that Asset is declared with. These are called abilities and define how ownership and storage of values of type Asset must be handled; for example, a struct that doesn’t have the store ability must not be written into persistent blockchain storage. We will revisit abilities in greater detail in the section of Move’s ownership model.
Encapsulation, Abstract Data Types, and Friends
Move enforces a rather strict encapsulation of data within modules. In particular, you may only access the fields of a structure from functions that are defined within the same module. Also, structures can only be created or deconstructed in code that resides in the same module. In that sense, Move structure declarations are treated as abstract data types that hide their inner workings from code outside of “their” module. Functions within a module are private by default and thus can only be called from within that module. Any function can be declared public, making it accessible from external code. And there is a noteworthy third protection level that will become important later on: Modules can have friends. If we define a contract coin (example source in GitHub) that represents some asset and a coin_manager that provides infrastructure, we can befriend the coin and the coin_manager:
Functions that are declared as public(friend) can only be invoked from the contract itself and from all of the contract’s friends. In that way, we can grant the coin_manager contract augmented privileges compared to other contracts.
Fine-grained modules and carefully chosen friend specifications are important in Move, where restricting the operations on locally defined structures serves more purposes than abstraction and encapsulation:
- Self-contained modules make specification more understandable and more convincing. That becomes important when ghost variables are used to track “specification state”. In small modules, it is much easier to convince oneself that the ghost variable is actually tracking what it is supposed to track.
- The encapsulation provided by modules and friend declarations has important security implications. To understand those, let’s take a closer look at Move’s ownership model.
Ownership in Move
The Move language enforces strict rules on accessing global memory and has a dedicated ownership model built in. These rules have been designed to simplify development of secure Move programs as well as to simplify automated formal verification of those programs.
Before we go into details, let’s quickly dive into a major difference in data representation between the Ethereum and the Diem blockchain:
Persistent User Data on the Ethereum Blockchain
Ethereum distinguishes between contract and user addresses; data (including the bytecode of smart contracts itself) can be stored at contract addresses, whereas no such data can be attached to user addresses. A common pattern in Ethereum smart contracts is to include a mapping as a state variable that keeps track of user data. On a simplified token contract, that might look like this:
The entire transfer is implemented by modifying the mapping that is persisted on the blockchain under the SampleToken contract’s deployment address. In particular, the recipient has no control over anything stored in that mapping under her address; it is merely an index into a map that is maintained by the smart contract.
In Ethereum, data about a smart contract’s users is held in mappings that reside in the storage space allocated to the owning smart contract. Usually, Ethereum smart contracts maintain a (possibly large) mapping from addresses to the corresponding data (such as the user’s token balance, deposit, etc.). Those mappings do not distinguish between addresses that belong to a user and those that belong to another smart contract. For example, that’s why both users and smart contracts can hold tokens.
User Data in Move/Diem
In Diem based blockchains, the structuring is fundamentally different. Data belonging to a user account is stored in a storage area associated to that account. The storage area of each user account can be thought of as a mapping from contract addresses to the corresponding user data. In that way, data belonging to different smart contracts can independently co-exist in the user’s storage area. This structuring makes an explicit mapping data type unnecessary in Move - each contract can implicitly store and manipulate user data via that user’s address and the type of the underlying resource in that user’s storage area. A practical advantage of this structuring is that users need to explicitly allow smart contracts to store information to their accounts. Unsolicited airdropping, i.e. “transferring” tokens to a user account without the user’s consent, is impossible.
Abilities
Abilities are a crucial ingredient for making Move programs secure. Essentially, they restrict which operations a program may perform on the underlying data type. Take, for example, the following snippet that defines a module coin with a very simple Coin structure:
The Coin structure has key and store abilities attached to its declaration. The storage areas in Move blockchains hold structures, only. Each such structure is identified by its name and the address of its defining contract. If we declare a structure with the key ability, it becomes hashable, i.e. its contract address and name can serve as a key into the internal lookup tables that constitute the storage areas of addresses. In short: Structures that have the key ability can be found in global storage.
Moreover, anything that should be stored in global storage must have the store ability. All other data items are ephemeral, as they are freed by the VM once they are no longer used (or the transaction terminates).
While the key and store abilities are mainly focused on Move/Diem’s data handling, the remaining two abilities are fundamental for managing digital assets, i.e. data that is attributed a certain value. Let’s look at the details!
Move’s Ownership Model
Similar to Rust, Move uses an ownership model for deciding when an allocated data structure can be deleted. It is theoretically rooted in linear logic, which reasons about resources that can be passed around. Now imagine an instance of the Coin data structure from our previous example. In Move, any such value can represent an underlying asset (i.e. value) without any further security precautions:
- The
Coinstructure does not have thecopyability. Hence, instances ofCoincannot be copied. They cannot be passed into functions via a call-by-value and cannot be copied, otherwise. The code in functioncounterfeitertries to make money by simply duplicating any coin it receives:
That code will be rejected by the compiler:
To be absolutely sure, one might ask: What if that code wasn’t implemented in the Move language, but as a corresponding piece of bytecode? Then the Move compiler was not involved at all and could not detect the counterfeiting attempt, right?
Luckily, the Move VM’s bytecode is typed and keeps track of abilities. So even if an attacker circumvented the Move compiler by directly injecting bytecode, the Move VM’s bytecode verifier that runs on the validators and analyzes any byte code before it gets executed would reject it. So in essence, code that violates ability constraints can never be executed on the blockchain!
- The
Coinstructure also lacks thedropability. That makes the Move compiler & virtual machine reject execution of any code that would delete the coin instance. See this as a safety precaution that avoids accidentally loosing your digital assets. The only place where such a structure may be deleted is within code of its declaring module. Here and only here can ourCoinbe destructured and deleted:
Note the subtle difference to other programming languages: The code in Line 2 goes beyond pattern matching semantics in that it makes the field amount of the Coin structure available in a local variable. It also destroys the coin instance, i.e. the underlying digital asset is gone by that point. Only the information that was stored in it, is preserved in a local variable. To merely read the amount in a given coin instance without destroying the latter, one must use a field selector such as coin.amount.
Finally, the access restrictions provided by Move modules complete the ownership story: Structures declared within a module can only be created (also called “packed”) within the code of that module. They cannot be dropped and the only way to delete them is to destructure them within their defining module. From outside the coin module, any such structure is an abstract data type whose inner structure, lifecycle, or contents remains locked.
Exception Handling (or the Lack Thereof)
Exceptions signal a failure that often arises from unexpected or invalid inputs or inconsistent state. They are thrown when the condition in an assert! statement evaluates to false, but they also arise on numeric over- and underflows, division by zero, and other system failures (e.g. when a resource does not exist). When an exception occurs, the current transaction aborts and rolls back all state changes it made. Move does not allow catching an exception and dealing with it (as Solidity does since version 0.6). This lack of exception handling facilitates formal verification, as it simplifies reasoning about control flow considerably. And even for humans, correctly tracking the locations where exceptions may be thrown and those where they potentially get caught (if they are caught at all), is a daunting task.
At this point, we have covered most language features that set Move apart from other commonly known smart contract programming languages; we also dived into some of the details that have specifically been built into Move to make it amenable to formal reasoning. So it is time to look into the actual topic of this blog post: How to formally prove or disprove properties of Move programs!
Formal Verification of Move Programs
Move comes bundled with the a dedicated tool for formal verification, the Move Prover. The language and the prover have been developed together and are tightly integrated.
The Move Prover And Its Specification Language
The Move programming language has a specification sub-language that allows programmers to state desired properties about Move programs. Its principles rest upon theoretical contributions from Floyd, Hoare, and Meyer starting in the 1960s. In particular, the Move Prover follows the design by contract paradigm. Internally, the Move Prover translates the Move program together with its specifications into a mathematical model, which is then formally verified using the deductive verifier Boogie. Boogie uses SMT solvers such as Z3 and CVC4 as decision procedures to discharge proof obligations. Once Boogie has completed formal verification, its results are mapped back to the level of the Move language and displayed to the programmer.
If you'd like to follow along using the examples in the Move Prover, the Move project's GitHub page has a detailed guide to installing the Move Prover. There is also a readme available with instructions for building the Move Prover from sources.
The Move Prover (J.E.Zhong, K. Cheang, S. Qadeer, W. Grieskamp, S. Blackshear, J. Park, Y. Zohar, C. Barrett, D. Dill. “The Move Prover”, CAV 2020)
The properties that need to be verified are written in the Move Specification Language. It has all the constructs necessary to reason about modules, data structures, functions, loops, and invariants. Formulas can be expressed in full first order logic. Exciting? Then let’s jump into the details!
Specifying Pre- and Post-Conditions on Move Functions
Augmenting a Move function with pre- and postconditions essentially forms a Hoare triple {P} F(x,y) {Q} stating that whenever function F is called in a state satisfying the predicates in P and terminates, its post-state satisfies the predicates in Q. (Note: This is known as a proof of partial correctness; it is partial, as we do not reason about non-terminating executions of F(x,y). If we also succeed in proving that F(x,y) terminates whenever it is started in a state satisfying P, we establish what is called total correctness.)
Let’s start with the very basics and take a look at the following Move function (example source in GitHub):
The function add is particularly simple, as it only refers to local parameters and does not access global resources). To capture its intended behavior, one might write the following specification:
Essentially, this says that whenever add is called, no matter what the values of x and y are (because we didn’t put any restrictions on them), it ensures that upon termination the returned result equals the sum of x and y. Lean back for a moment and think about whether that statement is actually true.
Perhaps surprisingly, the Move Prover verifies the specification without complaints. And while it obviously holds for “usual” inputs, you might have noticed that for large enough values of x and y, an overflow occurs when evaluating x + y. The reason for the Move Prover not detecting the overflow is that by default, it verifies partial correctness and regards the occurrence of a revert as a special case of non-termination. Essentially, the Move Prover disregards failing executions by default, and the exception caused by an overflow leads to a failing execution. We can control this behavior using the aborts_if_is_strict pragma, so let’s ask the Move Prover to take failing executions into account when verifying our specification:
Asking it again to verify the code above produces the expected counterexample, essentially highlighting the possibility that add reverts upon overflow:
The aborts_if_strict pragma can be set globally to ensure all specifications take reverts into account. We will work under this assumption from here on.
Let’s modify the specification to explicitly account for overflows. The Move Specification Language has a built-in primitive aborts_if to express conditions under which a function aborts. Specifying that a function aborts on overflows is nice; even nicer would be to rule out the possibility of such overflows in the first place. This can be achieved via requires specifications, which allow us to impose a precondition stating that function add may only be called with parameters x and y whose sum does not exceed 2^(64)-1:
Specification Subtleties
There are subtle differences between Move expression semantics and Move specification expression semantics. One such case is revealed in the example above. Whereas over- and underflows that occur during evaluation of arithmetic expressions such as x + y at runtime make the transaction abort, arithmetic expressions in specifications operate on mathematical integers. There, the finite-width integers u8, u64, and u128 are implicitly cast into a new data type num that has an unbounded integer domain, eliminating all over- and underflows. In the example above, it is this semantic difference that makes the aborts_if condition in Line 2 well defined.
Another crucial point is the restriction to “pure” statements within Move specifications. Specifications must be side-effect free apart from operations that may change specification (ghost) variables. Accordingly, one may use pure functions from the Move program within specifications; these are functions that do not modify global state and only use expressions that are well-formed in both Move and the Move specification language.
requires vs. aborts_if
You might wonder if it is necessary to specify both aborts_i x + y > MAX_U64 and requires x + y <= MAX_U64. And indeed, there is a subtle difference:
aborts_ifstates a condition that, when satisfied, will result in an abort and a state reversal. These specifications make all situations that may lead to an abort explicit.requiresstatements impose a constraint on all known call-sites of the function; as such, they are not restricted to reasoning about aborting transactions. In our example, each call site ofaddwithin the scope of the Move project gives rise to a proof obligation where the Move Prover has to prove that all conditions from the callee’srequirestatements are met. Stated differently, ourrequirespecification requires each caller of add to ensure that the requirements (i.e.,x + ydoes not overflow) are met before executing the call.
For internal calls, a require statement helps to avoid errors (and enforces some logical discipline), as the callers have to ensure that the requirements for the callee’s correct operation are met. That is not the case for aborts_if statements; they propagate up to the caller, but they do not require the caller to meet the callee’s requirements.
Specifying both essentially requires all known call sites (i.e., those that are within the scope of the project at compile time) to ensure the requirements are met while also specifying that the function may abort if invoked from external modules.
The Deductive Power of Move
The McCarthy 91 function is a popular example of a non-trivial recursive function. It takes an integer-valued input n and returns 91 if n is smaller than 100 and n - 10, otherwise.
This is a Move implementation and specification of that function (example source in GitHub):
Spend a minute or two to wrap your head around the function and convince yourself that it’s non-trivial. You might be surprised that the Move Prover (actually, the Boogie verifier and the Z3 SMT solver that do the reasoning) can prove its correctness as written above, with no additional annotations. It suffices to state the expected result in two ensures specifications. The prover succeeds in proving correctness almost instantaneously:
Opaque Specifications
You may have noticed the pragma opaque annotation in the specification of the McCarthy 91 function. That option controls how function calls are handled within proofs:
By default (i.e., without pragma opaque), the Boogie Verifier reasons about the body of each function that is called. That is technically achieved by inlining the function call, replacing the call site with a copy of the callee’s function body. This is advantageous whenever the callee is not exhaustively specified; by analyzing its body, the Move Prover has complete knowledge about the callee’s inner workings.
The pragma opaque option changes that behavior. Instead of inlining the callee, the Move Prover tries to prove all requirements of the callee (as specified in its preconditions) at the call site. If that proof succeeds, it immediately assumes that all ensures statements hold. The implementation of the function is not taken into account, but replaced entirely by the axiomatization provided via its specification.
Opaque specifications are the foundation of modular reasoning. In our case, specifying pragma opaque is necessary because our McCarthy 91 implementation is recursive, making the default approach of using inlining inapplicable.
Always Double Check
As we will see later, double checking such results is crucial, as specification errors could lead to subtle inconsistencies that would render any proofs vacuous. Let’s try to introduce a bug:
As we can see, the Move Prover finds a counterexample for n = 100, where our buggy version yields 90 instead of 91.
Specifications in Code Blocks
So far, we have used ensures and aborts_if to specify the desired result of functions. In all our previous examples, the Move Prover was able to deduce the validity of our specifications without any help. However, specifying the desired outcome of a function generally does not suffice for the prover to construct a proof. For more complex algorithms, we need to help in the proof derivation by providing additional specifications within the function’s body. We demonstrate this by specifying realistic properties of a standard bubble sort algorithm.
Bubble Sort (example source in GitHub)
The basic bubble sort algorithm operates on arrays. For simplicity, we’ll assume that the arrays contain no duplicate elements; handling duplicate elements makes the specification and proof significantly more complex than we have space for here. The basic idea is to traverses the array elements from left to right, swapping adjacent entries whenever the value at index k is larger than the value at index k+1. If we arrive at the end and made changes during the traversal, we have to start over again. We terminate once no swaps occur during a traversal:
Specifying Properties
We obviously require any sensible sorting algorithm to return the input, well, sorted. And we might also require that the result contains all the numbers from the input. To make sure that no new elements were added during the sort, we also require that the result be the same length as the input. That leads us to this specification:
Let’s take a closer look at the specification.
The first ensures condition states that all elements are sorted in increasing order. There is one pitfall in Move that is worth mentioning.
The Move Specification language defines ranges as a built-in datatype that is mainly intended for quantification (as we did above). The syntax for such ranges is m..n, where m and n are numbers. In Move, the range m..n denotes the set of numbers {m, m+1, m+2, ..., n-1}; n is not included! So a range of the form m..n defines a left-closed interval [m, n). This semantics will be familiar to you if you have worked with ranges in Python, but otherwise may be surprising.
Equipped with that semantic insight, the universal quantification indeed ranges over all indices but the last one and specifies an increasing sequence. Our second condition states that the resulting vector retains all numbers from the input. And our last condition states that the length of the vector remains the same.
Now let’s ask the Move Prover to verify our specification of bubble sort. Given our previous experience, the results may be surprising at first: all of our specifications are reported to be violated!
Violation of the sortedness property.
The Move Prover claims that elements can be lost.
The result may be a different length than the input.
Transaction may abort due to an underflow on empty inputs.
You may ask: Is our implementation really that buggy? And luckily, the answer is no (it is actually correct with respect to this specification, as we will prove later). So why are all these errors reported?
Internally, the Move Prover—like most formal verification tools that follow the same approach—has to deduce validity for each of the postconditions in our specifications from what it can infer from the function’s implementation. Sadly, but unavoidably, the knowledge that can be inferred automatically by the Move Prover (or any other deductive verification tool) is generally limited to a degree that it will not be able to prove arbitrary specifications. The tool needs your help!
As a downside, deductive verification tools (including the Move Prover) cannot generally distinguish between specifications they are unable to prove correct because of missing lemmas about the implementation, and specifications that are truly violated. In both cases, these tools report a violation and generate a counterexample.
Let’s take a step back and for now focus only on proving that the elements in the resulting vector are stored in increasing order:
Intuitively, we can help the Move Prover by providing an idea for proving the property along with a couple of guiding specifications. Consider the Boolean flag swapped. It is initialized to false at the beginning of a bubble sort iteration and set to true when arriving at the first pair of elements where elems[i] is larger than elems[i+1]. Once set, the swapped flag remains true until the end of the iteration.
Loop Invariants To The Rescue
One key observation is that as long as swapped remains false, no elements required swapping; that is, they already are in increasing order. We can express that formally as a loop invariant:
Note: The notation elems[k] is a shorthand for accessing the k-th vector element. It is only available within specifications and not part of the general Move language.
A subset of specifications can be used within code blocks. That includes:
assumestatements, which ask the Move Prover to treat the respective condition as unconditionally true. In that sense, anassumeprovides a “local” axiom that refer to a particular location within the Move program’s control flow and can be used in other derivation steps.assertstatements, which must be proven correct at their source code location. If their validity cannot be shown, the Move Prover aborts.invariantstatements, which can be used to annotate loops. These are an essential ingredient in the Move Specification language to express more advanced properties of algorithms. Aninvariantspecification requires the Move Prover to establish an inductive argument:- The induction base: show that the invariant condition holds when entering the loop; and
- The induction step: assuming that the invariant condition holds when entering the loop body (using the induction hypothesis), show that the invariant condition still holds at the end of the loop iteration.
Equipped with that machinery, let’s ask the Move Prover to establish the invariant:
