(hooks)= Hooks ===== Hooks are used to attach CVL code to certain low-level operations, such as loads and stores to specific storage variables. Each hook contains a pattern that describes what operations cause the hook to be invoked, and a block of code that is executed when the contract performs those operations. The remainder of this document describes the operation of hooks in detail. For examples of idiomatic hook usage, see {ref}`tracking-changes` and {ref}`using-opcodes`. ```{contents} ``` Syntax ------ The syntax for hooks is given by the following EBNF grammar: ``` hook ::= "hook" pattern block pattern ::= "Sstore" access_path param [ "(" param ")" ] | "Sload" param access_path | opcode [ "(" params ")" ] [ param ] access_path ::= id | [ id "." ] "(" "slot" number ")" | access_path "." id | access_path "[" "KEY" basic_type id "]" | access_path "[" "INDEX" basic_type id "]" | access_path "." "(" "offset" number ")" opcode ::= "ALL_SLOAD" | "ALL_SSTORE" | "ALL_TLOAD" | "ALL_TSTORE" | ... param ::= evm_type id ``` See {ref}`opcode-hooks` below for the list of available opcodes. See {doc}`statements` for information about the `statement` production; see {doc}`types` for the `evm_type` production; see {doc}`basics` for the `number` production. It is prohibited to have multiple hooks with the same hook pattern. Two hooks have the same hook pattern if both are `Sstore` hooks with the same access path, both are `Sload` hooks with the same access path, or both are opcode hooks with the same opcode. Doing so will result in an error like this: `The declared hook pattern <second hook> duplicates the hook pattern <first hook> at <spec file>. A hook pattern may only be used once.` Note that two access paths are considered to be "the same" if they resolve to the same storage address. Syntactically different access paths can alias, e.g., when accessing a member by name (`contract.member`) or by slot (`contract.(slot n)`). Examples -------- - [store hook example](https://github.com/Certora/Examples/blob/14668d39a6ddc67af349bc5b82f73db73349ef18/CVLByExample/ERC20/certora/specs/ERC20.spec#L117) - [load hook example](https://github.com/Certora/Examples/blob/14668d39a6ddc67af349bc5b82f73db73349ef18/CVLByExample/structs/BankAccounts/certora/specs/Bank.spec#L141) See {ref}`using-opcodes` and {ref}`tracking-changes` for additional examples. (load-hooks)= (store-hooks)= Load and store hooks -------------------- Load hooks are executed before a read from a specific location in storage, while store hooks are executed before a write to a specific location in storage. The locations to be matched are given by an access path, such as a contract variable, array index, or a slot number. See {ref}`access-paths` below for information on the available access paths. A load pattern contains the keyword `Sload`, followed by the type and name of a variable that will hold the loaded value, followed by an access path indicating the location that is read. For example, here is a load hook that will execute whenever a contract reads the value of `C.owner`: ```cvl hook Sload address o C.owner { ... } ``` Inside the body of this hook, the variable `o` will be bound to the value that was read. A store pattern contains the keyword `Sstore`, followed by an access path indicating the location that is being written to, followed by the type and name of a variable to hold the value that is being stored. Optionally, the pattern may also include the type and name of a variable to store the previous value that is being overwritten. For example, here is a store hook that will execute whenever a contract writes the value of `C.totalSupply`: ```cvl hook Sstore C.totalSupply uint ts (uint old_ts) { ... } ``` Inside the body of this hook, the variable `ts` will be bound to the value that is being written to the `totalSupply` variable, while `old_ts` is bound to the value that was stored there previously. If you do not need to refer to the old value, you can omit the variable declaration for it. For example, the following hook only binds the new value of `C.totalSupply`: ```cvl hook Sstore C.totalSupply uint ts { ... } ``` ```{note} A hook will **not** be triggered by CVL code, including access to solidity storage from CVL. ``` (access-paths)= ### Access paths The patterns for load and store hooks are fine-grained; they allow you to hook on accesses to specific contract variables or specific array, struct, or mapping accesses. Storage locations are designated by "access paths". An access path starts with either the name of a contract field, or a [slot number][storage-layout]. Contract fields may be qualified by the contract that defines them (e.g. `Contract.field`). If the contract name is omitted, it defaults to `currentContract`. The contract name may either be an instance variable introduced by a [using statement][using] or the name of a contract on the {term}`scene`. ```{note} If a contract inherits multiple variables with the same name, you cannot use that variable as an access path. ``` [storage-layout]: https://docs.soliditylang.org/en/v0.8.17/internals/layout_in_storage.html If the indicated location holds a struct, you can refer to a specific field of the struct by appending `.<field-name>` to the path. For example, the following hook will execute on every store to the `balance` field of the struct `C.owner`: ```cvl hook Sstore C.owner.balance uint b { ... } ``` If the indicated location holds an array, you can refer to an arbitrary element of the array by appending `[INDEX uint <variable>]`. This pattern will match any store to an element of the array, and will bind the named variable to the index of the access. For example, the following hook will execute on any write to the array `C.entries` and will update the corresponding entry of the ghost mapping `_entries` to match: ```cvl hook Sstore C.entries[INDEX uint i] uint e { _entries[i] = e; } ``` Additionally, if the indicated location holds a _dynamic_ array, you can refer to accesses to the length of the array with `.length`: ```cvl hook Sstore C.entries.length uint len { _len = len; // updates a ghost variable `_len` with the length is written to storage. } ``` Similarly, if the indicated location holds a mapping, you can refer to an arbitrary entry by appending `[KEY <type> <variable>]`. This pattern will match any write to the mapping, and will bind the named variable to the key. For example, the following hook will execute on any write to the mapping `C.balances`, and will update the `_balances` ghost accordingly: ```cvl hook Sstore C.balances[KEY address user] uint balance { _balances[user] = balance; } ``` Finally, there is the low-level access pattern `<base>.(offset <n>)` for matching loads and stores that are a specific number of bytes from the base. For example, the following hook will match writes to the third or fourth byte of slot 1 (these two bytes are matched because the type of the variable is `uint16`: ```cvl hook Sstore (slot 1).(offset 2) uint16 b { ... } ``` These different kinds of paths can be combined, subject to restrictions listed below. For example, the following hook will execute whenever the contract writes to the `balance` field of a struct in the `users` mapping of contract `C`: ```cvl hook C.users[KEY address user].balance uint v (uint old_value) { ... } ``` Inside the body of the hook, the variable `user` will refer to the address that was used as the key into the mapping `C.users`; the variable `v` will contain the value that is written, and the variable `old_value` will contain the value that was previously stored there. There are a few restrictions on the available combinations of low-level and high-level access paths: - You cannot access struct fields on access paths that contain `slot` or `offset` components, because the struct type is not known. - You can only use `KEY` and `INDEX` patterns on word-aligned access paths (i.e. any `offset` components must be multiples of 32). ```{note} The only available access paths for `solc` versions 5.17 and older are `slot` and `offset` paths. ``` ### Access path caveats In order to apply hooks correctly, the Prover must analyze the contract's use of storage. In some cases, especially in the presence of inline assembly `sload` and `sstore` instructions, the analysis will fail. In this case, hooks may not be applied. If storage analysis fails, you will see a message indicating the failure in the global problems view of the rule report. See {ref}`storage-and-memory-analysis` for more details. (rawhooks)= ### Hooking on all storage loads or stores Load and store hooks apply to reads and writes to specific storage locations. In some cases, it is useful to instrument every load or store, regardless of the location. The `ALL_SLOAD` and `ALL_SSTORE` opcode hooks are used for this purpose; they will be executed on every load and store instruction (in all contracts) respectively. See {ref}`opcode-hooks` below for the general syntax of opcode hooks. The `ALL_SLOAD` opcode hook takes one input `uint` argument containing the slot number of the load instruction, and has one `uint` output containing the value that is loaded from the slot. For example: ```cvl hook ALL_SLOAD(uint slot) uint val { ... } ``` The `ALL_SSTORE` opcode hook takes two input `uint` arguments; the first is the slot number of the store instruction, and the second is the value being stored. For example: ```cvl hook ALL_SSTORE(uint slot, uint val) { ... } ``` ```{note} The storage splitting optimization must be disabled using the {ref}`-enableStorageSplitting` option in order to use the `ALL_SLOAD` or `ALL_SSTORE` hooks. ``` If a load instruction matches an `Sload` hook pattern and there is also an `ALL_SLOAD` hook, then both hooks will be executed; the `Sload` hook will apply first, and then the `ALL_SLOAD` hook. Similarly, if a store would trigger both an `Sstore` pattern and an `ALL_SSTORE` pattern, the `Sstore` hook would be executed, followed by the `ALL_SSTORE` hook. ```{note} Just like the usual opcode hooks, the raw storage hooks are applied on all contracts. This means that a storage access on _any_ contract will trigger the hook. Therefore, in a rule that models calls to multiple contracts, if two contracts are accessing the same slot the same hook code will be called with the same slot number. ``` #### Notes on transient storage. In a similar vein to `ALL_SLOAD` and `ALL_SSTORE` hooks, CVL also allows to hook on `TLOAD` and `TSTORE` instructions for updating the transient storage, using the `ALL_TLOAD` and `ALL_TSTORE` hooks. The hooks for transient storage access share the syntax of their regular storage counterparts. Given that Solidity only allows (as of v0.8.25) access to transient storage via inline-assembly, and has no notion of structure to transient storage, Prover currently does not expose access-path based access like for the regular storage hooks. For more information on how the Prover models transient storage, please refer to the relevant section on {ref}`transient storage <transient-storage>`. (opcode-hooks)= EVM opcode hooks ---------------- % TODO DOC-354: document Create hooks (which are different from `CREATE1` / `CREATE2`) Opcode hooks are executed just after[^before-hooks] a contract executes a specific [EVM opcode][evm-opcodes]. An opcode hook pattern consists of the name of the opcode, followed by the inputs to the opcode (if any), followed by the type and variable name for the output (if any). [^before-hooks]: For halting instructions such as `REVERT` and `SELFDESTRUCT`, the hook is executed before the instruction instead of after. [evm-opcodes]: https://ethereum.org/en/developers/docs/evm/opcodes/ For example, the following hook will execute immediately after any contract executes the `EXTCODESIZE` instruction: ```cvl hook EXTCODESIZE(address addr) uint v { ... } ``` Within the body of the hook, the `addr` variable will be bound to the address argument to the opcode, and the variable `v` will be bound to the returned value of the opcode. ```{note} Opcode hooks are applied to _all_ contracts, not just the main contract under verification. ``` Opcode hooks have the same names, arguments, and return values as the corresponding [EVM opcodes][evm-opcodes], with the exception of the `CREATE1` hook, which corresponds to the `CREATE` opcode. Below is the set of supported opcode hook patterns: ```cvl hook ADDRESS address v hook BALANCE(address addr) uint v hook ORIGIN address v hook CALLER address v hook CALLVALUE uint v hook CODESIZE uint v hook CODECOPY(uint destOffset, uint offset, uint length) hook GASPRICE uint v hook EXTCODESIZE(address addr) uint v hook EXTCODECOPY(address b, uint destOffset, uint offset, uint length) hook EXTCODEHASH(address a) bytes32 hash hook BLOCKHASH(uint n) bytes32 hash hook COINBASE address v hook TIMESTAMP uint v hook NUMBER uint v hook DIFFICULTY uint v hook GASLIMIT uint v hook CHAINID uint v hook SELFBALANCE uint v hook BASEFEE uint v hook MSIZE uint v hook GAS uint v hook LOG0(uint offset, uint length) hook LOG1(uint offset, uint length, bytes32 t1) hook LOG2(uint offset, uint length, bytes32 t1, bytes32 t2) hook LOG3(uint offset, uint length, bytes32 t1, bytes32 t2, bytes32 t3) hook LOG4(uint offset, uint length, bytes32 t1, bytes32 t2, bytes32 t3, bytes32 t4) hook CREATE1(uint value, uint offset, uint length) address v hook CREATE2(uint value, uint offset, uint length, bytes32 salt) address v hook CALL(uint g, address addr, uint value, uint argsOffset, uint argsLength, uint retOffset, uint retLength) uint rc hook CALLCODE(uint g, address addr, uint value, uint argsOffset, uint argsLength, uint retOffset, uint retLength) uint rc hook DELEGATECALL(uint g, address addr, uint argsOffset, uint argsLength, uint retOffset, uint retLength) uint rc hook STATICCALL(uint g, address addr, uint argsOffset, uint argsLength, uint retOffset, uint retLength) uint rc hook REVERT(uint offset, uint size) hook BLOBHASH(uint n) bytes32 hash hook SELFDESTRUCT(address a) ``` % Note: I'm removing the following section and just replacing it by saying that % the above list are the only supported codes, since there seem to be many other % unsupported codes (e.g. `ADD` and friends) % ### Unsupported opcodes % % The standard stack-manipulating instructions `DUP*`, `SWAP*`, `PUSH*` and `POP` % are not modeled. `MLOAD` and `MSTORE` are also not modeled. (call-hooks)= ### Call hooks We provide hooks for all four call opcodes: [`CALL`](https://www.evm.codes/#f1), [`CALLCODE`](https://www.evm.codes/#f2), [`DELEGATECALL`](https://www.evm.codes/#f4), and [`STATICCALL`](https://www.evm.codes/#fa). The hook parameters match the stack inputs of the respective opcodes. The arguments for the call arguments and return values (`argsOffset`, `argsSize`, `retOffset`, and `retSize`) mostly exist for future use. Their values can be consumed, but currently they cannot be used to read data stored in memory before or after the call. These hooks can be very useful to establish sensible security invariants. As an example, suppose no external code should gain write access to the data of the current contract. Both `CALLCODE` and `DELEGATECALL` have the potential to do exactly that by calling into another contract but keeping the current context. Note that there are exceptions to this property whenever *trusted* 3rd party libraries are used. It might also be necessary to restrict these checks to the contract of interest. ```cvl hook CALLCODE(uint g, address addr, uint value, uint argsOffset, uint argsLength, uint retOffset, uint retLength) uint rc { // using CALLCODE is generally not expected in this contract assert (executingContract != currentContract, "we should not use `callcode`"); } hook DELEGATECALL(uint g, address addr, uint argsOffset, uint argsLength, uint retOffset, uint retLength) uint rc { // DELEGATECALL is used in this contract, but it only ever calls into itself assert (executingContract != currentContract || addr == currentContract, "we should only `delegatecall` into ourselves" ); } ``` ### Known inter-dependencies and common pitfalls Hooks are instrumented for every appearance of the matching instruction in the bytecode, as generated by a high-level source compiler such as the Solidity compiler. The behavior of the bytecode may sometimes be surprising. For example, every Solidity method call to a non-payable function will contain an early call to `CALLVALUE` to check that it is 0. This means that every time a non-payable Solidity function is invoked, the `CALLVALUE` hook will be triggered. (executingContract)= Hook bodies ----------- The body of a hook may contain almost any CVL code, including calls to other Solidity functions. The only exception is that hooks may not contain parametric method calls. Expressions in hook bodies may reference variables bound by the hook pattern. ### Keywords available in hook bodies Hook bodies may refer to the special CVL variable `executingContract`, which contains the address of the contract whose code triggered the hook. The call opcodes (`CALL`, `CALLCODE`, `STATICCALL` and `DELEGATECALL`) may also refer to a special CVL variable `selector`, which can be used to compare to a specific method signature selector. For example, here the hook body will assert that the native token value passed in the call is 0, only for a `transfer(address,uint)` call: ```cvl hook CALL(uint g, address addr, uint value, uint argsOffs, uint argLength, uint retOffset, uint retLength) uint rc { if(selector == sig:transfer(address, uint).selector) { assert value == 0; } } ``` ```{note} In case the size of the input to the call is less than 4 bytes, the value of `selector` is 0. This can be checked by comparing the `argLength` argument of the hook to 4. ``` ### Reentrant hooks While a hook is executing, additional hooks are skipped. If a hook calls a contract function which would normally cause another hook to execute, the inner hook will not execute. For example, suppose the contract function `updateX()` always assigned to the value `x`, and consider the following hook (see [`HookReentrancy.spec` here][hook-reentrance-example] for the complete example): % TODO: maybe update link to `master` when Examples PR #46 is accepted? [hook-reentrance-example]: https://github.com/Certora/Examples/tree/28cb774fc03767e8aeaf00ea1bb0fac7db595fc9/ReferenceManual/HookReentrancy ```{code-block} cvl :linenos: /// the number of stores to `x` ghost mathint xStoreCount; /// increment xStoreCount and recursively update `x` hook Sstore x uint v { xStoreCount = xStoreCount + 1; if (xStoreCount < 5) { updateX(); } } /// This rule will pass because hooks are not recursively applied rule checkStoreCount { require xStoreCount == 0; updateX(); assert xStoreCount == 1; } ``` In this example, the rule `checkStoreCount` calls `updateX` on line 15, which updates `x`, triggering the hook on line 5. The hook then calls `updateX` again on line 8. The recursive call to `updateX` will then update `x` a second time. At this point, you may expect that the hook will be triggered a second time, but because there is already a hook executing, this second update to `x` will not trigger the hook. Therefore the `xStoreCount` ghost will *not* be updated a second time, so its final value will be `1`.