Introduction
On January 13th 2025, zkSecurity started a security audit of Zeko’s circuits. The audit was performed on the public repository https://github.com/zeko-labs/zeko at commit fab6a03555. The engagement lasted six days and was conducted by two consultants.
Zeko is a zk-rollup on the Mina blockchain. We can also refer to Zeko as a Layer 2 (L2) to disambiguate it from Mina’s base layer (L1).
Zeko is isomorphic to Mina, which means that it offers the same interface to zkapp developers (apart from a few deliberate exceptions). Compared to the L1, it promises a better experience thanks to differentiators like high throughput and removal of transaction size limits.
Scope
Zeko is built as a fork of the Mina repository. It uses Mina’s zk libraries Snarky and Pickles, and many of Mina’s protocol-related data structures and tooling. Furthermore, the circuits for proving state transitions of the L2 are largely shared with Mina’s own “transaction snark” which proves state transitions of the L1.
Within the repo, Zeko’s main logic that is not shared with Mina is located at src/app/zeko. Few additional changes were made to other parts of Mina to modify the transaction snark’s behavior in specific ways.
The scope of the audit was the following:
- All of
src/app/zeko/circuits, which hosts Zeko’s own circuits, including
- Zeko’s “outer” contracts, which are zkapps on the Mina L1, and
- “inner” contracts that operate on the L2.
- The diff to the original Mina repo, as far as it affects Zeko’s circuits and in particular the transaction snark.
At the time of the audit, the circuit code was newly rewritten and not integrated into the overall rollup yet. Even witness generation code was, in many parts, not implemented. This put a hard limit on the audit scope: We really just looked at the circuits themselves, not the wider context in which they are used.
Nonetheless, we believe that the audit covers a significant part of the attack surface of the zk-rollup as a whole, because the circuits are the largest critical component.
As much as possible, we tried to not only review circuits for local bugs such as missing constraints, but also audit the security of the protocol design as a whole.
Summary and recommendations
Zeko’s overall protocol design is deemed to be sound. Various attack vectors seem to be carefully considered and mitigated. We found a few significant bugs, each of which seems easy to fix, in a code base that is quite fresh and untested.
Given the partial scope and the number of high-severity findings, a small follow-up audit when the code base has stabilized could provide additional assurance.
Protocol Overview
At its heart, Zeko is a zkapp on Mina, which holds a commitment to the L2 ledger in its state. The contract’s central method, Rule_commit, moves the ledger state forward: by proving it has some transactions that, when applied to the source ledger, result in the target ledger.
(Note: Contract methods are called “rules” in Zeko.)
This zkapp is called the outer account, where “outer” refers to the L1. There is also an inner account on the L2. The two cooperate to support a messaging scheme between L1 and L2 that can be used by other zkapps.
Apart from the outer and inner rollup accounts, Zeko also maintains bridge accounts on both sides, which have rules to bridge tokens to and from the L2. The bridge contracts can be considered the first and foremost examples for how to use messaging.
Messaging
Messaging is based on actions, a Mina protocol construct. In general, an action is arbitrary data attached to a transaction, that is appended to a Merkle list commitment on the target account when the transaction is applied. The list commitment is called the action state of the account.
Both the outer and inner accounts have rules that submit an action: Rule_action_witness and Rule_inner_action_witness, respectively. Both rules allow arbitrary child account updates, and the action contains the hash of those child updates plus an additional field element defined by the caller (for committing to application-specific data).
Consequently, the outer action witness rule allows you to perform any interaction on the L1, and records it on the outer account’s action state for later reference. Similarly the inner action witness rule allows you to record interactions on the L2.
Messaging is achieved by making the outer action state available on the inner account’s app state, and vice versa. In pseudo-code:
outer_account = {
app_state: { inner_action_state, ledger_hash, ... };
action_state
}
inner_account = {
app_state: { outer_action_state };
action_state
}
To see how this is useful, let’s say you want to move tokens to the L2. You deposit them to an L1 bridge account, inside a call to the outer action witness rule. On the L2, you can refer to inner_account.outer_action_state to prove that this L1 deposit happened. When doing this, an inner bridge account lets you withdraw the tokens on the L2.
How are action states made available across layers?
Making the action state available from L2 to L1 is straightforward: The outer commit rule, after proving transition of the L2 ledger, looks up the updated inner_account in that ledger, and sets its own inner_action_state to inner_account.action_state. Since this is the only way that inner_action_state can be updated, we know it commits to actions submitted on the inner account.
The direction from L1 to L2 is harder because the inner contract doesn’t have a proof of what happened on the L1. Instead, it has a sync rule which moves the outer_action_state field forward by applying some actions, without proving anything about those actions.
Proving correctness of inner_account.outer_action_state is deferred to the outer commit rule: after updating the L2 ledger, it checks that outer_action_state is an ancestor of the current outer_account.action_state.
In other words, we can only ever commit to an L2 in which the inner outer_action_state contains real L1 actions.
Let’s try to understand why this deferred proof scheme is secure. On the one hand, nothing in the sync rule prevents it from being called with L1 actions that never happened, and the L2 could go into an “unlawful” state. For example, you could invent deposits, and use them to withdraw tokens out of thin air from the inner bridge account.
However, an unlawful L2 can never finalize: It wouldn’t pass the proof of the outer commit rule. In particular, transactions on the L1 can always rely on the L2 state stored in the outer account: it can never originate from L2 interactions that assumed a fake inner_account.outer_action_state.
Put simply: while you could print yourself tokens on a fake L2, you could never get those tokens out on the L1.
In practice, as clarified by the Zeko team, sequencers will control calls to the inner sync rule so that their L2 state never becomes invalid.
Action state extensions
Recall that action states are Merkle list commitments: They are the result of hashing all historical actions in a linear chain. One step in that chain looks like where is the action appended to the action state .[^1]
[^1]: Technical detail: when using “action” in the same sense as Mina does, in the equation above actually represents a non-empty list of actions. We change terminology and call those lists “actions”.
Frequently in Zeko’s contracts, we need to show that some action is contained in a given action state . This means proving that there exists a previous action state and some remaining actions such that
Since the hash chain can have an arbitrary length, we can’t verify it in a fixed-size circuit. To solve this, Zeko uses a recursive proof of the hash chain. This is called an ASE, for “action state extension”.
ASEs are created using the Folder module that implements a generic recursive state machine proof. We found a missing constraint in Folder that breaks the proof.
The impact of proving invalid ASEs is devastating: It allows you to invent actions that never happened. For example, to withdraw from the L1 bridge, you need to show existence of a deposit into the L2 bridge. Since the deposit (an action on the inner account) can be spoofed, an attacker could easily withdraw arbitrary amounts. This is the most severe issue found during the audit, a “billion-dollar-bug”.
Bridge contracts
Zeko is designed to have two bridge contracts for every token: One on the L1 and one on the L2. The L1 contract will be deployed to multiple different accounts, only one of which is enabled at a time. Essentially, the idea is to spread the risk of holding funds.
Bridge accounts have the following main methods:
- Inner bridge account: finalize-deposit rule
- Outer bridge accounts: finalize-withdrawal rule and finalize-cancelled-deposit rule.
In addition, outer bridge accounts have methods to be disabled and enabled within certain windows of time.
We already hinted at how deposits and withdrawals work, but didn’t discuss important aspects yet, like cancelling and prevention of double-spends. To recap, here are the simplified steps to deposit tokens from the L1 to the L2:
- Deposit tokens into one of the L1 bridge accounts (no matter which one), inside the outer action witness rule.
- Wait until the L1 deposit is synced to the L2, which happens when the inner sync rule is called.
- Call the finalize-deposit rule on the inner bridge account, which sends the tokens to an account of your choosing.
Along with the deposit, the user posts additional parameters: a receiver and timeout.
The receiver is a public key that is able to finalize the deposit, by signature.
The timeout enables a light-weight way to cancel a deposit, via the finalize-cancelled-deposit rule, which doesn’t require cooperation by a sequencer. A deposit can be cancelled as soon as any action is posted on the outer account that has slot bounds higher than the timeout. “Slots” are the measure of time on Mina, and transactions can specify slot bounds outside of which they are invalid. This establishes a fuzzy timestamp for onchain events, used extensively by Zeko.
By contrast, a deposit is marked as “accepted” if followed by a commit action that came before the timeout. At that point, it can be finalized on the L2. Being cancellable and finalizable are mutually exclusive, and once either of them is true, the deposit status cannot change. This is important, otherwise you could get the funds out on both sides and double your money.
Preventing double-spends
Another attack vector that has to be defended against is finalizing the same deposit twice (or cancelling it twice). The same applies for withdrawals.
To prevent double-spends, finalize-deposit introduces a helper account that has the receiver’s public key and a token id owned by the L2 bridge contract. (Note: this relies on Mina’s “token owner” feature, not explained in this report.) On the helper account, we store the action index of the last processed deposit, and require that this index is strictly increasing.
Since both the helper account and the deposit’s action index are uniquely determined by the deposit, this ensures that a deposit can be finalized at most once.
Side note: If deposits are processed out of order, it could happen that one is skipped and funds are irredeemably lost. That must be prevented by appropriate software to construct user transactions, and we view it as outside the scope of the present audit.
We have several interesting findings related to double-spends. One, pointed out to us early on by the Zeko team, is that in the case of L1 withdrawals, the helper account is not determined uniquely by the withdrawal: Since multiple L1 accounts are deployed for every token, there are multiple possible helper accounts, and double-spends are possible. Details are in the write-up of the finding.
Another issue is related to a peculiar property of Mina’s transaction snark: it doesn’t prevent that the same account is added to the ledger multiple times. Allowing that would, again, destroy the assumption that the helper account is unique, and enable double-spends on the L2 bridge. Not only that: since duplicate accounts break a fundamental assumption of the protocol, they would most likely pose a security risk for userland zkapps as well.
To prove that no account is added twice, Zeko adds a new construction to its modified transaction snark: An indexed Merkle tree, which supports non-inclusion proofs and can therefore be used to implement a set of unique accounts.
Auditing Zeko’s account set was fascinating. We recommend reading our finding on unconstrained account set updates. In addition to that finding, we found 5 low-level issues in the field comparison gadget, a complex primitive needed for the indexed Merkle tree; see 1, 2, 3, 4, 5.
In total, four out of the 7 high-severity issues in our report are related to breaking uniqueness of the account set.
Below are listed the findings found during the engagement. High severity findings can be seen as
so-called
"priority 0" issues that need fixing (potentially urgently). Medium severity findings are most often
serious
findings that have less impact (or are harder to exploit) than high-severity findings. Low severity
findings
are most often exploitable in contrived scenarios, if at all, but still warrant reflection. Findings
marked
as informational are general comments that did not fit any of the other criteria.
Description. Zeko tracks all accounts in an account set, a Merkle tree. We found a bug which means a prover can update the account set in arbitrary ways. The issue allows sequencers to prove invalid state transitions with far-reaching consequences, like double-spending of L2 deposits.
First, some background. In contrast to the main account ledger, another Merkle tree used by Mina, Zeko’s account set supports non-inclusion proofs, thanks to being an indexed Merkle tree. For more on indexed Merkle trees, see the paper that introduced them and this talk and blog post by Aztec.
The account set construction exists to close a gap in Mina’s “transaction snark”, the circuit(s) that Zeko uses to prove valid state transitions of the L2 ledger. In the original transaction snark, when processing an account update, the prover is free to either prove it belongs to an existing account in the current ledger, or to witness a Merkle path to an “empty” account. In the empty case, no connection to the account id of the account update is proved. The empty account is updated to have the acount id matching the account update, and this update is recorded in the ledger. This is how new accounts are created in the protocol.
In other words, nothing in Mina’s original circuit prevents a prover from creating the same account multiple times, and picking to which of these to apply updates later on. The Mina L1 relies on consensus to prevent duplicated accounts from being accepted. Zeko, on the other hand, is centralized and doesn’t have consensus, and must rely on proofs to guarantee that only valid state transitions are performed on its L2.
This is where the account set comes in. In Zeko, every processed account, along with its “is empty” flag, is passed to a function called update_acc_set. If the flag is true, the function will prove that the account doesn’t exist in the account set yet, and insert it. This is just the piece of circuit logic missing from the original transaction snark.
let update_acc_set accounts init ~witness =
Checked.List.fold accounts ~init
~f:(fun set (account_id, is_empty_and_writeable) ->
(* ... ZKSECURITY: skipping lines ... *)
let* y = derive_token_id ~owner:account_id in
(* ZKSECURITY: `y`, a hash of the `account_id`, is added to the set. *)
let* `Before_adding_y set', `After_adding_y new_set =
Account_set.add_key_var ~x ~path_x ~y ~path_y ~z
(* ZKSECURITY: Via the `check` flag, an insertion is skipped for
accounts that already existed *)
~check:is_empty_and_writeable ()
in
let*| () = assert_equal ~label:__LOC__ Account_set.typ set set' in
new_set )
The problem happens in the case where is_empty_and_writeable is false, and y is in the tree already. Note that in this case, we don’t really want to perform any check here, since elsewhere we already proved inclusion of the account_id in the main ledger. To skip the check, this function passes check:false to Account_set.add_key_var.
Let’s look at add_key_var, the main method on Zeko’s indexed Merkle tree implementation. The function inserts a node y in the tree, using witnesses x and z such that {key: x, next_key: z} exist in the original tree and x < y < z. It turns out that the method is both over- and under-constrained if the check is false and y was already in the tree.
let add_key_var ?check ~x ~path_x ~y ~path_y ~z () =
let* () =
with_label __LOC__ (fun () -> assert_x_less_than_y_less_than_z ~x ~y ~z)
in
(* ZKSECURITY: after showing that x < y < z, we prove inclusion of `(x, z)` in the input root.
this alone implies that y did not exist in the tree before! *)
let* root = implied_root { key = x; next_key = z } path_x in
let* root_intermediate = implied_root { key = x; next_key = y } path_x in
(* ZKSECURITY: `root_intermediate'` commits to _some_ tree has an empty entry for `y`.
But this root is not connected to the input root yet... *)
let* root_intermediate' =
implied_root_raw Field.(constant typ zero) path_y
in
let* root_intermediate' =
match check with
| Some check ->
if_ check ~typ:F.typ ~then_:root_intermediate'
~else_:root_intermediate
| None ->
Checked.return root_intermediate'
in
(* ZKSECURITY: we check that `path_y` really belongs to the same tree as `path_x`,
by asserting the roots are equal. But this assertion is SKIPPED if check=false *)
let* () =
assert_equal ~label:__LOC__ F.typ root_intermediate root_intermediate'
in
(* ZKSECURITY: we update the root with the insertion of y and return it.
this is a valid insertion if check=true or None, but an unconstrained update if check=false. *)
let* root_new = implied_root { key = y; next_key = z } path_y in
Checked.return (`Before_adding_y root, `After_adding_y root_new)
There are two issues:
-
The input root is over-constrained: If y is already in the tree of Before_adding_y root, then the first check assert_x_less_than_y_less_than_z must fail. This is a core property of indexed Merkle trees, maintained by this function by updating x’s next_key to y. It means that update_acc_set fails for accounts already in the set.
-
The output root is under-constrained: Using check: false makes the function skip the check that root_intermediate == root_intermediate'. This means that path_y is not validated to be a path in the same tree, and the updated root After_adding_y root_new could belong to any tree.
In update_acc_set, the returned root is just accepted as the new value. Because of the second issue, the prover could just switch to a different account set whenever an account already existed (i.e. during a majority of transactions).
Impact. The over-constrainedness issue simply means that the implementation would not have worked. Note that the audited code is very new and was not tested prior to the audit.
The more serious issue is the second one, which would not have surfaced on its own: Exploiting it, a prover can add the same account to the L2 ledger multiple times. This is just the problem that the account set was there to address in the first place.
For an example attack enabled by non-unique accounts, consider the bridge method for finalizing an L2 deposit. It prevents double-spending of the deposit by tracking a deposit_index field on a specific receiver account. If that account could be added another time, the same deposit can be withdrawn again.
Such a double-spending attack might only be accessible to the sequencer (depending on the exact operating setup of the rollup). But that seems bad enough, as we want to force sequencers to behave according to the protocol, using zk proofs.
Recommendation. Both issues can be addressed by adjusting add_key_var. To fix the over-constrained input root, the method could accept the root to update as an argument, instead of returning it. If check is false, the input root is replaced (using if-else) with that of an empty Merkle tree, before asserting it to equal the root implied by the (x, z) node. In this case, the witnesses passed in for x and z need to be the default min and max nodes that are in the indexed Merkle tree initially, to make sure the assert_x_less_than_y_less_than_z passes.
To fix the under-constrained output root, we just need another if-else at the end that replaces the output with the original input root, if check is false.
Client Response. The issue was fixed, by swapping out the original root in add_key_var in the check=false case. This addresses both the soundness and the completeness issue, without changing the function signature.