Poseidon2 is a SNARK-friendly cryptographic hash designed to be efficient inside prime-field arithmetic circuits. It follows the Poseidon2 paper and refines the original Poseidon hash.

This implementation includes:

A sponge construction over the BN254 scalar field following the (draft) C2SP Poseidon Sponge spec based on the Duplex Sponge model.
The Poseidon2 permutation, i.e. the round function used by the sponge.
Circuit custom gate relations that enforce the permutation’s correctness.

The Sponge Construction

The sponge absorbs input elements into an internal state, applies permutations, and squeezes output elements.

Parameter	Value
State size	$t = 4$ field elements
Rate	$r = 3$ field elements
Capacity	$c = 1$ field element
Domain separator	$\mathrm{IV} = \texttt{input_length} \ll 64$

Let the input be:

$$ \mathbf{a} = (a_0, a_1, \ldots, a_{N-1}) $$

Partition it into rate-sized blocks:

$$ B_j = (a_{3j}, a_{3j+1}, a_{3j+2}), \qquad m = \left\lceil \frac{N}{3}\right\rceil $$

Missing entries in the final block are padded with $0$. This is safe for the variable-length sponge because the input length is part of the domain separator. The initial state is:

$$ \mathbf{s}^{(0)} = (0, 0, 0, \mathrm{IV}) $$

For each block $j = 0, \ldots, m - 1$:

$$ \mathbf{s}^{(j+1)} = P\left(\mathbf{s}^{(j)} + (B_j, 0)\right) $$

where $P$ is the Poseidon2 permutation. The single-output squeeze is:

$$ y_0 = \left(P(\mathbf{s}^{(m)})\right)_0 $$

The IV is created as a fixed witness so the first permutation starts from normalized stdlib field values.

The Poseidon2 Permutation

The mathematical permutation is identical in Ultra and Mega. The difference is only how the permutation is encoded in the trace.

input state
    |
    v
initial external linear layer M_E
    |
    v
4 external rounds  : full S-box on all 4 state entries, then M_E
    |
    v
56 internal rounds : S-box only on state[0], then M_I
    |
    v
4 external rounds  : full S-box on all 4 state entries, then M_E
    |
    v
output state

External matrix:

$$ M_E = \begin{bmatrix} 5 & 7 & 1 & 3 \ 4 & 6 & 1 & 1 \ 1 & 3 & 5 & 7 \ 1 & 1 & 4 & 6 \end{bmatrix} $$

Internal matrix, written with the actual diagonal entries $D_i$:

$$ M_I = \begin{bmatrix} D_1 & 1 & 1 & 1 \ 1 & D_2 & 1 & 1 \ 1 & 1 & D_3 & 1 \ 1 & 1 & 1 & D_4 \end{bmatrix} $$

The parameter table stores internal_matrix_diagonal_minus_one[i] = D_i - 1, not $D_i$ itself. This lets the implementation compute the internal matrix product as (D_i - 1) * x_i + sum(x), which is equal to $D_i x_i + \sum_{j \ne i} x_j$.

The constants are generated from the Sage script authored by Markus Schofnegger in the Horizen Labs Poseidon2 parameter tooling. With R_P = 56, R_F = 8, d = 5, and a 254-bit scalar field, the parameter set targets 128-bit security.

Trace Layouts

Ultra uses the direct layout: one row per internal round, and six arithmetic rows for the initial external linear layer.

Ultra permutation rows
 
arithmetic rows                 initial M_E
poseidon2_external rows          first external rounds
poseidon2_external propagate
poseidon2_internal rows          one partial round each
poseidon2_internal propagate
poseidon2_external rows          final external rounds
poseidon2_external propagate
--
rows

Mega keeps the same permutation but uses custom rows for the initial external linear layer and compresses all 56 internal rounds into K=4 rows.

Mega permutation rows
 
    poseidon2_external           q_poseidon2_external_initial
+ 1  poseidon2_external           first external rounds + propagate
    poseidon2_quad_internal      q_poseidon2_transition_entry
   poseidon2_quad_internal      q_poseidon2_quad_internal
    poseidon2_quad_internal      q_poseidon2_quad_internal_terminal
    poseidon2_quad_internal      selector-unconstrained standard bridge
+ 1  poseidon2_external           final external rounds + propagate
--
rows

The stdlib hash also has one fixed-witness IV row outside the permutation when it starts from the sponge IV.

Initial External Linear Layer

The initial external linear layer has no S-boxes. Mega constrains it in one row under q_poseidon2_external_initial, while Ultra emits arithmetic rows for the same matrix product. Given:

$$ \mathbf{x} = \begin{bmatrix} w_l \ w_r \ w_o \ w_4 \end{bmatrix}, \qquad \mathbf{y} = M_E\mathbf{x}, $$

the four subrelations constrain the shifted row:

$$ \begin{aligned} A_0 &: y_0 - w_l' = 0, \ A_1 &: y_1 - w_r' = 0, \ A_2 &: y_2 - w_o' = 0, \ A_3 &: y_3 - w_4' = 0. \end{aligned} $$

External Round Subrelations

An external round starts from a standard-encoded row:

$$ (w_l, w_r, w_o, w_4) $$

with round constants in (q_l, q_r, q_o, q_4). The relation computes:

$$ \begin{aligned} u_1 &= (w_l + q_l)^5, \ u_2 &= (w_r + q_r)^5, \ u_3 &= (w_o + q_o)^5, \ u_4 &= (w_4 + q_4)^5, \end{aligned} $$

then applies the external matrix:

$$ \begin{bmatrix} v_1 \ v_2 \ v_3 \ v_4 \end{bmatrix} = M_E \begin{bmatrix} u_1 \ u_2 \ u_3 \ u_4 \end{bmatrix}. $$

The four external subrelations constrain the result against the shifted row:

$$ \begin{aligned} A_0 &: v_1 - w_l' = 0, \ A_1 &: v_2 - w_r' = 0, \ A_2 &: v_3 - w_o' = 0, \ A_3 &: v_4 - w_4' = 0. \end{aligned} $$

Mega Internal Compression

Mega uses a K=4 layout: each compressed row commits four consecutive state[0] values instead of the full state at every internal round. This is sound because only state[0] passes through the internal-round S-box. Once the four S-box outputs are fixed, the update of state[1..3] is linear and can be checked through an invertible 3 by 3 linear encoding.

For a self-contained linear-algebra statement of the underlying soundness theorem — abstracted away from Poseidon2-specific notation, with a proof and a discussion of which other matrices the same construction would work for — see QUAD_THEOREM.md.

For a quad row that starts at internal round 4i:

Wire	Meaning
`w_l`	`state[0]` at round `4i`
`w_r`	`state[0]` at round `4i + 1`
`w_o`	`state[0]` at round `4i + 2`
`w_4`	`state[0]` at round `4i + 3`

The row selectors carry the current quad constants and, for interior rows, the next quad's first three constants:

Selector	Value
`q_l`	`c_{4i}`
`q_r`	`c_{4i+1}`
`q_o`	`c_{4i+2}`
`q_4`	`c_{4i+3}`
`q_m`	`c_{4(i+1)}`
`q_c`	`c_{4(i+1)+1}`
`q_5`	`c_{4(i+1)+2}`

The compression picture is:

standard state before internal rounds
    (s0, s1, s2, s3)
       |
       | q_poseidon2_transition_entry
       v
first quad row
    (s0^0, s0^1, s0^2, s0^3)
       |
       | 13 q_poseidon2_quad_internal rows
       v
terminal quad row
    (s0^52, s0^53, s0^54, s0^55)
       |
       | q_poseidon2_quad_internal_terminal
       v
standard bridge row
    (s0^56, s1^56, s2^56, s3^56)
       |
       v
final external rounds

Compressed Block Subrelations

Every subrelation in the compressed block enforces the Poseidon2 internal-round recurrence in the encoding appropriate for its boundary:

Boundary	What's known	What the subrelations enforce
Entry	full standard state at row-start	first three `state[0]` values of the first compressed row
Interior	`state[0]` chain on this row and the next	four-round output, with the next row's `state[1..3]` checked through the same linear encoding
Terminal	`state[0]` chain on this row, full standard state on the next bridge row	four-round output matched directly against the bridge row

The interior and terminal boundaries share a four-round closed form that we cover first.

Closed Form for Four Rounds

Write the committed quad-row wires as:

$$ (w_l, w_r, w_o, w_4) = (s_0^{(0)}, s_0^{(1)}, s_0^{(2)}, s_0^{(3)}) $$

and define the four S-box outputs:

$$ u_k = (s_0^{(k)} + c_{4i+k})^5, \qquad k \in 0, 1, 2, 3

Contents