Barretenberg
The ZK-SNARK library at the core of Aztec
Loading...
Searching...
No Matches
op_decomposition.test.cpp
Go to the documentation of this file.
1#include <gmock/gmock.h>
2#include <gtest/gtest.h>
3
4#include <cstdint>
5
7
8/******************************************************************************
9 CODE GENERATION OF OPERAND DECOMPOSITION INTO BYTES
10*******************************************************************************
11This test serves the purpose of code-generate the operand decomposition into bytes
12which can be statically derived by the wire format of the opcodes specified in
13the map WireOpCode_WIRE_FORMAT (see serialization.hpp).
14
15The artifacts are generated by writing to the standard output:
16- Precomputed Selectors Table which must be copied to: WireOpCode_DC_SELECTORS in instruction_spec.cpp
17- PIL code with relations copied to: instr_fetching.pil
18
19The precomputed selectors table consists of a row per wire opcode and as many
20columns as selector (determined by the algorithm below).
21Example row: { WireOpCode::SHR_16, { 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } }
22
23The pil relations map a fixed size bytecode chunk starting from a wire opcode byte
24into the operands of the opcode as defined by WireOpCode_WIRE_FORMAT.
25Example: op4 = sel_op_dc_0 * (bd9 * 2**8 + bd10 * 2**0) + sel_op_dc_4 * (bd8 * 2**8 + bd9 * 2**0) + sel_op_dc_7 * (bd8 *
262**0);
27
28The bytecode chunk consists of the opcode, addressing mode, and 7 operands.
29Bytes of bytecode are denoted: bd0, bd1, bd2, bd3, ...., bd37 where bd0 is the
30opcode and never appears in the above pil relation as it is trivially not an operand.
31Hence we deal with 8 values: addressing_mode, op1, op2, op3, op4, op5, op6, op7
32We refer to these 8 values, including addressing_mode, as 'operands' from now on for ease of reading.
33
34The goal of the algorithm below is to determine the least number of selectors sel_op_dc_XXX
35required to decompose each operand into bytes covering all opcode formats.
36Each selector sel_op_dc_XXX determines a subset of opcodes as defined per the table
37WireOpCode_DC_SELECTORS. For any given column, select all opcodes for which the
38entry bit is toggled.
39
40Below, we denote the 8 operands [addressing_mode, op1, ..., op7]. An operand layout is a
41pair (offset, len) which corresponds to a byte offset of the wire format and len being the
42length in bytes.
43
44Example CAST_8: { OperandType::INDIRECT8, OperandType::UINT8, OperandType::UINT8, OperandType::TAG }
45[{0,1}, {1,1}, {2,1}, {3,1}]
46
47Example SET_128: { OperandType::INDIRECT8, OperandType::UINT16, OperandType::TAG, OperandType::UINT128 }
48[{0,1}, {1,2}, {3,1}, {4,16}]
49
50The algorithm consists in processing WireOpCode_WIRE_FORMAT and map every combination
51[addressing_mode/op{idx}, op_layout] to a subset of opcodes. Continuing the example above,
52the SET_128 opcode would be present in the subsets indexed by the following triples:
53[addressing_mode, {0,1}]
54[op1, {1,2}]
55[op2, {3,1}]
56[op3, {4,16}]
57
58Each subset is then encoded into a bitmask of opcodes packed as a uint128_t key.
59Each unique subset is then mapped to a selector sel_op_dc_XXX.
60
61Assuming in the example above that [op1, {1,2}] maps to selector sel_op_dc_13, the
62corresponding PIL relation will be of the form:
63op1 = sel_op_dc_13 * (bd2 * 2**8 + bd3 * 2**0) + sel_op_dc_XXX * (...) + sel_op_dc_XXX * () ...
64Note that {offset, len} = {1, 2} means that offset is bd2 (bd0 is the opcode byte and
65therefore requires increasing the offset by one) and the length is two bytes, therefore
66bd2 and bd3 are the bytes encoding (in big endian) operand op1.
67
68In addition, we add a naive algorithm to remove a partitioned subset by two other
69subsets and aliases the corresponding selector of the partitioned subset as
70an addition of the selectors pertaining to the 2 other subsets.
71For instance: pol sel_op_dc_18 = sel_op_dc_1 + sel_op_dc_6;
72This allows to save some selectors.
73
74******************************************************************************/
75
76namespace bb::avm2::constraining {
78namespace {
79
80constexpr std::string OPERAND_PREFIX = "op";
81constexpr std::string BYTE_PREFIX = "bd";
82constexpr std::string SELECTOR_PREFIX = "sel_op_dc_";
83
84// TODO(#AVM-264): Change this hardcoded value if we reduce AVM_MAX_OPERANDS.
85constexpr size_t NUM_OF_OPERANDS = 8; // Need 7 = AVM_MAX_OPERANDS to cover ECADD, +1 for addressing_mode
86
87struct OperandLayout {
88 uint8_t offset = 0;
89 uint8_t len = 0;
90};
91
92struct Partition {
96};
97
98uint32_t encode_operand_idx_with_layout(uint8_t operand_idx, uint8_t offset, uint8_t len)
99{
100 uint32_t layout = len;
101 layout += (static_cast<uint32_t>(offset) << 8);
102 layout += (static_cast<uint32_t>(operand_idx) << 16);
103 return layout;
104}
105
106uint8_t get_op_idx(uint32_t op_idx_with_layout)
107{
108 return static_cast<uint8_t>(op_idx_with_layout >> 16);
109}
110
111OperandLayout get_op_layout(uint32_t op_idx_with_layout)
112{
113 uint8_t offset = static_cast<uint8_t>((op_idx_with_layout >> 8) & 0xFF);
114 uint8_t len = static_cast<uint8_t>(op_idx_with_layout & 0xFF);
115 return OperandLayout{ .offset = offset, .len = len };
116}
117
118uint128_t encode_subset_wire_opcodes(const std::unordered_set<WireOpCode>& set)
119{
120 uint128_t value = 0;
121 for (const auto& wire_opcode : set) {
122 value += (static_cast<uint128_t>(1) << static_cast<uint8_t>(wire_opcode));
123 }
124
125 return value;
126}
127
128std::string render_selector_array(WireOpCode wire_opcode, const std::vector<uint128_t>& sel_bitmasks)
129{
130 size_t num_of_selectors = sel_bitmasks.size();
131 std::vector<bool> selectors;
132 selectors.reserve(num_of_selectors);
133
134 for (const auto& bitmask : sel_bitmasks) {
135 selectors.push_back(((static_cast<uint128_t>(1) << static_cast<uint128_t>(wire_opcode)) & bitmask) != 0);
136 }
137
138 std::string output = format("{", selectors[0]);
139 for (size_t i = 1; i < num_of_selectors; i++) {
140 output += format(", ", selectors[i]);
141 }
142 output += "}";
143 return output;
144}
145
146auto add_fold = [](const std::string& a, const std::string& b) { return a + " + " + b; };
147
148// Write pol sel_i = sel_j + sel_k
149std::string render_partitions_pil(const std::vector<Partition>& partitions,
150 const std::unordered_map<uint128_t, size_t>& bitmask_to_sel_idx)
151{
152 std::string output;
153 for (const auto& partition : partitions) {
154 output += format("pol ", SELECTOR_PREFIX, bitmask_to_sel_idx.at(partition.union_subset));
155 output += format(" = ", SELECTOR_PREFIX, bitmask_to_sel_idx.at(partition.subset_1));
156 output += format(" + ", SELECTOR_PREFIX, bitmask_to_sel_idx.at(partition.subset_2), ";\n");
157 }
158 return output;
159}
160
161// Output a string like: bd4 * 2**24 + bd5 * 2**16 + bd6 * 2**8 + bd7 * 2**0
162std::string render_operand_layout_pil(OperandLayout layout)
163{
164 std::vector<std::string> monomials;
165 monomials.reserve(layout.len);
166 uint8_t byte_offset = layout.offset;
167 for (int i = 0; i < layout.len; i++) {
168 monomials.push_back(
169 format(BYTE_PREFIX, byte_offset + i + 1, " * 2**", 8 * (layout.len - i - 1))); // Big-endian bytes
170 }
171
172 return std::accumulate(std::next(monomials.begin()), monomials.end(), monomials[0], add_fold);
173}
174
175std::string render_pil(
176 const std::array<std::vector<std::pair<size_t, OperandLayout>>, NUM_OF_OPERANDS>& sel_layout_breakdowns)
177{
178 std::string pil_equations;
179 for (uint8_t i = 0; i < NUM_OF_OPERANDS; i++) {
180 pil_equations += (i == 0) ? "#[ADDRESSING_MODE_BYTES_DECOMPOSITION]\n"
181 : format("#[OP", static_cast<uint32_t>(i), "_BYTES_DECOMPOSITION]\n");
182 pil_equations += (i == 0) ? "addressing_mode = " : format(OPERAND_PREFIX, static_cast<uint32_t>(i), " = ");
183
184 pil_equations += "(1 - PARSING_ERROR_EXCEPT_TAG_ERROR) * ("; // Error gating multiplicative term
185
186 std::vector<std::string> additive_terms;
187 for (const auto& sel_layout : sel_layout_breakdowns[i]) {
188 additive_terms.push_back(
189 format(SELECTOR_PREFIX, sel_layout.first, " * (", render_operand_layout_pil(sel_layout.second), ")"));
190 }
191 pil_equations +=
192 std::accumulate(std::next(additive_terms.begin()), additive_terms.end(), additive_terms[0], add_fold);
193 pil_equations += ");\n";
194 }
195 return pil_equations;
196}
197
199{
201
202 const auto& operand_type_sizes = simulation::testonly::get_operand_type_sizes();
203 const auto& wire_formats = simulation::testonly::get_instruction_wire_formats();
204
205 for (const auto& [opcode, format] : wire_formats) {
206 std::array<OperandLayout, 8> operands_layout_array;
207 uint8_t op_idx = 1; // We start at index 1 because the index zero is reserved for addressing mode.
208 uint8_t byte_offset = 0;
209
210 for (const auto& operand : format) {
211 const auto operand_len = static_cast<uint8_t>(operand_type_sizes.at(operand));
212 const auto op_layout = OperandLayout{ .offset = byte_offset, .len = operand_len };
213
214 if (operand == OperandType::INDIRECT8 || operand == OperandType::INDIRECT16) {
215 operands_layout_array[0] = op_layout;
216 } else {
217 operands_layout_array[op_idx++] = op_layout;
218 }
219 byte_offset += operand_len;
220 }
221 mapping.insert(std::make_pair(opcode, operands_layout_array));
222 }
223 return mapping;
224}
225
226std::unordered_map<uint32_t, std::unordered_set<WireOpCode>> gen_op_idx_with_layout_to_opcode_subset(
227 const std::unordered_map<WireOpCode, std::array<OperandLayout, NUM_OF_OPERANDS>>& opcode_to_layouts)
228{
230
231 for (const auto& [wire_opcode, operand_layouts] : opcode_to_layouts) {
232 for (uint8_t i = 0; i < NUM_OF_OPERANDS; i++) {
233 const auto& layout = operand_layouts[i];
234 if (layout.len != 0) {
235 const auto key = encode_operand_idx_with_layout(i, layout.offset, layout.len);
236 if (op_idx_with_layout_to_subset.contains(key)) {
237 op_idx_with_layout_to_subset[key].insert(wire_opcode);
238 } else {
239 op_idx_with_layout_to_subset[key] = { wire_opcode };
240 }
241 }
242 }
243 }
244 return op_idx_with_layout_to_subset;
245}
246
247TEST(DecompositionSelectors, CodeGen)
248{
249 GTEST_SKIP(); // Comment out in order to code-generate.
251 gen_opcode_to_operands_layout();
252
253 // Map any given operand idx with layout to a subset of wire opcodes where
254 // we encode an operand idx with OperandLayout into a uint32_t as per function encode_operand_idx_with_layout().
256 gen_op_idx_with_layout_to_opcode_subset(opcode_to_layouts);
257
258 // A bit mask encodes a subset. Whenever an opcode is present in the subset, we toggle the bit
259 // at position represented by the opcode. Number of opcodes is smaller than 128 and therefore the bitmask
260 // fits in a uint128_t.
261 static_assert(static_cast<uint32_t>(WireOpCode::LAST_OPCODE_SENTINEL) < 128);
262
263 std::unordered_set<uint128_t> set_of_bitmasks;
264 std::unordered_map<uint32_t, uint128_t> op_idx_with_layout_to_bitmask;
265
266 for (const auto& [op_layout, subset] : op_idx_with_layout_to_subset) {
267 const auto encoded = encode_subset_wire_opcodes(subset);
268 set_of_bitmasks.insert(encoded);
269 op_idx_with_layout_to_bitmask.insert(std::make_pair(op_layout, encoded));
270 }
271
272 info("NUMBER OF SUBSETS: ", set_of_bitmasks.size());
273
274 // Is there any union of two disjoint subsets equal to another one?
275 bool partition_found = true;
276 std::vector<Partition> partitions;
277
278 // We try to remove partitions in a naive way, basically we remove the first encountered
279 // partition one after the other. This is by no way an exhaustive algorithm to find a hierarchy
280 // of partitions. This does the job as we have only one partition with the current AVM design.
281 while (partition_found) {
282 for (auto it1 = set_of_bitmasks.begin(); it1 != set_of_bitmasks.end(); it1++) {
283 auto it2 = it1;
284 for (it2++; it2 != set_of_bitmasks.end(); it2++) {
285 uint128_t sub_union = *it1 | *it2;
286 if ((*it1 & *it2) == 0 && set_of_bitmasks.contains(sub_union)) {
287 info("PARTITION FOUND! ", *it1, " ", *it2, " ", sub_union);
288 partitions.push_back(Partition{ .subset_1 = *it1, .subset_2 = *it2, .union_subset = sub_union });
289 set_of_bitmasks.erase(sub_union);
290 partition_found = true;
291 break;
292 }
293 partition_found = false;
294 }
295 if (partition_found) { // Mechanism to exit the outer loop
296 break;
297 }
298 }
299 }
300
301 info("NUMBER OF SUBSETS AFTER PARTITION REMOVAL: ", set_of_bitmasks.size());
302
303 std::vector<uint128_t> bitmasks_vector;
304 std::copy(set_of_bitmasks.begin(), set_of_bitmasks.end(), std::back_inserter(bitmasks_vector));
305
306 // Ensure a deterministic order
307 std::sort(bitmasks_vector.begin(), bitmasks_vector.end(), std::greater<>());
308
309 info("\n#################################");
310 info(" Precomputed Selectors Table:");
311 info("#################################\n");
312
313 info("constexpr size_t NUM_OP_DC_SELECTORS = ", set_of_bitmasks.size(), ";\n\n");
314
315 info("const std::unordered_map<WireOpCode, std::array<uint8_t, NUM_OP_DC_SELECTORS>> WireOpCode_DC_SELECTORS = "
316 "{");
317
318 const auto& wire_formats = simulation::testonly::get_instruction_wire_formats();
319 for (int i = 0; i < static_cast<int>(WireOpCode::LAST_OPCODE_SENTINEL); i++) {
320 const auto wire_opcode = static_cast<WireOpCode>(i);
321 if (wire_formats.contains(wire_opcode)) {
322 info("{WireOpCode::", wire_opcode, ", ", render_selector_array(wire_opcode, bitmasks_vector), "},");
323 }
324 }
325 info("};");
326
327 // Add subsets/bitmasks which were removed as unions at the end.
328 for (const auto& partition : partitions) {
329 bitmasks_vector.push_back(partition.union_subset);
330 }
331
332 std::unordered_map<uint128_t, size_t> bitmask_to_sel_idx;
333
334 for (size_t i = 0; i < bitmasks_vector.size(); i++) {
335 bitmask_to_sel_idx.insert(std::make_pair(bitmasks_vector[i], i));
336 }
337
338 // For each operand (index of the array), we store a vector of (selector, layout) corresponding to
339 // the decomposition of the operand with respect to bytes (bd1, bd2, ...)
340 // op_1 = sel_1 * (bd1 + bd2 * 2^8) + sel_4 * (bd5 + bd6 * 2^8)
341 std::array<std::vector<std::pair<size_t, OperandLayout>>, NUM_OF_OPERANDS> sel_layout_breakdowns;
342 for (const auto& [op_idx_with_layout, bitmask] : op_idx_with_layout_to_bitmask) {
343 uint8_t op_idx = get_op_idx(op_idx_with_layout);
344 OperandLayout layout = get_op_layout(op_idx_with_layout);
345 size_t sel_idx = bitmask_to_sel_idx.at(bitmask);
346 sel_layout_breakdowns[op_idx].emplace_back(std::make_pair(sel_idx, layout));
347 }
348
349 // For each sel-layout breakdown vector, we sort by increasing selector indices.
350 for (uint8_t i = 0; i < NUM_OF_OPERANDS; i++) {
351 std::sort(
352 sel_layout_breakdowns[i].begin(),
353 sel_layout_breakdowns[i].end(),
355 }
356
357 info("\n##################");
358 info("PIL Relations:");
359 info("##################\n");
360
361 info(render_partitions_pil(partitions, bitmask_to_sel_idx));
362 info(render_pil(sel_layout_breakdowns));
363}
364
365} // namespace
366} // namespace bb::avm2::constraining
std::string format(Args... args)
Definition log.hpp:23
#define info(...)
Definition log.hpp:93
FF a
FF b
ssize_t offset
Definition engine.cpp:62
TEST(AvmFixedVKTests, FixedVKCommitments)
Test that the fixed VK commitments agree with the ones computed from precomputed columns.
const std::unordered_map< OperandType, uint32_t > & get_operand_type_sizes()
const std::unordered_map< WireOpCode, std::vector< OperandType > > & get_instruction_wire_formats()
constexpr decltype(auto) get(::tuplet::tuple< T... > &&t) noexcept
Definition tuple.hpp:13
uint128_t subset_2
uint128_t subset_1
uint128_t union_subset
uint8_t len
unsigned __int128 uint128_t
Definition serialize.hpp:45