Coarse arithmetics
The $macc cell type represents a generalized multiply and accumulate operation. The cell is purely combinational. It outputs the result of summing up a sequence of products and other injected summands.
Y = 0 +- a0factor1 * a0factor2 +- a1factor1 * a1factor2 +- ...
      + B[0] + B[1] + ...
The A port consists of concatenated pairs of multiplier inputs (“factors”). A zero length factor2 acts as a constant 1, turning factor1 into a simple summand.
In this pseudocode, u(foo) means an unsigned int that’s foo bits long.
struct A {
    u(CONFIG.mul_info[0].factor1_len) a0factor1;
    u(CONFIG.mul_info[0].factor2_len) a0factor2;
    u(CONFIG.mul_info[1].factor1_len) a1factor1;
    u(CONFIG.mul_info[1].factor2_len) a1factor2;
    ...
};
The cell’s CONFIG parameter determines the layout of cell port A. The CONFIG parameter carries the following information:
struct CONFIG {
    u4 num_bits;
    struct mul_info {
        bool is_signed;
        bool is_subtract;
        u(num_bits) factor1_len;
        u(num_bits) factor2_len;
    }[num_ports];
};
B is an array of concatenated 1-bit-wide unsigned integers to also be summed up.
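The CONFIG and A layouts described above can be sketched as a small Python reference model. This is an illustrative sketch, not part of Yosys: the names `MulInfo`, `encode_config`, `decode_config` and `macc` are hypothetical, and the bit offsets (least significant bit first, 4 bits of num_bits followed by one record per multiplier port) are taken from the Verilog simulation model of $macc on this page.

```python
from dataclasses import dataclass

@dataclass
class MulInfo:  # hypothetical name mirroring the mul_info pseudocode struct
    is_signed: bool
    is_subtract: bool
    factor1_len: int
    factor2_len: int

def bits(value, offset, length):
    """Return `length` bits of `value` starting at bit `offset` (bit 0 = LSB)."""
    return (value >> offset) & ((1 << length) - 1)

def to_signed(value, length):
    """Reinterpret a `length`-bit unsigned value as two's complement."""
    if length > 0 and (value >> (length - 1)) & 1:
        return value - (1 << length)
    return value

def encode_config(num_bits, ports):
    """Pack the port records into an integer; returns (CONFIG, CONFIG_WIDTH)."""
    stride = 2 + 2 * num_bits
    config = num_bits
    for i, p in enumerate(ports):
        base = 4 + i * stride
        config |= int(p.is_signed) << base
        config |= int(p.is_subtract) << (base + 1)
        config |= p.factor1_len << (base + 2)
        config |= p.factor2_len << (base + 2 + num_bits)
    return config, 4 + len(ports) * stride

def decode_config(config, config_width):
    """Unpack CONFIG into a list of MulInfo records."""
    num_bits = bits(config, 0, 4) or 1  # the simulation model treats 0 as 1
    stride = 2 + 2 * num_bits
    ports = []
    for i in range((config_width - 4) // stride):
        base = 4 + i * stride
        ports.append(MulInfo(
            is_signed=bool(bits(config, base, 1)),
            is_subtract=bool(bits(config, base + 1, 1)),
            factor1_len=bits(config, base + 2, num_bits),
            factor2_len=bits(config, base + 2 + num_bits, num_bits),
        ))
    return ports

def macc(a, b, b_width, y_width, config, config_width):
    """Evaluate Y for cell ports A and B, truncated to y_width bits."""
    y = 0
    cursor = 0  # current bit position inside cell port A
    for p in decode_config(config, config_width):
        term = bits(a, cursor, p.factor1_len)
        if p.is_signed:
            term = to_signed(term, p.factor1_len)
        cursor += p.factor1_len
        if p.factor2_len > 0:  # zero-length factor2 acts as a constant 1
            f2 = bits(a, cursor, p.factor2_len)
            if p.is_signed:
                f2 = to_signed(f2, p.factor2_len)
            cursor += p.factor2_len
            term *= f2
        y += -term if p.is_subtract else term
    y += sum(bits(b, i, 1) for i in range(b_width))  # 1-bit summands in B
    return y & ((1 << y_width) - 1)

# Y = a0 * a1 - a2 + B[0] with a0 = 3, a1 = 5, a2 = 4, B[0] = 1
ports = [MulInfo(False, False, 4, 4), MulInfo(False, True, 4, 0)]
cfg, cfg_width = encode_config(3, ports)
result = macc(3 | (5 << 4) | (4 << 8), 0b1, 1, 8, cfg, cfg_width)  # 12
```

The second port has factor2_len = 0, so its factor1 is subtracted directly rather than multiplied.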
- yosys> help $alu
Arithmetic logic unit
A building block supporting both binary addition/subtraction operations, and indirectly, comparison operations. Typically created by the alumacc pass, which transforms $add, $sub, $lt, $le, $ge, $gt, $eq, $eqx, $ne, $nex cells into this $alu cell.
- Simulation model (verilog)
module \$alu (A, B, CI, BI, X, Y, CO);

    parameter A_SIGNED = 0;
    parameter B_SIGNED = 0;
    parameter A_WIDTH = 1;
    parameter B_WIDTH = 1;
    parameter Y_WIDTH = 1;

    input [A_WIDTH-1:0] A;      // Input operand
    input [B_WIDTH-1:0] B;      // Input operand
    output [Y_WIDTH-1:0] X;     // A xor B (sign-extended, optional B inversion,
                                //          used in combination with
                                //          reduction-AND for $eq/$ne ops)
    output [Y_WIDTH-1:0] Y;     // Sum

    input CI;                   // Carry-in (set for $sub)
    input BI;                   // Invert-B (set for $sub)
    output [Y_WIDTH-1:0] CO;    // Carry-out

    wire [Y_WIDTH-1:0] AA, BB;

    generate
        if (A_SIGNED && B_SIGNED) begin:BLOCK1
            assign AA = $signed(A), BB = BI ? ~$signed(B) : $signed(B);
        end else begin:BLOCK2
            assign AA = $unsigned(A), BB = BI ? ~$unsigned(B) : $unsigned(B);
        end
    endgenerate

    // this is 'x' if Y and CO should be all 'x', and '0' otherwise
    wire y_co_undef = ^{A, A, B, B, CI, CI, BI, BI};

    assign X = AA ^ BB;
    // Full adder
    assign Y = (AA + BB + CI) ^ {Y_WIDTH{y_co_undef}};

    function get_carry;
        input a, b, c;
        get_carry = (a&b) | (a&c) | (b&c);
    endfunction

    genvar i;
    generate
        assign CO[0] = get_carry(AA[0], BB[0], CI) ^ y_co_undef;
        for (i = 1; i < Y_WIDTH; i = i+1) begin:BLOCK3
            assign CO[i] = get_carry(AA[i], BB[i], CO[i-1]) ^ y_co_undef;
        end
    endgenerate

endmodule
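To make the roles of CI, BI, X and CO concrete, here is a minimal Python sketch of the unsigned case, assuming both operands are already extended to the output width and ignoring x-propagation. The helper names `alu`, `sub`, `eq` and `ult` are hypothetical. The subtraction and equality uses follow the comments in the model (CI and BI set for $sub, reduction-AND of X for $eq/$ne); the unsigned less-than check via the top carry bit is a standard identity shown for illustration, not a statement of what the alumacc pass actually emits.

```python
def alu(a, b, ci, bi, width):
    """Unsigned model of $alu; returns (X, Y, CO) as integers."""
    mask = (1 << width) - 1
    aa = a & mask
    bb = (~b if bi else b) & mask  # BI inverts operand B
    x = aa ^ bb
    y = (aa + bb + ci) & mask
    co = 0
    carry = ci
    for i in range(width):  # per-bit majority function, as in get_carry
        a_bit = (aa >> i) & 1
        b_bit = (bb >> i) & 1
        carry = (a_bit & b_bit) | (a_bit & carry) | (b_bit & carry)
        co |= carry << i
    return x, y, co

def sub(a, b, width):
    """A - B modulo 2**width, computed as A + ~B + 1 (CI = BI = 1)."""
    return alu(a, b, 1, 1, width)[1]

def eq(a, b, width):
    """With BI = 1, X is all ones exactly when A == B (reduction-AND of X)."""
    x, _, _ = alu(a, b, 1, 1, width)
    return x == (1 << width) - 1

def ult(a, b, width):
    """Unsigned A < B: the top carry of A + ~B + 1 is 0 when a borrow occurs."""
    _, _, co = alu(a, b, 1, 1, width)
    return ((co >> (width - 1)) & 1) == 0

difference = sub(9, 5, 4)    # 4
wrapped = sub(5, 9, 4)       # 12, i.e. -4 modulo 16
```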
- yosys> help $fa
- Simulation model (verilog)
module \$fa (A, B, C, X, Y);

    parameter WIDTH = 1;

    input [WIDTH-1:0] A, B, C;
    output [WIDTH-1:0] X, Y;

    wire [WIDTH-1:0] t1, t2, t3;

    assign t1 = A ^ B, t2 = A & B, t3 = C & t1;
    assign Y = t1 ^ C, X = (t2 | t3) ^ (Y ^ Y);

endmodule
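The model above is a bank of independent one-bit full adders: Y holds the per-bit sums and X the per-bit carries, so A + B + C equals Y + 2*X. A short Python sketch of that property follows, with `fa` as a hypothetical helper name. It ignores x-propagation; in the Verilog, the `^ (Y ^ Y)` term evaluates to 0 for defined bits and only makes X undefined where Y is undefined.

```python
def fa(a, b, c, width):
    """Bitwise full adder over `width` bits; returns (X, Y) = (carries, sums)."""
    mask = (1 << width) - 1
    a, b, c = a & mask, b & mask, c & mask
    t1 = a ^ b
    t2 = a & b
    t3 = c & t1
    y = t1 ^ c   # per-bit sum
    x = t2 | t3  # per-bit carry out
    return x, y

# Three operands are reduced to two without any carry propagation.
x, y = fa(5, 3, 6, 3)
total = y + 2 * x  # 14 == 5 + 3 + 6
```

Because no bit depends on its neighbour, three operands are compressed to two in constant depth, and only the final Y + 2*X addition needs a carry chain.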
- yosys> help $lcu
Lookahead carry unit
A building block dedicated to fast computation of carry-bits used in binary arithmetic operations. By replacing the ripple carry structure used in full-adder blocks, the more significant bits of the sum can be expected to be computed more quickly. Typically created during techmap of $alu cells (see the “_90_alu” rule in +/techmap.v).
- Simulation model (verilog)
module \$lcu (P, G, CI, CO);

    parameter WIDTH = 1;

    input [WIDTH-1:0] P;    // Propagate
    input [WIDTH-1:0] G;    // Generate
    input CI;               // Carry-in

    output reg [WIDTH-1:0] CO; // Carry-out

    integer i;
    always @* begin
        CO = 'bx;
        if (^{P, G, CI} !== 1'bx) begin
            CO[0] = G[0] || (P[0] && CI);
            for (i = 1; i < WIDTH; i = i+1)
                CO[i] = G[i] || (P[i] && CO[i-1]);
        end
    end

endmodule
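The carry recurrence in this model can be sketched in Python as follows. The `lcu` function mirrors the simulation model, without the all-x case. The `add_with_lcu` wrapper is a hypothetical illustration that derives P and G as A xor B and A and B, the textbook choice for an adder; how the “_90_alu” techmap rule actually wires these signals is not shown on this page, so treat that part as an assumption.

```python
def lcu(p, g, ci, width):
    """CO[i] = G[i] | (P[i] & CO[i-1]), with CO[-1] = CI."""
    co = 0
    carry = ci
    for i in range(width):
        carry = ((g >> i) & 1) | (((p >> i) & 1) & carry)
        co |= carry << i
    return co

def add_with_lcu(a, b, ci, width):
    """Hypothetical adder built on lcu(); returns (sum, carry_out)."""
    mask = (1 << width) - 1
    p = (a ^ b) & mask   # a bit propagates an incoming carry
    g = (a & b) & mask   # a bit generates a carry by itself
    co = lcu(p, g, ci, width)
    carry_in = ((co << 1) | ci) & mask  # carry into each bit position
    return p ^ carry_in, (co >> (width - 1)) & 1

total, carry_out = add_with_lcu(11, 6, 0, 4)  # 11 + 6 = 17 -> (1, 1)
```

The Verilog model computes the carries with a simple loop; the speed benefit mentioned in the help text comes from how the cell is later mapped to hardware, not from the behavioural description.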
- yosys> help $macc
Multiply and accumulate.
A building block for summing any number of negated and unnegated signals and arithmetic products of pairs of signals. Cell port A concatenates pairs of signals to be multiplied together. When the second signal in a pair is zero length, a constant 1 is used instead as the second factor. Cell port B concatenates 1-bit-wide signals to also be summed, such as “carry in” in adders. Typically created by the alumacc pass, which transforms $add and $mul into $macc cells.
- Simulation model (verilog)
module \$macc (A, B, Y);

    parameter A_WIDTH = 0;
    parameter B_WIDTH = 0;
    parameter Y_WIDTH = 0;
    // CONFIG determines the layout of A, as explained below
    parameter CONFIG = 4'b0000;
    parameter CONFIG_WIDTH = 4;

    // In the terms used for this cell, there are mixed meanings for the term "port". To disambiguate:
    // A cell port is for example the A input (it is constructed in C++ as cell->setPort(ID::A, ...))
    // Multiplier ports are pairs of multiplier inputs ("factors").
    // If the second signal in such a pair is zero length, no multiplication is necessary, and the first signal is just added to the sum.
    input [A_WIDTH-1:0] A; // Cell port A is the concatenation of all arithmetic ports
    input [B_WIDTH-1:0] B; // Cell port B is the concatenation of single-bit unsigned signals to be also added to the sum
    output reg [Y_WIDTH-1:0] Y; // Output sum

    // Xilinx XSIM does not like $clog2() below..
    function integer my_clog2;
        input integer v;
        begin
            if (v > 0)
                v = v - 1;
            my_clog2 = 0;
            while (v) begin
                v = v >> 1;
                my_clog2 = my_clog2 + 1;
            end
        end
    endfunction

    // Number of bits in each factor length field of CONFIG (one field per factor in cell port A)
    localparam integer num_bits = CONFIG[3:0] > 0 ? CONFIG[3:0] : 1;
    // Number of multiplier ports
    localparam integer num_ports = (CONFIG_WIDTH-4) / (2 + 2*num_bits);
    // Minimum bit width of an induction variable to iterate over all bits of cell port A
    localparam integer num_abits = my_clog2(A_WIDTH) > 0 ? my_clog2(A_WIDTH) : 1;

    // In this pseudocode, u(foo) means an unsigned int that's foo bits long.
    // The CONFIG parameter carries the following information:
    //    struct CONFIG {
    //        u4 num_bits;
    //        struct port_field {
    //            bool is_signed;
    //            bool is_subtract;
    //            u(num_bits) factor1_len;
    //            u(num_bits) factor2_len;
    //        }[num_ports];
    //    };

    // The A cell port carries the following information:
    //    struct A {
    //        u(CONFIG.port_field[0].factor1_len) port0factor1;
    //        u(CONFIG.port_field[0].factor2_len) port0factor2;
    //        u(CONFIG.port_field[1].factor1_len) port1factor1;
    //        u(CONFIG.port_field[1].factor2_len) port1factor2;
    //        ...
    //    };
    // and log(sizeof(A)) is num_abits.
    // No factor1 may have a zero length.
    // A factor2 having a zero length implies factor2 is replaced with a constant 1.

    // Additionally, B is an array of 1-bit-wide unsigned integers to also be summed up.
    // Finally, we have:
    // Y = port0factor1 * port0factor2 + port1factor1 * port1factor2 + ...
    //     + B[0] + B[1] + ...

    function [2*num_ports*num_abits-1:0] get_port_offsets;
        input [CONFIG_WIDTH-1:0] cfg;
        integer i, cursor;
        begin
            cursor = 0;
            get_port_offsets = 0;
            for (i = 0; i < num_ports; i = i+1) begin
                get_port_offsets[(2*i + 0)*num_abits +: num_abits] = cursor;
                cursor = cursor + cfg[4 + i*(2 + 2*num_bits) + 2 +: num_bits];
                get_port_offsets[(2*i + 1)*num_abits +: num_abits] = cursor;
                cursor = cursor + cfg[4 + i*(2 + 2*num_bits) + 2 + num_bits +: num_bits];
            end
        end
    endfunction

    localparam [2*num_ports*num_abits-1:0] port_offsets = get_port_offsets(CONFIG);

    `define PORT_IS_SIGNED   (0 + CONFIG[4 + i*(2 + 2*num_bits)])
    `define PORT_DO_SUBTRACT (0 + CONFIG[4 + i*(2 + 2*num_bits) + 1])
    `define PORT_SIZE_A      (0 + CONFIG[4 + i*(2 + 2*num_bits) + 2 +: num_bits])
    `define PORT_SIZE_B      (0 + CONFIG[4 + i*(2 + 2*num_bits) + 2 + num_bits +: num_bits])
    `define PORT_OFFSET_A    (0 + port_offsets[2*i*num_abits +: num_abits])
    `define PORT_OFFSET_B    (0 + port_offsets[2*i*num_abits + num_abits +: num_abits])

    integer i, j;
    reg [Y_WIDTH-1:0] tmp_a, tmp_b;

    always @* begin
        Y = 0;
        for (i = 0; i < num_ports; i = i+1)
        begin
            tmp_a = 0;
            tmp_b = 0;

            for (j = 0; j < `PORT_SIZE_A; j = j+1)
                tmp_a[j] = A[`PORT_OFFSET_A + j];

            if (`PORT_IS_SIGNED && `PORT_SIZE_A > 0)
                for (j = `PORT_SIZE_A; j < Y_WIDTH; j = j+1)
                    tmp_a[j] = tmp_a[`PORT_SIZE_A-1];

            for (j = 0; j < `PORT_SIZE_B; j = j+1)
                tmp_b[j] = A[`PORT_OFFSET_B + j];

            if (`PORT_IS_SIGNED && `PORT_SIZE_B > 0)
                for (j = `PORT_SIZE_B; j < Y_WIDTH; j = j+1)
                    tmp_b[j] = tmp_b[`PORT_SIZE_B-1];

            if (`PORT_SIZE_B > 0)
                tmp_a = tmp_a * tmp_b;

            if (`PORT_DO_SUBTRACT)
                Y = Y - tmp_a;
            else
                Y = Y + tmp_a;
        end
        for (i = 0; i < B_WIDTH; i = i+1) begin
            Y = Y + B[i];
        end
    end

    `undef PORT_IS_SIGNED
    `undef PORT_DO_SUBTRACT
    `undef PORT_SIZE_A
    `undef PORT_SIZE_B
    `undef PORT_OFFSET_A
    `undef PORT_OFFSET_B

endmodule
- yosys> help $macc_v2
Multiply and add.
This cell represents a generic fused multiply-add operation; it supersedes the earlier $macc cell.
- Simulation model (verilog)
module \$macc_v2 (A, B, C, Y);

    parameter NPRODUCTS = 0;
    parameter NADDENDS = 0;
    parameter A_WIDTHS = 16'h0000;
    parameter B_WIDTHS = 16'h0000;
    parameter C_WIDTHS = 16'h0000;
    parameter Y_WIDTH = 0;

    parameter PRODUCT_NEGATED = 1'bx;
    parameter ADDEND_NEGATED = 1'bx;
    parameter A_SIGNED = 1'bx;
    parameter B_SIGNED = 1'bx;
    parameter C_SIGNED = 1'bx;

    function integer sum_widths1;
        input [(16*NPRODUCTS)-1:0] widths;
        integer i;
        begin
            sum_widths1 = 0;
            for (i = 0; i < NPRODUCTS; i++) begin
                sum_widths1 = sum_widths1 + widths[16*i+:16];
            end
        end
    endfunction

    function integer sum_widths2;
        input [(16*NADDENDS)-1:0] widths;
        integer i;
        begin
            sum_widths2 = 0;
            for (i = 0; i < NADDENDS; i++) begin
                sum_widths2 = sum_widths2 + widths[16*i+:16];
            end
        end
    endfunction

    input [sum_widths1(A_WIDTHS)-1:0] A; // concatenation of LHS factors
    input [sum_widths1(B_WIDTHS)-1:0] B; // concatenation of RHS factors
    input [sum_widths2(C_WIDTHS)-1:0] C; // concatenation of summands
    output reg [Y_WIDTH-1:0] Y; // output sum

    integer i, j, ai, bi, ci, aw, bw, cw;
    reg [Y_WIDTH-1:0] product;
    reg [Y_WIDTH-1:0] addend, oper_a, oper_b;

    always @* begin
        Y = 0;
        ai = 0;
        bi = 0;
        for (i = 0; i < NPRODUCTS; i = i+1)
        begin
            aw = A_WIDTHS[16*i+:16];
            bw = B_WIDTHS[16*i+:16];

            oper_a = 0;
            oper_b = 0;
            for (j = 0; j < Y_WIDTH && j < aw; j = j + 1)
                oper_a[j] = A[ai + j];
            for (j = 0; j < Y_WIDTH && j < bw; j = j + 1)
                oper_b[j] = B[bi + j];
            // A_SIGNED[i] == B_SIGNED[i] as RTLIL invariant
            if (A_SIGNED[i] && B_SIGNED[i]) begin
                for (j = aw; j > 0 && j < Y_WIDTH; j = j + 1)
                    oper_a[j] = oper_a[j - 1];
                for (j = bw; j > 0 && j < Y_WIDTH; j = j + 1)
                    oper_b[j] = oper_b[j - 1];
            end

            product = oper_a * oper_b;

            if (PRODUCT_NEGATED[i])
                Y = Y - product;
            else
                Y = Y + product;

            ai = ai + aw;
            bi = bi + bw;
        end

        ci = 0;
        for (i = 0; i < NADDENDS; i = i+1)
        begin
            cw = C_WIDTHS[16*i+:16];

            addend = 0;
            for (j = 0; j < Y_WIDTH && j < cw; j = j + 1)
                addend[j] = C[ci + j];
            if (C_SIGNED[i]) begin
                for (j = cw; j > 0 && j < Y_WIDTH; j = j + 1)
                    addend[j] = addend[j - 1];
            end

            if (ADDEND_NEGATED[i])
                Y = Y - addend;
            else
                Y = Y + addend;

            ci = ci + cw;
        end
    end

endmodule
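As a reading aid, the arithmetic of this model can be sketched in Python. The sketch makes two simplifications that are assumptions of the example rather than properties of the cell: the packed 16-bit width fields and per-term flag bits are passed as plain Python lists, and all values are defined (no x bits). The function name `macc_v2` and its keyword arguments are illustrative only.

```python
def field(value, offset, length):
    """Return `length` bits of `value` starting at bit `offset` (bit 0 = LSB)."""
    return (value >> offset) & ((1 << length) - 1)

def sext(value, length, signed):
    """Interpret a `length`-bit field as two's complement when `signed`."""
    if signed and length > 0 and (value >> (length - 1)) & 1:
        return value - (1 << length)
    return value

def macc_v2(a, b, c, *, a_widths, b_widths, c_widths,
            a_signed, b_signed, c_signed,
            product_negated, addend_negated, y_width):
    """Sum of (optionally negated) products and addends, modulo 2**y_width."""
    y = 0
    ai = bi = 0  # running bit offsets into cell ports A and B
    for i, (aw, bw) in enumerate(zip(a_widths, b_widths)):
        signed = a_signed[i] and b_signed[i]
        # Unlike $macc, a zero-width factor here is simply 0, not a constant 1.
        product = sext(field(a, ai, aw), aw, signed) * sext(field(b, bi, bw), bw, signed)
        y += -product if product_negated[i] else product
        ai += aw
        bi += bw
    ci = 0  # running bit offset into cell port C
    for i, cw in enumerate(c_widths):
        addend = sext(field(c, ci, cw), cw, c_signed[i])
        y += -addend if addend_negated[i] else addend
        ci += cw
    return y & ((1 << y_width) - 1)

# Y = 3*5 - 2*4 + 7 with 4-bit unsigned operands and an 8-bit result
result = macc_v2(
    3 | (2 << 4), 5 | (4 << 4), 7,
    a_widths=[4, 4], b_widths=[4, 4], c_widths=[4],
    a_signed=[False, False], b_signed=[False, False], c_signed=[False],
    product_negated=[False, True], addend_negated=[False],
    y_width=8,
)  # 14
```

Compared with $macc, plain summands get their own port C with individual widths, signedness and negation, instead of being expressed as products with a zero-length second factor.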