Coarse arithmetics
The $macc cell type represents a generalized multiply and accumulate
operation. The cell is purely combinational. It outputs the result of summing up
a sequence of products and other injected summands.
    Y = 0 +- a0factor1 * a0factor2 +- a1factor1 * a1factor2 +- ...
          + B[0] + B[1] + ...
The A port consists of concatenated pairs of multiplier inputs (“factors”). A zero-length factor2 acts as a constant 1, turning factor1 into a simple summand.
In this pseudocode, u(foo) means an unsigned int that’s foo bits long.
    struct A {
        u(CONFIG.mul_info[0].factor1_len) a0factor1;
        u(CONFIG.mul_info[0].factor2_len) a0factor2;
        u(CONFIG.mul_info[1].factor1_len) a1factor1;
        u(CONFIG.mul_info[1].factor2_len) a1factor2;
        ...
    };
The cell’s CONFIG parameter determines the layout of cell port A. The
CONFIG parameter carries the following information:
    struct CONFIG {
        u4 num_bits;
        struct mul_info {
            bool is_signed;
            bool is_subtract;
            u(num_bits) factor1_len;
            u(num_bits) factor2_len;
        }[num_ports];
    };
B is an array of concatenated 1-bit-wide unsigned integers to also be summed up.
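The layout described above can be sketched in Python. This is a minimal sketch, not Yosys code: the helper names `pack_config` and `macc_eval` are hypothetical, operands are plain Python integers without x bits, and the bit positions inside CONFIG are taken from the $macc simulation model further down this page (4 bits of num_bits, then per multiplier port is_signed, is_subtract, factor1_len, factor2_len).

```python
def pack_config(num_bits, ports):
    """Build (CONFIG, CONFIG_WIDTH) from (is_signed, is_subtract, f1_len, f2_len) tuples.

    Hypothetical helper for illustration only.
    """
    config, width = num_bits, 4
    for is_signed, is_subtract, f1_len, f2_len in ports:
        config |= is_signed << width
        config |= is_subtract << (width + 1)
        config |= f1_len << (width + 2)
        config |= f2_len << (width + 2 + num_bits)
        width += 2 + 2 * num_bits
    return config, width


def macc_eval(config, config_width, a, b, b_width, y_width):
    """Evaluate a $macc-style sum of products on fully defined integers."""
    num_bits = (config & 0xF) or 1
    num_ports = (config_width - 4) // (2 + 2 * num_bits)
    len_mask = (1 << num_bits) - 1

    def take(offset, length, signed):
        # Slice `length` bits out of cell port A, sign-extending if requested.
        field = (a >> offset) & ((1 << length) - 1)
        if signed and length > 0 and field >> (length - 1):
            field -= 1 << length
        return field

    y, cursor = 0, 0
    for i in range(num_ports):
        base = 4 + i * (2 + 2 * num_bits)
        is_signed = (config >> base) & 1
        is_subtract = (config >> (base + 1)) & 1
        f1_len = (config >> (base + 2)) & len_mask
        f2_len = (config >> (base + 2 + num_bits)) & len_mask

        term = take(cursor, f1_len, is_signed)
        cursor += f1_len
        if f2_len > 0:  # a zero-length factor2 acts as a constant 1
            term *= take(cursor, f2_len, is_signed)
        cursor += f2_len
        y += -term if is_subtract else term

    for i in range(b_width):  # B holds 1-bit unsigned summands
        y += (b >> i) & 1
    return y & ((1 << y_width) - 1)


# Y = 3 * 5 - 2 + 1, with 4-bit factors and one carry-in style bit in B.
cfg, cfg_width = pack_config(3, [(0, 0, 4, 4), (0, 1, 4, 0)])
a_port = 3 | (5 << 4) | (2 << 8)
print(macc_eval(cfg, cfg_width, a_port, b=1, b_width=1, y_width=8))  # → 14
```

The second multiplier port has a zero-length factor2, so its factor1 is subtracted as a plain summand rather than multiplied.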
- yosys> help $alu
Arithmetic logic unit
A building block supporting both binary addition/subtraction operations and, indirectly, comparison operations. Typically created by the alumacc pass, which transforms $add, $sub, $lt, $le, $ge, $gt, $eq, $eqx, $ne, $nex cells into this $alu cell.
- Properties:
- Simulation model (verilog)
    module \$alu (A, B, CI, BI, X, Y, CO);

    parameter A_SIGNED = 0;
    parameter B_SIGNED = 0;
    parameter A_WIDTH = 1;
    parameter B_WIDTH = 1;
    parameter Y_WIDTH = 1;

    input [A_WIDTH-1:0] A;      // Input operand
    input [B_WIDTH-1:0] B;      // Input operand
    output [Y_WIDTH-1:0] X;     // A xor B (sign-extended, optional B inversion,
                                //          used in combination with
                                //          reduction-AND for $eq/$ne ops)
    output [Y_WIDTH-1:0] Y;     // Sum

    input CI;                   // Carry-in (set for $sub)
    input BI;                   // Invert-B (set for $sub)
    output [Y_WIDTH-1:0] CO;    // Carry-out

    wire [Y_WIDTH-1:0] AA, BB;

    generate
        if (A_SIGNED && B_SIGNED) begin:BLOCK1
            assign AA = $signed(A), BB = BI ? ~$signed(B) : $signed(B);
        end else begin:BLOCK2
            assign AA = $unsigned(A), BB = BI ? ~$unsigned(B) : $unsigned(B);
        end
    endgenerate

    // this is 'x' if Y and CO should be all 'x', and '0' otherwise
    wire y_co_undef = ^{A, A, B, B, CI, CI, BI, BI};

    assign X = AA ^ BB;
    // Full adder
    assign Y = (AA + BB + CI) ^ {Y_WIDTH{y_co_undef}};

    function get_carry;
        input a, b, c;
        get_carry = (a&b) | (a&c) | (b&c);
    endfunction

    genvar i;
    generate
        assign CO[0] = get_carry(AA[0], BB[0], CI) ^ y_co_undef;
        for (i = 1; i < Y_WIDTH; i = i+1) begin:BLOCK3
            assign CO[i] = get_carry(AA[i], BB[i], CO[i-1]) ^ y_co_undef;
        end
    endgenerate

    endmodule
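To make the roles of CI, BI, X and CO concrete, the following Python sketch mirrors the model above for a restricted case: unsigned, fully defined operands (no x bits) that are already Y_WIDTH bits wide. The function name `alu` and its argument order are illustrative assumptions, not part of Yosys.

```python
def alu(a, b, ci, bi, width):
    """Return (X, Y, CO) for unsigned `width`-bit operands without x bits."""
    mask = (1 << width) - 1
    aa = a & mask
    bb = (~b if bi else b) & mask   # BI inverts B, as used for subtraction
    x = aa ^ bb
    y = (aa + bb + ci) & mask

    co, carry = 0, ci
    for i in range(width):          # ripple the carry through every bit position
        a_bit, b_bit = (aa >> i) & 1, (bb >> i) & 1
        carry = (a_bit & b_bit) | (a_bit & carry) | (b_bit & carry)
        co |= carry << i
    return x, y, co


# 5 - 3 on 4 bits: set both BI and CI, giving A + ~B + 1.
print(alu(5, 3, ci=1, bi=1, width=4))  # → (9, 2, 13)
```

With BI = CI = 1 the sketch computes A - B. For unsigned operands the most significant carry-out bit is 0 exactly when A < B, and X is all ones exactly when A == B, which is one way comparison results can be read off the same cell.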
- yosys> help $fa
- Properties:
- Simulation model (verilog)
    module \$fa (A, B, C, X, Y);

    parameter WIDTH = 1;

    input [WIDTH-1:0] A, B, C;
    output [WIDTH-1:0] X, Y;

    wire [WIDTH-1:0] t1, t2, t3;

    assign t1 = A ^ B, t2 = A & B, t3 = C & t1;
    assign Y = t1 ^ C, X = (t2 | t3) ^ (Y ^ Y);

    endmodule
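In the model above each bit position is handled independently: Y is the per-bit sum and X the per-bit carry, with no carry passed between positions. A minimal Python sketch of that behaviour follows, assuming fully defined inputs (the `Y ^ Y` term in the Verilog only propagates x bits and is dropped here). The function name `fa` is illustrative.

```python
def fa(a, b, c, width):
    """Return (X, Y): per-bit carry and per-bit sum of three `width`-bit words."""
    mask = (1 << width) - 1
    t1 = (a ^ b) & mask
    t2 = a & b & mask
    t3 = c & t1
    y = t1 ^ (c & mask)   # sum bit of each column
    x = t2 | t3           # carry bit of each column
    return x, y


x, y = fa(0b1011, 0b0110, 0b1101, 4)
print(x, y)  # → 15 0
```

Because X holds the carries of each column, the three inputs always satisfy A + B + C = Y + 2 * X, which is why the cell is useful for reducing three summands to two.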
- yosys> help $lcu
Lookahead carry unit
A building block dedicated to fast computation of carry-bits used in binary arithmetic operations. By replacing the ripple carry structure used in full-adder blocks, the more significant bits of the sum can be expected to be computed more quickly. Typically created during techmap of $alu cells (see the “_90_alu” rule in +/techmap.v).
- Properties:
- Simulation model (verilog)
    module \$lcu (P, G, CI, CO);

    parameter WIDTH = 1;

    input [WIDTH-1:0] P;    // Propagate
    input [WIDTH-1:0] G;    // Generate
    input CI;               // Carry-in

    output reg [WIDTH-1:0] CO; // Carry-out

    integer i;
    always @* begin
        CO = 'bx;
        if (^{P, G, CI} !== 1'bx) begin
            CO[0] = G[0] || (P[0] && CI);
            for (i = 1; i < WIDTH; i = i+1)
                CO[i] = G[i] || (P[i] && CO[i-1]);
        end
    end

    endmodule
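The carry recurrence in the model above can be sketched in Python as follows. The wrapper `add_with_lcu` is a hypothetical illustration that assumes the usual adder definitions P = A xor B and G = A and B; it is not taken from techmap.v. Inputs are assumed to be fully defined.

```python
def lcu(p, g, ci, width):
    """Return the carry-out word CO for propagate/generate words P and G."""
    co, carry = 0, ci
    for i in range(width):
        # CO[i] = G[i] || (P[i] && carry into bit i)
        carry = ((g >> i) & 1) | (((p >> i) & 1) & carry)
        co |= carry << i
    return co


def add_with_lcu(a, b, ci, width):
    """Hypothetical adder built around lcu(); returns (sum, carry_out)."""
    mask = (1 << width) - 1
    p = (a ^ b) & mask
    g = a & b & mask
    co = lcu(p, g, ci, width)
    carries_in = ((co << 1) | ci) & mask   # carry into bit i is CO[i-1], or CI for bit 0
    return p ^ carries_in, (co >> (width - 1)) & 1


print(add_with_lcu(11, 6, 0, 4))  # → (1, 1)
```

The Python loop is sequential like the simulation model; the speed benefit mentioned in the description comes from how a hardware implementation computes the same CO values, not from this formulation.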
- yosys> help $macc
Multiply and accumulate. A building block for summing any number of negated and unnegated signals and arithmetic products of pairs of signals. Cell port A concatenates pairs of signals to be multiplied together. When the second signal in a pair is zero length, a constant 1 is used instead as the second factor. Cell port B concatenates 1-bit-wide signals to also be summed, such as “carry in” in adders. Typically created by the alumacc pass, which transforms $add and $mul into $macc cells.
- Properties:
- Simulation model (verilog)
    module \$macc (A, B, Y);

    parameter A_WIDTH = 0;
    parameter B_WIDTH = 0;
    parameter Y_WIDTH = 0;
    // CONFIG determines the layout of A, as explained below
    parameter CONFIG = 4'b0000;
    parameter CONFIG_WIDTH = 4;

    // In the terms used for this cell, the term "port" has mixed meanings. To disambiguate:
    // A cell port is for example the A input (it is constructed in C++ as cell->setPort(ID::A, ...))
    // Multiplier ports are pairs of multiplier inputs ("factors").
    // If the second signal in such a pair is zero length, no multiplication is necessary, and the first signal is just added to the sum.
    input [A_WIDTH-1:0] A; // Cell port A is the concatenation of all arithmetic ports
    input [B_WIDTH-1:0] B; // Cell port B is the concatenation of single-bit unsigned signals to be also added to the sum
    output reg [Y_WIDTH-1:0] Y; // Output sum

    // Xilinx XSIM does not like $clog2() below..
    function integer my_clog2;
        input integer v;
        begin
            if (v > 0)
                v = v - 1;
            my_clog2 = 0;
            while (v) begin
                v = v >> 1;
                my_clog2 = my_clog2 + 1;
            end
        end
    endfunction

    // Number of bits in each factor-length field of CONFIG (one field per factor in cell port A)
    localparam integer num_bits = CONFIG[3:0] > 0 ? CONFIG[3:0] : 1;
    // Number of multiplier ports
    localparam integer num_ports = (CONFIG_WIDTH-4) / (2 + 2*num_bits);
    // Minimum bit width of an induction variable to iterate over all bits of cell port A
    localparam integer num_abits = my_clog2(A_WIDTH) > 0 ? my_clog2(A_WIDTH) : 1;

    // In this pseudocode, u(foo) means an unsigned int that's foo bits long.
    // The CONFIG parameter carries the following information:
    //    struct CONFIG {
    //        u4 num_bits;
    //        struct port_field {
    //            bool is_signed;
    //            bool is_subtract;
    //            u(num_bits) factor1_len;
    //            u(num_bits) factor2_len;
    //        }[num_ports];
    //    };

    // The A cell port carries the following information:
    //    struct A {
    //        u(CONFIG.port_field[0].factor1_len) port0factor1;
    //        u(CONFIG.port_field[0].factor2_len) port0factor2;
    //        u(CONFIG.port_field[1].factor1_len) port1factor1;
    //        u(CONFIG.port_field[1].factor2_len) port1factor2;
    //        ...
    //    };
    // and log(sizeof(A)) is num_abits.
    // No factor1 may have a zero length.
    // A factor2 having a zero length implies factor2 is replaced with a constant 1.

    // Additionally, B is an array of 1-bit-wide unsigned integers to also be summed up.
    // Finally, we have:
    // Y = port0factor1 * port0factor2 + port1factor1 * port1factor2 + ...
    //     + B[0] + B[1] + ...

    function [2*num_ports*num_abits-1:0] get_port_offsets;
        input [CONFIG_WIDTH-1:0] cfg;
        integer i, cursor;
        begin
            cursor = 0;
            get_port_offsets = 0;
            for (i = 0; i < num_ports; i = i+1) begin
                get_port_offsets[(2*i + 0)*num_abits +: num_abits] = cursor;
                cursor = cursor + cfg[4 + i*(2 + 2*num_bits) + 2 +: num_bits];
                get_port_offsets[(2*i + 1)*num_abits +: num_abits] = cursor;
                cursor = cursor + cfg[4 + i*(2 + 2*num_bits) + 2 + num_bits +: num_bits];
            end
        end
    endfunction

    localparam [2*num_ports*num_abits-1:0] port_offsets = get_port_offsets(CONFIG);

    `define PORT_IS_SIGNED   (0 + CONFIG[4 + i*(2 + 2*num_bits)])
    `define PORT_DO_SUBTRACT (0 + CONFIG[4 + i*(2 + 2*num_bits) + 1])
    `define PORT_SIZE_A      (0 + CONFIG[4 + i*(2 + 2*num_bits) + 2 +: num_bits])
    `define PORT_SIZE_B      (0 + CONFIG[4 + i*(2 + 2*num_bits) + 2 + num_bits +: num_bits])
    `define PORT_OFFSET_A    (0 + port_offsets[2*i*num_abits +: num_abits])
    `define PORT_OFFSET_B    (0 + port_offsets[2*i*num_abits + num_abits +: num_abits])

    integer i, j;
    reg [Y_WIDTH-1:0] tmp_a, tmp_b;

    always @* begin
        Y = 0;
        for (i = 0; i < num_ports; i = i+1)
        begin
            tmp_a = 0;
            tmp_b = 0;

            for (j = 0; j < `PORT_SIZE_A; j = j+1)
                tmp_a[j] = A[`PORT_OFFSET_A + j];

            if (`PORT_IS_SIGNED && `PORT_SIZE_A > 0)
                for (j = `PORT_SIZE_A; j < Y_WIDTH; j = j+1)
                    tmp_a[j] = tmp_a[`PORT_SIZE_A-1];

            for (j = 0; j < `PORT_SIZE_B; j = j+1)
                tmp_b[j] = A[`PORT_OFFSET_B + j];

            if (`PORT_IS_SIGNED && `PORT_SIZE_B > 0)
                for (j = `PORT_SIZE_B; j < Y_WIDTH; j = j+1)
                    tmp_b[j] = tmp_b[`PORT_SIZE_B-1];

            if (`PORT_SIZE_B > 0)
                tmp_a = tmp_a * tmp_b;

            if (`PORT_DO_SUBTRACT)
                Y = Y - tmp_a;
            else
                Y = Y + tmp_a;
        end
        for (i = 0; i < B_WIDTH; i = i+1) begin
            Y = Y + B[i];
        end
    end

    `undef PORT_IS_SIGNED
    `undef PORT_DO_SUBTRACT
    `undef PORT_SIZE_A
    `undef PORT_SIZE_B
    `undef PORT_OFFSET_A
    `undef PORT_OFFSET_B

    endmodule
- yosys> help $macc_v2
Multiply and add. This cell represents a generic fused multiply-add operation; it supersedes the earlier $macc cell.
- Properties:
- Simulation model (verilog)
    module \$macc_v2 (A, B, C, Y);

    parameter NPRODUCTS = 0;
    parameter NADDENDS = 0;
    parameter A_WIDTHS = 16'h0000;
    parameter B_WIDTHS = 16'h0000;
    parameter C_WIDTHS = 16'h0000;
    parameter Y_WIDTH = 0;

    parameter PRODUCT_NEGATED = 1'bx;
    parameter ADDEND_NEGATED = 1'bx;
    parameter A_SIGNED = 1'bx;
    parameter B_SIGNED = 1'bx;
    parameter C_SIGNED = 1'bx;

    function integer sum_widths1;
        input [(16*NPRODUCTS)-1:0] widths;
        integer i;
        begin
            sum_widths1 = 0;
            for (i = 0; i < NPRODUCTS; i++) begin
                sum_widths1 = sum_widths1 + widths[16*i+:16];
            end
        end
    endfunction

    function integer sum_widths2;
        input [(16*NADDENDS)-1:0] widths;
        integer i;
        begin
            sum_widths2 = 0;
            for (i = 0; i < NADDENDS; i++) begin
                sum_widths2 = sum_widths2 + widths[16*i+:16];
            end
        end
    endfunction

    input [sum_widths1(A_WIDTHS)-1:0] A; // concatenation of LHS factors
    input [sum_widths1(B_WIDTHS)-1:0] B; // concatenation of RHS factors
    input [sum_widths2(C_WIDTHS)-1:0] C; // concatenation of summands
    output reg [Y_WIDTH-1:0] Y; // output sum

    integer i, j, ai, bi, ci, aw, bw, cw;
    reg [Y_WIDTH-1:0] product;
    reg [Y_WIDTH-1:0] addend, oper_a, oper_b;

    always @* begin
        Y = 0;
        ai = 0;
        bi = 0;
        for (i = 0; i < NPRODUCTS; i = i+1)
        begin
            aw = A_WIDTHS[16*i+:16];
            bw = B_WIDTHS[16*i+:16];

            oper_a = 0;
            oper_b = 0;
            for (j = 0; j < Y_WIDTH && j < aw; j = j + 1)
                oper_a[j] = A[ai + j];
            for (j = 0; j < Y_WIDTH && j < bw; j = j + 1)
                oper_b[j] = B[bi + j];
            // A_SIGNED[i] == B_SIGNED[i] as RTLIL invariant
            if (A_SIGNED[i] && B_SIGNED[i]) begin
                for (j = aw; j > 0 && j < Y_WIDTH; j = j + 1)
                    oper_a[j] = oper_a[j - 1];
                for (j = bw; j > 0 && j < Y_WIDTH; j = j + 1)
                    oper_b[j] = oper_b[j - 1];
            end

            product = oper_a * oper_b;

            if (PRODUCT_NEGATED[i])
                Y = Y - product;
            else
                Y = Y + product;

            ai = ai + aw;
            bi = bi + bw;
        end

        ci = 0;
        for (i = 0; i < NADDENDS; i = i+1)
        begin
            cw = C_WIDTHS[16*i+:16];

            addend = 0;
            for (j = 0; j < Y_WIDTH && j < cw; j = j + 1)
                addend[j] = C[ci + j];
            if (C_SIGNED[i]) begin
                for (j = cw; j > 0 && j < Y_WIDTH; j = j + 1)
                    addend[j] = addend[j - 1];
            end

            if (ADDEND_NEGATED[i])
                Y = Y - addend;
            else
                Y = Y + addend;

            ci = ci + cw;
        end
    end

    endmodule
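The arithmetic of the model above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: inputs are fully defined integers, the packed 16-bit width fields and per-item flag bits are passed as plain Python lists, and a single `signed` list stands in for A_SIGNED and B_SIGNED (the model notes they are equal as an RTLIL invariant). The names `slice_field` and `macc_v2` are hypothetical.

```python
def slice_field(value, offset, width, signed):
    """Extract `width` bits at `offset`, sign-extending when `signed` is set."""
    field = (value >> offset) & ((1 << width) - 1)
    if signed and width > 0 and field >> (width - 1):
        field -= 1 << width
    return field


def macc_v2(a, b, c, a_widths, b_widths, c_widths,
            signed, product_negated, c_signed, addend_negated, y_width):
    """Sum of (optionally negated) products A[i]*B[i] and addends C[i], modulo 2**y_width."""
    y = 0
    ai = bi = 0
    for i, (aw, bw) in enumerate(zip(a_widths, b_widths)):
        product = (slice_field(a, ai, aw, signed[i])
                   * slice_field(b, bi, bw, signed[i]))
        y += -product if product_negated[i] else product
        ai += aw
        bi += bw

    ci = 0
    for i, cw in enumerate(c_widths):
        addend = slice_field(c, ci, cw, c_signed[i])
        y += -addend if addend_negated[i] else addend
        ci += cw
    return y & ((1 << y_width) - 1)


# Y = 3*5 - (-2)*4 + 7 - 1, using 4-bit factors, a 4-bit addend and a 1-bit addend.
y = macc_v2(a=0xE3, b=0x45, c=0x17,
            a_widths=[4, 4], b_widths=[4, 4], c_widths=[4, 1],
            signed=[False, True], product_negated=[False, True],
            c_signed=[False, False], addend_negated=[False, True],
            y_width=8)
print(y)  # → 29
```

Unlike $macc, where a zero-length second factor acts as a constant 1, a zero-width operand here evaluates to 0 in the model above, so plain summands go on port C instead of port A.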