Coarse arithmetics
The $macc cell type represents a generalized multiply and accumulate operation. The cell is purely combinational. It outputs the result of summing up a sequence of products and other injected summands.
    Y = 0 +- a0factor1 * a0factor2 +- a1factor1 * a1factor2 +- ...
          + B[0] + B[1] + ...
The A port consists of concatenated pairs of multiplier inputs (“factors”). A zero length factor2 acts as a constant 1, turning factor1 into a simple summand.
In this pseudocode, u(foo) means an unsigned int that’s foo bits long.
    struct A {
        u(CONFIG.mul_info[0].factor1_len) a0factor1;
        u(CONFIG.mul_info[0].factor2_len) a0factor2;
        u(CONFIG.mul_info[1].factor1_len) a1factor1;
        u(CONFIG.mul_info[1].factor2_len) a1factor2;
        ...
    };
The cell’s CONFIG parameter determines the layout of cell port A. The CONFIG parameter carries the following information:
    struct CONFIG {
        u4 num_bits;
        struct mul_info {
            bool is_signed;
            bool is_subtract;
            u(num_bits) factor1_len;
            u(num_bits) factor2_len;
        }[num_ports];
    };
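The CONFIG layout described above can be decoded mechanically. The following is a minimal Python sketch, not part of Yosys: the names `MulInfo`, `decode_config` and `encode_config` are hypothetical, and it assumes the fields are packed LSB-first in the order listed, which is how the `$macc` simulation model indexes CONFIG.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MulInfo:
    """One mul_info entry of CONFIG (hypothetical helper type)."""
    is_signed: bool
    is_subtract: bool
    factor1_len: int
    factor2_len: int


def decode_config(config: int, config_width: int) -> Tuple[int, List[MulInfo]]:
    """Split a CONFIG value into (num_bits, list of MulInfo entries)."""
    # Assumption: a zero num_bits field is treated as 1, as in the simulation model.
    num_bits = (config & 0xF) or 1
    entry_width = 2 + 2 * num_bits
    num_ports = (config_width - 4) // entry_width
    mask = (1 << num_bits) - 1
    entries = []
    for i in range(num_ports):
        base = 4 + i * entry_width
        entries.append(MulInfo(
            is_signed=bool((config >> base) & 1),
            is_subtract=bool((config >> (base + 1)) & 1),
            factor1_len=(config >> (base + 2)) & mask,
            factor2_len=(config >> (base + 2 + num_bits)) & mask,
        ))
    return num_bits, entries


def encode_config(num_bits: int, entries: List[MulInfo]) -> Tuple[int, int]:
    """Inverse of decode_config; returns (config, config_width)."""
    entry_width = 2 + 2 * num_bits
    config = num_bits & 0xF
    for i, e in enumerate(entries):
        base = 4 + i * entry_width
        config |= int(e.is_signed) << base
        config |= int(e.is_subtract) << (base + 1)
        config |= e.factor1_len << (base + 2)
        config |= e.factor2_len << (base + 2 + num_bits)
    return config, 4 + len(entries) * entry_width


if __name__ == "__main__":
    # An 8x8 unsigned product, minus a 4-bit summand (factor2_len == 0).
    example = [MulInfo(False, False, 8, 8), MulInfo(False, True, 4, 0)]
    cfg, width = encode_config(4, example)
    print(width, decode_config(cfg, width))
```

Round-tripping a configuration through `encode_config` and `decode_config` is a quick way to check that the field offsets are understood correctly.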
B is an array of concatenated 1-bit-wide unsigned integers to also be summed up.
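To make the sum concrete, here is a small Python reference sketch of the computation described above. It is an illustration under stated assumptions, not the Yosys implementation: `macc_eval` and `sign_extend` are hypothetical names, the multiplier ports are passed already decoded as `(is_signed, is_subtract, factor1_len, factor2_len)` tuples, factors are assumed to be packed LSB-first in A in the order listed, factor lengths are assumed not to exceed the output width, and undefined (`x`) bits are not modeled.

```python
def sign_extend(value, width):
    """Interpret the low `width` bits of value as a two's complement number."""
    if width == 0:
        return 0
    value &= (1 << width) - 1
    if value >> (width - 1):
        value -= 1 << width
    return value


def macc_eval(ports, a, b, b_width, y_width):
    """Evaluate Y for decoded multiplier ports, cell port A and cell port B."""
    y = 0
    cursor = 0  # bit offset of the next factor inside A
    for is_signed, is_subtract, len1, len2 in ports:
        f1 = (a >> cursor) & ((1 << len1) - 1)
        cursor += len1
        f2 = (a >> cursor) & ((1 << len2) - 1)
        cursor += len2
        if is_signed:
            f1 = sign_extend(f1, len1)
            f2 = sign_extend(f2, len2)
        # A zero length factor2 acts as a constant 1.
        term = f1 * f2 if len2 > 0 else f1
        y += -term if is_subtract else term
    # B contributes one unsigned 1-bit summand per bit.
    for i in range(b_width):
        y += (b >> i) & 1
    return y & ((1 << y_width) - 1)


if __name__ == "__main__":
    ports = [(False, False, 8, 8), (False, True, 4, 0)]
    a = 3 | (5 << 8) | (2 << 16)  # factors 3 and 5, then the summand 2
    print(macc_eval(ports, a, b=0b1, b_width=1, y_width=16))  # 3*5 - 2 + 1 = 14
```

Because the result is reduced modulo 2^y_width at the end, intermediate negative values wrap the same way they would in fixed-width hardware.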
- yosys> help $alu
Arithmetic logic unit
A building block supporting both binary addition/subtraction operations, and indirectly, comparison operations. Typically created by the alumacc pass, which transforms: $add, $sub, $lt, $le, $ge, $gt, $eq, $eqx, $ne, $nex cells into this $alu cell.
- Properties:
- Simulation model (verilog)
module \$alu (A, B, CI, BI, X, Y, CO);

    parameter A_SIGNED = 0;
    parameter B_SIGNED = 0;
    parameter A_WIDTH = 1;
    parameter B_WIDTH = 1;
    parameter Y_WIDTH = 1;

    input [A_WIDTH-1:0] A;      // Input operand
    input [B_WIDTH-1:0] B;      // Input operand
    output [Y_WIDTH-1:0] X;     // A xor B (sign-extended, optional B inversion,
                                //          used in combination with
                                //          reduction-AND for $eq/$ne ops)
    output [Y_WIDTH-1:0] Y;     // Sum

    input CI;                   // Carry-in (set for $sub)
    input BI;                   // Invert-B (set for $sub)
    output [Y_WIDTH-1:0] CO;    // Carry-out

    wire [Y_WIDTH-1:0] AA, BB;

    generate
        if (A_SIGNED && B_SIGNED) begin:BLOCK1
            assign AA = $signed(A), BB = BI ? ~$signed(B) : $signed(B);
        end else begin:BLOCK2
            assign AA = $unsigned(A), BB = BI ? ~$unsigned(B) : $unsigned(B);
        end
    endgenerate

    // this is 'x' if Y and CO should be all 'x', and '0' otherwise
    wire y_co_undef = ^{A, A, B, B, CI, CI, BI, BI};

    assign X = AA ^ BB;
    // Full adder
    assign Y = (AA + BB + CI) ^ {Y_WIDTH{y_co_undef}};

    function get_carry;
        input a, b, c;
        get_carry = (a&b) | (a&c) | (b&c);
    endfunction

    genvar i;
    generate
        assign CO[0] = get_carry(AA[0], BB[0], CI) ^ y_co_undef;
        for (i = 1; i < Y_WIDTH; i = i+1) begin:BLOCK3
            assign CO[i] = get_carry(AA[i], BB[i], CO[i-1]) ^ y_co_undef;
        end
    endgenerate

endmodule
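The way one cell serves addition, subtraction and comparison can be seen in a small Python sketch of the model above. This is only an illustration: the function name `alu` is hypothetical, a single `signed` flag stands in for the case where both A_SIGNED and B_SIGNED are set, undefined (`x`) inputs are not modeled, and the comparison shown (unsigned, read from the top carry bit) is one plausible use of CO rather than a description of what the alumacc pass emits.

```python
def alu(a, b, ci, bi, a_width, b_width, y_width, signed=False):
    """Return (X, Y, CO) as integers, each y_width bits wide."""
    mask = (1 << y_width) - 1

    def extend(value, width):
        # Zero- or sign-extend an operand to y_width bits.
        value &= (1 << width) - 1
        if signed and width > 0 and (value >> (width - 1)) & 1:
            value -= 1 << width
        return value & mask

    aa = extend(a, a_width)
    bb = extend(b, b_width)
    if bi:
        bb = ~bb & mask  # invert B after extension, as in the model
    x = aa ^ bb
    y = (aa + bb + ci) & mask

    # Ripple the carry through every bit position to build CO.
    co = 0
    carry = ci & 1
    for i in range(y_width):
        abit = (aa >> i) & 1
        bbit = (bb >> i) & 1
        carry = (abit & bbit) | (abit & carry) | (bbit & carry)
        co |= carry << i
    return x, y, co


if __name__ == "__main__":
    # Subtraction: A - B is computed as A + ~B + 1, i.e. BI = CI = 1.
    x, y, co = alu(5, 7, ci=1, bi=1, a_width=8, b_width=8, y_width=8)
    print(y, (co >> 7) & 1)  # 254 (-2 modulo 256), top carry 0: 5 < 7 unsigned
    # Equality: with BI = 1, X is all ones exactly when A == B.
    x, _, _ = alu(5, 5, ci=1, bi=1, a_width=8, b_width=8, y_width=8)
    print(x == 0xFF)
```

For unsigned operands with BI = CI = 1, the top bit of CO is 1 exactly when A >= B, which is why a subtraction can double as a comparison.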
- yosys> help $fa
- Properties:
- Simulation model (verilog)
module \$fa (A, B, C, X, Y);

    parameter WIDTH = 1;

    input [WIDTH-1:0] A, B, C;
    output [WIDTH-1:0] X, Y;

    wire [WIDTH-1:0] t1, t2, t3;

    assign t1 = A ^ B, t2 = A & B, t3 = C & t1;
    assign Y = t1 ^ C, X = (t2 | t3) ^ (Y ^ Y);

endmodule
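Reading the model above, each bit position acts as an independent full adder: Y holds the sum bits and X the carry bits, with no carry propagating between positions. The Python sketch below mirrors those assignments for defined inputs. The function name `fa` is hypothetical, and the `(Y ^ Y)` term, which only matters for undefined (`x`) bits in simulation, is omitted.

```python
def fa(a, b, c, width):
    """Bitwise full adder: returns (X, Y) = (carry bits, sum bits)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    c &= mask
    t1 = a ^ b
    t2 = a & b
    t3 = c & t1
    y = t1 ^ c   # per-bit sum
    x = t2 | t3  # per-bit carry (majority of the three input bits)
    return x, y


if __name__ == "__main__":
    a, b, c = 0b1011, 0b0110, 0b1101
    x, y = fa(a, b, c, width=4)
    # Three operands are reduced to two: the carries weigh twice as much.
    print(a + b + c == y + 2 * x)
```

The identity A + B + C = Y + 2*X is what makes such a cell usable as a 3:2 compressor when adding several operands.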
- yosys> help $lcu
Lookahead carry unit
A building block dedicated to fast computation of carry-bits used in binary arithmetic operations. By replacing the ripple carry structure used in full-adder blocks, the more significant bits of the sum can be expected to be computed more quickly. Typically created during techmap of $alu cells (see the “_90_alu” rule in +/techmap.v).
- Properties:
- Simulation model (verilog)
module \$lcu (P, G, CI, CO);

    parameter WIDTH = 1;

    input [WIDTH-1:0] P;    // Propagate
    input [WIDTH-1:0] G;    // Generate
    input CI;               // Carry-in

    output reg [WIDTH-1:0] CO;    // Carry-out

    integer i;
    always @* begin
        CO = 'bx;
        if (^{P, G, CI} !== 1'bx) begin
            CO[0] = G[0] || (P[0] && CI);
            for (i = 1; i < WIDTH; i = i+1)
                CO[i] = G[i] || (P[i] && CO[i-1]);
        end
    end

endmodule
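The recurrence in the model above can be exercised with a short Python sketch. The names `lcu` and `add_with_lcu` are hypothetical, undefined (`x`) inputs are not modeled, and deriving P as A xor B and G as A and B is the textbook choice assumed here for illustration, not a statement about what the techmap rule generates.

```python
def lcu(p, g, ci, width):
    """Compute the carry-out vector CO from propagate/generate bits."""
    co = 0
    carry = ci & 1
    for i in range(width):
        # CO[i] = G[i] || (P[i] && carry into bit i)
        carry = ((g >> i) & 1) | (((p >> i) & 1) & carry)
        co |= carry << i
    return co


def add_with_lcu(a, b, ci, width):
    """Add two width-bit numbers using lcu() for the carries."""
    mask = (1 << width) - 1
    p = (a ^ b) & mask          # a position propagates an incoming carry
    g = (a & b) & mask          # a position generates a carry by itself
    co = lcu(p, g, ci, width)
    carries_in = ((co << 1) | (ci & 1)) & mask  # carry entering each bit
    total = p ^ carries_in
    carry_out = (co >> (width - 1)) & 1
    return total, carry_out


if __name__ == "__main__":
    print(add_with_lcu(200, 100, 0, 8))  # (44, 1): 300 = 256 + 44
```

The loop here is sequential only because it is a software model. The point of the cell is that hardware can evaluate the same recurrence with a faster lookahead structure.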
- yosys> help $macc
Multiply and accumulate.
A building block for summing any number of negated and unnegated signals and arithmetic products of pairs of signals. Cell port A concatenates pairs of signals to be multiplied together. When the second signal in a pair is zero length, a constant 1 is used instead as the second factor. Cell port B concatenates 1-bit-wide signals to also be summed, such as “carry in” in adders. Typically created by the alumacc pass, which transforms $add and $mul into $macc cells.
- Properties:
- Simulation model (verilog)
module \$macc (A, B, Y);

    parameter A_WIDTH = 0;
    parameter B_WIDTH = 0;
    parameter Y_WIDTH = 0;
    // CONFIG determines the layout of A, as explained below
    parameter CONFIG = 4'b0000;
    parameter CONFIG_WIDTH = 4;

    // In the terms used for this cell, there are mixed meanings for the term "port". To disambiguate:
    // A cell port is for example the A input (it is constructed in C++ as cell->setPort(ID::A, ...))
    // Multiplier ports are pairs of multiplier inputs ("factors").
    // If the second signal in such a pair is zero length, no multiplication is necessary, and the first signal is just added to the sum.
    input [A_WIDTH-1:0] A;      // Cell port A is the concatenation of all arithmetic ports
    input [B_WIDTH-1:0] B;      // Cell port B is the concatenation of single-bit unsigned signals to be also added to the sum
    output reg [Y_WIDTH-1:0] Y; // Output sum

    // Xilinx XSIM does not like $clog2() below..
    function integer my_clog2;
        input integer v;
        begin
            if (v > 0)
                v = v - 1;
            my_clog2 = 0;
            while (v) begin
                v = v >> 1;
                my_clog2 = my_clog2 + 1;
            end
        end
    endfunction

    // Number of bits in each factor length field in CONFIG (one field per factor in cell port A)
    localparam integer num_bits = CONFIG[3:0] > 0 ? CONFIG[3:0] : 1;
    // Number of multiplier ports
    localparam integer num_ports = (CONFIG_WIDTH-4) / (2 + 2*num_bits);
    // Minimum bit width of an induction variable to iterate over all bits of cell port A
    localparam integer num_abits = my_clog2(A_WIDTH) > 0 ? my_clog2(A_WIDTH) : 1;

    // In this pseudocode, u(foo) means an unsigned int that's foo bits long.
    // The CONFIG parameter carries the following information:
    //    struct CONFIG {
    //        u4 num_bits;
    //        struct port_field {
    //            bool is_signed;
    //            bool is_subtract;
    //            u(num_bits) factor1_len;
    //            u(num_bits) factor2_len;
    //        }[num_ports];
    //    };

    // The A cell port carries the following information:
    //    struct A {
    //        u(CONFIG.port_field[0].factor1_len) port0factor1;
    //        u(CONFIG.port_field[0].factor2_len) port0factor2;
    //        u(CONFIG.port_field[1].factor1_len) port1factor1;
    //        u(CONFIG.port_field[1].factor2_len) port1factor2;
    //        ...
    //    };
    // and log(sizeof(A)) is num_abits.
    // No factor1 may have a zero length.
    // A factor2 having a zero length implies factor2 is replaced with a constant 1.

    // Additionally, B is an array of 1-bit-wide unsigned integers to also be summed up.
    // Finally, we have:
    // Y = port0factor1 * port0factor2 + port1factor1 * port1factor2 + ...
    //     + B[0] + B[1] + ...

    function [2*num_ports*num_abits-1:0] get_port_offsets;
        input [CONFIG_WIDTH-1:0] cfg;
        integer i, cursor;
        begin
            cursor = 0;
            get_port_offsets = 0;
            for (i = 0; i < num_ports; i = i+1) begin
                get_port_offsets[(2*i + 0)*num_abits +: num_abits] = cursor;
                cursor = cursor + cfg[4 + i*(2 + 2*num_bits) + 2 +: num_bits];
                get_port_offsets[(2*i + 1)*num_abits +: num_abits] = cursor;
                cursor = cursor + cfg[4 + i*(2 + 2*num_bits) + 2 + num_bits +: num_bits];
            end
        end
    endfunction

    localparam [2*num_ports*num_abits-1:0] port_offsets = get_port_offsets(CONFIG);

    `define PORT_IS_SIGNED   (0 + CONFIG[4 + i*(2 + 2*num_bits)])
    `define PORT_DO_SUBTRACT (0 + CONFIG[4 + i*(2 + 2*num_bits) + 1])
    `define PORT_SIZE_A      (0 + CONFIG[4 + i*(2 + 2*num_bits) + 2 +: num_bits])
    `define PORT_SIZE_B      (0 + CONFIG[4 + i*(2 + 2*num_bits) + 2 + num_bits +: num_bits])
    `define PORT_OFFSET_A    (0 + port_offsets[2*i*num_abits +: num_abits])
    `define PORT_OFFSET_B    (0 + port_offsets[2*i*num_abits + num_abits +: num_abits])

    integer i, j;
    reg [Y_WIDTH-1:0] tmp_a, tmp_b;

    always @* begin
        Y = 0;
        for (i = 0; i < num_ports; i = i+1)
        begin
            tmp_a = 0;
            tmp_b = 0;

            for (j = 0; j < `PORT_SIZE_A; j = j+1)
                tmp_a[j] = A[`PORT_OFFSET_A + j];

            if (`PORT_IS_SIGNED && `PORT_SIZE_A > 0)
                for (j = `PORT_SIZE_A; j < Y_WIDTH; j = j+1)
                    tmp_a[j] = tmp_a[`PORT_SIZE_A-1];

            for (j = 0; j < `PORT_SIZE_B; j = j+1)
                tmp_b[j] = A[`PORT_OFFSET_B + j];

            if (`PORT_IS_SIGNED && `PORT_SIZE_B > 0)
                for (j = `PORT_SIZE_B; j < Y_WIDTH; j = j+1)
                    tmp_b[j] = tmp_b[`PORT_SIZE_B-1];

            if (`PORT_SIZE_B > 0)
                tmp_a = tmp_a * tmp_b;

            if (`PORT_DO_SUBTRACT)
                Y = Y - tmp_a;
            else
                Y = Y + tmp_a;
        end
        for (i = 0; i < B_WIDTH; i = i+1) begin
            Y = Y + B[i];
        end
    end

    `undef PORT_IS_SIGNED
    `undef PORT_DO_SUBTRACT
    `undef PORT_SIZE_A
    `undef PORT_SIZE_B
    `undef PORT_OFFSET_A
    `undef PORT_OFFSET_B

endmodule