## Design of a 64-bit ALU

The figure below shows the block diagram of a 64-bit ALU. The ALU performs basic arithmetic and logic functions: add, subtract, AND, OR, and XOR. It has three inputs and one result output. Operands A and B are fed to the functional units that implement these operations. The third input is a control signal that sets the behavior of the functional units and selects the desired result at the output stage. The ALU also produces status outputs such as overflow, zero, and negative, which are not shown in the figure.
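As a rough illustration of the behavior described above, the following is a minimal behavioral sketch in Python rather than the Verilog implementation itself. The function name `alu`, the string opcodes, and the flag dictionary are assumptions made for this example; the actual control encoding is not given in the text.

```python
MASK64 = (1 << 64) - 1   # keeps results within 64 bits
SIGN_BIT = 1 << 63       # most significant bit of a 64-bit word


def alu(a, b, op):
    """Hypothetical behavioral model: return (result, flags) for one operation.

    Operands are treated as 64-bit words; `op` is an assumed string opcode.
    """
    a &= MASK64
    b &= MASK64
    overflow = False
    if op == "add":
        result = (a + b) & MASK64
        # Signed overflow: both operands share a sign that the result lacks.
        overflow = bool(~(a ^ b) & (a ^ result) & SIGN_BIT)
    elif op == "sub":
        result = (a - b) & MASK64
        # Signed overflow: operand signs differ and the result's sign differs from a's.
        overflow = bool((a ^ b) & (a ^ result) & SIGN_BIT)
    elif op == "and":
        result = a & b
    elif op == "or":
        result = a | b
    elif op == "xor":
        result = a ^ b
    else:
        raise ValueError(f"unsupported operation: {op}")
    flags = {
        "zero": result == 0,
        "negative": bool(result & SIGN_BIT),
        "overflow": overflow,
    }
    return result, flags
```

In this sketch the control input both chooses the functional unit and selects the result, mirroring the role of the third input in the block diagram; the status flags are derived from the selected result.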



When clock gating is applied, a portion (the higher-order or the lower-order bits) of these functional units does not consume any dynamic power. In this case, the power density varies significantly among different parts of the ALU. Simply applying the total power uniformly to the ALU when analyzing its temperature profile overlooks this spatial variation, and an inaccurate spatial power profile leads to errors in the temperature evaluation. Because the ALU is a potential hotspot in microprocessors, errors in evaluating its temperature may affect the thermal management of the whole processor.
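The gap between a lumped and a spatially resolved power profile can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the two-region split, and all power and area values are hypothetical and are not measurements from this work.

```python
def power_density_profile(block_power_w, block_area_mm2):
    """Return (per-block densities, lumped density), both in W/mm^2.

    The lumped value spreads the total power evenly over the total area,
    which is what applying total power to the whole ALU amounts to.
    """
    per_block = {
        name: block_power_w[name] / block_area_mm2[name]
        for name in block_power_w
    }
    lumped = sum(block_power_w.values()) / sum(block_area_mm2.values())
    return per_block, lumped


# Hypothetical example: two equal-area halves, higher-order bits clock gated.
power = {"low_bits": 0.4, "high_bits": 0.0}   # dynamic power in W (assumed)
area = {"low_bits": 0.5, "high_bits": 0.5}    # area in mm^2 (assumed)
per_block, lumped = power_density_profile(power, area)
```

With these assumed numbers, the active half has twice the power density that the lumped model assigns to it, while the gated half has none. A thermal analysis driven by the lumped value would therefore underestimate the heat flux in the active region.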

To analyze the thermal behavior of the ALU at finer granularity, it is necessary to specify the locations of its different portions. We use commercial placement and routing tools from Cadence to derive the floorplan of the ALU at the sub-block level. We first implement a 64-bit ALU in Verilog HDL. We then synthesize and optimize the ALU in 65 nm and 180 nm technologies and estimate the power consumption of each sub-block. Finally, we use Cadence Silicon Ensemble to perform placement and routing, and we estimate the floorplan of the ALU by analyzing the resulting layout. The figure below shows the layout of the ALU. Standard cells are placed in rows in the center of the layout.
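The text does not specify how the floorplan is extracted from the layout. One plausible approach, sketched below, is to take the bounding box of all placed cells belonging to each sub-block. The input format (tuples of sub-block name, lower-left corner, width, and height) is a hypothetical export assumed for this example and is not the output format of the Cadence tools.

```python
def sub_block_bounding_boxes(cells):
    """Return {sub_block: (x_min, y_min, x_max, y_max)} from placed cells.

    `cells` is an iterable of (sub_block, x, y, width, height) tuples,
    where (x, y) is the lower-left corner of a placed standard cell.
    This tuple layout is an assumption made for illustration.
    """
    boxes = {}
    for block, x, y, width, height in cells:
        x_max, y_max = x + width, y + height
        if block in boxes:
            bx0, by0, bx1, by1 = boxes[block]
            # Grow the existing box so that it also covers this cell.
            boxes[block] = (min(bx0, x), min(by0, y), max(bx1, x_max), max(by1, y_max))
        else:
            boxes[block] = (x, y, x_max, y_max)
    return boxes


# Hypothetical placement: two adder cells in one row, one logic cell above.
example_cells = [
    ("adder", 0, 0, 2, 1),
    ("adder", 2, 0, 2, 1),
    ("logic", 0, 1, 3, 1),
]
example_floorplan = sub_block_bounding_boxes(example_cells)
```

Bounding boxes are only an approximation: if the placer interleaves cells from different sub-blocks, the boxes overlap and a finer partition would be needed before assigning per-block power.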

