
Radix-2 Booth Multiplier: Principle and Verilog Code

1. Booth multiplier principle
2. Multiplier module design
3. Verilog code and testbench
4. Evaluation and outlook

1. Booth multiplier principle

For an n-bit signed binary number B in two's complement, if the most significant bit is 0, B can be written as:

$$B = \sum_{i=0}^{n-2} B[i]\,2^{i}$$

If the most significant bit is 1, the remaining bits B[n-2:0] are in two's-complement form, so:

$$B = -2^{n-1} + \sum_{i=0}^{n-2} B[i]\,2^{i}$$

The two cases combine into a single formula:

$$B = -B[n-1]\,2^{n-1} + \sum_{i=0}^{n-2} B[i]\,2^{i}$$

Expanding the sum:

$$B = -B[n-1]\,2^{n-1} + B[n-2]\,2^{n-2} + \cdots + B[1]\,2^{1} + B[0]\,2^{0}$$

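Rewrite every term except the (n-1)-th as twice itself minus itself:

$$B = -B[n-1]\,2^{n-1} + \left(B[n-2]\,2^{n-1} - B[n-2]\,2^{n-2}\right) + \cdots + \left(B[1]\,2^{2} - B[1]\,2^{1}\right) + \left(B[0]\,2^{1} - B[0]\,2^{0}\right)$$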

Regroup the terms by powers of 2:

$$B = (B[n-2]-B[n-1])\,2^{n-1} + (B[n-3]-B[n-2])\,2^{n-2} + \cdots + (B[0]-B[1])\,2^{1} - B[0]\,2^{0}$$

To make every term take the same form, add a term B[-1] with initial value 0. Note that B[-1] is a separate register and is not a bit of B. B can then be rewritten as:

$$B = (B[n-2]-B[n-1])\,2^{n-1} + \cdots + (B[0]-B[1])\,2^{1} + (B[-1]-B[0])\,2^{0}$$

$$B = \sum_{i=0}^{n-1} (B[i-1]-B[i])\,2^{i}, \qquad B[-1] = 0$$

$$A \times B = \sum_{i=0}^{n-1} (B[i-1]-B[i])\,A\,2^{i}$$
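The recoding can be checked numerically. The Python sketch below is only an illustration (the function names `to_signed`, `booth_digits` and `booth_value` are made up for this example and do not appear in the Verilog code): it recodes every 4-bit pattern into the digits B[i-1] - B[i] and confirms that the weighted sum equals the two's-complement value.

```python
def to_signed(bits, n):
    """Interpret the low n bits of `bits` as an n-bit two's-complement integer."""
    bits &= (1 << n) - 1
    return bits - (1 << n) if bits >> (n - 1) else bits


def booth_digits(bits, n):
    """Return d[i] = B[i-1] - B[i] for i = 0..n-1, with B[-1] taken as 0."""
    digits = []
    prev = 0  # B[-1]
    for i in range(n):
        cur = (bits >> i) & 1
        digits.append(prev - cur)
        prev = cur
    return digits


def booth_value(bits, n):
    """Evaluate sum(d[i] * 2**i), the recoded form of B."""
    return sum(d * 2**i for i, d in enumerate(booth_digits(bits, n)))


if __name__ == "__main__":
    n = 4
    # The recoded sum must match the two's-complement value for every pattern.
    for bits in range(1 << n):
        assert booth_value(bits, n) == to_signed(bits, n)
    # Digits for B = 1011 (-5), least significant digit first.
    print(booth_digits(0b1011, n))
```

Each digit is -1, 0 or +1, which is why every step of the multiplier needs at most one addition or subtraction of A.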

2. Multiplier module design

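The formula for B in the previous section shows that only the difference between two adjacent bits of B matters. B can therefore be shifted so that each step looks only at B[0] and B[-1]. Multiplying A by 2^i is likewise a shift. Putting the two together, we build a new vector, array, that is used both for shifting and for accumulating the partial sum.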

Starting from i = 0, the partial sum is updated in one of three ways:

  1. { B[i],B[i-1] } = 2'b00 or 2'b11: B[i-1]-B[i] = 0, no operation
  2. { B[i],B[i-1] } = 2'b01: B[i-1]-B[i] = 1, add A
  3. { B[i],B[i-1] } = 2'b10: B[i-1]-B[i] = -1, subtract A
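The three cases amount to a small lookup on the bit pair. As a rough sketch (the names `booth_action` and `actions_for` are hypothetical helpers, not part of the design), the following Python lists which operation each bit position of a multiplier triggers:

```python
def booth_action(b_i, b_prev):
    """Return the multiple of A to accumulate for the pair {B[i], B[i-1]}."""
    return b_prev - b_i  # 0: no operation, +1: add A, -1: subtract A


def actions_for(bits, n):
    """List the action for each i = 0..n-1, starting with B[-1] = 0."""
    names = {0: "none", 1: "add", -1: "sub"}
    result = []
    prev = 0  # B[-1]
    for i in range(n):
        cur = (bits >> i) & 1
        result.append(names[booth_action(cur, prev)])
        prev = cur
    return result


if __name__ == "__main__":
    # Actions for the multiplier 1011 (-5), from i = 0 upwards.
    print(actions_for(0b1011, 4))
```

A run of identical bits produces no operations at all, so only the boundaries between runs of 0s and 1s cost an addition or subtraction.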

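Suppose we multiply two 4-bit binary numbers. First define a 4-bit register Q that holds the partial sum. A natural question is why four bits are enough, since adding two 4-bit numbers can produce a 5-bit result. The reason is that after the update for bit i, the partial sum equals A × S / 2^i, where S is the signed value of B[i:0]. S / 2^i always lies in [-1, 1), so the magnitude of the partial sum never exceeds |A| and it fits in the same width as A. The one exception is A = 1000 (-8), because +8 cannot be represented in four bits; that multiplicand is not handled correctly by this scheme.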

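The working register is therefore array = {Q, B, B[-1]}, with Q and B[-1] both initialised to 0. Each iteration first updates Q according to B[0]B[-1] and then performs an arithmetic right shift by one bit. A counter tracks the number of iterations; the total equals the number of bits in B.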

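Notice that the value added to Q in the first iteration ends up shifted to the least significant end. The ordinary method forms A × 2^i by shifting A left i times, whereas the method here shifts the partial sum right from the high end. The final result is the same.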
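To illustrate that the two formulations agree, here is a minimal Python sketch. It assumes the register layout array = {Q, B, B[-1]} described above; the function names are invented for this example, and the code is a behavioural model rather than a description of the hardware.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def multiply_shift_left(a, b, n=4):
    """Formula form: accumulate (B[i-1] - B[i]) * A * 2**i, shifting A left."""
    total, prev = 0, 0
    for i in range(n):
        cur = (b >> i) & 1
        total += (prev - cur) * (a << i)
        prev = cur
    return total


def multiply_shift_right(a, b, n=4):
    """Register form: update Q in array = {Q, B, B[-1]}, then shift right."""
    width = 2 * n + 1
    mask_w = (1 << width) - 1
    array = (b & ((1 << n) - 1)) << 1          # Q = 0, B = b, B[-1] = 0
    addend = (a & ((1 << n) - 1)) << (n + 1)   # A aligned with the Q field
    for _ in range(n):                         # one iteration per bit of B
        pair = array & 0b11                    # {B[0], B[-1]}
        if pair == 0b01:
            array = (array + addend) & mask_w
        elif pair == 0b10:
            array = (array - addend) & mask_w
        sign = array >> (width - 1)            # arithmetic right shift by one
        array = (array >> 1) | (sign << (width - 1))
    return sign_extend(array >> 1, 2 * n)      # drop B[-1], keep {Q, B}


if __name__ == "__main__":
    # Compared here for multiplicands -7..7 and multipliers -8..7.
    for a in range(-7, 8):
        for b in range(-8, 8):
            assert multiply_shift_left(a, b) == multiply_shift_right(a, b) == a * b
    print(multiply_shift_right(7, -5))
```

The right-shift form keeps the adder only as wide as A, which is the main reason hardware implementations prefer it.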

The multiplication 7 × (-5) is worked through below as a reference.

-----------------------------------------Initialization-----------------------------------

Let A, B and Q be 4-bit registers and B[-1] a 1-bit register.

A = 0111(7);

-A = 1001;

B = 1011(-5);

Q = 0000;

B[-1] = 0;

B[0]B[-1] = 10;

{Q,B,B[-1]} = 0000_1011_0;

-----------------------------------------count = 3---------------------------------------

----------------------------------------Accumulate---------------------------------------

B = 1011;

B[-1] = 0;

B[0]B[-1] = 10; subtract

Q = Q - A = 0000 + 1001 = 1001;

{Q,B,B[-1]} = 1001_1011_0

--------------------------------------------Shift----------------------------------------

{Q,B,B[-1]} = 1100_1101_1

-----------------------------------------count = 2---------------------------------------

----------------------------------------Accumulate---------------------------------------

Q = 1100;

B = 1101;

B[-1] = 1;

B[0]B[-1] = 11; no operation

Q = 1100 (unchanged);

{Q,B,B[-1]} = 1100_1101_1

--------------------------------------------Shift----------------------------------------

{Q,B,B[-1]} = 1110_0110_1

-----------------------------------------count = 1---------------------------------------

----------------------------------------Accumulate---------------------------------------

Q = 1110;

B = 0110;

B[-1] = 1;

B[0]B[-1] = 01; add

Q = Q + A = 1110 + 0111 = 0101;

{Q,B,B[-1]} = 0101_0110_1

--------------------------------------------Shift----------------------------------------

{Q,B,B[-1]} = 0010_1011_0

-----------------------------------------count = 0---------------------------------------

----------------------------------------Accumulate---------------------------------------

Q = 0010;

B = 1011;

B[-1] = 0;

B[0]B[-1] = 10; subtract

Q = Q - A = 0010 + 1001 = 1011;

{Q,B,B[-1]} = 1011_1011_0;

--------------------------------------------Shift----------------------------------------

{Q,B,B[-1]} = 1101_1101_1;

--------------------------------------------Output---------------------------------------

CO = 1101_1101 (-35);
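A trace like the one above is easy to get wrong by hand, so it can help to regenerate it mechanically. The Python sketch below is an assumption-laden model of the same procedure (the helpers `fmt` and `booth_trace` are invented for illustration); it records the register {Q,B,B[-1]} after each accumulate step and after each shift.

```python
def fmt(array, n):
    """Format the (2n+1)-bit register as Q_B_B[-1]."""
    s = format(array, "0{}b".format(2 * n + 1))
    return "{}_{}_{}".format(s[:n], s[n:2 * n], s[2 * n:])


def booth_trace(a, b, n=4):
    """Return (after_accumulate, after_shift) strings for each of the n iterations."""
    width = 2 * n + 1
    mask_w = (1 << width) - 1
    array = (b & ((1 << n) - 1)) << 1          # Q = 0, B = b, B[-1] = 0
    addend = (a & ((1 << n) - 1)) << (n + 1)   # A aligned with the Q field
    steps = []
    for _ in range(n):
        pair = array & 0b11                    # {B[0], B[-1]}
        if pair == 0b01:
            array = (array + addend) & mask_w  # add A to Q
        elif pair == 0b10:
            array = (array - addend) & mask_w  # subtract A from Q
        after_add = fmt(array, n)
        sign = array >> (width - 1)            # arithmetic right shift by one
        array = (array >> 1) | (sign << (width - 1))
        steps.append((after_add, fmt(array, n)))
    return steps


if __name__ == "__main__":
    for count, (after_add, after_shift) in zip(range(3, -1, -1), booth_trace(7, -5)):
        print("count = {}: accumulate {}  shift {}".format(count, after_add, after_shift))
```

The upper eight bits of the last shifted value are the product.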

3. Verilog code and testbench

3.1 Verilog code

module boothmultiplier_radix2#(
    parameter size = 4 
)
(
    clk,
    rst_n,
    valid_i,
    receive_i,
    multiplicand_i,
    multiplier_i,
    busy_o,
    ready_o,
    result_o
);
    input wire clk;
    input wire rst_n;
    input wire valid_i;
    input wire receive_i;
    input wire signed [size-1:0] multiplicand_i;
    input wire signed [size-1:0] multiplier_i;
    output reg busy_o;
    output reg ready_o;
    output wire signed [2*size-1:0] result_o;

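    reg signed [2*size:0] array;    // {partial sum, multiplier, B[-1]}
    reg [size-1:0] M;               // multiplicand A
    reg [size-1:0] Q;               // copy of the multiplier (stored, but not used by the datapath)
    reg [$clog2(size):0] count;     // remaining iterations; wide enough for any size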

    localparam STATE_IDLE  = 2'b00;
    localparam STATE_CALC  = 2'b01;
    localparam STATE_SHIFT = 2'b10;
    localparam STATE_END   = 2'b11;

    reg [1:0] state,next_state;

    always@(posedge clk or negedge rst_n)begin
        if(!rst_n)begin
            state <= STATE_IDLE;
        end else begin
            state <= next_state;
        end
    end

    always@(*)begin
        if(!rst_n)begin
            next_state = STATE_IDLE;
        end else begin
            case (state)
                STATE_IDLE: begin
                    if(valid_i)begin
                        next_state = STATE_CALC;
                    end else begin
                        next_state = STATE_IDLE;
                    end
                end
                STATE_CALC: begin
                        next_state = STATE_SHIFT;
                end
                STATE_SHIFT: begin
                    if(count == 0)begin
                        next_state = STATE_END;
                    end else begin
                        next_state = STATE_CALC;
                    end
                end
                STATE_END: begin
                    if(receive_i)begin
                        next_state = STATE_IDLE;
                    end else begin
                        next_state = STATE_END;
                    end
                end
                default: next_state = STATE_IDLE;
            endcase
        end
    end

    always @(posedge clk or negedge rst_n) begin
        if(!rst_n)begin
            array <= 'b0;
            M <= 'b0;
            Q <= 'b0;
            ready_o <= 1;
            busy_o <= 0;
            count <= 0;
        end else begin
            case(state)
                STATE_IDLE: begin
                    if(valid_i)begin
                        ready_o <= 0;
                        busy_o <= 1;
                        M <= multiplicand_i;
                        Q <= multiplier_i;
                        count <= size-1;    // one iteration per multiplier bit
                        array <= {{size{1'b0}},multiplier_i,1'b0};
                    end else begin
                        array <= 'b0;
                        M <= 'b0;
                        Q <= 'b0;
                        ready_o <= 1;
                        busy_o <= 0;
                        count <= 2'b0;
                    end
                end
                STATE_CALC: begin
                    case({array[1:0]})
                        2'b00,2'b11:begin
                            array <= array;
                        end
                        2'b01:begin
                            array <= (array + {M[size-1:0],{(size+1){1'b0}}});
                        end
                        2'b10:begin
                            array <= (array - {M[size-1:0],{(size+1){1'b0}}});
                        end
                    endcase
                end
                STATE_SHIFT: begin
                    if(count != 0)begin
                        count <= count -1; 
                    end else begin
                        count <= 2'd3;
                        busy_o <= 0;
                        ready_o <= 1;
                    end
                    array <= array >>> 1;
                end
                STATE_END: begin
                    if(receive_i)begin
                        busy_o <= 0;
                        ready_o <= 1;
                    end else begin
                        busy_o <= 1;
                        ready_o <= 0;
                    end
                end
                default: begin
                    array <= 'b0;
                    M <= 'b0;
                    Q <= 'b0;
                    busy_o <= 0;
                    count <= 2'd3;
                end
            endcase
        end
    end

    // drop B[-1]; the product is the upper 2*size bits of array
    assign result_o = (state == STATE_END)?array[2*size:1]:{(2*size){1'b0}};
endmodule
           

3.2 testbench

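module Top_tb
();
    reg clk;
    reg rst_n;
    reg valid_i;
    reg signed [3:0] multiplicand_i;
    reg signed [3:0] multiplier_i;
    reg receive_i;

    // the product of two 4-bit operands is 8 bits wide
    wire signed [7:0] result_o;
    wire ready_o;
    wire busy_o;
    boothmultiplier_radix2#(
        .size(4)
    )
    UUT(
        .clk(clk),
        .rst_n(rst_n),
        .valid_i(valid_i),
        .receive_i(receive_i),
        .multiplicand_i(multiplicand_i),
        .multiplier_i(multiplier_i),
        .busy_o(busy_o),
        .ready_o(ready_o),
        .result_o(result_o)
    );
    
    initial begin
        clk = 0;
        rst_n = 0;
        // drive every input to a known value during reset
        valid_i = 0;
        receive_i = 0;
        multiplicand_i = 0;
        multiplier_i = 0;
        #100;
        rst_n = 1;
        #100;
        // change inputs on the falling edge so the DUT samples stable values
        @(negedge clk);
        multiplicand_i = 4'b0111;   // 7
        multiplier_i = 4'b1011;     // -5
        valid_i = 1;
        @(negedge clk);
        valid_i = 0;
        wait(ready_o);              // ready_o rises again once the product is available
        @(negedge clk);
        $display("7 * -5 = %0d", result_o);
        // acknowledge the result and start a second multiplication;
        // both strobes stay high, so the DUT repeats it until $stop
        receive_i = 1;
        multiplicand_i = 4'b0101;   // 5
        multiplier_i = 4'b1001;     // -7
        valid_i = 1;
        #1000;
        $stop;
    end

    always #10 clk = ~clk;
endmodule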
           

4. Evaluation and outlook

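1. Performing the accumulate and the shift in the same cycle is harder to write, and separating them is easier to follow, so the code spends one cycle on each. This roughly halves the computation speed.

2. The output result_o is driven as a wire. It could be changed to a registered output, which is more stable.

3. A pipeline could be added later to increase data throughput.

4. The testbench is incomplete and only provides a skeleton. Interested readers are welcome to extend it.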
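One possible way to extend the testbench is to compare the DUT against reference products for every operand pair. The Python sketch below generates such a table; it is only a suggestion, the function names `golden_vectors` and `format_vector` and the text format are assumptions of this example, and the testbench code needed to read the vectors is not shown.

```python
def golden_vectors(n=4):
    """Return (multiplicand, multiplier, product) bit patterns for all n-bit signed pairs."""
    mask_in = (1 << n) - 1
    mask_out = (1 << (2 * n)) - 1
    lo, hi = -(1 << (n - 1)), 1 << (n - 1)
    vectors = []
    for a in range(lo, hi):
        for b in range(lo, hi):
            # Python integers are unbounded, so a * b is the exact reference product.
            vectors.append((a & mask_in, b & mask_in, (a * b) & mask_out))
    return vectors


def format_vector(vec, n=4):
    """Render one vector as hexadecimal fields separated by underscores."""
    a, b, p = vec
    digits_in = (n + 3) // 4
    digits_out = (2 * n + 3) // 4
    return "{:0{w}x}_{:0{w}x}_{:0{v}x}".format(a, b, p, w=digits_in, v=digits_out)


if __name__ == "__main__":
    # One line per operand pair: multiplicand, multiplier, expected product.
    for vec in golden_vectors(4):
        print(format_vector(vec))
```

Rows whose multiplicand is 1000 (-8) deserve particular attention when checking a design with a 4-bit accumulator, because +8 cannot be represented in four bits.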
