14/10/2014

Implementing Stump

In this part of the course we will look at the implementation of the Stump processor as a Verilog design, starting from the RTL design.

• MU0 as an example
• Use path usage diagrams
• Develop the RTL view of the Stump processor
• Verilog implementation
• Verilog test bench

COMP22111: Processor Microarchitecture Part 4

Architectural Design

We have previously produced a correct (tested) specification; now we have to move on to the architectural design.

Our model was a black box: we identified the interface and what it does, but we have no idea (yet) how it will be implemented.

In the architectural design stage we identify functional blocks and the data paths between them.

The Stump is a RISC processor, so we can identify 4 functional blocks for handling data, along with a control block.

The architectural diagram emerges from the description of these blocks.

Stump Architecture

[Diagram: Address Interface (address to memory), Register Bank, Execute Unit, Data Interface]

The diagram shows all blocks on the datapath (not control) and the paths between the functional blocks.

This is one conceptual view.
[Diagram ports: Data out, Control, Data in]

Path Usage Diagrams

[Diagrams: one per operation below, each marking the paths used between Address Interface, Register Bank, Execute Unit, Data Interface, Data out and Data in]

Instruction fetch
• We also update the PC to PC+1 in the fetch cycle.
• No overlapping of data paths, so there should be no problem.

Type 1 Execute
• We need to write back the result to a register.
• No overlapping of data paths, so there should be no problem.

Type 2 Execute
• We need to write back the result to a register.
• No overlapping of data paths, so there should be no problem.

Type 3 Execute
• We need to write back the result to a register.
• No overlapping of data paths, so there should be no problem.

Load operation

Store operation

RTL Design

[Diagram: registers separated by blocks of combinatorial logic, with clock and Control inputs]

RTL describes the datapath in terms of registers and combinatorial logic and is the first level at which actual hardware is described.

Data flow is synchronised to a clock using registers, and combinatorial logic operates on the stored data.

External control signals control the flow of data as well as the operations performed.
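As a rough illustration of the register/combinatorial-logic split described above, a single RTL stage might be written in Verilog as follows. This is only a minimal sketch and is not part of the Stump or MU0 designs: the module name rtl_stage, its ports, and the choice of an add-or-pass operation are all hypothetical.

```verilog
// Hypothetical RTL stage (illustration only, not taken from the Stump or MU0).
// Combinatorial logic computes the next value; a clocked register stores it.
module rtl_stage(input  wire        clk,   // clock that synchronises the data flow
                 input  wire        rst,   // asynchronous reset, active high
                 input  wire        en,    // external control: register enable
                 input  wire        sel,   // external control: selects the operation
                 input  wire [15:0] a,
                 input  wire [15:0] b,
                 output reg  [15:0] q);

  reg [15:0] next_q;

  // Combinatorial logic: operates on the inputs, no storage
  always @ (*)
    if (sel) next_q = a + b;   // operation chosen by the control signal
    else     next_q = a;       // pass a through unchanged

  // Register: captures the result on the rising clock edge
  always @ (posedge clk, posedge rst)
    if (rst)     q <= 16'h0000;
    else if (en) q <= next_q;

endmodule
```

Here sel controls the operation performed and en controls whether data flows into the register, which mirrors the two roles of the external control signals.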
RTL Design of the Stump

Fetch phase
[Diagram labels: data from memory, Instruction Register, R7, Program Counter, memory address]

Fetch phase
[Diagram labels: Register Bank, PC, 1, ALU inputs A and B, ALU output C, result]

Execute phase
[Diagram labels: immediate/offset from instruction register, Register Bank R0 to R7, outputs A and B, Sign Extend, shifter, mux, ALU inputs A and B, ALU output C, result, address reg, Condition Code (Status) Register]

Load/Store phase
[Diagram labels: data from memory, Register Bank R0 to R7, output A, data to memory, Address Register, address to memory]

RTL Stump
[Diagram labels: data from memory, mux, IR, Registers, data to memory, Sign Extender, reg A, reg B, Immediate, mux, shifter, '1', mux, op A, op B, ALU, ALU out, Flags, addr_reg, mux, memory address]

Implementing a Processor: MU0

We will now look at implementing the control …

… we won't be doing the Stump, as you'll be doing that in the lab!

Earlier we looked at MU0 and how to implement the datapath in structural Verilog. The datapath and control are shown below.

[Diagram: MU0 datapath and control. Blocks: Control, ir, acc, pc, alu, flag gen. Ports: data_in, data_out, address, clock, reset, mem_rd, mem_wr, halt. Control signals: X_sel, Y_sel, addr_sel, acc_en, pc_en, ir_en, M[1:0]. Internal signals: acc_out[15:0], pc_out[11:0], ir_out[15:0], ir_out[15:12], ir_out[11:0], alu_out[15:0], X_input[15:0], Y_input[15:0], N, Z]

We have (largely) implemented the datapath, so now we will look at implementing the control block.

MU0 control block

The MU0 control block contains the MU0 FSM as well as the control logic that determines the status of the control signals.

[Diagram: MU0 Control block; labels: mem_wr, mem_read, mem_rd, mem_write, data_out_sel, X_sel, ir[15:12], instr[3:0], N, Z]

We have defined the interface to the control block previously.
[Diagram: MU0 control block signals; labels: clock, reset, halt, address_sel, addr_sel, N_flag, Z_flag, acc_en, Y_sel, pc_en, data_in_sel, clk, rst, ir_en, alu_op[1:0], M[1:0]]

We could define the Verilog module header for the MU0 control block:

    module MU0_control(input            clk, rst, N_flag, Z_flag,
                       input      [3:0] instr,
                       output reg       mem_read, mem_write, halt,
                                        data_out_sel, data_in_sel,
                                        address_sel, acc_en, pc_en, ir_en,
                       output reg [1:0] alu_op);

Implementing the control

What does the control do?

… remember it issues commands to the datapath and contains the FSM!

We will be implementing the control block in Verilog.

It contains the state machine for the MU0, which can be implemented easily as an always block within the control. One implementation may be:

    // clk and rst are the port names from the module header above
    always @ (posedge clk, posedge rst)
      begin
        if (rst)
          state <= `FETCH;
        else
          case (state)
            `FETCH:   state <= `EXECUTE;
            `EXECUTE: state <= `FETCH;
          endcase
      end

The control also has to set the control signals to the rest of the datapath. There are many ways we could do this …

Implementing the control

We need to identify which datapath control signals are asserted in which state and for which instruction …

One way to simplify this process is to use signal usage charts. These are essentially state transition tables that list all the control signals in the datapath against each type of instruction, with one table for each state of the FSM, i.e. fetch and execute.
Consider the Fetch phase

… there's no dependency on the type of instruction.

MU0 Datapath: FETCH
[Diagram labels: data_out, data_out_sel, data_in, data_in_sel, address, address_sel, ir_en, acc_en, pc_en, mem_read, mem_write, alu_op[1:0], flag gen, ir, acc, pc, alu, acc_out[15:0], pc_out[11:0], ir_out[15:0], ir_out[11:0], alu_out[15:0], X_input[15:0], Y_input[15:0], N, Z]

Signal usage chart (FETCH)

    Signal        Value
    mem_read      1
    mem_write     0
    acc_en        0
    pc_en         1
    ir_en         1
    data_out_sel  1
    data_in_sel   x
    address_sel   0
    alu_op        2'b10
    halt          0

Implementing the control

DECODE/EXECUTE: depends on the instruction

Consider the LDA instruction …
[Diagram labels: Data out, Data in, Address, data_out_sel, data_in_sel, address_sel, acc, pc, ir, alu]

Consider the ADD instruction …
[Diagram labels: Data out, Data in, Address, data_out_sel, data_in_sel, address_sel, acc, pc, ir, alu]

Consider the JNE instruction …
[Diagram labels: Data out, Data in, Address, data_out_sel, data_in_sel, address_sel, acc, pc, ir, alu]

Signal usage chart (DECODE/EXECUTE)

    instr         LDA      STA      ADD      SUB      JMP      JGE      JNE      STP
    opcode        4'b0000  4'b0001  4'b0010  4'b0011  4'b0100  4'b0101  4'b0110  4'b0111
    mem_read      1        0        1        1        0        0        0        1
    mem_write     0        1        0        0        0        0        0        1
    acc_en        1        0        1        1        0        0        0        0
    pc_en         0        0        0        0        1        N        Z        0
    ir_en         0        0        0        0        0        0        0        0
    data_out_sel  x        0        0        0        x        x        x        x
    data_in_sel   0        x        0        0        1        1        1        x
    address_sel   1        1        1        1        x        x        x        x
    alu_op        2'b00    2'b00    2'b01    2'b11    2'b00    2'b00    2'b00    2'b00
    halt          0        0        0        0        0        0        0        1

Implementing the control

Now we know how the control signals depend on the type of instruction and the phase (fetch or execute) of the instruction.

There are many ways we could approach the implementation in Verilog …

• We could set the control signals within the FSM always block. This could be difficult to get 'correct'.
• We could create another always block to set all the control signals.
• We could create an always block for each control signal that determines the signal value depending on whether it is the fetch or execute phase, and on the type of instruction.
• … and more. Each would probably create a block of synthesizable Verilog, although some would be easier to implement than others.
• … there's no right or wrong way, as long as you focus on implementing synthesizable Verilog.

Implementing the control

We'll have a go at implementing a single always block for setting the control signals ...

    always @ (*)
      case (state)
        `FETCH:   begin
                    mem_read     = 1'b1;
                    mem_write    = 1'b0;
                    acc_en       = 1'b0;
                    pc_en        = 1'b1;
                    ir_en        = 1'b1;
                    data_out_sel = 1'b1;
                    data_in_sel  = 1'bx;
                    address_sel  = 1'b0;
                    alu_op       = 2'b10;
                    halt         = 1'b0;
                  end
        `EXECUTE: begin
                    case (instr)
                      // see notes
                    endcase
                  end
      endcase

Verilog Test Bench

    module MU0_test();

    // declarations, reg etc. removed for clarity

    MU0 processor(.clock    (clk),           // Instantiate the device
                  .reset    (reset),
                  .data_in  (data_in),
                  .data_out (data_out),
                  .address  (address),
                  .mem_rd   (memory_read),   // read strobe (both strobes were wired to memory_write)
                  .mem_wr   (memory_write),
                  .halt     (halt));

    initial                  // This block resets the processor
      begin                  // and, later, stops the simulation
        reset = 0;
        #200;
        reset = 1;
        #100;
        reset = 0;
        #1000;
        $stop;
      end
    initial memory[0] = 16'h0000;    // Memory initialisation
    initial memory[1] = 16'h2002;    // Read from file really @@@
    initial memory[2] = 16'h1000;
    initial memory[3] = 16'h4000;

    always @ (address)               // Trivial memory read @@@
      #20 data_in = memory[address];

    endmodule

Test Bench

How do we check that the model is correct?

… we test!

[Diagram: the Testbench applies a test stimulus to the Model and checks the results against the Model's response]

The same test should be applied at every level of the design hierarchy and on the fabricated chip.

The test must be passed before proceeding to the next level down.

Overview

We have looked at:

• RISC v CISC
• A general RISC instruction format
• Simple RISC control
• Pipelining
• Developed a specification for a simple RISC processor: the Stump
• Looked at the Stump instruction set
• Briefly looked at testing of the Stump
• Developed an architectural design for the Stump and checked it using path usage diagrams
• Developed an RTL design for the Stump
• We've looked at implementing the control using MU0 as a design example

… you will now build the Stump in the lab!