Lecture 12: Microarchitecture II

Computer Architecture/C.A (ETH Zürich, Spring 2020)

Tony Lim 2021. 6. 22. 15:53

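For an R-type instruction, the destination register is selected by bits 15:11 of the instruction (the rd field).

The RegWrite control signal is ON (1).

R-type reads both rs and rt, so the second mux (the ALU input mux) chooses Read Data 2.

The ALUOp signal tells the funct module (ALU control) which ALU operation to perform on the two register values that were read.

We don't read or write memory, so both memory control signals are 0.

The result is written to the destination register selected by the Write Register input.

R-type doesn't change control flow, so PC+4 is chosen as the next PC.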

For an I-type instruction, the destination register comes from bits 20:16 of the instruction (the rt field).

The immediate is sign-extended, and the ALU performs the operation given by the opcode on it and the source register.
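The bit ranges above can be checked with a small field decoder. This is a minimal sketch, assuming the standard 32-bit MIPS layout; the function names `decode_fields` and `sign_extend16` are made up for this example and are not from the lecture.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_fields(instr):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31:26
        "rs": (instr >> 21) & 0x1F,      # bits 25:21, first source register
        "rt": (instr >> 16) & 0x1F,      # bits 20:16, second source (R-type) or destination (I-type)
        "rd": (instr >> 11) & 0x1F,      # bits 15:11, destination (R-type)
        "funct": instr & 0x3F,           # bits 5:0, selects the ALU operation for R-type
        "imm": sign_extend16(instr),     # bits 15:0, sign-extended immediate
    }


r_type = decode_fields(0x012A4020)  # add  $t0, $t1, $t2
i_type = decode_fields(0x2128FFFC)  # addi $t0, $t1, -4
print(r_type["rd"], i_type["rt"], i_type["imm"])
```

Both instructions write $t0 (register 8), but R-type takes it from rd while I-type takes it from rt, which is why the datapath needs a mux in front of the Write Register input.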

Load Word (LW) is similar to the I-type operation: the ALU adds the source register and the sign-extended immediate to calculate an address, and the value at that address is loaded into the destination register.

Store Word (SW) is similar to LW, but instead of reading from memory we write to it.
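The control settings described so far can be put side by side in a small table. This is a sketch only: RegWrite is named in the notes, while RegDst, ALUSrc, MemtoReg, MemRead and MemWrite are the conventional textbook names and are assumed here, since the figures are not reproduced. `None` stands for X (don't care).

```python
# Control signals per instruction type; None means "don't care" (X).
# Signal names other than RegWrite are assumed textbook names.
CONTROL = {
    "R-type": dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1, MemRead=0, MemWrite=0),
    "I-type": dict(RegDst=0, ALUSrc=1, MemtoReg=0, RegWrite=1, MemRead=0, MemWrite=0),
    "lw": dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1, MemRead=1, MemWrite=0),
    "sw": dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0, MemRead=0, MemWrite=1),
}


def writes_register(kind):
    """True if this instruction type writes the register file."""
    return CONTROL[kind]["RegWrite"] == 1


def touches_memory(kind):
    """True if this instruction type reads or writes data memory."""
    signals = CONTROL[kind]
    return signals["MemRead"] == 1 or signals["MemWrite"] == 1


for kind in CONTROL:
    print(kind, writes_register(kind), touches_memory(kind))
```

SW is the only row here with don't-care entries: it never writes a register, so the muxes that choose the destination register and the write-back value do not matter.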

For a branch, the ALU calculates the condition by comparing the first register with the second register. The X entries are don't-cares because we don't write to a register.

The orange AND gate checks that the instruction is a branch and that the condition holds. In this case the branch is not taken, so the next PC is just PC + 4.

In this case the output of the orange AND gate is 1, so we take the ALU result (the branch target address) as the next PC.
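The next-PC choice for a branch can be sketched as below. It assumes a beq-style branch where the condition is the ALU Zero flag, and the usual MIPS target computation (PC + 4 plus the sign-extended immediate shifted left by 2). The names `next_pc`, `branch` and `zero` are made up for this example.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc, branch, zero, imm16):
    """Pick the next PC for a beq-style branch.

    branch: 1 if the control unit decoded a branch instruction.
    zero:   1 if the ALU comparison found the two registers equal.
    """
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Branch offsets count instructions, so shift left by 2 to get bytes.
    target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    pc_src = branch & zero  # the AND gate from the figure
    return target if pc_src else pc_plus_4


print(hex(next_pc(0x1000, branch=1, zero=0, imm16=3)))  # not taken
print(hex(next_pc(0x1000, branch=1, zero=1, imm16=3)))  # taken
```

The AND gate matters because Zero can also be 1 for a non-branch instruction (for example a subtraction that happens to produce 0); without the Branch signal the PC would be redirected by accident.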

For a jump, the next PC comes from bits 25:0 of the instruction, and the mux selects it.

Notice that the modules below are not used at all. We will cover how to make use of them in the next lecture.
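Bits 25:0 are only 26 bits, so they cannot form a full 32-bit PC on their own. In standard MIPS the field is shifted left by 2 and combined with the top 4 bits of PC + 4. The notes do not spell this out, so treat the sketch below as the usual MIPS rule and not as something read off the lecture figure; `jump_target` is a made-up name.

```python
def jump_target(pc, instr):
    """Build the 32-bit jump target from the 26-bit field of a MIPS j instruction."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    field = instr & 0x03FFFFFF  # bits 25:0 of the instruction
    # Keep the upper 4 bits of PC+4; the field is a word address, so shift by 2.
    return (pc_plus_4 & 0xF0000000) | (field << 2)


# A j instruction at 0x00400000 whose target is 0x00400020 encodes as 0x08100008.
print(hex(jump_target(0x00400000, 0x08100008)))
```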

Single Cycle Processor

The clock cycle time of the microarchitecture is determined by how long it takes to complete the slowest instruction.

The assumed delays are not realistic; they are just for study.

This shows the different types of instructions, which stages they go through, and how much time each step takes.
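The "slowest instruction sets the clock" rule is simple arithmetic. The sketch below uses hypothetical stage delays and stage lists, not the numbers from the lecture table (which is not reproduced in these notes), just to show how the cycle time falls out.

```python
# Hypothetical stage delays in picoseconds (NOT the lecture's numbers).
STAGE_DELAY_PS = {"fetch": 200, "reg_read": 100, "alu": 200, "mem": 200, "reg_write": 100}

# Which stages each instruction type goes through (also an assumption).
STAGES_USED = {
    "R-type": ["fetch", "reg_read", "alu", "reg_write"],
    "lw": ["fetch", "reg_read", "alu", "mem", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "mem"],
    "branch": ["fetch", "reg_read", "alu"],
    "jump": ["fetch"],
}


def latency_ps(kind):
    """Total combinational delay of one instruction type."""
    return sum(STAGE_DELAY_PS[stage] for stage in STAGES_USED[kind])


def single_cycle_clock_ps():
    """The clock period must cover the slowest instruction."""
    return max(latency_ps(kind) for kind in STAGES_USED)


for kind in STAGES_USED:
    print(kind, latency_ps(kind))
print("clock period:", single_cycle_clock_ps())
```

With these made-up delays LW is the slowest instruction, so every instruction, including a jump that only needs the fetch stage, still pays the full LW latency per cycle.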

Inefficient

All instructions run as slow as the slowest instruction.
Must provide the worst-case combinational resources in parallel as required by any instruction -> needs a bunch of different modules -> hardware size increases.
Not easy to optimize or improve performance.

Microarchitecture Design Principles

Critical path design

Find and decrease the maximum combinational logic delay.
Break a path into multiple cycles if it takes too long.

Bread and butter (common case) design

Spend time and resources where it matters most.

Balanced design

Balance instruction / data flow through hardware components.
Design to eliminate bottlenecks: balance the hardware for the work.

But the single-cycle architecture violates all of them.

Multi-Cycle Microarchitectures

Determine clock cycle time independently of instruction processing time.

Each instruction takes as many clock cycles as it needs to take.

The number of cycles depends on the instruction; it may be many or few.
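To make the trade-off concrete, here is a rough timing comparison. All numbers are hypothetical: a 200 ps multi-cycle clock, an 800 ps single-cycle clock, and per-instruction cycle counts chosen only for illustration. None of them come from the lecture.

```python
MULTI_CYCLE_CLOCK_PS = 200   # hypothetical: set by the slowest single step
SINGLE_CYCLE_CLOCK_PS = 800  # hypothetical: set by the slowest whole instruction

# Hypothetical number of cycles each instruction type needs.
CYCLES = {"R-type": 4, "lw": 5, "sw": 4, "branch": 3}


def multi_cycle_time_ps(mix):
    """Execution time when each instruction takes only the cycles it needs."""
    total_cycles = sum(CYCLES[kind] * count for kind, count in mix.items())
    return total_cycles * MULTI_CYCLE_CLOCK_PS


def single_cycle_time_ps(mix):
    """Execution time when every instruction takes one long cycle."""
    return sum(mix.values()) * SINGLE_CYCLE_CLOCK_PS


mix = {"R-type": 2, "branch": 2}  # instruction counts in a tiny program
print(multi_cycle_time_ps(mix), single_cycle_time_ps(mix))
```

With these numbers the short instructions finish earlier in the multi-cycle design, but a lone LW would take 5 x 200 = 1000 ps, longer than the 800 ps single cycle. Whether multi-cycle wins overall depends on the instruction mix and on how evenly the work is split across cycles.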

Benefits of Multi-Cycle Design

Critical path design

Can keep reducing the critical path independently of the worst-case processing time of any instruction.

Bread and butter (common case) design

Can optimize the number of states it takes to execute "important" instructions that make up much of the execution time.

Balanced design

No need to provide more capability or resources than really needed.
An instruction that needs resource X multiple times doesn't require multiple X's to be implemented.
Leads to more efficient hardware -> can reuse hardware components needed multiple times for an instruction.

Downside of Multi-Cycle Design

Need to store intermediate results, which adds register overhead.

Unlike single cycle, we can reuse the ALU in different states (cycles); registers store certain data at each stage so it can be used in a later cycle.

Every instruction can be split into small pieces.

IorD (the first mux) should be 0 so that the PC goes into the Instr/Data Memory as the address.

IRWrite = 1 to store the instruction in the register (blue circle on the right).

The PC can be updated to PC+4 in this state.

This state does not update memory or the register file, so there are many X (don't care) values.

At this point we don't really know what the instruction is; we are setting up the control unit and trying to figure out what the instruction is.

The next state is conditional.

If the instruction is either LW or SW, we come to this memory address calculation state.

Read the first register (source register), add it to the sign-extended immediate in the ALU, and store the result in the blue-circled register for later use.

There is a branch here: if the opcode is LW, we read memory, write the value into the destination register, and go back to the fetch state.

Every instruction is run by the control unit, which is implemented as an FSM.

If another instruction needs to be added, we just add more branches to the FSM and use the given modules with the control unit.
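The state sequence described above can be sketched as a next-state function. Only LW and SW are covered, as in the notes. The state names are made up for this example, and splitting LW into a separate memory-read state and register write-back state is an assumption about the lecture's FSM.

```python
def next_state(state, opcode):
    """Next state of a multi-cycle control FSM that handles only lw and sw."""
    if state == "fetch":
        return "decode"  # the instruction is not known until it has been decoded
    if state == "decode":
        if opcode in ("lw", "sw"):
            return "mem_addr"
        raise NotImplementedError(f"no FSM branch for opcode {opcode!r} yet")
    if state == "mem_addr":
        # The FSM branches here: loads read memory, stores write it.
        return "mem_read" if opcode == "lw" else "mem_write"
    if state == "mem_read":
        return "mem_writeback"
    if state in ("mem_writeback", "mem_write"):
        return "fetch"
    raise ValueError(f"unknown state {state!r}")


def trace(opcode):
    """List the states one instruction visits, starting at fetch."""
    states = ["fetch"]
    while True:
        following = next_state(states[-1], opcode)
        if following == "fetch":
            return states
        states.append(following)


print(trace("lw"))
print(trace("sw"))
```

In this sketch LW visits five states and SW four, so each takes only as many cycles as it needs. Supporting another instruction means adding a new branch out of the decode state instead of raising `NotImplementedError`.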
