Posts in the 'Computer Architecture' category

Computer Architecture 29

In ML, quantization usually means FP32 -> INT8. Usually, once training of the model is finished, the trained parameters fall into a simple, shallow distribution. Scale? Zero-point? How do I calculate the scale? In the example Z = 207, and with b = 8 bits the range maps onto 2^b = 256 levels (2^b - 1 = 255 steps), so S = 0.6/255. Linear depends on the function; symmetric vs. asymmetric depends on the starting point (0 == symmetric, otherwise asymmetric). Dequantization cannot repr..

Computer Architecture/Cornell ECE 5545 2024.07.07
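
The scale and zero-point calculation mentioned in the quantization excerpt above can be sketched as follows. This is a minimal sketch of asymmetric (affine) quantization, not necessarily the exact scheme used in the lecture. The function names and the value range [-0.4, 0.2] are hypothetical, chosen only so that the range is 0.6 as in S = 0.6/255.

```python
def affine_params(x_min, x_max, bits=8):
    """Return (scale, zero_point) mapping [x_min, x_max] onto 0 .. 2^bits - 1."""
    steps = 2 ** bits - 1                 # 255 steps for 8 bits
    scale = (x_max - x_min) / steps       # real-valued width of one integer step
    zero_point = round(-x_min / scale)    # integer that represents real 0.0
    return scale, zero_point


def quantize(x, scale, zero_point, bits=8):
    """Map a real value to an unsigned integer code, clamped to the valid range."""
    q = round(x / scale) + zero_point
    return max(0, min(2 ** bits - 1, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate real value."""
    return (q - zero_point) * scale


# Hypothetical value range with width 0.6, so scale is about 0.6 / 255.
scale, zero_point = affine_params(-0.4, 0.2)
code = quantize(0.05, scale, zero_point)
approx = dequantize(code, scale, zero_point)
print(scale, zero_point, code, approx)
```

The round trip does not return 0.05 exactly: the error is bounded by half a step (scale / 2) for values inside the range, which is the information dequantization cannot recover.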

ML HW & Systems. Lecture 5: Microarchitecture

From the big picture (architecture) to actual unit design (microarchitecture); circuits will not be covered. Accumulator = adds numbers (needs to hold a number); an adder doesn't need to hold state. This takes O(n) to compute the dot product of length-n vectors, e.g. 8 cycles if n is 8. Using multiple multipliers will be faster, but how do we merge the results? We need an adder tree. Now it takes 2 cycles to calculate 8 len..

Computer Architecture/Cornell ECE 5545 2024.07.06
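
The difference between a single multiply-accumulate unit and parallel multipliers feeding an adder tree, as described in the excerpt above, can be modeled in software. This is only an illustrative sketch: the function names are hypothetical, and it counts sequential steps and adder-tree levels rather than hardware cycles, since the actual cycle count depends on how the design is pipelined.

```python
def dot_sequential(a, b):
    """One multiplier plus an accumulator: one multiply-accumulate per step."""
    acc = 0          # the accumulator holds state between steps
    steps = 0
    for x, y in zip(a, b):
        acc += x * y
        steps += 1
    return acc, steps


def dot_adder_tree(a, b):
    """All products computed in parallel, then merged pairwise by an adder tree."""
    partial = [x * y for x, y in zip(a, b)]   # one multiplier per element
    if not partial:
        return 0, 0
    levels = 0
    while len(partial) > 1:
        if len(partial) % 2:                  # pad odd-sized levels with a zero
            partial.append(0)
        # Each adder in this level sums one adjacent pair.
        partial = [partial[i] + partial[i + 1] for i in range(0, len(partial), 2)]
        levels += 1
    return partial[0], levels


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 1, 1, 1, 1, 1, 1, 1]
print(dot_sequential(a, b))   # (36, 8): 8 sequential steps
print(dot_adder_tree(a, b))   # (36, 3): 3 levels of adders for 8 products
```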

Lecture 21b: Memory Hierarchy and Caches

SRAM is used for the L1 and L2 caches; it is more expensive but has higher bandwidth. Cache access: index into the tag and data stores with the index bits of the address, check the valid bit in the tag store, and compare the tag bits of the address with the tag stored in the tag store. Every block (for example 00010, 01010, 10010, 11010: 4 blocks) can be mapped to the exact same place, and with the tag we can find which cluster of block is th..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.27
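
The cache access steps in the excerpt above (index, valid-bit check, tag compare) can be sketched with a small direct-mapped model. The class and method names are hypothetical. It assumes the low 3 bits of a 5-bit block address are the index and the high 2 bits are the tag, which is consistent with the example blocks all ending in 010, but the excerpt does not state this explicitly.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only the valid bits and tags."""

    def __init__(self, index_bits):
        self.index_bits = index_bits
        num_sets = 1 << index_bits
        self.valid = [False] * num_sets   # valid bit per entry in the tag store
        self.tags = [0] * num_sets        # stored tag per entry

    def split(self, block_addr):
        """Split a block address into (tag, index)."""
        index = block_addr & ((1 << self.index_bits) - 1)
        tag = block_addr >> self.index_bits
        return tag, index

    def access(self, block_addr):
        """Return True on a hit; on a miss, install the block and return False."""
        tag, index = self.split(block_addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


cache = DirectMappedCache(index_bits=3)
for addr in (0b00010, 0b01010, 0b10010, 0b11010):
    print(format(addr, "05b"), cache.split(addr))   # same index 2, tags 0 to 3
```

Because all four blocks share index 2, accessing them in turn evicts the previous occupant each time; the tag is what tells the cache which of the four is currently stored there.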

Lecture 20: Graphics Processing Units

Programming Model = how the programmer expresses the code, e.g. Sequential (von Neumann), Data Parallel (SIMD), Dataflow, Multi-threaded (MIMD, SPMD). Execution Model = how the hardware executes the code underneath, e.g. Out-of-order execution, Vector processor, Array processor, Dataflow processor, Multithreaded processor. GPU = SPMD (Single Program, Multiple Data) model implemented by a SIMD pr..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.23

Lecture 19: SIMD Processors

SISD = Single instruction operates on a single data element. SIMD = Single instruction operates on multiple data elements (Array processor, Vector processor). MISD = Multiple instructions operate on a single data element (systolic array processor, streaming processor). MIMD = Multiple instructions operate on multiple data elements (multiple instruction streams): Multiprocessor, Multithreaded processor. n..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.19

Lecture 18a: VLIW

This post is protected.

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.11

Lecture 16: Branch Prediction

Branch Problem = the next fetch address after a control-flow instruction is not determined until N cycles later in a pipelined processor, so we try to predict the branch target address. Based on previous history, we might have an address in the BTB. If not, we just add the instruction size to the current PC and use that as the next PC. Simple Branch Prediction Schemes: Compile time (Static): Assuming always not taken, Assuming ..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.10
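
The next-PC selection described in the branch prediction excerpt above (use the BTB entry if there is one, otherwise fall through to PC plus instruction size) can be sketched as below. This is a simplified illustration: the BTB is modeled as a plain dictionary, the function names are hypothetical, and the 4-byte instruction size is an assumption.

```python
def next_fetch_address(pc, btb, instruction_size=4):
    """Predict the next PC: the BTB target on a hit, else the sequential address."""
    return btb.get(pc, pc + instruction_size)


def record_taken_branch(pc, target, btb):
    """After a branch resolves as taken, remember its target for next time."""
    btb[pc] = target


btb = {}
print(hex(next_fetch_address(0x100, btb)))   # 0x104: no history yet, fall through
record_taken_branch(0x100, 0x200, btb)
print(hex(next_fetch_address(0x100, btb)))   # 0x200: predicted from the BTB
```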

Lecture 15b: Out of Order , DataFlow & LD/ST Handling

Reverse-engineer and create the data flow: by looking at the first picture we can create the right side of the second picture, which is the data flow graph, and by looking at the data flow graph we can create the instructions on the left. Out-of-Order Execution with Precise Exceptions: use a reorder buffer to reorder instructions before committing them to architectural state. An instruction updates the Register Alias Table (RAT, frontend regis..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.10

Lecture 15a: Out-of-Order Execution

For example, we are trying to execute the red instruction "Add R3 R4". We need R3, so we check the reorder buffer to see whether it is valid. If it is not valid, we stall the instruction. If it is valid, we can bypass and take the value from the reorder buffer. Notice that the instructions below the red instruction have no dependency on it. But since R3 in the red instruction has to wait for the IMUL instruction (8 cycles), the other 3 blue-line instruction s..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.07

Lecture 14: Pipelining Issues

This shows we might be wasting 3 instructions due to a branch instruction: we need to flush 3 instructions if the branch condition is right. We want to know more quickly whether the condition of the branch instruction is satisfied, so we can move the target address check earlier. Now we can reduce the number of instructions to flush when the target matches. Advantage = reduced CPI (cycles per instruction). Di..

Computer Architecture/C.A (ETH Zürich, Spring 2020) 2021.07.05
