Why should I learn computer architecture?

[Figure: Intel Nehalem microarchitecture block diagram, showing the instruction fetch and decode front end, the out-of-order execution core with execution ports 0-5, the L1/L2/L3 cache hierarchy, and the uncore (QuickPath Interconnect, DDR3 memory controller). GT/s: gigatransfers per second.]
Reason #1: It's fun
- Moore's Law means the field is always radically changing
  - where else do you get an exponentially larger number of Legos to play with every year?
- New application domains lead to totally new designs
  - GPUs
  - Phones
  - Data Center
  - Wearable
  - Implantable processors
  - Quantum, Biological, etc.
- CS ideas are increasingly applicable to hardware design
- Making things faster, smaller, more energy efficient is a rush

Reason #2: Performance Matters
- Google Data Center: 100,000 computers + your code + 2X faster = 50,000 computers saved = 1 MW of electricity saved
- Mobile Devices: 1 phone + your code + 2X faster = "fast enough" = runs on 4 M more iPads

How can you speed up code if you don't know how a computer works?

Reason #3: Great computer scientists know the whole stack.
- Bill Gates
- Mark Zuckerberg
- Richard Stallman
- Jeff Dean (Google)
- Guido (Python)
- Houston (Dropbox)
- Torvalds (Linux)
- Limor Fried (Adafruit)
- Alan Turing
- Larry Page (Google)

Which of these people didn't know how a computer works?

Reason #4: Employers want employees who are generalists and know the whole stack.

Who knows what problems you might end up having to innovate on?

What we will learn in this class
- Basic architecture:
  - Instruction Sets
  - Performance Analysis
  - Pipelining
  - Caches
  - Virtual Memory
  - In-order processors
  - How to build your own version of all of the above.
- Advanced Topics:
  - Multicore
  - Data centers
  - Mobile Processors
  - GPUs
  - Out-of-order Processors
  - How x86 / ARM / NVidia combines all of the above
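
The data center arithmetic from Reason #2 can be written out as a short calculation. This is a minimal sketch, not part of the original slides: the function names are hypothetical, it assumes the workload scales perfectly across machines, and the per-machine power figure is only what the slide's own numbers imply (1 MW spread over 50,000 machines).

```python
import math


def machines_needed(baseline_machines: int, speedup: float) -> int:
    """Machines required for the same load once the code runs `speedup` times faster.

    Assumes perfect scaling: the total work is fixed and divides evenly.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.ceil(baseline_machines / speedup)


def machines_saved(baseline_machines: int, speedup: float) -> int:
    """Machines that can be switched off after the speedup."""
    return baseline_machines - machines_needed(baseline_machines, speedup)


# Numbers taken from the slide: 100,000 computers, code made 2X faster.
saved = machines_saved(100_000, 2.0)

# Implied by the slide (assumption): 1 MW saved across 50,000 machines.
watts_per_machine = 1_000_000 / 50_000
power_saved_watts = saved * watts_per_machine

print(f"machines saved: {saved}")
print(f"power saved: {power_saved_watts / 1e6:.1f} MW")
```

Changing the speedup argument shows how quickly the savings taper off: going from 2X to 4X saves a further 25,000 machines, half as many as the first doubling.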