Computer Architecture Fundamentals: CPU, Bus, Memory & Addressing
This article explains the fundamentals of computer architecture: how the CPU, buses, memory, and addressing schemes work together.
In a Nutshell
Computer architecture describes how a computer’s components are organized and interconnected to execute instructions and process data.
CPU (Central Processing Unit)
CPU Components
1. Control Unit (CU)
- Function: Directs the flow of data between CPU, memory, and I/O devices
- Responsibilities: Instruction decoding, timing, and control signals
- Components: Instruction decoder, control logic, timing circuits
2. Arithmetic Logic Unit (ALU)
- Function: Performs arithmetic and logical operations
- Operations: Addition, subtraction, multiplication, division, AND, OR, NOT, XOR
- Registers: Temporary storage for operands and results
3. Registers
- General Purpose Registers: Store data and intermediate results
- Special Purpose Registers: Program counter, instruction register, status register
CPU Architecture Types
Von Neumann Architecture
┌─────────────────────────────────────┐
│                 CPU                 │
│   ┌─────────┐       ┌─────────┐     │
│   │   CU    │       │   ALU   │     │
│   └────┬────┘       └────┬────┘     │
│        └────────┬────────┘          │
└─────────────────┼───────────────────┘
                  │
            ┌─────┴─────┐
            │  Memory   │
            │ (Unified) │
            └───────────┘
Harvard Architecture
┌─────────────────────────────────────┐
│                 CPU                 │
│   ┌─────────┐       ┌─────────┐     │
│   │   CU    │       │   ALU   │     │
│   └────┬────┘       └────┬────┘     │
│        │                 │          │
│  ┌─────┴─────┐     ┌─────┴─────┐    │
│  │Instr. Mem │     │ Data Mem  │    │
│  └───────────┘     └───────────┘    │
└─────────────────────────────────────┘
Bus Systems
Bus Types
1. Address Bus
- Purpose: Carries memory addresses
- Direction: Unidirectional (CPU to memory/I/O)
- Width: Determines maximum addressable memory
2. Data Bus
- Purpose: Carries actual data
- Direction: Bidirectional
- Width: Determines data transfer rate
3. Control Bus
- Purpose: Carries control signals
- Direction: Bidirectional
- Signals: Read/Write, Interrupt, Clock, Reset
Bus Architecture Example
CPU
 │
 ├─── Address Bus (32 bits)
 ├─── Data Bus (64 bits)
 └─── Control Bus (16 bits)
       │
       ├─── Memory
       ├─── I/O Devices
       └─── Secondary Storage
Memory Systems
Memory Hierarchy
Level 1: CPU Registers
↓ (Fastest, Smallest)
Level 2: Cache Memory (L1, L2, L3)
↓
Level 3: Main Memory (RAM)
↓
Level 4: Secondary Storage (SSD/HDD)
↓ (Slowest, Largest)
Level 5: Tertiary Storage (Tape, Cloud)
Memory Types
RAM (Random Access Memory)
- SRAM (Static RAM): Fast, expensive, used for cache
- DRAM (Dynamic RAM): Slower, cheaper, used for main memory
ROM (Read-Only Memory)
- Mask ROM: Factory-programmed, unchangeable
- PROM: Programmable once
- EPROM: Erasable with UV light
- EEPROM: Electrically erasable
- Flash Memory: Modern EEPROM variant
Memory Addressing
Addressing Modes
1. Immediate Addressing
MOV AX, 25 ; Load immediate value 25
2. Direct Addressing
MOV AX, [1000] ; Load from memory address 1000
3. Indirect Addressing
MOV AX, [BX] ; Load from address stored in BX
4. Register Addressing
MOV AX, BX ; Copy from register BX to AX
Memory Layout
┌─────────────────────────────────────┐
│           Stack Segment             │ ← High addresses
├─────────────────────────────────────┤
│           Data Segment              │
├─────────────────────────────────────┤
│           Code Segment              │
└─────────────────────────────────────┘ ← Low addresses
Instruction Execution Cycle
Fetch-Decode-Execute Cycle
1. FETCH
└── Load instruction from memory
2. DECODE
└── Interpret instruction
3. EXECUTE
└── Perform operation
4. WRITEBACK
└── Store result (if needed)
Example: Adding Two Numbers
Step 1: FETCH
- PC contains address of ADD instruction
- Load instruction into IR
Step 2: DECODE
- Identify ADD operation
- Determine operand locations
Step 3: EXECUTE
- Load operands from memory/registers
- Perform addition in ALU
Step 4: WRITEBACK
- Store result in destination
- Update PC to next instruction
Performance Metrics
CPU Performance Factors
1. Clock Speed
- Measurement: Gigahertz (GHz)
- Impact: Higher clock speed generally means faster execution (for the same IPC)
- Limitation: Heat generation and power consumption
2. Instructions Per Cycle (IPC)
- Definition: Number of instructions executed per clock cycle
- Modern CPUs: Can execute multiple instructions per cycle
3. Cache Hit Rate
- Definition: Percentage of memory accesses found in cache
- Impact: Higher hit rate = better performance
Memory Performance
Access Time Comparison
CPU Registers: 0-1 ns
L1 Cache: 1-4 ns
L2 Cache: 10-20 ns
L3 Cache: 20-100 ns
Main Memory (RAM): 50-100 ns
SSD: 50-150 μs
HDD: 5-20 ms
Modern CPU Features
1. Pipelining
- Concept: Overlap instruction execution
- Stages: Fetch, Decode, Execute, Memory, Writeback
- Benefit: Higher instruction throughput
2. Superscalar Architecture
- Concept: Multiple execution units
- Benefit: Execute multiple instructions simultaneously
3. Out-of-Order Execution
- Concept: Execute instructions in different order than written
- Benefit: Better utilization of execution units
4. Speculative Execution
- Concept: Execute instructions before knowing if they’re needed
- Benefit: Higher performance, but security implications
Practical Examples
Memory Address Calculation
Given:
- Base address: 0x1000
- Element index: 0x20 (32 decimal)
- Element size: 4 bytes
Element address = Base + (Index × Element size)
                = 0x1000 + (0x20 × 4)
                = 0x1000 + 0x80
                = 0x1080
Cache Mapping Example
Direct-Mapped Cache:
- Cache size: 8KB
- Block size: 64 bytes
- Number of blocks: 8KB / 64B = 128 blocks
Address mapping (assuming 32-bit addresses):
Address bits: [Tag][Index][Offset]
Tag: 19 bits (32 − 7 − 6), Index: 7 bits, Offset: 6 bits
Best Practices
For Programmers
- Locality of reference: Access memory sequentially when possible
- Cache optimization: Structure data for cache efficiency
- Memory alignment: Align data on natural boundaries
- Minimize memory access: Use registers and cache effectively
For System Designers
- Balance components: Ensure no single bottleneck
- Consider memory hierarchy: Optimize for common access patterns
- Plan for scalability: Design for future expansion
- Monitor performance: Track and optimize bottlenecks
Common Issues
1. Memory Leaks
- Cause: Unreleased memory allocations
- Impact: Reduced available memory over time
- Solution: Proper memory management
2. Cache Thrashing
- Cause: Poor cache utilization
- Impact: Reduced performance
- Solution: Optimize data access patterns
3. Bus Contention
- Cause: Multiple components competing for bus access
- Impact: Reduced throughput
- Solution: Proper bus arbitration and scheduling
Future Trends
1. Quantum Computing
- Difference: Uses quantum bits (qubits) instead of classical bits
- Potential: Exponential speedup for certain problems
2. Neuromorphic Computing
- Concept: Brain-inspired architecture
- Application: AI and machine learning
3. 3D Stacking
- Concept: Stack components vertically
- Benefit: Reduced interconnect distance, higher density