Computer Architecture Fundamentals – CPU, Bus, Memory, Addressing
This article explains the fundamentals of computer architecture, including exam-relevant key points, core components, and worked examples.
In a Nutshell
Computer architecture describes the logical structure and functionality of a computer system. Central components are CPU, memory, bus systems and addressing.
Compact Technical Description
The CPU (Central Processing Unit) fetches, decodes, and executes program instructions. It consists of:
- Control Unit (coordinates instruction execution)
- Arithmetic Logic Unit / ALU (arithmetic-logical operations)
- Registers (very fast temporary storage)
Data and instructions are transferred between CPU, memory, and peripherals via a bus system:
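- Data Bus (carries the data being transferred)
- Address Bus (carries the address of the memory cell or device being accessed)
- Control Bus (carries control signals such as read/write)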
Memory includes:
- RAM (volatile, fast)
- ROM (non-volatile)
- Cache (very fast, close to CPU)
Addressing determines how memory cells are accessed. The size of the address space depends on the address width of the architecture: n address bits allow 2^n distinct addresses (e.g., 32 bits allow 2^32 addresses).
Exam-Relevant Key Points
- CPU as central processing unit (ALU + Control Unit + Registers)
- Bus system: Address, Data, Control lines
- RAM vs. ROM
- Binary addressing; address space depends on bit width (IHK-relevant)
- Memory access via addresses on address bus
- Security aspect: invalid addresses and buffer overflows (writing past the end of a buffer overwrites adjacent memory)
- Performance: Bus/memory architecture as bottleneck
- Documentation: Architecture/address logic must be understandable
Core Components
- CPU
- Control Unit
- ALU
- Register Set
- Bus System
- Main Memory (RAM/ROM)
- Cache
- Peripherals
- Addressing Scheme
- Memory Hierarchy
Simple Practical Example
32-bit system with byte addressing: 2^32 addresses = 4 GiB address space
Address 0x00000000 -> 1st memory cell
Address 0xFFFFFFFC -> last 4-byte-aligned memory location (the last byte address is 0xFFFFFFFF)
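The numbers in this example can be reproduced with a short Python sketch. It assumes byte addressing and a 4-byte word; the function and variable names are illustrative and not part of the original example.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits

ADDRESS_BITS = 32      # address width from the example
WORD_SIZE = 4          # assumed word size in bytes

size = address_space_bytes(ADDRESS_BITS)
first_address = 0
last_byte_address = size - 1
last_word_address = size - WORD_SIZE   # last 4-byte-aligned address

print(f"{size} addresses = {size // 2**30} GiB")
print(f"first address:     0x{first_address:08X}")
print(f"last byte address: 0x{last_byte_address:08X}")
print(f"last aligned word: 0x{last_word_address:08X}")
```

Changing `ADDRESS_BITS` to 16 or 64 shows how quickly the address space grows with each additional address bit.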
├─── Address Bus (32 bits)
├─── Data Bus (64 bits)
└─── Control Bus (16 bits)
     │
     ├─── Memory
     ├─── I/O Devices
     └─── Secondary Storage
## Memory Systems
### Memory Hierarchy
Level 1: CPU Registers
    ↓ (Fastest, Smallest)
Level 2: Cache Memory (L1, L2, L3)
    ↓
Level 3: Main Memory (RAM)
    ↓
Level 4: Secondary Storage (SSD/HDD)
    ↓ (Slowest, Largest)
Level 5: Tertiary Storage (Tape, Cloud)
### Memory Types
#### RAM (Random Access Memory)
- **SRAM (Static RAM)**: Fast, expensive, used for cache
- **DRAM (Dynamic RAM)**: Slower, cheaper, used for main memory
#### ROM (Read-Only Memory)
- **Mask ROM**: Factory-programmed, unchangeable
- **PROM**: Programmable once
- **EPROM**: Erasable with UV light
- **EEPROM**: Electrically erasable
- **Flash Memory**: Modern EEPROM variant
## Memory Addressing
### Addressing Modes
#### 1. Immediate Addressing
```assembly
MOV AX, 25 ; Load immediate value 25
```
#### 2. Direct Addressing
```assembly
MOV AX, [1000] ; Load from memory address 1000
```
#### 3. Indirect Addressing
```assembly
MOV AX, [BX] ; Load from address stored in BX
```
#### 4. Register Addressing
```assembly
MOV AX, BX ; Copy from register BX to AX
```
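
The four addressing modes can be mimicked in Python to make the difference between "value", "address", and "address held in a register" concrete. This is a toy model, not an x86 emulator: the register names mirror the assembly above, while the memory contents and the helper function names are made-up illustration values.

```python
# Toy machine state (hypothetical values chosen for illustration).
registers = {"AX": 0, "BX": 2000}
memory = {1000: 111, 2000: 222}

def mov_immediate(dest, value):
    """MOV dest, value: the operand is the value itself."""
    registers[dest] = value

def mov_direct(dest, address):
    """MOV dest, [address]: the operand is read from a fixed memory address."""
    registers[dest] = memory[address]

def mov_indirect(dest, pointer_reg):
    """MOV dest, [reg]: the register holds the address of the operand."""
    registers[dest] = memory[registers[pointer_reg]]

def mov_register(dest, src):
    """MOV dest, src: the operand is the content of another register."""
    registers[dest] = registers[src]

mov_immediate("AX", 25)
after_immediate = registers["AX"]
mov_direct("AX", 1000)
after_direct = registers["AX"]
mov_indirect("AX", "BX")
after_indirect = registers["AX"]
mov_register("AX", "BX")
after_register = registers["AX"]

print(after_immediate, after_direct, after_indirect, after_register)
```

With the state above, AX receives the literal 25, then the content of address 1000, then the content of the address stored in BX, and finally the content of BX itself.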
### Memory Layout
┌─────────────────────────────────────┐
│ Stack Segment │ ← High addresses
├─────────────────────────────────────┤
│ Data Segment │
├─────────────────────────────────────┤
│ Code Segment │
└─────────────────────────────────────┘ ← Low addresses
## Instruction Execution Cycle
### Fetch-Decode-Execute Cycle
1. FETCH
└── Load instruction from memory
2. DECODE
└── Interpret instruction
3. EXECUTE
└── Perform operation
4. WRITEBACK
└── Store result (if needed)
### Example: Adding Two Numbers
Step 1: FETCH
- PC contains address of ADD instruction
- Load instruction into IR
Step 2: DECODE
- Identify ADD operation
- Determine operand locations
Step 3: EXECUTE
- Load operands from memory/registers
- Perform addition in ALU
Step 4: WRITEBACK
- Store result in destination
- Update PC to next instruction (in many designs the PC is already incremented during the fetch step)
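
The cycle can be sketched as a minimal interpreter loop. This is a simplified model under stated assumptions: instructions are hypothetical `(opcode, dest, src1, src2)` tuples, operands live only in registers, and the PC is advanced right after the fetch, which is one common convention rather than the only one.

```python
def run(program, registers):
    """Execute a list of (opcode, dest, src1, src2) tuples on a register dict."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        # FETCH: load the instruction the PC points to (the "IR" here is a local variable)
        instruction = program[pc]
        pc += 1

        # DECODE: split the instruction into operation and operand locations
        opcode, dest, src1, src2 = instruction

        # EXECUTE: read the operands and perform the operation (the ALU's job)
        if opcode == "ADD":
            result = registers[src1] + registers[src2]
        elif opcode == "SUB":
            result = registers[src1] - registers[src2]
        else:
            raise ValueError(f"unknown opcode: {opcode}")

        # WRITEBACK: store the result in the destination register
        registers[dest] = result
    return registers

# Adding two numbers: R2 = R0 + R1
regs = run([("ADD", "R2", "R0", "R1")], {"R0": 7, "R1": 5, "R2": 0})
print(regs["R2"])
```

A real CPU performs these steps in hardware and usually overlaps them across instructions; the loop only shows their logical order.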
## Performance Metrics
### CPU Performance Factors
1. Clock Speed
- Measurement: Gigahertz (GHz)
- Impact: Higher clock speed means faster execution, provided the work done per cycle (IPC) stays the same
- Limitation: Heat generation and power consumption
2. Instructions Per Cycle (IPC)
- Definition: Number of instructions executed per clock cycle
- Modern CPUs: Can execute multiple instructions per cycle
3. Cache Hit Rate
- Definition: Percentage of memory accesses found in cache
- Impact: Higher hit rate = better performance
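
How clock speed and IPC combine can be estimated with the idealized relation time = instructions / (IPC × clock frequency). The sketch below uses two hypothetical CPUs and a made-up workload size; it ignores memory stalls and other real-world effects.

```python
def execution_time_seconds(instruction_count: int, ipc: float, clock_hz: float) -> float:
    """Idealized execution time: instructions / (IPC * clock frequency)."""
    if ipc <= 0 or clock_hz <= 0:
        raise ValueError("ipc and clock_hz must be positive")
    return instruction_count / (ipc * clock_hz)

INSTRUCTIONS = 1_000_000_000  # hypothetical workload: one billion instructions

# Hypothetical CPU A: 3 GHz, two instructions per cycle on average.
time_a = execution_time_seconds(INSTRUCTIONS, ipc=2.0, clock_hz=3.0e9)
# Hypothetical CPU B: 4 GHz, one instruction per cycle on average.
time_b = execution_time_seconds(INSTRUCTIONS, ipc=1.0, clock_hz=4.0e9)

print(f"CPU A: {time_a:.3f} s")
print(f"CPU B: {time_b:.3f} s")
```

In this model the lower-clocked CPU A finishes first, which is why clock speed alone is not a reliable performance measure.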
### Memory Performance
#### Access Time Comparison
Approximate orders of magnitude; actual values depend on the hardware.
CPU Registers: < 1 ns
L1 Cache: 1-4 ns
L2 Cache: 10-20 ns
L3 Cache: 20-100 ns
Main Memory (RAM): 50-100 ns
SSD: 50-150 μs
HDD: 5-20 ms
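
The effect of the cache hit rate on these access times can be estimated with a simple single-level model: every access pays the cache time, and a miss additionally pays the main-memory time. The 1 ns and 100 ns figures below are illustrative values taken from the ranges above, and the function name is an assumption of this sketch.

```python
def average_access_time_ns(hit_rate: float, cache_time_ns: float, memory_time_ns: float) -> float:
    """Average access time = cache time + miss rate * memory time (single cache level)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return cache_time_ns + miss_rate * memory_time_ns

CACHE_NS = 1.0     # illustrative L1 access time
MEMORY_NS = 100.0  # illustrative RAM access time

amat_95 = average_access_time_ns(0.95, CACHE_NS, MEMORY_NS)
amat_99 = average_access_time_ns(0.99, CACHE_NS, MEMORY_NS)

print(f"95% hit rate: {amat_95:.1f} ns")
print(f"99% hit rate: {amat_99:.1f} ns")
```

Under these assumptions, raising the hit rate from 95% to 99% cuts the average access time to roughly a third, because the remaining misses dominate the cost.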
## Modern CPU Features
1. Pipelining
- Concept: Overlap instruction execution
- Stages: Fetch, Decode, Execute, Memory, Writeback
- Benefit: Higher instruction throughput
2. Superscalar Architecture
- Concept: Multiple execution units
- Benefit: Execute multiple instructions simultaneously
3. Out-of-Order Execution
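- Concept: Execute instructions as soon as their operands are ready, in a different order than they appear in the program, while results still become visible in program order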
- Benefit: Better utilization of execution units
4. Speculative Execution
- Concept: Execute instructions before knowing if they’re needed
- Benefit: Higher performance, but with security implications (e.g., Spectre-class side-channel attacks)
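
The throughput benefit of pipelining (item 1) can be quantified with an idealized model: without overlap, n instructions on a k-stage design take n × k cycles; with a full pipeline and no stalls, they take k + (n − 1) cycles. The sketch assumes one cycle per stage and no hazards, which real pipelines do not achieve.

```python
def cycles_without_pipeline(instruction_count: int, stages: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return instruction_count * stages

def cycles_with_pipeline(instruction_count: int, stages: int) -> int:
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    if instruction_count == 0:
        return 0
    return stages + (instruction_count - 1)

STAGES = 5            # Fetch, Decode, Execute, Memory, Writeback
INSTRUCTIONS = 100    # hypothetical instruction count

sequential_cycles = cycles_without_pipeline(INSTRUCTIONS, STAGES)
pipelined_cycles = cycles_with_pipeline(INSTRUCTIONS, STAGES)
speedup = sequential_cycles / pipelined_cycles

print(f"without pipeline: {sequential_cycles} cycles")
print(f"with pipeline:    {pipelined_cycles} cycles")
print(f"ideal speedup:    {speedup:.2f}x")
```

For long instruction streams the ideal speedup approaches the number of stages; branch mispredictions and data hazards reduce it in practice.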
## Practical Examples
### Memory Address Calculation
Given:
- Base address: 0x1000
- Index: 0x20 (element number 32)
- Element size: 4 bytes
Effective address = Base + (Index × Element size)
Effective address = 0x1000 + (0x20 × 4)
Effective address = 0x1000 + 0x80
Effective address = 0x1080
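
The same calculation can be checked in a few lines of Python. The function name `element_address` and its parameter names are illustrative; the values are the ones given above.

```python
def element_address(base: int, index: int, element_size: int) -> int:
    """Address of element number `index` in an array starting at `base`."""
    return base + index * element_size

BASE = 0x1000
INDEX = 0x20
ELEMENT_SIZE = 4

byte_offset = INDEX * ELEMENT_SIZE
address = element_address(BASE, INDEX, ELEMENT_SIZE)

print(f"byte offset: 0x{byte_offset:X}")
print(f"address:     0x{address:X}")
```

This is the arithmetic a compiler generates for an array access such as `array[i]` when the elements are 4 bytes wide.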
### Cache Mapping Example
Direct-Mapped Cache (assuming 32-bit addresses):
- Cache size: 8 KiB
- Block size: 64 bytes
- Number of blocks: 8 KiB / 64 B = 128 blocks
Address mapping:
Address bits: [Tag][Index][Offset]
Offset: log2(64) = 6 bits, Index: log2(128) = 7 bits, Tag: 32 - 7 - 6 = 19 bits
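
The bit widths and the split of an address into tag, index, and offset can be derived programmatically. The sketch assumes 32-bit addresses and power-of-two sizes; the sample address `0x12345678` and all names are chosen only for illustration.

```python
ADDRESS_BITS = 32          # assumed address width
CACHE_SIZE = 8 * 1024      # 8 KiB
BLOCK_SIZE = 64            # bytes per block

num_blocks = CACHE_SIZE // BLOCK_SIZE
offset_bits = BLOCK_SIZE.bit_length() - 1   # log2 for powers of two
index_bits = num_blocks.bit_length() - 1
tag_bits = ADDRESS_BITS - index_bits - offset_bits

def split_address(address: int):
    """Split an address into (tag, index, offset) for the cache above."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset

tag, index, offset = split_address(0x12345678)

print(f"blocks={num_blocks}, tag bits={tag_bits}, index bits={index_bits}, offset bits={offset_bits}")
print(f"tag=0x{tag:X}, index={index}, offset={offset}")
```

The index selects one of the 128 cache blocks, the offset selects a byte inside the block, and the stored tag is compared to decide whether the access is a hit.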
## Best Practices
### For Programmers
- Locality of reference: Access memory sequentially when possible
- Cache optimization: Structure data for cache efficiency
- Memory alignment: Align data on natural boundaries
- Minimize memory access: Use registers and cache effectively
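
Memory alignment from the list above comes down to simple bit arithmetic. The sketch below shows one common way to test an address and round it up to the next boundary; the helper names are illustrative, and the alignment is assumed to be a power of two.

```python
def is_aligned(address: int, alignment: int) -> bool:
    """True if `address` is a multiple of `alignment`."""
    return address % alignment == 0

def align_up(address: int, alignment: int) -> int:
    """Round `address` up to the next multiple of a power-of-two `alignment`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)

unaligned = 0x1003
aligned = align_up(unaligned, 4)

print(f"0x{unaligned:X} aligned to 4 bytes: {is_aligned(unaligned, 4)}")
print(f"next 4-byte boundary: 0x{aligned:X}")
```

A 4-byte value stored at 0x1003 would straddle two aligned words; placing it at 0x1004 avoids that.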
### For System Designers
- Balance components: Ensure no single bottleneck
- Consider memory hierarchy: Optimize for common access patterns
- Plan for scalability: Design for future expansion
- Monitor performance: Track and optimize bottlenecks
## Common Issues
1. Memory Leaks
- Cause: Unreleased memory allocations
- Impact: Reduced available memory over time
- Solution: Release every allocation once it is no longer needed, and use leak-detection tools to find allocations that are never freed
2. Cache Thrashing
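- Cause: Cache lines are evicted before they are reused, for example because the working set exceeds the cache size or several addresses keep mapping to the same cache line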
- Impact: Reduced performance
- Solution: Optimize data access patterns
3. Bus Contention
- Cause: Multiple components competing for bus access
- Impact: Reduced throughput
- Solution: Proper bus arbitration and scheduling
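
Cache thrashing (item 2) can be demonstrated with a small simulation of a direct-mapped cache. The cache geometry (128 blocks of 64 bytes) and the two access patterns are assumptions chosen for illustration; the model tracks only tags, not data.

```python
class DirectMappedCache:
    """Minimal direct-mapped cache model that counts hits and misses."""

    def __init__(self, num_blocks: int = 128, block_size: int = 64):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.tags = [None] * num_blocks   # one stored tag per cache block
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> None:
        block_number = address // self.block_size
        index = block_number % self.num_blocks
        tag = block_number // self.num_blocks
        if self.tags[index] == tag:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[index] = tag        # evict the previous block

def hit_rate(addresses) -> float:
    cache = DirectMappedCache()
    for address in addresses:
        cache.access(address)
    return cache.hits / (cache.hits + cache.misses)

# Sequential 4-byte accesses over 4 KiB: good spatial locality.
sequential = list(range(0, 4096, 4))
# Two addresses exactly one cache size (8 KiB) apart share the same index
# and evict each other on every access.
conflicting = [0, 8192] * 512

sequential_rate = hit_rate(sequential)
conflicting_rate = hit_rate(conflicting)

print(f"sequential hit rate:  {sequential_rate:.4f}")
print(f"conflicting hit rate: {conflicting_rate:.4f}")
```

Both patterns perform 1024 accesses, but the alternating pattern never hits because each access evicts the block the next one needs.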
## Future Trends
1. Quantum Computing
- Difference: Uses quantum bits (qubits) instead of classical bits
- Potential: Exponential speedup for certain problems
2. Neuromorphic Computing
- Concept: Brain-inspired architecture
- Application: AI and machine learning
3. 3D Stacking
- Concept: Stack components vertically
- Benefit: Reduced interconnect distance, higher density