Computer Architecture Fundamentals: CPU, Bus, Memory & Addressing
This article explains the fundamentals of computer architecture: how the CPU, buses, memory, and addressing schemes work together.
In a Nutshell
Computer architecture describes how a computer’s components are organized and interconnected to execute instructions and process data.
CPU (Central Processing Unit)
CPU Components
1. Control Unit (CU)
- Function: Directs the flow of data between CPU, memory, and I/O devices
- Responsibilities: Instruction decoding, timing, and control signals
- Components: Instruction decoder, control logic, timing circuits
2. Arithmetic Logic Unit (ALU)
- Function: Performs arithmetic and logical operations
- Operations: Addition, subtraction, multiplication, division, AND, OR, NOT, XOR
- Registers: Temporary storage for operands and results
3. Registers
- General Purpose Registers: Store data and intermediate results
- Special Purpose Registers: Program counter, instruction register, status register
CPU Architecture Types
Von Neumann Architecture
┌─────────────────────────────────────┐
│                 CPU                 │
│   ┌─────────┐       ┌─────────┐     │
│   │   CU    │       │   ALU   │     │
│   └────┬────┘       └────┬────┘     │
│        └────────┬────────┘          │
└─────────────────┼───────────────────┘
                  │
            ┌─────┴─────┐
            │  Memory   │
            │ (Unified) │
            └───────────┘
Harvard Architecture
┌─────────────────────────────────────┐
│                 CPU                 │
│   ┌─────────┐       ┌─────────┐     │
│   │   CU    │       │   ALU   │     │
│   └────┬────┘       └────┬────┘     │
│        │                 │          │
│  ┌─────┴─────┐     ┌─────┴─────┐    │
│  │Instr. Mem │     │ Data Mem  │    │
│  └───────────┘     └───────────┘    │
└─────────────────────────────────────┘
Bus Systems
Bus Types
1. Address Bus
- Purpose: Carries memory addresses
- Direction: Unidirectional (CPU to memory/I/O)
- Width: Determines maximum addressable memory
2. Data Bus
- Purpose: Carries actual data
- Direction: Bidirectional
- Width: Determines data transfer rate
3. Control Bus
- Purpose: Carries control signals
- Direction: Bidirectional
- Signals: Read/Write, Interrupt, Clock, Reset
Bus Architecture Example
CPU
 │
 ├─── Address Bus (32 bits)
 ├─── Data Bus (64 bits)
 └─── Control Bus (16 bits)
       │
       ├─── Memory
       ├─── I/O Devices
       └─── Secondary Storage
Memory Systems
Memory Hierarchy
Level 1: CPU Registers
↓ (Fastest, Smallest)
Level 2: Cache Memory (L1, L2, L3)
↓
Level 3: Main Memory (RAM)
↓
Level 4: Secondary Storage (SSD/HDD)
↓ (Slowest, Largest)
Level 5: Tertiary Storage (Tape, Cloud)
Memory Types
RAM (Random Access Memory)
- SRAM (Static RAM): Fast, expensive, used for cache
- DRAM (Dynamic RAM): Slower, cheaper, used for main memory
ROM (Read-Only Memory)
- Mask ROM: Factory-programmed, unchangeable
- PROM: Programmable once
- EPROM: Erasable with UV light
- EEPROM: Electrically erasable
- Flash Memory: Modern EEPROM variant
Memory Addressing
Addressing Modes
1. Immediate Addressing
MOV AX, 25 ; Load immediate value 25
2. Direct Addressing
MOV AX, [1000] ; Load from memory address 1000
3. Indirect Addressing
MOV AX, [BX] ; Load from address stored in BX
4. Register Addressing
MOV AX, BX ; Copy from register BX to AX
Memory Layout
┌─────────────────────────────────────┐
│           Stack Segment             │ ← High addresses
├─────────────────────────────────────┤
│           Data Segment              │
├─────────────────────────────────────┤
│           Code Segment              │
└─────────────────────────────────────┘ ← Low addresses
Instruction Execution Cycle
Fetch-Decode-Execute Cycle
1. FETCH
└── Load instruction from memory
2. DECODE
└── Interpret instruction
3. EXECUTE
└── Perform operation
4. WRITEBACK
└── Store result (if needed)
Example: Adding Two Numbers
Step 1: FETCH
- PC contains address of ADD instruction
- Load instruction into IR
Step 2: DECODE
- Identify ADD operation
- Determine operand locations
Step 3: EXECUTE
- Load operands from memory/registers
- Perform addition in ALU
Step 4: WRITEBACK
- Store result in destination
- Update PC to next instruction
Performance Metrics
CPU Performance Factors
1. Clock Speed
- Measurement: Gigahertz (GHz)
- Impact: Higher clock speed generally means faster execution (for the same IPC)
- Limitation: Heat generation and power consumption
2. Instructions Per Cycle (IPC)
- Definition: Number of instructions executed per clock cycle
- Modern CPUs: Can execute multiple instructions per cycle
3. Cache Hit Rate
- Definition: Percentage of memory accesses found in cache
- Impact: Higher hit rate = better performance
Memory Performance
Access Time Comparison
CPU Registers: 0-1 ns
L1 Cache: 1-4 ns
L2 Cache: 10-20 ns
L3 Cache: 20-100 ns
Main Memory (RAM): 50-100 ns
SSD: 50-150 μs
HDD: 5-20 ms
Modern CPU Features
1. Pipelining
- Concept: Overlap instruction execution
- Stages: Fetch, Decode, Execute, Memory, Writeback
- Benefit: Higher instruction throughput
2. Superscalar Architecture
- Concept: Multiple execution units
- Benefit: Execute multiple instructions simultaneously
3. Out-of-Order Execution
- Concept: Execute instructions in different order than written
- Benefit: Better utilization of execution units
4. Speculative Execution
- Concept: Execute instructions before knowing if they’re needed
- Benefit: Higher performance, but security implications
Practical Examples
Memory Address Calculation
Given:
- Base address: 0x1000
- Element index: 0x20 (32 decimal)
- Element size: 4 bytes
Element address = Base + (Index × Element size)
                = 0x1000 + (0x20 × 4)
                = 0x1000 + 0x80
                = 0x1080
Cache Mapping Example
Direct-Mapped Cache:
- Cache size: 8KB
- Block size: 64 bytes
- Number of blocks: 8KB / 64B = 128 blocks
Address mapping (assuming 32-bit addresses):
Address bits: [Tag][Index][Offset]
Tag: 19 bits (32 − 7 − 6), Index: 7 bits, Offset: 6 bits
Best Practices
For Programmers
- Locality of reference: Access memory sequentially when possible
- Cache optimization: Structure data for cache efficiency
- Memory alignment: Align data on natural boundaries
- Minimize memory access: Use registers and cache effectively
For System Designers
- Balance components: Ensure no single bottleneck
- Consider memory hierarchy: Optimize for common access patterns
- Plan for scalability: Design for future expansion
- Monitor performance: Track and optimize bottlenecks
Common Issues
1. Memory Leaks
- Cause: Unreleased memory allocations
- Impact: Reduced available memory over time
- Solution: Proper memory management
2. Cache Thrashing
- Cause: Poor cache utilization
- Impact: Reduced performance
- Solution: Optimize data access patterns
3. Bus Contention
- Cause: Multiple components competing for bus access
- Impact: Reduced throughput
- Solution: Proper bus arbitration and scheduling
Future Trends
1. Quantum Computing
- Difference: Uses quantum bits (qubits) instead of classical bits
- Potential: Exponential speedup for certain problems
2. Neuromorphic Computing
- Concept: Brain-inspired architecture
- Application: AI and machine learning
3. 3D Stacking
- Concept: Stack components vertically
- Benefit: Reduced interconnect distance, higher density