# Big O Notation: Runtime Complexity & Algorithm Efficiency
This article explains Big O notation and how to use it to analyze algorithm complexity.
## In a Nutshell
Big O notation describes how an algorithm’s runtime or memory usage grows as the input size increases, focusing on the worst-case scenario.
## What is Big O Notation?
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. In computer science, it’s used to classify algorithms according to how their runtime or space requirements grow as the input size grows.
## Why Big O Matters
- Performance Prediction: Estimate how algorithms scale
- Algorithm Selection: Choose the most efficient algorithm
- System Design: Plan for expected load
- Optimization: Identify performance bottlenecks
## Common Big O Complexities
### O(1) - Constant Time
Runtime doesn’t depend on input size.

```python
def get_first_element(arr):
    return arr[0]  # Always one operation

def hash_table_lookup(table, key):
    return table[key]  # Average case: one operation
```

Examples: Array access, hash table lookup, stack push/pop
### O(log n) - Logarithmic Time
Runtime grows logarithmically with input size.

```python
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
```

Examples: Binary search, balanced tree operations, heap operations
### O(n) - Linear Time
Runtime grows linearly with input size.

```python
def linear_search(arr, target):
    for element in arr:
        if element == target:
            return True
    return False

def find_max(arr):
    max_val = arr[0]
    for element in arr:
        if element > max_val:
            max_val = element
    return max_val
```

Examples: Linear search, finding maximum/minimum, simple traversals
### O(n log n) - Linearithmic Time
Combination of linear and logarithmic growth.

```python
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
```

Examples: Merge sort, heap sort, quick sort (average case)
### O(n²) - Quadratic Time
Runtime grows quadratically with input size.

```python
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

def find_duplicates(arr):
    duplicates = []
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j]:
                duplicates.append(arr[i])
                break
    return duplicates
```

Examples: Bubble sort, selection sort, nested loops
### O(2ⁿ) - Exponential Time
Runtime doubles with each additional input element.

```python
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

def power_set(arr):
    if not arr:
        return [[]]
    first = arr[0]
    rest = power_set(arr[1:])
    with_first = [[first] + subset for subset in rest]
    return rest + with_first
```

Examples: Recursive Fibonacci, generating all subsets, brute force problems
## Big O Complexity Comparison
| Input Size | O(1) | O(log n) | O(n) | O(n log n) | O(n²) | O(2ⁿ) |
|---|---|---|---|---|---|---|
| 10 | 1 | 3 | 10 | 33 | 100 | 1,024 |
| 100 | 1 | 7 | 100 | 664 | 10,000 | 1.27×10³⁰ |
| 1,000 | 1 | 10 | 1,000 | 9,966 | 1,000,000 | 1.07×10³⁰¹ |
| 10,000 | 1 | 13 | 10,000 | 132,877 | 100,000,000 | ≈2×10³⁰¹⁰ |
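The table’s values follow directly from the formulas; a small sketch (using `round`, which matches the table’s rounding) reproduces them:

```python
import math

def growth_counts(n):
    """Approximate operation counts for input size n, as in the table above."""
    return {
        "O(1)": 1,
        "O(log n)": round(math.log2(n)),
        "O(n)": n,
        "O(n log n)": round(n * math.log2(n)),
        "O(n^2)": n ** 2,
        "O(2^n)": 2 ** n,  # astronomically large even for modest n
    }

print(growth_counts(100))
```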
## Space Complexity
Space complexity measures how an algorithm’s memory usage grows with input size, independently of its runtime.
### O(1) - Constant Space

```python
def find_max_constant_space(arr):
    max_val = arr[0]
    for element in arr:
        if element > max_val:
            max_val = element
    return max_val  # Only uses one extra variable
```
### O(n) - Linear Space

```python
def reverse_array(arr):
    reversed_arr = []
    for element in arr:
        reversed_arr.insert(0, element)  # Note: insert(0, ...) also makes this O(n²) in time
    return reversed_arr  # Uses O(n) extra space
```
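By contrast, reversing in place with two pointers needs only O(1) extra space; a minimal sketch (it mutates its argument):

```python
def reverse_in_place(arr):
    """Reverse arr in place with two pointers: O(n) time, O(1) extra space."""
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]  # swap the ends
        left += 1
        right -= 1
    return arr

print(reverse_in_place([1, 2, 3, 4]))  # [4, 3, 2, 1]
```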
### O(n²) - Quadratic Space

```python
def create_matrix(n):
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            row.append(i * j)
        matrix.append(row)
    return matrix  # Uses O(n²) space
```
## Best, Average, and Worst Case
### Linear Search Analysis
```python
def linear_search_analysis(arr, target):
    comparisons = 0
    for element in arr:
        comparisons += 1
        if element == target:
            return comparisons  # Best case: 1, Worst case: n
    return comparisons  # Target not found: n
```
- Best Case: O(1) - Target is first element
- Average Case: O(n) - About n/2 comparisons on average, which is still O(n)
- Worst Case: O(n) - Target is last element or not found
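These three cases can be checked empirically by counting comparisons; a small self-contained sketch:

```python
def count_comparisons(arr, target):
    """Return how many elements a linear search inspects before stopping."""
    comparisons = 0
    for element in arr:
        comparisons += 1
        if element == target:
            break
    return comparisons

data = list(range(1, 101))          # 100 elements: 1..100
print(count_comparisons(data, 1))   # best case: 1
print(count_comparisons(data, 50))  # middle: 50 (about n/2)
print(count_comparisons(data, -1))  # worst case, target absent: 100
```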
### Quick Sort Analysis

```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
```
- Best Case: O(n log n) - Pivot always divides array evenly
- Average Case: O(n log n) - Random pivot selection
- Worst Case: O(n²) - Pivot is always smallest or largest
## Analyzing Algorithms
### Steps for Big O Analysis
1. Identify the basic operation that dominates runtime
2. Express the runtime as a function of the input size n
3. Drop constant factors and lower-order terms
4. Use Big O notation to describe the growth rate
### Example: Sum of Array Elements

```python
def sum_array(arr):
    total = 0             # O(1)
    for element in arr:   # Runs n times
        total += element  # O(1)
    return total          # O(1)
```
Analysis:
- Basic operation: addition (O(1))
- Runs n times: n × O(1) = O(n)
- Drop constants: O(n)
### Example: Nested Loops

```python
def print_pairs(arr):
    for i in range(len(arr)):      # n iterations
        for j in range(len(arr)):  # n iterations
            print(arr[i], arr[j])  # O(1)
```
Analysis:
- Outer loop: n iterations
- Inner loop: n iterations for each outer iteration
- Total operations: n × n = n²
- Complexity: O(n²)
## Optimization Strategies
### 1. Algorithm Selection
Choose the right algorithm for the problem:
```python
# O(n²) - Bubble sort (bad for large arrays)
def bubble_sort(arr):
    ...  # implementation omitted

# O(n log n) - Merge sort (better for large arrays)
def merge_sort(arr):
    ...  # implementation omitted

# O(n log n) - Quick sort (often fastest in practice)
def quick_sort(arr):
    ...  # implementation omitted
```
### 2. Data Structure Choice
Use appropriate data structures:
```python
# O(n) - Linear search in a list
def linear_search_list(arr, target):
    return target in arr

# O(1) average - Hash table lookup
def hash_table_lookup(table, target):
    return target in table
```
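When the same collection is searched repeatedly, converting it to a set once pays off: one O(n) build, then O(1)-average lookups. A sketch (the function names are illustrative):

```python
def contains_all_list(haystack, needles):
    """O(len(needles) × len(haystack)): every 'in' scans the whole list."""
    return all(n in haystack for n in needles)

def contains_all_set(haystack, needles):
    """O(len(haystack) + len(needles)) on average: build the set once."""
    lookup = set(haystack)
    return all(n in lookup for n in needles)

data = list(range(10_000))
print(contains_all_set(data, [0, 42, 9_999]))   # True
print(contains_all_set(data, [0, 42, 10_000]))  # False
```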
### 3. Memoization
Cache results to avoid recomputation:
```python
# O(2ⁿ) - Inefficient Fibonacci
def fibonacci_slow(n):
    if n <= 1:
        return n
    return fibonacci_slow(n - 1) + fibonacci_slow(n - 2)

# O(n) - Memoized Fibonacci
memo = {}

def fibonacci_fast(n):
    if n in memo:
        return memo[n]
    if n <= 1:
        return n
    memo[n] = fibonacci_fast(n - 1) + fibonacci_fast(n - 2)
    return memo[n]
```
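The standard library offers the same idea ready-made: `functools.lru_cache` memoizes a function without a hand-rolled dictionary:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    """O(n): each value of n is computed once, then served from the cache."""
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(50))  # 12586269025 - the naive O(2ⁿ) version would take hours
```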
## Common Pitfalls
### 1. Ignoring Hidden Complexity
```python
# Looks like O(n) but actually O(n²)
def remove_duplicates(arr):
    result = []
    for element in arr:            # n iterations
        if element not in result:  # O(n) operation!
            result.append(element)
    return result
```
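The fix is to track seen elements in a set, which keeps the output order and brings the whole pass down to O(n) on average:

```python
def remove_duplicates_fast(arr):
    """O(n) average: set membership is O(1), unlike 'not in' on a list."""
    seen = set()
    result = []
    for element in arr:
        if element not in seen:
            seen.add(element)
            result.append(element)
    return result

print(remove_duplicates_fast([3, 1, 3, 2, 1]))  # [3, 1, 2]
```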
### 2. Confusing Best and Worst Case
Always analyze the worst-case scenario unless specified otherwise.
### 3. Forgetting Space Complexity
Consider both time and space complexity in analysis.
## Practical Guidelines
### When to Optimize
- Profile first: Measure actual performance
- Identify bottlenecks: Focus on code that actually matters
- Consider constraints: Input size, memory limits, real-time requirements
### Optimization Rules of Thumb
- O(n²) → O(n log n): Use better sorting algorithms
- O(n) → O(log n): Use binary search on sorted data (a hash table goes further, to O(1) average)
- O(2ⁿ) → O(n): Use dynamic programming or memoization
- Multiple O(n) → O(n): Combine linear passes
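As one concrete instance of the O(n) → O(log n) rule, the standard library’s `bisect` module performs binary search on any sorted list:

```python
import bisect

def binary_contains(sorted_arr, target):
    """O(log n) membership test - requires sorted_arr to be sorted."""
    i = bisect.bisect_left(sorted_arr, target)
    return i < len(sorted_arr) and sorted_arr[i] == target

data = sorted([7, 3, 9, 1, 5])   # binary search needs sorted input
print(binary_contains(data, 5))  # True
print(binary_contains(data, 4))  # False
```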
## Real-World Examples
### Database Indexing
- Without index: O(n) - Full table scan
- With index: O(log n) - B-tree lookup
### Web Server Scaling
- Single thread: roughly O(1) work per request, but requests queue behind one another
- Multiple threads: throughput scales roughly linearly (O(n)) with the number of workers
- Distributed system: hierarchical routing (e.g. through load balancers) keeps per-request routing cost roughly O(log n)
### Caching Strategies
- Cache miss: O(n) - Fetch from database
- Cache hit: O(1) - Fetch from memory
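A minimal cache-aside sketch, where `fetch_from_db` stands in for any slow backing store (both names are illustrative):

```python
cache = {}

def get_user(user_id, fetch_from_db):
    """Cache-aside: O(1) on a hit, slow fetch only on a miss."""
    if user_id in cache:            # cache hit: O(1) dict lookup
        return cache[user_id]
    value = fetch_from_db(user_id)  # cache miss: go to the database
    cache[user_id] = value
    return value

calls = []
def fake_db(uid):
    calls.append(uid)
    return {"id": uid, "name": f"user{uid}"}

get_user(1, fake_db)
get_user(1, fake_db)  # second call is served from the cache
print(len(calls))     # 1 - the backing store was hit only once
```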
## Testing Big O
### Empirical Testing
```python
import time
import random

def measure_runtime(func, sizes):
    for n in sizes:
        data = list(range(n))
        random.shuffle(data)
        start = time.perf_counter()  # perf_counter has higher resolution than time.time()
        func(data)
        elapsed = time.perf_counter() - start
        print(f"n={n}, time={elapsed:.6f}s")

# Test sorting algorithms
measure_runtime(bubble_sort, [100, 1000, 5000])
measure_runtime(merge_sort, [100, 1000, 5000])
```
### Expected Results
- O(n²): Time increases quadratically
- O(n log n): Time increases faster than linear but slower than quadratic
- O(n): Time increases linearly