Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext & Clustered

This article is a comprehensive guide to database indexes and performance optimization – including B-Tree, Hash, Fulltext, and Clustered indexes with practical examples.

In a Nutshell

Database indexes significantly accelerate queries by enabling fast access to data. Different index types are optimized for different use cases.

Compact Technical Description

Database indexes are specialized data structures that improve the speed of data retrieval operations in database tables. They work similarly to an index in a book.

Index types and their characteristics:

B-Tree Index (Balanced Tree)

Structure: Balanced tree with sorted values
Usage: Equality, range search, sorting
Performance: O(log n) for search operations
Examples: Primary keys, foreign keys, standard indexes

Hash Index

Structure: Hash table with direct addresses
Usage: Equality searches only (=)
Performance: O(1) for exact matches
Examples: Memory tables, exact searches

Fulltext Index

Structure: Inverted index for text search
Usage: Full-text search, keyword search
Performance: Optimized text search algorithms
Examples: Document search, content search

Clustered Index

Structure: Physical sorting of the table
Usage: Primary key, frequent range searches
Performance: Fast for primary key access
Examples: Time series, log data

Exam-Relevant Key Points

B-Tree Index: Balanced tree for equality and range search
Hash Index: Direct addressing for exact matches
Fulltext Index: Text search with word forms and relevance
Clustered Index: Physical data storage by index
Performance: Query optimization through indexes
Trade-offs: Storage space vs. speed
IHK-relevant: Important for database administration and optimization

Core Components

Index Structure: Tree, Hash, Inverted Index
Index Type: B-Tree, Hash, Fulltext, Clustered
Query Type: Equality, Range, Fulltext
Performance Metrics: Read time, Write time, Storage
Index Strategy: Single-Column, Multi-Column, Covering
Optimization: EXPLAIN analysis, Index Tuning
Maintenance: Rebuild, Fragmentation, Statistics

Practical Examples

1. B-Tree Index Examples

-- MySQL B-Tree Index erstellen
CREATE INDEX idx_kunden_name ON kunden(name);
CREATE INDEX idx_bestellungen_datum ON bestellungen(bestelldatum);

-- Multi-Column B-Tree Index
CREATE INDEX idx_kunden_stadt_name ON kunden(stadt, name);

-- Unique B-Tree Index
CREATE UNIQUE INDEX idx_email_unique ON kunden(email);

-- Abfrage mit B-Tree Index Nutzung
EXPLAIN SELECT * FROM kunden WHERE name = 'Mustermann';
EXPLAIN SELECT * FROM bestellungen WHERE bestelldatum BETWEEN '2024-01-01' AND '2024-12-31';

-- Performance-Vergleich
-- Ohne Index: Full Table Scan
SELECT * FROM grosse_tabelle WHERE spalte_x = 'wert';

-- Mit B-Tree Index: Index Seek
SELECT * FROM grosse_tabelle WHERE spalte_x = 'wert';

2. Hash Index Examples

-- MySQL Hash Index (nur Memory Engine)
CREATE TABLE user_sessions (
    session_id VARCHAR(255) PRIMARY KEY,
    user_id INT,
    created_at TIMESTAMP,
    data TEXT,
    INDEX ((session_id)) USING HASH
) ENGINE=MEMORY;

-- PostgreSQL Hash Index
CREATE INDEX idx_hash_email ON benutzer USING HASH (email);

-- Abfrage mit Hash Index (nur Gleichheit)
SELECT * FROM benutzer WHERE email = 'user@example.com';

-- Hash Index wird NICHT verwendet für:
SELECT * FROM benutzer WHERE email LIKE 'user%';  -- Bereichssuche
SELECT * FROM benutzer WHERE email > 'a';        -- Vergleichsoperation

3. Fulltext Index Examples

-- MySQL Fulltext Index
CREATE TABLE artikel (
    id INT PRIMARY KEY,
    titel VARCHAR(255),
    inhalt TEXT,
    FULLTEXT KEY ft_inhalt (titel, inhalt)
);

-- Fulltext Suche
SELECT titel, inhalt 
FROM artikel 
WHERE MATCH(titel, inhalt) AGAINST('datenbank performance' IN NATURAL LANGUAGE MODE);

-- Mit Relevanz-Score
SELECT titel, 
       MATCH(titel, inhalt) AGAINST('datenbank performance' IN NATURAL LANGUAGE MODE) AS score
FROM artikel 
WHERE MATCH(titel, inhalt) AGAINST('datenbank performance' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;

-- Boolean Mode für komplexe Suchen
SELECT titel, inhalt
FROM artikel 
WHERE MATCH(titel, inhalt) AGAINST('+datenbank +performance -mysql' IN BOOLEAN MODE);

-- PostgreSQL Fulltext Index
CREATE TABLE dokumente (
    id SERIAL PRIMARY KEY,
    titel VARCHAR(255),
    inhalt TEXT
);

-- Fulltext Index mit tsvector
ALTER TABLE dokumente ADD COLUMN searchable_text tsvector;
UPDATE dokumente SET searchable_text = to_tsvector('german', titel || ' ' || inhalt);
CREATE INDEX idx_fulltext ON dokumente USING GIN(searchable_text);

-- Fulltext Suche in PostgreSQL
SELECT titel, ts_rank(searchable_text, plainto_tsquery('german', 'datenbank performance')) AS rank
FROM dokumente 
WHERE searchable_text @@ plainto_tsquery('german', 'datenbank performance')
ORDER BY rank DESC;

4. Clustered Index Examples

-- SQL Server Clustered Index
CREATE TABLE log_daten (
    log_id INT IDENTITY(1,1),
    zeitstempel DATETIME NOT NULL,
    nachricht VARCHAR(1000),
    level VARCHAR(10)
);

-- Clustered Index auf Zeitstempel (physische Sortierung)
CREATE CLUSTERED INDEX idx_log_zeitstempel ON log_daten(zeitstempel);

-- PostgreSQL Clustered Index (CLUSTER)
CREATE TABLE messwerte (
    id SERIAL PRIMARY KEY,
    sensor_id INT,
    zeitpunkt TIMESTAMP NOT NULL,
    wert DECIMAL(10,4)
);

-- Clustered Index auf Primärschlüssel (Standard)
CREATE INDEX idx_messwerte_zeitpunkt ON messwerte(zeitpunkt);

-- Tabelle nach Index clustern
CLUSTER messwerte USING idx_messwerte_zeitpunkt;

-- MySQL InnoDB Clustered Index (automatisch auf Primärschlüssel)
CREATE TABLE transaktionen (
    id BIGINT PRIMARY KEY,  -- Automatisch clustered
    konto_id INT,
    betrag DECIMAL(12,2),
    datum DATETIME,
    INDEX idx_konto_datum (konto_id, datum)  -- Secondary Index
);

5. Performance Analysis with EXPLAIN

-- MySQL EXPLAIN Analyse
EXPLAIN FORMAT=JSON
SELECT k.name, b.bestelldatum, b.gesamtbetrag
FROM kunden k
JOIN bestellungen b ON k.id = b.kunden_id
WHERE k.stadt = 'Berlin'
  AND b.bestelldatum >= '2024-01-01'
ORDER BY b.gesamtbetrag DESC;

-- PostgreSQL EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT k.name, b.bestelldatum, b.gesamtbetrag
FROM kunden k
JOIN bestellungen b ON k.id = b.kunden_id
WHERE k.stadt = 'Berlin'
  AND b.bestelldatum >= '2024-01-01'
ORDER BY b.gesamtbetrag DESC;

-- Index-Nutzung überprüfen
-- MySQL: SHOW INDEX FROM tabelle;
SHOW INDEX FROM kunden;

-- PostgreSQL: \d tabelle
-- oder pg_indexes view
SELECT indexname, indexdef 
FROM pg_indexes 
WHERE tablename = 'kunden';

6. Index Optimization and Best Practices

-- Covering Index (alle benötigten Spalten im Index)
CREATE INDEX idx_kunden_covering ON kunden(stadt, name, id);

-- Abfrage nutzt nur den Index (Index-Only Scan)
SELECT id, name FROM kunden WHERE stadt = 'Berlin';

-- Partial Index (nur für Teil der Daten)
CREATE INDEX idx_aktive_kunden ON kunden(id) WHERE status = 'aktiv';

-- Function-Based Index (PostgreSQL)
CREATE INDEX idx_kunden_name_lower ON kunden(LOWER(name));

-- Abfrage nutzt Function-Based Index
SELECT * FROM kunden WHERE LOWER(name) = 'mustermann';

-- Composite Index mit richtiger Spaltenreihenfolge
CREATE INDEX idx_bestellungen_optimal ON bestellungen(kunden_id, bestelldatum, status);

-- Gute Abfrage (nutzt Index vollständig)
SELECT * FROM bestellungen 
WHERE kunden_id = 123 
  AND bestelldatum >= '2024-01-01' 
  AND status = 'completed';

-- Schlechte Abfrage (kann Index nicht nutzen)
SELECT * FROM bestellungen 
WHERE status = 'completed' 
  AND bestelldatum >= '2024-01-01';  -- Führende Spalte nicht in WHERE

7. Index Maintenance and Monitoring

-- Update index statistics
-- MySQL
ANALYZE TABLE kunden;

-- PostgreSQL
ANALYZE kunden;

-- Check index fragmentation
-- SQL Server
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('kunden'), NULL, NULL, 'DETAILED');

-- PostgreSQL
SELECT schemaname, tablename, attname, n_distinct, correlation 
FROM pg_stats 
WHERE tablename = 'kunden';

-- Index rebuild for fragmentation
-- SQL Server
ALTER INDEX ALL ON kunden REBUILD;

-- MySQL (InnoDB)
ANALYZE TABLE kunden;  -- Update statistics
OPTIMIZE TABLE kunden;  -- Optimize table

-- PostgreSQL
REINDEX INDEX idx_kunden_name;

Performance Comparison Index Types

Index Type	Search Type	Performance	Storage	Usage
B-Tree	Equality, Range	O(log n)	Medium	Standard Indexes
Hash	Equality Only	O(1)	Low	Memory Tables
Fulltext	Full Text	Variable	High	Text Search
Clustered	Primary Key	O(log n)	Table	Frequent Access

Index Strategies

Single-Column Index

-- Simple index on one column
CREATE INDEX idx_kunden_email ON kunden(email);

Multi-Column Index

-- Composite index with optimal column order
CREATE INDEX idx_kunden_stadt_name ON kunden(stadt, name);

Covering Index

-- Index contains all required columns
CREATE INDEX idx_bestellungen_covering ON bestellungen(kunden_id, bestelldatum, gesamtbetrag);

Partial Index

-- Index only for part of the data
CREATE INDEX idx_aktive_kunden ON kunden(id) WHERE status = 'aktiv';

Function-Based Index

-- Index on calculated values
CREATE INDEX idx_kunden_name_lower ON kunden(LOWER(name));

Query Optimization

Maximize Index Usage

-- GOOD Queries (use indexes)
SELECT * FROM kunden WHERE stadt = 'Berlin';                    -- Single-Column Index
SELECT * FROM bestellungen WHERE kunden_id = 123 AND datum > '2024-01-01'; -- Composite Index
SELECT * FROM artikel WHERE MATCH(titel) AGAINST('suchbegriff');    -- Fulltext Index

-- BAD Queries (don't use indexes)
SELECT * FROM kunden WHERE LOWER(name) = 'mustermann';           -- No Function-Based Index
SELECT * FROM bestellungen WHERE YEAR(datum) = 2024;            -- Function on column
SELECT * FROM kunden WHERE name LIKE '%mustermann%';             -- Leading Wildcard

EXPLAIN Analysis

-- MySQL
EXPLAIN SELECT * FROM kunden WHERE stadt = 'Berlin';

-- Important columns:
-- type: ALL (bad), ref, range, index (good)
-- key: Used index
-- rows: Estimated row count
-- Extra: Using index (good), Using filesort (bad)

-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM kunden WHERE stadt = 'Berlin';

-- Important information:
-- Seq Scan vs Index Scan
-- Index-Only Scan (very good)
-- Actual execution time

Advantages and Disadvantages

Advantages of Indexes

Performance: Dramatically faster queries
Sorting: ORDER BY without additional sorting
Uniqueness: UNIQUE indexes guarantee data integrity
Joins: Foreign key indexes accelerate JOINs

Disadvantages

Storage Space: Indexes require additional storage
Write Performance: INSERT/UPDATE/DELETE become slower
Maintenance: Indexes must be maintained
Overhead: Too many indexes can be counterproductive

Best Practices

When to Create Indexes

Frequent WHERE clauses
JOIN conditions
ORDER BY clauses
GROUP BY clauses
UNIQUE constraints

When to Avoid Indexes

Rarely used columns
Tables with few rows
Columns with high cardinality and low selectivity
Frequent UPDATE/DELETE operations

Index Design

Column Order: Selectivity descending
Index Width: As narrow as possible
Covering Index: Include all required columns
Partial Index: Only for relevant data

Frequently Asked Questions

What is the difference between B-Tree and Hash Index? B-Tree supports equality and range search, Hash only equality search.
When do you use a Fulltext Index? For text search in large text fields with stemming and relevance scoring.
Explain Clustered vs Non-Clustered Index! Clustered: Physical data storage, Non-Clustered: Separate index structure.
How do you check index usage? With EXPLAIN/EXPLAIN ANALYZE to analyze the query plan.

Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext

Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext & Clustered

In a Nutshell

Compact Technical Description

B-Tree Index (Balanced Tree)

Hash Index

Fulltext Index

Clustered Index

Exam-Relevant Key Points

Core Components

Practical Examples

1. B-Tree Index Examples

2. Hash Index Examples

3. Fulltext Index Examples

4. Clustered Index Examples

5. Performance Analysis with EXPLAIN

6. Index Optimization and Best Practices

7. Index Maintenance and Monitoring

Performance Comparison Index Types

Index Strategies

Single-Column Index

Multi-Column Index

Covering Index

Partial Index

Function-Based Index

Query Optimization

Maximize Index Usage

EXPLAIN Analysis

Advantages and Disadvantages

Advantages of Indexes

Disadvantages

Best Practices

When to Create Indexes

When to Avoid Indexes

Index Design

Frequently Asked Questions

Most Important Sources

Recommended Reading: Databases

Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext & Clustered

IRC-Mania Die Seite rund um Sicherheit und Programmierung

In a Nutshell

Compact Technical Description

B-Tree Index (Balanced Tree)

Hash Index

Fulltext Index

Clustered Index

Exam-Relevant Key Points

Core Components

Practical Examples

1. B-Tree Index Examples

2. Hash Index Examples

3. Fulltext Index Examples

4. Clustered Index Examples

5. Performance Analysis with EXPLAIN

6. Index Optimization and Best Practices

7. Index Maintenance and Monitoring

IRC-Security.de

Performance Comparison Index Types

Index Strategies

Single-Column Index

Multi-Column Index

Covering Index

Partial Index

Function-Based Index

Query Optimization

Maximize Index Usage

EXPLAIN Analysis

Advantages and Disadvantages

Advantages of Indexes

Disadvantages

Best Practices

When to Create Indexes

When to Avoid Indexes

Index Design

Frequently Asked Questions

Most Important Sources

Recommended Reading: Databases

Premium VPS Hosting