Skip to content
IRC-Coding IRC-Coding
Database Indexes Performance Optimization B-Tree Hash Fulltext Index Clustered Index

Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext

Master database indexes for performance optimization. B-Tree, Hash, Fulltext, and Clustered indexes with practical examples for MySQL and PostgreSQL.

S

schutzgeist

2 min read
Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext

Database Indexes & Performance Optimization: B-Tree, Hash, Fulltext & Clustered

This article is a comprehensive guide to database indexes and performance optimization – including B-Tree, Hash, Fulltext, and Clustered indexes with practical examples.

In a Nutshell

Database indexes significantly accelerate queries by enabling fast access to data. Different index types are optimized for different use cases.

Compact Technical Description

Database indexes are specialized data structures that improve the speed of data retrieval operations in database tables. They work similarly to an index in a book.

Index types and their characteristics:

B-Tree Index (Balanced Tree)

  • Structure: Balanced tree with sorted values
  • Usage: Equality, range search, sorting
  • Performance: O(log n) for search operations
  • Examples: Primary keys, foreign keys, standard indexes

Hash Index

  • Structure: Hash table with direct addresses
  • Usage: Equality searches only (=)
  • Performance: O(1) for exact matches
  • Examples: Memory tables, exact searches

Fulltext Index

  • Structure: Inverted index for text search
  • Usage: Full-text search, keyword search
  • Performance: Optimized text search algorithms
  • Examples: Document search, content search

Clustered Index

  • Structure: Physical sorting of the table
  • Usage: Primary key, frequent range searches
  • Performance: Fast for primary key access
  • Examples: Time series, log data

Exam-Relevant Key Points

  • B-Tree Index: Balanced tree for equality and range search
  • Hash Index: Direct addressing for exact matches
  • Fulltext Index: Text search with word forms and relevance
  • Clustered Index: Physical data storage by index
  • Performance: Query optimization through indexes
  • Trade-offs: Storage space vs. speed
  • IHK-relevant: Important for database administration and optimization

Core Components

  1. Index Structure: Tree, Hash, Inverted Index
  2. Index Type: B-Tree, Hash, Fulltext, Clustered
  3. Query Type: Equality, Range, Fulltext
  4. Performance Metrics: Read time, Write time, Storage
  5. Index Strategy: Single-Column, Multi-Column, Covering
  6. Optimization: EXPLAIN analysis, Index Tuning
  7. Maintenance: Rebuild, Fragmentation, Statistics

Practical Examples

1. B-Tree Index Examples

-- MySQL B-Tree Index erstellen
CREATE INDEX idx_kunden_name ON kunden(name);
CREATE INDEX idx_bestellungen_datum ON bestellungen(bestelldatum);

-- Multi-Column B-Tree Index
CREATE INDEX idx_kunden_stadt_name ON kunden(stadt, name);

-- Unique B-Tree Index
CREATE UNIQUE INDEX idx_email_unique ON kunden(email);

-- Abfrage mit B-Tree Index Nutzung
EXPLAIN SELECT * FROM kunden WHERE name = 'Mustermann';
EXPLAIN SELECT * FROM bestellungen WHERE bestelldatum BETWEEN '2024-01-01' AND '2024-12-31';

-- Performance-Vergleich
-- Ohne Index: Full Table Scan
SELECT * FROM grosse_tabelle WHERE spalte_x = 'wert';

-- Mit B-Tree Index: Index Seek
SELECT * FROM grosse_tabelle WHERE spalte_x = 'wert';

2. Hash Index Examples

-- MySQL Hash Index (nur Memory Engine)
CREATE TABLE user_sessions (
    session_id VARCHAR(255) PRIMARY KEY,
    user_id INT,
    created_at TIMESTAMP,
    data TEXT,
    INDEX ((session_id)) USING HASH
) ENGINE=MEMORY;

-- PostgreSQL Hash Index
CREATE INDEX idx_hash_email ON benutzer USING HASH (email);

-- Abfrage mit Hash Index (nur Gleichheit)
SELECT * FROM benutzer WHERE email = 'user@example.com';

-- Hash Index wird NICHT verwendet für:
SELECT * FROM benutzer WHERE email LIKE 'user%';  -- Bereichssuche
SELECT * FROM benutzer WHERE email > 'a';        -- Vergleichsoperation

3. Fulltext Index Examples

-- MySQL Fulltext Index
CREATE TABLE artikel (
    id INT PRIMARY KEY,
    titel VARCHAR(255),
    inhalt TEXT,
    FULLTEXT KEY ft_inhalt (titel, inhalt)
);

-- Fulltext Suche
SELECT titel, inhalt 
FROM artikel 
WHERE MATCH(titel, inhalt) AGAINST('datenbank performance' IN NATURAL LANGUAGE MODE);

-- Mit Relevanz-Score
SELECT titel, 
       MATCH(titel, inhalt) AGAINST('datenbank performance' IN NATURAL LANGUAGE MODE) AS score
FROM artikel 
WHERE MATCH(titel, inhalt) AGAINST('datenbank performance' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;

-- Boolean Mode für komplexe Suchen
SELECT titel, inhalt
FROM artikel 
WHERE MATCH(titel, inhalt) AGAINST('+datenbank +performance -mysql' IN BOOLEAN MODE);

-- PostgreSQL Fulltext Index
CREATE TABLE dokumente (
    id SERIAL PRIMARY KEY,
    titel VARCHAR(255),
    inhalt TEXT
);

-- Fulltext Index mit tsvector
ALTER TABLE dokumente ADD COLUMN searchable_text tsvector;
UPDATE dokumente SET searchable_text = to_tsvector('german', titel || ' ' || inhalt);
CREATE INDEX idx_fulltext ON dokumente USING GIN(searchable_text);

-- Fulltext Suche in PostgreSQL
SELECT titel, ts_rank(searchable_text, plainto_tsquery('german', 'datenbank performance')) AS rank
FROM dokumente 
WHERE searchable_text @@ plainto_tsquery('german', 'datenbank performance')
ORDER BY rank DESC;

4. Clustered Index Examples

-- SQL Server Clustered Index
CREATE TABLE log_daten (
    log_id INT IDENTITY(1,1),
    zeitstempel DATETIME NOT NULL,
    nachricht VARCHAR(1000),
    level VARCHAR(10)
);

-- Clustered Index auf Zeitstempel (physische Sortierung)
CREATE CLUSTERED INDEX idx_log_zeitstempel ON log_daten(zeitstempel);

-- PostgreSQL Clustered Index (CLUSTER)
CREATE TABLE messwerte (
    id SERIAL PRIMARY KEY,
    sensor_id INT,
    zeitpunkt TIMESTAMP NOT NULL,
    wert DECIMAL(10,4)
);

-- Clustered Index auf Primärschlüssel (Standard)
CREATE INDEX idx_messwerte_zeitpunkt ON messwerte(zeitpunkt);

-- Tabelle nach Index clustern
CLUSTER messwerte USING idx_messwerte_zeitpunkt;

-- MySQL InnoDB Clustered Index (automatisch auf Primärschlüssel)
CREATE TABLE transaktionen (
    id BIGINT PRIMARY KEY,  -- Automatisch clustered
    konto_id INT,
    betrag DECIMAL(12,2),
    datum DATETIME,
    INDEX idx_konto_datum (konto_id, datum)  -- Secondary Index
);

5. Performance Analysis with EXPLAIN

-- MySQL EXPLAIN Analyse
EXPLAIN FORMAT=JSON
SELECT k.name, b.bestelldatum, b.gesamtbetrag
FROM kunden k
JOIN bestellungen b ON k.id = b.kunden_id
WHERE k.stadt = 'Berlin'
  AND b.bestelldatum >= '2024-01-01'
ORDER BY b.gesamtbetrag DESC;

-- PostgreSQL EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT k.name, b.bestelldatum, b.gesamtbetrag
FROM kunden k
JOIN bestellungen b ON k.id = b.kunden_id
WHERE k.stadt = 'Berlin'
  AND b.bestelldatum >= '2024-01-01'
ORDER BY b.gesamtbetrag DESC;

-- Index-Nutzung überprüfen
-- MySQL: SHOW INDEX FROM tabelle;
SHOW INDEX FROM kunden;

-- PostgreSQL: \d tabelle
-- oder pg_indexes view
SELECT indexname, indexdef 
FROM pg_indexes 
WHERE tablename = 'kunden';

6. Index Optimization and Best Practices

-- Covering Index (alle benötigten Spalten im Index)
CREATE INDEX idx_kunden_covering ON kunden(stadt, name, id);

-- Abfrage nutzt nur den Index (Index-Only Scan)
SELECT id, name FROM kunden WHERE stadt = 'Berlin';

-- Partial Index (nur für Teil der Daten)
CREATE INDEX idx_aktive_kunden ON kunden(id) WHERE status = 'aktiv';

-- Function-Based Index (PostgreSQL)
CREATE INDEX idx_kunden_name_lower ON kunden(LOWER(name));

-- Abfrage nutzt Function-Based Index
SELECT * FROM kunden WHERE LOWER(name) = 'mustermann';

-- Composite Index mit richtiger Spaltenreihenfolge
CREATE INDEX idx_bestellungen_optimal ON bestellungen(kunden_id, bestelldatum, status);

-- Gute Abfrage (nutzt Index vollständig)
SELECT * FROM bestellungen 
WHERE kunden_id = 123 
  AND bestelldatum >= '2024-01-01' 
  AND status = 'completed';

-- Schlechte Abfrage (kann Index nicht nutzen)
SELECT * FROM bestellungen 
WHERE status = 'completed' 
  AND bestelldatum >= '2024-01-01';  -- Führende Spalte nicht in WHERE

7. Index Maintenance and Monitoring

-- Update index statistics
-- MySQL
ANALYZE TABLE kunden;

-- PostgreSQL
ANALYZE kunden;

-- Check index fragmentation
-- SQL Server
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('kunden'), NULL, NULL, 'DETAILED');

-- PostgreSQL
SELECT schemaname, tablename, attname, n_distinct, correlation 
FROM pg_stats 
WHERE tablename = 'kunden';

-- Index rebuild for fragmentation
-- SQL Server
ALTER INDEX ALL ON kunden REBUILD;

-- MySQL (InnoDB)
ANALYZE TABLE kunden;  -- Update statistics
OPTIMIZE TABLE kunden;  -- Optimize table

-- PostgreSQL
REINDEX INDEX idx_kunden_name;

Performance Comparison Index Types

Index TypeSearch TypePerformanceStorageUsage
B-TreeEquality, RangeO(log n)MediumStandard Indexes
HashEquality OnlyO(1)LowMemory Tables
FulltextFull TextVariableHighText Search
ClusteredPrimary KeyO(log n)TableFrequent Access

Index Strategies

Single-Column Index

-- Simple index on one column
CREATE INDEX idx_kunden_email ON kunden(email);

Multi-Column Index

-- Composite index with optimal column order
CREATE INDEX idx_kunden_stadt_name ON kunden(stadt, name);

Covering Index

-- Index contains all required columns
CREATE INDEX idx_bestellungen_covering ON bestellungen(kunden_id, bestelldatum, gesamtbetrag);

Partial Index

-- Index only for part of the data
CREATE INDEX idx_aktive_kunden ON kunden(id) WHERE status = 'aktiv';

Function-Based Index

-- Index on calculated values
CREATE INDEX idx_kunden_name_lower ON kunden(LOWER(name));

Query Optimization

Maximize Index Usage

-- GOOD Queries (use indexes)
SELECT * FROM kunden WHERE stadt = 'Berlin';                    -- Single-Column Index
SELECT * FROM bestellungen WHERE kunden_id = 123 AND datum > '2024-01-01'; -- Composite Index
SELECT * FROM artikel WHERE MATCH(titel) AGAINST('suchbegriff');    -- Fulltext Index

-- BAD Queries (don't use indexes)
SELECT * FROM kunden WHERE LOWER(name) = 'mustermann';           -- No Function-Based Index
SELECT * FROM bestellungen WHERE YEAR(datum) = 2024;            -- Function on column
SELECT * FROM kunden WHERE name LIKE '%mustermann%';             -- Leading Wildcard

EXPLAIN Analysis

-- MySQL
EXPLAIN SELECT * FROM kunden WHERE stadt = 'Berlin';

-- Important columns:
-- type: ALL (bad), ref, range, index (good)
-- key: Used index
-- rows: Estimated row count
-- Extra: Using index (good), Using filesort (bad)

-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM kunden WHERE stadt = 'Berlin';

-- Important information:
-- Seq Scan vs Index Scan
-- Index-Only Scan (very good)
-- Actual execution time

Advantages and Disadvantages

Advantages of Indexes

  • Performance: Dramatically faster queries
  • Sorting: ORDER BY without additional sorting
  • Uniqueness: UNIQUE indexes guarantee data integrity
  • Joins: Foreign key indexes accelerate JOINs

Disadvantages

  • Storage Space: Indexes require additional storage
  • Write Performance: INSERT/UPDATE/DELETE become slower
  • Maintenance: Indexes must be maintained
  • Overhead: Too many indexes can be counterproductive

Best Practices

When to Create Indexes

  • Frequent WHERE clauses
  • JOIN conditions
  • ORDER BY clauses
  • GROUP BY clauses
  • UNIQUE constraints

When to Avoid Indexes

  • Rarely used columns
  • Tables with few rows
  • Columns with high cardinality and low selectivity
  • Frequent UPDATE/DELETE operations

Index Design

  • Column Order: Selectivity descending
  • Index Width: As narrow as possible
  • Covering Index: Include all required columns
  • Partial Index: Only for relevant data

Frequently Asked Questions

  1. What is the difference between B-Tree and Hash Index? B-Tree supports equality and range search, Hash only equality search.

  2. When do you use a Fulltext Index? For text search in large text fields with stemming and relevance scoring.

  3. Explain Clustered vs Non-Clustered Index! Clustered: Physical data storage, Non-Clustered: Separate index structure.

  4. How do you check index usage? With EXPLAIN/EXPLAIN ANALYZE to analyze the query plan.

Most Important Sources

  1. https://dev.mysql.com/doc/refman/8.0/en/mysql-indexes.html
  2. https://www.postgresql.org/docs/current/indexes.html
  3. https://docs.microsoft.com/en-us/sql/relational-databases/indexes/indexes

Keine Bücher für Kategorie "datenbanken" gefunden.

Back to Blog
Share: