Skip to content
IRC-Coding IRC-Coding
Normalization 1NF 2NF 3NF BCNF Database Design Functional Dependencies Anomalies

Database Normalization: 1NF, 2NF, 3NF & BCNF

Master database normalization to eliminate redundancy and anomalies. Learn 1NF, 2NF, 3NF, BCNF with functional dependencies and practical examples.

S

schutzgeist

2 min read

Database Design: Normalization 1NF, 2NF, 3NF & BCNF

This article is a definition of terms for normalization in databases – including practical examples and exam questions.

In a Nutshell

Normalization is the process of structuring relational databases to eliminate redundancy and avoid data anomalies. The goal is to ensure data integrity and consistency.

Compact Technical Description

Normalization is a formal approach to eliminating redundancy and avoiding anomalies in relational databases. By stepwise application of normal forms, the data structure is improved.

Anomalies without normalization:

  • Insert anomalies: Data cannot be inserted without other information
  • Update anomalies: Changes require updates at multiple locations
  • Delete anomalies: Deleting data unintentionally removes other information

Normal forms:

  • 1NF: Atomic values, no repeating groups
  • 2NF: 1NF + no partial dependencies
  • 3NF: 2NF + no transitive dependencies
  • BCNF: Stronger form of 3NF with stricter rules

Functional dependencies form the mathematical foundation: X → Y means that the value of X uniquely determines the value of Y.

Exam-Relevant Key Points

  • 1NF: Atomic values, no repeating groups, unique primary keys
  • 2NF: 1NF satisfied, no partial dependencies, complete dependency on primary key
  • 3NF: 2NF satisfied, no transitive dependencies, non-key attributes depend only on primary key
  • BCNF: Every determinant is a candidate key
  • Functional dependencies: Mathematical foundation of normalization
  • Anomalies: Insert, update, delete problems with denormalized data
  • IHK-relevant: Important for data modeling and database design
  • Practice: Trade-off between normalization and performance

Core Components

  1. Functional dependency: X → Y (X uniquely determines Y)
  2. Primary key: Unique identification of records
  3. Candidate key: Possible primary keys
  4. Partial dependency: Dependency on part of composite key
  5. Transitive dependency: Indirect dependency through non-key attributes
  6. Determinant: Attribute that determines other attributes
  7. Normalization process: Stepwise application of normal forms
  8. Denormalization: Intentional redundancy for performance optimization

Practical Examples

Unnormalized Table (0NF)

-- Problematic structure with redundancy and anomalies
CREATE TABLE Orders (
    order_nr INT,
    orderDate DATE,
    customerName VARCHAR(100),
    customerAddress VARCHAR(200),
    article_nr INT,
    articleName VARCHAR(100),
    price DECIMAL(10,2),
    quantity INT,
    totalPrice DECIMAL(10,2)
);

-- Data with problems
INSERT INTO Orders VALUES 
(1, '2024-01-15', 'Meier', 'Hauptstraße 1', 101, 'Laptop', 999.99, 2, 1999.98),
(1, '2024-01-15', 'Meier', 'Hauptstraße 1', 102, 'Maus', 29.99, 1, 29.99),
(2, '2024-01-16', 'Schmidt', 'Nebenstraße 2', 101, 'Laptop', 999.99, 1, 999.99);

First Normal Form (1NF)

-- Atomic values, no repeating groups
CREATE TABLE Orders_1NF (
    order_nr INT,
    orderDate DATE,
    customerName VARCHAR(100),
    customerAddress VARCHAR(200),
    article_nr INT,
    articleName VARCHAR(100),
    price DECIMAL(10,2),
    quantity INT,
    totalPrice DECIMAL(10,2),
    PRIMARY KEY (order_nr, article_nr)
);

-- Functional dependencies:
-- order_nr, article_nr → quantity, totalPrice
-- order_nr → orderDate, customerName, customerAddress
-- article_nr → articleName, price

Second Normal Form (2NF)

-- Elimination of partial dependencies
CREATE TABLE Orders_2NF (
    order_nr INT PRIMARY KEY,
    orderDate DATE,
    customerName VARCHAR(100),
    customerAddress VARCHAR(200)
);

CREATE TABLE OrderPositions (
    order_nr INT,
    article_nr INT,
    quantity INT,
    totalPrice DECIMAL(10,2),
    PRIMARY KEY (order_nr, article_nr),
    FOREIGN KEY (order_nr) REFERENCES Orders_2NF(order_nr)
);

CREATE TABLE Articles (
    article_nr INT PRIMARY KEY,
    articleName VARCHAR(100),
    price DECIMAL(10,2)
);

Third Normal Form (3NF)

-- Elimination of transitive dependencies
CREATE TABLE Orders_3NF (
    order_nr INT PRIMARY KEY,
    orderDate DATE,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);

CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    customerName VARCHAR(100),
    customerAddress VARCHAR(200)
);

CREATE TABLE OrderPositions (
    order_nr INT,
    article_nr INT,
    quantity INT,
    PRIMARY KEY (order_nr, article_nr),
    FOREIGN KEY (order_nr) REFERENCES Orders_3NF(order_nr),
    FOREIGN KEY (article_nr) REFERENCES Articles(article_nr)
);

CREATE TABLE Articles (
    article_nr INT PRIMARY KEY,
    articleName VARCHAR(100),
    price DECIMAL(10,2)
);

Anomalies and Their Solutions

Insert Anomaly (without normalization)

-- Problem: Customer cannot be created without order
-- Solution: Separate customer table
INSERT INTO Customers (customer_id, customerName, customerAddress) 
VALUES (3, 'Mueller', 'Dritter Weg 3');

Update Anomaly (without normalization)

-- Problem: Customer address must be changed at multiple locations
-- Solution: Central customer table
UPDATE Customers 
SET customerAddress = 'Hauptstraße 1a' 
WHERE customer_id = 1;

Delete Anomaly (without normalization)

-- Problem: Deleting last order removes customer data
-- Solution: Separate tables avoid unintended deletion
DELETE FROM OrderPositions WHERE order_nr = 1;
-- Customer data remains in customer table

BCNF (Boyce-Codd Normal Form)

BCNF Rule

Every determinant must be a candidate key.

-- Example that satisfies 3NF but not BCNF
CREATE TABLE ProjectEmployees (
    project_id INT,
    employee_id INT,
    role VARCHAR(50),
    PRIMARY KEY (project_id, employee_id)
);

-- Functional dependencies:
-- project_id, employee_id → role
-- project_id, role → employee_id  (Determinant is not a candidate key!)

-- BCNF solution: Decomposition
CREATE TABLE ProjectRoles (
    project_id INT,
    role VARCHAR(50),
    employee_id INT,
    PRIMARY KEY (project_id, role)
);

CREATE TABLE EmployeeProjects (
    project_id INT,
    employee_id INT,
    PRIMARY KEY (project_id, employee_id)
);

Advantages and Disadvantages

Advantages of Normalization

  • Data integrity: Avoiding redundancy and inconsistencies
  • Maintainability: Changes required at only one location
  • Storage efficiency: Reduction of redundancy
  • Consistency: Unified data representation

Disadvantages

  • Performance: More joins required
  • Complexity: More complex data structure
  • Write performance: Distributed updates across multiple tables
  • Learning curve: Requires understanding of functional dependencies

Common Exam Questions

  1. What is the difference between 2NF and 3NF? 2NF eliminates partial dependencies, 3NF eliminates transitive dependencies.

  2. When is a table in BCNF? When every determinant is a candidate key (stricter than 3NF).

  3. Explain functional dependencies! X → Y means that the value of X uniquely determines the value of Y.

  4. What are anomalies and how are they avoided? Insert, update, delete anomalies are avoided through normalization.

Most Important Sources

  1. https://de.wikipedia.org/wiki/Normalisierung_(Datenbank)
  2. https://www.gatech.edu/coe/cse/normalization
  3. https://www.sql-tutorial.ru/sql-normalization.html

Keine Bücher für Kategorie "datenbanken" gefunden.

Back to Blog
Share:

Related Posts