DevOps Fundamentals: CI/CD, Docker, Kubernetes, Automation, Monitoring & Infrastructure as Code
This post is a comprehensive introduction to DevOps fundamentals, covering CI/CD, Docker, Kubernetes, automation, monitoring, and Infrastructure as Code with practical examples.
In a Nutshell
DevOps is a culture and methodology that brings software development (Dev) and IT operations (Ops) together in order to automate and accelerate the software delivery pipeline.
Technical Overview
DevOps is an approach that bridges the gap between development and operations through automation, collaboration, and continuous improvement.
Core components:
Continuous Integration/Continuous Deployment (CI/CD)
- Version Control: Git, GitHub, GitLab, Bitbucket
- Build Automation: Jenkins, GitHub Actions, GitLab CI
- Testing: Unit Tests, Integration Tests, E2E Tests
- Deployment: Automated Rollouts, Blue/Green, Canary
Containerization
- Docker: container platform for application isolation
- Docker Compose: multi-container applications
- Container Registry: Docker Hub, Harbor, AWS ECR
- Image Optimization: Multi-stage Builds, Layer Caching
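The Docker Compose entry above can be illustrated with a minimal sketch. Service names, ports, and credentials below are illustrative assumptions, not taken from a real project:

```yaml
# docker-compose.yml - hypothetical two-service stack (app + database)
services:
  app:
    build: .                 # build the image from the local Dockerfile
    ports:
      - "3000:3000"          # host:container port mapping
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db                   # start the database before the app
  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - db-data:/var/lib/postgresql/data   # persist data across restarts
volumes:
  db-data:
```

`docker compose up -d` starts both containers on a shared network in which services reach each other by service name (here, the app connects to the host `db`).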
Orchestration
- Kubernetes: container orchestration platform
- Services: Pods, Deployments, Services, Ingress
- Configuration: ConfigMaps, Secrets, Helm Charts
- Scaling: Horizontal Pod Autoscaling, Cluster Autoscaling
Infrastructure as Code (IaC)
- Terraform: Multi-Cloud Infrastructure Provisioning
- Ansible: Configuration Management
- CloudFormation: AWS-native IaC
- Pulumi: programmable infrastructure in general-purpose languages
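Ansible is the only IaC tool listed here without a longer example later in the post, so here is a minimal playbook sketch; the host group, package, and inventory names are assumptions:

```yaml
# playbook.yml - hypothetical playbook: install and start nginx on all web hosts
- name: Configure web servers
  hosts: webservers          # host group defined in the inventory file
  become: true               # escalate privileges for package/service tasks
  tasks:
    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true
    - name: Ensure nginx is running and enabled
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Run it with `ansible-playbook -i inventory playbook.yml`; because the modules are idempotent, repeated runs only report changes when something actually had to change.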
Monitoring & Observability
- Metrics: Prometheus, Grafana, InfluxDB
- Logging: ELK Stack, Fluentd, Loki
- Tracing: Jaeger, Zipkin, OpenTelemetry
- APM: Application Performance Monitoring
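Metrics collection only becomes actionable with alerting. A minimal Prometheus alerting rule could look like the following sketch; the metric name and threshold are illustrative assumptions:

```yaml
# alert_rules.yml - hypothetical alert: fire when the 5xx error rate
# stays above 5% for five minutes
groups:
  - name: app-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m                  # condition must hold this long before firing
        labels:
          severity: critical
        annotations:
          summary: "HTTP 5xx error rate above 5%"
```

Prometheus evaluates the `expr` on every evaluation interval; `for` prevents flapping alerts by requiring the condition to hold continuously before Alertmanager is notified.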
Exam-Relevant Key Points
- DevOps: culture and methodology for software development and operations
- CI/CD: Continuous Integration and Continuous Deployment
- Docker: container platform for application isolation
- Kubernetes: container orchestration platform
- Infrastructure as Code: automated infrastructure management
- Monitoring: observing systems and applications
- Automation: automating recurring tasks
- GitOps: Git-based operations workflows
- IHK-relevant: modern DevOps practices and tools
Core Components
- Version Control: Git-Workflows, Branching-Strategien
- CI/CD Pipeline: Build, Test, Deploy, Monitor
- Containerisierung: Docker, Container-Images, Registry
- Orchestrierung: Kubernetes, Services, Scaling
- IaC: Terraform, Ansible, Configuration Management
- Monitoring: Metrics, Logging, Tracing
- Security: Scanning, Compliance, Secret Management
- Collaboration: Team-Workflows, Communication
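The secret-management bullet can be made concrete with a minimal Kubernetes Secret manifest; the name and keys below are illustrative assumptions:

```yaml
# secret.yaml - hypothetical Secret; values are base64-encoded, not encrypted
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:                  # stringData accepts plain text; the API server encodes it
  DATABASE_PASSWORD: change-me
  JWT_SECRET: change-me-too
```

Keep in mind that base64 is encoding, not encryption: restrict access via RBAC and, for secrets stored in Git, consider tools such as Sealed Secrets or External Secrets Operator instead of plain manifests.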
Practical Examples
1. CI/CD Pipeline with GitHub Actions
```yaml
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
  release:
    types: [ published ]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
  NODE_VERSION: '18'
  PYTHON_VERSION: '3.11'

jobs:
  # Code quality and security
  quality:
    name: Code Quality & Security
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: 'pip'

      - name: Install dependencies
        run: |
          npm ci
          pip install -r requirements.txt
          pip install -r requirements-dev.txt

      - name: Run ESLint
        run: npm run lint

      - name: Run Prettier check
        run: npm run format:check

      - name: Run Python linting
        run: |
          flake8 src/
          black --check src/
          isort --check-only src/

      - name: Run security scan
        run: |
          npm audit --audit-level moderate
          safety check

      - name: Run SonarCloud scan
        uses: SonarSource/sonarcloud-github-action@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}

  # Testing
  test:
    name: Test Suite
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [16, 18, 20]
        # quote Python versions so YAML does not parse them as numbers
        python-version: ['3.9', '3.11', '3.12']
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'

      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'

      - name: Install dependencies
        run: |
          npm ci
          pip install -r requirements.txt
          pip install -r requirements-test.txt

      - name: Run unit tests
        run: |
          npm run test:unit
          pytest tests/unit/ -v --cov=src --cov-report=xml

      - name: Run integration tests
        run: |
          npm run test:integration
          pytest tests/integration/ -v

      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          flags: unittests
          name: codecov-umbrella

  # Build and scan the Docker image
  build:
    name: Build Docker Image
    runs-on: ubuntu-latest
    needs: [quality, test]
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          # the full-length sha tag lets later jobs reference ${{ github.sha }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix={{branch}}-
            type=sha,format=long,prefix=
            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Run container security scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload Trivy scan results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'

  # Deploy to staging
  deploy-staging:
    name: Deploy to Staging
    runs-on: ubuntu-latest
    needs: build
    if: github.ref == 'refs/heads/develop'
    environment: staging
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup kubectl
        uses: azure/setup-kubectl@v3
        with:
          version: 'v1.28.0'

      - name: Configure kubectl
        # persist KUBECONFIG for all following steps via GITHUB_ENV
        run: |
          echo "${{ secrets.KUBE_CONFIG_STAGING }}" | base64 -d > kubeconfig
          echo "KUBECONFIG=$PWD/kubeconfig" >> "$GITHUB_ENV"

      - name: Deploy to Kubernetes
        run: |
          helm upgrade --install app-staging ./helm/app \
            --namespace staging \
            --create-namespace \
            --set image.tag=${{ github.sha }} \
            --set environment=staging \
            --values helm/values-staging.yaml

      - name: Run smoke tests
        run: |
          kubectl wait --for=condition=ready pod -l app=app-staging -n staging --timeout=300s
          npm run test:smoke -- --env=staging

      - name: Run integration tests against staging
        run: npm run test:integration -- --env=staging

  # Deploy to production
  deploy-production:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: build
    if: github.event_name == 'release'
    environment: production
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup kubectl
        uses: azure/setup-kubectl@v3
        with:
          version: 'v1.28.0'

      - name: Configure kubectl
        run: |
          echo "${{ secrets.KUBE_CONFIG_PRODUCTION }}" | base64 -d > kubeconfig
          echo "KUBECONFIG=$PWD/kubeconfig" >> "$GITHUB_ENV"

      - name: Deploy to Kubernetes (Blue/Green)
        run: |
          # Deploy the new version to the green environment
          helm upgrade --install app-green ./helm/app \
            --namespace production \
            --set image.tag=${{ github.sha }} \
            --set environment=production \
            --set deployment.color=green \
            --values helm/values-production.yaml

          # Wait for the green deployment to become ready
          kubectl wait --for=condition=ready pod -l app=app-green,color=green -n production --timeout=600s

          # Run health checks against green before switching traffic
          npm run test:health -- --env=production-green

          # Switch the service selector to green
          kubectl patch service app-production -n production -p '{"spec":{"selector":{"color":"green"}}}'

          # Give the traffic switch time to settle, then run final smoke tests
          sleep 30
          npm run test:smoke -- --env=production

      - name: Cleanup blue environment
        run: |
          helm uninstall app-blue -n production || true
          kubectl delete deployment app-blue -n production || true

      - name: Notify deployment
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          channel: '#deployments'
          webhook_url: ${{ secrets.SLACK_WEBHOOK }}

  # Performance testing
  performance:
    name: Performance Testing
    runs-on: ubuntu-latest
    needs: deploy-staging
    if: github.ref == 'refs/heads/develop'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup k6
        run: |
          sudo gpg -k
          sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
          echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update
          sudo apt-get install k6

      - name: Run performance tests
        run: k6 run --out json=performance-results.json tests/performance/load-test.js

      - name: Upload performance results
        uses: actions/upload-artifact@v3
        with:
          name: performance-results
          path: performance-results.json

      - name: Analyze performance
        run: npm run analyze:performance -- performance-results.json

  # Documentation
  docs:
    name: Build Documentation
    runs-on: ubuntu-latest
    needs: test
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Build documentation
        run: |
          npm run docs:build
          npm run docs:generate-api

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        if: github.ref == 'refs/heads/main'
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs/build
```
A scheduled workflow for dependency updates belongs in its own file, since each workflow file may define only one `name:`/`on:` block:

```yaml
# .github/workflows/dependency-updates.yml
name: Dependency Updates

on:
  schedule:
    - cron: '0 2 * * 1'   # every Monday at 02:00 UTC
  workflow_dispatch:

env:
  NODE_VERSION: '18'
  PYTHON_VERSION: '3.11'

jobs:
  update-dependencies:
    name: Update Dependencies
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: 'pip'

      - name: Update Node.js dependencies
        run: |
          npm update
          npm audit fix

      - name: Update Python dependencies
        run: |
          pip install pip-tools
          pip-compile requirements.in
          pip-compile requirements-dev.in

      - name: Run tests
        run: |
          npm ci
          npm run test
          pip install -r requirements.txt
          pytest tests/

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: 'chore: update dependencies'
          title: 'chore: update dependencies'
          body: |
            Automated dependency update

            - Updated Node.js dependencies
            - Updated Python dependencies

            Please review the changes and ensure all tests pass.
          branch: chore/update-dependencies
          delete-branch: true
```
2. Docker Multi-Stage Build with Best Practices
```dockerfile
# Multi-stage Dockerfile for a production-ready application
# Stage 1: Build stage
FROM node:18-alpine AS builder

# Build arguments
ARG NODE_ENV=production
ARG APP_VERSION=1.0.0

# Environment variables
ENV NODE_ENV=$NODE_ENV
ENV APP_VERSION=$APP_VERSION

# Install build dependencies (py3-pip is required for the pip install below)
RUN apk add --no-cache \
    python3 \
    py3-pip \
    make \
    g++ \
    git

WORKDIR /app

# Copy dependency manifests first to benefit from layer caching
COPY package*.json ./
COPY requirements.txt ./

# Install all Node.js dependencies (dev dependencies are needed for build and tests)
RUN npm ci && npm cache clean --force

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy source code
COPY . .

# Run build and tests
RUN npm run build
RUN npm run test

# Stage 2: Runtime stage
FROM python:3.11-slim AS runtime

# Runtime arguments
ARG APP_USER=appuser
ARG APP_UID=1001
ARG APP_GID=1001

# Environment variables
ENV NODE_ENV=production
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV APP_PORT=3000

# Install runtime dependencies (nodejs/npm, because the entrypoint runs npm)
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    ca-certificates \
    nodejs \
    npm \
    && rm -rf /var/lib/apt/lists/*

# Create a non-root user
RUN groupadd -g $APP_GID $APP_USER && \
    useradd -m -u $APP_UID -g $APP_GID -s /bin/bash $APP_USER

WORKDIR /app

# Copy the built application from the builder stage
COPY --from=builder --chown=$APP_USER:$APP_GID /app/dist ./dist
COPY --from=builder --chown=$APP_USER:$APP_GID /app/node_modules ./node_modules
COPY --from=builder --chown=$APP_USER:$APP_GID /app/requirements.txt ./

# Install Python production dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy configuration files
COPY --chown=$APP_USER:$APP_GID config/ ./config/
COPY --chown=$APP_USER:$APP_GID scripts/ ./scripts/

# Set permissions
RUN chmod +x scripts/*.sh

# Switch to the non-root user
USER $APP_USER

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:$APP_PORT/health || exit 1

# Expose the application port
EXPOSE $APP_PORT

ENTRYPOINT ["./scripts/entrypoint.sh"]
CMD ["npm", "start"]

# Stage 3: Development stage
FROM runtime AS development

# Override environment for development
ENV NODE_ENV=development

# Switch to root to install development tools
USER root

RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    vim \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir pytest pytest-cov black flake8

# Install Node.js development dependencies
RUN npm install

# Switch back to the app user
USER $APP_USER

# Override command for development
CMD ["npm", "run", "dev"]

# Stage 4: Testing stage
FROM builder AS testing

# Install test and scan dependencies
RUN npm install --no-save
RUN pip install --no-cache-dir pytest pytest-cov safety

# Run the full test suite
RUN npm run test:coverage
RUN pytest tests/ --cov=src --cov-report=xml

# Security scanning
RUN npm audit --audit-level high
RUN safety check

# Stage 5: Security scanning stage
FROM builder AS security

# Install security scanning tools
RUN npm install -g audit-ci
RUN pip install safety bandit

# Run security scans; the reports are written to /app and can be copied
# out of a container built with `--target security` (e.g. via `docker cp`)
RUN audit-ci --moderate
RUN safety check --json > safety-report.json
RUN bandit -r src/ -f json -o bandit-report.json
```
3. Kubernetes Deployment with Helm and GitOps
```yaml
# helm/app/Chart.yaml
apiVersion: v2
name: app
description: A Helm chart for deploying the application
type: application
version: 1.0.0
appVersion: "1.0.0"
home: https://github.com/organization/app
sources:
  - https://github.com/organization/app
maintainers:
  - name: DevOps Team
    email: devops@organization.com
keywords:
  - web
  - application
  - devops
annotations:
  category: WebApplication
```
```yaml
# helm/app/values.yaml
# Default values for the application
replicaCount: 3

image:
  repository: ghcr.io/organization/app
  pullPolicy: IfNotPresent
  tag: "latest"

nameOverride: ""
fullnameOverride: ""

serviceAccount:
  create: true
  annotations: {}
  name: ""

podAnnotations: {}

podSecurityContext:
  fsGroup: 1001

securityContext:
  allowPrivilegeEscalation: false
  runAsNonRoot: true
  runAsUser: 1001
  runAsGroup: 1001
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true

service:
  type: ClusterIP
  port: 80
  targetPort: 3000

ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/rate-limit: "100"
    nginx.ingress.kubernetes.io/rate-limit-window: "1m"
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: app-tls
      hosts:
        - app.example.com

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

nodeSelector: {}
tolerations: []

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - app
          topologyKey: kubernetes.io/hostname

config:
  environment: production
  logLevel: info
  database:
    host: postgres.example.com
    port: 5432
    name: app_prod
  redis:
    host: redis.example.com
    port: 6379
  monitoring:
    enabled: true
    port: 9090

secrets:
  databasePassword: ""
  jwtSecret: ""
  apiKeys: ""
```
```yaml
# helm/app/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "app.fullname" . }}
  labels:
    {{- include "app.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "app.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        checksum/secret: {{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
      labels:
        {{- include "app.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "app.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      initContainers:
        # Block until the database accepts connections
        - name: wait-for-db
          image: postgres:15-alpine
          command:
            - sh
            - -c
            - |
              until pg_isready -h {{ .Values.config.database.host }} -p {{ .Values.config.database.port }}; do
                echo "Waiting for database..."
                sleep 2
              done
        # Run schema migrations before the app starts
        - name: migrate-db
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          command:
            - npm
            - run
            - migrate
          envFrom:
            - configMapRef:
                name: {{ include "app.fullname" . }}
            - secretRef:
                name: {{ include "app.fullname" . }}
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.targetPort }}
              protocol: TCP
            - name: metrics
              containerPort: {{ .Values.config.monitoring.port }}
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          envFrom:
            - configMapRef:
                name: {{ include "app.fullname" . }}
            - secretRef:
                name: {{ include "app.fullname" . }}
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: config
              mountPath: /app/config
              readOnly: true
        # Sidecar that ships container logs
        - name: log-shipper
          image: fluent/fluent-bit:2.0
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
            requests:
              cpu: 50m
              memory: 64Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: tmp
          emptyDir: {}
        - name: config
          configMap:
            name: {{ include "app.fullname" . }}
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```
```yaml
# helm/app/templates/hpa.yaml
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "app.fullname" . }}
  labels:
    {{- include "app.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "app.fullname" . }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
    {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}
```
```yaml
# helm/app/templates/monitoring.yaml
{{- if .Values.config.monitoring.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "app.fullname" . }}-metrics
  labels:
    {{- include "app.labels" . | nindent 4 }}
spec:
  type: ClusterIP
  ports:
    - port: {{ .Values.config.monitoring.port }}
      targetPort: metrics
      protocol: TCP
      name: metrics
  selector:
    {{- include "app.selectorLabels" . | nindent 4 }}
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "app.fullname" . }}
  labels:
    {{- include "app.labels" . | nindent 4 }}
spec:
  selector:
    matchLabels:
      {{- include "app.selectorLabels" . | nindent 6 }}
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
{{- end }}
```
```yaml
# GitOps application manifest (Argo CD)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/organization/app-helm
    targetRevision: HEAD
    path: helm/app
    helm:
      valueFiles:
        - values-production.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
```
4. Terraform Infrastructure as Code
```hcl
# terraform/main.tf
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Project     = var.project_name
      ManagedBy   = "terraform"
    }
  }
}

# Terraform backend configuration.
# Note: variables are not allowed in backend blocks, so bucket and table
# names must be literals (or supplied via -backend-config at init time).
terraform {
  backend "s3" {
    bucket         = "terraform-state-my-app"
    key            = "infrastructure/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks-my-app"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
  }
}
```
```hcl
# terraform/variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "production"
}

variable "project_name" {
  description = "Project name"
  type        = string
  default     = "my-app"
}

variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "List of availability zones"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

variable "cluster_name" {
  description = "EKS cluster name"
  type        = string
  default     = "my-app-cluster"
}

variable "cluster_version" {
  description = "EKS cluster version"
  type        = string
  default     = "1.28"
}

variable "node_groups" {
  description = "EKS node groups configuration"
  type = map(object({
    instance_type = string
    min_size      = number
    max_size      = number
    desired_size  = number
    disk_size     = number
  }))
  default = {
    general = {
      instance_type = "t3.medium"
      min_size      = 3
      max_size      = 10
      desired_size  = 3
      disk_size     = 50
    }
    compute = {
      instance_type = "c5.large"
      min_size      = 2
      max_size      = 5
      desired_size  = 2
      disk_size     = 100
    }
  }
}
```
```hcl
# terraform/vpc.tf
# VPC
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "${var.project_name}-vpc"
  }
}

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}-igw"
  }
}

# Public subnets
resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project_name}-public-${count.index}"
    Type = "Public"
  }
}

# Private subnets (note: a NAT gateway plus private route tables would be
# needed for outbound internet access, e.g. for nodes pulling images)
resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 3)
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${var.project_name}-private-${count.index}"
    Type = "Private"
  }
}

# Database subnets
resource "aws_subnet" "database" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 6)
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${var.project_name}-database-${count.index}"
    Type = "Database"
  }
}

# Route tables
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${var.project_name}-public-rt"
  }
}

resource "aws_route_table_association" "public" {
  count          = length(aws_subnet.public)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
```
```hcl
# EKS cluster
resource "aws_eks_cluster" "main" {
  name     = var.cluster_name
  role_arn = aws_iam_role.eks_cluster.arn
  version  = var.cluster_version

  vpc_config {
    subnet_ids = concat(
      aws_subnet.public[*].id,
      aws_subnet.private[*].id
    )
    endpoint_public_access  = true
    endpoint_private_access = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_cluster_policy,
  ]

  tags = {
    Name = var.cluster_name
  }
}

# EKS node groups
resource "aws_eks_node_group" "main" {
  for_each = var.node_groups

  cluster_name    = aws_eks_cluster.main.name
  node_group_name = each.key
  node_role_arn   = aws_iam_role.eks_node.arn
  subnet_ids      = aws_subnet.private[*].id

  scaling_config {
    desired_size = each.value.desired_size
    max_size     = each.value.max_size
    min_size     = each.value.min_size
  }

  instance_types = [each.value.instance_type]
  disk_size      = each.value.disk_size

  remote_access {
    # assumes an aws_key_pair "main" defined elsewhere in the configuration
    ec2_ssh_key               = aws_key_pair.main.key_name
    source_security_group_ids = [aws_security_group.eks_nodes.id]
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_worker_node_policy,
    aws_iam_role_policy_attachment.eks_cni_policy,
    aws_iam_role_policy_attachment.eks_container_registry_policy,
  ]

  tags = {
    Name = "${var.cluster_name}-${each.key}"
    Type = each.key
  }
}
```
```hcl
# IAM roles
resource "aws_iam_role" "eks_cluster" {
  name = "${var.project_name}-eks-cluster-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "eks.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.eks_cluster.name
}

resource "aws_iam_role" "eks_node" {
  name = "${var.project_name}-eks-node-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "eks_worker_node_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.eks_node.name
}

resource "aws_iam_role_policy_attachment" "eks_cni_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.eks_node.name
}

resource "aws_iam_role_policy_attachment" "eks_container_registry_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.eks_node.name
}

# Security groups
resource "aws_security_group" "eks_cluster" {
  name        = "${var.project_name}-eks-cluster-sg"
  description = "Security group for EKS cluster"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-eks-cluster-sg"
  }
}

resource "aws_security_group" "eks_nodes" {
  name        = "${var.project_name}-eks-nodes-sg"
  description = "Security group for EKS nodes"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-eks-nodes-sg"
  }
}
```
```hcl
# RDS database
resource "aws_db_subnet_group" "main" {
  name       = "${var.project_name}-db-subnet-group"
  subnet_ids = aws_subnet.database[*].id

  tags = {
    Name = "${var.project_name}-db-subnet-group"
  }
}

resource "aws_security_group" "rds" {
  name        = "${var.project_name}-rds-sg"
  description = "Security group for RDS database"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.eks_nodes.id]
  }

  tags = {
    Name = "${var.project_name}-rds-sg"
  }
}

resource "aws_db_instance" "postgres" {
  identifier     = "${var.project_name}-postgres"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.t3.medium"

  allocated_storage     = 100
  max_allocated_storage = 1000
  storage_type          = "gp2"
  storage_encrypted     = true

  db_name  = "app"
  username = "app_user"
  password = random_password.db_password.result

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]

  backup_retention_period = 7
  backup_window           = "03:00-04:00"
  maintenance_window      = "sun:04:00-sun:05:00"

  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.project_name}-postgres-final-snapshot"
  deletion_protection       = true

  tags = {
    Name = "${var.project_name}-postgres"
  }
}

# Redis (ElastiCache)
resource "aws_elasticache_subnet_group" "main" {
  name       = "${var.project_name}-cache-subnet-group"
  subnet_ids = aws_subnet.private[*].id

  tags = {
    Name = "${var.project_name}-cache-subnet-group"
  }
}

resource "aws_security_group" "redis" {
  name        = "${var.project_name}-redis-sg"
  description = "Security group for Redis"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port       = 6379
    to_port         = 6379
    protocol        = "tcp"
    security_groups = [aws_security_group.eks_nodes.id]
  }

  tags = {
    Name = "${var.project_name}-redis-sg"
  }
}

resource "aws_elasticache_replication_group" "redis" {
  replication_group_id = "${var.project_name}-redis"
  description          = "Redis cluster for ${var.project_name}"

  node_type            = "cache.t3.micro"
  port                 = 6379
  parameter_group_name = "default.redis7"

  num_cache_clusters         = 2
  automatic_failover_enabled = true
  multi_az_enabled           = true

  subnet_group_name  = aws_elasticache_subnet_group.main.name
  security_group_ids = [aws_security_group.redis.id]

  at_rest_encryption_enabled = true
  transit_encryption_enabled = true
  auth_token                 = random_password.redis_auth_token.result

  snapshot_retention_limit = 7
  snapshot_window          = "05:00-06:00"
  maintenance_window       = "sun:06:00-sun:07:00"

  tags = {
    Name = "${var.project_name}-redis"
  }
}
```
# S3 Buckets
resource "aws_s3_bucket" "app_storage" {
bucket = "${var.project_name}-storage-${random_string.bucket_suffix.result}"
tags = {
Name = "${var.project_name}-storage"
}
}
resource "aws_s3_bucket_versioning" "app_storage" {
bucket = aws_s3_bucket.app_storage.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_encryption" "app_storage" {
bucket = aws_s3_bucket.app_storage.id
server_side_encryption_configuration {
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
}
resource "aws_s3_bucket_public_access_block" "app_storage" {
bucket = aws_s3_bucket.app_storage.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
# Random resources
resource "random_password" "db_password" {
  length           = 32
  special          = true
  override_special = "!#$%&*()-_=+[]{}<>:?"
}

resource "random_password" "redis_auth_token" {
  length           = 64
  special          = true
  override_special = "!#$%&*()-_=+[]{}<>:?"
}

resource "random_string" "bucket_suffix" {
  length  = 8
  special = false
  upper   = false
}
# Outputs
output "cluster_name" {
  description = "EKS cluster name"
  value       = aws_eks_cluster.main.name
}

output "cluster_endpoint" {
  description = "EKS cluster endpoint"
  value       = aws_eks_cluster.main.endpoint
}

output "cluster_certificate_authority_data" {
  description = "EKS cluster certificate authority data"
  value       = aws_eks_cluster.main.certificate_authority[0].data
}

output "database_endpoint" {
  description = "RDS database endpoint"
  value       = aws_db_instance.postgres.endpoint
  sensitive   = true
}

output "redis_endpoint" {
  description = "Redis endpoint"
  value       = aws_elasticache_replication_group.redis.primary_endpoint_address
  sensitive   = true
}

output "storage_bucket" {
  description = "S3 storage bucket name"
  value       = aws_s3_bucket.app_storage.bucket
}
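These outputs are what later automation steps consume. A minimal sketch of wiring them into a kubeconfig update, assuming `terraform output -json` is run in the module directory; the AWS region and the command assembly are illustrative, not part of the Terraform above:

```python
import json
import subprocess

def read_tf_outputs(tf_dir: str) -> dict:
    """Run `terraform output -json` and return {name: value} for all outputs."""
    raw = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=tf_dir, capture_output=True, text=True, check=True,
    ).stdout
    return {name: meta["value"] for name, meta in json.loads(raw).items()}

def kubeconfig_command(outputs: dict, region: str = "eu-central-1") -> list:
    """Build the `aws eks update-kubeconfig` call from the cluster_name output.

    The region default is an assumption for illustration."""
    return ["aws", "eks", "update-kubeconfig",
            "--name", outputs["cluster_name"], "--region", region]
```

Sensitive outputs (`database_endpoint`, `redis_endpoint`) are still present in the JSON; `sensitive = true` only suppresses them in plain `terraform output` and plan logs.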
5. Monitoring with Prometheus and Grafana
# monitoring/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
      - role: endpoints
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
      - role: node
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)

  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name

  - job_name: 'kubernetes-services'
    kubernetes_sd_configs:
      - role: service
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
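The `kubernetes-pods` job only scrapes pods that opt in via annotations. A pod that wants to be scraped might look like this (names, image, and port are illustrative assumptions):

```yaml
# Illustrative pod fragment: the annotations match the relabel rules of the
# 'kubernetes-pods' job. Name, image, and port are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: app
  annotations:
    prometheus.io/scrape: "true"   # matched by the 'keep' rule
    prometheus.io/path: "/metrics" # rewrites __metrics_path__
    prometheus.io/port: "8080"     # rewrites __address__ to <pod-ip>:8080
spec:
  containers:
    - name: app
      image: myapp:latest
      ports:
        - containerPort: 8080
```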
# monitoring/alert_rules.yml
groups:
  - name: kubernetes-apps
    rules:
      - alert: KubernetesPodCrashLooping
        expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Pod {{ $labels.pod }} is crash looping"
          description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is crash looping."

      - alert: KubernetesPodNotReady
        expr: kube_pod_status_ready{condition="true"} == 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is not ready"
          description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is not ready."

      - alert: KubernetesNodeNotReady
        expr: kube_node_status_condition{condition="Ready",status="true"} == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.node }} is not ready"
          description: "Node {{ $labels.node }} has been not ready for more than 10 minutes."

  - name: application
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) / rate(http_requests_total[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
          description: "Error rate is {{ $value | humanizePercentage }} for {{ $labels.job }}."

      - alert: HighResponseTime
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High response time detected"
          description: "95th percentile response time is {{ $value }}s for {{ $labels.job }}."

      - alert: LowThroughput
        expr: rate(http_requests_total[5m]) < 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low throughput detected"
          description: "Request rate is {{ $value }} requests/second for {{ $labels.job }}."

  - name: infrastructure
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value }}% on {{ $labels.instance }}."

      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: "Memory usage is {{ $value }}% on {{ $labels.instance }}."

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space low on {{ $labels.instance }}"
          description: "Disk space is {{ $value }}% available on {{ $labels.device }}."
# grafana/dashboards/app-dashboard.json
{
  "dashboard": {
    "id": null,
    "title": "Application Dashboard",
    "tags": ["app", "production"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(http_requests_total[5m])",
            "legendFormat": "{{method}} {{status}}"
          }
        ],
        "yAxes": [
          { "label": "Requests/sec" }
        ]
      },
      {
        "id": 2,
        "title": "Response Time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "50th percentile"
          },
          {
            "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          },
          {
            "expr": "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "99th percentile"
          }
        ],
        "yAxes": [
          { "label": "Seconds" }
        ]
      },
      {
        "id": 3,
        "title": "Error Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(http_requests_total{status=~\"5..\"}[5m]) / rate(http_requests_total[5m])",
            "legendFormat": "Error Rate"
          }
        ],
        "yAxes": [
          { "label": "Percentage", "min": 0, "max": 1 }
        ]
      },
      {
        "id": 4,
        "title": "Application Status",
        "type": "stat",
        "targets": [
          {
            "expr": "up{job=\"app\"}",
            "legendFormat": "Application Status"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "mappings": [
              {
                "options": {
                  "0": { "text": "DOWN", "color": "red" },
                  "1": { "text": "UP", "color": "green" }
                },
                "type": "value"
              }
            ]
          }
        }
      }
    ],
    "time": { "from": "now-1h", "to": "now" },
    "refresh": "5s"
  }
}
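The Response Time panel leans on `histogram_quantile`, which estimates a quantile from cumulative histogram buckets by linear interpolation inside the bucket where the quantile falls. A simplified sketch of that estimation (the real PromQL function additionally handles edge cases like NaN inputs and negative bounds):

```python
def histogram_quantile(q: float, buckets: list) -> float:
    """Simplified version of PromQL's histogram_quantile.

    buckets: (upper_bound, cumulative_count) pairs sorted by bound,
    ending with (float('inf'), total) like Prometheus' '+Inf' bucket.
    The first bucket's lower bound is assumed to be 0."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                return prev_bound  # quantile falls in the open-ended bucket
            frac = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * frac
        prev_bound, prev_count = bound, count
    return prev_bound
```

This is also why histogram quantiles are estimates: the answer can never be more precise than the bucket boundaries chosen when instrumenting the application.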
DevOps Pipeline Architecture
CI/CD Pipeline Stages
graph TD
    A[Code Commit] --> B[Build Stage]
    B --> C[Test Stage]
    C --> D[Security Scan]
    D --> E[Package Stage]
    E --> F[Deploy Staging]
    F --> G[Integration Tests]
    G --> H[Approve Production]
    H --> I[Deploy Production]
    I --> J[Monitoring]
    J --> K[Rollback if needed]

    A1[Git Push] --> A
    B1[Docker Build] --> B
    C1[Unit Tests] --> C
    C2[Integration Tests] --> C
    D1[Vulnerability Scan] --> D
    E1[Image Registry] --> E
    F1[Kubernetes Deploy] --> F
    G1[E2E Tests] --> G
    H1[Manual Approval] --> H
    I1[Blue/Green Deploy] --> I
    J1[Prometheus/Grafana] --> J
    K1[Automated Rollback] --> K
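The first stages of this pipeline map directly onto a CI workflow. A minimal GitHub Actions sketch of the build/test/deploy-staging stages; the image name, test command (`npm test`), and kubectl deploy step are illustrative assumptions, not a complete pipeline:

```yaml
# Illustrative GitHub Actions workflow for the first pipeline stages above.
name: ci-cd
on:
  push:
    branches: [main]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image          # Build Stage
        run: docker build -t myapp:${{ github.sha }} .
      - name: Unit tests           # Test Stage
        run: docker run --rm myapp:${{ github.sha }} npm test
  deploy-staging:
    needs: build-test              # only runs after build/test succeed
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to staging    # Deploy Staging
        run: kubectl set image deployment/myapp app=myapp:${{ github.sha }}
```

A real pipeline would add the security scan, registry push, approval gate, and production deploy as further jobs in the same chain.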
Containerization Comparison
Container Runtimes
| Runtime | Language | Security | Performance | Use Case |
|---|---|---|---|---|
| Docker | Go | Medium | Good | General purpose |
| containerd | Go | High | Very good | Production |
| CRI-O | Go | High | Good | Kubernetes |
| Podman | Go | High | Good | Daemonless |
Orchestration Platforms
| Platform | Complexity | Scalability | Cloud-Native | Use Case |
|---|---|---|---|---|
| Kubernetes | High | Very high | Yes | Enterprise |
| Docker Swarm | Low | Medium | Partial | Small/medium |
| OpenShift | High | Very high | Yes | Enterprise |
| Nomad | Medium | High | Yes | Multi-cloud |
Infrastructure as Code Tools
Terraform vs. CloudFormation vs. Pulumi
| Tool | Language | Multi-Cloud | State Management | Use Case |
|---|---|---|---|---|
| Terraform | HCL | Yes | Own state file | Multi-cloud |
| CloudFormation | YAML/JSON | No | AWS-managed | AWS-only |
| Pulumi | Multiple (TypeScript, Python, Go, ...) | Yes | Own state file | Programmable |
| Ansible | YAML | Yes | No state | Configuration |
IaC Best Practices
- Modularization: small, reusable modules
- Versioning: Git-based version control
- Testing: automated testing of infrastructure changes
- Documentation: automatically generated documentation
- Security: security scanning and compliance checks
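One concrete form of infrastructure testing is asserting on Terraform's machine-readable plan before applying it. The sketch below parses the JSON from `terraform show -json plan.tfplan` and lists resources the plan would delete, so a pipeline can block unexpected destroys; the gating policy itself is an assumption:

```python
import json

def destructive_changes(plan_json: str) -> list:
    """Return addresses of resources a Terraform plan would delete.

    Expects the output of `terraform show -json plan.tfplan`, whose
    resource_changes[*].change.actions lists actions such as
    ["create"], ["update"], ["delete"], or ["delete", "create"] (replace)."""
    plan = json.loads(plan_json)
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if "delete" in rc["change"]["actions"]
    ]
```

A CI step could fail the build whenever this list is non-empty unless the change was explicitly approved.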
Monitoring and Observability
Observability Pillars
| Pillar | Tools | Data | Use Case |
|---|---|---|---|
| Metrics | Prometheus, InfluxDB | Numeric data | Performance |
| Logs | ELK Stack, Loki | Textual data | Troubleshooting |
| Traces | Jaeger, Zipkin | Request flows | Distributed systems |
| Events | CloudWatch, EventBridge | State changes | Audit trail |
Alerting Strategies
- Threshold-based: static limits on a metric
- Anomaly detection: automatic detection of unusual behavior
- Predictive: forecasting problems before they occur
- Business metrics: alerts on business-relevant indicators
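The difference between the first two strategies can be shown in a few lines: a threshold alert compares against a fixed limit, while a simple anomaly detector compares against the metric's own recent behavior. A minimal z-score sketch, assuming a small sliding window of recent values (real systems use far more robust models):

```python
from statistics import mean, stdev

def is_anomalous(history: list, value: float, z_threshold: float = 3.0) -> bool:
    """Flag `value` if it deviates more than z_threshold standard
    deviations from the recent history. Unlike a static threshold,
    the 'limit' adapts to whatever level the metric normally has."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu  # flat history: any change is anomalous
    return abs(value - mu) / sigma > z_threshold
```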
Advantages and Disadvantages
Advantages of DevOps
- Faster delivery: accelerated software releases
- Higher quality: automated tests and quality assurance
- Better collaboration: integration of Dev and Ops
- Scalability: automated scaling of infrastructure
- Reliability: consistent, repeatable deployments
Disadvantages
- Complexity: high initial complexity
- Cost: investment in tools and training
- Cultural change: requires organizational change
- Learning curve: steep learning curve for teams
- Tool overload: many different tools to master
Common Exam Questions
- What is the difference between CI and CD? CI (Continuous Integration) automates building and testing code on every commit; CD automates the release: Continuous Delivery stops at a production-ready artifact, while Continuous Deployment pushes every passing build to production automatically.
- Explain containerization with Docker. Docker isolates applications in containers together with all of their dependencies, which guarantees consistent environments across different systems.
- When do you use Kubernetes vs. Docker Swarm? Kubernetes for complex, highly scalable applications in enterprise environments; Docker Swarm for simpler setups in small to medium-sized organizations.
- What is Infrastructure as Code? The practice of defining and managing infrastructure through code, which enables automation, review, and versioning.
Key Sources
- https://docs.docker.com/
- https://kubernetes.io/docs/
- https://www.terraform.io/docs/
- https://prometheus.io/docs/