An open-source container orchestration platform used to automate the deployment, scaling, and management of containerized applications, commonly referenced in DevOps and cloud documentation.
Engineering teams that manually SSH into servers to update a payment API incur 15-30 minute outages during each release cycle, violating SLA commitments and eroding customer trust.
The Kubernetes RollingUpdate deployment strategy incrementally replaces old pods with new ones, maintaining a minimum number of available replicas throughout the update so traffic is never fully interrupted.
1. Define a Deployment manifest with strategy.type: RollingUpdate, setting maxUnavailable: 1 and maxSurge: 1 to control pod replacement pace.
2. Configure a readinessProbe on the container pointing to the /health endpoint so Kubernetes only routes traffic to pods that pass the health check.
3. Run kubectl rollout status deployment/payment-api to monitor the rollout in CI/CD pipelines and block the pipeline on failure.
4. Use kubectl rollout undo deployment/payment-api to instantly revert to the previous ReplicaSet if error rates spike post-deployment.
Deployment downtime drops from 15-30 minutes to zero, with rollbacks completing in under 60 seconds, enabling teams to ship multiple times per day safely.
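The steps above can be sketched as a single Deployment manifest. This is a minimal illustration, not a production config: the image name, container port, and replica count are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
spec:
  replicas: 4                       # illustrative replica count
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1             # at most one pod down at a time
      maxSurge: 1                   # at most one extra pod during the update
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api
    spec:
      containers:
      - name: payment-api
        image: registry.example.com/payment-api:v2.3.1   # hypothetical image
        ports:
        - containerPort: 8080       # assumed service port
        readinessProbe:             # gate traffic on the /health endpoint
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
```

A pipeline would then run kubectl rollout status deployment/payment-api to block on failure, and kubectl rollout undo deployment/payment-api to revert if error rates spike.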
An ML inference service running on fixed VM infrastructure gets overwhelmed during business hours, causing p99 latency to spike from 200ms to 8 seconds and dropping requests when GPU utilization exceeds 90%.
The Kubernetes Horizontal Pod Autoscaler (HPA), combined with the Cluster Autoscaler, automatically scales both pod count and underlying node count based on custom GPU utilization metrics from Prometheus.
1. Deploy the Prometheus Adapter to expose custom GPU utilization metrics to the Kubernetes metrics API under the custom.metrics.k8s.io endpoint.
2. Create an HPA manifest targeting the inference Deployment with a custom metric threshold of 70% GPU utilization, setting minReplicas: 2 and maxReplicas: 20.
3. Configure the Cluster Autoscaler on the node group with GPU instances so new nodes provision automatically when pods remain in Pending state due to insufficient resources.
4. Set PodDisruptionBudgets to ensure at least 50% of inference pods remain available during scale-down events to prevent latency spikes during node removal.
p99 latency stays below 300ms even at 10x normal traffic, infrastructure costs drop 40% during off-peak hours due to scale-down, and zero manual intervention is required during traffic events.
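An HPA manifest matching the thresholds above might look like the following sketch. The Deployment name and the custom metric name (gpu_utilization, as exposed by the Prometheus Adapter's configuration) are assumptions; the adapter must be configured to serve that metric per pod.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inference-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inference              # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: gpu_utilization    # custom metric served via custom.metrics.k8s.io (assumed name)
      target:
        type: AverageValue
        averageValue: "70"       # scale out when average GPU utilization exceeds 70%
```

The Cluster Autoscaler needs no per-workload manifest: it watches for Pending pods and provisions nodes from the GPU node group when no existing node can satisfy their resource requests.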
A SaaS company runs all customer workloads in a shared cluster without isolation, causing a noisy-neighbor situation where one client's batch job consumes all cluster CPU, degrading response times for other paying customers.
Kubernetes Namespaces combined with ResourceQuotas, LimitRanges, and NetworkPolicies create hard boundaries between tenants, guaranteeing resource allocation and preventing cross-tenant network access.
1. Create a dedicated Namespace per enterprise client (e.g., tenant-acme, tenant-globex) and apply a ResourceQuota limiting CPU requests to 16 cores and memory to 64Gi per namespace.
2. Define a LimitRange in each namespace to set default container CPU limits to 500m and memory to 512Mi, preventing unbounded resource consumption by misconfigured pods.
3. Apply a default-deny NetworkPolicy in each namespace and whitelist only ingress from the shared ingress controller namespace, ensuring tenants cannot query each other's pod IPs.
4. Use RBAC RoleBindings to grant each tenant's CI/CD service account deploy permissions only within their own namespace, preventing accidental cross-tenant deployments.
Noisy-neighbor incidents are eliminated, SLA compliance rises to 99.95%, and onboarding a new enterprise tenant is reduced from 2 days of manual setup to a 10-minute automated namespace provisioning script.
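The quota and network-isolation steps above can be sketched for one tenant namespace as follows. The ingress controller namespace label is an assumption and depends on how your ingress is deployed.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "16"           # cap aggregate CPU requests at 16 cores
    requests.memory: 64Gi        # cap aggregate memory requests at 64Gi
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller-only
  namespace: tenant-acme
spec:
  podSelector: {}                # applies to every pod in the namespace
  policyTypes:
  - Ingress                      # with no broad allow rule, this denies all other ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx   # shared ingress controller namespace (assumed)
```

Repeating these manifests per tenant is what makes the 10-minute automated provisioning script feasible: templating only the namespace name produces a fully isolated tenant.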
A team discovers database passwords and API keys hardcoded in Docker images and environment variables in YAML files committed to a public GitHub repository, creating a critical security vulnerability requiring immediate credential rotation.
HashiCorp Vault, integrated with Kubernetes via the Vault Agent Injector, dynamically injects short-lived credentials into pods at runtime, eliminating static secrets from source code and container images entirely.
1. Deploy HashiCorp Vault with the Kubernetes auth method enabled, allowing pods to authenticate using their ServiceAccount JWT tokens bound to specific Vault policies.
2. Annotate application Deployments with vault.hashicorp.com/agent-inject: 'true' and specify the Vault secret path so the sidecar agent writes credentials to an in-memory tmpfs volume at /vault/secrets/.
3. Remove all hardcoded environment variables and secretKeyRef references from Deployment manifests, replacing them with application logic that reads credentials from the mounted file path.
4. Enable Vault's dynamic database secrets engine to issue unique, time-limited PostgreSQL credentials per pod, so compromised credentials automatically expire within 1 hour.
All static credentials are eliminated from the codebase and images, credential rotation happens automatically without pod restarts, and the security audit passes with zero hardcoded secret findings.
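The injector annotations from step 2 live on the pod template of the Deployment. A fragment might look like this; the role name and secret path are assumptions that must match your Vault policy and database secrets engine mount.

```yaml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "payment-api"   # Vault role bound to the pod's ServiceAccount (assumed name)
        # renders dynamic PostgreSQL credentials to /vault/secrets/db-creds on tmpfs:
        vault.hashicorp.com/agent-inject-secret-db-creds: "database/creds/payment-api"
```

The application then reads /vault/secrets/db-creds at startup (and on change) instead of consulting environment variables, so no credential ever appears in the manifest, image, or repository.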
Kubernetes uses resource requests for scheduling decisions and limits for runtime enforcement. Without requests, the scheduler may place too many pods on a single node causing OOMKill events; without limits, a single runaway process can starve neighboring pods of memory. Setting both ensures predictable scheduling and prevents cascading failures in shared clusters.
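A container spec setting both values might look like this; the numbers are illustrative and should be tuned from observed usage.

```yaml
resources:
  requests:            # used by the scheduler to decide node placement
    cpu: 250m
    memory: 256Mi
  limits:              # enforced at runtime; exceeding the memory limit triggers OOMKill
    cpu: 500m
    memory: 512Mi
```

Setting requests below limits gives pods burst headroom while keeping scheduling honest; setting them equal yields the Guaranteed QoS class, which is evicted last under node pressure.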
Liveness probes restart containers that are deadlocked or in an unrecoverable state, while readiness probes temporarily remove pods from Service endpoints when they cannot handle traffic. Conflating the two by using the same probe for both purposes causes unnecessary pod restarts during temporary overload conditions, worsening the situation instead of shedding load gracefully.
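One way to keep the two probes distinct is to give them separate endpoints and failure thresholds; the paths and ports below are assumptions about the application.

```yaml
livenessProbe:           # restart only when the process is truly stuck
  httpGet:
    path: /livez         # assumed endpoint: cheap "process is alive" check, no dependency calls
    port: 8080
  periodSeconds: 10
  failureThreshold: 6    # tolerate ~60s of failures before restarting
readinessProbe:          # pull the pod from Service endpoints under overload
  httpGet:
    path: /readyz        # assumed endpoint: checks capacity and downstream dependencies
    port: 8080
  periodSeconds: 5
  failureThreshold: 2    # shed traffic quickly; the pod rejoins once it recovers
```

The asymmetry is deliberate: readiness fails fast to shed load, while liveness is slow to trigger so a temporarily overloaded pod is drained rather than killed.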
Using mutable tags like :latest or :stable in Deployment manifests means a pod restart or node replacement can silently pull a different image version than what was originally deployed, making debugging production incidents extremely difficult. Image digest pinning (e.g., nginx@sha256:abc123) guarantees every replica runs the exact same binary regardless of when or where it is scheduled.
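The difference shows up as a one-line change in the container spec (the digest below is the placeholder from the text, not a real digest):

```yaml
containers:
- name: web
  # mutable tag -- a restart or node replacement may silently pull a newer build:
  # image: nginx:latest
  # digest pin -- every replica runs the identical bytes wherever it is scheduled:
  image: nginx@sha256:abc123
```

In practice the digest is emitted by the CI pipeline at build time (e.g., from the registry push output) and templated into the manifest, so humans never copy digests by hand.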
Without PodDisruptionBudgets (PDBs), Kubernetes node drain operations during cluster upgrades or Cluster Autoscaler scale-down events can simultaneously evict all replicas of a Deployment, causing complete service unavailability. PDBs enforce a minimum availability guarantee during voluntary disruptions, ensuring the control plane respects your SLA requirements when making scheduling decisions.
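A minimal PDB protecting a Deployment might look like the following; the selector label and availability floor are illustrative.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-api-pdb
spec:
  minAvailable: 2              # could also be a percentage, e.g. "50%"
  selector:
    matchLabels:
      app: payment-api         # must match the Deployment's pod labels (assumed)
```

With this in place, a node drain evicts pods only as long as at least two matching replicas stay available; further evictions block until replacements become Ready elsewhere.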
Granting ClusterAdmin to CI/CD service accounts or developer namespaces is a common shortcut that creates serious security exposure, allowing a compromised pipeline to delete production workloads or exfiltrate secrets cluster-wide. Namespace-scoped Roles with only the specific verbs and resources required for each use case dramatically reduce the blast radius of credential compromise.
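A namespace-scoped alternative for a CI/CD service account might be sketched as follows; the namespace, account name, and verb list are assumptions to be narrowed to what the pipeline actually does.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer
  namespace: team-a                 # assumed tenant/team namespace
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "update", "patch"]   # only what a rolling deploy needs
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer
  namespace: team-a
subjects:
- kind: ServiceAccount
  name: ci-pipeline                 # assumed CI service account
  namespace: team-a
roleRef:
  kind: Role
  name: deployer
  apiGroup: rbac.authorization.k8s.io
```

If the pipeline's token leaks, the attacker can patch Deployments in team-a and nothing else: no secret reads, no cross-namespace access, no cluster-level deletes.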