Summary

Overview

This course session provides an in-depth, hands-on exploration of Kubernetes core concepts, focusing on pod management, deployments, health checks, labels, selectors, and networking services. The trainer guides learners through practical lab exercises involving pod creation, troubleshooting, rolling updates, replica sets, liveness/readiness probes, and service types (ClusterIP, NodePort, LoadBalancer). The session emphasizes enterprise-grade practices such as using deployments over standalone pods, implementing health probes for reliability, and leveraging labels for resource orchestration. The content culminates in an introduction to Kubernetes networking and service abstractions to ensure stable application access despite dynamic pod IPs.

Topic (Timeline)

1. Pod Creation, Troubleshooting, and Volume Sharing [00:00:07 - 00:09:05]

  • Demonstrated basic kubectl commands: kubectl get pods, kubectl describe pod <name>, and kubectl exec -it <pod> -- bash to access containers.
  • Troubleshooting workflow: Identified pod creation failures due to incorrect YAML filename (prod.yaml vs pod.yaml), corrected command syntax (-f flag).
  • Explored volume sharing between containers in a single pod using an emptyDir volume (a minimal manifest sketch follows this list):
    • First container mounts the volume at /usr/share/nginx/html.
    • Second container mounts same volume at /html.
    • Files written in either container are synchronized across both via the shared volume.
  • Located the volume's backing directory on the worker node under /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<volume-name>.
  • Highlighted that deleting the pod removes the emptyDir content, confirming ephemeral nature.
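
  A minimal manifest sketch of the shared-volume pod described above, assuming an nginx container plus a busybox sidecar (pod, container, and volume names here are illustrative, not necessarily those used in the session):

    apiVersion: v1
    kind: Pod
    metadata:
      name: shared-volume-pod          # illustrative name
    spec:
      volumes:
        - name: shared-html
          emptyDir: {}                 # ephemeral; removed when the pod is deleted
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: shared-html
              mountPath: /usr/share/nginx/html   # first container's mount point
        - name: sidecar
          image: busybox
          command: ["sh", "-c", "sleep 3600"]    # keep the sidecar running
          volumeMounts:
            - name: shared-html
              mountPath: /html                   # second container sees the same files

  A file written under /html in the sidecar is immediately visible to nginx at /usr/share/nginx/html, which is what the lab verified.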

2. Deployment Fundamentals and Replica Sets [00:09:05 - 00:22:01]

  • Contrasted standalone pods with deployments: standalone pods lack scalability, self-healing, and update capabilities.
  • Introduced ReplicaSet as the controller ensuring desired pod count; deployment manages ReplicaSet.
  • Explained that deployments enable:
    • Scaling up/down (e.g., 2 → 5 pods).
    • Automatic pod recovery on failure.
    • Versioned updates via rolling updates.
  • Emphasized that deployments are the standard enterprise method for pod management.

3. Labels, Selectors, and Filtering [00:22:01 - 00:33:32]

  • Defined labels as key-value pairs (e.g., app=blue, env=prod) attached to pods, nodes, or other resources.
  • Demonstrated label operations:
    • Adding labels: kubectl label pod <name> app=blue
    • Viewing labels: kubectl get pods --show-labels
    • Filtering: kubectl get pods -l app=blue, kubectl get pods -l 'app!=blue'
    • Removing labels: kubectl label pod <name> app-
  • Explained selectors: how deployments and services use label selectors to target specific pods (a minimal example is sketched below).
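
  A sketch of how the labels above pair with a selector, using a Service as the consumer (names and image are assumptions, not the exact lab values):

    apiVersion: v1
    kind: Pod
    metadata:
      name: blue-pod
      labels:
        app: blue          # kubectl label pod blue-pod app=blue has the same effect
        env: prod
    spec:
      containers:
        - name: web
          image: nginx
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: blue-svc
    spec:
      selector:
        app: blue          # the Service targets every pod carrying this label
      ports:
        - port: 80
          targetPort: 80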

4. Deployment Architecture and Lifecycle [00:33:32 - 00:42:29]

  • Detailed the deployment control flow:
    1. kubectl apply → API Server
    2. Controller Manager → creates ReplicaSet
    3. Scheduler → assigns pod to worker node
    4. Kubelet → creates container on node
  • Discussed network and storage issues encountered during lab execution (e.g., worker node storage limits, CNI plugin conflicts, firewall rules).
  • Noted intermittent connectivity errors during pod creation, attributed to backend infrastructure (AWS node storage, security groups, CNI conflicts).

5. YAML-Based Deployments and Rolling Updates [00:52:09 - 01:03:46]

  • Created a deployment from a YAML manifest with apiVersion: apps/v1, kind: Deployment, replicas: 3, and a container image (a sketch of the manifest follows this list).
  • Applied YAML: kubectl apply -f deploy.yaml → created deployment → ReplicaSet → 3 pods.
  • Demonstrated rolling update:
    • Updated image version: kubectl set image deployment/<name> <container>=<new-image>
    • Observed new ReplicaSet created, old one scaled to zero.
    • Verified image version change via kubectl describe pod.
  • Showed rollback:
    • kubectl rollout undo deployment/<name> → reverted to previous version.
    • kubectl rollout history deployment/<name> → viewed revision history.
    • Rollback to specific revision: kubectl rollout undo deployment/<name> --to-revision=2.
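
  A sketch of the deploy.yaml described above, assuming nginx as the container image (the exact image and names from the session may differ); the rollout commands from the lab appear as comments:

    # kubectl apply -f deploy.yaml
    # kubectl set image deployment/web-deploy web=nginx:1.25   # triggers a rolling update
    # kubectl rollout history deployment/web-deploy
    # kubectl rollout undo deployment/web-deploy --to-revision=2
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-deploy               # illustrative name
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web                 # must match the selector above
        spec:
          containers:
            - name: web
              image: nginx:1.24      # version later replaced during the rolling update
              ports:
                - containerPort: 80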

6. Deployment Update Strategies and Best Practices [01:03:46 - 01:12:31]

  • Compared deployment update strategies:
    • Recreate: Terminate all old pods before creating new ones → downtime.
    • Rolling Update (default): Gradual replacement (e.g., 25% at a time) → zero downtime (see the strategy snippet after this list).
    • Blue-Green: Two identical environments; switch traffic after validation → high cost.
    • Canary: Route small % of traffic to new version → requires service mesh (Istio/Linkerd).
    • A/B Testing: Route traffic based on user attributes (location/device).
    • Shadow: Mirror traffic to new version without exposing to users → for testing.
  • Best practices:
    • Scan container images for vulnerabilities.
    • Use only authorized/private images.
    • Avoid direct worker node access; manage via master.
    • Apply network policies and audit logs.
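
  A sketch of how the first two strategies are declared in a Deployment spec; 25% is the Kubernetes default for both rolling-update fields:

    spec:
      strategy:
        type: RollingUpdate          # or: Recreate (terminate everything first → downtime)
        rollingUpdate:
          maxUnavailable: 25%        # at most a quarter of desired pods down at once
          maxSurge: 25%              # at most a quarter of extra pods created during the update

  This snippet slots into a Deployment manifest like the one sketched in the previous section; Blue-Green, Canary, A/B, and Shadow are implemented outside the Deployment object (e.g., via a service mesh or traffic routing layer).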

7. Health Probes: Liveness, Readiness, and Startup [01:12:31 - 01:32:46]

  • Defined probes:
    • Liveness Probe: Restarts the container if the app hangs, i.e., when a check such as an HTTP GET on / stops returning 200.
    • Readiness Probe: Determines if pod can receive traffic (e.g., /health endpoint).
    • Startup Probe: Delays liveness/readiness checks until app initializes.
  • Configured probes in YAML (the full probe block is sketched after this list):
    • HTTP probe: httpGet.path: /index.html, port: 80, initialDelaySeconds: 15, periodSeconds: 5, failureThreshold: 3.
  • Demonstrated failure:
    • Used a non-existent path (/example.txt) → probe fails → container restarts repeatedly → CrashLoopBackOff.
  • TCP probe example: Tested on port 8080 (success) vs 8090 (failure).
  • Observed restart count in kubectl get pods and logs via kubectl logs <pod>.
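
  A sketch of the probe configuration described above, combining the HTTP liveness probe from the lab with a TCP readiness probe (the image and ports are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: probe-demo
    spec:
      containers:
        - name: web
          image: nginx
          livenessProbe:
            httpGet:
              path: /index.html      # switching this to /example.txt reproduces the failure demo
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 5
            failureThreshold: 3      # restart after 3 consecutive failures → CrashLoopBackOff
          readinessProbe:
            tcpSocket:
              port: 80               # a closed port (e.g., 8090) keeps the pod out of Service endpoints
            initialDelaySeconds: 5
            periodSeconds: 10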

8. CronJobs for Scheduled Tasks [01:34:03 - 01:37:21]

  • Introduced CronJob as Kubernetes equivalent of Linux cron.
  • Created a YAML manifest with schedule: "* * * * *" → runs a job every minute (manifest sketched after this list).
  • Job runs a container that prints “Hello from cluster” to stdout.
  • Observed job creation → pod execution → completion → automatic cleanup.
  • Use cases: backups, cleanup scripts, periodic data sync.
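
  A minimal sketch of the CronJob described above (object and container names are illustrative):

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: hello-cron
    spec:
      schedule: "* * * * *"          # every minute
      jobTemplate:
        spec:
          template:
            spec:
              containers:
                - name: hello
                  image: busybox
                  command: ["sh", "-c", "echo Hello from cluster"]
              restartPolicy: OnFailure   # Job pods must not default to Always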

9. Kubernetes Networking and Services [01:41:26 - 01:51:11]

  • Explained dynamic pod IPs → necessitates Services for stable access.
  • Defined Service types:
    • ClusterIP (default): Internal cluster IP; accessible only within cluster.
    • NodePort: Exposes service on static port (30000–32767) on all nodes → accessible via <node-ip>:<port>.
    • LoadBalancer: External cloud load balancer (e.g., AWS ELB); provisioned automatically.
  • Demonstrated service-to-pod mapping via label selectors (app=blue); a NodePort example is sketched after this list.
  • Clarified role of kube-proxy: manages iptables/IPVS rules to route traffic from Service to pods.
  • Noted that LoadBalancer and NodePort are for external access; ClusterIP for internal.
  • Mentioned Ingress controllers as next topic (to be covered later).
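
  A sketch of a NodePort Service selecting the app=blue pods from the earlier label exercise (the port numbers are illustrative; nodePort can be omitted and auto-assigned from 30000–32767):

    apiVersion: v1
    kind: Service
    metadata:
      name: blue-nodeport
    spec:
      type: NodePort               # omit type (ClusterIP) for internal-only access
      selector:
        app: blue                  # kube-proxy routes Service traffic to pods with this label
      ports:
        - port: 80                 # ClusterIP port inside the cluster
          targetPort: 80           # container port on the backing pods
          nodePort: 30080          # reachable externally at <node-ip>:30080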

Appendix

Key Principles

  • Deployments > Pods: Always use Deployments for production workloads to enable scaling, rolling updates, and self-healing.
  • Labels & Selectors: Essential for organizing and targeting resources (pods, services, deployments).
  • Ephemeral Storage: emptyDir volumes are tied to pod lifecycle; data is lost on pod deletion.
  • Health Probes: Critical for reliability; use liveness for restarts, readiness for traffic routing, startup for slow apps.
  • Services as Abstraction: Services provide stable endpoints despite changing pod IPs.

Tools Used

  • kubectl get pods, describe pod, exec, logs
  • kubectl apply -f, kubectl set image, kubectl rollout undo/history
  • kubectl label, kubectl scale
  • kubectl get services, kubectl get deployments, kubectl get cronjobs

Common Pitfalls

  • Incorrect YAML filename (prod.yaml instead of pod.yaml) → kubectl apply -f fails with a file-not-found error.
  • Using non-existent paths in HTTP probes → CrashLoopBackOff.
  • Misconfigured security groups or storage limits on worker nodes → pod creation failures.
  • Assuming static IPs for pods → always use Services for stable access.

Practice Suggestions

  • Recreate all labs manually: create pod → troubleshoot → convert to deployment → update image → rollback → add probes → expose via NodePort.
  • Experiment with different probe configurations (HTTP, TCP, exec).
  • Practice label filtering with multiple labels (app=frontend,env=prod).
  • Try creating a CronJob that backs up a file to a persistent volume.
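
  For the last suggestion, a hedged starting point, assuming a PersistentVolumeClaim named backup-pvc already exists (all names here are hypothetical); it writes a timestamped file to the mounted volume each run, which you can replace with a real backup command:

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: file-backup
    spec:
      schedule: "0 * * * *"          # hourly
      jobTemplate:
        spec:
          template:
            spec:
              containers:
                - name: backup
                  image: busybox
                  command: ["sh", "-c", "date > /backup/backup-$(date +%s).txt"]
                  volumeMounts:
                    - name: backup-vol
                      mountPath: /backup
              restartPolicy: OnFailure
              volumes:
                - name: backup-vol
                  persistentVolumeClaim:
                    claimName: backup-pvc   # hypothetical pre-existing PVC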