Course recordings on DaDesktop for Training platform
Course outline: Building Microservices with Spring Boot, Docker, and Kubernetes (Course code: microsrvcspringboot)
Categories: Microservices · Spring Boot
Summary
Overview
This session is a technical course focused on Cloud-Native application development, centered on the Twelve-Factor App methodology and its implementation using Spring Boot, Docker, and Kubernetes. The instructor guides learners through core principles of modern cloud applications—including stateless processes, configuration management, dependency isolation, build/deploy/run separation, and observability—while demonstrating practical exercises using a sample music recommendation application (RP1 Music App). The latter portion of the session shifts to a separate meeting discussing PostgreSQL optimization, high availability configuration, and server dimensioning for a digital library platform, including requirements for joint client sessions and documentation.
Topic (Timeline)
1. Introduction to Cloud-Native Concepts and Persistent Storage [00:21:00 - 00:26:41]
- Introduced the concept of persistent storage in containers to survive restarts, emphasizing the use of volumes.
- Contrasted persistent volumes with ephemeral storage (e.g., `emptyDir`), explaining that ephemeral volumes use host memory or disk and lose their data once the Pod is removed (an `emptyDir` survives a container restart inside the same Pod, but not deletion or rescheduling of the Pod).
- Mentioned cloud storage providers (e.g., OpenStack, IBM) capable of offering ephemeral volumes.
- Clarified that data persistence requires explicit volume definition, not reliance on container filesystems.
2. The Twelve-Factor App Methodology [00:26:44 - 00:38:21]
- Introduced the Twelve-Factor App as the foundational methodology for Cloud-Native applications, emphasizing portability, scalability, and automation.
- Explained that Spring Boot and Spring Cloud evolved to align with these principles.
- Key principles covered:
  - Stateless applications: No in-memory session or state; state must be externalized (e.g., via Redis, databases).
  - Codebase: One codebase per application, version-controlled (Git), with one-to-one mapping between codebase and application.
  - Dependencies: Explicitly declared via package managers (Maven, npm, pip); dependencies must not be committed to source control.
  - Configuration: Stored in environment variables, never in code or config files; separation of code and config critical for multi-environment deployment.
  - Backing services: Treated as attached resources (e.g., databases, queues); connection details (URLs, credentials) are externalized via config.
  - Port binding: Applications bind to ports and expose services via those ports; no reliance on external web servers.
  - Concurrency: Scale out via multiple processes; each service scales independently based on workload (e.g., 10 web instances, 4 workers).
  - Disposability: Fast startup and graceful shutdown; processes are replaceable (“cattle, not pets”); minimized boot time critical (e.g., Quarkus vs. Spring Boot).
- Emphasized that these principles enable DevOps, CI/CD, microservices, and cloud portability.
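The configuration principle above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python rather than the course's Spring Boot code; the variable names `DATABASE_URL`, `REDIS_HOST`, and `PORT`, and the default port 8080, are assumptions chosen for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    redis_host: str
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build the config from environment variables, failing fast if one is missing."""
    # Hypothetical variable names; a real app defines its own.
    required = ("DATABASE_URL", "REDIS_HOST")
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(
            f"missing required environment variables: {', '.join(missing)}"
        )
    return AppConfig(
        database_url=env["DATABASE_URL"],
        redis_host=env["REDIS_HOST"],
        port=int(env.get("PORT", "8080")),  # optional, assumed default
    )
```

Passing the environment as a parameter keeps the function easy to test, and the same build can run in any environment by changing only the variables.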
3. Build, Release, Run Separation and CI/CD Pipeline [00:38:21 - 00:57:58]
- Defined the three distinct stages:
  - Build: Compile source + dependencies → generate immutable artifact (e.g., Docker image, JAR).
  - Release: Combine artifact with configuration → create release with unique identifier (e.g., Docker tag).
  - Run: Execute release on target platform (e.g., Kubernetes) with environment-specific config.
- Stressed that changes must never be made directly in production; all changes must go through the pipeline.
- Discussed tools: Jenkins (legacy), GitLab, GitHub Actions, Azure DevOps.
- Warned against modifying code in production (e.g., PHP edits) and emphasized rollback capability via versioned releases.
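As a rough sketch of the release stage, the snippet below models a release as an immutable artifact combined with configuration and derives a unique identifier from both. The `Release` class and the hash-based identifier are illustrative assumptions; the session itself only mentions Docker tags as identifiers.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Release:
    artifact: str              # immutable build output, e.g. an image digest
    config: Mapping[str, str]  # environment-specific settings

    @property
    def release_id(self) -> str:
        """Derive a stable identifier from the artifact plus its config."""
        payload = json.dumps(
            {"artifact": self.artifact, "config": dict(self.config)},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
```

The same artifact paired with different configuration yields a different release, which is what makes rolling back to an earlier release well defined.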
4. Process Model, Port Binding, and Scalability [00:57:58 - 01:08:55]
- Reinforced stateless processes: No local session storage; all state must be offloaded to backing services (e.g., Redis, databases).
- Explained port binding: Applications expose services via configurable ports; no hardcoded endpoints.
- Discussed horizontal vs. vertical scaling:
  - Horizontal: Scale the number of instances (Horizontal Pod Autoscaler, HPA, in Kubernetes).
  - Vertical: Scale the resources per instance (Vertical Pod Autoscaler, VPA, in Kubernetes).
- Used Amazon’s Black Friday infrastructure as an example: servers over-provisioned for peak load sat underutilized the rest of the year, which the instructor presented as the motivation behind AWS.
- Emphasized: Many small processes > few large ones for cost efficiency and elastic scaling.
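To make horizontal scaling concrete, here is a simplified sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The function name and parameters are hypothetical, and the real controller also applies a tolerance and stabilization windows that are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Simplified HPA-style calculation (no tolerance, no stabilization)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 instances running at twice the target load would be scaled to 8, provided the maximum allows it.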
5. Disposability, Parity, and Logs [01:08:55 - 01:24:51]
- Disposability: Applications must start in seconds (not minutes), shut down gracefully, and be replaceable. Cited startup times of 0.2 s for Quarkus versus 20 s for Spring Boot.
- Dev/Prod Parity: Minimize differences between environments. Use identical services, config, and tooling. Automate deployment pipelines to avoid “works on my machine” issues.
- Logs: Treat logs as event streams, not files. Output to stdout/stderr and let the platform (e.g., Kubernetes, ELK stack) handle collection, storage, and analysis. Developers should not manage log files.
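The logging principle can be illustrated with a small sketch that writes one JSON event per line to stdout and leaves collection to the platform. It uses Python's standard `logging` module; the `JsonFormatter` class, the `build_logger` helper, and the field names are assumptions for the example, not the course's Logstash setup.

```python
import json
import logging
import sys
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON event."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )


def build_logger(name: str, stream: TextIO = sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON events to the given stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]  # no file handlers: the platform owns storage
    return logger
```

The application never opens or rotates a log file; a collector attached to the container's stdout can forward the events to whatever backend is in use.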
6. Admin Processes and Additional Principles [01:24:51 - 01:27:49]
- Admin processes: Run one-off tasks (e.g., database migrations, batch jobs) as separate, disposable processes (e.g., Kubernetes Jobs), not embedded in main app.
- Additional modern principles:
  - API-first design: Define contracts before implementation; decouple services via standardized APIs (REST).
  - Telemetry: Collect metrics, traces, and logs for observability (e.g., OpenTelemetry).
  - Authentication & Authorization: Secure service-to-service communication (e.g., OAuth2, mTLS).
7. Spring Boot Alignment with Twelve-Factor Principles [01:27:49 - 01:48:07]
- Spring Boot automates configuration, dependency management, and embedded server deployment (Tomcat, Jetty).
- Key features aligning with Twelve-Factor:
  - Auto-configuration: Inspects the classpath and configures services (e.g., database, cache) automatically.
  - Profiles: Use `application.yml` + `spring.profiles.active` to manage environment-specific config.
  - Externalized config: Read from environment variables, Vault, or Config Server.
  - Actuator: Provides health, metrics, and shutdown endpoints for observability and management.
  - Build tooling: Maven/Gradle generate executable JARs; `pack` or Dockerfiles build images.
  - Stateless processes: Sessions can be stored externally (Redis, JDBC).
  - Graceful shutdown: Can be triggered via the Actuator shutdown endpoint.
- Emphasized: Spring Boot implements Twelve-Factor out-of-the-box, reducing boilerplate and enforcing best practices.
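The interplay of profiles and externalized config can be modelled roughly as a lookup through layered sources: environment variables override profile-specific values, which override the defaults. The sketch below is a simplified Python model of that precedence, not Spring Boot's implementation; `env_name` and `resolve` are hypothetical helpers, and the name mapping follows the documented rule for environment variables (dots become underscores, dashes are removed, letters are upper-cased).

```python
from typing import Mapping


def env_name(prop: str) -> str:
    """Map a property key such as 'server.port' to 'SERVER_PORT'."""
    return prop.replace(".", "_").replace("-", "").upper()


def resolve(
    prop: str,
    defaults: Mapping[str, str],
    profile_overrides: Mapping[str, str],
    env: Mapping[str, str],
) -> str:
    """Look up a property, highest-precedence source first."""
    key = env_name(prop)
    if key in env:                 # environment variable wins
        return env[key]
    if prop in profile_overrides:  # then the active profile's file
        return profile_overrides[prop]
    if prop in defaults:           # then the base application.yml
        return defaults[prop]
    raise KeyError(prop)
```

With this layering, the packaged defaults stay the same across environments and only the deployment supplies what differs.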
8. RP1 Music App Refactoring Exercise [01:48:07 - 02:03:37]
- Walkthrough of refactoring a legacy music app to comply with Twelve-Factor:
  - Docker setup: Create network, deploy MySQL, Neo4j, Redis via `docker-compose`.
  - API-first: Build REST API for recommendation engine using Spring Initializr; expose via Swagger (OpenAPI).
  - Dependency management: Add `spring-boot-starter-data-neo4j`, `spring-boot-starter-data-redis`, `spring-boot-starter-actuator`.
  - Configuration: Use environment variables for DB URLs, Redis host, etc.
  - Logging: Configure app to output logs to stdout; integrate with Logstash + Kibana for centralized logging.
  - Disposability: Measure startup time; use GraalVM for native image optimization.
  - Scalability: Manually scale app instances; use HAProxy for load balancing.
  - Telemetry: Enable actuator endpoints (`/health`, `/metrics`) to monitor app state.
- Noted: Exercise uses outdated Spring Boot version (3.4.10) and requires manual fixes (e.g., Neo4j version pinning, Dockerfile adjustments).
9. PostgreSQL Optimization and High Availability Meeting [02:03:37 - 05:26:08]
- Context: Client (digital library platform) uses a 20GB PostgreSQL database serving three web applications.
- Optimization (60 hours):
  - Required: OS-level tuning (kernel parameters, user limits), query optimization, use of `pg_tune`.
  - Client expects all sessions to be live/online, not recorded; team (Rosman, Carla) must lead sessions.
- High Availability (54 hours):
  - Client now requires implementation of HA (previously assumed client would handle it).
  - Current setup: PostgreSQL in Docker; need to configure streaming replication, failover, monitoring.
  - Must build visual dashboards (e.g., Grafana) for client visibility.
- Server Dimensioning (36 hours):
  - Three databases to be split across 9 servers:
    - One 2 TB DB → split across 5 servers.
    - One 850 GB DB → split across 3 servers.
    - One 700 GB DB → single server.
  - Hardware specs: CPU, RAM, SSD/NVMe based on query volume and data size.
- Client Requirements:
  - All work must be done in joint sessions (not solo); recordings required.
  - Documentation must include OS tuning, DB config, and architecture diagrams.
  - Team must improve client communication (Rosman needs to be more assertive).
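As a back-of-the-envelope check on the server dimensioning figures, the sketch below computes the data volume per server. It assumes an even split of each database across its servers and decimal units (1 TB = 1000 GB). The notes do not say whether "split" means sharding or replicas, so this is only a starting point for sizing, and the function name is hypothetical.

```python
def per_server_gb(total_gb: float, servers: int) -> float:
    """Data per server in GB, assuming an even split (an assumption, not from the notes)."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return total_gb / servers


# (database size in GB, number of servers) as listed in the meeting notes
plan = {
    "2 TB database": (2000, 5),
    "850 GB database": (850, 3),
    "700 GB database": (700, 1),
}

for name, (size_gb, servers) in plan.items():
    print(f"{name}: {per_server_gb(size_gb, servers):.1f} GB per server")
```

Under these assumptions the three databases come to 400 GB, roughly 283 GB, and 700 GB per server, before adding headroom for indexes, WAL, and growth.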
10. Technical Troubleshooting and Final Setup [05:26:08 - 06:18:51]
- Debugged Kibana/Logstash integration issues:
  - Logstash was binding to port 50000, but Kibana was configured for 5000 → corrected to 50000.
  - Docker volume conflicts caused stale network references → resolved by removing volumes and containers.
- Confirmed successful log ingestion into Kibana after fixing port and cleaning Docker state.
- Finalized: RP1 Music App logs visible in Kibana; all Twelve-Factor components implemented.
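A port mismatch like the one above can be spotted quickly by probing which port actually accepts connections. This is a minimal sketch using Python's standard `socket` module; the helper name `is_port_open` is an assumption for the example, and the host and ports to probe depend on the local Docker setup.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For instance, comparing `is_port_open("localhost", 50000)` with `is_port_open("localhost", 5000)` would show which of the two ports the log input is really listening on.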
Appendix
Key Principles (Twelve-Factor App)
- Codebase: One codebase per app, version-controlled (Git), one-to-one mapping.
- Dependencies: Explicit, isolated, managed via package managers (Maven, pip, npm); never committed.
- Config: Stored in environment variables; never in code or config files.
- Backing Services: Treated as attached resources; connection details via config (URLs, credentials).
- Build, Release, Run: Strict separation; immutable artifacts, versioned releases.
- Processes: Stateless; no session state; use external storage (Redis, DB).
- Port Binding: Self-contained; expose services via port binding (no external web servers).
- Concurrency: Scale out via multiple processes; each service scales independently.
- Disposability: Fast startup, graceful shutdown; replaceable (“cattle, not pets”).
- Dev/Prod Parity: Minimize differences; automate deployment pipeline.
- Logs: Output to stdout/stderr; platform handles collection and storage.
- Admin Processes: Run one-off tasks as disposable jobs (e.g., Kubernetes Jobs).
Tools & Technologies
- Containerization: Docker, `docker-compose`
- Orchestration: Kubernetes (HPA, VPA, Jobs)
- CI/CD: GitHub Actions, GitLab CI, Azure DevOps, Jenkins (legacy)
- Observability: Kibana, Logstash, OpenTelemetry, Spring Boot Actuator
- Package Managers: Maven (Java), pip (Python), npm (Node.js), Composer (PHP)
- Caching/State: Redis, Memcached, PostgreSQL
- Frameworks and runtimes: Spring Boot, Spring Cloud, Quarkus, GraalVM
Common Pitfalls
- Hardcoding config (e.g., DB URLs) in code → violates Twelve-Factor.
- Committing dependencies to source control → breaks portability.
- Modifying code in production → breaks CI/CD and traceability.
- Using `emptyDir` for persistent data → data lost when the Pod is removed.
- Not using environment variables for secrets → security risk (e.g., passwords in GitHub).
- Long startup times → hinders scalability and disposability.
Practice Suggestions
- Refactor legacy apps using the Twelve-Factor checklist.
- Use Spring Initializr to generate new projects with correct defaults.
- Use `application.yml` + profiles for environment config, with environment variables overriding deployment-specific values.
- Integrate logging to stdout and centralize with ELK or OpenTelemetry.
- Test scalability manually: `docker run` multiple instances + HAProxy.
- Use `docker system prune` regularly to clean stale containers and networks (add `--volumes` to also remove unused volumes).
- Never run admin tasks (migrations, batch jobs) inside main app process.