Cloud Desktop Teaching Platforms

This session is a technical course focused on Cloud-Native application development, centered on the Twelve-Factor App methodology and its implementation using Spring Boot, Docker, and Kubernetes. The instructor guides learners through core principles of modern cloud applications—including stateless processes, configuration management, dependency isolation, build/deploy/run separation, and observability—while demonstrating practical exercises using a sample music recommendation application (RP1 Music App). The latter portion of the session shifts to a separate meeting discussing PostgreSQL optimization, high availability configuration, and server dimensioning for a digital library platform, including requirements for joint client sessions and documentation.

Topic (Timeline)

1. Introduction to Cloud-Native Concepts and Persistent Storage [00:21:00 - 00:26:41]

Introduced the concept of persistent storage in containers to survive restarts, emphasizing the use of volumes.
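Contrasted persistent volumes with ephemeral storage (e.g., emptyDir), explaining that ephemeral volumes use host disk or memory and lose their data when the Pod is removed; an emptyDir survives a container restart inside the same Pod, but not Pod deletion or rescheduling.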
Mentioned cloud storage providers (e.g., OpenStack, IBM) capable of offering ephemeral volumes.
Clarified that data persistence requires explicit volume definition, not reliance on container filesystems.

2. The Twelve-Factor App Methodology [00:26:44 - 00:38:21]

Introduced the Twelve-Factor App as the foundational methodology for Cloud-Native applications, emphasizing portability, scalability, and automation.
Explained that Spring Boot and Spring Cloud evolved to align with these principles.
Key principles covered:
- Stateless applications: No in-memory session or state; state must be externalized (e.g., via Redis, databases).
- Codebase: One codebase per application, version-controlled (Git), with one-to-one mapping between codebase and application.
- Dependencies: Explicitly declared via package managers (Maven, npm, pip); dependencies must not be committed to source control.
- Configuration: Stored in the environment (environment variables), never hardcoded in code or in config files baked into the build; separating code from config is critical for multi-environment deployment.
- Backing services: Treated as attached resources (e.g., databases, queues); connection details (URLs, credentials) are externalized via config.
- Port binding: Applications bind to ports and expose services via those ports; no reliance on external web servers.
- Concurrency: Scale out via multiple processes; each service scales independently based on workload (e.g., 10 web instances, 4 workers).
- Disposability: Fast startup and graceful shutdown; processes are replaceable (“cattle, not pets”); minimized boot time critical (e.g., Quarkus vs. Spring Boot).
Emphasized that these principles enable DevOps, CI/CD, microservices, and cloud portability.
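The configuration and backing-services principles above can be sketched in a few lines. This is a minimal, language-agnostic illustration in Python rather than the Spring Boot code used in the session; the variable names DB_URL, REDIS_HOST, and PORT and the AppConfig class are assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Settings the app needs at runtime; all names here are hypothetical."""
    db_url: str
    redis_host: str
    http_port: int


def load_config(env=None):
    """Build the config from environment variables, failing fast if one is missing."""
    env = os.environ if env is None else env
    missing = [key for key in ("DB_URL", "REDIS_HOST") if key not in env]
    if missing:
        raise RuntimeError(f"missing required environment variables: {missing}")
    return AppConfig(
        db_url=env["DB_URL"],
        redis_host=env["REDIS_HOST"],
        # Port binding: the port is configurable, with a default for local runs.
        http_port=int(env.get("PORT", "8080")),
    )


if __name__ == "__main__":
    demo_env = {"DB_URL": "postgresql://db.example.internal/music", "REDIS_HOST": "redis"}
    print(load_config(demo_env))
```

Because the same code reads different values in each environment, swapping a database or cache becomes a config change rather than a code change.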

3. Build, Release, Run Separation and CI/CD Pipeline [00:38:21 - 00:57:58]

Defined the three distinct stages:
- Build: Compile source + dependencies → generate immutable artifact (e.g., Docker image, JAR).
- Release: Combine artifact with configuration → create release with unique identifier (e.g., Docker tag).
- Run: Execute release on target platform (e.g., Kubernetes) with environment-specific config.
Stressed that changes must never be made directly in production; all changes must go through the pipeline.
Discussed tools: Jenkins (legacy), GitLab, GitHub Actions, Azure DevOps.
Warned against modifying code in production (e.g., PHP edits) and emphasized rollback capability via versioned releases.
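The three stages can be modeled as plain functions to make the separation concrete. This is only a sketch under stated assumptions: the function names, the use of SHA-256 digests as identifiers, and the dictionary layout are all hypothetical and are not how Docker or any CI tool actually tags releases.

```python
import hashlib
import json


def build(source_files):
    """Build stage: turn sources into an immutable artifact, identified by its content digest."""
    digest = hashlib.sha256()
    for name in sorted(source_files):
        digest.update(name.encode())
        digest.update(source_files[name].encode())
    return "sha256:" + digest.hexdigest()


def release(artifact_digest, config):
    """Release stage: pair one artifact with one config under a unique, reproducible ID."""
    payload = json.dumps({"artifact": artifact_digest, "config": config}, sort_keys=True)
    release_id = hashlib.sha256(payload.encode()).hexdigest()[:12]
    return {"id": release_id, "artifact": artifact_digest, "config": dict(config)}


def run(rel):
    """Run stage: start the release as-is; nothing is edited at this point."""
    return f"running release {rel['id']} with artifact {rel['artifact'][:19]}"


if __name__ == "__main__":
    artifact = build({"App.java": "class App {}", "pom.xml": "<project/>"})
    staging = release(artifact, {"DB_URL": "staging-db"})
    production = release(artifact, {"DB_URL": "prod-db"})
    print(run(staging))
    print(run(production))
```

The same artifact yields two distinct releases, one per configuration, and rolling back means running an earlier release ID again rather than editing anything in place.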

4. Process Model, Port Binding, and Scalability [00:57:58 - 01:08:55]

Reinforced stateless processes: No local session storage; all state must be offloaded to backing services (e.g., Redis, databases).
Explained port binding: Applications expose services via configurable ports; no hardcoded endpoints.
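A small sketch of what stateless processes mean in practice: two application instances share nothing except an external store. InMemoryStore is a hypothetical stand-in for a service such as Redis, and the class and method names are assumptions for illustration.

```python
class InMemoryStore:
    """Stand-in for an external store such as Redis; hypothetical, for illustration only."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class AppInstance:
    """One stateless process: it holds a reference to the store, but no session data."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle_play(self, session_id, track):
        history = self.store.get(session_id, [])
        history = history + [track]  # build a new list, then write it back
        self.store.set(session_id, history)
        return {"served_by": self.name, "history": history}


if __name__ == "__main__":
    shared = InMemoryStore()
    web1, web2 = AppInstance("web-1", shared), AppInstance("web-2", shared)
    web1.handle_play("s1", "track-1")
    print(web2.handle_play("s1", "track-2"))  # web-2 sees what web-1 stored
```

Since neither instance keeps the session itself, any request can be routed to any instance, and an instance can be replaced without losing user state.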
Discussed horizontal vs. vertical scaling:
- Horizontal: Scale number of instances (HPA in Kubernetes).
- Vertical: Scale resources per instance (VPA in Kubernetes).
Used Amazon’s Black Friday infrastructure as an example: Over-provisioned servers for peak load, underutilized otherwise → led to AWS.
Emphasized: Many small processes > few large ones for cost efficiency and elastic scaling.
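For horizontal scaling, the Kubernetes documentation describes the HPA's core calculation as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below implements only that formula plus min/max clamping; the real controller also applies a tolerance band and stabilization windows, which are omitted here, and the function name and defaults are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to how far the metric is from its target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as an autoscaler spec would.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 instances at 90% average CPU with a 60% target
    print(desired_replicas(4, 90, 60))
```

With 4 instances at 90% CPU against a 60% target, the formula asks for 6 instances; at 30% it would shrink to 2.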

5. Disposability, Parity, and Logs [01:08:55 - 01:24:51]

Disposability: Applications must start in seconds (not minutes), shut down gracefully, and be replaceable. Highlighted Quarkus (0.2s startup) vs. Spring Boot (20s).
Dev/Prod Parity: Minimize differences between environments. Use identical services, config, and tooling. Automate deployment pipelines to avoid “works on my machine” issues.
Logs: Treat logs as event streams, not files. Output to stdout/stderr; let the platform (e.g., Kubernetes, ELK stack) handle collection, storage, and analysis. Developers should not manage log files.
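The logging principle can be illustrated with a logger that emits one JSON event per line to stdout and never opens a file. This is a Python sketch, not the Spring Boot or Logstash configuration from the session; JsonLineFormatter and make_logger are hypothetical names.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Format each record as one JSON object per line (one event per line)."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_logger(name, stream=None):
    """Return a logger that writes events to a stream (stdout by default), never to a file."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    log = make_logger("music-app")
    log.info("recommendation served for user %s", "u42")
```

The process only writes the stream; routing it to a collector is left to the platform that runs the container.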

6. Admin Processes and Additional Principles [01:24:51 - 01:27:49]

Admin processes: Run one-off tasks (e.g., database migrations, batch jobs) as separate, disposable processes (e.g., Kubernetes Jobs), not embedded in main app.
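A one-off admin task such as a database migration can be written as a small script that runs, finishes, and exits, separate from the long-running app. The sketch below uses SQLite from the Python standard library purely for illustration; the table names, the schema_version bookkeeping table, and the MIGRATIONS list are assumptions, not the session's actual schema.

```python
import sqlite3

# Ordered, numbered migrations; the statements are hypothetical examples.
MIGRATIONS = [
    (1, "CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"),
    (2, "ALTER TABLE tracks ADD COLUMN artist TEXT"),
]


def migrate(conn):
    """Apply pending migrations in order; safe to re-run. Returns the versions applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    applied = []
    for version, statement in MIGRATIONS:
        if version in done:
            continue
        conn.execute(statement)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        applied.append(version)
    conn.commit()
    return applied


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(migrate(connection))  # first run applies everything
    print(migrate(connection))  # second run finds nothing to do
    connection.close()
```

Making the task idempotent means a job runner can retry it safely if the first attempt is interrupted.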
Additional modern principles:
- API-first design: Define contracts before implementation; decouple services via standardized APIs (REST).
- Telemetry: Collect metrics, traces, and logs for observability (e.g., OpenTelemetry).
- Authentication & Authorization: Secure service-to-service communication (e.g., OAuth2, mTLS).

7. Spring Boot Alignment with Twelve-Factor Principles [01:27:49 - 01:48:07]

Spring Boot automates configuration, dependency management, and embedded server deployment (Tomcat, Jetty).
Key features aligning with Twelve-Factor:
- Auto-configuration: Detects classpath and configures services (e.g., database, cache) automatically.
- Profiles: Use application.yml + spring.profiles.active to manage environment-specific config.
- Externalized config: Read from environment variables, Vault, or Config Server.
- Actuator: Provides health, metrics, and shutdown endpoints for observability and management.
- Build tooling: Maven/Gradle generate executable JARs; Cloud Native Buildpacks (the pack CLI) or Dockerfiles build images.
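- Stateless-friendly: With Spring Session, HTTP sessions can be stored externally (Redis, JDBC) instead of in process memory.
- Graceful shutdown: Configured with server.shutdown=graceful; the Actuator shutdown endpoint can trigger a shutdown remotely when enabled.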
Emphasized: Spring Boot supports most Twelve-Factor practices out of the box, reducing boilerplate and encouraging best practices.
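Externalized config in Spring Boot relies on a naming rule that maps a property to an environment variable: replace dots with underscores, remove dashes, and convert to uppercase. The Python sketch below reproduces that rule and shows an environment value taking precedence over a file default; the resolve helper is a simplified, hypothetical model and ignores list indices and the rest of Spring's property-source ordering.

```python
def env_var_name(property_name):
    """Map a canonical property name to its environment variable form."""
    return property_name.replace(".", "_").replace("-", "").upper()


def resolve(property_name, file_defaults, env):
    """Return the environment value if present, otherwise the file default (or None)."""
    key = env_var_name(property_name)
    if key in env:
        return env[key]
    return file_defaults.get(property_name)


if __name__ == "__main__":
    defaults = {"spring.datasource.url": "jdbc:mysql://localhost/music"}
    environment = {"SPRING_DATASOURCE_URL": "jdbc:mysql://db.prod.example/music"}
    print(env_var_name("spring.main.log-startup-info"))
    print(resolve("spring.datasource.url", defaults, environment))
```

This is why the same JAR or image can be promoted across environments: only the variables set by the platform differ.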

8. RP1 Music App Refactoring Exercise [01:48:07 - 02:03:37]

Walkthrough of refactoring a legacy music app to comply with Twelve-Factor:
- Docker setup: Create network, deploy MySQL, Neo4j, Redis via docker-compose.
- API-first: Build REST API for recommendation engine using Spring Initializr; expose via Swagger (OpenAPI).
- Dependency management: Add spring-boot-starter-data-neo4j, spring-boot-starter-data-redis, spring-boot-starter-actuator.
- Configuration: Use environment variables for DB URLs, Redis host, etc.
- Logging: Configure app to output logs to stdout; integrate with Logstash + Kibana for centralized logging.
- Disposability: Measure startup time; use GraalVM for native image optimization.
- Scalability: Manually scale app instances; use HAProxy for load balancing.
- Telemetry: Enable actuator endpoints (/health, /metrics) to monitor app state.
Noted: Exercise uses outdated Spring Boot version (3.4.10) and requires manual fixes (e.g., Neo4j version pinning, Dockerfile adjustments).
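The manual scaling step works because a load balancer spreads requests over identical instances. The sketch below shows round-robin selection, one of the balancing algorithms HAProxy offers, in plain Python; the backend addresses and the RoundRobinBalancer class are hypothetical and do not reflect the exercise's actual HAProxy configuration.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        return next(self._cycle)


if __name__ == "__main__":
    # Hypothetical addresses for three instances of the same app.
    balancer = RoundRobinBalancer(["app-1:8080", "app-2:8080", "app-3:8080"])
    for _ in range(4):
        print(balancer.next_backend())
```

Rotation like this only gives correct results when the instances are stateless, which is why the session state was moved to Redis first.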

9. PostgreSQL Optimization and High Availability Meeting [02:03:37 - 05:26:08]

Context: Client (digital library platform) uses a 20GB PostgreSQL database serving three web applications.
Optimization (60 hours):
- Required: OS-level tuning (kernel parameters, user limits), query optimization, use of PGTune.
- Client expects all sessions to be delivered live/online, not pre-recorded; team (Rosman, Carla) must lead sessions.
High Availability (54 hours):
- Client now requires implementation of HA (previously assumed client would handle it).
- Current setup: PostgreSQL in Docker; need to configure streaming replication, failover, monitoring.
- Must build visual dashboards (e.g., Grafana) for client visibility.
Server Dimensioning (36 hours):
- Three databases to be split across 9 servers:
  - One 2TB DB → split across 5 servers.
  - One 850GB DB → split across 3 servers.
  - One 700GB DB → single server.
- Hardware specs: CPU, RAM, SSD/NVMe based on query volume and data size.
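A rough starting point for the storage side of dimensioning is the data volume each server would hold. The sketch below assumes the split means evenly partitioning each database across its servers (if the extra servers are replicas instead, each would hold the full size), takes 2TB as 2000 GB, and uses hypothetical database names and a hypothetical headroom factor.

```python
import math

# Sizes and server counts from the meeting notes; the names are placeholders.
DATABASES_GB = {"db_a": 2000, "db_b": 850, "db_c": 700}
SERVERS = {"db_a": 5, "db_b": 3, "db_c": 1}


def per_server_gb(size_gb, servers, headroom=0.0):
    """Data per server for an even split, rounded up, with optional growth headroom."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return math.ceil(size_gb / servers * (1 + headroom))


if __name__ == "__main__":
    for name, size in DATABASES_GB.items():
        print(name, per_server_gb(size, SERVERS[name]), "GB per server")
    print("total servers:", sum(SERVERS.values()))
```

CPU and RAM sizing would still depend on query volume, which these notes do not quantify.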
Client Requirements:
- All work must be done in joint sessions (not solo); recordings required.
- Documentation must include OS tuning, DB config, and architecture diagrams.
- Team must improve client communication (Rosman needs to be more assertive).

10. Technical Troubleshooting and Final Setup [05:26:08 - 06:18:51]

Debugged Kibana/Logstash integration issues:
- Logstash was binding to port 50000, but Kibana was configured for 5000 → corrected to 50000.
- Docker volume conflicts caused stale network references → resolved by removing volumes and containers.
Confirmed successful log ingestion into Kibana after fixing port and cleaning Docker state.
Finalized: RP1 Music App logs visible in Kibana; all Twelve-Factor components implemented.
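A port mismatch like the one above can be caught early by checking whether anything is listening on the expected port. This is a generic Python sketch, not a tool used in the session; port_is_open is a hypothetical helper, and it only confirms that a TCP connection succeeds, not that the listener is Logstash.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling port_is_open("localhost", 50000) and port_is_open("localhost", 5000) from the host would show which of the two ports actually has a listener.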

Appendix

Key Principles (Twelve-Factor App)

Codebase: One codebase per app, version-controlled (Git), one-to-one mapping.
Dependencies: Explicit, isolated, managed via package managers (Maven, pip, npm); never committed.
Config: Stored in environment variables; never hardcoded in code or in config files baked into the build.
Backing Services: Treated as attached resources; connection details via config (URLs, credentials).
Build, Release, Run: Strict separation; immutable artifacts, versioned releases.
Processes: Stateless; no session state; use external storage (Redis, DB).
Port Binding: Self-contained; expose services via port binding (no external web servers).
Concurrency: Scale out via multiple processes; each service scales independently.
Disposability: Fast startup, graceful shutdown; replaceable (“cattle, not pets”).
Dev/Prod Parity: Minimize differences; automate deployment pipeline.
Logs: Output to stdout/stderr; platform handles collection and storage.
Admin Processes: Run one-off tasks as disposable jobs (e.g., Kubernetes Jobs).

Tools & Technologies

Containerization: Docker, docker-compose
Orchestration: Kubernetes (HPA, VPA, Jobs)
CI/CD: GitHub Actions, GitLab CI, Azure DevOps, Jenkins (legacy)
Observability: Kibana, Logstash, OpenTelemetry, Spring Boot Actuator
Package Managers: Maven (Java), pip (Python), npm (Node.js), Composer (PHP)
Caching/State: Redis, Memcached, PostgreSQL
Frameworks & Runtimes: Spring Boot, Spring Cloud, Quarkus, GraalVM

Common Pitfalls

Hardcoding config (e.g., DB URLs) in code → violates Twelve-Factor.
Committing dependencies to source control → breaks portability.
Modifying code in production → breaks CI/CD and traceability.
Using emptyDir for persistent data → data lost when the Pod is deleted or rescheduled.
Not using environment variables for secrets → security risk (e.g., passwords in GitHub).
Long startup times → hinders scalability and disposability.

Practice Suggestions

Refactor legacy apps using the Twelve-Factor checklist.
Use Spring Initializr to generate new projects with correct defaults.
Use application.yml profiles for environment-specific defaults, and override secrets and endpoints with environment variables.
Integrate logging to stdout and centralize with ELK or OpenTelemetry.
Test scalability manually: docker run multiple instances + HAProxy.
Use docker system prune regularly to clean stale containers and networks; add --volumes to also remove unused volumes.
Never run admin tasks (migrations, batch jobs) inside main app process.

Building Microservices with Spring Boot, Docker, and Kubernetes - andres-ptbc-20251021-194049
