85 Architecture, Virtualization, and Production Design
Design and analyze a production-ready architecture using containers, Kubernetes, and virtualization concepts.
Introduction
In this lab, you will step back from implementation details and think like a systems designer.
You will:
- Analyze application architecture
- Compare containers and virtual machines
- Design a production-ready deployment
- Introduce scaling, resilience, and failure thinking
- Connect Kubernetes to real-world infrastructure
This lab is intentionally deeper and more open-ended than previous ones.
You are expected to reason, discuss, and justify your decisions.
Submission checklist
🚨 Read this section before continuing.
This clarifies how Lab 85 will be graded.
Your repository must contain:
- A file named architecture-notes.md
- Written answers to all reasoning questions
- A production architecture design section
- An architecture diagram (PNG, Draw.io export, or image file)
- The diagram embedded in or clearly linked from architecture-notes.md
- Updated Kubernetes manifests with:
  - Resource requests and limits
  - Readiness and liveness probes
  - Secret-based configuration (no plain-text credentials)
- Evidence of at least one controlled failure and analysis
Everything must be committed and pushed to GitHub.
If it is not in your repository, it cannot be graded.
Review your current deployment
Before designing something larger, verify your current Kubernetes setup.
Run:
    kubectl get pods
    kubectl get services
    kubectl get deployments
Confirm:
- Your quote application is running
- The database is reachable
- You can port-forward successfully
If something is broken, fix it now.
You will build on this foundation.
Map the current architecture
Create a simple architecture diagram on paper or digitally.
Your diagram must include:
- User / browser
- Kubernetes cluster
- Deployment
- Pod
- Service
- PostgreSQL database
Answer in writing:
- Where does isolation happen?
- What restarts automatically?
- What does Kubernetes not manage?
Be prepared to explain your diagram.
Compare containers and virtual machines
Create a comparison table with at least five differences.
Topics to consider:
- Kernel sharing
- Startup time
- Resource overhead
- Security isolation
- Operational complexity
Then answer:
- When would you prefer a VM over a container?
- When would you combine both?
Write your answers in a file named architecture-notes.md.
Commit and push your reasoning.
Introduce horizontal scaling
Scale your deployment.
    kubectl scale deployment quote-app --replicas=3
Verify:
    kubectl get pods
You should see multiple replicas.
Now test behavior:
- Port-forward the service
- Refresh the page multiple times
Observe whether responses appear consistent.
Answer:
- What changes when you scale?
- What does not change?
Simulate failure
Delete one running pod.
    kubectl delete pod <pod-name>
Immediately observe:
    kubectl get pods
Answer:
- Who recreated the pod?
- Why?
- What would happen if the node itself failed?
Write your answers in architecture-notes.md.
Introduce resource limits
Edit your deployment and add resource constraints.
Add under the container spec:
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "250m"
        memory: "256Mi"
Apply the updated manifest.
Observe:
    kubectl describe pod <pod-name>
Answer:
- What are requests vs limits?
- Why are they important in multi-tenant systems?
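To see why requests matter for scheduling, here is a rough sketch of how many replicas could fit on one node if only the requests from the snippet above were considered. The node capacity figures are hypothetical, and the calculation ignores system reservations and any other workloads on the node.

```shell
#!/usr/bin/env bash
# Hypothetical node capacity (an assumption, not taken from the lab environment).
node_cpu_m=2000      # 2 CPU cores, in millicores
node_mem_mi=4096     # 4 GiB, in Mi

# Requests from the resources snippet in this section.
req_cpu_m=100
req_mem_mi=128

fit_by_cpu=$(( node_cpu_m / req_cpu_m ))
fit_by_mem=$(( node_mem_mi / req_mem_mi ))

# The scarcer resource decides how many pods can be placed.
if [ "$fit_by_cpu" -lt "$fit_by_mem" ]; then
  max_pods=$fit_by_cpu
else
  max_pods=$fit_by_mem
fi

echo "fit by CPU: $fit_by_cpu, fit by memory: $fit_by_mem, schedulable: $max_pods"
```

Limits do not enter this calculation. The scheduler places pods based on requests, while limits cap what a running container may consume.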
Add readiness and liveness probes
Enhance your deployment with health checks.
Example:
    livenessProbe:
      httpGet:
        path: /
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 5
Apply and observe behavior.
Break your app temporarily to see how probes react.
Answer:
- What is the difference between readiness and liveness?
- Why does this matter in production?
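One way to reason about probe settings is to estimate how long a failure can go unnoticed. The sketch below assumes the Kubernetes default failureThreshold of 3 (the example probes do not set it) and ignores probe timeouts, so treat the numbers as approximations.

```shell
#!/usr/bin/env bash
failure_threshold=3        # Kubernetes default when the manifest does not set it

liveness_period=10         # periodSeconds from the liveness probe example
readiness_period=5         # periodSeconds from the readiness probe example

# Consecutive failures required, multiplied by the probe interval.
liveness_detect=$(( liveness_period * failure_threshold ))
readiness_detect=$(( readiness_period * failure_threshold ))

echo "container restarted after roughly ${liveness_detect}s of failed liveness checks"
echo "pod removed from Service endpoints after roughly ${readiness_detect}s of failed readiness checks"
```

Shorter periods detect failures faster but generate more probe traffic and make false positives more likely during brief slowdowns.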
Connect Kubernetes to virtualization
Discuss and write answers:
- What runs underneath your k3s cluster?
- Is Kubernetes replacing virtualization?
- In a cloud provider, what actually hosts your nodes?
Explain how this stack might look in:
- A cloud data center
- An embedded automotive system
- A financial institution
Add this section to architecture-notes.md.
Design a production architecture
Create a new section in architecture-notes.md.
Design a production-ready version of this system.
Include:
- Multiple nodes
- Database persistence
- Backup strategy
- Monitoring
- Logging
- CI/CD pipeline integration
Answer clearly:
- What would run in Kubernetes?
- What would run in VMs?
- What would run outside the cluster?
This is a graded reasoning exercise.
Required break and analysis
Introduce a controlled failure in your deployment.
Examples:
- Set an invalid image name
- Remove environment variables
- Misconfigure resource limits
Apply and observe the failure.
Capture:
    kubectl describe pod <pod-name>
    kubectl get events
Then fix the issue.
You must show at least one failed state in the Kubernetes events.
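Because the evidence has to end up in your repository, it can help to write the command output to files while the pod is still failing. The helper below is a minimal sketch; the function name and the evidence directory are hypothetical, not part of the lab requirements.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save failure evidence so it can be committed.
# Usage: capture_failure_evidence <pod-name>
capture_failure_evidence() {
  local pod="$1"
  local out_dir="evidence"          # assumed directory name
  mkdir -p "$out_dir"
  kubectl describe pod "$pod" > "$out_dir/describe-$pod.txt"
  # Sort events by creation time so the newest appear last.
  kubectl get events --sort-by=.metadata.creationTimestamp > "$out_dir/events.txt"
}
```

Call it with the name of the failing pod before you apply the fix, then commit the resulting files alongside architecture-notes.md.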
Required extension: secret-based configuration
This extension is required for everyone.
Refactor your database credentials so they are not stored in plain text inside your deployment manifest.
Create a Secret
Create a secret for your database credentials:
    kubectl create secret generic quote-db-secret \
      --from-literal=POSTGRES_USER=quote \
      --from-literal=POSTGRES_PASSWORD=quote
Reference the Secret in your deployment
Modify your Deployment manifest so environment variables are loaded from the Secret instead of hard-coded values.
Example structure:
    env:
      - name: POSTGRES_USER
        valueFrom:
          secretKeyRef:
            name: quote-db-secret
            key: POSTGRES_USER
      - name: POSTGRES_PASSWORD
        valueFrom:
          secretKeyRef:
            name: quote-db-secret
            key: POSTGRES_PASSWORD
Apply your changes and verify the application still works.
Answer in architecture-notes.md:
- Why is this better than plain-text configuration?
- Is a Secret encrypted by default? Where?
Commit and push your changes.
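While answering the questions above, it is worth checking what the values in a Secret look like. In the API representation they are base64-encoded. The minimal sketch below uses the sample password from the create command and shows that base64 can be reversed without any key.

```shell
#!/usr/bin/env bash
# Encode the sample password the way it appears in a Secret's data field.
encoded=$(printf '%s' 'quote' | base64)
echo "encoded: $encoded"

# Anyone who can read the encoded value can decode it; no key is involved.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "decoded: $decoded"
```

On a live cluster, `kubectl get secret quote-db-secret -o jsonpath='{.data.POSTGRES_PASSWORD}'` should print the same encoded string.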
🆕 New for Session 3: Controlled rollouts and safe rollback
This section is new. It is designed to keep strong students busy and to prepare everyone for the final session.
What you will practice:
- Make a controlled change
- Observe the rollout
- Intentionally break it once
- Roll back safely
- Document what you learned
Perform a controlled rollout
Pick one small, safe change. For example:
- Change the page title in the template
- Add a new route that returns a short message
- Add a tiny log line
Commit the change locally first.
Build and publish a versioned image
You need two image versions to demonstrate a rollout.
If you already have a workflow from the previous course, you can reuse it.
Otherwise, keep it simple and build locally.
- Build a versioned image tag:
    docker build -t quote-app:v1 -f docker/Dockerfile .
- Make your code change, then build a second version:
    docker build -t quote-app:v2 -f docker/Dockerfile .
If you are using k3s with containerd, you may need to import the image.
Example (local, no registry):
    docker save quote-app:v2 | sudo k3s ctr images import -
Update the Deployment to the new image
Update the Deployment image to quote-app:v2.
If you used a registry image name in your manifests, keep using that same pattern.
Apply your manifest update, then observe the rollout.
Observe rollout progress
Run:
    kubectl rollout status deployment quote-app
    kubectl rollout history deployment quote-app
    kubectl get pods -o wide
Answer in architecture-notes.md:
- What changed in the cluster during the rollout?
- What stayed the same?
- How did Kubernetes decide when to create and delete Pods?
Required: one broken rollout
Introduce a controlled failure so the rollout does not complete.
Examples:
- Set the image to a tag that does not exist
- Set the container port to the wrong value
- Break your readiness probe path
Apply the change and confirm you can see evidence of failure:
    kubectl get pods
    kubectl describe pod <pod-name>
    kubectl get events
In architecture-notes.md, write:
- What failed first?
- Which signal showed you the failure fastest?
- What would you check next if this happened in production?
Roll back safely
Undo the broken rollout:
    kubectl rollout undo deployment quote-app
Verify:
- The application becomes healthy again
- The Deployment stabilizes
- Pods return to a ready state
Add a short note in architecture-notes.md:
- What did rollback change?
- What did rollback not change?
Required extension choice
Choose ONE of these extensions and implement it.
Option A: Explicit rollout strategy
Update your Deployment to explicitly define a rolling update strategy:
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0
Explain in architecture-notes.md:
- What does maxSurge do?
- What does maxUnavailable do?
- Why might you choose 0 for maxUnavailable?
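As a starting point for your explanation, the two settings can be read as bounds on the number of Pods during a rollout. The sketch below assumes the 3 replicas from the scaling step and the strategy values shown in this option.

```shell
#!/usr/bin/env bash
replicas=3            # desired replica count, from the scaling step earlier in this lab
max_surge=1           # extra Pods allowed above the desired count
max_unavailable=0     # Pods allowed to be missing below the desired count

max_total_pods=$(( replicas + max_surge ))
min_available_pods=$(( replicas - max_unavailable ))

echo "at most $max_total_pods Pods exist during the rollout"
echo "at least $min_available_pods Pods stay available during the rollout"
```

Try other values, for example a surge of 0 with 1 unavailable, and consider what each combination costs in capacity or availability.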
Option B: Rollout notes and evidence
Add a short “Rollout Evidence” section to architecture-notes.md that includes:
- The commands you ran (rollout status, history, undo)
- One or two short excerpts of output (keep it short)
- Your interpretation of what happened
Option C: Canary-style rollout (manual)
Simulate a “canary” rollout manually:
- Temporarily run 1 replica of v2 and 2 replicas of v1
- Compare behavior
- Then complete the rollout
Document the approach and limitations in architecture-notes.md.
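One limitation you may want to quantify is that a manual canary ties the traffic share to replica counts. The sketch below assumes both versions sit behind the same Service and that requests are spread roughly evenly across ready Pods; the 5% target is a hypothetical example.

```shell
#!/usr/bin/env bash
# Assumption: the Service spreads requests roughly evenly across ready Pods.
v1_replicas=2
v2_replicas=1
total=$(( v1_replicas + v2_replicas ))

# Integer percentage of traffic expected to reach v2.
canary_percent=$(( 100 * v2_replicas / total ))
echo "v2 receives roughly ${canary_percent}% of requests"

# Total Pods needed for a single v2 Pod to receive about 5% of traffic.
target_percent=5
pods_needed=$(( (100 + target_percent - 1) / target_percent ))
echo "a ${target_percent}% canary needs about $pods_needed Pods in total"
```

With only three replicas, the smallest possible canary already takes about a third of the traffic, which is worth noting in your write-up.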
Optional stretch: real end-to-end checks
In real projects, teams often add end-to-end checks (for example with Playwright) to validate an application through the browser.
You do not need to implement this here, but write 3–5 bullets in architecture-notes.md explaining:
- What an end-to-end test would validate for this app
- Where it should run (locally, CI, or both)
- What the biggest cost or risk would be
Optional extension lanes
The following lanes are optional unless specified by your instructor.
Choose one or more depending on your pace.
Multi-node simulation
If your environment supports it:
- Add an additional worker node
- Observe pod scheduling behavior
- Run:
    kubectl get nodes
    kubectl get pods -o wide
Document:
- How pods are distributed
- How Kubernetes chooses nodes
- What happens when a node becomes unavailable
Database hardening
Improve the security posture of your database deployment:
- Ensure credentials come from a Kubernetes Secret
- Avoid plain-text passwords in manifests
- Verify environment variables are injected correctly
Explain:
- Why Secrets are preferable to ConfigMaps for credentials
- Where Secret data is stored in the cluster
Autoscaling exploration
Research Horizontal Pod Autoscaler (HPA).
Document:
- What metrics are required
- Why metrics-server is needed
- How autoscaling differs from manual scaling
- What would be required in a real production cluster
Implementation is optional unless you complete all required tasks early.
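To ground the research, the core scaling rule documented for the HPA is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies it with hypothetical utilization numbers. It leaves out the tolerance band, the min/max replica bounds, and the stabilization behavior that a real HPA also applies.

```shell
#!/usr/bin/env bash
current_replicas=3
current_cpu_percent=90   # hypothetical observed average utilization
target_cpu_percent=60    # hypothetical HPA target

# ceil(current_replicas * current / target) using integer arithmetic
numerator=$(( current_replicas * current_cpu_percent ))
desired_replicas=$(( (numerator + target_cpu_percent - 1) / target_cpu_percent ))

echo "desired replicas: $desired_replicas"
```

CPU utilization here is measured relative to the container's CPU request, which is one reason the resource requests added earlier in this lab matter for autoscaling.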
Architecture critique
Write a short critique of your own design.
Include:
- Single points of failure
- Operational risks
- Security concerns
- Observability gaps
Be precise and realistic.
🐉 Beast Mode: production realism challenge
Attempt this only after completing all required work.
Choose ONE of the following:
- Add a NetworkPolicy that restricts which pods can talk to the database.
- Split application and database into separate namespaces and update access rules.
- Add a basic Ingress resource and explain how it connects to a real load balancer.
- Simulate a rolling update and observe rollout history.
For your chosen challenge:
- Implement it as far as your environment allows.
- Document what worked and what did not.
- Explain what would change in a real multi-node production cluster.
Add a dedicated section in architecture-notes.md titled:
Beast Mode Challenge
Wrap-up reflection
Before leaving, ensure:
- Your manifests are committed
- architecture-notes.md is pushed
- You can explain your design verbally
This session is about thinking like an engineer responsible for production systems.
Speed is not the goal.
Clarity and reasoning are.