diff --git a/.github/workflows/python-ci.yml b/.github/workflows/python-ci.yml new file mode 100644 index 0000000000..60cd483628 --- /dev/null +++ b/.github/workflows/python-ci.yml @@ -0,0 +1,106 @@ +name: Python CI & Docker Build + +on: + push: + branches: [ main, dev, lab3 ] + tags: [ 'v*' ] + pull_request: + branches: [ main ] + +permissions: + contents: read + packages: write + +jobs: + test: + name: Test & Lint + runs-on: ubuntu-latest + + strategy: + matrix: + python-version: ["3.9", "3.10", "3.11"] + + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Set up Python ${{ matrix.python-version }} + uses: actions/setup-python@v5 + with: + python-version: ${{ matrix.python-version }} + + - name: Install dependencies + run: | + python -m pip install --upgrade pip + pip install -r app_python/requirements.txt + pip install ruff pytest + + - name: Lint with Ruff + run: ruff check . + + - name: Run tests + run: pytest app_python/tests/ --verbose -v + + - name: Format check + run: ruff format --check . 
+ + security: + name: Snyk Security Scan + runs-on: ubuntu-latest + if: github.event_name != 'pull_request' + defaults: + run: + working-directory: ./app_python + steps: + - uses: actions/checkout@v4 + - name: Install dependencies + run: | + python -m pip install --upgrade pip + pip install -r requirements.txt + - name: Snyk CLI + uses: snyk/actions/python-3.11@master + env: + SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }} + with: + args: --severity-threshold=critical --skip-unresolved + continue-on-error: true + + + docker: + name: Build & Push Docker + needs: [ test, security ] # Runs only if test & security passed + runs-on: ubuntu-latest + if: github.event_name != 'pull_request' # Dont push pr to docker hub + + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Docker meta + id: meta + uses: docker/metadata-action@v5 + with: + images: ${{ secrets.DOCKER_USERNAME }}/testiks + tags: | + type=raw,value=latest,enable={{is_default_branch}} + type=raw,value={{date 'YYYY.MM'}},enable={{is_default_branch}} + type=ref,event=branch + + + - name: Set up Docker Buildx + uses: docker/setup-buildx-action@v3 + + - name: Login to Docker Hub + uses: docker/login-action@v3 + with: + username: ${{ secrets.DOCKER_USERNAME }} + password: ${{ secrets.DOCKERHUB_TOKEN }} + + - name: Build and push Docker image + uses: docker/build-push-action@v5 + with: + context: ./app_python/ + push: true + tags: ${{ steps.meta.outputs.tags }} + cache-from: type=gha + cache-to: type=gha,mode=max diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000000..205f63c513 --- /dev/null +++ b/.gitignore @@ -0,0 +1,5 @@ +test +.* +minikube* +edge-api +**/__pycache__/ \ No newline at end of file diff --git a/README.md b/README.md index 0b159ed716..371d51f456 100644 --- a/README.md +++ b/README.md @@ -1,81 +1,271 @@ -# DevOps Engineering Labs +# DevOps Engineering: Core Practices -## Introduction +[![Labs](https://img.shields.io/badge/Labs-18-blue)](#labs) 
+[![Exam](https://img.shields.io/badge/Exam-Optional-green)](#exam-alternative) +[![Duration](https://img.shields.io/badge/Duration-18%20Weeks-lightgrey)](#course-roadmap) -Welcome to the DevOps Engineering course labs! These hands-on labs are designed to guide you through various aspects of DevOps practices and principles. As you progress through the labs, you'll gain practical experience in application development, containerization, testing, infrastructure setup, CI/CD processes, and more. +Master **production-grade DevOps practices** through hands-on labs. Build, containerize, deploy, monitor, and scale applications using industry-standard tools. -## Lab Syllabus +--- -Lab 1: Web Application Development -Lab 2: Containerization -Lab 3: Continuous Integration -Lab 4: Infrastructure as Code & Terraform -Lab 5: Configuration Management -Lab 6: Ansible Automation -Lab 7: Observability, Logging, Loki Stack -Lab 8: Monitoring & Prometheus -Lab 9: Kubernetes & Declarative Manifests -Lab 10: Helm Charts & Library Charts -Lab 11: Kubernetes Secrets Management (Vault, ConfigMaps) -Lab 12: Kubernetes ConfigMaps & Environment Variables -Lab 13: GitOps with ArgoCD -Lab 14: StatefulSet Optimization -Lab 15: Kubernetes Monitoring & Init Containers -Lab 16: IPFS & Fleek Decentralization +## Quick Start -## Architecture +1. **Fork** this repository +2. **Clone** your fork locally +3. **Start with Lab 1** and progress sequentially +4. **Submit PRs** for each lab (details below) -This repository has a master branch containing an introduction. Each new lab assignment will be added as a markdown file with a lab number. 
+---
-## Rules
+## Course Roadmap
-To successfully complete the labs and pass the course, follow these rules:
+| Week | Lab | Topic | Key Technologies |
+|------|-----|-------|------------------|
+| 1 | 1 | Web Application Development | Python/Go, Best Practices |
+| 2 | 2 | Containerization | Docker, Multi-stage Builds |
+| 3 | 3 | Continuous Integration | GitHub Actions, Snyk |
+| 4 | 4 | Infrastructure as Code | Terraform, Cloud Providers |
+| 5 | 5 | Configuration Management | Ansible Basics |
+| 6 | 6 | Continuous Deployment | Ansible Advanced |
+| 7 | 7 | Logging | Promtail, Loki, Grafana |
+| 8 | 8 | Monitoring | Prometheus, Grafana |
+| 9 | 9 | Kubernetes Basics | Minikube, Deployments, Services |
+| 10 | 10 | Helm Charts | Templating, Hooks |
+| 11 | 11 | Secrets Management | K8s Secrets, HashiCorp Vault |
+| 12 | 12 | Configuration & Storage | ConfigMaps, PVCs |
+| 13 | 13 | GitOps | ArgoCD |
+| 14 | 14 | Progressive Delivery | Argo Rollouts |
+| 15 | 15 | StatefulSets | Persistent Storage, Headless Services |
+| 16 | 16 | Cluster Monitoring | Kube-Prometheus, Init Containers |
+| - | **Exam Alternative Labs** | | |
+| 17 | 17 | Edge Deployment | Fly.io, Global Distribution |
+| 18 | 18 | Decentralized Storage | 4EVERLAND, IPFS, Web3 |
-1. **Lab Dependency:** Complete the labs in order; each lab builds upon the previous one.
-2. **Submission and Grading:** Submit your solutions as pull requests (PRs) to the master branch of this repository. You need at least 6/10 points for each lab to pass.
-3. **Fork Repository:** Fork this repository to your workspace to create your own version for solving the labs.
-4. **Recommended Workflow:** Build your solutions incrementally. Complete lab N based on lab N-1.
-5. **PR Creation:** Create a PR from your fork to the master branch of this repository and from your fork's branch to your fork's master branch.
-6. **Wait for Grade:** Once your PR is created, wait for your lab to be reviewed and graded.
+--- -### Example for the first lab +## Grading -1. Fork this repository. -2. Checkout to the lab1 branch. -3. Complete the lab1 tasks. -4. Push the code to your repository. -5. Create a PR to the master branch of this repository from your fork's lab1 branch. -6. Create a PR to the master branch of your repository from your lab1 branch. -7. Wait for your grade. +### Grade Composition -## Grading and Grades Distribution +| Component | Weight | Points | +|-----------|--------|--------| +| **Labs (16 required)** | 80% | 160 pts | +| **Final Exam** | 20% | 40 pts | +| **Bonus Tasks** | Extra | +40 pts max | +| **Total** | 100% | 200 pts | -Your final grade will be determined based on labs and a final exam: +### Exam Alternative -- Labs: 70% of your final grade. -- Final Exam: 30% of your final grade. +Don't want to take the exam? Complete **both** bonus labs: -Grade ranges: +| Lab | Topic | Points | +|-----|-------|--------| +| **Lab 17** | Fly.io Edge Deployment | 20 pts | +| **Lab 18** | 4EVERLAND & IPFS | 20 pts | -- [90-100] - A -- [75-90) - B -- [60-75) - C -- [0-60) - D +**Requirements:** +- Complete both labs (17 + 18 = 40 pts, replaces exam) +- Minimum 16/20 on each lab +- Deadline: **1 week before exam date** +- Can still take exam if you need more points for desired grade -### Labs Grading +
+📊 Grade Scale
-Each lab is worth 10 points. Completing main tasks correctly earns you 10 points. Completing bonus tasks correctly adds 2.5 points. You can earn a maximum of 12.5 points per lab by completing all main and bonus tasks.
+| Grade | Points | Percentage |
+|-------|--------|------------|
+| **A** | 180-200+ | 90-100% |
+| **B** | 150-179 | 75-89% |
+| **C** | 120-149 | 60-74% |
+| **D** | 0-119 | 0-59% |
-Finishing all bonus tasks lets you skip the exam and grants you 5 extra points. Incomplete bonus tasks require you to take the exam, which could save you from failing it.
+**Minimum to Pass:** 120 points (60%)
->The labs account for 70% of your final grade. With 14 labs in total, each lab contributes 5% to your final grade. Completing all main tasks in a lab earns you the maximum 10 points, which corresponds to 5% of your final grade.
->If you successfully complete all bonus tasks, you'll earn an additional 2.5 points, totaling 12.5 points for that lab, or 6.25% of your final grade. Over the course of all 14 labs, the cumulative points from bonus tasks add up to 87.5% of your final grade.
->Additionally, a 5% bonus is granted for successfully finishing all bonus tasks, ensuring that if you successfully complete everything, your final grade will be 92.5%, which corresponds to an A grade.
+
-## Deadlines and Labs Distribution +
+📈 Grade Examples
-Each week, two new labs will be available. You'll have one week to submit your solutions. Refer to Moodle for presentation slides and deadlines.
+**Scenario 1: Labs + Exam**
+```
+Labs: 16 × 9 = 144 pts
+Bonus: 5 labs × 2.5 = 12.5 pts
+Exam: 35/40 pts
+Total: 191.5 pts = 96% (A)
+```
-## Submission Policy
+**Scenario 2: Labs + Exam Alternative**
+```
+Labs: 16 × 9 = 144 pts
+Bonus: 8 labs × 2.5 = 20 pts
+Lab 17: 18 pts
+Lab 18: 17 pts
+Total: 199 pts = 99.5% (A)
+```
-Submitting your lab results on time is crucial for your grading. Late submissions receive a maximum score of 6 points for the corresponding lab. Remember, completing all labs is necessary to successfully pass the course.
+
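The arithmetic in the two grade scenarios can be checked with a short script. This is a minimal sketch and not part of the course tooling: `total_points` and `letter_grade` are hypothetical names, the letter thresholds are read off the Grade Scale table (A from 180, B from 150, C from 120), and the +40 cap on bonus points is not modelled.

```python
# Hypothetical helpers for re-checking the grade scenarios; names are illustrative.


def total_points(lab_scores, bonus_points=0.0, exam=0.0, alt_labs=()):
    """Sum lab points, bonus points, and either exam points or Lab 17/18 points."""
    return sum(lab_scores) + bonus_points + exam + sum(alt_labs)


def letter_grade(points):
    """Map a point total to a letter using the Grade Scale thresholds."""
    if points >= 180:
        return "A"
    if points >= 150:
        return "B"
    if points >= 120:
        return "C"
    return "D"


# Scenario 1: 16 labs at 9 pts, 5 bonus tasks at 2.5 pts, exam 35/40.
scenario_1 = total_points([9] * 16, bonus_points=5 * 2.5, exam=35)

# Scenario 2: 16 labs at 9 pts, 8 bonus tasks at 2.5 pts, Labs 17 and 18 instead of the exam.
scenario_2 = total_points([9] * 16, bonus_points=8 * 2.5, alt_labs=(18, 17))

print(scenario_1, letter_grade(scenario_1))
print(scenario_2, letter_grade(scenario_2))
```

Both totals match the scenarios (191.5 and 199 points), and both land in the A band.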
+ +--- + +## Lab Structure + +Each lab is worth **10 points** (main tasks) + **2.5 points** (bonus). + +- **Minimum passing score:** 6/10 per lab +- **Late submissions:** Max 6/10 (within 1 week) +- **Very late (>1 week):** Not accepted + +
+📋 Lab Categories
+
+**Foundation (Labs 1-2)**
+- Web app development
+- Docker containerization
+
+**CI/CD & Infrastructure (Labs 3-4)**
+- GitHub Actions
+- Terraform
+
+**Configuration Management (Labs 5-6)**
+- Ansible playbooks and roles
+
+**Observability (Labs 7-8)**
+- Loki logging stack
+- Prometheus monitoring
+
+**Kubernetes Core (Labs 9-12)**
+- K8s basics, Helm
+- Secrets, ConfigMaps
+
+**Advanced Kubernetes (Labs 13-16)**
+- ArgoCD, Argo Rollouts
+- StatefulSets, Monitoring
+
+**Exam Alternative (Labs 17-18)**
+- Fly.io, 4EVERLAND/IPFS
+
+
+
+---
+
+## How to Submit
+
+```bash
+# 1. Create branch
+git checkout -b lab1
+
+# 2. Complete lab tasks
+
+# 3. Commit and push
+git add .
+git commit -m "Complete lab1"
+git push -u origin lab1
+
+# 4. Create TWO Pull Requests:
+# PR #1: your-fork:lab1 → course-repo:master
+# PR #2: your-fork:lab1 → your-fork:master
+```
+
+
+📝 Submission Checklist
+
+- [ ] All main tasks completed
+- [ ] Documentation files created
+- [ ] Screenshots where required
+- [ ] Code tested and working
+- [ ] Markdown validated ([linter](https://dlaa.me/markdownlint/))
+- [ ] Both PRs created
+
+
+ +--- + +## Resources + +
+🛠️ Required Tools
+
+| Tool | Purpose |
+|------|---------|
+| Git | Version control |
+| Docker | Containerization |
+| kubectl | Kubernetes CLI |
+| Helm | K8s package manager |
+| Minikube | Local K8s cluster |
+| Terraform | Infrastructure as Code |
+| Ansible | Configuration management |
+
+
+ +
+📚 Documentation Links
+
+**Core:**
+- [Docker](https://docs.docker.com/)
+- [Kubernetes](https://kubernetes.io/docs/)
+- [Helm](https://helm.sh/docs/)
+
+**CI/CD:**
+- [GitHub Actions](https://docs.github.com/en/actions)
+- [Terraform](https://www.terraform.io/docs)
+- [Ansible](https://docs.ansible.com/)
+
+**Observability:**
+- [Prometheus](https://prometheus.io/docs/)
+- [Grafana](https://grafana.com/docs/)
+
+**Advanced:**
+- [ArgoCD](https://argo-cd.readthedocs.io/)
+- [Argo Rollouts](https://argoproj.github.io/argo-rollouts/)
+- [HashiCorp Vault](https://developer.hashicorp.com/vault/docs)
+
+
+ +
+💡 Tips for Success
+
+1. **Start early** - Don't wait until the deadline
+2. **Read instructions fully** before starting
+3. **Test everything** before submitting
+4. **Document as you go** - Don't leave it for the end
+5. **Ask questions early** - Don't wait until the last minute
+6. **Use proper Git workflow** - Branches, commits, PRs
+
+
+ +
+🔧 Common Issues
+
+**Docker:**
+- Daemon not running → Start Docker Desktop
+- Permission denied → Add user to docker group
+
+**Minikube:**
+- Won't start → Try `--driver=docker`
+- Resource issues → Allocate more memory/CPU
+
+**Kubernetes:**
+- ImagePullBackOff → Check image name/registry
+- CrashLoopBackOff → Check logs: `kubectl logs `
+
+
+
+---
+
+## Course Completion
+
+After completing all 16 core labs (+ optional Labs 17-18), you'll have:
+
+✅ Full-stack DevOps expertise
+✅ Production-ready portfolio with 16-18 projects
+✅ Container and Kubernetes mastery
+✅ CI/CD pipeline experience
+✅ Infrastructure as Code skills
+✅ Monitoring and observability knowledge
+✅ GitOps workflow experience
+
+---
+
+**Ready to begin? Start with [Lab 1](labs/lab01.md)!**
+
+Questions? Check the course Moodle page or ask during office hours.
diff --git a/ansible/ansible.cfg b/ansible/ansible.cfg
new file mode 100644
index 0000000000..f955730d83
--- /dev/null
+++ b/ansible/ansible.cfg
@@ -0,0 +1,11 @@
+[defaults]
+inventory = inventory/hosts.ini
+roles_path = roles
+host_key_checking = False
+remote_user = debil
+retry_files_enabled = False
+
+[privilege_escalation]
+become = True
+become_method = sudo
+become_user = root
\ No newline at end of file
diff --git a/ansible/app_python/Dockerfile b/ansible/app_python/Dockerfile
new file mode 100644
index 0000000000..d82173a7d1
--- /dev/null
+++ b/ansible/app_python/Dockerfile
@@ -0,0 +1,22 @@
+FROM python:3.12-slim
+
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1
+
+# Non-root user
+RUN groupadd -r appuser && useradd -r -g appuser appuser
+
+# Install deps first
+WORKDIR /app
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+
+COPY app.py .
+RUN chown -R appuser:appuser /app
+USER appuser
+
+EXPOSE 8000
+
+# Run app finally
+CMD ["python", "app.py"]
diff --git a/ansible/app_python/README.md b/ansible/app_python/README.md
new file mode 100644
index 0000000000..f122dfce25
--- /dev/null
+++ b/ansible/app_python/README.md
@@ -0,0 +1,55 @@
+# DevOps Info Service
+A lightweight demo Python web application that reports system information via HTTP endpoints
+
+### Prerequisites
+Python 3.10+
+Flask 3.1.0
+
+### Installation
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+```
+
+### Running the Application
+```bash
+python3 app.py
+# Or with custom config
+PORT=8080 python3 app.py
+```
+
+### API Endpoints
+The main endpoints are:
+- `GET /` - Service and system information
+- `GET /health` - Health check
+
+### Configuration
+
+| Variable | Type   | Purpose                                    |
+| -------- | ------ | ------------------------------------------ |
+| `HOST`   | string | Host address the app binds to              |
+| `PORT`   | int    | Port the web application listens on        |
+| `DEBUG`  | bool   | Whether debug output is enabled            |
+
+## Docker
+This application can be run in a containerized environment with Docker
+
+### Build the image locally
+To build the Docker image, run the Docker build command from the project directory, specifying an image name with a tag
+```bash
+cd app_python
+docker build -t .
+``` + +### Run a container +To run the application, start a container from the built image and map the container port to a port on the host machine so the application can be accessed locally +```bash +docker run -p:5000 +``` + +### Pull from Docker Hub +The pre-built image is also available on Docker Hub and can be pulled using the standard Docker pull command with the repository name and desired tag +```bash +docker pull cacucoh/testiks:1.0 +``` \ No newline at end of file diff --git a/ansible/app_python/app.py b/ansible/app_python/app.py new file mode 100644 index 0000000000..9064052f8c --- /dev/null +++ b/ansible/app_python/app.py @@ -0,0 +1,182 @@ +import os +import logging +import platform +import socket +from datetime import datetime, timezone + +import time +from functools import wraps + +from flask import Flask, request, jsonify, Response +from pythonjsonlogger import jsonlogger +from prometheus_client import Counter, Histogram, Gauge, generate_latest, CONTENT_TYPE_LATEST + +# ---------------------- +# JSON Logging Setup +# ---------------------- +logger = logging.getLogger() + +logHandler = logging.StreamHandler() +formatter = jsonlogger.JsonFormatter( + fmt="%(asctime)s %(name)s %(levelname)s %(message)s" +) +logHandler.setFormatter(formatter) +logger.addHandler(logHandler) +logger.setLevel(logging.INFO) + + +app = Flask(__name__) + +HOST = os.getenv("HOST", "0.0.0.0") +PORT = int(os.getenv("PORT", 5000)) +DEBUG = os.getenv("DEBUG", "False").lower() == "true" + +START_TIME = datetime.now(timezone.utc) + +# Define metrics +http_requests_total = Counter( + 'http_requests_total', + 'Total HTTP requests', + ['method', 'endpoint', 'status'] +) + +http_request_duration_seconds = Histogram( + 'http_request_duration_seconds', + 'HTTP request duration', + ['method', 'endpoint'] +) + +http_requests_in_progress = Gauge( + 'http_requests_in_progress', + 'HTTP requests currently being processed' +) + + +def track_metrics(func): + @wraps(func) + def wrapper(*args, 
**kwargs): + http_requests_in_progress.inc() + start_time = time.time() + + try: + response = func(*args, **kwargs) + status = getattr(response, "status_code", 200) + except Exception: + status = 500 + raise + finally: + duration = time.time() - start_time + endpoint = request.path + method = request.method + + http_requests_total.labels(method=method, endpoint=endpoint, status=str(status)).inc() + http_request_duration_seconds.labels(method=method, endpoint=endpoint).observe(duration) + + http_requests_in_progress.dec() + + return response + + return wrapper + + +def get_uptime(): + delta = datetime.now(timezone.utc) - START_TIME + seconds = int(delta.total_seconds()) + hours = seconds // 3600 + minutes = (seconds % 3600) // 60 + return {"seconds": seconds, "human": f"{hours} hours, {minutes} minutes"} + + +def get_system_info(): + return { + "hostname": socket.gethostname(), + "platform": platform.system(), + "platform_version": platform.version(), + "architecture": platform.machine(), + "cpu_count": os.cpu_count(), + "python_version": platform.python_version(), + } + + +@app.route("/health", methods=["GET"]) +@track_metrics +def health(): + uptime = get_uptime() + return jsonify( + { + "status": "healthy", + "timestamp": datetime.now(timezone.utc).isoformat(), + "uptime_seconds": uptime["seconds"], + } + ) + +# k8s +@app.route('/ready') +def ready(): + return 'OK', 200 + + +@app.route('/metrics') +def metrics(): + return Response(generate_latest(), mimetype=CONTENT_TYPE_LATEST) + + +@app.route("/", methods=["GET"]) +@track_metrics +def default_route(): + logger.info(f"Request: {request.method} {request.path}") + uptime = get_uptime() + + response = { + "service": { + "name": "devops-info-service", + "version": "1.0.0", + "description": "DevOps course info service", + "framework": "Flask", + }, + "system": get_system_info(), + "runtime": { + "uptime_seconds": uptime["seconds"], + "uptime_human": uptime["human"], + "current_time": 
datetime.now(timezone.utc).isoformat(),
+            "timezone": "UTC",
+        },
+        "request": {
+            "client_ip": request.remote_addr,
+            "user_agent": request.headers.get("User-Agent"),
+            "method": request.method,
+            "path": request.path,
+        },
+        "endpoints": [
+            {"path": "/", "method": "GET", "description": "Service information"},
+            {"path": "/health", "method": "GET", "description": "Health check"},
+        ],
+    }
+
+    return jsonify(response)
+
+
+@app.errorhandler(404)
+def not_found(error):
+    return jsonify({"error": "Not Found", "message": "Endpoint does not exist"}), 404
+
+
+@app.errorhandler(500)
+def internal_error(error):
+    return (
+        jsonify(
+            {
+                "error": "Internal Server Error",
+                "message": "An unexpected error occurred",
+            }
+        ),
+        500,
+    )
+
+
+if __name__ == "__main__":
+    logger.info("[+] Starting...")
+    try:
+        app.run(host=HOST, port=PORT, debug=DEBUG)
+    finally:
+        logger.info("[i] Shutting down")
diff --git a/ansible/app_python/docs/LAB03.md b/ansible/app_python/docs/LAB03.md
new file mode 100644
index 0000000000..a37f83c334
--- /dev/null
+++ b/ansible/app_python/docs/LAB03.md
@@ -0,0 +1,85 @@
+# LAB03 - Continuous Integration (CI/CD)
+[![Python CI & Docker Build](https://github.com/CacucoH/DevOps-Core-Course/actions/workflows/python-ci.yml/badge.svg)](https://github.com/CacucoH/DevOps-Core-Course/actions/workflows/python-ci.yml)
+
+## 1. Unit testing
+### 1.1 Testing framework choice
+To complete this lab I selected **pytest**:
+- Supports fixtures
+- Simple to use
+- Easily integrates with Flask
+
+#### 1.2 Tests structure explanation:
+- `test_root_endpoint_success`: Verifies that GET / returns status 200, checks the complete JSON structure (service, system, runtime, request, endpoints fields), validates data types (str, int, list), and mocks uptime/system_info for consistent testing.
+- `test_health_endpoint_success`: Verifies that GET /health returns status 200, confirms the health JSON structure (status, timestamp, uptime_seconds), and checks the string/integer data types.
+- `test_nonexistent_endpoint_404`: Ensures the non-existent endpoint /nonexistent returns status 404 with the correct error JSON structure ("Not Found" error).
+- `test_root_wrong_method_405`: Confirms that POST to root / (unsupported method) returns status 405.
+- `test_health_wrong_method_405`: Verifies that POST to /health (unsupported method) returns status 405.
+- `test_unsupported_methods_405`: Parametrized test checking that PUT, DELETE, and PATCH on various endpoints all return status 405.
+- `test_empty_request_data`: Edge case test ensuring a basic GET / works without additional request data; validates that client_ip is present in the response.
+- `test_with_headers`: Edge case test with a custom User-Agent header; confirms that request parsing extracts the header value and returns it in the JSON.
+
+#### 1.3 Running tests locally
+Execute (in main project directory)
+```bash
+pytest
+```
+All tests should pass
+![all tests passing](./screenshots/lab3/tests.png)
+
+### 2 CI Workflow
+CI workflow triggers on:
+- push to `main`, `dev`, and `lab3` branches
+- pull requests targeting `main`
+
+It performs:
+1. Linting (ruff)
+2. Testing (pytest)
+3. Coverage generation
+4. Docker build & push
+5. Snyk security scan
+
+
+## 2. Versioning Strategy
+I have chosen Calendar Versioning (CalVer YYYY.MM):
+- Format: 2026.02 (current month) + latest
+- Implementation: docker/metadata-action@v5 with type=raw,value={{date 'YYYY.MM'}}
+- Why CalVer: Well suited to CI/CD pipelines with frequent releases; tags carry date-based tracking
+
+### 2.1 Key Implementation Highlights
+CI Stages:
+1. Test job (matrix: Python 3.9-3.11)
+   - Ruff linting + formatting
+   - Pytest unit tests
+2. 
Docker job (depends on the test and security jobs)
+   - Multi-tag strategy (latest + CalVer + branch)
+   - Docker layer caching for speed
+
+### 2.2 Triggers Logic:
+- Push to main, dev, or lab3 (and `v*` tags): full CI/CD (tests + Docker push)
+- PR to main: tests only (no Docker push)
+- Other branches: the workflow does not run, because the `push` trigger lists only the branches above
+
+I also used GitHub Actions secrets:
+- DOCKER_USERNAME
+- DOCKERHUB_TOKEN (Docker Hub Access Token)
+- SNYK_TOKEN
+
+### 2.3 Evidence
+
+#### - [👉 Link to successful CI (full lab done)](https://github.com/CacucoH/DevOps-Core-Course/actions/runs/21959626699)
+#### - Tests passing locally:
+![all tests passing](./screenshots/lab3/tests.png)
+#### - [Docker image on Docker Hub](https://hub.docker.com/r/cacucoh/testiks)
+
+
+## 3. Best Practices Implemented
+1. Matrix Testing: Tests Python 3.9-3.11 in parallel across multiple jobs, ensuring cross-version compatibility
+2. Job Dependencies: Docker build only runs after the test and security jobs succeed (`needs: [test, security]`), preventing broken images from being pushed
+3. Docker Layer Caching: cache-from/to: type=gha reduces build time from 5+ minutes to ~30 seconds on repeat runs
+4. Caching: Pip dependencies cached (3 min down to 15 s); Docker layers cached (5 min down to 30 s)
+
+## 4. Key Decisions
+- Versioning Strategy: CalVer (YYYY.MM) chosen over SemVer because this is a CI/CD pipeline with frequent automated releases; dates provide instant temporal context without manual version management.
+- Docker Tags: Creates username/app:latest (production), username/app:2026.02 (monthly archive), username/app:main (branch tracking); multiple tags enable flexible deployments and rollbacks.
+- Workflow Triggers: push to main/dev/lab3 → full CI/CD; pull_request → tests only. This balances automation with safety (no Docker push from PRs/forks).
+- Test Coverage: Unit tests via pytest + linting/formatting via ruff cover code quality; integration/E2E tests are deferred to future tasks, and security scanning is handled separately by the Snyk job.
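The tagging scheme described above can be reproduced locally as a sanity check. This is a minimal sketch under stated assumptions: `calver_tag` and `image_tags` are hypothetical helpers, not part of the workflow or of docker/metadata-action, the date is assumed to be taken in UTC, and `main` is assumed to be the default branch.

```python
from datetime import datetime, timezone


def calver_tag(build_time: datetime) -> str:
    """Format a timezone-aware build timestamp as a YYYY.MM tag (zero-padded month)."""
    return build_time.astimezone(timezone.utc).strftime("%Y.%m")


def image_tags(repo, build_time, branch, default_branch="main"):
    """Mimic the tag rules: branch tag always, latest + CalVer only on the default branch."""
    tags = [f"{repo}:{branch}"]
    if branch == default_branch:
        tags += [f"{repo}:latest", f"{repo}:{calver_tag(build_time)}"]
    return tags


# Illustrative build time and repository name (both assumptions).
build = datetime(2026, 2, 11, 22, 46, tzinfo=timezone.utc)
print(image_tags("username/app", build, "main"))
print(image_tags("username/app", build, "dev"))
```

A push to `dev` therefore yields only the branch tag, while a push to `main` also yields `latest` and the monthly tag.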
diff --git a/ansible/app_python/docs/screenshots/lab2/build.png b/ansible/app_python/docs/screenshots/lab2/build.png new file mode 100644 index 0000000000..2bcd5dba45 Binary files /dev/null and b/ansible/app_python/docs/screenshots/lab2/build.png differ diff --git a/ansible/app_python/docs/screenshots/lab2/curl.png b/ansible/app_python/docs/screenshots/lab2/curl.png new file mode 100644 index 0000000000..90bc56653d Binary files /dev/null and b/ansible/app_python/docs/screenshots/lab2/curl.png differ diff --git a/ansible/app_python/docs/screenshots/lab2/images.png b/ansible/app_python/docs/screenshots/lab2/images.png new file mode 100644 index 0000000000..0ceb08ec99 Binary files /dev/null and b/ansible/app_python/docs/screenshots/lab2/images.png differ diff --git a/ansible/app_python/docs/screenshots/lab2/push.png b/ansible/app_python/docs/screenshots/lab2/push.png new file mode 100644 index 0000000000..274af9a019 Binary files /dev/null and b/ansible/app_python/docs/screenshots/lab2/push.png differ diff --git a/ansible/app_python/docs/screenshots/lab2/run.png b/ansible/app_python/docs/screenshots/lab2/run.png new file mode 100644 index 0000000000..703e3f363c Binary files /dev/null and b/ansible/app_python/docs/screenshots/lab2/run.png differ diff --git a/ansible/app_python/docs/screenshots/lab3/tests.png b/ansible/app_python/docs/screenshots/lab3/tests.png new file mode 100644 index 0000000000..628dd59d64 Binary files /dev/null and b/ansible/app_python/docs/screenshots/lab3/tests.png differ diff --git a/ansible/app_python/requirements.txt b/ansible/app_python/requirements.txt new file mode 100644 index 0000000000..9fdfd1b2ba --- /dev/null +++ b/ansible/app_python/requirements.txt @@ -0,0 +1,3 @@ +Flask==3.1.2 +python-json-logger +prometheus_client==0.23.1 \ No newline at end of file diff --git a/ansible/app_python/tests/__init__.py b/ansible/app_python/tests/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git 
a/ansible/app_python/tests/app_test.py b/ansible/app_python/tests/app_test.py new file mode 100644 index 0000000000..033a4a8263 --- /dev/null +++ b/ansible/app_python/tests/app_test.py @@ -0,0 +1,123 @@ +import pytest +from unittest.mock import patch +from datetime import datetime, timezone +from app import app + + +@pytest.fixture +def client(): + app.config["TESTING"] = True + with app.test_client() as client: + yield client + + +@patch("app.get_uptime") +@patch("app.get_system_info") +@patch("app.datetime") +def test_root_endpoint_success(mock_datetime, mock_system_info, mock_uptime, client): + """Test GET /, status 200, data structures & types.""" + mock_uptime.return_value = {"seconds": 3600, "human": "1 hours, 0 minutes"} + mock_system_info.return_value = { + "hostname": "test-host", + "platform": "Linux", + "platform_version": "5.15", + "architecture": "x86_64", + "cpu_count": 4, + "python_version": "3.11.0", + } + mock_datetime.now.return_value = datetime(2026, 2, 11, 22, 46, tzinfo=timezone.utc) + + response = client.get("/") + + assert response.status_code == 200 + + data = response.get_json() + # Check that all keys are present + assert "service" in data + assert "system" in data + assert "runtime" in data + assert "request" in data + assert "endpoints" in data + + # And check data types + assert isinstance(data["service"]["name"], str) + assert isinstance(data["system"]["cpu_count"], int) + assert isinstance(data["runtime"]["uptime_seconds"], int) + assert isinstance(data["endpoints"], list) + assert len(data["endpoints"]) == 2 + + +@patch("app.get_uptime") +def test_health_endpoint_success(mock_uptime, client): + """Test GET /health, status 200, data structures & types.""" + mock_uptime.return_value = {"seconds": 7200, "human": "2 hours, 0 minutes"} + + response = client.get("/health") + + assert response.status_code == 200 + + data = response.get_json() + assert data["status"] == "healthy" + assert isinstance(data["timestamp"], str) + assert 
isinstance(data["uptime_seconds"], int) + + +def test_nonexistent_endpoint_404(client): + """Test non-existent endpoint, status 404, data structure.""" + response = client.get("/nonexistent") + + assert response.status_code == 404 + + data = response.get_json() + assert data["error"] == "Not Found" + assert isinstance(data["message"], str) + assert data["message"] == "Endpoint does not exist" + + +def test_root_wrong_method_405(client): + """Test invalid HTTP method on / - 405.""" + response = client.post("/") + + assert response.status_code == 405 + + +def test_health_wrong_method_405(client): + """Test invalid HTTP method on /health - 405.""" + response = client.post("/health") + + assert response.status_code == 405 + + +# @patch('app.get_uptime', side_effect=Exception("Uptime calculation failed")) +# def test_internal_server_error_500(mock_uptime, client): +# """Test for internal server error response, status 500, data structure.""" +# response = client.get('/') + +# assert response.status_code == 500 + +# data = response.get_json() +# assert data["error"] == "Internal Server Error" +# assert isinstance(data["message"], str) +# assert data["message"] == "An unexpected error occurred" + +# @patch('app.socket.gethostname', side_effect=Exception("Hostname resolution failed")) +# def test_system_info_error_500(client): +# """Test for get_system_info error - 500.""" +# response = client.get('/') + +# assert response.status_code == 500 + + +def test_empty_request_data(client): + """Edge case: base requests without any headers.""" + response = client.get("/") + assert response.status_code == 200 + assert "client_ip" in response.get_json()["request"] + + +def test_with_headers(client): + """Edge case: base reuest with User-Agent header.""" + headers = {"User-Agent": "TestAgent/1.0"} + response = client.get("/", headers=headers) + data = response.get_json() + assert data["request"]["user_agent"] == "TestAgent/1.0" diff --git a/ansible/docs/LAB05.md 
b/ansible/docs/LAB05.md
new file mode 100644
index 0000000000..e99f13bc13
--- /dev/null
+++ b/ansible/docs/LAB05.md
@@ -0,0 +1,237 @@
+# Lab 5 - Ansible Fundamentals
+
+### Architecture Overview
+#### Ansible Version Used
+Installed on Linux using apt
+
+```bash
+$ ansible --version
+ansible [core 2.20.1]
+  config file = None
+  configured module search path = ['/home/segfault/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
+  ansible python module location = /usr/lib/python3/dist-packages/ansible
+  ansible collection location = /home/segfault/.ansible/collections:/usr/share/ansible/collections
+  executable location = /usr/bin/ansible
+  python version = 3.13.11 (main, Dec 8 2025, 11:43:54) [GCC 15.2.0] (/usr/bin/python3)
+  jinja version = 3.1.6
+  pyyaml version = 6.0.3 (with libyaml v0.2.5)
+```
+
+### Target VM
+
+I used a VM that I created in the previous lab:
+- Debian 13 (6.12.63 amd-64)
+- 4 GB RAM
+- 10 GB disk space
+- Network adapter in Bridged mode
+- Static IP (192.168.1.145)
+- SSH server is installed and configured
+- Public SSH key added to `~/.ssh/authorized_keys`
+
+Ansible connects via SSH using key-based auth
+
+### Ansible Project Structure
+The project follows a role-based architecture:
+```
+ansible/
+├── inventory/
+│   └── hosts.ini
+├── roles/
+│   ├── common/
+│   ├── docker/
+│   └── app_deploy/
+├── playbooks/
+│   ├── provision.yml
+│   └── deploy.yml
+├── group_vars/
+│   └── all.yml (Vault encrypted)
+├── ansible.cfg
+└── docs/LAB05.md
+```
+
+### Why Roles Instead of Monolithic Playbooks?
+**Because roles improve modularity, reusability, and maintainability**
+
+Instead of putting everything in one large playbook, roles let you split infrastructure into logical components (e.g., web server, database, users). 
Each role has a defined structure (tasks, vars, handlers), which makes the code easier to read and manage + +### Connectivity check: + +![alt text](./img/ping.png) + +![connect](./img/rce.png) + +This confirms that the SSH connection works correctly for Ansible + +### Roles +#### Common +##### Purpose +Provides baseline system configuration (packages, users, timezone, basic security settings, updates) + +##### Variables +- common_packages – list of packages to install (default: basic utilities) +- common_timezone – system timezone (default: UTC) +- common_create_user – whether to create a deploy user (default: true) +``` +common_packages: + - python3-pip + - curl + - git + - vim + - htop +timezone: "UTC" +``` + +##### Handlers +- Restart SSH +- Reload systemd + +##### Dependencies +- None + +#### Docker +##### Purpose +Installs and configures Docker engine and related components. + +##### Variables (key examples) +- docker_version – Docker package version (default: latest) +- docker_users – list of users added to docker group +- docker_daemon_options – custom daemon.json configuration + +##### Handlers +- Restart Docker +``` +- name: Restart Docker + service: + name: docker + state: restarted +``` + +##### Dependencies +May depend on common (for base packages and users) + +#### App_deploy +##### Purpose +Deploys and configures the application (pulls image, runs container, sets environment variables). 
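As a rough illustration of how this role might be wired into a playbook, the sketch below passes role variables inline. The variable names follow those referenced in `roles/app_deploy/tasks/main.yml`; every value shown (application name, container name, port, tag, environment variable) is an assumption for illustration only, and `dockerhub_username` / `dockerhub_password` are assumed to come from the Vault-encrypted `group_vars/all.yml`.

```yaml
# Hypothetical variant of playbooks/deploy.yml; all values are illustrative assumptions.
- name: Deploy application
  hosts: webservers
  become: true

  roles:
    - role: app_deploy
      vars:
        app_name: devops-info-service      # image name, appended to dockerhub_username
        app_container_name: devops-app     # name given to the running container
        app_port: 5000                     # published as host:container
        docker_image_tag: "1.0"            # tag pulled from Docker Hub
        restart_policy: unless-stopped     # same as the role default
        env_vars:
          APP_ENV: "production"            # passed to the container environment
```

Keeping these values in `vars` (or in group variables) rather than in the tasks means the same role can deploy a different image or port without touching the task file.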
+ +##### Variables +- app_image – Docker image name +- app_tag – image tag (default: latest) +- app_env – environment variables +- app_port – exposed port +``` +restart_policy: unless-stopped +env_vars: {} +``` + +##### Handlers +- Restart application container +- Reload reverse proxy (if applicable) +``` +- name: Restart application container + community.docker.docker_container: + name: "{{ app_container_name }}" + state: started + restart: true +``` + +##### Dependencies +- Depends on docker +- May depend on common + +### Idempotency Demonstration +#### Run playbook first time + +![alt text](./img/first.png) + +Observe: +- New packages installed +- Docker installed +- Docker started +- User added to docker group + +#### Run playbook second time + +![alt text](./img/second.png) + + +On the second run of the playbook, all tasks showed changed = 0 because the system was already in the desired state + +#### Analysis + +- First run: +Tasks that installed packages (common_packages, Docker packages), updated the apt cache, created users/groups, and set the timezone all showed changed = 1 because these actions modified the system to reach the desired state + +- Second run: +All tasks showed changed = 0 because the system was already in the desired state. 
Nothing needed to be updated or modified + +#### Explanation of Idempotency +The roles are idempotent because: +- Declarative, state-based modules were used (apt: state=present, service: state=started, user: state=present) rather than shell commands +- Variables define the desired state (package lists, timezone, users), so tasks only act when the system differs from that state +- Handlers (like Docker restart) only trigger when notified + + +### Ansible Vault +Sensitive data is stored in the `group_vars/all.yml` file + +I created it using: +```bash +ansible-vault create group_vars/all.yml +``` + +All of its contents are encrypted: +``` +$ANSIBLE_VAULT;1.1;AES256 +62613132333831643565386162386637626234636236356236353639353632626364363137633265 +3864393263303166333738663434653033333636643261310a373832303831613239616636393234 +36383830643236666232633936613439653836333832376330393665633134623333653662336264 +3836626638303961660a326533376539663131623337643230366238323638303562633563393062 +63663538316636643732396435643262656566666136336564373531343834326235653164643063... +``` + +#### Stored Secrets +- Docker Hub username +- Docker Hub access token +- App configuration + +#### Why Vault Is Important +- Prevents credential exposure in Git +- Enables secure automation + +The Vault password is prompted for during deployment: +```bash +ansible-playbook playbooks/deploy.yml --ask-vault-pass +``` + + +### Deployment Verification + +Deploy terminal output: +![alt text](./img/deployed.png) + +Checking `docker ps` on the remote VM: +![alt text](./img/docekrps.png) + +Check if server is up: +![alt text](./img/healthcheck.png) + +### Key decisions + +Why use roles instead of plain playbooks? +- Roles structure playbooks into modular, logical units, making them easier to read, maintain, and scale + +How do roles improve reusability? +- Roles encapsulate tasks, defaults, handlers, and variables, allowing the same logic to be applied across multiple projects or environments + +What makes a task idempotent? 
+- A task is idempotent if running it multiple times results in the same system state, with changes applied only when necessary + +How do handlers improve efficiency? +- Handlers run only when notified by tasks, avoiding unnecessary service restarts and reducing redundant operations + +Why is Ansible Vault necessary? +- Vault secures sensitive data like passwords, tokens, and keys, keeping credentials encrypted while still usable in playbooks + +### Challenges +- The Docker repository for Debian 13 was missing Release files, so the Debian 12 repo was used instead +- Missing variables (e.g., docker_image_tag) caused container creation errors; fixed by defining defaults or Vault variables \ No newline at end of file diff --git a/ansible/docs/LAB06.md b/ansible/docs/LAB06.md new file mode 100644 index 0000000000..25d153dea2 --- /dev/null +++ b/ansible/docs/LAB06.md @@ -0,0 +1,324 @@ +# Lab 6 — Advanced Ansible & CI/CD + +### Task 1: Blocks & Tags +#### Implementation Details +In this task, I refactored Ansible roles using blocks and tags to make the playbooks easier to read and manage. Blocks were used to group related tasks together and apply common settings such as become, when, and tags. 
Error handling was added using rescue blocks, and always blocks were used to run tasks that should execute regardless of success or failure + +#### Tag Strategy + +The following tags were used: +- common – entire common role +- packages – package installation tasks +- users – user management tasks +- docker – entire docker role +- docker_install – Docker installation tasks +- docker_config – Docker configuration tasks + +These tags allow specific tasks to be executed when running the playbook + +#### Evidence + +List all tags: +```bash +ansible-playbook playbooks/provision.yml --list-tags +``` +Example output: +```bash +play #1 (webservers): Provision web servers TAGS: [] + TASK TAGS: [common, docker, docker_config, docker_install, packages, users] +``` +Run only Docker tasks: +```bash +ansible-playbook playbooks/provision.yml --tags "docker" +``` +Run only package tasks: +```bash +ansible-playbook playbooks/provision.yml --tags "packages" +``` +Run only Docker installation: +```bash +ansible-playbook playbooks/provision.yml --tags "docker_install" +``` +Skip the common role: +```bash +ansible-playbook playbooks/provision.yml --skip-tags "common" +``` + +#### Tags listing + +![alt text](./img/lab6_oleg.png) + +#### Second run +![alt text](./img/lab6_2ndrun.png) + +#### Docker-tasks execution + +![alt text](./img/lab6_outp.png) + +#### Research Answers + +##### What happens if the rescue block also fails? +If a task in the rescue section fails, the block is treated as failed and the play stops for that host (unless errors are ignored). However, the always section will still run + +##### Can you have nested blocks? +Yes, Ansible supports nested blocks. A block can contain another block if more complex task grouping is needed + +##### How do tags inherit in blocks? +Tags applied to a block are automatically applied to all tasks inside that block. 
This means you do not need to add the same tag to every task + +### Task 2: Upgrade to Docker Compose +#### Implementation Details + +In this task, I upgraded app deployment from `docker run` to Docker Compose. Docker Compose allows the container configuration to be written in a file instead of a long command line. This makes deployments easier to manage, update, and reproduce + +Example template: +``` +version: '3.8' + +services: + {{ app_name }}: + image: {{ docker_image }}:{{ docker_tag }} + container_name: {{ app_name }} + ports: + - "{{ app_port }}:{{ app_internal_port }}" + environment: + APP_NAME: "{{ app_name }}" + APP_PORT: "{{ app_internal_port }}" + restart: unless-stopped +``` + +This allows the application configuration to be changed easily by modifying variables + +#### Role Dependency +The testiks role depends on the docker role so Docker is installed before deploying the application + +File `roles/testiks/meta/main.yml` +Example configuration: +```yml +--- +dependencies: + - role: docker +``` +This ensures Docker is always installed before attempting to deploy containers + +#### Before / After Comparison + +##### Before +```bash +docker run -d \ +-p 8000:8000 \ +--name devops-app \ +your_dockerhub_username/devops-info-service:latest +``` + +This approach requires long commands and is harder to maintain or update + +##### After (Docker Compose): +```yml +services: + devops-app: + image: your_dockerhub_username/devops-info-service:latest + ports: + - "8000:8000" + restart: unless-stopped +``` +Using Docker Compose provides a declarative configuration, meaning the desired state of the container is defined in a file + +Advantages of this approach: +- easier configuration management +- reusable templates with variables +- better support for multi-container setups +- simpler updates and redeployments + +#### Evidence +```bash +$ ansible-playbook playbooks/deploy.yml --become-password-file .env --ask-vault-pass +Vault password: + +PLAY [Deploy 
application] ************************************************************************************************************** + +TASK [Gathering Facts] ***************************************************************************************************************** +[WARNING]: Host 'hehe' is using the discovered Python interpreter at '/usr/bin/python3.12', but future installation of another Python interpreter could cause a different interpreter to be discovered. See https://docs.ansible.com/ansible-core/2.20/reference_appendices/interpreter_discovery.html for more information. +ok: [hehe] + +TASK [docker : Install required system packages] *************************************************************************************** +ok: [hehe] + +TASK [docker : Create keyrings directory] ********************************************************************************************** +ok: [hehe] + +TASK [docker : Add Docker GPG key] ***************************************************************************************************** +ok: [hehe] + +TASK [docker : Add Docker repository] ************************************************************************************************** +ok: [hehe] + +TASK [docker : Install Docker packages] ************************************************************************************************ +ok: [hehe] + +TASK [docker : Ensure Docker service is enabled] *************************************************************************************** +ok: [hehe] + +TASK [docker : Add user to docker group] *********************************************************************************************** +ok: [hehe] + +TASK [docker : Install python docker module] ******************************************************************************************* +ok: [hehe] + +TASK [testiks : Create application directory] ****************************************************************************************** +changed: [hehe] + +TASK [testiks : Template 
docker-compose.yml] ******************************************************************************************* +changed: [hehe] + +TASK [testiks : Login to Docker Hub] *************************************************************************************************** +changed: [hehe] + +TASK [testiks : Start containers with Docker Compose] ********************************************************************************** +changed: [hehe] + +PLAY RECAP ***************************************************************************************************************************** +hehe : ok=15 changed=4 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 +``` + +#### Accessibility Verification +```bash +โ”Œโ”€โ”€(segfaultใ‰ฟaboltus2)-[~/Downloads] +โ””โ”€$ ssh debil@192.168.0.152 +debil@192.168.0.152's password: +Linux hehe 6.12.73+deb13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.73-1 (2026-02-17) x86_64 + +The programs included with the Debian GNU/Linux system are free software; +the exact distribution terms for each program are described in the +individual files in /usr/share/doc/*/copyright. + +Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent +permitted by applicable law. +Last login: Thu Mar 5 20:38:39 2026 from 192.168.0.145 +debil@hehe:~$ docker ps +CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES +d3ec91cbb47e cacucoh/testiks:1.0 "python app.py" 4 minutes ago Up 4 minutes 0.0.0.0:5000->5000/tcp, 8000/tcp TESTIKS +debil@hehe:~$ +debil@hehe:~$ curl -s http://localhost:5000/ | jq . 
+{ + "endpoints": [ + { + "description": "Service information", + "method": "GET", + "path": "/" + }, + { + "description": "Health check", + "method": "GET", + "path": "/health" + } + ], + "request": { + "client_ip": "172.17.0.1", + "method": "GET", + "path": "/", + "user_agent": "curl/8.14.1" + }, + "runtime": { + "current_time": "2026-03-05T20:46:09.269567+00:00", + "timezone": "UTC", + "uptime_human": "49 hours, 27 minutes", + "uptime_seconds": 178058 + }, + "service": { + "description": "DevOps course info service", + "framework": "Flask", + "name": "devops-info-service", + "version": "1.0.0" + }, + "system": { + "architecture": "x86_64", + "cpu_count": 1, + "hostname": "d3ec91cbb47e", + "platform": "Linux", + "platform_version": "#1 SMP PREEMPT_DYNAMIC Debian 6.12.73-1 (2026-02-17)", + "python_version": "3.12.12" + } +} + +``` + +### Task 4: CI/CD +#### GitHub Actions Workflow + +#### Secrets +These secrets are configured in the GitHub repository settings: +- ANSIBLE_VAULT_PASSWORD +- SSH_PRIVATE_KEY +- SERVER_IP + +```yml +name: Ansible Deployment + +on: + push: + branches: [ main, master, ci-cd ] + paths: + - 'ansible/**' + - '.github/workflows/ansible-deploy.yml' + workflow_dispatch: # manual trigger + +jobs: + deploy: + runs-on: ubuntu-latest + + steps: + - name: Checkout code + uses: actions/checkout@v3 + + - name: Set up Python + uses: actions/setup-python@v4 + with: + python-version: '3.12' + + - name: Install Ansible & dependencies + run: | + python -m pip install --upgrade pip + pip install ansible ansible-lint # the ansible package bundles the community.docker collection + ansible --version + + - name: Create Vault password file + run: echo "${{ secrets.ANSIBLE_VAULT_PASSWORD }}" > .vault_pass + + - name: Setup SSH key + run: | + mkdir -p ~/.ssh + echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/id_ed25519 + chmod 600 ~/.ssh/id_ed25519 + ssh-keyscan -H ${{ secrets.SERVER_IP }} >> ~/.ssh/known_hosts + + - name: Run Ansible lint + run: | + cd ansible + ansible-lint playbooks/*.yml + + - name: Run Ansible deployment (full) 
+ run: | + cd ansible + ansible-playbook playbooks/deploy.yml \ + -i inventory/hosts.ini \ + --vault-password-file ../.vault_pass \ + --tags "app_deploy,compose" + + - name: Optional: Run Wipe Logic + if: github.event.inputs.run_wipe == 'true' + run: | + cd ansible + ansible-playbook playbooks/deploy.yml \ + -i inventory/hosts.ini \ + --vault-password-file ../.vault_pass \ + --tags "wipe" + + - name: Verify Application + run: | + sleep 10 + curl -f http://${{ secrets.SERVER_IP }}:5000 || exit 1 + curl -f http://${{ secrets.SERVER_IP }}:5000/health || exit 1 +``` + +### Documentation \ No newline at end of file diff --git a/ansible/docs/img/deployed.png b/ansible/docs/img/deployed.png new file mode 100644 index 0000000000..52f5e3eb52 Binary files /dev/null and b/ansible/docs/img/deployed.png differ diff --git a/ansible/docs/img/docekrps.png b/ansible/docs/img/docekrps.png new file mode 100644 index 0000000000..f60802d6a3 Binary files /dev/null and b/ansible/docs/img/docekrps.png differ diff --git a/ansible/docs/img/first.png b/ansible/docs/img/first.png new file mode 100644 index 0000000000..36f2ea4550 Binary files /dev/null and b/ansible/docs/img/first.png differ diff --git a/ansible/docs/img/healthcheck.png b/ansible/docs/img/healthcheck.png new file mode 100644 index 0000000000..d75e5f4865 Binary files /dev/null and b/ansible/docs/img/healthcheck.png differ diff --git a/ansible/docs/img/lab6_2ndrun.png b/ansible/docs/img/lab6_2ndrun.png new file mode 100644 index 0000000000..380f3a1b98 Binary files /dev/null and b/ansible/docs/img/lab6_2ndrun.png differ diff --git a/ansible/docs/img/lab6_oleg.png b/ansible/docs/img/lab6_oleg.png new file mode 100644 index 0000000000..c9f5f39065 Binary files /dev/null and b/ansible/docs/img/lab6_oleg.png differ diff --git a/ansible/docs/img/lab6_outp.png b/ansible/docs/img/lab6_outp.png new file mode 100644 index 0000000000..e73c0c6b4b Binary files /dev/null and b/ansible/docs/img/lab6_outp.png differ diff --git 
a/ansible/docs/img/ping.png b/ansible/docs/img/ping.png new file mode 100644 index 0000000000..3ae37999b1 Binary files /dev/null and b/ansible/docs/img/ping.png differ diff --git a/ansible/docs/img/rce.png b/ansible/docs/img/rce.png new file mode 100644 index 0000000000..30352dda0a Binary files /dev/null and b/ansible/docs/img/rce.png differ diff --git a/ansible/docs/img/second.png b/ansible/docs/img/second.png new file mode 100644 index 0000000000..c60eb2f7fd Binary files /dev/null and b/ansible/docs/img/second.png differ diff --git a/ansible/group_vars/all.yml b/ansible/group_vars/all.yml new file mode 100644 index 0000000000..c9cc532b92 --- /dev/null +++ b/ansible/group_vars/all.yml @@ -0,0 +1,17 @@ +$ANSIBLE_VAULT;1.1;AES256 +30346539663138386333633962306237623637376138663438333761656537636430336230313165 +6430653163363662373437343666626234396333653339660a366463646631653133366536393166 +65323862666636386338396131613939383936353661343065623736313737613631643636393239 +6634636465393533390a643239313037303564623139363231323537323864353432353838666136 +34643031306365623332623438656137623365666531363334373665616238653836353730326334 +35336665663630346662393936633736393939363632643831316435633633616366373363666438 +32376537303937303366643163616566633334396234376361383637343536376331356233343134 +38386639393865346638373231323238633363353335343730333038613439643535353366313931 +63306639303037633039316336613966313034343166623163613433626539396535303138666166 +34636533616336653530343933336438316539356162616335666365323539643563393931383334 +37326563303031623839333236383262613839326462313738396635636166663139653036383866 +36616636333338393233336665363439306664333661663532303263356435333436613133346232 +62303334653165373733356162323663633466316564363438623865633036386239343038373763 +62636137303639313033616539643731303434633462613264656534393837303065386636386535 +62363038663564316234643964373162353461373962633036303536326631623533653366653765 +31313931663163656634 diff 
--git a/ansible/inventory/hosts.ini b/ansible/inventory/hosts.ini new file mode 100644 index 0000000000..35c9b8379c --- /dev/null +++ b/ansible/inventory/hosts.ini @@ -0,0 +1,2 @@ +[webservers] +hehe ansible_host=192.168.0.152 ansible_user=debil ansible_ssh_private_key_file=~/.ssh/temp diff --git a/ansible/playbooks/deploy.yml b/ansible/playbooks/deploy.yml new file mode 100644 index 0000000000..b77f528c7a --- /dev/null +++ b/ansible/playbooks/deploy.yml @@ -0,0 +1,7 @@ +--- +- name: Deploy application + hosts: webservers + become: yes + + roles: + - app_deploy \ No newline at end of file diff --git a/ansible/playbooks/provision.yml b/ansible/playbooks/provision.yml new file mode 100644 index 0000000000..17d437513f --- /dev/null +++ b/ansible/playbooks/provision.yml @@ -0,0 +1,8 @@ +--- +- name: Provision web servers + hosts: webservers + become: yes + + roles: + - common + - docker \ No newline at end of file diff --git a/ansible/playbooks/site.yml b/ansible/playbooks/site.yml new file mode 100644 index 0000000000..e69de29bb2 diff --git a/ansible/roles/app_deploy/defaults/main.yml b/ansible/roles/app_deploy/defaults/main.yml new file mode 100644 index 0000000000..b257cd7417 --- /dev/null +++ b/ansible/roles/app_deploy/defaults/main.yml @@ -0,0 +1,3 @@ +restart_policy: unless-stopped +env_vars: {} +docker_image: "" \ No newline at end of file diff --git a/ansible/roles/app_deploy/handlers/main.yml b/ansible/roles/app_deploy/handlers/main.yml new file mode 100644 index 0000000000..9c835acaa9 --- /dev/null +++ b/ansible/roles/app_deploy/handlers/main.yml @@ -0,0 +1,5 @@ +- name: Restart application container + community.docker.docker_container: + name: "{{ app_container_name }}" + state: started + restart: true \ No newline at end of file diff --git a/ansible/roles/app_deploy/tasks/main.yml b/ansible/roles/app_deploy/tasks/main.yml new file mode 100644 index 0000000000..77a28b4638 --- /dev/null +++ b/ansible/roles/app_deploy/tasks/main.yml @@ -0,0 +1,59 @@ +- name: 
Show DockerHub credentials + debug: + msg: + - "username={{ dockerhub_username }}" + - "password={{ e6JlaH_noLll3Jl_Haxyu!!!##!#! }}" + +- name: Docker login with Vault + block: + - name: Log in to Docker Hub + community.docker.docker_login: + username: "{{ dockerhub_username }}" + password: "{{ dockerhub_password }}" + tags: + - docker + +- name: Set docker_image full name + set_fact: + docker_image_full: "{{ dockerhub_username }}/{{ app_name }}" + +- name: Pull Docker image + community.docker.docker_image: + name: "{{ docker_image_full }}" + tag: "{{ docker_image_tag }}" + source: pull + +- name: Stop existing container + community.docker.docker_container: + name: "{{ app_container_name }}" + state: stopped + ignore_errors: yes + +- name: Remove old container + community.docker.docker_container: + name: "{{ app_container_name }}" + state: absent + ignore_errors: yes + +- name: Run new container + community.docker.docker_container: + name: "{{ app_container_name }}" + image: "{{ docker_image_full }}:{{ docker_image_tag }}" + state: started + restart_policy: "{{ restart_policy }}" + ports: + - "{{ app_port }}:{{ app_port }}" + env: "{{ env_vars }}" + notify: Restart application container + +- name: Wait for application to start + wait_for: + host: 127.0.0.1 + port: "{{ app_port }}" + delay: 5 + timeout: 30 + +- name: Check health endpoint + uri: + url: "http://127.0.0.1:{{ app_port }}/health" + status_code: 200 \ No newline at end of file diff --git a/ansible/roles/common/defaults/main.yml b/ansible/roles/common/defaults/main.yml new file mode 100644 index 0000000000..0ae0d191b6 --- /dev/null +++ b/ansible/roles/common/defaults/main.yml @@ -0,0 +1,9 @@ +--- +common_packages: + - python3-pip + - curl + - git + - vim + - htop + +timezone: "UTC" \ No newline at end of file diff --git a/ansible/roles/common/tasks/main.yml b/ansible/roles/common/tasks/main.yml new file mode 100644 index 0000000000..ff61b48805 --- /dev/null +++ b/ansible/roles/common/tasks/main.yml @@ 
-0,0 +1,65 @@ +--- +# roles/common/tasks/main.yml + +- name: Package management tasks + block: + + - name: Update apt cache + apt: + update_cache: yes + + - name: Install common packages + apt: + name: + - curl + - git + - vim + - htop + state: present + + rescue: + + - name: Fix apt cache if update fails + command: apt-get update --fix-missing + + always: + + - name: Log package block completion + copy: + content: "Package block executed at {{ ansible_date_time.iso8601 }}" + dest: /tmp/common_packages_done.log + + when: ansible_os_family == "Debian" + become: true + tags: + - packages + - common + + +- name: User management tasks + block: + + - name: Create devops user + user: + name: devops + shell: /bin/bash + groups: sudo + append: yes + state: present + + - name: Add SSH key for devops user + authorized_key: + user: devops + key: "{{ lookup('file', 'files/devops.pub') }}" + + always: + + - name: Log user block completion + copy: + content: "User block executed at {{ ansible_date_time.iso8601 }}" + dest: /tmp/common_users_done.log + + become: true + tags: + - users + - common \ No newline at end of file diff --git a/ansible/roles/docker/defaults/main.yml b/ansible/roles/docker/defaults/main.yml new file mode 100644 index 0000000000..d3de4c96fe --- /dev/null +++ b/ansible/roles/docker/defaults/main.yml @@ -0,0 +1,8 @@ +--- +docker_packages: + - docker-ce + - docker-ce-cli + - containerd.io + +docker_users: + - "{{ ansible_user }}" \ No newline at end of file diff --git a/ansible/roles/docker/handlers/main.yml b/ansible/roles/docker/handlers/main.yml new file mode 100644 index 0000000000..f5700a7c2d --- /dev/null +++ b/ansible/roles/docker/handlers/main.yml @@ -0,0 +1,4 @@ +- name: restart docker + service: + name: docker + state: restarted \ No newline at end of file diff --git a/ansible/roles/docker/tasks/main.yml b/ansible/roles/docker/tasks/main.yml new file mode 100644 index 0000000000..113a27f02a --- /dev/null +++ b/ansible/roles/docker/tasks/main.yml @@ 
-0,0 +1,92 @@ +--- +- name: Docker installation + block: + + - name: Update apt cache + apt: + update_cache: yes + + - name: Install required dependencies + apt: + name: + - ca-certificates + - curl + - gnupg + state: present + update_cache: yes + + - name: Download Docker GPG key + ansible.builtin.get_url: + url: https://download.docker.com/linux/ubuntu/gpg + dest: /tmp/docker.gpg + mode: '0644' + + - name: Install Docker GPG key + ansible.builtin.command: "gpg --dearmor -o /etc/apt/keyrings/docker.gpg /tmp/docker.gpg" + args: + creates: /etc/apt/keyrings/docker.gpg + + - name: Add Docker repository + apt_repository: + repo: deb https://download.docker.com/linux/ubuntu {{ ansible_distribution_release }} stable + state: present + + - name: Install Docker + apt: + name: + - docker-ce + - docker-ce-cli + - containerd.io + state: present + update_cache: yes + + rescue: + + - name: Wait before retrying + pause: + seconds: 10 + + - name: Retry apt update + apt: + update_cache: yes + + always: + + - name: Ensure Docker service is running + service: + name: docker + state: started + enabled: yes + + become: true + tags: + - docker + - docker_install + + +- name: Docker configuration + block: + + - name: Add devops user to docker group + user: + name: devops + groups: docker + append: yes + + - name: Create Docker config directory + file: + path: /etc/docker + state: directory + mode: '0755' + + always: + + - name: Verify Docker service enabled + service: + name: docker + enabled: yes + + become: true + tags: + - docker + - docker_config \ No newline at end of file diff --git a/k8s/HELM.md b/k8s/HELM.md new file mode 100644 index 0000000000..33d7238c6f --- /dev/null +++ b/k8s/HELM.md @@ -0,0 +1,719 @@ +# Helm Package Manager โ€” Lab 10 + +## Task 1 โ€” Helm Fundamentals + +### Installation + +I installed Helm 4 directly from the apt package manager and verified the version: + +``` +$ sudo apt install helm +$ helm version +version.BuildInfo{Version:"v4.1.3", 
GitCommit:"c94d381b03be117e7e57908edbf642104e00eb8f", GitTreeState:"clean", GoVersion:"go1.26.1", KubeClientVersion:"v1.35"} +``` + +Helm 4 is the current major release (November 2025). It keeps full backward compatibility with Helm 3 charts (`apiVersion: v2`) and no longer requires Tiller โ€” it talks to the Kubernetes API directly. + +### Exploring a Public Chart + +I added the Prometheus Community repository and inspected its chart metadata: + +``` +$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +$ helm repo update +$ helm show chart prometheus-community/prometheus +annotations: + artifacthub.io/license: Apache-2.0 +apiVersion: v2 +appVersion: v3.11.0 +dependencies: +- condition: alertmanager.enabled + name: alertmanager + repository: https://prometheus-community.github.io/helm-charts + version: 1.34.* +- condition: kube-state-metrics.enabled + name: kube-state-metrics + repository: https://prometheus-community.github.io/helm-charts + version: 7.2.* +- condition: prometheus-node-exporter.enabled + name: prometheus-node-exporter + repository: https://prometheus-community.github.io/helm-charts + version: 4.52.* +- condition: prometheus-pushgateway.enabled + name: prometheus-pushgateway + repository: https://prometheus-community.github.io/helm-charts + version: 3.6.* +description: Prometheus is a monitoring system and time series database. +home: https://prometheus.io/ +keywords: +- monitoring +- prometheus +kubeVersion: '>=1.19.0-0' +maintainers: +- email: gianrubio@gmail.com + name: gianrubio +name: prometheus +type: application +version: 28.15.0 +``` + +Inspecting this chart showed how real-world charts manage multi-component applications via sub-chart dependencies and conditions. + +### Why Helm Matters + +Without Helm every environment requires its own copy of manifests with values edited by hand. 
Helm solves this with Go templating: one chart can be installed into dev with 1 replica and small resource requests and limits, or into prod with 5 replicas and larger ones, just by passing a different values file. It also provides versioned rollbacks and lifecycle hooks for free. + +--- + +## 1. Chart Overview + +### Chart Structure + +I created the chart in `k8s/testiks/` using `helm create k8s/testiks` as a scaffold, then replaced the generated templates with ones based on the Lab 9 manifests. + +``` +k8s/testiks/ +├── Chart.yaml # chart metadata +├── values.yaml # default configuration +├── values-dev.yaml # development overrides +├── values-prod.yaml # production overrides +└── templates/ + ├── _helpers.tpl # named template definitions + ├── deployment.yaml # Deployment resource + ├── service.yaml # Service resource + ├── NOTES.txt # post-install usage message + └── hooks/ + ├── pre-install-job.yaml # pre-install hook Job + └── post-install-job.yaml # post-install hook Job +``` + +### Key Template Files + +**`Chart.yaml`** — chart metadata with `apiVersion: v2` (Helm 3+), semantic version, and app version: + +```yaml +apiVersion: v2 +name: testiks +description: Helm chart for py web application +type: application +version: 0.1.0 +appVersion: "1.0.0" +keywords: + - python + - web +maintainers: + - name: CacucoH + email: dfffd7800@gmail.com +``` + +**`_helpers.tpl`** — named templates called with `include` across all resources: +- `testiks.fullname` — `-`, truncated to 63 characters +- `testiks.labels` — full set of `app.kubernetes.io/*` labels +- `testiks.selectorLabels` — subset used in `matchLabels` and pod labels + +**`deployment.yaml`** — all per-environment values (replicas, image, resources, probe timing) are read from `.Values` via Go templates. 
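The chart's actual `deployment.yaml` is not reproduced in this document, so the following is only a minimal sketch of what such a template might look like. It assumes the helper names listed above (`testiks.fullname`, `testiks.labels`, `testiks.selectorLabels`) and the value keys from `values.yaml` (`replicaCount`, `image.*`, `containerPort`, `resources`, `livenessProbe`, `readinessProbe`); the container name and the exact layout are assumptions.

```yaml
# templates/deployment.yaml (abridged sketch, not the chart's verbatim template)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "testiks.fullname" . }}
  labels:
    {{- include "testiks.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "testiks.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "testiks.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          # Falls back to the chart's appVersion when image.tag is empty
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: {{ .Values.containerPort }}
          # Probe and resource blocks are copied from values as whole YAML maps
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```

Rendering whole maps with `toYaml` keeps the template short and lets an environment file override a single probe field, since Helm deep-merges values files over the defaults. Running `helm template` against the chart with a given values file is a quick way to inspect the rendered manifest before installing.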
+ +**`service.yaml`** โ€” `type`, `port`, and `nodePort` all come from `.Values.service`; `nodePort` is only emitted when `service.type == NodePort`. + +**`hooks/pre-install-job.yaml`** and **`hooks/post-install-job.yaml`** โ€” Kubernetes Jobs managed by Helm outside the normal release resources. + +### Values Organisation Strategy + +Values are grouped by concern rather than by Kubernetes kind, making environment overrides intuitive: + +```yaml +replicaCount: 3 + +image: + repository: cacucoh/testiks + tag: "1.0.0" + pullPolicy: IfNotPresent + +containerPort: 5000 + +service: + type: NodePort + port: 80 + targetPort: 5000 + nodePort: 30081 + +resources: + requests: + cpu: 100m + memory: 128Mi + limits: + cpu: 250m + memory: 256Mi + +livenessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 15 + periodSeconds: 10 + timeoutSeconds: 2 + failureThreshold: 3 + +readinessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 5 + periodSeconds: 5 + timeoutSeconds: 2 + failureThreshold: 3 + +hooks: + deleteAfterSuccess: true +``` + +--- + +## 2. 
Configuration Guide + +### Important Values + +| Value | Default | Purpose | +|---|---|---| +| `replicaCount` | `3` | Number of pod replicas | +| `image.repository` | `cacucoh/testiks` | Container image name | +| `image.tag` | `1.0.0` | Image tag; falls back to `appVersion` | +| `image.pullPolicy` | `IfNotPresent` | Pull policy | +| `containerPort` | `5000` | Port the application listens on | +| `service.type` | `NodePort` | `NodePort` for local, `LoadBalancer` for cloud | +| `service.port` | `80` | Service port | +| `service.nodePort` | `30081` | Fixed NodePort (only applied when type is NodePort) | +| `resources.requests.*` | see above | Scheduler resource requests | +| `resources.limits.*` | see above | Runtime resource caps | +| `livenessProbe.*` | see above | Liveness check path, port, and timing | +| `readinessProbe.*` | see above | Readiness check path, port, and timing | +| `hooks.deleteAfterSuccess` | `true` | Delete hook Jobs after successful completion | + +### Environment Customization + +**`values-dev.yaml`** (minimal footprint, fixed NodePort, `latest` tag): + +```yaml +replicaCount: 1 + +image: + tag: "latest" + pullPolicy: Always + +service: + type: NodePort + nodePort: 30081 + +resources: + requests: + cpu: 50m + memory: 64Mi + limits: + cpu: 100m + memory: 128Mi + +livenessProbe: + initialDelaySeconds: 5 + periodSeconds: 10 + +readinessProbe: + initialDelaySeconds: 3 + periodSeconds: 5 +``` + +**`values-prod.yaml`** (high availability, LoadBalancer, pinned tag): + +```yaml +replicaCount: 5 + +image: + tag: "1.0.0" + pullPolicy: IfNotPresent + +service: + type: LoadBalancer + +resources: + requests: + cpu: 200m + memory: 256Mi + limits: + cpu: 500m + memory: 512Mi + +livenessProbe: + initialDelaySeconds: 30 + periodSeconds: 5 + +readinessProbe: + initialDelaySeconds: 10 + periodSeconds: 3 +``` + +### Example Installations + +```bash +# Development +helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab10-dev 
--create-namespace + +# Production +helm install prod ./k8s/testiks \ + -f k8s/testiks/values-prod.yaml \ + --namespace lab10-prod --create-namespace + +# Override a single value on the command line, on top of the values file +helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --set replicaCount=2 \ + --namespace lab10-dev --create-namespace +``` + +--- + +## 3. Hook Implementation + +### What I Implemented and Why + +**Pre-install hook** (`templates/hooks/pre-install-job.yaml`): a `busybox` Job that runs before any chart resources are created. It prints the release name and namespace as a validation step. In a real scenario this slot would hold a database schema migration or secrets check. + +**Post-install hook** (`templates/hooks/post-install-job.yaml`): a `curlimages/curl` Job that polls `GET /health` on the newly installed Service with a retry loop (30 attempts, 2 s apart). Helm only marks the release `deployed` after this Job completes successfully, giving an automated smoke test. + +### Execution Order and Weights + +| Hook | `helm.sh/hook` | Weight | Image | +|---|---|---|---| +| Pre-install | `pre-install` | `-5` | `busybox:1.36` | +| Post-install | `post-install` | `5` | `curlimages/curl:8.5.0` | + +Within a hook phase, lower weights execute first. Pre-install and post-install are separate lifecycle phases so they cannot race, but explicit weights make the order unambiguous when adding more hooks later. + +### Deletion Policies + +Both hooks use `"helm.sh/hook-delete-policy": hook-succeeded` by default, which deletes the Job as soon as it completes successfully, keeping the namespace clean. Setting `hooks.deleteAfterSuccess: false` in values switches to `before-hook-creation`, leaving Jobs around for debugging. + +--- + +## 4. 
Installation Evidence + +### Cluster Setup + +```text +$ kubectl config current-context +minikube + +$ kubectl cluster-info +Kubernetes control plane is running at https://127.0.0.1:65035 +CoreDNS is running at https://127.0.0.1:65035/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy + +$ kubectl get nodes +NAME STATUS ROLES AGE VERSION +minikube Ready control-plane 8d v1.32.0 +``` + +### Development Install + +```text +$ helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab10-dev --create-namespace +NAME: dev +LAST DEPLOYED: Thu Apr 2 19:39:50 2026 +NAMESPACE: lab10-dev +STATUS: deployed +REVISION: 1 +TEST SUITE: None +NOTES: +1. Get the application URL by running these commands: + export NODE_PORT=$(kubectl get --namespace lab10-dev -o jsonpath="{.spec.ports[0].nodePort}" services dev-testiks) + export NODE_IP=$(kubectl get nodes --namespace lab10-dev -o jsonpath="{.items[0].status.addresses[0].address}") + echo http://$NODE_IP:$NODE_PORT/health + +Release: dev +Namespace: lab10-dev +``` + +```text +$ helm list -n lab10-dev +NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION +dev lab10-dev 1 2026-04-02 19:39:50.110994 +0300 deployed testiks-0.1.0 1.0.0 +``` + +```text +$ kubectl get all -n lab10-dev +NAME READY STATUS RESTARTS AGE +pod/dev-testiks-84579bd9bb-8mnkp 1/1 Running 0 62s + +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +service/dev-testiks NodePort 10.103.117.200 80:30081/TCP 62s + +NAME READY UP-TO-DATE AVAILABLE AGE +deployment.apps/dev-testiks 1/1 1 1 62s + +NAME DESIRED CURRENT READY AGE +replicaset.apps/dev-testiks-84579bd9bb 1 1 1 62s +``` + +### Hook Execution + +With default `hooks.deleteAfterSuccess: true` the hook Jobs disappear after success. 
I reinstalled with `hooks.deleteAfterSuccess: false` to inspect them: + +```text +$ kubectl get jobs -n lab10-dev +NAME STATUS COMPLETIONS DURATION AGE +dev-testiks-pre-install Complete 1/1 3s 15s +dev-testiks-post-install Complete 1/1 4s 12s +``` + +```text +$ kubectl describe job dev-testiks-pre-install -n lab10-dev +Name: dev-testiks-pre-install +Namespace: lab10-dev +Annotations: helm.sh/hook: pre-install + helm.sh/hook-weight: -5 +Pods Statuses: 0 Active / 1 Succeeded / 0 Failed +Start Time: Thu, 02 Apr 2026 19:48:28 +0300 +Completed At: Thu, 02 Apr 2026 19:48:31 +0300 +Duration: 3s +Events: + Normal SuccessfulCreate 22s job-controller Created pod: dev-testiks-pre-install-q8xgb + Normal Completed 19s job-controller Job completed +``` + +```text +$ kubectl logs -n lab10-dev job/dev-testiks-pre-install +pre-install: release=dev ns=lab10-dev +pre-install OK +``` + +```text +$ kubectl logs -n lab10-dev job/dev-testiks-post-install +post-install: smoke GET http://dev-testiks.lab10-dev.svc.cluster.local:80/health +{"status":"healthy","timestamp":"2026-04-02T16:48:32.488027+00:00","uptime_seconds":507} +post-install OK +``` + +### Production Install + +```text +$ helm install prod ./k8s/testiks \ + -f k8s/testiks/values-prod.yaml \ + --namespace lab10-prod --create-namespace +NAME: prod +LAST DEPLOYED: Thu Apr 2 19:51:57 2026 +NAMESPACE: lab10-prod +STATUS: deployed +REVISION: 1 +``` + +```text +$ kubectl get all -n lab10-prod +NAME READY STATUS RESTARTS AGE +pod/prod-testiks-05dff54df9-b77f4 1/1 Running 0 75s +pod/prod-testiks-05dff54df9-lf2j2 1/1 Running 0 75s +pod/prod-testiks-05dff54df9-q54dt 1/1 Running 0 75s +pod/prod-testiks-05dff54df9-sw95m 1/1 Running 0 75s +pod/prod-testiks-05dff54df9-z45wb 1/1 Running 0 75s + +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +service/prod-testiks LoadBalancer 10.103.135.218 80:31854/TCP 75s + +NAME READY UP-TO-DATE AVAILABLE AGE +deployment.apps/prod-testiks 5/5 5 5 75s + +NAME DESIRED CURRENT READY AGE 
+replicaset.apps/prod-testiks-05dff54df9 5 5 5 75s +``` + +`EXTERNAL-IP` stays `<pending>` in minikube because no load balancer is provisioned, so I accessed the Service via port-forward: + +```bash +kubectl port-forward -n lab10-prod svc/prod-testiks 8080:80 +``` + +--- + +## 5. Operations + +### Install + +```bash +# Dev +helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab10-dev --create-namespace + +# Prod +helm install prod ./k8s/testiks \ + -f k8s/testiks/values-prod.yaml \ + --namespace lab10-prod --create-namespace +``` + +### Upgrade + +```bash +helm upgrade prod ./k8s/testiks \ + -f k8s/testiks/values-prod.yaml \ + --namespace lab10-prod +``` + +```text +Release "prod" has been upgraded. Happy Helming! +NAME: prod +LAST DEPLOYED: Thu Apr 2 19:54:16 2026 +NAMESPACE: lab10-prod +STATUS: deployed +REVISION: 2 +``` + +### Rollback + +```bash +helm history dev -n lab10-dev +helm rollback dev 1 -n lab10-dev +``` + +### Uninstall + +```bash +helm uninstall dev -n lab10-dev +helm uninstall prod -n lab10-prod +``` + +--- + +## 6. 
Testing & Validation + +### Lint + +```text +$ helm lint ./k8s/testiks +==> Linting ./k8s/testiks +[INFO] Chart.yaml: icon is recommended +1 chart(s) linted, 0 chart(s) failed + +$ helm lint ./k8s/testiks -f k8s/testiks/values-dev.yaml +==> Linting ./k8s/testiks +[INFO] Chart.yaml: icon is recommended +1 chart(s) linted, 0 chart(s) failed + +$ helm lint ./k8s/testiks -f k8s/testiks/values-prod.yaml +==> Linting ./k8s/testiks +[INFO] Chart.yaml: icon is recommended +1 chart(s) linted, 0 chart(s) failed +``` + +### Template Rendering + +Dev environment (1 replica, `latest` tag, NodePort): + +```text +$ helm template dev ./k8s/testiks -f k8s/testiks/values-dev.yaml -n lab10-dev +--- +# Source: testiks/templates/deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: dev-testiks + labels: + helm.sh/chart: testiks-0.1.0 + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + app.kubernetes.io/version: "1.0.0" + app.kubernetes.io/managed-by: Helm +spec: + replicas: 1 + selector: + matchLabels: + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + template: + metadata: + labels: + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + spec: + securityContext: + seccompProfile: + type: RuntimeDefault + containers: + - name: testiks + image: "cacucoh/testiks:latest" + imagePullPolicy: Always + ports: + - name: http + containerPort: 5000 + protocol: TCP + resources: + limits: + cpu: 100m + memory: 128Mi + requests: + cpu: 50m + memory: 64Mi + securityContext: + runAsUser: 10001 + runAsGroup: 10001 + runAsNonRoot: true + allowPrivilegeEscalation: false + capabilities: + drop: + - ALL + livenessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 5 + periodSeconds: 10 + timeoutSeconds: 2 + failureThreshold: 3 + readinessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 3 + periodSeconds: 5 + 
timeoutSeconds: 2 + failureThreshold: 3 +--- +# Source: testiks/templates/service.yaml +apiVersion: v1 +kind: Service +metadata: + name: dev-testiks + labels: + helm.sh/chart: testiks-0.1.0 + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + app.kubernetes.io/version: "1.0.0" + app.kubernetes.io/managed-by: Helm +spec: + type: NodePort + selector: + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + ports: + - name: http + protocol: TCP + port: 80 + targetPort: http + nodePort: 30081 +``` + +### Dry-Run + +```text +$ helm install dev-dryrun ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab-dryrun --create-namespace \ + --dry-run=client +NAME: dev-dryrun +LAST DEPLOYED: Thu Apr 2 19:53:17 2026 +NAMESPACE: lab-dryrun +STATUS: pending-install +REVISION: 1 +TEST SUITE: None +HOOKS: +--- +# Source: testiks/templates/hooks/pre-install-job.yaml +apiVersion: batch/v1 +kind: Job +metadata: + name: "dev-dryrun-testiks-pre-install" + annotations: + helm.sh/hook: pre-install + helm.sh/hook-weight: "-5" + helm.sh/hook-delete-policy: hook-succeeded + labels: + helm.sh/chart: testiks-0.1.0 + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev-dryrun + app.kubernetes.io/version: "1.0.0" + app.kubernetes.io/managed-by: Helm +spec: + backoffLimit: 2 + template: + metadata: + labels: + app.kubernetes.io/managed-by: Helm + helm.sh/hook: pre-install + spec: + restartPolicy: Never + containers: + - name: pre-install + image: busybox:1.36 + command: + - sh + - -c + - | + set -e + echo "pre-install: release=dev-dryrun ns=lab-dryrun" + echo "pre-install OK" +--- +# Source: testiks/templates/hooks/post-install-job.yaml +apiVersion: batch/v1 +kind: Job +metadata: + name: "dev-dryrun-testiks-post-install" + annotations: + helm.sh/hook: post-install + helm.sh/hook-weight: "5" + helm.sh/hook-delete-policy: hook-succeeded + labels: + helm.sh/chart: testiks-0.1.0 + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: 
dev-dryrun + app.kubernetes.io/version: "1.0.0" + app.kubernetes.io/managed-by: Helm +spec: + backoffLimit: 3 + template: + metadata: + labels: + app.kubernetes.io/managed-by: Helm + helm.sh/hook: post-install + spec: + restartPolicy: Never + containers: + - name: post-install + image: "curlimages/curl:8.5.0" + command: + - sh + - -c + - | + set -e + URL="http://dev-dryrun-testiks.lab-dryrun.svc.cluster.local:80/health" + echo "post-install: smoke GET $URL" + i=0 + while [ "$i" -lt 30 ]; do + if curl -fsS --connect-timeout 3 --max-time 10 "$URL"; then + echo "post-install OK" + exit 0 + fi + i=$((i + 1)) + echo "post-install: retry $i/30" + sleep 2 + done + echo "post-install: health check failed" >&2 + exit 1 +``` + +### Application Accessibility + +```text +$ curl -sS -i localhost:8080/health +HTTP/1.1 200 OK +Server: Werkzeug/3.1.7 Python/3.13.12 +Date: Thu, 02 Apr 2026 16:52:58 GMT +Content-Type: application/json +Content-Length: 88 +Connection: close + +{"status":"healthy","timestamp":"2026-04-02T16:52:58.654555+00:00","uptime_seconds":41} +``` diff --git a/k8s/README.md b/k8s/README.md new file mode 100644 index 0000000000..0c2db86455 --- /dev/null +++ b/k8s/README.md @@ -0,0 +1,160 @@ +## Lab 9 - Kubernetes Fundamentals + +The system architecture is based on a Deployment named `testiks`: +- Workload: `Deployment/testiks` +- Replicas: **three Pods** by default (scaled to 5 later); process listens on **TCP 5000** +- Service exposure: `Service/testiks` of type **NodePort** + +Communications diagram: + +```mermaid +flowchart TB + Service["testiks service
type: NodePort
80 → targetPort: http (5000)
nodePort: 30080"] + Deployment["Deployment
replicas: 3
strategy: RollingUpdate"] + + Pod1["Pod
:5000"] + Pod2["Pod
:5000"] + Pod3["Pod
:5000"] + + Service -->|selector: app=testiks| Deployment + Deployment --> Pod1 + Deployment --> Pod2 + Deployment --> Pod3 +``` + +## Manifest Files + +| File | Usage | +| --- | --- | +| `deployment.yml` | Creates the `Deployment` that manages the application Pods and defines the replica count, rolling update strategy, and resource requests/limits. Configures health checks: `livenessProbe` on `/health`, `readinessProbe` on `/ready` | +| `service.yml` | Creates a `Service` of type NodePort, selects Pods using the Deployment label selector, exposes port `80` and routes to the container port named `http` (`5000`) | + +Key choices: +- `replicas: 3` provides basic high availability even on a single-node local cluster. +- `maxUnavailable: 0` keeps all existing Pods serving traffic during rollout (when readiness passes). +- requests/limits are small but realistic for a lightweight Flask app. +- `port: 80` is convenient for clients; application stays on `5000`. +- NodePort allows access via `minikube service ... --url`. 
+ +## Deployment Evidence + +### Cluster objects +![](./img/pods.png) + +### Detailed pods + services + +![](./img/detailed.png) + +### Deployment description +``` +โ””โ”€$ kubectl describe deployment testiks +Name: testiks +Namespace: default +CreationTimestamp: Thu, 26 Mar 2026 12:49:02 +0300 +Labels: app=testiks + component=api +Annotations: deployment.kubernetes.io/revision: 16 +Selector: app=testiks +Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable +StrategyType: RollingUpdate +MinReadySeconds: 0 +RollingUpdateStrategy: 0 max unavailable, 1 max surge +Pod Template: + Labels: app=testiks + component=api + Annotations: kubectl.kubernetes.io/restartedAt: 2026-03-26T13:30:43+03:00 + Containers: + app: + Image: cacucoh/testiks:lab9 + Port: 5000/TCP + Host Port: 0/TCP + Limits: + cpu: 500m + memory: 256Mi + Requests: + cpu: 100m + memory: 128Mi + Liveness: http-get http://:http/health delay=15s timeout=3s period=10s #success=1 #failure=3 + Readiness: http-get http://:http/ready delay=5s timeout=2s period=5s #success=1 #failure=3 + Environment: + PORT: 5000 + Mounts: + Volumes: + Node-Selectors: + Tolerations: +Conditions: + Type Status Reason + ---- ------ ------ + Available True MinimumReplicasAvailable + Progressing True NewReplicaSetAvailable +OldReplicaSets: testiks-6dd7b49449 (0/0 replicas created), testiks-6c99d58d4d (0/0 replicas created), testiks-7f5cfd8947 (0/0 replicas created), testiks-7cb7974599 (0/0 replicas created), testiks-849474fb78 (0/0 replicas created), testiks-5dcd66d7c6 (0/0 replicas created), testiks-559cd698df (0/0 replicas created), testiks-6c8fdf9559 (0/0 replicas created), testiks-6f96859f5b (0/0 replicas created), testiks-6c976bcf57 (0/0 replicas created) +NewReplicaSet: testiks-764db4db6 (3/3 replicas created) +Events: + Type Reason Age From Message + ---- ------ ---- ---- ------- + Normal ScalingReplicaSet 44m deployment-controller Scaled up replica set testiks-c65d77cf5 from 0 to 3 + Normal ScalingReplicaSet 42m 
deployment-controller Scaled up replica set testiks-7d66876995 from 0 to 1 + Normal ScalingReplicaSet 39m deployment-controller Scaled down replica set testiks-c65d77cf5 from 3 to 2 + Normal ScalingReplicaSet 39m deployment-controller Scaled up replica set testiks-7c7fbdfbf4 from 0 to 1 + Normal ScalingReplicaSet 37m deployment-controller Scaled down replica set testiks-c65d77cf5 from 2 to 1 + Normal ScalingReplicaSet 37m deployment-controller Scaled up replica set testiks-698df5d97c from 0 to 1 + Normal ScalingReplicaSet 31m deployment-controller Scaled down replica set testiks-c65d77cf5 from 1 to 0 + Normal ScalingReplicaSet 31m deployment-controller Scaled up replica set testiks-6dd7b49449 from 0 to 1 + Normal ScalingReplicaSet 30m deployment-controller Scaled down replica set testiks-7d66876995 from 1 to 0 + Normal ScalingReplicaSet 30m deployment-controller Scaled up replica set testiks-6c99d58d4d from 0 to 1 + Normal ScalingReplicaSet 18m deployment-controller Scaled down replica set testiks-7c7fbdfbf4 from 1 to 0 + Normal ScalingReplicaSet 18m deployment-controller Scaled up replica set testiks-7f5cfd8947 from 0 to 1 + Normal ScalingReplicaSet 15m deployment-controller Scaled down replica set testiks-698df5d97c from 1 to 0 + Normal ScalingReplicaSet 15m deployment-controller Scaled up replica set testiks-7cb7974599 from 0 to 1 + Normal ScalingReplicaSet 14m deployment-controller Scaled down replica set testiks-6dd7b49449 from 1 to 0 + Normal ScalingReplicaSet 14m deployment-controller Scaled up replica set testiks-849474fb78 from 0 to 1 + Normal ScalingReplicaSet 12m deployment-controller Scaled down replica set testiks-6c99d58d4d from 1 to 0 + Normal ScalingReplicaSet 12m deployment-controller Scaled up replica set testiks-5dcd66d7c6 from 0 to 1 + Normal ScalingReplicaSet 12m deployment-controller Scaled down replica set testiks-7f5cfd8947 from 1 to 0 + Normal ScalingReplicaSet 2m30s (x14 over 12m) deployment-controller (combined from similar events): 
Scaled down replica set testiks-6c976bcf57 from 1 to 0 +``` +### Endpoints +``` +โ””โ”€$ kubectl get endpoints tetsiks + +Warning: v1 Endpoints is deprecated in v1.33+; use discovery.k8s.io/v1 EndpointSlice + +NAME ENDPOINTS AGE +tetsiks 10.244.0.41:5000,10.244.0.42:5000,10.244.0.43:5000 + 2 more... 17m + +NAME ADDRESSTYPE PORTS ENDPOINTS AGE +tetsiks-8lkwr IPv4 5000 10.244.0.41,10.244.0.42,10.244.0.43 + 2 more... 17m +``` + +### Curl tests +![](./img/curl.png) + +### Scaling to 5 pods +![](./img/scale.png) +``` +kubectl scale deployment testiks --replicas=5 +kubectl rollout restart deployment/testiks +kubectl rollout status deployment/testiks +kubectl get pods +``` + +### Rollback +![](./img/rollback.png) + +## 5. Production Considerations +**Health checks:** **Liveness** probes call **`/health`** so Kubernetes restarts the container if the HTTP server stops responding while the process is still running. **Readiness** probes call **`/ready`** so traffic is only sent to Pods that report ready, which avoids routing to Pods that are still starting or temporarily overloaded. + +**Resource limits:** Limits (**256Mi** memory, **500m** CPU) bound worst-case usage on shared nodes. Requests (**128Mi**, **100m**) reserve a minimum so the scheduler does not overcommit the node. Values are conservative for a small API and would be raised after measuring steady-state and peak load. + +**Improvements for a real production environment**: +- Add **startupProbe** for slow-start applications. +- Add `PodDisruptionBudget` to preserve availability during voluntary disruptions. +- Use `HorizontalPodAutoscaler` (HPA) based on CPU/RPS. +- Use private registry + `imagePullSecrets`, pin image tags (no `latest`), sign images. +- Use namespaces, NetworkPolicies, and secrets management (e.g., External Secrets/Vault). + +**Monitoring and observability:** The application exposes **Prometheus metrics** at `/metrics` for request rates, latency, and errors. 
Cluster-level metrics come from **kube-state-metrics** and **cAdvisor** when integrated with Prometheus. Logs can be collected with a node agent and centralized (for example **Loki**) for correlation with metric alerts. + +### Challenges & Solutions + +**Docker image issues:** I pushed changes to my Python app (added a `/ready` endpoint), but Kubernetes kept running the old image, most likely because `imagePullPolicy: IfNotPresent` reuses a tag that is already cached on the node. Pushing the image under a new tag (`lab9`) resolved it. \ No newline at end of file diff --git a/k8s/deployment.yml b/k8s/deployment.yml new file mode 100644 index 0000000000..7f2abf3e7f --- /dev/null +++ b/k8s/deployment.yml @@ -0,0 +1,67 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: testiks + labels: + app: testiks + component: api +spec: + replicas: 3 + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + selector: + matchLabels: + app: testiks + template: + metadata: + labels: + app: testiks + component: api + spec: + securityContext: + runAsNonRoot: true + runAsUser: 1000 + runAsGroup: 1000 + fsGroup: 1000 + containers: + - name: app + image: cacucoh/testiks:lab9 + imagePullPolicy: IfNotPresent + securityContext: + allowPrivilegeEscalation: false + capabilities: + drop: + - ALL + env: + - name: PORT + value: "5000" + ports: + - name: http + containerPort: 5000 + protocol: TCP + resources: + requests: + cpu: "100m" + memory: "128Mi" + limits: + cpu: "500m" + memory: "256Mi" + livenessProbe: + httpGet: + path: /health + port: http + initialDelaySeconds: 15 + periodSeconds: 10 + timeoutSeconds: 3 + failureThreshold: 3 + readinessProbe: + httpGet: + path: /ready + port: http + initialDelaySeconds: 5 + periodSeconds: 5 + timeoutSeconds: 2 + failureThreshold: 3 \ No newline at end of file diff --git a/k8s/img/curl.png b/k8s/img/curl.png new file mode 100644 index 0000000000..4d34a7296c Binary files /dev/null and b/k8s/img/curl.png differ diff --git a/k8s/img/detailed.png b/k8s/img/detailed.png new file mode 100644 index 0000000000..a3519d5583 Binary 
files /dev/null and b/k8s/img/detailed.png differ diff --git a/k8s/img/pods.png b/k8s/img/pods.png new file mode 100644 index 0000000000..2a28ee99b1 Binary files /dev/null and b/k8s/img/pods.png differ diff --git a/k8s/img/rollback.png b/k8s/img/rollback.png new file mode 100644 index 0000000000..2a9510e821 Binary files /dev/null and b/k8s/img/rollback.png differ diff --git a/k8s/img/scale.png b/k8s/img/scale.png new file mode 100644 index 0000000000..9ad947e1ee Binary files /dev/null and b/k8s/img/scale.png differ diff --git a/k8s/service.yml b/k8s/service.yml new file mode 100644 index 0000000000..d91419e221 --- /dev/null +++ b/k8s/service.yml @@ -0,0 +1,16 @@ +apiVersion: v1 +kind: Service +metadata: + name: testiks + labels: + app: testiks +spec: + type: NodePort + selector: + app: testiks + ports: + - name: http + protocol: TCP + port: 80 + targetPort: http + nodePort: 30080 \ No newline at end of file diff --git a/k8s/testiks/Chart.yaml b/k8s/testiks/Chart.yaml new file mode 100644 index 0000000000..6e658a2710 --- /dev/null +++ b/k8s/testiks/Chart.yaml @@ -0,0 +1,12 @@ +apiVersion: v2 +name: testiks +description: Helm chart for py web application +type: application +version: 0.1.0 +appVersion: "1.0.0" +keywords: + - python + - web +maintainers: + - name: CacucoH + email: dfffd7800@gmail.com diff --git a/k8s/testiks/HELM.md b/k8s/testiks/HELM.md new file mode 100644 index 0000000000..ce82315fb8 --- /dev/null +++ b/k8s/testiks/HELM.md @@ -0,0 +1,586 @@ +## Helm Package Manager (Lab 10) +### Chart structure + +This Helm chart follows a standard and production-ready structure for deploying a Kubernetes application. 
Below is an explanation of each component and its purpose. + +### Root Directory + +``` +testiks/ +├── Chart.yaml +├── values.yaml +├── values-dev.yaml +├── values-prod.yaml +└── templates/ + ├── _helpers.tpl + ├── deployment.yaml + ├── service.yaml + ├── hooks-preinstall-job.yaml + └── hooks-postinstall-job.yaml +``` + +### Files and their purpose +charts/: Directory containing any dependencies + + +Chart.yaml: +- This file contains metadata about the Helm chart +Purpose: +- Defines chart name, version, and description +- Specifies chart type (application or library) +- Provides application version + + +values.yaml: +- The values.yaml file defines default configuration values used across templates. +Purpose: + +- Centralized configuration management +- Allows easy customization without modifying templates +- Supports overrides via CLI or environment-specific files + +_helpers.tpl: +- Contains reusable template definitions. +Purpose: +- Avoid duplication +- Standardize naming and labels +- Improve maintainability + +deployment.yaml: +- Defines the Kubernetes Deployment resource. +Purpose: +- Deploys the application pods +- Configures replicas, rolling updates, and container settings +- Uses values from values.yaml for dynamic configuration + +hooks-postinstall-job.yaml: +- Defines a Helm post-install hook +Purpose: +- Executes after installation completes +- Used for smoke tests or notifications + +## Task 1 + +``` +$ sudo apt install helm +$ helm version +version.BuildInfo{Version:"v4.1.3", GitCommit:"c94d381b03be117e7e57908edbf642104e00eb8f", GitTreeState:"clean", GoVersion:"go1.26.1", KubeClientVersion:"v1.35"} +``` + +I started by creating a Helm chart in the k8s/ directory for my application. To do this, I ran the following command: +``` +helm create k8s/testiks +``` +This generated the basic Helm chart structure with all the necessary files and directories. 
I then updated the Chart.yaml to include the metadata for my chart: +``` +apiVersion: v2 +name: testiks +description: Helm chart for py web application +type: application +version: 0.1.0 +appVersion: "1.0.0" +``` +The name field is set to testiks, and I chose 0.1.0 as the chart version. The appVersion is set to "1.0.0" to represent the version of my Python app. + +Chart metadata from the Prometheus community repo: +``` +$ helm show chart prometheus-community/prometheus +annotations: + artifacthub.io/license: Apache-2.0 + artifacthub.io/links: | + - name: Chart Source + url: https://github.com/prometheus-community/helm-charts + - name: Upstream Project + url: https://github.com/prometheus/prometheus +apiVersion: v2 +appVersion: v3.11.0 +dependencies: +- condition: alertmanager.enabled + name: alertmanager + repository: https://prometheus-community.github.io/helm-charts + version: 1.34.* +- condition: kube-state-metrics.enabled + name: kube-state-metrics + repository: https://prometheus-community.github.io/helm-charts + version: 7.2.* +- condition: prometheus-node-exporter.enabled + name: prometheus-node-exporter + repository: https://prometheus-community.github.io/helm-charts + version: 4.52.* +- condition: prometheus-pushgateway.enabled + name: prometheus-pushgateway + repository: https://prometheus-community.github.io/helm-charts + version: 3.6.* +description: Prometheus is a monitoring system and time series database. 
+home: https://prometheus.io/ +icon: https://raw.githubusercontent.com/prometheus/prometheus.github.io/master/assets/prometheus_logo-cb55bb5c346.png +keywords: +- monitoring +- prometheus +kubeVersion: '>=1.19.0-0' +maintainers: +- email: gianrubio@gmail.com + name: gianrubio + url: https://github.com/gianrubio +- email: zanhsieh@gmail.com + name: zanhsieh + url: https://github.com/zanhsieh +- email: miroslav.hadzhiev@gmail.com + name: Xtigyro + url: https://github.com/Xtigyro +- email: naseem@transit.app + name: naseemkullah + url: https://github.com/naseemkullah +- email: rootsandtrees@posteo.de + name: zeritti + url: https://github.com/zeritti +name: prometheus +sources: +- https://github.com/prometheus/alertmanager +- https://github.com/prometheus/prometheus +- https://github.com/prometheus/pushgateway +- https://github.com/prometheus/node_exporter +- https://github.com/kubernetes/kube-state-metrics +type: application +version: 28.15.0 +``` + +### Why Helm matters +Helm simplifies Kubernetes application management by providing a package manager for deploying, managing, and scaling applications. It allows you to define reusable and customizable Kubernetes manifests using charts, making deployments consistent across environments. 
Helm also offers versioning, rollback capabilities, dependency management, and automation, ensuring easier and more reliable application management on Kubernetes + +## Task 2 + +Important Values: +- replicaCount: number of pod replicas +- image.repository / image.tag: container image source +- containerPort: container listening port +- service.type: NodePort for local access, LoadBalancer for production-style exposure +- service.nodePort: fixed local NodePort for dev install +- resources.requests / resources.limits: scheduler and runtime resource boundaries +- livenessProbe / readinessProbe: health-check timings and paths +- hooks.enabled: enables lifecycle Jobs + +### Environment Customization + +Two environment-specific configuration files are used: +- Development (values-dev.yaml) + - 1 replica + - Lower resource usage + - NodePort service + - Latest image tag + +- Production (values-prod.yaml) + - 5 replicas + - Higher resource limits + - LoadBalancer service + - Fixed image version + +### Install example +Development: +``` +helm install testiks . -f values-dev.yaml +``` +Production: +``` +helm upgrade testiks . -f values-prod.yaml +``` + +## Task 3 + +Two Helm hooks are implemented: +1. Pre-install Hook +- Runs before chart installation +- Purpose: simulate pre-deployment validation + +2. 
Post-install Hook +- Runs after deployment +- Purpose: simulate smoke testing + +| | Pre-install | Post-install | +|---|-------------|--------------| +| Kind | Job | Job | +| `helm.sh/hook` | `pre-install` | `post-install` | +| `helm.sh/hook-weight` | `-5` | `5` | +| `helm.sh/hook-delete-policy` | `hook-succeeded` when `hooks.deleteAfterSuccess` is true | same | + + +### Operations +``` +helm uninstall dev -n lab10-dev +helm uninstall prod -n lab10-prod +helm upgrade dev ./k8s/testiks -f k8s/testiks/values-dev.yaml -n lab10-dev +helm history dev -n lab10-dev +helm rollback dev -n lab10-dev +``` + +## Installation evidence + +```text +$ helm lint ./k8s/testiks +==> Linting ./k8s/testiks +[INFO] Chart.yaml: icon is recommended +1 chart(s) linted, 0 chart(s) failed +``` + +```text +$ kubectl config current-context +minikube + +$ kubectl cluster-info +Kubernetes control plane is running at https://127.0.0.1:65035 +CoreDNS is running at https://127.0.0.1:65035/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy + +$ kubectl get nodes +NAME STATUS ROLES AGE VERSION +minikube Ready control-plane 8d v1.32.0 +``` + +```bash +helm template dev ./k8s/testiks -f k8s/testiks/values-dev.yaml -n lab10-dev +helm template prod ./k8s/testiks -f k8s/testiks/values-prod.yaml -n lab10-prod +``` + +```text +$ helm template dev ./k8s/testiks -f k8s/testiks/values-dev.yaml -n lab10-dev 2>&1 | head -42 +--- +# Source: testiks/templates/deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: dev-testiks + labels: + helm.sh/chart: testiks-0.1.0 + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + app.kubernetes.io/version: "1.0.0" + app.kubernetes.io/managed-by: Helm +spec: + replicas: 1 + selector: + matchLabels: + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + template: + metadata: + labels: + app.kubernetes.io/name: testiks + 
app.kubernetes.io/instance: dev + spec: + containers: + - name: testiks + image: "cacucoh/testiks:latest" + imagePullPolicy: Always + ports: + - name: http + containerPort: 5000 + protocol: TCP +``` + +```text +$ helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab10-dev --create-namespace +NAME: dev +LAST DEPLOYED: Thu Apr 2 19:39:50 2026 +NAMESPACE: lab10-dev +STATUS: deployed +REVISION: 1 +TEST SUITE: None +NOTES: +1. Get the application URL by running these commands: + export NODE_PORT=$(kubectl get --namespace lab10-dev -o jsonpath="{.spec.ports[0].nodePort}" services dev-testiks) + export NODE_IP=$(kubectl get nodes --namespace lab10-dev -o jsonpath="{.items[0].status.addresses[0].address}") + echo http://$NODE_IP:$NODE_PORT/health + +Release: dev +Namespace: lab10-dev +``` + +```text +$ helm list -n lab10-dev +NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION +dev lab10-dev 1 2026-04-02 19:39:50.110994 +0300 MSK deployed testiks-0.1.0 1.0.0 +``` + +```text +$ kubectl get all -n lab10-dev +NAME READY STATUS RESTARTS AGE +pod/dev-testiks-84579bd9bb-8mnkp 1/1 Running 0 62s + +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +service/dev-testiks NodePort 10.103.117.200 80:30081/TCP 62s + +NAME READY UP-TO-DATE AVAILABLE AGE +deployment.apps/dev-testiks 1/1 1 1 62s + +NAME DESIRED CURRENT READY AGE +replicaset.apps/dev-testiks-84579bd9bb 1 1 1 62s +``` + +With default `deleteAfterSuccess: true`, hook Jobs are removed after success (`kubectl get jobs` is empty). 
With `values-hooks-keep.yaml`: + +```bash +helm uninstall dev -n lab10-dev +helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + -f k8s/testiks/values-hooks-keep.yaml \ + --namespace lab10-dev +kubectl get jobs -n lab10-dev +kubectl describe job dev-testiks-pre-install -n lab10-dev +kubectl describe job dev-testiks-post-install -n lab10-dev +kubectl logs -n lab10-dev job/dev-testiks-pre-install +kubectl logs -n lab10-dev job/dev-testiks-post-install +``` + +```text +$ helm install dev ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + -f k8s/testiks/values-hooks-keep.yaml \ + --namespace lab10-dev +NAME: dev +LAST DEPLOYED: Thu Apr 2 19:48:28 2026 +NAMESPACE: lab10-dev +STATUS: deployed +REVISION: 1 +``` + +```text +$ kubectl get jobs -n lab10-dev +NAME STATUS COMPLETIONS DURATION AGE +dev-testiks-post-install Complete 1/1 4s 12s +dev-testiks-pre-install Complete 1/1 3s 15s +``` + +```text +$ kubectl describe job dev-testiks-pre-install -n lab10-dev +Name: dev-testiks-pre-install +Namespace: lab10-dev +Selector: batch.kubernetes.io/controller-uid=b3df58aa-361f-48fd-8b38-934fa4dbe167 +Labels: app.kubernetes.io/instance=dev + app.kubernetes.io/managed-by=Helm + app.kubernetes.io/name=testiks + app.kubernetes.io/version=1.0.0 + helm.sh/chart=testiks-0.1.0 +Annotations: helm.sh/hook: pre-install + helm.sh/hook-weight: -5 +Parallelism: 1 +Completions: 1 +Completion Mode: NonIndexed +Suspend: false +Backoff Limit: 2 +Start Time: Thu, 02 Apr 2026 19:48:28 +0300 +Completed At: Thu, 02 Apr 2026 19:48:31 +0300 +Duration: 3s +Pods Statuses: 0 Active (0 Ready) / 1 Succeeded / 0 Failed +Pod Template: + Labels: app.kubernetes.io/managed-by=Helm + batch.kubernetes.io/controller-uid=b3df58aa-361f-48fd-8b38-934fa4dbe167 + batch.kubernetes.io/job-name=dev-testiks-pre-install + controller-uid=b3df58aa-361f-48fd-8b38-934fa4dbe167 + helm.sh/hook=pre-install + job-name=dev-testiks-pre-install + Containers: + pre-install: + Image: busybox:1.36 + Command: + sh + -c + 
set -e + echo "pre-install: release=dev ns=lab10-dev" + echo "pre-install OK" + Environment: + Mounts: + Volumes: + Node-Selectors: + Tolerations: +Events: + Type Reason Age From Message + ---- ------ ---- ---- ------- + Normal SuccessfulCreate 22s job-controller Created pod: dev-testiks-pre-install-q8xgb + Normal Completed 19s job-controller Job completed +``` + +```text +$ kubectl logs -n lab10-dev job/dev-testiks-pre-install +pre-install: release=dev ns=lab10-dev +pre-install OK +``` + +```text +$ kubectl logs -n lab10-dev job/dev-testiks-post-install +post-install: smoke GET http://dev-testiks.lab10-dev.svc.cluster.local:80/health +{"status":"healthy","timestamp":"2026-04-02T16:48:32.488027+00:00","uptime_seconds":507} +post-install OK +``` + +Production install (`values-prod.yaml`): + +```text +$ helm list -n lab10-prod +NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION +prod lab10-prod 1 2026-04-02 19:51:57.134345 +0300 MSK failed testiks-0.1.0 1.0.0 +``` + +```text +$ kubectl get all -n lab10-prod +NAME READY STATUS RESTARTS AGE +pod/prod-testiks-05dff54df9-b77f4 0/1 Running 0 40s +pod/prod-testiks-05dff54df9-lf2j2 0/1 Running 0 40s +pod/prod-testiks-05dff54df9-q54dt 0/1 Running 0 40s +pod/prod-testiks-05dff54df9-sw95m 1/1 Running 0 40s +pod/prod-testiks-05dff54df9-z45wb 1/1 Running 0 40s +pod/prod-testiks-post-install-t4c9p 0/1 Completed 0 40s + +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +service/prod-testiks LoadBalancer 10.103.135.218 80:31854/TCP 40s + +NAME READY UP-TO-DATE AVAILABLE AGE +deployment.apps/prod-testiks 2/5 5 2 40s + +NAME DESIRED CURRENT READY AGE +replicaset.apps/prod-testiks-05dff54df9 5 5 2 40s + +NAME STATUS COMPLETIONS DURATION AGE +job.batch/prod-testiks-post-install Complete 1/1 30s 40s +``` + +```text +$ kubectl get svc -n lab10-prod +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +prod-testiks LoadBalancer 10.103.135.218 80:31854/TCP 47s +``` + +```bash +kubectl port-forward -n lab10-prod svc/prod-testiks 8080:80 +``` + 
+```text +$ kubectl rollout status deployment/prod-testiks -n lab10-prod +deployment "prod-testiks" successfully rolled out + +$ helm upgrade prod ./k8s/testiks -f k8s/testiks/values-prod.yaml -n lab10-prod +Release "prod" has been upgraded. Happy Helming! +NAME: prod +LAST DEPLOYED: Thu Apr 2 19:54:16 2026 +NAMESPACE: lab10-prod +STATUS: deployed +REVISION: 2 +TEST SUITE: None +NOTES: +1. Get the application URL by running these commands: + NOTE: It may take a few minutes for the LoadBalancer IP to be available. + Watch status: kubectl get svc -w prod-testiks + export SERVICE_IP=$(kubectl get svc --namespace lab10-prod prod-testiks -o jsonpath='{.status.loadBalancer.ingress[0].ip}') + echo http://$SERVICE_IP:80/health + +Release: prod +Namespace: lab10-prod +``` + +The following `helm list -A` was captured before `helm upgrade prod`; the upgrade transcript above records `prod` at revision 2 `deployed`. + +```text +$ helm list -A +NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION +dev default 1 2026-04-02 19:38:26.499655 +0300 MSK failed testiks-0.1.0 1.0.0 +dev lab10-dev 1 2026-04-02 19:48:28.029525 +0300 MSK deployed testiks-0.1.0 1.0.0 +prod lab10-prod 1 2026-04-02 19:51:57.134345 +0300 MSK failed testiks-0.1.0 1.0.0 +``` + +```bash +helm uninstall dev -n default +``` + +## Testing and validation + +```bash +helm lint ./k8s/testiks +helm template dev ./k8s/testiks -f k8s/testiks/values-dev.yaml -n lab10-dev +helm install dev-dryrun ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab-dryrun --create-namespace \ + --dry-run=client +``` + +```text +$ helm install dev-dryrun ./k8s/testiks \ + -f k8s/testiks/values-dev.yaml \ + --namespace lab-dryrun --create-namespace \ + --dry-run=client 2>&1 | head -80 +NAME: dev-dryrun +LAST DEPLOYED: Thu Apr 2 19:53:17 2026 +NAMESPACE: lab-dryrun +STATUS: pending-install +REVISION: 1 +TEST SUITE: None +HOOKS: +--- +# Source: testiks/templates/hooks/post-install-job.yaml +apiVersion: batch/v1 +kind: Job 
+metadata: + name: dev-dryrun-testiks-post-install + annotations: + helm.sh/hook: post-install + helm.sh/hook-weight: "5" + helm.sh/hook-delete-policy: hook-succeeded + labels: + helm.sh/chart: testiks-0.1.0 + app.kubernetes.io/name: testiks + app.kubernetes.io/instance: dev-dryrun + app.kubernetes.io/version: "1.0.0" + app.kubernetes.io/managed-by: Helm +spec: + backoffLimit: 3 + template: + metadata: + labels: + app.kubernetes.io/managed-by: Helm + helm.sh/hook: post-install + spec: + restartPolicy: Never + containers: + - name: post-install + image: "curlimages/curl:8.5.0" + command: + - sh + - -c + - | + set -e + URL="http://dev-dryrun-testiks.lab-dryrun.svc.cluster.local:80/health" + echo "post-install: smoke GET $URL" + i=0 + while [ "$i" -lt 30 ]; do + if curl -fsS --connect-timeout 3 --max-time 10 "$URL"; then + echo "post-install OK" + exit 0 + fi + i=$((i + 1)) + echo "post-install: retry $i/30" + sleep 2 + done + echo "post-install: health check failed" >&2 + exit 1 +``` + +```text +$ curl -sS -i localhost:8080/health + +HTTP/1.1 200 OK +Server: Werkzeug/3.1.7 Python/3.13.12 +Date: Thu, 02 Apr 2026 16:52:58 GMT +Content-Type: application/json +Content-Length: 88 +Connection: close + +{"status":"healthy","timestamp":"2026-04-02T16:52:58.654555+00:00","uptime_seconds":41} +``` + +```bash +curl "$(minikube service dev-testiks -n lab10-dev --url)/health" +``` \ No newline at end of file diff --git a/k8s/testiks/deployment.yml b/k8s/testiks/deployment.yml new file mode 100644 index 0000000000..78ec4c61c1 --- /dev/null +++ b/k8s/testiks/deployment.yml @@ -0,0 +1,73 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: testiks + labels: + app: testiks + app.kubernetes.io/name: testiks + app.kubernetes.io/component: web + app.kubernetes.io/part-of: devops-core-course +spec: + replicas: 3 + minReadySeconds: 5 + revisionHistoryLimit: 5 + selector: + matchLabels: + app: testiks + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + 
maxUnavailable: 0 + template: + metadata: + labels: + app: testiks + app.kubernetes.io/name: testiks + app.kubernetes.io/component: web + app.kubernetes.io/part-of: devops-core-course + spec: + securityContext: + seccompProfile: + type: RuntimeDefault + containers: + - name: testiks + image: testiks:lab09 + imagePullPolicy: IfNotPresent + ports: + - name: http + containerPort: 5000 + env: + - name: PORT + value: "5000" + resources: + requests: + cpu: 100m + memory: 128Mi + limits: + cpu: 250m + memory: 256Mi + securityContext: + runAsUser: 10001 + runAsGroup: 10001 + runAsNonRoot: true + allowPrivilegeEscalation: false + capabilities: + drop: + - ALL + readinessProbe: + httpGet: + path: /health + port: http + initialDelaySeconds: 5 + periodSeconds: 5 + timeoutSeconds: 2 + failureThreshold: 3 + livenessProbe: + httpGet: + path: /health + port: http + initialDelaySeconds: 15 + periodSeconds: 10 + timeoutSeconds: 2 + failureThreshold: 3 \ No newline at end of file diff --git a/k8s/testiks/service.yml b/k8s/testiks/service.yml new file mode 100644 index 0000000000..e55c7b045a --- /dev/null +++ b/k8s/testiks/service.yml @@ -0,0 +1,19 @@ +apiVersion: v1 +kind: Service +metadata: + name: testiks + labels: + app: testiks + app.kubernetes.io/name: testiks + app.kubernetes.io/component: web + app.kubernetes.io/part-of: devops-core-course +spec: + type: NodePort + selector: + app: testiks + ports: + - name: http + protocol: TCP + port: 80 + targetPort: http + nodePort: 30080 \ No newline at end of file diff --git a/k8s/testiks/templates/NOTES.txt b/k8s/testiks/templates/NOTES.txt new file mode 100644 index 0000000000..43246accac --- /dev/null +++ b/k8s/testiks/templates/NOTES.txt @@ -0,0 +1,14 @@ +1. Get the application URL by running these commands: +{{- if eq .Values.service.type "NodePort" }} + export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "testiks.fullname" . 
}}) + export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}") + echo http://$NODE_IP:$NODE_PORT/health +{{- else if eq .Values.service.type "LoadBalancer" }} + NOTE: It may take a few minutes for the LoadBalancer IP to be available. + Watch status: kubectl get svc -w {{ include "testiks.fullname" . }} + export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "testiks.fullname" . }} -o jsonpath='{.status.loadBalancer.ingress[0].ip}') + echo http://$SERVICE_IP:{{ .Values.service.port }}/health +{{- end }} + +Release: {{ .Release.Name }} +Namespace: {{ .Release.Namespace }} diff --git a/k8s/testiks/templates/_helpers.tpl b/k8s/testiks/templates/_helpers.tpl new file mode 100644 index 0000000000..3c2a31c442 --- /dev/null +++ b/k8s/testiks/templates/_helpers.tpl @@ -0,0 +1,43 @@ +{{/* +Expand the name of the chart. +*/}} +{{- define "testiks.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Create a default fully qualified app name. +*/}} +{{- define "testiks.fullname" -}} +{{- if .Values.fullnameOverride }} +{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }} +{{- else }} +{{- $name := default .Chart.Name .Values.nameOverride }} +{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }} +{{- end }} +{{- end }} + +{{/* +Create chart name and version as used by the chart label. +*/}} +{{- define "testiks.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Common labels +*/}} +{{- define "testiks.labels" -}} +helm.sh/chart: {{ include "testiks.chart" . }} +{{ include "testiks.selectorLabels" . 
}} +app.kubernetes.io/version: {{ .Chart.AppVersion | quote }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +{{- end }} + +{{/* +Selector labels +*/}} +{{- define "testiks.selectorLabels" -}} +app.kubernetes.io/name: {{ include "testiks.name" . }} +app.kubernetes.io/instance: {{ .Release.Name }} +{{- end }} diff --git a/k8s/testiks/templates/deployment.yaml b/k8s/testiks/templates/deployment.yaml new file mode 100644 index 0000000000..09bcf9acc8 --- /dev/null +++ b/k8s/testiks/templates/deployment.yaml @@ -0,0 +1,46 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ include "testiks.fullname" . }} + labels: + {{- include "testiks.labels" . | nindent 4 }} +spec: + replicas: {{ .Values.replicaCount }} + selector: + matchLabels: + {{- include "testiks.selectorLabels" . | nindent 6 }} + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + template: + metadata: + labels: + {{- include "testiks.selectorLabels" . | nindent 8 }} + spec: + securityContext: + seccompProfile: + type: RuntimeDefault + containers: + - name: {{ .Chart.Name }} + image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}" + imagePullPolicy: {{ .Values.image.pullPolicy }} + ports: + - name: http + containerPort: {{ .Values.containerPort }} + protocol: TCP + resources: + {{- toYaml .Values.resources | nindent 12 }} + securityContext: + runAsUser: 10001 + runAsGroup: 10001 + runAsNonRoot: true + allowPrivilegeEscalation: false + capabilities: + drop: + - ALL + livenessProbe: + {{- toYaml .Values.livenessProbe | nindent 12 }} + readinessProbe: + {{- toYaml .Values.readinessProbe | nindent 12 }} diff --git a/k8s/testiks/templates/hooks/post-install-job.yaml b/k8s/testiks/templates/hooks/post-install-job.yaml new file mode 100644 index 0000000000..99eb9ae2e4 --- /dev/null +++ b/k8s/testiks/templates/hooks/post-install-job.yaml @@ -0,0 +1,41 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: "{{ include 
"testiks.fullname" . }}-post-install" + labels: + {{- include "testiks.labels" . | nindent 4 }} + annotations: + "helm.sh/hook": post-install + "helm.sh/hook-weight": "5" + "helm.sh/hook-delete-policy": {{ if .Values.hooks.deleteAfterSuccess }}hook-succeeded{{ else }}before-hook-creation{{ end }} +spec: + backoffLimit: 3 + template: + metadata: + labels: + app.kubernetes.io/managed-by: Helm + helm.sh/hook: post-install + spec: + restartPolicy: Never + containers: + - name: post-install + image: "curlimages/curl:8.5.0" + command: + - sh + - -c + - | + set -e + URL="http://{{ include "testiks.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local:{{ .Values.service.port }}/health" + echo "post-install: smoke GET $URL" + i=0 + while [ "$i" -lt 30 ]; do + if curl -fsS --connect-timeout 3 --max-time 10 "$URL"; then + echo "post-install OK" + exit 0 + fi + i=$((i + 1)) + echo "post-install: retry $i/30" + sleep 2 + done + echo "post-install: health check failed" >&2 + exit 1 diff --git a/k8s/testiks/templates/hooks/pre-install-job.yaml b/k8s/testiks/templates/hooks/pre-install-job.yaml new file mode 100644 index 0000000000..08c582166f --- /dev/null +++ b/k8s/testiks/templates/hooks/pre-install-job.yaml @@ -0,0 +1,29 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: "{{ include "testiks.fullname" . }}-pre-install" + labels: + {{- include "testiks.labels" . 
| nindent 4 }} + annotations: + "helm.sh/hook": pre-install + "helm.sh/hook-weight": "-5" + "helm.sh/hook-delete-policy": {{ if .Values.hooks.deleteAfterSuccess }}hook-succeeded{{ else }}before-hook-creation{{ end }} +spec: + backoffLimit: 2 + template: + metadata: + labels: + app.kubernetes.io/managed-by: Helm + helm.sh/hook: pre-install + spec: + restartPolicy: Never + containers: + - name: pre-install + image: busybox:1.36 + command: + - sh + - -c + - | + set -e + echo "pre-install: release={{ .Release.Name }} ns={{ .Release.Namespace }}" + echo "pre-install OK" diff --git a/k8s/testiks/templates/service.yaml b/k8s/testiks/templates/service.yaml new file mode 100644 index 0000000000..f3299537b5 --- /dev/null +++ b/k8s/testiks/templates/service.yaml @@ -0,0 +1,18 @@ +apiVersion: v1 +kind: Service +metadata: + name: {{ include "testiks.fullname" . }} + labels: + {{- include "testiks.labels" . | nindent 4 }} +spec: + type: {{ .Values.service.type }} + selector: + {{- include "testiks.selectorLabels" . 
| nindent 4 }} + ports: + - name: http + protocol: TCP + port: {{ .Values.service.port }} + targetPort: http + {{- if and (eq .Values.service.type "NodePort") .Values.service.nodePort }} + nodePort: {{ .Values.service.nodePort }} + {{- end }} diff --git a/k8s/testiks/values-dev.yaml b/k8s/testiks/values-dev.yaml new file mode 100644 index 0000000000..47363925f6 --- /dev/null +++ b/k8s/testiks/values-dev.yaml @@ -0,0 +1,25 @@ +replicaCount: 1 + +image: + tag: "latest" + pullPolicy: Always + +service: + type: NodePort + nodePort: 30081 + +resources: + requests: + cpu: 50m + memory: 64Mi + limits: + cpu: 100m + memory: 128Mi + +livenessProbe: + initialDelaySeconds: 5 + periodSeconds: 10 + +readinessProbe: + initialDelaySeconds: 3 + periodSeconds: 5 diff --git a/k8s/testiks/values-prod.yaml b/k8s/testiks/values-prod.yaml new file mode 100644 index 0000000000..10ecf11523 --- /dev/null +++ b/k8s/testiks/values-prod.yaml @@ -0,0 +1,24 @@ +replicaCount: 5 + +image: + tag: "1.0.0" + pullPolicy: IfNotPresent + +service: + type: LoadBalancer + +resources: + requests: + cpu: 200m + memory: 256Mi + limits: + cpu: 500m + memory: 512Mi + +livenessProbe: + initialDelaySeconds: 30 + periodSeconds: 5 + +readinessProbe: + initialDelaySeconds: 10 + periodSeconds: 3 diff --git a/k8s/testiks/values.yaml b/k8s/testiks/values.yaml new file mode 100644 index 0000000000..c413a1e7f2 --- /dev/null +++ b/k8s/testiks/values.yaml @@ -0,0 +1,43 @@ +replicaCount: 3 + +image: + repository: cacucoh/testiks + tag: "1.0.0" + pullPolicy: IfNotPresent + +containerPort: 5000 + +service: + type: NodePort + port: 80 + targetPort: 5000 + nodePort: 30081 + +resources: + requests: + cpu: 100m + memory: 128Mi + limits: + cpu: 250m + memory: 256Mi + +livenessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 15 + periodSeconds: 10 + timeoutSeconds: 2 + failureThreshold: 3 + +readinessProbe: + httpGet: + path: /health + port: 5000 + initialDelaySeconds: 5 + periodSeconds: 5 + timeoutSeconds: 2 
+ failureThreshold: 3 + +hooks: + deleteAfterSuccess: true diff --git a/lab1.md b/lab1.md deleted file mode 100644 index 30b74c95f5..0000000000 --- a/lab1.md +++ /dev/null @@ -1,65 +0,0 @@ -# Lab 1: Web Application Development - -## Overview - -In this lab assignment, you will develop a simple web application using Python and best practices. You will also have the opportunity to create a bonus web application using a different programming language. Follow the tasks below to complete the lab assignment. - -## Task 1: Python Web Application - -**6 Points:** - -1. Create `app_python` Folder: - - Create a folder named `app_python` to contain your Python web application files. - - Inside the `app_python` folder, create a file named `PYTHON.md`. - -2. Develop and Test Python Web Application: - - Develop a Python web application that displays the current time in Moscow. - - Choose a suitable framework for your web application and justify your choice in the `PYTHON.md` file. - - Implement best practices in your code and follow coding standards. - - Test your application to ensure the displayed time updates upon page refreshing. - -## Task 2: Well Decorated Description - -**4 Points:** - -1. Update `PYTHON.md`: - - Describe best practices applied in the web application. - - Explain how you followed coding standards, implemented testing, and ensured code quality. - -2. Create `README.md` in `app_python` folder: - - Use a Markdown template to document the Python web application. - -3. Ensure: - - Maintain a clean `.gitignore` file. - - Use a concise `requirements.txt` file for required dependencies. - -### List of Requirements - -- MSK Time timezone set up -- 2 PRs created -- README includes Overview -- Nice Markdown decoration -- Local installation details in README - -## Bonus Task: Additional Web Application - -**2.5 Points:** - -1. 
Create `app_*` Folder: - - Create a folder named `app_*` in the main project directory, replacing `*` with a programming language of your choice (other than Python). - - Inside the `app_*` folder, create a file named `*`.md. - -2. Develop Your Own Web App: - - Create a web application using the programming language you chose. - - Decide what your web application will display or do, and use your creativity. - -3. Follow Main Task Steps: - - Implement your bonus web application following the same suggestions and steps as the main Python web application task. - -### Guidelines - -- Use proper Markdown formatting and structure for the documentation files. We will use [online one](https://dlaa.me/markdownlint/) to check your `.md` files. -- Organize the files within the lab folder using appropriate naming conventions. -- Create a PR from your fork to the master branch of this repository and from your fork's branch to your fork's master branch with your completed lab assignment. - -> Note: Apply best practices, coding standards, and testing to your Python web application. Explore creativity in your bonus web application, and document your process using Markdown. diff --git a/lab10.md b/lab10.md deleted file mode 100644 index c472086168..0000000000 --- a/lab10.md +++ /dev/null @@ -1,91 +0,0 @@ -# Lab 10: Introduction to Helm - -## Overview - -In this lab, you will become familiar with Helm, set up a local development environment, and generate manifests for your application. - -## Task 1: Helm Setup and Chart Creation - -**6 Points:** - -1. Learn About Helm: - - Begin by exploring the architecture and concepts of Helm: - - [Helm Architecture](https://helm.sh/docs/topics/architecture/) - - [Understanding Helm Charts](https://helm.sh/docs/topics/charts/) - -2. 
Install Helm: - - Install Helm using the instructions provided: - - [Helm Installation](https://helm.sh/docs/intro/install/) - - [Chart Repository Initialization](https://helm.sh/docs/intro/quickstart/#initialize-a-helm-chart-repository) - -3. Create Your Own Helm Chart: - - Generate a Helm chart for your application. - - Inside the `k8s` folder, create a Helm chart template by using the command `helm create your-app`. - - Replace the default repository and tag inside the `values.yaml` file with your repository name. - - Modify the `containerPort` setting in the `deployment.yml` file. - - If you encounter issues with `livenessProbe` and `readinessProbe`, you can comment them out. - - > For troubleshooting, you can use the `minikube dashboard` command. - -4. Install Your Helm Chart: - - Install your custom Helm chart and ensure that all services are healthy. Verify this by checking the `Workloads` page in the Minikube dashboard. - -5. Access Your Application: - - Confirm that your application is accessible by running the `minikube service your_service_name` command. - -6. Create a HELM.md File: - - Construct a `HELM.md` file and provide the output of the `kubectl get pods,svc` command within it. - -## Task 2: Helm Chart Hooks - -**4 Points:** - -1. Learn About Chart Hooks: - - Familiarize yourself with [Helm Chart Hooks](https://helm.sh/docs/topics/charts_hooks/). - -2. Implement Helm Chart Hooks: - - Develop pre-install and post-install pods within your Helm chart, without adding any complex logic (e.g., use "sleep 20"). You can refer to [Example 1 in the guide](https://www.golinuxcloud.com/kubernetes-helm-hooks-examples/). - -3. Troubleshoot Hooks: - - Execute the following commands to troubleshoot your hooks: - 1. `helm lint ` - 2. `helm install --dry-run helm-hooks ` - 3. `kubectl get po` - -4. Provide Output: - - Execute the following commands and include their output in your report: - 1. `kubectl get po` - 2. `kubectl describe po ` - 3. 
`kubectl describe po ` - -5. Hook Delete Policy: - - Implement a hook delete policy to remove the hook once it has executed successfully. - -**List of Requirements:** - -- Helm Chart with Hooks implemented, including the hook delete policy. -- Output of the `kubectl get pods,svc` command in `HELM.md`. -- Output of all commands from the step 4 of Task 2 in `HELM.md`. - -## Bonus Task: Helm Library Chart - -**To Earn 2.5 Additional Points:** - -1. Helm Chart for Extra App: - - Prepare a Helm chart for an additional application. - -2. Helm Library Charts: - - Get acquainted with [Helm Library Charts](https://helm.sh/docs/topics/library_charts/). - -3. Create a Library Chart: - - Develop a simple library chart that includes a "labels" template. You can follow the steps outlined in [the Using Library Charts guide](https://austindewey.com/2020/08/17/how-to-reduce-helm-chart-boilerplate-with-library-charts/). Use this library chart for both of your applications. - -### Guidelines - -- Ensure your documentation is clear and well-structured. -- Include all the necessary components. -- Follow appropriate file and folder naming conventions. -- Create and participate in PRs for the peer review process. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Detailed documentation is crucial to ensure that your Helm deployment and hooks function as expected. Engage with the bonus tasks to further enhance your understanding and application deployment skills. diff --git a/lab11.md b/lab11.md deleted file mode 100644 index 4994bb1a80..0000000000 --- a/lab11.md +++ /dev/null @@ -1,85 +0,0 @@ -# Lab 11: Kubernetes Secrets and Hashicorp Vault - -## Overview - -In this lab, you will learn how to manage sensitive data, such as passwords, tokens, or keys, within Kubernetes. Additionally, you will configure CPU and memory limits for your application. 
- -## Task 1: Kubernetes Secrets and Resource Management - -**6 Points:** - -1. Create a Secret Using `kubectl`: - - Learn about Kubernetes Secrets and create a secret using the `kubectl` command: - - [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/) - - [Managing Secrets with kubectl](https://kubernetes.io/docs/tasks/configmap-secret/managing-secret-using-kubectl/#decoding-secret) - -2. Verify and Decode Your Secret: - - Confirm and decode the secret, then create an `11.md` file within the `k8s` folder. Provide the output of the necessary commands inside this file. - -3. Manage Secrets with Helm: - - Use Helm to manage your secrets. - - Create a `secrets.yaml` file in the `templates` folder. - - Define a `secret` object within this YAML file. - - Add an `env` field to your `Deployment`. The path to update is: `spec.template.spec.containers.env`. - - > Refer to this [Helm Secrets Video](https://www.youtube.com/watch?v=hRSlKRvYe1A) for guidance. - - - Update your Helm deployment as instructed in the video. - - Retrieve the list of pods using the command `kubectl get po`. Use the name of the pod as proof of your success within the report. - - Verify your secret inside the pod, for example: `kubectl exec demo-5f898f5f4c-2gpnd -- printenv | grep MY_PASS`. Share this output in `11.md`. - -## Task 2: Vault Secret Management System - -**4 Points:** - -1. Install Vault Using Helm Chart: - - Install Vault using a Helm chart. Follow the steps provided in this guide: - - [Vault Installation Guide](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-sidecar#install-the-vault-helm-chart) - -2. 
Follow the Tutorial with Your Helm Chart: - - Adapt the tutorial to work with your Helm chart, including the following steps: - - [Set a Secret in Vault](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-sidecar#set-a-secret-in-vault) - - [Configure Kubernetes Authentication](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-sidecar#configure-kubernetes-authentication) - - Be cautious with the service account. If you used `helm create ...`, it will be created automatically. In the guide, they create it manually. - - [Manually Define a Kubernetes Service Account](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-sidecar#define-a-kubernetes-service-account) - -3. Implement Vault Secrets in Your Helm Chart: - - Use the steps from the guide as an example for your Helm chart: - - [Update values.yaml](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-sidecar#launch-an-application) - - [Add Labels](https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-sidecar#inject-secrets-into-the-pod) - - Test to ensure your credentials are injected successfully. Use the `kubectl exec -it -- bash` command to access the container. Verify the injected secrets using `cat /path/to/your/secret` and `df -h`. Share the output in the `11.md` report. - - Apply a template as described in the guide. Test the updates as you did in the previous step and provide the outputs in `11.md`. - -**List of Requirements:** - -- Proof of work with a secret in `11.md` for the Task 1 - steps 2 and 3. -- `secrets.yaml` file. -- Resource requests and limits for CPU and memory. -- Vault configuration implemented, with proofs in `11.md`. - -## Bonus Task: Resource Management and Environment Variables - -**2.5 Points:** - -1. 
Read About Resource Management: - - Familiarize yourself with resource management in Kubernetes: - - [Resource Management](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) - -2. Set Up Requests and Limits for CPU and Memory for Both Helm Charts: - - Configure resource requests and limits for CPU and memory for your application. - - Test to ensure these configurations work correctly. - -3. Add Environment Variables for Your Containers for Both Helm Charts: - - Read about Kubernetes environment variables: - - [Kubernetes Environment Variables](https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/) - - Update your Helm chart with several environment variables using named templates. Move these variables to the `_helpers.tpl` file: - - [Helm Named Templates](https://helm.sh/docs/chart_template_guide/named_templates/) - -### Guidelines - -- Ensure that your documentation is clear and organized. -- Include all the necessary components. -- Follow appropriate file and folder naming conventions. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Thorough documentation is essential to demonstrate your success in managing secrets and resource allocation in Kubernetes. Explore the bonus tasks to enhance your skills further. diff --git a/lab12.md b/lab12.md deleted file mode 100644 index efb72a29ec..0000000000 --- a/lab12.md +++ /dev/null @@ -1,68 +0,0 @@ -# Lab 12: Kubernetes ConfigMaps - -## Overview - -In this lab, you'll delve into Kubernetes ConfigMaps, focusing on managing non-confidential data and upgrading your application for persistence. ConfigMaps provide a way to decouple configuration artifacts from image content, allowing you to manage configuration data separately from the application. - -## Task 1: Upgrade Application for Persistence - -**6 Points:** - -1. 
Upgrade Your Application: - - Modify your application to: - - Implement a counter logic in your application to keep track of the number of times it's accessed. - - Save the counter number in the `visits` file. - - Introduce a new endpoint `/visits` to display the recorded visits. - - Test the changes: - - Update your `docker-compose.yml` to include a new volume with your `visits` file. - - Verify that the enhancements work as expected, you must see the updated number in the `visits` file on the host machine. - - Update the `README.md` for your application. - -## Task 2: ConfigMap Implementation - -**4 Points:** - -1. Understand ConfigMaps: - - Read about ConfigMaps in Kubernetes: - - [ConfigMaps](https://kubernetes.io/docs/concepts/configuration/configmap/) - -2. Mount a Config File: - - Create a `files` folder with a `config.json` file. - - Populate `config.json` with data in JSON format. - - Use Helm to mount `config.json`: - - Create a `configMap` manifest, extracting data from `config.json` using `.Files.Get`. - - Update `deployment.yaml` with `Volumes` and `VolumeMounts`. - - [Example](https://carlos.mendible.com/2019/02/10/kubernetes-mount-file-pod-with-configmap/) - - Install the updated Helm chart and verify success: - - Retrieve the list of pods: `kubectl get po`. - - Use the pod name as proof of successful deployment. - - Check the ConfigMap inside the pod, e.g., `kubectl exec demo-758cc4d7c4-cxnrn -- cat /config.json`. - -3. Documentation: - - Create `12.md` in the `k8s` folder and include the output of relevant commands. - -**List of Requirements:** - -- `config.json` in the `files` folder. -- `configMap` retrieving data from `config.json` using `.Files.Get`. -- `Volume`s and `VolumeMount`s in `deployments.yml`. -- `12.md` documenting the results of commands. - -## Bonus Task: ConfigMap via Environment Variables - -**2.5 Points:** - -1. Upgrade Bonus App: - - Implement persistence logic in your bonus app. - -2. 
ConfigMap via Environment Variables: - - Utilize ConfigMap via environment variables in a running container using the `envFrom` property. - - Provide proof with the output of the `env` command inside your container. - -### Guidelines - -- Maintain clear and organized documentation. -- Use appropriate naming conventions for files and folders. -- For your repository PR, ensure it's from the `lab12` branch to the main branch. - -> Note: Clear documentation is crucial to demonstrate successful data persistence and ConfigMap utilization in Kubernetes. Explore the bonus tasks to further enhance your skills. diff --git a/lab13.md b/lab13.md deleted file mode 100644 index e6f6c919f8..0000000000 --- a/lab13.md +++ /dev/null @@ -1,212 +0,0 @@ -# Lab 13: ArgoCD for GitOps Deployment - -## Overview - -In this lab, you will implement ArgoCD to automate Kubernetes application deployments using GitOps principles. You'll install ArgoCD via Helm, configure it to manage your Python app, and simulate production-like workflows. - -## Task 1: Deploy and Configure ArgoCD - -**6 Points:** - -1. Install ArgoCD via Helm - - Add the ArgoCD Helm repository: - - ```bash - helm repo add argo https://argoproj.github.io/argo-helm - ``` - - [ArgoCD Helm Chart Docs](https://github.com/argoproj/argo-helm) - - - Install ArgoCD: - - ```bash - helm install argo argo/argo-cd --namespace argocd --create-namespace - ``` - - [ArgoCD Installation Guide](https://argo-cd.readthedocs.io/en/stable/getting_started/) - - - Verify installation: - - ```bash - kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=argocd-server -n argocd --timeout=90s - ``` - -2. 
Install ArgoCD CLI - - Install the ArgoCD CLI tool (required for command-line interactions): - - ```bash - # For macOS (Homebrew): - brew install argocd - - # For Debian/Ubuntu (only if an argocd package is available in your configured repositories): - sudo apt-get install -y argocd - - # For Linux amd64 (for other platforms, pick the matching binary from the releases page): - curl -sSL -o argocd https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64 - chmod +x argocd - sudo mv argocd /usr/local/bin/ - ``` - - [ArgoCD CLI Docs](https://argo-cd.readthedocs.io/en/stable/cli_installation/) - - - Verify CLI installation: - - ```bash - argocd version - ``` - -3. Access the ArgoCD UI - - Forward the ArgoCD server port: - - ```bash - kubectl port-forward svc/argocd-server -n argocd 8080:443 & - ``` - - - Log in using the initial admin password: - - ```bash - # Retrieve the password: - kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 --decode - - # Log in via CLI (username admin, password from the command above): - argocd login localhost:8080 --insecure - # Confirm the session: - argocd account get-user-info - ``` - - [ArgoCD Authentication Docs](https://argo-cd.readthedocs.io/en/stable/user-guide/accessing/) - -4. Configure Python App Sync - - Create an ArgoCD folder: - Add an `ArgoCD` folder in your `k8s` directory for ArgoCD manifests. 
- - - Define the ArgoCD Application: - Create `argocd-python-app.yaml` in the `ArgoCD` folder: - - ```yaml - apiVersion: argoproj.io/v1alpha1 - kind: Application - metadata: - name: python-app - namespace: argocd - spec: - project: default - source: - repoURL: https://github.com//S25-core-course-labs.git - targetRevision: lab13 - path: - helm: - valueFiles: - - values.yaml - destination: - server: https://kubernetes.default.svc - namespace: default - syncPolicy: - automated: {} - ``` - - [ArgoCD Application Manifest Docs](https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative_setup/) - - - Apply the configuration: - - ```bash - kubectl apply -f ArgoCD/argocd-python-app.yaml - ``` - - - Verify sync: - - ```bash - argocd app sync python-app - argocd app get python-app - ``` - -5. Test Sync Workflow - - Modify `values.yaml` (e.g., update `replicaCount`). - - Commit and push changes to the target branch from the config. - - Observe ArgoCD auto-sync the update: - - ```bash - argocd app get python-app - ``` - -## Task 2: Multi-Environment Deployment & Auto-Sync - -**4 Points:** - -1. Set Up Multi-Environment Configurations - - Extend your Python app's Helm chart to support `dev` and `prod` environments. - - Create environment-specific values files (`values-dev.yaml`, `values-prod.yaml`). - -2. Create Namespaces - - ```bash - kubectl create namespace dev - kubectl create namespace prod - ``` - -3. Deploy Multi-Environment via ArgoCD - - Define two ArgoCD applications with auto-sync: - `argocd-python-dev.yaml` and `argocd-python-prod.yaml` (as before). - -4. Enable Auto-Sync - - Test auto-sync by updating `values-prod.yaml` and pushing to Git. - -5. Self-Heal Testing - - Test 1: Manual Override of Replica Count - 1. Modify the deployment's replica count manually: - - ```bash - kubectl patch deployment python-app-prod -n prod --patch '{"spec":{"replicas": 3}}' - ``` - - 2. 
Observe ArgoCD revert the change (automatic reversion requires `selfHeal: true` under `syncPolicy.automated`; otherwise the manual sync below restores the state from Git): - - ```bash - argocd app sync python-app-prod - argocd app get python-app-prod - ``` - - - Test 2: Delete a Pod (Replica) - 1. Delete a pod in the `prod` namespace: - - ```bash - kubectl delete pod -n prod -l - ``` - - 2. Verify Kubernetes recreates the pod to match the deployment's `replicaCount`: - - ```bash - kubectl get pods -n prod -w - ``` - - 3. Confirm ArgoCD shows no drift (since pod deletions don't affect the desired state): - - ```bash - argocd app diff python-app-prod - ``` - -6. Documentation - - In `13.md`, include: - - Output of `kubectl get pods -n prod` before and after pod deletion. - - Screenshots of ArgoCD UI showing sync status and the dashboard after both tests. - - Explanation of how ArgoCD handles configuration drift vs. runtime events. - -## Bonus Task: Sync Your Bonus App with ArgoCD - -**2.5 Points:** - -1. Configure ArgoCD for Bonus App - - Create an `argocd--app.yaml` similar to Task 1, pointing to your bonus app's Helm chart folder. - - Sync and validate deployment with: - - ```bash - kubectl get pods -n - ``` - -### Guidelines - -- Follow the [ArgoCD docs](https://argo-cd.readthedocs.io/) for advanced configurations. -- Use consistent naming conventions (e.g., `lab13` branch for Git commits). -- Document all steps in `13.md` (include diffs, outputs, and UI screenshots). -- For your repository PR, ensure it's from the `lab13` branch to the main branch. - -> **Note**: This lab emphasizes GitOps workflows, environment isolation, and automation. Mastery of ArgoCD will streamline your CI/CD pipelines in real-world scenarios. 
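The self-heal test in Lab 13 depends on how `syncPolicy.automated` is configured. As a minimal sketch (not part of the original lab text), the fragment below shows the two optional fields Argo CD documents for automated sync. With an empty `automated: {}` block, both default to off, so a manual `kubectl patch` is reported as drift but is not reverted until the next sync.

```yaml
# Fragment of an Application spec; merge into your own manifest.
# Assumption: you want manual cluster changes reverted without running a sync by hand.
spec:
  syncPolicy:
    automated:
      prune: true     # delete resources that were removed from Git
      selfHeal: true  # re-sync when the live state drifts from Git
```

After applying a manifest with these fields, repeating the replica-count patch from Test 1 and watching `kubectl get deployment -n prod -w` should show the replica count return to the value from Git.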
diff --git a/lab14.md b/lab14.md deleted file mode 100644 index d1d6ba51cd..0000000000 --- a/lab14.md +++ /dev/null @@ -1,106 +0,0 @@ -# Lab 14: Kubernetes StatefulSet - -## Overview - -In this lab, you'll explore Kubernetes StatefulSets, focusing on managing stateful applications with guarantees about the ordering and uniqueness of a set of Pods. - -## Task 1: Implement StatefulSet in Helm Chart - -**6 Points:** - -1. Understand StatefulSets: - - Read about StatefulSet objects: - - [Concept](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) - - [Tutorial](https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/) - -2. Update Helm Chart: - - Rename `deployment.yml` to `statefulset.yml`. - - Create a manifest for StatefulSet following the tutorial. - - Test with command: `helm install --dry-run --debug name_of_your_chart path_to_your_chart`. - - Fix any issues and deploy it. - - Apply best practices by moving values to variables in `values.yml` meaningfully. - -## Task 2: StatefulSet Exploration and Optimization - -**4 Points:** - -1. Research and Documentation: - - Create `14.md` report. - - Include the output of `kubectl get po,sts,svc,pvc` commands. - - Use `minikube service name_of_your_statefulset` command to access your app. - - Access the root path of your app from different tabs and modes in your browser. - - Check the content of your file in each pod, e.g., `kubectl exec pod/demo-0 -- cat visits`, and provide the output for all replicas. - - Describe and explain differences in the report. - -2. Persistent Storage Validation - - Delete a pod: - - ```bash - kubectl delete pod app-stateful-0 - ``` - - - Verify that the PVC and data persist: - - ```bash - kubectl get pvc - kubectl exec app-stateful-0 -- cat /data/visits - ``` - -3. Headless Service Access - - Access pods via DNS: - - ```bash - kubectl exec app-stateful-0 -- nslookup app-stateful-1.app-stateful - ``` - - - Document DNS resolution in `14.md`. - -4. 
Monitoring & Alerts - - Add liveness/readiness probes to your StatefulSet. - - Describe in `14.md`: - - How probes ensure pod health. - - Why they're critical for stateful apps. - -5. Ordering Guarantee and Parallel Operations: - - Explain why ordering guarantees are unnecessary for your app. - - Implement a way to instruct the StatefulSet controller to launch or terminate all Pods in parallel. - -**List of Requirements:** - -- Outputs of commands in `14.md`. -- Results of the "number of visits" command for each pod, with an explanation in `14.md`. -- Answers to questions in point 2 of `14.md`. -- Implementation of parallel launch and terminate. - -## Bonus Task: Update Strategies - -**2.5 Points:** - -1. Apply StatefulSet to Bonus App - - Convert your bonus app's Helm chart to use a StatefulSet. - -2. Explore Update Strategies - - Implement Rolling Updates: - - ```yaml - spec: - updateStrategy: - type: RollingUpdate - rollingUpdate: - partition: 1 - ``` - - - Test Canaries: - Update a subset of pods first. - - - Document in `14.md`: - - Explain `OnDelete`, `RollingUpdate`, and their use cases. - - Compare with Deployment update strategies. - -### Guidelines - -- Maintain clear and organized documentation. -- Use appropriate naming conventions for files and folders. -- For your repository PR, ensure it's from the `lab14` branch to the main branch. - -> Note: Understanding StatefulSets and their optimization is crucial for managing stateful applications in Kubernetes. Explore the bonus tasks to further enhance your skills. diff --git a/lab15.md b/lab15.md deleted file mode 100644 index 887587145d..0000000000 --- a/lab15.md +++ /dev/null @@ -1,78 +0,0 @@ -# Lab 15: Kubernetes Monitoring and Init Containers - -## Overview - -In this lab, you will explore Kubernetes cluster monitoring using Prometheus with the Kube Prometheus Stack. Additionally, you'll delve into the concept of Init Containers in Kubernetes. 
- -## Task 1: Kubernetes Cluster Monitoring with Prometheus - -**6 Points:** - -1. This lab was tested on a specific version of components: - - Minikube v1.33.0 - - Minikube kubectl v1.28.3 - - kube-prometheus-stack-57.2.0 v0.72.0 - - the minikube start command - `minikube start --driver=docker --container-runtime=containerd` - -2. Read about `Kube Prometheus Stack`: - - [Helm chart with installation guide](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) - - [Explanation of components](https://github.com/prometheus-operator/kube-prometheus#kubeprometheus) - -3. Describe Components: - - Create `15.md` and detail the components of the Kube Prometheus Stack, explaining their roles and functions. Avoid direct copy-pasting; provide a personal understanding. - -4. Install Helm Charts: - - Install the Kube Prometheus Stack to your Kubernetes cluster. - - Install your app's Helm chart. - - Provide the output of the `kubectl get po,sts,svc,pvc,cm` command in the report and explain each part. - -5. Utilize Grafana Dashboards: - - Access Grafana using `minikube service monitoring-grafana`. - - Explore existing dashboards to find information about your cluster: - 1. Check CPU and Memory consumption of your StatefulSet. - 2. Identify Pods with higher and lower CPU usage in the default namespace. - 3. Monitor node memory usage in percentage and megabytes. - 4. Count the number of pods and containers managed by the Kubelet service. - 5. Evaluate network usage of Pods in the default namespace. - 6. Determine the number of active alerts; also check the Web UI with `minikube service monitoring-kube-prometheus-alertmanager`. - - Provide answers to all these points in the report. - -## Task 2: Init Containers - -**4 Points:** - -1. 
Read about `Init Containers`: - - [Concept](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) - - [Tutorial](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-initialization/#create-a-pod-that-has-an-init-container) - -2. Implement Init Container: - - Create a new Volume. - - Implement an Init container to download any file using `wget` (you can use a site from the example). - - Provide proof of success, e.g., `kubectl exec pod/demo-0 -- cat /test.html`. - -**List of Requirements:** - -- Detailed explanation of monitoring stack components in `15.md`. -- Output and explanation of `kubectl get po,sts,svc,pvc,cm`. -- Answers to all 6 questions from point 5 of Task 1 in `15.md`. -- Implementation of Init Container. -- Proof of Init Container downloading a file. - -## Bonus Task: App Metrics & Multiple Init Containers - -**2.5 Points:** - -1. App Metrics: - - Fetch metrics from your app and provide proof. - -2. Init Container Queue: - - Create a queue of three Init containers, with any logic like adding new lines to the same file. - - Provide proof using the `cat` tool. - -### Guidelines - -- Ensure clear and organized documentation. -- Use appropriate naming conventions for files and folders. -- For your repository PR, ensure it's from the `lab15` branch to the main branch. - -> Note: Demonstrate successful implementation and understanding of Kubernetes monitoring and Init Containers. Take your time to explore the bonus tasks for additional learning opportunities. diff --git a/lab16.md b/lab16.md deleted file mode 100644 index 37912fc50b..0000000000 --- a/lab16.md +++ /dev/null @@ -1,75 +0,0 @@ -# Lab 16: IPFS and Fleek - -In this lab, you will explore essential DevOps tools and set up a project on the Fleek service. Follow the tasks below to complete the lab assignment. - -## Task 1: Set Up an IPFS Gateway Using Docker - -Objective: Understand and implement an IPFS gateway using Docker, upload a file, and verify it via an IPFS cluster. 
- -1. Set Up IPFS Gateway: - - Install Docker on your machine if it's not already installed. - - [Docker Installation Guide](https://docs.docker.com/get-docker/) - - - Pull the IPFS Docker image and run an IPFS container: - - ```sh - docker pull ipfs/go-ipfs - docker run -d --name ipfs_host -v /path/to/folder/with/file:/export -v ipfs_data:/data/ipfs -p 8080:8080 -p 4001:4001 -p 5001:5001 ipfs/go-ipfs - ``` - - - Verify the IPFS container is running: - - ```sh - docker ps - ``` - -2. Upload a File to IPFS: - - Open a browser and access the IPFS web UI: - - ```sh - http://127.0.0.1:5001/webui/ - ``` - - - Explore the web UI and wait for 5 minutes to sync up with the network. - - Upload any file via the web UI. - - Use the obtained hash to access the file via any public IPFS gateway. Here are a few options: - - [IPFS.io Gateway](https://ipfs.io/ipfs/) - - [Cloudflare IPFS Gateway](https://cloudflare-ipfs.com/ipfs/) - - [Infura IPFS Gateway](https://ipfs.infura.io/ipfs/) - - - Append your file hash to any of the gateway URLs to verify your file is accessible. Note that it may fail due to network overload, so don't worry if you can't reach it. - -3. Documentation: - - Create a `submission2.md` file. - - Share information about connected peers and bandwidth in your report. - - Provide the hash and the URLs used to verify the file on the IPFS gateways. - -## Task 2: Set Up Project on Fleek.xyz - -Objective: Set up a project on the Fleek service and share the IPFS link. - -1. Research: - - Understand what IPFS is and its purpose. - - Explore Fleek's features. - -2. Set Up: - - Sign up for a Fleek account if you haven't already. - - Use your fork of the Labs repository as your project source. Optionally, set up your own website (notify us in advance). - - Configure the project settings on Fleek. - - Deploy the Labs repository to Fleek, ensuring it is uploaded to IPFS. - -3. 
Documentation: - - Share the IPFS link and domain of the deployed project in the `submission2.md` file. - -## Additional Resources - -- [IPFS Documentation](https://docs.ipfs.io/) -- [Fleek Documentation](https://docs.fleek.xyz/) - -### Guidelines - -- Use proper Markdown formatting for documentation files. -- Organize files with appropriate naming conventions. -- Create a Pull Request to the main branch of the repository with your completed lab assignment. - -> Note: Actively explore and document your findings to gain hands-on experience with IPFS and Fleek. diff --git a/lab16/index.html b/lab16/index.html deleted file mode 100644 index acce39eee3..0000000000 --- a/lab16/index.html +++ /dev/null @@ -1,303 +0,0 @@ - - - - - - DevOps Engineering Expert Track - - - - -
- -
- -
-
-

Master Modern DevOps Practices

-

16 hands-on labs covering Kubernetes, Terraform, CI/CD, and more

- Start Free Trial → -
-
- -
-

Why This Course?

-
-
- -

16 Advanced Labs

-

Build production-ready systems from scratch

-
-
- -

Industry-Standard Tools

-

Terraform, ArgoCD, Prometheus, Vault, and more

-
-
- -

Job-Ready Skills

-

Learn tools used by top tech companies

-
-
-
- -
-
-

Lab Syllabus (2025 Edition)

-
- 1. Lab 1: Web Application Development
- 2. Lab 2: Containerization
- 3. Lab 3: Continuous Integration
- 4. Lab 4: Infrastructure as Code & Terraform
- 5. Lab 5: Configuration Management
- 6. Lab 6: Ansible Automation
- 7. Lab 7: Observability, Logging, Loki Stack
- 8. Lab 8: Monitoring & Prometheus
- 9. Lab 9: Kubernetes & Declarative Manifests
- 10. Lab 10: Helm Charts & Library Charts
- 11. Lab 11: Kubernetes Secrets Management (Vault, ConfigMaps)
- 12. Lab 12: Kubernetes ConfigMaps & Environment Variables
- 13. Lab 13: GitOps with ArgoCD
- 14. Lab 14: StatefulSet Optimization
- 15. Lab 15: Kubernetes Monitoring & Init Containers
- 16. Lab 16: IPFS & Fleek Decentralization
-
-
-
- -
-

Learning Progression

-
-

Phase 1: Foundations (Labs 1-6)

-

Web Dev → Containers → CI/CD → IaC → Ansible

-
-
-

Phase 2: Observability (Labs 7-8)

-

Logging → Monitoring → Loki/Prometheus

-
-
-

Phase 3: Kubernetes Mastery (Labs 9-12)

-

Deployments → Helm → Secrets → ConfigMaps

-
-
-

Phase 4: Expert Track (Labs 13-16)

-

GitOps → StatefulSets → IPFS → Final Project

-
-
- - - - - - \ No newline at end of file diff --git a/lab2.md b/lab2.md deleted file mode 100644 index ff71bc227d..0000000000 --- a/lab2.md +++ /dev/null @@ -1,85 +0,0 @@ -# Lab 2: Containerization - Docker - -## Overview - -In this lab assignment, you will learn to containerize applications using Docker, while focusing on best practices. Additionally, you will explore Docker multi-stage builds. Follow the tasks below to complete the lab assignment. - -## Task 1: Dockerize Your Application - -**6 Points:** - -1. Create a `Dockerfile`: - - Inside the `app_python` folder, craft a `Dockerfile` for your application. - - Research and implement Docker best practices. Utilize a Dockerfile linter for quality assurance. - -2. Build and Test Docker Image: - - Build a Docker image using your Dockerfile. - - Thoroughly test the image to ensure it functions correctly. - -3. Push Image to Docker Hub: - - If you lack a public Docker Hub account, create one. - - Push your Docker image to your public Docker Hub account. - -4. Run and Verify Docker Image: - - Retrieve the Docker image from your Docker Hub account. - - Execute the image and validate its functionality. - -## Task 2: Docker Best Practices - -**4 Points:** - -1. Enhance your docker image by implementing [Docker Best Practices](https://docs.docker.com/build/building/best-practices/). - - No root user inside, or you will get no points at all. - -2. Write `DOCKER.md`: - - Inside the `app_python` folder, create a `DOCKER.md` file. - - Elaborate on the best practices you employed within your Dockerfile. - - Implementing and listing numerous Docker best practices will earn you more points. - -3. Enhance the README.md: - - Update the `README.md` file in the `app_python` folder. - - Include a dedicated `Docker` section, explaining your containerized application and providing clear instructions for execution. - - How to build? - - How to pull? - - How to run? - -### List of Requirements - -- Rootless container. 
-- Use COPY, but only specific files. -- Layer sanity. -- Use `.dockerignore`. -- Use a precise version of your base image and language, example `python:3-alpine3.15`. - -## Bonus Task: Multi-Stage Builds Exploration - -**2.5 Points:** - -1. Dockerize Previous App: - - Craft a `Dockerfile` for the application from the prior lab. - - Place this Dockerfile within the corresponding `app_*` folder. - -2. Follow Main Task Guidelines: - - Apply the same steps and suggestions as in the primary Dockerization task. - -3. Study Docker Multi-Stage Builds: - - Familiarize yourself with Docker multi-stage builds. - - Consider implementing multi-stage builds, only if they enhance your project's structure and efficiency. - -4. Study Distroless Images: - - Explore how to use Distroless images by reviewing the official documentation: [GoogleContainerTools/distroless](https://github.com/GoogleContainerTools/distroless). - - Create new `distroless.Dockerfile` files for your Python app and your second app. - - Use the `nonroot` tag for both images to ensure they run with non-root privileges. - - Verify that the applications work correctly with the Distroless images. - - Compare the sizes of your previous Docker images with the new Distroless-based images. - - In the `DOCKER.md` file, describe the differences between the Distroless images and your previous images. Explain why these differences exist (e.g., smaller size, reduced attack surface, etc.). - - Include a screenshot of your final results (e.g., image sizes). - - Add a new section to the `README.md` file titled "Distroless Image Version". - -### Guidelines - -- Utilize appropriate Markdown formatting and structure for all documentation. -- Organize files within the lab folder with suitable naming conventions. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. 
- -> Note: Utilize Docker to containerize your application, adhering to best practices. Explore Docker multi-stage builds for a deeper understanding, and document your process using Markdown. diff --git a/lab3.md b/lab3.md deleted file mode 100644 index 2f4899750a..0000000000 --- a/lab3.md +++ /dev/null @@ -1,53 +0,0 @@ -# Lab 3: Continuous Integration Lab - -## Overview - -In this lab assignment, you will delve into continuous integration (CI) practices by focusing on code testing, setting up GitHub Actions CI, and optimizing workflows. Additionally, you will have the opportunity to explore bonus tasks to enhance your CI knowledge. Follow the tasks below to complete the lab assignment. - -## Task 1: Code Testing and GitHub Actions CI - -**6 Points:** - -1. Code Testing: - - Begin by researching and implementing best practices for code testing. - - Write comprehensive unit tests for your application. - - In the `PYTHON.md` file, describe the unit tests you've created and the best practices you applied. - - Enhance the `README.md` file by adding a "Unit Tests" section. - -2. Set Up GitHub Actions CI: - - Create a CI workflow using GitHub Actions to build and test your Python project. Refer to the [official GitHub Actions documentation](https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python) for guidance. - - Ensure your CI workflow includes at least three essential steps: Dependencies, Linter, and Tests. - - Integrate at least two Docker-related steps into your CI workflow: Login, and Build & Push. You can refer to the [Docker GitHub Actions documentation](https://docs.docker.com/ci-cd/github-actions/) for assistance. - - Update the `README.md` file to provide information about your CI workflow. - -## Task 2: CI Workflow Improvements - -**4 Points:** - -1. Workflow Enhancements: - - Add a workflow status badge to your repository for visibility. 
- - Dive into best practices for CI workflows and apply them to optimize your existing workflow. - - Utilize build cache to enhance workflow efficiency. - - Create a `CI.md` file and document the best practices you've implemented. - -2. Implement Snyk Vulnerability Checks: - - Integrate Snyk into your CI workflow to identify and address vulnerabilities in your projects. You can refer to the [Python example](https://github.com/snyk/actions/tree/master/python-3.8) for guidance, check [another option](https://docs.snyk.io/integrations/snyk-ci-cd-integrations/github-actions-integration#use-your-own-development-environment) how to install dependencies if you face any issue. - -## Bonus Task - -**2.5 Points:** - -1. Follow the Main Task Steps: - - Apply the same steps as in the primary CI task to set up CI workflows for an extra application. You can find useful examples in the [GitHub Actions starter workflows](https://github.com/actions/starter-workflows/tree/main/ci). - -2. CI Workflow Improvements: - 1. Python App CI: Configure the CI workflow to run only when changes occur in the `app_python` folder. - 2. Extra Language App CI: Configure the CI workflow to run only when changes occur in the `app_` folder. - -### Guidelines - -- Use proper Markdown formatting and structure for all documentation files. -- Organize files within the lab folder with suitable naming conventions. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Implement CI best practices, optimize your workflows, and explore bonus tasks to deepen your understanding of continuous integration. diff --git a/lab4.md b/lab4.md deleted file mode 100644 index e88b5e63e5..0000000000 --- a/lab4.md +++ /dev/null @@ -1,84 +0,0 @@ -# Lab 4: Infrastructure as Code Lab - -## Overview - -In this lab assignment, you will explore Infrastructure as Code (IAC) using Terraform. 
You'll build Docker and AWS infrastructures and dive into managing GitHub repositories through Terraform. Additionally, there are bonus tasks to enhance your Terraform skills. Follow the tasks below to complete the lab assignment. - -## Task 1: Introduction to Terraform - -**6 Points:** - -0. You will need a VPN tool for this lab. - -1. Get Familiar with Terraform: - - Begin by familiarizing yourself with Terraform by reading the [introduction](https://www.terraform.io/intro/index.html) and exploring [best practices](https://www.terraform.io/docs/cloud/guides/recommended-practices/index.html). - -2. Set Up Terraform Workspace: - - Create a `terraform` folder to organize your Terraform workspaces. - - Inside the `terraform` folder, create a file named `TF.md`. - -3. Docker Infrastructure Using Terraform: - - Follow the [Docker tutorial](https://learn.hashicorp.com/collections/terraform/docker-get-started) for building a Docker infrastructure with Terraform. - - Perform the following tasks as instructed in the tutorial: - - Install Terraform. - - Build the Infrastructure. - - Provide the output of the following commands in the `TF.md` file: - - ```sh - terraform state show - terraform state list - ``` - - - Document a part of the log with the applied changes. - - Utilize input variables to rename your Docker container. - - Finish the tutorial and provide the output of the `terraform output` command in the `TF.md` file. - -4. Yandex Cloud Infrastructure Using Terraform: - - Create an account on [Yandex Cloud](https://cloud.yandex.com/). - - Check for available free-tier options and select a free VM instance suitable for this lab. - - Follow the [Yandex Quickstart Guide](https://yandex.cloud/en-ru/docs/tutorials/infrastructure-management/terraform-quickstart#linux_1) to set up and configure Terraform for managing Yandex Cloud resources. - - Document the entire process, including setup steps, configurations, and any challenges encountered, in the `TF.md` file. - -5. 
[Optional] AWS Infrastructure Using Terraform: - - Follow the [AWS tutorial](https://learn.hashicorp.com/tutorials/terraform/aws-build?in=terraform/aws-get-started) alongside the instructions from the previous step. - -## Task 2: Terraform for GitHub - -**4 Points:** - -1. GitHub Infrastructure Using Terraform: - - Utilize the [GitHub provider for Terraform](https://registry.terraform.io/providers/integrations/github/latest/docs). - - Create a directory inside the `terraform` folder specifically for managing your GitHub project infrastructure. - - Build GitHub infrastructure following a reference like [this example](https://dev.to/pwd9000/manage-and-maintain-github-with-terraform-2k86). Prepare `.tf` files that include: - - Repository name - - Repository description - - Visibility settings - - Default branch - - Branch protection rule for the default branch - - Avoid placing your token as a variable in the code; instead, use an environment variable. - -2. Import Existing Repository: - - Use the `terraform import` command to import your current GitHub repository into your Terraform configuration. No need to create a new one. Example: `terraform import "github_repository.core-course-labs" "core-course-labs"`. - -3. Apply Terraform Changes: - - Apply changes from your Terraform configuration to your GitHub repository. - -4. Document Best Practices: - - Provide Terraform-related best practices that you applied in the `TF.md` file. - -## Bonus Task: Adding Teams - -**2.5 Points:** - -1. GitHub Teams Using Terraform: - - You need to create a new organization. - - Extend your Terraform configuration to add several teams to your GitHub repository, each with different levels of access. - - Apply the changes and ensure they take effect in your GitHub repository. - -### Guidelines - -- Use proper Markdown formatting and structure for documentation files. -- Organize files within the lab folder with suitable naming conventions. 
-- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Dive into Terraform to manage infrastructures efficiently. Explore the AWS and Docker tutorials, and don't forget to document your process and best practices in the `TF.md` file. diff --git a/lab5.md b/lab5.md deleted file mode 100644 index bead18ceea..0000000000 --- a/lab5.md +++ /dev/null @@ -1,141 +0,0 @@ -# Lab 5: Ansible and Docker Deployment - -## Overview - -In this lab, you will get acquainted with Ansible, a powerful configuration management and automation tool. Your objective is to use Ansible to deploy Docker on a newly created cloud VM. This knowledge will be essential for your application deployment in the next lab. - -## Task 1: Initial Setup - -**6 Points:** - -1. Repository Structure: - - Organize your repository following the recommended structure below: - - ```sh - . - |-- README.md - |-- ansible - | |-- inventory - | | `-- default_aws_ec2.yml - | |-- playbooks - | | `-- dev - | | `-- main.yaml - | |-- roles - | | |-- docker - | | | |-- defaults - | | | | `-- main.yml - | | | |-- handlers - | | | | `-- main.yml - | | | |-- tasks - | | | | |-- install_compose.yml - | | | | |-- install_docker.yml - | | | | `-- main.yml - | | | `-- README.md - | | `-- web_app - | | |-- defaults - | | | `-- main.yml - | | |-- handlers - | | | `-- main.yml - | | |-- meta - | | | `-- main.yml - | | |-- tasks - | | | `-- main.yml - | | `-- templates - | | `-- docker-compose.yml.j2 - | `-- ansible.cfg - |-- app_go - |-- app_python - `-- terraform - ``` - -2. Installation and Introduction: - - Install Ansible and familiarize yourself with its basics. You can follow the [Ansible installation guide](https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html). - -3. Use an Existing Ansible Role for Docker: - - Utilize an existing Ansible role for Docker from `ansible-galaxy` as a template. 
You can explore [this Docker role](https://github.com/geerlingguy/ansible-role-docker) as an example. - -4. Create a Playbook and Testing: - - Develop an Ansible playbook for deploying Docker. - - Test your playbook to ensure it works as expected. - -## Task 2: Custom Docker Role - -**4 Points:** - -1. Create Your Custom Docker Role: - - Develop a custom Ansible role for Docker with the following tasks: - 1. Install Docker and Docker Compose. - 2. Update your playbook to utilize this custom role. [Tricks and Tips](https://docs.ansible.com/ansible/latest/user_guide/playbooks_best_practices.html). - 3. Test your playbook with the custom role to ensure successful deployment. - 4. Make sure the role has a task to configure Docker to start on boot (`systemctl enable docker`). - 5. Include a task to add the current user to the `docker` group to avoid using `sudo` for Docker commands. - -2. Documentation: - - Develop an `ANSIBLE.md` file in the `ansible` folder to document your Ansible-related work. - - Create a `README.md` file in the `ansible/roles/docker` folder. - - Use a Markdown template to describe your Docker role, its requirements and usage. - - Example `README.md` template for the Docker role: - - ```markdown - # Docker Role - - This role installs and configures Docker and Docker Compose. - - ## Requirements - - - Ansible 2.9+ - - Ubuntu 22.04 - - ## Role Variables - - - `docker_version`: The version of Docker to install (default: `latest`). - - `docker_compose_version`: The version of Docker Compose to install (default: `1.29.2`). - - ## Example Playbook - - ```yaml - - hosts: all - roles: - - role: docker - ``` - -3. Deployment Output: - - Execute your playbook to deploy the Docker role. - - Provide the last 50 lines of the output from your deployment command in the `ANSIBLE.md` file. - - Use the `--check` flag with `ansible-playbook` to perform a dry run and verify changes before applying them. 
- - Example command: - - ```sh - ansible-playbook --diff - ``` - -4. **Inventory Details:** - - Execute the following command `ansible-inventory -i .yaml --list` and provide its output in the `ANSIBLE.md` file. - - Validate the inventory file using `ansible-inventory -i .yaml --graph` to visualize the inventory structure. - - Ensure you have documented the inventory information. - -## Bonus Task: Dynamic Inventory - -**2.5 Points:** - -1. Set up Dynamic Inventory: - - Implement dynamic inventory for your cloud environment, if available. - - You may explore ready-made solutions for dynamic inventories: - - - [AWS Example](https://docs.ansible.com/ansible/latest/collections/amazon/aws/aws_ec2_inventory.html) - - [Yandex Cloud (Note: Not tested)](https://github.com/rodion-goritskov/yacloud_compute) - - Implementing dynamic inventory can enhance your automation capabilities. - -2. Secure Docker Configuration: - - Add a task to configure Docker security settings, disable root access. - - Use the `copy` module and modify the `daemon.json` file. - -### Guidelines - -- Use proper Markdown formatting and structure for documentation files. -- Organize files within the lab folder with suitable naming conventions. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Ensure that your repository is well-structured, follow Ansible best practices, and provide clear documentation for a successful submission. diff --git a/lab6.md b/lab6.md deleted file mode 100644 index cc8249390d..0000000000 --- a/lab6.md +++ /dev/null @@ -1,139 +0,0 @@ -# Lab 6: Ansible and Application Deployment - -## Overview - -In this lab, you will utilize Ansible to set up a Continuous Deployment (CD) process for your application. - -## Task 1: Application Deployment - -**6 Points:** - -1. 
Create an Ansible Role: - - Develop an Ansible role specifically for deploying your application's Docker image, it can be done manually or via `ansible-galaxy init roles/web_app`. Call it `web_app`. - - Define variables in `roles/web_app/defaults/main.yml`. - - Add tasks to `roles/web_app/tasks/main.yml` to pull the Docker image and start the container. - - > Managing just a container is bad practice, you can omit it and move to the Task 2 directly. - -2. Update the Playbook: - - Modify your Ansible playbook to integrate the new role you've created for Docker image deployment. - -3. Deployment Output: - - Execute your playbook to deploy the role. - - Provide the last 50 lines of the output from your deployment command in the `ANSIBLE.md` file. - -## Task 2: Ansible Best Practices - -**4 Points:** - -1. Group Tasks with Blocks: - - Organize related tasks within your playbooks using Ansible blocks. - - Implement logical blocks. For example: - - ```yaml - - name: Setup Docker Environment - block: - - name: Install Docker - apt: - name: docker.io - state: present - - - name: Start Docker Service - service: - name: docker - state: started - enabled: yes - tags: - - setup - ``` - -2. Role Dependency: - - Set the role dependency for your `web_app` role to include the `docker` role. - - Specify dependencies in `roles/web_app/meta/main.yml`. - -3. Apply Tags: - - Implement Ansible tags to group tasks logically and enable selective execution. For example: - - ```yaml - - name: Pull Docker image - docker_image: - name: "{{ docker_image }}" - source: pull - tags: - - docker - ``` - - - Run specific tags. For example: - - ```bash - ansible-playbook site.yml --tags docker - ``` - -4. Wipe Logic: - - Create a wipe logic in `roles/web_app/tasks/0-wipe.yml`. This should include removing your Docker container and all related files. - - Ensure that this wipe process can be enabled or disabled by using a variable, for example, `web_app_full_wipe=true`. - -5. 
Separate Tag for Wipe: - - Utilize a distinct tag for the **Wipe** section of your Ansible playbook. This allows you to run the wipe tasks independently from the main tasks. - -6. Docker Compose File: - - Write a Jinja2 template (`roles/web_app/templates/docker-compose.yml.j2`). For example: - - ```yaml - version: '3' - services: - app: - image: "{{ docker_image }}" - ports: - - "{{ app_port }}:80" - ``` - - - Deliver the template using the `template` module in `roles/web_app/tasks/main.yml`. - - Suggested structure: - - ```sh - . - |-- defaults - | `-- main.yml - |-- meta - | `-- main.yml - |-- tasks - | |-- 0-wipe.yml - | `-- main.yml - `-- templates - `-- docker-compose.yml.j2 - ``` - -7. Create `README.md`: - - Create a `README.md` file in the `ansible/roles/web_app` folder. - - Use a suggested Docker Markdown template from the previous lab to describe your role, its requirements and usage. - -## Bonus Task: CD Improvement - -**2.5 Points:** - -1. Create an Extra Playbook: - - Develop an additional Ansible playbook specifically for your bonus application. - - You can reuse the existing Ansible role you created for your primary application or create a new one. - - Suggested structure: - - ```sh - . - `--ansible - `-- playbooks - `-- dev - |-- app_python - | `-- main.yaml - `-- app_go - `-- main.yaml - ``` - -### Guidelines - -- Use proper Markdown formatting and structure for documentation files. -- Organize files within the lab folder with suitable naming conventions. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. -- Follow the suggested structure for your Ansible roles, tasks, and templates. -- Utilize Ansible best practices such as grouping tasks with blocks, applying tags, and separating roles logically. 
- -> Note: Apply diligence to your Ansible implementation, follow best practices, and clearly document your work to achieve the best results in this lab assignment. diff --git a/lab7.md b/lab7.md deleted file mode 100644 index 48e65eb202..0000000000 --- a/lab7.md +++ /dev/null @@ -1,59 +0,0 @@ -# Lab 7: Monitoring and Logging - -## Overview - -In this lab, you will become familiar with a logging stack that includes Promtail, Loki, and Grafana. Your goal is to create a Docker Compose configuration and configuration files to set up this logging stack. - -## Task 1: Logging Stack Setup - -**6 Points:** - -1. Study the Logging Stack: - - Begin by researching the components of the logging stack: - - [Grafana Webinar: Loki Getting Started](https://grafana.com/go/webinar/loki-getting-started/) - - [Loki Overview](https://grafana.com/docs/loki/latest/overview/) - - [Loki GitHub Repository](https://github.com/grafana/loki) - -2. Create a Monitoring Folder: - - Start by creating a new folder named `monitoring` in your project directory. - -3. Docker Compose Configuration: - - Inside the `monitoring` folder, prepare a `docker-compose.yml` file that defines the entire logging stack along with your application. - - To assist you in this task, refer to these resources for sample Docker Compose configurations: - - [Example Docker Compose Configuration from Loki Repository](https://github.com/grafana/loki/blob/main/production/docker-compose.yaml) - - [Promtail Configuration Example](https://github.com/black-rosary/loki-nginx/blob/master/promtail/promtail.yml) (Adapt it as needed) - -4. Testing: - - Verify that the configured logging stack and your application work as expected. - -## Task 2: Documentation and Reporting - -**4 Points:** - -1. Logging Stack Report: - - Create a new file named `LOGGING.md` to document how the logging stack you've set up functions. - - Provide detailed explanations of each component's role within the stack. - -2. 
Screenshots: - - Capture screenshots that demonstrate the successful operation of your logging stack. - - Include these screenshots in your `LOGGING.md` report for reference. - -## Bonus Task: Additional Configuration - -**2.5 Points:** - -1. Integrating Your Extra App: - - Extend the `docker-compose.yml` configuration to include your additional application. - -2. Configure Stack for Comprehensive Logging: - - Modify the logging stack's configuration to collect logs from all containers defined in the `docker-compose.yml`. - - Include screenshots in your `LOGGING.md` report to demonstrate your success. - -### Guidelines - -- Ensure that your documentation in `LOGGING.md` is well-structured and comprehensible. -- Follow proper naming conventions for files and folders. -- Use code blocks and Markdown formatting where appropriate. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Thoroughly document your work, and ensure the logging stack functions correctly. Utilize the bonus points opportunity to enhance your understanding and the completeness of your setup. diff --git a/lab8.md b/lab8.md deleted file mode 100644 index 8eb0752ec7..0000000000 --- a/lab8.md +++ /dev/null @@ -1,71 +0,0 @@ -# Lab 8: Monitoring with Prometheus - -## Overview - -In this lab, you will become acquainted with Prometheus, set it up, and configure applications to collect metrics. - -## Task 1: Prometheus Setup - -**6 Points:** - -1. Learn About Prometheus: - - Begin by reading about Prometheus and its fundamental concepts: - - [Prometheus Overview](https://prometheus.io/docs/introduction/overview/) - - [Prometheus Naming Best Practices](https://prometheus.io/docs/practices/naming/) - -2. Integration with Docker Compose: - - Expand your existing `docker-compose.yml` file from the previous lab to include Prometheus. - -3. 
Prometheus Configuration: - - Configure Prometheus to collect metrics from both Loki and Prometheus containers. - -4. Verify Prometheus Targets: - - Access `http://localhost:9090/targets` to ensure that Prometheus is correctly scraping metrics. - - Capture screenshots that confirm the successful setup and place them in a file named `METRICS.md` within the monitoring folder. - -## Task 2: Dashboard and Configuration Enhancements - -**4 Points:** - -1. Grafana Dashboards: - - Set up dashboards in Grafana for both Loki and Prometheus. - - You can use examples as references: - - [Example Dashboard for Loki](https://grafana.com/grafana/dashboards/13407) - - [Example Dashboard for Prometheus](https://grafana.com/grafana/dashboards/3662) - - Capture screenshots displaying your successful dashboard configurations and include them in `METRICS.md`. - -2. Service Configuration Updates: - - Enhance the configuration of all services in the `docker-compose.yml` file: - - Add log rotation mechanisms. - - Specify memory limits for containers. - - Ensure these changes are documented within your `METRICS.md` file. - -3. Metrics Gathering: - - Extend Prometheus to gather metrics from all services defined in the `docker-compose.yml` file. - -## Bonus Task: Metrics and Health Checks - -**To Earn 2.5 Additional Points:** - -1. Application Metrics: - - Integrate metrics into your applications. You can refer to Python examples like: - - [Monitoring a Synchronous Python Web Application](https://dzone.com/articles/monitoring-your-synchronous-python-web-application) - - [Metrics Monitoring in Python](https://opensource.com/article/18/4/metrics-monitoring-and-python) - -2. Obtain Application Metrics: - - Configure your applications to export metrics. - -3. METRICS.md Update: - - Document your progress with the bonus tasks, including screenshots, in the `METRICS.md` file. - -4. 
Health Checks: - - Further enhance the `docker-compose.yml` file's service configurations by adding health checks for the containers. - -### Guidelines - -- Maintain a well-structured and comprehensible `METRICS.md` document. -- Adhere to file and folder naming conventions. -- Utilize code blocks and Markdown formatting where appropriate. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Ensure thorough documentation of your work, and guarantee that Prometheus correctly collects metrics. Take advantage of the bonus tasks to deepen your understanding and enhance the completeness of your setup. diff --git a/lab9.md b/lab9.md deleted file mode 100644 index 5493f042a6..0000000000 --- a/lab9.md +++ /dev/null @@ -1,76 +0,0 @@ -# Lab 9: Introduction to Kubernetes - -## Overview - -In this lab, you will explore Kubernetes, set up a local development environment, and create manifests for your application. - -## Task 1: Kubernetes Setup and Basic Deployment - -**6 Points:** - -1. Learn About Kubernetes: - - Begin by studying the fundamentals of Kubernetes: - - [What is Kubernetes](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/) - - [Kubernetes Components](https://kubernetes.io/docs/concepts/overview/components/) - -2. Install Kubernetes Tools: - - Install `kubectl` and `minikube`, essential tools for managing Kubernetes. - - [Kubernetes Tools](https://kubernetes.io/docs/tasks/tools/) - -3. Deploy Your Application: - - Deploy your application within the Minikube cluster using the `kubectl create` command. Create a `Deployment` resource for your app. - - [Example of Creating a Deployment](https://kubernetes.io/docs/tutorials/hello-minikube/#create-a-deployment) - - [Deployment Overview](https://kubernetes.io/docs/tutorials/kubernetes-basics/deploy-app/deploy-intro/) - -4. 
Access Your Application: - - Make your application accessible from outside the Kubernetes virtual network. Achieve this by creating a `Service` resource. - - [Example of Creating a Service](https://kubernetes.io/docs/tutorials/hello-minikube/#create-a-service) - - [Service Overview](https://kubernetes.io/docs/tutorials/kubernetes-basics/expose/expose-intro/) - -5. Create a Kubernetes Folder: - - Establish a `k8s` folder within your repository. - - Create a `README.md` report within this folder and include the output of the `kubectl get pods,svc` command. - -6. Cleanup: - - Remove the `Deployment` and `Service` resources that you created, maintaining a tidy Kubernetes environment. - -## Task 2: Declarative Kubernetes Manifests - -**4 Points:** - -1. Manifest Files for Your Application: - - As a more efficient and structured approach, employ configuration files to deploy your application. - - Create a `deployment.yml` manifest file that describes your app's deployment, specifying at least 3 replicas. - - [Kubernetes Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) - - [Declarative Management of Kubernetes Objects Using Configuration Files](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/) - -2. Service Manifest: - - Develop a `service.yml` manifest file for your application. - -3. Manifest Files in `k8s` Folder: - - Store these manifest files in the `k8s` folder of your repository. - - Additionally, provide the output of the `kubectl get pods,svc` command in the `README.md` report. - - Include the output of the `minikube service --all` command and the result from your browser, with a screenshot demonstrating that the IP matches the output of `minikube service --all`. - -## Bonus Task: Additional Configuration and Ingress - -**To Earn 2.5 Additional Points:** - -1. Manifests for Extra App: - - Create `deployment` and `service` manifests for an additional application. - -2. 
Ingress Manifests: - - Construct [Ingress manifests](https://kubernetes.io/docs/tasks/access-application-cluster/ingress-minikube/) for your applications. - -3. Application Availability Check: - - Utilize `curl` or a similar tool to verify the availability of your applications. Include the output in the report. - -**Guidelines:** - -- Maintain a clear and well-structured `README.md` document. -- Ensure that all required components are included. -- Adhere to file and folder naming conventions. -- Create and participate in PRs to facilitate the peer review process. -- Create pull requests (PRs) as needed: from your fork to the main branch of this repository, and from your fork's branch to your fork's master branch. - -> Note: Detailed documentation is crucial to ensure that your Kubernetes deployment is fully functional and accessible. Engage with the bonus tasks to further enhance your understanding and application deployment skills. diff --git a/labs/docs/LAB08.md b/labs/docs/LAB08.md new file mode 100644 index 0000000000..4b47e9ea29 --- /dev/null +++ b/labs/docs/LAB08.md @@ -0,0 +1,259 @@ +## Lab 8 - Metrics & Monitoring with Prometheus + +## Architecture +Key components: +- `testiks-app`: exposes Prometheus metrics at `GET /metrics` +- `Prometheus`: scrapes metrics (pull model) and stores time-series in TSDB +- `Grafana`: visualizes Prometheus metrics with dashboards (PromQL) +- `Loki`: remains in place for logs, complementing metrics + + +### Diagram: + +```mermaid +flowchart LR + A[py app :5000] --> |scrape| P[Prometheus :9090] + G[Grafana :3000] --> |query| P + G --> |query| D[Dashboards with metrics] + P --> |scrape| L[Loki :3100] + PT[Promtail :9080] --> |push| L + DC[Docker containers] --> |logs| PT +``` + +Data flow: +- Py app exposes metrics at `/metrics` using the Prometheus client library +- Prometheus scrapes all targets (app, itself, Loki, Grafana) +- Grafana queries Prometheus via PromQL to render dashboard panels +- Loki receives logs from Promtail, while Prometheus scrapes 
Loki's own metrics +- Grafana combines both data sources for full observability (logs + metrics) + +### Why these metrics +- Counter (`http_requests_total`): Useful for calculating request rates and error rates over time windows +- Histogram (`http_request_duration_seconds`): Provides bucketed latency distribution, enabling percentile calculations (p50, p95, p99) +- Gauge (`http_requests_in_progress`): Can go up and down: shows current load on the service +- Business metrics (`devops_info_endpoint_calls`): Track which endpoints are most popular beyond raw HTTP metrics + +The `/metrics` endpoint itself is excluded from tracking to avoid feedback loops + +## Application Instrumentation +### Metrics +We track the standard RED metrics with low-cardinality labels like `method`, normalized `endpoint`, and `status_code`: + +- **Counter** `http_requests_total{method,endpoint,status_code}` + Counts all HTTP requests. Useful for monitoring request rates and errors. +- **Histogram** `http_request_duration_seconds_bucket{method,endpoint,...}` + Measures latency distribution. We use this to calculate p95 and create heatmaps. +- **Gauge** `http_requests_in_progress` + Shows the number of ongoing HTTP requests at any moment. + +App-specific Metrics: +- **Counter** `devops_info_endpoint_calls{endpoint}` + Tracks usage for specific endpoints like `"/"` and `"/health"` +- **Histogram** `devops_info_system_collection_seconds` + Measures the time spent collecting system info within a request + +**Label Design Note**: Endpoint labels are normalized using Flask route rules (for example `"/health"`). We deliberately avoid using user IDs or raw paths to prevent high label cardinality. 
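The label normalization in the note above can be sketched without any framework. This is a minimal illustration under stated assumptions, not the code from `app.py`: `KNOWN_ROUTES`, `normalize_endpoint`, and the `"unmatched"` fallback value are hypothetical names chosen for this example. In a Flask app, the matched route rule (`request.url_rule`) would typically take the place of the lookup set.

```python
# Hypothetical set of routes the app registers (assumption for this sketch).
KNOWN_ROUTES = {"/", "/health"}


def normalize_endpoint(raw_path: str) -> str:
    """Map a raw request path to a small, bounded set of label values."""
    # Drop the query string so "/health?x=1" and "/health" share one label.
    path = raw_path.split("?", 1)[0]
    # Treat "/health/" and "/health" as the same endpoint, but keep "/" intact.
    if len(path) > 1:
        path = path.rstrip("/")
    # Anything that is not a registered route collapses into a single bucket,
    # so arbitrary or malicious paths cannot create new time series.
    return path if path in KNOWN_ROUTES else "unmatched"


if __name__ == "__main__":
    for candidate in ["/", "/health/?verbose=1", "/users/42", "/wp-login.php"]:
        print(candidate, "->", normalize_endpoint(candidate))
```

Collapsing unknown paths into one value keeps the number of `endpoint` label values fixed regardless of traffic, which is the property the note is after.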
+ +![all working](./screenshots/metrics.png) + +### Code Location +- Metrics are implemented in `./ansible/app_python/app.py`: +```python +http_requests_total = Counter( + 'http_requests_total', + 'Total HTTP requests', + ['method', 'endpoint', 'status_code'] +) + +http_request_duration_seconds = Histogram( + 'http_request_duration_seconds', + 'HTTP request duration', + ['method', 'endpoint'] +) + +http_requests_in_progress = Gauge( + 'http_requests_in_progress', + 'HTTP requests currently being processed' +) + +``` + +### Local Testing +```bash +cd app_python +pip install -r requirements.txt +python3 app.py +curl -s http://localhost:5000/metrics | head -n 40 +``` + +## Prometheus Configuration +### Docker Compose Setup + +The monitoring stack is defined in `monitoring/docker-compose.yml` + +Key settings: +- Prometheus image: prom/prometheus:v3.9.0 +- Scrape interval: 15s +- Retention: + - `--storage.tsdb.retention.time=15d` + - `--storage.tsdb.retention.size=10GB` + +Persistent volume: `prometheus-data:/prometheus` + +Connected to the same logging network as Loki and Grafana (from Lab 7) + +### Scrape Targets: + +Prometheus configuration is in `monitoring/prometheus/prometheus.yml`. 
Jobs include: +- prometheus: localhost:9090 +- app: app-python:5000 (path: `/metrics`) +- loki: loki:3100 (path: `/metrics`) +- grafana: grafana:3000 (path: `/metrics`) + +![alt text](./screenshots/allgreen.png) + +## Grafana Dashboard Walkthrough + +### Request Rate (time series) +Shows throughput per endpoint (RED metric "Rate"): + +`sum by (endpoint) (rate(http_requests_total[5m]))` + +![alt text](./screenshots/endpoints.png) + +### Error Rate (5xx) (time series) +Tracks server errors: + +`sum(rate(http_requests_total{status_code=~"5.."}[5m]))` + +![alt text](./screenshots/500.png) + +### Latency Heatmap (heatmap) +Visualizes latency distribution: + +`sum by (le) (rate(http_request_duration_seconds_bucket[5m]))` + +![alt text](./screenshots/latency.png) + +### Active Requests (stat/time series) +Displays ongoing requests: + +`http_requests_in_progress` + +![alt text](./screenshots/progress.png) + +### Status Code Distribution (pie chart) +Breakdown of 2xx/4xx/5xx responses: + +`sum by (status_code) (rate(http_requests_total[5m]))` + +![alt text](./screenshots/allreq.png) + + +### Uptime (app target) (stat) +Shows app availability: + +`up{job="app"}` + +![alt text](./screenshots/up.png) + +### CPU usage rate +Shows app CPU consumption: + +`rate(process_cpu_seconds_total{job="app"}[5m]) * 100` + +![alt text](image.png) + +## Production Setup + +### Health checks + +| Service | Check | Interval | Retries | +|-------------|-------------------------------------------------|----------|---------| +| Prometheus | `wget http://localhost:9090/-/healthy` | 10s | 5 | +| Loki | `wget http://localhost:3100/ready` | 10s | 5 | +| Grafana | `curl http://localhost:3000/api/health` | 10s | 5 | +| app-python | `urllib.request.urlopen('http://localhost:5000/health')` | 10s | 5 | + +--- + +### Resource limits + +| Service | CPU Limit | Memory Limit | CPU Reserved | Memory Reserved | +|-------------|-----------|--------------|--------------|----------------| +| Prometheus | 1.0 | 
1 GB | 0.25 | 256 MB | +| Loki | 1.0 | 1 GB | 0.25 | 256 MB | +| Grafana | 0.5 | 512 MB | 0.25 | 256 MB | +| app-python | 0.5 | 256 MB | 0.1 | 64 MB | +| Promtail | 0.5 | 512 MB | 0.1 | 128 MB | + +--- + +### Retention policies + +- **Prometheus**: 15 days / 10 GB (whichever limit is reached first) +- **Loki**: 168 hours (7 days), configured via `limits_config.retention_period` + +--- + +### Persistent volumes + +| Volume | Service | Mount Point | Purpose | +|-----------------|------------|--------------------|----------------------------------| +| `prometheus-data` | Prometheus | `/prometheus` | TSDB storage | +| `loki-data` | Loki | `/loki` | Log chunks and index | +| `grafana-data` | Grafana | `/var/lib/grafana` | Dashboards, users, settings | + +> Data survives `docker compose down` + `docker compose up -d`. + +--- + +## Testing Results + +### Verification steps + +```bash +cd monitoring +echo 'GRAFANA_ADMIN_PASSWORD=testpass' > .env +docker compose up -d + +docker compose ps + +curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[].health' +curl http://localhost:8000/metrics + +curl -u admin:admin http://localhost:3000/api/datasources + +# Test persistence +docker compose down +docker compose up -d +# Dashboards and data should persist +``` + +Persistence evidence: + +![alt text](image-1.png) + +## Metrics vs Logs - When to Use Each + +| Aspect | Metrics (Prometheus) | Logs (Loki) | +|----------------|----------------------------------------|----------------------------------| +| Purpose | Numeric measurements over time | Event records with context | +| Use when | "How many?", "How fast?", "How much?" | "What happened?", "Why did it fail?" 
| +| Alerting | Ideal: threshold-based alerts on rates | Possible but less efficient | +| Storage | Compact (numeric time series) | Verbose (full text) | +| Query | PromQL: aggregations, rates, percentiles | LogQL: filter, parse, aggregate | +| Example | "Error rate > 5% in last 5 min" | "Show me the stack trace for request X" | +| Cardinality | Keep low (avoid high-cardinality labels) | Naturally high (each log is unique) | + +**Best practice:** Use metrics for detection (something is wrong), logs for investigation (why it's wrong) + + +## Challenges & Solutions + +| Challenge | Solution | +|----------------------------------------|-------------------------------------------------------------------------| +| Metrics endpoint creating feedback loops | Excluded `/metrics` path from request tracking in `before_request` / `after_request` hooks | +| Grafana data source UID mismatch | Used provisioning YAML to auto-configure Prometheus and Loki data sources | +| Prometheus container health check | Used `wget` instead of `curl` since the `prom/prometheus` image is BusyBox-based and provides `wget` but not `curl` | +| Dashboard persistence across restarts | Used Grafana provisioning with JSON files mounted as volumes | \ No newline at end of file diff --git a/labs/docs/screenshots/500.png b/labs/docs/screenshots/500.png new file mode 100644 index 0000000000..3aff243056 Binary files /dev/null and b/labs/docs/screenshots/500.png differ diff --git a/labs/docs/screenshots/allgreen.png b/labs/docs/screenshots/allgreen.png new file mode 100644 index 0000000000..61ae8110c2 Binary files /dev/null and b/labs/docs/screenshots/allgreen.png differ diff --git a/labs/docs/screenshots/allreq.png b/labs/docs/screenshots/allreq.png new file mode 100644 index 0000000000..743d770919 Binary files /dev/null and b/labs/docs/screenshots/allreq.png differ diff --git a/labs/docs/screenshots/endpoints.png b/labs/docs/screenshots/endpoints.png new file mode 100644 index 0000000000..4cc70766b4 Binary files /dev/null and 
b/labs/docs/screenshots/endpoints.png differ diff --git a/labs/docs/screenshots/image-1.png b/labs/docs/screenshots/image-1.png new file mode 100644 index 0000000000..9f1781d97c Binary files /dev/null and b/labs/docs/screenshots/image-1.png differ diff --git a/labs/docs/screenshots/image.png b/labs/docs/screenshots/image.png new file mode 100644 index 0000000000..241124d6d6 Binary files /dev/null and b/labs/docs/screenshots/image.png differ diff --git a/labs/docs/screenshots/latency.png b/labs/docs/screenshots/latency.png new file mode 100644 index 0000000000..29690493bf Binary files /dev/null and b/labs/docs/screenshots/latency.png differ diff --git a/labs/docs/screenshots/metrics.png b/labs/docs/screenshots/metrics.png new file mode 100644 index 0000000000..0b988a8377 Binary files /dev/null and b/labs/docs/screenshots/metrics.png differ diff --git a/labs/docs/screenshots/progress.png b/labs/docs/screenshots/progress.png new file mode 100644 index 0000000000..384de84ea2 Binary files /dev/null and b/labs/docs/screenshots/progress.png differ diff --git a/labs/docs/screenshots/up.png b/labs/docs/screenshots/up.png new file mode 100644 index 0000000000..92535b5601 Binary files /dev/null and b/labs/docs/screenshots/up.png differ diff --git a/labs/lab01.md b/labs/lab01.md new file mode 100644 index 0000000000..18c9ff6c43 --- /dev/null +++ b/labs/lab01.md @@ -0,0 +1,693 @@ +# Lab 1 - DevOps Info Service: Web Application Development + +![difficulty](https://img.shields.io/badge/difficulty-beginner-success) +![topic](https://img.shields.io/badge/topic-Web%20Development-blue) +![points](https://img.shields.io/badge/points-10%2B2.5-orange) +![languages](https://img.shields.io/badge/languages-Python%20|%20Go-informational) + +> Build a DevOps info service that reports system information and health status. This service will evolve throughout the course into a comprehensive monitoring tool. 
+ +## Overview + +Create a **DevOps Info Service** - a web application providing detailed information about itself and its runtime environment. This foundation will grow throughout the course as you add containerization, CI/CD, monitoring, and persistence. + +**What You'll Learn:** +- Web framework selection and implementation +- System introspection and API design +- Python best practices and documentation +- Foundation for future DevOps tooling + +**Tech Stack:** Python 3.11+ | Flask 3.1 or FastAPI 0.115 + +--- + +## Tasks + +### Task 1 - Python Web Application (6 pts) + +Build a production-ready Python web service with comprehensive system information. + +#### 1.1 Project Structure + +Create this structure: + +``` +app_python/ +├── app.py # Main application +├── requirements.txt # Dependencies +├── .gitignore # Git ignore +├── README.md # App documentation +├── tests/ # Unit tests (Lab 3) +│ └── __init__.py +└── docs/ # Lab documentation + ├── LAB01.md # Your lab submission + └── screenshots/ # Proof of work + ├── 01-main-endpoint.png + ├── 02-health-check.png + └── 03-formatted-output.png +``` + +#### 1.2 Choose Web Framework + +Select and justify your choice: +- **Flask** - Lightweight, easy to learn +- **FastAPI** - Modern, async, auto-documentation +- **Django** - Full-featured, includes ORM + +Document your decision in `app_python/docs/LAB01.md`. 
+ +#### 1.3 Implement Main Endpoint: `GET /` + +Return comprehensive service and system information: + +```json +{ + "service": { + "name": "devops-info-service", + "version": "1.0.0", + "description": "DevOps course info service", + "framework": "Flask" + }, + "system": { + "hostname": "my-laptop", + "platform": "Linux", + "platform_version": "Ubuntu 24.04", + "architecture": "x86_64", + "cpu_count": 8, + "python_version": "3.13.1" + }, + "runtime": { + "uptime_seconds": 3600, + "uptime_human": "1 hour, 0 minutes", + "current_time": "2026-01-07T14:30:00.000Z", + "timezone": "UTC" + }, + "request": { + "client_ip": "127.0.0.1", + "user_agent": "curl/7.81.0", + "method": "GET", + "path": "/" + }, + "endpoints": [ + {"path": "/", "method": "GET", "description": "Service information"}, + {"path": "/health", "method": "GET", "description": "Health check"} + ] +} +``` + +
+💡 Implementation Hints
+
+**Get System Information:**
+```python
+import platform
+import socket
+from datetime import datetime
+
+hostname = socket.gethostname()
+platform_name = platform.system()
+architecture = platform.machine()
+python_version = platform.python_version()
+```
+
+**Calculate Uptime:**
+```python
+start_time = datetime.now()
+
+def get_uptime():
+    delta = datetime.now() - start_time
+    seconds = int(delta.total_seconds())
+    hours = seconds // 3600
+    minutes = (seconds % 3600) // 60
+    return {
+        'seconds': seconds,
+        'human': f"{hours} hours, {minutes} minutes"
+    }
+```
+
+**Request Information:**
+```python
+# Flask
+request.remote_addr                # Client IP
+request.headers.get('User-Agent')  # User agent
+request.method                     # HTTP method
+request.path                       # Request path
+
+# FastAPI
+request.client.host
+request.headers.get('user-agent')
+request.method
+request.url.path
+```
+
+
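The sample response in section 1.3 shows `"uptime_human": "1 hour, 0 minutes"`, whereas the uptime hint above would render `1 hours`. The following is a minimal sketch of a formatter that handles singular and plural units. The helper name `format_uptime` is an assumption for illustration and is not part of the lab specification.

```python
def format_uptime(total_seconds: int) -> str:
    """Render an uptime given in seconds as 'H hour(s), M minute(s)'.

    `format_uptime` is a hypothetical helper name, not required by the lab.
    """
    hours = total_seconds // 3600
    minutes = (total_seconds % 3600) // 60
    # Use the singular form only when the count is exactly 1
    hour_label = "hour" if hours == 1 else "hours"
    minute_label = "minute" if minutes == 1 else "minutes"
    return f"{hours} {hour_label}, {minutes} {minute_label}"
```

Inside `get_uptime()` the `'human'` value could then be produced with `format_uptime(seconds)`.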
+ +#### 1.4 Implement Health Check: `GET /health` + +Simple health endpoint for monitoring: + +```json +{ + "status": "healthy", + "timestamp": "2024-01-15T14:30:00.000Z", + "uptime_seconds": 3600 +} +``` + +Return HTTP 200 for healthy status. This will be used for Kubernetes probes in Lab 9. + +
+💡 Implementation Hints
+
+```python
+from datetime import datetime, timezone
+
+# Flask
+@app.route('/health')
+def health():
+    return jsonify({
+        'status': 'healthy',
+        'timestamp': datetime.now(timezone.utc).isoformat(),
+        'uptime_seconds': get_uptime()['seconds']
+    })
+
+# FastAPI (alternative to the Flask handler above; use one or the other)
+@app.get("/health")
+def health():
+    return {
+        'status': 'healthy',
+        'timestamp': datetime.now(timezone.utc).isoformat(),
+        'uptime_seconds': get_uptime()['seconds']
+    }
+```
+
+
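Both handlers above build the same dictionary, so one option is to keep the payload in a framework-independent function that either handler can return. This is a sketch under assumptions: `build_health_payload` and `START_TIME` are illustrative names, and the optional `now` argument exists only to make the function easy to test.

```python
from datetime import datetime, timezone

# Recorded once at import time; hypothetical module-level name
START_TIME = datetime.now(timezone.utc)


def build_health_payload(start_time, now=None):
    """Return the /health response body as a plain dict."""
    # Allow the caller to inject a fixed "now" so tests are deterministic
    now = now or datetime.now(timezone.utc)
    return {
        "status": "healthy",
        "timestamp": now.isoformat(),
        "uptime_seconds": int((now - start_time).total_seconds()),
    }
```

A Flask handler would wrap the result in `jsonify(...)`, while a FastAPI handler can return the dict directly. Note that `isoformat()` on a UTC-aware datetime ends in `+00:00` rather than the `Z` suffix shown in the sample response.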
+ +#### 1.5 Configuration + +Make your app configurable via environment variables: + +```python +import os + +HOST = os.getenv('HOST', '0.0.0.0') +PORT = int(os.getenv('PORT', 5000)) +DEBUG = os.getenv('DEBUG', 'False').lower() == 'true' +``` + +**Test:** +```bash +python app.py # Default: 0.0.0.0:5000 +PORT=8080 python app.py # Custom port +HOST=127.0.0.1 PORT=3000 python app.py +``` + +--- + +### Task 2 โ€” Documentation & Best Practices (4 pts) + +#### 2.1 Application README (`app_python/README.md`) + +Create user-facing documentation: + +**Required Sections:** +1. **Overview** - What the service does +2. **Prerequisites** - Python version, dependencies +3. **Installation** + ```bash + python -m venv venv + source venv/bin/activate + pip install -r requirements.txt + ``` +4. **Running the Application** + ```bash + python app.py + # Or with custom config + PORT=8080 python app.py + ``` +5. **API Endpoints** + - `GET /` - Service and system information + - `GET /health` - Health check +6. **Configuration** - Environment variables table + +#### 2.2 Best Practices + +Implement these in your code: + +**1. Clean Code Organization** +- Clear function names +- Proper imports grouping +- Comments only where needed +- Follow PEP 8 + +
+💡 Example Structure
+
+```python
+"""
+DevOps Info Service
+Main application module
+"""
+import os
+import socket
+import platform
+from datetime import datetime, timezone
+from flask import Flask, jsonify, request
+
+app = Flask(__name__)
+
+# Configuration
+HOST = os.getenv('HOST', '0.0.0.0')
+PORT = int(os.getenv('PORT', 5000))
+
+# Application start time
+START_TIME = datetime.now(timezone.utc)
+
+def get_system_info():
+    """Collect system information."""
+    return {
+        'hostname': socket.gethostname(),
+        'platform': platform.system(),
+        'architecture': platform.machine(),
+        'python_version': platform.python_version()
+    }
+
+@app.route('/')
+def index():
+    """Main endpoint - service and system information."""
+    # Implementation
+```
+
+
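The configuration lines in the example above call `int()` on `PORT` directly, so a value such as `PORT=abc` fails with a bare traceback at import time. A minimal sketch of a loader that validates the values first is given below. The function name `load_config`, the returned dict layout, and the accepted port range of 1 to 65535 are assumptions made for illustration.

```python
import os


def load_config(env=None):
    """Read HOST, PORT and DEBUG from an environment mapping, with validation."""
    # Accept any mapping so tests can pass a plain dict instead of os.environ
    env = os.environ if env is None else env
    host = env.get("HOST", "0.0.0.0")
    raw_port = env.get("PORT", "5000")
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"PORT must be an integer, got {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT must be between 1 and 65535, got {port}")
    # Same convention as section 1.5: only the string "true" enables debug
    debug = env.get("DEBUG", "False").lower() == "true"
    return {"host": host, "port": port, "debug": debug}
```

Passing the environment as an argument keeps the function testable without modifying the real process environment.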
+ +**2. Error Handling** + +
+💡 Implementation
+
+```python
+@app.errorhandler(404)
+def not_found(error):
+    return jsonify({
+        'error': 'Not Found',
+        'message': 'Endpoint does not exist'
+    }), 404
+
+@app.errorhandler(500)
+def internal_error(error):
+    return jsonify({
+        'error': 'Internal Server Error',
+        'message': 'An unexpected error occurred'
+    }), 500
+```
+
+
+ +**3. Logging** + +
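+💡 Implementation
+
+```python
+import logging
+
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+
+logger.info('Application starting...')
+
+# Call this inside a request handler, where `request` is available.
+# With level=logging.INFO, DEBUG messages are not emitted; set
+# level=logging.DEBUG to see them.
+logger.debug(f'Request: {request.method} {request.path}')
+```
+
+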
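Because the hint above fixes the level at `INFO`, its `logger.debug(...)` call produces no output. One way to make the level adjustable without code changes is to read it from the environment, in the same spirit as section 1.5. The sketch below assumes a hypothetical `LOG_LEVEL` variable and a hypothetical `configure_logging` helper; neither is required by the lab.

```python
import logging
import os


def configure_logging(level_name=None):
    """Configure root logging; the level defaults to LOG_LEVEL or INFO.

    LOG_LEVEL is a hypothetical environment variable used for illustration.
    """
    level_name = (level_name or os.getenv("LOG_LEVEL", "INFO")).upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        raise ValueError(f"Unknown log level: {level_name!r}")
    logging.basicConfig(
        level=level,
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        force=True,  # replace handlers left by any earlier basicConfig call
    )
    return logging.getLogger("devops-info-service")
```

With this in place, `LOG_LEVEL=DEBUG python app.py` would surface the per-request debug lines, while the default keeps production output at `INFO`.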
+ +**4. Dependencies (`requirements.txt`)** + +```txt +# Web Framework +Flask==3.1.0 +# or +fastapi==0.115.0 +uvicorn[standard]==0.32.0 # Includes performance extras +``` + +Pin exact versions for reproducibility. + +**5. Git Ignore (`.gitignore`)** + +```gitignore +# Python +__pycache__/ +*.py[cod] +venv/ +*.log + +# IDE +.vscode/ +.idea/ + +# OS +.DS_Store +``` + +#### 2.3 Lab Submission (`app_python/docs/LAB01.md`) + +Document your implementation: + +**Required Sections:** +1. **Framework Selection** + - Your choice and why + - Comparison table with alternatives +2. **Best Practices Applied** + - List practices with code examples + - Explain importance of each +3. **API Documentation** + - Request/response examples + - Testing commands +4. **Testing Evidence** + - Screenshots showing endpoints work + - Terminal output +5. **Challenges & Solutions** + - Problems encountered + - How you solved them + +**Required Screenshots:** +- Main endpoint showing complete JSON +- Health check response +- Formatted/pretty-printed output + +#### 2.4 GitHub Community Engagement + +**Objective:** Explore GitHub's social features that support collaboration and discovery. + +**Actions Required:** +1. **Star** the course repository +2. **Star** the [simple-container-com/api](https://github.com/simple-container-com/api) project โ€” a promising open-source tool for container management +3. **Follow** your professor and TAs on GitHub: + - Professor: [@Cre-eD](https://github.com/Cre-eD) + - TA: [@marat-biriushev](https://github.com/marat-biriushev) + - TA: [@pierrepicaud](https://github.com/pierrepicaud) +4. **Follow** at least 3 classmates from the course + +**Document in LAB01.md:** + +Add a "GitHub Community" section (after Challenges & Solutions) with 1-2 sentences explaining: +- Why starring repositories matters in open source +- How following developers helps in team projects and professional growth + +
+💡 GitHub Social Features
+
+**Why Stars Matter:**
+
+**Discovery & Bookmarking:**
+- Stars help you bookmark interesting projects for later reference
+- Star count indicates project popularity and community trust
+- Starred repos appear in your GitHub profile, showing your interests
+
+**Open Source Signal:**
+- Stars encourage maintainers (shows appreciation)
+- High star count attracts more contributors
+- Helps projects gain visibility in GitHub search and recommendations
+
+**Professional Context:**
+- Shows you follow best practices and quality projects
+- Indicates awareness of industry tools and trends
+
+**Why Following Matters:**
+
+**Networking:**
+- See what other developers are working on
+- Discover new projects through their activity
+- Build professional connections beyond the classroom
+
+**Learning:**
+- Learn from others' code and commits
+- See how experienced developers solve problems
+- Get inspiration for your own projects
+
+**Collaboration:**
+- Stay updated on classmates' work
+- Easier to find team members for future projects
+- Build a supportive learning community
+
+**Career Growth:**
+- Follow thought leaders in your technology stack
+- See trending projects in real-time
+- Build visibility in the developer community
+
+**GitHub Best Practices:**
+- Star repos you find useful (not spam)
+- Follow developers whose work interests you
+- Engage meaningfully with the community
+- Your GitHub activity shows employers your interests and involvement
+
+
+ +--- + +## Bonus Task โ€” Compiled Language (2.5 pts) + +Implement the same service in a compiled language to prepare for multi-stage Docker builds (Lab 2). + +**Choose One:** +- **Go** (Recommended) - Small binaries, fast compilation +- **Rust** - Memory safety, modern features +- **Java/Spring Boot** - Enterprise standard +- **C#/ASP.NET Core** - Cross-platform .NET + +**Structure:** + +``` +app_go/ (or app_rust, app_java, etc.) +โ”œโ”€โ”€ main.go +โ”œโ”€โ”€ go.mod +โ”œโ”€โ”€ README.md +โ””โ”€โ”€ docs/ + โ”œโ”€โ”€ LAB01.md # Implementation details + โ”œโ”€โ”€ GO.md # Language justification + โ””โ”€โ”€ screenshots/ +``` + +**Requirements:** +- Same two endpoints: `/` and `/health` +- Same JSON structure +- Document build process +- Compare binary size to Python + +
+💡 Go Example Skeleton
+
+```go
+package main
+
+import (
+	"encoding/json"
+	"net/http"
+	"os"
+	"runtime"
+	"time"
+)
+
+// The Service, System, Runtime, and Request structs and healthHandler
+// are left for you to define; this skeleton does not compile without them.
+type ServiceInfo struct {
+	Service Service `json:"service"`
+	System  System  `json:"system"`
+	Runtime Runtime `json:"runtime"`
+	Request Request `json:"request"`
+}
+
+var startTime = time.Now()
+
+func mainHandler(w http.ResponseWriter, r *http.Request) {
+	info := ServiceInfo{
+		Service: Service{
+			Name:    "devops-info-service",
+			Version: "1.0.0",
+		},
+		System: System{
+			Platform:     runtime.GOOS,
+			Architecture: runtime.GOARCH,
+			CPUCount:     runtime.NumCPU(),
+		},
+		// ... implement rest
+	}
+
+	w.Header().Set("Content-Type", "application/json")
+	json.NewEncoder(w).Encode(info)
+}
+
+func main() {
+	http.HandleFunc("/", mainHandler)
+	http.HandleFunc("/health", healthHandler)
+
+	port := os.Getenv("PORT")
+	if port == "" {
+		port = "8080"
+	}
+
+	http.ListenAndServe(":"+port, nil)
+}
+```
+
+
+ +--- + +## How to Submit + +1. **Create Branch:** + ```bash + git checkout -b lab01 + ``` + +2. **Commit Work:** + ```bash + git add app_python/ + git commit -m "feat: implement lab01 devops info service" + git push -u origin lab01 + ``` + +3. **Create Pull Requests:** + - **PR #1:** `your-fork:lab01` โ†’ `course-repo:master` + - **PR #2:** `your-fork:lab01` โ†’ `your-fork:master` + +4. **Verify:** + - All files present + - Screenshots included + - Documentation complete + +--- + +## Acceptance Criteria + +### Main Tasks (10 points) + +**Application Functionality (3 pts):** +- [ ] Service runs without errors +- [ ] `GET /` returns all required fields: + - [ ] Service metadata (name, version, description, framework) + - [ ] System info (hostname, platform, architecture, CPU, Python version) + - [ ] Runtime info (uptime, current time, timezone) + - [ ] Request info (client IP, user agent, method, path) + - [ ] Endpoints list +- [ ] `GET /health` returns status and uptime +- [ ] Configurable via environment variables (PORT, HOST) + +**Code Quality (2 pts):** +- [ ] Clean code structure +- [ ] PEP 8 compliant +- [ ] Error handling implemented +- [ ] Logging configured + +**Documentation (3 pts):** +- [ ] `app_python/README.md` complete with all sections +- [ ] `app_python/docs/LAB01.md` includes: + - [ ] Framework justification + - [ ] Best practices documentation + - [ ] API examples + - [ ] Testing evidence + - [ ] Challenges solved + - [ ] GitHub Community section (why stars/follows matter) +- [ ] All 3 required screenshots present +- [ ] Course repository starred +- [ ] simple-container-com/api repository starred +- [ ] Professor and TAs followed on GitHub +- [ ] At least 3 classmates followed on GitHub + +**Configuration (2 pts):** +- [ ] `requirements.txt` with pinned versions +- [ ] `.gitignore` properly configured +- [ ] Environment variables working + +### Bonus Task (2.5 points) + +- [ ] Compiled language app implements both endpoints +- [ ] Same JSON 
structure as Python version +- [ ] `app_/README.md` with build/run instructions +- [ ] `app_/docs/GO.md` with language justification +- [ ] `app_/docs/LAB01.md` with implementation details +- [ ] Screenshots showing compilation and execution + +--- + +## Rubric + +| Criteria | Points | Description | +|----------|--------|-------------| +| **Functionality** | 3 pts | Both endpoints work with complete, correct data | +| **Code Quality** | 2 pts | Clean, organized, follows Python standards | +| **Documentation** | 3 pts | Complete README and lab submission docs | +| **Configuration** | 2 pts | Dependencies, environment vars, .gitignore | +| **Bonus** | 2.5 pts | Compiled language implementation | +| **Total** | 12.5 pts | 10 pts required + 2.5 pts bonus | + +**Grading Scale:** +- **10/10:** Perfect implementation, excellent documentation +- **8-9/10:** All works, good docs, minor improvements possible +- **6-7/10:** Core functionality present, basic documentation +- **<6/10:** Missing features or documentation, needs revision + +--- + +## Resources + +
+📚 Python Web Frameworks
+
+- [Flask 3.1 Documentation](https://flask.palletsprojects.com/en/latest/)
+- [Flask Quickstart](https://flask.palletsprojects.com/en/latest/quickstart/)
+- [FastAPI Documentation](https://fastapi.tiangolo.com/)
+- [FastAPI Tutorial](https://fastapi.tiangolo.com/tutorial/first-steps/)
+- [Django 5.1 Documentation](https://docs.djangoproject.com/en/5.1/)
+
+
+ +
+🐍 Python Best Practices
+
+- [PEP 8 Style Guide](https://pep8.org/)
+- [Python Logging Tutorial](https://docs.python.org/3/howto/logging.html)
+- [Python platform module](https://docs.python.org/3/library/platform.html)
+- [Python socket module](https://docs.python.org/3/library/socket.html)
+
+
+ +
+🔧 Compiled Languages (Bonus)
+
+- [Go Web Development](https://golang.org/doc/articles/wiki/)
+- [Go net/http Package](https://pkg.go.dev/net/http)
+- [Rust Web Frameworks](https://www.arewewebyet.org/)
+- [Spring Boot Quickstart](https://spring.io/quickstart)
+- [ASP.NET Core Tutorial](https://docs.microsoft.com/aspnet/core/)
+
+
+ +
+🛠️ Development Tools
+
+- [Postman](https://www.postman.com/) - API testing
+- [HTTPie](https://httpie.io/) - Command-line HTTP client
+- [curl](https://curl.se/) - Data transfer tool
+- [jq](https://stedolan.github.io/jq/) - JSON processor
+
+
+ +--- + +## Looking Ahead + +This service evolves throughout the course: + +- **Lab 2:** Containerize with Docker, multi-stage builds +- **Lab 3:** Add unit tests and CI/CD pipeline +- **Lab 8:** Add `/metrics` endpoint for Prometheus +- **Lab 9:** Deploy to Kubernetes using `/health` probes +- **Lab 12:** Add `/visits` endpoint with file persistence +- **Lab 13:** Multi-environment deployment with GitOps + +--- + +**Good luck!** ๐Ÿš€ + +> **Remember:** Keep it simple, write clean code, and document thoroughly. This foundation will carry through all 16 labs! diff --git a/labs/lab02.md b/labs/lab02.md new file mode 100644 index 0000000000..1c3e032f89 --- /dev/null +++ b/labs/lab02.md @@ -0,0 +1,366 @@ +# Lab 2 โ€” Docker Containerization + +![difficulty](https://img.shields.io/badge/difficulty-beginner-success) +![topic](https://img.shields.io/badge/topic-Containerization-blue) +![points](https://img.shields.io/badge/points-10%2B2.5-orange) +![tech](https://img.shields.io/badge/tech-Docker-informational) + +> Containerize your Python app from Lab 1 using Docker best practices and publish it to Docker Hub. + +## Overview + +Take your Lab 1 application and package it into a Docker container. Learn image optimization, security basics, and the Docker workflow used in production. + +**What You'll Learn:** +- Writing production-ready Dockerfiles +- Docker best practices and security +- Image optimization techniques +- Docker Hub workflow + +**Tech Stack:** Docker 25+ | Python 3.13-slim | Multi-stage builds + +--- + +## Tasks + +### Task 1 โ€” Create Dockerfile (4 pts) + +**Objective:** Write a Dockerfile that containerizes your Python app following best practices. 
+ +Create `app_python/Dockerfile` with these requirements: + +**Must Have:** +- Non-root user (mandatory) +- Specific base image version (e.g., `python:3.13-slim` or `python:3.12-slim`) +- Only copy necessary files +- Proper layer ordering +- `.dockerignore` file + +**Your app should work the same way in the container as it did locally.** + +
+💡 Dockerfile Concepts & Resources
+
+**Key Dockerfile Instructions to Research:**
+- `FROM` - Choose your base image (look at python:3.13-slim, python:3.12-slim, python:3.13-alpine)
+- `RUN` - Execute commands (creating users, installing packages)
+- `WORKDIR` - Set working directory
+- `COPY` - Copy files into the image
+- `USER` - Switch to non-root user
+- `EXPOSE` - Document which port your app uses
+- `CMD` - Define how to start your application
+
+**Critical Concepts:**
+- **Layer Caching**: Why does the order of COPY commands matter?
+- **Non-root User**: How do you create and switch to a non-root user?
+- **Base Image Selection**: What's the difference between slim, alpine, and full images?
+- **Dependency Installation**: Why copy requirements.txt separately from application code?
+
+**Resources:**
+- [Dockerfile Reference](https://docs.docker.com/reference/dockerfile/)
+- [Best Practices Guide](https://docs.docker.com/build/building/best-practices/)
+- [Python Image Variants](https://hub.docker.com/_/python) - Use 3.13-slim or 3.12-slim
+
+**Think About:**
+- What happens if you copy all files before installing dependencies?
+- Why shouldn't you run as root?
+- How does layer caching speed up rebuilds?
+
+
+ +
+💡 .dockerignore Concepts
+
+**Purpose:** Prevent unnecessary files from being sent to Docker daemon during build (faster builds, smaller context).
+
+**What Should You Exclude?**
+Think about what doesn't need to be in your container:
+- Development artifacts (like Python's `__pycache__`, `*.pyc`)
+- Version control files (`.git` directory)
+- IDE configuration files
+- Virtual environments (`venv/`, `.venv/`)
+- Documentation that's not needed at runtime
+- Test files (if not running tests in container)
+
+**Key Question:** Why does excluding files from the build context matter for build speed?
+
+**Resources:**
+- [.dockerignore Documentation](https://docs.docker.com/engine/reference/builder/#dockerignore-file)
+- Look at your `.gitignore` for inspiration - many patterns overlap
+
+**Exercise:** Start minimal and add exclusions as needed, rather than copying a huge list you don't understand.
+
+
+ +**Test Your Container:** + +You should be able to: +1. Build your image using the `docker build` command +2. Run a container from your image with proper port mapping +3. Access your application endpoints from the host machine + +Verify that your application works the same way in the container as it did locally. + +--- + +### Task 2 โ€” Docker Hub (2 pts) + +**Objective:** Publish your image to Docker Hub. + +**Requirements:** +1. Create a Docker Hub account (if you don't have one) +2. Tag your image with your Docker Hub username +3. Authenticate with Docker Hub +4. Push your image to the registry +5. Verify the image is publicly accessible + +**Documentation Required:** +- Terminal output showing successful push +- Docker Hub repository URL +- Explanation of your tagging strategy + +
+💡 Docker Hub Resources
+
+**Useful Commands:**
+- `docker tag` - Tag images for registry push
+- `docker login` - Authenticate with Docker Hub
+- `docker push` - Upload image to registry
+- `docker pull` - Download image from registry
+
+**Resources:**
+- [Docker Hub Quickstart](https://docs.docker.com/docker-hub/quickstart/)
+- [Docker Tag Reference](https://docs.docker.com/reference/cli/docker/image/tag/)
+- [Best Practices for Tagging](https://docs.docker.com/build/building/best-practices/#tagging)
+
+
+ +--- + +### Task 3 โ€” Documentation (4 pts) + +**Objective:** Document your Docker implementation with focus on understanding and decisions. + +#### 3.1 Update `app_python/README.md` + +Add a **Docker** section explaining how to use your containerized application. Include command patterns (not exact commands) for: +- Building the image locally +- Running a container +- Pulling from Docker Hub + +#### 3.2 Create `app_python/docs/LAB02.md` + +Document your implementation with these sections: + +**Required Sections:** + +1. **Docker Best Practices Applied** + - List each practice you implemented (non-root user, layer caching, .dockerignore, etc.) + - Explain WHY each matters (not just what it does) + - Include relevant Dockerfile snippets with explanations + +2. **Image Information & Decisions** + - Base image chosen and justification (why this specific version?) + - Final image size and your assessment + - Layer structure explanation + - Optimization choices you made + +3. **Build & Run Process** + - Complete terminal output from your build process + - Terminal output showing container running + - Terminal output from testing endpoints (curl/httpie) + - Docker Hub repository URL + +4. **Technical Analysis** + - Why does your Dockerfile work the way it does? + - What would happen if you changed the layer order? + - What security considerations did you implement? + - How does .dockerignore improve your build? + +5. **Challenges & Solutions** + - Issues encountered during implementation + - How you debugged and resolved them + - What you learned from the process + +--- + +## Bonus Task โ€” Multi-Stage Build (2.5 pts) + +**Objective:** Containerize your compiled language app (from Lab 1 bonus) using multi-stage builds. + +**Why Multi-Stage?** Separate build environment from runtime โ†’ smaller final image. + +**Example Flow:** +1. **Stage 1 (Builder):** Compile the app (large image with compilers) +2. 
**Stage 2 (Runtime):** Copy only the binary (small image, no build tools) + +
+💡 Multi-Stage Build Concepts
+
+**The Problem:** Compiled language images include the entire compiler/SDK in the final image (huge!).
+
+**The Solution:** Use multiple `FROM` statements:
+- **Stage 1 (Builder)**: Use full SDK image, compile your application
+- **Stage 2 (Runtime)**: Use minimal base image, copy only the compiled binary
+
+**Key Concepts to Research:**
+- How to name build stages (`AS builder`)
+- How to copy files from previous stages (`COPY --from=builder`)
+- Choosing runtime base images (alpine, distroless, scratch)
+- Static vs dynamic compilation (affects what base image you can use)
+
+**Questions to Explore:**
+- What's the size difference between your builder and final image?
+- Why can't you just use the builder image as your final image?
+- What security benefits come from smaller images?
+- Can you use `FROM scratch`? Why or why not?
+
+**Resources:**
+- [Multi-Stage Builds Documentation](https://docs.docker.com/build/building/multi-stage/)
+- [Distroless Base Images](https://github.com/GoogleContainerTools/distroless)
+- Language-specific: Search "Go static binary Docker" or "Rust alpine Docker"
+
+**Challenge:** Try to get your final image under 20 MB.
+
+
+ +**Requirements:** +- Multi-stage Dockerfile in `app_go/` (or your chosen language) +- Working containerized application +- Documentation in `app_go/docs/LAB02.md` explaining: + - Your multi-stage build strategy + - Size comparison with analysis (builder vs final image) + - Why multi-stage builds matter for compiled languages + - Terminal output showing build process and image sizes + - Technical explanation of each stage's purpose + +**Bonus Points Given For:** +- Significant size reduction achieved with clear metrics +- Deep understanding of multi-stage build benefits +- Analysis of security implications (smaller attack surface) +- Explanation of trade-offs and decisions made + +--- + +## How to Submit + +1. **Create Branch:** Create a new branch called `lab02` + +2. **Commit Work:** + - Add your changes (app_python/ directory with Dockerfile, .dockerignore, updated docs) + - Commit with a descriptive message following conventional commits format + - Push to your fork + +3. **Create Pull Requests:** + - **PR #1:** `your-fork:lab02` โ†’ `course-repo:master` + - **PR #2:** `your-fork:lab02` โ†’ `your-fork:master` + +--- + +## Acceptance Criteria + +### Main Tasks (10 points) + +**Dockerfile (4 pts):** +- [ ] Dockerfile exists in `app_python/` +- [ ] Uses specific base image version +- [ ] Runs as non-root user (USER directive) +- [ ] Proper layer ordering (dependencies before code) +- [ ] Only copies necessary files +- [ ] `.dockerignore` file present +- [ ] Image builds successfully +- [ ] Container runs and app works + +**Docker Hub (2 pts):** +- [ ] Image pushed to Docker Hub +- [ ] Image is publicly accessible +- [ ] Correct tagging used +- [ ] Can pull and run from Docker Hub + +**Documentation (4 pts):** +- [ ] `app_python/README.md` has Docker section with command patterns +- [ ] `app_python/docs/LAB02.md` complete with: + - [ ] Best practices explained with WHY (not just what) + - [ ] Image information and justifications for choices + - [ ] Terminal 
output from build, run, and testing + - [ ] Technical analysis demonstrating understanding + - [ ] Challenges and solutions documented + - [ ] Docker Hub repository URL provided + +### Bonus Task (2.5 points) + +- [ ] Multi-stage Dockerfile for compiled language app +- [ ] Working containerized application +- [ ] Documentation in `app_/docs/LAB02.md` with: + - [ ] Multi-stage strategy explained + - [ ] Terminal output showing image sizes (builder vs final) + - [ ] Analysis of size reduction and why it matters + - [ ] Technical explanation of each stage + - [ ] Security benefits discussed + +--- + +## Rubric + +| Criteria | Points | Description | +|----------|--------|-------------| +| **Dockerfile** | 4 pts | Correct, secure, optimized | +| **Docker Hub** | 2 pts | Successfully published | +| **Documentation** | 4 pts | Complete and clear | +| **Bonus** | 2.5 pts | Multi-stage implementation | +| **Total** | 12.5 pts | 10 pts required + 2.5 pts bonus | + +**Grading:** +- **10/10:** Perfect Dockerfile, deep understanding demonstrated, excellent analysis +- **8-9/10:** Working container, good practices, solid understanding shown +- **6-7/10:** Container works, basic security, surface-level explanations +- **<6/10:** Missing requirements, runs as root, copy-paste without understanding + +--- + +## Resources + +
+📚 Docker Documentation
+
+- [Dockerfile Best Practices](https://docs.docker.com/build/building/best-practices/)
+- [Dockerfile Reference](https://docs.docker.com/reference/dockerfile/)
+- [Multi-Stage Builds](https://docs.docker.com/build/building/multi-stage/)
+- [.dockerignore](https://docs.docker.com/reference/dockerfile/#dockerignore-file)
+- [Docker Build Guide](https://docs.docker.com/build/guide/)
+
+
+ +
+🔒 Security Resources
+
+- [Docker Security Best Practices](https://docs.docker.com/build/building/best-practices/#security)
+- [Snyk Docker Security](https://snyk.io/learn/docker-security-scanning/)
+- [Why Non-Root Containers](https://docs.docker.com/build/building/best-practices/#user)
+- [Distroless Images](https://github.com/GoogleContainerTools/distroless) - Minimal base images
+
+
+ +
+🛠️ Tools
+
+- [Hadolint](https://github.com/hadolint/hadolint) - Dockerfile linter
+- [Dive](https://github.com/wagoodman/dive) - Explore image layers
+- [Docker Hub](https://hub.docker.com/) - Container registry
+
+
+ +--- + +## Looking Ahead + +- **Lab 3:** CI/CD will automatically build these Docker images +- **Lab 7-8:** Deploy containers with docker-compose for logging/monitoring +- **Lab 9:** Run these containers in Kubernetes +- **Lab 13:** ArgoCD will deploy containerized apps automatically + +--- + +**Good luck!** ๐Ÿš€ + +> **Remember:** Understanding beats copy-paste. Explain your decisions, not just your actions. Run as non-root or no points! diff --git a/labs/lab03.md b/labs/lab03.md new file mode 100644 index 0000000000..9824e934b3 --- /dev/null +++ b/labs/lab03.md @@ -0,0 +1,931 @@ +# Lab 3 โ€” Continuous Integration (CI/CD) + +![difficulty](https://img.shields.io/badge/difficulty-beginner-success) +![topic](https://img.shields.io/badge/topic-CI/CD-blue) +![points](https://img.shields.io/badge/points-10%2B2.5-orange) +![tech](https://img.shields.io/badge/tech-GitHub%20Actions-informational) + +> Automate your Python app testing and Docker builds with GitHub Actions CI/CD pipeline. + +## Overview + +Take your containerized app from Labs 1-2 and add automated testing and deployment. Learn how CI/CD catches bugs early, ensures code quality, and automates the Docker build/push workflow. + +**What You'll Learn:** +- Writing effective unit tests +- GitHub Actions workflow syntax +- CI/CD best practices (caching, matrix builds, security scanning) +- Automated Docker image publishing +- Continuous integration for multiple applications + +**Tech Stack:** GitHub Actions | pytest 8+ | Python 3.11+ | Snyk | Docker + +**Connection to Previous Labs:** +- **Lab 1:** Test the endpoints you created +- **Lab 2:** Automate the Docker build/push workflow +- **Lab 4+:** This CI pipeline will run for all future labs + +--- + +## Tasks + +### Task 1 โ€” Unit Testing (3 pts) + +**Objective:** Write comprehensive unit tests for your Python application to ensure reliability. + +**Requirements:** + +1. 
**Choose a Testing Framework** + - Research Python testing frameworks (pytest, unittest, etc.) + - Select one and justify your choice + - Install it in your `requirements.txt` or create `requirements-dev.txt` + +2. **Write Unit Tests** + - Create `app_python/tests/` directory + - Write tests for **all** your endpoints: + - `GET /` - Verify JSON structure and required fields + - `GET /health` - Verify health check response + - Test both successful responses and error cases + - Aim for meaningful test coverage (not just basic smoke tests) + +3. **Run Tests Locally** + - Verify all tests pass locally before CI setup + - Document how to run tests in your README + +
+💡 Testing Framework Guidance
+
+**Popular Python Testing Frameworks:**
+
+**pytest (Recommended):**
+- Pros: Simple syntax, powerful fixtures, excellent plugin ecosystem
+- Cons: Additional dependency
+- Use case: Most modern Python projects
+
+**unittest:**
+- Pros: Built into Python (no extra dependencies)
+- Cons: More verbose, less modern features
+- Use case: Minimal dependency projects
+
+**Key Testing Concepts to Research:**
+- Test fixtures and setup/teardown
+- Mocking external dependencies
+- Testing HTTP endpoints (test client usage)
+- Test coverage measurement
+- Assertions and expected vs actual results
+
+**What Should You Test?**
+- Correct HTTP status codes (200, 404, 500)
+- Response data structure (JSON fields present)
+- Response data types (strings, integers, etc.)
+- Edge cases (invalid requests, missing data)
+- Error handling (what happens when things fail?)
+
+**Questions to Consider:**
+- How do you test a Flask/FastAPI app without starting the server?
+- Should you test that `hostname` returns your actual hostname, or just that the field exists?
+- How do you simulate different client IPs or user agents in tests?
+
+**Resources:**
+- [Pytest Documentation](https://docs.pytest.org/)
+- [Flask Testing](https://flask.palletsprojects.com/en/stable/testing/)
+- [FastAPI Testing](https://fastapi.tiangolo.com/tutorial/testing/)
+- [Python unittest](https://docs.python.org/3/library/unittest.html)
+
+**Anti-Patterns to Avoid:**
+- Testing framework functionality instead of your code
+- Tests that always pass regardless of implementation
+- Tests with no assertions
+- Tests that depend on external services
+
+
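The guidance above asks whether to test the actual `hostname` value or only that the field exists. One possible answer is to assert on structure and types, which stays stable across machines and CI runners. The sketch below is written as plain `test_*` functions that pytest can discover. The names `REQUIRED_FIELDS`, `find_schema_problems`, and `SAMPLE_PAYLOAD` are assumptions for illustration, and the sample values are copied from the Lab 1 example response.

```python
# Expected sections and field types, taken from the Lab 1 example response
REQUIRED_FIELDS = {
    "service": {"name": str, "version": str, "description": str, "framework": str},
    "system": {"hostname": str, "platform": str, "architecture": str,
               "cpu_count": int, "python_version": str},
}

SAMPLE_PAYLOAD = {
    "service": {"name": "devops-info-service", "version": "1.0.0",
                "description": "DevOps course info service", "framework": "Flask"},
    "system": {"hostname": "my-laptop", "platform": "Linux", "architecture": "x86_64",
               "cpu_count": 8, "python_version": "3.13.1"},
}


def find_schema_problems(payload, required=REQUIRED_FIELDS):
    """Return a list of problems; an empty list means the payload fits."""
    problems = []
    for section, fields in required.items():
        body = payload.get(section)
        if not isinstance(body, dict):
            problems.append(f"missing section: {section}")
            continue
        for name, expected_type in fields.items():
            if name not in body:
                problems.append(f"missing field: {section}.{name}")
            elif not isinstance(body[name], expected_type):
                problems.append(f"wrong type: {section}.{name}")
    return problems


def test_complete_payload_has_no_problems():
    assert find_schema_problems(SAMPLE_PAYLOAD) == []


def test_missing_hostname_is_reported():
    broken = {**SAMPLE_PAYLOAD, "system": {"platform": "Linux"}}
    assert "missing field: system.hostname" in find_schema_problems(broken)
```

In a real test the payload would come from the framework's test client rather than a literal, for example `app.test_client().get("/").get_json()` in Flask or `TestClient(app).get("/").json()` in FastAPI, so no server needs to be started.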
+ +**What to Document:** +- Your testing framework choice and why +- Test structure explanation +- How to run tests locally +- Terminal output showing all tests passing + +--- + +### Task 2 โ€” GitHub Actions CI Workflow (4 pts) + +**Objective:** Create a GitHub Actions workflow that automatically tests your code and builds Docker images with proper versioning. + +**Requirements:** + +1. **Create Workflow File** + - Create `.github/workflows/python-ci.yml` in your repository + - Name your workflow descriptively + +2. **Implement Essential CI Steps** + + Your workflow must include these logical stages: + + **a) Code Quality & Testing:** + - Install dependencies + - Run a linter (pylint, flake8, black, ruff, etc.) + - Run your unit tests + + **b) Docker Build & Push with Versioning:** + - Authenticate with Docker Hub + - Build your Docker image + - Tag with proper version strategy (see versioning section below) + - Push to Docker Hub with multiple tags + +3. **Versioning Strategy** + + Choose **one** versioning approach and implement it: + + **Option A: Semantic Versioning (SemVer)** + - Version format: `v1.2.3` (major.minor.patch) + - Use git tags for releases + - Tag images like: `username/app:1.2.3`, `username/app:1.2`, `username/app:latest` + - **When to use:** Traditional software releases with breaking changes + + **Option B: Calendar Versioning (CalVer)** + - Version format: `2024.01.15` or `2024.01` (year.month.day or year.month) + - Based on release date + - Tag images like: `username/app:2024.01`, `username/app:latest` + - **When to use:** Time-based releases, continuous deployment + + **Required:** + - Document which strategy you chose and why + - Implement it in your CI workflow + - Show at least 2 tags per image (e.g., version + latest) + +4. **Workflow Triggers** + - Configure when the workflow runs (push, pull request, etc.) + - Consider which branches should trigger builds + +5. 
**Testing the Workflow** + - Push your workflow file and verify it runs + - Fix any issues that arise + - Ensure all steps complete successfully + - Verify Docker Hub shows your version tags + +
+๐Ÿ’ก GitHub Actions Concepts + +**Core Concepts to Research:** + +**Workflow Anatomy:** +- `name` - What is your workflow called? +- `on` - When does it run? (push, pull_request, schedule, etc.) +- `jobs` - What work needs to be done? +- `steps` - Individual commands within a job +- `runs-on` - What OS environment? (ubuntu-latest, etc.) + +**Key Questions:** +- Should you run CI on every push, or only on pull requests? +- What happens if tests fail? Should the workflow continue? +- How do you access secrets (like Docker Hub credentials) securely? +- Why might you want multiple jobs vs multiple steps in one job? + +**Python CI Steps Pattern:** +```yaml +# This is a pattern, not exact copy-paste code +# Research the actual syntax and actions needed + +- Set up Python environment +- Install dependencies +- Run linter +- Run tests +``` + +**Docker CI Steps Pattern:** +```yaml +# This is a pattern, not exact copy-paste code +# Research the actual actions and their parameters + +- Log in to Docker Hub +- Extract metadata for tags +- Build and push Docker image +``` + +**Important Concepts:** +- **Actions Marketplace:** Reusable actions (actions/checkout@v4, actions/setup-python@v5, docker/build-push-action@v6) +- **Secrets:** How to store Docker Hub credentials securely +- **Job Dependencies:** Can one job depend on another succeeding? 
+- **Matrix Builds:** Testing multiple Python versions (optional but good to know) +- **Caching:** Speed up workflows by caching dependencies (we'll add this in Task 3) + +**Resources:** +- [GitHub Actions Documentation](https://docs.github.com/en/actions) +- [Building and Testing Python](https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python) +- [Publishing Docker Images](https://docs.docker.com/ci-cd/github-actions/) +- [GitHub Actions Marketplace](https://github.com/marketplace?type=actions) +- [Workflow Syntax](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions) + +**Security Best Practices:** +- Never hardcode passwords or tokens in workflow files +- Use GitHub Secrets for sensitive data +- Understand when secrets are exposed to pull requests from forks +- Use `secrets.GITHUB_TOKEN` for GitHub API access (auto-provided) + +**Docker Hub Authentication:** +You'll need to create a Docker Hub access token and add it as a GitHub Secret. Research: +- How to create Docker Hub access tokens +- How to add secrets to your GitHub repository +- How to reference secrets in workflow files (hint: `${{ secrets.NAME }}`) + +
+ +
+๐Ÿ’ก Versioning Strategy Guidance + +**Semantic Versioning (SemVer):** + +**Format:** MAJOR.MINOR.PATCH (e.g., 1.2.3) +- **MAJOR:** Breaking changes (incompatible API changes) +- **MINOR:** New features (backward-compatible) +- **PATCH:** Bug fixes (backward-compatible) + +**Implementation Approaches:** +1. **Manual Git Tags:** Create git tags (v1.0.0) and reference in workflow +2. **Automated from Commits:** Parse conventional commits to bump version +3. **GitHub Releases:** Trigger on release creation + +**Docker Tagging Example:** +- `username/app:1.2.3` (full version) +- `username/app:1.2` (minor version, rolling) +- `username/app:1` (major version, rolling) +- `username/app:latest` (latest stable) + +**Pros:** Clear when breaking changes occur, industry standard for libraries +**Cons:** Requires discipline to follow rules correctly + +--- + +**Calendar Versioning (CalVer):** + +**Common Formats:** +- `YYYY.MM.DD` (e.g., 2024.01.15) - Daily releases +- `YYYY.MM.MICRO` (e.g., 2024.01.0) - Monthly with patch number +- `YYYY.0M` (e.g., 2024.01) - Monthly releases + +**Implementation Approaches:** +1. **Date-based:** Generate from current date in workflow +2. **Git SHA:** Combine with short commit SHA (2024.01-a1b2c3d) +3. 
**Build Number:** Use GitHub run number (2024.01.42) + +**Docker Tagging Example:** +- `username/app:2024.01` (month version) +- `username/app:2024.01.123` (with build number) +- `username/app:latest` (latest build) + +**Pros:** No ambiguity, good for continuous deployment, easier to remember +**Cons:** Doesn't indicate breaking changes + +--- + +**How to Implement in CI:** + +**Using docker/metadata-action:** +```yaml +# Pattern - research actual syntax +- name: Docker metadata + uses: docker/metadata-action + with: + # Define your tagging strategy here + # Can reference git tags, dates, commit SHAs +``` + +**Manual Tagging:** +```yaml +# Pattern - research actual syntax +- name: Generate version + run: echo "VERSION=$(date +%Y.%m.%d)" >> $GITHUB_ENV + +- name: Build and push + # Use ${{ env.VERSION }} in tags +``` + +**Questions to Consider:** +- How often will you release? (Daily? Per feature? Monthly?) +- Do users need to know about breaking changes explicitly? +- Are you building a library (use SemVer) or a service (CalVer works)? +- How will you track what's in each version? + +**Resources:** +- [Semantic Versioning](https://semver.org/) +- [Calendar Versioning](https://calver.org/) +- [Docker Metadata Action](https://github.com/docker/metadata-action) +- [Conventional Commits](https://www.conventionalcommits.org/) (for automated SemVer) + +
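To make the tagging examples above concrete, here is a small sketch of how one version could expand into several Docker tags. The function names `semver_tags` and `calver_tags` are illustrative assumptions, not part of `docker/metadata-action` or any other tool; in a real workflow the action would normally generate these tags for you.

```python
import re
from datetime import date


def semver_tags(image, git_tag):
    """Expand a git tag such as 'v1.2.3' into full, minor, major, and latest tags."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", git_tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {git_tag!r}")
    major, minor, patch = match.groups()
    versions = [f"{major}.{minor}.{patch}", f"{major}.{minor}", major, "latest"]
    return [f"{image}:{version}" for version in versions]


def calver_tags(image, build_date, run_number):
    """Build CalVer tags (YYYY.MM and YYYY.MM.<run number>) plus 'latest'."""
    month = f"{build_date.year}.{build_date.month:02d}"
    return [f"{image}:{month}", f"{image}:{month}.{run_number}", f"{image}:latest"]


print(semver_tags("username/app", "v1.2.3"))
print(calver_tags("username/app", date(2024, 1, 15), 42))
```

The rolling tags (`1.2`, `1`, `latest`) move with every release, while the full version tag stays pinned to one image. That is why publishing at least two tags per image is useful.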
+ +
+๐Ÿ’ก Debugging GitHub Actions + +**Common Issues & How to Debug:** + +**Workflow Won't Trigger:** +- Check your `on:` configuration +- Verify you pushed to the correct branch +- Look at Actions tab for filtering options + +**Steps Failing:** +- Click into the failed step to see full logs +- Check for typos in action names or parameters +- Verify secrets are configured correctly +- Test commands locally first + +**Docker Build Fails:** +- Ensure Dockerfile is in the correct location +- Check context path in build step +- Verify base image exists and is accessible +- Test Docker build locally first + +**Authentication Issues:** +- Verify secret names match exactly (case-sensitive) +- Check that Docker Hub token has write permissions +- Ensure you're using `docker/login-action` correctly + +**Debugging Techniques:** +- Add `run: echo "Debug message"` steps to understand workflow state +- Use `run: env` to see available environment variables +- Check Actions tab for detailed logs +- Enable debug logging (add `ACTIONS_RUNNER_DEBUG` secret = true) + +
+ +**What to Document:** +- Your workflow trigger strategy and reasoning +- Why you chose specific actions from the marketplace +- Your Docker tagging strategy (latest? version tags? commit SHA?) +- Link to successful workflow run in GitHub Actions tab +- Terminal output or screenshot of green checkmark + +--- + +### Task 3 โ€” CI Best Practices & Security (3 pts) + +**Objective:** Optimize your CI workflow and add security scanning. + +**Requirements:** + +1. **Add Status Badge** + - Add a GitHub Actions status badge to your `app_python/README.md` + - The badge should show the current workflow status (passing/failing) + +2. **Implement Dependency Caching** + - Add caching for Python dependencies to speed up workflow + - Measure and document the speed improvement + +3. **Add Security Scanning with Snyk** + - Integrate Snyk vulnerability scanning into your workflow + - Configure it to check for vulnerabilities in your dependencies + - Document any vulnerabilities found and how you addressed them + +4. **Apply CI Best Practices** + - Research and implement at least 3 additional CI best practices + - Document which practices you applied and why they matter + +
+💡 CI Best Practices Guidance + +**Dependency Caching:** + +Caching speeds up workflows by reusing previously downloaded dependencies. + +**Key Concepts:** +- What should be cached? (pip packages, Docker layers, etc.) +- What's the cache key? (based on requirements.txt hash) +- When does cache become invalid? +- How much time does caching save? + +**Actions to Research:** +- `actions/cache` for general caching +- `actions/setup-python` has built-in cache support + +**Questions to Explore:** +- Where are Python packages stored that should be cached? +- How do you measure cache hit vs cache miss? +- What happens if requirements.txt changes? + +**Status Badges:** + +Show workflow status directly in your README. + +**Format Pattern:** +```markdown +![Workflow Name](https://github.com/username/repo/actions/workflows/workflow-file.yml/badge.svg) +``` + +Research how to: +- Get the correct badge URL for your workflow +- Make badges clickable (link to Actions tab) +- Display specific branch status + +**CI Best Practices to Consider:** + +Research and choose at least 3 to implement: + +1. **Fail Fast:** Stop workflow on first failure +2. **Matrix Builds:** Test multiple Python versions (3.12, 3.13) +3. **Job Dependencies:** Don't push Docker if tests fail +4. **Conditional Steps:** Only push on main branch +5. **Pull Request Checks:** Require passing CI before merge +6. **Workflow Concurrency:** Cancel outdated workflow runs +7. **Docker Layer Caching:** Cache Docker build layers +8. **Environment Variables:** Use env for repeated values +9. **Secrets Scanning:** Prevent committing secrets +10. 
**YAML Validation:** Lint your workflow files + +**Resources:** +- [GitHub Actions Best Practices](https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration#usage-limits) +- [Caching Dependencies](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows) +- [Security Hardening](https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions) + +
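The idea of a cache key "based on requirements.txt hash" can be sketched in a few lines. This is only a conceptual illustration: the key layout and the `pip_cache_key` helper are assumptions made for the example, and in an actual workflow the hash would typically come from the `hashFiles()` expression rather than from Python code.

```python
import hashlib


def pip_cache_key(runner_os, python_version, requirements_text):
    """Derive a cache key that changes whenever the dependency list changes."""
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()
    # A shortened digest keeps the key readable in logs
    return f"{runner_os}-pip-{python_version}-{digest[:16]}"


old_key = pip_cache_key("Linux", "3.12", "flask==3.0.0\n")
same_key = pip_cache_key("Linux", "3.12", "flask==3.0.0\n")
new_key = pip_cache_key("Linux", "3.12", "flask==3.0.0\npytest==8.0.0\n")

print(old_key == same_key)  # identical inputs: cache hit
print(old_key == new_key)   # changed requirements: cache miss, dependencies reinstalled
```

Including the OS and Python version in the key keeps packages built for one environment from being restored into another.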
+ +
+๐Ÿ’ก Snyk Integration Guidance + +**What is Snyk?** + +Snyk is a security tool that scans your dependencies for known vulnerabilities. + +**Key Concepts:** +- Vulnerability databases (CVEs) +- Severity levels (low, medium, high, critical) +- Automated dependency updates +- Security advisories + +**Integration Options:** + +1. **Snyk GitHub Action:** + - Use `snyk/actions` from GitHub Marketplace + - Requires Snyk API token (free tier available) + - Can fail builds on vulnerabilities + +2. **Snyk CLI in Workflow:** + - Install Snyk CLI in workflow + - Run `snyk test` command + - More flexible but requires setup + +**Setup Steps:** +1. Create free Snyk account +2. Get API token from Snyk dashboard +3. Add token as GitHub Secret +4. Add Snyk step to workflow +5. Configure severity threshold (what level fails the build?) + +**Questions to Explore:** +- Should every vulnerability fail your build? +- What if vulnerabilities have no fix available? +- How do you handle false positives? +- When should you break the build vs just warn? + +**Resources:** +- [Snyk GitHub Actions](https://github.com/snyk/actions) +- [Snyk Python Example](https://github.com/snyk/actions/tree/master/python) +- [Snyk Documentation](https://docs.snyk.io/integrations/ci-cd-integrations/github-actions-integration) + +**Common Issues:** +- Dependencies not installed before Snyk runs +- API token not configured correctly +- Overly strict severity settings breaking builds +- Virtual environment confusion + +**What to Document:** +- Your severity threshold decision and reasoning +- Any vulnerabilities found and your response +- Whether you fail builds on vulnerabilities or just warn + +
+ +**What to Document:** +- Status badge in README (visible proof it works) +- Caching implementation and speed improvement metrics +- CI best practices you applied with explanations +- Snyk integration results and vulnerability handling +- Terminal output showing improved workflow performance + +--- + +## Bonus Task โ€” Multi-App CI with Path Filters + Test Coverage (2.5 pts) + +**Objective:** Set up CI for your compiled language app with intelligent path-based triggers AND add test coverage tracking. + +**Part 1: Multi-App CI (1.5 pts)** + +1. **Create Second CI Workflow** + - Create `.github/workflows/-ci.yml` for your Go/Rust/Java app + - Implement similar CI steps (lint, test, build Docker image) + - Use language-specific actions and best practices + - Apply versioning strategy (SemVer or CalVer) consistently + +2. **Implement Path-Based Triggers** + - Python workflow should only run when `app_python/` files change + - Compiled language workflow should only run when `app_/` files change + - Neither should run when only docs or other files change + +3. **Optimize for Multiple Apps** + - Ensure both workflows can run in parallel + - Consider using workflow templates (DRY principle) + - Document the benefits of path-based triggers + +**Part 2: Test Coverage Badge (1 pt)** + +4. **Add Coverage Tracking** + - Install coverage tool (`pytest-cov` for Python, coverage tool for your other language) + - Generate coverage reports in CI workflow + - Integrate with codecov.io or coveralls.io (free for public repos) + - Add coverage badge to README showing percentage + +5. **Coverage Goals** + - Document your current coverage percentage + - Identify what's not covered and why + - Set a coverage threshold in CI (e.g., fail if below 70%) + +
+💡 Path Filters & Multi-App CI + +**Why Path Filters?** + +In a monorepo with multiple apps, you don't want to run Python CI when only Go code changes. + +**Path Filter Syntax:** +```yaml +on: + push: + paths: + - 'app_python/**' + - '.github/workflows/python-ci.yml' +``` + +**Key Concepts:** +- Glob patterns for path matching +- When to include workflow file itself +- Exclude patterns (paths-ignore) +- How to test path filters + +**Questions to Explore:** +- Should changes to README.md trigger CI? +- Should changes to the root .gitignore trigger CI? +- What about changes to both apps in one commit? +- How do you test that path filters work correctly? + +**Multi-Language CI Patterns:** + +**For Go:** +- actions/setup-go +- golangci-lint for linting +- go test for testing +- Multi-stage Docker builds (from Lab 2 bonus) + +**For Rust:** +- dtolnay/rust-toolchain (the older actions-rs/toolchain is archived) +- cargo clippy for linting +- cargo test for testing +- cargo-audit for security + +**For Java:** +- actions/setup-java +- Maven or Gradle for build +- Checkstyle or SpotBugs for linting +- JUnit tests + +**Workflow Reusability:** + +Consider: +- Reusable workflows (call one workflow from another) +- Composite actions (bundle steps together) +- Workflow templates (DRY for similar workflows) + +**Resources:** +- [Path Filters](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#onpushpull_requestpaths) +- [Reusable Workflows](https://docs.github.com/en/actions/using-workflows/reusing-workflows) +- [Starter Workflows](https://github.com/actions/starter-workflows/tree/main/ci) + +
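A rough way to reason about path filters before pushing is to model the rule "run the workflow if at least one changed file matches at least one pattern". The sketch below does this with the standard library's `fnmatchcase`, which is only an approximation of GitHub's glob syntax (for example, here `*` also matches `/`, which is not how GitHub treats it). The `workflow_triggers` helper and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns mirroring the path filter shown in the guidance above
PYTHON_CI_PATHS = ["app_python/**", ".github/workflows/python-ci.yml"]


def workflow_triggers(changed_files, patterns):
    """Return True if at least one changed file matches at least one pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


print(workflow_triggers(["app_python/tests/test_app.py"], PYTHON_CI_PATHS))
print(workflow_triggers(["README.md"], PYTHON_CI_PATHS))
print(workflow_triggers(["app_go/main.go", "app_python/app.py"], PYTHON_CI_PATHS))
```

The last call shows the mixed-commit case: one matching file is enough to trigger the workflow, even if other files in the same commit belong to a different app.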
+ +
+💡 Test Coverage Tracking + +**What is Test Coverage?** + +Coverage measures what percentage of your code is executed by your tests. Higher coverage means more of your code runs under test. + +**Why Coverage Matters:** +- Identifies untested code paths +- Prevents regressions (changes breaking untested code) +- Increases confidence in refactoring +- Industry standard quality metric + +**Coverage Tools by Language:** + +**Python (pytest-cov):** +```bash +# Install +pip install pytest-cov + +# Run with coverage +pytest --cov=app_python --cov-report=xml --cov-report=term + +# Generates coverage.xml for upload +``` + +**Go (built-in):** +```bash +go test -coverprofile=coverage.out ./... +go tool cover -html=coverage.out +``` + +**Rust (tarpaulin):** +```bash +cargo install cargo-tarpaulin +cargo tarpaulin --out Xml +``` + +**Java (JaCoCo with Maven/Gradle):** +```bash +mvn test jacoco:report +# or +gradle test jacocoTestReport +``` + +**Integration Services:** + +**Codecov (Recommended):** +- Free for public repos +- Beautiful visualizations +- PR comments with coverage diff +- Setup: Sign in with GitHub, add repo, upload coverage report + +**Coveralls:** +- Alternative to Codecov +- Similar features +- Different UI + +**Coverage in CI Workflow:** +```yaml +# Pattern for Python (research actual syntax) +- name: Run tests with coverage + run: pytest --cov=. --cov-report=xml + +- name: Upload to Codecov + uses: codecov/codecov-action@v4 + with: + file: ./coverage.xml + token: ${{ secrets.CODECOV_TOKEN }} +``` + +**Coverage Badge:** +```markdown +![Coverage](https://codecov.io/gh/username/repo/branch/main/graph/badge.svg) +``` + +**Setting Coverage Thresholds:** + +You can fail CI if coverage drops below a threshold: + +```ini +# In pytest.ini (setup.cfg uses [tool:pytest]; pyproject.toml uses [tool.pytest.ini_options]) +[pytest] +addopts = --cov=. --cov-fail-under=70 +``` + +**Questions to Consider:** +- What's a reasonable coverage target? (70%? 80%? 90%?) 
+- Should you aim for 100% coverage? 
(Usually no - diminishing returns) +- What code is OK to leave untested? (Error handlers, config, main) +- How do you test hard-to-reach code paths? + +**Best Practices:** +- Don't chase 100% coverage blindly +- Focus on testing critical business logic +- Integration points should have high coverage +- Simple getters/setters can be skipped +- Measure coverage trends, not just absolute numbers + +**Resources:** +- [Codecov Documentation](https://docs.codecov.com/) +- [pytest-cov Documentation](https://pytest-cov.readthedocs.io/) +- [Go Coverage](https://go.dev/blog/cover) +- [Cargo Tarpaulin](https://github.com/xd009642/tarpaulin) +- [JaCoCo](https://www.jacoco.org/) + +
+ +**What to Document:** +- Second workflow implementation with language-specific best practices +- Path filter configuration and testing proof +- Benefits analysis: Why path filters matter in monorepos +- Example showing workflows running independently +- Terminal output or Actions tab showing selective triggering +- **Coverage integration:** Screenshot/link to codecov/coveralls dashboard +- **Coverage analysis:** Current percentage, what's covered/not covered, your threshold + +--- + +## How to Submit + +1. **Create Branch:** + - Create a new branch called `lab03` + - Develop your CI workflows on this branch + +2. **Commit Work:** + - Add workflow files (`.github/workflows/`) + - Add test files (`app_python/tests/`) + - Add documentation (`app_python/docs/LAB03.md`) + - Commit with descriptive message following conventional commits + +3. **Verify CI Works:** + - Push to your fork and verify workflows run + - Check that all jobs pass + - Review workflow logs for any issues + +4. **Create Pull Requests:** + - **PR #1:** `your-fork:lab03` โ†’ `course-repo:master` + - **PR #2:** `your-fork:lab03` โ†’ `your-fork:master` + - CI should run automatically on your PRs + +--- + +## Acceptance Criteria + +### Main Tasks (10 points) + +**Unit Testing (3 pts):** +- [ ] Testing framework chosen with justification +- [ ] Tests exist in `app_python/tests/` directory +- [ ] All endpoints have test coverage +- [ ] Tests pass locally (terminal output provided) +- [ ] README updated with testing instructions + +**GitHub Actions CI (4 pts):** +- [ ] Workflow file exists at `.github/workflows/python-ci.yml` +- [ ] Workflow includes: dependency installation, linting, testing +- [ ] Workflow includes: Docker Hub login, build, and push +- [ ] Versioning strategy chosen (SemVer or CalVer) and implemented +- [ ] Docker images tagged with at least 2 tags (e.g., version + latest) +- [ ] Workflow triggers configured appropriately +- [ ] All workflow steps pass successfully +- [ ] Docker Hub 
shows versioned images +- [ ] Link to successful workflow run provided + +**CI Best Practices (3 pts):** +- [ ] Status badge added to README and working +- [ ] Dependency caching implemented with performance metrics +- [ ] Snyk security scanning integrated +- [ ] At least 3 CI best practices applied +- [ ] Documentation complete (see Documentation Requirements section) + +### Bonus Task (2.5 points) + +**Part 1: Multi-App CI (1.5 pts)** +- [ ] Second workflow created for compiled language app (`.github/workflows/-ci.yml`) +- [ ] Language-specific linting and testing implemented +- [ ] Versioning strategy applied to second app +- [ ] Path filters configured for both workflows +- [ ] Path filters tested and proven to work (workflows run selectively) +- [ ] Both workflows can run in parallel +- [ ] Documentation explains benefits and shows selective triggering + +**Part 2: Test Coverage (1 pt)** +- [ ] Coverage tool integrated (`pytest-cov` or equivalent) +- [ ] Coverage reports generated in CI workflow +- [ ] Codecov or Coveralls integration complete +- [ ] Coverage badge added to README +- [ ] Coverage threshold set in CI (optional but recommended) +- [ ] Documentation includes coverage analysis (percentage, what's covered/not) + +--- + +## Documentation Requirements + +Create `app_python/docs/LAB03.md` with these sections: + +### 1. Overview +- Testing framework used and why you chose it +- What endpoints/functionality your tests cover +- CI workflow trigger configuration (when does it run?) +- Versioning strategy chosen (SemVer or CalVer) and rationale + +### 2. Workflow Evidence +``` +Provide links/terminal output for: +- โœ… Successful workflow run (GitHub Actions link) +- โœ… Tests passing locally (terminal output) +- โœ… Docker image on Docker Hub (link to your image) +- โœ… Status badge working in README +``` + +### 3. 
Best Practices Implemented +Quick list with one-sentence explanations: +- **Practice 1:** Why it helps +- **Practice 2:** Why it helps +- **Practice 3:** Why it helps +- **Caching:** Time saved (before vs after) +- **Snyk:** Any vulnerabilities found? Your action taken + +### 4. Key Decisions +Answer these briefly (2-3 sentences each): +- **Versioning Strategy:** SemVer or CalVer? Why did you choose it for your app? +- **Docker Tags:** What tags does your CI create? (e.g., latest, version number, etc.) +- **Workflow Triggers:** Why did you choose those triggers? +- **Test Coverage:** What's tested vs not tested? + +### 5. Challenges (Optional) +- Any issues you encountered and how you fixed them +- Keep it brief - bullet points are fine + +--- + +## Rubric + +| Criteria | Points | Description | +|----------|--------|-------------| +| **Unit Testing** | 3 pts | Comprehensive tests, good coverage | +| **CI Workflow** | 4 pts | Complete, functional, automated | +| **Best Practices** | 3 pts | Optimized, secure, well-documented | +| **Bonus** | 2.5 pts | Multi-app CI with path filters + test coverage | +| **Total** | 12.5 pts | 10 pts required + 2.5 pts bonus | + +**Grading:** +- **10/10:** All tasks complete, CI works flawlessly, clear documentation, meaningful tests +- **8-9/10:** CI works, good test coverage, best practices applied, solid documentation +- **6-7/10:** CI functional, basic tests, some best practices, minimal documentation +- **<6/10:** CI broken or missing steps, poor tests, incomplete work + +**Quick Checklist for Full Points:** +- ✅ Tests actually test your endpoints (not just imports) +- ✅ CI workflow runs and passes +- ✅ Docker image builds and pushes successfully +- ✅ At least 3 best practices applied (caching, Snyk, status badge, etc.) +- ✅ Documentation complete but concise (no essay needed!) +- ✅ Links/evidence provided (workflow runs, Docker Hub, etc.) 
+ +**Documentation Should Take:** 15-30 minutes to write, 5 minutes to review + +--- + +## Resources + +
+📚 GitHub Actions Documentation + +- [GitHub Actions Quickstart](https://docs.github.com/en/actions/quickstart) +- [Workflow Syntax](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions) +- [Building and Testing Python](https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python) +- [Publishing Docker Images](https://docs.docker.com/ci-cd/github-actions/) +- [GitHub Actions Marketplace](https://github.com/marketplace?type=actions) + +
+ +
+🧪 Testing Resources + +- [Pytest Documentation](https://docs.pytest.org/) +- [Flask Testing Guide](https://flask.palletsprojects.com/en/stable/testing/) +- [FastAPI Testing Guide](https://fastapi.tiangolo.com/tutorial/testing/) +- [Python Testing Best Practices](https://realpython.com/python-testing/) + +
+ +
+🔒 Security & Quality + +- [Snyk GitHub Actions](https://github.com/snyk/actions) +- [Snyk Python Integration](https://docs.snyk.io/integrations/ci-cd-integrations/github-actions-integration) +- [GitHub Security Best Practices](https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions) +- [Dependency Scanning](https://docs.github.com/en/code-security/supply-chain-security) + +
+ +
+⚡ Performance & Optimization + +- [Caching Dependencies](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows) +- [Docker Build Cache](https://docs.docker.com/build/cache/) +- [Workflow Optimization](https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration) + +
+ +
+🛠️ CI/CD Tools + +- [act](https://github.com/nektos/act) - Run GitHub Actions locally +- [actionlint](https://github.com/rhysd/actionlint) - Lint workflow files +- [GitHub CLI](https://cli.github.com/) - Manage workflows from terminal + +
+ +--- + +## Looking Ahead + +- **Lab 4-6:** CI will validate your Terraform and Ansible code +- **Lab 7-8:** CI will run integration tests with logging/metrics +- **Lab 9-10:** CI will validate Kubernetes manifests and Helm charts +- **Lab 13:** ArgoCD will deploy what CI builds (GitOps!) +- **All Future Labs:** This pipeline is your safety net for changes + +--- + +**Good luck!** 🚀 + +> **Remember:** CI isn't about having green checkmarks; it's about catching problems before they reach production. Focus on meaningful tests and understanding why each practice matters. Think like a DevOps engineer: automate everything, fail fast, and learn from failures. diff --git a/labs/lab04.md b/labs/lab04.md new file mode 100644 index 0000000000..36efa60723 --- /dev/null +++ b/labs/lab04.md @@ -0,0 +1,1509 @@ +# Lab 4 - Infrastructure as Code (Terraform & Pulumi) + +![difficulty](https://img.shields.io/badge/difficulty-beginner-success) +![topic](https://img.shields.io/badge/topic-Infrastructure%20as%20Code-blue) +![points](https://img.shields.io/badge/points-10%2B2.5-orange) +![tech](https://img.shields.io/badge/tech-Terraform%20%7C%20Pulumi-informational) + +> Provision cloud infrastructure using code with Terraform and Pulumi, comparing both approaches. + +## Overview + +Learn Infrastructure as Code (IaC) by creating virtual machines in the cloud using two popular tools: Terraform (declarative, HCL) and Pulumi (imperative, real programming languages). 
+ +**What You'll Learn:** +- Terraform fundamentals and HCL syntax +- Pulumi fundamentals and infrastructure with code +- Cloud provider APIs and resources +- Infrastructure lifecycle management +- IaC best practices and validation +- Comparing IaC tools and approaches + +**Connection to Previous Labs:** +- **Lab 2:** Created Docker images - now we'll provision infrastructure to run them +- **Lab 3:** CI/CD for applications - now we'll add CI/CD for infrastructure +- **Lab 5:** Ansible will provision software on these VMs (you'll need a VM ready!) + +**Tech Stack:** Terraform 1.9+ | Pulumi 3.x | Yandex Cloud / AWS + +**Why Two Tools?** +By using both Terraform and Pulumi for the same task, you'll understand: +- Different IaC philosophies (declarative vs imperative) +- Tool trade-offs and use cases +- How to evaluate IaC tools for your needs + +**Important for Lab 5:** +The VM you create in this lab will be used in **Lab 5 (Ansible)** for configuration management. You have two options: +- **Option A (Recommended):** Keep your cloud VM running until you complete Lab 5 +- **Option B:** Use a local VM (see Local VM Alternative section below) + +If you choose to destroy your cloud VM after Lab 4, you can easily recreate it later using your Terraform/Pulumi code! 
+ +--- + +## Important: Cloud Provider Selection + +### Recommended for Russia: Yandex Cloud + +Yandex Cloud offers free tier and is accessible in Russia: +- 1 VM with 20% vCPU, 1 GB RAM (free tier) +- 10 GB SSD storage +- No credit card required initially + +### Alternative Cloud Providers + +If Yandex Cloud is unavailable, choose any of these: + +**VK Cloud (Russia):** +- Russian cloud provider +- Free trial with bonus credits +- Good documentation in Russian + +**AWS (Amazon Web Services):** +- 750 hours/month free tier (t2.micro) +- Most popular globally +- Extensive documentation + +**GCP (Google Cloud Platform):** +- $300 free credits for 90 days +- Always-free tier for e2-micro +- Modern interface + +**Azure (Microsoft):** +- $200 free credits for 30 days +- Free tier for B1s instances +- Good Windows support + +**DigitalOcean:** +- Simple pricing and interface +- $200 free credits with GitHub Student Pack +- Beginner-friendly + +### Cost Management 🚨 + +**IMPORTANT - Read This:** +- ✅ **Use smallest/free tier instances only** +- ✅ **Run `terraform destroy` when done testing** +- ✅ **Consider keeping VM for Lab 5 to avoid recreation** +- ✅ **Set billing alerts if available** +- ✅ **If not using for Lab 5, delete resources after lab completion** +- ❌ **Never commit cloud credentials to Git** + +--- + +## Local VM Alternative + +If you cannot or prefer not to use cloud providers, you can use a local VM instead. This VM will need to meet specific requirements for Lab 5 (Ansible). 
+ +### Option 1: VirtualBox/VMware VM + +**Requirements:** +- Ubuntu 24.04 LTS (recommended) or Ubuntu 22.04 LTS +- 1 GB RAM minimum (2 GB recommended) +- 10 GB disk space +- Network adapter in Bridged mode (or NAT with port forwarding) +- SSH server installed and configured +- Your SSH public key added to `~/.ssh/authorized_keys` +- Static or predictable IP address + +**Setup Steps:** +```bash +# Install SSH server (if not installed) +sudo apt update +sudo apt install openssh-server + +# Add your SSH public key +mkdir -p ~/.ssh +echo "your-public-key-here" >> ~/.ssh/authorized_keys +chmod 700 ~/.ssh +chmod 600 ~/.ssh/authorized_keys + +# Verify SSH access from your host machine +ssh username@vm-ip-address +``` + +### Option 2: Vagrant VM + +**Requirements:** +- Vagrant installed on your machine +- VirtualBox (or another Vagrant provider) + +**Basic Vagrantfile:** +```ruby +Vagrant.configure("2") do |config| + config.vm.box = "ubuntu/noble64" # Ubuntu 24.04 LTS + # Or use "ubuntu/jammy64" for Ubuntu 22.04 LTS + config.vm.network "private_network", ip: "192.168.56.10" + config.vm.provider "virtualbox" do |vb| + vb.memory = "2048" + end +end +``` + +### Option 3: WSL2 (Windows Subsystem for Linux) + +**Note:** WSL2 can work but has networking limitations. Bridged mode VM is preferred. + +**If using local VM:** +- You can skip Terraform/Pulumi cloud provider setup +- Document your local VM setup instead +- For Task 1, show VM creation (manual or Vagrant) +- For Task 2, you can skip Pulumi (or use Pulumi to manage Vagrant) +- Focus on understanding IaC concepts with cloud provider research + +**Recommended Approach:** +Even with a local VM, complete the Terraform/Pulumi tasks with a cloud provider to gain real IaC experience. You can destroy the cloud VM after Lab 4 and use your local VM for Lab 5. + +--- + +## Tasks + +### Task 1 โ€” Terraform VM Creation (4 pts) + +**Objective:** Create a virtual machine using Terraform on your chosen cloud provider. 
+
+**Requirements:**
+
+1. **Setup Terraform**
+   - Install Terraform CLI
+   - Choose and configure your cloud provider
+   - Set up authentication (access keys, service accounts, etc.)
+   - Initialize Terraform
+
+2. **Define Infrastructure**
+
+   Create a `terraform/` directory with the following resources:
+
+   **Minimum Required Resources:**
+   - **VM/Compute Instance** (smallest free tier size)
+   - **Network/VPC** (if required by provider)
+   - **Security Group/Firewall Rules:**
+     - Allow SSH (port 22) from your IP
+     - Allow HTTP (port 80)
+     - Allow custom port 5000 (for future app deployment)
+   - **Public IP Address** (to access VM remotely)
+
+3. **Configuration Best Practices**
+   - Use variables for configurable values (region, instance type, etc.)
+   - Use outputs to display important information (public IP, etc.)
+   - Add appropriate tags/labels for resource identification
+   - Use `.gitignore` for sensitive files
+
+4. **Apply Infrastructure**
+   - Run `terraform plan` to preview changes
+   - Review the plan carefully
+   - Apply infrastructure
+   - Verify VM is accessible via SSH
+   - Document the public IP and connection method
+
+5. **State Management**
+   - Keep state file local (for now)
+   - Understand what the state file contains
+   - **Never commit `terraform.tfstate` to Git**
+
+
+💡 Terraform Fundamentals
+
+**What is Terraform?**
+
+Terraform is a declarative IaC tool that lets you define infrastructure in configuration files (HCL - HashiCorp Configuration Language).
+
+**Key Concepts:**
+
+**Providers:**
+- Plugins that interact with cloud APIs
+- Each cloud has its own provider (yandex, aws, google, azurerm)
+- Configure authentication and region
+
+**Resources:**
+- Infrastructure components (VMs, networks, firewalls)
+- Format: `resource "type" "name" { ... }`
+- Each resource has required and optional arguments
+
+**Data Sources:**
+- Query existing infrastructure
+- Example: Find latest Ubuntu image ID
+- Format: `data "type" "name" { ... }`
+
+**Variables:**
+- Make configurations reusable
+- Define in `variables.tf`
+- Set values in `terraform.tfvars` (gitignored!)
+- Reference: `var.variable_name`
+
+**Outputs:**
+- Display important values after apply
+- Example: VM public IP
+- Define in `outputs.tf`
+
+**State File:**
+- Tracks real infrastructure
+- Maps config to reality
+- **Never commit to Git** (contains sensitive data)
+- Add to `.gitignore`
+
+**Typical Workflow:**
+```bash
+terraform init      # Initialize provider plugins
+terraform fmt       # Format code
+terraform validate  # Check syntax
+terraform plan      # Preview changes
+terraform apply     # Create/update infrastructure
+terraform destroy   # Delete all infrastructure
+```
+
+**Resources:**
+- [Terraform Documentation](https://developer.hashicorp.com/terraform/docs)
+- [Terraform Registry](https://registry.terraform.io/) - Provider docs
+- [HCL Syntax](https://developer.hashicorp.com/terraform/language/syntax)
+
+
+ +
+☁️ Yandex Cloud Terraform Guide
+
+**Yandex Cloud Setup:**
+
+**Authentication:**
+- Create service account in Yandex Cloud Console
+- Generate authorized key (JSON)
+- Set key file path or use environment variables
+
+**Provider Configuration Pattern:**
+```hcl
+terraform {
+  required_providers {
+    yandex = {
+      source = "yandex-cloud/yandex"
+    }
+  }
+}
+
+provider "yandex" {
+  # Configuration here (zone, folder_id, etc.)
+}
+```
+
+**Key Resources:**
+- `yandex_compute_instance` - Virtual machine
+- `yandex_vpc_network` - Virtual private cloud
+- `yandex_vpc_subnet` - Subnet within VPC
+- `yandex_vpc_security_group` - Firewall rules
+
+**Free Tier Instance:**
+- Platform: standard-v2
+- Cores: 2 (core_fraction = 20%)
+- Memory: 1 GB
+- Boot disk: 10 GB HDD
+
+**SSH Access:**
+- Add SSH public key to `metadata`
+- Use `ssh-keys` metadata field
+- Connect: `ssh <user>@<public-ip>`
+
+**Resources:**
+- [Yandex Cloud Terraform Provider](https://registry.terraform.io/providers/yandex-cloud/yandex/latest/docs)
+- [Getting Started Guide](https://cloud.yandex.com/en/docs/tutorials/infrastructure-management/terraform-quickstart)
+- [Compute Instance Example](https://registry.terraform.io/providers/yandex-cloud/yandex/latest/docs/resources/compute_instance)
+
+
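+A minimal sketch of how the free tier settings and the `ssh-keys` metadata from the Yandex Cloud guide might fit together. The resource names (`lab`, `vm`), the subnet CIDR, the `ubuntu` login, the image family `ubuntu-2204-lts`, and the variables `zone` and `ssh_public_key_path` are assumptions for illustration, and provider authentication is left out.
+
+```hcl
+variable "zone" {
+  type    = string
+  default = "ru-central1-a" # assumed zone
+}
+
+variable "ssh_public_key_path" {
+  type    = string
+  default = "~/.ssh/id_ed25519.pub" # assumed key location
+}
+
+# Look up an Ubuntu image by family instead of hardcoding an image ID
+data "yandex_compute_image" "ubuntu" {
+  family = "ubuntu-2204-lts" # assumed image family
+}
+
+resource "yandex_vpc_network" "lab" {
+  name = "lab-network"
+}
+
+resource "yandex_vpc_subnet" "lab" {
+  name           = "lab-subnet"
+  zone           = var.zone
+  network_id     = yandex_vpc_network.lab.id
+  v4_cidr_blocks = ["10.10.0.0/24"]
+}
+
+resource "yandex_compute_instance" "vm" {
+  name        = "lab-vm"
+  platform_id = "standard-v2"
+  zone        = var.zone
+
+  resources {
+    cores         = 2
+    memory        = 1  # GB
+    core_fraction = 20 # percent of a vCPU guaranteed
+  }
+
+  boot_disk {
+    initialize_params {
+      image_id = data.yandex_compute_image.ubuntu.id
+      size     = 10 # GB
+      type     = "network-hdd"
+    }
+  }
+
+  network_interface {
+    subnet_id = yandex_vpc_subnet.lab.id
+    nat       = true # request a public IP
+  }
+
+  metadata = {
+    # Format is "<login>:<public key>"
+    ssh-keys = "ubuntu:${trimspace(file(pathexpand(var.ssh_public_key_path)))}"
+  }
+}
+
+output "public_ip" {
+  value = yandex_compute_instance.vm.network_interface[0].nat_ip_address
+}
+```
+
+After `terraform apply`, the `public_ip` output gives the address to use in the SSH command.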
+ +
+☁️ AWS Terraform Guide
+
+**AWS Setup:**
+
+**Authentication:**
+- Create IAM user with EC2 permissions
+- Generate access key ID and secret access key
+- Configure AWS CLI or use environment variables
+- Never hardcode credentials
+
+**Provider Configuration Pattern:**
+```hcl
+terraform {
+  required_providers {
+    aws = {
+      source = "hashicorp/aws"
+    }
+  }
+}
+
+provider "aws" {
+  region = var.region # e.g., "us-east-1"
+}
+```
+
+**Key Resources:**
+- `aws_instance` - EC2 instance
+- `aws_vpc` - Virtual Private Cloud
+- `aws_subnet` - Subnet within VPC
+- `aws_security_group` - Firewall rules
+- `aws_key_pair` - SSH key
+
+**Free Tier Instance:**
+- Instance type: t2.micro
+- AMI: Amazon Linux 2 or Ubuntu (find with data source)
+- 750 hours/month free for 12 months
+- 30 GB storage included
+
+**Data Source for AMI:**
+Use `aws_ami` data source to find latest Ubuntu image dynamically
+
+**Resources:**
+- [AWS Provider Documentation](https://registry.terraform.io/providers/hashicorp/aws/latest/docs)
+- [EC2 Instance Resource](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/instance)
+- [AWS Free Tier](https://aws.amazon.com/free/)
+
+
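+The `aws_ami` lookup mentioned in the AWS guide could look roughly like this. It is a sketch: the name filter targets Ubuntu 22.04 as an assumption, and the resource name `vm` and the tag value are hypothetical.
+
+```hcl
+# Find the most recent Ubuntu 22.04 AMI published by Canonical
+data "aws_ami" "ubuntu" {
+  most_recent = true
+  owners      = ["099720109477"] # Canonical's AWS account ID
+
+  filter {
+    name   = "name"
+    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
+  }
+
+  filter {
+    name   = "virtualization-type"
+    values = ["hvm"]
+  }
+}
+
+resource "aws_instance" "vm" {
+  ami           = data.aws_ami.ubuntu.id # resolved at plan time
+  instance_type = "t2.micro"
+
+  tags = {
+    Name = "lab-vm" # hypothetical tag value
+  }
+}
+```
+
+Because the AMI ID is resolved per region at plan time, the same configuration works when `var.region` changes.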
+ +
+☁️ GCP Terraform Guide
+
+**GCP Setup:**
+
+**Authentication:**
+- Create service account in Google Cloud Console
+- Download JSON key file
+- Set `GOOGLE_APPLICATION_CREDENTIALS` environment variable
+- Enable Compute Engine API
+
+**Provider Configuration Pattern:**
+```hcl
+terraform {
+  required_providers {
+    google = {
+      source = "hashicorp/google"
+    }
+  }
+}
+
+provider "google" {
+  project = var.project_id
+  region  = var.region
+}
+```
+
+**Key Resources:**
+- `google_compute_instance` - VM instance
+- `google_compute_network` - VPC network
+- `google_compute_subnetwork` - Subnet
+- `google_compute_firewall` - Firewall rules
+
+**Free Tier Instance:**
+- Machine type: e2-micro
+- Zone: us-central1-a (or other free tier zone)
+- Always free (within limits)
+- Boot disk: 30 GB standard persistent disk
+
+**Resources:**
+- [Google Provider Documentation](https://registry.terraform.io/providers/hashicorp/google/latest/docs)
+- [Compute Instance Resource](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance)
+- [GCP Free Tier](https://cloud.google.com/free)
+
+
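+As a rough illustration of the free tier settings listed in the GCP guide, an instance definition might look like the sketch below. The resource name `vm`, the image `ubuntu-os-cloud/ubuntu-2204-lts`, the use of the `default` network, and the variables `ssh_user` and `ssh_public_key_path` are assumptions, not part of the lab.
+
+```hcl
+variable "ssh_user" {
+  type    = string
+  default = "ubuntu" # assumed login name
+}
+
+variable "ssh_public_key_path" {
+  type    = string
+  default = "~/.ssh/id_ed25519.pub" # assumed key location
+}
+
+resource "google_compute_instance" "vm" {
+  name         = "lab-vm"
+  machine_type = "e2-micro"
+  zone         = "us-central1-a"
+
+  boot_disk {
+    initialize_params {
+      image = "ubuntu-os-cloud/ubuntu-2204-lts" # assumed image
+      size  = 30                                # GB
+      type  = "pd-standard"
+    }
+  }
+
+  network_interface {
+    network = "default"
+
+    access_config {} # empty block requests an ephemeral public IP
+  }
+
+  metadata = {
+    # Format is "<login>:<public key>"
+    ssh-keys = "${var.ssh_user}:${trimspace(file(pathexpand(var.ssh_public_key_path)))}"
+  }
+}
+
+output "public_ip" {
+  value = google_compute_instance.vm.network_interface[0].access_config[0].nat_ip
+}
+```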
+ +
+☁️ Other Cloud Providers
+
+**Azure:**
+- Provider: `azurerm`
+- Resource: `azurerm_linux_virtual_machine`
+- Free tier: B1s instance
+- [Azure Provider Docs](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs)
+
+**VK Cloud:**
+- Based on OpenStack
+- Provider: OpenStack provider
+- [VK Cloud Documentation](https://mcs.mail.ru/help/)
+
+**DigitalOcean:**
+- Provider: `digitalocean`
+- Resource: `digitalocean_droplet`
+- Simple and beginner-friendly
+- [DigitalOcean Provider Docs](https://registry.terraform.io/providers/digitalocean/digitalocean/latest/docs)
+
+**Questions to Explore:**
+- What's the smallest instance size for your provider?
+- How do you find the right OS image ID?
+- What authentication method does your provider use?
+- How do you add SSH keys to instances?
+
+
+ +
+🔒 Security Best Practices
+
+**Credentials Management:**
+
+**❌ NEVER DO THIS:**
+```hcl
+provider "aws" {
+  access_key = "AKIAIOSFODNN7EXAMPLE"                     # NEVER!
+  secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" # NEVER!
+}
+```
+
+**✅ DO THIS INSTEAD:**
+
+**Option 1: Environment Variables**
+```bash
+export AWS_ACCESS_KEY_ID="your-key"
+export AWS_SECRET_ACCESS_KEY="your-secret"
+# Provider will auto-detect
+```
+
+**Option 2: Credentials File**
+```ini
+# ~/.aws/credentials (for AWS)
+[default]
+aws_access_key_id = your-key
+aws_secret_access_key = your-secret
+```
+
+**Option 3: terraform.tfvars (gitignored)**
+```hcl
+# terraform.tfvars (add to .gitignore!)
+access_key = "your-key"
+secret_key = "your-secret"
+```
+
+**Files to Add to .gitignore:**
+```
+# Terraform
+*.tfstate
+*.tfstate.*
+.terraform/
+terraform.tfvars
+*.tfvars
+# Do not ignore .terraform.lock.hcl: commit it so provider versions stay pinned
+
+# Cloud credentials
+*.pem
+*.key
+# Service account keys
+*.json
+credentials
+```
+
+**SSH Key Management:**
+- Generate SSH key pair locally
+- Add public key to cloud provider
+- Keep private key secure (never commit)
+- Use `chmod 600` on private key file
+
+**Security Group Rules:**
+- Restrict SSH to your IP only (not 0.0.0.0/0)
+- Only open ports you need
+- Document why each port is open
+
+
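+The security group rules above, combined with the ports required in Task 1, can be sketched as follows. This uses AWS as the example provider; the variable `my_ip_cidr`, the group name, and the reliance on the default VPC (no `vpc_id` set) are assumptions.
+
+```hcl
+variable "my_ip_cidr" {
+  type        = string
+  description = "Your public IP in CIDR form, e.g. 203.0.113.10/32"
+}
+
+resource "aws_security_group" "lab" {
+  name        = "lab-sg"
+  description = "SSH from one address, HTTP and app port from anywhere"
+
+  ingress {
+    description = "SSH, restricted to the operator's IP"
+    from_port   = 22
+    to_port     = 22
+    protocol    = "tcp"
+    cidr_blocks = [var.my_ip_cidr]
+  }
+
+  ingress {
+    description = "HTTP"
+    from_port   = 80
+    to_port     = 80
+    protocol    = "tcp"
+    cidr_blocks = ["0.0.0.0/0"]
+  }
+
+  ingress {
+    description = "Application port for later labs"
+    from_port   = 5000
+    to_port     = 5000
+    protocol    = "tcp"
+    cidr_blocks = ["0.0.0.0/0"]
+  }
+
+  egress {
+    description = "Allow all outbound traffic"
+    from_port   = 0
+    to_port     = 0
+    protocol    = "-1"
+    cidr_blocks = ["0.0.0.0/0"]
+  }
+}
+```
+
+The `description` on each rule doubles as the record of why the port is open.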
+ +
+📁 Terraform Project Structure
+
+**Recommended Structure:**
+
+```
+terraform/
+├── .gitignore          # Ignore state, credentials
+├── main.tf             # Main resources
+├── variables.tf        # Input variables
+├── outputs.tf          # Output values
+├── terraform.tfvars    # Variable values (gitignored!)
+└── README.md           # Setup instructions
+```
+
+**What Goes in Each File:**
+
+**main.tf:**
+- Provider configuration
+- Resource definitions
+- Data sources
+
+**variables.tf:**
+- Variable declarations
+- Descriptions
+- Default values (non-sensitive only)
+
+**outputs.tf:**
+- Important values to display
+- VM IP addresses
+- Connection strings
+
+**terraform.tfvars:**
+- Actual variable values
+- Secrets and credentials
+- **MUST be in .gitignore**
+
+**Alternative: Single File**
+For small projects, you can put everything in `main.tf`, but multi-file is more maintainable.
+
+
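+To make the file split concrete, here is a small sketch of what `variables.tf`, `outputs.tf`, and `terraform.tfvars` might contain. It assumes AWS and a resource named `aws_instance.vm` defined in `main.tf`; the variable names, the `ubuntu` login, and the sample values are hypothetical.
+
+```hcl
+# variables.tf: declarations, descriptions, non-sensitive defaults
+variable "region" {
+  type        = string
+  description = "Cloud region to deploy into"
+  default     = "us-east-1"
+}
+
+variable "instance_type" {
+  type        = string
+  description = "VM size; keep it within the free tier"
+  default     = "t2.micro"
+}
+
+variable "my_ip_cidr" {
+  type        = string
+  description = "Address allowed to reach SSH, in CIDR form"
+  # No default: the value comes from terraform.tfvars
+}
+
+# outputs.tf: values printed after `terraform apply`
+output "public_ip" {
+  description = "Public IP of the lab VM"
+  value       = aws_instance.vm.public_ip # assumes aws_instance.vm exists in main.tf
+}
+
+output "ssh_command" {
+  description = "Ready-to-paste connection command"
+  value       = "ssh ubuntu@${aws_instance.vm.public_ip}"
+}
+
+# terraform.tfvars: actual values, gitignored
+my_ip_cidr = "203.0.113.10/32"
+```
+
+Running `terraform output public_ip` later prints the stored value without re-applying.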
+
+**What to Document:**
+- Cloud provider chosen and why
+- Terraform version used
+- Resources created (VM size, region, etc.)
+- Public IP address of created VM
+- SSH connection command
+- Terminal output from `terraform plan` and `terraform apply`
+- Proof of SSH access to VM
+
+---
+
+### Task 2 - Pulumi VM Creation (4 pts)
+
+**Objective:** Destroy the Terraform VM and recreate the same infrastructure using Pulumi.
+
+**Requirements:**
+
+1. **Cleanup Terraform Infrastructure**
+   - Run `terraform destroy` to delete all resources
+   - Verify all resources are deleted in cloud console
+   - Document the cleanup process
+
+2. **Setup Pulumi**
+   - Install Pulumi CLI
+   - Choose a programming language (Python recommended, or TypeScript, Go, C#, Java)
+   - Initialize a new Pulumi project
+   - Configure cloud provider
+
+3. **Recreate Same Infrastructure**
+
+   Create a `pulumi/` directory with equivalent resources:
+
+   **Same Resources as Task 1:**
+   - VM/Compute Instance (same size)
+   - Network/VPC
+   - Security Group/Firewall (same rules)
+   - Public IP Address
+
+   **Goal:** Functionally identical infrastructure, different tool
+
+4. **Apply Infrastructure**
+   - Run `pulumi preview` to see planned changes
+   - Apply infrastructure with `pulumi up`
+   - Verify VM is accessible via SSH
+   - Document the public IP
+
+5. **Compare Experience**
+   - What was easier/harder than Terraform?
+   - How does the code differ?
+   - Which approach do you prefer and why?
+
+
+💡 Pulumi Fundamentals
+
+**What is Pulumi?**
+
+Pulumi is an IaC tool that lets you define infrastructure in general-purpose programming languages (Python, TypeScript, Go, etc.). The code you write is imperative, but it declares a desired state that the Pulumi engine then reconciles with the cloud, much as Terraform does.
+
+**Key Differences from Terraform:**
+
+| Aspect | Terraform | Pulumi |
+|--------|-----------|--------|
+| **Language** | HCL (declarative DSL) | Python, JS, Go, etc. (general-purpose) |
+| **State** | Local or remote state file | Pulumi Cloud (free tier) or self-hosted |
+| **Logic** | Limited (count, for_each) | Full programming language |
+| **Testing** | External tools | Native unit tests |
+| **Secrets** | Plain text in state | Encrypted by default |
+
+**Key Concepts:**
+
+**Resources:**
+- Similar to Terraform, but defined in code
+- Example (Python): `vm = compute.Instance("my-vm", ...)`
+
+**Stacks:**
+- Like Terraform workspaces
+- Separate environments (dev, staging, prod)
+- Each has its own config and state
+
+**Outputs:**
+- Return values from your program
+- Example: `pulumi.export("ip", vm.public_ip)`
+
+**Config:**
+- Per-stack configuration
+- Set with: `pulumi config set key value`
+- Access in code: `config.get("key")`
+
+**Typical Workflow:**
+```bash
+pulumi new