| 1 | | -# devopslearn |
| 1 | +# DevOps Learning Platform - Learn by Fixing Broken Things |
| 2 | + |
| 3 | +A hands-on learning platform for DevOps engineers based on real-world troubleshooting scenarios. Inspired by the "debugging in public" philosophy, this platform provides broken infrastructure scenarios that users must diagnose and fix. |
| 4 | + |
| 5 | +## 🎯 Features |
| 6 | + |
| 7 | +- **Interactive Terminal**: Browser-based terminal for hands-on practice |
| 8 | +- **Real Scenarios**: Based on actual production issues and community suggestions |
| 9 | +- **Progressive Hints**: Get help when stuck without spoiling the learning experience |
| 10 | +- **Isolated Environments**: Each scenario runs in its own Docker container |
| 11 | +- **Multiple Categories**: Kubernetes, CI/CD, Databases, Terraform, Monitoring, and Security |
| 12 | + |
| 13 | +## 🚀 Quick Start |
| 14 | + |
| 15 | +### Prerequisites |
| 16 | + |
| 17 | +- Docker and Docker Compose |
| 18 | +- Node.js 18+ (for local development) |
| 19 | +- Make (optional, for using Makefile commands) |
| 20 | + |
| 21 | +### Installation |
| 22 | + |
| 23 | +1. Clone the repository: |
| 24 | +```bash |
| 25 | +git clone https://github.com/yourusername/devopslearn.git |
| 26 | +cd devopslearn |
| 27 | +``` |
| 28 | + |
| 29 | +2. Install dependencies: |
| 30 | +```bash |
| 31 | +make install |
| 32 | +# or manually: |
| 33 | +npm install |
| 34 | +(cd src/server && npm install)  # subshell, so later commands still run from the repo root |
| 35 | +``` |
| 36 | + |
| 37 | +3. Start the development environment: |
| 38 | +```bash |
| 39 | +make dev |
| 40 | +# or manually: |
| 41 | +docker-compose up -d |
| 42 | +``` |
| 43 | + |
| 44 | +4. Access the platform: |
| 45 | +- Frontend: http://localhost:3000 |
| 46 | +- Backend: http://localhost:3001 |
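Once the containers are started, it can help to confirm that both ports are answering before opening the browser. The following is a minimal sketch, not part of the repository: `wait_for_url` is a hypothetical helper, and it assumes `curl` is installed. It treats any HTTP response (even a 404) as "listening", since the README does not say which paths the backend serves.

```bash
#!/usr/bin/env bash
# Sketch: poll a URL until something answers or the attempts run out.
# wait_for_url is a hypothetical helper, not a command shipped with the project.
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # -s: silent, -o /dev/null: discard the body, --max-time: do not hang forever.
    # Without -f, curl exits 0 for any HTTP response, so this only checks reachability.
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "ready: ${url}"
      return 0
    fi
    sleep "$delay"
  done
  echo "timed out: ${url}" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:3000 && wait_for_url http://localhost:3001` would block until both the frontend and backend addresses listed above respond, or return a non-zero status after the default 30 attempts.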
| 47 | + |
| 48 | +## 📚 Available Scenarios |
| 49 | + |
| 50 | +### Kubernetes |
| 51 | +- **Keycloak in CrashLoopBackOff**: Fix an authentication service that won't start |
| 52 | +- **DNS Resolution Failures**: Troubleshoot cluster networking issues |
| 53 | +- **Istio Service Mesh Issues**: Debug service mesh configuration |
| 54 | + |
| 55 | +### CI/CD |
| 56 | +- **Jenkins to GitHub Actions Migration**: Complete a migration with broken configs |
| 57 | +- **Secret Management Gone Wrong**: Fix exposed secrets and implement proper handling |
| 58 | +- **Failed Production Deployment**: Recover from a failed deployment |
| 59 | + |
| 60 | +### Database Recovery |
| 61 | +- **RDS Instance Failure**: Handle AWS RDS failures |
| 62 | +- **Database Corruption Recovery**: Recover from data corruption |
| 63 | +- **Performance Bottleneck Detection**: Identify and fix slow queries |
| 64 | + |
| 65 | +### Infrastructure (Terraform) |
| 66 | +- **Terraform State Drift**: Reconcile state with actual resources |
| 67 | +- **Accidental Resource Deletion**: Recover from accidental deletions |
| 68 | +- **IAM Permission Issues**: Fix AWS permission problems |
| 69 | + |
| 70 | +### Monitoring & Observability |
| 71 | +- **Grafana Dashboard Creation**: Build effective monitoring dashboards |
| 72 | +- **Prometheus Auto-discovery Fix**: Fix service discovery issues |
| 73 | +- **Log Rotation Nightmare**: Handle massive log files |
| 74 | + |
| 75 | +### Security & Secrets |
| 76 | +- **Certificate Expiry Crisis**: Handle expired certificates |
| 77 | +- **Exposed Secrets in Git**: Clean up and rotate exposed credentials |
| 78 | +- **HashiCorp Vault Lockout**: Recover from Vault access issues |
| 79 | + |
| 80 | +## 🛠️ Development |
| 81 | + |
| 82 | +### Project Structure |
| 83 | +``` |
| 84 | +devopslearn/ |
| 85 | +├── src/ |
| 86 | +│ ├── components/ # React components |
| 87 | +│ ├── pages/ # Next.js pages |
| 88 | +│ ├── server/ # WebSocket backend |
| 89 | +│ └── styles/ # CSS styles |
| 90 | +├── scenarios/ # Scenario configurations |
| 91 | +│ ├── kubernetes/ |
| 92 | +│ ├── cicd/ |
| 93 | +│ ├── database/ |
| 94 | +│ └── ... |
| 95 | +├── docker/ # Docker configurations |
| 96 | +├── public/ # Static assets |
| 97 | +└── tests/ # Test files |
| 98 | +``` |
| 99 | + |
| 100 | +### Adding New Scenarios |
| 101 | + |
| 102 | +1. Create scenario directory: |
| 103 | +```bash |
| 104 | +mkdir -p scenarios/category/scenario-name |
| 105 | +``` |
| 106 | + |
| 107 | +2. Add scenario files: |
| 108 | +- `Dockerfile`: Container configuration |
| 109 | +- `setup.sh`: Scenario initialization script |
| 110 | +- `manifests/`: Configuration files |
| 111 | +- `solution/`: Solution files and guide |
| 112 | + |
| 113 | +3. Register the scenario in the frontend (`src/pages/index.tsx`) |
| 114 | + |
| 115 | +4. Build scenario image: |
| 116 | +```bash |
| 117 | +docker build -t devopslearn/scenario-name:latest scenarios/category/scenario-name/ |
| 118 | +``` |
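The file-creation and image-naming steps above can be sketched as a pair of shell helpers. This is an illustration under stated assumptions rather than tooling that exists in the repository: the function names `scaffold_scenario` and `scenario_image_tag` are hypothetical, and the placeholder file contents are invented. Only the directory layout and the `devopslearn/<scenario-name>:latest` tag pattern come from the steps above.

```bash
#!/usr/bin/env bash
# Sketch: scaffold a scenario directory and derive its image tag.
# Both functions are hypothetical helpers, not part of the project.
set -euo pipefail

# Create scenarios/<category>/<name> with the files listed in step 2.
scaffold_scenario() {
  local category="$1" name="$2"
  local dir="scenarios/${category}/${name}"

  mkdir -p "${dir}/manifests" "${dir}/solution"

  # Write placeholders only when the files are missing, so reruns are safe.
  if [ ! -f "${dir}/Dockerfile" ]; then
    printf '# TODO: container configuration\n' > "${dir}/Dockerfile"
  fi
  if [ ! -f "${dir}/setup.sh" ]; then
    printf '#!/usr/bin/env bash\n# TODO: scenario initialization\n' > "${dir}/setup.sh"
    chmod +x "${dir}/setup.sh"
  fi

  printf '%s\n' "${dir}"
}

# Map a scenario directory to the tag pattern used in step 4.
scenario_image_tag() {
  local dir="$1"
  # basename ignores a trailing slash, so "a/b/" and "a/b" give the same name.
  printf 'devopslearn/%s:latest\n' "$(basename "${dir}")"
}
```

With these defined, `dir="$(scaffold_scenario kubernetes dns-failure)"` followed by `docker build -t "$(scenario_image_tag "$dir")" "$dir"` would reproduce steps 1, 2, and 4 for a scenario named `dns-failure`. Step 3 (registering the scenario in the frontend) still has to be done by hand.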
| 119 | + |
| 120 | +### Commands |
| 121 | + |
| 122 | +```bash |
| 123 | +make help # Show all available commands |
| 124 | +make dev # Start development environment |
| 125 | +make build # Build all images |
| 126 | +make test # Run tests |
| 127 | +make lint # Run linting |
| 128 | +make clean # Clean up containers |
| 129 | +``` |
| 130 | + |
| 131 | +## 🤝 Contributing |
| 132 | + |
| 133 | +1. Fork the repository |
| 134 | +2. Create a feature branch |
| 135 | +3. Add your scenario or feature |
| 136 | +4. Submit a pull request |
| 137 | + |
| 138 | +### Scenario Ideas Welcome! |
| 139 | + |
| 140 | +We're always looking for new scenarios based on real-world problems. Submit your ideas via issues or pull requests. |
| 141 | + |
| 142 | +## 📝 License |
| 143 | + |
| 144 | +MIT License - see the `LICENSE` file for details |
| 145 | + |
| 146 | +## 🙏 Acknowledgments |
| 147 | + |
| 148 | +- Inspired by the DevOps community's "learning by breaking things" philosophy |
| 149 | +- Based on suggestions from r/devops community members |
| 150 | +- Built with Next.js, TypeScript, Docker, and xterm.js |