= Deployment With KinD

An example of a local deployment in KinD is provided https://github.com/camptocamp/devops-stack/tree/main/examples/kind[here]. Clone this repository and modify it at your convenience.
In the folder, as in a standard https://developer.hashicorp.com/terraform/tutorials/modules/module#what-is-a-terraform-module[Terraform module], you will find the following files:

* `terraform.tf`: declaration of the Terraform providers used in this project.
* `locals.tf`: local variables used in the DevOps Stack.
* `main.tf`: definition of all the deployed modules.
* `s3_bucket.tf`: configuration of the MinIO bucket used as a backend for Loki and Thanos.
* `outputs.tf`: the output variables of the DevOps Stack, e.g. credentials and the kubeconfig file to use with `kubectl`.

== Specificities of the KinD deployment

=== Local Load Balancer

https://metallb.universe.tf/[MetalLB] is used as the load balancer for the cluster. This allows us to have a multi-node KinD cluster without having to run Traefik as a single replica with a NodePort configuration.
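
MetalLB hands out addresses from the subnet of the Docker network that KinD creates (named `kind`). If you are curious which subnet that is, a sketch like the following should show it:

[source,bash]
----
# Show the subnet of the Docker network used by the KinD nodes;
# MetalLB allocates the load balancer IPs from a range inside it.
docker network inspect kind \
  --format '{{ (index .IPAM.Config 0).Subnet }}'
----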
| 17 | + |
| 18 | +==== Self-signed SSL certificates |
| 19 | + |
| 20 | +Since KinD is locally deployed, there is no easy way of creating valid SSL certificates for the ingresses using Let's Encrypt. As such, `cert-manager` is configured to use a self-signed Certificate Authority and the remaining modules are configured to ignore the SSL warnings/errors that are a consequence of that. |
| 21 | + |
| 22 | +NOTE: When accessing the ingresses on your browser, you'll obviously see warnings saying that the certificate is not valid. You can safely ignore them. |
| 23 | + |
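
Command line tools are affected in the same way as the browser. As a sketch (the URL below is a hypothetical example for a given base domain), you can either skip verification or point the tool at the self-signed CA:

[source,bash]
----
# Skip certificate verification for a quick check
# (hypothetical URL for an example base domain):
curl --insecure https://argocd.apps.172-19-0-1.nip.io

# Or keep verification enabled by passing the CA certificate,
# assuming you have exported it from the cluster to ca.crt:
curl --cacert ca.crt https://argocd.apps.172-19-0-1.nip.io
----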
| 24 | +== Requirements |
| 25 | + |
| 26 | +For this setup, you will need to have installed on your machine: |
| 27 | + |
| 28 | +* https://docs.docker.com/get-docker[Docker] to deploy the KinD containers |
| 29 | +* https://www.terraform.io/[Terraform] to provision the whole stack |
| 30 | +* https://kubernetes.io/docs/reference/kubectl/[`kubectl`] to interact with your cluster |
| 31 | + |
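
A quick sanity check that the three tools are available on your `PATH` (a sketch; it only tests for presence, not versions):

[source,bash]
----
# Report any of the required tools that is not installed.
for tool in docker terraform kubectl; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
----
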
== Deployment

1. From the root folder of the example deployment, initialize Terraform, which downloads all the required providers and modules locally (they are stored in the hidden folder `.terraform`).
+
[source,bash]
----
terraform init
----

2. In the `main.tf` file, keep the modules you want to deploy and comment out the others.
+
TIP: You can also add your own Terraform modules in this file or in any other file in the root folder. A good way to start writing your own module is to clone the https://github.com/camptocamp/devops-stack-helloworld[devops-stack-helloworld] repository and adapt it to your needs.

3. Configure the variables in `locals.tf` to your preference:
+
[source,hcl]
----
include::example$deploy_examples/kind/locals.tf[]
----

4. Finally, run `terraform apply` and accept the proposed changes to create the Kubernetes nodes as Docker containers and populate them with our services.
+
[source,bash]
----
terraform apply
----

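
Once the apply has finished, you can quickly check that the nodes are up. A sketch, assuming the cluster name `kind-cluster` from `locals.tf` and a kubeconfig pointing at the new cluster:

[source,bash]
----
# The KinD nodes are plain Docker containers:
docker ps --filter "name=kind-cluster" --format '{{.Names}}'

# With a valid kubeconfig, the nodes should report a Ready status:
kubectl get nodes
----
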
=== Troubleshooting

==== Argo CD: connection refused

Because at the end of the deployment we use Argo CD to deploy and manage itself, you may see an error like the following while Argo CD bootstraps itself:

[source,shell]
----
╷
│ Error: Error while waiting for application argocd to be created
│
│   with module.argocd.argocd_application.this,
│   on .terraform/modules/argocd/main.tf line 55, in resource "argocd_application" "this":
│   55: resource "argocd_application" "this" {
│
│ error while waiting for application argocd to be synced and healthy: rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp 127.0.0.1:45729: connect: connection refused"
╵
----

This happens because the Argo CD server pod is redeployed and the Argo CD Terraform provider loses its connection. You can simply re-run `terraform apply` to finalize the bootstrap of the cluster.

==== `loki-stack-promtail` pods stuck with status `CrashLoopBackOff`

When the pods of `loki-stack-promtail` are stuck in a creation loop with the following logs:

[source]
----
level=error ts=2023-05-09T06:32:38.495673778Z caller=main.go:117 msg="error creating promtail" error="failed to make file target manager: too many open files"
Stream closed EOF for loki-stack/loki-stack-promtail-bxcmw (promtail)
----

you will have to increase the upper limit on the number of inotify instances that can be created per real user ID:

[source,bash]
----
# Increase the limit until the next reboot:
sudo sysctl fs.inotify.max_user_instances=512

# Increase the limit permanently (run this command as root):
echo 'fs.inotify.max_user_instances=512' >> /etc/sysctl.conf
----
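
You can check the value currently in effect before and after the change:

[source,bash]
----
# Read the current inotify instance limit; both commands are equivalent.
sysctl fs.inotify.max_user_instances
cat /proc/sys/fs/inotify/max_user_instances
----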

== Access the applications

The URLs of the applications are visible in the ingresses of the cluster. You can get them by running the following command:

[source,bash]
----
kubectl get ingress --all-namespaces
----
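
If you only want the hostnames, a jsonpath sketch like the following should work with standard Ingress objects:

[source,bash]
----
# Print one ingress host per line.
kubectl get ingress --all-namespaces \
  -o jsonpath='{range .items[*].spec.rules[*]}{.host}{"\n"}{end}'
----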

For example, if the base domain name is `172-19-0-1.nip.io`, the applications are accessible at the following addresses:

----
https://grafana.apps.172-19-0-1.nip.io
https://alertmanager.apps.172-19-0-1.nip.io
https://prometheus.apps.172-19-0-1.nip.io
https://keycloak.apps.172-19-0-1.nip.io
https://minio.apps.172-19-0-1.nip.io
https://argocd.apps.172-19-0-1.nip.io
https://thanos-bucketweb.apps.172-19-0-1.nip.io
https://thanos-query.apps.172-19-0-1.nip.io
----
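
These hostnames work because `nip.io` embeds the IP address in the domain: `172-19-0-1.nip.io` resolves to `172.19.0.1`, typically the address MetalLB assigned to the ingress controller. Deriving the base domain from an IP is just a character substitution:

[source,bash]
----
# Build the nip.io base domain corresponding to a load balancer IP
# (the IP below is a hypothetical example).
ip="172.19.0.1"
base_domain="$(echo "$ip" | tr '.' '-')".nip.io
echo "https://grafana.apps.${base_domain}"   # → https://grafana.apps.172-19-0-1.nip.io
----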

You can access the applications using the credentials created by the Keycloak module. They are written to the Terraform output:

[source,bash]
----
# List all the outputs:
$ terraform output
keycloak_admin_credentials = <sensitive>
keycloak_users = <sensitive>
kubernetes_kubeconfig = <sensitive>
minio_root_user_credentials = <sensitive>

# Get the credentials for Grafana, Prometheus, etc.:
$ terraform output keycloak_users
{
  "devopsadmin" = "aqhEbd3L0Msryjjp547ej7nyN6E2FllV"
}
----
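
Individual values can be printed in clear text with the `-raw` and `-json` flags of `terraform output`. A sketch, assuming the output names listed above and that `kubernetes_kubeconfig` is a plain string:

[source,bash]
----
# Show one sensitive output in clear text:
terraform output -json keycloak_users

# Save the kubeconfig to a file and point kubectl at it:
terraform output -raw kubernetes_kubeconfig > kind-kubeconfig
export KUBECONFIG="$PWD/kind-kubeconfig"
kubectl get nodes
----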

== Pause the cluster

The `docker pause` command can be used to halt the cluster for a while in order to save energy (replace `kind-cluster` with the cluster name you defined in `locals.tf`):

[source,bash]
----
# Pause the cluster:
docker pause kind-cluster-control-plane kind-cluster-worker{,2,3}

# Resume the cluster:
docker unpause kind-cluster-control-plane kind-cluster-worker{,2,3}
----
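
The brace expansion above assumes `bash` and exactly three worker nodes. A more generic sketch that pauses every node container belonging to the cluster, whatever their number:

[source,bash]
----
# Pause all containers whose name starts with the cluster name.
cluster="kind-cluster"
docker ps --format '{{.Names}}' | grep "^${cluster}-" | xargs docker pause
----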

NOTE: When the host computer is restarted, the Docker containers will start again, but the cluster will not resume correctly. It has to be destroyed and recreated.

== Stop the cluster

To definitively stop the cluster with a single command, we first remove from the Terraform state the resources that live inside the cluster, since they disappear together with it anyway and destroying them individually could fail, and then destroy the rest:

[source,bash]
----
terraform state rm $(terraform state list | grep "argocd_application\|argocd_project\|kubernetes_\|helm_\|keycloak_") && terraform destroy
----

A dirtier alternative is to directly destroy the Docker containers and volumes (replace `kind-cluster` with the cluster name you defined in `locals.tf`) and then remove the Terraform state:

[source,bash]
----
# Stop and remove the Docker containers and their volumes:
docker container stop kind-cluster-control-plane kind-cluster-worker{,2,3} && \
  docker container rm -v kind-cluster-control-plane kind-cluster-worker{,2,3}

# Remove the local Terraform state:
rm terraform.tfstate
----
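
Either way, you can verify afterwards that no node containers are left behind (the cluster name is again an assumption from `locals.tf`):

[source,bash]
----
# List any remaining KinD node containers; prints a message if none remain.
docker ps -a --format '{{.Names}}' | grep '^kind-cluster-' || echo "no node containers left"
----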

== Conclusion

That's it, you have deployed the DevOps Stack locally! For more information, keep on reading the https://devops-stack.io/docs/latest/[documentation]. **You can explore the possibilities of each module and get the link to its source code on its respective documentation page.**