Commit c3a45f8

Authored by rh-rahulshetty and jeremyeder
fix(grafana): Resolve issue with add-grafana make command & add new K8s dashboard (#1018)
## Summary

- Moved Grafana resources (`grafana.yaml`, dashboards, RBAC) into the `with-grafana` overlay
- Fixed Grafana Prometheus auth: init container injects SA token into datasource config at pod startup
- Fixed PVC mount path (`/var/lib/grafana` instead of `/var/lib/grafana/data`) so `grafana.db` persists across restarts
- Removed `GF_SECURITY_ADMIN_PASSWORD` env var that was resetting password on every restart
- Added K8s infrastructure dashboards (cluster, nodes, namespace, pods)
- PVC is managed separately from kustomize so `make clean-observability` preserves Grafana data

## Dashboards

| Dashboard | Screenshot | Description |
|-----------|------------|-------------|
| K8s Cluster Monitoring | ![](https://github.com/user-attachments/assets/077fac80-5995-4235-b634-c6f8d77e0e75) | Cluster-level CPU, memory, network |
| K8s Nodes | ![](https://github.com/user-attachments/assets/342e8ccd-c9bc-4266-b2a4-29b8c5923b9e) | Node-level resource usage |
| K8s Namespace | ![](https://github.com/user-attachments/assets/726da1b3-6390-46a6-8a3d-6fe23facd0fc) | Namespace-level resource usage |
| K8s Pods | ![](https://github.com/user-attachments/assets/7f5ef084-d8d6-40db-9f1d-0252b0cbc281) | Pod-level resource usage |

## Test plan

- [x] `make add-grafana` deploys without errors
- [x] `make clean-observability` removes stack but preserves PVC
- [x] Grafana password persists across pod restarts
- [x] All dashboards load and show data from Prometheus

> Note: These charts are taken from community dashboards like https://github.com/dotdc/grafana-dashboards-kubernetes, so some data might be missing depending on the cluster setup. I'll be addressing them in another change request.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Introduced pre-provisioned Grafana dashboards for Kubernetes cluster and pod monitoring, automatically deployed without manual configuration.
  * Added simplified Grafana installation workflow using the `make add-grafana` command.
* **Documentation**
  * Updated installation and cleanup instructions for Grafana.
  * Added comprehensive guide for managing and customizing dashboards.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Rahul Shetty <rashetty@redhat.com>
Co-authored-by: Jeremy Eder <jeder@redhat.com>
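
The summary says an init container injects the ServiceAccount token into the datasource config at pod startup, but the manifest itself is not shown here. As a rough sketch of that idea only: the function name `render_datasource`, the `__SA_TOKEN__` placeholder, and the file paths in the usage comment are all assumptions for illustration, not the contents of `grafana.yaml`.

```shell
# Hypothetical init-container helper: copy a datasource template to its final
# location, replacing a placeholder with the pod's ServiceAccount token.
# Usage: render_datasource TOKEN_FILE TEMPLATE OUTPUT
render_datasource() {
    token_file=$1
    template=$2
    output=$3
    # Command substitution strips a trailing newline if the file has one.
    token=$(cat "$token_file")
    # '|' is used as the sed delimiter; ServiceAccount tokens are JWTs, whose
    # characters (letters, digits, '.', '-', '_') need no escaping here.
    sed "s|__SA_TOKEN__|${token}|g" "$template" > "$output"
}

# Inside a pod, the token is mounted at the standard Kubernetes path
# (template and output paths below are assumed):
#   render_datasource /var/run/secrets/kubernetes.io/serviceaccount/token \
#       /templates/datasource.yaml \
#       /etc/grafana/provisioning/datasources/datasource.yaml
```

Rendering at startup, rather than baking the token into a ConfigMap, keeps the credential out of the kustomize manifests, which seems to be the motivation behind the fix described above.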
1 parent e537e06 commit c3a45f8

19 files changed

Lines changed: 12509 additions & 530 deletions

Makefile

Lines changed: 3 additions & 1 deletion
```diff
@@ -288,15 +288,17 @@ deploy-observability: ## Deploy observability (OTel + OpenShift Prometheus)
 
 add-grafana: ## Add Grafana on top of observability stack
 	@echo "$(COLOR_BLUE)$(COLOR_RESET) Adding Grafana..."
+	@kubectl apply -f components/manifests/observability/overlays/with-grafana/grafana-pvc.yaml
 	@kubectl apply -k components/manifests/observability/overlays/with-grafana/
 	@echo "$(COLOR_GREEN)$(COLOR_RESET) Grafana deployed"
 	@echo " Create route: oc create route edge grafana --service=grafana -n $(NAMESPACE)"
 
-clean-observability: ## Remove observability components
+clean-observability: ## Remove observability components (preserves Grafana PVC)
 	@echo "$(COLOR_BLUE)$(COLOR_RESET) Removing observability..."
 	@kubectl delete -k components/manifests/observability/overlays/with-grafana/ 2>/dev/null || true
 	@kubectl delete -k components/manifests/observability/ 2>/dev/null || true
 	@echo "$(COLOR_GREEN)$(COLOR_RESET) Observability removed"
+	@echo " To also delete Grafana data: kubectl delete pvc grafana-storage -n $(NAMESPACE)"
 
 grafana-dashboard: ## Open Grafana (create route first)
 	@echo "$(COLOR_BLUE)$(COLOR_RESET) Opening Grafana..."
```
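
Because the PVC is now applied separately with `kubectl apply -f`, `clean-observability` should leave it in place. One way to confirm that after a cleanup run might look like the following sketch; the helper name `pvc_preserved` and the `KUBECTL` override are illustrative assumptions, while the PVC name and namespace are taken from the commit.

```shell
# Hypothetical check that the Grafana PVC still exists after cleanup.
# KUBECTL may be overridden (for example with `oc`); it defaults to kubectl.
# Usage: pvc_preserved [NAMESPACE]   (namespace defaults to ambient-code)
pvc_preserved() {
    namespace=${1:-ambient-code}
    # Exit status is 0 when the PVC is found, non-zero otherwise.
    "${KUBECTL:-kubectl}" get pvc grafana-storage -n "$namespace" >/dev/null 2>&1
}

# Example:
#   make clean-observability
#   if pvc_preserved ambient-code; then echo "Grafana data kept"; fi
```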

components/manifests/observability/README.md

Lines changed: 9 additions & 6 deletions
````diff
@@ -40,11 +40,14 @@ Open **OpenShift Console → Observe → Metrics** and query:
 If you want custom dashboards:
 
 ```bash
-# Add Grafana overlay
+make add-grafana
+
+# Or manually
+kubectl apply -f components/manifests/observability/overlays/with-grafana/grafana-pvc.yaml
 kubectl apply -k components/manifests/observability/overlays/with-grafana/
 ```
 
-**Adds**: Grafana (additional 128MB) - still uses OpenShift Prometheus
+**Adds**: Grafana (additional 128MB) with pre-provisioned dashboards - still uses OpenShift Prometheus
 
 **Access Grafana**:
 ```bash
@@ -53,10 +56,10 @@ oc create route edge grafana --service=grafana -n ambient-code
 
 # Get URL
 oc get route grafana -n ambient-code -o jsonpath='{.spec.host}'
-# Login: admin/admin
+# Login: admin/admin (change on first login)
 ```
 
-**Import dashboard**: Upload `dashboards/ambient-operator-dashboard.json` in Grafana UI
+**Dashboards** are provisioned automatically from `overlays/with-grafana/dashboards/`. See [dashboards/README.md](./overlays/with-grafana/dashboards/README.md) for how to add new ones.
 
 ---
 
@@ -185,6 +188,6 @@ EOF
 ## Cleanup
 
 ```bash
-kubectl delete -k components/manifests/observability/overlays/with-grafana/ # If Grafana deployed
-kubectl delete -k components/manifests/observability/
+make clean-observability # Removes stack but preserves Grafana PVC
+kubectl delete pvc grafana-storage -n ambient-code # Also delete Grafana data
 ```
````
Lines changed: 8 additions & 0 deletions
```diff
@@ -0,0 +1,8 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+namespace: ambient-code
+
+resources:
+- otel-collector.yaml
+- servicemonitor.yaml
```

components/manifests/observability/otel-collector.yaml renamed to components/manifests/observability/base/otel-collector.yaml

File renamed without changes.

components/manifests/observability/servicemonitor.yaml renamed to components/manifests/observability/base/servicemonitor.yaml

File renamed without changes.
