|
| 1 | +--- |
| 2 | +title: "Nightingale vs Prometheus" |
| 3 | +description: "Nightingale and Prometheus are often discussed in relation to each other, and in fact, they have a complementary relationship. This article will detail the differences and connections between the two." |
| 4 | +date: 2025-07-26T13:12:27.760+08:00 |
| 5 | +lastmod: 2025-07-26T13:12:27.760+08:00 |
| 6 | +draft: false |
| 7 | +images: [] |
| 8 | +menu: |
| 9 | + docs: |
| 10 | + parent: "prologue" |
| 11 | +weight: 150 |
| 12 | +toc: true |
| 13 | +--- |
| 14 | + |
| 15 | +Like Grafana, Nightingale can integrate with a variety of data sources, the most common being Prometheus-type sources. Data sources that are compatible with the Prometheus query interface, such as VictoriaMetrics, Thanos, and M3DB, also count as Prometheus-type sources, so Nightingale and Prometheus are closely related. |
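
To illustrate what "compatible with the Prometheus interface" means in practice, here is a minimal Python sketch (not taken from Nightingale itself) that issues an instant query through the standard Prometheus HTTP API path `/api/v1/query`. The helper names `instant_query_url`, `parse_instant_result`, and `query` are hypothetical, and the base URLs mentioned afterwards are placeholders for your own deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus-style instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_result(payload: dict) -> dict:
    """Map each series' `instance` label to its latest sample value.

    Assumes the query returned an instant vector, where every series
    carries a `value` pair of [timestamp, "string value"].
    """
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"].get("instance", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def query(base_url: str, promql: str, timeout: float = 5.0) -> dict:
    """Run an instant query against any Prometheus-compatible backend."""
    with urlopen(instant_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_instant_result(json.load(resp))
```

Assuming default ports, `query("http://localhost:9090", "up")` would target a local Prometheus and `query("http://localhost:8428", "up")` a single-node VictoriaMetrics; only the base URL changes. A platform that sits on top of such data sources can therefore treat them largely interchangeably for querying.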
| 16 | + |
| 17 | +If you have the following requirements, you might consider using Nightingale: |
| 18 | + |
| 19 | +- You have multiple time-series databases, such as Prometheus and VictoriaMetrics, and want to use a unified platform to manage various alert rules with permission control. |
| 20 | +- You are concerned that Prometheus's alerting engine is a single point of failure and want to avoid alerting downtime. |
| 21 | +- In addition to Prometheus alerts, you need alerts from other data sources such as Elasticsearch, Loki, and ClickHouse. |
| 22 | +- You require more flexible alert rule configuration, such as restricting the time periods in which a rule is effective, relabeling events, linking events with a CMDB, and running alert self-healing scripts. |
| 23 | + |
| 24 | +Nightingale also has visualization capabilities similar to Grafana's, though they may not be as advanced. In my observation, many companies therefore combine several tools (in the adult world, there are no absolutes): |
| 25 | + |
| 26 | +- Data Collection: Categraf is the primary agent (especially for machine monitoring, where it integrates seamlessly with Nightingale), supplemented by various exporters. |
| 27 | +- Storage: The time-series database primarily used is VictoriaMetrics, as it is compatible with Prometheus, offers better performance, and has a clustered version. For most companies, the single-node version is sufficient. |
| 28 | +- Alerting Engine: Nightingale is used for alerting, which makes it easy for different teams to manage rules and collaborate. It ships with built-in rules out of the box, its alert rule configuration is very flexible, and its event pipeline mechanism makes it easier to integrate with a company's own CMDB and similar systems. |
| 29 | +- Visualization: Grafana is used for visualization, as it offers more advanced and visually appealing charts. The community is also very large, and many pre-made dashboards can be found on the Grafana site, making it relatively hassle-free. |
| 30 | +- On-call Distribution of Alert Events: [FlashDuty](https://flashcat.cloud/product/flashduty/) is used. It supports integration with various monitoring systems such as Zabbix, Prometheus, Nightingale, cloud monitoring solutions, and ElastAlert, and consolidates alert events into a single platform for unified noise reduction, scheduling, claiming, escalation, response, distribution, and more. |