content/en/docs/prologue/introduction.md
Lines changed: 56 additions & 13 deletions
@@ -23,25 +23,68 @@ Nightingale can query data from multiple data sources, generate alarm events, an
Any issues or PRs are welcome!

-## Architecture
+## Working logic

-<img src="/img/prologue/intro/product-arch.png" />
-<br />
-<br />
+Many users have already collected metrics and log data themselves. In this case, they can integrate their storage repositories (like VictoriaMetrics, Elasticsearch, etc.) as data sources in Nightingale. Users can then configure alert rules and notification rules in Nightingale to generate and dispatch alert events.

-Nightingale can integrate with various data sources such as Prometheus, VictoriaMetrics, Elasticsearch, and Loki. It queries metrics and logs based on the alert rules configured by users, makes alert determinations, and then generates alert events, which are pushed to various notification channels.
Nightingale itself does not provide data collection capabilities. We recommend using [Categraf](https://github.com/flashcatcloud/categraf) as a collector, which can seamlessly integrate with Nightingale.
+
+[Categraf](https://github.com/flashcatcloud/categraf) can collect monitoring data from operating systems, network devices, middleware, and databases. It pushes this data to Nightingale (via the Prometheus Remote Write protocol), which then forwards the data to time-series databases (like Prometheus, VictoriaMetrics, etc.) and provides alerting and visualization capabilities.
+
+For edge data centers whose network link to the central Nightingale server is poor, Nightingale also supports deploying an alerting engine in the edge data center itself. In this mode, alerting continues to work even if the edge and central networks are disconnected.
> In the above diagram, the network link between data center A and the central data center is good, so the alerting engine is handled by the central Nightingale process. For data center B, where the network link to the central data center is poor, we deploy `n9e-edge` as the alerting engine to handle data source alerting functionality locally.
+
+## Alerting, Upgrades, and Collaboration
+
+Nightingale focuses on being an alerting engine, responsible for generating alert events and flexibly dispatching them based on rules. It has built-in support for 20 notification channels (like phone calls, SMS, email, DingTalk, Feishu, WeCom, Slack, etc.).
+
+If you have more advanced requirements, such as:
+
+- Aggregating events from multiple monitoring systems into one platform for unified noise reduction, response handling, and data analysis
+- Supporting a team on-call culture, with features like alert claiming, escalation (to avoid missed alerts), and collaborative handling
+
+Then Nightingale may not be suitable. We recommend using **[FlashDuty](https://flashcat.cloud/product/flashcat-duty/)**, an on-call product that aggregates alerts from various monitoring systems for unified noise reduction, distribution, and response.
- Nightingale supports alert rules, muting rules, subscription rules, and notification rules. It natively integrates 20 notification channels and allows customization of message templates.
+- Nightingale supports event pipelines to process alert events and integrate with third-party systems. For example, it can perform operations like relabeling, filtering, and enriching on events.
+- Nightingale supports the concept of business groups and introduces a permission system to manage various rules in a categorized manner.
+- Many databases and middleware have built-in alert rules that can be directly imported for use, and Prometheus alert rules can also be directly imported.
+- Nightingale supports alert self-healing, which means that after an alert is triggered, a script is automatically executed to perform some predefined logic, such as cleaning up the disk or capturing the system state for later diagnosis.
- Nightingale has built-in metric explanations, dashboards, and alert rules for common operating systems, middleware, and databases. However, these are all contributed by the community, and their overall quality varies.
+- Nightingale directly receives data from multiple protocols such as Remote Write, OpenTSDB, Datadog, and Falcon, thus enabling integration with various types of agents.
+- Nightingale supports multiple data sources including Prometheus, Elasticsearch, Loki, and TDengine, and can perform alerting based on the data from them.
+- Nightingale can be easily embedded into internal enterprise systems, such as Grafana and CMDB. It even allows configuring the menu visibility of these embedded systems.
- Nightingale supports dashboard functionality, featuring common chart types and some built-in dashboards. The image above is a screenshot of one of these dashboards.
+- If you're already accustomed to Grafana, it's recommended to continue using Grafana for viewing charts, as Grafana is more mature in this area.
+- For machine-related monitoring data collected by Categraf, it's advisable to use Nightingale's built-in dashboards. This is because Categraf's metric naming follows Telegraf's naming convention, which differs from that of Node Exporter.
+- Since Nightingale incorporates the concept of business groups, where machines can belong to different business groups, there are times when you may only want to view machines belonging to the current business group in the dashboard. Therefore, Nightingale's dashboards can be linked with business groups.
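
To illustrate the event pipeline idea mentioned in the lists above (relabeling, filtering, and enriching alert events), here is a minimal Python sketch. It does not use Nightingale's actual API or configuration format: the event fields, the processor functions, the `run_pipeline` helper, and the convention that a lower `severity` number means a more severe alert are all assumptions made for illustration.

```python
from typing import Callable, Optional

Event = dict
# A processor returns the (possibly modified) event, or None to drop it.
Processor = Callable[[Event], Optional[Event]]


def relabel(event: Event) -> Optional[Event]:
    """Rename the hypothetical 'instance' label to 'host'."""
    labels = dict(event.get("labels", {}))  # copy so the input is not mutated
    if "instance" in labels:
        labels["host"] = labels.pop("instance")
    return {**event, "labels": labels}


def drop_low_severity(event: Event) -> Optional[Event]:
    """Filter: keep only events with severity 1 or 2 (assumed scale)."""
    return event if event.get("severity", 3) <= 2 else None


def enrich_with_owner(event: Event) -> Optional[Event]:
    """Enrich: attach an owner looked up from a made-up host-to-team table."""
    owners = {"db-01": "dba-team"}
    host = event.get("labels", {}).get("host")
    return {**event, "owner": owners.get(host, "unknown")}


def run_pipeline(event: Event, processors: list) -> Optional[Event]:
    """Apply processors in order; stop as soon as one drops the event."""
    current: Optional[Event] = event
    for processor in processors:
        current = processor(current)
        if current is None:
            return None
    return current


PIPELINE = [relabel, drop_low_severity, enrich_with_owner]

critical = {"rule": "disk_full", "severity": 1, "labels": {"instance": "db-01"}}
notice = {"rule": "cpu_high", "severity": 3, "labels": {"instance": "web-02"}}

result = run_pipeline(critical, PIPELINE)   # relabeled and enriched
dropped = run_pipeline(notice, PIPELINE)    # filtered out, so None
```

Each processor is a plain function that either returns an event or drops it, so steps can be reordered or removed without touching the others.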
+
+## Thank you for the trust from numerous enterprises.

-Nightingale enables flexible alarm configuration. It supports both metric and log data sources. Users can configure aspects such as the active time periods of alarm rules, the clusters in which the rules are effective, and event relabeling.
+Nightingale has been adopted by many enterprises, including but not limited to:
Although Nightingale's visualization capabilities are not as strong as those of Grafana, it still supports common dashboard chart types. Moreover, it has built-in alarm rules and dashboards for various middleware and databases, making it ready-to-use.
+## Open Source License

+The Nightingale monitoring project is open-sourced under the Apache License 2.0.