In this tutorial we will create alerts on the
ping_request_count metric that we instrumented earlier in the
Instrumenting HTTP server written in Go tutorial.
For the sake of this tutorial we will alert when the
ping_request_count metric is greater than 5. Check out real world best practices to learn more about alerting principles.
Download the latest release of Alertmanager for your operating system from the project's releases page.
Alertmanager supports various receivers, such as email, PagerDuty, and Slack, through which it can notify you when an alert is firing. The list of receivers and how to configure them is in the Alertmanager configuration documentation. We will use
webhook as the receiver for this tutorial. Head over to webhook.site and copy the webhook URL, which we will use later to configure Alertmanager.
First, let's set up Alertmanager with a webhook receiver. Save the following as alertmanager.yml:
global:
  resolve_timeout: 5m
route:
  receiver: webhook_receiver
receivers:
  - name: webhook_receiver
    webhook_configs:
      - url: '<INSERT-YOUR-WEBHOOK>'
        send_resolved: false
Replace <INSERT-YOUR-WEBHOOK> in the alertmanager.yml file with the webhook URL that we copied earlier, then run Alertmanager using the following command.

    alertmanager --config.file=alertmanager.yml

Once Alertmanager is up and running, navigate to http://localhost:9093 and you should be able to access it.
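If you prefer to check from code rather than the browser, the following is a minimal sketch in Go. It assumes Alertmanager is listening on the default address used in this tutorial and relies on its /-/healthy management endpoint. The helper names HealthURL and IsHealthy are made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

// HealthURL builds the health-check URL for an Alertmanager base address.
// (Hypothetical helper, not part of any Prometheus library.)
func HealthURL(base string) string {
	return strings.TrimRight(base, "/") + "/-/healthy"
}

// IsHealthy reports whether a GET on the health endpoint returns 200 OK.
// Any network error (for example, nothing listening yet) counts as unhealthy.
func IsHealthy(base string) bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(HealthURL(base))
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	// Address assumed from this tutorial's setup.
	fmt.Println("Alertmanager healthy:", IsHealthy("http://localhost:9093"))
}
```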
Now that we have configured Alertmanager with a webhook receiver, let's add the rules to the Prometheus config file, prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 10s
rule_files:
  - rules.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: simple_server
    static_configs:
      - targets: ["localhost:8090"]
Notice that the evaluation_interval, rule_files, and alerting settings have been added to the Prometheus config:
evaluation_interval defines the interval at which the rules are evaluated,
rule_files accepts a list of YAML files that define the rules, and the
alerting section defines the Alertmanager configuration. As mentioned at the beginning of this tutorial, we will create a basic rule that
raises an alert when the
ping_request_count value is greater than 5. Save the following as rules.yml:
groups:
  - name: Count greater than 5
    rules:
      - alert: CountGreaterThan5
        expr: ping_request_count > 5
        for: 10s
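To make the interaction between evaluation_interval and the for clause concrete, here is a simplified Go sketch of how an alert could move from inactive to pending to firing. It is an illustrative model rather than Prometheus's actual rule engine: it assumes one sample per evaluation, and the names Simulate, EvalInterval, and HoldFor are invented for this example.

```go
package main

import (
	"fmt"
	"time"
)

// AlertState is a simplified stand-in for the states shown on the alerts page.
type AlertState string

const (
	Inactive AlertState = "INACTIVE"
	Pending  AlertState = "PENDING"
	Firing   AlertState = "FIRING"
)

// Values taken from this tutorial's config: evaluation_interval and the rule's "for".
const (
	EvalInterval = 10 * time.Second
	HoldFor      = 10 * time.Second
)

// Simulate takes one sample per rule evaluation and returns the alert state
// after each evaluation. The alert is pending while the condition
// (sample > threshold) holds, and fires once it has held for at least hold.
func Simulate(samples []float64, threshold float64, interval, hold time.Duration) []AlertState {
	states := make([]AlertState, 0, len(samples))
	activeAt := -1 // index of the evaluation at which the condition first held
	for i, v := range samples {
		if v <= threshold {
			activeAt = -1 // condition no longer holds: reset
			states = append(states, Inactive)
			continue
		}
		if activeAt < 0 {
			activeAt = i
		}
		elapsed := time.Duration(i-activeAt) * interval
		if elapsed >= hold {
			states = append(states, Firing)
		} else {
			states = append(states, Pending)
		}
	}
	return states
}

func main() {
	// Hypothetical ping_request_count values seen at successive evaluations.
	samples := []float64{4, 5, 6, 7, 8}
	fmt.Println(Simulate(samples, 5, EvalInterval, HoldFor))
	// prints: [INACTIVE INACTIVE PENDING FIRING FIRING]
}
```

In this model the alert is pending at the first evaluation where the count exceeds 5 and fires at the next evaluation 10 seconds later. A value of exactly 5 does not trigger it, because the expression uses a strict greater-than comparison.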
Now let's run Prometheus using the following command.

    prometheus --config.file=./prometheus.yml
Open http://localhost:9090/rules in your browser to see the rules. Next, run the instrumented ping server, visit the http://localhost:8090/ping endpoint, and refresh the page at least 6 times. You can check the ping count by navigating to the http://localhost:8090/metrics endpoint. To see the status of the alert, visit http://localhost:9090/alerts. Once the condition
ping_request_count > 5 has been true for more than 10s, the
alert state will become
FIRING. Now, if you navigate back to your
webhook.site URL, you will see the alert message.
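The alert message that arrives at the webhook is a JSON document. As a rough sketch of how a receiver of your own might read it, the Go program below decodes a small subset of the payload fields (version, status, receiver, and the alerts list). The sample payload is hand-written and abbreviated for illustration, not captured output. The type and function names and the localhost:5001 address are all assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

// Alert mirrors a subset of the per-alert fields in the webhook payload.
type Alert struct {
	Status      string            `json:"status"`
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations"`
}

// WebhookMessage mirrors a subset of the top-level payload fields.
type WebhookMessage struct {
	Version  string  `json:"version"`
	Status   string  `json:"status"`
	Receiver string  `json:"receiver"`
	Alerts   []Alert `json:"alerts"`
}

// SamplePayload is a hand-written, abbreviated example (not real output).
const SamplePayload = `{
  "version": "4",
  "status": "firing",
  "receiver": "webhook_receiver",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "CountGreaterThan5", "job": "simple_server"},
      "annotations": {}
    }
  ]
}`

// DecodeWebhook parses a payload body; fields not listed above are ignored.
func DecodeWebhook(body []byte) (WebhookMessage, error) {
	var msg WebhookMessage
	err := json.Unmarshal(body, &msg)
	return msg, err
}

// Summaries returns one "alertname: status" line per alert.
func Summaries(msg WebhookMessage) []string {
	out := make([]string, 0, len(msg.Alerts))
	for _, a := range msg.Alerts {
		out = append(out, a.Labels["alertname"]+": "+a.Status)
	}
	return out
}

// handler logs a summary of every notification posted to it.
func handler(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	msg, err := DecodeWebhook(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for _, s := range Summaries(msg) {
		log.Println(s)
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	// Run with the argument "serve" to listen for notifications on a
	// hypothetical local address; otherwise decode the sample offline.
	if len(os.Args) > 1 && os.Args[1] == "serve" {
		log.Fatal(http.ListenAndServe("localhost:5001", http.HandlerFunc(handler)))
	}
	msg, err := DecodeWebhook([]byte(SamplePayload))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(Summaries(msg))
	// prints: [CountGreaterThan5: firing]
}
```

To try it against the setup in this tutorial, you could start the program with the serve argument and point the url in alertmanager.yml at http://localhost:5001 instead of the webhook.site URL. Because send_resolved is false in that config, only firing notifications would be delivered.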
Similarly, Alertmanager can be configured with other receivers to notify you when an alert is firing.