Step-by-step guide to setting up Prometheus Alertmanager with Slack, PagerDuty, and Gmail

Published: 25 Feb 2020 by Ivana Huckova

In my previous blog post, “How to Explore Prometheus with Easy ‘Hello World’ Projects”, I described three projects that I used to get a better sense of what Prometheus can do. In this post, I’d like to share how I got more familiar with Prometheus Alertmanager and how I set up alert notifications for Slack, PagerDuty, and Gmail.

(I’m going to reference my previous blog post quite a bit, so I recommend reading it before continuing on.)

The Basics

Setting up alerts with Prometheus is a two-step process:

First, you need to create your alerting rules in Prometheus, and specify under what conditions you want to be alerted (such as when an instance is down).

Second, you need to set up Alertmanager, which receives the alerts specified in Prometheus. Alertmanager will then be able to do a variety of things, including grouping alerts of similar nature into a single notification; silencing alerts for a specific time; muting notifications for certain alerts if other specified alerts are already firing; and picking which receivers receive a particular alert.
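
To give a sense of what that looks like in practice, here is a minimal, illustrative fragment of an Alertmanager configuration (the receiver names are placeholders, not something we set up in this post): alerts that share the same alertname are grouped into a single notification, and anything labeled severity: critical is routed to its own receiver.

route:
  # Group alerts with the same alertname into a single notification
  group_by: ['alertname']
  # Default receiver for everything that doesn't match a sub-route
  receiver: 'default-receiver'
  routes:
  # Send critical alerts to a dedicated receiver instead
  - match:
      severity: 'critical'
    receiver: 'critical-receiver'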

Step One: Create Alerting Rules in Prometheus

Before you begin, you’ll want to create a dedicated Prometheus folder with server, node_exporter, github_exporter and prom_middleware subfolders. (The process is explained in my previous blog post.)

After that, move to the server subfolder, open it in your code editor, and create a new rules file. In rules.yml, you will specify the conditions under which you would like to be alerted.

cd Prometheus/server
touch rules.yml

I’m sure everyone agrees that knowing when any of your instances are down is very important. Therefore, I’m going to use this as our condition, with the help of the up metric. If you evaluate this metric in the Prometheus user interface (localhost:9090), you will see that all running instances have a value of 1, while all instances that are currently not running have a value of 0 (we currently run only our Prometheus instance).
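
For illustration, with only the Prometheus server itself running, evaluating up in the expression browser would return something along these lines (the exact labels depend on your scrape configuration):

up{instance="localhost:9090", job="prometheus"}  1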

After you’ve decided on your alerting condition, you need to specify it in rules.yml. Its content is going to be the following:

groups:
- name: AllInstances
  rules:
  - alert: InstanceDown
    # Condition for alerting
    expr: up == 0
    for: 1m
    # Annotation - additional informational labels to store more information
    annotations:
      title: 'Instance {{ $labels.instance }} down'
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
    # Labels - additional labels to be attached to the alert
    labels:
      severity: 'critical'

To summarize, it says that if any of the instances is down (up == 0) for one minute, the alert will start firing. I have also included annotations and labels, which store additional information about the alerts. In those, you can use template variables such as {{ $labels.instance }}, which are then interpolated into specific instances (such as localhost:9100).

(You can read more about Prometheus alerting rules in the Prometheus documentation.)
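
If you want to catch syntax mistakes before wiring the file into Prometheus, the promtool binary that ships in the Prometheus server archive can lint it; run it from the server folder:

./promtool check rules rules.yml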

Once you have rules.yml ready, you need to reference the file in prometheus.yml and add the alerting configuration. Your prometheus.yml is going to look like this:

global:
  # How frequently to scrape targets
  scrape_interval:     10s
  # How frequently to evaluate rules
  evaluation_interval: 10s

# Rules and alerts are read from the specified file(s)
rule_files:
  - rules.yml

# Alerting specifies settings related to the Alertmanager
alerting:
  alertmanagers:
    - static_configs:
      - targets:
        # Alertmanager's default port is 9093
        - localhost:9093

# A list of scrape configurations that specifies a set of
# targets and parameters describing how to scrape them.
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets:
        - localhost:9090
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets:
        - localhost:9100
  - job_name: 'prom_middleware'
    scrape_interval: 5s
    static_configs:
      - targets:
        - localhost:9091

If you’ve started Prometheus with the --web.enable-lifecycle flag, you can reload the configuration by sending a POST request to the /-/reload endpoint: curl -X POST http://localhost:9090/-/reload. After that, start the prom_middleware app (node index.js in the prom_middleware folder) and node_exporter (./node_exporter in the node_exporter folder).

Once you do this, anytime you want to trigger an alert, you can simply stop the node_exporter or prom_middleware app.
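
To verify that the rule really fires before any receivers are configured, you can check the Alerts page in the Prometheus UI, or query the built-in ALERTS metric in the expression browser. After node_exporter has been down for more than a minute, the result would look roughly like this (illustrative output):

ALERTS{alertname="InstanceDown", alertstate="firing", instance="localhost:9100", job="node_exporter", severity="critical"}  1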

Step Two: Set up Alertmanager

Create an alert_manager subfolder in the Prometheus folder (mkdir alert_manager). Then download and extract Alertmanager from the Prometheus downloads page into this folder. Without making any modifications to alertmanager.yml, run ./alertmanager --config.file=alertmanager.yml and open localhost:9093.

If everything is set up properly, Alertmanager should look something like the image below, listing whatever alerts are currently active. To see the annotations that you added in the step above, click on the +Info button.

With all of that completed, we can now look at the different ways of utilizing Alertmanager and sending out alert notifications.
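
As a side note, the Alertmanager archive also ships with the amtool CLI, which you can point at your local instance to list the alerts it currently knows about (assuming the default port):

./amtool alert query --alertmanager.url=http://localhost:9093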

How to Set up Slack Alerts

If you want to receive notifications via Slack, you need to be part of a Slack workspace. If you are currently not part of any Slack workspace, or you want to test this out in a separate workspace, you can quickly create one for free.

To set up alerting in your Slack workspace, you’re going to need a Slack API URL. Go to Slack -> Administration -> Manage apps.

In the Manage apps directory, search for Incoming WebHooks and add it to your Slack workspace.

Next, specify in which channel you’d like to receive notifications from Alertmanager. (I’ve created a #monitoring-instances channel.) After you confirm and add the Incoming WebHooks integration, the webhook URL (which is your Slack API URL) is displayed. Copy it.

Then you need to modify the alertmanager.yml file. First, open the alert_manager subfolder in your code editor and fill out your alertmanager.yml based on the template below, using the URL that you just copied as the slack_api_url.

global:
  resolve_timeout: 1m
  slack_api_url: 'https://hooks.slack.com/services/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'

route:
  receiver: 'slack-notifications'

receivers:
- name: 'slack-notifications'
  slack_configs:
  - channel: '#monitoring-instances'
    send_resolved: true
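
Before reloading, you can optionally validate the file with amtool from the alert_manager folder; this is just a sanity check and not strictly required:

./amtool check-config alertmanager.yml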

Reload the configuration by sending a POST request to the /-/reload endpoint: curl -X POST http://localhost:9093/-/reload. In a couple of minutes (after you stop at least one of your instances), you should receive your alert notifications through Slack, like this:

If you would like to improve your notifications and make them look nicer, you can use the template below or create your own.

global:
  resolve_timeout: 1m
  slack_api_url: 'https://hooks.slack.com/services/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'

route:
  receiver: 'slack-notifications'

receivers:
- name: 'slack-notifications'
  slack_configs:
  - channel: '#monitoring-instances'
    send_resolved: true
    icon_url: https://avatars3.githubusercontent.com/u/3380462
    title: |-
      [{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .CommonLabels.alertname }} for {{ .CommonLabels.job }}
      {{- if gt (len .CommonLabels) (len .GroupLabels) -}}
        {{" "}}(
        {{- with .CommonLabels.Remove .GroupLabels.Names }}
          {{- range $index, $label := .SortedPairs -}}
            {{ if $index }}, {{ end }}
            {{- $label.Name }}="{{ $label.Value -}}"
          {{- end }}
        {{- end -}}
        )
      {{- end }}
    text: >-
      {{ range .Alerts -}}
      *Alert:* {{ .Annotations.title }}{{ if .Labels.severity }} - `{{ .Labels.severity }}`{{ end }}

      *Description:* {{ .Annotations.description }}

      *Details:*
        {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
        {{ end }}
      {{ end }}

And this is the final result:

How to Set up PagerDuty Alerts

PagerDuty is one of the most well-known incident response platforms for IT departments. To set up alerting through PagerDuty, you need to create an account there. (PagerDuty is a paid service, but you can always do a 14-day free trial.) Once you’re logged in, go to Configuration -> Services -> + New Service.

Choose Prometheus from the Integration types list and give the service a name – I decided to call mine Prometheus Alertmanager. (You can also customize the incident settings, but I went with the default setup.) Then click save.

The Integration Key will be displayed. Copy the key.

You’ll need to update the content of your alertmanager.yml. It should look like the example below, but use your own service_key (the integration key from PagerDuty). The pagerduty_url should stay the same, set to https://events.pagerduty.com/v2/enqueue. Save and restart Alertmanager.

global:
  resolve_timeout: 1m
  pagerduty_url: 'https://events.pagerduty.com/v2/enqueue'

route:
  receiver: 'pagerduty-notifications'

receivers:
- name: 'pagerduty-notifications'
  pagerduty_configs:
  - service_key: 0c1cc665a594419b6d215e81f4e38f7
    send_resolved: true

Stop one of your instances. After a couple of minutes, alert notifications should be displayed in PagerDuty.

In PagerDuty user settings, you can decide how you’d like to be notified: email and/or phone call. I chose both and they each worked successfully.

How to Set up Gmail Alerts

If you prefer to have your notifications come directly through an email service, the setup is even easier. Alertmanager can simply pass along messages to your provider – in this case, I used Gmail – which then sends them on your behalf.

It isn’t recommended that you use your personal password for this, so you should create an app password. To do that, go to Account Settings -> Security -> Signing in to Google -> App password (if you don’t see App password as an option, you probably haven’t set up 2-Step Verification and will need to do that first). Copy the newly created password.

You’ll need to update the content of your alertmanager.yml again. The content should look similar to the example below. Don’t forget to replace the email address with your own email address, and the password with your new app password.

global:
  resolve_timeout: 1m

route:
  receiver: 'gmail-notifications'

receivers:
- name: 'gmail-notifications'
  email_configs:
  - to: monitoringinstances@gmail.com
    from: monitoringinstances@gmail.com
    smarthost: smtp.gmail.com:587
    auth_username: monitoringinstances@gmail.com
    auth_identity: monitoringinstances@gmail.com
    auth_password: password
    send_resolved: true

Once again, after a couple of minutes (after you stop at least one of your instances), alert notifications should be sent to your Gmail.
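
Finally, nothing stops you from combining several of these receivers in one alertmanager.yml and choosing between them with a routing tree. The sketch below is only illustrative – it reuses the placeholder channel and integration key from the examples above, and the global slack_api_url and pagerduty_url settings from earlier still apply. Everything goes to Slack by default, while critical alerts are routed to PagerDuty instead.

route:
  # Default: send everything to Slack
  receiver: 'slack-notifications'
  routes:
  # Route alerts labeled severity=critical to PagerDuty instead
  - match:
      severity: 'critical'
    receiver: 'pagerduty-notifications'

receivers:
- name: 'slack-notifications'
  slack_configs:
  - channel: '#monitoring-instances'
    send_resolved: true
- name: 'pagerduty-notifications'
  pagerduty_configs:
  - service_key: 0c1cc665a594419b6d215e81f4e38f7
    send_resolved: true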

And that’s it!

