This recipe is a step-by-step guide to setting up basic alerts using Timebeat software and the management platform. By the end, you should be able to use the steps in this article to create your own alerts, tailored to your needs.
You want to be alerted when the synchronisation of a particular device goes above a certain offset threshold.
Ingredients (the minimum required items)
- A basic PTP implementation
- Timebeat management platform
- Slack (in this example - other notification channels are available)
- A browser
Total Prep time: 5 mins, Total Cook time: 15 mins
In this recipe we will look at the Slack notification channel; other notification channels are available.
First, we will log into our Timebeat management platform. If you haven't yet got this up and running, check out this guide first.
To log in, we head to the web address of our Grafana instance and enter our credentials.
Once logged in, we should land on our Home Page.
Let's check out what notification channels are available to us:
For this, we head to the left-hand side menu, hover over the "bell" icon and select "Contact points".
Once the Contact points page has loaded, select "New contact point".
This provides us with a wide range of contact point types. In this example, we will set up a contact point using Slack. So we find Slack in the drop-down menu and select it.
As you can see above we are greeted with some items to fill in to complete the contact point.
So we will set up our Slack channel by heading to this link:
Once on that page, we follow the quick set-up buttons to enable a webhook.
We select Create an App.
We select "From scratch" in this example.
We type in our App Name - we use Timebeat Alerts. If you aren't familiar with Slack, this will be the sender name shown on your alert messages in the Slack chat window.
We also select our Workspace at this time.
Now we have our Slack App, we need to set up a webhook for Grafana to interact with.
For this, select Incoming Webhooks.
On the top right, we activate this feature with the toggle switch.
Once it is on, we scroll down and select Add New Webhook to Workspace.
Select the correct channel for the alerts to arrive in. We have selected #timebeat-alerts.
Now our webhook is configured and available at the bottom of the webhook section. Just copy the Webhook URL and return to Grafana.
We paste the copied webhook into the Webhook URL field of our contact point.
Then we select test in the top right corner.
Select Send test notification. This will send a "dummy" alert to our Slack channel. It should arrive quickly, so if we look in Slack we will see the following
Once the test notification is received, go ahead and save the contact point using the save button at the bottom of the page in Grafana.
Now that we have our contact point, we need to set up our notification policy. On the top bar, we select notification policies and then press the new specific policy button.
Now we select the previously configured contact point. We recommend selecting "continue matching subsequent sibling nodes", but this is up to you. This keeps alerts flowing when an alert is configured across multiple devices. If selected, alerts will be sent for every device that exceeds the threshold, even if the alert has already fired and has not yet been resolved.
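The webhook can also be checked independently of Grafana by posting a message to it directly. The following is a minimal sketch, assuming Slack's standard incoming webhook format (a JSON body with a `text` field). The URL is a placeholder and the function names are hypothetical, chosen only for illustration.

```python
import json
import urllib.request

# Placeholder only: replace with the Webhook URL copied from the Slack app page.
WEBHOOK_URL = "https://hooks.slack.com/services/REPLACE/WITH/YOURS"


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a plain-text Slack message."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_test_message(WEBHOOK_URL, "Test message for #timebeat-alerts")` with a real webhook URL should post the text to the channel chosen when the webhook was created.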
Once selected we hit save policy and return to the Alert rules tab.
From the alert rules tab, we select "New alert rule"
Now we set our query at the top.
In this example, we will run a simple query for Max absolute offset, separated by host.
Once our query is in place, we set the trigger condition. Above we have set the trigger to 20000 (as can also be seen by the graph line in red). This will trigger when the max offset of the query is above 20000.
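The logic of this rule can be sketched outside Grafana as follows. This is only an illustration of the idea (take the maximum absolute offset per host, then compare it with the threshold), not the query Grafana runs. The host names and sample values are made up, and the threshold is in whatever unit the query returns.

```python
THRESHOLD = 20000  # same value as the trigger condition in the alert rule


def max_abs_offset_by_host(samples):
    """Return the largest absolute offset seen for each host.

    `samples` is an iterable of (host, offset) pairs.
    """
    result = {}
    for host, offset in samples:
        result[host] = max(result.get(host, 0), abs(offset))
    return result


def hosts_over_threshold(samples, threshold=THRESHOLD):
    """Return the sorted list of hosts whose max absolute offset exceeds the threshold."""
    maxima = max_abs_offset_by_host(samples)
    return sorted(host for host, value in maxima.items() if value > threshold)


# Hypothetical sample data: (host, offset)
samples = [
    ("ptp-node-1", -150),
    ("ptp-node-1", 23000),
    ("ptp-node-2", 900),
    ("ptp-node-2", -19999),
]

print(hosts_over_threshold(samples))
```

Because the absolute value is used, a large negative offset counts in the same way as a large positive one.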
Once set, we scroll down and set the evaluation time. For this, we will use 1m for 5m. This means we will check for an offset above 20,000 every minute. With the addition of the "for 5m", however, the trigger condition must hold for this 5 minute period before the alert fires. This is usually more interesting when evaluating averages. As we have used the Max trigger, this has minimal effect. It will mean that for the alert to clear you will need to be below 20,000 for a minimum of 5 minutes.
After our evaluation period, we just name our alert rule and select the folder in which the alert will reside. Once complete, just hit save and return to the Grafana home page.
Once at the home page, you can check all your alerts in the table at the top left. As can be seen below, our alert is present and firing. There are several states an alert can be in, but just think of traffic lights where alerts are concerned - green means everything is OK and functioning within parameters, red is not so good and means the alert is being triggered.
On occasion, you may see an orange alert. This means that the alert is in a pending state, which is typically due to your configuration: a value has been over the threshold, but because you are using an evaluation period, the alert is not triggered until the end of that period. For example, if one data point is above the threshold but you evaluate the average over 1m, the alert will not be triggered until the 1m window has been calculated and its average is above the threshold. A value has been above the threshold, however, so you are in a pending state.
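The effect of "1m for 5m" can be pictured with a small state model. This is a simplified sketch of the behaviour, not Grafana's implementation: it assumes one evaluation per interval and that the alert fires only once the condition has held for the full "for" duration. The function name and sample values are hypothetical.

```python
def alert_states(values, threshold, interval_s=60, for_s=300):
    """Return the alert state after each evaluation.

    `values` holds one reading per evaluation, taken every `interval_s` seconds.
    The state is "pending" while the breach is younger than `for_s` seconds,
    "firing" once it has lasted at least `for_s`, and "normal" otherwise.
    """
    states = []
    breach_elapsed = None  # seconds since the first breaching evaluation
    for value in values:
        if value > threshold:
            if breach_elapsed is None:
                breach_elapsed = 0
            else:
                breach_elapsed += interval_s
            states.append("firing" if breach_elapsed >= for_s else "pending")
        else:
            breach_elapsed = None
            states.append("normal")
    return states


# One normal reading, six breaching readings a minute apart, then recovery.
readings = [100] + [25000] * 6 + [100]
print(alert_states(readings, threshold=20000))
```

In this model the first five breaching evaluations are pending and the sixth, five minutes after the first breach, is firing.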
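The difference between a single spike and a sustained breach when averaging can be shown with a few lines of arithmetic. This is only a sketch with made-up sample values. It compares the mean of a window against the threshold and is not how Grafana computes its reducers internally.

```python
from statistics import mean

THRESHOLD = 20000


def average_breaches(window, threshold=THRESHOLD):
    """Return True when the mean of the window is above the threshold."""
    return mean(window) > threshold


# Hypothetical windows of offset readings.
spike_window = [1000, 1200, 25000, 900, 1100, 1000]            # one value above
sustained_window = [21000, 22000, 25000, 23000, 21500, 22500]  # all values above

print(max(spike_window) > THRESHOLD, average_breaches(spike_window))
print(max(sustained_window) > THRESHOLD, average_breaches(sustained_window))
```

In the first window a single value exceeds the threshold but the average does not, which is the situation where an average-based rule stays quiet while a Max-based rule would trigger.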