Timing on Composite Metrics


Setting the timing for composite metrics can be tricky, because you can't expect a one-to-one ratio of composite values to check values. Differences in how the data is collected rule that out: checks don't fire at perfectly regular intervals, and composite values smooth out over time.


For example, say you want to try to produce that one-to-one ratio, and you're dealing with data that is sent once a minute, so you use an average value over 60 seconds. The values won't match up. This is because of variance in when the check fires. Sometimes there will be two values within the same 60-second period, even though data is only being sent once a minute, because the second value comes in just under the wire. If you never want to see that happen, you can set the time to 59 seconds. That way, you'll never get two values in the same period, but you will sometimes get a period with no value at all, which shows up as a zero. This may also be undesirable, since it will likely trigger an alert. If that's an issue for you, you'll need to raise the time back above 59 seconds, and you're back where you started. You can't win.
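
The trade-off above can be sketched with a small simulation. This is only an illustration, not how composites are actually computed: it assumes a trailing window that is re-evaluated once per second, and the arrival times are made-up values for a once-a-minute check that fires a little early or late. The names lookback_counts and ARRIVALS are hypothetical.

```python
def lookback_counts(arrivals, window_seconds, start, stop):
    """Return the sorted distinct sample counts seen in a trailing window.

    The window is evaluated once per second for now in [start, stop) and
    covers the half-open interval (now - window_seconds, now].
    """
    seen = set()
    for now in range(start, stop):
        count = sum(1 for t in arrivals if now - window_seconds < t <= now)
        seen.add(count)
    return sorted(seen)


# Hypothetical arrival times in seconds for a once-a-minute check.
# The gaps between samples vary between 59.5 and 60.4 seconds.
ARRIVALS = [0.0, 60.4, 119.9, 180.3, 240.1]

if __name__ == "__main__":
    for window in (60, 59):
        print(window, lookback_counts(ARRIVALS, window, 60, 300))
```

With these timestamps, the 60-second window sees 0, 1, or 2 samples depending on when it is evaluated, while the 59-second window never sees 2 but still comes up empty some of the time.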


Generally speaking, it is preferable to use a larger time period (3 to 5 minutes) and just be aware that you're pulling a larger average. If you're just using a composite metric for basic alerting, you can set a shorter time period (around 90 seconds) and alert on that. However, it can't be too short: a period of 59 seconds or less will regularly come up empty, so if you alert on it with no delay, you're going to get pinged constantly.
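
To make the effect of the period length concrete, here is a rough sketch that averages the same samples over 59, 90, and 180 seconds. It rests on the same assumptions as any toy model: a trailing window evaluated once per second, invented timestamps, and an invented spike of 70 among values of 10. The names lookback_average, summarize, and SAMPLES are hypothetical.

```python
def lookback_average(samples, now, window_seconds):
    """Average the values whose timestamps fall in (now - window, now].

    Returns None when the window contains no samples.
    """
    values = [v for t, v in samples if now - window_seconds < t <= now]
    return sum(values) / len(values) if values else None


def summarize(samples, window_seconds, start, stop):
    """Return (number of empty evaluations, highest average seen)."""
    averages = [lookback_average(samples, now, window_seconds)
                for now in range(start, stop)]
    empty = sum(1 for a in averages if a is None)
    peak = max(a for a in averages if a is not None)
    return empty, peak


# Hypothetical (timestamp_seconds, value) pairs with one spike at 180.3.
SAMPLES = [(0.0, 10.0), (60.4, 10.0), (119.9, 10.0),
           (180.3, 70.0), (240.1, 10.0)]

if __name__ == "__main__":
    for window in (59, 90, 180):
        print(window, summarize(SAMPLES, window, 181, 300))
```

In this sketch the 59-second period has an empty evaluation, the 90-second period is never empty and still sees the spike at full height, and the 180-second period is never empty but dilutes the spike to an average of 30. That dilution is what "pulling a larger average" means in practice, so thresholds on longer periods may need to be set lower.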