Circonus supports data ingestion from a wide variety of sources and formats. This data is transformed and processed internally and ultimately stored in the form of 1-minute rollup aggregates, which come in two flavors (cf. DataTypes):

1. Numeric metrics, which store a number of summary statistics, including the mean value across the rollup-window.

2. Histogram metrics, which store the sample distribution as key-value pairs (Histogram Representation).

This note explains how these rollup aggregates are derived from the incoming raw data.


1. Rolling-up Numeric Data


We cover the most common case first: raw numeric values are submitted to Circonus one at a time.


Each numeric raw value r that arrives at Circonus has an associated timestamp, r.t: long, which is an integer in milliseconds, and an associated value, r.v: float.


The stream of incoming raw values is divided into rollup windows, w, of equal duration, which is the rollup period, p. Currently, the minimum supported rollup period is 1 minute (p = 60000).


The rollup windows carry a start time, w.start, that is a multiple of the rollup period. The window end time is defined to be w.end = w.start + p.


A raw value, r, belongs to a rollup window, w, if:


w.start <= r.t and r.t < w.end
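The window that a raw value falls into can be computed directly from its timestamp. The following is a minimal sketch, not Circonus code: the names ROLLUP_PERIOD_MS and window_for are hypothetical, and it assumes that a value whose timestamp equals a window's start time belongs to that window (the appendix example, which adds a value at t=0, suggests this).

```python
ROLLUP_PERIOD_MS = 60_000  # p: one minute, the minimum supported rollup period

def window_for(t_ms, period_ms=ROLLUP_PERIOD_MS):
    """Return (w.start, w.end) of the rollup window containing timestamp t_ms."""
    # w.start is the largest multiple of the period that is not greater than t_ms.
    start = (t_ms // period_ms) * period_ms
    return start, start + period_ms

print(window_for(125_000))
```

Because w.start is always a multiple of p, integer division is enough; no lookup table of windows is needed.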


The rollup process takes a list of raw values, R, that belong to a rollup window, w, sorted by timestamp, and computes the following summary statistics over the raw values:

  • w.count is the number of samples in R: 

          w.count = #R

  • w.value is the average value of all sample values in R:

          w.value = sum_i R[i].v / #R


  • w.stddev is the (uncorrected) standard deviation of all sample values in R. Requires: #R > 2

          w.stddev = sqrt( sum_i (R[i].v - w.value)^2 / #R )

  • w.derivative is the effective rate of change of the samples per second. It is defined as:

          w.derivative = 1000 * (R[last].v - R[first].v) / (R[last].t - R[first].t)

    where R[first] and R[last] are the first and last samples in R. The factor 1000 appears because there are 1000 milliseconds in one second. There is an alternative way to compute w.derivative that will be used later on. It is based on the difference series DV and DT of R, which are:

          DV[i] = R[i+1].v - R[i].v

          DT[i] = R[i+1].t - R[i].t

    Here is the alternative computation:

          w.derivative = 1000 * sum_i DV[i] / sum_i DT[i]

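  • w.derivative_stddev is defined as the (uncorrected) standard deviation of the difference series DV weighted by DT:

          w.derivative_stddev = stddev(DV, DT)

    The weighted standard deviation is defined as:

          stddev(V, T) = sqrt( sum_i T[i] * (1000 * V[i] / T[i] - m)^2 / sum_i T[i] )

    where m = 1000 * sum_i V[i] / sum_i T[i] is the weighted mean rate per second.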


  • w.counter, the counter derivative rollup statistic, is similar to the derivative rollup, but is suited to raw value streams that experience overflow or reset events that should be ignored by the metric. Examples of such streams include the number of octets observed by a network switch, or the total visit count of a server that gets restarted occasionally.

    To define the counter derivative rollup, we use the DV and DT series from above. Let CV and CT be copies of DV and DT where indices i with DV[i] < 0 are removed from both series. Then:

          w.counter = 1000 * sum_i CV[i] / sum_i CT[i]


  • w.counter_stddev. In analogy to w.derivative_stddev, this rollup is defined as the (uncorrected) standard deviation of the monotone difference series CV weighted by CT:

          w.counter_stddev = stddev(CV, CT)
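The seven numeric statistics above can be sketched in a few lines of Python. This is an illustrative reading of the definitions, not the Circonus implementation: the function names are hypothetical, timestamps are assumed to be strictly increasing, and both standard deviations use the uncorrected (divide by the total weight) form with per-second rates, because that choice reproduces the values listed in the appendix.

```python
import math

def rate(dv, dt):
    """Per-second rate: 1000 * sum(dv) / sum(dt), or 0 if there are no differences."""
    total_t = sum(dt)
    return 1000.0 * sum(dv) / total_t if total_t else 0.0

def weighted_stddev(dv, dt):
    """Uncorrected stddev of the rates 1000*dv[i]/dt[i], weighted by dt[i]."""
    total_t = sum(dt)
    if total_t == 0:
        return 0.0  # fewer than two samples: nothing to measure
    mean_rate = rate(dv, dt)
    var = sum(t * (1000.0 * v / t - mean_rate) ** 2 for v, t in zip(dv, dt)) / total_t
    return math.sqrt(var)

def rollup(samples):
    """samples: non-empty list of (t_ms, value) pairs of one window, sorted by t_ms."""
    values = [v for _, v in samples]
    n = len(values)
    mean = sum(values) / n
    # Difference series DV and DT between consecutive samples.
    dv = [b[1] - a[1] for a, b in zip(samples, samples[1:])]
    dt = [b[0] - a[0] for a, b in zip(samples, samples[1:])]
    # CV and CT: drop every index where the value decreased (counter reset).
    kept = [(v, t) for v, t in zip(dv, dt) if v >= 0]
    cv = [v for v, _ in kept]
    ct = [t for _, t in kept]
    return {
        "count": n,
        "value": mean,
        "stddev": math.sqrt(sum((v - mean) ** 2 for v in values) / n),
        "derivative": rate(dv, dt),
        "derivative_stddev": weighted_stddev(dv, dt),
        "counter": rate(cv, ct),
        "counter_stddev": weighted_stddev(cv, ct),
    }

samples = [(0, 0.0), (100, 1.0), (220, 1.0), (300, 6.0), (400, 0.0), (500, 1.0)]
for name, stat in rollup(samples).items():
    print(f"{name:18}: {stat:.6f}")
```

Run on the first six samples of the appendix, this should agree with the t=500 entry there (for example derivative 2 and counter 17.5), up to the single-precision rounding visible in the appendix output.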


In addition to those numeric rollup-aggregates, Circonus also supports rollup to Histogram metrics (cf. Histograms, Histogram Internals). For numeric input values those are calculated as follows:


  • w.histogram, the histogram statistic, can be represented as an associative array that maps bucket boundaries to sample counts:

          w.histogram = { b -> #{ r in R : r.v in Bucket(b) } }


    Here, b represents a bucket boundary, which can be any float value of the form b = x * 10^e, where
    • x is one of the following float values: x = -9.9, -9.8, -9.7, ..., -1.1, -1.0, 0, 1.0, 1.1, ..., 9.9
    • e is an integer in the following range: e = -128 .. 127

    Bucket(b) is the bucket with boundary b, e.g. if b = 1, then Bucket(b) is the interval [1, 1.1).

    For more details, see Histogram Internals.
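To make the bucketing concrete, here is a small sketch that maps a value to its bucket boundary by truncating it to two significant decimal digits. The function names are hypothetical, the exponent range -128 .. 127 is not enforced, and the handling of negative values (truncation toward zero) is an assumption, since the text above only gives a positive example.

```python
from collections import Counter
from decimal import Decimal, ROUND_DOWN

def bucket_boundary(v):
    """Return the boundary b = x * 10^e of the bucket containing v."""
    d = Decimal(str(v))
    if d == 0:
        return Decimal(0)
    e = d.adjusted()               # exponent of the most significant digit
    quantum = Decimal(1).scaleb(e - 1)  # keeps exactly two significant digits
    # ROUND_DOWN truncates toward zero, e.g. 1.05 -> 1.0 and 123 -> 120.
    return d.quantize(quantum, rounding=ROUND_DOWN)

def histogram(values):
    """Map bucket boundaries to sample counts."""
    return Counter(bucket_boundary(v) for v in values)

print(histogram([1, 1.05, 1.1, 123]))
```

Decimal arithmetic is used here instead of floats so that boundaries such as 1.1 are not disturbed by binary rounding.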

2. Rolling-up Histogram Data


In addition to numeric input values, Circonus also accepts histogram data submitted directly via an HTTP Trap, in the form of:

  • Raw histograms as arrays of numeric values, e.g. [1, 2.7, 5]
  • Raw histograms in pre-bucketed form, e.g.  ["H[0.1]=3", "H[11]=7"]


You can mix both types of data submission with each other and with numeric raw-values as well. 

The resulting histogram rollup aggregates all data that was submitted for a time period by summing the counts for each bucket.


Example


If the following values are submitted:


| t   | value                  |
|-----+------------------------|
| 0   | 1                      |
| 100 | [1,2,3]                |
| 200 | ["H[1]=3", "H[5]=3"]   |
| 300 | 2                      |
| 400 | [3,4,5]                |
| 500 | ["H[5]=3", "H[6]=3"]   |


The resulting histogram for the time period [0, 1000] is given by:


H[1] = 1 + 1 + 3 + 0 + 0 + 0  = 5

H[2] = 0 + 1 + 0 + 1 + 0 + 0  = 2

H[3] = 0 + 1 + 0 + 0 + 1 + 0  = 2

H[4] = 0 + 0 + 0 + 0 + 1 + 0  = 1

H[5] = 0 + 0 + 3 + 0 + 1 + 3  = 7

H[6] = 0 + 0 + 0 + 0 + 0 + 3  = 3
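The bucket sums in this example can be reproduced with a short sketch. The helper name add_submission and the regular expression are hypothetical, not part of any Circonus API. For simplicity the sketch uses each value itself as the bucket key, which is valid here only because 1 through 6 are all bucket boundaries.

```python
import re
from collections import Counter

# Matches pre-bucketed entries such as "H[5]=3" (bucket 5, count 3).
BUCKET_RE = re.compile(r"^H\[([^\]]+)\]=(\d+)$")

def add_submission(hist, value):
    """Add one submission (number, array of numbers, or pre-bucketed array) to hist."""
    items = [value] if isinstance(value, (int, float)) else value
    for item in items:
        if isinstance(item, str):
            match = BUCKET_RE.match(item)
            if match is None:
                raise ValueError(f"not a pre-bucketed entry: {item!r}")
            hist[float(match.group(1))] += int(match.group(2))
        else:
            hist[float(item)] += 1  # a plain sample counts once
    return hist

submissions = [1, [1, 2, 3], ["H[1]=3", "H[5]=3"], 2, [3, 4, 5], ["H[5]=3", "H[6]=3"]]
hist = Counter()
for submission in submissions:
    add_submission(hist, submission)

for bucket in sorted(hist):
    print(f"H[{bucket:g}] = {hist[bucket]}")
```

The printed counts match the six sums listed above.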


Numeric Rollup of Histogram Values


When raw histogram values are pushed, numeric aggregates are also computed. However, due to the details of the data collection, the resulting rollup values are often perceived as unintuitive. In particular, they are not equivalent to pushing the contained samples individually. Hence, we make the following recommendation:


WARNING: 
If you use raw-histogram values, ONLY the resulting histogram metric should be used for further computations!


Now that we have warned you, here is what ends up in the numeric metric when histogram values are pushed.


  1. When a Raw histogram array is pushed, the numeric metric will receive a single sample with the average value of the array.
  2. When a Raw pre-bucketed histogram is pushed, the numeric metric will receive a nil-sample.


Therefore, for the example above, the numeric statistics count and average (w.value) are calculated as follows:

    w.count = 1 + 1 + 1 + 1 + 1 + 1 = 6

    w.value = (1 + 2 + 0 + 2 + 4 + 0) / 6 = 1.5


Note that these are different from the total count of all samples (20) and the overall average of all samples (72/20 = 3.6).
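The two rules above can be sketched as follows. The function name numeric_sample is hypothetical. The sketch assumes, as the calculation above suggests, that a nil-sample still increments the count and contributes 0 to the sum; the actual handling inside Circonus may differ.

```python
def numeric_sample(value):
    """Return the single numeric sample one submission contributes (None for nil)."""
    if isinstance(value, (int, float)):
        return float(value)                 # plain numeric value
    if value and all(isinstance(x, (int, float)) for x in value):
        return sum(value) / len(value)      # raw histogram array: its average
    return None                             # pre-bucketed histogram: nil-sample

submissions = [1, [1, 2, 3], ["H[1]=3", "H[5]=3"], 2, [3, 4, 5], ["H[5]=3", "H[6]=3"]]
samples = [numeric_sample(s) for s in submissions]

count = len(samples)
total = sum(s if s is not None else 0.0 for s in samples)
print(samples)
print(count, total, total / count)
```

Each submission contributes exactly one numeric sample regardless of how many values it contains, which is why the numeric aggregates diverge from the statistics of the underlying samples.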

The recommended way to derive those values is to use our Analytics Query Language, CAQL. E.g.


     metric:histogram(...) | histogram:count()

     metric:histogram(...) | histogram:mean()

     metric:histogram(...) | histogram:stddev()


CAQL computes approximations of these values from the histogram itself.



Appendix: Numeric Rollup Example


--- Adding rollup value {t=0, v=0.0}
- count             : 1
- value             : 0
- stddev            : 0
- derivative        : 0
- derivative_stddev : 0
- counter           : 0
- counter_stddev    : 0 
 
--- Adding rollup value {t=100, v=1.0}
- count             : 2
- value             : 0.5
- stddev            : 0.5
- derivative        : 10
- derivative_stddev : 0
- counter           : 10
- counter_stddev    : 0

--- Adding rollup value {t=220, v=1.0}
- count             : 3
- value             : 0.66666666666667
- stddev            : 0.47140452265739
- derivative        : 4.5454545021057
- derivative_stddev : 4.979296207428
- counter           : 4.5454545021057
- counter_stddev    : 4.979296207428

--- Adding rollup value {t=300, v=6.0}
- count             : 4
- value             : 2
- stddev            : 2.3452079296112
- derivative        : 20
- derivative_stddev : 25.980762481689
- counter           : 20
- counter_stddev    : 25.980762481689

--- Adding rollup value {t=400, v=0.0}
- count             : 5
- value             : 1.6
- stddev            : 2.2449944019318
- derivative        : 0
- derivative_stddev : 41.306777954102
- counter           : 20
- counter_stddev    : 25.980762481689



--- Adding rollup value {t=500, v=1.0}
- count             : 6 
- value             : 1.5 
- stddev            : 2.0615527629852 
- derivative        : 2 
- derivative_stddev : 37.161808013916 
- counter           : 17.5 
- counter_stddev    : 22.912878036499 


--- Adding rollup value {t=600, v=3.0}
- count             : 7
- value             : 1.7142857142857
- stddev            : 1.9794865846634
- derivative        : 5
- derivative_stddev : 34.580821990967
- counter           : 18
- counter_stddev    : 20.518283843994


--- Adding rollup value {t=700, v=0.0}
- count             : 8
- value             : 1.5
- stddev            : 1.9364916086197
- derivative        : 0
- derivative_stddev : 34.278270721436
- counter           : 18
- counter_stddev    : 20.518283843994


--- Adding rollup value {t=800, v=1.0}
- count             : 9
- value             : 1.4444444444444
- stddev            : 1.8324912786484
- derivative        : 1.25
- derivative_stddev : 32.234489440918
- counter           : 16.666666030884
- counter_stddev    : 18.966344833374


--- Adding rollup value {t=900, v=3.0}
- count             : 10
- value             : 1.6
- stddev            : 1.7999999523163
- derivative        : 3.3333332538605
- derivative_stddev : 30.956954956055
- counter           : 17.1428565979
- counter_stddev    : 17.598121643066