StatsD sends Average or Graphite stores Average instead of Sum - graphite

I am using StatsD to record requests sent to my server, and Graphite to collect the statistics. But when I try to display the statistics, I get averages instead of a sum aggregated over a minute.
My retention for the requests is 1m:7d,5m:35d,1d:1y. My xFilesFactor is 0 and my aggregationMethod is sum. The flushInterval of StatsD is set to 1m. What am I doing wrong?

StatsD normalizes the counter value it sends to Graphite to a per-second rate over the flush interval.
The aggregationMethod is a setting in Graphite's storage aggregation config that specifies how points are aggregated as you go from (in your case) a 1-minute-per-point representation to a 5-minute-per-point representation (and so on). It does not affect what StatsD sends.
If you want the number of requests over that one-minute period, you can multiply the series by a constant 60 in Graphite, for example with scale(series, 60).
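As a rough illustration of the arithmetic only (the function name and the sample numbers are made up for this sketch and are not part of StatsD or Graphite), a per-second rate multiplied by the flush interval gives back the count for that interval:

```python
FLUSH_INTERVAL_S = 60  # StatsD flush interval from the question (1m)


def rate_to_count(per_second_rate, interval_s=FLUSH_INTERVAL_S):
    """Convert a per-second rate back into a count for one flush interval."""
    return per_second_rate * interval_s


# Hypothetical example: 120 requests arrive during one flush interval.
requests_in_minute = 120
reported_rate = requests_in_minute / FLUSH_INTERVAL_S  # what gets stored per second
recovered = rate_to_count(reported_rate)               # back to requests per minute
print(reported_rate, recovered)
```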

Related

DynamoDB ConsumedWriteCapacityUnits vs Consumed CloudWatch Metrics for 1 second Period

I am confused by this live chart: ConsumedWriteCapacityUnits is exceeding the provisioned units, while "consumed" is way below. Do I have a real problem or not?
This seems to show only for the One Second and One Minute periods.
Your period is wrong for the metrics. DynamoDB emits metrics at the following periods:
ConsumedCapacity: 1 min
ProvisionedCapacity: 5 min
For ConsumedCapacity, divide the metric (its Sum over the period) by the period in seconds to get units per second, and use a period of at least 1 minute.
Exceeding provisioned capacity for short periods is fine, because burst capacity allows it. If you exceed it for long periods, requests will be throttled.
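A minimal sketch of that division, assuming the chart shows the Sum statistic of the consumed metric; the function name and the numbers are hypothetical and this does not call any AWS API:

```python
def consumed_units_per_second(sum_over_period, period_s):
    """Average consumed capacity units per second for one datapoint."""
    if period_s < 60:
        # The consumed metric is emitted once per minute, so shorter periods mislead.
        raise ValueError("period must be at least 60 seconds")
    return sum_over_period / period_s


# Hypothetical numbers: 1500 write units consumed in a 60 s datapoint, 50 WCU provisioned.
provisioned_wcu = 50
per_second = consumed_units_per_second(1500, 60)
over_provisioned = per_second > provisioned_wcu
print(per_second, over_provisioned)
```

Compared this way, the raw per-period sum (1500) looks far above the provisioned value, while the per-second figure is what should be compared with it.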

Understanding the retention policy for the Graphite DB

I have the following retention policy in my storage-schemas.conf file:
[metrics]
pattern = ^metrics.api.*
retentions = 10s:5m,1m:1d,1h:30d,1d:1y,30d:10y
My understanding is as follows.
This policy applies to metrics whose names match the pattern, i.e. start with metrics.api.
1st, 10s:5m -> if one or more records arrive within a 10s interval, Graphite keeps the latest record as a single datapoint and maintains this history for 5 minutes. Suppose 5 datapoints are added for the metric key in those 5 minutes.
2nd, 1m:1d -> this applies to the same metric key once the 5 minutes are over: if one or more records arrive within a 1m interval, Graphite keeps the latest record as a single datapoint and maintains this history for 1 day. Suppose 15 datapoints are added for the metric key in that day.
My question is what happens between these two retentions: will Graphite average them, (5+15)/2 = 10, and produce one averaged datapoint from the 1st and 2nd retentions?
And so on, until 10 years of data are stored.
Can you please explain this retention policy?
The aggregationMethod is applied when data crosses from one retention to the next.
The first retention, 10s:5m, means Graphite will store 30 datapoints (one every 10 seconds for the last 5 minutes) in archive 0.
Note that it always stores these datapoints, even if no data arrived; in that case Graphite puts NULLs there.
The next retention, 1m:1d, means that every minute Whisper takes the 6 10s datapoints from archive 0, applies the average() function (the default aggregationMethod), and stores the result in archive 1.
Whisper does so only if at least 3 of those points in archive 0 (6 datapoints multiplied by xFilesFactor = 0.5) have values, i.e. are not NULL. Otherwise Whisper decides that it does not have enough data to propagate and stores NULL instead.
And so on: the third retention, 1h:30d, means that 60 datapoints from archive 1 are aggregated using the average function and propagated to archive 2, but only if at least 30 of them have values.
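The propagation rule described above can be sketched roughly as follows. This is a simplified model written for illustration, not Whisper's actual code; the function name and sample values are assumptions:

```python
def propagate(points, x_files_factor=0.5, method="average"):
    """Aggregate higher-resolution points into one lower-resolution point.

    `points` uses None for missing values. Returns None when the fraction
    of known points is below x_files_factor.
    """
    known = [p for p in points if p is not None]
    if not known or len(known) / len(points) < x_files_factor:
        return None
    if method == "sum":
        return sum(known)
    return sum(known) / len(known)


# Six 10s slots from archive 0 that make up one 1m slot in archive 1.
half_filled = [4, None, 6, None, 8, None]      # 3 of 6 known: enough at 0.5
sparse = [4, None, None, None, None, None]     # 1 of 6 known: not enough

averaged = propagate(half_filled)
dropped = propagate(sparse)
summed = propagate(half_filled, method="sum")
print(averaged, dropped, summed)
```

Nothing here averages the number of datapoints across archives; each lower-resolution point is computed only from the higher-resolution points that cover the same time span.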

Graphite, sending multiple values for same key: only one remains

I have this configuration in my Graphite:
# go-carbon.aggregation
[sum_counts]
pattern = ^stats_counts.*
xFilesFactor = 0
aggregationMethod = sum
# /go-carbon.schemas
[default_1min_for_7days_and_1hour_for_5years]
pattern = .*
retentions = 1m:30d,1h:5y
I am sending different values for the same key within the same minute, but I see that some of them are ignored.
I was expecting the values to be summed, as defined by the aggregationMethod.
To be specific: I am sending 1 signal every 10 seconds directly to Graphite, to this key:
stats_counts.test.monitor.remote.datapoint
But when I check what Graphite is storing, I see that it counts only 1 signal per minute.
Is there any way to ask Graphite to aggregate the incoming signals?
Note: we are using StatsD to aggregate these signals and it works. The problem is that we now want to add several StatsD daemons, so we need Graphite to aggregate the signals coming from the different daemons.
It looks to me like you mixed up storage aggregation, which happens when a metric becomes too old for one retention and is rolled up into the next according to the retention rules, with the carbon-aggregator daemon, which acts as a proxy in front of storage and does exactly what you need. Within a single 1m interval, the storage layer keeps only the last value written for a key.
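To make the difference concrete, here is a small simulation. It is only a sketch under the assumption of Whisper-style storage with one slot per retention step; both function names are hypothetical and neither is a go-carbon or carbon-aggregator API:

```python
from collections import defaultdict

STEP_S = 60  # finest retention from the question: 1m


def store_like_whisper(samples, step=STEP_S):
    """One slot per step; a later write to the same slot replaces the earlier one."""
    slots = {}
    for ts, value in samples:
        slots[ts - ts % step] = value
    return slots


def aggregate_before_store(samples, step=STEP_S):
    """Sum all samples falling in the same step, as an aggregating proxy would."""
    slots = defaultdict(int)
    for ts, value in samples:
        slots[ts - ts % step] += value
    return dict(slots)


# One signal of value 1 every 10 seconds for a minute, as in the question.
samples = [(t, 1) for t in range(0, 60, 10)]
stored = store_like_whisper(samples)
summed = aggregate_before_store(samples)
print(stored, summed)
```

Direct writes leave a single value of 1 in the minute slot, while summing before the write produces the expected 6.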

graphite: how to get per-second metrics from batch metrics?

I'm trying to measure an online mini-batch processing system with a per-second metric (total queries per second). For every batch, a metric (e.g. "stats.gauges.<host>.query.count") is sent to Graphite. Batches are processed on several different hosts in parallel, and a batch of data takes about 5 seconds to process.
I've tried:
simply summing the series: sumSeries(stats.gauges.*.query.count); the resulting metric is many times greater than the actual value;
scaling to 1 second: scaleToSeconds(sumSeries(stats.gauges.*.query.count), 1); the resulting metric is much less than the actual value;
integral then derivative: nonNegativeDerivative(sumSeries(integral(stats.gauges.*.query.count))); same as the first case;
sending gauges with the delta=True param, then taking the derivative; the result is about 20% greater than the actual value.
So, how do I get per-second metrics from batch metrics? What is the best practice?
You should use the carbon-aggregator service to add several metrics together as they come in. There is an example that fits your case at http://graphite.readthedocs.io/en/latest/config-carbon.html#aggregation-rules-conf
As your batch takes 5 seconds to process, the frequency should be 5 so that all the metrics are buffered. After five seconds, the aggregator will sum them up and write the result to Graphite.
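The buffering idea can be sketched like this. It is an illustration only, not carbon-aggregator code: the function name, host names, and counts are made up, and the final division by the frequency is an added step to turn each 5-second total into a per-second figure.

```python
from collections import defaultdict

FREQUENCY_S = 5  # buffer window suggested above


def queries_per_second(samples, frequency=FREQUENCY_S):
    """samples: (timestamp, host, query_count) tuples.

    Sums counts from all hosts per window, then divides by the window length.
    """
    totals = defaultdict(int)
    for ts, _host, count in samples:
        totals[ts - ts % frequency] += count
    return {bucket: total / frequency for bucket, total in sorted(totals.items())}


# Hypothetical batches reported by two hosts.
samples = [
    (0, "host-a", 100),
    (2, "host-b", 150),
    (5, "host-a", 200),
    (7, "host-b", 50),
]
qps = queries_per_second(samples)
print(qps)
```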

Graphite: show change from previous value

I am sending Graphite the time spent in garbage collection (obtained from the JVM via JMX). This is a counter that only increases. Is there a way to have Graphite graph the change every minute, so I can see a graph that shows time spent in GC per minute?
You should be able to turn the counter into a hit-rate with the derivative function, then use the summarize function to roll the result up into the time frame that you're after.
&target=summarize(derivative(java.gc_time), "1min") # time spent per minute
derivative(seriesList)
This is the opposite of the integral function. This is useful for taking a
running total metric and showing how many requests per minute were handled.
&target=derivative(company.server.application01.ifconfig.TXPackets)
Each time you run ifconfig, the RX and TXPackets are higher (assuming there is network traffic.)
By applying the derivative function, you can get an idea of the packets per minute sent or received, even though you’re only recording the total.
summarize(seriesList, intervalString, func='sum', alignToFrom=False)
Summarize the data into interval buckets of a certain size.
By default, the contents of each interval bucket are summed together.
This is useful for counters where each increment represents a discrete event and
retrieving a “per X” value requires summing all the events in that interval.
Source: http://graphite.readthedocs.org/en/0.9.10/functions.html
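What the two functions do to a counter can be imitated in a few lines. This is a simplified sketch, not Graphite's implementation; the sample values and the 20-second sampling step are assumptions:

```python
def derivative(series):
    """Difference between consecutive points; the first point has no predecessor."""
    out = [None]
    for prev, cur in zip(series, series[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out


def summarize(series, points_per_bucket):
    """Sum consecutive groups of points, skipping missing values."""
    buckets = []
    for i in range(0, len(series), points_per_bucket):
        known = [v for v in series[i:i + points_per_bucket] if v is not None]
        buckets.append(sum(known) if known else None)
    return buckets


# Hypothetical cumulative GC time in ms, sampled every 20 seconds.
gc_time = [100, 110, 125, 125, 160, 170, 200]
deltas = derivative(gc_time)          # change between samples
per_minute = summarize(deltas, 3)     # 3 samples of 20 s = 1 minute
print(deltas, per_minute)
```

If the counter can reset (for example on a JVM restart), the nonNegativeDerivative function is usually a safer choice than derivative, since a reset would otherwise show up as a large negative value.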
