StatsD + Graphite: showing low-frequency events

I am using the standard Graphite + StatsD stack, with a 10-second StatsD flush interval and a 10-second Graphite retention schema.
I have several events that happen 1K-10K times a day, and I would like to show them in a graph too. But the graph is too flat (a 24h range at the StatsD interval has 8640 slots). I would like a graph that measures my event in events/min or events/5min; the default right now is events/10s. Is that possible without changing the default StatsD interval and the retention schemas in Graphite?
Thanks

I found the answer: the built-in summarize function in Graphite does exactly that.
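For example (the metric name here is hypothetical), a 1-minute view of a 10-second counter would be:
&target=summarize(stats.counters.my_event.count, "1min", "sum")
With "sum", summarize adds up all the 10-second datapoints inside each 1-minute bucket, so the graph reads as events/min while the StatsD interval and retention schema stay untouched.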

Related

High-volume writes to a Graphite database with millisecond delays

I have 5000 devices, each reading data from a sensor 20 times per second.
I'm pushing all the data to Kinesis, but I want to make it queryable,
so I'm thinking of polling the data from Kinesis and storing it in Graphite.
I will have a lot of writes and not many reads (5000 * 20 * 60 ≈ 6,000,000 writes per minute).
I want to understand whether Graphite/carbon-cache/Whisper will fit this requirement and what kind of hardware I need to handle it (a bulk write every 1 min is OK for me as well).
You can use the Graphite/Whisper database to store high-volume metrics (6 million per minute). You need to define a proper archival and retention policy in Whisper. The Graphite database is also highly scalable, so you can add more CPU or I/O capacity.
See the Graphite Whisper documentation.
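As an illustration only (the pattern and retention periods are assumptions, not a recommendation), a Whisper retention policy in storage-schemas.conf could look like:
[sensor-data]
pattern = ^sensors\.
retentions = 1s:1d,1m:30d,10m:1y
Each level trades disk and I/O for history: raw 1-second points for a day, rolled up to 1-minute and then 10-minute points for longer storage. Note that Whisper keeps one point per slot per metric, so 20 readings per second must be aggregated down to the slot resolution before (or while) writing.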

How do you sum a statsd counter over a large time range with correct values?

Background
A basic use case for statsd & grafana is learning how many times a function has been called over a time range, whether that is "last 6h", "since the beginning of today", "since the beginning of time", etc.
What I'm struggling to find is the correct function to achieve this. I'm using a hosted solution; however, I can confirm that data is being flushed from StatsD to Graphite in 10s intervals.
Current Setup
StatsD Flush: 10s
Graph Function: hitcount(counters.login.employer.count, "10seconds")
Time Range: 24h
Problem
When using hitcount(counters.login.employer.count, "10seconds"), the data returned is incorrect. In fact, I can query 24h, 23h, and 22h ranges and see that the values are actually increasing.
I've performed all testing here in a controlled environment, only my machine is sending metrics to StatsD. This is not yet in production code.
Any idea what could be going on here?
The way counters work is that on each interval the value of the counter is sent to Graphite and reset in StatsD, so what you're looking for is the sum of the series.
You can do that using consolidateBy('sum') combined with maxDataPoints=1.
Be aware that if your series is being aggregated in Graphite, you'll need to make sure that the aggregation is by sum; otherwise, when values get rolled up from the individual values reported by StatsD into aggregated buckets, they'll be averaged, and your sum won't work across longer intervals. You can read more about configuring aggregation in Graphite's storage-aggregation.conf documentation.
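As a sketch using the metric name from the question, the render request could be:
&target=consolidateBy(counters.login.employer.count, 'sum')&maxDataPoints=1&from=-24h
and a matching storage-aggregation.conf entry (the pattern is an assumption) to make rollups sum rather than average:
[counts]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum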

graphite: how to get per-second metrics from batch metrics?

I'm trying to measure an online mini-batch processing system with a per-second metric (total queries per second). For every batch, a metric (e.g. "stats.gauges.<host>.query.count") is sent to Graphite. Batches are processed on several different hosts in parallel, and a batch of data takes about 5 seconds to process.
I've tried:

1. Simply summing the series: sumSeries(stats.gauges.*.query.count); the resulting metric is many times greater than the actual value.
2. Scaling to 1 second: scaleToSeconds(sumSeries(stats.gauges.*.query.count), 1); the resulting metric is much less than the actual value.
3. Integral then derivative: nonNegativeDerivative(sumSeries(integral(stats.gauges.*.query.count))); same as the first case.
4. Sending gauges with the delta=True param, then taking the derivative; the result is about 20% greater than the actual value.

So, how do I get per-second metrics from batch metrics? What is the best practice?
You should use the carbon-aggregator service to add several metrics together as they come in. There is an example which fits your case at http://graphite.readthedocs.io/en/latest/config-carbon.html#aggregation-rules-conf
As your batch takes 5 seconds to process, the frequency should be 5 seconds so that all the metrics are buffered. After five seconds, the aggregator will sum them up and write the result to Graphite.
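As a sketch, an aggregation-rules.conf entry for this case (the output metric name is an assumption) could be:
stats.gauges.all.query.count (5) = sum stats.gauges.<host>.query.count
Every 5 seconds, the aggregator sums the per-host values matched by <host> into the single output series, which can then be read as a true total.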

StatsD sends Average or Graphite stores Average instead of Sum

I am using StatsD to record requests sent to my server, and Graphite to collect the statistics. But when I try to display the statistics, instead of a sum aggregated over a minute, I get averages.
My retention rate for the requests is 1m:7d,5m:35d,1d:1y. My xFilesFactor is 0 and my aggregationMethod is sum. The FlushInterval of StatsD is set to 1m. What am I doing wrong?
StatsD normalizes the points it sends to Graphite over that time period on a per-second basis.
The aggregationMethod is a config option for Graphite retention schemas which specifies how points are aggregated as you go from (in your case) a 1-minute-per-point representation to a 5-minute-per-point representation (and so on).
If you want the number of requests over that minute period, in Graphite you can multiply the series by a constant 60 and get the result.
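For example (the metric name is hypothetical), using Graphite's scale function:
&target=scale(stats.requests.count, 60)
This works because StatsD reports the counter as a per-second rate; multiplying by 60 recovers the per-minute total for a 1-minute flush interval.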

Graphite: show change from previous value

I am sending Graphite the time spent in garbage collection (getting this from the JVM via JMX). This is a counter that only increases. Is there a way to have Graphite graph the change every minute, so I can see a graph that shows time spent in GC per minute?
You should be able to turn the counter into a hit-rate with the derivative function, then use the summarize function to roll it up into the time frame that you're after.
&target=summarize(derivative(java.gc_time), "1min") # time spent per minute
derivative(seriesList)
This is the opposite of the integral function. This is useful for taking a
running total metric and showing how many requests per minute were handled.
&target=derivative(company.server.application01.ifconfig.TXPackets)
Each time you run ifconfig, the RX and TXPackets are higher (assuming there is network traffic.)
By applying the derivative function, you can get an idea of the packets per minute sent or received, even though you’re only recording the total.
summarize(seriesList, intervalString, func='sum', alignToFrom=False)
Summarize the data into interval buckets of a certain size.
By default, the contents of each interval bucket are summed together.
This is useful for counters where each increment represents a discrete event and
retrieving a “per X” value requires summing all the events in that interval.
Source: http://graphite.readthedocs.org/en/0.9.10/functions.html
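One caveat worth adding to the target above: if the JVM restarts, the raw GC counter drops back to zero and derivative produces a large negative spike. Graphite's nonNegativeDerivative function skips datapoints that trend down, so a more robust variant is:
&target=summarize(nonNegativeDerivative(java.gc_time), "1min")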
