Graphite - calculate the difference of two series containing multiple values - graphite

I'm using Graphite and Grafana to graph some metrics. Given the following example, is it possible to output a difference that contains multiple values?
service.cluster1.host1.quota
service.cluster1.host1.usage
service.cluster1.host2.quota
service.cluster1.host2.usage
service.cluster1.host3.quota
service.cluster1.host3.usage
I'm trying to output separate values (based on last) (i.e. quota - usage) for each host. I can display all the data with two separate series using a wildcard for the 'host#' tag, but I'm not certain how I can output the difference per host. My goal is then to use limit() to only display the top few. I've been looking at functions like groupByNode() and diffSeries() but I haven't found a solution. I'm trying to avoid defining a separate series for each host.

I stumbled to the following solution using reduceSeries() and mapSeries() (given the previous example data):
limit(sortBy(aliasByTags(reduceSeries(mapSeries(service.cluster1.*.*, 2), 'diffSeries', 3, 'quota', 'usage'), 2), 'last', false), 10)

Related

Grouping and transposing data in R

It is hard to explain this without just showing what I have, where I am, and what I need in terms of data structure:
What structure I had:
Where I have got to with my transformation efforts:
What I need to end up with:
Notes:
I've not given actual names for anything as the data is classed as sensitive, but:
Metrics are things that can be measured- for example, the number of permanent or full-time jobs. The number of metrics is larger than presented in the test data (and the example structure above).
Each metric has many years of data (whilst trying to do the code I have restricted myself to just 3 years. The illustration of the structure is based on this test). The number of years captured will change overtime- generally it will increase.
The number of policies will fluctuate, I've just labelled them policy 1, 2 etc for sensitivity reasons and limited the number whilst testing the code. Again, I have limited the number to make it easier to check the outputs.
The source data comes from a workbook of surveys with a tab for each policy. The initial import creates a list of tibbles consisting of a row for each metric, and 4 columns (the metric names, the values for 2024, the values for 2030, and the values for 2035). I converted this to a dataframe, created a vector to be a column header and used cbind() to put this on top to get the "What structure I had" data.
To get to the "Where I have got to with my transformation efforts" version of the table, I removed all the metric columns, created another vector of metrics and used rbind() to put this as the first column.
The idea in my head was to group the data by policy to get a vector for each metric, then transpose this so that the metric became the column, and the grouped data would become the row. Then expand the data to get the metrics repeated for each year. A friend of mine who does coding (but has never used R) has suggested using loops might be a better way forward. Again, I am not sure of the best approach so welcome advice. On Reddit someone suggested using pivot_wider/pivot_longer but this appears to be a summarise tool and I am not trying to summarise the data rather transform its structure.
Any suggestions on approaches or possible tools/functions to use would be gratefully received. I am learning R whilst trying to pull this data together to create a database that can be used for analysis, so, if my approach sounds weird, feel free to suggest alternatives. Thanks

Grafana dividing 2 series

I'm trying to divide 2 series to get their ratio.
For example I'm got sites (a.com, b.com, c.com) as * (All sites)
Each of them has total sections count and errors occurred stats. I'm wanna to show as bars errors/sections where section > errors for each site to each erros for this site. Here I'm whant to got 3 bars.
So:
A parser.*.sections.total
B parser.*.errors.total
X-Axis Mode:Series
Display:DrawMode: Bars
When i'm trying to use divideSeries I'm always got VallueError(divideSeries second argument must reference exactly 1 series)
A new function divideSeriesLists was introduced in Graphite 1.0.2 for dividing one series with another. Both the series should be of same length.
You can use mapSeries with divideSeries to do vector matching of series in Graphite (or maybe asPercent depending on which version of graphite you are using).
An example query:
aliasByNode(reduceSeries(mapSeries(groupByNodes(parser.*.{sections,errors}.total, 'maxSeries', 1, 2), 0), 'asPercent', 1, 'sections', 'errors'), 0)
I'm not sure what aggregation function you are using so substitute maxSeries for the function you need.
Check out this blog post about using mapSeries with divideSeries for more explanation.
Here is an example from our system in the Grafana query editor:

How do I retrieve aggregate measures from R when I need to pass disaggregated data in Tableau?

I have extensively read and re-read the Troubleshooting R Connections and Tableau and R Integration help documents, but as a new Tableau user they just aren't helping me.
I need to be able to calculate Kaplan-Meier survival probabilities across any dimensions that are dragged onto the sheet. Ideally, I would be able to retrieve this in a tabular format at multiple time points, but for now, I would be happy just to get it at a single time point.
My data in Tableau have columns for [event-boolean] and [time to event]. Let's say I also have columns for Gender and District.
Currently, I have a calculated field [surv] as:
SCRIPT_REAL('
library(survival);
fit <- summary(survfit(Surv(.arg2,.arg1) ~ 1), times=365);
fit$surv'
, min([event-boolean])
, min([time to event])
)
I have messed with Computed Using, Addressing, Partitions, Aggregate Measures, and parameters to the R function, but no combination I have tried has worked.
If [District] is in Columns, do I need to change my SCRIPT_REAL call or do I just need to change some other combination of levers?
I used Andrew's solution to solve this problem. Essentially,
- Turn off Aggregate Measures
- In the Measure Values shelf, select Compute Using > Cell
- In the calculated field, start with If FIRST() == 0 script_*() END
- Ctrl+drag the measure to the Filters shelf and use a Special > Non-null filter.

Transform for graphite counter

I'm using the incr function from the python statsd client. The key I'm sending for the name is registered in graphite but it shows up as a flat line on the graph. What filters or transforms do I need to apply to get the rate of the increments over time? I've tried an apply function > transform > integral and an apply function > special > aggregate by sum but no success yet.
Your requested function is "Summarize" - see it over here: http://graphite.readthedocs.org/en/latest/functions.html
In order to the totals over time just use the summarize functions with the "alignToFrom =
true".
For example:
You can use the following metric for 1 day period:
summarize(stats_counts.your.metrics.path,"1d","sum",true)
See graphite summarize datapoints
for more details.
The data is there, it just needs hundreds of counts before you start to be able to see it on the graph. Taking the integral also works and shows number of cumulative hits over time, have had to multiple it by x100 to get approximately the correct value.

Is there a way to Ignore values by range when displaying in composer?

Is there a function in the Graphite URL API which allows us to ignore values which are inside (or outside) a certain range?
I believe you can check out the removeAboveValue and removeBelowValue functions.
For example, to exclude values below 2 and above 10:
http://host/render&target=removeAboveValue(removeBelowValue(a.b.c, 2), 10)
Ignoring values inside a range is a little more difficult, but it can probably be achieved by summing series where data has previously been filtered out (untested):
http://host/render&target=sum(removeAboveValue(a.b.c, 2), removeBelowValue(a.b.c, 10))

Resources