Graphite returning incorrect datapoint

I downloaded statsd and graphite 0.9.x
I used the statsd client script provided with the statsd source, as follows:
./statsd-client.sh 'development.com.alpha.operation.testing.rate:1|c'
I did the above operation 10 times.
Then I tried querying for a summary of the last 24 hours:
http://example.com/render?format=json&target=summarize(stats.development.com.alpha.operation.testing.rate,
"24hours", "sum",true)&from=-24hours&tz=UTC
I get 1 datapoint as follows:
"datapoints": [[0.0, 1386277560]]}]
Why am I getting 0.0? Even the Graphite Composer does not display anything.
I was expecting a value of "10" as I performed the operation 10 times. What did I do wrong?
storage-schemas.conf
[carbon]
pattern = ^carbon\.
retentions = 60:90d
[default_1min_for_1day]
pattern = .*
retentions = 60s:1d
Please help me understand the problem.
EDIT:
As per the answer below, I changed storage-aggregation.conf, and I get the following output after running whisper-info on metric_file.wsp. But I am still getting "0.0" as the value in the datapoint, and the Graphite browser does not display anything.
maxRetention: 86400
xFilesFactor: 0.0
aggregationMethod: sum
fileSize: 17308
Archive 0
retention: 86400
secondsPerPoint: 60
points: 1440
size: 17280
offset: 28
I also looked at the stats_counts tree as suggested in another answer, but it's the same.
What is wrong with my setup? I am using default settings for everything except the changes to storage-aggregation suggested by an answer below.

The whisper package ships with a script, whisper-info.py. Invoke it on the appropriate metric file:
/whisper-info.py /opt/graphite/storage/whisper/alpha/beta/charlie.wsp
You will get something like this:
maxRetention: 31536000
xFilesFactor: 0.0
aggregationMethod: sum
fileSize: 1261468
Archive 0
retention: 31536000
secondsPerPoint: 300
points: 105120
size: 1261440
offset: 28
Here, make sure that aggregationMethod is sum and xFilesFactor is 0.0. Most probably it is not, since this isn't Graphite's default behavior. Now write a regex that matches your metrics and put it at the beginning of storage-aggregation.conf. This will ensure that newly created metrics follow the new aggregation rule. You can read more about how xFilesFactor works here.
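As a sketch, such an entry near the top of storage-aggregation.conf could look like the following (the section name and pattern are illustrative; adjust the regex to your own metric paths). Note that this only affects metrics created after the change; existing .wsp files keep their old settings and need to be updated with whisper-set-aggregation-method.py or recreated.
[sum_counters]
pattern = ^stats\.development\..*
xFilesFactor = 0.0
aggregationMethod = sum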

Have you tried using the stats_counts tree instead of stats? StatsD populates both for regular counters. stats by default does some fancy averaging which can tend to make low-intensity stat signals disappear, whereas stats_counts just gives you the straight-up count, which sounds like what you want.
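For example, rerunning the question's query against the stats_counts tree (same metric path, just the other prefix) would look something like this:
http://example.com/render?format=json&target=summarize(stats_counts.development.com.alpha.operation.testing.rate,
"24hours", "sum", true)&from=-24hours&tz=UTC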

Related

How to change the interval of a plugin in telegraf?

Using: telegraf version 1.23.1
That's the workflow: Telegraf => Influx => Grafana.
I am using Telegraf to collect metrics on a shared server. So far so good: I was already able to initialize the Telegraf uWSGI plugin and display the data of my running Django projects in Grafana.
Problem
Now I also wanted to check some folder sizes with the [[inputs.filecount]] Telegraf plugin, and this works well too. However, I do not need metrics every 10s for this plugin, so I changed the interval in the [[inputs.filecount]] plugin as mentioned in the documentation.
telegraf.conf
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "5s"
flush_interval = "10s"
flush_jitter = "0s"
#... PLUGIN
[[inputs.filecount]]
# set different interval for this input plugin every 10min
interval=“600s”
collection_jitter=“20s”
# Default from Doc =>
directories = ["/home/myserver/logs", "/home/someName/growingData, ]
name = "*"
recursive = true
regular_only = false
follow_symlinks = false
size = "0B"
mtime = "0s"
After restarting Telegraf with Supervisor, it crashed because it could not parse the new lines.
supervisor.log
Error running agent: Error loading config file /home/user/etc/telegraf/telegraf.conf: Error parsing data: line 208: invalid TOML syntax
These are the lines I added, because I thought that is how the documentation says to do it.
telegraf.conf
# set different interval for this input plugin every 10min
interval=“600s”
collection_jitter=“20s”
Question
So my question is: how can I change or set up the interval for a single input plugin in Telegraf?
Or do I have to use a different TOML syntax, like [[inputs.filecount.agent]]?
I assume that I do not have to change any output interval either? Even though it is currently 10s, if this input plugin only collects data every 600s it should not matter; some flush cycle will push the data to Influx.
How can i change or setup the interval for a single input plugin in telegraf?
As the link you pointed to shows, individual inputs can set the interval and collection_jitter options. There is no difference in the TOML syntax; for example, I can do the following for the memory input plugin:
[[inputs.mem]]
interval="600s"
collection_jitter="20s"
I assume that i do not have to change any output interval also?
Correct, these are independent of each other.
line 208: invalid TOML syntax
Knowing exactly what is on line 208 and around that line will hopefully resolve your issue and get you going again. Also make sure the quotes you used are correct: sometimes when people copy and paste, they get ” instead of ", which can cause exactly this kind of parse error!
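For reference, this is how the [[inputs.filecount]] block from the question would look with plain ASCII quotes (the directories and options are the question's own):
[[inputs.filecount]]
# collect this input every 10 minutes instead of the global 10s
interval = "600s"
collection_jitter = "20s"
directories = ["/home/myserver/logs", "/home/someName/growingData"]
name = "*"
recursive = true
regular_only = false
follow_symlinks = false
size = "0B"
mtime = "0s"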

Graphite not showing correct values

Basically, my problem is that Graphite is not showing the correct data, as if it is not aggregating the data properly.
What I want to do is create a view counter using Graphite.
My Configuration
I'm using Node-StatsD to send data to Carbon:
var client = new StatsD({
host: config.host,
port: config.port,
prefix: config.prefix
});
client.increment("bucketName");
I have the following storage-schemas.conf:
[default]
pattern = .*
retentions = 1h:14d,1d:99y
and everything else is at its default, including storage-aggregation.conf, which by default uses the aggregation method sum and an xFilesFactor of 0.
The Problem
I've confirmed that the data is able to reach Graphite as it did create my bucket and stats.counters.statsd.metrics_received.count does increase.
However, my bucket count does not increase. Here are the results of my queries:
target=bucket.count&rawData=true gives
bucket.count,1553497200,1553583600,3600|0.0,0.0,0.0,...,0.0,0.0
target=summarize(bucket.count,"1d")&format=json gives
{"datapoints": [[0.0, 1553472000], [0.0, 1553558400]], "target":...}
hitcount(bucket.count,"1d") gives
{"datapoints": [[3600.0, 1553497200]], "target":...}
hitcount does give something that is not 0, but the count is far more than what I sent to Graphite: I sent at most one or two dozen increments in the last hour.
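For reference, one way to confirm which retention and aggregation settings the counter actually got is to run whisper-info.py on its .wsp file, as shown in the first question above. The path below is an assumption based on a default Graphite install and the metric living at bucket.count:
whisper-info.py /opt/graphite/storage/whisper/bucket/count.wsp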

Configuring scollector to get different frequences for different collectors

I'm working with scollector and I want to have specific frequencies for different collectors.
For example:
get info from disk usage every 5 minutes
info from memory every minute
iostat every 30 seconds
and so on...
Here is a part of the conf.toml I made:
FullHost = true
Freq = 60
DisableSelf = true
[[iostat]]
Filter = "iostat"
Freq = 30
[[memory]]
Filter = "memory"
Freq = 60
But I get this error:
./scollector -conf="perso.toml" -p
2016/04/19 14:40:45 fatal: main.go:297: extra keys in perso.toml: [iostat iostat.Freq memory memory.Freq]
It seems that I cannot set multiple frequencies.
What should I do to get what I want?
Thank you all
According to scollector documentation, Freq is a global setting, so it's not possible to set different frequencies for each collector. The exception is for external collectors, which may be put in a folder named after the desired frequency (in seconds).
Freq is indeed a global setting, and each collector's interval is usually set to it, although some collectors override the interval to different values; e.g. elasticsearch-indices runs every 15 minutes because there's a lot of data to pull.
To change it, either:
(best) hack the scollector code to read and pass a freq parameter to every collector
(second best) file a GitHub issue
(last resort) change the intervals in the scollector code for specific collectors and recompile scollector
Well, we might have found something.
We create different folders representing the various frequencies (0, 30, 60, 120...), and in each folder we put the external collectors we need:
'/etc/collectors/0',
'/etc/collectors/15',
'/etc/collectors/30',
'/etc/collectors/60',
'/etc/collectors/120',
'/etc/collectors/300',
'/etc/collectors/600'
In the conf.toml:
ColDir = "/etc/scollector/collectors"
If we want the internal collectors, we have to rewrite them :(
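As a sketch of what goes into one of those frequency folders: an external collector is just an executable that prints tcollector/OpenTSDB-style lines (metric timestamp value tag=value) to stdout. The script below is a hypothetical disk-usage collector that could live in, say, /etc/scollector/collectors/300:
#!/bin/sh
# Hypothetical external collector: reports root filesystem usage in percent.
# scollector reads tcollector-style lines from stdout: metric timestamp value tag=value
now=$(date +%s)
used=$(df -P / | awk 'NR==2 { gsub("%", "", $5); print $5 }')
echo "custom.disk.fs.usage_percent $now $used mount=/"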

Munin: What does the 'm' mean in the y-axis of nginx requests?

I have the following munin-generated graph of nginx requests:
What does the 'm' in the y-axis mean?
The nginx munin plugin at /usr/share/munin/plugins/nginx_request is extracting:
if ($response->content =~ /^\s+(\d+)\s+(\d+)\s+(\d+)/m) {
print "request.value $3\n";
Which means it is taking the third component of nginx_status, which appears to be the total accumulated request count. Here is an example execution from this same server:
$ curl http://127.0.0.1/nginx_status
Active connections: 1
server accepts handled requests
2936 2936 4205
Reading: 0 Writing: 1 Waiting: 0
The munin nginx plugin is passing the following to rrdtool:
print "graph_title Nginx requests\n";
print "graph_args --base 1000\n";
print "graph_category nginx\n";
print "graph_vlabel Request per second\n";
print "request.label req/sec\n";
print "request.type DERIVE\n";
print "request.min 0\n";
print "request.label requests port $port\n";
print "request.draw LINE2\n";
The 'm' is the 'milli' prefix for the units. So, 400 m means 0.400.
By default, RRDTool uses SI prefixes -- 2000 is shown as 2k, 0.01 is shown as 10m, and so on. This isn't normally an issue, except when there are no units or the thing being measured doesn't make sense in fractional parts.
The way to stop this behaviour is to not use the %s in the GPRINT (this fixes the legend), and to use the --units-exponent=0 option (this fixes the Y-axis). I don't know whether it is possible to make munin do this directly, but you might be able to modify the plugin to add '--units-exponent 0' to the graph_args.
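For example (an untested sketch), the change would be to the graph_args line in /usr/share/munin/plugins/nginx_request quoted above:
print "graph_args --base 1000 --units-exponent 0\n";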

Writing the total request time in seconds to an nginx access log, possibly using a calculated variable

I'm trying to modify my nginx access log format to include the request duration, in seconds.
I see two possible variables I could use:
1) $request_time
2) $upstream_response_time
However both of these variables are expressed in microseconds, and I need this value to be rendered in seconds. Is there any way to specify the output as an expression (i.e. $request_time * 1000) or accomplish this in some other way?
Thanks
The webserver is not a calculator or statistical program. Its logging function is to provide the raw data you can do your analysis with. If your analysis program is incapable of converting microseconds to seconds, you should shop around for other software. In any case, it is unrealistic to expect a program's logging function to perform unit conversions for you. The goal of logging is not to format, but to record what the server has done without impacting the performance of its core functionality.
If you use a log shipper like Logstash (ELK stack), you can do the calculation when parsing the log. Here is my example converting seconds into milliseconds in my Logstash filter for nginx:
grok {
match => {
"message" => "%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"%{WORD:verb} %{URIPATHPARAM:logMessage} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:\"(?:%{URI:referrer}|-)\"|%{QS:referrer}) %{QS:agent} rt=%{NUMBER:duration_sec} cid=%{GREEDYDATA:correlationId}"
}
}
mutate { convert => [ "duration_sec", "float" ] }
ruby { code => "event['duration_ms'] = event['duration_sec'].to_f * 1000" }
Hope this helps.
As noted in the comments, $request_time is already in seconds; however, it is possible to convert to another unit purely in nginx config, as follows.
I realise this is an old question but it seems to get a lot of traffic so perhaps the below will help someone else who, like me, might want to convert to a different unit, in my case nanoseconds (though if you want milliseconds then simply omit the 6 trailing zeros).
This is useful for sending to the Elastic event.duration field - to avoid the faff of an ingest pipeline or similar on the Elastic end (to multiply by 1000000) you can instead do some hideous regex as below.
The fact that $request_time is always zero-padded to 3 decimal places helps, so in the common cases you can append 6 zeros - but you also have to handle leading zeros (before & after the decimal point) to make sure the result doesn't end up with any leading zeros:
map $request_time $request_time_nanos {
# Simple case for 0.000
~^0\.000$ 0;
# If 0 before decimal place, must remove leading zeros after it, before adding 6 zeros
~^0\.(?:0*)([^0].*)$ $1000000;
# Otherwise just concatenate the full pre- & post-decimal parts, before adding 6 zeros
~^([^0][^.]*)\.(.*)$ $1$2000000;
}
Example transformations (commas inserted for readability only):
[sec] [nanoseconds]
0.000 => 0
0.110 => 110,000,000
0.010 => 10,000,000
1.010 => 1,010,000,000
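To actually write the converted value, reference the mapped variable in a log_format (the format name and the other fields below are just illustrative; the map block must sit at the http level, alongside the log_format):
log_format timed_nanos '$remote_addr - $remote_user [$time_local] "$request" '
                       '$status $body_bytes_sent rt_ns=$request_time_nanos';
access_log /var/log/nginx/access.log timed_nanos;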

Resources