I want to extract some hourly data from rrdtool databases in order to create some graphs within a dashboard system.
These databases don't have an hourly datasource; the closest is a 30-minute datasource (they are generated by munin).
Now, I can use rrdfetch, but that doesn't do the nice averaging that rrdgraph would do, so something like this
rrdtool fetch xxx-apache_accesses-accesses80-d.rrd AVERAGE \
--resolution 3600 -s 1328458200 -e 1328544600
might give me 30-minute data points like this:
2012-Feb-05 16:30:00 3.5376357135e+00
2012-Feb-05 17:00:00 3.4655067194e+00
2012-Feb-05 17:30:00 4.0483210375e+00
2012-Feb-05 18:00:00 4.3210061422e+00
....
I could average those myself, but it seems that rrdgraph can output parsable text; I just can't figure out the correct incantation. Here's what I've tried:
rrdtool graph dummy.png -s 1328523300 -e 1328609700 \
DEF:access=xxx-apache_accesses-accesses80-d.rrd:42:AVERAGE \
"PRINT:access:AVERAGE: %5.1lf %S"
outputs
0x0
4.7
Now I think that 4.7 is simply the average for the whole period given (the 0x0 line is just the image size that rrdtool graph prints), but is there any way to get rrdtool to spit out an average for particular chunks or step sizes? I tried --step, but it did not change the output.
I could call rrdtool graph for each data point I need, but that seems rather wasteful.
No sooner had I posted than I hit upon the right approach!
rrdtool xport -s 1328523300 -e 1328609700 --step 3600 \
DEF:access=xxx-apache_accesses-accesses80-d.rrd:42:AVERAGE \
XPORT:access:"average"
This gives me the dump I need...
<?xml version="1.0" encoding="ISO-8859-1"?>
<xport>
<meta>
<start>1328526000</start>
<step>3600</step>
<end>1328612400</end>
<rows>25</rows>
<columns>1</columns>
<legend>
<entry>average</entry>
</legend>
</meta>
<data>
<row><t>1328526000</t><v>2.1949556516e+00</v></row>
<row><t>1328529600</t><v>2.0074586816e+00</v></row>
<row><t>1328533200</t><v>2.4574720485e+00</v></row>
<row><t>1328536800</t><v>3.4861890250e+00</v></row>
<row><t>1328540400</t><v>4.2725023347e+00</v></row>
<row><t>1328544000</t><v>6.2119548259e+00</v></row>
<row><t>1328547600</t><v>5.6709432075e+00</v></row>
<row><t>1328551200</t><v>6.1214185470e+00</v></row>
<row><t>1328554800</t><v>8.1137357347e+00</v></row>
<row><t>1328558400</t><v>5.8345894022e+00</v></row>
<row><t>1328562000</t><v>6.2264732776e+00</v></row>
<row><t>1328565600</t><v>6.1652113350e+00</v></row>
<row><t>1328569200</t><v>5.8851025574e+00</v></row>
<row><t>1328572800</t><v>5.4612112119e+00</v></row>
<row><t>1328576400</t><v>6.3908056120e+00</v></row>
<row><t>1328580000</t><v>6.0361776174e+00</v></row>
<row><t>1328583600</t><v>6.3164590113e+00</v></row>
<row><t>1328587200</t><v>6.0902986521e+00</v></row>
<row><t>1328590800</t><v>4.6756445168e+00</v></row>
<row><t>1328594400</t><v>3.9461916905e+00</v></row>
<row><t>1328598000</t><v>2.9449490046e+00</v></row>
<row><t>1328601600</t><v>2.4011760751e+00</v></row>
<row><t>1328605200</t><v>2.2187817639e+00</v></row>
<row><t>1328608800</t><v>2.1775208736e+00</v></row>
<row><t>1328612400</t><v>NaN</v></row>
</data>
</xport>
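If your rrdtool is new enough, xport can also emit this dump as JSON instead of XML (check the rrdxport man page for the --json option), which may be easier to feed into a dashboard; the same command would then look like this:
rrdtool xport --json -s 1328523300 -e 1328609700 --step 3600 \
DEF:access=xxx-apache_accesses-accesses80-d.rrd:42:AVERAGE \
XPORT:access:"average"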
Summary
I need help getting NCO tools to be helpful. I'm running into the error
"One or more variable sizes violate format constraints"
... when trying to just view the list of variables in the file with:
ncdump -h isrm_v1.2.1.ncf
It seems odd to trip on this when I'm not asking for any large variables to be read ... just metadata. Are there any flags I should or could be passing to avoid this error?
Reprex
isrm_v1.2.1.ncf (165 GB) is available on Zenodo.
Details
I've just installed the NCO suite via brew install nco --build-from-source on a Mac (I know, I know) running macOS 11.6.5. ncks --version says 5.0.6.
Tips appreciated. I've been trawling through the ncks docs for a couple of hours without much insight. A friend was able to slice the file on a different system running actual Linux, so I'm pretty sure my NCO install is to blame.
How can I dig in deeper to find the root cause? NCO tools don't seem very verbose. I understand there are different sub-formats of NetCDF (3, 4, ...) but I'm not even sure how to verify the version/format of the .nc file that I'm trying to access.
My larger goal is to be able to slice it, like ncks -v pNH4 -d layer,0 isrm_v1.2.1.ncf pNH4L0.nc, but if I can't even view metadata, I'm thinking I need to solve that first.
The more-verbose version of the error message, for the record, is:
HINT: NC_EVARSIZE errors occur when attempting to copy or aggregate input files together into an output file that exceeds the per-file capacity of the output file format, and when trying to copy, aggregate, or define individual variables that exceed the per-variable constraints of the output file format. The per-file limit of all netCDF formats is not less than 8 EiB on modern computers, so any NC_EVARSIZE error is almost certainly due to violating a per-variable limit. Relevant limits: netCDF3 NETCDF_CLASSIC format limits fixed variables to sizes smaller than 2^31 B = 2 GiB ~ 2.1 GB, and record variables to that size per record. A single variable may exceed this limit if and only if it is the last defined variable. netCDF3 NETCDF_64BIT_OFFSET format limits fixed variables to sizes smaller than 2^32 B = 4 GiB ~ 4.2 GB, and record variables to that size per record. Any number of variables may reach, though not exceed, this size for fixed variables, or this size per record for record variables. The netCDF3 NETCDF_64BIT_DATA and netCDF4 NETCDF4 formats have no variable size limitations of real-world import. If any variable in your dataset exceeds these limits, alter the output file to a format capacious enough, either netCDF3 classic with 64-bit offsets (with -6 or --64), to PnetCDF/CDF5 with 64-bit data (with -5), or to netCDF4 (with -4 or -7). For more details, see http://nco.sf.net/nco.html#fl_fmt
Tips appreciated!
ncdump is not an NCO program, so I can't help you there, except to say that printing metadata should not cause an error in this case. Try ncks -m in.nc instead of ncdump -h in.nc.
Nevertheless, the hyperslab problem you have experienced is most likely due to trying to shove too much data into a netCDF format that can't hold it. The generic solution to that is to write the data to a more capacious netCDF format:
Try either one of these commands:
ncks -5 -v pNH4 -d layer,0 isrm_v1.2.1.ncf pNH4L0.nc
ncks -7 -v pNH4 -d layer,0 isrm_v1.2.1.ncf pNH4L0.nc
Formats are documented at http://nco.sf.net/nco.html#fl_fmt (the same link given in the hint above).
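As a side note on the other part of the question (how to verify which netCDF variant the file uses): if the standard netCDF utilities are installed, ncdump -k reports the on-disk format kind; this is a netCDF tool rather than part of NCO, so treat it as an aside:
# Prints the file's format kind: "classic", "64-bit offset", "cdf5" or "netCDF-4"
ncdump -k isrm_v1.2.1.ncf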
I would like my conky system monitor to show a time series of rain forecast probabilities as a graph. I request the forecast data via the darksky API and format it to CSV with jq like this:
curl "https://api.darksky.net/forecast/<myapikey>/<mylat>,<mylon>" |
jq '.minutely.data | map([.time, .precipProbability] | join(",")) | join("\n") ' |
sed 's/"//g' | sed 's/\\n/\n/g'
which produces output like this
1552253100,0
1552253160,0
1552253220,0
1552253280,0
1552253340,0
1552253400,0.01
1552253460,0.03
...
Is there a way to display this data in conky with ${execgraph ...} or similar? As far as I understand, you can only pass a single value at a time to update execgraph, but I want to display an entire time series at once.
At the moment I pass the data to gnuplot, produce a graph and include it in conky as an ${image ...} which works alright, but perhaps there is a native conky solution.
If displaying the probabilities from when conky starts and thereafter is sufficient, you could use ${execgraph ...} and just pass the latest value in the series each time conky updates.
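A minimal sketch of that approach, assuming a hypothetical helper script ~/bin/rain_prob.sh (the name and path are illustrative, not from the post) that prints the current minute's probability as a percentage:
#!/bin/sh
# rain_prob.sh (hypothetical helper): print the newest minutely precipProbability, scaled to 0-100
curl -s "https://api.darksky.net/forecast/<myapikey>/<mylat>,<mylon>" |
jq '.minutely.data[0].precipProbability * 100'
In the conky configuration, ${execgraph ~/bin/rain_prob.sh} would then append one new sample to the graph on every conky update (the conky man page documents the optional height, width, colour and scale arguments for execgraph).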
Our monitoring system dumps metrics into Graphite once per minute, and has a retention of 1min:2d,5min:20d,30min:120d,6h:2y. However, I've recently added monitors that run on a 5-minute period, and I've found that:
The 1 minute points are four zeroes and an actual value, repeating of course.
The 5+ minute points are all zeroes, likely because my xFilesFactor is higher than 0.2 and the aggregation just doesn't happen at all.
What I'd like to do is simply create a new Whisper file with the new retentions (and no wasted space) and then import/re-aggregate the data into it. From what I've found, whisper-resize.py is supposed to be the right tool.
As a test I've been doing:
whisper-resize.py \
--newfile=/tmp/foo.wsp \
--aggregate --aggregationMethod=max \
--xFilesFactor=0.1 \
--force \
quotas/us-central1CPUS/CPUS.wsp \
5min:20d 30min:120d 6h:2y
But after this operation completes, foo.wsp is just filled with zeroes.
What's the deal?
You just need to change the xFilesFactor on the target file, like:
whisper-resize.py --xFilesFactor=0.0 --nobackup quotas/us-central1CPUS/CPUS.wsp 1min:2d 5min:20d 30min:120d 6h:2y
You will not waste space: the whisper format has a fixed file size anyway. Please see the details at http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-9
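To confirm the change took effect, whisper-info.py (shipped alongside whisper-resize.py) prints the file's aggregation method, xFilesFactor and archive retentions, for example:
whisper-info.py quotas/us-central1CPUS/CPUS.wsp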
I have some RSS feeds that are periodically fetched, but the time between each check is essentially random. After each fetch, the number of new items is logged along with the time of the fetch.
I want to make a graph using RRDTool which combines the data from each feed and shows the number of new items per hour over the course of a day.
For instance, I want it to show what the rate is for all feeds between the hours of 17:00 - 18:00, 18:00 - 19:00, etc. I want to use the unit new items/hour.
I have looked at the man pages and a tutorial, but the concept of using multiple data sources and RRAs in this particular scenario is a bit over my head. I understand that I want to input the data using something like this:
rrdtool update feed1.rrd 1445113625:14 # 14 new items
rrdtool update feed2.rrd 1445113627:8 # 8 new items
rrdtool update feed3.rrd 1445113629:22 # 22 new items
rrdtool update feed1.rrd 1445113748:4 # 4 new items
rrdtool update feed2.rrd 1445113750:8 # 8 new items
rrdtool update feed3.rrd 1445113752:16 # 16 new items
How would I set up the rrdtool create and rrdtool graph commands to get the results I am looking for?
You want to use the ABSOLUTE datasource type:
rrdtool create feed1.rrd --step 300 DS:feed:ABSOLUTE:600:0:U RRA:AVERAGE:0.5:1:1000
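The same create command, restated with a comment on what each argument means (just an annotated form of the line above, not a different setup):
# --step 300: one primary data point every 300 seconds
# DS "feed": type ABSOLUTE, heartbeat 600 s, minimum 0, maximum unknown
# RRA: AVERAGE consolidation, xff 0.5, 1 step per row, 1000 rows (about 3.5 days at 300 s)
rrdtool create feed1.rrd --step 300 \
DS:feed:ABSOLUTE:600:0:U \
RRA:AVERAGE:0.5:1:1000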
This assumes you fetch your feed at least once every 600 seconds (the heartbeat in the DS definition).
Once you have done a fetch, you run (N is shorthand for the current time):
rrdtool update feed1.rrd N:$number_of_articles_fetched
To create graphs, you load the data from all the RRDs:
rrdtool graph feed.png \
DEF:f1=feed1.rrd:feed:AVERAGE \
DEF:f2=feed2.rrd:feed:AVERAGE \
CDEF:sum=f1,f2,+ \
AREA:sum#00f:Articles
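One extra detail for the new-items-per-hour unit asked for in the question: ABSOLUTE values are stored and averaged as per-second rates, so scale the sum by 3600 to plot items per hour. A sketch extending the command above with the question's third feed:
rrdtool graph feed.png \
DEF:f1=feed1.rrd:feed:AVERAGE \
DEF:f2=feed2.rrd:feed:AVERAGE \
DEF:f3=feed3.rrd:feed:AVERAGE \
CDEF:perhour=f1,f2,f3,+,+,3600,* \
AREA:perhour#00f:"Articles per hour"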
Is it possible to graph the query resolution time of bind9 in munin?
I know there is a way to graph it on an unbound server; is it already done for bind? If not, how do I start writing a munin plugin for that? I'm getting stats from http://127.0.0.1:8053/ on the bind9 server.
I don't believe that "query time" is a function of BIND. About the only time that I see that value (with individual lookups) is when using dig. If you're willing to use that, the following might be a good starting point:
#!/bin/sh
case $1 in
config)
cat <<'EOM'
graph_title Red Hat Query Time
graph_vlabel time
time.label msec
EOM
exit 0;;
esac
echo -n "time.value "
dig www.redhat.com|grep Query|cut -d':' -f2|cut -d\  -f2
Note that there are two spaces after the "-d\" in the second cut statement. If you save the above as "querytime" and run it at the command line, the output should look something like:
root@pi1:~# ./querytime
time.value 189
root@pi1:~# ./querytime config
graph_title Red Hat Query Time
graph_vlabel time
time.label msec
I'm not sure of the value in tracking the above, though. The response time can be affected by whether the query is an initial lookup or the answer is already cached locally, by server load, by intervening network congestion, and so on.
Note: the above may be a bit buggy as I've written it on the fly, but it should give you a good starting point. That it returned the above output is a good sign.
In any case, I recommend reading the following before you write your own: http://munin-monitoring.org/wiki/HowToWritePlugins
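To actually hook the script into munin-node (the paths below are typical Debian-style defaults; adjust for your distribution), copy it into the plugin directory, make it executable, link it into the active plugin set and restart munin-node:
cp querytime /usr/share/munin/plugins/querytime
chmod +x /usr/share/munin/plugins/querytime
ln -s /usr/share/munin/plugins/querytime /etc/munin/plugins/querytime
munin-run querytime            # sanity check: run the plugin the way munin-node would
systemctl restart munin-node   # or: service munin-node restart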