Using RRDTool to graph RSS feeds

I have some RSS feeds that are periodically fetched, but the time between each check is essentially random. After each fetch, the number of new items is logged along with the time of the fetch.
I want to make a graph using RRDTool which combines the data from each feed and shows the number of new items per hour over the course of a day.
For instance, I want it to show what the rate is for all feeds between the hours of 17:00 - 18:00, 18:00 - 19:00, etc. I want to use the unit new items/hour.
I have looked at the man pages and a tutorial, but the concepts of using multiple data sources and RRAs in this particular scenario are a bit over my head. I understand that I want to input the data using something like this:
rrdtool update feed1.rrd 1445113625:14 # 14 new items
rrdtool update feed2.rrd 1445113627:8 # 8 new items
rrdtool update feed3.rrd 1445113629:22 # 22 new items
rrdtool update feed1.rrd 1445113748:4 # 4 new items
rrdtool update feed2.rrd 1445113750:8 # 8 new items
rrdtool update feed3.rrd 1445113752:16 # 16 new items
How would I set up the rrdtool create and rrdtool graph commands to get the results I am looking for?

You want to use the ABSOLUTE datasource type:
rrdtool create feed1.rrd --step 300 DS:feed:ABSOLUTE:600:0:U RRA:AVERAGE:1:1000
This assumes you fetch your feed at least once every 600 seconds.
After each fetch, run:
rrdtool update feed1.rrd N:$number_of_articles_fetched
To create graphs, load the data from all of the RRD files:
rrdtool graph feed.png \
DEF:f1=feed1.rrd:feed:AVERAGE \
DEF:f2=feed2.rrd:feed:AVERAGE \
CDEF:sum=f1,f2,+ \
AREA:sum#00f:Articles
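Note that an ABSOLUTE data source divides each submitted count by the time since the previous update, so the stored values are items per second; multiplying by 3600 gives items/hour (in the graph this could be done with an extra CDEF such as `CDEF:perhour=sum,3600,*`). The arithmetic can be sketched in Python as below. The function name is made up for illustration, and the sketch ignores RRDTool's step normalization and heartbeat handling.

```python
def absolute_rate_per_hour(prev_ts, ts, count):
    """Approximate the rate an ABSOLUTE data source derives for one update,
    scaled from items/second to items/hour.

    Simplification: RRDTool also re-samples values onto step boundaries,
    which is not modelled here.
    """
    interval = ts - prev_ts
    if interval <= 0:
        raise ValueError("timestamps must be strictly increasing")
    per_second = count / interval
    return per_second * 3600


if __name__ == "__main__":
    # The two feed1 updates from the question: 4 new items, 123 seconds apart.
    print(absolute_rate_per_hour(1445113625, 1445113748, 4))
```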

Related

How to know the network traffic my test (using JMeter) is going to generate?

I am going to run a load test using JMeter on Amazon AWS, and I need to know before starting the test how much network traffic it is going to generate.
The criterion in Amazon's policy is:
sustains, in aggregate, for more than 1 minute, over 1 Gbps (1 billion bits per second) or 1 Gpps (1 billion packets per second). If my test is going to exceed this threshold, we need to submit a form before starting the test.
So how can I know whether the test is going to exceed this number or not?
Run your test with 1 virtual user and 1 iteration in command-line non-GUI mode like:
jmeter -n -t test.jmx -l result.csv
To get an approximate figure, open the result.csv file using the Aggregate Report listener. There you will have 2 columns: Received KB/sec and Sent KB/sec. Multiply them by the duration of your test in seconds and you will get the number you're looking for.
Alternatively, you can open the result.csv file using MS Excel, LibreOffice Calc or an equivalent, where you can sum the bytes and sentBytes columns and get the traffic with 1-byte precision.
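The spreadsheet step can also be scripted. Below is a minimal Python sketch, assuming result.csv was written with a header row and contains the timeStamp, elapsed, bytes and sentBytes columns; the function name is hypothetical. It sums the traffic and derives an average bit rate for the run, which can then be scaled by the planned number of virtual users as a rough estimate.

```python
import csv
import io


def traffic_summary(csv_text):
    """Return (received_bytes, sent_bytes, average_bits_per_second)
    for a JMeter CSV result log passed in as text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no samples in result file")
    received = sum(int(r["bytes"]) for r in rows)
    sent = sum(int(r["sentBytes"]) for r in rows)
    # timeStamp is the sample start in milliseconds; elapsed is its duration.
    start_ms = min(int(r["timeStamp"]) for r in rows)
    end_ms = max(int(r["timeStamp"]) + int(r["elapsed"]) for r in rows)
    duration_s = max((end_ms - start_ms) / 1000.0, 0.001)
    bits_per_second = (received + sent) * 8 / duration_s
    return received, sent, bits_per_second


if __name__ == "__main__":
    import os
    if os.path.exists("result.csv"):
        with open("result.csv", newline="") as fh:
            print(traffic_summary(fh.read()))
```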

Display weather forecast data from a weather API in a conky graph

I would like my conky system monitor to show a time series of rain forecast probabilities as a graph. I request the forecast data via the darksky API and format it to csv with jq like this
curl "https://api.darksky.net/forecast/<myapikey>/<mylat>,<mylon>" |
jq '.minutely.data | map([.time, .precipProbability] | join(",")) | join("\n") ' |
sed 's/"//g' | sed 's/\\n/\n/g'
which produces output like this
1552253100,0
1552253160,0
1552253220,0
1552253280,0
1552253340,0
1552253400,0.01
1552253460,0.03
...
Is there a way to display this data in conky with ${execgraph ...} or similar? As far as I understand, you can only pass a single value at a time to update execgraph, but I want to display an entire time series at once.
At the moment I pass the data to gnuplot, produce a graph and include it in conky as an ${image ...} which works alright, but perhaps there is a native conky solution.
If displaying the probabilities from when conky starts and thereafter is sufficient, you could use ${execgraph ...} and just pass the latest value in the series each time conky updates.
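A helper that picks that latest value could look like the Python sketch below. It assumes the CSV lines shown above are available as a list of strings and that ${execgraph ...} expects a number between 0 and 100 (my understanding of conky's default behaviour); the function name is made up for illustration.

```python
def latest_probability(csv_lines, now):
    """Return the precipitation probability (scaled to 0-100) of the newest
    forecast sample whose timestamp is not later than `now`."""
    best = None
    for line in csv_lines:
        line = line.strip()
        if not line:
            continue
        ts_text, prob_text = line.split(",")
        ts = int(ts_text)
        if ts <= now and (best is None or ts > best[0]):
            best = (ts, float(prob_text))
    if best is None:
        return 0  # nothing applicable yet: draw an empty bar
    return round(best[1] * 100)


if __name__ == "__main__":
    import time
    sample = ["1552253400,0.01", "1552253460,0.03"]
    print(latest_probability(sample, int(time.time())))
```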

Resize and re-aggregate whisper stats

Our monitoring system dumps metrics into Graphite once per minute, with a retention of 1min:2d,5min:20d,30min:120d,6h:2y. However, I've recently added monitors that run on a 5-minute period, and I've found that:
The 1 minute points are four zeroes and an actual value, repeating of course.
The 5+ minute points are all zeroes, likely because my xFilesFactor is higher than 0.2 and the aggregation just doesn't happen at all.
What I'd like to do is simply create a new Whisper file with the new retentions (and no wasted space) and then import/re-aggregate the data into it. From what I've found, whisper-resize.py is supposed to be the right tool.
As a test I've been doing:
whisper-resize.py \
--newfile=/tmp/foo.wsp \
--aggregate --aggregationMethod=max \
--xFilesFactor=0.1 \
--force \
quotas/us-central1CPUS/CPUS.wsp \
5min:20d 30min:120d 6h:2y
But after this operation completes, foo.wsp is just filled with zeroes.
What's the deal?
You just need to change the xFilesFactor of the target file, for example:
whisper-resize.py --xFilesFactor=0.0 --nobackup quotas/us-central1CPUS/CPUS.wsp 1min:2d 5min:20d 30min:120d 6h:2y
You will not waste space: the Whisper format has a fixed file size anyway. For details, see http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-9
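To see why the xFilesFactor matters here, the Python sketch below mimics the roll-up rule: a lower-resolution point is only written when the fraction of known higher-resolution points reaches the xFilesFactor. It is a simplified model, not Whisper's actual code, and it assumes the four empty slots per 5-minute window are stored as missing values (None).

```python
def aggregate(points, xff, method=max):
    """Roll up one window of points.

    Returns None (nothing written) when the share of known points is
    below the xFilesFactor, otherwise method(known points).
    """
    known = [p for p in points if p is not None]
    if not points or not known:
        return None
    if len(known) / len(points) >= xff:
        return method(known)
    return None


if __name__ == "__main__":
    window = [None, None, None, None, 7]  # one real value per 5 minutes
    print(aggregate(window, 0.5))  # too few known points for this factor
    print(aggregate(window, 0.0))  # any known point is enough
```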

Titan Graph Queries taking too long to execute

I have a problem with the executing speed of Titan queries.
To be more specific:
I created a property file for my graph using BerkeleyJe which is looking like this:
storage.backend=berkeleyje
storage.directory=/finalGraph_script/graph
Afterwards, I started Gremlin.bat to open my graph.
I set up all the necessary index keys for my nodes:
m = g.getManagementSystem();
username = m.makePropertyKey('username').dataType(String.class).make()
m.buildIndex('byUsername',Vertex.class).addKey(username).unique().buildCompositeIndex()
m.commit()
g.commit()
(all other keys are created the same way...)
I imported a CSV file containing about 100,000 lines; each line produces at least 2 nodes and some edges. All of this is done via batch loading.
That works without a problem.
Then I execute a groupBy query that looks like this:
m = g.V.has("imageLink").groupBy{it.imageLink}{it.in("is_on_image").out("is_species")}{it._().species.groupCount().cap.next()}.cap.next()
With this query I want, for every node with the property key "imageLink", the number of each of the different "species". "Species" are also nodes, and can be reached by going back along the edge "is_on_image" and following the edge "is_species".
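To make the intended result concrete, here is the same aggregation sketched in plain Python directly on the tab-separated input rather than on the graph. It is only an illustration of what the query should return, not Titan code; it assumes one annotation per line with the column order annotationId, species, username, imageLink (as in the loading code in EDIT 1), and the function name is made up.

```python
from collections import Counter, defaultdict


def species_per_image(lines):
    """Map each imageLink to a Counter of species -> number of annotations."""
    counts = defaultdict(Counter)
    for line in lines:
        if not line.strip():
            continue
        annotation_id, species, username, image_link = [
            field.strip() for field in line.split("\t")
        ]
        counts[image_link][species] += 1
    return counts


if __name__ == "__main__":
    rows = ["a1\tcat\tu1\timg1", "a2\tcat\tu2\timg1", "a3\tdog\tu1\timg1"]
    for image, species_counts in species_per_image(rows).items():
        print(image, dict(species_counts))
```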
This also works like a charm for my current nodes. The query takes about 2 minutes on my local PC.
But now to the problem.
My whole dataset is a CSV with 10 million entries. The structure is the same as above, and each line also creates at least 2 nodes and some edges.
On my local PC I can't even import this set; it fails with a memory exception after 3 days of loading.
So I tried the same on a server with much more RAM. There the import works and takes about 1 day, but the groupBy fails after about 3 days.
I actually don't know whether the groupBy itself fails, or just the connection to the server after that long a time.
So my first Question:
In my opinion, about 15 million nodes shouldn't be that big a deal for a graph database, should it?
Second Question:
Is it normal that it takes so long? Or is there any way to speed it up using indices? I configured the indices as listed above :(
I don't know exactly which information you need to help me, so please just tell me what else you need.
Thanks a lot!
Best regards,
Ricardo
EDIT 1: The way I'm loading the CSV into the graph:
I'm using the code below. I removed some unnecessary properties, which are set on some of the nodes and loaded the same way.
bg = new BatchGraph(g, VertexIDType.STRING, 10000)
new File("annotation_nodes_wNothing.csv").eachLine({ final String line ->
    def (annotationId, species, username, imageLink) = line.split('\t')*.trim()
    // Look up each vertex by its id, creating it on first sight
    def userVertex = bg.getVertex(username) ?: bg.addVertex(username)
    def imageVertex = bg.getVertex(imageLink) ?: bg.addVertex(imageLink)
    def speciesVertex = bg.getVertex(species) ?: bg.addVertex(species)
    def annotationVertex = bg.getVertex(annotationId) ?: bg.addVertex(annotationId)
    userVertex.setProperty("username", username)
    imageVertex.setProperty("imageLink", imageLink)
    speciesVertex.setProperty("species", species)
    annotationVertex.setProperty("annotationId", annotationId)
    // user -> annotation -> image, and annotation -> species
    def classifies = bg.addEdge(null, userVertex, annotationVertex, "classifies")
    def is_on_image = bg.addEdge(null, annotationVertex, imageVertex, "is_on_image")
    def is_species = bg.addEdge(null, annotationVertex, speciesVertex, "is_species")
})
bg.commit()
g.commit()

Getting parsable data from rrdtool graph

I want to extract some hourly data from rrdtool databases in order to create some graphs within a dashboard system.
These databases don't have an hourly datasource; the closest is a 30-minute datasource (they are generated by munin).
Now, I can use rrdtool fetch, but that doesn't do the nice averaging that rrdtool graph would do, so something like this
rrdtool fetch xxx-apache_accesses-accesses80-d.rrd AVERAGE \
--resolution 3600 -s 1328458200 -e 1328544600
Might give me 30 min data points like this
2012-Feb-05 16:30:00 3.5376357135e+00
2012-Feb-05 17:00:00 3.4655067194e+00
2012-Feb-05 17:30:00 4.0483210375e+00
2012-Feb-05 18:00:00 4.3210061422e+00
....
I could average those myself. It seems that rrdgraph can output parsable text, though, but I can't figure out the correct incantation. Here's what I've tried:
rrdtool graph dummy.png -s 1328523300 -e 1328609700 \
DEF:access=xxx-apache_accesses-accesses80-d.rrd:42:AVERAGE \
"PRINT:access:AVERAGE: %5.1lf %S"
outputs
0x0
4.7
Now I think that's simply the average for the period given, but is there any way to get rrdtool to spit out an average for particular chunks or step sizes? I tried --step but this did not change the output.
I could call rrdtool graph for each data point I need, but that seems rather wasteful.
No sooner had I posted than I hit upon the right approach!
rrdtool xport -s 1328523300 -e 1328609700 --step 3600 \
DEF:access=xxx-apache_accesses-accesses80-d.rrd:42:AVERAGE \
XPORT:access:"average"
This gives me the dump I need...
<?xml version="1.0" encoding="ISO-8859-1"?>
<xport>
<meta>
<start>1328526000</start>
<step>3600</step>
<end>1328612400</end>
<rows>25</rows>
<columns>1</columns>
<legend>
<entry>average</entry>
</legend>
</meta>
<data>
<row><t>1328526000</t><v>2.1949556516e+00</v></row>
<row><t>1328529600</t><v>2.0074586816e+00</v></row>
<row><t>1328533200</t><v>2.4574720485e+00</v></row>
<row><t>1328536800</t><v>3.4861890250e+00</v></row>
<row><t>1328540400</t><v>4.2725023347e+00</v></row>
<row><t>1328544000</t><v>6.2119548259e+00</v></row>
<row><t>1328547600</t><v>5.6709432075e+00</v></row>
<row><t>1328551200</t><v>6.1214185470e+00</v></row>
<row><t>1328554800</t><v>8.1137357347e+00</v></row>
<row><t>1328558400</t><v>5.8345894022e+00</v></row>
<row><t>1328562000</t><v>6.2264732776e+00</v></row>
<row><t>1328565600</t><v>6.1652113350e+00</v></row>
<row><t>1328569200</t><v>5.8851025574e+00</v></row>
<row><t>1328572800</t><v>5.4612112119e+00</v></row>
<row><t>1328576400</t><v>6.3908056120e+00</v></row>
<row><t>1328580000</t><v>6.0361776174e+00</v></row>
<row><t>1328583600</t><v>6.3164590113e+00</v></row>
<row><t>1328587200</t><v>6.0902986521e+00</v></row>
<row><t>1328590800</t><v>4.6756445168e+00</v></row>
<row><t>1328594400</t><v>3.9461916905e+00</v></row>
<row><t>1328598000</t><v>2.9449490046e+00</v></row>
<row><t>1328601600</t><v>2.4011760751e+00</v></row>
<row><t>1328605200</t><v>2.2187817639e+00</v></row>
<row><t>1328608800</t><v>2.1775208736e+00</v></row>
<row><t>1328612400</t><v>NaN</v></row>
</data>
</xport>
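
For the dashboard side, the XML can be turned into (timestamp, value) pairs with a few lines of Python. This is a minimal sketch using only the standard library; the function name is made up, and it assumes the layout shown above (one <t> and one <v> per <row>), mapping NaN to None.

```python
import math
import xml.etree.ElementTree as ET


def parse_xport(xml_bytes):
    """Parse `rrdtool xport` XML into a list of (timestamp, value) tuples.

    Assumes a single exported column; NaN values become None.
    """
    root = ET.fromstring(xml_bytes)
    rows = []
    for row in root.iter("row"):
        ts = int(row.findtext("t"))
        value = float(row.findtext("v"))
        rows.append((ts, None if math.isnan(value) else value))
    return rows


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="ISO-8859-1"?>'
        b"<xport><data>"
        b"<row><t>1328526000</t><v>2.1949556516e+00</v></row>"
        b"<row><t>1328612400</t><v>NaN</v></row>"
        b"</data></xport>"
    )
    for ts, value in parse_xport(sample):
        print(ts, value)
```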
