Graphite/Carbon not retaining data for some statistics

I have a Carbon/Graphite stack with some very basic retention schemas set up. These retention periods work fine, apart from a couple of statistics, which only appear to last for a week.
My storage-schemas.conf:
[carbon]
pattern = ^carbon\.
retentions = 60:90d
[collectd]
pattern = ^collectd.*
retentions = 10s:2d,1m:14d,5m:1y
And my storage-aggregation.conf:
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max
[sum]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
All stats arrive prefixed with collectd., so the retention patterns are correct. When viewing an affected dashboard in Grafana I see the following in graphite's cache.log:
Thu Oct 13 11:25:16 2016 :: CarbonLink cache-query request for collectd.host_domain_com.openstack-keystone-totals.gauge-users-count returned 0 datapoints
Using whisper-info.py on an affected .wsp shows the following:
maxRetention: 31536000
xFilesFactor: 0.5
aggregationMethod: average
fileSize: 1710772
Archive 0
retention: 172800
secondsPerPoint: 10
points: 17280
size: 207360
offset: 52
Archive 1
retention: 1209600
secondsPerPoint: 60
points: 20160
size: 241920
offset: 207412
Archive 2
retention: 31536000
secondsPerPoint: 300
points: 105120
size: 1261440
offset: 449332
Can anyone suggest anything I may have missed?

So the answer to this comes down to a couple of issues. Firstly, the data points are being submitted with -count on the end of the name instead of .count, so the [sum] aggregation rule never matches and the default [default_average] rule is applied instead. Secondly, because we're not submitting data every 10 seconds, and because the default rule has an xFilesFactor of 0.5, the data is aggregated when it crosses the retention boundary; since fewer than 50% of the expected data points are present, a null value is stored instead.
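One way to fix this going forward (a sketch, assuming the submitted metric names really do end in -count) is an aggregation rule matching that suffix, placed above the [default_average] catch-all, since the first matching pattern wins:
[sum_dash_count]
pattern = -count$
xFilesFactor = 0
aggregationMethod = sum
Note that storage-aggregation.conf is only consulted when a Whisper file is created, so existing .wsp files keep their average/0.5 settings and would need to be adjusted with the whisper utilities (whisper-set-aggregation-method.py, and whisper-resize.py for the xFilesFactor) or simply recreated.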

Related

Name 'total' is not defined

What should I do differently? The result is "line 12, print(total): NameError: name 'total' is not defined".
def gross_pay (hours,rate):
    info =()
    info = getUserInfo()
    rate = float(input('How much do you make an hour?:'))
    hours = int(input('How many hours did you work?:'))
    total = rate * hours
    taxes = total * 0.05
    total = total - taxes
print(total)
total is a local variable; it doesn't exist outside the function. You also need to call the function, and it should return total rather than printing it. getUserInfo() is not defined and info is unused. Asking for the input parameters inside the function is incorrect as well. Technically, pay after taxes is net pay, not gross:
def net_pay(hours, rate):
    total = rate * hours
    taxes = total * 0.05
    return total - taxes

rate = float(input('How much do you make an hour? '))
hours = int(input('How many hours did you work? '))
print(net_pay(hours, rate))
Output:
How much do you make an hour? 10.50
How many hours did you work? 40
399.0
def gross_pay(hours, rate):
    info = ()
    # getUserInfo() should also be defined in your code:
    info = getUserInfo()
    rate = float(input('How much do you make an hour?:'))
    hours = int(input('How many hours did you work?:'))
    total = rate * hours
    taxes = total * 0.05
    total = total - taxes
    print(total)

# calling the declared (defined) function with the two arguments it expects:
hours = 0
rate = 0
gross_pay(hours, rate)
I'm assuming you're passing the parameters hours and rate because you're going to need the values later; otherwise they're not necessary, since you're asking for input inside the gross_pay function anyway.

Graphite does not show old stats

We are using Graphite to store stats about our websites. Everything works fine when we want to see the data for the last 24 hours or 7 days. When we try to look at the last month's data, Graphite does not show any data.
We collect the data for one metric every 5 minutes and for the others once an hour.
When I use the GUI this "query" works:
width=1188&height=580&target=identifierXYP.value&lineMode=connected&from=-8days
And this one does not return any data
width=1188&height=580&target=identifierXYP.value&lineMode=connected&from=-9days
The only thing that changed was the "from" part.
I already ran
find ./ -type f -name '*.wsp' -exec whisper-resize.py --nobackup {} 5m:365d \;
but it did not help.
whisper-info.py value.wsp outputs:
maxRetention: 157680000
xFilesFactor: 0.5
aggregationMethod: average
fileSize: 2521504
Archive 0
retention: 691200
secondsPerPoint: 10
points: 69120
size: 829440
offset: 64
Archive 1
retention: 2678400
secondsPerPoint: 60
points: 44640
size: 535680
offset: 829504
Archive 2
retention: 31536000
secondsPerPoint: 600
points: 52560
size: 630720
offset: 1365184
Archive 3
retention: 157680000
secondsPerPoint: 3600
points: 43800
size: 525600
offset: 1995904
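Note that the 8-day boundary lines up exactly with Archive 0: 691200 seconds is 8 days, so anything older has to come from Archive 1's 60-second rollups. Presumably those rollups are stored as nulls, since with an xFilesFactor of 0.5 a 60-second slot needs 3 of its 6 ten-second points present, and data arriving every 5 minutes (or hourly) can never satisfy that. A quick check of the archive spans (plain Python, values taken from the whisper-info.py output above):
# (secondsPerPoint, points) for each archive in the whisper-info.py output
archives = [(10, 69120), (60, 44640), (600, 52560), (3600, 43800)]
for spp, points in archives:
    print(f"{spp:>4} s/point x {points:>5} points = {spp * points / 86400:g} days")
This prints 8, 31, 365 and 1825 days respectively, so the -8days/-9days cutoff is exactly the Archive 0 boundary.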

Overall count not as expected in Grafana dashboard

I have the following singlestat for capturing the overall count of "lookups".
When I select a date range, e.g. Nov 25, 2015 10:49:23 to Nov 25, 2015 19:31:18, I get a value of 82 (see image).
I am using the following metric:
When I zoom out to a greater date range, e.g. Nov 24, 2015 08:54:39 to Nov 25, 2015 19:42:18, the value of 82 "Lookups" drops to 41.
See image:
I was expecting the count to remain at 82, as I am getting an overall count.
I know the issue is not to do with Graphite or Grafana, so what am I doing wrong?
Update
This is my storage-aggregation.conf
[min]
pattern = \.lower$
xFilesFactor = 0.1
aggregationMethod = min
[max]
pattern = \.upper(_\d+)?$
xFilesFactor = 0.1
aggregationMethod = max
[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum
[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
[count_legacy]
pattern = ^stats_counts.*
xFilesFactor = 0
aggregationMethod = sum
[default_average]
pattern = .*
xFilesFactor = 0.3
aggregationMethod = average
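A likely mechanism here (an assumption, since the metric itself is only shown in the images) is consolidation: when the requested range contains more datapoints than Graphite is asked to return, neighbouring points are merged, averaging by default, so a summed count is roughly halved every time the point density doubles. Wrapping the target in Graphite's consolidateBy function keeps totals intact; with a hypothetical counter metric:
target=consolidateBy(stats.counters.lookups.count, 'sum')
The rollup between Whisper archives is a separate step, governed by the storage-aggregation.conf above, which already sums .count metrics.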

Hash Table + Binary Search

I'm using a hash table to store some values. Here are the details:
There will be roughly 1M items to store (not known in advance, so no perfect hash is possible).
The table is 10M slots large.
The hash function is MurmurHash3.
I did some tests and, storing 1M values, I get 350,000 collisions and 30 elements in the most-colliding hash table slot.
Are these results good?
Would it make sense to implement binary search for the lists that build up at colliding slots?
What's your advice for improving performance?
EDIT: Here is my code
var
  HashList: array [0..10000000 - 1] of Integer;
  I, Y: Integer;
  TotalCollisionsCount: Integer = 0;
  MostCollidingSlotItemCount: Integer = 0;
begin
  for I := 0 to High(HashList) do
    HashList[I] := 0;
  for I := 1 to 1000000 do
  begin
    Y := MurmurHash3(UIntToStr(I));
    Y := Y mod Length(HashList);
    Inc(HashList[Y]);
    if HashList[Y] > 1 then
      Inc(TotalCollisionsCount);
    if HashList[Y] > MostCollidingSlotItemCount then
      MostCollidingSlotItemCount := HashList[Y];
  end;
  Writeln('Total: ' + IntToStr(TotalCollisionsCount) + ' Max: ' + IntToStr(MostCollidingSlotItemCount));
end.
Here is the result I get:
Total: 48169 Max: 5
Am I missing something?
This is what you get when you put 1M items randomly into 10M cells:
calendar_size=10000000  nperson=1000000
E/cell |    Ncell |     frac |   Nelem |     frac | h/cell |     hops |  Cumhops
-------+----------+----------+---------+----------+--------+----------+---------
    0: |  9048262 | 0.904826 |       0 | 0.000000 |      0 |        0 |        0
    1: |   905064 | 0.090506 |  905064 | 0.905064 |      1 |   905064 |   905064
    2: |    45136 | 0.004514 |   90272 | 0.090272 |      3 |   135408 |  1040472
    3: |     1488 | 0.000149 |    4464 | 0.004464 |      6 |     8928 |  1049400
    4: |       50 | 0.000005 |     200 | 0.000200 |     10 |      500 |  1049900
-------+----------+----------+---------+----------+--------+----------+---------
  sum: | 10000000 |          | 1000000 |          |        | 1.049900 |  1049900
The left column is the number of items in a cell; the second is the number of cells having that item count.
WRT the binary search: it is obvious that for short chains like these (maximum chain length 4, and most chains of length 1), linear search outperforms binary search. The crossover point is probably somewhere between 10 and 100 elements.
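Incidentally, those numbers are just a Poisson distribution with lambda = n/m = 0.1. A minimal sketch (Python) reproducing the expected occupancy counts:
import math

n, m = 1_000_000, 10_000_000   # items, table slots
lam = n / m                    # expected load per slot

for k in range(5):
    # Poisson probability that a given slot ends up holding exactly k items
    p = math.exp(-lam) * lam ** k / math.factorial(k)
    print(f"{k} items: ~{round(m * p):>7} cells")

# expected number of insertions that land in an already-occupied slot
print("expected collisions:", round(n - m * (1 - math.exp(-lam))))
This predicts roughly 9048374 / 904837 / 45242 / 1508 / 38 cells holding 0-4 items, matching the simulated table above, and about 48374 collisions, close to the 48169 the questioner measured.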

What datetime format is this?

I have a DateTime structure for an old data format that I don't have access to any specs for. There is a field which indicates the datetime of the data, but it isn't in any format I recognize. It appears to be stored as a 32-bit integer that increments by 20 for each day. Has anyone ever run across something like this?
EDIT:
Example: 1088631936 DEC = 80 34 E3 40 00 00 00 00 HEX = 09/07/2007
EDIT:
First off, sorry for the delay. I had hoped to do stuff over the weekend, but was unable to.
Second, this date format is weirder than I initially thought. It appears to use some sort of exponential or logarithmic encoding, as the dates do not change at a constant rate.
Third, the defunct app that I have for interpreting these values only shows the date portion, so I don't know what the time portion is.
Example data:
(Hex values are big-endian, dates are mm/dd/yyyy)
0x40000000 = 01/01/1900
0x40010000 = 01/01/1900
0x40020000 = 01/01/1900
0x40030000 = 01/01/1900
0x40040000 = 01/01/1900
0x40050000 = 01/01/1900
0x40060000 = 01/01/1900
0x40070000 = 01/01/1900
0x40080000 = 01/02/1900
0x40090000 = 01/02/1900
0x400A0000 = 01/02/1900
0x400B0000 = 01/02/1900
0x400C0000 = 01/02/1900
0x400D0000 = 01/02/1900
0x400E0000 = 01/02/1900
0x400F0000 = 01/02/1900
0x40100000 = 01/03/1900
0x40110000 = 01/03/1900
0x40120000 = 01/03/1900
0x40130000 = 01/03/1900
0x40140000 = 01/04/1900
0x40150000 = 01/04/1900
0x40160000 = 01/04/1900
0x40170000 = 01/04/1900
0x40180000 = 01/05/1900
0x40190000 = 01/05/1900
0x401A0000 = 01/05/1900
0x401B0000 = 01/05/1900
0x401C0000 = 01/06/1900
0x401D0000 = 01/06/1900
0x401E0000 = 01/06/1900
0x401F0000 = 01/06/1900
0x40200000 = 01/07/1900
0x40210000 = 01/07/1900
0x40220000 = 01/08/1900
0x40230000 = 01/08/1900
....
0x40800000 = 05/26/1901
0x40810000 = 06/27/1901
0x40820000 = 07/29/1901
....
0x40D00000 = 11/08/1944
0x40D10000 = 08/29/1947
EDIT: I finally figured this out, but since I've already given up the points for the bounty, I'll hold off on the solution in case anyone wants to give it a shot.
BTW, there is no time component to this, it is purely for storing dates.
It's not an integer, it's a 32-bit floating point number. I haven't quite worked out the format yet; it's not IEEE.
Edit: got it. A 1-bit sign, an 11-bit exponent with an offset of 0x3ff, and a 20-bit mantissa with an implied bit to the left. In C, assuming positive numbers only:
double offset = pow(2, (i >> 20) - 0x3ff) * (((i & 0xfffff) + 0x100000) / (double) 0x100000);
This yields 0x40000000 = 2.0, so the starting date must be 12/30/1899.
Edit again: since you were so kind as to accept my answer, and you seem concerned about speed, I thought I'd refine this a little. You don't need the fractional part of the real number, so we can convert straight to integer using only bitwise operations. In Python this time, complete with test results. I've included some intermediate values for better readability. In addition to the restriction of no negative numbers, this version might have problems when the exponent goes over 19, but this should keep you good until the year 3335.
>>> def IntFromReal32(i):
...     exponent = (i >> 20) - 0x3ff
...     mantissa = (i & 0xfffff) + 0x100000
...     return mantissa >> (20 - exponent)
...
>>> testdata = range(0x40000000, 0x40240000, 0x10000) + range(0x40800000, 0x40830000, 0x10000) + [1088631936]
>>> from datetime import date, timedelta
>>> for i in testdata:
...     print "0x%08x" % i, date(1899, 12, 30) + timedelta(IntFromReal32(i))
...
0x40000000 1900-01-01
0x40010000 1900-01-01
0x40020000 1900-01-01
0x40030000 1900-01-01
0x40040000 1900-01-01
0x40050000 1900-01-01
0x40060000 1900-01-01
0x40070000 1900-01-01
0x40080000 1900-01-02
0x40090000 1900-01-02
0x400a0000 1900-01-02
0x400b0000 1900-01-02
0x400c0000 1900-01-02
0x400d0000 1900-01-02
0x400e0000 1900-01-02
0x400f0000 1900-01-02
0x40100000 1900-01-03
0x40110000 1900-01-03
0x40120000 1900-01-03
0x40130000 1900-01-03
0x40140000 1900-01-04
0x40150000 1900-01-04
0x40160000 1900-01-04
0x40170000 1900-01-04
0x40180000 1900-01-05
0x40190000 1900-01-05
0x401a0000 1900-01-05
0x401b0000 1900-01-05
0x401c0000 1900-01-06
0x401d0000 1900-01-06
0x401e0000 1900-01-06
0x401f0000 1900-01-06
0x40200000 1900-01-07
0x40210000 1900-01-07
0x40220000 1900-01-08
0x40230000 1900-01-08
0x40800000 1901-05-26
0x40810000 1901-06-27
0x40820000 1901-07-29
0x40e33480 2007-09-07
Are you sure that value corresponds to 09/07/2007?
I ask because 1088631936 is the number of seconds from the Unix (et al.) zero date, 01/01/1970 00:00:00, to 06/30/2004 21:45:36.
It seems reasonable to me to think the value is seconds since this usual zero date.
Edit: I know it is very possible for this not to be the correct answer. It is just one (valid) approach, but I think more info is needed (see the comments). Editing this (again) to bring the question to the front in the hope that somebody else will answer it or give ideas. Me: with a fair, sportive and sharing spirit :D
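A quick check of that arithmetic (a minimal Python sketch):
from datetime import datetime, timezone

# interpret the value as seconds since the Unix epoch
print(datetime.fromtimestamp(1088631936, tz=timezone.utc))
# 2004-06-30 21:45:36+00:00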
I'd say that vmarquez is close.
Here are dates 2009-3-21 and 2009-3-22 as unix epochtime:
In [8]: time.strftime("%s", (2009, 3, 21, 1, 1, 0, 0,0,0))
Out[8]: '1237590060'
In [9]: time.strftime("%s", (2009, 3, 22, 1, 1, 0, 0,0,0))
Out[9]: '1237676460'
And here they are in hex:
In [10]: print("%0x %0x" % (1237590060, 1237676460))
49c4202c 49c571ac
If you take only the first 5 hex digits, the growth is 21, which kinda matches your format, no?
Some context would be useful. If your data file looks something like this file, literally or at least figuratively, vmarquez is on the money.
http://www.slac.stanford.edu/comp/net/bandwidth-tests/eventanalysis/all_100days_sep04/node1.niit.pk
That reference is data produced by the Available Bandwidth Estimation tool (ABwE) -- the curious item is that it actually contains that 1088631936 value, as well as the context. That example
date time abw xtr dbcap avabw avxtr avdbcap rtt timestamp
06/30/04 14:43:48 1.000 0.000 1.100 1.042 0.003 1.095 384.387 1088631828
06/30/04 14:45:36 1.100 0.000 1.100 1.051 0.003 1.096 376.408 1088631936
06/30/04 14:47:23 1.000 0.000 1.100 1.043 0.003 1.097 375.196 1088632043
seems to have a seven-hour offset from the suggested 21:45:36 time value. (Probably Stanford local time, running on daylight saving time.)
Well, you've only shown us how your program uses 2 of the 8 digits, so we'll have to assume that the other 6 are ignored (because your program could be doing anything it wants with those other digits).
So, we could say that the input format is:
40mn0000
where m and n are two hex digits.
Then, the output is:
01/01/1900 + floor((2^(m+1)-2) + n*2^(m-3)) days
Explanation:
In each example, notice that incrementing n by 1 increases the number of days by 2^(m-3).
Notice that every time n goes from F to 0, m is incremented.
Using these two rules, and playing around with the numbers, you get the equation above.
(Except for floor, which was added because the output doesn't display fractional days).
I suppose you could rewrite this by replacing the two separate hex variables m and n with a single 2-digit hex number H. However, I think that would make the equation a lot uglier.
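As a quick sanity check (a minimal sketch; days_from_hex is a hypothetical helper implementing the equation above), the formula reproduces the sample dates:
from datetime import date, timedelta

def days_from_hex(v):
    # v has the form 0x40mn0000; pull out the hex digits m and n
    m = (v >> 20) & 0xF
    n = (v >> 16) & 0xF
    # floor((2^(m+1) - 2) + n * 2^(m-3)); int() truncates, which floors positives
    return int((2 ** (m + 1) - 2) + n * 2 ** (m - 3))

for v in (0x40000000, 0x40080000, 0x40200000, 0x40230000, 0x40800000):
    print(hex(v), date(1900, 1, 1) + timedelta(days=days_from_hex(v)))
This prints 1900-01-01, 1900-01-02, 1900-01-07, 1900-01-08 and 1901-05-26, matching the question's sample data.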
