Carbon Aggregator re-aggregating metric - graphite

I have the following aggregation rule:
abc.prod.ALL.<service>.<metric>.count (60) = sum abc.local.*.<service>.<<metric>>.count
Given metrics like:
abc.prod.host1.aservice.ametric.count
abc.prod.host2.aservice.ametric.count
I would expect them to be aggregated to
abc.prod.ALL.aservice.ametric.count
But that metric is never created. In aggregator logs, I see
Allocating new metric buffer for abc.prod.ALL.aservice.ametric.count
but it's not created. If I add a layer to the generated metric like:
abc.prod.extralayer.ALL.<service>.<metric>.count (60) = sum abc.local.*.<service>.<<metric>>.count
then we seem to get a recursive explosion of created metrics like:
abc.prod.extralayer.ALL.aservice.ametric.count
abc.prod.extralayer.ALL.ALL.aservice.ametric.count
abc.prod.extralayer.ALL.ALL.ALL.aservice.ametric.count
abc.prod.extralayer.ALL.ALL.ALL.ALL.aservice.ametric.count
Which led me to believe that the generated metric is then aggregated again...
I added a logging line to AggregationProcessor.process:
else:
    log.clients("Found aggregate " + aggregate_metric + " for " + metric)
    aggregate_metrics.add(aggregate_metric)
I then tried again with my original, desired rule, and eventually started to see log lines like:
Found aggregate abc.prod.ALL.aservice.ametric.count for abc.prod.ALL.aservice.ametric.count
It matched itself as if it was a new incoming metric... Why is it being fed back into the aggregator?
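The feedback loop described above can be reproduced outside carbon. What follows is a minimal sketch, not carbon's actual implementation: `rule_to_regex` and `aggregate_name` are hypothetical helpers, and the input pattern is assumed to be `abc.prod.*.<service>.<<metric>>.count` (to match the example metrics) rather than the `abc.local.*` written in the rule. Under those assumptions, the `*` field also matches `ALL`, so an aggregate that is fed back in matches its own rule.

```python
import re


def rule_to_regex(input_pattern):
    """Translate a carbon-style input pattern into a regex (simplified sketch)."""
    parts = []
    for field in input_pattern.split("."):
        if field.startswith("<<") and field.endswith(">>"):
            # <<name>> may span several dot-separated fields
            parts.append("(?P<%s>.+)" % field[2:-2])
        elif field.startswith("<") and field.endswith(">"):
            # <name> captures exactly one field
            parts.append("(?P<%s>[^.]+)" % field[1:-1])
        elif field == "*":
            parts.append("[^.]+")
        else:
            parts.append(re.escape(field))
    return re.compile(r"\.".join(parts) + "$")


def aggregate_name(metric, input_pattern, output_template):
    """Return the aggregate metric name, or None if the rule does not match."""
    match = rule_to_regex(input_pattern).match(metric)
    if match is None:
        return None
    name = output_template
    for field, value in match.groupdict().items():
        name = name.replace("<%s>" % field, value)
    return name


INPUT = "abc.prod.*.<service>.<<metric>>.count"  # assumed input pattern
OUTPUT = "abc.prod.ALL.<service>.<metric>.count"

first = aggregate_name("abc.prod.host1.aservice.ametric.count", INPUT, OUTPUT)
second = aggregate_name(first, INPUT, OUTPUT)  # feed the aggregate back in
print(first)
print(second)
```

With the `extralayer` output template, the same helper maps `abc.prod.extralayer.ALL.aservice.ametric.count` to a name with one more `ALL`, which is consistent with the growing metric names listed earlier.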

This appears to have been a bug. It was not present in older versions but was in master at the time of my question.
If you are seeing this behaviour, follow the issue on GitHub:
https://github.com/graphite-project/carbon/issues/560
https://github.com/graphite-project/carbon/issues/455
There is no point in continuing the question here on SO.
Note: I am using the older version, 0.9.15, and am not seeing the problem, so I recommend that version until the issue is confirmed to be resolved in master.

How to analyze a MariaDB crash log?

When I added a partition to an existing table, I got the exception stack below:
Thread pointer: 0x1ce1ecc6ab8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
mysqld.exe!ha_partition::read_par_file()[ha_partition.cc:3021]
mysqld.exe!ha_partition::get_from_handler_file()[ha_partition.cc:3161]
mysqld.exe!ha_partition::initialize_partition()[ha_partition.cc:512]
mysqld.exe!partition_create_handler()[ha_partition.cc:185]
mysqld.exe!get_new_handler()[handler.cc:302]
mysqld.exe!TABLE_SHARE::init_from_binary_frm_image()[table.cc:2025]
mysqld.exe!open_table_def()[table.cc:698]
mysqld.exe!tdc_acquire_share()[table_cache.cc:842]
mysqld.exe!open_table()[sql_base.cc:1905]
mysqld.exe!open_and_process_table()[sql_base.cc:3802]
mysqld.exe!open_tables()[sql_base.cc:4300]
mysqld.exe!mysql_alter_table()[sql_table.cc:9353]
mysqld.exe!Sql_cmd_alter_table::execute()[sql_alter.cc:510]
mysqld.exe!mysql_execute_command()[sql_parse.cc:6087]
mysqld.exe!Prepared_statement::execute()[sql_prepare.cc:4760]
mysqld.exe!Prepared_statement::execute_loop()[sql_prepare.cc:4246]
mysqld.exe!mysql_sql_stmt_execute()[sql_prepare.cc:3364]
mysqld.exe!mysql_execute_command()[sql_parse.cc:3901]
mysqld.exe!sp_instr_stmt::exec_core()[sp_head.cc:3652]
mysqld.exe!sp_lex_keeper::reset_lex_and_exec_core()[sp_head.cc:3335]
mysqld.exe!sp_instr_stmt::execute()[sp_head.cc:3513]
mysqld.exe!sp_head::execute()[sp_head.cc:1346]
mysqld.exe!sp_head::execute_procedure()[sp_head.cc:2288]
mysqld.exe!do_execute_sp()[sql_parse.cc:3005]
mysqld.exe!Sql_cmd_call::execute()[sql_parse.cc:3247]
mysqld.exe!mysql_execute_command()[sql_parse.cc:6087]
mysqld.exe!sp_instr_stmt::exec_core()[sp_head.cc:3652]
mysqld.exe!sp_lex_keeper::reset_lex_and_exec_core()[sp_head.cc:3335]
mysqld.exe!sp_instr_stmt::execute()[sp_head.cc:3513]
mysqld.exe!sp_head::execute()[sp_head.cc:1346]
mysqld.exe!sp_head::execute_procedure()[sp_head.cc:2288]
mysqld.exe!Event_job_data::execute()[event_data_objects.cc:1459]
mysqld.exe!Event_worker_thread::run()[event_scheduler.cc:312]
mysqld.exe!event_worker_thread()[event_scheduler.cc:268]
mysqld.exe!pthread_start()[my_winthread.c:62]
ucrtbase.dll!_configthreadlocale()
KERNEL32.DLL!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x1ce21cf66b8): alter table mytable add PARTITION (PARTITION t20221103 VALUES LESS THAN (TO_DAYS("2022-11-03")+1))
Connection ID (thread ID): 1649
Status: NOT_KILLED
The platform is Windows, so I cannot debug the core file.
I therefore read the source code, but I still do not fully understand the cause.
The database version is 10.4.6 so here is the source code link:
https://github.com/MariaDB/server/blob/mariadb-10.4.6/sql/ha_partition.cc#L3021
chksum= 0;
for (i= 0; i < len_words; i++)
  chksum ^= uint4korr((file_buffer) + PAR_WORD_SIZE * i);
if (chksum)
  goto err2;
m_tot_parts= uint4korr((file_buffer) + PAR_NUM_PARTS_OFFSET);
DBUG_PRINT("info", ("No of parts: %u", m_tot_parts));
tot_partition_words= (m_tot_parts + PAR_WORD_SIZE - 1) / PAR_WORD_SIZE;
tot_name_len_offset= file_buffer + PAR_ENGINES_OFFSET +
                     PAR_WORD_SIZE * tot_partition_words;
tot_name_words= (uint4korr(tot_name_len_offset) + PAR_WORD_SIZE - 1) /
                PAR_WORD_SIZE; // <--- crashed here
Some macros are involved, so I don't know whether the line number is exact.
It looks like a bug in loading the .par file. The code above already validates the .par file with a checksum, yet MariaDB still crashes afterwards.
So I wonder whether my analysis is right and how to work around the crash. Thanks a lot.

First, search the existing bug reports for 'read_par_file'; there is no obvious match.
Second, looking at the line where it crashed, we can assume it is reading from memory that isn't allocated. The file_buffer was allocated earlier in the file, based on a word length at the beginning.
Third, the git blame of the function shows that nothing has changed in quite a while.
So it's likely a new bug. Please report it, including the SHOW CREATE TABLE output and the .par file.
To confirm, take the SHOW CREATE TABLE mytable output and run it on the latest 10.4 release (containers are good for this), then issue your ALTER TABLE statement again. It's likely to crash, but it's good to confirm.
There does seem to be a lack of checking around some of these offsets with regard to the allocated space.
Given that you appear to be creating daily partitions, have you exceeded the maximum of 8k partitions per table? Either way, an error message should be returned rather than a crash.
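To illustrate the point about missing offset checks, here is a minimal sketch that repeats the arithmetic of the quoted snippet on a byte buffer and adds the bounds check that appears to be absent. It is not MariaDB code: the constant values are assumed to mirror the macros in sql/ha_partition.cc, `check_par_image` and `build_image` are hypothetical helpers, and the images built here are synthetic test data rather than real .par files.

```python
import struct

# Assumed to mirror the macros in sql/ha_partition.cc.
PAR_WORD_SIZE = 4
PAR_NUM_PARTS_OFFSET = 8
PAR_ENGINES_OFFSET = 12


def word_at(buf, offset):
    """Read a little-endian 32-bit word, like uint4korr()."""
    return struct.unpack_from("<I", buf, offset)[0]


def check_par_image(buf):
    """Return "ok" or a short description of the first problem found."""
    if len(buf) < PAR_ENGINES_OFFSET:
        return "header truncated"
    len_words = word_at(buf, 0)
    len_bytes = len_words * PAR_WORD_SIZE
    if len_bytes > len(buf):
        return "declared length exceeds file size"
    chksum = 0
    for i in range(len_words):
        chksum ^= word_at(buf, PAR_WORD_SIZE * i)
    if chksum:
        return "checksum mismatch"
    tot_parts = word_at(buf, PAR_NUM_PARTS_OFFSET)
    tot_partition_words = (tot_parts + PAR_WORD_SIZE - 1) // PAR_WORD_SIZE
    name_len_offset = PAR_ENGINES_OFFSET + PAR_WORD_SIZE * tot_partition_words
    # The extra check: the name-length word must lie inside the buffer.
    if name_len_offset + PAR_WORD_SIZE > len_bytes:
        return "name-length word lies outside the buffer"
    return "ok"


def build_image(num_parts, total_words):
    """Build a synthetic image whose words XOR to zero (test data only)."""
    words = [total_words, 0, num_parts] + [0] * (total_words - 3)
    checksum = 0
    for word in words:
        checksum ^= word
    words[1] = checksum
    return struct.pack("<%dI" % total_words, *words)


print(check_par_image(build_image(1, 5)))
print(check_par_image(build_image(1000, 5)))
```

The second image passes the checksum but declares far more partitions than the buffer can hold, which shows how a checksum-valid file could still lead to a read past the allocation. To try it on a real file, pass the bytes from `open(path, "rb").read()`, keeping in mind that only the part of the layout quoted above is modelled.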

Error while using "EpiEstim" and "ggplot2" libraries

First of all, I must say I'm a complete noob in R, so I apologize in advance for asking for help with such a simple task. My task is to plot a graph of COVID-19 cases for a certain period using data from a CSV file. Unfortunately, at the moment I cannot contact the person from the World Health Organization who provided the data and the script. I am left with an error that I cannot fix either by myself or with the help of Google.
script.R
library(EpiEstim)
library(ggplot2)
COVID<-read.csv("dataset.csv")
res_parametric_si<-estimate_R(COVID$I,method="parametric_si",config=make_config(list(mean_si=4,std_si=3)))
plot(res_parametric_si)
dataset.csv
Date,Suspected per day,Total suspected,Discarded/pending,Confirmed per day,Total confirmed,Deaths per day,Deaths Total,Case fatality rate,Daily confirmed,Recovered per day,Recovered total,Active cases,Tested with PCR,# of PCR tests total,average tests/ 7 days,Inf HCW,Inf HCW/d,Vent HCW,Susp per day
01-Jul-20,1239,91172,45285,889,45887,12,1185,2.58%,889,505,20053,24649,11109,676684,10073,6828,63,,1239
02-Jul-20,1249,92421,45658,876,46763,27,1212,2.59%,876,505,20558,24993,13167,689851,9966,6874,46,,1249
03-Jul-20,1288,93709,46032,914,47677,15,1227,2.57%,914,597,21155,25295,11825,701676,9915.7,6937,63,,1288
04-Jul-20,926,94635,46135,823,48500,22,1249,2.58%,823,221,21376,25875,9934,711610,9957,6990,53,,926
05-Jul-20,680,95315,46272,543,49043,13,1262,2.57%,543,327,21703,26078,6696,718306,9963.7,7030,40,,680
06-Jul-20,871,96186,46579,564,49607,21,1283,2.59%,564,490,22193,26131,9343,727649,10303.9,7046,16,,871
07-Jul-20,1170,97356,46942,807,50414,23,1306,2.59%,807,926,23119,25989,13568,741217,10806,7092,46,,1170
Error
Error in process_I(incid) (script.R#4): incid must be a vector or a dataframe with either i) a column called 'I', or ii) 2 columns called 'local' and 'imported'.
For the example data, the issue seems to be that it covers only 7 data points, while the default configuration assumes it can window over more than 7 days. Note also that the CSV has no column named I, so COVID$I is NULL; read.csv turns the "Daily confirmed" header into Daily.confirmed, which is the column used below. The following code worked for me, in the sense that it does not throw an error.
config <- make_config(incid = COVID$Daily.confirmed,
                      method = "parametric_si",
                      list(mean_si = 4, std_si = 3, t_start = c(2, 3), t_end = c(6, 7)))
res_parametric_si <- estimate_R(COVID$Daily.confirmed, method = "parametric_si", config = config)
plot(res_parametric_si)

Is there a restriction on the no. of HERE API Calls I can make in a loop (using R)

I am trying to loop through a list of origin/destination latitude/longitude pairs to get the transit time. I get the following error when I loop, but a single call (without looping) returns output without error. I use the freemium HERE API and I am allowed 250k transactions a month.
for (i in 1:nrow(test))
{
  call <- paste0("https://route.api.here.com/routing/7.2/calculateroute.json",
                 "?app_id=","appid",
                 "&app_code=","appcode",
                 "&waypoint0=geo!",y$dc_lat[i],",",y$dc_long[i],
                 "&waypoint1=geo!",y$store_lat[i],",",y$store_long[i],
                 "&mode=","fastest;truck;traffic:enabled",
                 "&trailerscount=","1",
                 "&routeattributes=","sh",
                 "&maneuverattributes=","di,sh",
                 "&limitedweight=","20")
  response <- fromJSON(call, simplify = TRUE)
  Traffic_time = (response[["response"]][["route"]][[1]][["summary"]][["trafficTime"]]) / 60
  Base_time = (response[["response"]][["route"]][[1]][["summary"]][["baseTime"]]) / 60
  print(Traffic_time)
}
Error in file(con, "r"): cannot open the connection to 'https://route.api.here.com/routing/7.2/calculateroute.json?app_id=appid&app_code=appcode&waypoint0=geo!45.1005200,-93.2452000&waypoint1=geo!45.0978500,-95.0413620&mode=fastest;truck;traffic:enabled&trailerscount=1&routeattributes=sh&maneuverattributes=di,sh&limitedweight=20'
Traceback:
The error suggests a problem with the file on your end. It could be corrupt, so it is worth trying a different file extension. You could also try restarting your IDE. The number of API calls allowed depends on the plan you have chosen (Freemium or Pro). More details: https://developer.here.com/faqs
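"Cannot open the connection" hides whatever the server actually replied, so one way to narrow this down is to make the same request with a client that exposes the HTTP status and body. The sketch below is only an illustration in Python, not part of the HERE SDK: `build_route_url` and `fetch_route` are hypothetical helpers, only a subset of the question's query parameters is included, and the credentials are placeholders.

```python
import json
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://route.api.here.com/routing/7.2/calculateroute.json"


def build_route_url(app_id, app_code, origin, destination):
    """Build the request URL from (lat, long) tuples."""
    params = {
        "app_id": app_id,
        "app_code": app_code,
        "waypoint0": "geo!%s,%s" % origin,
        "waypoint1": "geo!%s,%s" % destination,
        "mode": "fastest;truck;traffic:enabled",
    }
    # Keep the characters the API uses as separators unescaped.
    return BASE_URL + "?" + urlencode(params, safe="!,;:")


def fetch_route(url, timeout=30):
    """Return (status, payload): parsed JSON on success, raw text on failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status, json.load(response)
    except HTTPError as err:
        # The status code and body usually say why the request was rejected.
        return err.code, err.read().decode("utf-8", "replace")
    except URLError as err:
        # No HTTP response at all (DNS, TLS, network).
        return None, str(err.reason)


url = build_route_url("appid", "appcode", (45.1, -93.2), (45.0, -95.0))
print(url)
```

Calling `fetch_route(url)` for the row that fails in the loop should show whether the server rejected that specific request (for example, bad coordinates or a rate limit) or whether the connection itself failed.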

BigQuery Timeout Errors in R Loop Using bigrquery

I am running a query in a loop for each store in a dataframe. Typically there are 70 or so stores so the loop repeats that many times for each complete loop.
Maybe 75% of the time this loop works all the way through with no errors.
About 25% of the time I get the following error during any one of the loop iterations:
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached
Then I have to figure out which iteration bombed, and repeat the loop excluding iterations that completed successfully.
I can't find anything on the web to help me understand what is causing this seemingly random error. Perhaps it is a BQ technical issue? There does not seem to be any relation to the size of the result set it crashes on.
Here is the part of my code that does the loop...again it works all the way through most of the time. The cartesian product across IDs is intentional, as I want every combination of each Test ID with all possible Control IDs within store.
sql<-"SELECT pstore as store, max(pretrips) as pretrips FROM analytics.campaign_ids
group by 1 order by 1"
store_maxtrips<-query_exec(sql,project=project, max_pages = 1)
store_maxtrips
for (i in 1:length(store_maxtrips$store)) {
  # pull back all ids shopping in same primary store as each test ID with their pre metrics
  sql <- paste("SELECT a.pstore as pstore, a.id as test_id,
    b.id as ctl_id,
    (abs(a.zpbsales-b.zpbsales)*",wt_pb_sales,")+(abs(a.zcatsales-b.zcatsales)*",wt_cat_sales,")+
    (abs(a.zsales-b.zsales)*",wt_retail_sales,")+(abs(a.ztrips-b.ztrips)*",wt_retail_trips,") as zscore
    FROM analytics.campaign_ids a inner join analytics.pre_zscores b
    on a.pstore=b.pstore
    where a.id<>b.id and a.pstore=",store_maxtrips$store[i]," order by a.pstore, a.id, zscore")
  print(paste("processing store",store_maxtrips$store[i]))
  query_exec(sql,project=project,destination_table = "analytics.campaign_matches",
             write_disposition = "WRITE_APPEND", max_pages = 1)
}
Solved!
It turns out I was using query_exec, but I should have been using insert_query_job since I do not want to retrieve any results. The errors were all happening in the course of R trying to retrieve results from BigQuery which I didn't want anyhow.
By using insert_query_job + wait_for(job) in my loop instead of the query_exec command, it eliminated all issues with the loop finishing.
I did also need to add a try() function to help circumvent some rare errors that still popped up with this approach. Thanks to MarkeD for this tip. So my final solution looked like this:
try(job<-insert_query_job(sql,project=project,destination_table = "analytics.campaign_matches",
write_disposition = "WRITE_APPEND"))
wait_for(job)
Thanks to everyone who commented and helped me research the issue.
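For the remaining rare errors, the manual step of working out which iteration failed and rerunning only the missing stores can be automated. The sketch below shows the general pattern in Python rather than R; `run_with_retries` is a hypothetical helper, and `flaky_task` merely simulates a job that times out once, standing in for the real per-store query.

```python
import time


def run_with_retries(items, task, max_attempts=3, delay_seconds=0.0):
    """Run task(item) for every item, retrying only the items that failed.

    Returns a dict mapping each item that still fails to its last error.
    """
    pending = list(items)
    errors = {}
    for attempt in range(1, max_attempts + 1):
        failed = []
        for item in pending:
            try:
                task(item)
                errors.pop(item, None)
            except Exception as exc:
                errors[item] = exc
                failed.append(item)
        if not failed:
            break
        pending = failed
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    return errors


calls = []
already_failed = set()


def flaky_task(store):
    """Simulated job: store 2 times out on its first attempt only."""
    calls.append(store)
    if store == 2 and store not in already_failed:
        already_failed.add(store)
        raise TimeoutError("Timeout was reached")


remaining = run_with_retries([1, 2, 3], flaky_task)
print(calls)
print(remaining)
```

Because the queries append to a destination table, retrying is only safe if a failed iteration wrote nothing; otherwise the retried store would be appended twice.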

Configuring scollector to get different frequencies for different collectors

I'm working with scollector and I want specific frequencies for different collectors.
For example:
get info from disk usage every 5 minutes
info from memory every minute
iostat every 30 seconds
and so on...
Here is a part of the conf.toml I made:
FullHost = true
Freq = 60
DisableSelf = true
[[iostat]]
Filter = "iostat"
Freq = 30
[[memory]]
Filter = "memory"
Freq = 60
But I get an error:
./scollector -conf="perso.toml" -p
2016/04/19 14:40:45 fatal: main.go:297: extra keys in perso.toml: [iostat iostat.Freq memory memory.Freq]
It seems that I cannot set multiple frequencies.
What should I do to get what I want?
Thank you all
According to the scollector documentation, Freq is a global setting, so it's not possible to set different frequencies for each collector. The exception is external collectors, which may be put in a folder named after the desired frequency (in seconds).

Freq is indeed a global setting, and each collector's interval is usually set to it. Some collectors override the interval with a different value, though; e.g. elasticsearch-indices runs every 15 minutes because there's a lot of data to pull.
To change it, either:
(best) hack the scollector code to read and pass a freq parameter to every collector
(second best) file a GitHub issue
(last resort) change the intervals in the scollector code for specific collectors and recompile scollector

Well, we might have found something.
We created different folders representing several Freq values (0, 30, 60, 120...), and in each folder we put the external collectors we need.
'/etc/collectors/0',
'/etc/collectors/15',
'/etc/collectors/30',
'/etc/collectors/60',
'/etc/collectors/120',
'/etc/collectors/300',
'/etc/collectors/600'
In the conf.toml:
ColDir = "/etc/scollector/collectors"
If we want the internal collectors, we have to rewrite them :(
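As an illustration of rewriting an internal collector as an external one, here is a minimal sketch of a script that could be dropped into one of the frequency folders (for example the `300` folder for disk usage every 5 minutes). It assumes scollector accepts external collector output as one `metric timestamp value tag=value` line per data point; the metric name, the tag, and the helper functions are made up for the example.

```python
import shutil
import time


def format_datapoint(metric, value, tags, timestamp=None):
    """Format one data point as 'metric timestamp value tag=value ...'."""
    if timestamp is None:
        timestamp = int(time.time())
    tag_text = " ".join("%s=%s" % (key, tags[key]) for key in sorted(tags))
    return ("%s %d %s %s" % (metric, timestamp, value, tag_text)).rstrip()


def disk_used_percent(path="/"):
    """Percentage of the filesystem at path that is in use."""
    usage = shutil.disk_usage(path)
    return round(100.0 * usage.used / usage.total, 2)


if __name__ == "__main__":
    # Hypothetical metric name and tag; adjust to your naming scheme.
    print(format_datapoint("custom.disk.used_percent",
                           disk_used_percent("/"),
                           {"mount": "root"}))
```

The script must be executable, and scollector would then be expected to run it at the interval implied by the folder name.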
