How to configure specific graphs in Munin to update every day instead of every 5 minutes? - munin

The default update interval in Munin is 5 minutes, which is fine for most purposes.
In some cases, though, 5 minutes is too frequent: the extra samples are useless and only increase the load on the servers being watched.
For example, I want to graph database sizes once every day, and I have plugins written for that. But sampling every 5 minutes could be costly in terms of performance.
So, is it possible to configure Munin so that specific graphs update every day or every hour instead of every 5 minutes?

You can change the interval for all graphs; see the Munin FAQ:
Munin runs at an interval of every five minutes (*/5) on Debian systems by default. Is it possible to change this interval to an arbitrary value?
Just edit /etc/cron.d/munin.
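On Debian systems the relevant entry typically looks something like the lines below (exact contents vary between versions); changing the */5 field adjusts the polling interval for every plugin at once:
# /etc/cron.d/munin (illustrative excerpt)
*/5 * * * *   munin   if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi
# run hourly instead, for example:
# 0 * * * *   munin   if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi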
However, this won't change Munin's (or rather RRD's) granularity; all RRD files are constructed to store 5-minute averages, and no matter how often you update the RRD files the output won't be (much) different.
But I don't think that's what you actually want. :)
I think you can optimize your plugin instead: store the result in a file, and on each call check whether the file's modification date falls on the current day. If it doesn't, query the database for the size and rewrite the file; otherwise just return the file's contents. That should consume far fewer resources.
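One way to realize that caching idea is to do it inside the plugin itself. A minimal sketch in Python, assuming a hypothetical database-size plugin; the cache path, field name, and the get_db_size_bytes() helper are placeholders, not anything Munin provides:
#!/usr/bin/env python3
# Hypothetical Munin plugin: report a database size, but hit the database at
# most once per day by caching the result in a state file.
import os
import sys
import time
CACHE = "/var/lib/munin-node/plugin-state/dbsize.cache"  # assumed state location
MAX_AGE = 24 * 60 * 60                                   # refresh once a day
def get_db_size_bytes():
    # Placeholder for the expensive query; replace with your real lookup.
    raise NotImplementedError
def cached_size():
    try:
        if time.time() - os.path.getmtime(CACHE) < MAX_AGE:
            with open(CACHE) as fh:
                return fh.read().strip()
    except OSError:
        pass  # no cache yet, fall through and query
    value = str(get_db_size_bytes())
    with open(CACHE, "w") as fh:
        fh.write(value)
    return value
if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "config":
        print("graph_title Database size")
        print("graph_vlabel bytes")
        print("size.label size")
    else:
        print("size.value " + cached_size())
Munin will still poll the plugin every 5 minutes, but the expensive work only happens once per day.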

Related

Does collection.get() in Firestore have limitations?

In my project, every user has a document with a field f. My project uses the sum of every user's f field frequently, probably a hundred thousand times a day, and the project is planned to have millions of users.
Obviously it is not efficient to calculate the sum every time I need it, so my plan is to keep an additional document that tracks the sum: every time a user's f changes, update the sum too.
But I think round-off error may accumulate over time, so I plan to recalculate the sum every 24 hours or 7 days.
My problem is: if I have a million documents, does collection.get() still work? What about a billion? I've noticed WriteBatch has a 500-operation limit. Does collection.get() have a limit too?
You can get as many documents as you can fit in memory on the machine where you issued the query. The backend will stream the results until you run out.
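For reference, streaming the collection and summing on the client looks roughly like the sketch below with the firebase-admin Python SDK; the "users" collection name is an assumption (the question doesn't name it), and with millions of documents this is exactly the slow path the running-total document is meant to avoid:
# Sketch: sum the "f" field by streaming every document.
import firebase_admin
from firebase_admin import firestore
firebase_admin.initialize_app()  # credentials taken from the environment
db = firestore.client()
total = 0
for snapshot in db.collection("users").stream():  # results arrive in batches
    total += snapshot.to_dict().get("f", 0)
print("sum of f:", total)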

How to keep track of the number of runs of an R script per day

I am currently looking to write a function in R that can keep track of the number of completed runs of an .R file within any particular day. Note that the runs might be conducted at different times of day. I did some research on this problem and came across this post (To show how many times user has run the script). So far I am unable to build on the first commenter's code when converting it into R (the main obstacle is replicating the try...except). However, I need to add the restriction that the count is measured only within a single day (exactly from 00:00:00 EST to 24:00:00 EST).
Can someone please offer some help on how to accomplish this goal?
Either I didn't get the problem, or it's a rather easy one: create a temporary file (use Sys.Date() to name it) and store the current run number there; at the beginning of your .R file, read the temporary file, increment the number, and write the file back.
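A minimal sketch of that read-increment-write idea, shown in Python since the linked post was Python; in R the same steps map onto file.exists(), readLines(), writeLines(), and Sys.Date() for the file name:
# Count completed runs per calendar day: the file name embeds today's date,
# so the count resets automatically at midnight (local time; pin the timezone
# explicitly if the EST boundary matters).
import datetime
import os
def bump_run_count(state_dir="."):  # state_dir is an assumed location
    path = os.path.join(state_dir, f"run_count_{datetime.date.today()}.txt")
    try:
        with open(path) as fh:  # a missing file simply means "first run today"
            count = int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        count = 0
    count += 1
    with open(path, "w") as fh:
        fh.write(str(count))
    return count
print("Run number today:", bump_run_count())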

Graphite Render URL API to Splunk - Track received events?

I'd like to set up a scripted input in Splunk that curls Graphite's render URL API. I imagine I could configure this input to run every minute and retrieve the last minute's worth of events.
My concern with this is that some events might be missed, or duplicated.
Has anybody done something similar to this? How could I keep track of the events from Graphite that I have already read?
If you write a modular input you can use data checkpoints. See the docs for more info: http://docs.splunk.com/Documentation/Splunk/6.2.1/AdvancedDev/ModInputsCheckpoint
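The checkpoint itself can be as simple as the last indexed timestamp written to a file; a bare-bones sketch in Python (the path shown is only an example, and a real modular input would build it from the checkpoint directory Splunk supplies, as described in those docs):
import os
CHECKPOINT_FILE = "/opt/splunk/var/lib/splunk/modinputs/graphite/last_ts"  # example path
def load_last_timestamp(default=0):
    try:
        with open(CHECKPOINT_FILE) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return default
def save_last_timestamp(ts):
    os.makedirs(os.path.dirname(CHECKPOINT_FILE), exist_ok=True)
    with open(CHECKPOINT_FILE, "w") as fh:
        fh.write(str(int(ts)))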
My concern with this is that some events might be missed, or duplicated.
Yes, events may go missing, in two cases:
If you're pushing your Graphite server to its limits, there is a lag between the point when a datapoint is received and when it is flushed to disk. With large queues, I have seen this go up to 20 minutes (IO is the constraint here).
For example, in the case above where there's a 20-minute lag and I am storing data at 1-minute granularity, the latest 20 datapoints will show NULL against their timestamps. Of course, they will soon fill in with the next flush.
Know that this lag is indeterminate, so only go for the poll-every-minute approach if you have a zero-lag deployment.
The latest datapoint may or may not be NULL at any given moment because of Graphite's flushing behaviour, even if nothing is throttling. You can use something like &from=-21min&until=-1min to make sure you never encounter this. Note: your monitoring now lags by a minute. :)
All said, Graphite is a great monitoring tool if your requirements aren't real-time.
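Putting the lagged-window advice into practice, a polling sketch might look like this (the hostname and target are placeholders, and the requests library is an assumed dependency):
# Poll the render API over a window that ends one minute in the past, so
# datapoints that haven't been flushed to disk yet are never read as NULL.
import requests
GRAPHITE = "http://graphite.example.com"
def fetch_window(target, start="-21min", end="-1min"):
    params = {"target": target, "from": start, "until": end, "format": "json"}
    resp = requests.get(GRAPHITE + "/render", params=params, timeout=30)
    resp.raise_for_status()
    # Each series is {"target": ..., "datapoints": [[value, timestamp], ...]};
    # drop the None values Graphite pads the series with.
    return [(ts, value)
            for series in resp.json()
            for value, ts in series["datapoints"]
            if value is not None]
# Example: fetch_window("stats.timers.api.upper_90")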

How to keep track of number of Active users in Graphite

I need to keep track of the number of active users on my web site at any point in time. For this I am incrementing a key named "users.loggedin" every time a user logs in and decrementing it every time a user signs out.
I am sending my metrics to Graphite via StatsD. But based on what I have read, "increment" gives the change per time interval, so I can see the changes in the Graphite dashboard, but it shows zero again after some time.
Configure Graphite for StatsD
Did you configure Graphite for use with StatsD? You must specify in Graphite's configuration how you expect it to handle the data you are sending from StatsD. This is important because Graphite could be averaging your counts instead of summing them.
If sending sparse or "bursty" data
Confirm that your xFilesFactor is low enough that aggregation produces non-null values even with a high rate of nulls. For example, 100 requests in the first 10 seconds and none for the remaining 50 seconds of a minute would be stored as 100, null, null, null, null, null, which would be aggregated to null as the data ages if the xFilesFactor is higher than 1/6. Using the StatsD-recommended Graphite configuration handles this, but it is good to know about, as it can give the appearance of lost data.
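As a concrete illustration, the StatsD-recommended storage-aggregation.conf contains stanzas along these lines (the exact patterns depend on your metric namespace, so treat this as a sketch rather than a drop-in config):
[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
[default_average]
pattern = .*
xFilesFactor = 0.3
aggregationMethod = average
With xFilesFactor = 0, the 100, null, null, null, null, null minute above still aggregates to 100 instead of disappearing.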
Saving schema or aggregation changes
If you changed the Graphite schema or aggregation settings after any metrics were stored (in Whisper, Graphite's storage format), you'll need to either delete the .wsp files for the metric (Graphite will recreate them) or run whisper-resize.py.
Validating settings
You can verify the settings against some whisper data by running whisper-info.py on a .wsp file. Find the .wsp file for one of your metrics in /graphite/storage/whisper/
Run whisper-info.py my_metric_data.wsp; the output should tell you more about how the storage settings are being applied.
After you've confirmed that your data is accurate, then I'd move on to creating the graph you want in the UI:
You might need to use the hitcount() function for this.
This post covers what you are after pretty well (even if you aren't using StatsD).
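For example, to turn the counter into per-hour totals in a graph target, something like the following should be close (the metric path is a guess based on the question's key name and StatsD's default prefix):
hitcount(stats.users.loggedin, "1hour")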

Flex: Calculate hours between 2 times?

I am building a scheduling system. The current system just uses Excel, and they type in times like 9:3-5 (meaning 9:30am-5pm). I haven't settled on the format these times will be stored in yet; I think I may have to use military time to be able to calculate the hours, but I would like to avoid that if possible. Basically I need to figure out how to calculate the hours: for example, 9:3-5 would be 7.5 hours. I am open to different ways of storing the times as well. I just need to display them in a way that is easy for the user to understand and still be able to calculate how many hours a shift covers.
Any ideas?
Thanks!!
Quick dirty ugly solution
// Number of milliseconds in one hour.
public static const millisecondsPerHour:int = 1000 * 60 * 60;
// Difference between two dates, rounded up to whole hours.
private function getHoursDifference(minDate:Date, maxDate:Date):uint {
    return Math.ceil((maxDate.getTime() - minDate.getTime()) / millisecondsPerHour);
}
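That only covers the Date arithmetic; the "9:3-5" shorthand from the question still has to be parsed into start and end times first. A rough sketch of that step, in Python for brevity (the same split-and-convert logic carries over to ActionScript, and the format rules are guesses from the single example given):
# Parse "9:3-5" (9:30am to 5pm) and return the span in hours. Assumptions:
# a single digit after ":" means tens of minutes, and an end hour smaller
# than the start hour is taken to be PM.
def hours_in_shift(shift):
    start, end = shift.split("-")
    def to_hours(part):
        if ":" in part:
            h, m = part.split(":")
            return int(h) + int(m) * 10 / 60.0  # "3" -> 30 minutes
        return float(part)
    start_h, end_h = to_hours(start), to_hours(end)
    if end_h <= start_h:  # crude am/pm guess: "5" after "9:3" means 17:00
        end_h += 12
    return end_h - start_h
print(hours_in_shift("9:3-5"))  # -> 7.5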
OK, it sounds like you're talking about moving from a schedule that's currently developed by a person in an Excel spreadsheet, and you want to "computerize" the process. 1st warning: scheduling is not trivial. How you store the time isn't all that important, but it is common to establish some level of granularity and convert task times to integer multiples of that interval to simplify the scheduling task.
If you want to automate the process, or simply error-check, you'll want to abstract things a bit. A basic weekly calendar with start and stop times, and perhaps shift info, will be needed. An exception calendar (holidays and other exceptions) is a good idea to plan for from the start. You'll also need a table of resources and capacities, and a table of all the tasks to schedule along with any dependencies between them.
Then there are the design questions. Do you want to consider concurrent requirements (I need a truck and a driver)? Do you want to consider intermittent scheduling of resources? Should you support forward or backward scheduling? Do you plan to support what-if scenarios? (Then you'll want a master schedule that's independent of the planning schedule(s).) Do you want to prioritize how tasks are placed on the schedule? (A lot of thought is needed here, depending on the work to be done.) You may very well want to identify a subset of the tasks to actually schedule, and then simply provide a reporting mechanism to show whether the remaining work fits into the white space in the schedule. (If you can't get the most demanding 10% done in the time available, who cares about the other 90%?)
2nd Warning: "If God wrote the schedule most companies couldn't follow it."
