Resize and re-aggregate whisper stats - graphite

Our monitoring system dumps metrics into Graphite once per minute, and the Whisper files have a retention of 1min:2d,5min:20d,30min:120d,6h:2y. However, I've recently added monitors that run on a 5-minute period, and I've found that:
The 1-minute points are four zeroes followed by an actual value, repeating, of course.
The 5+ minute points are all zeroes, likely because my xFilesFactor is higher than 0.2 (only 1 in 5 of the 1-minute points is non-null), so the aggregation just doesn't happen at all.
What I'd like to do is simply create a new Whisper file with the new retentions (and no wasted space) and then import/re-aggregate the data into it. From what I've found, whisper-resize.py is supposed to be the right tool.
As a test I've been doing:
whisper-resize.py \
--newfile=/tmp/foo.wsp \
--aggregate --aggregationMethod=max \
--xFilesFactor=0.1 \
--force \
quotas/us-central1CPUS/CPUS.wsp \
5min:20d 30min:120d 6h:2y
But after this operation completes, foo.wsp is just filled with zeroes.
What's the deal?

You just need to change the xFilesFactor for the target file, like:
whisper-resize.py --xFilesFactor=0.0 --nobackup quotas/us-central1CPUS/CPUS.wsp 1min:2d 5min:20d 30min:120d 6h:2y
You will not waste space - the whisper format has a fixed file size anyway. Please see details in http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-9
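If you want to sanity-check the resized file, the other whisper command-line tools that ship alongside whisper-resize.py can help; a quick sketch, assuming they are on your PATH:
whisper-info.py /tmp/foo.wsp          # confirm xFilesFactor, aggregationMethod and the archives
whisper-fetch.py /tmp/foo.wsp | tail  # spot-check recent datapoints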

Related

View NetCDF metadata without tripping on large file size / format

Summary
I need help getting NCO tools to be helpful. I'm running into the error
"One or more variable sizes violate format constraints"
... when trying to just view the list of variables in the file with:
ncdump -h isrm_v1.2.1.ncf
It seems odd to trip on this when I'm not asking for any large variables to be read ... just metadata. Are there any flags I should or could be passing to avoid this error?
Reprex
isrm_v1.2.1.ncf (165 GB) is available on Zenodo.
Details
I've just installed the NCO suite via brew install nco --build-from-source on a Mac (I know, I know) running OS X 11.6.5. ncks --version says 5.0.6.
Tips appreciated. I've been trawling through the ncks docs for a couple of hours without much insight. A friend was able to slice the file on a different system running actual Linux, so I'm pretty sure my NCO install is to blame.
How can I dig in deeper to find the root cause? NCO tools don't seem very verbose. I understand there are different sub-formats of NetCDF (3, 4, ...) but I'm not even sure how to verify the version/format of the .nc file that I'm trying to access.
My larger goal is to be able to slice it, like ncks -v pNH4 -d layer,0 isrm_v1.2.1.ncf pNH4L0.nc, but if I can't even view metadata, I'm thinking I need to solve that first.
The more-verbose version of the error message, for the record, is:
HINT: NC_EVARSIZE errors occur when attempting to copy or aggregate input files together into an output file that exceeds the per-file capacity of the output file format, and when trying to copy, aggregate, or define individual variables that exceed the per-variable constraints of the output file format. The per-file limit of all netCDF formats is not less than 8 EiB on modern computers, so any NC_EVARSIZE error is almost certainly due to violating a per-variable limit. Relevant limits: netCDF3 NETCDF_CLASSIC format limits fixed variables to sizes smaller than 2^31 B = 2 GiB ~ 2.1 GB, and record variables to that size per record. A single variable may exceed this limit if and only if it is the last defined variable. netCDF3 NETCDF_64BIT_OFFSET format limits fixed variables to sizes smaller than 2^32 B = 4 GiB ~ 4.2 GB, and record variables to that size per record. Any number of variables may reach, though not exceed, this size for fixed variables, or this size per record for record variables. The netCDF3 NETCDF_64BIT_DATA and netCDF4 NETCDF4 formats have no variable size limitations of real-world import. If any variable in your dataset exceeds these limits, alter the output file to a format capacious enough, either netCDF3 classic with 64-bit offsets (with -6 or --64), to PnetCDF/CDF5 with 64-bit data (with -5), or to netCDF4 (with -4 or -7). For more details, see http://nco.sf.net/nco.html#fl_fmt
Tips appreciated!
ncdump is not an NCO program, so I can't help you there, except to say that printing metadata should not cause an error in this case, so try ncks -m in.nc instead of ncdump -h in.nc.
Nevertheless, the hyperslab problem you have experienced is most likely due to trying to shove too much data into a netCDF format that can't hold it. The generic solution to that is to write the data to a more capacious netCDF format:
Try either one of these commands:
ncks -5 -v pNH4 -d layer,0 isrm_v1.2.1.ncf pNH4L0.nc
ncks -7 -v pNH4 -d layer,0 isrm_v1.2.1.ncf pNH4L0.nc
Formats are documented here
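As for the side question about verifying which netCDF sub-format the file uses: if your build can open the file at all, ncdump -k prints just the format kind, and recent ncks versions report a filetype in their metadata summary. Something like the following should tell you what you have (file name as in the question):
ncdump -k isrm_v1.2.1.ncf       # prints e.g. classic, 64-bit offset, cdf5, or netCDF-4
ncks -M isrm_v1.2.1.ncf | head  # the summary line includes the filetype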

How to get arc diff on the command line to assume yes on a very large number of changes

My Jenkins server runs arc diff, and once in a while I have large diffs. I don't want my job to fail when that happens.
Right now, with the latest master of arc, I get:
This diff has a very large number of changes (762). Differential works
best for changes which will receive detailed human review, and not as
well for large automated changes or bulk checkins. See
https://secure.phabricator.com/book/phabricator/article/differential_large_changes/
for information about reviewing big checkins. Continue anyway? [y/N]
Usage Exception: Aborted generation of gigantic diff.
Build step 'Execute shell' marked build as failure
My current code tries to avoid interactivity and mostly works, except for large diffs. Any way around this?
echo "jenkins
Summary:
Test Plan:
required
Reviewers:
alberto56
Subscribers:
JIRA Issues:
$JIRAISSUE" > arc_info.txt
arc diff --allow-untracked --message jenkins --message-file arc_info.txt origin/master
rm arc_info.txt
There is no non-interactive option (yet) for arc diff. You may want to try something like:
echo 'y' | arc diff ...
or even
echo 'y y y' | arc diff ...
You could also use the yes command: http://linux.die.net/man/1/yes
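For the Jenkins use case above, that would look something like this (flags copied from the script in the question):
yes | arc diff --allow-untracked --message jenkins --message-file arc_info.txt origin/master
Unlike a single echo 'y', yes keeps answering y for as many prompts as arc asks.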

Bind query resolution time in munin

Is it possible to graph the query resolution time of bind9 in munin?
I know there is a way to graph it for an Unbound server; has it already been done for BIND? If not, how do I start writing a munin plugin for that? I'm getting stats from http://127.0.0.1:8053/ on the BIND 9 server.
I don't believe that "query time" is a function of BIND. About the only time that I see that value (with individual lookups) is when using dig. If you're willing to use that, the following might be a good starting point:
#!/bin/sh
# Munin plugin: graph the query time that dig reports for www.redhat.com
case $1 in
config)
# "config" mode: munin asks the plugin to describe its graph
cat <<'EOM'
graph_title Red Hat Query Time
graph_vlabel time
time.label msec
EOM
exit 0;;
esac
# normal mode: dig prints ";; Query time: NNN msec"; extract the number
echo -n "time.value "
dig www.redhat.com | grep Query | cut -d':' -f2 | cut -d\  -f2
Note that there are two spaces after the "-d\" in the second cut statement: the first (escaped) space is the delimiter, and the second separates it from -f2. If you save the above as "querytime" and run it at the command line, output should look something like:
root@pi1:~# ./querytime
time.value 189
root@pi1:~# ./querytime config
graph_title Red Hat Query Time
graph_vlabel time
time.label msec
I'm not sure of the value in tracking this, though. The response time is affected by whether the query is an initial lookup or the answer is cached locally, by server load, by intervening network congestion, and so on.
Note: the above may be a bit buggy as I've written it on the fly, but it should give you a good starting point. That it returned the above output is a good sign.
In any case, I recommend reading the following before you write your own: http://munin-monitoring.org/wiki/HowToWritePlugins
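Once the script behaves at the command line, the usual next step (assuming a typical Munin install; the paths here are only illustrative) is to make it executable, link it into the plugin directory, and test it through Munin's own runner:
chmod +x /usr/local/share/munin-plugins/querytime
ln -s /usr/local/share/munin-plugins/querytime /etc/munin/plugins/querytime
munin-run querytime           # should print: time.value NNN
munin-run querytime config    # should print the graph_title/graph_vlabel/time.label block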

Compress EACH LINE of a file individually and independently of one another? (or preserve newlines)

I have a very large file (~10 GB) that can be compressed to < 1 GB using gzip. I'm interested in using sort FILE | uniq -c | sort to see how often a single line is repeated; however, the 10 GB file is too large to sort and my computer runs out of memory.
Is there a way to compress the file while preserving newlines (or an entirely different method all together) that would reduce the file to a small enough size to sort, yet still leave the file in a condition that's sortable?
Or is there any other method of finding out / counting how many times each line is repeated inside a large file (a ~10 GB CSV-like file)?
Thanks for any help!
Are you sure you're running out of memory (RAM) with your sort?
My experience debugging sort problems leads me to believe that you have probably run out of disk space for sort to create its temporary files. Also recall that the disk space used for sorting is usually in /tmp or /var/tmp.
So check your available disk space with:
df -g
(some systems don't support -g; try -m for megabytes or -k for kilobytes)
If you have an undersized /tmp partition, do you have another partition with 10-20 GB free? If so, tell sort to use that directory with:
sort -T /alt/dir
Note that for sort (GNU coreutils) 5.97, the help says:
-T, --temporary-directory=DIR use DIR for temporaries, not $TMPDIR or /tmp;
multiple options specify multiple directories
I'm not sure whether this means you can combine a bunch of -T /dr1 -T /dr2 ... to get to your 10 GB * sort-factor of space or not. My experience was that it only used the last directory in the list, so try to use one directory that is big enough.
Also note that you can go to whatever directory you are using for sort, and you'll see the activity of the temporary files used for sorting.
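Putting that together with the original pipeline, a sketch (where /alt/dir stands for the partition with enough free space and FILE for the 10 GB file):
sort -T /alt/dir FILE | uniq -c | sort -nr -T /alt/dir > line_counts.txt
Both sort invocations get -T, since the second sort may also need temporary space if there are many distinct lines.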
I hope this helps.
As you appear to be a new user here on S.O., allow me to welcome you and remind you of four things we do:
1) Read the FAQs.
2) Please accept the answer that best solves your problem, if any, by pressing the checkmark sign. This gives the respondent with the best answer 15 reputation points, and it is not subtracted (as some people seem to think) from your own reputation ;-)
3) When you see a good Q&A, vote it up using the gray triangles, as the credibility of the system is based on the reputation that users gain by sharing their knowledge.
4) As you receive help, try to give it too, by answering questions in your area of expertise.
There are some possible solutions:
1 - Use any text-processing language (perl, awk) to read each line, save the line number and a hash of that line, and then compare the hashes (see the sketch after this list).
2 - Can / want to remove the duplicate lines, leaving just one occurrence per file? You could use a script (command) like:
awk '!x[$0]++' oldfile > newfile
3 - Why not split the file using some criterion? Supposing all your lines begin with letters:
- break your original_file into smaller files, one per starting letter: grep "^a" original_file > a_file
- sort each small file: a_file, b_file, and so on
- verify the duplicates, count them, do whatever you want.
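For option 1, here is a minimal sketch of the counting idea in awk, using an associative array keyed by the whole line rather than explicit hashes; it assumes the number of distinct lines (not the total number of lines) fits in memory, and FILE is a placeholder:
awk '{count[$0]++} END {for (line in count) print count[line], line}' FILE | sort -nr | head
This avoids sorting the full 10 GB file, because awk keeps only one entry per distinct line.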

What is the unix command to see how much disk space there is and how much is remaining?

I'm looking for the equivalent of right clicking on the drive in windows and seeing the disk space used and remaining info.
Look for the commands du (disk usage) and df (disk free)
Use the df command:
df -h
df -g .
The -g option gives sizes in GB blocks, and . limits the output to the file system containing the current working directory.
I love doing du -sh * | sort -hr | less to list the largest entries first (sort -h understands the human-readable size suffixes that du -h produces).
If you want to see how much space each folder occupies:
du -sh *
s – summarize
h – human readable
* – every file and folder in the current directory
Note: The original question was answered already, but I would just like to expand on it with some extras that are relevant to the topic.
On AIX, storage is first organized into volume groups. This is done upon installation.
The installer first creates rootvg (as in root volume group). This is somewhat like a map of your physical disk.
It is roughly equivalent to Disk Management in Windows. AIX won't use up all of that space for its file systems the way we tend to on consumer Windows machines; instead, there will be a good bit of unallocated space.
To check how much space your rootvg has, use the following command.
lsvg rootvg
That stands for list volume group rootvg. It will give you information like the size of a physical partition (PP), total PPs assigned to the volume group, free PPs in the volume group, etc. The output should be fairly comprehensive.
The next thing you may be interested in is the file systems in the volume group. Each file system is given a certain amount of space within the volume group it belongs to.
To check which file systems are in your volume group, use the following command.
lsvgfs rootvg
As in list volume group file systems for rootvg.
You can check how much space each file system has using the following command.
df
I personally like to refine it with flags like -m and -g (in megabytes and gigabytes respectively)
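For example, to look at just one file system (say, /home) before deciding whether to grow it (the path here is only illustrative):
df -g /home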
If you have free space available in your volume group, you can assign it to your file systems using the following command.
chfs -a size=+1G /home
As in: change file system attribute size by adding 1 G, where the file system is /home. Use man chfs for more instructions. This is a powerful tool; this example adjusts the size, but you can do more with this command than that.
Sources:
http://www.ibm.com/developerworks/aix/library/au-rootvg/
+ My own experience working with AIX.
All these answers are superficially correct. However, the proper answer is
apropos disk # And pray your admin maintains the whatis database
because asking questions whose answers lie at your fingertips in the manual wastes everybody's time.
du -sm ./*
This lists the size of every file and folder in the current directory (-sm for megabytes, -sk for kilobytes). It works in any Unix/Linux environment.
du -sm * rules!
df -tk
for Disk Free size in 1024 byte blocks
