I would like to call a command mycommand randomly using ZSH. How can I achieve this?
Context: I have a command that I want to be executed every week on macOS. The simplest thing I have found is to add it to my .zshrc; as a bonus, I am sure it will be reinstalled after a full system reinstall. But since it uses a pipenv virtualenv, it adds a few precious milliseconds to shell startup. I would be very happy if I could spawn it with only a 10% probability.
You could do a
((RANDOM % 10 == 1)) && mycommand
to make it 10% likely. Of course, you also need to ensure that it is executed no more than once per week, if that matters to you, but from your question I assume this is something you already have a solution for, and that you are only interested in introducing the probability.
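If the once-per-week part is not already solved, a timestamp file is one simple way to combine both conditions. A minimal sketch, assuming a guard file is acceptable; the file name and mycommand are placeholders:

if (( RANDOM % 10 == 0 )); then
    stamp=~/.mycommand-last-run
    # only run if the guard file is missing or older than about a week
    if [[ ! -e $stamp ]] || [[ -n $(find "$stamp" -mtime +7 2>/dev/null) ]]; then
        mycommand && touch "$stamp"
    fi
fi

With this, the command fires on at most roughly one in ten new shells, and never more than about once a week.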
I am trying to construct and submit an array job based on R on my university's HPC.
I'm used to submitting array jobs based on Matlab and I have some doubts about how to translate the overall procedure to R. Let me report a very simple Matlab example and then my questions.
The code is based on 3 files:
"main" which does some preliminary operations.
"subf" which should be run by each task and uses some matrices created by "main".
a bash file which I qsub in the terminal.
1. main:
clear
%% Do all the operations that are common across tasks
% Here, as an example, I create
% 1) a matrix A that I will sum to the output of each task
% 2) a matrix grid; each task will use some rows of the matrix grid
m=1000;
A=rand(m,m);
grid=rand(m,m);
%% Tasks
tasks=10; %number of tasks
jobs=round(size(grid,1)/tasks); %I split the number of rows of the matrix grid among the tasks
2. subf:
%% Set task ID
idtemp=str2double(getenv('SGE_TASK_ID'));
%% Select local grid
if idtemp<tasks
grid_local= grid(jobs*(idtemp-1)+1: idtemp*jobs,:);
else
grid_local= grid(jobs*(idtemp-1)+1: end,:); %for the last task, we should take all the rows of grid that have been left
end
sg_local=size(grid_local,1);
%% Do the task
output=zeros(sg_local,1);
for g=1:sg_local
output(g,:)=sum(sum(A+repmat(grid_local(g,:),m,1)));
end
%% Save output by keeping track of task ID
filename = sprintf('output.%d.mat', ID);
save(filename,'output')
3. bash
#$ -S /bin/bash
#$ -l h_vmem=6G
#$ -l tmem=6G
#$ -l h_rt=480:0:0
#$ -cwd
#$ -j y
#Run 10 tasks where each task has a different $SGE_TASK_ID ranging from 1 to 10
#$ -t 1-10
#$ -N Example
date
hostname
#Output the Task ID
echo "Task ID is $SGE_TASK_ID"
export PATH=/xx/xx/matlab/bin:$PATH
matlab -nodisplay -nodesktop -nojvm -nosplash -r "main; ID = $SGE_TASK_ID; subf; exit"
These are my questions:
Suppose I'm able to translate "main" and "subf" into R. Should I be extra careful about anything in particular concerning the parallelisation? For example, do I have to declare some parallel environment, such as parLapply or %dopar%?
In the "main" file I should also install some R packages. Can I do them locally in my folder directly at the beginning of the "main" file, or should I contact the HPC administrator to install them globally?
I could not find any example of bash file for R in the instructions given by my university. Therefore, I have doubts on how to re-adapt the above bash file. I suppose that the only lines to change are:
export PATH=/xx/xx/matlab/bin:$PATH
matlab -nodisplay -nodesktop -nojvm -nosplash -r "main; ID = $SGE_TASK_ID; subf; exit"
Could you give some hints on how I should change them?
The parallelization is handled by the HPC scheduler, right? In which case, I think the answer is "no": nothing special (such as parLapply or %dopar%) is required.
It depends on how they allow/enable R. On an HPC that I use (not your school's), the individual nodes do not have direct internet access, so installing packages requires special care; that might be the exception, I don't know.
Recommendation: if there is a shared filesystem that both you and all of the nodes can access, then create an R "library" there that contains the installed packages you need, then use .libPaths(...) in your R scripts to add that directory to the package search path. The only gotcha might be if there are non-R shared-library (e.g., .dll, .so, .a) requirements; for those, either "docker" or "ask the admins".
If you don't have a shared filesystem, then you might ask the cluster admins whether they use/prefer docker images (you might provide an image or a Dockerfile to create one) or whether they have a preferred mechanism for enabling various packages.
I do not recommend asking them to install the packages, for two reasons: First, think about them needing to do this with every person who has a job to run, for any number of programming languages, and then realize that they may have no idea how to do it for that language. Second, package versions are very important, and you asking them to install a package may install either a too-new package or overwrite an older version that somebody else is relying on. (See packrat and renv for discussions on reproducible environments.)
Bottom line, the use of a path you control (and using .libPaths) enables you to have complete control over package versions. If you have not been bitten by unintended consequences of newer-versioned packages, just wait ... congratulations, you've been lucky.
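As a rough illustration of the shared-library-path approach from the shell side (the paths and package name are placeholders, and R_LIBS_USER is just one way to make R pick the directory up without editing every script):

# a sketch, assuming a filesystem mounted on both the login and compute nodes;
# /shared/home/$USER/Rlibs is a placeholder path
mkdir -p /shared/home/$USER/Rlibs
export R_LIBS_USER=/shared/home/$USER/Rlibs    # R adds this to .libPaths() at startup
# install packages once, from a node that has network access
Rscript -e 'install.packages("somepackage", lib = Sys.getenv("R_LIBS_USER"), repos = "https://cloud.r-project.org")'

The export would then go in the job script (or your shell profile) so the worker R processes see it; alternatively, call .libPaths() with that path at the top of main.R, as suggested above.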
I suggest adding source("main.R") to the beginning of subf.R, which would make your bash file perhaps as simple as
export PATH=/usr/local/R-4.x.x/bin:$PATH
Rscript /path/to/subf.R
(Noting that you'll need to reference Sys.getenv("SGE_TASK_ID") somewhere in subf.R.)
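Putting it together, a sketch of the adapted submission file might look like the following; the SGE directives are unchanged from the Matlab version, and the R and script paths are placeholders you would replace with your cluster's values:

#$ -S /bin/bash
#$ -l h_vmem=6G
#$ -l tmem=6G
#$ -l h_rt=480:0:0
#$ -cwd
#$ -j y
#$ -t 1-10
#$ -N Example
date
hostname
echo "Task ID is $SGE_TASK_ID"
# placeholder paths: point these at your cluster's R installation and your script
export PATH=/usr/local/R-4.x.x/bin:$PATH
Rscript /path/to/subf.R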
I have checked what the "-n" option does:
"Displays active TCP connections, however, addresses and port numbers are expressed numerically and no attempt is made to determine names."
But I can't see why "-n" would make netstat exit immediately.
From a quick check, I don't see the same description for the "-n" option as you do, and it doesn't make netstat run continuously.
As you didn't specify the version and exact command you are using, I tried both the version that comes with RH7.6 (net-tools 2.10-alpha) and the latest from source (net-tools 3.14-alpha). The net-tools source code can be found on GitHub [1].
As I couldn't find the exact option you describe, I tried all flags (without combinations) that don't require an argument. As far as I can tell, the only options that cause netstat to not exit immediately are '-g' and '-c'. '-c' makes sense, as it is the flag for running netstat continuously. '-g' is less obvious: the apparent continuous behavior comes from reading the /proc/net/igmp and /proc/net/igmp6 files line by line. The first file is read quickly, but the igmp6 file takes much longer (about 1 line per second). So the '-g' option isn't really continuous; it just takes a long time to finish.
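A rough sketch of that kind of test, in case you want to repeat it on your system (the flag list is illustrative, not exhaustive, and the 2-second timeout is arbitrary):

for f in a c e g i n o r s t u v x; do
    timeout 2 netstat -"$f" > /dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        # timeout(1) returns 124 when it had to kill the command
        echo "-$f: still running after 2 seconds"
    else
        echo "-$f: exited on its own (status $status)"
    fi
done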
From the code, the only reason for continuous execution is (appears 4 times in the code):
if (i || !flag_cnt)
    break;
wait_continous();
'i' is a return code from a function and the 'break' statement breaks out of an infinite for loop, so basically the code will keep running only if flag_cnt is set (which only happens when '-c' is provided) and there were no errors in the preceding calls.
For the specific issue above there could be a few reasons:
The option involves reading from a file and it takes a very long time to finish, but it is not really continuous.
There is a correlation between the given option and flag_cnt, which causes flag_cnt to be set.
There's a call to wait_continous() which doesn't follow the condition above.
As I said, I couldn't reproduce the issue in the original question, nor could I find any flag with the description above. Also, none of the flags besides '-c' caused netstat to run continuously.
If you still want to figure this out, I suggest you take a look at your code, or at least specify the net-tools version you use. The kernel version is also important, as some code may be compiled out due to missing kernel support.
[1] https://github.com/ecki/net-tools
My Jenkins server is running arc diff, and once in a while I have large diffs. I don't want my job to fail when that is the case:
Right now, with the latest master of arc, I get:
This diff has a very large number of changes (762). Differential works
best for changes which will receive detailed human review, and not as
well for large automated changes or bulk checkins. See
https://secure.phabricator.com/book/phabricator/article/differential_large_changes/
for information about reviewing big checkins. Continue anyway? [y/N]
Usage Exception: Aborted generation of gigantic diff.
Build step 'Execute shell' marked build as failure
My current code tries to avoid interactivity and mostly works, except for large diffs. Any way around this?
echo "jenkins
Summary:
Test Plan:
required
Reviewers:
alberto56
Subscribers:
JIRA Issues:
$JIRAISSUE" > arc_info.txt
arc diff --allow-untracked --message jenkins --message-file arc_info.txt origin/master
rm arc_info.txt
There is no interaction option (yet) for arc diff. You may want to try something like:
echo 'y' | arc diff ...
or, if there are several prompts,
printf 'y\ny\ny\n' | arc diff ...
You could also use the yes command: http://linux.die.net/man/1/yes
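For example, with the command from the question, a non-interactive run might look like:

yes | arc diff --allow-untracked --message jenkins --message-file arc_info.txt origin/master

Note that yes answers every prompt with "y", so it will also auto-confirm any other question arc might ask, which may or may not be what you want on a CI server.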
Is it possible to graph the query resolution time of bind9 in munin?
I know there is a way to graph it on an Unbound server; is it already done for BIND? If not, how do I start writing a munin plugin for it? I'm getting stats from http://127.0.0.1:8053/ on the bind9 server.
I don't believe that "query time" is a function of BIND. About the only time that I see that value (with individual lookups) is when using dig. If you're willing to use that, the following might be a good starting point:
#!/bin/sh
case $1 in
config)
cat <<'EOM'
graph_title Red Hat Query Time
graph_vlabel time
time.label msec
EOM
exit 0;;
esac
echo -n "time.value "
dig www.redhat.com|grep Query|cut -d':' -f2|cut -d\  -f2
Note that there are two spaces after the "-d\" in the second cut statement. If you save the above as "querytime" and run it at the command line, the output should look something like:
root@pi1:~# ./querytime
time.value 189
root@pi1:~# ./querytime config
graph_title Red Hat Query Time
graph_vlabel time
time.label msec
I'm not sure of the value in tracking the above, though. The response time can be affected by whether the query is an initial lookup or the answer is already cached locally, by server load, by intervening network congestion, and so on.
Note: the above may be a bit buggy as I've written it on the fly, but it should give you a good starting point. That it returned the above output is a good sign.
In any case, recommend reading the following before you write your own: http://munin-monitoring.org/wiki/HowToWritePlugins
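If you do go ahead with it, installing the plugin typically just means making the script executable, linking it into munin's plugin directory, and restarting munin-node (the paths below are the usual defaults and may differ on your distribution):

cp querytime /usr/share/munin/plugins/querytime
chmod +x /usr/share/munin/plugins/querytime
ln -s /usr/share/munin/plugins/querytime /etc/munin/plugins/querytime
service munin-node restart    # or: systemctl restart munin-node
# sanity check that munin-node picks it up
munin-run querytime config
munin-run querytime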
Fork is a great tool in Unix. We can use it to create a copy of our process and change its behaviour. But I don't know the history of fork.
Can someone tell me the story?
Actually, unlike many of the basic UNIX features, fork was a relative latecomer (a).
The earliest existence of multiple processes within UNIX consisted of a few (fixed number of) processes, one per terminal that was attached to the PDP-7 machine (b).
The basic idea was that the shell process for a given terminal would accept a command from the user, locate the program file, load a small bootstrap program into high memory and jump to it, passing enough details for the bootstrap code to load the program file.
The bootstrap code, after loading the program into low memory (overwriting the shell), would then jump to it.
When the program was finished, it would call exit but it wasn't like the exit we know and love today. This exit would simply reload the shell and run it using pretty much the same method used to load the program in the first place.
So it was really more like a rudimentary exec command, the one that replaces your current program with another, in the same process space.
The shell would exec your program then, when your program was done, it would again exec the shell by calling exit.
This method was similar to that found in many other interactive systems at the time, including Multics, from which UNIX got its name.
From that two-way exec, it was actually not that big a leap to add fork as a process duplicator to work in conjunction with it. While many systems run another program directly, it's this "just add what's needed" approach that is responsible for the separation of duties between fork and exec in UNIX. It also resulted in a very simple fork function.
If you're interested in the early history of various features(c) of Unix, you cannot go past the article The Evolution of the Unix Time-Sharing System by Dennis Ritchie, presented at a 1979 conference in Australia, and subsequently published by AT&T.
(a) Though I mean latecomer in the sense that the separation of the four fundamental forces in the universe was "late", happening some 0.00000000001 seconds after the big bang.</humour>.
(b) Since a question was raised in a comment as to how the shells were originally started off, there's a great resource holding very early source code for Unix over at The Unix Heritage Society, specifically the source code archives and, in particular, the first edition.
The init.s file from the first edition shows how the fixed number of shell processes were created (slightly reformatted):
...
mov $itab, r1 / address of table to r1
1:
mov (r1)+, r0 / 'x, x=0, 1... to r0
beq 1f / branch if table end
movb r0, ttyx+8 / put symbol in ttyx
jsr pc, dfork / go to make new init for this ttyx
mov r0, (r1)+ / save child id in word offer '0, '1, etc
br 1b / set up next child
1:
...
itab:
'0; ..
'1; ..
'2; ..
'3; ..
'4; ..
'5; ..
'6; ..
'7; ..
0
Here you can see the snippet which creates the processes for each connected terminal. These were the days of hard-coded values; there was no auto-detection of the number of terminals. The zero-terminated table at itab is used to create a number of processes, and hopefully the comments in the code explain how (the only possibly tricky bit is the labels: though there are multiple 1 labels, you branch to the nearest one in a given direction, hence 1b means the closest 1 label in the backwards direction).
The code shown simply processes the table, calling dfork to create a process for each terminal and start getty, the login prompt. The getty program, in turn, eventually started the shell. From that point, it's as I described in the main part of this answer.
(c) No paths (and use of temporary links to get around this limitation), limited processes, why there's a GECOS field in the password file, and all sorts of other trivia, generally interesting only to uber-geeks, of course.