I want to run an R script using SLURM. I have created the R script "test.R" as shown:
print("Running the test script")
write.csv(head(mtcars), "mtcars_data_test.csv")
I created a bash script, "submit.sh", to run this R script:
#!/bin/bash
#SBATCH --job-name=test.job
#SBATCH --output=.out/abc.out
Rscript /home/abc/job_sub_test/test.R
And I submitted the job on the cluster
sbatch submit.sh
I am not sure where my output is saved. I looked in the home directory, but there is no output file.
Edit
I also set my working directory in test.R, but nothing changed.
setwd("/home/abc")
print("Running the test script")
write.csv(head(mtcars), "mtcars_data_test.csv")
When I run the script without SLURM (Rscript test.R), it works fine and saves the output according to the set path.
Slurm sets the job's working directory to the directory that was the working directory when the sbatch command was issued.
Assuming the /home directory is mounted on all compute nodes, you can explicitly change the working directory with cd in the submission script, or with setwd() in the R code. But that should not be necessary.
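If you do want to force it, here is a minimal sketch of a submission script that changes directory before running R (paths are only assumed from the question):
#!/bin/bash
#SBATCH --job-name=test.job
# change to the desired working directory before running R (assumed path)
cd /home/abc/job_sub_test
Rscript test.R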
Three possibilities:
either the job did not start at all because of a configuration or hardware issue; you can find that out with the sacct command (see the example below), looking at the State column; or
the file was indeed created, but on the compute node, on a filesystem that is not shared; in that case the best option is to SSH to the compute node (which you can find out with sacct) and look for the file there; or
the script crashed and the file was not created at all; in that case you should look into the output file of the job (.out/abc.out). Beware that the .out directory must exist before the job starts, and that, as it starts with a ., it is a hidden directory, revealed in ls only with the -a argument.
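A quick way to check the job state and the node it ran on (a sketch; 12345 stands in for your actual job ID):
sacct -j 12345 --format=JobID,JobName,State,NodeList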
The --output argument to sbatch is relative to the folder you submitted the job from. setwd inside the R script wouldn't affect it, because Slurm has already parsed that argument and started piping output to the file by the time the R script is running.
First, if you want the output to go to /home/abc/.out/, make sure you're in your home directory when you submit the script, or specify the full path in the --output argument.
Second, the .out folder has to exist; I tested this and Slurm does not create it if it doesn't.
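A minimal sketch combining both points (paths assumed from the question): create the directory first, then submit with an absolute --output path:
mkdir -p /home/abc/.out
sbatch --output=/home/abc/.out/abc.out submit.sh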
I am creating an R script, myscript.R, which manipulates an Excel file by means of the XLConnect package.
The point is that it refers to several external files: the Excel file itself and another R file (a function library), so I need to set the working directory to the location of the script (so that relative paths to the external files work properly).
I am using the following in my script:
new_wd <- dirname(sys.frame(1)$ofile)
setwd(new_wd)
When I source the script from RStudio, it gets the job done. The problem is that the script is to be used by non-programmers, non-RStudio users, so I created a .bat file (which I want to turn into an .exe one):
"C:\Program Files\R\R-4.0.3\bin\Rscript.exe" "C:\my\path\to\myscript.R"
It executes the script line by line, but sys.frame(1) only works when sourcing.
How could I solve it?
Thanks
I have found a solution and it works properly.
From the CMD command line, or from a .bat file, you can pass the -e argument to Rscript.exe, so that you can run R code directly:
absolute\path\to\Rscript.exe -e "source('relative/path/to/myscript.R')"
It worked for me.
Besides, as Compo commented, I think there's no need for a .exe file, since a .bat does the job.
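For reference, with the paths from the question the command would look something like this (a sketch; forward slashes inside the R string avoid backslash-escaping issues in R):
"C:\Program Files\R\R-4.0.3\bin\Rscript.exe" -e "source('C:/my/path/to/myscript.R')"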
I am submitting an R job on a cluster server that seems to run correctly as long as I save the results in the same working directory my script files are in. Actually, I want to save results from R (via saveRDS()) in another directory (e.g. /var/tmp/results), but I think that Slurm is redirecting my bash script output to its working directory, causing a "No such file or directory" R error.
I've looked around but was not able to find a way to specify another path for the job's output. Maybe I used the wrong keywords because of my language limitations, since I am not a native English speaker, and I apologize in advance if it is something trivial.
Below is the script I am calling, i.e. my_script.sh:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --array=1-2
#SBATCH --mem-per-cpu=15000
#SBATCH -e hostname_%j.err
#SBATCH -o hostname_%j.out
module load R
FILES=($HOME/simulation_study_bnp/models/*.R)
FILE=${FILES[$SLURM_ARRAY_TASK_ID]}
echo ${FILE}
srun Rscript runModel.R ${FILES[$SLURM_ARRAY_TASK_ID]}
The R script runModel.R sources the code of each file in the models folder, loads some data, and runs some analyses. All the necessary files are located in $HOME/simulation_study_bnp/models/, with $HOME being my home directory on the server (e.g. /users/my_dir).
The R script runModel.R should save part of the results in another path, e.g. /var/tmp/results/, outside my $HOME. When I run something like Rscript runModel.R models/filename.R, everything works fine. After some trials, my impression is that when submitting my_script.sh via sbatch, the process looks for /var/tmp/results/ under my $HOME instead of considering the absolute path, but I have no idea how to solve this.
As was pointed out by @carles-fenoy, you have to make sure that whatever directory the jobs are writing to is accessible by all compute nodes. Usually, large clusters have a dedicated storage space that is accessible from all nodes (see for example USC's scratch dir). Ask your cluster IT managers what that is on your cluster.
Another thing worth mentioning is symbolic links. You can have a symbolic link in your home directory pointing to another folder located on a node with more space. I do that for my R packages, as shown here. That way I don't need to worry about running out of space!
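A minimal sketch of that idea (the /scratch/myuser path is only an assumption; use whatever shared storage your cluster provides):
mkdir -p /scratch/myuser/results
ln -s /scratch/myuser/results $HOME/results
Jobs can then write to $HOME/results while the data actually lands on the larger shared filesystem.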
I am trying to understand the workings of the "make" command (I just started using it). I have an ".sh" file with a script that executes the "make" command, as shown below:
source /somepath/environment-setup-cortexa9hf-vfp-neon-poky-linux-gnueabi
make arch=arm toolchainPrefix=arm-poky-linux-gnueabi- xeno=off mode=Debug all
The directory where the script file is located has a file named "makefile", but nothing is specified in the script above regarding this "makefile". After executing the script, everything in the "makefile" is executed automatically. Can someone explain the workings of the "make xyz all" command in a few words?
Thanks
As is often the case with UNIX systems, the command works to some degree by convention. make (the GNU version of make, at least) will search the working directory for files called GNUmakefile, makefile, and Makefile, in that order, or you can use the -f (or --file) option to give it a specific file.
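For example, the invocation in the script above implicitly picks up the "makefile" in the current directory; an equivalent explicit form (assuming no GNUmakefile is present) would be:
make -f makefile arch=arm toolchainPrefix=arm-poky-linux-gnueabi- xeno=off mode=Debug all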
I added an alias (alias homedir='cd /export/home/file/myNmae') to .bashrc in my home directory and restarted the session. When I run the alias, it says homedir: command not found.
Please advise.
This is because .bashrc is not sourced every time; it is only sourced for interactive non-login shells.
From the bash man page:
When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable. The --noprofile option may be used when the shell is started to inhibit this behavior.
When a login shell exits, bash reads and executes commands from the files ~/.bash_logout and /etc/bash.bash_logout, if the files exist.
When an interactive shell that is not a login shell is started, bash reads and executes commands from ~/.bashrc, if that file exists. This may be inhibited by using the --norc option. The --rcfile file option will force bash to read and execute commands from file instead of ~/.bashrc.
I found the solution: I added it to the .profile file and restarted the session, and it worked.
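An alternative sketch, if you prefer to keep the alias in .bashrc: have login shells source it from ~/.profile (or ~/.bash_profile), for example:
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi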
I've been digging in several places for two simple needs but couldn't find a definitive answer.
I'm running an R script in batch mode. I'm not sure whether my solution is the best one, but I'm using R CMD BATCH as per http://stat.ethz.ch/R-manual/R-patched/library/utils/html/BATCH.html, called from a .bat file.
First, I'd like the directory where the R script is saved to be set as the working directory, rather than the directory where the .bat file is saved.
Second, I'd like to divert all the output from the R script (CSV files and charts) to a specific directory other than the working directory. I cannot find any options for such a basic requirement.
The final idea is to be able to run the .bat file across different computers, no matter where the R script is saved.
Thanks
You don't give code, so my answer will just be advice on what I would do for such a job.
Use Rscript.exe; it is the way to go for batch scripts. R CMD BATCH is a sort of legacy tool.
You don't need to set or change the working directory. It is a source of problems.
You can launch your .bat file from wherever you want, and within it you go to the R script location using cd. For example, your .bat file can look like:
cd R_SCRIPT_PATH
Rscript yourscript.R arg1 arg2
You can use one of the script arguments as the output directory for your result files. For example, inside your script you do something like this:
args <- commandArgs(trailingOnly = TRUE)
resultpath <- as.character(args[1])
.....
write.table(res1, file=paste(resultpath, 'res1.csv', sep='/'))
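A usage sketch tying it together (R_SCRIPT_PATH and OUTPUT_DIR are placeholders): the .bat file changes to the script's directory and passes the output directory as the first argument, which the R code above reads via commandArgs:
cd R_SCRIPT_PATH
Rscript yourscript.R OUTPUT_DIR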