R Markdown files overlap figures when parallelized using a Makefile

I've created a simple example showing the problem I currently have.
I have an R Markdown file named example.Rmd containing the following code:
```{r}
plot(rnorm(10000))
```
and a Makefile with the following content:
all : example01.html example02.html
example01.html : example.Rmd
Rscript -e "library(knitr); knit2html(input='example.Rmd', output='example01.html')"
example02.html : example.Rmd
Rscript -e "library(knitr); knit2html(input='example.Rmd', output='example02.html')"
If I run the Makefile sequentially with
make
there is no problem.
If I run the Makefile in parallel with
make -j 2
the figure files generated by the two knit2html calls collide and both HTML files contain the same image.
Any suggestions? I've been searching for a solution but haven't found anything.

Using Karl's idea, I've written a possible solution.
all : example01.html example02.html
example01.html : example.Rmd
mkdir -p dir_$@
Rscript -e 'library(knitr); opts_knit$$set(base.dir = "dir_$@"); knit2html(input="example.Rmd", output="dir_$@/$@")'
mv dir_$@/$@ .
rm -r dir_$@
example02.html : example.Rmd
mkdir -p dir_$@
Rscript -e 'library(knitr); opts_knit$$set(base.dir = "dir_$@"); knit2html(input="example.Rmd", output="dir_$@/$@")'
mv dir_$@/$@ .
rm -r dir_$@
There are two modifications with respect to the initial code.
As Karl commented, I've included the line opts_knit$set(base.dir = "dir_example0?.html") so that the figure folder is created under that path.
I've swapped the " and ' symbols in the Rscript -e command, as commented here.
Parallel execution with
make -j 2
now works fine.
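Since the two rules differ only in the target name, the same idea can be written once as a pattern rule. This is just a sketch of that variant, assuming GNU make (the % pattern and the $@ automatic variable; recipe lines must start with a tab):
```
all : example01.html example02.html

# One rule covers both targets; $@ expands to the target being built.
example%.html : example.Rmd
	mkdir -p dir_$@
	Rscript -e 'library(knitr); opts_knit$$set(base.dir = "dir_$@"); knit2html(input="example.Rmd", output="dir_$@/$@")'
	mv dir_$@/$@ .
	rm -r dir_$@
```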

Related

Running R file using sh/bash

I have created the .sh file below to run R code saved in a separate .R file.
cat EE.sh
#!/bin/bash
VARIABLES=( 20190719 20190718 )
for i in "${VARIABLES[@]}"; do
    VARIABLENAME=$i
    /usr/lib/R/bin/Rscript -e 'source("/home/EER.R")'
done
Basically, what it is expected to do is take the dates from VARIABLES, pass them to /home/EER.R, and R will do the execution based on the passed date (after correct formatting).
Then I ran
sudo chmod a+rx EE.sh
and
sudo bash EE.sh
but I then get the error message below:
sudo bash EE.sh
EE.sh: line 2: $'\r': command not found
EE.sh: line 3: $'\r': command not found
EE.sh: line 4: $'\r': command not found
Can anyone help me resolve this issue?
I am using Ubuntu 18 with R version 3.4.4 (2018-03-15).
This problem looks to be related to carriage returns (which appear when files are copied from a Windows machine to a Unix machine). To identify them, use:
cat -v Input_file
If you see carriage returns in your file, then try:
tr -d '\r' < Input_file > temp && mv temp Input_file
Once they are removed, try running your program again.
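As a side note, the script as posted never actually hands the date to EER.R: VARIABLENAME is set but neither exported nor passed to Rscript. A minimal sketch of one way to pass it, assuming EER.R is adjusted to read a trailing command-line argument with commandArgs() (that argument handling is an assumption for illustration, not part of the original code):
```
#!/bin/bash
# Pass each date to the R script as a command-line argument.
VARIABLES=( 20190719 20190718 )
for i in "${VARIABLES[@]}"; do
    /usr/lib/R/bin/Rscript /home/EER.R "$i"
done

# Inside EER.R the date would then be read with:
#   date_arg <- commandArgs(trailingOnly = TRUE)[1]
```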

Rscript not working in qsub cluster

I have two R scripts named iHS.hist.R and Fst.hist.R. I know both scripts work. When I use the following commands in my directory in my Ubuntu terminal, I get a histogram plot for each script (two total if I do both scripts):
module load R
Rscript iHS.hist.R
or I could do Rscript Fst.hist.R
The point is I know they both work.
The problem is that each Rscript takes about 20 minutes to run because my data is pretty big, and unfortunately it's only going to get bigger. I have access to a cluster and I would like to make use of that. I have created two .sh scripts to send to the cluster with qsub, but I am running into issues. Here is my iHS.hist.sh script for my iHS.hist.R script:
#PBS -N iHS.plots
#PBS -S /bin/bash
#PBS -l walltime=2:00:00
#PBS -l nodes=1:ppn=8
#PBS -l mem=4gb
#PBS -o $HOME/${PBS_JOBNAME}.o${PBS_JOBID}.log
#PBS -e $HOME/${PBS_JOBNAME}.e${PBS_JOBID}.err
###############related commands
###edit it
#code in qsub
###############cut columns we don't need
###
cut -f1,2,3,4 /group/stranger-lab/ebeiter/test/SNPsnap_mdd_5_100/matched_snps_annotated.txt > /group/stranger-lab/ebeiter/test/SNPsnap_mdd_5_100/cut.matched_snps_annotated.txt
cut -f1,2 /group/stranger-lab/ebeiter/test/SNPsnap_mdd_5_100/input_snps_insufficient_matches.txt > /group/stranger-lab/ebeiter/test/SNPsnap_mdd_5_100/cut.input_snps_insufficient_matches.txt
###
###############only needed columns remain
cd /group/stranger-lab/ebeiter
module load R
Rscript iHS.hist.R
The cuts at the beginning are for setting up the data in the right format.
I have tried qsub iHS.hist.sh and it gives me a job. I check on it, and after about 10 minutes it finishes, so I'm assuming it's running my Rscript. I check the error file and it's empty. I check the log file and it does not give me the usual null device 1 that I get after my jpeg is completed in my Rscript. I don't get the output jpeg file from the Rscript when the cluster job is done, but I do get it if I just run the Rscript on its own, as at the top of this post. Any idea what is going on?
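Without seeing iHS.hist.R it is hard to say, but a common suspect in this situation is the working directory combined with a relative output path in jpeg(). A hedged diagnostic sketch for the tail end of iHS.hist.sh: the pwd, echo, and ls lines are debugging additions, not part of the original job script.
```
cd /group/stranger-lab/ebeiter
module load R
pwd                                # confirm which directory the job actually runs in
Rscript iHS.hist.R
echo "Rscript exit status: $?"     # non-zero would mean the script failed inside the job
ls -l *.jpeg *.jpg 2>/dev/null     # check whether the plot was written here
```
If the jpeg turns up in a different directory, or the exit status is non-zero, that narrows down whether the problem is the output path or the script itself.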

How to tail -f on multiple files with a script?

I am trying to tail multiple files in ksh. I have the following script:
test.sh
#!/bin/ksh
for file in "$@"
do
    # show tails of each in background.
    tail -f $file>out.txt
    echo "\n"
done
It only reads the first file argument I provide to the script, not the other file arguments.
When I do this:
./test.sh /var/adm/messages /var/adm/logs
it only reads /var/adm/messages, not the logs. Any ideas what I might be doing wrong?
You should use the double ">>" syntax so that each tail appends to the end of your output file.
A simple ">" redirection truncates the file and so removes the previous content. Note also that tail -f never exits, which is why the loop never gets past the first file; putting each tail in the background with & lets the loop continue.
So try:
#!/bin/ksh
for file in "$@"
do
    # show tails of each in background.
    tail -f $file >> out.txt &   # don't forget the trailing & character
done
EDIT: If you want to use multitail, it's not installed by default. On Debian or Ubuntu you can install it with apt-get install multitail.
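For completeness: on systems whose tail accepts several file operands with -f (GNU coreutils and the BSDs do; some older Unix versions take only one), the loop can be dropped entirely. A minimal sketch:
```
#!/bin/ksh
# A single tail process follows every file passed to the script;
# a "==> filename <==" header separates the streams in out.txt.
tail -f "$@" >> out.txt
```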

Shell script to sort & mv file based on date

I'm new to Unix. I have searched a lot of info but still don't know how to do it in a bash script.
What I know is to use the command ls -tr|xargs -i ksh -c "mv {} ../tmp/" to move files one by one.
Now I need to make a script that sorts all of these files by system date and moves them into a directory; the 1000 oldest files are the ones to be moved.
Example files are like these:
KPK.AWQ07102011.66.6708.01
KPK.AWQ07102011.68.6708.01
KPK.EER07102011.561.8312.13
KPK.WWS07102011.806.3287.13
----------- This is the script that I have created -----------
if [ ! -d /app/RAID/Source_Files/test/testfolder ] then
echo "test directory does not exist!"
mkdir /app/RAID/Source_Files/calvin/testfolder
echo "unused_file directory created!"
fi
echo "Moving xx oldest files to test directory"
ls -tr /app/RAID/Source_Files/test/*.Z|head -1000|xargs -i ksh -c "mv {} /app/RAID/Source_Files/test/testfolder/"
The problems with this script are:
1) Unix prompts a syntax error at 'if'
2) The move command works, but it creates a new file named testfolder instead of moving into the directory testfolder (testfolder has already been created in this path)
Can anyone give me a hand? Thanks.
Could this help?
mv `ls -tr|head -1000` ../tmp/
head -n takes the first n lines of the previous command's output (here the 1000 oldest files). The backticks allow the result of the ls and head commands to be used as arguments to mv.
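To address the two specific errors in the posted script: the 'if' syntax error comes from the missing ';' (or newline) before 'then', and the reported mv behaviour is consistent with the target directory never existing (the mkdir in the script creates .../calvin/testfolder while the check and the mv use .../test/testfolder). A hedged sketch of a corrected version, keeping the paths and tools from the question:
```
#!/bin/ksh
# Create the target directory if it is missing (same path as the check and the mv).
if [ ! -d /app/RAID/Source_Files/test/testfolder ]; then
    echo "test directory does not exist!"
    mkdir -p /app/RAID/Source_Files/test/testfolder
    echo "unused_file directory created!"
fi

echo "Moving 1000 oldest files to test directory"
ls -tr /app/RAID/Source_Files/test/*.Z | head -1000 | xargs -i ksh -c "mv {} /app/RAID/Source_Files/test/testfolder/"
```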

Run cd in script and stay in that directory - 'source' command not helping

Tried using the answer found here:
How to run 'cd' in shell script and stay there after script finishes?
When I add the 'source' command, the directory is still unchanged after the script runs, regardless of whether I run it with 'source' or call the script using an alias coded in my cshrc.
Any help is much appreciated!
As you can see below, make sure your call to cd is not executed within a subshell. If it is, this won't work, source or not.
Script with cd in subshell
#!/bin/bash
( cd /etc ) # this executes in a subshell
Output
$ pwd
/home/siegex
$ source ./cdafterend.sh && pwd
/home/siegex
Script with cd not in subshell
#!/bin/bash
cd /etc # no subshell here
Output
$ pwd
/home/siegex
$ source ./cdafterend.sh && pwd
/etc
It was necessary to remove "/bin/" from the cd command within this script for the command to work as intended; removing it also removed the subshell issue for this script. Also, using "$1" in the ls command was invalid in this context.
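If the underlying goal is just an interactive shortcut that changes directory, a shell function avoids the source-versus-subshell question altogether. A minimal sketch for bash (the function name and path here are made up for illustration; csh users would define an alias in .cshrc instead):
```
# In ~/.bashrc (name and path are illustrative):
goproj() {
    cd /path/to/project || return
}
```
After reloading the shell configuration (source ~/.bashrc), running goproj changes the current shell's directory directly, with no script or subshell involved.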
