How to execute raster2pgsql with R?

I'm trying to execute the following code in R
system('"C:/Program Files/PostgreSQL/9.5/bin/raster2pgsql" -s 32630 -a -f raster Y:/Sen2R_Download/prueba_sergio/raster3/SCL/S2B2A_20180731_137_sen2r_SCL_10.tif sentinel > Y:/Sen2R_Download/prueba_sergio/rastersql27.sql')
But it throws an error
ERROR: Unable to read raster file: sentinel
But this error shouldn't happen; when I execute the same command in cmd, it works fine:
C:\Users\Public\Documents>"C:/Program Files/PostgreSQL/9.5/bin/raster2pgsql" -s 32630 -a -f raster Y:/Sen2R_Download/prueba_sergio/raster3/SCL/S2B2A_20180731_137_sen2r_SCL_10.tif sentinel > Y:/Sen2R_Download/prueba_sergio/rastersql27.sql
Processing 1/1: Y:/Sen2R_Download/prueba_sergio/raster3/SCL/S2B2A_20180731_137_sen2r_SCL_10.tif
What can I do to make it work with R?

Try running the same command with system2() instead.
See https://stat.ethz.ch/R-manual/R-devel/library/base/html/system.html for more details, specifically the section on the differences between Unix and Windows.
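For example, here is a minimal sketch using the paths from the question. On Windows, system() does not run the command through a shell, so the > redirection in the string is not performed; system2() takes the program and its arguments separately and can write standard output to a file itself:
system2("C:/Program Files/PostgreSQL/9.5/bin/raster2pgsql",
        args = c("-s", "32630", "-a", "-f", "raster",
                 "Y:/Sen2R_Download/prueba_sergio/raster3/SCL/S2B2A_20180731_137_sen2r_SCL_10.tif",
                 "sentinel"),
        stdout = "Y:/Sen2R_Download/prueba_sergio/rastersql27.sql")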

Related

Create a shell alias that R can recognise

I have an alias (actually, a function) in my .bashrc, but R doesn't seem to recognise it.
fun() {
  echo "Hello"
}
which runs correctly when I log in via ssh (it's a remote server). However, if I run system("fun"), I get
sh: 1: fun: not found
Warning message:
In system("fun") : error in running command
From a comment on this question I can get system("bash -i -c fun") to work, although with a weird warning/message
bash: cannot set terminal process group (27173): Inappropriate ioctl for device
bash: no job control in this shell
Hello
However, this doesn't apply to my case, because I'm running external code so I cannot modify the system() call. I need system("command") to use the command command that I defined.
BTW, this is all running on Linux (Debian in the remote server, but I also tried on my machine running elementary OS with the same result).
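One hedged workaround, given the constraint that the system("fun") call itself cannot change: expose fun as a real executable on the PATH instead of a shell function, since the sh that system() spawns resolves commands from PATH. This is an untested sketch and the paths are illustrative:
dir.create("~/bin", showWarnings = FALSE)
writeLines(c("#!/bin/bash", "echo Hello"), "~/bin/fun")  # same body as fun()
Sys.chmod("~/bin/fun", "0755")                           # make it executable
Sys.setenv(PATH = paste(Sys.getenv("PATH"), normalizePath("~/bin"), sep = ":"))
system("fun")                                            # now found via PATH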

Running R file using sh/bash

I have created the .sh file below to run R code saved in a separate .R file.
cat EE.sh
#!/bin/bash
VARIABLES=( 20190719 20190718 )
for i in "${VARIABLES[@]}"; do
  VARIABLENAME=$i
  /usr/lib/R/bin/Rscript -e 'source("/home/EER.R")'
done
Basically, what it is expected to do is take the dates from VARIABLES, pass them to /home/EER.R, and have R run based on the passed date (after correct formatting).
Then I ran the code below:
sudo chmod a+rx EE.sh
and
sudo bash EE.sh
But I then get the error message below:
sudo bash EE.sh
EE.sh: line 2: $'\r': command not found
EE.sh: line 3: $'\r': command not found
EE.sh: line 4: $'\r': command not found
Can anyone help me resolve this issue?
I am using Ubuntu 18 with R version 3.4.4 (2018-03-15).
This problem looks to be related to carriage returns (which appear when text is copied from a Windows machine to a Unix machine). To identify them, use:
cat -v Input_file
If you see carriage returns in your file then try:
tr -d '\r' < Input_file > temp && mv temp Input_file
Once they are removed, try running your program again.
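If you would rather repair the script from within R itself, here is a small equivalent of the tr fix above (the file name follows the question):
txt <- readLines("EE.sh", warn = FALSE)    # on Unix, CRLF lines keep a trailing \r
writeLines(gsub("\r$", "", txt), "EE.sh")  # strip it and rewrite with \n endings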

r + hpc + git question: submitting multiple jobs with different values for a parameter list [duplicate]

I am running R on a multiple node Linux cluster. I would like to run my analysis on R using scripts or batch mode without using parallel computing software such as MPI or snow.
I know this can be done by dividing the input data such that each node runs different parts of the data.
My question is how do I go about this exactly? I am not sure how I should code my scripts. An example would be very helpful!
I have been running my scripts so far using PBS, but it only seems to run on one node, as R is a single-threaded program. Hence, I need to figure out how to adjust my code so it distributes labor to all of the nodes.
Here is what I have been doing so far:
1) Command line:
qsub myjobs.pbs
2) myjobs.pbs:
#!/bin/sh
#PBS -l nodes=6:ppn=2
#PBS -l walltime=00:05:00
#PBS -l arch=x86_64

pbsdsh -v $PBS_O_WORKDIR/myscript.sh
3) myscript.sh:
#!/bin/sh
cd $PBS_O_WORKDIR
R CMD BATCH --no-save my_script.R
4) my_script.R:
library(survival)
...
write.table(test, "TESTER.csv",
            sep = ",", row.names = FALSE, quote = FALSE)
Any suggestions will be appreciated! Thank you!
-CC
This is rather a PBS question; I usually make an R script (with the Rscript path after #!) and have it read a parameter (using the commandArgs function) that controls which "part of the job" the current instance should handle. Because I use multicore a lot I usually need only 3-4 nodes, so I just submit a few jobs calling this R script, each with one of the possible control argument values.
On the other hand your use of pbsdsh should do its job... Then the value of PBS_TASKNUM can be used as a control parameter.
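For illustration, a hedged sketch of that pattern; the file layout and the 6-way split are made up for the example, and the control argument arrives via commandArgs:
#!/usr/bin/env Rscript
args <- commandArgs(trailingOnly = TRUE)
part <- as.integer(args[1])                # e.g. submitted as: Rscript job.R 1
inputs <- list.files("data", pattern = "\\.csv$", full.names = TRUE)
mine <- inputs[seq_along(inputs) %% 6 == part %% 6]  # this instance's share
for (f in mine) {
  d <- read.csv(f)
  # ... analysis on d ...
  write.csv(d, sub("\\.csv$", "_out.csv", f), row.names = FALSE)
}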
This was an answer to a related question - but it's an answer to the comment above (as well).
For most of our work we do run multiple R sessions in parallel using qsub (instead).
If it is for multiple files I normally do:
while read infile rest
do
  qsub -v infile=$infile call_r.pbs
done < list_of_infiles.txt
call_r.pbs:
...
R --vanilla -f analyse_file.R $infile
...
analyse_file.R:
args <- commandArgs()
infile <- args[5]
outfile <- paste(infile, ".out", sep = "")
...
Then I combine all the output afterwards...
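For instance, the combine step could look like this sketch (assuming the .out files are plain tables with headers; the names follow the answer's convention):
outs <- list.files(pattern = "\\.out$")
combined <- do.call(rbind, lapply(outs, read.table, header = TRUE))
write.csv(combined, "combined_results.csv", row.names = FALSE)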
This problem seems very well suited to GNU parallel, which has an excellent tutorial. I'm not familiar with pbsdsh, and I'm new to HPC, but to me it looks like pbsdsh serves a similar purpose to GNU parallel. I'm also not familiar with launching R from the command line with arguments, but here is my guess at how your PBS file would look:
#!/bin/sh
#PBS -l nodes=6:ppn=2
#PBS -l walltime=00:05:00
#PBS -l arch=x86_64
...
parallel -j2 --env $PBS_O_WORKDIR --sshloginfile $PBS_NODEFILE \
  Rscript myscript.R {} :::: infilelist.txt
where infilelist.txt lists the data files you want to process, e.g.:
inputdata01.dat
inputdata02.dat
...
inputdata12.dat
Your myscript.R would access the command line argument to load and process the specified input file.
My main purpose with this answer is to point out the availability of GNU parallel, which came about after the original question was posted. Hopefully someone else can provide a more tangible example. Also, I am still wobbly with my usage of parallel; for example, I'm unsure of the -j2 option. (See my related question.)
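For completeness, a minimal sketch of what myscript.R might contain, assuming it takes the data file as its single trailing argument (supplied by parallel's {}) and that the files are whitespace-separated with a header:
#!/usr/bin/env Rscript
infile <- commandArgs(trailingOnly = TRUE)[1]
dat <- read.table(infile, header = TRUE)
# ... process dat ...
write.csv(dat, paste0(infile, ".out"), row.names = FALSE)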

UNIX commands from R via shell function

I need to issue unix commands from an R session. I'm on Windows Server 2012 R2, using RStudio 1.1.383 and R 3.4.3.
The shell() function looks to be the right one for me, but when I specify the path to my bash shell (from the Git for Windows install) the command fails with error code 127.
shell_path <- "C:\\Program Files\\Git\\git-bash.exe"
shell("ls -a", shell = shell_path)
## running command 'C:\Program Files\Git\git-bash.exe /c ls -a' had status 127
## 'ls -a' execution failed with error code 127
Pretty sure my shell path is correct.
What am I doing wrong?
EDIT: for clarity I would like to pass any number of UNIX commands, I am just using ls -a for an example.
EDIT:
After some playing about 2018-03-09:
shell(cmd = "ls -a", shell = '"C:/Program Files/Git/bin/bash.exe"', intern = TRUE, flag = "-c")
The correct location of my bash.exe was at .../bin/bash.exe. This uses shell with intern = TRUE to return the output as an R object. Note the use of single quote marks around the shell path.
EDIT: 2018-03-09 21:40:46 UT
In RStudio we can also call bash using knitr and setting chunk options:
library(knitr)
```{bash my_bash_chunk, engine.path="C:\\Program Files\\Git\\bin\\bash.exe"}
# Using a call to unix shell
ls -a
```
Two things stand out here. Bash will return exit code 127 if a command is not found; you should try running the fully qualified command name.
I also see that your shell is being run with a /c flag. According to the documentation, the flag argument specifies "the switch to run a command under the shell" and it defaults to /c, but "if the shell is bash or tcsh or sh the default is changed to '-c'." Obviously this isn't happening for git-bash.exe.
Try these changes out:
shell_path <- "C:\\Program Files\\Git\\git-bash.exe"
shell("/bin/ls -a", shell = shell_path, flag = "-c")
Not on Windows, so can't be sure this will work.
Perhaps you need to use shQuote?
shell(paste("ls -a ", shQuote(shell_path)))
(Untested. I'm not on Windows. But do read ?shQuote.)
If you just want to do ls -a, you can use the commands below:
shell("'ls -a'", shell="C:\\Git\\bin\\sh.exe")
#or
shell('C:\\Git\\bin\\sh.exe -c "ls -a"')
Let us know if the space in "Program Files" is causing problems.
And if you require login before you can call your command,
shell('C:\\Git\\bin\\sh.exe --login -c "ls -a"')
But if you are looking at performing git commands from R, the git2r package by rOpenSci might suit your needs.
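For example, a quick illustration of git2r (the functions below are from the package; install it with install.packages("git2r")):
library(git2r)
repo <- repository(".")  # open the repository in the working directory
status(repo)             # the equivalent of running `git status`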

download.file in R including pre-requisites

I'm trying to use download.file to get some webpages including embedded images, etc. I think the wget equivalent is the -p -k options, but I can't see how to do this...
if I do:
download.file("http://guardian.co.uk","test.html")
that obviously works. But when I run:
download.file("http://guardian.co.uk","test.html", method = "wget", extra = "-p -k") # no recursion (-r), but get pre-requisites (-p) and convert for local viewing (-k)
I get this error:
Warning messages:
1: running command 'wget -p -k "http://guardian.co.uk" -O "test.html"' had status 1
2: In download.file("http://guardian.co.uk", "test.html", method = "wget", :
  download had nonzero exit status
I've checked Sys.which("wget") and the path is set (and I'm not trying to access https, which I think can cause issues).
Once this works, I want to put it into a loop that downloads a set of URLs (and their embedded content) to create a single HTML output...
Easy solution: just use system to call wget directly:
system("wget http://guardian.co.uk -p -k")
I think the issue is that passing an output file ('test.html') means the -O option is specified, so you can't also invoke -p -k, whereas calling wget directly saves the files separately.
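And the loop over URLs mentioned in the question could then be sketched as follows (the URLs here are placeholders):
urls <- c("http://guardian.co.uk", "http://www.bbc.co.uk")
for (u in urls) {
  system2("wget", c("-p", "-k", u))  # -p: page requisites, -k: convert links
}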
