Running an R File That Has Dependencies From the Linux Command Line

Let's say I have a file
file1.R that defines a function getMe().
Now I want to run file2.R, which calls getMe().
Do I need to run
R CMD BATCH file1.R
every time before I run
R CMD BATCH file2.R ?
Or will R somehow be able to determine that file2.R uses a function defined in file1.R?
What is the standard way to run a file that depends on functions defined in other files?

You need to source file1.R so that the functions defined there become available to other scripts.
source('file1.R')
You can do this inside your file2.R. Then simply run
R CMD BATCH file2.R
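For example (a minimal sketch, not part of the original answer), file2.R could look like this, assuming file1.R sits in the same working directory:
source("file1.R")   # makes getMe() available in this session
result <- getMe()
print(result)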

Related

Calling bash from within R

I have R generating some .csv files for another Python program that runs in another folder. I know it is possible to call bash from R, but how can I run the make command in another directory on my Ubuntu virtual machine?
The simple way is to create a script that cds into your directory and then runs make:
script <- tempfile()                                  # temporary shell script
fhandle <- file(script)
writeLines("( cd /your_directory && make )", con = fhandle)
system2("/bin/bash", args = c(script))
You may need to find the correct path to /bin/bash; mine is from macOS.
You can use the system2() parameters to control what happens to the output of the make command, and whether the process runs in parallel with your R task (wait = FALSE) or blocks until completion.
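As an alternative sketch (not from the original answer), make itself can switch directories with its -C flag, so the temporary script may not be needed:
system2("make", args = c("-C", "/your_directory"))   # run make inside /your_directory
# output handling and blocking can be controlled the same way, e.g.
# system2("make", args = c("-C", "/your_directory"), stdout = TRUE, wait = TRUE)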

SparkR: source() other R files in an R Script when running spark-submit

I'm new to Spark and even newer to R, and I am trying to figure out how to "include" other R scripts when running spark-submit.
Say I have the following R script which "sources" another R script:
main.R
source("sub/fun.R")
mult(4, 2)
The second R script looks like this, which exists in a sub-directory "sub":
sub/fun.R
mult <- function(x, y) {
x*y
}
I can invoke this with Rscript and it works:
Rscript main.R
[1] 8
However, I want to run this with Spark, and use spark-submit. When I run spark-submit, I need to be able to set the current working directory on the Spark workers to the directory which contains the main.R script, so that the Spark/R worker process will be able to find the "sourced" file in the "sub" subdirectory. (Note: I plan to have a shared filesystem between the Spark workers, so that all workers will have access to the files).
How can I set the current working directory that SparkR executes in such that it can discover any included (sourced) scripts?
Or, is there a flag/sparkconfig to spark-submit to set the current working directory of the worker process that I can point at the directory containing the R Scripts?
Or, does R have an environment variable that I can set to add an entry to the "R-PATH" (forgive me if no such thing exists in R)?
Or, am I able to use the --files flag to spark-submit to include these additional R-files, and if so, how?
Or is there generally a better way to include R scripts when run with spark-submit?
In summary, I'm looking for a way to include files with spark-submit and R.
Thanks for reading. Any thoughts are much appreciated.
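The original thread shows no answer here; purely as a hedged sketch, one workaround is to resolve sourced paths against a directory supplied by the caller, here via a hypothetical APP_HOME environment variable set when submitting the job, so the workers' working directory no longer matters:
app_dir <- Sys.getenv("APP_HOME", unset = getwd())   # APP_HOME is an assumed name, set by the caller
source(file.path(app_dir, "sub", "fun.R"))
mult(4, 2)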

Run .R functions via CMD - missing execute all command?

Good day,
I want to automate data cleaning and merging with R. I have created multiple folders into which Excel files are uploaded. Then, with R, I clean the data and merge it into a final file.
I would like to do this without opening R or RStudio, so I run R from CMD via a .bat file; however, the code does not run.
My R script calls functions defined in other files; I assume I need some kind of "run all" command to execute the code via CMD?
Example of Combine_all_excel_files.R
setwd("C:\\Code source\\")
source("Seperate_input_folder_by_years.R")
Seperate_input_folder_by_years()
Merge data.bat file:
start "" "C:\Program Files\R\R-3.3.3\bin\x64\Rscript.exe" CMD BATCH "C:\Code source\Combine_all_excel_files.R"
pause
When I run the .bat file, the code does not run. What am I doing wrong?

Can't use igraph within r script, when starting from command line

I have written an R script, which I run from R using the command source("script.r"). It works perfectly.
However, when I try to run the exact same script from the command line using any of the commands Rscript script.r, R --vanilla < script.r, or R CMD BATCH script.r, I get the error:
"Error in library(igraph): there is no package called 'igraph'"
If I remove the library(igraph) line from the script and run it without that command, I get the error that the function graph.frame.data, which belongs to igraph, could not be found.
The script that runs from RStudio does not contain the line library(igraph), and there I don't get such an error.
While trying to inform myself about problems with loading igraph from a script, I have not found any additional information.
Some more information: I am building a graph from multiple files and then doing calculations using the igraph library on that created graph. This is executed within a loop, that goes through all (specific) directories. Output is written out into a file using cat(..).
Thanks for the help in advance.
The igraph package was not installed in the default library location.
Using library("igraph", lib.loc="..igraph.location..") solved the issue.
The package was loaded and the script ran without problems.
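An equivalent approach (a sketch, not from the original answer) is to add the non-default library location to R's search path once, so later library() calls need no lib.loc argument; the path below is a placeholder:
.libPaths(c("/path/to/your/igraph/library", .libPaths()))   # placeholder path, adjust to your install
library(igraph)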

Execute R script in R studio using batch file

I want to execute an R script in RStudio using a batch file. I do know how to execute an R script with a batch file using R itself, though. When I try the following:
"C:\Program Files\RStudio\bin\rstudio.exe" CMD BATCH --vanilla --slave "C:\Users\kpappala\Desktop\R schedule\task.R"
It just opens RStudio but doesn't execute the script. Is there a way to do this?
Thanks!
RStudio is an IDE for R; it isn't R itself. It doesn't really make sense to try to run a script in batch mode through RStudio.
If you just want to run a file from within RStudio, that's different: you can source() it, or run it in batch via system() with a call to R CMD BATCH.
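A minimal sketch of that last suggestion (not from the original answers), assuming R is on the PATH and reusing the script path from the question:
system('R CMD BATCH --vanilla "C:/Users/kpappala/Desktop/R schedule/task.R"')
# or with Rscript, capturing the console output:
system2("Rscript", args = shQuote("C:/Users/kpappala/Desktop/R schedule/task.R"), stdout = TRUE)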
This question was answered here:
batch execute R script.
In short, and as Dason said, RStudio is just a shell and doesn't actually run the code itself; for that, use R. Try this instead:
PATH C:\Program Files\R\R-3.1.0\bin;%path%
Rscript "C:\Users\kpappala\Desktop\R schedule\task.R"
