Using the animation package - r

I am new to R and trying to use the animation package. From the manual, I tried to run this snippet
library(animation)
oopt = ani.options(interval = 0.2, nmax = 10)
## use a loop to create images one by one
for (i in 1:ani.options("nmax")) {
  plot(rnorm(30))
  ani.pause() ## pause for a while ('interval')
}
## restore the options
ani.options(oopt)
but I get the error:
Error in ani.options(oopt) : object 'oopt' not found
I have the package installed and I am using version 2.14.2.

RStudio can evaluate your code line by line: Ctrl-Enter runs only the line at the cursor in the source editor. The Console below your source shows that in this case only one line was evaluated, so oopt had never been created when ani.options(oopt) ran.
To run your full script, use 'run all' (Ctrl-Shift-R).
Instead of running from the source code window, you can also type in a line of code directly and evaluate it by pressing Enter.
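The save-and-restore idiom from the snippet can be tried without the animation package. The sketch below uses base R's options() instead of ani.options(), on the assumption that both follow the same pattern of returning the previous settings; old_opts is just an illustrative name.

```r
# Changing an option returns the previous value(s), which we keep for later.
old_opts <- options(digits = 4)

print(pi)  # printed using the temporary setting

# Restore the saved settings. Running only this line in a fresh session
# would fail with "object 'old_opts' not found", like the error above.
options(old_opts)
```

Running the whole script (or every line in order) guarantees that the saved object exists by the time it is restored.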

Related

How to call a parallelized script from command prompt?

I'm running into this issue and I for the life of me can't figure out how to solve it.
Quick summary before example:
I have several hundred data sets from which I want to create reports every day. In order to do this efficiently, I parallelized the process with doParallel. From within RStudio, the process works fine, but when I try to automate it via Task Scheduler on Windows, I can't seem to get it to work.
The process within RStudio is:
I call a script that sources all of my other scripts, each individual script has a header section that performs the appropriate package import, so for instance it would look like:
get_files <- function(){
  get_files.create_path() -> path
  # source every regular file (not sub-directories) found in the directory
  for(file in list.files(path)){
    if(!(file.info(paste0(path, file))[['isdir']])){
      source(paste0(path, file))
    }
  }
}
get_files.create_path <- function(){
  return(<path to directory>)
}
#self call
get_files()
This would simply be "Source on Save" and brings everything I need into the .GlobalEnv.
From there, I could simply type parallel_report(), which calls a script that sources another script housing the parallelization of the report generation. There was an issue a while back with calling the parallelization directly (I wonder if this is related?), so I had to keep the doParallel code as a plain script rather than wrapping it in a function. That meant it couldn't be brought in with the get_files script, because sourcing it would start the report generation every time I loaded everything. So I put it in its own script, saved elsewhere, to be called when necessary. The parallel_report() function is simply:
parallel_report <- function(){
  source(<path to script>)
}
Then the script that is sourced is the real parallelization script, and would look something like:
doParallel::registerDoParallel(cl = (parallel::detectCores() - 1))
foreach(name = report.list$names,
        .packages = c('tidyverse', 'knitr', 'lubridate', 'stringr', 'rmarkdown'),
        .export = c('generate_report'),
        .errorhandling = 'remove') %dopar% {
  tryCatch(expr = {
    generate_report(name)
  }, error = function(e){
    error_handler(error = e, caller = paste0("generate report for ", name, " from parallel"), line = 28)
  })
}
doParallel::stopImplicitCluster()
The generate_report function is simply an .Rmd and render() caller:
generate_report <- function(<arguments>){
  #stuff
  generate_report.render(<arguments>)
  #stuff
}
generate_report.render <- function(<arguments>){
  rmarkdown::render(
    paste0(data.information#location, 'report_generator.Rmd'),
    params = list(
      name = name,
      date = date,
      thoughts = thoughts,
      auto = auto),
    output_file = paste0(str_to_upper(stock), '_report_', str_remove_all(date, '-'))
  )
}
So to recap, in RStudio I would simply perform the following:
1 - Source on save the script to bring everything in
2 - type parallel_report()
2.a - this directly calls the doParallel parallelization of generate_report
2.b - generate_report calls an .Rmd file that houses the required function calls and whatnot to produce the reports
And the process starts and successfully completes without a hitch.
In order to make the situation automatic via the Task Scheduler, I made a script that the Task Scheduler can call, named automatic_caller:
source(<path to the get_files script>) # this brings in all the scripts and data into the global, just
# as if it were being done manually
tryCatch(
  expr = {
    parallel_report()
  }, error = function(e){
    error_handler(error = e, caller = "parallel_report from automatic_callng", line = 39)
  })
The error_handler function is just an in-house script used to log errors throughout.
So in the Task Scheduler task I call Rscript.exe with the automatic_caller script as its argument. Everything within the automatic_caller script works except for the report generation.
The process completes almost automatically, and the only output I get is an error:
"pandoc version 1.12.3 or higher is required and was not found (see the help page ?rmarkdown::pandoc_available)."
But rmarkdown is listed in the .packages argument of the foreach call, it is loaded in the scripts that use it explicitly, and in generate_report itself it is called directly via rmarkdown::render().
So - I am at a complete loss.
Thoughts and suggestions would be completely appreciated.
So pandoc is apparently an executable that converts files from one format to another. RStudio comes with its own pandoc executable, so when running the scripts from RStudio, it knew where to look when pandoc was required.
From the command prompt, the system did not know to look inside RStudio, so simply downloading pandoc as a standalone executable gives the system the proper pointer.
Downloaded pandoc and everything works fine.
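An alternative to a separate download is to point rmarkdown at the pandoc that ships with RStudio. The sketch below relies on rmarkdown::pandoc_available() and on the RSTUDIO_PANDOC environment variable, which rmarkdown consults when looking for pandoc. The directory shown is a hypothetical install path, not one taken from the question, so substitute your own.

```r
# Hypothetical location of RStudio's bundled pandoc; adjust for your machine.
pandoc_dir <- "C:/Program Files/RStudio/bin/pandoc"

# Point rmarkdown at that directory only if it actually exists.
if (dir.exists(pandoc_dir)) {
  Sys.setenv(RSTUDIO_PANDOC = pandoc_dir)
}

# Report whether rmarkdown can find pandoc >= 1.12.3 in this session.
pandoc_ok <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available("1.12.3")
message("pandoc usable for render(): ", pandoc_ok)
```

Placing a check like this at the top of the script called by Task Scheduler would make a missing pandoc show up before any report is attempted.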

readline does not prompt user input from Rprofile.site in Rstudio

I have this little function in a file:
library(grDevices) # needed in Rprofile.site
readfun <- function()
{
  message("interactive: ", interactive()) # tells TRUE
  rl <- readline("Write something: ")
  message("rl value is: ", rl)
}
readfun()
I can source it in the R console within RStudio just fine.
I can write source("thatfile.R") in the Rprofile.site, and calling Rterm via R.exe prompts for input as expected. (I'm on Windows, btw).
But starting R with RStudio, it will not prompt for user input.
Instead, the first typed command will not be executed but messaged back.
This could be related to "Wait for user input from keyboard in R before next line of code - readline - Rstudio", but I can't find a way to get it to work...

Different results from Rscript and R CMD BATCH

I have an inconsistency issue which I cannot explain when running an R script. I am not able to produce a reproducible example because there is a whole set of files/functions called by the entry script.
Using Rscript or RStudio with R v3.1.2 I obtain the results I'm expecting; however, when calling R CMD BATCH from bash my script does not produce identical output. From bash, R seems to read the command line arguments correctly and reports them from the script, but only the Rscript and RStudio source methods seem to use the parameter correctly in my code.
The 2 command line calls are as follows:
Rscript ./script/forecast_category_script.R "category='razors'" "cores=4L"
R CMD BATCH --no-save "--args category='razors' cores=4L" ./script/forecast_category_script.R ~/data/output/out.out
Is there any obvious reason why these inconsistencies might be occurring? I'd prefer to use R CMD BATCH as it redirects output to a file and when I migrate my code to the university cluster as a batch job through the scheduler I'd like to be able to follow what it has done.
UPDATE: changing this line resolves it but why?
Previously I had the following line in there, basically so when I was testing I didn't keep reloading the huge dataset if it was already loaded in my RStudio environment:
if(!exists("spi")) spi = f_load.spi(category = category)
Replaced it with this:
spi = f_load.spi(category = category)
The underlying function f_load.spi remained the same, however:
f_load.spi = function(spi = NULL, category = "razors" , n=NULL) {
  # check if the data is pre-loaded
  if (is.null(spi)) {
    fil = paste0(pth.data.storage, "categories/", category, "/", category, ".sp_ss.interp.rds")
    print(fil)
    spi = readRDS(fil)
  }
  # subset to a specific set of items
  if (!is.null(n)) {
    fc.items = unique(spi$fc.item)
    rnd = sample(1:length(fc.items), n)
    spi = spi[fc.item %in% fc.items[rnd]]
  }
  spi
}
For some reason the category variable was not being passed through properly into the function and it was loading a different category (beer rather than razors) which was an enormous file and not suitable for testing.
This still doesn't explain why Rscript and R CMD BATCH behaved differently.
It is possible that one of them is loading a previously saved workspace and using global variables from it. Have you checked whether it matters which directory you are in, or whether there are any .RData or .Rhistory files present? One way to ensure that you don't have any hidden variables is to clear the workspace at the beginning of each script, for example with rm(list=ls()) as the first line of your script.
Also, you can pipe output to a file with an Rscript using sink().
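For reference, R CMD BATCH starts R with --restore --save by default, while Rscript uses --no-restore. A stale .RData containing spi could therefore make exists("spi") true in one case and false in the other, and adding --no-restore to the R CMD BATCH call is one way to test that. This is a guess at the cause and is not confirmed by the question. The sink() suggestion can be sketched as follows; the log file here is a temporary placeholder rather than the path from the question.

```r
# Placeholder log location; a real job would use a fixed path instead.
log_file <- tempfile(fileext = ".log")
con <- file(log_file, open = "wt")

sink(con)                    # redirect normal output (print, cat)
sink(con, type = "message")  # redirect messages, warnings and errors

print("forecast started")
message("a diagnostic message")

# Undo the redirections before closing the connection.
sink(type = "message")
sink()
close(con)

log_lines <- readLines(log_file)
```

With both streams redirected, the log file plays roughly the same role as the .out file that R CMD BATCH writes.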

Error when running (working) R script from command prompt

I am trying to run an R script from the Windows command prompt (the reason is that later on I would like to run the script by using VBA).
After having set up the R environment variable (see the end of the post), the following lines of code saved in R_code.R work perfectly:
library('xlsx')
x <- cbind(rnorm(10),rnorm(10))
write.xlsx(x, 'C:/Temp/output.xlsx')
(in order to run the script and get the resulting xlsx output, I simply type the following command in the Windows command prompt: Rscript C:\Temp\R_code.R).
Now, given that this "toy example" is working as expected, I tried to move to my main goal, which is indeed very similar (to run a simple R script from the command line), but for some reason I cannot succeed.
Again I have to use a specific R package (-copula-, used to sample some correlated random variables) and export the R output into a csv file.
The following script (R_code2.R) works perfectly in R:
library('copula')
par_1 <- list(mean=0, sd=1)
par_2 <- list(mean=0, sd=1)
myCop.norm <- ellipCopula(family='normal', dim=2, dispstr='un', param=c(0.2))
myMvd <- mvdc(myCop.norm,margins=c('norm','norm'),paramMargins=list(par_1,par_2))
x <- rMvdc(10, myMvd)
write.table(x, 'C:/Temp/R_output.csv', row.names=FALSE, col.names=FALSE, sep=',')
Unfortunately, when I try to run the same script from the command prompt as before (Rscript C:\Temp\R_code2.R) I get the following error:
Error in FUN(c("norm", "norm")[[1L]], ...) :
cannot find the function "existsFunction"
Calls: mvdc -> mvdcCheckM -> mvd.has.marF -> vapply -> FUN
Do you have any idea how to proceed to fix the problem?
Any help is highly appreciated, S.
Setting up the R environment variable (Windows)
For those of you that want to replicate the code, in order to set up the environment variable you have to:
Right click on Computer -> Properties -> Advanced System Settings -> Environment variables
Double click on 'PATH' and add ; followed by the path to your Rscript.exe folder. In my case it is ;C:\Program Files\R\R-3.1.1\bin\x64.
This is a tricky one that has bitten me before. According to the documentation (?Rscript),
Rscript omits the methods package as it takes about 60% of the startup time.
So your better solution IMHO is to add library(methods) near the top of your script.
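A minimal sketch of that fix, shown without the copula package so it runs anywhere. As far as I know, recent R releases (3.5.0 and later) attach methods under Rscript by default, so this mainly matters on older versions such as the 3.1.1 in the question.

```r
# Attach 'methods' explicitly so its helpers are available under Rscript too.
library(methods)

# existsFunction() lives in 'methods'; it is the function reported missing above.
has_qnorm <- existsFunction("qnorm")
print(has_qnorm)
```

In R_code2.R the library(methods) line would go before library('copula').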
For those interested, I solved the problem by simply typing the following in the command prompt:
R CMD BATCH C:\Temp\R_code2.R
It is still not clear to me why the previous command does not work. Anyway, once again searching into the R documentation (see here) proves to be an excellent choice!

How to call plot.xts when using RScript

UPDATE: Thanks to Joshua's comment I realized the problem wasn't being inside a function, but inside a script. So I've edited the question and also provided my own answer.
When I use plot.xts() interactively it pops up a graphics window. I just tried it from inside a function (I'm troubleshooting a unit test and wanted some visual help) but nothing appeared. Aha, says I, I know the trick, just use print.
But print(plot.xts(x)) still shows no chart and instead prints my xts object! I.e. it does exactly the same as print(x).
The script I use to run unit tests is:
#!/usr/bin/Rscript --slave
library('RUnit')
options(warn=2) #Turn warnings into errors
#By naming the files runit.*.R, and naming the functions test*(), we can use
# all the defaults to defineTestSuite().
#NOTE: they have a weird default random number generator, so changed here
# to match the R defaults instead.
test.suite=defineTestSuite('tests',dirs=file.path('tests'),
rngKind = "Mersenne-Twister", rngNormalKind = "Inversion")
test.result <- runTestSuite(test.suite)
printTextProtocol(test.result)
The script below does two things:
1 - plot to a device file, as you would in a headless setting such as a webserver,
2 - plot to a screen device; I use x11() but you could use windows() on Windows.
There is no limitation imposed by Rscript. And this has nothing to do with xts as you could just as easily plot an xts object.
#!/usr/bin/Rscript
set.seed(42)
x <- cumsum(rnorm(100))
png("/tmp/darren.png")
plot(x)
dev.off()
x11()
plot(x)
Sys.sleep(3) # could wait for key pressed or ...
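Under Rscript the session is non-interactive, so input functions like readline return immediately, and a plain plot() call does not open a window (by default it is written to Rplots.pdf instead). However, an Rscript is still just R, so when you want to add something interactive (e.g. for troubleshooting) start R, then type: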
source('run_tests.R')
When run this way, a line like this shows the chart:
plot(x$High);cat("Press a key");readline()
When run directly from the commandline with ./run_tests.R that line gets quietly ignored.
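If a pause is wanted even when the script is run through Rscript, one commonly used workaround is to read a line from stdin directly, since readline() does not wait in a non-interactive session. A minimal sketch follows; wait_for_enter is a hypothetical helper name, not part of the original test script.

```r
# Hypothetical helper: wait for the user to press Enter in either mode.
wait_for_enter <- function(prompt = "Press Enter to continue: ") {
  if (interactive()) {
    # Interactive R session: readline() waits as usual.
    invisible(readline(prompt))
  } else {
    # Rscript: read one line from the process's standard input instead.
    cat(prompt)
    con <- file("stdin")
    on.exit(close(con))
    invisible(readLines(con, n = 1))
  }
}
```

Calling wait_for_enter() after opening a screen device such as x11() and plotting would keep the window open until a key is pressed, as an alternative to a fixed Sys.sleep().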

Resources