I am new to R and have a problem running a ggplot command. Below is my code. The program exits immediately without showing the plot. Is there anything wrong with my code?
#!/usr/bin/env Rscript
library(ggplot2)
data <- read.csv(file="./data/assignment-02-data.csv",head=TRUE,sep=",")
head(data)
nrow(data)
print(ggplot(data, aes(longitude, latitude)) + geom_point())
The post "Generate multiple graphics from within an R function" says that I need to print the value, but that is not working.
I am running the program from the shell as ./myr.r.
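One likely explanation, offered as a sketch rather than a diagnosis: under Rscript the session is non-interactive, so no window opens. The default device is a PDF file, and print() sends the plot to Rplots.pdf in the working directory. Opening a file device explicitly makes the destination obvious. The helper name save_plot and the output path are hypothetical, and base plot() stands in for the ggplot call so the sketch runs without extra packages.

```r
# Hypothetical helper: run a drawing function against an explicit PDF device.
save_plot <- function(path, draw) {
  pdf(path)            # open a file device instead of relying on a window
  on.exit(dev.off())   # always close the device so the file is written
  draw()
  invisible(path)
}

out <- file.path(tempdir(), "points.pdf")   # assumed output location
save_plot(out, function() plot(c(1, 2, 3), c(3, 1, 2)))

# With ggplot2 the drawing function would wrap the original call, e.g.
# save_plot(out, function() print(ggplot(data, aes(longitude, latitude)) + geom_point()))
```

The explicit print() is still needed inside the drawing function, because a ggplot object is only rendered when it is printed.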
I am using RStudio on Windows to develop and run a pipeline for multivariate analysis that involves big datasets (90 by ~60000 matrices). With matrices of that size, I get "protection from stack overflow" errors quite often. One way to avoid this problem while still using RStudio, as opposed to the regular Rgui, is to run my script(s) using the following syntax:
system("Rscript --max-ppsize=500000 my_script.s")
With this command the script runs successfully, but I cannot get the desired output. If I run the previous command with the following option
opt <- system("Rscript --max-ppsize=500000 my_script.s", intern = TRUE)
I get the standard output back as a character vector, but not the desired output.
Consider this toy example:
Save the following code in my_script.R
print("first call")
rnorm(15)
print("second call")
rnorm(20)
and run the following code from the console
a <- system("Rscript my_script.R", intern = TRUE)
a
As you can see, the output is a character vector of length 9 with the standard output to the console.
If you modify my_script.R as follows
print("first call")
i_want_this <- rnorm(20)
and then run it again
a <- system("Rscript my_script.R", intern = TRUE)
a
now the only thing stored in a is the output of the print command.
My question is: is there a way to collect the i_want_this variable as an R object (in this case a numeric vector of length 20)?
A similar question has been asked here, without a satisfying answer.
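A common workaround, sketched here under assumptions: the child Rscript process cannot share its memory with the calling session, so the script can serialize the object with saveRDS() and the caller can read it back with readRDS(). The file names below are hypothetical, and the script is generated on the fly only to keep the example self-contained.

```r
# Write a stand-in for my_script.R that saves its result to a file
# whose path is passed as the first command-line argument.
script <- tempfile(fileext = ".R")
writeLines(c(
  'args <- commandArgs(trailingOnly = TRUE)',
  'print("first call")',
  'i_want_this <- rnorm(20)',
  'saveRDS(i_want_this, args[1])'
), script)

result_file <- tempfile(fileext = ".rds")
rscript <- file.path(R.home("bin"), "Rscript")   # Rscript of the running R

# stdout is still captured as text; the object itself travels through the file.
printed <- system2(rscript, c(shQuote(script), shQuote(result_file)), stdout = TRUE)
i_want_this <- readRDS(result_file)
```

Options such as --max-ppsize=500000 from the question would go before the script path in the argument vector.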
I am calling an R script from within Matlab. The R script defines a function that should take the data generated in Matlab, compute a result, and send it back to Matlab. I have included a very simplified version of the code below. The Matlab and R files are in the same path. R_script.R is the following:
require("mclust")
group = function(data, num_cls){
  Mclustmodel = Mclust(data, num_cls)
  return(Mclustmodel$class)
}
In Matlab, the code is:
system( 'Rscript ./R_script.R' )
X = rand(10);
K = 3;
class = group(X, K);
My question is: Can I load X and K into the R function group, and directly calculate the answer?
I am using a Linux system.
Thanks.
You could try RMatlab. See this blog post for instructions. RMatlab is bidirectional, so you can run it from Matlab, send commands to R, and get the results back as Matlab variables. It works on Unix systems.
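If installing RMatlab is not an option, a file-based handoff is another possibility. The sketch below is an assumption about how that could look, not a tested Matlab integration: Matlab would write X to a CSV file, call Rscript with the input path, the number of clusters, and an output path, and then read the labels back. The function name cluster_file and the three-argument convention are hypothetical, and kmeans() from base R stands in for Mclust() so the sketch runs without extra packages.

```r
# Hypothetical handoff: read a matrix from CSV, cluster it, write labels out.
# kmeans() is a stand-in for Mclust(data, num_cls)$class from the question.
cluster_file <- function(in_csv, num_cls, out_csv) {
  data <- as.matrix(read.csv(in_csv, header = FALSE))
  fit <- kmeans(data, centers = num_cls)
  write.table(fit$cluster, out_csv, row.names = FALSE, col.names = FALSE)
  invisible(fit$cluster)
}

# When called as: Rscript R_script.R in.csv 3 out.csv
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 3) {
  cluster_file(args[1], as.integer(args[2]), args[3])
}
```

On the Matlab side the call would then pass the three arguments on the Rscript command line and load the output file afterwards.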
I have an inconsistency issue which I cannot explain when running an R script. I am not able to produce a reproducible example because there is a whole set of files/functions called by the entry script.
Using Rscript or RStudio with R v3.1.2 I obtain the results I'm expecting; however, when calling R CMD BATCH from bash my script does not produce identical output. From bash, R seems to read the command-line arguments correctly and reports them from the script, but only the Rscript and RStudio source methods seem to use the parameter correctly.
The 2 command line calls are as follows:
Rscript ./script/forecast_category_script.R "category='razors'" "cores=4L"
R CMD BATCH --no-save "--args category='razors' cores=4L" ./script/forecast_category_script.R ~/data/output/out.out
Is there any obvious reason why these inconsistencies might be occurring? I'd prefer to use R CMD BATCH as it redirects output to a file and when I migrate my code to the university cluster as a batch job through the scheduler I'd like to be able to follow what it has done.
UPDATE: changing this line resolves it but why?
Previously I had the following line in there, basically so when I was testing I didn't keep reloading the huge dataset if it was already loaded in my RStudio environment:
if(!exists("spi")) spi = f_load.spi(category = category)
Replaced it with this:
spi = f_load.spi(category = category)
The underlying function f_load.spi remained the same, however:
f_load.spi = function(spi = NULL, category = "razors" , n=NULL) {
  # check if the data is pre-loaded
  if (is.null(spi)) {
    fil = paste0(pth.data.storage, "categories/", category, "/", category, ".sp_ss.interp.rds")
    print(fil)
    spi = readRDS(fil)
  }
  # subset to a specific set of items
  if (!is.null(n)) {
    fc.items = unique(spi$fc.item)
    rnd = sample(1:length(fc.items), n)
    spi = spi[fc.item %in% fc.items[rnd]]
  }
  spi
}
For some reason the category variable was not being passed through properly into the function and it was loading a different category (beer rather than razors) which was an enormous file and not suitable for testing.
This still doesn't explain why Rscript and R CMD BATCH behaved differently.
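It is possible that one of them is loading a previously saved workspace and using global variables. By default R CMD BATCH runs with --restore and so loads any .RData file in the working directory, whereas Rscript runs with --no-restore; a saved spi object would make exists("spi") true under R CMD BATCH only. Have you checked whether it matters which directory you are in, or whether there is a .RData file present? One way to ensure that you don't have any hidden variables is to clear the workspace at the beginning of each script, for example with rm(list=ls()) as the first line, or to pass --no-restore to R CMD BATCH.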
Also, you can pipe output to a file with an Rscript using sink().
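As a minimal sketch of the sink() suggestion, assuming a hypothetical log path: output printed between the two sink() calls goes to the file instead of the console. The last line shows one way to check explicitly whether a previously saved workspace has already defined spi.

```r
# Send printed output to a log file for the duration of a block of code.
log_file <- file.path(tempdir(), "run.log")   # hypothetical log location
con <- file(log_file, open = "wt")
sink(con)                    # divert stdout to the log
print("loading category razors")
sink()                       # restore normal output
close(con)

# Check explicitly whether the workspace already defines `spi`.
spi_preloaded <- exists("spi")
```

Printing spi_preloaded at the top of the script under both Rscript and R CMD BATCH would show whether the two front ends start from the same workspace.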
A basic question, as I am just starting out with R.
What is the main difference between sourcing an R script and executing it?
I am trying to get ggplot2 example scripts running.
library("ggplot2")
d = data.frame(x1=c(1,3,1,5,4), x2=c(2,4,3,6,6), y1=c(1,1,4,1,3), y2=c(2,2,5,3,5), t=c('a','a','a','b','b'), r=c(1,2,3,4,5))
ggplot() +
scale_x_continuous(name="x") +
scale_y_continuous(name="y") +
geom_rect(data=d, mapping=aes(xmin=x1, xmax=x2, ymin=y1, ymax=y2, fill=t),color="black",alpha=0.5) +
geom_text(data=d, aes(x1+(x2-x1)/2,y=y1+(y2-y1)/2, label=r), size=4) +
ggtitle("geom_rect") +
theme(plot.title=element_text(size=40, vjust=1.5))
When I source this script, no plots appear. I understand this has to do with the lack of an explicit print statement in my code. I've read that when you execute a command in the interactive shell, the print statement is implicit.
My question is this - When I execute a script vs source it, what is the basic difference?
When would I do one over another? Thanks!
This is likely related to R FAQ 7.22, which explains why grid-based graphics (lattice, ggplot2) do not get plotted. Try using an explicit print or plot command.
Reading the first sentence of "Details" in the help page for source to you:
`Details
Note that running code via source differs in a few respects from entering it at the R command line. Since expressions are not executed at the top level, auto-printing is not done.` (And I'm glad to see that you did read the rest of that section.)
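The behaviour described in that help-page excerpt can be checked with a small experiment. This is a sketch using a throwaway script written to a temporary file: a bare expression is evaluated silently under source(), while an explicit print() always produces output.

```r
# A two-line script: a bare expression and an explicit print().
script <- tempfile(fileext = ".R")
writeLines(c("1 + 1", "print(2 + 2)"), script)

# Default source(): the bare expression is evaluated but not shown.
quiet <- capture.output(source(script))

# print.eval = TRUE restores auto-printing for top-level expressions.
loud <- capture.output(source(script, print.eval = TRUE))
```

Here quiet holds only the output of the explicit print(), while loud holds both results. A ggplot object behaves like the bare expression: it is drawn only when it is printed, either explicitly or through auto-printing.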
UPDATE: Thanks to Joshua's comment I realized the problem wasn't being inside a function, but inside a script. So I've edited the question and also provided my own answer.
When I use plot.xts() interactively it pops up a graphics window. I just tried it from inside a function (I'm troubleshooting a unit test and wanted some visual help) but nothing appeared. Aha, says I, I know the trick, just use print.
But print(plot.xts(x)) still shows no chart and instead prints my xts object! I.e. it does exactly the same as print(x).
The script I use to run unit tests is:
#!/usr/bin/Rscript --slave
library('RUnit')
options(warn=2) #Turn warnings into errors
#By naming the files runit.*.R, and naming the functions test*(), we can use
# all the defaults to defineTestSuite().
#NOTE: they have a weird default random number generator, so changed here
# to match the R defaults instead.
test.suite=defineTestSuite('tests',dirs=file.path('tests'),
rngKind = "Mersenne-Twister", rngNormalKind = "Inversion")
test.result <- runTestSuite(test.suite)
printTextProtocol(test.result)
The script below does two things:
1. plot to a device file, as you would in a headless setting such as a web server;
2. plot to a screen device; I use x11(), but on Windows you could use windows().
There is no limitation imposed by Rscript. And this has nothing to do with xts as you could just as easily plot an xts object.
#!/usr/bin/Rscript
set.seed(42)
x <- cumsum(rnorm(100))
png("/tmp/darren.png")
plot(x)
dev.off()
x11()
plot(x)
Sys.sleep(3) # could wait for key pressed or ...
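Interactive graphics windows and input functions like readline() do not work as you might expect under Rscript, because the session is non-interactive: readline() returns immediately, and any graphics window closes as soon as the script ends. However, an Rscript file is still just R, so when you want to add something interactive (e.g. for troubleshooting), start R, then type: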
source('run_tests.R')
When run this way, a line like this shows the chart:
plot(x$High);cat("Press a key");readline()
When run directly from the command line with ./run_tests.R, that line gets quietly ignored.
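One way to keep a "press a key" pause working in both modes is to read from the process's standard input when the session is not interactive. This is a sketch under that assumption; the function name wait_for_enter is hypothetical.

```r
# Hypothetical helper: pause until the user presses Enter, whether the code
# is sourced in an interactive session or run through Rscript.
wait_for_enter <- function(prompt = "Press Enter to continue: ") {
  cat(prompt)
  if (interactive()) {
    invisible(readline())              # works at the interactive prompt
  } else {
    con <- file("stdin")               # the real standard input of the process
    on.exit(close(con))
    invisible(readLines(con, n = 1))   # blocks until a line is entered
  }
}

# Usage inside a test script, after drawing on a screen device:
# plot(x$High); wait_for_enter()
```

If standard input is not attached to a terminal, for example when the script runs from a scheduler, the read returns immediately rather than blocking.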