Why is source speed different from RStudio console line code?

I have a script with self-written functions (no plots). When I copy-paste that script into the RStudio console, it takes ages to execute, but when I use source("Helperfunctions.R") it doesn't take more than a second.
Question: Where does the difference in speed come from?
I am aware of two differences between running code via the source() function and entering code at the RStudio console.
From ?source:
"Since expressions are not executed at the top level, auto-printing is not done."
The way I understand this: source() will not draw plots (unless made explicit with e.g. print(plot)), while code run in the RStudio console always will. I'm sure this affects execution speed to some degree, but it seems irrelevant in my case, because there are barely any plot calls.
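A minimal sketch of the auto-printing difference described in that passage of ?source, using a throwaway script written to a temporary file (the script contents and variable names are illustrative assumptions, not the asker's code). Auto-printing concerns the values of top-level expressions, so objects that are only drawn when printed, such as ggplot2 or lattice plots, need an explicit print() inside a sourced file.

```r
# Hypothetical three-line script, written to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:3", "x", "print(sum(x))"), script)

# Default behaviour: the bare `x` is evaluated but its value is not printed;
# only the explicit print() call produces output.
out_default <- capture.output(source(script, local = new.env()))

# print.eval = TRUE asks source() to print visible results, as the console would.
out_printed <- capture.output(
  source(script, local = new.env(), print.eval = TRUE)
)

length(out_default)  # number of output lines without auto-printing
length(out_printed)  # number of output lines with print.eval = TRUE
```

Comparing the two captured outputs shows that the bare `x` line only produces output when print.eval = TRUE is set.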
And:
"(...) the complete file is parsed before any of it is run"
I have been working with R for a while now, but I'm not sure whether this is relevant to the speed issue I'm having. Is it possible that completely parsing all code "before any of it is run" speeds up the execution of my helper functions script by a factor of a hundred?
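To make "the complete file is parsed before any of it is run" concrete, here is a small sketch with a hypothetical two-line file whose last line is a syntax error. It only illustrates the parse-then-evaluate order; it is not meant as evidence that parsing explains the speed difference.

```r
# Hypothetical script: a valid assignment followed by a syntax error.
broken <- tempfile(fileext = ".R")
writeLines(c("a <- 1", "b <- (2 +"), broken)

env <- new.env()
# source() parses the whole file first, so the parse error stops it
# before the first line is evaluated.
result <- try(source(broken, local = env), silent = TRUE)

inherits(result, "try-error")               # did the call fail?
exists("a", envir = env, inherits = FALSE)  # was `a` ever assigned?
```

Pasting the same two lines into a console would instead evaluate the first line before the error in the second one is reached.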
Edit: I'm using R version 3.2.3.

The issue is not source() vs. console line code. Instead, it is an issue of how RStudio sends code from the source pane to the console.
When I copy the content of Helperfunctions.R and run it in RGui (instead of RStudio), the code is executed with nearly the same speed as when I use source("Helperfunctions.R") in RStudio.
Apparently, lines of code always (?) require more execution time in RStudio than in RGui. You usually won't notice the difference when executing a couple of lines in the console, but it seems to make a huge difference when, say, 3,000 lines of code are sent to the RStudio console at once.
My understanding is that when source("Helperfunctions.R") is used in RStudio, the contents of the file are not sent line by line to the RStudio console (which would be slow); instead, the file is read and evaluated directly by R.
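One way to check the source() side of this comparison is to time it on a generated file. The sketch below assumes a hypothetical 3,000-line file of trivial function definitions as a stand-in for Helperfunctions.R; the console-paste side cannot be timed from code and has to be observed by hand.

```r
# Generate a hypothetical 3,000-line file of small function definitions.
helper_file <- tempfile(fileext = ".R")
writeLines(sprintf("f%d <- function(x) x + %d", 1:3000, 1:3000), helper_file)

# Time one source() call; the definitions go into a scratch environment.
scratch <- new.env()
timing <- system.time(source(helper_file, local = scratch))

timing[["elapsed"]]   # wall-clock seconds; varies by machine
length(ls(scratch))   # number of functions defined
```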

Related

Very simple question on Console vs Script in R

I have just started to learn to code in R, so I apologize for the very simple question. I understand it is best to type your code as a script so you can edit and save it. However, when I try to make an object in the script section, it does not work. If I make an object in the console, R saves the object and it appears in my environment. I am typing in a very simple line of code to try a quick exercise on rolling dice:
die <- 1:6
But it only works in the console and not when typed as a script. Any help/explanation appreciated!
Essentially, you interact with the R environment differently when running an .R script via Rscript.exe than when working at a console (R.exe, Rterm, etc.) or in a GUI IDE such as RGui or RStudio. (This applies to any programming language with an interactive interpreter, not just R.)
The script does save the die object in the R environment, but only for the lifetime of that script run (i.e., from the first to the last line of code). Your line of code is simply an assignment; you do nothing with the object. Apply a function to it, print results, or take some other action in the script to see an effect.
On the console, the R environment persists interactively until you quit it with q(), so assigned objects remain for the lifetime of your console session. After assigning, you can apply functions, print results, or take other actions in line-by-line calls.
Ultimately, scripts gather all the code in advance of a run so it can be executed automatically, without relying on the user to supply each line. Imagine typing 1,000 lines of code with nested if/then blocks, for/while loops, and apply functions at the console! Therefore, keep all your R code in scripts.
It is always better to have the script: as you say, you can save, edit, and correct it without having to retype the code just to change a variable or a number.
I recommend using RStudio. It is very practical, will help you program more efficiently, and lets you see, among other things, the different objects you have created.
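As a small illustration of doing something with the object inside the script, the sketch below extends the asker's one-line example. The seed value and the name rolls are arbitrary choices made for this example.

```r
die <- 1:6          # assignment alone prints nothing
print(die)          # explicit output, visible when run as a script too

set.seed(42)        # arbitrary seed so the rolls are reproducible
rolls <- sample(die, size = 2, replace = TRUE)
print(sum(rolls))   # total of two dice
```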

What exactly does Source on Save mean or do?

Despite numerous searches, I can't seem to find a clear explanation as to what "Source on Save" means in RStudio.
I have tried ?source and the explanation there isn't clear, either.
As far as I can tell, it seems to run the script when I hit Save, but I don't understand the relevance/significance of it.
In simple terms, what exactly does Source on Save do and why would/should I use it?
This is kind of a shortcut to save and execute your code. You type something, save the script and it will be automatically sourced.
Very useful for short scripts but very annoying for time consuming longer scripts.
So sourcing is basically running each line of your file.
EDIT:
So, thinking of a scenario where this might be useful...
You are developing a function that you will later put into a package. So you write the function in a separate file but call it from the command line for testing.
Normally, you have to re-run the whole function definition whenever you change something. With "Source on Save", the definition is re-executed on every save, and you can use Ctrl + 2 to jump to the console and test the function directly.
Since I started working with R, my datasets have been much bigger. But I remember that when I started coding in Python and vi, I changed my settings to execute the code on save, since those little scripts were done in less than 10 seconds...
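The edit, save, and test loop can be emulated with explicit source() calls. This is only a sketch: the file, the function add_one, and the scratch environment are hypothetical, and the exact call RStudio issues on save is not shown here.

```r
# Hypothetical development file containing a function with a bug.
dev_file <- tempfile(fileext = ".R")
dev_env <- new.env()

writeLines("add_one <- function(x) x + 2", dev_file)   # first save (buggy)
source(dev_file, local = dev_env)                      # what a save would trigger
first_try <- dev_env$add_one(1)

writeLines("add_one <- function(x) x + 1", dev_file)   # edit and save again
source(dev_file, local = dev_env)                      # definition is refreshed
second_try <- dev_env$add_one(1)
```

After the second source() call, the corrected definition replaces the buggy one without any manual re-running.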
So maybe it is just not standard to work with small datasets... But I can still recommend it, for development, to use only 10% of a normal dataset. It will speed up the graphics creation and a lot of other things as well. Test it with the complete dataset every now and then.
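A minimal sketch of developing against roughly 10% of the rows, using the built-in iris data as a stand-in for a large dataset (the names full_data and dev_data and the seed are assumptions made for this example):

```r
full_data <- iris                      # stand-in for a large dataset
set.seed(1)                            # arbitrary seed for a repeatable subset
n_dev <- round(0.1 * nrow(full_data))  # 10% of the rows
dev_data <- full_data[sample(nrow(full_data), n_dev), ]

nrow(dev_data)                         # size of the development subset
```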

Knitr compiling and running all at the same time in RStudio

For running an Rnw file in RStudio, one can compile or run all. Compiling does not see the variables in the current environment, and the current environment does not see the variables created while compiling. I would like to see how the output would look when I compile, and I debug the code using the environment. This requires me to compile and run, which performs the same calculations twice, which is very impractical for large projects. Is there a way to compile and have the output be seen in the environment?
When you knit a document, the work happens in a different R session, which is why you can't examine the results in the current session.
But you have a lot of choices besides run all. Take a look at the Run button: it allows you to run chunks one at a time, or run all previous chunks, etc.
If some of your chunks take too long to run, then you should consider organizing your work differently. Put the long computations into their own script, and save the results of that script using save(). Run it once, then spend time editing the display of those results in multiple runs in the main .Rnw document.
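The split into a long-running script and a display document might look like the sketch below. The file name and the object slow_result are hypothetical, and the "long computation" is a trivial placeholder.

```r
# --- long-running script, run once ---
results_file <- file.path(tempdir(), "long_results.RData")
slow_result <- sapply(1:5, function(i) i^2)   # placeholder for a slow computation
save(slow_result, file = results_file)
rm(slow_result)

# --- in the .Rnw document, on every compile ---
loaded_names <- load(results_file)            # restores slow_result quickly
summary(slow_result)
```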
Finally, if you really want to see variables at the end of a run of your vignette, you can add save.image(file = 'vignette.RData') at the end, and in your interactive session, use load('vignette.RData') to load the values for examination. This won't necessarily give you an accurate view of the state of things at the end of the run: it loads the values in addition to anything you've already got in your workspace, and it won't restore option settings or attach packages. Still, it might be enough for debugging.
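A sketch of that approach follows, with one optional variation: loading the image into a separate environment via the envir argument of load(), so the saved values do not mix with what is already in the workspace. The variable vignette_value and the temporary file location are assumptions made for this example.

```r
# At the end of the document's code (hypothetical variable):
vignette_value <- 42
image_file <- file.path(tempdir(), "vignette.RData")
save.image(file = image_file)

# In the interactive session: load into a separate environment
# instead of the global workspace, then inspect it.
inspect_env <- new.env()
loaded_names <- load(image_file, envir = inspect_env)
inspect_env$vignette_value
```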

Run R (on Linux) interactively with a simple repl (no readline, no assumed terminal)

I'd like to run R as an inferior process. I'd like R to display help pages as HTML (in the browser) and to display plots as if it were run interactively. I know I can run R with the --interactive argument, but then R seems to assume it's running on a terminal and emits control sequences.
How should I run R to get html help pages & plot (as with --interactive) and simple textual output with no control sequences?
If I run plain R, it assumes it is running non-interactively (HTML help pages and plots won't work, but the output contains no control sequences).
If I use R --interactive --no-readline, it runs interactively (help pages and plots work as expected), but the output is garbled and difficult to parse since R assumes it is running on a terminal.
Is there a way to control the assumptions R makes about the terminal it's running in?
It seems that argument order matters. (Part of) the problem gets resolved when calling R as R --no-readline --interactive.

Extract what was printed deep in the R console

When I execute commands in R, the output is printed in the console. After some threshold (I guess, some maximum number of lines), the R console no longer shows the first commands and their output. I cannot scroll up that far because it is simply no longer there.
How can I access this "early" output if it has disappeared from the console?
I care mostly about error messages and messages generated by my own script. I do use a script file and save my results to a file, if anyone wonders, but this still does not help solve my problem.
(I have tried saving the R workspace and R history and then loading it again, but did not know what to do next and was not able to find what I needed...)
