I was just wondering what is the best way in R to keep on printing on the same line in a loop, to avoid swamping your console? Let's say to print a value indicating your progress, as in
for (i in 1:10) {print(i)}
Edit:
I tried inserting carriage returns before each value as in
for (i in 1:10000) {cat("\r",i)}
but that doesn't quite work either: the screen is only updated after the loop finishes, showing just 10000 in this case. Any thoughts?
NB: this is not to make a progress bar, as I know there are various features for that, but just to be able to print some info while a loop runs without swamping the console.
You have the answer, it's just looping too quickly for you to see. Try:
for (i in 1:10) {Sys.sleep(1); cat("\r",i)}
EDIT: Actually, this is very close to @Simon O'Hanlon's answer, but given the confusion in the comments and the fact that it isn't exactly the same, I'll leave it here.
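To see the same idea without one-second pauses, here is a minimal sketch. status_line is a hypothetical helper (it is not from any package): it pads the message to a fixed width so that a shorter message fully overwrites a longer one. flush.console() asks the console to display buffered output right away, which mostly matters on GUI front ends that buffer output.

```r
# Hypothetical helper: build a status line that starts with a carriage
# return and is left-justified and padded to a fixed width.
status_line <- function(i, n, width = 30) {
  msg <- sprintf("Processing %d of %d", i, n)
  paste0("\r", formatC(msg, width = width, flag = "-"))
}

n <- 20
for (i in seq_len(n)) {
  cat(status_line(i, n))  # "\r" moves the cursor back to the line start
  flush.console()         # display the output now instead of buffering it
  Sys.sleep(0.02)         # stand-in for real work
}
cat("\n")                 # finish the line once the loop is done
```

Whether "\r" really returns to the start of the line depends on the console, so treat this as something to try rather than a guarantee.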
Try using cat()...
for (i in 1:10) {cat(paste(i," "))}
#1 2 3 4 5 6 7 8 9 10
cat() performs much less conversion than print() (from the horse's mouth).
To repeatedly print in the same place, you need to clear the console. I am not aware of another way to do this, but thanks to this great answer this works (in RStudio on Windows at least):
for (i in 1:1e3) {
cat( i )
Sys.sleep(0.01)
cat("\014")
}
Well... are you worried about hangs, or just about being notified when the job completes?
In the first case, I'd stick with my j%%N suggestion, where N is large enough that you don't swamp the console.
In the second case, add a final line to your script or function which, e.g., calls "Beep".
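Both suggestions can be sketched in a few lines. The value of N and the loop length are arbitrary, and the reported vector exists only to make the behaviour easy to check. The final cat("\a") writes the ASCII bell character; whether that makes a sound depends on the console, so it is an assumption that it will beep on your setup.

```r
N <- 1000                  # reporting interval, chosen arbitrarily
reported <- integer(0)     # only here so the reporting can be inspected

for (j in 1:5000) {
  # ... real work would go here ...
  if (j %% N == 0) {
    cat("iteration", j, "\n")
    reported <- c(reported, j)
  }
}

cat("\a")                  # ASCII bell: may or may not be audible
cat("done\n")
```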
I wonder if there is a way to display the current time in the R command line, like in MS DOS, we can use
Prompt $T $P$G
to include the time clock in every prompt line.
Something like
options(prompt=paste(format(Sys.time(), "%H:%M:%S"),"> "))
will do it, but then it is fixed at the time it was set. I'm not sure how to make it update automatically.
Chase points the right way as options("prompt"=...) can be used for this. But his solution adds a constant time expression which is not what we want.
The documentation for the function taskCallbackManager has the rest:
R> h <- taskCallbackManager()
R> h$add(function(expr, value, ok, visible) {
+ options("prompt"=format(Sys.time(), "%H:%M:%S> "));
+ return(TRUE) },
+ name = "simpleHandler")
[1] "simpleHandler"
07:25:42> a <- 2
07:25:48>
We register a callback that gets evaluated after each command completes. That does the trick. More fancy documentation is in this document from the R developer site.
None of the other methods, which are based on callbacks, will update the prompt unless a top-level command is executed. So, pressing return in the console will not create a change. Such is the nature of R's standard callback handling.
If you install the tcltk2 package, you can set up a task scheduler that changes the option() as follows:
library(tcltk2)
tclTaskSchedule(1000, {options(prompt=paste(Sys.time(),"> "))}, id = "ticktock", redo = TRUE)
Voila, something like the MS DOS prompt.
NB: Inspiration came from this answer.
Note: The wait time (1000 in this case) is in milliseconds, not seconds. You might adjust it downward if sub-second resolution is useful.
Here is an alternative callback solution:
updatePrompt <- function(...) {options(prompt=paste(Sys.time(),"> ")); return(TRUE)}
addTaskCallback(updatePrompt)
This works the same as Dirk's method, but the syntax is a bit simpler to me.
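If you later want the plain prompt back, giving the callback a name makes it easy to remove. A minimal sketch, assuming the name "clockPrompt" (made up for this example) is not already in use:

```r
# Callback that refreshes the prompt after each top-level command.
update_prompt <- function(expr, value, ok, visible) {
  options(prompt = format(Sys.time(), "%H:%M:%S> "))
  TRUE  # returning TRUE keeps the callback registered
}

# Register it under a name so it can be found again later.
id <- addTaskCallback(update_prompt, name = "clockPrompt")

# ... work in the console for a while ...

# Undo: remove the callback by name and restore the default prompt.
removed <- removeTaskCallback("clockPrompt")
options(prompt = "> ")
```

removeTaskCallback() returns a logical indicating whether the callback was actually removed, so it is worth checking if you script this.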
You can change the default character that is displayed through the options() command. You may want to try something like this:
options(prompt = paste(Sys.time(), ">"))
Check out the help page for ?options for a full list of things you can set. It is a very useful thing to know about!
Assuming this is something you want to do for every R session, consider moving that to your .Rprofile. Several other good nuggets of programming happiness can be found hither on that topic.
I don't know of a native R function for doing this, but I know R has interfaces with other languages that do have system time commands. Maybe this is an option?
Thierry mentioned system.time() and there is also proc.time() depending on what you need it for, although neither of these gives you the current time.
I have my code running in a for loop over dates. The code takes a while to run, and there are a couple of days left, but I urgently need whatever results there are. Is there a way of breaking out of the for loop while keeping whatever data has been produced up to now?
Yes. You can press "escape", examine the results and then restart your loop.
for(iii in 1:100000000) force(iii)
# now press ESC
iii
# in my case 1121673
# use this value to restart the loop later:
for(iii in 1121674:100000000) force(iii)
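The trick above works because the loop variable lives in the workspace and survives the interrupt. The same holds for results, provided they are written to an object created before the loop. A minimal sketch, where dates and the data.frame built in each iteration are hypothetical stand-ins for the real computation:

```r
# Hypothetical input: five consecutive days.
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 5)

# Create the container outside the loop so it survives an interrupt.
results <- vector("list", length(dates))

for (k in seq_along(dates)) {
  if (!is.null(results[[k]])) next   # already computed: skip on a re-run
  results[[k]] <- data.frame(date = dates[k], value = k^2)  # stand-in work
}

# Combine whatever has been finished so far.
done <- do.call(rbind, Filter(Negate(is.null), results))
```

After pressing ESC, the last line can be run on its own to collect the finished iterations, and re-running the loop picks up at the first empty slot.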
I have looked at other posts that appeared similar to this question but they have not helped me. This may be just my ignorance of R. Thus I decided to sign up and make my first post on Stack Overflow.
I am running an R-script and would like the user to decide either to use one of the two following loops. The code to decide user input looks similar to the one below:
#Define the function
method.choice<-function() {
Method.to.use<-readline("Please enter 'New' for new method and 'Old' for old method: ")
while(Method.to.use!="New" && Method.to.use!="Old"){ #Make sure selection is one of two inputs
cat("You have not entered a valid input, try again", "\n")
Method.to.use<-readline("Please enter 'New' for new method and 'Old' for old method: ")
cat("You have selected", Method.to.use, "\n")
}
return(Method.to.use)
}
#Run the function
method.choice()
Then below this I have the two possible choices:
if(Method.to.use=="New") {
for(i in 1:nrow(linelist)){...}
}
if(Method.to.use=="Old"){
for(i in 1:nrow(linelist)){...}
}
My issue is, and what I have read from other posts, is that whether I use "readline", "scan" or "ask", R does not wait for my input. Instead R will use the following lines as the input.
The only way I found that R would pause for input is if the code is all on the same line or if it is run line by line (instead of selecting all the code at once). See example from gtools using "ask":
silly <- function()
{
age <- ask("How old are you? ")
age <- as.numeric(age)
cat("In 10 years you will be", age+10, "years old!\n")
}
This runs with a pause:
silly(); paste("this is quite silly")
This does not wait for input:
silly()
paste("this is quite silly")
Any guidance would be appreciated to ensure I can still run my entire script and have it pause at readline without continuing. I am using RStudio and I have checked that interactive() returns TRUE.
The only other work-around I found is wrapping my entire script into one main function, which is not ideal for me. This may require me to use <<- to write to my environment.
Thank you in advance.
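For what it's worth, the function-wrapping workaround does not need <<-: the function can return the choice and the caller can assign it. Below is a minimal sketch of that idea. choose_method and its ask argument are hypothetical names introduced here; ask defaults to readline and is only replaced by a scripted stand-in so the example can run without a keyboard. Running the whole script with source() instead of pasting the selected lines is commonly suggested for this problem, since pasted lines are otherwise consumed as the answer, but that is not verified here for every front end.

```r
# Ask until the answer is one of the two accepted values, then return it.
choose_method <- function(ask = readline) {
  prompt <- "Please enter 'New' for new method and 'Old' for old method: "
  repeat {
    answer <- ask(prompt)
    if (answer %in% c("New", "Old")) return(answer)
    cat("You have not entered a valid input, try again\n")
  }
}

# Interactive use would simply be:
#   Method.to.use <- choose_method()

# Scripted stand-in for the keyboard: first an invalid answer, then "Old".
scripted <- local({
  answers <- c("maybe", "Old")
  i <- 0
  function(prompt) {
    i <<- i + 1   # modifies the counter inside this closure only
    answers[i]
  }
})

Method.to.use <- choose_method(ask = scripted)
cat("You have selected", Method.to.use, "\n")
```

Note that in non-interactive use readline() returns "" immediately, so the loop above would never end when run that way with the default ask.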
In R, I will sometimes have a long for loop or lapply that I want to know the ongoing progress of.
Something like the following is in the spirit of what I want but doesn't work:
lapply(1:n,function(i) { print(i); MAIN COMPUTATIONS })
Ideally the above would print i at the beginning of each new iteration of the lapply.
QUESTION: How do I get ongoing progress updates of how many iterations my lapply or for loop has done?
It sounds like you're using RGui on Windows. There should be an option in one of the menus to tell it not to buffer the output. Alternatively you could call flush.console() after every time you print.
lapply(1:1000, function(i){print(i); flush.console()})
Note that this will slow down the code a little bit.
A solution using plyr
l_ply(1:10,function(x) x+1,.progress='text')
or you can define your progress using progress_text
l_ply(1:10000,function(x) x+1,.progress= progress_text(char = '*'))
|*********************************************************************| 100%
or with the option .print = TRUE, to print the result of each iteration
l_ply(1:4,function(x) x+1,.progress= progress_text(char = '+'),.print=TRUE)
| | 0%[1] 2
|++++++ | 25%[1] 3
|+++++++++++++++ | 50%[1] 4
|++++++++++++++++++++++ | 75%[1] 5
|++++++++++++++++++++++++++++++++ | 100%[1]
You might also want to look at the functions like winProgressBar, tkProgressBar, or txtProgressBar. The windows and tk versions are nice in that they can show you your progress, but don't clutter your output.
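As an illustration of the last suggestion, here is a minimal sketch with txtProgressBar from the utils package (attached by default). The sqrt(i) call is just a stand-in for the real computation, and n is arbitrary.

```r
n <- 50
pb <- txtProgressBar(min = 0, max = n, style = 3)  # style 3 shows a percentage

res <- lapply(seq_len(n), function(i) {
  out <- sqrt(i)             # stand-in for the main computations
  setTxtProgressBar(pb, i)   # advance the bar to iteration i
  out
})

close(pb)                    # finish the bar and move to a new line
```

The bar is redrawn in place, so the console is not flooded with one line per iteration.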
I have the following code running and it's taking me a long time to run. How do I know if it's still doing its job or it got stuck somewhere.
noise4<-NULL;
for(i in 1:length(noise3))
{
if(is.na(noise3[i])==TRUE)
{
next;
}
else
{
noise4<-c(noise4,noise3[i]);
}
}
noise3 is a vector with 2418233 data points.
You just want to remove the NA values. Do it like this:
noise4 <- noise3[!is.na(noise3)]
This will be pretty much instant.
Or as Joshua suggests, a more readable alternative:
noise4 <- na.omit(noise3)
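A small example shows that the two forms keep the same values. The one difference to be aware of is that na.omit() also attaches an "na.action" attribute recording which positions were dropped. The vector x is made up for the illustration.

```r
x <- c(1.5, NA, 3, NA, 4.25)   # toy data

a <- x[!is.na(x)]   # logical indexing: plain numeric vector
b <- na.omit(x)     # same values, plus an "na.action" attribute

same_values <- identical(a, as.vector(b))    # as.vector() drops attributes
dropped <- as.integer(attr(b, "na.action"))  # positions of the removed NAs

print(a)
print(same_values)
print(dropped)
```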
Your code was slow because:
It uses explicit loops which tend to be slow under the R interpreter.
You reallocate memory every iteration.
The memory reallocation is probably the biggest handicap to your code.
I wanted to illustrate the benefits of pre-allocation, so I tried to run your code... but I killed it after ~5 minutes. I recommend you use noise4 <- na.omit(noise3) as I said in my comments. This code is solely for illustrative purposes.
# Create some random data
set.seed(21)
noise3 <- rnorm(2418233)
noise3[sample(2418233, 100)] <- NA
noise <- function(noise3) {
  # Pre-allocate
  noise4 <- vector("numeric", sum(!is.na(noise3)))
  j <- 0  # index of the last filled slot in noise4
  for(i in seq_along(noise3)) {
    if(is.na(noise3[i])) {
      next
    } else {
      j <- j + 1
      noise4[j] <- noise3[i]
    }
  }
  noise4  # return the filtered vector
}
system.time(noise(noise3)) # MUCH less than 5+ minutes
# user system elapsed
# 9.50 0.44 9.94
# Let's see what we gain from compiling
library(compiler)
cnoise <- cmpfun(noise)
system.time(cnoise(noise3)) # a decent reduction
# user system elapsed
# 3.46 0.49 3.96
The other answers have given you much, much better ways to do the task that you actually set out to achieve (removing NA values in your data), but an answer to the specific question you asked ("how do I know if R is actually working or if it has instead gotten stuck?") is to introduce some output (cat) statements in your loop, as follows:
rpt <- 10000 ## reporting interval
noise4<-NULL;
for(i in 1:length(noise3))
{
if (i %% rpt == 0) cat(i,"\n")
if(is.na(noise3[i])==TRUE)
{
next;
}
else
{
noise4<-c(noise4,noise3[i]);
}
}
If you run this code you can immediately see that it slows down radically as it gets farther into the loop (a consequence of the failure to pre-allocate space) ...
The others have all given correct ways to do the same problem, so that you needn't worry about speed. @BenBolker also gave a good pointer regarding regular output.
A different thing to note is that if you find yourself in a loop, you can break out of it and find the value of i. Assuming that re-starting from that value of i won't harm things, i.e. using that value twice won't be a problem, you can restart. Or, you can just finish the job as the others have stated.
A separate trick is that if the loop is slow (and can't be vectorized or else you're not eager to break out of the loop), AND you don't have any reporting, you can still look for an external method to see if R is actually consuming cycles on your computer. In Linux, the top command is your best bet. On Windows, the task manager will do the trick (I prefer to use the SysInternals / Microsoft program Process Explorer). 'top' also exists on Macs, though I believe there are some other more popular tools.
One other word of advice: if you have a really long loop to run, I strongly encourage saving the results regularly. I typically create a file with a name like: myPrefix_YYYYMMDDHHMMSS.rdat . This way everything can go to hell and you can still start your loop where you left off.
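A minimal sketch of that habit follows. It is only one way to do it: it uses saveRDS() with an .rds file in tempdir() rather than the .rdat naming above, and the checkpoint interval, the loop body and the names out_file and state are all made up for the example.

```r
checkpoint_every <- 100   # arbitrary interval between saves

# Timestamped file name in the spirit of myPrefix_YYYYMMDDHHMMSS
out_file <- file.path(tempdir(),
                      format(Sys.time(), "myPrefix_%Y%m%d%H%M%S.rds"))

results <- numeric(1000)
for (i in seq_along(results)) {
  results[i] <- i^2       # stand-in for the real computation
  if (i %% checkpoint_every == 0) {
    # Save the progress so far together with the last finished index.
    saveRDS(list(last_i = i, results = results), out_file)
  }
}

# After a crash, reload and continue from state$last_i + 1.
state <- readRDS(out_file)
```

Saving every iteration would be safer but slower, so the interval is a trade-off you would tune to how expensive each iteration is.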
I don't always iterate, but when I do, I use these tricks. Stay speedy, my friend.
For one case I've faced, updating all packages in use under RStudio resolved the issue.