I have the following code running and it's taking a long time. How do I know whether it's still doing its job or has gotten stuck somewhere?
noise4<-NULL;
for(i in 1:length(noise3))
{
if(is.na(noise3[i])==TRUE)
{
next;
}
else
{
noise4<-c(noise4,noise3[i]);
}
}
noise3 is a vector with 2418233 data points.
You just want to remove the NA values. Do it like this:
noise4 <- noise3[!is.na(noise3)]
This will be pretty much instant.
Or as Joshua suggests, a more readable alternative:
noise4 <- na.omit(noise3)
Your code was slow because:
It uses explicit loops which tend to be slow under the R interpreter.
You reallocate memory every iteration.
The memory reallocation is probably the biggest handicap to your code.
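As a quick sanity check, here is a small sketch on a made-up five-element vector (not the asker's data) suggesting that the two one-liners keep the same values. The only difference it shows is that na.omit() also attaches an "na.action" attribute recording which positions were dropped.

```r
# Toy data standing in for noise3 (values are made up for illustration)
x <- c(1.5, NA, 2.0, NA, 3.25)

by_index <- x[!is.na(x)]   # logical indexing
by_omit  <- na.omit(x)     # same values, plus an "na.action" attribute

# Strip attributes before comparing the values themselves
same_values <- identical(by_index, as.vector(by_omit))
dropped     <- as.integer(attr(by_omit, "na.action"))  # positions of the NAs
```

If the extra attribute gets in the way later on, the indexing form avoids it.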
I wanted to illustrate the benefits of pre-allocation, so I tried to run your code... but I killed it after ~5 minutes. I recommend you use noise4 <- na.omit(noise3) as I said in my comments. This code is solely for illustrative purposes.
# Create some random data
set.seed(21)
noise3 <- rnorm(2418233)
noise3[sample(2418233, 100)] <- NA
noise <- function(noise3) {
# Pre-allocate
noise4 <- vector("numeric", sum(!is.na(noise3)))
j <- 0 # index of the last filled slot in noise4
for(i in seq_along(noise3)) {
if(is.na(noise3[i])) {
next
} else {
j <- j + 1
noise4[j] <- noise3[i]
}
}
noise4 # return the filtered vector
}
system.time(noise(noise3)) # MUCH less than 5+ minutes
# user system elapsed
# 9.50 0.44 9.94
# Let's see what we gain from compiling
library(compiler)
cnoise <- cmpfun(noise)
system.time(cnoise(noise3)) # a decent reduction
# user system elapsed
# 3.46 0.49 3.96
The other answers have given you much, much better ways to do the task that you actually set out to achieve (removing NA values in your data), but an answer to the specific question you asked ("how do I know if R is actually working or if it has instead gotten stuck?") is to introduce some output (cat) statements in your loop, as follows:
rpt <- 10000 ## reporting interval
noise4<-NULL;
for(i in 1:length(noise3))
{
if (i %% rpt == 0) cat(i,"\n")
if(is.na(noise3[i])==TRUE)
{
next;
}
else
{
noise4<-c(noise4,noise3[i]);
}
}
If you run this code you can immediately see that it slows down radically as it gets farther into the loop (a consequence of the failure to pre-allocate space) ...
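If printing a counter every few thousand iterations feels too noisy, the progress bar that ships with R's utils package is one alternative. The sketch below is only an illustration: the loop length n and the trivial loop body are placeholders, not the asker's code.

```r
# Hypothetical stand-in for a long loop; n and the loop body are placeholders
n  <- 50
pb <- txtProgressBar(min = 0, max = n, style = 3)  # style 3 shows a percentage
total <- 0
for (i in seq_len(n)) {
  total <- total + i          # the real work would go here
  setTxtProgressBar(pb, i)    # redraws the bar in place
}
close(pb)
```

A bar that advances ever more slowly tells the same story as the cat() output: the loop is alive but each iteration is getting more expensive.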
The others have all given correct ways to do the same problem, so that you needn't worry about speed. @BenBolker also gave a good pointer regarding regular output.
A different thing to note is that if you find yourself in a loop, you can break out of it and find the value of i. Assuming that re-starting from that value of i won't harm things, i.e. using that value twice won't be a problem, you can restart. Or, you can just finish the job as the others have stated.
A separate trick is that if the loop is slow (and can't be vectorized or else you're not eager to break out of the loop), AND you don't have any reporting, you can still look for an external method to see if R is actually consuming cycles on your computer. In Linux, the top command is your best bet. On Windows, the task manager will do the trick (I prefer to use the SysInternals / Microsoft program Process Explorer). 'top' also exists on Macs, though I believe there are some other more popular tools.
One other word of advice: if you have a really long loop to run, I strongly encourage saving the results regularly. I typically create a file with a name like myPrefix_YYYYMMDDHHMMSS.rdat. This way everything can go to hell and you can still start your loop where you left off.
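A minimal sketch of that save-and-resume idea, under some assumptions of mine: the file name, the checkpoint interval, and the sqrt() "work" are all placeholders, and a fixed file name is used so the resume step can find it (a timestamped name could be built with format(Sys.time(), "%Y%m%d%H%M%S") instead).

```r
# Hypothetical checkpointing loop: file name, interval, and the "work" are placeholders
ckpt_file <- file.path(tempdir(), "myPrefix_checkpoint.rds")
n     <- 1000
every <- 100

# Resume from a previous checkpoint if one exists, otherwise start fresh
if (file.exists(ckpt_file)) {
  state <- readRDS(ckpt_file)
} else {
  state <- list(i = 0, result = numeric(n))
}

while (state$i < n) {
  state$i <- state$i + 1
  state$result[state$i] <- sqrt(state$i)            # stand-in for the real computation
  if (state$i %% every == 0) saveRDS(state, ckpt_file)  # periodic checkpoint
}
```

Keeping the loop counter inside the saved object means an interrupted run picks up at the last checkpoint rather than at 1.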
I don't always iterate, but when I do, I use these tricks. Stay speedy, my friend.
In one case I faced, updating all the packages in use under RStudio resolved the issue.
Pls help me
GameMaker 2.3 came out a few weeks ago, and in the GameMaker language scripts were changed into functions. After converting my files so that I could reopen them, I double-checked all the scripts, but when I start the game it just stays on a black screen. It doesn't give me any compilation errors or anything else. What could be the problem?
Ps.
I might sound stupid, but if someone has the same program as me I can pass the project to them so they can see the scripts for themselves. It's basically just the base, with only the scripts for making the player walk and for collisions. I know that no one would want to waste time on it, but I'm asking all the same.
It's possible that your code is stuck in an infinite loop. Here's an example of what that might look like:
var doloop = true
while(doloop == true){
x += 1
y += 1
}
The "doloop" variable is never changed within the while loop, so it is always equal to true and the loop never ends. Because the code never finishes looping, it can never get around to drawing anything, so you end up with a black screen. The easiest way to check for these is to put a breakpoint at the beginning of, and just after, every while/for/do/etc. loop and debug it, e.g. (I am using asterisks "*" to represent breakpoints):
var doloop = true
* while(doloop == true){
x += 1
y += 1
}
*
When you get to one of the loops, remove the first breakpoint and hit the "continue" button in the debugger. If the computer takes longer than it should to hit the second breakpoint (say you wait anywhere from ten seconds to two minutes, depending on how complex the code is, and it still hasn't hit it), put the breakpoint back at the beginning of the loop to check whether execution is still in there. If it is still in the loop, then that is likely where the code is getting stuck. Review the loop and everywhere any associated variables are set or changed, and you should be able to find the problem (even if it takes a while).
Majestic_Monkey_ and the commenters are correct: use the debugger. It's easy and it's your friend. Just place a red circle on the very first line of code that runs, click the little bug icon, and you can step through your code easily.
But to address your specific issue (or if anyone in the future has this issue): scripts have changed into files that can have many functions. Where you used to have
//script_name
var num = argument0 + argument1;
return num;
You would now have
function script_name(a, b) {
var num = a + b;
return num;
}
All you have to do is create a declaration for your new function:
function my_function_name(argument_names, etc...)
Then wrap all your old code in { }, and replace all those ugly "argument0" things with actual names. It's that easy. Plus you can have more than one function per script!
I wonder if there is a way to display the current time in the R command line, like in MS DOS, we can use
Prompt $T $P$G
to include the time clock in every prompt line.
Something like
options(prompt=paste(format(Sys.time(), "%H:%M:%S"),"> "))
will do it, but then it is fixed at the time it was set. I'm not sure how to make it update automatically.
Chase points the right way, as options("prompt"=...) can be used for this. But his solution sets a constant time expression, which is not what we want.
The documentation for the function taskCallbackManager has the rest:
R> h <- taskCallbackManager()
R> h$add(function(expr, value, ok, visible) {
+ options("prompt"=format(Sys.time(), "%H:%M:%S> "));
+ return(TRUE) },
+ name = "simpleHandler")
[1] "simpleHandler"
07:25:42> a <- 2
07:25:48>
We register a callback that gets evaluated after each command completes. That does the trick. More fancy documentation is in this document from the R developer site.
None of the other methods, which are based on callbacks, will update the prompt unless a top-level command is executed. So, pressing return in the console will not create a change. Such is the nature of R's standard callback handling.
If you install the tcltk2 package, you can set up a task scheduler that changes the option() as follows:
library(tcltk2)
tclTaskSchedule(1000, {options(prompt=paste(Sys.time(),"> "))}, id = "ticktock", redo = TRUE)
Voila, something like the MS DOS prompt.
NB: Inspiration came from this answer.
Note 1: The wait time (1000 in this case) refers to the # of milliseconds, not seconds. You might adjust it downward when sub-second resolution is somehow useful.
Here is an alternative callback solution:
updatePrompt <- function(...) {options(prompt=paste(Sys.time(),"> ")); return(TRUE)}
addTaskCallback(updatePrompt)
This works the same as Dirk's method, but the syntax is a bit simpler to me.
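If the clock prompt ever needs to be switched off again, one way is to register the callback under a name and remove it later. This is a sketch rather than part of the original answers; the name "clockPrompt" and the old_prompt variable are arbitrary choices of mine.

```r
# Register the prompt-updating callback under a name (the name is arbitrary)
updatePrompt <- function(...) {
  options(prompt = paste(format(Sys.time(), "%H:%M:%S"), "> "))
  TRUE  # returning TRUE keeps the callback registered
}
old_prompt <- getOption("prompt")
addTaskCallback(updatePrompt, name = "clockPrompt")

# Later: unregister it and restore the original prompt
removed <- removeTaskCallback("clockPrompt")
options(prompt = old_prompt)
```

getTaskCallbackNames() lists whatever is currently registered, which helps when a callback was added without keeping its id.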
You can change the default character that is displayed through the options() command. You may want to try something like this:
options(prompt = paste(Sys.time(), ">"))
Check out the help page for ?options for a full list of things you can set. It is a very useful thing to know about!
Assuming this is something you want to do for every R session, consider moving that to your .Rprofile. Several other good nuggets of programming happiness can be found hither on that topic.
I don't know of a native R function for doing this, but I know R has interfaces with other languages that do have system time commands. Maybe this is an option?
Thierry mentioned system.time() and there is also proc.time() depending on what you need it for, although neither of these give you the current time.
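For reference, base R does expose the wall clock through Sys.time(), which is what the prompt examples in this thread rely on. The short sketch below contrasts it with the two timing functions just mentioned; the sqrt() workload is only a placeholder.

```r
now   <- Sys.time()                      # current date-time (class POSIXct)
clock <- format(now, "%H:%M:%S")         # wall-clock string suitable for display

t0   <- proc.time()[["elapsed"]]         # elapsed seconds since the R process started
invisible(sum(sqrt(1:1e5)))              # placeholder work to time
took <- proc.time()[["elapsed"]] - t0    # seconds spent on that work

timing <- system.time(sum(sqrt(1:1e5)))  # the same measurement in one call
```

So Sys.time() answers "what time is it?", while proc.time() and system.time() answer "how long did that take?".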
I have looked at other posts that appeared similar to this question but they have not helped me. This may be just my ignorance of R. Thus I decided to sign up and make my first post on stack-overflow.
I am running an R-script and would like the user to decide either to use one of the two following loops. The code to decide user input looks similar to the one below:
#Define the function
method.choice<-function() {
Method.to.use<-readline("Please enter 'New' for new method and 'Old' for old method: ")
while(Method.to.use!="New" && Method.to.use!="Old"){ #Make sure selection is one of two inputs
cat("You have not entered a valid input, try again", "\n")
Method.to.use<-readline("Please enter 'New' for new method and 'Old' for old method: ")
cat("You have selected", Method.to.use, "\n")
}
return(Method.to.use)
}
#Run the function and keep its return value for the checks below
Method.to.use<-method.choice()
Then below this I have the two possible choices:
if(Method.to.use=="New") {
for(i in 1:nrow(linelist)){...}
}
if(Method.to.use=="Old"){
for(i in 1:nrow(linelist)){...}
}
My issue is, and what I have read from other posts, is that whether I use "readline", "scan" or "ask", R does not wait for my input. Instead R will use the following lines as the input.
The only way I found that R would pause for input is if the code is all on the same line or if it is run line by line (instead of selecting all the code at once). See example from gtools using "ask":
silly <- function()
{
age <- ask("How old are you? ")
age <- as.numeric(age)
cat("In 10 years you will be", age+10, "years old!\n")
}
This runs with a pause:
silly(); paste("this is quite silly")
This does not wait for input:
silly()
paste("this is quite silly")
Any guidance would be appreciated to ensure I can still run my entire script and have it pause at readline without continuing. I am using RStudio and I have checked that interactive() returns TRUE.
The only other work-around I found is wrapping my entire script into one main function, which is not ideal for me. This may require me to use <<- to write to my environment.
Thank you in advance.
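One pattern that is sometimes suggested for this situation is to keep the prompt inside a function that returns the choice, and to run the whole file with source() rather than pasting it into the console, since pasted lines are what get consumed as the answer to readline(). The sketch below assumes that approach. The reader argument and the max.tries limit are hypothetical additions, not part of the code above; they let the function be exercised without a keyboard and stop it from looping forever when no input is available.

```r
# `reader` is a hypothetical hook (defaults to readline) so the function can be tested
method.choice <- function(reader = readline, max.tries = 5) {
  msg <- "Please enter 'New' for new method and 'Old' for old method: "
  for (attempt in seq_len(max.tries)) {
    answer <- reader(msg)
    if (answer %in% c("New", "Old")) return(answer)
    cat("You have not entered a valid input, try again\n")
  }
  stop("no valid choice after ", max.tries, " attempts")
}

# Non-interactive demonstration with canned answers instead of keyboard input
canned <- c("maybe", "Old")
fake.reader <- function(prompt) {
  answer <- canned[1]
  canned <<- canned[-1]   # consume one canned answer per call
  answer
}
Method.to.use <- method.choice(reader = fake.reader)
```

In an interactive session the call would simply be Method.to.use <- method.choice(), with the if blocks that test Method.to.use placed after it in the same sourced file.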
I was just wondering what is the best way in R to keep on printing on the same line in a loop, to avoid swamping your console? Let's say to print a value indicating your progress, as in
for (i in 1:10) {print(i)}
Edit:
I tried inserting carriage returns before each value as in
for (i in 1:10000) {cat("\r",i)}
but that also doesn't quite work as it will just update the value on the screen after the loop, just returning 10000 in this case.... Any thoughts?
NB this is not to make a progress bar, as I know there are various features for that, but just to be able to print some info during the progression of some loop without swamping the console
You have the answer, it's just looping too quickly for you to see. Try:
for (i in 1:10) {Sys.sleep(1); cat("\r",i)}
EDIT: Actually, this is very close to @Simon O'Hanlon's answer, but given the confusion in the comments and the fact that it isn't exactly the same, I'll leave it here.
Try using cat()...
for (i in 1:10) {cat(paste(i," "))}
#1 2 3 4 5 6 7 8 9 10
cat() performs much less conversion than print() (from the horse's mouth).
To repeatedly print in the same place, you need to clear the console. I am not aware of another way to do this, but thanks to this great answer this works (in RStudio on Windows at least):
for (i in 1:1e3) {
cat( i )
Sys.sleep(0.01)
cat("\014")
}
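Another variation on the carriage-return idea, sketched under the assumption that the console buffers output until the loop ends (which would explain only the last value showing up): write the line with "\r" and then call flush.console() so each update is pushed out immediately. The function name show_progress and its arguments are made up for this example.

```r
# Overwrite one console line; n and the pause length are arbitrary
show_progress <- function(n, pause = 0) {
  for (i in seq_len(n)) {
    cat(sprintf("\r%6d of %d", i, n))  # \r returns to the start of the line
    utils::flush.console()             # push the text out now, not after the loop
    if (pause > 0) Sys.sleep(pause)    # optional delay so the updates are visible
  }
  cat("\n")                            # finish the line once the loop is done
  invisible(n)
}

done <- show_progress(200)
```

The fixed-width %6d keeps each update at least as long as the previous one, so no stale characters are left behind on the line.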
Well... are you worried about hangs, or just about being notified when the job completes?
In the first case, I'd stick w/ my j%%N suggestion, where N is large enough that you don't swamp the console.
In the second case, add a final line to your script or function which, e.g., calls "Beep" .
Say my executable is c:\my irectory\myfile.exe and my R script calls this executable with system(myfile.exe)
The R script gives parameters to the executable programme, which uses them to do numerical calculations. From the output of the executable, the R script then tests whether the parameters are good or not. If they are not good, the parameters are changed and the executable is rerun with the updated parameters.
Now, as this executable carries out mathematical calculations and solutions may converge only slowly, I wish to be able to kill the executable once it has taken too long to carry out the calculations (say 5 seconds).
How do I do this time-dependent kill?
PS:
My question is a little related to this one: (time non dependant kill)
how to run an executable file and then later kill or terminate the same process with R in Windows
You can add code to your R function which issued the executable call:
setTimeLimit(elapsed = 5, transient = TRUE)
This will kill the calling function, returning control to the parent environment (which could well be a function as well). Then use the examples in the question you linked to for further work.
Alternatively, set up a loop which examines Sys.time and if the expected update to the parameter set has not taken place after 5 seconds, break the loop and issue the system kill command to terminate myfile.exe .
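A minimal sketch of such a polling loop, with some assumptions spelled out: wait_until and is_done are hypothetical names, is_done stands for whatever check tells you the parameter update has arrived, and the executable would need to be started without blocking R (for example with system(..., wait = FALSE)) for the loop to run alongside it.

```r
# Generic deadline loop; `is_done` is a hypothetical check supplied by the caller
wait_until <- function(is_done, timeout = 5, poll = 0.1) {
  deadline <- Sys.time() + timeout   # POSIXct plus a number of seconds
  while (Sys.time() < deadline) {
    if (is_done()) return(TRUE)      # finished in time
    Sys.sleep(poll)                  # avoid a busy loop
  }
  FALSE                              # timed out; the caller can now kill the process
}

# Demonstration with a condition that never becomes true
finished <- wait_until(function() FALSE, timeout = 0.3, poll = 0.05)
# if (!finished) system('taskkill /im "myfile.exe" /f')   # Windows-only kill command
```

Returning a plain TRUE/FALSE keeps the kill decision in the calling code, where the parameters are updated anyway.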
There might possibly be nicer ways but it is a solution.
The assumption here is that myfile.exe successfully does its calculation within 5 seconds.
library(R.utils) # provides evalWithTimeout(); newer releases call it withTimeout()
try.wtl <- function(timeout = 5)
{
y <- evalWithTimeout(system(myfile.exe), timeout = timeout, onTimeout= "warning")
if(inherits(y, "try-error")) NA else y
}
case 1 (myfile.exe is closed after a successful calculation)
g <- try.wtl(5)
case 2 (myfile.exe is not closed after a successful calculation)
g <- try.wtl(0.1)
The MS-DOS taskkill command is required in case 2 so that the run can recommence from the beginning
if (is.null(g)) {system('taskkill /im "myfile.exe" /f',show.output.on.console = FALSE)}
PS: inspiration came from Time out an R command via something like try()