I'm running a long R script which takes 2 or 3 days to finish. I accidentally ran another script which, if R works as it usually does, will go into some queue, and R will run it as soon as the first script is over. I need to stop that, as it would compromise the results of the first script. Is there a visible queue, or any other way to stop R from running that code?
I'm working in an interactive session in RStudio, on Windows 10.
Thanks a lot for any help!
Assuming you're running in the console (or an interactive session in RStudio; that's undetermined from your question), and that what you did was source a script or paste code, and then paste another chunk of code while the first was still running:
What is happening is that you pushed data into the R process's input stream. The input is buffered, so R will run each line as soon as the previous call has finished and freed the process.
There's no easy way to play with the input buffer; that's R's internal input/output system, and for now it's mostly the operating system that holds that information in its cache.
Asking R itself is not possible, as it already has this buffer to read; any new command would be queued after it.
Last-chance option: if you can spot the second chunk of code starting in your console, you can try pressing the Esc key to stop the running code.
You could try messing with the process's buffers using Process Explorer (procexp), but there's a fair chance you'd just make your R session segfault anyway.
To avoid this in the future, put your code in scripts and run them separately on the command line with Rscript (present in R's bin directory on Windows too, despite the link pointing to a Linux manpage).
This creates one session per script and lets you kill them independently. That said, if they both write to the same place (a database, say; a file would raise an error if accessed by two processes), this won't prevent data corruption.
I am guessing the OP has the problem below:
# my big code, running for a long time
Sys.sleep(10); print("hello 1")
# other big code I dropped in console while R was still busy with above code
print("hello 2")
If this is the case, I don't think it is possible to stop the second chunk of code from running.
I have been using Jupyter Notebook for a while. Often when I try to stop a cell execution, interrupting the kernel does not work. In this case, what else can I do, other than just closing the notebook and relaunching it again? I guess this might be a common situation for many people.
This is currently an open issue in the Jupyter GitHub repository as well:
https://github.com/ipython/ipython/issues/3400
There seems to be no exact solution for it other than killing the kernel.
If you're ok with losing all currently defined variables, then going to Kernel > Restart will stop execution without closing the notebook.
This worked for me:
- Put the laptop to sleep (one of the power options)
- Wait 10 s
- Wake up computer (with power button)
The kernel then says it is reconnecting, and either the execution has already been interrupted or you can now press Interrupt.
This probably isn't foolproof, but it's worth a try so you don't waste the computation time already spent.
(I had Windows 10 running a Jupyter Notebook that wouldn't stop running a piece of Selenium code)
There are a few options here:
Change the name of the data folder:
Works if the cell is already running and pulling data from a particular folder, since the remaining reads will fail. For example, I had a for loop that, when interrupted, just moved on to the next item in the list it was processing, so interrupting alone never stopped it.
Change the code in the cell to generate an error:
Works if the cell has not run yet but is just queued.
Restart the kernel:
If all else fails.
Recently I also faced a similar issue.
I found out that there was an issue in IPython (https://github.com/ipython/ipython/issues/3400) that had been open for some six years; it was resolved as of 1 March 2020.
One thing that might work is hitting interrupt a bunch of times. It's possible that a library you are using catches the interrupt signal and only stops after receiving the signal multiple times.
For example, when using sklearn's cross_val_score() I found that I have to interrupt once for each cross validation fold.
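As a toy illustration of why this happens (this is not sklearn's actual code, just the general pattern): if each iteration catches KeyboardInterrupt on its own, one interrupt only aborts the current unit of work.
import time

def long_running_fit(fold):
    # Stand-in for one expensive unit of work, e.g. one cross-validation fold
    time.sleep(60)

for fold in range(5):
    try:
        long_running_fit(fold)
    except KeyboardInterrupt:
        # One interrupt only aborts this fold; the loop keeps going,
        # so you need one interrupt per remaining fold to stop fully.
        print(f"fold {fold} interrupted, moving on")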
If you know in advance that you might want to stop without losing all your variables, the following solution might be useful:
In cells that take a while because of long loops, you may implement something like this in the loop:
if os.path.exists(os.path.join(os.getcwd(), 'stop_true.txt')):
    break
Then, when you want to stop, just create the file 'stop_true.txt', and the loop will exit before the next iteration.
Usually, the file is called 'stop_false.txt' until I rename it to stop the loop.
Additionally, the results of each iteration are stored separately in a dictionary, so I keep all results computed before the break and can restart the loop from that point onwards.
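A minimal self-contained sketch of this pattern (compute_step() and the loop range are placeholders for your own work):
import os

def compute_step(i):
    # Stand-in for the real, slow per-iteration work
    return i ** 2

results = {}  # keeps everything computed before the stop
for i in range(1000):
    # Rename stop_false.txt to stop_true.txt to exit cleanly before the next round
    if os.path.exists(os.path.join(os.getcwd(), 'stop_true.txt')):
        break
    results[i] = compute_step(i)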
If the IPython kernel did not die, you might be able to inject Python code into it that saves important data, using pyrasite. You need to install and run pyrasite as root, i.e. with sudo python -m pip install pyrasite (or python3 as needed). Then you need to figure out the process id (PID) of the IPython kernel (e.g. via htop or ps aux | grep ipython), say 3873. Then write a script, say inject.py, that saves the state, for example to a pickle; say it is a Pandas DataFrame df in the global scope:
df.to_pickle("rescued_df.pkl")
Finally, inject it into the process as follows:
sudo pyrasite 3873 inject.py
You may need to relax the ptrace security restriction first, like so:
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
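A slightly fuller inject.py sketch along the same lines; df and results are example names that must already exist in the kernel's global scope, so adjust them to your session:
# inject.py -- executed inside the running kernel by pyrasite
import pickle

df.to_pickle("rescued_df.pkl")
with open("rescued_results.pkl", "wb") as f:
    pickle.dump(results, f)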
For me, setting up a time limit worked: https://github.com/scipopt/PySCIPOpt/issues/197. Specifically, I added model.setRealParam("limits/time", 60), and it automatically stops the calculation after 60 seconds; you can set any time instead of 60. But this is for the pyscipopt package (solving an optimization model); I am not sure how to set up a time limit for your specific problem.
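In PySCIPOpt that looks roughly like the following minimal sketch (the model construction is omitted and left as a placeholder):
from pyscipopt import Model

model = Model()
# ... add variables, constraints, and an objective here ...
model.setRealParam("limits/time", 60)  # stop the solver after 60 seconds
model.optimize()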
Try this:
Close the browser tab in which Jupyter is running
Run jupyter server list
Kill each running server with jupyter server stop <PORT>
You can force the termination by deleting the cell. I copy the code, delete the cell, create a new cell, paste, and execute again. Works like a charm.
I suggest restarting the kernel (Kernel -> Restart Kernel), as suggested by @hamdog.
It will be ready to use after that. However, it will certainly delete all variables stored in memory.
Whenever I run a Shiny app from ESS, it works, but I can't get the prompt back without killing the whole R session (like clicking the "Stop" button in RStudio). The usual C-c C-c or C-g don't work, so I have to resort to C-x k. How do I kill the Shiny process without killing R?
Well, I finally found out the reason. I had the option
(setq comint-prompt-read-only t)
set somewhere in my init files. Apparently, when this option is set it becomes impossible (well, beyond me anyway) to send a kill signal to the R process. I don't understand what is happening though. If I run a server directly from httpuv I can kill it even with the option set, but not when running an app through shiny.
You could use
C-c C-c
in iESS to quit the shiny app.
I want to execute an R script in the background from the R console.
From the console, I usually run an R script with source('~/.active-rstudio-document').
I have to wait until the script has completed before going ahead with the rest of my work.
Instead, I want R to run the script in the background while I continue working in the console.
I should also somehow be notified when R completes the source command.
Is this possible in R?
This could be quite useful, as we often see jobs taking a long time.
PS: I want the sourced script to run in the same memory space rather than a new one, so solutions like fork, system, etc. won't work for me. I am looking into whether I can run the R script as a separate thread rather than a separate process.
You can use system() and Rscript to run your script as an asynchronous background process:
system("Rscript -e 'source(\"your-script.R\")'", wait=FALSE)
At the end of your script, you may save your objects with save.image() so you can load them later, and notify yourself of its completion with cat():
...
save.image("script-output.RData")
cat("Script completed\n\n")
Hope this helps!
I am running some analysis in R which is going to take at least 24 hours to finish. Is it possible to pause the function midway, so that I can take my computer to work and back?
This is not possible AFAIK, but I believe you can just suspend your computer, and the process will automatically be paused.
If you are using Linux, you can also stop and continue a process manually using the killall -STOP R and killall -CONT R commands. Take a look at this article and the comment section there, which contain useful information on this.
On Windows, you can maybe use the Task Manager or install special software capable of doing that. But I really do not know, as I do not use Windows on a regular basis.
EDIT: even if you use kill or killall to pause the process, you will lose the data if you shut down the computer.