ReadAffy() Taking Too Long in R

I am using R 3.3.0 in RStudio on Ubuntu 14.04 and have installed the affy package successfully.
However, when I set the working directory to the folder containing the CEL files (using setwd()) and run cel1 <- ReadAffy(), there is no output. I never even get back to the > prompt; the command simply hangs.
Ctrl+C and Esc are not stopping the process either, and Esc usually works for stopping a process on my system.
When I try to quit the session, it also takes a very long time to respond. What is causing this problem and how do I solve it?
EDIT:
I only have 3 CEL files in the folder.
EDIT #2:
I also tried it on a single file, but that again takes far too long. According to the system monitor, it is using a lot of CPU and several GB of memory. I am running a simple command, ReadAffy(filenames = "N54.CEL"), so why is that causing a problem? Any suggestion at all will be helpful. Kind of desperate here.

Turns out there was some problem with the file itself. I got the file from a different source, and it worked fine.
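A quick sanity check for next time (just a sketch, assuming the affy and affyio Bioconductor packages are installed): reading only the CEL file header is fast, and a truncated or corrupt file usually errors out at this step instead of hanging:
library(affy)
affyio::read.celfile.header("N54.CEL", info = "full")   # errors quickly on a bad file
cel1 <- ReadAffy(filenames = "N54.CEL", verbose = TRUE)  # verbose = TRUE shows progress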


In R, what can cause the "options" file in the suspended-session-data folder to grow too large?

I'm sorry if this question is not reproducible, but that is because I don't know what is causing my problem. I run R in RStudio Server on Linux, and recently I have been facing the following problem:
When my R session is suspended (after timeout, for example), sometimes the options file in the suspended-session-data folder grows enormously - 400 GB, for example.
If this were the environment file, it would be more intuitive, as it would simply mean that my session had too much data loaded and R tried to save it at timeout. However, I currently have no idea why this would happen with the options file, and since the file is so large I cannot open it to see what went wrong. I also could not find any documentation about this file online.
My /etc/rsession/rsession.conf currently looks like this:
# R Session Configuration File
session-timeout-minutes=60
session-save-action-default=no
Is there an obvious reason as to why the options file would grow too large? If not, is there a good way to debug the problem?
Thank you
I had a similar problem. It seems to be related to this issue in cpp11.
Briefly, when save() serializes options()["cpp11_preserve_env"], this leads to some weird recursion.
I managed to fix it in my installation by:
1. Making sure the cpp11 package is up-to-date (install.packages("cpp11")).
2. Reinstalling the readr package (install.packages("readr")).
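If you want to check first whether this is what is bloating your saved session data, something like the following may help (a rough sketch; the option name is taken from the cpp11 issue above and may differ between versions, and dropping it is only safe if nothing in the current session still needs it):
# is the cpp11 preserve environment sitting in options()?
"cpp11_preserve_env" %in% names(options())
# assumption: removing it before saving avoids the bloat
options(cpp11_preserve_env = NULL)
save.image("workspace.RData")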

RStudio is painfully slow

Suddenly, RStudio has become painfully slow, to the point of being unusable: when I open it, there is a lag of several seconds whenever I type anything. I have tried all the options I can come up with:
1. re-installing both R and RStudio (although I am not 100% sure I removed all components),
2. resetting settings and the obvious things such as clearing the workspace and the console.
The size of my data is negligible, and I cannot think of anything else. Any ideas?
The only observation that suggests something could be wrong with the configuration is that (sometimes) I see "gctorture false" as a value in the environment.
Just a guess, but ?gctorture says
Provokes garbage collection on (nearly) every memory allocation.
Intended to ferret out memory protection bugs. Also makes R run
_very_ slowly, unfortunately.
which sounds about right for your problem! You could try
gctorture(FALSE)
If that speeds things up, then look for somewhere this might have been set, e.g., in a .Rprofile (in the current working directory, your user home directory, or the installation directory of R; see ?.Rprofile). Also make sure that you start R without loading any .Rhistory or .RData files (again from the working directory, your home directory, etc.).
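A quick way to see which startup files even exist on your system (just a sketch using base R):
# possible sources of a stray gctorture(TRUE) call at startup
file.exists(c(".Rprofile",                                  # current working directory
              path.expand("~/.Rprofile"),                   # user home directory
              file.path(R.home("etc"), "Rprofile.site")))   # site-wide startup file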
I had an RStudio project under Git version control and it became very slow. I solved the problem by removing the Git version control.

How to stop a running cell if interrupting the kernel does not work in Jupyter Notebook

I have been using Jupyter Notebook for a while. Often when I try to stop a cell execution, interrupting the kernel does not work. In this case, what else can I do, other than just closing the notebook and relaunching it again? I guess this might be a common situation for many people.
Currently this is an open issue in the IPython GitHub repository as well:
https://github.com/ipython/ipython/issues/3400
There seems to be no exact solution for it other than killing the kernel.
If you're ok with losing all currently defined variables, then going to Kernel > Restart will stop execution without closing the notebook.
This worked for me:
- Put the laptop to sleep (one of the power options)
- Wait 10 s
- Wake up computer (with power button)
The kernel then says it is reconnecting, and either the cell is already interrupted or you can press interrupt.
This probably isn't foolproof, but it is worth a try so you don't waste previous computation time.
(I had Windows 10 running a Jupyter Notebook that wouldn't stop running a piece of Selenium code)
There are a few options here:
- Change the name of the folder the data is in: works if the cell is already running and pulling data from a particular folder. For example, I had a for loop that, when interrupted this way, just moved on to the next item in the list it was processing.
- Change the code in the cell so that it raises an error: works if the cell has not been run yet but is just in the queue.
- Restart the kernel: if all else fails.
Recently I also faced a similar issue.
I found that there is a long-standing IPython issue, https://github.com/ipython/ipython/issues/3400, which was open for some six years and was resolved as of 1 March 2020.
One thing that might work is hitting interrupt a bunch of times. It's possible that a library you are using catches the interrupt signal and only stops after receiving the signal multiple times.
For example, when using sklearn's cross_val_score() I found that I have to interrupt once for each cross validation fold.
If you know in advance that you might want to stop without losing all your variables, the following solution might be useful:
In cells that take a while because of long loops, you may implement something like this in the loop:
# requires `import os` earlier in the notebook
if os.path.exists(os.path.join(os.getcwd(), 'stop_true.txt')):
    break
Then, if you want to stop, just create the file 'stop_true.txt', and the loop stops before the next iteration.
Usually, the file is called 'stop_false.txt' until I rename it to stop the loop.
Additionally, the results of each loop are stored in a dictionary separately. Therefore I'm able to keep all results until the break happened and can restart the loop from this point onwards.
If the IPython kernel did not die, you might be able to inject Python code into it that saves important data, using pyrasite. You need to install and run pyrasite as root, i.e. with sudo python -m pip install pyrasite (or python3 as needed). Then figure out the process id (PID) of the IPython kernel (e.g. via htop or ps aux | grep ipython), say 3873. Then write a script, say inject.py, that saves the state, for example pickling a Pandas dataframe df that lives in the kernel's global scope:
df.to_pickle("rescued_df.pkl")
Finally, inject it into the process as follows:
sudo pyrasite 3873 inject.py
You may need to allow ptrace attachment first, like so:
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
For me, setting a time limit worked: https://github.com/scipopt/PySCIPOpt/issues/197. Specifically, I added the line model.setRealParam("limits/time", 60), which automatically stops the calculation after 60 seconds; you can use any limit instead of 60. But this is specific to the PySCIPOpt package (for solving optimization models), and I am not sure how to set a time limit for your particular problem.
Try this:
- Close the browser tab in which Jupyter is running
- Run jupyter server list
- Kill each running server with jupyter server stop <PORT>
You can force the termination by deleting the cell. I copy the code, delete the cell, create a new cell, paste, and execute again. Works like a charm.
I suggest restarting the kernel (Kernel -> Restart Kernel), as suggested by #hamdog.
It will be ready to use after that. However, it will certainly delete all variables stored in memory.

How to recover RStudio session after crash?

I would like to know how to restore my previous RStudio session after RStudio and the R session crashed.
Background:
I find that my R session crashes very often, at random times for random reasons. I am fine with that I guess.
Most of the time RStudio restarts the R session and I can continue.
But sometimes it just freezes, and I noticed that power cycling the entire machine allows RStudio to recover and even reload my old session.
Stupid me: since I don't think power cycling is a good idea, I manually killed the R session instead, but then RStudio responded without really working, so I restarted it and it came back with an empty workspace.
I have been backing up with Session->Save Workspace As, but it seems to do nothing, as recovering still leaves me with a blank, empty environment.
I am looking to restore the RStudio display, including the command history, which for a novice like me is precious, and my list of open scripts, some of which were unsaved at the time of the crash.
I am assuming since RStudio can recover itself, there is a file somewhere I can use to recover it.
And if there is no way to recover, how can I completely save my workspace so this cannot happen again?
Also, is there a proper way to recover from an RStudio freeze without a hard reset?
It has been a while since I asked this question. I was never able to fully recover, but I switched to RStudio projects, which are the recommended way to use RStudio.
Projects are stored in a folder and remember all files and data belonging to that project in that folder.
This did not help me with my initial problem, but projects prevent it from happening again. The hard part is moving a workspace into a project if it was not in a project to start with.
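If you also want a manual safety net inside a running session, base R can snapshot both the workspace and the command history (a minimal sketch; the file paths are only examples, and the history functions are platform-dependent):
save.image("~/backup_workspace.RData")      # snapshot of the global environment
savehistory("~/backup_history.Rhistory")    # snapshot of the command history
# later, in a fresh session:
load("~/backup_workspace.RData")
loadhistory("~/backup_history.Rhistory")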
The first step of this article helped me entirely:
https://datacornering.com/how-to-restore-closed-unsaved-script-in-rstudio/
Basically, on Windows, go to C:\Users\xx\AppData\Local\RStudio\sources\s-xx and find a file whose name ends with "-contents". This is your unsaved file.
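If you prefer to search from R itself, something like the following should list candidate autosave files (a sketch for Windows; the state folder location varies by operating system and RStudio version):
# RStudio's per-session source autosaves (path is version-dependent)
state_dir <- file.path(Sys.getenv("LOCALAPPDATA"), "RStudio", "sources")
list.files(state_dir, pattern = "-contents$", recursive = TRUE, full.names = TRUE)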

Refresh R console without quitting the session?

I usually keep the R console open all day long, but sometimes I need to clean my history and my workspace so that I can test functions or load new data.
I'm wondering whether there is an easier way, perhaps a command I could put in my .Rprofile, to refresh the R console without quitting or restarting my current session.
What I have usually done is q() without saving, start R again, and clean the history. I think somebody here might be able to give me better suggestions.
Thanks in advance.
As far as the history is concerned, on UNIX-like systems (mine is Debian) this command refreshes it:
loadhistory("")
However, as said in comments, loadhistory seems to be platform-dependent.
Check your ?loadhistory if present on your platform. Mine says:
There are several history mechanisms available for the different
R consoles, which work in similar but not identical ways. There
are separate versions of this help file for Unix and Windows.
The functions described here work on Unix-alikes under the
readline command-line interface but may not otherwise (for
example, in batch use or in an embedded application)
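For the workspace side (as opposed to the history), you can get close to a fresh console without quitting (a minimal sketch; note that it does not unload packages):
rm(list = ls(all.names = TRUE))   # remove everything from the global environment
gc()                              # run garbage collection to release memory
cat("\014")                       # clears the console pane in RStudio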
