How to avoid RStudio writing temp files to SSD?

I have RStudio and R installed on my SSD, which has only 13 GB of free space left. The problem is that when I launch some demanding algorithm in RStudio, it quickly fills my SSD with temporary files that are quite huge, probably because of the 16 GB of RAM in my computer. However, I also have an almost empty 1 TB hard disk in the same computer. Is it possible to reroute those temp files to the hard disk? Any suggestions on how to do this?
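R picks its per-session temporary directory from the TMPDIR, TMP and TEMP environment variables, so one commonly suggested approach is to point them at a folder on the big hard disk before R starts, for example in a ~/.Renviron file. A minimal sketch, assuming the hard disk is mounted as D: and that "D:/Rtemp" is a placeholder folder you create first:

# In ~/.Renviron (read when R starts up)
TMPDIR=D:/Rtemp
TMP=D:/Rtemp
TEMP=D:/Rtemp

After restarting RStudio, tempdir() should report a directory under the new location; if it still shows the old one, set the same variables at the operating-system level (e.g. in the Windows system environment variables) and restart.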

Related

Problem loading large .RData file: Error reading from connection

I have an .RData file that is rather large. It takes up 1.10 GB on my hard drive and contains a data frame with 100 variables and 12 million observations. Now when I try to load it, I can open the task manager and watch the memory usage go all the way up to 7450 MB, at which point my RAM is completely exhausted and I get "Error reading from connection." I'm pretty sure this memory deficiency is the problem, but how can that be? Like I said, the .RData file is only 1.10 GB.
I'm using R x64 4.0.5. If it's any clue, when I open the 32-bit version of R (4.0.5) it tells me "Error: memory exhausted (limit reached?)", reinforcing my suspicion that this is a memory issue.
I am unable to access the data any other way; I have to make the .RData file work or it's gone. Why does R require more than 8 GB of RAM to load a 1 GB workspace?
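.RData files are compressed on disk, so the in-memory footprint is usually several times the file size. A rough back-of-the-envelope check for the data frame described above, assuming mostly numeric (8-byte) columns:

# 12 million rows x 100 columns x 8 bytes per double
12e6 * 100 * 8 / 1024^3
# roughly 8.9 GiB in RAM, versus 1.10 GB compressed on disk

# On a machine that can load it, object.size() reports the real footprint
# ("df" is a hypothetical object name):
# object.size(df)

That estimate alone already exceeds the 8 GB of RAM available, before R even finishes rebuilding the object.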

Google Colab disk space getting full

I'm new to ML and I am now testing some notebooks in Google Colab (using GPU).
My first notebook has been running for a few hours with no complaints about RAM or disk space. However, when running another notebook, I soon get warnings that I am already using around 57 GB of my 68 GB disk space. These warnings only appear in the second notebook, and the disk space icon has turned yellow only in this second one, not in the first.
I wonder if someone could clarify a bit what happens with this (virtual) disk space? Where are all the heavy files stored and will it reset automatically so that there will be free space again after running these notebooks?
I have tried to find answers on Colab website and forums but so far with no success. Thanks a lot!

R Running out of memory over large file

I had a unique problem yesterday when trying to read a large .csv file into memory.
The file itself is 9 GB, with a bit more than 80 million rows and 10 columns.
It loaded perfectly and took up around 7 GB in memory on a remote machine with 128 GB of RAM.
My problem is that I want to work on the data with a local machine that only has 32 GB of RAM.
I tried reading it with data.table::fread, but R crashes when it uses all of the machine's memory.
Is there a safer way of reading the data that won't crash R?
Is this a known issue? Could something be wrong with the machine?
Both machines are running Windows 7 Enterprise.
EDIT:
Saving and reading the data as an RDS file worked, but I still want to be able to use just one computer for the entire job.
Is there any other way to read the data directly from the CSV file?
I don't want to report a bug in data.table unless I am sure this is an issue with fread and not a local issue.
Any other ideas?
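One possible workaround, given that data.table::fread() accepts select=, nrows= and skip= arguments, is to read only the columns you actually need, or to read the file in slices and reduce each slice before moving on. A sketch with placeholder file and column names:

library(data.table)

# Option 1: read only the columns actually needed
# ("id" and "value" are placeholder column names)
dt <- fread("big.csv", select = c("id", "value"))

# Option 2: read the file in slices; skip= counts lines from the top,
# so the second slice skips the header plus the first 5 million rows
slice1 <- fread("big.csv", nrows = 5e6)
slice2 <- fread("big.csv", skip = 5e6 + 1, nrows = 5e6,
                header = FALSE, col.names = names(slice1))
# ...aggregate or filter each slice before reading the next one...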

RAM used per mb in R workspace

Is there any way of telling how much RAM is used per MB stored in the workspace in R?
I've got ~700 MB of stuff in the workspace and it brings my PC to a complete freeze, even though it has 4 GB of RAM and runs Ubuntu, which is a lightweight OS.
This is just data; I am only doing basic exploratory stats on it like plotting, averaging, etc.
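A quick way to see which objects are responsible is to ask object.size() for every object in the workspace; a small sketch, assuming the data lives in the global environment:

# In-memory size of every workspace object, largest first
sizes <- sapply(ls(), function(nm) object.size(get(nm)))
sort(sizes, decreasing = TRUE)        # bytes per object
sum(sizes) / 1024^2                   # total MB held by the workspace

Bear in mind that plotting and summary operations can temporarily copy objects, so peak RAM use is often a multiple of what these numbers report.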

Where does R store temporary files

I am running some basic data manipulation on a MacBook Air (4 GB memory, 120 GB HD with 8 GB available). My input file is about 40 MB, and I don't write anything to disk until the end of the process. However, in the middle of my process, my Mac says there's no memory left to run. I checked the hard drive and found there's only about 500 MB free.
So here are my questions:
How is it possible that R filled up my disk so quickly? My understanding is that R stores everything in memory (unless I explicitly write something out to disk).
If R does write temporary files on the disk, how can I find these files to delete them?
Thanks a lot.
Update 1: error message I got:
Force Quit Applications: Your Mac OS X startup disk has no more space available for
application memory
Update 2: I checked tempdir() and it shows "var/folders/k_xxxxxxx/T//Rtmpdp9GCo", but I can't locate this directory in Finder.
Update 3: After running unlink(tempdir(), recursive = TRUE) in R and restarting my computer, I got my disk space back. I would still like to know whether R writes to my hard drive, so I can avoid similar situations in the future.
Update 4: My main object is about 1 GB. I use Activity Monitor to track the process, and while memory usage is about 2 GB, disk activity is extremely high: data read 14 GB, data written 44 GB. I have no idea what R is writing.
R writes to a temporary per-session directory which it also cleans up at exit.
It follows convention and respects TMP and related environment variables.
What makes you think that disk space has anything to do with this? R needs all objects held in memory, not on disk (by default; there are add-on packages that allow a subset of operations on files stored on disk that are too big to fit into RAM).
One of the steps in the "process" is causing R to request a chunk of RAM from the OS so it can continue. The OS could not comply, and thus R terminated the "process" you were running with the error message you failed to give us. [Hint: it would help if you showed the actual error, not your paraphrasing of it. Some inkling of the code you were running would also help. 40 MB on disk sounds like a reasonably large file; how many rows/columns, etc.? How big is the object within R (object.size())?]
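To see what the current session is actually putting on disk, you can inspect the per-session temporary directory directly; a short sketch (none of the names are specific to the poster's setup):

td <- tempdir()                        # per-session temp directory
files <- list.files(td, full.names = TRUE, recursive = TRUE)
sum(file.info(files)$size) / 1024^2    # MB currently used by this session
# unlink(td, recursive = TRUE)         # what Update 3 did; restart R afterwards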
