What is making rsDriver from RSelenium take up so much RAM? - r

I am running Chrome through rsDriver() from RSelenium in parallel, and when I start my script each R session takes about 300 MB of RAM.
After a while, the memory used by each session starts to grow, and eventually the sessions crash because the machine runs out of RAM.
I stopped the script when RAM was 98% full and ran the following code:
gc()
gc(rsdriver)
It does not help. I checked the environment size with
object_size(ls())
It reports that the environment is smaller than 1 MB. The script fetches data and uploads it to a database, so it should not be storing anything.
How can I identify what is occupying this ram and fix it?
Some additional info:
I have 32 GB of RAM, which should be enough.
I am aware of Docker, but it is not relevant to this question.
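One thing worth checking in the measurements above: object_size(ls()) measures the character vector of names returned by ls(), not the objects those names refer to, and the first argument of gc() is verbose, so gc(rsdriver) does not target a particular object. A minimal sketch of measuring each object instead, using only base R (env_object_sizes and demo_env are hypothetical names made up for this illustration):

```r
# Hypothetical helper: size in bytes of every object in an environment,
# largest first. Uses only base R (object.size from utils).
env_object_sizes <- function(env = globalenv()) {
  obj_names <- ls(envir = env, all.names = TRUE)
  sizes <- vapply(
    obj_names,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# Toy environment standing in for the real workspace (illustration only)
demo_env <- new.env()
assign("small", 1:10, envir = demo_env)
assign("big", numeric(1e6), envir = demo_env)

sizes <- env_object_sizes(demo_env)
print(sizes)

# Size of the vector of names only, which stays tiny whatever the objects hold
names_size <- as.numeric(object.size(ls(demo_env)))
print(names_size)
```

If the per-object sizes are also small, the memory is probably not held by R objects at all. The browser and driver started by rsDriver() run as separate processes, so their usage would show up in the operating system's process list rather than in anything R can measure from inside the session.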

Related

Memory Limit in R

I'm relatively new to R and seem to be having a memory limit issue with a new laptop. When I run a large dataset of about 50K survey respondents and over 500 variables I receive the error message: Error: cannot allocate vector of size 413 Kb
I got around this issue fine on my old laptop by increasing the memory limit size via the code: memory.limit(size = 10000). Everything worked fine but on my new laptop which is faster and more powerful, the memory limit fills up very fast and will crash at size 27000 after I run about 7 models.
I have tried closing all unnecessary programs, removing all the unneeded objects in R, and running garbage collection with gc(). I was using the latest version of R, 4.14, and have now gone back to 4.04, which worked fine on my old PC, but none of this really helps.
I am running the 64-bit version of R on a 64-bit PC that has 8 GB of RAM.
Does anyone know why this might be occurring on a brand new, faster laptop, when my 4-year-old PC was slower but at least got through the same analysis?
Also, how high can you set the memory limit as the manual says R can handle 8TB? And how do you permanently set a memory limit?
Thanks
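One pattern that sometimes helps when memory fills up over a run of several models is to keep only the pieces of each fit that are actually needed and drop the full model object before fitting the next one. A minimal sketch under stated assumptions: the data frame survey, the formulas, and the choice of lm() are all made up for illustration and are not the asker's data or models.

```r
# Toy data standing in for the survey (assumption: illustration only)
set.seed(1)
survey <- data.frame(y = rnorm(500), x1 = rnorm(500), x2 = rnorm(500))

# Hypothetical set of models to fit one after another
formulas <- list(y ~ x1, y ~ x2, y ~ x1 + x2)

results <- vector("list", length(formulas))
for (i in seq_along(formulas)) {
  fit <- lm(formulas[[i]], data = survey)
  results[[i]] <- coef(fit)  # keep only the coefficients, not the whole fit
  rm(fit)                    # drop the full model object
  invisible(gc())            # let R return the freed memory where it can
}

print(results)
```

Full model objects often carry a copy of the data and the residuals, so storing several of them can use far more memory than storing their summaries.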

Rstudio potential memory leak / background activity?

I’m having a lot of trouble working with Rstudio on a new PC. I could not find a solution searching the web.
When Rstudio is on, it is constantly eating up memory until it becomes unworkable. If I work on an existing project, it takes half an hour to an hour to become impossible to work with. If I start a new project without loading any objects or packages, just writing scripts without even running them, it takes longer to reach that point, however, it still does.
When I first start the program, the Task Manager already shows memory usage of 950-1000 MB (sometimes more), and as I work it climbs to 6000 MB, at which point it is impossible to work because every action is delayed and 'stuck'. For comparison, on my old PC the Task Manager shows 100-150 MB while I work in the program. When I click the "Memory Usage Report" within Rstudio, "used by session" is very small and "used by system" is almost at the maximum, yet Rstudio is the only thing taking up the system memory on the PC.
Things I tried: installing older versions of both R and Rstudio, pausing my anti-virus program, changing compatibility mode, zoom on "100%". It feels like Rstudio is continuously running something in the background as the memory usage keeps growing (and quite quickly). But maybe it is something else entirely.
I am currently using the latest versions of R and Rstudio (4.1.2 and 2021.09.0-351) on a 64-bit PC with an Intel i7 processor, 16 GB of RAM, and Windows 10.
What should I look for at this point?
On Windows, there are several typical memory and CPU issues with Rstudio. In my answer, I explain how the Rstudio interface itself uses memory and CPU as soon as you open a project (e.g., when Rstudio displays some .Rmd files). The memory and CPU cost of the computation itself (i.e., performance problems when executing a line of code) is not covered here.
When working on long .Rmd files within Rstudio on Windows, CPU and/or memory usage sometimes gets very high and increases progressively (e.g., because of a process named 'Qtwebengineprocess'). To solve the problem caused by long Rmd files loaded within an Rstudio session, you should:
Pay attention to which Rstudio features consume memory while scanning your code (i.e., disable or enable options in the 'Global options' menu of Rstudio). For example, try disabling inline display (Tools => Global options => Rmarkdown => Show equation and image preview => Never). This post led me to consider that memory / CPU leaks are sometimes due to Rstudio itself, not the data or the code.
Set up a bookdown project in order to split your large Rmd files into several smaller ones. See here.
Bonus step: check whether there are conflicts between loaded packages with the command tidyverse_conflicts(), although that is already a 'computing problem' (not covered here).

"Cannot allocate vector of size xxx mb" error, nothing seems to fix

I'm running RStudio x64 on Windows 10 with 16GB of RAM. RStudio seems to be running out of memory for allocating large vectors, in this case a 265MB one. I've gone through multiple tests and checks to identify the problem:
Memory limit checks via memory.limit() and memory.size(). Memory limit is ~16GB and size of objects stored in environment is ~5.6GB.
Garbage collection via gc(). This removes some 100s of MBs.
Upped priority of rsession.exe and rstudio.exe via Task Manager to real-time.
Ran chkdsk and RAM diagnostics on system restart. Both returned no errors.
But the problem persists. It seems to me that R can access 16GB of RAM (and shows 16GB committed on Resource Monitor), but is somehow still unable to allocate a large vector. My main confusion is this: the problem only appears if I run code on multiple datasets consecutively, without restarting RStudio in between. If I do restart RStudio, the problem goes away, at least for a few runs.
The error should be replicable with any large R vector allocation (see e.g. the code here). I'm guessing the fault is software, some kind of memory leak, but I'm not sure where or how, or how to automate a fix.
Any thoughts? What have I missed?

Where does R store temporary files

I am running some basic data manipulation on a Macbook Air (4GB Memory, 120GB HD with 8GB available). My input file is about 40 MB, and I don't write anything to the disk until end of the process. However, in the middle of my process, my Mac says there's no memory to run. I checked hard drive and found there's about 500MB left.
So here are my questions:
How is it possible that R filled up my disk so quickly? My understanding is that R stores everything in memory (unless I explicitly write something out to disk).
If R does write temporary files on the disk, how can I find these files to delete them?
Thanks a lot.
Update 1: error message I got:
Force Quit Applications: Your Mac OS X startup disk has no more space available for
application memory
Update 2: I checked tempdir() and it shows "var/folders/k_xxxxxxx/T//Rtmpdp9GCo". But I can't locate this directory in Finder.
Update 3: After running unlink(tempdir(),recursive=TRUE) in R and restarting my computer, I got my disk space back. I would still like to know whether R writes to my hard drive, so that I can avoid similar situations in the future.
Update 4: My main object is about 1GB. I use Activity Monitor to track the process, and while memory usage is about 2GB, disk activity is extremely high: data read: 14GB, data written: 44GB. I have no idea what R is writing.
R writes to a temporary per-session directory which it also cleans up at exit.
It follows convention and respects TMP and related environment variables.
What makes you think that disk space has anything to do with this? R needs all objects held in memory, not off disk (by default; there are add-on packages that allow a subset of operations on on-disk stored files too big to fit into RAM).
One of the steps in the "process" is causing R to request a chunk of RAM from the OS to enable it to continue. The OS could not comply, and thus R terminated the "process" that you were running with the error message you failed to give us. [Hint: it would help if you showed the actual error, not your paraphrasing thereof. Some inkling of the code you were running would also help. 40MB on disk sounds like a reasonably large file; how many rows/columns etc.? How big is the object within R; object.size()?]
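To see what the current session has actually written to its temporary directory, the directory can be listed with file sizes from within R. A minimal sketch using only base R; the throwaway file demo_file is created purely so that the listing is not empty, and the variable names are made up for this illustration:

```r
# The per-session temporary directory that R created at startup
tmp <- tempdir()
print(tmp)

# Environment variables R consults when choosing the parent of that directory
print(Sys.getenv(c("TMPDIR", "TMP", "TEMP")))

# Throwaway file so the listing below is not empty (illustration only)
demo_file <- tempfile(fileext = ".txt")
writeLines(rep("x", 1000), demo_file)

# List everything under the session directory with its size in bytes
files <- list.files(tmp, full.names = TRUE, recursive = TRUE)
info <- file.info(files)
usage <- data.frame(file = basename(files), bytes = info$size)
print(usage[order(-usage$bytes), ])

total_bytes <- sum(info$size, na.rm = TRUE)
print(total_bytes)

# Remove only the demo file, leaving the session directory itself in place
unlink(demo_file)
```

Deleting individual files this way is gentler than unlink(tempdir(), recursive = TRUE), which removes the directory the running session still expects to exist.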

App crashes when QList grows too large

I make an application which has to store a lot of data in memory to improve calculation performance.
It is a hierarchy of lists and objects where the top object is a QList<myObject*>. When loading data, many myObject instances are created with new and added to the list. The memory consumption grows, and when it reaches ~1.9 GB the program crashes. My computer (Vista) has 4 GB of RAM, and I have tested on other computers with less RAM (XP), where it crashes at the same point. Can I not use more than 1.9 GB of RAM?
When a smaller file is loaded and memory usage according to "Windows task manager" is (say) 1.2 GB, I can work with the data. But if I want to load another file, memory growth starts from 1.2 GB even after calling delete on all objects and clearing the list. Why?
I tried switching to QVector and calling squeeze(), but memory stays the same. I have read the other threads here about dynamic memory allocation in QLists, but is there really no way to reset the memory before I load a new file? Especially since it crashes after 1.9 GB; loading 3 small files sequentially gets me there.
Thanks a lot for any suggestions.
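If you have 32-bit Windows, then your process can only use 2 GB of memory by default. A 32-bit pointer can address 4 GB in total, and Windows reserves half of that address space for the system, so your process cannot go beyond its 2 GB share. If you need more memory, maybe you should change to 64-bit Windows and build the application as a 64-bit program.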
