R vector memory exhausted

I am currently using RStudio on my MacBook Pro.
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4
When using the agnes() function from the cluster package, I received the error message:
Error: vector memory exhausted (limit reached?)
To solve this, I followed the steps mentioned in the answer to the following question: R on MacOS Error: vector memory exhausted (limit reached?)
Now, running the same function, I get an "R session aborted" message: R encountered a fatal error. The session was terminated.
Any other solutions?

AGNES needs at least two copies of the distance matrix.
If you have 100,000 instances at double precision (8 bytes per entry), that means memory usage on the order of 2 × 100,000² × 8 = 160,000,000,000 bytes. That is 160 GB.
That is not including the input data or any overhead. If you are lucky, the R implementation of AGNES only stores the upper triangular matrix, which would reduce this by a factor of 2. On the other hand, if it did, it would likely hit an integer overflow at around 64k objects.
So you probably need to choose a different algorithm than AGNES, or reduce your data first.
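To make the arithmetic concrete, here is a small sketch in R. The subsampling workaround and the object name mydata are assumptions, not something from the original post; clara() from the cluster package is a different algorithm (k-medoids on subsamples) rather than hierarchical clustering on a full distance matrix.

library(cluster)

# Back-of-the-envelope estimate: bytes for two full n x n double-precision
# distance matrices, ignoring the input data and any overhead.
n <- 100000
2 * n^2 * 8 / 1e9   # ~160 GB

# Possible workarounds (assumptions, not from the original post):
# 1) run agnes() on a random subsample ('mydata' is a placeholder),
# 2) use clara(), which only ever builds distance matrices on small subsamples.
sub <- mydata[sample(nrow(mydata), 5000), ]
hc  <- agnes(sub)
cl  <- clara(mydata, k = 10, samples = 50, sampsize = 1000)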

Related

How can I determine and increase the memory allocated to R on a Mac

Variants of this question have been asked before (e.g., here, here, here, and here), but none of the suggested solutions works for me.
R returns an error message ("Error: vector memory exhausted (limit reached?)"), even though there is available memory on my computer (a 2019 MacBook Pro with 16 GB memory), as indicated by the Memory Pressure monitor in the Memory tab of the Activity Monitor.
I have set the memory both from the command line (using open .Renviron) and from RStudio (using usethis::edit_r_environ), as suggested here. Neither approach works.
Has anybody found other solutions to this problem? Also, is there a way to determine, in RStudio, the maximum memory allocated? Sys.getenv() does not return this information.
I do not encounter this problem in base R -- only in RStudio.
Session info:
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
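For reference, a minimal sketch of the .Renviron approach described above (the 16Gb value is an assumption; adjust it to your machine):

# In ~/.Renviron (opened with `open ~/.Renviron` in Terminal, or via
# usethis::edit_r_environ() from RStudio), add a line such as:
#   R_MAX_VSIZE=16Gb
# and restart R. The value that was picked up can then be checked with:
Sys.getenv("R_MAX_VSIZE")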

cannot allocate vector of size xxx even after upgrading server instance

I could use some troubleshooting advice. I'm trying to use the block() function from the {blockTools} package on a dataframe with only 45k observations and 14 variables (4 of which I'm trying to block on). I got an error that R could not allocate a vector of size X, so I upgraded the AWS instance, which doubled the memory. I restarted the instance and tried running it again.
I'm still getting the error and can't figure out why, given that I doubled the memory. Does R on Linux require me to specify how much memory should be available? Any other troubleshooting tips?
FWIW, I'm running rm(list=ls()), loading only the dataframe I need, and throwing in gc() for good measure.
What else can I try?
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS
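A few hedged diagnostics that may narrow this down (the object name df is a placeholder; the shell commands are standard Linux tools, not part of blockTools):

# Inside R: how big is the input, and how much has R actually allocated?
gc()                                  # garbage-collect and report memory in use
print(object.size(df), units = "Gb")  # size of the dataframe passed to block()

# From a shell on the instance: R on Linux has no built-in cap, but the
# process or container might.
#   free -h       # total vs. available RAM on the instance
#   ulimit -v     # per-process virtual memory limit, if any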

Dataframe not showing in global environment R

I have some RAM issues which cause some weird behaviour in RStudio. I load a large dataset (the BAG) containing all addresses in the Netherlands. Then I do some left joins with other public datasets, such as energy labels and monument status. This gives me a dataframe I called combined, which has 9.5 million rows and about 80 columns. This used to take a while but otherwise worked perfectly. Then I ran into some kind of unrelated error and had to reinstall R. After that, I now get a new error when loading these datasets:
Error: cannot allocate vector of size 70.9 Mb
Error: cannot allocate vector of size 128.0 Mb
After the error interrupts my script, I don't see the combined dataframe in my global environment, but I can retrieve data from it in the console (see attached screenshot), using combined[1,1] for example. However, if I try View(combined), I get a similar error after it has been loading for a while: cannot allocate vector of size 35 Mb. R is taking up about 96% of my RAM (around 11-12 GB), so I'm assuming it's a full-RAM error, but I don't get why I get it now, since it used to work perfectly before.
While I've found a bunch of stuff online about sparse matrices, R/SQL combinations, etc., I find this weird because it didn't happen before. More RAM is on its way, which will hopefully solve this issue, but I would like to understand why R is throwing this error now, and why I can see some smaller dataframes in my global environment but not the combined one, even though I can access it through the console.
I have reinstalled R, Rtools and Rstudio twice, problem persists. I am running the following version of R:
> R.version
               _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          6.1
year           2019
month          07
day            05
svn rev        76782
language       R
version.string R version 3.6.1 (2019-07-05)
nickname       Action of the Toes
I have a Lenovo Thinkpad laptop with an i7 8th gen processor and 16 GB of RAM. Any help would be greatly appreciated.
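A hedged diagnostic sketch for this setup (assuming R 3.6.x on Windows, where memory.limit() and memory.size() still exist; combined as in the post):

memory.limit()                              # maximum MB Windows R may use
memory.size()                               # MB currently used by this session
gc()                                        # release unused memory, report totals
print(object.size(combined), units = "Gb")  # size of the joined dataframe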

Why is R reported to use much more memory by Windows than by itself?

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
On a 32 GB system, I got this error when creating a distance matrix:
df <- remove_duplicates_quanteda(dfm, df)
Error: cannot allocate vector of size 1.3 Gb
Looking inside my environment, there is little reason for concern:
print(object.size(x = lapply(ls(), get)), units = "Mb")
96.5 Mb
However, Windows reports a much higher figure for the R process.
What is the reason for this difference? Is there a way to find out?
Hadley puts it pretty simply in Advanced R:
This number won’t agree with the amount of memory reported by your operating system for a number of reasons:
It only includes objects created by R, not the R interpreter itself.
Both R and the operating system are lazy: they won’t reclaim memory until it’s actually needed. R might be holding on to memory because the OS hasn’t yet asked for it back.
R counts the memory occupied by objects but there may be gaps due to deleted objects. This problem is known as memory fragmentation.
For more information see the section about memory.
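A small illustration of that gap (the pryr call is an optional extra, not part of the quoted answer):

print(object.size(x = lapply(ls(), get)), units = "Mb")  # sum of object sizes only
gc()            # the "used" / "max used" columns show what R itself has allocated
# install.packages("pryr")   # optional
# pryr::mem_used()           # total memory used by R objects, as reported by pryr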

glm running out of memory in 64-bit R?

I am trying to run glm on a dataset with 255001 data points, but it's saying
Error: cannot allocate vector of size 10.0 Gb
This is very strange because when I start up R, I see the message
R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
This seems to indicate that I'm running a 64-bit version of R, and I have read that the memory limit for 64-bit versions of R on Unix is on the order of 128 TB.
Furthermore, I have successfully run glm logistic regression on very similar datasets that are twice as large without any problem.
How can I reconcile these facts, and how can I get R to hold large objects in memory?
It turns out there was a bug in my code: when I was reading in the data, I set header=FALSE instead of header=TRUE. Changing this fixed the problem. (Most likely, with header=FALSE the header row is read as data, which coerces numeric columns to factors or strings; glm() then expands those factors into dummy variables and builds an enormous model matrix.)
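A minimal sketch of the fix (file name, separator, and formula are placeholders):

# Read with header = TRUE so the first line is treated as column names;
# otherwise numeric columns are coerced to factors/strings and glm()
# builds an enormous dummy-variable model matrix.
dat <- read.table("data.csv", header = TRUE, sep = ",")
str(dat)    # confirm the predictors are numeric, not factors
fit <- glm(y ~ ., data = dat, family = binomial)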

Resources