Crashing when knitting R Markdown under Linux

Ubuntu 16.04 LTS, R version 3.4.3, RStudio version 1.1.383
I'm playing around and learning R Markdown this afternoon. I am not doing any intensive data analysis. I am knitting my R Markdown into HTML with the following command rmarkdown::render("document.Rmd").
About every half an hour my Ubuntu GNOME session almost totally freezes. I can sort of move the mouse cursor around, and every several minutes I'm presented with a brief window of time where the computer works again before going back into a deep freeze. I'm not running any other programs.
I've kept my System Monitor open and notice that rsession and rstudio usually use ~200 MiB of memory. When the computer freezes, rsession's memory usage rises to ~4 GiB, and this happens directly after I issue the rmarkdown::render("document.Rmd") command in RStudio.
I did sudo apt-get update and sudo apt-get upgrade. What else should I do? Do I update the Linux kernel? Upgrade RStudio? Submit a bug report? Is this a memory leak (and what is that)?
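For reference, this is roughly how I could run the render outside the RStudio session to see whether the spike comes from the render itself or from the IDE (a sketch only; the callr package is an assumption on my part, not something I am currently using):

    # Render in a fresh background R process so the desktop session is not
    # dragged down if rendering exhausts memory.
    # install.packages("callr")  # assumed helper package
    callr::r(
      function(input) rmarkdown::render(input),
      args = list(input = "document.Rmd")
    )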

Related

Why would an R Shiny app run indefinitely in RStudio?

I wanted to learn how to build Shiny apps in R, so I started this beginner-level tutorial. However, when I run the app on my desktop (Windows 10 x64, 16 GB RAM, 500 GB SSD, i5-3470 CPU, dual display) in RStudio 1.3.1093 using R 4.0.3, it loads indefinitely with no error output. I tried running even the basic built-in examples (which you can find here), and they also failed to load. The exact same scripts and examples run without issue on my laptop (Windows 10 x64, 8 GB RAM, 250 GB SSD; same R and RStudio versions). I've reinstalled the shiny package, reinstalled R and RStudio, and switched the app between running internally and externally, with no success. I did find this post, which seems to describe the same issue, but with no solution.
I know it's not much to go on, but I'm at a loss as to the next thing I should check. Any suggestions?
I figured out from this mostly unrelated post that there was a file at the path C:/Users/.../Documents/R/win-library/4.0/ called 00LOCK which was giving R trouble downloading and updating packages. I'm not sure how it got there or why R was not telling me there were issues updating the packages, but the Shiny app seems to work perfectly fine now.
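In case it helps anyone else, this is roughly how the stale lock can be found and removed from R itself (a sketch; the exact library path varies by machine, and the --no-lock option is only an alternative workaround, not what I originally did):

    # Locate the user library and remove a leftover 00LOCK directory from an
    # interrupted package installation.
    lib  <- Sys.getenv("R_LIBS_USER")        # e.g. .../Documents/R/win-library/4.0
    lock <- file.path(lib, "00LOCK")
    if (dir.exists(lock)) unlink(lock, recursive = TRUE)

    # Alternatively, skip locking for a single install:
    # install.packages("shiny", INSTALL_opts = "--no-lock")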

RStudio potential memory leak / background activity?

I'm having a lot of trouble working with RStudio on a new PC. I could not find a solution by searching the web.
When RStudio is on, it constantly eats up memory until it becomes unworkable. If I work on an existing project, it takes half an hour to an hour to become impossible to work with. If I start a new project without loading any objects or packages, just writing scripts without even running them, it takes longer to reach that point, but it still happens.
When I first start the program, Task Manager already shows memory usage of 950-1000 MB (sometimes more), and as I work it climbs up to 6000 MB, at which point the program is impossible to work with because every action is delayed and 'stuck'. Just to compare, on my old PC the Task Manager shows 100-150 MB while I work in the program. When I click the "Memory Usage Report" within RStudio, the "used by session" figure is very small, while "used by system" is almost at the maximum, yet RStudio is the only thing taking up system memory on the PC.
Things I have tried: installing older versions of both R and RStudio, pausing my anti-virus program, changing compatibility mode, and setting zoom to 100%. It feels like RStudio is continuously running something in the background, as the memory usage keeps growing (and quite quickly). But maybe it is something else entirely.
I am currently using the latest versions of R and RStudio (4.1.2 and 2021.09.0-351) on a PC with an Intel i7 processor, x64, 16 GB RAM, Windows 10.
What should I look for at this point?
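For reference, this is roughly how I check the session-side number from the console, independent of the IDE's report (a minimal sketch; the lobstr package is an optional helper I am assuming here, not part of my original setup):

    # Memory held by the R session itself, as opposed to the RStudio IDE
    # processes visible in Task Manager.
    gc()                    # runs garbage collection and prints memory used by R
    # install.packages("lobstr")   # optional, assumed helper
    lobstr::mem_used()      # total size of objects currently held by R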
On Windows, there are several typical memory or CPU issues with RStudio. In my answer, I explain how the RStudio interface itself uses memory and CPU as soon as you open a project (e.g., when RStudio shows you some .Rmd files). The memory/CPU cost of the computation itself is not covered in my answer (i.e., performance issues when executing a line of code are not covered).
When working on 'long' .Rmd files within RStudio on Windows, CPU and/or memory usage sometimes gets very high and increases progressively (e.g., because of a process named 'QtWebEngineProcess'). To solve the problems caused by long .Rmd files loaded within an RStudio session, you should:
pay attention to which RStudio process consumes memory when it scans your code (i.e., disable or enable options in RStudio's 'Global Options' menu). For example, try disabling inline display (Tools => Global Options => R Markdown => Show equation and image previews => Never). This post put me on the track of considering that memory/CPU leaks are sometimes due to RStudio itself, not the data or the code.
set up a bookdown project in order to split your large .Rmd file into several smaller ones (see the sketch after this list). See here.
Bonus step: check whether some loaded packages conflict, using the command tidyverse_conflicts(), but that is already a 'computing problem' (not covered here).
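As a rough illustration of the bookdown split (the chapter file names below are only placeholders, and the packages are assumed to be installed):

    # Render several smaller chapter files as one book instead of knitting a
    # single long .Rmd; bookdown picks up index.Rmd plus the other .Rmd files
    # in the project directory (e.g. 01-intro.Rmd, 02-analysis.Rmd).
    # install.packages("bookdown")
    bookdown::render_book("index.Rmd")

    # Bonus step: list functions masked by the packages currently loaded.
    library(tidyverse)
    tidyverse_conflicts()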

Downloaded newest RStudio (1.4.1717); R files are no longer associated with RStudio and I can't find RStudio in my Programs folder

Also, this is on a Windows 10 machine with R version 4.1.0. Since I can't find RStudio on my machine when it asks what program I want to use to open my code files, I just have to go through the File menu to open the code file. It's obviously not a big deal, but quite annoying. One of my coworkers is having the same problem after downloading the new RStudio version too, so I'm wondering if it's a bug... or if I'm the bug.

Jupyter notebook on Ubuntu doesn't free memory

I have a piece of code (Python 3.7) in a Jupyter notebook cell that generates a pandas data frame (df) containing numpy arrays.
I am checking the memory consumption of the df just by looking at the system monitor app preinstalled in Ubuntu.
The problem is that if I run the cell a second time, the memory consumption doubles, even though the df is assigned to the same variable.
If I run the same cell multiple times, the system runs out of memory and the kernel dies by itself.
Using del df or gc.collect() doesn't free the memory either.
Restarting the notebook kernel is the only way to free the memory.
In practice, I would expect the memory to stay roughly the same because I am just reassigning a new df to the same variable over and over again.
Indeed, the memory accumulates only if I run the code on a Linux machine and in the notebook. If I run the same code from the terminal with python script.py, or if I run the very same notebook on macOS, the memory pressure does not change: I can run the same cell multiple times and the occupied memory stays stable (as expected).
Can you help me figure out where the problem comes from and how to solve it?
P.S. Both Python and Jupyter are installed with Anaconda 2018.12 on Ubuntu 18.04.
I have asked the same question on the Ubuntu community, since I am not sure this is strictly related to Python itself, but I have had no answers so far.

Fixing pandoc "out of memory" error when running the profvis R package

I'm trying to use the profvis package to do memory profiling of a large job in R (64-bit), run under RStudio on Windows 7. profvis keeps crashing, and I get an error message saying that Pandoc is out of memory. The message is copied below.
My understanding, and please correct me if this is wrong, is that the problem is likely to go away if I can set the /LARGEADDRESSAWARE switch on Pandoc. To do that, I would need to install a linker, etc., and do my own build, after learning how to do all those things. Or there is a shortcut: installing MS Visual Studio, running the editbin utility, and setting the switch that way. However, a fresh install of Visual Studio is unhappy on my machine and demands that I fix some unspecified problem with Windows Management Instrumentation before it will go forward.
So my question is this: Is there a way to set the /LARGEADDRESSAWARE switch on Pandoc from inside R?
I had a similar problem and was able to resolve it by following the advice at https://www.techpowerup.com/forums/threads/large-address-aware.112556/. See in the post where it has an attached file called laa_2_0_4.zip. I downloaded it and ran the executable it contains. Basic mode was sufficient; I simply navigated to C:/Program Files/RStudio/bin/pandoc/pandoc and turned on the checkbox for the Large Address Aware flag (step 2), then did Commit Changes (step 3). After this, the profvis-invoked pandoc command eventually ran to completion. I was able to watch pandoc's memory consumption in Task Manager rise to a peak of about 2.7 GB.
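As an extra check, recent versions of the rmarkdown package can report which pandoc binary will be invoked, so the flag is applied to the right executable (a sketch; the RSTUDIO_PANDOC override at the end is only mentioned as an alternative, not something from the steps above):

    # Confirm which pandoc executable is actually in use, so the Large
    # Address Aware flag is applied to the right file.
    library(rmarkdown)
    find_pandoc()    # prints the pandoc version and directory being used
    pandoc_exec()    # full path to the pandoc binary, e.g. under
                     # C:/Program Files/RStudio/bin/pandoc on Windows

    # Alternatively, point rmarkdown at a separately installed pandoc:
    # Sys.setenv(RSTUDIO_PANDOC = "C:/path/to/other/pandoc")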
