R is using multiple threads with no job running (R v4.0 / Windows 10 build 18363)

A few days ago I noticed R was using 34% of the CPU when I had no code running. I noticed it again today and I can't figure out why. If I restart R, CPU usage returns to normal, then after 20 minutes or so it ramps up again.
I have a task scheduled that downloads a small file once a week using R, and another using wget in Ubuntu (WSL). It might be that the constant CPU usage only happens after I download COVID-related data from a GitHub repository (link below). Is there a way to see if this is hijacking resources? If it is, other people should know about it.
I don't think it's a Windows Task Manager reporting error, since my temperatures are what I would expect for constant 34% CPU usage (~56 °C).
Is this a security issue? Is there a way to see what R is doing? I'm sure there is a way to inspect this more closely, but I don't know where to begin. GlassWire hasn't reported any unusual activity.
From the Windows 10 Event Viewer, I've noticed a lot of these messages recently, but I don't quite know how to read them:
The application-specific permission settings do not grant Local Activation permission for the COM Server application with CLSID {8BC3F05E-D86B-11D0-A075-00C04FB68820} and APPID {8BC3F05E-D86B-11D0-A075-00C04FB68820} to the user redacted SID (S-1-5-21-1564340199-2159526144-420669435-1001) from address LocalHost (Using LRPC) running in the application container Unavailable SID (S-1-15-2-181400768-2433568983-420332673-1010565321-2203959890-2191200666-700592917). This security permission can be modified using the Component Services administrative tool.
Edit: CPU usage seems to be positively correlated with how long R has been open.

Given the information you provided, it looks like RStudio (not R) is using a lot of resources. R and RStudio are two very different things. These types of issues are very difficult to investigate, as one needs to be able to reproduce them on another computer. One thing you could do is raise the issue with the RStudio team on GitHub.
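One way to check from R itself which process is busy is the ps package (my suggestion, not something the original poster mentioned; install.packages("ps") first):

    library(ps)

    # List running processes whose name mentions R or RStudio
    procs <- ps()
    procs[grepl("rstudio|rsession|rterm|rgui", procs$name, ignore.case = TRUE), ]

    # Inspect the current R process: thread count and cumulative CPU time
    p <- ps_handle()     # handle for this R session
    ps_num_threads(p)    # how many threads R itself has
    ps_cpu_times(p)      # user/system CPU seconds consumed so far

If CPU time keeps accumulating on the rsession/rstudio processes while a plain Rterm session stays idle, that supports the RStudio hypothesis above.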

Related

R will not run after latest Windows 10 updates

I have updated Windows, and now R cannot run, and hence neither can RStudio. When I run the R GUI it just freezes and is unresponsive. I have added a firewall exemption for Chromium.
I am on the Windows Insider program and have just updated to:
Windows 10 Home, Insider Preview
Evaluation Copy. Build 20190.rs_prerelease.200807-1609
Note that the R GUI freezes and then shuts down on its own, so maybe the problem is the R GUI and not RStudio.
I get the following errors in RStudio:
This site can’t be reached
127.0.0.1 refused to connect.
Try:
Checking the connection
Checking the proxy and the firewall
ERR_CONNECTION_REFUSED
Cannot Connect to R
RStudio can't establish a connection to R. This usually indicates one of the following:
The R session is taking an unusually long time to start, perhaps because of slow operations in startup scripts or slow network drive access.
RStudio is unable to communicate with R over a local network port, possibly because of firewall restrictions or anti-virus software.
Please try the following:
If you've customized R session creation by creating an R profile (e.g. located at ~/.Rprofile), consider temporarily removing it.
If you are using a firewall or antivirus software which guards access to local network ports, add an exclusion for the RStudio and rsession executables.
Run RGui, R.app, or R in a terminal to ensure that R itself starts up correctly.
Further troubleshooting help can be found on our website:
Troubleshooting RStudio Startup
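As a side note on the .Rprofile suggestion in that dialog: you can sideline the file without deleting it, e.g. from an R console (default path assumed):

    # Temporarily move the startup profile out of the way
    file.rename("~/.Rprofile", "~/.Rprofile.bak")
    # ... restart R / RStudio and test ...
    # Restore it afterwards
    file.rename("~/.Rprofile.bak", "~/.Rprofile")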
This has been fixed in Windows 10 Insider Preview Build 20201 (released on August 26, 2020 in the Dev channel). The previous two builds were missing 64-bit APIs required by the prebuilt version of R.
Same issue here.
Rolling back to the previous build solved the problem.
I think it is related to the update to the graphics features of Windows.
Here is what Microsoft said in the build 20190 changelog:
Improved Graphics Settings experience
While this isn’t a new feature all together, we have made significant changes based on customer feedback that will benefit our customers’ Graphics Settings experience. We have made the following improvements:
We’ve updated the Graphics Settings to allow users to specify a default high performance GPU.
We’ve updated the Graphics Settings to allow users to pick a specific GPU on a per application basis.

cannot allocate memory - RSelenium and EC2

I am trying to implement a Selenium test that performs automated actions on a website (looping through pages). I am using R with the RSelenium package, plus a PostgreSQL database accessed through the DBI package. All of this runs on an AWS EC2 server.
My problem is that a few minutes after the script is launched, my RStudio session freezes (as does my Linux session) and I see a message like "cannot allocate memory".
So this is clearly a memory issue, and by running top I could see that my Selenium Docker container was using most of the resources.
But my question is how can I reduce the amount of memory used by the Selenium test?
IMHO there is no practical way for a test to use less memory than that test requires. You can try to simplify the test by breaking it up into two or more tests. Also check for memory leaks, as suggested in another answer.
It would be much easier to use the next-largest instance type with more memory, and to shut the instance down when not in use if cost is a concern.
Don't forget to close your driver (remDr$close() in RSelenium); if you don't, you will end up with many leftover Chrome instances. A sketch of that pattern follows below.
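A minimal sketch of that cleanup pattern, assuming a Selenium server in Docker listening on port 4445 and a placeholder URL:

    library(RSelenium)

    # Connect to the Selenium server running in the Docker container
    remDr <- remoteDriver(remoteServerAddr = "localhost",
                          port = 4445L, browserName = "chrome")
    remDr$open()

    for (i in 1:100) {
      # Reuse one browser session instead of opening a new one per page
      remDr$navigate(paste0("https://example.com/page/", i))  # placeholder URL
      src <- remDr$getPageSource()[[1]]
      # ... parse src and write the results to PostgreSQL via DBI ...
    }

    remDr$close()  # releases the browser; otherwise Chrome instances pile up
    gc()           # prompt R to return freed memory to the OS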

Difference between Jstack and gcore when generating Core Dumps?

We all know that core dumps are an essential diagnostic tool for analysing processes on Unix. I know that jstack and gcore are both used for generating Javacore files or core dumps, but my understanding is that gcore is mainly used for processes and jstack for threads.
From an operating-system perspective, processes and threads, though interrelated (a process is made up of threads), differ with respect to memory, speed, and execution. So is it the case that gcore diagnoses the process and jstack analyses the threads within that process?
gcore acts at the OS level: you get a dump of the native code that is currently running. From a Java point of view, it is not really readable.
jstack gets you the stack traces at the VM level (the Java stacks) of all the threads your application has. From that you can see which Java code is actually being executed at a given point.
In practice, gcore is rarely used (too low-level, native code...). Only a really strange issue with a native library, or something similar, might call for that kind of tool.
There is also jmap, which can generate an hprof file containing the heap data from your VM. A tool like the Eclipse Memory Analyzer Tool (MAT) can open the hprof file, and you can drill down into what was going on (on the memory side).
If your VM crashes because of an OutOfMemoryError, you can also set a parameter to produce the hprof file when the event occurs. It helps you understand why (too many users, a DB query that fetches too much data...).
One last thing: you can add a debug option when you start your VM so that you can connect to it and debug the running process. That can help with strange issues you cannot reproduce in your local environment.
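For reference, the invocations mentioned above look roughly like this (the PID 12345, the file names, and MyApp are placeholders):

    # Thread dump of a running JVM
    jstack 12345 > threads.txt

    # Heap dump (hprof) to open in the Memory Analyzer Tool
    jmap -dump:format=b,file=heap.hprof 12345

    # Native core dump of the whole process (gdb's gcore)
    gcore -o core 12345

    # Write an hprof automatically on OutOfMemoryError
    java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps MyApp

    # Start the JVM with remote debugging enabled on port 5005
    java -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 MyApp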

How to achieve high availability of an instance in OpenStack

I want to launch an instance with high availability and without a single point of failure, i.e., the instance would be launched in multiple regions (zones) with its state kept in sync, like a master-slave database. When an application is installed in one place, the same change should be reflected in the other region/zone as well (mostly at the image level). Can we do that?
I have checked some links on this, and I got confused after reading all the docs:
Host aggregates / cells in OpenStack
The nova evacuate command
The Buildbot tool
What exactly is the difference among these? Is VM replication and syncing possible in OpenStack?
To the best of my knowledge, OpenStack does not support VM replication for now.
There is a component called Remus under the Xen project, which could potentially be used via manual configuration, since OpenStack supports Xen (https://www.xenproject.org/directory/directory/projects/70-remus.html). But it seems to be slow and unstable.
The newest approach is called reversed virtual machine replication (http://dl.acm.org/citation.cfm?id=2996894&CFID=918229768&CFTOKEN=85577813). It seems very interesting, and some critical problems in VM replication are well defined and elegantly solved there. However, I did not find an open-source project for it.

WinDBG - Analyse dump file on local PC

I have created a memory dump of an ASP.NET process on a server using the following command: .dump /ma mydump.dmp. I am trying to identify a memory leak.
I want to look at the dump file in more detail on my local development PC. I read somewhere that it is advisable to debug on the same machine as the one where the dump file was created. However, I have also read that some developers analyse dump files on their local development PCs. What is the best approach?
I also notice that when I create a dump file using the command above, the w3wp process memory increases by about 1.5 times. Why is this? I suppose this should be avoided on a live server.
Analyzing on the same machine can save you from SOS loading issues later. Unless you are familiar with WinDbg and SOS, you will otherwise find it confusing and frustrating.
If you have to use another machine for the analysis, make sure you carefully read this blog post, http://blogs.msdn.com/b/dougste/archive/2009/02/18/failed-to-load-data-access-dll-0x80004005-or-what-is-mscordacwks-dll.aspx, as it shows how to copy the necessary files from the source machine (where the dump is captured) to the target machine (the one where you launch WinDbg).
For your second question: because you use WinDbg to attach to the process directly and the .dump command to capture the dump, the target process is unfortunately modified (not easy to explain in a few words). The recommended way is to use ADPlus.exe or DebugDiag; even procdump from Sysinternals is better. Those tools are designed for dump capture and have minimal impact on the target process.
For memory leaks from unmanaged libraries, you should use the memory leak rule of DebugDiag. For managed memory leaks, you can simply capture hang dumps when memory usage is high.
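For illustration, capturing a dump with those tools looks roughly like this (the process name and output paths are placeholders):

    :: Full memory dump of the IIS worker process with procdump
    procdump -ma w3wp c:\dumps\w3wp_full.dmp

    :: Hang-mode dump of the same process with ADPlus
    adplus -hang -pn w3wp.exe -o c:\dumps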
I am no expert on WinDbg, but I once had to analyse a dump file from my ASP.NET site to find a StackOverflowException.
While I got a dump file of my live site (I had no choice, since that was what was failing), I originally tried to analyse it on my local dev PC but ran into problems when trying to load the CLR data from it. The reason was that the exact version of the .NET Framework differed between my dev PC and the server: both were .NET 4, but I imagine my dev PC had some cumulative updates installed that the server did not. The SOS module simply refused to load because of this discrepancy. I actually wrote a blog post about my findings.
So, to answer part of your question: it may be that you have no choice but to run WinDbg on your server; at least then you can be sure that the dump file matches your environment.
It is not necessary to debug on the actual machine unless the problem is difficult to reproduce on your development machine.
As long as you have the PDBs with private symbols and the correct version of .NET installed, the symbols should resolve and the call stacks should display correctly.
For memory leaks, you should enable the GFlags user-mode stack trace and take memory dumps at two points in time, so you can compare memory usage before and after the action that provokes the leak. Remember to disable GFlags afterwards! (See the commands sketched below.)
You could also run DebugDiag on the server, which has automated memory-pressure analysis scripts that work with .NET leaks.
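A sketch of that GFlags toggling, assuming w3wp.exe is the target image:

    :: Enable user-mode stack traces for allocations in w3wp.exe
    gflags /i w3wp.exe +ust

    :: ... restart the process, reproduce the leak, capture two dumps ...

    :: Disable it again when you are done
    gflags /i w3wp.exe -ust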
