CPU usage of R session from Windows Performance Monitor or Resource Monitor

How can I obtain the CPU usage of an R session between two time points as a CSV file, using Windows Performance Monitor or Resource Monitor?
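One way to capture this (not from the original thread) is the built-in typeperf tool, which writes Performance Monitor counters straight to CSV. The instance name rsession is an assumption for an RStudio session; use whatever name Performance Monitor shows for your R process (e.g. Rgui or Rterm):

    typeperf "\Process(rsession)\% Processor Time" -si 1 -sc 120 -f CSV -o rsession_cpu.csv

Here -si 1 samples once per second and -sc 120 stops after 120 samples, so the two time points are the start of the run and two minutes later.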

Related

Metrics such as CPU, memory, disk in Windows

I am using Fluent Bit to collect metrics such as CPU, disk, and memory on Windows, but I am unable to collect them. Does anyone know how to collect these metrics using Fluent Bit?
As per the doc, the node-exporter plugin, which collects CPU / disk / network metrics, is only supported on Linux-based operating systems.
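For reference, a minimal Linux-side setup with that plugin looks roughly like this (property names taken from the Fluent Bit docs; treat it as a sketch). On Windows, it may be worth checking whether your Fluent Bit version ships the newer windows_exporter_metrics input instead:

    [INPUT]
        name            node_exporter_metrics
        tag             node_metrics
        scrape_interval 2

    [OUTPUT]
        name  prometheus_exporter
        match node_metrics
        host  0.0.0.0
        port  2021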

Difference between perf events for Intel processors

What is the difference between the following perf events for Intel processors:
UNC_CHA_DIR_UPDATE.HA: Counts only multi-socket cacheline Directory state updates memory writes issued from the HA pipe. This does not include memory write requests which are for I (Invalid) or E (Exclusive) cachelines.
UNC_CHA_DIR_UPDATE.TOR: Counts only multi-socket cacheline Directory state updates due to memory writes issued from the TOR pipe which are the result of remote transaction hitting the SF/LLC and returning data Core2Core. This does not include memory write requests which are for I (Invalid) or E (Exclusive) cachelines.
UNC_M2M_DIRECTORY_UPDATE.ANY: Counts when the M2M (Mesh to Memory) updates the multi-socket cacheline Directory to a new state.
The above descriptions of the perf events are taken from here.
In particular, if there is a directory update because of a memory write request coming from a remote socket, which perf event, if any, will account for it?
As per my understanding, since the CHA is responsible for handling requests coming from remote sockets via UPI, directory updates caused by remote requests should be reflected in UNC_CHA_DIR_UPDATE.HA or UNC_CHA_DIR_UPDATE.TOR. But when I run a program (described below), the UNC_M2M_DIRECTORY_UPDATE.ANY count is much larger (more than 34M), whereas the other two events are only on the order of a few thousand. Since no writes other than those coming from the remote socket are happening, it seems that UNC_M2M_DIRECTORY_UPDATE.ANY, and not the other two events, measures the number of directory updates happening due to remote writes.
Description of the system
Intel Xeon Gold 6242 CPU (Intel Cascade Lake architecture)
4 sockets, each with PMEM attached to it
part of the PMEM is configured to be used as system RAM on sockets 2 and 3
OS: Linux (kernel 5.4.0-72-generic)
Description of the program:
Note: numactl is used to bind the process to node 2, which is a DRAM node
Allocate two buffers of size 1 GB each
Initialize these buffers
Move the second buffer to the PMEM attached to socket 3
Perform a data copy from the first buffer to the second buffer (see the sketch below)
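A minimal reconstruction of that program (my sketch, not the poster's code) could use the move_pages(2) syscall to migrate the second buffer; running it under numactl as noted above, wrapped in perf stat -a with the three events, should reproduce the comparison:

    // Build with: g++ -O2 copy.cpp -lnuma
    // Run as:     numactl --cpunodebind=2 --membind=2 ./a.out
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <numaif.h>   // move_pages(), MPOL_MF_MOVE
    #include <unistd.h>   // sysconf()

    int main() {
        const size_t SIZE = 1UL << 30;                // 1 GB per buffer
        const long page = sysconf(_SC_PAGESIZE);

        char *src = static_cast<char*>(aligned_alloc(page, SIZE));
        char *dst = static_cast<char*>(aligned_alloc(page, SIZE));
        memset(src, 1, SIZE);                         // initialize both buffers,
        memset(dst, 2, SIZE);                         // faulting pages on node 2

        // Migrate every page of the second buffer to node 3 (PMEM-backed).
        size_t npages = SIZE / page;
        void **pages = static_cast<void**>(malloc(npages * sizeof(void*)));
        int *nodes   = static_cast<int*>(malloc(npages * sizeof(int)));
        int *status  = static_cast<int*>(malloc(npages * sizeof(int)));
        for (size_t i = 0; i < npages; i++) {
            pages[i] = dst + i * page;
            nodes[i] = 3;
        }
        if (move_pages(0, npages, pages, nodes, status, MPOL_MF_MOVE) != 0)
            perror("move_pages");

        memcpy(dst, src, SIZE);                       // the measured remote copy
        return 0;
    }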

Why can a large SQLite database cause TCP connection delays on Windows Server?

I have a Windows Server 2016 machine that runs a server program; there are about 2.2k concurrent requests per second. The server program costs the server only 25% CPU, 25% memory, and 30% bandwidth. It's written in C++, much like the Boost example: it just does some calculation and returns the result to the client over TCP, and it doesn't use the disk.
But it's very laggy. I can see the lag not only from my clients but also from the Remote Desktop Connection: it takes about 10 seconds to establish an RDP connection, yet it's very quick (less than 2 seconds) if I close the server program.
I guess some resource on my server is exhausted. But how can I find it? Is there any tool that can profile the system and find the bottleneck?
Update
The server program spreads the load evenly across all cores by running 8 threads on 8 cores. I did take care of this, and it's confirmed in Task Manager: all 8 cores are used about equally.
I found the problem: I'm using a sqlite3 database (my.db) to log all client accesses, and the server becomes laggier as the .db grows. It is now 1.2 GB, which causes the lag.
Then I tried:
Keeping the 1.2 GB .db but loading it only once at startup to read some configuration, with no new logging and no read/write access while the server is running; it still lags.
Executing DELETE FROM log_table and VACUUM (sketched below) to delete the previous log and reduce the .db size to 16 KB. The lag problem is then gone and client requests become very quick.
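For reference, the cleanup in the second step amounts to the SQL below (table name as in the post; the checkpoint line is my addition, since in WAL mode the -wal file can also grow):

    DELETE FROM log_table;            -- drop the accumulated log rows
    VACUUM;                           -- rebuild the file so it shrinks on disk
    PRAGMA wal_checkpoint(TRUNCATE);  -- flush and truncate the write-ahead log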
Question
Why can the large database make the whole server lag, affecting not only the server program itself but also other applications such as RDP connections, even when the load is low?
Server Environment
Windows Server 2016
CPU: 8 cores (25% used)
memory: 16 GB (25% used)
disk: 40 GB (30% used)
server program written in C++ with Boost coroutines
sqlite3 database with PRAGMA journal_mode=WAL; enabled.
Install the Sysinternals tools.
Launch procexp.exe (Process Explorer) and use it to find out memory and disk usage for your process and others.
Use resmon (Win+R, then type "resmon") to monitor the network bandwidth while your program is running and while it's not.

What does the return value of cudaDeviceProp::asyncEngineCount mean?

I have read the documentation, and it says that if it returns 1:
device can concurrently copy memory between host and device while executing a kernel
If it is 2:
device can concurrently copy memory between host and device in both directions and execute a kernel at the same time
What exactly is the difference?
With 1 DMA engine, the device can either download data from the CPU or upload data to the CPU, but not do both simultaneously. With 2 DMA engines, the device can do both in parallel.
Regardless of the number of available DMA engines, the device also has an execution engine that can run a kernel in parallel with ongoing memory operations.
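A hypothetical sketch (not from the original answer) of what the second copy engine enables: an H2D copy, a kernel, and a D2H copy issued to three streams can all be in flight at once. Host buffers must be pinned for the copies to run asynchronously:

    #include <cuda_runtime.h>

    __global__ void scale(float *d, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) d[i] *= 2.0f;
    }

    int main() {
        const int N = 1 << 20;
        float *h_in, *h_out, *d_up, *d_work, *d_down;
        cudaMallocHost(&h_in,  N * sizeof(float));   // pinned host memory
        cudaMallocHost(&h_out, N * sizeof(float));
        cudaMalloc(&d_up,   N * sizeof(float));
        cudaMalloc(&d_work, N * sizeof(float));
        cudaMalloc(&d_down, N * sizeof(float));

        cudaStream_t s1, s2, s3;
        cudaStreamCreate(&s1); cudaStreamCreate(&s2); cudaStreamCreate(&s3);

        // Three independent operations in three streams: with
        // asyncEngineCount == 2 the two copies overlap each other and
        // the kernel; with 1 engine the copies serialize with each other.
        cudaMemcpyAsync(d_up, h_in, N * sizeof(float),
                        cudaMemcpyHostToDevice, s1);
        scale<<<(N + 255) / 256, 256, 0, s2>>>(d_work, N);
        cudaMemcpyAsync(h_out, d_down, N * sizeof(float),
                        cudaMemcpyDeviceToHost, s3);

        cudaDeviceSynchronize();
        return 0;
    }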

Munin Graphs meaning

I've been using Munin for a few days and I think the information is very interesting, but I don't understand some of the graphs or how they can be used/read to get information to improve the system.
The ones I don't understand are:
Disk
Disk throughput per device
Inode usage in percent
IOstat
Firewall Throughput
Processes
Fork rate
Number of threads
VMstat
System
Available entropy
File table usage
Individual interrupts
Inode table usage
Interrupts and context switches
Thanks!
Munin creates graphs that enable you to see trends. This is very useful for checking that a change you made doesn't negatively impact the performance of the system.
Disk
Disk throughput per device & IOstat
The amount of data written to or read from a disk device. Disks are always slow compared to memory, so a lot of disk reads could, for example, indicate that your database server doesn't have enough RAM.
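Munin's IOstat graph is built on the same statistics you can watch live with the iostat tool from the sysstat package (not part of the original answer):

    iostat -dx 5    # extended per-device throughput and utilization, every 5 seconds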
Inode usage in percent
Every filesystem has an index where information about the files is stored, such as name, permissions, and location on the disk. With many small files, the space available for this index can run out. If that happens, no new files can be saved to that filesystem, even if there is enough space on the device.
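You can cross-check inode usage with a standard tool (not part of the original answer):

    df -i    # reports inode usage per filesystem instead of block usage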
Firewall Throughput
Just like it says: the number of packets going through the iptables firewall. Often this firewall is active on all interfaces on the system. This is only really interesting if you run Munin on a router/firewall/gateway system.
Processes
Fork rate
Processes are created by forking an existing process into two processes. This is the rate at which new processes are created.
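The underlying counter is visible outside Munin too (not part of the original answer):

    vmstat -f    # total number of forks since boot; sample it twice to get a rate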
Number of threads
The total number of threads running on the system (every process has at least one).
VMstat
Usage of CPU time.
running: time spent running non-kernel code.
I/O sleep: time spent waiting for I/O.
System
Available entropy: a measure of the random bits available in the kernel entropy pool, which feeds /dev/random. Random numbers are needed, for example, to create SSL connections. If you create a large number of SSL connections, this randomness pool can run low.
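To read the kernel's current estimate directly (standard procfs path, not from the original answer):

    cat /proc/sys/kernel/random/entropy_avail    # available entropy, in bits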
File table usage
The total number of open files in the system. If this number suddenly goes up, there might be a program that is not releasing its file handles properly.
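The raw numbers behind this graph come from procfs (not part of the original answer):

    cat /proc/sys/fs/file-nr    # allocated handles, free handles, system-wide maximum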
