If I run large data sets, i.e. 5 charts with thousands of points per second, the chart freezes for a few milliseconds. The console shows:
[Violation] Forced reflow while executing JavaScript took 33ms
and another error
[Violation] 'requestAnimationFrame' handler took <N>ms
I am using a MacBook Pro with the M1 chip, which is pretty fast. Is there any way to prevent this lag?
Preamble
Inspired by the presentation by Barret Schloerke at rstudio::global(2021) (I will add the link as soon as it becomes available), I tried to implement an app to see the differences between using {future}, {plumber}, both, or neither in a Shiny app running a sequence of fast-slow-slow-fast computations (both on distinct outputs and in a sequence within the same one).
Gist
You can find my attempt here, including the app.R Shiny app and the plumber.R APIs.
Results
| Execution selected (5 s for "slow") | Result | Expectation | Comments |
|---|---|---|---|
| Standard run | ~20 seconds before anything appears, then everything appears at the same moment | ~20 seconds, appearing sequentially somehow | Why did it not appear sequentially? |
| {future} only | ~5 seconds before anything appears, then everything appears at the same moment | ~20 seconds, with "first fast" and "second fast" appearing almost immediately, then (~5) "first slow" or "second slow", then (~10) the other, and finally (~20) "Sequential" | I would expect something similar to what happened here for the combined type of run... How is it possible that "sequential" completed in 5 seconds??? |
| {plumber} only | Same as the standard run (R should remain busy until each API call is resolved, right?) | Same time (~20) but appearing sequentially somehow | Why did Shiny render everything at the same time? |
| {future} and {plumber} | Same as the standard run | I did not expect this at all! What I expected here is to have the "_fast"s appearing immediately, the "_slow"s at about the same time after ~5 seconds, and "sequential" ~10 seconds from the start (i.e., ~10 seconds overall) | I am totally confused here :-( |
Doubts
One of the main things I did not understand is why, when activating {future} (with or without {plumber}), "first fast" does not appear immediately. And, in general, why does the output not appear in a single sequence when {future} is not involved? And how is it possible that, with {future} alone, "sequential" takes only ~5 seconds?
So, clearly, I did something the wrong way, and there is something I do not understand correctly.
Questions
Can someone help me understand where and what I got wrong (and maybe infer why): the app, the API, or their interaction?
Thank you,
Corrado.
I am trying to get a more responsive idea of the system run queue length to see if load balancing based on the one minute load average from sysinfo() is having issues caused by the client processes perhaps looking in lockstep...
I've managed to find /proc/schedstat, and it looks to be what I'm looking for, but...
I want to make sure I base my values on the actual interval between polls of /proc/schedstat, instead of potential processing overhead (it's a shell script).
Now for the question: What is the unit of measurement used for the "timestamp" value at the top of the /proc/schedstat file? It's certainly not nanoseconds, because the value increases by somewhere between 258 and 260 each time my script loops with a sleep 1 between iterations.
Inspecting the kernel source (sched/stats.c) shows that the timestamp field is in jiffies.
What is the unit of measurement used for the "timestamp" value at the top of the /proc/schedstat file?
It's 1/HZ seconds, where HZ is set when the kernel is built. With HZ=300, for example, the unit would be 3.333 milliseconds, if I'm counting right.
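To base values on the actual interval between polls, one option is to convert the timestamp delta to seconds using HZ, or to estimate HZ by timing the polls separately. Below is a minimal Python sketch, assuming the file contains a `timestamp <jiffies>` line near the top. The function names and sample values are hypothetical, and HZ has to be supplied or estimated for the kernel in question.

```python
def parse_timestamp(schedstat_text):
    """Return the jiffies value from the 'timestamp' line of schedstat output."""
    for line in schedstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "timestamp":
            return int(fields[1])
    raise ValueError("no timestamp line found")


def jiffies_to_seconds(delta_jiffies, hz):
    """Convert a jiffies delta to seconds; one jiffy lasts 1/hz seconds."""
    return delta_jiffies / hz


def estimate_hz(delta_jiffies, elapsed_seconds):
    """Estimate HZ from a jiffies delta and a separately measured elapsed time."""
    return delta_jiffies / elapsed_seconds


# Hypothetical samples standing in for two consecutive polls of the file.
first = parse_timestamp("version 15\ntimestamp 4295000000\n")
second = parse_timestamp("version 15\ntimestamp 4295000300\n")
delta = second - first  # 300 jiffies between the polls
print(jiffies_to_seconds(delta, hz=300))  # 1.0 second if HZ were 300
```

On a real system, `elapsed_seconds` could be measured with `time.monotonic()` around the two reads. A delta of 258 to 260 per `sleep 1` loop would be consistent with HZ=250 plus roughly 30 to 40 ms of loop overhead, but that is only an inference from the numbers in the question.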
Solution: As suggested by user Andy in the comments, an update to the newest version of Octave (at the moment: octave-4.0.1-rc4) fixed the problem and the plot could be saved as PNG.
I have a large-ish amount of data that I plot in Octave. But when I try to save the image, the program crashes without any explanation or real error message. My Octave is version 4.0 and it's running on Win 8.1, the graphics_toolkit is qt.
Saving smaller amounts of data has worked so far, but somehow I seem to have reached a size where the plot can be drawn but not saved.
First, I load the data from several files listed in the vector inputs:
    data = [];
    for i = 1:length(inputs)
      data = [data; load(inputs{i})];
    endfor
The result is a 955,524 x 7 matrix containing numbers. Loading alone takes a while on my system (several minutes), but eventually succeeds. I then proceed to plot the data:
    hold on;
    for j = 1:size(data, 2)
      currentColumn = normalize(data(:,j)); % make sure all data is in the same range
      plot(1:length(currentColumn), currentColumn, colours{j}); % plot column with distinct colour
    endfor
    hold off;
This results in a plot being drawn as Figure 1 that shows all 955,524 entries of each of the seven columns correctly in a distinct colour. If the program ends here, it exits properly. However, if I add
print("data.png");
Octave will keep running after opening the plot window and eventually crash with a simple "program does not work anymore" error message. The same happens if I try to save manually from the File->Save menu (which offers saving as PDF). Even just touching and moving the plot window takes a few seconds.
I tried using gnuplot and fltk as graphics_toolkit, but the latter does not even open a plot window, and the former seems to be broken (it crashes when attempting to plot even simple data like plot(1:10,1:10);).
Now, I could screenshot the plot and try to work with that, but I'd really rather have it be saved automatically. Also, I find it weird that displaying the curves is possible, but not saving said display. As it works for smaller amounts of data, maybe I just need to somehow allocate more resources to Octave?
It (version 4.2.2) crashes on my Linux Mint too. Just a simple graph, and it crashed two times in a row. I am going back to R. I had my hopes up, as I wanted to review the Numerical Analysis Using Matlab text.
Wait, come to think of it, the Studio version of R crashes when I try to use it, but not when I run the same program from the command line, so I will go back (one more time) and try to run a plot entirely from the command line. Linux Mint requires a 2-CPU 64-bit machine, and I just have a single-CPU 64-bit one.
Is there any way to filter metrics in Graphite while ignoring the hierarchy?
For example:
Say I have the following metrics:
stats_count.A.B.TestMetric
stats_count.A.TestMetric
stats.A.B.TestMetric
stats.A.B.TestMetric
How can I sum TestMetric under stats_count only?
I tried the following with no success:
stats_counts.*.*.TestMetric - obviously this won't work...
sumSeriesWithWildcards(stats_counts.[A-Z\.]*[A-Z]*.TestMetric,1)
sumSeriesWithWildcards(stats_counts.[A-Z]*[.]*[A-Z]*.TestMetric,1)
Any ideas? Is it possible at all?
I have a graphite installation (version 0.9.9) where I create metrics on a lot of small systems
For example, I have 2 installations of a postgresql database (postgresql-1 and postgresql-2) where the second is a slave replication of the first. The first database is used for day to day use while the second is a hot standby used mostly by reporting systems and debugging queries
I think the following example is close to what you want to do. The image shows the number of connections on both databases. The blue line is the first database, the green line is the second, and the red line is the sum of both series, giving the total number of connections.
database.postgresql-1.connection.ALL.value # blue line
database.postgresql-2.connection.ALL.value # green line
sumSeries(database.postgresql-*.connection.ALL.value) # red line
Your problem is that your series have different path levels (I try to avoid that in my series names, as it does cause problems). I don't see any other option than writing something like this:
given
stats_count.A.B.TestMetric
stats_count.A.TestMetric
stats.A.B.TestMetric
stats.A.B.TestMetric
sum stats_count.**.TestMetric metrics using
sumSeries(stats_count.*.TestMetric, stats_count.*.*.TestMetric)
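If the metrics can sit at more than two depths, the same target can be generated rather than typed by hand. Here is a minimal Python sketch of that idea. The helper name `build_sum_target` and its parameters are hypothetical; it only builds the target string and relies on `sumSeries` accepting several series lists, as in the line above.

```python
def build_sum_target(prefix, leaf, max_depth):
    """Build a sumSeries() target covering 1..max_depth wildcard levels."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    # One pattern per depth: prefix.*.leaf, prefix.*.*.leaf, ...
    patterns = [
        ".".join([prefix] + ["*"] * depth + [leaf])
        for depth in range(1, max_depth + 1)
    ]
    return "sumSeries(" + ",".join(patterns) + ")"


print(build_sum_target("stats_count", "TestMetric", 2))
# sumSeries(stats_count.*.TestMetric,stats_count.*.*.TestMetric)
```

The maximum depth still has to be known in advance, so this only automates the workaround; it does not remove the underlying limitation.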
Graphite still needs a lot of improvement, and unfortunately development is going quite slowly (for instance, version 0.9.10 is the latest release, is problematic to install, and is a year old). I am indeed considering forking or contributing to this project.
I have an Rscript being called from a Java program. The purpose of the script is to automatically generate a bunch of graphs in ggplot and then splat them onto a PDF. It has grown somewhat large, with maybe 30 graphs, each of which is called from its own script.
The input is a tab-delimited file of 5-20 MB, but the R session sometimes goes up to 12 GB of RAM usage (on a Mac 10.68, btw, but this will be run on all platforms).
I have read about how to look at the memory size of objects and nothing is ever over 25mb and even if it deep copies everything for every function and every filter step it shouldn't get close to this level.
I have also tried gc() to no avail. If I do gcinfo(TRUE) then gc() it tells me that it is using something like 38mb of ram. But the activity monitor goes up to 12gb and things slow down presumably due to paging on the hd.
I tried calling it via a bash script in which I did ulimit -v 800000 but no good.
What else can I do?
In the process of making assignments, R will always make temporary copies, sometimes more than one or even two. Each temporary assignment will require contiguous memory for the full size of the allocated object. So the usual advice is to plan to have at least three times the amount of contiguous memory available. This means you also need to be concerned about how many other non-R programs are competing for system resources, as well as being aware of how your memory is being used by R. You should try to restart your computer, run only R, and see if you succeed.
An input file of 20mb might expand quite a bit (8 bytes per double, and perhaps more per character element in your vectors) depending on what the structure of the file is. The pdf file object will also take quite a bit of space if you are plotting each point within a large file.
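As a rough back-of-the-envelope check of that expansion, here is a small Python sketch. The figure of about 5 characters per value in the file (including the tab) is purely an assumption for illustration, and the function name is hypothetical; the factor of three follows the rule of thumb about temporary copies mentioned above.

```python
def estimate_peak_bytes(n_values, bytes_per_value=8, copies=3):
    """Rough peak memory for numeric data: storage size times temporary copies."""
    return n_values * bytes_per_value * copies


# Assumption: a 20 MB text file averaging ~5 characters per value (incl. tab).
n_values = 20_000_000 // 5            # about 4 million values
peak = estimate_peak_bytes(n_values)  # 8 bytes per double, 3 copies
print(peak / 1e6)  # 96.0 (MB)
```

Under these assumptions the numeric data alone accounts for on the order of 100 MB, so usage in the gigabytes would have to come from something else, such as character data, many derived columns, or the plot objects themselves.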
My experience is not the same as others who have commented. I do issue gc() before doing memory intensive operations. You should offer code and describe what you mean by "no good". Are you getting errors or observing the use of virtual memory ... or what?
I apologize for not posting a more comprehensive description with code. It was fairly long as was the input. But the responses I got here were still quite helpful. Here is how I mostly fixed my problem.
I had a variable number of columns which, with some outliers got very numerous. But I didn't need the extreme outliers, so I just excluded them and cut off those extra columns. This alone decreased the memory usage greatly. I hadn't looked at the virtual memory usage before but sometimes it was as high as 200gb lol. This brought it down to up to 2gb.
Each graph was created in its own function. So I rearranged the code such that every graph was first generated, then printed to pdf, then rm(graphname).
Further, I had many loops in which I was creating new columns in data frames. Instead of doing this, I just created vectors not attached to data frames in these calculations. This actually had the benefit of greatly simplifying some of the code.
Then after not adding columns to the existing dataframes and instead making column vectors it reduced it to 400mb. While this is still more than I would expect it to use, it is well within my restrictions. My users are all in my company so I have some control over what computers it gets run on.