I am plotting heavy graphs in Jupyter using R. It is extremely slow, and I suspect the plot is first being exported to EPS and then converted to a PNG.
If I plot the same data in a native R setup (R for Windows, for example), the plotting is nearly instantaneous.
Is there a way to get R in Jupyter to plot more quickly?
I came here looking for a solution to a potentially related issue: the browser window became relatively unresponsive, with lots of lag, when drawing plots with a lot of data points, likely because everything was being rendered as vector graphics.
The fix for my problem also sped up the initial drawing of graphs by an appreciable amount. The solution was to change the Jupyter plot output type to PNG using the command:
options(jupyter.plot_mimetypes = 'image/png')
Now when I plot graphs with tens of thousands of points, the window remains crisply responsive. The downside is that the plots are now bitmaps, but you can always remove the option if you want vector graphics again.
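For example (a minimal sketch, assuming an IRkernel-based setup where 'image/svg+xml' is an available mimetype), you can switch between raster and vector output on the fly:
options(jupyter.plot_mimetypes = 'image/png')        # fast bitmap output for heavy plots
plot(rnorm(50000), pch = ".")                        # large scatter renders as a PNG
options(jupyter.plot_mimetypes = 'image/svg+xml')    # back to vector output
plot(rnorm(100))                                     # small plot, crisp SVG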
I am trying to use the default plot() function in R to plot a shapefile that is about 100 MB, using RStudio. When I try to plot the shapefile, the command takes around 5 minutes to finish executing, and when it finally does, the plotting window remains blank. When I run exactly the same code in VS Code, the plot appears almost instantly, as expected.
I have tried uninstalling and reinstalling RStudio with no success.
I can't speak for what VS Code does, but I can guarantee that plotting 100 MB worth of data points is useless (unless the final plot is going to be maybe 6 by 10 meters in size).
First thing: can you load the source file into R at all? One would hope so since that's not a grossly huge data blob. Then use your choice of reduction algorithms to get a reasonable number of points to plot, e.g. 800 by 1600, which is all a monitor can display anyway.
Next try plotting a small subset to verify the data are in a valid form, etc.
Then consider reducing the data by collapsing, say, each 10x10 region to a single average value, or by using ggplot2::geom_hex.
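A minimal sketch of that workflow, assuming the shapefile contains point features and lives at a made-up path, with sf used to read it (geom_hex also needs the hexbin package installed):
library(sf)        # read the shapefile
library(ggplot2)   # geom_hex for 2D binning

pts    <- st_read("path/points.shp")             # made-up path standing in for the real file
coords <- as.data.frame(st_coordinates(pts))     # X/Y coordinates of every point

sub <- coords[sample(nrow(coords), 10000), ]     # plot a small subset first, to check the data look sane
plot(sub$X, sub$Y, pch = ".")

ggplot(coords, aes(X, Y)) +                      # bin the full data instead of drawing every point
  geom_hex(bins = 100)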
I am trying to manually identify/correct trees using LiDAR data (1.7 GB object) and a tree tops object via the locate_trees function. Part of the problem is:
rgl rendering is very slow, even though the 4 GB NVIDIA 3050 should be able to handle it.
The tree tops (red 3D dots) do not even show in the rgl window. When I close the rgl window, the tree tops start popping up in a new rgl window (red dots appear and disappear, leaving a blank white window). If I close that window, yet another tree-top window opens, so I stop the process to prevent this from happening.
Does rgl automatically use the GPU, or does it default to the integrated graphics on the motherboard? Is there a way to speed up the rendering?
My other system specs are a Core i9 (14 threads) and 64 GB RAM. I am using R 4.2.1.
Code:
library(lidR)
# Import LiDAR data
LiDAR_File = readLAS("path/file_name.las")
# Find tree tops
TTops = find_trees(LiDAR_File, lmf(ws = 15, hmin = 5))
# Manually correct tree identification
TTops_Manual = locate_trees(LiDAR_File, manual(TTops)) # This is where rgl rendering becomes too slow if there are too many points involved.
There were two problems here. First, the lidR::manual() function, which is used to select trees, has a loop where one sphere is drawn for each tree. By default rgl will redraw the whole scene after each change; this should be suppressed. The patch in https://github.com/r-lidar/lidR/pull/611 fixes this. You can install a version with this fix with:
remotes::install_github("r-lidar/lidR")
Second, rgl was somewhat inefficient in drawing the initial point cloud of data, duplicating the data unnecessarily. When you have tens of millions of points, this can exhaust all R memory, and things slow to a crawl. The development version of rgl fixes this. It's available via
remotes::install_github("dmurdoch/rgl")
The LiDAR images are very big, so you might find you still have problems even with these changes. Getting more regular RAM will help R: you may need this if the time to the first display is too long. After the first display, almost all the work is done in the graphics system; if things are still too slow, you may need a faster graphics card (or more memory for it).
rgl has trouble displaying that many points. The plot function in lidR is convenient and lets you produce ready-to-publish illustrations, but it cannot replace a real point-cloud viewer for big point clouds. I don't have a GPU on my computer, and I don't know whether or how rgl can take advantage of one.
In the documentation of the lidR function you are talking about, you can see:
This is only suitable for small-sized plots
I am trying to visualize several hours of neuronal recordings, sampled at 500 Hz, using R on Ubuntu 16.04. Simply put, I want a 2D plot that shows a value (voltage) over time. It is important for me that the plot is interactive: I need an overall view, to compare different times, and to zoom in and out, so I don't want to split my data into different parts and visualize them separately. (I also can't use the normal R plot, since zooming there is a pain and sometimes impossible.)
What I came up with so far is to use plot_ly with the scattergl type, and I could successfully plot 300,000 data points. But that is the limit I can reach: above this amount of data the whole R session freezes and exits. The frustrating part is that this can be done easily in MATLAB, yet with R it seems impossible. Is there any alternative to plot_ly for plotting large data in R?
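For reference, a minimal sketch of the scattergl approach described above, with simulated data standing in for the real recordings (the sampling setup here is purely illustrative):
library(plotly)

fs <- 500                            # sampling rate in Hz
t  <- seq(0, 600, by = 1 / fs)       # ten minutes of samples
v  <- rnorm(length(t))               # stand-in for the recorded voltage

plot_ly(x = t, y = v, type = "scattergl", mode = "lines")   # WebGL-backed line trace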
You might try the dygraphs package, working fine here with 500k points:
library(dygraphs)
my_data = data.frame(x = 1:500000, y = rnorm(500000))
dygraph(my_data) %>% dyRangeSelector()
I have a huge scatter plot matrix to generate and save into a zoomable image. It takes hours to draw, and then I get errors like:
"Server Error Unable to establish connection with R session".
Any ideas? The problem is obviously memory, but there must be a way to get around this.
I've managed to save the file as a 28.7 MB PDF, but it takes a long time to display and makes Inkscape crash. I know that people who generate fractals can make images of effectively infinite resolution without consuming a lot of memory, since the image is generated as you zoom into it. The problem is that fractals are self-similar and scatterplots are not, so I'm not sure there's a smart way around this issue.
A possible way to get around this "information overload" is to plot variables in pairs using qplot() and then save each plot with ggsave(), for example as bmp or jpeg files.
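A minimal sketch of that pairwise approach, with a made-up data frame standing in for the real data (qplot still works but is deprecated in recent ggplot2, so ggplot() is a drop-in alternative):
library(ggplot2)

df <- as.data.frame(matrix(rnorm(4e5), ncol = 4,
                           dimnames = list(NULL, c("a", "b", "c", "d"))))  # illustrative columns

vars <- names(df)
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    if (i < j) {
      p <- qplot(df[[vars[i]]], df[[vars[j]]],
                 xlab = vars[i], ylab = vars[j],
                 alpha = I(0.1), size = I(0.3))
      # save each pair as a raster image instead of one giant vector file
      ggsave(sprintf("pair_%s_%s.jpeg", vars[i], vars[j]), p,
             width = 6, height = 6, dpi = 150)
    }
  }
}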
I am trying to render 739455 data points on a graph using R, but I cannot view all those numbers on the x-axis. Is there a way I can do that?
I am new to R.
Thank you
As others suggested, try hist, hexbin, and plot(density(node)), as these are standard methods for dealing with more points than pixels. (I like to call hist with breaks = "FD"; it tends to produce better breakpoints than the default setting.)
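A minimal sketch of those standard summaries, with a simulated vector standing in for `node` (the real variable from the question):
node <- rnorm(739455)            # simulated stand-in for the real data

hist(node, breaks = "FD")        # histogram with Freedman-Diaconis breakpoints
plot(density(node))              # smoothed density estimate

library(hexbin)                  # 2D binning, useful when there is a second variable
other <- rnorm(739455)           # hypothetical second variable, for illustration
plot(hexbin(node, other))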
Where you may find some joy is in using the iplots package, an interactive plotting package. The corresponding commands include ihist, iplot, and more. As you have a Mac, the more recent Acinonyx package may be even more fun. You can zoom in and out quite easily. I recommend starting with the iplots package as it has more documentation and a nice site.
If you have a data frame with several variables, not just node, then being able to link the different plots such that brushing points in one plot highlights them in another will make the whole process more stimulating and efficient.
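A minimal sketch of the interactive route, assuming the iplots package (and the Java it depends on) is installed:
library(iplots)

node  <- rnorm(739455)   # same illustrative vectors as above
other <- rnorm(739455)

ihist(node)              # interactive histogram; zoom and pan freely
iplot(node, other)       # interactive scatterplot; brushing links the open plots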
That's not to say that you should ignore hexbin and the other ideas - those are still very useful. Be sure to check out the options for hexbin, e.g. ?hexbin.