Cannot use plot() function in RStudio for large objects - r

I am trying to use the default plot() function in R to plot a shapefile that is about 100 MB, using RStudio. When I plot the shapefile, the command takes around 5 minutes to finish executing, and when it finally does, the plotting window remains blank. When I execute exactly the same process in VS Code, the plot appears almost instantly, as expected.
I have tried uninstalling and reinstalling RStudio with no success.

I can't speak for what VS Code does, but I can guarantee that plotting 100 MB worth of data points is useless (unless the final plot is going to be maybe 6 by 10 meters in size).
First thing: can you load the source file into R at all? One would hope so, since that's not a grossly huge data blob. Then use your choice of reduction algorithms to get a reasonable number of points to plot, e.g. 800 by 1600, which is about all a monitor can display anyway.
Next try plotting a small subset to verify the data are in a valid form, etc.
Then consider reducing the data by collapsing maybe each 10x10 region to a single average value, or by using ggplot2::geom_hex.
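For illustration, here is a minimal sketch of that workflow, assuming the shapefile loads with the sf package; the file name, tolerance value, and bin count are placeholders, not values from the question:

library(sf)
library(ggplot2)

shp <- st_read("big_shapefile.shp")           # can the file load at all?
plot(st_geometry(shp[1:100, ]))               # small-subset sanity check

# Simplify geometry so the plot carries no more detail than a screen shows
shp_small <- st_simplify(shp, dTolerance = 100)  # tolerance in layer units
plot(st_geometry(shp_small))

# For raw point data, hexagonal binning collapses each region to a count
pts <- as.data.frame(st_coordinates(shp))
ggplot(pts, aes(X, Y)) + geom_hex(bins = 100)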

Related

R rgl lidR slow rendering on Windows 11 64 bit

I am trying to manually identify/correct trees using LiDAR data (a 1.7 GB object) and a tree tops object via the locate_trees function. Part of the problem is:
rgl is rendering very slowly, even though the 4 GB Nvidia 3050 should be able to handle it.
The tree tops (red 3D dots) are not even showing in the rgl window. When I close the rgl window, the tree tops start popping up (red dots appear and disappear, resulting in a blank white window) in a new rgl window. And if I close that window, a new tree-top window opens, so I stop the process to prevent this from happening.
Does rgl automatically use the GPU, or does it default to the integrated graphics on the motherboard? Is there a way to speed up the rendering?
My other system specs are a Core i9 (14 threads) and 64 GB RAM. Moreover, I am using R 4.2.1.
Code:
library(lidR)
# Import LiDAR data
LiDAR_File = readLAS("path/file_name.las")
# Find tree tops
TTops = find_trees(LiDAR_File, lmf(ws = 15, hmin = 5))
# Manually correct tree identification
TTops_Manual = locate_trees(LiDAR_File, manual(TTops)) # This is where rgl rendering becomes too slow if there are too many points involved.
There were two problems here. First, the lidR::manual() function, which is used to select trees, has a loop in which one sphere is drawn for each tree. By default rgl redraws the whole scene after each change; this should be suppressed. The patch in https://github.com/r-lidar/lidR/pull/611 fixes this. You can install a version with this fix with
remotes::install_github("r-lidar/lidR")
Second, rgl was somewhat inefficient in drawing the initial point cloud, duplicating the data unnecessarily. When you have tens of millions of points, this can exhaust all of R's memory, and things slow to a crawl. The development version of rgl fixes this. It's available via
remotes::install_github("dmurdoch/rgl")
The LiDAR images are very big, so you might find you still have problems even with these changes. More regular RAM will help R: you may need it if the time to the first display is too long. After the first display, almost all the work is done in the graphics system; if things are still too slow, you may need a faster graphics card (or more memory for it).
rgl has trouble displaying too many points. The plot function in lidR is convenient and allows you to produce ready-to-publish illustrations, but it cannot replace a real point-cloud viewer for big point clouds. I don't have a GPU on my computer, and I don't know if or how rgl can take advantage of one.
In the documentation of the lidR function you are talking about, you can see:
This is only suitable for small-sized plots

How to draw scrolling graphics in R, like financial time series

I would like to draw financial time series in R that are continuously updated throughout the day. Sometimes I can have several updates per second, and I want to draw the time series as it evolves.
Moreover, I want to enrich my graphics with extra information that I will also plot on the same graph (not necessarily a time series).
So I wonder if there is either:
a package in R to draw such series and have them scroll automatically as soon as I push new data
or a way to do bit blit in R and simply update my graph,
or a way to use packages like grid or anything else that would draw what is necessary (at least lines and points) and help scroll the data quickly to have a smooth rendering.
I would like something a bit more modern than a Tcl/Tk solution like the one explained here.
We are doing this with shiny and a timer variable that refreshes the plot every n seconds.
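A minimal sketch of that setup, where read_latest() is a hypothetical stand-in for whatever fetches your newest quotes:

library(shiny)

ui <- fluidPage(plotOutput("ts"))

server <- function(input, output, session) {
  output$ts <- renderPlot({
    invalidateLater(1000)            # re-run this block every second
    dat <- read_latest()             # assumed to return time and price columns
    plot(dat$time, dat$price, type = "l")
  })
}

shinyApp(ui, server)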
R itself isn't really made for continuous updates. The (default) graphics device is static (so you can't easily 'append one point'), and there is only one event loop.
You can do it with external programs -- I have used both custom Qt applications I wrote for this as well as custom data handler in the (awesome, under-appreciated) kst real-time visualization program.
I'm not working with financial data, but if the data file itself is updated throughout the day, the simplest solution would be something like:
k <- 0
while (k <= 3600) {
  foo <- read.table("data.txt")   # re-read the (updated) data file
  plot(foo[, 1], foo[, 2])        # redraw from scratch
  Sys.sleep(60)                   # seconds
  k <- k + 1
}
This redraws the plot every 60 seconds. You can also put a web address for the data instead of "data.txt". To "scroll", you can play with the xlim argument to plot(), as sketched below.
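For the scrolling part, a small sketch (same hypothetical data.txt layout as above) that keeps only the most recent window in view:

window <- 300                                  # width of the visible window
foo <- read.table("data.txt")
latest <- max(foo[, 1])
plot(foo[, 1], foo[, 2], type = "l",
     xlim = c(latest - window, latest))        # show only the newest data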

Editing multiple plots in RStudio

One interesting feature of RStudio is that it allows you to save multiple plots generated from a script. This, however, opens up the problem of how to edit multiple plots. My issue at the moment is adding lines to histograms using the abline() function. This function, however, was designed to work with the last plot generated in the environment. One way, of course, would be to add the lines as soon as the plot is generated, but I only calculate the coordinates at the end of the algorithm, and by then I have transformed the data and generated multiple plots from it. So I was wondering if there isn't a way to tell R to search for a given plot and add the line to it. I read the abline() documentation but found nothing about this. One can always save the data necessary to generate the plot and regenerate it at the end of the script, but I was wondering if there isn't a less memory-consuming method.
One way to get around this issue is:
1. Save your graphics as variables, e.g. hist_1 <- hist(x, plot = FALSE)
2. Write any code you like, e.g. a very complicated computation that gives a number y as output
3. plot(hist_1)
4. abline(v = y)
(Note that abline() only needs the coordinate, not the histogram object.) This gives a general idea of how to edit multiple plots without having to save multiple copies of datasets and without overloading the RStudio interface. It works well in the R Ubuntu terminal too.
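Put together as a runnable sketch, with rnorm() data standing in for the real dataset and mean(x) standing in for the coordinate computed later:

x <- rnorm(1000)
hist_1 <- hist(x, plot = FALSE)   # store the histogram, draw nothing yet

# ... the rest of the script transforms data and eventually produces y ...
y <- mean(x)

plot(hist_1)                      # draw the saved histogram now
abline(v = y, col = "red")        # add the line to the active device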

add data points to existing plot in R

I receive data from a sensor from time to time and want to plot it in real time. That means the length of the dataset is not known beforehand, and I need to adjust the range of the graph dynamically.
I tried the following
plot(1, 10, xlim = range(0, 10), ylim = range(0, 10), type = 'n')
points(1, data[1])
points(2, data[2])
But once the number of dots exceeds the range of the x axis (10 in this case), the data points fall outside the plotted range. How do I adjust the range accordingly?
Just issue a new plot command with an expanded range. On modern computers the time taken to recreate the plot is small, and you generally will not see a delay. Any other approach will essentially do the same thing: clear the current plot and create a new plot.
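A sketch of that approach, with runif() standing in for a sensor reading: grow xlim and redraw whenever a new point would fall outside the current range.

xmax <- 10
data <- numeric(0)
for (i in 1:25) {
  data[i] <- runif(1, 0, 10)            # stand-in for a sensor reading
  if (i > xmax) xmax <- 2 * xmax        # expand the range, then replot
  plot(seq_along(data), data, xlim = c(0, xmax), ylim = c(0, 10))
  Sys.sleep(0.1)                        # simulate time between readings
}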
The ggplot2 and lattice packages have ways of constructing a plot and then updating it, but when the updated plot is shown it is redrawn from scratch.
There is a zoomplot function in the TeachingDemos package that will let you change the range of a plot, but it also just redraws the plot from scratch (and due to changes in R 3.0.0 it is not currently working, so to use it you would need to go back to R 2.15 or earlier, or wait for it to be fixed).
You can't adjust the range dynamically (sometimes Excel is better). However, you can keep track of what you've plotted and redo the plot when you've reached the limit. You could also just make a new plot every time you get more data, which would be a way of faking a dynamic update.

Plot two large Raster Data Sets in a Scatter Plot

I have a problem with plotting two raster data sets in R.
I use two different IRS LISS III scenes (with the same extent), and what I want is to plot the pixel values of both scenes in one scatterplot (x = Layer 1 and y = Layer 2).
My problem is the handling of this big amount of data. Each scene has about 80,000,000 pixels; through reclassification and other processing I was able to scale the values down to about 12,000,000 per raster. But when I try to import these values, e.g. into a data.frame, or load them from an ASCII file, I always run into memory problems.
Is it possible to plot such an amount of data? If yes, it would be great if someone could help me; I have been trying for two days now and right now I'm desperate.
Many thanks,
Stefan
Use the raster package; there's a good chance it will work out of the box, since it has good "out-of-memory" handling. If it doesn't work with the ASCII grids, convert them to something more efficient (like an LZW-compressed and tiled GeoTIFF) with GDAL. And if they are still too big, resize them; that's all the graphics rendering process will do anyway. (You don't say how you resized originally, or give any details on how you are trying to read them.)
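If the rasters do load, one hedged way to get a plottable scatterplot is to subsample rather than read every pixel; sampleRegular() in the raster package reads a regular subsample without pulling the whole grid into memory. The GeoTIFF names and sample size below are placeholders:

library(raster)
r1 <- raster("scene1.tif")
r2 <- raster("scene2.tif")

n <- 100000                        # roughly as many points as a plot can show
v1 <- sampleRegular(r1, size = n)  # values only, sampled on a regular grid
v2 <- sampleRegular(r2, size = n)  # same extent, so the samples align

plot(v1, v2, pch = ".", xlab = "Layer 1", ylab = "Layer 2")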
