I have a habitat classification map from Iceland (https://vistgerdakort.ni.is/) with 72 classes in a tif file with a 5 m × 5 m pixel size. I want to simplify it so that there are only 14 classes. I open the files (a tif file and a text file containing the reclassification rules) and use the classify function from the terra package as follows on a subset of the map.
library(terra)
raster <- rast("habitat_subset.tif")
reclass_table <- read.table("reclass_habitat.txt")
habitat_simple <- classify(raster, reclass_table, othersNA=TRUE)
It does exactly what I need it to do and I am able to save the file back to tif using
writeRaster(habitat_simple, "reclass_hab.tif")
The problem is that my initial tif file was 105 MB and my new reclassified tif file is 420 MB. Since my goal is to reclassify the whole extent of the country, I can't afford to have the file become so big. Any insights on how to make it smaller? I could not find anything online about this issue.
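For reference, classify() treats a two-column table as an "is, becomes" mapping, so reclass_habitat.txt is essentially a lookup from the 72 original class codes to the 14 simplified ones; something like this (the codes below are invented for illustration):
# Sketch with invented codes: column 1 = original habitat class, column 2 = simplified class
reclass_table <- data.frame(from = c(1, 2, 3, 72),
                            to   = c(1, 1, 2, 14))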
You can specify the datatype; in your case you should be able to use "INT1U" (i.e., byte values between 0 and 254; 255 is used for NA, at least that is the default). That should give a file that is 4 times smaller than when you write it with the default "FLT4S". Based on your question, the original data appear to come in that datatype. In addition, you could use compression; I am not sure how well it works with "INT1U". You could have found out about this in the documentation, see ?writeRaster:
writeRaster(habitat_simple, "reclass_hab.tif",
wopt=list(datatype="INT1U", gdal="COMPRESS=LZW"))
You could also skip the writeRaster step; with terra >= 1.1-4 you can just do
habitat_simple <- classify(raster, reclass_table, othersNA=TRUE,
datatype="INT1U", gdal="COMPRESS=LZW")
I am trying to merge multiple orthophotos (tif format) into GeoPackage (gpkg) format for use in the QField app. I found an answer on how to do this in R with the gdalUtils package.
I used this piece of code:
gdalwarp(of="GPKG",srcfile=l[1:50],dstfile="M:/qfield/merge.gpkg",co=c("TILE_FORMAT=PNG_JPEG","TILED=YES))
The process finished successfully, but when I looked at the results I found some artefacts.
In the photo you can see the merged gpkg file with an added fishnet layer (the tif tiles) and the artefacts. The artefacts look like small parts of the original tif tiles that are reordered and probably also overlapped.
At first I thought there was some error in the original orthophoto tifs. But then I created a raster mosaic and a raster catalog, merged the tifs into a new tif dataset, published the raster catalog to an Esri server, and created a tpk file from it, and the artefacts did not show. Is the problem my code? Thank you
Edit:
I found a solution to my problem. If I first create a VRT, the artefacts do not show, so I used this code:
# Build a virtual mosaic of the tifs, convert it to GeoPackage, then add overviews
gdalbuildvrt(gdalfile = ll, output.vrt = "C:/Users/..../dmk_2017_2019.vrt")
gdalwarp(of = "GPKG", srcfile = "C:/Users/..../dmk_2017_2019.vrt",
         dstfile = "C:/Users/..../dmk_2017_2019.gpkg")
gdaladdo(filename = "C:/Users/..../dmk_2017_2019.gpkg", r = "average",
         levels = c(2, 4, 8, 16, 32, 64, 128, 256), verbose = TRUE)
I have another question: what can I do to get a transparent nodata value (background)?
I tried with srcnodata=255 255 255, but this only helped in that the black background became white. I also tried the dstalpha argument, but without success. In QGIS this is possible with the transparency settings:
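In R, I was thinking of something along these lines (untested sketch; I am assuming the gdalUtils wrapper exposes gdalwarp's -srcnodata and -dstalpha flags as the srcnodata and dstalpha arguments, and the output filename below is made up):
# Untested sketch: treat white as nodata and request an alpha band so the
# background becomes transparent instead of black/white
gdalwarp(of = "GPKG",
         srcfile = "C:/Users/..../dmk_2017_2019.vrt",
         dstfile = "C:/Users/..../dmk_2017_2019_alpha.gpkg",   # made-up name
         srcnodata = "255 255 255",
         dstalpha = TRUE)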
I am busy with some drone mapping. However, the altitude values in the images are very inconsistent between repeated flight missions (up to 120 m). The program I use to stitch my drone images into an orthomosaic thinks the drone is flying underground, as the image altitude is lower than the actual ground elevation.
To rectify this issue, I want to batch edit the altitude values of all my images by adding the difference between actual ground elevation and the drone altitude directly into the EXIF of the images.
e.g.
Original image altitude = 250m. Edited image altitude = 250m+x
I have found the exiftoolr R package, which allows you to read and write EXIF data using the standalone ExifTool and Perl programs (see here: https://github.com/JoshOBrien/exiftoolr).
This is my code so far:
library(exiftoolr)
#Object containing images in directory
image_files <-dir("D:/....../R/EXIF_Header_Editing/Imagery",full.names=TRUE)
#Reading info
exif_read(image_files, tags = c("filename", "AbsoluteAltitude")) #Only interested in "filename" and "AbsoluteAltitude"
#Saving to new variable
altitude<-list(exif_read(image_files, tags=c("filename","AbsoluteAltitude")))
This is what some of the output looks like:
FileName AbsoluteAltitude
1 DJI_0331.JPG +262.67
2 DJI_0332.JPG +262.37
3 DJI_0333.JPG +262.47
4 DJI_0334.JPG +262.57
5 DJI_0335.JPG +262.47
6 DJI_0336.JPG +262.57
etc.
I now need to add x to every "AbsoluteAltitude" entry in the list, and then overwrite the existing image altitude value with this new adjusted altitude value, without editing any other important EXIF information.
Any ideas?
I have a program that allows me to batch edit the EXIF altitude, but it makes all the values the same, and I need to keep the variation between the values.
Thanks in advance
Just a follow-up to @StarGeek's answer. I managed to figure out the R equivalent. Here is my solution:
#Installing package from GitHub
if(!require(devtools)) {install.packages("devtools")}
devtools::install_github("JoshOBrien/exiftoolr",force = TRUE)
#Installing/updating ExifTool program into exiftoolr directory
exiftoolr::install_exiftool()
#Loading packages
library(exiftoolr)
#Set working directory
setwd("D:/..../R/EXIF_Header_Editing")
#Object containing images
image_files <- dir("D:/..../R/EXIF_Header_Editing/Imagery",full.names = TRUE)
#Editing "GPSAltitude" by adding 500m to Altitude value
exif_call(args = "-GPSAltitude+=500", path = image_files)
And when opening the .jpg properties, the adjusted Altitude shows.
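To double-check from R, the tag can be re-read (a quick sketch reusing the objects above):
# Optional check: re-read the altitude tag to confirm the +500 offset was written
exif_read(image_files, tags = c("FileName", "GPSAltitude"))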
Thanks StarGeek
If you're willing to just use exiftool directly, you could try this command:
exiftool -AbsoluteAltitude+=250 <DIRECTORY>
I'd first test it on a few copies of your files to see if it works to your needs.
I have a large (266,000 elements, 1.7 GB) SpatialPolygonsDataFrame that I am trying to convert into a 90 m resolution RasterLayer (~100,000,000 cells).
The SpatialPolygonsDataFrame has 12 variables of interest to me, thus I intend to make 12 RasterLayers
At the moment, using rasterize(), each conversion takes ~2 days. So nearly a month expected for total processing time.
Can anyone suggest a faster process? I think this would be ~10-40x faster in ArcMap, but I want to do it in R to keep things consistent, and it's a fun challenge!
general code
######################################################
### Make Rasters
######################################################
library(raster)

## Make template
r <- raster(extent(polys_final), res=90)

## Set up loop
loop_name <- colnames(as.data.frame(polys_final))
for(i in 1:length(loop_name)){
  a <- rasterize(polys_final, r, field=i)
  writeRaster(a, filename=paste("/Users/PhD_Soils_raster_90m/", loop_name[i], ".tif", sep=""),
              format="GTiff")
}
I think this is a case for using GDAL, specifically the gdal_rasterize function.
You probably already have GDAL installed on your machine if you do a lot of spatial work, and you can run GDAL commands from within R using the system() command. I didn't run any tests, but this should be MUCH faster than using the raster package in R.
For example, the code below creates a raster from a shapefile of rivers. This code creates an output file with a 1 value wherever a feature exists, and a 0 where no feature exists.
path_2_gdal_function <- "/Library/Frameworks/GDAL.framework/Programs/gdal_rasterize"
outRaster <- "/Users/me/Desktop/rasterized.tiff"
inVector <- "/Full/Path/To/file.shp"
theCommand <- sprintf("%s -burn 1 -a_nodata 0 -ts 1000 1000 %s %s", path_2_gdal_function, inVector, outRaster)
system(theCommand)
The -ts argument provides the size of the output raster in pixels.
The -burn argument specifies what value to put in the output raster where the features exist.
The -a_nodata argument indicates which value to put where no features are found.
For your case, you will want to add the -a attribute_name argument, which specifies the name of the attribute in the input vector to be burned into the output raster. Full details on the possible arguments are in the gdal_rasterize documentation.
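A sketch of how that could look for one attribute (the attribute name, resolution, and paths below are made up; -tr sets the output resolution in map units, as an alternative to -ts):
# Hypothetical example: burn the polygon attribute "soil_var1" into a 90 m raster
inVector  <- "/Full/Path/To/polys_final.shp"
outRaster <- "/Users/me/Desktop/soil_var1_90m.tif"
theCommand <- sprintf("%s -a soil_var1 -a_nodata 0 -tr 90 90 %s %s",
                      path_2_gdal_function, inVector, outRaster)
system(theCommand)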
Note that the sprintf() function is just used to format the text string that is passed to the command line by the system() function.
I'm trying to use saveRDS() to save a large number of lists, each containing a raster layer and a list of metadata. It worked fine when the raster layer was extracted from an ncdf file, but when the original file is an ascii file, saveRDS() only writes a pointer to the original file instead of writing the values to the end file.
Here's a condensed version of what's going on:
require(raster)
mf <- raster('myfile.asc')
meta <- list(mylonglistofmetadata)
res <- list(mf, meta)
saveRDS(res, 'myresult.Rdata')
myresult.Rdata is now simply a 33KB pointer to myfile.asc, when I really would like it to store the values so it will still work after I erase myfile.asc (so it should be about 15MB)
In contrast, for other files in ncdf format:
require(ncdf4)
require(raster)
ff <- 'myfile2.nc'
nc <- nc_open(ff)
meta <- list(mylonglistofmetadata)
res <- list(nc, meta)
saveRDS(res, 'myresult2.Rdata')
Here, myresult2.Rdata is storing everything just like I want it to, so my guess is that the issue arises with the raster package?
Does anyone have any idea how to fix this? I would prefer not to use writeRaster(), since I'm trying to keep the metadata together with the data and to use the same format as in the batch extracted from ncdf files, to ease later processing.
The short answer is that you can do:
mf <- raster('myfile.asc')
mf <- readAll(mf)
mf
Now, the values are in memory and will be saved to the .RData file
Also note that:
You can save metadata with the data via writeRaster (see ?raster::metadata); a sketch of that is shown after this list.
You can access ncdf files (with geographic data) via raster('myfile2.nc').
Your example for the ncdf file is not informative, as you do not actually use nc for anything. If you replaced mf with nc, it would not work either after you removed 'myfile2.nc'.
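As for the first note, a minimal sketch of saving the metadata together with the values (this assumes raster's native .grd format, which I believe is the format that keeps metadata() on disk):
# Sketch: attach the metadata list to the RasterLayer and write them together
# using raster's native format ('meta' and 'myfile.asc' are from the question)
mf <- readAll(raster('myfile.asc'))
metadata(mf) <- meta
writeRaster(mf, 'myresult.grd', format='raster')
# later: metadata(raster('myresult.grd')) should return the list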
Good afternoon:
I have been studying R for a while; however, now I'm working with large shapefiles, bigger than 600 MB each. I have a computer with 200 GB free and 12 GB of RAM. I want to ask if somebody knows how to deal with these files? I really appreciate your kind help.
With the latest version of 64-bit R, and the latest version of rgdal just try reading it in:
library(rgdal)
shpdata <- readOGR("/path/to/shpfolder/", "shpfilename")
Where "shpfilename" is the filename without the extension.
If that fails, update your question with details of what you did, what you saw, the sizes of each of the "shpfilename.*" files, your R version, operating system, and rgdal version.
OK, so the question is more about a strategy for dealing with large files, not "how does one read a shapefile in R."
This post shows how one might use the divide-apply-recombine approach as a solution by subsetting shapefiles.
Working from the current answer, assume you have a SpatialPolygonsDataFrame called shpdata. shpdata will have a data attribute (accessed via @data) with some sort of identifier for each polygon (for Tiger shapefiles it's usually something like 'GEOID'). You can then loop over these identifiers in groups and subset/process/export the shpdata for each small batch of polygons. I suggest either writing intermediate files as .csv or inserting them into a database like SQLite.
Some sample code:
library(rgdal)
shpdata <- readOGR("/path/to/shpfolder/", "shpfilename")
# assuming the geo id var is 'geo_id'
lapply(unique(shpdata@data$geo_id), function(id_var){
  shp_sub <- subset(shpdata, geo_id == id_var)
  ### do something to the shapefile subset here ###
  ### output results here ###
  ### clean up memory !!! ###
  rm(shp_sub)
  gc()
})
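For instance, the placeholder lines could be filled in by writing each batch's attribute table to a .csv, as suggested above (a sketch; the output directory is made up):
# Sketch: write each subset's attribute table to its own .csv file
lapply(unique(shpdata@data$geo_id), function(id_var){
  shp_sub <- subset(shpdata, geo_id == id_var)
  write.csv(shp_sub@data, file.path("/path/to/output", paste0(id_var, ".csv")),
            row.names = FALSE)
  rm(shp_sub)
  gc()
})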