lapply to convert rasters to polygons using gdal_polygonizeR - polygon

I am using John Baumgrtner's gdal_polygonizeR (https://johnbaumgartner.wordpress.com/2012/07/26/getting-rasters-into-shape-from-r/) to covert rasters to polygons in R. Aside - I tried raster pkg rasterToPolygons function and it took forever; gdal_polygonizeR is way faster. Anyways, I have a list of 533 raster files (different extents) that I want to convert to polygons. The gdal_polygonizeR function works when a single list element is called, but I have tried to use it on all list elements using lapply and get an error message. See code below:
#path to folder containing all .tif raster files
dir <- "/path/to/raster/files"
#create a list of the files in the folder
files <- list.files(path = dir, pattern = ".tif$")
#use lapply to import/create list of all files in folder
rasterl_50 <- lapply(paste0(dir, files), raster)
#test gdal_polygonizeR function on single list element
gdal_polygonizeR(rasterl_50[[1]]) #works properly
#loop thru all elements in list
lapply(rasterl_50, gdal_polygonizeR)
Output = the first six (6) elements seem to run OK, but I get the following error msg at [[7]]:
wfp1 <- gdal_polygonizeR(rasterl_50[[1]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d4dc99d8d.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
wfp2 <- gdal_polygonizeR(rasterl_50[[2]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d7698a853.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
wfp3 <- gdal_polygonizeR(rasterl_50[[3]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d30d4d703.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
wfp4 <- gdal_polygonizeR(rasterl_50[[4]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d24036d07.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
wfp5 <- gdal_polygonizeR(rasterl_50[[5]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d4683ed87.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
wfp6 <- gdal_polygonizeR(rasterl_50[[6]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d4e23b4d1.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
wfp7 <- gdal_polygonizeR(rasterl_50[[7]])
Creating output /var/folders/s9/pm92gdl94h18k4n6026cb8x00000gn/T//RtmpvRRvA4/file23d6791d108.shp of format ESRI Shapefile.
0...10...20...30...40...50...60...70...80...90...100 - done.
Error in readOGR(dirname(outshape), layer = basename(outshape), verbose = !quiet) :
no features found
In addition: Warning message:
In ogrFIDs(dsn = dsn, layer = layer) :
Show Traceback
Rerun with Debug
Error in readOGR(dirname(outshape), layer = basename(outshape), verbose = !quiet) :
no features found
#
If anyone has ideas for a solution using lapply or a for loop etc., please reply. Thanks

Solution: I had to run gdal_polygonizeR on each individual list element, and found that several raster files in the list contained no values (this resulted from reclassify function applied to rasters prior). I removed these files from the list, and lapply worked. Here is the code:
#remove 'no value' elements from the list
new_rastlist <-
rasterlist[c(-7,-14,-36,-89,-191,-310,-432,-436,-476,-493,-494,-501)]
#then try again to use lapply
polyl <- lapply(rastlist, gdal_polygonizeR)
UPDATE:
Even better, remove rasters with all NAs first with this:
batch_reclass <- function(rastlist){
for (i in 1:length(wfrastlist)) {
#read in raster
r <-raster(paste0("/path/to/rasterfiles/", rastlist[i]))
#perform the reclassifcation
rc <- reclassify(r, rclmat)
#write each reclass to a new file
if (!is.na(minValue(rc))) {
writeRaster(rc, filename = paste0("/path/to/new/rasterfiles/", "rc_",
rastlist[i]), format="GTiff", overwrite=TRUE)
}}
}
#run the function
batch_reclass(rastlist)

Related

using list.files to read many shape files and then merge them into one big file

I have more than 1000 shape files in a directory, and I want to select only 10 of them whose names are already known to me as follows:
15TVN44102267_Polygons.shp, 15TVN44102275_Polygons.shp
15TVN44102282_Polygons.shp, 15TVN44102290_Polygons.shp
15TVN44102297_Polygons.shp, 15TVN44102305_Polygons.shp
15TVN44102312_Polygons.shp, 15TVN44102320_Polygons.shp
15TVN44102327_Polygons.shp, 15TVN44102335_Polygons.shp
First I want to read only these shape files using the list.files command, and then merge them into one big file. I tried the following command, but it failed. I will appreciate any assistance from the community.
setwd('D/LiDAR/CHM_tree_objects')
files <- list.files(pattern="15TVN44102267_Polygons|
15TVN44102275_Polygons| 15TVN44102282_Polygons|
15TVN44102290_Polygons| 15TVN44102297_Polygons|
15TVN44102305_Polygons| 15TVN44102312_Polygons|
15TVN44102320_Polygons| 15TVN44102327_Polygons|
15TVN44102335_Polygons| 15TVN44102342_Polygons|
15TVN44102350_Polygons| 15TVN44102357_Polygons",
recursive = TRUE, full.names = TRUE)
Here's a slightly different approach. If you already know the location of the files and their file names, you don't need to use list.files:
library(sf)
baseDir <- '/temp/r/'
filenames <- c('Denisonia-maculata.shp', 'Denisonia-devisi.shp')
filepaths <- paste(baseDir, filenames, sep='')
# Read each shapefile and return a list of sf objects
listOfShp <- lapply(filepaths, st_read)
# Look to make sure they're all in the same CRS
unique(sapply(listOfShp, st_crs))
# Combine the list of sf objects into a single object
combinedShp <- do.call(what = sf:::rbind.sf, args=listOfShp)
combinedShp will then be an sf object that has all the features in your individual shapefiles. You can then write that out to a single file in your chosen format with st_write.

Read in all shapefiles in a directory in R into memory

I would like to read all shapefiles in a directory into the global environment However, when I get a list of files in my working directory, the list includes both .shp and .shp.xml files, the latter of which I do not want. My draft code reads both .shp and .shp.xml files. How can I prevent it from doing so?
Draft code follows:
library(maptools)
# get all files with the .shp extension from working directory
setwd("N:/Dropbox/_BonesFirst/139_Transit_Metros_Points_Subset_by_R")
shps <- dir(getwd(), "*.shp")
# the assign function will take the string representing shp
# and turn it into a variable which holds the spatial points data
for (shp in shps) {
cat("now loading", shp, "...", '\n\r')
assign(shp, readOGR(shp))
}
EDIT: Problems seems to be in the readShapePoints. Either readOGR (from rgdal) or shapefile (from raster) work better.
Get all the files:
# replace with your folder name:
dir <- "c:/files/shpfiles"
ff <- list.files(dir, pattern="\\.shp$", full.names=TRUE)
Now read them. Easiest with raster::shapefile. Do not use readShapefile (obsolete and incomplete)
library(raster)
# first file
shapefile(ff[1])
# all of them into a list
x <- lapply(ff, shapefile)
These days, you could use "terra" and do
library(terra)
v <- vect( lapply(ff, vect) )
Going by the Title of your post, I believe you want to list all shape files from your current directory.You may want to use below.
list.files(, pattern="*.shp", full.names=TRUE)

Reading hdf files into R and converting them to geoTIFF rasters

I'm trying to read MODIS 17 data files into R, manipulate them (cropping etc.) and then save them as geoTIFF's. The data files come in .hdf format and there doesn't seem to be an easy way to read them into R.
Compared to other topics there isn't a lot of advice out there and most of it is several years old. Some of it also advises using additional programmes but I want to stick with just using R.
What package/s do people use for dealing with .hdf files in R?
Ok, so my MODIS hdf files were hdf4 rather than hdf5 format. It was surprisingly difficult to discover this, MODIS don't mention it on their website but there are a few hints in various blogs and stack exchange posts. In the end I had to download HDFView to find out for sure.
R doesn't do hdf4 files and pretty much all the packages (like rgdal) only support hdf5 files. There are a few posts about downloading drivers and compiling rgdal from source but it all seemed rather complicated and the posts were for MAC or Unix and I'm using Windows.
Basically gdal_translate from the gdalUtils package is the saving grace for anyone who wants to use hdf4 files in R. It converts hdf4 files into geoTIFFs without reading them into R. This means that you can't manipulate them at all e.g. by cropping them, so its worth getting the smallest tiles you can (for MODIS data through something like Reverb) to minimise computing time.
Here's and example of the code:
library(gdalUtils)
# Provides detailed data on hdf4 files but takes ages
gdalinfo("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")
# Tells me what subdatasets are within my hdf4 MODIS files and makes them into a list
sds <- get_subdatasets("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")
sds
[1] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_500m"
[2] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_QC_500m"
# I'm only interested in the first subdataset and I can use gdal_translate to convert it to a .tif
gdal_translate(sds[1], dst_dataset = "NPP2000.tif")
# Load and plot the new .tif
rast <- raster("NPP2000.tif")
plot(rast)
# If you have lots of files then you can make a loop to do all this for you
files <- dir(pattern = ".hdf")
files
[1] "MOD17A3H.A2000001.h21v09.006.2015141183401.hdf" "MOD17A3H.A2001001.h21v09.006.2015148124025.hdf"
[3] "MOD17A3H.A2002001.h21v09.006.2015153182349.hdf" "MOD17A3H.A2003001.h21v09.006.2015166203852.hdf"
[5] "MOD17A3H.A2004001.h21v09.006.2015099031743.hdf" "MOD17A3H.A2005001.h21v09.006.2015113012334.hdf"
[7] "MOD17A3H.A2006001.h21v09.006.2015125163852.hdf" "MOD17A3H.A2007001.h21v09.006.2015169164508.hdf"
[9] "MOD17A3H.A2008001.h21v09.006.2015186104744.hdf" "MOD17A3H.A2009001.h21v09.006.2015198113503.hdf"
[11] "MOD17A3H.A2010001.h21v09.006.2015216071137.hdf" "MOD17A3H.A2011001.h21v09.006.2015230092603.hdf"
[13] "MOD17A3H.A2012001.h21v09.006.2015254070417.hdf" "MOD17A3H.A2013001.h21v09.006.2015272075433.hdf"
[15] "MOD17A3H.A2014001.h21v09.006.2015295062210.hdf"
filename <- substr(files,11,14)
filename <- paste0("NPP", filename, ".tif")
filename
[1] "NPP2000.tif" "NPP2001.tif" "NPP2002.tif" "NPP2003.tif" "NPP2004.tif" "NPP2005.tif" "NPP2006.tif" "NPP2007.tif" "NPP2008.tif"
[10] "NPP2009.tif" "NPP2010.tif" "NPP2011.tif" "NPP2012.tif" "NPP2013.tif" "NPP2014.tif"
i <- 1
for (i in 1:15){
sds <- get_subdatasets(files[i])
gdal_translate(sds[1], dst_dataset = filename[i])
}
Now you can read your .tif files into R using, for example, raster from the raster package and work as normal. I've checked the resulting files against a few I converted manually using QGIS and they match so I'm confident the code is doing what I think it is. Thanks to Loïc Dutrieux and this for the help!
These days you can use the terra package with HDF files
Either get sub-datasets
library(terra)
s <- sds("file.hdf")
s
That can be extracted as SpatRasters like this
s[1]
Or create a SpatRaster of all subdatasets like this
r <- rast("file.hdf")
The following worked for me. It's a short program and just takes in the input folder name. Make sure you know which sub data you want. I was interested in sub data 1.
library(raster)
library(gdalUtils)
inpath <- "E:/aster200102/ast_200102"
setwd(inpath)
filenames <- list.files(,pattern=".hdf$",full.names = FALSE)
for (filename in filenames)
{
sds <- get_subdatasets(filename)
gdal_translate(sds[1], dst_dataset=paste0(substr(filename, 1, nchar(filename)-4) ,".tif"))
}
Use the HEG toolkit provided by NASA to convert your hdf file to geotiff and then use any package ("raster" for example) to read the file. I do the same for both old and new hdf files.
Heres the link: https://newsroom.gsfc.nasa.gov/sdptoolkit/HEG/HEGHome.html
Take a look at the NASA products supported here: https://newsroom.gsfc.nasa.gov/sdptoolkit/HEG/HEGProductList.html
Hope this helps.
This script has been very useful and I managed to convert a batch of 36 files using it. However, my problem is that the conversion does not seem correct. When I do it using ArcGIS 'Make NetCDF Raster Layer tool', I get different results + I am able to convert the numbers to C from Kelvin using simple formula: RasterValue * 0.02 - 273.15. With the results from R conversion I don't get the right results after conversion which leads me to believe ArcGIS conversion is good, and R conversion returns an error.
library(gdalUtils)
library(raster)
setwd("D:/Data/Climate/MODIS")
# Get a list of sds names
sds <- get_subdatasets('MOD11C3.A2009001.006.2016006051904.hdf')
# Isolate the name of the first sds
name <- sds[1]
filename <- 'Rasterinr.tif'
gdal_translate(sds[1], dst_dataset = filename)
# Load the Geotiff created into R
r <- raster(filename)
# Identify files to read:
rlist=list.files(getwd(), pattern="hdf$", full.names=FALSE)
# Substract last 5 digits from MODIS filename for use in a new .img filename
substrRight <- function(x, n){
substr(x, nchar(x)-n+1, nchar(x))
}
filenames0 <- substrRight(rlist,9)
# Suffixes for MODIS files for identyfication:
filenamessuffix <- substr(filenames0,1,5)
listofnewnames <- c("2009.01.MODIS_","2009.02.MODIS_","2009.03.MODIS_","2009.04.MODIS_","2009.05.MODIS_",
"2009.06.MODIS_","2009.07.MODIS_","2009.08.MODIS_","2009.09.MODIS_","2009.10.MODIS_",
"2009.11.MODIS_","2009.12.MODIS_",
"2010.01.MODIS_","2010.02.MODIS_","2010.03.MODIS_","2010.04.MODIS_","2010.05.MODIS_",
"2010.06.MODIS_","2010.07.MODIS_","2010.08.MODIS_","2010.09.MODIS_","2010.10.MODIS_",
"2010.11.MODIS_","2010.12.MODIS_",
"2011.01.MODIS_","2011.02.MODIS_","2011.03.MODIS_","2011.04.MODIS_","2011.05.MODIS_",
"2011.06.MODIS_","2011.07.MODIS_","2011.08.MODIS_","2011.09.MODIS_","2011.10.MODIS_",
"2011.11.MODIS_","2011.12.MODIS_")
# Final new names for converted files:
newnames <- vector()
for (i in 1:length(listofnewnames)) {
newnames[i] <- paste0(listofnewnames[i],filenamessuffix[i],".img")
}
# Loop converting files to raster from NetCDF
for (i in 1:length(rlist)) {
sds <- get_subdatasets(rlist[i])
gdal_translate(sds[1], dst_dataset = newnames[i])
}

How can I save multiple jpegs from raster data in R?

I have about 40 spatial rasters in the .tiff format in a folder. I'm trying to generate histograms from each of the rasters in R, and save each histogram as a jpeg in a separate folder. I wrote code to loop through each of the raster, create a histogram and save it using the 'jpeg' package.
setwd("G:/Research/MODIS Albedo Oct 08-July 09/Test")
library(raster)
library(jpeg)
files <- list.files(path="G:/Research/MODIS Albedo Oct 08-July 09/Test", pattern=".tif",all.files=T, full.names=F, no.. = T) #generate a list of rasters in the folder
number<-length(files) #count the number of rasters
for(r in 1:number) #loop over each raster in the folder
{
x<-raster(files[r], header=F) #load one raster file
jpeg("G:/Research/MODIS Albedo Oct 08-July 09/test_histplots/r.jpg") #create jpeg using the name 'r' generated by loop
hist(x) #generate histogram
dev.off()
}
I want each of the generated jpeg to have a different name, ideally a subset of the original raster name. For example, if the original name of the raster is 'MODIS101_265', the jpeg's name should be 265. Here, 265 is the Julian date in the year. I'm assuming that this might involve using a format specifier like %d in C, but I'm not sure how this works in R.
When I run the above code, I get only one histogram. It seems the code is correctly looping over the original rasters, but saving all resultant histograms to a single jpeg.
Any advice will be helpful! Thanks!
Regular expression using gsub are your friend for getting the number out of the name. Assuming that all of your files are named the way your example is ("MODIS101_265.tif"), then the code below will work for you.
Also, welcome to R where for loops are slow and can usually be replaced by the faster lapply.
saveMyHist <- function(fileName) {
fileNum <- as.numeric(gsub(".*_(\\d+)\\.tif", "\\1", fileName))
x <- raster(fileName, header=F)
jpeg(sprintf("G:/Research/MODIS Albedo Oct 08-July 09/test_histplots/%03d.jpg",
fileNum))
hist(x)
dev.off()
}
files <- list.files("/Users/home/Documents/Development/Rtesting",
pattern=".tif",all.files=T, full.names=F, no.. = T)
lapply(files, saveMyHist)

How to convert .shp file into .csv in R?

I want to to convert two .shp files into one database that would allow me to draw the maps together.
Also, is there a way to convert .shp files into .csv files? I want to be able to personalize and add some data which is easier for me under a .csv format. What I have in mind if to add overlay yield data and precipitation data on the maps.
Here are the shapefiles for Morocco, and Western Sahara.
Code to plot the two files:
# This is code for mapping of CGE_Morocco results
# Loading administrative coordinates for Morocco maps
library(sp)
library(maptools)
library(mapdata)
# Loading shape files
Mor <- readShapeSpatial("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/MAR_adm1.shp")
Sah <- readShapeSpatial("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/ESH_adm1.shp")
# Ploting the maps (raw)
png("Morocco.png")
Morocco <- readShapePoly("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/MAR_adm1.shp")
plot(Morocco)
dev.off()
png("WesternSahara.png")
WesternSahara <- readShapePoly("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/ESH_adm1.shp")
plot(WesternSahara)
dev.off()
After looking into suggestions from #AriBFriedman and #PaulHiemstra and subsequently figuring out how to merge .shp files, I have managed to produce the following map using the following code and data (For .shp data, cf. links above)
code:
# Merging Mor and Sah .shp files into one .shp file
MoroccoData <- rbind(Mor#data,Sah#data) # First, 'stack' the attribute list rows using rbind()
MoroccoPolys <- c(Mor#polygons,Sah#polygons) # Next, combine the two polygon lists into a single list using c()
summary(MoroccoData)
summary(MoroccoPolys)
offset <- length(MoroccoPolys) # Next, generate a new polygon ID for the new SpatialPolygonDataFrame object
browser()
for (i in 1: offset)
{
sNew = as.character(i)
MoroccoPolys[[i]]#ID = sNew
}
ID <- c(as.character(1:length(MoroccoPolys))) # Create an identical ID field and append it to the merged Data component
MoroccoDataWithID <- cbind(ID,MoroccoData)
MoroccoPolysSP <- SpatialPolygons(MoroccoPolys,proj4string=CRS(proj4string(Sah))) # Promote the merged list to a SpatialPolygons data object
Morocco <- SpatialPolygonsDataFrame(MoroccoPolysSP,data = MoroccoDataWithID,match.ID = FALSE) # Combine the merged Data and Polygon components into a new SpatialPolygonsDataFrame.
Morocco#data$id <- rownames(Morocco#data)
Morocco.fort <- fortify(Morocco, region='id')
Morocco.fort <- Morocco.fort[order(Morocco.fort$order), ]
MoroccoMap <- ggplot(data=Morocco.fort, aes(long, lat, group=group)) +
geom_polygon(colour='black',fill='white') +
theme_bw()
Results:
New Question:
1- How to eliminate the boundaries data that cuts though the map in half?
2- How to combine different regions within a .shp file?
Thanks you all.
P.S: the community in stackoverflow.com is wonderful and very helpful, and especially toward beginners like :) Just thought of emphasizing it.
Once you have loaded your shapefiles into Spatial{Lines/Polygons}DataFrames (classes from the sp-package), you can use the fortify generic function to transform them to flat data.frame format. The specific functions for the fortify generic are included in the ggplot2 package, so you'll need to load that first. A code example:
library(ggplot2)
polygon_dataframe = fortify(polygon_spdf)
where polygon_spdf is a SpatialPolygonsDataFrame. A similar approach works for SpatialLinesDataFrame's.
The difference between my solution and that of #AriBFriedman is that mine includes the x and y coordinates of the polygons/lines, in addition to the data associated to those polgons/lines. I really like visualising my spatial data with the ggplot2 package.
Once you have your data in a normal data.frame you can simply use write.csv to generate a csv file on disk.
I think you mean you want the associated data.frame from each?
If so, it can be accessed with the # slot access function. The slot is called data:
write.csv( WesternSahara#data, file="/home/wherever/myWesternSahara.csv")
Then when you read it back in with read.csv, you can try assigning:
myEdits <- read.csv("/home/wherever/myWesternSahara_modified.csv")
WesternSahara#data <- myEdits
You may need to do some massaging of row names and so forth to get it to accept the new data.frame as valid. I'd probably try to merge the existing data.frame with a csv you read in in R, rather than making edits destructively....

Resources