I have the following problem: I need to process multiple raster files using the same function in R package landscapemetrics. Basically my raster files are parts of a country map, all of the same shape and size (i.e. quadrants. I figured out a code for 1 file, but I have to do the same with more than 600 rasters. So, doing it manually is very irrational. The steps in my code are the following:
# 1. I load "raster" and "landscapemetrics" packages:
library(raster)
library(landscapemetrics)
# 2. I read in my quadrant:
Quadrant <- raster("C:\\Users\\customer\\Documents\\ ... \\2434-44.tif")
# 3. I process the raster to get landscape metrics tibble:
LS_metrics <- calculate_lsm(landscape = Quadrant)
# 4. Finally, I write it into a csv:
write.csv(LS_metrics, file = "2434-44.csv")
I need to keep the same file name for my csv files as I had for tif (e.g. results from processing quadrant "2434-44.tif", need to be stored in "2434-44.csv", possibly in a folder in wd).
I am new to R. I tried to use list.files() and then apply a for loop, but my code did not work.
I need your advice.
Yours faithfully,
Denis
Your question is really about iteration and character (filename) manipulation; not about landscapemetrics etc. There are many similar questions on this site and resources elsewhere that you can consult. The basic approach can be like this:
# get input filenames
inf <- list.files("/my/path", pattern="\\.tif$", full=TRUE)
# create output filenames
outf <- gsub(".tif", ".csv", basename(inf))
# perhaps put output files in particular folder
dir.create("out", FALSE, FALSE)
outf <- file.path("out", outf)
# iterate
for (i in 1:length(inf)) {
# read input
input <- raster(inf[i])
# do something
output <- data.frame(id=1)
# write output
write.csv(output, outf[i])
}
It's very hard to help without further information. What was the issue with your approach of looping through all files using list.files(). In general, this should work.
Furthermore, most likely you don't want to calculate all available landscape metrics, but rather specify a subselection during the calculate_lsm() function call.
Related
I am using a simple code below to append multiple images together with the R magick package. It works well, however, there are many images to process and their names are stored in a .csv file. Could anyone advise on how to load the image names to the image_read function from specific cells in a .csv file (see example below the code)? So far, I was not able to find anything appropriate that would solve this.
library (magick)
pic_A <- image_read('A.png')
pic_B <- image_read('B.png')
pic_C <- image_read('C.png')
combined <- c(pic_A, pic_B, pic_C)
combined <- image_scale(combined, "300x300")
image_info(combined)
final <- image_append(image_scale(combined, "x120"))
print(final)
image_write(final, "final.png") #to save
Something like this should work. If you load the csv into a dataframe then, it's then straightforward to point the image_read towards the appropriate elements.
And the index (row number) is included in the output filename so that things are not overwritten each iteration.
library (magick)
file_list <- read.csv("your.csv",header = F)
names(file_list) <- c("A","B","C")
for (i in 1:nrow(file_list)){
pic_A <- image_read(file_list$A[i])
pic_B <- image_read(file_list$B[i])
pic_C <- image_read(file_list$C[i])
combined <- c(pic_A, pic_B, pic_C)
combined <- image_scale(combined, "300x300")
image_info(combined)
final <- image_append(image_scale(combined, "x120"))
print(final)
image_write(final, paste0("final_",i,".png")) #to save
}
I'm having a lot of trouble reading/writing to CSV files. Say I have over 300 CSV's in a folder, each being a matrix of values.
If I wanted to find out a characteristic of each individual CSV file such as which rows had an exact number of 3's, and write the result to another CSV fil for each test, how would I go about iterating this over 300 different CSV files?
For example, say I have this code I am running for each file:
values_4 <- read.csv(file = 'values_04.csv', header=FALSE) // read CSV in as it's own DF
values_4$howMany3s <- apply(values_04, 1, function(x) length(which(x==3))) // compute number of 3's
values_4$exactly4 <- apply(values_04[50], 1, function(x) length(which(x==4))) // show 1/0 on each column that has exactly four 3's
values_4 // print new matrix
I am then continuously copy and pasting this code and changing the "4" to a 5, 6, etc and noting the values. This seems wildly inefficient to me but I'm not experienced enough at R to know exactly what my options are. Should I look at adding all 300 CSV files to a single list and somehow looping through them?
Appreciate any help!
Here's one way you can read all the files and proceess them. Untested code as you haven't given us anything to work on.
# Get a list of CSV files. Use the path argument to point to a folder
# other than the current working directory
files <- list.files(pattern=".+\\.csv")
# For each file, work your magic
# lapply runs the function defined in the second argument on each
# value of the first argument
everything <- lapply(
files,
function(f) {
values <- read.csv(f, header=FALSE)
apply(values, 1, function(x) length(which(x==3)))
}
)
# And returns the results in a list. Each element consists of
# the results from one function call.
# Make sure you can access the elements of the list by filename
names(everything) <- files
# The return value is a list. Access all of it with
everything
# Or a single element with
everything[["values04.csv"]]
So I have a folder with some n images which I want to open and save with the readImage function. Right now a colleague had written something similar for opening and storing the name only of the images. I'd like to do the following:
setwd("~/ABC/One_Folder_Up")
img_src <- "FolderOfInterest"
image_list <- list.files(path=img_src, pattern = "^closed")
But with the actual .tif images named for example: closed100, closed101,....closed201
The above code works great for getting the names. But how can I get this type of pattern but instead open and save images? The output is a large matrix for each image.
So for n = 1 to n, I want to perform the following:
closed175 <- readImage("closed175.tif")
ave175 <- mean(closed175)
SD175 <- SD(closed175)
I'm assuming the image list shown in the first part could be used in the desired loop?
Then, after the images are saved as their own matricies, and all averages and SDs are calculated, I want to put the averages and SDs in a matrix like this:
imavelist <- c(ave175, ave176,......ave200)
Sorry, not an expert coder. Thank you!
edit: maybe lapply?
edit2: if I use this suggestion,
require(imager)
closed_images <- lapply(closed_im_list, readImage)
closed_im_matrix = do.call('cbind', lapply(closed_images, as.numeric))
Then I need a loop to save each element of the image stack matrix as its own individual image.
setwd("~/ABC/One_Folder_Up/FolderOfInterest/")
#for .tif format
image_list=list.files(path=getwd(), pattern = "*.tif")
# for other formats replace tif with appropriate format.
f=function(x){
y=readImage(x)
mve=mean(y)
sd=sd(y)
c(mve,sd)
}
results=data.frame(t(sapply(image_list,f)))
colnames(results)=c("average","sd")
the resul for 3 images:
> results
average sd
Untitled.tif 0.9761128 0.1451167
Untitled2.tif 0.9604224 0.1861798
Untitled3.tif 0.9782997 0.1457034
>
I'm trying to read MODIS 17 data files into R, manipulate them (cropping etc.) and then save them as geoTIFF's. The data files come in .hdf format and there doesn't seem to be an easy way to read them into R.
Compared to other topics there isn't a lot of advice out there and most of it is several years old. Some of it also advises using additional programmes but I want to stick with just using R.
What package/s do people use for dealing with .hdf files in R?
Ok, so my MODIS hdf files were hdf4 rather than hdf5 format. It was surprisingly difficult to discover this, MODIS don't mention it on their website but there are a few hints in various blogs and stack exchange posts. In the end I had to download HDFView to find out for sure.
R doesn't do hdf4 files and pretty much all the packages (like rgdal) only support hdf5 files. There are a few posts about downloading drivers and compiling rgdal from source but it all seemed rather complicated and the posts were for MAC or Unix and I'm using Windows.
Basically gdal_translate from the gdalUtils package is the saving grace for anyone who wants to use hdf4 files in R. It converts hdf4 files into geoTIFFs without reading them into R. This means that you can't manipulate them at all e.g. by cropping them, so its worth getting the smallest tiles you can (for MODIS data through something like Reverb) to minimise computing time.
Here's and example of the code:
library(gdalUtils)
# Provides detailed data on hdf4 files but takes ages
gdalinfo("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")
# Tells me what subdatasets are within my hdf4 MODIS files and makes them into a list
sds <- get_subdatasets("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")
sds
[1] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_500m"
[2] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_QC_500m"
# I'm only interested in the first subdataset and I can use gdal_translate to convert it to a .tif
gdal_translate(sds[1], dst_dataset = "NPP2000.tif")
# Load and plot the new .tif
rast <- raster("NPP2000.tif")
plot(rast)
# If you have lots of files then you can make a loop to do all this for you
files <- dir(pattern = ".hdf")
files
[1] "MOD17A3H.A2000001.h21v09.006.2015141183401.hdf" "MOD17A3H.A2001001.h21v09.006.2015148124025.hdf"
[3] "MOD17A3H.A2002001.h21v09.006.2015153182349.hdf" "MOD17A3H.A2003001.h21v09.006.2015166203852.hdf"
[5] "MOD17A3H.A2004001.h21v09.006.2015099031743.hdf" "MOD17A3H.A2005001.h21v09.006.2015113012334.hdf"
[7] "MOD17A3H.A2006001.h21v09.006.2015125163852.hdf" "MOD17A3H.A2007001.h21v09.006.2015169164508.hdf"
[9] "MOD17A3H.A2008001.h21v09.006.2015186104744.hdf" "MOD17A3H.A2009001.h21v09.006.2015198113503.hdf"
[11] "MOD17A3H.A2010001.h21v09.006.2015216071137.hdf" "MOD17A3H.A2011001.h21v09.006.2015230092603.hdf"
[13] "MOD17A3H.A2012001.h21v09.006.2015254070417.hdf" "MOD17A3H.A2013001.h21v09.006.2015272075433.hdf"
[15] "MOD17A3H.A2014001.h21v09.006.2015295062210.hdf"
filename <- substr(files,11,14)
filename <- paste0("NPP", filename, ".tif")
filename
[1] "NPP2000.tif" "NPP2001.tif" "NPP2002.tif" "NPP2003.tif" "NPP2004.tif" "NPP2005.tif" "NPP2006.tif" "NPP2007.tif" "NPP2008.tif"
[10] "NPP2009.tif" "NPP2010.tif" "NPP2011.tif" "NPP2012.tif" "NPP2013.tif" "NPP2014.tif"
i <- 1
for (i in 1:15){
sds <- get_subdatasets(files[i])
gdal_translate(sds[1], dst_dataset = filename[i])
}
Now you can read your .tif files into R using, for example, raster from the raster package and work as normal. I've checked the resulting files against a few I converted manually using QGIS and they match so I'm confident the code is doing what I think it is. Thanks to Loïc Dutrieux and this for the help!
These days you can use the terra package with HDF files
Either get sub-datasets
library(terra)
s <- sds("file.hdf")
s
That can be extracted as SpatRasters like this
s[1]
Or create a SpatRaster of all subdatasets like this
r <- rast("file.hdf")
The following worked for me. It's a short program and just takes in the input folder name. Make sure you know which sub data you want. I was interested in sub data 1.
library(raster)
library(gdalUtils)
inpath <- "E:/aster200102/ast_200102"
setwd(inpath)
filenames <- list.files(,pattern=".hdf$",full.names = FALSE)
for (filename in filenames)
{
sds <- get_subdatasets(filename)
gdal_translate(sds[1], dst_dataset=paste0(substr(filename, 1, nchar(filename)-4) ,".tif"))
}
Use the HEG toolkit provided by NASA to convert your hdf file to geotiff and then use any package ("raster" for example) to read the file. I do the same for both old and new hdf files.
Heres the link: https://newsroom.gsfc.nasa.gov/sdptoolkit/HEG/HEGHome.html
Take a look at the NASA products supported here: https://newsroom.gsfc.nasa.gov/sdptoolkit/HEG/HEGProductList.html
Hope this helps.
This script has been very useful and I managed to convert a batch of 36 files using it. However, my problem is that the conversion does not seem correct. When I do it using ArcGIS 'Make NetCDF Raster Layer tool', I get different results + I am able to convert the numbers to C from Kelvin using simple formula: RasterValue * 0.02 - 273.15. With the results from R conversion I don't get the right results after conversion which leads me to believe ArcGIS conversion is good, and R conversion returns an error.
library(gdalUtils)
library(raster)
setwd("D:/Data/Climate/MODIS")
# Get a list of sds names
sds <- get_subdatasets('MOD11C3.A2009001.006.2016006051904.hdf')
# Isolate the name of the first sds
name <- sds[1]
filename <- 'Rasterinr.tif'
gdal_translate(sds[1], dst_dataset = filename)
# Load the Geotiff created into R
r <- raster(filename)
# Identify files to read:
rlist=list.files(getwd(), pattern="hdf$", full.names=FALSE)
# Substract last 5 digits from MODIS filename for use in a new .img filename
substrRight <- function(x, n){
substr(x, nchar(x)-n+1, nchar(x))
}
filenames0 <- substrRight(rlist,9)
# Suffixes for MODIS files for identyfication:
filenamessuffix <- substr(filenames0,1,5)
listofnewnames <- c("2009.01.MODIS_","2009.02.MODIS_","2009.03.MODIS_","2009.04.MODIS_","2009.05.MODIS_",
"2009.06.MODIS_","2009.07.MODIS_","2009.08.MODIS_","2009.09.MODIS_","2009.10.MODIS_",
"2009.11.MODIS_","2009.12.MODIS_",
"2010.01.MODIS_","2010.02.MODIS_","2010.03.MODIS_","2010.04.MODIS_","2010.05.MODIS_",
"2010.06.MODIS_","2010.07.MODIS_","2010.08.MODIS_","2010.09.MODIS_","2010.10.MODIS_",
"2010.11.MODIS_","2010.12.MODIS_",
"2011.01.MODIS_","2011.02.MODIS_","2011.03.MODIS_","2011.04.MODIS_","2011.05.MODIS_",
"2011.06.MODIS_","2011.07.MODIS_","2011.08.MODIS_","2011.09.MODIS_","2011.10.MODIS_",
"2011.11.MODIS_","2011.12.MODIS_")
# Final new names for converted files:
newnames <- vector()
for (i in 1:length(listofnewnames)) {
newnames[i] <- paste0(listofnewnames[i],filenamessuffix[i],".img")
}
# Loop converting files to raster from NetCDF
for (i in 1:length(rlist)) {
sds <- get_subdatasets(rlist[i])
gdal_translate(sds[1], dst_dataset = newnames[i])
}
By using R ill try to open my NetCDF data that contain 5 dimensional space with 15 variables. (variable for calculation is in matrix 1000X920 )
This problem actually look like the same with the other question before.
I got explanation from here and the others
At first I used RNetCDF package, but after some trial i found unconsistensy when the package read my data. And then finally better after used ncdf package.
there is no problem for opening data in a single file, but after ill try for looping in more than hundred data inside folder for a spesific variable (for example: var no 15) the program was failed.
> days = formatC(001:004, width=3, flag="0")
> ncfiles = lapply (days,
> function(d){ filename = paste("data",d,".nc",sep="")
> open.ncdf(filename) })
also when i try the command like this for a spesific variable
> sapply(ncfiles,function(file,{get.var.ncdf(file,"var15")})
so my question is, any solution to read all netcdf file with special variable then make calculation in one frame. From the solution before i was failed for generating the variable no 15 on whole netcdf data.
thanks for any solution to this problem.
UPDATE:
this is the last what i have done
when i write
library(ncdf)
files=list.files("allnc/",pattern='*nc',full.names=TRUE)
for(i in seq_along(files)) {
nc <- lapply(files[i],open.ncdf)
lw = get.var.ncdf(nc,"var15")
x=dim(lw)
rbind(df,data.frame(lw))->df
}
i can get all netcdf data by > nc
so i how i can get variable data with new name automatically like lw1,lw2...etc
i cant apply
var1 <- lapply(files, FUN = get.var.ncdf, variable = "var15")
then i can do calculation with all data.
the other technique i try used RNetCDF package n doing a looping
# Declare data frame
df=NULL
#Open all files
files= list.files("allnc/",pattern='*.nc',full.names=TRUE)
# Loop over files
for(i in seq_along(files)) {
nc = open.nc(files[i])
# Read the whole nc file and read the length of the varying dimension (here, the 3rd dimension, specifically time)
lw = var.get.nc(nc,'DBZH')
x=dim(lw)
# Vary the time dimension for each file as required
lw = var.get.nc(nc,'var15')
# Add the values from each file to a single data.frame
}
i can take a variable data but i just got one data from my all file nc.
note: sampe of my data name ( data20150102001.nc,data20150102002.nc.....etc)
This solution uses NCO, not R. You may use it to check your R solution:
ncra -v var15 data20150102*.nc out.nc
That is all.
Full documentation in NCO User Guide.
You can use the ensemble statistics capabilities of CDO, but note that on some systems the number of files is limited to 256:
cdo ensmean data20150102*.nc ensmean.nc
you can replace "mean" with the statistic of your choice, max, std, var, min etc...