I'm trying to reclassify several raster files with R.
My raster files contain species ranges: every cell where the species occurs holds the value of the total range, and all other cells are NoData. Each file is named after the rank of its species (1, 2, 3, ...).
I want to replace the values of the cells where the species occurs with the rank.
I tried calling the reclassify function in a for loop, without success.
Thanks in advance!
Let's say you have one raster representing the presence/absence of a single species. You also know the area extent and the range (expressed as occurrences) of this species. Every cell where the species is present has been filled with the total number of occurrences.
Let's say this species was detected in 50 out of 400 cells.
The species occurrence is then expressed as presence in 50 random cells:
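library(raster)
e1 <- extent(0, 10, 0, 10)
r1 <- raster(e1)
res(r1) <- 0.5                                     # 20 x 20 = 400 cells
r1[1:ncell(r1)] <- NA                              # start with NoData everywhere
r1[sample(ncell(r1), 50, replace = FALSE)] <- 50   # 50 occupied cells, each holding the total range
plot(r1)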
But you may have this rasterLayer as an image file, stored on your disk and named after the rank of the species.
There are several ways to replace values in a rasterLayer. If for this particular species the rank is 1, you can replace the range by rank with
r1[!is.na(r1)] <- 1
If you want to loop over your file folder, try this:
library(raster)
wdata <- '.../R/Stackoverflow/21876858' # your local folder
f.reclass <- function(folder = wdata){
  files <- list.files(folder, all.files = FALSE) # list all files in the folder
  # Assuming TIF images; change the pattern below if using any other format
  ltif <- grep("\\.tif$", files, ignore.case = TRUE, value = TRUE)
  stkl <- stack()
  for(i in seq_along(ltif)){
    x <- raster(file.path(folder, ltif[i]))
    # The rank is the file name without its extension (e.g. "3.tif" gives 3)
    rank <- suppressWarnings(as.numeric(tools::file_path_sans_ext(ltif[i])))
    stopifnot(!is.na(rank)) # stop if the file name is not a number
    x[!is.na(x)] <- rank    # replace the range value by the rank
    stkl <- addLayer(stkl, x)
  }
  stkl
}
spreclass <- f.reclass(wdata)
You'll get a rasterStack object with your reclassified rasterLayers. You can unstack them, export them, manipulate them...
Related
I have large monthly NetCDF data for many years where each .nc file has 8 layers. I want to calculate the median value of each month in each year using the first layer in each .nc file only.
I have done the for loop as:
library(raster)
library(ncdf4)
setwd("path")
# go to all sub-folders
sub <- list.dirs(full.names=FALSE, recursive=FALSE)
# read and create a matrix of file names
fn <- list.files(path=sub, recursive=TRUE, full.names=TRUE, pattern="*.*.nc$")
fn.mat <- matrix(fn, nrow = 155)
# select nc. files
mid <- c(1, 14, seq(26, 155, 13))
# subset the matrix of file names
fn.mat.sub <- fn.mat[mid, ]
# create indices for stackApply function to calculate the median using the first layers
layers <- rep(1:8, 113)
# loop through the whole file name matrix
ls <- list() # create empty list to store the output
for (i in 1:ncol(fn.mat.sub)) {
for (ii in 1:nrow(fn.mat.sub)) {
s <- stack(fn.mat.sub[ii, -i]) # exclude the month of the year that wants to calculate the median for cross-validation
m <- stackApply(s, indices = layers, fun = median, na.rm = T)
ls[[length(ls)+1]] <- m[[1]]
}
}
But the processing is extremely slow. I know R doesn’t like “for-loop” and the “apply” family of functions (such as apply, lapply, sapply, vapply, etc.) is often used instead. But I don’t know how to use it in this case. I believe there are better ways to reduce the processing time here.
Any ideas to speed up this processing? Thanks!
I have two raster files (values range from 0 to 1) and I want to find the difference between them. The problem is that some values are missing, so I want to assign the value 1 to them (i.e. NA = 1). How can I do this? Thanks.
My code is this:
library(raster)
R1 <- raster ("D:/Results/1.tiff")
R2 <- raster ("D:/Results/2.tiff")
Se1= R2-R1
plot(Se1)
How large are your raster files, and how limited are you by memory? With raster, a memory-safe approach when working with large files is the reclassify function, shown below. Let me know if it works.
# Load the package
library(raster)
# Read in files
R1 <- raster("D:/Results/1.tiff")
R2 <- raster("D:/Results/2.tiff")
# Use the reclassify function to map values to other values.
# In this case, NA values become 1. Reclassification is done with the matrix rcl,
# in the row order of the reclassify table.
D1 <- reclassify(R1, cbind(NA, 1))
D2 <- reclassify(R2, cbind(NA, 1))
# Find the difference between the two reclassified rasters and plot it.
Se1 <- D2 - D1
plot(Se1)
Here is how you may do that with "terra" (the replacement of "raster")
library(terra)
R <- rast(paste0("D:/Results/", 1:2, ".tiff"))
R <- subst(R, NA, 1)
Se1 <- diff(R)
I have a list of rasters (.tif format) for multiple years. They are 16-day NDVI images from Landsat. I want to make a monthly NDVI (the average of two consecutive rasters) and save it in the same or a different directory as a monthly average.
I listed the rasters and stacked them, then used stackApply to calculate the mean, but it produces empty rasters. I have 23 images for a single year, which I want to average into 12 months. This is what my raster files look like:
"landsatNDVISC05SLC2000001.tif" "landsatNDVISC05SLC2000017.tif"
"landsatNDVISC05SLC2000033.tif" "landsatNDVISC05SLC2000049.tif"
"landsatNDVISC05SLC2000065.tif" "landsatNDVISC05SLC2000081.tif"
"landsatNDVISC05SLC2000097.tif" "landsatNDVISC05SLC2000113.tif"
"landsatNDVISC05SLC2000129.tif" "landsatNDVISC05SLC2000145.tif"
"landsatNDVISC05SLC2000161.tif" "landsatNDVISC05SLC2000177.tif"
"landsatNDVISC05SLC2000193.tif" "landsatNDVISC05SLC2000209.tif"
"landsatNDVISC05SLC2000225.tif" "landsatNDVISC05SLC2000241.tif"
"landsatNDVISC05SLC2000257.tif" "landsatNDVISC05SLC2000273.tif"
"landsatNDVISC05SLC2000289.tif" "landsatNDVISC05SLC2000305.tif"
"landsatNDVISC05SLC2000321.tif" "landsatNDVISC05SLC2000337.tif"
"landsatNDVISC05SLC2000353.tif"
This code runs, but it produces more than twelve rasters and they are empty. I also want to save each layer of the resulting brick as a separate monthly raster.
library(raster)
lrast<-list.files("G:/LANDSAT-NDVI/testAverage")
layers<-paste("landsatNDVISC05SLC2000", seq(from=001, to=353,by=16))
stak<-stack(lrast)
raster<-stackApply(stak, layers, fun = mean)
I want to make a monthly average from landsatNDVISC05SLC2000001.tif and landsatNDVISC05SLC2000017.tif and save it as landsatNDVISC05SLC2000M1.tif, then do the same for 033 and 049, and so on. Since I only have 23 rasters, I want to keep landsatNDVISC05SLC2000353.tif on its own as landsatNDVISC05SLC2000M12.tif.
I'm not sure how stackApply works, but something like this should do what you need.
library(raster)
files <- list.files(path = "...", full.names = TRUE, pattern = "\\.tif$")
stk <- stack()
for (f in files){
  print(f)
  lyr <- raster(f)          # 'f' is already the file path
  stk <- addLayer(stk, lyr)
}
# Extract the day of year: the three digits before ".tif" (characters 23 to 25 of the file name).
# Taking them from 'files' gives exactly one date per layer, including day 001.
jday <- as.numeric(substr(basename(files), 23, 25))
dates <- as.Date(jday - 1, origin = as.Date("2000-01-01")) # day 1 is 1 January, hence the - 1
stk <- setZ(stk, dates) # assign the date vector to the raster stack
monthly <- zApply(stk, by = format(dates, "%Y-%m"), fun = mean, na.rm = TRUE) # create the monthly stack
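Grouping by calendar month is not quite the same as the pairwise averaging described in the question (images 1 and 2 into month 1, images 3 and 4 into month 2, and image 23 alone as month 12). A minimal sketch of the index vector for that pairing follows. It uses base R only; the names `month_index` and `out_names` are made up for illustration, and the stackApply call in the comment assumes a stack `stk` holding the 23 layers in date order.

```r
# Hypothetical sketch: pair consecutive 16-day images into 12 groups.
# Image 23 has no partner and ends up alone in group 12.
n_images <- 23
month_index <- ceiling(seq_len(n_images) / 2)   # 1 1 2 2 ... 11 11 12

# Output names following the pattern requested in the question
out_names <- paste0("landsatNDVISC05SLC2000M", sort(unique(month_index)), ".tif")

# With the raster package loaded and `stk` holding the 23 layers in date order,
# the grouping could be applied like this (not run here):
# monthly <- stackApply(stk, indices = month_index, fun = mean, na.rm = TRUE)
```

The indices passed to stackApply must be one integer per layer; passing a character vector of file names, as in the question, does not define twelve groups.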
I have a netCDF file where there are
5 dimensions (a,b,c,d,e)
100 variables (x1,...,x100)
with the variables having different combinations of dimensions.
outx <- nc_open('testH.nc') #Open the NetCDF file
varx <- names(outx[['var']]) #Get all the variables
And I read all the variables using this:
t <- sapply(varx, function(x) ncvar_get(outx, x), USE.NAMES = TRUE)
But, I only want to read variables corresponding to x and y dimensions together.
How can I do this?
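One possible approach, offered as a sketch rather than a tested answer: inspect the dimension names stored with each variable and keep only the variables that use all the wanted dimensions. It assumes the usual layout of an ncdf4 object, where `nc$var[[v]]$dim` is a list of dimension objects that each carry a `$name`. The helper `vars_with_dims` and the dimension names "a" and "b" are hypothetical.

```r
# Hypothetical helper: names of the variables whose dimensions include all of `wanted`.
# Assumes nc$var[[v]]$dim is a list of dimension objects, each with a $name element.
vars_with_dims <- function(nc, wanted) {
  has_all <- vapply(nc$var, function(v) {
    dim_names <- vapply(v$dim, function(d) d$name, character(1))
    all(wanted %in% dim_names)
  }, logical(1))
  names(nc$var)[has_all]
}

# Usage with the objects from the question ("a" and "b" are placeholder dimension names):
# keep <- vars_with_dims(outx, c("a", "b"))
# t <- sapply(keep, function(x) ncvar_get(outx, x), USE.NAMES = TRUE)
```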
I have two data sets with latitude, longitude, and temperature data. One data set corresponds to a geographic region of interest with the corresponding lat/long pairs that form the boundary and contents of the region (Matrix Dimension = 4518x2)
The other data set contains lat/long and temperature data for a larger region that envelops the region of interest (Matrix Dimension = 10875x3).
My question is: How do you extract the appropriate row data (lat, long, temperature) from the 2nd data set that matches the first data set's lat/long data?
I've tried a variety of "for loops," "subset," and "unique" commands but I can't obtain the matching temperature data.
Thanks in advance!
10/31 Edit: I forgot to mention that I'm using "R" to process this data.
The lat/long data for the region of interest was provided as a list of 4,518 files containing the lat/long coordinates in the name of each file:
x<- dir()
lenx<- length(x)
g <- strsplit(x, "_")
coord1 <- matrix(NA,nrow=lenx, ncol=1)
coord2 <- matrix(NA,nrow=lenx, ncol=1)
for(i in 1:lenx) {
coord1[i,1] <- unlist(g)[2+3*(i-1)]
coord2[i,1] <- unlist(g)[3+3*(i-1)]
}
coord1<-as.numeric(coord1)
coord2<-as.numeric(coord2)
coord<- cbind(coord1, coord2)
The lat/long and temperature data were obtained from an NCDF file with temperature data for 10,875 lat/long pairs:
long<- tempcd$var[["Temp"]]$size[1]
lat<- tempcd$var[["Temp"]]$size[2]
time<- tempcd$var[["Temp"]]$size[3]
proj<- tempcd$var[["Temp"]]$size[4]
temp<- matrix(NA, nrow=lat*long, ncol = time)
lat_c<- matrix(NA, nrow=lat*long, ncol=1)
long_c<- matrix(NA, nrow=lat*long, ncol =1)
counter<- 1
for(i in 1:lat){
for(j in 1:long){
temp[counter,]<-get.var.ncdf(tempcd, varid= "Temp", count = c(1,1,time,1), start=c(j,i,1,1)) # read from the temperature file opened as tempcd
counter<- counter+1
}
}
temp_gcm <- cbind(lat_c, long_c, temp)
So now the question is: how do you extract the rows of "temp_gcm" that correspond to the lat/long pairs in "coord"?
Noe,
I can think of a number of ways you could do this. The simplest, albeit not the most efficient, would be to use R's which() function, which takes a logical argument, while iterating over the rows you want to find matches for. Of course, this assumes that there is at most a single match in the larger data set. Based on your data sets, I would do something like this:
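# Pull the columns out by position: attach() only works on data frames and lists,
# and temp_gcm and coord are matrices. Assumes column 3 of temp_gcm holds the temperature.
lat_c <- temp_gcm[, 1]; long_c <- temp_gcm[, 2]; temp <- temp_gcm[, 3]
coord1 <- coord[, 1]; coord2 <- coord[, 2]
matched.temp = vector(length = nrow(coord)) # To store matching results
for (i in seq_len(nrow(coord))) {  # one iteration per row of coord
  # Fails if a row of coord has no match, so exactly one match per row is assumed
  matched.temp[i] = temp[which(lat_c == coord1[i] & long_c == coord2[i])]
}
# Now add the results column to coord as a data frame (indexes match)
coord <- data.frame(coord, temperature = matched.temp)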
The function which(lat_c == coord1[i] & long_c == coord2[i]) returns a vector of all rows in the dataframe temp_gcm which satisfy lat_c and long_c matching coord1 and coord2 respectively from row i in the iteration (NOTE: I'm assuming this vector will only have length 1, i.e. there is only 1 possible match). matched.temp[i] will then be assigned the value from the column temp in the dataframe temp_gcm which satisfied the logical condition. Note that the goal in doing this is that we create a vector which has matched values that correspond by index to the rows of the dataframe coord.
I hope this helps. Note that this is a rudimentary approach, and I would advise looking up the function merge() as well as apply() to do this in a more succinct manner.
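As a rough illustration of the merge() route, here is a minimal sketch on toy data. The values are made up, and the column names (`coord1`, `coord2`, `lat_c`, `long_c`, `temp`) are assumed to mirror the objects discussed above once they have been converted to data frames.

```r
# Toy stand-ins for the two data sets (values are made up for illustration)
coord <- data.frame(coord1 = c(40.5, 41.0),
                    coord2 = c(-105.0, -104.5))
temp_gcm <- data.frame(lat_c  = c(40.0, 40.5, 41.0, 41.5),
                       long_c = c(-105.5, -105.0, -104.5, -104.0),
                       temp   = c(10.1, 11.2, 12.3, 13.4))

# Inner join: keep only the rows of temp_gcm whose lat/long pair appears in coord
matched <- merge(coord, temp_gcm,
                 by.x = c("coord1", "coord2"),
                 by.y = c("lat_c", "long_c"))
print(matched)
```

Note that merge() matches the key values exactly, so coordinates that differ by rounding between the two sources would not be paired; rounding both sides to a common precision beforehand may be needed.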
I added an additional column full of zeros to use as the resultant for an IF statement. "x" is the number of rows in temp_gcm. "y" is the number of columns (representative of time steps). "temp_s" is the standardized temperature data
indicator<- matrix(0, nrow = x, ncol = 1)
precip_s<- cbind(precip_s, indicator)
temp_s<- cbind(temp_s, indicator)
for(aa in 1:x){
current_lat<-latitudes[aa,1] #Latitudes corresponding to larger area
current_long<- longitudes[aa,1] #Longitudes corresponding to larger area
for(ab in 1:lenx){ #lenx corresponds to nrow(coord)
if(current_lat == coord[ab,1] & current_long == coord[ab,2]) {
precip_s[aa,(y/12+1)]<-1 #y/12+1 corresponds to "indicator column"
temp_s[aa,(y/12+1)]<-1
}
}
}
precip_s<- precip_s[precip_s[,(y/12+1)]>0,] #Removes rows with "0"s remaining in "indicator" column
temp_s<- temp_s[temp_s[,(y/12+1)]>0,]
precip_s<- precip_s[,-(y/12+1)] #Removes "indicator" column
temp_s<- temp_s[,-(y/12+1)]
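The same row selection can probably be done without the nested loops or the indicator column. The sketch below, on made-up toy data, builds one text key per lat/long pair and tests membership in a single vectorized step. The object names follow the code above, but the values and the helper names `key_all`, `key_region`, `keep`, and `temp_sub` are hypothetical.

```r
# Toy stand-ins (made-up values): 4 locations in the larger area, 2 in the region
latitudes  <- matrix(c(40.0, 40.5, 41.0, 41.5), ncol = 1)
longitudes <- matrix(c(-105.5, -105.0, -104.5, -104.0), ncol = 1)
temp_s     <- matrix(1:8, nrow = 4)            # 4 locations x 2 time steps
coord      <- cbind(c(40.5, 41.5), c(-105.0, -104.0))

# Build one text key per lat/long pair and test membership in one step
key_all    <- paste(latitudes[, 1], longitudes[, 1])
key_region <- paste(coord[, 1], coord[, 2])
keep       <- key_all %in% key_region

# Rows of the larger data set that fall inside the region of interest
temp_sub <- temp_s[keep, , drop = FALSE]
```

This replaces roughly `x * lenx` comparisons in interpreted loops with one hashed lookup, and `drop = FALSE` keeps the result a matrix even when only one row matches.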