Converting a raster to a csv in R

I wish to convert a raster to a csv file. I have tried to convert a raster to a dataframe on one file just to see if it works. I have tried using:
as.data.frame( rasterToPoints(species) )
but I get an error when I try to write "species" to a csv :
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class "structure("RasterLayer", package = "raster")" to a data.frame
This is my code (I need to convert multiple rasters to csv (see the loop))
#start loop
file.names <- dir(path, pattern=".csv")
for(i in 1:length(file.names)){
file <- read.csv(file.names[i], header = TRUE, stringsAsFactors=FALSE)
#subsetting each file and renaming column header names
sub.file<-subset(file, select = c('Matched.Scientific.Name', 'Vernacular.Name...matched', 'Latitude...processed', 'Longitude...processed'))
names(sub.file) <- c('species', 'name', 'Lat','Lon')
#turn into a SpatialPointsDataFrame
coordinates(sub.file) <- ~ Lon + Lat
proj4string(sub.file) <- '+init=EPSG:4326'
plot(sub.file, axes=TRUE)
#converting to BNG
sub.file.BNG <- spTransform(sub.file, '+init=EPSG:27700')
plot(sub.file.BNG, axes=TRUE)
#creating template raster
template <- raster(xmn=400000, xmx=600000, ymn=200000, ymx=800000, res=25000, crs='+init=EPSG:27700')
#point data > presence grid
species <- rasterize(sub.file.BNG, template, field=1)
plot(species)
# UK wide
template <- raster(xmn=-200000, xmx=700000, ymn=0, ymx=1250000, res=25000, crs='+init=EPSG:27700')
# use that to turn species point data into a presence grid
species <- rasterize(sub.file, template, field=1)
plot(species)
#converting a raster>dataframe>csv?????
as.data.frame( rasterToPoints(species) )
}

Always provide some example data when asking a question.
library(raster)
f <- system.file("external/test.grd", package="raster")
r <- raster(f)
To get the cell values
x <- as.data.frame(r)
head(x, 2)
# test
#1 NA
#2 NA
To get the cell coordinates and values, only for cells that are not NA
x <- rasterToPoints(r)
head(x, 2)
# x y test
#[1,] 181180 333740 633.686
#[2,] 181140 333700 712.545
To get the cell coordinates and values for all cells (including NA)
x <- cbind(coordinates(r), v=values(r))
head(x, 2)
# x y v
#[1,] 178420 333980 NA
#[2,] 178460 333980 NA
Whichever one you choose, you can then do
write.csv(x, "test.csv")
The mistake you made is that you did not assign the result of as.data.frame to a variable, and then tried to write the RasterLayer with write.csv. That is an error, and you get
write.csv(r)
#Error in as.data.frame.default(x[[i]], optional = TRUE) :
# cannot coerce class ‘structure("RasterLayer", package = "raster")’ to a
# data.frame
By the way, if you have multiple rasters, you may want to combine them first
s <- stack(r, r, r)
x <- rasterToPoints(s)
head(x, 2)
# x y test.1 test.2 test.3
#[1,] 181180 333740 633.686 633.686 633.686
#[2,] 181140 333700 712.545 712.545 712.545
write.csv(x, "test.csv")
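Applied to the loop in the question, the point is to keep the converted data frame in a variable and pass that variable to write.csv. A minimal sketch, assuming the raster package is installed; the tiny in-memory raster and the temporary output file are made up for illustration and stand in for "species" and your real csv path:

```r
library(raster)

# Hypothetical 2 x 2 raster with one NA cell, standing in for "species"
species <- raster(nrows = 2, ncols = 2, xmn = 0, xmx = 2, ymn = 0, ymx = 2)
values(species) <- c(1, NA, 1, 1)

# Keep the result in a variable: rasterToPoints() drops the NA cell
species_df <- as.data.frame(rasterToPoints(species))

# Write the data frame (not the RasterLayer) to a csv
out_file <- tempfile(fileext = ".csv")
write.csv(species_df, out_file, row.names = FALSE)

# Read it back to check the round trip
check <- read.csv(out_file)
```

Inside a loop, building a distinct output name per input (for example from file.names[i]) keeps each iteration from overwriting the previous csv.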

Assuming your raster is "species"
species<- raster("C:/.../species.tif")
To perform this conversion, you need three things for each pixel: the X coordinate (1), the Y coordinate (2) and the cell value (3).
# don't run these lines
#(1) = coordinates (species) [, 1]
#(2) = coordinates (species) [, 2]
#(3) = values (species)
Having these expressions we can add them to a dataframe as follows
dat <- data.frame("X"=coordinates(species)[,1], "Y"=coordinates(species)[,2], "Values"=values(species))

Related

Looped raster extractions from a SpatialPointsDataFrame; each row needs to extract from a corresponding raster indicated in a column

I am attempting to make a loop that extracts values from NDVI rasters for multiple shapefiles. The shapefiles are animal GPS locations that include a date and a time. The shapefiles also include random locations generated from the population's range, so I am assigning each random location that doesn't have a date a date from one of the GPS locations, in a 5:1 ratio. I have a function that finds the closest date to the date of each GPS location or random point and stores it in the data frame as "x$NDVIfile". The code all works up to this point, but I think the problem is here specifically:
for(j in length(nrow(x))){
a <- raster(paste0("E:/RSF_GIS/HabitatVariables/NDVI/",
x$NDVIfile[j], ".tif"))
x$ndvi[j] <- raster::extract(a, x[j,]) # extract each row based on the closest NDVI file
}
I want to extract from each row in my data the raster from the directory that corresponds to "x$NDVIfile[j]". My current outcome looks like this. All the extracted NDVI values are the same for each dataframe and I have the suspicion that the first raster referenced is the only raster being extracted from for each dataframe :
dataframe x
Here is the entire code :
RSF_dir <- list.files("E:/RSF_GIS/RSF_files",
pattern = "*.shp",
full.names = TRUE)
ndvi_dir = list.files("E:/RSF_GIS/HabitatVariables/NDVI",
pattern = "*.tif",
full.names = FALSE)
ndvi_dir.df <- tools::file_path_sans_ext(basename(ndvi_dir))
ndvi_dir.df <- as.Date(ndvi_dir.df)
ndvi_dir.df <- as.data.frame(ndvi_dir.df)
for (i in 1:length(RSF_dir)) {
x <- rgdal::readOGR(RSF_dir[i])
x <- as.data.frame(x)
nona <- x$Acqst_T[!is.na(x$Acqst_T)] # make a list of non- NA values
nona <- rep.int(nona, times = 6) # have the list repeat itself 6 times (5:1 = random_location:GPS_fix)
x$Acqst_T <- dplyr::coalesce(x$Acqst_T, nona) # and assign to NAs
x$date <- as.Date(x$Acqst_T, format = '%Y-%m-%d %H:%M:%S')
min_distances <- as.numeric(x$date)- matrix(rep(as.numeric(ndvi_dir.df$ndvi_dir),nrow(x)),ncol=length(ndvi_dir.df$ndvi_dir),byrow=T)
min_distances <- as.data.frame(t(min_distances))
closest <- sapply(min_distances,function(o) { # function to find the closest NDVI date for each GPS fix
w <- which(o==min(o[o>0])); # (MOD09Q1 collects imagery every 8 days)
ifelse(length(w)==0,NA,w)
})
x$NDVIfile <- as.Date(ndvi_dir.df$ndvi_dir[closest])
x <- SpatialPointsDataFrame(data.frame(x$coords.x1, x$coords.x2), x, proj4string=veg_INREV@proj4string)
for(j in length(nrow(x))){
a <- raster(paste0("E:/RSF_GIS/HabitatVariables/NDVI/",
x$NDVIfile[j], ".tif"))
x$ndvi[j] <- raster::extract(a, x[j,]) # extract each row based on the closest NDVI file
}
writeOGR(obj= x, dsn="E:/RSF_GIS/RSF_files/trial",
layer=(paste0(tools::file_path_sans_ext(basename(RSF_dir[i])))), driver="ESRI Shapefile", overwrite_layer = TRUE)
}
Thanks!
Answering my own question, the solution was simply a change in code indicating the iterations of the loop:
for(j in length(nrow(x))){
...
}
should have been instead
for(j in 1:nrow(x)){
...
}
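To see why the original loop header only ever touched one row, it helps to evaluate the pieces on their own. A small sketch in base R, using a made-up data frame in place of the SpatialPointsDataFrame (the 0.5 is a placeholder for an extracted NDVI value):

```r
# A small data frame standing in for x (hypothetical data)
x <- data.frame(id = 1:5)

nrow(x)            # 5
length(nrow(x))    # 1, because nrow() returns a single number

# The original loop header visits only one index
visited_wrong <- c()
for (j in length(nrow(x))) visited_wrong <- c(visited_wrong, j)

# seq_len() visits every row and is also safe when nrow(x) is 0
visited_right <- c()
for (j in seq_len(nrow(x))) visited_right <- c(visited_right, j)

# Assigning to element 1 of a column that does not exist yet creates a
# length-one value that is recycled to every row of the data frame
x$ndvi[1] <- 0.5
unique(x$ndvi)     # 0.5
```

For a plain data frame, that recycling is what makes every row show the same value, which matches the symptom described in the question.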

Access files from list in for loop

I want to calculate the share (%) of pixels classified as 1 from a list of files. For a single image the code works well, however, when I try to write it in a for loop R tells me named numeric(0) for all files.
How do I get what I want?
Single Image:
ras <- raster("path") # binary product
ras_df <- as.data.frame(ras) # creates data frame
ras_table <- table(ras_df$file) # creates table
share_suit_hab <- ras_table[names(ras_table)==1]/sum(ras_table[names(ras_table)]) # number of pixels with value 1 divided by sum of pixels with value 0 and 1 = share of suitable habitat (%)
print(share_suit_hab)
> ras
class : RasterLayer
dimensions : 1000, 1000, 1e+06 (nrow, ncol, ncell)
resolution : 2165.773, 2463.182 (x, y)
extent : -195054.2, 1970719, 2723279, 5186461 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
source : C:/Users/name/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/ct/2006/cis_SGRDREA_20060703_pl_a.tif
names : cis_SGRDREA_20060703_pl_a
values : 0, 1 (min, max)
For Loop:
list_ct <- list.dirs("path")
i=0
for(year in list_ct){
ct_files_list <- list.files(year, recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
ct_file_df <- as.data.frame(paste0("path", i, "/", ct_files_list))
ct_file_df <- as.data.frame(matrix(unlist(ct_file_df), nrow= length(unlist(ct_file_df[1]))))
ct_table <- table(ct_file_df[, 1])
stored <- ct_table[names(ct_table)==1]/sum(ct_table[names(ct_table)])
print(stored)
}
This is the final code which is running perfectly!
list_ct <- list.dirs("path", recursive = FALSE)
stored <- list()
for (year in seq_along(list_ct)){
ct_file_list <- list.files(list_ct[year], recursive=FALSE, pattern = ".tif$", full.names = FALSE)
tmp <- list()
for (i in seq_along(ct_file_list)){
ct_file_df <- raster(paste0(list_ct[year], "/", ct_file_list[i])) %>% as.data.frame()
# do calculations
tmp[[i]] <- sum(ct_file_df[,1], na.rm=TRUE) / length(ct_file_df[!is.na(ct_file_df)[],1])
names(tmp)[i] <- paste0(list_ct[year], "/", ct_file_list[i])
print(tmp[i])
}
stored[[year]] <- tmp
names(stored)[year] <- paste0(list_ct[year])
}
Could you add a reproducible example (data incl.)?
You probably need to replace numeric(0) with 0. numeric(0) does not mean 0; it means a numeric vector of length zero (i.e., empty). I'm guessing you're probably assigning numeric(0)+1, which is still a numeric vector of length zero.
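A short base R illustration of that behaviour, using a made-up vector with no 1s in it. The fallback to 0 at the end is one possible way to handle the empty case, not necessarily what you want for your data:

```r
# Counts of a vector that contains no 1s (hypothetical data)
tab <- table(c(0, 0, 0))

# Selecting a name that is absent returns a zero-length vector, not 0
ones <- tab[names(tab) == 1]
length(ones)            # 0

# Arithmetic on a zero-length vector stays zero-length
length(ones + 1)        # 0

# One way to fall back to 0 when the value is missing
share <- if (length(ones) == 0) 0 else as.numeric(ones) / sum(tab)
```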
Edit:
You have a folder containing multiple folders, each of which includes one or more tif files. You want to loop through each of these folders, import the tif file(s), do a calculation, and save the result.
In the following, my path contains 5 folders named '2006','2007','2008','2009' and '2010'. Each of these "year" folders contains an .xlsx file. Each .xlsx file contains 1 column (here, you just need to select the right one in your data frame). This column has the same name in all excel files, "col1", and contains values between 0 and 1. Then this will work:
library(dplyr)
library(readxl)
#
list_ct <- list.dirs("mypath", recursive = FALSE)
stored <- list()
for (year in seq_along(list_ct)){
ct_file_list <- list.files(list_ct[year], recursive=FALSE, pattern = ".xlsx$", full.names = FALSE)
tmp <- list()
for (i in seq_along(ct_file_list)){
ct_file_df <- read_excel(paste0(list_ct[year], "/", ct_file_list[i])) %>% as.data.frame()
# do calculations ..
tmp[[i]] <- sum(ct_file_df$col1) / length(ct_file_df$col1)
names(tmp)[i] <- paste0(list_ct[year], "/", ct_file_list[i])
print(tmp[i])
}
stored[[year]] <- tmp
names(stored)[year] <- paste0(list_ct[year])
}
Instead of using "read_excel", you just use raster() like you did with the single file. Hope you can use the answer.
Example data
library(raster)
s <- stack(system.file("external/rlogo.grd", package="raster"))
s <- s > 200
#plot(s)
If your actual data are all for the same area (and the raster data have the same extent and resolution), you can create a RasterStack (using the filenames) and use freq as below
f <- freq(s)
f
#$red
# value count
#[1,] 0 3975
#[2,] 1 3802
#$green
# value count
#[1,] 0 3915
#[2,] 1 3862
#$blue
# value count
#[1,] 0 3406
#[2,] 1 4371
Followed by
sapply(f, function(x) x[2,2]/sum(x[,2]))
# red.count green.count blue.count
# 0.4888775 0.4965925 0.5620419
If you cannot make a RasterStack you can make a list and lapply and continue as above, or use sapply and do this
ss <- as.list(s)
x <- sapply(ss, freq)
x[4,] / colSums(x[3:4, ])
#[1] 0.4888775 0.4965925 0.5620419
If you insist on a loop
res <- rep(NA, length(ss))
for (i in 1:length(ss)) {
# r <- raster(ss[i]) # if these were filenames
r <- ss[[i]] # here we extract from the list
x <- freq(r)[,2]
res[i] <- x[2] / sum(x)
}
res
# 0.4888775 0.4965925 0.5620419
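For a layer that only holds 0 and 1, the share of 1s is also just the mean of the non-NA values, which gives a quick cross-check on the count-based result. A sketch in base R, with a made-up vector standing in for the cell values of one layer (for a real layer you would get them with values(r)):

```r
# Hypothetical cell values of one binary layer, including an NA
v <- c(0, 1, 1, NA, 0, 1)

# Count-based share: table() leaves NA out by default
tab <- table(v)
share_table <- as.numeric(tab["1"]) / sum(tab)

# Mean-based share of cells equal to 1, ignoring NA
share_mean <- mean(v == 1, na.rm = TRUE)

share_table   # 0.6
share_mean    # 0.6
```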
Thank you!
This is working perfectly for all files of one year!
library(raster)
s_list <- list.files("C:/Users/OneDrive - wwfgermany/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/2006/", full.names = T)
s <- raster::stack(s_list)
f <- freq(s, useNA = 'no')
f
ct_avg <- sapply(f, function(x) x[2,2]/sum(x[,2]))
ct_avg__mean <- mean(ct_avg)
ct_avg__mean
However, when I want to write it in another loop, to get one value per year as a final result in the end, I end up with an error saying subscript out of bounds. This is the code I am using:
setwd("C:/Users/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/")
list_ct <- list.dirs("C:/Users/MASTERARBEIT/BASELINE/Eastern Arctic/Summer_EA_Output/area_calc/ct/")
i=0
for (year in list_ct) {
s_list <- list.files(year, recursive = FALSE, pattern = "\\.tif$", full.names = FALSE)
s <- raster::stack(s_list)
f <- freq(s, useNA = 'no')
f
ct_avg <- sapply(f, function(x) x[2,2]/sum(x[,2]))
ct_avg__mean <- mean(ct_avg)
ct_avg__mean
}

R: Creating a new raster/shape file from calculations done using variables from a shape file and a raster

I am currently trying to create a new raster or shape file based on a conditional calculation that needs to be done over every value in a shape value based on a value in a raster file. I don't usually work with raster and shape files, so I am pretty out of my element here. I'm asking this in general terms, but here is the data I am using so hopefully it will give a better understanding of what I am trying to accomplish:
rast_norm <- ftp://prism.nacse.org/normals_4km/tmean/PRISM_tmean_30yr_normal_4kmM2_04_bil.zip
shp_probs <- ftp://ftp.cpc.ncep.noaa.gov/GIS/us_tempprcpfcst/seastemp_201603.zip
The main objective is to take the probability associated with each point (latitude and longitude) in shp_probs and multiply it by the value that corresponds to the same latitude and longitude in rast_norm, along with some other calculations afterward. If I had two data.tables, I could do something like the following:
dt1 <- data.table(col1 = c(0:3), col2 = c(1:4)*11, factor1 = sqrt(c(285:288)))
# # Output # #
# col1 col2 factor1
# 0 11 16.88194
# 1 22 16.91153
# 2 33 16.94107
# 3 44 16.97056
dt2 <- data.table(col1 = c(0:3), col2 = c(1:4)*11, factor2 = abs(sin(c(1:4))))
# # Output # #
# col1 col2 factor2
# 0 11 0.8414710
# 1 22 0.9092974
# 2 33 0.1411200
# 3 44 0.7568025
dt3 <- merge(dt1, dt2, by = c("col1", "col2"))
dt3$factor1 <- dt3$factor1 * dt3$factor2
dt3$factor2 <- NULL
# # Output # #
# col1 col2 factor1
# 0 11 14.205665
# 1 22 15.377615
# 2 33 2.390725
# 3 44 12.843364
Easy-peasy using data tables. But I am at a loss trying to do this with a Raster and a SpatialPolygonsDataFrame. Here's what I have so far to read in and clean up the files:
# Importing the "rast_norm" file, the first listed above with a link
rast_norm <- "/my/file/path/PRISM_tmean_30yr_normal_4kmM2_04_bil.zip"
zipdirec <- "/my/zip/directory"
unzip(rast_norm, exdir = zipdirec)
# Get the correct file from the file list
rast_norm <- list.files(zipdirec, full.names = TRUE, pattern = ".bil")
rast_norm <- rast_norm[!grepl("\\.xml", rast_norm)]
# Convert to raster
rast_norm <- raster(rast_norm)
Plotting rast_norm on its own gives this map.
# Importing the "shp_probs" file, the second listed above with a link
shp_probs <- "/my/file/path/seastemp_201603.zip"
zipdirec <- "/my/zip/directory"
unzip(shp_probs, exdir = zipdirec, overwrite = TRUE)
# Get the correct file from the list of file names and find the layer name
layer_name <- list.files(zipdirec, pattern = "lead14")
layer_name <- layer_name[grepl(".shp", layer_name)]
layer_name <- layer_name[!grepl("\\.xml", layer_name)]
layer_name <- do.call("rbind", strsplit(layer_name, "\\.shp"))[,1]
layer_name <- unique(layer_name)
# Use the layer name to read in the shape file
shp_probs <- readOGR(shp_probs, layer = layer_name)
names_levels <- paste0(shp_probs$Cat, shp_probs$Prob)
names_levels <- gsub("Below", "-", names_levels)
names_levels <- gsub("Above", "+", names_levels)
names_levels <- as.integer(names_levels)
shp_probs@data$id <- names_levels
shp_probs <- as(shp_probs, "SpatialPolygons")
# Create a data frame of values to use in conjunction with the existing id's
weights <- data.table(id = shp_probs$id, weight = shp_probs$id)
weights$weight <- c(.80, .80, .10, .10, .10, .10, .10, .10, .80, .10, .10, .10, .10, .10)
shp_probs <- SpatialPolygonsDataFrame(otlk_sp, weights, match.ID = FALSE)
Plotting shp_probs on its own gives this map.
I now want to take the probabilities that are associated with the shp_probs file and multiply it by the amounts of rainfall associated with the rast_norm file and multiply again by the weight associated with the probability in the shp_probs file.
I really don't know what to do and any help would be very much appreciated. How do I extract all of the corresponding data points for matching latitudes and longitudes? I think if I know that, I will know what to do.
Thank you, in advance.
Assuming that you want to perform this calculation for each grid cell of your raster, you can do something like this:
Download/read data, and add weight column. Note that here I've just used random weights, since your example seems to assign 14 weights to 7 polygons. Also, I'm not sure what purpose your id column serves, so I've skipped that part.
library(raster)
library(rgdal)
download.file('ftp://prism.nacse.org/normals_4km/tmean/PRISM_tmean_30yr_normal_4kmM2_04_bil.zip',
fr <- tempfile(), mode='wb')
download.file('ftp://ftp.cpc.ncep.noaa.gov/GIS/us_tempprcpfcst/seastemp_201603.zip',
fs <- tempfile(), mode='wb')
unzip(fr, exdir=tempdir())
unzip(fs, exdir=tempdir())
r <- raster(file.path(tempdir(), 'PRISM_tmean_30yr_normal_4kmM2_04_bil.bil'))
s <- readOGR(tempdir(), 'lead14_Apr_temp')
s$weight <- runif(length(s))
Perform spatial overlay of the coordinates of the raster cells and the polygons. (Alternatively, you could use raster::rasterize twice to convert the Prob and weight fields to rasters, and then multiply the three rasters.)
xy <- SpatialPoints(coordinates(r), proj4string=crs(r))
o <- over(xy, s)
Create a new raster with the same extent/dimensions as the original raster, and assign the appropriate values to its cells.
r2 <- raster(r)
r2[] <- r[] * o$Prob * o$weight
With these random data, the result looks something like this:

create netcdf in R with multiple variables

I want to create a netcdf with 11 or more different variables. For that I was trying to write it within a loop, but it is not working. My code would be something like that:
#Defining names and dimensions
nam_cwt <- c("N","NE","E","SE","S","SW","W","NW","C","A","U") #number of variables
CWTvar <- paste("var_",nam_cwt,sep="")
data_cwt <- list()
mat_cwt <- array()
dimX <- dim.def.ncdf("longitude", "degrees_east", Longvector)
dimY <- dim.def.ncdf("latitude", "degrees_north", Latvector)
dimT <- dim.def.ncdf("time","days since 1961-01-01",1:length(my.date), unlim=TRUE)
missval <- -999
#Creating and filling the netcdf file
for (i in 1:length(nam_cwt)){
#Getting every matrix of elements
data_cwt <- lapply(cwt_out,function(x) x[[1]][[i]][[2]])
dmatrix <- unlist(data_cwt)
mat_cwt <- array(dmatrix,dim=c(144,length(my.date),25))
tmat_cwt <- aperm(mat_cwt,c(1,3,2))
CWTvar[[i]] <- var.def.ncdf(nam_cwt[i],"days",list(dimX,dimY,dimT), ,missval,longname=nam_cwt[i])
ncfile <- create.ncdf("nctypes.nc",CWTvar)
put.var.ncdf(ncfile,CWTvar[i],tmat_cwt)
}
The problem is that I am not sure whether I should use var.add.ncdf (instead of put.var.ncdf).
Any ideas?
How can I create and write the file within the loop??
Any help will be helpful!
Here's how I would do this: create a netcdf file first with one of the variables, then add and fill the others in a loop.
library(ncdf)
#Defining names and dimensions
nam_cwt <- c("N","NE","E","SE","S","SW","W","NW","C","A","U") #number of variables
CWTvar <- paste("var_",nam_cwt,sep="")
# values and length of dimensions
Longvector = -180:180
Latvector = -90:90
Datevector = 1:10
x= length(Longvector)
y= length(Latvector)
z= length(Datevector)
# define dimensions
dimX <- dim.def.ncdf("longitude", "degrees_east", Longvector)
dimY <- dim.def.ncdf("latitude", "degrees_north", Latvector)
dimT <- dim.def.ncdf("time","days since 1961-01-01",1:length(Datevector), unlim=TRUE)
# set missing value
missval <- -9999
# create the file with first variable so dimensions are set
CWTvar1 <- var.def.ncdf(name=nam_cwt[1],"days",list(dimX,dimY,dimT), missval=missval)
ncfile <- create.ncdf("nctypes.nc",vars=CWTvar1)
# open newly created file for writing data
mync = open.ncdf(con='nctypes.nc', write=T)
# create some data for first variable
mydata = array(data=runif(n=x*y*z, min=0, max=10), dim=c(x,y,z))
# add data to ncdf file
put.var.ncdf(nc=mync, varid='N', vals=mydata)
# now add all other variables with the same dimensions
for (i in 2:length(nam_cwt)){
# generate new data
mydata = array(data=runif(n=x*y*z, min=0, max=10)*i, dim=c(x,y,z))
# create new variable
CWTvar <- var.def.ncdf(name=nam_cwt[i],"days",list(dimX,dimY,dimT) ,missval=missval)
# add new variable to existing file
mync = var.add.ncdf(nc=mync, v=CWTvar)
# add data to variable in file
put.var.ncdf(nc=mync,varid=CWTvar$name,vals=mydata)
}
close.ncdf(mync)
# check file
newnc = open.ncdf(con='nctypes.nc')
par(mfrow=c(4,3))
for (i in 1:length(nam_cwt)){
zz = get.var.ncdf(newnc, varid=nam_cwt[i])
image(x=Longvector, y=Latvector, z=zz[,,1])
}
close.ncdf(newnc)

R Crop no-data of a raster

I would like to crop the no-data part of some rasters (example of the image in 1 where no-data is in black) without defining the extent manually.
Any idea?
You can use trim to remove exterior rows and columns that only have NA values:
library(raster)
r <- raster(ncols=18,nrows=18)
r[39:49] <- 1
r[205] <- 6
s <- trim(r)
To change other values to or from NA you can use reclassify. For example, to change NA to 0:
x <- reclassify(r, cbind(NA, 0))
[ subsetting and [<- replacement methods are defined for raster objects so you can simply do r[ r[] == 1 ] <- NA to get rid of the values where 1 is your nodata value (use NAvalue(r) to find out what R considers your nodata value is supposed to be if you aren't sure).
Note you have to use r[] inside the [ subsetting command to access the values. Here is a worked example...
Example
# Make a raster from system file
logo1 <- raster(system.file("external/rlogo.grd", package="raster"))
# Copy to see difference
logo2 <- logo1
# Set all values in logo2 that are > 230 to be NA
logo2[ logo2[] > 230 ] <- NA
# Observe difference
par( mfrow = c( 1,2 ) )
plot(logo1)
plot(logo2)
I have 2 slightly different solutions. The first requires manually identifying the extent but uses predefined functions. The second is more automatic, but a bit more handmade.
Create a reproducible raster for which the first 2 rows are NA
library(raster)
# Create a reproducible example
r1 <- raster(ncol=10, nrow=10)
# The first 2 rows are filled with NAs (no value)
r1[] <- c(rep(NA,20),21:100)
Solution #1
Manually get the extent from the plotted figure using drawExtent()
plot(r1)
r1CropExtent <- drawExtent()
Crop the raster using the extent selected from the figure
r2 <- crop(r1, r1CropExtent)
Plot for comparison
layout(matrix(1:2, nrow=1))
plot(r1)
plot(r2)
Solution #2
It identifies the rows and columns of the raster that only have NA values and removes the ones that are on the margin of the raster. It then calculates the extent using extent().
Transform the raster into a matrix that identifies whether the values are NA or not.
r1NaM <- is.na(as.matrix(r1))
Find the columns and rows that are not completely filled by NAs
colNotNA <- which(colSums(r1NaM) != nrow(r1))
rowNotNA <- which(rowSums(r1NaM) != ncol(r1))
Find the extent of the new raster by using the first and last columns and rows that are not completely filled by NAs. Use crop() to crop the new raster.
r3Extent <- extent(r1, rowNotNA[1], rowNotNA[length(rowNotNA)],
colNotNA[1], colNotNA[length(colNotNA)])
r3 <- crop(r1, r3Extent)
Plot the rasters for comparison.
layout(matrix(1:2, nrow=1))
plot(r1)
plot(r3)
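The row and column bookkeeping in Solution #2 can be checked without any raster at all. A minimal sketch in base R, assuming a plain matrix that mimics as.matrix(r1) (10 x 10, first two rows NA); the variable names other than rowNotNA and colNotNA are made up here:

```r
# Plain matrix standing in for as.matrix(r1): 10 x 10, first two rows NA
m <- matrix(c(rep(NA, 20), 21:100), nrow = 10, ncol = 10, byrow = TRUE)

# TRUE where a cell is NA
naM <- is.na(m)

# Rows and columns that contain at least one non-NA value
rowNotNA <- which(rowSums(naM) != ncol(m))
colNotNA <- which(colSums(naM) != nrow(m))

# First and last row/column to keep
rowRange <- range(rowNotNA)   # 3 10
colRange <- range(colNotNA)   # 1 10

# Cropped matrix: the two all-NA rows at the top are gone
cropped <- m[rowRange[1]:rowRange[2], colRange[1]:colRange[2], drop = FALSE]
dim(cropped)                  # 8 10
```

Using only the first and last non-NA row and column means all-NA rows or columns in the interior are kept; only the margins are cut, which is the same behaviour as the extent() call above.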
I have written a small function based on Marie's answer to quickly plot cropped rasters. However, there may be a memory issue if the raster is extremely large, because the computer may not have enough RAM to load the raster as a matrix.
I therefore wrote a memory safe function which will use Marie's method if the computer has enough RAM (because it is the fastest way), or a method based on raster functions if the computer does not have enough RAM (it is slower but memory-safe).
Here is the function:
plotCroppedRaster <- function(x, na.value = NA)
{
if(!is.na(na.value))
{
x[x == na.value] <- NA
}
if(canProcessInMemory(x, n = 2))
{
x.matrix <- is.na(as.matrix(x))
colNotNA <- which(colSums(x.matrix) != nrow(x))
rowNotNA <- which(rowSums(x.matrix) != ncol(x))
croppedExtent <- extent(x,
r1 = rowNotNA[1],
r2 = rowNotNA[length(rowNotNA)],
c1 = colNotNA[1],
c2 = colNotNA[length(colNotNA)])
plot(crop(x, croppedExtent))
} else
{
xNA <- is.na(x)
colNotNA <- which(colSums(xNA) != nrow(x))
rowNotNA <- which(rowSums(xNA) != ncol(x))
croppedExtent <- extent(x,
r1 = rowNotNA[1],
r2 = rowNotNA[length(rowNotNA)],
c1 = colNotNA[1],
c2 = colNotNA[length(colNotNA)])
plot(crop(x, croppedExtent))
}
}
Examples :
library(raster)
r1 <- raster(ncol=10, nrow=10)
r1[] <- c(rep(NA,20),21:100)
# Uncropped
plot(r1)
# Cropped
plotCroppedRaster(r1)
# If the no-data value is different, for example 0
r2 <- raster(ncol=10, nrow=10)
r2[] <- c(rep(0,20),21:100)
# Uncropped
plot(r2)
# Cropped
plotCroppedRaster(r2, na.value = 0)
If you use the rasterVis package (any version after Jun 25, 2021), it will automatically crop the NA values out for terra's SpatRaster
Install rasterVis development version from GitHub
if (!require("librarian")) install.packages("librarian")
librarian::shelf(raster, terra, oscarperpinan/rastervis)
# Create a reproducible example
r1 <- raster(ncol = 10, nrow = 10)
# The first 2 rows are filled with NAs (no value)
r1[] <- c(rep(NA, 20), 21:100)
levelplot() for r1
rasterVis::levelplot(r1,
margin = list(axis = TRUE))
Convert to terra's SpatRaster then plot again using levelplot()
r2 <- rast(r1)
rasterVis::levelplot(r2,
margin = list(axis = TRUE))
Created on 2021-06-26 by the reprex package (v2.0.0)

Resources