I need to take a basename from a file path and insert it into a variable so I can access a column in a dataframe. I have created some sample data to illustrate what I am trying to accomplish.
Create some sample data:
library(raster)
## Create a matrix with random data & use image()
xy = matrix(rnorm(400),20,20)
image(xy)
# Turn the matrix into a raster
rast = raster(xy)
# Give it lat/lon coords for 36-37°E, 3-2°S
extent(rast) = c(36,37,-3,-2)
# ... and assign a projection
projection(rast) = CRS("+proj=longlat +datum=WGS84")
plot(rast)
# Write to disk:
writeRaster(rast, "C:/temp/12345.tif", format = "tif")
Create a path to raster
path = 'C:/temp/12345.asc
Create a raster object:
r = raster(file)
Sample random locations in raster and report values in a dataframe
df = data.frame(sampleRandom(r, size=1000, cells=TRUE, sp=TRUE))
Now I need to automate the insertion of the basename into a variable so that it looks like:
test = df$X12345
This is my unsuccessful attempt at inserting the basename into the test variable:
require(tools)
name = basename(file_path_sans_ext(path))
test2 = paste('df$', 'X', name, sep = '')
>test2
[1] "df$X12345"
This method seems to create a the correct character "df$X12345", although I cannot access the dataframe by calling test2. How can I construct a series of characters into a functioning variable so that I can access the particular dataframe column?
Maybe you are looking for parse and eval:
df <- data.frame(a = 1, b = 2)
test2 <- "df$b"
eval(parse(text = test2))
# [1] 2
test = df[,paste0('X', name)]
Is this what you're looking for?
Related
I am new to R. I am working on cmip6 models. I have over 200 plus .nc files. I can run the script for individual files and get the output i need but having hard time looping the script. Can you guys help me out. It will save me alot of my time. Thank you in advance.
setwd("D:\\data")
library (raster) ## load required library
library(sp) ## load library
library(ncdf4)
station.data = read.csv(file.choose(), sep = ",", header =T) ## import station data.file
lon.lat = station.data[,c(2,3)] ## extract data of all stations in station file for which point values are to be extractd
lon.lat = SpatialPoints(lon.lat) ## lon.lat for further use
lon.lat
robject = brick(file.choose(), varname = "pr")## import raster netcdf file from which point values are to be extractd
dim(robject) ## check the dimensions of project
vall = extract(robject , lon.lat, method = "simple" ) ## extract values
vall = t(vall)
write.csv(vall, file = "earthvegbil.csv", fileEncoding = "macroman") ## save output csv file containing extracted point values .
Put your code in an lapply and loop over the indeces of your files. The latter is useful to get nice numerical suffixes in write.csv.
setwd("D:\\data")
files <- list.files(pattern=".csv$")
bricks <- list.files(pattern=".pr$")
stopifnot(length(files) == length(bricks)) ## check for equal length of both vectors
mapply(seq_along(files), \(x) {
station.data <- read.csv(files[[x]], sep=", ", header=T) ## import station data.file
lon.lat <- station.data[, c(2, 3)] ## extract data of all stations in station file for which point values are to be extractd
lon.lat <- SpatialPoints(lon.lat) ## lon.lat for further use
robject <- brick(bricks[[x]], varname="pr")## import raster netcdf file from which point values are to be extractd
vall <- extract(robject , lon.lat, method="simple" ) ## extract values
vall <- t(vall)
write.csv(vall, file=sprintf('./out/earthvegbil_%03d.csv', x), fileEncoding="macroman") ## save output csv file containing extracted point values .
}
this is probably a basic question but I still need some help to figure out what I should do.
My code is very simple, there are a bunch of calculations to get raster files stored in variables :NDWI,NDWI, VSI, etc. each calculation is based on an satellite image which has a name with the date in it. I will find a way to extract the date and store it in a variable.
In essence, what I want is a part of my code that goes over all the files created to save them as raster in a specific path of my laptop, under this format ( "variable name"_"date".tif. )
What I have for now :
Dossier <- "C:/Users/Perrin/Desktop/INRA/Raster/sentinel/L1C_T31UDR_A019210_20190225T105315/S2A_MSIL1C_20190225T105021_N0207_R051_T31UDR_20190225T125616.SAFE/GRANULE/L1C_T31UDR_A019210_20190225T105315/IMG_DATA"
library(raster)
list.files(Dossier)
Bande1 <- raster(list.files(pattern = "\\B01.jp2$"))
Bande2 <- raster(list.files(pattern = "\\B02.jp2$"))
Bande3 <- raster(list.files(pattern = "\\B03.jp2$"))
Bande4 <- raster(list.files(pattern = "\\B04.jp2$"))
NDVI <- (Bande8-Bande4)/(Bande8+Bande4)
NDWI <- (Bande8A-Bande11)/(Bande8A-Bande11)
NDDI <- (NDVI-NDWII)/(NDVI+NDWII)
writeRaster(NDWI, "C:/Users/Perrin/Desktop/INRA/résultats R/NDWI.tif", overwrite = T)
writeRaster(NDVI, "C:/Users/Perrin/Desktop/INRA/résultats R/NDVI", overwrite = T)
writeRaster(NDDI, "C:/Users/Perrin/Desktop/INRA/résultats R/NDDI.tif", overwrite = T)
Any help will be very much appreciated.
Create a list with your objects and then use Map to loop over it:
rasterList <- list(NDVI = NDVI, NDWI = NDWI, NDDI = NDDI)
filenames<-sprintf("C:/Users/Perrin/Desktop/INRA/résultats R/%s_%s.tif",
names(rasterList), format(Sys.Date(), "%Y%m%d"))
Map(writeRaster, rasterList, filenames, MoreArgs = list(overwrite = TRUE))
I have 2 netCDF files (each .nc file has 4 variables: Susceptible, Infected, Recovered and Inhabitable. The dimension of each variable is 64 x 88). I would like to merge these 2 files into a single netCDF file such that the merged file will stack separately Susceptible from the 2 files, Infected from the 2 files, Recovered from the 2 files and Inhabitable from the 2 files.
Here are the 2 files(first and second)
Could anyone help me with this please?
Thanks in advance,
Ashok
The ncdf4 package will do what you want to do. Have a look at the code below, example for one variable only.
#install.packages('ncdf4')
library(ncdf4)
file1 <- nc_open('England_aggr_GPW4_2000_0001.nc')
file2 <- nc_open('England_aggr_GPW4_2000_0002.nc')
# Just for one variable for now
dat_new <- cbind(
ncvar_get(file1, 'Susceptible'),
ncvar_get(file2, 'Susceptible'))
dim(dat_new)
var <- file1$var['Susceptible']$Susceptible
# Create a new file
file_new3 <- nc_create(
filename = 'England_aggr_GPW4_2000_new.nc',
# We need to define the variables here
vars = ncvar_def(
name = 'Susceptible',
units = var$units,
dim = dim(dat_new)))
# And write to it
ncvar_put(
nc = file_new,
varid = 'Susceptible',
vals = dat_new)
# Finally, close the file
nc_close(file_new)
Update:
An alternative approach is using the raster package as shown below. I didn't figure out how to make 4D raster stacks, so I am splitting your data into one NCDF file per variable. Would that work for you?
#install.packages('ncdf4')
library(ncdf4)
library(raster)
var_names <- c('Susceptible', 'Infected', 'Recovered', 'Inhabitable')
for (var_name in var_names) {
# Create raster stack
x <- stack(
raster('England_aggr_GPW4_2000_0001.nc', varname = var_name),
raster('England_aggr_GPW4_2000_0002.nc', varname = var_name))
# Name each layer
names(x) <- c('01', '02')
writeRaster(x = x,
filename = paste0(var_name, '_out.nc'),
overwrite = TRUE,
format = 'CDF')
}
I've been working with the RCP (Representative Concentration Pathway) spatial data. It's a nice gridded dataset in netCDF format. How can I get a list of bricks where each element represents one variable from a multivariate netCDF file (by variable I don't mean lat,lon,time,depth...etc). This is what Iv'e tried to do. I can't post an example of the data, but I've set up the script below to be reproducible if you want to look in to it. Obviously questions welcome... I might not have expressed the language associated with the code smoothly. Cheers.
A: Package requirements
library(sp)
library(maptools)
library(raster)
library(ncdf)
library(rgdal)
library(rasterVis)
library(latticeExtra)
B: Gather data and look at the netCDF file structure
td <- tempdir()
tf <- tempfile(pattern = "fileZ")
download.file("http://tntcat.iiasa.ac.at:8787/RcpDb/download/R85_NOX.zip", tf , mode = 'wb' )
nc <- unzip( tf , exdir = td )
list.files(td)
## Take a look at the netCDF file structure, beyond this I don't use the ncdf package directly
ncFile <- open.ncdf(nc)
print(ncFile)
vars <- names(ncFile$var)[1:12] # I'll try to use these variable names later to make a list of bricks
C: Create a raster brick for one variable. Levels correspond to years
r85NOXene <- brick(nc, lvar = 3, varname = "emiss_ene")
NAvalue(r85NOXene) <- 0
dim(r85NOXene) # [1] 360 720 12
D: Names to faces
data(wrld_simpl) # in maptools
worldPolys <- SpatialPolygons(wrld_simpl#polygons)
cTheme <- rasterTheme(region = rev(heat.colors(20)))
levelplot(r85NOXene,layers = 4,zscaleLog = 10,main = "2020 NOx Emissions From Power Plants",
margin = FALSE, par.settings = cTheme) + layer(sp.polygons(worldPolys))
E: Summarize all grid cells for each year one variable "emis_ene", I want to do this for each variable of the netCDF file I'm working with.
gVals <- getValues(r85NOXene)
dim(gVals)
r85NOXeneA <- sapply(1:12,function(x){ mat <- matrix(gVals[,x],nrow=360)
matfun <- sum(mat, na.rm = TRUE) # Other conversions are needed, but not for the question
return(matfun)
})
F: Another meet and greet. Check out how E looks
library(ggplot2) # loaded here because of masking issues with latticeExtra
years <- c(2000,2005,seq(2010,2100,by=10))
usNOxDat <- data.frame(years=years,NOx=r85NOXeneA)
ggplot(data=usNOxDat,aes(x=years,y=(NOx))) + geom_line() # names to faces again
detach(package:ggplot2, unload=TRUE)
G: Attempt to create a list of bricks. A list of objects created in part C
brickLst <- lapply(1:12,function(x){ tmpBrk <- brick(nc, lvar = 3, varname = vars[x])
NAvalue(tmpBrk) <- 0
return(tmpBrk)
# I thought a list of bricks would be a good structure to do (E) for each netCDF variable.
# This doesn't break but, returns all variables in each element of the list.
# I want one variable in each element of the list.
# with brick() you can ask for one variable from a netCDF file as I did in (C)
# Why can't I loop through the variable names and return on variable for each list element.
})
H: Get rid of the junk you might have downloaded... Sorry
file.remove(dir(td, pattern = "^fileZ",full.names = TRUE))
file.remove(dir(td, pattern = "^R85",full.names = TRUE))
close(ncFile)
Your (E) step can be simplified using cellStats.
foo <- function(x){
b <- brick(nc, lvar = 3, varname = x)
NAvalue(b) <- 0
cellStats(b, 'sum')
}
sumLayers <- sapply(vars, foo)
sumLayers is the result you are looking for, if I understood correctly your question.
Moreover, you may use the zoo package because you are dealing with time series.
library(zoo)
tt <- getZ(r85NOXene)
z <- zoo(sumLayers, tt)
xyplot(z)
I started recently to work with R so this question has probably a simple solution.
I have some .tif satellite images from different scenes. I can create a test raster brick with it but the process needs to be automatised because of the huge amount of files. Therefore I have been trying to create a function to read the list of .tif files and to output a list of rasters.
You can find here below the code I have been using:
# Description: Prepare a raster brick with ordered acquisitions
# from all the scenes of the study area
library(raster)
library(rgdal)
library(sp)
library(rtiff)
rm(list = ls())
setwd=getwd()
# If you want to download the .tif files of the 2 scenes from dropbox:
dl_from_dropbox <- function(x, key) {
require(RCurl)
bin <- getBinaryURL(paste0("https://dl.dropboxusercontent.com/s/", key, "/", x),
ssl.verifypeer = FALSE)
con <- file(x, open = "wb")
writeBin(bin, con)
close(con)
message(noquote(paste(x, "read into", getwd())))
}
dl_from_dropbox("lndsr.LT52210611985245CUB00-vi.NDVI.tif", "qb1bap9rghwivwy")
dl_from_dropbox("lndsr.LT52210611985309CUB00-vi.NDVI.tif", "sbhcffotirwnnc6")
dl_from_dropbox("lndsr.LT52210611987283CUB00-vi.NDVI.tif", "2zrkoo00ngigfzm")
dl_from_dropbox("lndsr.LT42240631992198XXX02-vi.NDVI.tif", "gx0ctxn2mca3u5v")
dl_from_dropbox("lndsr.LT42240631992214XXX02-vi.NDVI.tif", "pqnjw2dpz9beeo5")
dl_from_dropbox("lndsr.LT52240631986157CUB02-vi.NDVI.tif", "rrka10yaktv8la8")
# 1- Create a list of .tif files with names ordered chronologically (for time series analysis later on)
pathdir= # change
# List all the images from any scene in that folder and
# make a dataframe with a column for the date
a <- list.files(path=pathdir,pattern="lndsr.LT", all.files=FALSE,full.names=FALSE)
a1 <- as.data.frame(a, row.names=NULL, optional=FALSE, stringsAsFactors=FALSE) # class(a1$a) # character
# Create date column with julean date and order it in ascending order
a1$date <- substr(a1$a, 16, 22) # class(a1$date) = character
a1 <- a1[order(a1$date),]
# Keep only the column with the name of the scene
a1 <- subset(a1, select=1) # class(a1$a): character
# retrieve an ordered list from the dataframe
ord_dates <- as.list(as.data.frame(t(a1$a))) # length(ord_dates): 4 (correct)
# class(odd_dates) # list
# 2- Create rasters from elements of a list
for (i in 1:(length(ord_dates))){
# Point to each individual .tif file
tif_file <- ord_dates[i] # Problem: accesses only the first item of ord_dates
# Make a raster out of it
r <- raster(tif_file) # we cant use here a list as an input. Gives error:
# Error in .local(x, ...) : list has no "x"
# Give it a standardised name (r1,r2,r3, etc)
name <- paste("r", 1:length(ord_dates),sep = "")
# Write the raster to file
writeRaster (r , filename = name,format = "GTiff", overwrite =T )
}
I have also tried to use lapply() without much success.
r = lapply(ord_dates, raster)
Can you give me an advice on what concept to follow? I am guessing I should be using matrices but I don't really understand which are their advantages here or in what step they are required.
Any help is really appreciated!
Thanks in advance
Assuming ord_dates is a list of file names (that have full path or are in your getwd()), you can apply a (any) function to this list using lapply. I haven't tested this, unfortunately.
convertAllToRaster <- function(tif_file) {
r <- raster(tif_file)
# Give it a standardised name (r1,r2,r3, etc)
name <- paste("r", 1:length(ord_dates),sep = "")
# Write the raster to file
writeRaster (r , filename = name,format = "GTiff", overwrite =T )
message("Eeee, maybe it was written successfully.")
}
lapply(ord_dates, FUN = convertAllToRaster)
After solving the issues with factors and with the name, this is the code that worked for me. I added a for loop also inside the function you proposed, Roman. Thankyou very much for your kind help!!
convertAllToRaster <- function(ord_dates) {
for (i in 1:(length(ord_dates))){
tif_file <- ord_dates[i]
r <- raster(tif_file)
# Keep the original name
name <- paste(tif_file, ".grd", sep ="")
# Write the raster to file
writeRaster (r , filename = name,format = "raster", overwrite =T ) # in .grd format
}
}
lapply(ord_dates, FUN = convertAllToRaster)