Extract values from raster matching csv and raster filenames - r

I have a folder with many csv files. Each file has several columns, including lat and long columns. Another folder has many rasters in tif format. The .csv files are named by Julian date (e.g. 251.csv), and so are the rasters (e.g. 251.tif). I would like to add the raster value to the csv with the matching name and save the result to a new csv in R. What I want to achieve is this:
raster<-raster("c:/temp/TIFF/2001/273.tif")
points<-read.csv("c:/temp/csv/2001/273.csv")
coordinates(points)=~long+lat
rasValue=extract(raster,points)
combinePointValue <- cbind(points,rasValue)
head(combinePointValue)
library(spdplyr)
combinePointValue <- combinePointValue %>%
rename(chloro = 10)
write.table(combinePointValue,file="c:/temp/2001/chloro/273_chloro.csv",append=FALSE,
sep=",",row.names=FALSE, col.names=TRUE)
Considering the many csv and many tif files, I would prefer avoiding having to type this over and over. Anyone able to help?
Many thanks in advance!
Ilaria

It is better to provide a minimal reproducible example, since your code cannot run without your specific data. However, if I understand correctly, you can try something like this. Since the csv and tif files share the same names, you can sort both lists and loop over the file index. You can reuse the original path of each csv file to save the new file, just by pasting on the suffix "_chloro":
library(raster)
library(spdplyr)
# Restrict each listing to the expected extension so stray files are skipped
csv <- sort(list.files("c:/temp/csv/2001/", pattern = "\\.csv$", full.names = TRUE))
tif <- sort(list.files("c:/temp/TIFF/2001/", pattern = "\\.tif$", full.names = TRUE))
lapply(seq_along(csv), function(i) {
  raster <- raster(tif[i])
  points <- read.csv(csv[i])
  coordinates(points) = ~long+lat
  rasValue = extract(raster, points)
  combinePointValue <- cbind(points, rasValue)
  # Column 10 is assumed to hold the extracted raster value, as in the question
  combinePointValue <- combinePointValue %>%
    rename(chloro = 10)
  write.table(combinePointValue, file = paste0(tools::file_path_sans_ext(csv[i]), "_chloro.csv"), append = FALSE,
              sep = ",", row.names = FALSE, col.names = TRUE)
})

Since the R spatial "ecosystem" has been undergoing dramatic changes over the past few years, and packages like sp and raster will be deprecated, you might consider a solution based on the terra package.
It would go something like:
# Not tested!
library(terra)
csv_path = "c:/temp/csv/2001/"
tif_path = "c:/temp/TIFF/2001/"
# pattern is a regular expression, so escape the dot and anchor the extension
tif_list = list.files(tif_path, pattern = "\\.tif$", full.names = FALSE)
result_list = lapply(seq_along(tif_list), function(i) {
  tif_file = file.path(tif_path, tif_list[i])
  # Do not assume that the two lists of files are exactly equivalent.
  # Instead, build the CSV file name from the tif file name
  csv_name = sub("\\.tif$", ".csv", tif_list[i])
  csv_file = file.path(csv_path, csv_name)
  r = rast(tif_file)
  csv_df = read.csv(csv_file)
  # Assume the csv long/lat are in the same CRS as the tif files
  pts = vect(csv_df, geom = c("long", "lat"), crs = crs(r))
  result = extract(r, pts, xy = TRUE)
  # csv_file already holds the full path, so only the suffix changes
  new_csv = paste0(tools::file_path_sans_ext(csv_file), "_chloro.csv")
  write.csv(result, new_csv, row.names = FALSE)
  return(result)
})
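The advice not to assume that the two file lists line up can be tried out without any spatial packages. Below is a minimal base R sketch, assuming a throwaway folder layout created under tempdir() (the folder names and the day numbers 260 and 273 are made up for illustration). It pairs files by their name without extension and keeps only the days present in both folders:

```r
# Hypothetical demo folders, created only so that the sketch is runnable
root <- file.path(tempdir(), "pairing_demo")
unlink(root, recursive = TRUE)
csv_dir <- file.path(root, "csv")
tif_dir <- file.path(root, "TIFF")
dir.create(csv_dir, recursive = TRUE)
dir.create(tif_dir, recursive = TRUE)
invisible(file.create(file.path(csv_dir, c("251.csv", "252.csv", "260.csv"))))
invisible(file.create(file.path(tif_dir, c("251.tif", "252.tif", "273.tif"))))

csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
tif_files <- list.files(tif_dir, pattern = "\\.tif$", full.names = TRUE)

# Strip directory and extension to get the Julian day used as the key
csv_stem <- tools::file_path_sans_ext(basename(csv_files))
tif_stem <- tools::file_path_sans_ext(basename(tif_files))

# Keep only the days that exist in both folders
common <- intersect(csv_stem, tif_stem)
pairs <- data.frame(
  day = common,
  csv = csv_files[match(common, csv_stem)],
  tif = tif_files[match(common, tif_stem)],
  stringsAsFactors = FALSE
)
print(pairs)
```

Looping over the rows of pairs (instead of over a shared index) means an unmatched csv or tif is skipped rather than silently paired with the wrong file.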

Related

Specifying file names in a loop in R (converting .nc to geotiff)

I have a folder of .nc files on sea surface temperature, and I have a loop which extracts the variable I want ("analysed_sst") from the .nc file and writes the files to rasters.
I want to specify the name of the outputted raster files to be the first section of the original .nc file (which is the date).
An example would be that the original .nc file is called "20220113090000-JPL-L4_GHRSST-SSTfnd-MUR25-GLOB-v02.0-fv04.2.nc" and so I would like the outputted raster to be called "20220113_STT.tiff".
I've attached the loop I'm using below.
library(ncdf4)
library(raster)
#input directory
dir.nc <- #file path
files.nc <- list.files(dir.nc, full.names = T, recursive = T)
#output directory
dir.output <- #file path
#loop
for (i in 1:length(files.nc)) {
r.nc <- raster(files.nc[i])
writeRaster(r.nc, paste(dir.output, i, '.tiff', sep = ''), format = 'GTiff', overwrite = T)
}
dirname() is vectorized so you should be able to safely use dirname(files.nc) to get the directory for each file.
Side note: It can be safer to use seq_along(files.nc) rather than 1:length(files.nc). When length(files.nc) == 0 you can get some confusing errors because 1:0 produces [1] 1 0 and your loop will try to do some weird stuff.
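For the output names themselves, here is a minimal sketch, assuming the date is always the first eight characters of the file name, as in the example from the question. The second input path and the output directory "output" are hypothetical, and the suffix is copied from the name requested in the question:

```r
# Example inputs: the first name is from the question, the second path is made up
files.nc <- c(
  "20220113090000-JPL-L4_GHRSST-SSTfnd-MUR25-GLOB-v02.0-fv04.2.nc",
  "data/2022/20220114090000-JPL-L4_GHRSST-SSTfnd-MUR25-GLOB-v02.0-fv04.2.nc"
)
dir.output <- "output"  # hypothetical output directory

# basename() drops any directory part; substr() keeps the YYYYMMDD prefix
date_part <- substr(basename(files.nc), 1, 8)

# Build one output file name per input file
out_names <- paste0(date_part, "_STT.tiff")
out_paths <- file.path(dir.output, out_names)
print(out_paths)
```

Inside the loop, out_paths[i] would then take the place of paste(dir.output, i, '.tiff', sep = '') in the writeRaster() call.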

using list.files to read many shape files and then merge them into one big file

I have more than 1000 shape files in a directory, and I want to select only 10 of them whose names are already known to me as follows:
15TVN44102267_Polygons.shp, 15TVN44102275_Polygons.shp
15TVN44102282_Polygons.shp, 15TVN44102290_Polygons.shp
15TVN44102297_Polygons.shp, 15TVN44102305_Polygons.shp
15TVN44102312_Polygons.shp, 15TVN44102320_Polygons.shp
15TVN44102327_Polygons.shp, 15TVN44102335_Polygons.shp
First I want to read only these shape files using the list.files command, and then merge them into one big file. I tried the following command, but it failed. I will appreciate any assistance from the community.
setwd('D/LiDAR/CHM_tree_objects')
files <- list.files(pattern="15TVN44102267_Polygons|
15TVN44102275_Polygons| 15TVN44102282_Polygons|
15TVN44102290_Polygons| 15TVN44102297_Polygons|
15TVN44102305_Polygons| 15TVN44102312_Polygons|
15TVN44102320_Polygons| 15TVN44102327_Polygons|
15TVN44102335_Polygons| 15TVN44102342_Polygons|
15TVN44102350_Polygons| 15TVN44102357_Polygons",
recursive = TRUE, full.names = TRUE)
Here's a slightly different approach. If you already know the location of the files and their file names, you don't need to use list.files:
library(sf)
baseDir <- '/temp/r/'
filenames <- c('Denisonia-maculata.shp', 'Denisonia-devisi.shp')
filepaths <- paste(baseDir, filenames, sep='')
# Read each shapefile and return a list of sf objects
listOfShp <- lapply(filepaths, st_read)
# Look to make sure they're all in the same CRS
unique(sapply(listOfShp, st_crs))
# Combine the list of sf objects into a single object
combinedShp <- do.call(rbind, listOfShp)
combinedShp will then be an sf object that has all the features in your individual shapefiles. You can then write that out to a single file in your chosen format with st_write.
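If the files do have to be picked out of a larger directory tree, an exact match on the file name avoids building a long regular expression. This is a minimal base R sketch, assuming a throwaway demo folder under tempdir(); the tile ID ending in 999 is made up to stand in for an unwanted file:

```r
# Hypothetical demo folder with a few empty files standing in for shapefiles
demo_dir <- file.path(tempdir(), "shp_demo")
unlink(demo_dir, recursive = TRUE)
dir.create(demo_dir, recursive = TRUE)
all_names <- c("15TVN44102267_Polygons.shp", "15TVN44102275_Polygons.shp",
               "15TVN44102999_Polygons.shp", "15TVN44102267_Polygons.dbf")
invisible(file.create(file.path(demo_dir, all_names)))

# IDs of the tiles that should be read
wanted_ids <- c("15TVN44102267", "15TVN44102275")
wanted <- paste0(wanted_ids, "_Polygons.shp")

# List every .shp once, then keep only exact file-name matches
all_shp <- list.files(demo_dir, pattern = "\\.shp$",
                      recursive = TRUE, full.names = TRUE)
selected <- all_shp[basename(all_shp) %in% wanted]
print(basename(selected))
```

The selected vector can then be passed to lapply(selected, st_read) in the same way as filepaths.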

Changing saved file names R

I'm using the following code:
lst <- split(data, cut(data$Pos, breaks = maxima, include.lowest = TRUE))
dir <- getwd()
lapply(seq_len(length(lst)),
function (i) write.csv(lst[[i]], file = paste0(dir,"/",names(lst[i]), ".csv"), row.names = FALSE)) ## split data into .csv files based on maxima values
that another user provided me with, to split and save a dataset into separate .csv files. However, when the files are saved they are saved in a naming format as so: [0,9], (9,19], etc., which the analysis program I'm using cannot read in. How would I change the filenames that they are being saved as? I assumed that it was the
names(lst[i])
portion, however when I changed that (e.g. to names(vec[i]) with vec being a vector of numbers with the same length as the number of data files), no data files were created.
Any help is appreciated!
@desc provides the answer in the comment; you only need to change your code to
lst <- split(data, cut(data$Pos, breaks = maxima, include.lowest = TRUE))
dir <- getwd()
lapply(seq_len(length(lst)),
function (i) write.csv(lst[[i]], file = paste0(dir,"/your_desired_label_here",names(lst[i]), ".csv"), row.names = FALSE)) ## split data into .csv files based on maxima values
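Note that names(lst) still contains the interval labels produced by cut(), brackets and commas included. One possible way around that, sketched here with made-up data and integer breaks (the prefix "part_" and the output folder under tempdir() are assumptions), is to turn each label into a plain string before using it as a file name:

```r
# Made-up data standing in for the real dataset
data <- data.frame(Pos = c(0, 5, 9, 12, 19), value = 1:5)
maxima <- c(0, 9, 19)
lst <- split(data, cut(data$Pos, breaks = maxima, include.lowest = TRUE))

# Replace every run of non-digit characters with "_" and trim the ends
# (integer breaks assumed; decimal breaks would lose their decimal point)
safe_names <- gsub("[^0-9]+", "_", names(lst))
safe_names <- gsub("^_|_$", "", safe_names)

# Write each piece to a hypothetical output folder
out_dir <- file.path(tempdir(), "split_demo")
dir.create(out_dir, showWarnings = FALSE)
out_files <- file.path(out_dir, paste0("part_", safe_names, ".csv"))
for (i in seq_along(lst)) {
  write.csv(lst[[i]], file = out_files[i], row.names = FALSE)
}
print(basename(out_files))
```

Using the index i alone, e.g. paste0("part_", i, ".csv"), is an even simpler alternative if the interval does not need to appear in the name.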

How to load or read all .plt files in R from directory into a single matrix

I'm trying to do some distance calculations based on the Geolife Trajectory Dataset, which is in .plt format. Currently I can read one .plt file at a time using the code below.
trajectory = read.table("C:/Users/User/Desktop/20081023025304.plt", header = FALSE, quote = "\"", skip = 6, sep = ",")
My question is: how can I read all the .plt files into R using a single command? I have tried the command below, but it does not work.
file_list <- list.files("C:/Users/User/Desktop/Geolife Trajectories 1.3/Data/000/Trajecotry")
The Geolife dataset path is :
Geolife Trajectories 1.3/Data/000/Trajectory/
Inside the Data folder there are 82 folders in total, numbered 000 to 081.
Thank you for help.
It's very basic R. list.files is to list all the files in a specified directory. read.table is to read a specified file into R. You need to apply read.table to each file listed in the directory.
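# Folder name spelled "Trajectory", as in the dataset path given in the question
file_list <- list.files("C:/Users/User/Desktop/Geolife Trajectories 1.3/Data/000/Trajectory",
                        pattern = "\\.plt$", full.names = TRUE)
# Read every file with the same settings that work for a single file
file_con <- lapply(file_list, function(x) {
  read.table(x, header = FALSE, quote = "\"", skip = 6, sep = ",")
})
# Stack the per-file data frames into one
file_con_df <- do.call(rbind, file_con)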
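To cover all the user folders rather than only 000, list.files can search recursively. The sketch below builds a tiny stand-in for the dataset under tempdir() so that it runs anywhere; the two data rows are made-up values, and the user column is an addition that is not part of the dataset:

```r
# Build a small stand-in for the Data/<user>/Trajectory/*.plt layout
root <- file.path(tempdir(), "geolife_demo", "Data")
unlink(dirname(root), recursive = TRUE)
for (user in c("000", "001")) {
  traj_dir <- file.path(root, user, "Trajectory")
  dir.create(traj_dir, recursive = TRUE)
  demo_lines <- c(paste("header line", 1:6),  # six lines skipped on reading
                  "39.98,116.31,0,492,39744.12,2008-10-23,02:53:04",
                  "39.98,116.32,0,492,39744.12,2008-10-23,02:53:10")
  writeLines(demo_lines, file.path(traj_dir, "20081023025304.plt"))
}

# recursive = TRUE descends into every user folder
plt_files <- list.files(root, pattern = "\\.plt$",
                        recursive = TRUE, full.names = TRUE)

# Read one file and record which user folder it came from
read_plt <- function(path) {
  df <- read.table(path, header = FALSE, quote = "\"", skip = 6, sep = ",")
  df$user <- basename(dirname(dirname(path)))
  df
}

trajectories <- do.call(rbind, lapply(plt_files, read_plt))
str(trajectories)
```

With the real data, root would point at the Data folder of the Geolife download instead of the demo folder.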

R : How to write an XYZ file from a SpatialPointsDataFrame?

I have a SpatialPointsDataFrame which has one attribute (let's call it z for convenience) as well as lat/long coordinates.
I want to write this out to an XYZ file (i.e. an ASCII file with three columns).
Initially I tried
write.table(spdf, filename, row.names=FALSE)
but this wrote the z value first, followed by the coordinates, on each row. So it was ZXY format rather than XYZ. Not a big deal, perhaps, but annoying for other people who have to use the file.
At present I am using what feels like a really horrible bodge to do this (given below), but my question is: is there a good and straightforward way to write a SPDF out as XYZ, with the columns in the right order? It seems as though it ought to be easy!
Thanks for any advice.
Bodge:
dfOutput <- data.frame(x = coordinates(spdf)[,1], y = coordinates(spdf)[,2])
dfOutput$z <- data.frame(spdf)[,1]
write.table(dfOutput, filename, row.names=FALSE)
Why not just
library(sp)
spdf <- SpatialPointsDataFrame(coords = matrix(rnorm(30), ncol = 2),
                               data = data.frame(z = rnorm(15)))
# coordinates() comes first in cbind(), so the columns are written as x, y, z
write.csv(cbind(coordinates(spdf), spdf@data), file = "example.csv",
          row.names = FALSE)
You can write to a .shp file using writeOGR from the rgdal package. Alternatively, you could fortify your data (fortify is from ggplot2) and write the result out as a csv file.
Following up on Noah's comment about a method like coordinates but for data values: the raster package has the getValues() method, which returns the cell values of a Raster object (not of a SpatialPointsDataFrame), so this applies when the data are read in as a raster.
library(raster)
r <- raster('raster.sdat')
# Cell-centre coordinates first, then the cell values, giving X Y Z order
write.table(
  cbind(coordinates(r), getValues(r)),
  file = output_file,
  col.names = c("X", "Y", "ZVALUE"),
  row.names = FALSE,
  quote = FALSE
)
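If the object has already been converted to an ordinary data frame, reordering the columns by name is another option. This is a minimal sketch, assuming a plain data frame with the attribute column first and coordinate columns named long and lat (the column names and values are made up):

```r
# Stand-in for a converted SpatialPointsDataFrame: attribute first, then coordinates
df <- data.frame(z = c(1.5, 2.5, 3.5),
                 long = c(10, 11, 12),
                 lat = c(50, 51, 52))

# Select the columns by name in the order wanted in the file
xyz <- df[, c("long", "lat", "z")]

# Write three space-separated columns without a header or quotes
out_file <- tempfile(fileext = ".xyz")
write.table(xyz, out_file, row.names = FALSE, col.names = FALSE, quote = FALSE)
readLines(out_file)
```

Selecting by name rather than by position keeps the output in X Y Z order even if more attribute columns are added later.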
