Does anyone have an idea how to read the EXIF data from multiple image directories? I have gathered image data, but for a single sample it is often stored in multiple subdirectories. So far, I've tried this:
multidirdata <- list.dirs("D:/F04", full.names = TRUE, recursive = TRUE)
for (i in 1:length(multidirdata)) {
  setwd("C:/exiftool/")
  multisubdirdata <- list.dirs(multidirdata[i])
  for (j in 1:length(multisubdirdata)) {
    filelist <- list.files(path = multisubdirdata[j], pattern = ".tif", full.names = TRUE)
    fulldata <- data.frame(system('exiftool -FileName -GPSLatitude -GPSLongitude -DateTimeOriginal "D:\\F04\\0005SET\\000"', intern = TRUE))
    img.df <- read.delim2(textConnection(fulldata), stringsAsFactors = FALSE, header = FALSE,
                          col.names = c("File", "Lat", "Lon", "Time"))
    setwd(multisubdirdata[j])
    write.csv(fulldata, file = paste("multipts", "csv", sep = "."), row.names = TRUE, append = FALSE)
  }
}
As you can see, this only requests the EXIF data from "D:\F04\0005SET\000" and not from other directories such as "D:\F04\0005SET\001".
Preferably, I'd like to build a vector of all the needed image directories (the vectors multidirdata and multisubdirdata above) and use those in the exiftool command.
Paying attention to the Common Mistake that StarGeek mentioned made it work for me now:
setwd("C:/exiftool/")
fulldata <- system('exiftool -FileName -GPSLatitude -GPSLongitude -DateTimeOriginal -ext tif -r "D:\\GIS\\Congo\\F04"', intern = TRUE)
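If you'd still rather get one CSV per image directory (as described above with multidirdata and multisubdirdata) instead of a single recursive call, something along these lines could work. This is only a sketch, assuming exiftool is reachable from the working directory (or on the PATH) and using exiftool's -csv output, which read.csv can parse directly:

dirs <- list.dirs("D:/F04", full.names = TRUE, recursive = TRUE)

for (d in dirs) {
  # run exiftool on one directory at a time and capture its CSV output
  out <- system(paste0('exiftool -FileName -GPSLatitude -GPSLongitude -DateTimeOriginal -ext tif -csv "', d, '"'),
                intern = TRUE)
  if (length(out) > 1) {  # skip directories that contain no .tif files
    img.df <- read.csv(textConnection(out), stringsAsFactors = FALSE)
    write.csv(img.df, file = file.path(d, "multipts.csv"), row.names = FALSE)
  }
}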
Consider a directory 'C:/ZFILE' that contains many zip files.
Now, consider that each of these zips contains many CSVs, among which one specific CSV named 'NAME.CSV'; all these scattered 'NAME.CSV' files are similarly named and structured (i.e., same columns).
How can I rbind all these scattered CSVs?
The script below does that, but a function would be more appropriate.
How to do this?
Thanks
zfile <- "C:/ZFILE"
zlist <- list.files(path = zfile, pattern = "\\.zip$", recursive = FALSE, full.names = TRUE)
zlist # list all zip archives in the zfile folder
zunzip <- lapply(zlist, unzip, exdir = zfile) # unzip every archive into the zfile folder (may take time depending on the number of zips)
library(data.table) # rbindlist & fread
csv_name <- "NAME.CSV"
csv_list <- list.files(path = zfile, pattern = paste0(csv_name, "$"), recursive = TRUE, ignore.case = FALSE, full.names = TRUE)
csv_list # list all 'NAME.CSV' files found under the zfile folder
csv_rbind <- rbindlist(sapply(csv_list, fread, simplify = FALSE), idcol = 'filename')
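Since you mention that a function would be more appropriate, here is a minimal sketch that simply wraps the steps above; the function name read_scattered_csv and its arguments are my own, and everything inside reuses unzip(), fread() and rbindlist() exactly as in the script:

library(data.table)

read_scattered_csv <- function(zip_dir, csv_name = "NAME.CSV") {
  # find and extract every zip archive in the folder
  zlist <- list.files(zip_dir, pattern = "\\.zip$", full.names = TRUE)
  lapply(zlist, unzip, exdir = zip_dir)
  # collect every extracted NAME.CSV and rbind them, keeping the source path
  csv_list <- list.files(zip_dir, pattern = paste0(gsub("\\.", "\\\\.", csv_name), "$"),
                         recursive = TRUE, full.names = TRUE)
  rbindlist(sapply(csv_list, fread, simplify = FALSE), idcol = "filename")
}

csv_rbind <- read_scattered_csv("C:/ZFILE")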
You can try this type of function (you can pass an unzip call directly to the cmd parameter of data.table::fread()):
library(data.table)  # for fread() and rbindlist()

get_zipped_csv <- function(path) {
  # list the archives in the folder, then stream each one through
  # `unzip -p` straight into fread, tagging rows with their source file
  fnames <- list.files(path, pattern = "\\.zip$", full.names = TRUE)
  rbindlist(lapply(fnames, \(f) fread(cmd = paste0("unzip -p ", f))[, src := f]))
}
Usage:
get_zipped_csv(path = "C:/ZFILE")
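If the archives contain other CSV files besides NAME.CSV, note that unzip -p will dump all of them; you can ask unzip for just that member instead. A small variant of the function above (the member argument is my addition, and this still assumes a command-line unzip is available, which the cmd = approach already requires):

get_zipped_csv <- function(path, member = "NAME.CSV") {
  fnames <- list.files(path, pattern = "\\.zip$", full.names = TRUE)
  # `unzip -p archive member` prints only that member to stdout
  rbindlist(lapply(fnames, \(f) fread(cmd = paste0("unzip -p ", shQuote(f), " ", member))[, src := f]))
}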
I'm trying to separate out a single column in multiple CSV files. I've already done it for one single file with this code:
tempmax <- read.csv(file="path", header=TRUE, sep=";", fill = TRUE)
colnames(tempmax) = c("Fecha", "Hora", "Temperatura max")
rbind(tempmax)
write.csv(tempmax, "path", sep = ";", append = FALSE, row.names = FALSE, col.names = FALSE)
However, I haven't found a way to do it for multiple CSVs saved in a folder. I would like to do the same for each: read, modify, and write the new file.
I used this to read the multiple files:
getwd <- ("path")
filenames <- list.files("path",
pattern = "*.csv", full.names = TRUE)
But I just can't find a way to make the edits I want (I'm pretty new to R).
I appreciate the help. Thanks!
If we have several files, we can use lapply. The exact transformation isn't clear, so here each file is written back after selecting the first column:
lapply(filenames, function(file){
  tempmax <- read.csv(file = file, header = TRUE, sep = ";", fill = TRUE)
  colnames(tempmax) <- c("Fecha", "Hora", "Temperatura max")
  # overwrite the original file with just its first column
  write.csv(tempmax[1], file, row.names = FALSE)
})
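If you would rather keep the originals untouched and write the modified files to a different folder, a small variation works; the "cleaned" folder name is just an example:

out_dir <- file.path("path", "cleaned")  # example output folder, adjust to your own path
dir.create(out_dir, showWarnings = FALSE)

lapply(filenames, function(file){
  tempmax <- read.csv(file = file, header = TRUE, sep = ";", fill = TRUE)
  colnames(tempmax) <- c("Fecha", "Hora", "Temperatura max")
  # write the first column to a new file of the same name in out_dir
  write.csv(tempmax[1], file.path(out_dir, basename(file)), row.names = FALSE)
})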
I'm fairly new to R, so my apologies if this is a very basic question.
I'm trying to read two Excel files in using the list.files(pattern) method, then using a for loop to bind the files and replace values in the bound file. However, the output my script produces comes from only one file, meaning the files are not being bound together.
The file names are fact_import_2020 and fact_import_20182019.
FilePath <- "//srdceld2/project2/"
FileNames <- list.files(path = FilePath, pattern = "fact_import_20", all.files = FALSE,
                        full.names = FALSE, recursive = FALSE,
                        ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)
FileCount <- length(FileNames)

for (i in 1:FileCount) {
  MOH_TotalHC_1 <- read_excel(paste(FilePath, "/", FileNames[i], sep = ""), sheet = 1, range = cell_cols("A:I"))
  MOH_TotalHC_2 <- read_excel(paste(FilePath, "/", FileNames[i], sep = ""), sheet = 1, range = cell_cols("A:I"))
  MOH_TotalHC <- rbind(MOH_TotalHC_1, MOH_TotalHC_2)
  MOH_TotalHC <- MOH_TotalHC[complete.cases(MOH_TotalHC), ]
}
Use full.names = TRUE in list.files(), so that FileNames holds the full paths of the files.
Then loop over the file names instead of the file count.
I think you are trying to do the following (I am guessing here); please see below.
You are getting data from only one file because each pass of the for() loop overwrites the data read on the previous pass.
library(readxl)

FileNames <- list.files(path = FilePath, pattern = "fact_import_20", all.files = FALSE,
                        full.names = TRUE, recursive = FALSE,
                        ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

# read each Excel file into a list of data frames
df_lst <- lapply(FileNames, function(fn){
  read_excel(fn, sheet = 1, range = cell_cols("A:I"))
})

# combine all data frames
MOH_TotalHC <- do.call('rbind', df_lst)

# keep complete cases only
MOH_TotalHC[complete.cases(MOH_TotalHC), ]
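The question also mentions replacing values in the bound data, but the replacement rule isn't shown; purely as a hypothetical illustration (the column name "Region" and the value "N/A" are made up, not from the question):

MOH_TotalHC <- MOH_TotalHC[complete.cases(MOH_TotalHC), ]
# hypothetical replacement: swap a placeholder string for NA in one column
MOH_TotalHC$Region[MOH_TotalHC$Region == "N/A"] <- NA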
A potential solution is below. It is taken from an earlier answer, as this looks like a duplicate question.
Potential solution:
library(readxl)
library(data.table)

# Set your path here
FilePath <- "//srdceld2/project2/"

# Update the pattern to suit your needs; currently it is set for xlsx files
file.list <- list.files(path = FilePath, pattern = "*.xlsx", full.names = TRUE)
df.list <- lapply(file.list, read_excel, sheet = 1, range = cell_cols("A:I"))

# name the list elements by their file path (any one of these three lines is enough)
attr(df.list, "names") <- file.list
names(df.list) <- file.list
setattr(df.list, "names", file.list)

# final data frame is here
dfFinal <- rbindlist(df.list, use.names = TRUE, fill = TRUE)
Assumptions and call-outs:
The files in the folder are of a similar file type, for example xlsx.
The files could have different sets of columns, and NULLs as well.
Note that the order of the columns matters, so if a new file has more columns, the number of output columns could differ.
Note: like @Sathish, I am guessing what the input could look like.
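If it helps to check that both workbooks really contribute rows, rbindlist() can also tag each row with its source file through its idcol argument (a small follow-up to the code above):

names(df.list) <- basename(file.list)
dfFinal <- rbindlist(df.list, use.names = TRUE, fill = TRUE, idcol = "source_file")
table(dfFinal$source_file)  # row counts per input file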
Sorry in advance, but I don't think I can make this entirely reproducible since it involves reading in txt files; you can test it out quite easily with a folder of a few tab-delimited txt files containing some random numbers.
I have a folder with several txt files inside, and I would like to read each of them into a nested list. Currently I can read one txt at a time with this code:
user_input <- readline(prompt="paste the path for the folder here: ")
files <- list.files(path = user_input, pattern = NULL, all.files = FALSE, full.names = TRUE)
thefiles <- data.frame(files)
thefiles
Sfiles <- split(thefiles, thefiles$files)
Sfiles
input1 <- print(Sfiles[1])
But I want to read all of the files in the given directory.
I suppose it would then be a list of dataframes?
Here are some of the things I've tried:
- I guessed this would just paste together all of the file paths in the directory, but that's not entirely what I want to do:
paste(thefiles, "/", files[[i]], ".txt", sep = "")
- This was meant to use lapply to run read.delim on all of the files in the folder. The error it gives is:
Error in file(file, "rt") : invalid 'description' argument
files_test <- list.files(path=user_input, pattern="*.txt", full.names=TRUE, recursive=FALSE)
lapply(thefiles, transform, files = read.delim(files, header = TRUE, sep = "\t", dec = "."))
- I tried it on its own as well; it also doesn't work:
read.delim(files_test, header = TRUE, sep = "\t", dec = ".")
- I tried a for loop too:
test2 <- for (i in 1:length(Sepfiles){read.delim(files_test, header = TRUE, sep = "\t", dec = "."})
Is there anything obvious that I'm doing wrong? Any pointers would be appreciated
Thanks
This should work if the read.delim part is correct:
thefiles <- list.files(path = user_input, pattern = ".txt$", ignore.case = TRUE, full.names = TRUE, recursive = FALSE)
lapply(thefiles, function(f) read.delim(f, header = TRUE, sep = "\t", dec = "."))
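To keep track of which data frame came from which file, you could assign the result and name the list elements by file; basename() just drops the directory part:

dfs <- lapply(thefiles, function(f) read.delim(f, header = TRUE, sep = "\t", dec = "."))
names(dfs) <- basename(thefiles)
dfs[["myfile.txt"]]  # access one file's data frame by name (the file name here is an example)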
Some background for my question: This is an R script that a previous research assistant wrote, but he did not provide any guidance to me on using it for myself. After working through an R textbook, I attempted to use the code on my data files.
What this code is supposed to do is load multiple .csv files, delete certain items/columns from them, and then write the new cleaned .csv files to a specified directory.
When I run my code, I don't get any errors, but the code isn't doing anything. I originally thought this was a problem with file permissions, but I'm still having the problem after changing them. Not sure what to try next.
Here's the code:
library(data.table)
library(magrittr)
library(stringr)
# create a function to delete unnecessary variables from a CAFAS or PECFAS
# data set and save the reduced copy
del.items <- function(file)
{
  data <- read.csv(input = paste0("../data/pecfas|cafas/raw",
                                  str_match(pattern = "cafas|pecfas", string = file) %>% tolower, "/raw/",
                                  file), sep = ",", header = TRUE, na.strings = "", stringsAsFactors = FALSE,
                   skip = 0, colClasses = "character", data.table = FALSE)

  data <- data[-grep(pattern = "^(CA|PEC)FAS_E[0-9]+(T(Initial|[0-9]+|Exit)|SP[a-z])_(G|S|Item)[0-9]+$", x = names(data))]

  write.csv(data, file = paste0("../data/pecfas|cafas/items-del",
                                str_match(pattern = "cafas|pecfas", string = file) %>% tolower, "/items-del/",
                                sub(pattern = "ExportData_", x = file, replacement = "")) %>% tolower,
            sep = ",", row.names = FALSE, col.names = TRUE)
}
# delete items from all cafas data sets
cafas.files <- list.files("../data/cafas/raw/", pattern = ".csv")
for (file in cafas.files){
del.items(file)
}
# delete items from all pecfas data sets
pecfas.files <- list.files("../data/pecfas/raw/", pattern = ".csv")
for (file in pecfas.files){
del.items(file)
}
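Since the script runs without errors but also without output, one way to narrow it down is to check each step interactively. A minimal sanity-check sketch (the raw and items-del paths are taken from the script above; everything else is generic debugging, not a fix):

# do the input files actually get found? character(0) means the loops never run
cafas.files <- list.files("../data/cafas/raw/", pattern = ".csv")
length(cafas.files)

# the "../data/..." paths are relative to the current working directory
getwd()
normalizePath("../data/cafas/raw/", mustWork = FALSE)

# write.csv() cannot create a missing output folder, so check that it exists
dir.exists("../data/cafas/items-del/")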