Zip files without directory name in R

Inside my working directory I have folders whose names end in "_txt", each containing files. I want to zip every folder under its original name, with its files inside. Everything works, except that the .zip also contains the directory name, which I don't want: "1202_txt.zip\1202_txt\files" needs to be "1202_txt.zip\files".
dir.create("1202_txt") # creating folder inside working directory
array <- list.files(pattern = "_txt$") # pattern is a regular expression, not a glob
for (i in seq_along(array)) {
  name <- paste0(array[i], ".zip")
  #zip(name, files = paste0(d,paste0("/",array[i])))
  zip(name, files = array[i])
}
The code above is taken from "Creating zip file from folders in R".
Note: Empty folders can be skipped.

Can you please try this? (using R 3.5.0, macOS High Sierra 10.13.6)
dir_array <- list.files(getwd(), pattern = "_txt$") # regular expression, not a glob
zip_files <- function(dir_name) {
  zip_name <- paste0(dir_name, ".zip")
  zip(zipfile = zip_name, files = dir_name)
}
Map(zip_files, dir_array)
This should zip all the folders inside the current working directory with the specified name. The zipped folders are also housed in the current working directory.

Here is the approach I used to get the result I wanted. It is a bit tricky, but it works:
root <- "C:/test"
setwd(root)
dir.create("1202_txt") # create a folder inside the working directory and put some CSV files in it
folders <- list.files(root, pattern = "_txt$")
for (folder in folders) {
  zip_name <- paste0(folder, ".zip")
  # names of the CSV files inside the folder (no directory prefix)
  zip_files <- list.files(path = file.path(root, folder), pattern = "\\.csv$")
  if (length(zip_files) == 0) next # skip empty folders
  # move the working directory into the folder so the archive stores bare file names
  setwd(file.path(root, folder))
  # zip the files inside the directory
  zip::zip(zipfile = zip_name, files = zip_files)
  # move the zip file from inside the folder to the parent directory
  file.rename(zip_name, file.path(root, zip_name))
  # go back to the root before the next iteration
  setwd(root)
  print(zip_name)
}
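If changing the working directory back and forth feels fragile, a possible alternative is the cherry-pick mode of the zip package, which stores the selected files at the root of the archive. The sketch below is only an illustration under assumptions: it needs the zip package (a version that provides zipr), and it builds a throwaway folder layout in a temporary directory instead of C:/test.

```r
# Throwaway layout for illustration (assumption: not the asker's real data)
root <- tempfile("ziproot_")
dir.create(file.path(root, "1202_txt"), recursive = TRUE)
writeLines("a,b", file.path(root, "1202_txt", "one.csv"))
writeLines("c,d", file.path(root, "1202_txt", "two.csv"))
dir.create(file.path(root, "empty_txt")) # an empty folder, to be skipped

folders <- list.files(root, pattern = "_txt$", full.names = TRUE)
for (folder in folders) {
  files <- list.files(folder, full.names = TRUE)
  if (length(files) == 0) next # skip empty folders
  # zipr() drops the directory part of each path when adding it to the archive
  zip::zipr(zipfile = paste0(folder, ".zip"), files = files)
}

# Inspect the entry names stored in the archive
zip::zip_list(file.path(root, "1202_txt.zip"))$filename
```

Because the paths passed to zipr are absolute, no setwd call is needed and the archives land next to the folders they were built from.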

Related

Is there a way to get the path of a directory?

With the following
dir.create(paste0('hello ', Sys.Date()))
I created a directory whose name is hello 2020-08-10 (today's date). How can I write a csv file inside it? I might use setwd command but it requires the path of the directory. Is there a way to get the path of a directory?
That directory is created as a subdirectory of your current working directory. Thus, you should be able to write your csv file with the relative path "hello 2020-08-10/file.csv".
If you have data to write from a data frame then you can achieve it this way. You can always get the path of the working directory using getwd().
dir <- paste0('hello ', Sys.Date())
yourdf <- read.csv("file.csv")
wrtfile <- paste0(dir,"/filename.csv")
dir.create(dir)
write.csv(yourdf, file = wrtfile, row.names = FALSE)
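If the absolute path itself is needed, normalizePath can resolve it once the directory exists. A minimal sketch, assuming a temporary location and the built-in mtcars data as a stand-in for the real data frame:

```r
# Create the dated directory under a temporary location (assumption for illustration)
out_dir <- file.path(tempdir(), paste0("hello ", Sys.Date()))
dir.create(out_dir, showWarnings = FALSE)

# Resolve the absolute path of the directory
abs_dir <- normalizePath(out_dir)

# Build the file path with file.path() instead of pasting separators by hand
csv_path <- file.path(abs_dir, "file.csv")
write.csv(head(mtcars), csv_path, row.names = FALSE)
```

No setwd call is involved; the CSV is written directly to the path built from the directory name.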

Run R script in different folder than data in Ubuntu Server

I have an R script in one directory that takes in the files in a different directory, combines them into one file, and outputs a new excel file, as shown below:
first <- read_excel("file1.xlsx")
second <- read_excel("file2.xlsx")
third <- read_excel("file3.xlsx")
df <- bind_rows(first,second,third)
openxlsx::write.xlsx(df, "newfile.xlsx")
In my code, I can set the working directory to a particular folder just putting setwd("path/to/data") but this only works in one directory. I'd like to make a shell script where I can loop through various folders.
I'd like to try something like
for i in folder1 folder2 folder3
do
  # Run Rscript myRscript.R in each folder
done
Ex. Folder 1 has file1, file2, and file3. Folder 2 has file1, file2, and file3. Folder 3 has file1, file2, file3. I'd like the Rscript to be one directory up from the folders and run in each folder and generate a "newfile.xlsx" file for each folder (Each folder is a different set of data but have all the same file names within each folder)
I want to avoid copying a version of the R script into each folder, so that a single script handles all the folders. Is this possible?
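You can loop through the folders and files with R, no problem.
library(readxl) # read_excel
library(dplyr)  # bind_rows, tibble
# immediate subfolders only; recursive = FALSE also leaves out "." itself
folders <- list.dirs(recursive = FALSE)
for (folder in folders) {
  # keep only .xlsx files, with the folder prefixed so they can be read from here
  files <- list.files(folder, pattern = "\\.xlsx$", full.names = TRUE)
  # ignore the output of a previous run
  files <- files[basename(files) != "newfile.xlsx"]
  df <- tibble()
  for (file in files) {
    temp <- read_excel(file)
    df <- bind_rows(df, temp)
  }
  # creates a newfile.xlsx in each folder
  openxlsx::write.xlsx(df, file.path(folder, "newfile.xlsx"))
  # alternative: creates the new file in the main folder
  # openxlsx::write.xlsx(df, paste0(folder, "_newfile.xlsx"))
}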

Copy files from nested folders to new nested folders

I am trying to copy a large number of files from one folder to another. We need to restructure the folders, so there is a translation from the old folder path to a new one. The old folder structure is also nested.
Currently my code does not throw any errors, but file.copy returns FALSE for every file.
ETA: When I copy a single file, it works.
allFilePaths <- list.files('./oldTopLevelFolder', recursive = TRUE)
testIds <- c(1:4)
otherTestIds <- c(5:8)
allNewFolders <- paste('newTopLevelFolder', testIds, 'aFolderName', otherTestIds, sep = '/')
lapply(allNewFolders, dir.create, recursive = TRUE)
file.copy(from=allFilePaths, to=allNewFolders,
copy.mode = TRUE)
file.copy can copy multiple files in one call, but only into a single destination folder, by the looks of it.
To copy a bunch of files into varying destination folders, the following will do the job, where allFilePathsWithRoot is a vector (or column) containing the full path where each file currently exists, including the top-level folder, and allNewFilePaths is a vector (or column) containing the new folder path for each file.
# function to copy a single file
copySingleFile <- function(oldPath, newPath) {
  file.copy(from = oldPath, to = newPath, copy.mode = TRUE)
}
# copy each file to its new folder path
mapply(copySingleFile, allFilePathsWithRoot, allNewFilePaths)
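How the two vectors are built depends on the translation between the old and new structure, which is not shown here. The sketch below is therefore only an illustration under an assumed rule (each file keeps its relative folder under a new top-level folder), using a throwaway layout in a temporary directory; the translation step is the part to replace.

```r
# Throwaway layout for illustration (assumption: not the real folders)
base <- tempfile("copydemo_")
oldRoot <- file.path(base, "oldTopLevelFolder")
newRoot <- file.path(base, "newTopLevelFolder")
dir.create(file.path(oldRoot, "a", "b"), recursive = TRUE)
writeLines("x", file.path(oldRoot, "a", "b", "one.txt"))
writeLines("y", file.path(oldRoot, "a", "two.txt"))

# Paths relative to the old root, then the same paths with the root prepended
relPaths <- list.files(oldRoot, recursive = TRUE)
allFilePathsWithRoot <- file.path(oldRoot, relPaths)

# Assumed translation rule: same relative folder under the new root
allNewFilePaths <- file.path(newRoot, dirname(relPaths))

# Destination folders must exist before copying
for (d in unique(allNewFilePaths)) dir.create(d, recursive = TRUE, showWarnings = FALSE)

copySingleFile <- function(oldPath, newPath) {
  file.copy(from = oldPath, to = newPath, copy.mode = TRUE)
}
copied <- mapply(copySingleFile, allFilePathsWithRoot, allNewFilePaths)
```

Paths returned by list.files without full.names = TRUE are relative to the listed folder, so they need the root prepended before file.copy can find the files from the working directory.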

copy csv file from multiple directories to a new one in R

I am trying to extract many .csv files from multiple directories/subdirectories and copy them in a new folder, where I would like to end up with only .csv files.
The csv files are stored in subdirectories with the following structure:
D:\R data\main_folder\03\07\04\BBB_0120180307031414614.csv
D:\R data\main_folder\03\07\05\BBB_0120180307031414615.csv
I am trying the list.files function to extract the csv files names only.
my_dirs <- list.files("D:\\R data\\main_folder\\",pattern="\\.csv$" ,recursive = TRUE,
include.dirs = FALSE,full.names = FALSE)
The problem is that csv files are listed with the directory path, e.g.
03/07/03/BBB_0120180307031414614.csv
This happens even though full.names and include.dirs are both set to FALSE.
This prevents me from copying those files in a new folder, as the name is not recognized.
What am I doing wrong?
Thanks
Use the basename function together with list.files, as shown below.
If I understood you correctly, you want to fetch the names of the .csv files present in different directories.
I made a temp folder in the Documents directory of my Windows machine. Inside it I have two folders, "one" and "two", which contain the csv files "just_one.csv" and "just_two.csv".
So if I want to fetch the names "just_one.csv" and "just_two.csv", I can do this:
basename(list.files("C:/Users/C_Nfdl_99878314/Documents/temp", pattern = "\\.csv$", recursive = TRUE))
Which results in:
[1] "just_one.csv" "just_two.csv"
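For the copying step itself, the bare names are not enough, because file.copy needs to locate each source file. One possible approach, sketched here with throwaway folders in a temporary directory (the folder and file names are assumptions for illustration), is to list the files with full.names = TRUE and copy them into a single existing folder:

```r
# Throwaway nested layout for illustration (assumption: not the real data)
src <- tempfile("main_folder_")
dest <- tempfile("csv_only_")
dir.create(file.path(src, "03", "07", "04"), recursive = TRUE)
dir.create(file.path(src, "03", "07", "05"), recursive = TRUE)
dir.create(dest)
write.csv(data.frame(x = 1), file.path(src, "03", "07", "04", "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 2), file.path(src, "03", "07", "05", "b.csv"), row.names = FALSE)
writeLines("not a csv", file.path(src, "03", "07", "05", "notes.txt"))

# full.names = TRUE returns paths that file.copy can open from any working directory
csv_paths <- list.files(src, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)

# A single existing directory as 'to' keeps each file's base name
ok <- file.copy(from = csv_paths, to = dest)
copied_names <- list.files(dest)
```

If two subdirectories contain files with the same name, the second copy is skipped by default (overwrite = FALSE), so duplicated names need separate handling.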

Copying only text files into new folder in R

I have a folder of PDFs that I am supposed to perform text analytics on within R. Thus far the best method of doing so has been using R to convert these files to text files using pdftotext. After this however I am unable to perform any analytics as the text files are placed into the same folder as the PDFs from which they are derived.
I am achieving this through:
dest <- "C:/PDF"
myfiles <- list.files(path = dest, pattern = "pdf", full.names = TRUE)
lapply(myfiles, function(i) system(paste('"C:/xpdfbin-win-3.04/bin64/pdftotext.exe"', paste0('"',i,'"')), wait= FALSE))
I was wondering the best method of retaining only the text files, whether it be saving them to a newly created folder in this step or if more must be done.
I have tried:
dir.create("C:/txtfiles")
new.folder <- "C:/txtfiles"
dest <- "C:/PDF"
list.of.files <-list.files(dest, ".txt$")
file.copy(list.of.files, new.folder)
However this only fills the new folder 'txtfiles' with blank text files named after the ones created by the first few lines of code.
Use the following code:
src <- "current folder location"
files <- list.files(path = src, pattern = "\\.txt$") # lists all .txt files
for (f in files) {
  file.copy(from = file.path(src, f), to = "destination folder")
}
This should copy all text files in "current folder location" into a separate folder "destination folder".
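Another option is to have the text files written to the new folder during conversion, since pdftotext accepts an output file name as a second argument. The sketch below only builds the output paths; txt_path_for is a hypothetical helper, the PDF names are made up for illustration, and the actual system2 call is left as a comment because it depends on the local pdftotext installation.

```r
txt_dir <- "C:/txtfiles"
exe <- "C:/xpdfbin-win-3.04/bin64/pdftotext.exe"

# Hypothetical helper: same base name as the PDF, .txt extension, inside out_dir
txt_path_for <- function(pdf, out_dir) {
  file.path(out_dir, paste0(tools::file_path_sans_ext(basename(pdf)), ".txt"))
}

# Made-up stand-in for list.files("C:/PDF", pattern = "\\.pdf$", full.names = TRUE)
pdfs <- c("C:/PDF/report one.pdf", "C:/PDF/notes.pdf")
txts <- txt_path_for(pdfs, txt_dir)

# To run the conversions (not executed here):
# for (i in seq_along(pdfs)) system2(exe, args = c(shQuote(pdfs[i]), shQuote(txts[i])))
```

This keeps the PDF folder free of text files, so no copying step is needed afterwards.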
