Unzip and rename files keeping original file extension - r

I want to unzip the files inside a folder and rename them with the same name as their .zip file of origin BUT keeping the original extension of the individual files. Any ideas on how to do this?
Reproducible example:
# Download zip files
ftppath1 <- "ftp://geoftp.ibge.gov.br/malhas_digitais/censo_2010/setores_censitarios/se/se_setores_censitarios.zip"
ftppath2 <- "ftp://geoftp.ibge.gov.br/malhas_digitais/censo_2010/setores_censitarios/al/al_setores_censitarios.zip"
download.file(ftppath1, "SE.zip", mode="wb")
download.file(ftppath2, "AL.zip", mode="wb")
What I had in mind was something as naive as this:
# unzip and rename files
unzip("SE.zip", file_name= paste0("SE",.originalextension))
unzip("AL.zip", file_name= paste0("AL",.originalextension))
In the end, these are the files I would have in my folder:
SE.zip
AL.zip
AL.shx
AL.shp
AL.prj
AL.dbf
SE.shx
SE.shp
SE.prj
SE.dbf

for (stem in c("SE", "AL")) {
  zf <- paste0(stem, ".zip")               ## derive zip file name
  unzip(zf)                                ## extract all compressed files
  files <- unzip(zf, list = TRUE)$Name     ## get their original names
  ## rename each extracted file, keeping its original extension
  for (file in files) {
    file.rename(file, paste0(stem, ".", sub(".*\\.", "", file)))
  }
}
system("ls")
## AL.dbf AL.prj AL.shp AL.shx AL.zip SE.dbf SE.prj SE.shp SE.shx SE.zip
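A slightly different sketch of the same idea, if it helps: unzip() returns the paths it extracted and tools::file_ext() pulls out each extension, so the rename can be vectorised (this assumes the archives extract into the working directory, as above).

for (stem in c("SE", "AL")) {
  zf <- paste0(stem, ".zip")
  extracted <- unzip(zf)   ## unzip() returns the extracted file paths
  file.rename(extracted, paste0(stem, ".", tools::file_ext(extracted)))
}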

Related

Download all the txt files from a website

I would like to download multiple text files ending in .txt from a website. I am trying to use the answers from Download All Files From a Folder on a Website. I changed them to target .txt files, but none of the solutions from that page worked for me. I think I have to change this line of the first answer
xpQuery = "//a/@href['.txt'=substring(., string-length(.) - 3)]"
for .txt files, but I don't know how.
The edited code looks something like this:
## your base url
url <- "https://..."
## query the url to get all the file names ending in '.txt'
txts <- XML::getHTMLLinks(
  url,
  xpQuery = "//a/@href['.txt'=substring(., string-length(.) - 3)]"
)
## create a new directory 'mytxts' to hold the downloads
dir.create("mytxts")
## save the current directory path for later
wd <- getwd()
## change working directory for the download
setwd("mytxts")
## create all the new files
file.create(txts)
## download them all
lapply(paste0(url, txts), function(x) download.file(x, basename(x)))
## reset working directory to original
setwd(wd)
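If the XPath query is the sticking point, a simpler sketch (assuming the page lists the files as plain relative hrefs) is to fetch every link and filter in R:

## get all links on the page, then keep only those ending in '.txt'
all_links <- XML::getHTMLLinks(url)
txts <- all_links[grepl("\\.txt$", all_links)]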

Unzipping Files in R Using a Loop or Lapply and Saving to the Same Folder

I have a data frame that contains all of my directory and sub-directory file paths. My first dataset contains the input 'path':
Input:
\\data\A\New Jersey\Construction\2020.10.27\Results.zip
\\data\A\New Jersey\Materials\2020.10.27\Results.zip
\\data\A\Pennsylvania\Construction\2020.10.27\Results.zip
\\data\A\Pennsylvania\Electrician\2020.10.27\Results.zip
My second dataset contains the 'output' path:
Output:
\\data\A\New Jersey\Construction\2020.10.27
\\data\A\New Jersey\Materials\2020.10.27
\\data\A\Pennsylvania\Construction\2020.10.27
\\data\A\Pennsylvania\Electrician\2020.10.27
As you can see, I want the files unzipped to the same folder. Currently I can use
lapply(list, unzip)
to unzip everything to my working directory; however, I would like each archive unzipped to the same location as its zip file. To do this I believe I need a for loop that sets the exdir using the 'output' dataset.
for (i in length(input)) {
  for (f in length(output)) {
    path <- input[i]
    out <- output[f]
    unzip(path, out)
  }
}
Any suggestions?
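Since unzip() has an exdir argument, one sketch (assuming input and output are parallel character vectors, as shown above) is a single loop over both:

## unzip each archive into its matching output folder
for (i in seq_along(input)) {
  unzip(input[[i]], exdir = output[[i]])
}
## or, because each output folder is just the zip file's own directory:
for (zip_path in input) {
  unzip(zip_path, exdir = dirname(zip_path))
}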

Run R script in different folder than data in Ubuntu Server

I have an R script in one directory that takes in the files from a different directory, combines them into one file, and outputs a new Excel file, as shown below:
first <- read_excel("file1.xlsx")
second <- read_excel("file2.xlsx")
third <- read_excel("file3.xlsx")
df <- bind_rows(first,second,third)
openxlsx::write.xlsx(df, "newfile.xlsx")
In my code I can set the working directory to a particular folder just by putting setwd("path/to/data"), but this only works for one directory. I'd like to make a shell script that loops through various folders.
I'd like to try something like
for i in folder1 folder2 folder3
do
  # Run Rscript myRscript.R in each folder
done
For example, Folder 1 has file1, file2, and file3; Folder 2 has file1, file2, and file3; Folder 3 has file1, file2, and file3. I'd like the R script to sit one directory up from the folders, run in each folder, and generate a "newfile.xlsx" for each folder (each folder is a different set of data, but the file names within each folder are the same).
I want to avoid copying a version of the R script into each folder, given the folder-changing nature of my request. Is this possible?
You can loop through the folders and files with R no problem.
library(readxl)   # read_excel()
library(dplyr)    # tibble(), bind_rows()

folders <- list.dirs()   # note: includes "." and nested sub-directories
for (folder in folders) {
  # full paths, so read_excel() can find files outside the working directory
  files <- list.files(folder, full.names = TRUE)
  # extra: keep only xlsx files
  # files <- files[which(stringr::str_detect(files, "\\.xlsx$"))]
  df <- tibble()
  for (file in files) {
    temp <- read_excel(file)
    df <- bind_rows(df, temp)
  }
  # creates a newfile.xlsx in each folder
  openxlsx::write.xlsx(df, file.path(folder, "newfile.xlsx"))
  # alternative: creates the newfile in the main folder
  # openxlsx::write.xlsx(df, paste0(folder, "_newfile.xlsx"))
}
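If you would rather keep the shell loop and run the script once per folder, a hedged alternative is to let the script read its target folder from the command line (hypothetical invocation: Rscript myRscript.R folder1, run from the parent directory):

## myRscript.R (sketch): combine the .xlsx files of the folder passed as an argument
args <- commandArgs(trailingOnly = TRUE)
folder <- args[[1]]
files <- list.files(folder, pattern = "\\.xlsx$", full.names = TRUE)
df <- dplyr::bind_rows(lapply(files, readxl::read_excel))
openxlsx::write.xlsx(df, file.path(folder, "newfile.xlsx"))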

Convert .R files to .txt files en masse

I have about 56 .R files that I need to convert to .txt format: not R objects or data frames, but the .R script files themselves need to be converted to a text format.
Is there a package or method to do this for many files, or is doing it one by one the only option?
Base R approach:
# get list of .R files in current directory
my_r_files <- list.files(pattern = "\\.R$")
# specify new name for each file
new_file_names <- sub(pattern="\\.R$", replacement=".txt", x=my_r_files)
# rename files
file.rename(from = my_r_files, to = new_file_names)
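Note that file.rename() replaces the .R files. If you would rather keep the originals, a copy-based sketch using the same vectors is:

# copy each .R file to a .txt file, leaving the original in place
file.copy(from = my_r_files, to = new_file_names)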

How do I zip a csv file and write that zipped file to a folder using R

I have an R script which generates a csv file of nearly 80000 KB after calculations. I want to write this csv file to a folder, say D:/My_Work/Output, with the file name result.zip, as a zipped file. Please suggest whether there is any function or any other way I could achieve this.
Use the zip() function; its first argument is the path of the zip file to create and its second is the file(s) to compress:
zip(zipfile = "path/to/result.zip", files = "path/to/file.csv")
edit: Unfortunately you cannot go from a data.frame straight to a zipped csv. You need to explicitly write the csv, but it wouldn't be hard to write a wrapper that deletes the csv afterwards so that you never know it's there, like so:
zipped.csv <- function(df, zippedfile) {
  # write the data frame to a temporary csv
  temp <- tempfile(fileext = ".csv")
  write.csv(df, file = temp)
  # zip the temp csv (note: by default zip() stores the temp file's full path;
  # adding flags = "-j" would store just the file name)
  zip(zippedfile, temp)
  # delete the temp csv
  unlink(temp)
}
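For example, with the folder and file name from the question (iris stands in for the computed data frame; zip() relies on an external zip utility being available on the system):

zipped.csv(iris, "D:/My_Work/Output/result.zip")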
If you just want to save some space on disk, it is more convenient to use *.gz compression:
write.csv(iris, gzfile("iris.csv.gz"), row.names = FALSE)
iris2 = read.csv("iris.csv.gz")
