I wrote a script to rename files, but I found that the modified dates of all the files were changed to the same value, so the original order is lost when the files are sorted by date. Is there any way to change the names without changing the modified date? Or, if the dates must change, to keep the order the same when the files are sorted by date? The following is my current code:
# save previous working folder
wkdir <- getwd()
# set the target folder
setwd("C:/Users/YY/Desktop/Tmp file/")
# set the file pattern (a regular expression, not a wildcard)
a <- list.files(path = ".", pattern = "^abc_.*")
# set the name to be replaced (only the prefix at the start of the name)
b <- sub("^abc_", "ABC_", a)
# rename
file.rename(a,b)
# restore previous working folder
setwd(wkdir)
I would appreciate it if anyone can help me.
I had the same question - I needed to process files, then archive. I tried in R first, then realized the copy changed the original datetime stamp for the file.
I eventually learned about the shell() command and solved it with code like that below. As I am on Windows, I used the suffixes R and d in the variable names to denote whether a path is in R form (/ as separator) or Windows form (\ as separator), and converted between them using normalizePath().
sourcefileR <- "c:/Users/myname/Documents/test.dat"
destfileR <- "c:/Users/myname/Documents/somewhereelse/test.dat"
sourcefiled <- normalizePath(sourcefileR)
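# now looks like: "c:\\Users\\myname\\Documents\\test.dat"
destfiled <- normalizePath(destfileR, mustWork = FALSE)  # destination does not exist yet
# quote both paths so that folder names containing spaces survive the shell
rept <- shell(paste("copy", shQuote(sourcefiled, type = "cmd"), shQuote(destfiled, type = "cmd")), intern=TRUE)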
The intern parameter causes the OS feedback to go into the R object rept, which can be searched to find the "1 file(s) copied" string for success or whatever other error trapping you want.
I am in R version 2.15.3 (2013-03-01) on Platform: x86_64-w64-mingw32/x64 (64-bit)
running Windows 7 Professional, SP1.
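On newer R versions than the 2.15.3 mentioned above, a pure-R alternative may be simpler. The sketch below assumes a reasonably recent R in which file.copy() accepts the copy.date argument; the helper name copy_keep_date and the file names are made up for illustration.

```r
# Hypothetical helper: copy a file and ask R to keep its modification time.
# Assumes an R version whose file.copy() has the copy.date argument.
copy_keep_date <- function(from, to) {
  ok <- file.copy(from, to, overwrite = TRUE, copy.date = TRUE)
  if (!ok) stop("copy failed: ", from)
  invisible(to)
}

# Illustration with temporary files rather than real user paths
src <- tempfile(fileext = ".dat")
writeLines("some data", src)
Sys.setFileTime(src, as.POSIXct("2013-03-01 12:00:00", tz = "UTC"))

dst <- file.path(tempdir(), "copy_of_test.dat")
copy_keep_date(src, dst)
file.info(c(src, dst))$mtime  # both entries should show the same time
```

This avoids the shell entirely, so no conversion between / and \ paths is needed.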
Of course it is possible!
Instead of using a command like "REN" or "RENAME", you can use the "MOVE" command for renaming your files/folders and their dates will stay exactly the same.
Example:
MOVE "C:\Folder\Filename.txt" "C:\Folder\New_Filename.txt"
(I don't know if this works on all versions of Windows, but it seems to work at least on Windows 7.)
If for some reason you can't use the MOVE command, there is a program like Nircmd from Nirsoft that can change the file dates to any dates you want.
Syntax:
nircmd.exe setfiletime "filename" "creation-time" "modified-time"
Example:
nircmd.exe setfiletime "c:\temp\myfile.txt" "24-06-2003 17:57:11" "22-11-2005 10:21:56"
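The same idea can be tried from R itself, without an external tool: record the modification times before renaming and write them back afterwards with Sys.setFileTime(). This is only a sketch. The helper name rename_keep_mtime is hypothetical, and Sys.setFileTime() resets the modification time, not the creation time.

```r
# Hypothetical helper: rename files, then put the original modification
# times back with Sys.setFileTime() in case the rename changed them.
rename_keep_mtime <- function(from, to) {
  stopifnot(length(from) == length(to))
  mtimes <- file.info(from)$mtime      # remember the times before renaming
  ok <- file.rename(from, to)
  for (i in which(ok)) {
    Sys.setFileTime(to[i], mtimes[i])  # restore the time on the new name
  }
  invisible(ok)
}

# Illustration in a scratch folder
d <- file.path(tempdir(), "rename_demo")
dir.create(d, showWarnings = FALSE)
old <- file.path(d, c("abc_1.txt", "abc_2.txt"))
for (f in old) writeLines("x", f)
Sys.setFileTime(old[1], as.POSIXct("2020-01-01 10:00:00", tz = "UTC"))
Sys.setFileTime(old[2], as.POSIXct("2021-06-15 10:00:00", tz = "UTC"))

new <- file.path(d, sub("^abc_", "ABC_", basename(old)))
rename_keep_mtime(old, new)
file.info(new)$mtime   # the two distinct times set above should survive
```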
You can't change names without changing the modification date. Think about that for a moment! You're modifying the file (even though you're not modifying the content).
Q. Are you sorting in R or outside in Windows folder view?
Q. Have you thought about sorting by creation date?
If you're sorting in Windows, you should be able to sort by the "Creation Date" column, and if you're sorting in R, use file.info to get the relevant attributes and then sort on those.
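For the R route, a minimal sketch of sorting on file.info() attributes might look like this. The folder is a placeholder (tempdir() is used only so that the snippet runs anywhere).

```r
# List files in a folder ordered by a file.info() timestamp column.
# 'folder' is a placeholder; point it at the real directory.
folder <- tempdir()
files <- list.files(folder, full.names = TRUE)
info <- file.info(files)

by_mtime <- files[order(info$mtime)]   # oldest modification first
by_ctime <- files[order(info$ctime)]   # on Windows, ctime is the creation time
head(basename(by_mtime))
```

On Unix-alikes ctime is the time of the last status change rather than the creation time, so the second ordering is mainly useful on Windows.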
I have multiple .xls files (~100 MB each) from which I would like to load multiple sheets into R as data frames. I have tried various functions, such as xlsx::read.xlsx2 and XLConnect::readWorksheetFromFile, both of which run for a very long time (>15 mins) without finishing, so I have to force-quit RStudio to keep working.
I also tried gdata::read.xls, which does finish, but it takes more than 3 minutes per one sheet and it cannot extract multiple sheets at once (which would be very helpful to speed up my pipeline) like XLConnect::loadWorkbook can.
The time it takes these functions to execute (and I am not even sure the first two would ever finish if I let them go longer) is way too long for my pipeline, where I need to work with many files at once. Is there a way to get these to go/finish faster?
In several places, I have seen a recommendation to use the function readxl::read_xls, which seems to be widely recommended for this task and should be faster per sheet. This one, however, gives me an error:
> # Minimal reproducible example:
> setwd("/Users/USER/Desktop")
> library(readxl)
> data <- read_xls(path="test_file.xls")
Error:
filepath: /Users/USER/Desktop/test_file.xls
libxls error: Unable to open file
I also did some elementary testing to make sure the file exists and is in the correct format:
> # Testing existence & format of the file
> file.exists("test_file.xls")
[1] TRUE
> format_from_ext("test_file.xls")
[1] "xls"
> format_from_signature("test_file.xls")
[1] "xls"
The test_file.xls used above is available here.
Any advice would be appreciated in terms of making the first functions run faster or the read_xls run at all - thank you!
UPDATE:
It seems that some users are able to open the file above using the readxl::read_xls function, while others are not, both on Mac and Windows, using the most up to date versions of R, Rstudio, and readxl. The issue has been posted on the readxl GitHub and has not been resolved yet.
I downloaded your dataset and read each excel sheet in this way (for example, for sheets "Overall" and "Area"):
install.packages("readxl")
library(readxl)
library(data.table)
dt_overall <- as.data.table(read_excel("test_file.xls", sheet = "Overall"))
area_sheet <- as.data.table(read_excel("test_file.xls", sheet = "Area"))
Finally, I get dt like this (for example, only part of the dataset for the "Area" sheet):
Equally, you can use the read_xls function instead of read_excel.
I checked: it also works correctly and is even a little faster, since read_excel is a wrapper around the read_xls and read_xlsx functions from the readxl package.
Also, you can use the excel_sheets function from the readxl package to list the names of all sheets in your Excel file and then read each one.
UPDATE
Benchmarking was done with the microbenchmark package for the following functions: gdata::read.xls, XLConnect::readWorksheetFromFile and readxl::read_excel.
Note that XLConnect is a Java-based solution, so it requires a lot of RAM.
I found that I was unable to open the file with read_xls immediately after downloading it, but if I opened the file in Excel, saved it, and closed it again, then read_xls was able to open it without issue.
My suggested workaround for handling hundreds of files is to build a little C# command line utility that opens, saves, and closes an Excel file. Source code is below; the utility can be compiled with Visual Studio Community edition.
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;
namespace resaver
{
    class Program
    {
        static void Main(string[] args)
        {
            string srcFile = Path.GetFullPath(args[0]);
            Excel.Application excelApplication = new Excel.Application();
            excelApplication.Application.DisplayAlerts = false;
            Excel.Workbook srcworkBook = excelApplication.Workbooks.Open(srcFile);
            srcworkBook.Save();
            srcworkBook.Close();
            excelApplication.Quit();
        }
    }
}
Once compiled, the utility can be called from R using e.g. system2().
I will propose a different workflow. If you happen to have LibreOffice installed, then you can convert your Excel files to csv programmatically. I use Linux, so I do it in bash, but I'm sure it is possible on macOS too.
So open a terminal and navigate to the folder with your excel files and run in terminal:
for i in *.xls
do soffice --headless --convert-to csv "$i"
done
Now in R you can use data.table::fread to read your files with a loop:
Scenario 1: the structure of files is different
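If the structure of files is different, then you wouldn't want to rbind them together. You could run in R:
library(data.table)
# full.names = TRUE keeps the directory, so fread() can find each file
files <- dir("path/to/files", pattern = "\\.csv$", full.names = TRUE)
all_files <- list()
for (i in seq_along(files)) {
  # strip the directory and the .csv extension to get the element name
  fileName <- gsub("(^.*/)(.*)(\\.csv$)", "\\2", files[i])
  all_files[[fileName]] <- fread(files[i])
}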
If you want to extract your named elements within the list into the global environment, so that they can be converted into objects, you can use list2env:
list2env(all_files, envir = .GlobalEnv)
Please be aware of two things: First, in the gsub call, the direction of the slash. And second, list2env may overwrite objects in your Global Environment if they have the same name as the named elements within the list.
Scenario 2: the structure of files is the same
In that case it's likely you want to rbind them all together. You could run in R:
library(data.table)
files <- dir("path/to/files", pattern = "\\.csv$", full.names = TRUE)
# rbindlist() takes a list of tables, so read every file first and bind once
joined <- rbindlist(lapply(files, fread), fill = TRUE)
On my system, I had to use path.expand.
R> file = "~/blah.xls"
R> read_xls(file)
Error:
filepath: ~/Dropbox/signal/aud/rba/balsheet/data/a03.xls
libxls error: Unable to open file
R> read_xls(path.expand(file)) # fixed
Re-saving the file solves the problem easily.
I ran into this problem before too and found the answer in this discussion.
I used read_excel() to open those files.
I was seeing a similar error and wanted to share a short-term solution.
library(readxl)
download.file("https://mjwebster.github.io/DataJ/spreadsheets/MLBpayrolls.xls", "MLBPayrolls.xls")
MLBpayrolls <- read_excel("MLBpayrolls.xls", sheet = "MLB Payrolls", na = "n/a")
Yields (on some systems in my classroom but not others):
Error: filepath: MLBPayrolls.xls libxls error: Unable to open file
The temporary solution was to paste the URL of the xls file into Firefox and download it via the browser. Once this was done we could run the read_excel line without error.
This was happening today on Windows 10, with R 3.6.2 and R Studio 1.2.5033.
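One possible explanation, which the post above does not confirm, is the transfer mode: on Windows, download.file() defaults to text mode, which can alter the bytes of a binary file such as an .xls workbook, whereas a browser always saves the bytes unchanged. A sketch of a wrapper that forces binary mode follows; the name download_binary is hypothetical.

```r
# Hypothetical wrapper: always download in binary mode.
# On Windows, download.file() otherwise defaults to text mode ("w"),
# which can change the bytes of a binary file such as an .xls workbook.
download_binary <- function(url, destfile) {
  status <- download.file(url, destfile, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  invisible(destfile)
}

# Usage with the URL from the post (needs network access and readxl):
# download_binary("https://mjwebster.github.io/DataJ/spreadsheets/MLBpayrolls.xls",
#                 "MLBpayrolls.xls")
# MLBpayrolls <- readxl::read_excel("MLBpayrolls.xls", sheet = "MLB Payrolls", na = "n/a")
```

If the error persists with mode = "wb", the cause lies elsewhere and the browser download remains the fallback.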
If you have downloaded the .xls file from the internet, then even when you open it in MS Excel, a prompt first asks you to confirm that you trust the source. I am guessing this is the reason R (read_xls) also can't open it: it is considered unsafe. Save it as an .xlsx file and then use read_xlsx() or read_excel().
Even though this is not a code-based solution, I just changed the file type. For instance, instead of xls I saved it as csv or xlsx. Then I opened it as a regular file.
It worked for me because, when I opened my xls file, this message popped up: "The file format and extension of 'file.xls' don't match. The file could be corrupted or unsafe..."
I run a monthly data import process in R, using something similar to this:
Data <- read.csv("c:/Data/March 2018 Data.csv")
However, I want to fully automate the process and, hence, find a way to change the date of the file being uploaded, in this case 'March 2018', using a variable from a lookup table. This lookup table is changed every month externally and the Date variable, which indicates the month of production, is updated during this.
I've tried to use paste() function, but didn't get very far:
Data <- read.csv(paste("C:/Data Folder",Date,"Data.csv"))
It keeps saying "No such file or directory" (Error in file). I've checked that the file names and path are fine. The only issue I can detect is that the path in the error message appears like this:
'c:/Data folder/ March 2018 Data.csv'
I'm not sure if that extra 'space' is the issue
Any ideas?
Thanks to both bobbel and jalazbe for this solution.
I used paste0():
Data <- read.csv(paste0("C:/Data Folder/", Date, " Data.csv"))
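The difference between paste() and paste0() that matters here can be sketched as follows. The value of Date is a stand-in for whatever the lookup table supplies, and the folder name is taken from the question.

```r
Date <- "March 2018"   # stand-in for the value read from the lookup table

# paste() inserts a space between every piece, so no "/" ends up
# between the folder and the file name
p1 <- paste("C:/Data Folder", Date, "Data.csv")
p1
# [1] "C:/Data Folder March 2018 Data.csv"

# paste0() adds nothing, so slashes and spaces must be typed explicitly
p2 <- paste0("C:/Data Folder/", Date, " Data.csv")
p2
# [1] "C:/Data Folder/March 2018 Data.csv"

# file.path() supplies the "/" between folder and file name
p3 <- file.path("C:/Data Folder", paste(Date, "Data.csv"))
p3
# [1] "C:/Data Folder/March 2018 Data.csv"
```

Checking the candidate path with file.exists() before calling read.csv() makes this kind of mismatch easier to spot.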
I am creating a package and would like to store settings data locally, since it is unique for each user of the package and so that the setting does not have to be set each time the package is loaded.
How can I do this in the best way?
You could keep the necessary data in an object and save it using saveRDS()
whenever a change is made, or when the user leaves or gives a command to save.
It saves the R object as-is under a file name in the specified path.
saveRDS(<obj>, "path/to/filename.rds")
And you can load it the next time the package starts using readRDS().
The good thing about readRDS() is that you can assign any name to the object, so you don't have to remember its old name. Unlike load(), readRDS() returns only the object and does not restore the old name into your workspace.
newly.assigned.name <- readRDS("path/to/filename.rds")
# without an assignment the object is only returned (and printed), not stored:
readRDS("path/to/filename.rds")
Where to store
Windows
Maybe here:
You can use %systemdrive%%homepath% environment variable to accomplish
this.
The two command variables when concatenated gives you the desired
user's home directory path as below:
Running echo %systemdrive% on command prompt gives:
C:
Running echo %homepath% on command prompt gives:
\Users\
When used together it becomes:
C:\Users\
Linux/OsX
Either in the package location of the user,
path.to.package <- find.package("name.of.your.package",
                                lib.loc = NULL, quiet = FALSE,
                                verbose = getOption("verbose"))
# and then construct with
destination.folder.path <- file.path(path.to.package,
                                     "subfoldername", "filename")
# the path to the final destination
# You should use `file.path()` to construct such paths: it joins the parts with '/', which R accepts on Unix-derived systems (Linux/Mac OS X) and on Windows alike.
Or use the user's $HOME directory and store a file there whose name begins with "."; this is the convention on Unix systems (Linux/Mac OS X) for files that hold the configuration of software programs,
e.g. ".your-packages-name.rds".
If anybody has a better solution, please help!
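One further possibility, assuming R 4.0.0 or later, is tools::R_user_dir(), which returns a per-user directory for a package's configuration on every platform. The sketch below combines it with saveRDS()/readRDS(); the package name "mypackage", the file name settings.rds and the three helper functions are all hypothetical.

```r
# Hypothetical helpers for a package called "mypackage".
# tools::R_user_dir() needs R >= 4.0.0 and returns a per-user directory
# for the package's config files.
settings_file <- function(dir = tools::R_user_dir("mypackage", which = "config")) {
  file.path(dir, "settings.rds")
}

save_settings <- function(settings,
                          dir = tools::R_user_dir("mypackage", which = "config")) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  saveRDS(settings, settings_file(dir))
  invisible(settings_file(dir))
}

load_settings <- function(dir = tools::R_user_dir("mypackage", which = "config"),
                          default = list()) {
  f <- settings_file(dir)
  if (file.exists(f)) readRDS(f) else default
}

# Illustration using a scratch directory instead of the real user directory
demo_dir <- file.path(tempdir(), "mypackage-config")
save_settings(list(api_key = "abc", verbose = TRUE), dir = demo_dir)
restored <- load_settings(dir = demo_dir)
str(restored)
```

Returning a default when no file exists yet means the first load after installation does not fail.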
Is it possible to use a prefix when specifying a filepath string in R to ignore escape characters?
For example if I want to read in the file example.csv when using windows, I need to manually change \ to / or \\. For example,
'E:\DATA\example.csv'
becomes
'E:/DATA/example.csv'
data <- read.csv('E:/DATA/example.csv')
In Python I can prefix my string with r to avoid doing this (e.g. r'E:\DATA\example.csv'). Is there a similar command in R, or an approach that I can use to avoid this problem? (I move between Windows, Mac and Linux; this is obviously only a problem on Windows.)
You can use file.path to construct the correct file path, independent of operating system.
file.path("E:", "DATA", "example.csv")
[1] "E:/DATA/example.csv"
It is also possible to convert a file path to the canonical form for your operating system, using normalizePath:
zz <- file.path("E:", "DATA", "example.csv")
normalizePath(zz)
[1] "E:\\DATA\\example.csv"
But in direct response to your question: I am not aware of a way to ignore the escape sequence using R. In other words, I do not believe it is possible to copy a file path from Windows and paste it directly into R.
However, if what you are really after is a way of copying and pasting from the Windows Clipboard and get a valid R string, try readClipboard
For example, if I copy a file path from Windows Explorer, then run the following code, I get a valid file path:
zz <- readClipboard()
zz
[1] "C:\\Users\\Andrie\\R\\win-library\\"
It is now possible with R version 4.0.0. See ?Quotes for more.
Example
r"(c:\Program files\R)"
## "c:\\Program files\\R"
If E:\DATA\example.csv is on the clipboard then do this:
example.csv <- scan("clipboard", what = "")
## Read 1 item
example.csv
## [1] "E:\\DATA\\example.csv"
Now you can copy "E:\\DATA\\example.csv" from the output above onto the clipboard and then paste it into your source code if you need to hard-code the path.
Similar remarks apply if E:\DATA\example.csv is in a file.
If the file exists then another thing to try is:
example.csv <- file.choose()
and then navigate to it and continue as in 1) above (except the file.choose line replaces the scan statement there).
Note that it's not true that you need to change the backslashes to forward slashes for read.csv on Windows. If for some reason you truly need that translation and the file exists, then this will translate backslashes to forward slashes (if the file does not exist it gives an annoying warning, so you might prefer one of the other approaches below):
normalizePath(example.csv, winslash = "/")
and these translate backslashes to forward slashes even if the file does not exist:
gsub("\\", "/", example.csv, fixed = TRUE)
## [1] "E:/DATA/example.csv"
or
chartr("\\", "/", example.csv)
## [1] "E:/DATA/example.csv"
In 4.0+ the following syntax is supported. ?Quotes discusses additional variations.
r"{E:\DATA\example.csv}"
EDIT: Added more info on normalizePath.
EDIT: Added (4).
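The raw-string variations that ?Quotes describes can be sketched like this (R 4.0.0 or later only; the variable names are just for illustration):

```r
# All of these should be the same string as the escaped form (R >= 4.0.0)
escaped <- "E:\\DATA\\example.csv"

raw_paren   <- r"(E:\DATA\example.csv)"
raw_brace   <- r"{E:\DATA\example.csv}"
raw_bracket <- r"[E:\DATA\example.csv]"
# matching runs of dashes can be added around the delimiters, which helps
# when the text itself contains a closing delimiter followed by a quote
raw_dashes  <- r"--(E:\DATA\example.csv)--"

identical(escaped, raw_paren)
cat(raw_brace, "\n")   # shows the path with single backslashes
```

The doubled backslashes only appear when the string is printed; the string itself holds single backslashes in every case.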
A slightly different approach I use is a custom-made function that takes a Windows path and corrects it for R.
pathPrep <- function() {
  cat("Please enter the path:\n\n")
  oldstring <- readline()
  # swap every backslash for a forward slash
  chartr("\\", "/", oldstring)
}
Let's try it out!
When prompted paste the path into console or use ctrl + r on everything at once
(x <- pathPrep())
C:/Users/Me/Desktop/SomeFolder/example.csv
Now you can feed it to a function
shell.exec(x) #this piece would work only if
# this file really exists in the
# location specified
But as others pointed out what you want is not truly possible.
No, this is not possible with R versions before 4.0.0. Sorry.
I know this question is old, but for people stumbling upon this question in recent times, wanted to share that with the latest version R4.0.0, it is possible to parse in raw strings. The syntax for that is r"()". Note that the string goes in the brackets.
Example:
> r"(C:\Users)"
[1] "C:\\Users"
Source: https://cran.r-project.org/doc/manuals/r-devel/NEWS.html
jump to section: significant user-visible changes.
Here's an incredibly ugly one-line hack to do this in base R, with no packages necessary:
setwd(gsub(", ", "", toString(paste0(read.table("clipboard", sep="\\", stringsAsFactors=F)[1,], sep="/"))))
Usable in its own little wrapper function thus (using suppressWarnings for peace of mind):
> getwd()
[1] "C:/Users/username1/Documents"
> change_wd=function(){
+ suppressWarnings(setwd(gsub(", ", "", toString(paste0(read.table("clipboard", sep="\\", stringsAsFactors=F)[1,], sep="/")))))
+ getwd()
+ }
Now you can run it:
#Copy your new folder path to clipboard
> change_wd()
[1] "C:/Users/username1/Documents/New Folder"
This answers the actual question, "Can I parse a raw string in R without having to double-escape backslashes?", which is a good question with many uses besides the specific clipboard use case.
I have found a package that appears to provide this functionality:
https://github.com/trinker/pathr
See "win_fix".
The use-case specified in the docs is exactly the use-case you just stated, however I haven't investigated whether it handles more flexible usage scenarios yet.
I'm using R under Windows XP. It picked up the environment variable HOME from Windows, which is:
> Sys.getenv("R_USER")
R_USER
"H:"
However, how can I use that variable quickly in a file name? In particular, if I have a file stored at H:/tmp/data.txt, how should I construct the following command?
data <- read.table("$R_HOME/tmp/data.txt")
That one clearly didn't work.
The only way I got it to work is the following:
data <- read.table(paste(Sys.getenv("R_USER"), "/tmp/data.txt", sep = ""))
This is so cumbersome that I have to believe there is an easier way. Does anyone know a quick invocation of the HOME variable in R?
Ah, I got it. It's just
data <- read.table("~/tmp/data.txt")
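For environment variables other than the home directory, where "~" does not help, a small helper around file.path() keeps the call short. This is a sketch only; the name env_path is hypothetical.

```r
# Hypothetical helper: build a path below the directory stored in an
# environment variable. file.path() supplies the "/" separators.
env_path <- function(var, ...) {
  base <- Sys.getenv(var, unset = NA)
  if (is.na(base)) stop("environment variable not set: ", var)
  file.path(base, ...)
}

# With R_USER set to "H:" as in the question, this would give "H:/tmp/data.txt":
# data <- read.table(env_path("R_USER", "tmp", "data.txt"))

# "~" can be expanded explicitly to see where it points on this machine
path.expand("~/tmp/data.txt")
```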