Preventing duplicate slashes in file paths - r

I would like to build a path to a file, given a filename and a folder where that file exists. The folder may include a trailing slash or it may not. In Python, os.path.join solves this problem for you. Is there a base R solution to this problem? If not, what is the recommended way in R to build file paths that do not have duplicate slashes?
This works fine:
> file.path("/path/to/folder", "file.txt")
[1] "/path/to/folder/file.txt"
But if the user provides a folder with a trailing slash, file.path does the still-functional-but-annoying double-slash:
> file.path("/path/to/folder/", "file.txt")
[1] "/path/to/folder//file.txt"
I'm looking for a built-in, single-function answer to this common issue.

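This version might be OS-independent, since it uses .Platform$file.sep instead of hard-coding /. Note that it also strips a trailing separator from the result:
joinpath = function(...) {
  sep = .Platform$file.sep
  # collapse runs of the separator into a single one
  result = gsub(paste0(sep, "{2,}"), sep, file.path(...), fixed=FALSE, perl=TRUE)
  # drop a trailing separator, if any
  result = gsub(paste0(sep, "$"), '', result, fixed=FALSE, perl=TRUE)
  return(result)
}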

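If the double slash is too annoying, you could collapse it with gsub, and wrap that in a custom function for ease of use. The pattern /{2,} matches any run of two or more slashes, so three or more in a row are also reduced to one:
file.path2 = function(..., fsep = .Platform$file.sep){
  gsub("/{2,}", "/", file.path(..., fsep = fsep))
}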
file.path2("/path/to/folder", "file.txt")
#[1] "/path/to/folder/file.txt"
file.path2("/path/to/folder/", "file.txt")
#[1] "/path/to/folder/file.txt"
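Collapsing every run of slashes also rewrites strings where a double slash is meaningful, such as "http://" or a leading "//server/share". A minimal sketch of a narrower alternative is to trim trailing slashes from the folder before calling file.path. The helper name join_path is hypothetical, not a base R function, and the sketch assumes / is the separator in use:

```r
# Hypothetical helper (not part of base R): join a folder and a file name
join_path <- function(dir, file) {
  # Remove trailing slashes from the folder only; the rest is left untouched
  dir <- sub("/+$", "", dir)
  file.path(dir, file)
}

join_path("/path/to/folder", "file.txt")
join_path("/path/to/folder/", "file.txt")
```

Both calls should give the same single-slash path, and a folder of "/" still yields an absolute path because file.path("", "file.txt") starts with the separator.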

Related

Concatenate string with escape characters to bookend string with "\"... \"" in R

I have a character vector of file paths that look like this:
xx <- c("data/lsa_two_isl_prosp_u.csv",
"data/lsa_two_isl_prosp_d.csv" ,
"data/lsa_two_isl_propsuit_u.csv")
However, I need these file paths to have an escaped double quote (\") concatenated onto the beginning of the string and another onto the end, so that my string looks like this:
xx <- c("\"data/lsa_two_isl_prosp_u.csv\"",
"\"data/lsa_two_isl_prosp_d.csv\"" ,
"\"data/lsa_two_isl_propsuit_u.csv\"")
Normally I would use paste, but the "\"... \"" are escape characters that need each other to 'bookend' a string.
In hindsight this was an obviously doomed idea, but I'm sharing it in case anyone else is tempted to try it: if I use paste('"\"', xx, '\""'), I get "\"\" data/lsa_two_isl_prosp_d.csv \"\"", which is obviously wrong, and I cannot remove the excess portions of the string without throwing out all of it.
Any suggestions?
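Found the answer after a lot of trial and error. Use paste0 (or paste with sep = ""), because paste's default separator would put a space between each quote and the path:
xx <- paste0("\"", xx, "\"")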
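As a rough illustration of the same idea, sprintf with a single-quoted template avoids the backslashes in the source code altogether. The variable name quoted is only for this sketch; the two paths are taken from the question:

```r
xx <- c("data/lsa_two_isl_prosp_u.csv",
        "data/lsa_two_isl_prosp_d.csv")

# '"%s"' is a single-quoted template, so the double quotes need no escaping
quoted <- sprintf('"%s"', xx)

print(quoted)       # print() displays the embedded quotes in escaped form
writeLines(quoted)  # writeLines() shows the actual characters

# Each element gained exactly two characters: one quote at each end
nchar(quoted) - nchar(xx)
```

The backslashes shown by print are only how R displays a double quote inside a string; they are not stored in the value.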

How to Change Part of URL With a Function Input in R?

Let's say we have a url in R like:
url <- 'http://google.com/maps'
And the objective is to change the 'maps' part of it. I'd like to write a function where basically I can just input something (e.g. 'maps', 'images'), etc., and the relevant part of the url will automatically change to reflect what I'm typing in.
Is there a way to do this in R, where part of the url can be changed by typing something into a function?
Thanks!
You have to store the part you type into a variable and paste this to the base URL:
base_url <- "http://google.com/"
your_extension <- "maps"
paste0(base_url, your_extension)
[1] "http://google.com/maps"
If you have to start with a fixed URL, use sub to replace the last part:
sub("\\w+$", 'foo', url)
# "http://google.com/foo"
You can use dirname to remove the last part of the URL and paste it with additional custom string.
change_url_part <- function(base_url, string) {
  paste(dirname(base_url), string, sep = '/')
}
change_url_part('http://google.com/maps', 'images')
#[1] "http://google.com/images"
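One thing to watch with the sub approach: \\w+ does not match hyphens, so a last segment such as "my-maps" would only be partly replaced. A small sketch that replaces everything after the last slash instead is shown here. The function name replace_last_segment is hypothetical, and it assumes the URL has at least one path segment and no query string:

```r
# Hypothetical helper: swap the last path segment of a URL
replace_last_segment <- function(url, new_part) {
  # [^/]+ matches the text after the last slash; /? tolerates a trailing slash
  sub("[^/]+/?$", new_part, url)
}

replace_last_segment("http://google.com/maps", "images")
replace_last_segment("http://google.com/my-maps/", "images")
```

Under those assumptions both calls should return the same URL ending in /images.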

make file.exists() case insensitive

I have a line of code in my script that checks if a file exists (actually, many files, this one line gets looped for a bunch of different files):
file.exists(Sys.glob(file.path(getwd(), "files", "*name*")))
This looks for any file in the directory /files/ that has "name" in it, e.g. "filename.csv". However, some of my files are named "fileName.csv" or "thisfileNAME.csv". They do not get recognized. How can I make file.exists treat this check in a case-insensitive way?
In my other code I usually make any imported names or lists lowercase immediately with the tolower function. But I don't see any option to include that in the file.exists function.
Suggested solution using list.files:
If we have many files, we might want to list the directory only once. Otherwise we can put the call inside the function (and pass path_to_root_directory instead of found_files to the function).
found_files <- list.files(path_to_root_directory, recursive=FALSE)
Behaviour as file.exists (return value is boolean):
fileExIsTs <- function(file_path, found_files) {
  return(tolower(file_path) %in% tolower(found_files))
}
Return value is file with spelling as found in directory or character(0) if no match:
fileExIsTs <- function(file_path, found_files) {
  return(found_files[tolower(found_files) %in% tolower(file_path)])
}
Edit:
New solution to fit new requirements:
keywordExists <- function(keyword, found_files) {
  return(any(grepl(keyword, found_files, ignore.case=TRUE)))
}
keywordExists("NaMe", found_files=c("filename.csv", "morefilenames.csv"))
Returns:
[1] TRUE
Or
Return value is the files with spelling as found in the directory, or character(0) if no match:
keywordExists2 <- function(keyword, found_files) {
  return(found_files[grepl(keyword, found_files, ignore.case=TRUE)])
}
keywordExists2("NaMe", found_files=c("filename.csv", "morefilenames.csv"))
Returns:
[1] "filename.csv" "morefilenames.csv"
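The following should return 1 if any filename contains "name" in any case and 0 if none does. Note that ignore.case has to be passed to grepl, and that grepl expects a regular expression ("name") rather than a glob ("*name*"):
as.integer(any(grepl("name", list.files(), ignore.case = TRUE)))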
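Another possible route keeps the glob from the question: list.files has its own ignore.case argument, and utils::glob2rx converts a glob such as "*name*" into a regular expression. The sketch below uses a throwaway directory under tempdir() so it can run anywhere; the helper name has_match and the demo file names are assumptions for illustration only:

```r
# Hypothetical demo directory, created only so the example is self-contained
demo_dir <- file.path(tempdir(), "case_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir,
                                c("fileName.csv", "thisfileNAME.csv", "other.txt"))))

# TRUE if at least one file in `dir` matches the glob, ignoring case
has_match <- function(dir, glob) {
  hits <- list.files(dir, pattern = utils::glob2rx(glob), ignore.case = TRUE)
  length(hits) > 0
}

has_match(demo_dir, "*name*")
has_match(demo_dir, "*missing*")
```

Returning the hits vector instead of its length would give the matching file names with their original spelling.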

Append to file names in folder

How can I append a suffix to the filenames in a folder?
Filenames:
abc.wav
wjejrt.wav
13567tin.wav
Desired Output
abc_ENG.wav
wjejrt_ENG.wav
13567tin_ENG.wav
I tried the line of code below but got an error, maybe because I don't know the right way to use the file.rename function. Please help...
file.rename(list.files(pattern="*.wav"), paste0("_ENG"))
With base R you can do:
Filenames <- c("abc.wav", "wjejrt.wav", "13567tin.wav")
Fnames_new <- sub(".wav", "_ENG.wav", Filenames, fixed = TRUE)
file.rename(Filenames, Fnames_new)
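If the folder may contain other extensions as well, one option is to split each name into stem and extension with tools::file_path_sans_ext and tools::file_ext. This is only a sketch: add_suffix is a hypothetical helper, and it assumes every file name has an extension. The actual rename is left commented out so the example has no side effects:

```r
# Hypothetical helper: insert a suffix between the file stem and its extension
add_suffix <- function(files, suffix) {
  paste0(tools::file_path_sans_ext(files), suffix, ".", tools::file_ext(files))
}

old_names <- c("abc.wav", "wjejrt.wav", "13567tin.wav")
new_names <- add_suffix(old_names, "_ENG")
new_names

# Uncomment to rename the files on disk:
# file.rename(old_names, new_names)
```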
Since you tagged Python, you could use os.rename() to rename your files:
from os import listdir, rename
from os.path import join, splitext
# Folder containing the files; "." is the directory the script is run from
# You can change this to any path you want
path_to_folder = "."
for f in listdir(path_to_folder):
    if f.endswith(".wav"):
        name, ext = splitext(f)
        # Join with the folder so this also works when path_to_folder is not "."
        rename(join(path_to_folder, f), join(path_to_folder, name + "_ENG" + ext))
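You can try this regular expression:
^.*(?=\\.wav)
Explanation
^ - Anchor to start of string.
.* - Match anything except new line.
(?=\\.wav) - Positive lookahead: succeeds only if .wav follows, without consuming it.
The lookahead needs perl = TRUE, and file.rename needs the complete new names as its second argument, so change your code to this:
files <- list.files(pattern = "\\.wav$")
file.rename(files, sub("^(.*)(?=\\.wav)", "\\1_ENG", files, perl = TRUE))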

combining strings to one string in r

I'm trying to combine some strings into one. In the end this string should be generated:
//*[#id="coll276"]
So the inner part of my string is a vector: tag <- 'coll276'
I already used the paste() method like this:
paste('//*[#id="',tag,'"]', sep = "")
But my result looks like following: //*[#id=\"coll276\"]
I don't know why R is putting some \ into my string. How can I fix this problem?
Thanks a lot!
tldr: Don't worry about them, they're not really there. It's just something added by print
Those \ are escape characters that tell R to ignore the special properties of the characters that follow them. Look at the output of your paste function:
paste('//*[#id="',tag,'"]', sep = "")
[1] "//*[#id=\"coll276\"]"
You'll see that the output, since it is a string, is enclosed in double quotes "". Without escaping, the double quotes inside your string would break it up into two strings with bare code in the middle:
"//*[#id=" coll276 "]"
To prevent this, R "escapes" the quotes in your string so they don't do this. This is just a visual effect. If you write your string to a file, you'll see that those escaping \ aren't actually there:
write(paste('//*[#id="',tag,'"]', sep = ""), 'out.txt')
This is what is in the file:
//*[#id="coll276"]
You can use cat to print the exact value of the string to the console (Thanks @LukeC):
cat(paste('//*[#id="',tag,'"]', sep = ""))
//*[#id="coll276"]
Or use single quotes (if possible):
paste('//*[#id=\'',tag,'\']', sep = "")
[1] "//*[#id='coll276']"
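To check that the backslashes really are only a display artifact, one can inspect the string directly. This is a small sketch using the tag value from the question; the variable name x is just for illustration:

```r
tag <- "coll276"
x <- paste0('//*[#id="', tag, '"]')

print(x)       # escaped display form, with \" around the tag
writeLines(x)  # the actual characters, no backslashes

# Search for a literal backslash in the value itself
grepl("\\", x, fixed = TRUE)

# Character count of the stored string
nchar(x)
```

The grepl call should return FALSE, and nchar counts each embedded quote as a single character.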
