Fully qualified file name in R - r

I wish to get the fully qualified name of a file in R, given any of the standard notations. For example:
file.ext
~/file.ext (this case can be handled by path.expand)
../current_dir/file.ext
etc.
By fully qualified file name I mean, for example, (on a Unix-like system):
/home/user/some/path/file.ext
(Edited - use file.path and attempt Windows support) A crude implementation might be:
path.qualify <- function(path) {
path <- path.expand(path)
if(!grepl("^/|([A-Z|a-z]:)", path)) path <- file.path(getwd(),path)
path
}
However, I'd ideally like something cross-platform that can handle relative paths with ../, symlinks etc. An R-only solution would be preferred (rather than shell scripting or similar), but I can't find any straightforward way of doing this, short of coding it "from scratch".
Any ideas?

I think you want normalizePath():
> setwd("~/tmp/bar")
> normalizePath("../tmp.R")
[1] "/home/gavin/tmp/tmp.R"
> normalizePath("~/tmp/tmp.R")
[1] "/home/gavin/tmp/tmp.R"
> normalizePath("./foo.R")
[1] "/home/gavin/tmp/bar/foo.R"
For Windows, there is argument winslash which you might want to set all the time as it is ignored on anything other than Windows so won't affect other OSes:
> normalizePath("./foo.R", winslash="\\")
[1] "/home/gavin/tmp/bar/foo.R"
(You need to escape the \ hence the \\) or
> normalizePath("./foo.R", winslash="/")
[1] "/home/gavin/tmp/bar/foo.R"
depending on how you want the path presented/used. The former is the default ("\\") so you could stick with that if it suffices, without needing to set anything explicitly.
On R 2.13.0 then the "~/file.ext" bit also works (see comments):
> normalizePath("~/foo.R")
[1] "/home/gavin/foo.R"

I think I kind of miss the point of your question, but hopefully my answer can point you into the direction you want (it integrates your idea of using paste and getwdwith list.files):
paste(getwd(),substr(list.files(full.names = TRUE), 2,1000), sep ="")
Edit: Works on windows in some tested folders.

Related

How do I tell Vim to use any/all dictionary files that fit a filepath wildcard glob?

I am trying to set the dictionary option (to allow autocompletion of certain words of my choosing) using wildcards in a filename glob, as follows:
:set dict+=$VIM/dict/dict*.lst
The hope is that, with this line in the initially sourced .vimrc (or, in my case of Windows 10, _vimrc), I can add different dictionary files to my $VIM/dict directory later, and each new invocation of Vim will use those dictionary files, without me needing to modify my .vimrc settings.
However, an error message says that there is no such file. When I give a specific filename (as in :set dict+=$VIM/dict/dict01.lst ), then it works.
The thing is, I could swear that this used to work. I had this setting in my .vimrc files since I started using Vim 7.1, and I don't recall any such error message until recently. Now it shows up on my Linux laptop as well as my Windows 7 and Windows 10 laptops. I can't remember exactly when this started happening.
Yes, I tried using backslashes (as in :set dict+=$VIM\dict\dict*.lst ) in case it was a Windows compatibility issue, but that still doesn't work. (Also this is happening on my Linux laptop, too, and that doesn't use backslashes for filepaths.)
Am I going senile? Or is there some other mysterious force going on?
Assuming for now that it is a change in the latest version of Vim, is there some way to specify "use all the dictionary files that fit this glob"?
-- Edited 2021-02-14 06:17:07
I also checked to see if it was due to having more than one file that fits the wildcard glob. (I thought that if I had more than one file that fit the wildcard, the glob would turn into two filenames, equivalent to saying dict+=$VIM/dict/dict01.lst dict02.lst which would not be syntactically valid.) But it still did not working after removing extra files so that only one file fit my pathname of $VIM/dict/dict*.lst . (I had previously put another Addendum here happily explaining that this was how I solved my problem, but it turned out to be premature.)
You must expand wildcards before setting an option. Multiple file names must be separated by commas. For example,
let &dictionary = tr(expand("$VIM/dict/dict*.lst"), "\n", ",")
If adding a value to a non-empty option, don't forget to add comma too (let is more universal than set, so it's less forgiving):
let &dictionary .= "," . tr(expand(...)...)

R grep with 'AND' logic

I'm working with RJDBC on a server whose maintainers frequently update jar versions. Since RJDBC requires classpaths, this poses a problem when paths break. My situation is fortuitous in that the most current jars will always be in the same directory, but the version numbers will have changed.
I'm trying to use a simple grep function in R to isolate which jar I need based on a regex with some AND logic, however R makes this surprisingly difficult...
This question demonstrates how grep in R can function with the | operator for OR logic, but I can't seem to find similar AND logic operator.
Here's an example:
## Let's say I have three jars in a directory
jars <- list.files('/the/dir')
> jars
[1] "hive-jdbc-1.1.0-cdh5.4.3-standalone.jar" "hive-jdbc-1.1.0-cdh5.4.3.jar" "jython-standalone-2.5.3.jar"
The jar I want is "hive-jdbc-1.1.0-cdh5.4.3-standalone.jar"—how can I use AND logic in grep to extract it?
## I know that OR logic is supported:
j <- jars[grep('hive-jdbc|standalone', jars)]
> j
[1] "hive-jdbc-1.1.0-cdh5.4.3-standalone.jar" "hive-jdbc-1.1.0-cdh5.4.3.jar" "jython-standalone-2.5.3.jar"
## Would AND logic look like the same format?
> jars[grep('hive-jdbc&standalone', jars)]
character(0)
Not all-that-surprisingly, that last piece doesn't work... I found a useful, yet non-comprehensive, link for grep in R, but it doesn't show an AND operator. Any thoughts?
You could try
grep('hive-jdbc.*standalone', jars) # 'hive-jdbc' followed by 'standalone'
or
grepl('hive-jdbc', jars) & grepl('standalone', jars) # 'hive-jdbc' AND 'standalone'

How to use Windows special directories in R

If I do list.files('~') on Linux I get the contents of my home directory.
If I do list.files('%userprofiles%') from Windows, I get an empty character as the return.
How can I use the special directories in this manner on Windows?
This isn't the same as this question because using ~ in Windows gets me %userprofile%/documents which I don't want. As a plan B I can use that and use string manipulation to take out "/documents" but that seems pretty hacky.
I'm not sure if you would consider this "hacky", but you can try something like:
list.files(dirname(path.expand("~")))
From #nongkrong's comments...
Sys.getenv("USERPROFILE") will return the correct directory. Using Sys.getenv() will work for other special directories too. Fortunately it is possible to mix "\\", which Sys.getenv() returns, with "/" which are more convenient to use for full paths.

Function to concatenate paths?

Is there an existing function to concatenate paths?
I know it is not that difficult to implement, but still... besides taking care of trailing / (or \) I would need to take care of proper OS path format detection (i.e. whether we write C:\dir\file or /dir/file).
As I said, I believe I know how to implement it; the question is: should I do it? Does the functionality already exist in existing R package?
Yes, file.path()
R> file.path("usr", "local", "lib")
[1] "usr/local/lib"
R>
There is also the equally useful system.path() for files in a package:
R> system.file("extdata", "date_time_zonespec.csv", package="RcppBDT")
[1] "/usr/local/lib/R/site-library/RcppBDT/extdata/date_time_zonespec.csv"
R>
which will get the file extdata/date_time_zonespec.csv irrespective of
where the package is installed, and
the OS
which is very handy. Lastly, there is also
R> .Platform$file.sep
[1] "/"
R>
if you insist on doing it manually.
In case anyone wants, this is my own function path.cat. Its functionality is comparable with Python's os.path.join with the extra sugar, that it interprets the ...
With this function, you can construct paths hierarchically, but unlike the file.path, you leave the user the ability to override the hierarchy by putting an absolute path. And as an added sugar, he can put the ".." wherever he likes in the path, with obvious meaning.
e.g.
path.cat("/home/user1","project/data","../data2") yelds /home/user1/project/data2
path.cat("/home/user1","project/data","/home/user2/data") yelds /home/user2/data
The function works only with slashes as path separator, which is fine, since R transparently translates them to backslashes on Windows machine.
library("iterators") # After writing this function I've learned, that iterators are very inefficient in R.
library("itertools")
#High-level function that inteligentely concatenates paths given in arguments
#The user interface is the same as for file.path, with the exception that it understands the path ".."
#and it can identify relative and absolute paths.
#Absolute paths starts comply with "^\/" or "^\d:\/" regexp.
#The concatenation starts from the last absolute path in arguments, or the first, if no absolute paths are given.
path.cat<-function(...)
{
elems<-list(...)
elems<-as.character(elems)
elems<-elems[elems!='' && !is.null(elems)]
relems<-rev(elems)
starts<-grep('^[/\\]',relems)[1]
if (!is.na(starts) && !is.null(starts))
{
relems<-relems[1:starts]
}
starts<-grep(':',relems,fixed=TRUE)
if (length(starts)==0){
starts=length(elems)-length(relems)+1
}else{
starts=length(elems)-starts[[1]]+1}
elems<-elems[starts:length(elems)]
path<-do.call(file.path,as.list(elems))
elems<-strsplit(path,'[/\\]',FALSE)[[1]]
it<-ihasNext(iter(elems))
out<-rep(NA,length(elems))
i<-1
while(hasNext(it))
{
item<-nextElem(it)
if(item=='..')
{
i<-i-1
} else if (item=='' & i!=1) {
#nothing
} else {
out[i]<-item
i<-i+1
}
}
do.call(file.path,as.list(out[1:i-1]))
}

Validate a character as a file path?

What's the best way to determine if a character is a valid file path? So CheckFilePath( "my*file.csv") would return FALSE (on windows * is invalid character), whereas CheckFilePath( "c:\\users\\blabla\\desktop\\myfile.csv" ) would return TRUE.
Note that a file path can be valid but not exist on disk.
This is the code that save is using to perform that function:
....
else file(file, "wb")
on.exit(close(con))
}
else if (inherits(file, "connection"))
con <- file
else stop("bad file argument")
......
Perhaps file.exists() is what you're after? From the help page:
file.exists returns a logical vector indicating whether the files named by its argument exist.
(Here ‘exists’ is in the sense of the system's stat call: a file will be reported as existing only
if you have the permissions needed by stat. Existence can also be checked by file.access, which
might use different permissions and so obtain a different result.
Several other functions to tap into the computers file system are available as well, also referenced on the help page.
No, there's no way to do this (reliably). I don't see an operating system interface in neither Windows nor Linux to test this. You would normally try and create the file and get a fail message, or try and read the file and get a 'does not exist' kind of message.
So you should rely on the operating system to let you know if you can do what you want to do to the file (which will usually be read and/or write).
I can't think of a reason other than a quiz ("Enter a valid fully-qualified Windows file path:") to want to know this.
I would suggest trying checkPathForOutput function offered by the checkmate package. As stated in the linked documentation, the function:
Check[s] if a file path can be safely be used to create a file and write to it.
Example
checkmate::checkPathForOutput(x = tempfile(pattern = "sample_test_file", fileext = ".tmp"))
# [1] TRUE
checkmate::checkPathForOutput(x = "c:\\users\\blabla\\desktop\\myfile.csv")
# [1] TRUE
Invalid path
\0 character should not be used in Linux1 file names:
checkmate::check_path_for_output("my\0file.csv")
# Error: nul character not allowed (line 1)
1 Not tested on Windows, but looking at the code of checkmate::check_path_for_output indicates that function should work correctly on MS Windows system as well.

Resources