R: Using a character string as an argument in a function

I'm new here and don't have much experience with R. I can't find a solution to my current problem: using a character string (Path) as an argument to my function.
Path <- "C:/...../"
foo <-function(Path){
Driver <- "Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=Path"
connect <- odbcDriverConnect(Driver)
return(connect)
}
My problem is that Path is inserted into the string together with its quotes. At least, this is the format I end up with inside my function:
...DBQ="C:/..../""
I tried to fix this with noquote or cat to remove the quotes, but it doesn't help.
Thanks in advance for helping an R beginner :)

You can use sprintf() to insert the Path.
Path <- "C:/...../"
sprintf("Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=\"%s\"", Path)
# [1] "Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=\"C:/...../\""
So your updated function would be
foo <- function(Path) {
Driver <- "Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=\"%s\""
connect <- odbcDriverConnect(sprintf(Driver, Path))
return(connect)
}
See help(sprintf) for all its amazing uses.
Update: Since it's not clear to me whether you want the quotes around Path, I will include the way to have it without them. If you don't want the quotes in the string, remove them from the sprintf() format.
Path <- "C:/...../"
sprintf("Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=%s", Path)
# [1] "Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=C:/...../"
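For comparison, the same string can also be built with paste0(), which may read more easily when only one value is inserted. This is a minimal sketch: build_conn_string is a hypothetical helper name, and the path is a made-up placeholder rather than the asker's real database. It only builds the string and does not open a connection.

```r
# Hypothetical helper: builds the connection string, nothing more.
build_conn_string <- function(path) {
  paste0("Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=", path)
}

db_path <- "C:/example/db.accdb"   # placeholder path, assumed for illustration

via_paste0  <- build_conn_string(db_path)
via_sprintf <- sprintf("Driver={Microsoft Access Driver (*.mdb,*.accdb)};DBQ=%s", db_path)

# Both approaches produce the same string.
identical(via_paste0, via_sprintf)
```

Either result can then be passed to odbcDriverConnect() as in the function above.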

Related

How do I include a variable file name in a system() function to call Windows commands

This is likely a stupid question, but I have not found a workaround (at least not in anything I have searched for, though I might just not be using the right search terms).
I want to call an executable in Windows, and send a file to it (in this case a Blaise man file), the name of which is variable in my script.
So, for example, I have
x<-2
myfile<-c(paste("FileNumber",x,".man", sep=""))
system("myapp.exe" myfile)
But I simply get
Error: unexpected symbol in "system("myapp.exe" myfile"
as if the command is not recognizing the object as myfile, instead taking "myfile" as literal text.
I tried using a paste function to create a whole line command, but that also did not work.
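The error is a syntax error: R does not allow two expressions side by side without an operator or a comma. The system command will not concatenate the string and the myfile object for you; you have to do it yourself.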
So, try this instead:
x<-2
myfile<-c(paste("FileNumber",x,".man", sep=""))
cmd <- paste("myapp.exe", myfile)
system(cmd)
Or just:
x<-2
system(paste("myapp.exe", c(paste("FileNumber",x,".man", sep=""))))
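If the file name could ever contain spaces, quoting it before it reaches the shell is safer. The sketch below only builds and prints the command; myapp.exe is the asker's program and is assumed not to exist here, so the calls that would run it are left commented out.

```r
x <- 2
myfile <- paste0("FileNumber", x, ".man")

# shQuote() wraps the file name in quotes appropriate for the current platform.
cmd <- paste("myapp.exe", shQuote(myfile))
print(cmd)

# Not run here, because myapp.exe is not available:
# system(cmd)
# system2("myapp.exe", args = shQuote(myfile))
```

system2() takes the program and its arguments separately, which avoids assembling one long command string by hand.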

parlapply on sqlQuery from RODBC

R Version : 2.14.1 x64
Running on Windows 7
Connecting to a database on a remote Microsoft SQL Server 2012
I have an unordered vector of names, say:
names<-c("A", "B", "A", "C","C")
each of which has an id in a table in my db. I need to convert the names to their corresponding ids.
I currently have the following code to do it.
###
names<-c("A", "B", "A", "C","C")
dbConn<-odbcDriverConnect(connection="connection string") #successfully connects
nameToID<-function(name, dbConn){
#dbConn : active db connection formed via odbcDriverConnect
#name : a char string
sqlQuery(dbConn, paste("select id from table where name='", name, "'", sep=""))
}
sapply(names, nameToID, dbConn=dbConn)
###
Setting aside better approaches, such as loading the table into R and doing the lookup there (which is possible), I understand why the following doesn't work, but I cannot find a solution. This is my attempt at parallelization with the 'parallel' package:
###
names<-c("A", "B", "A", "C","C")
dbConn<-odbcDriverConnect(connection="connection string") #successfully connects
nameToID<-function(name, dbConn){
#dbConn : active db connection formed via odbcDriverConnect
#name : a char string
sqlQuery(dbConn, paste("select id from table where name='", name, "'", sep=""))
}
mc<-detectCores()
cl<-makeCluster(mc)
clusterExport(cl, c("sqlQuery", "dbConn"))
parSapply(cl, names, nameToID, dbConn=dbConn) #incorrect passing of nameToID's second argument
###
As in the comment, this is not the correct way to assign the second argument to nameToID.
I have also tried the following:
parSapply(cl, names, function(x) nameToID(x, dbConn))
in place of the previous parSapply call, but that also does not work. The error says "the first parameter is not an open RODBC connection", presumably referring to the first parameter of sqlQuery(). dbConn remains open, though.
The following code does work with parallelization.
###
names<-c("A", "B", "A", "C","C")
dbConn<-odbcDriverConnect(connection="connection string") #successfully connects
nameToID<-function(name){
#name : a char string
dbConn<-odbcDriverConnect(connection="string")
result<-sqlQuery(dbConn, paste("select id from table where name='", name, "'", sep=""))
odbcClose(dbConn)
result
}
mc<-detectCores()
cl<-makeCluster(mc)
clusterExport(cl, c("sqlQuery", "odbcDriverConnect", "odbcClose", "dbConn", "nameToID")) #throwing everything in
parSapply(cl, names, nameToID)
###
But the constant opening and closing of the connection ruins the gains from parallelization, and seems just a bit silly.
So the overall question would be how to pass the second parameter (the open db connection) to the function within parSapply, in much the same way as it is done in the regular apply? In general, how does one pass a second, third, nth parameter to a function within a parallel routine?
Thanks and if you need any more information let me know.
-DT
Database connection objects can't be exported or passed as function arguments because they contain socket connections. If you try, it will be serialized, sent to the workers and deserialized, but it won't work correctly since the socket connection won't be valid.
The solution is to create the database connection on each worker before calling parSapply. I often do that using clusterEvalQ:
clusterEvalQ(cl, {
library(RODBC)
dbConn <- odbcDriverConnect(connection="connection string")
NULL
})
Now the worker function can be written as:
nameToID <- function(name) {
sqlQuery(dbConn, paste("select id from table where name='", name, "'", sep=""))
}
and called with:
parSapply(cl, names, nameToID)
Also note that since RODBC is loaded on each of the workers you don't have to export functions defined in it, which I think is good programming practice.
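On the general question of passing a second or nth argument: parSapply forwards any extra arguments after FUN to the worker function, just as sapply does, so the dbConn=dbConn syntax in the question is valid in itself; what fails is the connection object once it arrives on a worker. The sketch below shows both ideas without a database. The lookup vector is a hypothetical stand-in for per-worker state such as a connection, and a two-worker local cluster is assumed.

```r
library(parallel)

cl <- makeCluster(2)

# Extra arguments after FUN are forwarded to FUN on every worker.
scale_by <- function(x, k) x * k
scaled <- parSapply(cl, 1:4, scale_by, k = 10)

# Per-worker state: create an object once on each worker, then use it
# from the worker function (stand-in for a per-worker DB connection).
clusterEvalQ(cl, {
  lookup <- c(A = 1L, B = 2L, C = 3L)   # hypothetical stand-in for the id table
  NULL
})
name_to_id <- function(name) lookup[[name]]
ids <- parSapply(cl, c("A", "B", "A", "C", "C"), name_to_id, USE.NAMES = FALSE)

stopCluster(cl)

print(scaled)
print(ids)
```

Plain values such as k serialize without trouble, which is why they can be passed as arguments while a live connection has to be created on each worker.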

Sys.glob expansion

I am trying to use Sys.glob to open a file called "apcp_sfc_latlon_subset_19940101_20071231.nc". The following command works:
> Sys.glob(file.path("data/train", "apcp*"))
[1] "data/train/apcp_sfc_latlon_subset_19940101_20071231.nc"
But this command doesn't return anything. I don't know why it doesn't work.
> Sys.glob(file.path("data/train", "apcp", "*"))
character(0)
I want the "apcp" bit as its own argument because I will be passing a variable instead of a hard-coded string.
Thank you.
file.path("data/train", "apcp", "*") returns "data/train/apcp/*", whereas file.path("data/train", "apcp*") returns "data/train/apcp*".
Thus in the first case you are looking for files in the subdirectory apcp, and in the second (working) case you are looking for files whose names begin with apcp within the data/train directory.
If you want to be able to pass the apcp component as an argument, paste0 will work:
starting <- "apcp"
file.path("data/train", paste0(starting, '*', collapse =''))
# "data/train/apcp*"
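The same idea can be wrapped in a small function that takes the prefix as an argument. This is a sketch only: glob_prefix is a hypothetical helper name, and the files are created in a temporary directory so the example does not depend on the asker's data/train folder.

```r
# Work in a throwaway directory so the example is self-contained.
train_dir <- file.path(tempdir(), "glob_demo")   # hypothetical directory
dir.create(train_dir, showWarnings = FALSE)
file.create(file.path(train_dir, c("apcp_a.nc", "apcp_b.nc", "other.nc")))

# Hypothetical helper: all files in `dir` whose names start with `prefix`.
glob_prefix <- function(dir, prefix) {
  Sys.glob(file.path(dir, paste0(prefix, "*")))
}

matches <- basename(glob_prefix(train_dir, "apcp"))
print(matches)

# Clean up the temporary files.
unlink(train_dir, recursive = TRUE)
```

The wildcard is glued onto the prefix before file.path() is called, so no extra path separator ends up between them.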

Retrieving MS Access filename when using RODBC

Typing in my connection name I see a line like DBQ=Path\to\DB. How do I retrieve this value? I have tried
conn$DBQ
conn[DBQ]
conn['DBQ']
conn[,'DBQ']
None return the value. I tried typeof(conn) and got "integer", class(conn) -> "RODBC", mode(conn) -> "numeric".
I think there is no simple way. You can get the connection string with attr(conn, "connection.string") and then parse it, e.g. sub("^DBQ=([^=]*);.*", "\\1", attr(conn, "connection.string")), which assumes DBQ is the first field, or strsplit(attr(conn, "connection.string"), ";")[[1]][1], which returns the first field including its "DBQ=" prefix.
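If the position of DBQ within the string is not guaranteed, the fields can be split into a named vector and looked up by key. A minimal sketch, assuming a simple "KEY=value;KEY=value" layout with no semicolons inside values; parse_conn_string is a hypothetical helper, and the example string is made up rather than taken from a real connection.

```r
# Hypothetical helper: split "KEY=value;KEY=value" into a named character vector.
parse_conn_string <- function(s) {
  fields <- strsplit(s, ";", fixed = TRUE)[[1]]
  fields <- fields[nzchar(fields)]          # drop empty trailing fields
  keys   <- sub("=.*$", "", fields)         # text before the first "="
  values <- sub("^[^=]*=", "", fields)      # text after the first "="
  setNames(values, toupper(keys))
}

# Made-up example string; with a real connection, pass
# attr(conn, "connection.string") instead.
example <- "DBQ=C:\\data\\example.accdb;Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
db_file <- parse_conn_string(example)[["DBQ"]]
print(db_file)
```

Keys are upper-cased so that the lookup does not depend on how the driver capitalizes them.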

Validate a character as a file path?

What's the best way to determine if a character string is a valid file path? So CheckFilePath( "my*file.csv") would return FALSE (on Windows, * is an invalid character), whereas CheckFilePath( "c:\\users\\blabla\\desktop\\myfile.csv" ) would return TRUE.
Note that a file path can be valid but not exist on disk.
This is the code that save is using to perform that function:
....
else file(file, "wb")
on.exit(close(con))
}
else if (inherits(file, "connection"))
con <- file
else stop("bad file argument")
......
Perhaps file.exists() is what you're after? From the help page:
file.exists returns a logical vector indicating whether the files named by its argument exist.
(Here ‘exists’ is in the sense of the system's stat call: a file will be reported as existing only
if you have the permissions needed by stat. Existence can also be checked by file.access, which
might use different permissions and so obtain a different result.)
Several other functions that tap into the computer's file system are available as well, also referenced on the help page.
No, there's no way to do this (reliably). I don't know of an operating system interface in either Windows or Linux to test this. You would normally try to create the file and get a failure message, or try to read the file and get a 'does not exist' kind of message.
So you should rely on the operating system to let you know if you can do what you want to do to the file (which will usually be read and/or write).
I can't think of a reason other than a quiz ("Enter a valid fully-qualified Windows file path:") to want to know this.
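The try-it-and-see approach described above can be sketched as follows. can_create_file is a hypothetical helper, not a base R function. It answers "can I open this path for writing right now?", which is narrower than "is this a syntactically valid path?", and the result depends on the operating system and on permissions.

```r
# Hypothetical helper: TRUE if `path` can be opened for writing, FALSE otherwise.
can_create_file <- function(path) {
  existed <- file.exists(path)
  ok <- tryCatch({
    # Append mode, so an existing file's content is left untouched.
    con <- suppressWarnings(file(path, open = "ab"))
    close(con)
    TRUE
  }, error = function(e) FALSE)
  # Remove the file again if this check created it.
  if (ok && !existed) unlink(path)
  ok
}

good_path <- tempfile(fileext = ".csv")
bad_path  <- file.path(tempdir(), "no_such_dir_for_demo", "myfile.csv")

print(can_create_file(good_path))
print(can_create_file(bad_path))
```

The second path fails because its parent directory does not exist, which is the kind of feedback only the operating system can give.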
I would suggest trying the checkPathForOutput function offered by the checkmate package. As stated in its documentation, the function:
Check[s] if a file path can be safely be used to create a file and write to it.
Example
checkmate::checkPathForOutput(x = tempfile(pattern = "sample_test_file", fileext = ".tmp"))
# [1] TRUE
checkmate::checkPathForOutput(x = "c:\\users\\blabla\\desktop\\myfile.csv")
# [1] TRUE
Invalid path
The \0 (NUL) character should not be used in Linux file names [1]:
checkmate::check_path_for_output("my\0file.csv")
# Error: nul character not allowed (line 1)
[1] Not tested on Windows, but the code of checkmate::check_path_for_output suggests that the function should work correctly on MS Windows systems as well.
