How to use a variable in an XPath expression in R?

When I parse a web page, this works fine:
tdata=xpathApply(data,"//table[@id='PL']")
I want to use a variable in xpathApply:
x="PL"
tdata=xpathApply(data,"//table[@id=x]")
This does not work. How do I write the XPath expression in xpathApply with a variable?
Thanks to Dason's suggestion, I tried:
x="PL"
y=paste0("//table[@id='",x,"']")
tdata=xpathApply(data,y)
This works, but I find it ugly. How can I write it more elegantly?

The gsubfn package can do string interpolation somewhat along the lines of Perl if we preface the function whose arguments are to contain substitutions with fn$. Here $x means substitute in the value of x. See ?fn and the gsubfn home page.
library(gsubfn)
x <- "PL"
tdata <- fn$xpathApply(data, "//table[@id='$x']")

@Dason's suggestion of using paste or something like it is most likely the only way to go. If you find it ugly, you can sweep it under the rug by creating your own function:
my.xpathApply <- function(data, x) xpathApply(data, paste0("//table[@id='",x,"']"))
tdata <- my.xpathApply(data, "PL")
After all, you must use a lot of package functions that use paste somewhere, so you should be ok with having one of your own :-)
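If paste0 feels noisy, base R's sprintf keeps the whole XPath template in one readable string. This is only a sketch: the helper name table_xpath is made up for illustration, and it assumes the id never contains a single quote.

```r
# Hypothetical helper: builds the XPath for a table with a given id.
# Assumes `id` contains no single quote, which would break the string literal.
table_xpath <- function(id) sprintf("//table[@id='%s']", id)

x <- "PL"
y <- table_xpath(x)
print(y)

# With the XML package loaded and `data` parsed as in the question:
# tdata <- xpathApply(data, y)
```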

Related

Rename a column with R

I'm trying to rename a specific column in my R script using the colnames function, but with no success so far.
I'm fairly new to programming, so it may be something simple to solve.
Basically, I'm trying to rename a column called Reviewer Overall Notes to Nota Final in a data frame called notas with the code:
colnames(notas$`Reviewer Overall Notes`) <- `Nota Final`
and it returns to me:
> colnames(notas$`Reviewer Overall Notes`) <- `Nota Final`
Error: object 'Nota Final' not found
I also found in this post some code that goes:
colnames(notas) [13] <- `Nota Final`
But it also returns the same message.
What am I doing wrong?
P.S. Sorry for any misspelling; English is not my primary language.
You probably want
colnames(notas)[colnames(notas) == "Reviewer Overall Notes"] <- "Nota Final"
(@Whatif's answer shows how you can do this with the numeric index, but it is probably better practice to do it this way: working with strings rather than column indices makes your code both easier to read [you can see what you're renaming] and more robust [in case the order of columns changes in the future].)
Alternatively,
notas <- notas %>% dplyr::rename(`Nota Final` = `Reviewer Overall Notes`)
Here you do use back-ticks, because tidyverse (of which dplyr is a part) prefers its arguments to be passed as symbols rather than strings.
Why use backticks? Use normal quotation marks:
colnames(notas)[13] <- 'Nota Final'
This seems to matter:
df <- data.frame(a = 1:4)
colnames(df)[1] <- `b`
Error: object 'b' not found
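The error comes from how R reads backticks: a backticked `b` is a name that R looks up, not a string. The sketch below illustrates that difference; the variable b holding "renamed" is made up purely for the demonstration.

```r
df <- data.frame(a = 1:4)

# Backticks quote a *name*, so R looks up an object called b.
b <- "renamed"
colnames(df)[1] <- `b`     # same as: colnames(df)[1] <- b
print(colnames(df))        # "renamed"

# Quotes create a string, which is what the assignment to colnames expects.
colnames(df)[1] <- "b"
print(colnames(df))        # "b"
```

Without the line that defines b, the first assignment fails with the "object 'b' not found" error shown above.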
You should not use single or double quotes when naming objects.
I have learned that we should not use spaces in names. A name with spaces works and is called a non-syntactic name. According to Hadley Wickham's Advanced R book, the ability to create such names with quotes exists for historical reasons:
"You can also create non-syntactic bindings using single or double quotes (e.g. "_abc" <- 1) instead of backticks, but you shouldn’t, because you’ll have to use a different syntax to retrieve the values. The ability to use strings on the left hand side of the assignment arrow is an historical artefact, used before R supported backticks."
To get an overview of what syntactic names are, see ?make.names:
make.names("Nota Final")
[1] "Nota.Final"

Switching the order of paste() in piping in R

I am fairly new to R and I would like to paste the string "exampletext" in front of each file name within the path.
csvList <- list.files(path = "./csv_by_subject") %>%
paste0("*exampletext")
Currently this code renders things like "csv*exampletext" and I want it to be "*exampletextcsv". I would like to continue using dplyr and piping - help appreciated!
As others pointed out, the pipe is not necessary here. But if you do want to use it, you just have to specify that the second argument to paste0 is "the thing you are piping", which you do using a period (.)
list.files(path = "./csv_by_subject") %>%
paste0("*exampletext", .)
paste0('exampletext', csvList) should do the trick. It's not necessarily using dplyr and piping, but it's taking advantage of the vectorization features that R provides.
If you'd like to paste *exampletext before all of the file names, you can reverse the order of what you're doing now using paste0 and passing the second argument as list.files. paste0 can handle vectors as the second argument and will apply the paste to each element.
csvList <- paste0("*exampletext", list.files(path = "./csv_by_subject"))
This returns a few examples from a local folder on my machine:
csvList
[1] "*exampletext_error_metric.m"
[2] "*exampletext_get_K_clusters.m"
...
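The same vectorized behaviour can be checked without touching the file system. A small sketch, using a made-up vector of file names in place of the list.files() result:

```r
# Hypothetical file names standing in for list.files(path = "./csv_by_subject")
files <- c("a.csv", "b.csv")

# paste0 recycles the single prefix across every element of `files`
csvList <- paste0("*exampletext", files)
print(csvList)
# [1] "*exampletexta.csv" "*exampletextb.csv"
```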

Use LaF and grepl together

I would like to read in a possibly large text file and filter the relevant lines on the fly based on a regular expression. My first approach was using the package LaF which supports chunkwise reading and then grepl to filter. However, this seems not to work:
library(LaF)
fh <- laf_open_csv("myfile.txt", column_types="string", sep="°")
# would be nice to declare *no* separator
fh[grepl("abc", fh[[1]]), ]
This returns an error in as.character.default(x): no method to convert this S4 to character. It seems that grepl is applied to the S4 object and not to the chunks.
Is there a nice way to read text lines from a large file and filter them selectively?
OK, I just discovered process_blocks:
regfilter <- function(df, result) c(result, df[grepl("1745", df[[1]]),1])
process_blocks(fh, regfilter)
This works; now I only need to find a way to ignore separators.
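One way to sidestep the separator question is to skip LaF and read raw lines in chunks from a connection with base R. This is a swapped-in technique, not the LaF approach above. It is a minimal sketch: the function name filter_lines and the chunk_size parameter are made up for illustration, and the temporary file only stands in for the real input.

```r
# Hypothetical helper: read `path` in chunks and keep lines matching `pattern`.
filter_lines <- function(path, pattern, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))                 # always release the connection
  hits <- character(0)
  repeat {
    chunk <- readLines(con, n = chunk_size)   # continues where the last read stopped
    if (length(chunk) == 0L) break            # end of file
    hits <- c(hits, chunk[grepl(pattern, chunk)])
  }
  hits
}

# Usage on a small temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("abc 1", "xyz 2", "1745 abc", "no match"), tmp)
hits <- filter_lines(tmp, "abc", chunk_size = 2L)
print(hits)
# [1] "abc 1"    "1745 abc"
```

Because each line is read as one string, no separator has to be declared. Only one chunk plus the accumulated matches is held in memory at a time.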

Turn character strings into named function arguments

I have an R script I intend to call from the command line, which includes a function that may take optional ... arguments. I'd like to pass any arguments given at the command line on as arguments in .... How might I do this?
I've tried the rlang package. I thought something like this would work:
z <- c('x=1', 'y=2') #but actually, z <- commandArgs(T)
c(UQS(z))
Here I would expect to get a vector as if I had called c(x=1, y=2). Instead I get a list of names x=1 &c.
I agree with the previous answer that this is a bit unsafe. That said, one hacky way to achieve this somewhat safely is to take advantage of environments.
First, create a blank environment:
args_env <- env()
Parse-eval your arguments inside that environment.
z <- c("x=1", "y=2")
eval(parse(text = z), envir = args_env)
Convert the environment to a list:
args_list <- as.list(args_env)
And now, you should be able to use do.call to pass arguments in.
do.call(f, args_list)
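Put together, the steps above can be run end to end. The sketch below uses base R's new.env() in place of rlang's env(), and the function f is a made-up stand-in for the real target function.

```r
# Hypothetical target function, standing in for the script's real one
f <- function(x, y) x + y

args_env <- new.env()                       # blank environment for the arguments
z <- c("x=1", "y=2")                        # in a script: commandArgs(trailingOnly = TRUE)
eval(parse(text = z), envir = args_env)     # runs x=1 and y=2 as assignments in args_env
args_list <- as.list(args_env)              # named list; element order is not guaranteed

result <- do.call(f, args_list)             # arguments are matched by name
print(result)
# [1] 3
```

Note that eval still executes whatever code the strings contain, so this only confines the assignments to args_env. It does not make untrusted input safe.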
I wouldn't necessarily recommend your approach. It would be safer to use a package like optparse to properly parse your command line parameters.
But that said, if you want to treat arbitrary strings as R code, you can just eval(parse(text=)) it. Or if you really want to use rlang, you can use eval_tidy/parse_expr.
args <- c('x=1', 'y=2')
z <- eval_tidy(parse_expr(paste0("c(", paste(args, collapse=","), ")")))
# or
z <- eval(parse(text=paste0("c(", paste(args, collapse=","), ")")))
You need this because things in the form a=b can't exist independently outside of an expression otherwise it's interpreted as assignment. If you needed to play with just the name on the left or the value on the right, there might be smarter ways to do this, but if you pass it as one character chunk, you'll need to parse it.
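If only the name on the left and the value on the right are needed, the strings can be split instead of evaluated. The sketch below takes that route. The helper parse_named_args is made up for illustration, and it assumes every argument has the form name=value.

```r
# Hypothetical helper: turn c("x=1", "y=2") into list(x = 1L, y = 2L) without eval.
# Assumes each element looks like name=value; values are type-converted one by one.
parse_named_args <- function(args) {
  parts <- strsplit(args, "=", fixed = TRUE)
  keys <- vapply(parts, function(p) p[1], character(1))
  # Re-join the remainder so values that themselves contain "=" stay intact
  vals <- vapply(parts, function(p) paste(p[-1], collapse = "="), character(1))
  setNames(lapply(vals, type.convert, as.is = TRUE), keys)
}

args <- c("x=1", "y=2")
z <- do.call(c, parse_named_args(args))
print(z)
# x y 
# 1 2 
```

Nothing from the command line is executed as code here, at the cost of supporting only simple scalar values.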

Pass non-string argument to r function and use it inside this function as is

I am a bit confused with the way arguments are transmitted to r function, and the associated syntax (quoting, substituting, evaluating, calling, expressions, "...", ...) .
Basically, what I need to do is to pass arguments in a function using only their name, but without using the type "character".
This is a (not working) illustration of what I would like to do
require(dplyr)
test <- function(x) select(iris, DesiredFunction(x))
test(Species)
I am also interested in general resources about the possibilities to pass arguments to functions.
Thank you,
François
UPDATE
The following is working
require(dplyr)
test <- function(x) select_(iris, substitute(x))
test(Species)
Is there a way to do this with "select" instead of "select_"?
Or in other words, what is the inverse operation of quoting?
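For reference, the capture step itself can be shown in base R alone: substitute() grabs the unevaluated argument and deparse() turns it into a string for ordinary indexing. This is only a sketch of that idea, not the dplyr answer. In current dplyr versions the embrace operator, as in select(iris, {{ x }}), is the documented way to forward a bare column name.

```r
# Base R sketch: accept a bare column name and select that column from iris.
test <- function(x) {
  col <- deparse(substitute(x))     # "Species" when called as test(Species)
  iris[, col, drop = FALSE]         # keep the result as a data frame
}

res <- test(Species)
print(names(res))
# [1] "Species"
```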

Resources