Replace variable name in string with variable value [R] - r

I have a character in R, say "\\frac{A}{B}". And I have values for A and B, say 5 and 10. Is there a way that I can replace the A and B with the 5 and 10?
I tried the following.
words <- "\\frac{A}{B}"
numbers <- list(A=5, B=10)
output <- do.call("substitute", list(parse(text=words)[[1]], numbers))
But I get an error on the \. Is there a way that I can do this? I an trying to create equations with the actual variable values.

You could use the stringi function stri_replace_all_fixed()
stringi::stri_replace_all_fixed(
words, names(numbers), numbers, vectorize_all = FALSE
)
# [1] "\\frac{5}{10}"

Try this:
sprintf(gsub('\\{\\w\\}','\\{%d}',words),5,10)

I'm more familiar with gsub than substitute. The following works:
words <- "\\frac{A}{B}"
numbers <- list(A=5, B=10)
arglist = mapply(list, as.list(names(numbers)), numbers, SIMPLIFY=F)
for (i in 1:length(arglist)){
arglist[[i]]["x"] <- words
words <- do.call("gsub", arglist[[i]])
}
But of course this is unsafe because you're iterating over the substitutions. If, say, the first variable has value "B" and the second variable has name "B", you'll have problems. There's probably a cleaner way.

Related

R: Define ranges from text using regex

I need a way to call defined variables dependant from a string within text.
Let's say I have five variables (r010, r020, r030, r040, r050).
If there is a given text in that form "r010-050" I want to have the sum of values from all five variables.
The whole text would look like "{r010-050} == {r060}"
The first part of that equation needs to be replaced by the sum of the five variables and since r060 is also a variable the result (via parsing the text) should be a logical value.
I think regex will help here again.
Can anyone help?
Thanks.
Define the inputs: the variables r010 etc. which we assume are scalars and the string s.
Then define a pattern pat which matches the {...} part and a function Sum which accepts the 3 capture groups in pat (i.e. the strings matched to the parts of pat within parentheses) and performs the desired sum.
Use gsubfn to match the pattern, passing the capture groups to Sum and replacing the match with the output of Sum. Then evaluate it.
In the example the only variables in the global environment whose names are between r010 and r050 inclusive are r010 and r020 (it would have used more had they existed) and since they sum to r060 it returned TRUE.
library(gsubfn)
# inputs
r010 <- 1; r020 <- 2; r060 <- 3
s <- "{r010-050} == {r060}"
pat <- "[{](\\w+)(-(\\w+))?[}]"
Sum <- function(x1, x2, x3, env = .GlobalEnv) {
x3 <- if(x3 == "") x1 else paste0(gsub("\\d", "", x1), x3)
lst <- ls(env)
sum(unlist(mget(lst[lst >= x1 & lst <= x3], envir = env)))
}
eval(parse(text = gsubfn(pat, Sum, s)))
## [1] TRUE

R: created a names vector containing the means of multiple numeric vectors

I have over 20 numeric vectors which consist of a series of values. each vector is distinguished by a letter, e.g. val_a, val_b, val_c etc...
I would like to put the means from each of these vectors into a single named vector. I could of course do this in a laborious manner like so:
obs <- c("val_a" = round(mean(val_a),3),
"val_b" = round(mean(val_b),3),
"val_c" = round(mean(val_c),3))
But with 20 vectors this then becomes tedious to write out, and not to mention an inelegant solution. How can I create the named vector in a more succinct way? I have made an attempt using a for loop, as so:
obs <- c(for (j in 1:20) {
assign(paste("val",letters[j], sep = "_"),
mean(as.name(paste('val',letters[j], sep = '_'))),)
})
In the right hand argument passed to assign, "as.name" is used in order to remove the quotation marks from output of "paste". So the second argument passed to assign returns a character which has the exact same name as the numeric vector that I want get the mean of, e.g. val_a. But I get the error messsage:
Warning messages:
1: In mean.default(as.name(paste("val", letters[j], sep = "_"))) :
argument is not numeric or logical: returning NA
Does anyone know how to accomplish this?
Solution
To build on bouncyball's comment so you have a full answer, you can do this:
sapply(paste('val', letters[1:20], sep='_'), function(x) round(mean(get(x)), 3))
Explanation
For an object in your environment called x, get("x") will return x. See help("get"). Then we can do this for every element of paste('val', letters[1:20], sep='_') using sapply(), or if you like, a loop.
Example
val_a <- rnorm(100)
val_b <- rnorm(100)
val_c <- rnorm(100)
sapply(paste('val', letters[1:3], sep='_'), function(x) round(mean(get(x)), 3))
val_a val_b val_c
-0.09328504 -0.15632654 -0.09759111

Finding the position of a character within a string

I am trying to find the equivalent of the ANYALPHA SAS function in R. This function searches a character string for an alphabetic character, and returns the first position at which at which the character is found.
Example: looking at the following string '123456789A', the ANYALPHA function would return 10 since first alphabetic character is at position 10 in the string. I would like to replicate this function in R but have not been able to figure it out. I need to search for any alphabetic character regardless of case (i.e. [:alpha:])
Thanks for any help you can offer!
Here's an anyalpha function. I added a few extra features. You can specify the maximum amount of matches you want in the n argument, it defaults to 1. You can also specify if you want the position or the value itself with value=TRUE:
anyalpha <- function(txt, n=1, value=FALSE) {
txt <- as.character(txt)
indx <- gregexpr("[[:alpha:]]", txt)[[1]]
ret <- indx[1:(min(n, length(indx)))]
if(value) {
mapply(function(x,y) substr(txt, x, y), ret, ret)
} else {ret}
}
#test
x <- '123A56789BC'
anyalpha(x)
#[1] 4
anyalpha(x, 2)
#[1] 4 10
anyalpha(x, 2, value=TRUE)
#[1] "C" "A"

Integers/expressions as names for elements in lists

I am trying to understand names, lists and lists of lists in R. It would be convenient to have a way to dynamically label them like this:
> ll <- list("1" = 2)
> ll
$`1`
[1] 2
But this is not working:
> ll <- list(as.character(1) = 2)
Error: unexpected '=' in "ll <- list(as.character(1) ="
Neither is this:
> ll <- list(paste(1) = 2)
Error: unexpected '=' in "ll <- list(paste(1) ="
Why is that? Both paste() and as.character() are returning "1".
The reason is that paste(1) is a function call that evaluates to a string, not a string itself.
The The R Language Definition says this:
Each argument can be tagged (tag=expr), or just be a simple expression.
It can also be empty or it can be one of the special tokens ‘...’, ‘..2’, etc.
A tag can be an identifier or a text string.
Thus, tags can't be expressions.
However, if you want to set names (which are just an attribute), you can do so with structure, eg
> structure(1:5, names=LETTERS[1:5])
A B C D E
1 2 3 4 5
Here, LETTERS[1:5] is most definitely an expression.
If your goal is simply to use integers as names (as in the question title), you can type them in with backticks or single- or double-quotes (as the OP already knows). They are converted to characters, since all names are characters in R.
I can't offer a deep technical explanation for why your later code fails beyond "the left-hand side of = is not evaluated in that context (of enumerating items in a list)". Here's one workaround:
mylist <- list()
mylist[[paste("a")]] <- 2
mylist[[paste("b")]] <- 3
mylist[[paste("c")]] <- matrix(1:4,ncol=2)
mylist[[paste("d")]] <- mean
And here's another:
library(data.table)
tmp <- rbindlist(list(
list(paste("a"), list(2)),
list(paste("b"), list(3)),
list(paste("c"), list(matrix(1:4,ncol=2))),
list(paste("d"), list(mean))
))
res <- setNames(tmp$V2,tmp$V1)
identical(mylist,res) # TRUE
The drawbacks of each approach are pretty serious, I think. On the other hand, I've never found myself in need of richer naming syntax.

Replace non-ascii chars with a defined string list without a loop in R

I want to replace non-ascii characters (for now, only spanish), by their ascii equivalent. If I have "á", I want to replace it with "a" and so on.
I built this function (works fine), but I don't want to use a loop (including internal loops like sapply).
latin2ascii<-function(x) {
if(!is.character(x)) stop ("input must be a character object")
require(stringr)
mapL<-c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
mapA<-c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
for(y in 1:length(mapL)) {
x<-str_replace_all(x,mapL[y],mapA[y])
}
x
}
Is there an elegante way to solve it? Any help, suggestion or modification is appreciated
gsubfn() in the package of the same name is really nice for this sort of thing:
library(gsubfn)
# Create a named list, in which:
# - the names are the strings to be looked up
# - the values are the replacement strings
mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
# ll <- setNames(as.list(mapA), mapL) # An alternative to the 2 lines below
ll <- as.list(mapA)
names(ll) <- mapL
# Try it out
string <- "ÍÓáÚ"
gsubfn("[áéíóúÁÉÍÓÚñÑüÜ]", ll, string)
# [1] "IOaU"
Edit:
G. Grothendieck points out that base R also has a function for this:
A <- paste(mapA, collapse="")
L <- paste(mapL, collapse="")
chartr(L, A, "ÍÓáÚ")
# [1] "IOaU"
I like the version by Josh, but I thought I might add another 'vectorized' solution. It returns a vector of unaccented strings. It also only relies on the base functions.
x=c('íÁuÚ','uíÚÁ')
mapL<-c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
mapA<-c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
split=strsplit(x,split='')
m=lapply(split,match,mapL)
mapply(function(split,m) paste(ifelse(is.na(m),split,mapA[m]),collapse='') , split, m)
# "iAuU" "uiUA"

Resources