Unexpected behaviour of paste() inside glue() - r

paste(x, collapse = ',') returns a string of length 1. However this is not the case when it is evaluated within a glue() call. The help page of glue states clearly that "Expressions enclosed by braces will be evaluated as R code. " so I am a bit puzzled by this:
require(glue)
x = 1:3
y = paste(x, collapse = ',')
o1 = glue('{y}')
length(o1) #1
o2 = glue('{ paste(x, collapse = ',') }')
length(o2) #3
Why does o2 have a length of 3 instead of 1?

Because you mixed ` instead of using two kinds of quotes ".
Instead use :
o2 = glue('{ paste(x, collapse = ",") }')
length(o2)

Related

A way to strsplit and replace all of one character with several variations of alternate strings?

I am sure there is a simple solution and I am just getting too frustrated to work through it but here is the issue, simplified:
I have a string, ex: AB^AB^AB^^BAAA^^BABA^
I want to replace the ^s (so, 7 characters in the string), but iterate through many variants and be able to retain them all as strings
for example:
replacement 1: CCDCDCD to get: ABCABCABDCBAAADCBABAD
replacement 2: DDDCCCD to get: ABDABDABDCBAAACCBABAD
I imagine strsplit is the way, and I would like to do it in a for loop, any help would be appreciated!
The positions of the "^" can be found using gregexpr, see tmp
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")
tmp <- gregexpr(pattern = "^", text = x, fixed = TRUE)
You can then split the 'replacements' character by character using strsplit, this gives a list. Finally, iterate over that list and replace the "^" with the characters from your replacements one after the other.
sapply(strsplit(y, split = ""), function(i) {
`regmatches<-`("AB^AB^AB^^BAAA^^BABA^", m = tmp, value = i)
})
Result
# [1] "ABCABCABCCBAAACCBABAC" "ABDABDABDDBAAADDBABAD"
You don't really need a for loop. You can strplit your string and pattern, and then replace the "^" with the vector.
str <- unlist(strsplit(str, ""))
pat <- unlist(strsplit("CCDCDCD", ""))
str[str == "^"] <- pat
paste(str, collapse = "")
# [1] "ABCABCABDCBAAADCBABAD"
An option is also with gsubfn
f1 <- Vectorize(function(str1, str2) {
p <- proto(fun = function(this, x) substr(str2, count, count))
gsubfn::gsubfn("\\^", p, str1)
})
-testing
> unname(f1(x, y))
[1] "ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"
data
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")
Given x <- "AB^AB^AB^^BAAA^^BABA^" and y <- c("CCDCDCD", "DDDCCCD"), we can try utf8ToInt + intToUtf8 + replace like below
sapply(
y,
function(s) {
intToUtf8(
replace(
u <- utf8ToInt(x),
u == utf8ToInt("^"),
utf8ToInt(s)
)
)
}
)
which gives
CCDCDCD DDDCCCD
"ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"

Replacing text in gsub by evaluating a backreference

Let's say i have some text :
myF <- "lag.variable.1+1"
I would like to get for all similar expressions the following result : lag.variable.2 (that is replacing 1+1 by the actual sum
The following doesn't seem to work, it appears that the backreference doesnt carry through in the eval(parse() bit ):
myF<-gsub("(\\.\\w+)\\.([0-9]+\\+[0-9]+)",
paste0( "\\1." ,eval(parse(text ="\\2"))) ,
myF )
Any tips on how to achieve the desired result ?
Thanks!
Here is how you can use your current pattern with gsubfn:
library(gsubfn)
x <- " lag.variable0.3 * lag.variable1.1+1 + 9892"
p <- "(\\.\\w+)\\.([0-9]+\\+[0-9]+)"
gsubfn(p, function(n,m) paste0(n, ".", eval(parse(text = m))), x)
# => [1] " lag.variable0.3 * lag.variable1.2 + 9892"
Note the match is passed to the callable in this case where Group 1 is assigned to n variable and Group 2 is assigned to m. The return is a concatenation of Group 1, . and evaled Group 2 contents.
Note you may simplify the callable part using a PCRE regex (add perl=TRUE argument) \K, match reset operator that discards all text matched so far:
p <- "\\.\\w+\\.\\K(\\d+\\+\\d+)"
gsubfn(p, ~ eval(parse(text = z)), x, perl=TRUE)
[1] " lag.variable0.3 * lag.variable1.2 + 9892"
You may further enhance the pattern to support other operands by replacing \\+ with [-+/*] and if you need to support numbers with fractional parts, replace [0-9]+ with \\d*\\.?\\d+:
p <- "(\\.\\w+)\\.(\\d*\\.?\\d+[-+/*]\\d*\\.?\\d+)"
## or a PCRE regex:
p <- "\\.\\w+\\.\\K(\\d*\\.?\\d+[-+/*]\\d*\\.?\\d+)"
We can use gsubfn
library(gsubfn)
gsubfn("(\\d+\\+\\d+)", ~ eval(parse(text = x)), myF)
#[1] "lag.variable.2"
gsubfn("\\.([0-9]+\\+[0-9]+)", ~ paste0(".", eval(parse(text = x))), myF2)
#[1] "lag.variable0.3 * lag.variable1.2 + 9892"
Or with str_replace
library(stringr)
str_replace(myF, "(\\d+\\+\\d+)", function(x) eval(parse(text = x)))
#[1] "lag.variable.2"
Or an option with strsplit and paste
v1 <- strsplit(myF, "\\.(?=\\d)", perl = TRUE)[[1]]
paste(v1[1], eval(parse(text = v1[2])), sep=".")
#[1] "lag.variable.2"
data
myF <- "lag.variable.1+1"
myF2 <- "lag.variable0.3 * lag.variable1.1+1 + 9892"

paste0 is putting " in wrong place

I am creating columns of variables.
myVars=paste0("var",rep(1:5))
myVars
paste0(myVars,"=rnorm(5)")
output:
"var1=rnorm(5)" "var2=rnorm(5)" "var3=rnorm(5)" "var4=rnorm(5)"
"var5=rnorm(5)"
note the second quote should be after var1 as seen below.
I also want to paste in the comma seen in wanted output.
That should require something like paste0(A,B,C)
Want:
"var1"=rnorm(5), "var2"=rnorm(5), "var3"=rnorm(5), "var4"=rnorm(5),
"var5"=rnorm(5)
If we need to have double quotes around 'myVars', use dQuote with q = FALSE to avoid having the fancyquotes
out <- paste0(dQuote(myVars, q = FALSE), "=rnorm(5)")
cat(out, '\n')
#"var1"=rnorm(5) "var2"=rnorm(5) "var3"=rnorm(5) "var4"=rnorm(5) "var5"=rnorm(5)
if it should be a single string
out1 <- paste(dQuote(myVars, q = FALSE), "=rnorm(5)", sep="", collapse=", ")
cat(out1, '\n')
#"var1"=rnorm(5), "var2"=rnorm(5), "var3"=rnorm(5), "var4"=rnorm(5), "var5"=rnorm(5)

Replace multiple strings comprising of a different number of characters with one gsubfn()

Here Replace multiple strings in one gsub() or chartr() statement in R? it is explained to replace multiple strings of one character at in one statement with gsubfn(). E.g.:
x <- "doremi g-k"
gsubfn(".", list("-" = "_", " " = ""), x)
# "doremig_k"
I would however like to replace the string 'doremi' in the example with ''. This does not work:
x <- "doremi g-k"
gsubfn(".", list("-" = "_", "doremi" = ""), x)
# "doremi g_k"
I guess it is because of the fact that the string 'doremi' contains multiple characters and me using the metacharacter . in gsubfn. I have no idea what to replace it with - I must confess I find the use of metacharacters sometimes a bit difficult to udnerstand. Thus, is there a way for me to replace '-' and 'doremi' at once?
You might be able to just use base R sub here:
x <- "doremi g-k"
result <- sub("doremi\\s+([^-]+)-([^-]+)", "\\1_\\2", x)
result
[1] "g_k"
Does this work for you?
gsubfn::gsubfn(pattern = "doremi|-", list("-" = "_", "doremi" = ""), x)
[1] " g_k"
The key is this search: "doremi|-" which tells to search for either "doremi" or "-". Use "|" as the or operator.
Just a more generic solution to #RLave's solution -
toreplace <- list("-" = "_", "doremi" = "")
gsubfn(paste(names(toreplace),collapse="|"), toreplace, x)
[1] " g_k"

Extract a pattern before // and after || symbol

I am not very familiar with regex in R.
in a column I am trying to extract words before // and after || symbol. I.e. this is what I have in my column:
qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
This is what I want:
qtaro_269; qtaro_353; qtaro_375; qtaro_11
I found this: Extract character before and after "/" and this: Extract string before "|". However I don't know how to adjust it to my input. Any hint is much appreciated.
EDIT:
a qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
b
c qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
What about the following?
# Split by "||"
x2 <- unlist(strsplit(x, "\\|\\|"))
[1] "qtaro_269//qtaro_269" "qtaro_353//qtaro_353" "qtaro_375//qtaro_375" "qtaro_11//qtaro_11"
# Remove everything before and including "//"
gsub(".+//", "", x2)
[1] "qtaro_269" "qtaro_353" "qtaro_375" "qtaro_11"
And if you want it as one string with ; for separation:
paste(gsub(".+//", "", x2), collapse = "; ")
[1] "qtaro_269; qtaro_353; qtaro_375; qtaro_11"
This is how I solved it. For sure not the most intelligent and elegant way, so suggestions to improve it are welcome.
df <-unlist(lapply(strsplit(df[[2]],split="\\|\\|"), FUN = paste, collapse = "; "))
df <-unlist(lapply(strsplit(df[[2]],split="\\/\\/"), FUN = paste, collapse = "; "))
df <- sapply(strsplit(df$V2, "; ", fixed = TRUE), function(x) paste(unique(x), collapse = "; "))

Resources