paste(x, collapse = ',') returns a character vector of length 1. However, this is not the case when it is evaluated within a glue() call. The help page of glue states clearly that "Expressions enclosed by braces will be evaluated as R code.", so I am a bit puzzled by this:
require(glue)
x = 1:3
y = paste(x, collapse = ',')
o1 = glue('{y}')
length(o1) #1
o2 = glue('{ paste(x, collapse = ',') }')
length(o2) #3
Why does o2 have a length of 3 instead of 1?
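Because you used single quotes both around the glue string and inside it, instead of using two kinds of quotes (' and "). The inner quotes end the outer string early, so glue() receives two strings, '{ paste(x, collapse = ' and ') }', which it concatenates into "{ paste(x, collapse = ) }". The collapse argument is then empty, so paste() returns a vector of length 3.
Instead use: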
o2 = glue('{ paste(x, collapse = ",") }')
length(o2) #1
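To see how R tokenizes the original call, here is a minimal base R sketch. It assumes a placeholder function name f (not part of glue) and imitates glue's concatenation of unnamed arguments with paste0, so it illustrates the parsing rather than glue's internals.

```r
# f is a placeholder name; quote() captures the call without evaluating it.
captured <- quote(f('{ paste(x, collapse = ',') }'))
args <- as.list(captured)[-1]
length(args)   # 2: the inner quotes split the text into two string literals

# Imitate glue(), which joins its unnamed arguments before interpolating.
joined <- paste0(args[[1]], args[[2]])
joined         # "{ paste(x, collapse = ) }"

# Evaluate the code between the braces: collapse is empty, so its default (NULL) applies.
x <- 1:3
inner <- sub("^\\{(.*)\\}$", "\\1", joined)
res <- eval(parse(text = inner))
length(res)    # 3
```

Switching the inner quotes to double quotes keeps the whole expression inside one string literal, which is why the corrected call returns length 1.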
I am sure there is a simple solution and I am just getting too frustrated to work through it but here is the issue, simplified:
I have a string, ex: AB^AB^AB^^BAAA^^BABA^
I want to replace the ^s (so, 7 characters in the string), but iterate through many variants and be able to retain them all as strings
for example:
replacement 1: CCDCDCD to get: ABCABCABDCBAAADCBABAD
replacement 2: DDDCCCD to get: ABDABDABDCBAAACCBABAD
I imagine strsplit is the way, and I would like to do it in a for loop, any help would be appreciated!
The positions of the "^" can be found using gregexpr (stored in tmp below):
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")
tmp <- gregexpr(pattern = "^", text = x, fixed = TRUE)
You can then split the replacements character by character using strsplit, which gives a list. Finally, iterate over that list and replace the "^" with the characters from your replacements one after the other. Because tmp comes from gregexpr, the replacement value must itself be a list, so each character vector is wrapped in list(); otherwise only its first character is recycled across all matches.
sapply(strsplit(y, split = ""), function(i) {
`regmatches<-`(x, m = tmp, value = list(i))
})
Result
# [1] "ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"
You don't really need a for loop. You can strsplit your string and your replacement, and then replace the "^" elements with the replacement vector.
chars <- unlist(strsplit(x, ""))
pat <- unlist(strsplit("CCDCDCD", ""))
chars[chars == "^"] <- pat
paste(chars, collapse = "")
# [1] "ABCABCABDCBAAADCBABAD"
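To run the same idea over several replacement strings at once, it could be wrapped in a small helper. This is a sketch in base R; the function name fill_carets and the length check are my own additions, not part of the answer above.

```r
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")

# fill_carets is a hypothetical helper: it fills each "^" in `template`
# with the next character of `replacement`.
fill_carets <- function(replacement, template) {
  chars <- strsplit(template, "")[[1]]
  fill <- strsplit(replacement, "")[[1]]
  slots <- chars == "^"
  # Guard against silent recycling when the lengths do not match.
  stopifnot(sum(slots) == length(fill))
  chars[slots] <- fill
  paste(chars, collapse = "")
}

res <- vapply(y, fill_carets, character(1), template = x, USE.NAMES = FALSE)
res
# [1] "ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"
```

vapply keeps every variant as a separate string, which matches the request to retain them all.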
Another option is gsubfn (proto() is available once gsubfn is loaded):
library(gsubfn)
f1 <- Vectorize(function(str1, str2) {
p <- proto(fun = function(this, x) substr(str2, count, count))
gsubfn::gsubfn("\\^", p, str1)
})
Testing:
unname(f1(x, y))
# [1] "ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"
data
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")
Given x <- "AB^AB^AB^^BAAA^^BABA^" and y <- c("CCDCDCD", "DDDCCCD"), we can try utf8ToInt + intToUtf8 + replace like below
sapply(
y,
function(s) {
intToUtf8(
replace(
u <- utf8ToInt(x),
u == utf8ToInt("^"),
utf8ToInt(s)
)
)
}
)
which gives
CCDCDCD DDDCCCD
"ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"
Let's say I have some text:
myF <- "lag.variable.1+1"
For all similar expressions I would like to get the following result: lag.variable.2 (that is, replacing 1+1 with the actual sum).
The following doesn't seem to work; it appears that the backreference doesn't carry through into the eval(parse()) bit:
myF<-gsub("(\\.\\w+)\\.([0-9]+\\+[0-9]+)",
paste0( "\\1." ,eval(parse(text ="\\2"))) ,
myF )
Any tips on how to achieve the desired result ?
Thanks!
Here is how you can use your current pattern with gsubfn:
library(gsubfn)
x <- " lag.variable0.3 * lag.variable1.1+1 + 9892"
p <- "(\\.\\w+)\\.([0-9]+\\+[0-9]+)"
gsubfn(p, function(n,m) paste0(n, ".", eval(parse(text = m))), x)
# => [1] " lag.variable0.3 * lag.variable1.2 + 9892"
Note that the capture groups are passed to the callback: Group 1 is assigned to n and Group 2 to m. The return value is the concatenation of Group 1, a dot, and the evaluated contents of Group 2.
You may simplify the callback by using a PCRE regex (add the perl=TRUE argument) with \K, the match reset operator, which discards all text matched so far from the reported match:
p <- "\\.\\w+\\.\\K(\\d+\\+\\d+)"
gsubfn(p, ~ eval(parse(text = z)), x, perl=TRUE)
# => [1] " lag.variable0.3 * lag.variable1.2 + 9892"
You may further enhance the pattern to support other operators by replacing \\+ with [-+/*], and if you need to support numbers with fractional parts, replace [0-9]+ with \\d*\\.?\\d+:
p <- "(\\.\\w+)\\.(\\d*\\.?\\d+[-+/*]\\d*\\.?\\d+)"
## or a PCRE regex:
p <- "\\.\\w+\\.\\K(\\d*\\.?\\d+[-+/*]\\d*\\.?\\d+)"
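If gsubfn is not available, the same match-then-compute idea can be sketched in base R with gregexpr and regmatches<-. This version makes two simplifying assumptions of my own: it only requires a dot before the sum (a lookbehind instead of the full \\.\\w+\\. prefix), and it handles only additions, so it adds the operands directly rather than calling eval(parse()).

```r
x <- " lag.variable0.3 * lag.variable1.1+1 + 9892"

# Match "digits+digits" only when directly preceded by a dot.
m <- gregexpr("(?<=\\.)\\d+\\+\\d+", x, perl = TRUE)

# Replace each match with the sum of its two operands.
regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
  vapply(strsplit(hits, "+", fixed = TRUE),
         function(parts) as.character(sum(as.numeric(parts))),
         character(1))
})
x
# [1] " lag.variable0.3 * lag.variable1.2 + 9892"
```

Summing the parsed operands avoids evaluating arbitrary text, at the cost of supporting one operator only.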
We can use gsubfn
library(gsubfn)
gsubfn("(\\d+\\+\\d+)", ~ eval(parse(text = x)), myF)
#[1] "lag.variable.2"
gsubfn("\\.([0-9]+\\+[0-9]+)", ~ paste0(".", eval(parse(text = x))), myF2)
#[1] "lag.variable0.3 * lag.variable1.2 + 9892"
Or with str_replace
library(stringr)
str_replace(myF, "(\\d+\\+\\d+)", function(x) as.character(eval(parse(text = x))))
#[1] "lag.variable.2"
Or an option with strsplit and paste
v1 <- strsplit(myF, "\\.(?=\\d)", perl = TRUE)[[1]]
paste(v1[1], eval(parse(text = v1[2])), sep=".")
#[1] "lag.variable.2"
data
myF <- "lag.variable.1+1"
myF2 <- "lag.variable0.3 * lag.variable1.1+1 + 9892"
I am creating columns of variables.
myVars=paste0("var",rep(1:5))
myVars
paste0(myVars,"=rnorm(5)")
output:
"var1=rnorm(5)" "var2=rnorm(5)" "var3=rnorm(5)" "var4=rnorm(5)"
"var5=rnorm(5)"
note the second quote should be after var1 as seen below.
I also want to paste in the comma seen in wanted output.
That should require something like paste0(A,B,C)
Want:
"var1"=rnorm(5), "var2"=rnorm(5), "var3"=rnorm(5), "var4"=rnorm(5),
"var5"=rnorm(5)
If we need double quotes around the elements of 'myVars', use dQuote with q = FALSE to avoid the fancy (curly) quotes:
out <- paste0(dQuote(myVars, q = FALSE), "=rnorm(5)")
cat(out, '\n')
#"var1"=rnorm(5) "var2"=rnorm(5) "var3"=rnorm(5) "var4"=rnorm(5) "var5"=rnorm(5)
If it should be a single string:
out1 <- paste(dQuote(myVars, q = FALSE), "=rnorm(5)", sep="", collapse=", ")
cat(out1, '\n')
#"var1"=rnorm(5), "var2"=rnorm(5), "var3"=rnorm(5), "var4"=rnorm(5), "var5"=rnorm(5)
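As an alternative sketch, sprintf can place the quotes directly through its format string, which avoids dQuote altogether. It assumes the same myVars vector as in the question.

```r
myVars <- paste0("var", 1:5)

# %s is replaced by each name; the double quotes are literal parts of the format.
pieces <- sprintf('"%s"=rnorm(5)', myVars)

# Collapse to a single comma-separated string.
out2 <- paste(pieces, collapse = ", ")
cat(out2, "\n")
# "var1"=rnorm(5), "var2"=rnorm(5), "var3"=rnorm(5), "var4"=rnorm(5), "var5"=rnorm(5)
```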
The question "Replace multiple strings in one gsub() or chartr() statement in R?" explains how to replace several single-character strings in one statement with gsubfn(). E.g.:
x <- "doremi g-k"
gsubfn(".", list("-" = "_", " " = ""), x)
# "doremig_k"
I would however like to replace the string 'doremi' in the example with ''. This does not work:
x <- "doremi g-k"
gsubfn(".", list("-" = "_", "doremi" = ""), x)
# "doremi g_k"
I guess it is because the string 'doremi' contains multiple characters while I am using the metacharacter . in gsubfn. I have no idea what to replace it with. I must confess I sometimes find the use of metacharacters a bit difficult to understand. Thus, is there a way for me to replace '-' and 'doremi' at once?
You might be able to just use base R sub here:
x <- "doremi g-k"
result <- sub("doremi\\s+([^-]+)-([^-]+)", "\\1_\\2", x)
result
[1] "g_k"
Does this work for you?
gsubfn::gsubfn(pattern = "doremi|-", list("-" = "_", "doremi" = ""), x)
[1] " g_k"
The key is this search: "doremi|-" which tells to search for either "doremi" or "-". Use "|" as the or operator.
Just a more generic version of @RLave's solution:
toreplace <- list("-" = "_", "doremi" = "")
gsubfn(paste(names(toreplace),collapse="|"), toreplace, x)
[1] " g_k"
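For comparison, a base R sketch without gsubfn could apply the replacements one after another with Reduce and fixed-string matching. Two caveats that are my own observations, not part of the answers above: the replacements run sequentially rather than in a single pass, so a later pattern can also match text produced by an earlier replacement, and fixed = TRUE means the names are treated literally rather than as regular expressions.

```r
x <- "doremi g-k"
toreplace <- c("-" = "_", "doremi" = "")

# Apply each name -> value replacement in turn, starting from x.
res <- Reduce(
  function(s, i) gsub(names(toreplace)[i], toreplace[[i]], s, fixed = TRUE),
  seq_along(toreplace),
  x
)
res
# [1] " g_k"
```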
I am not very familiar with regex in R.
In a column I am trying to extract the words before the // symbol and after the || symbol. This is what I have in my column:
qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
This is what I want:
qtaro_269; qtaro_353; qtaro_375; qtaro_11
I found this: Extract character before and after "/" and this: Extract string before "|". However I don't know how to adjust it to my input. Any hint is much appreciated.
EDIT:
a qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
b
c qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
What about the following?
# Split by "||"
x2 <- unlist(strsplit(x, "\\|\\|"))
[1] "qtaro_269//qtaro_269" "qtaro_353//qtaro_353" "qtaro_375//qtaro_375" "qtaro_11//qtaro_11"
# Remove everything before and including "//"
gsub(".+//", "", x2)
[1] "qtaro_269" "qtaro_353" "qtaro_375" "qtaro_11"
And if you want it as one string with ; for separation:
paste(gsub(".+//", "", x2), collapse = "; ")
[1] "qtaro_269; qtaro_353; qtaro_375; qtaro_11"
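Since the EDIT in the question shows a column with several rows, some of them empty, the same two steps can be sketched per element with vapply. The vector col below is a shortened, made-up stand-in for the real column.

```r
# Hypothetical stand-in for the column, including an empty entry.
col <- c("qtaro_269//qtaro_269||qtaro_353//qtaro_353",
         "",
         "qtaro_11//qtaro_11")

res <- vapply(
  strsplit(col, "||", fixed = TRUE),           # split each entry at "||"
  function(parts) {
    # keep the text before "//" in every piece, then rejoin
    paste(sub("//.*$", "", parts), collapse = "; ")
  },
  character(1)
)
res
# [1] "qtaro_269; qtaro_353" ""                     "qtaro_11"
```

Empty entries stay empty because strsplit returns a zero-length vector for them, and paste with collapse turns that back into "".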
This is how I solved it. It is surely not the smartest or most elegant way, so suggestions to improve it are welcome. The column of interest is the second one, V2:
# Replace "||" and then "//" with "; "
df$V2 <- unlist(lapply(strsplit(df$V2, split = "\\|\\|"), FUN = paste, collapse = "; "))
df$V2 <- unlist(lapply(strsplit(df$V2, split = "//"), FUN = paste, collapse = "; "))
# Drop the duplicated names within each row
df$V2 <- sapply(strsplit(df$V2, "; ", fixed = TRUE), function(x) paste(unique(x), collapse = "; "))