Extract a pattern before // and after || symbol - r

I am not very familiar with regex in R.
In a column, I am trying to extract the words before the // and after the || symbols, i.e. this is what I have in my column:
qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
This is what I want:
qtaro_269; qtaro_353; qtaro_375; qtaro_11
I found this: Extract character before and after "/" and this: Extract string before "|". However, I don't know how to adapt them to my input. Any hint is much appreciated.
EDIT:
a qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
b
c qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11

What about the following?
# Split by "||"
x2 <- unlist(strsplit(x, "\\|\\|"))
[1] "qtaro_269//qtaro_269" "qtaro_353//qtaro_353" "qtaro_375//qtaro_375" "qtaro_11//qtaro_11"
# Remove everything before and including "//"
gsub(".+//", "", x2)
[1] "qtaro_269" "qtaro_353" "qtaro_375" "qtaro_11"
And if you want it as one string with ; for separation:
paste(gsub(".+//", "", x2), collapse = "; ")
[1] "qtaro_269; qtaro_353; qtaro_375; qtaro_11"
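Since the EDIT in the question shows a column that also contains an empty row, here is a minimal sketch of the same idea applied to a whole column. The data frame, the column names V1/V2 and the helper name extract_ids are assumptions made for illustration; they are not from the question.

```r
# Hypothetical data mirroring the question's EDIT (row "b" is empty)
df <- data.frame(
  V1 = c("a", "b", "c"),
  V2 = c("qtaro_269//qtaro_269||qtaro_353//qtaro_353", "", "qtaro_11//qtaro_11"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split each entry on "||", drop everything up to
# and including "//", then glue the pieces back together with "; "
extract_ids <- function(x) {
  parts <- strsplit(x, "||", fixed = TRUE)
  vapply(parts,
         function(p) paste(sub(".*//", "", p), collapse = "; "),
         character(1))
}

df$V2_clean <- extract_ids(df$V2)
df$V2_clean
# [1] "qtaro_269; qtaro_353" ""                     "qtaro_11"
```

Empty entries come back as empty strings, so the result keeps one element per row.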

This is how I solved it. For sure not the most intelligent and elegant way, so suggestions to improve it are welcome.
# Work on a copy of column 2 so that df itself stays a data frame
x <- unlist(lapply(strsplit(df[[2]], split = "\\|\\|"), FUN = paste, collapse = "; "))
x <- unlist(lapply(strsplit(x, split = "//"), FUN = paste, collapse = "; "))
df$V2 <- sapply(strsplit(x, "; ", fixed = TRUE), function(p) paste(unique(p), collapse = "; "))

Related

Pass vector as a chain of strings R

I have the following vector:
vectr <- c("LIBDISP1","LIBDISP2","LIBDISP3")
and I want it as a chain of strings to use in a sql query.
"'LIBDISP1','LIBDISP2','LIBDISP3'"
I tried the following:
text <- paste(as.character(vectr), collapse = ", ")
But it returns:
"LIBDISP1, LIBDISP2, LIBDISP3"
Any help will be greatly appreciated.
We can use sQuote with paste
paste(sQuote(vectr, FALSE), collapse=', ')
#[1] "'LIBDISP1', 'LIBDISP2', 'LIBDISP3'"
or with toString
toString(sQuote(vectr, FALSE))
We can use paste0 like this:
paste0("'", vectr, "'", collapse = ",")
#[1] "'LIBDISP1','LIBDISP2','LIBDISP3'"
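As a minimal sketch of how the quoted list might end up in a query string: the table name my_table and the column name lib are hypothetical, and doubling embedded single quotes is the usual SQL escaping convention rather than something from the question. For untrusted input, a parameterized query from your database driver is generally the safer route.

```r
vectr <- c("LIBDISP1", "LIBDISP2", "LIBDISP3")

# Double any embedded single quote, then wrap each value in single quotes
in_list <- paste0("'", gsub("'", "''", vectr, fixed = TRUE), "'", collapse = ",")
in_list
# [1] "'LIBDISP1','LIBDISP2','LIBDISP3'"

# my_table and lib are placeholder names for illustration only
query <- sprintf("SELECT * FROM my_table WHERE lib IN (%s)", in_list)
query
# [1] "SELECT * FROM my_table WHERE lib IN ('LIBDISP1','LIBDISP2','LIBDISP3')"
```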

Unexpected behaviour of paste() inside glue()

paste(x, collapse = ',') returns a string of length 1. However this is not the case when it is evaluated within a glue() call. The help page of glue states clearly that "Expressions enclosed by braces will be evaluated as R code. " so I am a bit puzzled by this:
require(glue)
x = 1:3
y = paste(x, collapse = ',')
o1 = glue('{y}')
length(o1) #1
o2 = glue('{ paste(x, collapse = ',') }')
length(o2) #3
Why does o2 have a length of 3 instead of 1?
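Because the single quotes around the comma close the outer single-quoted string, so ',' never reaches paste() as the collapse value. Use two different kinds of quotes, ' outside and " inside.
Instead use: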
o2 = glue('{ paste(x, collapse = ",") }')
length(o2)
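The quoting problem can be seen without glue by asking R how it parses the call. This is only a sketch: f is a hypothetical stand-in for glue(), and parse() merely tokenizes the code without evaluating it.

```r
# f is a hypothetical stand-in for glue(); nothing is evaluated here
code <- "f('{ paste(x, collapse = ',') }')"
args <- as.list(parse(text = code)[[1]])[-1]

length(args)   # 2: the template was split into two string arguments
args[[1]]      # "{ paste(x, collapse = "
args[[2]]      # ") }"

# With double quotes inside, the template stays a single string
code_ok <- "f('{ paste(x, collapse = \",\") }')"
args_ok <- as.list(parse(text = code_ok)[[1]])[-1]
length(args_ok)  # 1
```

In the first call the comma acts as an argument separator, which is why the collapse value is lost.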

R - Construct a string with double quotations

I basically need the outcome (a string) to contain double quotation marks, hence the need for an escape character. Preferably solved with base R, without extra R packages.
I have tried sQuote, shQuote and noquote. They just manipulate the quotation marks, not the escape character.
My list:
power <- "test"
myList <- list (
"power" = power)
I subset the content using:
myList
myList$power
Expected outcome (a string with following content):
" \"power\": \"test\" "
Using package glue:
library(glue)
glue(' "{names(myList)}": "{myList}" ')
"power": "test"
Another option using shQuote
paste(shQuote(names(myList), type = "cmd"),
shQuote(unlist(myList), type = "cmd"),
sep = ": ")
# [1] "\"power\": \"test\""
I'm not sure I understand what you expect. Is this what you want?
myList <- list (
"power" = "test"
)
stringr::str_remove_all(
as.character(jsonlite::toJSON(myList, auto_unbox = TRUE)),
"[\\{|\\}]")
# [1] "\"power\":\"test\""
If you want some spaces:
x <- stringr::str_remove_all(
as.character(jsonlite::toJSON(myList, auto_unbox = TRUE)),
"[\\{|\\}]")
paste0(" ", x, " ")
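A base R sketch with sprintf may also be worth trying. It assumes a one-element list as in the question. Note that the backslashes are only part of how print() displays the string; cat() shows the actual characters.

```r
myList <- list(power = "test")

# Single quotes outside, so the double quotes inside need no escaping
out <- sprintf(' "%s": "%s" ', names(myList), unlist(myList))

print(out)
# [1] " \"power\": \"test\" "

cat(out, "\n")
#  "power": "test"
```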

Concatenating strings in R repeatedly

Suppose I build mystring in a for loop as follows in R:
mystring = ""
colorIndex = 17
for(i in 1:ncol(myTable)){
mystring = paste(mystring, paste("$('td:eq(",i, ")', nRow).attr('title', full_text);", sep = ""))
mystring = paste(mystring, paste("$('td:eq(",i,")', nRow).css('cursor', 'pointer');", sep = ""))
mystring = paste(mystring, "if(aData[",colorIndex,"] == 0){
$(nRow).css('background-color','#f8f8ff')
}else if(aData[",colorIndex,"]==1){
$(nRow).css('background-color','#9EFAC5')
}else{
$(nRow).css('background-color','#FAF99E')
};", sep ="")
}
Now, suppose my table had 60 columns. I'm trying to figure out the easiest way to do this. Do I need to make one large string with a special character and then grep out the character? I can't work out how to iterate over i efficiently. However, given how slow R is with strings, I would prefer not to do this in a loop.
You don't need a loop at all because paste is vectorized:
i <- 1:ncol(myTable)
yourstring <-
paste(
paste0(
paste0(" ", "$('td:eq(",i, ")', nRow).attr('title', full_text);"),
" ",
paste0("$('td:eq(",i,")', nRow).css('cursor', 'pointer');"),
"if(aData[",colorIndex,"] == 0){
$(nRow).css('background-color','#f8f8ff')
}else if(aData[",colorIndex,"]==1){
$(nRow).css('background-color','#9EFAC5')
}else{
$(nRow).css('background-color','#FAF99E')
};"
),
collapse = "")
Maybe you could use glue for this, because it makes things look cleaner, and put combinations in a data frame in advance, such that you don't need the loop:
myTable <- iris
mystring <- "your string with some glue-elements in it: i = {paste_df$i} and colorIndex = {paste_df$colorIndex}"
paste_df <- data.frame(i = seq_len(ncol(myTable)), colorIndex = 17)
string <- glue::glue(mystring)
# or, a little messy but the same, with paste0:
string <- paste0("your string with some glue-elements in it: i = ",
paste_df$i, " and colorIndex = ", paste_df$colorIndex)
# and in the end, collapse the string:
paste0(string, collapse = "")
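Another vectorized option is sprintf, which recycles its format string over a vector. The sketch below covers only the two per-column statements; n_cols is a hypothetical stand-in for ncol(myTable), and the color block could be appended in the same way.

```r
n_cols <- 3   # hypothetical stand-in for ncol(myTable)
i <- seq_len(n_cols)

# One chunk per column; sprintf is vectorized over i
per_col <- sprintf(
  " $('td:eq(%d)', nRow).attr('title', full_text); $('td:eq(%d)', nRow).css('cursor', 'pointer');",
  i, i
)

# Collapse the per-column chunks into a single string
mystring <- paste(per_col, collapse = "")
```

Keeping the template in one format string makes the JavaScript easier to read than nested paste calls.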

String remove from n-th last separator to the end

I have the following string:
data_string = c("Aa_Bbbbb_0_ID1",
"Aa_Bbbbb_0_ID2",
"Aa_Bbbbb_0_ID3",
"Ccccc_D_EEE_0_ID1")
I just want to trim each string to get these results:
"Aa_Bbbbb"
"Aa_Bbbbb"
"Aa_Bbbbb"
"Ccccc_D_EEE"
So basically, I'm looking for a function that takes data_string, a separator, and the number of trailing pieces to drop:
remove_tail(data_string, sep = '_', del = 2)
It should only remove the tail from the 2nd-last separator to the end of the string (not split the whole string).
Try below:
# split on "_" then paste back removing last 2
sapply(strsplit(data_string, "_", fixed = TRUE),
function(i) paste(head(i, -2), collapse = "_"))
We can make our own function:
# custom function
remove_tail <- function(x, sep = "_", del = 2){
sapply(strsplit(x, split = sep, fixed = TRUE),
function(i) paste(head(i, -del), collapse = sep))
}
remove_tail(data_string, sep = '_', del = 2)
# [1] "Aa_Bbbbb" "Aa_Bbbbb" "Aa_Bbbbb" "Ccccc_D_EEE"
Using gsub
gsub("_0_.*","",data_string)
We can also use sub to match the _ followed by one or more digits (\\d+) and the rest of the characters, and replace it with blank ("")
sub("_\\d+.*", "", data_string)
#[1] "Aa_Bbbbb" "Aa_Bbbbb" "Aa_Bbbbb" "Ccccc_D_EEE"
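The two regex answers rely on the tail starting with _0. A more general sketch builds the pattern from sep and del instead. The function name remove_tail_rx is made up for illustration, and it assumes sep is a single character with no special meaning inside a regex character class.

```r
data_string <- c("Aa_Bbbbb_0_ID1", "Aa_Bbbbb_0_ID2",
                 "Aa_Bbbbb_0_ID3", "Ccccc_D_EEE_0_ID1")

# Hypothetical helper: drop the last `del` separator-delimited pieces.
# For sep = "_" and del = 2 the pattern is "(_[^_]*){2}$"
remove_tail_rx <- function(x, sep = "_", del = 2) {
  pattern <- sprintf("(%s[^%s]*){%d}$", sep, sep, del)
  sub(pattern, "", x)
}

remove_tail_rx(data_string, sep = "_", del = 2)
# [1] "Aa_Bbbbb"    "Aa_Bbbbb"    "Aa_Bbbbb"    "Ccccc_D_EEE"
```

Unlike the strsplit version, strings with fewer than del separators are returned unchanged rather than as an empty string.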
