Concatenating strings in R repeatedly - r

Suppose I build mystring in a for loop in R as follows:
mystring = ""
colorIndex = 17
for(i in 1:ncol(myTable)){
mystring = paste(mystring, paste("$('td:eq(",i, ")', nRow).attr('title', full_text);", sep = ""))
mystring = paste(mystring, paste("$('td:eq(",i,")', nRow).css('cursor', 'pointer');", sep = ""))
mystring = paste(mystring, "if(aData[",colorIndex,"] == 0){
$(nRow).css('background-color','#f8f8ff')
}else if(aData[",colorIndex,"]==1){
$(nRow).css('background-color','#9EFAC5')
}else{
$(nRow).css('background-color','#FAF99E')
};", sep ="")
}
Now suppose my table has 60 columns. I'm trying to figure out the easiest way to do this. Do I need to make one large string with a special placeholder character and then substitute for that character? What is throwing me is how to iterate over i efficiently. Given how slow R is with strings, I would prefer not to do this in a loop.

You don't need a loop at all because paste is vectorized:
i <- 1:ncol(myTable)
yourstring <-
paste(
paste0(
paste0(" ", "$('td:eq(",i, ")', nRow).attr('title', full_text);"),
" ",
paste0("$('td:eq(",i,")', nRow).css('cursor', 'pointer');"),
"if(aData[",colorIndex,"] == 0){
$(nRow).css('background-color','#f8f8ff')
}else if(aData[",colorIndex,"]==1){
$(nRow).css('background-color','#9EFAC5')
}else{
$(nRow).css('background-color','#FAF99E')
};"
),
collapse = "")
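As a small illustration of the same idea, here is a minimal sketch using sprintf(), which is vectorized over i in the same way paste0() is. The three-column range and the names cells and js are made up for this example; only the jQuery snippet comes from the question.

```r
# Hypothetical stand-in for 1:ncol(myTable)
i <- 1:3

# sprintf() recycles the template over every element of i, so no loop is needed
cells <- sprintf("$('td:eq(%d)', nRow).css('cursor', 'pointer');", i)

# Collapse the per-column snippets into a single string
js <- paste(cells, collapse = " ")
cat(js, "\n")
```

The intermediate vector cells has one element per column, so it is easy to inspect before collapsing.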

You could also use glue for this, because it makes things look cleaner, and put the combinations in a data frame in advance so that you don't need the loop:
myTable <- iris
mystring <- "your string with some glue-elements in it: i = {paste_df$i} and colorIndex = {paste_df$colorIndex}"
paste_df <- data.frame(i = seq_len(ncol(myTable)), colorIndex = 17)
string <- glue::glue(mystring)
# or, a little messy but the same, with paste0:
string <- paste0("your string with some glue-elements in it: i = ",
paste_df$i, " and colorIndex = ", paste_df$colorIndex)
# and in the end, collapse the string:
paste0(string, collapse = "")

Related

A way to strsplit and replace all of one character with several variations of alternate strings?

I am sure there is a simple solution and I am just getting too frustrated to work through it but here is the issue, simplified:
I have a string, ex: AB^AB^AB^^BAAA^^BABA^
I want to replace the ^s (so, 7 characters in the string), but iterate through many variants and be able to retain them all as strings
for example:
replacement 1: CCDCDCD to get: ABCABCABDCBAAADCBABAD
replacement 2: DDDCCCD to get: ABDABDABDCBAAACCBABAD
I imagine strsplit is the way, and I would like to do it in a for loop, any help would be appreciated!
The positions of the "^" can be found using gregexpr, see tmp
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")
tmp <- gregexpr(pattern = "^", text = x, fixed = TRUE)
You can then split the 'replacements' character by character using strsplit, this gives a list. Finally, iterate over that list and replace the "^" with the characters from your replacements one after the other.
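sapply(strsplit(y, split = ""), function(i) {
# value must be a list with one element per string in x; passing the bare
# character vector would recycle only its first character
`regmatches<-`(x, m = tmp, value = list(i))
})
Result
# [1] "ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"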
You don't really need a for loop. You can strsplit your string and your replacement, and then replace the "^" elements with the replacement vector.
str <- unlist(strsplit("AB^AB^AB^^BAAA^^BABA^", ""))
pat <- unlist(strsplit("CCDCDCD", ""))
str[str == "^"] <- pat
paste(str, collapse = "")
# [1] "ABCABCABDCBAAADCBABAD"
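To apply this to several replacement strings at once, the split-and-assign step can be wrapped in a small helper. This is only a sketch: fill_carets is a hypothetical name, and the length check is an added assumption that each replacement supplies exactly one character per "^".

```r
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")

# Hypothetical helper: fill the "^" slots of template with the characters
# of replacement, in order
fill_carets <- function(template, replacement) {
  chars <- strsplit(template, "")[[1]]
  repl  <- strsplit(replacement, "")[[1]]
  slots <- chars == "^"
  # Assumption: one replacement character per "^"
  stopifnot(sum(slots) == length(repl))
  chars[slots] <- repl
  paste(chars, collapse = "")
}

# One result string per element of y
result <- vapply(y, fill_carets, character(1), template = x, USE.NAMES = FALSE)
print(result)
```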
Another option is gsubfn:
library(gsubfn)  # attaches proto as a dependency
f1 <- Vectorize(function(str1, str2) {
# gsubfn sets count to the index of the current match
p <- proto(fun = function(this, x) substr(str2, count, count))
gsubfn::gsubfn("\\^", p, str1)
})
Testing:
> unname(f1(x, y))
[1] "ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"
data
x <- "AB^AB^AB^^BAAA^^BABA^"
y <- c("CCDCDCD", "DDDCCCD")
Given x <- "AB^AB^AB^^BAAA^^BABA^" and y <- c("CCDCDCD", "DDDCCCD"), we can try utf8ToInt + intToUtf8 + replace like below
sapply(
y,
function(s) {
intToUtf8(
replace(
u <- utf8ToInt(x),
u == utf8ToInt("^"),
utf8ToInt(s)
)
)
}
)
which gives
CCDCDCD DDDCCCD
"ABCABCABDCBAAADCBABAD" "ABDABDABDCBAAACCBABAD"

How can I get the output of this function to print onto different lines in R?

So, I am writing a function that, among many other things, is supposed to keep only the first sentence from each paragraph of a text and preserve the paragraph structure (i.e. each sentence is in its own line). Here is the code that I have so far:
text_shortener <- function(input_text) {
lapply(input_text, function(x)str_split(x, "\\.", simplify = T)[1])
first.sentences <- unlist(lapply(input_text, function(x)str_split(x, "\\.", simplify = T)[1]))
no.spaces <- gsub(pattern = "(?<=[\\s])\\s*|^\\s+|\\s+$", replacement = "", x = first.sentences, perl = TRUE)
stopwords <- c("the", "really", "truly", "very", "The", "Really", "Truly", "Very")
x <- unlist(strsplit(no.spaces, " "))
no.stopwords <- paste(x[!x %in% stopwords], collapse = " ")
final.text <- gsub(pattern = "(?<=\\w{5})\\w+", replacement = ".", x = no.stopwords, perl=TRUE)
return(final.text)
}
All of the functions are working as they should, but the one part I can't figure out is how to get the output to print onto separate lines. When I run the function with a vector of text (I was using some text from Moby Dick as a test), this is what I get:
> text_shortener(Moby_Dick)
[1] "Call me Ishma. It is a way I have of drivi. off splee., and regul. circu. This is my subst. for pisto. and ball"
What I want is for the output of this function to look like this:
[1] "Call me Ishma."
[2] "It is a way I have of drivi. off splee., and regul. circu."
[3] "This is my subst. for pisto. and ball"
I am relatively new to R and this is giving me a real headache, so any help would be much appreciated! Thank you!
Looking at your output, it seems like splitting on a period followed by a capital letter is what you need.
You could accomplish that with strsplit() and split the string up like so:
strsplit("Call me Ishma. It is drivi. off splee., and regul. circu. This is my subst. for pisto.","\\. (?=[A-Z])", perl=T)
That finds instances where a period is followed by a space and a capital letter and splits the string there.
Edit: You could add it to the end of your function like so:
library(stringr)  # for str_split()
text_shortener <- function(input_text) {
# keep only the text before the first period of each paragraph
first.sentences <- unlist(lapply(input_text, function(x)str_split(x, "\\.", simplify = T)[1]))
no.spaces <- gsub(pattern = "(?<=[\\s])\\s*|^\\s+|\\s+$", replacement = "", x = first.sentences, perl = TRUE)
stopwords <- c("the", "really", "truly", "very", "The", "Really", "Truly", "Very")
x <- unlist(strsplit(no.spaces, " "))
no.stopwords <- paste(x[!x %in% stopwords], collapse = " ")
trim.text <- gsub(pattern = "(?<=\\w{5})\\w+", replacement = ".", x = no.stopwords, perl=TRUE)
# strsplit() returns a list; unlist() gives one vector element per sentence
final.text <- unlist(strsplit(trim.text, "\\. (?=[A-Z])", perl=T))
return(final.text)
}
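Note that splitting on "\\. (?=[A-Z])" consumes the period along with the space. If the trailing period should stay on each sentence, as in the desired output, one possible variation is to match only the space and check for the period with a lookbehind. A minimal sketch, using the question's sample output as input (the names trimmed and sentences are made up here):

```r
trimmed <- "Call me Ishma. It is a way I have of drivi. off splee., and regul. circu. This is my subst. for pisto. and ball"

# Split on a space that is preceded by a period and followed by a capital
# letter; the lookarounds consume nothing, so the period is kept
sentences <- strsplit(trimmed, "(?<=\\.) (?=[A-Z])", perl = TRUE)[[1]]
print(sentences)
```

The [[1]] picks the character vector out of the list that strsplit() returns, so print() shows one sentence per element.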

R - Construct a string with double quotations

I basically need the resulting string to contain double quotation marks, hence the need for an escape character. Preferably I would solve this with base R, without extra packages.
I have tried sQuote, shQuote and noquote. They just manipulate the quotation marks, not the escape character.
My list:
power <- "test"
myList <- list (
"power" = power)
I subset the content using:
myList
myList$power
Expected outcome (a string with following content):
" \"power\": \"test\" "
Using package glue:
library(glue)
glue(' "{names(myList)}": "{myList}" ')
# "power": "test"
Another option using shQuote
paste(shQuote(names(myList), type = "cmd"),
shQuote(unlist(myList), type = "cmd"),
sep = ": ")
# [1] "\"power\": \"test\""
I'm not sure I understand what you expect. Is this what you want?
myList <- list (
"power" = "test"
)
stringr::str_remove_all(
as.character(jsonlite::toJSON(myList, auto_unbox = TRUE)),
"[\\{|\\}]")
# [1] "\"power\":\"test\""
If you want some spaces:
x <- stringr::str_remove_all(
as.character(jsonlite::toJSON(myList, auto_unbox = TRUE)),
"[\\{|\\}]")
paste0(" ", x, " ")
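Since the question asks for base R, here is a minimal sketch with sprintf(). It relies on the fact that a string written with single quotes can contain double quotes directly; the backslashes only appear when R prints the value. The name out is made up for the example.

```r
myList <- list(power = "test")

# Single-quoted template, so the inner double quotes need no escaping here
out <- sprintf(' "%s": "%s" ', names(myList), unlist(myList, use.names = FALSE))

print(out)      # print() shows the escaped form with backslashes
cat(out, "\n")  # cat() shows the raw characters
```

The escapes are a display convention of print(), not extra characters in the string, which nchar(out) can confirm.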

Extract a pattern before // and after || symbol

I am not very familiar with regex in R.
In a column I am trying to extract the words before the // and after the || symbols, i.e. this is what I have in my column:
qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
This is what I want:
qtaro_269; qtaro_353; qtaro_375; qtaro_11
I found this: Extract character before and after "/" and this: Extract string before "|". However I don't know how to adjust it to my input. Any hint is much appreciated.
EDIT:
a qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
b
c qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11
What about the following?
x <- "qtaro_269//qtaro_269||qtaro_353//qtaro_353||qtaro_375//qtaro_375||qtaro_11//qtaro_11"
# Split by "||"
x2 <- unlist(strsplit(x, "\\|\\|"))
x2
# [1] "qtaro_269//qtaro_269" "qtaro_353//qtaro_353" "qtaro_375//qtaro_375" "qtaro_11//qtaro_11"
# Remove everything up to and including "//"
gsub(".+//", "", x2)
# [1] "qtaro_269" "qtaro_353" "qtaro_375" "qtaro_11"
And if you want it as one string with ; for separation:
paste(gsub(".+//", "", x2), collapse = "; ")
# [1] "qtaro_269; qtaro_353; qtaro_375; qtaro_11"
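To run this over a whole column, including the empty row from the EDIT, the two steps can be wrapped in a helper and applied per row. This is a sketch under assumptions: the data frame layout, the column names id and V2, the shortened sample strings and the helper name extract_ids are all made up for illustration.

```r
# Hypothetical data frame shaped like the EDIT in the question
df <- data.frame(
  id = c("a", "b", "c"),
  V2 = c("qtaro_269//qtaro_269||qtaro_353//qtaro_353",
         "",
         "qtaro_375//qtaro_375||qtaro_11//qtaro_11"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split one string on "||", keep the part before "//"
extract_ids <- function(s) {
  pairs <- strsplit(s, "||", fixed = TRUE)[[1]]
  paste(sub("//.*$", "", pairs), collapse = "; ")
}

# One cleaned string per row; an empty input stays empty
df$ids <- vapply(df$V2, extract_ids, character(1), USE.NAMES = FALSE)
print(df$ids)
```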
This is how I solved it. It is surely not the most elegant way, so suggestions for improvement are welcome.
# V2 is the character column holding the raw strings
df$V2 <- sapply(strsplit(df$V2, "||", fixed = TRUE), paste, collapse = "; ")
df$V2 <- sapply(strsplit(df$V2, "//", fixed = TRUE), paste, collapse = "; ")
df$V2 <- sapply(strsplit(df$V2, "; ", fixed = TRUE), function(x) paste(unique(x), collapse = "; "))

Identify missing values in a sequence / perform asymmetric difference between two lists

Using R, I want to efficiently identify which values in a sequence are missing. I've written the below example of how I do it. There must be a better way. Can someone help?
data.list=c(1,2,4,5,7,8,9)
full.list=seq(from = 1, to = 10, by =1)
output <- c()
for(i in 1:length(full.list)){
holder1 <- as.numeric(any(data.list == i))
output[i] <- holder1
}
which(output == 0)
Another possible solution
setdiff(full.list,data.list)
full.list[!full.list %in% data.list]
Another option using match (similar to %in%)
full.list[!match(full.list,data.list,nomatch=FALSE)]
# [1] 3 6 10
Using grep():
grep(paste("^", data.list, "$", sep = "", collapse = "|"), full.list, invert = TRUE)
You could be "lazy" and use collapse = ^|$ but use the above for precise accuracy.
Using grepl():
full.list[!grepl(paste("^", data.list, "$", sep = "", collapse = "|"), full.list)]
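One thing to keep in mind when comparing these answers: which() and grep() return positions in full.list, while setdiff() and the %in% subsetting return the values themselves. In the question the two coincide only because the sequence starts at 1. A minimal sketch with a made-up sequence starting at 5 (the names full, present, missing_values and missing_positions are hypothetical):

```r
# Made-up example where positions and values differ
full    <- 5:10
present <- c(5, 6, 8, 10)

# Values that are in full but not in present
missing_values <- setdiff(full, present)

# grep() with invert = TRUE returns the positions of the non-matching elements
pattern <- paste("^", present, "$", sep = "", collapse = "|")
missing_positions <- grep(pattern, full, invert = TRUE)

print(missing_values)
print(missing_positions)
print(full[missing_positions])  # indexing with the positions recovers the values
```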
