How can I split a string and add the pieces to a vector?


I'd like to split the elements of a character vector so that the resulting pieces become additional elements of the vector.
> va <- c("a", "b", "c;d;e")
[1] "a" "b" "c;d;e"
> vb <- strsplit(va, ";")
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c" "d" "e"
How can I get vb in the same format as va, so that vb is a one-dimensional, 5-element vector like this?
[1] "a" "b" "c" "d" "e"
Appreciate the help.

One possibility:
unlist(vb)
# [1] "a" "b" "c" "d" "e"

Or
scan(text=va, sep=";",what="")
#Read 5 items
# [1] "a" "b" "c" "d" "e"
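The split and the flattening can also be written as one expression. This is a minimal sketch, assuming the same va as in the question and that ";" is the only separator; fixed = TRUE is an addition here so the separator is matched literally rather than as a regular expression.

```r
va <- c("a", "b", "c;d;e")

# Split every element on ";" and flatten the resulting list into one vector
vb <- unlist(strsplit(va, ";", fixed = TRUE))
vb
# [1] "a" "b" "c" "d" "e"
length(vb)
# [1] 5
```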

Related

Smartest way for making a sequence of characters in R

I am going to make the below sequence in R:
A A B B B A A B B B
I have used the below code:
rep(c("A","A","B","B","B"),2)
I got the correct answer as follows:
[1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"
But I don't like my code. I would like to see the smartest way for making the above sequence. I don't know if it is possible to make the above sequence using LETTERS[1:2].
Thank you in advance

You can do it without using rep at all:
LETTERS[(0:9 %% 5 > 1) + 1]
[1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"
Here you just replace 9 with the desired length of the sequence minus one, because the index starts at 0.
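To make the length explicit, the same modulo trick can be written in terms of a length variable. This is only a sketch; n and s are hypothetical names that do not appear in the answer.

```r
n <- 10  # desired length of the sequence (hypothetical variable)

# Positions 0 and 1 within each block of 5 map to "A"; positions 2, 3, 4 map to "B"
s <- LETTERS[((seq_len(n) - 1) %% 5 > 1) + 1]
s
# [1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"
```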

You can use rep twice :
rep(rep(LETTERS[1:2], c(2, 3)), 2)
#[1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"

A Reduce() version of @RonakShah's answer (the nested rep above).
Reduce(rep, list(c(2, 3), 2), LETTERS[1:2])
# [1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"

Another variant using rep and LETTERS:
LETTERS[rep(rep(1:2, 2:3), 2)]
# [1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"

An option with replicate
unlist(replicate(2, Map(rep, LETTERS[1:2], c(2, 3))))
#[1] "A" "A" "B" "B" "B" "A" "A" "B" "B" "B"

List of string to list of vectors of characters

After defining
> Seq.genes <- as.list(c("ATGCCCAAATTTGATTT","AGAGTTCCCACCAACG"))
I have a list of strings :
> Seq.genes[1:2]
[[1]]
[1] "ATGCCCAAATTTGATTT"
[[2]]
[1] "AGAGTTCCCACCAACG"
I would like to convert it in a list of vectors :
>Seq.genes[1:2]
[[1]]
[1]"A" "T" "G" "C" "C" "C" "A" "A" "A" "T" "T" "T" "G" "A" "T" "T" "T"
[[2]]
[1] "A" "G" "A" "G" "T" "T" "C" "C" "C" "A" "C" "C" "A" "A" "C" "G"
I tried something like :
for (i in length(Seq.genes)){
x <- Seq.genes[i]
Seq.genes[i] <- substring(x, seq(1,nchar(x),2), seq(1,nchar(x),2))
}

It may be better to have the strings in a vector rather than in a list. So we could unlist and then apply strsplit:
strsplit(unlist(Seq.genes), "")

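sapply(Seq.genes, strsplit, split = '')
or, because strsplit returns a list even for a single string, extract its first element inside lapply:
lapply(Seq.genes, function(s) strsplit(s, split = '')[[1]])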
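The substring approach attempted in the question can also be made to work without a loop. The sketch below assumes the Seq.genes list from the question; split_chars is a hypothetical helper name. It takes one character at every position, rather than every second position as the original loop did.

```r
Seq.genes <- as.list(c("ATGCCCAAATTTGATTT", "AGAGTTCCCACCAACG"))

# Hypothetical helper: return the single characters of one string
split_chars <- function(s) {
  pos <- seq_len(nchar(s))  # 1, 2, ..., number of characters
  substring(s, pos, pos)    # one-character substring at each position
}

chars <- lapply(Seq.genes, split_chars)
lengths(chars)
# [1] 17 16
```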

Convert vector with sets of values preceded by "headers", to separate vectors

I have a vector with several sets of elements. Each set is preceded by a certain name, given by "A", "B" and "C" as an example over here:
v1 <- c("A", letters[1:5], "B", letters[6:7], "C", letters[8:12])
v1
# [1] "A" "a" "b" "c" "d" "e" "B" "f" "g" "C" "h" "i" "j" "k" "l"
The position of the "headers" can be obtained by grep:
start <- grep("[ABC]", v1)
# [1] 1 7 10
How do I proceed from here to extract the three sets of elements as separate vectors with the preceding "headers" as their name?
"A" <- letters[1:5]
"B" <- letters[6:7]
"C" <- letters[8:12]
A
# [1] "a" "b" "c" "d" "e"
B
# [1] "f" "g"
C
# [1] "h" "i" "j" "k" "l"
SOLUTION
I hope the kind soul who provided an answer to this question (his id eluded me), but later deleted his answer and all of his comments can be contacted, and the answer reinstated, so that he can be duly rewarded with upvotes.
Contrary to my initial claim, which was caused by a misunderstanding, his answer DID provide a viable solution.
Here's the gist of it, from what I can recall:
library(purrr)  # provides map2(), set_names() and the %>% pipe
end <- start-1
end <- end[-1]
end[length(end)+1] <- length(v1)
end
[1] 6 9 15
map2(start+1, end, ~v1[.x:.y]) %>% set_names(v1[start])
$A
[1] "a" "b" "c" "d" "e"
$B
[1] "f" "g"
$C
[1] "h" "i" "j" "k" "l"
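For comparison, a base R sketch that needs no extra packages. It is an alternative technique, not the deleted answer: it assumes the same v1 and header pattern as in the question, and is_header, group and res are hypothetical names.

```r
v1 <- c("A", letters[1:5], "B", letters[6:7], "C", letters[8:12])

is_header <- grepl("[ABC]", v1)
# The running count of headers gives every element the index of its group
group <- cumsum(is_header)

# Drop the headers, split the remaining values by group, name each group by its header
res <- split(v1[!is_header], group[!is_header])
names(res) <- v1[is_header]
res$B
# [1] "f" "g"
```

If separate variables A, B and C are wanted rather than a named list, list2env(res, envir = .GlobalEnv) would create them from the list.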

R: how to apply to a list a function that joins all subelements except the first one

I am struggling with manipulating lists; now I want to join all subelements in an element EXCEPT THE FIRST ONE, in one operation if possible.
For example, I have a list that looks like this:
[[1]] [1] "A" "B" "C" "D" "E" "F"
[[2]] [1] "A" "B" "C"
[[3]] [1] "A" "B" "C" "D"
[[4]] [1] "A" "B" "C" "D"
[[5]] [1] "A" "B" "C" "D" "E"
And I want to obtain this:
[[1]] [1] "B;C;D;E;F"
[[2]] [1] "B;C"
[[3]] [1] "B;C;D"
[[4]] [1] "B;C;D"
[[5]] [1] "B;C;D;E"
So I need a function to apply in this way:
list2 <- lapply(list1,
function(x) {
#something here
})
It would be awesome if the function could be easily modified to leave out a different subelement (not just the first one, but the 3rd, or the last, or 2nd to last...).
Many thanks!

Let's make a reproducible example:
> L = list(LETTERS[1:6], LETTERS[1:3],LETTERS[1:4],LETTERS[1:4],LETTERS[1:5])
> L
[[1]]
[1] "A" "B" "C" "D" "E" "F"
[[2]]
[1] "A" "B" "C"
[[3]]
[1] "A" "B" "C" "D"
[[4]]
[1] "A" "B" "C" "D"
[[5]]
[1] "A" "B" "C" "D" "E"
Then you drop the first element and paste everything else together with a semicolon:
> lapply(L, function(x){paste(x[-1],collapse=";")})
[[1]]
[1] "B;C;D;E;F"
[[2]]
[1] "B;C"
[[3]]
[1] "B;C;D"
[[4]]
[1] "B;C;D"
[[5]]
[1] "B;C;D;E"
If a list element has only one entry to start with, the result is an empty string (no semicolons).
To leave out other elements, change the index inside the function; read up on R's vector indexing (negative indices drop the elements at those positions).
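As a sketch of how the index could be varied, assuming a shortened version of the list L above: join_except is a hypothetical helper, not part of the original answer. Note that a drop position of 0 (for example length(x) - 1 on a one-element vector) selects nothing and yields an empty string.

```r
L <- list(LETTERS[1:6], LETTERS[1:3], LETTERS[1:4])

# Hypothetical helper: drop the element(s) at position `drop`, then join the rest
join_except <- function(x, drop) paste(x[-drop], collapse = ";")

first  <- lapply(L, join_except, drop = 1)                      # leave out the first
third  <- lapply(L, join_except, drop = 3)                      # leave out the third
last   <- lapply(L, function(x) join_except(x, length(x)))      # leave out the last
penult <- lapply(L, function(x) join_except(x, length(x) - 1))  # leave out the second to last

unlist(third)
# [1] "A;B;D;E;F" "A;B"       "A;B;D"
```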

[ is actually a function. You can try the below.
list1 <- list(
c("A", "B", "C"),
c("D", "E", "F", "G")
)
# for leaving out the first element
lapply(list1, `[`, -1)
# for leaving out the last element
lapply(list1, function(a) a[-length(a)])
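# for leaving out a different element per list item
# (here the 1st element of the first vector and the 2nd element of the second)
Map(`[`, list1, -c(1, 2))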

R adding to a list (nested list)

I have list1:
list1<-list("outliers"=list("values"=list(list(c("a","b","c"),
"dimensionKey"=2101120,
"metric"="1")
)
)
)
> list1
$outliers
$outliers$values
$outliers$values[[1]]
$outliers$values[[1]][[1]]
[1] "a" "b" "c"
$outliers$values[[1]]$dimensionKey
[1] 2101120
$outliers$values[[1]]$metric
[1] "1"
I need to add these values to this list:
list2<-list(c("e", "f", "g", "m"),
"dimensionKey"=2101120,
"metric"="2")
I want the result to look like this:
$outliers
$outliers$values
$outliers$values[[1]]
$outliers$values[[1]][[1]]
[1] "a" "b" "c"
$outliers$values[[1]]$dimensionKey
[1] 2101120
$outliers$values[[1]]$metric
[1] "1"
$outliers$values[[2]]
$outliers$values[[2]][[1]]
[1] "e" "f" "g" "m"
$outliers$values[[2]]$dimensionKey
[1] 2101120
$outliers$values[[2]]$metric
[1] "2"
How can I manage that?
P.S.: I need this inside a function that adds values to an existing list, so I can't build the whole list in one step.
Thank you!

You can use this approach:
list1$outliers$values <- append(list1$outliers$values, list(list2))
The result (list1):
$outliers
$outliers$values
$outliers$values[[1]]
$outliers$values[[1]][[1]]
[1] "a" "b" "c"
$outliers$values[[1]]$dimensionKey
[1] 2.10112e+06
$outliers$values[[1]]$metric
[1] "1"
$outliers$values[[2]]
$outliers$values[[2]][[1]]
[1] "e" "f" "g" "m"
$outliers$values[[2]]$dimensionKey
[1] 2.10112e+06
$outliers$values[[2]]$metric
[1] "2"
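Since the question mentions doing this inside a function, the append call could be wrapped as follows. This is a sketch only; add_value is a hypothetical name, and it assumes the list always has the outliers$values structure shown above.

```r
# Hypothetical helper: append one entry to lst$outliers$values and return the list
add_value <- function(lst, entry) {
  # Wrap `entry` in list() so it is added as ONE element instead of being spliced in
  lst$outliers$values <- append(lst$outliers$values, list(entry))
  lst
}

list1 <- list(outliers = list(values = list(
  list(c("a", "b", "c"), dimensionKey = 2101120, metric = "1")
)))
list2 <- list(c("e", "f", "g", "m"), dimensionKey = 2101120, metric = "2")

# R functions work on copies, so the result has to be assigned back
list1 <- add_value(list1, list2)
length(list1$outliers$values)
# [1] 2
```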
