R: repeat elements of a list based on another list - r

I have searched for this but in vain.
the problem is I have two lists, first with the elements to be repeated
for example
my.list<-list(c('a','b','c','d'), c('g','h'))
and the second list is the number of times each element is to be repeated
repeat.list<-list(c(5,7,6,1), c(2,3))
I would like to create a new list in which each element in my.list is repeated based in repeat.list
i.e.
result:
[[1]]
[1] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c" "c" "c" "c" "c" "d"
[[2]]
[1] "g" "g" "h" "h" "h"
Thank you in advance for your help

Use mapply:
mapply(rep, my.list, repeat.list)
[[1]]
[1] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c" "c" "c" "c" "c" "d"
[[2]]
[1] "g" "g" "h" "h" "h"
lapply also does the trick, but is more verbose:
lapply(seq_along(my.list), function(i)rep(my.list[[i]], repeat.list[[i]]))
[[1]]
[1] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c" "c" "c" "c" "c" "d"
[[2]]
[1] "g" "g" "h" "h" "h"

Related

data.table filter doesn't work in with()?

I am filetering a data.table based on another data.table, and it gives a very odd result.
please advise,
library(data.table)
library(magrittr)
set.seed(100)
xA = data.table(A = letters[1:4], B = sample(1:1000))
xB = data.table(A = letters[1:4], B = sample(1:100))
with(xA[30], {
sprintf(" xA A = %s B = %s", A, B) %>% print
xB[A == A]$A %>% print
print("")
xB[A == "b"]$A %>% print
})
#[1] " xA A = b B = 322"
# [1] "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" #"d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b"
# [35] "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" #"b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d"
# [69] "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" #"d" "a" "b" "c" "d" "a" "b" "c" "d"
#[1] " xA A = b B = 322"
# [1] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" #"b" "b" "b" "b" "b" "b" "b"
With the toy code, it shall give a result of all b as the second result, but it gave everything as first printout. How come? Thanks for advice.
The problem is when you just look at the statement
xB[A == A]
How do you know which is a column name and which is a variable name? In this case, data.table just assumes you want all rows where column A is equal to itself (which is all of them. Try using a differnt variable name
with(xA[30], {
sprintf(" xA A = %s B = %s", A, B) %>% print
a <- A
xB[A == a]$A
})

R: Efficient way for spreading vectors

Is there an efficient way of programming to solve the following task?
Imagine the following vector:
A<-[a,b,c...k]
And would like to spread it the following way:
Let‘s start with e.g. n=2
B<-[a,a,b,b,c...,k,k]
And now n=4 or any number greater 1
C<-[a,a,a,a,b,...,k,k,k,k]
To solve it via loops seems kind of easy, but is there any function or vector based operation I missed/could use? A tidyverse solutions (for using it in a pipe) would be the best solution for me.
(It is hard to do research on this task as I am a newbie in R and don‘t the correct terms to search for. Any help would be helpful.)
Let
A <- letters[1:11]
A
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k"
If you use function rep with argument each, you get what you want:
rep(A, each=2)
[1] "a" "a" "b" "b" "c" "c" "d" "d" "e" "e" "f" "f" "g" "g" "h" "h" "i" "i" "j"
[20] "j" "k" "k"
rep(A, each=3)
[1] "a" "a" "a" "b" "b" "b" "c" "c" "c" "d" "d" "d" "e" "e" "e" "f" "f" "f" "g"
[20] "g" "g" "h" "h" "h" "i" "i" "i" "j" "j" "j" "k" "k" "k"
An option is to use rep with argument times = 2 or 4 and then sort the result. Another option is to use mapply and then c operator.
c(mapply(rep, 2 ,A)) # OR sort(rep(A, times = 2))
#[1] "a" "a" "b" "b" "c" "c" "d" "d" "e" "e" "f" "f" "g" "g" "h" "h" "i" "i" "j" "j"
#[21] "k" "k"
c(mapply(rep,A, 4)) #OR sort(rep(A, times = 2))
#[1] "a" "a" "a" "a" "b" "b" "b" "b" "c" "c" "c" "c" "d" "d" "d" "d" "e" "e" "e" "e"
#[21] "f" "f" "f" "f" "g" "g" "g" "g" "h" "h" "h" "h" "i" "i" "i" "i" "j" "j" "j" "j"
#[41] "k" "k" "k" "k"

How to convert DNAbin to FASTA in R?

I am trying to convert my_dnabin1, a DNAbin file of 55 samples, to fasta format. I am using the following code to convert it into a fasta file.
dnabin_to_fasta <- lapply(my_dnabin1, function(x) as.character(x[1:length(x)]))
This generates a list of 55 samples which looks like:
$SS.11.01
[1] "t" "t" "a" "c" "c" "t" "a" "a" "a" "a" "a" "g" "c" "c" "g" "c" "t" "t" "c" "c" "c" "t" "c" "c" "a" "a"
[27] "c" "c" "c" "t" "a" "g" "a" "a" "g" "c" "a" "a" "a" "c" "c" "t" "t" "t" "c" "a" "a" "c" "c" "c" "c" "a"
$SS.11.02
[1] "t" "t" "a" "c" "c" "t" "a" "a" "a" "a" "a" "g" "c" "c" "g" "c" "t" "t" "c" "c" "c" "t" "c" "c" "a" "a"
[27] "c" "c" "c" "t" "a" "g" "a" "a" "g" "c" "a" "a" "a" "c" "c" "t" "t" "t" "c" "a" "a" "c" "c" "c" "c" "a"
and so on...
However, I want a fasta formatted file as the output that may look something like:
>SS.11.01 ttacctga
>SS.11.02 ttacctga
you can try this
lapply(my_dnabin1, function(x) paste0(x, collapse = ''))

How to replace values in a data frame with another value

I have huge data set. The columns contain values like A,B,C,D,E,F,G,H and I need to replace them with 1,2,3,4...
[1] "C" "C" "C" "C" "C" "A" "H" "G" "G" "G" "G" "G" "G" "G" "C" "C" "C" "C" "C"
[20] "C" "B" "B" "B" "H" "H" "H" "H" "H" "H" "G" "C" "A" "A" "A" "A" "A" "A" "A"
[30]----
Another similar problem is values in one column are more than 1000 and I need to replace them by unique numbers.
try replace
replace function examples
in your case e.g.
replace(df, "A", 1)

Access nested structure

Are there some nice designs to call data in a nested structure e.g.
a<-list(list(LETTERS[1:3],LETTERS[1:3]),list(LETTERS[4:6]))
lapply(a,function(x) lapply(x, function(x) x))
but unlist is not a option.
Not as good as #SimonO101's answer but just for providing as an alternative you can do it using do.call
> do.call(c,do.call(c, a))
[1] "A" "B" "C" "A" "B" "C" "D" "E" "F"
Also using Reduce
> do.call(c, Reduce(c, a))
[1] "A" "B" "C" "A" "B" "C" "D" "E" "F"
Recursive lapply... a.k.a rapply?
rapply( a , c )
[1] "A" "B" "C" "A" "B" "C" "D" "E" "F"

Resources