I have searched for this but in vain.
The problem is that I have two lists. The first holds the elements to be repeated,
for example:
my.list<-list(c('a','b','c','d'), c('g','h'))
The second list gives the number of times each element is to be repeated:
repeat.list<-list(c(5,7,6,1), c(2,3))
I would like to create a new list in which each element of my.list is repeated according to repeat.list,
i.e.
result:
[[1]]
[1] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c" "c" "c" "c" "c" "d"
[[2]]
[1] "g" "g" "h" "h" "h"
Thank you in advance for your help
Use mapply:
mapply(rep, my.list, repeat.list)
[[1]]
[1] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c" "c" "c" "c" "c" "d"
[[2]]
[1] "g" "g" "h" "h" "h"
lapply also does the trick, but is more verbose:
lapply(seq_along(my.list), function(i)rep(my.list[[i]], repeat.list[[i]]))
[[1]]
[1] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c" "c" "c" "c" "c" "d"
[[2]]
[1] "g" "g" "h" "h" "h"
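One caveat worth a quick sketch: mapply simplifies its result by default, so if every repeated vector happens to come out the same length you get a matrix rather than a list. The small lists below are made up for illustration (they are not the data from the question); Map, or mapply with SIMPLIFY = FALSE, should always return a list.

```r
# Hypothetical inputs where both results have the same length (2 each)
my.list     <- list(c("a", "b"), c("g", "h"))
repeat.list <- list(c(1, 1), c(1, 1))

# Default SIMPLIFY = TRUE collapses equal-length results into a matrix
simplified <- mapply(rep, my.list, repeat.list)

# Map() (mapply with SIMPLIFY = FALSE) keeps the list structure
as_list  <- Map(rep, my.list, repeat.list)
as_list2 <- mapply(rep, my.list, repeat.list, SIMPLIFY = FALSE)

is.matrix(simplified)  # TRUE
is.list(as_list)       # TRUE
```

If a list is what you need downstream, Map(rep, my.list, repeat.list) is probably the safer spelling.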
I am filtering a data.table based on another data.table, and it gives a very odd result.
Please advise.
library(data.table)
library(magrittr)
set.seed(100)
xA = data.table(A = letters[1:4], B = sample(1:1000))
xB = data.table(A = letters[1:4], B = sample(1:100))
with(xA[30], {
sprintf(" xA A = %s B = %s", A, B) %>% print
xB[A == A]$A %>% print
print("")
xB[A == "b"]$A %>% print
})
#[1] " xA A = b B = 322"
# [1] "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b"
# [35] "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d"
# [69] "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d" "a" "b" "c" "d"
#[1] " xA A = b B = 322"
# [1] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
With this toy code, I expected the first printout to contain only "b", like the second one, but it returned every row. How come? Thanks for any advice.
The problem becomes clear when you look at the statement on its own:
xB[A == A]
How do you know which A is a column name and which is a variable name? In this case, data.table just assumes you want all rows where column A is equal to itself (which is all of them). Try using a different variable name:
with(xA[30], {
sprintf(" xA A = %s B = %s", A, B) %>% print
a <- A
xB[A == a]$A
})
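Here is a smaller sketch of the same effect, assuming the data.table package is installed. The table and the variable name target are made up for illustration.

```r
library(data.table)

# Hypothetical toy table; 'target' is an illustrative variable name
xB <- data.table(A = c("a", "b", "a", "b"), B = 1:4)
target <- "b"

# Both sides resolve to column A, so the condition is TRUE for every row
all_rows <- xB[A == A]

# 'target' is not a column of xB, so it is looked up in the calling scope
only_b <- xB[A == target]

nrow(all_rows)  # 4
only_b$A        # "b" "b"
```

The point is only that a name matching a column is resolved to the column first, so giving the outer value a name that is not a column avoids the clash.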
Is there an efficient way of programming to solve the following task?
Imagine the following vector:
A<-[a,b,c...k]
And would like to spread it the following way:
Let's start with e.g. n=2
B<-[a,a,b,b,c...,k,k]
And now n=4 or any number greater than 1
C<-[a,a,a,a,b,...,k,k,k,k]
Solving it with loops seems fairly easy, but is there a function or vectorised operation I missed or could use? A tidyverse solution (for use in a pipe) would be best for me.
(It is hard to research this task as I am a newbie in R and don't know the correct terms to search for. Any help would be appreciated.)
Let
A <- letters[1:11]
A
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k"
If you use function rep with argument each, you get what you want:
rep(A, each=2)
[1] "a" "a" "b" "b" "c" "c" "d" "d" "e" "e" "f" "f" "g" "g" "h" "h" "i" "i" "j"
[20] "j" "k" "k"
rep(A, each=3)
[1] "a" "a" "a" "b" "b" "b" "c" "c" "c" "d" "d" "d" "e" "e" "e" "f" "f" "f" "g"
[20] "g" "g" "h" "h" "h" "i" "i" "i" "j" "j" "j" "k" "k" "k"
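Another option is to use rep with the argument times = 2 (or 4) and then sort the result; note that this only gives the same order as each because A is already sorted. Yet another option is to use mapply and then flatten the result with c.
c(mapply(rep, A, 2)) # OR sort(rep(A, times = 2))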
#[1] "a" "a" "b" "b" "c" "c" "d" "d" "e" "e" "f" "f" "g" "g" "h" "h" "i" "i" "j" "j"
#[21] "k" "k"
c(mapply(rep, A, 4)) # OR sort(rep(A, times = 4))
#[1] "a" "a" "a" "a" "b" "b" "b" "b" "c" "c" "c" "c" "d" "d" "d" "d" "e" "e" "e" "e"
#[21] "f" "f" "f" "f" "g" "g" "g" "g" "h" "h" "h" "h" "i" "i" "i" "i" "j" "j" "j" "j"
#[41] "k" "k" "k" "k"
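A small sketch of why each and sort(rep(..., times)) are not interchangeable in general. The unsorted vector here is made up for illustration.

```r
# Hypothetical unsorted input
A <- c("b", "a", "c")

by_each  <- rep(A, each = 2)            # keeps the original order of A
by_sort  <- sort(rep(A, times = 2))     # reorders alphabetically
per_elem <- rep(A, times = c(1, 3, 2))  # a different count per element

by_each   # "b" "b" "a" "a" "c" "c"
by_sort   # "a" "a" "b" "b" "c" "c"
per_elem  # "b" "a" "a" "a" "c" "c"
```

So if the order of the input matters, rep with each (or a vector of times) is probably what you want.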
I am trying to convert my_dnabin1, a DNAbin file of 55 samples, to fasta format. I am using the following code to convert it into a fasta file.
dnabin_to_fasta <- lapply(my_dnabin1, function(x) as.character(x[1:length(x)]))
This generates a list of 55 samples which looks like:
$SS.11.01
[1] "t" "t" "a" "c" "c" "t" "a" "a" "a" "a" "a" "g" "c" "c" "g" "c" "t" "t" "c" "c" "c" "t" "c" "c" "a" "a"
[27] "c" "c" "c" "t" "a" "g" "a" "a" "g" "c" "a" "a" "a" "c" "c" "t" "t" "t" "c" "a" "a" "c" "c" "c" "c" "a"
$SS.11.02
[1] "t" "t" "a" "c" "c" "t" "a" "a" "a" "a" "a" "g" "c" "c" "g" "c" "t" "t" "c" "c" "c" "t" "c" "c" "a" "a"
[27] "c" "c" "c" "t" "a" "g" "a" "a" "g" "c" "a" "a" "a" "c" "c" "t" "t" "t" "c" "a" "a" "c" "c" "c" "c" "a"
and so on...
However, I want a fasta formatted file as the output that may look something like:
>SS.11.01 ttacctga
>SS.11.02 ttacctga
You can try this:
lapply(my_dnabin1, function(x) paste0(x, collapse = ''))
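To get from the collapsed strings to an actual file, here is a minimal base R sketch. It assumes you already have a named list of single-character vectors like the one printed in the question. The list seqs and the path out_file are made up for illustration, and the record layout (a ">name" header line followed by the sequence on the next line) is the usual FASTA convention rather than the one-line form shown in the question.

```r
# Hypothetical stand-in for the list of character vectors from the question
seqs <- list(
  SS.11.01 = c("t", "t", "a", "c"),
  SS.11.02 = c("t", "t", "a", "g")
)

# Collapse each vector of bases into one string per sample
collapsed <- vapply(seqs, paste0, character(1), collapse = "")

# Interleave ">name" header lines with the sequence lines
fasta_lines <- as.vector(rbind(paste0(">", names(collapsed)), collapsed))

# Write to a temporary file (illustrative path)
out_file <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, out_file)

fasta_lines  # ">SS.11.01" "ttac" ">SS.11.02" "ttag"
```

If the ape package is loaded anyway, its own writers (for example write.dna with format = "fasta") may be a more direct route from the DNAbin object.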
I have a huge data set. The columns contain values like A,B,C,D,E,F,G,H and I need to replace them with 1,2,3,4...
[1] "C" "C" "C" "C" "C" "A" "H" "G" "G" "G" "G" "G" "G" "G" "C" "C" "C" "C" "C"
[20] "C" "B" "B" "B" "H" "H" "H" "H" "H" "H" "G" "C" "A" "A" "A" "A" "A" "A" "A"
[30]----
A similar problem: one column has more than 1000 values, and I need to replace them with unique numbers.
Try replace. Its second argument is an index (here a logical test), not the value to look for, so in your case, e.g.:
replace(df, df == "A", 1)
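With many distinct values, calling replace once per value gets tedious. A hedged alternative is to look up each value's position in a reference vector with match. The vector x below is made up for illustration.

```r
# Hypothetical column of letter codes
x <- c("C", "C", "A", "H", "G", "B")

# Position in the alphabet: A = 1, B = 2, ...
codes_alpha <- match(x, LETTERS)

# Arbitrary values: number them in order of first appearance
codes_seen <- match(x, unique(x))

# Or number them by sorted level via a factor
codes_factor <- as.integer(factor(x))

codes_alpha   # 3 3 1 8 7 2
codes_seen    # 1 1 2 3 4 5
codes_factor  # 3 3 1 5 4 2
```

The second and third forms do not depend on the values being letters, so they should also cover the column with many different values.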
Are there some nice designs to call data in a nested structure e.g.
a<-list(list(LETTERS[1:3],LETTERS[1:3]),list(LETTERS[4:6]))
lapply(a,function(x) lapply(x, function(x) x))
but unlist is not an option.
Not as good as @SimonO101's answer, but as an alternative you can do it using do.call:
> do.call(c,do.call(c, a))
[1] "A" "B" "C" "A" "B" "C" "D" "E" "F"
Also using Reduce
> do.call(c, Reduce(c, a))
[1] "A" "B" "C" "A" "B" "C" "D" "E" "F"
Recursive lapply... a.k.a rapply?
rapply( a , c )
[1] "A" "B" "C" "A" "B" "C" "D" "E" "F"