formula for getting combination of lists - math

I have
List 1 = AB
List 2 = CD
List 3 = EF
List 4 = GH
A program will print a final list composed of exactly one letter from each list.
So one of the combinations could be
A
C
E
G
How many combinations are possible? What is the formula to count the number of combinations?

The formula is just the product of the lengths of all your lists, so in your case: 2 x 2 x 2 x 2 = 16 combinations.
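As a quick check of that product rule, here is a minimal R sketch. The variable names are made up for illustration, and expand.grid is used only to enumerate the combinations so the count can be compared with the formula.

```r
# The four lists from the question
lists <- list(c("A", "B"), c("C", "D"), c("E", "F"), c("G", "H"))

# Product of the list lengths: 2 * 2 * 2 * 2
n_combinations <- prod(lengths(lists))

# Enumerate every combination (one row per combination) to confirm the count
combos <- expand.grid(lists, stringsAsFactors = FALSE)

n_combinations   # 16
nrow(combos)     # 16
```

The same product works when the lists have different lengths, for example lists of length 2, 3 and 4 give 2 x 3 x 4 = 24 combinations.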

Related

Apply function to column by segments in R

I have a function f that needs to be applied to a single column of length n in segments of m length, where m divides n. (For example, to a column of 1000 values, apply f to the first 250 values, then to 250-500, ...).
A loop is overkill, since the column has over 16 million values. I was thinking the efficient way would be to separate the column of length n into q vectors of length m, where mq = n. Then I could apply f simultaneously to all these vectors using some lapply-like functionality. Then I could join the q vectors to obtain the transformed version of the column.
Is that the efficient way to go here? If so, what function could decompose a column into q vectors of equal length and what function should I use to broadcast f across the q vectors?
Lastly, although less importantly, what if we wanted to do this to several columns and not just one?
Context
I've programmed a function that computes the power spectrum of an EEG signal (a numeric vector). However, it is bad practice to compute the power spectrum of a whole signal at once. The correct method is to compute it epoch by epoch, in 30 or 5 second segments, and average the spectrum of all those epochs. Hence why I need to apply a function to a column (an EEG signal) by epochs (or segments).
One way to do it is to create an auxiliary variable that labels each segment, so that you can apply your function to each column segment by segment. Depending on your function, you can use group_by and/or summarize. An example:
df <- data.frame(
  x = rnorm(15),
  y = rnorm(15),
  z = rnorm(15)
)
library(dplyr)
df %>%
  mutate(
    aux = rep(1:3, each = (nrow(df)/3)),
    across(.cols = c(x, y, z), .fns = ~ . + 2 * aux)
  )
x y z aux
1 2.164841 2.882465 2.139098 1
2 2.364115 2.205598 2.410275 1
3 2.552158 1.383564 1.441543 1
4 1.398107 1.265201 2.605371 1
5 1.006301 1.868197 1.493666 1
6 5.026785 4.310017 2.579434 2
7 4.751061 2.960320 4.127993 2
8 2.490833 3.815691 5.945851 2
9 3.904853 4.967267 4.800914 2
10 3.104052 3.891720 5.165253 2
11 3.929249 5.301579 6.358856 3
12 6.150120 5.724055 5.391443 3
13 5.920788 7.114649 5.797759 3
14 5.902631 6.550044 5.726752 3
15 6.216153 7.236676 5.531300 3
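For the splitting step asked about in the question, one base R possibility is to reshape the column into a matrix with m rows, so that each matrix column holds one segment. The sketch below is only an illustration: f is a hypothetical stand-in (it subtracts the segment mean), and the sizes n = 1000 and m = 250 come from the question's example.

```r
# Hypothetical stand-in for the real per-segment function (assumption)
f <- function(seg) seg - mean(seg)

n <- 1000
m <- 250
set.seed(1)
x <- rnorm(n)

stopifnot(n %% m == 0)         # m must divide n

segs  <- matrix(x, nrow = m)   # column j holds values ((j - 1) * m + 1):(j * m)
res   <- apply(segs, 2, f)     # apply f to every segment; here res is m x q
x_new <- as.vector(res)        # column-major order restores the original layout

# If f returned one spectrum per segment instead, rowMeans(res) would
# average the spectra across epochs.
```

For several columns, the same steps could be wrapped in a function and passed to lapply over the data frame's columns.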

How to transpose a long data frame every n rows

I have a data frame like this:
x=data.frame(type = c('a','b','c','a','b','a','b','c'),
value=c(5,2,3,2,10,6,7,8))
Every item has attributes a, b, and c, but some items may be missing an attribute, e.g. have only a and b.
The desired output is
y=data.frame(item=c(1,2,3), a=c(5,2,6), b=c(2,10,7), c=c(3,NA,8))
How can I transform x to y? Thanks
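We can use dcast. Note that rowid(type) numbers the occurrences of each type independently, so when an item is missing a type, later values of that type shift up: in the output the value 8 lands in item 2 rather than item 3.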
library(data.table)
out <- dcast(setDT(x), rowid(type) ~ type, value.var = 'value')
setnames(out, 'type', 'item')
out
# item a b c
#1: 1 5 2 3
#2: 2 2 10 8
#3: 3 6 7 NA
Create a grouping vector g assuming each occurrence of a starts a new group, use tapply to create a table tab and coerce that to a data frame. No packages are used.
g <- cumsum(x$type == "a")
tab <- with(x, tapply(value, list(g, type), c))
as.data.frame(tab)
giving:
a b c
1 5 2 3
2 2 10 NA
3 6 7 8
An alternate definition of the grouping vector, which is slightly more complex but would be needed if some groups are missing a, is the following. It assumes that x lists the type values in order of their levels within each group, so that if a level is less than the prior level it must be the start of a new group.
g <- cumsum(c(-1, diff(as.numeric(factor(x$type)))) < 0)
Note that ultimately there must be some restriction on missingness; otherwise, the problem is ambiguous. For example, if one group can have b and c missing and the next group can have a missing, then whether b and c in the second group actually form a second group or are part of the first group is not determinable.
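To illustrate the alternate grouping vector, here is a small sketch on made-up data in which the first group has no a. The data frame x2 is hypothetical. factor() is applied explicitly because data.frame() keeps strings as character vectors by default in R 4.0 and later.

```r
# Hypothetical data: the first item has only b and c, the second has a, b, c
x2 <- data.frame(type  = c("b", "c", "a", "b", "c"),
                 value = c(2, 3, 4, 5, 6))

lev <- as.numeric(factor(x2$type))   # level codes: 2 3 1 2 3
g   <- cumsum(c(-1, diff(lev)) < 0)  # a drop in level starts a new group: 1 1 2 2 2

tab <- with(x2, tapply(value, list(g, type), c))
as.data.frame(tab)                   # row 1 has NA in column a
```

With the simpler cumsum(x2$type == "a") definition, the first two rows would get group 0 instead of forming their own numbered item, which is why the level-based version is needed here.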

R: Stack error - How to merge multiple columns into one very long column in R

This should be easy but I'm having a lot of difficulty.
I have a relatively large dataset of medications,
What I want is a table of frequencies, but ranging over ALL the columns - so I want the medication that appears the most commonly from columns 1:8.
My idea was to combine all of these columns into one long column, just one on top of the other. However, I have tried multiple functions (stack, melt, matrix), but they all give me bizarre results. The one that seems correct for me to use is stack, but it keeps returning the error message "Error in stack.data.frame(meds) : no vector columns were selected". I've seen this error on the message boards before. I tried converting with as.vector, but this is not working. The object is definitely of class data.frame.
If there is another way to achieve these table results, that would be great, but either way, it's not working right now. Could somebody help?
Consider using do.call or Reduce with the c() function to combine all columns into one vector, and then count each unique medication with an sapply loop:
set.seed(79)
meds <- data.frame(MED1=sample(LETTERS, 8),
MED2=sample(LETTERS, 8),
MED3=sample(LETTERS, 8),
MED4=sample(LETTERS, 8),
MED5=sample(LETTERS, 8),
MED6=sample(LETTERS, 8),
MED7=sample(LETTERS, 8),
MED8=sample(LETTERS, 8), stringsAsFactors = FALSE)
medslist <- do.call(c, meds) # OR Reduce(c, meds)
medslength <- sapply(unique(medslist), function(i) length(medslist[medslist==i]))
medslength <- sort(medslength, decreasing=TRUE)
medslength[1:8]
# B U W L I E M R
# 5 5 3 3 3 3 3 3
Try this to get what you want. No stacking necessary:
df = data.frame(Col1 = sample(LETTERS, 50, replace = TRUE),
                Col2 = sample(LETTERS, 50, replace = TRUE))
table(as.matrix(df))
# A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
# 2 3 3 4 3 5 4 3 5 3 4 8 4 5 3 6 5 2 5 4 4 2 4 2 3 4
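One possible cause of the stack() error (an assumption, since the actual data is not shown) is that the medication columns are factors, which stack() does not treat as plain vector columns. The sketch below uses a small made-up data frame with factor columns, converts them to character, and then builds a sorted frequency table over all columns at once.

```r
# Made-up example data with factor columns (assumption about the real data)
meds <- data.frame(MED1 = factor(c("aspirin", "ibuprofen", "aspirin")),
                   MED2 = factor(c("ibuprofen", "aspirin", "codeine")))

# Convert every column to character, keeping the data frame structure
meds[] <- lapply(meds, as.character)

# Flatten all columns into one vector and count, most frequent first
counts <- sort(table(unlist(meds, use.names = FALSE)), decreasing = TRUE)
counts
```

Here names(counts)[1] gives the most common medication across all columns.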

Function that group values of a list (in R)

I am trying to construct a function which shouldn't be hard in terms of programming, but I am having some difficulty conceptualizing it. Hope you'll be able to understand my problem better than me!
I'd like a function that takes a single list of vectors as argument. Something like
arg1 = list(c(1,2), c(2,3), c(5,6), c(1,3), c(4,6), c(6,7), c(7,5), c(5,8))
The function should output a matrix with two columns (or a list of two vectors or something like that) where one column contains letters and the other numbers. One can think of the argument as a list of the positions/values that should be placed in the same group. If in the list there is the vector c(5,6), then the output should contain somewhere the same letters next to the values 5 and 6 in the number column. If there are the three following vectors c(1,2), c(2,3) and c(1,3), then the output should contain somewhere the same letters next to the value 1, 2 and 3 in the number column.
Therefore if we enter the object arg1 in the function it should return:
myFun(arg1)
number_column letters_column
1 A
2 A
3 A
5 B
6 B
7 B
4 C
6 C
5 D
8 D
(The order is not important. The letter E should not be present before the letter D has been used.)
Therefore the function has constructed 2 groups of 3 (A:[1,2,3] and B:[5,6,7]) and 2 groups of 2 (C:[4,6] and D:[5,8]). Note that one position or number can be in several groups.
Please let me know if something is unclear in my question! Thanks!
As I wrote in the comments, it appears that you want a data frame that lists the maximal cliques of a graph given a list of vectors that define the edges.
library(igraph)
## create a matrix where each row is an edge
argmatrix <- do.call(rbind, arg1)
## create an igraph object from the matrix of edges
gph <- graph_from_edgelist(argmatrix, directed = FALSE)
## returns a list of the maximal cliques of the graph
mxc <- max_cliques(gph)
## creates a data frame of the output
dat <- data.frame(number_column = unlist(mxc),
group_column = rep.int(seq_along(mxc),times = sapply(mxc,length)))
## converts group numbers to letters
## ONLY USE if max(dat$group_column) <= 26
dat$group_column <- LETTERS[dat$group_column]
# number_column group_column
# 1 5 A
# 2 8 A
# 3 5 B
# 4 6 B
# 5 7 B
# 6 4 C
# 7 6 C
# 8 3 D
# 9 1 D
# 10 2 D

matrix containing number of occurrence corresponding to each element in the matrix

I have a matrix, say
A = [1 2 3 1 1 1 2 3]
I want to find the number of times each number has appeared so far in the matrix. The output matrix for this input would be
B = [1 1 1 2 3 4 2 2]
i.e. 1 appeared 4 times in the array, hence the last value corresponding to 1 is 4.
unique and sum(unique) do not help because they give the total number of times each element occurred, but I want another matrix in which the count increases every time the element occurs.
try this:
B = ave(A,A,FUN=function(x) 1:length(x))
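As a worked check of the ave approach in R, the sketch below runs it on the vector from the question. It uses seq_along in place of 1:length(x), which should be equivalent here since every group is non-empty.

```r
A <- c(1, 2, 3, 1, 1, 1, 2, 3)

# Group A by its own values and number the occurrences within each group
B <- ave(A, A, FUN = seq_along)
B   # 1 1 1 2 3 4 2 2
```

ave returns a vector of the same length and order as A, so no re-sorting is needed afterwards.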
You can do this pretty simply with the following code. This will assume that the A matrix is one dimensional, but this is not too big of an assumption to make.
A=[1 2 3 1 1 1 2 3];
vals = unique(A);
B = zeros(size(A));
for i = 1:numel(vals)
idxs = find(diff([0,cumsum(A == vals(i))]));
B(idxs) = 1:numel(idxs);
end
This solution is for MATLAB, not R. I do not know which one you want. If you want an R answer, I would recommend one of the other answers :)
Here is a solution in MATLAB:
B = sum(triu(bsxfun(@eq, A, A.')));
For Matlab:
B = sum(tril(repmat(A,length(A),1)).' + tril(repmat(NaN,length(A),length(A)),-1) == repmat(A,length(A),1))
If A is guaranteed not to contain zeros, this can be simplified to:
B = sum(tril(repmat(A,length(A),1)).' == repmat(A,length(A),1));
