Count frequency of vector elements

I have this code, which stores in vector y the number of occurrences of each element of vector a:
a = [7 4 9 6 4 10 9 6 7 6];
y = zeros(size(a));
for i = 1:length(a)
    y(i) = sum(a == a(i));  % count how many entries equal a(i)
end
y
The result is:
y=[2 2 2 3 2 1 2 3 2 3]
but this result is not enough, because I need to know the positions along with the number of repetitions. The result should look like:
element  no. of repetitions  position1  position2  position3
7        2                   1          9          0
4        2                   2          5          0
9        2                   3          7          0
6        3                   4          8          10
10       1                   6          0          0

Related

Generating all possible outcomes in R

Given a vector with numeric values, how do I generate all possible outcomes for subtraction to find the differences and put them in a data.frame?
dataset1 <- data.frame(numbers = c(1,2,3,4,5,6,7,8,9,10))
i.e. (1 - 1, 1 - 2 , 1 - 3,...)
Ideally, I would want the output to give me a data frame with 3 columns (Number X, Number Y, Difference) using dataset1.
The expand.grid function gives you "pairings" that differ from the ones you get with combn. Since you included 1 - 1, I'm assuming you didn't want combn, since it doesn't return 1 - 1 and only gives you 45 combinations.
> pairs=expand.grid(X=1:10, Y=1:10)
> pairs$diff <- with(pairs, X-Y)
> pairs
X Y diff
1 1 1 0
2 2 1 1
3 3 1 2
4 4 1 3
5 5 1 4
6 6 1 5
7 7 1 6
8 8 1 7
9 9 1 8
10 10 1 9
11 1 2 -1
12 2 2 0
13 3 2 1
14 4 2 2
15 5 2 3
16 6 2 4
17 7 2 5
snipped remainder (total of 100 rows)
Use outer as another way to get such a group of paired differences:
> tbl <- matrix( outer(X=1:10, Y=1:10, "-"), 10, dimnames=list(X=1:10, Y=1:10))
> tbl
Y
X 1 2 3 4 5 6 7 8 9 10
1 0 -1 -2 -3 -4 -5 -6 -7 -8 -9
2 1 0 -1 -2 -3 -4 -5 -6 -7 -8
3 2 1 0 -1 -2 -3 -4 -5 -6 -7
4 3 2 1 0 -1 -2 -3 -4 -5 -6
5 4 3 2 1 0 -1 -2 -3 -4 -5
6 5 4 3 2 1 0 -1 -2 -3 -4
7 6 5 4 3 2 1 0 -1 -2 -3
8 7 6 5 4 3 2 1 0 -1 -2
9 8 7 6 5 4 3 2 1 0 -1
10 9 8 7 6 5 4 3 2 1 0
But I didn't see a compact way to create a dataframe of the sort you specified.
The now-deleted comment by @RitchieSacramento was correct:
> tbl <- matrix( outer(X=1:10, Y=1:10, "-"), 10, dimnames=list(X=1:10, Y=1:10))
> as.data.frame.table(tbl)
X Y Freq
1 1 1 0
2 2 1 1
3 3 1 2
4 4 1 3
5 5 1 4
6 6 1 5
7 7 1 6
8 8 1 7
9 9 1 8
10 10 1 9
11 1 2 -1
12 2 2 0
13 3 2 1
14 4 2 2
15 5 2 3
16 6 2 4
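The Freq column name above is only the default. As a minimal sketch, assuming the goal is the three-column data frame from the question, the responseName argument of as.data.frame.table can label the difference column directly; the names diffs and Difference are chosen here for illustration.

```r
# Build the 10 x 10 table of X - Y with labelled dimensions.
tbl <- outer(1:10, 1:10, "-")
dimnames(tbl) <- list(X = 1:10, Y = 1:10)

# responseName replaces the default "Freq" column name.
diffs <- as.data.frame.table(tbl, responseName = "Difference")

# X and Y come back as labels (factors by default), not numbers;
# going through as.character() recovers the original integers.
diffs$X <- as.integer(as.character(diffs$X))
diffs$Y <- as.integer(as.character(diffs$Y))

head(diffs, 3)
```

The result has one row per ordered pair (100 rows here), matching the expand.grid approach.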
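You can use the combn() function to generate all combinations taken 2 at a time. Note that this yields only the 45 unordered pairs, so it omits self-pairs such as 1 - 1 as well as the reversed pairs.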
numbers = c(1,2,3,4,5,6,7,8,9,10)
output <-combn(numbers, 2, FUN = NULL, simplify = TRUE )
answer <- as.data.frame(t(output))
answer$Difference <- answer[ ,1] - answer[ ,2]
head(answer)
V1 V2 Difference
1 1 2 -1
2 1 3 -2
3 1 4 -3
4 1 5 -4
5 1 6 -5
6 1 7 -6

Creating new variable based on index position and value of other variable

I could use some help. I need to add a new variable to a dataframe based on whether or not the value of a variable in a dataframe equals the index value of another vector. Below is a simplified example:
vector [2 7 15 4 5]
dataframe (4 variables; Index, Site, Quad, Count)
Index Site Quad Count
1 2 3 0
1 3 7 2
2 1 8 0
2 3 3 1
3 2 3 0
4 3 7 2
5 1 8 0
5 3 3 1
The variable I would like to create would match the value of df$Index from the dataframe with the matching position in the vector. That is, when df$Index = 1, the new variable would be 2 (position 1 in the vector); when df$Index = 2, the new variable would be 7 (position 2 in the vector); when df$Index = 3, the new variable would be 15 (position 3 in the vector).
I've ended up in an R wormhole, and know the solution is simple, but I cannot seem to get it. Thanks for any help.
If your indexes are actually integer indices, for example
dd<-read.table(text="Index Site Quad Count
1 2 3 0
1 3 7 2
2 1 8 0
2 3 3 1
3 2 3 0
4 3 7 2
5 1 8 0
5 3 3 1", header=TRUE)
vec <- c(2, 7, 15, 4, 5)
Then you can create the new column with
dd$value <- vec[dd$Index]
dd
# Index Site Quad Count value
# 1 1 2 3 0 2
# 2 1 3 7 2 2
# 3 2 1 8 0 7
# 4 2 3 3 1 7
# 5 3 2 3 0 15
# 6 4 3 7 2 4
# 7 5 1 8 0 5
# 8 5 3 3 1 5
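Direct indexing relies on the Index values being the positions 1..n in the vector. If that does not hold, one possible variant is to look the values up with match(). The sketch below assumes a hypothetical lookup table (lookup, with columns key and value) that is not part of the original question.

```r
dd <- data.frame(Index = c(1, 1, 2, 2, 3, 4, 5, 5))

# Hypothetical lookup table: keys need not be 1..n or in order.
lookup <- data.frame(key   = c(5, 3, 1, 2, 4),
                     value = c(5, 15, 2, 7, 4))

# match() returns, for each Index, the row of lookup with that key
# (NA when a key is missing), which then indexes the value column.
dd$value <- lookup$value[match(dd$Index, lookup$key)]
dd
```

With the keys above this reproduces the same value column as vec[dd$Index].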

Create set of matrices from concatenating columns of another matrix in R

I have two matrices A and B of dimension 5 by 3 and 5 by 2, respectively. I want to produce a series of matrices, each combining A with one column of B. The dimensions of the resulting matrices would be 5 by 4.
Let A be
1 2 3
4 5 6
7 8 9
2 3 1
4 1 5
and B be
1 2
2 5
3 8
6 3
2 1
Then the resulting matrices are
1 2 3 1
4 5 6 2
7 8 9 3
2 3 1 6
4 1 5 2
and
1 2 3 2
4 5 6 5
7 8 9 8
2 3 1 3
4 1 5 1
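Use our old friend the assignment operator. Assigning the 1st column of B to a new 4th column of A (this works as written when A is a data frame; for a plain matrix, where a 4th column is out of bounds, use cbind(A, B[, 1]) instead):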
A[, 4] <- B[, 1]
> A
V1 V2 V3 V4
1 1 2 3 1
2 4 5 6 2
3 7 8 9 3
4 2 3 1 6
5 4 1 5 2
Then A[, 4] <- B[, 2], etc.
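Repeated assignment overwrites the same column each time, so only the last result survives. A minimal sketch that keeps every result, assuming A and B are plain matrices as in the question, is to cbind each column of B onto A inside lapply; the name combined is introduced here for illustration.

```r
A <- matrix(c(1, 2, 3,
              4, 5, 6,
              7, 8, 9,
              2, 3, 1,
              4, 1, 5), nrow = 5, byrow = TRUE)
B <- matrix(c(1, 2,
              2, 5,
              3, 8,
              6, 3,
              2, 1), nrow = 5, byrow = TRUE)

# One 5 x 4 matrix per column of B; A itself is left unchanged.
combined <- lapply(seq_len(ncol(B)), function(j) cbind(A, B[, j]))

combined[[1]]  # A with the 1st column of B appended
combined[[2]]  # A with the 2nd column of B appended
```

The list has as many elements as B has columns, so the same code applies unchanged when B is wider.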

Combine minimum values of row and column in matrix

Suppose I have a vector of size n = 8, v = (5, 8, 2, 7, 9, 12, 2, 1). I would like to know how to build an n x n matrix that compares every pair of values of v and returns the minimum of each comparison. In this example, it would look like this:
5 5 2 5 5 5 2 1
5 8 2 7 8 8 2 1
2 2 2 2 2 2 2 1
5 7 2 7 7 7 2 1
5 8 2 7 9 9 2 1
5 8 2 7 9 12 2 1
2 2 2 2 2 2 2 1
1 1 1 1 1 1 1 1
Could you help me with this, please?
outer(v, v, pmin)
Notice the use of pmin, not min, as the former is vectorised but not the latter.
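To make the vectorisation point concrete: outer() calls its function once with two long vectors holding every pairing, so the function must return one result per pair. The sketch below runs the pmin version on the vector from the question and, as an illustrative alternative, wraps a scalar min in Vectorize(); the names m and m2 are introduced here.

```r
v <- c(5, 8, 2, 7, 9, 12, 2, 1)

# pmin compares element by element, so it returns one minimum per pair.
m <- outer(v, v, pmin)
m

# Plain min() would collapse all pairs into a single number, which
# outer() cannot reshape into a matrix. Wrapping a scalar function in
# Vectorize() is an alternative that is slower but general.
m2 <- outer(v, v, Vectorize(function(a, b) min(a, b)))
```

Both calls produce the same symmetric 8 x 8 matrix, with v itself on the diagonal.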

Conditionally delete columns in R

I know how to delete columns in R, but I am not sure how to delete them based on the following set of conditions.
Suppose a data frame such as:
DF <- data.frame(L = c(2,4,5,1,NA,4,5,6,4,3), J= c(3,4,5,6,NA,3,6,4,3,6), K= c(0,1,1,0,NA,1,1,1,1,1),D = c(1,1,1,1,NA,1,1,1,1,1))
DF
L J K D
1 2 3 0 1
2 4 4 1 1
3 5 5 1 1
4 1 6 0 1
5 NA NA NA NA
6 4 3 1 1
7 5 6 1 1
8 6 4 1 1
9 4 3 1 1
10 3 6 1 1
The data frame has to be set up in this fashion. Column K corresponds to column L, and column D corresponds to column J. Because the values in column D are all equal to one, I would like to delete column D and the corresponding column J, yielding a data frame that looks like:
DF
L K
1 2 0
2 4 1
3 5 1
4 1 0
5 NA NA
6 4 1
7 5 1
8 6 1
9 4 1
10 3 1
I know there has got to be a simple command to do so; I just can't think of any. And if it makes any difference, the NAs must be retained.
Additional helpful information: in my real data frame there are a total of 20 columns, so there are 10 columns like L and J, and another 10 that are like K and D. I need a function that can recognize the correspondence between these two groups and delete columns accordingly if necessary.
Thank you in advance!
Okay, assuming the column-number based correspondence, here is an example:
> n <- 10
>
> # sample data
> d <- data.frame(lapply(1:n, function(x)sample(n)), lapply(1:n, function(x)sample(2, n, T, c(0.1, 0.9))-1))
> names(d) <- c(LETTERS[1:n], letters[1:n])
> head(d)
A B C D E F G H I J a b c d e f g h i j
1 5 5 2 7 4 3 4 3 5 8 0 1 1 1 1 1 1 1 1 1
2 9 8 4 6 7 8 8 2 10 5 1 1 1 1 1 1 1 1 1 1
3 6 6 10 3 5 6 2 1 8 6 1 1 1 1 1 1 1 1 1 1
4 1 7 5 5 1 10 10 4 2 4 1 1 1 1 1 1 1 1 1 1
5 10 9 6 2 9 5 6 9 9 9 1 1 0 1 1 1 1 1 1 1
6 2 1 1 4 6 1 5 8 4 10 1 1 1 1 1 1 1 1 1 1
>
> # find the columns that should be kept.
> idx <- which(colMeans(d[(n+1):(2*n)], na.rm = TRUE) != 1)
>
> # filter the data
> d[, c(idx, idx+n)]
A B C D F a b c d f
1 5 5 2 7 3 0 1 1 1 1
2 9 8 4 6 8 1 1 1 1 1
3 6 6 10 3 6 1 1 1 1 1
4 1 7 5 5 10 1 1 1 1 1
5 10 9 6 2 5 1 1 0 1 1
6 2 1 1 4 1 1 1 1 1 1
7 8 4 7 10 2 1 1 1 1 0
8 7 3 9 9 4 1 0 1 0 1
9 3 10 3 1 9 1 1 0 1 1
10 4 2 8 8 7 1 0 1 1 1
I basically agree with koshke (whose SO work is excellent), but would suggest that the test to use is colSums(d[(n+1):(2*n)], na.rm=TRUE) == NROW(d) , since a paired 0 and 2 or -1 and 3 could throw off the colMeans test.
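One caveat on both tests: without NAs, a column sum equals the row count exactly when the column mean is 1, so values such as 0 and 2 can fool either one, and with na.rm = TRUE the sum of a column containing an NA falls short of NROW(d). A minimal sketch that checks every value directly, applied to the DF from the question, might look like this; the names all_ones, keep and result are introduced here, and the positional pairing (indicator column i + n belongs to value column i) is the same assumption as above.

```r
DF <- data.frame(L = c(2, 4, 5, 1, NA, 4, 5, 6, 4, 3),
                 J = c(3, 4, 5, 6, NA, 3, 6, 4, 3, 6),
                 K = c(0, 1, 1, 0, NA, 1, 1, 1, 1, 1),
                 D = c(1, 1, 1, 1, NA, 1, 1, 1, 1, 1))

n <- ncol(DF) / 2  # first n columns are values, last n are indicators

# TRUE for indicator columns whose non-NA values are all 1.
# (A column that is entirely NA would also count as all ones.)
all_ones <- vapply(DF[(n + 1):(2 * n)],
                   function(x) all(x == 1, na.rm = TRUE),
                   logical(1))

# Keep each pair whose indicator column is not all ones.
keep <- unname(which(!all_ones))
result <- DF[, c(keep, keep + n), drop = FALSE]
result
```

Here column D is all ones, so D and its partner J are dropped, leaving L and K with the NA row retained.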
