How to use a positional vector to create a binary string? - r

I have a vector c(2, 5) and I want to turn this into a vector of n elements where positions 2 and 5 are equal to 1 and any remaining positions are equal to zero.
i.e. If I want to create a vector of length 6, I would want to use vector c(2, 5) to generate the following vector:
c(0, 1, 0, 0, 1, 0)

How about
x <- c(2, 5)
n <- 6
replace(integer(n), x, 1L)
# [1] 0 1 0 0 1 0
And another option is
as.integer(1:n %in% x)
# [1] 0 1 0 0 1 0

Related

Roll vector so max value is centered and order of values relative to it stay the same

How can I roll a vector so the max value is in the middle and the order of values relative to it stay the same?
If the length of the vector is even, then just put it in the mean value: n/2
If I have:
vec <- c(0, 2, 4, 3, 1, 8)
I want to return:
3 1 8 0 2 4
Thanks.
You can use the SOfun package which has a function called shifter:
library(SOfun)
shifter(vec, which.max(vec) - length(vec) / 2)
Output
[1] 3 1 8 0 2 4
In the second case, you have:
vec <- c(0, 8, 4, 3, 1, 2)
with max value before the middle, resulting in a negative shift:
shifter(vec, which.max(vec) - length(vec) / 2)
Output
[1] 2 0 8 4 3 1

Cluster groups based on pairwise distances

I have an n x n matrix with pairwise distances as entries. The matrix looks for example like this:
m = matrix (c(0, 0, 1, 1, 1, 1,0, 0, 1, 1, 0, 1,1, 1, 0, 1, 1, 0,1, 1, 1, 0, 1, 1,1, 0, 1, 1, 0, 1,1, 1, 0, 1, 1, 0),ncol=6, byrow=TRUE)
colnames(m) <- c("A","B","C","D","E","F")
rownames(m) <- c("A","B","C","D","E","F")
Now I want to put every letter in the same cluster if the distance to any other letter is 0. For the example above, I should get three clusters consisting of:
(A,B,E)
(C,F)
(D)
I would be interested in the number of entries in each cluster. At the end, I want to have a vector like:
clustersizes = c(3,2,1)
I assume it is possible by using the hclust function, but I'm not able to extract the three clusters. I also tried the cutree function, but if I don't know the number of clusters before and also not the cutoff for the height, how should I do it?
This is what I tried:
h <- hclust(dist(m),method="single")
plot(h)
Thanks!
Welcome to SO.
There are several ways to handle this but an easy choice is to use the igraph package.
First we convert your matrix m to an adjacency matrix. It contains the distances to neighbouring nodes, where 0 means no connection. Thus, we subtract your matrix from 1 to get that
mm <- 1 - m
diag(mm) <- 0 # We don't allow loops
This gives
> mm
A B C D E F
A 0 1 0 0 0 0
B 1 0 0 0 1 0
C 0 0 0 0 0 1
D 0 0 0 0 0 0
E 0 1 0 0 0 0
F 0 0 1 0 0 0
Then we just need to feed it to igraph to compute communities
library("igraph")
fastgreedy.community(as.undirected(graph.adjacency(mm)))
which produces
IGRAPH clustering fast greedy, groups: 3, mod: 0.44
+ groups:
$`1`
[1] "A" "B" "E"
$`2`
[1] "C" "F"
$`3`
[1] "D"
Now if you save that result you can get the community sizes right away
res < fastgreedy.community(as.undirected(graph.adjacency(mm)))
sizes(res)
which yields
Community sizes
1 2 3
3 2 1

dplyr syntax to replace all elements of a vector after a certain value

What could be the dplyr syntax for replacing all elements (by 0) after a certain condition (x < 0) is first reach
x <- c(1, 2, 4, -1, 1, 3)
if(any(x < 0)) x[min(which(x < 0)):length(x)] <- 0
> x
[1] 1 2 4 5 0 0 0

Transform categorical attribute vector into similarity matrix

I need to transfrom a categorical attribute vector into a "same attribute matrix" using R.
For example I have a vector which reports gender of N people (male = 1, female = 0). I need to convert this vector into a NxN matrix named A (with people names on rows and columns), where each cell Aij has the value of 1 if two persons (i and j) have the same gender and 0 otherwise.
Here is an example with 3 persons, first male, second female, third male, which produce this vector:
c(1, 0, 1)
I want to transform it into this matrix:
A = matrix( c(1, 0, 1, 0, 1, 0, 1, 0, 1), nrow=3, ncol=3, byrow = TRUE)
Like lmo said in acomment it's impossible to know the structure of your dataset so what follows is just an example for you to see how it could be done.
First, make up some data.
set.seed(3488) # make the results reproducible
x <- LETTERS[1:5]
y <- sample(0:1, 5, TRUE)
df <- data.frame(x, y)
Now tabulate it according to your needs
A <- outer(df$y, df$y, function(a, b) as.integer(a == b))
dimnames(A) <- list(df$x, df$x)
A
# A B C D E
#A 1 1 1 0 0
#B 1 1 1 0 0
#C 1 1 1 0 0
#D 0 0 0 1 1
#E 0 0 0 1 1

how to remove one data in r

In R I have some vector.
x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0)
I want to remove only "0" in x vector, but it removes all '0' in this vector.
Example
x=x[!x %in% 0 )]
All zero in this vector had been remove in x vector
For Example in Python
x = [0,1,0,1,0,0,0,1]
x.remove(0)
x
[1, 0, 1, 0, 0, 0, 1]
x.remove(0)
x
[1, 1, 0, 0, 0, 1]
We can use match to remove the first occurrence of a particular number
x <- c(1, 0, 1, 0, 0, 0, 1)
x[-match(1, x)]
#[1] 0 1 0 0 0 1
If you have any other number to remove in array, for example 5 in the case below,
x <- c(1, 0, 5, 5, 0, 0, 1)
x[-match(5, x)]
#[1] 1 0 5 0 0 1
You may need which.min(),
which determines the index of the first minimum of a vector:
x <- c(0,1,0,1,0,0,0,1)
x <- x[-which.min(x)]
x
# [1] 1 0 1 0 0 0 1
If your vector contains elements other than 0 or 1: x <- x[-which.min(x != 0)]

Resources