Given a matrix with one row, one column, or one cell, I need to reorder the rows while keeping the matrix structure. I tried adding drop=F, but it doesn't work. What am I doing wrong?
test = matrix(letters[1:5]) # is a matrix
test[5:1,,drop=F] # not a matrix
test2 = matrix(letters[1:5],nrow=1) # is a matrix
test2[1:1,,drop=F] # not a matrix
test3 = matrix(1) # is a matrix
test3[1:1,,drop=F] # not a matrix
I'd guess F was overwritten. F is an ordinary variable that can be reassigned, in which case it is no longer false. Always write out FALSE in full; it is a reserved word and cannot be reassigned.
See Is there anything wrong with using T & F instead of TRUE & FALSE?
Also the R Inferno, section 8.1.32, is a good reference.
> F <- 1
> test = matrix(letters[1:5]) # is a matrix
> test[5:1,,drop=F] # not a matrix
[1] "e" "d" "c" "b" "a"
> test[5:1,,drop=FALSE] # but this is a matrix
[,1]
[1,] "e"
[2,] "d"
[3,] "c"
[4,] "b"
[5,] "a"
> rm(F)
> test[5:1,,drop=F] # now a matrix again
[,1]
[1,] "e"
[2,] "d"
[3,] "c"
[4,] "b"
[5,] "a"
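A quick way to check for this in your own session is to compare F against FALSE directly. The following is only a minimal sketch; the variable names (m, f_is_masked, dropped, kept, reserved_error) are illustrative and not part of the question.

```r
m <- matrix(letters[1:5])

F <- 1                                  # simulate an accidental overwrite of F
f_is_masked <- !identical(F, FALSE)     # TRUE while F is masked
dropped <- m[5:1, , drop = F]           # drop = 1 is treated as TRUE
kept    <- m[5:1, , drop = FALSE]       # FALSE always means FALSE
rm(F)                                   # remove the masking variable

# Assigning to the reserved word FALSE raises an error instead of masking it
reserved_error <- tryCatch(
  eval(parse(text = "FALSE <- 1")),
  error = function(e) conditionMessage(e)
)
```

Here dropped has lost its dim attribute, while kept is still a 5 x 1 matrix.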
The code in your question works fine in a fresh R session:
test = matrix(letters[1:5]) # is a matrix
result = test[5:1,,drop=F]
result
# [,1]
# [1,] "e"
# [2,] "d"
# [3,] "c"
# [4,] "b"
# [5,] "a"
class(result) # still a matrix
# [1] "matrix"
dim(result)
# [1] 5 1
Even on the 1x1 matrix:
test3 = matrix(1) # is a matrix
result3 = test3[1:1,,drop=F]
class(result3)
# [1] "matrix"
dim(result3)
# [1] 1 1
Maybe you've loaded other packages that are overriding the default behavior? What makes you think you don't end up with a matrix?
The following works:
test <- matrix(test[5:1,, drop = F], nrow = 5, ncol = 1)
If you test the result with is.matrix, it returns TRUE. Specifying the number of rows (nrow) and columns (ncol) coerces the result back to the dimensions you require.
Related
I'm trying to convert a data set in a long-format panel structure to an adjacency matrix or edge list to make network graphs. The data set contains articles, each identified by an ID number. Each article can appear several times under a number of categories. Hence I have a long-format structure at the moment:
ID <- c(1,1,1,2,2,2,3,3)
Category <- c("A","B","C","B","E","H","C","E")
dat <- data.frame(ID,Category)
I want to convert this into an adjacency matrix or edge list, where the edge list should look something like this:
A B
A C
B C
B E
B H
E H
C E
Edit: I have tried dat <- merge(ID, Category, by="Category") but it returns the error message Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
Thanks in advance
Update: I ended up using the crossprod(table(dat)) from the comments, but the solution suggested by Navy Cheng below works just as well
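For reference, the crossprod(table(dat)) approach mentioned in the update could look roughly like this. It is a sketch under the assumption that an edge means "two categories share at least one ID"; the name adj and the step that zeroes the diagonal are my additions, not part of the original comment.

```r
ID <- c(1, 1, 1, 2, 2, 2, 3, 3)
Category <- c("A", "B", "C", "B", "E", "H", "C", "E")
dat <- data.frame(ID, Category)

# table(dat) is an ID x Category incidence table; its cross-product counts,
# for every pair of categories, how many IDs they have in common
adj <- crossprod(table(dat))

# The diagonal holds how often each category occurs; zero it so that
# categories are not linked to themselves (assumption: no self-loops wanted)
diag(adj) <- 0
adj
```

Non-zero entries such as adj["A", "B"] correspond to rows of the desired edge list.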
This code will work
do.call(rbind,lapply(split(dat, dat$ID), function(x){
t(combn(as.vector(x$Category), 2))
}))
Update
Following @Parfait's suggestion, you can use by instead of split + lapply.
1) Use by to group the nodes ("A", "B", "C" ...) by ID;
2) Use combn to create the edges between the nodes in each group, and t to transpose the result so that the groups can later be combined with rbind
> edge.list <- by(dat, dat$ID, function(x) t(combn(as.vector(x$Category), 2)))
> edge.list
dat$ID: 1
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[3,] "B" "C"
------------------------------------------------------------
dat$ID: 2
[,1] [,2]
[1,] "B" "E"
[2,] "B" "H"
[3,] "E" "H"
------------------------------------------------------------
dat$ID: 3
[,1] [,2]
[1,] "C" "E"
3) Then merge the list
> do.call(rbind, edge.list)
[,1] [,2]
[1,] "A" "B"
[2,] "A" "C"
[3,] "B" "C"
[4,] "B" "E"
[5,] "B" "H"
[6,] "E" "H"
[7,] "C" "E"
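One caveat: combn(x, 2) stops with an error when a group contains fewer than two categories. A hedged sketch of a guard is below; the helper name edges_for and the extra row with ID 4 are hypothetical and only serve to illustrate the case.

```r
ID <- c(1, 1, 1, 2, 2, 2, 3, 3)
Category <- c("A", "B", "C", "B", "E", "H", "C", "E")
dat <- data.frame(ID, Category)

# Hypothetical extra article that appears under a single category only
dat2 <- rbind(dat, data.frame(ID = 4, Category = "Z"))

# Return the pairs for one ID, or NULL when no pair can be formed
edges_for <- function(x) {
  cats <- as.character(x$Category)
  if (length(cats) < 2) return(NULL)
  t(combn(cats, 2))
}

# rbind silently skips the NULL entries
edge_list <- do.call(rbind, lapply(split(dat2, dat2$ID), edges_for))
edge_list
```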
If you are willing to convert your data.frame to a data.table, this problem can be solved efficiently and cleanly, and with many rows it will be much faster.
library(data.table)
dat<-data.table(dat)
Basically, you can apply functions to columns of the data.table in the j argument and group with the by argument. So you want all the combinations of categories taken two at a time for each ID, which looks like this:
dat[,combn(Category,2),by=ID]
However, stopping at this point keeps the ID column and, by default, creates a column called V1 that flattens the array returned by combn into a single vector of categories, rather than the two-column matrix that you need. By chaining another call, you can reshape that vector into a matrix as you would any other vector. In one line of code this looks like:
dat[,combn(Category,2),by=ID][,matrix(V1,ncol=2,byrow = TRUE)]
Remember that the vector column we wish to convert to a matrix is called V1 by default and also we want the 2-column matrix to be created by row instead of the default which is by column. Hope that helps and let me know if I need to add anything to my explanation. Good luck!
In R I have produced the following list L:
>L
[[1]]
[1] "A" "B" "C"
[[2]]
[1] "D"
[[3]]
[1] NULL
I would like to transform the list L into a data frame df like
>df
df[,1] df[,2]
"A" 1
"B" 1
"C" 1
"D" 2
where the 2nd column gives the position in the list L of the corresponding element in column 1.
My question is: are there built-in R functions that can do this manipulation quickly? I can do it by brute force, but my solution does not scale well to much bigger lists.
I thank you all!
You'll get a warning because of your NULL value, but you can use stack if you give your list items names:
L <- list(c("A", "B", "C"), "D", NULL)
stack(setNames(L, seq_along(L)))
# values ind
# 1 A 1
# 2 B 1
# 3 C 1
# 4 D 2
# Warning message:
# In stack.default(setNames(L, seq_along(L))) :
# non-vector elements will be ignored
If the warning displeases you, you can, of course, run stack on the non-NULL elements, but do it after you name your list elements so that the "ind" column reflects the correct value.
I'll show in 2 steps just for clarity:
names(L) <- seq_along(L)
stack(L[!sapply(L, is.null)])
Similarly, if you've gotten rid of the NULL list elements, you can use melt from "reshape2". You don't gain anything in brevity, and I'm not sure that you gain anything in efficiency either, but I thought I'd share it as an option.
library(reshape2)
names(L) <- seq_along(L)
melt(L[!sapply(L, is.null)])
Ananda's answer is seemingly better than this, but I'll put it up anyway:
> cbind(unlist(L), rep(1:length(L), sapply(L, length)))
[,1] [,2]
[1,] "A" "1"
[2,] "B" "1"
[3,] "C" "1"
[4,] "D" "2"
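Note that cbind turns the positions into character strings, because a matrix can hold only one type. If you would rather keep them as integers, a data.frame built the same way is one option. This is a sketch; the column names value and position are my own choice.

```r
L <- list(c("A", "B", "C"), "D", NULL)

# lengths() gives the number of elements in each list entry (0 for NULL),
# so every value is paired with the index of the entry it came from
df <- data.frame(
  value    = unlist(L),
  position = rep(seq_along(L), lengths(L)),
  stringsAsFactors = FALSE
)
df
```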
I have a vector with five items.
my_vec <- c("a","b","a","c","d")
If I want to re-arrange those values into a new vector (shuffle), I could use sample():
shuffled_vec <- sample(my_vec)
Easy - but the sample() function only gives me one possible shuffle. What if I want to know all possible shuffling combinations? The various "combn" functions don't seem to help, and expand.grid() gives me every possible combination with replacement, when I need it without replacement. What's the most efficient way to do this?
Note that my vector contains the value "a" twice, so every shuffled vector returned should also contain "a" twice.
I think permn from the combinat package does what you want
library(combinat)
permn(my_vec)
A smaller example
> x
[1] "a" "a" "b"
> permn(x)
[[1]]
[1] "a" "a" "b"
[[2]]
[1] "a" "b" "a"
[[3]]
[1] "b" "a" "a"
[[4]]
[1] "b" "a" "a"
[[5]]
[1] "a" "b" "a"
[[6]]
[1] "a" "a" "b"
If the duplicates are a problem you could do something similar to this to get rid of duplicates
strsplit(unique(sapply(permn(my_vec), paste, collapse = ",")), ",")
Or probably a better approach to removing duplicates...
dat <- do.call(rbind, permn(my_vec))
dat[!duplicated(dat),] # keep only the first occurrence of each row
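If you would rather not generate the duplicates in the first place, a small recursive function in base R can build only the distinct orderings. This is a sketch of one possible approach, not something taken from combinat; unique_perms is a hypothetical helper and assumes a non-empty input vector.

```r
# All distinct orderings of v, one per row (assumes length(v) >= 1)
unique_perms <- function(v) {
  if (length(v) == 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (val in unique(v)) {
    rest <- v[-match(val, v)]               # remove one copy of val
    out <- rbind(out, cbind(val, unique_perms(rest)))
  }
  unname(out)
}

my_vec <- c("a", "b", "a", "c", "d")
p <- unique_perms(my_vec)
nrow(p)   # 5! / 2! orderings, since "a" appears twice
```

Because each distinct value is tried only once per position, no row is produced twice.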
Noting that your data is effectively 5 levels from 1-5, encoded as "a", "b", "a", "c", and "d", I went looking for ways to get the permutations of the numbers 1-5 and then remap those to the levels you use.
Let's start with the input data:
my_vec <- c("a","b","a","c","d") # the character
my_vec_ind <- seq(1,length(my_vec),1) # their identifier
To get the permutations, I applied the function given at Generating all distinct permutations of a list in R:
permutations <- function(n){
if(n==1){
return(matrix(1))
} else {
sp <- permutations(n-1)
p <- nrow(sp)
A <- matrix(nrow=n*p,ncol=n)
for(i in 1:n){
A[(i-1)*p+1:p,] <- cbind(i,sp+(sp>=i))
}
return(A)
}
}
First, create a data.frame with the permutations:
tmp <- data.frame(permutations(length(my_vec)))
You now have a data frame tmp of 120 rows, where each row is a unique permutation of the numbers, 1-5:
>tmp
X1 X2 X3 X4 X5
1 1 2 3 4 5
2 1 2 3 5 4
3 1 2 4 3 5
...
119 5 4 3 1 2
120 5 4 3 2 1
Now you need to remap them to the strings you had. You can remap them using a variation on the theme of gsub(), proposed here: R: replace characters using gsub, how to create a function?
gsub2 <- function(pattern, replacement, x, ...) {
for(i in 1:length(pattern))
x <- gsub(pattern[i], replacement[i], x, ...)
x
}
A single gsub() call won't work because there is more than one pattern and replacement.
You also need a function you can call using lapply() to use the gsub2() function on every element of your tmp data.frame.
remap <- function(x,
old,
new){
return(gsub2(pattern = old,
replacement = new,
fixed = TRUE,
x = as.character(x)))
}
Almost there. We do the mapping like this:
shuffled_vec <- as.data.frame(lapply(tmp,
remap,
old = as.character(my_vec_ind),
new = my_vec))
which can be simplified to...
shuffled_vec <- as.data.frame(lapply(data.frame(permutations(length(my_vec))),
remap,
old = as.character(my_vec_ind),
new = my_vec))
.. should you feel the need.
That gives you your required answer:
> shuffled_vec
X1 X2 X3 X4 X5
1 a b a c d
2 a b a d c
3 a b c a d
...
119 d c a a b
120 d c a b a
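As an alternative to the gsub-based remapping, the permutation matrix can be used directly as an index into my_vec. The sketch below repeats the permutations function from above so that it runs on its own; the names idx and shuffled are illustrative.

```r
# Same recursive generator as above: one permutation of 1..n per row
permutations <- function(n) {
  if (n == 1) {
    return(matrix(1))
  } else {
    sp <- permutations(n - 1)
    p <- nrow(sp)
    A <- matrix(nrow = n * p, ncol = n)
    for (i in 1:n) {
      A[(i - 1) * p + 1:p, ] <- cbind(i, sp + (sp >= i))
    }
    return(A)
  }
}

my_vec <- c("a", "b", "a", "c", "d")
idx <- permutations(length(my_vec))

# Indexing with a matrix returns a plain vector in column order,
# so rebuild the matrix with the same number of rows
shuffled <- matrix(my_vec[idx], nrow = nrow(idx))
head(shuffled, 3)
```

This avoids string replacement entirely, which also sidesteps any problems with indices of more than one digit.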
Looking at a previous question (R: generate all permutations of vector without duplicated elements), I can see that the gtools package has a function for this. I couldn't however get this to work directly on your vector as such:
permutations(n = 5, r = 5, v = my_vec)
#Error in permutations(n = 5, r = 5, v = my_vec) :
# too few different elements
You can adapt it however like so:
apply(permutations(n = 5, r = 5), 1, function(x) my_vec[x])
# [,1] [,2] [,3] [,4]
#[1,] "a" "a" "a" "a" ...
#[2,] "b" "b" "b" "b" ...
#[3,] "a" "a" "c" "c" ...
#[4,] "c" "d" "a" "d" ...
#[5,] "d" "c" "d" "a" ...
UPDATE: FIXED
This is fixed in the upcoming release of R 3.1.0. From the CHANGELOG:
combn(x, simplify = TRUE) now gives a factor result for factor input
x (previously user error).
Related to PR#15442
I just noticed a curious thing. Why does combn appear to unclass factor variables to their underlying numeric values for all except the first combination?
x <- as.factor( letters[1:3] )
combn( x , 2 )
# [,1] [,2] [,3]
#[1,] "a" "1" "2"
#[2,] "b" "3" "3"
This doesn't occur when x is a character:
x <- as.character( letters[1:3] )
combn( x , 2 )
# [,1] [,2] [,3]
#[1,] "a" "a" "b"
#[2,] "b" "c" "c"
Reproducible on R64 on OS X 10.7.5 and Windows 7.
I think it is due to the conversion to matrix done by the simplify parameter. If you don't use it you get:
combn( x , 2 , simplify=FALSE)
[[1]]
[1] a b
Levels: a b c
[[2]]
[1] a c
Levels: a b c
[[3]]
[1] b c
Levels: a b c
The fact that the first column is OK is due to the way combn works: the first column is specified separately and the other columns are then changed from the existing matrix using [<-. Consider:
m <- matrix(x,3,3)
m[,2] <- sample(x)
m
[,1] [,2] [,3]
[1,] "a" "1" "a"
[2,] "b" "3" "b"
[3,] "c" "2" "c"
I think the offending function is therefore [<-.
As Konrad said, the treatment of factors is often odd, or at least inconsistent. In this case I think the behaviour is weird enough to constitute a bug. Try submitting it, and see what the response is.
Since the result is a matrix, and there is no factor matrix type, I think that the correct behaviour would be to convert factor inputs to character somewhere near the start of the function.
I had the same problem. Coercing back to a character vector inside the combn command seems to work:
> combn(as.character(x),2)
[,1] [,2] [,3]
[1,] "a" "a" "b"
[2,] "b" "c" "c"
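Another option, if you want to keep the factor as input, is to skip the simplification step and convert each combination yourself. A minimal sketch (the name res is illustrative):

```r
x <- as.factor(letters[1:3])

# simplify = FALSE returns a list of factors; as.character() then maps each
# one to its labels, and sapply() assembles the columns into a matrix
res <- sapply(combn(x, 2, simplify = FALSE), as.character)
res
```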
I am trying to find a way to get a list in R of all the possible unique permutations of A,A,A,A,B,B,B,B,B.
Combinations were originally thought to be the way to obtain a solution, hence the answers about combinations.
I think this is what you're after. @bill was on the ball with the recommendation of combining unique and combn. We'll also use the apply family to generate ALL of the combinations. Since unique removes duplicate rows, we need to transpose the results from combn before calling unique on them. We then transpose them back before printing so that each column represents a unique answer.
#Daters
x <- c(rep("A", 4), rep("B",5))
#Generates a list with ALL of the combinations
zz <- sapply(seq_along(x), function(y) combn(x,y))
#Filter out all the duplicates
sapply(zz, function(z) t(unique(t(z))))
Which returns:
[[1]]
[,1] [,2]
[1,] "A" "B"
[[2]]
[,1] [,2] [,3]
[1,] "A" "A" "B"
[2,] "A" "B" "B"
[[3]]
[,1] [,2] [,3] [,4]
[1,] "A" "A" "A" "B"
[2,] "A" "A" "B" "B"
[3,] "A" "B" "B" "B"
...
EDIT: Since the question is about permutations and not combinations, the answer above is not that useful. This post outlines a function to generate the unique permutations given a set of parameters. I have no idea if it could be improved upon, but here's one approach using that function:
fn_perm_list <-
function (n, r, v = 1:n)
{
if (r == 1)
matrix(v, n, 1)
else if (n == 1)
matrix(v, 1, r)
else {
X <- NULL
for (i in 1:n) X <- rbind(X, cbind(v[i], fn_perm_list(n -
1, r - 1, v[-i])))
X
}
}
zz <- fn_perm_list(9, 9)
#Turn into character matrix. This currently does not generalize well, but gets the job done
zz <- ifelse(zz <= 4, "A", "B")
#Returns 126 rows as indicated in comments
unique(zz)
There's no need to generate permutations and then pick out the unique ones.
Here's a much simpler way (and much, much faster as well): To generate all permutations of 4 A's and 5 B's, we just need to enumerate all possible ways of placing 4 A's among 9 possible locations. This is simply a combinations problem. Here's how we can do this:
x <- rep('B',9) # vector of 9 B's
a_pos <- combn(9,4) # all possible ways to place 4 A's among 9 positions
perms <- apply(a_pos, 2, function(p) replace(x,p,'A')) # all desired permutations
Each column of the 9x126 matrix perms is a unique permutation of 4 A's and 5 B's:
> dim(perms)
[1] 9 126
> perms[,1:4] ## look at first few columns
[,1] [,2] [,3] [,4]
[1,] "A" "A" "A" "A"
[2,] "A" "A" "A" "A"
[3,] "A" "A" "A" "A"
[4,] "A" "B" "B" "B"
[5,] "B" "A" "B" "B"
[6,] "B" "B" "A" "B"
[7,] "B" "B" "B" "A"
[8,] "B" "B" "B" "B"
[9,] "B" "B" "B" "B"
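The same idea can be wrapped in a small function for any two symbols and counts, and the result checked against the binomial coefficient. This is only a sketch; two_symbol_perms and its argument names are hypothetical.

```r
# All distinct arrangements of n_a copies of a and n_b copies of b,
# one arrangement per column
two_symbol_perms <- function(n_a, n_b, a = "A", b = "B") {
  n <- n_a + n_b
  base <- rep(b, n)
  # choose the n_a positions that receive the symbol a
  apply(combn(n, n_a), 2, function(p) replace(base, p, a))
}

perms <- two_symbol_perms(4, 5)
ncol(perms) == choose(9, 4)   # one column per choice of positions
```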