Is there an expand.grid-like function in R that returns permutations?

To be more specific, here is an example:
> expand.grid(5, 5, c(1:4,6),c(1:4,6))
Var1 Var2 Var3 Var4
1 5 5 1 1
2 5 5 2 1
3 5 5 3 1
4 5 5 4 1
5 5 5 6 1
6 5 5 1 2
7 5 5 2 2
8 5 5 3 2
9 5 5 4 2
10 5 5 6 2
11 5 5 1 3
12 5 5 2 3
13 5 5 3 3
14 5 5 4 3
15 5 5 6 3
16 5 5 1 4
17 5 5 2 4
18 5 5 3 4
19 5 5 4 4
20 5 5 6 4
21 5 5 1 6
22 5 5 2 6
23 5 5 3 6
24 5 5 4 6
25 5 5 6 6
This data frame was created from all combinations of the supplied vectors. I would like to create a similar data frame from all permutations of the supplied vectors. Note that each row must contain exactly two fives, though not necessarily in the first two positions.
Thank you.

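The code below works (it relies on permutations from the gtools package):
library(gtools)  # provides permutations()
# one combination per column
comb <- t(as.matrix(expand.grid(5, 5, c(1:4,6),c(1:4,6))))
# one ordering of the four positions per column
perms <- t(permutations(4,4))
# apply every ordering to every combination
ans <- apply(comb,2,function(x) x[perms])
# one permuted combination per row, duplicates dropped
ans <- unique(matrix(as.vector(ans), ncol = 4, byrow = TRUE))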
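A base R alternative, offered as a minimal sketch rather than as the answer's method: since both free positions draw from the same values (1:4 and 6), one can enumerate every 4-tuple over 1:6 and keep only the rows with exactly two fives. This relies on the two free vectors being identical, as they are in the question; the names grid and res are illustrative.

```r
# Enumerate all 6^4 tuples, then keep the rows with exactly two fives.
# Assumption: every non-five position may take any of 1:4 or 6.
grid <- expand.grid(rep(list(1:6), 4))
res <- grid[rowSums(grid == 5) == 2, ]
rownames(res) <- NULL
nrow(res)  # choose(4, 2) * 5^2 rows
```

This trades some wasted enumeration for not needing gtools; for longer rows the intermediate grid grows quickly, so the permutation-based code above may be the better fit.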

Try ?allPerms in the vegan package (in current releases allPerms lives in the permute package, which vegan depends on).

Related

Create a new variable based on existing variable

My current dataset looks like this:
Order V1
1 7
2 5
3 8
4 5
5 8
6 3
7 4
8 2
1 8
2 6
3 3
4 4
5 5
6 7
7 3
8 6
I want to create a new variable called "V2" based on the variables "Order" and "V1". Within every block of 8 items in "Order", "V2" should be 0 where "Order" equals 1; otherwise "V2" takes the value of the previous item in "V1".
This is the dataset that I want:
Order V1 V2
1 7 0
2 5 7
3 8 5
4 5 8
5 8 5
6 3 8
7 4 3
8 2 4
1 8 0
2 6 8
3 3 6
4 4 3
5 5 4
6 7 5
7 3 7
8 6 3
Since my actual dataset is very large, I am trying to use a for loop with an if statement to generate "V2", but my code keeps failing. I would appreciate any help, and I am open to other approaches. Thank you!
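(Up front: I am assuming that the rows are already sorted, so that Order runs in sequence within each block.)
You simply need ifelse and lag. Note that this is dplyr's lag; base R's stats::lag does not shift the values of a plain vector: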
df <- read.table(text="Order V1
1 7
2 5
3 8
4 5
5 8
6 3
7 4
8 2
1 8
2 6
3 3
4 4
5 5
6 7
7 3
8 6 ", header=TRUE)
# dplyr::lag shifts V1 down by one row; the leading NA falls where Order == 1
df$V2 <- ifelse(df$Order==1, 0, dplyr::lag(df$V1))
df
# Order V1 V2
# 1 1 7 0
# 2 2 5 7
# 3 3 8 5
# 4 4 5 8
# 5 5 8 5
# 6 6 3 8
# 7 7 4 3
# 8 8 2 4
# 9 1 8 0
# 10 2 6 8
# 11 3 3 6
# 12 4 4 3
# 13 5 5 4
# 14 6 7 5
# 15 7 3 7
# 16 8 6 3
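A sketch of the same idea without dplyr, on a smaller made-up data frame (two blocks of four rows; the names prev and unshifted are illustrative). It also shows why the choice of lag matters: stats::lag only adjusts time-series attributes and leaves the values of a plain vector where they are.

```r
# Hypothetical small example: two blocks of four rows each.
df <- data.frame(Order = rep(1:4, 2), V1 = c(7, 5, 8, 5, 8, 6, 3, 4))

# Shift V1 down by one row using base R only.
prev <- c(NA, head(df$V1, -1))
df$V2 <- ifelse(df$Order == 1, 0, prev)

# stats::lag() adds a tsp attribute but does not move the values.
unshifted <- as.vector(stats::lag(df$V1))
identical(unshifted, df$V1)
```

Both steps are vectorised, so no for loop is needed even on a large data set.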
A base R alternative (the data frame is called dat here):
with(dat, {V2 <- c(0, head(V1, -1)); V2[Order == 1] <- 0; dat$V2 <- V2; dat})
Order V1 V2
1 1 7 0
2 2 5 7
3 3 8 5
4 4 5 8
5 5 8 5
6 6 3 8
7 7 4 3
8 8 2 4
9 1 8 0
10 2 6 8
11 3 3 6
12 4 4 3
13 5 5 4
14 6 7 5
15 7 3 7
16 8 6 3

Subsetting a dataframe in R including observations that satisfy condition

I would like to randomly subset a data frame, with the condition that if one observation with alpha=1 is included in a subset, then all observations with alpha=1 must be included in that subset. I have simplified the data so that it looks like this:
df
alpha beta gamma
1 5 2
1 6 3
1 5 3
2 3 2
2 5 9
2 2 6
3 3 4
3 4 7
3 3 8
4 3 4
4 8 3
4 4 9
5 9 8
5 5 5
5 3 5
What command should I use to get subsets like the following?
df1
alpha beta gamma
1 5 2
1 6 3
1 5 3
3 3 4
3 4 7
3 3 8
5 9 8
5 5 5
5 3 5
df2
alpha beta gamma
2 3 2
2 5 9
2 2 6
4 3 4
4 8 3
4 4 9
5 9 8
5 5 5
5 3 5
df3
alpha beta gamma
1 5 2
1 6 3
1 5 3
2 3 2
2 5 9
2 2 6
5 9 8
5 5 5
5 3 5
Specifically, suppose the first observation in df, (1,5,2), randomly falls into subsets df1 and df3. Then the 2nd and 3rd observations in df, (1,6,3) and (1,5,3), must also be included in subsets df1 and df3.
I hope that my question is clear. Please help.
Try this
str <- "alpha,beta,gamma
1,5,2
1,6,3
1,5,3
2,3,2
2,5,9
2,2,6
3,3,4
3,4,7
3,3,8
4,3,4
4,8,3
4,4,9
5,9,8
5,5,5
5,3,5"
df <- read.csv(textConnection(str))
df[df$alpha %in% sample(unique(df$alpha), 3), ]
Output (one possible result, since the sample is random):
alpha beta gamma
4 2 3 2
5 2 5 9
6 2 2 6
10 4 3 4
11 4 8 3
12 4 4 9
13 5 9 8
14 5 5 5
15 5 3 5

Eliminate in an increasing order rows in a data frame

x<-c(4,5,6,23,5,6,7,8,0,3)
y<-c(2,4,5,6,23,5,6,7,8,0)
z<-c(1,2,4,5,6,23,5,6,7,8)
df<-data.frame(x,y,z)
df
x y z
1 4 2 1
2 5 4 2
3 6 5 4
4 23 6 5
5 5 23 6
6 6 5 23
7 7 6 5
8 8 7 6
9 0 8 7
10 3 0 8
I would like to eliminate the number 23 from every column of df, not by matching the value 23 but by position: starting from its row in column x, the row to remove increases by one with each column (row 4 in x, row 5 in y, row 6 in z). The desired result is:
df
x y z
1 4 2 1
2 5 4 2
3 6 5 4
4 5 6 5
5 6 5 6
6 7 6 5
7 8 7 6
8 0 8 7
9 3 0 8
Thank you
You can iterate through the columns and remove the element from each, then reassemble as a data frame:
result <- as.data.frame(lapply(1:ncol(df), function(x) df[-(x+3),x]))
names(result) <- names(df)
result
## x y z
## 1 4 2 1
## 2 5 4 2
## 3 6 5 4
## 4 5 6 5
## 5 6 5 6
## 6 7 6 5
## 7 8 7 6
## 8 0 8 7
## 9 3 0 8
Here x is the column index, so df[-(x+3),x] is column x with the element at row x+3 removed by position. To start at row N in the first column, use df[-(x+N-1),x] instead.
You could also try:
n <- 4
df1 <- df[-n,]
df1[] <- unlist(df,use.names=FALSE)[-seq(n, prod(dim(df)), by=nrow(df)+1)]
df1
# x y z
#1 4 2 1
#2 5 4 2
#3 6 5 4
#5 5 6 5
#6 6 5 6
#7 7 6 5
#8 8 7 6
#9 0 8 7
#10 3 0 8
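Both answers hard-code the starting row. A sketch of a reusable version follows; drop_staggered is a hypothetical helper name, and it assumes the staggered rows all fall inside the data frame.

```r
x <- c(4, 5, 6, 23, 5, 6, 7, 8, 0, 3)
y <- c(2, 4, 5, 6, 23, 5, 6, 7, 8, 0)
z <- c(1, 2, 4, 5, 6, 23, 5, 6, 7, 8)
df <- data.frame(x, y, z)

# Hypothetical helper: remove row `start` from column 1, `start + 1` from
# column 2, and so on, then reassemble the shortened columns.
drop_staggered <- function(d, start) {
  rows <- start + seq_along(d) - 1
  stopifnot(max(rows) <= nrow(d))
  as.data.frame(Map(function(col, r) col[-r], d, rows))
}

res <- drop_staggered(df, 4)
```

Map pairs each column with its own row to drop, so the column names carry over without a separate names() call.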

R, Using reshape to pull pre post data

I have a simple data frame as follows
x = data.frame(id = seq(1,10),val = seq(1,10))
x
id val
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
I want to add 4 more columns. The first two hold the values from the previous two rows and the last two hold the values from the next two rows. Where those rows do not exist (the first two and the last two rows), the value should be NA.
How do I accomplish this using cast in the reshape package?
The final output would look like
1 1 NA NA 2 3
2 2 NA 1 3 4
3 3 1 2 4 5
4 4 2 3 5 6
... and so on...
Thanks much in advance
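After you gave the example, I changed the solution:
dat <- x  # the question's data frame
# id and val hold the same values here, so mixing them gives the same result
mat <- cbind(dat,
             c(c(NA,NA),head(dat$id,-2)),   # two rows back
             c(c(NA),head(dat$val,-1)),     # one row back
             c(tail(dat$id,-1),c(NA)),      # one row ahead
             c(tail(dat$val,-2),c(NA,NA)))  # two rows ahead
colnames(mat) <- c('id','val','idp','valp','idn','valn')
mat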
id val idp valp idn valn
1 1 1 NA NA 2 3
2 2 2 NA 1 3 4
3 3 3 1 2 4 5
4 4 4 2 3 5 6
5 5 5 3 4 6 7
6 6 6 4 5 7 8
7 7 7 5 6 8 9
8 8 8 6 7 9 10
9 9 9 7 8 10 NA
10 10 10 8 9 NA NA
Here is a solution with sapply. First, choose the row offsets for the new columns:
lags <- c(-2, -1, 1, 2)
Create the new columns:
newcols <- sapply(lags,
                  function(l) {
                    tmp <- seq.int(nrow(x)) + l
                    x[replace(tmp, tmp < 1 | tmp > nrow(x), NA), "val"]
                  })
Bind together:
cbind(x, newcols)
The result:
id val 1 2 3 4
1 1 1 NA NA 2 3
2 2 2 NA 1 3 4
3 3 3 1 2 4 5
4 4 4 2 3 5 6
5 5 5 3 4 6 7
6 6 6 4 5 7 8
7 7 7 5 6 8 9
8 8 8 6 7 9 10
9 9 9 7 8 10 NA
10 10 10 8 9 NA NA
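The result above has the default column names 1 to 4. A sketch of the same offset approach with a small helper and descriptive names follows; shift_val and the names prev2, prev1, next1 and next2 are illustrative, not part of the answer.

```r
x <- data.frame(id = 1:10, val = 1:10)
lags <- c(-2, -1, 1, 2)

# Hypothetical helper: the value k rows away, NA when that row does not exist.
shift_val <- function(v, k) {
  idx <- seq_along(v) + k
  idx[idx < 1 | idx > length(v)] <- NA
  v[idx]
}

newcols <- sapply(lags, function(k) shift_val(x$val, k))
colnames(newcols) <- c("prev2", "prev1", "next1", "next2")
out <- cbind(x, newcols)
```

Indexing with NA returns NA, which is what fills the first two and last two rows. Nothing here is actually reshaped, which may be why neither answer uses cast.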

What is the Matlab/Octave equivalent of R's 'merge' (or 'expand.grid')?

I am looking for the Matlab way of doing the following:
> merge(2:4,3:7)
x y
1 2 3
2 3 3
3 4 3
4 2 4
5 3 4
6 4 4
7 2 5
8 3 5
9 4 5
10 2 6
11 3 6
12 4 6
13 2 7
14 3 7
15 4 7
> expand.grid(2:4,3:7)
Var1 Var2
1 2 3
2 3 3
3 4 3
4 2 4
5 3 4
6 4 4
7 2 5
8 3 5
9 4 5
10 2 6
11 3 6
12 4 6
13 2 7
14 3 7
15 4 7
I usually do it with meshgrid:
>> [x y] = meshgrid(2:4, 3:7);
>> [x(:) y(:)]
ans =
2 3
2 4
2 5
2 6
2 7
3 3
3 4
3 5
3 6
3 7
4 3
4 4
4 5
4 6
4 7
Use ndgrid for n variables (two or more). For example, in a 4-D space:
[X,Y,Z,T] = ndgrid(2:4, 3:7, 1:2, 1:10);
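Note that the row order of the meshgrid listing differs from R's: expand.grid varies its first argument fastest, while the [x(:) y(:)] listing varies it slowest. A small R sketch of the difference, reordering the expand.grid result so that it matches the meshgrid listing (the names g and m are illustrative):

```r
g <- expand.grid(2:4, 3:7)        # Var1 varies fastest
m <- g[order(g$Var1, g$Var2), ]   # Var1 varies slowest, as in the meshgrid listing
rownames(m) <- NULL
head(m, 5)
```

If the R ordering is the one you need on the Matlab side, linearising the ndgrid outputs with X(:) and Y(:) gives it directly, because ndgrid varies its first input along the first dimension.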
