Expand grid with unknown dimension in R [duplicate]

This question already has answers here: Dynamic arguments to expand.grid (3 answers). Closed 8 years ago.
For a given vector x, I need to obtain quantities of the type
expand.grid(x,x,x,x)
where x is repeated d times. Is there a function that allows this? Something like
expand.grids(x,d)
Thank you!

expand.grids <- function(x, d) {
  expand.grid(replicate(d, x, simplify = FALSE))
}
expand.grids(1:2,4)
Var1 Var2 Var3 Var4
1 1 1 1 1
2 2 1 1 1
3 1 2 1 1
4 2 2 1 1
5 1 1 2 1
6 2 1 2 1
7 1 2 2 1
8 2 2 2 1
9 1 1 1 2
10 2 1 1 2
11 1 2 1 2
12 2 2 1 2
13 1 1 2 2
14 2 1 2 2
15 1 2 2 2
16 2 2 2 2
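As a variation on the answer above, the same grid can be built with do.call on a list holding d copies of x. This is only a sketch: the helper name expand_grid_d and the explicit Var1..Vard column names are illustrative choices, not part of the original answer.

```r
# Hypothetical helper (the name is an assumption): d-fold Cartesian product of x
expand_grid_d <- function(x, d) {
  cols <- rep(list(x), d)                    # list holding d copies of x
  names(cols) <- paste0("Var", seq_len(d))   # explicit column names
  do.call(expand.grid, cols)                 # same as expand.grid(x, x, ..., x)
}

g <- expand_grid_d(1:2, 4)
dim(g)  # 16 rows, 4 columns
```

Both forms rely on expand.grid accepting its vectors either as separate arguments or as a single list, so the choice between replicate and rep(list(x), d) is mostly a matter of taste.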

Related

discrete choice experiment data preparation for analysis using GMNL package

I have conducted a discrete choice experiment using Google Forms and recorded the results in a CSV file in Excel. I am having trouble understanding how to get the data from a standard CSV layout into a format that I can analyse with the gmnl package.
I am using the data below, which has been dummy coded:
personid choiceid alt payment management assessment crop
1 1 1 3 2 2 3
1 2 2 2 2 1 3
1 3 1 3 2 1 3
1 4 1 2 1 3 1
1 5 1 2 1 3 1
1 6 2 1 1 2 1
1 7 2 3 1 2 3
1 8 2 3 1 2 3
1 9 2 3 1 1 2
1 10 2 3 1 1 2
1 11 2 3 1 2 1
1 12 2 2 1 1 3
1 13 3 1 2 1 1
1 14 2 1 1 2 3
1 15 2 2 1 2 2
1 16 2 1 1 1 3
2 17 3 1 2 1 2
2 18 3 1 3 1 2
2 19 1 3 1 1 3
test <- as.data.frame(testchoices)
choices <- mlogit.data(test, shape = "long", idx = list(c("choiceid", "personid")),
idnames = c("management", "crops", "assessment", "price"))
write_csv(choices, "choicesnext.csv")
It works fine up to write_csv, where this error is thrown: 'Error in `[.data.frame`(x, start:min(NROW(x), start + len)) : undefined columns selected'.
I would be grateful for any assistance.

Using R code to reorganize data frame by randomly selecting one row from each combination

I have a data frame that looks like this:
Subject N S
Sub1-1 3 1
Sub1-2 3 1
Sub1-3 3 1
Sub1-4 3 1
Sub2-1 3 1
Sub2-2 3 1
Sub2-3 3 1
Sub2-4 3 1
Sub3-1 3 2
Sub3-2 3 2
Sub3-3 3 2
Sub4-1 3 2
Sub4-2 3 2
Sub4-3 3 2
Sub5-1 3 2
Sub5-2 3 2
Sub6-1 1 1
Sub6-2 1 1
Sub6-3 1 1
Sub7-1 1 1
Sub7-2 1 1
Sub7-3 1 1
Sub8-1 1 1
Sub8-2 1 1
Sub8-3 1 2
Sub9-1 1 2
Sub9-2 1 2
Sub1-1 1 2
Sub1-2 1 2
Sub1-3 1 2
Sub5-1 1 2
Sub5-2 1 2
Sub1-5 2 1
Sub1-6 2 1
Sub1-7 2 1
Sub1-5 2 1
Sub2-6 2 1
Sub2-5 2 1
Sub2-6 2 1
Sub2-7 2 1
Sub3-8 2 2
Sub3-5 2 2
Sub3-6 2 2
Sub4-7 2 2
Sub4-5 2 2
Sub4-6 2 2
Sub5-7 2 2
Sub5-8 2 2
As you can see, this data frame has 6 different combinations of the N and S columns, with 8 consecutive rows for each combination. I want to create a new data frame in which one row from each combination (be it 3 & 1 or 1 & 2) is randomly selected and placed into the new data frame, repeating until all 8 rows of every combination have been used. That way the entire data frame of 48 rows is completely reorganized. Is this possible in R?
Edit: The desired output would be something like this, repeated until all 48 rows are filled. The subject for each row would have to be random, because each row is a randomly selected row of its N & S combination.
Subject N S
3 1
1 1
3 2
1 2
2 2
2 1
2 2
3 2
2 1
1 1
3 1
1 2
A solution using functions from dplyr.
# Load package
library(dplyr)
# Set seed for reproducibility
set.seed(123)
# Process the data
dt2 <- dt %>%
group_by(N, S) %>%
sample_n(size = 1)
# View the result
dt2
## A tibble: 6 x 3
## Groups: N, S [6]
# Subject N S
# <chr> <int> <int>
#1 Sub6-3 1 1
#2 Sub5-1 1 2
#3 Sub1-5 2 1
#4 Sub5-8 2 2
#5 Sub2-4 3 1
#6 Sub3-1 3 2
Update: Reorganize the rows
The following randomizes the order of all rows.
dt3 <- dt %>% slice(sample(1:n(), n()))
Data Preparation
dt <- read.table(text = "Subject N S
Sub1-1 3 1
Sub1-2 3 1
Sub1-3 3 1
Sub1-4 3 1
Sub2-1 3 1
Sub2-2 3 1
Sub2-3 3 1
Sub2-4 3 1
Sub3-1 3 2
Sub3-2 3 2
Sub3-3 3 2
Sub4-1 3 2
Sub4-2 3 2
Sub4-3 3 2
Sub5-1 3 2
Sub5-2 3 2
Sub6-1 1 1
Sub6-2 1 1
Sub6-3 1 1
Sub7-1 1 1
Sub7-2 1 1
Sub7-3 1 1
Sub8-1 1 1
Sub8-2 1 1
Sub8-3 1 2
Sub9-1 1 2
Sub9-2 1 2
Sub1-1 1 2
Sub1-2 1 2
Sub1-3 1 2
Sub5-1 1 2
Sub5-2 1 2
Sub1-5 2 1
Sub1-6 2 1
Sub1-7 2 1
Sub1-5 2 1
Sub2-6 2 1
Sub2-5 2 1
Sub2-6 2 1
Sub2-7 2 1
Sub3-8 2 2
Sub3-5 2 2
Sub3-6 2 2
Sub4-7 2 2
Sub4-5 2 2
Sub4-6 2 2
Sub5-7 2 2
Sub5-8 2 2",
header = TRUE, stringsAsFactors = FALSE)
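If the goal is the interleaved layout from the question's edit (repeated sets of rows, each containing every N & S combination once), one possible base R sketch is below. It assumes every combination has the same number of rows; the toy data and the helper columns block and tie are made up for illustration and are not part of the original answer.

```r
set.seed(123)

# Toy data (assumption): 6 N/S combinations with 2 rows each
dt <- data.frame(
  Subject = paste0("Sub", 1:12),
  N = rep(1:3, each = 4),
  S = rep(c(1, 1, 2, 2), times = 3),
  stringsAsFactors = FALSE
)

# Within each N/S combination, give every row a random draw number 1..k
dt$block <- ave(seq_len(nrow(dt)), dt$N, dt$S,
                FUN = function(i) sample(length(i)))

# Random tie-breaker so combinations appear in random order inside a block
dt$tie <- runif(nrow(dt))

# Sort by block: each run of 6 rows now holds one row per combination
res <- dt[order(dt$block, dt$tie), c("Subject", "N", "S")]
rownames(res) <- NULL
res
```

With the 48-row data from the question, the same steps would give 8 sets of 6 rows, and every original row would be used exactly once.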

Combining an individual and aggregate level data sets

I've got two different data frames; let's call them "Months" and "People".
Months looks like this:
Month Site X
1 1 4
2 1 3
3 1 5
1 2 10
2 2 7
3 2 5
and People looks like this:
ID Month Site
1 1 1
2 1 2
3 1 1
4 2 2
5 2 2
6 2 2
7 3 1
8 3 2
I'd like to combine them so essentially each time an entry in "People" has a particular Month and Site combination, it's added to the appropriate aggregated data frame, so I'd get something like the following:
Month Site X People
1 1 4 2
2 1 3 0
3 1 5 1
1 2 10 1
2 2 7 3
3 2 5 1
But I haven't the foggiest idea of how to go about doing that. Any suggestions?
Using base packages
> aggdata <- aggregate( ID ~ Month + Site, data=People, FUN = length )
> aggdata
Month Site ID
1 1 1 2
2 3 1 1
3 1 2 1
4 2 2 3
5 3 2 1
> res <- merge(Months, aggdata, all.x = TRUE)
> res
Month Site X ID
1 1 1 4 2
2 1 2 10 1
3 2 1 3 NA
4 2 2 7 3
5 3 1 5 1
6 3 2 5 1
> res[is.na(res)] <- 0
> res
Month Site X ID
1 1 1 4 2
2 1 2 10 1
3 2 1 3 0
4 2 2 7 3
5 3 1 5 1
6 3 2 5 1
Assuming your data.frames are months and people, here's a data.table solution:
require(data.table)
m.dt <- data.table(months, key=c("Month", "Site"))
p.dt <- data.table(people, key=c("Month", "Site"))
# one-liner (data.table >= 1.9.4 needs by = .EACHI to group by each row of m.dt)
dt.f <- p.dt[m.dt, list(X=X[1], People=sum(!is.na(ID))), by=.EACHI]
> dt.f
# Month Site X People
# 1: 1 1 4 2
# 2: 1 2 10 1
# 3: 2 1 3 0
# 4: 2 2 7 3
# 5: 3 1 5 1
# 6: 3 2 5 1
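For comparison, the count can also be added without a merge, by tabulating the People keys against the Month/Site keys of Months. This is a minimal sketch that rebuilds the two data frames from the question; key_m and key_p are illustrative names, and it assumes each Month/Site pair appears only once in Months.

```r
Months <- data.frame(Month = c(1, 2, 3, 1, 2, 3),
                     Site  = c(1, 1, 1, 2, 2, 2),
                     X     = c(4, 3, 5, 10, 7, 5))
People <- data.frame(ID    = 1:8,
                     Month = c(1, 1, 1, 2, 2, 2, 3, 3),
                     Site  = c(1, 2, 1, 2, 2, 2, 1, 2))

# Build one text key per row from Month and Site
key_m <- paste(Months$Month, Months$Site)
key_p <- paste(People$Month, People$Site)

# Count People rows per key, keeping the row order of Months;
# keys with no matching people get a count of 0
Months$People <- as.vector(table(factor(key_p, levels = key_m)))
Months
```

Using the Months keys as factor levels is what produces the zero for Month 2, Site 1, so no NA replacement step is needed afterwards.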

Episode count for each row

I'm sure this has been asked before but for the life of me I can't figure out what to search for!
I have the following data:
x y
1 3
1 3
1 3
1 2
1 2
2 2
2 4
3 4
3 4
And I would like to output a running count that resets every time either x or y changes value.
x y o
1 3 1
1 3 2
1 3 3
1 2 1
1 2 2
2 2 1
2 4 1
3 4 1
3 4 2
Try something like
df<-read.table(header=T,text="x y
1 3
1 3
1 3
1 2
1 2
2 2
2 4
3 4
3 4")
cbind(df,o=sequence(rle(paste(df$x,df$y))$lengths))
> cbind(df,o=sequence(rle(paste(df$x,df$y))$lengths))
x y o
1 1 3 1
2 1 3 2
3 1 3 3
4 1 2 1
5 1 2 2
6 2 2 1
7 2 4 1
8 3 4 1
9 3 4 2
After seeing @ttmaccer's answer, I see that my first attempt with ave was wrong, and this is perhaps what is needed:
> dat$o <- ave(dat$y, list(dat$y, dat$x), FUN=seq_along)
# FUN=seq gives the same answer here but raises a warning for single-row groups; seq_along avoids it.
> dat
x y o
1 1 3 1
2 1 3 2
3 1 3 3
4 1 2 1
5 1 2 2
6 2 2 1
7 2 4 1
8 3 4 1
9 3 4 2
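The two answers agree on the example data, but they are not equivalent in general: rle counts within consecutive runs, while ave counts within each x/y combination wherever it occurs. A small sketch with made-up data (not from the question) in which a combination reappears shows the difference:

```r
# Toy data (assumption): the pair (1, 3) occurs in two separate runs
df <- data.frame(x = c(1, 1, 2, 1, 1),
                 y = c(3, 3, 3, 3, 3))

# Run-based count: resets whenever x or y changes from the previous row
run_count <- sequence(rle(paste(df$x, df$y))$lengths)

# Group-based count: keeps counting when a combination comes back later
grp_count <- ave(seq_len(nrow(df)), df$x, df$y, FUN = seq_along)

run_count  # 1 2 1 1 2
grp_count  # 1 2 1 3 4
```

Since the question asks for a count that resets every time a value changes, the run-based version appears to be the closer match when combinations can repeat.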
