Percentage histogram with facet_grid: x variable is a factor - r
I want to split a percentage histogram (that integrates to 100%) into two facets using facet_grid. However, when splitting to facets, each facet by itself doesn't integrate to 100%. This kind of question has been resolved here in the past, but I cannot translate that solution to my current situation where x is a factor, and thus a histogram using stat(density) doesn't work.
My Data
Dataframe with two columns. equipment denotes whether a household has enough equipment for homeschooling, and children_n denotes number of children.
library(tidyverse)
library(magrittr)
df <-
structure(list(equipment = c(1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0,
1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0,
1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0,
0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1,
0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0,
1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1,
1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1,
0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1,
0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0,
1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0,
0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0,
1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1,
1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1,
0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1,
1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1,
1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,
1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1,
1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0,
0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1,
1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1,
1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0,
1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0,
1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0,
0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1,
0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0,
0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0,
1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1,
1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1,
0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0,
0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1,
1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1,
1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1,
1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0,
1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1,
1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1), children_n = c(4,
4, 2, 2, 2, 1, 1, 3, 2, 3, 3, 7, 3, 2, 1, 2, 1, 1, 3, 3, 3, 2,
3, 3, 3, 2, 4, 3, 1, 2, 3, 4, 4, 1, 2, 5, 2, 8, 1, 2, 1, 2, 2,
3, 4, 3, 3, 3, 3, 2, 3, 2, 2, 4, 3, 3, 3, 4, 3, 1, 1, 2, 1, 1,
2, 1, 3, 3, 2, 3, 3, 3, 4, 2, 2, 2, 3, 5, 2, 2, 2, 2, 1, 2, 4,
3, 4, 3, 3, 1, 2, 3, 3, 3, 2, 4, 4, 3, 1, 3, 2, 2, 2, 3, 1, 1,
1, 3, 1, 2, 2, 2, 3, 6, 3, 2, 2, 6, 3, 4, 3, 2, 3, 3, 2, 2, 2,
3, 2, 3, 3, 6, 3, 1, 4, 3, 4, 9, 1, 1, 3, 4, 2, 2, 1, 2, 3, 1,
3, 3, 6, 4, 1, 3, 2, 2, 3, 2, 3, 2, 4, 3, 1, 3, 3, 2, 3, 2, 2,
4, 2, 2, 3, 3, 3, 1, 3, 3, 2, 4, 2, 7, 3, 3, 3, 2, 2, 2, 4, 3,
1, 1, 3, 4, 1, 4, 3, 4, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, 3, 3, 3,
3, 1, 1, 2, 2, 4, 2, 3, 3, 2, 2, 1, 2, 5, 2, 2, 2, 5, 3, 2, 2,
4, 2, 1, 3, 4, 4, 3, 3, 4, 3, 3, 1, 3, 2, 1, 8, 2, 3, 2, 3, 3,
2, 3, 3, 1, 3, 3, 4, 2, 3, 3, 3, 2, 6, 1, 2, 2, 2, 2, 2, 2, 4,
3, 5, 4, 1, 2, 2, 2, 4, 2, 3, 3, 1, 3, 2, 1, 2, 2, 3, 3, 3, 3,
1, 3, 4, 2, 1, 3, 4, 2, 1, 3, 4, 3, 4, 2, 3, 3, 2, 7, 1, 2, 1,
3, 2, 2, 2, 2, 3, 3, 3, 2, 3, 1, 2, 2, 3, 2, 4, 3, 2, 3, 3, 5,
3, 5, 3, 5, 1, 2, 1, 4, 1, 4, 2, 2, 3, 2, 2, 2, 3, 2, 3, 3, 3,
3, 4, 3, 8, 3, 1, 2, 3, 3, 2, 1, 3, 2, 2, 3, 3, 4, 4, 2, 2, 3,
1, 2, 3, 2, 3, 3, 2, 1, 3, 3, 2, 3, 3, 3, 4, 1, 2, 3, 3, 3, 4,
2, 1, 3, 4, 2, 3, 3, 2, 2, 2, 2, 2, 3, 3, 3, 1, 3, 3, 1, 1, 3,
2, 1, 3, 2, 4, 1, 3, 2, 3, 2, 2, 2, 4, 1, 2, 3, 2, 3, 2, 2, 1,
3, 1, 3, 1, 3, 3, 2, 1, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 3, 4, 1,
2, 4, 2, 4, 2, 2, 2, 1, 3, 2, 1, 1, 4, 3, 4, 3, 2, 2, 2, 3, 7,
3, 1, 3, 3, 3, 2, 1, 3, 2, 3, 3, 2, 4, 1, 1, 1, 4, 3, 3, 4, 3,
8, 2, 4, 5, 3, 2, 3, 1, 2, 1, 2, 2, 3, 1, 4, 3, 2, 2, 3, 3, 3,
3, 1, 2, 1, 2, 3, 3, 2, 2, 2, 2, 3, 3, 4, 5, 3, 2, 2, 2, 3, 1,
3, 3, 4, 2, 1, 3, 3, 3, 4, 2, 1, 2, 1, 2, 2, 3, 3, 4, 1, 1, 6,
3, 2, 2, 2, 6, 3, 3, 2, 2, 1, 4, 2, 3, 3, 3, 2, 2, 3, 3, 2, 4,
6, 1, 1, 1, 1, 3, 9, 4, 2, 3, 2, 2, 2, 4, 3, 3, 4, 1, 2, 6, 3,
3, 3, 2, 2, 3, 4, 2, 3, 2, 2, 3, 2, 3, 4, 7, 2, 3, 3, 2, 3, 2,
3, 4, 3, 3, 3, 2, 2, 2, 1, 3, 4, 2, 1, 3, 4, 1, 3, 4, 4, 3, 3,
3, 3, 3, 2, 3, 3, 3, 5, 3, 3, 5, 2, 2, 1, 1, 2, 2, 2, 3, 1, 3,
2, 2, 2, 4, 2, 2, 2, 4, 1, 3, 4, 3, 3, 4, 3, 2, 1, 3, 4, 8, 1,
2, 3, 3, 3, 3, 2, 3, 3, 1, 3, 4, 2, 3, 2, 6, 3, 1, 2, 2, 2, 2,
2, 4, 3, 5, 1, 2, 2, 2, 4, 2, 3, 3, 1, 1, 2, 2, 3, 3, 2, 3, 3,
3, 3, 1, 4, 4, 2, 3, 3, 1, 4, 3, 4, 2, 3, 3, 2, 7, 1, 4, 1, 2,
2, 3, 2, 5, 2, 3, 2, 3, 1, 3, 2, 2, 3, 2, 4, 2, 3, 3, 3, 3, 1,
5, 5, 1, 1, 2, 3, 1, 4, 2, 2, 3, 2, 2, 2, 3, 3, 3, 3, 2, 3, 4,
8, 3, 2, 3, 1, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 4, 2, 3, 2, 1, 3,
2, 3, 3, 2, 3, 3, 2, 3, 2, 3, 3, 1, 1, 2, 4, 3, 4, 3, 1, 3, 4,
2, 3, 3, 2, 2, 2, 2, 2, 3, 3, 3, 1, 3, 3, 2, 1, 1, 4, 1, 3, 2,
1, 2, 3, 3, 2, 2, 2, 4, 2, 1, 3, 2, 3, 2, 1, 3, 1, 3, 1, 3, 3,
2, 1, 2, 3, 2, 3, 1, 2, 2, 2, 3, 3, 2, 3, 1, 3, 3, 3, 3, 2, 4,
2, 4, 4, 1, 2, 1, 2, 1, 3, 3, 3, 2, 3, 3, 4, 2, 2, 3, 2, 1, 2,
2, 1, 1, 3, 1, 2, 3, 3, 3, 2, 1, 1, 1, 2, 1, 2, 5, 1, 2, 1, 4,
2, 2, 2, 1, 4, 2, 3, 3, 3, 2, 4, 5, 4, 2, 4, 2, 3, 1, 4, 3, 3,
2, 3, 3, 2, 3, 2, 1, 3, 2, 4, 2, 3, 4, 1, 2, 3, 1, 3, 3, 4, 2,
2, 2, 3, 3, 2, 1, 2, 2, 1, 3, 1, 3, 1, 1, 1, 3, 2, 2, 4, 3, 4,
3, 3, 4, 1, 1, 3, 3, 2, 3, 2, 3, 2, 1, 3, 3, 1, 5, 1, 1, 2, 4,
2, 3, 5, 4, 1, 3, 2, 1, 2, 2, 4, 3, 4, 2, 2, 1, 3, 2, 4, 2, 3,
3, 2, 3, 2, 1, 2, 3, 4)), row.names = c(NA, -1059L), class = c("tbl_df",
"tbl", "data.frame"))
df
## # A tibble: 1,059 x 2
## equipment children_n
## <dbl> <dbl>
## 1 1 4
## 2 0 4
## 3 1 2
## 4 1 2
## 5 0 2
## 6 1 1
## 7 1 1
## 8 1 3
## 9 1 2
## 10 1 3
## # ... with 1,049 more rows
Where the number of children is 6 or more, I want to collapse those cases into one category, "6+" (coded as "6_plus" in the data).
df %<>%
mutate_at(vars(children_n), as.character) %>%
mutate_at(vars(children_n), recode, "9" = "6_plus", "8" = "6_plus", "7" = "6_plus", "6" = "6_plus") %>%
mutate_at(vars(children_n), fct_relevel, "1", "2", "3", "4", "5", "6_plus")
glimpse(df)
## Rows: 1,059
## Columns: 2
## $ equipment <dbl> 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, ...
## $ children_n <fct> 4, 4, 2, 2, 2, 1, 1, 3, 2, 3, 3, 6_plus, 3, 2, 1, 2, 1, 1, 3, 3, 3, 2, 3, 3, 3, 2, 4, 3, 1, 2, 3, 4, 4, 1, 2, 5, 2, 6_plus, 1, 2, 1, 2,...
Now I want to plot the proportion of number of children in two separate panels: one panel for families who have enough equipment, and another panel for families who don't have enough equipment:
df %>%
ggplot(data = ., aes(x = children_n, y = equipment)) +
geom_histogram(aes(y = (..count..)/sum(..count..)), stat = "count" , fill = "darkblue") +
geom_text(aes(label = scales::percent(((..count..)/sum(..count..)), accuracy = 1),
y = ((..count..)/sum(..count..)) ), stat= "count", vjust = -.5, color = "darkblue") +
scale_y_continuous(labels = scales::percent) +
facet_grid(~ equipment, labeller = as_labeller(c("1" = "have enough equipment",
"0" = "don't have enough equipment")))
This gives two panels that *DON'T* integrate to 100% independently:
Trying to solve the problem
I found this question that describes the same intention and problem. The chosen solution suggests mapping y to stat(density) in geom_histogram so that each panel integrates to 100%. But this won't work in my case, because stat(density) requires a continuous x variable, whereas my x is a factor.
df %>%
ggplot(data = ., aes(x = children_n, y = equipment)) +
geom_histogram(aes(y = stat(density) * 6), binwidth = 6, fill = "darkblue") +
facet_grid(~ equipment, labeller = as_labeller(c("1" = "have enough equipment",
"0" = "don't have enough equipment")))
Error: StatBin requires a continuous x variable: the x variable is
discrete. Perhaps you want stat="count"?
Other approaches suggest using ..PANEL.. while others are strongly against it.
How can I get the two facets to show percents that independently integrate to 100%, in a proper way?
This could be achieved like so:
Map the faceting variable to the group aesthetic, so that ..group.. identifies the facet each bar belongs to.
Use e.g. tapply to get the total count per group (i.e. per facet) and divide each bar's count by it.
BTW: I have put the code for the normalization inside a helper function to reduce code duplication and improve readability.
library(tidyverse)
library(magrittr)
df %<>%
mutate_at(vars(children_n), as.character) %>%
mutate_at(vars(children_n), recode, "9" = "6_plus", "8" = "6_plus", "7" = "6_plus", "6" = "6_plus") %>%
mutate_at(vars(children_n), fct_relevel, "1", "2", "3", "4", "5", "6_plus")
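# Divide each bar's count by the total count of its group (= facet).
# Note: this name masks utils::help() while it is defined.
help <- function(count, group) {
  count / tapply(count, group, sum)[group]
}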
df %>%
ggplot(data = ., aes(x = children_n, y = equipment, group = equipment)) +
geom_histogram(aes(y = help(..count.., ..group..)), stat = "count" , fill = "darkblue") +
geom_text(aes(label = scales::percent(help(..count.., ..group..), accuracy = 1),
y = help(..count.., ..group..) ), stat= "count", vjust = -.5, color = "darkblue") +
scale_y_continuous(labels = scales::percent) +
facet_grid(~ equipment, labeller = as_labeller(c("1" = "have enough equipment",
"0" = "don't have enough equipment")))
#> Warning: Ignoring unknown parameters: binwidth, bins, pad
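To see what the helper does without plotting anything, here is a minimal base R sketch. The counts and group labels are made up for illustration and only stand in for the per-bar values that stat = "count" would produce; they are not taken from the data above.

```r
# Made-up stand-ins for the per-bar values produced by stat = "count":
# one count per bar, plus the group (facet) each bar belongs to.
count <- c(10, 30, 60, 5, 15)
group <- c(1, 1, 1, 2, 2)

# Global normalization, as in the question: every bar is divided by the
# grand total, so each group only sums to its share of the whole data.
global_share <- count / sum(count)

# Per-group normalization, as in the helper: tapply() sums the counts
# within each group, and indexing by `group` repeats that total per bar.
group_total <- tapply(count, group, sum)[group]
group_share <- count / group_total

tapply(global_share, group, sum)  # group sums differ from 1
tapply(group_share, group, sum)   # each group sums to 1
```

Indexing the tapply() result by position works here because the group ids are the consecutive integers 1, 2, ...; with other labels, indexing by as.character(group) would be the safer choice.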
Related
Output list of variables with significant p-values from several regressions in R
I have a dataframe that looks like the following. consistent admire trust judge 3 3 2 4 5 1 3 6 2 4 5 1 I can run the regressions I need simultaneously using the following code. In the actual dataset, there are many more than 3 variables. variables <- c("admire", "trust", "judge") form <- paste("consistent ~ ",variables,"") model <- form %>% set_names(variables) %>% map(~lm(as.formula(.x), data = df)) map(model, summary) This yields the output for the 3 following regressions. summary(lm(consistent ~ admire, df)) summary(lm(consistent ~ trust, df)) summary(lm(consistent ~ judge, df)) I would like a list of the variables with significant p-values at p < 0.05. For example, if "admire" was significant and "judge" was significant, the output I am looking for would be something like: admire, judge Is there a way to do this that allows me to also run several regressions simultaneously? This question offers a similar answer, but I don't know how to apply it when I have several regressions. Data: structure(list(consistent = c(1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 
0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0), admire = c(7, 3, 1, 1, 3, 5, 5, 6, 7, 1, 4, 2, 5, 3, 3, 1, 3, 1, 2, 1, 5, 5, 3, 1, 5, 3, 5, 4, 5, 1, 6, 1, 6, 2, 1, 4, 1, 1, 3, 2, 1, 5, 1, 7, 1, 4, 1, 4, 2, 2, 4, 2, 4, 1, 5, 5, 1, 2, 6, 6, 1, 1, 3, 5, 5, 1, 5, 7, 2, 4, 5, 1, 4, 4, 3, 5, 6, 1, 5, 2, 1, 5, 6, 2, 3, 3, 5, 6, 1, 4, 4, 6, 4, 4, 4, 6, 5, 4, 1, 2, 5, 4, 2, 4, 6, 1, 3, 7, 4, 4, 3, 2, 7, 5, 3, 2, 1, 2, 2, 5, 7, 3, 5, 4, 6, 2, 2, 4, 4, 5, 5, 1, 5, 6, 1, 2, 4, 7, 1, 4, 5, 4, 2, 4, 1, 4, 3, 4, 7, 5, 6, 3, 1, 1, 7, 1, 6, 4, 1, 1, 2, 1, 1, 6, 3, 1, 4, 4, 7, 2, 1, 5, 3, 3, 7, 4, 5, 1, 3, 7, 5, 4, 1, 1, 1, 5, 2, 1, 1, 4, 1, 5, 4, 5, 1, 4, 4, 4, 7, 1, 1, 2, 5, 2, 4, 2, 4, 6, 4, 2, 6, 5, 6, 7, 4, 4, 5, 1, 5, 7, 1, 7, 2, 7, 3, 6, 2, 5, 7, 3, 5, 4, 1, 4, 1, 5, 1, 1, 6, 6, 7, 3, 4, 1, 6, 4, 1, 6, 7, 5, 4, 2, 6, 5, 5, 4, 1, 2, 6, 1, 5, 3, 1, 1, 1, 7, 7, 3, 5, 1, 5, 1, 7, 2, 5, 4, 2, 1, 4, 1, 1, 5, 5, 4, 5, 2, 4, 5, 5, 1, 4, 4, 1, 3, 4, 2, 7, 6, 6, 4, 3, 6, 1, 6, 1, 1, 4, 7, 7, 1, 3, 1, 4, 2, 2, 6, 1, 2, 1, 1, 1, 4, 2, 5, 4, 1, 4, 2, 5, 5, 2, 1, 6, 1, 2, 3, 4, 1, 7, 2, 2, 4, 5, 1, 6, 2, 5, 1, 5, 6, 2, 5, 1, 1, 7, 4, 5, 6, 1, 4, 5, 2, 4, 4, 6, 4, 4, 2, 6, 1, 1, 2, 6, 1, 3, 5, 5, 3, 7, 5, 6, 4, 3, 4, 7, 5, 4, 2, 1, 5, 7, 2, 6, 3, 1, 2, 4, 3, 5, 4, 1, 6, 1, 3, 1, 1, 1, 4, 3, 3, 1, 1, 1, 6, 4, 1, 1, 1, 1, 4, 1, 6, 4, 4, 4, 4, 1, 5, 2, 4, 5, 4, 4, 3, 3, 6, 7, 3, 2, 4, 2, 5, 1, 4, 5, 4, 1, 2, 4, 1), trust = c(7, 4, 2, 2, 3, 4, 6, 6, 7, 1, 4, 5, 5, 4, 1, 1, 2, 2, 1, 1, 6, 6, 4, 1, 3, 6, 5, 4, 6, 1, 5, 1, 6, 1, 2, 5, 1, 1, 4, 1, 1, 5, 1, 7, 1, 4, 4, 5, 3, 4, 5, 3, 5, 2, 6, 5, 3, 2, 6, 6, 1, 1, 3, 5, 5, 1, 5, 7, 2, 4, 6, 1, 4, 4, 4, 6, 6, 3, 5, 6, 1, 
6, 5, 2, 2, 2, 5, 7, 1, 5, 3, 7, 3, 5, 4, 6, 6, 5, 2, 1, 6, 5, 2, 6, 5, 1, 2, 7, 6, 5, 3, 3, 4, 7, 4, 2, 1, 3, 4, 7, 6, 2, 6, 5, 7, 3, 2, 4, 5, 5, 5, 1, 2, 7, 1, 1, 5, 4, 1, 4, 6, 6, 2, 4, 2, 4, 1, 5, 7, 6, 7, 3, 2, 1, 7, 1, 4, 4, 1, 2, 4, 1, 1, 6, 3, 1, 4, 3, 7, 2, 2, 6, 4, 5, 7, 5, 7, 2, 4, 7, 4, 3, 1, 1, 1, 5, 2, 4, 1, 4, 1, 5, 4, 5, 1, 6, 5, 4, 6, 1, 1, 2, 6, 2, 4, 4, 4, 5, 6, 1, 5, 5, 5, 6, 4, 4, 5, 5, 6, 7, 1, 7, 3, 7, 5, 6, 3, 5, 7, 4, 5, 4, 2, 3, 1, 4, 5, 1, 5, 4, 7, 3, 5, 1, 6, 6, 1, 4, 6, 5, 4, 3, 7, 6, 5, 4, 1, 1, 6, 1, 5, 3, 1, 1, 1, 7, 7, 3, 4, 1, 4, 1, 7, 2, 4, 2, 2, 2, 4, 1, 1, 5, 4, 6, 5, 2, 4, 5, 4, 1, 6, 4, 1, 4, 4, 3, 7, 5, 6, 4, 4, 6, 2, 6, 1, 2, 4, 7, 7, 1, 1, 1, 4, 2, 2, 6, 2, 4, 1, 2, 1, 6, 2, 6, 4, 1, 6, 3, 5, 4, 3, 1, 6, 1, 2, 3, 5, 1, 6, 1, 3, 4, 5, 2, 6, 2, 5, 1, 3, 7, 1, 4, 1, 1, 7, 5, 6, 5, 1, 5, 5, 1, 4, 3, 7, 4, 4, 1, 7, 1, 1, 4, 6, 1, 4, 5, 5, 4, 7, 6, 7, 4, 4, 4, 4, 4, 4, 1, 1, 5, 6, 2, 7, 4, 2, 4, 5, 4, 5, 4, 1, 5, 1, 2, 1, 1, 4, 4, 3, 4, 3, 1, 2, 6, 5, 1, 1, 1, 2, 4, 1, 7, 4, 4, 5, 6, 2, 5, 3, 4, 5, 4, 4, 3, 3, 6, 7, 4, 4, 3, 2, 5, 1, 5, 5, 5, 2, 2, 3, 1 ), judge = c(1, 5, 6, 3, 6, 3, 4, 5, 4, 1, 3, 2, 3, 2, 4, 3, 4, 2, 5, 4, 3, 3, 4, 4, 7, 5, 4, 4, 1, 3, 6, 2, 3, 2, 5, 2, 3, 4, 2, 4, 4, 3, 4, 4, 1, 4, 1, 2, 3, 1, 2, 2, 3, 5, 3, 5, 5, 3, 1, 4, 4, 2, 5, 4, 3, 1, 5, 4, 4, 5, 2, 2, 2, 7, 3, 3, 1, 1, 5, 3, 3, 1, 2, 5, 2, 3, 5, 4, 3, 4, 3, 2, 1, 3, 4, 4, 5, 5, 3, 2, 2, 3, 2, 4, 1, 1, 4, 2, 2, 3, 3, 2, 4, 4, 6, 1, 7, 4, 2, 3, 4, 1, 2, 4, 4, 5, 2, 1, 3, 2, 2, 1, 1, 7, 2, 3, 5, 5, 1, 2, 2, 5, 6, 5, 1, 1, 1, 4, 1, 5, 4, 3, 6, 1, 4, 1, 3, 4, 6, 1, 2, 4, 3, 3, 4, 7, 1, 3, 1, 2, 2, 3, 2, 3, 5, 3, 4, 2, 6, 3, 1, 1, 1, 1, 4, 2, 2, 4, 4, 5, 4, 2, 1, 6, 7, 5, 2, 2, 4, 5, 6, 1, 5, 2, 4, 5, 5, 2, 2, 3, 4, 5, 2, 2, 4, 1, 3, 4, 4, 4, 2, 3, 1, 4, 4, 3, 2, 3, 1, 4, 2, 4, 4, 1, 5, 4, 4, 4, 4, 6, 1, 3, 5, 7, 2, 6, 1, 5, 7, 5, 4, 2, 3, 6, 3, 1, 1, 2, 2, 5, 5, 2, 5, 4, 4, 5, 4, 4, 3, 7, 4, 4, 4, 2, 5, 3, 6, 5, 4, 4, 4, 6, 4, 5, 5, 1, 5, 2, 6, 4, 4, 1, 1, 
4, 6, 1, 7, 1, 5, 2, 5, 4, 2, 3, 2, 6, 3, 2, 2, 1, 1, 5, 4, 1, 1, 4, 1, 5, 1, 4, 3, 2, 3, 4, 1, 6, 1, 2, 1, 3, 5, 5, 2, 1, 3, 4, 2, 4, 5, 4, 6, 3, 4, 6, 7, 6, 2, 4, 6, 2, 4, 5, 1, 4, 1, 3, 2, 4, 1, 6, 4, 3, 1, 3, 4, 5, 1, 6, 1, 5, 1, 3, 3, 1, 3, 4, 2, 4, 1, 1, 2, 2, 2, 3, 1, 6, 5, 4, 1, 7, 5, 6, 5, 2, 3, 5, 4, 3, 4, 5, 7, 1, 5, 2, 5, 1, 3, 4, 3, 5, 1, 4, 2, 3, 4, 1, 7, 5, 5, 2, 1, 2, 5, 6, 5, 5, 3, 1, 3, 1, 4, 1, 5, 2, 3, 5, 6, 4, 4, 3, 2, 4, 1, 3, 4, 3, 4, 4, 1, 5)), row.names = c(NA, -450L), class = c("tbl_df", "tbl", "data.frame"))
To fit many many simple linear regression models, I recommend Fast pairwise simple linear regression between variables in a data frame. Hmm... looks like I need to collect those functions in an R package...
## suppose your data frame is `df`
## response variable (LHS) in column 1
## independent variable (RHS) in other columns
out <- general_paired_simpleLM(df[1], df[-1])
#         LHS    RHS      alpha       beta     beta.se   beta.tv      beta.pv
#1 consistent admire -0.1458455 0.18754326 0.008324192 22.529906 1.040756e-75
#2 consistent  trust -0.2211250 0.19565589 0.007721387 25.339475 1.531499e-88
#3 consistent  judge  0.3484851 0.04824981 0.014182420  3.402086 7.287372e-04
#        sig         R2      F.fv         F.pv
#1 0.3430602 0.53118295 507.59665 1.040756e-75
#2 0.3212008 0.58902439 642.08902 1.531499e-88
#3 0.4946862 0.02518459  11.57419 7.287372e-04
To get what you want:
with(out, RHS[beta.pv < 0.05])
#[1] "admire" "trust" "judge"
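Since general_paired_simpleLM is not defined in this answer, here is a rough base R sketch of the same idea: fit one simple regression per predictor and keep the names whose slope p-value is below 0.05. The data frame toy and its values are simulated for illustration only; they are not the data from the question, and this is a plain lm() loop, not the fast method the answer refers to.

```r
set.seed(1)

# Simulated data (an assumption for illustration): `admire` drives the
# response, `judge` is unrelated noise.
n <- 200
toy <- data.frame(admire = rnorm(n), judge = rnorm(n))
toy$consistent <- 2 * toy$admire + rnorm(n, sd = 0.5)

predictors <- c("admire", "judge")

# Fit one simple regression per predictor and extract the slope's p-value.
p_values <- sapply(predictors, function(v) {
  fit <- lm(reformulate(v, response = "consistent"), data = toy)
  summary(fit)$coefficients[v, "Pr(>|t|)"]
})

# Names of the predictors whose slope is significant at the 5% level.
significant <- names(p_values)[p_values < 0.05]
significant
```

For a large number of predictors this loop is slower than a vectorised approach, but it keeps every step visible.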
Error using aggregate to find length with missing values
I am trying to use the aggregate function in R to summarise my data using the length function. My data has some NAs, and I have tried using 'na.rm = T' and 'na.omit', but neither seems to work. I keep getting this error:
Error in FUN(X[[i]], ...) : 2 arguments passed to 'length' which requires 1
data10 <- structure(list(Group = c(1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 2, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1), SUBJECT = c(1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 10, 11, 12, 14, 14, 15, 16, 16, 17, 18, 19, 19, 20, 21, 21, 22, 23, 23, 24, 25), test = c(1, 2, 1, 1, 2, 2, 1, 2, 2, 1, 1, 2, 2, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, 2, 2, 1 ), trial = c(1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7, 1, 3, 5, 7, 1, 3), Condition = c(1, 2, 3, 1, 3, 1, 2, 3, 2, 3, 1, 2, 1, 2, 3, 1, 3, 1, 2, 3, 2, 3, 1, 2, 1, 2, 3, 1, 3, 1, 2, 3, 2, 3), Sac2 = c(1, 1, 1, NA, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1), Sac = c(1, 1, 1, NA, 3, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 7, 1, 1, 1, 7, 1, 1, 1, 1, 1, 1, 3, 3, 1, 1), Saccade...8 = c(1, 1, 1, NA, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1), T_APPEAR = c(9.236, 17.85, 28.942, 63.724, 9.463, 22.963, 52.068, 57.021, 15.344, 19.783, 37.825, 46.17, 4.339, 21.241, 29.179, 31.823, 12.164, 22.84, 23.954, 73.663, 27.269, 22.131, 30.361, 62.674, 6.928, 16.413, 47.555, 48.893, 7.291, 15.796, 31.788, 54.946, 10.117, 28.83)), row.names = c(NA, -34L), class = c("tbl_df", "tbl", "data.frame"))
data14 = aggregate(data10, by = list(data10$SUBJECT,data10$Condition, data10$Group, data10$test), FUN = length(), na.rm=TRUE)
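The error message suggests that na.rm = TRUE is being forwarded to length(), which accepts only one argument. One possible workaround, sketched here on a made-up miniature data frame and assuming the goal is to count non-missing values per group, is to do the NA handling inside an anonymous function:

```r
# Hypothetical miniature of the data: one grouping column, one value column.
toy <- data.frame(SUBJECT = c(1, 1, 2, 2, 2),
                  Sac = c(1, NA, 3, 1, NA))

# Count the non-missing values per subject. length() has no na.rm argument,
# so missing values are excluded inside the anonymous function instead.
# na.action = na.pass keeps rows with NA so that FUN actually sees them.
counts <- aggregate(Sac ~ SUBJECT, data = toy,
                    FUN = function(x) sum(!is.na(x)),
                    na.action = na.pass)
counts
```

If the intent is instead to count all rows per group, including missing ones, FUN = length without na.rm would be the simpler call.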
Bootstrapping multiple regression error: number of items to replace is not a multiple of replacement length
I want to bootstrap my dataset for multiple regression. Unfortunately I get this error message: "number of items to replace is not a multiple of replacement length". I suspect that the factors in my regression formula may be problematic. What could I do to solve my problem? My code is as follows (I read Andy Field's Discovering Statistics Using R):
BootReg <- function(data, indices, formula) {
  d <- data[indices,]
  fit <- lm(formula, data=d)
  return(coef(fit))
}
bootResults <- boot(statistic = BootReg, formula = TICS_Skala1 ~HSPhoch + HSPhoch*extra.c + psy + sex + age.c, data = mod.reg.data, R = 2000)
psy (psychiatric disease), sex and HSPhoch (high sensory-processing sensitivity) are factors. TICS_Skala1, extra.c and age.c are continuous variables. My sample data:
> dput(head(mod.reg.data, 20))
structure(list(neo_01 = c(3, 4, 3, 0, 4, 4, 3, 2, 3, 1, 4, 2, 3, 3, 1, 2, 3, 4, 0, 2), neo_03 = c(1, 1, 1, 3, 1, 2, 0, 0, 0, 0, 0, 0, 1, 3, 1, 1, 1, 1, 3, 1), neo_04 = c(2, 4, 3, 0, 4, 3, 4, 3, 2, 3, 3, 3, 3, 4, 2, 4, 3, 4, 3, 3), neo_08 = c(3, 0, 1, 2, 3, 3, 4, 3, 2, 1, 2, 4, 0, 3, 1, 1, 3, 1, 3, 1), neo_12 = c(3, 1, 1, 2, 2, 2, 4, 1, 1, 2, 1, 4, 1, 3, 1, 1, 3, 2, 3, 2), neo_13 = c(3, 2, 2, 4, 3, 3, 3, 2, 2, 1, 2, 3, 0, 3, 1, 0, 2, 3, 0, 2), neo_16 = c(3, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 3, 0, 2, 0, 0, 0, 0, 2, 1), neo_17 = c(2, 1, 3, 0, 1, 1, 1, 4, 3, 1, 2, 2, 2, 3, 1, 0, 2, 0, 2, 2), neo_18 = c(2, 3, 4, 0, 4, 3, 4, 3, 3, 1, 3, 2, 4, 2, 3, 4, 3, 4, 2, 2), neo_21 = c(3, 0, 1, 2, 1, 2, 1, 1, 1, 1, 1, 3, 0, 4, 1, 0, 0, 0, 4, 1), neo_26 = c(3, 0, 0, 0, 2, 1, 3, 0, 1, 1, 0, 2, 3, 3, 0, 0, 1, 1, 4, 1), neo_27 = c(3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 4, 3, 3, 3, 3, 2, 2), TICS_1 = c(3, 0, 3, 2, 2, 1, 3, 3, 1, 2, 0, 4, 2, 3, 2, 3, 4, 1, 3, 2), TICS_2 = c(3, 1, 1, 1, 1, 2, 0, 0, 0, 0, 0, 4, 3, 1, 1, 1, 2, 1, 2, 1), TICS_3 = c(2, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 3, 1, 2, 0, 1, 1, 0, 1, 0), TICS_4 = c(2, 0, 2, 0, 1, 2, 1, 3, 0, 0, 0, 4, 1, 2, 1, 2, 1, 1, 2, 2), TICS_5 = c(2, 3, 2, 1, 2, 2, 2, 
2, 0, 2, 1, 2, 2, 2, 2, 1, 1, 1, 2, 1), TICS_6 = c(3, 2, 2, 4, 2, 2, 1, 3, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 1, 2), TICS_7 = c(3, 3, 2, 2, 2, 2, 0, 3, 1, 2, 1, 4, 2, 0, 2, 1, 4, 1, 0, 1), TICS_8 =c(NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 0, 4, 3, 1, 1, 3, 3, 2, 1, 2), TICS_9 = c(NA, NA, NA, NA, NA, NA, NA, NA, 0, 3, 2, 2, 1, 3, 0, 1, 3, 1, 1, 2), TICS_10 = c(2, 2, 0, 0, 2, 3, 0, 2, 1, 1, 2, 2, 1, 0, 0, 1, 1, 2, 2, 1), TICS_11 = c(1, 2, 1, 0, 1, 1, 0, 0, 0, 0, 2, 4, 1, 0, 0, 0, 0, 1, 1, 0), TICS_12 = c(2, 2, 1, 0, 1, 1, 1, 3, 1, 1, 1, 4, 2, 2, 2, 3, 3, 1, 2, 3), TICS_13= c(1, 1, 3, 0, 2, 3, 2, 1, 1, 2, 1, 2, 2, 3, 2, 2, 1, 2, 2, 2), TICS_14= c(4, 1, 1, 0, 1, 1, 3, 4, 0, 2, 0, 4, 2, 3, 0, 1, 3, 1, 1, 1), TICS_15= c(3, 1, 1, 3, 0, 2, 0, 2, 0, 2, 1, 2, 0, 1, 1, 1, 0, 0, 0, 1), ICS_16= c(4, 2, 1, 3, 3, 2, 1, 2, 1, 1, 1, 3, 1, 3, 1, 2, 3, 1, 2, 1), TICS_17= c(3, 0, 2, 2, 1, 2, 2, 3, 0, 1, 1, 2, 1, 2, 2, 3, 1, 1, 1, 2), TICS_18= c(3, 0, 1, 2, 0, 1, 1, 0, 0, 1, 0, 4, 2, 2, 0, 0, 1, 0, 2, 0), TICS_19= c(4, 2, 2, 2, 2, 2, 0, 2, 1, 2, 1, 4, 3, 2, 1, 1, 1, 0, 1, 2), TICS_20= c(2, 0, 2, 0, 0, 0, 1, 0, 1, 1, 0, 4, 1, 1, 0, 0, 1, 0, 2, 0), TICS_21= c(2, 1, 1, 0, 2, 3, 0, 1, 0, 1, 3, 2, 2, 1, 2, 1, 1, 1, 3, 0), TICS_22= c(3, 0, 1, 2, 2, 3, 1, 4, 0, 1, 1, 2, 3, 1, 1, 2, 3, 2, 0, 3), TICS_24= c(2, 0, 0, 1, 0, 0, 2, 0, 1, 1, 0, 2, 0, 0, 0, 1, 1, 0, 0, 1), TICS_25= c(4, 0, 1, 2, 2, 2, 4, 2, 1, 1, 0, 3, 0, 2, 0, 1, 2, 1, 2, 1), TICS_26= c(3, 0, 2, 2, 0, 1, 1, 0, 0, 1, 0, 2, 0, 2, 0, 0, 0, 0, 0, 1), TICS_27= c(3, 1, 4, 2, 3, 3, 4, 4, 0, 1, 0, 3, 2, 3, 2, 3, 2, 2, 4, 3), TICS_28= c(3, 2, 2, 1, 1, 2, 1, 2, 1, 1, 0, 4, 1, 2, 1, 0, 1, 0, 0, 2), TICS_29= c(2, 0, 1, 0, 2, 2, 1, 0, 1, 0, 0, 4, 1, 1, 0, 1, 0, 0, 1, 1), TICS_30= c(2, 1, 3, 1, 2, 2, 1, 0, 1, 1, 1, 3, 2, 0, 1, 0, 1, 2, 2, 2), TICS_31= c(2, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 3, 2, 1, 0, 0, 1, 0, 2, 1), TICS_32= c(4, 1, 1, 0, 1, 2, 1, 4, 0, 3, 0, 3, 3, 2, 1, 2, 2, 2, 3, 3), TICS_33= c(2, 1, 0, 2, 1, 1, 1, 1, 0, 0, 0, 1, 0, 2, 0, 0, 0, 1, 1, 
1), TICS_34= c(1, 3, 0, 0, 2, 1, 1, 1, 0, 0, 2, 4, 0, 0, 0, 0, 0, 0, 0, 0), TICS_35= c(1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 2, 0, 1, 0, 1, 1, 0, 4, 1), TICS_36= c(4, 1, 2, 3, 3, 2, 4, 1, 0, 1, 2, 3, 1, 3, 0, 1, 1, 0, 2, 1), TICS_37= c(1, 1, 2, 0, 2, 3, 3, 0, 1, 2, 1, 2, 1, 0, 2, 2, 1, 1, 2, 1), TICS_38= c(3, 0, 3, 1, 2, 2, 2, 3, 0, 2, 0, 4, 0, 2, 1, 2, 2, 1, 1, 2), TICS_39= c(1, 1, 2, 2, 3, 1, 1, 2, 1, 1, 1, 4, 1, 1, 1, 1, 3, 0, 0, 3), TICS_40= c(2, 0, 2, 0, 3, 2, 1, 2, 0, 0, 0, 3, 2, 2, 0, 1, 2, 0, 0, 1), TICS_41= c(2, 2, 0, 0, 2, 3, 1, 1, 0, 1, 3, 1, 2, 0, 1, 0, 0, 1, 2, 0), TICS_42= c(1, 2, 0, 0, 2, 1, 0, 0, 0, 1, 1, 2, 1, 1, 1, 0, 0, 0, 0, 0), TICS_43= c(4, 1, 1, 2, 2, 3, 3, 3, 0, 2, 1, 4, 3, 2, 1, 1, 3, 1, 2, 3), TICS_44= c(3, 0, 2, 1, 2, 2, 3, 3, 0, 1, 0, 4, 1, 3, 0, 2, 2, 1, 3, 1), TICS_45= c(2, 0, 1, 2, 0, 1, 0, 2, 0, 1, 0, 2, 0, 2, 0, 0, 0, 0, 0, 1), TICS_46= c(2, 1, 0, 1, 2, 2, 1, 0, 0, 3, 1, 4, 3, 1, 1, 0, 1, 1, 2, 1), TICS_47= c(3, 1, 2, 1, 2, 2, 1, 1, 1, 2, 0, 3, 1, 2, 1, 2, 1, 1, 4, 1), TICS_48= c(1, 2, 3, 1, 2, 3, 1, 1, 0, 2, 2, 4, 2, 3, 2, 2, 1, 0, 2, 0), TICS_49= c(1, 3, 2, 2, 1, 2, 2, 1, 0, 1, 1, 4, 3, 0, 1, 2, 4, 1, 0, 3), TICS_50= c(3, 0, 3, 1, 1, 2, 4, 3, 0, 2, 0, 4, 2, 3, 2, 2, 2, 2, 2, 3), TICS_51= c(1, 2, 0, 0, 2, 1, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 0), TICS_52= c(2, 1, 3, 0, 1, 1, 1, 1, 0, 1, 0, 2, 0, 3, 0, 0, 0, 0, 0, 1), TICS_53= c(2, 2, 2, 0, 2, 3, 1, 1, 0, 2, 2, 3, 2, 2, 2, 1, 1, 1, 2, 1), TICS_54= c(3, 0, 3, 2, 2, 2, 3, 3, 1, 2, 0, 4, 0, 2, 0, 2, 2, 0, 2, 1), TICS_55= c(2, 0, 0, 1, 0, 1, 2, 0, 0, 1, 0, 4, 0, 1, 0, 1, 1, 0, 2, 0), TICS_56= c(4, 3, 1, 0, 2, 0, 0, 0, 1, 0, 1, 2, 1, 1, 1, 0, 0, 0, 2, 0), TICS_57= c(2, 1, 1, 0, 2, 1, 0, 0, 1, 1, 1, 4, 3, 0, 0, 1, 1, 0, 0, 2), HSPS_1 = c(3, 4, 3, 3, 4, 2, 4, 2, 4, 2, 3, 4, 2, 2, 4, 2, 3, 3, 5, 2), HSPS_2 = c(4, 4, 3, 5, 5, 3, 2, 4, 5, 5, 3, 4, 3, 4, 4, 2, 4, 3, 4, 3), HSPS_3 = c(4, 4, 4, 3, 3, 4, 3, 3, 3, 3, 3, 5, 3, 4, 5, 3, 3, 3, 4, 2), HSPS_4 = c(4, 2, 1, 4, 2, 3, 5, 3, 5, 2, 3, 3, 3, 
4, 3, 3, 4, 2, 5, 2), HSPS_5 = c(2, 2, 2, 4, 3, 3, 3, 1, 4, 3, 3, 4, 3, 2, 4, 3, 4, 3, 5, 1), HSPS_6 = c(4, 3, 1, 3, 4, 3, 3, 3, 3, 2, 1, 1, 1, 3, 5, 3, 3, 1, 1, 2), HSPS_7 = c(4, 3, 1, 3, 4, 2, 3, 1, 4, 3, 2, 4, 1, 1, 5, 3, 3, 1, 5, 1), HSPS_8 = c(4, 3, 5, 5, 4, 5, 5, 3, 4, 4, 3, 3, 2, 4, 4, 3, 4, 3, 3, 3), HSPS_9 = c(3, 2, 2, 5, 3, 3, 4, 1, 5, 2, 2, 4, 1, 2, 4, 4, 3, 1, 5, 2), HSPS_10= c(4, 4, 5, 4, 4, 4, 3, 1, 4, 3, 3, 4, 2, 1, 5, 3, 4, 4, 3, 2), HSPS_11= c(3, 2, 2, 3, 2, 2, 3, 1, 3, 2, 4, 5, 1, 3, 3, 3, 3, 2, 3, 2), HSPS_12= c(4, 4, 5, 5, 4, 5, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 4, 4, 5, 4), HSPS_13= c(3, 2, 3, 2, 2, 2, 5, 2, 3, 2, 3, 4, 3, 3, 3, 3, 4, 2, 5, 2), HSPS_14= c(3, 2, 2, 3, 3, 3, 5, 3, 3, 2, 3, 3, 2, 3, 2, 3, 3, 2, 4, 2), HSPS_15= c(4, 4, 2, 3, 4, 3, 3, 3, 4, 2, 3, 3, 5, 2, 4, 2, 3, 3, 3, 2), HSPS_16= c(2, 2, 1, 5, 2, 3, 2, 2, 3, 3, 3, 5, 2, 3, 3, 3, 2, 2, 5, 2), HSPS_17= c(4, 3, 4, 5, 3, 4, 4, 2, 4, 3, 5, 4, 4, 4, 5, 4, 5, 2, 5, 4), HSPS_18= c(2, 2, 1, 2, 1, 2, 2, 1, 3, 2, 2, 5, 2, 1, 4, 3, 2, 1, 5, 1), HSPS_19= c(3, 2, 2, 4, 2, 2, 3, 1, 4, 2, 2, 4, 1, 1, 4, 3, 2, 2, 5, 2), HSPS_20= c(4, 4, 4, 3, 4, 3, 5, 3, 3, 3, 4, 3, 3, 4, 4, 3, 5, 3, 5, 2), HSPS_21= c(3, 3, 4, 5, 3, 3, 5, 2, 4, 2, 3, 5, 4, 4, 3, 2, 3, 2, 5, 2), HSPS_22= c(3, 5, 5, 4, 5, 4, 3, 2, 4, 3, 3, 5, 3, 2, 4, 2, 4, 3, 5, 2), HSPS_23= c(2, 2, 1, 4, 2, 3, 4, 3, 3, 2, 2, 5, 3, 3, 3, 3, 3, 2, 5, 3), HSPS_24= c(3, 2, 2, 3, 3, 3, 3, 2, 4, 2, 3, 5, 4, 2, 4, 4, 4, 3, 4, 2), HSPS_25= c(3, 2, 2, 5, 3, 3, 5, 1, 4, 2, 3, 5, 3, 2, 4, 3, 3, 2, 5, 2), HSPS_26= c(2, 1, 1, 3, 3, 3, 3, 2, 3, 2, 2, 5, 2, 2, 3, 3, 3, 2, 5, 2), HSPS_27= c(2, 2, 1, 4, 3, 2, 3, 4, 3, 1, 4, 1, 1, 3, 4, 2, 3, 2, 5, 3), sex = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L), .Label = c("m", "w", "d"), class = "factor"), Bildung = structure(c(6L, 5L, 5L, 6L, 6L, 6L, 5L, 6L, 5L, 6L, 6L, 4L, 6L, 5L, 5L, 6L, 6L, 5L, 5L, 6L), .Label = c("kein", "Haupt", "mittlereR", "Fachabi", "Abi", "Studium"), 
class = "factor"), job = structure(c(6L, 2L, 2L, 2L, 2L, 6L, 2L, 6L, 5L, 2L, 2L, 1L, 6L, 2L, 2L, 2L, 6L, 2L, 2L, 6L), .Label = c("hausl", "Student", "Azubi", "Suchend", "Rente", "berufstaetig"), class = "factor"), age = c(23, 24, 21, 70, 25, 29, 22, 25, 57, 24, 25, 30, 31, 20, 28, 27, 26, 21, 24, 53), VPN = 1:20, consent = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("ja", "nein"), class = "factor"), psy = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0), HSPS = c(86, 75, 69, 102, 85, 82, 97, 59, 100, 68, 80, 106, 68, 73, 105, 79, 91, 63, 119, 59), neuro = c(16, 3, 4, 10, 10, 11, 12, 5, 5, 5, 5, 16, 5, 18, 4, 3, 8, 5, 19, 7), extra = c(15, 17, 19, 7, 19, 17, 18, 17, 16, 10, 17, 14, 15, 19, 11, 13, 16, 18, 9, 13), TICS_Skala1 = c(23, 1, 22, 11, 14, 16, 22, 25, 2, 11, 1, 29, 9, 20, 10, 19, 16, 9, 18, 16), TICS_Skala2 = c(14, 12, 11, 9, 11, 10, 4, 10, 5, 8, 5, 24, 13, 5, 6, 6, 14, 2, 1, 13), TICS_Skala3 = c(21, 6, 10, 5, 12, 14, 11, 20, 3, 11, 4, 27, 20, 13, 7, 13, 20, 11, 11, 18), TICS_Skala4 = c(13, 14, 13, 2, 16, 23, 10, 9, 3, 13, 15, 18, 14, 11, 13, 10, 7, 9, 17, 6), TICS_Skala5 = c(12, 2, 6, 5, 3, 5, 8, 3, 4, 6, 0, 18, 3, 7, 1, 6, 6, 1, 13, 3), TICS_Skala6 = c(10, 2, 3, 4, 4, 6, 3, 0, 0, 5, 2, 15, 10, 5, 2, 1, 5, 2, 8, 3), TICS_Skala7 = c(15, 5, 9, 13, 4, 8, 4, 9, 1, 6, 2, 11, 2, 12, 3, 2, 1, 3, 2, 7), TICS_Skala8 = c(8, 10, 3, 0, 11, 7, 2, 1, 2, 2, 7, 20, 7, 2, 2, 2, 1, 1, 2, 3), TICS_Skala9 = c(12, 3, 4, 8, 8, 6, 9, 5, 2, 6, 5, 11, 3, 11, 1, 5, 9, 3, 7, 5 ), TICS_Skala10 = c(32, 5, 18, 16, 19, 18, 21, 16, 5, 17, 7, 39, 12, 24, 3, 15, 20, 6, 25, 14), neuro.c = c(6.08921933085502, -6.91078066914498, -5.91078066914498, 0.089219330855018, 0.089219330855018, 1.08921933085502, 2.08921933085502, -4.91078066914498, -4.91078066914498, -4.91078066914498, -4.91078066914498, 6.08921933085502, -4.91078066914498, 8.08921933085502, -5.91078066914498, -6.91078066914498, -1.91078066914498, 
-4.91078066914498, 9.08921933085502, -2.91078066914498), extra.c = c(5.21003717472119, 7.21003717472119, 9.21003717472119, -2.78996282527881, 9.21003717472119, 7.21003717472119, 8.21003717472119, 7.21003717472119, 6.21003717472119, 0.21003717472119, 7.21003717472119, 4.21003717472119, 5.21003717472119, 9.21003717472119, 1.21003717472119, 3.21003717472119, 6.21003717472119, 8.21003717472119, -0.78996282527881, 3.21003717472119), age.c = c(-15.4460966542751, -14.4460966542751, -17.4460966542751, 31.5539033457249, -13.4460966542751, -9.4460966542751, -16.4460966542751, -13.4460966542751, 18.5539033457249, -14.4460966542751, -13.4460966542751, -8.4460966542751, -7.4460966542751, -18.4460966542751, -10.4460966542751, -11.4460966542751, -12.4460966542751, -17.4460966542751, -14.4460966542751, 14.5539033457249), HSP.c = c(-1.92936802973978, -12.9293680297398, -18.9293680297398, 14.0706319702602, -2.92936802973978, -5.92936802973978, 9.07063197026022, -28.9293680297398, 12.0706319702602, -19.9293680297398, -7.92936802973978, 18.0706319702602, -19.9293680297398, -14.9293680297398, 17.0706319702602, -8.92936802973978, 3.07063197026022, -24.9293680297398, 31.0706319702602, -28.9293680297398), HSPhoch = c(1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0)), row.names = c(NA, 20L), class = "data.frame")
Partial Credit Model: How to calculate the item difficulty?
I have a dataframe with credits of participants for several items. I would like to calculate the item difficulty of those items. With the eRm Package I can calculate the difficulty for the different categories of each item:
Some data:
x1 <- c(1, 2, 1, 0, 3, 3, 0, 4, 4, 1, 0, 3, 2, 0, 4, 1, NA, 1, 1, NA, 0, 1, 2, 1, 1, 3, 0, 2, 1, 0)
x2 <- c(0, 1, 0, 3, 2, 0, 1, 2, 2, NA, 0, 1, 2, 2, NA, 1, 2, 1, 2, 1, 0, 2, 3, 0, 1, 1, 0, 1, 1, 3)
x3 <- c(NA, NA, 3, 0, 1, 2, 0, 1, 1, NA, 3, 0, 1, 2, 0, 1, 2, 1, 0, 1, 3, 1, 3, 0, 1, 1, 0, 1, 1, 0)
x4 <- c(3, 0, 2, 2, 3, 2, 1, 2, 0, 0, 1, 0, 1, 1, 0, 1, 2, 1, 1, 2, 0, 1, 1, 2, 1, 1, 0, 1, 1, 0)
x5 <- c(1, NA, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, NA, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0)
dat <- data.frame(x1, x2, x3, x4, x5)
library("eRm")
Calculation of category difficulties:
PCM(dat)
PCM(dat)$etapar
I do not need the difficulty for the categories, but for the whole item. How can I calculate the overall difficulty of each item? Thank you very much in advance!
Imputing missing values keeping a rectangular shape in mind
I have a data set in which each number denotes a particular color. Since the full data set is large, I am sharing sample data and my work so far. I would like to create this output:
d <- matrix(c( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 3, 3, 2, 1, 1, 2, 3, 3, 2, 1, 1, 2, 3, 3, 2, 1, 1, 2, 3, 3, 2, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), nrow=13, byrow = TRUE)
from this input:
d_mi <- d
d_mi[ sample(1:length(d), length(d)*0.3) ] <- NA
d_mi
Optional: output (in color mode)