Control the level of detail in a pivot in R (tidyverse) [duplicate]


This question already has answers here:
Reshaping multiple sets of measurement columns (wide format) into single columns (long format)
(8 answers)
Closed 2 years ago.
I have an extremely wide dataset that I am trying to unpivot to a degree but not completely. Essentially I am trying to group certain columns together based on a string before an underscore and pivot on those groups individually. My current method uses two opposite pivots, a for loop, and an intermediate list to accomplish my goal. I am able to get my final product but for my own knowledge, I am wondering if there is a more elegant solution. I realize that I am likely not explaining things well so I have recreated the scenario with a dummy dataset.
#Required packages
library(tidyverse)
#Dummy data
file <- as_tibble(data.frame(id = c("QQQ", "WWW", "EEE", "RRR", "TTT"),
state = c("aa", "bb", "cc", "dd", "ee"),
city = c("ff", "gg", "hh", "ii", "jj"),
a_1 = runif(5),
a_2 = runif(5),
a_3 = runif(5),
a_4 = runif(5),
a_5 = runif(5),
a_6 = runif(5),
a_7 = runif(5),
a_8 = runif(5),
a_9 = runif(5),
a_10 = runif(5),
b_1 = runif(5),
b_2 = runif(5),
b_3 = runif(5),
b_4 = runif(5),
b_5 = runif(5),
b_6 = runif(5),
b_7 = runif(5),
b_8 = runif(5),
b_9 = runif(5),
b_10 = runif(5),
c_1 = runif(5),
c_2 = runif(5),
c_3 = runif(5),
c_4 = runif(5),
c_5 = runif(5),
c_6 = runif(5),
c_7 = runif(5),
c_8 = runif(5),
c_9 = runif(5),
c_10 = runif(5)))
#My solution
longer <- file %>%
pivot_longer(cols = -c(id:city),
names_to = c(".value", "section"),
names_pattern = "(.+)_([0-9]+$)"
)
num_letterGroup <- ncol(longer) - 4 #4 is the number of columns I want to retain (id, state, city, section)
wide_list <- vector(mode = "list", length = num_letterGroup)
name_list <- vector(mode = "character", length = num_letterGroup)
for (i in 1:num_letterGroup) {
col_num <- 4 + i
col_name <- colnames(longer)[col_num]
wide <- longer %>%
select(1:4, all_of(col_name)) %>%
pivot_wider(names_from = section, values_from = all_of(col_name)) %>%
mutate(letterGroup = col_name)
wide_list[[i]] <- wide
name_list[i] <- col_name
}
names(wide_list) <- name_list
wide_df <- bind_rows(wide_list)
I realize that the amount of data given might seem excessive but I needed the column numbers to be sequential as well as reach double digits. Thank you in advance for any assistance you can provide.
EDIT TO CLARIFY: wide_df is the final product that I want

EDIT
This is actually much simpler than the original answer. (Thanks to @thelatemail)
library(tidyr)
pivot_longer(file,
cols = -c(id:city),
names_to = c('letterGroup', '.value'),
names_sep = '_')
# A tibble: 15 x 14
# id state city letterGroup `1` `2` `3` `4` `5` `6` `7` `8` `9` `10`
# <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 QQQ aa ff a 0.894 0.534 0.583 0.327 0.497 0.254 0.877 0.236 0.585 0.436
# 2 QQQ aa ff b 0.861 0.897 0.244 0.292 0.818 0.428 0.732 0.322 0.702 0.158
# 3 QQQ aa ff c 0.371 0.842 0.918 0.615 0.346 0.675 0.821 0.718 0.461 0.374
# 4 WWW bb gg a 0.573 0.00886 0.555 0.810 0.480 0.763 0.624 0.0667 0.705 0.872
# 5 WWW bb gg b 0.994 0.652 0.961 0.825 0.398 0.0138 0.560 0.695 0.0171 0.704
# 6 WWW bb gg c 0.113 0.988 0.663 0.0461 0.335 0.478 0.291 0.338 0.386 0.183
# 7 EEE cc hh a 0.482 0.197 0.630 0.442 0.633 0.932 0.317 0.119 0.872 0.678
# 8 EEE cc hh b 0.834 0.378 0.504 0.911 0.644 0.976 0.777 0.485 0.470 0.560
# 9 EEE cc hh c 0.819 0.240 0.683 0.570 0.969 0.956 0.745 0.790 0.0548 0.314
#10 RRR dd ii a 0.887 0.818 0.0266 0.444 0.554 0.817 0.332 0.0801 0.966 0.252
#11 RRR dd ii b 0.416 0.211 0.931 0.105 0.948 0.555 0.201 0.656 0.794 0.526
#12 RRR dd ii c 0.652 0.897 0.741 0.254 0.815 0.154 0.422 0.361 0.925 0.696
#13 TTT ee jj a 0.391 0.626 0.358 0.296 0.804 0.743 0.655 0.000308 0.257 0.415
#14 TTT ee jj b 0.764 0.686 0.0174 0.460 0.0164 0.0718 0.700 0.558 0.341 0.411
#15 TTT ee jj c 0.812 0.995 0.845 0.513 0.987 0.249 0.429 0.749 0.557 0.369
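To see what the `.value` sentinel in `names_to` does, here is a minimal sketch on a smaller tibble. The names `toy` and `toy_long` are hypothetical and not part of the original data. With `names_sep = '_'`, the part before the underscore goes to the `letterGroup` column and the part after it becomes the name of an output column.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: two letter groups with two numbered columns each
toy <- tibble(id  = c("QQQ", "WWW"),
              a_1 = c(1, 2), a_2 = c(3, 4),
              b_1 = c(5, 6), b_2 = c(7, 8))

# "a_1" is split into letterGroup = "a" and an output column named "1"
toy_long <- pivot_longer(toy,
                         cols = -id,
                         names_to = c("letterGroup", ".value"),
                         names_sep = "_")
toy_long
```

Each `id` ends up with one row per letter group, and the numeric suffixes stay as columns, which is the partially unpivoted shape asked for in the question.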
Original Answer
You can get data completely in long format (no need for intermediate columns), separate the column names in two different columns and get the data in wide format.
file %>%
pivot_longer(cols = -c(id:city)) %>%
separate(name, into = c('letterGroup', 'col'), sep = "_") %>%
pivot_wider(names_from = col, values_from = value)

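You can try this. First get the data into long format, with the column names split into a letter and a number:
library(tidyr)
df1 <- pivot_longer(file, cols = names(file)[-c(1:3)]) %>%
separate(name, into = c('letter', 'number'), sep = '_')
#Option 1: pivot back to wide with tidyr
df_wide <- pivot_wider(df1, names_from = number, values_from = value)
#Option 2: base R reshape on the same long data
df2 <- reshape(as.data.frame(df1), idvar = c('id','state','city','letter'), timevar = 'number', direction = 'wide')
names(df2) <- gsub('value.', '', names(df2), fixed = TRUE)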

Related

r arrange data nested wide format

I have a dataset like this
Time1 Time2 Time3
A
Median 0.046 0.12 0
Q1, Q3 -0.12, 0.22 -1.67, -4.59 -0.245, 0.289
Range -2.75 -4.65 -2.20 - 1.425 -3.12, -1.928
B
Median 0.016 0.42 0.067
Q1, Q3 -0.21, 0.63 -1.17, -2.98 -0.478, 0.187
Range -2.15 -2.15 -1.12 - 1.125 -1.45, -1.478
What I want is to make this look like this
Time1 Time2 Time3
Median Q1,Q3 Range Median Q1,Q3 Range Median Q1,Q3 Range
A     0.046   -0.12, 0.22   -2.75 -4.65     0.12    -1.67, -4.59  -2.20 - 1.425   0       -0.245, 0.289   -3.12, -1.928
B 0.016 -0.21, 0.63 -2.15 -2.15 0.42 -1.17, -2.98 -1.12 - 1.125 0.067 -0.478, 0.187 -1.45, -1.478
I have used the spread function before to change long to wide, but I am not sure how to turn this into a nested wide format. Any suggestions are much appreciated.
df <- structure(list(Col1 = c("A", "Median", "Q1, Q3", "Range", "B",
"Median", "Q1, Q3", "Range"), Time1 = c("", "0.046", "-0.12, 0.22",
"-2.75 -4.65", "", "0.016", "-0.21, 0.63", "-2.15 -2.15"), Time2 = c("",
"0.12", "-1.67, -4.59", "-2.20 - 1.425", "", "0.42", "-1.17, -2.98",
"-1.12 - 1.125"), Time3 = c("", "0 ", "-0.245, 0.289 ",
"-3.12, -1.928", "", "0.067 ", "-0.478, 0.187 ", "-1.45, -1.478"
)), class = "data.frame", row.names = c(NA, -8L))

Here is a potential solution; see the comments for the step-by-step explanation.
library(tidyr)
#find rows containing the ids
namerows <- which(df$Time1=="")
#create and fill in the id column
df$id <- ifelse(df$Time1=="", df$Col1, NA)
df <- fill(df, id, .direction="down")
#clean up the dataframe
df <- df[-namerows, ]
#pivot
pivot_wider(df, id_cols = "id", names_from = "Col1", values_from = starts_with("Time"))
The result:
# A tibble: 2 × 10
id Time1_Median `Time1_Q1, Q3` Time1_Range Time2_Median `Time2_Q1, Q3` Time2_Range Time3_Median `Time3_Q1, Q3` Time3_Range
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A 0.046 -0.12, 0.22 -2.75 -4.65 0.12 -1.67, -4.59 -2.20 - 1.425 "0 " "-0.245, 0.289 " -3.12, -1.928
2 B 0.016 -0.21, 0.63 -2.15 -2.15 0.42 -1.17, -2.98 -1.12 - 1.125 "0.067 " "-0.478, 0.187 " -1.45, -1.478
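Some cells in the result above still carry trailing spaces (for example `"0 "`), because they are present in the input. A minimal sketch of trimming them before pivoting, assuming a small hypothetical data frame `raw` in the same layout as `df`:

```r
library(tidyr)

# Hypothetical data in the same layout as df, with trailing spaces in some cells
raw <- data.frame(Col1  = c("A", "Median", "Range"),
                  Time1 = c("", "0.046 ", "-2.75 -4.65"),
                  Time2 = c("", "0.12", "-2.20 - 1.425 "))

# Strip leading/trailing whitespace from every column
raw[] <- lapply(raw, trimws)

# Same steps as in the answer: build the id column, fill it down, drop the id rows
raw$id <- ifelse(raw$Time1 == "", raw$Col1, NA)
raw <- fill(raw, id, .direction = "down")
raw <- raw[raw$Time1 != "", ]

wide <- pivot_wider(raw, id_cols = "id", names_from = "Col1",
                    values_from = starts_with("Time"))
wide
```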

How to sum up durations if certain patterns are found across columns

I have a dataframe with words and their durations in speech:
test1
d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
10 0.103 0.168 0.198 0.188 0.359 0.343 0.064 0.075 0.095 0.367 And I thought oh no Sarah do n't do it
132 0.091 0.072 0.109 0.119 0.113 0.087 0.088 0.264 0.092 0.249 I du n no you ca n't see his head
784 0.152 0.341 0.117 0.108 0.123 0.263 0.083 0.095 0.099 0.098 Oh honestly I did n't touch it I did n't
The short form n't is treated as if it were a separate word. That's okay as long as the preceding word ends in a consonant, such as did, but it's not okay if the preceding word ends in a vowel, such as do or ca. Because that separation into different words is incorrect, the separation into different durations is incorrect too.
What I'd like to do is sum up the durations of ca and n't as well as do and n't, but leave alone the separate durations for did and n't.
I know how to select the rows where the changes need to be implemented:
test1[which(grepl("(?<=(ca|do)\\s)n't", apply(test1, 1, paste0, collapse = " "), perl = T)),]
but I'm stuck going forward.
The desired result would look like this:
d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
10 0.103 0.168 0.198 0.188 0.359 0.343 0.139 0.095 0.367 NA And I thought oh no Sarah do n't do it
132 0.091 0.072 0.109 0.119 0.113 0.175 0.264 0.092 0.249 NA I du n no you ca n't see his head
784 0.152 0.341 0.117 0.108 0.123 0.263 0.083 0.095 0.099 0.098 Oh honestly I did n't touch it I did n't
How can this be done? Help is much appreciated.
Reproducible data:
test1 <- structure(list(d1 = c(0.103, 0.091, 0.152), d2 = c(0.168, 0.072,
0.341), d3 = c(0.198, 0.109, 0.117), d4 = c(0.188, 0.119, 0.108
), d5 = c(0.359, 0.113, 0.123), d6 = c(0.343, 0.087, 0.263),
d7 = c(0.064, 0.088, 0.083), d8 = c(0.075, 0.264, 0.095),
d9 = c(0.095, 0.092, 0.099), d10 = c(0.367, 0.249, 0.098),
w1 = c("And", "I", "Oh"), w2 = c("I", "du", "honestly"),
w3 = c("thought", "n", "I"), w4 = c("oh", "no", "did"), w5 = c("no",
"you", "n't"), w6 = c("Sarah", "ca", "touch"), w7 = c("do",
"n't", "it"), w8 = c("n't", "see", "I"), w9 = c("do", "his",
"did"), w10 = c("it", "head", "n't")), row.names = c(10L,
132L, 784L), class = "data.frame")

I think this is best done with data in long instead of wide format so you can take advantage of grouping operations:
library(dplyr)
library(tidyr)
library(tibble)
test1 %>%
rownames_to_column() %>%
pivot_longer(-rowname, names_to = c(".value", "number"), names_pattern = "(\\D)(\\d+)") %>%
group_by(rowname) %>%
mutate(wid = cumsum(!(lag(w) %in% c("ca", "do") & w == "n't"))) %>%
group_by(rowname, wid) %>%
summarise(d = sum(d),
w = paste0(w, collapse = "")) %>%
pivot_wider(names_from = wid, values_from = c(d, w), names_sep = "")
`summarise()` regrouping output by 'rowname' (override with `.groups` argument)
# A tibble: 3 x 21
# Groups: rowname [3]
rowname d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 10 0.103 0.168 0.198 0.188 0.359 0.343 0.139 0.095 0.367 NA And I thought oh no Sarah don't do it NA
2 132 0.091 0.072 0.109 0.119 0.113 0.175 0.264 0.092 0.249 NA I du n no you can't see his head NA
3 784 0.152 0.341 0.117 0.108 0.123 0.263 0.083 0.095 0.099 0.098 Oh honestly I did n't touch it I did n't
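The key step in the pipeline above is the `cumsum()` expression that builds the word id `wid`. A minimal sketch on a single utterance may make it easier to follow; the vectors `w` and `d` are copied from the last five words of row 10, and `merge_with_previous` is a hypothetical helper name.

```r
library(dplyr)

w <- c("Sarah", "do", "n't", "do", "it")
d <- c(0.343, 0.064, 0.075, 0.095, 0.367)

# TRUE where n't follows "ca" or "do" and should be merged into that word
merge_with_previous <- lag(w) %in% c("ca", "do") & w == "n't"

# The id increases for every word except the ones being merged,
# so "do" and its "n't" share an id
wid <- cumsum(!merge_with_previous)
wid

# Summing durations per id gives the merged duration for "don't"
tapply(d, wid, sum)
```

Here `wid` is 1, 2, 2, 3, 4, so the durations 0.064 and 0.075 are added up to 0.139, matching the desired result.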

Divide matrix values by category means in R

I have a matrix (A) containing 211 rows and 6 columns (one per time period) and a different matrix (B) containing 211 rows and 2 columns, the second of which contains categorical information (1-9).
My aim is to create a new matrix (C) where each value in matrix A is the value(A) divided by the mean of (value(A) by category(B)). I managed to compute the means for each category per column with the aggregate function. These are stored in a separate dataframe, column_means, with each time wave in a separate column. This also contains the information about the group in column_means[,1].
I don't understand how to proceed from here and am looking for an elegant solution so I can transfer this knowledge to future projects (and possibly improve my existing code). My guess is that the solution is hidden somewhere in dplyr and rather simple once you know it.
Thank you for any suggestions.
Data example:
##each column here represents a wave:
initialmatrix <- structure(c(0.882647671948723, 0.847932241438909, 0.753052308699317,
0.754977233408875, NA, 0.886095543329695, 0.849625252682829,
0.78893884364632, 0.77111113840682, NA, 0.887255207679895, 0.851503493865384,
0.812107856411831, 0.793982699495818, NA, 0.885212452552841,
0.854894065774315, 0.815265718290737, 0.806766276556325, NA,
0.882027335190646, 0.85386634818439, 0.818052477777012, 0.815997781565393,
NA, 0.88245957310107, 0.855819521951304, 0.830425687228663, 0.820857689847061,
NA), .Dim = 5:6, .Dimnames = list(NULL, c("V1", "V2", "V3", "V4",
"V5", "V6")))
##the first column is unique ID, the 2nd the category:
categories <- structure(c(1L, 2L, 3L, 4L, 5L, 2L, 1L, 2L, 2L, 4L), .Dim = c(5L,
2L), .Dimnames = list(NULL, c("V1", "V2")))
##the first column represents the category, the remaining six columns the mean per category for each corresponding wave in "initialmatrix"
column.means <- structure(list(Group.1 = 1:5, x = c(0.805689153058216, 0.815006230419524,
0.832326976776262, 0.794835253329865, 0.773041961434791), asset_means_2...2. = c(0.80050960343197,
0.81923553710203, 0.833814773618545, 0.797834687980729, 0.780028077018158
), asset_means_3...2. = c(0.805053341257357, 0.828691564900149,
0.833953165695685, 0.799381078569563, 0.785813047374534), asset_means_4...2. = c(0.806116664276125,
0.832439754757116, 0.835982197159582, 0.801702200401293, 0.788814840753852
), asset_means_5...2. = c(0.807668548993891, 0.83801834926905,
0.836036508152776, 0.803433961863399, 0.79014026195926), asset_means_6...2. = c(0.808800359101212,
0.840923947682599, 0.839660313992458, 0.804901773257962, 0.793165113115977
)), row.names = c(NA, 5L), class = "data.frame")

Is this what you are trying to do?
options(digits=3)
divisor <- column.means[categories[, 2], -1]
divisor
# x asset_means_2...2. asset_means_3...2. asset_means_4...2. asset_means_5...2. asset_means_6...2.
# 2 0.815 0.819 0.829 0.832 0.838 0.841
# 1 0.806 0.801 0.805 0.806 0.808 0.809
# 2.1 0.815 0.819 0.829 0.832 0.838 0.841
# 2.2 0.815 0.819 0.829 0.832 0.838 0.841
# 4 0.795 0.798 0.799 0.802 0.803 0.805
initialmatrix/divisor
# x asset_means_2...2. asset_means_3...2. asset_means_4...2. asset_means_5...2. asset_means_6...2.
# 2 1.083 1.082 1.071 1.063 1.053 1.049
# 1 1.052 1.061 1.058 1.061 1.057 1.058
# 2.1 0.924 0.963 0.980 0.979 0.976 0.988
# 2.2 0.926 0.941 0.958 0.969 0.974 0.976
# 4 NA NA NA NA NA NA
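If the category means have not been computed beforehand, they can also be derived from the data in the same step. The following is only a sketch under that assumption, using a small hypothetical matrix `A` and category vector `category` rather than the data from the question; `ave()` returns, for each element, the mean of the group it belongs to.

```r
# Hypothetical example: 4 rows, 2 waves, 2 categories
A <- matrix(c(1, 3, 2, 6,
              10, 30, 20, 60),
            ncol = 2, dimnames = list(NULL, c("V1", "V2")))
category <- c(1, 1, 2, 2)

# For every column, replace each value by the mean of its category
group_means <- apply(A, 2, function(col) {
  ave(col, category, FUN = function(x) mean(x, na.rm = TRUE))
})

# Element-wise division: value / mean of its category in that wave
C <- A / group_means
C
```

Note that means computed this way come only from the rows supplied, so they will differ from means precomputed on a larger dataset.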

This looks like a job for Superma ... no wait ... map2.
library(dplyr)
library(purrr)
as_tibble(initialmatrix) %>%
mutate(category = as.double(as_tibble(categories)$V2),
across(starts_with('V'),
~ unlist(map2(., category, ~ .x/mean(c(.x, .y)))))) %>%
select(-category)
# V1 V2 V3 V4 V5 V6
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 0.612 0.614 0.615 0.614 0.612 0.612
# 2 0.918 0.919 0.920 0.922 0.921 0.922
# 3 0.547 0.566 0.578 0.579 0.581 0.587
# 4 0.548 0.557 0.568 0.575 0.580 0.582
# 5 NA NA NA NA NA NA

Merging output from cor.test in R dataframe

I am trying to perform cor.test in R in a dataframe:
For a toy dataset of X and Y, I used the following:
library(dplyr)
library(broom)
X = c(0.88,1.3,5.6,3.1)
Y = c(0,1,1,1)
ft<-cor.test(X,Y)
tidy(ft) %>%
select(estimate, p.value, conf.low, conf.high) %>%
bind_rows(.id = 'grp')
which gives me the following result:
grp estimate p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl>
1 1 0.571 0.429 -0.864 0.989
Now, a short version of my dataframe is like:
df<-structure(list(X_sample1 = c(0.11, 0.98, 0.88), X_sample2 = c(0.13,
0, 1.3), X_sample3 = c(1.5, 3.5, 5.6), X_sample4 = c(3.2, 2.4,
3.1), Y_sample1 = c(0L, 1L, 0L), Y_sample2 = c(0L, 0L, 1L), Y_sample3 = c(1L,
1L, 1L), Y_sample4 = c(1L, 1L, 1L)), class = "data.frame", row.names = c("Product1",
"Product2", "Product3"))
I want to perform cor.test in each row of the df between X and Y groups. Thus, in the above example df, the groups are:
X = c(0.11,0.13,1.5,3.2)
Y = c(0,0,1,1)
---------------
X = c(0.98,0,3.5,2.4)
Y = c(1,0,1,1)
---------------
X = c(0.88,1.3,5.6,3.1)
Y = c(0,1,1,1)
I want an output like:
grp estimate p.value conf.low conf.high
Product1 0.88 0.12 -0.525 0.997
Product2 0.743 0.257 -0.762 0.994
Product3 0.571 0.429 -0.864 0.989
Thanks for your help!

One option could be:
library(dplyr)
library(tibble)
library(broom)
df %>%
rownames_to_column(var = "grp") %>%
rowwise() %>%
transmute(grp,
tidy(cor.test(c_across(starts_with("X")), c_across(starts_with("Y"))))) %>%
select(grp, estimate, p.value, conf.low, conf.high)
grp estimate p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl>
1 Product1 0.880 0.120 -0.525 0.997
2 Product2 0.743 0.257 -0.762 0.994
3 Product3 0.571 0.429 -0.864 0.989
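The same row-wise idea can be sketched in base R, which may help to show what happens for each row. This is an illustration rather than a replacement for the answer above: the data frame is the one from the question, while `x_cols`, `y_cols` and `res` are hypothetical names.

```r
df <- data.frame(X_sample1 = c(0.11, 0.98, 0.88), X_sample2 = c(0.13, 0, 1.3),
                 X_sample3 = c(1.5, 3.5, 5.6),   X_sample4 = c(3.2, 2.4, 3.1),
                 Y_sample1 = c(0L, 1L, 0L),      Y_sample2 = c(0L, 0L, 1L),
                 Y_sample3 = c(1L, 1L, 1L),      Y_sample4 = c(1L, 1L, 1L),
                 row.names = c("Product1", "Product2", "Product3"))

# Positions of the X and Y columns
x_cols <- grep("^X_", names(df))
y_cols <- grep("^Y_", names(df))

# Run cor.test once per row and keep the estimate and p-value
res <- t(sapply(rownames(df), function(r) {
  ct <- cor.test(unlist(df[r, x_cols]), unlist(df[r, y_cols]))
  c(estimate = unname(ct$estimate), p.value = ct$p.value)
}))
res
```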

You can use dplyr, tidyr, and broom:
library(dplyr)
library(tidyr)
library(tibble)
library(broom)
df %>%
rownames_to_column() %>%
pivot_longer(-rowname, names_to = c(".value", "sample"),
names_sep = "_sample") %>%
nest_by(rowname) %>%
summarize(cors1 = tidy(cor.test(data$X, data$Y)))
# A tibble: 3 x 2
# Groups: rowname [3]
rowname cors1$estimate $statistic $p.value $parameter $conf.low $conf.high
<chr> <dbl> <dbl> <dbl> <int> <dbl> <dbl>
1 Produc~ 0.880 2.62 0.120 2 -0.525 0.997
2 Produc~ 0.743 1.57 0.257 2 -0.762 0.994
3 Produc~ 0.571 0.984 0.429 2 -0.864 0.989

How can I easily combine the output of grouped summaries with an overall output for the data

I've used group_by with the summarise command in dplyr to generate some summaries for my data. I would like to get the same summaries for the overall data set and combine it as one tibble.
Is there a straightforward way of doing this? My solution below feels like it has 4X the amount of code required to do this efficiently!
Thanks in advance.
# reprex
library(tidyverse)
tidy_data <- tibble::tribble(
~drug, ~gender, ~condition, ~value,
"control", "f", "work", 0.06,
"treatment", "m", "work", 0.42,
"treatment", "f", "work", 0.22,
"control", "m", "work", 0.38,
"treatment", "m", "work", 0.57,
"treatment", "f", "work", 0.24,
"control", "f", "work", 0.61,
"control", "f", "play", 0.27,
"treatment", "m", "play", 0.3,
"treatment", "f", "play", 0.09,
"control", "m", "play", 0.84,
"control", "m", "play", 0.65,
"treatment", "m", "play", 0.98,
"treatment", "f", "play", 0.38
)
tidy_summaries <- tidy_data %>%
# Group by the required variables
group_by(drug, gender, condition) %>%
summarise(mean = mean(value),
median = median(value),
min = min(value),
max = max(value)) %>%
# Bind rows will bind this output to the following one
bind_rows(
# Now for the overall version
tidy_data %>%
# Generate the overall summary values
mutate(mean = mean(value),
median = median(value),
min = min(value),
max = max(value)) %>%
# We need to know the structure of the grouped tibble first,
# as the overall output format needs to match that
select(drug, gender, condition, mean:max) %>% # Keep columns of interest
# The same information will be appended to all rows, so we just need to retain one
filter(row_number() == 1) %>%
# Change the values in drug, gender, condition to "overall"
mutate_at(vars(drug:condition),
list(~ifelse(is.character(.), "overall", .)))
)
This is the output I want, but it wasn't as simple as I might have hoped.
tidy_summaries
#> # A tibble: 9 x 7
#> # Groups: drug, gender [5]
#> drug gender condition mean median min max
#> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 control f play 0.27 0.27 0.27 0.27
#> 2 control f work 0.335 0.335 0.06 0.61
#> 3 control m play 0.745 0.745 0.65 0.84
#> 4 control m work 0.38 0.38 0.38 0.38
#> 5 treatment f play 0.235 0.235 0.09 0.38
#> 6 treatment f work 0.23 0.23 0.22 0.24
#> 7 treatment m play 0.64 0.64 0.3 0.98
#> 8 treatment m work 0.495 0.495 0.42 0.570
#> 9 overall overall overall 0.429 0.38 0.06 0.98

Try
tidy_data %>%
group_by(drug, gender, condition) %>%
summarise(mean = mean(value), median = median(value), min = min(value), max = max(value)) %>%
bind_rows(.,
tidy_data %>%
summarise(drug = "Overall", gender = "Overall", condition = "Overall", mean = mean(value), median = median(value), min = min(value), max = max(value))
)
This gives:
# A tibble: 9 x 7
# Groups: drug, gender [5]
drug gender condition mean median min max
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 control f play 0.27 0.27 0.27 0.27
2 control f work 0.335 0.335 0.06 0.61
3 control m play 0.745 0.745 0.65 0.84
4 control m work 0.38 0.38 0.38 0.38
5 treatment f play 0.235 0.235 0.09 0.38
6 treatment f work 0.23 0.23 0.22 0.24
7 treatment m play 0.64 0.64 0.3 0.98
8 treatment m work 0.495 0.495 0.42 0.570
9 Overall Overall Overall 0.429 0.38 0.06 0.98
The code summarizes it via groupings first, and then creates the final summary row from the original data and binds it at the very bottom.
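Because the same four summaries are written out twice, one option is to wrap them in a small helper. The sketch below assumes a hypothetical function `summarise_value()` and a reduced toy dataset `toy` with a single grouping column; it is not part of the original answer.

```r
library(dplyr)
library(tibble)

# Hypothetical helper: the summaries are defined once and reused
summarise_value <- function(data) {
  summarise(data,
            mean = mean(value), median = median(value),
            min = min(value), max = max(value),
            .groups = "drop")
}

# Reduced toy data with one grouping column
toy <- tribble(~drug,       ~value,
               "control",    1,
               "control",    3,
               "treatment", 10)

out <- bind_rows(
  toy %>% group_by(drug) %>% summarise_value(),          # per-group rows
  toy %>% summarise_value() %>% mutate(drug = "Overall") # overall row
)
out
```

`bind_rows()` matches columns by name, so the overall row lines up with the grouped rows even though `drug` is added last.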

Interesting question. My take is basically the same answer as @sumshyftw but uses mutate_if and summarise_at.
Code
library(hablar)
funs <- list(mean = ~mean(.),
median = ~median(.),
min = ~min(.),
max = ~max(.))
tidy_data %>%
group_by(drug, gender, condition) %>%
summarise_at(vars(value), funs) %>%
ungroup() %>%
bind_rows(., tidy_data %>% summarise_at(vars(value), funs)) %>%
mutate_if(is.character, ~if_na(., "Overall"))
Result
drug gender condition mean median min max
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 control f play 0.27 0.27 0.27 0.27
2 control f work 0.335 0.335 0.06 0.61
3 control m play 0.745 0.745 0.65 0.84
4 control m work 0.38 0.38 0.38 0.38
5 treatment f play 0.235 0.235 0.09 0.38
6 treatment f work 0.23 0.23 0.22 0.24
7 treatment m play 0.64 0.64 0.3 0.98
8 treatment m work 0.495 0.495 0.42 0.570
9 Overall Overall Overall 0.429 0.38 0.06 0.98