summarise() in dplyr 1.0.2 acting like mutate()

Given a tibble that lists users, products, and product features, I am attempting to calculate the fraction of distinct product users who have a certain product feature:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- tribble(
~users, ~product, ~feature,
"bob","iPhone","screen",
"bob","iPhone","camera",
"bob","iPhone","facial recognition",
"sally","Android","screen",
"sally","Android","camera",
"sally","Android","facial recognition",
"joe","Huawei","screen",
"joe","Huawei","camera",
"joe","Huawei","facial recognition",
"rachel","iPhone","screen",
"rachel","iPhone","camera",
"rachel","iPhone","fingerprint sensor"
)
# Get count of distinct users by product
df <- df %>%
group_by(product) %>%
mutate(n_users = n_distinct(users)) %>%
ungroup()
df
#> # A tibble: 12 x 4
#> users product feature n_users
#> <chr> <chr> <chr> <int>
#> 1 bob iPhone screen 2
#> 2 bob iPhone camera 2
#> 3 bob iPhone facial recognition 2
#> 4 sally Android screen 1
#> 5 sally Android camera 1
#> 6 sally Android facial recognition 1
#> 7 joe Huawei screen 1
#> 8 joe Huawei camera 1
#> 9 joe Huawei facial recognition 1
#> 10 rachel iPhone screen 2
#> 11 rachel iPhone camera 2
#> 12 rachel iPhone fingerprint sensor 2
# Count the fraction of distinct users with given product feature
df <- df %>%
group_by(product, feature) %>%
summarise(feature_fraction = n()/n_users,
.groups = "drop_last")
df
#> # A tibble: 12 x 3
#> # Groups: product [3]
#> product feature feature_fraction
#> <chr> <chr> <dbl>
#> 1 Android camera 1
#> 2 Android facial recognition 1
#> 3 Android screen 1
#> 4 Huawei camera 1
#> 5 Huawei facial recognition 1
#> 6 Huawei screen 1
#> 7 iPhone camera 1
#> 8 iPhone camera 1
#> 9 iPhone facial recognition 0.5
#> 10 iPhone fingerprint sensor 0.5
#> 11 iPhone screen 1
#> 12 iPhone screen 1
Created on 2020-10-23 by the reprex package (v0.3.0)
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os Windows 10 x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate English_United States.1252
#> ctype English_United States.1252
#> tz America/New_York
#> date 2020-10-23
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2)
#> backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2)
#> callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2)
#> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2)
#> dplyr * 1.0.2 2020-08-18 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2)
#> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2)
#> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.2)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.2)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2)
#> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2)
#> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2)
#> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.2)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
#> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.2)
#> tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.2)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.2)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.2)
#> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
#> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2)
#> xfun 0.16 2020-07-24 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2)
As can be seen, the final tibble has multiple rows for the same group-key pair, each with the same summary value. To my knowledge this is unexpected behavior for summarise(); it looks almost like what mutate() would return. Given this open GitHub issue, it appears that maybe not all the kinks have been ironed out of the new version of summarise(). I could also just be being stupid, and would appreciate it if someone could help get me back on track!

The problem is that you have multiple values of n_users in each group. Recent versions of dplyr (1.0.0 and later) allow summarise() to return more than one row per group if the summary expression returns multiple values. Here n()/n_users has one element per row of the group, so every row is returned.
If you want to assume that all the values of n_users are the same within a group, then you can do
df %>%
group_by(product, feature) %>%
summarise(feature_fraction = n()/first(n_users),
.groups = "drop_last")
That makes sure only one value is returned per group.
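The fix can be checked end to end with a minimal sketch that rebuilds the example data and runs both steps in one pipeline. It uses n_distinct(users) in the numerator, which is an assumption about what "fraction of distinct users" should mean (it equals n() here because each user lists a feature once), and the name feature_fractions is only illustrative.

```r
library(dplyr)

# Same data as in the question
df <- tribble(
  ~users,   ~product,  ~feature,
  "bob",    "iPhone",  "screen",
  "bob",    "iPhone",  "camera",
  "bob",    "iPhone",  "facial recognition",
  "sally",  "Android", "screen",
  "sally",  "Android", "camera",
  "sally",  "Android", "facial recognition",
  "joe",    "Huawei",  "screen",
  "joe",    "Huawei",  "camera",
  "joe",    "Huawei",  "facial recognition",
  "rachel", "iPhone",  "screen",
  "rachel", "iPhone",  "camera",
  "rachel", "iPhone",  "fingerprint sensor"
)

feature_fractions <- df %>%
  group_by(product) %>%
  mutate(n_users = n_distinct(users)) %>%   # distinct users per product
  group_by(product, feature) %>%            # replaces the previous grouping
  summarise(
    # first() collapses the repeated n_users values to a single number,
    # so each (product, feature) group yields exactly one row
    feature_fraction = n_distinct(users) / first(n_users),
    .groups = "drop"
  )

feature_fractions
```

With this data there are ten distinct (product, feature) pairs, so the result should have ten rows rather than twelve.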

Regular expression in R, splitting sentence at keywords in R

Hi,
I would like to split each sentence into two portions: everything from keyword_1 up to keyword_2, and everything from keyword_2 to the end of the sentence, preferably using regular expressions.
My ideal output is shown further below, after the data set that I made.
Data set
library(tibble)
keyword_1 <- c("coffee", "apple", "rainbow", "strawberry shortcake")
keyword_2 <- c("life", "new york", "seven colours", "sweet and yummy")
raw <-
tibble(
sentence = c(
"coffee is keyword_1_1 life is keyword_2_1",
"apple is keyword_1_2 new york is keyword_2_2",
"rainbow is keyword_1_3 seven colours is keyword_2_3",
"strawberry shortcake is keyword_1_4 sweet and yummy is keyword 2_4"
))
raw
#> # A tibble: 4 x 1
#> sentence
#> <chr>
#> 1 coffee is keyword_1_1 life is keyword_2_1
#> 2 apple is keyword_1_2 new york is keyword_2_2
#> 3 rainbow is keyword_1_3 seven colours is keyword_2_3
#> 4 strawberry shortcake is keyword_1_4 sweet and yummy is keyword 2_4
Intended Output
library(tibble)
output = tibble(
output1 = c(
"coffee is keyword_1_1",
"apple is keyword_1_2",
"rainbow is keyword_1_3",
"strawberry shortcake is keyword_1_4"
),
output2 = c("life is keyword_2_1", "new york is keyword_2_2",
"seven colours is keyword_2_3", "sweet and yummy is keyword 2_4")
)
output
#> # A tibble: 4 x 2
#> output1 output2
#> <chr> <chr>
#> 1 coffee is keyword_1_1 life is keyword_2_1
#> 2 apple is keyword_1_2 new york is keyword_2_2
#> 3 rainbow is keyword_1_3 seven colours is keyword_2_3
#> 4 strawberry shortcake is keyword_1_4 sweet and yummy is keyword 2_4
Created on 2021-03-18 by the reprex package (v0.3.0)
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os macOS 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2021-03-18
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2)
#> callr 3.5.1 2020-10-13 [1] CRAN (R 4.0.2)
#> cli 2.3.1 2021-02-23 [1] CRAN (R 4.0.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2)
#> debugme 1.1.0 2017-10-22 [1] CRAN (R 4.0.2)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2)
#> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.1)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2)
#> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#> knitr 1.31 2021-01-27 [1] CRAN (R 4.0.2)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.2)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2)
#> pillar 1.5.0 2021-02-22 [1] CRAN (R 4.0.2)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2)
#> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2)
#> ps 1.4.0 2020-10-07 [1] CRAN (R 4.0.2)
#> R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.2)
#> rmarkdown 2.5 2020-10-21 [1] CRAN (R 4.0.2)
#> rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.0.2)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
#> testthat 3.0.0 2020-10-31 [1] CRAN (R 4.0.2)
#> tibble * 3.1.0 2021-02-25 [1] CRAN (R 4.0.2)
#> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.2)
#> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
#> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2)
#> xfun 0.19.3 2020-11-06 [1] Github (yihui/xfun#12e77f5)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
Assuming that the pattern is always "keyword_number_number", so that the fourth entry is missing a "_" and should be:
raw[4, 1] <- "strawberry shortcake is keyword_1_4 sweet and yummy is keyword_2_4"
Then we can write:
pattern <- "([a-z ]+ keyword_[0-9]_[0-9]) ([a-z ]+ keyword_[0-9]_[0-9])"
a <- matrix(NA_character_, nrow(raw), 2)
for (i in seq_len(nrow(raw))) {
  for (j in 1:2) {
    # "\\1" / "\\2" keep only the first / second capture group
    a[i, j] <- gsub(pattern, paste0("\\", j), raw$sentence[i])
  }
}
Output:
> a
[,1] [,2]
[1,] "coffee is keyword_1_1" "life is keyword_2_1"
[2,] "apple is keyword_1_2" "new york is keyword_2_2"
[3,] "rainbow is keyword_1_3" "seven colours is keyword_2_3"
[4,] "strawberry shortcake is keyword_1_4" "sweet and yummy is keyword_2_4"
Here is a data.table approach, using a look-behind regex pattern for splitting:
library( data.table )
setDT(raw)[, paste0( "output", 1:2 ) :=
lapply( tstrsplit(sentence, "(?<=_[0-9]{1}_[0-9]{1})", perl = TRUE ),
trimws ) ][, sentence := NULL][]
# output1 output2
# 1: coffee is keyword_1_1 life is keyword_2_1
# 2: apple is keyword_1_2 new york is keyword_2_2
# 3: rainbow is keyword_1_3 seven colours is keyword_2_3
# 4: strawberry shortcake is keyword_1_4 sweet and yummy is keyword 2_4
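For comparison, the same split can be sketched with vectorised sub() calls in base R, with no loop and no extra package. The pattern below is an assumption: it treats the first "keyword_digit_digit" followed by a space as the end of the first portion, so it also handles the fourth sentence as originally written (with "keyword 2_4").

```r
sentence <- c(
  "coffee is keyword_1_1 life is keyword_2_1",
  "apple is keyword_1_2 new york is keyword_2_2",
  "rainbow is keyword_1_3 seven colours is keyword_2_3",
  "strawberry shortcake is keyword_1_4 sweet and yummy is keyword 2_4"
)

# Group 1: shortest prefix ending in keyword_<digit>_<digit>
# Group 2: everything after the single space that follows it
pattern <- "^(.*?keyword_[0-9]_[0-9]) (.*)$"

output <- data.frame(
  output1 = sub(pattern, "\\1", sentence, perl = TRUE),
  output2 = sub(pattern, "\\2", sentence, perl = TRUE),
  stringsAsFactors = FALSE
)

output
```

sub() is vectorised over the input, so each capture group is extracted for all sentences in one call.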

How to delete entire rows if it has duplicates values and ids in the data frame in R

Hello all, my df looks like this:
PID Record date category
123 22-04-1996 2
123 25-02-2000 NA
132 16-06-1994 1
143 25-07-1990 3
154 09-07-1993 1
154 08-08-1998 2
165 23-03-1993 NA
165 15-05-1995 NA
174 30-12-2000 NA
If a PID has more than one row and a category value is available in any of those rows, I want to remove all rows for that PID from the data frame completely.
Expected output :
PID Record date category
132 16-06-1994 1
143 25-07-1990 3
165 23-03-1993 NA
165 15-05-1995 NA
174 30-12-2000 NA
Thank You in Advance
Using {dplyr} you can group your data by PID and keep only the groups with a single distinct value of category (NA included).
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
example_data <- tribble(
~PID, ~`Record date`, ~category,
123, "22-04-1996", 2,
123, "25-02-2000", NA,
132, "16-06-1994", 1,
143, "25-07-1990", 3,
154, "09-07-1993", 1,
154, "08-08-1998", 2,
165, "23-03-1993", NA,
165, "15-05-1995", NA,
174, "30-12-2000", NA
)
example_data %>%
with_groups(PID, filter, n_distinct(category) == 1)
#> # A tibble: 5 x 3
#> PID `Record date` category
#> <dbl> <chr> <dbl>
#> 1 132 16-06-1994 1
#> 2 143 25-07-1990 3
#> 3 165 23-03-1993 NA
#> 4 165 15-05-1995 NA
#> 5 174 30-12-2000 NA
Created on 2020-09-07 by the reprex package (v0.3.0)
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os Ubuntu 20.04.1 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Rome
#> date 2020-09-07
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2)
#> backports 1.1.9 2020-08-24 [1] CRAN (R 4.0.2)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.2)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2)
#> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2)
#> dplyr * 1.0.2 2020-08-18 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2)
#> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2)
#> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.2)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.2)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2)
#> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2)
#> processx 3.4.3 2020-07-05 [1] CRAN (R 4.0.2)
#> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.2)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
#> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.2)
#> tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.2)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.2)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.2)
#> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.2)
#> xfun 0.16 2020-07-24 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2)
#>
#> [1] /home/cl/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
Here is a base R option using ave + subset
subset(
  df,
  # drop a PID when it has more than one row and any non-missing category
  !ave(!is.na(category),
    PID,
    FUN = function(x) length(x) > 1 & any(x)
  )
)
which gives
PID date category
3 132 16-06-1994 1
4 143 25-07-1990 3
7 165 23-03-1993 NA
8 165 15-05-1995 NA
9 174 30-12-2000 NA
Data
> dput(df)
structure(list(PID = c(123L, 123L, 132L, 143L, 154L, 154L, 165L,
165L, 174L), date = c("22-04-1996", "25-02-2000", "16-06-1994",
"25-07-1990", "09-07-1993", "08-08-1998", "23-03-1993", "15-05-1995",
"30-12-2000"), category = c(2L, NA, 1L, 3L, 1L, 2L, NA, NA, NA
)), class = "data.frame", row.names = c(NA, -9L))
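The two answers encode slightly different rules: n_distinct(category) == 1 would keep a PID whose repeated rows all share the same non-missing category, while the ave() version drops it. As a rough sketch, the ave() rule can also be written directly in dplyr; the name kept is only illustrative.

```r
library(dplyr)

example_data <- tribble(
  ~PID, ~`Record date`, ~category,
  123, "22-04-1996", 2,
  123, "25-02-2000", NA,
  132, "16-06-1994", 1,
  143, "25-07-1990", 3,
  154, "09-07-1993", 1,
  154, "08-08-1998", 2,
  165, "23-03-1993", NA,
  165, "15-05-1995", NA,
  174, "30-12-2000", NA
)

kept <- example_data %>%
  group_by(PID) %>%
  # remove the whole group when the PID is repeated AND has any category value
  filter(!(n() > 1 & any(!is.na(category)))) %>%
  ungroup()

kept
```

On the example data both rules give the same five rows; they only diverge for repeated PIDs with identical non-missing categories.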

Why does dplyr::mutate_at() on the first element in a rowwise-tibble also take effect on the rest of the elements?

In the following code, I define a tibble df with two columns: the name column contains the character vector c("a", "b", "c"), and the data column contains a list of tibbles, each with a single column value. I would like to rename each tibble's value column to the character in the corresponding row, i.e. "a", "b" and "c". To manipulate the tibble row by row I used dplyr::rowwise(), but I found that the change made for the first element (renaming the column to "a") also took effect on the remaining elements: after the first row, the tibble printed before the rename already shows the column name "a". Consequently the renames for the following rows fail, since there is no longer a column named "value" (all have been changed to "a"). Do I have to use purrr::map() here instead of the tidier row-wise tibble manipulation?
Could you please give me an answer using the rowwise() and mutate_at() method? Thanks.
library(tidyverse)
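#> Warning: package 'tidyverse' was built under R version 3.6.3
#> Warning: package 'ggplot2' was built under R version 3.6.1
#> Warning: package 'tibble' was built under R version 3.6.3
#> Warning: package 'tidyr' was built under R version 3.6.1
#> Warning: package 'readr' was built under R version 3.6.1
#> Warning: package 'purrr' was built under R version 3.6.1
#> Warning: package 'dplyr' was built under R version 3.6.3
#> Warning: package 'stringr' was built under R version 3.6.1
#> Warning: package 'forcats' was built under R version 3.6.3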
df <- tibble::tibble(name = c("a", "b", "c"),
data = list(tibble::tibble(value = 1:10)))
df_mutate <- df %>%
dplyr::rowwise() %>%
dplyr::mutate_at("data", ~ {
print(.x)
colnames(.x)[colnames(.x) %in% "value"] <- name
list(.x)
}) %>%
dplyr::ungroup()
#> # A tibble: 10 x 1
#> value
#> <int>
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
#> 6 6
#> 7 7
#> 8 8
#> 9 9
#> 10 10
#> # A tibble: 10 x 1
#> a
#> <int>
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
#> 6 6
#> 7 7
#> 8 8
#> 9 9
#> 10 10
#> # A tibble: 10 x 1
#> a
#> <int>
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
#> 6 6
#> 7 7
#> 8 8
#> 9 9
#> 10 10
Created on 2020-06-19 by the reprex package (v0.3.0)
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 3.6.0 (2019-04-26)
#> os Windows Server x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate Chinese (Simplified)_China.936
#> ctype Chinese (Simplified)_China.936
#> tz Asia/Taipei
#> date 2020-06-19
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
#> backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.1)
#> broom 0.5.6 2020-04-20 [1] CRAN (R 3.6.3)
#> callr 3.4.0 2019-12-09 [1] CRAN (R 3.6.2)
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.1)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 3.6.3)
#> colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.2)
#> dbplyr 1.4.2 2019-06-17 [1] CRAN (R 3.6.3)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
#> devtools 2.2.1 2019-09-24 [1] CRAN (R 3.6.1)
#> digest 0.6.23 2019-11-23 [1] CRAN (R 3.6.2)
#> dplyr * 1.0.0 2020-05-29 [1] CRAN (R 3.6.3)
#> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.1)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#> fansi 0.4.0 2018-10-05 [1] CRAN (R 3.6.1)
#> forcats * 0.5.0 2020-03-01 [1] CRAN (R 3.6.3)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.1)
#> ggplot2 * 3.2.1 2019-08-10 [1] CRAN (R 3.6.1)
#> glue 1.4.1 2020-05-13 [1] CRAN (R 3.6.3)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
#> haven 2.2.0 2019-11-08 [1] CRAN (R 3.6.3)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#> hms 0.5.2 2019-10-30 [1] CRAN (R 3.6.2)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.1)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.1)
#> jsonlite 1.6 2018-12-07 [1] CRAN (R 3.6.1)
#> knitr 1.26 2019-11-12 [1] CRAN (R 3.6.2)
#> lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.0)
#> lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.1)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 3.6.3)
#> lubridate 1.7.4 2018-04-11 [1] CRAN (R 3.6.2)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
#> modelr 0.1.6 2020-02-22 [1] CRAN (R 3.6.3)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
#> nlme 3.1-143 2019-12-10 [1] CRAN (R 3.6.2)
#> pillar 1.4.3 2019-12-20 [1] CRAN (R 3.6.2)
#> pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 3.6.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.1)
#> processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.1)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.1)
#> purrr * 0.3.3 2019-10-18 [1] CRAN (R 3.6.1)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.2)
#> Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.2)
#> readr * 1.3.1 2018-12-21 [1] CRAN (R 3.6.1)
#> readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.1)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.1)
#> reprex 0.3.0 2019-05-16 [1] CRAN (R 3.6.3)
#> rlang 0.4.6 2020-05-02 [1] CRAN (R 3.6.3)
#> rmarkdown 2.0 2019-12-12 [1] CRAN (R 3.6.2)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
#> rvest 0.3.5 2019-11-08 [1] CRAN (R 3.6.3)
#> scales 1.1.0 2019-11-18 [1] CRAN (R 3.6.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.0)
#> stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#> testthat 2.3.1 2019-12-01 [1] CRAN (R 3.6.2)
#> tibble * 3.0.1 2020-04-20 [1] CRAN (R 3.6.3)
#> tidyr * 1.0.0 2019-09-11 [1] CRAN (R 3.6.1)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 3.6.3)
#> tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.3)
#> usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.1)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.1)
#> vctrs 0.3.0 2020-05-11 [1] CRAN (R 3.6.3)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1)
#> xfun 0.11 2019-11-12 [1] CRAN (R 3.6.2)
#> xml2 1.2.2 2019-08-09 [1] CRAN (R 3.6.1)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] C:/Users/xzhu/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.0/library
Yes, you can use map2:
library(dplyr)
df %>% mutate(data = purrr::map2(name, data, ~{names(.y) <- .x;.y}))
Or Map in base R:
df$data <- Map(function(x, y) {names(y) <- x;y}, df$name, df$data)
If you want to use rowwise, a similar approach would be:
df %>% rowwise() %>% mutate(data = {names(data) <- name;list(data)})
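The snippets above overwrite all names of each nested tibble, which is fine when it has a single column. As a hedged sketch closer to the original intent (rename only the column called "value"), the rowwise version could use a small helper; rename_value is a hypothetical name introduced here, not a dplyr function, and the example assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- tibble(
  name = c("a", "b", "c"),
  data = list(tibble(value = 1:10))  # length-1 list is recycled to 3 rows
)

# Hypothetical helper: rename only the "value" column of one nested tibble
rename_value <- function(d, new_name) {
  names(d)[names(d) == "value"] <- new_name
  d
}

df_renamed <- df %>%
  rowwise() %>%
  # inside rowwise(), `data` is the single tibble of the current row and
  # `name` is a length-1 character vector; list() re-wraps the result
  mutate(data = list(rename_value(data, name))) %>%
  ungroup()

# Column names of the three nested tibbles
vapply(df_renamed$data, names, character(1))
```

Because the helper works on a copy of each row's tibble, the rename in one row does not leak into the others.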

Calculating upper and lower confidence intervals by group in dplyr summarise()

I am trying to make a table that shows N (number of observations), percent frequency (of answers > 0), and the lower and upper confidence intervals for percent frequency, and I want to group this by type.
Example of data
dat <- data.frame(
"type" = c("B","B","A","B","A","A","B","A","A","B","A","A","A","B","B","B"),
"num" = c(3,0,0,9,6,0,4,1,1,5,6,1,3,0,0,0)
)
Expected output (with values filled in):
Type N Percent Lower 95% CI Upper 95% CI
A
B
Attempt
library(dplyr)
library(qwraps2)
table<-dat %>%
group_by(type) %>%
summarise(N=n(),
mean.ci = mean_ci(dat$num),
"Percent"=n_perc(num > 0))
This worked to get N and percent frequency, but returned an error: "Column must be length 1 (a summary value), not 3" when I added in mean_ci
The second code I tried, found here:
table2<-dat %>%
group_by(type) %>%
summarise(N.num=n(),
mean.num = mean(dat$num),
sd.num = sd(dat$num),
"Percent"=n_perc(num > 0)) %>%
mutate(se.num = sd.num / sqrt(N.num),
lower.ci = 100*(mean.num - qt(1 - (0.05 / 2), N.num - 1) * se.num),
upper.ci = 100*(mean.num + qt(1 - (0.05 / 2), N.num - 1) * se.num))
# A tibble: 2 x 8
# type N.num mean.num sd.num Percent se.num lower.ci upper.ci
# <fct> <int> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#1 A 8 2.44 2.83 "6 (75.00\\%)" 1.00 7.35 480.
#2 B 8 2.44 2.83 "4 (50.00\\%)" 1.00 7.35 480.
This gave me an output, but the confidence intervals are not logical.
The output of mean_ci is a vector of length 3. This may be unexpected because the package adds a print method, so in the console it looks like a single character value rather than a numeric vector of length greater than 1. You can see the underlying data structure with str.
mean_ci(dat$num) %>% str
# 'qwraps2_mean_ci' Named num [1:3] 2.44 1.05 3.82
# - attr(*, "names")= chr [1:3] "mean" "lcl" "ucl"
# - attr(*, "alpha")= num 0.05
In summarize, each element of each column of the output needs to be length 1, so providing a length 3 object for summarize to put in a single "cell" (column element) results in an error. A workaround is to put the length 3 vector in a list, so that it becomes a length 1 list. Then you can use unnest_wider to separate it into 3 columns (thereby making the table "wider").
library(tidyverse)
dat %>%
group_by(type) %>%
summarise( N=n(),
mean.ci = list(mean_ci(num)),
"Percent"= n_perc(num > 0)) %>%
unnest_wider(mean.ci)
# # A tibble: 2 x 6
# type N mean lcl ucl Percent
# <fct> <int> <dbl> <dbl> <dbl> <chr>
# 1 A 8 2.25 0.523 3.98 "6 (75.00\\%)"
# 2 B 8 2.62 0.344 4.91 "4 (50.00\\%)"
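Note that the attempts in the question pass dat$num, which is the whole column and ignores the grouping; that is why both groups showed the same mean of 2.44. As a cross-check that does not depend on qwraps2, the interval can be sketched by hand inside summarise() using the bare column name. The normal quantile qnorm(0.975) is an assumption chosen because it reproduces the lcl and ucl values printed above; a t quantile (qt) would give wider intervals for groups this small.

```r
library(dplyr)

dat <- data.frame(
  type = c("B","B","A","B","A","A","B","A","A","B","A","A","A","B","B","B"),
  num  = c(3,0,0,9,6,0,4,1,1,5,6,1,3,0,0,0)
)

ci_tab <- dat %>%
  group_by(type) %>%
  summarise(
    N        = n(),
    mean_num = mean(num),                 # `num`, not dat$num: respects groups
    se       = sd(num) / sqrt(N),         # standard error of the mean
    lcl      = mean_num - qnorm(0.975) * se,
    ucl      = mean_num + qnorm(0.975) * se,
    percent  = 100 * mean(num > 0),       # percent of answers > 0
    .groups  = "drop"
  )

ci_tab
```

Each expression returns a single value per group, so summarise() yields one row per type without needing a list column.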
IceCreamToucan’s answer is very good. I’m posting this answer to offer a
different way to present the information.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(qwraps2)
dat <- data.frame("type" = c("B","B","A","B","A","A","B","A","A","B","A","A","A","B","B","B"),
"num" = c(3,0,0,9,6,0,4,1,1,5,6,1,3,0,0,0))
When building the dplyr::summarize call you can use the qwraps2::frmtci
call to format the output of qwraps2::mean_ci into a character string of
length one.
I would also recommend using the data pronoun .data so you can be explicit
about the variables to summarize.
dat %>%
dplyr::group_by(type) %>%
dplyr::summarize(N = n(),
mean.ci = qwraps2::frmtci(qwraps2::mean_ci(.data$num)),
Percent = qwraps2::n_perc(.data$num > 0))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 4
#> type N mean.ci Percent
#> <chr> <int> <chr> <chr>
#> 1 A 8 2.25 (0.52, 3.98) "6 (75.00\\%)"
#> 2 B 8 2.62 (0.34, 4.91) "4 (50.00\\%)"
Created on 2020-09-15 by the reprex package (v0.3.0)
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os macOS Catalina 10.15.6
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Denver
#> date 2020-09-15
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
#> backports 1.1.9 2020-08-24 [1] CRAN (R 4.0.2)
#> callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
#> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
#> dplyr * 1.0.2 2020-08-18 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0)
#> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.0)
#> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.0)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
#> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
#> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2)
#> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
#> qwraps2 * 0.5.0 2020-09-14 [1] local
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
#> Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.0)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
#> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0)
#> tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.0)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.0)
#> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0)
#> xfun 0.17 2020-09-09 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Group output from naniar using dplyr, nesting/unnesting, compatibility with newer version of R

I had a piece of R script running to get an overview on missing values in a repeated measures data frame. I used naniar and dplyr from the tidyverse and it worked fine. I used the combination to group the output by different factors (e.g. study, day, participant,...):
miss_trigger <- data_mlm_npu_filter[,c("Trigger_counter", "stadi_AU")] %>%
group_by(Trigger_counter) %>%
miss_var_summary()
Now, some months later, I first got the warning message
#Warning message:
# `cols` is now required.
#Please use `cols = c(data)`
After searching for the warning message, I found that something has changed with nesting/unnesting, but this information did not help me fix the warning or work out what changes to apply to my code.
And now after updating R to 3.6.2, I am just getting:
Error in group_by_fun(data, .fun = miss_var_summary()) :
could not find function "group_by_fun"
The miss_var_summary function itself works without problems. So, I would really just like to group my output from naniar as before. What do I have to do? Apparently I am missing a key information or understanding of the packages I am using on how to fix this myself.
This was a bug introduced by a new version of tidyr. It should now work:
library(naniar)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
oceanbuoys %>%
group_by(year) %>%
miss_var_summary()
#> # A tibble: 14 x 4
#> # Groups: year [2]
#> year variable n_miss pct_miss
#> <dbl> <chr> <int> <dbl>
#> 1 1997 air_temp_c 77 20.9
#> 2 1997 latitude 0 0
#> 3 1997 longitude 0 0
#> 4 1997 sea_temp_c 0 0
#> 5 1997 humidity 0 0
#> 6 1997 wind_ew 0 0
#> 7 1997 wind_ns 0 0
#> 8 1993 humidity 93 25.3
#> 9 1993 air_temp_c 4 1.09
#> 10 1993 sea_temp_c 3 0.815
#> 11 1993 latitude 0 0
#> 12 1993 longitude 0 0
#> 13 1993 wind_ew 0 0
#> 14 1993 wind_ns 0 0
Created on 2020-05-14 by the reprex package (v0.3.0)
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.0 (2020-04-24)
#> os macOS Mojave 10.14.6
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_AU.UTF-8
#> ctype en_AU.UTF-8
#> tz Australia/Melbourne
#> date 2020-05-14
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
#> backports 1.1.6 2020-04-05 [1] CRAN (R 4.0.0)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.0)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
#> colorspace 1.4-2 2020-02-27 [1] R-Forge (R 4.0.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
#> devtools 2.3.0 2020-04-10 [1] CRAN (R 4.0.0)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
#> dplyr * 0.8.99.9002 2020-05-04 [1] Github (tidyverse/dplyr#8710f8a)
#> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 4.0.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
#> fs 1.4.1 2020-04-04 [1] CRAN (R 4.0.0)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0)
#> ggplot2 3.3.0 2020-03-05 [1] CRAN (R 4.0.0)
#> glue 1.4.0 2020-04-03 [1] CRAN (R 4.0.0)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 4.0.0)
#> knitr 1.28 2020-02-06 [1] CRAN (R 4.0.0)
#> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.0)
#> naniar * 0.5.1 2020-04-30 [1] CRAN (R 4.0.0)
#> pillar 1.4.4 2020-05-05 [1] CRAN (R 4.0.0)
#> pkgbuild 1.0.8 2020-05-07 [1] CRAN (R 4.0.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 4.0.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
#> processx 3.4.2 2020-02-09 [1] CRAN (R 4.0.0)
#> ps 1.3.3 2020-05-08 [1] CRAN (R 4.0.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
#> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0)
#> remotes 2.1.1 2020-02-15 [1] CRAN (R 4.0.0)
#> rlang 0.4.6 2020-05-02 [1] CRAN (R 4.0.0)
#> rmarkdown 2.1 2020-01-20 [1] CRAN (R 4.0.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0)
#> scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0)
#> tibble 3.0.1 2020-04-20 [1] CRAN (R 4.0.0)
#> tidyr 1.0.3 2020-05-07 [1] CRAN (R 4.0.0)
#> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.0)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 4.0.0)
#> vctrs 0.2.99.9011 2020-05-04 [1] Github (r-lib/vctrs#0ca806c)
#> visdat 0.5.3 2019-02-15 [1] CRAN (R 4.0.0)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0)
#> xfun 0.13 2020-04-13 [1] CRAN (R 4.0.0)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
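If updating the packages is not an option, a grouped count of missing values can be approximated with dplyr and tidyr alone. This is only a fallback sketch, not naniar's implementation: it uses the built-in airquality data as a stand-in, assumes dplyr 1.0.0 or later for across(), and reports counts only (no pct_miss column).

```r
library(dplyr)
library(tidyr)

miss_by_month <- airquality %>%
  group_by(Month) %>%
  # count NAs in every non-grouping column, one row per Month
  summarise(across(everything(), ~ sum(is.na(.x))), .groups = "drop") %>%
  # reshape to one row per (Month, variable), similar in shape to
  # the grouped miss_var_summary() output
  pivot_longer(-Month, names_to = "variable", values_to = "n_miss")

miss_by_month
```

The same pipeline should work on other data by swapping in the data frame and grouping variable.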
