Wrap function passing NULL to lower-level haven::read_dta function in R

I am trying to build a function wrapping haven::read_dta(), similar to the wrap_function() defined in the code below.
My wrap_function() has a default variables = NULL, which should pass NULL to haven::read_dta()'s col_select argument when no values are specified. However, passing the NULL from variables to col_select throws an error ('Error: Can't find any columns matching col_select in data.').
Can someone help me understand why this happens and how I could build a wrap_function() capable of passing a NULL default value to the lower-level function?
Thanks!
library(reprex)
library(haven)
df_ <- data.frame(a = 1:5,
                  b = letters[1:5])
haven::write_dta(df_,
                 path = "file.dta")
# works well:
haven::read_dta(file = "file.dta",
                col_select = NULL)
#> # A tibble: 5 x 2
#>       a b
#>   <dbl> <chr>
#> 1     1 a
#> 2     2 b
#> 3     3 c
#> 4     4 d
#> 5     5 e
# does not work:
wrap_function <- function(file, variables = NULL){
  haven::read_dta(file = file,
                  col_select = variables)
}
wrap_function("file.dta")
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(variables)` instead of `variables` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
#> Error: Can't find any columns matching `col_select` in data.
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.3 (2020-10-10)
#> os CentOS Linux 8
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> date 2021-05-14
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> cli 2.4.0 2021-04-05 [1] CRAN (R 4.0.3)
#> crayon 1.4.1.9000 2021-04-16 [1] Github (r-lib/crayon#965d1dc)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.3)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2)
#> fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.3)
#> forcats 0.5.1 2021-01-27 [1] CRAN (R 4.0.3)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> haven * 2.3.1 2020-06-01 [1] CRAN (R 4.0.2)

TL;DR: You just need to embrace the argument by wrapping it in double curly braces, {{ }}, sometimes called "curly-curly". This passes the variable through properly. See the programming with dplyr vignette for more info.
wrap_function <- function(file, variables = NULL){
  haven::read_dta(file = file,
                  col_select = {{ variables }})
}
wrap_function("file.dta")
#> # A tibble: 5 x 2
#>       a b
#>   <dbl> <chr>
#> 1     1 a
#> 2     2 b
#> 3     3 c
#> 4     4 d
#> 5     5 e
Unfortunately, it's a little hard to see that this is necessary without looking at the source. If you look at the haven repository, you can see that read_dta uses curly-curly around col_select as well. That's a pretty good indication that you need to use it in your wrapper function.
If you look further, read_dta uses the braces to pass the argument down to a helper function, skip_cols, which evaluates it inside tidyselect::vars_select. The embracing is needed so that evaluation of the argument is delayed until the point where it is actually needed. In other words, it lets you call the function like this:
wrap_function("file.dta", variables = a)
instead of forcing you to do something like
wrap_function("file.dta", variables = "a")
and saves you a lot of typed quotes, especially with many columns. You see this pattern throughout dplyr and other tidyverse functions, especially whenever an argument refers to a data-frame column rather than an ordinary variable.
Put differently, you don't want the code to resolve what a is until it reaches skip_cols, which knows that a refers to a column inside the file you're reading. Without the curly braces, R assumes a is some object in your working environment.
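Once the argument is embraced, your wrapper inherits the full tidyselect behaviour of col_select. A minimal sketch of the calls it now supports, using the same file.dta written above:
wrap_function("file.dta", variables = a)                 # bare column name
wrap_function("file.dta", variables = c(a, b))           # several columns
wrap_function("file.dta", variables = starts_with("a"))  # tidyselect helper
wrap_function("file.dta")                                # NULL default reads all columns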

Is there a way to create a custom metric for use with tune_grid() in tidymodels that allows for a grouped data.frame/tibble?

What I'd like to do
I am trying to build a model in tidymodels that will predict the efficacy of drugs on cell lines (like bacteria). The model will rank drugs by efficacy for a given cell line, so I want to use Spearman's correlation (ρ) as a metric. In the following example data set, each cell line (column Sample) is represented by a letter, Q, R, S, ..., Z, and each sample was treated with 50 drugs.
When I split the data for cross-validation, the training/test splits for each fold will have >1 cell line (e.g. Q, R in the test split for fold 1), but in calculating the metric (ρ), I want to calculate it for each cell line individually and then take the average across all the cell lines in the test split, rather than for all the observations in aggregate. For example, if the test split for fold 1 consists of Q, R, then I want to calculate ρ for the 50 drugs tested against Q, then a separate ρ for the 50 drugs tested against R, average these two ρ, and have that average be the metric calculated for fold 1.
What I've tried
I was thinking that I'd have to calculate the metric on a tibble/data.frame grouped by the Sample column, but I can't figure out how to pass that variable into tune_grid(). I don't think I can include the variable in add_formula() when creating the workflow object, since I don't want it as a predictor variable. I just discovered tidymodels yesterday, so maybe there's a straightforward solution I'm unaware of, but I haven't been able to find anything on Google so far. The code below is what I've tried, but obviously it doesn't work. Thank you in advance for any advice you can give.
Error
i Resample1: preprocessor 1/1
✓ Resample1: preprocessor 1/1
i Resample1: preprocessor 1/1, model 1/20
✓ Resample1: preprocessor 1/1, model 1/20
i Resample1: preprocessor 1/1, model 1/20 (predictions)
x Resample1: internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm ...
i Resample2: preprocessor 1/1
✓ Resample2: preprocessor 1/1
i Resample2: preprocessor 1/1, model 1/20
✓ Resample2: preprocessor 1/1, model 1/20
i Resample2: preprocessor 1/1, model 1/20 (predictions)
x Resample2: internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm ...
i Resample3: preprocessor 1/1
✓ Resample3: preprocessor 1/1
i Resample3: preprocessor 1/1, model 1/20
✓ Resample3: preprocessor 1/1, model 1/20
i Resample3: preprocessor 1/1, model 1/20 (predictions)
x Resample3: internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm ...
i Resample4: preprocessor 1/1
✓ Resample4: preprocessor 1/1
i Resample4: preprocessor 1/1, model 1/20
✓ Resample4: preprocessor 1/1, model 1/20
i Resample4: preprocessor 1/1, model 1/20 (predictions)
x Resample4: internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm ...
i Resample5: preprocessor 1/1
✓ Resample5: preprocessor 1/1
i Resample5: preprocessor 1/1, model 1/20
✓ Resample5: preprocessor 1/1, model 1/20
i Resample5: preprocessor 1/1, model 1/20 (predictions)
x Resample5: internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm ...
Warning message:
All models failed. See the `.notes` column.
Upon running glmnet_tuning_results:
Warning message:
This tuning result has notes. Example notes on model fitting include:
internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm = ~na_rm)
internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm = ~na_rm)
internal: Error: In metric: `spearman_cor`
unused arguments (truth = ~TargetVariable, estimate = ~.pred, na_rm = ~na_rm)
Code
Example data set
data = tibble(
  Sample = rep(LETTERS[17:26], each = 50),
  TargetVariable = rnorm(500, mean = 0, sd = 1),
  PredictorVariable1 = rnorm(500, mean = 5, sd = 1),
  PredictorVariable2 = rpois(500, lambda = 5)
)
Model
# Splitting for cross-validation.
set.seed(1026)
folds = group_vfold_cv(data, group = Sample, v = 5)
# Model specification.
glmnet_model = linear_reg(
  mode = "regression",
  penalty = tune(),
  mixture = tune()
) %>%
  set_engine("glmnet")
# Workflow.
glmnet_wf = workflow() %>%
  add_model(glmnet_model) %>%
  add_formula(TargetVariable ~ . - Sample)
# Grid specification.
glmnet_params = parameters(penalty(), mixture())
set.seed(1026)
glmnet_grid = grid_max_entropy(glmnet_params, size = 20)
# Hyperparameter tuning.
glmnet_tuning_results = tune_grid(
  glmnet_wf,
  resamples = folds,
  grid = glmnet_grid,
  metrics = metric_set(spearman_cor),
  control = control_grid(verbose = TRUE)
)
glmnet_tuning_results %>% show_best(n = 10)
Custom metric
# Vector version.
spearman_cor_vec = function(truth, estimate, na_rm = TRUE) {
  spearman_cor_impl = function(truth, estimate) {
    cor(truth, estimate, method = "spearman")
  }
  metric_vec_template(
    metric_impl = spearman_cor_impl,
    truth = truth,
    estimate = estimate,
    na_rm = na_rm,
    cls = "numeric"
  )
}
# Data frame version.
spearman_cor = function(data) {
  UseMethod("spearman_cor")
}
spearman_cor = new_numeric_metric(spearman_cor, direction = "maximize")
spearman_cor.data.frame = function(data, truth, estimate, na_rm = TRUE) {
  data_grouped = data %>%
    group_by(Sample)
  metric_summarizer(
    metric_nm = "spearman_cor",
    metric_fn = spearman_cor_vec,
    data = data_grouped,
    truth = !! enquo(truth),
    estimate = !! enquo(estimate),
    na_rm = na_rm
  )
}
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.3 (2020-02-29)
#> os macOS Catalina 10.15.7
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Chicago
#> date 2021-08-25
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> backports 1.1.6 2020-04-05 [1] CRAN (R 3.6.2)
#> cli 3.0.1 2021-07-17 [1] CRAN (R 3.6.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 3.6.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 3.6.2)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.0)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
#> glue 1.4.0 2020-04-03 [1] CRAN (R 3.6.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
#> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 3.6.2)
#> knitr 1.27 2020-01-16 [1] CRAN (R 3.6.0)
#> lifecycle 1.0.0 2021-02-15 [1] CRAN (R 3.6.2)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 3.6.2)
#> pillar 1.6.2 2021-07-29 [1] CRAN (R 3.6.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 3.6.2)
#> Rcpp 1.0.4.6 2020-04-09 [1] CRAN (R 3.6.1)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 3.6.2)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 3.6.2)
#> rmarkdown 2.1 2020-01-20 [1] CRAN (R 3.6.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 3.6.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
#> stringi 1.4.5 2020-01-11 [1] CRAN (R 3.6.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
#> styler 1.5.1 2021-07-13 [1] CRAN (R 3.6.2)
#> tibble 3.1.3 2021-07-23 [1] CRAN (R 3.6.2)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.0)
#> vctrs 0.3.8 2021-04-29 [1] CRAN (R 3.6.2)
#> withr 2.4.2 2021-04-18 [1] CRAN (R 3.6.2)
#> xfun 0.12 2020-01-13 [1] CRAN (R 3.6.0)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
To make your custom metric work, you were just missing some `...` so arguments could be passed through:
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
spearman_cor_vec <- function(truth, estimate, na_rm = TRUE) {
  spearman_cor_impl <- function(truth, estimate) {
    cor(truth, estimate, method = "spearman")
  }
  metric_vec_template(
    metric_impl = spearman_cor_impl,
    truth = truth,
    estimate = estimate,
    na_rm = na_rm,
    cls = "numeric"
  )
}
spearman_cor <- function(data, ...) { ## these dots were missing
  UseMethod("spearman_cor")
}
spearman_cor <- new_numeric_metric(spearman_cor, direction = "maximize")
spearman_cor.data.frame <- function(data, truth, estimate, na_rm = TRUE) {
  data_grouped <- data %>%
    group_by(Sample)
  metric_summarizer(
    metric_nm = "spearman_cor",
    metric_fn = spearman_cor_vec,
    data = data_grouped,
    truth = !! enquo(truth),
    estimate = !! enquo(estimate),
    na_rm = na_rm
  )
}
This makes it so you can use this metric on a dataset like so:
df <- tibble(
  Sample = rep(LETTERS[17:26], each = 50),
  TargetVariable = rnorm(500, mean = 0, sd = 1),
  Pred1 = rnorm(500, mean = 5, sd = 1),
  Pred2 = rpois(500, lambda = 5)
)
df %>%
  mutate(.pred = TargetVariable + rnorm(500, mean = 0, sd = 0.2)) %>%
  spearman_cor(TargetVariable, .pred)
#> # A tibble: 10 × 4
#>    Sample .metric      .estimator .estimate
#>    <chr>  <chr>        <chr>          <dbl>
#>  1 Q      spearman_cor standard       0.980
#>  2 R      spearman_cor standard       0.975
#>  3 S      spearman_cor standard       0.983
#>  4 T      spearman_cor standard       0.985
#>  5 U      spearman_cor standard       0.978
#>  6 V      spearman_cor standard       0.963
#>  7 W      spearman_cor standard       0.975
#>  8 X      spearman_cor standard       0.979
#>  9 Y      spearman_cor standard       0.987
#> 10 Z      spearman_cor standard       0.969
Created on 2021-08-31 by the reprex package (v2.0.1)
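To get the single number the question asks for (the mean of the per-sample correlations), a minimal follow-up sketch on the output above:
df %>%
  mutate(.pred = TargetVariable + rnorm(500, mean = 0, sd = 0.2)) %>%
  spearman_cor(TargetVariable, .pred) %>%
  summarise(mean_spearman = mean(.estimate))  # average rho across the 10 samples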
However, this doesn't totally solve your problem: for the tuning functions, we typically pass only the predictors and outcomes, not extra variables with other roles. I worked with this a little and couldn't quite figure out a way to give the tuning function a variable used only for computing metrics and not for fitting. I don't believe we support this right now; you might want to create a reprex, explain your use case, and post an issue on the tune repo so we can prioritize a new feature like this.
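In the meantime, one possible workaround (my own sketch, not an officially supported pattern) is to tune with any built-in metric while saving the out-of-sample predictions, then join Sample back in by row index and compute the grouped metric yourself:
# Assumes `data`, `folds`, `glmnet_wf`, and `glmnet_grid` from the question.
glmnet_tuning_results <- tune_grid(
  glmnet_wf,
  resamples = folds,
  grid = glmnet_grid,
  metrics = metric_set(rmse),  # placeholder metric just to drive the tuning
  control = control_grid(verbose = TRUE, save_pred = TRUE)
)

collect_predictions(glmnet_tuning_results) %>%
  mutate(Sample = data$Sample[.row]) %>%  # .row indexes the original data
  group_by(penalty, mixture, Sample) %>%
  summarise(rho = cor(TargetVariable, .pred, method = "spearman"),
            .groups = "drop") %>%
  group_by(penalty, mixture) %>%
  summarise(mean_spearman = mean(rho), .groups = "drop") %>%
  arrange(desc(mean_spearman))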

Polynomial Function Expansion in R

I am currently reviewing this question on SO and see that the OP stated that by adding more for loops you can expand the polynomials to higher orders. How exactly would you do that? I am trying to expand to polyorder 5.
Polynomial feature expansion in R
Here is the code below:
polyexp = function(df){
  df.polyexp = df
  colnames = colnames(df)
  for (i in 1:ncol(df)){
    for (j in i:ncol(df)){
      colnames = c(colnames, paste0(names(df)[i], '.', names(df)[j]))
      df.polyexp = cbind(df.polyexp, df[,i]*df[,j])
    }
  }
  names(df.polyexp) = colnames
  return(df.polyexp)
}
Ultimately, I'd like to order the matrix so that it expands in order of degree. I tried using the poly function but I'm not sure if you can order the result so that it returns a matrix that starts with degree 1 then moves to degree 2, then 3, 4, and 5.
To "sort by degree" is a little ambiguous. x^2 and x*y both have degree 2. I'll assume you want to sort by total degree, and then within each of those, by degree of the 1st column; within that, by degree of the second column, etc. (I believe the default is to ignore total degree and sort by degree of the last column, within that the second last, and so on, but this is not documented so I won't count on it.)
Here's how to use polym to do this. The columns are named things like "2.0" or "1.1". You could sort these alphabetically and it would be fine up to degree 9, but if you convert those names using as.numeric_version, there's no limit. So convert the column names to version names, get the sort order, and use that plus degree to re-order the columns of the result. For example,
df <- data.frame(x = 1:6, y = 0:5, z = -(1:6))
expanded <- polym(as.matrix(df), degree = 5)
o <- order(attr(expanded, "degree"),
           as.numeric_version(colnames(expanded)))
sorted <- expanded[, o]
# That lost the attributes, so put them back
attr(sorted, "degree") <- attr(expanded, "degree")[o]
attr(sorted, "coefs") <- attr(expanded, "coefs")
class(sorted) <- class(expanded)
# If you call predict(), it comes out in the default order,
# so will need sorting too:
predict(sorted, newdata = as.matrix(df[1, ]))[, o]
#>       0.0.1       0.1.0       1.0.0       0.0.2       0.1.1       0.2.0
#>  0.59761430 -0.59761430 -0.59761430  0.54554473 -0.35714286  0.54554473
#>       1.0.1       1.1.0       2.0.0       0.0.3       0.1.2       0.2.1
#> -0.35714286  0.35714286  0.54554473  0.37267800 -0.32602533  0.32602533
#>       0.3.0       1.0.2       1.1.1       1.2.0       2.0.1       2.1.0
#> -0.37267800 -0.32602533  0.21343368 -0.32602533  0.32602533 -0.32602533
#>       3.0.0       0.0.4       0.1.3       0.2.2       0.3.1       0.4.0
#> -0.37267800  0.18898224 -0.22271770  0.29761905 -0.22271770  0.18898224
#>       1.0.3       1.1.2       1.2.1       1.3.0       2.0.2       2.1.1
#> -0.22271770  0.19483740 -0.19483740  0.22271770  0.29761905 -0.19483740
#>       2.2.0       3.0.1       3.1.0       4.0.0       0.0.5       0.1.4
#>  0.29761905 -0.22271770  0.22271770  0.18898224  0.06299408 -0.11293849
#>       0.2.3       0.3.2       0.4.1       0.5.0       1.0.4       1.1.3
#>  0.20331252 -0.20331252  0.11293849 -0.06299408 -0.11293849  0.13309928
#>       1.2.2       1.3.1       1.4.0       2.0.3       2.1.2       2.2.1
#> -0.17786140  0.13309928 -0.11293849  0.20331252 -0.17786140  0.17786140
#>       2.3.0       3.0.2       3.1.1       3.2.0       4.0.1       4.1.0
#> -0.20331252 -0.20331252  0.13309928 -0.20331252  0.11293849 -0.11293849
#>       5.0.0
#> -0.06299408
Created on 2020-03-21 by the reprex package (v0.3.0)
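If you would rather extend the raw for-loop approach from the linked question than use orthogonal polynomials, here is a minimal base-R sketch (my own generalization, not from the original answer) that builds every raw monomial up to degree 5 and orders the columns by total degree:
polyexp_d <- function(df, degree = 5) {
  # every exponent combination for the columns of df
  exps <- expand.grid(rep(list(0:degree), ncol(df)))
  # keep total degree 1..degree, then sort by total degree
  exps <- exps[rowSums(exps) >= 1 & rowSums(exps) <= degree, , drop = FALSE]
  exps <- exps[order(rowSums(exps)), , drop = FALSE]
  # one column per exponent vector: the product of df[[i]]^e[i]
  out <- apply(exps, 1, function(e) Reduce(`*`, Map(`^`, df, e)))
  colnames(out) <- apply(exps, 1, function(e)
    paste0(names(df), "^", e, collapse = "."))
  as.data.frame(out)
}

df <- data.frame(x = 1:6, y = 0:5, z = -(1:6))
ncol(polyexp_d(df, degree = 5))  # 55 raw monomials of total degree 1 to 5
Unlike polym(), these are raw powers rather than orthogonal polynomials, so expect collinearity if you feed them into a regression.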

purrr::map_int: Can't coerce element 1 from a double to a integer

I am having the weirdest bug with map_int from the purrr package.
# Works as expected
purrr::map_int(1:10, function(x) x)
#> [1] 1 2 3 4 5 6 7 8 9 10
# Why on earth is that not working?
purrr::map_int(1:10, function(x) 2*x)
#> Error: Can't coerce element 1 from a double to a integer
# or that?
purrr::map_int(1:10, round)
#> Error: Can't coerce element 1 from a double to a integer
Created on 2019-03-28 by the reprex package (v0.2.1)
I run R 3.5.2 in a rocker container (Debian) with the latest GitHub version of everything:
sessioninfo::package_info("purrr")
#> package * version date lib source
#> magrittr 1.5.0.9000 2019-03-28 [1] Github (tidyverse/magrittr#4104d6b)
#> purrr 0.3.2.9000 2019-03-28 [1] Github (tidyverse/purrr#25d84f7)
#> rlang 0.3.2.9000 2019-03-28 [1] Github (r-lib/rlang#9376215)
#>
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/local/lib/R/library
2*x is not an integer because 2 is not. Do this instead:
purrr::map_int(1:10, function(x) 2L*x)
The documentation from help(map) says:
The output of .f will be automatically typed upwards, e.g. logical -> integer -> double -> character
It appears to be following the larger ordering given in help(c). For example, this produces an error: map_dbl(1:10, ~ complex(real = .x, imaginary = 1)).
NULL < raw < logical < integer < double < complex < character < list <
expression
As you can see in that ordering, double-to-integer is a downward conversion. So, the function is designed to not do this kind of conversion.
The solution is to either write a function .f which outputs integer (or lower) classed objects (as in @Stéphane Laurent's answer), or just use as.integer(map(.x, .f)).
This is a kind of type-checking, which can be a useful feature for preventing programming mistakes.
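Both fixes side by side, as a minimal sketch:
library(purrr)

# keep the computation in integers, so no downward conversion is needed
map_int(1:10, ~ 2L * .x)
#> [1]  2  4  6  8 10 12 14 16 18 20

# or let map() return doubles and convert afterwards
as.integer(map(1:10, ~ 2 * .x))
#> [1]  2  4  6  8 10 12 14 16 18 20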

Plot elements appear outside plot region with cairo_pdf() but not pdf()

I don't know much about graphic devices etc. All I want to do is to save plots to PDF and to embed fonts.
I use cairo_pdf() for this, but I noticed that sometimes plot elements are drawn outside of the box/plot region (see the screenshots of the PDFs). I can reproduce the issue on different Windows machines and different R versions, with the packages cairoDevice or Cairo, and with other drawing functions such as lines(). Plots saved via pdf(), however, look fine.
My questions are:
Is this reproducible? If yes, is this a bug and where?
Are there any other situations where cairo_pdf() plots look different compared to pdf() plots? Are there any other disadvantages of using cairo_pdf()?
Below are screenshots of details from the whole PDFs illustrating the differences. Note that, in the left image, the axis overlaps some of the points.
capabilities("cairo")
#> cairo
#> TRUE
set.seed(123456)
N <- 10000
v1 <- rnorm(N)
v2 <- rnorm(N)
v3 <- ifelse(v1 > 1.02 | v2 > 1.02 | v1 < -.02 | v2 < -.02, 2, 1)
cairo_pdf("plot1.pdf")
plot(v1, v2, xlim = 0:1, ylim = 0:1, col = v3, pch = 16)
dev.off()
#> null device
#> 1
pdf("plot2.pdf")
plot(v1, v2, xlim = 0:1, ylim = 0:1, col = v3, pch = 16)
dev.off()
#> null device
#> 1
devtools::session_info()
#> Session info ------------------------------------------------------------------
#> setting value
#> version R version 3.4.2 (2017-09-28)
#> system x86_64, mingw32
#> ui Rgui
#> language (EN)
#> collate German_Germany.1252
#> tz Europe/Berlin
#> date 2018-03-09
#>
#> Packages ----------------------------------------------------------------------
#> package * version date source
#> base * 3.4.2 2017-09-28 local
#> compiler 3.4.2 2017-09-28 local
#> datasets * 3.4.2 2017-09-28 local
#> devtools 1.13.5 2018-02-18 CRAN (R 3.4.3)
#> digest 0.6.15 2018-01-28 CRAN (R 3.4.3)
#> graphics * 3.4.2 2017-09-28 local
#> grDevices * 3.4.2 2017-09-28 local
#> memoise 1.1.0 2017-04-21 CRAN (R 3.4.1)
#> methods * 3.4.2 2017-09-28 local
#> stats * 3.4.2 2017-09-28 local
#> utils * 3.4.2 2017-09-28 local
#> withr 2.1.1 2017-12-19 CRAN (R 3.4.3)
This bug is fixed in R 3.6.0.
From the NEWS:
The cairo_pdf graphics device (and other Cairo-based devices) now clip correctly to the right and bottom border.
There was an off-by-one-pixel bug, reported by Lee Kelvin.
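If you have to support R versions older than 3.6.0, one defensive option is to fall back to pdf() there. A minimal sketch, where save_plot_pdf() is a hypothetical helper of mine, not part of base R:
save_plot_pdf <- function(file, expr) {
  # use cairo_pdf() only where the clipping fix is available
  if (getRversion() >= "3.6.0" && capabilities("cairo")) {
    cairo_pdf(file)
  } else {
    pdf(file)  # clips correctly on older R, at the cost of Cairo's font handling
  }
  on.exit(dev.off(), add = TRUE)
  force(expr)  # the plot code is evaluated here, while the device is open
}

save_plot_pdf("plot3.pdf",
              plot(v1, v2, xlim = 0:1, ylim = 0:1, col = v3, pch = 16))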

How should I use the uq() function inside a package?

I'm puzzled by the behaviour of the uq() function. The behaviour is not the same when I use uq() or lazyeval::uq().
Here is my reproducible example:
First, I generate a fake dataset
library(tibble)
library(lazyeval)
fruits <- c("apple", "banana", "peanut")
price <- c(5,6,4)
table_fruits <- tibble(fruits, price)
Then I write a toy function, toy_function_v1, using only uq():
toy_function_v1 <- function(data, var) {
  lazyeval::f_eval(f = ~ uq(var), data = data)
}
and a second function using lazyeval::uq():
toy_function_v2 <- function(data, var) {
  lazyeval::f_eval(f = ~ lazyeval::uq(var), data = data)
}
Surprisingly, the output of v1 and v2 is not the same:
> toy_function_v1(data = table_fruits, var = ~ price)
[1] 5 6 4
> toy_function_v2(data = table_fruits, var = ~ price)
price
Is there any explanation?
I know it's good practice to use the package::function() syntax to call functions inside a new package. So what's the best solution in this case?
Here is my session_info:
> devtools::session_info()
Session info ----------------------------------------------------------------------------------------------------------------------------------------------------
setting value
version R version 3.3.1 (2016-06-21)
system x86_64, linux-gnu
ui RStudio (1.0.35)
language (EN)
collate C
tz <NA>
date 2016-11-07
Packages --------------------------------------------------------------------------------------------------------------------------------------------------------
package * version date source
Rcpp 0.12.7 2016-09-05 CRAN (R 3.2.3)
assertthat 0.1 2013-12-06 CRAN (R 3.2.2)
devtools 1.12.0 2016-06-24 CRAN (R 3.2.3)
digest 0.6.10 2016-08-02 CRAN (R 3.2.3)
lazyeval * 0.2.0.9000 2016-10-14 Github (hadley/lazyeval#c155c3d)
memoise 1.0.0 2016-01-29 CRAN (R 3.2.3)
tibble * 1.2 2016-08-26 CRAN (R 3.2.3)
withr 1.0.2 2016-06-20 CRAN (R 3.2.3)
It's just a bug in the uq() function. The issue is open on GitHub: https://github.com/hadley/lazyeval/issues/78.
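Until the fix lands, a minimal workaround sketch for package code, assuming lazyeval is listed in your Imports: import uq() into your package namespace (for example via roxygen2) and call it unqualified, as in toy_function_v1(), so R CMD check still knows where uq() comes from:
#' @importFrom lazyeval f_eval uq
toy_function <- function(data, var) {
  f_eval(f = ~ uq(var), data = data)
}

toy_function(data = table_fruits, var = ~ price)  # returns 5 6 4, like toy_function_v1()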
