Following on from my previous challenging exercise: promise already under evaluation with nesting from function, I have learnt thus far how to properly use: enquos, !!!, c() within a function for a variety of calling methods. However, my next challenge is more complex - I want to call a function within a function, and only passing it parameters from the outer function. Essentially, I wanted to make a list of functions and pass different parameters to each element from the list by using another function.
for example:
anotherTest <- function(data,...){
cols = enquos(...)
testFunc <- function(df, more){
df %>% mutate(!!!c(more))
}
n <- length(cols)
addMutation <- replicate(n, testFunc, simplify=FALSE)
print(addMutation)
addCars <- replicate(n, data)
mapply(function(x, y, z) x %>% reduce(., y, z),addCars, addMutation, cols)
}
When I call:
anotherTest(mtcars, vs, gear, am)
I get this error:
Error in fn(out, elt, ...) : unused argument (~vs)
We could try
anotherTest <- function(data,...){
cols = enquos(...)
testFunc <- function(df, more){
df %>% mutate(!!!c(more))
}
n <- length(cols)
addMutation <- replicate(n, testFunc, simplify=FALSE)
addCars <- replicate(n, data, simplify = FALSE)
Map(function(x, y, z) y(x, z), addCars, addMutation, cols)
}
-testing
out <- anotherTest(mtcars, vs, gear, am)
> lapply(out, head, 3)
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
[[2]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
[[3]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Related
I'm using the mtcars dataset for this example.
I have a function which creates a named list using a variable:
make_list <- function(df, variable_name) {
a <- df %>%
list(variable_name = .)
return(a)
}
When I use this function:
mylist <- make_list(mtcars, "car_info")
head(mylist)
$variable_name
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
The list name is called variable_name, rather than car_info.
How do I change the function (but still use a pipe format) so that the correct name is returned?
If you want to continue using the pipe, you can use setNames:
make_list <- function(df, variable_name) {
df %>%
list%>%
setNames(variable_name)
}
make_list(mtcars, "car_info")
Output:
$car_info
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
make_list <- function(df, variable_name) {
a <- df %>% list
names(a) <- variable_name
return(a)
}
Try this:
make_list <- function(df, variable_name) {
a <- df %>%
list()
names(a) <- variable_name
return(a)
}
mylist <- make_list(mtcars, "car_info")
Output (Some rows):
mylist
$car_info
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
rlang has a list2 function that does that
make_list <- function(df, variable_name) {
rlang::list2(!! variable_name := df)
}
make_list(mtcars, "car_info")
#> $car_info
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Or tibble::lst works the same: make_list <- function(df, variable_name) tibble::lst(!! variable_name := df)
I need to reshape my dataframe based on values on columns 'group', 'model' and 'qsec'. Using the mtcars data as an example, would be possible to use dplyr or tidyr to achieve this?
I need to reshape a very large dataframe, and unfortunately i don't know how to use dplyr to resolve this problem.
df<-mtcars[1:3,]
df$model <- rownames(df)
row.names(df)<-NULL
df<-df[rep(seq_len(nrow(df)), each=2),]
df$group <- c("A","B","A","C","A","B")
####
> df
mpg cyl disp hp drat wt qsec vs am gear carb model group
1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4 A
1.1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4 B
2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Mazda RX4 Wag A
2.1 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Mazda RX4 Wag C
3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Datsun 710 A
3.1 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Datsun 710 B
#
I need to reshape in a way that each column would be a group (A,B or C) and as the row.names, each one of the car model belonging to each group. The values would be the ones on 'qsec' columns, with zeros for cases that a group don't have any 'qsec' value.
df_result <- data.frame(A = c(16.46,17.02,18.61), B = c(16.46,0,18.61), C = c(0,17.02,0))
row.names(df_result) <- unique(df$model)
> df_result
A B C
Mazda RX4 16.46 16.46 0.00
Mazda RX4 Wag 17.02 0.00 17.02
Datsun 710 18.61 18.61 0.00
I think this is what you want:
library(tidyverse)
df2 <- df%>%
select(group, model, qsec)%>%
spread(key = group, value = qsec)
row.names(df2) <- df2$model
df_final <- df2[-1]
Using dcast from reshape2 and replace_na from tidyr:
df <- df %>%
select(model, qsec, group) %>%
dcast(model ~ group, value.var = "qsec") %>%
replace_na(list(A = 0, B = 0, C = 0))
I have a function f1 on a dataframe which calls another function f2 and then does stuff with f2's output. f2 works interactively on its own but how do I get it to run when called by f1?
f1 <- function(x, y) {
z <- f2(x, y)
# do stuff with z
w <- z
return(w)
}
f2 uses subset() to de-select certain columns:
f2 <- function(x, y) {
y <- substitute(y)
subset(x, select = -eval(y))
}
As you can see, f2 works interactively. I don't care about that, but I do want it to work when called by f1.
# This works fine interactively (but I don't care about that)
f2(mtcars,mpg)
# This is what I want to work
f1(mtcars,mpg)
Error in -eval(y) : invalid argument to unary operator
I would prefer not to change f1 or its arguments. How do I re-write f2 so that it works within f1?
Here is a similar question with solutions that I'm having trouble applying to my context: R: passing expression to an inner function
The easiest would be to use rlang and tidyverse functions for compatibility with quasiquotation :
library(dplyr)
library(rlang)
f1 <- function(x, y) {
z <- f2(x, -!!enquo(y))
# do stuff with z
w <- z
return(w)
}
f2 <- function(x, y) {
select(x, !!enquo(y))
}
f1(mtcars,mpg)
# cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 6 160.0 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 6 160.0 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 4 108.0 93 3.85 2.320 18.61 1 1 4 1
# ...
This works in base R :
f1 <- function(x, y) {
z <- f2(x, substitute(y))
# do stuff with z
w <- z
return(w)
}
f2 <- function(x, y) {
eval(substitute(subset(x, select = -Y), list(Y = y)))
}
f1(mtcars,mpg)
# cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 6 160.0 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 6 160.0 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 4 108.0 93 3.85 2.320 18.61 1 1 4 1
# ...
With the newer version of rlang 0.4.0, we can use the {{...}} (curly-curly) that makes it easier to do the evaluation
library(rlang)
f1 <- function(x, y) {
z <- f2(x, -{{y}})
# do stuff with z
w <- z
return(w)
}
f2 <- function(x, y) {
select(x, {{y}})
}
f1(mtcars,mpg) %>%
head
# cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 6 225 105 2.76 3.460 20.22 1 0 3 1
I'm learning R programming as such have hit a few problems - and with your help have been able to fix them.
But I now have a need to rename columns of a data frame. I have a translation data frame with 2 columns that contains the column names and what the new columns should be called.
Here is my code: my question is how do I select the two columns from the trans dataframe and use them here as trans$old and trans$new variables?
I have 7 columns I'm renaming, and this might be even longer hence the translation table.
replace_header <- function()
{
names(industries)[names(industries)==trans$old] <- trans$new
replaced <- industries
return (replaced)
}
replaced_industries <- replace_header()
Here's an example using the built-in mtcars data frame. We'll use the match function to find the indices of the columns names we want to replace and then replace them with new names.
# Copy of built-in data frame
mt = mtcars
head(mt,3)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Data frame with column name substitutions
dat = data.frame(old=c("mpg","am"), new=c("new.name1","new.name2"), stringsAsFactors=FALSE)
dat
old new
1 mpg new.name1
2 am new.name2
Use match to find the indices of the "old" names in the mt data frame:
match(dat[,"old"], names(mt))
[1] 1 9
Substitute "old" names with "new" names:
names(mt)[match(dat[,"old"], names(mt))] = dat[,"new"]
head(mt,3)
new.name1 cyl disp hp drat wt qsec vs new.name2 gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
I'd recommend setnames from "data.table" for this. Using #eipi10's example:
mt = mtcars
dat = data.frame(old=c("mpg","am"), new=c("new.name1","new.name2"), stringsAsFactors=FALSE)
library(data.table)
setnames(mt, dat$old, dat$new)
names(mt)
# [1] "new.name1" "cyl" "disp" "hp" "drat" "wt"
# [7] "qsec" "vs" "new.name2" "gear" "carb"
If there's a concern as indicated by #jmbadia that the data.frame with the old and new names, you can add skip_absent=TRUE to setnames.
improving a bit the eipi10's answer, if we want to use a "rename dataframe" with old names not always present on the mt dataframe (e.g. because mt is provided by differnt sources so we don't always know its colnames) we can consider the following code
mt = mtcars
head(mt,3)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# dataframe with possible names to replace
dat = data.frame(old=c("strangeName","am"), new=c("new.name1","new.name2"), stringsAsFactors=FALSE)
# find which old names are present in mt
namesMatched <- dat[dat$old %in% names(mt)
#renaming
names(mt)[match(namesMatched,"old"], names(mt))] = dat[namesMatched,"new"]
head(mt,3)
mpg cyl disp hp drat wt qsec vs new.name2 gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
I get this below error when I am running the below code to read my MySQL server table.
my_data <- dbReadTable(mydb, "ar_data")
Warning message:
row.names not set (duplicate elements in field)
Is there any way through which I don't ask R to read the row names. My table is fine and I don't want to make any changes to my MySQL table.
Here are a few options:
library(RMySQL)
library(DBI)
drv <- dbDriver("MySQL")
con <- dbConnect (drv, dbname="mydb", user="username")
data <- mtcars; rownames(data) <- NULL; data$row_names <- rownames(mtcars)[1]
dbWriteTable(con, "mtcars", data, overwrite = T, row.names = F)
head( dbReadTable(con, "mtcars"), 3 )
# mpg cyl disp hp drat wt qsec vs am gear carb row_names
# 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4
# 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Mazda RX4
# 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Mazda RX4
# Warning message:
# row.names not set (duplicate elements in field)
# suppress warnings
head( suppressWarnings(dbReadTable(con, "mtcars")), 3 )
# mpg cyl disp hp drat wt qsec vs am gear carb row_names
# 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4
# 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Mazda RX4
# 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Mazda RX4
# rename column row_names to rn
dbSendQuery(con, "ALTER TABLE mtcars CHANGE COLUMN row_names rn TEXT")
head( dbReadTable(con, "mtcars"), 3 )
# rn mpg cyl disp hp drat wt qsec vs am gear carb
# 1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# 2 Mazda RX4 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# 3 Mazda RX4 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
dbSendQuery(con, 'DROP TABLE mtcars')
dbDisconnect(con)