R functions using data.frame$column not working - r

The two following functions don't work at the moment, but do work when I write them out in full - not sure why. Any suggestions for fixes would be great.
change_specific_column_name <- function(data.frame,old_column_name,new_column_name){
names(data.frame)[names(data.frame) == old_column_name] <- new_column_name
}
change_specific_observations_name <- function(data.frame, column_name, old_obseration, new_observation){
data.frame$column_name[which(data.frame$column_name == old_obseration)] <- new_observation
}
test_frame <- data.frame(Does=1,This=2,Work=3)
change_specific_column_name(test_frame,"Work","Happen") # this doesn't change the name of the column
names(test_frame)[names(test_frame) == "Work"] <- "Happen" # writing out the function does change the name

Although not exact, you can think of a function's arguments in R as passed by value, so changes made to the formal parameter inside the function don't affect the object you passed in from the calling environment.
If you really want a function to modify its argument, you could use a technique described, e.g., under "Call by reference in R": essentially, just wrap your assignment in eval.parent(substitute(…)).
change_specific_column_name <- function(data.frame, old_column_name, new_column_name)
eval.parent(substitute(names(data.frame)[names(data.frame) == old_column_name] <- new_column_name))
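A more conventional alternative, sketched here under the assumption that reassigning the result at the call site is acceptable, is to return the modified data frame. The sketch also looks the column up with [[ ]] instead of $, because data.frame$column_name searches for a column literally named "column_name". The argument name df and the sample values are illustrative only.

```r
# Return the modified copy instead of trying to change the caller's object.
change_specific_column_name <- function(df, old_column_name, new_column_name) {
  names(df)[names(df) == old_column_name] <- new_column_name
  df
}

change_specific_observations_name <- function(df, column_name,
                                              old_observation, new_observation) {
  # [[ ]] uses the column whose name is stored in column_name;
  # df$column_name would look for a column called "column_name".
  rows <- which(df[[column_name]] == old_observation)
  df[[column_name]][rows] <- new_observation
  df
}

test_frame <- data.frame(Does = 1, This = 2, Work = 3)
test_frame <- change_specific_column_name(test_frame, "Work", "Happen")
test_frame <- change_specific_observations_name(test_frame, "This", 2, 20)
names(test_frame)  # "Does" "This" "Happen"
test_frame$This    # 20
```

The caller has to write test_frame <- f(test_frame, ...), but nothing outside the function is modified behind the caller's back.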

Related

Use weighted.mean in summary_rows GT package

I've been searching around for a solution to using weighted.mean with summary_rows in GT package.
The summary_rows function only accepts functions of the form foo(x), so functions with more arguments, such as weighted.mean(x, w), are not accepted.
When using summary_rows with groups, such as:
summary_rows(groups = T, columns = c, fns = list("average" = ~mean(.)),...)
It takes the vector of values for each group and runs it through mean(), or through each of the functions in the supplied list.
My solution to this is quite cumbersome. I wrote my own custom function, that takes the vector of values provided by summary_rows and compares it to expected vectors using if statements. This only works for single columns at a time so it is quite a lot of code, both in the custom functions and in the code for the GT table.
weighted_mean_age <- function (x) {
if (all(x == some.data$age.column[some.data$group.column == "group name"])) {
weighted.mean(x, some.data$no.occurences[some.data$group.column == "group name"])
} else if (another vector) {
And so on for every group.
}
}
Has anyone dealt with the same problem and come up with a less cumbersome solution? Did I miss something in the GT package?
Thank you for your time and ideas.
First I need to clarify the assumption that I used for this answer:
What you want is to pass something like weighted.mean(.,w) to this summary_rows() function.
However this isn't possible due to the problems with the gt library that you outlined in your question. If that is the case then I do believe I have a solution:
I've done some similar 'hacks' when writing some very specific Python scripts. They essentially revolved around storing the functions I wanted to use in a container and looking them up there. I checked whether something like this is also possible in R, and apparently it is, using factory functions stored in a container. Here is a step-by-step guide:
You first need to create a factory function for your weighted.mean as such:
my_mean <- function(w) { function(x) { weighted.mean(x,w) } }
then you need to populate some kind of a container with your new functions (I am using a list):
func_list <- list()
func_list[["some_weight"]] <- my_mean(some_weight) # key each closure by a name, not by the weight vector itself
func_list[["different_w"]] <- my_mean(different_w)
#etc...
Once you've done that you should be able to pass this as a function to summary_rows i.e.:
summary_rows(
groups = T,
columns = c,
fns = list("w_mean" = ~func_list[[w]](.)), # [[ ]] extracts the function itself; [ ] would return a list
...)
Bear in mind that you have to supply the w values yourself using some form of mapping function or loop.
Hope it is what you are looking for and I hope it helps!
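As a rough, base-R-only sketch of the factory idea above (it does not call gt at all), the closures can be built per group and stored in a list keyed by group name. The data frame some_data, its columns, and the name make_weighted_mean are made up for illustration.

```r
# Factory: returns a one-argument function that closes over the weights w.
make_weighted_mean <- function(w) {
  force(w)  # evaluate w now so each closure keeps its own weights
  function(x) weighted.mean(x, w)
}

# Hypothetical example data (not from the question).
some_data <- data.frame(
  group = c("a", "a", "b", "b"),
  age   = c(10, 20, 30, 50),
  n     = c(1, 3, 1, 1)
)

# One closure per group, stored in a list keyed by group name.
weights_by_group <- split(some_data$n, some_data$group)
func_list <- lapply(weights_by_group, make_weighted_mean)

func_list[["a"]](c(10, 20))  # weighted.mean(c(10, 20), c(1, 3)) = 17.5
func_list[["b"]](c(30, 50))  # weighted.mean(c(30, 50), c(1, 1)) = 40
```

Each element of func_list has the foo(x) shape that the question says summary_rows expects; how to select the right element per group inside summary_rows is left open here.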

"Capturing" the global environment

I'm using functions from an external package (that I cannot modify). These functions put a lot of stuff in the global environment, for instance the package does things like
the.data <<- data.frame(A=rnorm(10),B=rnorm(10),C=rnorm(10)) ## A sample dataset
package.plot <- function(){
x.coords <<- the.data$A/the.data$B
y.coords <<- the.data$C
plot(x.coords, y.coords)
}
(obviously a hyper-simplified example... the key here is that x.coords and y.coords are rather complex derivations, sufficiently complex that I do not want to recode them but find it advantageous to re-use the existing code)
I want to use these functions in my own scripts, namely make the same graph with ggplot. Of course, a first, obvious solution is
my.better.plot <- function(){
package.plot()
tibble(x.coords,y.coords) %>% ggplot(aes(x=x.coords,y=y.coords))+geom_point() # etc.
}
However, this has two issues:
I end up plotting twice (a minor issue, it is sufficiently fast to be unnoticeable);
I "pollute" the global environment with global x.coords and y.coords.
Hence, I would like to run package.plot() in a "pseudo-global" environment to avoid ending up with global variables that may be modified in an "uncontrolled" way.
A workaround, of course, is
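my.better.plot <- function(){
package.plot()
p <- tibble(x.coords,y.coords) %>% ggplot(aes(x=x.coords,y=y.coords))+geom_point() # etc.
rm(x.coords,y.coords,envir=.GlobalEnv) # remove both globals created by package.plot()
p # return the plot; ending on rm() would return NULL instead
}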
However, I'd prefer to do something like
my.better.and.cleaner.plot <- function(){
within.envir(dummy_env,my.better.plot)
}
.. assuming that there is, indeed, a function "within.envir" that allows to run its second argument in a mock global environment.
Is something like this possible at all ? I did read http://adv-r.had.co.nz/Environments.html , but could not find the answer... (not one that I understood, at any rate). Bonus question : if this is possible, how can I extract the return value of ggplot from dummy_env and return it ?
This function avoids the side effects as much as possible:
library(ggplot2)
library(magrittr)
library(tibble)
my.better.plot <- function(){
x.coords <- 1
y.coords <- 1
environment(package.plot) <- environment()
bmp(tempfile())
package.plot()
dev.off()
print(tibble(x.coords,y.coords) %>% ggplot(aes(x=x.coords,y=y.coords))+geom_point()) # etc.
}
my.better.plot()
#creates only the ggplot in the current device
ls(globalenv())
#[1] "my.better.plot" "package.plot" "the.data"
So this is unfortunately very hacky, since the <<- operator will traverse the environment tree upwards if it does not find the variable name (which is why you should basically never use it).
One workaround is to give the function an enclosing environment that already has the variables in question initialized. Then <<- assigns into those variables and does not traverse further up into the global environment. You need to know the variable names beforehand, though.
f <- function(x) a <<- x
f(5)
# a = 5 in GlobalEnv
rm(a)
CapturedCall <- function(fun, CapturedVars,...)
{
stopifnot(is.function(fun))
SandBox <- new.env()
for(varName in CapturedVars) assign(varName, NA,SandBox)
environment(fun) <- SandBox
fun(...)
}
CapturedCall(f,"a",1)
#Nothing in GlobalEnv
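For the bonus question of getting values back out, one possible variant of the sandbox idea is to return the captured variables as a list. This is only a sketch: package_fun is a toy stand-in for the external package function, and capture_globals is a made-up helper name.

```r
# Toy stand-in for a package function that writes to globals with <<-.
package_fun <- function() {
  x.coords <<- c(1, 2, 3)
  y.coords <<- c(4, 5, 6)
  invisible(NULL)
}

# Run fun with its enclosing environment replaced by a sandbox that already
# contains the named variables, then return those variables as a list.
capture_globals <- function(fun, captured_vars, ...) {
  stopifnot(is.function(fun), is.character(captured_vars))
  # Keep the original enclosure as parent so other lookups still resolve.
  sandbox <- new.env(parent = environment(fun))
  for (var_name in captured_vars) assign(var_name, NULL, envir = sandbox)
  environment(fun) <- sandbox  # changes only the local copy of fun
  fun(...)
  mget(captured_vars, envir = sandbox)
}

res <- capture_globals(package_fun, c("x.coords", "y.coords"))
res$x.coords                                                # 1 2 3
exists("x.coords", envir = globalenv(), inherits = FALSE)   # FALSE
```

The returned list can then be handed to tibble() and ggplot() without any global x.coords or y.coords being created. As with CapturedCall, the variable names must be known in advance.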

R - using substitute within a nested function

I have a function (Inner) that may end up being nested inside some other function (in general this function won't be known), which I'm calling Outer, and I would like Inner to produce the same result regardless of the wrapper function (Outer in the case below).
Inner <- function(x,baz,bang){
# code stuff things ...
x.prime = as.character(substitute(x))
return(c(x.prime,y,z))
}
Outer <- function(y){
Inner(y)
}
Inner(a)
# "a" "stuff" "things" , which is what I'm expecting, in particular the "a".
Outer(a)
# "y" .... , but I was expecting to get "a"?
Of course I'm not dead set on using substitute if someone knows of a better method.
Does anyone have any clues how to get Inner to output the same result regardless if it is nested or not?
thanks in advance.
Here is a general outline that should help you solve your problem:
Inner <- function(x) {
my.call <- quote(substitute(x)) # we quote this here because we are going to re-use this expression
var.name <- eval(my.call)
for(i in rev(head(sys.frames(), -1L))) { # First frame doesn't matter since we already substituted for first level, reverse since sys.frames is in order of evaluation, and we want to go in reverse order
my.call[[2]] <- var.name # this is where we re-use it, modified to replace the variable
var.name <- eval(my.call, i)
}
return(var.name)
}
Outer <- function(y) Inner(y)
Outer2 <- function(z) Outer(z)
Now let's run the functions:
Inner(1 + 1)
# 1 + 1
Outer(2 + 2)
# 2 + 2
Outer2(3 + 3)
# 3 + 3
Inner always returns the outermost expression (you never see y or z, just the expression as typed in .GlobalEnv).
The trick here is to use sys.frames(), and repeatedly substitute until we get to the top level.
Note this assumes that all the "Outer" functions just forward their argument on to the next inner one. Things likely get a lot more complicated / impossible if you have something like:
Outer <- function(y) Inner(y + 1)
This code does not check for that type of issue, but you probably should in your code. Also, keep in mind that the assumption here is that your functions will only be called from the R command line. If someone wraps their functions around yours, you might get unexpected results.

R - pass a global variable to a function, modify it and save

I'm trying to build a dynamic function utilizing eval, parse, or whatever works.
Intention of a function: a value setter.
Parameter input: list, name of list item, value
Return: don't really care
Current code
#call fun_lsSetValue(state_list,selected,"dropdown")
fun_lsSetValue <- function(ls,name,value){
pars <- as.list(match.call()[-1])
element <- as.character(eval(expression(pars$name)))
if(is.null(value))
eval(parse(text="ls[[element]] <- ''"))
else
eval(parse(text="ls[[element]] <- value"))
#part that I need help, I need to assign ls to "state_list" without
#having to hard coded it in this function
#I have tried everything I can think of like
#assign(deparse(substitute(ls)),ls,.GlobalEnv)
#state_list <<- ls works, but I want to be dynamic
}
The problem I found is I need to pass the value of a local variable "ls" to where it came from dynamically (state_list)
I know a <- function(a,name,value) {... return(a)} works, but this syntax is really not my preference.
I'm trying to learn whether the same thing can be done without the assignment outside of the function.
Any advice would be helpful.
Even though this is a terrible idea in general, something like
fun_lsSetValue <- function(ls,name,value){
lsname <- deparse(substitute(ls))
name <- deparse(substitute(name))
ls <- get(lsname, envir=globalenv())
if(is.null(value)) {
value<-''
}
ls[[name]]<-value
assign(lsname, ls,envir=globalenv())
}
should work
a <- list(x=1)
fun_lsSetValue(a,x,3)
a
# $x
# [1] 3
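A slightly more general sketch of the same idea writes back into the caller's environment via parent.frame() instead of always using the global environment, so it also works when called from inside another function. The name set_value is made up, and unlike the version above it takes name as a character string. It assumes the first argument is a plain variable name; anything else (e.g. x$y) will fail in get().

```r
set_value <- function(ls, name, value) {
  ls_name <- deparse(substitute(ls))  # the variable name as typed by the caller
  caller  <- parent.frame()           # environment the function was called from
  updated <- get(ls_name, envir = caller)
  if (is.null(value)) value <- ""
  updated[[name]] <- value
  assign(ls_name, updated, envir = caller)
  invisible(updated)
}

state_list <- list(selected = "none")
set_value(state_list, "selected", "dropdown")
state_list$selected  # "dropdown"

# Also works on a variable local to another function:
wrapper <- function() {
  local_list <- list(x = 1)
  set_value(local_list, "x", 3)
  local_list$x
}
wrapper()  # 3
```

The same caveat applies as above: modifying the caller's variables as a side effect is generally discouraged, and returning the updated list is the more idiomatic route.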

Naming different variables and using i to subset a file

I want to go through a vector, name a variable after each value i, and use i to subset a larger file.
Why does this not work?
x <- c(seq(.1,.9,.1),seq(.9,1,.01))
doplot <- function(y)
{
for (i in unique(y))
{
paste("f_", i, sep = "") <- (F_agg[F_agg$Assort==i,])
}
}
doplot(x)
There are several problems here. First of all, on the left hand side of <- you need a symbol (well, or a special function, but let's not get into that now). So when you do this:
a <- "b"
a <- 15
then a will be set to 15; R does not first evaluate a to "b" and then set b to 15.
Then, if you create variables within a function, they will be (by default) local to that function, and destroyed at the end of the function.
Third, it is not good practice to create variables this way. (For details I will not go into now.) It is better to put your data in a named list, and then return the list from the function.
Here is a solution that should work, although I cannot test it, because you did not provide any test data:
doplot <- function(y) {
lapply(unique(y), function(i) {
F_agg[F_agg$Assort == i, ]
})
}
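To keep the f_<i> names the question was after, the list elements can be named. The following is a minimal sketch with a made-up stand-in for F_agg (the real data was not provided); subset_by_assort is a hypothetical name, and the data is passed in as an argument rather than read from the global environment.

```r
# Hypothetical stand-in for F_agg.
F_agg <- data.frame(Assort = c(0.1, 0.1, 0.2, 0.9), value = 1:4)

subset_by_assort <- function(y, data) {
  values  <- unique(y)
  subsets <- lapply(values, function(i) data[data$Assort == i, ])
  # Name each element "f_<value>" so it can be looked up later.
  setNames(subsets, paste0("f_", values))
}

res <- subset_by_assort(c(0.1, 0.2, 0.9), F_agg)
names(res)            # "f_0.1" "f_0.2" "f_0.9"
nrow(res[["f_0.1"]])  # 2
```

One thing to watch with the vector from the question: values produced by seq(.1, .9, .1) are floating-point numbers, and exact comparison with == can miss rows (for example, seq(.1, .9, .1)[3] == 0.3 is FALSE), so a tolerance-based comparison may be needed on real data.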

Resources