R function with variable args depending on presence/absence of other args

I've stumbled upon the varargs issue in R two or three times, but the problem I have now seems a little trickier than I expected. Here it is:
I have a function that does something with its arguments, and I would like to introduce another argument, a kind of flag, that selects how the function works and which parameters it needs: the number and type of inputs depend on the flag.
Ok, an example is better:
example = function(x,flag=1,y){
if (flag) return(x)
else return(y)
}
and this works fine.
The point is that in this example you need to specify both x and y every time. Instead I would like a function taking only x if flag=1 and only y if flag=0. (In this stupid example they would basically be two distinct functions, but in my actual case there are other, common arguments on which I do some calculations that both 'parts' of the function need.)
I know that one may pass any value for the unused argument and the result wouldn't change, but I want a function that is immediately readable by the user, and it is cumbersome to have to specify an argument that the function won't use.
Thank you for any help.

What about the following.
example = function(x,flag=1,y){
if (flag && !missing(x)) return(x)
else if(!flag && !missing(y)) return(y)
}
This checks whether the flag is zero or non-zero and whether the corresponding argument is missing. You may want to handle the case where neither condition is true, because the function returns NULL in that case.
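One way to cover that remaining case is to fail loudly instead of returning NULL. The sketch below is only an illustration: the name select_input and the error messages are made up here and are not part of the answer above. It also relies on R evaluating arguments lazily, so an argument without a default can be left out of the call as long as the function never touches it.

```r
# Hypothetical variant of the function above: it stops with an error
# when the argument required by the chosen flag value was not supplied.
select_input <- function(x, flag = 1, y) {
  if (flag) {
    if (missing(x)) stop("'x' must be supplied when flag is non-zero")
    return(x)
  }
  if (missing(y)) stop("'y' must be supplied when flag is zero")
  return(y)
}

select_input(x = 5)            # only x is needed here
select_input(flag = 0, y = 7)  # only y is needed here

# Asking for y without supplying it now raises an error; tryCatch()
# turns that error into its message so the script keeps running.
msg <- tryCatch(select_input(flag = 0), error = function(e) conditionMessage(e))
print(msg)
```

Whether an error or a default value is the better response depends on how the function is used; the checks with missing() are the part that matters.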

Related

R: Not to look for variables outside a function if they do not exist within it

This function is OK in R:
f <- function(x) {
x + y
}
Because if the variable y is not defined inside the function f(), R will look for it outside the environment of the function, in its parent environment.
Apart from the fact that this behavior can be a bug generator, what is the point of functions having input parameters if every variable used inside a function can be looked up outside of it anyway?
Is there any way not to look for variables outside a function if they do not exist within the function?
Some reasons for using parameters that came to my mind:
Without parameters, users have to define variables before using the function, and these variable names need to match the variable names used within the function -- this is impractical.
How is anyone supposed to know/remember the names of the variables within a function? How do I know which variables within a function are purely local, and which variables have to exist outside of the function?
Input parameters can be passed directly as values or as a variable (and the variable name does not matter).
Input parameters communicate the intended usage of the function; it is clear what data is needed to operate it (or at the very least: how many values need to be inserted by the user of the function)
Input parameters can be documented properly using Rd files (or roxygen syntax)
I am sure there are many other reasons to use input parameters.
M. Papenberg provides a very good explanation.
Here's a quick addendum on how to make a function not look for objects in parent environments:
Just provide them in the parameter list! This might sound stupid, but that's what you should always do unless you have a good reason to do otherwise. In your example only x is passed to the function. So, if the idea here is that x should be returned if y doesn't exist, you can go for default parameters. In this case this could be done as
f <- function(x, y = 0) {
x + y
}
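If the goal is to get an error rather than a silent lookup in the global environment, one further option is to change the function's enclosing environment. This is not mentioned in the answers above, so treat it as a sketch under that assumption. The base environment returned by baseenv() is enclosed by the empty environment, so a function attached to it can still find base functions such as `+`, but not variables defined in the global environment. The name f_strict is made up for this illustration.

```r
y <- 10

f <- function(x) {
  x + y
}
f(1)  # works: y is found outside the function

# Copy of f whose enclosing environment is the base environment,
# so free variables are no longer looked up in the global environment.
f_strict <- f
environment(f_strict) <- baseenv()

# The call now fails because y cannot be found; tryCatch() captures
# the error message instead of aborting the script.
result <- tryCatch(f_strict(1), error = function(e) conditionMessage(e))
print(result)
```

A side effect of this approach is that functions from other packages are no longer found inside f_strict unless they are called with an explicit namespace prefix, so passing everything through the parameter list usually remains the simpler choice.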

Cannot figure out how to use IF statement

I want to create a categorical variable for my DB: a "Same_Region" group that includes all the people who live and work in the same Region, and a "Diff_Region" group for those who don't. I tried to use the IF statement, but I don't know how to properly say "if the variables Region of residence and Region of work are the same, return...". It's the very first time I've tried to approach R by myself, and I feel a little bit lost.
I tried storing the two variables (made of 2 letters, e.g. "BO") as characters and using the "grep" command, but that led to no results.
Then I tried storing both variables as factors, and nothing much changed.
----In R-----
extractSamepr <- function(RegionOfRes, RegionOfWo){
if(RegionOfRes== RegionOfWo){
return("SamePr")
}
else {
return("DiffPr")
}
SamePr <- NULL
for (i in 1:nrow(Data.Base)) {
SamePr <- c(SamePr, extractSamepr(Data.Base[i, "RegionOfRes", "RegionOfWo"]))
}
The ifelse way proposed in @deepseefan's comment is a standard way of solving this type of problem.
Here is another one. It uses the fact that FALSE/TRUE are coded as integers 0/1 to create a logical vector based on equality and then add 1 to that vector, giving a vector of 1/2 values. This result is used in the function's final instruction to index a vector with the two possible outcomes.
extractSamepr <- function(DF){
i <- 1 + (DF[["RegionOfRes"]] == DF[["RegionOfWo"]])
c("DiffPr", "SamePr")[i]
}
Data.Base$SamePr <- extractSamepr(Data.Base)
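For comparison, the ifelse approach mentioned at the start of this answer can be sketched as follows. The three-row Data.Base below is invented purely for illustration; only the column names come from the question. The columns are kept as character vectors here, since comparing two factors with == fails when their level sets differ.

```r
# Made-up example data; the real Data.Base comes from the question.
Data.Base <- data.frame(
  RegionOfRes = c("BO", "MI", "RM"),
  RegionOfWo  = c("BO", "TO", "RM"),
  stringsAsFactors = FALSE
)

# ifelse() is vectorised: it compares the two columns element by element
# and picks the first label where they match, the second where they don't.
Data.Base$SamePr <- ifelse(
  Data.Base$RegionOfRes == Data.Base$RegionOfWo,
  "SamePr",
  "DiffPr"
)

print(Data.Base)
```

No explicit loop over the rows is needed, which is the main difference from the if/for attempt in the question.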

not error, but not results either in R

I am trying to make a function in R that calculates the mean of nitrate, sulfate and ID. My original data frame has 4 columns (date, nitrate, sulfate, ID), so I wrote the following code:
prueba<-read.csv("C:/Users/User/Desktop/coursera/001.csv",header=T)
columnmean<-function(y, removeNA=TRUE){ #y will be a matrix
whichnumeric<-sapply(y, is.numeric)#which columns are numeric
onlynumeric<-y[ , whichnumeric] #selecting just the numeric columns
nc<-ncol(onlynumeric) #lenght of onlynumeric
means<-numeric(nc)#empty vector for the means
for(i in 1:nc){
means[i]<-mean(onlynumeric[,i], na.rm = TRUE)
}
}
columnmean(prueba)
When I run the steps line by line on my data, outside of the function, I get the mean values. But when I call the function so that it does all the steps by itself, it doesn't raise an error and doesn't compute any value either: my environment only contains the data frame 'prueba' and the columnmean function.
What am I doing wrong?
A reproducible example would be nice (although not absolutely necessary in this case).
You need a final line return(means) at the end of your function. (Some old-school R users maintain that means alone is OK - R automatically returns the value of the last expression evaluated within the function whether return() is specified or not - but I feel that using return() explicitly is better practice.)
colMeans(y[sapply(y, is.numeric)], na.rm=TRUE)
is a slightly more compact way to achieve your goal (although there's nothing wrong with being a little more verbose if it makes your code easier for you to read and understand).
The result of an R function is the value of the last expression. Your last expression is:
for(i in 1:nc){
means[i]<-mean(onlynumeric[,i], na.rm = TRUE)
}
It may seem strange that the value of that expression is NULL, but that's the way it is with for-loops in R. The means vector does get changed sequentially, but it is never returned, which is why BenBolker's advice to use return(.) is correct (as his advice almost always is). For-loops in R are a notable exception to the functional programming paradigm. They provide a mechanism for looping (as do the various *apply functions), but the commands inside the loop exert their effects in the calling environment via side effects (unlike the apply functions).
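Putting both answers together, a corrected version of the function might look like the sketch below. The small data frame stands in for the CSV file from the question and its values are invented; the snake_case names are likewise chosen here for illustration.

```r
# Made-up stand-in for the data read from the CSV file in the question.
prueba <- data.frame(
  date    = c("2003-01-01", "2003-01-02", "2003-01-03"),
  nitrate = c(1, NA, 3),
  sulfate = c(2, 4, 6),
  ID      = c(1L, 1L, 1L)
)

column_mean <- function(y, remove_na = TRUE) {
  only_numeric <- y[sapply(y, is.numeric)]  # keep the numeric columns
  means <- numeric(ncol(only_numeric))      # empty vector for the means
  for (i in seq_along(only_numeric)) {
    means[i] <- mean(only_numeric[[i]], na.rm = remove_na)
  }
  names(means) <- names(only_numeric)
  return(means)  # without this line the function returns the loop's NULL
}

print(column_mean(prueba))

# The value of a for-loop itself is NULL, which is what the original
# function ended up returning.
loop_value <- for (i in 1:3) i
print(is.null(loop_value))  # TRUE
```

Indexing with y[sapply(y, is.numeric)] keeps the result a data frame even when only one column is numeric, which y[ , whichnumeric] would not.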

How do I remove an object from within a function environment in R?

How do I remove an object from the current function environment?
I'm trying to achieve this:
foo <- function(bar){
x <- bar
rm(bar, envir = environment())
print(c(x, is.null(bar)))
}
Because I want the function to be able to handle multiple inputs.
Specifically I'm trying to pass either a dataframe or a vector to the function, and if I'm passing a dataframe I want to set the vector to NULL for later error handling.
If you want, you can look at my DepthPlotter script, where I want to let the second function check if depth is a dataframe, and if so, assign it to df instead and remove depth from the environment.
Here is a very brief sketch of how to set this up using S3 method dispatch.
First, you define your generic:
DepthPlotter <- function(depth,...){
UseMethod("DepthPlotter", depth)
}
Then you define methods for specific classes of the argument depth. As a very basic example in your case, you might create only two, a data.frame method and a default method to handle the vector case:
DepthPlotter.default <- function(depth, variable, ...){
#Here you write a function assuming that depth is
# anything but a data frame
}
DepthPlotter.data.frame <- function(depth,...){
#Here you'd write a function that assumes
# that depth is a data frame
}
And then you can call DepthPlotter() using either type of argument and the correct function will be run based upon the result of class(depth).
The example I've sketched out here is a little crude, since I've used a default method to handle the vector case. You could write .numeric and .integer methods to handle numeric or integer vectors more specifically. In my example, the .default method will be called for any case other than data.frame, so if you go this route you'd want to write some code in there that checks for strange cases like depth being a complicated list, or other odd object, if you think there's a chance something like that might be passed to the function.
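To make the dispatch mechanism concrete, here is a small runnable sketch. It deliberately uses a made-up generic, describe_depth, with trivial method bodies, since the real DepthPlotter logic is not shown in the question; the input check in the default method is one possible way to guard against the odd objects mentioned above.

```r
# Hypothetical generic: dispatches on the class of its first argument.
describe_depth <- function(depth, ...) {
  UseMethod("describe_depth")
}

# Fallback for anything that is not a data frame; it rejects
# non-numeric input instead of silently accepting odd objects.
describe_depth.default <- function(depth, ...) {
  if (!is.numeric(depth)) {
    stop("depth must be a numeric vector or a data frame")
  }
  paste("vector of length", length(depth))
}

# Method that runs when depth is a data frame.
describe_depth.data.frame <- function(depth, ...) {
  paste("data frame with", nrow(depth), "rows")
}

print(describe_depth(c(1.5, 2, 3)))                # "vector of length 3"
print(describe_depth(data.frame(depth = c(1, 2)))) # "data frame with 2 rows"
```

With this layout there is no need to remove or overwrite an argument inside the function: each method only ever sees the type it was written for.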

R to ignore NULL values

I have two vectors in R, but some of the values in both are marked as "NULL".
I want R to ignore the "NULL" values, but still "acknowledge" their presence because of the indexes (I'm using the intersect and which functions).
I have tried this:
for i in 1:length(vector)
if vector=="NULL"
i=i+1
else
'rest of the code'
Is this a good approach? The algorithm is running, but the vectors are very large.
You should replace "NULL" with NA, which is R's native representation for missing values. Many functions then have ways of dealing with NA values, such as the na.action option. You also shouldn't call your vector 'vector', since that is the name of a base R function.
yourvector[yourvector == "NULL"] <- NA
Also you shouldn't add 1 to i in your if, just do nothing:
for (i in 1:length(yourvector)) {
if (!is.na(yourvector[i])) {
#rest of the code
}
}
Also, tell us what you want to do. You probably don't need a for loop.
This code contains several errors:
First off, a vector cannot normally contain NULL values at all. Are you maybe using a list?
if vector=="NULL"
you probably mean if (vector[i] == "NULL"). Even so, that’s wrong. You cannot filter for NULL by comparing to the character string "NULL" – those two are fundamentally different things. You need to use the function is.null instead. Or, if you’re working with an actual vector which contains NA values (not NULL, like I said, that’s not possible), something like is.na.
i=i+1
This code makes no sense – leaving it out won’t change the result because the loop is in charge of incrementing i.
Finally, don’t iterate over indices – for (i in 1 : length(x)) is bad style in R. Instead, iterate over the elements directly:
for (x in vector) {
if (! is.na(x)) {
# Perform action
}
}
But even this isn’t very R like. Instead, you would do two things:
use subsetting to get rid of NA values:
vector[! is.na(vector)]
Use one of the *apply functions (for instance, sapply) instead of a loop, and put the loop body into a function:
sapply(vector[! is.na(vector)], function (x) Do something with x)
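As a rough end-to-end illustration of the advice in both answers, the sketch below converts the "NULL" strings to NA and then uses which and intersect, the two functions named in the question. The vectors a and b and their values are invented for the example.

```r
# Made-up character vectors that use the string "NULL" as a placeholder.
a <- c("BO", "NULL", "MI", "RM")
b <- c("BO", "NULL", "TO", "MI")

# Replace the placeholder with NA so that R's missing-value tools apply.
a[a == "NULL"] <- NA
b[b == "NULL"] <- NA

# which() reports positions in the original vector, so the indexes
# of the remaining values are preserved.
which(!is.na(a))  # 1 3 4

# Comparisons involving NA give NA, and which() skips those positions.
which(a == b)     # 1

# Drop the NA values before taking the intersection.
common <- intersect(a[!is.na(a)], b[!is.na(b)])
print(common)     # "BO" "MI"

# Apply a function to the non-missing elements only.
lower <- sapply(a[!is.na(a)], tolower, USE.NAMES = FALSE)
print(lower)      # "bo" "mi" "rm"
```

All of these operations are vectorised, which matters for very large vectors far more than the choice between a for loop and sapply.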
