Equivalent of *args in R and summing up a function to itself

Say I have a binomial.coeff function defined as:
combn <- function(n,r) {
factorial(n)/(factorial(r)*factorial(n-r))
}
binomial.coeff <- function(n,x,p) {
combn(n,x)*p^x*(1-p)^(n-x)
}
I would like to call the function binomial.coeff for different values of x and sum the calls to produce the result. I'm not sure if R takes many positional arguments like this:
binomial.coeff <- function(n,...,p) {
combn(n,...)*p^...*(1-p)^(n-...)
}
My final goal: if, say, x takes the values [3,2], I would like the function to be summed over 3 and 2:
binomial.coeff <- function(n,x,p) {
combn(n,3)*p^3*(1-p)^(n-3)+combn(n,2)*p^2*(1-p)^(n-2)
}

Since you are using functions that are themselves vectorized, you can pass in a vector for x with
binomial.coeff(3,2:3,.25)
# [1] 0.140625 0.015625
and sum that with
sum(binomial.coeff(3,2:3,.25))
# [1] 0.15625
More generally, if you want to call it multiple times with different parameter values, in R you'd use something like sapply. For example
sapply(2:3, function(x) binomial.coeff(3, x, .25))
# [1] 0.140625 0.015625
sum(sapply(2:3, function(x) binomial.coeff(3, x, .25)))
# [1] 0.15625
But the code you've written seems to be the pmf for the binomial distribution. There is a built-in function for that, dbinom. You could do the same thing with
dbinom(2:3, 3, .25)
# [1] 0.140625 0.015625
sum(dbinom(2:3, 3, .25))
# [1] 0.15625
in one call because dbinom is vectorized over the x argument. Also this version would be more efficient.
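To come back to the *args part of the question: R's `...` can collect extra positional arguments, but it has to be turned into a vector before it can be used in arithmetic. The following is only a sketch; `binomial.sum` is a hypothetical name, and `n` and `p` are placed before `...` so the remaining positional arguments are all treated as x values.

```r
# Hypothetical helper (not from the question): sums the binomial pmf
# over every x value passed positionally after n and p.
binomial.sum <- function(n, p, ...) {
  x <- c(...)  # gather the extra arguments into one numeric vector
  sum(choose(n, x) * p^x * (1 - p)^(n - x))
}

binomial.sum(3, 0.25, 3, 2)
# Same value as sum(dbinom(2:3, 3, 0.25))
```

Passing a single vector for x, as shown above, is usually simpler than spreading the values over `...`.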

Related

How to make a function that sums the exponential's of a vector in R?

My professor has assigned a question for programming in R and I am stuck. He wants us to make a function that will take the exponential (e^(x[i])) of all the numbers in a vector and then sum them. The equation is:
$\sum_{i=1}^{n} e^{x_i}$
I have made a function that will give me the exponential of the first value in my vector. But I want to get the exponential of all the values and sum them. Here is my code
#Vector for summing
x=c(2,1,3,0.4)
#Code for function
mysum = 0
myfun=function(x){
for (i in 1:length(x)){
mysum = mysum + exp(x[i])
return(mysum)
}
}
myfun(x)
#returns 7.389056
I have also tried using i = 1:1 because the equation specifies i=1, even though I knew that would only go through 1 number, and it gave me the same answer.... obviously.
myfun=function(x){
for (i in 1:1)
Does anyone have any suggestions to get it to sum?
You should initialize mysum inside the function, before the accumulation starts, and move the line return(mysum) outside your for loop so that the result is returned only after every element has been added, i.e.,
myfun=function(x){
mysum <- 0
for (i in 1:length(x)){
mysum = mysum + exp(x[i])
}
return(mysum)
}
or just
myfun=function(x){
mysum <- 0
for (i in x){
mysum = mysum + exp(i)
}
return(mysum)
}
Since the exp operation is vectorized, you can also define your function myfun like below
myfun <- function(x) sum(exp(x))
You could also use the fact that most base functions are already vectorized :
1) create a dummy vector
1:10
#> [1] 1 2 3 4 5 6 7 8 9 10
2) apply your function on that vector, you get vectorized result
exp(1:10)
#> [1] 2.718282 7.389056 20.085537 54.598150 148.413159
#> [6] 403.428793 1096.633158 2980.957987 8103.083928 22026.465795
3) Sum that vector
sum(exp(1:10))
#> [1] 34843.77
4) Write your function to gain (a little) time
my_fun <- function(x){sum(exp(x))}
my_fun(1:10)
#> [1] 34843.77
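As a quick check that the loop and the vectorized form agree, here is a small sketch using the vector from the question. The names `sum_exp_loop` and `sum_exp_vec` are made up for this example, and `seq_along(x)` is used instead of `1:length(x)` so that an empty vector is handled cleanly.

```r
# Loop version: accumulate exp(x[i]) one element at a time
sum_exp_loop <- function(x) {
  total <- 0
  for (i in seq_along(x)) {  # seq_along() yields no iterations for empty x
    total <- total + exp(x[i])
  }
  total
}

# Vectorized version: exp() works on the whole vector at once
sum_exp_vec <- function(x) sum(exp(x))

x <- c(2, 1, 3, 0.4)
all.equal(sum_exp_loop(x), sum_exp_vec(x))  # TRUE
sum_exp_loop(numeric(0))                    # 0
```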

Creating a function in R with variable number of arguments

When creating a function in R, we usually specify the number of arguments like
function(x,y){
}
That means it takes only two arguments. But when the number of arguments is not fixed (in one case I have to use two arguments, but in another I have to use three or more), how can we handle this? I am pretty new to programming, so an example would be greatly appreciated.
d <- function(...){
x <- list(...) # THIS WILL BE A LIST STORING EVERYTHING:
sum(...) # Example of inbuilt function
}
d(1,2,3,4,5)
[1] 15
You can use ... to specify an additional number of arguments. For example:
myfun <- function(x, ...) {
for(i in list(...)) {
print(x * i)
}
}
> myfun(4, 3, 1)
[1] 12
[1] 4
> myfun(4, 9, 1, 0, 12)
[1] 36
[1] 4
[1] 0
[1] 48
> myfun(4)
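For the case described in the question (two arguments sometimes, three or more at other times), one possible pattern is to name the arguments that are always required and let `...` absorb the rest. This is a sketch only; `add_all` is a hypothetical function name.

```r
# x and y are always required; anything after them lands in `...`
add_all <- function(x, y, ...) {
  extras <- c(...)        # NULL when no extra arguments were given
  x + y + sum(extras)     # sum(NULL) is 0, so the two-argument case still works
}

add_all(1, 2)        # two arguments
add_all(1, 2, 3, 4)  # four arguments
```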

How to exclude 1 when running the function?

The question is:
There is a package with a function that enables you to check if a number is prime.
install.packages("schoolmath")
library(schoolmath)
is.prim(3)
Create a function that takes in two integers (set default values of 1 to both). The function should calculate the number of prime numbers between the two values.
My code is:
install.packages("schoolmath")
library(schoolmath)
is.prim(3)
prime <- function(x)
{
p <- 0
p1 <- ifelse(is.prim(x) == "TRUE", p + 1, p)
return(sum(p1 == 1))
}
prime(seq(1,10,1))
When I ran the function, it counts 1 as a prime number as well, which is not true. How to efficiently exclude that from the function?
You can simplify your function a little because is.prim works with vectors and, according to the documentation for the sum function:
Logical true values are regarded as one, false values as zero.
Here is a function that counts the primes in a vector
count.primes <- function(x) {
sum(x > 1 & is.prim(x))
}
Example:
count.primes(1:10)
# [1] 4
count.primes(1:20)
# [1] 8
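The assignment itself asks for a function that takes two integers with default values of 1. A sketch of that interface could look like the following. To keep it runnable without installing anything, it uses a small base-R helper, `is_prime_basic`, as a stand-in for `is.prim`; both `is_prime_basic` and `count_primes_between` are hypothetical names, not part of schoolmath.

```r
# Stand-in for schoolmath::is.prim on a single whole number (assumption:
# trial division is fast enough for the small ranges used here)
is_prime_basic <- function(n) {
  if (n < 2) return(FALSE)            # excludes 1 (and anything smaller)
  if (n < 4) return(TRUE)             # 2 and 3
  all(n %% 2:floor(sqrt(n)) != 0)     # no divisor up to sqrt(n)
}

# Counts the primes between two integers, inclusive; both default to 1
count_primes_between <- function(from = 1, to = 1) {
  x <- seq(min(from, to), max(from, to))
  sum(vapply(x, is_prime_basic, logical(1)))
}

count_primes_between(1, 10)
count_primes_between()
```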

Function parameter as argument in an R function

I am attempting to write a general function to calculate coverage probabilities for interval estimation of Binomial proportions in R. I intend to do this for a variety of confidence interval methods e.g. Wald, Clopper-Pearson, HPD intervals for varying priors.
Ideally, I would like there to be one function, that can take, as an argument, the method that should be used to calculate the interval. My question then: how can I include a function as an argument in another function?
As an example, for the Exact Clopper-Pearson interval I have the following function:
# Coverage for Exact interval
ExactCoverage <- function(n) {
p <- seq(0,1,.001)
x <- 0:n
# value of dist
dist <- sapply(p, dbinom, size=n, x=x)
# interval
int <- Exact(x,n)
# indicator function
ind <- sapply(p, function(x) cbind(int[,1] <= x & int[,2] >= x))
list(coverage = apply(ind*dist, 2, sum), p = p)
}
Where Exact(x,n) is just a function to calculate the appropriate interval. I would like to have
Coverage <- function(n, FUN, ...)
...
# interval
int <- FUN(...)
so that I have one function to calculate the coverage probabilities rather than a separate coverage function for each method of interval calculation. Is there a standard way to do this? I have not been able to find an explanation.
Thanks,
James
In R, a function can be provided as a function argument. The syntax matches the one of non-function objects.
Here is an example function.
myfun <- function(x, FUN) {
FUN(x)
}
This function applies the function FUN to the object x.
A few examples with a vector including the numbers from 1 to 10:
vec <- 1:10
> myfun(vec, mean)
[1] 5.5
> myfun(vec, sum)
[1] 55
> myfun(vec, diff)
[1] 1 1 1 1 1 1 1 1 1
This is not limited to built-in functions, but works with any function:
> myfun(vec, function(obj) sum(obj) / length(obj))
[1] 5.5
mymean <- function(obj){
sum(obj) / length(obj)
}
> myfun(vec, mymean)
[1] 5.5
You can also store a function name as a character variable, and call it with do.call()
> test = c(1:5)
> do.call(mean, list(test))
[1] 3
>
> func = 'mean'
> do.call(func, list(test))
[1] 3
Hadley's text gives a great (and simple) example:
randomise <- function(f) f(runif(1e3))
randomise(mean)
#> [1] 0.5059199
randomise(mean)
#> [1] 0.5029048
randomise(sum)
#> [1] 504.245
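Applied to the coverage question, the same idea might look like the sketch below. It assumes that every interval method takes `x` and `n` as its first two arguments and returns a two-column matrix of lower and upper limits. `wald_interval` is a hypothetical example method written for this sketch, not the `Exact` function from the question, and `p` is placed after `...` so it must be passed by name.

```r
# Hypothetical interval method: Wald interval for a binomial proportion
wald_interval <- function(x, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  phat <- x / n
  half <- z * sqrt(phat * (1 - phat) / n)
  cbind(lower = phat - half, upper = phat + half)
}

# Generic coverage function: FUN is the interval method, `...` is passed on to it
Coverage <- function(n, FUN, ..., p = seq(0, 1, 0.001)) {
  FUN <- match.fun(FUN)  # accepts a function or its name as a string
  x <- 0:n
  int <- FUN(x, n, ...)                       # one row of limits per x
  dist <- sapply(p, dbinom, size = n, x = x)  # pmf of each x, one column per p
  ind <- sapply(p, function(pp) int[, 1] <= pp & int[, 2] >= pp)
  list(coverage = colSums(ind * dist), p = p)
}

res <- Coverage(10, wald_interval, conf.level = 0.9)
str(res)
```

A different method can then be swapped in by passing another function with the same calling convention as `FUN`.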

How to assign from a function which returns more than one value?

Still trying to get into the R logic... what is the "best" way to unpack (on LHS) the results from a function returning multiple values?
I can't do this apparently:
R> functionReturningTwoValues <- function() { return(c(1, 2)) }
R> functionReturningTwoValues()
[1] 1 2
R> a, b <- functionReturningTwoValues()
Error: unexpected ',' in "a,"
R> c(a, b) <- functionReturningTwoValues()
Error in c(a, b) <- functionReturningTwoValues() : object 'a' not found
must I really do the following?
R> r <- functionReturningTwoValues()
R> a <- r[1]; b <- r[2]
or would the R programmer write something more like this:
R> functionReturningTwoValues <- function() {return(list(first=1, second=2))}
R> r <- functionReturningTwoValues()
R> r$first
[1] 1
R> r$second
[1] 2
--- edited to answer Shane's questions ---
I don't really need to give names to the parts of the result. I am applying one aggregate function to the first component and another to the second component (min and max; if it were the same function for both components I would not need to split them).
(1) list[...]<-: I had posted this over a decade ago on r-help. Since then it has been added to the gsubfn package. It does not require a special operator but does require that the left hand side be written using list[...] like this:
library(gsubfn) # need 0.7-0 or later
list[a, b] <- functionReturningTwoValues()
If you only need the first or second component these all work too:
list[a] <- functionReturningTwoValues()
list[a, ] <- functionReturningTwoValues()
list[, b] <- functionReturningTwoValues()
(Of course, if you only needed one value then functionReturningTwoValues()[[1]] or functionReturningTwoValues()[[2]] would be sufficient.)
See the cited r-help thread for more examples.
(2) with: If the intent is merely to combine the multiple values subsequently and the return values are named, then a simple alternative is to use with:
myfun <- function() list(a = 1, b = 2)
list[a, b] <- myfun()
a + b
# same
with(myfun(), a + b)
(3) attach: Another alternative is attach:
attach(myfun())
a + b
ADDED: with and attach
I somehow stumbled on this clever hack on the internet ... I'm not sure if it's nasty or beautiful, but it lets you create a "magical" operator that allows you to unpack multiple return values into their own variable. The := function is defined here, and included below for posterity:
':=' <- function(lhs, rhs) {
frame <- parent.frame()
lhs <- as.list(substitute(lhs))
if (length(lhs) > 1)
lhs <- lhs[-1]
if (length(lhs) == 1) {
do.call(`=`, list(lhs[[1]], rhs), envir=frame)
return(invisible(NULL))
}
if (is.function(rhs) || is(rhs, 'formula'))
rhs <- list(rhs)
if (length(lhs) > length(rhs))
rhs <- c(rhs, rep(list(NULL), length(lhs) - length(rhs)))
for (i in 1:length(lhs))
do.call(`=`, list(lhs[[i]], rhs[[i]]), envir=frame)
return(invisible(NULL))
}
With that in hand, you can do what you're after:
functionReturningTwoValues <- function() {
return(list(1, matrix(0, 2, 2)))
}
c(a, b) := functionReturningTwoValues()
a
#[1] 1
b
# [,1] [,2]
# [1,] 0 0
# [2,] 0 0
I don't know how I feel about that. Perhaps you might find it helpful in your interactive workspace. Using it to build (re-)usable libraries (for mass consumption) might not be the best idea, but I guess that's up to you.
... you know what they say about responsibility and power ...
Usually I wrap the output into a list, which is very flexible (you can have any combination of numbers, strings, vectors, matrices, arrays, lists, objects in the output)
so like:
func2<-function(input) {
a<-input+1
b<-input+2
output<-list(a,b)
return(output)
}
output<-func2(5)
for (i in output) {
print(i)
}
[1] 6
[1] 7
I put together an R package zeallot to tackle this problem. zeallot includes a multiple assignment or unpacking assignment operator, %<-%. The LHS of the operator is any number of variables to assign, built using calls to c(). The RHS of the operator is a vector, list, data frame, date object, or any custom object with an implemented destructure method (see ?zeallot::destructure).
Here are a handful of examples based on the original post,
library(zeallot)
functionReturningTwoValues <- function() {
return(c(1, 2))
}
c(a, b) %<-% functionReturningTwoValues()
a # 1
b # 2
functionReturningListOfValues <- function() {
return(list(1, 2, 3))
}
c(d, e, f) %<-% functionReturningListOfValues()
d # 1
e # 2
f # 3
functionReturningNestedList <- function() {
return(list(1, list(2, 3)))
}
c(f, c(g, h)) %<-% functionReturningNestedList()
f # 1
g # 2
h # 3
functionReturningTooManyValues <- function() {
return(as.list(1:20))
}
c(i, j, ...rest) %<-% functionReturningTooManyValues()
i # 1
j # 2
rest # list(3, 4, 5, ..)
Check out the package vignette for more information and examples.
functionReturningTwoValues <- function() {
results <- list()
results$first <- 1
results$second <- 2
return(results)
}
a <- functionReturningTwoValues()
I think this works.
There's no right answer to this question. It really depends on what you're doing with the data. In the simple example above, I would strongly suggest:
Keep things as simple as possible.
Wherever possible, it's a best practice to keep your functions vectorized. That provides the greatest amount of flexibility and speed in the long run.
Is it important that the values 1 and 2 above have names? In other words, why is it important in this example that 1 and 2 be named a and b, rather than just r[1] and r[2]? One important thing to understand in this context is that a and b are also both vectors of length 1. So you're not really changing anything in the process of making that assignment, other than having 2 new vectors that don't need subscripts to be referenced:
> r <- c(1,2)
> a <- r[1]
> b <- r[2]
> class(r)
[1] "numeric"
> class(a)
[1] "numeric"
> a
[1] 1
> a[1]
[1] 1
You can also assign the names to the original vector if you would rather reference the letter than the index:
> names(r) <- c("a","b")
> names(r)
[1] "a" "b"
> r["a"]
a
1
[Edit] Given that you will be applying min and max to each vector separately, I would suggest either using a matrix (if a and b will be the same length and the same data type) or data frame (if a and b will be the same length but can be different data types) or else use a list like in your last example (if they can be of differing lengths and data types).
> r <- data.frame(a=1:4, b=5:8)
> r
a b
1 1 5
2 2 6
3 3 7
4 4 8
> min(r$a)
[1] 1
> max(r$b)
[1] 8
If you want to return the output of your function to the Global Environment, you can use list2env, like in this example:
myfun <- function(x) {
a <- 1:x
b <- 5:x
df <- data.frame(a=a, b=b)
newList <- list("my_obj1" = a, "my_obj2" = b, "myDF"=df)
list2env(newList, .GlobalEnv)
}
myfun(3)
This function will create three objects in your Global Environment:
> my_obj1
[1] 1 2 3
> my_obj2
[1] 5 4 3
> myDF
a b
1 1 5
2 2 4
3 3 3
Lists seem perfect for this purpose. For example within the function you would have
x = desired_return_value_1 # (vector, matrix, etc)
y = desired_return_value_2 # (vector, matrix, etc)
returnlist = list(x,y...)
} # end of function
main program
x = returnlist[[1]]
y = returnlist[[2]]
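A runnable version of that outline might look like this. It is a sketch under assumptions: `two_parts` and the values it returns are made up for illustration, and the list elements are named so they can be retrieved either by position or by name.

```r
# Hypothetical function returning two values of different shapes
two_parts <- function() {
  x <- c(3, 1, 2)          # first return value (a vector)
  y <- matrix(1:4, 2, 2)   # second return value (a matrix)
  list(x = x, y = y)
}

# Main program: unpack the list
returnlist <- two_parts()
x <- returnlist[[1]]  # by position
y <- returnlist$y     # by name

min(x)
max(y)
```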
Yes to your second and third questions -- that's what you need to do as you cannot have multiple 'lvalues' on the left of an assignment.
How about using assign?
functionReturningTwoValues <- function(a, b) {
assign(a, 1, pos=1)
assign(b, 2, pos=1)
}
You can pass the names of the variable you want to be passed by reference.
> functionReturningTwoValues('a', 'b')
> a
[1] 1
> b
[1] 2
If you need to access the existing values, the converse of assign is get.
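A small sketch of the assign/get pair follows. It writes into a separate environment rather than the global one, which is an assumption made here to keep the example free of side effects; `set_two` and `env` are hypothetical names.

```r
env <- new.env()

# Creates two variables, named by the caller, in the given environment
set_two <- function(name1, name2, envir) {
  assign(name1, 1, envir = envir)
  assign(name2, 2, envir = envir)
  invisible(NULL)
}

set_two("a", "b", env)
get("a", envir = env)           # look up one variable by name
mget(c("a", "b"), envir = env)  # look up several at once, as a named list
```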
[A]
If each of foo and bar is a single number, then there's nothing wrong with c(foo,bar); and you can also name the components: c(Foo=foo,Bar=bar). So you could access the components of the result 'res' as res[1], res[2]; or, in the named case, as res["Foo"], res["Bar"].
[B]
If foo and bar are vectors of the same type and length, then again there's nothing wrong with returning cbind(foo,bar) or rbind(foo,bar); likewise nameable. In the 'cbind' case, you would access foo and bar as res[,1], res[,2] or as res[,"Foo"], res[,"Bar"]. You might also prefer to return a dataframe rather than a matrix:
data.frame(Foo=foo,Bar=bar)
and access them as res$Foo, res$Bar. This would also work well if foo and bar were of the same length but not of the same type (e.g. foo is a vector of numbers, bar a vector of character strings).
[C]
If foo and bar are sufficiently different not to combine conveniently as above, then you should definitely return a list.
For example, your function might fit a linear model and also calculate predicted values, so you could have
LM<-lm(....) ; foo<-summary(LM); bar<-LM$fit
and then you would return list(Foo=foo,Bar=bar) and then access the summary as res$Foo, the predicted values as res$Bar
source: http://r.789695.n4.nabble.com/How-to-return-multiple-values-in-a-function-td858528.html
Year 2021 and this is something I frequently use.
The tidyverse has a function called lst (from the tibble package) that assigns names to the list elements when creating the list.
After that, I use list2env() to assign the variables, or use the list directly
library(tidyverse)
fun <- function(){
a<-1
b<-2
lst(a,b)
}
list2env(fun(), envir=.GlobalEnv) # unpacks the list's name-value pairs into variables in the global environment
This is only for the sake of completeness and not because I personally prefer it. You can pipe %>% the result, evaluate it with curly braces {} and write variables to the parent environment using double-arrow <<-.
library(tidyverse)
functionReturningTwoValues() %>% {a <<- .[1]; b <<- .[2]}
UPDATE:
You can also use the multiple assignment operator from the zeallot package: %<-%
c(a, b) %<-% list(0, 1)
I will post a function that returns multiple objects by way of vectors:
Median <- function(X){
X_Sort <- sort(X)
if (length(X)%%2==0){
Median <- (X_Sort[(length(X)/2)]+X_Sort[(length(X)/2)+1])/2
} else{
Median <- X_Sort[(length(X)+1)/2]
}
return(Median)
}
That was a function I created to calculate the median. I know that there's a built-in function in R called median(), but I programmed it anyway so that I could build another function that calculates the quartiles of a numeric data set using the Median() function above. The Median() function works like this:
If a numeric vector X has an even number of elements (i.e., length(X)%%2==0), the median is calculated by averaging the elements sort(X)[length(X)/2] and sort(X)[(length(X)/2+1)].
If X doesn't have an even number of elements, the median is sort(X)[(length(X)+1)/2].
On to the QuartilesFunction():
QuartilesFunction <- function(X){
X_Sort <- sort(X) # Data is sorted in ascending order
if (length(X)%%2==0){
# Data number is even
HalfDN <- X_Sort[1:(length(X)/2)]
HalfUP <- X_Sort[((length(X)/2)+1):length(X)]
QL <- Median(HalfDN)
QU <- Median(HalfUP)
QL1 <- QL
QL2 <- QL
QU1 <- QU
QU2 <- QU
QL3 <- QL
QU3 <- QU
Quartiles <- c(QL1,QU1,QL2,QU2,QL3,QU3)
names(Quartiles) = c("QL (1)", "QU (1)", "QL (2)", "QU (2)","QL (3)", "QU (3)")
} else{ # Data number is odd
# Including the median
Half1DN <- X_Sort[1:((length(X)+1)/2)]
Half1UP <- X_Sort[(((length(X)+1)/2)):length(X)]
QL1 <- Median(Half1DN)
QU1 <- Median(Half1UP)
# Not including the median
Half2DN <- X_Sort[1:(((length(X)+1)/2)-1)]
Half2UP <- X_Sort[(((length(X)+1)/2)+1):length(X)]
QL2 <- Median(Half2DN)
QU2 <- Median(Half2UP)
# Methods (1) and (2) averaged
QL3 <- (QL1+QL2)/2
QU3 <- (QU1+QU2)/2
Quartiles <- c(QL1,QU1,QL2,QU2,QL3,QU3)
names(Quartiles) = c("QL (1)", "QU (1)", "QL (2)", "QU (2)","QL (3)", "QU (3)")
}
return(Quartiles)
}
This function returns the quartiles of a numeric vector by using three methods:
(1) Keeping the median for the calculation of the quartiles when the number of elements of the numeric vector X is odd.
(2) Discarding the median for the calculation of the quartiles when the number of elements of the numeric vector X is odd.
(3) Averaging the results obtained by using methods (1) and (2).
When the number of elements in the numeric vector X is even, the three methods coincide.
The result of the QuartilesFunction() is a vector that depicts the first and third quartiles calculated by using the three methods outlined.
With R 3.6.1, I can do the following
fr2v <- function() { c(5,3) }
a_b <- fr2v()
(a_b[[1]]) # prints "5"
(a_b[[2]]) # prints "3"
To obtain multiple outputs from a function and keep them in the desired format you can save the outputs to your hard disk (in the working directory) from within the function and then load them from outside the function:
myfun <- function(x) {
df1 <- ...
df2 <- ...
save(df1, file = "myfile1")
save(df2, file = "myfile2")
}
load("myfile1")
load("myfile2")
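A variant of the same idea uses saveRDS() and readRDS(), which hand the object back as a value instead of restoring it under its original name. This is only a sketch: `make_outputs`, the data frames, and the file names are invented for illustration, and the files go to a temporary directory rather than the working directory.

```r
# Hypothetical function that writes two data frames to disk
make_outputs <- function(x, dir = tempdir()) {
  df1 <- data.frame(a = seq_len(x))
  df2 <- data.frame(b = rev(seq_len(x)))
  path1 <- file.path(dir, "df1.rds")
  path2 <- file.path(dir, "df2.rds")
  saveRDS(df1, path1)
  saveRDS(df2, path2)
  invisible(c(path1, path2))  # return the file paths so the caller can read them
}

paths <- make_outputs(3)
df1 <- readRDS(paths[1])
df2 <- readRDS(paths[2])
```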
