How to combine Curry() with Vectorize()? - r

Consider the following function:
addAmount <- function(x, amount) {
stopifnot(length(x) == 1)
return(x + amount)
}
It can be used to add some amount to x:
> addAmount(x = 5, amount = 3)
[1] 8
> addAmount(x = 2, amount = 3)
[1] 5
However, x must be of length 1:
> addAmount(x = 7:9, amount = 3)
Error: length(x) == 1 is not TRUE
I added this restriction intentionally for exemplification.
Using Vectorize, it is possible to pass in a vector for x:
> Vectorize(addAmount)(x = 7:9, amount = 3)
[1] 10 11 12
So far, so good.
However, I'd like to turn my addAmount function into a "add 3" function, using currying:
add3 <- functional::Curry(addAmount, amount = 3)
This works as expected if x is of length 1 and fails (as expected) if x is not of length 1:
> add3(x = 5)
[1] 8
> add3(x = 7:9)
Error: length(x) == 1 is not TRUE
The problem is: add3 cannot be vectorized:
> Vectorize(add3)(x = 7:9)
Error: length(x) == 1 is not TRUE
Somehow, the curried function is not "compatible" with Vectorize, i.e. it behaves as if it had not been vectorized at all.
Question: What can I do about this? How can currying and vectorization be combined? (And: What is going wrong?)
I found a workaround (heavily inspired by Hadley's add function) using environments instead of Curry, but I'm looking for a cleaner solution that doesn't require this kind of clumsy "factory" function:
getAdder <- function(amount) {
force(amount)
addAmount <- function(x) {
stopifnot(length(x) == 1)
return(x + amount)
}
return(addAmount)
}
add3 <- getAdder(3)
Vectorize(add3)(x = 7:9)
[1] 10 11 12
Tested with R 3.4.1 and the functional package (version 0.6).

You can vectorize before currying:
add3 <- functional::Curry(Vectorize(addAmount), amount = 3)
add3(1:10)
[1] 4 5 6 7 8 9 10 11 12 13
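As for what is going wrong: Vectorize() decides which arguments to vectorize over by looking at the formal arguments of the function it is given, ignoring `...`. A curried function typically has `...` as its only formal, so Vectorize() finds nothing to vectorize and returns the function unchanged. The sketch below illustrates this with a hypothetical `curry` helper that is assumed to mirror what functional::Curry does (it is a stand-in, not the package's code), and shows two ways around it.

```r
addAmount <- function(x, amount) {
  stopifnot(length(x) == 1)
  x + amount
}

# Hypothetical stand-in for functional::Curry (an assumption, not the package
# source): the returned closure has `...` as its only formal argument.
curry <- function(FUN, ...) {
  .orig <- list(...)
  function(...) do.call(FUN, c(.orig, list(...)))
}

add3 <- curry(addAmount, amount = 3)
names(formals(add3))               # the only formal is "..."
identical(Vectorize(add3), add3)   # Vectorize() handed add3 back untouched

# Option 1: vectorize first, then curry
add3_a <- curry(Vectorize(addAmount), amount = 3)
add3_a(x = 7:9)

# Option 2: wrap the curried function so it has a named argument to vectorize
add3_b <- Vectorize(function(x) add3(x))
add3_b(x = 7:9)
```

Option 2 keeps the original add3 intact, which may be preferable if the curried function is created elsewhere.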

Related

How to add possible divisor numbers?

How do I retrieve maximum sum of possible divisors numbers
I have a below function which will give possible divisors of number
Code
divisors <- function(x) {
y <- seq_len(ceiling(x / 2))
y[x %% y == 0]
}
Example
Divisors of 99 will give the below possible values.
divisors(99)
[1] 1 3 9 11 33
My expected logic:
Go from the last value to the first value in the divisors output.
The last number is 33. The next smaller number that divides 33 is 11, so I select 11. Continuing from 11, the next smaller number that divides 11 is 1, so I select 1. Now add all the numbers.
33 + 11 + 1 = 45
Move to the next number, 11. The next smaller number that divides 11 is 1, so I select 1. Now add all the numbers.
11 + 1 = 12
Move to the next number, 9. The next smaller number that divides 9 is 3, so I select 3; the next smaller number that divides 3 is 1, so I select 1. Now add all the numbers.
9 + 3 + 1 = 13
Move to the next number, 3. The next smaller number that divides 3 is 1, so I select 1. Now add all the numbers.
3 + 1 = 4
The maximum among these is 45.
Now I am struggling to write this logic in R . Help / Advice much appreciated.
Note : Prime numbers can be ignored.
update
For large integers, e.g., the maximum integer .Machine$integer.max (prime number), you can run the code below (note that I modified functions divisors and f a bit)
divisors <- function(x) {
y <- seq(x / 2)
y[as.integer(x) %% y == 0]
}
f <- function(y) {
if (length(y) <= 2) {
return(as.integer(sum(y)))
}
l <- length(y)
h <- y[l]
yy <- y[-l]
h + f(yy[h %% yy == 0])
}
and you will see
> n <- .Machine$integer.max - 1
> x <- divisors(n)
> max(sapply(length(x):2, function(k) f(head(x, k))))
[1] 1569603656
You can define a recursive function f that gives successive divisors
f <- function(y) {
if (length(y) == 1) {
return(y)
}
h <- y[length(y)]
yy <- y[-length(y)]
c(f(yy[h %% yy == 0]), h)
}
and you will see all possible successive divisor tuples
> sapply(rev(seq_along(x)), function(k) f(head(x, k)))
[[1]]
[1] 1 11 33
[[2]]
[1] 1 11
[[3]]
[1] 1 3 9
[[4]]
[1] 1 3
[[5]]
[1] 1
Then, we apply f within sapply like below
> max(sapply(rev(seq_along(x)), function(k) sum(f(head(x, k)))))
[1] 45
which gives the desired output.
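For comparison, the rule described in the question can also be written as a plain loop instead of a recursion. This is only a sketch: `chain_sum` is a hypothetical helper name, and it assumes the chain always jumps to the largest smaller divisor, as in the worked example for 99.

```r
divisors <- function(x) {
  y <- seq_len(ceiling(x / 2))
  y[x %% y == 0]
}

# Hypothetical helper: starting from `start`, repeatedly jump to the largest
# smaller element of `divs` that divides the current value, summing as we go.
chain_sum <- function(start, divs) {
  total <- start
  current <- start
  repeat {
    smaller <- divs[divs < current & current %% divs == 0]
    if (length(smaller) == 0) break
    current <- max(smaller)
    total <- total + current
  }
  as.numeric(total)
}

divs <- divisors(99)
sums <- vapply(divs, chain_sum, numeric(1), divs = divs)
sums        # one chain sum per starting divisor
max(sums)   # 45 for the example in the question
```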
You can also use the following solution. It may sound a little bit complicated and of course there is always an easier, more efficient solution. However, I thought this could be useful to you. I will take it from your divisors output:
> x
[1] 1 3 9 11 33
# First I created a list whose first element is our original x and from then on
# I subset the first element till the last element of the list
lst <- lapply(0:(length(x)-1), function(a) x[1:(length(x)-a)])
> lst
[[1]]
[1] 1 3 9 11 33
[[2]]
[1] 1 3 9 11
[[3]]
[1] 1 3 9
[[4]]
[1] 1 3
[[5]]
[1] 1
Then I wrote a custom function in order to implement your conditions and gather your desired output. For this purpose I created a function factory which in fact is a function that creates a function:
As you might have noticed the outermost function does not take any argument. It only sets up an empty vector out to save our desired elements in. It is created in the execution environment of the outermost function to shield it from any changes that might affect it in the global environment
The inner function is the one that takes our vector x, so in general we call the whole setup like fnf()(x). The first element of our out vector is the first element of the sorted x (33). Then I found all elements of x that divide it with a remainder of 0. After I found them, I took the second one (11), since the first one is 33 itself, and stored it in our out vector. Then I trimmed x so that it starts at that value (dropping 33) and repeated the same process
Since we were going to repeat the process over again, I thought this might be a good case to use recursion. Recursion is a programming technique in which a function calls itself from within its own body. As you might have noticed, I used fn inside the function to repeat the process, each time on a shorter vector
This may sound a bit complicated, but I believe there are some good points here for you to pick up for future exploration. I found them very useful and hope that is the case for you too.
fnf <- function() {
out <- c()
fn <- function(x) {
out <<- c(out, x[1])
z <- x[out[length(out)]%%x == 0]
if(length(z) >= 2) {
out[length(out) + 1] <<- z[2]
} else {
return(out)
}
x <- x[!duplicated(x)][which(x[!duplicated(x)] == z[2]):length(x[!duplicated(x)])]
fn(x)
out[!duplicated(out)]
}
}
# The result of applying the custom function on `lst` would result in your
# divisor values
lapply(lst, function(x) fnf()(sort(x, decreasing = TRUE)))
[[1]]
[1] 33 11 1
[[2]]
[1] 11 1
[[3]]
[1] 9 3 1
[[4]]
[1] 3 1
[[5]]
[1] 1
In the end we sum each element and extract the max value
Reduce(max, lapply(lst, function(x) sum(fnf()(sort(x, decreasing = TRUE)))))
[1] 45
To test a very large integer, I used dear @ThomasIsCoding's modified divisors function:
divisors <- function(x) {
y <- seq(x / 2)
y[as.integer(x) %% y == 0]
}
x <- divisors(.Machine$integer.max - 1)
lst <- lapply(0:(length(x)-1), function(a) x[1:(length(x)-a)])
Reduce(max, lapply(lst, function(x) sum(fnf()(sort(x, decreasing = TRUE)))))
[1] 1569603656
You'll need to recurse. If I understand correctly, this should do what you want:
fact <- function(x) {
x <- as.integer(x)
div <- seq_len(abs(x)/2)
factors <- div[x %% div == 0L]
return(factors)
}
maxfact <- function(x) {
factors <- fact(x)
if (length(factors) < 3L) {
return(sum(factors))
} else {
return(max(factors + mapply(maxfact, factors)))
}
}
maxfact(99)
[1] 45

Finding and writing environment

I am writing an R equivalent to Python's 'pop' method. I know 99th percentile has one but I'd prefer my own (practice/understanding/consistency etc).
For reference, pop() takes an object and removes the first item from the object while also returning it. So
> l <- c(1,3,5)
> x <- pop(l)
> print(l)
> 3, 5
> print(x)
> 1
I am using assign() to replace the input object with one less the first value and returning said first value from the function.
My question is, how do I get the environment of the input object and use this environment within assign()?
I have tried using pryr::where() which returns 'R_GlobalEnv' but I can't use this value in assign(). Instead the only value I can get to work in assign() is 'globalenv()'.
Posted from mobile so let me know if something doesn't work.
You can implement this in base R, though it's not advised. R is a functional language and functions with side effects are not expected by end-users.
pop <- function(vec)
{
vec_name <- deparse(substitute(vec))
assign(vec_name, vec[-1], envir = parent.frame())
vec[1]
}
a <- c(2, 7, 9)
a
#> [1] 2 7 9
pop(a)
#> [1] 2
a
#> [1] 7 9
pop(a)
#> [1] 7
a
#> [1] 9
Created on 2020-08-15 by the reprex package (v0.3.0)
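If the side effect is not strictly required, a more conventional R pattern is to return both the popped value and the remaining vector and let the caller reassign. A minimal sketch, where `pop_pure` is a hypothetical name and the input is assumed to be non-empty:

```r
# Hypothetical side-effect-free variant: nothing in the caller's environment
# is modified; the caller decides what to do with the two pieces.
pop_pure <- function(vec) {
  list(value = vec[[1]], rest = vec[-1])
}

l <- c(1, 3, 5)
res <- pop_pure(l)
x <- res$value   # the popped element
l <- res$rest    # explicit reassignment replaces the hidden assign()
x
l
```

This avoids the question of which environment to write to, at the cost of one extra line at the call site.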
The following answer is based on this R-Help post, function pop with function getEnvOf from this SO post, both adapted to the question's problem.
getEnvOf <- function(what, which=rev(sys.parents())) {
what <- as.character(substitute(what))
for (frame in which)
if (exists(what, frame=frame, inherits=FALSE))
return(sys.frame(frame))
return(NULL)
}
pop <- function(x){
y <- as.character(substitute(x))
e <- getEnvOf(y)
if(length(x) > 0) {
val <- x[[length(x)]]
assign(y, x[-length(x)], envir = parent.env(e))
val
} else {
msg <- paste(sQuote(y), "length is not > 0")
warning(msg)
NULL
}
}
y <- c(1,3,5)
pop(y)
This also works with lists.
z <- list(1, 2, 5)
pop(z)
w <- list(1, c(2, 4, 6), 5)
pop(w)
#[1] 5
pop(w)
#[1] 2 4 6
pop(w)
#[1] 1
pop(w)
#NULL
#Warning message:
#In pop(w) : ‘w’ length is not > 0
You can do it using pryr::promise_info(l)$env, but it's a very un-R-like thing to do. Functions shouldn't have side effects.
For example,
pop <- function(l) {
info <- pryr::promise_info(l)
if (!is.name(info$code))
stop("Argument expression should be a name.")
result <- l[[1]] # work on lists too
assign(as.character(info$code), l[-1], envir = info$env)
result
}
l <- c(1, 3, 5)
pop(l)
#> Registered S3 method overwritten by 'pryr':
#> method from
#> print.bytes Rcpp
#> [1] 1
l
#> [1] 3 5
Created on 2020-08-15 by the reprex package (v0.3.0)
Edited to add: Interestingly, none of the three answers so far works in complicated situations like this one:
f <- function(x) {
cat("The pop(x) result is", pop(x), "\n")
cat("Now x is ", x, "\n")
cat("Now l is ", l, "\n")
}
l <- c(1, 3, 5)
f(l)
@RuiBarradas's answer gives
The pop(x) result is 5
Now x is 1 3 5
Now l is 1 3 5
(He pops the last value rather than the first which is not a big deal, but neither x nor l is modified.)
@AllanCameron's answer gives
The pop(x) result is 1
Now x is 3 5
Now l is 1 3 5
This is arguably correct (x got popped), but I think it would be nice to have l being popped, and that seems tricky.
My answer dies with this message:
Error in pop(x) : Argument expression should be a name.
which seems like a bug: obviously whether it's getting x or l, it really is a name. The problem seems to be in pryr::promise_info, which returns the compiled code that would return the value of x, rather than just the code for x. If I turn off JIT compiling by compiler::enableJIT(0), I get the same result as @AllanCameron. It's not clear to me how to unwind back the right amount to pop l instead of just x.

Summation of a sequence

If n(1) = 1 ,n(2) = 5, n(3) = 13, n(4) = 25, ...
I am using a for loop for summation of these terms
1 + (1*4 - 4) + (2*4 - 4) + (3*4 - 4) + ..
This is the function I am using with a for loop:
shapeArea <- function(n) {
terms <- as.numeric(1)
for(i in 1:n){
terms <- append(terms, (i*4 - 4))
}
sum(terms)
}
This works fine (as shown here):
> shapeArea(3)
[1] 13
> shapeArea(2)
[1] 5
> shapeArea(4)
[1] 25
Yet I was also wondering how I could do this without saving the terms of the series in the numeric vector terms. In other words, is there a way to compute the sum of the terms without storing them in a vector first? Or is this already the efficient way to do it?
Thanks
You can change your shapeArea function to a one-liner
shapeArea <- function(num) {
1 + sum(seq(num) * 4) - (4 * num)
}
shapeArea(1)
#[1] 1
shapeArea(2)
#[1] 5
shapeArea(3)
#[1] 13
shapeArea(4)
#[1] 25
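The sum can also be avoided altogether. Since the sum of 4i - 4 for i = 1..n equals 4 * n(n + 1)/2 - 4n = 2n(n - 1), the whole expression reduces to 2n(n - 1) + 1. A small sketch of that closed form, with `shapeAreaClosed` as a hypothetical name:

```r
# Closed form of 1 + sum_{i = 1}^{n} (4 * i - 4), which equals 1 + 2 * n * (n - 1)
shapeAreaClosed <- function(n) {
  2 * n * (n - 1) + 1
}

shapeAreaClosed(1:4)   # works on a whole vector of n at once
```

Unlike the loop, this needs no intermediate vector and is itself vectorized over n.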

List comprehension in R

Is there a way to implement list comprehension in R?
Like python:
sum([x for x in range(1000) if x % 3== 0 or x % 5== 0])
same in Haskell:
sum [x| x<-[1..1000-1], x`mod` 3 ==0 || x `mod` 5 ==0 ]
What's the practical way to apply this in R?
Nick
Something like this?
l <- 1:1000
sum(l[l %% 3 == 0 | l %% 5 == 0])
Yes, list comprehension is possible in R:
sum((1:1000)[(1:1000 %% 3) == 0 | (1:1000 %% 5) == 0])
And (kind of) the for-comprehension of Scala:
for(i in {x <- 1:100;x[x%%2 == 0]})print(i)
This is many years later but there are three list comprehension packages now on CRAN. Each has slightly different syntax. In alphabetical order:
library(comprehenr)
sum(to_vec(for(x in 1:1000) if (x %% 3 == 0 | x %% 5 == 0) x))
## [1] 234168
library(eList)
Sum(for(x in 1:1000) if (x %% 3 == 0 | x %% 5 == 0) x else 0)
## [1] 234168
library(listcompr)
sum(gen.vector(x, x = 1:1000, x %% 3 == 0 | x %% 5 == 0))
## [1] 234168
In addition the following is on github only.
# devtools::install_github("mailund/lc")
library(lc)
sum(unlist(lc(x, x = seq(1000), x %% 3 == 0 | x %% 5 == 0)))
## [1] 234168
The foreach package by Revolution Analytics gives us a handy interface to list comprehensions in R. https://www.r-bloggers.com/list-comprehensions-in-r/
Example
Return numbers from the list which are not equal as tuple:
Python
list_a = [1, 2, 3]
list_b = [2, 7]
different_num = [(a, b) for a in list_a for b in list_b if a != b]
print(different_num)
# Output:
[(1, 2), (1, 7), (2, 7), (3, 2), (3, 7)]
R
require(foreach)
list_a = c(1, 2, 3)
list_b = c(2, 7)
different_num <- foreach(a=list_a ,.combine = c ) %:% foreach(b=list_b) %:% when(a!=b) %do% c(a,b)
print(different_num)
# Output:
[[1]]
[1] 1 2
[[2]]
[1] 1 7
[[3]]
[1] 2 7
[[4]]
[1] 3 2
[[5]]
[1] 3 7
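For this particular example, base R can produce the same pairs without any package by building the full grid and filtering it. A sketch under the assumption that a data frame of pairs is an acceptable substitute for a list of tuples:

```r
list_a <- c(1, 2, 3)
list_b <- c(2, 7)

# expand.grid() varies its first argument fastest, so b is listed first to
# reproduce the Python ordering (outer loop over a, inner loop over b).
pairs <- expand.grid(b = list_b, a = list_a)
pairs <- pairs[pairs$a != pairs$b, c("a", "b")]
pairs
```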
EDIT:
The foreach package is very slow for certain tasks.
A faster list comprehension implementation is given at List comprehensions for R
. <<- structure(NA, class="comprehension")
comprehend <- function(expr, vars, seqs, guard, comprehension=list()){
if(length(vars)==0){ # base case of recursion
if(eval(guard)) comprehension[[length(comprehension)+1]] <- eval(expr)
} else {
for(elt in eval(seqs[[1]])){
assign(vars[1], elt, inherits=TRUE)
comprehension <- comprehend(expr, vars[-1], seqs[-1], guard,
comprehension)
}
}
comprehension
}
## List comprehensions specified by close approximation to set-builder notation:
##
## { x+y | 0<x<9, 0<y<x, x*y<30 } ---> .[ x+y ~ {x<-0:9; y<-0:x} | x*y<30 ]
##
"[.comprehension" <- function(x, f,rectangularizing=T){
f <- substitute(f)
## First, we pluck out the optional guard, if it is present:
if(is.call(f) && is.call(f[[3]]) && f[[3]][[1]]=='|'){
guard <- f[[3]][[3]]
f[[3]] <- f[[3]][[2]]
} else {
guard <- TRUE
}
## To allow omission of braces around a lone comprehension generator,
## as in 'expr ~ var <- seq' we make allowances for two shapes of f:
##
## (1) (`<-` (`~` expr
## var)
## seq)
## and
##
## (2) (`~` expr
## (`{` (`<-` var1 seq1)
## (`<-` var2 seq2)
## ...
## (`<-` varN seqN)))
##
## In the former case, we set gens <- list(var <- seq), unifying the
## treatment of both shapes under the latter, more general one.
syntax.error <- "Comprehension expects 'expr ~ {x1 <- seq1; ... ; xN <- seqN}'."
if(!is.call(f) || (f[[1]]!='<-' && f[[1]]!='~'))
stop(syntax.error)
if(is(f,'<-')){ # (1)
lhs <- f[[2]]
if(!is.call(lhs) || lhs[[1]] != '~')
stop(syntax.error)
expr <- lhs[[2]]
var <- as.character(lhs[[3]])
seq <- f[[3]]
gens <- list(call('<-', var, seq))
} else { # (2)
expr <- f[[2]]
gens <- as.list(f[[3]])[-1]
if(any(lapply(gens, class) != '<-'))
stop(syntax.error)
}
## Fill list comprehension .LC
vars <- as.character(lapply(gens, function(g) g[[2]]))
seqs <- lapply(gens, function(g) g[[3]])
.LC <- comprehend(expr, vars, seqs, guard)
## Provided the result is rectangular, convert it to a vector or array
if(!rectangularizing) return(.LC)
tryCatch({
if(!length(.LC))
return(.LC)
dim1 <- dim(.LC[[1]])
if(is.null(dim1)){
lengths <- sapply(.LC, length)
if(all(lengths == lengths[1])){ # rectangular
.LC <- unlist(.LC)
if(lengths[1] > 1) # matrix
dim(.LC) <- c(lengths[1], length(lengths))
} else { # ragged
# leave .LC as a list
}
} else { # elements of .LC have dimension
dim <- c(dim1, length(.LC))
.LC <- unlist(.LC)
dim(.LC) <- dim
}
return(.LC)
}, error = function(err) {
return(.LC)
})
}
This implementation is faster than foreach; it allows nested comprehensions, multiple parameters and parameter scoping.
N <- list(10,20)
.[.[c(x,y,z)~{x <- 2:n;y <- x:n;z <- y:n} | {x^2+y^2==z^2 & z<15}]~{n <- N}]
[[1]]
[[1]][[1]]
[1] 3 4 5
[[1]][[2]]
[1] 6 8 10
[[2]]
[[2]][[1]]
[1] 3 4 5
[[2]][[2]]
[1] 5 12 13
[[2]][[3]]
[1] 6 8 10
Another way
sum((l <- 1:1000)[l %% 3 == 0 | l %% 5 == 0])
I hope it's okay to self-promote my package listcompr which implements a list comprehension syntax for R.
The example from the question can be solved in the following way:
library(listcompr)
sum(gen.vector(x, x = 1:1000, x %% 3 == 0 || x %% 5 == 0))
## Returns: 234168
As listcompr does a row-wise (and not a vector-wise) evaluation of the conditions, it makes no difference whether || or | is used as the logical operator. It accepts arbitrarily many arguments: first, a base expression which is transformed into the list or vector entries; next, arbitrarily many arguments which specify the variable ranges and the conditions.
More examples can be found on the readme page on the github repository of listcompr: https://github.com/patrickroocks/listcompr
For a strict mapping from Python to R, this might be the most direct equivalence:
Python:
sum([x for x in range(1000) if x % 3== 0 or x % 5== 0])
R:
sum((x <- 0:999)[x %% 3 == 0 | x %% 5 == 0])
One important difference: the R version works like Python 2 where the x variable is globally scoped outside of the expression. (I call it an "expression" here since R does not have the notion of "list comprehension".) In Python 3, the iterator is restricted to the local scope of the list comprehension. In other words:
In R (as in Python 2), the x variable persists after the expression. If it existed before the expression, then its value is changed to the final value of the expression.
In Python 3, the x variable exists only within the list comprehension. If there was an x variable created before the list comprehension, the list comprehension does not change it at all.
This list comprehension of the form:
[item for item in list if test]
is pretty straightforward with boolean indexing in R. But for more complex expressions, like implementing vector rescaling (I know this can be done with scales package too), in Python it's easy:
x = [1, 3, 5, 7, 9, 11] # -> [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
[(xi - min(x))/(max(x) - min(x)) for xi in x]
But in R this is the best I could come up with. Would love to know if there's something better:
sapply(x, function(xi, mn, mx) {(xi-mn)/(mx-mn)}, mn = min(x), mx = max(x))
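Because arithmetic in R is already element-wise, the rescaling does not need sapply at all. A minimal sketch using the same x as the Python example:

```r
x <- c(1, 3, 5, 7, 9, 11)

# Arithmetic on a vector is applied to every element, so this single
# expression plays the role of the Python comprehension.
rescaled <- (x - min(x)) / (max(x) - min(x))
rescaled
```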
You could convert a sequence of random numbers to a binary sequence as follows:
x=runif(1000)
y=NULL
for (i in x){if (i>.5){y<-c(y,1)}else{y=c(y,-1)}}
this could be generalized to operate on any list to another list based on:
x = [item for item in x if test == True]
where the test could use the else statement to not append the list y.
For the problem at hand:
x <- 0:999
y <- NULL
for (i in x){ if (i %% 3 == 0 | i %% 5 == 0){ y <- c(y, i) }}
sum( y )
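Growing y inside a loop copies the vector on every append. The same two tasks can be sketched without a loop using ifelse() for the mapping and Filter() for the selection; the seed below is an arbitrary choice made only so the run is reproducible.

```r
set.seed(1)   # arbitrary seed, an assumption for reproducibility only
x <- runif(1000)

# Map every element to 1 or -1 in one vectorized call
y <- ifelse(x > 0.5, 1, -1)

# Keep the elements that pass the test, then sum them
n <- 0:999
total <- sum(Filter(function(i) i %% 3 == 0 || i %% 5 == 0, n))
total
```

Filter() is the closest base R analogue of the `[item for item in x if test]` form, while logical indexing as in the earlier answers is usually the faster choice for plain vectors.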

How could I make this R snippet faster and more R-ish?

Coming from various other languages, I find R powerful and intuitive, but I am not thrilled with its performance. So I decided to try to improve some snippet I wrote and learn how to code better in R.
Here's a function I wrote, trying to determine if a vector is binary-valued (two distinct values or just one value) or not:
isBinaryVector <- function(v) {
if (length(v) == 0) {
return (c(0, 1))
}
a <- v[1]
b <- a
lapply(v, function(x) { if (x != a && x != b) {if (a != b) { return (c()) } else { b = x }}})
if (a < b) {
return (c(a, b))
} else {
return (c(b, a))
}
}
EDIT: This function is expected to look through a vector and return c() if it is not binary-valued, or c(a, b) if it is, a being the smaller value and b the larger one (if a == b, then just c(a, a)). E.g., for
A B C
1 1 1 0
2 2 2 0
3 3 1 0
I will lapply this isBinaryVector and get:
$A
[1] 1 1
$B
[1] 1 1
$C
[1] 0 0
The time it took on a moderate sized dataset (about 1800 * 3500, 2/3 of them are binary-valued) is about 15 seconds. The set contains only floating-point numbers.
Is there any way I could do this faster?
Thanks for any inputs!
You are essentially trying to write a function that returns TRUE if a vector has exactly two unique values, and FALSE otherwise.
Try this:
> dat <- data.frame(
+ A = 1:3,
+ B = c(1, 2, 1),
+ C = 0
+ )
>
> sapply(dat, function(x)length(unique(x))==2)
A B C
FALSE TRUE FALSE
Next, you want to get the min and max value. The function range does this. So:
> sapply(dat, range)
A B C
[1,] 1 1 0
[2,] 3 2 0
And there you have all the ingredients to make a small function that is easy to understand and should be extremely quick, even on large amounts of data:
isBinary <- function(x)length(unique(x))==2
binaryValues <- function(x){
if(isBinary(x)) range(x) else NA
}
sapply(dat, binaryValues)
$A
[1] NA
$B
[1] 1 2
$C
[1] NA
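It is also worth noting why the original isBinaryVector returns c(a, a) for every column: inside the anonymous function passed to lapply, `b = x` creates a local variable rather than changing the outer b, and `return(c())` only returns from the anonymous function. A sketch that follows the contract stated in the question (one or two distinct values count as binary) could look like this; `binaryRange` is a hypothetical name, and the data is assumed to contain no NA values.

```r
# Hypothetical rewrite following the question's contract:
# c(0, 1) for an empty vector, c(a, b) with a <= b for one or two distinct
# values, and NULL (the same as c()) otherwise.
binaryRange <- function(v) {
  if (length(v) == 0) return(c(0, 1))
  u <- unique(v)
  if (length(u) > 2) return(NULL)
  range(u)
}

dat <- data.frame(A = 1:3, B = c(1, 2, 1), C = 0)
res <- lapply(dat, binaryRange)
res
```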
This function returns true or false for vectors (or columns of a data frame):
is.binary <- function(v) {
x <- unique(v)
length(x) - sum(is.na(x)) == 2L
}
Also take a look at this post
I'd use something like this to get the column indices:
bivalued <- apply(my.data.frame, 2, is.binary)
nominal <- my.data.frame[,!bivalued]
binary <- my.data.frame[,bivalued]
Sample data:
my.data.frame <- data.frame(c(0,1), rnorm(100), c(5, 19), letters[1:5], c('a', 'b'))
> apply(my.data.frame, 2, is.binary)
c.0..1. rnorm.100. c.5..19. letters.1.5. c..a....b..
TRUE FALSE TRUE FALSE TRUE
