why are these memoised functions different?


I see that if I use memoise on a function in two different ways, I get two different behaviours, and I'd like to understand why.
# Non Memoised function
fib <- function(n) {
  if (n < 2) return(1)
  fib(n - 2) + fib(n - 1)
}
system.time(fib(23))
system.time(fib(24))
library(memoise)
# Memoisation strategy 1
fib_fast <- memoise(function(n) {
  if (n < 2) return(1)
  fib_fast(n - 2) + fib_fast(n - 1)
})
system.time(fib_fast(23))
system.time(fib_fast(24))
# Memoisation strategy 2
fib_not_as_fast <- memoise(fib)
system.time(fib_not_as_fast(23))
system.time(fib_not_as_fast(24))
Strategy 1 is really fast, as it reuses the recursive results, whereas strategy 2 is only fast if the exact input has been seen before.
Can someone explain to me why this is?

I think the reason is simple. In the slow case, only the wrapper fib_not_as_fast is memoised. Inside it, the recursion calls fib, which is not memoised. In more detail: when you calculate fib_not_as_fast(24), the body evaluates fib(22) + fib(23), and neither of these calls is memoised.
In fib_fast, however, you use the memoised version also in the recursion. So, in this case, fib_fast(24) needs to evaluate fib_fast(22) + fib_fast(23). Both these function calls have already happened, when you calculated fib_fast(23) and are thus memoised.
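What does work is memoising a function after it has been defined, as long as the memoised version is bound to the same name. Redefining fib() as fib <- memoise(fib) works, because the recursive calls inside the body look up the name fib at call time and now find the memoised version.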
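To see the difference without timing anything, here is a minimal sketch that counts how often the function body actually runs. simple_memoise is a hypothetical single-argument stand-in for memoise(), written only for illustration; it is not how the memoise package is implemented.

```r
# Hypothetical stand-in for memoise::memoise(), caching on one argument.
simple_memoise <- function(f) {
  force(f)                 # capture the function as it is right now
  cache <- new.env()
  function(n) {
    key <- as.character(n)
    if (is.null(cache[[key]])) cache[[key]] <- f(n)
    cache[[key]]
  }
}

calls <- 0                 # counts executions of the function body
fib <- function(n) {
  calls <<- calls + 1
  if (n < 2) return(1)
  fib(n - 2) + fib(n - 1)
}

# Strategy 2: only the outer call is cached; the recursion still
# resolves the name fib to the plain, uncached function.
wrapped <- simple_memoise(fib)
wrapped(15)
calls_wrapped <- calls     # 1973

# Rebinding the name: recursive calls now resolve to the cached
# version, so each n from 0 to 15 is computed exactly once.
calls <- 0
fib <- simple_memoise(fib)
fib(15)
calls_rebound <- calls     # 16
```

The only difference between the two cases is which function the name fib refers to when the recursive calls are made.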

Related

Why does this function works?

With this function you can calculate the Fibonacci sequence recursively, but I am not sure why it works. I marked the position where I struggled. Can someone explain this code to me?
fib <- function(n){
  if (n == 0) return(0)
  if (n == 1) return(1)
  seq <- integer(n) # at this point i didnt understand much at all
  seq[1:2] <- 1
  calc <- function(n) {
    if (seq[n] != 0) return(seq[n])
    seq[n] <<- calc(n-1) + calc(n-2)
    seq[n]
  }
  calc(n)
}

In the course of the recursive function evaluation, fib() ends up getting called many times with the same n. One way to make this computation faster is to use memoization, which saves a record of values for which the function has previously been called, and the return values. Using @AnoushArivanR's fib function:
system.time(fib(30))
## user system elapsed
## 3.987 0.000 3.987
library(memoise)
fib <- memoise(fib)
system.time(fib(30))
## user system elapsed
## 0.004 0.000 0.004
In fact, now that I look at your code above, I believe it's doing exactly this — but it's definitely hard to understand! (Your question might have been better received if you explained that this was what you're trying to do ...)

Note
This implementation of fib function has been cited from Mastering Software Development in R by Dr. Roger D. Peng who taught me so much and I am forever grateful to him.
As mentioned above, this code is hard to read and more complicated than it needs to be. Here is a simpler version. We first check that n is not less than 1. Then, as the first two elements of the sequence are needed to start the series (each element being the sum of the two previous elements), we set them to 0 and 1 for n == 1 and n == 2 respectively. Then we use recursion, a technique in which a function calls itself from its own body, creating a series of repeated computations until the base cases are reached. For example, for n == 3 the function calls fib(1) and fib(2), both of which are already set. For fib(4) the function calls both fib(3) and fib(2): fib(2) is already set, and fib(3) is calculated by summing fib(2) and fib(1), and so on.
I hope this explanation has helped you get your mind around the idea.
fib <- function(n){
  stopifnot(n > 0)
  if(n == 1) {
    return(0)
  } else if(n == 2) {
    return(1)
  } else {
    fib(n - 1) + fib(n - 2)
  }
}
fib(7)
## [1] 8
But some of the calculations are computed more than once (for example, both fib(6) and fib(5) calculate fib(4)), so the function gets slower as n grows. To avoid this, your code saves every fib(n) result in a vector called seq, which starts out filled with zeros. Before any computation takes place, it checks whether seq[n] has already been computed. If so, the stored value is used; if not, it is computed and stored. This technique is called memoization. Whenever a new seq[n] is calculated it is written to the nth element of seq, and for this we use <<-, the super-assignment operator, because we are modifying an object in the enclosing environment of calc rather than a local variable.
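The difference between <- and <<- inside a nested function can be checked with a small sketch. The names make_store, set_local and set_shared are hypothetical and exist only for this illustration.

```r
make_store <- function() {
  seq <- integer(3)   # lives in make_store's environment
  # <- creates a modified local copy; the outer seq is untouched
  set_local  <- function(i, v) { seq[i] <- v;  invisible(NULL) }
  # <<- modifies the seq found in the enclosing environment
  set_shared <- function(i, v) { seq[i] <<- v; invisible(NULL) }
  list(set_local = set_local, set_shared = set_shared,
       get = function() seq)
}

store <- make_store()
store$set_local(1, 10L)
after_local <- store$get()    # 0 0 0
store$set_shared(2, 20L)
after_shared <- store$get()   # 0 20 0
```

This is why calc in the question has to use <<-: with a plain <- every recursive call would fill in its own copy of seq and nothing would be remembered.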

R recursive function call that works without stating arguments to function

The following simple recursion finds duplicated elements in a vector. It's taken from chapter 2 Functional Programming in R: Advanced Statistical Programming for Data Science, Analysis and Finance by Thomas Mailund. I wonder why it works when we call rest inside the function as it is calling a function without stating arguments.
Usually this would just return the function definition, but in the recursive function we don't need to and I wondered why.
I can see how this would work if we replaced rest in the function directly with find_duplicates(x, i + 1), but I am struggling to see why it works calling just the name which the function is attached to.
E.g if we define f<- function (x) x and call f it just returns the code function (x) x.
find_duplicates <- function(x, i = 1) {
  if (i >= length(x)) return(c())
  rest <- find_duplicates(x, i + 1)
  if (x[i] == x[i + 1]) c(i, rest)
  else rest
}

rest is not a function, it's the output of the function find_duplicates given arguments x and i+1.
So indeed it's the same to type rest or find_duplicates(x, i + 1) in the if clause, they're both values, not functions.
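A quick check, using the function from the question on a made-up input vector, shows that the name find_duplicates refers to a function while the thing stored in rest (and returned at the end) is an ordinary vector:

```r
find_duplicates <- function(x, i = 1) {
  if (i >= length(x)) return(c())
  rest <- find_duplicates(x, i + 1)  # the call happens here; rest holds its result
  if (x[i] == x[i + 1]) c(i, rest)
  else rest
}

dups <- find_duplicates(c(1, 2, 2, 3, 3, 3))
dups                          # 2 4 5
is.function(find_duplicates)  # TRUE
is.function(dups)             # FALSE
```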

R. function. error

I have the following function f
f <- function (n) if (n==0) 1 else f(n - 1)(n %% 2) - f(n-1)(n+1)
I think it is defined right but I cannot calculate f(8)
f(8)
> Error in f(n - 1) : attempt to apply non-function
what do I have to change?

In R, recursion is not terribly well optimised, but one approach is to use the Recall function. It's unclear what you intend with the "double calling" syntax, e.g. f(n-1)(n+1), where f is followed by one parenthesised argument and then another. The f function isn't designed to return a function. I'm going to guess that you wanted the recurrence relation in f(n) to be:
f(n - 1)*(n %% 2) - f(n-1)*(n+1)
If that guess is correct then:
f <- function (n) if (n==0) 1 else {Recall(n - 1) *(n %% 2) - Recall(n-1)*(n+1)}
> f(8)
[1] 99225
I made my guess before the clarifying comments, but it appears I was correct in thinking you didn't realise that f(n - 1)(n %% 2) does not mean multiplication in R. Back-to-back parentheses (brackets in British English) signify function application, not multiplication. Look at ?Syntax, and for an example see ?ecdf, where ecdf(x)(n) is an acceptable nested call because ecdf returns a function as a value.
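As a small sketch of when back-to-back parentheses are valid: make_multiplier and plain are hypothetical functions defined only for this example, while ecdf is the base R function mentioned above.

```r
# A function that returns a function, so a second pair of
# parentheses applies the returned function.
make_multiplier <- function(k) function(x) k * x
product <- make_multiplier(3)(5)       # 15

# ecdf() also returns a function.
share <- ecdf(c(1, 2, 3, 4))(2)        # 0.5

# A function that returns a number cannot be "called again";
# this reproduces the kind of error seen in the question.
plain <- function(n) n + 1
err <- tryCatch(plain(1)(2), error = function(e) conditionMessage(e))
```

Here err holds the error message instead of stopping the script, because plain(1) is the number 2, and 2 is not a function.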

Lazy sequences in R

In Clojure, it's easy to create infinite sequences using the lazy sequence constructor. For example,
(def N (iterate inc 0))
returns a data object N which is equivalent to the infinite sequence
(0 1 2 3 ...)
Evaluating the value N results in an infinite loop. Evaluating (take 20 N) returns the first 20 numbers. Since the sequence is lazy, the inc function is only iterated when you ask it to. Since Clojure is homoiconic, the lazy sequence is stored recursively.
In R, is it possible to do something similar? Can you present some sample R code which produces a data object N that is equivalent to the full infinite sequence of natural numbers? Evaluating the full object N should result in a loop, but something like head(N) should return just the leading numbers.
Note: I am really more interested in lazy sequences rather than the natural numbers per se.
Edit: Here is the Clojure source for lazy-seq:
(defmacro lazy-seq
"Takes a body of expressions that returns an ISeq or nil, and yields
a Seqable object that will invoke the body only the first time seq
is called, and will cache the result and return it on all subsequent
seq calls. See also - realized?"
{:added "1.0"}
[& body]
(list 'new 'clojure.lang.LazySeq (list* '^{:once true} fn* [] body)))
I'm looking for a macro with the same functionality in R.

Alternate Implementation
Introduction
I've had occasion to be working in R more frequently since this post, so I offer an alternate base R implementation. Again, I suspect you could get much better performance by dropping down to the C extension level. Original answer follows after the break.
The first challenge in base R is the lack of a true cons (exposed at the R level, that is). R uses c for a hybrid cons/concat operation, but this does not create a linked list, but rather a new vector populated with the elements of both arguments. In particular, the length of both arguments has to be known, which is not the case with lazy sequences. Additionally, successive c operations exhibit quadratic performance rather than constant time. Now, you could use "lists" (which are really vectors, not linked-lists) of length two to emulate cons cells, but...
The second challenge is the forcing of promises in data structures. R has some lazy evaluation semantics using implicit promises, but these are second-class citizens. The function for returning an explicit promise, delay, has been deprecated in favor of the implicit delayedAssign, which is only executed for its side effects -- "Unevaluated promises should never be visible." Function arguments are implicit promises, so you can get your hands on them, but you can't place a promise into a data structure without it being forced.
CS 101
It turns out these two challenges can be solved by thinking back to Computer Science 101. Data structures can be implemented via closures.
cons <- function(h,t) function(x) if(x) h else t
first <- function(kons) kons(TRUE)
rest <- function(kons) kons(FALSE)
Now, due to R's lazy function argument semantics, our cons is already capable of lazy sequences.
fibonacci <- function(a,b) cons(a, fibonacci(b, a+b))
fibs <- fibonacci(1,1)
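The laziness can be observed directly. In this sketch the tail expression sets a flag as a side effect, purely so we can see when it gets evaluated; the flag and the variable names are assumptions made for the demonstration.

```r
cons  <- function(h, t) function(x) if (x) h else t
first <- function(kons) kons(TRUE)
rest  <- function(kons) kons(FALSE)

forced <- FALSE
cell <- cons(1, { forced <- TRUE; 2 })  # tail is passed as an unevaluated promise

before     <- forced      # FALSE: building the cell did not evaluate the tail
head_value <- first(cell) # 1; still does not touch the tail
tail_value <- rest(cell)  # 2; the tail is evaluated only now
after      <- forced      # TRUE
```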
In order to be useful though we need a suite of lazy sequence processing functions. In Clojure, the sequence processing functions that are part of the core language work naturally with lazy sequences as well. R's sequence functions, on the other hand, would not be immediately compatible. Many rely on knowing ahead of time the (finite) sequence length. Let's define a few capable of working on lazy sequences instead.
filterz <- function(pred, s) {
  if (is.null(s)) return(NULL)
  f <- first(s)
  r <- rest(s)
  if (pred(f)) cons(f, filterz(pred, r)) else filterz(pred, r)
}
take_whilez <- function(pred, s) {
  if (is.null(s) || !pred(first(s))) return(NULL)
  cons(first(s), take_whilez(pred, rest(s)))
}
reduce <- function(f, init, s) {
  r <- init
  while (!is.null(s)) {
    r <- f(r, first(s))
    s <- rest(s)
  }
  return(r)
}
Let's use what we've created to sum all of the even Fibonacci numbers less than 4 million (Project Euler #2):
reduce(`+`, 0, filterz(function(x) x %% 2 == 0, take_whilez(function(x) x < 4e6, fibs)))
# [1] 4613732
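The question asked for the natural numbers with something like head(N). With the same closure-based cons, that might be sketched as follows; naturals and take_n are hypothetical helpers that are not part of the answer above, and take_n is eager (it returns an ordinary vector).

```r
cons  <- function(h, t) function(x) if (x) h else t
first <- function(kons) kons(TRUE)
rest  <- function(kons) kons(FALSE)

# Infinite sequence k, k + 1, k + 2, ...; the tail is never built
# until rest() asks for it.
naturals <- function(k) cons(k, naturals(k + 1))

# Eagerly collect the first n elements of a lazy sequence.
take_n <- function(n, s) {
  out <- vector("list", n)
  for (i in seq_len(n)) {
    out[[i]] <- first(s)
    s <- rest(s)
  }
  unlist(out)
}

N <- naturals(0)
first_five <- take_n(5, N)   # 0 1 2 3 4
```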
Original Answer
I'm very rusty with R, but since (1) I'm familiar with Clojure, and (2) I don't think you are getting your point across to the R users, I'll attempt a sketch based on my illustration of how Clojure lazy sequences work. This is for example purposes only, and not tuned for performance in any way. It would probably be better implemented as a C extension (if one does not exist already).
A lazy sequence has the rest of the sequence generating calculation in a thunk. It is not immediately called. As each element (or chunk of elements as the case may be) is requested, a call to the next thunk is made to retrieve the value(s). That thunk may create another thunk to represent the tail of the sequence if it continues. The magic is that (1) these special thunks implement a sequence interface and can transparently be used as such and (2) each thunk is only called once -- its value is cached -- so the realized portion is a sequence of values.
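The "called once, then cached" behaviour of a thunk can be sketched in a few lines of base R, relying on the fact that a function argument is a promise that R evaluates at most once. delay is a hypothetical helper name, and the runs counter exists only to make the evaluation visible.

```r
# Wrap an expression in a zero-argument function. The promise behind
# expr is evaluated on the first call and its value is reused afterwards.
delay <- function(expr) function() expr

runs <- 0
thunk <- delay({ runs <- runs + 1; runs * 100 })

runs_before <- runs   # 0: nothing has been evaluated yet
a <- thunk()          # 100: the expression runs now
b <- thunk()          # 100: cached value, the expression does not run again
runs_after <- runs    # 1
```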
Standard examples first
Natural numbers
numbers <- function(x) as.LazySeq(c(x, numbers(x+1)))
nums <- numbers(1)
take(10,nums)
#=> [1] 1 2 3 4 5 6 7 8 9 10
# Slow, but does not overflow the stack (constant stack consumption)
sum(take(100000,nums))
#=> [1] 5000050000
Fibonacci sequence
fibonacci <- function(a,b) {
as.LazySeq(c(a, fibonacci(b, a+b)))}
fibs <- fibonacci(1,1)
take(10, fibs)
#=> [1] 1 1 2 3 5 8 13 21 34 55
nth(fibs, 20)
#=> [1] 6765
Followed by naive R implementation
Lazy sequence class
is.LazySeq <- function(x) inherits(x, "LazySeq")
as.LazySeq <- function(s) {
  cache <- NULL
  value <- function() {
    if (is.null(cache)) {
      cache <<- force(s)
      while (is.LazySeq(cache)) cache <<- cache()
    }
    cache
  }
  structure(value, class="LazySeq")
}
Some generic sequence methods with implementations for LazySeq
first <- function(x) UseMethod("first", x)
rest <- function(x) UseMethod("rest", x)
first.default <- function(s) s[1]
rest.default <- function(s) s[-1]
first.LazySeq <- function(s) s()[[1]]
rest.LazySeq <- function(s) s()[[-1]]
nth <- function(s, n) {
  while (n > 1) {
    n <- n - 1
    s <- rest(s)
  }
  first(s)
}
# Note: Clojure's take is also lazy, this one is "eager"
take <- function(n, s) {
  r <- NULL
  while (n > 0) {
    n <- n - 1
    r <- c(r, first(s))
    s <- rest(s)
  }
  r
}

The iterators library might be able to achieve what you're looking for:
library(iterators)
i <- icount()
nextElem(i)
# 1
nextElem(i)
# 2
You can keep calling nextElem forever.

Since R works best on vectors, it seems like one often wants an iterator that returns a vector, for instance
library(iterators)
ichunk <- function(n, ..., chunkSize) {
  it <- icount(n) # FIXME: scale by chunkSize
  chunk <- seq_len(chunkSize)
  structure(list(nextElem=function() {
    (nextElem(it) - 1L) * chunkSize + chunk
  }), class=c("abstractiter", "iter"))
}
with typical use being with chunkSize in the millions range. At least this speeds up the approach to forever.

You didn't state your intended goal, so I'll point out that, in any language,
x <- 1
j <- 1
while(TRUE) { x[j+1] <- x[j]+1 ; j<-j+1}
will give you an infinite sequence. What do you actually want to do with an iterator?

Write a function that takes two arguments, x and n, and returns h(x, n). using FOR loop

I am trying to write a function in R that takes two arguments, x and n, and returns h(x, n); x=1
Does anyone know how to do this using a for loop?
The function I am working with is:
x^0 + x^1 + x^2 + ... + x^n
I have been working for a while on this and am not sure if I am doing this correctly.
Can anyone give me some guidance on how to do this problem.
Here is what I have..
n = seq(1,6, by = 1)
x = 1
h = function (x,n){
for (i in 0:n){
for( i in 1:n){
sum = sum +x^i
{
}}

h <- function(x, n) sum(x^(0:n))
h( 1, 6 )
Loops are best avoided in R. First, you can use vectors in many situations; then, learn to use apply and friends (sapply, lapply etc.).
Do yourself a favor and use <- instead of = in assignments. It pays off in the long run.
As in other programming languages, there is no need to declare the variables outside of the function (and anyway, since n is an argument to your function, your first assignment has no effect on the function).
Don't use seq() where a simple k:n will do.
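Since the question explicitly asks for a for loop, here is a minimal sketch of that version as well. h_loop is a hypothetical name chosen to avoid clashing with the vectorised h, and it assumes n is a non-negative whole number.

```r
h_loop <- function(x, n) {
  total <- 0                 # running sum, initialised inside the function
  for (i in 0:n) {           # one loop over the exponents 0, 1, ..., n
    total <- total + x^i
  }
  total                      # return the accumulated sum
}

h_loop(1, 6)   # 7
h_loop(2, 3)   # 15
```

Compared with the attempt in the question, the accumulator is initialised before the loop, there is a single loop rather than two nested ones using the same index, and the result is returned at the end.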
