I am looking for an efficient way to nest the same function in R until a condition is met. I hope the following example illustrates my problem clearly.
Consider the function
f(x) = x^2 + 1, with x > 1.
Denote
f^{(k)}(x) = f(f(f(...f(x)))),
where the function f is evaluated k times within itself. Let M > 0, with M given.
Is there any efficient routine in R to determine the minimum value of k such that f^{(k)}(2) > M?
Thank you.
Nothing special for that. Just use a loop:
function(x, M) {
k <- 0
repeat {
x <- x^2 + 1
k <- k + 1
if (x > M)
break
}
k
}
Not particularly efficient, but often the overhead of evaluating f will be greater than the overhead of the loop. If that's not the case (and it might not be for this particular f), I'd suggest doing the equivalent thing in C or C++ (perhaps using Rcpp).
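The loop above hard-codes f. As a minimal sketch, the same idea can be wrapped in a helper that accepts any function. The name `min_iterations` and the `max_iter` safety cap are assumptions made for illustration and are not part of the original answer.

```r
# Hypothetical helper: smallest k >= 1 such that applying f k times to x exceeds M.
# max_iter is an assumed safety cap so a non-growing f cannot loop forever.
min_iterations <- function(f, x, M, max_iter = 10000L) {
  for (k in seq_len(max_iter)) {
    x <- f(x)
    if (x > M) return(k)
  }
  stop("threshold not exceeded within max_iter iterations")
}

f <- function(x) x^2 + 1
min_iterations(f, 2, 700)  # 2 -> 5 -> 26 -> 677 -> 458330, so k = 4
```

Like the loop it is based on, this applies f at least once, so it never returns 0.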
This would be the recursive approach:
# 2^2 + 1 == 5
# 5^2 + 1 == 26
# 26^2 + 1 == 677
f <- function(x,M,k=0){
if(x <= M) k <- f(x^2 + 1,M=M,k+1)
return(k)
}
f(2,3) # 1
f(2,10) # 2
f(2,50) # 3
f(2,700) # 4
Related
I just saw a YouTube video from Numberphile on the Yellowstone sequence (A098550). It's based on a sequence starting with 1 and 2, with subsequent terms generated by the rules:
no repeated terms
always pick the lowest integer
gcd(a_n, a_(n-1)) = 1
gcd(a_n, a_(n-2)) > 1
The first 15 terms would be: 1 2 3 4 9 8 15 14 5 6 25 12 35 16 7
A Q&D approach in R could be something like this, but understandably, it becomes very slow when generating longer sequences. It also makes an assumption about the highest number that can occur within the sequence (for reference: the sequence of 10,000 items never goes higher than 5000).
What can we do to make this faster?
library(DescTools)
a <- c(1, 2, 3)
p <- length(a)
# all natural numbers
all_ints <- 1:5000
for (n in p:1000) {
# rule 1 - remove all number that are in sequence already
next_a_set <- all_ints[which(!all_ints %in% a)]
# rule 3 - search the remaining set for numbers that have gcd == 1
next_a_option <- next_a_set[which(
sapply(
next_a_set,
function(x) GCD(a[n], x)
) == 1
)]
# rule 4 - search the remaining number for gcd > 1
next_a <- next_a_option[which(
sapply(
next_a_option,
function(x) GCD(a[n - 1], x)
) > 1
)]
# select the lowest
a <- c(a, min(next_a))
n <- n + 1
}
Here's a version that's about 20 times faster than yours, with comments about the changes:
# Set a to the final length from the start.
a <- c(1, 2, 3, rep(NA, 997))
p <- 3
# Define a vectorized gcd() function. We'll be testing
# lots of gcds at once. This uses the Euclidean algorithm.
gcd <- function(x, y) { # vectorized gcd
while (any(y != 0)) {
x1 <- ifelse(y == 0, x, y)
y <- ifelse(y == 0, 0, x %% y)
x <- x1
}
x
}
# Guess at a reasonably large vector to work from,
# but we'll grow it later if not big enough.
allnum <- 1:1000
# Keep a logical record of what has been used
used <- c(rep(TRUE, 3), rep(FALSE, length(allnum) - 3))
for (n in p:1000) {
# rule 1 - remove all number that are in sequence already
# nothing to do -- used already records that.
repeat {
# rule 3 - search the remaining set for numbers that have gcd == 1
keep <- !used & gcd(a[n], allnum) == 1
# rule 4 - search the remaining number for gcd > 1
keep <- keep & gcd(a[n-1], allnum) > 1
# If we found anything, break out of this loop
if (any(keep))
break
# Otherwise, make the set of possible values twice as big,
# and try again
allnum <- seq_len(2*length(allnum))
used <- c(used, rep(FALSE, length(used)))
}
# select the lowest
newval <- which.max(keep)
# Assign into the appropriate place
a[n+1] <- newval
# Record that it has been used
used[newval] <- TRUE
}
If you profile it, you'll see it spends most of its time in the gcd() function. You could probably make that a lot faster by redoing it in C or C++.
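When rewriting the hot path, it may help to have a quick sanity check on the output. The sketch below is one possible validator; the names `gcd1` and `check_yellowstone` are hypothetical. It tests the no-repeats rule and the two gcd rules, but it does not verify that each term is the lowest admissible integer.

```r
# Scalar Euclidean gcd, used only for checking.
gcd1 <- function(x, y) {
  while (y != 0) {
    r <- x %% y
    x <- y
    y <- r
  }
  x
}

# Hypothetical validator: no repeated terms, gcd with the previous term is 1,
# and gcd with the term two back is greater than 1 (from the 4th term onwards).
check_yellowstone <- function(a) {
  if (anyDuplicated(a) > 0) return(FALSE)
  if (length(a) < 4) return(TRUE)
  for (i in 4:length(a)) {
    if (gcd1(a[i], a[i - 1]) != 1) return(FALSE)
    if (gcd1(a[i], a[i - 2]) <= 1) return(FALSE)
  }
  TRUE
}

first15 <- c(1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7)
check_yellowstone(first15)  # TRUE
```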
The biggest change here is pre-allocation and restricting the search to numbers that have not yet been used.
library(numbers)
N <- 5e3
a <- integer(N)
a[1:3] <- 1:3
b <- logical(N) # which numbers have been used already?
b[1:3] <- TRUE
NN <- 1:N
system.time({
for (n in 4:N) {
a1 <- a[n - 1L]
a2 <- a[n - 2L]
for (k in NN[!b]) {
if (GCD(k, a1) == 1L && GCD(k, a2) > 1L) {
a[n] <- k
b[k] <- TRUE
break
}
}
if (!a[n]) {
a <- a[1:(n - 1L)]
break
}
}
})
#> user system elapsed
#> 1.28 0.00 1.28
length(a)
#> [1] 1137
For a fast C++ algorithm, see here.
I'm given a question in R language to find the 30th term of the recurrence relation x(n) = 2*x(n-1) - x(n-2), where x(1) = 0 and x(2) = 1. I know the answer is 29 from mathematical deduction. But as a newbie to R, I'm slightly confused by how to make things work here. The following is my code:
loop <- function(n){
a <- 0
b <- 1
for (i in 1:30){
a <- b
b <- 2*b - a
}
return(a)
}
loop(30)
I'm returned 1 as a result, which is way off.
In case you're wondering why this looks Python-ish, I've mostly only been exposed to Python programming thus far (I'm new to programming in general). I've tried to check out all the syntax in R, but I suppose my logic is quite fixed by Python. Can someone help me out in this case? In addition, does R have any resources like PythonTutor to help visualise the code execution logic?
Thank you!
I guess what you need might be something like below
loop <- function(n){
if (n<=2) return(n-1)
a <- 0
b <- 1
for (i in 3:n){
a_new <- b
b <- 2*b - a
a <- a_new
}
return(b)
}
then
> loop(30)
[1] 29
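For reference, here is a small sketch of why the loop in the question returns 1: `a` is overwritten with `b` before `b` is updated, so `2*b - a` collapses to `2*b - b`, and neither value ever changes. The names `stuck` and `old_b` are illustrative only.

```r
# Update order from the question: a is overwritten before it is used.
a <- 0; b <- 1
for (i in 1:5) {
  a <- b
  b <- 2 * b - a   # a already equals b here, so b keeps its value
}
stuck <- c(a, b)   # both values are stuck at 1

# Keeping the old value of b in a temporary preserves x(n-2).
a <- 0; b <- 1     # x(1) and x(2)
for (i in 3:7) {
  old_b <- b
  b <- 2 * b - a
  a <- old_b
}
b                  # x(7) = 6
```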
If you need a recursion version, below is one realization
loop <- function(n) {
if (n<=2) return(n-1)
2*loop(n-1)-loop(n-2)
}
which also gives
> loop(30)
[1] 29
You can solve it another couple of ways.
Solve the linear homogeneous recurrence relation, let
x(n) = r^n
plugging into the recurrence relation, you get the quadratic
r^n-2*r^(n-1)+r^(n-2) = 0
, i.e.,
r^2-2*r+1=0
, i.e.,
r = 1, 1
leading to general solution
x(n) = c1 * 1^n + c2 * n * 1^n = c1 + n * c2
and with x(1) = 0 and x(2) = 1, you get c2 = 1, c1 = -1, s.t.,
x(n) = n - 1
=> x(30) = 29
Hence, R code to compute x(n) as a function of n is trivial, as shown below:
x <- function(n) {
return (n-1)
}
x(30)
#29
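As a quick check of the derivation, the closed form can be compared against direct iteration of the recurrence. This is only a sketch; `n_max` and `x_iter` are names chosen for the example.

```r
n_max <- 30
x_iter <- numeric(n_max)
x_iter[1:2] <- c(0, 1)                 # x(1) = 0, x(2) = 1
for (n in 3:n_max) {
  x_iter[n] <- 2 * x_iter[n - 1] - x_iter[n - 2]
}
all(x_iter == seq_len(n_max) - 1)      # TRUE: x(n) = n - 1 for n = 1..30
```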
Use matrix powers (first find the following matrix A from the recurrence relation):
(The matrix A has the repeated eigenvalue 1 with algebraic multiplicity 2 but geometric multiplicity 1, so its eigenvector matrix is singular and A cannot be diagonalized. Otherwise you could use the spectral decomposition yourself for fast computation of matrix powers. Here we use the expm library instead, as shown below.)
library(expm)
A <- matrix(c(2,1,-1,0), nrow=2)
A %^% 29 %*% c(1,0) # [x(31) x(30)]T = A^29.[x(2) x(1)]T
# [,1]
# [1,] 30 # x(31)
# [2,] 29 # x(30)
# compute x(n)
x <- function(n) {
(A %^% (n-1) %*% c(1,0))[2]
}
x(30)
# 29
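If installing expm is not an option, a matrix power can also be sketched in base R with repeated squaring. The helper name `mat_pow` is an assumption for this example, and it is only meant for non-negative integer exponents.

```r
# Hypothetical helper: A^k for a square matrix A and integer k >= 0,
# using exponentiation by squaring (about log2(k) matrix products).
mat_pow <- function(A, k) {
  result <- diag(nrow(A))
  while (k > 0) {
    if (k %% 2 == 1) result <- result %*% A
    A <- A %*% A
    k <- k %/% 2
  }
  result
}

A <- matrix(c(2, 1, -1, 0), nrow = 2)
(mat_pow(A, 29) %*% c(1, 0))[2]  # x(30) = 29
```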
You're not using the variable you're iterating on in the loop, so nothing is updating.
loop <- function(n){
a <- 0
b <- 1
for (i in 1:30){
a <- b
b <- 2*i - a
}
return(a)
}
You could define a recursive function.
f <- function(x, n) {
n <- 1:n
r <- function(n) {
if (length(n) == 2) x[2]
else r({
x <<- c(x[2], 2*x[2] - x[1])
n[-1]
})
}
r(n)
}
x <- c(0, 1)
f(x, 30)
# [1] 29
I am a new R user and have very limited programming experience, hence my question and poorly written code.
I was assigned a problem where I had to use a while loop to generate the numbers of the Fibonacci sequence that are less than 4,000,000 (the Fibonacci sequence is characterized by the fact that every number after the first two is the sum of the two preceding ones).
Next, I had to compute the sum of the even numbers in the sequence that was generated.
I was successful with my response, however, I don't think the code is written very well. What could I have done better?
> x <- 0
> y <- 1
> z <- 0
if (x == 0 & y == 1) {
cat(x)
cat(" ")
cat(y)
cat(" ")
while (x < 4000000 & y < 4000000) {
x <- x + y
cat(x)
cat(" ")
if (x %% 2 == 0) {
z <- x + z
}
y <- x + y
cat(y)
cat(" ")
if (y %% 2 == 0) {
z <- y + z
}
}
}
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465
cat(z)
4613732
First of all, cat comes with a sep argument. You can do cat(x, y, sep = " ") rather than using 3 lines for that.
Secondly, when you call while (x < 4000000 & y < 4000000) note that y will always be greater than x because it is the sum of the last x and y ... so it should suffice to check for y < 4000000 here.
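Putting those two points together, one possible sketch drops the printing, checks only the larger value, and wraps the loop in a function. The name `even_fib_sum` and the `limit` argument are assumptions for illustration.

```r
# Hypothetical helper: sum of the even Fibonacci numbers below `limit`.
even_fib_sum <- function(limit) {
  x <- 0
  y <- 1
  total <- 0
  while (y < limit) {          # y is always the larger of the pair
    if (y %% 2 == 0) total <- total + y
    next_val <- x + y
    x <- y
    y <- next_val
  }
  total
}

even_fib_sum(4000000)  # 4613732
```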
For the while loop, you could also use a counter, which might be more intuitive. Indexing in R isn't that fast, though.
fib <- c(0, 1)
i <- 2
while (fib[i] < 4000000) {
fib <- c(fib, fib[i-1] + fib[i])
i <- i + 1
}
sum(fib[fib %% 2 == 0])
If you don't necessarily need the while, you could also approach it via recursion
fib <- function(x, y) {
s <- x + y
c(s, if (s < 4000000) fib(y, s))
}
f <- fib(0, 1)
sum(f[f %% 2 == 0])
First, there's no need to explicitly print everything out.
Second, it's more idiomatic in R to make a vector of the Fibonacci numbers and then sum. If you don't know an explicit closed form for the Fibonacci numbers, or if you've been told not to use this, then use a loop to create the list of Fibonacci numbers.
So to construct the list of Fibonacci numbers (two at a time) you can do
x <- 0
y <- 1
fib <- c()
while (x < 4000000 & y < 4000000){
x <- x + y
y <- x + y
fib = c(fib, x, y)
}
This will give you a vector of Fibonacci numbers, containing all those less than 4000000 and a few more (the last element is 9227465).
Then run
sum(fib[fib %% 2 == 0 & fib < 4000000])
to get the result. This returns 4613732, like your code does. The subsetting operator [], when you put a logical condition inside it, will output just those numbers which satisfy the logical condition -- in this case, that they're even and less than 4000000.
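A tiny example may make the logical subsetting clearer. The vector `v` is made up for the illustration.

```r
v <- c(1, 2, 3, 5, 8, 13)
is_even <- v %% 2 == 0        # FALSE TRUE FALSE FALSE TRUE FALSE
v[is_even]                    # 2 8
v[is_even & v < 8]            # 2 (both conditions must hold)
sum(v[is_even & v < 8])       # 2
```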
I am using the closed form of the Fibonacci sequence as found here
fib = function(n) round(((5 + sqrt(5)) / 10) * (( 1 + sqrt(5)) / 2) ** (1:n - 1))
numbers <- 2
while (max(fib(numbers)) < 4000000){ # try amount of numbers while the maximum of the sequence is less than 4000000
sequence <- fib(numbers) # the latest sequence satisfying the 4000000 condition is saved here
numbers <- numbers + 1 # increase the amount of numbers
}
total_sum <- sum(sequence[sequence%%2==0]) # summing the even numbers
This is how I would do it. First, I defined a global variable i to include the first two elements of the Fibonacci series. Then at the end, I re-assigned the global variable to its initial value (i.e. 1). If I don't do that, then when I call the function fib(0,1) again, the output is incorrect as it calls the function with the last value of i. It's also important to do return() to ensure it doesn't return anything in the else clause. If you don't specify return(), the final output will be 1, instead of the Fibonacci series.
Please note the series only goes up to the number 13 (z<14); you can change that to whatever you want. It may also be a good option to pass this limit as the third argument of the function, something like fib(0,1,14). Try it out!
i <<- 1
fib <- function(x,y){
z <- x+y
if(z<14){
if (i==1){
i <<- i+1
c(x,y,z,fib(y,z))
}
else c(z, fib(y,z))
}
else {
i <<- 1
return()
}
}
a <- fib(0,1)
a
I would like to program a function in R to allow me to decompose any integer n into n=2^j+k where j>=0 and 0<= k <2^j. I have thought about using the modulus operation, but am failing to get a k that is less than 2^j. Would anyone have any ideas? Thanks!
For values of n > 0 you could try this function:
pow2_decompose <- function(n) {
j <- trunc(log2(n))
k <- n - 2**j
return(list(j,k))
}
Example:
n <- 1289
j <- pow2_decompose(n)[[1]]
k <- pow2_decompose(n)[[2]]
cat(j,k,"\n")
#10 265
identical(2**j + k, n)
#[1] TRUE
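Since log2() works in floating point, a slightly more defensive variant could nudge j if rounding ever put it off by one, and return a named list so the function only needs to be called once. This is a sketch; the name `pow2_decompose_safe` is hypothetical.

```r
# Hypothetical variant of pow2_decompose() for a single number n >= 1.
pow2_decompose_safe <- function(n) {
  stopifnot(length(n) == 1, n >= 1)
  j <- floor(log2(n))
  # Guard against floating-point rounding in log2().
  if (2^j > n) j <- j - 1
  if (2^(j + 1) <= n) j <- j + 1
  list(j = j, k = n - 2^j)
}

d <- pow2_decompose_safe(1289)
d$j  # 10
d$k  # 265
```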
a) Create a vector X of length 20, with the kth element in X = 2k, for k=1…20. Print out the values of X.
b) Create a vector Y of length 20, with all elements in Y equal to 0. Print out the values of Y.
c) Using a for loop, reassign the value of the k-th element in Y, for k = 1…20. When k < 12, the k-th element of Y is reassigned as the cosine of k. When k ≥ 12, the k-th element of Y is reassigned as the value of the integral of sqrt(t) dt from 0 to k.
For the first two questions, it is simple.
> x1 <- seq(1,20,by=1)
> x <- 2 * x1
> x
[1] 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40
> y <- rep(0,20)
> y
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
I got stuck on the last one:
t <- function(i) sqrt(i)
for (i in 1:20) {
if (i < 12) {
y[i] <- cos(i)
}
else if (i >= 12) {
y[i] <- integral(t, lower= 0, Upper = 20)
}
}
y // print new y
Any suggestions? thanks.
What may help is that the command to calculate a one-dimensional integral is integrate not integral.
You have successfully completed the first two, so I'll demonstrate a different way of getting those vectors:
x <- 2 * seq_len(20)
y <- double(length = 20)
As for your function, you have the right idea, but you need to clean up your syntax a bit. For example, you may need to double-check your braces (using a set style like Hadley Wickham's will help you prevent syntax errors and make the code more readable), you don't need the "if" in the else, you need to read up on integrate and see what its inputs, and importantly its outputs are (and which of them you need and how to extract it), and lastly, you need to return a value from your function. Hopefully, that's enough to help you work it out on your own. Good Luck!
Update
Slightly different function to demonstrate coding style and some best practices with loops
Given that a working answer has been posted, this is what I did when looking at your question. I think it is worth posting, as it is a good habit to 1) pre-allocate answers, 2) prevent confusion about scope by not re-using the input variable name as an output, and 3) use the seq_len and seq_along constructions for for loops, per R Inferno (pdf), which is required reading, in my opinion:
tf <- function(y){
z <- double(length = length(y))
for (k in seq_along(y)) {
if (k < 12) {
z[k] <- cos(k)
} else {
z[k] <- integrate(f = sqrt, lower = 0, upper = k)$value
}
}
return(z)
}
Which returns:
> tf(y)
[1] 0.540302306 -0.416146837 -0.989992497 -0.653643621 0.283662185 0.960170287 0.753902254
[8] -0.145500034 -0.911130262 -0.839071529 0.004425698 27.712816032 31.248114562 34.922139530
[15] 38.729837810 42.666671456 46.728535669 50.911693960 55.212726149 59.628486093
To be honest, you almost have it ready, and it is good that you have shown some code here:
y <- rep(0,20) #y vector from question 2
for ( k in 1:20) { #start the loop
if (k < 12) { #if k less than 12
y[k] <- cos(k) #calculate cosine
} else if( k >= 12) { #else if k greater or equal to 12
y[k] <- integrate( sqrt, lower=0, upper=k)$value #see below for explanation
}
}
print(y) #prints y
> print(y)
[1] 0.540302306 -0.416146837 -0.989992497 -0.653643621 0.283662185 0.960170287 0.753902254 -0.145500034 -0.911130262 -0.839071529 0.004425698
[12] 27.712816032 31.248114562 34.922139530 38.729837810 42.666671456 46.728535669 50.911693960 55.212726149 59.628486093
First of all, stats::integrate is the function you need to calculate the integral:
integrate( sqrt, lower=0, upper=2)$value
The first argument is a function which in your case is sqrt. sqrt is defined already in R so there is no need to define it yourself explicitly as t <- function(i) sqrt(i)
The other two arguments as you correctly set in your code are lower and upper.
The function integrate( sqrt, lower=0, upper=2) will return:
1.885618 with absolute error < 0.00022
and that is why you need integrate( sqrt, lower=0, upper=2)$value to only extract the value.
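As a sanity check, the numeric result can be compared with the analytic value, since the antiderivative of sqrt(t) is (2/3) * t^(3/2). This is only a sketch; `res` and `exact` are names chosen for the example, and the tolerance is an arbitrary choice.

```r
res <- integrate(sqrt, lower = 0, upper = 12)
exact <- 2 / 3 * 12^(3 / 2)   # analytic value of the integral from 0 to 12

res$value                     # numeric estimate, about 27.7128
exact                         # analytic value, about 27.7128
isTRUE(all.equal(res$value, exact, tolerance = 1e-4))  # TRUE
```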
Type ?integrate in your console to see the documentation which will help you a lot I think.