Problem
Find the sum of all numbers below 1000 that are divisible by 3 or 5.
One solution I created:
x <- c(1:999)
values <- x[x %% 3 == 0 | x %% 5 == 0]
sum(values)
Second solution I can't get to work and need help with. I've pasted it below.
I'm trying to use a loop (here, I use while() and after this I'll try for()). I am still struggling with keeping references to indexes (locations in a vector) separate from values/observations within vectors. Loops seem to make it more challenging for me to distinguish the two.
Why does this not produce the answer to Euler #1?
x <- 0
i <- 1
while (i < 100) {
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- c(x, i)
}
i <- i + 1
}
sum(x)
And in words, line by line this is what I understand is happening:
x gets value 0
i gets value 1
while object i's value (not the index #) is < 1000
if i is divisible by 3 or 5
add that number i to the vector x
add 1 to i (in order to keep the loop going to the defined limit of 1e3)
sum all items in vector x
I am guessing x[i] <- c(x, i) is not the right way to add an element to vector x. How do I fix this and what else is not accurate?
First, your loop runs until i < 100, not i < 1000.
Second, replace x[i] <- c(x, i) with x <- c(x, i) to add an element to the vector.
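Putting both fixes together, the loop from the question would look roughly like this. It is only a sketch: it starts x as an empty numeric vector rather than 0, which doesn't change the sum. The key point is that x <- c(x, i) appends the value i to the end of x, so no index is needed at all.

```r
x <- numeric(0)    # collected multiples of 3 or 5
i <- 1
while (i < 1000) {
  if (i %% 3 == 0 | i %% 5 == 0) {
    x <- c(x, i)   # append the value i; its position in x doesn't matter
  }
  i <- i + 1
}
sum(x)
## [1] 233168
```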
Here is a shortcut that performs this sum, which is probably more in the spirit of the problem:
3*(333*334/2) + 5*(199*200/2) - 15*(66*67/2)
## [1] 233168
Here's why this works:
In the set of integers [1,999] there are:
333 values that are divisible by 3. Their sum is 3*sum(1:333) or 3*(333*334/2).
199 values that are divisible by 5. Their sum is 5*sum(1:199) or 5*(199*200/2).
Adding these up gives a number that is too high, because the values in their intersection (those divisible by 15) are counted twice. There are 66 such values, and their sum is 15*sum(1:66) or 15*(66*67/2).
As a function of N, this can be written:
f <- function(N) {
threes <- floor(N/3)
fives <- floor(N/5)
fifteens <- floor(N/15)
3*(threes*(threes+1)/2) + 5*(fives*(fives+1)/2) - 15*(fifteens*(fifteens+1)/2)
}
Giving:
f(999)
## [1] 233168
f(99)
## [1] 2318
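If you want to convince yourself that the formula agrees with the brute-force filter, a quick check along these lines should do. The helper name brute is made up for this sketch, and f is repeated so the snippet runs on its own.

```r
f <- function(N) {
  threes <- floor(N / 3)
  fives <- floor(N / 5)
  fifteens <- floor(N / 15)
  3 * (threes * (threes + 1) / 2) +
    5 * (fives * (fives + 1) / 2) -
    15 * (fifteens * (fifteens + 1) / 2)
}

# Brute force: filter 1..N and sum (hypothetical helper for comparison only)
brute <- function(N) {
  x <- seq_len(N)
  sum(x[x %% 3 == 0 | x %% 5 == 0])
}

# Compare the two for every upper bound from 1 to 2000
all(vapply(1:2000, function(N) f(N) == brute(N), logical(1)))
```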
And another way:
x <- 1:999
sum(which(x%%5==0 | x%%3==0))
# [1] 233168
A very efficient approach is the following:
div_sum <- function(x, n) {
# calculates the double of the sum of all integers from 1 to n
# that are divisible by x
max_num <- n %/% x
(x * (max_num + 1) * max_num)
}
n <- 999
a <- 3
b <- 5
(div_sum(a, n) + div_sum(b, n) - div_sum(a * b, n)) / 2
In contrast, a very short code is the following:
x=1:999
sum(x[!x%%3|!x%%5])
Here is an alternative that I think gives the same answer (using 99 instead of 999 as the upper bound):
iters <- 100
x <- rep(0, iters-1)
i <- 1
while (i < iters) {
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- i
}
i <- i + 1
}
sum(x)
# [1] 2318
Here is the for-loop mentioned in the original post:
iters <- 99
x <- rep(0, iters)
# for() sets and advances i itself, so no manual initialisation or increment is needed
for (i in 1:iters) {
  if (i %% 3 == 0 | i %% 5 == 0) {
    x[i] <- i
  }
}
sum(x)
# [1] 2318
Related
I just saw a YouTube video from Numberphile on the Yellowstone sequence (A098550). It's based on a sequence starting with 1 and 2, with subsequent terms generated by the rules:
no repeated terms
always pick the lowest integer
gcd(a_n, a_(n-1)) = 1
gcd(a_n, a_(n-2)) > 1
The first 15 terms would be: 1 2 3 4 9 8 15 14 5 6 25 12 35 16 7
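To make the rules concrete, here is a small checker for a candidate sequence. It is only a sketch: gcd2 and check_yellowstone are names introduced here, it treats 1, 2, 3 as the starting terms (as the code in this question does), and it verifies rules 1, 3 and 4 but not the "lowest integer" rule.

```r
# Scalar Euclidean gcd (helper written for this sketch)
gcd2 <- function(a, b) {
  while (b != 0) {
    tmp <- b
    b <- a %% b
    a <- tmp
  }
  a
}

# TRUE if `a` has no repeats and every term from the 4th on
# is coprime to its predecessor but shares a factor with the term before that
check_yellowstone <- function(a) {
  if (anyDuplicated(a) > 0) return(FALSE)
  if (length(a) < 4) return(TRUE)
  for (n in 4:length(a)) {
    if (gcd2(a[n], a[n - 1]) != 1) return(FALSE)   # rule 3
    if (gcd2(a[n], a[n - 2]) <= 1) return(FALSE)   # rule 4
  }
  TRUE
}

first15 <- c(1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7)
check_yellowstone(first15)
## [1] TRUE
```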
A quick-and-dirty approach in R could be something like this, but understandably, it becomes very slow when generating longer sequences. It also makes an assumption about the highest number that can occur in the sequence (for information: the sequence of 10,000 items never goes higher than 5000).
What can we do to make this faster?
library(DescTools)
a <- c(1, 2, 3)
p <- length(a)
# all natural numbers
all_ints <- 1:5000
for (n in p:1000) {
# rule 1 - remove all number that are in sequence already
next_a_set <- all_ints[which(!all_ints %in% a)]
# rule 3 - search the remaining set for numbers that have gcd == 1
next_a_option <- next_a_set[which(
sapply(
next_a_set,
function(x) GCD(a[n], x)
) == 1
)]
# rule 4 - search the remaining number for gcd > 1
next_a <- next_a_option[which(
sapply(
next_a_option,
function(x) GCD(a[n - 1], x)
) > 1
)]
# select the lowest
a <- c(a, min(next_a))
n <- n + 1
}
Here's a version that's about 20 times faster than yours, with comments about the changes:
# Set a to the final length from the start.
a <- c(1, 2, 3, rep(NA, 998)) # the loop below fills a[4] through a[1001]
p <- 3
# Define a vectorized gcd() function. We'll be testing
# lots of gcds at once. This uses the Euclidean algorithm.
gcd <- function(x, y) { # vectorized gcd
while (any(y != 0)) {
x1 <- ifelse(y == 0, x, y)
y <- ifelse(y == 0, 0, x %% y)
x <- x1
}
x
}
# Guess at a reasonably large vector to work from,
# but we'll grow it later if not big enough.
allnum <- 1:1000
# Keep a logical record of what has been used
used <- c(rep(TRUE, 3), rep(FALSE, length(allnum) - 3))
for (n in p:1000) {
# rule 1 - remove all number that are in sequence already
# nothing to do -- used already records that.
repeat {
# rule 3 - search the remaining set for numbers that have gcd == 1
keep <- !used & gcd(a[n], allnum) == 1
# rule 4 - search the remaining number for gcd > 1
keep <- keep & gcd(a[n-1], allnum) > 1
# If we found anything, break out of this loop
if (any(keep))
break
# Otherwise, make the set of possible values twice as big,
# and try again
allnum <- seq_len(2*length(allnum))
used <- c(used, rep(FALSE, length(used)))
}
# select the lowest
newval <- which.max(keep)
# Assign into the appropriate place
a[n+1] <- newval
# Record that it has been used
used[newval] <- TRUE
}
If you profile it, you'll see it spends most of its time in the gcd() function. You could probably make that a lot faster by redoing it in C or C++.
The biggest change here is pre-allocation and restricting the search to numbers that have not yet been used.
library(numbers)
N <- 5e3
a <- integer(N)
a[1:3] <- 1:3
b <- logical(N) # which numbers have been used already?
b[1:3] <- TRUE
NN <- 1:N
system.time({
for (n in 4:N) {
a1 <- a[n - 1L]
a2 <- a[n - 2L]
for (k in NN[!b]) {
if (GCD(k, a1) == 1L && GCD(k, a2) > 1L) {
a[n] <- k
b[k] <- TRUE
break
}
}
if (!a[n]) {
a <- a[1:(n - 1L)]
break
}
}
})
#> user system elapsed
#> 1.28 0.00 1.28
length(a)
#> [1] 1137
For a fast C++ algorithm, see here.
I want to calculate how many values are taken until the cumulative sum reaches a certain value.
This is my vector: myvec = seq(0,1,0.1)
I started with coding the cumulative sum function:
cumsum_for <- function(x)
{
y = 1
for(i in 2:length(x)) # pardon the case where x is of length 1 or 0
{x[i] = x[i-1] + x[i]
y = y+1}
return(y)
}
Now, with the limit
cumsum_for <- function(x, limit)
{
y = 1
for(i in 2:length(x)) # pardon the case where x is of length 1 or 0
{x[i] = x[i-1] + x[i]
if(x >= limit) break
y = y+1}
return(y)
}
which unfortunately produces a warning:
myvec = seq(0,1,0.1)
cumsum_for(myvec, 0.9)
[1] 10
Warning messages:
1: In if (x >= limit) break :
the condition has length > 1 and only the first element will be used
[...]
What about this? You can use cumsum to compute the cumulative sum, and then count the number of values that are at or below a certain threshold n:
f <- function(x, n) sum(cumsum(x) <= n)
f(myvec, 4)
#[1] 9
f(myvec, 1.1)
#[1] 5
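If you would rather keep the explicit loop from the question, the warning comes from if (x >= limit), which tests the whole vector at once; testing the single element x[i] is probably what was intended. A sketch under that assumption (count_until is a name made up here):

```r
count_until <- function(x, limit) {
  y <- 1
  for (i in seq_along(x)[-1]) {   # 2, 3, ..., length(x); empty if length(x) < 2
    x[i] <- x[i - 1] + x[i]       # running cumulative sum, stored in place
    if (x[i] >= limit) break      # compare one element, not the whole vector
    y <- y + 1
  }
  y
}

myvec <- seq(0, 1, 0.1)
count_until(myvec, 0.9)
## [1] 4
```

Note that this counts the values whose running sum is still strictly below the limit, so it can differ from f() above when a cumulative sum hits the limit exactly.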
You can put a while loop in a function. This stops further calculation of the cumsum if the limit is reached.
cslim <- function(v, l) {
s <- 0
i <- 0L
while (s < l) {
i <- i + 1
s <- sum(v[1:i])
}
i - 1
}
cslim(myvec, .9)
# [1] 4
Especially useful for longer vectors, e.g.
v <- seq(0, 3e7, 0.1)
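One caveat: sum(v[1:i]) re-adds the whole prefix on every pass, so the while loop in cslim does quadratic work in the worst case. A sketch that keeps a running total instead is shown here. The name cslim_running is made up, and unlike cslim it simply returns length(v) if the limit is never reached.

```r
cslim_running <- function(v, l) {
  s <- 0
  i <- 0L
  n <- length(v)
  while (i < n) {
    s <- s + v[i + 1L]     # add only the next element
    if (s >= l) break      # limit reached: the first i elements stayed below it
    i <- i + 1L
  }
  i
}

myvec <- seq(0, 1, 0.1)
cslim_running(myvec, 0.9)
## [1] 4
```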
I would like to write code that generates a 3 x 1 vector y according to the following rule (small numbers are chosen for simplicity):
Here x is a 3 x 1 vector. According to the rule, for an update of y, I need the sum of all y's.
An attempt to code this with an arbitrary x:
x <- c(2,3,1)
y <- c(0,0,0)
for(i in 1:5){
for(j in 1:3){
y[j] <- x[j] + y[j] + sum(y)
}
}
This code is not appropriate because it computes sum(y) term by term.
The inner loop indicates something like this:
y[1] = x[1] + 0 = 2
y[2] = x[2] + 2 = 5
y[3] = x[3] + 2 + 5 = 8
It is not appropriate because the sum(y) term contains one term for y[1], two terms for y[2], and three terms for y[3]. But I think sum(y) should be 2 + 5 + 8 = 15 for each of y[1], y[2], y[3], according to the rule given above. Moreover, this procedure should be repeated a certain number of times (here 5 times, shown by the outer loop). On each pass of the outer loop, sum(y) should be computed only once and then used as the sum(y) term for every j in the inner loop.
How should I code this?
You are over-complicating this. Vectorize the inner loop away:
x <- c(2, 3, 1)
y <- c(0, 0, 0)
for (j in 1:5) y <- x + y + sum(y)
y
# [1] 682 687 677
This approach only computes sum(y) once per iteration, which is what you seem to want. As an added benefit, adding vectors in a single operation is much faster than adding them component-wise in a loop.
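For comparison, if you do want to keep both loops, the same result can be had by saving sum(y) before the inner loop starts, so that updates to y[1] and y[2] don't leak into the sum used for y[3]. A sketch (the variable s is introduced here):

```r
x <- c(2, 3, 1)
y <- c(0, 0, 0)
for (i in 1:5) {
  s <- sum(y)                  # computed once per outer iteration
  for (j in 1:3) {
    y[j] <- x[j] + y[j] + s    # every j sees the same sum
  }
}
y
# [1] 682 687 677
```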
Maybe this will work
myfun <- function(x, y, i) {
y[i] <- x[i] + sum(y)
if (i < length(x)) {
myfun(x, y, i+1)
} else {
return(y)
}
}
x <- c(2, 3, 1)
y <- rep(0, length(x))
myfun(x, y, 1)
# [1] 2 5 8
x <- c(2, 3, 1, 5)
y <- rep(0, length(x))
myfun(x, y, 1)
# [1] 2 5 8 20
I need to write an algorithm in R that gives any number n in base 3. So far I have written this:
vector <- c(10, 100, 1000, 10000)
ternary <- function(n) { while (n != 0) {
{q<- n%/%3}
{r <- n%%3}
{return(r)}
q<- n }
sapply(vector, ternary)}
I thought that by applying sapply( vector, ternary) it would give me all the r for any given n that I would put in ternary(n). My code still gives me the "last r" and I don't get why.
This is the straightforward implementation of what I have learned to do by hand in nth grade (don't remember exactly when).
base3 <- function(x){
y <- integer(0)
while(x >= 3){
r <- x %% 3
x <- x %/% 3
y <- c(r, y)
}
y <- c(x, y)
y
}
base3(10)
#[1] 1 0 1
base3(5)
#[1] 1 2
You can use recursion:
base3 =function(x,y=NULL){
d = x %/% 3
r=c(x %% 3,y)
if(d>=3) base3(d,r)
else c(d,r)
}
base3(10)
# [1] 1 0 1
base3(100)
# [1] 1 0 2 0 1
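Two side notes. In the question's code, return(r) exits the function during the first pass through the while loop, so only one remainder (the last base-3 digit) ever comes back. And the recursive version above returns a leading 0 for inputs below 3, e.g. base3(2) gives 0 2. A sketch of a variant that avoids this, applied to the vector from the question (base3_rec is a name introduced here):

```r
base3_rec <- function(x, y = NULL) {
  d <- x %/% 3            # what is left after removing the last digit
  r <- c(x %% 3, y)       # prepend the new digit to those found so far
  if (d > 0) base3_rec(d, r) else r
}

vec <- c(10, 100, 1000, 10000)
# One string of digits per input number
sapply(vec, function(n) paste(base3_rec(n), collapse = ""))
```

sapply() belongs outside the function body: it calls base3_rec once per element of vec.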
I am a new R user and have very limited programming experience, hence my question and poorly written code.
I was assigned a problem where I had to use a while loop to generate the numbers of the Fibonacci sequence that are less than 4,000,000 (the Fibonacci sequence is characterized by the fact that every number after the first two is the sum of the two preceding ones).
Next, I had to compute the sum of the even numbers in the sequence that was generated.
I got the correct answer; however, I don't think the code is written very well. What could I have done better?
x <- 0
y <- 1
z <- 0
if (x == 0 & y == 1) {
cat(x)
cat(" ")
cat(y)
cat(" ")
while (x < 4000000 & y < 4000000) {
x <- x + y
cat(x)
cat(" ")
if (x %% 2 == 0) {
z <- x + z
}
y <- x + y
cat(y)
cat(" ")
if (y %% 2 == 0) {
z <- y + z
}
}
}
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465
cat(z)
4613732
First of all, cat comes with a sep argument. You can do cat(x, y, sep = " ") rather than using 3 lines for that.
Secondly, when you call while (x < 4000000 & y < 4000000) note that y will always be greater than x because it is the sum of the last x and y ... so it should suffice to check for y < 4000000 here.
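With only one condition to check and no printing, the loop from the question shrinks to something like the following. This is a sketch, with prev, curr, nxt and even_sum as names chosen here; it advances one Fibonacci number per pass instead of two.

```r
prev <- 0
curr <- 1
even_sum <- 0
while (curr < 4000000) {
  if (curr %% 2 == 0) even_sum <- even_sum + curr
  nxt <- prev + curr   # next Fibonacci number
  prev <- curr
  curr <- nxt
}
even_sum
## [1] 4613732
```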
For the while loop, you could also use a counter, which might be more intuitive, although indexing and growing a vector inside a loop isn't that fast in R:
fib <- c(0, 1)
i <- 2
while (fib[i] < 4000000) {
fib <- c(fib, fib[i-1] + fib[i])
i <- i + 1
}
sum(fib[fib %% 2 == 0 & fib < 4000000]) # the loop appends one value >= 4000000, so filter it out
If you don't necessarily need the while, you could also approach it via recursion
fib <- function(x, y) {
s <- x + y
c(s, if (s < 4000000) fib(y, s))
}
f <- fib(0, 1)
sum(f[f %% 2 == 0])
First, there's no need to explicitly print everything out.
Second, it's more idiomatic in R to make a vector of the Fibonacci numbers and then sum. If you don't know an explicit closed form for the Fibonacci numbers, or if you've been told not to use this, then use a loop to create the list of Fibonacci numbers.
So to construct the list of Fibonacci numbers (two at a time) you can do
x <- 0
y <- 1
fib <- c()
while (x < 4000000 & y < 4000000){
x <- x + y
y <- x + y
fib = c(fib, x, y)
}
This will give you a vector of Fibonacci numbers, containing all those less than 4000000 and a few more (the last element is 9227465).
Then run
sum(fib[fib %% 2 == 0 & fib < 4000000])
to get the result. This returns 4613732, like your code does. The subsetting operator [], when you put a logical condition inside it, will output just those numbers which satisfy the logical condition -- in this case, that they're even and less than 4000000.
I am using the closed form of the Fibonacci sequence as found here
fib <- function(n) round(((5 + sqrt(5)) / 10) * ((1 + sqrt(5)) / 2)^(1:n - 1))
numbers <- 2
while (max(fib(numbers)) < 4000000) { # keep extending while the largest term is below 4000000
sequence <- fib(numbers) # the last sequence that satisfies the 4000000 condition is saved here
numbers <- numbers + 1 # increase the number of terms
}
total_sum <- sum(sequence[sequence %% 2 == 0]) # sum the even numbers
This is how I would do it. First, I define a global variable i that marks the first call, so that the first two elements of the Fibonacci series are included. At the end, I reset the global variable to its initial value (i.e. 1). If I don't do that, then when I call fib(0,1) again, the output is incorrect because the function runs with the last value of i. It's also important to call return() in the else clause so that it returns nothing there. If you don't specify return(), the final output will be 1 instead of the Fibonacci series.
Please note the series only goes up to the number 13 (z<14); obviously you can change that to whatever you want. It may also be a good option to pass this limit as a third argument of the function, something like fib(0,1,14). Try it out!
i <<- 1
fib <- function(x,y){
z <- x+y
if(z<14){
if (i==1){
i <<- i+1
c(x,y,z,fib(y,z))
}
else c(z, fib(y,z))
}
else {
i <<- 1
return()
}
}
a <- fib(0,1)
a
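The third-argument idea mentioned above can be sketched without any global state. The name fib_upto and its limit parameter are made up here, and this is only one possible way to write it:

```r
fib_upto <- function(x, y, limit) {
  z <- x + y
  if (z >= limit) return(c(x, y))   # stop: x and y are the last two terms below the limit
  c(x, fib_upto(y, z, limit))       # keep x, recurse on the next pair
}

fib_upto(0, 1, 14)
# [1]  0  1  1  2  3  5  8 13
```

Because nothing is stored outside the function, calling it twice in a row gives the same result both times.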