How do I retrieve the maximum sum of successive divisors of a number?
I have the function below, which returns the proper divisors of a number.
Code
divisors <- function(x) {
y <- seq_len(ceiling(x / 2))
y[x %% y == 0]
}
Example
Divisors of 99 will give the below possible values.
divisors(99)
[1] 1 3 9 11 33
My expected logic:
Go from the last element to the first element of the divisors vector.
The last number is 33. The next number down the list that divides 33 is 11, so I select 11. Continuing from 11, the next number that divides 11 is 1, so I select 1. Now add all the selected numbers.
33 + 11 + 1 = 45
Move to the next number, 11. The next number that divides 11 is 1, so I select 1. Now add the numbers.
11 + 1 = 12
Move to the next number, 9. The next number that divides 9 is 3, so I select 3; the next number that divides 3 is 1, so I select 1. Now add the numbers.
9 + 3 + 1 = 13
Move to the next number, 3. The next number that divides 3 is 1, so I select 1. Now add the numbers.
3 + 1 = 4
The maximum among these is 45.
I am struggling to write this logic in R. Any help or advice is much appreciated.
Note: prime numbers can be ignored.
Update
For large integers, e.g. values close to the maximum integer .Machine$integer.max (which is itself a prime number), you can run the code below (note that I modified the functions divisors and f slightly):
divisors <- function(x) {
y <- seq(x / 2)
y[as.integer(x) %% y == 0]
}
f <- function(y) {
if (length(y) <= 2) {
return(as.integer(sum(y)))
}
l <- length(y)
h <- y[l]
yy <- y[-l]
h + f(yy[h %% yy == 0])
}
and you will see
> n <- .Machine$integer.max - 1
> x <- divisors(n)
> max(sapply(length(x):2, function(k) f(head(x, k))))
[1] 1569603656
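One caveat with the code above: for n near .Machine$integer.max, seq(x / 2) builds a vector of roughly a billion candidates. A minimal sketch of a lighter alternative, assuming a hypothetical helper name divisors_fast that is not part of the original answer, only tests candidates up to sqrt(x) and derives the matching large divisors by division:

```r
# Hypothetical helper (not from the original answer): proper divisors of x,
# testing only candidates up to sqrt(x).
divisors_fast <- function(x) {
  small <- seq_len(floor(sqrt(x)))
  small <- small[x %% small == 0]          # divisors <= sqrt(x)
  all_div <- sort(unique(c(small, x / small)))  # add the paired large divisors
  all_div[all_div != x]                    # drop x itself
}

divisors_fast(99)
# [1]  1  3  9 11 33
```

Note that this sketch returns an empty vector for x = 1, whereas the question's divisors(1) returns 1, so that edge case would need separate handling.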
You can define a recursive function f that gives successive divisors
f <- function(y) {
if (length(y) == 1) {
return(y)
}
h <- y[length(y)]
yy <- y[-length(y)]
c(f(yy[h %% yy == 0]), h)
}
and you will see all possible successive divisor tuples
> sapply(rev(seq_along(x)), function(k) f(head(x, k)))
[[1]]
[1] 1 11 33
[[2]]
[1] 1 11
[[3]]
[1] 1 3 9
[[4]]
[1] 1 3
[[5]]
[1] 1
Then, we apply f within sapply like below
> max(sapply(rev(seq_along(x)), function(k) sum(f(head(x, k)))))
[1] 45
which gives the desired output.
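If you only need the sums and not the tuples, the same greedy walk can be written as a loop. This is a sketch under the assumption that "next immediate number" always means the largest remaining divisor; the names chain_sum and max_chain_sum are made up for illustration:

```r
divisors <- function(x) {
  y <- seq_len(x %/% 2)
  y[x %% y == 0]
}

# Sum of the greedy chain that starts at the last element of d:
# repeatedly take the largest remaining value that divides the current one.
chain_sum <- function(d) {
  total <- 0
  while (length(d) > 0) {
    h <- d[length(d)]
    total <- total + h
    d <- d[-length(d)]
    d <- d[h %% d == 0]
  }
  total
}

# Try every prefix of the divisor vector as a starting point, keep the best sum.
max_chain_sum <- function(n) {
  d <- divisors(n)
  max(vapply(seq_along(d), function(k) chain_sum(d[seq_len(k)]), numeric(1)))
}

max_chain_sum(99)
# [1] 45
```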
You can also use the following solution. It may look a little complicated, and there is probably an easier, more efficient approach, but I thought it could be useful to you. I start from your divisors output:
> x
[1] 1 3 9 11 33
# First I created a list whose first element is our original x and from then on
# I subset the first element till the last element of the list
lst <- lapply(0:(length(x)-1), function(a) x[1:(length(x)-a)])
> lst
[[1]]
[1] 1 3 9 11 33
[[2]]
[1] 1 3 9 11
[[3]]
[1] 1 3 9
[[4]]
[1] 1 3
[[5]]
[1] 1
Then I wrote a custom function to implement your conditions and produce your desired output. For this purpose I created a function factory, which is a function that creates a function:
As you might have noticed, the outermost function does not take any arguments. It only sets up an empty vector out to store the selected elements. out is created in the execution environment of the outermost function, which shields it from anything that happens in the global environment.
The inner function is the one that takes our vector x, so in general we call the whole setup as fnf()(x). The first element of out is the first element of x (33, because x is passed in decreasing order). Then I find all elements of x that divide this value, i.e. for which the remainder is 0. Of these I take the second element (11), since the first one is 33 itself, and store it in out. Then I modify x to omit the max value (33) and repeat the same process.
Since the process repeats, I thought this was a good case for recursion. Recursion is a programming technique in which a function calls itself from within its own body. As you might have noticed, I call fn inside itself to repeat the process, each time with a shorter vector.
This may sound a bit complicated, but I believe there are some good points here for you to pick up for future exploration. I found them very useful and hope you will too.
fnf <- function() {
out <- c()
fn <- function(x) {
out <<- c(out, x[1])
z <- x[out[length(out)]%%x == 0]
if(length(z) >= 2) {
out[length(out) + 1] <<- z[2]
} else {
return(out)
}
x <- x[!duplicated(x)][which(x[!duplicated(x)] == z[2]):length(x[!duplicated(x)])]
fn(x)
out[!duplicated(out)]
}
}
# The result of applying the custom function on `lst` would result in your
# divisor values
lapply(lst, function(x) fnf()(sort(x, decreasing = TRUE)))
[[1]]
[1] 33 11 1
[[2]]
[1] 11 1
[[3]]
[1] 9 3 1
[[4]]
[1] 3 1
[[5]]
[1] 1
In the end we sum each element and extract the max value
Reduce(max, lapply(lst, function(x) sum(fnf()(sort(x, decreasing = TRUE)))))
[1] 45
Testing a very large integer, I used @ThomasIsCoding's modified divisors function:
divisors <- function(x) {
y <- seq(x / 2)
y[as.integer(x) %% y == 0]
}
x <- divisors(.Machine$integer.max - 1)
lst <- lapply(0:(length(x)-1), function(a) x[1:(length(x)-a)])
Reduce(max, lapply(lst, function(x) sum(fnf()(sort(x, decreasing = TRUE)))))
[1] 1569603656
You'll need to recurse. If I understand correctly, this should do what you want:
fact <- function(x) {
x <- as.integer(x)
div <- seq_len(abs(x)/2)
factors <- div[x %% div == 0L]
return(factors)
}
maxfact <- function(x) {
factors <- fact(x)
if (length(factors) < 3L) {
return(sum(factors))
} else {
return(max(factors + mapply(maxfact, factors)))
}
}
maxfact(99)
[1] 45
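maxfact() recomputes the same sub-results many times for numbers with many divisors. A minimal sketch of a memoised variant, assuming a hypothetical factory name make_maxfact and an environment used as a cache (neither is part of the answer above):

```r
fact <- function(x) {
  x <- as.integer(x)
  div <- seq_len(abs(x) %/% 2L)
  div[x %% div == 0L]
}

# Hypothetical factory: returns a maxfact() that remembers earlier results.
make_maxfact <- function() {
  cache <- new.env()
  maxfact <- function(x) {
    key <- as.character(x)
    hit <- cache[[key]]            # NULL if x has not been seen yet
    if (!is.null(hit)) return(hit)
    factors <- fact(x)
    value <- if (length(factors) < 3L) {
      sum(factors)
    } else {
      max(factors + vapply(factors, maxfact, numeric(1)))
    }
    value <- as.numeric(value)
    cache[[key]] <- value
    value
  }
  maxfact
}

maxfact_memo <- make_maxfact()
maxfact_memo(99)
# [1] 45
```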
Related
I just saw a YouTube video from Numberphile on the Yellowstone sequence (A098550). It's based on a sequence starting with 1 and 2, with subsequent terms generated by the rules:
no repeated terms
always pick the lowest integer
gcd(a_n, a_(n-1)) = 1
gcd(a_n, a_(n-2)) > 1
The first 15 terms would be: 1 2 3 4 9 8 15 14 5 6 25 12 35 16 7
A quick-and-dirty approach in R could be something like this, but understandably, it becomes very slow when generating longer sequences. It also makes some assumptions about the highest number that can occur within the sequence (for information: the sequence of 10,000 items never goes higher than 5000).
What can we do to make this faster?
library(DescTools)
a <- c(1, 2, 3)
p <- length(a)
# all natural numbers
all_ints <- 1:5000
for (n in p:1000) {
# rule 1 - remove all number that are in sequence already
next_a_set <- all_ints[which(!all_ints %in% a)]
# rule 3 - search the remaining set for numbers that have gcd == 1
next_a_option <- next_a_set[which(
sapply(
next_a_set,
function(x) GCD(a[n], x)
) == 1
)]
# rule 4 - search the remaining number for gcd > 1
next_a <- next_a_option[which(
sapply(
next_a_option,
function(x) GCD(a[n - 1], x)
) > 1
)]
# select the lowest
a <- c(a, min(next_a))
n <- n + 1
}
Here's a version that's about 20 times faster than yours, with comments about the changes:
# Set a to the final length from the start.
a <- c(1, 2, 3, rep(NA, 997))
p <- 3
# Define a vectorized gcd() function. We'll be testing
# lots of gcds at once. This uses the Euclidean algorithm.
gcd <- function(x, y) { # vectorized gcd
while (any(y != 0)) {
x1 <- ifelse(y == 0, x, y)
y <- ifelse(y == 0, 0, x %% y)
x <- x1
}
x
}
# Guess at a reasonably large vector to work from,
# but we'll grow it later if not big enough.
allnum <- 1:1000
# Keep a logical record of what has been used
used <- c(rep(TRUE, 3), rep(FALSE, length(allnum) - 3))
for (n in p:1000) {
# rule 1 - remove all number that are in sequence already
# nothing to do -- used already records that.
repeat {
# rule 3 - search the remaining set for numbers that have gcd == 1
keep <- !used & gcd(a[n], allnum) == 1
# rule 4 - search the remaining number for gcd > 1
keep <- keep & gcd(a[n-1], allnum) > 1
# If we found anything, break out of this loop
if (any(keep))
break
# Otherwise, make the set of possible values twice as big,
# and try again
allnum <- seq_len(2*length(allnum))
used <- c(used, rep(FALSE, length(used)))
}
# select the lowest
newval <- which.max(keep)
# Assign into the appropriate place
a[n+1] <- newval
# Record that it has been used
used[newval] <- TRUE
}
If you profile it, you'll see it spends most of its time in the gcd() function. You could probably make that a lot faster by redoing it in C or C++.
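To sanity-check the loop above, it can be wrapped in a function and compared with the 15 terms quoted in the question. This is only a sketch: the function name yellowstone and the start_size argument are assumptions for illustration, and the gcd() helper is copied from the answer.

```r
gcd <- function(x, y) {  # vectorized Euclidean algorithm, as above
  while (any(y != 0)) {
    x1 <- ifelse(y == 0, x, y)
    y <- ifelse(y == 0, 0, x %% y)
    x <- x1
  }
  x
}

# Hypothetical wrapper: first n_terms terms of the sequence.
yellowstone <- function(n_terms, start_size = 64L) {
  extra <- max(0, n_terms - 3)
  a <- c(1, 2, 3, rep(NA_real_, extra))
  candidates <- seq_len(start_size)
  used <- candidates <= 3
  for (n in seq_len(extra) + 2L) {
    repeat {
      # unused, coprime to a[n], sharing a factor with a[n - 1]
      keep <- !used & gcd(a[n], candidates) == 1 & gcd(a[n - 1], candidates) > 1
      if (any(keep)) break
      # nothing found: double the candidate range and retry
      candidates <- seq_len(2L * length(candidates))
      used <- c(used, rep(FALSE, length(used)))
    }
    newval <- which.max(keep)  # lowest candidate that qualifies
    a[n + 1] <- newval
    used[newval] <- TRUE
  }
  a[seq_len(n_terms)]
}

yellowstone(15)
#  [1]  1  2  3  4  9  8 15 14  5  6 25 12 35 16  7
```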
The biggest change here is pre-allocation and restricting the search to numbers that have not yet been used.
library(numbers)
N <- 5e3
a <- integer(N)
a[1:3] <- 1:3
b <- logical(N) # which numbers have been used already?
b[1:3] <- TRUE
NN <- 1:N
system.time({
for (n in 4:N) {
a1 <- a[n - 1L]
a2 <- a[n - 2L]
for (k in NN[!b]) {
if (GCD(k, a1) == 1L && GCD(k, a2) > 1L) {
a[n] <- k
b[k] <- TRUE
break
}
}
if (!a[n]) {
a <- a[1:(n - 1L)]
break
}
}
})
#> user system elapsed
#> 1.28 0.00 1.28
length(a)
#> [1] 1137
For a fast C++ algorithm, see here.
I'm trying to write a function that takes a numeric vector as input, and returns the indexes of a shorter version of the input vector, according to some rules:
(a) if all elements are identical, return just the index of the first element; i.e., return 1; else:
if NOT all elements identical, then test for whether special_treatment_value is among them:
(b) if special_treatment_value is there, return the input vector's indexes except for the indexes of elements where special_treatment_value appeared; else:
(c) if special_treatment_value is not there, return the indexes of the input vector as-is, i.e., 1:length(x).
The problem: if we ended up in route (b), we might encounter a situation in which all vector elements are now the same. In such case, we would like to iterate through (a) again to minimize to just the first element.
Example
Let's say that I want to pass the following vectors through my function:
my_vec_1 <- c(1, 2, 1, 2, 3)
my_vec_2 <- c(4, 4, 4)
my_vec_3 <- c(1, 2, 1, 4, 1)
my_vec_4 <- c(3, 3, 3, 4)
and that:
special_treatment_value <- 4
According to my rules, the function should return the outputs:
for my_vec_1: it fits route (c) and thus the output should be 1:5 (indexes of all)
for my_vec_2: it fits route (a) and thus the output should be 1 (index of first)
for my_vec_3: it fits route (b). output should be 1 2 3 5 (indexes of all except for special value's)
my_vec_4 demonstrates the problem. My desired output is 1 because first we go through route (b) then I want to pass through (a). But right now it doesn't happen and my function (see below) returns 1 2 3 (indexes of all except for special value's).
my current attempt
get_indexes <- function(x, special_val) {
if (var(x) == 0) { # route (a)
output_idx <- 1
return(output_idx)
}
idx_entire_length <- 1:length(x)
if (any(x == special_val)) { # route (b)
idx_to_remove <- which(x == special_val)
output_idx <- idx_entire_length[-idx_to_remove]
return(output_idx)
}
# else
output_idx <- idx_entire_length # route (c)
return(output_idx)
}
get_indexes(my_vec_1, 4)
#> [1] 1 2 3 4 5
get_indexes(my_vec_2, 4)
#> [1] 1
get_indexes(my_vec_3, 4)
#> [1] 1 2 3 5
get_indexes(my_vec_4, 4)
#> [1] 1 2 3
I guess there should be some repeat block or while loop, but I can't figure out how to implement it correctly (and efficiently).
You can try
foo <- function(x, y){
tmp <- which(x != y)
if(dplyr::n_distinct(x[x!=y])<=1){
tmp <- 1
}
return(tmp)
}
Instead of n_distinct() you can use length(unique()).
Result:
lapply(list(my_vec_1, my_vec_2, my_vec_3, my_vec_4), foo, 4)
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 1
[[3]]
[1] 1 2 3 5
[[4]]
[1] 1
You could repeat the condition for going through (a) inside condition (b), for example:
f <- function(x, treatment){
if(var(x) == 0) 1 else {
if(treatment %in% x) {
x[-which(x == treatment)] |>
(\(.) if(var(.) == 0) 1 else (1:length(x))[-which(x == treatment)])()
} else {
1:length(x)
}
}
}
lapply(list(my_vec_1, my_vec_2, my_vec_3, my_vec_4), f, 4)
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 1
[[3]]
[1] 1 2 3 5
[[4]]
[1] 1
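One thing to watch with the var(x) == 0 test: var() returns NA for a vector of length 1, so the if() fails on single-element input. A base R sketch that avoids this, assuming that after removing the special value "first element" means the first remaining index (which is 1 for my_vec_4); the name get_indexes_v2 is made up here:

```r
# Hypothetical rewrite of get_indexes() without var()
get_indexes_v2 <- function(x, special_val) {
  all_same <- function(v) length(unique(v)) <= 1
  if (all_same(x)) return(1L)              # route (a)
  keep <- which(x != special_val)          # route (b); equals 1:length(x) in route (c)
  if (all_same(x[keep])) return(keep[1])   # route (b) followed by (a)
  keep
}

my_vec_1 <- c(1, 2, 1, 2, 3)
my_vec_2 <- c(4, 4, 4)
my_vec_3 <- c(1, 2, 1, 4, 1)
my_vec_4 <- c(3, 3, 3, 4)

get_indexes_v2(my_vec_3, 4)
# [1] 1 2 3 5
get_indexes_v2(my_vec_4, 4)
# [1] 1
```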
If I have a vector such as x <- c(1,2,3,4,5,6,7,8,9), I want a function f(vector, index, num) that takes the vector and returns the num elements "closest" to the one at index.
Examples:
f(x,3,4) = c(1,2,4,5)
f(x,1,5) = c(2,3,4,5,6)
f(x,8,3) = c(6,7,9)
Since there is also the issue where if we have an odd num, we will need to choose whether to pick left or right side by symmetry, let's go with choosing the left side (but right side is ok too)
i.e f(x,4,5) = c(1,2,3,5,6) and f(x,7,3) = c(5,6,8)
I hope my question is clear, thank you for any help/responses!
edit: The original vector of c(1:9) is arbitrary, the vector could be a vector of strings, or a vector of length 1000 with shuffled numbers with repeats etc.
i.e c(1,7,4,2,3,7,2,6,234,56,8)
num_closest_by_indices <- function(v, idx, num) {
# Try the base case, where idx is not within (num/2) of the edge
i <- abs(seq_along(v) - idx)
i[idx] <- +Inf # sentinel
# If there are not enough elements in the base case, incrementally add more
for (cutoff_idx in seq(floor(num/2), num)) {
if (sum(i <= cutoff_idx) >= num) {
# This will add two extra indices every iteration. Strictly if we have an even length, we should add the leftmost one first and `continue`, to break ties towards the left.
return(v[i <= cutoff_idx])
}
}
}
Here's an illustration of this algorithm: we rank the indices in order of desirability, then pick the lowest num legal ones:
> seq_along(x)
1 2 3 4 5 6 7 8 9
> seq_along(x) - idx
-2 -1 0 1 2 3 4 5 6
> i <- abs(seq_along(x) - idx)
2 1 0 1 2 3 4 5 6
> i[idx] <- +Inf # sentinel to prevent us returning the element itself
2 1 Inf 1 2 3 4 5 6
Now we can just find num elements with smallest values (break ties arbitrarily, unless you have a preference (left)).
Our first guess is all indices <= (num/2) ; this might not be enough if index is within (num/2) of the start/end.
> i <= 2
TRUE TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE
> v[i <= 2]
1 2 4 5
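The ranking idea illustrated above can also be written directly with order(), which handles the edges and the tie-break in one step. A sketch, with the function name closest_by_index chosen for illustration; ties are broken toward the left by using the position itself as the second sort key:

```r
# Hypothetical helper: the num elements of v whose positions are closest to idx
closest_by_index <- function(v, idx, num) {
  pos <- seq_along(v)[-idx]              # every position except idx itself
  ord <- order(abs(pos - idx), pos)      # nearest first; ties go to the left
  take <- pos[ord[seq_len(min(num, length(pos)))]]
  v[sort(take)]                          # return in original order
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
closest_by_index(x, 3, 4)
# [1] 1 2 4 5
closest_by_index(x, 4, 5)
# [1] 1 2 3 5 6
```

Because it only works on positions, the same function should apply to character vectors or vectors with repeated values.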
So, adapting @dash2's code to handle the corner cases where some indices are illegal (nonpositive, or > length(x)), i.e. not in 1:length(x). Then min(elems) would be the number of illegal indices which we cannot pick, hence we must pick abs(min(elems)) more.
Notes:
in the end the code is simpler and faster to handle it by three piecewise cases. Aww.
it actually seems to simplify things if we pick (num+1) indices, then remove idx before returning the answer. Using result[-idx] to remove it.
Like so:
f <- function (vec, elem, n) {
elems <- seq(elem - ceiling(n/2), elem + floor(n/2))
if (max(elems) > length(vec)) elems <- elems - (max(elems) - length(vec))
if (elems[1] < 1) elems <- elems + (1 - elems[1])
elems <- setdiff(elems, elem)
vec[elems]
}
Giving results:
> f(1:9, 1, 5)
[1] 2 3 4 5 6
> f(1:9, 9, 5)
[1] 4 5 6 7 8
> f(1:9, 2, 5)
[1] 1 3 4 5 6
> f(1:9, 4, 5)
[1] 1 2 3 5 6
> f(1:9, 4, 4)
[1] 2 3 5 6
> f(1:9, 2, 4)
[1] 1 3 4 5
> f(1:9, 1, 4)
[1] 2 3 4 5
> f(1:9, 9, 4)
[1] 5 6 7 8
Start a function with the variable argument x first, and the reference table and n after
.nearest_n <- function(x, table, n) {
The algorithm assumes that table is numeric, without any duplicates, and all values finite; n has to be less than or equal to the length of the table
## assert & setup
stopifnot(
is.numeric(table), !anyDuplicated(table), all(is.finite(table)),
n <= length(table)
)
Sort the table and then 'clamp' maximum and minimum values
## sort and clamp
table <- c(-Inf, sort(table), Inf)
len <- length(table)
Find the interval in table where x occurs; findInterval() uses an efficient search. Use the interval index as the initial lower index, and add 1 for the upper index, making sure to stay in-bounds.
## where to start?
lower <- findInterval(x, table)
upper <- min(lower + 1L, len)
Find the nearest n neighbors by comparing the lower and upper index distance to x, record the nearest value, and increment the lower or upper index as appropriate and making sure to stay in-bounds
## find
nearest <- numeric(n)
for (i in seq_len(n)) {
if (abs(x - table[lower]) < abs(x - table[upper])) {
nearest[i] = table[lower]
lower = max(1L, lower - 1L)
} else {
nearest[i] = table[upper]
upper = min(len, upper + 1L)
}
}
Then return the solution and finish the function
nearest
}
The code might seem verbose, but is actually relatively efficient because the only operations on the entire vector (sort(), findInterval()) are implemented efficiently in R.
A particular advantage of this approach is that it can be vectorized in its first argument, calculating the test for using lower (use_lower = ...) as a vector and using pmin() / pmax() as clamps.
.nearest_n <- function(x, table, n) {
## assert & setup
stopifnot(
is.numeric(table), !anyDuplicated(table), all(is.finite(table)),
n <= length(table)
)
## sort and clamp
table <- c(-Inf, sort(table), Inf)
len <- length(table)
## where to start?
lower <- findInterval(x, table)
upper <- pmin(lower + 1L, len)
## find
nearest <- matrix(0, nrow = length(x), ncol = n)
for (i in seq_len(n)) {
use_lower <- abs(x - table[lower]) < abs(x - table[upper])
nearest[,i] <- ifelse(use_lower, table[lower], table[upper])
lower[use_lower] <- pmax(1L, lower[use_lower] - 1L)
upper[!use_lower] <- pmin(len, upper[!use_lower] + 1L)
}
# return
nearest
}
For instance
> set.seed(123)
> table <- sample(100, 10)
> sort(table)
[1] 5 29 41 42 50 51 79 83 86 91
> .nearest_n(c(30, 20), table, 4)
[,1] [,2] [,3] [,4]
[1,] 29 41 42 50
[2,] 29 5 41 42
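For small tables, a brute-force version is handy as a cross-check of the result above. This is a sketch only; the name nearest_n_bruteforce is made up, it handles a single x, and it sorts all distances rather than using findInterval():

```r
# Hypothetical cross-check: n table values nearest to a single x
nearest_n_bruteforce <- function(x, table, n) {
  table[order(abs(table - x))[seq_len(n)]]
}

# the sorted table values printed above
tbl <- c(5, 29, 41, 42, 50, 51, 79, 83, 86, 91)
nearest_n_bruteforce(30, tbl, 4)
# [1] 29 41 42 50
nearest_n_bruteforce(20, tbl, 4)
# [1] 29  5 41 42
```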
Generalize this by taking any argument and coercing it to the required form using a reference look-up table table0 and the indexes into it table1
nearest_n <- function(x, table, n) {
## coerce to common form
table0 <- sort(unique(c(x, table)))
x <- match(x, table0)
table1 <- match(table, table0)
## find nearest
m <- .nearest_n(x, table1, n)
## result in original form
matrix(table0[m], nrow = nrow(m))
}
As an example...
> set.seed(123)
> table <- sample(c(letters, LETTERS), 30)
> nearest_n(c("M", "Z"), table, 5)
[,1] [,2] [,3] [,4] [,5]
[1,] "o" "L" "O" "l" "P"
[2,] "Z" "z" "Y" "y" "w"
I've been set a question on the Fibonacci Sequence and although I've been successful in doing the sequence, I haven't been as lucky summing the even terms up (i.e. 2nd, 4th, 6th... etc.) My code is below as well as the part of the question I am stuck on. Any guidance would be brilliant!
Question:
Write a function which will take as an input x and y and will return either the sum of the first x even Fibonacci numbers or the sum of even Fibonacci numbers less than y.
That means the user will be able to specify either x or y but not both.
You have to return a warning if someone uses both numbers (decide on the message to return)
Code:
y <- 10
fibvals <- numeric(y)
fibvals[1] <- 1
fibvals[2] <- 1
for (i in 3:y) {
fibvals[i] <- fibvals[i-1]+fibvals[i-2]
if (i %% 2)
v<-sum(fibvals[i])
}
v
To get you started since this sounds like an exercise.
I would split your loop up into steps rather than do the summing within the loop with an if statement. Since you already have the sequence code working, you can just return what is asked for by the user. The missing function would probably help you out here
f <- function(x, y) {
if (missing(y)) {
warning('you must give y')
y <- 10
}
fibvals <- numeric(y)
fibvals[1] <- 1
fibvals[2] <- 1
for (i in 3:y) {
fibvals[i] <- fibvals[i-1]+fibvals[i-2]
}
evens <- fibvals %% 2 == 0
odds <- fibvals %% 2 != 0
if (missing(x)) {
return(sum(fibvals[evens]))
} else return(fibvals)
}
f(y = 20)
# [1] 3382
f(10)
# [1] 1 1 2 3 5 8 13 21 34 55
# Warning message:
# In f(10) : you must give y
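For comparison, here is one way the pieces could fit together. It is a sketch under two assumptions that the exercise text leaves open: "even Fibonacci numbers" is read as even-valued terms (as in the answer above, not even-positioned terms), and when both arguments are given the function warns and uses x. The name even_fib_sum is made up.

```r
# Hypothetical solution sketch: give either x (a count) or y (an upper bound)
even_fib_sum <- function(x, y) {
  if (missing(x) && missing(y)) stop("supply either x or y")
  if (!missing(x) && !missing(y)) {
    warning("both x and y supplied; using x and ignoring y")
  }
  use_count <- !missing(x)
  total <- 0
  found <- 0
  a <- 1
  b <- 1
  repeat {
    # stop after x even terms, or once the current term reaches y
    if (use_count) {
      if (found >= x) break
    } else {
      if (a >= y) break
    }
    if (a %% 2 == 0) {
      total <- total + a
      found <- found + 1
    }
    nxt <- a + b
    a <- b
    b <- nxt
  }
  total
}

even_fib_sum(x = 3)   # 2 + 8 + 34
# [1] 44
even_fib_sum(y = 10)  # 2 + 8
# [1] 10
```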
I am filling a 10x10 matrix (mat) randomly until sum(mat) == 100.
I wrote the following (i starts at 2 for a reason not relevant here, but I kept it at 2 to be consistent with my actual code):
mat <- matrix(rep(0, 100), nrow = 10)
mat[1,] <- c(0,0,0,0,0,0,0,0,0,1)
mat[2,] <- c(0,0,0,0,0,0,0,0,1,0)
mat[3,] <- c(0,0,0,0,0,0,0,1,0,0)
mat[4,] <- c(0,0,0,0,0,0,1,0,0,0)
mat[5,] <- c(0,0,0,0,0,1,0,0,0,0)
mat[6,] <- c(0,0,0,0,1,0,0,0,0,0)
mat[7,] <- c(0,0,0,1,0,0,0,0,0,0)
mat[8,] <- c(0,0,1,0,0,0,0,0,0,0)
mat[9,] <- c(0,1,0,0,0,0,0,0,0,0)
mat[10,] <- c(1,0,0,0,0,0,0,0,0,0)
i <- 2
set.seed(129)
while( sum(mat) < 100 ) {
# pick random cell
rnum <- sample( which(mat < 1), 1 )
mat[rnum] <- 1
##
print(paste0("i =", i))
print(paste0("rnum =", rnum))
print(sum(mat))
i = i + 1
}
For some reason, when sum(mat) == 99 there are several extra steps. I would assume that the while loop would stop once i = 91, but it continues past this. Can someone explain what I have done wrong?
If I change the while condition to
while( sum(mat) < 100 & length(which(mat < 1)) > 0 )
the issue remains..
Your problem is equivalent to randomly ordering the indices of a matrix that are equal to 0. You can do this in one line with sample(which(mat < 1)). I suppose if you wanted to get exactly the same sort of output, you might try something like:
set.seed(144)
idx <- sample(which(mat < 1))
for (i in seq_along(idx)) {
print(paste0("i =", i))
print(paste0("rnum =", idx[i]))
print(sum(mat)+i)
}
# [1] "i =1"
# [1] "rnum =5"
# [1] 11
# [1] "i =2"
# [1] "rnum =70"
# [1] 12
# ...
See ?sample
Arguments:
x: Either a vector of one or more elements from which to choose,
or a positive integer. See ‘Details.’
...
If ‘x’ has length 1, is numeric (in the sense of ‘is.numeric’) and
‘x >= 1’, sampling _via_ ‘sample’ takes place from ‘1:x’. _Note_
that this convenience feature may lead to undesired behaviour when
‘x’ is of varying length in calls such as ‘sample(x)’. See the
examples.
In other words, if x in sample(x) is of length 1, sample returns a random number from 1:x. This happens towards the end of your loop, where there is just one 0 left in your matrix and one index is returned by which(mat < 1).
The iteration repeats at sum 99 because sample() behaves very differently depending on whether its first argument is a vector of length 1 or longer. When it has length 1, sample() assumes you want a random number from 1 to that number. When it has length > 1, you get a random element of that vector.
Compare
sample(c(99,100),1)
and
sample(c(100),1)
Of course, this is an inefficient way of filling your matrix. As @josilber pointed out, a single call to sample could do everything you need.
The issue comes from how sample and which do the sampling when you have only a single '0' value left.
For example, do this:
mat <- matrix(rep(1, 100), nrow = 10)
Now you have a matrix of all 1's. Now let's make two numbers 0:
mat[15]<-0
mat[18]<-0
and then sample
sample(which(mat<1))
[1] 18 15
By adding a size = 1 argument you get one or the other.
Now let's try this:
mat[18]<-1
sample(which(mat<1))
[1] 3 13 8 2 4 14 11 9 10 5 15 7 1 12 6
Oops, you did not get [1] 15. Instead, what happens is that only a single integer (15 in this case) is passed to sample. When you do sample(x) and x is a single integer, it gives you a sample from 1:x with the integers in random order.
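If you want to keep the original loop, the Examples section of ?sample suggests a small wrapper, resample(), that always samples from the elements of x, even when x has length 1. A sketch of the loop using it (the steps counter and the diag() shortcut for the starting matrix are additions for illustration):

```r
# From the examples in ?sample: index into x instead of sampling 1:x
resample <- function(x, ...) x[sample.int(length(x), ...)]

mat <- diag(10)[, 10:1]   # 10x10 matrix with ones on the anti-diagonal
steps <- 0
while (sum(mat) < 100) {
  rnum <- resample(which(mat < 1), 1)  # always one of the remaining zero cells
  mat[rnum] <- 1
  steps <- steps + 1
}
steps
# [1] 90
```

Each pass now fills exactly one empty cell, so the loop ends after 90 iterations regardless of the seed.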