I wrote a recursive binary search function in R which finds the smallest element in a vector that is greater than a given value:
binary_next_biggest <- function(x, vec){
if (length(vec) == 1){
if (x < vec[1]){
return(vec[1])
} else {
return(NA)
}
} else {
mid = ceiling(length(vec)/2)
if (x < vec[mid]){
return(binary_next_biggest(x, vec[1:mid]))
} else {
return(binary_next_biggest(x, vec[mid+1:length(vec)]))
}
}
}
I've written this exact same function in Python with no issues (code below), but in R it does not work.
import numpy as np
def binary_next_biggest(x, arr):
    if len(arr)==1:
        if x < arr[0]:
            return arr[0]
        else:
            return None
    else:
        mid = int(np.ceil(len(arr)/2)-1)
        if x < arr[mid]:
            return binary_next_biggest(x, arr[:mid+1])
        else:
            return binary_next_biggest(x, arr[mid+1:])
Through debugging in RStudio I discovered the mechanics of why it's not working: indexing the vector in my above function is returning a vector of the same length, so that if
vec <- 1:10
and vec is indexed within the function,
vec[6:10]
the resulting vector passed to the new call of binary_next_biggest() is
6 7 8 9 10 NA NA NA NA NA
where I would expect
6 7 8 9 10
What's going on here? I know I can just rewrite it as a while loop iteratively changing indexes, but I don't understand why vector indexing is behaving this way in the code I've written. Within the interactive R console indexing behaves as expected and changes the vector length, so why would it behave differently within a function, and what would be the appropriate way to index for what I'm trying to do?
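The cause of the strange behavior is an operator-precedence error in the index expression. mid+1:length(vec) should be (mid+1):length(vec), because the : operator has higher precedence than +, so 1:length(vec) is evaluated first and mid is then added to every element. With vec <- 1:10 and mid equal to 5, the index becomes 6:15 rather than 6:10, and positions 11 to 15 do not exist, which is where the NAs come from.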
Here is an illustration of the difference.
5 + 1:10
# [1] 6 7 8 9 10 11 12 13 14 15
(5+1):10
# [1] 6 7 8 9 10
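Applying that fix to the function from the question gives something like the sketch below. It assumes vec is non-empty and sorted in increasing order, as a binary search requires. The local variable n is introduced here only for readability.

```r
binary_next_biggest <- function(x, vec) {
  n <- length(vec)
  if (n == 1) {
    # Base case: a single candidate is left
    if (x < vec[1]) return(vec[1])
    return(NA)
  }
  mid <- ceiling(n / 2)
  if (x < vec[mid]) {
    # The answer is in the lower half (including vec[mid])
    binary_next_biggest(x, vec[1:mid])
  } else {
    # Parentheses force the addition to happen before `:`
    binary_next_biggest(x, vec[(mid + 1):n])
  }
}

binary_next_biggest(5, 1:10)
# [1] 6
binary_next_biggest(10, 1:10)
# [1] NA
```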
There might be a reason why you're doing a binary search (simplified example of more complicated problem?), but there are easier ways to do this in R.
vec <- 1:1000
x <- 49
min(vec[which(vec > x)])
# [1] 50
This works even if vec isn't ordered.
vec <- sample.int(1000)
min(vec[which(vec > x)])
# [1] 50
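If vec is known to be sorted, base R's findInterval() can do the search for you. The sketch below is one possible way to use it and is not part of the original answer; the helper name next_biggest is made up for illustration. For a sorted vec, findInterval(x, vec) gives the number of elements that are <= x, so the next position holds the smallest element greater than x.

```r
# Hypothetical helper; assumes vec is sorted in non-decreasing order
next_biggest <- function(x, vec) {
  i <- findInterval(x, vec)  # count of elements <= x
  vec[i + 1]                 # indexing one past the end yields NA
}

next_biggest(49, 1:1000)
# [1] 50
next_biggest(1000, 1:1000)
# [1] NA
```

Note that the min() version returns Inf with a warning when no element is greater than x, whereas this sketch returns NA in that case.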
Related
This code is supposed to remove all multiples of 4 from the given vector, but when I run it, only 8 gets removed.
multipleoffour<- function(y){
y2<-y
for (n in y )
{if (n%%4==0)
y2<-y2[-n]
}
return (y2)
}
multipleoffour(c(2,4,6,8,10,12,14))
Since R is vectorized this is more of an R way to do this:
multipleoffour<- function(y){
y[y %% 4 != 0]
}
multipleoffour(c(2,4,6,8,10,12,14))
## [1] 2 6 10 14
The reason your code doesn't work is that y2<-y2[-n] removes the element at position n of the vector, not the value n itself. In your example it removed 8, which is the 4th element of your vector; the later attempts with n equal to 8 and 12 point at positions that do not exist, so nothing else is dropped. Otherwise, I agree with the other answers about how to do it more efficiently.
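A small illustration of the difference between dropping by position and dropping by value, using the vector from the question:

```r
y <- c(2, 4, 6, 8, 10, 12, 14)

# Negative index: drop the element at position 4 (the value 8)
y[-4]
# [1]  2  4  6 10 12 14

# Position 8 does not exist in a length-7 vector, so nothing is dropped
y[-8]
# [1]  2  4  6  8 10 12 14

# Logical index: drop every element whose value equals 4
y[y != 4]
# [1]  2  6  8 10 12 14
```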
I've made this simple code to test something that isn't working.
funcion=function(x,p){
for(i in 1:p){
return(x+i)
}
}
funcion(5,5)
This returns the value 6, and not 6,7,8,9,10 which is what I would've expected and what I'm looking for.
Can someone explain why this works this way and how can I make it so that I get what I want?
Thank you
We need to collect the output in an object and then return it:
f1 <- function(x, p) {
# // create an object to store the output from each iteration
out <- numeric(p)
for(i in seq_len(p)) {
out[i] <- x + i # // assign the output based on the index
}
return(out)
}
f1(5, 5)
#[1] 6 7 8 9 10
In R, this can be executed without a for loop i.e.
5 + seq_len(5)
#[1] 6 7 8 9 10
The issue is that return() exits the function immediately, so it can only run once per call. It is reached in the first iteration with x + 1, and the function returns that value instead of the full output.
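To see that the loop really stops after its first pass, here is a minimal sketch with a counter. The function name count_passes is made up for this illustration.

```r
count_passes <- function(p) {
  n <- 0
  for (i in seq_len(p)) {
    n <- n + 1
    return(n)  # leaves the function during the first iteration
  }
}

count_passes(5)
# [1] 1
```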
I've been set a question on the Fibonacci Sequence and although I've been successful in doing the sequence, I haven't been as lucky summing the even terms up (i.e. 2nd, 4th, 6th... etc.) My code is below as well as the part of the question I am stuck on. Any guidance would be brilliant!
Question:
Write a function which will take as an input x and y and will return either the sum of the first x even Fibonacci numbers or the sum of even Fibonacci numbers less than y.
That means the user will be able to specify either x or y but not both.
You have to return a warning if someone uses both numbers (decide
on the message to return)
Code:
y <- 10
fibvals <- numeric(y)
fibvals[1] <- 1
fibvals[2] <- 1
for (i in 3:y) {
fibvals[i] <- fibvals[i-1]+fibvals[i-2]
if (i %% 2)
v<-sum(fibvals[i])
}
v
To get you started since this sounds like an exercise.
I would split your loop up into steps rather than do the summing within the loop with an if statement. Since you already have the sequence code working, you can just return what is asked for by the user. The missing() function would probably help you out here.
f <- function(x, y) {
if (missing(y)) {
warning('you must give y')
y <- 10
}
fibvals <- numeric(y)
fibvals[1] <- 1
fibvals[2] <- 1
for (i in 3:y) {
fibvals[i] <- fibvals[i-1]+fibvals[i-2]
}
evens <- fibvals %% 2 == 0
odds <- fibvals %% 2 != 0
if (missing(x)) {
return(sum(fibvals[evens]))
} else return(fibvals)
}
f(y = 20)
# [1] 3382
f(10)
# [1] 1 1 2 3 5 8 13 21 34 55
# Warning message:
# In f(10) : you must give y
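Going one step further, the "either x or y, but not both" logic could be sketched as follows. This is only one reading of the exercise: it treats "even Fibonacci numbers" as even-valued terms (as the code above does), and it chooses to warn and ignore y when both are given. The function name sum_even_fib and the messages are made up for illustration.

```r
sum_even_fib <- function(x, y) {
  use_x <- !missing(x)
  if (!use_x && missing(y)) stop("supply either x or y")
  if (use_x && !missing(y)) {
    warning("both x and y supplied; ignoring y")
  }
  a <- 1; b <- 1           # two consecutive Fibonacci numbers
  total <- 0; found <- 0   # running sum and count of even terms
  repeat {
    if (use_x) {
      if (found >= x) break  # collected the first x even terms
    } else if (a >= y) {
      break                  # only terms strictly less than y count
    }
    if (a %% 2 == 0) {
      total <- total + a
      found <- found + 1
    }
    nxt <- a + b; a <- b; b <- nxt
  }
  total
}

sum_even_fib(x = 3)    # 2 + 8 + 34
# [1] 44
sum_even_fib(y = 100)  # even terms below 100
# [1] 44
```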
I have just started learning R and I wrote this code to learn on functions and loops.
squared<-function(x){
m<-c()
for(i in 1:x){
y<-i*i
c(m,y)
}
return (m)
}
squared(5)
NULL
Why does this return NULL? I want the i*i values to be appended to the end of m, and the function to return a vector. Can someone please point out what's wrong with this code?
Nothing is ever stored in m inside your loop, because the result of c(m, y) is not assigned to anything. So m keeps its initial value, which is the following -
m <- c()
m
# NULL
You can change the function to return the desired values by assigning m in the loop.
squared <- function(x) {
m <- c()
for(i in 1:x) {
y <- i * i
m <- c(m, y)
}
return(m)
}
squared(5)
# [1] 1 4 9 16 25
But this is inefficient because we know the length of the resulting vector will be 5 (or x). So we want to allocate the memory first before looping. This will be the better way to use the for() loop.
squared <- function(x) {
m <- vector("integer", x)
for(i in seq_len(x)) {
m[i] <- i * i
}
m
}
squared(5)
# [1] 1 4 9 16 25
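The preallocated version also uses seq_len(x) where the first version used 1:x. A small sketch of why that can matter, for the edge case x = 0 (the helper count_passes is made up for illustration):

```r
x <- 0
1:x          # counts down instead of being empty
# [1] 1 0
seq_len(x)   # empty, so a for loop over it never runs
# integer(0)

# Count how many times a loop body would execute for a given index vector
count_passes <- function(idx) {
  n <- 0
  for (i in idx) n <- n + 1
  n
}
count_passes(1:x)
# [1] 2
count_passes(seq_len(x))
# [1] 0
```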
Also notice that I have removed return() from the second function. It is not necessary there, so it can be removed. It's a matter of personal preference to leave it in this situation. Sometimes it will be necessary, like in if() statements for example.
I know the question is about looping, but I also must mention that this can be done more efficiently with seven characters using the primitive ^, like this
(1:5)^2
# [1] 1 4 9 16 25
^ is a primitive function, which means the code is written entirely in C and will be the most efficient of these three methods
`^`
# function (e1, e2) .Primitive("^")
Here's a general approach:
# Create empty vector
vec <- c()
for(i in 1:10){
# Inside the loop, make one or more elements to add to vector
new_elements <- i * 3
# Use 'c' to combine the existing vector with the new_elements
vec <- c(vec, new_elements)
}
vec
# [1] 3 6 9 12 15 18 21 24 27 30
If you happen to run out of memory (e.g. if your loop has a lot of iterations or vectors are large), you can try vector preallocation which will be more efficient. That's not usually necessary unless your vectors are particularly large though.
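For comparison, a preallocated version of the same loop might look like the sketch below. It assumes the number of iterations n is known in advance; both variants produce the same result.

```r
n <- 10

# Growing: a new, longer vector is built on every iteration
grown <- c()
for (i in seq_len(n)) {
  grown <- c(grown, i * 3)
}

# Preallocating: the vector is created once and filled in place
prealloc <- numeric(n)
for (i in seq_len(n)) {
  prealloc[i] <- i * 3
}

identical(grown, prealloc)
# [1] TRUE
```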
I'm looking for a function that
can list all n! permutations of a given input vector (typically just the sequence 1:n)
can also list just the first N of all n! permutations
The first requirement is met, e.g., by permn() from package combinat, permutations() from package e1071, or permutations() from package gtools. However, I'm positive that there is yet another function from some package that also provides the second feature. I used it once, but have since forgotten its name.
Edit:
The definition of "first N" is arbitrary: the function just needs an internal enumeration scheme which is always followed, and should break after N permutations are computed.
As Spacedman correctly pointed out, it's crucial that the function does not compute more permutations than actually needed (to save time).
Edit - solution: I remembered what I was using, it was numperm() from package sna. numperm(4, 7) gives the 7th permutation of elements 1:4, for the first N, one has to loop.
It seems like the best way to approach this would be to construct an iterator that could produce the list of permutations rather than using a function like permn which generates the entire list up front (an expensive operation).
An excellent place to look for guidance on constructing such objects is the itertools module in the Python standard library. Itertools has been partially re-implemented for R as a package of the same name.
The following is an example that uses R's itertools to implement a port of the Python generator that creates iterators for permutations:
require(itertools)
permutations <- function(iterable) {
# Returns permutations of iterable. Based on code given in the documentation
# of the `permutations` function in the Python itertools module:
# http://docs.python.org/library/itertools.html#itertools.permutations
n <- length(iterable)
indicies <- seq(n)
cycles <- rev(indicies)
stop_iteration <- FALSE
nextEl <- function(){
if (stop_iteration){ stop('StopIteration', call. = FALSE) }
if (cycles[1] == 1){ stop_iteration <<- TRUE } # Triggered on last iteration
for (i in rev(seq(n))) {
cycles[i] <<- cycles[i] - 1
if ( cycles[i] == 0 ){
if (i < n){
indicies[i:n] <<- c(indicies[(i+1):n], indicies[i])
}
cycles[i] <<- n - i + 1
}else{
j <- cycles[i]
indicies[c(i, n-j+1)] <<- c(indicies[n-j+1], indicies[i])
return( iterable[indicies] )
}
}
}
# chain is used to return a copy of the original sequence
# before returning permutations.
return( chain(list(iterable), new_iterator(nextElem = nextEl)) )
}
To misquote Knuth: "Beware of bugs in the above code; I have only tried it, not proved it correct."
For the first 3 permutations of the sequence 1:10, permn pays a heavy price for computing unnecessary permutations:
> system.time( first_three <- permn(1:10)[1:3] )
user system elapsed
134.809 0.439 135.251
> first_three
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] 1 2 3 4 5 6 7 8 10 9
[[3]]
[1] 1 2 3 4 5 6 7 10 8 9
However, the iterator returned by permutations can be queried for only the first three elements which spares a lot of computations:
> system.time( first_three <- as.list(ilimit(permutations(1:10), 3)) )
user system elapsed
0.002 0.000 0.002
> first_three
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] 1 2 3 4 5 6 7 8 10 9
[[3]]
[1] 1 2 3 4 5 6 7 9 8 10
The Python algorithm does generate permutations in a different order than permn.
Computing all the permutations is still possible:
> system.time( all_perms <- as.list(permutations(1:10)) )
user system elapsed
498.601 0.672 499.284
This is much more expensive than permn, though, because the Python algorithm makes heavy use of loops. Python actually implements this algorithm in C, which compensates for the inefficiency of interpreted loops.
The code is available in a gist on GitHub. If anyone has a better idea, fork away!
In my version of R/combinat, the function permn() is just over thirty lines long. One way would be to make a copy of permn and change it to stop early.
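Another option that stops early without any packages is to step through permutations in lexicographic order. The sketch below is a plain base R illustration of the standard "next permutation" technique; it is not the algorithm permn() uses, nor the itertools port above, and the function names next_perm and first_permutations are made up here.

```r
# Return the next permutation of an index vector in lexicographic order,
# or NULL if idx is already the last one.
next_perm <- function(idx) {
  n <- length(idx)
  i <- n - 1
  # Find the rightmost position whose value is smaller than its successor
  while (i >= 1 && idx[i] >= idx[i + 1]) i <- i - 1
  if (i < 1) return(NULL)
  # Find the rightmost value larger than idx[i] and swap the two
  j <- n
  while (idx[j] <= idx[i]) j <- j - 1
  idx[c(i, j)] <- idx[c(j, i)]
  # Reverse the tail so that it is in increasing order again
  idx[(i + 1):n] <- rev(idx[(i + 1):n])
  idx
}

# Collect at most N permutations of v, computing no more than needed
first_permutations <- function(v, N) {
  idx <- seq_along(v)
  out <- list()
  while (!is.null(idx) && length(out) < N) {
    out[[length(out) + 1]] <- v[idx]
    idx <- next_perm(idx)
  }
  out
}

first_permutations(1:3, 4)
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 1 3 2
#
# [[3]]
# [1] 2 1 3
#
# [[4]]
# [1] 2 3 1
```

Asking for more permutations than exist simply returns all n! of them.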