Convert this nested for loop (MATLAB) to R [duplicate]

This question already has an answer here:
R - My conditional and nested for loop take too long. How to vectorize?
(1 answer)
Closed 9 years ago.
I need to convert this for loop into R
for ii = 100:(size(start,1)-N)
if start(ii) == 1 && mean(start(ii-11:ii-1)) == 0
count = count + 1;
sif(count,:) = s(ii:ii+N-1);
time(count) = ii*1/FS;
end
end
The start vector is a one-dimensional vector of TRUE and FALSE values, about 3 million elements in total.
As loops in R take a long time, it takes about 3 hours to execute the code, so it needs to be vectorized.
If someone could help, I would really appreciate it.
Edit
Here is my R code with just a simple count (which takes hours to execute)
for(ii in 100:sp)
{
if(start(ii) == 1 && mean(start(ii-11:ii-1)) == 0)
{
count = count + 1
}
}
Edit-2
Here are the dummy values:
start:
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[13] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
N:
[1] 882
FS:
[1] 44100
s:
[1] 1.762390e-01 1.797791e-01 1.826172e-01 1.795044e-01 1.724243e-01
[6] 1.665039e-01 1.640625e-01 1.634827e-01 1.628723e-01 1.606750e-01

I just created some dummy data:
set.seed(1234)
start = sample(c(TRUE,FALSE), 300000, replace=TRUE)
N = 882
count = 0
Your R code takes:
system.time(
for(ii in 100:(length(start)-N))
{
if(start(ii) == 1 && mean(start((ii-11):(ii-1))) == 0)
{
count = count + 1
}
})
## user system elapsed
## 15.42 0.00 15.43
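R has a built-in function called start, and start(ii) calls that function instead of indexing the vector start; indexing needs square brackets. Note also that : binds tighter than -, so the window has to be written as (ii-11):(ii-1). The correct and faster way is: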
system.time(
for(ii in 100:(length(start)-N))
{
if(start[ii] == 1 && mean(start[(ii-11):(ii-1)]) == 0)
{
count = count + 1
}
})
## user system elapsed
## 2.04 0.00 2.04
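Since the question asked for a vectorized version, here is a minimal sketch of one. It assumes that the condition mean(start[(ii-11):(ii-1)]) == 0 simply means "no TRUE among the previous 11 elements". The function name find_onsets and its arguments are hypothetical and not part of the original answer. A cumulative sum gives the number of TRUE values in each window without a loop:

```r
# Hypothetical helper: returns the indices ii in first:(length(start) - N)
# where start[ii] is TRUE and the 11 elements before it are all FALSE.
# Assumes first >= 13 so that the window never reaches before element 1.
find_onsets <- function(start, N, first = 100) {
  idx <- first:(length(start) - N)
  cs <- cumsum(start)                      # running count of TRUE values
  prev11 <- cs[idx - 1] - cs[idx - 12]     # number of TRUEs in (ii-11):(ii-1)
  idx[start[idx] & prev11 == 0]
}

set.seed(1234)
start <- sample(c(TRUE, FALSE), 300000, replace = TRUE)
hits <- find_onsets(start, N = 882)
count <- length(hits)                      # plays the role of count in the loop
```

With hits available, the time values from the MATLAB code would be hits / FS, and each row of sif could be filled by indexing s from each hit; that part is left out of the sketch.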

Related

Check if a number is between two others

I am looking for a function that verifies if a number is between two other numbers. I also need to control if I want a strict comparison (a
I know the function between() in dplyr. Yet, I have to know the upper and lower numbers.
MyNumber = 8
First = 2
Second = 10
# This will return TRUE
between(MyNumber, lower = First, upper = Second)
# But this will return FALSE
between(MyNumber, lower = Second, upper = First)
# This will return TRUE. I want it to return FALSE
First = 8
between(MyNumber, lower = First, upper = Second)
I need a function that returns TRUE no matter what is the order.
Something like:
between2 <- function(number,bounds) { number > min(bounds) & number < max(bounds)}
between2(8, c(2,10))
[1] TRUE
between2(8, c(10,2))
[1] TRUE
This function also deals with your added condition
between2(8,c(8,10))
[1] FALSE
You could do it with simple arithmetic:
between <- function(number, first, second) { (first - number) * (second - number) < 0 }
Here are some example outputs:
> between(8, 2, 10)
[1] TRUE
> between(8, 10, 2)
[1] TRUE
> between(8, 10, 12)
[1] FALSE
> between(8, 1, 2)
[1] FALSE
You could use %in% with the : operator, once you know first and last (this only works for integer values):
first <- 2
last <- 10
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 2
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 12
number <- 8
number %in% first:last
[1] FALSE
first <- 12
last <- 10
number <- 8
number %in% first:last
[1] FALSE
Wrapped in a function, where the strict argument controls whether the bounds themselves count:
my_between <- function(n, f, l, strict = FALSE) {
  lo <- min(f, l)
  hi <- max(f, l)
  if (!strict) {
    n %in% lo:hi # if strict == FALSE (default), bounds included
  } else {
    n %in% setdiff(lo:hi, c(lo, hi)) # if strict == TRUE, bounds excluded
  }
}
my_between(8, 2, 10)
What's wrong with
f_between <- function (num, L, R) num>=min(L,R) & num<=max(L,R)
f_between(8, 2, 10)
#[1] TRUE
f_between(6, 6, 10)
#[1] TRUE
f_between(2, -10, -2)
#[1] FALSE
f_between(3, 5, 7)
#[1] FALSE
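The answers above can be combined into one helper that ignores the order of the bounds and also offers the strict/non-strict switch the question asked for. This is only a sketch; the name between_any and its strict argument are made up for illustration:

```r
# Hypothetical helper: is x between a and b, whatever their order?
# strict = TRUE excludes the endpoints, strict = FALSE includes them.
between_any <- function(x, a, b, strict = FALSE) {
  lo <- pmin(a, b)
  hi <- pmax(a, b)
  if (strict) x > lo & x < hi else x >= lo & x <= hi
}

between_any(8, 2, 10)                 # TRUE
between_any(8, 10, 2)                 # TRUE
between_any(8, 8, 10)                 # TRUE  (bounds included by default)
between_any(8, 8, 10, strict = TRUE)  # FALSE (bounds excluded)
between_any(8.5, 10, 2)               # TRUE  (non-integers work as well)
```

Using pmin and pmax rather than min and max keeps the function vectorized over x, a and b.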

How can I complete this Fibonacci sequence evaluation in R?

Greetings good people of Stackland!
Recently I was given this task
Generate the Fibonacci sequence in any language
Evaluate whether each value is odd or even
Sum the even numbers such that their total is not >500,000
I chose to do this in R, as I am learning the language and thought it would be a good exercise.
I have managed to complete step 2 of the task but haven't been able to proceed any further. Please see code and comments below.
len <- 50
fibvals <- numeric(len)
fibvals[1] <- 1
fibvals[2] <- 1
for(i in 3:len) { fibvals[i] <- fibvals[i-1]+fibvals[i-2]}
fibvals
[1] 1 1 2 3 5
[6] 8 13 21 34 55
[11] 89 144 233 377 610
[16] 987 1597 2584 4181 6765
[21] 10946 17711 28657 46368 75025
[26] 121393 196418 317811 514229 832040
[31] 1346269 2178309 3524578 5702887 9227465
[36] 14930352 24157817 39088169 63245986 102334155
[41] 165580141 267914296 433494437 701408733 1134903170
[46] 1836311903 2971215073 4807526976 7778742049 12586269025
# Creates a variable called len in which the value 50 is stored
# Creates a var called fibvals, which is a numeric datatype, which should have len (50) vals
# Sets the value of the first entry in fibvals to 1
# Sets the value of the second entry in fibvals to 1
# Loop - "for (i in 3:len)" runs the loop body for i = 3 up to i = 50 (the value of len)
# Loop - Each pass sets entry i to the sum of the two entries before it: fibvals[i-1] + fibvals[i-2]
# Loop - Example: fibvals[5] = fibvals[4] + fibvals[3] = 3 + 2 = 5 | fibvals[8] = fibvals[7] + fibvals[6] = 13 + 8 = 21
is.even <- function(x){ x %% 2 == 0 }
# Creates a UDF to check if values are odd or even by using modulo.
# If the remainder is 0 when any value is divided by 2, it is an even number
is.even(fibvals)
[1] FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
[11] FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
[21] TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
[31] FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
[41] FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
# Evaluates all Fibonacci values on odd or even property
What I need is a bit of guidance as to where I should go from here.
Should I create a data.table and query that using the SQL package, or is there a much more elegant and less cumbersome way?
Thanks in advance!
To pick out the even numbers from the first 50 Fibonacci numbers, you can use this:
even_numbers <- fibvals[fibvals%%2==0]
Then, by computing the cumulative sum of those even numbers and comparing it with the maximum allowed total, you can flag which even numbers to keep:
cumsum(even_numbers)<500000
Therefore your desired Fibonacci numbers are
even_numbers[cumsum(even_numbers)<500000]
and their sum is
sum(even_numbers[cumsum(even_numbers)<500000])
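Putting these steps together with the generation loop from the question gives a short end-to-end example. It uses <= rather than <, on the assumption that "not >500,000" allows a total of exactly 500,000; for this sequence the result is the same either way.

```r
# Generate the first 50 Fibonacci numbers, as in the question
len <- 50
fibvals <- numeric(len)
fibvals[1:2] <- 1
for (i in 3:len) fibvals[i] <- fibvals[i - 1] + fibvals[i - 2]

# Keep the even values while their running total stays within the limit
even_numbers <- fibvals[fibvals %% 2 == 0]
keep <- cumsum(even_numbers) <= 500000
even_numbers[keep]        # 2 8 34 144 610 2584 10946 46368 196418
sum(even_numbers[keep])   # 257114
```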
This would do it
fsum <- 0
for (i in 1:len) { if (is.even(fibvals[i]) && (fsum + fibvals[i])<=500000) {fsum = fsum + fibvals[i]}}
The sum would then be stored in fsum.
Here's a way to do it with a recursive function:
getEvenWithFibber <- function(y = c(1,1),
s = 0,
threshold = 500000) {
if(s + y[1] + y[2] < threshold)
getEvenWithFibber(y = c(y[1] + y[2],y), s = s + ifelse(y[1]%%2==0,y[1],0))
else list(sum = s, seq = y, iseven = y%%2 == 0)
}
getEvenWithFibber()

Determine if there are x consecutive duplicates in a vector in R

I have the following vector:
p<-c(0,0,1,1,1,3,2,3,2,2,2,2)
I'm trying to write a function that returns TRUE if there are x consecutive duplicates in the vector.
The function call found_duplications(p,3) will return TRUE because there are three consecutive 1's. The function call found_duplications(p,5) will return FALSE because no number appears 5 times in a row. The function call found_duplications(p,4) will return TRUE because there are four consecutive 2's.
I have a couple ideas. There's the duplicated() function:
duplicated(p)
> [1] FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
I can make a for loop that counts the number of TRUE's in the vector but the problem is that the consecutive counter would be off by one. Can you guys think of any other solutions?
You could also do
find.dup <- function(x, n){
n %in% rle(x)$lengths
}
find.dup(p,3)
#[1] TRUE
find.dup(p,2)
#[1] TRUE
find.dup(p,5)
#[1] FALSE
find.dup(p,4)
#[1] TRUE
p<-c(0,0,1,1,1,3,2,3,2,2,2,2)
find.dup <- function(x, n) {
consec <- 1
for(i in 2:length(x)) {
if(x[i] == x[i-1]) {
consec <- consec + 1
} else {
consec <- 1
}
if(consec == n)
return(TRUE) # or you could return x[i]
}
return(FALSE)
}
find.dup(p,3)
# [1] TRUE
find.dup(p,4)
# [1] TRUE
find.dup(p,5)
# [1] FALSE
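The two versions above differ in one respect: n %in% rle(x)$lengths is TRUE only if some run has exactly n elements, whereas the loop returns TRUE as soon as a run reaches n, so it also accepts longer runs. If "at least n in a row" is what is wanted, a sketch along the lines of the rle answer could look like this (has_run is a made-up name):

```r
# Hypothetical variant: TRUE if any value occurs at least n times in a row
has_run <- function(x, n) any(rle(x)$lengths >= n)

p <- c(0, 0, 1, 1, 1, 3, 2, 3, 2, 2, 2, 2)
has_run(p, 3)               # TRUE
has_run(p, 5)               # FALSE
has_run(c(2, 2, 2, 2), 3)   # TRUE, while 3 %in% rle(c(2, 2, 2, 2))$lengths is FALSE
```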

Creation of a specific vector without loop or recursion in R

I've got a first vector, let's say x, that consists only of 1's and -1's. Then I have a second vector y that consists of 1's, -1's, and zeros. Now I'd like to create a vector z that contains a 1 at index i if x[i] equals 1 and a 1 exists among the n preceding elements of y (y[(i-n):i]), and a 0 otherwise.
More formally: z <- ifelse(x == 1 && 1 %in% y[(index(y)-n):index(y)],1,0)
I'm looking to create such a vector in R without looping or recursion. The expression above does not work because y[(index(y)-n):index(y)] is not evaluated element by element.
Thanks a lot for your support
Here's an approach that uses the cumsum function to test for the number of ones that have been seen so far. If the number of ones at position i is larger than the number of ones at position i-n, then the condition on the right will be satisfied.
## Generate some random y's.
> y <- sample(-1:1, 25, replace=T)
> y
[1] 0 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 0 0 -1 -1 -1 1 -1 1 1 0 0 0 1
> n <- 3
## Compute number of ones seen at each position.
> cs <- cumsum(ifelse(y == 1, 1, 0))
> lagged.cs <- c(rep(0, n), cs[1:(length(cs)-n)])
> (cs - lagged.cs) > 0
[1] FALSE TRUE TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
[25] TRUE
You could use lapply like this, although it is essentially a pretty way to write a loop, so I'm not sure it will be faster (it may or may not).
y1 <- unlist(lapply(1:length(x), function(i){1 %in% y[max(0, (i-n)):i]}))
z <- as.numeric(x==1) * as.numeric(y1)
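For completeness, the cumsum idea can be carried through to the final vector z. The sketch below assumes the window is y[(i-n):i], that is n + 1 elements including position i, truncated at the start of the vector; the function name z_window is hypothetical.

```r
# Hypothetical helper: z[i] is 1 when x[i] == 1 and y[(i-n):i] contains a 1
z_window <- function(x, y, n) {
  stopifnot(length(x) == length(y), n + 1 < length(y))
  cs <- cumsum(y == 1)                             # ones seen up to position i
  lagged <- c(rep(0, n + 1), head(cs, -(n + 1)))   # ones seen up to position i-n-1
  as.numeric(x == 1 & (cs - lagged) > 0)
}

x <- c(1, 1, -1, 1, 1, 1)
y <- c(0, 1, 0, 0, 0, 0)
z_window(x, y, n = 2)   # 0 1 0 1 0 0
```

No element-wise loop is involved, so this should scale better than the lapply version on long vectors, although that has not been timed here.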

Insert elements into a vector at given indexes

I have a logical vector, for which I wish to insert new elements at particular indexes. I've come up with a clumsy solution below, but is there a neater way?
probes <- rep(TRUE, 15)
ind <- c(5, 10)
probes.2 <- logical(length(probes)+length(ind))
probes.ind <- ind + 1:length(ind)
probes.original <- (1:length(probes.2))[-probes.ind]
probes.2[probes.ind] <- FALSE
probes.2[probes.original] <- probes
print(probes)
gives
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
and
print(probes.2)
gives
[1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
[13] TRUE TRUE TRUE TRUE TRUE
So it works but is ugly looking - any suggestions?
These are all very creative approaches. I think working with indexes is definitely the way to go (Marek's solution is very nice).
I would just mention that there is a function to do roughly that: append().
probes <- rep(TRUE, 15)
probes <- append(probes, FALSE, after=5)
probes <- append(probes, FALSE, after=11)
Or you could do this in a loop over your indexes (you need to grow the "after" value on each iteration):
probes <- rep(TRUE, 15)
ind <- c(5, 10)
for(i in 0:(length(ind)-1))
probes <- append(probes, FALSE, after=(ind[i+1]+i))
Incidentally, this question was also previously asked on R-Help. As Barry says:
"Actually I'd say there were no ways of doing this, since I dont think you can actually insert into a vector - you have to create a new vector that produces the illusion of insertion!"
You can do some magic with indexes:
First create vector with output values:
probs <- rep(TRUE, 15)
ind <- c(5, 10)
val <- c( probs, rep(FALSE,length(ind)) )
# > val
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
# [13] TRUE TRUE TRUE FALSE FALSE
Now the trick: each old element gets its rank as an id, and each new element gets a half-rank.
id <- c( seq_along(probs), ind+0.5 )
# > id
# [1] 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0
# [16] 5.5 10.5
Then use order to sort in proper order:
val[order(id)]
# [1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
# [13] TRUE TRUE TRUE TRUE TRUE
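The half-rank trick generalizes easily. As a sketch, it can be wrapped in a small function; the name insert_after and its arguments are made up for illustration:

```r
# Hypothetical helper: insert `value` after each position in `ind`
# (positions refer to the original vector x; order of ind does not matter)
insert_after <- function(x, ind, value) {
  vals <- c(x, rep(value, length(ind)))
  id <- c(seq_along(x), ind + 0.5)   # new elements sort just after their index
  vals[order(id)]
}

insert_after(rep(TRUE, 15), c(5, 10), FALSE)
# FALSE ends up at positions 6 and 12 of the 17-element result

insert_after(0:9, c(6, 3, 5), 100)
# 0 1 2 100 3 4 100 5 100 6 7 8 9
```

Because order() sorts the ids, ind does not need to be sorted beforehand, and repeating an index inserts several values at the same place.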
probes <- rep(TRUE, 1000000)
ind <- c(50:100)
val <- rep(FALSE,length(ind))
new.probes <- vector(mode="logical",length(probes)+length(val))
new.probes[-ind] <- probes
new.probes[ind] <- val
Some timings:
My method
user system elapsed
0.03 0.00 0.03
Marek method
user system elapsed
0.18 0.00 0.18
R append with for loop
user system elapsed
1.61 0.48 2.10
How about this:
> probes <- rep(TRUE, 15)
> ind <- c(5, 10)
> probes.ind <- rep(NA, length(probes))
> probes.ind[ind] <- FALSE
> new.probes <- as.vector(rbind(probes, probes.ind))
> new.probes <- new.probes[!is.na(new.probes)]
> new.probes
[1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
[13] TRUE TRUE TRUE TRUE TRUE
That is sorta tricky. Here's one way. It iterates over the list, inserting each time, so it's not too efficient.
probes <- rep(TRUE, 15)
ind <- c(5, 10)
probes.ind <- ind + 0:(length(ind)-1)
for (i in probes.ind) {
probes <- c(probes[1:i], FALSE, probes[(i+1):length(probes)])
}
> probes
[1] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE
[13] TRUE TRUE TRUE TRUE TRUE
This should even work if ind has repeated elements, although ind does need to be sorted for the probes.ind construction to work.
Or you can do it using the insertRow function from the miscTools package.
library(miscTools)
probes <- rep(TRUE, 15)
ind <- c(5,10)
for (i in ind){
probes <- as.vector(insertRow(as.matrix(probes), i, FALSE))
}
I came up with a good answer that's easy to understand and fairly fast to run, building off Wojciech's answer above. I'll adapt the method for the example here, but it can be easily generalized to pretty much any data type for an arbitrary pattern of missing points (shown below).
probes <- rep(TRUE, 15)
ind <- c(5,10)
probes.final <- rep(FALSE, length(probes)+length(ind))
probes.final[-ind] <- probes
The data I needed this for is sampled at a regular interval, but many samples are thrown out, and the resulting data file only includes the timestamps and measurements for those retained. I needed to produce a vector containing all the timestamps and a data vector with NAs inserted for timestamps that were tossed. I used the "not in" function stolen from here to make it a bit simpler.
`%notin%` <- Negate(`%in%`)
dat <- rnorm(50000) # Data given
times <- seq(from=554.3, by=0.1, length.out=70000) # "Original" time stamps
times <- times[-sample(2:69999, 20000)] # "Given" times with arbitrary points missing from interior
times.final <- seq(from=times[1], to=times[length(times)], by=0.1)
na.ind <- which(times.final %notin% times)
dat.final <- rep(NA, length(times.final))
dat.final[-na.ind] <- dat
Um, hi, I had the same doubt, but I couldn't understand what people had answered, because I'm still learning the language. So I tried to make my own, and I suppose it works! I created a vector and I wanted to insert the value 100 after the 3rd, 5th and 6th indexes. This is what I wrote.
vector <- c(0:9)
indexes <- c(6, 3, 5)
indexes <- indexes[order(indexes)]
i <- 1
j <- 0
while(i <= length(indexes)){
vector <- append(vector, 100, after = indexes[i] + j)
i <-i + 1
j <- j + 1
}
vector
The vector "indexes" must be in ascending order for this to work. This is why I put them in order at the third line.
The variable "j" is necessary because at each iteration, the length of the new vector increases and the original values are moved.
If you wish to insert several new values next to each other, simply repeat the index. For instance, by assigning indexes <- c(3, 5, 5, 5, 6), you should get vector == 0 1 2 100 3 4 100 100 100 5 100 6 7 8 9
