For loops that check for empty range in R

When dealing with recursive equations in mathematics, it is common to write equations that hold over some range k = 1,...,d with the implicit convention that if d < 1 then the set of equations is considered to be empty. When programming in R I would like to be able to write for loops in the same way as a mathematical statement (e.g., a recursive equation) so that it interprets a range with upper bound lower than the lower bound as being empty. This would ensure that the syntax of the algorithm mimics the syntax of the mathematical statement on which it is based.
Unfortunately, R does not interpret the for loop in this way, and so this commonly leads to errors when you program your loops in a way that mimics the underlying mathematics. For example, consider a simple function where we create a vector of zeros with length n and then change the first d values to ones using a loop over the elements in the range k = 1,...,d. If we input d < 1 into this function we would like the function to recognise that the loop is intended to be empty, so that we would get a vector of all zeros. However, using a standard for loop we get the following:
#Define a function using a recursive pattern
MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  for (k in 1:d) { OBJECT[k] <- 1 }
  OBJECT
}
#Generate some values of the function
MY_FUNC(10,4);
[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10,1);
[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10,0);
[1] 1 0 0 0 0 0 0 0 0 0
#Not what we wanted
MY_FUNC(10,-2);
[1] 1 1 1 1 1 1 1 1 1 1
#Not what we wanted
My Question: Is there any function in R that performs loops like a for loop, but interprets the loop as empty if the upper bound is lower than the lower bound? If there is no existing function, is there a way to program R to read loops this way?
Please note: I am not seeking answers that simply re-write this example function in a way that removes the loop. I am aware that this can be done in this specific case, but my goal is to get the loop working more generally. This example is shown only to give a clear view of the phenomenon I am dealing with.

There is, imho, no generic for loop doing what you like, but you could easily get the behaviour by adding
if (d < 1) break
as the first statement at the beginning of the loop.
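Applied to the example, a minimal sketch of this guard looks like the following; the break fires on the first iteration whenever the range was meant to be empty, so the vector is left untouched:

```r
MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  for (k in 1:d) {
    if (d < 1) break      # range is meant to be empty: leave immediately
    OBJECT[k] <- 1
  }
  OBJECT
}
MY_FUNC(10, -2)  # all zeros now
```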

EDIT
If you don't want an error to be returned when negative input is given, you can combine seq_len with pmax:
MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  for (k in seq_len(pmax(0, d))) { OBJECT[k] <- 1 }
  OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
#[1] 0 0 0 0 0 0 0 0 0 0
Previous Answer
Prefer seq_len over 1:d; it takes care of this situation:
MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  for (k in seq_len(d)) { OBJECT[k] <- 1 }
  OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in seq_len(d) : argument must be coercible to non-negative integer

The function can be vectorized
MY_FUNC <- function(n, d) {
  rep(c(1, 0), c(d, n - d))
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in rep(c(1, 0), c(d, n - d)) : invalid 'times' argument

Related

What is the most optimal and creative way to create a random Matrix with mostly zeros and some ones in Julia?

I want to create a random matrix with values of zeros and ones, with more zeros than ones! So I guess there should be something like a weighted Bernoulli distribution to choose between 0 and 1 each time (with more probability on 0). I prefer not to limit it to just n×n matrices! I can try it in a rather non-standard way like this:
julia> let mat = Matrix{Int64}(undef, 3, 5)
zero_or_one(shift) = rand()+shift>0.5 ? 0 : 1
foreach(x->mat[x]=zero_or_one(0.3), eachindex(mat))
end
julia> mat
3×5 Matrix{Int64}:
1 1 1 0 1
0 1 1 1 1
0 1 1 0 1
Note that this doesn't do the job, because as you can see, I get more ones than zeros in the result.
Is there any more optimal or at least creative way? Or any module to do it?
Update:
It seems the result of this code will never change whether I change the value of shift or not 😑.
using SparseArrays?
julia> sprand(Bool, 1_000_000,1_000_000, 1e-9)
1000000×1000000 SparseMatrixCSC{Bool, Int64} with 969 stored entries:
(unicode braille spy-plot of the sparsity pattern omitted)
I'd choose the Bernoulli distribution for this. Specify a success rate p; each draw takes the value 1 with probability p and 0 with probability 1-p.
using Distributions
mat = rand(Bernoulli(0.1), 3, 4)
3×4 Matrix{Bool}:
1 0 0 0
0 0 0 0
0 0 0 0
As for your code: with rand()+shift>0.5 ? 0 : 1, calling zero_or_one(0.3) returns 0 whenever rand() > 0.2, i.e. zeros with probability 0.8 and ones with probability 0.2, etc.
If you are OK with a BitMatrix:
julia> onesandzeros(shape...; threshold=0.5) = rand(shape...) .< threshold
onesandzeros (generic function with 1 method)
julia> onesandzeros(5, 8; threshold=0.2)
5×8 BitMatrix:
0 0 0 0 0 0 1 1
0 0 0 0 1 1 1 0
0 1 1 0 0 0 0 0
0 0 0 0 1 0 1 0
0 0 0 0 0 0 0 0
This amounts to sampling from a Binomial distribution with a single trial, i.e. a Bernoulli distribution.
If 0 and 1 should be equally probable, the default success probability p = 0.5 encodes this:
julia> using Distributions
julia> rand(Binomial(), 3, 5)
3×5 Matrix{Int64}:
1 1 1 1 1
1 0 1 0 0
0 0 0 1 1
The number of 1s in the matrix is proportional to the parameter p, so if the matrix should on average contain ~10% 1 and ~90% 0, this is the same as sampling from Binomial(1, 0.1):
julia> rand(Binomial(1, 0.1), 3, 5)
3×5 Matrix{Int64}:
0 0 1 0 0
0 0 0 0 0
0 0 0 1 1
See also: Distributions.Binomial
Although some comments had me half-convinced that the result of my code was reasonable, every time I revisited its simple procedure I couldn't stop puzzling over it. I finally found the snag in my code: I forgot that let blocks create a new hard scope. If I return the mat, it shows a different, expected result on each run:
julia> let mat = Matrix{Int64}(undef, 3, 5)
zero_or_one(shift) = rand()+shift>0.5 ? 0 : 1
foreach(x->mat[x]=zero_or_one(0.3), eachindex(mat))
return mat
end
3×5 Matrix{Int64}:
0 0 0 0 1
0 0 0 0 0
1 0 0 1 0
Then, to make mat available in the global scope, a begin block gets the job done:
julia> begin mat = Matrix{Int64}(undef, 3, 5)
zero_or_one(shift) = rand()+shift>0.5 ? 0 : 1
foreach(x->mat[x]=zero_or_one(0.3), eachindex(mat))
end
julia> mat
3×5 Matrix{Int64}:
0 1 0 0 1
0 0 0 0 0
1 0 0 1 0
(Note that the above results aren't precisely the same.)

how to generate random numbers using mid- square method?

I want to generate random numbers using the von Neumann middle-square method in R, but my code is returning the squared value.
midSquareRand <- function(seed, len) {
  randvector <- NULL
  for (i in 1:len) {
    value <- seed * seed
    Y <- as.numeric(unlist(strsplit(as.character(value), split = "")))
    P <- Y[3:6]
    seed <- as.numeric(paste(P, collapse = ""))
    randvector <- c(randvector, seed)
  }
  return(randvector)
}
R <- midSquareRand(6752, 50)
First, note that the von Neumann middle-square method is not really a good PRNG.
One issue is that the squared value can have fewer than 2n digits; according to the Wikipedia page, the procedure is then to pad with leading zeros:
To generate a sequence of n-digit pseudorandom numbers, an n-digit starting value is created and squared, producing a 2n-digit number. If the result has fewer than 2n digits, leading zeroes are added to compensate.
To achieve this, assuming n=4, we can add a line
Y = c(rep(0,8 - length(Y)), Y)
Also, it is possible that the middle portion ends up being zero and hence it will generate a deterministic sequence of zeros.
midSquareRand <- function(seed, len) {
  randvector <- NULL
  for (i in 1:len) {
    value <- seed * seed
    Y <- as.numeric(unlist(strsplit(as.character(value), split = "")))
    Y <- c(rep(0, 8 - length(Y)), Y)
    P <- Y[3:6]
    seed <- as.numeric(paste(P, collapse = ""))
    randvector <- c(randvector, seed)
  }
  return(randvector)
}
R <- midSquareRand(6752, 50)
gives me
[1] 5895 7510 4001 80 64 40 16 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[23] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[45] 0 0 0 0 0 0
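As an aside, the zero-padding can also be handled with sprintf instead of splitting and re-pasting characters. A sketch for n = 4 digits (midSquarePad is a made-up name, not code from the answer above):

```r
# Mid-square generator with sprintf zero-padding (n = 4 digits)
midSquarePad <- function(seed, len) {
  out <- integer(len)
  for (i in seq_len(len)) {
    sq <- sprintf("%08d", seed * seed)    # square, padded to 2n = 8 digits
    seed <- as.integer(substr(sq, 3, 6))  # keep the middle 4 digits
    out[i] <- seed
  }
  out
}
midSquarePad(6752, 4)  # 5895 7510 4001 80, matching the output above
```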

bin2int function in R package rnn appears to have a bug

I'm aware of the previous post asking about this, but there was not an answer, only the suggestion to write your own functions. I have the same issue--the functions int2bin and bin2int in the rnn package in R appear to return incorrect values. The problem appears to be in bin2int. I would appreciate verification this is a bug.
library(rnn)
X2 <- 1:154
X21 <- int2bin(X2, length = 15)
> head(X2)
[1] 1 2 3 4 5 6
# X21 (data after int2bin(X2, length = 15)) num [1:154, 1:15] 1 0 1 0 1 1 1...
>head(X21)
[,1][,2][,3][,4][,5][,6][,7][,8][,9][,10][,11][,12][,13][,14][,15]
[1,] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[2,] 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
[5,] 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0
[6,] 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
# so far so good
>X22 <- bin2int(X21)
# X22 (data after conversion back to integer) X22 int [1:154] 131072 262144...
> head(X22)
[1] 131072 262144 393216 524288 655360 786432
# should be 1 2 3 4 5 6
The underlying function of int2bin is i2b which is defined as:
function (integer, length = 8)
{
as.numeric(intToBits(integer))[1:length]
}
Which is then wrapped in int2bin
function (integer, length = 8)
{
t(sapply(integer, i2b, length = length))
}
Which is wrong (I think) because it returns the binary number backwards.
In your example 1 is returned as 100000000000000, when it should be returned as 000000000000001.
You can fix that by returning the intToBits() list backwards by changing [1:length] to [length:1]
function (integer, length = 8)
{
as.numeric(intToBits(integer))[length:1]
}
However, there is also a problem with bin2int, passing the correct binary input still outputs nonsense.
The b2i function is implemented as:
function(binary){
packBits(as.raw(c(rep(0, 32 - length(binary)), binary)), "integer")
}
Passing sample inputs, I don't understand what this function is doing - certainly not converting binary to integer.
Borrowing a function to convert binary to decimal from @Julius:
BinToDec <- function(x) {
  sum(2^(which(rev(unlist(strsplit(as.character(x), "")) == 1)) - 1))
}
This is a simple conversion from base 2: split each binary digit, find the indices where the digit equals 1 (counting from the right, via rev), subtract 1 from each index (because R indexes from 1, not 0), raise 2 to the power of each index, and sum. For example, 101 (binary) = 2^2 + 2^0 = 5.
And then (note this is using a corrected X21 structure that follows standard right-to-left binary notation)
X22 <- apply(X21,1,BinToDec)
Returns 1:154
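For completeness, a self-contained corrected pair can be sketched from scratch. The names int2bin2/bin2int2 are made up for illustration (this is not the package's code); both sides use a most-significant-bit-first layout:

```r
# Sketch of a corrected int2bin/bin2int pair, independent of rnn
int2bin2 <- function(x, length = 8) {
  # intToBits() yields 32 bits, least significant first; keep `length` bits
  # and reverse them so the most significant bit comes first in each row
  t(sapply(x, function(i) rev(as.integer(intToBits(i))[1:length])))
}
bin2int2 <- function(bits) {
  # each row holds one number, most significant bit first
  apply(bits, 1, function(b) sum(b * 2^(rev(seq_along(b)) - 1)))
}
bin2int2(int2bin2(1:154, 15))  # recovers 1:154
```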
So in short, yes, I agree that rnn::bin2int and rnn::int2bin appear to be wrong/broken.
Also, rather than trying to fix the rnn::int2bin function, I'd suggest R.utils::intToBin, and simply use:
require(R.utils)
X99 <- sapply(X2, intToBin)
I found an example implementation using rnn here:
https://www.r-bloggers.com/plain-vanilla-recurrent-neural-networks-in-r-waves-prediction/
This implementation works. I found the key to be the transpose before using trainr:
# Create sequences
t <- seq(0.005,2,by=0.005)
x <- sin(t*w) + rnorm(200, 0, 0.25)
y <- cos(t*w)
# Samples of 20 time series
X <- matrix(x, nrow = 40)
Y <- matrix(y, nrow = 40)
# Standardize in the interval 0 - 1
X <- (X - min(X)) / (max(X) - min(X))
Y <- (Y - min(Y)) / (max(Y) - min(Y))
# Transpose
X <- t(X)
Y <- t(Y)
Updating my code I had success in using the package. Therefore, the issues with the use of binary, if any, are not impacting the use of the package and were probably a red herring raised by me as I was searching for why my code wasn't producing expected results.

For loop storage of output data

I am trying to store the output data from the for loop in the n.I matrix at the end of the code, but I am certain that something is wrong with my output matrix. It is giving me all the same values, either 0 or 1. I know that print(SS) is outputting the correct values, and I can see that the for loop is working properly.
Does anyone have any advice on how to fix the matrix, or on any way to store the data from the for loop? Thanks in advance!
c <- 0.2
As <- 1
d <- 1
d0 <- 0.5
s <- 0.5
e <- 0.1
ERs <- e/As
C2 <- c*As*exp(-d*s/d0)
#Island States (Initial Probability)
SS <- 0
for (i in 1:5) {
  if (SS > 0) {
    if (runif(1, min = 0, max = 1) < ERs) {
      SS <- 0
    }
  } else {
    if (runif(1, min = 0, max = 1) < C2) {
      SS <- 1
    }
  }
  print(SS)
}
n.I <- matrix(c(SS), nrow = i, ncol = 1, byrow = TRUE)
The efficient solution here is not to use a loop. It's unnecessary since the whole task can be easily vectorized.
Z <- runif(100, 0, 1)
as.integer(x <= Z)   # x is the single colonization threshold (C1*A)
#[1] 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
#[70] 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
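To make the vectorized answer self-contained, here is a fuller sketch that pulls in the constants from the loop answer below and treats x = C1*A as the fixed threshold; since that loop draws Z independently on every iteration, one vectorized comparison is equivalent:

```r
# The whole loop collapses to a single vectorized comparison
c  <- 0.10                  # colonization rate
A  <- 10                    # area of all islands (km^2)
d  <- 250                   # distance from host to target
d0 <- 100                   # "half distance" for dispersal (km)
C1 <- c * A * exp(-d / d0)  # mainland-to-target colonization
x  <- C1 * A                # fixed threshold, same every iteration
Z  <- runif(100, 0, 1)
n.I <- as.integer(x <= Z)   # one 0/1 result per element
```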
Alternatively, you can save the results in a list. Not very efficient, but it gets the job done.
list_pos[[1]] retrieves the first element saved in the list.
list_pos <- list()  # create the list outside the for loop
for (i in 1:100) {
  c  <- 0.10  # colonization rate
  A  <- 10    # area of all islands (km^2)
  d  <- 250   # distance from host to target (A-T)
  s  <- 0.1   # magnitude of distance
  d0 <- 100   # specific "half distance" for dispersal (km)
  C1 <- c*A*exp(-d/d0)  # mainland-to-target colonization
  Z  <- runif(1, 0, 1)
  x  <- C1*A
  if (x <= Z) {
    list_pos[[i]] <- "1"  # store the 1 results; print() is not necessary
  } else {
    list_pos[[i]] <- "0"  # store the 0 results
  }
}

Find # of rows between events in R

I have a series of data in true/false format, e.g. as could be generated by rbinom(n, 1, .1). I want a column that represents the number of rows since the last true. So the resulting data will look like
true/false gap
0 0
0 0
1 0
0 1
0 2
1 0
1 0
0 1
What is an efficient way to go from true/false to gap? (In practice this will be done on a large dataset with many different ids.)
DF <- read.table(text="true/false gap
0 0
0 0
1 0
0 1
0 2
1 0
1 0
0 1", header=TRUE)
DF$gap2 <- sequence(rle(DF$true.false)$lengths) * #create a sequence for each run length
(1 - DF$true.false) * #multiply with 0 for all 1s
(cumsum(DF$true.false) != 0L) #multiply with zero for the leading zeros
# true.false gap gap2
#1 0 0 0
#2 0 0 0
#3 1 0 0
#4 0 1 1
#5 0 2 2
#6 1 0 0
#7 1 0 0
#8 0 1 1
The cumsum part might not be the most efficient for large vectors. Something like
if (DF$true.false[1] == 0) DF$gap2[seq_len(rle(DF$true.false)$lengths[1])] <- 0
might be an alternative (and of course the rle result could be stored temporarily to avoid calculating it twice).
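Another way to express the same logic, sketched on a small example vector, uses cumsum to label the stretch following each 1 and ave to index within it:

```r
# gap via group-wise indexing instead of rle
tf  <- c(0, 0, 1, 0, 0, 1, 1, 0)
grp <- cumsum(tf)                         # group id, bumped at every 1
gap <- ave(tf, grp, FUN = seq_along) - 1  # position within group, minus 1
gap[grp == 0] <- 0                        # zero out rows before the first 1
gap                                       # 0 0 0 1 2 0 0 1
```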
OK, let me put this in an answer.
1) No brainer method
data['gap'] = 0
for (i in 2:nrow(data)) {
  if (data[i, 'true/false'] == 0) {
    data[i, 'gap'] = data[i-1, 'gap'] + 1
  }
}
2) No if check
data['gap'] = 0
for (i in 2:nrow(data)) {
  data[i, 'gap'] = (data[i-1, 'gap'] + 1) * (1 - data[i, 'true/false'])
}
I really don't know which is faster, as both contain the same number of reads from data, but (1) has an if statement, and I don't know how fast that is compared to a single multiplication.
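As a sanity check, both updates can be run side by side on a small vector; note that, unlike the rle-based answer, both of them count up through any leading zeros before the first 1:

```r
tf <- c(0, 0, 1, 0, 0, 1, 1, 0)
gap1 <- gap2 <- numeric(length(tf))
for (i in 2:length(tf)) {
  if (tf[i] == 0) gap1[i] <- gap1[i-1] + 1    # (1) explicit if
  gap2[i] <- (gap2[i-1] + 1) * (1 - tf[i])    # (2) branch-free
}
identical(gap1, gap2)  # TRUE
```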
