I'm aware of the previous post asking about this, but it had no answer, only the suggestion to write your own functions. I have the same issue: the functions int2bin and bin2int in the rnn package in R appear to return incorrect values. The problem appears to be in bin2int. I would appreciate verification that this is a bug.
library(rnn)
X2 <- 1:154
X21 <- int2bin(X2, length = 15)
> head(X2)
[1] 1 2 3 4 5 6
# X21 (data after int2bin(X2, length = 15)) num [1:154, 1:15] 1 0 1 0 1 1 1...
> head(X21)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,]    1    0    0    0    0    0    0    0    0     0     0     0     0     0     0
[2,]    0    1    0    0    0    0    0    0    0     0     0     0     0     0     0
[3,]    1    1    0    0    0    0    0    0    0     0     0     0     0     0     0
[4,]    0    0    1    0    0    0    0    0    0     0     0     0     0     0     0
[5,]    1    0    1    0    0    0    0    0    0     0     0     0     0     0     0
[6,]    0    1    1    0    0    0    0    0    0     0     0     0     0     0     0
# so far so good
> X22 <- bin2int(X21)
# X22 (data after conversion back to integer) X22 int [1:154] 131072 262144...
> head(X22)
[1] 131072 262144 393216 524288 655360 786432
# should be 1 2 3 4 5 6
The underlying function of int2bin is i2b which is defined as:
function (integer, length = 8)
{
as.numeric(intToBits(integer))[1:length]
}
Which is then wrapped in int2bin
function (integer, length = 8)
{
t(sapply(integer, i2b, length = length))
}
Which is wrong (I think) because it returns the binary number backwards.
In your example 1 is returned as 100000000000000, when it should be returned as 000000000000001.
You can fix that by returning the intToBits() list backwards by changing [1:length] to [length:1]
function (integer, length = 8)
{
as.numeric(intToBits(integer))[length:1]
}
However, there is also a problem with bin2int: even when it is passed correct binary input, it still outputs nonsense.
The b2i function is implemented as:
function(binary){
packBits(as.raw(c(rep(0, 32 - length(binary)), binary)), "integer")
}
Passing sample inputs, I don't understand what this function is doing - certainly not converting binary to integer.
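One possible reading of that output, offered as a sketch rather than a statement about the package's intent: b2i puts the padding zeros in front of the input, and packBits treats the first element as the least significant bit, so a 15-bit input ends up shifted left by 32 - 15 = 17 places. That is consistent with the question's first result, 131072 = 2^17. The name b2i_lsb below is hypothetical; it appends the padding instead of prepending it.

```r
# Hypothetical variant of b2i: pad on the high-order side so the input bits
# keep their positions (packBits reads the first element as bit 0).
b2i_lsb <- function(binary) {
  packBits(as.raw(c(binary, rep(0, 32 - length(binary)))), "integer")
}

# 5 as 15 bits, least significant bit first (the layout int2bin produces)
bits <- as.numeric(intToBits(5L))[1:15]

# Padding in front, as in the b2i shown above, shifts the value by 17 bits
shifted <- packBits(as.raw(c(rep(0, 32 - length(bits)), bits)), "integer")

b2i_lsb(bits)  # 5
shifted        # 655360, i.e. 5 * 2^17
```

The shifted value matches the fifth element of head(X22) in the question.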
Borrowing a function to convert binary to decimal from @Julius:
BinToDec <- function(x){
sum(2^(which(rev(unlist(strsplit(as.character(x), "")) == 1))-1)) }
This is a simple conversion from base2. Splits each binary digit, returns the indices where == 1, subtract 1 from each index (because R indexes from 1, not zero), then raise 2 to the power of each index returned earlier and sum. For example 101 (binary) = 2^2 + 2^0 = 5
And then (note this is using a corrected X21 structure that follows standard right-to-left binary notation)
X22 <- apply(X21,1,BinToDec)
Returns 1:154
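As an alternative sketch, the original (uncorrected) least-significant-bit-first matrix can be decoded without reversing anything, by weighting column j with 2^(j - 1) and taking a matrix product. The example rebuilds the question's X21 with base R only, so it does not depend on the rnn package; the variable name weights is an assumption of this sketch.

```r
# Rebuild the question's matrix with base R (rows are LSB-first, 15 bits wide)
X2  <- 1:154
X21 <- t(sapply(X2, function(i) as.numeric(intToBits(i))[1:15]))

# Column j carries weight 2^(j - 1)
weights <- 2^(seq_len(ncol(X21)) - 1)

# One matrix product decodes every row at once
X22 <- drop(X21 %*% weights)

all(X22 == X2)  # TRUE
```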
So in short, yes, I agree that rnn::bin2int and rnn::int2bin appear to be wrong/broken.
Also, rather than trying to fix the rnn::int2bin function, I'd suggest R.utils::intToBin
And simply use:
require(R.utils)
X99 <- sapply(X2, intToBin)
I found an example implementation using rnn here:
https://www.r-bloggers.com/plain-vanilla-recurrent-neural-networks-in-r-waves-prediction/
This implementation works. I found the key to be the transpose before using trainr:
# Create sequences
# NOTE: w (the angular frequency) is not defined in this excerpt and must be set before this line
t <- seq(0.005,2,by=0.005)
x <- sin(t*w) + rnorm(200, 0, 0.25)
y <- cos(t*w)
# Samples of 20 time series
X <- matrix(x, nrow = 40)
Y <- matrix(y, nrow = 40)
# Standardize in the interval 0 - 1
X <- (X - min(X)) / (max(X) - min(X))
Y <- (Y - min(Y)) / (max(Y) - min(Y))
# Transpose
X <- t(X)
Y <- t(Y)
Updating my code I had success in using the package. Therefore, the issues with the use of binary, if any, are not impacting the use of the package and were probably a red herring raised by me as I was searching for why my code wasn't producing expected results.
I want to generate random numbers using von Neumann's middle-square method in R, but
my code is returning the squared value.
midSquareRand <- function(seed, len) {
randvector <- NULL
for(i in 1:len) {
value <- seed * seed
Y=as.numeric(unlist(strsplit(as.character(value),split="")))
P=Y[3:6]
seed=as.numeric(paste(P,collapse= ""))
randvector <- c(randvector,seed)
}
return(randvector)
}
R = midSquareRand(6752, 50)
First, note that von Neumann's middle-square method is not really a good PRNG.
One issue is that the squared value can have fewer than 2n digits. According to the Wikipedia page, the procedure is to pad with leading zeros:
To generate a sequence of n-digit pseudorandom numbers, an n-digit starting value is created and squared, producing a 2n-digit number. If the result has fewer than 2n digits, leading zeroes are added to compensate.
To achieve this, assuming n=4, we can add a line
Y = c(rep(0,8 - length(Y)), Y)
Also, it is possible that the middle portion ends up being zero and hence it will generate a deterministic sequence of zeros.
midSquareRand <- function(seed, len) {
randvector <- NULL
for(i in 1:len) {
value <- seed * seed
Y=as.numeric(unlist(strsplit(as.character(value),split="")))
Y = c(rep(0,8 - length(Y)), Y)
P=Y[3:6]
seed=as.numeric(paste(P,collapse= ""))
randvector <- c(randvector,seed)
}
return(randvector)
}
R=midSquareRand(6752, 50)
gives me
[1] 5895 7510 4001 80 64 40 16 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[23] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[45] 0 0 0 0 0 0
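The same middle-digit extraction can also be sketched with integer arithmetic instead of string splitting, which avoids as.character() switching to scientific notation for some values (for example, as.character(1e6) gives "1e+06"). This assumes n = 4 as above; the function name mid_square is hypothetical.

```r
# Middle-square step for 4-digit seeds using arithmetic only:
# dropping the last two digits of the (implicitly zero-padded) 8-digit square
# and keeping the next four is the same as Y[3:6] in the version above.
mid_square <- function(seed, len) {
  out <- numeric(len)
  for (i in seq_len(len)) {
    seed <- (seed^2 %/% 100) %% 10000
    out[i] <- seed
  }
  out
}

mid_square(6752, 8)
# [1] 5895 7510 4001   80   64   40   16    2
```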
When dealing with recursive equations in mathematics, it is common to write equations that hold over some range k = 1,...,d with the implicit convention that if d < 1 then the set of equations is considered to be empty. When programming in R I would like to be able to write for loops in the same way as a mathematical statement (e.g., a recursive equation) so that it interprets a range with upper bound lower than the lower bound as being empty. This would ensure that the syntax of the algorithm mimics the syntax of the mathematical statement on which it is based.
Unfortunately, R does not interpret the for loop in this way, and so this commonly leads to errors when you program your loops in a way that mimics the underlying mathematics. For example, consider a simple function where we create a vector of zeros with length n and then change the first d values to ones using a loop over the elements in the range k = 1,...,d. If we input d < 1 into this function we would like the function to recognise that the loop is intended to be empty, so that we would get a vector of all zeros. However, using a standard for loop we get the following:
#Define a function using a recursive pattern
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in 1:d) { OBJECT[k] <- 1 }
OBJECT }
#Generate some values of the function
MY_FUNC(10,4);
[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10,1);
[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10,0);
[1] 1 0 0 0 0 0 0 0 0 0
#Not what we wanted
MY_FUNC(10,-2);
[1] 1 1 1 1 1 1 1 1 1 1
#Not what we wanted
My Question: Is there any function in R that performs loops like a for loop, but interprets the loop as empty if the upper bound is lower than the lower bound? If there is no existing function, is there a way to program R to read loops this way?
Please note: I am not seeking answers that simply re-write this example function in a way that removes the loop. I am aware that this can be done in this specific case, but my goal is to get the loop working more generally. This example is shown only to give a clear view of the phenomenon I am dealing with.
There is imho no generic for-loop doing what you like, but you could easily make one by adding
if (d < 1) break
as the first statement inside the loop body.
EDIT
If you don't want to return an error when negative input is given you can use pmax with seq_len
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in seq_len(pmax(0, d))) { OBJECT[k] <- 1 }
OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
#[1] 0 0 0 0 0 0 0 0 0 0
Previous Answer
Prefer seq_len over 1:d and it takes care of this situation
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in seq_len(d)) { OBJECT[k] <- 1 }
OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in seq_len(d) : argument must be coercible to non-negative integer
The function can be vectorized
MY_FUNC <- function(n,d) {
rep(c(1, 0), c(d, n -d))
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in rep(c(1, 0), c(d, n - d)) : invalid 'times' argument
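For ranges whose lower bound is not 1, one way to get the "empty if upper < lower" convention is a small wrapper around seq.int. This is a sketch, not an existing base R function; the name seq_up is hypothetical.

```r
# Hypothetical helper: an increasing range that is empty when to < from
seq_up <- function(from, to) {
  if (to < from) integer(0) else seq.int(from, to)
}

MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  # the loop body never runs when seq_up() returns an empty vector
  for (k in seq_up(1, d)) OBJECT[k] <- 1
  OBJECT
}

MY_FUNC(10, 4)   # 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 0)   # all zeros
MY_FUNC(10, -2)  # all zeros
```

The loop keeps the shape of the mathematical statement, with the bounds written explicitly as seq_up(lower, upper).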
How to convert this
1,2,5,6,9
1,2
3,11
into this:
1,1,0,0,1,1,0,0,1,0,0
1,1,0,0,0,0,0,0,0,0,0
0,0,1,0,0,0,0,0,0,0,1
I thought I could read my data, adding NA wherever an index does not exist,
then replace each NA with zero and each non-NA with one.
But I don't know how to do this, and I searched for similar code without finding any.
You can do:
lapply(z,tabulate,nbins=max(unlist(z)))
[[1]]
[1] 1 1 0 0 1 1 0 0 1 0 0
[[2]]
[1] 1 1 0 0 0 0 0 0 0 0 0
[[3]]
[1] 0 0 1 0 0 0 0 0 0 0 1
where z is a list of vectors:
z <- list(c(1,2,5,6,9),c(1,2),c(3,11))
I'm not sure what your original numbers are stored as, but here's a solution assuming it's a list of vectors:
nums <-list(
c(1,2,5,6,9),
c(1,2),
c(3,11)
)
maxn <- max(unlist(nums))
lapply(nums, function(x) {
binary <- numeric(maxn)
binary[x] <- 1
binary
})
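If the input starts out as comma-separated text and a single 0/1 matrix is wanted rather than a list, the tabulate approach can be sketched as follows. The variable names lines and m are assumptions of this example.

```r
# Input rows as comma-separated strings (as in the question)
lines <- c("1,2,5,6,9", "1,2", "3,11")

# Parse each row into an integer vector of indices
z <- lapply(strsplit(lines, ","), as.integer)

# One indicator row per input row; sapply() returns columns, so transpose
m <- t(sapply(z, tabulate, nbins = max(unlist(z))))
m
```

Note that tabulate counts occurrences, so a repeated index in a row would give a value greater than 1.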
I have a series of data in true/false format, e.g. something that looks like it could be generated by rbinom(n, 1, .1). I want a column that represents the number of rows since the last true. So the resulting data will look like
true/false gap
0 0
0 0
1 0
0 1
0 2
1 0
1 0
0 1
What is an efficient way to go from true/false to gap? (In practice this will be done on a large dataset with many different ids.)
DF <- read.table(text="true/false gap
0 0
0 0
1 0
0 1
0 2
1 0
1 0
0 1", header=TRUE)
DF$gap2 <- sequence(rle(DF$true.false)$lengths) * #create a sequence for each run length
(1 - DF$true.false) * #multiply with 0 for all 1s
(cumsum(DF$true.false) != 0L) #multiply with zero for the leading zeros
# true.false gap gap2
#1 0 0 0
#2 0 0 0
#3 1 0 0
#4 0 1 1
#5 0 2 2
#6 1 0 0
#7 1 0 0
#8 0 1 1
The cumsum part might not be the most efficient for large vectors. Something like
if (DF$true.false[1] == 0) DF$gap2[seq_len(rle(DF$true.false)$lengths[1])] <- 0
might be an alternative (and of course the rle result could be stored temporarily to avoid calculating it twice).
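Since the question mentions many different ids, the rle-based calculation can be wrapped in a function and applied per group with ave(). This is a minimal sketch; the column names id and flag and the function name gap_since_true are assumptions, not part of the original data.

```r
# Rows since the last 1 within a single 0/1 vector
gap_since_true <- function(x) {
  run_pos <- sequence(rle(x)$lengths)   # position within each run
  run_pos * (1 - x) * (cumsum(x) != 0)  # zero on 1s and on leading 0s
}

# Hypothetical grouped data: two ids with their own flag sequences
df <- data.frame(id   = c(1, 1, 1, 1, 1, 2, 2, 2),
                 flag = c(0, 0, 1, 0, 0, 1, 1, 0))

# ave() applies the function separately within each id
df$gap <- ave(df$flag, df$id, FUN = gap_since_true)
df$gap
# [1] 0 0 0 1 2 0 0 1
```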
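Ok, let me put this in an answer.
1) No-brainer method
data['gap'] = 0
for (i in 2:nrow(data)) {
# extend the gap while the current row is 0; rows with 1 keep gap = 0
if (data[i,'true/false'] == 0) {
data[i,'gap'] = data[i-1,'gap'] + 1
}
}
2) No if check
data['gap'] = 0
for (i in 2:nrow(data)) {
# (1 - flag) resets the gap to 0 on rows where the flag is 1
data[i,'gap'] = (data[i-1,'gap'] + 1) * (1 - data[i,'true/false'])
}
I really don't know which is faster: both do the same number of reads from data, but (1) has an if statement, and I don't know how fast that is compared to a single multiplication. Note that both versions also count zeros that come before the first 1, unlike the expected output in the question.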
So I have a list of coordinates that I perform a chull on.
X <- matrix(stats::rnorm(100), ncol = 2)
hpts <- chull(X)
chull would return something like "[1] 1 3 44 16 43 9 31 41". I then want to multiply X by another vector to return only the values of X that are in the result set of chull. So for example [-2.1582511,-2.1761699,-0.5796294]*[1,0,1,...] = [-2.1582511,0,-0.5796294...] would be the result. I just don't know how to populate the second vector correctly.
Y <- matrix(0, ncol = 1,nrow=50) #create a vector with nothing
# how do I fill vector y with a 1 or 0 based on the results from chull what do I do next?
X[,1] * Y
X[,2] * Y
Thanks,
To return only the values of X that are in the result set of hpts, use
> X[hpts]
## [1] 2.1186262 0.5038656 -0.4360200 -0.8511972 -2.6542077 -0.3451074 1.0771153
## [8] 2.2306497
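I read it as "X at hpts", that is, "the values of X at the indices in hpts". Because X is a matrix, X[hpts] indexes it as a plain vector and so returns first-column values only; X[hpts, ] returns both coordinates of each hull point.
Of course, these values differ from yours because rnorm gave me different random numbers.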
To get a vector of 1s and 0s signifying results use
> Y <- ifelse(X[,1] %in% X[hpts], 1, 0)
> Y
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0
## [44] 0 1 0 0 1 0 1
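A variation on the same idea, sketched here under the assumption that a row-wise mask is what is wanted: build Y from the row indices rather than from the coordinate values, which avoids matching on floating-point numbers. The name hull_only is hypothetical, and set.seed is only there to make the example repeatable.

```r
set.seed(1)  # arbitrary seed, for a repeatable example
X <- matrix(stats::rnorm(100), ncol = 2)
hpts <- chull(X)

# 1 for rows on the convex hull, 0 otherwise
Y <- as.integer(seq_len(nrow(X)) %in% hpts)

# Y is recycled down each column, so rows not on the hull become 0
hull_only <- X * Y
```

Here X[, 1] * Y and X[, 2] * Y from the question are simply the two columns of hull_only.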