How to generate random numbers using the middle-square method? - R

I want to generate random numbers using the von Neumann middle-square method in R, but
my code is returning the squared value.
midSquareRand <- function(seed, len) {
  randvector <- NULL
  for (i in 1:len) {
    value <- seed * seed
    Y <- as.numeric(unlist(strsplit(as.character(value), split = "")))
    P <- Y[3:6]
    seed <- as.numeric(paste(P, collapse = ""))
    randvector <- c(randvector, seed)
  }
  return(randvector)
}
R <- midSquareRand(6752, 50)

First, note that the von Neumann middle-square method is not really a good PRNG.
One issue is that the square can have fewer than 2n digits. According to the Wikipedia page, the procedure is to pad with leading zeros:
To generate a sequence of n-digit pseudorandom numbers, an n-digit starting value is created and squared, producing a 2n-digit number. If the result has fewer than 2n digits, leading zeroes are added to compensate.
To achieve this, assuming n=4, we can add a line
Y = c(rep(0,8 - length(Y)), Y)
Also, it is possible that the middle portion ends up being zero and hence it will generate a deterministic sequence of zeros.
midSquareRand <- function(seed, len) {
  randvector <- NULL
  for (i in 1:len) {
    value <- seed * seed
    Y <- as.numeric(unlist(strsplit(as.character(value), split = "")))
    Y <- c(rep(0, 8 - length(Y)), Y)  # pad with leading zeros to 8 digits
    P <- Y[3:6]
    seed <- as.numeric(paste(P, collapse = ""))
    randvector <- c(randvector, seed)
  }
  return(randvector)
}
R <- midSquareRand(6752, 50)
gives me
[1] 5895 7510 4001 80 64 40 16 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[23] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[45] 0 0 0 0 0 0
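The same middle-digit extraction can also be done with integer arithmetic instead of string splitting. The following is a minimal sketch, not the answerer's code: the name mid_square and the argument n are hypothetical, and it assumes an even n and a seed with at most n digits. It also avoids as.character(), which can switch to scientific notation for some values (as.character(1e5) returns "1e+05").

```r
# Hypothetical helper: middle-square sequence via integer arithmetic.
# Assumes n is even and 0 <= seed < 10^n; exact in double arithmetic for small n such as 4.
mid_square <- function(seed, len, n = 4) {
  stopifnot(n %% 2 == 0, seed >= 0, seed < 10^n)
  out <- numeric(len)
  for (i in seq_len(len)) {
    # Drop the lowest n/2 digits of the square, then keep the next n digits.
    # Missing leading zeros need no padding because only arithmetic is used.
    seed <- ((seed^2) %/% 10^(n / 2)) %% 10^n
    out[i] <- seed
  }
  out
}

mid_square(6752, 9)
# 5895 7510 4001 80 64 40 16 2 0
```

These are the same first nine values as in the output above, including the collapse to zero.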

Related

Populate Binary Matrix with Double For Loop in R

I'm working on populating a binary matrix based on values from a different table. I can create the matrix but am struggling with the looping needed to populate it. I think this is a pretty simple issue so I hope I can get some easy help.
Here's an example of my data:
start <- c(291, 291, 291, 702, 630, 768)
sequence <- c("chr9:103869456:103870456", "chr5:30823103:30824103", "chr11:49801703:49802703", "chr4:133865601:133866601", "chr12:55738034:55739034", "chr8:96569493:96570493")
motif <- c("ARI5B", "ARI5B", "ARI5B", "ATOH1", "EGR1", "EGR1")
df <- data.frame(start, sequence, motif)
I have created a character vector for each unique motif+start values like so:
x <- sprintf("%s_%d", df$motif, df$start)
x <- unique(x)
Next I create a binary matrix with the sequences as rows and the values from x as columns:
binmat <- matrix(0, nrow = length(df$sequence), ncol = length(x))
rownames(binmat) <- df$sequence
colnames(binmat) <- x
And now I'm stuck. I want to iterate through columns and rows and put a 1 in each position that has a match. For example, the first sequence is "chr9:103869456:103870456" and it has motif "ARI5B" at starting position 291, so it should get a 1 while the rest of the values in that row remain at 0. The output of this example should look like this:
                         ARI5B_291 ATOH1_702 EGR1_630 EGR1_768
chr9:103869456:103870456         1         0        0        0
chr5:30823103:30824103           1         0        0        0
chr11:49801703:49802703          1         0        0        0
chr4:133865601:133866601         0         1        0        0
chr12:55738034:55739034          0         0        1        0
chr8:96569493:96570493           0         0        0        1
But so far I am unsuccessful. I think I need a double for loop somewhere along these lines:
for (row in binmat){
for (col in binmat){
if (row && col %in% x){
1
} else { 0
}
}
}
But all I get are 0s.
Thanks in advance!
Aren't you just looking for table here? You can get the result as a vectorized one-liner, without loops, by doing:
table(factor(df$sequence, df$sequence), sprintf("%s_%d", df$motif, df$start))
                         ARI5B_291 ATOH1_702 EGR1_630 EGR1_768
chr9:103869456:103870456         1         0        0        0
chr5:30823103:30824103           1         0        0        0
chr11:49801703:49802703          1         0        0        0
chr4:133865601:133866601         0         1        0        0
chr12:55738034:55739034          0         0        1        0
chr8:96569493:96570493           0         0        0        1
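If you do want to fill the pre-built matrix rather than call table, note that the loop in the question fails for two reasons: for (row in binmat) iterates over the matrix's elements (all zeros), not its rows, and a bare 1 inside the if never assigns anything. A sketch of a loop-free alternative uses a two-column index matrix of (row, column) pairs. The variable name key is introduced here for illustration only.

```r
df <- data.frame(
  start = c(291, 291, 291, 702, 630, 768),
  sequence = c("chr9:103869456:103870456", "chr5:30823103:30824103",
               "chr11:49801703:49802703", "chr4:133865601:133866601",
               "chr12:55738034:55739034", "chr8:96569493:96570493"),
  motif = c("ARI5B", "ARI5B", "ARI5B", "ATOH1", "EGR1", "EGR1"),
  stringsAsFactors = FALSE
)

key <- sprintf("%s_%d", df$motif, df$start)  # one label per data row
x <- unique(key)                             # column names

binmat <- matrix(0, nrow = nrow(df), ncol = length(x),
                 dimnames = list(df$sequence, x))

# Each row of the index matrix is one (row, column) cell to set to 1.
binmat[cbind(seq_len(nrow(df)), match(key, x))] <- 1
binmat
```

This assumes one matrix row per data row, as in the question; with repeated sequences you would match on unique row names instead.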

For loops that check for empty range

When dealing with recursive equations in mathematics, it is common to write equations that hold over some range k = 1,...,d with the implicit convention that if d < 1 then the set of equations is considered to be empty. When programming in R I would like to be able to write for loops in the same way as a mathematical statement (e.g., a recursive equation) so that it interprets a range with upper bound lower than the lower bound as being empty. This would ensure that the syntax of the algorithm mimics the syntax of the mathematical statement on which it is based.
Unfortunately, R does not interpret the for loop in this way, and so this commonly leads to errors when you program your loops in a way that mimics the underlying mathematics. For example, consider a simple function where we create a vector of zeros with length n and then change the first d values to ones using a loop over the elements in the range k = 1,...,d. If we input d < 1 into this function we would like the function to recognise that the loop is intended to be empty, so that we would get a vector of all zeros. However, using a standard for loop we get the following:
#Define a function using a recursive pattern
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in 1:d) { OBJECT[k] <- 1 }
OBJECT }
#Generate some values of the function
MY_FUNC(10,4);
[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10,1);
[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10,0);
[1] 1 0 0 0 0 0 0 0 0 0
#Not what we wanted
MY_FUNC(10,-2);
[1] 1 1 1 1 1 1 1 1 1 1
#Not what we wanted
My Question: Is there any function in R that performs loops like a for loop, but interprets the loop as empty if the upper bound is lower than the lower bound? If there is no existing function, is there a way to program R to read loops this way?
Please note: I am not seeking answers that simply re-write this example function in a way that removes the loop. I am aware that this can be done in this specific case, but my goal is to get the loop working more generally. This example is shown only to give a clear view of the phenomenon I am dealing with.
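As far as I know there is no generic for loop that does what you want, but you can easily get the same effect by adding
if (d < 1) break
as the first statement inside the loop body. Because 1:d counts downwards when d < 1, the loop still starts, and the break stops it on the first iteration.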
EDIT
If you don't want to return an error when negative input is given you can use pmax with seq_len
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in seq_len(pmax(0, d))) { OBJECT[k] <- 1 }
OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
#[1] 0 0 0 0 0 0 0 0 0 0
Previous Answer
Prefer seq_len over 1:d and it takes care of this situation
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in seq_len(d)) { OBJECT[k] <- 1 }
OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in seq_len(d) : argument must be coercible to non-negative integer
The function can be vectorized
MY_FUNC <- function(n,d) {
rep(c(1, 0), c(d, n -d))
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in rep(c(1, 0), c(d, n - d)) : invalid 'times' argument
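For ranges whose lower bound is not 1, the same idea can be wrapped in a small helper. This is only a sketch: seq_up is a hypothetical name, not a base R function, and it assumes both bounds are single numbers.

```r
# Hypothetical helper: increasing range from:to, empty when to < from.
seq_up <- function(from, to) {
  if (to < from) integer(0) else seq.int(from, to)
}

MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  # The loop body is skipped entirely when d < 1.
  for (k in seq_up(1, d)) { OBJECT[k] <- 1 }
  OBJECT
}

MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
#[1] 0 0 0 0 0 0 0 0 0 0
```

Writing for (k in seq_up(a, b)) then reads like the mathematical convention k = a,...,b with an empty set when b < a.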

bin2int function in R package rnn appears to have a bug

I'm aware of the previous post asking about this, but there was not an answer, only the suggestion to write your own functions. I have the same issue--the functions int2bin and bin2int in the rnn package in R appear to return incorrect values. The problem appears to be in bin2int. I would appreciate verification this is a bug.
library(rnn)
X2 <- 1:154
X21 <- int2bin(X2, length = 15)
> head(X2)
[1] 1 2 3 4 5 6
# X21 (data after int2bin(X2, length = 15)) num [1:154, 1:15] 1 0 1 0 1 1 1...
> head(X21)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,]    1    0    0    0    0    0    0    0    0     0     0     0     0     0     0
[2,]    0    1    0    0    0    0    0    0    0     0     0     0     0     0     0
[3,]    1    1    0    0    0    0    0    0    0     0     0     0     0     0     0
[4,]    0    0    1    0    0    0    0    0    0     0     0     0     0     0     0
[5,]    1    0    1    0    0    0    0    0    0     0     0     0     0     0     0
[6,]    0    1    1    0    0    0    0    0    0     0     0     0     0     0     0
# so far so good
> X22 <- bin2int(X21)
# X22 (data after conversion back to integer) X22 int [1:154] 131072 262144...
> head(X22)
[1] 131072 262144 393216 524288 655360 786432
# should be 1 2 3 4 5 6
The underlying function of int2bin is i2b which is defined as:
function (integer, length = 8)
{
as.numeric(intToBits(integer))[1:length]
}
Which is then wrapped in int2bin
function (integer, length = 8)
{
t(sapply(integer, i2b, length = length))
}
Which is wrong (I think) because it returns the binary number backwards.
In your example 1 is returned as 100000000000000, when it should be returned as 000000000000001.
You can fix that by returning the intToBits() list backwards by changing [1:length] to [length:1]
function (integer, length = 8)
{
as.numeric(intToBits(integer))[length:1]
}
However, there is also a problem with bin2int: even with correctly ordered binary input it still outputs nonsense.
The b2i function is implemented as:
function(binary){
packBits(as.raw(c(rep(0, 32 - length(binary)), binary)), "integer")
}
Passing sample inputs, I don't understand what this function is doing - certainly not converting binary to integer.
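One reading of the b2i source shown above, offered as a sketch rather than a confirmed diagnosis: packBits treats the first element as the least significant bit, so padding the zeros in front of binary shifts every bit up by 32 - length(binary) places. With 15 bits that is a factor of 2^17 = 131072, which matches the output in the question. Padding at the end instead round-trips with the least-significant-bit-first output of i2b. The name b2i_lsb is hypothetical and not part of rnn.

```r
# i2b and b2i copied from the definitions shown above.
i2b <- function(integer, length = 8) {
  as.numeric(intToBits(integer))[1:length]
}
b2i <- function(binary) {
  packBits(as.raw(c(rep(0, 32 - length(binary)), binary)), "integer")
}

# Hypothetical variant: pad the high-order end instead of the low-order end.
b2i_lsb <- function(binary) {
  packBits(as.raw(c(binary, rep(0, 32 - length(binary)))), "integer")
}

bits <- i2b(1L, length = 15)
b2i(bits)      # 131072, i.e. 2^17
b2i_lsb(bits)  # 1

# Round trip over the range used in the question.
X21 <- t(sapply(1:154, i2b, length = 15))
X22 <- apply(X21, 1, b2i_lsb)
all(X22 == 1:154)  # TRUE
```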
Borrowing a function to convert binary to decimal from @Julius:
BinToDec <- function(x){
sum(2^(which(rev(unlist(strsplit(as.character(x), "")) == 1))-1)) }
This is a simple conversion from base 2: it splits the input into binary digits, reverses them, finds the indices where the digit equals 1, subtracts 1 from each index (because R indexes from 1, not 0), then raises 2 to each of those powers and sums. For example, 101 (binary) = 2^2 + 2^0 = 5.
And then (note this is using a corrected X21 structure that follows standard right-to-left binary notation)
X22 <- apply(X21,1,BinToDec)
Returns 1:154
So in short, yes, I agree that rnn::bin2int and rnn::int2bin appear to be wrong/broken.
Also, rather than trying to fix the rnn::int2bin function, I'd suggest R.utils::intToBin
And simply use:
require(R.utils)
X99 <- sapply(X2, intToBin)
I found an example implementation using rnn here:
https://www.r-bloggers.com/plain-vanilla-recurrent-neural-networks-in-r-waves-prediction/
This implementation works. I found the key to be the transpose before using trainr:
# Create sequences
t <- seq(0.005,2,by=0.005)
x <- sin(t*w) + rnorm(200, 0, 0.25)
y <- cos(t*w)
# Samples of 20 time series
X <- matrix(x, nrow = 40)
Y <- matrix(y, nrow = 40)
# Standardize in the interval 0 - 1
X <- (X - min(X)) / (max(X) - min(X))
Y <- (Y - min(Y)) / (max(Y) - min(Y))
# Transpose
X <- t(X)
Y <- t(Y)
Updating my code I had success in using the package. Therefore, the issues with the use of binary, if any, are not impacting the use of the package and were probably a red herring raised by me as I was searching for why my code wasn't producing expected results.

Working with matrices in r

I'm working on code to construct an option pricing matrix. What I have at the moment is the values along the diagonal part of the matrix. Currently I'm working in a matrix with 4 rows and 4 columns. What I'm attempting to do is to use the values in the diagonal part of the matrix to give values in the lower triangle of the matrix. So for my matrix Omat, Omat[1,1]+Omat[2,2] will give a value for [2,1], Omat[2,2]+Omat[3,3] will give a value for [3,2]. Then using these created values, Omat[2,1]+Omat[3,2] will give a value for [3,1].
My attempt:
Omat = diag(2, 4, 4)
Omat[j+i,j] <- Omat[i-1,j]+Omat[i,j+1]
Any ideas on how one could go about this?
What I currently have, a 4 row by 4 col matrix:
Omat
# 2 0 0 0
# 0 2 0 0
# 0 0 2 0
# 0 0 0 2
What I've been attempting to create, a 4 row by 4 col matrix:
0 0 0 0
4 0 0 0
8 4 0 0
16 8 4 0
You could try calculating successive diagonals underneath the main diagonal. Code could look like:
Omat = diag(2,4)
for(i in 1:(nrow(Omat)-1)) {
for( j in (i+1):nrow(Omat)) {
Omat[j,j-i] <- Omat[j,j-i+1] + Omat[j-1,j-i]
}
}
diag(Omat) <- 0
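The diagonal-by-diagonal fill can be wrapped in a function so it works for any square matrix size. This is a sketch only; fill_lower is a hypothetical name, and it assumes the starting values sit on the main diagonal as in the question.

```r
# Hypothetical helper: fill each sub-diagonal from the one above it.
fill_lower <- function(m) {
  n <- nrow(m)
  for (i in seq_len(n - 1)) {       # i = distance below the main diagonal
    for (j in (i + 1):n) {          # j = row index on that sub-diagonal
      # Each cell is the sum of its right-hand and upper neighbours,
      # both of which lie on the previously filled diagonal.
      m[j, j - i] <- m[j, j - i + 1] + m[j - 1, j - i]
    }
  }
  diag(m) <- 0                      # clear the seed values
  m
}

fill_lower(diag(2, 4))
#  0 0 0 0
#  4 0 0 0
#  8 4 0 0
# 16 8 4 0
```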
I am probably missing something, but why not do this:
for (i in 2:nrow(Omat)) {
  for (j in (i-1):1) {  # go right to left so Omat[i,j+1] is already filled
    Omat[i,j] <- Omat[i-1,j] + Omat[i,j+1]
  }
}
diag(Omat) <- 0

Basic R, how to populate a vector with results from a function

So I have a list of coordinates that I perform a chull on.
X <- matrix(stats::rnorm(100), ncol = 2)
hpts <- chull(X)
chull would return something like "[1] 1 3 44 16 43 9 31 41". I then want to multiply X by another vector so that only the values of X that are in the result set of chull are kept. So for example [-2.1582511,-2.1761699,-0.5796294]*[1,0,1,...] = [-2.1582511,0,-0.5796294...] would be the result. I just don't know how to populate the second vector correctly.
Y <- matrix(0, ncol = 1,nrow=50) #create a vector with nothing
# how do I fill vector y with a 1 or 0 based on the results from chull what do I do next?
X[,1] * Y
X[,2] * Y
Thanks,
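To return only the values of X that are in the result set of hpts, index with hpts. Note that X is a two-column matrix, so X[hpts] uses linear indexing and returns only the first-column values; X[hpts, ] returns both coordinates.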
> X[hpts]
## [1] 2.1186262 0.5038656 -0.4360200 -0.8511972 -2.6542077 -0.3451074 1.0771153
## [8] 2.2306497
I read it as "the elements of X at the positions given by hpts".
Of course, these values of X are different from yours, due to my values of rnorm
To get a vector of 1s and 0s signifying results use
> Y <- ifelse(X[,1] %in% X[hpts], 1, 0)
> Y
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0
## [44] 0 1 0 0 1 0 1
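An alternative sketch builds the indicator from the row indices themselves, which avoids comparing floating-point values with %in%. The set.seed call and the names masked and hull_points are assumptions added here for illustration.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
X <- matrix(stats::rnorm(100), ncol = 2)
hpts <- grDevices::chull(X)

# 1 for rows on the convex hull, 0 otherwise.
Y <- as.numeric(seq_len(nrow(X)) %in% hpts)

# Y is recycled down each column, so rows off the hull become 0 in both columns.
masked <- X * Y

# Both coordinates of the hull points, in hull order.
hull_points <- X[hpts, , drop = FALSE]
```

Here X[, 1] * Y and X[, 2] * Y from the question are simply the two columns of masked.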
