For loop storage of output data - r

I am trying to store the output data from the for loop in the n.I matrix at the end of the code, but I am certain that something is wrong with my output matrix. It contains the same value in every row, either all 0 or all 1. I know that print(SS) is outputting the correct values, so the for loop itself is working properly.
Does anyone have any advice on how to fix the matrix, or another way to store the data from the for loop? Thanks in advance!
c=0.2
As=1
d=1
d0=0.5
s=0.5
e=0.1
ERs=e/As
C2 = c*As*exp(-d*s/d0)
#Island States (Initial Probability)
SS=0
for(i in 1:5) {
  if (SS > 0) {
    if (runif(1, min = 0, max = 1) < ERs){
      SS = 0
    }
  } else {
    if (runif(1, min = 0, max = 1) < C2) {
      SS = 1
    }
  }
  print(SS)
}
n.I=matrix(c(SS), nrow=i, ncol=1, byrow=TRUE)

The efficient solution here is not to use a loop. It's unnecessary since the whole task can be easily vectorized.
Z = runif(100, 0, 1)
as.integer(x <= Z) # x must already hold the threshold value that Z is compared against
#[1] 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
#[70] 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0

You can save them in a list. It is not very efficient, but it gets the job done.
list_pos[[1]] retrieves the first element saved in the list.
list_pos <- list() # create the list outside the for loop
for(i in 1:100) {
  c=0.10 #colonization rate
  A=10 #Area of all islands(km^2)
  d=250 #Distance from host to target (A-T)
  s=0.1 #magnitude of distance
  d0=100 #Specific "half distance" for dispersal(km)
  C1 = c*A*exp(-d/d0) #Mainland to Target colonization
  Z =runif(1,0,1)
  x <- C1*A
  if(x <= Z) {
    list_pos[[i]] <- "1" # store the 1 results; print() is not necessary
  } else {
    list_pos[[i]] <- "0" # store the 0 results
  }
}
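
For the state-dependent loop in the question, one possible approach is to preallocate a vector and write SS into it on every iteration, then build the matrix from that vector. This is only a sketch: the names n_steps and states are hypothetical, and set.seed is there purely for reproducibility.

```r
set.seed(1)            # assumption: fixed seed, only so the run is repeatable

# Parameters copied from the question
c   <- 0.2
As  <- 1
d   <- 1
d0  <- 0.5
s   <- 0.5
e   <- 0.1
ERs <- e / As
C2  <- c * As * exp(-d * s / d0)

n_steps <- 5                  # hypothetical name for the number of iterations
states  <- numeric(n_steps)   # preallocated storage, one slot per iteration
SS <- 0
for (i in seq_len(n_steps)) {
  if (SS > 0) {
    if (runif(1) < ERs) SS <- 0   # occupied island goes extinct
  } else {
    if (runif(1) < C2) SS <- 1    # empty island gets colonized
  }
  states[i] <- SS                 # store this iteration's state
}

# One row per iteration instead of the final SS repeated
n.I <- matrix(states, ncol = 1)
```

The original matrix() call ran after the loop had finished, so it only ever saw the last value of SS and recycled it down all rows.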

R, remove zero entries per column/sample and calculate the unique occurrences. Plot them as stacked bar plot

I have a big table that is available here as an R environment. It looks something like this:
gene     s1  s2   s3  s4  s5   s6   s7  s8  s9   s10  s11  s12  type
TRAM2    0   0    0   0   0    0    0   0   0    0    0    0    proeteinCoding
CLIC5    0   0    1   0   1    0.2  0   0   1.3  1    0    0.7  proeteinCoding
GAL3ST2  0   0.5  0   0   0    0    0   0   0    0    0    0    trna
UHRF1BP  0   0    0   0   0    0    0   0   0    0    0    0    trna
OSTM1    0   0    0   0   0    0    0   0   0    0    0    0    trna
IMPG2    0   0    0   1   0    0    0   0   0    0    0    0    miRNA
OXCT1    0   1    0   0   0    0    0   0   0    0    0    0.3  miRNA
CPNE3    0   0    0   0   0    0    0   0   0    0    0    0    miRNA
PPP1R15  0   0    0   0   0    0    0   0   0    0    0    0    miRNA
ADAM11   0   0    0   0   0    0    0   0   0    0    0    0    snoRNA
PTHLH    0   0    0   0   0.1  0.5  0   0   0.1  0.2  0    0.5  snoRNA
Using the following code I can get the unique elements of TYPE (last column) across all cells:
table <- read.delim("smalRNAseq/counts_molc_NoLengthCut_mirnaCollapsed2.txt", header = T, sep = "\t")
genecodev22 <- read.table("genecodev22.csv")
# assign new names to the columns so I can merge them
colnames(genecodev22)[colnames(genecodev22) %in% c("V1", "V2")] <- c("ENSEMBLE", "TYPE")
colnames(table)[colnames(table) %in% c("X", "X.1")] <- c("ENSEMBLE1", "ENSEMBLE")
# merge them by ENSEMBLE so that the type will be a new entry at the back
mergetable <- merge(table,genecodev22, match = "first", by="ENSEMBLE")
# assign the first column as row names, the one with the true ensemble names
row.names(mergetable) <- mergetable[[1]]
# and remove it
mergetable2 <- mergetable[,-2:-1]
# keep only the rows with at least one non-zero value
mergetable3 <- mergetable2[rowSums(mergetable2[1:nrow(mergetable2),1:95])>0,]
# total number of unique elements and their occurrences
list_of_elements <- aggregate(data.frame(count = mergetable3$TYPE),
                              list(value = mergetable3$TYPE),
                              length)
# plot it
row.names(list_of_elements) <- list_of_elements[[1]]
list_of_elements <- list_of_elements[-1]
list_of_elementsOrd<- list_of_elements[order(list_of_elements$count, decreasing = T),]
library(ggplot2)
ggplot(as.data.frame(list_of_elementsOrd),
       aes(x=reorder(value, -count), y=count, fill=value)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  geom_text(aes(label=count), vjust=-1, color="black", size=3.5)+
  theme(axis.text.x = element_text(angle = 90), legend.position = "none")
The output looks like:
WHAT I WANT
For each s# I would like to draw a stacked bar plot of the unique types by occurrence (0s should not be counted).
Many thanks
EDIT: I managed to create a list of all 'aggregate' elements with the following loop:
i=1
list_of_elementsOrd <- c()
mergedElements <- list_of_elements[1:2]
for (i in 1:length(mergetable2[-1])) {
function(row) all(row !=0 )), ]
mergetable2[mergetable2 == 0] <- NA
list_of_elements <- aggregate(data.frame(count = mergetable2$TYPE),
list(value = mergetable2$TYPE),
length)
list_of_elementsOrd[[i]]<- list_of_elements[order(list_of_elements$count, decreasing = T),]
}
But of course I cannot get the plot done. When I transform it to a data frame I get column names such as:
value.70 count.70 value.71 count.71
You can replace all the 0s with NA, so that the stacked bar plot does not include the genes whose value is NA.
To do so:
table[table == 0] <- NA
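
As a complement, here is a rough base-R sketch of counting, per sample, how many genes of each type are non-zero. The toy data frame df copies the first five sample columns of a few rows from the question's table; the names df, sample_cols and counts are hypothetical.

```r
# Toy subset of the question's table (first five samples, five genes)
df <- data.frame(
  gene = c("CLIC5", "GAL3ST2", "IMPG2", "OXCT1", "PTHLH"),
  s1   = c(0, 0,   0, 0, 0),
  s2   = c(0, 0.5, 0, 1, 0),
  s3   = c(1, 0,   0, 0, 0),
  s4   = c(0, 0,   1, 0, 0),
  s5   = c(1, 0,   0, 0, 0.1),
  type = c("proeteinCoding", "trna", "miRNA", "miRNA", "snoRNA")
)

# Sample columns are those named s1, s2, ...
sample_cols <- grep("^s[0-9]+$", names(df), value = TRUE)

# (df[sample_cols] > 0) marks non-zero entries; rowsum() adds these 0/1
# indicators within each type, giving a type x sample matrix of counts
counts <- rowsum((df[sample_cols] > 0) * 1, group = df$type)
counts
```

Passing this matrix to barplot(counts, legend.text = TRUE) draws one stacked bar per sample, with one segment per type. Zeros never contribute because only the non-zero indicator is summed.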

For loops that check for empty range

When dealing with recursive equations in mathematics, it is common to write equations that hold over some range k = 1,...,d with the implicit convention that if d < 1 then the set of equations is considered to be empty. When programming in R I would like to be able to write for loops in the same way as a mathematical statement (e.g., a recursive equation) so that it interprets a range with upper bound lower than the lower bound as being empty. This would ensure that the syntax of the algorithm mimics the syntax of the mathematical statement on which it is based.
Unfortunately, R does not interpret the for loop in this way, and so this commonly leads to errors when you program your loops in a way that mimics the underlying mathematics. For example, consider a simple function where we create a vector of zeros with length n and then change the first d values to ones using a loop over the elements in the range k = 1,...,d. If we input d < 1 into this function we would like the function to recognise that the loop is intended to be empty, so that we would get a vector of all zeros. However, using a standard for loop we get the following:
#Define a function using a recursive pattern
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in 1:d) { OBJECT[k] <- 1 }
OBJECT }
#Generate some values of the function
MY_FUNC(10,4);
[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10,1);
[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10,0);
[1] 1 0 0 0 0 0 0 0 0 0
#Not what we wanted
MY_FUNC(10,-2);
[1] 1 1 1 1 1 1 1 1 1 1
#Not what we wanted
My Question: Is there any function in R that performs loops like a for loop, but interprets the loop as empty if the upper bound is lower than the lower bound? If there is no existing function, is there a way to program R to read loops this way?
Please note: I am not seeking answers that simply re-write this example function in a way that removes the loop. I am aware that this can be done in this specific case, but my goal is to get the loop working more generally. This example is shown only to give a clear view of the phenomenon I am dealing with.
IMHO there is no generic for loop that does what you want, but you can easily get that behaviour by adding
if(d < 1) break
as the first statement inside the loop.
EDIT
If you don't want to return an error when negative input is given you can use pmax with seq_len
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in seq_len(pmax(0, d))) { OBJECT[k] <- 1 }
OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
#[1] 0 0 0 0 0 0 0 0 0 0
Previous Answer
Prefer seq_len(d) over 1:d: it returns an empty sequence when d is 0, so the loop body is skipped.
MY_FUNC <- function(n,d) {
OBJECT <- rep(0, n);
for (k in seq_len(d)) { OBJECT[k] <- 1 }
OBJECT
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in seq_len(d) : argument must be coercible to non-negative integer
The function can be vectorized
MY_FUNC <- function(n,d) {
rep(c(1, 0), c(d, n -d))
}
MY_FUNC(10, 4)
#[1] 1 1 1 1 0 0 0 0 0 0
MY_FUNC(10, 1)
#[1] 1 0 0 0 0 0 0 0 0 0
MY_FUNC(10, 0)
#[1] 0 0 0 0 0 0 0 0 0 0
MY_FUNC(10, -2)
Error in rep(c(1, 0), c(d, n - d)) : invalid 'times' argument
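
For ranges whose lower bound is not 1, the same idea can be wrapped in a small helper. The sketch below is one possible way to do it; seq_range is a hypothetical name, not an existing R function.

```r
# Hypothetical helper: the integers from..to, or an empty vector if to < from
seq_range <- function(from, to) {
  if (to < from) integer(0) else from:to
}

# The question's function, looping over the "mathematical" range 1,...,d
MY_FUNC <- function(n, d) {
  OBJECT <- rep(0, n)
  for (k in seq_range(1, d)) {
    OBJECT[k] <- 1
  }
  OBJECT
}

r_pos  <- MY_FUNC(10, 4)    # first four entries set to 1
r_zero <- MY_FUNC(10, 0)    # empty range: loop body never runs
r_neg  <- MY_FUNC(10, -2)   # empty range: loop body never runs
```

Because for iterates over whatever vector it is given, handing it a zero-length vector is enough to make the loop a no-op, which matches the empty-range convention from the mathematics.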

How to define the input array for RNN training in the rnn package?

The rnn package includes an example of how to train the network, described in this link (example 1). In this package the inputs are given as a 3D array, where dim 1 is samples, dim 2 is time and dim 3 is variables; however, it does not make explicit how the data are divided into inputs and targets (a common approach in neural network packages). Moreover, according to the package description, both the inputs and the targets must have the same dimensions. So how can I define my dataset for the recurrent neural network in the rnn package? Data for a reproducible example.
My train data (inputs) on DF, first five rows:
> data[1:5,2:14]
Ibiara.P_t Ibiara.P_t_1 Ibiara.P_t_2 Nova.Olinda.P_t_1 Princesa.Isabel.P_t_1 Boa.ventura.P_t_1 Boa.Ventura.P_t_2
1966-01-01 0 0 0 0 0 0 0
1966-01-02 0 0 0 0 0 0 0
1966-01-03 0 0 0 0 0 0 0
1966-01-04 0 0 0 0 0 0 0
1966-01-05 0 0 0 0 0 0 0
Piancó.P_t Piancó.P_t_1 Piancó.P_t_2 Q_t_1 Q_t_2 Q_t_3
1966-01-01 0 0 0 0 0 0
1966-01-02 0 0 0 0 0 0
1966-01-03 0 0 0 0 0 0
1966-01-04 0 0 0 0 0 0
1966-01-05 0 0 0 0 0 0
My target data, first five rows:
> data[1:5,1]
Q_t
1966-01-01 0
1966-01-02 0
1966-01-03 0
1966-01-04 0
1966-01-05 0
That was my code to define data:
# Scaling data for the NN
maxs <- apply(data, 2, max)
mins <- apply(data, 2, min)
scaled <- as.data.frame(scale(data, center = mins, scale = maxs - mins))
# Train-test split
train_ <- scaled[1:train_days,]
test_ <- scaled[(train_days+1):nrow(data),]
X <- t(as.matrix(train_[,1]))
Y <- t(as.matrix(test_[,2:14]))
# Train model
model <- trainr(Y = Y,
X = X,
learningrate = 0.01,
hidden_dim = 10,
numepochs = 10)
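
One possible layout, offered only as a sketch: it assumes trainr expects X and Y as arrays of dim (samples, time, variables) whose first two dimensions match, and it treats the whole training period as a single sample. The scaled data here is simulated, n_days and its value are hypothetical, and the trainr call is left commented out because it needs the rnn package installed.

```r
set.seed(42)
n_days     <- 20   # hypothetical series length
train_days <- 15   # hypothetical split point

# Simulated stand-in for the scaled data: column 1 is the target (Q_t),
# columns 2:14 are the 13 input variables
scaled <- as.data.frame(matrix(runif(n_days * 14), ncol = 14))
train_ <- scaled[1:train_days, ]

# Inputs: 1 sample x train_days time steps x 13 variables
X <- array(as.matrix(train_[, 2:14]), dim = c(1, train_days, 13))

# Targets: 1 sample x train_days time steps x 1 variable
Y <- array(train_[, 1], dim = c(1, train_days, 1))

# Assumed call, not run here:
# model <- rnn::trainr(Y = Y, X = X, learningrate = 0.01,
#                      hidden_dim = 10, numepochs = 10)
```

With this layout X[1, t, v] holds input variable v on day t and Y[1, t, 1] holds the target on the same day, so inputs come from columns 2:14 and the target from column 1 of the same training rows.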

How can I select columns and rows with variable in R?

I have an object currency. I would like to select one column, and only the rows where it equals 1, using the variable Pair.
>currency
EURUSD EURUSDi USDJPY USDJPYi GBPUSD GBPUSDi AUDUSD AUDUSDi XAUUSD XAUUSDi zeroes
2000-07-16 0 0 0 0 0 1 0 0 0 0 0
2000-07-23 0 0 0 0 0 1 0 0 0 0 0
2000-07-30 0 0 0 0 0 1 0 0 0 0 0
2000-08-06 0 0 0 0 0 0 0 0 0 1 0
2000-08-13 0 1 0 0 0 0 0 0 0 0 0
From the console I can do it with subset like this :
> subset(currency$GBPUSDi, GBPUSDi == 1)
GBPUSDi
2000-07-16 1
2000-07-23 1
2000-07-30 1
2000-08-06 1
2000-08-13 1
2000-08-20 1
But as soon as it is used in a script with the variable Pair, it fails. I've searched the documentation for hours and I'm getting a headache trying to figure out what is wrong.
Here are the different commands I've tried:
subset (currency$Pair, Pair == 1)
subset (currency, Pair = 1, select = Pair)
weights$Cur[currency$Pair = 1]
The one that works is currency[,c(Pair)], but it only selects the column. How can I also select the rows where Pair equals 1?
Neither currency[,c(Pair)][Pair = 1] nor subset (currency[,c(Pair)], Pair = 1) works, with either = or ==.
currency$Pair[currency$Pair == 1] should work ($Pair selects the column Pair and [currency$Pair == 1] selects the values equal to 1). It looks like it doesn't work in your case because currency doesn't contain a column named Pair.
If currency is not a data frame but a matrix, you can try
currency[currency[, c("Pair")] == 1, c("Pair")]
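
If Pair is a variable that holds the column name as a string, indexing with [[ ]] and [ , ] avoids the $ lookup problem. A small sketch, assuming currency is a data frame; the toy values and the names rows and result are made up for illustration.

```r
# Toy stand-in for the currency object (assumed to be a data frame here)
currency <- data.frame(
  GBPUSDi = c(1, 1, 1, 0, 0),
  XAUUSDi = c(0, 0, 0, 1, 0),
  row.names = c("2000-07-16", "2000-07-23", "2000-07-30",
                "2000-08-06", "2000-08-13")
)

Pair <- "GBPUSDi"   # the variable holds the column NAME, not the column

# currency$Pair would look for a column literally called "Pair";
# [[Pair]] evaluates the variable and uses its value as the name
rows   <- currency[[Pair]] == 1
result <- currency[rows, Pair, drop = FALSE]   # drop = FALSE keeps the dates
result
```

For a matrix the equivalent would be currency[currency[, Pair] == 1, Pair], again with Pair unquoted so that its value is used as the column name.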

Basic R, how to populate a vector with results from a function

So I have a list of coordinates that I perform a chull on.
X <- matrix(stats::rnorm(100), ncol = 2)
hpts <- chull(X)
chull would return something like "[1] 1 3 44 16 43 9 31 41". I then want to multiply X by another vector to return only the values of X that are in the result set of chull. So for example [-2.1582511,-2.1761699,-0.5796294]*[1,0,1,...] = [-2.1582511,0,-0.5796294...] would be the result. I just don't know how to populate the second vector correctly.
Y <- matrix(0, ncol = 1,nrow=50) #create a vector with nothing
# how do I fill vector y with a 1 or 0 based on the results from chull what do I do next?
X[,1] * Y
X[,2] * Y
Thanks,
To return only the values of X that are in the result set of hpts, use
> X[hpts]
## [1] 2.1186262 0.5038656 -0.4360200 -0.8511972 -2.6542077 -0.3451074 1.0771153
## [8] 2.2306497
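I read it as "X indexed by hpts", i.e. the values of X at the positions listed in hpts. Because X is a matrix, X[hpts] uses linear indexing and so returns first-column values; X[hpts, ] returns both coordinates of each hull point.
Of course, these values of X differ from yours, because rnorm generated different random values for me.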
To get a vector of 1s and 0s signifying results use
> Y <- ifelse(X[,1] %in% X[hpts], 1, 0)
> Y
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0
## [44] 0 1 0 0 1 0 1
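
An alternative sketch that builds the 0/1 vector from row positions rather than from matching values, and then applies it to both columns at once. The seed and the name X_masked are assumptions added for illustration.

```r
set.seed(1)                                  # assumption: seed for repeatability
X    <- matrix(stats::rnorm(100), ncol = 2)  # 50 points, 2 coordinates
hpts <- chull(X)                             # row indices of the hull points

# 1 where the row index is on the hull, 0 elsewhere
Y <- as.integer(seq_len(nrow(X)) %in% hpts)

# Y has one entry per row and is recycled down each column,
# so rows that are not on the hull become 0 in both columns
X_masked <- X * Y
```

Comparing row indices avoids relying on the coordinate values being unique, which the value-based %in% test implicitly assumes.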
