function not filling empty vector in R

I'm trying to get the vector simulated_results to take the values returned by "simulation," which produces a vector of varying length depending on the iteration.
Initially I had this code, which works but is very slow:
simulated_results<-NULL
while(as.numeric(Sys.time())-start<duration){
simulated_results <- cbind(simulated_results,simulation(J,4* (length(J)^2),0.0007,duration,start))
}
Because it is so slow, I modified it:
start<-as.numeric(Sys.time())
duration<-10
simulated_results<-NULL
simulated_results <- cbind(simulated_results,
replicate(n=10000,expr=(while(as.numeric(Sys.time())-start<duration)
{simulation(J,4*(length(J)^2),0.0007,duration,start)})))
With the new code, everything runs, but I can't get the results of simulation passed to simulated_results. Instead, simulated_results just ends up as a column vector of NULL values.
I get no error messages.
I would greatly appreciate any help!
For reference, the simulation code is:
iter<-as.numeric(Sys.getenv("PBS_ARRAY_INDEX"))
if(iter <= 40){J<-1:500
}else if(iter <= 80){J<-1:1500
}else if(iter <= 120){J<-1:2500
}else if(iter <= 160){J<-1:5000}
set.seed(iter)
simulation <- function(J,gens,v=0.1,duration,start){
species_richness <- function(J){
a <- table(J)
return(NROW(a))
}
start<-as.numeric(Sys.time())
species_richness_output <- rep(NA,gens)
for(rep in 1:gens){
if (as.numeric(Sys.time())-start<duration){
index1 <- sample(1:length(J),1)
if(runif(1,0,1) < v){
J[index1] <- (rep+100)
}
else{
index2 <- sample(1:length(J),1)
while(index1==index2) {
index2 <- sample(1:length(J),1)
}
J[index1] <- J[index2]
}
species_richness_output[rep] <- species_richness(J)} else break
}
species_abundance <- function(J){
a <- table(J)
return(a)
}
abuntable <- species_abundance(J)
octaves <- function(abuntable)
{
oct<-rep(0,floor(log2(length(J))+1))
for(i in 1:length(abuntable)){
oct2 <- floor(log2(abuntable[i])+1)
oct[oct2] <- oct[oct2]+1
}
return(oct)
}
octaves(abuntable)
}

I agree with @Nathan G, but something did catch my attention: you are trying to cbind two things that cannot be bound together, since they have different dimensions. We don't know what type of object your simulation function returns, but it clearly is not NULL. Consider this:
df1 <- NULL
df2 <- data.frame(x = 1:10, y = 11:20)
cbind(df1, df2)
cbind(df2, df1)
Both cbind statements give errors. Do you get an error? If this is what's going on, you should initialize simulated_results not as NULL but as an empty version of whatever the function simulation returns.
EDIT
iter = 10
set.seed(iter)
J <- 1:1500
# critical to preallocate the list size for speed
res <- vector("list", iter)
for (i in 1: iter) {
res[[i]] <- simulation(J,4* (length(J)^2),0.0007,duration = 10,start)
}
str(res)
res[[1]]
Now I don't think I'm using this quite the way you ultimately intend, but perhaps this will give you enough to get to what you actually want.
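One more point that may explain the column of NULL values: in R a while loop evaluates to NULL, so replicate() has nothing to collect from it, whatever the function called inside the loop returns. Below is a minimal sketch of a time-bounded loop that appends each result to a list instead. Note that toy_simulation and the 0.2 second duration are hypothetical stand-ins, not the original simulation; because the number of iterations is not known in advance, the list is grown by appending rather than preallocated.

```r
# A while loop evaluates to NULL, even when its body computes something.
k <- 0
loop_value <- while (k < 3) { k <- k + 1 }
is.null(loop_value)

# Hypothetical stand-in for simulation(): returns a vector of varying length.
toy_simulation <- function() seq_len(sample(3:6, 1))

duration <- 0.2                      # seconds; short, for illustration only
start <- as.numeric(Sys.time())
results <- list()

# Store each result explicitly; the loop's own value is never used.
while (as.numeric(Sys.time()) - start < duration) {
  results[[length(results) + 1]] <- toy_simulation()
}

length(results)    # number of simulations that finished in time
lengths(results)   # each element can have a different length
```

A list is used rather than cbind because the returned vectors may differ in length.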

Related

How to store values in a vector inside a while loop in R

For the next exercise: From a certain numerical value, check if this is a natural number or not so that, if it is, it shows the divisors of this number and, if it is not, it shows an error message.
As there was no predefined function for this I wrote:
n <- 102
x <- n
res <- c()
while (x>0){
if (n%%x == 0){
res[x] <- x
x = x-1
} else {
x = x -1
}
print("The values are ", res)
}
res
It works nicely, except that it's not storing the values inside the vector. Any ideas?
I'm new to programming and Stack Overflow. I hope this question is posted and presented correctly.
Cheers
What you need is a counter "i" so that each value is saved in the next free entry of the vector. Indexing with res[x] instead leaves NA in every position that is not a divisor:
n <- 102
x <- n
res <- c()
i<-1
while (x>0){
if (n%%x == 0){
res[i] <- x
x = x-1
i<-i+1
} else {
x = x -1
}
}
res
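For comparison, the same result can be obtained without a loop by testing all candidates at once. This is only a sketch: the function name divisors and the wording of the error message are my own choices, and the natural-number check assumes "natural" means a single whole number of at least 1. The divisors come out in ascending order, unlike the loop above.

```r
# Hypothetical helper: return the divisors of a natural number,
# or stop with an error message if n is not a natural number.
divisors <- function(n) {
  is_natural <- is.numeric(n) && length(n) == 1 && n >= 1 && n == floor(n)
  if (!is_natural) stop("n must be a natural number")
  candidates <- seq_len(n)
  # Keep only the candidates that divide n without remainder.
  candidates[n %% candidates == 0]
}

divisors(102)
# [1]   1   2   3   6  17  34  51 102
```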

Is there an easier way to create this vector that doesn't require loops?

I'm trying to make a single vector that pairs up values. For example if the 2nd element is 100, I want the 100th element to be 2. In a sense, each element of the vector has its own respective pair.
I'm using this vector in a function later.
I've currently tried working with the mapply and for loop in similar ways, but these take too long and I keep thinking that there must be a more optimal way for doing this. Perhaps a variation of the sample function?
I want a 'paired' vector of length 10^5
set.seed(1)
tmp <- 1:10^5
t1 <- c(sample(1:10^5, 5))
t2 <- tmp[-t1]
t2 <- sample(t2)
for (i in 1:10^5) {
if (tmp[i] %in% t1) {
a <- which(t1 == tmp[i])
tmp[i] <- t2[a]
} else {
a <- which(t2 ==tmp[i])
tmp[i] <- t1[a]
}
}
Basically it all works, it's just that it takes way too long. I'm sure there must be a more optimal way of doing this.
I need the pairings to be random, so doing them for the first 50 000 and then the other 50 000 wouldn't work.
If I have understood you correctly, you can do:
tmp <- 1:100
t1 <- sample(1:100, 50)
t2 <- tmp[-t1]
t2 <- sample(t2)
v2 <- integer(100)
v2[t1] <- t2
v2[t2] <- t1
Compared with your for loop, this gives the same result:
v1 <- integer(100) # initialise v1 before filling it in the loop
for(i in 1:100) {
if(tmp[i] %in% t1) {
a <- which(t1 == tmp[i])
v1[i] <- t2[a]
} else {
a <- which(t2 ==tmp[i])
v1[i] <- t1[a]
}
}
identical(v1, v2)
#[1] TRUE
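The same idea scales to the length of 10^5 asked about in the question. The sketch below assumes the vector is split into two random halves of equal size (taken here from a single random permutation, which is my own variation); it then checks that every element points back to its partner.

```r
set.seed(1)
n <- 1e5                       # must be even so every element gets a partner
half <- n / 2

idx <- sample(n)               # one random permutation of 1..n
t1 <- idx[seq_len(half)]       # first random half
t2 <- idx[(half + 1):n]        # second random half, already in random order

paired <- integer(n)
paired[t1] <- t2               # element t1[k] is paired with t2[k]
paired[t2] <- t1               # and t2[k] points back to t1[k]

# Following a pair twice returns to the start, and nothing pairs with itself.
all(paired[paired] == seq_len(n))
all(paired != seq_len(n))
```

Both assignments are vectorized, so no per-element lookup with which() or %in% is needed.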

Loop for value matching won't work across data frames for multiple instances

Can anyone tell me what's preventing this loop from running?
For each row i in column 3 of the data frame 'depth.df', the loop performs a calculation using a second data frame, 'linker.df': it multiplies the value in row i by a constant divided by a value from linker.df, which is found by matching the value in row i.
If I run the loop for a single value (let's say 50), it runs fine:
cor.depth <- function(depth.df){
result <- seq(from=1, to=(nrow(depth.df)))
x <- 8971
for(i in 1:nrow(depth.df)){
result[i] <- depth.df[i,3]*(x /( linker.df [i,2][ linker.df [i,1] == 50]))
return(result)
}
}
>97,331
but if I run it to loop over each instance of i, it always returns an error:
cor.depth <- function(depth.df){
result <- seq(from=1, to=(nrow(depth.df)))
x <- 8971
for(i in 1:nrow(depth.df)){
result[i] <- depth.df[i,3]*(x /( linker.df [i,2][ linker.df [i,1] %in% depth.df[i,3]]))
return(result)
}
}
Error in result[i] <- depth.df[i, 3] * (all_SC_bins/(depth.ea.bin.all[, :
replacement has length zero
EDIT
Here is a reproducible data set provided to illustrate data structure and issue
#make some data as an example
linker.data <- sample(x=40:50, replace = FALSE)
linker.df <- data.frame(
X = linker.data
, Y = sample(x=2000:3000, size = 11, replace = TRUE)
)
depth.df <- data.frame(
X = sample(x=9000:9999, size = 300, replace = TRUE)
, Y = sample(x=c("A","G","T","C"), size = 300, replace = TRUE)
, Z = sample(linker.data, size = 300, replace = TRUE)
)
cor.depth <- function(depth.df){
result <- seq(from=1, to=(nrow(depth.df)))
x <- 8971
for(i in 1:nrow(depth.df)){
result[i] <- depth.df[i,3]*(x /( linker.df [i,2][ linker.df [i,1] %in% depth.df[i,3]]))
return(result)
}
}
The error arises because the denominator evaluates to a zero-length vector (integer(0) or numeric(0)) on most rows. Your loop only compares row i of linker.df with row i of depth.df, so it finds a value only where the X and Z values happen to match on the same row number. You likely intended to match against any row of linker.df, which requires a second, nested loop with an if condition on matches. Note also that return(result) belongs after the loop, not inside it:
cor.depth <- function(depth.df){
result <- seq(from=1, to=(nrow(depth.df)))
x <- 8971
for(i in 1:nrow(depth.df)){
for (j in 1:nrow(linker.df)){
if (linker.df[j,1] == depth.df[i,3]) {
result[i] <- depth.df[i,3]*(x /( linker.df[j,2]))
}
}
}
return(result)
}
Nonetheless, consider merge, a more efficient, vectorized approach that matches rows between both sets on their ids. The setNames calls below rename the columns to avoid duplicate headers:
mdf <- merge(setNames(linker.df, paste0(names(linker.df), "_l")),
setNames(depth.df, paste0(names(depth.df), "_d")),
by.x="X_l", by.y="Z_d")
mdf$result <- mdf$X_l * (8971 / mdf$Y_l)
As a comparison, the two approaches give equivalent results:
depth.df$result <- cor.depth(depth.df)
depth.df <- with(depth.df, depth.df[order(Z),]) # ORDER BY Z
mdf <- with(mdf, mdf[order(X_l),]) # ORDER BY X_L
all.equal(depth.df$result, mdf$result)
# [1] TRUE
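If keeping the original row order of depth.df matters, a lookup with match() is another vectorized option. This is a sketch under the assumption that each id appears only once in linker.df$X, as in the example data; the seed and the reduced depth.df are my own choices for illustration.

```r
set.seed(42)
linker.data <- sample(x = 40:50, replace = FALSE)
linker.df <- data.frame(
  X = linker.data,
  Y = sample(x = 2000:3000, size = 11, replace = TRUE)
)
depth.df <- data.frame(
  X = sample(x = 9000:9999, size = 300, replace = TRUE),
  Z = sample(linker.data, size = 300, replace = TRUE)
)

# For every depth.df row, find the linker.df row whose X equals Z.
row_in_linker <- match(depth.df$Z, linker.df$X)

# Same formula as in the loop, applied to all rows at once.
depth.df$result <- depth.df$Z * (8971 / linker.df$Y[row_in_linker])

head(depth.df)
```

Unlike merge, this does not reorder the rows, so no sorting is needed before comparing results.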

Populating a vector with a for loop

I am trying to fill a vector pred_pos with the result pred on each iteration of the for loop. However, my pred_pos vector is never filled. The my_vec object is a list of large character vectors which I don't believe needs to be reproduced for this problem as it is most likely a fundamental indexing error. I just need to know how to populate a vector from this for loop. I can't seem to work out a solution.
pred_pos <- vector("numeric" , 2)
for(i in my_vec) {
for(r in pred_pos) {
inserts <- sapply(i, function(n) { n <- cond_probs_neg[n] } )
pred <- sum(unlist(inserts) , na.rm = T) * apriori_neg
pred_pos[r] <- pred
}
}
Assuming that the rest of your code works, there is no need to explicitly state:
pred_pos <- vector("numeric" , 2)
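That creates a numeric vector of length two, c(0, 0). Because the inner loop iterates over those values, r is always 0, and assigning to pred_pos[0] changes nothing, which is why the vector is never filled. You ought to be able to write: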
pred_pos <- vector()
Now, when you wish to append to the vector, you can simply use:
pred_pos[length(pred_pos) + 1] <- someData
I believe your code should work if it is adjusted:
pred_pos <- vector()
for(i in my_vec) {
inserts <- sapply(i, function(n) { n <- cond_probs_neg[n] } )
pred <- sum(unlist(inserts) , na.rm = T) * apriori_neg
pred_pos[length(pred_pos)+1] <- pred
}
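Since each element of my_vec yields exactly one number, the loop can also be written with vapply, which builds the result vector in one step. The data below are hypothetical stand-ins (the question does not show my_vec, cond_probs_neg or apriori_neg); I am assuming cond_probs_neg is a named numeric vector indexed by word.

```r
# Hypothetical stand-in data, for illustration only.
cond_probs_neg <- c(good = 0.1, bad = 0.6, awful = 0.3)
apriori_neg <- 0.5
my_vec <- list(c("good", "bad"), c("awful", "unknown"), "bad")

# One prediction per element of my_vec; unknown words give NA and are dropped.
pred_pos <- vapply(
  my_vec,
  function(words) sum(cond_probs_neg[words], na.rm = TRUE) * apriori_neg,
  numeric(1)
)

pred_pos
# [1] 0.35 0.15 0.30
```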

incorrect number of subscripts on matrix, for getting a plot from a matrix in R

I have a data set with its own metadata. I select some of the columns in order to list all the data they contain.
Then I use these loops to store it in a matrix (I tried a data.frame and a list, but those didn't work either). The entries are strings.
#############
ii_c <- metadades$item_id[metadades$tipus_item == "comentari"]
g_c <- metadades$grup[metadades$tipus_item == "comentari"]
i_c <- metadades$item[metadades$tipus_item == "comentari"]
in_c <- data_ent[, ii_c]
c_l <- list()
for(i in 1:ncol(in_c)){
c_l[[i]] <- in_c[,i][!is.na(in_c[,i])]
}
j <- 0
l <- 0
c_cl <- matrix(ncol=3)
for(i in 1:ncol(in_c)){
if(mode(c_l[[i]])=="numeric"){
j=j+1
} else {
for(k in 1:length(c_l[i])){
c_cl[i-j+l,] = c(g_c[i],i_c[i],c_l[i][k])
l=l+1
}
}
}
df_cl <- as.data.frame(c_cl)
#############
This way I would be able to plot it afterwards. I have also tried building a list of data frames (instead of a matrix) so that I could cbind them later, but that gave me errors as well.
The next step would be to do a tableGrob and a grid.draw, to print it in a report.
I got the solution from my workmate:
df_comentaris <- data.frame(grup=NA, item=NA, comentari =NA)
for (i in metadades$item_id[metadades$tipus_item=='comentari']) {
comentaris <- dades[!is.na(dades[i]),i]
grup <- metadades$grup[metadades$item_id == i]
item <- metadades$item[metadades$item_id == i]
df_aux <- data.frame(grup=rep(grup,length(comentaris)), item=rep(item,length(comentaris)), comentari=comentaris)
df_comentaris <- rbind(df_comentaris, df_aux)
}
df_comentaris <- df_comentaris[2:nrow(df_comentaris),]
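A variation on the same approach collects the per-item data frames in a list and binds them once at the end, which avoids the placeholder NA row. This is only a sketch: the small metadades and dades tables below are hypothetical stand-ins with the column names used above.

```r
# Hypothetical stand-in data, for illustration only.
metadades <- data.frame(
  item_id = c("q1", "q2", "q3"),
  tipus_item = c("comentari", "nota", "comentari"),
  grup = c("A", "A", "B"),
  item = c("first", "second", "third"),
  stringsAsFactors = FALSE
)
dades <- data.frame(
  q1 = c("ok", NA, "slow"),
  q2 = c(1, 2, 3),
  q3 = c(NA, NA, "fine"),
  stringsAsFactors = FALSE
)

ids <- metadades$item_id[metadades$tipus_item == "comentari"]

# Build one small data frame per comment column.
pieces <- lapply(ids, function(id) {
  comentaris <- dades[[id]][!is.na(dades[[id]])]
  meta <- metadades[metadades$item_id == id, ]
  data.frame(
    grup = rep(meta$grup, length(comentaris)),
    item = rep(meta$item, length(comentaris)),
    comentari = comentaris,
    stringsAsFactors = FALSE
  )
})

# Bind all pieces in a single call.
df_comentaris <- do.call(rbind, pieces)
df_comentaris
```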

Resources