I would like to generate a set of vectors containing a growing number of some representative value.
In the end I need a matrix or a data.frame consisting of 100 rows, where row i contains i copies of the representative value (1 in the example). But I get the following error. What is the trick? What am I missing?
Error: no function to return from, jumping to top level
for(i in 1:100) {
x <- c(rep(1,i),rep(100000,(2500-i)))
return(x)
}
Many thanks!
You can only use return within a function. One solution is to create a matrix to store the results in, something like this:
m = matrix(0, ncol=100, nrow=2500)
for(i in 1:100) {
  m[,i] = c(rep(1, i), rep(100000, (2500-i)))
}
should do the trick. Or using the sapply function:
m1 = sapply(1:100, function(i) c(rep(1, i), rep(100000,(2500-i))))
For info, your rep function can also be simplified to:
rep(c(1, 100000), c(i, 2500-i))
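If the goal is the 100-row layout described in the question, one option is to build each row with the simplified rep call and transpose the result. This is only a minimal sketch, assuming the same 2500 and 100000 values as above; the names n_rows, n_cols and m_rows are made up for illustration:

```r
n_rows <- 100
n_cols <- 2500

# vapply returns an n_cols x n_rows matrix (one column per i), so transpose it
m_rows <- t(vapply(
  seq_len(n_rows),
  function(i) rep(c(1, 100000), c(i, n_cols - i)),
  numeric(n_cols)
))

dim(m_rows)            # 100 2500
sum(m_rows[3, ] == 1)  # 3
```

Row i of m_rows then starts with i ones, followed by the filler value.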
I would like to fill in a matrix inside a parallel loop.
When I call the function, it returns an empty matrix.
Could someone help me with that?
Compute_TaskSimilarity<-function(X,...){
Task_similarity<-matrix(0,nrow=100,ncol=100)
foreach(i = 1:K, .combine = "cbind") %dopar% {
for (j in (i + 1):(ncol(Task_similarity))) {
Myvalue<- ComputeValue
if (Myvalue!=0){
TaskSimilarity[i, j] <- Myvalue
} else{
TaskSimilarity[i, j] <- 0.0
}
}
return(TaskSimilarity)
}
Maybe do something like this: create a data.frame of all index combinations, apply the computation over every combination, then reshape the result into a matrix of the right size. (I use a simple multiplication as a stand-in for a more complex operation, and a smaller array of length 10.)
library(foreach)
# Register a parallel backend first (e.g. doParallel::registerDoParallel());
# without one, %dopar% runs sequentially and issues a warning.
indices <- data.frame(x=rep(1:10, each=10), y=rep(1:10, 10))
result <- foreach(i=1:nrow(indices)) %dopar% {
  # just an example for a more complex calculation
  indices$x[i] * indices$y[i]
}
# flatten the list of results and reshape it into a 10 x 10 matrix
result <- do.call(c, result)
dim(result) <- c(10, 10)
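The matrix in the question comes back empty because each worker only modifies its own copy of it; assignments made inside the loop body never reach the matrix in the calling session. The pattern that does work is to return values from each task and assemble them afterwards. Here is a rough sketch of that pattern using the base parallel package instead of foreach. The product i * j is only a placeholder for the real ComputeValue, and n, rows and sim are illustrative names. The i < n guard is there because (i + 1):n counts down when i equals n.

```r
library(parallel)  # ships with base R

n <- 10
cl <- makeCluster(2)  # two local worker processes; adjust as needed

# Each task returns one complete row. The worker function receives everything
# it needs as arguments, because workers do not see the caller's variables.
rows <- parLapply(cl, seq_len(n), function(i, n) {
  row <- numeric(n)
  if (i < n) {
    j <- (i + 1):n
    row[j] <- i * j  # placeholder for the real computation
  }
  row
}, n = n)
stopCluster(cl)

# Assemble the rows in the parent process
sim <- do.call(rbind, rows)
```

Only the part above the diagonal is filled here, mirroring the j > i loop in the question.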
I have a dataset containing about 4000 matrices in vector form; each of them should be named by the date it was created.
Right now I have the following:
dates <- unique(rcov_matrix$dateid)
for(k in dates){
k <- matrix(0, 30, 30)
for(i in 1:30){
for (j in 1:i){
number <- number + 1
value <- rcov_matrix[1, number]
k[i,j] <- value
k[j,i] <- value
}
}
}
The code correctly assigns the entries of the vector to the matrix, but I only end up with one matrix named k in the end.
I understand that this is because of the way variable names are assigned in R, but I could not find a viable solution for my problem in similar posts.
assign(k, matrix(0, 30, 30))
does not work because I have to reuse the variable name later in the next for loop.
How can I solve this? Or is there a more effective way to assign my values to the matrices?
Thank you.
Maybe the simplest is to use assign at the end of the loop, rather than the start.
for (k in dates){
This_k <- matrix(0, 30, 30)
for(i in 1:30){
for (j in 1:i){
value <- rnorm(1) # I use rnorm here to make the example reproducible
This_k[i,j] <- value
This_k[j,i] <- value
}
}
assign(k, This_k)
}
Alternatively (and perhaps a little more efficient), you could put your matrices in a list and use list indexing:
klist = lapply(rep(0, length(dates)), matrix, 30, 30)
names(klist) = dates
for (k in dates){
for(i in 1:30){
for (j in 1:i){
value <- rnorm(1)
klist[[k]][i,j] <- value
klist[[k]][j,i] <- value
}
}
}
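The inner loops can probably be avoided as well. A sketch, assuming each matrix arrives as a vector holding the entries in the same order the i/j loops visit them; the vecs list, its date names and the helper vec_to_sym are hypothetical and only stand in for the real rcov_matrix data:

```r
# Hypothetical input: one vector of triangle entries per date
n <- 3
vecs <- list(
  "2020-01-01" = c(1, 2, 3, 4, 5, 6),
  "2020-01-02" = c(6, 5, 4, 3, 2, 1)
)

vec_to_sym <- function(v, n) {
  m <- matrix(0, n, n)
  # Column-major order over the upper triangle visits (1,1), (1,2), (2,2),
  # (1,3), ... which is the order in which the i/j loops fill [j, i]
  m[upper.tri(m, diag = TRUE)] <- v
  # Mirror the upper triangle into the lower one
  m[lower.tri(m)] <- t(m)[lower.tri(m)]
  m
}

klist <- lapply(vecs, vec_to_sym, n = n)
klist[["2020-01-01"]]
```

Because lapply keeps the names of its input, each matrix can be looked up by its date string without assign.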
My question is about how to improve the performance of function that downsamples from the columns of a matrix without replacement (a.k.a. "rarefication" of a matrix... I know there has been mention of this here, but I could not find a clear answer that a) does what I need; b) does it quickly).
Here is my function:
downsampled <- function(data,samplerate=0.8) {
data.test <- apply(data,2,function(q) {
names(q) <- rownames(data)
samplepool <- character()
for (i in names(q)) {
samplepool <- append(samplepool,rep(i,times=q[i]))
}
sampled <- sample(samplepool,size=samplerate*length(samplepool),replace = F)
tab <- table(sampled)
mat <- match(names(tab),names(q))
toret=numeric(length = length(q))
names(toret) <- names(q)
toret[mat] <- tab
return(toret)
})
return(data.test)
}
I need to be downsampling matrices with millions of entries. I find this is quite slow (here I'm using a 1000x1000 matrix, which is about 20-100x smaller than my typical data size):
mat <- matrix(sample(0:40,1000*1000,replace=T),ncol=1000,nrow=1000)
colnames(mat) <- paste0("C",1:1000)
rownames(mat) <- paste0("R",1:1000)
system.time(matd <- downsampled(mat,0.8))
## user system elapsed
## 69.322 21.791 92.512
Is there a faster/easier way to perform this operation that I haven't thought of?
I think you can make this dramatically faster. If I understand what you are trying to do correctly, you want to down-sample each cell of the matrix, such that if samplerate = 0.5 and the cell of the matrix is mat[i,j] = 5, then you want to sample up to 5 things where each thing has a 0.5 chance of being sampled.
To speed things up, rather than doing all these operations on columns of the matrix, you can just loop through each cell of the matrix, draw n things from that cell by using runif (e.g., if mat[i,j] = 5, you can generate 5 random numbers between 0 and 1, and then add up the number of values that are < samplerate), and finally add the number of things to a new matrix. I think this effectively achieves the same down-sampling scheme, but much more efficiently (both in terms of running time and lines of code).
# Sample matrix
set.seed(23)
n <- 1000
mat <- matrix(sample(0:10,n*n,replace=T),ncol=n,nrow=n)
colnames(mat) <- paste0("C",1:n)
rownames(mat) <- paste0("R",1:n)
# Old function
downsampled<-function(data,samplerate=0.8) {
data.test<-apply(data,2,function(q){
names(q)<-rownames(data)
samplepool<-character()
for (i in names(q)) {
samplepool=append(samplepool,rep(i,times=q[i]))
}
sampled=sample(samplepool,size=samplerate*length(samplepool),replace = F)
tab=table(sampled)
mat=match(names(tab),names(q))
toret=numeric(length = length(q))
names(toret)<-names(q)
toret[mat]<-tab
return(toret)
})
return(data.test)
}
# New function
downsampled2 <- function(mat, samplerate=0.8) {
new <- matrix(0, nrow(mat), ncol(mat))
colnames(new) <- colnames(mat)
rownames(new) <- rownames(mat)
for (i in 1:nrow(mat)) {
for (j in 1:ncol(mat)) {
new[i,j] <- sum(runif(mat[i,j], 0, 1) < samplerate)
}
}
return(new)
}
# Compare times
system.time(downsampled(mat,0.8))
## user system elapsed
## 26.840 3.249 29.902
system.time(downsampled2(mat,0.8))
## user system elapsed
## 4.704 0.247 4.918
Using an example 1000 X 1000 matrix, the new function I provided runs about 6 times faster.
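The double loop can likely be vectorized further: counting how many of k uniform draws fall below samplerate is a single Binomial(k, samplerate) draw, so rbinom can process all cells in one call. A sketch under that assumption (downsampled3 is an illustrative name). Note that, like the runif version, this thins each item independently, so column totals match samplerate times the original total only on average, unlike sample() without replacement.

```r
downsampled3 <- function(mat, samplerate = 0.8) {
  new <- mat
  # new[] keeps the dimensions and dimnames of mat while replacing the values
  new[] <- rbinom(length(mat), size = mat, prob = samplerate)
  new
}

set.seed(23)
small <- matrix(sample(0:10, 20, replace = TRUE), nrow = 4,
                dimnames = list(paste0("R", 1:4), paste0("C", 1:5)))
thinned <- downsampled3(small, 0.8)
```

Each cell of thinned is between 0 and the corresponding cell of small.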
One source of savings would be to remove the for loop that appends samplepool using rep. Here is a reproducible example:
myRows <- 1:5
names(myRows) <- letters[1:5]
# get the repeated values for sampling
samplepool <- rep(names(myRows), myRows)
Within your function, this would be
samplepool <- rep(names(q), q)
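Going one step further, the table/match bookkeeping can probably be dropped too by pooling row indices instead of names and counting with tabulate. This keeps the exact sampling-without-replacement behaviour of the original function. The names downsample_col and downsampled_fast are illustrative, and rounding the sample size down with floor is a choice made for this sketch:

```r
downsample_col <- function(q, samplerate = 0.8) {
  # One pool entry per counted item, stored as the row index it came from
  pool <- rep(seq_along(q), q)
  size <- floor(samplerate * length(pool))
  # Index via sample.int so a length-one pool is not misread as 1:pool
  kept <- pool[sample.int(length(pool), size)]
  # Count how many items of each row survived
  tabulate(kept, nbins = length(q))
}

downsampled_fast <- function(data, samplerate = 0.8) {
  out <- apply(data, 2, downsample_col, samplerate = samplerate)
  dimnames(out) <- dimnames(data)
  out
}

set.seed(1)
small <- matrix(sample(0:40, 12, replace = TRUE), nrow = 4,
                dimnames = list(paste0("R", 1:4), paste0("C", 1:3)))
res <- downsampled_fast(small, 0.8)
colSums(res)  # equals floor(0.8 * colSums(small))
```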
I have spent several hours thinking about the following problem. I am running a simulation study and I want to define functions outside the simulation study so that I can call them at the end of my code.
This example illustrates the problem, but is not reproducible (below you will find a reproducible example of the problem). I use the "metafor" package to do a meta-analysis.
I would like to use the following function that I define outside my final simulation code:
mat <- matrix(NA, nrow = 8, ncol = 3)
funtr.stu <- function(i) {
for (y in 1:8) {
mat[y,i] <- tr[[y]]$k0
}
return(mat)
}
"tr" is a list and consists of the results of 8 times an analysis. I want to retrieve the object "k0" from that list and store it into the matrix "mat".
In the following part of the code (in which I run the simulation), I want to call the function and fill the matrix "mat" with the correct numbers.
for (i in 1:iterations) {
tr.stu <- funtr.stu()
}
The result of this code is a filled matrix, but every column contains the same numbers. Thus, R isn't storing the numbers from every iteration; it stores only the last iteration.
How can I modify my code in such a way that R is storing the output as I want?
A very simplified example:
Mat represents just a matrix with numbers and res is an empty matrix that I want to fill.
mat <- matrix(data = c(1,2,3,4,5,6), ncol = 2, nrow = 3)
res <- matrix(NA, ncol = 2, nrow = 3)
I use the function "fun" to fill the empty matrix res.
fun <- function() {
for (i in 1:2) {
res[y,i] <- mat[y,i]
}
return(res)
}
This is what I would like to put in the end of my code (I just want to call the function and with this function I want to fill the matrix "res"). However, if I use the code below R only fills the third row and not the first and second row.
for (y in 1:3) {
test <- fun()
}
Thank you in advance!
This should work in your case. Basically, return one row in each iteration of the for loop, whereas you are returning the entire 'res' matrix. Note that 'test' has to exist before rows can be assigned to it.
mat <- matrix(data = c(1,2,3,4,5,6), ncol = 2, nrow = 3)
res <- matrix(NA, ncol = 2, nrow = 3)
fun <- function() {
  for (i in 1:2) {
    res[y,i] <- mat[y,i]
  }
  return(res[y,])
}
# create the result matrix before filling it row by row
test <- matrix(NA, ncol = 2, nrow = 3)
for (y in 1:3) {
  test[y,] <- fun()
}
I have spent several hours thinking about the following problem. I am running a simulation study and I want to define functions outside the simulation study so that I can call them at the end of my code.
A very simplified example:
Mat represents just a matrix with numbers and res is an empty matrix that I want to fill.
mat <- matrix(data = c(1,2,3,4,5,6), ncol = 2, nrow = 3)
res <- matrix(NA, ncol = 2, nrow = 3)
I use the function "fun" to fill the empty matrix res.
fun <- function() {
for (i in 1:2) {
res[y,i] <- mat[y,i]
}
return(res)
}
This is what I would like to put in the end of my code (I just want to call the function and with this function I want to fill the matrix "res"). However, if I use the code below R only fills the third row and not the first and second row.
for (y in 1:3) {
test <- fun()
}
My question is: why isn't R also filling the first and second row and how can I change my code in such a way that R provides me with the desired result?
Thank you in advance!
EDIT:
The following example also illustrates my problem. I make use of the "metafor" package for doing a meta-analysis.
I would like to use the following function that I define outside my final simulation code:
mat <- matrix(NA, nrow = 8, ncol = 3, dimnames = list(c("0_le", "0_ri", ".13_le", ".13_ri", ".33_le", ".33_ri", ".5_le", ".5_ri"), c("1", "2", "3")))
funtr.stu <- function(i) {
for (y in 1:8) {
mat[y,i] <- tr[[y]]$k0
}
return(mat)
}
"tr" is a list and consists of the results of 8 times an analysis. I want to retrieve the object "k0" from that list and store it into the matrix "mat".
In the following part of the code (in which I run the simulation), I want to call the function and fill the matrix "mat" with the correct numbers.
for (i in 1:iterations) {
kip <- funtr.stu()
}
The result of this code is a filled matrix, but every column contains the same numbers. Thus, R isn't storing the numbers from every iteration; it stores only the last iteration.
How can I modify my code in such a way that R is storing the output as I want?
Thank you in advance for your help!
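It is because you are overwriting all values of the matrix test in each iteration. The assignment inside fun only changes a local copy of res; the global res stays all NA, so every call returns a matrix in which only row y is filled, and that matrix replaces test. I added print(test) in the loop. See the code.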
mat <- matrix(data = c(1,2,3,4,5,6), ncol = 2, nrow = 3)
res <- matrix(NA, ncol = 2, nrow = 3)
mat
res
fun <- function() {
for (i in 1:2) {
res[y,i] <- mat[y,i]
}
return(res)
}
for (y in 1:3) {
test <- fun()
print(test)
}
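One way around the overwriting is to pass the matrix into the function, return the modified copy, and reassign it in the loop. A minimal sketch of that idea; fill_row is a hypothetical rewrite of fun, not part of the original code:

```r
mat <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2, nrow = 3)
res <- matrix(NA, ncol = 2, nrow = 3)

# Hypothetical rewrite: everything the function needs comes in as arguments,
# and the updated matrix is returned
fill_row <- function(res, mat, y) {
  for (i in seq_len(ncol(mat))) {
    res[y, i] <- mat[y, i]
  }
  res
}

for (y in 1:3) {
  res <- fill_row(res, mat, y)  # reassign so the change is kept
}
res
```

Because the result of each call is assigned back to res, the rows filled in earlier iterations are carried forward instead of being lost.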