Store/make sparse vector in R

I am generating a sparse vector of length >50,000 and filling it in a for loop. Is there an efficient way of storing the zeros?
Basically the code looks like
score = c()
for (i in 1:length(someList)) {
score[i] = getScore(input[i], other_inputs)
if (score[i] == numeric(0))
score[i] = 0 ###I would want to do something about the zeros
}

This code will not work: assigning a zero-length result to score[i] fails, and comparing with numeric(0) gives a zero-length logical that if() cannot evaluate. Preallocate the score vector before the loop instead. Preallocating already fills the vector with zeros, so you only need to assign the non-empty results of the getScore function.
N <- length(someList)
score <- numeric(N)  ## preallocate a vector of N zeros
for (i in seq_len(N)) {
  ss <- getScore(input[i], other_inputs)
  ## keep the preallocated zero unless getScore returned a value
  if (length(ss) != 0)
    score[i] <- ss
}
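If the dense vector of zeros is itself the concern, one option is to keep only the positions and values of the non-empty scores. The following is a minimal sketch, not the original poster's code: get_score() is a hypothetical stand-in for getScore(), and input is toy data.

```r
# Hypothetical stand-in for getScore(): returns numeric(0) when there is no score
get_score <- function(x) if (x %% 3 == 0) x / 10 else numeric(0)

input <- 1:12        # toy input (assumption)
idx <- integer(0)    # positions of the non-zero scores
val <- numeric(0)    # the non-zero scores themselves

for (i in seq_along(input)) {
  ss <- get_score(input[i])
  if (length(ss) != 0) {
    idx <- c(idx, i)
    val <- c(val, ss)
  }
}

# Rebuild the dense vector only when it is actually needed
score <- numeric(length(input))
score[idx] <- val

# Optional: a sparse representation, if the Matrix package is installed
if (requireNamespace("Matrix", quietly = TRUE)) {
  sparse_score <- Matrix::sparseVector(x = val, i = idx, length = length(input))
}
```

Growing idx and val with c() is only reasonable when few entries are non-zero; for a vector of 50,000 numbers the preallocated dense version above is usually cheap enough.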

Related

Add new row to R data frame in for loop without using NROW()

Problem
Add row to data frame in function for accuracy measurement.
Effort in R
The function compute_accuracy.func() returns the precision for a given threshold.
The for loop calls compute_accuracy.func() and should add a new row to the data.frame on each iteration.
Add New Row to Data Frame, Not Working
I need to add a new row on each iteration of the for loop, holding the threshold used and the precision computed. I'm new to adding rows to an R data.frame dynamically inside a for loop.
compute_accuracy.func <- function(t_threshold) {
tryCatch({
x <- accuracy.meas(full$y, loans_predict$fit, threshold=t_threshold)
return(x$precision)
},
error = function(e) return(e)
)
}
df_accuracies <- data.frame(n=0, threshold=0, precision=0)
compute_for_values = seq(0.1,0.9,by=0.1)
for(i in compute_for_values){
threshold = i
precision <- compute_accuracy.func(i)
df_accuracies[nrow(df_accuracies) + 1,] <- rbind(threshold, precision)
}
Solved by building each new row as a vector with c() instead of rbind(): rbind(threshold, precision) supplies only two values for a three-column row, whereas c(n, threshold, precision) supplies all three.
df_accuracies <- data.frame(n=0, threshold=0, precision=0)
compute_for_values = seq(0.15,0.95,by=0.1)
n=0
for(i in compute_for_values){
n <- n+1
threshold = i
precision <- compute_accuracy.func(i)
df_accuracies[nrow(df_accuracies) + 1,] <- c(n, threshold, precision)
}
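An alternative that avoids growing the data frame row by row is to compute all precisions first and build the data frame once. This is a sketch under assumptions: compute_precision() is a hypothetical stand-in for compute_accuracy.func() and is assumed to always return a single number (the original may return an error object from tryCatch, which vapply() would reject).

```r
# Hypothetical stand-in for compute_accuracy.func(); the real one calls accuracy.meas()
compute_precision <- function(threshold) 1 - threshold / 2

thresholds <- seq(0.15, 0.95, by = 0.1)

# One precision per threshold; vapply() checks that each result is a single number
precisions <- vapply(thresholds, compute_precision, numeric(1))

# Build the whole data frame in one step, with no placeholder row of zeros
df_accuracies <- data.frame(
  n = seq_along(thresholds),
  threshold = thresholds,
  precision = precisions
)
```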

How to locate specific elements in one matrix and compare those with a second matrix?

Let's say we have a binary matrix / data frame:
library("Matrix")
df_binary <- data.frame(as.matrix(rsparsematrix(1000, 20,nnz = 800, rand.x = runif)))
df_binary[df_binary > 0] = 1
Now I would like to create an index object of all elements equal to 1. How can I do this in R?
I need this index to compare the entries of the binary matrix with the corresponding entries of a second matrix. Both matrices have the same size, in case that matters.
If you want a list out you could do something along the lines of
list_ones <- function(df) {
out <- list()
for (col in names(df)) {
out[[col]] <- which(df[[col]] == 1)
}
return(out)
}
list_ones(df_binary)
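For the comparison with a second matrix, a two-column index of (row, col) positions may be more convenient than a list. The sketch below uses two small toy matrices, m1 and m2, which are assumptions for illustration; for data frames, convert with as.matrix() first.

```r
# Toy binary matrices of the same size (assumptions for illustration)
m1 <- matrix(c(1, 0, 0, 1, 1, 0), nrow = 2)
m2 <- matrix(c(1, 1, 0, 0, 1, 0), nrow = 2)

# Row/column positions of every entry equal to 1 in m1
idx <- which(m1 == 1, arr.ind = TRUE)

# A two-column index matrix picks out the same cells in m2
agree <- m2[idx] == 1
```

Here idx has one row per matching cell, and agree tells, cell by cell, whether m2 also holds a 1 at that position.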

Loop with matrix accumulating values in column

I'm trying to make a loop to simplify:
dens1ha <- (densidade[1:45,5])
dens10ha <- (densidade[46:90,5])
dens100ha <- (densidade[91:135,5])
densfc <- (densidade[136:180,5])
denscap <- (densidade[181:225,5])
I need the values stored in a single vector (x) and a matrix (mm) as follows:
the values in rows 1 to 45, column 5, of the matrix densidade go into column 1 of x and mm; the values in rows 46 to 90, column 5, go into column 2 of x and mm;
and so on.
I tried:
x=c()
ii[1]=1
for(i in seq(1, 255, by = 44)) {
x[i]=densidade[i:(i+44),5]
ii=ii+1
mm = matrix(x,nrow=i,ncol=ii)
}
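From your description of mm, I think this should do the trick, assuming densidade has exactly 225 rows (matrix() fills column by column, so each column receives 45 consecutive values):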
mm <- matrix(densidade[,5], ncol=5)
You could also add names to the columns if this were desirable:
colnames(mm) <- c("dens1ha", "dens10ha", "dens100ha", "densfc", "denscap")
The goal of storing the vector x is less clear. I suspect that all of your goals can be achieved by extracting from the matrix mm rather than building a separate object:
# get dens1ha values as a vector
mm[,"dens1ha"]
mm[,1]
If you really would like to store these values in a separate, non-matrix structure, the most natural object to use in R is a list:
x <- list()
for(i in 1:5) {
x[[i]] <- densidade[(((i-1)*45)+1):(i*45),5]
}
# name the elements of the list
names(x) <- c("dens1ha", "dens10ha", "dens100ha", "densfc", "denscap")
You can extract vectors from this list using either
x[["dens1ha"]]
or
x[[1]]
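To check that the matrix() call above really reproduces the hand-written slices, here is a small sketch on toy data. The toy densidade (225 rows, values of interest in column 5) is an assumption standing in for the real data.

```r
# Toy stand-in for densidade: 225 rows, values of interest in column 5
densidade <- matrix(0, nrow = 225, ncol = 5)
densidade[, 5] <- seq_len(225)

# matrix() fills column-wise: rows 1-45 go to column 1, rows 46-90 to column 2, ...
mm <- matrix(densidade[, 5], ncol = 5)
colnames(mm) <- c("dens1ha", "dens10ha", "dens100ha", "densfc", "denscap")

# Each column matches the corresponding hand-written slice
same_first  <- identical(mm[, "dens1ha"], densidade[1:45, 5])
same_second <- identical(mm[, "dens10ha"], densidade[46:90, 5])
same_last   <- identical(mm[, "denscap"], densidade[181:225, 5])
```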

How to save results into a empty vector in R?

I have a problem creating an empty vector in R and saving the results of another vector into it. This is my code:
k<-vector(mode = "numeric", length = 0)
for (j in length(Pe)){
if ((Pe[j])>0) {
k[j]<-Pe[j]
}
}
The length of the vector Pe is 1000. I only need to save the values greater than zero in the vector k, but when I print k the console shows:
numeric(0)
Is this the correct way to initialize an empty vector (k) in R?
Thanks
In fact, it can be much easier. Type
k <- c()
instead. But I think this won't get you what you want.
What happens when element Pe[j] is not > 0? R will fill the skipped position k[j] with NA, while I think you want k to be a shorter vector holding only the elements of Pe that are > 0, not a vector of the same length padded with NAs.
If so, you don't even need a loop. Try
k <- Pe[Pe > 0]
This gives you a vector containing only the elements of Pe that are > 0, with no NAs.
Excuse my bad English, I hope this helps.
As MaxPD pointed out
for (j in length(Pe)) print(j)
would only print the length of Pe, because length(Pe) is a single number and the loop runs once. You should use one of
for (j in seq_len(length(Pe))) print(j)
## or
for (j in seq_along(Pe)) print(j)
## or
for (j in 1:length(Pe)) print(j)
but in your case I wouldn't even use a loop:
k<-vector(mode = "numeric", length = 0)
k[Pe > 0] <- Pe[Pe > 0]
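should do the trick if Pe is a vector. Note that this keeps the values at their original positions and fills the skipped positions with NA; use Pe[Pe > 0] if you want a shorter vector without NAs.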
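The difference between the approaches discussed above can be seen on a small example. This is a sketch with a toy Pe of length 4 (an assumption; the original has length 1000).

```r
Pe <- c(-1, 2, 0, 3.5)   # toy data (assumption)

# Loop version: iterate over all indices and append only the positive values
k_loop <- numeric(0)
for (j in seq_along(Pe)) {
  if (Pe[j] > 0) k_loop <- c(k_loop, Pe[j])
}

# Vectorized filter: same result, no loop
k_vec <- Pe[Pe > 0]

# Logical-index assignment: values keep their positions, gaps become NA
k_gap <- numeric(0)
k_gap[Pe > 0] <- Pe[Pe > 0]
```

Here k_loop and k_vec both hold the two positive values, while k_gap has NA wherever Pe was not positive.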

How to iterate an array with vectors in R?

I have a set of vectors of length n; for example, with n=3:
vec1<-c(1,2,3)
vec2<-c(2,2,2)
And a multidimensional array of size n^n:
threeDarray<-array(0,dim=c(3,3,3))
I want to create a loop that goes through my set of vectors and adds 1 to the corresponding index in the array. After analysing the two vectors above the array should be like:
threeDarray[1,2,3]=1
threeDarray[2,2,2]=1
I'm trying to use the multidimensional array to store the number of occurrences of each vector (my vectors are patterns in a time series).
The community is right (and the noob is wrong). Multidimensional arrays are not the way to go about this.
An example of code working with lists:
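freqPatterns <- function(timeSeries, dimension) {
  temp <- character()
  # slide a window of length `dimension` over the series
  for (i in 1:(length(timeSeries) - dimension + 1)) {
    # encode the window as its rank pattern, e.g. "0, 2, 1"
    pattern <- paste(as.character(rank(timeSeries[i:(i + dimension - 1)]) - 1), collapse = ", ")
    temp[[length(temp) + 1]] <- pattern
  }
  # count each pattern and sort by frequency
  freqTable <- sort(table(temp), decreasing = TRUE)
  return(freqTable)
}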
Thank you guys!
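The same idea can be written without growing a vector inside the loop. The following is a sketch of an equivalent variant, not the poster's code; the name freq_patterns and the toy series are assumptions.

```r
# Variant of the rank-pattern counter: vapply() builds all patterns at once
freq_patterns <- function(time_series, dimension) {
  starts <- seq_len(length(time_series) - dimension + 1)
  patterns <- vapply(starts, function(i) {
    window <- time_series[i:(i + dimension - 1)]
    # rank pattern of the window, e.g. "0, 2, 1"
    paste(rank(window) - 1, collapse = ", ")
  }, character(1))
  sort(table(patterns), decreasing = TRUE)
}

# Toy series (assumption): windows (1,3,2) and (4,6,5) share the pattern "0, 2, 1"
ft <- freq_patterns(c(1, 3, 2, 4, 6, 5), 3)
```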
As you found out yourself, I wouldn't use a multidimensional array either.
Here is a solution using a data frame:
n = 3 # dimension (must equal the length of each vector)
ll = rep(list(1:n), n) # list of n copies of 1:n
df_occurs = expand.grid(ll, KEEP.OUT.ATTRS=FALSE) # all n^n combinations
df_occurs$occurences = 0
# for loop counting the occurrences
for(v in list(vec1, vec2)) {
  v_match = apply(df_occurs[,1:n], 1, function(x) all(x==v))
  df_occurs$occurences[v_match] = df_occurs$occurences[v_match] + 1
}
Performance may become an issue for large n. If you can build a character key out of your vector, e.g.
paste(vec1, collapse="")
the lookup in the data frame becomes easier:
df_occurs = data.frame(
  key = apply(expand.grid(ll, KEEP.OUT.ATTRS=FALSE), 1, paste, collapse=""),
  occurences = 0
)
for(v in list(vec1, vec2)) {
  k = paste(v, collapse="")
  df_occurs$occurences[df_occurs$key==k] = df_occurs$occurences[df_occurs$key==k] + 1
}
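For completeness, the array from the question can be updated directly when n is small, by using each vector as a one-row index matrix. This is only a sketch: the list vecs is toy data, and since the array has n^n cells it stops being practical quickly as n grows, which is why the answers above prefer tables or keys.

```r
# Toy patterns (assumption): c(2, 2, 2) occurs twice
vecs <- list(c(1, 2, 3), c(2, 2, 2), c(2, 2, 2))
n <- 3

counts <- array(0, dim = rep(n, n))
for (v in vecs) {
  # a 1 x n matrix indexes a single cell of an n-dimensional array
  idx <- matrix(v, nrow = 1)
  counts[idx] <- counts[idx] + 1
}
```

After the loop, counts[1, 2, 3] holds 1 and counts[2, 2, 2] holds 2.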
