Set range data frame in R - r

I have a large number (17000) of 96x97 matrices with names b1,b2...b17000. I need to combine them into an array 96x97x17000. I am trying to do this through an array function:
s=array(b1,b2...b17000,dim=c(96,97,17000))
The problem is that for this function to work, you need to write down the name of all the matrices. How can this be done without writing down the name of the matrices 17,000 times?
I tried to set this as a range b1:b17000 but it does not work correctly.

Based on your original question the below code should work for you.
names<-c(paste0("b",1:17000))
s<-array(unlist(mget(names)),dim=c(96,97,17000))
This will create a vector called names containing your the names of your matrices (assuming they're actually called m1, m2,... m17000). For example: names[1] will be b1 and names[2] will be b2, so on and so forth.
You can then use names to reference your matrices array as shown in my suggested code.
I hope this helps!

Related

Subsetting list containing multiple classes by same index/vector

I'm needing to subset a list which contains an array as well as a factor variable. Essentially if you imagine each component of the array is relative to a single individual which is then associated to a two factor variable (treatment).
list(array=array(rnorm(2,4,1),c(5,5,10)), treatment= rep(c(1,2),5))
Typically when sub-setting multiple components of the array from the first component of the list I would use something like
list$array[,,c(2,4,6)]
this would return the array components in location 2,4 and 6. However, for the factor component of the list this wouldn't work as subsetting is different, what you would need is this:
list$treatment[c(2,4,6)]
Need to subset a list with containing different classes (array and vector) by the same relative number.
You're treating your list of matrices as some kind of 3-dimensional object, but it's not.
Your list$matrices is of itself a list as well, which means you can index at as a list as well, it doesn't matter if it is a list of matrices, numerics, plot-objects, or whatever.
The data you provided as an example can just be indexed at one level, so list$matrices[c(2,4,6)] works fine.
And I don't really get your question about saving the indices in a numeric vector, what's to stop you from this code?
indices <- c(2,4,6)
mysubset <- list(list$matrices[indices], list$treatment[indices])
EDIT, adding new info for edited question:
I see you actually have an 3-D array now. Which is kind of weird, as there is no clear convention of what can be seen as "components". I mean, from your question I understand that list$array[,,n] refers to the n-th individual, but from a pure code-point of view there is no reason why something like list$array[n,,] couldn't refer to that.
Maybe you got the idea from other languages, but this is not really R-ish, your earlier example with a list of matrices made more sense to me. And I think the most logical would have been a data.frame with columns matrix and treatment (which is conceptually close to a list with a vector and a list of matrices, but it's clearer to others what you have).
But anyway, what is your desired output?
If it's just subsetting: with this structure, as there are no constraints on what could have been the content, you just have to tell R exactly what you want. There is no one operator that takes a subset of a vector and the 3rd index of an array at the same time. You're going to have to tell R that you want 3rd index to use for subsetting, and that you want to use the same index for subsetting a vector. Which is basically just the code you already have:
idx <- c(2,4,6)
output <- list(list$array[,,idx], list$treatment[idx])
The way that you use for subsetting multiple matrices actually gives an error since you are giving extra dimension although you already specify which sublist you are in. Hence in order to subset matrices for the given indices you can usemy_list[[1]][indices] or directly my_list$matrices[indices]. It is the same for the case treatement my_list[[2]][indices] or my_list$treatement[indices]

How to calculate combination of Data frame in R

I am a beginner in R program.
I imported a csv file. This file only contains one column with 50 characters, but R classifies it as a dataframe. I need all possible combinations within elements of this column. I think I need to work with a vector not with a data frame, how can I do it?
Thank you!
Actually your data frame already contains the vector you need. You can call it with
dataframe$column_name
The text before the $ operator specifies your data frame, and after is your vector, which is a column in your data frame. So when you run your calculations you can just write
function(dataframe$column_name)
In your specific case with a single vector, it may be simplest to change the dataframe into a 2d vector. But when you start manipulating your data, you'll likely store more vectors of variables. You'll want to keep those vectors organized within data frames.
Do you mean unlist?
You can use it to change a data frame into a vector, then you can use combn to get combination.

print variable names in my own function r

I want to create a funtion that creates new data frames using some variables from other data frames. For that I thing I need to print the variable names in my own function somehow.
The variables come from two data frames (asd and tetracam) which have six variables in common, the bands "w530", "w550", "w570", "670", "w700" and "w800". So, I want to create six data frames, one for each band. One by one I could write like this:
# Band w530
w530<-data.frame(tetracam$filename,tetracam$time,tetracam$type,tetracam$w530,asd$w530)
names(w530)<-c("filename","time","type","tetracam","asd")
w530<-w530[order(w530$time),]
It works fine but I'd like to do it as a function in order to run for all bands. I thought I have to replace all the w530 in the code above for a dinamic object. As I thought of using some of the apply family. So, I first created a list with the names of my common variables:
bands<-c("w530","w550","w570","670","w700","w800")
Then, I tried several ways, for example, using cat or sprintf that would use the strings from the list to fill my function. But it didn't work. Actually, I'm not sure which apply family function I would use. If it's possible to use any in this case:
my.fun<- function(band){
sprintf("%s<-data.frame(tetracam$filename,tetracam$time,tetracam$type,asd$%s,tetracam$%s)",band,band,band)
sprintf("names(%s)<-c('filename','time','type','asd','tetracam')",band)
sprintf("%s[order(%s$time),]",band,band)
}
Any help is appreciated.
Trick is to access data.frame column using df[varName] idiom.
fun1 <- function(band, tetracam, asd){
df<-data.frame(tetracam$filename,tetracam$time,tetracam$type,tetracam[band],asd[band])
names(df)<-c("filename","time","type","tetracam","asd")
df<-df[order(df$time),]
return(df)
}
for (band in bands){
single_band_df <- fun1(band, tetracam, asd)
}

Using a list of matrix names

I have 75 matrices that I want to search through. The matrices are named a1r1, a1r2, a1r3, a1r4, a1r5, a2r1,...a15r5, and I have a list with all 75 of those names in it; each matrix has the same number of rows and columns. Inside some nested for loops, I also have a line of code that, for the first matrix looks like this:
total <- (a1r1[row,i]) + (a1r1[row,j]) + (a1r1[row,k])
(i, j, k, and row are all variables that I am looping over.) I would like to automate this line so that the for loops would fully execute using the first matrix in the list, then fully execute using the second matrix and so on. How can I do this?
(I'm an experienced programmer, but new to R, so I'm willing to be told I shouldn't use a list of the matrix names, etc. I realize too that there's probably a better way in R than for loops, but I was hoping for sort of quick and dirty at my current level of R expertise.)
Thanks in advance for the help.
Here The R way to do this :
lapply(ls(pattern='a[0-9]r[0-9]'),
function(nn) {
x <- get(nn)
sum(x[row,c(i,j,k)])
})
ls will give a list of variable having a certain pattern name
You loop through the resulted list using lapply
get will transform the name to a varaible
use multi indexing with the vectorized sum function
It's not bad practice to build automatically lists of names designating your objects. You can build such lists with paste, rep, and sequences as 0:10, etc. Once you have a list of object names (let's call it mylist), the get function applied on it gives the objects themselves.

How to order a matrix by all columns

Ok, I'm stuck in a dumbness loop. I've read thru the helpful ideas at How to sort a dataframe by column(s)? , but need one more hint. I'd like a function that takes a matrix with an arbitrary number of columns, and sorts by all columns in sequence. E.g., for a matrix foo with N columns,
does the equivalent of foo[order(foo[,1],foo[,2],...foo[,N]),] . I am happy to use a with or by construction, and if necessary define the colnames of my matrix, but I can't figure out how to automate the collection of arguments to order (or to with) .
Or, I should say, I could build the entire bloody string with paste and then call it, but I'm sure there's a more straightforward way.
The most elegant (for certain values of "elegant") way would be to turn it into a data frame, and use do.call:
foo[do.call(order, as.data.frame(foo)), ]
This works because a data frame is just a list of variables with some associated attributes, and can be passed to functions expecting a list.

Resources