How can I create a vector of subsets in Pari/GP? - pari-gp

I want to produce a vector containing all k-element subsets of a second vector. I know that I can do this by applying vecextract with each k-element subset of the natural numbers 1...n to my original vector.
How can I create that vector of subsets of natural numbers, though? I can see that the command forsubset does nearly what I want, but it's an imperative command, not one which creates a vector. So I could use the following function to print a list of vectors:
f(n,k)=forsubset([n,k],s,print(s))
but I can only capture them by adding each subset to a List() and then converting the List() to a Vec(). This seems clumsy. Is there a better way to do this, perhaps a totally different one?

Related

cbind a long list in R using list

I have a huge number of vectors in R, all with the same length, say x1, x2, ..., x1000000,
and I want to cbind some of them into one single data frame. I have the list of the names of the vectors that I want to combine, say name_list <- c("x3","x47847","x930233") for the vectors x3, x47847, x930233.
Normally we would just use cbind(x3,x47847,x930233). But since the number of vectors is huge, and all I have is the name list, is there any way I can do the cbind with name_list?
Thank you so much for your help in advance.
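One possible approach, sketched here under the assumption that the vectors live in the current environment, is to look them up by name with mget() and combine the resulting named list. The three short vectors below are made-up stand-ins for the real ones, and `combined` and `as_matrix` are hypothetical names.

```r
# Stand-in vectors; the real ones would already exist in the workspace.
x3 <- 1:3
x47847 <- 4:6
x930233 <- 7:9

name_list <- c("x3", "x47847", "x930233")

# mget() returns a named list of the objects whose names are given.
combined <- as.data.frame(mget(name_list))

# The same list can be passed to cbind() to get a matrix instead.
as_matrix <- do.call(cbind, mget(name_list))
```

Both results carry the vector names as column names, so no further renaming should be needed.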

Data table subsetting in r by concatenating string variables

I have a data table that I am trying to subset by creating a list of variable names by pasting together some string vectors in the j argument of the data table, but I'm running into difficulty.
I have a character vector called foos (for this example foos <- c('FOO0','FOO1','FOO2')) and a vector I created with c() . I wanted to subset my data table by doing dt[,paste0(foos, c('VAR0','VAR1','VAR2'))] but that didn’t work as expected. I output what paste0(foos, c('VAR0','VAR1','VAR2')) returns and it becomes
[1] "FOO0VAR0" "FOO1VAR1" "FOO2VAR2"
so it seems this approach does a vector index by vector index concatenation instead of a concatenation of the vectors themselves (and that’s a bit surprising to me, I’d expect to have to lapply to get a paste happening on elements of a vector). Changing the permutation of the c() and paste0 didn’t work. I also tried to do
dt[,c(foos,c('VAR0','VAR1','VAR2'))] but that also doesn't work.
Is there a way to subset by a created concatenation of two string vectors in the jth column of a data table in R?
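The difference between the two attempts described above can be seen on the character vectors alone, without a data table. This is only a sketch: `vars` and `all_pairs` are hypothetical names, and outer() is shown as one way to get every pairwise combination in case that was the intent.

```r
foos <- c("FOO0", "FOO1", "FOO2")
vars <- c("VAR0", "VAR1", "VAR2")  # hypothetical name for the second vector

# paste0() is vectorised: it pairs up elements by position.
pairwise <- paste0(foos, vars)

# c() simply appends one vector to the other (6 names in total).
appended <- c(foos, vars)

# outer() applies paste0() to every (foo, var) pair: 9 names in total.
all_pairs <- as.vector(outer(foos, vars, paste0))
```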

Writing a loop to slice a matrix in R

I have a pretty basic loop question:
I have a matrix (365x20). So for twenty years I have daily rainfall data.
I need to slice the matrix in order to conduct the next steps of my analysis, which I did like this:
year1 <- as.vector(Rainfall_data$year1)
year2 <- as.vector(Rainfall_data$year2)
...
year20 <- as.vector(Rainfall_data$year20)
This gives me in total 20 single 1x365 vectors.
Now, I want to do the same for the transposed Rainfall data to obtain a vector containing the value of the same day for all twenty years. Since this would mean to do
as.vector(t_Rainfall_data$day1-365)
I wanted to write a loop. The columns are called day1 to day365. t_Rainfall_data would be the transposed matrix. The main aim is to obtain 365 single 1x20 vectors in total.
I tried several ways, but failed them all.
The comments are right: anything you want to do with the vector day1 can just as well be done with t_Rainfall_data$day1 (or likely with Rainfall_data[1,]) and it's better practice to slice your dataframe when you're doing something with it rather than creating a lot of redundant vectors out of it. Similarly, even if you need a bunch of objects, it's almost always easier to deal with a list of objects than it is to create separate named objects. All that said, here's how to get what you're asking for:
As in comments, you can return a list of vectors with
lapply(seq_len(nrow(Rainfall_data)), function(i) Rainfall_data[i, ])
If you would prefer a loop, and to create the objects rather than return a list, you can do something like
for (i in seq_len(nrow(Rainfall_data))) {
  # create an object day<i> holding column day<i> of the transposed data
  assign(paste0("day", i), as.vector(t_Rainfall_data[, paste0("day", i)]))
}
assign will create an object named after the string passed to it, that contains the second argument.
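For comparison, the list-based alternative recommended above might look like the following sketch. The 3x2 matrix is invented stand-in data (the real one would be 365x20), and `days` is a hypothetical name.

```r
# Stand-in for the rainfall matrix: 3 days (rows) x 2 years (columns).
# The values are invented purely for illustration.
Rainfall_data <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3,
                        dimnames = list(NULL, c("year1", "year2")))

# One vector per day (row), kept together in a named list instead of
# separate day1, day2, ... objects.
days <- lapply(seq_len(nrow(Rainfall_data)), function(i) Rainfall_data[i, ])
names(days) <- paste0("day", seq_along(days))

days$day2  # values of day 2 across all years
```

Keeping the vectors in one list means later steps can use lapply() or sapply() over `days` rather than building object names with paste0() again.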

matrix subseting by column's name using `subset` function

Consider the following simulation snippet:
k <- 1:5
x <- seq(0,10,length.out = 100)
dsts <- lapply(1:length(k), function(i) cbind(x=x, distri=dchisq(x,k[i]),i) )
dsts <- do.call(rbind,dsts)
Why does this code throw an error (dsts is a matrix):
subset(dsts,i==1)
#Error in subset.matrix(dsts, i == 1) : object 'i' not found
Even this one:
colnames(dsts)[3] <- 'iii'
subset(dsts,iii==1)
But not this one (matrix coerced as dataframe):
subset(as.data.frame(dsts),i==1)
This one also works, where x is already defined:
subset(dsts,x> 500)
The error occurs in subset.matrix() on this line:
else if (!is.logical(subset))
Is this a bug that should be reported to R Core?
The behavior you are describing is by design and is documented on the ?subset help page.
From the help page:
For data frames, the subset argument works on the rows. Note that subset will be evaluated in the data frame, so columns can be referred to (by name) as variables in the expression (see the examples).
In R, data.frames and matrices are very different types of objects. If this is causing a problem, you are probably using the wrong data structure for your data. Matrices are really only necessary if you need matrix arithmetic. If you are thinking of your columns as different attributes of a row observation, then you should be storing your data in a data.frame in the first place. You could store all your values in a simple vector where every three values represent one observation, but that would also be a poor choice of data structure for your data. I'm not sure if you were trying to be more efficient by choosing a matrix, but it seems like just the wrong choice.
A data.frame is stored as a named list while a matrix is stored as a dimensioned vector. A list can be used as an environment, which makes it easy to evaluate variable names in that context. The biggest difference between the two is that data.frames can hold columns of different classes (numerics, characters, dates) while matrices can only hold values of exactly one data type. You cannot always easily convert between the two without a loss of information.
Things like $ only work with data.frames as well.
dd <- data.frame(x=1:10)
dd$x
mm <- matrix(1:10, ncol=1, dimnames=list(NULL, "x"))
mm$x # Error
If you want to subset a matrix, you are better off using standard [ subsetting rather than the subset() function.
dsts[ dsts[,"i"]==1, ]
This behavior has been a part of R for a very long time. Any change to this behavior is likely to break existing code that relies on variables being evaluated in a certain context. I think the problem lies with whoever told you to use a matrix in the first place. Rather than cbind(), you should have used data.frame().
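As a sketch of that last suggestion, the simulation from the question could be built with data.frame() instead of cbind(), after which subset() can see the column names. The name `first` is hypothetical; everything else follows the question's code.

```r
k <- 1:5
x <- seq(0, 10, length.out = 100)

# Build one data frame per value of k, then stack them row-wise.
dsts <- do.call(rbind, lapply(seq_along(k), function(i) {
  data.frame(x = x, distri = dchisq(x, k[i]), i = i)
}))

# Column names are now evaluated inside the data frame by subset().
first <- subset(dsts, i == 1)
nrow(first)
```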

How do you select multiple variables from a matrix using a randomly selected vector of column indices?

Hopefully this has an easy answer I just haven't been able to find:
I am trying to write a simulation that will compare a number of statistical procedures on different subsets of rows (subjects) and columns (variables) of a large matrix.
Subsets of rows was fairly easy using a sample() of the subject ID numbers, but I am running into a little more trouble with columns.
Essentially, what I'd like to be able to do is create a random sample of column index numbers which will then be used to create a new matrix. What's got me the closest so far is:
testmat <- matrix(rnorm(10000),nrow=1000,ncol=100)
column.ind <- sample(3:100,20)
teststr <- paste("testmat[,",column.ind,"]",sep="",collapse=",")
which gives me a string that has a testmat[,column.ind] for every sampled index number. Is there any way to easily plug that into a cbind() function to make a new matrix? Is there any other obvious way I'm missing?
I've been able to do it using a loop (i.e. cbind(matrix,newcolumn) over and over), but that's fairly slow as the matrix I'm using is quite large and I will be doing this many times. I'm hoping there's a couple-line solution that's more elegant and quicker.
Have you tried testmat[, column.ind]?
Rows and columns can be indexed in the same way with logical vectors, a set of names, or numbers for indexes.
See here for an example: http://ideone.com/EtuUN.
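A minimal sketch of the same idea, using the objects from the question: the seed is arbitrary, and `row.ind`, `newmat` and `both` are hypothetical names added for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

testmat <- matrix(rnorm(100000), nrow = 1000, ncol = 100)
column.ind <- sample(3:100, 20)

# A vector of column indices selects all of those columns at once,
# in the order given; no paste() or cbind() loop is needed.
newmat <- testmat[, column.ind]
dim(newmat)

# Rows can be sampled and indexed in the same call.
row.ind <- sample(nrow(testmat), 50)
both <- testmat[row.ind, column.ind]
```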
