Naming data frames in lists using a sequence - r

I have a rather simple question. I have a list, and I want to name the data frames in the list according to a sequence. Right now I have a sequence that advances by one letter per list element (explained below):
nm1 <- paste0("Results_Comparison_",LETTERS[seq_along(Model_comparisons)])
This creates "Results_Comparison_A", "Results_Comparison_B", "Results_Comparison_C", "Results_Comparison_D", etc. What I want is for it to be a number instead of a letter (i.e. Results_Comparison_1, Results_Comparison_2, Results_Comparison_3, etc.). Does anyone know how I could change this? If extra information is needed, let me know!

This should work:
paste0("Results_Comparison_", seq_along(Model_comparisons))
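A minimal sketch of applying these names, assuming Model_comparisons is a list of data frames (the toy contents below are made up for illustration):

```r
# Hypothetical stand-in for the asker's list of data frames
Model_comparisons <- list(
  data.frame(x = 1:2),
  data.frame(x = 3:4),
  data.frame(x = 5:6)
)

# seq_along() yields 1, 2, 3, ... so paste0() builds numbered names
nm1 <- paste0("Results_Comparison_", seq_along(Model_comparisons))
names(Model_comparisons) <- nm1

names(Model_comparisons)
```

Unlike LETTERS, which has only 26 entries, the numeric version keeps working when the list has more than 26 elements.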

Related

extracting something from a dataframe which contain lists with label information

Sorry if this is basic, I have great difficulty knowing how to extract things out of dataframes, especially when there are lists and other items contained within.
Please see the image below of what I see when I click on the dataframe in the global environment. I have a dataframe (let's call it "df") with the variable S1 (it was an SPSS file imported using haven).
[Image: how it looks in environment]
I know df$S1 gets me the values for the variable. But I'd just like to extract the information that is value "labels" and "names" and put them into another dataframe. Is there a way to do it? I'd like to do it for all variables, but if you can start me off just to extract these two from that one variable, I can look at doing it using a loop. Any help greatly appreciated. Many thanks
We can use attr or attributes
attributes(df$S1)
returns a named list whose names are label, format.spss, etc. From there, just use $ or [[ for extraction, as with any ordinary list:
attributes(df$S1)$label
Based on the updated image, we may need the labels attribute, which holds the value labels:
attributes(df$S1)$labels
If we use attr
attr(df$S1, "label")
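As a rough sketch of putting the value labels and their names into another data frame: the column below is a hand-built stand-in for a haven-imported variable, under the assumption that the value labels sit in a named vector stored in the labels attribute. The label text and codes are invented.

```r
# Hypothetical stand-in for a haven-imported column: a numeric vector
# carrying a variable label ("label") and named value labels ("labels")
S1 <- structure(
  c(1, 2, 2, 1),
  label  = "Gender of respondent",
  labels = c(Male = 1, Female = 2)
)
df <- data.frame(id = 1:4)
df$S1 <- S1

# The value labels are a named vector: names are the text, values the codes
value_labels <- attr(df$S1, "labels")

# Put the codes and their names into a separate data frame
label_table <- data.frame(
  value = unname(value_labels),
  name  = names(value_labels),
  stringsAsFactors = FALSE
)
label_table
```

To cover all variables, the same extraction could be wrapped in lapply(df, ...) and the columns without a labels attribute skipped.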

Subsetting list containing multiple classes by same index/vector

I need to subset a list which contains an array as well as a factor variable. Essentially, each slice of the array corresponds to a single individual, who is in turn associated with a two-level factor variable (treatment).
list <- list(array = array(rnorm(2, 4, 1), c(5, 5, 10)), treatment = rep(c(1, 2), 5))
Typically, when subsetting multiple slices of the array from the first component of the list, I would use something like
list$array[,,c(2,4,6)]
This would return the array slices at positions 2, 4 and 6. However, this wouldn't work for the factor component of the list, as vectors are subset differently; what you would need is this:
list$treatment[c(2,4,6)]
I need to subset a list containing different classes (array and vector) by the same index.
You're treating your list of matrices as some kind of 3-dimensional object, but it's not.
Your list$matrices is of itself a list as well, which means you can index it as a list; it doesn't matter whether it is a list of matrices, numerics, plot objects, or whatever.
The data you provided as an example can just be indexed at one level, so list$matrices[c(2,4,6)] works fine.
And I don't really get your question about saving the indices in a numeric vector, what's to stop you from this code?
indices <- c(2,4,6)
mysubset <- list(list$matrices[indices], list$treatment[indices])
EDIT, adding new info for edited question:
I see you actually have a 3-D array now. Which is kind of weird, as there is no clear convention of what can be seen as "components". I mean, from your question I understand that list$array[,,n] refers to the n-th individual, but from a pure code point of view there is no reason why something like list$array[n,,] couldn't refer to that.
Maybe you got the idea from other languages, but this is not really R-ish; your earlier example with a list of matrices made more sense to me. And I think the most logical would have been a data.frame with columns matrix and treatment (which is conceptually close to a list with a vector and a list of matrices, but it's clearer to others what you have).
But anyway, what is your desired output?
If it's just subsetting: with this structure, as there are no constraints on what could have been the content, you just have to tell R exactly what you want. There is no one operator that takes a subset of a vector and the 3rd index of an array at the same time. You're going to have to tell R that you want to use the 3rd index for subsetting, and that you want to use the same index for subsetting a vector. Which is basically just the code you already have:
idx <- c(2,4,6)
output <- list(list$array[,,idx], list$treatment[idx])
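If this is needed repeatedly, the two subsetting steps could be wrapped in a small function. The sketch below assumes the third array dimension indexes individuals, as in the question. The names dat and subset_individuals are made up here, and rnorm(250, ...) is used so that every cell gets its own value.

```r
set.seed(1)
# Toy data in the shape described in the question (names are illustrative)
dat <- list(
  array     = array(rnorm(250, 4, 1), c(5, 5, 10)),
  treatment = rep(c(1, 2), 5)
)

# Hypothetical helper: keep the same individuals in both components.
# drop = FALSE keeps the result 3-D even when idx has length 1.
subset_individuals <- function(x, idx) {
  list(
    array     = x$array[, , idx, drop = FALSE],
    treatment = x$treatment[idx]
  )
}

out <- subset_individuals(dat, c(2, 4, 6))
dim(out$array)
out$treatment
```

Keeping the result as a list with the same two names means the helper can be applied again to its own output.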
The way you subset multiple matrices actually gives an error, since you supply an extra dimension even though you have already specified which sublist you are in. Hence, to subset the matrices for the given indices you can use my_list[[1]][indices] or directly my_list$matrices[indices]. The same goes for treatment: my_list[[2]][indices] or my_list$treatment[indices].

R replace variable name in all dimensions of multidimensional list

I have a large, multidimensional list as a result of a statistics project. The list holds different objects, which themselves hold further objects. There are also plots, matrices etc. It's a heterogeneous mix of a lot of different types and different dimensionalities.
Now I have to change the name of one variable completely. Every occurrence has to be overridden. Is there a way to do this?
Here is a little example. There's no use in solving this example explicitly, as my list is much larger.
a <- list(entry1=list("a","b","c","xx",p=c(3,4,"xx")),
entry2=list(matrix(c(1,2,"xx",4), nrow = 2),xx=list(6,7,8,"xx")),
xx=list(1,2,3,4,"xx"))
How can I change the xx to yy? Thanks in advance!
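One possible approach is a recursive walk over the list. This is only a sketch. The helper name replace_everywhere is made up here, and it handles only plain nested lists, element names, and character values (including character matrices). Plot objects, functions, or environments inside the real list may need separate treatment.

```r
# Hypothetical helper: walk a nested list, renaming elements called `old`
# and replacing character values equal to `old`
replace_everywhere <- function(x, old, new) {
  if (!is.null(names(x))) {
    names(x)[names(x) %in% old] <- new
  }
  if (is.list(x)) {
    # Recurse into each element; x[] <- keeps names and attributes
    x[] <- lapply(x, replace_everywhere, old = old, new = new)
  } else if (is.character(x)) {
    # Elementwise replacement keeps matrix dimensions intact
    x[x %in% old] <- new
  }
  x
}

a <- list(entry1 = list("a", "b", "c", "xx", p = c(3, 4, "xx")),
          entry2 = list(matrix(c(1, 2, "xx", 4), nrow = 2), xx = list(6, 7, 8, "xx")),
          xx = list(1, 2, 3, 4, "xx"))

b <- replace_everywhere(a, "xx", "yy")
str(b)
```

Because the function recurses through every list level, it does not depend on how deep the real list is nested.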

How to remember which variables are in a list

I have a huge list in which I put different variables in order to apply the same function to all of them.
In a next step I want to apply specific functions to specific elements of the list, i.e. all functions used vary from element to element within the list.
How can I do this? My first idea was (see my other question, Reassign variables to elements of list) to split the list into the original variables again. This can be done.
But I was recommended to keep the items in the list instead. My question is: How can I access each variable quickly by doing that? One idea would be to use the names attribute of the list in the beginning and fill it with a vector of the original variable names. However, by doing that it would be much longer later on to type list["name_x"] than just typing name_x, assuming name_x is globally available.
What is the most efficient way to deal with my problem?
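One way the named-list idea could look in practice is sketched below. All names (vars, fns, name_x, name_y) are hypothetical, and pairing each element with its own function through a second named list is an assumption about the workflow, not something stated in the question.

```r
# Hypothetical variables collected into a named list
vars <- list(name_x = c(1, 4, 9), name_y = c("a", "b"))

# Per-element functions, keyed by the same names
fns <- list(name_x = sqrt, name_y = toupper)

# Apply each function to the element with the matching name
res <- Map(function(f, v) f(v), fns[names(vars)], vars)

res$name_x            # access a single element by name with $
with(vars, name_x)    # or evaluate an expression inside the list
```

With with(), the element names can be typed as if they were ordinary variables, which avoids repeating the list name in longer expressions.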

How do I match single IDs in one data frame to multiples of the IDs in another data frame in R?

For a project at work, I need to generate a table from a list of proposal ids, and a table with more data about some of those proposals (called "awards"). I'm having trouble with the match() function; the data in the "awards" table often has several rows that use the same ID, while the proposals frame has only one copy of each ID. From what I've tried, R ignores multiple rows and only returns the first match, when I need all of them. I haven't been able to find anything in documentation or through searches that helps me, though I have been having difficulty phrasing the right question.
Here's what I have so far:
#R CODE to add awards data on proposals to new data spreadsheet
#read tab delimited files
Awards=read.delim("O:/testing.txt",as.is=T)
Proposals=read.delim("O:/test.txt",as.is=T)
#match IDs from both spreadsheets
Proposals$TotalAwarded=Awards$TotalAwarded[match(Proposals$IDs,Awards$IDs)]
write.table(Proposals,"O:/tested.txt",quote=F,row.names=F,sep="\t")
This does exactly what I want, except that only the first match is returned.
What's the best way to go forward? How do I make R utilize all of the matches available?
Thanks
See help on merge: ?merge
merge(Proposals, Awards, by = "IDs", all.y = TRUE)
But I cannot believe this hasn't been asked on SO before.
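To illustrate the difference between match() and merge(), here is a small sketch with made-up toy tables. The column names IDs and TotalAwarded follow the question. all.x = TRUE is used here instead of all.y = TRUE on the assumption that proposals without any award should be kept as well.

```r
# Toy stand-ins for the two tables (values are made up)
Proposals <- data.frame(IDs = c(1, 2, 3))
Awards <- data.frame(
  IDs          = c(1, 1, 3),
  TotalAwarded = c(100, 250, 75)
)

# match() returns only the position of the first match per ID
first_only <- Awards$TotalAwarded[match(Proposals$IDs, Awards$IDs)]

# merge() returns one row per matching pair, so ID 1 appears twice;
# all.x = TRUE also keeps proposals without any award (ID 2)
merged <- merge(Proposals, Awards, by = "IDs", all.x = TRUE)
merged
```

The merged table can then be written out with write.table() exactly as in the question.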
