How to iteratively access data in data frames in R?

I followed the steps in this post: Is there a way to define a subsequent set of data.frame in R? to assign data to a whole set of data frames.
I stored all my "symbols" in a vector x <- c("A","B","C", ...), then used the above method to get a set of data frames such as A, B, C, etc.
My question is: how can I access these data frames iteratively in another for loop, and then assign the results to vectors, such as:
x <- c("A","B","C") ## A, B, C are names of data.frame
y <- c("a","b","c") ## a, b, c are names of vectors
for (i in x){ y[i] <- x[i][,2]}
Thanks a lot in advance!!
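One way to approach this, sketched under assumptions: the data frames A, B and C below are made-up stand-ins for the ones created in the linked post. mget() looks objects up by name and returns them in a named list, so the second column of each can be pulled out with lapply() and then labelled with the names in y.

```r
# Hypothetical data frames standing in for the ones created earlier
A <- data.frame(id = 1:3, value = c(1, 2, 3))
B <- data.frame(id = 1:3, value = c(4, 5, 6))
C <- data.frame(id = 1:3, value = c(7, 8, 9))

x <- c("A", "B", "C")  # names of the data frames
y <- c("a", "b", "c")  # names wanted for the extracted vectors

# Fetch the data frames by name into a named list
dfs <- mget(x)

# Extract the second column of each data frame
second_cols <- lapply(dfs, function(df) df[, 2])

# Label the results with the target names
names(second_cols) <- y

second_cols$b
```

Keeping the results in one named list (second_cols$a, second_cols$b, ...) is usually easier to loop over than creating separate free-standing vectors.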

Related

(R) Write function with vector as variable

I want to create a function and use sapply to pass a number of input variables through it. The trick: the "variables" are actually vectors. I include an example below, where I would like to transpose the vectors a, b, and d, without having to manually write each command individually. I include the x <- part, while leaving it blank, because this is one of the main points of my confusion. Were I creating a normal function, I would simply create a vector of all the variables I want to pass through form5. However, if I create a vector from vectors, I'll just have a longer vector. So to be clear, I'd like sapply to return a matrix or dataframe with all 3 transposed vectors.
a <- c(1:10)
b <- c(11:20)
d <- c(21:30)
X <-
form5 <- function(x){
  t(x)
}
sapply(x, form5)
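A possible way to fill in the blank, as a sketch rather than a definitive answer: put the vectors in a list instead of combining them with c(), so each one stays a separate element. The list names are chosen here for illustration.

```r
a <- 1:10
b <- 11:20
d <- 21:30

# A list keeps the three vectors separate; c(a, b, d) would merge them
x <- list(a = a, b = b, d = d)

form5 <- function(v) {
  t(v)
}

# sapply() simplifies the three results into one matrix with a column per vector
res <- sapply(x, form5)

# If one row per vector is wanted instead, bind the list elements as rows
rows <- do.call(rbind, x)

dim(res)
dim(rows)
```

Here sapply() returns a matrix with one column per input vector, while do.call(rbind, x) gives the row-wise layout.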

Is there a way of assigning multiple variables to elements in a named R list in one line?

I would like to assign variables to elements in a named list. These variable names are the same as the names in the list. Is there a way that I can assign them all in one line instead of one at a time like what I am doing below?
params <- data[data$Month == m,]
a <- params$a
b <- params$b
c <- params$c
I know that in JavaScript you can destructure an array like so:
const [a, b, c] =[1,2,3]
Or a dictionary (which is perhaps more similar to an R named list):
const {a, b, c} = {a:1, b:2, c:3}
Each of these assigns the values 1, 2 and 3 to the variables a, b and c respectively.
Is there a similar approach that I can take with R?
Use list2env to create individual objects for each column in params.
params <- data[data$Month == m,]
list2env(params, .GlobalEnv)
If you want to keep data in a named list use as.list.
as.list(params)
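A small sketch of the list2env idea, assuming made-up values for data and m (neither is shown in the question). It writes into a fresh environment rather than .GlobalEnv so the effect is easy to inspect.

```r
# Hypothetical data; the real 'data' and 'm' come from the asker's session
data <- data.frame(Month = c(1, 2),
                   a = c(10, 20),
                   b = c(30, 40),
                   c = c(50, 60))
m <- 2

params <- data[data$Month == m, ]

# Copy the selected columns into an environment, one object per column
env <- new.env()
list2env(as.list(params[c("a", "b", "c")]), envir = env)

env$a
env$b
env$c
```

Passing .GlobalEnv instead of env creates a, b and c directly in the workspace, which will overwrite any existing objects with those names (including the base function c if called as a variable).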
Establish a named list (lst) in advance. Then you can assign the variables in the data frame (params) in one line.
lst <- vector(mode="list", length = 3)
lst <- list(params$a,params$b,params$c)
names(lst) <- c("a","b","c")

R: Creating data frame from list of values and list of variable names

I have two lists, A and B:
List A contains K character vectors of length W. Each vector contains the same W string values but the indices of the strings may differ. We can think of this list in practice as containing vectors of variable names, where each vector contains the same variable names but in potentially differing orders.
List B contains K character vectors of length W. Each vector can contain W arbitrary values. We can think of this list in practice as containing vectors with the corresponding values of the variables contained in each vector of List A.
I am trying to generate a data frame that is K rows long and W columns wide, where the column names are the W unique values in each vector in List A and the values for each row are drawn from the vector found at that row's index in List B.
I've been able to do this (minimal working example below) but it seems very hackish because it basically involves turning the two lists into data frames and then assigning values from one as column names for the other in a loop.
Is there a way to skip the steps of turning each list into a data frame before then using a loop to combine them? Looping through the lists seems inefficient, as does generating the two data frames rather than a single data frame that draws on contents of both lists.
# Load magrittr for the %>% pipe (dplyr is called via dplyr:: below)
library(magrittr)
# Declare number of rows and columns
K <- 10
W <- 5
colnames_set <- sample(LETTERS, W)
# Generate example data
# List A: column names
list_a <- vector(mode = "list", length = K)
list_a <- lapply(list_a, function(x) x <- sample(colnames_set, W))
# List B: values
list_b <- vector(mode = "list", length = K)
list_b <- lapply(list_b, function(x) x <- rnorm(n = W))
# Define function to take a vector and turn it into a
# data frame where each element of the vector is
# assigned to its own column
vec2df <- function(x) {
  x %>%
    as.data.frame(., stringsAsFactors = FALSE) %>%
    t() %>%
    as.data.frame(., stringsAsFactors = FALSE)
}
# Convert vectors to data frames
vars <- lapply(list_a, vec2df)
vals <- lapply(list_b, vec2df)
# Combine the data frames into one
# (note the looping)
for(i in 1:K){
  colnames(vals[[i]]) <- vars[[i]][1, ]
}
# Combine rows into a single data frame
out <- vals %>%
  dplyr::bind_rows()
rownames(out) <- NULL
# Show output
out
Arrange the data in list_b so that the variables are aligned. We can use Map/mapply to do this, convert the output to dataframe and name the columns.
setNames(data.frame(t(mapply(function(x, y) y[order(x)], list_a, list_b))),
         sort(colnames_set))
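To see what the one-liner does, here is a small deterministic sketch with two hand-made rows (the toy values are assumptions, not the asker's data). order() on each name vector gives the permutation that sorts the names, and applying it to the matching value vector puts every row's values in the same alphabetical column order.

```r
# Toy input: same names in different orders, with matching values
list_a <- list(c("B", "A", "C"),
               c("C", "B", "A"))
list_b <- list(c(2, 1, 3),
               c(30, 20, 10))

# Reorder each value vector by the sort order of its name vector;
# mapply() returns one column per list element, so transpose to get rows
aligned <- t(mapply(function(nms, vals) vals[order(nms)], list_a, list_b))

# Name the columns with the sorted variable names
out <- setNames(data.frame(aligned), sort(list_a[[1]]))
out
```

Each row of out now holds the values for A, B and C in that order, regardless of how the names were ordered in list_a.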

Generate a function based upon name data in a csv file opened in R

I have a dataframe of variable names and weightings.
Example:
Names <- c("a","b","c")
Weightings <- c(1,2,3)
df <- cbind(Names,Weightings)
I need to generate a function from this data, like this:
myfun <- function(x,data){
  data[x,"a"]*1+data[x,"b"]*2+data[x,"c"]*3}
I have another dataframe named data whose column names match a, b, and c, and I will apply myfun to this data over all rows.
The issue I have is that the size of the Names and Weightings vectors can vary. I could be working with 5 names and weightings, in which case I want to generate the new function "myfun" like this:
Newnames <- c("a","b","c","d","e")
NewWeightings <- c(1,2,3,4,5)
myfun <- function(x,data){
  data[x,"a"]*1+data[x,"b"]*2+data[x,"c"]*3+data[x,"d"]*4+data[x,"e"]*5}
Is there an easy way to automate the creation of this function, so I could give someone the code and a .csv file of column names and weightings and they could generate their new function?
What about a strategy like this? We use a function to make a function:
getMyFunction <- function(columns, weights) {
  stopifnot(length(columns) == length(weights))
  function(x, data) {
    # Matrix product: each column times its own weight, summed per row
    as.vector(as.matrix(data[x, columns, drop = FALSE]) %*% weights)
  }
}
Basically the matrix product takes care of both the weighting and the addition, and we specify a vector of columns all at once. (A plain data[x, columns] * weights recycles the weights down each column rather than across the columns, so it only lines up with the columns when x selects a single row.)
Then we build a function like
Names <- c("a","b","c")
Weightings <- c(1,2,3)
myFun <- getMyFunction(Names, Weightings)
and we can use it with
dd<-data.frame(a=c(1,1), b=c(1,2), c=c(1,3))
myFun(1,dd)
# [1] 6
myFun(2,dd)
# [1] 14
myFun(1:2,dd)
# [1] 6 14
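Since the question mentions handing over a .csv file, here is a sketch of that step. The file layout (a header row with Names and Weightings columns) and the helper name make_weighted_sum are assumptions for illustration; a temporary file stands in for the real one. The weighted sum is computed with a matrix product so it works for any number of rows.

```r
# Hypothetical weights file; in practice this would be the .csv handed over
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Names,Weightings", "a,1", "b,2", "c,3"), csv_path)

# Read the column names and weights
spec <- read.csv(csv_path, stringsAsFactors = FALSE)

# Function factory: returns a function computing the weighted sum per row
make_weighted_sum <- function(columns, weights) {
  stopifnot(length(columns) == length(weights))
  function(data) {
    as.vector(as.matrix(data[, columns, drop = FALSE]) %*% weights)
  }
}

weighted_sum <- make_weighted_sum(spec$Names, spec$Weightings)

dd <- data.frame(a = c(1, 1), b = c(1, 2), c = c(1, 3))
result <- weighted_sum(dd)
result
```

Swapping in a file with five names and weightings would produce a five-term function without any code changes.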

Add a Column to a Dataframe From a List of Values

Starting with an empty dataframe, I need to fill the dataframe as follows: a for loop generates a fixed number of values in each iteration, and I need to add a new column containing those values and give the column a unique name, col_i (where i is the ith iteration of the loop).
How can this (seemingly simple) task be done?
An efficient way to build a dataframe piecewise is to store your parts in a pre-allocated list, then put them together afterwards, which avoids copying the growing dataframe on every iteration.
For example:
num.iters <- 10
l <- vector('list', num.iters)
for (i in 1:num.iters) {
  l[[i]] <- rnorm(3) # the column data
  names(l)[i] <- paste('Col', i, sep='.') # the column name
}
do.call(cbind, l) # ... if your cols are the same datatype and you want a matrix
data.frame(l) # otherwise
What's wrong with ?cbind?
The functions cbind and rbind are S3 generic, with methods for data frames. The data frame method will be used if at least one argument is a data frame and the rest are vectors or matrices.
?colnames can also be applied to data.frames
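The cbind and colnames remarks can be sketched as follows. This assumes the data frame already has one column (a made-up id column) so that the number of rows is fixed before the loop starts; the generated values are placeholders.

```r
# Starting data frame with a hypothetical id column fixing the row count
df <- data.frame(id = 1:3)

for (i in 1:2) {
  # Placeholder for the values produced in iteration i
  new_col <- i * c(10, 20, 30)

  # The data frame method of cbind() appends the vector as a new column
  df <- cbind(df, new_col)

  # Rename the column just added to col_i
  colnames(df)[ncol(df)] <- paste0("col_", i)
}

df
```

This is convenient for a handful of columns; for many iterations the pre-allocated list approach avoids repeatedly copying the data frame.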
