Generate column names dynamically for a dataframe in R

I am converting JSON into a data frame, and that part works. Below is my code:
df <- data.frame(t(sapply(json, c)))
colnames(df) <- gsub("X", "y",colnames(df))
This gives me column names like y1, y2, y3, etc. Is it possible to have the column names start from 0 instead, so that they are y0, y1, y2, etc.?

From the comments:
df <- data.frame(t(sapply(json,c)))
colnames(df) <- paste0("y", 0:(ncol(df)-1))
Or if you want padded zeros
a <- seq(0,ncol(df)-1,1)
colnames(df) <- sprintf("y%02d",a)
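The two naming approaches above can be tried on a small stand-in for the parsed JSON. This is a minimal sketch: the `json` list below is made-up toy data, and `seq_len()` is used in place of `0:(ncol(df)-1)` only because it also behaves sensibly when there are zero columns.

```r
# Toy stand-in for the parsed JSON (hypothetical data, not from the question)
json <- list(c(1, 2, 3), c(4, 5, 6))
df <- data.frame(t(sapply(json, c)))   # default names are X1, X2, X3

# Zero-based index for each column: 0, 1, ..., ncol(df) - 1
idx <- seq_len(ncol(df)) - 1L

plain_names  <- paste0("y", idx)        # "y0" "y1" "y2"
padded_names <- sprintf("y%02d", idx)   # "y00" "y01" "y02"

colnames(df) <- plain_names
```

Padded names sort correctly as strings once there are ten or more columns, which is the main reason to prefer the `sprintf` form.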

Related

How do I rename a single column in multiple dataframes to the name of the dataframe in which they reside in R?

I am currently trying to rename a single column in multiple dataframes to match the dataframe name in R.
I have seen some questions/solutions on the site that are similar to what I am attempting to do, but none appear to do this dynamically. I have over 45 dataframes I need to rename a column in, so manually typing in each individual name is doable, but time consuming.
Dataframe1 <- column
Dataframe2 <- column
Dataframe3 <- column
I want it to look like this:
Dataframe1 <- Dataframe1
Dataframe2 <- Dataframe2
Dataframe3 <- Dataframe3
The ultimate goal is to have a master dataframe with columns Dataframe1, Dataframe2, and Dataframe3

We can get all the datasets into a list and rename at once in the list
lst1 <- lapply(mget(ls(pattern = "Dataframe\\d+")), function(x) {
names(x)[5] <- "newcol"
x})
Update
If we are renaming the columns in different datasets with different names, then create a vector of column names that corresponds to each 'Dataframe'
nm1 <- c("col5A", "col5B", "col5C", ..., "col5Z")
lst2 <- Map(function(x, y) {names(x)[5] <- y; x},
mget(ls(pattern = "Dataframe\\d+")),
nm1)
In the first snippet, we are renaming the 5th column of every dataset to 'newcol'; in the Map version, each dataset's 5th column gets the matching name from nm1.
It can also be done using tidyverse
library(dplyr)
library(purrr)
map(mget(ls(pattern = "Dataframe\\d+")), ~ .x %>%
rename_at(5, ~ "newcol"))
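The question asks for each column to take the name of its own data frame, which the list approach above can do by passing the list's names alongside the data frames. The following is a minimal sketch under assumptions: the three one-column data frames and the column name `column` are toy stand-ins, and `master` is a hypothetical name for the combined result.

```r
# Toy data frames standing in for the 45 real ones (hypothetical contents)
Dataframe1 <- data.frame(column = 1:3)
Dataframe2 <- data.frame(column = 4:6)
Dataframe3 <- data.frame(column = 7:9)

# Collect them into a named list; the list names are the object names
dfs <- mget(ls(pattern = "^Dataframe\\d+$"))

# Rename the column called "column" in each data frame to that data frame's name
renamed <- Map(function(x, nm) {
  names(x)[names(x) == "column"] <- nm
  x
}, dfs, names(dfs))

# Bind the renamed columns side by side into one master data frame
master <- do.call(cbind, unname(renamed))
names(master)
```

Matching on the column's current name rather than a fixed position such as 5 is a design choice for this sketch; use whichever identifies the column reliably in the real data.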

Creating json list of lists from dataframe

I'm very new to R. I have a data.frame of mixed types and need to convert it to a json object that has each row of the data.frame as a list within a list, with the column headers as the first list.
Closest I've come is the below,
library(jsonlite)
df <- data.frame(X=as.numeric(c(1,2,3)),
Y=as.numeric(c(4,5,6)),
Z=c('a', 'b', 'c'),
stringsAsFactors=FALSE)
test <- split(unname(df), 1:NROW(df))
toJSON(test)
Which gives,
{"1":[[1,4,"a"]],"2":[[2,5,"b"]],"3":[[3,6,"c"]]}
If there's some way to remove the keys and flatten the value list by one level I could make this work by adding the colnames, but is there an easier way I'm missing? Output I'd like is,
{[["X","Y","Z"],[1,4,"a"],[2,5,"b"],[3,6,"c"]]}
Thanks for any help!

The general idea is that, to get the json format you want, your data needs to be in a list of vectors (2D vectors and 2D lists do not work).
Here is one way. There is probably a more elegant one, but this works (it turns the numbers into strings, though; I can't find a way around that, sorry).
library(jsonlite) #provides toJSON
library(rlist)
df <- data.frame(X=as.numeric(c(1,2,3)),
Y=as.numeric(c(4,5,6)),
Z=c('a', 'b', 'c'),
stringsAsFactors=FALSE)
#make the column names a row and then remove them
names <- colnames(df)
df[2:(nrow(df)+1),] <- df #parentheses needed: 2:nrow(df)+1 would shift the whole range instead
df[1,] <- names
colnames(df) <- NULL
#convert the df into a list of vectors, one per row
data <- list()
for(i in seq_len(nrow(df))){
data <- list.append(data, unlist(df[i,], use.names = FALSE))
}
toJSON(data)
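One possible way to keep the numbers as numbers is to build each row as an unnamed list rather than a vector, so that mixed types are not coerced to character. This is a sketch using only `jsonlite`, not the answerer's method; the names `rows`, `out`, and `json_out` are illustrative.

```r
library(jsonlite)

df <- data.frame(X = c(1, 2, 3),
                 Y = c(4, 5, 6),
                 Z = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# One unnamed list per row; a list can hold numbers and strings side by side
rows <- lapply(seq_len(nrow(df)), function(i) unname(as.list(df[i, ])))

# Header row first, then the data rows
out <- c(list(names(df)), rows)

# auto_unbox = TRUE writes length-one values as scalars instead of arrays
json_out <- toJSON(out, auto_unbox = TRUE)
json_out
```

The result is a plain JSON array of arrays. The outer `{ }` in the desired output from the question is not valid JSON, so it is left out here.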

Converting List of Vectors to Data Frame in R

I'm trying to convert a list of vectors into a data frame, with there being a column for Company Names and column for the MPE. My list is generated by running the following code for each company:
MPE[[2]] <- c("Google", abs(((forecasted - goog[nrow(goog),]$close)
/ goog[nrow(goog),]$close)*100))
Now, I'm having trouble making it into the appropriate data frame for further manipulation. What's the easiest way to do this?
This is an example list of vectors that I would want to manipulate into a dataframe with the company names in one column and the number in the second column.
test <- list(c("Google", 2))
test[[2]] <- c("Microsoft", 3)
test[[3]] <- c("Apple", 4)

You can use unlist with matrix and then turn the result into a data frame. Reducing with rbind could take a long time with a large list, I think.
df <- data.frame(matrix(unlist(test), nrow=length(test), byrow=TRUE))
colnames(df) <- c("Company", "MPE")

I was actually able to achieve what I wanted with the following:
MPE_df <- data.frame(Reduce(rbind ,MPE))
colnames(MPE_df) <- c("Company", "MPE")
MPE_df
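With either approach the MPE column ends up as text, because `c("Google", 2)` coerces the number to a string when each vector is created. A minimal sketch of converting it back, using the example list from the question and `do.call(rbind, ...)` as an assumed alternative to `Reduce`:

```r
# Example list from the question; each element is a character vector
test <- list(c("Google", 2))
test[[2]] <- c("Microsoft", 3)
test[[3]] <- c("Apple", 4)

# Stack the vectors as rows of a character matrix, then convert
df <- as.data.frame(do.call(rbind, test), stringsAsFactors = FALSE)
colnames(df) <- c("Company", "MPE")

# Both columns are character at this point; make MPE numeric again
df$MPE <- as.numeric(df$MPE)
```

`do.call(rbind, test)` binds all rows in a single call, whereas `Reduce(rbind, test)` grows the matrix one row at a time.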

Subsetting efficiently on multiple columns and rows

I am trying to subset my data to drop rows with certain values of certain variables. Suppose I have a data frame df with many columns and rows, I want to drop rows based on the values of variables G1 and G9, and I only want to keep rows where those variables take on values of 1, 2, or 3. In this way, I aim to subset on the same values across multiple variables.
I am trying to do this with few lines of code and in a manner that allows quick changes to the variables or values I would like to use. For example, assuming I start with data frame df and want to end with newdf, which excludes observations where G1 and G9 do not take on values of 1, 2, or 3:
# Naive approach (requires manually changing variables and values in each line of code)
newdf <- df[which(df$G1 %in% c(1,2,3)), ]
newdf <- newdf[which(newdf$G9 %in% c(1,2,3)), ]
# Better approach (requires manually changing variables names in each line of code)
vals <- c(1,2,3)
newdf <- df[which(df$G1 %in% vals), ]
newdf <- newdf[which(newdf$G9 %in% vals), ]
If I wanted to not only subset on G1 and G9 but MANY variables, this manual approach would be time-consuming to modify. I want to simplify this even further by consolidating all of the code into a single line. I know the below is wrong but I am not sure how to implement an alternative.
vals <- c(1,2,3)
vars <- c(df$G1, df$G9)
newdf <- df[which(df$vars %in% vals), ]
It is my understanding I want to use apply() but I am not sure how.

You do not need to use which with %in%; it already returns a logical vector. How about the below:
keepies <- (df$G1 %in% vals) & (df$G9 %in% vals)
newdf <- df[keepies, ]
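The same idea extends to many columns by keeping the column names in a vector and combining the per-column tests with `Reduce`. This is a sketch on made-up data; `vars` and `keep` are illustrative names, and only `vars` and `vals` need to change to filter on different columns or values.

```r
# Toy data (hypothetical); only G1 and G9 matter for the filter
df <- data.frame(G1 = c(1, 4, 2, 3),
                 G9 = c(3, 2, 5, 1),
                 other = c("a", "b", "c", "d"),
                 stringsAsFactors = FALSE)

vals <- c(1, 2, 3)      # allowed values
vars <- c("G1", "G9")   # columns that must all take an allowed value

# One logical vector per column, combined element-wise with AND
keep <- Reduce(`&`, lapply(df[vars], function(col) col %in% vals))

newdf <- df[keep, ]
```

On this toy data, rows 1 and 4 are kept, since row 2 fails on G1 and row 3 fails on G9.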

Use data.table
First, melt your data
library(data.table)
DT <- melt(as.data.table(df)) #melt needs a data.table, not a plain data.frame
Then split into a list, with one table per original column
DTLists <- split(DT, by = "variable")
Now you can operate on each element of the list using lapply
DTresult <- lapply(DTLists, function(x) {
...
})

R move named column to the end of a data frame

I'm trying to move a column to the end of a data frame and I'm struggling.
output_index <- grep(output, names(df))
df <- cbind(df[,-output_index], df[,output_index])
This orders the data properly; however, it converts the data to a matrix, which doesn't work. How can I do this without losing the column names while keeping the data as a data frame?

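You don't need the comma in front of the index. df[i] always returns a data frame, whereas df[, i] with a single column drops to a plain vector and loses the column name: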
output_index <- grep(output, names(df))
df <- cbind(df[-output_index], df[output_index])
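A variation that avoids `cbind` altogether is to reorder the columns by name. This is a minimal sketch on toy data; `col` and `moved` are illustrative names, and matching the name exactly with `setdiff` sidesteps `grep` picking up other columns whose names merely contain the pattern.

```r
# Toy data frame (hypothetical values)
df <- data.frame(id = 1:3,
                 output = c(0.5, 1.5, 2.5),
                 input = c(10, 20, 30))

col <- "output"   # name of the column to move to the end

# Single-bracket indexing with a character vector keeps a data frame
moved <- df[c(setdiff(names(df), col), col)]

names(moved)                   # "id" "input" "output"
is.data.frame(df["output"])    # TRUE: no comma, stays a data frame
is.numeric(df[, "output"])     # TRUE: with the comma, drops to a vector
```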

df <- data.frame(id=1:10, output=rnorm(10,1,1), input=rnorm(10,1,1))
output_index <- grep("output", names(df))
res.df <- cbind(df[,-output_index], df[,output_index])
