How to Rename Column Headers in R - r

I have two separate datasets: one contains the column headers and the other contains the data.
I want to use the 2nd column of the first dataset as the column headers of the second dataset.
How can I do this? Thank you.

In general you can use colnames(), which returns a character vector with the column names of your data frame or matrix. You can rename all columns with:
colnames(df) <- *listofnames*
It is also possible to rename a single column by using the [] brackets.
This would rename the first column:
colnames(df2)[1] <- "name"
For your example we are going to take the values of the 2nd column of the first dataset. Try this:
colnames(df2) <- as.character(df1[,2])
Make sure that the number of header names is identical to the number of columns.
The equivalent for rows is rownames().
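As a minimal sketch of the steps above, with a length check added before the assignment. The data frames df1 and df2 and their contents are made up for illustration; they are not the asker's data.

```r
# Hypothetical example data: df1 holds the headers, df2 holds the values
df1 <- data.frame(id = 1:3,
                  header = c("alpha", "beta", "gamma"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(V1 = 1:2, V2 = 3:4, V3 = 5:6)

# Take the 2nd column of df1 as a character vector of new names
new_names <- as.character(df1[, 2])

# Stop early if the number of names does not match the number of columns
stopifnot(length(new_names) == ncol(df2))

colnames(df2) <- new_names
df2
```

The stopifnot() call is only a guard; without it, a mismatch in length raises an error from the assignment itself for a data frame.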

dplyr way w/ reproducible code:
library(dplyr)
df <- tibble(x = 1:5, y = 11:15)
df_n <- tibble(x = 1:2, y = c("col1", "col2"))
names(df) <- df_n %>% select(y) %>% pull()
I think the select() %>% pull() syntax is easier to remember than list indexing. I also used names() rather than colnames(). When working with a data frame, colnames() simply calls names(), so it is better to cut out the middleman and be explicit that we are working with a data frame and not a matrix. It is also shorter to type.

You can simply do this:
names(data)[3] <- 'Newlabel'
where names(data)[3] is the name of the column you want to rename (here, the 3rd column).
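If the position of the column is not known or may change, the same idea can be written with a lookup by the current name. This is only a sketch; the data frame and the name old_label are hypothetical.

```r
# Hypothetical data frame with a column called "old_label"
data <- data.frame(a = 1:2, b = 3:4, old_label = 5:6)

# Look the column up by its current name instead of its position
names(data)[names(data) == "old_label"] <- "Newlabel"

names(data)
```

If no column matches, the logical index is all FALSE and nothing is renamed, so a typo in the old name fails silently.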

Related

Selecting columns from dataframe programmatically when column names have spaces

I have a dataframe which I would like to query. Note that the columns of that dataframe could change and the column names have spaces. I have a function that I want to apply to the dataframe columns. I figured I could programmatically find out which columns exist and then use that list of columns to apply the function to the columns that exist.
I was able to figure out how to do that when the column names don't have spaces. See the code below:
library(tidyverse)
library(rlang)
col_names <- c("cyl","mpg","New_Var")
cc <- rlang::quos(col_names)
mtcars%>%mutate(New_Var=1)%>%select(!!!cc)
But when the column names have spaces, this method does not work. Below is the code I used:
col_names <- c("cyl","mpg","`New Var`")
cc <- rlang::quos(col_names)
mtcars%>%mutate(`New Var`=1)%>%select(!!!cc)
Is there a way to select columns that have spaces in their name without changing their names ?
You have to do nothing differently for values with spaces. For example,
library(dplyr)
library(rlang)
col_names <- c("cyl","mpg","New Var")
cc <- quos(col_names)
mtcars %>% mutate(`New Var`=1) %>% select(!!!cc)
Also note that select() accepts string names, so this works too:
mtcars%>% mutate(`New Var`=1) %>% select(col_names)
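Recent versions of dplyr may complain when a bare external character vector is passed to select(), so a more explicit variant wraps the vector in all_of() or any_of(). The sketch below assumes a dplyr version that exports these helpers (1.0.0 or later); the variable names result and flexible are only for illustration.

```r
library(dplyr)

col_names <- c("cyl", "mpg", "New Var")

# all_of() is strict: it errors if any listed column is missing
result <- mtcars %>%
  mutate(`New Var` = 1) %>%
  select(all_of(col_names))

# any_of() is lenient: names that are not present are skipped,
# which helps when the set of columns can change
flexible <- mtcars %>%
  select(any_of(col_names))

names(result)
names(flexible)
```

In both cases the names are plain strings, so no backticks are needed inside the vector even when a name contains a space.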

If a variable name in a data frame column matches names in a vector, rename the variable

Given a data frame and a vector
set.seed(123)
feature <- sample(LETTERS,30,replace = T)
number<-sample(1:100,30, replace = T)
df<-data.frame(feature,number)
rename<-c("N","V","C","E")
I want to scan through df$feature, and if a letter stored in rename matches one in the column df$feature, I want to rename it to "other".
I am quite sure that this must have been answered somewhere already, but I have looked for quite a long time without finding it.
You can use %in% to find the rows that hold a value from rename:
df$feature[df$feature %in% rename] <- "other"
In case df$feature is a factor and does not contain "other" in its levels, you need to add "other" to the levels before you exchange the values, with:
levels(df$feature) <- unique(c(levels(df$feature), "other"))
or you cast it to character with:
df$feature <- as.character(df$feature)
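To make the factor case concrete, here is a small sketch with made-up data. stringsAsFactors = TRUE is set explicitly because data.frame() no longer creates factors by default from R 4.0.0 onward; the droplevels() step is an optional addition, not part of the answer above.

```r
# Small hypothetical data; stringsAsFactors = TRUE forces a factor column
df <- data.frame(feature = c("A", "N", "V", "B", "E"),
                 number = 1:5,
                 stringsAsFactors = TRUE)
rename <- c("N", "V", "C", "E")

# Add the new level first; otherwise the assignment yields NA with a warning
levels(df$feature) <- unique(c(levels(df$feature), "other"))
df$feature[df$feature %in% rename] <- "other"

# Optionally drop the levels that are no longer used
df$feature <- droplevels(df$feature)

as.character(df$feature)
levels(df$feature)
```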
One option using dplyr and stringr:
library(dplyr)
library(stringr)
df %>%
  mutate(feature = str_replace(feature, paste(rename, collapse = "|"), "other"))

Name the column of data frame and set as factor at the same time

I need your help to simplify the following code.
I need to name the columns of a matrix and format each of them as a factor.
How can I do that for 100 columns without doing it one by one?
z <- matrix(sample(seq(3),n*p,replace=TRUE),nrow=n)
train.data <- data.frame(x1=factor(z[,1]),x2=factor(z[,2]),....,x100=factor(z[,52]))
Here's one option
setNames(data.frame(lapply(split(z, col(z)), factor)), paste0("x", 1:p))
or use magrittr piping syntax
library(magrittr)
split(z, col(z)) %>%
  lapply(factor) %>%
  data.frame %>%
  setNames(paste0("x", 1:p))
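For a self-contained run, here is a sketch of a slightly different base R route: convert the matrix with as.data.frame() and then turn every column into a factor in place. The values of n and p are assumptions chosen to keep the output small (the question uses 100 columns).

```r
set.seed(1)
n <- 6   # number of rows (assumed value)
p <- 4   # number of columns (assumed value)

z <- matrix(sample(seq(3), n * p, replace = TRUE), nrow = n)

# Each matrix column becomes a data frame column
train.data <- as.data.frame(z)

# train.data[] keeps the data frame structure while replacing every column
train.data[] <- lapply(train.data, factor)

# Name the columns x1, x2, ..., xp
names(train.data) <- paste0("x", seq_len(p))

str(train.data)
```

seq_len(p) is used instead of 1:p so that p = 0 yields an empty sequence rather than c(1, 0).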

How to use gather_ in tidyr with variables

I'm using tidyr together with shiny and hence need to use dynamic values in tidyr operations.
However, I have trouble using gather_(), which I think was designed for such cases.
Minimal example below:
library(tidyr)
df <- data.frame(name=letters[1:5],v1=1:5,v2=10:14,v3=7:11,stringsAsFactors=FALSE)
#works fine
df %>% gather(Measure,Qty,v1:v3)
dyn_1 <- 'Measure'
dyn_2 <- 'Qty'
dyn_err <- 'v1:v3'
dyn_err_1 <- 'v1'
dyn_err_2 <- 'v2'
#error
df %>% gather_(dyn_1,dyn_2,dyn_err)
#error
df %>% gather_(dyn_1,dyn_2,dyn_err_1:dyn_err_2)
After some debugging I realized the error happened at the melt measure.vars part, but I don't know how to get it to work with the ':' there...
Please help with a solution and explain a little bit so I could learn more.
You are telling gather_ to look for a single column named 'v1:v3', not for the separate column ids. Simply change dyn_err <- "v1:v3" to dyn_err <- paste("v", seq(3), sep="").
If your df has different column names (e.g. var_a, qtr_b, stg_c), you can either extract those column names or use the paste function for whichever variables are of interest.
dyn_err <- colnames(df)[2:4]
or
dyn_err <- paste(c("var", "qtr", "stg"), letters[1:3], sep="_")
You need to look at what column names you want and make the corresponding vector.
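The underscore verbs such as gather_() are deprecated in current tidyr. The same idea, one string per column rather than a 'v1:v3' string, can be sketched with pivot_longer() and all_of(). This assumes tidyr 1.0.0 or later; the names dyn_cols and long are introduced here only for illustration.

```r
library(tidyr)

df <- data.frame(name = letters[1:5], v1 = 1:5, v2 = 10:14, v3 = 7:11,
                 stringsAsFactors = FALSE)

dyn_1 <- "Measure"
dyn_2 <- "Qty"
dyn_cols <- c("v1", "v2", "v3")   # one string per column, not "v1:v3"

# all_of() tells tidyr to treat dyn_cols as a vector of column names
long <- pivot_longer(df,
                     cols = all_of(dyn_cols),
                     names_to = dyn_1,
                     values_to = dyn_2)

head(long)
```

Because all three inputs are ordinary character values, they can come straight from shiny inputs.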

What is the best way to transpose a data.frame in R and to set one of the columns to be the header for the new transposed table?

What is the best way to transpose a data.frame in R and to set one of the columns to be the header for the new transposed table? I have coded up a way to do this below. As I am still new to R, I would like suggestions to improve my code as well as alternatives that would be more R-like. My solution is also unfortunately a bit hard-coded (i.e. the new column headings are in a certain location).
# Assume a data.frame called fooData
# Assume the column is the first column before transposing
# Transpose table
fooData.T <- t(fooData)
# Set the column headings
colnames(fooData.T) <- fooData.T[1, ]
# Get rid of the column heading row
fooData.T <- fooData.T[2:nrow(fooData.T), ]
#fooData.T now contains a transposed table with the first column as headings
Well, you could do it in 2 steps by using
# Transpose the part of the table you want
fooData.T <- t(fooData[,2:ncol(fooData)])
# Set the column headings from the first column in the original table
colnames(fooData.T) <- fooData[,1]
The result is a matrix, which you're probably aware of; that's due to class issues when transposing. I don't think there will be a single-line way to do this given the lack of naming abilities in the transpose step.
You can do it even in one line:
fooData.T <- setNames(data.frame(t(fooData[,-1])), fooData[,1])
There are already great answers. However, this answer might be useful for those who prefer brevity in code.
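To see what the one-liner produces, here is a small run on made-up data. The column names key, x and y are hypothetical.

```r
# Hypothetical input: the first column holds the future headers
fooData <- data.frame(key = c("a", "b", "c"),
                      x = 1:3,
                      y = 4:6,
                      stringsAsFactors = FALSE)

# Transpose everything except the first column, then use that column as names
fooData.T <- setNames(data.frame(t(fooData[, -1])), fooData[, 1])

fooData.T
```

The old column names (x and y) end up as row names of the result. Note that with only one remaining column, fooData[, -1] drops to a plain vector, so drop = FALSE would be needed in that case.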
Here are my two cents using dplyr for a data.frame that has grouping columns and an id column.
id_transpose <- function(df, id){
  df %>%
    ungroup() %>%
    select(where(is.numeric)) %>%
    t() %>%
    as_tibble() %>%
    setNames(., df %>% pull({{id}}))
}
Here is another tidyverse/dplyr approach.
mtcars %>%
tibble::rownames_to_column() %>%
tidyr::pivot_longer(-rowname) %>%
tidyr::pivot_wider(names_from=rowname, values_from=value)
Use transpose() from data.table. Suppose the column you want to use as the header after transposing is the variable group.
fooData.transpose = fooData %>% transpose(make.names = "group")
In addition, if you want to give a name to the column that holds the original column names, use the keep.names argument.
fooData.transpose = fooData %>% transpose(make.names = "group", keep.names = "column_name")
There's now a dedicated function to transpose data frames, rotate_df from the sjmisc package. If the desired names are in the first column of the original df, you can achieve this in one line thanks to the cn argument.
Here's an example data frame:
df <- data.frame(name = c("Mary", "John", "Louise"), class = c("A", "A", "B"), score = c(40, 75, 80))
df
#    name class score
#1   Mary     A    40
#2   John     A    75
#3 Louise     B    80
Executing the function with cn = TRUE:
rotate_df(df, cn = TRUE)
#      Mary John Louise
#class    A    A      B
#score   40   75     80
I had a similar problem to this: I had a variable of factors in a long format and I wanted each factor level to be a new column heading; using "unstack" from the utils package did it in one step. If the column you want as a header isn't a factor, "cast" from the reshape package might work.
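A minimal sketch of the unstack() route, using made-up long-format data. The column names value and group are assumptions, and the groups are balanced on purpose so that the result is a data frame.

```r
# Hypothetical long-format data: one value column and one factor column
long <- data.frame(value = c(1, 2, 3, 4, 5, 6),
                   group = factor(rep(c("a", "b", "c"), each = 2)))

# The formula reads "values ~ grouping factor";
# each factor level becomes a column heading
wide <- unstack(long, value ~ group)

wide
```

If the groups have different sizes, unstack() returns a list instead of a data frame, since the columns would not have equal length.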
