rename column names dynamically in dplyr chain [duplicate] - r

Here's an example
temp <- mtcars
colnames(temp)[grepl("ge", colnames(temp))] <- "garbage"
Output
mpg cyl disp hp drat wt qsec vs am garbage carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
I only know what the column name will start with ("ge") but I'm not sure of the column name exactly
I want a solution that works in a dplyr chain
temp %>%
rename(vars(starts_with("ge")), "garbage")
Error: All arguments must be named
of course doesn't work. Thanks for any help

You can use rename_at. If you know that only one column starts with "ge", this will work:
library(dplyr)
mtcars %>%
rename_at(vars(starts_with("ge")), ~ "garbage")
In dplyr versions before 0.8.0 the function was wrapped in funs(), which has since been deprecated. If you want to rename more than one column, the function needs to return a vector of unique names, one per selected column, for example by using gsub() or paste0() to modify the existing column names.
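In dplyr 1.0.0 and later, rename_with() covers the same case. A minimal sketch, assuming dplyr >= 1.0.0 is installed; the replacement names "garbage" and "garbage_" are only illustrative:

```r
library(dplyr)

# Exactly one column starts with "ge" (gear), so a constant name is safe.
one <- mtcars %>%
  rename_with(~ "garbage", starts_with("ge"))

# Several columns may match: derive each new name from the old one
# so the resulting names stay unique.
many <- mtcars %>%
  rename_with(~ sub("^d", "garbage_d", .x), starts_with("d"))

names(one)[10]
names(many)[c(3, 5)]
```

The second call is the safer pattern when you are not sure how many columns match the prefix.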

Related

RStudio: colnames() function not showing name of very first column

When I run colnames(), it never shows the name of this first column.
For example, after wasting a lot of time researching online, I discovered the name of the first column in mtcars is das_Auto.
Why doesn't this name show when I run this code?
colnames(mtcars)[1]
What's the easiest way to determine the name of the first column in a data set?
This is because the first 'column' of mtcars is not actually a column: the car names are stored as row names, so colnames() does not report anything for them. If you want to convert them to a regular column, you can run the following:
df <- cbind(das_Auto = rownames(mtcars), mtcars)
rownames(df) <- 1:nrow(mtcars)
head(df)
das_Auto mpg cyl disp hp drat wt qsec vs am gear carb
1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
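To see the distinction directly, a short base R check (nothing beyond the built-in mtcars data is assumed) compares what colnames() and rownames() report:

```r
# Row names live outside the columns, so colnames() never reports them.
first_col <- colnames(mtcars)[1]   # name of the first real column
first_row <- rownames(mtcars)[1]   # label of the first row

first_col
first_row
ncol(mtcars)                       # the car names are not counted here
```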

How to rename all columns between two sets of columns in a dataframe using R?

Say I am interested in renaming columns across several datasets. The columns that need to be renamed vary by name and position, so they can't be selected that way. However, the columns prior to the columns I want to rename and the columns right after are constant.
As an example, say the mpg and cyl columns in mtcars are always the first two columns and their names never change. The vs:carb columns are similar, but their positions change depending on the number of columns added before them (but after cyl). However, the variable names from hp:qsec change, and sometimes a new variable gets added between them.
Say I want to append the word '_Value' to the end of each of the columns that are located after cyl and before vs. How would I go about doing that, ideally using dplyr?
You can try -
library(dplyr)
mtcars %>%
rename_with(~paste0(., '_Value'), -c(mpg:cyl, vs:carb)) %>%
head
# mpg cyl disp_Value hp_Value drat_Value wt_Value qsec_Value vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
If the data contains other columns and you want to rename only the columns located between cyl and vs, you can do -
start <- match('cyl', names(mtcars))
end <- match('vs', names(mtcars))
cols <- (start + 1):(end - 1)
names(mtcars)[cols] <- paste0(names(mtcars)[cols], '_Value')
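The positional approach can be wrapped in a small function so it can be reused across datasets. This is only a sketch: rename_between() is a hypothetical helper name, not part of base R or dplyr, and it assumes both anchor columns exist and appear in that order.

```r
# Hypothetical helper: append `suffix` to every column strictly between
# the columns named `after` and `before`.
rename_between <- function(df, after, before, suffix = "_Value") {
  start <- match(after, names(df))
  end <- match(before, names(df))
  stopifnot(!is.na(start), !is.na(end), start < end)
  # seq_len() yields an empty index when the two anchors are adjacent,
  # whereas (start + 1):(end - 1) would count backwards in that case.
  cols <- start + seq_len(end - start - 1)
  if (length(cols) > 0) {
    names(df)[cols] <- paste0(names(df)[cols], suffix)
  }
  df
}

renamed <- rename_between(mtcars, "cyl", "vs")
names(renamed)
```

Unlike the in-place version, this returns a modified copy and leaves the input data frame untouched.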

How to recover property from table into dataframe in R? [duplicate]

I think I am missing a fundamental concept about R's data frames.
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
The names of the cars here. Is this a column? I don't think so, because I am not able to access them via mtcars[,1]. And there is no column name/header for it.
How could I create a data frame like that? How could I use that special column e.g. to describe the data in a plot for example?
They are row names, to access them use:
rownames(mtcars)
For column names use colnames, to see both row and column names, we can use:
dimnames(mtcars)
To modify, for example the first row:
rownames(mtcars)[1] <- "myNewName"
When a data frame is created with data.frame(), row names default to the integers 1:n.
mydata <- data.frame(x = 1:5)
Then we can modify them:
rownames(mydata) <- paste0("MyName", 1:5)
Or we can add rownames when creating the data.frame:
mydata <- data.frame(x = 1:5, row.names = paste0("MyName", 1:5))
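Once set, row names can also be used to select rows. A small base R sketch that rebuilds the mydata example from above:

```r
# Same toy data frame as above, with character row names.
mydata <- data.frame(x = 1:5, row.names = paste0("MyName", 1:5))

# Select a row by its name instead of its position.
third <- mydata["MyName3", "x"]
third

# Row names must be unique, so each one identifies exactly one row.
anyDuplicated(rownames(mydata))
```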
Note:
Row names are not very reliable; for example, see this post. (This may be a subjective opinion, but I avoid them by moving row names into a column.)
The data.table and dplyr packages prefer not to have them. You can always copy row names into a column:
mydata$myNames <- rownames(mydata)
A shorter one-liner with the data.table package makes the row names a column (named rn by default):
library(data.table)
setDT(mtcars, keep.rownames = TRUE)[]
head(mtcars)
rn mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
This works too using tibble.
library(tibble)
mtcars %>%
rownames_to_column(var="carnames")
How could you create a data frame like that?
You can turn a column into row names using the textshape package. See the example below:
# column to row names
library(textshape)
state_dat <- data.frame(state.name, state.area, state.center, state.division)
column_to_rownames(state_dat)
# make 'state.name' the row names in a new data frame 'new_state_dat'
new_state_dat <- column_to_rownames(state_dat, 'state.name')
I advise against using row.names() to turn a column into row names.
How could I use that special column e.g. to describe the data in a plot for example?
You can use the superheat package; for more information, see https://rlbarter.github.io/superheat/index.html . In my experience it is simpler and more powerful to use the textshape package rather than row.names() to turn a column into row names.
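Row names can also be used as plot labels without any extra package. A minimal sketch with base graphics; the choice of wt and mpg as axes is only an illustration:

```r
# Base graphics only: scatter plot with each point labelled by its row name.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
text(mtcars$wt, mtcars$mpg, labels = rownames(mtcars),
     pos = 4, cex = 0.6)
```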

Replace for loop with apply function

My question is how to use "apply" function to do what "for loop" does in this example:
mtcars
for (i in colnames(mtcars)){
print(head(mtcars[i]))
}
What I need is to get R to read one column after the other, ideally by index (ie mtcars[1], then mtcars[2], then mtcars[3]...) rather than colnames.
Your help is highly appreciated.
Thanks
Use the apply function:
apply(mtcars, 2, head)
This is the result:
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Explanation
The second argument of apply(), here 2, means that each column is passed to the function; passing 1 instead would pass each row. Note that apply() first coerces the data frame to a matrix.
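Since the question asks to walk the columns by index, an alternative sketch uses lapply() over seq_along(), which keeps each piece a one-column data frame just like the original loop (the name heads is only illustrative):

```r
# Iterate over column positions rather than column names.
# mtcars[i] keeps the result a one-column data frame, as in the original loop.
heads <- lapply(seq_along(mtcars), function(i) head(mtcars[i]))

length(heads)
names(heads[[1]])

# To print each piece, as the loop does:
invisible(lapply(heads, print))
```

Because lapply() works on the data frame directly, no coercion to a matrix takes place, so column types are preserved.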
