Can you sort the columns of a data frame by their class? Say
data("mtcars")
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- as.factor(mtcars$am)
sapply(mtcars,class)
and I want all numeric variables first and all factors at the end. I want to be able to do this on a much larger dataset, so I prefer solutions that do not rely on subsetting by column number. Cheers.
Maybe this one?
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
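# "numeric" sorts after "factor" alphabetically, so decreasing = TRUE puts numeric columns first
x <- mtcars[, names(sort(unlist(lapply(mtcars, class)), decreasing = TRUE))]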
head(x)
# mpg disp hp drat wt qsec gear carb cyl vs am
# Mazda RX4 21.0 160 110 3.90 2.620 16.46 4 4 6 0 1
# Mazda RX4 Wag 21.0 160 110 3.90 2.875 17.02 4 4 6 0 1
# Datsun 710 22.8 108 93 3.85 2.320 18.61 4 1 4 1 1
# Hornet 4 Drive 21.4 258 110 3.08 3.215 19.44 3 1 6 1 0
# Hornet Sportabout 18.7 360 175 3.15 3.440 17.02 3 2 8 0 0
# Valiant 18.1 225 105 2.76 3.460 20.22 3 1 6 1 0
In x, as you can see, the columns cyl, vs and am, which are of class factor, are placed at the end, and those of class numeric come first.
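Sorting on the class name works here only because "numeric" happens to sort after "factor" alphabetically. A minimal sketch of an alternative that tests for factors directly, assuming the goal is simply non-factors first and factors last (the names is_fac and x2 are just illustrative):

```r
data("mtcars")
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- as.factor(mtcars$am)

# TRUE for factor columns, FALSE otherwise
is_fac <- vapply(mtcars, is.factor, logical(1))

# order() is stable, so FALSE (non-factor) columns come first and
# the original column order is preserved within each group
x2 <- mtcars[order(is_fac)]
names(x2)
```

This avoids column numbers and does not depend on how class names compare as strings.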
I need to reorder the columns of a data frame with 500 columns. In fact, I only want to move the last column between the third and the fourth columns.
Here is what I tried:
df[ ,c(1, 2, 3, ncol(df), 4:ncol(df)-1)]
But it gives me a vector of values that are the column numbers. Could someone tell me what is wrong with this code?
The issue is likely operator precedence: the : operator binds more tightly than -, so 4:ncol(df)-1 is evaluated as (4:ncol(df))-1. Wrap ncol(df)-1 in parentheses. (The code below uses with = FALSE because the object is a data.table; for a plain data.frame, omit it.)
library(data.table)
df <- df[ ,c(1, 2, 3, ncol(df), 4:(ncol(df)-1)), with = FALSE]
Or use setcolorder to update the original object
setcolorder(df, c(1, 2, 3, ncol(df), 4:(ncol(df)-1)))
NOTE: with = FALSE was added after the OP confirmed it is a data.table object
Or another option is select
library(dplyr)
df <- df %>%
select(1:3, last_col(), everything())
Or with relocate
df <- df %>%
relocate(last_col(), .before = 4)
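To see why the original index vector misbehaves, the two expressions can be compared directly. This is a minimal sketch using n = 11 (the number of columns in mtcars) in place of ncol(df); the names bad and good are just illustrative:

```r
n <- 11  # number of columns in mtcars

# ':' has higher precedence than binary '-', so this is (4:n) - 1
4:n - 1      # 3 4 5 6 7 8 9 10

# with parentheses the sequence stops one short of n, as intended
4:(n - 1)    # 4 5 6 7 8 9 10

# the unparenthesised version therefore repeats column 3 in the index vector
bad  <- c(1, 2, 3, n, 4:n - 1)
good <- c(1, 2, 3, n, 4:(n - 1))
anyDuplicated(bad) > 0    # TRUE
anyDuplicated(good) > 0   # FALSE
```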
Reproducible example:
> data(mtcars)
> head(mtcars)[, c(1, 2, 3, ncol(mtcars), 4:(ncol(mtcars)-1))]
mpg cyl disp carb hp drat wt qsec vs am gear
Mazda RX4 21.0 6 160 4 110 3.90 2.620 16.46 0 1 4
Mazda RX4 Wag 21.0 6 160 4 110 3.90 2.875 17.02 0 1 4
Datsun 710 22.8 4 108 1 93 3.85 2.320 18.61 1 1 4
Hornet 4 Drive 21.4 6 258 1 110 3.08 3.215 19.44 1 0 3
Hornet Sportabout 18.7 8 360 2 175 3.15 3.440 17.02 0 0 3
Valiant 18.1 6 225 1 105 2.76 3.460 20.22 1 0 3
> head(mtcars) %>% select(1:3, last_col(), everything())
mpg cyl disp carb hp drat wt qsec vs am gear
Mazda RX4 21.0 6 160 4 110 3.90 2.620 16.46 0 1 4
Mazda RX4 Wag 21.0 6 160 4 110 3.90 2.875 17.02 0 1 4
Datsun 710 22.8 4 108 1 93 3.85 2.320 18.61 1 1 4
Hornet 4 Drive 21.4 6 258 1 110 3.08 3.215 19.44 1 0 3
Hornet Sportabout 18.7 8 360 2 175 3.15 3.440 17.02 0 0 3
Valiant 18.1 6 225 1 105 2.76 3.460 20.22 1 0 3
> ?relocate
> head(mtcars) %>% relocate(last_col(), .before = 4)
mpg cyl disp carb hp drat wt qsec vs am gear
Mazda RX4 21.0 6 160 4 110 3.90 2.620 16.46 0 1 4
Mazda RX4 Wag 21.0 6 160 4 110 3.90 2.875 17.02 0 1 4
Datsun 710 22.8 4 108 1 93 3.85 2.320 18.61 1 1 4
Hornet 4 Drive 21.4 6 258 1 110 3.08 3.215 19.44 1 0 3
Hornet Sportabout 18.7 8 360 2 175 3.15 3.440 17.02 0 0 3
Valiant 18.1 6 225 1 105 2.76 3.460 20.22 1 0 3
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
I want two new columns.
The first is near_wt, which holds the value from any other column that is equal or closest to the value in the wt column.
I got this using
mtcars$near_wt <- mtcars[, -col][cbind(1:nrow(mtcars),
max.col(-abs(mtcars[, col] - mtcars[, -col])))]
This works. The next thing I need is the column name from which each value of "near_wt" comes.
Is col equal to 6 (the position of wt)? If so, you may try it this way. For convenience, I define x as
x <- cbind(seq_len(nrow(mtcars)), max.col(-abs(mtcars[, 6] - mtcars[, -6])))
Then,
mtcars$near_wt <- mtcars[, -6][x]
mtcars$near_wt_name <- names(mtcars[-6])[x[,2]]
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb near_wt near_wt_name
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 3.90 drat
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 3.90 drat
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1.00 am
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 3.08 drat
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 3.15 drat
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 3.00 gear
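One caveat: max.col breaks ties at random by default, so rows where several columns are equally close to wt (for example Datsun 710, where vs, am and carb are all 1) can give a different near_wt_name on each run. A minimal sketch that looks up wt by name and uses ties.method = "first" for reproducible results; the helper names wt_col, others, closeness and nearest are just illustrative:

```r
data(mtcars)

wt_col <- which(names(mtcars) == "wt")   # position of wt, found by name
others <- as.matrix(mtcars[-wt_col])     # every column except wt

# negative absolute distance to wt: the largest value is the closest column
closeness <- -abs(others - mtcars[[wt_col]])

# "first" picks the left-most column when distances tie
nearest <- max.col(closeness, ties.method = "first")

mtcars$near_wt      <- others[cbind(seq_len(nrow(others)), nearest)]
mtcars$near_wt_name <- colnames(others)[nearest]
head(mtcars[c("wt", "near_wt", "near_wt_name")])
```

With ties resolved to the left-most column, tied rows report a different name than in the output above (which came from a random tie-break), but the near_wt value itself is the same.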
When I create a new variable, is there a way to specify in the function where to place it?
Right now, it adds it to the end of the dataframe, but for ease of viewing in Excel for example, I'd like to place a new calculated column beside the columns I used for the calculation.
Here's an example of code:
rawdata2 <- (rawdata1 %>% unite(location, locations1,locations2, locations3,
na.rm = TRUE, remove=TRUE)
%>% select(-location7, -location16)
%>% unite(Sector, Sectors, na.rm=TRUE, remove=TRUE)
%>% unite(TypeofSpace, TypesofSpace, type.of.spaceOther, na.rm=TRUE,
remove=TRUE)
)
You can rearrange the columns in your data frame. It looks like you are using dplyr::select in your example.
library(dplyr)
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
mtcars2 <- mtcars %>%
select(mpg, carb, everything()) ## moves carb up behind mpg
head(mtcars2)
# mpg carb cyl disp hp drat wt qsec vs am gear
# Mazda RX4 21.0 4 6 160 110 3.90 2.620 16.46 0 1 4
# Mazda RX4 Wag 21.0 4 6 160 110 3.90 2.875 17.02 0 1 4
# Datsun 710 22.8 1 4 108 93 3.85 2.320 18.61 1 1 4
# Hornet 4 Drive 21.4 1 6 258 110 3.08 3.215 19.44 1 0 3
# Hornet Sportabout 18.7 2 8 360 175 3.15 3.440 17.02 0 0 3
# Valiant 18.1 1 6 225 105 2.76 3.460 20.22 1 0 3
You can do the same thing with base subsetting. For example, with a data frame of 11 columns you can move the 11th into second position, right behind the first, by
mtcars3 <- mtcars[,c(1,11,2:10)]
identical(mtcars2, mtcars3)
# [1] TRUE
I ended up using relocate, documentation here: dplyr.tidyverse.org/reference/relocate.html
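For reference, a short sketch of two ways to do this on mtcars, assuming dplyr 1.0.0 or later (where relocate and the .after argument of mutate are available). The column hp_per_wt is a made-up example, not something from the original data:

```r
library(dplyr)

# create the column first, then move it next to the columns it came from
a <- mtcars %>%
  mutate(hp_per_wt = hp / wt) %>%
  relocate(hp_per_wt, .after = wt)

# or place it directly while creating it
b <- mtcars %>%
  mutate(hp_per_wt = hp / wt, .after = wt)

names(a)
names(b)
```

Both use column names rather than positions, so they keep working if columns are added or removed upstream.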
I want to swap a specific column with the last column and then delete the last column after swapping. After the deletion, ncol(testFrame) will decrease by 1.
Usually a reproducible example is expected but your description is clear enough to understand what you want to do.
Using mtcars as sample data
df <- mtcars
head(df)
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
swap_column <- 3
cols <- seq_len(ncol(df))
df1 <- df[replace(cols, cols == swap_column, ncol(df))][-ncol(df)]
head(df1)
# mpg cyl carb hp drat wt qsec vs am gear
#Mazda RX4 21.0 6 4 110 3.90 2.620 16.46 0 1 4
#Mazda RX4 Wag 21.0 6 4 110 3.90 2.875 17.02 0 1 4
#Datsun 710 22.8 4 1 93 3.85 2.320 18.61 1 1 4
#Hornet 4 Drive 21.4 6 1 110 3.08 3.215 19.44 1 0 3
#Hornet Sportabout 18.7 8 2 175 3.15 3.440 17.02 0 0 3
#Valiant 18.1 6 1 105 2.76 3.460 20.22 1 0 3
We replace the column number swap_column with last column number (ncol(df)) and then remove the last column (-ncol(df)).
We can do this conveniently with add_column from tibble. The .after and .before parameters take either a column index or a column name. Suppose we need to move the last column to the third position:
library(tibble)
data(mtcars)
df1 <- add_column(mtcars[-ncol(mtcars)], mtcars[ncol(mtcars)], .after = 2)
head(df1)
# mpg cyl carb disp hp drat wt qsec vs am gear
#Mazda RX4 21.0 6 4 160 110 3.90 2.620 16.46 0 1 4
#Mazda RX4 Wag 21.0 6 4 160 110 3.90 2.875 17.02 0 1 4
#Datsun 710 22.8 4 1 108 93 3.85 2.320 18.61 1 1 4
#Hornet 4 Drive 21.4 6 1 258 110 3.08 3.215 19.44 1 0 3
#Hornet Sportabout 18.7 8 2 360 175 3.15 3.440 17.02 0 0 3
#Valiant 18.1 6 1 225 105 2.76 3.460 20.22 1 0 3
I would like to define a string
string<- "modelName"
That could be used to name an object later. Something like
paste0(string) <- mtcars
cat(string) <- mtcars
print(string) <- mtcars
get(string) <- mtcars
The needed result is the dataset called "modelName". None of the examples above work, obviously.
Question:
How can one create an object whose name is defined by the given string?
As @Spacedman notes, this is not generally the way things are done, but you can use assign
string<- "modelName"
assign(string, mtcars)
> head(modelName)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
In general it may be preferable to use something like a list:
x <- list()
x[[string]] <- mtcars
> head(x$modelName)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
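A small sketch of why the list tends to be easier to work with: the name stays ordinary data, so it can be used for lookup and iteration without assign or get. The names in model_names below are hypothetical.

```r
model_names <- c("modelName", "otherModel")   # hypothetical names

# build a named list, one element per name
models <- setNames(vector("list", length(model_names)), model_names)
models[["modelName"]]  <- mtcars
models[["otherModel"]] <- head(mtcars, 10)

# look an element up by a string held in a variable
string <- "modelName"
dim(models[[string]])

# iterate over everything without knowing the names in advance
sapply(models, nrow)

# with assign(), the matching lookup would be get(string)
assign(string, mtcars)
identical(get(string), models[[string]])
```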