Extract all columns except numeric in R data frame [duplicate] - r

This question already has answers here:
How to select non-numeric columns using dplyr::select_if
(3 answers)
Closed 1 year ago.
In my project, I want to extract all non-numeric columns from an R data frame. Following the linked question, I used the same method and simply negated the is.numeric() function, but it does not work.
This gives all the numeric data,
x<-iris %>% dplyr::select(where(is.numeric))
But this does not work as expected,
x<-iris %>% dplyr::select(where(!is.numeric))
Note: the final output data frame should contain only the Species column of the iris dataset.

The purrr package from the tidyverse does exactly what you want with purrr::keep and purrr::discard.
library(purrr)
x <- iris %>% keep(is.numeric)
With this piece of code, you pass a predicate to keep, and only the columns that pass the test are retained.
To reverse the operation and get the result you want, use discard, also from purrr:
x <- iris %>% discard(is.numeric)
You can think of discard as keep with !is.numeric.
Alternatively, with dplyr:
library(dplyr)
x <- iris %>% select_if(~!is.numeric(.))
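The attempt in the question fails because `!` cannot be applied to a function object such as is.numeric. A minimal sketch of two ways to keep the where() style from the question, assuming dplyr 1.0.0 or later (where where() is available): negate inside a lambda, or negate the selection helper itself.

```r
library(dplyr)

# Negate the predicate inside a lambda so `!` is applied to its logical result
x <- iris %>% select(where(~ !is.numeric(.x)))

# Equivalent: negate the selection returned by where()
y <- iris %>% select(!where(is.numeric))

names(x)
names(y)
```

Both forms should leave only the Species column, which is the result the question asks for.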

Related

Converting a matrix into a dataframe [duplicate]

This question already has answers here:
How to deal with nonstandard column names (white space, punctuation, starts with numbers)
(3 answers)
Closed 2 years ago.
I am trying to convert the built-in matrix state.x77 to a data frame. I expected that converting it with as.data.frame would automatically rename the column "Life Exp" to "Life.Exp"; however, when I use select() to choose that column as either Life.Exp or Life Exp, neither is found. Am I converting it wrong?
library(dplyr)
library(tidyr)
state.x77 %>% as.data.frame %>% select(Frost,Life.Exp) %>% cor
To reinforce the other answers: as.data.frame converts the state.x77 matrix to a data frame and keeps the original column names. The name Life Exp contains a space, so it is not a syntactic R name; to select that column you must wrap it in backticks (``). Therefore:
select(Frost, `Life Exp`)
Perhaps surprisingly, as.data.frame does not change the column names, so the space remains; you can select the column using backticks.
library(dplyr)
state.x77 %>% as.data.frame %>% select(Frost,`Life Exp`) %>% cor
However, if you use data.frame, the space is replaced with a ".", so you can use:
state.x77 %>% data.frame %>% select(Frost,Life.Exp) %>% cor
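The difference between the two constructors comes from the check.names argument of data.frame, which defaults to TRUE and passes column names through make.names(). A small base R sketch comparing the resulting names:

```r
# as.data.frame() keeps the matrix column names unchanged
names_asdf <- names(as.data.frame(state.x77))

# data.frame() defaults to check.names = TRUE, so names go through make.names()
names_df <- names(data.frame(state.x77))

# Opting out of the check keeps the original names
names_raw <- names(data.frame(state.x77, check.names = FALSE))

print(names_asdf)
print(names_df)
print(make.names("Life Exp"))  # "Life.Exp"
```

Which form to use is a matter of taste: syntactic names avoid backticks, while the original names match the documentation of the dataset.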
Try this:
library(dplyr)
library(tidyr)
state.x77 %>% as.data.frame() %>% select(Frost,`Life Exp`) %>% cor()
            Frost Life Exp
Frost    1.000000 0.262068
Life Exp 0.262068 1.000000

dplyr - convert column names containing words to character

I want to convert the columns whose names start with the word "feature" to character type using dplyr. I tried the code below and a few other variations based on answers from Stack Overflow. Any help would be appreciated. Thanks!
train %>% mutate_if(vars(starts_with("feature")), funs(as.character(.)))
I am trying to improve my usage of dplyr commands.
You need mutate_at instead
library(dplyr)
train %>% mutate_at(vars(starts_with("feature")), as.character)
As @Gregor mentioned, mutate_if is for selecting columns based on the actual data in the column, not on the column names.
For example,
iris %>% mutate_if(is.numeric, sqrt)
Here the square root is calculated only for columns whose data is numeric.
To combine multiple vars statements into one, we can use matches:
merchants %>% mutate_at(vars(matches("_id|category_")), as.character)
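In dplyr 1.0.0 and later, the scoped verbs mutate_at and mutate_if are superseded by across(). A minimal sketch of the same conversion, assuming a small made-up train data frame (its columns are hypothetical stand-ins for the one in the question):

```r
library(dplyr)

# Hypothetical stand-in for the `train` data frame from the question
train <- data.frame(
  id        = 1:3,
  feature_1 = c(10, 20, 30),
  feature_2 = c(0.5, 1.5, 2.5)
)

# Select columns by name with starts_with(), then apply as.character to each
train_chr <- train %>%
  mutate(across(starts_with("feature"), as.character))

sapply(train_chr, class)
```

The id column is left untouched because its name does not start with "feature".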

Filtering tables with a character variable as a column name in dplyr R [duplicate]

This question already has answers here:
R dplyr: Non-Standard Evaluation difficulty. Would like to use dynamic variable names in filter and mutate
(2 answers)
Closed 4 years ago.
I am trying to filter the mtcars table in R, referencing a column name with a character variable. So, I write:
var <- "cyl"
mtcars %>%
filter(!!var > 6)
But, for some reason the table isn't being filtered. I think this code is the equivalent of this:
mtcars %>%
filter("cyl" > 6)
What I really need is to convert that string to a name. Does anybody know how to handle this problem?
This works for me:
library(dplyr)
var <- sym("cyl")
mtcars %>%
filter(!!var > 6)
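With var holding a plain string, !!var unquotes to the string "cyl" itself, so the condition compares a constant string with 6 and never looks at the column. Besides converting the string with sym(), a sketch of an alternative is the .data pronoun, which looks up a column by the name stored in a character variable:

```r
library(dplyr)

var <- "cyl"

# .data[[var]] refers to the column whose name is the string stored in var
res <- mtcars %>%
  filter(.data[[var]] > 6)

nrow(res)
```

This form keeps var as an ordinary character vector, which can be convenient when the column name comes from user input or a configuration value.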

R - group data frame from a variable [duplicate]

This question already has answers here:
dplyr: How to use group_by inside a function?
(4 answers)
Closed 6 years ago.
I want to store the name of the grouping column in a variable and then group and summarise the data frame by it, i.e.
require(dplyr)
var <- colnames(mtcars)[10]
summaries <- mtcars %>% dplyr::group_by(var) %>% dplyr::summarise_each(funs(mean))
so that I can simply change var and reuse the last line unchanged. Unfortunately, my solution does not work because group_by expects a bare column name, not a variable holding one.
Use group_by_, which takes arguments as character strings:
require(dplyr)
var <- colnames(mtcars)[10]
summaries <- mtcars %>% dplyr::group_by_(var) %>% dplyr::summarise_each(funs(mean))
(Maybe resources on standard vs non-standard evaluation would be of interest: http://adv-r.had.co.nz/Computing-on-the-language.html)
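The underscore verbs such as group_by_, along with summarise_each and funs, are deprecated in current dplyr releases. A sketch of one possible equivalent, assuming dplyr 1.0.0 or later, uses across() with all_of() to group by a column name held in a string:

```r
library(dplyr)

var <- colnames(mtcars)[10]  # "gear"

summaries <- mtcars %>%
  # all_of() selects the column named by the string in var
  group_by(across(all_of(var))) %>%
  # everything() here covers all non-grouping columns
  summarise(across(everything(), mean))

summaries
```

Changing var to another column name regroups the data without touching the pipeline.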

Using dplyr's select where variable names are quoted [duplicate]

This question already has answers here:
Pass a vector of variable names to arrange() in dplyr
(6 answers)
Closed 7 years ago.
Often I'll want to select a subset of variables where the subset is the result of a function. In this simple case, I first get all the variable names which pertain to width characteristics
library(dplyr)
library(magrittr)
data(iris)
width.vars <- iris %>%
names %>%
extract(grep(".Width", .))
Which returns:
>width.vars
[1] "Sepal.Width" "Petal.Width"
It would be useful to be able to use this result to select columns. (I'm aware that contains() and its siblings exist, but there are plenty of more complicated subsets I would like to perform; this example is deliberately trivial.)
If I attempt to use this vector to select columns, the following happens:
iris %>%
select(Species,
width.vars)
Error: All select() inputs must resolve to integer column positions.
The following do not:
* width.vars
How can I use dplyr::select with a vector of variable names stored as strings?
Within dplyr, most commands have an alternate version ending in '_' that accepts strings as input; in this case, select_. These are typically what you have to use when you are using dplyr programmatically.
iris %>% select_(.dots=c("Species",width.vars))
First of all, you can do the selection in dplyr with
iris %>% select(Species, contains(".Width"))
No need to create the vector of names separately. But if you did have a list of columns as string names, you could do
width.vars <- c("Sepal.Width", "Petal.Width")
iris %>% select(Species, one_of(width.vars))
See the ?select help page for all the available options.
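In more recent dplyr and tidyselect versions, select_ is deprecated and one_of() is superseded by all_of() and any_of(). A minimal sketch of the same selection, assuming tidyselect 1.0.0 or later:

```r
library(dplyr)

width.vars <- c("Sepal.Width", "Petal.Width")

# all_of() errors if a name is missing; any_of() would silently skip missing names
res <- iris %>%
  select(Species, all_of(width.vars))

names(res)
```

Choosing all_of() over any_of() is a design choice: it fails loudly when the vector of names and the data frame drift out of sync.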

Resources