Cannot use a variable named with numbers in R - r

I have some dataframes named as:
1_patient
2_patient
3_patient
Now I am not able to access its variables. For example:
I am not able to obtain:
2_patient$age
If I press tab when writing the name, it automatically gets quoted, but I am still unable to use it.
Do you know how I can solve this?

Object names that start with a digit are not syntactically valid in R, so naming objects this way is not recommended. We can still refer to such an object by wrapping its name in backquotes and then extract the value from it
`1_patient`$age
If there is more than one object, we can use mget to return the objects in a list and then extract the 'age' column by looping over the list with lapply
mget(ls(pattern = "^\\d+_mtcars$"))
#$`1_mtcars`
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
lapply(mget(ls(pattern = "^\\d+_patient$")), `[[`, 'age')
Using a small reproducible example
data(mtcars)
`1_mtcars` <- head(mtcars, 2)
1_mtcars$mpg
Error: unexpected input in "1_"
`1_mtcars`$mpg
#[1] 21 21
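As a small sketch of the same idea with string-based lookups: get() and mget() take object names as strings, so no backquotes are needed there, and make.names() can turn the names into syntactically valid ones once the data frames sit in a list. The two data frames and their age values below are made up for illustration.

```r
# Made-up stand-ins for the data frames in the question
`1_patient` <- data.frame(age = c(34, 51))
`2_patient` <- data.frame(age = c(29, 62))

# get() takes the name as a string, so backquotes are not needed
ages_2 <- get("2_patient")$age

# Collect all matching objects in a named list
patients <- mget(ls(pattern = "^\\d+_patient$"))

# make.names() prepends "X" to names that start with a digit
names(patients) <- make.names(names(patients))

# The list elements can now be accessed without backquotes
patients$X2_patient$age
```

Keeping the data frames in one list from the start avoids the naming problem altogether.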

Related

Create subset of ranges and individual items

I'm using R and have a dataset with ~3000 psychological test data. The data is all dyadic in male-female partners (though this shouldn't matter for you). I'm creating a new data frame with just the variables of interest, most of them are not sequentially listed in the original data so I select them by name like below:
new_df <- subset(data, select=c("MQ4", "FQ4", #RX STATUS
"MQ9", "FQ9", #ETHNICITY
"MQ10", "FQ10", #RACE
"MQ465", "FQ465", #SEX
"MQ13", "FQ13", #GENDER
"MQ14", "FQ14", #SEXORIENT
"MQ180", "MQ181", "MQ182", "MQ182" ### HERE IS WHERE I NEED HELP
))
However, I have about 150 unique items that are listed sequentially and I'd like to select them without writing out "MQ180" through "MQ310" one by one. I've been trying to figure out a way to select the range in addition to the individual items I am already selecting by name. This is currently what I'm trying:
new_df <- subset(data, select=c("MQ4", "FQ4", #RX STATUS
"MQ9", "FQ9", #ETHNICITY
"MQ10", "FQ10", #RACE
"MQ465", "FQ465", #SEX
"MQ13", "FQ13", #GENDER
"MQ14", "FQ14", #SEXORIENT
163:310 ### HERE IS WHERE I NEED HELP
))
One option:
dplyr::select(mtcars, "cyl", 5:8)
This subsets the mtcars dataframe to just the cyl column and the 5th thru 8th column:
cyl drat wt qsec vs
Mazda RX4 6 3.90 2.620 16.46 0
Mazda RX4 Wag 6 3.90 2.875 17.02 0
Datsun 710 4 3.85 2.320 18.61 1
Here's a base R alternative but there's probably a better way (mtcars['cyl'] is used instead of mtcars[, 'cyl'] so that the column keeps its name):
cbind(mtcars['cyl'], mtcars[, 5:8])
mtcars originally (drat, wt, qsec and vs are columns 5 to 8):
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
In the select argument of subset, we can turn the position range into column names with names(data)[163:310], so that the whole vector is character
subset(data, select=c("MQ4", "FQ4", #RX STATUS
"MQ9", "FQ9", #ETHNICITY
"MQ10", "FQ10", #RACE
"MQ465", "FQ465", #SEX
"MQ13", "FQ13", #GENDER
"MQ14", "FQ14", #SEXORIENT
names(data)[163:310]
))
The issue arises because an atomic vector can hold only a single type. When character and integer values are combined with c(), the integers are coerced to character, so subset looks for columns named "163", "164" and so on instead of treating the values as position indices
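The coercion can be seen on a toy example. The data frame toy and its five columns are made up for illustration and stand in for the much wider data in the question.

```r
# Made-up data frame standing in for the question's `data`
toy <- data.frame(MQ4 = 1, FQ4 = 2, MQ180 = 3, MQ181 = 4, MQ182 = 5)

# Mixing names and positions: the integers are coerced to character
mixed <- c("MQ4", 3:5)
mixed_class <- class(mixed)   # "character"; mixed is "MQ4" "3" "4" "5"

# Converting the positions to names first keeps everything character
res <- subset(toy, select = c("MQ4", "FQ4", names(toy)[3:5]))
names(res)
```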

Adding tidyselect helper functions to a vector [duplicate]

I often create a "vector" of the variables I use most often while I'm coding. Usually if I just input the vector object in select it works perfectly. Is there any way I can also use the helper functions when they are stored as strings?
For example I could do
library(dplyr)
x = c('matches("cyl")')
mtcars %>%
select_(x)
but this is not preferable because 1) select_ is deprecated and 2) it's not scalable (i.e., x = c('hp', 'matches("cyl")') will not grab both the relevant columns).
Is there any way I could use tidyselect helper functions as part of a vector?
Note: if I do something like:
x = c(matches("cyl"))
#> Error: `matches()` must be used within a *selecting* function.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-selection-context.html>.
I get an error, so I'll definitely need to enquo it somehow.
You are trying to turn a string into code which might not be the best approach. However, you can use parse_exprs with !!!.
library(dplyr)
library(rlang)
x = c('matches("cyl")')
mtcars %>% select(!!!parse_exprs(x))
# cyl
#Mazda RX4 6
#Mazda RX4 Wag 6
#Datsun 710 4
#Hornet 4 Drive 6
#Hornet Sportabout 8
#...
x = c('matches("cyl")', 'hp')
mtcars %>% select(!!!parse_exprs(x))
# cyl hp
#Mazda RX4 6 110
#Mazda RX4 Wag 6 110
#Datsun 710 4 93
#Hornet 4 Drive 6 110
#Hornet Sportabout 8 175
#....
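If the goal is only to reuse a stored set of column names alongside helpers, a sketch that avoids parsing strings is to keep the plain names in a character vector and combine them with the helpers inside select itself. The vector name common_cols is made up for illustration; all_of() is the tidyselect helper for character vectors, re-exported by recent dplyr versions.

```r
library(dplyr)

# Plain column names live in an ordinary character vector (made-up name)
common_cols <- c("hp", "wt")

# Helpers are written directly inside select(), next to all_of()
res <- mtcars %>%
  select(all_of(common_cols), matches("cyl"))

names(res)
```

This keeps the helper calls as real code, so no string needs to be turned into an expression.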

Indexing by column name to the end of the dataframe - R

I'm wondering if there is a way to select a group of columns by the name of the first column in the group and then all the next columns either a) to the end of the data frame, or b) to another column, also using its name.
a) As an example for the first question, in the mtcars dataset, is there a way to select the columns from drat to the end of the data frame? (Something like mtcars[,'drat':ncol(mtcars)])
b) For the second question, is there a way to select the columns starting at cyl and ending at wt? (Something like mtcars[,'cyl':'wt'])
Many elegant solutions already provided but one can even use base-R to get the desired result using which as:
Ans a:
mtcars[,which(names(mtcars) == "drat"):ncol(mtcars)]
Ans b:
mtcars[,which(names(mtcars) == "cyl"):which(names(mtcars) == "wt")]
# cyl disp hp drat wt
#Mazda RX4 6 160.0 110 3.90 2.620
#Mazda RX4 Wag 6 160.0 110 3.90 2.875
#Datsun 710 4 108.0 93 3.85 2.320
#Hornet 4 Drive 6 258.0 110 3.08 3.215
#Hornet Sportabout 8 360.0 175 3.15 3.440
#......so on
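The two which() calls can be wrapped in a small helper. This is only a sketch: cols_between is a made-up name, and it uses match() so that a misspelled column name gives a clear error instead of the less obvious failure from an empty which() result.

```r
# cols_between() is a hypothetical helper, not part of base R
cols_between <- function(df, from, to = names(df)[ncol(df)]) {
  # Look up the positions of both boundary columns by name
  pos <- match(c(from, to), names(df))
  if (anyNA(pos)) {
    stop("column not found: ", paste(c(from, to)[is.na(pos)], collapse = ", "))
  }
  # drop = FALSE keeps a data frame even when a single column is selected
  df[, pos[1]:pos[2], drop = FALSE]
}

# a) from drat to the end of the data frame
a <- names(cols_between(mtcars, "drat"))

# b) from cyl to wt
b <- names(cols_between(mtcars, "cyl", "wt"))
```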
We can do this with select from dplyr
Answer a)
mtcars %>% select(drat:last_col())
Answer b)
mtcars %>% select(cyl:wt)
In dplyr, the select function does exactly this (no quotes needed):
mtcars %>%
select(cyl:wt)
If we need to use a quoted string, convert it to a symbol with sym and then unquote it with !!
mtcars %>%
select(!! (rlang::sym("cyl")): !!(rlang::sym("wt")))
This is useful when the column names are stored in objects
a <- "cyl"
b <- "wt"
mtcars %>%
select(!! (rlang::sym(a)): !!(rlang::sym(b)))
Or another option is
mtcars %>%
select(!! rlang::parse_expr(glue::glue("{a}:{b}")))

Is there a way to use the ggplot aes call without inputting the column name, just the column number?

EXAMPLE DATASET:
mtcars
mpg cyl disp hp drat wt ...
Mazda RX4 21.0 6 160 110 3.90 2.62 ...
Mazda RX4 Wag 21.0 6 160 110 3.90 2.88 ...
Datsun 710 22.8 4 108 93 3.85 2.32 ...
............
Recommended ggplot way:
ggplot(mtcars, aes(x = mpg)) + geom_histogram()
The way I want to do it:
ggplot(mtcars,aes(x=[,1]) +geom_histogram
or
ggplot(mtcars,aes(x=[[1]]))+geom_histogram
Why can't ggplot let me call out my variable by its column? I need to call it out by column number not name. Why is ggplot so strict here? Any work around for this?
The problem you're facing is that the ggplot aes argument evaluates within the data.frame that you pass it. A column name is a string, and can't be properly evaluated the same way.
Fortunately, there is a solution: use the aes_string option, as follows:
library(ggplot2)
my_data <- mtcars
names(my_data)
ggplot(my_data, aes_string(x=names(my_data)[1]))+
geom_histogram()
This works because names(my_data)[1] returns a string, and is perfectly acceptable for the aes_string option.
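In ggplot2 3.0.0 and later, aes_string is soft-deprecated. A sketch of the same idea with the .data pronoun inside a regular aes call is shown below; the variable name col_name and the choice of bins = 10 are only for illustration.

```r
library(ggplot2)

my_data <- mtcars

# Look the column name up by position, then use it as a string
col_name <- names(my_data)[1]   # "mpg"

# .data[[...]] lets aes() take a column name held in a string
p <- ggplot(my_data, aes(x = .data[[col_name]])) +
  geom_histogram(bins = 10) +
  xlab(col_name)   # label the axis with the actual column name
```

Printing p draws the histogram; the column is still chosen by its position in names(my_data).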

Adding objects together in R (like ggplot layers)

I'm doing OOP R and was wondering how to make it so the + can be used to add custom objects together. The most common example of this I've found is in ggplot2 w/ adding geoms together.
I read through the ggplot2 source code and found this
https://github.com/hadley/ggplot2/blob/master/R/plot-construction.r
It looks like "%+%" is being used, but it's not clear how that eventually translates into the plain + operator.
You just need to define a method for the generic function +. (At the link in your question, that method is "+.gg", designed to be dispatched by arguments of class "gg".)
## Example data of a couple different classes
dd <- mtcars[1, 1:4]
mm <- as.matrix(dd)
## Define method to be dispatched when one of its arguments has class data.frame
`+.data.frame` <- function(x,y) rbind(x,y)
## Any of the following three calls will dispatch the method
dd + dd
# mpg cyl disp hp
# Mazda RX4 21 6 160 110
# Mazda RX41 21 6 160 110
dd + mm
# mpg cyl disp hp
# Mazda RX4 21 6 160 110
# Mazda RX41 21 6 160 110
mm + dd
# mpg cyl disp hp
# Mazda RX4 21 6 160 110
# Mazda RX41 21 6 160 110
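The same mechanism works for a class of your own. Below is a minimal sketch with a made-up S3 class called pipeline; the constructor new_pipeline and the step names are hypothetical and only show how the + method is dispatched and chained.

```r
# Hypothetical constructor for a toy S3 class
new_pipeline <- function(steps = character()) {
  structure(list(steps = steps), class = "pipeline")
}

# Method dispatched by `+` when its arguments have class "pipeline"
`+.pipeline` <- function(e1, e2) {
  stopifnot(inherits(e1, "pipeline"), inherits(e2, "pipeline"))
  new_pipeline(c(e1$steps, e2$steps))
}

# Optional print method so the result displays nicely
print.pipeline <- function(x, ...) {
  cat("<pipeline>", paste(x$steps, collapse = " -> "), "\n")
  invisible(x)
}

# `+` is left-associative, so the calls chain like ggplot layers
p <- new_pipeline("load") + new_pipeline("clean") + new_pipeline("plot")
p$steps
```

Because each call returns another pipeline object, any number of objects can be added in sequence.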
