Subsetting in r - getting NULL when I use a variable [duplicate] - r

This question already has answers here:
define $ right parameter with a variable in R [duplicate]
(3 answers)
Closed 7 years ago.
I'm trying to filter through a list of 100 lists with four columns of data to pull out individual columns and operate on them.
The columns are: Date/Time, Measurement 1, Measurement 2, Identity Variable
# filepull is a list of 100 lists
column_name <- "foo"
meanoflist <- NULL
for (i in 1:100) {
holder_variable<-filepull[[i]]
meanoflist[i]<-mean(na.omit(holder_variable$column_name))
}
holder_variable$"foo" gives me what I need, but holder_variable$column_name gives me NULL. What gives?
Thx for the help!

When you use the $ operator, its right-hand side is not evaluated; it is used as-is. So holder_variable$column_name looks for a column literally named column_name, which does not exist, hence the NULL. To get the column whose name is stored in a variable, use holder_variable[[column_name]], or holder_variable[, column_name] if holder_variable is a data.frame.
Take a look at this example, to better understand the difference.
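Here is a minimal sketch of the difference. The data frame and its values are made up for illustration; only the names holder_variable, column_name and "foo" come from the question.

```r
# Hypothetical stand-in for one element of filepull
holder_variable <- data.frame(foo = c(1, 2, NA), bar = c(4, 5, 6))
column_name <- "foo"

# $ takes the name literally: it looks for a column called "column_name"
literal_lookup <- holder_variable$column_name

# [[ evaluates its argument, so the value "foo" is used as the column name
by_variable <- holder_variable[[column_name]]

# For a data.frame, the [, name] form returns the same column
by_bracket <- holder_variable[, column_name]

mean_of_column <- mean(na.omit(by_variable))
```

In the loop from the question, the same change would be to replace holder_variable$column_name with holder_variable[[column_name]].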

Related

How to assign values to string when the data is very large R [duplicate]

This question already has answers here:
Create a numeric vector with names in one statement?
(6 answers)
Closed 10 months ago.
How to assign values to string when the data is very large.
Currently I assign values to character vectors manually as illustrated below, however, when the amount of data is very large it becomes tedious to do that process manually. Is there a function that allows me to do it?
c("a" = 100, "b" = 200, "c"=300, ..., "aaaaaa" = n)
Is there any particular meaning to your numbering? If you just need numerical values for strings, you can use as.factor and as.numeric as outlined in this post:
R: Encode character variables into numeric
However, if you need a specific encoding, you will have to specify the associated labels yourself. There isn't enough information in your question to help with this further, but the documentation is here:
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/factor
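As a rough sketch of both routes: if the names and the values already exist as two vectors, base R's setNames can pair them in one call, and as.factor followed by as.numeric gives arbitrary integer codes. The vectors below are hypothetical placeholders, not data from the question.

```r
# Hypothetical inputs: in practice these would be read from your data
labels <- c("a", "b", "c")
values <- c(100, 200, 300)

# Pair each value with its name without typing the pairs by hand
named_values <- setNames(values, labels)

# If any numeric code will do, factor levels provide one
# (levels are sorted, so "a" -> 1, "b" -> 2, "c" -> 3 here)
codes <- as.numeric(as.factor(c("b", "a", "c", "a")))
```

The first route keeps the values you chose; the second only guarantees that equal strings get equal codes.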

How to change a vector name in a function? [duplicate]

This question already has answers here:
Dynamically select data frame columns using $ and a character value
(10 answers)
Closed 1 year ago.
I want to produce graphical outputs such as boxplots or graphs using a function, so that I can plot several dataframes, changing only the column name each time.
For example :
boxplot_func = function(column){
boxplot(dataframe1$column, dataframe2$column)}
boxplot_func(mean)
boxplot_func(max)
etc.
But R doesn't seem to compute mean or max in the function. Do you know a way to do it ?
One option would be to pass the column as a character string and use [[ to access the column in your function:
A simple example using mtcars:
boxplot_func = function(column) {
boxplot(mtcars[[column]], mtcars[[column]])
}
boxplot_func("mpg")
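Applied to the two data frames from the question, the same idea might look like the sketch below. The contents of dataframe1 and dataframe2 are invented here, and the columns are assumed to be named "mean" and "max".

```r
# Hypothetical data; the real dataframe1/dataframe2 are not shown in the question
dataframe1 <- data.frame(mean = c(1.2, 2.5, 3.1, 2.8), max = c(4.0, 5.5, 6.1, 5.2))
dataframe2 <- data.frame(mean = c(2.0, 2.2, 3.9, 3.3), max = c(5.1, 6.4, 7.0, 6.6))

# column is a character string; [[ looks it up in each data frame
boxplot_func <- function(column) {
  boxplot(dataframe1[[column]], dataframe2[[column]],
          names = c("dataframe1", "dataframe2"), main = column)
}

boxplot_func("mean")
boxplot_func("max")
```

Note that the column names are passed in quotes. An unquoted mean would pass the function mean rather than a column name.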

How to point an R function to a particular column of a dataset? [duplicate]

This question already has answers here:
Dynamically select data frame columns using $ and a character value
(10 answers)
Closed 2 years ago.
I have a dataset called df, which has columns a and b with three integers each. I want to write a function for the mean (obviously this already exists; I want to write a larger function and this appears to be where problems are occurring). However, this function returns NA:
mean_function <- function(x) {
mean(df$x)
}
mean_function(a) returns NA, while mean(df$a) returns 2. Is there something I'm missing about how R functions handle datasets, or another problem?
We need [[ instead of $, because $ looks literally for a column named x. With [[, pass the column name as a string:
mean_function <- function(x) {mean(df[[x]])}
mean_function("a")
If we need to pass an unquoted column name, capture it with substitute and convert it to a character string with deparse:
mean_function <- function(x) {
  # turn the unquoted argument (e.g. a) into the string "a"
  x <- deparse(substitute(x))
  mean(df[[x]])
}
mean_function(a)

R loop over string and use it to refer to column names [duplicate]

This question already has answers here:
Dynamically select data frame columns using $ and a character value
(10 answers)
Closed 4 years ago.
I have a data frame with column names 1990.x ... 2000.x, 1990.y, ... 2000.y. I want to replace NAs in the variables ending with ".x" with values computed from the .y variable of the corresponding year. It is an element-by-element computation of the formula 1990.x = 0.5+0.2*log(1990.y)
I wanted to do something like this:
for (v in colnames(df[ ,grepl(".x",names(df))])) {
print(v)
df$v <- ifelse(is.na(df$v), ols$coefficients[1]+ols$coefficients[2]*log(df$gsub(".x",".y",v)), df$v)
}
but this is not working. How can I make this loop work, or is there a better solution?
Thanks
The $ operator is a convenience that takes the column name literally, so it can't be used when the name is stored in a variable, such as v in your for loop. Use the [ operator (square brackets) instead, including for the matching .y column:
df[, v] <- ifelse(is.na(df[, v]), ols$coefficients[1] + ols$coefficients[2] * log(df[, gsub(".x", ".y", v, fixed = TRUE)]), df[, v])
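A self-contained sketch of the whole loop follows. The data frame and the coefs vector (standing in for ols$coefficients) are invented for illustration, and the pattern "\\.x$" is a suggested tightening so that only names ending in a literal ".x" match.

```r
# Toy data; check.names = FALSE keeps names that start with a digit
df <- data.frame(
  "1990.x" = c(1, NA, 3),
  "1990.y" = c(10, 20, 30),
  "1991.x" = c(NA, 5, 6),
  "1991.y" = c(40, 50, 60),
  check.names = FALSE
)
coefs <- c(0.5, 0.2)  # hypothetical stand-in for ols$coefficients

# Names ending in ".x"; the dot is escaped so it is not a regex wildcard
x_cols <- grep("\\.x$", names(df), value = TRUE)

for (v in x_cols) {
  y_col <- sub("\\.x$", ".y", v)  # matching .y column for the same year
  df[[v]] <- ifelse(is.na(df[[v]]),
                    coefs[1] + coefs[2] * log(df[[y_col]]),
                    df[[v]])
}
```

Non-missing .x values are left untouched; only the NA entries are filled from the formula.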

Obtaining the name of a data frame as a string in R [duplicate]

This question already has answers here:
using substitute to get argument name with
(2 answers)
Closed 8 years ago.
I have a function that takes a data frame as an argument. In that function I need to use an if/else construct to do some actions based on which data frame was passed. For example, I need to be able to say
if (name of data frame=="Anthro_Data") {do this}
else if (the name of data frame=="Sports") {do that}.
The problem I am having is that I don't know how to get the name of the data frame (as a string) in order to use it. Any suggestions!
You can use deparse and substitute to get the name of the argument passed to your function:
a <- 1
f <- function(arg) deparse(substitute(arg))
f(a)
# [1] "a"
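Plugged into the if/else from the question, this could look like the sketch below. The function name describe_data, the returned strings and the toy data frames are all hypothetical; only the names Anthro_Data and Sports come from the question.

```r
describe_data <- function(data) {
  # Name of the object as written by the caller, e.g. "Sports"
  data_name <- deparse(substitute(data))
  if (data_name == "Anthro_Data") {
    "anthropometric branch"
  } else if (data_name == "Sports") {
    "sports branch"
  } else {
    "unknown data frame"
  }
}

# Toy data frames, made up for illustration
Anthro_Data <- data.frame(height = c(170, 180))
Sports <- data.frame(score = c(1, 2))

anthro_result <- describe_data(Anthro_Data)
sports_result <- describe_data(Sports)
```

This only sees the name when the function is called directly with a bare variable. If the data frame is passed through another function or built inline, deparse(substitute(data)) returns that expression instead.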
