R call multiple columns' elements with $ operator - r

Is there something in R to call like df$col1:df$col5?
I would like to convert the character elements to numeric with as.numeric, so I would like to do something like as.numeric(df$col1:df$col5) to convert all elements in these columns to numeric.

df = mtcars
If you want to access multiple columns by column number:
lapply(df[, c(1:3, 5)], as.numeric) # or as.character if you want
If you want to access them by column name:
lapply(df[,c('mpg','cyl')], as.numeric)
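Note that the lapply() calls above return a list and leave df itself unchanged, so the result has to be assigned back to the same columns. A minimal sketch, assuming a small made-up data frame `toy` (hypothetical, not from the question) whose numbers are stored as character strings:

```r
# Hypothetical example data: numbers stored as character strings
toy <- data.frame(
  col1  = c("1", "2", "3"),
  col2  = c("4.5", "5.5", "6.5"),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Pick the columns by position (names work the same way)
num_cols <- c(1, 2)

# Convert each selected column and assign the list back into the data frame
toy[num_cols] <- lapply(toy[num_cols], as.numeric)

# Inspect the resulting column classes
sapply(toy, class)
```

The `label` column is left alone because it is not part of `num_cols`.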

You can use a numeric index to get a range of columns, as suggested in the comments.
But if the columns are not in order, you can construct a vector of names and use that (rather than writing the names out explicitly, as in the other answer):
my_cols <- paste0('col', 1:5)
my_df[, my_cols] <- lapply(my_df[, my_cols], as.numeric)

Related

Convert structure of columns in list of dataframes to character in R

I'm creating an empty list of dataframes that I will append later using lapply.
library(tidyverse)
library(dplyr)
library(purrr)
my.list <- lapply(1:192, function(x, nr = 468, nc = 1) { data.frame(symbol = matrix(nrow=nr, ncol=nc)) })
str(my.list)
If you obtain the structure of my.list you will notice that the structure of the columns within each dataframe is "logical". I would like the structure of the column in each dataframe to be character rather than logical.
Can I change anything within my lapply function above so that the columns in the resulting list of dataframes are character? Or how best would I go about this task? I'm creating this empty list of dataframes because I understand that R works faster if it doesn't have to constantly append files. Thus my next step is to perform a map function to populate each dataframe in this list of dataframes with character data.
The issue is that a bare NA is logical by default (NA_logical_). To create a character column, use NA_character_ instead. The existing list can be fixed with
my.list <- lapply(my.list, function(x) {x[] <- lapply(x, as.character); x})
Or, while creating the data.frame column, use
my.list <- lapply(1:192, function(x) data.frame(symbol = rep(NA_character_, 468), stringsAsFactors = FALSE))
Building a single-column data.frame via matrix is not ideal and can give the wrong result (a matrix can hold only a single type, whereas data.frame columns can each have a different type). The easiest option is to repeat NA_character_ n times to create a single-column data.frame with n rows. Passing stringsAsFactors = FALSE keeps the column character in R versions where data.frame would otherwise convert strings to factors.
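The second approach can be checked on a smaller scale. This is a minimal sketch, assuming 3 data frames of 4 rows each in place of the 192 and 468 from the question; the names `n_frames`, `n_rows` and `my_list` are illustrative only:

```r
n_frames <- 3  # the question uses 192
n_rows   <- 4  # the question uses 468

# Build a list of single-column data frames pre-filled with character NAs
my_list <- lapply(seq_len(n_frames), function(i) {
  data.frame(symbol = rep(NA_character_, n_rows), stringsAsFactors = FALSE)
})

# Check that every data frame has a character `symbol` column
all(vapply(my_list, function(d) is.character(d$symbol), logical(1)))
```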

Converting to string all values in a data frame in R

How can I make sure each single value contained in a dataframe is a string?
Moreover, is there a way I can add a prefix to each value contained in a dataframe? (for example, turning a 0.02 to "X0.02")
We can loop through the columns of the data.frame with lapply, convert each one to character, and assign the output back to the dataset. Assigning into dat[] keeps the data.frame structure and attributes, instead of replacing dat with a plain list
dat[] <- lapply(dat, as.character)
Or, if there is at least one character column, converting to a matrix and then back to a data.frame will also make every element character
as.data.frame(as.matrix(dat), stringsAsFactors = FALSE)
For the second case (adding a prefix to each value)
dat[] <- lapply(dat,function(x) paste0("X", x))
Or in tidyverse
library(dplyr)
library(stringr)
dat %>%
  mutate_all(list(~ str_c("X", .)))
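To see the base R versions above on concrete values, here is a minimal sketch assuming a small made-up data frame (the names `dat`, `dat_chr` and `dat_pre` and the column contents are illustrative only):

```r
# Hypothetical mixed-type data
dat <- data.frame(
  value = c(0.02, 1.5),
  n     = c(1L, 2L),
  id    = c("a", "b"),
  stringsAsFactors = FALSE
)

# Every column converted to character; dat_chr[] keeps the data.frame structure
dat_chr <- dat
dat_chr[] <- lapply(dat_chr, as.character)

# Every value prefixed with "X"; paste0() converts to character on its own
dat_pre <- dat
dat_pre[] <- lapply(dat_pre, function(x) paste0("X", x))

str(dat_chr)
dat_pre$value
```

Working on copies keeps the original `dat` available for comparison.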

I have a data frame of char variables and I want to convert the df values to upper/lower case in r

I have a data frame with 6k plus rows and 10 variables. I want to convert the char variables to uppercase without changing the str of df.
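df[sapply(df, is.character)] <- lapply(df[sapply(df, is.character)], toupper)
Explanation
We select only the character columns with sapply(df, is.character), apply the vectorized function toupper to each of them with lapply, and assign the result back to the same columns of the data.frame. Calling toupper on the data.frame subset directly does not work, because toupper would first coerce the whole subset with as.character. Use tolower in the same way for lowercase.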
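Because toupper() coerces a non-character argument with as.character(), it is safer to apply it column by column than to a data frame as a whole. A minimal sketch, assuming a small made-up data frame `df_chr` (hypothetical names and values):

```r
# Hypothetical data: two character columns and one numeric column
df_chr <- data.frame(
  name = c("alice", "bob"),
  city = c("oslo", "rome"),
  age  = c(30, 41),
  stringsAsFactors = FALSE
)

# Logical vector marking the character columns
is_chr <- vapply(df_chr, is.character, logical(1))

# Upper-case each character column; the numeric column is untouched
df_chr[is_chr] <- lapply(df_chr[is_chr], toupper)

str(df_chr)
```

The structure of the data frame (dimensions and column types) stays the same; only the character values change.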

Converting factors to numeric in R

I have hundreds of columns in my database stored as factors. They actually contain numbers, but R treats them as factors. For my project, I want to convert them to numeric.
I can do that in bulk using sapply or a for loop. However, I am not sure how to check whether a variable contains numbers. I cannot just check is.factor(var_name), as the database also contains character variables that are stored as factors.
Is there some other way to perform a check like the one below?
if (is.numeric(var_name)) {
convert the variable to numeric
}
I am looking for something similar to stringsAsFactors = FALSE,
which keeps a character variable as character instead of converting it to a factor.
Any help/pointer would be really helpful.
One way would be to use type.convert after converting all the columns to character
df1[] <- lapply(df1, function(x) type.convert(as.character(x)))
Now, the non-numeric character columns will be converted to factor class. We can reconvert those columns back to character
df1[] <- lapply(df1, function(x) if(is.factor(x)) as.character(x) else x)
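The two steps can also be combined by passing as.is = TRUE to type.convert, which leaves non-numeric strings as character instead of turning them into factors. A minimal sketch, assuming a small made-up `df1` (the column names are hypothetical):

```r
# Hypothetical data: every column stored as a factor
df1 <- data.frame(
  num_as_factor  = factor(c("10", "20", "30")),
  dec_as_factor  = factor(c("1.5", "2.5", "3.5")),
  text_as_factor = factor(c("a", "b", "c"))
)

# Convert via character so the factor labels, not the internal codes, are parsed;
# as.is = TRUE keeps non-numeric columns as character
df1[] <- lapply(df1, function(x) type.convert(as.character(x), as.is = TRUE))

sapply(df1, class)
```

Whole-number columns come back as integer rather than double; wrap the result in as.numeric() if a double is required.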

renaming subset of columns in r with paste0

I have a data frame (my_df) with columns named after individual county numbers. I melted/cast the data from a much larger set to get to this point. The first column name is year and it is a list of years from 1970-2011. The next 3010 columns are counties. However, I'd like to rename the county columns to be "column_"+county number.
This code executes in R but for whatever reason doesn't update the column names. They remain solely the numbers... any help?
new_col_names = paste0("county_",colnames(my_df[,2:ncol(my_df)]))
colnames(my_df[,2:ncol(my_df)]) = new_col_names
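The problem is the subsetting within the colnames call: the new names are set on the extracted subset of columns rather than on my_df itself, so the names of my_df never change.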
Try names(my_df) <- c(names(my_df)[1], new_col_names) instead.
Note: names and colnames are interchangeable for data.frame objects.
EDIT: alternate approach suggested by flodel, subsetting outside the function call:
names(my_df)[-1] <- new_col_names
colnames() is designed for a matrix (or matrix-like object); for a data.frame, try simply names().
Example:
my_df <- data.frame(a=c(1,2,3,4,5), b=rnorm(5), c=rnorm(5), d=rnorm(5))
new_col_names <- paste0("county_", colnames(my_df[,2:ncol(my_df)]))
names(my_df) <- c(names(my_df)[1], new_col_names)
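For a case closer to the question, here is a minimal sketch assuming a made-up data frame with a `year` column followed by columns named after county numbers (the county numbers and values are invented for illustration). check.names = FALSE is used so that data.frame() keeps the purely numeric column names:

```r
# Hypothetical data: first column is year, the rest are county numbers
my_df <- data.frame(
  year   = 1970:1972,
  `1001` = c(5, 6, 7),
  `1003` = c(8, 9, 10),
  check.names = FALSE
)

# Subset the names vector (not the data frame) and assign the new names
names(my_df)[-1] <- paste0("county_", names(my_df)[-1])

names(my_df)
```

Subsetting outside the names() call means the replacement acts on the names of my_df directly.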
