Change several columns into factors - R [duplicate]

This question already has answers here:
Coerce multiple columns to factors at once
(11 answers)
Closed 5 years ago.
I have a data frame with 203 columns.
Some of them are strings, and I want to convert them to factors,
like this:
data$var <- as.factor(data$var)
The problem is that there are 180 string columns. Is there a way to do this with a loop or sapply? Anything that saves me from writing this code 180 times would help.

I had the same issue when converting to numeric.
The solution for me was to use the following:
data[] <- lapply(data, function(x) as.numeric(as.character(x)))
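
For the factor case asked about in the question, the same lapply idea can be restricted to the string columns. This is only a minimal sketch: the example data frame and its column names are made up for illustration, and it assumes the string columns are stored as character vectors.

```r
# Hypothetical stand-in for the 203-column data frame in the question.
data <- data.frame(
  id    = 1:3,
  city  = c("Paris", "Lyon", "Paris"),
  color = c("red", "blue", "red"),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are character.
is_chr <- vapply(data, is.character, logical(1))

# Convert only those columns; all other columns are left untouched.
data[is_chr] <- lapply(data[is_chr], as.factor)

str(data)
```

Selecting columns with a logical vector avoids typing the 180 names by hand. If only some of the string columns should become factors, a character vector of names could be used in place of is_chr.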

Related

How do I subset a dataframe's columns if the data is all the same? [duplicate]

This question already has answers here:
How to remove columns with same value in R
(4 answers)
Closed 2 years ago.
I have a really large dataset and I want to filter out some of the columns because they contain the same value throughout (e.g., company name is always "Walmart"). I can go through and do this manually, but I'm looking for code to do it automatically.
I had in mind a function to subset based on whether sum(unique(colnam)) == 1, but I'm not sure how to get it to work. Thanks.
which(sapply(dat, function(col) length(unique(col)) == 1))
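
The line above reports which columns are constant. To actually drop them, the same test can be negated and used as a column selector. A minimal sketch, with a made-up data frame (dat, company, store, region are illustrative names only):

```r
# Hypothetical example data: `company` is the same in every row.
dat <- data.frame(
  company = c("Walmart", "Walmart", "Walmart"),
  store   = c(1, 2, 3),
  region  = c("N", "S", "N"),
  stringsAsFactors = FALSE
)

# TRUE for every column that holds a single distinct value.
is_constant <- vapply(dat, function(col) length(unique(col)) == 1, logical(1))

# Keep only the columns that vary; drop = FALSE keeps the result a data frame
# even if a single column remains.
dat_varying <- dat[, !is_constant, drop = FALSE]

names(dat_varying)
```

Note that unique() counts NA as a value of its own, so a column with one value plus some NAs is treated as varying here.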

Conversion of datatype for multiple columns at once in R [duplicate]

This question already has answers here:
Change the class from factor to numeric of many columns in a data frame
(16 answers)
How to convert a factor to integer\numeric without loss of information?
(12 answers)
Closed 2 years ago.
So I have a CSV file with around 60 variables that I have imported into R, and I'd like to convert the columns to data types like numeric, POSIXt, and Boolean. In my previous files I needed to do so only for a limited set of columns, so I simply used lines like these 4-5 times:
data$var1 <- as.numeric(as.character(data$var1))
data$var2 <- as.numeric(as.character(data$var2))
Now, however, I want to convert 60 columns and I'd rather not write 60 lines of code again and again. Could someone help me come up with a more efficient way?
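
One possible approach is to group the column names by target type and convert each group with lapply. The sketch below is illustrative only: the column names, the example values, and the assumption that the columns were read in as factors are all made up, not taken from the question.

```r
# Hypothetical example data, read in as factors.
data <- data.frame(
  var1 = c("1.5", "2", "3"),
  var2 = c("10", "20", "30"),
  when = c("2020-01-01 10:00:00", "2020-01-02 11:30:00", "2020-01-03 09:15:00"),
  flag = c("TRUE", "FALSE", "TRUE"),
  stringsAsFactors = TRUE
)

# Column names grouped by the type they should end up as.
num_cols  <- c("var1", "var2")
time_cols <- "when"
bool_cols <- "flag"

# Go through as.character() first so factor codes are never used as values.
data[num_cols]  <- lapply(data[num_cols],  function(x) as.numeric(as.character(x)))
data[time_cols] <- lapply(data[time_cols], function(x) as.POSIXct(as.character(x), tz = "UTC"))
data[bool_cols] <- lapply(data[bool_cols], function(x) as.logical(as.character(x)))

str(data)
```

With 60 columns, the three name vectors are the only part that grows. The time zone "UTC" is an assumption and should be replaced with whatever the timestamps actually use.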

Select only numeric variables of a data frame in R [duplicate]

This question already has answers here:
Why does apply convert logicals in data frames to strings of 5 characters?
(2 answers)
Selecting only numeric columns from a data frame
(12 answers)
Closed 2 years ago.
I know that the question is very easy, but I have a more specific one:
I have a data frame with 50 variables (numeric and non-numeric) and 5000 observations.
Now what I want to do is create another data frame containing only the numeric variables of the original one.
On this website I found the solution to my problem, which is:
numeric_variables<-unlist(lapply(original_data,is.numeric))
X<-original_data[numeric_variables]
But I was wondering: why does it not work if I try it like this instead? What's wrong?
numeric_variables2<-apply(original_data,2,is.numeric)
x<-original_data[numeric_variables2]
Try this:
names_num <- names(which(sapply(df, is.numeric)))
df_num <- df[, names_num]
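
As for why the apply version fails: apply converts the data frame to a matrix first, and a matrix can hold only one type. With a mix of numeric and non-numeric columns everything becomes character, so is.numeric returns FALSE for every column. A small sketch with made-up data (the columns a, b, c are illustrative):

```r
# Hypothetical mixed-type data frame.
original_data <- data.frame(
  a = c(1.5, 2.5, 3.5),
  b = c("x", "y", "z"),
  c = c(10L, 20L, 30L),
  stringsAsFactors = FALSE
)

# apply() coerces to a character matrix, so no column looks numeric.
via_apply <- apply(original_data, 2, is.numeric)

# vapply() visits each column as it is stored in the data frame.
via_vapply <- vapply(original_data, is.numeric, logical(1))

# Subsetting with the vapply result keeps only the numeric columns.
X <- original_data[via_vapply]

print(via_apply)
print(via_vapply)
```

Subsetting with the all-FALSE result from apply does not raise an error; it simply returns a data frame with no columns, which is likely why it looks like it "does not work".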

as.double()/as.numeric() not working [duplicate]

This question already has answers here:
How to convert a factor to integer\numeric without loss of information?
(12 answers)
Closed 5 years ago.
I want the program to read the data as doubles, but when I use as.double and as.numeric, the data itself changes.
[Image: original data, in fractions]
After applying as.double to each column separately and combining the results into a data frame, the data looks like this:
[Image: changed data values after applying as.double()]
Your data are probably factors (not character vectors).
To convert column x to numeric, use as.numeric(levels(x))[x]
This can also help.
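
To see the difference, here is a small sketch with a made-up factor (the values are illustrative, not the asker's data). Calling as.numeric directly on a factor returns the internal level codes, while going through the levels recovers the original numbers.

```r
# Hypothetical factor holding fractional values as labels.
x <- factor(c("0.5", "0.25", "0.75", "0.25"))

# Direct conversion gives the integer level codes, not the values.
codes <- as.numeric(x)

# Converting the levels and indexing by the factor gives the real values.
values <- as.numeric(levels(x))[x]

print(codes)
print(values)
```

as.numeric(as.character(x)) gives the same values and is often easier to read; the levels(x) form only converts each distinct level once.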

Need help in using R "as." function [duplicate]

This question already has answers here:
issue summing columns
(2 answers)
Closed 6 years ago.
a <- x
where x is a data frame with 120 columns.
sum(a$column1) == 0
This condition works fine.
I have a situation where I have to find each column's sum,
so I created a vector m with all the column names in it:
m <- colnames(a)
I tried passing the vector values to the sum function, but
sum(m[1]) == 0
throws an error. I'm not sure which as. function to use here.
Is sapply(x, sum) what you want?
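
For context, m[1] is just the column name as a string, and sum cannot add up a character value, which is why it errors. No as. function is needed; the name has to be used to look up the column. A minimal sketch with a made-up three-column data frame standing in for the 120-column one:

```r
# Hypothetical stand-in for the data frame in the question.
a <- data.frame(
  column1 = c(0, 0, 0),
  column2 = c(1, 2, 3),
  column3 = c(-1, 1, 0)
)
m <- colnames(a)

# Look the column up by name with [[ ]] instead of summing the name itself.
first_is_zero <- sum(a[[m[1]]]) == 0

# Sum every column at once; the result is a named numeric vector.
col_sums <- sapply(a, sum)

# Names of the columns whose sum is zero.
zero_cols <- names(col_sums)[col_sums == 0]

print(col_sums)
print(zero_cols)
```

If all columns are numeric, colSums(a) gives the same sums in a single call.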
