This question already has answers here:
Why does apply convert logicals in data frames to strings of 5 characters?
(2 answers)
Selecting only numeric columns from a data frame
(12 answers)
Closed 2 years ago.
I know this question looks basic, but mine is a bit more specific.
I have a data frame with 50 variables (numeric and non-numeric) and 5000 observations.
I want to create another data frame containing only the numeric variables of the original one.
On this website I found a solution to my problem:
numeric_variables<-unlist(lapply(original_data,is.numeric))
X<-original_data[numeric_variables]
But I was wondering: why does the following not work? What is wrong with it?
numeric_variables2<-apply(original_data,2,is.numeric)
x<-original_data[numeric_variables2]
Try this:
names_num <- names(which(sapply(df, is.numeric)))
df_num <- df[, names_num]
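As for why the apply version fails: apply() converts a data frame to a matrix before doing anything else, and a matrix can hold only one type. With even one non-numeric column, every cell becomes a character string, so is.numeric returns FALSE for all columns. Here is a minimal sketch, assuming a small made-up data frame in place of original_data (the column names id, score and label are hypothetical):

```r
# Hypothetical toy data standing in for original_data
original_data <- data.frame(
  id    = 1:3,
  score = c(2.5, 3.1, 4.8),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# apply() works on as.matrix(original_data), which is a character matrix
# here, so no column is reported as numeric
by_apply <- apply(original_data, 2, is.numeric)

# vapply() walks over the columns of the data frame itself,
# so each column keeps its own type
by_vapply <- vapply(original_data, is.numeric, logical(1))

# Keep only the numeric columns
numeric_only <- original_data[by_vapply]
```

In this sketch by_apply is FALSE for all three columns, while by_vapply is TRUE for id and score only.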
This question already has answers here:
How to remove columns with same value in R
(4 answers)
Closed 2 years ago.
I have a really large dataset and I want to filter out some of the columns because they contain the same value throughout (e.g., company name is always "Walmart"). I could go through and do this manually, but I'm looking for code that does it automatically.
I had in mind a function that subsets based on whether sum(unique(colnam)) == 1, but I'm not sure how to get it to work. Thanks.
which(sapply(dat, function(col) length(unique(col)) == 1))
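The line above only reports which columns are constant. To actually drop them, one option is to turn the same test into a logical mask and negate it. This is a minimal sketch, assuming a small made-up dat (the columns company, store, region and sales are hypothetical):

```r
# Hypothetical toy data: company and region never change
dat <- data.frame(
  company = rep("Walmart", 4),
  store   = c(1, 2, 3, 4),
  region  = rep("US", 4),
  sales   = c(10, 12, 9, 15)
)

# TRUE for every column that holds a single distinct value
is_constant <- vapply(dat, function(col) length(unique(col)) == 1, logical(1))

# Keep the columns that vary; drop = FALSE keeps the result a data frame
# even if only one column survives
dat_varying <- dat[, !is_constant, drop = FALSE]
```

Note that unique() counts NA as a value, so a column that is constant apart from some missing entries would be kept by this test.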
This question already has answers here:
how to remove multiple columns in r dataframe?
(8 answers)
Select column 2 to last column in R
(4 answers)
Closed 2 years ago.
I have a data frame that I import from Excel each week with 40+ columns. The number of columns differs from week to week, but I am only interested in the first 40. I take the data frame, drop the columns after 40, and rbind it to another data frame (that has 40 columns).
The code I use to drop the columns is:
df = df[ -c(41:45) ] # assume df has 45 columns this week; drops columns 41 to 45
I would like to automate the range of columns to drop, along the lines of length(df$x). I tried width(df), but it doesn't seem to work.
Any ideas please?
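One possible approach: the number of columns of a data frame is given by ncol(df), but since only the first 40 columns are wanted, it may be simpler to select those than to work out which ones to drop. A minimal sketch, assuming a made-up 45-column df (n_keep is a hypothetical helper variable):

```r
# Hypothetical stand-in for this week's import: 3 rows, 45 columns (V1..V45)
df <- as.data.frame(matrix(seq_len(3 * 45), nrow = 3))

n_keep <- 40

# Select the first n_keep columns, however many columns arrived this week;
# min() guards against a week with fewer than n_keep columns
df_first <- df[, seq_len(min(n_keep, ncol(df))), drop = FALSE]
```

The same result could be reached by dropping, with df[-((n_keep + 1):ncol(df))], but that form misbehaves when df has exactly n_keep columns, which is why the sketch selects instead.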
This question already has answers here:
Change the class from factor to numeric of many columns in a data frame
(16 answers)
How to convert a factor to integer\numeric without loss of information?
(12 answers)
Closed 2 years ago.
I have a CSV file with around 60 variables that I have imported into R, and I'd like to convert the columns to data types such as numeric, POSIXt, and Boolean. In my previous files I only needed to do this for a handful of columns, so I simply repeated lines like these 4-5 times:
data$var1 <- as.numeric(as.character(data$var1))
data$var2 <- as.numeric(as.character(data$var2))
Now, however, I want to convert 60 columns and would rather not write 60 lines of code again and again. Could someone help me come up with a more efficient way?
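One way to avoid the repetition is to collect the column names in a vector and apply the same conversion to all of them with lapply. A minimal sketch, assuming a small made-up data frame (the vector num_cols and the column name are hypothetical; var1 and var2 follow the question):

```r
# Hypothetical toy data: two factor columns that hold numbers
data <- data.frame(
  var1 = factor(c("10", "20", "30")),
  var2 = factor(c("1.5", "2.5", "3.5")),
  name = c("a", "b", "c")
)

# Names of the columns to convert; in practice this could list all 60
num_cols <- c("var1", "var2")

# Go through as.character() first so the factor labels are converted,
# not the underlying integer codes
data[num_cols] <- lapply(data[num_cols], function(x) as.numeric(as.character(x)))
```

Date-time or logical columns could be handled the same way with their own name vector and conversion function.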
This question already has answers here:
Extracting specific columns from a data frame
(10 answers)
Closed 4 years ago.
I have to extract two columns from this data set (Cars93 in the MASS package) and create a separate data frame consisting only of the two columns MPG.highway and EngineSize. How do I go about doing this?
You can load Cars93 from MASS and print just the first ten rows to see it.
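You can create the subset by indexing with the column names directly or, alternatively, with the subset function:
library(MASS) # Cars93 ships with the MASS package
new_df <- Cars93[, c("MPG.highway", "EngineSize")]
# or (the argument is select, not keep)
new_df <- subset(Cars93, select = c("MPG.highway", "EngineSize"))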
This question already has answers here:
Generate a dummy-variable
(17 answers)
Closed 5 years ago.
Beginner in R and looking to avoid unnecessary copy+pasting...
I have a data frame with a numeric column. I would like to create binary columns based on the values in the numeric column.
I know the tedious approach would be to copy+paste the following and manually add the different values:
DataFrame$NewCol1 <- as.numeric(DataFrame$ExistingCol == 1);
DataFrame$NewCol2 <- as.numeric(DataFrame$ExistingCol == 2);
Would a "for" loop be able to accomplish this task?
How about something like this?
model.matrix(~factor(DataFrame$ExistingCol))[,-1]
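To answer the "for" loop part directly: yes, a loop over the distinct values can build the columns. A minimal sketch, assuming a small made-up DataFrame (the NewCol naming follows the question):

```r
# Hypothetical toy data
DataFrame <- data.frame(ExistingCol = c(1, 2, 3, 2, 1))

# One indicator column per distinct value, named NewCol<value>
for (v in sort(unique(DataFrame$ExistingCol))) {
  DataFrame[[paste0("NewCol", v)]] <- as.numeric(DataFrame$ExistingCol == v)
}
```

Unlike the model.matrix call above, which leaves out the column for the first level, the loop creates one column for every value.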