Trying to isolate two columns in a data frame in R [duplicate]

This question already has answers here:
Extracting specific columns from a data frame
(10 answers)
Closed 6 years ago.
Does anybody know of a way to create a new data frame that contains specific columns from a master data frame that has many columns? I have a master data frame and I'm trying to run various tests (regression, ANOVA, etc.) on specific columns in it. Any suggestions would be greatly appreciated.

If you want to choose columns 3, 12, and 15 from the old DF:
newDF <- oldDF[,c(3,12,15)]
If you want to remove columns 3, 12, and 15 from the old DF:
newDF <- oldDF[,-c(3,12,15)]
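As a minimal sketch of the same idea, columns can also be selected by name, which is less fragile than positions if the column order changes. The built-in mtcars data set is used here only as a hypothetical stand-in for the master data frame, and the column names are those of mtcars, not of the asker's data.

```r
# Hypothetical stand-in for the master data frame: the built-in mtcars data set
oldDF <- mtcars

# Select by position, as in the answer above (mpg, hp, and wt are columns 1, 4, and 6)
by_index <- oldDF[, c(1, 4, 6)]

# Select the same columns by name
by_name <- oldDF[, c("mpg", "hp", "wt")]

# drop = FALSE keeps a data frame even when only one column is requested
one_col <- oldDF[, "mpg", drop = FALSE]

# A test such as a regression can also read its columns straight from the master data frame
fit <- lm(mpg ~ hp + wt, data = oldDF)
```

Many modelling functions take a data argument, so building a separate data frame is often optional.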

Related

Order a dataframe by a specific variable [duplicate]

This question already has answers here:
Sort (order) data frame rows by multiple columns
(19 answers)
Closed 3 months ago.
I am trying to order a data frame/matrix by the variable VIRUS, by descending or ascending it does not matter.
Anncol<-data.frame(Metadata$VIRUS) ##From left to case, add more if needed
sortedAnncol<-Anncol[order(Anncol$VIRUS),]
sAnncol<-as.matrix(sortedAnncol)
This is what I have tried so far, but I lose the first column of data, the corresponding data points in the data frame. How can I order the 'Anncol' data frame by the variable 'VIRUS' while simultaneously ordering the rest of the data frame?
Any help would be greatly appreciated! Thank you in advance
Here is a solution using dplyr:
library(dplyr)
Metadata <- Metadata %>% arrange(VIRUS)
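For comparison, a base R sketch of the same ordering: index the rows of the whole data frame with order(), so every column stays aligned with VIRUS. The attempt in the question built Anncol from the VIRUS column alone, so there were no other columns to carry along. The Metadata below is a made-up stand-in; only the VIRUS column name comes from the question, and SAMPLE and the values are assumptions.

```r
# Hypothetical stand-in for Metadata; only the VIRUS column name comes from the question
Metadata <- data.frame(
  SAMPLE = c("s1", "s2", "s3", "s4"),
  VIRUS  = c("HPV", "EBV", "HIV", "CMV"),
  stringsAsFactors = FALSE
)

# Reorder whole rows (note the trailing comma) so all columns move together
sortedMetadata <- Metadata[order(Metadata$VIRUS), ]

# Descending order
sortedDesc <- Metadata[order(Metadata$VIRUS, decreasing = TRUE), ]
```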

Automate data frame width in R [duplicate]

This question already has answers here:
how to remove multiple columns in r dataframe?
(8 answers)
Select column 2 to last column in R
(4 answers)
Closed 2 years ago.
I have a data frame with 40+ columns that I import from Excel each week. Each week the data frame has a different number of columns, but I am only interested in the first 40. I take the data frame, drop the columns after 40, and rbind it to another data frame (that has 40 columns).
The code I use to drop the columns is:
df = df[ -c(41:45) ] #assume df has 45 columns this week.
I would like to automate the number of columns to drop, along the lines of length(df$x). I tried width(df), but it doesn't seem to work.
Any ideas please?
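One possible approach, sketched under assumptions: ncol() returns the number of columns of a data frame, so the columns to keep can be chosen positively rather than listing the ones to drop. The 45-column data frame below is a made-up stand-in for one week's import, and keep is a hypothetical helper variable.

```r
# Hypothetical stand-in: a 45-column import for this week
df <- as.data.frame(matrix(seq_len(90), nrow = 2, ncol = 45))

keep <- 40  # number of leading columns to retain (assumed from the question)

# Keep the first `keep` columns, however many extra columns arrived this week;
# min() guards against a week with fewer than `keep` columns
df40 <- df[, seq_len(min(keep, ncol(df))), drop = FALSE]
```

Selecting what to keep avoids having to know the week's width at all, which is why it may be simpler than computing a range to drop.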

Select only numeric variables of a data frame in R [duplicate]

This question already has answers here:
Why does apply convert logicals in data frames to strings of 5 characters?
(2 answers)
Selecting only numeric columns from a data frame
(12 answers)
Closed 2 years ago.
I know that the question is very easy, but I have a more specific one:
I have a data frame, with 50 variables (numeric and non-numeric) and 5000 observations.
Now what I want to do is create another data frame containing only the numeric variables of the original one.
On this website I found the solution to my problem, that is:
numeric_variables<-unlist(lapply(original_data,is.numeric))
X<-original_data[numeric_variables]
But I was wondering: why does it not work if I try it like this instead? What's wrong?
numeric_variables2<-apply(original_data,2,is.numeric)
x<-original_data[numeric_variables2]
Try this:
names_num <- names(which(sapply(df, is.numeric)))
df_num <- df[, names_num]
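As for why the apply version fails: apply() first converts the data frame to a matrix, and a matrix holds a single type. If any column is non-numeric, every column becomes character, so is.numeric is FALSE throughout. The sketch below illustrates this on a small made-up data frame; the column names and values are assumptions, not the asker's data.

```r
# Hypothetical stand-in for original_data with mixed column types
original_data <- data.frame(
  id    = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5),
  count = c(1L, 2L, 3L),
  stringsAsFactors = FALSE
)

# apply() coerces to a character matrix here, so no column tests as numeric
via_apply <- apply(original_data, 2, is.numeric)

# sapply() visits each column as it is stored in the data frame
via_sapply <- sapply(original_data, is.numeric)

# vapply() is a stricter variant that also checks the result type
numeric_cols <- vapply(original_data, is.numeric, logical(1))
X <- original_data[numeric_cols]
```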

How to Extract Two Columns of Data in R [duplicate]

This question already has answers here:
Extracting specific columns from a data frame
(10 answers)
Closed 4 years ago.
I have to extract two columns from this data set (Cars93 on MASS) and create a separate folder consisting only of the two columns MPG.highway and EngineSize. How do I go about doing this?
You can look at Cars93 in MASS and print just the first ten rows to see it.
You can create a subset by indexing with the column names directly or, alternatively, with the subset function:
library(MASS)  # provides the Cars93 data set
new_df <- Cars93[,c("MPG.highway","EngineSize")]
#or
new_df <- subset(Cars93, select = c("MPG.highway","EngineSize"))

Trying to break down data frame by subject in one command [duplicate]

This question already has answers here:
How to split a data frame?
(8 answers)
Closed 6 years ago.
Say I have a large data frame in long format, with each subject occupying 5 rows, with 5 subjects in total.
x=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
df=data.frame(x, 1:25)
Now I want to separate this into 5 separate data frames, one for each subject. I know I could do this:
s01=df[df$x==1,]
5 times, but I want to create all five data frames in one go, using one command. Is there a way to do this (e.g. with a for loop or something like lapply)? I tried with a for loop but not sure how to get it to output 5 separate objects with different names.
You can simply do:
result <- split(df, df$x)
This returns a list of data frames, one for each value of column x. You can extract, for example, the first data frame with:
result[[1]]
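A short sketch of working with the resulting list, assuming the same layout as in the question. The second column is named y here for convenience, which is an assumption; in the original code it would get an automatic name.

```r
# Same layout as the question: 5 subjects, 5 rows each
x <- rep(1:5, each = 5)
df <- data.frame(x, y = 1:25)  # column name y is an assumption

result <- split(df, df$x)

# List elements are named after the values of x, so they can be looked up by name
subject_ids <- names(result)
subject3 <- result[["3"]]

# Apply the same computation to every subject's data frame
subject_means <- sapply(result, function(d) mean(d$y))
```

Keeping the pieces in one list, rather than five separately named objects, makes it straightforward to run the same analysis on each subject with lapply or sapply.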
