This question already has answers here:
How to split a data frame?
(8 answers)
Closed 6 years ago.
Say I have a large data frame in long format, with each subject occupying 5 rows, with 5 subjects in total.
x=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
df=data.frame(x, 1:25)
Now I want to separate this into 5 separate data frames, one for each subject. I know I could do this:
s01=df[df$x==1,]
5 times, but I want to create all five data frames in one go, using one command. Is there a way to do this (e.g. with a for loop or something like lapply)? I tried a for loop, but I'm not sure how to get it to output 5 separate objects with different names.
You can simply do:
result <- split(df, df$x)
This returns a list of data frames, one for each distinct value of column x. You can extract, for example, the first data frame with:
result[[1]]
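If separately named objects really are needed, as the question asks, one possible sketch is to name the list elements and copy them into the workspace with list2env(). The names s01 to s05 follow the question's s01; the sprintf() naming pattern and the column name y are assumptions made for illustration.

```r
# Toy data from the question: 5 subjects, 5 rows each
x <- rep(1:5, each = 5)
df <- data.frame(x, y = 1:25)

# One data frame per distinct value of x, returned as a named list
result <- split(df, df$x)

# Rename the elements s01 ... s05 (naming pattern assumed for illustration)
names(result) <- sprintf("s%02d", as.integer(names(result)))

# Optional: copy every list element into the global environment
invisible(list2env(result, envir = globalenv()))

s01  # the rows where x == 1
```

Keeping the data frames in the list is often more convenient, since lapply() can then run the same analysis on every subject.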
This question already has answers here:
how to remove multiple columns in r dataframe?
(8 answers)
Select column 2 to last column in R
(4 answers)
Closed 2 years ago.
I have a data frame that I import from Excel each week with 40+ columns. Each week the data frame has a different number of columns, but I am only interested in the first 40. I take the data frame, drop the columns after 40, and rbind it to another data frame (that has 40 columns).
The code I use to drop the columns is:
df = df[ -c(41:45) ] # assume df has 45 columns this week; keeps columns 1 to 40
I would like to find a way to automate the number of columns to drop, similar to the length(df$x) type of idea. I tried width(df), but it doesn't seem to work.
Any ideas please?
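The number of columns of a data frame is given by ncol(df) (length(df) returns the same number). A minimal sketch, assuming the goal is simply to keep the first 40 columns however many arrive each week; the 45-column toy data frame below is made up to stand in for the Excel import.

```r
# Hypothetical weekly import with 45 columns (stand-in for the Excel data)
df <- as.data.frame(matrix(0, nrow = 3, ncol = 45))

ncol(df)  # number of columns this week

# Keep the first 40 columns, regardless of how many extra ones there are
df <- df[seq_len(min(40, ncol(df)))]

# Alternative that mirrors the original "drop" approach; the guard matters
# because 41:ncol(df) would count backwards when df has exactly 40 columns
# if (ncol(df) > 40) df <- df[-(41:ncol(df))]
```

Selecting the columns to keep avoids having to know how many extra columns there are at all.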
This question already has answers here:
Why does apply convert logicals in data frames to strings of 5 characters?
(2 answers)
Selecting only numeric columns from a data frame
(12 answers)
Closed 2 years ago.
I know that the question is very easy, but I have a more specific one:
I have a data frame, with 50 variables (numeric and non-numeric) and 5000 observations.
Now what I want to do is create another data frame containing only the numeric variables of the original one.
On this website I found a solution to my problem:
numeric_variables<-unlist(lapply(original_data,is.numeric))
X<-original_data[numeric_variables]
But I was wondering: why does it not work if I try it like this instead? What's wrong?
numeric_variables2<-apply(original_data,2,is.numeric)
x<-original_data[numeric_variables2]
Try this:
names_num <- names(which(sapply(df, is.numeric)))
df_num <- df[, names_num]
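As for why the apply() version fails: apply() first converts the data frame to a matrix, and a matrix can hold only one type. With non-numeric columns present, everything becomes character, so is.numeric is FALSE for every column. lapply(), sapply() and vapply() visit each column as is. A small sketch with a made-up data frame (the column names are assumptions for illustration):

```r
# Toy data frame with mixed column types (made up for illustration)
original_data <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5)
)

# apply() coerces the data frame to a matrix first; with a non-numeric
# column present, the whole matrix becomes character
via_apply <- apply(original_data, 2, is.numeric)

# vapply() visits each column as is, so the column types are preserved
via_vapply <- vapply(original_data, is.numeric, logical(1))

print(via_apply)   # FALSE for every column
print(via_vapply)  # TRUE only for id and score

# drop = FALSE keeps a data frame even if only one column is numeric
X <- original_data[, via_vapply, drop = FALSE]
```

Indexing with the all-FALSE vector from apply() therefore returns a data frame with no columns.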
This question already has answers here:
Extracting specific columns from a data frame
(10 answers)
Closed 4 years ago.
I have to extract two columns from this data set (Cars93 in the MASS package) and create a separate data frame consisting only of the two columns MPG.highway and EngineSize. How do I go about doing this?
You can look at Cars93 in MASS and just get the first ten rows to see it.
You can create a subset by indexing with the column names directly or, alternatively, with the subset function:
library(MASS) # provides Cars93
new_df <- Cars93[, c("MPG.highway", "EngineSize")]
# or
new_df <- subset(Cars93, select = c("MPG.highway", "EngineSize"))
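To check the result, the first ten rows can be inspected with head(), as the question suggests. A short sketch, assuming the MASS package is installed (it ships with standard R distributions):

```r
library(MASS)  # provides the Cars93 data set

# Extract the two columns by name
new_df <- Cars93[, c("MPG.highway", "EngineSize")]

head(new_df, 10)  # first ten rows
str(new_df)       # column types and dimensions
```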
This question already has answers here:
Extracting specific columns from a data frame
(10 answers)
Closed 6 years ago.
Does anybody know of a way to create a new data frame that contains specific columns from a master data frame that has multiple columns? I have a master data frame and I'm trying to run various tests (regression, ANOVA, etc.) on specific columns in it. Any suggestions would be greatly appreciated.
If you want to choose columns 3, 12, and 15 from the old DF:
newDF <- oldDF[,c(3,12,15)]
If you want to remove columns 3, 12, and 15 from the old DF:
newDF <- oldDF[,-c(3,12,15)]
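Column positions can shift when the master data frame changes, so selecting by name may be more robust. A minimal sketch with a made-up oldDF; the column names are assumptions for illustration.

```r
# Toy data frame standing in for oldDF (column names are made up)
oldDF <- data.frame(
  id     = 1:4,
  height = c(1.6, 1.7, 1.8, 1.9),
  weight = c(60, 70, 80, 90),
  group  = c("a", "b", "a", "b")
)

# Select columns by name
keepDF <- oldDF[, c("height", "weight")]

# Remove columns by name: a minus sign cannot be applied to a character
# vector, so compute the names to keep with setdiff() instead
dropDF <- oldDF[, setdiff(names(oldDF), c("height", "weight"))]
```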
This question already has answers here:
Rbind dataframes using wildcard
(1 answer)
R - use rbind on multiple variables with similar names
(2 answers)
Closed 6 years ago.
I have 68 data frames with the same structure and columns. I started to use rbind and, after typing in about 30 df names, I thought there might be an easier way to concatenate all the data frames without typing in each name. I mean, what if you have 100,000 data frames to concatenate? I looked at a dozen online resources, including this site, and didn't find exactly what I was looking for.
All the 68 data frames are named sequentially. For example: df_01, df_02, ... up to df_68. I tried to do rbind(df_01:df_68) and of course this did not work.
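rbind(df_01:df_68) fails because the : operator builds numeric sequences and cannot expand a range of object names. One possible approach, sketched here with three toy data frames in place of the 68, is to generate the names with sprintf(), collect the objects into a list with mget(), and pass that list to a single rbind() call via do.call(). The columns id and value are made up for illustration.

```r
# Three toy data frames standing in for df_01 ... df_68
df_01 <- data.frame(id = 1:2, value = c(0.1, 0.2))
df_02 <- data.frame(id = 3:4, value = c(0.3, 0.4))
df_03 <- data.frame(id = 5:6, value = c(0.5, 0.6))

# Build the object names; use 1:68 for the real data
df_names <- sprintf("df_%02d", 1:3)

# mget() fetches the objects by name into a list;
# do.call() passes all list elements to a single rbind() call
combined <- do.call(rbind, mget(df_names))
```

The row names of combined are prefixed with the source object names, which can help trace each row back to its original data frame.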