This question already has answers here:
How do I delete rows in a data frame?
(10 answers)
Closed 1 year ago.
I've been trying to remove the observations for one country from my data frame (ESS6). I have been able to remove certain variables (columns) with -c(variable), but that is not useful here, since I only want to remove the rows that have certain values in the country variable (cntry).
Thank you for your help :)
Try using dplyr and its filter() function.
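As a minimal sketch of that suggestion: the toy ESS6 below, its value column, and the country code "DE" are made up for illustration, and cntry is assumed to hold country codes as strings.

```r
library(dplyr)

# Toy stand-in for the real ESS6 data (hypothetical values)
ESS6 <- data.frame(
  cntry = c("DE", "FR", "DE", "ES"),
  value = 1:4,
  stringsAsFactors = FALSE
)

# Keep every row whose country is NOT "DE"
ESS6_no_de <- ESS6 %>% filter(cntry != "DE")

# Drop several countries at once
ESS6_no_de_fr <- ESS6 %>% filter(!cntry %in% c("DE", "FR"))

# Base R equivalent of the first call, without dplyr
ESS6_base <- ESS6[ESS6$cntry != "DE", , drop = FALSE]
```

filter() keeps the rows where the condition is TRUE, so removing a country means negating the match.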
This question already has answers here:
Sort (order) data frame rows by multiple columns
(19 answers)
Closed 3 months ago.
I am trying to order a data frame/matrix by the variable VIRUS; ascending or descending order does not matter.
Anncol<-data.frame(Metadata$VIRUS) ##From left to case, add more if needed
sortedAnncol<-Anncol[order(Anncol$VIRUS),]
sAnncol<-as.matrix(sortedAnncol)
This is what I have tried so far, but I lose the first column of data, i.e. the corresponding data points in the data frame. How can I order the 'Anncol' data frame by the variable 'VIRUS' while simultaneously ordering the rest of the data frame?
Any help would be greatly appreciated! Thank you in advance
Here is a solution using dplyr:
library(dplyr)
Metadata <- Metadata %>% arrange(VIRUS)
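For comparison, here is a base R sketch of the same idea. The toy Metadata and its SAMPLE column are hypothetical. The point is to index the whole data frame by order() and pass drop = FALSE, because subsetting a one-column data frame with [rows, ] otherwise returns a bare vector, which is one way to lose the frame structure.

```r
# Toy stand-in for the real Metadata (hypothetical values)
Metadata <- data.frame(
  SAMPLE = c("s1", "s2", "s3"),
  VIRUS  = c("HIV", "EBV", "CMV"),
  stringsAsFactors = FALSE
)

# Reorder all columns together by VIRUS, ascending
sorted_asc <- Metadata[order(Metadata$VIRUS), , drop = FALSE]

# Same, descending
sorted_desc <- Metadata[order(Metadata$VIRUS, decreasing = TRUE), , drop = FALSE]
```

Because the row index is applied to the full data frame, every other column stays aligned with VIRUS.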
This question already has answers here:
How to remove columns with same value in R
(4 answers)
Closed 2 years ago.
I have a really large dataset and I want to filter out some of the columns because they contain the same value throughout (e.g. company name is always "Walmart"). I can go through and do this manually, but I'm looking for code to do it automatically.
I had in mind a function to subset based on whether sum(unique(colnam)) == 1, but I'm not sure how to get it to work. Thanks.
which(sapply(dat, function(col) length(unique(col)) == 1))
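That line only reports which columns are constant. A possible way to actually drop them is sketched below, using a made-up dat. It indexes with a negated logical vector rather than -which(...), since a negative index built from an empty which() would select no columns at all when nothing is constant.

```r
# Toy data (hypothetical): "company" is constant, the others vary
dat <- data.frame(
  company = rep("Walmart", 3),
  store   = c(1, 2, 3),
  state   = c("TX", "TX", "CA"),
  stringsAsFactors = FALSE
)

# TRUE for every column that holds a single distinct value
is_constant <- vapply(dat, function(col) length(unique(col)) == 1, logical(1))

# Keep only the columns that vary
dat_trimmed <- dat[, !is_constant, drop = FALSE]
```

Note that unique() counts NA as a value of its own, so a column of one value plus some NAs is not treated as constant here.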
This question already has answers here:
dplyr - using mutate() like rowmeans()
(8 answers)
Closed 4 years ago.
I currently have a table in R with 4 columns and I want to average the last two columns (titled W10CP1 and W10CP2) into a 5th column of that table.
I tried to use rowMeans but I got an error.
Sorry for the basic question!
You can try the tidyverse package. Here is an example:
library(tidyverse)
data <- data %>%
  mutate(mean = (W10CP1 + W10CP2) / 2)  # row-wise average of the two columns
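Since the question mentions rowMeans, here is a base R sketch of that route. The error itself is not shown, so this rests on an assumption: rowMeans stops with an error when handed non-numeric columns, which happens if the whole table is passed in. Selecting just the two columns by name avoids that. The toy data and its id and grp columns are hypothetical.

```r
# Toy 4-column table (hypothetical values)
data <- data.frame(
  id     = 1:3,
  grp    = c("a", "b", "c"),
  W10CP1 = c(1, 2, 3),
  W10CP2 = c(3, 4, NA)
)

# Average only the two numeric columns; na.rm = TRUE ignores missing values
data$mean <- rowMeans(data[, c("W10CP1", "W10CP2")], na.rm = TRUE)
```

With na.rm = TRUE, a row with one missing value gets the remaining value as its mean; leave it out if such rows should come back as NA.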
This question already has answers here:
Generate a dummy-variable
(17 answers)
Closed 5 years ago.
Beginner in R and looking to avoid unnecessary copy+pasting...
I have a data frame with a numeric column. I would like to create binary columns based on the values in the numeric column.
I know the tedious approach would be to copy+paste the following and manually add the different values:
DataFrame$NewCol1 <- as.numeric(DataFrame$ExistingCol == 1);
DataFrame$NewCol2 <- as.numeric(DataFrame$ExistingCol == 2);
Would a "for" loop be able to accomplish this task?
How about something like this?
model.matrix(~factor(DataFrame$ExistingCol))[,-1]
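Note that the model.matrix() call above removes the intercept column, so it returns indicator columns for every value except the first level. If one column per value is wanted, as in the question, a for loop does work. A minimal sketch with made-up values in ExistingCol:

```r
# Toy data (hypothetical values)
DataFrame <- data.frame(ExistingCol = c(1, 2, 3, 2, 1))

# One 0/1 column per distinct value, named NewCol<value>
for (v in sort(unique(DataFrame$ExistingCol))) {
  DataFrame[[paste0("NewCol", v)]] <- as.numeric(DataFrame$ExistingCol == v)
}
```

The loop just generalises the two copy-pasted lines from the question, so it scales to any number of distinct values.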
This question already has answers here:
Remove duplicated rows
(10 answers)
Closed 7 years ago.
I have a long column (9500 rows in Excel) containing a lot of gene IDs. I want to remove the duplicates.
ID
BXDC2
BXDC5
BXDC5
BZRPL1
BZRPL1
C10orf11
C10orf116
C10orf119
C10orf120
C10orf125
C10orf125
And I want the result to be:
ID
BXDC2
BXDC5
BZRPL1
C10orf11
C10orf116
C10orf119
C10orf120
C10orf125
Can anybody help me with an R script :-)?
You can use duplicated or unique. Here, I am assuming that the column name is 'ID':
df1[!duplicated(df1$ID),,drop=FALSE]
Or
library(data.table)  # v1.9.4+
unique(setDT(df1), by='ID')
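As a quick check of the duplicated() approach, here it is run on the sample IDs from the question. Building df1 by hand is only for illustration; the real data would be read in from the Excel file.

```r
# The sample IDs from the question
df1 <- data.frame(
  ID = c("BXDC2", "BXDC5", "BXDC5", "BZRPL1", "BZRPL1", "C10orf11",
         "C10orf116", "C10orf119", "C10orf120", "C10orf125", "C10orf125"),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later occurrences; keep the rest
deduped <- df1[!duplicated(df1$ID), , drop = FALSE]
```

The first occurrence of each ID is kept and the original order is preserved; drop = FALSE keeps the result a data frame even though there is only one column.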