This question already has answers here:
Generate a dummy-variable
(17 answers)
Closed 5 years ago.
Beginner in R and looking to avoid unnecessary copy+pasting...
I have a data frame with a numeric column. I would like to create binary columns based on the values in the numeric column.
I know the tedious approach would be to copy+paste the following and manually add the different values:
DataFrame$NewCol1 <- as.numeric(DataFrame$ExistingCol == 1)
DataFrame$NewCol2 <- as.numeric(DataFrame$ExistingCol == 2)
Would a "for" loop be able to accomplish this task?
How about something like this?
model.matrix(~factor(DataFrame$ExistingCol))[,-1]
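To answer the loop question directly: yes, a "for" loop works. Below is a minimal sketch, assuming a small made-up data frame (the values in ExistingCol are illustrative), that adds one 0/1 column per distinct value. Note that the model.matrix call above with [,-1] drops the intercept column, so it leaves out the indicator for the first level; removing the intercept in the formula with "- 1" keeps one column per level.

```r
# Illustrative data; the values are made up for this sketch.
DataFrame <- data.frame(ExistingCol = c(1, 2, 3, 2, 1))

# Loop over the distinct values and add one 0/1 column per value.
for (v in sort(unique(DataFrame$ExistingCol))) {
  DataFrame[[paste0("NewCol", v)]] <- as.numeric(DataFrame$ExistingCol == v)
}

# Alternative without a loop: drop the intercept in the formula
# so that every level gets its own indicator column.
dummies <- model.matrix(~ factor(ExistingCol) - 1, data = DataFrame)
```

The loop version keeps the new columns inside the data frame, while model.matrix returns a separate matrix that can be attached with cbind if needed.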
This question already has answers here:
How do I delete rows in a data frame?
(10 answers)
Closed 1 year ago.
So I've been trying to subset my data frame (ESS6) to remove the observations for one country. I have been able to remove whole variables with -c(variable), but that does not help here, because I only want to remove the rows where the country variable (cntry) has a certain value.
Thank you for your help :)
Try using dplyr and the "filter" function
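A minimal sketch of what that could look like, using base R so it runs without extra packages. The data frame here is a made-up stand-in for ESS6, and the country codes "DE" and "FR" are only assumptions for illustration; the dplyr equivalent is shown as a comment.

```r
# Illustrative stand-in for ESS6; the country codes and values are made up.
ESS6 <- data.frame(cntry = c("DE", "FR", "DE", "ES"),
                   score = c(1, 2, 3, 4))

# Keep every row whose country is not "DE" (base R).
ESS6_no_de <- ESS6[ESS6$cntry != "DE", ]

# Drop several countries at once with %in%.
ESS6_subset <- ESS6[!(ESS6$cntry %in% c("DE", "FR")), ]

# With dplyr installed, the equivalent of the first step would be:
# ESS6_no_de <- dplyr::filter(ESS6, cntry != "DE")
```

In both forms the idea is to keep the rows you want rather than to delete the rows you don't.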
This question already has answers here:
How to remove columns with same value in R
(4 answers)
Closed 2 years ago.
I have a really large dataset and I want to filter out some of the columns because they contain the same value throughout (e.g., the company name is "Walmart" in every row). I can go through and do this manually, but I'm looking for code to do it automatically.
I had in mind a function to subset based on if sum(unique(colnam)) == 1, but I'm not sure how to get it to work. Thanks.
which(sapply(dat, function(col) length(unique(col)) == 1))
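The line above only reports which columns are constant. A short sketch of the remaining step, dropping them, assuming a made-up data frame called dat (the column names and values are illustrative):

```r
# Illustrative data; "company" holds the same value in every row.
dat <- data.frame(company = rep("Walmart", 3),
                  sales   = c(10, 20, 30),
                  region  = c("N", "S", "N"))

# TRUE for each column that has exactly one distinct value.
constant <- sapply(dat, function(col) length(unique(col)) == 1)

# Keep only the non-constant columns; drop = FALSE keeps the result
# a data frame even if a single column is left.
dat_clean <- dat[, !constant, drop = FALSE]
```

This uses length(unique(col)) rather than sum(unique(col)), since the goal is to count the distinct values, not add them up.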
This question already has answers here:
Why does apply convert logicals in data frames to strings of 5 characters?
(2 answers)
Selecting only numeric columns from a data frame
(12 answers)
Closed 2 years ago.
I know that the question is very easy, but I have a more specific one:
I have a data frame, with 50 variables (numeric and non-numeric) and 5000 observations.
Now what I want to do is create another data frame containing only the numeric variables of the original one.
On this website I found the solution to my problem, which is:
numeric_variables<-unlist(lapply(original_data,is.numeric))
X<-original_data[numeric_variables]
But I was wondering: why does it not work if I try it like this instead? What's wrong?
numeric_variables2<-apply(original_data,2,is.numeric)
x<-original_data[numeric_variables2]
Try this:
names_num <- names(which(sapply(df, is.numeric)))
df_num <- df[, names_num]
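As for why the apply version fails: apply works on matrices, so it first converts the data frame with as.matrix. When any column is non-numeric, the whole matrix becomes character, and is.numeric is then FALSE for every column. The sketch below illustrates this on a made-up three-column data frame (the column names a, b, c are illustrative).

```r
# Illustrative data with mixed column types.
original_data <- data.frame(a = c(1, 2),
                            b = c("x", "y"),
                            c = c(3.5, 4.5))

# apply() converts the data frame to a matrix first; with a character
# column present, every cell becomes a string, so nothing tests as numeric.
via_apply <- apply(original_data, 2, is.numeric)

# sapply()/lapply() visit each column as it is, so types are preserved.
via_sapply <- sapply(original_data, is.numeric)

# Only the sapply() result selects the numeric columns.
X <- original_data[via_sapply]
```

Indexing with via_apply would select no columns at all, which matches the behavior described in the question.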
This question already has answers here:
How can I divide one column of a data frame through another?
(2 answers)
Closed 3 years ago.
The below link shows my initial data frame.
The inc_per column has wrong values in it, so I want to recalculate it and update the data frame with the new values. I tried this code:
cars<- within(cars,cars$per_inc<- (cars$oldprice / cars$newprice)*100)
But it created unnecessary columns and I didn't get my desired output.
My desired output can be seen below:
This is easy to do; use:
cars$per_inc <- (cars$oldprice / cars$newprice)*100
This replaces the current column values with the new ones.
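If you prefer to keep within, the likely cause of the extra columns is the cars$ prefix inside the expression: within evaluates the expression with the data frame's columns available as plain variables, so assigning to cars$per_inc there creates a new object named cars rather than a new column named per_inc. A minimal sketch with bare column names, assuming a made-up two-row data frame (the prices are illustrative):

```r
# Illustrative data; the prices are made up for this sketch.
cars <- data.frame(oldprice = c(100, 200),
                   newprice = c(80, 250))

# Inside within(), refer to columns by their bare names.
cars <- within(cars, per_inc <- (oldprice / newprice) * 100)
```

The result has the two original columns plus per_inc, with no extra columns.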
This question already has answers here:
How to find the highest value of a column in a data frame in R?
(10 answers)
Closed 7 years ago.
I have the following dataframe:
I would like to create a table with the column headers in one column and each column's maximum value to the right of it. I will be looking to embed this in a Shiny app. Does anyone know how I can do this?
I used:
colMax <- function(data) sapply(data, max, na.rm=TRUE)
Then I called as.data.frame(colMax(data)) and it worked as needed.
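For reference, a small sketch of this approach on made-up data (the columns a and b are illustrative), building an explicit two-column table with the header in one column and its maximum next to it:

```r
# Helper from the post: maximum of each column, ignoring NA values.
colMax <- function(data) sapply(data, max, na.rm = TRUE)

# Illustrative data; the values are made up for this sketch.
data <- data.frame(a = c(1, 5, 3),
                   b = c(10, NA, 7))

# One row per column: its name and its maximum.
max_table <- data.frame(column  = names(data),
                        maximum = unname(colMax(data)))
```

A plain data frame like max_table is the kind of object that Shiny's renderTable can display.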