I've looked for this question; however, no examples seem to be as basic as I need.
I need to be able to create a data frame that contains only the rows for which an IF condition is true. For example:
if(sampleDF$column[1]== 1){
newDF(sampleDF)
}
In this example, newDF should contain those rows from sampleDF where column has a value of 1.
Thanks in advance!
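A minimal sketch of one common way to do this: index the data frame with a logical vector instead of using if(), which only tests a single value. The contents of sampleDF below are made up for illustration.

```r
# Hypothetical sample data; only the name sampleDF and its column "column"
# come from the question
sampleDF <- data.frame(
  column = c(1, 2, 1, 3),
  value  = c("a", "b", "c", "d")
)

# The comparison yields one TRUE/FALSE per row; rows with TRUE are kept
newDF <- sampleDF[sampleDF$column == 1, ]
print(newDF)
```

If column can contain NA, wrapping the condition in which() avoids all-NA rows appearing in the result.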
Related
I have a large list with 13 elements and each element has 181 rows and 31 columns. I am trying to filter each element based on one specific column. For example, I want to filter the elements' rows based on the column named X with the following condition X<=0.30. If it was a data frame, for instance, I would do "filter(X<=0.30)".
Here is some of the code I tried, where "out" is my list and "X" is the variable with the condition (X<=0.30):
out_new<-(list.filter(out,X<=0.30))
out_new<-out[sapply(out, X)<=0.30]
However, both attempts return a list of length 0. I am not sure what I am doing wrong or missing, so any help is very much appreciated. Thank you.
I am presuming you have a list of data frames and you want to filter each data frame based on one specific column.
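library(dplyr)  # provides filter() and %>%
library(purrr)  # provides map()
# Keep only the rows of one data frame where X <= 0.30
my_function <- function(df) {
  df %>% filter(X <= 0.30)
}
# Apply the filter to every data frame in the list
map(list_name, my_function)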
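The same idea can also be sketched in base R, without dplyr or purrr. The list name out_list and its contents are assumptions made up for illustration; only the column name X and the threshold come from the question.

```r
# Hypothetical list of data frames, each with a column named X
out_list <- list(
  first  = data.frame(X = c(0.10, 0.50, 0.25), Y = c(1, 2, 3)),
  second = data.frame(X = c(0.90, 0.30), Y = c(4, 5))
)

# lapply() visits each data frame and keeps the rows where X <= 0.30;
# drop = FALSE guarantees the result stays a data frame
out_new <- lapply(out_list, function(df) df[df$X <= 0.30, , drop = FALSE])
print(out_new)
```

The result is a list of the same length as the input, with each element filtered on its own.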
I am trying to work with the below data frame and what I am trying to do is to compare the values from columns:
#create a sample data frame
df<-data.frame(
item=c("a","b","c","d"),
price_today=c(1,2,3,4),
price_yesterday=c(1,2,3,4)
)
If values from column price_today are the same compared with values from column price_yesterday, then print okay or else print not okay, and the printed result will be shown as a new variable in the data frame.
May I know how should I go about the ifelse part here?
Many thanks for your help and have a good day.
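One way this could be done is with the vectorised ifelse(), which compares the two columns row by row. This is a sketch only: the column name status is hypothetical, and the last price is changed from the question's data so that both outcomes appear.

```r
# Sample data modelled on the question; the final price_today value is altered
# so that one row differs
df <- data.frame(
  item            = c("a", "b", "c", "d"),
  price_today     = c(1, 2, 3, 5),
  price_yesterday = c(1, 2, 3, 4)
)

# ifelse() evaluates the condition for every row and picks a label for each
df$status <- ifelse(df$price_today == df$price_yesterday, "okay", "not okay")
print(df)
```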
Modified Questions:
Hi all, so now what if the df becomes like this:
#create a sample data frame (modified)
df<-data.frame(
item=c("a","a","c","d"),
price_today=c(1,"",3,"XYZ"),
price_yesterday=c(1,2,3,4)
)
Now it contains both blank value and non-numerical values in column price_today. And instead of a,b,c,d in column item, it becomes a,a,c,d in column item. I have been trying to do the following:
Filter column item by "a" with the code below:
df_1<-df[df$item=="a",]
After df_1 is filtered, filter price_today again by removing blank and non-numerical values with the code below:
df_1<-df[!is.numeric(df_1$price_today),]
I am able to filter by "a" in column item; however, the second filter returns the original df. What did I do wrong here?
Million thanks for your help and have a good day/night.
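A likely explanation, offered as a sketch rather than a definitive answer: is.numeric() returns a single TRUE or FALSE for the whole column, not one value per row. Because price_today holds text, it returns FALSE, the negation is TRUE, and that single TRUE selects every row. The second line also indexes df rather than df_1. One possible fix is to try converting each value and keep the rows where the conversion succeeds; the helper name price_num is hypothetical.

```r
df <- data.frame(
  item            = c("a", "a", "c", "d"),
  price_today     = c(1, "", 3, "XYZ"),
  price_yesterday = c(1, 2, 3, 4)
)

# First filter: keep only the rows for item "a"
df_1 <- df[df$item == "a", ]

# Blank and non-numeric strings become NA when converted to numbers;
# as.character() guards against the column being a factor in older R versions
price_num <- suppressWarnings(as.numeric(as.character(df_1$price_today)))

# Second filter: keep the rows of df_1 (not df) whose price converted cleanly
df_1 <- df_1[!is.na(price_num), ]
print(df_1)
```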
I have two data frames, one containing the predictors and one containing the different categories I want to predict. Both of the data frames contain a column named geoid. Some of the rows of my predictors contain NA values, and I need to remove these.
After extracting the geoid value of the rows containing NA values, and removing them from the predictors data frame I need to remove the corresponding rows from the categories data frame as well.
It seems like a rather basic operation but the code won't work.
categories <- as.data.frame(read.csv("files/cat_df.csv"))
predictors <- as.data.frame(read.csv("files/radius_100.csv"))
NA_rows <- predictors[!complete.cases(predictors),]
geoids <- NA_rows['geoid']
clean_categories <- categories[!(categories$geoid %in% geoids),]
None of the rows in categories/clean_categories are removed.
A typical geoid value is US06140231. typeof(categories$geoid) returns integer.
I can't say for sure that this is the cause, but a very basic typo will stop the line from doing what you want. Try this correction:
clean_categories <- categories[!(categories$geoid %in% geoids),]
Almost certainly this is what you meant to happen in that line. You want to negate the result of the %in% operator. You don't include a reproducible example so I can't say whether the whole thing will do as you want.
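Another possible cause, sketched here under assumptions since the original files are not available: NA_rows['geoid'] returns a one-column data frame rather than a vector, so %in% is not compared against the individual geoid values. Extracting the column with $ gives a plain vector. The fact that typeof() reports integer for values like US06140231 also suggests the column is a factor, so the sketch compares as character. All sample values and the name bad_geoids are made up.

```r
# Hypothetical stand-ins for the two CSV files in the question
predictors <- data.frame(
  geoid = c("US01", "US02", "US03"),
  x     = c(1, NA, 3)
)
categories <- data.frame(
  geoid = c("US01", "US02", "US03"),
  cat   = c("a", "b", "c")
)

# Flag the incomplete rows once, then pull their geoid values as a vector
incomplete <- !complete.cases(predictors)
bad_geoids <- as.character(predictors$geoid[incomplete])

# Drop the incomplete rows from predictors and the matching rows from categories
clean_predictors <- predictors[!incomplete, ]
clean_categories <- categories[!(as.character(categories$geoid) %in% bad_geoids), ]
print(clean_categories)
```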
To build off of this question:
How to add rows to empty data frames with header in R?
If I happen to generate a data frame that is empty (I don't necessarily know which will be empty and which will contain data until I try to run the code), can I build in a contingency: "if the data frame is empty, THEN assign zeros to all columns; if the data frame has data, keep that data"?
Thanks!
I can try to share some code if it's helpful, but if we can work with the code in the related example that will help me immensely.
Try
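f1 <- function(dat) {
  # If the data frame has no rows, add a single row of zeros
  if (nrow(dat) == 0) {
    dat[1, ] <- 0
  }
  dat  # return the (possibly modified) data frame
}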
using the data from the link,
f1(compData)
# A B
#1 0 0
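Since the linked data is not reproduced here, the following self-contained sketch uses a made-up empty compData with columns A and B, plus a hypothetical non-empty data frame called filled, to show both branches. It assumes all columns are numeric.

```r
# Same logic as the function above, repeated so the example runs on its own
f1 <- function(dat) {
  if (nrow(dat) == 0) {
    dat[1, ] <- 0
  }
  dat
}

# Hypothetical inputs: one empty data frame with headers, one with data
compData <- data.frame(A = numeric(0), B = numeric(0))
filled   <- data.frame(A = c(1, 2), B = c(3, 4))

empty_result  <- f1(compData)  # gains a single row of zeros
filled_result <- f1(filled)    # passes through unchanged
print(empty_result)
print(filled_result)
```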
I have a list of 26 data frames called score.list and I have written a code that tells me which data frames are not complete. So this code gives me the name of the data frame within the list, but it doesn't tell me the index of the data frame in the list.
For example, the code tells me that a data frame named p08 and another data frame named p18 are not complete. Therefore, each needs to be combined with whichever data frame follows it. So if the data frame named p08 is score.list[[8]], then it should be combined with score.list[[9]]. The newly made data frame should replace [[8]], and then score.list[[9]] should be deleted from the list.
I'm guessing something like the code below may work to combine & replace a data frame... I'm not sure if the following code works..
score.list[[8]] <- rbind(score.list[[8]], score.list[[9]])
This is what I tried doing... but didn't exactly work because it didn't make a new data frame after combining it. And I get this error message:
Error in if (names(score.list[i]) == names(score.list[i + 1])) { :
missing value where TRUE/FALSE needed
for(i in 1:length(score.list)){
if(names(score.list[i])==names(score.list[i+1])) {
a <- score.list[i]
b <- score.list[i+1]
score.list[[i]] <- rbind(a, b)
print(score.list[[i]])
}
}
The reason I wrote if(names(score.list[i])==names(score.list[i+1])) is that the data frames that need to be combined have the same name in the list. The data frame that is not complete has the same name as the one that follows it. So the name of the data frame score.list[[8]] is the same as the name of the data frame score.list[[9]].
Please let me know if any parts are confusing. I tried to write it as clearly as I can. Thank you!
This should help you :
## a list example
score.list <-
list(l1= data.frame(x=1),
l2=data.frame(x=2),
l3= data.frame(x=3))
## use %in% to select some elements
## here I am selecting list l1 and l3
do.call(rbind,
score.list[names(score.list) %in% c('l1','l3')])
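Building on that, here is a sketch of how data frames that share a name could be merged in one pass. The error in the question's loop most likely comes from the last iteration, where score.list[i+1] does not exist and the name comparison yields NA. In addition, score.list[i] with single brackets returns a list rather than a data frame. Grouping the list positions by name avoids both problems. The sample list and the names idx and merged are hypothetical.

```r
# Hypothetical list in which the incomplete p08 is followed by its continuation
score.list <- list(
  p07 = data.frame(x = 1),
  p08 = data.frame(x = 2),
  p08 = data.frame(x = 3),
  p09 = data.frame(x = 4)
)

# Group list positions by name, preserving the original order of the names
nm  <- names(score.list)
idx <- split(seq_along(score.list), factor(nm, levels = unique(nm)))

# rbind the data frames in each group; groups of one are returned as they are
merged <- lapply(idx, function(i) do.call(rbind, unname(score.list[i])))
print(merged)
```

The result has one element per distinct name, so the duplicate entry disappears without any explicit deletion step.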