How are you?
I have a problem that is very strange, because the task is very simple.
I want to filter on one of my factor variables in R, but the result is an empty data frame.
My data frame is called "data_2022". If I execute this code:
sum(data_2022$CANALDEVENTA=="WEB")
The result is 2704800, which is the number of rows where this condition is TRUE. But when I run:
a= data_2022 %>% filter(CANALDEVENTA=="WEB")
This returns an empty data frame.
I know I am not an expert in R, but I have done this a million times and never had this problem before.
Do you have any idea what the problem might be?
Sorry that I did not include a reproducible example.
Thank you in advance.
You could use the subset() function:
a<-subset(data_2022, CANALDEVENTA=="WEB")
If you are using the tidyverse, make sure you are calling the function as dplyr::filter. filter expects a logical expression, and a filter function from another attached package may be the one that is actually being applied to your data frame. Try this code too:
my_names<-c("WEB")
a<-dplyr::filter(data_2022, CANALDEVENTA %in% my_names)
Hope it works.
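In case it helps, here is a small base R sketch of the same idea. The toy data frame and its values are made up to stand in for data_2022; only the column name CANALDEVENTA comes from the question. It shows one way to check which package a bare filter currently resolves to, and a row filter that does not depend on dplyr at all.

```r
# Toy stand-in for data_2022 (values are invented for illustration)
toy <- data.frame(
  CANALDEVENTA = factor(c("WEB", "TIENDA", "WEB", NA)),
  ventas = c(10, 20, 30, 40)
)

# Which package does a bare `filter` come from right now?
# If this does not say "dplyr", the pipe is not calling dplyr::filter.
environmentName(environment(filter))

# Base R row filter; which() drops rows where the comparison is NA
web_rows <- toy[which(toy$CANALDEVENTA == "WEB"), ]
nrow(web_rows)
```

If the first call reports something other than dplyr, writing dplyr::filter explicitly as suggested above should sidestep the masking.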
I am a beginner in R, so this is a very basic question. I could not find a specific answer to it, so I would like to ask here.
My problem is the following: I would like to recode a character variable and create a new variable from it.
Specifically, the variable in my data frame (data) is called "driver", with the categories "market", "legislation", "technology", and "mixed".
Now I would simply like to create a new variable, "driverrec", with the values "market" and "others". The three remaining categories should be combined into "others".
I tried to follow this page: http://rprogramming.net/recode-data-in-r/
Basically, I tried to adapt the following code to my data, but it does not work for more than one category.
#Create a new field called NewGrade
SchoolData$NewGrade <- recode(SchoolData$Grade,"5='Elementary'")
# my attempt
driverrec <- data$driver
recode(driverrec, "'Mixed'='others'") # This works.
But the whole recode is not working:
recode(driverrec, "'Mixed'='others'", "'Technology'='others'",
"'Legislation'='others'", "'Market'='market'" )
I look forward to your replies, and thank you for your help.
I found a solution that does not use the recode command:
data$driverrec <- NA # start with an empty column
data$driverrec[data$driver=='Market'] <- 'market'
data$driverrec[is.na(data$driverrec)] <- 'others' # everything not set above
This worked fine. I am posting it in case someone else is looking for a solution ;)
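For anyone landing here later, a two-category recode like this can also be sketched in base R with a single ifelse() call. The toy data frame below is made up; only the names data, driver and driverrec come from the question. As a side note, if the recode() on the linked page is the one from the car package (an assumption on my part), it expects all rules in one string separated by semicolons rather than as separate arguments, which may be why the multi-argument call failed.

```r
# Toy stand-in for the asker's data frame (values are invented)
data <- data.frame(
  driver = c("Market", "Legislation", "Technology", "Mixed", "Market"),
  stringsAsFactors = FALSE
)

# "Market" becomes "market"; every other category becomes "others"
data$driverrec <- ifelse(data$driver == "Market", "market", "others")

table(data$driverrec)
```

Note that rows where driver is NA stay NA in driverrec with this approach, which may or may not be what you want.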
I've got this for loop:
for(i in 1:length(class.data$ID)) {
  class.data$FinalExam_GroupMCScore[i]=mc.data$PSYC.260.Exam....2017.3.[which(mc.data$SIS.User.ID == class.data$FinalExam_MCGroupNumber[i])]
}
The loop is meant to merge two class grade files. Students did part of their final exam in groups. The problem is that not everyone opted to do the group portion, so those students are missing a code in class.data$FinalExam_MCGroupNumber. The for loop gets hung up on these missing values and I can't get past them. I suspect I need an if statement embedded in there, but I'm not familiar enough with R yet to write one.
I've looked at some of the other posts on this and they don't help just because I'm having a tough time seeing how to embed an if or ifelse with a more complicated function following. Any help would be appreciated! I just want it to assign an NA on FinalExam_GroupMCScore to all students with NA on FinalExam_MCGroupNumber and carry on as normal!
Thank you!!
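One possible way around the loop entirely is a lookup with match(), which returns NA for group numbers that are missing or not found, so those students end up with NA automatically. This is only a sketch on made-up data: the column score is a hypothetical stand-in for the long exam column in mc.data, and the values are invented.

```r
# Toy versions of the two grade files (values are invented)
class.data <- data.frame(
  ID = 1:4,
  FinalExam_MCGroupNumber = c(101, NA, 102, 101)
)
mc.data <- data.frame(
  SIS.User.ID = c(101, 102),
  score = c(18, 15)  # hypothetical stand-in for the exam score column
)

# Position of each student's group number in mc.data; NA when there is none
idx <- match(class.data$FinalExam_MCGroupNumber, mc.data$SIS.User.ID)

# Indexing with NA yields NA, so students without a group get NA
class.data$FinalExam_GroupMCScore <- mc.data$score[idx]
class.data
```

If you prefer to keep the loop, wrapping the assignment in if (!is.na(class.data$FinalExam_MCGroupNumber[i])) should have a similar effect.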
How can I incorporate criteria into a table(x,y) call? Instead of subsetting the main data frame and then running table() on each subset, can I save a step or two and just write table() with some sort of "if" functionality built into it?
An example of the data set would be great. Hard to imagine what you're after at the moment.
But yes, you can write an if/else statement:
if (any(grepl("^1[.]0{3}", dataset))) {
print("1.000")
} else {
print("not 1.000")
}
Here, I'm asking it to look for the pattern 1.000 at the start of any element of the object named dataset. If the pattern is not found, the else branch runs.
You can find other examples around too. Maybe look at the help page for if, and at how to use regex (regular expressions).
Hope that helps.
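Coming back to the original table() question: one way to avoid creating a separate subset first is to apply the same logical condition to both vectors inside the call. This is just a sketch; the data frame df and its columns x, y and z are made up for illustration.

```r
# Made-up example data
df <- data.frame(
  x = c("a", "b", "a", "b", "a"),
  y = c("u", "u", "v", "v", "u"),
  z = c(1, 5, 7, 2, 9)
)

# The criterion, written once as a logical vector
keep <- df$z > 3

# Cross-tabulate only the rows that meet the criterion
tab <- table(df$x[keep], df$y[keep])
tab
```

An equivalent spelling would be with(subset(df, z > 3), table(x, y)), which some people find easier to read.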
New to R, taking a very accelerated class with very minimal instruction. So I apologize in advance if this is a rookie question.
The assignment I have is to take a specific column that has 21 levels from a data frame and condense them into 4 levels, using an if or ifelse statement. I've tried what feels like hundreds of combinations, but this is the code that seemed most promising:
> b2$LANDFORM=ifelse(b2$LANDFORM=="af","af_type",
ifelse(b2$LANDFORM=="aflb","af_type",
ifelse(b2$LANDFORM=="afub","af_type",
ifelse(b2$LANDFORD=="afwb","af_type",
ifelse(b2$LANDFORM=="afws","af_type",
ifelse(b2$LANDFORM=="bfr","bf_type",
ifelse(b2$LANDFORM=="bfrlb","bf_type",
ifelse(b2$LANDFORM=="bfrwb","bf_type",
ifelse(b2$LANDFORM=="bfrwbws","bf_type",
ifelse(b2$LANDFORM=="bfrws","bf_type",
ifelse(b2$LANDFORM=="lb","lb_type",
ifelse(bs$LANDFORM=="lbaf","lb_type",
ifelse(b2$LANDFORM=="lbub","lb_type",
ifelse(b2$LANDFORM=="lbwb","lb_type","ws_type"))))))))))))))
LANDFORM is a factor, but I tried changing it to character too, and the code still didn't work.
"ws_type" is the catch-all for the remaining levels.
The code runs without errors, but when I check the result, all I get is:
> unique(b2$LANDFORM)
[1] NA "af_type"
Am I even on the right path? Any suggestions? Should I bite the bullet and make a new column with substr()? Thanks in advance.
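One thing that may be worth checking in the code above: the fourth ifelse() refers to b2$LANDFORD and a later one to bs$LANDFORM. A misspelled column name gives NULL rather than an error, and a zero-length inner ifelse() makes the outer one fill in NA, which would match the output shown. The sketch below reproduces that on made-up data and shows a named lookup vector as a shorter alternative; the lookup covers only a few levels for illustration.

```r
# Made-up stand-in for the LANDFORM column
b2 <- data.frame(LANDFORM = factor(c("af", "aflb", "bfr", "lb", "ws")))

# A misspelled column name silently returns NULL
is.null(b2$LANDFORD)

# The inner ifelse() tests NULL, so it has length zero and the
# outer ifelse() can only fill the non-"af" rows with NA
broken <- ifelse(b2$LANDFORM == "af", "af_type",
                 ifelse(b2$LANDFORD == "aflb", "af_type", "ws_type"))
unique(broken)

# Alternative: a named lookup vector, with "ws_type" as the catch-all
lookup <- c(af = "af_type", aflb = "af_type", bfr = "bf_type", lb = "lb_type")
recoded <- unname(lookup[as.character(b2$LANDFORM)])
recoded[is.na(recoded)] <- "ws_type"
recoded
```

With the lookup approach each old level appears exactly once, so a typo is easier to spot than in fourteen nested calls.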
If your new levels are just the first two letters of the old ones followed by _type, you can easily get what you want with:
#prototype of your column
mycol<-factor(sample(c("aflb","afub","afwb","afws","bfrlb","bfrwb","bfrws","lb","lbwb","lbws","wslb","wsub"), replace=TRUE, size=100))
as.factor(paste(sep="",substr(mycol,1,2),"_type"))
After a great deal of experimenting, I consulted a co-worker, and he was able to simplify this a great deal. Basically, I should have made a new column composed of the first two letters of the values in LANDFORM, and then used that new column to replace the values in LANDFORM, which makes the ifelse() statement much shorter. The code is:
> b2$index=as.factor(substring(b2$LANDFORM,1,2))
b2$LANDFORM=ifelse(b2$index=="af","af_type",
ifelse(b2$index=="bf","bf_type",
ifelse(b2$index=="lb","lb_type",
ifelse(b2$index=="wb","wb_type",
ifelse(b2$index=="ws","ws_type","ub_type")))))
b2$LANDFORM=as.factor(b2$LANDFORM)
Thanks to everyone who gave me some guidance!
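For completeness, when the assignment does not strictly require ifelse(), the same collapsing can be sketched by renaming the factor levels directly, since assigning duplicate level names merges them. The vector below is made up and only assumes, as in the thread, that the first two letters determine the new group.

```r
# Made-up factor standing in for b2$LANDFORM
landform <- factor(c("af", "aflb", "bfr", "bfrws", "lb", "lbub", "ws"))

# Rename every level to its first two letters plus "_type";
# levels that end up with the same name are merged into one
levels(landform) <- paste0(substr(levels(landform), 1, 2), "_type")

levels(landform)
table(landform)
```

This keeps the column a factor throughout, so the final as.factor() step is not needed.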