I was hoping someone could help me, because this problem should be really easy to solve but it's taking me too much time.
I have a data frame (data) with several columns. One of the columns (rideable_type) takes one of 3 values: "docked_bike", "classic_bike" or "electric_bike".
I want to turn every "docked_bike" value into "classic_bike".
When I use
data$rt<-data$rideable_type %>%
set_names(~stringr::str_replace_all(.,"docked_bike", "classic_bike"))
Nothing changes.
When I use
data$rt<-data$rideable_type %>%
dplyr::rename_all(~stringr::str_replace_all(.,"docked_bike", "classic_bike"))
I get an error:
Error in UseMethod("tbl_vars") :
no applicable method for 'tbl_vars' applied to an object of class "character"
Thank you for your time
I found the answer; maybe the question wasn't clear. The point was to change the values within the column, although that may not be the correct R terminology.
Hopefully it'll help someone else.
data$rt<-NA
data$rt[data$rideable_type=="docked_bike"]<-"classic_bike"
data$rt[data$rideable_type=="classic_bike"]<-"classic_bike"
data$rt[data$rideable_type=="electric_bike"]<-"electric_bike"
data=subset(data, select = -c(rt) )
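For anyone landing here: set_names and rename_all operate on names (column or element names), not on the values stored in a column, which is why neither attempt changed the data. A minimal base R sketch of replacing the values in place follows; the small data frame and its ride_id column are made up for illustration and are not the asker's real data.

```r
# Hypothetical stand-in for the asker's data frame
data <- data.frame(
  ride_id = 1:4,
  rideable_type = c("docked_bike", "classic_bike", "electric_bike", "docked_bike"),
  stringsAsFactors = FALSE
)

# Overwrite only the rows whose value is "docked_bike"
data$rideable_type[data$rideable_type == "docked_bike"] <- "classic_bike"

# Count the remaining values per category
print(table(data$rideable_type))
```

This assumes rideable_type is a character column. If it is a factor, the same assignment works as long as "classic_bike" is already one of its levels.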
How are you?
I have the following problem, which is very strange because the task is very simple.
I want to filter on one of my factor variables in R, but the result is an empty data frame.
My data frame is called "data_2022". If I execute this code:
sum(data_2022$CANALDEVENTA=="WEB")
The result is 2704800, which is the number of rows for which the condition is TRUE.
a= data_2022 %>% filter(CANALDEVENTA=="WEB")
This returns an empty data frame.
I know I am not an expert in R, but I have done this a million times and I have never had this error before.
Do you have any idea what the problem is?
Sorry I did not make a reproducible example.
Thank you in advance.
You could use the subset function:
a<-subset(data_2022, CANALDEVENTA=="WEB")
If you are using the tidyverse, make sure you are calling dplyr::filter and not a filter function from another package. filter expects a logical expression, but you are probably applying it to a data.frame. Try this code too:
my_names<-c("WEB")
a<-dplyr::filter(data_2022, CANALDEVENTA %in% my_names)
Hope it works.
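One way to check which filter the bare name resolves to is sketched below, using base R only. The small data frame, the "TIENDA" value and the monto column are invented for illustration; only CANALDEVENTA and "WEB" come from the question.

```r
# Hypothetical stand-in for data_2022
data_2022 <- data.frame(
  CANALDEVENTA = factor(c("WEB", "TIENDA", "WEB")),
  monto = c(10, 20, 30)
)

# Which package does the bare name `filter` come from right now?
# "stats" means dplyr is not attached, or its filter has been masked.
filter_source <- environmentName(environment(filter))
print(filter_source)

# Base R equivalent of the filter, independent of any attached package
a <- data_2022[data_2022$CANALDEVENTA == "WEB", ]
print(nrow(a))
```

If filter_source is not "dplyr" in a session where dplyr should be in use, writing dplyr::filter explicitly, as suggested above, sidesteps the masking.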
I've got a piece of code and I don't quite understand how it works. I'm sorry if I may sound stupid...
I'm making a network, so I created an adjacency matrix that looks like this:
[Image: part of my adjacency matrix]
So, when I try to calculate cosine similarity via lsa::cosine function the column "authors$book_id" disappears and all the other columns' names that contain authors' IDs are replaced with Xs.
My code is this:
a_cos <- lsa::cosine(t(as.matrix(adj_matrix[, -1]))) %>%
data.frame() %>%
+0.01 %>%
round()
This is what this code returns:
[Image: output after lsa::cosine]
Since I know this isn't an error, I want to know how this happened.
Can someone please explain to me what this piece of code does and why it removes the first column (book IDs) completely and turns authors' IDs to Xs? How can I apply lsa::cosine so that this doesn't happen?
I'm only a beginner, but I really need help and I will appreciate any comments! This is my first question on stackoverflow, I hope I've described the problem extensively enough...
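A likely explanation, sketched with base R only: the [, -1] index drops the first column before anything else runs, and data.frame() by default repairs column names that start with a digit by prefixing them with "X". The tiny matrix below is a made-up stand-in for the real adjacency matrix, and the IDs are hypothetical.

```r
# Hypothetical adjacency data: book IDs in column 1, author IDs as column names
adj_matrix <- data.frame(
  book_id = c(11, 12),
  `101` = c(1, 0),
  `202` = c(1, 1),
  check.names = FALSE
)

# [, -1] drops the first column, so book_id never reaches the later steps
m <- as.matrix(adj_matrix[, -1])
print(colnames(m))

# data.frame() makes names syntactically valid, prefixing "X" to numeric ones
print(names(data.frame(m)))

# check.names = FALSE keeps the original names
print(names(data.frame(m, check.names = FALSE)))
```

So passing check.names = FALSE to data.frame() in the pipeline should keep the IDs as they are, and the book IDs can be kept separately (for example as row names) before calling lsa::cosine.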
library(discretization)
data("CO2")
disc<- mdlp(CO2[4])
I just need to discretize the 4th column of the data set provided, but I get the error "Error in data[1, ] : incorrect number of dimensions". Could you please help me fix this?
I don't know if this is what you're going for, but 1) mdlp needs more than just one column of data, and 2) it also has trouble working with complex objects like CO2. Here is one way to make it execute:
CO2.df <- as.data.frame(CO2) # strips the extra info
mdlp(CO2.df[,4:5])
I am just starting to learn R.
I used the function psych::describeBy to group observations in the standard dataset airquality.
psych::describeBy(airquality, group = df$month)
However, I got the error message:
"df$month : object of type 'closure' is not subsettable"
I still cannot understand what is wrong.
UPDATE:
Ok, this question was my first shot on Stack Overflow. Not particularly successful, but I was doing my best :). I decided not to delete it, in case it may be useful for somebody who is taking their first steps (and to humble myself too).
What I did not realize when I was dealing with this problem years ago is that I need to specify my grouping column using the name of the dataset, airquality$Month, rather than df$Month. It was not a spelling issue, but a misunderstanding of syntax basics. I believed that I had already stated that I wanted to use the data frame named airquality, and that I could therefore address its columns as df$Month, with df acting as a placeholder for airquality. Which is totally wrong, of course. However, my intuition was not 100% wrong; in fact, this could have been accomplished using this syntax:
psych::describeBy(airquality, group = "Month")
You don't need to write the name of the data frame (because you named it in the first argument of describeBy); you just need to specify the name of the column of interest as the second argument (as a string, therefore in quotation marks).
Also, for some reason I wrote month, but the correct name of the column is Month (maybe it was renamed, I am not sure).
Hope that my blunder would be a help for somebody else!
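For anyone puzzled by the wording of the error: when no object called df has been created, R finds the function stats::df (the density of the F distribution), and a function ("closure") cannot be subset with $. A small base R sketch, assuming a fresh session in which df has not been assigned:

```r
# `df` was never assigned, so the name resolves to the function stats::df
print(is.function(df))

# Subsetting a function with $ is what raises the "closure" error
msg <- tryCatch(df$month, error = function(e) conditionMessage(e))
print(msg)

# Column names are case-sensitive: the column is `Month`, not `month`
print("Month" %in% names(airquality))
print(is.null(airquality$month))
```

The last line shows why the lowercase spelling is a problem even with the right data frame: airquality$month quietly returns NULL rather than raising an error.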
Most likely the issue is with how you're loading/using the dataset. Make sure you refer to the dataset consistently.
You can try something like
library(psych)
describeBy(airquality, group = airquality$Month)
I have a couple of questions as I am new to this.
How can I read in data using -- to look for missing values?
How can I determine how many values are missing in each variable?
I tried using the summary command and is.na but can't seem to get it right.
The first question is not clear. For the second one you can use
sapply(yourdataframe, function(x) sum(is.na(x)))
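If the first question means that the file marks missing values with "--", one possible reading, then the na.strings argument of read.csv can convert them to NA on import. A minimal sketch follows; the inline CSV text and its column names are invented for illustration.

```r
# Hypothetical CSV content in which "--" marks a missing value
csv_text <- "id,height,weight
1,170,--
2,--,--
3,165,60"

# na.strings tells read.csv which strings to convert to NA
survey <- read.csv(text = csv_text, na.strings = "--")

# Number of missing values in each column
na_counts <- colSums(is.na(survey))
print(na_counts)
```

For a real file, replace text = csv_text with the file path. colSums(is.na(...)) gives the same per-column counts as the sapply call above.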