I am just starting to learn R.
I used the function psych::describeBy to group observations in the standard dataset airquality.
psych::describeBy(airquality, group = df$month)
However, I got the error message:
"df$month : object of type 'closure' is not subsettable"
I still cannot understand what is wrong.
UPDATE:
OK, this question was my first shot on Stack Overflow. Not particularly successful, but I was doing my best :). I decided not to delete it, in case it is useful for somebody who is taking their first steps (and to humble myself too).
What I did not realize when I was dealing with this problem years ago is that I needed to specify the grouping column through the name of the dataset, airquality$Month, rather than df$Month. It was not a spelling issue but a misunderstanding of syntax basics. I believed that, having already stated that I wanted to use the data frame named airquality, I could address its columns as df$Month, with df acting as a placeholder for airquality. That is totally wrong, of course. However, my intuition was not 100% wrong; in fact, this could have been accomplished with this syntax:
psych::describeBy(airquality, group = "Month")
You don't need to write the name of the data frame (because you named it in the first argument of describeBy); you just need to specify the name of the column of interest as the second argument (as a string, therefore in quotation marks).
Also, for some reason I wrote month, but the correct name of the column is Month (maybe it was renamed, I am not sure).
I hope that my blunder will be of help to somebody else!
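The two points above (df is not the dataset, and the column is Month, not month) can be checked in base R without psych. This is a small sketch that assumes a fresh session, where nothing named df has been defined by the user, so df resolves to the function stats::df:

```r
# In a fresh session `df` is a function (stats::df, the F density),
# not a data frame, so `$` cannot be applied to it.
print(is.function(df))

# Capture the error message instead of stopping the script.
msg <- tryCatch(df$month, error = function(e) conditionMessage(e))
print(msg)

# Column names are case-sensitive: airquality has Month, not month.
has_Month <- "Month" %in% names(airquality)
lower_is_null <- is.null(airquality$month)
print(has_Month)
print(lower_is_null)
```

Because airquality$month is NULL, passing it as group gives describeBy nothing to group by, which is why the capital M matters even after df is replaced by airquality.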
Most likely the issue is with how you're referring to the dataset. Make sure you refer to it consistently, by its actual name.
You can try something like:
library(psych)
describeBy(airquality, group = airquality$Month)
I was hoping someone could help me because this problem should be really easy to solve, however it's taking me too much time.
I have a df (data) with several columns. In one of the columns (rideable_type) I get one of 3 values: "docked_bike", "classic_bike" or "electric_bike".
I want to turn every "docked_bike" answer into "classic_bike".
When I use
data$rt<-data$rideable_type %>%
set_names(~stringr::str_replace_all(.,"docked_bike", "classic_bike"))
Nothing changes.
When I use
data$rt<-data$rideable_type %>%
dplyr::rename_all(~stringr::str_replace_all(.,"docked_bike", "classic_bike"))
I get an error:
Error in UseMethod("tbl_vars") :
no applicable method for 'tbl_vars' applied to an object of class "character"
Thank you for your time
I found the answer; maybe the question wasn't clear. The point was to change the values within the column, though maybe that is not the correct R terminology.
Hopefully it'll help someone else.
data$rt<-NA
data$rt[data$rideable_type=="docked_bike"]<-"classic_bike"
data$rt[data$rideable_type=="classic_bike"]<-"classic_bike"
data$rt[data$rideable_type=="electric_bike"]<-"electric_bike"
data=subset(data, select = -c(rt) )
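For comparison, here is a more direct sketch of the same recoding. set_names() and rename_all() operate on names (of vector elements or of columns), not on the values stored in a column, which is consistent with the behaviour described in the question. The data frame below is a made-up stand-in for the real data:

```r
# Toy stand-in for the real data (values are illustrative only)
data <- data.frame(
  rideable_type = c("docked_bike", "classic_bike", "electric_bike", "docked_bike"),
  stringsAsFactors = FALSE
)

# Overwrite only the matching values; all other values are left untouched
data$rideable_type[data$rideable_type == "docked_bike"] <- "classic_bike"

print(table(data$rideable_type))
```

If the original column should be preserved, the same assignment can be made on a copy (for example data$rt <- data$rideable_type first).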
I'd like to transfer the variable type from one variable to the other, where the variables are vectors.
For example, x is numeric and y is character, but I want y to have the same type as x. How can I do this? The same applies if x is integer, etc.
pseudo code:
y <- set_variable_type(y,get_variable_type(x))
This has the potential for some errors to crop up. As @r2evans mentioned, converting some characters to numerics doesn't mesh well. It really depends on what the scope is for this code. If you know that the type of x will always be within certain confines, you will probably be fine with their suggestion class(y)<-class(x), but if you don't know what x will be, you could end up trying to convert a data frame into an int, which will throw an error. I would suggest approaching your problem a different way. With more information we may be able to help you find a more comprehensive solution to the problem. (I still can't comment yet, or this would be a comment.)
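As a minimal sketch of the pseudocode for atomic vectors only: the helper name convert_like is hypothetical (it is not a base R function), and it uses methods::as() with the first class of x as the target. It assumes x is a simple vector such as numeric, integer or character; strings that do not parse as numbers become NA with a warning.

```r
# convert_like() is a hypothetical helper, not part of base R.
# It coerces y to the (first) class of x using methods::as().
convert_like <- function(y, x) {
  target <- class(x)[1]
  methods::as(y, target)
}

y <- c("1", "2", "3")
num <- convert_like(y, 3.14)  # target class "numeric"
int <- convert_like(y, 1L)    # target class "integer"

print(class(num))
print(class(int))
```

This does not address the wider concern about arbitrary classes: for anything beyond basic vector types, an explicit check on class(x) before converting would be safer.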
I started with R a few weeks ago at uni. We were given a problem to solve. However, I find that there are two answers that fit the question:
Verify that you created lo_heval correctly (incl. missing values). Store your verification in the object proof2.
So I find this is correct:
proof2 <- soep[1:100, c("heval", "lo_heval")]
But I think that this answer is also correct:
proof2 <- table(soep$heval, soep$lo_heval, useNA = "always")
Instead of having to decide for one answer, how do I combine them both into the object? I tried to use &, but I get an error. I may be using it wrong.
Prof. if you're seeing this, please don't fail me. I just can't decide between them.
Thanks in advance!
R lists can hold any arbitrary objects in them, so you could use
proof2 <- list(
soep[1:100, c("heval", "lo_heval")],
table(soep$heval, soep$lo_heval, useNA = "always")
)
However, to my mind 100 rows of two columns isn't proof; it's an exercise to look through those and verify that things are right. (And what about the rows past 100? It's a decent spot check, but if there are more rows in the data it is strong evidence rather than proof.) The table approach, on the other hand, seems succinct and effective.
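A small variation on the list idea is to name the elements so each piece can be pulled out later. The soep data frame below is a made-up stand-in, and the rule used to derive lo_heval is purely hypothetical (the actual definition from the exercise is not given here):

```r
# Toy stand-in for soep; the lo_heval rule is a hypothetical placeholder
soep <- data.frame(heval = c(1, 2, 3, 4, 5, NA, 2, 4))
soep$lo_heval <- as.integer(soep$heval >= 4)

# A named list holding both checks
proof2 <- list(
  rows     = soep[1:5, c("heval", "lo_heval")],
  crosstab = table(soep$heval, soep$lo_heval, useNA = "always")
)

# Each element is retrieved by name
print(proof2$crosstab)
```

The cross-tabulation covers every row, including missing values, which is why it is the more convincing of the two checks.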
I am trying to use the update function on a survey.design object. For instance, I want to create a variable that is the mean of several other variables (three in this example), as follows:
library(survey)
x1<-runif(3)
x2<-runif(3)
x3<-runif(3)
population=10000
testdf<-data.frame(x1,x2,x3,population)
testsvy<-svydesign(id=~1,weights=c(30,30,30),data=testdf)
testsvy<-update(testsvy,avg=mean(c(x1,x2,x3)))
However, this returns the same number for every person, so there must be something wrong. Alternatively, I can modify testsvy$variables directly, but I don't feel that this is the easiest way...
OK, I got the answer myself... I hope it could be simpler, since I have to type the object name three times...
testsvy<-update(testsvy,avg2=rowMeans(testsvy$variables[,c("x1","x2","x3")],na.rm=TRUE))
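One reading of why the first attempt gives the same number for everyone: mean(c(x1, x2, x3)) pools all values into a single vector and returns one grand mean, which is then recycled across rows. A row-wise function avoids this. The sketch below shows the difference in base R on a toy data frame (the values are made up for illustration, and no survey design is involved):

```r
# Toy data frame standing in for testdf (values are illustrative only)
testdf <- data.frame(x1 = c(1, 2, 3), x2 = c(4, 5, 6), x3 = c(7, 8, 9))

# Pools all nine values into one vector: a single grand mean
grand <- with(testdf, mean(c(x1, x2, x3)))

# One mean per row, computed across the three columns
row_avg <- with(testdf, rowMeans(cbind(x1, x2, x3), na.rm = TRUE))

print(grand)
print(row_avg)
```

Since update() on a survey design evaluates its arguments within the design's variables, the same expression should also work there without repeating the object name, along the lines of update(testsvy, avg2 = rowMeans(cbind(x1, x2, x3), na.rm = TRUE)); that call was not run as part of this sketch.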
I am putting together an R function that takes some undefined input through the ... argument described in the docs as:
"..." the special variable length argument ***
The idea is that the user will enter a number of column names here, each belonging to a dataset also specified by the user. These columns will then be cross-tabulated against the dependent variable by tapply. The function is to return a table (independent variable x independent variable).
Thus, I tried:
plotter=function(dataset, dependent_variable, ...)
{
indi_variables=list(...); # making a list of the ... input as described in the docs
result=with (dataset, tapply(dependent_variable, indi_variables, mean)); # this fails
}
I figured this should work as tapply can take a list as input.
But it does not in this case ('Error in tapply...arguments must have same length') and I think it is because indi_variables is a list of strings.
If I input the contents of the list by hand and leave out the quotation marks, everything works just fine.
However, if the user feeds the function the column names as non-strings, R will interpret them as variable names; and I cannot figure out how to transform the list indi_variables in the right way, unsuccessfully trying things like this:
indi_variables=lapply(indi_variables, as.factor)
So I am wondering
What causes the error described above? Is my interpretation correct?
How would one go about transforming the list created through ... in the right way?
Is there an overall better way of doing this, in the input or the implementation of tapply?
Any help is much appreciated!
Thanks to Joran's helpful reading, I have come up with these improvements that make things work out...
indi_variables=substitute(list(...));
result=with (dataset, tapply(dependent_variable, eval(indi_variables, dataset), FUN=mean));
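Put together as a complete function, the approach might look like the sketch below. It is one possible arrangement, not necessarily the final version used by the asker: it additionally captures dependent_variable with substitute() so that it can also be passed as a bare column name (an assumption beyond the two lines above), and it uses the built-in mtcars data purely for illustration.

```r
# Sketch of plotter(): grouping columns are passed unquoted through ...
plotter <- function(dataset, dependent_variable, ...) {
  # Capture the ... arguments unevaluated, as a call to list()
  indi_variables <- substitute(list(...))
  # Capture the dependent variable unevaluated as well (assumption:
  # the caller passes a bare column name or an expression)
  dep <- substitute(dependent_variable)

  # Evaluate both inside the dataset, falling back to the caller's frame
  tapply(
    eval(dep, dataset, parent.frame()),
    eval(indi_variables, dataset, parent.frame()),
    FUN = mean
  )
}

# Mean mpg by number of cylinders and transmission type
res <- plotter(mtcars, mpg, cyl, am)
print(res)
```

Evaluating the captured list(...) call inside dataset yields a list of column vectors, each as long as the dependent variable, which is what tapply expects for its grouping argument.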