I'm playing around with R and trying to get the average of a column. Just mean(V1) doesn't work.
Could anybody give me some advice?
Thank you!
A: Use mean(D$V1). Column names are not first-class objects. Each column is part of a data.frame, and the data frame's name has to be used to reach it.
Q: Yes, thanks. So $ is like the dot-operator? – user1170330
A: Perhaps (depending on which language is being compared). It is possible to construct list objects with cascaded calls to $<- and then use obj$V1$subV1 to extract. Review ?Extract very carefully.
If you just want mean(V1), you can attach the data frame with attach(D). I would not recommend it, though: later you may have other data frames with the same variable names, and you end up with a mess of attach and detach commands. So mean(D$V1) is the best way.
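A minimal sketch of the options mentioned above. The data frame D and its values are made up for illustration. with() is shown as a scoped alternative to attach(), since it evaluates the expression inside the data frame without changing the search path.

```r
# Hypothetical data frame standing in for the asker's D
D <- data.frame(V1 = c(2, 4, 6), V2 = c(1, 1, 1))

# Column access always goes through the data frame
m_dollar  <- mean(D$V1)        # by name with $
m_bracket <- mean(D[["V1"]])   # by name with [[ ]]

# with() evaluates mean(V1) inside D only, so nothing needs detaching later
m_with <- with(D, mean(V1))
```

All three calls compute the same mean. A bare mean(V1) fails unless a variable called V1 happens to exist outside the data frame.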
Related
I have a data frame that contains 150 numerical values that I want the mean of. Which column or row they're on is not relevant at all, that's just how the data was given.
I have found a solution to do this, but it's so shamefully disgusting that I'd prefer to use a better method. I've literally just added up the mean of each column and divided by the number of columns...
This is still a 1-liner so it's not that bad, but there must be better ways to do this.
A thousand thanks in advance!
The best solution for me was to flatten the data frame with the unlist() function. Thanks to #H 1!
Then you can simply use the mean() function.
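As a sketch of that approach, assuming a small made-up data frame df in place of the real 150 values:

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# unlist() flattens all columns into one numeric vector
overall <- mean(unlist(df))

# For an all-numeric data frame, going through a matrix gives the same result
overall_matrix <- mean(as.matrix(df))

# The "mean of column means" workaround agrees here only because every
# column has the same number of non-missing values
mean_of_means <- mean(colMeans(df))
```

With missing values, mean(unlist(df), na.rm = TRUE) weights every remaining value equally, whereas the mean of column means does not.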
I am trying to use the update() function on a survey.design object. For instance, I want to create a variable that is the mean of several other variables, as follows:
library(survey)
x1<-runif(3)
x2<-runif(3)
x3<-runif(3)
population=10000
testdf<-data.frame(x1,x2,x3,population)
testsvy<-svydesign(id=~1,weights=c(30,30,30),data=testdf)
testsvy<-update(testsvy,avg=mean(c(x1,x2,x3)))
However, this returns the same number for every person, because mean(c(x1,x2,x3)) pools all the values into a single overall mean. Alternatively, I could modify testsvy$variables directly, but I don't feel that this is the easiest way...
OK I got the answer myself... Hope that it could be simpler since I type the object names three times...
testsvy<-update(testsvy,avg2=rowMeans(testsvy$variables[,c("x1","x2","x3")],na.rm=TRUE))
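The difference between the two calls can be sketched in base R alone, without the survey package. The data values below are made up, and transform() is used here only as a stand-in, on the assumption that update() evaluates its expressions inside the design's variables in a similar way.

```r
# Hypothetical data with fixed values so the result is easy to check
testdf <- data.frame(x1 = c(1, 2, 3), x2 = c(4, 5, 6), x3 = c(7, 8, 9))

# mean(c(...)) pools all nine values into one number, which is then
# recycled to every row
pooled <- transform(testdf, avg = mean(c(x1, x2, x3)))

# rowMeans() on the bound columns gives one mean per row
per_row <- transform(testdf, avg2 = rowMeans(cbind(x1, x2, x3), na.rm = TRUE))
```

If update() does evaluate in the design's variables, the shorter form rowMeans(cbind(x1, x2, x3), na.rm = TRUE) might also work there and would avoid repeating the object name. That is untested here.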
I am reading Wickham's Advanced R book. This question relates to Question 5 in chapter 12 - Functionals. The exercise asks us to:
Implement a version of lapply() that supplies FUN with both the name and value of each component.
Now, when I run the code below, I get the expected answer for one column.
c(class(iris[1]),names(iris[1]))
Output is:
"data.frame" "Sepal.Length"
Building upon the above code, here's what I did:
lapply(iris,function(x){c(class(x),names(x))})
However, I only get the output from class(x) and not from names(x). Why is this the case?
I also tried paste() to see whether it works.
lapply(iris,function(x){paste(class(x),names(x),sep = " ")})
I only get class(x) in the output. I don't see names(x) being returned.
Why is this the case? Also, how do I fix it?
Can someone please help me?
Instead of going over the data frame directly you could switch things around and have lapply go over a vector of the column names,
data(iris)
lapply(colnames(iris), function(x) c(class(iris[[x]]), x))
or over an index for the columns, referencing the data frame.
lapply(seq_along(iris), function(x) c(class(iris[[x]]), names(iris[x])))
Notice the use of both single and double square brackets.
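iris[[n]] extracts the nth element of the list iris (a data frame is just a particular kind of list), i.e. the column vector itself. The column name belongs to the list, not to the vector, so it is not carried along. This makes something like mean(iris[[1]]) possible, and it is also why names(x) returns NULL inside lapply(iris, ...).
iris[n] returns a one-column data frame containing the nth element, with its name intact, making something like names(iris[1]) possible.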
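Putting the two ideas together, one possible sketch of the exercise is below. The function name lapply_named and its argument order (name first, then value) are choices made for this illustration, not anything prescribed by the book.

```r
# Hypothetical helper: like lapply(), but f receives the name and the value
lapply_named <- function(x, f, ...) {
  out <- lapply(seq_along(x), function(i) f(names(x)[[i]], x[[i]], ...))
  names(out) <- names(x)   # keep the original names on the result
  out
}

res <- lapply_named(iris, function(name, value) c(class(value), name))
res$Sepal.Length
res$Species
```

Iterating over seq_along(x) gives the function access to both names(x) and x[[i]], which is exactly what lapply(iris, ...) on its own cannot provide.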
I am confused about the paste() function in R.
This is my R code:
paste(cols=list("speed###","dist"), rows=list("speed"))
The ideal output should be:
cols=list("speed###","dist"), rows=list("speed")
But the actual output was:
"speed### speed" "dist speed"
Can anyone help me to figure out and get the ideal output?
I will appreciate any reply here!
Thank you!
Best regards!
What I think is happening with your code is:
You are passing list("speed###","dist") and list("speed") as the arguments cols and rows, not as strings, and paste ignores argument names. Also, the lists are converted to character vectors: c("speed###","dist") and c("speed").
Since the arguments don't have the same length, the shorter one is recycled (i.e. c("speed###","dist") and c("speed", "speed")).
Then, first elements are pasted together, and second elements are as well, returning a vector with each pasted string: c("speed### speed", "dist speed")
Why do you need what you are asking for? Do you plan to call do.call(matrix, list(cols=c("speed###","dist"), rows=c("speed"))) or something similar?
Besides, you should use c instead of list. I don't understand the need of using list here.
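The behaviour described above can be checked step by step. The last line is only one guess at what was wanted: if the goal is the literal text, it has to be passed as a single quoted string.

```r
# Names of arguments passed through ... are ignored by paste()
named   <- paste(cols = "a", rows = "b")
unnamed <- paste("a", "b")

# Lists are converted to character vectors first
converted <- as.character(list("speed###", "dist"))

# The shorter argument is recycled to the length of the longer one
recycled <- paste(c("speed###", "dist"), "speed")

# The literal text, kept as one string (single quotes avoid escaping)
wanted <- 'cols=list("speed###","dist"), rows=list("speed")'
```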
I have been trying to figure out how to use the apply functions. I know plyr is out there, and I will learn that later, but for now I need help. I can get output by actually typing the object name in, but I am trying to loop over a list of names instead. The code is as follows:
list<-noquote(c("T","AAVL"))
lapply(list,function(i) xts(l.df$i[,-1:-5],order.by=as.POSIXct(rownames(l.df$i))))
If I just do xts(l.df$T[,-1:-5],order.by=as.POSIXct(rownames(l.df$T)))
I get the xts file that I need. Could someone please help me loop the names without quotes into the lapply(), so that I could have this work for numerous elements in my list? Thank you!
There are a number of ways to subset a list in R. See https://ramnathv.github.io/pycon2014-r/learn/subsetting.html or http://adv-r.had.co.nz/Subsetting.html for more detailed discussion.
However, in your case the issue is that the dollar operator $ takes its right-hand side literally as a name rather than evaluating it as a variable. So myList[["item"]] and myList$item are equivalent, but myList$i and myList[[i]] are not. In the example you gave, you're trying to find the member of l.df called "i", not the one whose name is stored in the variable i. The noquote class you used purely affects how a character vector is printed. It has no effect on subsetting.
The version of your code that works does not work for the reason you give in your comment. It works because you're now subsetting the element whose name is stored in i, not the one called "i".
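A small sketch of the difference, using a made-up list of data frames in place of the real l.df and leaving xts out so that it runs in base R:

```r
# Hypothetical stand-in for l.df: a named list of data frames
l.df <- list(
  T    = data.frame(price = c(10, 11, 12)),
  AAVL = data.frame(price = c(20, 21))
)
tickers <- c("T", "AAVL")

i <- "T"
by_dollar  <- l.df$i      # looks for an element literally named "i": NULL
by_bracket <- l.df[[i]]   # looks up the element whose name is stored in i

# Inside lapply(), [[i]] picks a different element on each iteration
rows <- lapply(tickers, function(i) nrow(l.df[[i]]))
names(rows) <- tickers
```

Under the same assumption, the body of the original function would presumably become xts(l.df[[i]][, -1:-5], order.by = as.POSIXct(rownames(l.df[[i]]))).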
I want to add 106 new columns to a data frame, each as long as the data frame and filled with zeros (0). How would I loop over i in this case:
geo <- unique(df$geo)
geo
[1] "AL" "AT1" "AT2" "AT3" "BE1" "BE2" "BE3"
for(i in geo) {
df$i <- v(0,length(df)
}
Emil Krabbe
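One possible way to write that loop is sketched below, using a small made-up data frame. As with the list example earlier, df$i would create a single column literally called "i", so [[i]] is used instead. nrow(df) gives the number of rows, whereas length(df) counts columns.

```r
# Hypothetical data frame with a geo column
df <- data.frame(geo = c("AL", "AT1", "AL", "BE1"), value = 1:4,
                 stringsAsFactors = FALSE)
geo <- unique(df$geo)

# One zero-filled column per unique geo code
for (i in geo) {
  df[[i]] <- rep(0, nrow(df))
}

# The same result without a loop, on a fresh copy of the first two columns
df2 <- df[c("geo", "value")]
df2[geo] <- 0
```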