I want to multiply a data frame by the number 4.193215e+12. My code is
df <- cbind(Dataset = df$Dataset, df[,2:4] * 4.193215e^12
However, an error appears. What is the proper way to write the number 4.193215e+12 in R?
This is documented in the not-quite-obvious location ?NumericConstants, but I am hard-pressed to think of a language in which Xe^Y is syntactically correct. Use either e notation (4.193215e12 or 4.193215e+12) or ^ (4.193215 * 10^12) for powers, never both.
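As a minimal sketch of the two valid spellings, assuming a small made-up data frame (the column names v1 to v3 and the values are hypothetical, chosen only to match the Dataset plus three numeric columns layout in the question):

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(Dataset = c("a", "b"),
                 v1 = c(1, 2), v2 = c(3, 4), v3 = c(5, 6))

k_e     <- 4.193215e12        # e notation
k_eplus <- 4.193215e+12       # same literal with an explicit sign
k_pow   <- 4.193215 * 10^12   # explicit power of ten

# Scale the numeric columns and keep the label column
scaled <- cbind(Dataset = df$Dataset, df[, 2:4] * k_e)
scaled
```

The e form is a single numeric literal, while the ^ form is an arithmetic expression, so the two can differ in the last bit; all.equal() is the safer comparison between them.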
I have a dataset in R with two columns labelled x and y each with over 1000 values. I need to find sum((xi^2-xbar^2)(yi-ybar))/sum((xi-xbar)^4) for a linear regression problem. All I can think to use is:
sum(((data$x)^2-mean(data$x)^2)(data$y-mean(data$y)))/sum((data$x-mean(data$x))^4)
But this just gives me Error: attempt to apply non-function. I haven't got a clue how to correct this. Any help would be much appreciated.
Question: How do you figure out what the problem is in an expression that is visually overwhelming?
Answer: take it apart piece by piece.
df <- data.frame(x = rnorm(10), y = rnorm(10))
df$x^2
# works fine
df$x^2 - mean(x)^2
# works fine **SEE NOTE**
sum(df$x^2 - mean(x)^2)
# works fine
# sum(DF$x^2 - mean(x)^2)(data$y-mean.... oh i see
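You're trying to multiply by putting parentheses next to each other. R reads (a)(b) as a call to a function a, which is why you get "attempt to apply non-function". Use * to multiply.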
NOTE: No, it doesn't. mean(x) looks up a free-standing object x rather than the column df$x, so on a second pass you might discover that your values aren't correct; write mean(df$x) instead. This isn't what throws the error, though, if you already have an x object in your environment (and that object doesn't have any NA values).
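To see the error in isolation, here is a small sketch on simulated data (the seed and the data frame are made up for illustration). It captures the message from adjacent parentheses and then evaluates the formula from the question with an explicit *:

```r
set.seed(1)
df <- data.frame(x = rnorm(10), y = rnorm(10))

# Adjacent parentheses are parsed as a function call, not a product
msg <- tryCatch((df$x^2)(df$y), error = function(e) conditionMessage(e))
msg

# Same formula as in the question, with the multiplication spelled out
xbar <- mean(df$x)
ybar <- mean(df$y)
result <- sum((df$x^2 - xbar^2) * (df$y - ybar)) / sum((df$x - xbar)^4)
result
```

Storing the means in xbar and ybar first also keeps the expression short enough to check by eye.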
I think this is related to the parentheses and to how you refer to the x and y variables in data. Try the following.
sum(((data$x)^2-(mean(data$x))^2)*(data$y-mean(data$y)))/sum((data$x-mean(data$x))^4)
I need to take the limit of the function $\frac{x^n-1}{x-1}$ in R, but I want the answer in terms of n. I tried defining n as a symbol, but it did not work. I am new to R, so I would appreciate any assistance.
Assuming that you want to take the limit of this function for x->1, you can obtain the result by using the package Ryacas in the following way:
require(Ryacas)
x <- Sym("x")
n <- Sym("n")
Limit((x^n-1)/(x-1),x,1)
which yields the answer:
expression(n)
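The symbolic result can be sanity-checked numerically in base R. For a positive integer n the quotient equals 1 + x + ... + x^(n-1), which is n at x = 1. The sketch below is only a check, not a replacement for the symbolic answer; the helper f, the step 1e-6 and the chosen n values are assumptions made for illustration:

```r
# Evaluate (x^n - 1)/(x - 1) just to the right of x = 1
f <- function(x, n) (x^n - 1) / (x - 1)

n_values <- c(2, 5, 10)
approx_limits <- sapply(n_values, function(n) f(1 + 1e-6, n))

# Each approximation should sit close to the corresponding n
cbind(n = n_values, approx = approx_limits)
```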
I'm having odd results with quantmod monthlyReturn function. Here is an example:
require(quantmod)
getSymbols("VOO")
adj <- Ad(VOO["2010-09"])
monthlyReturn(adj)
(as.numeric(tail(adj)[6]) - as.numeric(adj[1])) / as.numeric(adj[1])
The last two commands give the same answer, 0.03559799.
However, as.numeric(tail(adj)[6]) and as.numeric(adj[1]) give me the values 92.81556 and 89.62508 respectively, and (92.81556 - 89.62508)/89.62508 gives 0.03559807, which is correct but differs from the results above.
Can somebody please explain what is wrong and why there is a difference?
You're losing precision when you print the numbers with so few digits. R shows 7 significant digits by default, but the stored values carry more.
options(digits=20)
as.numeric(tail(adj)[6])
# 92.815557999999995786
as.numeric(adj[1])
# 89.625084999999998558
(as.numeric(tail(adj)[6]) - as.numeric(adj[1])) / as.numeric(adj[1])
# 0.035597991343606506798
(92.815557999999995786 - 89.625084999999998558)/89.625084999999998558
# 0.035597991343606506798
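The same effect can be reproduced without quantmod or a network connection. This sketch types in the two prices from the output above (shortened to 92.815558 and 89.625085) and compares the return computed from them with the return computed from the 7-digit printed values; the variable names are made up for illustration:

```r
# Prices as stored (taken from the digits = 20 output above)
end_price   <- 92.815558
start_price <- 89.625085

# Prices as they appear at the default 7 significant digits
end_shown   <- 92.81556
start_shown <- 89.62508

ret_full  <- (end_price - start_price) / start_price
ret_shown <- (end_shown - start_shown) / start_shown

print(c(full = ret_full, from_printed = ret_shown), digits = 10)
```

The subtraction cancels most of the leading digits, so a rounding error in the seventh significant digit of the prices shows up much earlier in the return.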
For the life of me, I cannot understand why this method is failing. I would really appreciate an additional set of eyes here:
heatmap.2(TEST,trace="none",density="none",scale="row",
ColSideColors=c("red","blue")[data.test.factors],
col=redgreen,labRow="",
hclustfun=function(x) hclust(x,method="complete"),
distfun=function(x) as.dist((1 - cor(x))/2))
The error that I get is:
row dendrogram ordering gave index of wrong length
If I don't include the distfun, everything works really well and is responsive to the hclust function. Any advice would be greatly appreciated.
The standard call to dist computes the distances between the rows of the matrix provided, whereas cor computes the correlations between its columns. So for the above example to work, you need to transpose the matrix:
heatmap.2(TEST,trace="none",density="none",scale="row",
ColSideColors=c("red","blue")[data.test.factors],
col=redgreen,labRow="",
hclustfun=function(x) hclust(x,method="complete"),
distfun=function(x) as.dist((1 - cor( t(x) ))/2))
should work. If you use a square matrix, the untransposed code also runs without error, but it won't be calculating what you think it is.
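The size mismatch can be seen with base R alone, without calling heatmap.2. This is a rough sketch on a made-up 10 x 4 matrix (the name m and the dimensions are assumptions), comparing the dist object built from cor(m) with the one built from cor(t(m)):

```r
set.seed(42)
m <- matrix(runif(40), nrow = 10)   # 10 rows, 4 columns

# cor() correlates columns, so this describes the 4 columns
d_cols <- as.dist((1 - cor(m)) / 2)

# Transposing first gives distances between the 10 rows
d_rows <- as.dist((1 - cor(t(m))) / 2)

attr(d_cols, "Size")
attr(d_rows, "Size")

# The row dendrogram needs one entry per row of m
row_order <- hclust(d_rows, method = "complete")$order
length(row_order) == nrow(m)
```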
This is not reproducible yet ...
library(gplots)  # provides heatmap.2
TEST <- matrix(runif(100),nrow=10)
heatmap.2(TEST, trace="none", density="none",
scale="row",
labRow="",
hclustfun=function(x) hclust(x,method="complete"),
distfun=function(x) as.dist((1-cor(x))/2))
works for me. I don't know what redgreen or data.test.factors are.
Have you tried debug(heatmap.2) or options(error=recover) (or traceback(), although it's unlikely to be useful on its own) to try to track down the precise location of the error?
> sessionInfo()
R version 2.13.0 alpha (2011-03-18 r54865)
Platform: i686-pc-linux-gnu (32-bit)
...
other attached packages:
[1] gplots_2.8.0 caTools_1.12 bitops_1.0-4.1 gdata_2.8.2 gtools_2.6.2
Building on Ben Bolker's reply, your code seems to work if TEST is an n×n matrix and data.test.factors is a vector of n integers. So for example starting with
n1 <- 5
n2 <- 5
n3 <- 5
TEST <- matrix(runif(n1*n2), nrow=n1)
data.test.factors <- sample(n3)
then your code will work. However, if n1 and n2 are different, you will get the error "row dendrogram ordering gave index of wrong length". If they are the same but n3 is different, or data.test.factors has non-integers, you will get the error "'ColSideColors' must be a character vector of length ncol(x)".
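The two conditions can be checked before calling heatmap.2. The helper below, check_heatmap_inputs, is hypothetical (it is not part of gplots) and only mirrors the size requirements described in this answer for the untransposed distfun from the question:

```r
# Hypothetical pre-flight check; not a gplots function
check_heatmap_inputs <- function(m, col_factors) {
  side_cols <- c("red", "blue")[col_factors]
  c(
    # cor(m) has ncol(m) rows, but the row dendrogram needs nrow(m)
    rows_ok = ncol(cor(m)) == nrow(m),
    # ColSideColors must be a character vector with one entry per column
    cols_ok = is.character(side_cols) && length(side_cols) == ncol(m)
  )
}

set.seed(7)
square <- check_heatmap_inputs(matrix(runif(25), nrow = 5), sample(5))
uneven <- check_heatmap_inputs(matrix(runif(20), nrow = 5), sample(5))
square
uneven
```

Passing this check says nothing about whether the clustering is meaningful: with a square matrix the untransposed distfun still clusters columns where rows were intended.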