I'm just learning to create functions in R so I'm trying to make a function which graphs residual lines for a linear regression. I've already tried it and the code works outside of the function, but once I put it all into a function I get the 'x' and 'y' lengths differ error.
Here is my function:
`reslines <- function(x,y) {
abline(lm(y~x))
for(k in 1: length(y)) lines(c(x[k],x[k]), c(y[k], predict(lm(y~x))))
}`
The traceback shows that the error occurs here:
6 stop("'x' and 'y' lengths differ")
5 xy.coords(x, y)
4 plot.xy(xy.coords(x, y), type = type, ...)
3 lines.default(c(x[k], x[k]), c(y[k], predict(lm(y ~ x))))
2 lines(c(x[k], x[k]), c(y[k], predict(lm(y ~ x))))
1 reslines(a, b)
I've checked the lengths of each data set I've tried using the length() function, and they all match, so something is happening inside the function which appears to change the length of 'x' or 'y' or both.
Can anyone tell me what the error is and how to fix it? Thanks.
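I think I fixed it. The main problem was the predict() call: predict(lm(y~x)) returns the fitted values for all points, so c(y[k], predict(lm(y~x))) has length(y) + 1 elements while c(x[k], x[k]) has only 2, which is what triggers the lengths-differ error. You need the prediction for x[k] only. There was a little more to it as well: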
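reslines <- function(x, y) {
  plot(y ~ x)
  lm.xy <- lm(y ~ x)   # fit the model once, outside the loop
  abline(lm.xy)
  for (k in seq_along(y)) {
    # vertical line from the observed y[k] to the fitted value at x[k]
    lines(c(x[k], x[k]), c(y[k], predict(lm.xy, data.frame(x = x[k]))))
  }
}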
Now a test
set.seed(123)
reslines(rnorm(10), rnorm(10))
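The loop can also be avoided entirely. A minimal sketch, assuming the same x and y inputs: segments() from base graphics is vectorised, so a single call draws every residual line. The name reslines2 is made up for this example.

```r
# Hypothetical vectorised variant of reslines(); the name is an assumption.
reslines2 <- function(x, y) {
  fit <- lm(y ~ x)
  plot(x, y)
  abline(fit)
  # one vertical segment per point, from the observed y to the fitted value
  segments(x0 = x, y0 = y, x1 = x, y1 = fitted(fit))
  invisible(fit)
}

set.seed(123)
fit <- reslines2(rnorm(10), rnorm(10))
```

Returning the fitted model invisibly is just a convenience so the fit can be inspected afterwards.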
I'm optimising a simple function in R using 'nloptr' and I'm having difficulty passing arguments to the objective function. Here is the code I'm using:
require("nloptr")
Correl <- matrix(c(1,-.3,-.3,1), nrow=2, ncol=2)
Wghts <- c(.01,.99)
floor <- .035
expret <- c(.05,.02)
pf.return <- function(r, x, threshold=0){
return(r * x - threshold)
}
pf.vol <- function(x, C){
return(sqrt(x %*% C %*% x))
}
res <- nloptr(x0=Wghts,eval_f = pf.vol,eval_g_ineq=pf.return,opts=list(algorithm="NLOPT_GN_ISRES"), x=Wghts,C=Correl)
(I know I'm missing parameters here but I'm trying to highlight a behaviour I don't understand)
Running this gives the following error:
Error in .checkfunargs(eval_f, arglist, "eval_f") :
'x' passed to (...) in 'nloptr' but this is not required in the eval_f function.
Just to see what happens I can try running it without the x argument:
res <- nloptr(x0=Wghts,eval_f = pf.vol,eval_g_ineq=pf.return,opts=list(algorithm="NLOPT_GN_ISRES"), C=Correl)
which gives the error:
Error in .checkfunargs(eval_g_ineq, arglist, "eval_g_ineq") :
eval_g_ineq requires argument 'x' but this has not been passed to the 'nloptr' function.
So including x throws an error that it is unnecessary and omitting it throws an (at least understandable) error that it has been omitted.
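OK, for posterity:
I rewrote the functions so that they take the same set of arguments, in the same order. nloptr passes every extra argument supplied through ... to both eval_f and eval_g_ineq, which is why a mismatch in either function raises an error.
I also dropped the x=Wghts bit, since x is the parameter I'm searching over and its starting value is already given through x0.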
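A rough sketch of what that rewrite could look like. The argument names r and threshold, the bounds, the starting point and the form of the constraint (expected return at least the floor) are my assumptions, not the exact code I ended up with; nloptr expects inequality constraints in the form g(x) <= 0.

```r
Correl <- matrix(c(1, -0.3, -0.3, 1), nrow = 2, ncol = 2)
expret <- c(0.05, 0.02)

# Objective: portfolio volatility. r and threshold are unused here but
# kept so the signature matches the constraint function.
pf.vol <- function(x, C, r, threshold) {
  as.numeric(sqrt(t(x) %*% C %*% x))
}

# Inequality constraint in g(x) <= 0 form:
# threshold - expected return <= 0, i.e. expected return >= threshold.
pf.return <- function(x, C, r, threshold) {
  threshold - sum(r * x)
}

# Only run the optimiser if nloptr is installed.
if (requireNamespace("nloptr", quietly = TRUE)) {
  res <- nloptr::nloptr(
    x0 = c(0.5, 0.5),
    eval_f = pf.vol,
    eval_g_ineq = pf.return,
    lb = c(0, 0), ub = c(1, 1),   # assumed bounds for the global search
    opts = list(algorithm = "NLOPT_GN_ISRES", maxeval = 5000),
    C = Correl, r = expret, threshold = 0.035
  )
  print(res$solution)
}
```

Note that x is not passed by name anywhere: nloptr supplies it as the first argument of both functions.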
How does R dispatch plot functions? The standard generic is defined as
plot <- function (x, y, ...) UseMethod("plot")
So usually, all plot methods need arguments x and y. Yet, there exists a variety of other functions with different arguments. Some functions only have argument x:
plot.data.frame <- function (x, ...)
others even have neither x nor y:
plot.formula <- function(formula, data = parent.frame(), ..., subset,
ylab = varnames[response], ask = dev.interactive())
How does this work and where is this documented?
Background
In my package papeR (see GitHub) I want to replace the function plot.data.frame, which is defined in the R package graphics with my own version. Yet, this is usually not allowed
Do not replace registered S3 methods from base/recommended packages,
something which is not allowed by the CRAN policies and will mean that
everyone gets your method even if your namespace is unloaded.
as Brian Ripley let me know last time I tried to do such a thing. A possible solution is as follows:
If you want to change the behaviour of a generic, say predict(), for
an existing class or two, you could add such as generic in your own
package with default method stats::predict, and then register modified
methods for your generic (in your own package).
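To make the quoted advice concrete, here is a minimal sketch using predict(). The class name "myclass" and the return value of its method are placeholders, and classes for which stats has no method are not handled here.

```r
# Own generic; inside a package this would mask stats::predict.
predict <- function(object, ...) UseMethod("predict")

# Default method: hand everything else back to the original generic.
predict.default <- function(object, ...) stats::predict(object, ...)

# Modified method for a hypothetical class "myclass".
predict.myclass <- function(object, ...) "custom predict for myclass"

fit <- lm(dist ~ speed, data = cars)
obj <- structure(list(), class = "myclass")

p_lm <- predict(fit)   # same result as stats::predict(fit)
p_my <- predict(obj)   # uses the modified method
```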
For other methods I could easily implement this (e.g. toLatex), yet, with plot I am having problems. I added the following to my code:
## overwrite standard generic
plot <- function(x, y, ...)
UseMethod("plot")
## per default fall back to standard generic
plot.default <- function(x, y, ...)
graphics::plot(x, y, ...)
## now specify modified plot function for data frames
plot.data.frame <- function(x, variables = names(x), ...)
This works for data frames and plots with x and y. Yet, it does not work if I try to plot a formula etc:
Error in eval(expr, envir, enclos) :
argument "y" is missing, with no default
I also tried to use
plot.default <- function(x, y, ...)
UseMethod("graphics::plot")
but then I get
Error in UseMethod("graphics::plot") :
no applicable method for 'graphics::plot' applied to an object of class "formula"
So the follow up question is how I can fix this?
[Edit:] Using my solution below fixes the problems within the package. Yet, plot.formula is broken afterwards:
library("devtools")
install_github("hofnerb/papeR")
example(plot.formula, package="graphics") ## still works
library("papeR")
example(plot, package = "papeR") ## works
### BUT
example(plot.formula, package="graphics") ## is broken now
Thanks to @Roland I solved part of my problem.
It seems that the position of the arguments is used for method dispatch (and not only the names). Names are, however, partially used. So with Roland's example
> plot.myclass <- function(a, b, ...)
> print(b)
> x <- 1
> y <- 2
> class(x) <- "myclass"
we have
> plot(x, y)
[1] 2
> plot(a = x, b = y)
[1] 2
but if we use the standard argument names
> plot(x = x, y = y)
Error in print(b) (from #1) : argument "b" is missing, with no default
it doesn't work. As one can see, x is correctly used for the dispatch but b is then "missing". Also, we cannot swap a and b:
> plot(b = y, a = x)
Error in plot.default(b = y, a = x) :
argument 2 matches multiple formal arguments
Yet, one could use a different order if the argument one wants to dispatch for is the first (?) element without name:
> plot(b = y, x)
[1] 2
Solution to the real problem:
I had to use
plot.default <- function(x, y, ...)
graphics::plot(x, y, ...)
The real issue was internally in my plot.data.frame method where I was using something along the lines of:
plot(x1 ~ y1)
without specifying data. Usually this works as data = parent.frame() per default. Somehow in this case this wasn't working. I now use plot(y1, x1) which works like a charm. So this was my sloppiness.
Using the swiss data set, I am trying to run stepwise linear regression separately for different ranges of Agriculture, so I tried:
data <- swiss
splits <- split(data, cut(data$Agriculture, breaks=c(0, 50, Inf), right=FALSE))
select <- function(x) {
null <- lm(Fertility~1, data=splits[[x]])
full <- lm(Fertility~., data=splits[[x]])
step(null, scope=list(lower=null, upper=full, direction='forward'))
}
select(2)
This works, but the following doesn't:
null_list <- lapply(splits, function(x) {lm(Fertility~1, data=x)})
full_list <- lapply(splits, function(x) {lm(Fertility~., data=x)})
select <- function(x) {
null <- null_list[[x]]
full <- full_list[[x]]
step(null, scope=list(lower=null, upper=full, direction='forward'))
}
select(2)
The second version throws an error:
Error in eval(expr, envir, enclos) : object 'Fertility' not found
But when I check
lm(Fertility~1, data=splits[[2]])
null_list[[2]]
and
lm(Fertility~., data=splits[[2]])
full_list[[2]]
They both look the same. What makes the difference? Any stupid mistakes made?
Well, if you look at the calls for the two versions, you'll see they are not exactly the same
lm(Fertility~1, data=splits[[2]])$call
# lm(formula = Fertility ~ 1, data = splits[[2]])
null_list[[2]]$call
# lm(formula = Fertility ~ 1, data = x)
Notice how the data= argument is different for each. The former still points to a valid global variable; the latter points to x, which no longer exists once the anonymous function passed to lapply() has returned. step() re-evaluates the stored call, including data=, in the context from which it was called. In that context x is the argument of your select() function (the index 2), not a data frame. If you changed the select() function to
select <- function(z) {
null <- null_list[[z]]
full <- full_list[[z]]
step(null, scope=list(lower=null, upper=full, direction='forward'))
}
select(2)
You'd get a different error
Error in is.data.frame(data) : object 'x' not found
which basically means that step() is having trouble getting back to the variable that contains the data that can be used to re-fit a model adding or subtracting a covariate.
One workaround would be to embed the data in the lm() call itself. You can do that with
null_list <- lapply(splits, function(x) {do.call("lm", list(Fertility~1, data=x))})
full_list <- lapply(splits, function(x) {do.call("lm", list(Fertility~., data=x))})
This results in a "messy-looking" call, because the whole data frame is embedded in it, but the results should be the same.
This is unfortunately a side-effect of non-standard evaluation. It would be nice if step() looked for the data in the $model property of the full model, but I believe this doesn't match up when you have NA values so R has no choice but to try to re-evaluate the data= parameter in some context.
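Putting the do.call() workaround together end to end, as a sketch rather than the only way to do it. Two things differ from the question's code and are my own choices: direction is passed to step() directly (scope only uses its lower and upper components), and trace = 0 just silences the step-by-step output.

```r
splits <- split(swiss, cut(swiss$Agriculture, breaks = c(0, 50, Inf), right = FALSE))

# do.call() stores the data frame itself in the call, so step() can
# re-fit the model without looking up a variable named x.
null_list <- lapply(splits, function(d) do.call("lm", list(Fertility ~ 1, data = d)))
full_list <- lapply(splits, function(d) do.call("lm", list(Fertility ~ ., data = d)))

select <- function(i) {
  step(null_list[[i]],
       scope = list(lower = formula(null_list[[i]]),
                    upper = formula(full_list[[i]])),
       direction = "forward", trace = 0)
}

# Run the forward selection for every split.
selected <- lapply(seq_along(splits), select)
```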
I am having a problem with the package glmnet in R. I am trying to use it off-the-shelf, and am getting the following problem:
test <- glmnet(seq.trans,rsem.trans)
Error in weighted.mean.default(y, weights) :
'x' and 'w' must have the same length
But the inputs are the same size:
dim(seq.trans)
# [1] 28 17763
dim(rsem.trans)
# [1] 28 17763
What is causing this error?
I had the same problem and found that the solution was to pass both X and y as matrices. I was running the code below without as.matrix() and got the same error; after adding it, the call worked. Also see the example in this tutorial: if you load the data that ships with the package, you'll see that x and y in the first example are both matrices.
library(glmnet)
library(dplyr)
X <- as.matrix(select(mtcars, -mpg))
y <- as.matrix(select(mtcars, mpg))
fit <- glmnet(X, y)
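In the context of glmnet(x, y) with the default family = "gaussian", the variable y should be a numeric vector with one value per row of x (a matrix response is meant for family = "mgaussian").
In your example, you could achieve this using:
glmnet(seq.trans, as.vector(rsem.trans))
Keep in mind that as.vector() flattens the whole matrix, so the lengths only line up if rsem.trans has a single column; with several response columns, pass one column at a time.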
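A small sketch of the expected shapes, using mtcars instead of the data from the question (which is not available here) and base R subsetting instead of dplyr. The fit itself only runs if glmnet is installed.

```r
X <- as.matrix(mtcars[, setdiff(names(mtcars), "mpg")])  # 32 x 10 numeric matrix
y <- mtcars$mpg                                           # numeric vector, length 32

# The shapes a single quantitative response needs: matrix x, vector y,
# one response value per row of x.
stopifnot(is.matrix(X), is.numeric(y), length(y) == nrow(X))

if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(X, y)
  print(dim(coef(fit)))
}
```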
This is probably a simple coding issue but I can't for the life of me get it.
Using this code for a logit:
glm(formula = cbind(Found, Missing) ~ Male + Age, family = binomial,
data = table.5.15)
I can't get the Hosmer-Lemeshow to work:
hosmerlem(miss.logit$cbind(Found,Missing), fitted(miss.logit))
Error in cbind(1 - y, y) : attempt to apply non-function
I realize this is a problem with having the cbind in my logit model.
Assuming you're using some implementation of the Hosmer Lemeshow test which roughly resembles Frank Harrell's,
it seems most likely that your mistake is a basic syntactical issue:
miss.logit$cbind(Found,Missing),
Your $ operator is not smart enough to reference both Found and Missing as objects which resolve in the scope of miss.logit. For instance:
> x <- data.frame('n'=1:26, 'l'=letters[1:26])
> x$cbind(n, l)
Error: attempt to apply non-function
The issue is that R thinks cbind is a function that lives in x which you're trying to evaluate on two globals n and l. Even if I made cbind an element of x, n and l would need to be referenced within x as well.
I can correct this code by using with() instead, or just basic subsetting.
> x[, c('n', 'l')] ## works (best)
> with(x, cbind(n, l)) ## works
> cbind(x$n, x$l) ## works (worst)
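For a fitted glm specifically, the response can also be pulled out of the model object itself rather than through $cbind(...). A sketch on made-up toy data (the numbers and the single Age predictor are assumptions, not the table.5.15 data from the question):

```r
# Toy grouped binomial data; values are invented for illustration.
toy <- data.frame(Found   = c(8, 5, 9, 3),
                  Missing = c(2, 5, 1, 7),
                  Age     = c(20, 35, 50, 65))

fit <- glm(cbind(Found, Missing) ~ Age, family = binomial, data = toy)

# The two-column response matrix as it was passed to glm()
resp <- model.response(model.frame(fit))

# fit$y holds the observed proportions Found / (Found + Missing)
prop <- unname(fit$y)
```

Which of the two a given hosmerlem() implementation expects depends on how it was written, so check its y argument before passing either one.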
FYI for anyone else.
I'm an R newbie and I had a similar problem today...
My code:
hosmerlem(y = MM_LOGIT$sick_or_not, yhat = fitted(MM_LOGIT))
Error in model.frame.default(formula = cbind(1 - y, y) ~ cutyhat) :
variable lengths differ (found for 'cutyhat')
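I had to change the y= argument to point to the column in my original data rather than the logistic/glm output. The fitted model object has no element named sick_or_not, so MM_LOGIT$sick_or_not is most likely NULL and its length cannot match yhat: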
hosmerlem(y=MYDATA$sick_or_not, yhat=fitted(MM_LOGIT))
$chisq
[1] 20.24864
$p.value
[1] 0.0003960473
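For reference, hosmerlem() is not part of base R. The error messages above (cbind(1 - y, y), cutyhat) are consistent with a commonly circulated helper along the following lines; this is a sketch under that assumption, not necessarily the exact function used in either post. The simulated data at the end is only there to show a call.

```r
# Assumed definition of hosmerlem(); y is a 0/1 vector, yhat the fitted
# probabilities, g the number of groups.
hosmerlem <- function(y, yhat, g = 10) {
  cutyhat <- cut(yhat,
                 breaks = quantile(yhat, probs = seq(0, 1, 1 / g)),
                 include.lowest = TRUE)
  obs    <- xtabs(cbind(1 - y, y) ~ cutyhat)        # observed counts per group
  expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)  # expected counts per group
  chisq  <- sum((obs - expect)^2 / expect)
  list(chisq = chisq, p.value = 1 - pchisq(chisq, g - 2))
}

# Simulated example: y and yhat must have the same length.
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(x))
fit <- glm(y ~ x, family = binomial)
hl <- hosmerlem(y = y, yhat = fitted(fit))
```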