Perform Deming regression without intercept in R

I would like to perform Deming regression (or any equivalent regression method with uncertainties in both the X and Y variables, such as York regression).
In my application, I have a very good scientific justification for deliberately setting the intercept to zero. However, I can't find a way of doing so. The R package deming gives an error when I use -1 in the formula:
df <- data.frame(x = rnorm(10), y = rnorm(10), sx = runif(10), sy = runif(10))
library(deming)
deming(y ~ x - 1, df, xstd = sy, ystd = sy)
Error in lm.wfit(x, y, wt/ystd^2) : 'x' must be a matrix
In other packages (like mcr::mcreg or IsoplotR::york or MethComp::Deming), the inputs are two vectors x and y, so there is no way to supply a model matrix or modify the formula.
Do you have any idea on how to achieve that? Thanks.

There is a bug in the function when you remove the intercept; I'm going to report it.
It is easy to fix: you just have to change two lines in the original function.
The print method does not work correctly on the result, but the output is easy to interpret.
deming.aux <- function(formula, data, subset, weights, na.action, cv = FALSE,
                       xstd, ystd, stdpat, conf = 0.95, jackknife = TRUE,
                       dfbeta = FALSE, x = FALSE, y = FALSE, model = TRUE)
{
    ## pull the unexported fitting routines out of the package
    deming.fit1 <- getAnywhere(deming.fit1)[[2]][[1]]
    deming.fit2 <- getAnywhere(deming.fit2)[[2]][[1]]
    Call <- match.call()
    indx <- match(c("formula", "data", "weights", "subset", "na.action",
                    "xstd", "ystd"), names(Call), nomatch = 0)
    if (indx[1] == 0)
        stop("A formula argument is required")
    temp <- Call[c(1, indx)]
    temp[[1]] <- as.name("model.frame")
    mf <- eval(temp, parent.frame())
    Terms <- terms(mf)
    n <- nrow(mf)
    if (n < 3)
        stop("less than 3 non-missing observations in the data set")
    xstd <- model.extract(mf, "xstd")
    ystd <- model.extract(mf, "ystd")
    ## changed line 1: wrap the response in as.matrix()
    Y <- as.matrix(model.response(mf, type = "numeric"))
    if (is.null(Y))
        stop("a response variable is required")
    wt <- model.weights(mf)
    if (length(wt) == 0)
        wt <- rep(1, n)
    usepattern <- FALSE
    if (is.null(xstd)) {
        if (!is.null(ystd))
            stop("both of xstd and ystd must be given, or neither")
        if (missing(stdpat)) {
            if (cv)
                stdpat <- c(0, 1, 0, 1)
            else stdpat <- c(1, 0, 1, 0)
        }
        else {
            if (any(stdpat < 0) || all(stdpat[1:2] == 0) ||
                all(stdpat[3:4] == 0))
                stop("invalid stdpat argument")
        }
        if (stdpat[2] > 0 || stdpat[4] > 0)
            usepattern <- TRUE
        else {
            xstd <- rep(stdpat[1], n)
            ystd <- rep(stdpat[3], n)
        }
    }
    else {
        if (is.null(ystd))
            stop("both of xstd and ystd must be given, or neither")
        if (!is.numeric(xstd))
            stop("xstd must be numeric")
        if (!is.numeric(ystd))
            stop("ystd must be numeric")
        if (any(xstd <= 0))
            stop("xstd must be positive")
        if (any(ystd <= 0))
            stop("ystd must be positive")
    }
    if (conf < 0 || conf >= 1)
        stop("invalid confidence level")
    if (!is.logical(dfbeta))
        stop("dfbeta must be TRUE or FALSE")
    if (dfbeta & !jackknife)
        stop("the dfbeta option only applies if jackknife=TRUE")
    X <- model.matrix(Terms, mf)
    if (ncol(X) != (1 + attr(Terms, "intercept")))
        stop("Deming regression requires a single predictor variable")
    ## changed line 2: take the last column, so it also works without intercept
    xx <- X[, ncol(X), drop = FALSE]
    if (!usepattern)
        fit <- deming.fit1(xx, Y, wt, xstd, ystd,
                           intercept = attr(Terms, "intercept"))
    else fit <- deming.fit2(xx, Y, wt, stdpat,
                            intercept = attr(Terms, "intercept"))
    yhat <- fit$coefficients[1] + fit$coefficients[2] * xx
    fit$residuals <- Y - yhat
    if (x)
        fit$x <- X
    if (y)
        fit$y <- Y
    if (model)
        fit$model <- mf
    na.action <- attr(mf, "na.action")
    if (length(na.action))
        fit$na.action <- na.action
    fit$n <- length(Y)
    fit$terms <- Terms
    fit$call <- Call
    ## note: the "deming" class is deliberately not assigned, so the broken
    ## print method is never invoked
    fit
}
deming.aux(y ~ x + 0, df, xstd=sy, ystd=sy)
$`coefficients`
[1] 0.000000 4.324481
$se
[1] 0.2872988 0.7163073
$sigma
[1] 2.516912
$residuals
[,1]
1 9.19499513
2 2.13037957
3 3.00064886
4 2.16751905
5 0.00168729
6 4.75834265
7 3.44108236
8 6.40028085
9 6.63531039
10 -1.48624851
$model
y x (xstd) (ystd)
1 2.1459817 -1.6300251 0.48826221 0.48826221
2 1.3163362 -0.1882407 0.46002166 0.46002166
3 1.5263967 -0.3409084 0.55771660 0.55771660
4 -0.9078000 -0.7111417 0.81145673 0.81145673
5 -1.6768719 -0.3881527 0.01563191 0.01563191
6 -0.6114545 -1.2417205 0.41675425 0.41675425
7 -0.7783790 -0.9757150 0.82498713 0.82498713
8 1.1240046 -1.2200946 0.84072712 0.84072712
9 -0.3091330 -1.6058442 0.35926078 0.35926078
10 0.7215432 0.5105333 0.23674788 0.23674788
$n
[1] 10
$terms
y ~ x + 0
...
To adapt the function, I performed these steps:
1. Load the internal functions of the package.
deming.fit1 <- getAnywhere(deming.fit1)[[2]][[1]]
deming.fit2 <- getAnywhere(deming.fit2)[[2]][[1]]
2. Locate the problem and solve it by executing the function step by step with an example. The two changed lines are:
Y <- as.matrix(model.response(mf, type = "numeric"))
...
xx <- X[, ncol(X), drop = FALSE]
3. Fix any other errors generated by the changes.
In this case, drop the "deming" class from the returned object so that the broken print method is never called.
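As an aside, the same internal fitters can usually be reached directly with the ::: operator; a sketch, assuming the package still keeps these as unexported internals:
deming.fit1 <- deming:::deming.fit1  # assumed unexported internals of deming
deming.fit2 <- deming:::deming.fit2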
Bug report: Terry Therneau (the author of deming) has uploaded a new version to CRAN with this problem solved.
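With the updated package, the original call should work directly; a minimal check, assuming a deming version that includes the fix:
library(deming)
df <- data.frame(x = rnorm(10), y = rnorm(10), sx = runif(10), sy = runif(10))
fit <- deming(y ~ x + 0, df, xstd = sx, ystd = sy)
fit$coefficients  # intercept fixed at 0, slope estimated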

Related

How to predict and extract R Squared with .lm.fit?

As the title suggests, I have seen users mention that .lm.fit() has a speed advantage over regular lm(). But looking at .lm.fit() in the help, it is a bare fitting function: it returns a list rather than a model object. Is it still possible to extract components like the R squared and adjusted R squared, and to run predict() on the result?
Below are sample data and executions:
library(tictoc) # tic()/toc() timers used below

test_dat <- data.frame(y = rnorm(780, 20, 10))
for (b in 1:300) {
    name_var <- paste0("x", b)
    test_dat[[name_var]] <- rnorm(780, 0.01 * b, 5)
}

tic()
obj_lm <- lm(y ~ ., data = test_dat)
print(class(obj_lm))
print(summary(obj_lm)$r.squared)
print(summary(obj_lm)$adj.r.squared)
predict(obj_lm)
toc() # approximately 0.4 seconds

tic()
datm <- as.matrix(test_dat)
obj_lm_fit <- .lm.fit(cbind(1, datm[, -1]), datm[, 1])
print(class(obj_lm_fit))
toc() # approximately 0.2 seconds
The functions predict and resid are generic, and since .lm.fit returns an object of class "list", all you have to do is write "list" methods implementing what you want. Below are methods to compute fitted values, residuals, and R^2.
set.seed(2023) # make the results reproducible
test_dat <- data.frame(y = rnorm(780, 20, 10))
for (b in 1:300) {
    name_var <- paste0("x", b)
    test_dat[[name_var]] <- rnorm(780, 0.01 * b, 5)
}
obj_lm <- lm(y ~ ., data = test_dat)
datm <- as.matrix(test_dat)
obj_lm_fit <- .lm.fit(cbind(1, datm[, -1]), datm[, 1])
#------------------------------------------------------------------------
# the methods for objects of class "list"
#
fitted.list <- function(object, X) {
    X %*% coef(object)
}
resid.list <- residuals.list <- function(object, X, y) {
    y_fitted <- fitted(object, X)
    y - y_fitted
}
rsquared <- function(x, ...) UseMethod("rsquared")
rsquared.default <- function(x, ...) {
    summary(x)$r.squared
}
rsquared.list <- function(object, X, y) {
    e <- resid.list(object, X, y)
    1 - sum(e^2)/sum((y - mean(y))^2)
}
rsquared(obj_lm_fit, cbind(1,datm[,-1]), datm[,1])
#> [1] 0.3948863
rsquared(obj_lm)
#> [1] 0.3948863
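The fitted and residuals methods defined above are used the same way, passing the model matrix (and the response) explicitly:
head(fitted(obj_lm_fit, cbind(1, datm[, -1])))
head(resid(obj_lm_fit, cbind(1, datm[, -1]), datm[, 1]))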
Created on 2023-01-03 with reprex v2.0.2
Edit 1
Added a method to also calculate the adjusted R^2:
adj_rsquared_list <- function(object, X, y) {
    r2 <- rsquared.list(object, X, y)
    k <- ncol(X) - 1
    n <- nrow(X)
    rate_of_error <- (1 - r2) * (n - 1) / (n - k - 1)
    adj_r2 <- 1 - rate_of_error
    return(adj_r2)
}
adj_rsquared_list(obj_lm_fit, cbind(1,datm[,-1]), datm[,1])
#> [1] 0.01590061
Edit 2
After the edit by Jovan, I have changed fitted.list above to use coef(), a function that extracts its first argument's list member "coefficients", if it exists, and rewrote the default and list methods of rsquared to accept an adj argument. The code computing the adjusted R^2 is copied from Jovan's code.
rsquared <- function(x, ...) UseMethod("rsquared")
rsquared.default <- function(x, adj = FALSE, ...) {
    if (adj) {
        summary(x)$adj.r.squared
    } else summary(x)$r.squared
}
rsquared.list <- function(object, X, y, adj = FALSE) {
    e <- resid.list(object, X, y)
    r2 <- 1 - sum(e^2)/sum((y - mean(y))^2)
    if (adj) {
        k <- ncol(X) - 1
        n <- nrow(X)
        rate_of_error <- (1 - r2) * (n - 1) / (n - k - 1)
        adj_r2 <- 1 - rate_of_error
        adj_r2
    } else r2
}
# same as above
rsquared(obj_lm_fit, cbind(1,datm[,-1]), datm[,1])
#> [1] 0.3948863
rsquared(obj_lm)
#> [1] 0.3948863
# new, `adj = TRUE`
rsquared(obj_lm_fit, cbind(1,datm[,-1]), datm[,1], adj = TRUE)
#> [1] 0.01590061
rsquared(obj_lm, adj = TRUE)
#> [1] 0.01590061
Created on 2023-01-03 with reprex v2.0.2
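The question also asked about predict(). A method in the same spirit is a one-liner; this is my sketch rather than part of the original answer, and it assumes the new data is already a model matrix that includes the intercept column:
# hypothetical predict method for the list returned by .lm.fit
predict.list <- function(object, newX, ...) {
    drop(newX %*% coef(object))
}
head(predict(obj_lm_fit, cbind(1, datm[, -1])))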

Restrictions in an optimization using R

Continuing the question asked here, I'd like to add a restriction to the optimization:
a <- c(52.67, 46.80, 41.74, 40.45)
b <- c(1.726219351, 1.842421805, 1.790801758, 1.449997494)
rsq <- function(c) {
    x <- log(a)
    y <- log((c * (a/b)) - 1)
    summary(lm(y ~ x))$r.squared
}
optimise(rsq, maximum = TRUE, interval=c(1, 1000))
The interval for the optimization is 1 to 1000; however, I'd like the interval to start where (c*(a/b)) > 0, to avoid problems with log.
Just return -Inf if the log argument would be negative. Then it is unnecessary to manipulate the domain.
rsq <- function(c) {
    x <- log(a)
    tmp <- (c * (a/b)) - 1
    if (any(tmp < 0)) -Inf else summary(lm(log(tmp) ~ x))$r.squared
}
optimise(rsq, maximum = TRUE, interval=c(1, 1000))
giving:
$maximum
[1] 1.082353
$objective
[1] 0.8093781
Update: Fixed.
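If you would rather restrict the interval itself, note that the log argument c*(a/b) - 1 is positive for every element exactly when c > max(b/a), so a feasible lower bound can be computed directly (a sketch along those lines; keep in mind optimise assumes a unimodal objective, so widening the interval may change which optimum is found):
# smallest feasible c: c * (a/b) - 1 > 0 for all elements <=> c > max(b/a)
lower <- max(b/a) + 1e-8
optimise(rsq, maximum = TRUE, interval = c(lower, 1000))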

Error in nrow(X) : object 'X' not found, but it gets defined

I am trying to implement logistic regression, and the function works when run manually, but for some reason I get the error "Error in nrow(X) : object 'X' not found", even though X is defined before the nrow command. I use the UCI "Adult" data set to test it.
If I try to run the function manually, there is no error. Can anyone explain that?
# Sigmoid function
sigmoid <- function(z) {
    g <- 1/(1 + exp(-z))
    return(g)
}

# Cost function
cost <- function(theta) {
    n <- nrow(X)
    g <- sigmoid(X %*% theta)
    J <- (1/n) * sum((-Y * log(g)) - ((1 - Y) * log(1 - g)))
    return(J)
}

log_reg <- function(datafr, m) {
    # Train/test data split
    sample <- sample(1:nrow(datafr), m)
    df_train <- datafr[sample, ]
    df_test <- datafr[-sample, ]
    num_features <- ncol(datafr) - 1
    num_label <- ncol(datafr)
    label_levels <- levels(datafr[, num_label])
    datafr[, num_features + 1] <- ifelse(datafr[, num_label] == names(table(datafr[, num_label]))[1], 0, 1)
    # Predictor variables
    X <- as.matrix(df_train[, 1:num_features])
    X_test <- as.matrix(df_test[, 1:num_features])
    # Add a column of ones to X
    X <- cbind(rep(1, nrow(X)), X)
    X_test <- cbind(rep(1, nrow(X_test)), X_test)
    # Response variable
    Y <- as.matrix(df_train[, num_label])
    Y <- ifelse(Y == names(table(Y))[1], 0, 1)
    Y_test <- as.matrix(df_test[, num_label])
    Y_test <- ifelse(Y_test == names(table(Y_test))[1], 0, 1)
    # Initial theta
    initial_theta <- rep(0, ncol(X))
    # Derive theta with the optim function
    theta_optim <- optim(par = initial_theta, fn = cost)
    predictions <- ifelse(sigmoid(X_test %*% theta_optim$par) >= 0.5, 1, 0)
    # Generalization error
    error_rate <- sum(predictions != Y_test)/length(Y_test)
    return(error_rate)
}

### Adult data
data <- read.table('https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data',
                   sep = ',', fill = FALSE, strip.white = TRUE)
colnames(data) <- c('age', 'workclass', 'fnlwgt', 'education',
                    'education_num', 'marital_status', 'occupation', 'relationship',
                    'race', 'sex', 'capital_gain', 'capital_loss', 'hours_per_week',
                    'native_country', 'income')
# Feature selection
datafr <- data[, c("age", "education_num", "hours_per_week", "income")]
log_reg(datafr = datafr, m = 20)
You are calling cost(), which refers to X, but X is neither defined inside cost() nor passed to it as an argument. Either define cost() within log_reg() after you have defined X, or, better, make X and Y parameters of cost().
cost <- function(theta, X, Y) {
    n <- nrow(X)
    g <- sigmoid(X %*% theta)
    J <- (1/n) * sum((-Y * log(g)) - ((1 - Y) * log(1 - g)))
    return(J)
}
And later
theta_optim <- optim(par=initial_theta, fn=cost, X=X, Y=Y)
In general, avoid using variables in a function that are not explicitly defined as arguments to that function; otherwise you will keep running into problems like this one. A closure is one clean alternative, as sketched below.
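A sketch of the closure approach (make_cost is a hypothetical helper, not part of the original code): it builds a cost function in an environment where X and Y are local, so optim() can call it with theta alone.
# make_cost() returns a cost function that closes over X and Y
make_cost <- function(X, Y) {
    function(theta) {
        n <- nrow(X)
        g <- sigmoid(X %*% theta)
        (1/n) * sum((-Y * log(g)) - ((1 - Y) * log(1 - g)))
    }
}
# inside log_reg(), after X and Y have been built:
# theta_optim <- optim(par = initial_theta, fn = make_cost(X, Y))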
Also, how did I find this out? I used traceback():
> traceback()
5: nrow(X) at #2
4: fn(par, ...)
3: (function (par)
fn(par, ...))(c(0, 0, 0, 0))
2: optim(par = initial_theta, fn = cost) at #33
1: log_reg(datafr = datafr, m = 20)

Why do I get the error: argument is of length 0 for dffits?

I have a problem when I try to run the dffits() function on an object from my own logistic regression.
When I run dffits(log) I get the error message:
Error in if (model$rank == 0) { : argument is of length 0
However, when I use the built-in glm function (family = binomial), dffits(glm) works just fine.
Here is my function for the logistic regression and a short example of my problem:
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank)
mydata$admit <- factor(mydata$admit)
logRegEst <- function(x, y, threshold = 1e-10, maxIter = 100)
{
    calcPi <- function(x, beta)
    {
        beta <- as.vector(beta)
        return(exp(x %*% beta) / (1 + exp(x %*% beta)))
    }
    beta <- rep(0, ncol(x)) # initial guess for beta
    diff <- 1000
    # initial value bigger than threshold so that we can enter our while loop
    iterCount <- 0
    # counter to ensure we're not stuck in an infinite loop
    while (diff > threshold) # tests for convergence
    {
        pi <- as.vector(calcPi(x, beta))
        # calculate pi using the current estimate of beta
        W <- diag(pi * (1 - pi)) # calculate the matrix of weights W
        beta_change <- solve(t(x) %*% W %*% x) %*% t(x) %*% (y - pi)
        # calculate the change in beta
        beta <- beta + beta_change # new beta
        diff <- sum(beta_change^2)
        # calculate how much beta changed in this iteration;
        # if this is less than threshold, the while loop ends
        iterCount <- iterCount + 1
        # see if we've hit the maximum number of iterations
        if (iterCount > maxIter) {
            stop("This isn't converging.")
        }
    }
    df <- length(y) - ncol(x)
    # degrees of freedom: the length of y minus the number of x columns
    vcov <- solve(t(x) %*% W %*% x)
    list(coefficients = beta, vcov = vcov, df = df)
    # returning results
}
logReg <- function(formula, data)
{
    mf <- model.frame(formula = formula, data = data)
    # model.frame() returns a data.frame with the variables needed to use the
    # formula.
    x <- model.matrix(attr(mf, "terms"), data = mf)
    # model.matrix() creates a design matrix; a factor such as "Sex"
    # is expanded into dummy variables of ones and zeros.
    y <- as.numeric(model.response(mf)) - 1
    # model.response() gives us the response variable.
    est <- logRegEst(x, y)
    # Now we have the starting position to apply our function from above.
    est$formula <- formula
    est$call <- match.call()
    est$data <- data
    # We add the formula and the call to the list.
    est$x <- x
    est$y <- y
    # We add x and y to the list.
    class(est) <- "logReg"
    # defining the class
    est
}
log <- logReg(admit ~ gre + gpa, data= mydata)
glm <- glm(admit ~ gre + gpa, data= mydata, family = binomial)
dffits(glm)
dffits(log)
log$data
glm$data
I don't understand why mydata$rank should be 0, because when I look at log$data I see that rank is defined just as it is in glm$data.
I really appreciate your help!

How to reliably get dependent variable name from formula object?

Let's say I have the following formula:
myformula<-formula("depVar ~ Var1 + Var2")
How to reliably get dependent variable name from formula object?
I failed to find any built-in function that serves this purpose.
I know that as.character(myformula)[[2]] works, as does
sub("^(\\w*)\\s~\\s.*$", "\\1", deparse(myformula))
Does anyone know, perchance, exactly what method e.g. lm uses? I've seen its code, but it is a little too cryptic for me... here is a quote for your convenience:
> lm
function (formula, data, subset, weights, na.action, method = "qr",
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
    contrasts = NULL, offset, ...)
{
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset", "weights", "na.action",
        "offset"), names(mf), 0L)
    mf <- mf[c(1L, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1L]] <- as.name("model.frame")
    mf <- eval(mf, parent.frame())
    if (method == "model.frame")
        return(mf)
    else if (method != "qr")
        warning(gettextf("method = '%s' is not supported. Using 'qr'",
            method), domain = NA)
    mt <- attr(mf, "terms")
    y <- model.response(mf, "numeric")
    w <- as.vector(model.weights(mf))
    if (!is.null(w) && !is.numeric(w))
        stop("'weights' must be a numeric vector")
    offset <- as.vector(model.offset(mf))
    if (!is.null(offset)) {
        if (length(offset) != NROW(y))
            stop(gettextf("number of offsets is %d, should equal %d (number of observations)",
                length(offset), NROW(y)), domain = NA)
    }
    if (is.empty.model(mt)) {
        x <- NULL
        z <- list(coefficients = if (is.matrix(y)) matrix(, 0, 3) else numeric(),
            residuals = y, fitted.values = 0 * y, weights = w, rank = 0L,
            df.residual = if (!is.null(w)) sum(w != 0) else if (is.matrix(y)) nrow(y) else length(y))
        if (!is.null(offset)) {
            z$fitted.values <- offset
            z$residuals <- y - offset
        }
    }
    else {
        x <- model.matrix(mt, mf, contrasts)
        z <- if (is.null(w))
            lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...)
        else lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok, ...)
    }
    class(z) <- c(if (is.matrix(y)) "mlm", "lm")
    z$na.action <- attr(mf, "na.action")
    z$offset <- offset
    z$contrasts <- attr(x, "contrasts")
    z$xlevels <- .getXlevels(mt, mf)
    z$call <- cl
    z$terms <- mt
    if (model)
        z$model <- mf
    if (ret.x)
        z$x <- x
    if (ret.y)
        z$y <- y
    if (!qr)
        z$qr <- NULL
    z
}
Try using all.vars:
all.vars(myformula)[1]
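For the example formula this returns the response name:
myformula <- formula("depVar ~ Var1 + Var2")
all.vars(myformula)[1]
[1] "depVar"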
I suppose you could also cook your own function to work with terms():
getResponse <- function(formula) {
    tt <- terms(formula)
    vars <- as.character(attr(tt, "variables"))[-1] ## [1] is the list call
    response <- attr(tt, "response") # index of response var
    vars[response]
}
R> myformula <- formula("depVar ~ Var1 + Var2")
R> getResponse(myformula)
[1] "depVar"
It is just as hacky as as.character(myformula)[[2]], but you have the assurance that you get the correct variable, as the ordering of the call parse tree isn't going to change any time soon.
This isn't so good with multiple dependent variables:
R> myformula <- formula("depVar1 + depVar2 ~ Var1 + Var2")
R> getResponse(myformula)
[1] "depVar1 + depVar2"
as they'll need further processing.
I found a useful package, 'formula.tools', which is suitable for your task.
Code example:
library(formula.tools)
f <- as.formula(a1 + a2 ~ a3 + a4)
lhs.vars(f) # get dependent variables
[1] "a1" "a2"
rhs.vars(f) # get independent variables
[1] "a3" "a4"
Based on your edit, to get the actual response (not just its name) we can use the nonstandard evaluation idiom employed by lm() and most other modelling functions with a formula interface in base R:
form <- formula("depVar ~ Var1 + Var2")
dat <- data.frame(depVar = rnorm(10), Var1 = rnorm(10), Var2 = rnorm(10))
getResponse <- function(formula, data) {
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data"), names(mf), 0L)
    mf <- mf[c(1L, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1L]] <- as.name("model.frame")
    mf <- eval(mf, parent.frame())
    y <- model.response(mf, "numeric")
    y
}
> getResponse(form, dat)
1 2 3 4 5
-0.02828573 -0.41157817 2.45489291 1.39035938 -0.31267835
6 7 8 9 10
-0.39945771 -0.09141438 0.81826105 0.37448482 -0.55732976
As you see, this gets the actual response variable data from the supplied data frame.
How this works: the function first captures the call to itself without expanding the ... argument, as that contains things not needed for evaluating the data for the formula.
Next, the "formula" and "data" arguments are matched against the captured call. The line mf[c(1L, m)] selects the function name from the call (1L) plus the locations of the two matched arguments. The drop.unused.levels argument of model.frame() is set to TRUE in the next line, and then the call is updated to switch the function being called from getResponse to model.frame. All the above code does is take the captured call and rework it into a call to model.frame().
This modified call is then evaluated in the parent environment of the function - which in this case is the global environment.
The last line uses the model.response() extractor function to take the response variable from the model frame.
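To make the mechanics concrete, here is a stripped-down sketch (showRewrite is a hypothetical helper) that returns the rewritten call instead of evaluating it:
showRewrite <- function(formula, data, ...) {
    mf <- match.call(expand.dots = FALSE)  # capture the call, ... unexpanded
    m <- match(c("formula", "data"), names(mf), 0L)
    mf <- mf[c(1L, m)]                     # keep the callee plus the two matches
    mf[[1L]] <- as.name("model.frame")     # swap the function being called
    mf                                     # the rewritten, unevaluated call
}
showRewrite(depVar ~ Var1 + Var2, dat, junk = 1)
model.frame(formula = depVar ~ Var1 + Var2, data = dat)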
This should always give you all dependent vars:
myformula<-formula("depVar1 + depVar2 ~ Var1 + Var2")
as.character(myformula[[2]])[-1]
#[1] "depVar1" "depVar2"
And I wouldn't consider this particularly "hacky".
Edit:
Something strange happens with three dependent variables:
myformula<-formula("depVar1 + depVar2 + depVar3 ~ Var1 + Var2")
as.character(myformula[[2]])
#[1] "+" "depVar1 + depVar2" "depVar3"
So this might not be as reliable as I thought.
Edit2:
Okay, myformula[[2]] is a language object, and as.character seems to do something similar to languageEl.
length(myformula[[2]])
#[1] 3
languageEl(myformula[[2]],which=1)
#`+`
languageEl(myformula[[2]],which=2)
#depVar1 + depVar2
languageEl(myformula[[2]],which=3)
#depVar3
languageEl(languageEl(myformula[[2]],which=2),which=2)
#depVar1
If you check the length of each element, you could create your own extraction function. But this is probably too much of a hack.
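For completeness, such an extraction function is only a few lines: a recursive walk over the + calls on the left-hand side (my sketch of the hack described above):
# recursively collect names from a sum of response terms
lhsVars <- function(e) {
    if (is.call(e) && identical(e[[1]], as.name("+")))
        c(lhsVars(e[[2]]), lhsVars(e[[3]]))
    else
        deparse(e)
}
lhsVars(myformula[[2]])
#[1] "depVar1" "depVar2" "depVar3"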
Edit3:
Based on the answer by @seancarmody, all.vars(myformula[[2]]) is the way to go.
Using all.vars alone is tricky, as it won't detect whether the formula actually has a response. For a one-sided formula:
all.vars(~ x + 1)
[1] "x"
which would wrongly be taken as the response.
Here is the most reliable way of getting the response:
getResponseFromFormula <- function(formula) {
    if (attr(terms(as.formula(formula)), which = 'response'))
        all.vars(formula)[1]
    else
        NULL
}
getResponseFromFormula(~x+1)
NULL
getResponseFromFormula(y~x+1)
[1] "y"
Note that you can replace all.vars(formula)[1] in the function with formula[2] if the formula contains more than one variable for the response.
I know this question is quite old, but I thought I'd add a base R answer which doesn't require indexing, doesn't depend on the order of the variables listed in a call to all.vars, and which gives the response variables as separate elements when there is more than one:
myformula <- formula("depVar1 + depVar2 ~ Var1 + Var2")
all_vars <- all.vars(myformula)
response <- all_vars[!(all_vars %in% labels(terms(myformula)))]
> response
[1] "depVar1" "depVar2"
