R geepack: unreasonably large estimates using GEE - r

I am using geepack for R to estimate logistic marginal model by geeglm(). But I am getting garbage estimates. They about 16 orders of magnitude too large. However the p-values seems to similar to what I expected. This means that the response essentially becomes a step function. See attached plot
Here is the code that generates the plot:
require(geepack)
data = read.csv(url("http://folk.uio.no/mariujon/data.csv"))
fit = geeglm(moden ~ 1 + power, id = defacto, data=data, corstr = "exchangeable", family=binomial)
summary(fit)
plot(moden ~ power, data=data)
x = 0:2500
y = predict(fit, newdata=data.frame(power = x), type="response" )
lines(x,y)
Here is the regression table:
Call:
geeglm(formula = moden ~ 1 + power, family = binomial, data = data,
id = defacto, corstr = "exchangeable")
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) -7.38e+15 1.47e+15 25.1 5.4e-07 ***
power 2.05e+13 1.60e+12 164.4 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Estimated Scale Parameters:
Estimate Std.err
(Intercept) 1.03e+15 1.65e+37
Correlation: Structure = exchangeable Link = identity
Estimated Correlation Parameters:
Estimate Std.err
alpha 0.196 3.15e+21
Number of clusters: 3 Maximum cluster size: 381
Hoping for some help. Thanks!
Kind regards,
Marius

I will give three procedures, each of which is a marginalized random intercept model (MRIM). These MRIMs have coefficients with marginal logistic interpretations and are of smaller magnitude than the GEE:
| Model | (Intercept) | power | LogL |
|-------|-------------|--------|--------|
| `L_N` | -1.050| 0.00267| -270.1|
| `LLB` | -0.668| 0.00343| -273.8|
| `LPN` | -1.178| 0.00569| -266.4|
compared to a glm that doesn't account for any correlation, for reference:
| Model | (Intercept) | power | LogL |
|-------|-------------|--------|--------|
| strt | -0.207| 0.00216| -317.1|
A marginalized random intercept model (MRIM) is worth exploring because you want a marginal model with exchangeable correlation structure for the clustered data, and that is the type of structure MRIMs exhibit.
The code (especially R script with comments) and PDFs for literature are in the GITHUB repo. I detail the code and literature down below.
The concept of MRIM has been around since 1999, and some background reading on this is in the GITHUB repo. I suggest reading Swihart et al 2014 first because it reviews the other papers.
In chronological order --
L_N Heagerty (1999): the approach fits a random intercept logistic model with a normally distributed random intercept. The trick is that the predictor in the random intercept model is nonlinearly parameterized with marginal coefficients so that the resulting marginal model has a marginal logistic interpretation. Its code is the lnMLE R package (not on CRAN, but on Patrick Heagerty's website here). This approach is denoted L_N in the code to indicate logit (L) on the marginal, no interepretation on conditional scale (_) and a normally (N) distributed random intercept.
LLB Wang & Louis (2003): the approach fits a random intercept logistic model with a bridge distributed random intercept. Unlike Heagerty 1999 where the trick is nonlinear-predictor for the random intercept model, the trick is a special random effects distribution (the bridge distribution) that allows both the random intercept model and the resulting marginal model to have a logistic interpretation. Its code is implemented with gnlmix4MMM.R (in the repo) which uses rmutil and repeated R packages. This approach is denoted LLB in the code to indicate logit (L) on the marginal, logit (L) on the conditional scale and a bridge (B) distributed intercept.
LPN Caffo and Griswold (2006): the approach fits a random intercept probit model with a normally distributed random intercept, whereas Heagerty 1999 used a logit random intercept model. This substitution makes computations easier and still yields a marginal logit model. Its code is implemented with gnlmix4MMM.R (in the repo) which uses rmutil and repeated R packages. This approach is denoted LPN in the code to indicate logit (L) on the marginal, probit (P) on the conditional scale and a normally (N) distributed intercept.
Griswold et al (2013): another review / practical introduction.
Swihart et al 2014: This is a review paper for Heagerty 1999 and Wang & Louis 2003 as well as others and generalizes the MRIM method. One of the most interesting generalizations is allowing the logistic CDF (equivalently, logit link) in both the marginal and conditional models to instead be a stable distribution that approximates a logistic CDF. Its code is implemented with gnlmix4MMM.R (in the repo) which uses rmutil and repeated R packages. I denote this SSS in the R script with comments to indicate stable (S) on the marginal, stable (S) on the conditional scale and a stable (S) distributed intercept. It is included in the R script but not detailed in this post on SO.
Prep
#code from OP Question: edit `data` to `d`
require(geepack)
d = read.csv(url("http://folk.uio.no/mariujon/data.csv"))
fit = geeglm(moden ~ 1 + power, id = defacto, data=d, corstr = "exchangeable", family=binomial)
summary(fit)
plot(moden ~ power, data=d)
x = 0:2500
y = predict(fit, newdata=data.frame(power = x), type="response" )
lines(x,y)
#get some starting values from glm():
strt <- coef(glm(moden ~ power, family = binomial, data=d))
strt
#I'm so sorry but these methods use attach()
attach(d)
L_N Heagerty (1999)
# marginally specifies a logit link and has a nonlinear conditional model
# the following code will not run if lnMLE is not successfully installed.
# See https://faculty.washington.edu/heagerty/Software/LDA/MLV/
library(lnMLE)
L_N <- logit.normal.mle(meanmodel = moden ~ power,
logSigma= ~1,
id=defacto,
model="marginal",
data=d,
beta=strt,
r=10)
print.logit.normal.mle(L_N)
Prep for LLB and LPN
library("gnlm")
library("repeated")
source("gnlmix4MMM.R") ## see ?gnlmix; in GITHUB repo
y <- cbind(d$moden,(1-d$moden))
LLB Wang and Louis (2003)
LLB <- gnlmix4MMM(y = y,
distribution = "binomial",
mixture = "normal",
random = "rand",
nest = defacto,
mu = ~ 1/(1+exp(-(a0 + a1*power)*sqrt(1+3/pi/pi*exp(pmix)) - sqrt(1+3/pi/pi*exp(pmix))*log(sin(pi*pnorm(rand/sqrt(exp(pmix)))/sqrt(1+3/pi/pi*exp(pmix)))/sin(pi*(1-pnorm(rand/sqrt(exp(pmix))))/sqrt(1+3/pi/pi*exp(pmix)))))),
pmu = c(strt, log(1)),
pmix = log(1))
print("code: 1 -best 2-ok 3,4,5 - problem")
LLB$code
print("coefficients")
LLB$coeff
print("se")
LLB$se
LPN Caffo and Griswold (2006)
LPN <- gnlmix4MMM(y = y,
distribution = "binomial",
mixture = "normal",
random = "rand",
nest = defacto,
mu = ~pnorm(qnorm(1/(1+exp(-a0 - a1*power)))*sqrt(1+exp(pmix)) + rand),
pmu = c(strt, log(1)),
pmix = log(1))
print("code: 1 -best 2-ok 3,4,5 - problem")
LPN$code
print("coefficients")
LPN$coeff
print("se")
LPN$se
coefficients from 3 approaches:
rbind("L_N"=L_N$beta, "LLB" = LLB$coefficients[1:2], "LPN"=LPN$coefficients[1:2])
max log likelihood for 3 models:
rbind("L_N"=L_N$logL, "LLB" = -LLB$maxlike, "LPN"=-LPN$maxlike)

Related

How to do negative binomial regression with the rms package in R?

How can I use the rms package in R to execute a negative binomial regression? (I originally posted this question on Statistics SE, but it was closed apparently because it is a better fit here.)
With the MASS package, I use the glm.nb function, but I am trying to switch to the rms package because I sometimes get weird errors when bootstrapping with glm.nb and some other functions. But I cannot figure out how to do a negative binomial regression with the rms package.
Here is sample code of what I would like to do (copied from the rms::Glm function documentation):
library(rms)
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
f <- Glm(counts ~ outcome + treatment, family=poisson())
f
anova(f)
summary(f, outcome=c('1','2','3'), treatment=c('1','2','3'))
So, instead of using family=poisson(), I would like to use something like family=negative.binomial(), but I cannot figure out how to do this.
In the documentation for family {stats}, I found this note in the "See also" section:
For binomial coefficients, choose; the binomial and negative binomial distributions, Binomial, and NegBinomial.
But even after clicking the link for ?NegBinomial, I cannot make any sense of this.
I would appreciate any help on how to use the rms package in R to execute a negative binomial regression.
opinion up front You might be better off posting (as a separate question) a reproducible example of the "weird errors" from your bootstrap attempts and seeing whether people have ideas for resolving them. It's fairly common for NB fitting procedures to throw warnings or errors when data are equi- or underdispersed, as the estimates of the dispersion parameter become infinite in this case ...
#coffeinjunky is correct that using family = negative.binomial(theta=VALUE) will work (where VALUE is a numeric constant, e.g. theta=1 for the geometric distribution [a special case of the NB]). However: you won't be able (without significantly more work) be able to fit the general NB model, i.e. the model where the dispersion parameter (theta) is estimated as part of the fitting procedure. That's what MASS::glm.nb does, and AFAICS there is no analogue in the rms package.
There are a few other packages/functions in addition to MASS::glm.nb that fit the negative binomial model, including (at least) bbmle and glmmTMB — there may be others such as gamlss.
## Dobson (1990) Page 93: Randomized Controlled Trial :
dd < data.frame(
counts = c(18,17,15,20,10,20,25,13,12)
outcome = gl(3,1,9),
treatment = gl(3,3))
MASS::glm.nb
library(MASS)
m1 <- glm.nb(counts ~ outcome + treatment, data = dd)
## "iteration limit reached" warning
glmmTMB
library(glmmTMB)
m2 <- glmmTMB(counts ~ outcome + treatment, family = nbinom2, data = dd)
## "false convergence" warning
bbmle
library(bbmle)
m3 <- mle2(counts ~ dnbinom(mu = exp(logmu), size = exp(logtheta)),
parameters = list(logmu ~outcome + treatment),
data = dd,
start = list(logmu = 0, logtheta = 0)
)
signif(cbind(MASS=coef(m1), glmmTMB=fixef(m2)$cond, bbmle=coef(m3)[1:5]), 5)
MASS glmmTMB bbmle
(Intercept) 3.0445e+00 3.04540000 3.0445e+00
outcome2 -4.5426e-01 -0.45397000 -4.5417e-01
outcome3 -2.9299e-01 -0.29253000 -2.9293e-01
treatment2 -1.1114e-06 0.00032174 8.1631e-06
treatment3 -1.9209e-06 0.00032823 6.5817e-06
These all agree fairly well (at least for the intercept/outcome parameters). This example is fairly difficult for a NB model (5 parameters + dispersion for 9 observations, data are Poisson rather than NB).
Based on this, the following seems to work:
library(rms)
library(MASS)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
Glm(counts ~ outcome + treatment, family = negative.binomial(theta = 1))
General Linear Model
rms::Glm(formula = counts ~ outcome + treatment, family = negative.binomial(theta = 1))
Model Likelihood
Ratio Test
Obs 9 LR chi2 0.31
Residual d.f.4 d.f. 4
g 0.2383063 Pr(> chi2) 0.9892
Coef S.E. Wald Z Pr(>|Z|)
Intercept 3.0756 0.2121 14.50 <0.0001
outcome=2 -0.4598 0.2333 -1.97 0.0487
outcome=3 -0.2962 0.2327 -1.27 0.2030
treatment=2 -0.0347 0.2333 -0.15 0.8819
treatment=3 -0.0503 0.2333 -0.22 0.8293

How to interpret significant test of moderators and null percentage of explained variance in meta-analysis using "metafor" package?

I'm currently working on a meta-analysis of proportions (number of mosquitoes transmitting a disease/number of mosquitoes tested), using metafor package (Viechtbauer, 2010). My aim is to compute a summarized effect size for each of 5 mosquitoes species. As far, my analysis strategy is:
Using PFT (double arcsine) transformation in order to normalize data (I have lots of 0 and 1 as values)
Running overall model to assess heterogeneity (test for residual heterogeneity significative)
As I have several measures coming from each of the articles included in the meta-analysis, I assessed the necessity of using a three-level model (LRT significative, for "measure" and "article" (measure nested in article)
Using subgroup analysis, with mosquitoes species as moderator
Assessing for residual heterogeneity
Testing some moderators to try to explain residual heterogeneity
When I performed subgroup analysis, I got a test for moderator significative for the variable "Specie". But I then wanted to know which part of variance is explained by this significative variable, and I obtained -0.9% (truncated at 0%) (I used "pseudo R-squared" method suggested by W. Viechtbauer here).
So, my question is: would it be possible/coherent to have a test of moderators significative and no variance explained by this moderator ? How to explain it ?
As I use REML estimation, I can't use LRT to test if the variable is significative (and I would rather not use ML back just to compute LRT).
Thanks in advance if someone can help me,
Best regards,
Alex
If useful, here is a abstract of the code I used:
ies.da <- escalc(xi = data_test[, "n"],
ni = data_test[, "n_tested"],
data = data,
measure = "PFT",
add = 0)
subganal.specie.mv <- rma.mv(yi, vi,
data = ies.da,
mods = ~factor(Specie),
method = "REML",
random = ~1|article/measure)
subganal.no.specie.mv <- rma.mv(yi, vi,
data = ies.da,
method = "REML",
random = ~1|article/measure)
pseudo.r.squared <- (sum(subganal.no.specie.mv$sigma2) - sum(subganal.specie.mv$sigma2)) / sum(subganal.no.specie.mv$sigma2)
As result, I get a test of moderator significative:
subganal.specie.mv
Multivariate Meta-Analysis Model (k = 165; method: REML)
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.0184 0.1358 21 no article
sigma^2.2 0.0215 0.1466 165 no article/measure
Test for Residual Heterogeneity:
QE(df = 161) = 897.9693, p-val < .0001
Test of Moderators (coefficients 2:4):
QM(df = 3) = 12.3578, p-val = 0.0063
Model Results:
estimate se zval pval ci.lb ci.ub
intrcpt 0.6172 0.1051 5.8729 <.0001 0.4112 0.8232 ***
factor(Specie)1 -0.0123 0.1240 -0.0990 0.9211 -0.2554 0.2308
factor(Specie)2 -0.2110 0.1178 -1.7913 0.0733 -0.4419 0.0199 .
factor(Specie)3 -0.2299 0.1008 -2.2813 0.0225 -0.4274 -0.0324 *
But my "pseudo R squared" is null:
pseudo.r.squared
[1] -0.009012437

Estimating risk ratio instead of odds ratio in mixed effect logistic regression in `R`

glmer is used to estimate effects on the logit scale of y when the data are clustered. In the following model
fit1 = glmer(y ~ treat + x + ( 1 | cluster), family = binomial(link = "logit"))
the exp of the coefficient of treat is the odds ratio of a binary 0-1 treatment variable, x is a covariate, and cluster is a clustering indicator across which we model a random effect (intercept). A standard approach in glm's to estimate risk ratios is to use a log link instead, i.e. family=binomial(link = "log"). However using this in glmer I get error
Error in (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, maxit = 100L, :
(maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate
after calling
fit1 = glmer(y ~ treat + x + ( 1 | cluster), family = binomial(link = "log"))
A web search revealed other people had similar issues with the Gamma family.
This seems to be a general problem as the reproducible example below demonstrates. My question thus is: how can I estimate risk ratios using a mixed effect model like glmer?
Reproducible Example
The following code simulates data that replicates the problem.
n = 1000 # sample size
m = 50 # number of clusters
J = sample(1:m, n, replace = T) # simulate cluster membership
x = rnorm(n) # simulate covariate
treat = rbinom(n, 1, 0.5) # simulate random treatment
u = rnorm(m) # simulate random intercepts
lt = x + treat + u[J] # compute linear term of logistic mixed effect model
p = 1/(1+exp(-lt)) # use logit link to transform to probabilities
y = rbinom(n,1,p) # draw binomial outcomes
d = data.frame(y, x, treat)
# First fit logistic model with glmer
fit1 = glmer( y ~ treat + x + (1 | as.factor(J)),
family = binomial(link = "logit"), data = d)
summary(fit1)
# Now try to log link
fit2 = glmer( y ~ treat + x + (1 | as.factor(J)),
family = binomial(link = "log"), data = d)
This error is returned due to your model producing values > 1:
PIRLS step-halvings failed to reduce deviance in pwrssUpdate
...
When using lme4 to fit GLMMs with link functions that do not automatically constrain the response to the allowable range of the distributional family (e.g. binomial models with a log link, where the estimated probability can be >1, or inverse-Gamma models, where the estimated mean can be negative), it is not unusual to get this error. This occurs because lme4 doesn’t do anything to constrain the predicted values, so NaN values pop up, which aren’t handled gracefully. If possible, switch to a link function to one that constrains the response (e.g. logit link for binomial or log link for Gamma).
Unfortunately, the suggested workaround is to use a different link function.
The following paper surveys a number of alternative model choices for calculation for [adjusted] relative risk:
Model choices to obtain adjusted risk difference estimates from a binomial regression model with convergence problems: An assessment of methods of adjusted risk difference estimation (2016)

Syntax for diagonal variance-covariance matrix for non-linear mixed effects model in nlme

I am analysing routinely collected substance use data during the first 12 months' of treatment in a large sample of outpatients attending drug and alcohol treatment services. I am interested in whether differing levels of methamphetamine use (no use, low use, and high use) at the outset of treatment predicts different levels after a year in treatment, but the data is very irregular, with different clients measured at different times and different numbers of times during their year of treatment.
The data for the high and low use group seem to suggest that drug use at outset reduces during the first 3 months of treatment and then asymptotes. Hence I thought I would try a non-linear exponential decay model.
I started with the following nonlinear generalised least squares model using the gnls() function in the nlme package:
fitExp <- gnls(outcome ~ C*exp(-k*yearsFromStart),
params = list(C ~ atsBase_fac, k ~ atsBase_fac),
data = dfNL,
start = list(C = c(nsC[1], lsC[1], hsC[1]),
k = c(nsC[2], lsC[2], hsC[2])),
weights = varExp(-0.8, form = ~ yearsFromStart),
control = gnlsControl(nlsTol = 0.1))
where outcome is number of days of drug use in the 28 days previous to measurement, atsBase_fac is a three-level categorical predictor indicating level of amphetamine use at baseline (noUse, lowUse, and highUse), yearsFromStart is a continuous predictor indicating time from start of treatment in years (baseline = 0, max - 1), C is a parameter indicating initial level of drug use, and k is the rate of decay in drug use. The starting values of C and k are taken from nls models estimating these parameters for each group. These are the results of that model
Generalized nonlinear least squares fit
Model: outcome ~ C * exp(-k * yearsFromStart)
Data: dfNL
AIC BIC logLik
27672.17 27725.29 -13828.08
Variance function:
Structure: Exponential of variance covariate
Formula: ~yearsFromStart
Parameter estimates:
expon
0.7927517
Coefficients:
Value Std.Error t-value p-value
C.(Intercept) 0.130410 0.0411728 3.16738 0.0015
C.atsBase_faclow 3.409828 0.1249553 27.28839 0.0000
C.atsBase_fachigh 20.574833 0.3122500 65.89218 0.0000
k.(Intercept) -1.667870 0.5841222 -2.85534 0.0043
k.atsBase_faclow 2.481850 0.6110666 4.06150 0.0000
k.atsBase_fachigh 9.485155 0.7175471 13.21886 0.0000
So it looks as if there are differences between groups in initial rate of drug use and in rate of reduction in drug use. I would like to go a step further and fit a nonlinear mixed effects model.I tried consulting Pinhiero and Bates' book accompanying the nlme package but the only models I could find that used irregular, sparse data like mine used a self-starting function, and my model does not do that.
I tried to adapt the gnls() model to nlme like so:
fitNLME <- nlme(model = outcome ~ C*exp(-k*yearsFromStart),
data = dfNL,
fixed = list(C ~ atsBase_fac, k ~ atsBase_fac),
random = pdDiag(yearsFromStart ~ id),
groups = ~ id,
start = list(fixed = c(nsC[1], lsC[1], hsC[1], nsC[2], lsC[2], hsC[2])),
weights = varExp(-0.8, form = ~ yearsFromStart),
control = nlmeControl(optim = "optimizer"))
bit I keep getting error message, I presume through errors in the syntax specifying the random effects.
Can anyone give me some tips on how the syntax for the random effects works in nlme?
The only dataset in Pinhiero and Bates that resembled mine used a diagonal variance-covariance matrix. Can anyone filled me in on the syntax of this nlme function, or suggest a better one?
p.s. I wish I could provide a reproducible example but coming up with synthetic data that re-creates the same errors is way beyond my skills.

Anova Type 2 and Contrasts

the study design of the data I have to analyse is simple. There is 1 control group (CTRL) and
2 different treatment groups (TREAT_1 and TREAT_2). The data also includes 2 covariates COV1 and COV2. I have been asked to check if there is a linear or quadratic treatment effect in the data.
I created a dummy data set to explain my situation:
df1 <- data.frame(
Observation = c(rep("CTRL",15), rep("TREAT_1",13), rep("TREAT_2", 12)),
COV1 = c(rep("A1", 30), rep("A2", 10)),
COV2 = c(rep("B1", 5), rep("B2", 5), rep("B3", 10), rep("B1", 5), rep("B2", 5), rep("B3", 10)),
Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783, 3491138, 3328894,
3654507, 3465627, 3511446, 3507249, 3373233, 3432867, 3640888,
3677593, 3585096, 3441775, 3608574, 3669114, 4000812, 3503511, 3423968,
3647391, 3584604, 3548256, 3505411, 3665138,
4049955, 3425512, 3834061, 3639699, 3522208, 3711928, 3576597, 3786781,
3591042, 3995802, 3493091, 3674475)
)
plot(Variable ~ Observation, data = df1)
As you can see from the plot there is a linear relationship between the control and the treatment groups. To check if this linear effect is statistical significant I change the contrasts using the contr.poly() function and fit a linear model like this:
contrasts(df1$Observation) <- contr.poly(levels(df1$Observation))
lm1 <- lm(log(Variable) ~ Observation, data = df1)
summary.lm(lm1)
From the summary we can see that the linear effect is statistically significant:
Observation.L 0.029141 0.012377 2.355 0.024 *
Observation.Q 0.002233 0.012482 0.179 0.859
However, this first model does not include any of the two covariates. Including them results in a non-significant p-value for the linear relationship:
lm2 <- lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
summary.lm(lm2)
Observation.L 0.04116 0.02624 1.568 0.126
Observation.Q 0.01003 0.01894 0.530 0.600
COV1A2 -0.01203 0.04202 -0.286 0.776
COV2B2 -0.02071 0.02202 -0.941 0.354
COV2B3 -0.02083 0.02066 -1.008 0.320
So far so good. However, I have been told to conduct a Type II Anova rather than Type I. To conduct a Type II Anova I used the Anova() function from the car package.
Anova(lm2, type="II")
Anova Table (Type II tests)
Response: log(Variable)
Sum Sq Df F value Pr(>F)
Observation 0.006253 2 1.4651 0.2453
COV1 0.000175 1 0.0820 0.7763
COV2 0.002768 2 0.6485 0.5292
Residuals 0.072555 34
The problem here with using Type II is that you do not get a p-value for the linear and quadratic effect. So I do not know if the effect is statistically linear and or quadratic.
I found out that the following code produces the same p-value for Observation as the Anova() function. But the result also does not include any p-values for the linear or quadratic effect:
lm2 <- lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
lm3 <- lm(log(Variable) ~ COV1 + COV2, data = df1)
anova(lm2, lm3)
Does anybody know how to conduct a Type II anova and the contrasts function to obtain the p-values for the linear and quadratic effects?
Help would be very much appreciated.
Best
Peter
I found one partial workaround for this, but it may require further correction. The documentation for the function drop1() from the stats package indicates that this function produces Type II sums of squares (although this page: http://www.statmethods.net/stats/anova.html ) declares that drop1() produces Type III sums of squares, and I didn't spend too much time poring over this (http://afni.nimh.nih.gov/sscc/gangc/SS.html) to cross-check sums of squares calculations. You could use it to calculate everything manually, but I suspect you're asking this question because it would be nice if someone had already worked through it.
Anyway, I added a second vector to the dummy data called Observation2, and set it up with just the linear contrasts (you can only specify one set of contrasts for a given vector at a given time):
df1[,"Observation2"]<-df1$Observation
contrasts(df1$Observation2, how.many=1)<-contr.poly
Then created a third linear model:
lm3<-lm(log(Variable)~Observation2+COV1+COV2, data=df1)
And conducted F tests with drop1 to compare F statistics from Type II ANOVAs between the two models:
lm2, which contains both the linear and quadratic terms:
drop1(lm2, test="F")
lm3, which contains just the linear contrasts:
drop1(lm3, test="F")
This doesn't include a direct comparison of the models against each other, although the F statistic is higher (and p value accordingly lower) for the linear model, which would lead one to rely upon it instead of the quadratic model.

Resources