Why would loading a package change the resid function being used?

I understand that resid() is a generic function in R: which specific residual function gets called depends on the class of the object that resid() is applied to, just like print().
However, I noticed that loading a package can change which specific residual function is used, yielding drastically different residual plots. Could anyone help me understand why that happens?
This is an example from my data:
> #### Showing packages loaded after starting up R ####
> search()
[1] ".GlobalEnv" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads" "package:base"
>
> #### Before loading nlme ####
>
> ## s1 is a gls object, calculated using the nlme package
> s1 <- readRDS("../Data/my_gls.RDS")
> qqnorm(resid(s1, type = "pearson"), main = "before loading nlme")
> qqline(resid(s1, type = "pearson"))
>
> methods(resid)
[1] residuals.default* residuals.glm residuals.HoltWinters* residuals.isoreg* residuals.lm
[6] residuals.nls* residuals.smooth.spline* residuals.tukeyline*
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, envir) :
generic function 'resid' dispatches methods for generic 'residuals'
> sloop::s3_dispatch(resid(s1, type = "pearson"))
resid.gls
=> resid.default
> ## the resid.default is used
The resulting Q-Q plot is shown below. [image omitted]
Then, after loading the nlme package,
> #### After loading nlme ####
>
> library(nlme)
Warning message:
package ‘nlme’ was built under R version 4.1.2
> search()
[1] ".GlobalEnv" "package:nlme" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods" "Autoloads" "package:base"
>
> # s2 is the same as s1
> s2 <- readRDS("../Data/my_gls.RDS")
> qqnorm(resid(s2, type = "pearson"), main = "after loading nlme")
> qqline(resid(s2, type = "pearson"))
>
> methods(resid)
[1] residuals.default* residuals.glm residuals.gls* residuals.glsStruct* residuals.gnls*
[6] residuals.gnlsStruct* residuals.HoltWinters* residuals.isoreg* residuals.lm residuals.lme*
[11] residuals.lmeStruct* residuals.lmList* residuals.nlmeStruct* residuals.nls* residuals.smooth.spline*
[16] residuals.tukeyline*
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, envir) :
generic function 'resid' dispatches methods for generic 'residuals'
> sloop::s3_dispatch(resid(s2, type = "pearson"))
=> resid.gls
* resid.default
> # resid.gls is used
the Q-Q plot looks like this: [image omitted]
As sloop::s3_dispatch(resid(s1, type = "pearson")) indicated, resid.default is the function used before the nlme package is loaded, but resid.gls is the one used after nlme is loaded. Why such a change? Is it because resid.gls is not included among the methods that resid() can see by default, as the first methods(resid) output suggested?
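My working guess, as a minimal sketch (hypothetical; I have not verified this): S3 methods such as residuals.gls are registered when a package's namespace is loaded, while readRDS() restores only the object's class attribute, so until the nlme namespace is loaded, dispatch can only fall back to resid.default:
## Assumption: nlme is installed but neither loaded nor attached yet
s3 <- readRDS("../Data/my_gls.RDS")   # class "gls", but no nlme namespace
sloop::s3_dispatch(resid(s3))         # => resid.default
loadNamespace("nlme")                 # load the namespace without attaching
sloop::s3_dispatch(resid(s3))         # => resid.gls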
I am using R 4.1.0, and I would very much appreciate any feedback. Thank you.
> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 1.0
year 2021
month 05
day 18
svn rev 80317
language R
version.string R version 4.1.0 (2021-05-18)
nickname Camp Pontanezen

Related

attaching packages to a "temporary" search path in R

Inside a function, I am sourcing a script:
f <- function(){
  source("~/Desktop/sourceme.R") # source someone else's script
  # do some stuff to the variables read in
}
f()
search() # the library sourceme.R attaches ends up all the way at the back!
and unfortunately, the scripts that I am sourcing are not fully under my control. They call library(somePackage), which pollutes the search path.
This is mostly a problem if the author of sourceme.R expects the package he or she is attaching to be at the top level, close to the global environment. If I have myself attached a package that masks some of the function names the script expects to be available, that's no good.
Is there a way I can source scripts but somehow make my own temporary search path that "resets" after the function finishes running?
I would consider sourcing the script in a separate R process using the callr package and then returning the environment created by the sourced file.
Using a separate R process prevents your search path from being polluted. I'm guessing there may be some side effects (such as defining new functions or variables) in your global environment that you do want. The local argument of source() allows you to specify the environment in which the parsed script is executed. If you return this environment from the other R process, you can access any result you need.
I'm not sure what yours looks like, but say I have this file that would modify the search path:
# messWithSearchPath.R
library(dplyr)
a <- data.frame(groupID = rep(1:3, 10), value = rnorm(30))
b <- a %>%
  group_by(groupID) %>%
  summarize(agg = sum(value))
From my top level script, I would write a wrapper function to source it in a new environment and have callr execute this function:
RogueScript <- function(){
  rogueEnv <- new.env()
  source("messWithSearchPath.R", local = rogueEnv)
  rogueEnv
}
before <- search()
scriptResults <- callr::r(RogueScript)
scriptResults$b
#> groupID agg
#> 1 1 -2.871642
#> 2 2 3.368499
#> 3 3 1.159509
identical(before, search())
#> [1] TRUE
If the scripts have other side effects (such as setting options or establishing external connections), this method probably won't work. There may be workarounds depending on what the scripts are intended to do, but this approach should work if you just want the variables and functions they create. It also prevents the scripts from conflicting with each other, not just with your top-level script.
One way would be to "snapshot" your current search path and try to return to it later:
search.snapshot <- local({
  .snap <- character(0)
  function(restore = FALSE) {
    if (restore) {
      if (!length(.snap)) {  # no snapshot stored yet, so nothing to restore
        return(character(0))
      } else {
        extras <- setdiff(search(), .snap)
        # may not work if DLLs are loaded
        for (pkg in extras) {
          suppressWarnings(detach(pkg, character.only = TRUE, unload = TRUE))
        }
        return(extras)
      }
    } else .snap <<- search()
  }
})
In action:
search.snapshot() # store current state
get(".snap", envir = environment(search.snapshot)) # view snapshot
# [1] ".GlobalEnv" "ESSR" "package:stats"
# [4] "package:graphics" "package:grDevices" "package:utils"
# [7] "package:datasets" "package:r2" "package:methods"
# [10] "Autoloads" "package:base"
library(ggplot2)
library(zoo)
# Attaching package: 'zoo'
# The following objects are masked from 'package:base':
# as.Date, as.Date.numeric
library(dplyr)
# Attaching package: 'dplyr'
# The following objects are masked from 'package:stats':
# filter, lag
# The following objects are masked from 'package:base':
# intersect, setdiff, setequal, union
search()
# [1] ".GlobalEnv" "package:dplyr" "package:zoo"
# [4] "package:ggplot2" "ESSR" "package:stats"
# [7] "package:graphics" "package:grDevices" "package:utils"
# [10] "package:datasets" "package:r2" "package:methods"
# [13] "Autoloads" "package:base"
search.snapshot(TRUE) # returns detached packages
# [1] "package:dplyr" "package:zoo" "package:ggplot2"
search()
# [1] ".GlobalEnv" "ESSR" "package:stats"
# [4] "package:graphics" "package:grDevices" "package:utils"
# [7] "package:datasets" "package:r2" "package:methods"
# [10] "Autoloads" "package:base"
I am somewhat confident (though I have not verified it) that this will not always work with all packages, perhaps due to dependencies and/or loaded DLLs. You can try adding force = TRUE to the detach() call; I am not sure whether that will work better or instead have other undesirable side effects.
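Building on this, a hypothetical wrapper (a sketch in the same spirit, untested against every package): take the snapshot inside a function and restore it with on.exit(), so the cleanup runs even if the sourced script throws an error. The same caveat about DLLs and dependencies applies.
with_clean_search <- function(file) {
  before <- search()
  on.exit({
    extras <- setdiff(search(), before)
    for (pkg in extras)
      suppressWarnings(detach(pkg, character.only = TRUE, unload = TRUE))
  }, add = TRUE)
  env <- new.env()
  source(file, local = env)  # run the script in its own environment
  env                        # return its variables to the caller
}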

Character Vector of loaded packages

I am currently trying to translate my loaded packages into a character vector to use in the pkgDep() function. Does anyone have any idea how to do this? Currently my result is formatted as a list, and using the unlist() function has not worked for me. I think rapply() would do the trick, but I am running into issues with how to set up the function. I have pasted my code below. Thanks!
x <- loaded_packages()
typeof(x)
#need a character vector with package names to pass into function
pkgList <- pkgDep(x, availPkgs = pkgdata, suggests = TRUE)
Use the search() function to see the packages currently loaded.
x <- search()
x
# [1] ".GlobalEnv" "package:dplyr" "package:stats"
# [4] "package:graphics" "package:grDevices" "package:utils"
# [7] "package:datasets" "package:methods" "Autoloads"
# [10] "package:base"
pkgList <- pkgDep(x, availPkgs = pkgdata, suggests = TRUE)
If you can tell us what the pkgDep() function does, we can get the loaded-packages list in the specific format it needs.
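Note that search() also returns entries such as ".GlobalEnv" and "Autoloads", which are not package names. A minimal sketch (assuming pkgDep() expects bare package names) that keeps only the "package:" entries and strips the prefix:
x <- grep("^package:", search(), value = TRUE)  # "package:dplyr", "package:stats", ...
pkgs <- sub("^package:", "", x)                 # "dplyr", "stats", ...
pkgs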
Try this function:
x <- search()
As per this link.

Error from CAPM.alpha in PerformanceAnalytics

When I try to run the example from page 22 of the PerformanceAnalytics reference manual, I get an error message. See below.
P.S. I am a beginner and this has never worked for me. Also, my underlying issue is that I get exactly the same error when trying to use table.CAPM with my own data.
Thanks for any assistance.
> search()
[1] ".GlobalEnv" "package:PerformanceAnalytics"
[3] "package:xts" "package:zoo"
[5] "package:stats" "package:graphics"
[7] "package:grDevices" "package:utils"
[9] "package:datasets" "package:methods"
[11] "Autoloads" "package:base"
> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 2
minor 15.2
year 2012
month 10
day 26
svn rev 61015
language R
version.string R version 2.15.2 (2012-10-26)
nickname Trick or Treat
> data(managers)
> CAPM.alpha(managers[,1,drop=FALSE], managers[,8,drop=FALSE], Rf=.035/12)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
>
The bug is not in your code; it is in the R package itself. It is shown in the package validation check here, and it can be reproduced with:
library(PerformanceAnalytics)
example(CAPM.alpha)
The error seems to be on line 40 of Return.excess.R; that line should be replaced with:
xR = coredata(as.xts(R))-coredata(as.xts(Rf))
The easiest way of fixing this in practice is to run:
require(utils)
assignInNamespace(
  "Return.excess",
  function (R, Rf = 0)
  { # @author Peter Carl
    # edited by orizon
    # .. additional comments removed
    R = checkData(R)
    if (!is.null(dim(Rf))) {
      Rf = checkData(Rf)
      indexseries = index(cbind(R, Rf))
      columnname.Rf = colnames(Rf)
    }
    else {
      indexseries = index(R)
      columnname.Rf = Rf
      Rf = xts(rep(Rf, length(indexseries)), order.by = indexseries)
    }
    return.excess <- function (R, Rf)
    {
      xR = coredata(as.xts(R)) - coredata(as.xts(Rf)) # fixed
    }
    result = apply(R, MARGIN = 2, FUN = return.excess, Rf = Rf)
    colnames(result) = paste(colnames(R), ">", columnname.Rf)
    result = reclass(result, R)
    return(result)
  },
  "PerformanceAnalytics"
)
Then your original command works:
> data(managers)
> CAPM.alpha(managers[,1,drop=FALSE], managers[,8,drop=FALSE], Rf=.035/12)
[1] 0.005960609
Be aware that I have not verified that the function does what it purports to do.

XCMS Package - Retention time

Is there a simple way to get the list/array of retention times from an xcmsRaw object?
Example Code:
xraw <- xcmsRaw(cdfFile)
So, for example, getting information from it:
xraw@env$intensity
or
xraw@env$mz
You can see what slots are available in your xcmsRaw instance with
> slotNames(xraw)
[1] "env" "tic" "scantime"
[4] "scanindex" "polarity" "acquisitionNum"
[7] "profmethod" "profparam" "mzrange"
[10] "gradient" "msnScanindex" "msnAcquisitionNum"
[13] "msnPrecursorScan" "msnLevel" "msnRt"
[16] "msnPrecursorMz" "msnPrecursorIntensity" "msnPrecursorCharge"
[19] "msnCollisionEnergy" "filepath"
What you want is xraw@msnRt; it is a numeric vector.
The env slot is an environment that stores 3 variables:
> ls(xraw@env)
[1] "intensity" "mz" "profile"
More details on the class itself at class?xcmsRaw.
EDIT: The msnRt slot is populated only if you specify includeMSn = TRUE, and your input file must be in mzXML or mzML format, not in CDF; if you use the faahKO example from ?xcmsRaw, you will see that
xr <- xcmsRaw(cdffiles[1], includeMSn = TRUE)
Warning message:
In xcmsRaw(cdffiles[1], includeMSn = TRUE) :
Reading of MSn spectra for NetCDF not supported
Also, xr@msnRt will only store the retention times for MSn scans, with n > 1. See xset@rt, where xset is an xcmsSet instance, for the raw/corrected MS1 retention times as provided by xcms.
EDIT2: Alternatively, have a go with the mzR package
> library(mzR)
> cdffiles[1]
[2] "/home/lgatto/R/x86_64-unknown-linux-gnu-library/2.16/faahKO/cdf/KO/ko15.CDF"
> xx <- openMSfile(cdffiles[1])
> xx
Mass Spectrometry file handle.
Filename: /home/lgatto/R/x86_64-unknown-linux-gnu-library/2.16/faahKO/cdf/KO/ko15.CDF
Number of scans: 1278
> hd <- header(xx)
> names(hd)
[1] "seqNum" "acquisitionNum"
[3] "msLevel" "peaksCount"
[5] "totIonCurrent" "retentionTime"
[7] "basePeakMZ" "basePeakIntensity"
[9] "collisionEnergy" "ionisationEnergy"
[11] "highMZ" "precursorScanNum"
[13] "precursorMZ" "precursorCharge"
[15] "precursorIntensity" "mergedScan"
[17] "mergedResultScanNum" "mergedResultStartScanNum"
[19] "mergedResultEndScanNum"
> class(hd)
[1] "data.frame"
> dim(hd)
[1] 1278 19
but you will be outside of the default xcms pipeline if you take this route (although Steffen Neumann, from xcms, knows mzR very well, obviously).
Finally, you are better off using the Bioconductor mailing list or the xcms online forum if you want to maximise your chances of getting feedback from the xcms developers.
Hope this helps.
Good answer, but I was looking for this:
xraw <- xcmsRaw(cdfFile)
dp_index <- which(duplicated(rawMat(xraw)[, 1]))  # rows repeating a scan time
xraw_rt_dp <- rawMat(xraw)[, 1]                   # retention-time column
xrawData.rt <- xraw_rt_dp[-dp_index]              # one entry per scan
Now:
xrawData.rt # contains the retention times
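As a cross-check (a hypothetical sketch, based on the scantime slot listed in the slotNames() output above, which should hold one retention time per scan):
all.equal(as.numeric(xraw@scantime), as.numeric(xrawData.rt))  # TRUE if both hold per-scan retention times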
Observation: using the mzR package,
nc <- mzR:::netCDFOpen(cdfFile)
ncData <- mzR:::netCDFRawData(nc)
mzR:::netCDFClose(nc)
ncData$rt # contains the same retention times for the same input!

F-test with plm-model

I want to run an F-test on a plm model,
model <- plm(y ~ a + b)
to test whether
# a = b
and whether
# a = 0 and b = 0
I tried linearHypothesis like this,
linearHypothesis(ur.model, c("a", "b")) # to test a = 0 and b = 0
but got the error
but got the error
Error in constants(lhs, cnames_symb) :
The hypothesis "sgp1" is not well formed: contains bad coefficient/variable names.
Calls: linearHypothesis ... makeHypothesis -> rbind -> Recall -> makeHypothesis -> constants
In addition: Warning message:
In constants(lhs, cnames_symb) : NAs introduced by coercion
Execution halted
My example above uses code that is a little simplified, in case the problem is easy. In case the problem is in the details, here is the actual code:
model3 <- formula(balance.agr ~ sgp1 + sgp2 + cp + eu + election + gdpchange.imf + ue.ameco)
ur.model<-plm(model3, data=panel.l.fullsample, index=c("country","year"), model="within", effect="twoways")
linearHypothesis(ur.model, c("sgp1", "sgp2"), vcov.=vcovHC(plmmodel1, method="arellano", type = "HC1", clustering="group"))
I can't reproduce your error with one of the inbuilt data sets, even after quite a bit of fiddling.
Does this work for you?
require(plm)
require(car)
data(Grunfeld)
form <- formula(inv ~ value + capital)
re <- plm(form, data = Grunfeld, model = "within", effect = "twoways")
linearHypothesis(re, c("value", "capital"),
                 vcov. = vcovHC(re, method = "arellano", type = "HC1"))
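For the other half of your question, testing the equality a = b, linearHypothesis() also accepts an equation string; a minimal sketch on the same Grunfeld model:
linearHypothesis(re, "value = capital",
                 vcov. = vcovHC(re, method = "arellano", type = "HC1"))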
Note also that you seem to have an error in the more complex code you showed: you call linearHypothesis() on the object ur.model, yet call vcovHC() on the object plmmodel1. I'm not sure whether that is the problem, but check it just in case.
Is it possible to provide the data? Finally, edit your question to include the output from sessionInfo(). Mine is (from quite a busy R instance):
> sessionInfo()
R version 2.11.1 Patched (2010-08-25 r52803)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.utf8 LC_NUMERIC=C
[3] LC_TIME=en_GB.utf8 LC_COLLATE=en_GB.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8
[7] LC_PAPER=en_GB.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] splines grid stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] car_2.0-2 nnet_7.3-1 plm_1.2-6 Formula_1.0-0
[5] kinship_1.1.0-23 lattice_0.19-11 nlme_3.1-96 survival_2.35-8
[9] mgcv_1.6-2 chron_2.3-37 MASS_7.3-7 vegan_1.17-4
[13] lmtest_0.9-27 sandwich_2.2-6 zoo_1.6-4 moments_0.11
[17] ggplot2_0.8.8 proto_0.3-8 reshape_0.8.3 plyr_1.2.1
loaded via a namespace (and not attached):
[1] Matrix_0.999375-44 tools_2.11.1
Could it be because you are "mixing" models? You have a variance specification that starts out
vcov. = vcovHC(plmmodel1, ...
and yet you are working with ur.model.
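If that mismatch is indeed the issue, a hypothetical corrected call would use the same fitted object in both places:
linearHypothesis(ur.model, c("sgp1", "sgp2"),
                 vcov. = vcovHC(ur.model, method = "arellano", type = "HC1"))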

Resources