Negative singular value using LAPACK svd - r

I have a 399 x 399 matrix for which the R svd() function (using LAPACK) gives me a negative singular value! This is not supposed to happen -- has anyone seen this before? It does not happen if I use the LINPACK option, so I guess this is a bug in the LAPACK svd.
ganymede: R --vanilla
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
R is free software, etc ...
> load('A.dat')
> ls()
[1] "A"
> dim(A)
[1] 399 399
>
> L1 <- svd(A)
> any( L1$d < 0 )
[1] TRUE
> L1$d[1:4]
[1] 80.18833 68.93905 61.62659 57.62883
> L1$d[396:399]
[1] 3.777844e-15 3.582460e-15 3.175665e-15 -6.512578e+00
>
> L2 <- svd(A,LINPACK=TRUE)
> any( L2$d < 0 )
[1] FALSE
> L2$d[1:4]
[1] 80.18833 68.93905 61.62659 57.62883
> L2$d[396:399]
[1] 8.565532e-32 3.254162e-32 3.484425e-47 5.411232e-48
>

Per this bug report the problem can most likely be fixed by relinking against LAPACK 3.4.1. Upgrading the generic version 3.1.1 seems to be non-trivial and the procedure depends on the OS.
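A quick way to see whether a given R/LAPACK build returns a sane decomposition is to cross-check svd() against properties that must hold for any valid SVD. This is a minimal sketch on a small random matrix (the poster's A.dat is not available, so the matrix here is an assumption). It does not diagnose the linked bug.

```r
# Sanity checks for svd() output on a small test matrix.
# The matrix is made up for illustration; substitute your own.
set.seed(42)
A <- matrix(rnorm(36), nrow = 6, ncol = 6)

s <- svd(A)

# 1. Singular values must be non-negative and sorted in decreasing order.
nonneg_sorted <- all(s$d >= 0) && !is.unsorted(rev(s$d))

# 2. The factors must reproduce A: A = U diag(d) V'.
A_hat <- s$u %*% diag(s$d) %*% t(s$v)
reconstructs <- isTRUE(all.equal(A, A_hat))

# 3. Singular values are the square roots of the eigenvalues of A'A.
ev <- eigen(crossprod(A), symmetric = TRUE, only.values = TRUE)$values
matches_eigen <- isTRUE(all.equal(s$d, sqrt(pmax(ev, 0)), tolerance = 1e-6))

print(c(nonneg_sorted = nonneg_sorted,
        reconstructs  = reconstructs,
        matches_eigen = matches_eigen))
```

If the first check fails while the other two pass up to sign, the build is probably returning a sign-flipped singular value/vector pair, as in the session above.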

Related

Why is it possible to use loaded functions along with variables of the same names if the latter do not point to other functions?

In the session below, I define the variables c and print without any side effects, i.e. the c and print functions are still available along with the variables of the same name:
R version 4.1.3 (2022-03-10) -- "One Push-Up"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> typeof(c)
[1] "builtin"
> typeof(print)
[1] "closure"
> c <- 42
> print <- 'hello world'
> typeof(c)
[1] "double"
> typeof(print)
[1] "character"
> print(c(print, c))
[1] "hello world" "42"
But if I define a new function c, the new one is used now instead of base::c:
> c <- function(x, ...) paste('Hello', x, ...)
> c(1,2,3)
[1] "Hello 1 2 3"
> base::c(1,2,3)
[1] 1 2 3
> c <- "Gibberish"
> c(1,2,3)
[1] 1 2 3
Why don't the local variables c and print hide functions of the same name unless they point to some other user-defined procedures?
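The session above is consistent with R's lookup rule for calls: when a name appears in call position, R searches for a function of that name and skips bindings that are not functions. A minimal sketch of the two lookup modes using get() follows. The names val, fun and res are only for illustration.

```r
# A non-function binding named "c" in the global environment.
c <- 42

# Ordinary lookup finds the number.
val <- get("c")

# Function-mode lookup skips the number and finds base::c,
# which is what happens when "c" is used in call position.
fun <- get("c", mode = "function")

# The call still dispatches to base::c.
res <- c(1, 2, 3)

print(val)
print(identical(fun, base::c))
print(res)

# Clean up the masking variable.
rm(c)
```

Once c is bound to a function, as in the second part of the session, that binding is itself a valid match for function-mode lookup, so it wins over base::c.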

Mismatching results for singular fit with different R/lme4 versions

I am trying to match the estimate of random effects from R version 3.5.3 (lme4 1.1-18-1) to R version 4.1.1 (lme4 1.1-27.1). However, there is a small difference of random effects between these two versions when there is singular fit. I'm fine with singularity warnings, but it is puzzling that different versions of R/lme4 produce slightly different results.
The following scripts are from R version 3.5.3 (lme4 1.1-18-1) and R version 4.1.1 (lme4 1.1-27.1) with the dataset Arabidopsis from lme4.
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] minqa_1.2.4 MASS_7.3-51.1 compiler_3.5.3 Matrix_1.2-15
[5] tools_3.5.3 Rcpp_1.0.1 splines_3.5.3 nlme_3.1-137
[9] grid_3.5.3 nloptr_1.2.1 lme4_1.1-18-1 lattice_0.20-38
> library(lme4)
Loading required package: Matrix
> options(digits = 15)
> ##########
> #Example1#
> ##########
> fit1 <- lmer(total.fruits~(1|reg)+(1|reg:popu),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"))
> VarCorr(fit1)
Groups Name Std.Dev.
reg:popu (Intercept) 7.744768797534
reg (Intercept) 10.629179104291
Residual 39.028818969641
> ##########
> #Example2#
> ##########
> fit2 <- lmer(total.fruits~(1|reg)+(1|reg:popu)+(1|reg:popu:amd)+(1|reg:popu:amd:status),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"))
> fit2@theta
[1] 0.150979711638631 0.000000000000000 0.189968995915902
[4] 0.260818869156072
> VarCorr(fit2)
Groups Name Std.Dev.
reg:popu:amd:status (Intercept) 5.841181759473
reg:popu:amd (Intercept) 0.000000000000
reg:popu (Intercept) 7.349619506926
reg (Intercept) 10.090696322743
Residual 38.688521100461
> ##########
> #Example3#
> ##########
> devfun353 <- lmer(total.fruits~(1|reg)+(1|reg:popu)+(1|reg:popu:amd)+(1|reg:popu:amd:status),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"),devFunOnly = T)
> save.image('myEnvironment353.Rdata')
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] minqa_1.2.4 MASS_7.3-54 compiler_4.1.1 minque_2.0.0 Matrix_1.3-4
[6] tools_4.1.1 Rcpp_1.0.7 tinytex_0.34 splines_4.1.1 nlme_3.1-152
[11] grid_4.1.1 xfun_0.27 nloptr_1.2.2.2 boot_1.3-28 lme4_1.1-27.1
[16] ADDutil_2.2.1.9005 lattice_0.20-44
> library(lme4)
Loading required package: Matrix
Warning message:
package ‘lme4’ was built under R version 4.1.2
> options(digits = 15)
> ##########
> #Example1#
> ##########
> fit1 <- lmer(total.fruits~(1|reg)+(1|reg:popu),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"))
> VarCorr(fit1)
Groups Name Std.Dev.
reg:popu (Intercept) 7.744768797534
reg (Intercept) 10.629179104291
Residual 39.028818969641
> ##########
> #Example2#
> ##########
> fit2 <- lmer(total.fruits~(1|reg)+(1|reg:popu)+(1|reg:popu:amd)+(1|reg:popu:amd:status),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"))
boundary (singular) fit: see ?isSingular
> fit2@theta
[1] 0.150979743348540 0.000000000000000 0.189969036985684 0.260818797487214
> VarCorr(fit2)
Groups Name Std.Dev.
reg:popu:amd:status (Intercept) 5.841182965248
reg:popu:amd (Intercept) 0.000000000000
reg:popu (Intercept) 7.349621069388
reg (Intercept) 10.090693513643
Residual 38.688520961140
> ##########
> #Example3#
> ##########
> devfun411 <- lmer(total.fruits~(1|reg)+(1|reg:popu)+(1|reg:popu:amd)+(1|reg:popu:amd:status),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"),devFunOnly = T)
> load('myEnvironment353.Rdata')
> devfun353 <- lme4:::mkdevfun(environment(devfun353))
> minqa::bobyqa(c(1,1,1,1),devfun353,0,control = list(iprint=2))
npt = 6 , n = 4
rhobeg = 0.2 , rhoend = 2e-07
start par. = 1 1 1 1 fn = 6443.44054431489
rho: 0.020 eval: 11 fn: 6393.61 par: 0.00000 0.621363 0.744867 0.823498
rho: 0.0020 eval: 38 fn: 6361.97 par:0.156855 0.00000 0.190090 0.234676
rho: 0.00020 eval: 49 fn: 6361.94 par:0.150719 0.00000 0.190593 0.249106
rho: 2.0e-05 eval: 67 fn: 6361.94 par:0.150988 0.00000 0.189943 0.260821
rho: 2.0e-06 eval: 74 fn: 6361.94 par:0.150980 0.00000 0.189965 0.260811
rho: 2.0e-07 eval: 82 fn: 6361.94 par:0.150980 0.00000 0.189969 0.260819
At return
eval: 90 fn: 6361.9381 par: 0.150980 0.00000 0.189969 0.260819
parameter estimates: 0.150979722854965, 0, 0.189968942342717, 0.260818725554898
objective: 6361.93810274656
number of function evaluations: 90
> minqa::bobyqa(c(1,1,1,1),devfun411,0,control = list(iprint=2))
npt = 6 , n = 4
rhobeg = 0.2 , rhoend = 2e-07
start par. = 1 1 1 1 fn = 6443.44054431489
rho: 0.020 eval: 11 fn: 6393.61 par: 0.00000 0.621363 0.744867 0.823498
rho: 0.0020 eval: 38 fn: 6361.97 par:0.156855 0.00000 0.190090 0.234676
rho: 0.00020 eval: 49 fn: 6361.94 par:0.150719 0.00000 0.190593 0.249106
rho: 2.0e-05 eval: 67 fn: 6361.94 par:0.150988 0.00000 0.189943 0.260821
rho: 2.0e-06 eval: 74 fn: 6361.94 par:0.150980 0.00000 0.189965 0.260811
rho: 2.0e-07 eval: 82 fn: 6361.94 par:0.150980 0.00000 0.189969 0.260819
At return
eval: 90 fn: 6361.9381 par: 0.150980 0.00000 0.189969 0.260819
parameter estimates: 0.150979722854965, 0, 0.189968942342717, 0.260818725554898
objective: 6361.93810274656
number of function evaluations: 90
When the model is simpler, there is no singularity warning and the results match (see Example 1 in both scripts). When the model is relatively complex, there is a singularity warning and the results are slightly off (see Example 2 in both scripts). The difference is <1e-5 in this case, but I have observed <1e-4 before. Can anyone shed some light on why the results are slightly different? And is it even possible to match the results to at least 1e-8?
Not sure if this is useful, but I also extracted devfun from 3.5.3 and ran it in 4.1.1. The results match (see Example 3). In addition, when I read the iteration history from BOBYQA, the $\theta$ of the term that leads to the singularity warning oscillates between 0 and small numbers (around 1e-7 to 1e-9).
This post discusses similar topics. It also shows that the singularity warning leads to slightly different estimates. There is no obvious change in the lme4 NEWS that would cause the difference. This FAQ and ?isSingular give a great explanation of the singularity warning but do not address the issue of mismatching directly.
TL;DR: Sometimes when there is a singularity warning (which I am OK with), the random effects are slightly different under different R/lme4 versions. Why is this happening, and how can I address it?
This is a hard problem to solve in general, and even a fairly hard problem to solve in specific cases.
I think the difference arose between version 1.1.27.1 and 1.1.28, probably from this NEWS item:
construction of interacting factors (e.g. when f1:f2 or f1/f2 occur in random effects terms) is now more efficient for partially crossed designs (doesn't try to create all combinations of f1 and f2) (GH #635 and #636)
My guess is that this changes the ordering of the components in the Z matrix, which in turn means that results of various linear algebra operations are not identical (e.g. floating point arithmetic is not associative, so while binary addition is commutative (a + b == b + a), left-to-right evaluation of a sum may not be the same as right-to-left evaluation ((a+b) + c != a + (b+c)) ...)
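The non-associativity point can be seen with three ordinary doubles. This is a minimal sketch; the variable names are only for illustration.

```r
# Floating-point addition is commutative but not associative.
a <- 0.1
b <- 0.2
d <- 0.3

left  <- (a + b) + d   # sum evaluated left to right
right <- a + (b + d)   # same terms, grouped differently

commutes   <- (a + b) == (b + a)   # TRUE
associates <- left == right        # FALSE with IEEE 754 doubles

# all.equal() compares with a tolerance, so it treats the two as equal.
close_enough <- isTRUE(all.equal(left, right))

print(c(commutes = commutes, associates = associates,
        close_enough = close_enough))
```

A reordering of the columns of Z changes the order in which such sums are accumulated, which is enough to move an optimizer's result in the last few digits.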
My attempt at reproducing the problem uses the same version of R ("under development 2022-02-25 r81818") and compares only lme4 package versions 1.1.18.1 and 1.1.28.9000 (development); any upstream packages such as Rcpp, RcppEigen, Matrix use the same versions. (I had to backport a few changes from the development version of lme4 to 1.1.18.1 to get it to install under the most recent version of R, but I don't think any of those modifications would affect numerical results.)
I did the comparison by installing different versions of the lme4 package before running the code in a fresh R session. My results differed between versions 1.1.18.1 and 1.1.28 less than yours did (both fits were singular, and the relative differences in the theta estimates were of the order of 2e-7 — still greater than your desired 1e-8 tolerance but much smaller than 1e-4 ...)
The results from 1.1.18.1 and 1.1.27.1 were identical.
Q1: Why are your results more different between versions than mine?
in general/anecdotally, numerical results on Windows are slightly more unstable/differ more from other platforms
there are more differences between your two test platforms than among mine: R version, upstream packages (Matrix/Rcpp/RcppEigen/minqa), possibly the compiler versions and settings used to build everything [all of which could make a difference]
Q2: how should one deal with this kind of problem?
as a minor frame challenge, why (other than not understanding what's going on, which is a perfectly legitimate reason to be concerned) does this worry you? The differences in the results are way smaller than the magnitude of statistical uncertainty, and differences this large are also likely to occur across different platforms (OS/compiler version/etc.) even for otherwise identical environments (versions of R, lme4, and other packages).
you could revert to version 1.1.27.1 for now ...
I do take the differences between 1.1.27.1 and 1.1.28 as a bug, of sorts; at the very least it's an undocumented change in the package. If it were sufficiently high-priority I could investigate the code changes described above and see if there is a way to fix the problems they addressed without breaking backward compatibility (in theory this should be possible, but it could be annoyingly difficult ...)
## R CMD INSTALL ~/R/misc/lme4
library(lme4)
packageVersion("lme4")
## 1.1.18.1
fit2 <- lmer(total.fruits~(1|reg)+(1|reg:popu)+(1|reg:popu:amd)+(1|reg:popu:amd:status),data=Arabidopsis,control=lmerControl(optimizer="bobyqa"))
dput(getME(fit2, "theta"))
t1 <- c(`reg:popu:amd:status.(Intercept)` = 0.150979711638631, `reg:popu:amd.(Intercept)` = 0,
`reg:popu.(Intercept)` = 0.189968995915902, `reg.(Intercept)` = 0.260818869156072
)
Run under 1.1.28.9000 (fresh R session, re-run package-loading/lmer code above)
## R CMD INSTALL ~/R/pkgs/lme4git/lme4
packageVersion("lme4")
## [1] ‘1.1.28.9000’
dput(getME(fit2, "theta"))
t2 <- c(`reg:popu:amd:status.(Intercept)` = 0.15097974334854, `reg:popu:amd.(Intercept)` = 0,
`reg:popu.(Intercept)` = 0.189969036985684, `reg.(Intercept)` = 0.260818797487214
)
(t1-t2)/((t1+t2)/2)
## reg:popu:amd:status.(Intercept) reg:popu:amd.(Intercept)
## -2.100276e-07 NaN
## reg:popu.(Intercept) reg.(Intercept)
## -2.161920e-07 2.747841e-07
The second element is NaN because both versions give singular fits (0/0 == NaN).
Run under 1.1.27.1 (fresh R session, re-run package-loading/lmer code above)
## remotes::install_version("lme4", "1.1-27.1")
t3 <- c(`reg:popu:amd:status.(Intercept)` = 0.150979711638631, `reg:popu:amd.(Intercept)` = 0,
`reg:popu.(Intercept)` = 0.189968995915902, `reg.(Intercept)` = 0.260818869156072)
identical(t1, t3) ## TRUE
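For regression tests across versions, one option is to compare estimates with an explicit tolerance rather than with identical(). The sketch below reuses the theta values printed above (names dropped for brevity). The tolerances 1e-6 and 1e-8 are illustrative choices, not lme4 recommendations.

```r
# theta estimates copied from the two runs above
t1 <- c(0.150979711638631, 0, 0.189968995915902, 0.260818869156072)  # 1.1.18.1
t2 <- c(0.15097974334854,  0, 0.189969036985684, 0.260818797487214)  # 1.1.28.9000

# all.equal() reports a mean relative difference; wrap it in isTRUE()
# because it returns a character description when the check fails.
same_1e6 <- isTRUE(all.equal(t1, t2, tolerance = 1e-6))  # passes
same_1e8 <- isTRUE(all.equal(t1, t2, tolerance = 1e-8))  # fails

# The absolute difference avoids the 0/0 problem for the singular component.
max_abs <- max(abs(t1 - t2))

print(c(same_1e6 = same_1e6, same_1e8 = same_1e8))
print(max_abs)
```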

furrr with rTorch in multisession

I want to use rTorch in a furrr "loop". A minimal example seems to be:
library(rTorch)
torch_it <- function(i) {
#.libPaths("/User/homes/mreichstein/.R_libs_macadamia4.0/")
#require(rTorch)
cat("torch is: "); print(torch)
out <- torch$tensor(1)
out
}
library(furrr)
plan(multisession, workers=8) # this works with sequential or multicore
result <- future_map(1:5, torch_it)
I get as output:
torch is: <environment: 0x556b57f07990>
attr(,"class")
[1] "python.builtin.module" "python.builtin.object"
Error in torch$tensor(1) : attempt to apply non-function
When I use multicore or sequential I get the expected output:
torch is: Module(torch)
torch is: Module(torch)
torch is: Module(torch)
torch is: Module(torch)
torch is: Module(torch)
... and result is the expected list of tensors.
Uncommenting the first lines in the function, on the assumption that the new sessions "need this", did not help.
Update: I added torch <- import("torch") in the torch_it function. Then it runs without error, but the result I get is a list of empty pointers (?):
> result
[[1]]
<pointer: 0x0>
[[2]]
<pointer: 0x0>
[[3]]
<pointer: 0x0>
[[4]]
<pointer: 0x0>
[[5]]
<pointer: 0x0>
So, how do I properly link rTorch to each of the multisessions?
Thanks in advance!
Info:
> R.version
_
platform x86_64-redhat-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 4
minor 0.3
year 2020
month 10
day 10
svn rev 79318
language R
version.string R version 4.0.3 (2020-10-10)
nickname Bunny-Wunnies Freak Out

unzip() is overwriting file, even when using overwrite = FALSE

saveRDS(1, tmp1 <- "test1.rds")
tmp3 <- tempfile(fileext = ".zip")
zip(tmp3, tmp1)
unlink(tmp1)
file.exists(tmp1) # FALSE
unzip(tmp3)
file.exists(tmp1) # TRUE
readRDS(tmp1) # 1
saveRDS(2, tmp1)
readRDS(tmp1) # 2
unzip(tmp3, overwrite = FALSE)
# Warning message:
# In unzip(tmp3, overwrite = FALSE) : not overwriting file './test1.rds
readRDS(tmp1) # 1
unlink(tmp1)
I was expecting the last readRDS(tmp1) to return 2, right?
Any thoughts?
PS: I'm on Linux CentOS 7, using R version 3.5.2.
I have confirmed that you are indeed facing a bug in R 3.5.2. I have checked on CentOS 7.2 only.
R 3.6.0
$ /usr/bin/R -f test2.r
R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
> saveRDS(1, tmp1 <- "test1.rds")
> tmp3 <- tempfile(fileext = ".zip")
> zip(tmp3, tmp1)
adding: test1.rds (deflated 2%)
> unlink(tmp1)
> file.exists(tmp1) # FALSE
[1] FALSE
>
> unzip(tmp3)
> file.exists(tmp1) # TRUE
[1] TRUE
> readRDS(tmp1) # 1
[1] 1
> saveRDS(2, tmp1)
> readRDS(tmp1) # 2
[1] 2
> unzip(tmp3, overwrite = FALSE)
Warning message:
In unzip(tmp3, overwrite = FALSE) : not overwriting file './test1.rds
> readRDS(tmp1) # 1
[1] 2
>
> unlink(tmp1)
>
R 3.5.2
$ R -f test2.r
R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
> saveRDS(1, tmp1 <- "test1.rds")
> tmp3 <- tempfile(fileext = ".zip")
> zip(tmp3, tmp1)
adding: test1.rds (deflated 5%)
> unlink(tmp1)
> file.exists(tmp1) # FALSE
[1] FALSE
>
> unzip(tmp3)
> file.exists(tmp1) # TRUE
[1] TRUE
> readRDS(tmp1) # 1
[1] 1
> saveRDS(2, tmp1)
> readRDS(tmp1) # 2
[1] 2
> unzip(tmp3, overwrite = FALSE)
Warning message:
In unzip(tmp3, overwrite = FALSE) : not overwriting file './test1.rds
> readRDS(tmp1) # 1
[1] 1
>
> unlink(tmp1)
>
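If upgrading R is not an option, a possible workaround is to skip existing files explicitly instead of relying on overwrite = FALSE. This is a minimal sketch: the helper name unzip_keep_existing is hypothetical, it uses only utils::unzip() with its list, files and exdir arguments, and it has not been tested on R 3.5.2.

```r
# Extract only those archive entries that do not already exist on disk.
# unzip_keep_existing is a hypothetical helper, not part of R.
unzip_keep_existing <- function(zipfile, exdir = ".") {
  # List the archive contents without extracting anything.
  entries <- utils::unzip(zipfile, list = TRUE)$Name

  # Keep only entries with no corresponding file under exdir.
  todo <- entries[!file.exists(file.path(exdir, entries))]

  # Guard: an empty 'files' argument could be read as "extract everything".
  if (length(todo) == 0L) {
    return(invisible(character(0)))
  }

  invisible(utils::unzip(zipfile, files = todo, exdir = exdir))
}
```

In the example above, calling unzip_keep_existing(tmp3) in place of unzip(tmp3, overwrite = FALSE) should leave the file containing 2 untouched.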

Error from CAPM.alpha in PerformanceAnalytics

When I try to run the example from page 22 of the PerformanceAnalytics reference, I get an error message. See below.
PS I am a beginner & this has never worked for me. Also, my underlying issue is that I'm getting exactly the same error when trying to use table.CAPM with my own data.
Thanks for any assistance.
> search()
[1] ".GlobalEnv" "package:PerformanceAnalytics"
[3] "package:xts" "package:zoo"
[5] "package:stats" "package:graphics"
[7] "package:grDevices" "package:utils"
[9] "package:datasets" "package:methods"
[11] "Autoloads" "package:base"
> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 2
minor 15.2
year 2012
month 10
day 26
svn rev 61015
language R
version.string R version 2.15.2 (2012-10-26)
nickname Trick or Treat
> data(managers)
> CAPM.alpha(managers[,1,drop=FALSE], managers[,8,drop=FALSE], Rf=.035/12)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
>
The bug is not in your code; it is in the R package itself. It is shown on the package validation check here and can be reproduced with:
library(PerformanceAnalytics)
example(CAPM.alpha)
The error seems to be on line 40 of Return.excess.R. It should be replaced with:
xR = coredata(as.xts(R))-coredata(as.xts(Rf))
The easiest way of fixing this in practice is to run:
require(utils)
assignInNamespace(
"Return.excess",
function (R, Rf = 0)
{ # #author Peter Carl
# edited by orizon
# .. additional comments removed
R = checkData(R)
if(!is.null(dim(Rf))){
Rf = checkData(Rf)
indexseries=index(cbind(R,Rf))
columnname.Rf=colnames(Rf)
}
else {
indexseries=index(R)
columnname.Rf=Rf
Rf=xts(rep(Rf, length(indexseries)),order.by=indexseries)
}
return.excess <- function (R,Rf)
{
xR = coredata(as.xts(R))-coredata(as.xts(Rf)) #fixed
}
result = apply(R, MARGIN=2, FUN=return.excess, Rf=Rf)
colnames(result) = paste(colnames(R), ">", columnname.Rf)
result = reclass(result, R)
return(result)
},
"PerformanceAnalytics"
)
Then your original command works:
> data(managers)
> CAPM.alpha(managers[,1,drop=FALSE], managers[,8,drop=FALSE], Rf=.035/12)
[1] 0.005960609
Be aware that I have not verified that the function does what it purports to do.
