How to solve compatibility problems between the mgcv and gam packages? - r

I'm using RMarkdown and knitr to compile a short modelling protocol for myself (in one HTML file). It is based on Zuur, A. F., Ieno, E. N., Walker, N. J., Saveliev, A. A. & Smith, G. M. 2009: Mixed effects models and extensions in ecology with R.
I downloaded the code and the datasets shared on their website, and I'm mixing them with other sources and comments to produce something useful for me. The two datasets I'm using can be freely downloaded from that website.
The main problem is that I'm trying to combine code written for the mgcv package with code written for the gam package.
From these two topics I understand that this is the cause:
R Package conflict between gam and mgcv?
Are there known compatibility issues with R package mgcv? Are there general rules for compatibility?
But I would like to find a solution. The suggestions given in these two topics, specifying the package explicitly or detaching the unused one, do not work: I have just tried both in my code.
These are the parts that are causing me problems:
---
title: "error"
author: "Simone Marini"
date: "29 marzo 2016"
output: html_document
---
```{r setup, include = FALSE, cache = FALSE}
knitr::opts_chunk$set(error = TRUE) # to allow rendering to html even if there are errors
```
```{r}
Loyn <- read.table(file = "zuur_data/Loyn.txt", header = TRUE, dec = ".")
Loyn$fGRAZE <- factor(Loyn$GRAZE) # Transform in factor Graze data (from 1 to 5 -> 5 classes)
Loyn$L.AREA<-log10(Loyn$AREA)
Loyn$L.DIST<-log10(Loyn$DIST)
Loyn$L.LDIST<-log10(Loyn$LDIST)
```
```{r}
library(mgcv)
AM1<-mgcv::gam(ABUND~s(L.AREA)+s(L.DIST)+s(L.LDIST)+
s(YR.ISOL)+s(ALT)+fGRAZE, data = Loyn)
# The anova command does not apply a sequential F-test as it did for the linear regression model.
# Instead, it gives the Wald test (approximate!) that shows the significance of each term in the model.
anova(AM1)
AM2<-mgcv:::gam(ABUND ~ s(L.AREA, bs = "cs") + s(L.DIST, bs = "cs") +
s(L.LDIST,bs = "cs") + s(YR.ISOL, bs = "cs") +
s(ALT, bs = "cs") + fGRAZE, data = Loyn)
anova(AM2)
AM3 <- mgcv:::gam(ABUND ~ s(L.AREA, bs = "cs") + fGRAZE, data = Loyn)
#Model plot
plot(AM3)
E.AM3 <- resid(AM3) # Residuals
Fit.AM3 <- fitted(AM3) # Fitted values
plot(x = Fit.AM3, y = E.AM3, xlab = "Fitted values",
ylab = "Residuals") # Graph
M3<-lm(ABUND ~ L.AREA + fGRAZE, data = Loyn)
AM3<-mgcv:::gam(ABUND ~ s(L.AREA, bs = "cs") + fGRAZE, data = Loyn)
anova(M3, AM3)
```
```{r}
rm(list=ls())
detach("package:mgcv")
ISIT <- read.table(file = "zuur_data/ISIT.txt", header = TRUE, dec = ".")
ISIT$fStation<-factor(ISIT$Station)
op <- par(mfrow=c(2,2),mar=c(5,4,1,2))
Sources16<-ISIT$Sources[ISIT$Station==16]
Depth16<-ISIT$SampleDepth[ISIT$Station==16]
plot(Depth16,Sources16,type="p")
library(gam)
M2<-gam:::gam(Sources16~gam::lo(Depth16,span=0.5))
plot(M2,se=T)
P2 <- predict(M2, se = TRUE)
plot(Depth16, Sources16, type = "p")
I1 <- order(Depth16)
lines(Depth16[I1], P2$fit[I1], lty = 1)
lines(Depth16[I1], P2$fit[I1] + 2 * P2$se[I1], lty = 2)
lines(Depth16[I1], P2$fit[I1] - 2 * P2$se[I1], lty = 2)
par(op)
```
Does anyone know a way to completely "detach" mgcv or gam? Or some code that can "reload" the entire environment when I have to compile the gam part?
My sessionInfo, in case it is useful:
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 LC_MONETARY=Italian_Italy.1252
[4] LC_NUMERIC=C LC_TIME=Italian_Italy.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets methods base
other attached packages:
[1] lattice_0.20-33 gam_1.12 foreach_1.4.3 mgcv_1.8-11 nlme_3.1-124
loaded via a namespace (and not attached):
[1] Matrix_1.2-3 htmltools_0.3 tools_3.2.3 yaml_2.1.13 codetools_0.2-14 rmarkdown_0.9.2
[7] grid_3.2.3 iterators_1.0.8 knitr_1.12.3 digest_0.6.9
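A minimal sketch of why detach() alone may leave traces behind, using the base package splines purely as a stand-in (an assumption for illustration; mgcv and gam are not needed to run it). By default detach() removes a package from the search path but leaves its namespace loaded, and ?detach notes that registered S3 methods are not removed even when the namespace is unloaded:

```r
# Attach a package, then detach it the same way the chunk above detaches mgcv.
library(splines)
detach("package:splines")

# The package is gone from the search path ...
attached_after_detach <- "package:splines" %in% search()

# ... but its namespace is still loaded in the session.
loaded_after_detach <- isNamespaceLoaded("splines")

print(c(attached = attached_after_detach, loaded = loaded_after_detach))
```

If a truly clean state is needed between the mgcv part and the gam part, one option that might be worth trying is to knit the two parts as separate documents, each in its own R session.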

Related

glmmTMB_phylo: Error in Matrix::rankMatrix(TMBStruc$data.tmb[[whichX]]) : length(d <- dim(x)) == 2 is not TRUE

I am trying to run the following model:
mod1<- phylo_glmmTMB(response ~ sv1 + # sampling variables
sv2 + sv3 + sv4 + sv5 +
sv6 + sv7 +
(1|phylo) + (1|reference_id), #random effects
ziformula = ~ 0,
#ar1(pos + 0| group) # spatial autocorrelation structure; group is a dummy variable
phyloZ = supertreenew,
phylonm = "phylo",
family = "binomial",
data = data)
But I keep getting the error:
Error in Matrix::rankMatrix(TMBStruc$data.tmb[[whichX]]) :
length(d <- dim(x)) == 2 is not TRUE
This error also occurs with other reproducible examples (data) that I found.
Before running the model, I just loaded my data (data and supertree) and computed a Z matrix from the supertree:
#Compute Z matrix
#supertreenew <- vcv.phylo(supertreenew)
#or
supertreenew <- phylo.to.Z(supertreenew)
#enforced match between
supertreenew <- supertreenew[levels(factor(data$phylo)), ]
I have installed the development version via:
remotes::install_github("wzmli/phyloglmm/pkg")
But without success.
The dimensions of my supertree are:
[[1]]
... [351]
[[2]]
... [645]
Any guess?
My session info:
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] phyloglmm_0.1.0.9001 brms_2.18.0 cpp_1.0.9 performance_0.10. DHARMa_0.4.6
[6] phytools_1.2-0 maps_3.4.0 ape_5.6-2 lme4_1.1-31 Matrix_1.5-1
[11] TMB_1.9.1 glmmTMB_1.1.5.9000 remotes_2.4.2
(First error, "Error in Matrix::rankMatrix") This is a consequence of the addition of a check of the rank of the fixed-effects matrix in recent versions of glmmTMB. For now, adding
control = glmmTMB::glmmTMBControl(rank_check = "skip")
to your phylo_glmmTMB call should work around the problem.
(Second error, "Error in getParameterOrder(data, parameters, new.env(), DLL = DLL) ...") I just updated the refactor branch to handle this problem [caused by internal changes in glmmTMB]. Use remotes::install_github("wzmli/phyloglmm/pkg#refactor") to install this version, then try your example again.
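For the first workaround, here is a minimal sketch of where the control argument goes. It uses plain glmmTMB and the Salamanders data shipped with that package as a stand-in (an assumption, chosen only so that the sketch runs without phyloglmm or the question's data):

```r
library(glmmTMB)

# Control object that skips the rank check on the fixed-effects matrix.
ctrl <- glmmTMBControl(rank_check = "skip")

# Stand-in model on the Salamanders data shipped with glmmTMB
# (assumption: used here only so the sketch runs without phyloglmm).
fit <- glmmTMB(count ~ mined + (1 | site),
               family = poisson,
               data = Salamanders,
               control = ctrl)

# In the question's call the same argument would be passed as
#   phylo_glmmTMB(..., control = ctrl)
print(ctrl$rank_check)
```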

How to link a couple of tip nodes in an inverted circular phylogenetic tree using ggtree in R

I want to create a figure of an annotated phylogenetic tree in circular layout with ggtree in R. Some tip nodes must be linked by a curve line. I can achieve this with the geom_taxalink() function in the rectangular layout, but it doesn't work in the circular layout. This seems to be because the geom_taxalink() uses geom_curve(), which doesn't support non-linear coordinates. I get the following message:
"Warning message:
geom_curve is not implemented for non-linear coordinates"
Below: reproducible code, the output I get, the output I want, and session info.
I'd appreciate any help to get the result I need.
Thanks!
Samuel
Example code:
library(treeio)
library(ggtree)
library(ggplot2)
raxml_file <- system.file("extdata/RAxML",
"RAxML_bipartitionsBranchLabels.H3",
package="treeio")
raxml <- read.raxml(raxml_file)
raxml <- as_tibble(raxml)
raxml$label <- gsub("_.*$", "", raxml$label)
raxml <- as.treedata(raxml)
my_tree <- ggtree(raxml, layout = "circular", branch.length = "none") +
geom_tiplab2(size = 3, hjust = 1) +
geom_taxalink("EU857082",
"YGSIV1534",
color = "red") +
scale_x_reverse(limits = c(100, 0))
ggsave("my_tree.png", my_tree,
width = 10, height = 10, units = "in",
dpi = 300)
Here is a link to a sample of the result I get:
Here is a link to an example of the desired result:
Session info:
info <- sessionInfo()
toLatex(info, locale = FALSE)
# \begin{itemize}\raggedright
# \item R version 4.0.2 (2020-06-22), \verb|x86_64-pc-linux-gnu|
# \item Running under: \verb|Ubuntu 18.04.4 LTS|
# \item Matrix products: default
# \item BLAS: \verb|/usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1|
# \item LAPACK: \verb|/usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1|
# \item Base packages: base, datasets, graphics, grDevices, methods,
# stats, utils
# \item Other packages: ggplot2~3.3.2, ggtree~2.2.1, treeio~1.12.0
# \item Loaded via a namespace (and not attached): ape~5.4,
# aplot~0.0.4, assertthat~0.2.1, BiocManager~1.30.10, cli~2.0.2,
# colorspace~1.4-1, compiler~4.0.2, crayon~1.3.4, dplyr~1.0.0,
# ellipsis~0.3.1, fansi~0.4.1, farver~2.0.3, generics~0.0.2,
# glue~1.4.1, grid~4.0.2, gtable~0.3.0, jsonlite~1.7.0, labeling~0.3,
# lattice~0.20-41, lazyeval~0.2.2, lifecycle~0.2.0, magrittr~1.5,
# munsell~0.5.0, nlme~3.1-148, parallel~4.0.2, patchwork~1.0.1,
# pillar~1.4.6, pkgconfig~2.0.3, purrr~0.3.4, R6~2.4.1, Rcpp~1.0.5,
# rlang~0.4.7, rstudioapi~0.11, rvcheck~0.1.8, scales~1.1.1,
# tibble~3.0.3, tidyr~1.1.0, tidyselect~1.1.0, tidytree~0.3.3,
# tools~4.0.2, vctrs~0.3.1, withr~2.2.0
# \end{itemize}
The solution is to upgrade to version 2.3.2 (the latest version as of July 15, 2020), which is hosted on GitHub by the author of the package:
devtools::install_github("YuLab-SMU/ggtree")
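A small sketch for checking whether an installed version is older than the fixed one. The two version strings are taken from the session info in the question and from the answer; in a live session the installed version would be read with packageVersion() instead of being typed in:

```r
# Version reported in the question's session info and the version named in the answer.
installed_version <- "2.2.1"
fixed_version <- "2.3.2"

# compareVersion() returns -1 when the first version is older than the second.
needs_upgrade <- utils::compareVersion(installed_version, fixed_version) < 0
print(needs_upgrade)

# In a live session the installed version could be read with
#   as.character(utils::packageVersion("ggtree"))
```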

knitr bookdown::gitbook and webgl: rotation does not work properly

I have the following Rmd file:
---
output: bookdown::gitbook
---
```{r include=FALSE}
rgl::setupKnitr()
```
```{r testing1,webgl=TRUE}
with(attitude,
car::scatter3d(x = rating, z = complaints, y = learning)
)
```
```{r testing2,webgl=TRUE}
with(attitude,
car::scatter3d(x = rating, z = complaints, y = learning)
)
```
When I knit this file, it produces an HTML file containing two identical 3D interactive scatterplots. Both scatterplots look like they should, but the second scatterplot does not rotate properly. It will not rotate horizontally in depth correctly (e.g., around the vertical axis).
In case it helps, you can find the HTML output of the knit here: https://www.dropbox.com/s/v3usmtes7n54t6q/Untitled.html.zip?dl=0
I have done all the following, none of which have fixed the problem:
Updated all packages with update.packages().
Installed the development version of bookdown.
Installed the development version of knitr.
Tried the solution here (didn't work): interactive 3D plots in markdown file - not working anymore?
I have noted the following:
If I change the output to html_document I do not have the problem (I'm debugging the problem in a bookdown::gitbook though, so that knowledge does not directly help me).
In the Firefox (77.0.1, 64-bit) javascript error console there is an error: TypeError: li[0] is undefined / plugin-bookdown.js:152:43 (which appears to have something to do with the table of contents and scrolling?)
Here is the output of sessionInfo():
> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] bookdown_0.19.4 fansi_0.4.1 digest_0.6.25 crayon_1.3.4
[5] assertthat_0.2.1 evaluate_0.14 rlang_0.4.6 cli_2.0.2
[9] rstudioapi_0.11 rmarkdown_2.3 tools_4.0.0 glue_1.4.1
[13] xfun_0.14 yaml_2.2.1 rsconnect_0.8.16 compiler_4.0.0
[17] htmltools_0.5.0 knitr_1.28.7
In addition, here are the versions of some other relevant packages:
> installed.packages()[c("rgl","mgcv","car"),"Version"]
rgl mgcv car
"0.100.54" "1.8-31" "3.0-8"
Edit to add more detail
I have the same problem while using rgl::persp3d, so it isn't specific to car::scatter3d. The HTML from the Rmd file below uses only rgl but exhibits the same behavior.
---
output: bookdown::gitbook
---
```{r include=FALSE}
rgl::setupKnitr()
x <- seq(-10, 10, length = 30)
y <- x
f <- function(x, y) { r <- sqrt(x^2 + y^2); 10 * sin(r)/r }
z <- outer(x, y, f)
z[is.na(z)] <- 1
```
```{r testing1,webgl=TRUE}
rgl::persp3d(x, y, z, aspect = c(1, 1, 0.5), col = "lightblue",
xlab = "X", ylab = "Y", zlab = "Sinc( r )",
polygon_offset = 1)
```
```{r testing2,webgl=TRUE}
rgl::persp3d(x, y, z, aspect = c(1, 1, 0.5), col = "lightblue",
xlab = "X", ylab = "Y", zlab = "Sinc( r )",
polygon_offset = 1)
```
This turned out to be a bug in rgl, which was using an obsolete method to compute the location of mouse clicks relative to objects in scenes. It worked in an html_document, but not with bookdown::gitbook.
The development version (0.102.6) of rgl has fixed this, but it contains some really major changes, and a few other things are still broken by them: in particular using the webgl=TRUE chunk option. If you want to use the devel version, you should use explicit calls to rglwidget() in each chunk, or if you want to try out the new stuff, use rgl::setupKnitr(autoprint = TRUE) and just treat rgl graphics like base graphics, controlled by chunk options fig.keep etc.
Edited to add: version 0.102.7 fixes the known webgl=TRUE issue.

Obtaining an error when running exact code from a blog

I am following a tutorial here. A few days ago I was able to run this code without error and run it on my own data set (it was always a little hit and miss with obtaining this error) - however now I try to run the code and I always obtain the same error.
Error in solve.QP(Dmat, dvec, Amat, bvec = b0, meq = 2) :
constraints are inconsistent, no solution!
I get that the solver cannot solve the equations but I am a little confused as to why it worked previously and now it does not... The author of the article has this code working...
library(tseries)
library(data.table)
link <- "https://raw.githubusercontent.com/DavZim/Efficient_Frontier/master/data/mult_assets.csv"
df <- data.table(read.csv(link))
df_table <- melt(df)[, .(er = mean(value),
sd = sd(value)), by = variable]
er_vals <- seq(from = min(df_table$er), to = max(df_table$er), length.out = 1000)
# find an optimal portfolio for each possible expected return
# (note that the values are explicitly set between the minimum and maximum of the expected returns per asset)
sd_vals <- sapply(er_vals, function(er) {
op <- portfolio.optim(as.matrix(df), er)
return(op$ps)
})
SessionInfo:
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252
[4] LC_NUMERIC=C LC_TIME=Spanish_Spain.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] lpSolve_5.6.13.1 data.table_1.12.0 tseries_0.10-46 rugarch_1.4-0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 MASS_7.3-51.1 mclust_5.4.2
[4] lattice_0.20-38 quadprog_1.5-5 Rsolnp_1.16
[7] TTR_0.23-4 tools_3.5.3 xts_0.11-2
[10] SkewHyperbolic_0.4-0 GeneralizedHyperbolic_0.8-4 quantmod_0.4-13.1
[13] spd_2.0-1 grid_3.5.3 KernSmooth_2.23-15
[16] yaml_2.2.0 numDeriv_2016.8-1 Matrix_1.2-15
[19] nloptr_1.2.1 DistributionUtils_0.6-0 ks_1.11.3
[22] curl_3.3 compiler_3.5.3 expm_0.999-3
[25] truncnorm_1.0-8 mvtnorm_1.0-8 zoo_1.8-4
tseries::portfolio.optim disallows short selling by default; see the argument shorts. If shorts = FALSE, asset weights may not go below 0. And because the weights must sum to 1, no individual asset weight can be above 1 either. There's no leverage.
(Possibly, in an earlier version of tseries the default could have been shorts = TRUE. This would explain why it previously worked for you.)
Consequently, your target return (pm) cannot exceed the highest mean return of any of the input assets.
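To make the bound concrete, here is a minimal sketch with made-up mean returns (the numbers and the helper is_feasible are hypothetical, not taken from the data in the question). With weights between 0 and 1 that sum to 1, the portfolio return is a weighted average of the asset means, so a target outside their range cannot be met:

```r
# Hypothetical mean returns of three assets (made-up numbers for illustration).
mu <- c(a = 0.02, b = 0.05, c = 0.08)

# Long-only weights: each weight lies in [0, 1] and they sum to 1.
w <- c(0.2, 0.3, 0.5)
stopifnot(all(w >= 0), isTRUE(all.equal(sum(w), 1)))

# The portfolio return is a weighted average, so it stays within range(mu).
portfolio_return <- sum(w * mu)

# A target is reachable without short selling only if it lies inside that range.
is_feasible <- function(target, mu) target >= min(mu) && target <= max(mu)

print(portfolio_return)
print(is_feasible(0.06, mu))
print(is_feasible(0.10, mu))
```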
Solution 1: Allow short selling, but remember that that's a different efficient frontier. (For reference, see any lecture or book discussing Markowitz optimization. There's a mathematical solution to the problem without short-selling restriction.)
op <- portfolio.optim(as.matrix(df), er, shorts = T)
Solution 2: Limit the target returns between the worst and the best asset's return.
er_vals <- seq(from = min(colMeans(df)), to = max(colMeans(df)), length.out = 1000)
Here's a plot of the obtained efficient frontiers.
Here's the full script that gives both solutions.
library(tseries)
library(data.table)
link <- "https://raw.githubusercontent.com/DavZim/Efficient_Frontier/master/data/mult_assets.csv"
df <- data.table(read.csv(link))
df_table <- melt(df)[, .(er = mean(value),
sd = sd(value)), by = variable]
# er_vals <- seq(from = min(df_table$er), to = max(df_table$er), length.out = 1000)
er_vals1 <- seq(from = 0, to = 0.15, length.out = 1000)
er_vals2 <- seq(from = min(colMeans(df)), to = max(colMeans(df)), length.out = 1000)
# find an optimal portfolio for each possible expected return
# (note that the values are explicitly set between the minimum and maximum of the expected returns per asset)
sd_vals1 <- sapply(er_vals1, function(er) {
op <- portfolio.optim(as.matrix(df), er, shorts = TRUE)
return(op$ps)
})
sd_vals2 <- sapply(er_vals2, function(er) {
op <- portfolio.optim(as.matrix(df), er, shorts = FALSE)
return(op$ps)
})
plot(x = sd_vals1, y = er_vals1, type = "l", col = "red",
xlab = "sd", ylab = "er",
main = "red: allowing short-selling;\nblue: disallowing short-selling")
lines(x = sd_vals2, y = er_vals2, type = "l", col = "blue")

caught segfault error in R

I am getting a caught segfault error every time I try to run any plotting functions from the ggplot2 package (1.0.0). I have tried this with qplot, geom_dotplot, geom_histogram, etc. Data from the package (e.g. diamonds or economics) work just fine.
I am operating on Mac OS 10.9.4 (the latest version) and on R 3.1.1 (also the latest version). I get the same error with the standard R GUI, RStudio, and when using R from the command line. The command brings up the default graphic device (Quartz for R GUI and command line), but also the terminal error.
library(ggplot2)
qplot(1:10)
gives me the error:
*** caught segfault ***
address 0x18, cause 'memory not mapped'
Traceback:
1: .Call("plyr_split_indices", PACKAGE = "plyr", group, n)
2: split_indices(scale_id, n)
3: scale_apply(layer_data, x_vars, scale_train, SCALE_X, panel$x_scales)
4: train_position(panel, data, scale_x(), scale_y())
5: ggplot_build(x)
6: print.ggplot(list(data = list(), layers = list(<environment>), scales = <S4 object of class "Scales">, mapping = list(x = 1:3), theme = list(), coordinates = list(limits = list(x = NULL, y = NULL)), facet = list(shrink = TRUE), plot_env = <environment>, labels = list(x = "1:3", y = "count")))
7: print(list(data = list(), layers = list(<environment>), scales = <S4 object of class "Scales">, mapping = list(x = 1:3), theme = list(), coordinates = list( limits = list(x = NULL, y = NULL)), facet = list(shrink = TRUE), plot_env = <environment>, labels = list(x = "1:3", y = "count")))
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Here is my session info:
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] graphics grDevices utils datasets stats methods base
other attached packages:
[1] ggplot2_1.0.0 marelac_2.1.3 seacarb_3.0 shape_1.4.1 beepr_1.1 birk_1.1
loaded via a namespace (and not attached):
[1] audio_0.1-5 colorspace_1.2-4 digest_0.6.4 grid_3.1.1 gtable_0.1.2
[6] MASS_7.3-34 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 Rcpp_0.11.2
[11] reshape2_1.4 scales_0.2.4 stringr_0.6.2 tools_3.1.1
I've gathered from others that this is a memory issue of some sort, but this error occurs even when I have over 2 GB of free RAM. I know this is a widely used package, so of course this doesn't happen for everyone, but why is it happening for me? Does anyone know what I can do to fix this problem?
In case anyone else has this problem or similar in the future, I sent a bug report to the package maintainer and he recommended uninstalling all installed packages and starting over. I took his advice and it worked!
I followed advice from this posting: http://r.789695.n4.nabble.com/Reset-R-s-library-to-base-packages-only-remove-all-installed-contributed-packages-td3596151.html
ip <- installed.packages()
pkgs.to.remove <- ip[!(ip[,"Priority"] %in% c("base", "recommended")), 1]
sapply(pkgs.to.remove, remove.packages)
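Before removing anything, it may be worth listing what would go. The sketch below is a dry run of the same selection: it only builds and prints the vector and does not uninstall any package. The variable names are my own, and the reinstall line is left commented out on purpose:

```r
# List what the snippet above would remove, without removing anything.
ip <- installed.packages()
keep <- ip[, "Priority"] %in% c("base", "recommended")
pkgs_to_remove <- unname(ip[!keep, "Package"])

print(length(pkgs_to_remove))

# After removal, the same vector could be passed to install.packages()
# to reinstall everything from scratch (left commented out on purpose):
# install.packages(pkgs_to_remove)
```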
This is not an answer to this question but it might be helpful for someone. (Inspired by user1310503. Thanks!)
I am working on a data.frame df with three cols: col1, col2, col3.
Initially,
df =data.frame(col1=character(),col2=numeric(),col3=numeric(),stringsAsFactors = F)
In the process, rbind is called many times, like:
aList<-list(col1="aaa", col2 = "123", col3 = "234")
dfNew <- as.data.frame(aList)
df <- rbind(df, dfNew)
Finally, df is written to a file via data.table::fwrite:
data.table::fwrite(x = df, file = fileDF, append = FALSE, row.names = F, quote = F, showProgress = T)
df has 5973 rows and 3 cols. The "caught segfault" always occurs:
address 0x1, cause 'memory not mapped'. 
The solution to this problem is:
aList<-list(col1=as.character("aaa"), col2 = as.numeric("123"), col3 = as.numeric("234"))
dfNew <- as.data.frame(aList)
dfNew$col1 <- as.character(dfNew$col1)
dfNew$col2 <- as.numeric(dfNew$col2)
dfNew$col3 <- as.numeric(dfNew$col3)
df <- rbind(df, dfNew)
Then this problem is solved. A possible reason is that the classes of the columns differ.
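A minimal sketch of the class mismatch itself, in base R only. It does not reproduce the segfault; it only shows that a row built from strings changes the column classes after rbind, while a row with explicit conversion keeps them. The names row_bad and row_good are made up for this illustration:

```r
# Empty data frame with the intended column classes, as in the example above.
df <- data.frame(col1 = character(), col2 = numeric(), col3 = numeric(),
                 stringsAsFactors = FALSE)

# Row built from strings: col2 and col3 arrive as character, not numeric.
row_bad <- data.frame(col1 = "aaa", col2 = "123", col3 = "234",
                      stringsAsFactors = FALSE)

# Row with explicit conversion to the intended classes.
row_good <- data.frame(col1 = "aaa", col2 = as.numeric("123"),
                       col3 = as.numeric("234"), stringsAsFactors = FALSE)

class_bad <- sapply(rbind(df, row_bad), class)
class_good <- sapply(rbind(df, row_good), class)

print(class_bad)
print(class_good)
```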
This is not an answer to this question but it might be useful for someone. I had segfaults when I did pdf to create a PDF graphics device and then used plot. This happened with R 2.15.3, 3.2.4, and one or two other versions, running on Scientific Linux release 6.7. I tried many different things, but the only ways I could get it to work were (a) using png or tiff instead of pdf, or (b) saving large .RData files and then using a completely separate R program to create the graphics.
