R - Matchit - Propensity Score Matching - Discard function not working - r

I am using the MatchIt package on the LaLonde dataset, and the discard argument is generating two types of errors. (The code works if I do not use the discard argument.) In both cases, it is not clear how to resolve the problem.
The first issue occurs when I try discard = "hull.control":
m.opt1 <- matchit(treat ~ inc.re74 + inc.re75 + education + nonwhite +
age + nodegree, data = cps_controls, method = "optimal", ratio=1,
discard="hull.control")
This error message is produced:
Loading required namespace: WhatIf
Preprocessing data ...
Performing convex hull test ...
Error in mclapply(1:m, in_ch, mc.cores = mc.cores) :
'mc.cores' > 1 is not supported on Windows
The second issue occurs when I try discard = "control":
Error in d[i, ] <- abs(d1[i] - d0) :
number of items to replace is not a multiple of replacement length
Is there a way to address either of these? Thanks!!

Your issue appears to be a bug in the MatchIt package, as noted on SO here and here. I've submitted a ticket on GitHub.

Regarding the discard = "hull.control" issue:
Download the MatchIt source code from here and edit discard.R: add the argument mc.cores = 1 to the calls to WhatIf::whatif. This hard-codes the number of cores to 1, which should eliminate the issue.
Uninstall the MatchIt package and build the new one by opening a command line and typing R CMD build C:\path\to\MatchIt-master. This should create a .tar.gz file. In RStudio, click Tools -> Install Packages... and select the local package.
You may need to restart RStudio if the library was loaded previously.
Enjoy.
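To illustrate why forcing mc.cores = 1 avoids the error, here is a minimal sketch that calls parallel::mclapply (the function named in the error message) directly. The square function is only a placeholder workload and is an assumption of this example; it is not what WhatIf actually computes.

```r
library(parallel)

# Placeholder workload; WhatIf's real convex hull test is not reproduced here.
square <- function(i) i^2

# mc.cores = 1 runs serially and is accepted on every platform, including Windows.
res <- mclapply(1:3, square, mc.cores = 1)
print(unlist(res))

# On Windows, asking for more than one core is what triggers the error in the question.
if (.Platform$OS.type == "windows") {
  msg <- tryCatch(mclapply(1:3, square, mc.cores = 2),
                  error = function(e) conditionMessage(e))
  print(msg)
}
```

With a single core the call is just a serial loop, so the patched package should give the same results, only without parallel speed-up.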


STRINGdb r environment; error in plot_network

I'm trying to use STRINGdb in R and I'm getting the following error when I try to plot the network:
Error in if (grepl("The document has moved", res)) { : argument is
of length zero
code:
library(STRINGdb)
#(specify organism)
string_db <- STRINGdb$new( version="10", species=9606, score_threshold=0)
filt_mapped = string_db$map(filt, "GeneID", removeUnmappedRows = TRUE)
head(filt_mapped)
(i have columns titled: GeneID, logFC, FDR, STRING_id with 156 rows)
filt_mapped_hits = filt_mapped$STRING_id
head(filt_mapped_hits)
(156 observations)
string_db$plot_network(filt_mapped_hits, add_link = FALSE)
Error in if (grepl("The document has moved", res)) { : argument is
of length zero
You are using a version of Bioconductor, and by extension of the STRINGdb package, that is several years old.
If you update to the newest one, it will work. However, the updated package supports only the latest version of STRING (currently version 11), so the underlying network may change a bit.
The more detailed reason is this:
STRING's hardware infrastructure recently underwent major changes, which forced a different server setup.
All the old calls are now forwarded to a different URL; however, the cURL call, as it was implemented, does not follow our redirects, which breaks the STRINGdb package functionality.
We cannot update the old Bioconductor package, and our server setup can't really be changed.
That said, the fix for an old version is relatively simple.
In the STRINGdb library there is a script with all the methods, "rstring.r".
In it you'll find the "get_png" method. In that method, replace this line:
urlStr = paste("http://string-db.org/version_", version, "/api/image/network", sep="" )
With this line:
urlStr = paste("http://version", version, ".string-db.org/api/image/network", sep="" )
Load the library again and it should create the PNG, as before.
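To see what the change does, here is a small sketch that builds both URLs for the version used in the question. It only constructs the strings and makes no network request; the variable names old_url and new_url are mine, not the package's.

```r
version <- "10"  # same STRING version as in the question

# Old pattern: the version is a path component on the main host (now redirected).
old_url <- paste("http://string-db.org/version_", version, "/api/image/network", sep = "")

# New pattern: the version is part of the host name, so no redirect is involved.
new_url <- paste("http://version", version, ".string-db.org/api/image/network", sep = "")

print(old_url)
print(new_url)
```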

Index error when running maxnet function (maxnet package)

I use the maxnet function (maxnet package) as one of the model algorithms in an ensemble model. Sometimes the code executes without an error; other times it gives me the error message below. I am working on Windows 10 Pro (R version 3.6.1, RStudio version 1.2.5042).
Code:
dm.Maxent <- maxnet(p = train$species, data = train[-train$species],
maxnet.formula(p = train$species,
data = train[-train$species],
classes = "default"))
Error:
Error in intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) :
index larger than maximal 185
train is a dataframe with 621 rows (one row for every occurrence/absence point), and 29 columns (28 columns containing variables and 1 column "species" that indicates presence or absence of the species (0/1)).
I am having the same issue. It is unpredictable: for several species it ran fine, then all of a sudden it stopped.
I found a response at this link: https://github.com/jamiemkass/ENMeval/issues/62
In the new version of maxnet (check the GitHub repo, as it looks like the CRAN version has not been updated yet), there is a new argument "addsamplestobackground". When set to TRUE, it solves some of these errors. Currently, you will have to use install_github to reinstall maxnet to use this argument. Once you do, use install_github to get the dev branch version of ENMeval (v2), which will implement this by default. Hopefully that fixes these problems.
I reinstalled maxnet from GitHub:
install.packages("remotes")
remotes::install_github("mrmaxent/maxnet")
and set addsamplestobackground = TRUE. Maybe this will help you.

ViSEAGO tutorial: visualising topGO object

Earlier, I had posted a question and was able to load in my data successfully and create a topGO object after some help. I'm trying to visualise GO terms that are significantly associated with the list of differentially expressed genes that I have from mouse RNA-seq data.
Now I'd like to raise a concern about ViSEAGO's tutorial. The tutorial starts by loading two files, 'selection.txt' and 'background.txt', but the origin of these files is not clearly stated. After a lot of digging in topGO's documentation, I was able to work out the data type of each file. Even so, I have a problem running the following code. Does anyone have any insights to share?
WORKING CODE:
mysampleGOdata <- new("topGOdata",
description = "my Simple session",
ontology = "BP",
allGenes = geneList_new,
nodeSize = 1,
annot = annFUN.org,
mapping="org.Mm.eg.db",
ID = "SYMBOL")
resultFisher <- runTest(mysampleGOdata, algorithm = "classic", statistic = "fisher")
head(GenTable(mysampleGOdata,fisher=resultFisher),20)
myNewBP<-GenTable(mysampleGOdata,fisher=resultFisher)
PROBLEMS:
> head(myNewBP,2)
       GO.ID                      Term Annotated Significant Expected  fisher
1 GO:0006006 glucose metabolic process       194          12     0.19 1.0e-19
2 GO:0019318  hexose metabolic process       223          12     0.22 5.7e-19
> ###################
> # merge results
> myBP_sResults<-ViSEAGO::merge_enrich_terms(
+ Input=list(
+ condition=c("mysampleGOdata","resultFisher")
+ )
+ )
Error in setnames(x, value) :
Can't assign 3 names to a 2 column data.table
> myNewBP<-GenTable(mysampleGOdata,fisher=resultFisher)
> ###################
> # display the merged table
> ViSEAGO::show_table(myNewBP)
Error in ViSEAGO::show_table(myNewBP) :
object must be enrich_GO_terms, GO_SS, or GO_clusters class objects
According to the tutorial, the printed table contains for each enriched GO terms, additional columns including the list of significant genes and frequency (ratio of the number of significant genes to the number of background genes) evaluated by comparison. I think I have that, but it's definitely not working.
Can someone see why? I'm not very clear on this.
Thanks!
I think you are trying to work around an error you made at the beginning. You get the error because you did not use the wrapper function from the ViSEAGO package (ViSEAGO::create_topGOdata, used below). As you stated in your last question, you had initial problems formatting your data.
Here are some tips:
The "selection" file is a character vector with the names or IDs of your DEGs. I recommend using Entrez IDs.
The "background" file is a character vector of known genes. I recommend using Entrez IDs here as well. You can easily generate this character vector with the following (org.Hs.eg.db is the human annotation package; for mouse data, as in your question, use org.Mm.eg.db):
background=keys(org.Hs.eg.db, keytype ='ENTREZID')
With these two vectors, you can proceed to the next steps of the package as described in the vignette.
# connect to EntrezGene
EntrezGene<-ViSEAGO::EntrezGene2GO()
# load GO annotations from EntrezGene
# with the addition of GO annotations from orthologous genes (see the vignette)
# id = "9606" is Homo sapiens; for mouse the taxonomy ID is "10090"
myGENE2GO<-ViSEAGO::annotate(id="9606", EntrezGene)
BP<-ViSEAGO::create_topGOdata(
geneSel = selection, #your DEG vector
allGenes = background, #your created background vector
gene2GO=myGENE2GO,
ont="BP",
nodeSize=5
)
classic<-topGO::runTest(
BP,
algorithm ="classic",
statistic = "fisher"
)
# merge results
BP_sResults<-ViSEAGO::merge_enrich_terms(
Input=list(
condition=c("BP","classic")
)
)
You should get a merged list of your enriched GO terms with the corresponding statistical tests you prefer.
I faced this problem recently; it was very frustrating. In my case the whole issue seemed to be related to the package version I was using.
I used conda to install ViSEAGO. However, the R version in my conda environment was a bit old (3.6.1, to be specific), so installing ViSEAGO with conda gave me version 1.0.0 of the package. Note that the most recent version of ViSEAGO is 1.4.0.
I therefore created a conda environment with R version 4.0.3 and repeated the conda installation of ViSEAGO. This time version 1.4.0 was installed, and everything went fine.
I tried to backtrack the error and found only one thing: in the older ViSEAGO version, the function Custom2GO loaded tables with 4 columns; in the most recent version it accepts 5 columns (the new one being 'gene_symbol'). I think this mismatch might be part of the issue, as the source code of merge_enrich_terms seems to deal with the columns 'gene_id' and 'gene_symbol' at some point, but I'm not sure.
Hope you find my comment helpful!
Cheers,
Mauricio

How can I install and use (mice) function in R?

I want to use the mice function to handle the missing data in my data frame (data). I installed the package and loaded the library. However, when I try to apply the function to my data, it gives me the error below:
(Error in mice(data[, 5:9], m = 3, seed = 123) :
could not find function "mice")
I have a normal data frame that includes NAs
install.packages('mice')
library(mice)
library(VIM)
md.pattern(data)
md.pairs(data)
My_New_Data <- mice(data[,5:9], m=3, seed=123)
I am expecting the function to solve the problem and replace the NAs with reasonable values. It did not work at all!
Edit (incorporating comment suggestion)
In the comments it was suggested that I run mice::mice(data[, 5:9], m = 3, seed = 123). I ran this and the following error was returned:
Error in get(Info[i, 1], envir = env):
lazy-load database 'C:/Users/MUSTAFA KAMAL/Documents/R/win-library/3.5/broom/R/broom.rdb' is corrupt
In addition:
Warning message: In get(Info[i, 1], envir = env) : internal error -3 in R_decompress1
So that this question has an answer, I am rewriting my comment, which resolved the problem, as a short answer.
From the comments: executing mice::mice(data[, 5:9], m = 3, seed = 123) resulted in an error message showing that the file ~/Documents/R/win-library/3.5/broom/R/broom.rdb is corrupt.
From the path of the corrupt file, one can see that OP was running R-3.5.x, while the newest version is R-3.6.x. Some packages updated since the most recent R release have run into similar problems, so a first step towards solving this type of issue is to update R. The installr package contains the function updateR, which can help smooth over such updates while also updating any outdated packages.
As a side note, an update sometimes fails to update the actual packages or leaves other packages corrupted. If the error persists, one solution is simply to delete and re-install the package (or the entire ~/Documents/R/win-library/3.z/ directory). In OP's question the corrupt package is broom, so one could re-install it by running
remove.packages("broom")
install.packages("broom")
which should resolve any leftover issues. Note, however, that multiple packages might be corrupt, and likely only one will be reported each time the function is executed. In that case clearing the whole package library will do the trick, but it requires re-installing all packages. To make this easier, one can export the list of installed packages before removing them: the full list is returned by installed.packages(), and it can be written to a file with, for example, write.table or write.csv.
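A minimal sketch of exporting the package list, assuming you only need package names and versions. The file name installed_packages.csv and the use of tempdir() are placeholders of mine; write to any location that survives the clean-up.

```r
# Collect the name and version of every installed package.
ip <- installed.packages()
pkgs <- data.frame(
  Package = unname(ip[, "Package"]),
  Version = unname(ip[, "Version"]),
  stringsAsFactors = FALSE
)

# Hypothetical output location; pick any path you like.
out_file <- file.path(tempdir(), "installed_packages.csv")
write.csv(pkgs, out_file, row.names = FALSE)

# After clearing the library, read the list back and see what is missing.
saved <- read.csv(out_file, stringsAsFactors = FALSE)
missing_pkgs <- setdiff(saved$Package, rownames(installed.packages()))
# install.packages(missing_pkgs)  # uncomment to re-install from CRAN
```

Packages that were installed from GitHub or Bioconductor would not be restored by a plain install.packages call, so keep a note of those separately.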

Error message in lme4::glmer: " 'what' must be a character string or a function"

I am running a multi-level model. I use the following commands with validatedRS6 as the outcome, random as the predictor and clustno as the random effects variable.
new<-as.data.frame(read.delim("BABEX.dat", header=TRUE))
install.packages("lme4")
library(lme4)
model1<- glmer(validatedRS6 ~ random + (1|clustno), data=new, family=binomial("logit"), nAGQ = 1L)
However, I get the following error
Error in do.call(new, c(list(Class = "glmResp", family = family), ll[setdiff(names(ll), :
'what' must be a character string or a function
I have absolutely no idea what has gone wrong and have searched the internet. I am sorry but I cannot provide the data as it is from an intervention which has yet to be published.
(expanded from comment).
Congratulations, you found a bug in lme4! This is fixed now:
https://github.com/lme4/lme4/commit/9c12f002821f9567d5454e2ce3b78076dabffb54
It is caused by having a variable called new in the global environment (deep in the guts of the code, lme4 uses do.call(new,...) and finds your variable new rather than the built-in function new).
You can install a patched version from GitHub using devtools::install_github() (but you'll need compilation tools etc.). Alternatively, there is a very simple workaround: just call your variable anything other than new. You can't just copy it (i.e. new2 <- new); you also have to make sure the old version is removed (rm("new")).
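The lookup problem can be reproduced without lme4. In this minimal sketch, make_numeric is a hypothetical stand-in for the internal code (it is not lme4's actual function). When a name is called directly, as in new(...), R skips objects that are not functions. In do.call(new, ...), however, new is passed as an ordinary value, so the first object of that name found wins.

```r
library(methods)

# Hypothetical stand-in for package-internal code that passes `new` as a value.
make_numeric <- function() do.call(new, list("numeric"))

# A user variable called `new` now shadows the function from the methods package.
new <- data.frame(x = 1)
msg <- tryCatch({
  make_numeric()
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)  # the "'what' must be ..." error from do.call

# Removing the variable makes the lookup reach the real function again.
rm("new")
ok <- make_numeric()
print(ok)
```

The exact wording of the error message differs slightly between R versions, but in each case it is do.call complaining that its first argument is a data frame and not a function.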
