Multiple plots in repeated question yields "Error in exm[[dups[j]]] : subscript out of bounds" - r-exams

When creating worksheets with exams2pdf() from R/exams, I like to repeat an exercise file multiple times to yield different numbers. However, when I include two plots in an exercise (e.g., one in the question and one in the solution) this yields:
Error in exm[[dups[j]]] : subscript out of bounds
A reproducible example is included below.
It works with one plot, and it works if I don't repeat the question. Also, the problem can be avoided by making multiple copies of simple.Rmd (say simple1.Rmd and simple2.Rmd with different chunk names in each copy), but it seems there should be a better way.
The Rmd file: simple.Rmd
Question
========
A question.
```{r drawit}
x = (-330):330/100
y = dnorm(x)
plot(x,y)
```
Solution
========
Let's redraw...
```{r drawagain}
x2 = (-330):330/100+100
y2 = dnorm(x2,mean=100,sd=1)
plot(x2,y2)
```
Meta-information
============
extype: num
exsolution: 10
exname: calc
And the replication R code:
library("exams")
q1 = "simple.Rmd"
probs = c(q1,q1)
exams2pdf(probs)
The Rmd file will knit fine (with two plots) but running the code above yields the above mentioned
Error in exm[[dups[j]]] : subscript out of bounds

Thanks for reporting this. It is a bug in exams2pdf()! The case of a single duplicated supplement name was already handled correctly, but the case of multiple duplicated supplement names was not. I've just committed a fix to the repository on R-Forge that addresses the issue.
It would be great if you could install the development version of the package from R-Forge to test whether the fix also works correctly on your real use-case. You can install from within R via:
install.packages("exams", repos="http://R-Forge.R-project.org")
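Until the fixed version is installed, the workaround described in the question (separate copies with distinct chunk names) can be scripted rather than done by hand. The following is only a sketch in base R: make_unique_copy() is a hypothetical helper, not part of R/exams, and it assumes chunk headers of the form {r label} or {r label, ...} with labels that contain no spaces.

```r
# Hypothetical helper (not part of R/exams): copy an exercise and append
# a numeric suffix to every named chunk label so that labels differ per copy.
make_unique_copy <- function(file, i, dir = dirname(file)) {
  lines <- readLines(file)
  # Match "```{r label" only when the label is followed by "," or "}",
  # so unnamed chunks such as "```{r echo=FALSE}" are left untouched.
  pattern <- "^```\\{r ([A-Za-z0-9_.-]+)([,}])"
  lines <- sub(pattern, paste0("```{r \\1_", i, "\\2"), lines)
  out <- file.path(dir, sub("\\.Rmd$", paste0(i, ".Rmd"), basename(file)))
  writeLines(lines, out)
  out
}

# Small demonstration on a throwaway file in the temporary directory
src <- file.path(tempdir(), "simple.Rmd")
writeLines(c("```{r drawit}", "plot(1:10)", "```",
             "```{r drawagain, echo=FALSE}", "plot(10:1)", "```"), src)
copies <- vapply(1:2, function(i) make_unique_copy(src, i), character(1))
print(basename(copies))
print(readLines(copies[2])[c(1, 4)])
```

With a real exercise in the working directory, the copies could then be used in place of the repeated file, e.g. exams2pdf(basename(copies)).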

Related

Unexplained R failure: Error: attempt to use zero-length variable name

I am trying to specify the size of a plot dynamically in RMarkdown. Because it is a facetted plot with a dynamic number of plots, the aspect ratio needs to change.
I'm using this code:
...
jobs_levels <- 9
```
#### Figure 3.1.1. Effect of Individual Interventions in Living Rooms
```{r figure_3_3_1, fig.height=jobs_levels}
jobs_levels
print(figure_3_3_1)
```
For the sake of simplicity, I've hard-coded jobs_levels (the number of levels in the facetting factor). As you can see, I set the value very clearly, and then use it in the very next code chunk. I can see the value clear as day in the environment. It is 9. But I get this:
Error in eval(ele) : object 'jobs_levels' not found
> ```{r figure_3_3_1, fig.height=jobs_levels}
Error: attempt to use zero-length variable name
When I run this in batch mode, it crashes. Any idea what is going on with this?
I even put in the extra lines:
jobs_levels
to debug. Each time I run one of those lines with Ctrl+Enter, it evaluates correctly, but I see the error message again too...
The error message makes it look as though you are trying to evaluate the chunk header as R code. You had
```{r figure_3_3_1, fig.height=jobs_levels}
The first two backticks would be parsed as a zero-length variable name, as the error message states.
You can't run R Markdown code as R code. You need to use rmarkdown::render or a related function to run it.
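The parse failure described above can be reproduced in plain R, without any R Markdown tooling. This is a small sketch; the file name in the final comment is a placeholder, not a file from the question.

```r
# Feed the first two backticks of a chunk header to the R parser:
# an empty backtick-quoted name is what triggers the error.
header_start <- "``"
msg <- tryCatch(
  {
    parse(text = header_start)
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# The document itself has to be processed as R Markdown instead, e.g.
# rmarkdown::render("report.Rmd")   # "report.Rmd" is a placeholder name
```

When the document is rendered this way, knitr evaluates the chunk options, so a variable defined in an earlier chunk is available to fig.height.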

ggplot2 error: Error in default + theme : non-numeric argument to binary operator

I have been getting the error Error in default + theme : non-numeric argument to binary operator. I have been using R and teaching R for a long time but I can't find this problem. I have included a reproducible example that fails this way below:
library(tibble)
library(ggplot2)
brains <- as_tibble(brains)
brains <- brains[1:10, ]
brains
ggplot(brains, aes(x = BodyWt, y = BrainWt)) +
geom_point()
The error occurs when executing the ggplot() statement.
My hardware is an HP Laptop 15-ef0xxx. I am running Windows 10 Home version 2004. I am running RStudio community edition "Water Lily" and R x64 4.0.2.
I know this is a simple error and it is driving me crazy.
So I finally solved this problem. On the github issue I had opened Hiroaki commented that "One possibility is that you might set a invalid default theme in your .Rprofile, but I'm not sure..." (see link to issue below).
I'm not sure if your R file is part of a project but mine is.
So I went back and deleted the theme_set() line in my R file, went in and double checked all my project options and selected the option "Disable .Rprofile execution on session start/resume" and "Quit child processes on exit". And then I restarted the R session and now everything works. Including on the default R editor console.
I'm not sure if all those steps are necessary but that seemed to do the trick for me! Hope it helps.
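One way an invalid default theme could produce this message, offered as an assumption rather than a confirmed diagnosis of the case above: if the stored default is not a theme object (for instance a theme function saved without calling it), base R's `+` rejects it. The sketch below shows that base R behaviour with a stand-in function, then, only if ggplot2 is installed, inspects the current default and resets it.

```r
# Base R only: "+" applied to a function falls through to the default
# arithmetic method, which rejects non-numeric operands.
not_a_theme <- function() NULL   # stand-in for a theme function passed without "()"
msg <- tryCatch(not_a_theme + list(), error = function(e) conditionMessage(e))
print(msg)

# Optional check of the real default theme, skipped if ggplot2 is missing
if (requireNamespace("ggplot2", quietly = TRUE)) {
  current <- ggplot2::theme_get()
  print(class(current))                           # inspect what is stored as default
  old <- ggplot2::theme_set(ggplot2::theme_gray()) # reset to the stock theme
}
```

If the reset makes plotting work again, a theme_set() call in a script or .Rprofile is a likely place to look.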
I thought this was an RStudio issue but it seems it's possibly a ggplot2 problem. I have verified using two different datasets that the same error comes up when I try using ggplot2 in RStudio or using the default R console. I get the exact same error with code that's been working fine but now suddenly won't. I have opened an issue on Github (ggplot2) with a reprex. Might be worth checking there: https://github.com/tidyverse/ggplot2/issues/4177
I know this is not an answer per se but I don't have enough reputation points to add a comment to the previous answer but I thought linking to the issue on Github might help.
I think you have numbers quoted somewhere and you are trying to perform mathematical operation on character values. Consider
x <- c("5","6")
y <- x/1
Error in x/1 : non-numeric argument to binary operator
Now try converting x to numeric and perform the same operation.
y <- as.numeric(x)/1
> y
[1] 5 6
So, you need to use as.numeric on your variable.
The following should resolve this issue:
ggplot(brains, aes(x = as.numeric(BodyWt), y = as.numeric(BrainWt)))

Problem with round() function in .Rmd exercise file

I have a problem where I create a .Rmd file for an exercise and I include a large number together with the round() function. Here is a minimal example:
```{r data generation, echo = FALSE, results = "hide"}
Value = 12000.555
```
Question
========
temp
Meta-information
================
exname: temp
extype: num
exsolution: `r round(Value, 2)`
extol: 0.01
When I compile this exercise into an exam using exams2pdf(), I get the following warning:
exams2pdf("example.Rmd")
## Warning message: In read_metainfo(file) : NAs introduced by coercion
Why is that? I'm using R/exams version 2.3-6, and R version 3.6.3.
TL;DR: Use fmt(Value, 2) instead of round(Value, 2). This avoids problems with scientific notation (and uses rounding away from zero). See ?fmt for more details.
The reason for the warning is actually not the round() function per se, but the fact that R, depending on the scipen option, switches to scientific notation for some numbers (the factory-fresh default in R is scipen = 0). Furthermore, the knitr package (employed in the background by R/exams) tries to format this scientific notation nicely. So instead of 12000.56 the knit() function includes 1.200056 × 10<sup>4</sup> in the Markdown file. You can see this when you run xweave("example.Rmd") and then inspect the resulting example.md file. The subsequent processing of the exsolution tag then fails to convert this back to a number, hence the warning.
To avoid this you could increase the scipen limit within the R code of the exercise, e.g., options(scipen = 999). But this is very technical and tedious. This is one of the reasons why we have written the fmt(...) function that carries out various convenience tasks that have to do with formatting of numbers within R/exams exercises.
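The coercion problem can be seen in base R alone. The sketch below is illustrative only: it hard-codes the formatted string quoted above instead of running knitr, and it uses format() as a stand-in for fixed notation. Within an actual exercise, fmt() remains the route recommended here.

```r
Value <- 12000.555

# The text knitr writes into the Markdown file (copied from the explanation
# above; "\u00d7" is the multiplication sign).
sci_text <- "1.200056 \u00d7 10<sup>4</sup>"

# Converting that text back to a number fails and yields NA,
# which is what the "NAs introduced by coercion" warning reports.
bad <- suppressWarnings(as.numeric(sci_text))
print(bad)

# Fixed notation survives the round trip from text to number.
plain <- format(round(Value, 2), nsmall = 2, scientific = FALSE)
good <- as.numeric(plain)
print(plain)
```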

ViSEAGO tutorial: visualising topGO object

Earlier, I had posted a question and was able to load in my data successfully and create a topGO object after some help. I'm trying to visualise GO terms that are significantly associated with the list of differentially expressed genes that I have from mouse RNA-seq data.
Now, I'd want to raise a concern about ViSEAGO's tutorial. The tutorial initially specifies loading two files: 'selection.txt' and 'background.txt'. The origin of these files is not clearly stated. However, after a lot of digging into topGO's documentation, I was able to find the datatypes for each of the files. But, even after following these, I have a problem running the following code. Does anyone have any insights to share?
WORKING CODE:
mysampleGOdata <- new("topGOdata",
description = "my Simple session",
ontology = "BP",
allGenes = geneList_new,
nodeSize = 1,
annot = annFUN.org,
mapping="org.Mm.eg.db",
ID = "SYMBOL")
resultFisher <- runTest(mysampleGOdata, algorithm = "classic", statistic = "fisher")
head(GenTable(mysampleGOdata,fisher=resultFisher),20)
myNewBP<-GenTable(mysampleGOdata,fisher=resultFisher)
PROBLEMS:
> head(myNewBP,2)
GO.ID Term Annotated Significant Expected fisher
1 GO:0006006 glucose metabolic process 194 12 0.19 1.0e-19
2 GO:0019318 hexose metabolic process 223 12 0.22 5.7e-19
> ###################
> # merge results
> myBP_sResults<-ViSEAGO::merge_enrich_terms(
+ Input=list(
+ condition=c("mysampleGOdata","resultFisher")
+ )
+ )
Error in setnames(x, value) :
Can't assign 3 names to a 2 column data.table
> myNewBP<-GenTable(mysampleGOdata,fisher=resultFisher)
> ###################
> # display the merged table
> ViSEAGO::show_table(myNewBP)
Error in ViSEAGO::show_table(myNewBP) :
object must be enrich_GO_terms, GO_SS, or GO_clusters class objects
According to the tutorial, the printed table contains for each enriched GO terms, additional columns including the list of significant genes and frequency (ratio of the number of significant genes to the number of background genes) evaluated by comparison. I think I have that, but it's definitely not working.
Can someone see why? I'm not very clear on this.
Thanks!
I think you are trying to work around an error you made at the beginning. You get the error because you did not use the wrapper function from the ViSEAGO package. As you stated in your last question, you initially had problems formatting your data.
Here are some tips:
The "selection" file is a character vector with the names or IDs of your DEGs. I recommend using Entrez IDs.
The "background" file is a character vector with known genes. I recommend using Entrez IDs as well. You can easily generate this character vector with:
background=keys(org.Hs.eg.db, keytype ='ENTREZID')
With these two files, you can easily proceed to the next steps of the package as described in the vignette.
# connect to EntrezGene
EntrezGene<-ViSEAGO::EntrezGene2GO()
# load GO annotations from EntrezGene
# with the add of GO annotations from orthologs genes (see above)
#id = "9606" = homo sapiens
myGENE2GO<-ViSEAGO::annotate(id="9606", EntrezGene)
BP<-ViSEAGO::create_topGOdata(
geneSel = selection, #your DEG vector
allGenes = background, #your created background vector
gene2GO=myGENE2GO,
ont="BP",
nodeSize=5
)
classic<-topGO::runTest(
BP,
algorithm ="classic",
statistic = "fisher"
)
# merge results
BP_sResults<-ViSEAGO::merge_enrich_terms(
Input=list(
condition=c("BP","classic")
)
)
You should get a merged list of your enriched GO terms with the corresponding statistical tests you prefer.
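As a rough illustration of the two inputs described above, the sketch below writes and reads them as plain character vectors, one ID per line. The IDs are made-up placeholders rather than real Entrez IDs, and the file layout (one ID per line, read with scan()) is an assumption about what the tutorial expects, not something confirmed here.

```r
# Placeholder IDs only; a real analysis would use actual Entrez IDs
background <- as.character(1001:1010)    # all known genes (hypothetical)
selection  <- c("1002", "1005", "1009")  # differentially expressed genes (hypothetical)

bg_file  <- file.path(tempdir(), "background.txt")
sel_file <- file.path(tempdir(), "selection.txt")
writeLines(background, bg_file)
writeLines(selection, sel_file)

# Read both files back as character vectors (what = "" keeps IDs as text)
background <- scan(bg_file, what = "", quiet = TRUE)
selection  <- scan(sel_file, what = "", quiet = TRUE)

# Sanity check: every selected gene should also be in the background
missing_ids <- setdiff(selection, background)
print(length(missing_ids))
```

Checking that the selection is a subset of the background before calling create_topGOdata() can catch ID-type mismatches (for example symbols mixed with Entrez IDs) early.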
I have faced this problem recently, it was very frustrating. In my case the whole issue seemed to be related to the package version I was using.
I used conda to install ViSEAGO. Nevertheless, R's version in my conda environment was a bit old (i.e. 3.6.1 to be specific). Therefore, when installing ViSEAGO with conda, the version 1.0.0 of the package was installed. Please note that the most recent version of ViSEAGO is 1.4.0.
Therefore, I created a conda environment with R version 4.0.3, and repeated the procedure to install ViSEAGO by using conda. When doing this, ViSEAGO's 1.4.0 version was installed, and everything went fine.
I've tried to trace the error back and only found one thing: in the older ViSEAGO version, the function Custom2GO loaded tables with 4 columns; in the most recent version it accepts 5 columns (the new one being 'gene_symbol'). I think this mismatch might be part of the issue, as the source code of the function merge_enrich_terms seems to deal with the columns 'gene_id' and 'gene_symbol' at some point, but I'm not sure.
Hope you find my comment helpful!
Cheers,
Mauricio

How to save Variant Call Format (VCF) file to disk in R using VariantAnnotation Package

I've searched the web for this without much luck. More or less you always get to the example from the VariantAnnotation Package. And since this example works fine on my computer I have no idea why the VCF I created does not.
The problem: I want to determine the number and location of SNPs in selected genes. I have a large VCF file (over 5GB) that has info on all SNPs on all chromosomes for several mice strains. Obviously my computer freezes if I try to do anything on the whole genome scale, so I first determined genomic locations of genes of interest on chromosome 1. I then used the VariantAnnotation Package to get only the data relating to my genes of interest out of the VCF file:
library(VariantAnnotation)
param<-ScanVcfParam(
info=c("AC1","AF1","DP","DP4","INDEL","MDV","MQ","MSD","PV0","PV1","PV2","PV3","PV4","QD"),
geno=c("DP","GL","GQ","GT","PL","SP","FI"),
samples=strain,
fixed="FILTER",
which=gnrng
)
The code above is taken out of a function I wrote which takes strain as an argument. gnrng refers to a GRanges object containing genomic locations of my genes of interest.
vcf<-readVcf(file, "mm10",param)
This works fine and I get my vcf (dim: 21783 1), but when I try to save it, it fails:
file.vcf<-tempfile()
writeVcf(vcf, file.vcf)
Error in .pasteCollapse(ALT, ",") : 'x' must be a CharacterList
I even tried in parallel, doing the example from the package first and then substituting for my VCF file:
#This is the example:
out1.vcf<-tempfile()
in1<-readVcf(fl,"hg19")
writeVcf(in1,out1.vcf)
This works just fine, but if I only substitute in1 for my vcf I get the same error.
I hope I made myself clear... And any help will be greatly appreciated!! Thanks in advance!
Thanks for reporting this bug. The problem is fixed in version 1.9.47 (devel branch). The fix will be available in the release branch after April 14.
The problem was that you selectively imported 'FILTER' from the 'fixed' field but not 'ALT'. writeVcf() was throwing an error because there was no ALT value to write out. If you don't have access to the version with the fix, a workaround would be to import the ALT field as well:
ScanVcfParam(fixed = c("ALT", "FILTER"))
You can see what values were imported with the fixed() accessor:
fixed(vcf)
Please report any bugs or problems on the Bioconductor mailing list Martin referenced. More Bioc users will see the question and you'll get help more quickly.
Valerie
Here's a reproducible example
library(VariantAnnotation)
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
param <- ScanVcfParam(fixed="FILTER")
writeVcf(readVcf(fl, "hg19", param=param), tempfile())
## Error in .pasteCollapse(ALT, ",") : 'x' must be a CharacterList
The problem seems to be that writeVcf expects the object to have an 'ALT' field, so
param <- ScanVcfParam(fixed="ALT")
writeVcf(readVcf(fl, "hg19", param=param), tempfile())
succeeds.
