ClusterLongData kml package export to scv - r

I am clustering time series in R using package KmL. I have read both manual and paper how to use this package, but Im not very clear how to export the results (data frame, where each trajectories are assign to some clusters e.g.
trajectory (i), time1, time2, time3, clustername)
I have read several answers here Output from 'choice' in R's kml
but if I do the same (run choice(myCld, typeGraph= "bmp")) my R says:
~ Choice : menu ~ 'Arrow' : change partition 'Space' : select/unselect a partition ... etc. e : change the display (both)
~ 'Return' when its done ~
And only thing that is saved to my library is myCld.Rdata and it runs for very long time without any more results. (my dataset: N trajectories, with time= 1:53) I want to have csv. files as manual suggest (objectName-Cx-y-Clusters.csv)
I am also not very clear WHERE should I PRESS on "Return" or "Arrow"? There is no option to press on anything in my Rstudio workplace.
I am really a beginner with R so any help would be appreciated. Thanks!

I am not sure that kml is compatible with Rstudio. The older version of Rstudio did not handle instruction like getGraphicsEvent that was used by choice.

Related

ViSEAGO tutorial: visualising topGO object

Earlier, I had posted a question and was able to load in my data successfully and create a topGO object after some help. I'm trying to visualise GO terms that are significantly associated with the list of differentially expressed genes that I have from mouse RNA-seq data.
Now, I'd want to raise a concern about ViSEAGO's tutorial. The tutorial initially specifies loading two files: 'selection.txt' and 'background.txt'. The origin of these files is not clearly stated. However, after a lot of digging into topGO's documentation, I was able to find the datatypes for each of the files. But, even after following these, I have a problem running the following code. Does anyone have any insights to share?
WORKING CODE:
mysampleGOdata <- new("topGOdata",
description = "my Simple session",
ontology = "BP",
allGenes = geneList_new,
nodeSize = 1,
annot = annFUN.org,
mapping="org.Mm.eg.db",
ID = "SYMBOL")
resultFisher <- runTest(mysampleGOdata, algorithm = "classic", statistic = "fisher")
head(GenTable(mysampleGOdata,fisher=resultFisher),20)
myNewBP<-GenTable(mysampleGOdata,fisher=resultFisher)
PROBLEMS:
> head(myNewBP,2)
GO.ID Term Annotated Significant Expected fisher
1 GO:0006006 glucose metabolic process 194 12 0.19 1.0e-19
2 GO:0019318 hexose metabolic process 223 12 0.22 5.7e-19
> ###################
> # merge results
> myBP_sResults<-ViSEAGO::merge_enrich_terms(
+ Input=list(
+ condition=c("mysampleGOdata","resultFisher")
+ )
+ )
Error in setnames(x, value) :
Can't assign 3 names to a 2 column data.table
> myNewBP<-GenTable(mysampleGOdata,fisher=resultFisher)
> ###################
> # display the merged table
> ViSEAGO::show_table(myNewBP)
Error in ViSEAGO::show_table(myNewBP) :
object must be enrich_GO_terms, GO_SS, or GO_clusters class objects
According to the tutorial, the printed table contains for each enriched GO terms, additional columns including the list of significant genes and frequency (ratio of the number of significant genes to the number of background genes) evaluated by comparison. I think I have that, but it's definitely not working.
Can someone see why? I'm not very clear on this.
Thanks!
I think you try to circumvent an error you made at the beginning. You receive the error due to the fact that you did not use the wrapper function from the ViSEAGO package. As you stated in your last question, you had initial problems formatting your data.
Here are some tips:
The "selection" file is a character vector with your DEGs names or IDs. I recommend using EntrezID's.
The "Background" file is a character vector with known genes. I recommend using EntrezID's as well. You can easily generate this character vector with:
background=keys(org.Hs.eg.db, keytype ='ENTREZID').
With these two files, you can easily proceed to the next steps of the package as described in the vignette.
# connect to EntrezGene
EntrezGene<-ViSEAGO::EntrezGene2GO()
# load GO annotations from EntrezGene
# with the add of GO annotations from orthologs genes (see above)
#id = "9606" = homo sapiens
myGENE2GO<-ViSEAGO::annotate(id="9606", EntrezGene)
BP<-ViSEAGO::create_topGOdata(
geneSel = selection, #your DEG vector
allGenes = background, #your created background vector
gene2GO=myGENE2GO,
ont="BP",
nodeSize=5
)
classic<-topGO::runTest(
BP,
algorithm ="classic",
statistic = "fisher"
)
# merge results
BP_sResults<-ViSEAGO::merge_enrich_terms(
Input=list(
condition=c("BP","classic")
)
)
You should get a merged list of your enriched GO terms with the corresponding statistical tests you prefer.
I have faced this problem recently, it was very frustrating. In my case the whole issue seemed to be related to the package version I was using.
I used conda to install ViSEAGO. Nevertheless, R's version in my conda environment was a bit old (i.e. 3.6.1 to be specific). Therefore, when installing ViSEAGO with conda, the version 1.0.0 of the package was installed. Please note that the most recent version of ViSEAGO is 1.4.0.
Therefore, I created a conda environment with R version 4.0.3, and repeated the procedure to install ViSEAGO by using conda. When doing this, ViSEAGO's 1.4.0 version was installed, and everything went fine.
I've tried to backtrack the error, and only find one thing: in the older ViSEAGO version, the function Custom2GO loaded tables with 4 columns; in the most recent version it admits 5 columns (the new one being 'gene_symbol'). I think this disagreement might be part of the issue, as the source code of the function merge_enrich_terms seems to deal with the columns 'gene_id' and 'gene_symbol' at some point, but I'm not sure.
Hope you find my comment helpful!
Cheers,
Mauricio

Walsh-Hadamard Transform in r

I search for a command to compute Walsh-Hadamard Transform of an image in R, but I don't find anything. In MATLAB fwht use for this. this command implement Walsh-Hadamard Tranform to each row of matrix. Can anyone introduce a similar way to compute Walsh-Hadamard on rows or columns of Matrix in R?
I find a package here:
http://www2.uaem.mx/r-mirror/web/packages/boolfun/boolfun.pdf
But why this package is not available when I want to install it?
Packages that are not maintained get put in the Archive. They get put there when that aren't updated to match changing requirements or start making errors with changing R code base. https://cran.r-project.org/web/packages/boolfun/index.html
It's possible that you might be able to extract useful code from the archive version, despite the relatively ancient version of R that package was written under.
The R code for walshTransform calls an object code routine:
walshTransform <- function ( truthTable ) # /!\ should check truthTable values are in {0,1}
{
len <- log(length(truthTable),base=2)
if( len != round(len) )
stop("bad truth table length")
res <- .Call( "walshTransform",
as.integer(truthTable),
as.integer(len))
res
}
Installing the package succeeded on my Mac, but would require the appropriate toolchain on whatever OS you are working in.

I cannot load data from R package from R-Forge

I am trying to figure out the 'factorAnalytics' package from R-Forge. Everything seems to load fine, but when i try to walk through an example, I cannot load the data.
install.packages("factorAnalytics", repos="http://R-Forge.R-project.org")
This works fine and everything is installed. I now try to walk through the example in the vignette:
# Load fundamental and return data
data(Stock.df)
# fit a fundamental factor model
exposure.vars <- c("BOOK2MARKET", "LOG.MARKETCAP")
fit <- fitFfm(data=stock, asset.var="TICKER", ret.var="RETURN",
date.var="DATE", exposure.vars=exposure.vars)
names(fit)
Which provides this:
Warning message:
In data(Stock.df) : data set ‘Stock.df’ not found
But if I run this:
data(package = "factorAnalytics")
I see that the dataset should be there...
Data sets in package ‘factorAnalytics’:
factors.M (CommonFactors) Factor set of several commonly used factors
factors.Q (CommonFactors) Factor set of several commonly used factors
managers Hypothetical Alternative Asset Manager and Benchmark Data
r.M (StockReturns) Stock Return Data
r.W (StockReturns) Stock Return Data
stock (Stock.df) Fundamental and return data for 447 NYSE stocks
tr.yields (TreasuryYields) Treasury yields at different maturities
So what am I missing??
I am an idiot. I did not include the library after installing the package. Sorry all.
I just ran this:
library("factorAnalytics")
You can try like this:
load("E:\\R-3.3.1\\library\\factorAnalytics\\data\\Stock.df.RData")
You should find the path where "Stock.df.RData" is.

How to save Variant Call Format (VCF) file to disk in R using VariantAnnotation Package

I've searched the web for this without much luck. More or less you always get to the example from the VariantAnnotation Package. And since this example works fine on my computer I have no idea why the VCF I created does not.
The problem: I want to determine the number and location of SNPs in selected genes. I have a large VCF file (over 5GB) that has info on all SNPs on all chromosomes for several mice strains. Obviously my computer freezes if I try to do anything on the whole genome scale, so I first determined genomic locations of genes of interest on chromosome 1. I then used the VariantAnnotation Package to get only the data relating to my genes of interest out of the VCF file:
library(VariantAnnotation)
param<-ScanVcfParam(
info=c("AC1","AF1","DP","DP4","INDEL","MDV","MQ","MSD","PV0","PV1","PV2","PV3","PV4","QD"),
geno=c("DP","GL","GQ","GT","PL","SP","FI"),
samples=strain,
fixed="FILTER",
which=gnrng
)
The code above is taken out of a function I wrote which takes strain as an argument. gnrng refers to a GRanges object containing genomic locations of my genes of interest.
vcf<-readVcf(file, "mm10",param)
This works fine and I get my vcf (dim: 21783 1) but when I try to save it won't work
file.vcf<-tempfile()
writeVcf(vcf, file.vcf)
Error in .pasteCollapse(ALT, ",") : 'x' must be a CharacterList
I even tried in parallel, doing the example from the package first and then substituting for my VCF file:
#This is the example:
out1.vcf<-tempfile()
in1<-readVcf(fl,"hg19")
writeVcf(in1,out1.vcf)
This works just fine, but if I only substitute in1 for my vcf I get the same error.
I hope I made myself clear... And any help will be greatly appreciated!! Thanks in advance!
Thanks for reporting this bug. The problem is fixed in version 1.9.47 (devel branch). The fix will be available in the release branch after April 14.
The problem was that you selectively imported 'FILTER' from the 'fixed' field but not 'ALT'. writeVcf() was throwing an error because there was no ALT value to write out. If you don't have access to the version with the fix, a work around would be to import the ALT field.
ScanVcfParam(fixed = c("ALT", "FILTER"))
You can see what values were imorted with the fixed() accessor:
fixed(vcf)
Please report and bugs or problems on the Bioconductor mailing list Martin referenced. More Bioc users will see the question and you'll get help more quickly.
Valerie
Here's a reproducible example
library(VariantAnnotation)
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
param <- ScanVcfParam(fixed="FILTER")
writeVcf(readVcf(fl, "hg19", param=param), tempfile())
## Error in .pasteCollapse(ALT, ",") : 'x' must be a CharacterList
The problem seems to be that writeVcf expects the object to have an 'ALT' field, so
param <- ScanVcfParam(fixed="ALT")
writeVcf(readVcf(fl, "hg19", param=param), tempfile())
succeeds.

Save package settings between sessions

Is there a definitive way to save options or information pertaining to a certain package between sessions?
For example say somebody made a game and released it as an R package. If they wanted to save high scores and not have them reset each time R started a new session what would be the best way to do this? Currently I can only think of storing a file in the users home directory but I'm not sure if I like that approach.
This may be an approach. I created a dummy package with a dummy function (any function I create is bound to be a dummy function) and a data set I called scores that I set as follows:
scores <- NA
Then I created the package with the scores data set.
Then I used the following to change the data set from within R.
loc <- paste0(find.package("new"), "/Data")
unlink(paste0(loc, "/scores.rda"), recursive = TRUE, force = FALSE)
scores <- 10
save(scores, file=paste0(loc, "/scores.rda"))
Then when I unloaded the library and re loaded agin the data set now says:
> scores
[1] 10
Could this be modified to do what you want? You'd have to have it save in between somehow but am not sure on how to do this without messing with .Last function.
EDIT:
It appears this option is not viable in that when you compile as a package and use lazy load it saves the data sets as:
RData.rbd, RData.rbx, not as .rda files. That means the approach I use above is kinda worthless in that we want it to automatically be recognized.
EDIT2
This approach works and I tried it on a package I made. You can't do lazy load of the data and you have to either explicitly use data(scores) or use data(scores) inside of the function you're calling. I also assigned scores to .scores int he global.env the first time it was created and used exists inside the function to see if it exists. If `.scores. existed I assigned that to scores within the function. Once you unload the library and laod again you never have to worry about that again.
Maybe an alternative is to save this as a function somehow that can be altered using Josh's advice here: Permanently replacing a function
I guess there is no way to store settings without saving them to disk or a database, some way or another. It can be done silently though by putting the code below in your ~/.Rprofile. However, if you have packages that save settings in other ways than using options you need to add them manually.
I know this is exactly what you said you did not want, but it might spark some debate at least.
.Last <- function(){
my.options <- options()
save(my.options, file="~/.Roptions.Rdata")
}
.First <- function(){
tryCatch({
load("~/.Roptions.Rdata")
do.call(options, my.options)
rm(my.options)
}, error=function(...){})
}
To my suprise try(..., silent=TRUE) gives a warning on startup if ~/.Roptions.Rdata does not exist, which is why I used tryCatch instead.
The modern answer to this problem is well explained at https://blog.r-hub.io/2020/03/12/user-preferences/
I think I will be trying the hoardr package! Here is an example that worked for me :)
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$mkdir()
scores<-data.frame(
user=c("one","two","three"),
score=c("500,200,1100")
)
save(scores,file = file.path(x$cache_path_get(), "scores.rdata"))
x$list()
x$details()
#new session
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$list()
x$details()
load(file = file.path(x$cache_path_get(), "scores.rdata"))
PS - you can see a working example in the rnoaa package found on at github "opensci/rnoaa". Check their R/onload.r file! I can expand if needed.

Resources