Getting Internal Server Error when trying to use Gviz's IdeogramTrack in R

I posted this on Bioconductor's support page but didn't get any answers, so I'm trying here.
I am using the IdeogramTrack function of the R/Bioconductor package Gviz from my institution's cluster:
IdeogramTrack(genome="mm10",chromosome="chr1")
When I try this from the master node it works fine, but when I try it from any other node in the cluster, which routes its I/O through the master node, it hangs and eventually I get the error message:
Error: Internal Server Error
I am able to reach genome.ucsc.edu or any other UCSC mirror from these nodes (e.g., using traceroute genome.ucsc.edu), and can successfully download data from other repositories such as Ensembl (e.g., using getBM).
Any idea what's wrong?
BTW, any idea which port IdeogramTrack is trying to use?

It sounds like your institution's cluster has issues fetching annotation data from UCSC through Gviz. One suggestion I have is to see if you can manually download the mm9 annotation from UCSC; here is a good place to start, by chromosome. Alternatively, you may use a Bioconductor annotation package such as this.
When you have your data.frame with chromosome and chromosomal information (e.g. mapInfo), you could take advantage of GenomicRanges::makeGRangesFromDataFrame to convert the mm9 annotation to a GRanges object, which allows you to make your own IdeogramTrack object. Details on how to make a custom IdeogramTrack can be found here.
In general, here is the workflow:
library(GenomicRanges)
library(Gviz)
mm9_annot <- read.table(<file or url with annotation>)
mm9_granges <- makeGRangesFromDataFrame(mm9_annot)
# Alternatively, you may use the rtracklayer package:
# mm9_granges <- rtracklayer::import(<file or url with annotation>)
my_ideo <- IdeogramTrack(genome="mm9_custom", bands=mm9_granges)
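If you do take the manual-download route, a sketch along the following lines may work. Note that the cytoBandIdeo URL and the column names are my assumptions based on UCSC's standard cytoband table layout, and I'm assuming IdeogramTrack's bands argument accepts such a table as a data.frame; double-check both against the current Gviz documentation.
library(Gviz)
# Minimal sketch: build the ideogram from a locally downloaded cytoband table.
# The URL and column names below follow UCSC's usual cytoBandIdeo layout, so
# verify them against the file you actually download.
cyto_url <- "http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/cytoBandIdeo.txt.gz"
cyto_file <- tempfile(fileext = ".txt.gz")
download.file(cyto_url, cyto_file)
bands <- read.table(cyto_file, sep = "\t",
                    col.names = c("chrom", "chromStart", "chromEnd",
                                  "name", "gieStain"))
# With the band table supplied directly, no connection to UCSC is needed
# at plotting time.
my_ideo <- IdeogramTrack(genome = "mm9", chromosome = "chr1", bands = bands)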
Hope this helps.

Related

Access list of functions and metadata for github dev R package

Please note: this is cross-posted from here, where it hasn't received a response, so I'm adding it here.
I'm currently co-developing an R package on GitHub which can be installed using devtools::install_github('repo/pkgname'), as usual.
We have diligently used roxygen2 to document the individual functions.
We have split the functions into "internal" (@keywords internal) vs. "external" (@export), so that the user gets to use the external functions, i.e. pkgname::external_<fn_name>, and access their documentation. They can also use ::: to access the internal functions if they wish.
For some meta-analysis of our package it would be nice to have functionality that produced a tidy tibble with the following columns:
- function name
- function type, i.e. internal/external (accessible by ::: or :: to the user)
- more metadata, e.g. another column containing the parameter names for each function (i.e. the @param values) and the documentation strings for each parameter
As a crude (non-tibble) version for, say, dplyr, one can do something like:
library(dplyr) # Assume installed already
ls('package:dplyr')
This produces a character vector of function names, but not a tidy tibble with more
useful metadata.
Ideally we would be able to produce this tibble after doing devtools::load_all(".") in
our package development, to track changes in real-time.
Are there any existing R packages that can help generate such a metadata tibble?
Or can such a function be developed for this using existing R packages?
Would appreciate any help associated with this.
I have an answer to my question, which may help others. It turns out this metadata can be accessed using the amazing pkgdown package.
See below for code to use when you have opened an RStudio project attached to a package
you are developing (using devtools):
# Setup - install required libraries
# install.packages(c("pkgdown", "here"))
library(dplyr)  # needed for the %>% pipe
# If you are in your local package directory, run the following
# to get the required package metadata
pkg <- pkgdown::as_pkgdown(pkg = here::here())
# Inspect the topics object, which contains the function metadata
pkg$topics %>% dplyr::glimpse()
# Get the list of all functions and just the required metadata
pkg_fns_all <- pkg$topics %>%
  dplyr::select(name, file_in, internal)
# Get the non-internal functions, accessed using pkgname::function
pkg_fns_user <- pkg_fns_all %>% dplyr::filter(!internal)
# Get the internal functions, accessed using pkgname:::function
pkg_fns_internal <- pkg_fns_all %>% dplyr::filter(internal)
Hope this helps others!
A few small outstanding items:
- I'm not sure how to get access to the individual function @param values from the above, but if anyone can add some details around that, it would be useful.
- I'm not sure how to apply this to CRAN-installed packages on my system, e.g. dplyr.
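On those items, one possible direction (just a sketch I put together, not something from pkgdown) is to read an installed package's parsed Rd documentation with tools::Rd_db() and pull the \arguments entries out of each help page; since this works on whatever is installed, it also covers CRAN packages such as dplyr. The "mutate.Rd" key at the end is only an example.
library(tools)
# Sketch: extract documented argument names and their doc strings from the
# parsed Rd files of an installed package (dplyr is used as the example).
rd_db <- Rd_db("dplyr")

get_params <- function(rd) {
  tags <- vapply(rd, function(x) attr(x, "Rd_tag"), character(1))
  idx <- which(tags == "\\arguments")
  if (length(idx) == 0) return(NULL)  # this help page documents no parameters
  items <- Filter(function(x) identical(attr(x, "Rd_tag"), "\\item"), rd[[idx[1]]])
  data.frame(
    param = vapply(items, function(x) paste(unlist(x[[1]]), collapse = ""), character(1)),
    doc   = vapply(items, function(x) trimws(paste(unlist(x[[2]]), collapse = " ")), character(1)),
    stringsAsFactors = FALSE
  )
}

# Example: the arguments documented on dplyr's mutate() help page
get_params(rd_db[["mutate.Rd"]])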

Plotting GTFS object routes with tidytransit in R

I need to plot General Transit Feed Specification (GTFS) object routes and their frequencies. To get some practice, I ran the following code from the package manual https://cran.r-project.org/web/packages/tidytransit/tidytransit.pdf. But although the code is taken from the manual, I get the error below. Can anyone clarify this issue and show me an alternative way to perform this spatial analysis?
library(tidytransit)
local_gtfs_path <- system.file("extdata",
                               "google_transit_nyc_subway.zip",
                               package = "tidytransit")
nyc <- read_gtfs(local_gtfs_path,
                 local = TRUE)
plot(nyc)
Error in UseMethod("inner_join") :
no applicable method for 'inner_join' applied to an object of class "NULL"
Thanks for posting this!
This happened because we made a change in the API, and I think the docs you were looking at were out of sync; they should be up to date now. See http://tidytransit.r-transit.org/articles/introduction.html
Also, we made a change so that the plot() function will work as specified in both the old docs and the new docs.

How do we set constant variables while building R packages?

We are building a package in R for our service (a robo-advisor here in Brazil), and we send requests to our external API all the time inside our functions.
As it is the first time we've built a package, we have some questions. :(
When we use our package to run some scripts, we will need some information such as api_path, login and password.
How do we place this information inside our package?
Here is a real example:
get_asset_daily <- function(asset_id) {
  api_path <- "https://api.verios.com.br"
  url <- paste0(api_path, "/assets/", asset_id, "/dailies?asc=d")
  data <- fromJSON(url)
  data
}
Sometimes we use a staging version of the API and we have to constantly switch paths. How should we refer to the path inside our functions?
Should we set a global environment variable, a package environment variable, just define api_path in our scripts, or use a package config file?
How do we do that?
Thanks for your help in advance.
Ana
One approach would be to use R's options interface. Create a file zzz.R in the R directory (this is the customary name for this file) with the following:
.onLoad <- function(libname, pkgname) {
  options(api_path = '...', username = 'name', password = 'pwd')
}
This will set these options when the package is loaded into memory.
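As a follow-up, here is a rough sketch of how the function from the question could then read those options, and how you could point the whole session at the staging API with a single call. The option names and the staging URL are placeholders I made up, not anything defined by the package.
get_asset_daily <- function(asset_id) {
  # Fall back to the production path if the option has not been set
  api_path <- getOption("api_path", default = "https://api.verios.com.br")
  url <- paste0(api_path, "/assets/", asset_id, "/dailies?asc=d")
  jsonlite::fromJSON(url)
}

# Switch the current session to the staging API (placeholder URL):
options(api_path = "https://staging.example.com")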

What to do when a NOAA ERDDAP dataset is not found?

I'm trying to download some gridded ERDDAP data using the rnoaa package in R. While data retrieval works perfectly for some datasets, I'm having problems getting the data for some datasets in particular. For example, when I run:
library(rnoaa)
ds.info <- erddap_info("noaa_pfeg_95de_54ab_a60a")
erddap_grid(ds.info,
            time = c("2005-01-01", "2015-01-01"),
            altitude = c(0, 0),
            latitude = c(3.25, 3.75),
            longitude = c(72.5, 73.25),
            fields = "all")
I get the following error:
`Error: (404) - Resource not found: /erddap/griddap/ncdcOwDly.csv (Currently unknown datasetID=ncdcOwDly)`.
The error is not really consistent, because it sometimes works when I try different time spans. But I get it pretty much every single time I try to download data from the datasets noaa_pfeg_95de_54ab_a60a, noaa_pfeg_1a4b_0c2a_2365 and some others by NOAA-NCDC.
Because erddap_grid works for some datasets but not for others, I'm inclined to think it's not a bug. Maybe it is a problem with the ERDDAP server, or maybe something to do with my API key? Is there a way around it?
Update, 2015-01-10: it seems to be a server problem. When I try to download the data using the address generated by the web interface (see below), I get the same error. I guess I'll just have to wait until "they" fix the problem with the database.
http://coastwatch.pfeg.noaa.gov/erddap/griddap/ncdcOw6hr.csv?u[(2006-01-01):1:(2015-01-09T18:00:00Z)][(10.0):1:(10.0)][(3.25):1:(3.75)][(72.5):1:(73.25)],v[(2006-01-01):1:(2015-01-09T18:00:00Z)][(10.0):1:(10.0)][(3.25):1:(3.75)][(72.5):1:(73.25)]
ERDDAP servers often become overloaded and return 404s on some requests. These are public-facing servers that do heavy data lifting, after all.
So the answer here is to try again after waiting some time (please wait a while, to be nice to the ERDDAP administrators), and to contact the server administrator to make sure your IP address has not been blacklisted for making too many requests.
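If you want to automate the waiting, a simple retry wrapper along these lines may help. This is just a sketch around the call from the question, with made-up attempt counts and wait times.
library(rnoaa)

fetch_with_retry <- function(ds.info, attempts = 3, wait_seconds = 60) {
  for (i in seq_len(attempts)) {
    result <- tryCatch(
      erddap_grid(ds.info,
                  time = c("2005-01-01", "2015-01-01"),
                  altitude = c(0, 0),
                  latitude = c(3.25, 3.75),
                  longitude = c(72.5, 73.25),
                  fields = "all"),
      error = function(e) e
    )
    if (!inherits(result, "error")) return(result)
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) Sys.sleep(wait_seconds * i)  # back off a bit longer after each failure
  }
  stop("All ", attempts, " attempts failed; the server may still be overloaded.")
}

ds.info <- erddap_info("noaa_pfeg_95de_54ab_a60a")
dat <- fetch_with_retry(ds.info)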

Using R package BerkeleyEarth

I'm working for the first time with the R package BerkeleyEarth, attempting to use its convenience functions to access the BEST data. I think maybe it's just a problem with their servers (a matter I've separately raised with the package's maintainer), but I wanted to know whether it's instead something silly I'm doing.
To reproduce my fault:
library(BerkeleyEarth)
downloadBerkeley()
which produces the following error message:
trying URL 'http://download.berkeleyearth.org/downloads/TAVG/LATEST%20-%20Non-seasonal%20_%20Quality%20Controlled.zip'
Error in download.file(urls$Url[thisUrl], destfile = file.path(destDir, :
cannot open URL 'http://download.berkeleyearth.org/downloads/TAVG/LATEST%20-%20Non-seasonal%20_%20Quality%20Controlled.zip'
In addition: Warning message:
In download.file(urls$Url[thisUrl], destfile = file.path(destDir, :
InternetOpenUrl failed: 'A connection with the server could not be established'
Has anyone had a better experience using this package?
The error message points to a different URL than the ones listed at http://berkeleyearth.org/data/ for the zip-formatted files. There is another set of .nc files that appears to be more recent. I would replace the entries in the BerkeleyUrls dataframe with the ones that match your analysis strategy.
This is the current URL that should be in position 1,1:
http://berkeleyearth.lbl.gov/downloads/TAVG/LATEST%20-%20Non-seasonal%20_%20Quality%20Controlled.zip
And this is the one that is in the package dataframe:
> BerkeleyUrls[1,1]
[1] "http://download.berkeleyearth.org/downloads/TAVG/LATEST%20-%20Non-seasonal%20_%20Quality%20Controlled.zip"
I suppose you could try:
BerkeleyUrls[, 1] <- sub( "download\\.berkeleyearth\\.org", "berkeleyearth.lbl.gov", BerkeleyUrls[, 1])
