I am attempting to retrieve email addresses of contributing package authors and maintainers to the R-Project. The function reads as follows:
availpkgs <- available.packages(contriburl = contrib.url(getOption("repos"), type),
method, fields = NULL, type = getOption("pkgType"),
filters = NULL)
I've tried different character values in the fields parameter to retrieve Maintainer and Author info from the 'PACKAGES' files, but have not had any luck. Does anyone know how I might approach this? Thank you in advance for your time.
I do not think the Author information is in what available.packages() retrieves:
R> AP <- available.packages()
R> colnames(AP)
[1] "Package" "Version" "Priority"
[4] "Bundle" "Contains" "Depends"
[7] "Imports" "LinkingTo" "Suggests"
[10] "Enhances" "OS_type" "License"
[13] "File" "Repository"
R>
So maybe you need to combine this with a per-package lookup of the DESCRIPTION info at CRAN (or a mirror). I do that, and a few more things, in the 200-line script driving the CRANberries RSS feed / html summary of package updates at CRAN which stores stateful info in SQLite. For this, I retrieve Author, Maintainer etc directly from the package I am currently looking at rather than in one big global scoop. That said, there may of course be other meta-data at CRAN for this...
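The per-package lookup described above can be sketched as follows. This is a minimal illustration, not the CRANberries script: the DESCRIPTION text is a made-up stand-in (the package name and address are hypothetical), and maintainer_email() is a helper name introduced here. For a real package you would point read.dcf() at that package's DESCRIPTION file instead.

```r
# Made-up DESCRIPTION content standing in for a file fetched per package.
desc_text <- c(
  "Package: examplepkg",
  "Version: 0.1.0",
  "Author: Jane Doe [aut, cre]",
  "Maintainer: Jane Doe <jane.doe@example.org>"
)

# read.dcf() parses the "Field: value" format used by DESCRIPTION files.
con <- textConnection(desc_text)
desc <- read.dcf(con, fields = c("Package", "Author", "Maintainer"))
close(con)

# Hypothetical helper: pull the address out of "Name <address>".
# Returns NA where no <...> part is present.
maintainer_email <- function(x) {
  has_email <- grepl("<[^<>]+>", x)
  ifelse(has_email, sub(".*<([^<>]+)>.*", "\\1", x), NA_character_)
}

emails <- maintainer_email(unname(desc[, "Maintainer"]))
print(emails)
```

Looping this over package names keeps the work per package, which is the approach described above, at the cost of one lookup per package.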
My code:
setwd("C:/A549_ALI/4_tert-Butanol (22)/")
list.celfiles()
my.affy=ReadAffy()
dim(exprs(my.affy))
Output:
[1] "(46) 22-B1-1_(miRNA-4_0).CEL"
[2] "(47) 22-B1-2_(miRNA-4_0).CEL"
[3] "(48) 22-B1-3_(miRNA-4_0).CEL"
[4] "(49) 22-R1-1_(miRNA-4_0).CEL"
[5] "(50) 22-NEC 1-1_(miRNA-4_0).CEL"
[6] "(51) 22-B2-1_(miRNA-4_0).CEL"
[7] "(52) 22-B2-2_(miRNA-4_0).CEL"
[8] "(53) 22-B2-3_(miRNA-4_0).CEL"
[9] "(54) 22-R2-1_(miRNA-4_0).CEL"
[10] "(55) 22-NEC 2-1_(miRNA-4_0).CEL"
[11] "(56) 22-B3-1_(miRNA-4_0).CEL"
[12] "(57) 22-B3-2_(miRNA-4_0).CEL"
[13] "(58) 22-B3-3_(miRNA-4_0).CEL"
[14] "(59) 22-R3-1_(miRNA-4_0).CEL"
[15] "(60) 22-NEC 3-1_(miRNA-4_0).CEL"
[1] 292681 15
Up to here everything works, but then I get this error message:
background correction: mas
PM/MM correction : mas
expression values: mas
background correcting...'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details
replacement repositories:
CRAN: https://cran.rstudio.com/
Error in getCdfInfo(object) :
Could not obtain CDF environment, problems encountered:
Specified environment does not contain miRNA-4_0
Library - package mirna40cdf not installed
Bioconductor - mirna40cdf not available
I have already tried to install this package, but I can't find it on the Bioconductor website.
Now I do not know how to proceed. Is there any other way to use the mas5calls function?
I use R 4.2.2.
Thanks for all answers.
This simply means that the package containing the CDF environment, which is built from the chip description file (CDF) for this type of microarray, is not distributed through Bioconductor. It seems that Affymetrix no longer provides those files, but you can find them on GEO (click on the platform and then look under "Supplementary file"). Alternatively, ask the person you got the data from whether they can provide the relevant CDFs. Use cdfName() to check which ones you need.
Once you have obtained the CDF, you can build the R package (mirna40cdf in your case) that affy needs using the makecdfenv package, which you can install from Bioconductor. You could also try another package called oligo and see if it supports your data.
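As a side note on where the name mirna40cdf comes from: the error message suggests that the chip name miRNA-4_0 is lower-cased, stripped of punctuation, and suffixed with cdf. The helper below is a rough base-R approximation of that mapping, written for illustration only. guess_cdf_pkg_name() is a hypothetical name, not a function from affy or makecdfenv, and the real rule may include special cases that this ignores.

```r
# Hypothetical helper approximating how a chip name maps to a CDF package name.
guess_cdf_pkg_name <- function(chip_name) {
  # Lower-case the name and drop everything that is not a letter or digit.
  cleaned <- gsub("[^a-z0-9]", "", tolower(chip_name))
  paste0(cleaned, "cdf")
}

pkg_name <- guess_cdf_pkg_name("miRNA-4_0")
print(pkg_name)
```

The result matches the package name reported in the error message, so a package built from the CDF file would need to carry that name for affy to find it.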
I would like to extract the GitHub repo URL for all the packages on CRAN. I first tried to read the CRAN page and get the table of all package names, which also contains the URL of each package's description page, since I want to extract the GitHub repo URL from the description page. But I can't get the complete URL. Could you please help me with this? Or is there a better way to get the repo URL for all packages?
Some additional detail:
Actually, I want to filter for the packages that have an official GitHub repo, such as xfun or fddm. I found that I can extract the username and repo name from the description of packages on CRAN and put them into a GitHub-formatted URL, since most of them have the same format: https://github.com/{username}/{reponame}. For example, for package xfun it would be https://github.com/yihui/xfun.
So far I have obtained three of them.
I am wondering how I can get the URL for all of them. I know the glue package can substitute elements into a URL, and to build the URLs by substituting the username and repo name I have tried the map() and map_dfr() functions. But this returns an error: Error in parse_url(url) : length(url) == 1 is not TRUE
Here is my code :
get <- map_dfr(dat, ~{
username <- dat$user
reponame <- dat$package
pkg_url <- GET(glue::glue("https://github.com/{username}/{reponame}"))
})
Could you please help me with this? Thanks a lot ! :)
I want to suggest a different method for getting where you want.
As discussed in the comments, not all R packages have public GitHub repos.
Here is a version of some code from an answer to another question by Dirk Eddelbuettel that retrieves information from CRAN's database, including the package name and the URL field. If a package has a public GH repo, it is very likely that the authors have included that information in the URL field: there may be a few packages where the GH repo information is guessable (i.e. the GH user name is the same as (e.g.) the identifier in the maintainer's e-mail address; the GH repo name is the same as the package name), but it seems like a lot of work to do all that guessing (and accessing GitHub to see if the guess was correct) for a relatively low return.
getPackageRDS <- function() {
description <- sprintf("%s/web/packages/packages.rds",
getOption("repos")["CRAN"])
con <- if(substring(description, 1L, 7L) == "file://") {
file(description, "rb")
} else {
url(description, "rb")
}
on.exit(close(con))
db <- readRDS(gzcon(con))
rownames(db) <- NULL
return(db)
}
dd <- as.data.frame(getPackageRDS())
dd2 <- subset(dd, grepl("github.com", URL))
## clean up (multiple URLs, etc.)
dd2$URL <- sapply(strsplit(dd2$URL,"[, \n]"),
function(x) trimws(grep("github.com", x, value=TRUE)[1]))
As of today (25 May 2021) there are 17665 packages in total on CRAN, of which 6184 have "github.com" in the URL field. Here are the first few results:
Package URL
5 abbyyR http://github.com/soodoku/abbyyR
12 ABCoptim http://github.com/gvegayon/ABCoptim
16 abctools https://github.com/dennisprangle/abctools
18 abdiv https://github.com/kylebittinger/abdiv
20 abess https://github.com/abess-team/abess
23 ABHgenotypeR http://github.com/StefanReuscher/ABHgenotypeR
The URL field may still not be completely clean, but this should get you most of the way there.
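One possible further clean-up step, sketched under assumptions: reduce each entry to the canonical https://github.com/{username}/{reponame} form, dropping sub-paths such as /issues and a trailing .git. normalize_github_url() is a name introduced here for illustration, and the sample inputs other than the first are made up.

```r
# Hypothetical helper: keep only the user and repo parts of a GitHub URL.
normalize_github_url <- function(url) {
  pattern <- "^.*github\\.com/([^/#?]+)/([^/#?]+).*$"
  ok <- grepl(pattern, url)
  out <- rep(NA_character_, length(url))
  # Rebuild matching entries in a single canonical form.
  out[ok] <- sub(pattern, "https://github.com/\\1/\\2", url[ok])
  # Drop a trailing ".git" left over from clone URLs.
  sub("\\.git$", "", out)
}

samples <- c(
  "http://github.com/soodoku/abbyyR",
  "https://github.com/yihui/xfun/issues",
  "https://github.com/abess-team/abess.git",
  "https://example.org/not-a-github-url"
)
cleaned <- normalize_github_url(samples)
print(cleaned)
```

Entries that do not contain a user and a repo come back as NA, which makes them easy to inspect by hand rather than silently guessing.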
An alternative approach would be to use the githubinstall package, which works by downloading a data frame that has been generated by crawling GitHub looking for R packages.
library(githubinstall)
dd3 <- gh_list_packages()
At present there are 34491 packages in this list, so obviously it includes a lot of stuff that's not on CRAN. You could intersect this list of packages with information from available.packages() ...
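Such an intersection might look like the sketch below. To keep it runnable offline, both inputs are small made-up stand-ins: gh mimics the data frame returned by gh_list_packages() (the column names username and package_name are an assumption about its layout), and cran_pkgs mimics the package names you would get from rownames() of the CRAN package matrix.

```r
# Made-up stand-in for the GitHub listing (column names are an assumption).
gh <- data.frame(
  username     = c("yihui", "someuser", "otheruser"),
  package_name = c("xfun", "notoncran", "alsomissing"),
  stringsAsFactors = FALSE
)

# Made-up stand-in for the vector of CRAN package names.
cran_pkgs <- c("xfun", "zoo")

# Keep only GitHub entries whose package name also exists on CRAN.
on_cran <- gh[gh$package_name %in% cran_pkgs, ]
on_cran$URL <- sprintf("https://github.com/%s/%s",
                       on_cran$username, on_cran$package_name)
print(on_cran)
```

A matching name does not prove that the GitHub repo is the source of the CRAN package (forks and name clashes exist), so the result is a candidate list rather than a definitive mapping.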
I'm developing a package where I wish to add an edit history to an object. The package allows other packages to register functions for editing the object. I'm looking for a way to record the version of the package that registered the function that was used for the edit.
The question is: Given a function how do you get the package from where it was exported? My idea is to investigate its search path, but search() only reports the search path for the current environment and thus not for a function, which is what I need.
Any pointers to other approaches are greatly appreciated.
The context for getting the package is this:
registerFunction <- function(fun) {
package <- getPackage(fun) ## This is what I need
version <- getPackageVersion(package)
register(fun, package, version)
}
You can use getAnywhere(). For example, if you're looking for the namespace of the stringr function str_locate, you can do
getAnywhere("str_locate")$where
# [1] "package:stringr" "namespace:stringr"
This will work as long as stringr is "visible on the search path, registered as an S3 method or in a namespace but not exported."
The result is a named list, and you can see what's available from getAnywhere with names
names(getAnywhere("str_locate"))
# [1] "name" "objs" "where" "visible" "dups"
You can use:
environment(fun=someFunctionName)
It will return the environment of the function passed as parameter, specifying also the namespace, i.e. the package name.
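Building on environment(), the getPackage() placeholder from the question could be sketched roughly like this. get_package() and get_package_version() are illustrative names, and the sketch assumes the function was defined in a package namespace. Note that it reports the package where the function was defined, which for re-exported functions is not the package that exported it.

```r
# Hypothetical helper: name of the package whose namespace defines `fun`.
get_package <- function(fun) {
  stopifnot(is.function(fun))
  env <- environment(fun)
  # Primitive functions such as sum() have no environment; they live in base.
  if (is.null(env)) return("base")
  # Walk up to the enclosing top-level environment (a namespace or .GlobalEnv).
  top <- topenv(env)
  if (isNamespace(top)) environmentName(top) else NA_character_
}

# Hypothetical helper: installed version of a package, as a string.
get_package_version <- function(pkg) as.character(utils::packageVersion(pkg))

pkg <- get_package(stats::sd)
print(pkg)
print(get_package(sum))
print(get_package_version(pkg))
```

Functions defined outside any package yield NA here, which a registration routine could treat as "unknown origin".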
This question is sort of a follow up to this post as I'm still not fully convinced that, with respect to code robustness, it wouldn't be far better to make typing namespace::foo() habit instead of just typing foo() and praying you get the desired result ;-)
Actual question
I'm aware that this goes heavily against "standard R conventions", but let's just say I'm curious ;-) Is it possible to attach a temporary namespace to the search path somehow?
Motivation
At a point where my package mypkg is still in "devel stage" (i.e. not a true R package yet):
I'd like to source my functions into an environment mypkg instead of .GlobalEnv
then attach mypkg to the search path (as a true namespace if possible)
in order to be able to call mypkg::foo()
I'm perfectly aware that calling :: has its downsides (it takes longer than simply typing a function's name and letting R handle the lookup implicitly) and/or might not be considered necessary due to the way a) R scans through the search path and b) packages may import their dependencies (i.e. using "Imports" instead of "Depends", not exporting certain functions etc). But I've seen my code crash at least twice due to the fact that some package has overwritten certain (base) functions, so I went from "blind trust" to "better-to-be-safe-than-sorry" mode ;-)
What I tried
AFAIU, namespaces are in principle nothing more than some special kind of environment
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
> asNamespace("base")
<environment: namespace:base>
And there's the attach() function that attaches objects to the search path. So here's what I thought:
temp.namespace <- new.env(parent=emptyenv())
attach(temp.namespace)
> asNamespace("temp.namespace")
Error in loadNamespace(name) :
there is no package called 'temp.namespace'
I guess I somehow have to work with attachNamespace() and figure out what this expects before it is called in library(). Any ideas?
EDIT
With respect to Hadley's comment: I actually wouldn't care whether the attached environment is a full-grown namespace or just an ordinary environment as long as I could extend :: while keeping the "syntactic sugar" feature (i.e. being able to call pkg::foo() instead of "::"(pkg="pkg", name="foo")()).
This is what the function "::" looks like:
> get("::")
function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
getExportedValue(pkg, name)
}
This is what it should also be able to do in case R detects that pkg is in fact not a namespace but just some environment attached to the search path:
"::*" <- function (pkg, name)
{
pkg <- as.character(substitute(pkg))
name <- as.character(substitute(name))
paths <- search()
if (!pkg %in% paths) stop(paste("Invalid namespace environment:", pkg))
pos <- which(paths == pkg)
if (length(pos) > 1) stop(paste("Multiple attached envirs:", pkg))
get(x=name, pos=pos)
}
It works, but there's no syntactic sugar:
> "::*"(pkg="tempspace", name="foo")
function(x, y) x + y
> "::*"(pkg="tempspace", name="foo")(x=1, y=2)
[1] 3
How would I be able to call pkg::*foo(x=1, y=2) (disregarding the fact that ::* is a really bad name for a function ;-))?
There is something wrong in your motivation: your namespace does NOT have to be attached to the search path in order to use the '::' notation, it is actually the opposite.
The search path only lets unqualified symbols be resolved, by looking through all the packages attached to it.
So, as Hadley told you, you just have to use devtools::load_all(), that's all...
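To illustrate the point with a package that ships with R: tools is normally not attached in a fresh session, yet its exported functions are reachable through ::, because :: loads the namespace and looks the name up there without touching the search path.

```r
# Is tools on the search path? In a default session this is usually FALSE.
print("package:tools" %in% search())

# The qualified call works regardless of the answer above.
ext <- tools::file_ext("report.txt")
print(ext)

# The namespace is now loaded, even though it was never attached.
print("tools" %in% loadedNamespaces())
```

The same holds for a development package loaded with devtools::load_all(): once its namespace is loaded, the qualified form is available.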
Is there an equivalent of dir function (python) in R?
When I load a library in R like -
library(vrtest)
I want to know all the functions that are in that library.
In Python, dir(vrtest) would be a list of all attributes of vrtest.
I guess in general, I am looking for the best way to get help on R while running it in ESS on linux. I see all these man pages for the packages I have installed, but I am not sure how I can access them.
Thanks
help(package = packagename) will list all non-internal functions in a package.
Yes, use ls().
You can use search() to see what's in the search path:
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
You can search a particular package with the full name:
> ls("package:graphics")
[1] "abline" "arrows" "assocplot" "axis"
....
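If the package is not attached, or you only want functions, a sketch along these lines may help. list_functions() is a name made up here; it reads the exports from the package namespace, roughly like dir() on a Python module, and keeps the ones that are functions.

```r
# Hypothetical helper: exported functions of a package, without attaching it.
list_functions <- function(pkg) {
  # Loads the namespace if needed, but does not put it on the search path.
  exports <- getNamespaceExports(pkg)
  is_fun <- vapply(
    exports,
    function(nm) is.function(getExportedValue(pkg, nm)),
    logical(1)
  )
  sort(exports[is_fun])
}

fns <- list_functions("graphics")
print(head(fns))
```

Unlike ls("package:graphics"), this does not require library(graphics) to have been called first.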
I also suggest that you look at this related question on Stack Overflow, which includes some more creative approaches to browsing the environment. If you're using ESS, then you can use ess-rdired.
To get the help pages on a particular topic, you can either use help(function.name) or ?function.name. You will also find the help.search() function useful if you don't know the exact function name or package. And lastly, have a look at the sos package.
help(topic) #for documentation on a topic
?topic
summary(mydata) # an overview of a data object
ls() # lists all objects in the local namespace
str(object) # structure of an object
ls.str() # structure of each object returned by ls()
apropos("mytopic") # names of objects on the search path matching a string
All from the R reference card