Download GEO data from NCBI to R

I am trying to teach myself bioinformatics analysis using R and Bioconductor, but I got stuck at an early step. I was trying to download GSE data from NCBI by following some commands I found on YouTube, and I get the following error messages:
# First Step:
library(GEOquery)
Error in library(GEOquery) : there is no package called ‘GEOquery’
# Second step:
require(GEOquery)
Loading required package: GEOquery
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called ‘GEOquery’
library(GEOquery)
Error in library(GEOquery) : there is no package called ‘GEOquery’
# Third step:
source("https://bioconductor.org/biocLite.R")
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) : unsupported URL scheme
# Fourth step:
biocLite("GEOquery")
Error: could not find function "biocLite"

It looks like the package is not installed yet. Note that GEOquery is distributed through Bioconductor rather than CRAN, so a plain install.packages("GEOquery") will not find it; install BiocManager with install.packages("BiocManager") and then run BiocManager::install("GEOquery").
For your third step, I could not open https://bioconductor.org/biocLite.R either; the site says it cannot be found, so the script seems to have been retired. The "unsupported URL scheme" warning also suggests that your R build cannot read https:// URLs through source(). Since you used source(), I assume you wanted to run an R script. If you do have a local copy of a script, save it to a directory and pass its full path to source(); it will then run.
Your fourth step failed because biocLite.R never ran, so the biocLite() function was never defined in your environment.
Overall, you had two main problems: 1. the package was not installed (which made the first and second steps fail); 2. the argument to source() could not be opened (which made the third and fourth steps fail).
Hope this helps.

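biocLite() is not part of base R; it was provided by Bioconductor's installer script. To use it, you need to set up Bioconductor first. In current Bioconductor releases, BiocManager::install() takes its place.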
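As a minimal sketch of the BiocManager route, assuming a reasonably recent R and a working internet connection: the helper name ensure_bioc_package is hypothetical (it is not part of any package), and GSE2553 is only an example accession to be replaced with your own.

```r
# Install a Bioconductor package only when it is missing.
# `ensure_bioc_package` is a hypothetical helper name used for this sketch.
ensure_bioc_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # BiocManager itself lives on CRAN, so install.packages() can fetch it.
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(pkg)
  }
  # TRUE if the package can now be loaded, FALSE otherwise.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usage (needs internet access; the accession below is only an example):
# ensure_bioc_package("GEOquery")
# library(GEOquery)
# gse <- getGEO("GSE2553", GSEMatrix = TRUE)
```

The helper does nothing when the package is already installed, so it is safe to keep at the top of a script.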

Related

How to install R packages into Azure Machine Learning

I have trained a model locally using the R package locfit. I am now trying to run this in Azure Machine Learning.
Most guides and previous questions appear to relate to Azure Machine Learning (classic). Although I believe the process outlined in similar posts should be similar (e.g. here, here), I am still unable to get it to work.
I have outlined the steps I have followed below:
Download locfit R package for windows Zip file from here
Put this downloaded Zip file into a new Zip file entitled "locfit_package"
Upload this "locfit_package" zip to AML as a dataset (Create Dataset > From Local Files > name: locfit_package, dataset type: file > upload the zip ("locfit_package") > confirm the upload is correct)
In the R terminal I then execute the following code:
install.packages("src/locfit_package.zip", lib = ".", repos = NULL, verbose = TRUE)
library(locfit_package, lib.loc=".", verbose=TRUE)
library(locfit)
The following error message is then returned:
system (cmd0): /usr/lib/R/bin/R CMD INSTALL
Warning: invalid package ‘src/locfit_package.zip’
Error: ERROR: no packages specified
Warning message:
In install.packages("src/locfit_package.zip", lib = ".", repos = NULL, : installation of package ‘src/locfit_package.zip’ had non-zero exit status
Error in library(locfit_package, lib.loc = ".", verbose = TRUE) : there is no package called ‘locfit_package’
Execution halted
I just checked the documentation, which says: "Execute R Script module does not support installing packages that require native compilation, like qdap package which requires JAVA and drc package which requires C++. This is because this module is executed in a pre-installed environment with non-admin permission. Do not install packages which are pre-built on/for Windows, since the designer modules are running on Ubuntu. To check whether a package is pre-built on windows, you could go to CRAN and search your package, download one binary file according to your OS, and check Built: part in the DESCRIPTION file."
And the sample code:
# R version: 3.5.1
# The script MUST contain a function named azureml_main,
# which is the entry point for this module.
# Note that functions dependent on the X11 library,
# such as "View," are not supported because the X11 library
# is not preinstalled.
# The entry point function MUST have two input arguments.
# If the input port is not connected, the corresponding
# dataframe argument will be null.
# Param<dataframe1>: a R DataFrame
# Param<dataframe2>: a R DataFrame
azureml_main <- function(dataframe1, dataframe2){
  print("R script run.")
  if(!require(zoo)) install.packages("zoo",repos = "https://cloud.r-project.org")
  library(zoo)
  # Return datasets as a Named List
  return(list(dataset1=dataframe1, dataset2=dataframe2))
}
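Could you please check whether your package falls into one of these categories? Since you downloaded the Windows zip of locfit, it is a binary built for Windows and will not install on the Ubuntu-based module.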
Reference document: https://learn.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/execute-r-script
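The Built check mentioned in the documentation can also be done from R. The following is a sketch, assuming you have extracted the DESCRIPTION file from the downloaded binary; built_os is a hypothetical helper name, and it relies on the Built field ending in the OS type (for example "windows" or "unix").

```r
# Read the "Built" field of a DESCRIPTION file and return its last component,
# which records the OS type the binary was built for.
# `built_os` is a hypothetical helper name used only for this sketch.
built_os <- function(description_path) {
  built <- read.dcf(description_path, fields = "Built")[1, "Built"]
  if (is.na(built)) {
    # Source packages have no Built field at all.
    return(NA_character_)
  }
  parts <- trimws(strsplit(built, ";", fixed = TRUE)[[1]])
  parts[length(parts)]
}

# Usage on a package that is already installed:
# built_os(system.file("DESCRIPTION", package = "stats"))
```

A result of "windows" would indicate a binary that should not be installed on the Ubuntu-based module; NA would indicate a source package with no Built field.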

covr::package_coverage reports "No such file or directory"

I'm trying to see the code coverage of sumbose/iRF, so I did a git clone, started an R session inside of the directory, and
> library(covr)
> package_coverage()
Error in file(con, "r") : cannot open the connection
In addition: Warning messages:
1: In utils::install.packages(repos = NULL, lib = tmp_lib, pkg$path, :
installation of package ‘/private/tmp/iRF’ had non-zero exit status
2: In file(con, "r") :
cannot open file '/private/var/folders/ny/f06ns0d568bgf6s559z8j_9m0000gn/T/RtmpAr8dLV/R_LIBS168866d1ef32f/iRF/R/iRF': No such file or directory
However, both R CMD INSTALL iRF and install.packages('iRF', repos = NULL) install the package as expected.
I encountered this problem too, with an under-development package that is not installed. The error message contains a path like this:
... /pkgname/R/pkgname': No such file or directory
where pkgname is the package name.
I used VS Code and called covr::package_coverage(), with the package folder as the working directory. I could consistently reproduce the error, and then I noticed that this error occurred if I called devtools::load_all() first.
I found that, for unknown reasons, this error disappeared if I started an R session and did not run devtools::load_all(). I did not need to (and maybe should not) load the package. covr::package_coverage() ran normally in that session without loading the package.
If I called devtools::load_all() after I called covr::package_coverage(), and then called covr::package_coverage() again, it would fail in the same session.
So I think the solution is simple, though a little counterintuitive:
Call covr::package_coverage() in a session in which the package has not been loaded by devtools::load_all().
I could call covr::package_coverage() several times in such a session without problems. Changes I made to files were reflected correctly in the output of covr::package_coverage(), without needing to load the package.
I used covr 3.5.1, R 4.2.0 in Windows.
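The workaround can be turned into a small guard, sketched here under the assumption that path points at the package root containing a DESCRIPTION file. safe_package_coverage is a hypothetical helper name, not part of covr; it only checks whether the package's namespace is already loaded in the session (for example by devtools::load_all()) and stops with a message if so. Whether every kind of loaded namespace triggers the original error is not something I have verified.

```r
# Refuse to run coverage when the package under test is already loaded.
# `safe_package_coverage` is a hypothetical helper name, not a covr function.
safe_package_coverage <- function(path = ".") {
  # Read the package name from its DESCRIPTION file.
  desc <- read.dcf(file.path(path, "DESCRIPTION"), fields = "Package")
  pkg <- unname(desc[1, "Package"])
  if (pkg %in% loadedNamespaces()) {
    stop(
      "Package '", pkg, "' is already loaded in this session; ",
      "restart R and call covr::package_coverage() before loading it."
    )
  }
  covr::package_coverage(path)
}

# Usage, from a fresh R session started in the package directory:
# safe_package_coverage()
```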

Pandoc not recognized as R library

I installed Pandoc manually through this installation - link.
After restarting the system, I was able to locate the installation folder at C:/Users/YourUserName/AppData/Local/Pandoc
But when I'm trying to call the library:
library("pandoc", lib.loc = "C:/Users/YourUserName/AppData/Local/Pandoc")
I'm getting the following error:
Error in library("pandoc", lib.loc = "C:/Users/YourUserName/AppData/Local/Pandoc") :
no library trees found in 'lib.loc'
As I'm behind a firewall, I cannot install pandoc through GitHub, so the install.pandoc() function is out.
Any ideas where I'm getting the installation process wrong?
Edit:
I've changed .libPaths() to point to Pandoc's installation folder:
.libPaths('C:/Users/stefanj/AppData/Local/Pandoc')
And if I check, it seems to be ok:
> grep("pandoc", list.files(.libPaths()))
[1] 22 24
library(pandoc)
Error in library(pandoc) : there is no package called ‘pandoc’
Execution halted
I agree with @Dason's point: the folder "C:/Users/YourUserName/AppData/Local/Pandoc" is not an R library or package. It is just the Pandoc program itself, so library() cannot load it.
Another way to install Pandoc is with installr:
installr::install.pandoc()
To convert from one markup format to another from R, use the following package:
rmarkdown
The rmarkdown package includes high-level functions for converting to a variety of formats. For example:
library(rmarkdown)
render("input.Rmd", html_document())
render("input.Rmd", pdf_document())
Hope this helps.
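Because Pandoc is an external program rather than an R package, a quick way to confirm that R can see a manual installation is to look for the executable. The sketch below assumes the install folder from the question; locate_pandoc is a hypothetical helper name.

```r
# Look for the pandoc executable on the PATH and in any extra directories.
# `locate_pandoc` is a hypothetical helper name used only for this sketch.
locate_pandoc <- function(dirs = character()) {
  exe <- if (.Platform$OS.type == "windows") "pandoc.exe" else "pandoc"
  on_path <- unname(Sys.which("pandoc"))  # "" when not found on the PATH
  candidates <- c(on_path, file.path(dirs, exe))
  candidates <- candidates[nzchar(candidates) & file.exists(candidates)]
  if (length(candidates) == 0) NA_character_ else candidates[1]
}

# Usage (the folder is the one from the question; adjust to your machine):
# p <- locate_pandoc("C:/Users/YourUserName/AppData/Local/Pandoc")
# if (!is.na(p)) Sys.setenv(RSTUDIO_PANDOC = dirname(p))
# rmarkdown::pandoc_available()
```

Setting the RSTUDIO_PANDOC environment variable is one way to point rmarkdown at a specific Pandoc folder; rmarkdown::pandoc_available() then reports whether it was found.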

How to use install_github

I am trying to calculate weight of evidence and information value, and found online that R package riv and tomasgreif does the job. Both packages are located on github, so I used the following code:
library(devtools)
install_github("riv","tomasgreif")
library(woe)
But it gives me the following error/warning message:
> install_github("riv","tomasgreif")
Installing github repo riv/master from tomasgreif
Downloading master.zip from https://github.com/tomasgreif/riv/archive/master.zip
Error in function (type, msg, asError = TRUE) : couldn't connect to host
In addition: Warning message:
In mapCurlOptNames(names(.els), asNames = TRUE) :
Unrecognized CURL options: writedata
How can I solve this problem?
I tried to download the file manually. R returned no error, but the package is not found in the list... (I am able to install some other packages saved in the same location with similar code)
> install.packages("~/riv.zip", repos = NULL)
> library("riv")
Error in library("riv") : there is no package called ‘riv’
I was able to resolve it by downloading open-source R from Revolution Analytics, then manually downloading the woe package from GitHub and pasting it directly into the library folder of that R installation.
After that, library(woe) loads it.
Along the same lines, I will also be releasing IV and WOE for Python.
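One possible reason the manual install.packages("~/riv.zip", repos = NULL) appeared to succeed while library("riv") failed is that a GitHub archive is a source tree rather than a built binary package; this is an assumption, not something confirmed in the question. Under that assumption, a sketch of installing from a manually downloaded archive is to unpack it and install the unpacked folder as source. Both helper names are hypothetical.

```r
# Locate the package root (the folder holding DESCRIPTION) in an unpacked archive.
# `find_package_root` and `install_from_github_zip` are hypothetical helper names.
find_package_root <- function(dir) {
  desc <- list.files(dir, pattern = "^DESCRIPTION$", recursive = TRUE, full.names = TRUE)
  if (length(desc) == 0) {
    stop("No DESCRIPTION file found under ", dir, "; not an R source package.")
  }
  # Prefer the shallowest DESCRIPTION in case the archive bundles others.
  dirname(desc[which.min(nchar(desc))])
}

# Unpack a GitHub source archive and install it as a source package.
install_from_github_zip <- function(zip_path, lib = .libPaths()[1]) {
  exdir <- tempfile("github_src_")
  utils::unzip(zip_path, exdir = exdir)
  pkg_dir <- find_package_root(exdir)
  utils::install.packages(pkg_dir, repos = NULL, type = "source", lib = lib)
  invisible(pkg_dir)
}

# Usage (the file name is the one from the question):
# install_from_github_zip("~/riv.zip")
```

Installing from source needs the usual build tools if the package contains compiled code.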

error with R CMD check because of package dependency

Background
I am creating a newpackage that depends on oldpackage, and have indicated this dependency in the file newpackage/DESCRIPTION.
Furthermore,
oldpackage is installed in the directory, ~/lib/R
my .Rprofile includes .libPaths("~/lib/R")
hence, I can successfully load oldpackage without specifying the library location, e.g., using the command library(oldpackage) in R
Despite the ability to load the package without specifying its library, R CMD check newpackage gives an error indicating that it cannot find oldpackage:
checking whether the package can be loaded ... ERROR
Loading required package: oldpackage
Error: package 'oldpackage' could not be loaded
In addition: Warning message:
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
there is no package called 'oldpackage'
Execution halted
It looks like this package has a loading problem: see the messages for
details.
Questions:
Why is R unable to find the package?
Can I specify the library location in the DESCRIPTION file?
Regarding question 1), it is both a FAQ and yet somewhat annoying. R CMD check runs in vanilla mode, so it will not find user-level libraries. As I recall, setting R_LIBS="...." in the call helps, so try that.
Regarding question 2), no you cannot give a location in DESCRIPTION. Which makes sense as that file needs to work 'everywhere' whereas your location info is local to your machine.
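The R_LIBS suggestion can be sketched from within R as follows, assuming the library location ~/lib/R from the question. r_libs_with and check_with_lib are hypothetical helper names; the second one sets R_LIBS temporarily, runs R CMD check as a child process, and restores the previous value afterwards.

```r
# Build an R_LIBS value that puts `lib` ahead of any existing entries.
# `r_libs_with` and `check_with_lib` are hypothetical helper names.
r_libs_with <- function(lib, current = Sys.getenv("R_LIBS")) {
  entries <- c(path.expand(lib), if (nzchar(current)) current)
  paste(entries, collapse = .Platform$path.sep)
}

# Run R CMD check on `pkg_dir` with R_LIBS pointing at the user library.
check_with_lib <- function(pkg_dir, lib = "~/lib/R") {
  old <- Sys.getenv("R_LIBS", unset = NA)
  # Restore the caller's environment when the function exits.
  on.exit(
    if (is.na(old)) Sys.unsetenv("R_LIBS") else Sys.setenv(R_LIBS = old),
    add = TRUE
  )
  Sys.setenv(R_LIBS = r_libs_with(lib, if (is.na(old)) "" else old))
  system2(file.path(R.home("bin"), "R"), args = c("CMD", "check", shQuote(pkg_dir)))
}

# Usage:
# check_with_lib("newpackage")
```

The same effect can be had from a shell by setting R_LIBS in the environment of the R CMD check call.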
