RISmed R package fails to run EUtilsGet function

I'm using the RISmed package to make a search query for the term "HSV".
search_topic_hsv <- "HSV"
search_query_hsv <- EUtilsSummary(search_topic_hsv, retmax= 27000, mindate= 1970, maxdate= 2022)
summary(search_query_hsv) # reports 26149 records
records_hsv <- EUtilsGet(search_query_hsv)
The EUtilsGet function gives this error:
Error in readLines(collapse(EUtilsFetch, "&retmode=xml"), warn = FALSE, :
cannot read from connection
In addition: Warning message:
In readLines(collapse(EUtilsFetch, "&retmode=xml"), warn = FALSE, :
URL 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=35245365,35243667,35243023,35241838,35240194,35239848,35230726,35229412,35228129,35225980,35225036,35223260,35217873,35216177,35215908,35215799,35214758,35214644,35212944,35208817,35208572,35206263,35205731,35204477,35201702,35202955,35202734,35202567,35201013,35200997,35200729,35200645,35199176,35199165,35198914,35198826,35197947,35197285,35196784,35193038,35187895,35187000,35186664,35185822,35182142,35180012,35179120,35178081,35176987,35175220,35172828,35168095,35167501,35167411,35165387,35164537,35163675,35163086,35161988,35157734,35155582,35154980,35154946,35154584,35150885,35146210,35145504,35144835,35144523,35143960,35142052,35141364,35141210,35141072,35130299,35126336,35121324,35120255,35119473,35119328,35115776,35114961,35114414,35112018,35111489,35110532,35107381,35106980,35106191,35105326,35105322,35103749,35102178,35101046,35100650,35100323,35099587,35097177,35087520,35085968,35083659,35082770,350825 [... truncated]
Is this a PubMed connection issue or a package issue?
packageVersion("RISmed")
[1] ‘2.3.0’
Thanks in advance.
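One thing worth noting: the failing EFetch URL above carries all ~26,000 PMIDs in a single GET request, so a plausible cause is the request exceeding the server's URL length limit rather than a bug in RISmed itself. Below is a minimal sketch under that assumption, which splits the search into smaller date windows and fetches each one separately; the five-year window size and retmax value are illustrative, not requirements.
library(RISmed)
# Fetch one date window at a time so each EFetch URL stays small.
fetch_window <- function(query, from, to) {
  s <- EUtilsSummary(query, retmax = 5000, mindate = from, maxdate = to)
  EUtilsGet(s)
}
windows <- seq(1970, 2022, by = 5)
records_hsv_list <- lapply(windows, function(y) {
  fetch_window("HSV", y, min(y + 4, 2022))
})
# records_hsv_list is a list of Medline objects, one per window; pull fields
# with accessors such as ArticleTitle() and combine them afterwards.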

Related

Error in x[["endpoint"]] : object of type 'closure' is not subsettable in rtweet CRAN package

I am using a package rtweet to download tweets based on a keyword.
The search_tweets() function returned an error after working successfully for more than two months.
Now it's showing the error shown at the end of this session:
# load twitter library - the rtweet library is recommended now over twitteR
library(rtweet)
# plotting and pipes - tidyverse!
library(ggplot2)
library(dplyr)
# text mining library
library(tidytext)
# plotting packages
library(igraph)
library(ggraph)
cat ("Initializing.... \n")
Initializing....
cat("Enter a keyword whose analysis you want to check: ");
Enter a keyword whose analysis you want to check: > #a <- readLines("stdin", n = 1) # does not work in RStudio
a = "#realdonaldtrump"
src_tweets <- search_tweets(q = a, n = 1000,
+ lang = "en",
+ include_rts = TRUE)
Error in init_oauth1.0(self$endpoint, self$app, permission = self$params$permission, :
Unauthorized (HTTP 401).
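An HTTP 401 from rtweet usually means the cached OAuth token is no longer valid (revoked or regenerated app keys, or a stale token file). A minimal sketch of re-authenticating with rtweet's create_token(); every key and secret string below is a placeholder for your own app credentials, not a real value.
library(rtweet)
# Re-create the token explicitly from the app credentials.
token <- create_token(
  app             = "your_app_name",
  consumer_key    = "YOUR_CONSUMER_KEY",
  consumer_secret = "YOUR_CONSUMER_SECRET",
  access_token    = "YOUR_ACCESS_TOKEN",
  access_secret   = "YOUR_ACCESS_SECRET"
)
# Pass the fresh token to the search explicitly.
src_tweets <- search_tweets(q = "#realdonaldtrump", n = 1000,
                            lang = "en", include_rts = TRUE,
                            token = token)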

Setting up sparklyr

I am working on setting up sparklyr with R, but I keep getting an error message. Essentially I have typed in:
install.packages("sparklyr")
library(sparklyr)
spark_install(version = "2.1.0")
sc <- spark_connect(master = "local")
However, when I get to creating my Spark connection, I receive the following error message:
Using Spark: 2.1.0
Error in if (a[k] > b[k]) return(1) else if (a[k] < b[k]) return(-1L) :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: running command '"C:\WINDOWS\SYSTEM32\java.exe" -version' had status 2
2: In compareVersion(parsedVersion, "1.7") : NAs introduced by coercion
Any thoughts?
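The warnings point at Java rather than sparklyr itself: running `java -version` exited with status 2, so the version string could not be parsed and compareVersion() received NA. A sketch of checking the Java setup, assuming that diagnosis; the JDK path below is a placeholder for wherever your JDK actually lives.
# Check whether Java is installed and visible to R; this should print a
# version string such as "1.8.0_202".
system("java -version")
Sys.getenv("JAVA_HOME")   # often empty when Java is not configured
# If a suitable JDK is installed but not found, point JAVA_HOME at it
# explicitly, then retry the connection.
Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jdk1.8.0_202")
library(sparklyr)
sc <- spark_connect(master = "local")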

Not able to convert R data frame to Spark DataFrame

When I try to convert my local dataframe in R to Spark DataFrame using:
raw.data <- as.DataFrame(sc,raw.data)
I get this error:
17/01/24 08:02:04 WARN RBackendHandler: cannot find matching method class org.apache.spark.sql.api.r.SQLUtils.getJavaSparkContext. Candidates are:
17/01/24 08:02:04 WARN RBackendHandler: getJavaSparkContext(class org.apache.spark.sql.SQLContext)
17/01/24 08:02:04 ERROR RBackendHandler: getJavaSparkContext on org.apache.spark.sql.api.r.SQLUtils failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
This question is similar to sparkR on AWS: Unable to load native-hadoop library.
You don't need to use sc if you are using the latest version of Spark. I am using the SparkR package, version 2.0.0, in RStudio. Please go through the following code, which connects the R session to a SparkR session:
if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
Sys.setenv(SPARK_HOME = "path-to-spark home/spark-2.0.0-bin-hadoop2.7")
}
library(SparkR)
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R","lib")))
sparkR.session(enableHiveSupport = FALSE,master = "spark://master url:7077", sparkConfig = list(spark.driver.memory = "2g"))
The following is the output of the R console:
> data<-as.data.frame(iris)
> class(data)
[1] "data.frame"
> data.df<-as.DataFrame(data)
> class(data.df)
[1] "SparkDataFrame"
attr(,"package")
[1] "SparkR"
Alternatively, use this example code, which relies on the older sparkR.init() API:
library(SparkR)
library(readr)
sc <- sparkR.init(appName = "data")
sqlContext <- sparkRSQL.init(sc)
old_df<-read_csv("/home/mx/data.csv")
old_df<-data.frame(old_df)
new_df <- createDataFrame(sqlContext, old_df)
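With either approach, a quick sanity check confirms the conversion worked:
class(new_df)   # expect "SparkDataFrame"
head(new_df)    # prints the first rows via Spark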

R stm package error: "vectorized sources must have a positive length entry"

I think I'm making a pretty simple mistake, but I'm a rookie at R and am having a hard time figuring it out. I'm trying to use the stm package in R to do some topic modeling on a dataset of tweets I scraped.
The dataset has two columns: one with the name of the tweet sender, under the column header "meta", and the other with the vocabulary of the tweet, under the column header "vocab". After running the script below, I get the following errors:
Error: is.Source(s) is not TRUE
In addition: Warning message:
In is.Source(s) : vectorized sources must have a positive length entry
library(stm)
library(igraph)
setwd("c:/Users/Adam/Desktop/RTwitter")
data <-read.csv("TweetDataSTM.csv")
processed <- textProcessor(data$documents, metadata = data)
out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
docs <- out$documents
vocab <- out$vocab
meta <-out$meta
> library(stm)
> library(igraph)
> setwd("c:/Users/Adam/Desktop/RTwitter")
>
> rm(list=ls())
>
> data <-read.csv("TweetDataSTM.csv")
> processed <- textProcessor(data$documents, metadata = data)
Building corpus...
Error: is.Source(s) is not TRUE
In addition: Warning message:
In is.Source(s) : vectorized sources must have a positive length entry
> out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
Error in prepDocuments(processed$documents, processed$vocab, processed$meta) :
object 'processed' not found
> docs <- out$documents
Error: object 'out' not found
> vocab <- out$vocab
Error: object 'out' not found
> meta <-out$meta
Error: object 'out' not found
(Any advice would be greatly appreciated!)
-Adam
I think your mistake occurs because your columns are named vocab and meta. But here
processed <- textProcessor(data$documents, metadata = data)
you are trying to call a column documents that, as far as I can see, does not exist in your data.frame. Try changing the code to:
processed <- textProcessor(data$vocab, metadata = data)
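A short sketch of the corrected pipeline under that assumption; names() lets you confirm which columns actually exist before textProcessor() is called:
library(stm)
data <- read.csv("TweetDataSTM.csv", stringsAsFactors = FALSE)
names(data)   # expect "meta" and "vocab"; there is no "documents" column
processed <- textProcessor(data$vocab, metadata = data)
out   <- prepDocuments(processed$documents, processed$vocab, processed$meta)
docs  <- out$documents
vocab <- out$vocab
meta  <- out$meta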

Recommenderlab running into memory issues

I am trying to compare some recommender algorithms against each other, but I am running into memory issues. The dataset I am using is https://drive.google.com/open?id=0By5yrncwiz_VZUpiak5Hc2l3dkE
The following is my code:
library(recommenderlab)
library(Matrix)
Amazon <- read.csv("Path to Reviews.csv", header = TRUE,
                   col.names = c("ID","ProductId","UserId","HelpfulnessNumerator","HelpfulnessDenominator","Score",
                                 "Time","Summary","Text"),
                   colClasses = c("NULL","character","character","NULL","NULL","integer","NULL","NULL","NULL"))
Amazon <- Amazon[,c("UserId","ProductId","Score")]
Amazon <- Amazon[!duplicated(Amazon[1:2]),] ## To get unique user-product pairs
r <- as(Amazon, "realRatingMatrix") ## Coerce to a sparse rating matrix (presumably omitted from the original post)
scheme <- evaluationScheme(r, method = "split", train = .7,
                           k = 1, given = 1, goodRating = 4)
algorithms <- list(
"user-based CF" = list(name="UBCF", param=list(normalize = "Z-score",
method="Cosine",
nn=50, minRating=3)),
"item-based CF" = list(name="IBCF", param=list(normalize = "Z-score"
))
)
results <- evaluate(scheme, algorithms, n=c(1, 3, 5))
I get the following errors:
UBCF run fold/sample [model time/prediction time]
1 Timing stopped at: 1.88 0 1.87
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
IBCF run fold/sample [model time/prediction time]
1 Timing stopped at: 4.93 0.02 4.95
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
Warning message:
In .local(x, method, ...) :
Recommender 'user-based CF' has failed and has been removed from the results!
Recommender 'item-based CF' has failed and has been removed from the results!
I tried to use the recommenderlabrats package, which I thought would solve this problem, but I could not install it: https://github.com/sanealytics/recommenderlabrats
It gave me some errors which I am not able to make sense of:
c:/rbuildtools/3.3/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i686-w64-mingw32/bin/ld.exe: cannot find -llapack
collect2: ld returned 1 exit status
Then I came to this link for solving the recommenderlabrats problem, but it did not work for me: Error while installing package from github in R. Error in dyn.load
Any help on how to get around the memory issue is appreciated.
I am the author of recommenderlabrats. Try to install it now; it should be fixed. Then use RSVD/ALS to solve the problem. Your matrix is too big for your computer, even though it is sparse.
Also, it might be a good idea to experiment with a smaller sample before spending on an AWS memory instance.
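A hedged sketch of that suggestion, assuming a recommenderlab version recent enough to register a latent-factor recommender such as "ALS" (these avoid the dense-matrix coercion that triggers the Cholmod error); the sample size is illustrative and simply keeps the evaluation within memory.
library(recommenderlab)
r <- as(Amazon, "realRatingMatrix")      # sparse user x item rating matrix
r_small <- r[sample(nrow(r), 10000), ]   # illustrative subsample of users
scheme <- evaluationScheme(r_small, method = "split", train = 0.7,
                           k = 1, given = 1, goodRating = 4)
results <- evaluate(scheme,
                    list("ALS" = list(name = "ALS", param = NULL)),
                    n = c(1, 3, 5))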
