Retrieving PubMed IDs and titles: HTTP error

I have written R code to fetch PubMed IDs and titles using the journal name, date, volume, issue, and page number.
The file contains rows like
AAPS PharmSci 2000 2 1 E2
AAPS PharmSci 2004 6 1 1-9
And the output I want is like:
AAPS PharmSci 2000 2 1 E2, 11741218 , Molecular modeling of G-protein coupled receptor kinase 2: docking and biochemical evaluation of inhibitors.
and similarly for every row in the file.
The R code I have written for this is:
library(RISmed)

search_topic <- "search term"
search_query <- EUtilsSummary(search_topic)
# summary(search_query)
# see the IDs of our returned query
ID <- QueryId(search_query)
# get the actual records from PubMed
records <- EUtilsGet(search_query)
# store IDs and titles together and write them out
pubmed_data <- data.frame(ID, Title = ArticleTitle(records))
write.csv(pubmed_data, file = paste0("./", search_topic, ".csv"))
This gives an error like:
In addition: Warning message:
In file(con, "r") : cannot open: HTTP status was '502 Server Hangup'
Please let me know where I am going wrong.
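For what it's worth, the 502 ("Server Hangup") comes from NCBI's E-utilities server rather than from the R code, so the query itself may well be fine and simply retrying after a pause can succeed. A minimal retry sketch around the same RISmed call (fetch_with_retry is a hypothetical helper, not part of RISmed):
library(RISmed)
# Retry EUtilsSummary() a few times, since a 502 usually signals a
# transient server-side failure at NCBI rather than a bad request.
fetch_with_retry <- function(topic, tries = 5, wait = 10) {
  for (i in seq_len(tries)) {
    result <- tryCatch(EUtilsSummary(topic), error = function(e) NULL)
    if (!is.null(result)) return(result)
    Sys.sleep(wait)  # pause before the next attempt
  }
  stop("E-utilities request still failing after ", tries, " attempts")
}
search_query <- fetch_with_retry("search term")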

Download zip file to R when download link ends in '/download'

My issue is similar to this post, but the suggested solution does not appear applicable.
I have a lot of zipped data stored on an online server (B2Drop) that provides a download link ending in "/download" instead of ".zip". I have been unable to get the method described here to work.
I have created a test download page https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq, where the download link https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download can be obtained by right clicking the download button. Here is my script:
temp <- tempfile()
download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download", temp, mode = "wb")
data <- read.table(unz(temp, "Test_file1.csv"))
unlink(temp)
When I run it, I get the error:
download.file("https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download",temp, mode="wb")
trying URL 'https://b2drop.eudat.eu/s/K9sPPjWz3jxtXEq/download'
Content type 'application/zip' length 558 bytes
downloaded 558 bytes
data <- read.table(unz(temp, "Test_file1.csv"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
cannot locate file 'Test_file1.csv' in zip file 'C:\Users\User_name\AppData\Local\Temp\RtmpMZ6gXi\file3e881b1f230e'
which typically indicates a problem with the directory where R is looking for the file; in this case that should be the temporary directory.
Your internal path is wrong. You can use list=TRUE to list the files in the archive, analogous to the command-line unzip utility's -l argument.
unzip(temp, list=TRUE)
# Name Length Date
# 1 Test/Test_file1.csv 256 2021-09-27 10:13:00
# 2 Test/Test_file2.csv 286 2021-09-27 10:14:00
Rather than read.table, though, use read.csv, since the file is comma-delimited.
data <- read.csv(unz(temp, "Test/Test_file1.csv"))
head(data, 3)
# ID Variable1 Variable2 Variable Variable3
# 1 1 f 54654 25 t1
# 2 2 t 421 64 t2
# 3 3 x 4521 85 t3
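If you need every file in the archive, a small sketch (reusing the same temp handle and assuming the listing shown above) reads them all into a named list:
# list the archive contents, then read each CSV through unz()
contents <- unzip(temp, list = TRUE)$Name
data_list <- lapply(contents, function(f) read.csv(unz(temp, f)))
names(data_list) <- basename(contents)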

R: trouble assigning values to a dynamic variable in a dataframe

I am trying to assign values to a dataframe column chosen by the user. The user specifies the name of the column, let's call this x, in the dataframe df. For simplicity, I want to assign a value of 3 to everything in the column the user specifies. The simplified code is:
variableName <- paste("df$", x, sep="")
eval(parse(text=variableName)) <- 3
But I get an error:
Error in file(filename, "r") : cannot open the connection
In addition: Warning message:
In file(filename, "r") :
cannot open file 'df$x': No such file or directory
I've tried all kinds of remedies to no avail. If I simply try to print the values of the column:
eval(parse(text=variableName))
I get no errors and it prints out ok. It's only when I try to give that column a value that I get the error. Any help would be appreciated.
I believe the issue is that there is no way to use the result of eval() on the LHS of an assignment.
df = data.frame(foo = 1:5,
                bar = -3)
x = "bar"
variableName <- paste("df$", x, sep="")
eval(parse(text=variableName)) <- 3
#> Warning in file(filename, "r"): cannot open file 'df$bar': No such file or
#> directory
#> Error in file(filename, "r"): cannot open the connection
## This error is a bit misleading. Breaking it apart I get a different error.
eval(expression(df$bar)) <- 3
#> Error in eval(expression(df$bar)) <- 3: could not find function "eval<-"
## And it works if you put it all in the string to be parsed.
ex1 <- paste0("df$", x, "<-3")
eval(parse(text=ex1))
df
#> foo bar
#> 1 1 3
#> 2 2 3
#> 3 3 3
#> 4 4 3
#> 5 5 3
## But I doubt that's the best way to do it!
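The usual way to avoid eval/parse entirely is to index the data frame with the column name string; in base R both [[ and [ accept a character name:
# assign by name: no parsing needed
df[[x]] <- 3
# or equivalently
df[, x] <- 3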

Extraction of posts from Facebook using the Rfacebook package

I succeeded in getting the text of the posts and the share and like counts.
However, I am not able to get the likes of the comments associated with a post. If this information is not available, I would like to merge the like count of the post onto each comment.
Example: a post gets 900 likes and 80 comments. I would like to associate the 900-like value with each of the comments (in a new column called post_like, maybe).
I would like to use this information to perform a sentiment analysis using the number of likes (including reactions, i.e. haha, sad, ...) in a logistic regression, with the frequencies of the most frequent words as the x variables.
Here is my script so far:
library(Rfacebook)
library(reshape2)  # melt() comes from reshape2

token <- "**your token, get it at https://developers.facebook.com/tools/explorer/**"

# Function to download the comments of post i
download.post <- function(i, refetch = FALSE, path = ".") {
  post <- getPost(post = fb_page$id[i], comments = TRUE, likes = TRUE, token = token)
  as.data.frame(melt(post))
}
#----------------------- Request posts --- ALL
# Get posts for ALL
fb_page <- getPage(page = "**the page number you want**", token = token,
                   since = '2010/01/01', until = '2016/01/01', n = 10000, reactions = TRUE)
fb_page$order <- 1:nrow(fb_page)
# Apply the function to download the comments
files <- data.frame(melt(lapply(fb_page$order, download.post)))
# Select only rows that have a comment message
files_c <- files[complete.cases(files$message), ]
So basically I get the page with the post IDs and create a function to get the posts for those IDs on that page.
As you can see, I get all the information I need BESIDES the like and share counts.
I hope I am clear; thanks a lot for your help.
It's all there:
library(Rfacebook)
token <- "#############" # https://developers.facebook.com/tools/explorer
fb_page <- getPage(page="europeanparliament", token=token, n = 3)
transform(
  fb_page[, c("message", "likes_count", "comments_count", "shares_count")],
  message = sapply(message, toString, width = 30)
)
# message likes_count comments_count shares_count
# 1 This week members called o.... 92 73 21
# 2 Today we're all Irish, bea.... 673 133 71
# 3 European citizens will mee.... 1280 479 71
packageVersion("Rfacebook")
# [1] ‘0.6.12’
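As for attaching the post's like count to each comment (the post_like column described above), here is a sketch assuming getPost's usual return value, a list whose $post and $comments elements are data frames:
post <- getPost(post = fb_page$id[1], comments = TRUE, token = token)
comments <- post$comments
# repeat the post's like count on every comment row
comments$post_like <- post$post$likes_count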

Calculating walking distance using Google Maps in R

I've been trying to get the distance between a list of home postcodes and a list of school postcodes for approximately 2,000 students. I'm using the gmapsdistance package within R to get this from the Google Maps Distance Matrix API. I've put in a valid API key, which I've replaced in the following code for security reasons.
library(gmapsdistance)
set.api.key("valid API key")
results <- gmapsdistance(origin = school$HomePostcode,
                         destination = school$SchoolPostcode,
                         mode = "walking",
                         shape = "long")
However, this gives the following error:
Error in function (type, msg, asError = TRUE) :
Unknown SSL protocol error in connection to maps.googleapis.com:443
Looking on the Google APIs website, it looks like it hasn't run the query for all the data; it says there were only 219 requests. I know I'm limited in how many requests I can make in one day, but the limit is 2,500 and it's not even letting me get close to that.
I've tried running the code on one pair of postcodes, like below:
test <- gmapsdistance(origin = "EC4V+5EX",
                      destination = "EC4V+3AL",
                      mode = "walking",
                      shape = "long")
This gives the following, as I would expect:
$Time
[1] 384
$Distance
[1] 497
$Status
[1] "OK"
My data looks something like this; I've anonymised it and removed all variables that aren't needed. There are 1,777 pairs of postcodes.
head(school)
HomePostcode SchoolPostcode
1 EC4V+5EX EC4V+3AL
2 EC2V+7AD EC4V+3AL
3 EC2A+1WD EC4V+3AL
4 EC1V+3QG EC4V+3AL
5 EC2N+2PT EC4V+3AL
6 EC1M+5QA EC4V+3AL
I do not have enough reputation to comment, but have you tried setting the parameter combinations to "pairwise"? If set to "all", it will compute all the combinations between each origin and every destination.
library(gmapsdistance)
from <- c("EC4V+5EX", "EC2V+7AD", "EC2A+1WD", "EC1V+3QG", "EC2N+2PT", "EC1M+5QA")
to <- c("EC4V+3AL", "EC4V+3AL", "EC4V+3AL", "EC4V+3AL", "EC4V+3AL", "EC4V+3AL")
test <- gmapsdistance(origin = from,
                      destination = to,
                      combinations = "pairwise",
                      key = "YOURAPIKEYHERE",
                      mode = "walking")
test$Distance
or de Distance
1 EC4V+5EX EC4V+3AL 497
2 EC2V+7AD EC4V+3AL 995
3 EC2A+1WD EC4V+3AL 2079
4 EC1V+3QG EC4V+3AL 2492
5 EC2N+2PT EC4V+3AL 1431
6 EC1M+5QA EC4V+3AL 1892
With this small set of 6 destinations it works. I have an API key, so if you send me a bigger set I can try.
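For the full 1,777 pairs, one untested sketch (assuming the API key has already been set with set.api.key, as in the question) is to send the requests in chunks, so a single transient SSL error does not abort the whole run:
# split the row indices into chunks of 100 pairs
chunks <- split(seq_len(nrow(school)), ceiling(seq_len(nrow(school)) / 100))
results <- lapply(chunks, function(i) {
  res <- gmapsdistance(origin = school$HomePostcode[i],
                       destination = school$SchoolPostcode[i],
                       combinations = "pairwise",
                       mode = "walking")
  Sys.sleep(1)  # brief pause between chunks
  res$Distance
})
distances <- do.call(rbind, results)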
Another option would be to use the googleway package, which also allows you to set an API key. Example:
library(googleway)
test <- google_distance(origins = from,
                        destinations = to,
                        mode = "walking",
                        key = "YOURAPIKEYHERE")

Cannot coerce class .... to a data.frame error

I get a 'cannot coerce class "c("summary.turnpoints", "turnpoints")" to a data.frame' error when trying to save the summary to a file. I have tried to fix it with as.data.frame, with no success.
Code:
library(plyr)
library(pastecs)
data <- read.table("C:\\Users\\Ron\\Desktop\\dataset.txt", header=F, col.name="A")
data.tp=turnpoints(data$A)
print(data.tp)
Turning points for: data$A
nbr observations : 5990
nbr ex-aequos : 51
nbr turning points: 413 (first point is a pit)
E(p) = 3992 Var(p) = 1064.567 (theoretical)
data.sum=summary(data.tp)
print(data.sum)
point type proba info
1 11 pit 7.232437e-15 46.97444
2 21 peak 7.594058e-14 43.58212
3 30 pit 3.479857e-27 87.89303
4 51 peak 5.200612e-29 93.95723
5 62 pit 7.594058e-14 43.58212
6 70 peak 6.213321e-14 43.87163
7 81 pit 6.276081e-16 50.50099
8 91 peak 5.534016e-23 73.93602
.....................................
write.table(data.sum, file = "C:\\Users\\Ron\\Desktop\\datasetTurnP.txt")
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class "c("summary.turnpoints", "turnpoints")" to a data.frame
In addition: Warning messages:
1: package ‘plyr’ was built under R version 3.0.1
2: package ‘pastecs’ was built under R version 3.0.1
How can I save these summary results to a text file?
Thank you.
Look at the Value section of:
?pastecs::summary.turnpoints
It should be clear that the object is not a set of equal-length vectors, which is what a data.frame requires. Hence the error message. So rather than asking for the impossible, ... tell us what you wanted to save.
It's actually not impossible, just not possible with write.table, since the summary is not a data frame. The dump function will write an ASCII representation of the object's structure() form to a file. Note that dump takes the object's name as a string:
dump("data.sum", file = "dump_data_sum.asc")
That file could then be source()-ed to recreate the object.
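Alternatively, if what you want to save is the table that print(data.sum) displays, it can be rebuilt from the object's documented components (tppos, proba, info and firstispeak; this is a sketch based on the component names in ?turnpoints):
# rebuild the printed turning-point table as a plain data frame
n <- length(data.tp$tppos)
types <- if (data.tp$firstispeak) c("peak", "pit") else c("pit", "peak")
tp_df <- data.frame(point = data.tp$tppos,
                    type  = rep(types, length.out = n),
                    proba = data.tp$proba,
                    info  = data.tp$info)
write.table(tp_df, file = "C:\\Users\\Ron\\Desktop\\datasetTurnP.txt")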
