Flatten facet.pivot from solr query in R

I'm trying to flatten a facet.pivot from a Solr query.
I came across this page: https://rdrr.io/github/ropensci/solr/man/pivot_flatten_tabular.html#heading-1
which says there is a function (pivot_flatten_tabular) that does this, but after installing the solrium package the function does not appear.
Any ideas why it is not working?

Maintainer of solrium here: I changed the package name from solr to solrium. The pivot_flatten_tabular function is still there; it's just not exported. You can access it with the triple colon: solrium:::pivot_flatten_tabular.
Here's an example facet pivot query with solrium:
library(solrium)

# connect to the public PLOS Solr endpoint
cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL)
solr_facet(cli, params = list(q = 'alcohol', facet.pivot = 'journal,subject',
                              facet.pivot.mincount = 10))
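A minimal sketch of calling the unexported helper on the pivot part of the response (the exact slot name and the helper's signature are assumptions here; inspect str(res) and solrium:::pivot_flatten_tabular itself first):

library(solrium)

cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL)
res <- solr_facet(cli, params = list(q = 'alcohol',
                                     facet.pivot = 'journal,subject',
                                     facet.pivot.mincount = 10))

# inspect where the raw pivot data landed in this version of solrium
str(res$facet_pivot)

# unexported, so reach in with the triple colon
# (assumption: the helper accepts the pivot element directly)
flat <- solrium:::pivot_flatten_tabular(res$facet_pivot)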

Related

Obtaining Metadata Information using ee_print function from RGEE

I am using the package RGEE (R wrapper for the Google Earth Engine Python API). The function ee_print() seems to work perfectly for an ImageCollection of just one variable, but seems to fail for an ImageCollection with different variables, where one needs to select the variable of interest. Any ideas on how to approach this issue with the latter kind of data?
Here's an example code:
library(rgee)
ee_Initialize()  # authenticate and start the Earth Engine session

GRIDMET <- ee$ImageCollection("IDAHO_EPSCOR/GRIDMET")
ee_print(GRIDMET)
Where I get the following error message in return:
Error in strsplit(code, ":") : non-character argument
Have you considered the following approach?
GRIDMET <- ee$ImageCollection("IDAHO_EPSCOR/GRIDMET")
print(GRIDMET, type = getOption("rgee.print.option"))
And play with the list of all metadata properties:
GRIDMET$propertyNames()$getInfo()  # get a list of all metadata properties
GRIDMET$get("product_tags")$getInfo()  # show a single property, e.g. "product_tags"
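If the goal is still to use ee_print() on the multi-variable collection, a common workaround is to select a single band first and print the reduced collection (a sketch; "pr" is one of GRIDMET's band names, which you can list beforehand):

# list the band (variable) names in the collection
GRIDMET$first()$bandNames()$getInfo()

# select a single variable, then ee_print() has only one band to describe
pr <- GRIDMET$select("pr")  # "pr" = precipitation in GRIDMET
ee_print(pr)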

STRINGdb R environment; error in plot_network

I'm trying to use STRINGdb in R and I'm getting the following error when I try to plot the network:
Error in if (grepl("The document has moved", res)) { : argument is of length zero
code:
library(STRINGdb)

# specify organism (human, NCBI taxonomy ID 9606)
string_db <- STRINGdb$new(version = "10", species = 9606, score_threshold = 0)

filt_mapped <- string_db$map(filt, "GeneID", removeUnmappedRows = TRUE)
head(filt_mapped)
# (columns: GeneID, logFC, FDR, STRING_id, with 156 rows)

filt_mapped_hits <- filt_mapped$STRING_id
head(filt_mapped_hits)
# (156 observations)

string_db$plot_network(filt_mapped_hits, add_link = FALSE)
Error in if (grepl("The document has moved", res)) { : argument is of length zero
You are using a version of Bioconductor, and by extension the STRINGdb package, that is a few years old.
If you update to the newest one, it will work. However, the updated package supports only the latest version of STRING (currently version 11), so the underlying network may change a bit.
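For reference, a minimal update sketch (assuming Bioconductor >= 3.8, which installs packages through BiocManager; note the version argument becomes "11" with the updated package):

# update the Bioconductor installer, then STRINGdb itself
install.packages("BiocManager")
BiocManager::install("STRINGdb")

library(STRINGdb)
string_db <- STRINGdb$new(version = "11", species = 9606, score_threshold = 0)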
The more detailed reason is this:
STRING's hardware infrastructure recently underwent major changes, which forced a different server setup.
All the old calls are now forwarded to a different URL; however, the cURL call, as it was implemented, does not follow our redirects, which breaks the STRINGdb package functionality.
We cannot update the old Bioconductor package, and our server setup can't really be changed.
That said, the fix for an old version is relatively simple.
In the STRINGdb library there is a script with all the methods, "rstring.r".
In there you'll find the "get_png" method. In it, replace this line:
urlStr = paste("http://string-db.org/version_", version, "/api/image/network", sep="" )
With this line:
urlStr = paste("http://version", version, ".string-db.org/api/image/network", sep="" )
Load the library again and it should create the PNG, as before.
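As a quick sanity check of the patched URL shape, you can evaluate the replacement line on its own (version "10" as in the question):

version <- "10"
paste("http://version", version, ".string-db.org/api/image/network", sep = "")
# [1] "http://version10.string-db.org/api/image/network"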

Error when using *whatNWISdata* function within *dataRetrieval* package from USGS: All components of query must be named

I am trying to use the whatNWISdata function to retrieve all available data for specific USGS sites.
I get the following error after trying to execute the function:
library(dataRetrieval)

siteNo <- "09508300"
dailyDataAvailable <- whatNWISdata(siteNo, service = "dv", parameterCd = "00060",
                                   statCd = "00003")
Yields:
Error: All components of query must be named
Although I am using the function as recommended on rdocumentation.org and CRAN, I get the same error. My RStudio is updated to the latest version, and so is the dataRetrieval package that this function is part of.
This error came up as a GitHub issue in 2016, where downgrading the httr package was recommended, but httr has been updated since that question was asked and the issue appears to have been resolved in the update.
Thanks!
Thank you so much for your input, @Dragonthoughts!
I emailed the author, and apparently they had made some slight edits to the format of the input values; you are correct that the first argument needed to be named, as with the other inputs to the function.
For anyone else having this problem, it works when you name the first argument:
dailyDataAvailable <- whatNWISdata(siteNumber = "09508300", service = "dv", parameterCd = "00060", statCd = "00003")
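As a complete, self-contained version of the working call (parameter meanings per the standard USGS codes; worth double-checking for your own use case):

library(dataRetrieval)

dailyDataAvailable <- whatNWISdata(siteNumber = "09508300",  # USGS site number
                                   service = "dv",           # daily values
                                   parameterCd = "00060",    # discharge, cubic feet per second
                                   statCd = "00003")         # daily mean statistic
head(dailyDataAvailable)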

rquery: Connect to specific schema in Postgres DB

The rquery package has been out for some time now, but the documentation is still very sparse. There isn't even a tag for it on SO yet; this question will create it.
Maybe there is someone who can help me nevertheless.
I want to connect to a schema in my Postgres DB via rquery to read the data into R with all the speed it promises.
Using this code, it works with all the tables in the public schema.
library(RPostgres)
library(rquery)

con <- dbConnect(RPostgres::Postgres(),
                 host = #####,
                 dbname = #####,
                 user = #####,
                 password = ######)

df <- db_td(con, "tablename") %.>%
  execute(con, .)
Now, when I want to access a table in a specific schema, db_td() has the argument qualifiers =, which is an
optional named ordered vector of strings carrying additional db hierarchy terms, such as schema
So I did:
db_td(con, "tablename", qualifiers = c(schema = "schema"))
But:
Error in result_create(conn@ptr, statement) : Failed to prepare query:
ERROR: relation "tablename" does not exist
LINE 1: SELECT * FROM "tablename" LIMIT 1
So the qualifiers = argument seems to be completely ignored.
My question is thus pretty basic:
How can I connect to a schema in a Postgres DB via rquery?
All my attempts to solve this "within" rquery seem to fail miserably, but you can work around it by doing something like:
dbExecute(con, "SET search_path = foo_schema, public;")
before you run db_td().
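Putting the workaround together with the code from the question (foo_schema stands in for your schema name):

# widen the search path so unqualified table names resolve to the target schema
dbExecute(con, "SET search_path = foo_schema, public;")

df <- db_td(con, "tablename") %.>%
  execute(con, .)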
I think it's caused by rq_colnames doing:
paste0("SELECT * FROM ", quote_identifier(db, table_name),
       " LIMIT 1")
and hence not doing anything with its qualifiers; at least, this matches the error I get back.
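To make the diagnosis concrete, the probe it sends versus what a schema-aware probe would need looks roughly like this (a sketch; the exact quoting comes from quote_identifier):

# qualifiers dropped: the probe resolves against the default search_path
probe_sent   <- 'SELECT * FROM "tablename" LIMIT 1'
# a schema-aware probe would have to qualify the identifier
probe_needed <- 'SELECT * FROM "schema"."tablename" LIMIT 1'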
Maybe report a bug/issue with rquery if this isn't enough.
I have created an issue on GitHub. So far, regular rquery indeed doesn't have schema ability. The development version of rquery (1.3.4), however, has basic schema ability as of today.
To be installed via:
library(devtools)
install_github("WinVector/rquery", host = "https://api.github.com")
Here's a small set of instructions; it seems to be intended to work just as I was trying in my question.
Be careful though: rquery hasn't been fully tested in schema mode, and some things might not work.
EDIT: rquery now has full schema support.
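With that version installed, the call from the question should then work as intended (a sketch, assuming rquery >= 1.3.4):

df <- db_td(con, "tablename", qualifiers = c(schema = "schema")) %.>%
  execute(con, .)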

Rcrawler package: ContentScraper Error

I have a problem with the ContentScraper function of the Rcrawler package. I would like to extract from this site some information about the times and airports of arrival and departure, and also the price (I took inspiration from this site):
library(Rcrawler)

MY_Data <- ContentScraper(CssPatterns = c(".leg", ".price"), ManyPerPattern = TRUE,
                          Url = "http://www.skyscanner.it/trasporti/voli/rome/lond/180201?adults=1&children=0&adultsv2=1&childrenv2=&infants=0&cabinclass=economy&rtn=0&preferdirects=false&outboundaltsenabled=false&inboundaltsenabled=false&ref=day-view#results")
but I get this error:
Error in LinkExtractor(url = Ur, encod = encod) : object 'Extlinks' not found
I had a look at the LinkExtractor function, but I have no idea why it doesn't find Extlinks, since it should be created by the function itself. Shouldn't it?
Could someone help me?
Thank you!
This website doesn't allow scraping, which may be one reason why your example doesn't work. You can check this on the web. I also recommend trying the rvest package, which is easier to use.
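For comparison, a minimal rvest sketch with the same CSS selectors (illustrative only: the Skyscanner results page is rendered by JavaScript, so a static read_html() call will not return the flight data):

library(rvest)

page   <- read_html("http://www.skyscanner.it/trasporti/voli/rome/lond/180201")
legs   <- page %>% html_nodes(".leg")   %>% html_text()
prices <- page %>% html_nodes(".price") %>% html_text()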
I have tried the same request using Rcrawler + the phantomjs web driver, but with no result; there is some sort of JavaScript protection against unreal sessions:
br <- run_browser()
MY_Data <- ContentScraper(CssPatterns = c(".leg", ".price"), ManyPerPattern = TRUE,
                          Url = "https://www.skyscanner.it/trasporti/voli/rome/lond/?adults=1&children=0&adultsv2=1&childrenv2=&infants=0&cabinclass=economy&rtn=0&preferdirects=false&outboundaltsenabled=false&inboundaltsenabled=false&ref=day-view&oym=1903&selectedoday=01",
                          browser = br, RenderingDelay = 5)
I retrieved the session screenshot, and I can confirm that the JavaScript which loads the results is stuck.
Using RSelenium + headless Chrome (with GPU enabled), I got a robot-check page (see images).
As a result, the only hope of getting the data legitimately is to use their API.
Rcrawler creator
