"Error: not compatible with STRSXP" on submit_form with rvest - r

I've searched around stackoverflow and github but haven't seen a solution to this one.
session <- read_html("http://www.whitepages.com")
form1 <- html_form(session)[[1]]
form2 <- set_values(form1, who = "john smith")
submit_form(session, form2)
After the submit form line, I get the following:
Submitting with '<unnamed>'
Error: not compatible with STRSXP
I've pieced together that this error is usually from mismatched types (strings and numeric, for example), but I can't tell where that might be happening.
Any help would be greatly appreciated!

I just had this problem myself, and I found that the error was happening when submit_form() called the function rvest:::submit_request(), which tries to run this line:
xml2::url_absolute(form$url, session$url)
In this line, R tries to create an absolute url which throws an error because either form$url or session$url is NULL. In my case, session$url was NULL for some reason. So you should probably try:
session$url <- "http://www.whitepages.com"
submit_form(session, form2)
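One plausible reason session$url is NULL here: read_html() returns a parsed document rather than a browsing session, so no URL is stored on it. In rvest versions of that era, html_session() is the function that creates a session. The sketch below uses only xml2 and an inline HTML string (an assumption, so that it runs without network access) to show the class of the parsed object and the URL resolution step that submit_form() relies on.

```r
library(xml2)

# Inline markup stands in for the real page (assumption: avoids a network call).
doc <- read_html("<html><body><form action='/search'></form></body></html>")

# read_html() yields a parsed document, not a session.
is_document <- inherits(doc, "xml_document")

# The resolution step named in the answer, with both arguments present.
resolved <- url_absolute("/search", "http://www.whitepages.com")

print(is_document)
print(resolved)
```

If creating the session with html_session("http://www.whitepages.com") is an option, that would presumably make the manual session$url assignment unnecessary.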

Try setting the URL of your form to an empty string before submitting it:
form2$url <- ""

Related

R function to parse returning error in strsplit "subscript out of bounds"

I'm using R to extract domain names for a column of HTML pages. I created a function, domain, to do so. It seems to work fine until it hits pages that came in as "mailto: person@example.com". These are obviously email links. I still want to keep these in my dataset, but I get this error:
Error in strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]] : subscript out of bounds
How can I modify this code to get around the "mailto" pages?
This is my function:
domain <- function(x) strsplit(gsub("http://|https://|www\\.","", x),"/")[[c(1,1)]]
This is my command:
mainpagelevel3$url <- sapply(mainpagelevel3$url, domain)
I ran this code on a set of URLs that did not include a "mailto:" page and it worked just fine, so I think this must be where it's getting stuck. I don't mind whether the result is "person@example.com" or the string stays as is.
We could add an if condition that checks for strings which start with "mailto" and contain "@" (this can be made stricter if needed). The function might then look like:
domain <- function(x) {
  if (grepl("^mailto:.*@.*", x)) x
  else strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
}
and then use sapply as usual
mainpagelevel3$url <- sapply(mainpagelevel3$url, domain)
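As a quick check of the idea, here is a minimal sketch that applies a slightly looser variant of the function (it passes through anything starting with "mailto:") to a few sample strings. The sample URLs are hypothetical and not taken from the asker's data, and vapply() is used in place of sapply() only to guarantee a character result.

```r
# Variation on the answer's function: pass any mailto: link through untouched.
domain <- function(x) {
  if (grepl("^mailto:", x)) {
    return(x)
  }
  strsplit(gsub("http://|https://|www\\.", "", x), "/")[[c(1, 1)]]
}

# Hypothetical sample input.
urls <- c(
  "http://www.example.com/about/team",
  "https://example.org/",
  "mailto:person@example.com"
)

# One domain (or untouched mailto link) per input string.
domains <- vapply(urls, domain, character(1), USE.NAMES = FALSE)
print(domains)
```

An empty string would still fail in the strsplit() branch, so a stricter version might also need to guard against empty input.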

url_absolute error "not compatible with STRSXP" when using submit_form

I am trying to scrape the http://www.emedexpert.com/lists/brand-generic.shtml web page for brand and generic drug names:
library(httr)
library(rvest)
session <- read_html("http://www.emedexpert.com/lists/brand-generic.shtml")
form1 <- html_form(session)[[2]]
form2 <- set_values(form1, brand = "tylenol")
submit_form(session, form2)
However, this results in the error message:
Error in xml2::url_absolute(form$url, session$url) :
not compatible with STRSXP
Therefore, based on this answer to the same error message ("Error: not compatible with STRSXP" on submit_form with rvest) I added a session$url as follows:
session$url <- "http://www.emedexpert.com/lists/brand-generic.shtml" # added from S.Ov
but I still get the same error message. So I also tried setting form2$url to various values, such as these:
form2$url <- "http://www.emedexpert.com/lists/brand-generic.shtml"
form2$url <- ""
form2$url <- "/"
submit_form(session, form2)
At this point, the error message goes away and I get back a web page that contains most of the desired content. However, it seems to completely lack the table of brand and generic names.
Any suggestions?
Yes @hackR, RSelenium is not always the answer.
library(rvest)
url<-"http://www.emedexpert.com/lists/bg.php?myc"
page<-html_session(url)
table<-html_table(read_html(page))[[1]]
I hope this helps.

rvest::set_values() returning error

I'm trying to use the set_values() function to insert a company name using this website:
https://www.unternehmensregister.de/ureg/search1.4.html
Unfortunately, after
search <- html_form(read_html("https://www.unternehmensregister.de/ureg/search1.4.html"))[[1]]
the command
set_values(search, searchRegisterForm:companyPublicationsCompanyName - "Daimler")
gives an error.
Error in set_values(search, searchRegisterForm:companyPublicationsCompanyName -  :
  object 'searchRegisterForm:companyPublicationsCompanyName' not found
It would be great if someone can help me with that!
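A likely reading of this error, offered as a guess and not a confirmed fix: the call uses - where = was probably intended, so R evaluates searchRegisterForm:companyPublicationsCompanyName - "Daimler" as an expression and looks for an object with that name. Because the field name contains a colon, it also needs backticks to be usable as an argument name. The sketch below uses a hypothetical stand-in function, collect_args(), in place of set_values(), so it runs without rvest or the website.

```r
# Hypothetical stand-in for set_values(): returns its named arguments as a list.
collect_args <- function(...) list(...)

# Backticks let a name containing ':' serve as an argument name; '=' assigns it.
values <- collect_args(`searchRegisterForm:companyPublicationsCompanyName` = "Daimler")
print(names(values))
```

With rvest loaded, the corresponding call would presumably be set_values(search, `searchRegisterForm:companyPublicationsCompanyName` = "Daimler"), assuming the form really has a field with that name.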

Overcoming "Error: unexpected input" in RCurl

I am trying and failing to use RCurl to automate the process of fetching a spreadsheet from a web site, China Labour Bulletin's Strike Map.
Here is the URL for the spreadsheet with the options set as I'd like them:
http://strikemap.clb.org.hk/strikes/api.v4/export?FromYear=2011&FromMonth=1&ToYear=2015&ToMonth=6&_lang=en
Here is the code I'm using:
library(RCurl)
temp <- tempfile()
temp <- getForm("http://strikemap.clb.org.hk/strikes/api.v4/export",
FromYear="2011", FromMonth="1",
ToYear="2015", ToMonth="6",
_lang="en")
And here is the error message I get in response:
Error: unexpected input in:
" ToYear=2015, ToMonth=6,
_"
Any suggestions on how to get this to work?
Try enclosing _lang in backticks.
temp <- getForm("http://strikemap.clb.org.hk/strikes/api.v4/export",
FromYear="2011",
FromMonth="1",
ToYear="2015",
ToMonth="6",
`_lang`="en")
R has trouble with an argument name that starts with an underscore, because such a name is not syntactically valid unless it is quoted. This worked for me.
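To see the naming rule in isolation, here is a small base R sketch. The function collect_params() is a hypothetical stand-in for getForm(), used so that the example runs without RCurl or a network connection.

```r
# Hypothetical stand-in for getForm(): returns its named arguments as a list.
collect_params <- function(...) list(...)

# The backtick-quoted name is accepted and kept verbatim.
params <- collect_params(FromYear = "2011", ToMonth = "6", `_lang` = "en")
print(names(params))

# make.names() shows how R would rewrite the name to make it syntactic.
print(make.names("_lang"))
```

The parameter name reaches the function unchanged, which is what the query string needs.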

rJava - .jcall calling issue: method with signature not found

I have been trying for a few days to call a method in a Java class with rJava, and I have not yet figured out what I am doing wrong. Maybe someone here has some clues for me.
The situation looks like this:
I load the library and initialize an object (that works fine):
library(rJava)
.jinit('C:/javatemp/worker.jar')
jobject <- .jnew("worker.concrete")
I list the methods and get the expected result:
.jmethods(jobject)
> [1] "public java.util.List worker.concrete.lookup(java.lang.CharSequence)"
I prepare the input structure, which also works fine:
word <- .jnew("java/lang/String", "a word")
input = .jcast(word, "java/lang/CharSequence", check = TRUE)
However, when I try to execute the method, I get an error saying that no such method exists:
out = .jcall(jobject,"Ljava/util/List","lookup",input)
> Error in .jcall(jobject, "Ljava/util/List", "lookup", input) :
method lookup with signature (Ljava/lang/CharSequence;)Ljava/util/List not found
Does anyone have an idea how to call such method?
Sorry for answering an old question, but this has bugged me as well for some time. The answer is: ;
The format of type specification for non-primitive return types is Lpackage/subpackage/Type; - it has to end with a semicolon. So in the example above, you would need:
out = .jcall(jobject,"Ljava/util/List;","lookup",input)
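The signature format can be sketched with a small helper that turns a dotted Java class name into the JNI form. The function jni_object_sig() is hypothetical and not part of rJava; it only illustrates the L...; pattern described above.

```r
# Hypothetical helper (not part of rJava): build a JNI object type signature
# from a dotted Java class name, e.g. java.util.List.
jni_object_sig <- function(class_name) {
  paste0("L", gsub(".", "/", class_name, fixed = TRUE), ";")
}

return_sig <- jni_object_sig("java.util.List")
print(return_sig)
```

The resulting string could then be passed as the return type argument of .jcall().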
