Search website for phrase in R

I'd like to understand what applications of machine learning are being developed by the US federal government. The federal government maintains the website FedBizOpps, which lists contract opportunities. The site can be searched for a phrase, e.g. "machine learning", and a date range, e.g. "last 365 days", to find relevant contracts. Each search result links to a contract summary.
I'd like to be able to pull the contract summaries, given a search term and a date range, from this site.
Is there any way I can scrape the browser-rendered data into R? A similar question exists on web scraping, but I don't know how to change the date range.
Once the information is pulled into R, I'd like to organize the summaries with a bubble chart of key phrases.

This may look like a site that uses XHR via JavaScript to retrieve the URL contents, but it's not. It's just a plain web site that can easily be scraped via standard rvest & xml2 calls like html_session() and read_html(). It does keep the Location: URL the same, so it kinda looks like XHR even though it's not.
However, this is a <form>-based site, which means you could be generous to the community and write an R wrapper for the "hidden" API and possibly donate it to rOpenSci.
To that end, I used the curlconverter package on the "Copy as cURL" content from the POST request and it provided all the form fields (which seem to map to most — if not all — of the fields on the advanced search page):
library(curlconverter)
make_req(straighten())[[1]] -> req
httr::VERB(verb = "POST", url = "https://www.fbo.gov/index?s=opportunity&mode=list&tab=list",
httr::add_headers(Pragma = "no-cache",
Origin = "https://www.fbo.gov",
`Accept-Encoding` = "gzip, deflate, br",
`Accept-Language` = "en-US,en;q=0.8",
`Upgrade-Insecure-Requests` = "1",
`User-Agent` = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.41 Safari/537.36",
Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
`Cache-Control` = "no-cache",
Referer = "https://www.fbo.gov/index?s=opportunity&mode=list&tab=list",
Connection = "keep-alive",
DNT = "1"), httr::set_cookies(PHPSESSID = "32efd3be67d43758adcc891c6f6814c4",
sympcsm_cookies_enabled = "1",
BALANCEID = "balancer.172.16.121.7"),
body = list(`dnf_class_values[procurement_notice][keywords]` = "machine+learning",
`dnf_class_values[procurement_notice][_posted_date]` = "365",
search_filters = "search",
`_____dummy` = "dnf_",
so_form_prefix = "dnf_",
dnf_opt_action = "search",
dnf_opt_template = "VVY2VDwtojnPpnGoobtUdzXxVYcDLoQW1MDkvvEnorFrm5k54q2OU09aaqzsSe6m",
dnf_opt_template_dir = "Pje8OihulaLVPaQ+C+xSxrG6WrxuiBuGRpBBjyvqt1KAkN/anUTlMWIUZ8ga9kY+",
dnf_opt_subform_template = "qNIkz4cr9hY8zJ01/MDSEGF719zd85B9",
dnf_opt_finalize = "0",
dnf_opt_mode = "update",
dnf_opt_target = "", dnf_opt_validate = "1",
`dnf_class_values[procurement_notice][dnf_class_name]` = "procurement_notice",
`dnf_class_values[procurement_notice][notice_id]` = "63ae1a97e9a5a9618fd541d900762e32",
`dnf_class_values[procurement_notice][posted]` = "",
`autocomplete_input_dnf_class_values[procurement_notice][agency]` = "",
`dnf_class_values[procurement_notice][agency]` = "",
`dnf_class_values[procurement_notice][zipstate]` = "",
`dnf_class_values[procurement_notice][procurement_type][]` = "",
`dnf_class_values[procurement_notice][set_aside][]` = "",
mode = "list"), encode = "form")
curlconverter adds the httr:: prefixes to the various functions since you can actually use req() to make the request. It's a bona-fide R function.
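Since the cookies and hidden-field values above were captured from a live browser session, req() will only keep working while that session is valid, but as a quick sanity check you can call it directly:
res <- req()             # replays the captured POST
httr::status_code(res)   # 200 while the captured session cookies are still valid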
However, most of the data being passed in is browser "cruft" and can be trimmed down a bit and moved into a POST request:
library(httr)
library(rvest)
POST(url = "https://www.fbo.gov/index?s=opportunity&mode=list&tab=list",
add_headers(Origin = "https://www.fbo.gov",
Referer = "https://www.fbo.gov/index?s=opportunity&mode=list&tab=list"),
set_cookies(PHPSESSID = "32efd3be67d43758adcc891c6f6814c4",
sympcsm_cookies_enabled = "1",
BALANCEID = "balancer.172.16.121.7"),
body = list(`dnf_class_values[procurement_notice][keywords]` = "machine+learning",
`dnf_class_values[procurement_notice][_posted_date]` = "365",
search_filters = "search",
`_____dummy` = "dnf_",
so_form_prefix = "dnf_",
dnf_opt_action = "search",
dnf_opt_template = "VVY2VDwtojnPpnGoobtUdzXxVYcDLoQW1MDkvvEnorFrm5k54q2OU09aaqzsSe6m",
dnf_opt_template_dir = "Pje8OihulaLVPaQ+C+xSxrG6WrxuiBuGRpBBjyvqt1KAkN/anUTlMWIUZ8ga9kY+",
dnf_opt_subform_template = "qNIkz4cr9hY8zJ01/MDSEGF719zd85B9",
dnf_opt_finalize = "0",
dnf_opt_mode = "update",
dnf_opt_target = "", dnf_opt_validate = "1",
`dnf_class_values[procurement_notice][dnf_class_name]` = "procurement_notice",
`dnf_class_values[procurement_notice][notice_id]` = "63ae1a97e9a5a9618fd541d900762e32",
`dnf_class_values[procurement_notice][posted]` = "",
`autocomplete_input_dnf_class_values[procurement_notice][agency]` = "",
`dnf_class_values[procurement_notice][agency]` = "",
`dnf_class_values[procurement_notice][zipstate]` = "",
`dnf_class_values[procurement_notice][procurement_type][]` = "",
`dnf_class_values[procurement_notice][set_aside][]` = "",
mode="list"),
encode = "form") -> res
This portion:
set_cookies(PHPSESSID = "32efd3be67d43758adcc891c6f6814c4",
sympcsm_cookies_enabled = "1",
BALANCEID = "balancer.172.16.121.7")
makes me think you should use html_session or GET at least once on the main URL to establish those cookies in the cached curl handler (which will be created & maintained automagically for you).
The add_headers() bit may also not be necessary but that's an exercise left for the reader.
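A minimal sketch of that idea (assuming the rest of the POST() call stays exactly as above):
library(httr)

fbo_url <- "https://www.fbo.gov/index?s=opportunity&mode=list&tab=list"

# hit the search page once so the server issues fresh PHPSESSID / BALANCEID cookies;
# httr caches the curl handle per host and re-sends those cookies on later requests
invisible(GET(fbo_url))

# ...then issue the same POST() call as above, dropping the set_cookies() argument
# (and possibly the add_headers() call) while keeping the body list unchanged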
You can find the table you're looking for via:
content(res, as="text", encoding="UTF-8") %>%
read_html() %>%
html_nodes("table.list") %>%
html_table() %>%
dplyr::glimpse()
## Observations: 20
## Variables: 4
## $ Opportunity <chr> "NSN: 1650-01-074-1054; FILTER ELEMENT, FLUID; WSIC: L SP...
## $ Agency/Office/Location <chr> "Defense Logistics Agency DLA Acquisition LocationsDLA Av...
## $ Type / Set-aside <chr> "Presolicitation", "Presolicitation", "Award", "Award", "...
## $ Posted On <chr> "Sep 28, 2016", "Sep 28, 2016", "Sep 28, 2016", "Sep 28, ...
There's an indicator on the page saying these are results "1 - 20 of 2008". You need to scrape that as well and deal with the paginated results. This is also left as an exercise to the reader.
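A sketch of how that might start (the name of the form field fbo.gov expects for a given results page is not shown here; it's an assumption you'd confirm by watching the next-page request in the browser):
library(httr)
library(rvest)
library(stringr)

pg <- content(res, as = "text", encoding = "UTF-8") %>% read_html()

# pull the "1 - 20 of 2008" style counter out of the page text
total_hits <- pg %>%
  html_text() %>%
  str_extract("[0-9,]+ - [0-9,]+ of [0-9,]+") %>%
  str_extract("[0-9,]+$") %>%
  gsub(",", "", .) %>%
  as.integer()

pages <- ceiling(total_hits / 20)

# then loop over 2:pages, re-POSTing with whatever page-index field the site uses,
# and bind the html_table() results together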

Related

R: search_fullarchive() and Twitter Academic research API track

I was wondering whether anyone has found a way to use search_fullarchive() from the "rtweet" package in R with the new Twitter academic research project track?
The problem is whenever I try to run the following code:
search_fullarchive(q = "sunset", n = 500, env_name = "AcademicProject", fromDate = "202010200000", toDate = "202010220000", safedir = NULL, parse = TRUE, token = bearer_token)
I get the following error "Error: Not a valid access token". Is that because search_fullarchive() is only for paid premium accounts and that doesn't include the new academic track (even though you get full archive access)?
Also, can you retrieve more than 500 tweets (e.g., n = 6000) when using search_fullarchive()?
Thanks in advance!
I've got the same problem with the Twitter academic research API. I think if you set n = 100 or just skip that argument, the command will return 100 tweets. Also, the rtweet package does not (yet) support the academic research API.
Change your code to this, where env_name is the environment name attained in the Dev Dashboard:
search_fullarchive(q = "sunset", n = 500, env_name = "AcademicProject", fromDate = "202010200000", toDate = "202010220000", safedir = NULL, parse = TRUE, token = t)
Also, the token must be created like this:
t <- create_token(
  app = "App Name",
  consumer_key = 'Key',
  consumer_secret = 'Secret',
  access_token = '',
  access_secret = '',
  set_renv = TRUE
)
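If you need the full-archive endpoint itself while rtweet lacks support, one workaround is to call the v2 API directly with httr (a sketch; the bearer token is assumed to live in the TWITTER_BEARER environment variable, and anything beyond one page of results requires following meta$next_token):
library(httr)
library(jsonlite)

resp <- GET(
  "https://api.twitter.com/2/tweets/search/all",
  add_headers(Authorization = paste("Bearer", Sys.getenv("TWITTER_BEARER"))),
  query = list(
    query       = "sunset",
    start_time  = "2020-10-20T00:00:00Z",
    end_time    = "2020-10-22T00:00:00Z",
    max_results = 100   # one page; page onward with the next_token value in meta
  )
)
tweets <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))$data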

How to Call Amazon Product Advertising API 5 from R?

I want to call the Amazon Product Advertising API from R. Below is the quick-start guide for PA API 5:
https://webservices.amazon.com/paapi5/documentation/quick-start/using-curl.html
I tried to follow the approach described at https://webservices.amazon.com/paapi5/documentation/sending-request.html using 'httr', but got thrown off by the Signature Version 4 signing process.
I tried using the 'aws.signature' package to sign the request prior to calling the POST function, but the final output I am getting is status code 500.
Here is the code I have used
library(httr)
library(jsonlite)
library(aws.signature)
request_body=data.frame("Keywords"="Harry",
"Marketplace"= "www.amazon.com",
"PartnerTag"= "mytag-20",
"PartnerType"= "Associates",
"Access Key"="my_accesskey",
"Secret Key"="my_secret_key",
"service"="ProductAdvertisingAPIv1",
"Region"="us-east-1"
"Resources"="Offers.Listings.Price",
"SearchIndex"= "All")
request_body_json=toJSON(request_body,auto_unbox=T)
request_body_json=gsub("\\[|\\]","",request_body_json)
t=signature_v4_auth(
datetime = format(Sys.time(), "%Y%m%dT%H%M%SZ", tz = "UTC"),
region = NULL,
service="ProductAdvertisingAPIv1",
verb="POST",
"com.amazon.paapi5.v1.ProductAdvertisingAPIv1.SearchItems",
query_args = list(),
canonical_headers=c("Host: webservices.amazon.com",
"Content-Type: application/json; charset=UTF-8",
"X-Amz-Target: com.amazon.paapi5.v1.ProductAdvertisingAPIv1.SearchItems",
"Content-Encoding: amz-1.0",
"User-Agent: paapi-docs-curl/1.0.0"),
request_body=request_body_json,
signed_body = TRUE,
key = "access_key",
secret = "secret-key",
session_token = NULL,
query = FALSE,
algorithm = "AWS4-HMAC-SHA256",
force_credentials = FALSE,
verbose = getOption("verbose", FALSE)
)
result=POST("https://webservices.amazon.com/paapi5/searchitems",body=request_body_json,
add_headers(.headers=c("Host: webservices.amazon.com",
"Content-Type: application/json; charset=UTF-8",
paste("X-Amz-Date:",format(Sys.time(), "%Y%m%dT%H%M%SZ", tz = "UTC")),
"X-Amz-Target: com.amazon.paapi5.v1.ProductAdvertisingAPIv1.SearchItems",
"Content-Encoding: amz-1.0",
"User-Agent: paapi-docs-curl/1.0.0",
paste0("Authorization: AWS4-HMAC-SHA256 Credential=",t[["Credential"]],"SignedHeaders=content-encoding;host;x-amz-date;x-amz-target Signature=",t[["Signature"]])
)))
I'd appreciate it if anyone can help with this. Thanks.
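One thing to check before the signing itself: httr::add_headers() takes name = value pairs, not "Header: value" strings, so the headers above are unlikely to reach Amazon in the intended form. A sketch of the request with the headers spelled out that way (the Authorization string is just a placeholder for whatever your SigV4 step produces, and request_body_json is the JSON built above):
library(httr)

amz_date    <- format(Sys.time(), "%Y%m%dT%H%M%SZ", tz = "UTC")
auth_header <- "AWS4-HMAC-SHA256 Credential=..., SignedHeaders=..., Signature=..."  # placeholder from your signing step

result <- POST(
  "https://webservices.amazon.com/paapi5/searchitems",
  body = request_body_json,   # the JSON string built above
  add_headers(
    `Content-Type`     = "application/json; charset=UTF-8",
    `X-Amz-Date`       = amz_date,
    `X-Amz-Target`     = "com.amazon.paapi5.v1.ProductAdvertisingAPIv1.SearchItems",
    `Content-Encoding` = "amz-1.0",
    Authorization      = auth_header
  )
)
status_code(result)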

Merge many BibTeX files into one

I have multiple single-entry BibTeX files that look like this:
first file:
@article{DBLP:journals/access/AlotaibiAASA20,
author = {Bashayer Alotaibi and
Rabeeh Ayaz Abbasi and
Muhammad Ahtisham Aslam and
Kawther Saeedi and
Dimah Alahmadi},
title = {Startup Initiative Response Analysis {(SIRA)} Framework for Analyzing
Startup Initiatives on Twitter},
journal = {{IEEE} Access},
volume = {8},
pages = {10718--10730},
year = {2020},
url = {https://doi.org/10.1109/ACCESS.2020.2965181},
doi = {10.1109/ACCESS.2020.2965181},
timestamp = {Fri, 07 Feb 2020 12:04:40 +0100},
biburl = {https://dblp.org/rec/journals/access/AlotaibiAASA20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
second file:
@inproceedings{DBLP:conf/comad/MathewKG020,
author = {Binny Mathew and
Navish Kumar and
Pawan Goyal and
Animesh Mukherjee},
editor = {Rishiraj Saha Roy},
title = {Interaction dynamics between hate and counter users on Twitter},
booktitle = {CoDS-COMAD 2020: 7th {ACM} {IKDD} CoDS and 25th COMAD, Hyderabad India,
January 5-7, 2020},
pages = {116--124},
publisher = {{ACM}},
year = {2020},
url = {https://doi.org/10.1145/3371158.3371172},
doi = {10.1145/3371158.3371172},
timestamp = {Wed, 22 Jan 2020 14:37:05 +0100},
biburl = {https://dblp.org/rec/conf/comad/MathewKG020.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
How can I read them all (they are all in the same directory) and create a new file that is simply a concatenation of all of them?
Example of expected output
@article{DBLP:journals/access/AlotaibiAASA20,
author = {Bashayer Alotaibi and
Rabeeh Ayaz Abbasi and
Muhammad Ahtisham Aslam and
Kawther Saeedi and
Dimah Alahmadi},
title = {Startup Initiative Response Analysis {(SIRA)} Framework for Analyzing
Startup Initiatives on Twitter},
journal = {{IEEE} Access},
volume = {8},
pages = {10718--10730},
year = {2020},
url = {https://doi.org/10.1109/ACCESS.2020.2965181},
doi = {10.1109/ACCESS.2020.2965181},
timestamp = {Fri, 07 Feb 2020 12:04:40 +0100},
biburl = {https://dblp.org/rec/journals/access/AlotaibiAASA20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@inproceedings{DBLP:conf/comad/MathewKG020,
author = {Binny Mathew and
Navish Kumar and
Pawan Goyal and
Animesh Mukherjee},
editor = {Rishiraj Saha Roy},
title = {Interaction dynamics between hate and counter users on Twitter},
booktitle = {CoDS-COMAD 2020: 7th {ACM} {IKDD} CoDS and 25th COMAD, Hyderabad India,
January 5-7, 2020},
pages = {116--124},
publisher = {{ACM}},
year = {2020},
url = {https://doi.org/10.1145/3371158.3371172},
doi = {10.1145/3371158.3371172},
timestamp = {Wed, 22 Jan 2020 14:37:05 +0100},
biburl = {https://dblp.org/rec/conf/comad/MathewKG020.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Edited:
I tested the code below and it merges multiple files into one:
First, extract all paths to the .bib files ("." if they are in the working directory, "path/to/directory/" or "/absolute/path/to/directory" otherwise):
path_to_bib_files <- list.files(".", pattern="\\.bib$", full.names=TRUE)
Then, iterate through the files one by one and append them:
combined_bib <- ""
for (path_to_bib_file in path_to_bib_files) {
fileCon <- file(path_to_bib_file)
content <- readLines(fileCon)
close(fileCon)
combined_bib <- paste0(combined_bib, "\n", "\n", trimws(paste0(content, collapse="\n")))
}
Finally, write the combined string to a file:
cat(combined_bib, file="combined_references.bib", "\n")
Here are the steps that worked for me.
1. List all files:
d <- list.files("path", pattern="\\.bib$", full.names=TRUE)
2. Read all files:
read_files <- lapply(d, readLines)
3. Unlist them:
Unlist_files <- unlist(read_files)
4. Export as a single object:
write(Unlist_files, file = "path/Bib.bib")
You can concatenate the contents of the two files together like this
big_bib <- c(readLines("bib1.bib"), "\n", readLines("bib2.bib"))
And write the new file like this:
writeLines(big_bib, "big_bib.bib")

choose.files() defaulting to last directory used

According to the documentation for the choose.files function:
choose.files(default = "", caption = "Select files",
multi = TRUE, filters = Filters,
index = nrow(Filters))
If you would like to display files in a particular directory, give a
fully qualified file mask (e.g., "c:\*.*") in the default argument.
If a directory is not given, the dialog will start in the current
directory the first time, and remember the last directory used on
subsequent invocations.
My code is as follows:
AZPic = choose.files(default = "", caption = "Select Azimuth Picture", multi = FALSE, filters = Filters[c("All","jpeg"),])
STSPic = choose.files(default = "", caption = "Select Side to Side Elevation Picture", multi = FALSE, filters = Filters[c("All","jpeg"),])
FTBPic = choose.files(default = "", caption = "Select Front to Back Elevation Picture", multi = FALSE, filters = Filters[c("All","jpeg"),])
OrientationPic = choose.files(default = "", caption = "Select Orientation Picture", filters = Filters[c("All","jpeg"),])
Now when I run this code, it defaults to my home directory for all 4 calls.
It starts there for the first call of course, but afterwards should it not remember the folder I navigate to?
I typically will hop through another 3 or 4 folders to find the pictures for the job, but all of them for each program use will be in the same folder.
Is there something I'm missing?
Thanks in advance.
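In the meantime, one workaround is to remember the previous selection's folder yourself and feed it back through the default file mask that the documentation describes. A sketch, using a hypothetical pick_file() helper (whether the Windows dialog accepts forward slashes in the mask is an assumption to verify against the "c:\*.*" example in the docs):
# start each dialog in the folder of the previous pick, falling back to the default
pick_file <- function(caption, last_pick = "") {
  start_mask <- if (length(last_pick) == 1 && nzchar(last_pick))
    file.path(dirname(last_pick), "*.*") else ""
  choose.files(default = start_mask, caption = caption, multi = FALSE,
               filters = Filters[c("All", "jpeg"), ])
}

AZPic          <- pick_file("Select Azimuth Picture")
STSPic         <- pick_file("Select Side to Side Elevation Picture", AZPic)
FTBPic         <- pick_file("Select Front to Back Elevation Picture", STSPic)
OrientationPic <- pick_file("Select Orientation Picture", FTBPic)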

How to get table from html form using rvest or httr?

I am using R, version 3.3.1. I am trying to scrape data from the following web site:
http://plovila.pomorstvo.hr/
As you can see, it is an HTML form. I would like to choose "Tip objekta" (object type), for example "Jahta" (yacht), and enter a "NIB" (an integer, e.g. 93567). You can try it yourself: just choose "Jahta" and type 93567 in the NIB field.
The method is POST, type application/x-www-form-urlencoded. I have tried three different approaches: rvest, POST (httr package), and postForm (RCurl). My rvest code is:
session <- html_session("http://plovila.pomorstvo.hr")
form <- html_form(session)[[1]]
form <- set_values(form, `ctl00$Content_FormContent$uiTipObjektaDropDown` = 2,
`ctl00$Content_FormContent$uiOznakaTextBox` = "",
`ctl00$Content_FormContent$uiNibTextBox` = 93567)
x <- submit_form(session, form)
If I run this code I get a 200 status, but I don't understand how I can get the table from the response.
An additional step would be to submit the "Detalji" button to get further details, but I can't see any of that information in the x submit output.
I used the curlconverter package to take the "Copy as cURL" data from the XHR POST request and turn it automagically into:
httr::VERB(verb = "POST", url = "http://plovila.pomorstvo.hr/",
httr::add_headers(Origin = "http://plovila.pomorstvo.hr",
`Accept-Encoding` = "gzip, deflate",
`Accept-Language` = "en-US,en;q=0.8",
`X-Requested-With` = "XMLHttpRequest",
Connection = "keep-alive",
`X-MicrosoftAjax` = "Delta=true",
Pragma = "no-cache", `User-Agent` = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.34 Safari/537.36",
Accept = "*/*", `Cache-Control` = "no-cache",
Referer = "http://plovila.pomorstvo.hr/",
DNT = "1"), httr::set_cookies(ASP.NET_SessionId = "b4b123vyqxnt4ygzcykwwvwr"),
body = list(`ctl00$uiScriptManager` = "ctl00$Content_FormContent$ctl00|ctl00$Content_FormContent$uiPretraziButton",
ctl00_uiStyleSheetManager_TSSM = ";|635908784800000000:d29ba49:3cef4978:9768dbb9",
`ctl00$Content_FormContent$uiTipObjektaDropDown` = "2",
`ctl00$Content_FormContent$uiImeTextBox` = "",
`ctl00$Content_FormContent$uiNibTextBox` = "93567",
`__EVENTTARGET` = "", `__EVENTARGUMENT` = "",
`__LASTFOCUS` = "", `__VIEWSTATE` = "/wEPDwUKMTY2OTIzNTI1MA9kFgJmD2QWAgIDD2QWAgIBD2QWAgICD2QWAgIDD2QWAmYPZBYIAgEPZBYCZg9kFgZmD2QWAgIBDxAPFgYeDURhdGFUZXh0RmllbGQFD05heml2VGlwT2JqZWt0YR4ORGF0YVZhbHVlRmllbGQFDElkVGlwT2JqZWt0YR4LXyFEYXRhQm91bmRnZBAVBAAHQnJvZGljYQVKYWh0YQbEjGFtYWMVBAEwATEBMgEzFCsDBGdnZ2cWAQICZAIBDw8WAh4HVmlzaWJsZWdkFgICAQ8PFgIfA2dkZAICDw8WAh8DaGQWAgIBDw8WBB4EVGV4dGUfA2hkZAIHDzwrAA4CABQrAAJkFwEFCFBhZ2VTaXplAgoBFgIWCw8CCBQrAAhkZGRkZDwrAAUBBAUHSWRVcGlzYTwrAAUBBAUISWRVbG9za2E8KwAFAQQFBlNlbGVjdGRlFCsAAAspelRlbGVyaWsuV2ViLlVJLkdyaWRDaGlsZExvYWRNb2RlLCBUZWxlcmlrLldlYi5VSSwgVmVyc2lvbj0yMDEzLjMuMTExNC40MCwgQ3VsdHVyZT1uZXV0cmFsLCBQdWJsaWNLZXlUb2tlbj0xMjFmYWU3ODE2NWJhM2Q0ATwrAAcACyl1VGVsZXJpay5XZWIuVUkuR3JpZEVkaXRNb2RlLCBUZWxlcmlrLldlYi5VSSwgVmVyc2lvbj0yMDEzLjMuMTExNC40MCwgQ3VsdHVyZT1uZXV0cmFsLCBQdWJsaWNLZXlUb2tlbj0xMjFmYWU3ODE2NWJhM2Q0ARYCHgRfZWZzZGQWBB4KRGF0YU1lbWJlcmUeBF9obG0LKwQBZGZkAgkPZBYCZg9kFgJmD2QWIAIBD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAIDD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAIFD2QWAgIDDzwrAAgAZAIHD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAIJD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAILD2QWBgIDDxQrAAI8KwAIAGRkAgUPFCsAAjwrAAgAZGQCBw8UKwACPCsACABkZAIND2QWBgIDDxQrAAI8KwAIAGRkAgUPFCsAAjwrAAgAZGQCBw8UKwACPCsACABkZAIPD2QWAgIDDxQrAAI8KwAIAGRkAhEPZBYGAgMPPCsACABkAgUPPCsACABkAgcPPCsACABkAhMPZBYGAgMPPCsACABkAgUPPCsACABkAgcPPCsACABkAhUPZBYCAgMPPCsACABkAhcPZBYGAgMPPCsACABkAgUPPCsACABkAgcPPCsACABkAhkPPCsADgIAFCsAAmQXAQUIUGFnZVNpemUCBQEWAhYLZGRlFCsAAAsrBAE8KwAHAAsrBQEWAh8FZGQWBB8GZR8HCysEAWRmZAIbDzwrAA4CABQrAAJkFwEFCFBhZ2VTaXplAgUBFgIWC2RkZRQrAAALKwQBPCsABwALKwUBFgIfBWRkFgQfBmUfBwsrBAFkZmQCHQ88KwAOAgAUKwACZBcBBQhQYWdlU2l6ZQIFARYCFgtkZGUUKwAACysEATwrAAcACysFARYCHwVkZBYEHwZlHwcLKwQBZGZkAiMPPCsADgIAFCsAAmQXAQUIUGFnZVNpemUCBQEWAhYLZGRlFCsAAAsrBAE8KwAHAAsrBQEWAh8FZGQWBB8GZR8HCysEAWRmZAILD2QWAmYPZBYCZg9kFgICAQ88KwAOAgAUKwACZBcBBQhQYWdlU2l6ZQIFARYCFgtkZGUUKwAACysEATwrAAcACysFARYCHwVkZBYEHwZlHwcLKwQBZGZkZIULy2JISPTzELAGqWDdBkCVyvvKIjo/wm/iG9PT1dlU",
`__VIEWSTATEGENERATOR` = "CA0B0334",
`__PREVIOUSPAGE` = "jGgYHmJ3-6da6PzGl9Py8IDr-Zzb75YxIFpHMz4WQ6iQEyTbjWaujGRHZU-1fqkJcMyvpGRkWGStWuj7Uf3NYv8Wi0KSCVwn435kijCN2fM1",
`__ASYNCPOST` = "true",
`ctl00$Content_FormContent$uiPretraziButton` = "Pretraži"),
encode = "form") -> res
You can see the result of that via:
content(res, as="text") # returns raw HTML
or
content(res, as="parsed") # returns something you can use with `rvest` / `xml2`
Unfortunately, this is yet another useless SharePoint website that "eGov" sites around the world have bought into as a good thing to do. That means you have to do trial and error to figure out which of those parameters is necessary since it's different on virtually every site. I tried a minimal set to no avail.
You may even have to issue a GET request to the main site first to establish a session.
But this should get you going in the right direction.
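A sketch of that first step (the field names come straight from the form body above; whether __PREVIOUSPAGE is present on the initial page, and which of the Telerik/AJAX fields the server insists on, is part of the trial and error):
library(httr)
library(rvest)

home <- GET("http://plovila.pomorstvo.hr/")   # lets the server set ASP.NET_SessionId
pg   <- read_html(content(home, as = "text", encoding = "UTF-8"))

# pull the current ASP.NET anti-tamper tokens instead of hard-coding stale ones
viewstate    <- pg %>% html_node("input[name='__VIEWSTATE']") %>% html_attr("value")
viewstategen <- pg %>% html_node("input[name='__VIEWSTATEGENERATOR']") %>% html_attr("value")
prevpage     <- pg %>% html_node("input[name='__PREVIOUSPAGE']") %>% html_attr("value")

# these values would then replace the hard-coded __VIEWSTATE / __VIEWSTATEGENERATOR /
# __PREVIOUSPAGE entries in the body of the POST above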
