R: Submit Form then Scrape Results

I'm trying to write code in R that will allow me to submit a query on http://nbawowy.com/ and scrape the resulting data. I'd like to be able to input values for at least the "Select Team", "Select Players On", and "Select Players Off" fields and then submit the form. As an example, if I select the 76ers as my team, and Ben Simmons as the "Player On", the resulting query is found here: http://nbawowy.com/#/z31mjvm5ss. I've tried using the following code, but it provides me with an unknown field names error:
library(rvest)
url <- "http://nbawowy.com/#/l0krk654imh"
session <- html_session(url)
form <- html_form(read_html(url))[[1]]
filled_form <- set_values(form,
                          "s2id_autogen1_search" = "76ers",
                          "s2id_autogen2" = "Ben Simmons")
session1 <- submit_form(session, filled_form, submit = 'submit')
Since I can't seem to get past this initial part, I'm looking to the community for some help. I'd ultimately like to navigate the session to the resulting URL and scrape the data.

This is not a problem with your code. If you inspect the form that the website returns, you will see that it has no fields named "s2id_autogen1_search" or "s2id_autogen2"; in fact, the whole form is unnamed. Furthermore, what looks like a single form in the browser is actually several separate forms, so the players' names cannot even be entered into the html_form(read_html(url))[[1]] element. Use str(form) to get to know the object whose values you are trying to set. Alternatively, try scraping with RSelenium.
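To see what that kind of inspection looks like, here is a minimal sketch. It assumes a made-up page with two unnamed forms (the HTML below is a hypothetical stand-in, not the real nbawowy.com markup) and lists the field names that rvest actually finds in each one:

```r
library(rvest)

# Hypothetical stand-in page: one visual "form" built from two <form> elements.
# This is NOT the real nbawowy.com markup.
page <- read_html('
  <html><body>
    <form><select name="team"><option value="PHI">76ers</option></select></form>
    <form><input type="text" name="players_on"></form>
  </body></html>
')

forms <- html_form(page)

# How many separate forms does the page contain?
length(forms)

# Which field names does each form expose? Only these names can be set.
field_names <- lapply(forms, function(f) names(f$fields))
field_names

# The structure of a single form, as suggested above.
str(forms[[1]])
```

If a name such as "s2id_autogen1_search" does not show up in that list, setting it will fail with the unknown field names error, whichever form you pick.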

Related

R: Scraping dynamic dropdown menus (rvest, httr)

I've done a little bit of web scraping previously, but I'm very much a beginner when it comes to HTML and XML structures.
On the following website (https://www.rec-registry.gov.au/rec-registry/app/calculators/swh-stc-calculator) there is a form with the following fields:
System brand
System model
Installation date
Postcode
Disclaimer check box
And then a 'calculate' button to generate results
The first two fields are drop-down boxes; the second box (system model) changes dynamically based on the system brand input.
What I would like to do is extract a list of all system brand options (~130), then for each brand extract all associated system models, and iteratively enter fixed values for the installation date and postcode and return the values generated by the calculator.
I can hunt down the XPath of the system brands (//*[@id="refSystemBrand"]), but when I try to extract the list of system brands via rvest::html_element, it yields an empty list.
Based on some similar SO questions, I suspect I might need to use httr::POST() to drive the form entry (or the httr::html_form_ functions?). But I genuinely have no idea where to start with these functions (and I don't find the help or intro tutorials very enlightening).
Any help on what I'm doing wrong, or the bits that I clearly don't understand about how web forms are designed would be very appreciated!
A basic way of getting what you want is inserting the resulting combinations from your lists into the following post request.
library(httr)

# Request headers; the Email header identifies you to the site operator.
headers <- c(
  `Content-Type` = 'application/json; charset=utf-8',
  `Email` = 'add your email to be transparent'
)

# JSON body for one brand/model combination.
data <- '{"postcode":"3000","systemBrand":"AAE Solar","systemModel":"ES-250E-30-S26M","installationDate":"2021-09-30T00:00:00.000Z"}'

res <- httr::POST(
  url = 'https://www.rec-registry.gov.au/rec-registry/app/calculators/swh/stc',
  body = data,
  add_headers(.headers = headers),
  verbose()
)
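To insert combinations into that request, the JSON body can be generated rather than typed by hand. The following is a rough sketch, assuming the brand/model pairs are already in a data frame; the `systems` table, its second model name, and the `build_body` helper are hypothetical, and the field names are copied from the body above:

```r
library(jsonlite)

# Hypothetical brand/model pairs; the real lists would have to come from the site.
systems <- data.frame(
  systemBrand = c("AAE Solar", "AAE Solar"),
  systemModel = c("ES-250E-30-S26M", "ES-300E-30-S26M"),  # second model is made up
  stringsAsFactors = FALSE
)

# Build one JSON body, keeping postcode and installation date fixed.
build_body <- function(brand, model,
                       postcode = "3000",
                       installation_date = "2021-09-30T00:00:00.000Z") {
  as.character(toJSON(
    list(postcode = postcode,
         systemBrand = brand,
         systemModel = model,
         installationDate = installation_date),
    auto_unbox = TRUE  # emit scalars as "x" rather than ["x"]
  ))
}

# One body per row of the table.
bodies <- mapply(build_body, systems$systemBrand, systems$systemModel,
                 USE.NAMES = FALSE)

# Each element of `bodies` could then be passed as `body =` to httr::POST(),
# ideally with a pause (e.g. Sys.sleep(1)) between requests.
```

Building the body with toJSON rather than paste0 keeps quoting correct if a brand or model name contains characters that need escaping.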

Google Analytics Query for R, content drilldown

I am trying to export Google Analytics data into R in order to build a report and do some other data mining related tasks with the data. I am using the RGoogleAnalytics package to do so. I have the connection working, but am having trouble specifying the correct query in order to obtain the right information.
I am trying to obtain information for a specific page that I would normally reach by going to the content drilldown section in Google Analytics and searching for that page. I would also like to use a filtered view, to filter out ISPs that are from my workplace. There are several websites under a particular view. I am trying to build a query that pulls this information automatically, and have tried the following:
ValidateToken(token1)
query.list1 <- Init(start.date = "2016-10-28", end.date = "2016-12-05",
dimensions = "ga:date", metrics = "ga:uniquepageviews",
filters = "ga:pagePathLevel1
==/ed######.edu/;ga:pagePathLevel2==/content/",
table.id = "ga:##### ")
sort = "ga:date"
ga.query <- QueryBuilder(query.list1)
ga.data <- GetReportData(ga.query, token1)
This does not throw an error in R, but it does not seem to return any metrics: it returns all zeros for unique pageviews when there are in fact results, as shown below.
      date uniquepageviews
1 20161028               0
2 20161029               0
3 20161030               0
4 20161031               0
In the above, I tried to use the filter to get the correct page. Is this correct? If so, what should I put into the filter so that it only returns metrics for a specific page in a given view? Also, is there a way to select for a given prebuilt view? Any help is appreciated, thanks.

Assigning Shiny Input Variable to Global Environment/Data

I'm converting some code/script to function on a shiny server.
I need to scrape html data tables from a url, and currently, to choose the id number that changes in the url I have a function.
Get_ID <- function() { readline("Please choose a number:>>> ") }
num.id <- Get_ID()
It works quite well, it prompts to write the number in by keyboard in the console, and I can run the scrape easily for that URL identified (the value is pasted into the middle of a URL to use in further code).
I'm now trying to provide a list of numbers in a drop-down list on the ui.r file.
I have a dataframe (call it df) with these values (pre-planned URL IDs):
NAME number
seta 20001
setb 20002
setc 20003
setd 20004
sete 20005
I'd like for the 'NAME' to appear in the drop-down list and the corresponding value in "number" to be used and assigned to the variable "num.id" so that the rest of the script will execute with the current value chosen by the user.
For the ui side, I have something along the lines of:
selectInput("ChooseName", "df", choices=NAME),
actionButton("submit", "SUBMIT CHOICE")
and I'm stuck on the server side. I've played a bit with creating an observer for the submit button, but I can't figure out a way to have the values exist in my global environment, specifically assigned to the value "num.id" - what I was using above. I would like the choice to be unique for every user. I.e., if User A chooses "setb", this won't disrupt User B who is viewing "setd".
For background, the URL scrape returns a table full of numbers, which I manipulate and match to several dataframes before outputting a final graph using ggplot2. You may recommend that I pass the reactive inputs from the drop-down list into the ggplot2 code directly, but there's a lot of reformatting that happens prior to use in the plot, so I'd like to keep the steps separated.
Thanks!
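One possible direction, offered as a minimal sketch rather than a definitive answer: keep the chosen number in a per-session reactive value instead of the global environment, so that each user gets an independent copy. The label text, the `num_id` and `id_choices` names, the `chosen` output, and the example.com URL pattern below are all assumptions; only the `ChooseName` and `submit` input IDs and the lookup table come from the question.

```r
library(shiny)

# Lookup table from the question: NAME is shown to the user, number goes in the URL.
df <- data.frame(
  NAME = c("seta", "setb", "setc", "setd", "sete"),
  number = c(20001, 20002, 20003, 20004, 20005),
  stringsAsFactors = FALSE
)

# A named vector makes selectInput() display the names and return the numbers.
id_choices <- setNames(df$number, df$NAME)

ui <- fluidPage(
  selectInput("ChooseName", "Choose a set", choices = id_choices),
  actionButton("submit", "SUBMIT CHOICE"),
  verbatimTextOutput("chosen")
)

server <- function(input, output, session) {
  # Updated only when the button is pressed; every session has its own copy,
  # so one user's choice does not affect another user's.
  num_id <- eventReactive(input$submit, as.numeric(input$ChooseName))

  output$chosen <- renderText({
    # Hypothetical URL pattern; the real one is not given in the question.
    paste0("http://example.com/data/", num_id(), "/table")
  })
}

# shinyApp(ui, server)  # uncomment to run interactively
```

The scraping and reformatting steps could then live in further reactive expressions that call `num_id()`, which keeps them separate from the plotting code.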

Editing a Remedy User macro file.ARQ

I'm using BMC Remedy User v7.5 p004 to track and manage incidents. This tool has an option to record macros, which are saved as a .ARQ file. I can open this file in Notepad++, but it is quite jumbled.
What I am trying to do is allow someone to search incidents based on the summary that is entered in the Working Log under the WorkInfo tab. I know that you can record macros with search variables that prompt the end user for input, but when recording a macro the WorkInfo section is deactivated. So I would like to edit some pre-existing macros to try and create what I need.
SQL for what I want to pull
SELECT incidentno, summary, notes, summary*, notes*
FROM whatever the main table name is
WHERE WorkInfoType = WorkingLog
Note that the reason there are two summary and notes fields is because two of the fields are under the WorkingLog and the other two are fields listed for the whole incident. The BMC naming convention difference for these different fields is the *
Solved this by recording a macro out of the advanced search form within the incident management console. Within that form you can select the fields you want to search and there is an advanced button that brings up a query box for more complicated searching. After changing WorkInfoType on the form to equal "Working Log" I used the following advanced query to finish off the rest of the search.
( 'Summary' LIKE "%$Search Technical Name$%") AND ( 'Incident Status' = "Resolved" OR 'Incident Status' = "Closed" ) AND ( 'Assigned Group' = "Group1" OR 'Assigned Group' = "Group2" )
Note: When recording a macro you can enter $VariableName$ to make a variable. This lets the user enter text in a search box for whatever field you turn into a variable. For example, in the query above I made a variable called "Search Technical Name", which prompts the user for the text to search for in the summary field when running the macro. Also, the % acts as a wildcard, so the search also returns matches that are not exact.

Populate a form from a select list

I have tried multiple attempts at populating a report from selecting a value in a select list. I have come close but not close enough for the right answer. Does anyone have a solution?
Here is the code
Currently I have a select list for choosing an employee's track; the tracks shown in the select list are populated based on :APP_USER.
List of Values
List of values definition:
SELECT track_name AS display_value,
track_id AS return_value
FROM ref_track
ORDER BY 1
Source Value for select list:
SELECT "REF_TRACK"."TRACK_NAME" AS display_value,
"REF_TRACK"."TRACK_ID" AS return_value
FROM "REF_STAFF",
"REF_PLAN",
"WORK_ITEM",
"REF_RELEASE",
"REF_TRACK"
WHERE "REF_RELEASE"."RELEASE_ID" = "REF_PLAN"."RELEASE_ID"
AND "REF_TRACK"."TRACK_ID" = "REF_PLAN"."TRACK_ID"
AND "WORK_ITEM"."WR_ID" = "REF_PLAN"."WORK_ITEM_ID"
AND Nvl("REF_STAFF"."REF_STAFF_TRACK_ID", "REF_PLAN"."TRACK_ID") =
"REF_PLAN"."TRACK_ID"
AND (( "REF_STAFF"."STAFF_USER_ID" = :APP_user ))
I now have a report beneath it that is being populated when the page loads that also generates data based on :App_user.
Report Source Code:
SELECT "REF_PLAN"."PLAN_ID" "PLAN_ID",
"REF_PLAN"."WORK_ITEM_ID" "WORK_ITEM_ID",
"REF_PLAN"."TRACK_ID" "TRACK_ID",
"REF_PLAN"."PLANNED_TOT_HRS" "PLANNED_TOT_HRS",
"REF_PLAN"."PLAN_START_DATE" "PLAN_START_DATE",
"REF_PLAN"."PLAN_END_DATE" "PLAN_END_DATE",
"REF_PLAN"."COMMENTS" "COMMENTS",
"REF_PLAN"."RELEASE_ID" "RELEASE_ID",
"WORK_ITEM"."WR_ID" "WR_ID",
"WORK_ITEM"."WR_NUM" "WR_NUM",
"REF_RELEASE"."RELEASE_ID" "RELEASE_ID2",
"REF_RELEASE"."RELEASE_NUM" "RELEASE_NUM",
"REF_TRACK"."TRACK_ID" "TRACK_ID2",
"REF_TRACK"."TRACK_NAME" "TRACK_NAME",
"REF_STAFF"."REF_STAFF_TRACK_ID" "REF_STAFF_TRACK_ID",
"REF_STAFF"."STAFF_USER_ID" "STAFF_USER_ID"
FROM "REF_STAFF",
"REF_PLAN",
"WORK_ITEM",
"REF_RELEASE",
"REF_TRACK"
WHERE "REF_RELEASE"."RELEASE_ID" = "REF_PLAN"."RELEASE_ID"
AND "REF_TRACK"."TRACK_ID" = "REF_PLAN"."TRACK_ID"
AND "WORK_ITEM"."WR_ID" = "REF_PLAN"."WORK_ITEM_ID"
AND Nvl("REF_STAFF"."REF_STAFF_TRACK_ID", "REF_PLAN"."TRACK_ID") =
"REF_PLAN"."TRACK_ID"
AND (( "REF_STAFF"."STAFF_USER_ID" = :APP_USER ))
AND "REF_PLAN"."TRACK_ID" = :P47_TRACK_LIST
I tried adding the last line (the :P47_TRACK_LIST condition) to pick from the select list.
Is there any way to change this code so that I can select a track from my list and have the report populate based on that selection? Note that my select list values are based on a submit page. Please let me know if you can help. It's frustrating to look at something for a whole day and not be able to figure the code out. Also, if there is any other way around it, or other options to explore, please let me know.
If you want the report to update when you change the selected value of the select list, you can do this in 2 ways. But both come down to the same principle: your selected value has to be submitted to the session state in order for the report to filter on it.
Solution 1: have the select list submit or redirect the page. Either option sends the value of your select list to the session state and reloads the page. With a redirect you will fill up the browser history, though: select a value a couple of times, and pressing 'back' in the browser steps back through the choices you made. A submit also reloads the page but does not fill the history as much. There will still be one extra history entry (the initial load and the first reload; subsequent reloads are not added to the history).
Find the option by editing your select list, going to the Settings region, and change the page action when value changed.
Solution 2: refresh the report region through a dynamic action. This will not reload the page; it refreshes just your report. This might be the most user-friendly option, depending on whether you like a page reload or not :)
You'll need a dynamic action that fires when the select list changes, with a true action that refreshes the report region.
Most importantly, to make sure your selected value is submitted to the session state, add the item (here P47_TRACK_LIST) to the list of page items to be submitted when the report is refreshed.
I set up an example here
