R: Scraping dynamic dropdown menus (rvest, httr)

I've done a little bit of web scraping previously, but I'm very much a beginner when it comes to HTML and XML structures.
On the following website (https://www.rec-registry.gov.au/rec-registry/app/calculators/swh-stc-calculator) there is a form with the following fields:
System brand
System model
Installation date
Postcode
Disclaimer check box
And then a 'calculate' button to generate results
The first two fields are drop-down boxes; the second box (system model) changes dynamically based on the system brand input.
What I would like to do is extract a list of all system brand options (~130), then for each brand extract all associated system models, and iteratively enter fixed values for the installation date and postcode and return the values generated by the calculator.
I can hunt down the XPath of the system brands (//*[@id="refSystemBrand"]), but when I've tried extracting a list of system brands via rvest::html_element it yields an empty list.
Based on some similar SO questions, I suspect I might need to use httr::POST() to drive the form entry (or the rvest::html_form_* functions?). But I genuinely have no idea where to start with these functions (and I don't find the help pages or intro tutorials very enlightening).
Any help on what I'm doing wrong, or the bits that I clearly don't understand about how web forms are designed would be very appreciated!

A basic way of getting what you want is to insert the resulting combinations from your lists into the following POST request.
require(httr)
library(tidyverse)   # not needed for the request itself, only for downstream wrangling

headers <- c(
  `Content-Type` = 'application/json; charset=utf-8',
  `Email` = 'add your email to be transparent'
)

# one brand/model/date/postcode combination as a JSON body
data <- '{"postcode":"3000","systemBrand":"AAE Solar","systemModel":"ES-250E-30-S26M","installationDate":"2021-09-30T00:00:00.000Z"}'

res <- httr::POST(url = 'https://www.rec-registry.gov.au/rec-registry/app/calculators/swh/stc',
                  body = data,
                  add_headers(.headers = headers),
                  verbose())
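To drive this over all of your brand/model combinations, one option is to wrap the request in a small function and map over the combinations. A minimal sketch, assuming you already have the brand and model lists and that the calculator returns JSON (the exact response fields are an assumption, so inspect the parsed result first):

library(httr)
library(jsonlite)
library(purrr)

# Hypothetical helper: posts one combination and returns the parsed response.
# The response structure is an assumption -- check content(res) before relying on it.
get_stc <- function(brand, model,
                    postcode = "3000",
                    date = "2021-09-30T00:00:00.000Z") {
  body <- jsonlite::toJSON(
    list(postcode = postcode, systemBrand = brand,
         systemModel = model, installationDate = date),
    auto_unbox = TRUE
  )
  res <- POST(
    "https://www.rec-registry.gov.au/rec-registry/app/calculators/swh/stc",
    body = body,
    add_headers(`Content-Type` = "application/json; charset=utf-8"),
    timeout(30)
  )
  stop_for_status(res)
  content(res, as = "parsed")
}

# combos would be a data frame of the brand/model pairs you have collected elsewhere:
# results <- purrr::map2(combos$systemBrand, combos$systemModel, get_stc)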

Related

Import PDF files from API into rows in Power BI

I have a Power BI dashboard I've made that pulls its data from a REDCap database using an API. It looks like this, with mostly text in the various columns:
What I'd love to do is make it so that the fields circled in red were real files that could be clicked and downloaded. I know that the API allows me to pull files from it. I've used R with code like this (that individually specifies which record and field I want):
library(REDCapR)
redcap_download_file_oneshot(
  redcap_uri = "https://redcap.company.org/redcap/api/",
  token = "################",
  record = "1",
  field = "full_protocol_attachment_t_v2",
  event = "",
  repeat_instrument = NULL,
  repeat_instance = NULL,
  verbose = TRUE,
  config_options = NULL,
  overwrite = TRUE
)
To individually download files one at a time. The problem is twofold:
If I were to use R, I have no idea how to automate that snippet of code for every row I may pull from the database (and for any new rows that appear).
My understanding of Power BI is that if I do use R, it makes refreshing the data harder when the report is published online. Right now, given that all the data comes from an API directly into Power BI, I don't have to set up any fancy permissions or gateways to have automated refreshes.
So my question is: is there a way to do this directly within Power BI? Like a calculated column or something that would pull a particular record's file based on what row it was in?
The only thing you can do in native Power BI is have a URL which, when clicked, will open the destination for you. Can you construct a full URL for the file download?
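On the R side of the question (automating the one-at-a-time download), a minimal sketch that loops the same call over every record ID; the "record_id" field name and the output directory are assumptions and would need to match your project:

library(REDCapR)

uri   <- "https://redcap.company.org/redcap/api/"
token <- "################"

# Pull the record IDs first ("record_id" is an assumed identifier field name).
ids <- redcap_read_oneshot(redcap_uri = uri, token = token,
                           fields = "record_id")$data$record_id

# Download the attachment for each record into a local folder (assumed name).
for (id in ids) {
  redcap_download_file_oneshot(
    redcap_uri = uri,
    token      = token,
    record     = as.character(id),
    field      = "full_protocol_attachment_t_v2",
    directory  = "downloads",
    overwrite  = TRUE,
    verbose    = TRUE
  )
}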

String columns are not listed inside the metrics; only numeric column types can be selected from the metrics while creating a report in GoodData

Hi,
I am new to GoodData. I am trying to create a report in which I have to show the product name, but inside the Metrics tab I am only allowed to choose numeric columns.
How can I show string columns in the report?
I would recommend checking the GoodData documentation for platform beginners to get an understanding of the basic concepts. The tutorial is a good place to start - https://help.gooddata.com/display/doc/GoodData+Developer+Tutorial. There are also sections on Concepts - https://help.gooddata.com/display/doc/Concepts - and on Creating Metrics & Reports - https://help.gooddata.com/display/doc/Creating+Metrics+and+Reports
In your particular case, I suspect you are aiming to add an Attribute to your report, which can be done in the "How" section.

Extracting a table from a webpage in Automation Anywhere

Is there a way to extract a table from a web page in Automation Anywhere after taking certain steps using the web recorder? The table does not appear directly; it appears after clicking a few controls after launching the URL.
The table that I want to extract appears only after logging in to the website and filtering with a search-criteria control.
I used the web recorder to log in and put the desired search criteria into a text field, and now I want to extract the table. When I use the web recorder, it launches the URL again and takes me back to the login page, which I don't want. I want the bot to stay on the page. Please help.
Also, what is the significance of the session name of an extracted table?
If you click on Advanced View, you will find at Step 5 an option to run this command using an existing IE window. Try entering the URL of the page with the table rather than that of the login page.
The extracted table is accessed via the variable $Table Column(Index)$, with Index being the column number or column name.
You can export directly using object cloning and, in the selection criteria, export to a CSV file. You also need to tick HTML InnerText in the search criteria.
An old question, but my experience has been that the Extract Data/Table commands are rather poor. Not only do they only work in IE, but you also cannot call them as standalone commands; they have to be called via a web recording.
Instead, I've found it much more useful to object clone the initial element, grab the DOMXPath, and variablize that. Then throw it into a Loop While command and set the condition on finding at least one element (of the elements for the table you are trying to build). You can grab all sorts of useful info in the Object Clone command and then write that to a variable/table.
For example
//div[@id='updatable-standings']/div[1]/div[1]/div[2]/div[1]/table[1]/tbody[1]/tr[3]/td[2]/div[1]/span[2]
//div[@id='updatable-standings']/div[1]/div[1]/div[2]/div[1]/table[1]/tbody[1]/tr[4]/td[2]/div[1]/span[2]
I can create an incremental variable for {tr[3]}, call it $vTeamLoop$, and change my DOMXPath value in the Object Clone to be
//div[@id='updatable-standings']/div[1]/div[1]/div[2]/div[1]/table[1]/tbody[1]/tr[$vTeamLoop$]/td[2]/div[1]/span[2]
Ultimately, it is more steps than the Data/Table Extract command, but it is far less limited in scope.
Hope that helps.
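For what it's worth, the same "increment the row index inside the XPath" idea can be sketched outside Automation Anywhere, for example in R with rvest; this is purely illustrative, the URL is a placeholder and the XPath is the example from above:

library(rvest)

page <- read_html("https://example.com/standings")   # placeholder URL

values <- character(0)
i <- 3   # first data row in the example above
repeat {
  xpath <- sprintf(
    "//div[@id='updatable-standings']/div[1]/div[1]/div[2]/div[1]/table[1]/tbody[1]/tr[%d]/td[2]/div[1]/span[2]",
    i
  )
  node <- html_element(page, xpath = xpath)
  if (inherits(node, "xml_missing")) break   # stop when the row no longer exists
  values <- c(values, html_text2(node))
  i <- i + 1
}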

R: Submit Form then Scrape Results

I'm trying to write code in R that will allow me to submit a query on http://nbawowy.com/ and scrape the resulting data. I'd like to be able to input values for at least the "Select Team", "Select Players On", and "Select Players Off" fields and then submit the form. As an example, if I select the 76ers as my team, and Ben Simmons as the "Player On", the resulting query is found here: http://nbawowy.com/#/z31mjvm5ss. I've tried using the following code, but it provides me with an unknown field names error:
library(rvest)
url <- "http://nbawowy.com/#/l0krk654imh"
session <- html_session(url)
form <- html_form(read_html(url))[[1]]
filled_form <- set_values(form,
                          "s2id_autogen1_search" = "76ers",
                          "s2id_autogen2" = "Ben Simmons")
session1 <- submit_form(session, filled_form, submit = 'submit')
Since I can't seem to get past this initial part, I'm looking to the community for some help. I'd ultimately like to navigate the session to the resulting URL and scrape the data.
This is not a problem with your code. If you check the form returned by the website, you will see that not only are there no list elements named "s2id_autogen1_search" or "s2id_autogen2", but the whole form is in fact unnamed. Furthermore, what appears as one form in the browser is actually several forms (so the players' names cannot even be entered into the html_form(read_html(url))[[1]] element). Get to know the object you are trying to set values on with str(form). Alternatively, try scraping with RSelenium.
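A minimal RSelenium sketch of that alternative; the CSS selector for the Select2 search box is an assumption and would need to be confirmed in the browser's developer tools:

library(RSelenium)
library(rvest)

# Assumes a working local Selenium/driver setup (rsDriver will try to manage one).
rd    <- rsDriver(browser = "chrome", verbose = FALSE)
remDr <- rd$client

remDr$navigate("http://nbawowy.com/")
Sys.sleep(5)   # give the JavaScript app time to render

# Placeholder selector -- inspect the page to find the real one for "Select Team".
team_box <- remDr$findElement(using = "css selector", value = ".select2-search__field")
team_box$sendKeysToElement(list("76ers", key = "enter"))

# ...repeat for the player fields and trigger the query, then scrape the rendered table:
page   <- read_html(remDr$getPageSource()[[1]])
tables <- html_table(page)

remDr$close()
rd$server$stop()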

Content delivery criteria for getting data from LinkInfo Table

I need to fetch component IDs from the link info table where the URL field matches a certain value. Is there any criterion for getting data from the LINK_INFO table in Tridion using the Content Delivery API?
for example
Regards,
Rajendra
Components don't have URLs, so that might be a bit tricky to achieve.
Pages do have URLs, and you can use the PageFactory class to find them, something along these lines:
PageMetaFactory factory = new PageMetaFactory(publicationId);
PageMeta meta = factory.getMetaByUrl(publicationId, "/my/url");
List<ComponentPresentationMeta> cpMetas = meta.getComponentPresentationMetas();
This will contain a list of all component presentations for a given page. You can use cpMeta.getComponentId() to get the component ID of the component presentation in question.
You may want to start asking Tridion questions here: http://tridion.stackexchange.com
