download data from website when no URL specified - r

I am trying to pull data related to trade (imports and exports) from the websites of different central banks and statistical offices into RStudio.
This is not a problem when a URL is associated with a file (.pdf, .csv, .xls, ...). However, I can't find a solution when the user has to specify the filters manually (e.g. years, months, sectors, ...) and no URL is associated with the query.
For example, I am trying to load the imports and exports of El Salvador at this URL: http://www.bcr.gob.sv/bcrsite/?cdr=38
It appears that the data is not stored in the HTML code of the web page. I have tried web scraping, but the data cannot be found this way, as the user first has to make a query and then click "Export the results".
How can I automatically load these datasets into RStudio?

Looks like you need to use http://www.bcr.gob.sv/bcrsite/downloadsxls.php?exportCDR=1&cdr=38&xls=1 to get the XLS file, which you can then parse. Make sure they are OK with this service being used as an API.
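A minimal sketch of how that might look in R, assuming the endpoint really returns a parseable .xls file (if it turns out to be HTML served with an .xls extension, read_excel will fail and you would need an HTML-table reader such as rvest instead):

# Download the export endpoint to a temporary file and read it with readxl.
library(readxl)

url <- "http://www.bcr.gob.sv/bcrsite/downloadsxls.php?exportCDR=1&cdr=38&xls=1"
tmp <- tempfile(fileext = ".xls")
download.file(url, tmp, mode = "wb")   # mode = "wb" matters on Windows

trade <- read_excel(tmp)               # adjust sheet/skip once you see the layout
head(trade)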

Related

Accessing calendar from OpenSRS/Sabre via CALDAV

I'm trying to get a single .ics file containing all calendar entries for a particular user of OpenSRS, which I believe has something to do with Sabre. I've tried code that I have used successfully on other CalDAV servers, but it doesn't seem to work the same way here. Alternatively, I could make multiple HTTP calls to get the individual .ics files, if there is a way to do that.
Using what I believe is the correct server (example: mail.servername.ca/caldav/username#domain.ca), I make an HTTP request with the custom method "PROPFIND". I get back HTTP 207, and the return data is a bunch of XML. Part of it is an href to a web page which, when retrieved, is an HTML file that displays links to the .ics files of each event (under the heading "Nodes"). So I suppose I could scrape this HTML to get a list of links and then download them one by one. But I'm not entirely sure what would happen if I had hundreds or thousands of events - would I get them all in a single HTML file? And that would be very slow, of course.
I've also tried the "REPORT" command, which is how I get .ics data from other CalDAV servers, but that does not return useful data. I was hoping someone could point me at a better method of doing this.
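Not an OpenSRS-specific answer, but as a rough illustration of the two request shapes, here is a sketch in R with httr (the server URL and credentials are placeholders, and whether this particular Sabre setup honours a calendar-query REPORT is exactly what is in question):

library(httr)

# Placeholders: substitute your own server, user and password.
cal_url <- "https://mail.servername.ca/caldav/username%23domain.ca/"
auth <- authenticate("username#domain.ca", "password")

# PROPFIND with Depth: 1 lists the child resources (one href per event).
propfind <- VERB("PROPFIND", cal_url, auth,
                 add_headers(Depth = "1", `Content-Type` = "application/xml"))
status_code(propfind)           # expect 207 Multi-Status
content(propfind, as = "text")  # XML listing the hrefs of the individual .ics files

# REPORT (calendar-query) asks the server to return the calendar data itself,
# so all events come back in one multistatus response instead of one request each.
query <- '<?xml version="1.0" encoding="utf-8"?>
<c:calendar-query xmlns:d="DAV:" xmlns:c="urn:ietf:params:xml:ns:caldav">
  <d:prop><d:getetag/><c:calendar-data/></d:prop>
  <c:filter><c:comp-filter name="VCALENDAR"/></c:filter>
</c:calendar-query>'
report <- VERB("REPORT", cal_url, auth, body = query,
               add_headers(Depth = "1", `Content-Type` = "application/xml"))
content(report, as = "text")    # multistatus XML containing the .ics payloads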

Issue scraping financial data via xpath + tables

I'm trying to build a stock analysis spreadsheet in Google Sheets by using the importXML function with absolute XPath expressions, and the importHTML function with tables, to scrape financial data from the www.morningstar.co.uk key ratios page for the companies I like to keep an eye on.
Example: https://tools.morningstar.co.uk/uk/stockreport/default.aspx?tab=10&vw=kr&SecurityToken=0P00007O1V%5D3%5D0%5DE0WWE%24%24ALL&Id=0P00007O1V&ClientFund=0&CurrencyId=BAS
=importxml(N9,"/html/body/div[2]/div[2]/form/div[4]/div/div[1]/div/div[3]/div[2]/div[2]/div/div[2]/table/tbody/tr/td[3]")
=INDEX(IMPORTHTML(N9,"table",12),3,2)
N9 being the cell containing the URL to the data source
I'm mainly using Morningstar as my data source because of the overwhelming amount of free information, but the links keep breaking: either the URL has changed slightly or the XPath hierarchy has been altered.
My guess, from what I've read so far, is that busy websites such as these are dynamic and change often, which is why my static links are breaking.
Is anyone able to suggest a solution, or confirm whether CSS selectors would be a more stable/reliable method of retrieving the data?
Many thanks in advance
I have tried both short and long XPath expressions (copied from the dev tools in Chrome) and have frequently changed the URL to repair the link to the data source, but it keeps breaking shortly afterwards and I am unable to retrieve any information.

Attempting to design a flexible reporting system. Getting stuck

I’m having some trouble coming up with a future-proof-ish design for reports for a company. Essentially the requirements are:
Be able to pull whatever data from the database
Generate formatted report from that data by populating a template (HTML, docx)
Export to Word and/or PDF
So initially I made an API endpoint per report (this is a web app), and had PDFs generated and formatted correctly.
But now I need to get the data into .docx/Word format, and I'm trying to figure out how I can design something as D.R.Y. as possible so that I don't have to put in a TON of work every time the company decides they need another report (they've done this two or three times now, which is how I became aware that I had coded myself into a corner).
Every report I’ve done thus far has been done via a “brute-force” method: code the queries needed for the report, format the data, and then render to PDF (using HTML to PDF via phantomjs).
The complexity arose when the company came back and said "Hey, we need all of those reports in Word format; also, we have 3 other new reports that we need, and a report that is a slight variation on the old one but +/- 2 fields".
I am just having trouble coming up with a solid design/abstraction here, one that doesn't send me down a week-long hacking spree every time a requirement changes.

R - Download Data from Webpage

I want to download a CSV file from a webpage where I have to select the time frame covered by the data as well as the columns I want to download.
The page is the following:
https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=258&DB_Short_Name=Air%20Carriers
I wanted to ask how I can download this table for the years 2015 and 2016, with the columns passenger, carrier, origin and dest.
Using the Chrome developer tools I found out that when the "Download" button is clicked, a function "TryDownload()" is called in the background, which should be reachable via a POST request. However, I don't understand how I can make this call from R, or how to change the default selected columns.
Thank you for your help.
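For what it's worth, a hedged sketch of the general approach in R with httr: reproduce the POST that TryDownload() fires, copying the real request URL and form field names from the Network tab of the developer tools. Everything below (the endpoint and the field names) is a placeholder, not the actual values:

library(httr)

# Placeholders only: open the Network tab, click "Download", and copy the real
# request URL and every form field the browser sends (they encode the table,
# the selected columns and the year/month filters).
resp <- POST(
  "https://www.transtats.bts.gov/DownLoad_Table.asp",   # assumed endpoint, check dev tools
  body = list(
    Table_ID      = "258",
    Selected_Cols = "PASSENGERS,CARRIER,ORIGIN,DEST",   # placeholder field name
    Year          = "2015,2016",                        # placeholder field name
    Download      = "Download"
  ),
  encode = "form"
)

# The response is usually a zipped CSV; save it and unpack.
writeBin(content(resp, as = "raw"), "air_carriers.zip")
unzip("air_carriers.zip")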

Hiding google maps raw data from user

I am trying to get into the Google Maps API v3 to display store locations.
All non-Flash tutorials for Google Maps that I have seen so far create an array with the latitudes and longitudes either in the JavaScript part of the HTML or in a separate JS file.
However, that way I list all coordinates in plain text in the requested HTML page.
Is there a way to hide the exact locations in a separate file or layer that is not accessible to the user? I would like to display the locations in a broad view while keeping the exact locations hidden.
Thank you for any suggestions.
I do not know if it is entirely possible, but you can try creating an external PHP script that returns JSON output with all the Google Maps data.
At the beginning of the script you can check the referer and, if it is correct (i.e. the request comes from your own site), return the data; otherwise, print an error, etc.
In JavaScript, load the whole dataset with Ajax.
However, there is no way to permanently hide the data from the user: it is always possible to write a script that exports it from the Google Map (for example, using Firebug or the Chrome console).
