How to get data from a website with interaction - r

I'd like to get data with R from the following website: Website MFarm
I'm wondering whether there's a way to get it from R, or whether I must use something else (e.g. Selenium, or Python)?
Note that:
The data I need is the plot trend. This shouldn't be an issue, since it's only a matter of reading an HTML attribute and then re-processing the numbers within R.
The data is not shown immediately, but only a few seconds after the page is loaded, so some kind of wait-for function has to be used.
There are 7 tabs and I need data from all of them, so interaction is required.
There's a combo box and a text box whose parameters I have to set.
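For reference, a page like this (JavaScript-rendered content, tab clicks, form inputs) can usually be driven from R with RSelenium; below is a minimal sketch, with the URL and every selector purely hypothetical:

library(RSelenium)

# assumes a matching chromedriver / Selenium server is available locally
driver <- rsDriver(browser = "chrome", verbose = FALSE)
remDr <- driver$client

remDr$navigate("https://example-mfarm-page")   # hypothetical URL
Sys.sleep(5)                                   # crude wait for the JS-rendered plot

tabs <- remDr$findElements(using = "css selector", value = ".tab")  # hypothetical selector
pages <- lapply(tabs, function(tab) {
  tab$clickElement()          # switch tab
  Sys.sleep(2)                # wait for the tab's data to render
  remDr$getPageSource()[[1]]  # raw HTML, e.g. for rvest to parse
})

remDr$close()
driver$server$stop()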

Related

R, shiny, prevent error until field is calculated

I am building a Shiny application, and I would like there to be a field like this that displays the probability someone will return (given a bunch of underlying models):
And it pretty much works, except it's in decimal form:
The code pasted into that box looks like this:
paste(a$result)
I can get it to look "correct" and say '83%' for instance, instead of 0.83000.....
by using this code:
paste(round(a$result, digits = 2) * 100, "%")
But the problem is that, while this code does work, until you hit the "calculate" button it looks like this:
I wish I could provide some sample data to try, but given the interactivity of the Shiny app that would be very hard. Is there a simple solution?
Use the function shiny::req to make sure required values are present before performing the calculation. For example:
paste(round(req(a$result), digits = 2) * 100, "%")
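A minimal self-contained sketch of the pattern (the model output is faked with runif; all names are hypothetical):

library(shiny)

ui <- fluidPage(
  actionButton("calc", "Calculate"),
  textOutput("prob")
)

server <- function(input, output) {
  a <- eventReactive(input$calc, {
    list(result = runif(1))  # stand-in for the underlying models' prediction
  })
  output$prob <- renderText({
    # req() keeps the output silently blank until a result exists,
    # instead of showing an error before "Calculate" is hit
    paste(round(req(a()$result), digits = 2) * 100, "%")
  })
}

shinyApp(ui, server)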

Dynamic aggregation column input reference in Spotfire TERR data function

How can I make a dropdown menu that allows me to reference different columns and change the column reference of a data function in Spotfire's TERR/R?
I am creating 2D cross-plots of data, using a TERR data function to overlay the average profile line on top of the individual profile lines. I am trying to add the ability to toggle between different normalizations: I want to be able to see the data and its average under time normalization, pressure normalization, etc., without having to go into the data function and change the column-name reference every time.
I know how to make the dropdown in the text area and reference each visualization, so those change automatically, but I still can't figure out how to make the TERR data function's input column change dynamically with the dropdown selection so that the average line changes too.
There must be some way to simply say that whatever is in the document property should be the "group by" column the TERR data function aggregates against. (I'm using the R package dplyr to do various simple statistical aggregations on the data.)
Thanks for the help!
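One common pattern, sketched under the assumption that the document property is mapped into the data function as a string input parameter norm.col and the profile data arrives as input.table (both names hypothetical):

library(dplyr)

# norm.col holds the column name chosen in the dropdown, e.g. "time_norm"
avg_profile <- input.table %>%
  group_by(across(all_of(norm.col))) %>%   # group by the column named in the string
  summarise(avg_value = mean(value, na.rm = TRUE))

# older dplyr versions (as sometimes shipped with TERR) would use
# group_by_at(norm.col) instead of group_by(across(all_of(norm.col)))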

Assigning observation name to a value when retrieving a variable

I want to create a dataframe that contains > 100 observations on ~20 variables, based on a list of HTML files saved to my local folder. I would like to make sure that R matches the correct value per variable to each observation. Assuming that R uses the same order of going through the files when constructing each variable AND does not skip values in case of errors or the like, this should happen automatically.
But is there a "safe way" to do this, meaning assigning observation names to each variable value when retrieving the info?
Take my sample code for extracting a variable to make this clearer:
# Specifying the URL of the desired website to be scraped
library(rvest)
url <- 'http://www.imdb.com/search/title?count=100&release_date=2016,2016&title_type=feature'
# Reading the HTML code from the website
webpage <- read_html(url)
title_data_html <- html_text(html_nodes(webpage, '.lister-item-header a'))
rank_data_html <- html_text(html_nodes(webpage, '.text-primary'))
description_data_html <- html_text(html_nodes(webpage, '.ratings-bar+ .text-muted'))
df <- data.frame(title_data_html, rank_data_html, description_data_html)
This comes up with a list of rank and description data, but no reference to the observation name for rank or description (before binding them in the df). Now, in my actual code one variable suddenly comes up with one value too many: 201 descriptions, but there are only 200 movies. Without a reference to which movie a description belongs to, it is very tough to see why that happens.
A colleague suggested extracting all variables for one observation at a time and extending the dataframe row-wise (one observation at a time) instead of column-wise (one variable at a time), but spotting errors and clean-up needs per variable seems much more time-consuming that way.
Does anyone have a suggestion of what is the "best practice" in such a case?
Thank you!
I know it's not a satisfying answer, but there is no single strategy for solving this type of problem. This is the work of web scraping: there is no guarantee that the HTML is structured the way you'd expect.
You haven't shown us a reproducible example (something we can run on our own machine that reproduces the problem you're having), so we can't help you troubleshoot why you ended up extracting 201 nodes during one call to html_nodes when you expected 200. Best practice here is the boring old advice to LOOK at the website you're scraping, LOOK at your data, and see where the extra or duplicate description is (or where the missing movie is). Perhaps there's an odd element with an attribute that also matches your selector text. Look at both the website as it appears in a browser and at its source. Right-click, CTRL + U (PC), or OPT + CMD + U (Mac) are some ways to pull up the source code. Use the search function to see what matches the selector text.
If the HTML document you're working with is like the example you used, you won't be able to extract the name of the movie together with the description directly: you're already extracting the names, and the names are not in the same elements as the descriptions.
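That said, the usual rvest safeguard against misalignment is to select each movie's parent container first and extract every field within that container, so a missing description becomes NA instead of shifting all later rows. A sketch, assuming the list items are wrapped in '.lister-item' nodes (as on IMDb list pages of that era):

library(rvest)

webpage <- read_html(url)
items <- html_nodes(webpage, '.lister-item')   # one node per movie

# html_node (singular) returns exactly one match, or a missing node,
# per item, so every column keeps the same length and row order
df <- data.frame(
  title       = html_text(html_node(items, '.lister-item-header a')),
  rank        = html_text(html_node(items, '.text-primary')),
  description = html_text(html_node(items, '.ratings-bar+ .text-muted')),
  stringsAsFactors = FALSE
)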

R ReporteRs Package - Add text after bookmark

I'm creating a tool to automatically generate reports for people in the office using the ReporteRs package in R. We have a standard set of tables/graphs, but the number of times a single table or graph may appear will vary from person to person. Due to this, I cannot make a single template with a fixed number of bookmarks.
I was hoping to get around this by having a bookmark on the first figure title and then repeatedly adding figure titles and graphs/tables underneath that one bookmark.
The 'addParagraph' function will only replace the bookmarked paragraph, so it will not work. I also tried replacing the bookmarked paragraph with a set of paragraphs, but since I have to alternate text and tables, the bookmark ends up spanning two paragraphs after the first iteration and stops working after that.
Is there any way to simply add a piece of text after a bookmarked paragraph?
No, that's not possible with ReporteRs. Maybe you could use the officer package instead; its cursor_* functions would help you.
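A sketch of that approach with officer (bookmark, styles, and file names hypothetical):

library(officer)

doc <- read_docx("template.docx")        # template containing a bookmark named 'figures'
doc <- cursor_bookmark(doc, "figures")   # move the cursor to the bookmarked paragraph

for (i in 1:3) {
  # each call with pos = "after" inserts below the cursor and advances it,
  # so alternating titles and tables simply stacks content downwards
  doc <- body_add_par(doc, paste("Figure title", i), style = "Normal", pos = "after")
  doc <- body_add_table(doc, head(mtcars, 3), pos = "after")
}

print(doc, target = "report.docx")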

R console output too long, how can I view outputs in `less`?

I am inspecting research data from NIR spectroscopy. Unfortunately, the output is too big (2048 rows by 15 columns).
Very often, when I try to check a variable like mymodel$loadings, my results get truncated.
I understand that I can increase the max output of my terminal, but it's really a hassle to scroll up with the mouse in my terminal window. Is there a way I can tell R to pipe the output of my last statement to less or more, so I can scroll using the keyboard?
Are you using a version of RStudio? I would generally look at tables like this in the Data Viewer pane; it lets you see all the data in tables like yours a lot more easily.
Access it by clicking on the data frame name in the top right, or using the following in the console:
View(dataframe_name)
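For a plain terminal session without RStudio, base R's utils::page sends a printed object to the system pager (getOption("pager"), usually less on Unix); a sketch, assuming mymodel$loadings prints as a matrix:

# open the printed representation in the pager, scrollable with the keyboard
page(mymodel$loadings, method = "print")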
