How to Webscrape Data from an Interactive Chart on a Webpage? - r

I am wondering whether it is possible to extract data (highest hour, daily average, March 2016 to March 2021) from the following interactive chart: Image of Chart: Bangkok-PM2.5 - Long-term History (daily averages)
The Website that contains the chart: http://berkeleyearth.lbl.gov/air-quality/local/Thailand/Bangkok
Then I would like to create a data frame in R with the extracted data from the chart.
Please let me know whether it is possible to do this, thank you in advance.

The following will get you a list of all the data on the page
url <- "http://berkeleyearth.lbl.gov/air-quality/maps/cities/Thailand/Bangkok/Bangkok.json"
all_data <- jsonlite::fromJSON(url)
The historical daily average, maximum and minimum PM2.5 values can be found in the following ugly list:
all_data$day_data
I'll leave the data manipulation and plotting up to you...

Related

visualizing time series density (without data) in R

I have a time series covering 4 years. What I need to do is display the density of the time series by visualising only the dates for which data exists. The data values themselves don't matter. The picture shows a sketch. Does anybody know how to do this in R? Thanks a lot.

Monthly average precipitation from multiple netCDF files in R

I'm a new student of R and my current goal is to build monthly precipitation 3D data (lon, lat, time) for South Korea for 2018-2021 from the netCDF files at https://downloads.psl.noaa.gov/Datasets/cpc_global_precip/ (which contain daily data).
I was able to extract the data from one cm file by following a YouTube video, but after that I am totally lost. I suspect that I have to use loops to build a data frame of the data I need.
Can anyone point me in the right direction?
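One possible direction, sketched in base R: once the daily values are in a lon x lat x time array (the shape that ncdf4::ncvar_get() typically returns for a gridded variable), the days can be grouped by calendar month and averaged cell by cell. The array `daily` and the vector `dates` below are synthetic stand-ins made up for illustration, not the actual CPC data.

```r
# Synthetic stand-in for daily precipitation on a 2 x 2 grid (assumption,
# not real data): January and February 2018, 59 days in total.
dates <- seq(as.Date("2018-01-01"), by = "day", length.out = 59)
set.seed(1)
daily <- array(runif(2 * 2 * length(dates)), dim = c(2, 2, length(dates)))

# Label every day with its calendar month, e.g. "2018-01".
month_id <- format(dates, "%Y-%m")
months <- unique(month_id)

# For each month, average over the time dimension for every grid cell.
# With real data containing missing values, mean(..., na.rm = TRUE) may be needed.
monthly <- vapply(
  months,
  function(m) apply(daily[, , month_id == m, drop = FALSE], c(1, 2), mean),
  FUN.VALUE = matrix(0, dim(daily)[1], dim(daily)[2])
)
dimnames(monthly) <- list(NULL, NULL, months)

dim(monthly)  # lon x lat x number of months
```

Replacing `mean` with `sum` would give monthly totals instead of monthly averages. For several yearly files, the same step could be applied per file and the results joined along the third dimension.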

dygraphs doesn't show the line for more than 10,000 datapoints

I am trying to plot a graph using the dygraph function for a dataset with more than 100,000 datapoints. As soon as I try it, the graph appears empty. I tried shortening the dataset, and it turns out that dygraph only shows a graph for datasets with fewer than 10,000 entries. Here is a sample with 9,999 datapoints:
dygraph(ts(1:9999))
As soon as I change to 10,000, it doesn't show anything:
dygraph(ts(1:10000))
After some research I came to the conclusion that this is a bug. Nevertheless, I found a workaround: if you convert your data to a time series using the timeSeries function (from the timeSeries package), it starts working.
For example:
library(timeSeries)
library(dygraphs)
y <- timeSeries(1:1000000, 1:1000000)
dygraph(y)

Best way to transform source data?

Working in R. But I think this question is universal.
The Wall Street Journal visualized a dataset on disease infection rates in the U.S.:
X-axis is year. Y-axis is state.
And shade of red per tile is infection rate intensity for that particular state recorded for that year.
The source dataset being visualized is arranged as follows:
Each row in the dataset corresponds to a single infection rate for a single state in a given year. So, each red tile in the visualization corresponds to a row from the dataset.
But what if the dataset looked like this?:
Now, each row corresponds to a state. And each state/row has multiple infection rates, one for each year recorded. This might match how data is captured in the real world, because for each year or day (in the case of coronavirus) that you track the infection rate, you can just add a new column (rather than 50 new rows).
The problem is while this layout is more human-friendly, it's not very R-friendly. We can easily create the tile visualization based on the source dataset arrangement where data is arranged by infection rate, but not so easily if it's arranged by state.
So, finally, my question is — is there an easy way to transform data from the second layout to the first, in Excel?
You can use the transpose function in the free, open-source OpenRefine tool to prepare your data file prior to loading it into R.
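If you would rather stay in R, a minimal sketch of the same wide-to-long reshape in base R could look like this. The data frame `wide`, its column names, and the rate values are made up for illustration; the only assumption is one row per state and one column per year, as in the second layout described in the question.

```r
# Hypothetical wide table: one row per state, one column per year.
wide <- data.frame(
  state  = c("Alabama", "Alaska"),
  `1928` = c(10.5, 3.2),
  `1929` = c(12.1, 4.0),
  check.names = FALSE  # keep the year column names as "1928", "1929"
)

# Every column except `state` holds the rates for one year.
year_cols <- setdiff(names(wide), "state")

# Stack the year columns on top of each other: one row per state and year.
long <- data.frame(
  state = rep(wide$state, times = length(year_cols)),
  year  = rep(as.integer(year_cols), each = nrow(wide)),
  rate  = unlist(wide[year_cols], use.names = FALSE)
)

long
```

With the tidyr package installed, `tidyr::pivot_longer(wide, cols = -state, names_to = "year", values_to = "rate")` should produce the same rows, possibly in a different order and with `year` as character.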

How to show monthly data on diagram in Power BI from datetime

I have historical daily gold prices in Excel and I load them into Power BI. By default, Power BI shows me only year-based values, but for 2016 I have values only for January and February, so the diagram is distorted, as you can see below.
Base data looks like (can be downloaded here in .csv)
Visual setting is
I want the curve to be plotted by month to prevent the distortion.
The curve should be similar to this diagram,
which shows only years as axis labels.
Thank you for help
Power BI has generated a Date Hierarchy with only a Year level. I can tell from the word "Year" appearing on the Visualizations pane, under "Date" (in the Axis section).
I would click the small down-arrow to the right of "Date" in the Axis section and change that setting from "Date Hierarchy" to "Date". This will plot every daily value on the chart.
Unfortunately, you can't control the date format used in the X-axis labels; hopefully they will add that in a future release. The automatic labeling is usually OK.
