R - web scraping dynamic forms skipping missing data - r

I am using RSelenium to scrape data off of a [website][1] that has a dynamic form where the multiple dropdown menus change depending on what is chosen. I am trying to pull the variable 'Number & Area of Operational Holdings' for every district in every state.
I am able to get the code working, but have an issue when the district does not have a table (The websites database has a few districts with no data). When my code runs into a district with no data, it finishes and I am left with an incomplete dataset.
How would I create a code that can skip over these districts that lack a table? My code is pasted below. A special shout out goes to the previous stack exchange thread on this, [link here][2], as I adapted their code. Also, if anyone can clean up my final output to avoid repeating the variable headers with every new district, it would be appreciated.
rm(list=ls(all=TRUE))
library(RSelenium)
library(XML)
library(dplyr)
library(magrittr)
library(devtools)
library(rvest)
# Start Selenium Server --------------------------------------------------------
checkForServer()
startServer()
remDrv <- remoteDriver()
remDrv$open()
# Simulate browser session and fill out form -----------------------------------
remDrv$navigate('http://agcensus.dacnet.nic.in/districtsummarytype.aspx')
# Select year
remDrv$findElement(using = "xpath",
"//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList2']/option[#value = '2010']")$clickElement()
# Select 1 == Number & Area of Operational Holdings
remDrv$findElement(using = "xpath",
"//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList3']/option[#value = '1']")$clickElement()
# Select 4 == All Social Group
remDrv$findElement(using = "xpath",
"//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList4']/option[#value = '4']")$clickElement()
# Select 3 == All Gender (Total)
remDrv$findElement(using = "xpath",
"//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList8']/option[#value = '3']")$clickElement()
# Get all state IDs and the respective names
state_IDs <- remDrv$findElements(using = "xpath",
"//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList1']/option") %>%
lapply(function(x){x$getElementAttribute('value')}) %>%
unlist
state_names <- remDrv$findElements(using = "xpath",
"//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList1']/option") %>%
lapply(function(x){x$getElementText()}) %>%
unlist
# Retrieve and download results ------------------------------------------------
result <- data.frame(state = character(), district = character(),
V1 = character(), V2 = character(), V3 = character(),
V4 = character(), V5 = character(), V6 = character(),
V7 = character(), V8 = character(), V9 = character(),
V10 = character(), V11 = character(), V12 = character())
for (i in seq_along(state_IDs)) {
remDrv$findElement(using = "xpath",
paste0("//select[#name = '_ctl0:ContentPlaceHolder1:DropDownList1']/option[#value = ",
"'", state_IDs[i], "']"))$clickElement()
Sys.sleep(2)
# Get all district IDs and names from the currently selected states
district_IDs <- remDrv$findElements(using = "xpath",
"//div[#id = '_ctl0_ContentPlaceHolder1_Panel14']/select/option") %>%
lapply(function(x){x$getElementAttribute('value')}) %>%
unlist
district_names <- remDrv$findElements(using = "xpath",
"//div[#id = '_ctl0_ContentPlaceHolder1_Panel14']/select/option") %>%
lapply(function(x){x$getElementText()}) %>%
unlist
for (j in seq_along(district_IDs)) {
remDrv$findElement(using = "xpath",
paste0("//div[#id = '_ctl0_ContentPlaceHolder1_Panel14']/select/option[#value = ",
"'", district_IDs[j], "']"))$clickElement()
Sys.sleep(2)
# Click submit and download data of the selected district
remDrv$findElement(using = "xpath",
"//input[#value = 'Submit']")$clickElement()
Sys.sleep(2)
######### if ##########
if (remDrv$findElement("xpath", "//input[#value ='No Records found'")) { #this isnt input value, but rather a "No Records found" lookup
remDrv$goBack()
Sys.sleep(2)
}
else {
# Download data for current district
district_data <- remDrv$getPageSource()[[1]] %>%
htmlParse %>%
readHTMLTable %>%
extract2(4) %>%
extract(c(-1, -2), )
result <- data.frame(state = state_names[i], district = district_names[j],
district_data) %>% rbind(result, .)
remDrv$goBack()
Sys.sleep(2)
}
}
}
remDrv$quit()
remDrv$closeServer()
result %<>% as_data_frame %>%
rename(
si_no = V1,
holding_size = V2,
Individual_Number = V3,
Individual_Area = V4,
Joint_Number = V5,
Joint_Area = V6,
Subtotal_Number = V7,
Subtotal_Area = V8,
Institutional_Number = V9,
Institutional_Area = V10,
Total_Number = V11,
Total_Area = V12
) %>%
mutate(
si_no = as.numeric(as.character(si_no))
)
str(result)
levels(result$state)
levels(result$district)

Related

Not able to download historical/past data with html_form_set() function in rvest

I am trying to download the past data from a website areavolume.
I am using rvest function html_form_set() to fill the form with the drop down select like interval = 15-minute-block, delivary = last 31days, type = both, Area = Mark All. snapshot required fill area . I have seen the solution from the site stack_site_1 and site site_with_httr. Snapshot for selection .
library(rvest)
library(httr)
library(tidyverse)
pg <- html_session('https://www.iexindia.com/marketdata/rtm_areavolume.aspx')
form.unfilled <- pg %>% html_node("form") %>% html_form()
form.filled <- form.unfilled %>% html_form_set("ctl00$InnerContent$ddlInterval" = "1", "ctl00$InnerContent$ddlPeriod" = "-31", 'ctl00$InnerContent$ddlType' = '1')
session <- session_submit(pg, form.filled)
table <- session %>% html_nodes("table")
vol_table <- html_table(table, fill=TRUE)
### another way selecting the date range
iex_html = 'https://www.iexindia.com/marketdata/rtm_areavolume.aspx'
iex_ses <- html_session(iex_html)
iex_form <- iex_ses %>% html_node("form") %>% html_form()
iex_fill <- iex_form %>% html_form_set("ctl00$InnerContent$ddlInterval" = "1", "ctl00$InnerContent$ddlPeriod" = "SR", "ctl00$InnerContent$calFromDate$txt_Date" = "01/03/2021", "ctl00$InnerContent$calToDate$txt_Date" = '03/03/2021', 'ctl00$InnerContent$ddlType' = '1')
iex_form$fields$`ctl00$InnerContent$btnUpdateReport`$type <- 'submit'
out <- session_submit(x = iex_ses, form = iex_fill)
out_table <- out %>% html_nodes("table")
out_table1 <- html_table(out_table, fill=TRUE)
###with httr
vol_htr <- POST("https://www.iexindia.com/marketdata/rtm_areavolume.aspx", body = list('ctl00$InnerContent$ddlInterval' = "ctl00$InnerContent$ddlInterval:1", 'ctl00$InnerContent$ddlPeriod' = "-31", 'ctl00$InnerContent$ddlType' = "1", 'ctl00$InnerContent$btnUpdateReport' = "Update Report"), encode = "form")
vol_httr_table <- read_html(vol_htr) %>% html_table(fill=TRUE)
It all shows the data table of present/current day data. I am sure that I am doing something wrong with submitting the 'update reports' May be my doubt with the selection of checkbox.
A RSelenium solution to download the excel file
#Start the server
library(RSelenium)
driver = rsDriver(browser = c("chrome"))
remDr <- driver[["client"]]
#Navigate to website
remDr$navigate("https://www.iexindia.com/marketdata/rtm_areavolume.aspx")
#Download the Excel file
button_element <- remDr$findElement(using ="xpath", value = '//*[#id="ctl00_InnerContent_reportViewer_ctl05_ctl04_ctl00_ButtonImg"]')
button_element$clickElement()
button_element <- remDr$findElement(using ="xpath", value = '//*[#id="ctl00_InnerContent_reportViewer_ctl05_ctl04_ctl00_Menu"]/div[1]/a')
button_element$clickElement()

Using reactive input within reactiveValue() function

I am new to shiny and trying to figure out some reactive stuff.
Currently this works for a static csv.
## function to return random row from twitter csv
tweetData <- read.csv('twitterData1.csv')
## stores reactive values
appVals <- reactiveValues(
tweet = tweetData[sample(nrow(tweetData), 1), ],
ratings = data.frame(tweet = character(), screen_name = character(), rating = character())
)
I need the same block of reactive values to be funciton but using a selected csv using input$file.
appVals <- reactiveValues(
csvName <- paste0('../path/', input$file),
tweetData <- read.csv(csvName),
tweet = tweetData[sample(nrow(tweetData), 1), ],
ratings = data.frame(tweet = character(), screen_name = character(), rating = character())
)
I get the error:
Warning: Error in : Can't access reactive value 'file' outside of reactive consumer.
I've tried moving things around but I keep getting stuck, help appreciated!
The error is telling that you should update the values inside a reactive expression.
First initialize the reactive values:
tweetData <- read.csv('twitterData1.csv')
appVals <- reactiveValues()
appVals$tweet <- tweetData[sample(nrow(tweetData), 1), ]
appVals$ratings <- data.frame(tweet = character(), screen_name = character())
Then update them with a reactive:
observeEvent(input$file,{
csvName <- paste0('../path/', input$file)
if (file.exists(csvName) {
tweetData <- read.csv(csvName)
appVals$tweet = tweetData[sample(nrow(tweetData), 1), ]
appVals$ratings = data.frame(tweet = character(), screen_name = character(), rating = character())
}
})

R: Plotting data using qplot

movies_df is a data frame with 100 records and a structure:
The genre with highest Runtime is plotted using qplot
qplot(data = movies_df, Runtime, fill = Genre, bins = 30)
From the graph above, 4 Action movies have the highest runtime (160)
How to get plot the Title of the movies that have genre = Action (the genre with the highest runtime) ?
Code to reproduce the data:
library("rvest")
url = "https://www.imdb.com/search/title?count=100&release_date=2016,2016&title_type=feature"
webpage = read_html(url)
## ---- PRE-PROCESSING ---- ##
# rank scraping
rank_data_html = html_nodes(webpage, ".text-primary")
rank_data = html_text(rank_data_html)
rank_data = as.numeric(rank_data)
#title scraping
title_data_html = html_nodes (webpage, ".lister-item-header a")
title_data = html_text(title_data_html)
#description scraping
desc_nodes = html_nodes(webpage, ".ratings-bar+.text-muted")
desc_data = html_text(desc_nodes)
desc_data = gsub("\n","",desc_data)
runtime_data_html = html_nodes (webpage, ".text-muted .runtime")
runtime_data = html_text(runtime_data_html)
runtime_data = gsub(" min", "", runtime_data)
runtime_data = as.numeric(runtime_data)
genre_data_html = html_nodes (webpage, ".genre")
genre_data = html_text (genre_data_html)
genre_data = gsub("\n", "", genre_data)
genre_data = gsub (" ","", genre_data)
genre_data = gsub(",.*", "", genre_data)
genre_data = as.factor(genre_data)
rating_data_html = html_nodes(webpage, ".ratings-imdb-rating strong")
rating_data = html_text(rating_data_html)
rating_data = as.numeric(rating_data)
votes_data_html = html_nodes(webpage, ".sort-num_votes-visible span:nth-child(2)")
votes_data = html_text(votes_data_html)
votes_data = gsub(",", "", votes_data)
votes_data = as.numeric(votes_data)
directors_data_html = html_nodes(webpage, ".text-muted+ p a:nth-child(1)")
directors_data = html_text(directors_data_html)
directors_data = as.factor(directors_data)
actors_data_html = html_nodes(webpage, ".lister-item-content .ghost+ a")
actors_data = html_text(actors_data_html)
actors_data = as.factor(actors_data)
metascore_data_html = html_nodes(webpage, ".metascore")
metascore_data = html_text(metascore_data_html)
metascore_data = gsub(" ", "", metascore_data)
for (i in c(39, 73, 80)){
a = metascore_data[1:(i-1)]
b = metascore_data[i:length(metascore_data)]
metascore_data = append(a, list("NA"))
metascore_data = append(metascore_data, b)
metascore_data = as.numeric(metascore_data)
}
gross_data_html = html_nodes(webpage, ".ghost~ .text-muted+ span")
gross_data = html_text(gross_data_html)
gross_data = gsub("M","",gross_data)
gross_data = substring(gross_data, 2, 6)
for (i in c(1,2,3,4,5,6,7,8,9,10)){
a = gross_data[1:(i-1)]
b = gross_data[i:length(gross_data)]
gross_data = append(a,list("NA"))
gross_data = append(gross_data,b)
}
gross_data = as.numeric(gross_data)
movies_df = data.frame(Rank = rank_data, Title = title_data,
Description = desc_data, Runtime = runtime_data,
Genre = genre_data, Rating = rating_data,
Metascore = metascore_data, Votes = votes_data,
Gross_Earning_in_Mil = gross_data, Director = directors_data,
Actor = actors_data)
At first you need to install the package dplyr AND load the library to filter youre data.
Then you need to filter youre data to all movies that have genre = Action. Thats it
library(dplyr)
newDataset <- movies_df %>% filter( Genre == "Action" )
qplot(data = newDataset, Runtime, fill = Title , bins = 30)
please rate my answer well if it helped you. And i hope, its all clear now

How to change dot to comma in R

get_bike_data <- function(url) {
html_bike_category <- read_html(url)
# Get the names
bike_name_tbl <- html_bike_category %>%
html_nodes(css = ".catalog-category-bikes__title-text") %>%
html_text() %>%
str_remove_all(pattern = "\n") %>%
enframe(name = "position", value = "name")
# Get the prices
bike_price_tbl <- html_bike_category %>%
html_nodes(css = ".catalog-category-bikes__price-title") %>%
html_text() %>%
str_remove_all(pattern = "\\.")%>%
extract_numeric()%>%
enframe(name = "position", value = "price_euro") %>%
left_join(bike_name_tbl)
}
# 2.3.1b Alternative with a for loop
# Create an empty tibble, that we can populate
# Loop through all urls
bike_data_tbl <- bike_data_tbl %>%
rename("model" = "name")%>%
subset(nchar(price_euro)!=0)
bike_data_tbl
this is a data of price and model from a website. I wanted to change the 1.699 to 1,699. Although I tried many other methods(format(decimal.mark=","), parse.number(), sub(), etc.) that I googled, it still does not work.
What is the problem?
Below a possible solution
library(stringr)
text<-c('1231.1','4343.5','312312.0')
str_replace(string = text,pattern = "[.]",replacement = ",")
[1] "1231,1" "4343,5" "312312,0"
another possible solution is:
num_text<-c(1231.1,4343.5,312312.0)
gsub("\\.", ",", num_text)
[1] "1231,1" "4343,5" "312312"
Both gsub and formatshould work
#format
format(bike_data_tbl$price_euro, decimal.mark = ",")
#gsub
gsub(pattern = ".", x = bike_data_tbl$price_euro, replacement = ",", fixed = TRUE)
However, it seems that the prices are in thousands (e.g. 1.699 for ground control model = 1699 euros. You could try this:
as.numeric(gsub(pattern = ".", x = bike_data_tbl$price_euro, replacement = "", fixed = TRUE))
The last function replaces all dots with nothing.

Highcharter Unlimited Drill down using R

I looked at this SO question, that SO question. I followed this github method check out ISSUES to help get ideas on how to solve this.
Step 1: Copy the following into text editor and save as test.csv file:
My datasource is NVD JSON FEEDS which I processed and cleaned and created this dataset with the first 25 rows shown here.
Year,Vendor name,Product name,CVE,Major,Minor,Build,Revision
1988,eric_allman,sendmail,CVE-1999-0095,5,5.58,,
1988,ftp,ftp,CVE-1999-0082,*,,,
1988,ftpcd,ftpcd,CVE-1999-0082,*,,,
1989,bsd,bsd,CVE-1999-1471,4,4.2,,
1989,bsd,bsd,CVE-1999-1471,4,4.3,,
1989,sun,sunos,CVE-1999-1122,4,4.0,,
1989,sun,sunos,CVE-1999-1122,4,4.0,4.0.1,
1989,sun,sunos,CVE-1999-1122,4,4.0,4.0.3,
1989,sun,sunos,CVE-1999-1467,4,4.0,,
1989,sun,sunos,CVE-1999-1467,4,4.0,4.0.1,
1989,sun,sunos,CVE-1999-1467,4,4.0,4.0.2,
1989,sun,sunos,CVE-1999-1467,4,4.0,4.0.3,
1989,sun,sunos,CVE-1999-1467,4,4.0,4.0.3c,
1990,digital,vms,CVE-1999-1057,5,5.3,,
1990,freebsd,freebsd,CVE-2000-0388,3,3.0,,
1990,freebsd,freebsd,CVE-2000-0388,3,3.1,,
1990,freebsd,freebsd,CVE-2000-0388,3,3.2,,
1990,freebsd,freebsd,CVE-2000-0388,3,3.3,,
1990,freebsd,freebsd,CVE-2000-0388,3,3.4,,
1990,hp,apollo_domain_os,CVE-1999-1115,sr10,sr10.2,,
1990,hp,apollo_domain_os,CVE-1999-1115,sr10,sr10.3,,
1990,next,nex,CVE-1999-1392,1,1.0a,,
1990,next,next,CVE-1999-1198,2,2.0,,
1990,next,next,CVE-1999-1391,1,1.0,,
1990,next,next,CVE-1999-1391,1,1.0a,,
Step 2: Copy and paste following code into R and run it:
My codes follow this SO question especially the discussion by #NinjaElvis. I think it is possible to create 3 levels or more. Just doing more research and figuring out.
############################
suppressPackageStartupMessages(library("highcharter"))
library("dplyr")
library("purrr")
library("data.table")
second_el_to_numeric <- function(ls){
map(ls, function(x){
x[[2]] <- as.numeric(x[[2]])
x
})
}
cve_affected_product <- fread("test.csv")
# LAYER ONE YEAR DRILLDOWN VIEW #########################
Year <- cve_affected_product[,c(1:2)]
Year <- unique(Year[,list(Year,`Vendor name`)])
Year <- Year[,c(1)][,count:=1]
Year <- setDT(aggregate(.~ Year ,data=Year,FUN=sum))
#setorder(Year, Year, `Vendor name`)
years_df <- tibble(
name = c(Year$Year),
y = c(Year$count),
drilldown = tolower(paste(name,'id'))
)
ds <- list_parse(years_df)
names(ds) <- NULL
# Vendor View HC ###########
hc <- highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = "Basic drilldown") %>%
hc_xAxis(type = "category") %>%
hc_yAxis(visible = FALSE,reversed = FALSE) %>%
hc_legend(enabled = FALSE) %>%
hc_plotOptions(
series = list(
boderWidth = 0,
dataLabels = list(enabled = TRUE)
)
) %>%
hc_add_series(
name = "Vendors",
colorByPoint = TRUE,
data = ds
)
# LAYER TWO VENDOR DRILLDOWN VIEW #########################
Vendor <- cve_affected_product[,c(1:3)]
Vendor <- unique(Vendor[,list(Year,`Vendor name`,`Product name`)])
Vendor <- Vendor[,c(1:2)][,count:=1]
Vendor <- setDT(aggregate(.~ Year+`Vendor name` ,data=Vendor,FUN=sum))
setorder(Vendor, Year, `Vendor name`)
years <- as.character(unique(Vendor$Year))
for(i in 1:length(years)){
tempdf <- Vendor[Vendor$Year==years[i],]
dfname <- paste("df",years[i],sep="")
dsname <- paste("ds",years[i],sep="")
X <- tibble(
name = c(tempdf$`Vendor name`),
y = c(tempdf$count),
drilldown = tolower(paste(name,'id'))
)
Y <- second_el_to_numeric(list_parse2(assign(dfname,X)))
assign(dsname,Y)
}
# Vendor View HC ###########
hc <- hc %>%
hc_drilldown(
allowPointDrilldown = TRUE,
series = list(
list(
id = "1988 id",
data = ds1988,
colorByPoint = TRUE,
keys = list('name','y','drilldown')
),
list(
id = "1989 id",
data = ds1989,
keys = list('name','y','drilldown')
),
list(
id = "1990 id",
data = ds1990,
keys = list('name','y','drilldown')
)#,
)
)
# LAYER THREE PRODUCT DRILLDOWN VIEW #########################
Product <- cve_affected_product[,c(1:4)]
Product <- unique(Product[,list(Year,`Vendor name`,`Product name`, CVE)])
Product <- Product[,c(1:3)][,count:=1]
Product <- setDT(aggregate(.~ Year+`Vendor name`+`Product name` ,data=Product,FUN=sum))
setorder(Product, Year, `Vendor name`,`Product name`)
vendors <- as.character(unique(Product$`Vendor name`))
for(i in 1:length(vendors)){
tempdf <- Product[Product$`Vendor name`==vendors[i],]
dfname <- paste("df",vendors[i],sep="")
dsname <- paste("ds",vendors[i],sep="")
X <- tibble(
name = c(tempdf$`Product name`),
y = c(tempdf$count),
drilldown = tolower(paste(name,'id'))
)
Y <- second_el_to_numeric(list_parse2(assign(dfname,X)))
assign(dsname,Y)
}
# Product View HC ###########
hc <- hc %>%
hc_drilldown(
allowPointDrilldown = TRUE,
series = list(
list(
id = "eric_allman id",
data = dseric_allman,
keys = list('name','y','drilldown')
),
list(
id = "ftp id",
data = dsftp,
keys = list('name','y','drilldown')
),
list(
id = "ftpcd id",
data = dsftpcd,
keys = list('name','y','drilldown')
),
list(
id = "bsd id",
data = dsbsd,
keys = list('name','y','drilldown')
),
list(
id = "sun id",
data = dssun,
keys = list('name','y','drilldown')
),
list(
id = "digital id",
data = dsdigital,
keys = list('name','y','drilldown')
),
list(
id = "freebsd id",
data = dsfreebsd,
keys = list('name','y','drilldown')
),
list(
id = "hp id",
data = dshp,
keys = list('name','y','drilldown')
),
list(
id = "next id",
data = dsnext,
keys = list('name','y','drilldown')
)#,
)
)
# LAYER FOUR CVE DRILLDOWN VIEW #########################
Product_CVE <- cve_affected_product[,c(1:5)]
Product_CVE <- unique(Product_CVE[,list(Year,`Vendor name`,`Product name`, CVE, Major)])
Product_CVE <- Product_CVE[,c(1:4)][,count:=1]
Product_CVE <- setDT(aggregate(.~ Year+`Vendor name`+`Product name`+CVE ,data=Product_CVE,FUN=sum))
setorder(Product_CVE, Year, `Vendor name`,`Product name`, CVE)
products <- as.character(unique(Product_CVE$`Product name`))
for(i in 1:length(products)){
tempdf <- Product_CVE[Product_CVE$`Product name`==products[i],]
ifelse(tempdf$`Vendor name`==tempdf$`Product name`,
dfname <- paste("df_",products[i],sep=""),
dfname <- paste("df",products[i],sep=""))
ifelse(tempdf$`Vendor name`==tempdf$`Product name`,
dsname <- paste("ds_",products[i],sep=""),
dsname <- paste("ds",products[i],sep=""))
X <- tibble(
name = gsub("-", "", c(tempdf$CVE)),
y = c(tempdf$count),
drilldown = tolower(paste(name,'id'))
)
Y <- second_el_to_numeric(list_parse2(assign(dfname,X)))
assign(dsname,Y)
}
# CVE View HC ###########
hc <- hc %>%
hc_drilldown(
allowPointDrilldown = TRUE,
series = list(
list(
id = "sendmail id",
data = dssendmail,
keys = list('name','y','drilldown')
),
list(
id = "ftp id",
data = ds_ftp,
keys = list('name','y','drilldown')
),
list(
id = "ftpcd id",
data = ds_ftpcd,
keys = list('name','y','drilldown')
),
list(
id = "bsd id",
data = ds_bsd,
keys = list('name','y','drilldown')
),
list(
id = "sunos id",
data = dssunos,
keys = list('name','y','drilldown')
),
list(
id = "vms id",
data = dsvms,
keys = list('name','y','drilldown')
),
list(
id = "freebsd id",
data = ds_freebsd,
keys = list('name','y','drilldown')
),
list(
id = "apollo_domain_os id",
data = dsapollo_domain_os,
keys = list('name','y','drilldown')
),
list(
id = "nex id",
data = dsnex,
keys = list('name','y','drilldown')
),
list(
id = "next id",
data = ds_next,
keys = list('name','y','drilldown')
)
)
)
Output First Level:
Output Second Level Clicking on 1988:
ISSUES:
I should be able to click on vendor eric_allman and get a drill down, however I am not. I want to be able to drill down all the way to Revision if it exists. This is just prototypying new functionality for my app to get it working. However, highcharter does not make this easy or efficient. My dataset has almost 4 million observations. That will be the next struggle how to handle that.
I am even considering using D3 by creating a JSON file in python if I cannot do it using R. However, creating the JSON file in python is not trivial but doable. I am currently working on python code as a back up.
Thank you for the help and any suggestions
Interestingly enough after posting this question, i decided to do more research, since most of the research was pointing me to example of how to do it in javascript. Google search took me to this SO response by #K. Rohde
This post is from 2015 and I really appreciate how he explained the two different approaches. I ended up using hybrid approach borrowing from both.
For those interested in seeing how the drill down is working, go to my shiny app. Once you on the page Click on Visualizations, then "Drill Down For Vendor CVE Affected Products Versions and Revisions"and try it out. Again I would not have done it without #K. Rohde write up.

Resources