Async process blocking R Shiny app

It should be possible to use the R packages future and promises to trigger asynchronous (long running) processing via Shiny apps without freezing the rest of the app while the async process is running in another R process.
See:
https://cran.r-project.org/web/packages/promises/vignettes/intro.html
https://cran.r-project.org/web/packages/promises/vignettes/overview.html
https://cran.r-project.org/web/packages/promises/vignettes/futures.html
https://cran.r-project.org/web/packages/promises/vignettes/shiny.html
I got this to work in a plain R-script environment, but I can't get it to work when I implement a simple Shiny app with 2 functions. The "not-async" function is always blocked while the async function is running, but that should not be the case.
I have posted the same question on the GitHub repo of the package promises: https://github.com/rstudio/promises/issues/23
I am posting it here as well hoping someone can help.
The question is:
Can you take a look at the shiny app example posted below and let me know why the async processing is blocking the app? (It should not block).
Ideally, can you provide a small example of an app with a non-blocking async and normal functionality (accessible while the async is running)?
Environment
Mac OS 10.12
$ R --version
R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
remove.packages("future")
remove.packages("promises")
remove.packages("shiny")
install.packages("future")
install.packages("devtools")
devtools::install_github("rstudio/promises")
devtools::install_github("rstudio/shiny")
> packageVersion("future")
[1] ‘1.8.1’
> packageVersion("promises")
[1] ‘1.0.1’
> packageVersion("shiny")
[1] ‘1.0.5.9000’
One side question on the shiny package version: https://rstudio.github.io/promises/articles/intro.html says it should be >= 1.1, but even when installing with devtools the version remains 1.0.5... Is this an issue, or is there a typo in the doc?
First, you can use promises with Shiny outputs. If you’re using an async-compatible version of Shiny (version >=1.1), all of the built-in renderXXX functions can deal with either regular values or promises.
Example of issue
I have implemented this simple shiny app inspired from the example at the URLs mentioned above.
The shiny app has 2 "sections":
A button to trigger the "long running" async processing. This is simulated by a function read_csv_async, which sleeps for a few seconds and then reads a csv file into a data frame. The df is then rendered below the button.
A simple piece of functionality that should work at any time (including while the async processing is running): a slider defining the number of random values to generate, and a histogram of these values rendered below it.
The issue is that the second functionality (histogram plot update) is blocked while the async processing is occurring.
global.R
library("shiny")
library("promises")
library("dplyr")
library("future")
# path containing all files, including ui.R and server.R
setwd("/path/to/my/shiny/app/dir")
plan(multiprocess)
# A function to simulate a long running process
read_csv_async = function(sleep, path) {
  log_path = "./mylog.log"
  pid = Sys.getpid()
  write(x = paste(format(Sys.time(), "%Y-%m-%d %H:%M:%OS"), "pid:", pid, "Async process started"), file = log_path, append = TRUE)
  Sys.sleep(sleep)
  df = read.csv(path)
  write(x = paste(format(Sys.time(), "%Y-%m-%d %H:%M:%OS"), "pid:", pid, "Async process work completed\n"), file = log_path, append = TRUE)
  df
}
ui.R
fluidPage(
  actionButton(inputId = "submit_and_retrieve", label = "Submit short async analysis"),
  br(),
  br(),
  tableOutput("user_content"),
  br(),
  br(),
  br(),
  hr(),
  sliderInput(inputId = "hist_slider_val",
              label = "Histogram slider",
              value = 25,
              min = 1,
              max = 100),
  plotOutput("userHist")
)
server.R
function(input, output) {
  # When button is clicked
  # load csv asynchronously and render table
  data_promise = eventReactive(input$submit_and_retrieve, {
    future({ read_csv_async(10, "./data.csv") })
  })
  output$user_content <- renderTable({
    data_promise() %...>% head(5)
  })
  # Render a new histogram
  # every time the slider is moved
  output$userHist = renderPlot({
    hist(rnorm(input$hist_slider_val))
  })
}
data.csv
Column1,Column2
foo,2
bar,5
baz,0
Thanks!

So this behavior is normal; see the response of the package developer at https://github.com/rstudio/promises/issues/23
Summary:
In shiny apps, one R process can be shared by multiple users.
If one user submits a long running task, then all the other users sharing the same underlying R process are blocked.
The goal of promises is to avoid this: promises prevent blocking between "user sessions" sharing one R process, but not within a single "user session".
The author of the package mentioned that this feature is not supported yet and that it may be added if enough people ask for it. If you are looking for this, please go to the GitHub issue and upvote the original question - this is how interest in new features is measured.
Thanks!

As this and similar questions about Shiny intra-session responsiveness are frequently asked on Stack Overflow, I think it's worth mentioning the workaround Joe Cheng provides in the GitHub issue Raphvanns created:
If you really must have this kind of behavior, there is a way to work around it. You can "hide" the async operation from the Shiny session (allowing the session to move on with its event loop) by not returning your promise chain from your observer/reactive code. Essentially the async operation becomes a "fire and forget". You need to hook up a promise handler to have some side effect; in the example below, I set a reactiveVal on successful completion.
Some caveats to this approach:
By doing this you are inherently opening yourself up to race conditions. Even in this very simple example, the user can click the Submit button multiple times; if the long-running task has very variable runtime you might end up with multiple results coming back, but out of order. Or if you reference input values in promise handlers, they might pick up values that were set after the submit button was clicked!
You also lose the automatic semi-transparent indication that an output has been invalidated (though below I at least null the reactiveVal out in the beginning of the observeEvent).
Accordingly, the solution for the example code above can look something like this:
library("shiny")
library("promises")
library("dplyr")
library("future")
# path containing all files, including ui.R and server.R
# setwd("/path/to/my/shiny/app/dir")
write.csv(data.frame(stringsAsFactors = FALSE,
                     Column1 = c("foo", "bar", "baz"),
                     Column2 = c(2, 5, 0)),
          file = "./data.csv")

onStop(function() {
  file.remove("./data.csv")
})

plan(multiprocess)

# A function to simulate a long running process
read_csv_async = function(sleep, path) {
  log_path = "./mylog.log"
  pid = Sys.getpid()
  write(x = paste(format(Sys.time(), "%Y-%m-%d %H:%M:%OS"), "pid:", pid, "Async process started"), file = log_path, append = TRUE)
  Sys.sleep(sleep)
  df = read.csv(path)
  write(x = paste(format(Sys.time(), "%Y-%m-%d %H:%M:%OS"), "pid:", pid, "Async process work completed\n"), file = log_path, append = TRUE)
  df
}

ui <- fluidPage(
  textOutput("parallel"),
  sliderInput(inputId = "hist_slider_val",
              label = "Histogram slider",
              value = 25,
              min = 1,
              max = 100),
  plotOutput("userHist"),
  actionButton(inputId = "submit_and_retrieve", label = "Submit short async analysis"),
  tableOutput("user_content")
)

server <- function(input, output, session) {
  data_promise <- reactiveVal()

  # When button is clicked
  # load csv asynchronously and render table
  observeEvent(input$submit_and_retrieve, {
    future({ read_csv_async(10, "./data.csv") }) %...>% data_promise()
    return(NULL) # hide future
  })
  output$user_content <- renderTable({
    req(data_promise())
    head(data_promise(), 5)
  })

  # Render a new histogram
  # every time the slider is moved
  output$userHist = renderPlot({
    hist(rnorm(input$hist_slider_val))
  })

  output$parallel <- renderText({
    invalidateLater(300)
    paste("Something running in parallel:", Sys.time())
  })
}

shinyApp(ui = ui, server = server)
Note the return(NULL) in the observeEvent call to hide the future. This way the long running process no longer blocks the execution of the other reactives.
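One of the caveats quoted above is that the user can click Submit again while a future is still running. A minimal sketch of one way to guard against that, not part of the original answer and assuming the same app structure (a plain reactiveVal flag, no extra packages):

running <- reactiveVal(FALSE)

observeEvent(input$submit_and_retrieve, {
  req(!running())      # silently drop clicks while a run is already in flight
  running(TRUE)
  data_promise(NULL)   # clear the stale result
  p <- future({ read_csv_async(10, "./data.csv") }) %...>% data_promise()
  finally(p, function() running(FALSE))  # re-arm on success or failure
  NULL                 # still "fire and forget": don't return the promise
})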

Related

Shiny App re-runs observeEvent when restoring a bookmarked state

I'm building a data-driven Shiny App that requires the user to bookmark the App state sometimes, so that she can save the processed data and revert to it later to continue working on the App. I have used the bookmarking method in Shiny, but I'm stuck with a problem: Shiny re-runs the event whenever I restore the state, as if I hit the button again.
I attach a sample code representing the problem. It is a simple interface with month and year dropdown menus. The user is expected to select a month and a year, then hit the 'Set Billing Month' button to store the values. If the user elects to bookmark the state, she hits the Bookmark button and receives the modal box with the URL. The user can then plug the URL into the browser and restore the state as she left it. All that works well, except that when the App restores the state, it re-runs the observeEvent function of the Set Month button. This is evident from the notification "Billing Month has been set!", which is designed to appear only when the button is hit.
I'm expecting Shiny to restore the bookmarked values without re-running the App to get them back. The bookmarking is done on 'server' as a store. This issue is a problem for me, especially for other functions that load and process big data.
I tried to replicate the problem with the sample code below.
Any assistance is highly appreciated!
library(shiny)
library(shinyjs)

enableBookmarking()

ui <- function(request) {
  fluidPage(
    useShinyjs(),
    bookmarkButton(),
    titlePanel("Billing App"),
    tabsetPanel(
      tabPanel("Invoicing Setup",
               selectInput("BillMo", label = "Month", choices = c(1,2,3,4,5,6,7,8,9,10,11,12)),
               selectInput("BillYr", label = "Year", choices = c(2022,2023,2024,2025,2026,2027,2028,2029,2030)),
               actionButton("btnSetInvMo", label = "Set Billing Month", icon = NULL)
      )
    )
  )
}

server <- function(input, output, session) {
  # Set up Reactive Values
  rVals <- reactiveValues(rBillMo = NULL,
                          rBillYr = NULL)

  observeEvent(input$btnSetInvMo, {
    BillMo <- as.integer(input$BillMo)
    BillYr <- as.integer(input$BillYr)
    # Assign reactive variables
    observe({rVals$rBillMo <- BillMo})
    observe({rVals$rPrvMo2 <- BillYr})
    shinyjs::disable("btnSetInvMo")
    showNotification("Billing Month has been set!", type = "message")
  })

  # BOOKMARKING
  onBookmark(function(state) {
    state$values$billMo <- rVals$rBillMo
    state$values$billYr <- rVals$rBillYr
  })
  onRestore(function(state) {
    rVals$rBillMo <- reactive(state$values$BillMo)
    rVals$rBillYr <- reactive(state$values$BillYr)
  })
}

shinyApp(ui, server, enableBookmarking = "server")
Add ignoreInit = TRUE to your observer
observeEvent(input$btnSetInvMo, {
  BillMo <- as.integer(input$BillMo)
  BillYr <- as.integer(input$BillYr)
  # Assign reactive variables
  observe({rVals$rBillMo <- BillMo})
  observe({rVals$rPrvMo2 <- BillYr})
  shinyjs::disable("btnSetInvMo")
  print("doing")
  showNotification("Billing Month has been set!", type = "message")
}, ignoreInit = TRUE)
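As a side note (not needed for the ignoreInit fix itself): the onRestore handler in the question assigns reactive() objects to rVals, and the stored names don't match (billMo/billYr vs BillMo/BillYr), so the values wouldn't round-trip even without the re-run. A hedged sketch of what the bookmarking hooks could look like instead:

onBookmark(function(state) {
  state$values$billMo <- rVals$rBillMo
  state$values$billYr <- rVals$rBillYr
})
onRestore(function(state) {
  # restore plain values, using the same names that were bookmarked
  rVals$rBillMo <- state$values$billMo
  rVals$rBillYr <- state$values$billYr
})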

How to run a scheduled cross-session database query in shiny

I have an Oracle database which is refreshed once a day. I am a bit confused about how apps work in Shiny: what gets run once on app startup, and what gets run once per session.
My naive approach was to create a database connection and run a query outside of the UI and server code to create a data frame of around 600,000 records... which can then be filtered and sliced during the session. My concern with doing this in the global scope of app.R is that the connection and data frame will only be created once when the server starts the app, and the query will never run again (if that makes sense).
If I create the data frame in server, then my UI code fails, as it depends on the results of a query to populate the select list; at the moment I do this at app.R scope so the UI can access it.
library(shiny)
library(DBI)
library(dplyr)
library(odbc)
library(stringdist)
library(reactable)
############################################################################
# business functions #
############################################################################
get_list_of_actives_from_db <- function() {
  con <- dbConnect(odbc::odbc(), Driver = "oracle", Host = "server.mycompany.net", Port = "1521",
                   SVC = "service1", UID = "user_01", PWD = "hello", timeout = 10)
  ingredients_df = dbGetQuery(con,
    '
    select DISTINCT INGREDIENTS FROM AES
    '
  )
  return(ingredients_df)
}

get_adverse_events_from_db <- function() {
  con <- dbConnect(odbc::odbc(), Driver = "oracle", Host = "server.mycompany.net", Port = "1521",
                   SVC = "service1", UID = "user_01", PWD = "hello", timeout = 10)
  cases_df = dbGetQuery(con,
    '
    select * FROM AES
    '
  )
  return(cases_df)
}

############################################################################
# load data sets for use in dashboard #
############################################################################
cases_df = get_adverse_events_from_db()        # main data to slice and filter
ingredients_df = get_list_of_actives_from_db() # drives the select list in the UI

############################################################################
# shiny UI #
############################################################################
ui <- fluidPage(
  "Adverse Event Fuzzy Search Tool",
  fluidRow(
    selectInput("ingredients", label = "Select one or more Active Ingredients:",
                choices = ingredients_df$PRIMARY_SUSPECT_KEY_INGREDIENT, multiple = TRUE),
    textInput("search_term", "AE Search Term:"),
    actionButton("do_search", "Perform Search")
  ),
  fluidRow(
    reactableOutput("search_results")
  )
)

############################################################################
# shiny server #
############################################################################
server <- function(input, output, session) {
  # do stuff here to filter the data frame based on the selected value and render a table
}

# Run the application
shinyApp(ui = ui, server = server)
My main concern is doing this at the root of app.R: both functions run Oracle queries which never need to be re-run within a session, as the data only changes overnight via ETL.
############################################################################
# load data sets for use in dashboard #
############################################################################
cases_df = get_adverse_events_from_db()
ingredients_df = get_list_of_actives_from_db()
When and how often is this called? Once when the app is initialized, so the data set is never updated and is shared across sessions by users? Or is the entire script run end to end whenever a new session is started?
Part of me thinks it should be in the server function, so it runs once per session. But being new to Shiny, I feel like server is called constantly whenever there is a change in the UI, and I don't want to be constantly loading 600,000 records from Oracle.
Ideally I would cache the results once a day and make them available to all users across all sessions; I'm not sure how to achieve that, so for now I just want to know the best approach so that each user runs the query once and has the data frame cached for the session.
Please check RStudio's article Scoping rules for Shiny apps in this context.
If I understood you correctly, you are asking how to share a dataset across Shiny sessions and update it daily (the title of the question didn't really fit your explanation of the problem, so I edited it).
I'd suggest using a cross-session reactivePoll to avoid unnecessary DB queries (I once asked a similar question here; over there I gave an example showing that the same can be achieved via reactiveValues, but it's more complex).
Here is the simple pattern you can use - please note that reactivePoll is defined outside the server function so all sessions share the same data:
library(shiny)

ui <- fluidPage(textOutput("my_db_data"))

updated_db_data <- reactivePoll(
  intervalMillis = 1000L*60L*5L, # check for a new day every 5 minutes
  session = NULL,                # NULL makes the poll cross-session
  checkFunc = function() {
    print(paste("Running checkFunc:", Sys.time()))
    Sys.Date()
  },
  valueFunc = function() {
    # your db query goes here:
    paste("Latest DB update:", Sys.time())
  }
)

server <- function(input, output, session) {
  output$my_db_data <- renderText(updated_db_data())
}

shinyApp(ui, server)
Here, every 5 minutes checkFunc checks for a new day; valueFunc is executed only if the result of checkFunc has changed. As a (real-world) alternative for checkFunc, you could run a query checking the number of rows of a certain DB table, as sketched below.
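For instance, a hedged sketch of such a checkFunc, reusing the (made-up) connection details from the question; in practice you would probably pool or reuse the connection rather than reconnect on every check:

checkFunc = function() {
  con <- DBI::dbConnect(odbc::odbc(), Driver = "oracle", Host = "server.mycompany.net",
                        Port = "1521", SVC = "service1", UID = "user_01", PWD = "hello",
                        timeout = 10)
  on.exit(DBI::dbDisconnect(con))
  # valueFunc re-runs the full query only when this row count changes
  DBI::dbGetQuery(con, "SELECT COUNT(*) AS N FROM AES")$N
}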
PS: There is an example of a cross-session reactiveFileReader (which is based on reactivePoll) in ?reactiveFileReader.
PPS: When doing further filtering etc. on that dataset, also check bindCache() (see the sketch below).
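A minimal sketch of how bindCache() might be combined with the poll, assuming shiny >= 1.6, that updated_db_data() returns the actual data frame in the real app (i.e. valueFunc runs the DB query), and that the AES data has an INGREDIENTS column as in the question:

# Inside server(): per-session filtering of the shared data, cached across sessions.
# Keying on both the inputs and the polled data means the cache entry is
# recomputed whenever either the filters or the daily data change.
filtered_cases <- bindCache(
  reactive({
    subset(updated_db_data(), INGREDIENTS %in% input$ingredients)
  }),
  input$ingredients, updated_db_data()
)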
While untested, perhaps this architecture will work:
server <- function(input, output, session) {
  dailydata_ <- reactiveValues(when = NULL, what = NULL)

  dailydata <- reactive({
    oldwhen <- dailydata_$when
    if (is.null(oldwhen) ||
        as.Date(oldwhen) < Sys.Date()) {
      newdata <- tryCatch(
        DBI::dbGetQuery(con, "..."),
        error = function(e) e)
      if (inherits(newdata, "error")) {
        warning("error retrieving new data: ", conditionMessage(newdata))
        warning("using stale data instead")
      } else {
        dailydata_$when <- Sys.time()
        dailydata_$what <- newdata
      }
    }
    dailydata_$what
  })

  # some consumer of the real data
  output$tbl <- renderTable(dailydata())
}
The advantage of this is that the re-query triggers whenever the current data was retrieved on a different day. Granted, when the new ETL becomes available might change how exactly this conditional is fashioned: if the database is updated at (say) 2am, you may need some more time-math to determine whether the current data is from before or after the most recent update.
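For instance, a hedged sketch of that time-math, assuming the ETL reliably finishes by 02:00 local time (the 2am figure is just the hypothetical one from above):

# Sketch only: shift times back by 2 hours so the "data day" starts at 02:00,
# making the condition stale the first time the app is used after the ETL.
etl_shift <- 2 * 3600  # 2 hours, in seconds
stale <- is.null(oldwhen) ||
  as.Date(oldwhen - etl_shift, tz = Sys.timezone()) <
  as.Date(Sys.time() - etl_shift, tz = Sys.timezone())
# 'stale' would replace the is.null(oldwhen) || as.Date(...) condition above.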
This logic fails towards "data available": if the database could not be queried, the current/stale data is re-used. If you prefer that it return no data instead, that is easy enough to change in the code.
(One thing you might want to do is to show the user when the data was last retrieved; this can be retrieved directly with dailydata_$when, accepting that it might be NULL.)
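A minimal sketch of that, with a hypothetical output name:

# Show the user when the shared data was last refreshed.
output$last_refresh <- renderText({
  if (is.null(dailydata_$when)) "Data not loaded yet"
  else paste("Data last retrieved:", format(dailydata_$when, "%Y-%m-%d %H:%M"))
})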

How to not waiting on promise evaluation in R

I would like to execute a very expensive function (called never_ending_calc in the example below) in a separate R session in a Shiny app. I have already prepared the code below; it works, but my parent R session waits on the result of the promise object, and this uses a lot of resources.
If you check the task manager while the test application below is running, you will see that the parent R session uses one CPU thread at maximum capacity while simply waiting on the result of the promise object.
How can I evaluate a promise in a way that does not use the resources of the parent R session? (It is also OK if I lose the connection between these two sessions.)
I tried the following (none of it worked):
using a different 'plan'
sending a custom message (like a warning) from the first line of never_ending_calc to somehow stop the parent session from waiting on the promise object
Here is the example:
library(shiny)
library(future)
library(promises)
library(future.callr)

never_ending_calc <- function() {
  K = 1
  for (i in 1:20) {
    K = K + i
    Sys.sleep(5)
  }
  return(K)
}

ui <- fluidPage(
  # App title ----
  titlePanel("Test app"),
  # Sidebar layout with input and output definitions ----
  sidebarLayout(
    # Sidebar panel ----
    sidebarPanel(
      # action button to start the long test calculation ----
      actionButton(inputId = "start_test",
                   label = "Start test run")
    ),
    mainPanel()
  )
)

server <- function(input, output) {
  observeEvent(input$start_test, {
    # start executing the long calculation in another separate R session
    future::plan(future.callr::callr)
    long_calculation_future <<- future::future({
      never_ending_calc()
    })
    promises::then(long_calculation_future,
                   onFulfilled = NULL,
                   onRejected = function(error) {NULL})
    future::plan(future::sequential)
    # return NULL to keep the shiny app reactive
    NULL
  })
}

shinyApp(ui, server)

DT Editing in Shiny application with client-side processing (server = F) throws JSON Error

I have a Shiny Server application in which the user can edit a datatable, after which some reactive summary statistics update accordingly. I am hosting this app on fairly slow infrastructure, which is why I want to use client-side processing for the DT rendering, i.e. server = FALSE passed to DT::renderDataTable. Let me break down the main points of my problem:
The code is fully operational when server = TRUE is passed.
When passing server = FALSE, the browser throws the following error message when the user edits a cell in the DT:
DataTables warning: table id=DataTables_Table_5 - Invalid JSON response. For more information about this error, please see http://datatables.net/tn/1
An interesting thing is that when this error window is dismissed, the dependent summary statistics update correctly according to the edit, and the Shiny app carries on. Hence, everything works except for the error. I should note that I visited the site referred to in the error without becoming any wiser.
Reproducible example below:
library(shiny)
library(DT)
library(data.table)  # needed for data.table() below

dt = data.frame(V1 = c(1,2), V2 = c(3,4))

server <- function(input, output, session) {
  val = reactiveValues(mat = data.table(dt))

  output$testDT = renderDataTable({
    DT::datatable(val$mat, editable = TRUE)
  }, server = FALSE)

  proxy = dataTableProxy('testDT')

  observeEvent(input$testDT_cell_edit, {
    info = input$testDT_cell_edit
    str(info)
    i = info$row
    j = info$col
    v = info$value
    if (j == 1) {
      val$mat$V1[i] = DT::coerceValue(v, val$mat$V1[i])
      replaceData(proxy, val$mat, rownames = FALSE)
    }
  })
}

ui <- fluidPage(
  dataTableOutput('testDT')
)

shinyApp(ui, server)
Thanks!
It has been answered on the GitHub thread, and I'm sharing my answer here.
Probably it's not documented clearly. It has nothing to do with the editing. It's because replaceData() calls reloadData(), which requires the server-side processing mode. See ?reloadData().
reloadData() only works for tables in the server-side processing mode, e.g. tables rendered with renderDataTable(server = TRUE). The data to be reloaded (i.e. the one you pass to dataTableAjax()) must have exactly the same number of columns as the previous data object in the table.
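In other words, with server = FALSE it is the replaceData()/proxy call that triggers the warning. A hedged sketch of one way to adapt the observer for client-side processing, not from the original answer and assuming you only need val$mat kept in sync for the downstream summary statistics (the client-side table already displays the edit):

# Render from an isolate()d copy so updating val$mat doesn't redraw the table
output$testDT = renderDataTable({
  DT::datatable(isolate(val$mat), editable = TRUE)
}, server = FALSE)

observeEvent(input$testDT_cell_edit, {
  info <- input$testDT_cell_edit
  if (info$col == 1) {
    val$mat$V1[info$row] <- DT::coerceValue(info$value, val$mat$V1[info$row])
    # no replaceData()/proxy here: that path needs server-side processing
  }
})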

Shiny: Dynamically load .RData file

I am using Shiny as an interface for viewing tables stored locally in a series of .RData files; however, I am unable to get the table to render.
My server code is like this:
output$table1 <- renderTable({
  load(paste0(input$one, "/", input$two, ".RData"))
  myData
})
On the ui side I am simply displaying the table in the main panel.
This other SO question suggests that the issue is that the environment that the data is loaded into goes away so the data isn't there to display. They suggest creating a global file and loading the .RData file in there, but I don't believe I will be able to load the data dynamically that way. Any guidance on how to use .RData files effectively within shiny would be appreciated.
I think you just need to move the load statement outside of the renderTable function. So you should have
load(paste0(input$one, "/", input$two, ".RData"))
output$table1 <- renderTable({ myData })
If you look at the help file for renderTable, the first argument is
expr: An expression that returns an R object that can be used with xtable.
load does not return this.
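Note that input values like input$one can only be read inside a reactive context, so a hedged variant that keeps the dynamic loading inside renderTable is to load into a local environment and return the object from it (this assumes the saved object is named myData, as in the question):

output$table1 <- renderTable({
  # load the .RData into a throwaway environment and return the object in it,
  # so renderTable receives a table-like value rather than load()'s return value
  e <- new.env()
  load(paste0(input$one, "/", input$two, ".RData"), envir = e)
  e$myData
})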
I got around this by "tricking" R Shiny. I make a BOGUS textOutput and, in renderText, call an external function that, based on the selected input, points a single environment called "e" at one of the already globally loaded environments. Note that with this approach you MUST manually load all the RData files into environments in global.R first. Assuming your data isn't that large, and that you don't have a million RData files, this seems like a reasonable hack.
By creating a loadEnvFn() like the one below, which returns the string passed as input$datasetNumber, you can avoid the scoping issues that occur when you put code in a reactive({}) context. I tried a TON of things, but they all required reactive contexts. This way, I could change the objects loaded in e without having to wrap a reactive({}) scope around my Shiny server code.
# Global environment, pre-loaded before the Shiny server starts
e = new.env()

dataset1 = new.env()
load("dataset1.RData", envir = dataset1)
dataset2 = new.env()
load("dataset2.RData", envir = dataset2)
dataset3 = new.env()
load("dataset3.RData", envir = dataset3)

ui = fluidPage(
  # Application title
  titlePanel(title = "View Datasets"),
  sidebarLayout(
    # Sidebar panel
    sidebarPanel(width = 3,
                 radioButtons(inputId = "datasetNumber",
                              label = "From which dataset do you want to display sample data?",
                              choices = list("Dataset1", "Dataset2", "Dataset3"),
                              selected = "Dataset2")
    ),
    # Main panel
    mainPanel(width = 9,
              textOutput("dataset"),  # Bogus textOutput
              textOutput("numInEnv")
    )
  )
)

loadEnvFn = function(input) {
  if (input$datasetNumber == "Dataset1") {
    .GlobalEnv$e = dataset1
  } else if (input$datasetNumber == "Dataset2") {
    .GlobalEnv$e = dataset2
  } else {
    .GlobalEnv$e = dataset3
  }
  # Bogus return string unrelated to the real purpose of loadEnvFn
  return(input$datasetNumber)
}

server = function(input, output, session) {
  output$dataset = renderText(sprintf("Dataset chosen was %s", loadEnvFn(input))) # Bogus output
  output$numInEnv = renderText(sprintf("# objects in environment 'e': %d", length(ls(e))))
}

shinyApp(ui, server)
