How to force stop doParallel %dopar% calculations in shiny app - r

My shiny app does clustering using %dopar% from doParallel library. There is one thing that bothers me: if an operation takes, for example, 10 minutes to calculate, but user closes the window of an app before it's done - calculations will continue to happen until they're finished.
What I'm trying to do is to force stop them on session end. I tried to do something like this
session$onSessionEnded(function() {
stopCluster(clu)
print("stopped")
})
to stop calculations on window closure, but it is evaluated after the calculations are done. Here is a test app i'm trying to get it to work on:
server.r
library(shiny)
library(doParallel)
shinyServer(function(input, output, session) {
wd <- getwd()
rv <- reactiveValues()
clu <- makeCluster(2)
registerDoParallel(clu)
observeEvent(input$loadDataButton, {
rv$listDF <- readRDS(paste0(wd, "/iris15k.RData"))
})
observeEvent(input$clusterButton, {
if (input$clusterButton >= 1) {
df <- rv$listDF
rv$resultTime <- system.time(
rv$r <- foreach(icount(100)) %dopar% {
table <- kmeans(as.data.frame(df), centers = 3)
}
)
}
})
observeEvent(rv$listDF, {
output$done <- renderText("Done loading!")
})
observeEvent(rv$resultTime, {
output$time <- renderText(c("Done caltulating, time:", rv$resultTime))
print(rv$resultTime)
})
session$onSessionEnded(function() {
stopCluster(clu)
print("stopped")
})
})
ui.r
library(shiny)
shinyUI(fluidPage(
actionButton(
inputId = "loadDataButton",
label = "load data"
),
actionButton(
inputId = "clusterButton",
label = "Do clustering"
),
textOutput("done"),
textOutput("time")
))
iris15k.RData is a simple iris dataset with replicated rows to make the operation longer.
There is a way to write a bash script that will do the job and kill the processes in very straight way, but I'd rather do it from R side as I know little of it and it seems to be right to get it work withing an app code.
Any suggestions?

Related

Shiny show a progress bar when reading in global data before the app starts

I have an app where a large dataset is read in before the app starts. The app has separate ui and server files. So the UI is visible straightaway and the div for output plot remains empty. It sort of hangs for about 2-3 seconds as the data is read in. And then the plot is displayed. The rest of the app is fast enough and requires no progress bars. I would like to show some progress/indication that the data is being read in rather than just "freezing" for few seconds.
Here is a dummy example. The data is only read in once before the app loads. The data is used in ui as well as server.
library(shiny)
# read big file
#saveRDS(diamonds,"diamonds.Rds")
x <- readRDS("diamonds.Rds")
ui = fluidPage(
titlePanel("Progress bar test"),
selectInput("in_opts","Select",choices=colnames(x),selected=1),
verbatimTextOutput("out_txt")
)
server=function(input,output,session) {
output$out_txt <- renderPrint({
Sys.sleep(3)
head(x)
})
}
shinyApp(ui,server)
I have tried using shinycssloaders. It generally works. It works well in this dummy example. But, it doesn't work for the "reading in file" part since that is outside the withSpinner() function.
library(shiny)
library(shinycssloaders)
# read big file
#saveRDS(diamonds,"diamonds.Rds")
x <- readRDS("diamonds.Rds")
ui = fluidPage(
titlePanel("Progress bar test"),
selectInput("in_opts","Select",choices=colnames(x),selected=1),
shinycssloaders::withSpinner(verbatimTextOutput("out_txt"))
)
server=function(input,output,session) {
output$out_txt <- renderPrint({
Sys.sleep(3)
head(x)
})
}
shinyApp(ui,server)
Is there a way to display progress/indicator for the readRDS() step?
As mentioned in the comments we can run the long computation in a separate process. This can e.g. be done via library(future).
Once the future_promise returns its result, it is assigned to a global reactiveVal - therefore all shiny sessions started later don't have to wait.
library(shiny)
library(promises)
library(future)
library(datasets)
library(shinycssloaders)
plan(multisession)
globalrv <- reactiveVal(NULL)
future_promise({
Sys.sleep(10) # your long running function
iris
}) %...>%
globalrv() %...!% # assign result to globalrv
(function(e) {
globalrv(NULL) # error handling needed?
warning(e)
})
ui = fluidPage(
titlePanel("Progress bar test"),
conditionalPanel("output.trigger == null", shinycssloaders::withSpinner(uiOutput("dummy"))),
conditionalPanel("output.trigger != null", verbatimTextOutput("out_txt"))
)
server = function(input, output, session) {
output$trigger <- eventReactive(globalrv(), {globalrv()})
outputOptions(output, "trigger", suspendWhenHidden = FALSE)
output$out_txt <- renderPrint({
req(globalrv())
head(globalrv())
})
}
shinyApp(ui,server)
Using renderUI instead:
library(shiny)
library(promises)
library(future)
library(datasets)
library(shinycssloaders)
plan(multisession)
globalrv <- reactiveVal(NULL)
future_promise({
Sys.sleep(10) # your long running function
iris
}) %...>%
globalrv() %...!% # assign result to globalrv
(function(e) {
globalrv(NULL) # error handling needed?
warning(e)
})
ui = fluidPage(
titlePanel("Progress bar test"),
uiOutput("spinner"),
verbatimTextOutput("out_txt")
)
server = function(input, output, session) {
output$spinner <- renderUI({
if(is.null(globalrv())){
shinycssloaders::withSpinner(uiOutput("dummy"))
} else {
NULL
}
})
output$out_txt <- renderPrint({
req(globalrv())
head(globalrv())
})
}
shinyApp(ui,server)

Updating progress indicator in sequence with parallel processing

I've got a nested call to future_lapply that processes in parallel. It works great and fast!
The problem is that reporting on progress in sequence within Shiny is proving difficult. The following Shiny app has a progress indicator that instead of incrementing like 1,2,3, etc. increments out of order like 1, 5, 2, etc.
Any ideas on how to report progress in order (even if the processing is down out of order)? A simple counter increment doesn't work (increment <- 0; increment <- increment + 1) because of scoping rules.
Switching to plan(multisession) increments in order as expected, but that removes the parallel processing.
library(shiny)
library(future)
library(future.apply)
library(ipc)
library(callr)
plan(list(callr, tweak(multisession, workers = availableCores() -1)))
ui <- fluidPage(
actionButton("run","Run"),
tableOutput("dataset")
)
server <- function(input, output, session) {
dat <- reactiveVal()
observeEvent(input$run, {
progress <- AsyncProgress$new(session, min=1, max=15)
future({
future.apply::future_lapply(1:15, function(i){
progress$set(value = i, message = paste("Iteration:",i))
Sys.sleep(0.5)
})
progress$close()
cars
}) %...>% dat
NULL
})
output$dataset <- renderTable({
req(dat())
})
}
shinyApp(ui, server)
}

How to run parallel SQL within a reactive in Shiny

Background: I have a Shiny app that generates one dataframe by merging the results of several SQL queries from different databases into one dataframe.
Each SQL query is held within a function and all of the functions run once a button is pressed.
The problem: Two of the queries take a while to run but because Shiny is the way it is, they run in serial whereas I suppose it would be a lot faster to get them to run in parallel.
Attempts: I tried firstly to get each query to run asynchronously but apparently that is really for scaling an application between users rather than within session asynchronous so I've ditched that approach and tried to use parallel processing with doParallel.
I started by using the parallel processing within each function but of course that is pointless because the parallel processing doesn't chop up a SQL query and run each separately, so then I thought I should add all the functions to a list and run them with a parallel processing.
library(shiny)
library(doParallel)
library(dplyr)
PendingApptdf<-reactiveValues(data = data.frame()) #dataframe for the pending data
myResdf<-reactiveValues(data = data.frame()) #Results dataframe
LatestResultsdf<-reactiveValues(data = data.frame()) #Final dataframe
ui.R
ui <- fluidPage(
titlePanel("Parallel SQL attempt"),
sidebarLayout(
sidebarPanel(
),
mainPanel(
actionBttn("LatestResults_btn",label = "Latest results",size="sm"),
DT::dataTableOutput("LatestResultsTable")
)
)
)
server.R
server <- function(input, output) {
#SQL number one:
PatientPendingAppt<-function(HospNums){
ch_PAD_PendingAppt<-odbcDriverConnect("DRIVER={*******};*******;Database=******")
sql_PAD_PendingAppt<-paste0("SELECT * from database",HospNums)
resultsetPAD_PendingAppt <- sqlQuery(ch_PAD_PendingAppt, sql_PAD_PendingAppt)
resultsetPAD_PendingAppt<-unique(resultsetPAD_PendingAppt)
odbcClose(ch_PAD_PendingAppt)
return(resultsetPAD_PendingAppt)
}
#SQL number two:
myRes<-function(HospNums){
ch_PAD_myRes<-odbcDriverConnect("DRIVER={*******};*******;Database=******")
sql_PAD_myRes<-paste0("SELECT * from database2",HospNums)
resultsetmyRes <- sqlQuery(ch_PAD_myRes, sql_PAD_myRes)
resultsetPAD_myRes<-unique(resultsetPAD_myRes)
odbcClose(ch_PAD_myRes)
return(resultsetPAD_myRes)
}
PendingAppt<-reactive({
myNumber<- '123456'
Pending<-PatientPendingAppt(myNumber)
PendingApptdf$data<-Pending
})
myResOP<-reactive({
myNumber<- '123456'
myResOP1<-myRes(myNumber)
myResdf$data<-myResOP1
})
####Latest results datatable####
output$LatestResultsTable = DT::renderDT({
outputPar <- foreach(i = 1:2, .packages="RODBC")%dopar%{
PendingAppt()
mergedEnoting<-mergedEnoting()
#merge the results here with rbind
}
LatestResultsdf<-outputPar
datatable(LatestResultsdf$data,
options = list(
dom = 't'),
)
})
}
# Run the application
shinyApp(ui = ui, server = server)
But this doesn't seem to work either and it tells me that it can't find the function PendingAppt.
Why is this? Is it because it is a reactive being run inside of the parallel processing. If so then how to rectify?
How can I run the two SQL functions so that they run concurrently rather than in serial?
You can use futures combined with promises to run the two queries in parallel, see scaling shiny with async.
Under SQL Server I used a WAIT FOR DELAY to simulate two long queries:
library(DBI)
library(shiny)
library(future)
library(promises)
plan(multisession,workers = 2)
# Define UI for application that draws a histogram
ui <- fluidPage(
# Application title
titlePanel("Async Test"),
# Sidebar with a slider input for number of bins
sidebarLayout(
sidebarPanel(
selectInput("choice",
"Choice",
choice = list('choice 1','choice 2'))
),
# Show a plot of the generated distribution
mainPanel(
textOutput("data1"),
textOutput("data2")
)
)
)
# Define server logic required to draw a histogram
server <- function(input, output,session) {
cat(nbrOfWorkers(),' workers launched \n')
# first query depending on Choice
data1 <-reactive({
input$choice
start <- Sys.time()
future({
conn <- dbConnect(***)
dbGetQuery(conn,("WAITFOR DELAY '00:00:05'"))
dbDisconnect(conn)
paste("Result from Task 1 : done in ", round(difftime(Sys.time(),start)), 's')
}) %...>% print() })
# second query depending on Choice
data2 <-reactive({
input$choice
start <- Sys.time()
future({
conn <- Connect(***)
dbGetQuery(conn,("WAITFOR DELAY '00:00:10'"))
dbDisconnect(conn)
paste("Result from Task 2 : done in ", round(difftime(Sys.time(),start)), 's')
}) %...>% print() })
output$data1 <- renderText({ data1()})
output$data2 <- renderText({ data2()})
}
# Run the application
shinyApp(ui = ui, server = server)
You can verify that the result of a 5s query and a 10s query is displayed in 10s and not in 15s :

How to update reactive output inside a for loop in R shiny

I'm new to Shiny and have hit a problem I can't find an answer for. Basically, I have a Shiny app that does some long calculations in a loop and I want it to output a "progress report" every few iterations. However, even though I reassign my reactive variable within the loop, the output doesn't update until the loop (or entire function?) has finished.
Here is a simplified test case of what I mean:
library(shiny)
# Basic interface
ui <- fluidPage(
actionButton("run", "Run"),
textOutput("textbox")
)
# Basic server with loop
server <- function(input, output) {
textvals=reactiveValues(a=0)
observeEvent(input$run, {
for(i in 1:10){
textvals$a=i # Expect output to update here, but doesn't
Sys.sleep(0.1) # Slight pause so isn't instantaneous
}
})
output$textbox <- renderText({
textvals$a
})
}
# Run the application
shinyApp(ui = ui, server = server)
What I would expect is that the display would update 1, 2, 3, ... 10 as the loop executes. Instead, it just jumps straight from 0 to 10. How can I force an update partway through the loop?
Thank you.
With using invalidateLater you can get something closed to what you want. Not the shortest way to do it I think, but it may help you to find a better solution.
library(shiny)
# Basic interface
ui <- fluidPage(
actionButton("run", "Run"),
textOutput("textbox")
)
# Basic server with loop
server <- function(input, output, session) {
textvals <- reactiveVal(0)
active <- reactiveVal(FALSE)
output$textbox <- renderText({
textvals()
})
observe({
invalidateLater(1000, session)
isolate({
if (active()) {
textvals(textvals() + 1)
if (textvals() > 9) {
active(FALSE)
}
}
})
})
observeEvent(input$run, {
active(TRUE)
})
}
# Run the application
shinyApp(ui = ui, server = server)
By the way, reactive and for loops don't really get on well. This may help : https://gist.github.com/bborgesr/e1ce7305f914f9ca762c69509dda632e

Calling a shiny JavaScript Callback from within a future

In shiny, it is possible to call client-side callbacks written in javascript from the server's logic. Say in ui.R you have some JavaScript including a function called setText:
tags$script('
Shiny.addCustomMessageHandler("setText", function(text) {
document.getElementById("output").innerHTML = text;
})
')
then in your server.R you can call session$sendCustomMessage(type='foo', 'foo').
Suppose I have a long-running function which returns some data to plot. If I do this normally, the R thread is busy while running this function, and so can't handle additional requests in this time. It would be really useful to be able to run this function using the futures package, so that it runs asynchronously to the code, and call the callback asyncronously. However, when I tried this is just didn't seem to work.
Sorry if this isn't very clear. As a simple example, the following should work until you uncomment the two lines trying to invoke future in server.R. Once those lines are uncommented, the callback never gets called. Obviously it's not actually useful in the context of this example, but I think it would be very useful in general.
ui.R:
library(shiny)
shinyUI(fluidPage(
sidebarLayout(
sidebarPanel(
sliderInput("max",
"Max random number:",
min = 1,
max = 50,
value = 30)
),
mainPanel(
verbatimTextOutput('output'),
plotOutput('plot')
)
),
tags$script('
Shiny.addCustomMessageHandler("setText", function(text) {
document.getElementById("output").innerHTML = text;
})
')
))
server.R:
library(shiny)
library(future)
plan(multiprocess)
shinyServer(function(input, output, session) {
output$plot <- reactive({
max <- input$max
#f <- future({
session$sendCustomMessage(type='setText', 'Please wait')
Sys.sleep(3)
x <- runif(1,0,max)
session$sendCustomMessage(type='setText', paste('Your random number is', x))
return(NULL)
#})
})
})
Here is a solution on how you could use the future package in a shiny app.
It is possible to have multiple sessions with no session blocking another session when running a computationally intensive task or waiting for a sql query to be finished. I suggest to open two sessions (just open http://127.0.0.1:14072/ in two tabs) and play with the buttons to test the functionality.
run_app.R:
library(shiny)
library(future)
library(shinyjs)
runApp(host = "127.0.0.1", port = 14072, launch.browser = TRUE)
ui.R:
ui <- fluidPage(
useShinyjs(),
textOutput("existsFutureData"),
numericInput("duration", "Duration", value = 5, min = 0),
actionButton("start_proc", h5("get data")),
actionButton("start_proc_future", h5("get data using future")),
checkboxInput("checkbox_syssleep", label = "Use Sys.sleep", value = FALSE),
h5('Table data'),
dataTableOutput('tableData'),
h5('Table future data'),
dataTableOutput('tableFutureData')
)
server.R:
plan(multiprocess)
fakeDataProcessing <- function(duration, sys_sleep = FALSE) {
if(sys_sleep) {
Sys.sleep(duration)
} else {
current_time <- Sys.time()
while (current_time + duration > Sys.time()) { }
}
return(data.frame(test = Sys.time()))
}
#fakeDataProcessing(5)
############################ SERVER ############################
server <- function(input, output, session) {
values <- reactiveValues(runFutureData = FALSE, futureDataLoaded = 0L)
future.env <- new.env()
output$existsFutureData <- renderText({ paste0("exists(futureData): ", exists("futureData", envir = future.env)," | futureDataLoaded: ", values$futureDataLoaded) })
get_data <- reactive({
if (input$start_proc > 0) {
shinyjs::disable("start_proc")
isolate({ data <- fakeDataProcessing(input$duration) })
shinyjs::enable("start_proc")
data
}
})
observeEvent(input$start_proc_future, {
shinyjs::disable("start_proc_future")
duration <- input$duration # This variable needs to be created for use in future object. When using fakeDataProcessing(input$duration) an error occurs: 'Warning: Error in : Operation not allowed without an active reactive context.'
checkbox_syssleep <- input$checkbox_syssleep
future.env$futureData %<-% fakeDataProcessing(duration, sys_sleep = checkbox_syssleep)
future.env$futureDataObj <- futureOf(future.env$futureData)
values$runFutureData <- TRUE
check_if_future_data_is_loaded$resume()
},
ignoreNULL = TRUE,
ignoreInit = TRUE
)
check_if_future_data_is_loaded <- observe({
invalidateLater(1000)
if (resolved(future.env$futureDataObj)) {
check_if_future_data_is_loaded$suspend()
values$futureDataLoaded <- values$futureDataLoaded + 1L
values$runFutureData <- FALSE
shinyjs::enable("start_proc_future")
}
}, suspended = TRUE)
get_futureData <- reactive({ if(values$futureDataLoaded > 0) future.env$futureData })
output$tableData <- renderDataTable(get_data())
output$tableFutureData <- renderDataTable(get_futureData())
session$onSessionEnded(function() {
check_if_future_data_is_loaded$suspend()
})
}
I retooled André le Blond's excellent answer to and made a gist showing a generic asynchronous task processor which can be used either by itself or with Shiny: FutureTaskProcessor.R
Note it contains two files: FutureProcessor.R which is the stand alone asynchronous task handler and app.R which is a Shiny App showing use of the async handler within Shiny.
One admittedly complicated workaround to the single-threaded nature of R within Shiny apps is to do the following:
Splinter off an external R process (run another R script located in
the Shiny app directory, or any directory accessible from within the
Shiny session) from within R (I've tried this splintering before,
and it works).
Configure that script to output its results to a temp directory (assuming you're running Shiny on a Unix-based system) and give the output file a unique filename (preferably named within the namespace of the current session (i.e. "/tmp/[SHINY SESSION HASH ID]_example_output_file.RData".
Use Shiny's invalidateLater() function to check for the presence of that output file.
Load the output file into the Shiny session workspace.
Finally, trash collect by deleting the generated output file after loading.
I hope this helps.

Resources