R Shiny - cache big dataframe


I'm quite new to Shiny, so my apologies if my question is an easy one. I searched Google and Stack Overflow but couldn't find a simple and helpful answer so far.
My goal/issue: I'm coding a Shiny page that displays a table with hundreds of thousands of rows.
Data is sourced from different databases, manipulated, cleaned, and displayed to all users upon request.
Problem 1: loading all the data takes the script almost 5 minutes.
Problem 2: if user1 requests this data at 8:00am and user2 requests the same data at 8:05am, two separate queries are launched and two separate copies are held in memory to show exactly the same data to two different users.
So the question is: should I use a cache system to improve this process?
If not, what else should I use?
I found a lot of official Shiny documentation on caching plots but nothing related to caching data (which I found quite surprising).
Other useful information: the cached data should be deleted every evening around 10pm, since new data will be available the next day / early morning.
Code:
library(shiny)
library(shinydashboard)
ui <- dashboardPage( # https://rstudio.github.io/shinydashboard/structure.html
  title = "Dashboard",
  dashboardHeader(title = "Angelo's Board"),
  dashboardSidebar( # everything displayed on the left-hand side goes in here
    includeCSS("www/styles.css"),
    sidebarMenu(
      menuItem('menu 1', tabName = "menu1", icon = icon("th"),
        menuItem('Data 1', tabName = 'tab_data1'))
    )),
  dashboardBody(
    tabItems(
      tabItem(tabName = 'tab_data1')),
    h3("Page with big table"),
    fluidRow(dataTableOutput("main_table"))
  ))
server <- function(input, output, session) {
  # The output id must match the id used in dataTableOutput("main_table")
  output$main_table <- renderDataTable({
    data.frame(names = c("Mark","George","Mary"), age = c(30,40,35))
  })
}
cat("\nLaunching 'shinyApp' ....")
shinyApp(ui, server)
Resources I used to check for potential solution:
How to cache data in shiny server? but apparently I cannot use Jason Bryer package
https://shiny.rstudio.com/reference/shiny/1.2.0/memoryCache.html but I have no idea of how to use this code applied to my example
https://shiny.rstudio.com/articles/plot-caching.html is mainly focused on plot caching
Any help would be much appreciated. Thanks

I would break out the bulk of your ETL processes into a separate R script and set that script to run on a cron schedule. You can then have this script write out the processed dataframe(s) to a .feather file. Then have your Shiny app load the feather file(s); feather is optimized for reading, so this should be fast.
For example, take the necessary libraries and code out of your server.R (or app.R) file, and create a new R script called query.R. That script performs all the ETL operations and finally writes out your data to a .feather file (requires the feather package). Then create a crontab entry to run that script as often as needed.
Your server.R script then just needs to read in that feather file when the app loads, and you should see a significant performance improvement. In addition, you can have the query.R script run during off hours so that performance on the Linux box isn't negatively impacted.
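A minimal sketch of that split, assuming the feather package is installed. The cache path, the crontab schedule, and the run_etl() function with its toy data are hypothetical placeholders for the real queries and cleaning steps.

```r
# --- query.R (run by cron) ------------------------------------------------
# Hypothetical crontab entry (schedule and path are assumptions):
#   0 5 * * * Rscript /path/to/query.R

# Stand-in for the slow ETL step; replace with the real database queries
# and cleaning code.
run_etl <- function() {
  data.frame(names = c("Mark", "George", "Mary"), age = c(30, 40, 35))
}

# Assumed location of the cache file; both scripts must agree on it.
cache_path <- file.path(tempdir(), "main_table.feather")
feather::write_feather(run_etl(), cache_path)

# --- app.R / server.R -----------------------------------------------------
# Read once, outside the server function, so that every session served by
# this R process shares the same in-memory copy.
main_df <- as.data.frame(feather::read_feather(cache_path))

server <- function(input, output, session) {
  output$main_table <- shiny::renderDataTable(main_df)
}
```

Because main_df is created outside the server function, it is read when the app process starts rather than once per user session.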

Another option: put this data frame in global.R and change /etc/shiny-server/shiny-server.conf by adding «app_idle_timeout 0» after «location / {». This disables application idle timeouts in Shiny Server, so the data loaded by global.R stays in RAM for all users.
To spare the first user the long data load, you can add «@reboot wget -O index.html localhost:3838» to the crontab on your server, so that after every reboot global.R is loaded into memory automatically.
Also, about pre-cache organisation you can read here.
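One way to combine the global.R approach with the 10pm refresh from the question is a small in-memory cache that reloads once the daily cutoff has passed. This is only a sketch: make_daily_cache() and slow_load() are hypothetical names, not part of Shiny, and the cache is shared only among sessions served by the same R process.

```r
# Hypothetical helper: wraps a slow loader so that its result is kept in
# memory and reloaded only after the daily cutoff hour has passed.
make_daily_cache <- function(loader, cutoff_hour = 22) {
  cached <- NULL
  loaded_at <- NULL
  function(now = Sys.time()) {
    # Most recent cutoff: today at cutoff_hour, or yesterday's if that
    # time has not been reached yet (local time zone).
    cutoff <- as.POSIXct(sprintf("%s %02d:00:00",
                                 format(now, "%Y-%m-%d"), cutoff_hour))
    if (cutoff > now) cutoff <- cutoff - 24 * 60 * 60
    if (is.null(cached) || loaded_at < cutoff) {
      cached <<- loader()
      loaded_at <<- now
    }
    cached
  }
}

# In global.R: stand-in for the five-minute query.
slow_load <- function() {
  data.frame(names = c("Mark", "George", "Mary"), age = c(30, 40, 35))
}
get_main_df <- make_daily_cache(slow_load)

# In the server function one would then call, for example:
#   output$main_table <- renderDataTable(get_main_df())
```

With this design the first request after the cutoff pays the loading cost and later requests reuse the cached copy.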

Related

R Shiny Dashboard - Loading Scripts using source('file.R')

Introduction
I have created an R shiny dashboard app that is quickly getting quite complex. I have over 1300 lines of code all sitting in app.R and it works. I'm using RStudio.
My application has a sidebar and tabs, and rather than using modules I dynamically grab the sidebar and tab IDs to generate a unique identifier when plotting graphs etc.
I'm trying to reorganise it to be more manageable and split it into tasks for other programmers but I'm running into errors.
Working Code
My original code has a number of library statements and sets the working directory to the code location.
rm(list = ls())
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
getwd()
I then have a range of functions that sit outside the ui/server functions so are only loaded once (not reactive). These are all called from within the server by setting the reactive values and calling the functions from within something like a renderPlot. Some of them are nested, so a function in server calls a function just in regular app.R which in turn calls another one. Eg.
# Start of month calculation
som <- function(x) {
toReturn <- as.Date(format(x, "%Y-%m-01"))
return(toReturn)
}
start_fc <- function(){
fc_start_date <- som(today())
return(fc_start_date)
}
then in server something like this (code incomplete)
server <- function(input, output, session) {
RV <- reactiveValues()
observe({
RV$selection <- input[[input$sidebar]]
# cat("Selected:",RV$selection,"\r")
})
.......
cat(paste0("modelType: ",input[[paste0(RV$selection,"-modeltype")]]," \n"))
vline1 <- decimal_date(start_pred(input[[paste0(RV$selection,"-modeltype")]],input[[paste0(RV$selection,"-modelrange")]][1]))
vline2 <- decimal_date(start_fc())
.......
Problem Code
So now when I take all my functions and put them into different .R files I get errors indicating the functions haven't been loaded. If I load the source files by highlighting them and Alt-Enter running them so they are loaded into memory then click on Run App the code works. But if I rely on Run App to load those source files, and the functions within them, the functions can't be found.
source('./functionsGeneral.R')
source('./functionsQuote.R')
source('./functionsNewBusiness.R')
source('./ui.R')
source('./server.R')
shinyApp(ui, server)
where ui.R is
source('./header.R')
source('./sidebar.R')
source('./body.R')
source('./functionsUI.R')
ui <- dashboardPage(
header,
sidebar,
body
)
Finally the questions
In what order does R Shiny Dashboard run the code? Why does it fail when I put the exact same inline code into another file and reference it with source('./functions.R')? Does it not load into memory during a Shiny app session? What am I missing?
Any help on this would be greatly appreciated.
Thanks,
Travis

Ok I've discovered the easiest way is to create a subfolder called R and to place the preload code into that folder. From shiny version 1.5 all this code in the R folder is loaded first automatically.

reactiveFileReader in Shiny for .RData

My current workflow in a shiny application is to run a R script as a cron job periodically to pull various tables from multiple databases as well as download data from some APIs. These are then saved as a .Rdata file in a folder called data.
In my global.R file I load the data by using load("data/workingdata.Rdata"). This results in all the dataframes (about 30) loading into the environment. I know I can use the reactiveFileReader() function to refresh the data, but obviously it would have to be used in the server.R file because of an associated session with the function. Also, I am not sure if load is accepted as a readFunc in reactiveFileReader(). What should be the best strategy for the scenario here?

This example uses a reactiveVal object with observe and invalidateLater. The data is loaded into a new environment and assigned to the reactiveVal every 2 seconds.
library(shiny)
ui <- fluidPage(
  actionButton("generate", "Click to generate an Rdata file"),
  tableOutput("table")
)
server <- shinyServer(function(input, output, session) {
  ## Use reactiveVal with observe/invalidateLater to load Rdata
  data <- reactiveVal(value = NULL)
  observe({
    invalidateLater(2000, session)
    ## Nothing to load until the file has been generated
    if (file.exists("workingdata.Rdata")) {
      n <- new.env()
      print("load data")
      ## load() returns the names of the objects it restored
      obj_names <- load("workingdata.Rdata", envir = n)
      data(n[[obj_names[1]]])
    }
  })
  ## Click the button to generate a new random data frame and write to file
  observeEvent(input$generate, {
    sample_dataframe <- iris[sample(1:nrow(iris), 10, F),]
    save(sample_dataframe, file="workingdata.Rdata")
    rm(sample_dataframe)
  })
  ## Table output
  output$table <- renderTable({
    req(data())
    data()
  })
})
shinyApp(ui = ui, server = server)

A few thoughts on your workflow:
In the end, with your RData approach you are setting up another data source in parallel to your databases / APIs.
When working with files there is always some housekeeping overhead (e.g. has your .RData file been written completely when you read it?). In my eyes this is (partly) what DBMSs are made for: taking care of the housekeeping. Most of them have sophisticated solutions to ensure that you get what you query very fast, so why reinvent the wheel?
Instead of continuously creating your .RData files and polling data with the reactiveFileReader() function, you could directly query the DB for changes using reactivePoll (see this for an example using sqlite). If your queries are long running (which I guess is the reason for your workflow), you can wrap them in a future and run them asynchronously (see this post to get some inspiration).
Alternatively, many DBMSs provide something like materialized views to avoid long waiting times (assuming you have the necessary user privileges).
Of course, all of this is based on assumptions, since I don't know your ecosystem, but in my experience reducing interfaces means reducing sources of error.
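The reactivePoll suggestion could look roughly like the sketch below, assuming the DBI and RSQLite packages and an in-memory demo database. The table measurements and its columns are hypothetical; the idea is that a cheap check query runs on every poll and the expensive query runs only when the check value changes.

```r
# Cheap check: a single value that changes whenever the table changes.
# Table and column names are hypothetical.
last_update <- function(con) {
  DBI::dbGetQuery(con, "SELECT MAX(updated_at) AS ts FROM measurements")$ts
}

# Expensive query, re-run only when last_update() returns a new value.
fetch_data <- function(con) {
  DBI::dbGetQuery(con, "SELECT id, value FROM measurements ORDER BY id")
}

# Demo database in memory (assumes the RSQLite package).
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "measurements",
  data.frame(id = 1:3,
             value = c(0.5, 1.5, 2.5),
             updated_at = c("2024-01-01", "2024-01-02", "2024-01-03"),
             stringsAsFactors = FALSE))

server <- function(input, output, session) {
  # Poll the check query every 5 seconds; fetch only on change.
  db_data <- shiny::reactivePoll(
    intervalMillis = 5000, session = session,
    checkFunc = function() last_update(con),
    valueFunc = function() fetch_data(con)
  )
  output$table <- shiny::renderTable(db_data())
}
```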

You could use load("data/workingdata.Rdata") at the top of server.R. Then, anytime anyone starts a new Shiny session, the data would be the most recent. The possible downsides are that:
there could be a hiccup if the data is being written at the same time a new Shiny session is loading data.
data will be stale if a session is open just before and then after new data is available.
I imagine the first possible problem wouldn't arise enough to be a problem. The second possible problem is more likely to occur, but unless you are in a super critical situation, I can't see it being a substantial enough problem to worry about.
Does that work for you?
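The first downside (a session reading while the file is being written) can usually be avoided on the writing side. The sketch below assumes the cron script controls how the file is saved; save_atomically() is a hypothetical helper, and the rename step is atomic on POSIX file systems when both files are in the same directory.

```r
# Hypothetical helper: save objects to a temporary file in the same
# directory, then rename it over the target. A reader calling load() on
# `path` then sees either the old file or the new one, not a partial write.
save_atomically <- function(names, path, envir = parent.frame()) {
  force(envir)
  tmp <- tempfile(pattern = "partial_", tmpdir = dirname(path),
                  fileext = ".Rdata")
  save(list = names, file = tmp, envir = envir)
  if (!file.rename(tmp, path)) {
    unlink(tmp)
    stop("could not move ", tmp, " to ", path)
  }
  invisible(path)
}

# Usage in the cron script (the path here is a placeholder):
target <- file.path(tempdir(), "workingdata.Rdata")
sample_dataframe <- data.frame(a = 1:3)
save_atomically("sample_dataframe", path = target)
```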

How to convert a Shiny app consisting of multiple files into an easily shareable and reproducible Shiny example?

There are resources on how to create a Minimal, Complete, and Verifiable example in general on Stack Overflow, and on how to make a great R reproducible example. However, there are no similar guidelines for shiny questions, while adhering to certain standards makes it much more likely that quality answers will be given, and thus that your question will be resolved.
However, asking a good Shiny question can be difficult. shiny apps are often large and complex, use multiple data sources, and the code is often split over multiple files, making it difficult to share easily reproducible code with others. Even though a problem may be caused in server.R, the example is not reproducible without the contents of ui.R (and possibly other files like stylesheets or global.R). Copy-pasting the contents of all these files individually is cumbersome, and requires other users to recreate the same file structure to be able to reproduce the problem.
So; how to convert your shiny app into a good reproducible example?

Example data
Of course, all guidelines regarding sample data mentioned in the answer on the question “how to make a great R reproducible example” also hold when creating questions related to Shiny. To summarize: Make sure no additional files are needed to run your code. Use sample datasets like mtcars, or create some sample data with data.frame(). If your data is very complex and that complexity is really required to illustrate the issue, you could also use dput(). Avoid using functions like read.csv(), unless of course you have questions related to functions like fileInput.
Example code
Always reduce your code to the bare minimum to reproduce your error or unexpected behavior. This includes removing calls to additional .CSS files and .js files and removing unnecessary functions in the ui and the server.
Shiny apps often consist of two or three files (ui.R, server.R and possibly global.R), for example this demo application. However, it is preferable to post your code as a single script, so it can easily be run by others without having to manually create those files. This can easily be done by:
wrapping your ui with ui <- fluidPage(…),
the server with server <- function(input,output, session) {…},
and subsequently calling shinyApp(ui, server).
So a simple skeleton to start with could look as follows:
library(shiny)
ui <- fluidPage(
)
server <- function(input,output,session) {
}
shinyApp(ui, server)
Working Example
So, taking all the above into account, a good Minimal, Complete, and Verifiable example for a Shiny application could look as follows:
library(shiny)
df <- data.frame(id = letters[1:10], value = seq(1,10))
ui <- fluidPage(
sliderInput('nrow', 'Number of rows', min = 1, max = 10, value = 5),
dataTableOutput('my_table')
)
server <- function(input, output, session) {
output$my_table <- renderDataTable({
df[1:input$nrow,]
})
}
shinyApp(ui, server)
Adding CSS
There are multiple ways to add custom CSS to a Shiny application, as explained here. The preferred way to add CSS to a Shiny application in a reproducible example is to add the CSS in the code, rather than in a separate file. This can be done by adding a line in the ui of an application, for example as follows:
tags$head(tags$style(HTML('body {background-color: lightblue;}'))),

Reading an RData file into Shiny Application

I am working on a Shiny app that will read in a few RData files and show tables with the contents. These files are generated by scripts that eventually turn the data into a data frame. They are then saved using the save() function.
Within the shiny application I have three files:
ui.R, server.R, and global.R
I want the files to be read on an interval so they are updated when the files are updated, thus I am using reactiveFileReader().
I have followed a few of the instructions I have found online, but I keep getting an error "Error: missing value where TRUE/FALSE is needed". I have tried to simplify this so I am not using the reactiveFileReader() functionality and simply loading the file in the server.R (also tried in the global.R file). Again, the load() statement is reading in a data frame. I had this working at one point by loading in the file, then assigning the file to a variable and doing an "as.data.table", but that shouldn't matter, this should read in a data frame format just fine. I think this is a scoping issue, but I am not sure. Any help? My code is at:
http://pastebin.com/V01Uw0se
Thanks so much!

Here is a possible solution inspired by this post http://www.r-bloggers.com/safe-loading-of-rdata-files/. The Rdata file is loaded into a new environment, which ensures that it will not have unexpected side effects (overwriting existing variables etc.). When you click the button, a new random data frame is generated and saved to a file. The reactiveFileReader then reads the file into a new environment. Lastly, we access the first item in the new environment (assuming that the Rdata file contains only one variable, which is a data frame) and print it to a table.
library(shiny)
# This function, borrowed from http://www.r-bloggers.com/safe-loading-of-rdata-files/, loads the Rdata into a new environment to avoid side effects
LoadToEnvironment <- function(RData, env=new.env()) {
  load(RData, env)
  return(env)
}
ui <- shinyUI(fluidPage(
  titlePanel("Example"),
  sidebarLayout(
    sidebarPanel(
      actionButton("generate", "Click to generate an Rdata file")
    ),
    mainPanel(
      tableOutput("table")
    )
  )
))
server <- shinyServer(function(input, output, session) {
  # Click the button to generate a new random data frame and write to file
  observeEvent(input$generate, {
    sample_dataframe <- data.frame(a=runif(10), b=rnorm(10))
    save(sample_dataframe, file="test.Rdata")
    rm(sample_dataframe)
  })
  # Use a reactiveFileReader to read the file on change, and load the content into a new environment.
  # It is created once per session here, rather than on every render.
  env <- reactiveFileReader(1000, session, "test.Rdata", LoadToEnvironment)
  output$table <- renderTable({
    # Access the first item in the new environment, assuming that the Rdata contains only 1 item which is a data frame
    env()[[names(env())[1]]]
  })
})
shinyApp(ui = ui, server = server)

Ok - I figured out how to do what I need to. For my first issue, I wanted the look and feel of 'renderDataTable', but I wanted to pull in a data frame (renderDataTable / dataTableOutput does not allow this, it must be in a table format). In order to do this, I found a handy usage of ReportingTools (from Bioconductor) and how they do it. This allows you to use a data frame directly and still have the HTML table with the sorts, search, pagination, etc.. The info can be found here:
https://bioconductor.org/packages/release/bioc/html/ReportingTools.html
Now, for my second issue - updating the data and table regularly without restarting the app. This turned out to be simple, it just took me some time to figure it out, being new to Shiny. One thing to point out, to keep this example simple, I used renderTable rather than the solution above with the ReportingTools package. I just wanted to keep this example simple. The first thing I did was wrap all of my server.R code (within the shinyServer() function) in an observe({}). Then I used invalidateLater() to tell it to refresh every 5 seconds. Here is the code:
## server.R ##
library(shiny)
library(shinydashboard)
library(DT)
shinyServer(function(input, output, session) {
observe({
invalidateLater(5000,session)
output$PRI1LastPeriodTable <- renderTable({
prioirtyOneIncidentsLastPeriod <- updateILP()
})
})
})
Originally, for the renderTable() portion, I was just calling the object name of the loaded .Rdata file, but I wanted it to be read each time, so I created a function in my global.R file (this could have been in server.R) to load the file. That code is here:
updateILP <- function() {
load(file = "W:/Projects/R/Scripts/ITPOD/itpod/data/prioirtyOneIncidentsLastPeriod.RData", envir = .GlobalEnv)
return(prioirtyOneIncidentsLastPeriod)
}
That's it, nothing else goes in the global.R file. Your ui.R would be however you have it set up: call tableOutput, dataTableOutput, or whatever your rendering method is in the UI. So, what happens is that every 5 seconds the renderTable() code is re-run, which in turn invokes the function that actually reads the file. I tested this by making changes to the data file, and the Shiny app updated without any interaction from me. Works like a charm.
If this is inelegant or is not efficient, please let me know if it can be improved, this was the most straight-forward way I could figure this out. Thanks to everyone for the help and comments!
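One possible refinement, offered only as a sketch: let reactiveFileReader() watch the file, so it is re-read only when its modification time changes, and load it into a private environment instead of .GlobalEnv. read_rdata_object() is a hypothetical helper and the file path is a placeholder.

```r
# Hypothetical helper: load one object from an .RData file into a private
# environment and return it, leaving the global environment untouched.
read_rdata_object <- function(path, name = NULL) {
  env <- new.env()
  loaded <- load(path, envir = env)
  if (is.null(name)) name <- loaded[1]
  get(name, envir = env, inherits = FALSE)
}

server <- function(input, output, session) {
  # Checks the file's modification time every 5 seconds and calls
  # read_rdata_object() only when it has changed.
  incidents <- shiny::reactiveFileReader(
    intervalMillis = 5000, session = session,
    filePath = "data/prioirtyOneIncidentsLastPeriod.RData",
    readFunc = read_rdata_object
  )
  output$PRI1LastPeriodTable <- shiny::renderTable(incidents())
}
```

The render function is then defined once and depends on the reactive, so no enclosing observe() or invalidateLater() is needed.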

R Shiny app progress Indicator for loading data

Shiny is our internal BI tool. For our Shiny apps, we load the data before shinyServer runs:
load("afterProcessedData.RData")
# or dt = fread("afterProcessedData.csv")
shinyServer(function(input, output, session){ ...
However, some of the apps load big files that take up to 30s to load. Many users, when they open a page, don't know whether the page is broken, since it appears stuck while loading. They may close it or click filters, which may cause an error. In this case, a progress bar would be very helpful. I noticed that withProgress() may help, but it has to be inside reactive() or renderXx().
One thing I could do is wrap load() in reactive() inside the shinyServer(function(input, output, session){ but my concern is that it will slow down performance, and my users care a lot about responsiveness.
Any suggestions for this situation?
Edit: I guess there is not an easy way to do this. I have another thought. Maybe I can show a text on the screen saying 'the data is loading', but I have to make it disappear after the first table shows up. However, I don't know how to set up the condition. Below is my code showing the first table:
dashboardBody(
fluidRow(
tabBox(width = 12,
tabPanel("Summary",
dataTableOutput("data1")),
Thank you in advance!

Even though I am still interested in knowing how to add a progress bar for load(), I have implemented the alternative solution, which is good for now. It shows a text saying 'the data is loading...' on the page, which disappears after the first table shows up.
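# server.R: firstData is a reactive function that gets the data for the first table
output$firstTable = reactive({
  return(is.null(firstData()))
})
# Outputs that no UI element displays are suspended by default, so a
# conditionalPanel condition that reads one generally needs this setting
outputOptions(output, "firstTable", suspendWhenHidden = FALSE)
# ui.R
conditionalPanel(
  condition = "output.firstTable",
  box(width = 12,
    h1("The data is loading...")
  )
)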

To reference the intriguing note from @user5249203, withSpinner() looks to be a useful option for this functionality and is part of the shinycssloaders package. I have not used it myself, but it is definitely an intriguing package that happens to be on CRAN and to have some nice examples: https://andrewsali.shinyapps.io/example/
