Reactive Sankey Diagram in R Shiny with multiple dataframes as input - r

It's my first time creating a dashboard and I'm running into a problem I can't seem to solve. I have created a Sankey diagram and I want to be able to interactively change its contents by switching between different dataframes (in this example: level_1, level_2, level_3). I've only ever practiced this with a regular plot, where the input comes from a variable within a single dataframe, which is my starting point in this piece of code (e.g. I have a df$country, so I use input$country in my plot; then I can choose from different countries in the dashboard sidebar to change the contents of the plot). I have no idea how to do this when the input has to come from separate dataframes.
My code: (in app.R)
level_1 <- as.data.frame(matrix(sample(seq(0,40), 15, replace=T ), 3, 5))
level_2 <- as.data.frame(matrix(sample(seq(0,40), 20, replace=T ), 4, 5))
level_3 <- as.data.frame(matrix(sample(seq(0,40), 25, replace=T ), 5, 5))
levels <- list(level_1, level_2, level_3)
ui <- dashboardPage(
dashboardHeader(title = "title"),
dashboardSidebar(
selectInput("in_levels", "Levels", choices = levels)
),
dashboardBody(
fluidRow(sankeyNetworkOutput("widget1"))
)
)
server <- function(input, output) {
links <- input$in_levels %>%
rownames_to_column(var="source") %>%
gather(key="target", value="value", -1) %>%
filter(value != 0)
nodes <- data.frame(
name=c(as.character(links$source), as.character(links$target)) %>%
unique()
)
links$IDsource <- match(links$source, nodes$name)-1
links$IDtarget <- match(links$target, nodes$name)-1
output$widget1 <- renderSankeyNetwork({
sankeyNetwork(Links = links, Nodes = nodes,
Source = "IDsource", Target = "IDtarget",
Value = "value", NodeID = "name", fontSize = 14, nodeWidth = 60,
fontFamily = "Arial", iterations = 0, sinksRight=TRUE)
})
}
shinyApp(ui, server)
I thought maybe it would help to create a list(), levels, of all the dataframes, but that does not work. I get this error:
Error : Can't access reactive value 'in_levels' outside of reactive consumer.
i Do you need to wrap inside reactive() or observer()?
I've googled reactive() and observer() to try to find out what my next step should be, but I haven't found the solution yet. It would be much appreciated if someone could give me advice on how to proceed, changes to make or something to read to increase my understanding.
Thanks in advance!

If you want to access any input values in the server you need to use a reactive context. Shiny won't allow you to do otherwise, and even if it did, the server-side code would not re-run when an input value changes. Since you want both links and nodes to be dynamic, and nodes depends on links, a neat solution might be to compute both inside one reactive and return them in a list as follows:
server <- function(input, output) {
plot_data <- reactive({
# Perform all your computation inside this reactive!
links <- input$in_levels %>%
rownames_to_column(var="source") %>%
gather(key="target", value="value", -1) %>%
filter(value != 0)
nodes <- data.frame(
name = c(as.character(links$source), as.character(links$target)) %>%
unique()
)
links$IDsource <- match(links$source, nodes$name)-1
links$IDtarget <- match(links$target, nodes$name)-1
# Return the data in a list
list(links = links, nodes = nodes)
})
# Access the datasets by calling the reactive and then treating as a normal list
output$widget1 <- renderSankeyNetwork({
sankeyNetwork(Links = plot_data()$links, Nodes = plot_data()$nodes,
Source = "IDsource", Target = "IDtarget",
Value = "value", NodeID = "name", fontSize = 14, nodeWidth = 60,
fontFamily = "Arial", iterations = 0, sinksRight=TRUE)
})
}
This is untested as my current version of R doesn't support the networkD3 package.
The concept of reactivity is tricky, but chapter 3 of Mastering Shiny should be very illuminating if you're new to shiny.
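One more thing to watch for: selectInput() returns the selected choice as a string, not as a dataframe, so input$in_levels cannot be piped into rownames_to_column() directly. A minimal sketch of one way around this, assuming you name the list elements, offer names(levels) as the choices, and look the dataframe up by name. The helper build_sankey_data() and the variable selected are hypothetical names used only for illustration, and the Shiny parts are left out so it runs in a plain R session:

```r
library(dplyr)
library(tidyr)
library(tibble)

set.seed(42)
level_1 <- as.data.frame(matrix(sample(seq(0, 40), 15, replace = TRUE), 3, 5))
level_2 <- as.data.frame(matrix(sample(seq(0, 40), 20, replace = TRUE), 4, 5))
level_3 <- as.data.frame(matrix(sample(seq(0, 40), 25, replace = TRUE), 5, 5))

# A *named* list, so a dataframe can be looked up by the string an input returns
levels <- list(level_1 = level_1, level_2 = level_2, level_3 = level_3)

# Hypothetical helper: same computation as in the reactive above
build_sankey_data <- function(df) {
  links <- df %>%
    rownames_to_column(var = "source") %>%
    gather(key = "target", value = "value", -1) %>%
    filter(value != 0)
  nodes <- data.frame(
    name = unique(c(as.character(links$source), as.character(links$target)))
  )
  # networkD3 expects zero-based node indices
  links$IDsource <- match(links$source, nodes$name) - 1
  links$IDtarget <- match(links$target, nodes$name) - 1
  list(links = links, nodes = nodes)
}

selected <- "level_2"  # stands in for input$in_levels
plot_data <- build_sankey_data(levels[[selected]])
str(plot_data$links)
```

In the app this would correspond to selectInput("in_levels", "Levels", choices = names(levels)) in the UI and levels[[input$in_levels]] inside the reactive.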

Related

Dataframe issue when double clicking on VisNetwork node to run a function

I have a network diagram with a fairly large amount of nodes (~600), each node having some data, including an ID and its name.
I want to be able to run a very simple function when double-clicking on a specific node.
For that purpose, I have followed the instructions from this thread.
Using the code provided:
library(shiny)
library(visNetwork)
ui <- fluidPage(
visNetworkOutput('network')
)
server <- function(input, output, session) {
getDiagramPlot <- function(nodes, edges){
v <- visNetwork(
nodes,
edges
) %>%
visPhysics(stabilization = TRUE, enabled = TRUE) %>%
visOptions(highlightNearest = list(enabled = T, degree = 1, hover = F), autoResize = TRUE, collapse = FALSE) %>%
visEdges(color = list(highlight = "red")) %>% # The colour of the edge linking nodes
visLayout(improvedLayout = TRUE) %>%
visEdges(arrows = edges$arrows) %>%
visInteraction(multiselect = F) %>%
visEvents(doubleClick = "function(nodes) {
Shiny.onInputChange('current_node_id', nodes.nodes);
;}")
return(v)
}
testFunction <- function(node_id){
print(paste("The selected node ID is:", node_id))
}
nodes <- data.frame(id = 1:3, label = 1:3)
edges <- data.frame(from = c(1,2), to = c(1,3))
output$network <- renderVisNetwork(
getDiagramPlot(nodes, edges)
)
observeEvent(input$current_node_id,{
testFunction(input$current_node_id)
})
}
shinyApp(ui, server)
The code works well, but when I replace the simple example nodes and edges dataframes with my data (a much larger network), it no longer works: nothing gets printed in the console when I double-click on any node.
Would anyone know why the code is not running with my data?
Here are the adjustments that should be made to the code above:
load("NodesEdges.RData")
# nodes <- data.frame(id = 1:3, label = 1:3)
# edges <- data.frame(from = c(1,2), to = c(1,3))
Best wishes,
C.
I have tried:
adding more columns to the example nodes/edges (group, value, color, etc.), and the code still runs well.
restricting my larger nodes/edges dataframes to the "id", "label" and "from", "to" columns respectively (same as the example data), but the code still fails.
I wonder whether the problem comes from the size of the dataframe.

Select a variable based on two inputs in Rshiny

I have a dataset with variables for the ID of patients, different tests (MMT), and the treatment.
ID
MMT_II_week15_change
MMT_II_Week20_change
MMT_Tot_week15_change
MMT_Tot_Week20_change
Treatment
As you can see, we have two different tests (MMT_II_change and MMT_Tot_change), for two different timepoints (week15, week20).
What I want is the user to be able to select, first, the test, and then, the timepoint.
In reality, he would be picking just one of the variables, but in two different steps.
Something like:
**Select test:**
MMT_II
MMT_III
**Select timepoint:**
Week15
Week20
And after this, the variable selected would be:
e.g: MMT_II_Week20_change
I thought of using regex for this, but it seems quite complicated and I couldn't find a way of doing it.
Any help really appreciated, as I've been stuck with this for a while.
Would something like this work?
VAR = paste0(test,"_",timepoint,"_change")
...
# then later to use the variable...
.data[[VAR]]
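To make that concrete, here is a small sketch outside of Shiny, assuming dplyr is used for the data handling. The dataframe and its values are made up for illustration, and test and timepoint stand in for the two input values:

```r
library(dplyr)

# Made-up data with the same column naming scheme as in the question
df <- data.frame(
  ID = 1:3,
  MMT_II_Week20_change  = c(0.10, 0.20, 0.30),
  MMT_Tot_Week20_change = c(0.05, 0.15, 0.25)
)

test <- "MMT_II"       # stands in for the first input
timepoint <- "Week20"  # stands in for the second input

# Build the column name from the two selections
VAR <- paste0(test, "_", timepoint, "_change")

# .data[[VAR]] refers to the column whose name is stored in VAR
result <- df %>%
  select(ID, all_of(VAR)) %>%
  filter(.data[[VAR]] > 0.15)

print(result)
```

The string only matches if the capitalisation of the choices is the same as in the column names (week15 vs Week20 in the question), so it is worth checking VAR %in% names(df) first.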
You can wrap the check for changes in a single reactive expression in the server section of your code.
uptodateChoice <- reactive({
paste0(input$firstcontrol, "_", input$secondcontrol, "_change")
})
This function will be re-evaluated whenever the state of either of the two controls changes.
You can also add any validate(need(...)) checks inside the function if required or simply return() if some conditions are not satisfied.
You can access the string value calling uptodateChoice().
Another option is to pivot the data to a longer format, filter it, and then pivot back to wider. This way we can use the filter function directly on the test and week columns.
library(tidyverse)
library(shiny)
# create some data
df <- tibble(
ID = 1:5, MMT_II_week15_change = sample(seq(0.01, 0.2, 0.01), 5), MMT_II_week20_change = sample(seq(0.01, 0.2, 0.01), 5),
MMT_Tot_week15_change = sample(seq(0.01, 0.2, 0.01), 5), MMT_Tot_week20_change = sample(seq(0.01, 0.2, 0.01), 5)
)
# pivot longer, capturing MMT_* in the first column and the week number in the second.
df_pivot <- pivot_longer(df, -ID, names_to = c("test", "week"), values_to = "change", names_pattern = "(MMT_.*)_week(\\d+)_change$")
## APP
library(shiny)
ui <- fluidPage(
selectInput("test", "Select Test", choices = unique(df_pivot$test)),
selectInput("timepoint", "Select Timepoint", choices = NULL),
tableOutput("table")
)
server <- function(input, output, session) {
table <- reactiveVal(NULL)
observeEvent(input$test, {
choices <- filter(.data = df_pivot, test == input$test) %>%
{
unique(.$week)
}
updateSelectInput(inputId = "timepoint", choices = choices)
})
# this could also be a reactive.
observe({
table(filter(df_pivot, test == input$test, week == input$timepoint) %>%
pivot_wider(names_from = "test", values_from = "change"))
})
output$table <- renderTable({
table()
})
}
shinyApp(ui, server)

R shiny error when using reactive to plot

I have this data set:
Area <- c("Mexico", "USA", "USA", "Canada")
Type_of_participants <- c("Doctor", "Doctor", "Engineer", "Dancer")
Salary <- c("4000", "6000", "8000", "5000")
and I am trying to plot the salary based on the user input of Area (level1) and Type_of_participants (level2), but nothing appears. I changed aes to aes_string after looking it up here. Please help me find the error.
My Code
ui <- fluidPage(
titlePanel("Survey Results"),
sidebarLayout(
sidebarPanel(strong("Overview Plot"),
br(),
###1a.Area input
selectInput("selection","Var",
choices = c("Area","Type_of_participants"),
selected = "Area"),
uiOutput("choice_selection")
),
mainPanel(
plotOutput("Overview"))
server <- function(input, output) {
output$choice_selection <- renderUI({
checkboxGroupInput("baseinput","Detail",
unique(df[,input$selection])
)
})
dt1 <- reactive({
df %>%
group_by(input$selection,Type) %>%
filter (input$selection %in% input$baseinput) %>%
summarise(avg_salary_by_area = mean(Salary, na.rm = TRUE)) %>%
select(input$selection, Type, avg_Salary_by_area)
})
output$Overview <- renderPlot({
ggplot(data= dt1())+
aes(fill = Type)+
geom_bar(x=input$selection, y = avg_salary_by_area,stat="identity",
position = position_dodge())
The result is that I can select the input but cannot visualize the plot. The error is "unknown column Area" or "unknown Type of participants".
Please help me find the mistake.
Thank you
*** Update
Thanks to Mr Flick, I have fixed my code, but it still reports the error "Object area not found". Please advise. Thank you so much
dt1 <- reactive({
df[df[,input$selection] %in% input$baseinput,] %>%
group_by(input$selection,Type) %>%
summarise(avg_score_by_area = mean(Score, na.rm = TRUE))
})
output$Overview <- renderPlot({
ggplot(data= dt1(),aes_string(x= input$selection,
y = "avg_score_by_area",fill = "Type"))+
geom_bar(stat="identity",
position = position_dodge())
@Suzie - as mentioned above, it would help if you edited your question with your complete code as you currently have it.
A few things that would help:
Salary should be numeric in your df (or be converted with as.numeric before trying to take the mean)
Your reactive expression can use !!as.symbol with input$selection to filter by the string name from df
The plot can use aes_string for the variable names.
Edit:
For further explanation of !!as.symbol, first consider what input$selection returns. If you use browser() in your shiny code and inspect input$selection, you will see something like "Area" (it returns a string). But a string would not be appropriate in your filter - it is expecting a symbol that represents a column in your data frame. (A symbol is the name of an object like df or mtcars, etc.)
First, you want to convert a string to a symbol. You can do that either by using as.symbol() or rlang::sym(). You can try this out in your console. If you do as.symbol("df") it would return the symbol df. If you entered eval(as.symbol("df")) it would be the same as just entering df itself (and it would show the contents of your data frame).
The other issue is that tidyverse functions evaluate code expressions in a special context (searching for names within a data frame, for example). In this case dplyr knows that the name Area is in the context of df (one of the column names). This is a complicating factor since arguments are quoted. To address this, you need to unquote (replace a name with its value) with the bang-bang !! operator.
Putting both together you get !!as.symbol().
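As a quick illustration you can try in the console, here is a sketch of the same filter outside of Shiny. The variables selection and chosen are stand-ins for input$selection and input$baseinput, and the data is the example from the question with Salary made numeric:

```r
library(dplyr)

df <- data.frame(
  Area = c("Mexico", "USA", "USA", "Canada"),
  Type_of_participants = c("Doctor", "Doctor", "Engineer", "Dancer"),
  Salary = c(4000, 6000, 8000, 5000)
)

selection <- "Area"  # stands in for input$selection (a string)
chosen <- c("USA")   # stands in for input$baseinput

# as.symbol() turns the string "Area" into the symbol Area,
# and !! unquotes it so filter() looks up the column Area in df
result <- df %>%
  filter(!!as.symbol(selection) %in% chosen)

print(result)
```

Without the !!as.symbol() step, filter(selection %in% chosen) would compare the literal string "Area" with the chosen values instead of the column contents.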
Of note, varSelectInput is a newer shiny alternative to selectInput that can be considered for use in situations like these.
For more information:
shinymeta special topics
advanced R
library(tidyverse)
library(shiny)
Area <- c("Mexico", "USA", "USA", "Canada")
Type_of_participants <- c("Doctor", "Doctor", "Engineer", "Dancer")
Salary <- c(4000, 6000, 8000, 5000)
df <- data.frame(Area, Type_of_participants, Salary)
ui <- fluidPage(
titlePanel("Survey Results"),
sidebarLayout(
sidebarPanel(strong("Overview Plot"),
br(),
###1a.Area input
selectInput("selection","Var",
choices = c("Area","Type_of_participants"),
selected = "Area"),
uiOutput("choice_selection")
),
mainPanel(
plotOutput("Overview")
)
)
)
server <- function(input, output) {
output$choice_selection <- renderUI({
checkboxGroupInput("baseinput", "Detail", unique(df[,input$selection]))
})
dt1 <- reactive({
df %>%
group_by(Area, Type_of_participants) %>%
filter(!!as.symbol(input$selection) %in% input$baseinput) %>%
summarise(avg_salary_by_area = mean(Salary, na.rm = TRUE))
})
output$Overview <- renderPlot({
ggplot(data = dt1(), aes_string(x = input$selection, y = "avg_salary_by_area", fill = "Type_of_participants")) +
geom_bar(stat="identity", position = position_dodge())
})
}
shinyApp(ui, server)

Shiny R dynamic heatmap with ggplot. Scale and speed issues

I am attempting to use some public information to produce a heat-map of Canada for some labor statistics, using the spatial files from the census and data from Statistics Canada (these are large zip files that are not necessary to dig into). Below is a working example that illustrates both problems I am having: little relative change between regions (though there may be a big absolute change between periods), and the slow draw time. To get this to work, you need to download the .zip file from the census link and unzip the files to a data folder.
library(shiny)
library(maptools)
library(ggplot2)
require(reshape2)
library(tidyr)
library(maptools)
library(ggplot2)
library(RColorBrewer)
ui <- fluidPage(
titlePanel("heatmap"),
# Sidebar with a slider input for year of interest
sidebarLayout(
sidebarPanel(
sliderInput("year",h3("Select year or push play button"),
min = 2000, max = 2002, step = 1, value = 2000,
animate = TRUE)
),
# Output of the map
mainPanel(
plotOutput("unemployment")
)
)
)
server <- function(input, output) {
#to get the spacial data: from file in link above
provinces<-maptools::readShapeSpatial("data/gpr_000a11a_e.shp")
data.p<- ggplot2::fortify(provinces, region = "PRUID")
data.p<-data.p[which(data.p$id<60),]
#dataframe with same structure as statscan csv after processing
unem <- runif(10,min=0,max=100)
unem1 <- unem+runif(1,-10,10)
unem2 <- unem1+runif(1,-10,10)
unemployment <- c(unem,unem1,unem2)
#dataframe with same structure as statscan csv after processing
X <- data.frame("id" = c(10,11,12,13,24,35,46,47,48,59,
10,11,12,13,24,35,46,47,48,59,
10,11,12,13,24,35,46,47,48,59),
"Unemployment" = unemployment,
"year" = c(rep(2000,10),rep(2001,10),rep(2002,10))
)
plot.data<- reactive({
a<- X[which(X$year == input$year),]
return(merge(data.p,a,by = "id"))
})
output$unemployment <- renderPlot({
ggplot(plot.data(),
aes(x = long, y = lat,
group = group , fill =Unemployment)) +
geom_polygon() +
coord_equal()
})
}
# Run the application
shinyApp(ui = ui, server = server)
Any help with either of the issues would be greatly appreciated
For this type of animation it is much faster to use leaflet instead of ggplot as leaflet allows you to only re-render the polygons, not the entire map.
I use two other tricks to speed up the animation:
1. I join the data outside of the reactive. Within the reactive it is just a simple subset. Note, the join could be done outside of the app and read in as a pre-processed .rds file.
2. I simplify the polygons with the rmapshaper package to reduce drawing time by leaflet. Again, this could be done outside the app to reduce loading time at the start.
The animation could likely be even more seamless if you use circles (i.e. centroid of each province) instead of polygons. Circle size could vary with Unemployment value.
Note, you need the leaflet, sf, dplyr and rmapshaper packages for this approach.
library(shiny)
library(dplyr)
library(leaflet)
library(sf)
library(rmapshaper)
ui <- fluidPage(
titlePanel("heatmap"),
# Sidebar with a slider input for year of interest
sidebarLayout(
sidebarPanel(
sliderInput("year",h3("Select year or push play button"),
min = 2000, max = 2002, step = 1, value = 2000,
animate = TRUE)
),
# Output of the map
mainPanel(
leafletOutput("unemployment")
)
)
)
server <- function(input, output) {
#to get the spacial data: from file in link above
data.p <- sf::st_read("input/gpr_000a11a_e.shp") %>%
st_transform(4326) %>%
rmapshaper::ms_simplify()
data.p$PRUID <- as.character(data.p$PRUID) %>% as.numeric
data.p <- data.p[which(data.p$PRUID < 60),]
lng.center <- -99
lat.center <- 60
zoom.def <- 3
#dataframe with same structure as statscan csv after processing
unem <- runif(10,min=0,max=100)
unem1 <- unem+runif(1,-10,10)
unem2 <- unem1+runif(1,-10,10)
unemployment <- c(unem,unem1,unem2)
#dataframe with same structure as statscan csv after processing
X <- data.frame("id" = c(10,11,12,13,24,35,46,47,48,59,
10,11,12,13,24,35,46,47,48,59,
10,11,12,13,24,35,46,47,48,59),
"Unemployment" = unemployment,
"year" = c(rep(2000,10),rep(2001,10),rep(2002,10))
)
data <- left_join(data.p, X, by = c("PRUID"= "id"))
output$unemployment <- renderLeaflet({
leaflet(data = data.p) %>%
addProviderTiles("OpenStreetMap.Mapnik", options = providerTileOptions(opacity = 1), group = "Open Street Map") %>%
setView(lng = lng.center, lat = lat.center, zoom = zoom.def) %>%
addPolygons(group = 'base',
fillColor = 'transparent',
color = 'black',
weight = 1.5) %>%
addLegend(pal = pal(), values = X$Unemployment, opacity = 0.7, title = NULL,
position = "topright")
})
get_data <- reactive({
data[which(data$year == input$year),]
})
pal <- reactive({
colorNumeric("viridis", domain = X$Unemployment)
})
observe({
data <- get_data()
leafletProxy('unemployment', data = data) %>%
clearGroup('polygons') %>%
addPolygons(group = 'polygons',
fillColor = ~pal()(Unemployment),
fillOpacity = 0.9,
color = 'black',
weight = 1.5)
})
}
# Run the application
shinyApp(ui = ui, server = server)
I didn't find the drawing time to be unreasonably long at ~2-3 seconds, which for a 2.4mb shapefile seems about right. It takes just as long outside shiny as it does in the app on my machine, anyway.
To hold a constant colour gradient you can specify limits in scale_fill_gradient which will hold the same gradient despite changes to your maps:
output$unemployment <- renderPlot({
ggplot(plot.data(),
aes(x = long, y = lat,
group = group , fill =Unemployment)) +
geom_polygon() +
scale_fill_gradient(limits=c(0,100)) +
coord_equal()
})

Strategies for editing reactive functions in Shiny, 'data' must be of a vector type, was 'NULL' error

Goal: I am trying to create a shiny app that displays (1) the stressplot of a non-metric multidimensional scaling solution, (2) a ggplot of the point configuration, and (3) the results of clustering the point configuration by plotting the point configuration and superimposing chulls of the clustering.
Problem: The first two plots work without difficulty. Instead of a third plot, I get the error: 'data' must be of a vector type, was 'NULL'
I would appreciate any advice on how to resolve the specific problem, i.e. "error in array: 'data' must be of a vector type, was 'NULL'"
I would also appreciate any general advice on how to debug shiny. My only strategy is to treat the code like it isn't reactive code, and I suspect that this strategy isn't terribly effective.
My attempt to solve: I've searched the error on rseek and stack overflow and reviewed the posts. In some of the cases with similar errors the problem was that necessary data wasn't being calculated. I went through the code, treated it as normal (non-reactive) code, and used fake data. When I did this I didn't have any problem, so I assume it is something about the reactivity? Question 2 about how to debug is a reaction to the fact that trying to debug like the code wasn't dynamic didn't identify the problem.
Reproducible Example: I put together a shiny app that has randomly generated data. Before doing the testing I updated R and all the packages I use.
# Packages and options
library(shiny)
library(vegan)
library(cluster)
library(tidyverse)
options(digits = 3)
# Create dissimilarity matrix
d <- rnorm(1000)
mat <- matrix(d, ncol = 10)
diss_m <- daisy(mat) %>% as.matrix()
# Function
find_chulls <- function(df, x, y) {
ch <- chull(df[[x]], df[[y]])
df[ch,] %>% as.data.frame()
}
ui <- fluidPage(
titlePanel("Research"),
sidebarLayout(
sidebarPanel(
numericInput('dim', 'Dimensions', 2, min = 2, max = 15)
),
mainPanel(
h3('Stressplot'),
plotOutput('plot0'),
h3('Non-Metric Multidimensional Scaling'),
plotOutput('plot1'),
h3('2d Density Plot'),
plotOutput('plot2'),
h3('Cluster Analysis'),
plotOutput('plot3')
)
)
)
server <- function(input, output, session) {
nmds <- reactive({
metaMDS(diss_m,
distance = "euclidean",
k = input$dim,
trymax = 200,
autotransform = FALSE,
noshare = FALSE,
wascores = FALSE)
})
output$plot0 <- renderPlot({
stressplot(nmds())
})
pts <- reactive({
nmds()$points %>% as.data.frame()
})
output$plot1 <- renderPlot({
ggplot(pts(), aes(x = MDS1, y = MDS2)) +
geom_point()
})
output$plot2 <- renderPlot({
ggplot(pts(), aes(x = MDS1, y = MDS2)) +
geom_point() +
geom_density2d()
})
df_cl <- reactive({
km <- kmeans(x = pts(), centers = input$clust)
cl <- km$cluster
data.frame(pts(), clust = cl)
})
df_ch <- reactive({
df_ch_temp <- df_cl() %>% group_by(clust) %>% do(find_chulls(., 1, 2))
df_ch_temp %>% as.data.frame()
})
# The plot below is the one that doesn't work
output$plot3 <- renderPlot({
ggplot(df_ch(), aes(x = MDS1, y = MDS2, fill = as.factor(clust))) + geom_polygon(alpha = 0.10)
})
}
# Run the application
shinyApp(ui = ui, server = server)
Your input$clust is undefined in:
df_cl <- reactive({
km <- kmeans(x = pts(), centers = input$clust)
cl <- km$cluster
data.frame(pts(), clust = cl)
})
You need to add an input binding for clust, e.g.:
numericInput('clust', 'Clusters', 2, min = 2, max = 15)
As for debugging: I added browser() at the top of df_cl. Execution then stops there and you can inspect variables and run code in the console (e.g. in RStudio). When I ran km <- kmeans(x = pts(), centers = input$clust) I got the error you described and could then see that input contains no clust element.
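You can also reproduce the failure without Shiny, which makes the cause easier to see. A minimal sketch, assuming random points in place of pts() and a plain variable clust in place of input$clust (an input that does not exist in the UI is NULL):

```r
set.seed(1)
# Random 2-column data standing in for pts()
pts <- as.data.frame(matrix(rnorm(40), ncol = 2))

clust <- NULL  # stands in for input$clust when the UI has no 'clust' input

# Capture the error message instead of stopping the script
err <- tryCatch(
  kmeans(x = pts, centers = clust),
  error = function(e) conditionMessage(e)
)
print(err)

# With a valid number of centers the same call succeeds
km <- kmeans(x = pts, centers = 2)
print(table(km$cluster))
```

Inside the app, calling req(input$clust) at the top of df_cl is one way to stop the reactive quietly until the input has a value, instead of passing NULL on to kmeans().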
