I get a warning (Warning: Removed 2 rows containing missing values (geom_path).), which I don't want to have for the following code:
library(shiny)
library(ggplot2)
library(scales)
ui <- navbarPage("Test",
tabPanel("Test_2",
fluidPage(
fluidRow(
column(width = 12, plotOutput("plot", width = 1200, height = 600))
),
fluidRow(
column(width = 12, sliderInput("slider",
label = "Range [h]",
min = as.POSIXct("2019-11-01 00:00"),
max = as.POSIXct("2019-11-01 07:00"),
value = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 07:00"))))
))))
server <- function(input, output, session) {
df <- data.frame("x" = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 01:00"),
as.POSIXct("2019-11-01 02:00"),as.POSIXct("2019-11-01 03:00"),
as.POSIXct("2019-11-01 04:00"),as.POSIXct("2019-11-01 05:00"),
as.POSIXct("2019-11-01 06:00"),as.POSIXct("2019-11-01 07:00")),
"y" = c(0,1,2,3,4,5,6,7))
observe({
len_date_list <- length(df$x)
min_merge_datetime <- df$x[1]
max_merge_datetime <- df$x[len_date_list]
updateSliderInput(session, "slider",
min = as.POSIXct(min_merge_datetime),
max = as.POSIXct(max_merge_datetime),
timeFormat = "%Y-%m-%d %H:%M")
})
output$plot <- renderPlot({
in_slider_1 <- input$slider[1]
in_slider_2 <- input$slider[2]
ggplot(data=df, aes(x, y, group = 1)) +
theme_bw() +
geom_line(color="black", stat="identity") +
# geom_point() +
scale_x_datetime(labels = date_format("%m-%d %H:%M"),
limits = c(
as.POSIXct(in_slider_1),
as.POSIXct(in_slider_2)))
})
}
shinyApp(server = server, ui = ui)
It seems to be a general problem with "missing values", because I have found a lot of similar questions. One of them explains that the cause must be the range of the axis. So in my case I'm sure that it is because of the limits in scale_x_datetime.
scale_x_datetime(labels = date_format("%m-%d %H:%M"),
limits = c(
as.POSIXct(in_slider_1),
as.POSIXct(in_slider_2)))
But I didn't find an answered question where scale_x_datetime, as.POSIXct and a slider are used together.
BTW: If I uncomment geom_point() I get a further, similar warning.
I think it is because you haven't filtered df, so when the limits of scale_x_datetime are applied they remove the rows in df that don't fall between the slider values. I added this (it needs library(dplyr) for %>%, filter and between):
df %>% filter(between(x, in_slider_1, in_slider_2))
which seems to remove the issue for me. Please test. Just to mention that I did have some time zone problems.
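To see which rows the limits would drop, the check can be reproduced without Shiny or ggplot2. This is a minimal sketch in base R; the limits of 01:00 to 06:00 and the UTC time zone are assumptions chosen for illustration, not values from the question.

```r
# Same hourly data as in the question, with an explicit time zone (assumed UTC)
df <- data.frame(
  x = as.POSIXct("2019-11-01 00:00", tz = "UTC") + 3600 * (0:7),
  y = 0:7
)

# Hypothetical slider values
limits <- as.POSIXct(c("2019-11-01 01:00", "2019-11-01 06:00"), tz = "UTC")

# Rows outside the limits are the ones ggplot2 would report as removed
outside <- df$x < limits[1] | df$x > limits[2]
n_outside <- sum(outside)

# Filtering first leaves nothing for the scale to drop
df_inside <- df[!outside, ]
```

Setting tz explicitly on both the data and the limits is one way to avoid the two sides disagreeing by a few hours, which may be related to the time zone problems mentioned above.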
Full code below:
library(shiny)
library(ggplot2)
library(scales)
library(dplyr) # provides %>%, filter() and between()
ui <- navbarPage("Test",
tabPanel("Test_2",
fluidPage(
fluidRow(
column(width = 12, plotOutput("plot", width = 1200, height = 600))
),
fluidRow(
column(width = 12, sliderInput("slider",
label = "Range [h]",
min = as.POSIXct("2019-11-01 00:00"),
max = as.POSIXct("2019-11-01 07:00"),
value = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 07:00"))))
))))
server <- function(input, output, session) {
df <- data.frame("x" = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 01:00"),
as.POSIXct("2019-11-01 02:00"),as.POSIXct("2019-11-01 03:00"),
as.POSIXct("2019-11-01 04:00"),as.POSIXct("2019-11-01 05:00"),
as.POSIXct("2019-11-01 06:00"),as.POSIXct("2019-11-01 07:00")),
"y" = c(0,1,2,3,4,5,6,7))
observe({
len_date_list <- length(df$x)
min_merge_datetime <- df$x[1]
max_merge_datetime <- df$x[len_date_list]
updateSliderInput(session, "slider",
min = as.POSIXct(min_merge_datetime),
max = as.POSIXct(max_merge_datetime),
timeFormat = "%Y-%m-%d %H:%M")
})
output$plot <- renderPlot({
in_slider_1 <- input$slider[1]
in_slider_2 <- input$slider[2]
ggplot(data=df %>% filter(between(x, in_slider_1, in_slider_2)), aes(x, y, group = 1)) +
theme_bw() +
geom_line(color="black", stat="identity") +
# geom_point() +
scale_x_datetime(labels = date_format("%m-%d %H:%M"),
limits = c(
as.POSIXct(in_slider_1),
as.POSIXct(in_slider_2)))
})
}
shinyApp(server = server, ui = ui)
It looks like you could now actually remove the scale_x_datetime completely and just have:
ggplot(data=df %>% filter(between(x, in_slider_1, in_slider_2)), aes(x, y, group = 1)) +
theme_bw() +
geom_line(color="black", stat="identity")
I know this question already has an answer, but here is another possible solution.
If you just want to get rid of the warning, that implies to me that you are OK with the output. In that case you can try the following:
Add na.rm = TRUE to geom_line, like this: geom_line(..., na.rm = TRUE)
This explicitly tells geom_line and geom_path that it is OK to remove NA values.
Reasoning about the warning:
Warning: Removed k rows containing missing values (geom_path)
This tells you mainly 3 things:
geom_path is being called by another geom, and that is what fires the warning. In your case, it is geom_line.
It has already removed k rows. So if the output is as desired, then you do want those rows removed.
The reason for the removal is that some values ARE missing (NA).
What the warning doesn't tell you is WHY those rows have missing (NA) values.
You know that the cause is scale_x_datetime, specifically its limits argument. In terms of the (X, Y) pairs to be drawn, you set the X scale to values where there is no Y, or Y = NA. Your scale may be continuous, but your data is not. You may want to set a larger scale for any number of reasons, but ggplot will always find that there isn't an associated Y value, so it makes a unilateral decision and fires a warning instead of an error.
Hopefully, a time will come when errors and warnings include an intuitive, language-independent call trace to the emitter and a link to a well-explained page of common mistakes.
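The way limits produce NA values can be sketched in base R. By default, a continuous ggplot2 scale replaces values outside its limits with NA before the geom sees them. The helper censor_like below is a hypothetical stand-in written for this illustration, not a ggplot2 function, and the limits are made up.

```r
# Hypothetical helper mimicking what a continuous scale does by default
# with values outside its limits: they become NA
censor_like <- function(x, limits) {
  x[x < limits[1] | x > limits[2]] <- NA
  x
}

x <- 0:7
x_scaled <- censor_like(x, limits = c(1, 6))

# Number of rows a geom would then report as removed
n_removed <- sum(is.na(x_scaled))
```

If the goal is only to zoom in, coord_cartesian(xlim = ...) changes the visible window without turning any data into NA, so no rows are removed.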
Related
I am creating two interactive plots in R Shiny and while I can get one plot to show up and work, the second plot keeps giving me the "Warning: Error in [.data.frame: undefined columns selected" and will not appear.
I have looked at many solutions online and none so far have been able to help me or fix my issue.
I am having a hard time seeing how my columns are undefined, but I am also relatively new to R Shiny and could be easily overlooking something, so I was hoping someone could help me figure this out.
Here is my code:
library(shiny)
library(dplyr)
library(readr)
library(ggplot2)
library(tidyverse)
age <- c(1, 4, 7,10, 15)
v_m_1 <- c(10, 14, 17, 20, 25)
v_m_2 <- c(9, 13, 16, 19, 24)
sex <- c("F", "M","U", "F", "M")
P_v_rn <- c(0.11, 0.51, 0.61, 0.91, 1)
C_v_rn <- c(11.1, 15.1, 16.1, 19.1, 20.1)
P_v_rk <- c(0.11, 0.51, 0.61, 0.91, 1)
B_v_rk <- c("Low", "Medium", "Medium", "High", "High")
df_test <- data.frame(age, v_m_1, v_m_2, sex, P_v_rn, C_v_rn, P_v_rk, B_v_rk)
# Define UI for application that draws a histogram
ui <- fluidPage(
# Application title
titlePanel("Test"),
# Sidebar with a slider input for number of bins
verticalLayout(
sidebarLayout(
sidebarPanel(
selectInput(inputId = "xvar",
label = "Choose X variable", #All variables are numeric
c("Age" = 1),
selected = 1),
selectInput(inputId = "yvar",
label = "Choose bone variable", #All variables are numeric
c("v_m_1" = 2,
"v_m_2" = 3),
selected = 2),
checkboxInput(inputId = "regression",
label = "Fit LOESS - By Sex",
value = FALSE)),
mainPanel(
plotOutput('dataplot1')
)
),
tags$hr(),
sidebarLayout(
sidebarPanel(
selectInput(inputId = "xvar_name",
label = "Choose X variable", #All variables are numeric
c("Age" = 1),
selected = 1),
selectInput(inputId = "yvar_name",
label = "Choose Y variable", #The first variable option is numeric, the rest are factors
c("P_v_rk" = 7,
"B_v_rk" = 8),
selected = 7),
selectInput(inputId = "zvar_name",
label = "Choose Z variable", #All variables are numeric
c("C_v_rn" = 6,
"P_v_rn" = 5),
selected = 6)),
# Show a plot of the generated distribution
mainPanel(
plotOutput('dataplot2')
)
),
tags$hr(),
))
# Define server logic required to draw a scatterplot
server <- function(input, output) {
df <- df_test %>%
select(age, v_m_1, v_m_2, sex, P_v_rn, C_v_rn, P_v_rk, B_v_rk)
df$B_v_rk <- as.factor(df$B_v_rk)
#Growth Curve
output$dataplot1 <- renderPlot({
xvar <- as.numeric(input$xvar)
yvar <- as.numeric(input$yvar)
Sex <- as.factor(df$sex)
p <- ggplot() +
aes(x = df[ ,xvar],
y = df[ ,yvar],
col = sex) +
geom_point(alpha = 0.5, aes(size = 1.5)) + # 50% transparent
labs(x = names(df[xvar]),
y = names(df[yvar])) +
theme_classic()
if(input$regression) {
# add a line to the plot
p <- p + geom_smooth()
}
p # The plot ('p') is the "return value" of the renderPlot function
})
#Environmental metrics
output$dataplot2 <- renderPlot({
xvar_name <- input$xvar_name
yvar_name <- input$yvar_name
zvar_name <- input$zvar_name
#Color palette for ggplots as blue color range was difficult for me
fun_color_range <- colorRampPalette(c("yellow", "red"))
my_colors <- fun_color_range(20)
p2 <- ggplot() +
aes(x = df[ ,xvar_name],
y = df[ ,yvar_name],
col = df[ ,zvar_name]) +
geom_point(alpha = 0.5, aes(size = 1.5)) + # 50% transparent
scale_colour_gradientn(colors = my_colors) +
labs(x = names(df[xvar_name]),
y = names(df[yvar_name])) +
theme_classic()
p2 # The plot ('p2') is the "return value" of the renderPlot function
})
}
# Run the application
shinyApp(ui = ui, server = server)
Again the first plot works fine, it is the second plot that is producing an error code.
I guess I am confused as the code for the first plot works fine but it won't work for the second plot.
For reference, this is the layout I want, except I want another plot in the error code location.
My guess is that the bug is in the line with names(df[xvar_name]). If df is a data frame, this will throw the error you quoted. To subset a data frame with indices or column names you either use double brackets (df[[...]]) or a comma (df[ ..., ... ]). I think you meant names(df[ , xvar_name ]). This error is repeated on the line below as well.
In general, to identify the place where the problem occurs, use browser() in your code.
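One more thing worth checking in the question's code: the first plot wraps its inputs in as.numeric(), while the second plot uses input$xvar_name and friends directly. selectInput() returns its value as a character string, so a choice of 7 arrives as "7" and is looked up as a column name. The following is a sketch of that effect on a small made-up data frame, offered as a likely cause and not a confirmed diagnosis.

```r
df <- data.frame(age = c(1, 4, 7), v_m_1 = c(10, 14, 17))

# selectInput() hands the server a string such as "2", not the number 2
choice <- "2"

# A string is treated as a column name, and there is no column called "2"
by_name <- tryCatch(df[, choice], error = function(e) conditionMessage(e))

# Converting to a number selects by position instead
by_position <- df[, as.numeric(choice)]
```

Here by_name holds the error message and by_position holds the v_m_1 column.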
I am new to R and Shiny and have a problem that I have not been able to solve for hours.
I have a dataset from which I display the daily consumption of coffee on a dashboard, which works very well. The plot is a ggplot geom_line chart.
But now I want to be able to change the time period with two sliders.
The sliders I have also managed to do, but the plot does not change when the slider is moved.
I also suspect that I have an error with the date format.
What am I doing wrong?
Thanks for the help
RawData Dataset
library(shiny)
library(dplyr)
library(data.table)
library(ggplot2)
ui <- shinyUI(fluidPage(
# Application title
titlePanel("Coffee consumption"),
# Sidebar with a slider input for the number of bins
sidebarLayout(
sidebarPanel(
sliderInput("DatesMerge",
"Dates:",
min = as.Date("2018-01-22","%Y-%m-%d"),
max = as.Date("2020-04-04","%Y-%m-%d"),
value= c(as.Date("2018-01-22","%Y-%m-%d"),as.Date("2020-04-04","%Y-%m-%d")),
timeFormat="%Y-%m-%d")
),
mainPanel(
plotOutput("plot_daycount"),
tableOutput("structure"),
tableOutput("rawdata"),
tableOutput("dayconsumption"))
)
)
)
# RawData import
coffeedata = fread("C:/temp/ProductList.csv")
setDF(coffeedata)
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")
# Products a day counter
countcoffee <- function(timeStamps) {
Dates <- as.Date(strftime(coffeedata$Date, "%Y-%m-%d"))
allDates <- seq(from = min(Dates), to = max(Dates), by = "day")
coffee.count <- sapply(allDates, FUN = function(X) sum(Dates == X))
data.frame(day = allDates, coffee.count = coffee.count)}
# Making a DF with day consumption
daylicounter = countcoffee(df$coffee.date)
server <- shinyServer(function(input, output) {
output$structure = renderPrint({
str(coffeedata)
})
# Raw Data
output$rawdata = renderTable({
head(coffeedata)
})
output$dayconsumption = renderTable({
head(daylicounter)
})
# GGPLOT2
output$plot_daycount = renderPlot({
DatesMerge = input$DatesMerge
ggplot(daylicounter[daylicounter == DatesMerge], aes(daylicounter$day, daylicounter$coffee.count)) +
geom_line(color = "orange", size = 1)
scale_x_date(breaks = "3 month",
date_labels = "%d-%m-%Y")
# Try outs
# ggplot(daylicounter[month(day) == month(DatesMerge)], mapping = aes(day = day)) +
# geom_line(color = "orange", size = 1)
# scale_x_date(breaks = "3 month",
# date_labels = "%d-%m-%Y")
})
})
shinyApp(ui, server)
I appreciate your help
As noted by @Kevin, you need to use input$DatesMerge[1] and input$DatesMerge[2] when subsetting your data. For clarity, this can be done in a separate step. Try something like this in your server:
output$plot_daycount = renderPlot({
DatesMerge <- as.Date(input$DatesMerge, format = "%Y-%m-%d")
sub_data <- subset(daylicounter, day >= DatesMerge[1] & day <= DatesMerge[2])
ggplot(sub_data, aes(x = day, y = coffee.count)) +
geom_line(color = "orange", size = 1) +
scale_x_date(breaks = "3 month", date_labels = "%d-%m-%Y")
})
Edit: The OP asked an additional question:
Why does my date format look normal with str(coffeedata) but with
head(coffeedata) the date is just a number?
renderTable uses xtable which may have trouble with dates. You can get your dates to display correctly by converting to character first (one option):
output$rawdata = renderTable({
coffeedata$Date <- as.character(coffeedata$Date)
head(coffeedata)
})
output$dayconsumption = renderTable({
daylicounter$day <- as.character(daylicounter$day)
head(daylicounter)
})
See other questions on this topic:
as.Date returns number when working with Shiny
R shiny different output between renderTable and renderDataTable
Welcome to R and Shiny, you are off to a great start. A few things:
I don't recommend using = for assignment in Shiny; in the majority of cases you want to use <-.
To simplify the code, there is no reason to add a variable like DatesMerge. It adds no value (at least in the code above).
For a dual slider, you need to tell Shiny which end to pick. input$DatesMerge on its own is a vector of two dates, whereas input$DatesMerge[1] is a single date.
When asking for help, it is always better to include a subset of the data within the code itself. You tend to get more help, and it is easier to run for the person trying to help (I was too lazy to download the file and place it in a folder, so I didn't run it).
You need to account for a range of dates in your slider when subsetting the data; you also forgot the comma when subsetting the data.
ggplot(daylicounter[daylicounter$day >= input$DatesMerge[1] & daylicounter$day <= input$DatesMerge[2], ], aes(day, coffee.count)) +
geom_line(color = "orange", size = 1) +
scale_x_date(breaks = "3 month",
date_labels = "%d-%m-%Y")
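The range subsetting can be tried outside Shiny first. This is a minimal sketch in base R: the data frame daily stands in for daylicounter with made-up counts, and date_range stands in for input$DatesMerge.

```r
# Made-up daily counts standing in for daylicounter
daily <- data.frame(
  day = seq(as.Date("2018-01-22"), by = "day", length.out = 10),
  coffee.count = c(3, 5, 2, 4, 6, 1, 0, 2, 3, 4)
)

# Stand-in for input$DatesMerge: a vector of two dates
date_range <- as.Date(c("2018-01-24", "2018-01-27"))

# Keep rows whose day lies between the two ends, inclusive
in_range <- daily$day >= date_range[1] & daily$day <= date_range[2]
sub_data <- daily[in_range, ]
```

Passing sub_data to ggplot() and naming the columns inside aes() without the data frame prefix keeps the plotted data and the aesthetics in sync.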
I have a data set with three laps (15s/lap) each of which shows the different speed for every second:
AA <- as.data.frame(cbind(c(10,12,11,12,12,11,12,13,11,9,9,12,11,10,12,9,8,7,9,8,7,9,9,8,9,7,9,10,10,10,7,6,7,8,8,7,6,6,7,8,7,6,7,8,8),
c(rep("Lap_1",15),rep("Lap_2",15),rep("Lap_3",15))))
I want to compare the three laps together, but for the first one I'd like to use a sliderInput to select only some of the 15 seconds. I'm having some difficulty adding that to my code. Here is what I have at the moment:
install.packages("shiny")
install.packages("ggplot2")
library(shiny)
library(ggplot2)
colnames(AA) <- c("Speed","Lap")
AA$Speed <- as.numeric(as.character(AA$Speed))
ui=shinyUI(
fluidPage(
titlePanel("Title here"),
sidebarLayout(
sidebarPanel(
checkboxGroupInput("lap_choose",
label = "Choose the laps",
choices = c("Lap_1","Lap_2","Lap_3")),
sliderInput("secs_1",
"Seconds in L1:",
min = 0,
max = 15,
value = c(3,10),
step=1)),
mainPanel(
plotOutput("Comparison"))
)
)
)
server=function(input,output){
#data manipulation
data_1=reactive({
return(AA[AA$Lap%in%input$lap_choose,])
})
output$Comparison <- renderPlot({
ggplot(data=data_1(), aes(Speed, fill = Lap)) +
stat_density(aes(y = ..density..),
position = "identity",
color = "black",
alpha = 0.8) +
xlab("Distribution") +
ylab("Density") +
ggtitle("Comparison") +
theme(plot.title = element_text(hjust = 0.5,size=24, face="bold"))
})
}
shinyApp(ui,server)
I should use the secs_1 at some point to update data_1, but didn't find out how yet. Any ideas?
If I understand correctly, you want to filter out some values (based on the secs_1 sliderInput) when the Lap variable is "Lap_1".
Try using an ifelse statement in the data_1 reactive.
data_1=reactive({
xc <- AA[AA$Lap%in%input$lap_choose,]
gh <- ifelse(xc$Lap == "Lap_1" & xc$Speed %in% c(input$secs_1[1],input$secs_1[2]),
FALSE, TRUE)
return(xc[gh,])
})
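If the slider is meant to pick seconds and not speed values, the data needs a column for the second within each lap. The sketch below is an alternative in base R, assuming the rows are in time order and that seconds are numbered from 1 to 15. The helper filter_lap_1 and the Second column are hypothetical names introduced here, and only two laps are included to keep it short.

```r
AA <- data.frame(
  Speed = c(10, 12, 11, 12, 12, 11, 12, 13, 11, 9, 9, 12, 11, 10, 12,
            9, 8, 7, 9, 8, 7, 9, 9, 8, 9, 7, 9, 10, 10, 10),
  Lap = rep(c("Lap_1", "Lap_2"), each = 15)
)

# Hypothetical helper: keep only the chosen seconds of Lap_1, all of other laps
filter_lap_1 <- function(data, laps, secs) {
  # Second within each lap, assuming rows are in time order
  data$Second <- ave(seq_len(nrow(data)), data$Lap, FUN = seq_along)
  chosen <- data[data$Lap %in% laps, ]
  keep <- chosen$Lap != "Lap_1" |
    (chosen$Second >= secs[1] & chosen$Second <= secs[2])
  chosen[keep, ]
}

result <- filter_lap_1(AA, laps = c("Lap_1", "Lap_2"), secs = c(3, 10))
```

Inside the app, the reactive could then return filter_lap_1(AA, input$lap_choose, input$secs_1).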
I have problems with my ggplot in Shiny. I am new to Shiny, so there are probably some rookie mistakes in the code. But I receive the following warnings:
Listening on http://127.0.0.1:4278
Warning: Removed 93 rows containing non-finite values (stat_smooth).
Warning: Removed 93 rows containing missing values (geom_point).
Warning: Removed 1 rows containing missing values (geom_text).
The R code:
library(shiny)
library(ggplot2)
ggplot_df <- data.frame("start_ts"=c(1555279200,1555280100,1555281000,1555281900,1555282800),
"V1"=c(6.857970e-04,7.144347e-05,1.398045e-06,2.997632e-05,2.035446e-06),
"sum"=c(20,21,22,15,23))
# Small test data set with 5 observations... 93 in original one
# Define UI for application
ui <- fluidPage(sliderInput("time", "Time:",
min = as.POSIXct("00:00",format="%H:%M", tz=""),
max = as.POSIXct("24:00",format="%H:%M", tz=""),
value = c(
as.POSIXct("00:00",format="%H:%M")
), timeFormat = "%H:%M", step=60*15, timezone = "",
animate=
animationOptions(interval=300, loop=TRUE)),
plotOutput("plot")
)
# Define server logic required
server <- function(input, output) {
output$plot<-renderPlot({
ggplot_df$start_ts <-as.POSIXct(ggplot_df$start_ts, format="%H:%M", tz="",origin="1970-01-01")
ggplot_df<-ggplot_df[ggplot_df$start_ts==input$time,]
ggplot(ggplot_df,aes(x=sum,y=V1))+geom_point() +
theme_bw() +
geom_smooth(method = "lm", se = FALSE) +
ylim(0,3) +
xlim(0,max(ggplot_df$sum)) +
annotate('text', max(ggplot_df$sum)-10,3,
label = paste("~R^{2}==",round(cor(ggplot_df$sum, ggplot_df$V1), digits=2)),parse = TRUE,size=4)
})
}
# Run the application
shinyApp(ui = ui, server = server)
Note that exactly the same thing is happening even if I define the time zone.
ggplot_df is a data frame with 93 rows. What have I done wrong? The plot I receive is empty, no points, etc, as shown below:
The problem is that the POSIXct column is a datetime, but the slider input is only a time. Is the date important here, or is only the time of day of interest? The code below makes some plots, although I can't tell what the desired end result is, so it may not be quite right.
ui <- fluidPage(sliderInput("time", "Time:",
min = as.POSIXct("2019-04-14 00:00",format="%Y-%m-%d %H:%M", tz=""),
max = as.POSIXct("2019-04-15 24:00",format="%Y-%m-%d %H:%M", tz=""),
value = c(
as.POSIXct("2019-04-14 00:00")
), timeFormat = "%Y-%m-%d %H:%M", step=60*15, timezone = "",
animate=
animationOptions(interval=300, loop=TRUE)),
plotOutput("plot")
)
# Define server logic required
server <- function(input, output) {
output$plot<-renderPlot({
ggplot_df$start_ts <-as.POSIXct(ggplot_df$start_ts, tz="",origin="1970-01-01")
ggplot_df<-ggplot_df[ggplot_df$start_ts==input$time,]
ggplot(ggplot_df,aes(x=sum,y=V1))+geom_point() +
theme_bw() +
geom_smooth(method = "lm", se = FALSE) +
ylim(0,3) +
xlim(0,max(ggplot_df$sum)) +
annotate('text', max(ggplot_df$sum)-10,3,
label = paste("~R^{2}==",round(cor(ggplot_df$sum, ggplot_df$V1), digits=2)),parse = TRUE,size=4)
})
}
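If only the time of day matters, another option is to compare formatted times and ignore the date. This is a rough sketch in base R using the first three timestamps from the question; the UTC time zone and the slider value "22:15" are assumptions made for illustration.

```r
start_ts <- as.POSIXct(c(1555279200, 1555280100, 1555281000),
                       origin = "1970-01-01", tz = "UTC")

# Time of day as "HH:MM" text, ignoring the date (time zone assumed UTC)
time_of_day <- format(start_ts, "%H:%M", tz = "UTC")

# Stand-in for a time-only slider value
slider_time <- "22:15"

matching_rows <- which(time_of_day == slider_time)
```

In the app, the slider value would need the same treatment, for example format(input$time, "%H:%M", tz = "UTC"), so that both sides are compared in the same time zone.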
So I am trying to tackle the following but I may have started down the wrong road.
As these sample sizes increase, I need to update the y-limits so the highest bar in geom_histogram() doesn't go off the top. This especially happens if the st. dev. is set near 0.
This is literally my second day working with Shiny and reactive applications so I feel I've gotten myself into a pickle.
I think I need to save the ggplot() objects and then update their ylimit reactively with the value of the largest bar from the last histogram. Just not sure if I can do that the way this thing is set up now.
(I am realizing I had a similar problem over 2 years ago)
ggplot2 Force y-axis to start at origin and float y-axis upper limit
This is different because it is the height of a histogram that needs to tell the y-axis to increase, not the largest data value. Also, because Shiny.
My server.R function looks like
library(shiny)
library(ggplot2)
library(extrafont)
# Define server logic for random distribution application
function(input, output, session) {
data <- reactive({
set.seed(123)
switch(input$dist,
norm = rnorm(input$n,
sd = input$stDev),
unif = runif(input$n,-4,4),
lnorm = rlnorm(input$n)
)
})
height="100%"
plotType <- function(blah, maxVal, stDev, n, type) {
roundUp <- function(x) 10^ceiling(log10(x)+0.001)
maxX<- roundUp(maxVal)
breakVal<-max(floor(maxX/10),1)
switch(type,
norm = ggplot(as.data.frame(blah), aes(x=blah))+
geom_histogram(binwidth = 0.2,
boundary = 0,
colour = "black") +
scale_y_continuous(limits = c(0, maxX),
breaks = seq(0, maxX, breakVal),
expand = c(0, 0)) +
scale_x_continuous(breaks = seq(-4, 4, 1),
expand = c(0, 0)) +
theme_set(theme_bw(base_size = 40)) +
ylab("Frequency")+
xlab("")+
coord_cartesian(xlim=c(-4, 4))+
ggtitle(paste("n = ",n, "St Dev =", stDev," Normal Distribution ", sep = ' ')),
unif = ggplot(as.data.frame(blah), aes(x=blah))+
geom_histogram(binwidth=0.1, boundary =0,colour = "black")+
scale_y_continuous(limits = c(0,roundUp(maxVal*(3/stDev))),
breaks=seq(0,roundUp(maxVal*(3/stDev)), roundUp(maxVal*(3/stDev))/10),
expand = c(0, 0))+
scale_x_continuous(breaks=seq(-4,4,1),expand = c(0, 0))+
theme_set(theme_bw(base_size = 40))+
ylab("Frequency")+xlab("")+
coord_cartesian(xlim=c(-4,4))+
ggtitle(paste("n = ",n, " Uniform Distribution ", sep = ' ')),
lnorm = ggplot(as.data.frame(blah), aes(x=blah))+
geom_histogram(binwidth=0.2, boundary =0,colour = "black")+
scale_y_continuous(limits = c(0,maxX),
breaks=seq(0,maxX, breakVal),
expand = c(0, 0))+
scale_x_continuous(breaks=seq(0,8,1),expand = c(0, 0))+
theme_set(theme_bw(base_size = 40))+
ylab("Frequency")+xlab("")+
coord_cartesian(xlim=c(0,8))+
ggtitle(paste("n = ",n, " Log-Normal Distribution ", sep = ' '))
)
}
observe({
updateSliderInput(session, "n",
step = input$stepSize,
max=input$maxN)
})
plot.dat <- reactiveValues(main=NULL, layer1=NULL)
#plotType(data, maxVal, stDev, n, type)
output$plot <- renderPlot({
plotType(data(),
switch(input$dist,
norm = max((input$n)/7,1),
unif = max((input$n)/50,1),
lnorm =max((input$n)/8,1)
),
input$stDev,
input$n,
input$dist) })
# Generate a summary of the data
output$summary <- renderTable(
as.array(round(summary(data())[c(1,4,6)],5)),
colnames=FALSE
)
output$stDev <- renderTable(
as.array(sd(data())),
colnames=FALSE
)
# Generate an HTML table view of the data
output$table <- renderTable({
data.frame(x=data())
})
}
And my ui.R looks like
library(shiny)
library(shinythemes)
library(DT)
# Define UI for random distribution application
shinyUI(fluidPage(theme = shinytheme("slate"),
# Application title
headerPanel("Michael's Shiny App"),
# Sidebar with controls to select the random distribution type
# and number of observations to generate. Note the use of the
# br() element to introduce extra vertical spacing
sidebarLayout(
sidebarPanel(
tags$head(tags$style("#plot{height:90vh !important;}")),
radioButtons("dist", "Distribution:",
c("Standard Normal" = "norm",
"Uniform" = "unif",
"Log-normal" = "lnorm")),
br(),
numericInput("stepSize", "Step", 1, min = 1, max = NA, step = NA,
width = NULL),
numericInput("maxN", "Max Sample Size", 50, min = NA, max = NA, step = NA,
width = NULL),
br(),
sliderInput("n",
"Number of observations:",
value = 0,
min = 1,
max = 120000,
step = 5000,
animate=animationOptions(interval=1200, loop=T)),
sliderInput("stDev",
"Standard Deviation:",
value = 1,
min = 0,
max = 3,
step = 0.1,
animate=animationOptions(interval=1200, loop=T)),
p("Summary Statistics"),
tabPanel("Summary", tableOutput("summary")),
p("Sample St. Dev."),
tabPanel("Standard Dev", tableOutput("stDev")),
width =2
),
# Show a tabset that includes a plot, summary, and table view
# of the generated distribution
mainPanel(
tabsetPanel(type = "tabs",
tabPanel("Plot", plotOutput("plot")),
tabPanel("Table", tableOutput("table"))
))
)))
The whole thing has a lot of redundancy. What I want to do, is once the biggest bar on the histogram gets close to the upper y-limit, I want the ylimit to jump to the next power of 10.
Any suggestions are greatly appreciated.
Update: Loosely, the solution that I ended up using is as follows. In the renderPlot() function, you need to save the ggplot object. Then, as mentioned below, access the ymax value (still within renderPlot()):
ggplot_build(norm)$layout$panel_ranges[[1]]$y.range[[2]]
and then use that to update the y-axis. I used the following function to make the axis limit "nice".
roundUpNice <- function(x, nice=c(1,2,4,5,6,8,10)) {
10^floor(log10(x)) * nice[[which(x <= 10^floor(log10(x)) * nice)[[1]]]]
}
Then update the y-axis (still within renderPlot()):
ymaxX = roundUpNice(ggplot_build(norm)$layout$panel_ranges[[1]]$y.range[[2]])
norm+scale_y_continuous(limits = c(0, max(ymaxX, 20)),
expand=c(0,0))
First, store the histogram (default axes).
p1 <- ggplot(...) + geom_histogram()
Then use ggplot_build(p1) to access the heights of the histogram bars. For example,
set.seed(1)
df <- data.frame(x=rnorm(10000))
library(ggplot2)
p1 <- ggplot(df, aes(x=x)) + geom_histogram()
bar_max <- max(ggplot_build(p1)[['data']][[1]]$ymax) # where 1 is index 1st layer
bar_max # returns 1042
You will need a function to tell you what the next power of 10 is, for example:
nextPowerOfTen <- function(x) as.integer(floor(log10(x) + 1))
# example: nextPowerOfTen(999) # returns 3 (10^3=1000)
You will want to check whether the bar_max is within some margin (based on your preference) of the next power of 10. If an adjustment is triggered, you can simply do p1 + scale_y_continuous(limits=c(0,y_max_new)).
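The margin check and the jump to the next power of ten can be folded into one small function. This is a sketch of one way to do it: next_y_limit and its margin argument are hypothetical names, and the default of 0.9 is an arbitrary choice meaning "jump once the tallest bar passes 90% of the current limit".

```r
# Hypothetical helper: smallest power of ten that leaves the tallest bar
# below `margin` times the axis limit
next_y_limit <- function(bar_max, margin = 0.9) {
  10^ceiling(log10(bar_max / margin))
}

limit_low  <- next_y_limit(850)   # 850 is below 0.9 * 1000
limit_high <- next_y_limit(950)   # 950 is above 0.9 * 1000
```

Here limit_low stays at 1000 while limit_high jumps to 10000. The result can be used as y_max_new in p1 + scale_y_continuous(limits = c(0, y_max_new)).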
I found the answer hidden in the "scale_y_continuous()" portion of your code. The app was very close, but in some cases, the data maxed out the y-axis, which made it appear like it was running further than the axis limits as you said.
To fix this problem, the expand argument within the scale_y_continuous section needs to be set to "c(0.05, 0)", instead of "c(0, 0)".
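For a two-element expand vector, the first number is a multiplicative padding and the second an additive one, applied on both sides of the limits. The arithmetic can be sketched in base R; expanded_range is a hypothetical helper written for this illustration, not a ggplot2 function.

```r
# Hypothetical helper reproducing the padding arithmetic for expand = c(mult, add)
expanded_range <- function(limits, expand) {
  pad <- diff(limits) * expand[1] + expand[2]
  c(limits[1] - pad, limits[2] + pad)
}

no_padding   <- expanded_range(c(0, 100), c(0, 0))
with_padding <- expanded_range(c(0, 100), c(0.05, 0))
```

With limits of 0 to 100, c(0.05, 0) adds 5 units of space at each end. The padding only adds room around the limits; it does not change the limits themselves.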
First, I've replicated an example of the graph run-off you were describing by setting the sample size to 50 and standard deviation to 0.3 within your app. After running the original code with "expand=c(0, 0)", we can see we get the following graph:
This problem is fixed by changing the argument to "expand=c(0.05, 0)", as shown here:
For copies of the fixed scripts, see below.
Part 1 -- server.R
library(shiny)
library(ggplot2)
library(extrafont)
# Define server logic for random distribution application
function(input, output, session) {
data <- reactive({
set.seed(123)
switch(input$dist,
norm = rnorm(input$n,
sd = input$stDev),
unif = runif(input$n,-4,4),
lnorm = rlnorm(input$n)
)
})
height="100%"
plotType <- function(blah, maxVal, stDev, n, type){
roundUp <- function(x){10^ceiling(log10(x)+0.001)}
maxX<- roundUp(maxVal)
breakVal<-max(floor(maxX/10),1)
switch(type,
norm=ggplot(as.data.frame(blah), aes(x=blah)) +
geom_histogram(binwidth = 0.2,
boundary = 0,
colour = "black") +
scale_y_continuous(limits = c(0, maxX),
breaks = seq(0, maxX, breakVal),
expand = c(0.05, 0)) +
scale_x_continuous(breaks = seq(-4, 4, 1),
expand = c(0, 0)) +
theme_set(theme_bw(base_size = 40)) +
ylab("Frequency") +
xlab("") +
coord_cartesian(xlim=c(-4, 4))+
ggtitle(paste("n = ",n, "St Dev =", stDev,
" Normal Distribution ", sep = ' ')),
unif=ggplot(as.data.frame(blah), aes(x=blah)) +
geom_histogram(binwidth=0.1, boundary=0, colour="black")+
scale_y_continuous(
limits = c(0,roundUp(maxVal*(3/stDev))),
breaks=seq(0,roundUp(maxVal*(3/stDev)),
roundUp(maxVal*(3/stDev))/10),
expand = c(0.05, 0))+
scale_x_continuous(breaks=seq(-4,4,1),expand=c(0, 0)) +
theme_set(theme_bw(base_size = 40))+
ylab("Frequency")+xlab("")+
coord_cartesian(xlim=c(-4,4))+
ggtitle(paste("n = ",n,
" Uniform Distribution ", sep = ' ')),
lnorm=ggplot(as.data.frame(blah), aes(x=blah))+
geom_histogram(binwidth=0.2,boundary=0, colour="black") +
scale_y_continuous(limits=c(0,maxX),
breaks=seq(0,maxX, breakVal),
expand = c(0.05, 0)) +
scale_x_continuous(breaks=seq(0,8,1),
expand = c(0, 0)) +
theme_set(theme_bw(base_size = 40)) +
ylab("Frequency") +
xlab("") +
coord_cartesian(xlim=c(0,8)) +
ggtitle(paste("n = ",n,
" Log-Normal Distribution ",
sep = ' '))
)
}
observe({
updateSliderInput(session, "n",
step = input$stepSize,
max=input$maxN)
})
plot.dat <- reactiveValues(main=NULL, layer1=NULL)
#plotType(data, maxVal, stDev, n, type)
output$plot <- renderPlot({
plotType(data(),
switch(input$dist,
norm = max((input$n)/7,1),
unif = max((input$n)/50,1),
lnorm =max((input$n)/8,1)
),
input$stDev,
input$n,
input$dist) })
# Generate a summary of the data
output$summary <- renderTable(
as.array(round(summary(data())[c(1,4,6)],5)),
colnames=FALSE
)
output$stDev <- renderTable(
as.array(sd(data())),
colnames=FALSE
)
# Generate an HTML table view of the data
output$table <- renderTable({
data.frame(x=data())
})
}
Part 2 -- ui.R
library(shiny)
library(shinythemes)
library(DT)
# Define UI for random distribution application
shinyUI(fluidPage(theme = shinytheme("slate"),
# Application title
headerPanel("Michael's Shiny App"),
# Sidebar with controls to select the random distribution type
# and number of observations to generate. Note the use of the
# br() element to introduce extra vertical spacing
sidebarLayout(
sidebarPanel(
tags$head(tags$style("#plot{height:90vh !important;}")),
radioButtons("dist", "Distribution:",
c("Standard Normal" = "norm",
"Uniform" = "unif",
"Log-normal" = "lnorm")),
br(),
numericInput("stepSize", "Step", 1,
min = 1, max = NA, step = NA, width = NULL),
numericInput("maxN", "Max Sample Size", 50,
min = NA, max = NA, step = NA,width = NULL),
br(),
sliderInput("n", "Number of observations:", value = 0,
min = 1, max = 120000, step = 5000,
animate=animationOptions(interval=1200, loop=T)),
sliderInput("stDev","Standard Deviation:",value = 1,
min = 0,max = 3,step = 0.1,
animate=animationOptions(interval=1200, loop=T)),
p("Summary Statistics"),
tabPanel("Summary", tableOutput("summary")),
p("Sample St. Dev."),
tabPanel("Standard Dev", tableOutput("stDev")),
width =2),
# Show a tabset that includes a plot, summary, and table view
# of the generated distribution
mainPanel(tabsetPanel(type = "tabs",
tabPanel("Plot", plotOutput("plot")),
tabPanel("Table", tableOutput("table"))
))
)))
Update: Loosely, the solution that I ended up using is as follows. In the renderPlot() function, you need to save the ggplot object. Then, as mentioned below, access the ymax value (still within renderPlot()):
ggplot_build(p1)$layout$panel_ranges[[1]]$y.range[[2]]
and then use that to update the y-axis. I used the following function to make the axis limit "nice".
roundUpNice <- function(x, nice=c(1,2,4,5,6,8,10)) {
if(length(x) != 1) stop("'x' must be of length 1")
10^floor(log10(x)) * nice[[which(x <= 10^floor(log10(x)) * nice)[[1]]]]
}