How to remove empty factors from xaxis - r

I am trying to get an interactive plotly bar chart working with filter_select(), without Shiny. I am working with data for a large number of airports (> 100). A bar chart showing all of them is too crowded for a user to compare the performance (value VAL) observed at one airport (APT_x) with a subset of peers. The idea is to use a filter that lets the user select the subset of airports.
# create a dummy table with data for year, airport, and observed value
yr <- c(2017, 2018, 2019)
ap <- c("APT_1", "APT_2", "APT_3", "APT_N")
df <- expand.grid(YEAR = yr, APT = ap)
df$VAL <- c(10, 11, 12, 14, 9, 8, 7, 6, 2, 10, 12, 13)
library(plotly)
library(crosstalk) # provides filter_select() and bscols()
# shared data
df_sh <- highlight_key(df, key=~APT)
# filters
ap_filter <- filter_select(id="airport",label="filter airport", sharedData=df_sh, group=~APT)
# stacked bar chart
bc <- df_sh %>% plot_ly(x=~APT, y=~VAL, color=~factor(YEAR)) %>%
group_by(APT) %>%
add_bars() %>%
layout(barmode = "stack")
# arrange plot
bscols(widths = c(3, 9)
, ap_filter
, bc
)
Whenever more than one airport (APT) is selected, the x-axis still shows the ticks of all the airports that lie between the bars.
How can these be removed or suppressed? Obviously, in the following example, APT_2 should not be shown. Thanks for any pointers.

I got an answer to the same issue here.
All that is needed is to set categoryorder = "trace" in the layout of the axis you are interested in.
Applied to your example (the only difference is in the layout call of the bc definition):
library(crosstalk)
library(plotly)
# create a dummy table with data for year, airport, and observed value
yr <- c(2017, 2018, 2019)
ap <- c("APT_1", "APT_2", "APT_3", "APT_N")
df <- expand.grid(YEAR = yr, APT = ap)
df$VAL <- c(10, 11, 12, 14, 9, 8, 7, 6, 2, 10, 12, 13)
# shared data
df_sh <- highlight_key(df, key = ~APT)
# filters
ap_filter <- filter_select(id = "airport", label = "filter airport", sharedData = df_sh, group = ~APT)
# stacked bar chart
bc <- df_sh %>% plot_ly(x = ~APT, y = ~VAL, color = ~factor(YEAR)) %>%
group_by(APT) %>%
add_bars() %>%
layout(barmode = "stack",
xaxis = list(categoryorder = "trace"))
# arrange plot
bscols(widths = c(3, 9), ap_filter, bc)

Related

{gtExtras} column showing in wrong order in {gt} table when grouped

I am making a gt table showing the progress of individuals towards a goal. In the table, there is a column showing a horizontal bar graph of progress towards that goal (if the goal is 50 and the score is 40, the bar is at 80%).
However, when I change the order of the gt rows by using the groupname_col argument, the order of the other cells changes, but not the order of the gtExtras gt_plt_bar_pct column. As a result, it shows the wrong bars for the name and score in that row. That column always seems to follow the order of the rows in the input data.
I understand that I can fix this by using arrange on the df before the gt call, but this doesn't seem like a good solution, since I'm going to want to change the order of the rows to view them by different groups. Is this a flaw in gtExtras? Is there a better fix?
thanks!
reprex:
library(tibble)
library(gt)
library(gtExtras)
library(dplyr)
# make dataframe of individuals and their goals
df <- tribble(
~name, ~group, ~score, ~goal,
"Bob", "C", 20, 40,
"Chris", "A", 50, 40,
"Dale", "B", 30, 50,
"Jay", "A", 0, 40,
"Ben", "B", 10, 20
) %>%
# calculate percent towards goal, and cap at 100%
mutate(percent_to_goal = score/goal *100,
percent_to_goal = case_when(percent_to_goal >= 100 ~ 100,
TRUE ~ percent_to_goal))
df %>%
# this fixes the issue, but doesn't seem like a permanent solution
#arrange(group, name) %>%
# make gt table
gt(rowname_col = "name", groupname_col = "group") %>%
# order groups
row_group_order(groups = c("A","B","C")) %>%
# add bar chart column
gt_plt_bar_pct(column = percent_to_goal) %>%
# highlight blue if person reaches their goal
tab_style(
style = list(
cell_fill(color = "lightcyan"),
cell_text(weight = "bold")),
locations = cells_body(
columns = c(goal,score, percent_to_goal),
rows = score >= goal
)
)
Here is the output from the above code: notice that the lengths of the bars do not always reflect the values of the rows they appear in. Instead, they reflect the row order of the original dataset.
EDIT: remove row_group_order. If I run the above code again, but comment out the line meant to rearrange the appearance of groups, the grouping shows up in a different order (order of appearance of groups in the original dataset), and the name and first two columns sort into these groups accordingly, but the bar chart column still does not, and remains in the original order of the dataset. Image below:
As of gtExtras v0.2.4, this bug has been fixed. Thanks for raising it, and for the great reprex!
library(tibble)
library(gt)
library(gtExtras)
library(dplyr)
# make dataframe of individuals and their goals
df <- tribble(
~name, ~group, ~score, ~goal,
"Bob", "C", 20, 40,
"Chris", "A", 50, 40,
"Dale", "B", 30, 50,
"Jay", "A", 0, 40,
"Ben", "B", 10, 20
) %>%
# calculate percent towards goal, and cap at 100%
mutate(percent_to_goal = score/goal *100,
percent_to_goal = case_when(percent_to_goal >= 100 ~ 100,
TRUE ~ percent_to_goal))
df %>%
# make gt table
gt(rowname_col = "name", groupname_col = "group") %>%
# order groups
row_group_order(groups = c("A","B","C")) %>%
# add bar chart column
gt_plt_bar_pct(column = percent_to_goal) %>%
# highlight blue if person reaches their goal
tab_style(
style = list(
cell_fill(color = "lightcyan"),
cell_text(weight = "bold")),
locations = cells_body(
columns = c(goal,score, percent_to_goal),
rows = score >= goal
)
)
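If upgrading is not immediately possible, a version check combined with the pre-sorting workaround mentioned in the question could look roughly like the sketch below. It uses base R only. The names has_fix, df_sorted and df_for_gt are made up for this illustration, and the version threshold is simply the one quoted in the answer above.

```r
# Same toy data as in the question (base R data frame instead of tribble)
df <- data.frame(
  name  = c("Bob", "Chris", "Dale", "Jay", "Ben"),
  group = c("C", "A", "B", "A", "B"),
  score = c(20, 50, 30, 0, 10),
  goal  = c(40, 40, 50, 40, 20),
  stringsAsFactors = FALSE
)

# TRUE only if gtExtras is installed and is at least version 0.2.4
has_fix <- requireNamespace("gtExtras", quietly = TRUE) &&
  utils::packageVersion("gtExtras") >= "0.2.4"

# Workaround for older versions: put the rows into display order
# (by group, then by name) before the data reaches gt()
df_sorted <- df[order(df$group, df$name), ]

# Hypothetical choice of the data frame that is passed on to gt()
df_for_gt <- if (has_fix) df else df_sorted
```

With the fixed version the original row order can be kept; otherwise the sorted copy keeps the bars aligned with the rows they belong to.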

Save multiple ggplots with different layout matrices

Currently I'm creating multiple plots with regional data and save them to a PDF file. This works without problems, thanks to an SO post I've found (use grid.arrange over multiple pages or marrangeGrob with a layout_matrix).
This is my code so far:
library(ggplot2)
library(gridExtra)
library(dplyr)
data <- data.frame(
region = c("region 1", "region 2", "region 3", rep("region 4", 2), rep("region 5", 2)),
countries = c("country 1", "country 2", "country 3", "country 4", "country 5", "country 6", "country 7"),
dummydata1 = c(rep(1, 7)),
dummydata2 = c(rep(2, 7))
)
criterias <- list()
criterias[[ 'region_1' ]] <- data %>% filter(region == 'region 1')
criterias[[ 'region_2' ]] <- data %>% filter(region == 'region 2')
criterias[[ 'region_3' ]] <- data %>% filter(region == 'region 3')
criterias[[ 'region_4' ]] <- data %>% filter(region == 'region 4')
criterias[[ 'region_5' ]] <- data %>% filter(region == 'region 5')
# This layout matrix should be used for the regional plots
# Don't be surprised by the strange numbering: some plots were added later,
# and it was easier to modify the matrix than all the other functions.
regionLayout <- rbind(
c(1,1,1,1,1,2),
c(NA,NA,3,3,NA,NA), # NA marks an empty cell (NULL entries would be dropped by c())
c(9,9,4,4,10,10),
c(6,6,6,7,7,7),
c(6,6,6,7,7,7),
c(6,6,6,7,7,7),
c(6,6,6,7,7,7),
c(6,6,6,7,7,7),
c(6,6,6,7,7,7)
)
# This is just a dummy function
# The actual function creates several plots based on the real data
createRegionalPlots <- function (data, region) {
examplePlots <- list(ggplot() + ggtitle('Title (ggtext = plot 1)'),
ggplot() + ggtitle('Month (ggtext = plot 2)'),
ggplot() + ggtitle('Plot 1 (tile = 3)'),
ggplot() + ggtitle('Plot 2 (tile = 4)'),
ggplot() + ggtitle('Plot 3 (geom_bar = 5)'),
ggplot() + ggtitle('Plot 4 (geom_bar = 6)'),
ggplot() + ggtitle('Plot 5 (tile = 7)'),
ggplot() + ggtitle('Plot 6 (tile = 8)'))
}
# Found in https://stackoverflow.com/questions/43491685/
preparePage <- function(plots,layoutMatrix) {
# pdf(file = NULL) #invisible
par(mar=(c(5,5,5,5)))
plotsPerPage <- length(unique(na.omit(c(layoutMatrix))))
ml <- lapply(1:ceiling(length(plots)/plotsPerPage), function(page_IND){
ind <- (1 + ((page_IND - 1) * plotsPerPage )) : (page_IND * plotsPerPage)
grid.arrange(grobs = plots[ind], layout_matrix = layoutMatrix)
})
return(marrangeGrob(grobs=ml,nrow=1,ncol=1,top=NULL))
# dev.off() #invisible
}
# Here I'm running through all regions
regionalPlotList <- list()
for (region in names(criterias)) {
regionData <- criterias[[region]]
regionalPlots <- createRegionalPlots(data = regionData, region = region)
regionalPlotList <- do.call(c, list(regionalPlotList, regionalPlots))
}
# This leaves me with a list of 40 plots (5 regions x 8 plots)
allPlots <- preparePage(regionalPlotList, regionLayout)
ggsave("example.pdf",width = 297, height = 210, units = "mm", plot = allPlots)
As said, this works perfectly and leaves me (using the current data) with a five-page report, one page per region, with the required layout.
I have now been asked to add additional per-country plots at the end of the regional report, and these pages should have a different layout (and different plots).
Overestimating myself (and my knowledge of R and ggplot) once again, I thought of this as an easy job (which it probably is for everyone else, but I'm stuck).
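The paging logic in preparePage depends on how the layout matrix is built, so a small base R sketch may help. It assumes that NA is the marker for an empty cell in a gridExtra layout matrix (which is what the na.omit() call in preparePage suggests). The tiny layout and the number of plots are made up for illustration.

```r
# NULL entries vanish inside c(), so this "row" has only two elements ...
length(c(NULL, NULL, 3, 3, NULL, NULL))   # 2
# ... whereas NA keeps all six positions
length(c(NA, NA, 3, 3, NA, NA))           # 6

# Small hypothetical layout: three plots per page, two empty cells
layout <- rbind(c(1, 1, 2),
                c(NA, 3, NA))

# Same formula as in preparePage(): number of distinct, non-empty cell ids
plots_per_page <- length(unique(na.omit(c(layout))))

# Split the indices of seven (hypothetical) plots into pages
n_plots <- 7
pages <- split(seq_len(n_plots),
               ceiling(seq_len(n_plots) / plots_per_page))
```

Here pages has three elements, and the last page holds a single plot. The index arithmetic in preparePage assumes that the number of plots is an exact multiple of plots_per_page, which holds for the 40 plots and 8 cells of the regional layout.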
So, I've created a list of new criterias and a function, including a new layout:
createCountryPlots <- function(data, country) {
exampleCountryPlots <- list(ggplot() + ggtitle('Title (ggtext = plot 1)'),
ggplot() + ggtitle('Month (ggtext = plot 2)'),
ggplot() + ggtitle('Plot 1 (bar = 3)'),
ggplot() + ggtitle('Plot 2 (pie = 4)'),
ggplot() + ggtitle('Plot 3 (geom_bar = 5)'),
ggplot() + ggtitle('Plot 4 (geom_bar = 6)')
)
}
countryLayout = rbind(
c(1, 1, 1, 1, 1, 2),
c(3, 3, 3, 4, 4, 4),
c(3, 3, 3, 4, 4, 4),
c(3, 3, 3, 4, 4, 4),
c(5, 5, 5, 6, 6, 6),
c(5, 5, 5, 6, 6, 6),
c(5, 5, 5, 6, 6, 6)
)
# prepare the data per country
countryCriterias <- list()
# note: the column in data is called "countries"
countryCriterias[[ 'country_1' ]] <- data %>% filter(countries == 'country 1')
countryCriterias[[ 'country_2' ]] <- data %>% filter(countries == 'country 2')
# Running through all selected countries
countryPlotList <- list()
for (country in names(countryCriterias)) {
countryData <- countryCriterias[[country]]
countryPlots <- createCountryPlots(data = countryData, country = country)
countryPlotList <- do.call(c, list(countryPlotList, countryPlots))
}
countryPlots <- preparePage(countryPlotList, countryLayout)
# Just saving the country plots works perfectly again
ggsave("example.pdf",width = 297, height = 210, units = "mm", plot = countryPlots)
Saving these plots in a separate file works without any problems, but I'm currently stuck on how to combine all the plots in one single PDF while respecting the different layouts the pages should have.
I've tried several possibilities (i.e. grid.arrange and arrangeGrob etc.), but I haven't been able to combine the plots into a single file.
Could anyone please enlighten me?
Edit:
Sorry, if I didn't make myself clear enough. This would be the result I should have at the end.
Thanks to the hint by @teunbrand to have a look at the patchwork package, I've found a solution to my problem.
In general it's almost the same as before, but instead of trying to arrange the plots first and then saving them, I "print" them directly to a PDF in the for-loop.
library(patchwork)
# defining the layouts (simplified)
regionLayout <- "
AAAAAB
##CC##
DDEEFF
GGGHHH
GGGHHH"
countryLayout <- "
AAAAAB
CCCCDD
CCCCDD
EEEEFF
EEEEFF
"
# opening pdf
pdf('example5.pdf', pagecentre = FALSE, width = 29.7/2.54, height = 21/2.54)
par(mar = c(5, 5, 5, 5), oma = c(1, 1, 1, 1))
for (region in names(criterias)) {
regionData <- criterias[[region]]
regionalPlots <- createRegionalPlots(data = regionData, region = region)
# as regionalPlots is a list of plots, I'm using wrap_plots, which can take a dynamic
# number of plots
print(wrap_plots(regionalPlots, design = regionLayout))
}
# then the same for the country plots, with a different layout
countryPlotList <- list()
for (country in names(countryCriterias)) {
countryData <- countryCriterias[[country]]
countryPlots <- createCountryPlots(data = countryData, country = country)
print(wrap_plots(countryPlots, design = countryLayout))
}
dev.off()
And at the end I have my PDF with separate layouts...
Thank you all for your help!!!
PS: It took me a while to find out why the PDF was always empty, before I realized that wrap_plots just arranges the plots but does not print them. As said, relatively new to R (did I mention that?)
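Since each letter in a patchwork design string names one plot area and # marks an empty cell, a quick sanity check is to count the areas and compare the result with the number of plots each function returns. The helper count_areas below is hypothetical (it is not part of patchwork) and uses base R only.

```r
regionLayout <- "
AAAAAB
##CC##
DDEEFF
GGGHHH
GGGHHH"

countryLayout <- "
AAAAAB
CCCCDD
CCCCDD
EEEEFF
EEEEFF
"

# Hypothetical helper: number of distinct plot areas in a design string
count_areas <- function(design) {
  # drop whitespace (including line breaks), then split into characters
  chars <- strsplit(gsub("\\s", "", design), "")[[1]]
  # '#' is an empty cell, every other distinct character is one area
  length(setdiff(unique(chars), "#"))
}

count_areas(regionLayout)   # 8
count_areas(countryLayout)  # 6
```

These counts match the eight plots returned by createRegionalPlots and the six returned by createCountryPlots.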

Soil profiles with coloured volume fractions with "aqp" in R

I am trying to plot a soil profile in R using the package aqp: algorithms for quantitative pedology. The profile should represent matrix colour, plus mottling colour and percentage. For that purpose, I am using the function addVolumeFraction, which works well to some extent: it plots points on the profile that correspond to the right mottling percentage for each horizon, but it doesn't assign the corresponding colours. Here is an example:
#Variables for the soil profile
id <- rep(1, 4)
hor <- c("H1", "H2", "H3", "H4")
tops <- c(0,15,35,60)
bottoms <- c(15, 35, 60, 95)
mx_Hex <- c("#695F59FF", "#A59181FF", "#9E9388FF", "#A59181FF")
mot_Hex <- c("#EEB422","#EEB422", "#CD4F39", "#CD4F39")
mot_perc <- c(5, 10, 40, 8)
#Soil profile df
soildf <- data.frame(id,hor,tops,bottoms, mx_Hex, mot_Hex, mot_perc)
soildf$mx_Hex <- as.character(mx_Hex) #the class "SoilProfileCollection" needs colors as characters
soildf$mot_Hex <- as.character(mot_Hex)
# Transform df to "SoilProfileCollection"
depths(soildf) <- id ~ tops + bottoms
#Plot
plot(soildf, name = "hor", color = "mx_Hex", divide.hz = TRUE)
addVolumeFraction(soildf, "mot_perc",pch = 19, cex.min = 0.4, cex.max = 0.5, col = soildf$mot_Hex)
Soil profile plot
As you can see on the plot, the mottles' colours are mixed along the profile. I would like to have mottles of a given colour for their corresponding horizon instead. Can anybody help me to solve this problem?
Thanks!!
This works as expected in the current version of aqp available on CRAN (v1.19 released in January 2020).
I modified your example below to use alternating black and white mottles in each horizon.
library(aqp)
#Variables for the soil profile
id <- rep(1, 4)
hor <- c("H1", "H2", "H3", "H4")
tops <- c(0,15,35,60)
bottoms <- c(15, 35, 60, 95)
mx_Hex <- c("#695F59FF", "#A59181FF", "#9E9388FF", "#A59181FF")
# change mottle colors to something obviously different in each horizon
mot_Hex <- c("#FFFFFF", "#000000", "#FFFFFF","#000000")
mot_perc <- c(5, 10, 40, 8)
#Soil profile df
soildf <- data.frame(id, hor, tops, bottoms, mx_Hex, mot_Hex, mot_perc)
#the class "SoilProfileCollection" needs colors as characters
soildf$mx_Hex <- as.character(mx_Hex)
soildf$mot_Hex <- as.character(mot_Hex)
# Transform df to "SoilProfileCollection"
depths(soildf) <- id ~ tops + bottoms
#Plot
plot(soildf,
name = "hor",
color = "mx_Hex",
divide.hz = TRUE)
addVolumeFraction(
soildf,
"mot_perc",
pch = 19,
cex.min = 0.4,
cex.max = 0.5,
col = soildf$mot_Hex
)
alternating mottles

Use checkboxGroupInput to subset df and create ggplot2 in R Shiny

I have some data as shown in the sample df below. There are data for 3 US states (CA, TX, NY) plus the Total, and for each of these there is an actual act, forecast fct, and predicted pred value.
I want to create a Shiny app in which the user can select which states to observe on the plot. To do this I'm using checkboxGroupInput. I have a reactive function named DF in the server portion of my code that subsets df based on user selected states. I then have a reactive function named gl that creates all the geom_line() statements I need to create a ggplot. The reason I'm doing this is because I want to keep the color the same for each state and use linetype to distinguish actual, forecast, or predicted for them.
Lastly, I try to create the plot, but this is where the problem begins. I get no error message, but no plot displays when I run the app. Is there a way to fix this, or a better way to accomplish what I want? Below is the code I have and a plot of what I would like to show if the user had selected all states and Total.
library(shiny)
library(ggplot2)
library(scales)
library(lubridate)
df <- data.frame(Date=seq.Date(as.Date('2017-01-01'), as.Date('2017-05-01'), by='month'),
CAact=rnorm(5, 10, 2), TXact=rnorm(5, 10, 2), NYact=rnorm(5, 10, 2),Totalact=rnorm(5, 30, 2),
CAfct=rnorm(5, 10, 2), TXfct=rnorm(5, 10, 2), NYfct=rnorm(5, 10, 2), Totalfct=rnorm(5, 30, 2),
CApred=rnorm(5, 10, 2), TXpred=rnorm(5, 10, 2), NYpred=rnorm(5, 10, 2), Totalpred=rnorm(5, 30, 2)
)
ui <- fluidPage(
sidebarLayout(
sidebarPanel(
checkboxGroupInput('states', 'Select Regions',
choices=c('CA','TX','NY','Total'),
selected=c('CA','TX','NY','Total')
)
),
mainPanel(
plotOutput('portfolio')
)
)
)
server <- function(input, output){
#Function to subset df based on user selected states
DF <- reactive({
df <- df[,c('Date', names(df)[grep(paste(input$states, collapse='|'), names(df))])]
return(df)
})
#Generate all the 'geom_line' statements to create ggplot
gl <- reactive({
gl <- character()
lt <- 1
for(i in 2:length(DF())){
col <- substr(names(DF())[i], 1, 2)
if(grepl('Total', names(DF())[i])){
col <- 'Total'
}
if(grepl('fct', names(DF())[i])){
lt <- 5
} else if(grepl('pred', names(DF())[i])){
lt <- 4
}
line <- paste0("geom_line(aes(y=", names(DF())[i], ", color='", col, "'), linetype=", lt, ", size=1.25) + ")
gl <- paste0(gl, line)
}
})
#Create ggplot (not working)
output$portfolio <- renderPlot({
paste0("ggplot(data=DF(), aes(Date)) + ", gl(), "labs(x='', y='Balances ($B)')")
})
}
shinyApp(ui, server)
The error is occurring because of
paste0("ggplot(data=TP, aes(Date)) + ", gl, "labs(x='', y='Balances ($B)')")
You need to refer to gl as gl() since it is a reactive object. After fixing this, there is another error in
DF <- reactive({
df <- df[,c('Date', names(df)[grep(paste(input$states, collapse='|'), names(df))])]
return(tpreg)
})
As there is no tpreg object in that function. Changed it to return(df).
Then nothing displays in the plot. I wanted to help fix the rest but I've never seen anyone paste() together a plot so I'm not sure that works...
UPDATE:
Ok.
library(shiny)
library(ggplot2)
library(scales)
library(lubridate)
library(tidyr)
df <- data.frame(Date=seq.Date(as.Date('2017-01-01'), as.Date('2017-05-01'), by='month'),
CAact=rnorm(5, 10, 2), TXact=rnorm(5, 10, 2), NYact=rnorm(5, 10, 2),Totalact=rnorm(5, 30, 2),
CAfct=rnorm(5, 10, 2), TXfct=rnorm(5, 10, 2), NYfct=rnorm(5, 10, 2), Totalfct=rnorm(5, 30, 2),
CApred=rnorm(5, 10, 2), TXpred=rnorm(5, 10, 2), NYpred=rnorm(5, 10, 2), Totalpred=rnorm(5, 30, 2)
)
df <- gather(df, Variable, Value, -Date)
df$State <- gsub('act|fct|pred', '', df$Variable)
df$Variable <- gsub('CA|NY|TX|Total', '', df$Variable)
df$State <- factor(df$State, levels = c('CA','NY','TX','Total'))
ui <- fluidPage(
sidebarLayout(
sidebarPanel(
checkboxGroupInput('states', 'Select Regions',
choices=c('CA','TX','NY','Total'),
selected=c('CA','TX','NY','Total')
)
),
mainPanel(
plotOutput('portfolio')
)
)
)
server <- function(input, output){
# Fixed colour per state and fixed line type per series (pred, act, fct)
color.groups <- c(CA = 'green', TX = 'blue', NY = 'red', Total = 'black')
line.types <- c(pred = 1, act = 4, fct = 5)
# Subset the long data to the selected states and draw the plot
output$portfolio <- renderPlot({
sub <- subset(df, subset = State %in% input$states)
ggplot(sub, aes(x = Date, y = Value, col = State))+
geom_line(aes(linetype = Variable))+
scale_color_manual(values = color.groups)+
scale_linetype_manual(values = line.types)
})
}
shinyApp(ui, server)
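The key step in the update is splitting the wide column names into a state and a series type, so that the checkbox selection can filter rows directly. The base R sketch below shows only that step. The vector selected stands in for input$states, and the small data frame is made up for illustration.

```r
# A few of the wide column names from the example data
cols <- c("CAact", "TXfct", "NYpred", "Totalact")

# Strip the series suffix to get the state, and the state prefix to get the series
long <- data.frame(
  State    = gsub("act|fct|pred", "", cols),
  Variable = gsub("CA|NY|TX|Total", "", cols),
  stringsAsFactors = FALSE
)

# Stand-in for input$states: keep only the rows of the selected states
selected <- c("CA", "Total")
sub <- long[long$State %in% selected, ]

# A pasted-together "plot" is just a character string, not a ggplot object,
# which is presumably why the original renderPlot() call displayed nothing
pasted <- paste0("ggplot(df, aes(Date)) + ", "geom_line(aes(y = CAact))")
is.character(pasted)   # TRUE
```

Because State and Variable are ordinary columns after reshaping, colour and line type can be mapped with aes() instead of generating one geom_line() call per column.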

rCharts HighCharts dataLabels

I am using the following code to produce a scatterplot using rCharts and HighCharts. I would like each point to have its corresponding Ticker displayed right next to it at all times. I would also like the color of each dot to be determined by "Type", and all points to be circles.
library(rCharts)
x <- as.data.frame(c(1:6))
x$Tickers <- c("DBC", "IWV", "TIP", "TLT", "SPY", "MODEL")
x$Return <- c(0, 15, 4.3, 7.3, 15, 7)
x$StdDev <- c(16, 16, 6, 15, 16, 6)
x$Type <- c('Asset', 'Asset', 'Asset', 'Asset', 'Benchmark', 'Model')
x
b <- hPlot(x="StdDev", y="Return", data = x, group="Type", type = "scatter")
b
Thank you!
As is, the nodes are already colored by "Type". You can force the nodes to be circles with this:
b$plotOptions(scatter=list(marker=list(symbol='circle')))

Resources