Combine scale_x_upset with scale_y_break

I made an upset plot using the ggupset package and added a break to the y axis with scale_y_break from the ggbreak package.
However, when I add scale_y_break, the combination matrix under the bar plot disappears.
Is there a way to combine the combination matrix of the plot made without scale_y_break with the bar plot portion of a plot made with scale_y_break? I can't seem to be able to access the grobs of these plots or use any other workaround. If anyone could help, I would greatly appreciate it!
Example with scale_x_upset and scale_y_break:
df = tidy_movies %>% distinct(title, year, length, .keep_all=TRUE)
ggplot(df, aes(x=Genres)) + geom_bar() + scale_x_upset(n_intersections = 20)+ scale_y_break(breaks = c(750,1000))
I would like to combine the barplot portion of the plot created with:
df = tidy_movies %>% distinct(title, year, length, .keep_all=TRUE)
ggplot(df, aes(x=Genres)) + geom_bar() + scale_x_upset(n_intersections = 20)+ scale_y_break(breaks = c(750,1000))
with the combination matrix portion of the plot made with:
df = tidy_movies %>% distinct(title, year, length, .keep_all=TRUE)
ggplot(df, aes(x=Genres)) + geom_bar() + scale_x_upset(n_intersections = 20)
Thanks!

Related

How to apply a code for multiple columns? [duplicate]

I am new to R and have the following example code that I wish to apply for every column in my data.
data(economics, package="ggplot2")
economics$index <- 1:nrow(economics)
loessMod10 <- loess(uempmed ~ index, data=economics, span=0.10)
smoothed10 <- predict(loessMod10)
plot(economics$uempmed, x=economics$date, type="l", main="Loess Smoothing and Prediction", xlab="Date", ylab="Unemployment (Median)")
lines(smoothed10, x=economics$date, col="red")
Could someone please suggest how this would be possible?

It's possible to perform loess smoothing within ggplot.
library(data.table)
library(ggplot2)
df <- economics
# Reshape to long format: one row per (date, KPI) pair.
gg.melt <- setDT(df) |> melt(id.vars = 'date', variable.name = 'KPI')
ggplot(gg.melt, aes(x=date, y=value))+
  geom_line()+
  # linewidth replaces size for lines as of ggplot2 3.4.0
  stat_smooth(method=loess, color='red', linewidth=0.5, se=FALSE, method.args = list(span=0.1))+
  facet_wrap(~KPI, scales = 'free_y')
As for combining everything on one plot, I don't see how you would do that, since the y-scales are so different. If the point is to see how the peaks line up, etc., you could do this:
ggplot(gg.melt, aes(x=date, y=value))+
  geom_line()+
  # linewidth replaces size for lines as of ggplot2 3.4.0
  stat_smooth(method=loess, color='red', linewidth=0.5, se=FALSE, method.args = list(span=0.1))+
  facet_grid(KPI~., scales = 'free_y')
There is also the dygraphs package which allows creation of dynamic graphics that can be saved to html:
gg.melt[, scaled:=scale(value, center = FALSE, scale=diff(range(value))), by=.(KPI)]
gg.melt[, pred:=predict(loess(scaled~as.integer(date), .SD, span=0.1)), by=.(KPI)]
gg.dt <- dcast(gg.melt, date~KPI, value.var = list('scaled', 'pred'))
library(dygraphs)
dygraph(gg.dt) |>
  dyCrosshair(direction = 'vertical') |>
  dyRangeSelector()
It's possible to create a dygraph(...) version of the second plot, where the different KPI are in different facets, but you have to use RMarkdown for that.
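If you would rather stay with base graphics, as in the question, the loess fit can also be repeated for each column in a loop. This is a minimal sketch and not part of the answer above: the names value_cols and smoothed are made up for illustration, and the 3-by-2 panel layout is only an assumption that happens to suit the five numeric columns of economics.

```r
data(economics, package = "ggplot2")
economics$index <- seq_len(nrow(economics))

# Every column except the date and the helper index gets its own loess fit.
value_cols <- setdiff(names(economics), c("date", "index"))

smoothed <- lapply(value_cols, function(col) {
  # reformulate() builds the formula <col> ~ index from the column name.
  fit <- loess(reformulate("index", response = col), data = economics, span = 0.10)
  predict(fit)
})
names(smoothed) <- value_cols

# One base-graphics panel per column: raw series in black, loess fit in red.
old_par <- par(mfrow = c(3, 2), mar = c(3, 4, 2, 1))
for (col in value_cols) {
  plot(economics$date, economics[[col]], type = "l",
       main = col, xlab = "Date", ylab = col)
  lines(economics$date, smoothed[[col]], col = "red")
}
par(old_par)
```

The fitted values stay available in smoothed (for example smoothed$uempmed) if you need them for anything other than plotting.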

You can reshape your data from wide to long, keeping date as the id column, and use facet_wrap. Maybe you want something like this:
library(ggplot2)
library(reshape2)
library(dplyr)
economics %>%
  melt(., "date") %>%
  ggplot(., aes(date, value)) +
  geom_line() +
  facet_wrap(~variable, scales = "free")
Comment: All plots in one graph
If you mean all plots in one graph, you can give the variables a color like this:
economics %>%
  melt(., "date") %>%
  ggplot(., aes(date, value, color = variable)) +
  geom_line() +
  scale_y_log10()

How to plot a vertical Likert line chart with categories using ggplot2 or highcharter?

I want to create a vertical Likert line chart. Is there any way to plot it with ggplot2 or highcharter?
Example chart: [image not available]
Example data:
value1 <- abs(rnorm(26))*2
data <- data.frame(
x=LETTERS[1:26],
value1=value1,
value2=value1+1+rnorm(26, sd=1)
)

library(tidyverse)
data %>%
  pivot_longer(-x) %>%
  ggplot(aes(x, value, color = name, group = name)) +
  geom_line() +
  geom_point() +
  coord_flip()
P.S. --- Since ggplot2 3.3.0 from March 2020, you can skip the "coord_flip" step and describe the axes directly in the orientation you want them, but the geom_line step still needs a nudge to display correctly:
...
  ggplot(aes(value, x, color = name, group = name)) +
  geom_line(orientation = "y") +
  geom_point()
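Written out in full, the orientation-based variant might look like the sketch below. The fixed seed and the scale_y_discrete(limits = rev) line are additions for illustration (the latter is one way to put category A at the top rather than the bottom). They are not part of the original answer.

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the example data are reproducible
value1 <- abs(rnorm(26)) * 2
data <- data.frame(
  x = LETTERS[1:26],
  value1 = value1,
  value2 = value1 + 1 + rnorm(26, sd = 1)
)

p <- data |>
  pivot_longer(-x) |>
  ggplot(aes(value, x, color = name, group = name)) +
  geom_line(orientation = "y") +        # connect points in order of the y axis
  geom_point() +
  scale_y_discrete(limits = rev)        # addition: reverse so "A" is at the top
print(p)
```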

How to graph "before and after" measures using ggplot with connecting lines and subsets?

I'm totally new to ggplot, relatively fresh with R, and want to make a smashing "before-and-after" scatterplot with connecting lines to illustrate the movement in percentages of different subgroups before and after a special training initiative. I've tried some options, but have yet to:
show each individual observation separately (currently, identical values overlap)
connect the related before and after measures (x=0 and x=1) with lines to more clearly illustrate the direction of variation
subset the data along class and id using shape and colors
How can I best create a scatter plot using ggplot (or other) fulfilling the above demands?
Main alternative: geom_point()
Here is some sample data and example code using geom_point:
x <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1) # 0=before, 1=after
y <- c(45,30,10,40,10,NA,30,80,80,NA,95,NA,90,NA,90,70,10,80,98,95) # percentage of "feelings of peace"
class <- c(0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1) # 0=multiple days 1=one day
id <- c(1,1,2,3,4,4,4,4,5,6,1,1,2,3,4,4,4,4,5,6) # id = per individual
df <- data.frame(x,y,class,id)
ggplot(df, aes(x=x, y=y), fill=id, shape=class) + geom_point()
Alternative: scale_size()
I have explored stat_sum() to summarize the frequencies of overlapping observations, but then I cannot subset using colors and shapes because of the overlap.
ggplot(df, aes(x=x, y=y)) +
  stat_sum()
Alternative: geom_dotplot()
I have also explored geom_dotplot() to clarify the overlapping observations that arise from using geom_point(), as in the example below. However, I have yet to understand how to combine the before and after measures into the same plot.
library(gridExtra) # provides grid.arrange()
df1 <- df[1:10,] # data before
df2 <- df[11:20,] # data after
p1 <- ggplot(df1, aes(x=x, y=y)) +
  geom_dotplot(binaxis = "y", stackdir = "center", stackratio=2,
               binwidth=(1/0.3))
p2 <- ggplot(df2, aes(x=x, y=y)) +
  geom_dotplot(binaxis = "y", stackdir = "center", stackratio=2,
               binwidth=(1/0.3))
grid.arrange(p1, p2, nrow=1)

Alternatively, you could summarize the data by x, id, and class as the mean or median of y, filter out the ids that produce NAs (here ids 3 and 6), and connect the points with lines. If you don't need to show the variability within each id (for example, if the plot only illustrates tendencies), you can do it this way:
library(ggplot2)
library(dplyr)
#library(ggthemes)
df <- df %>%
  group_by(x, id, class) %>%
  summarize(y = median(y, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(
    id = factor(id),
    x = factor(x, labels = c("before", "after")),
    # in the question's data, 0 = multiple days and 1 = one day
    class = factor(class, labels = c("multiple days", "one day"))
  ) %>%
  group_by(id) %>%
  mutate(nas = any(is.na(y))) %>%
  ungroup() %>%
  filter(!nas) %>%
  select(-nas)
ggplot(df, aes(x = x, y = y, col = id, group = id)) +
  geom_point(aes(shape = class)) +
  geom_line(show.legend = FALSE) +
  #theme_few() +
  #theme(legend.position = "none") +
  ylab("Feelings of peace, %") +
  xlab("")

Here's one possible solution for you.
First - to get the color and shapes determined by variables, you need to put these into the aes function. I turned several into factors, so the labs function fixes the labels so they don't appear as "factor(x)" but just "x".
To address multiple points, one solution is to use geom_smooth with method = "lm". This plots the regression line, instead of connecting all the dots.
The option se = FALSE prevents confidence intervals from being plotted - I don't think they add a lot to your plot, but play with it.
Connecting the dots is done by geom_line - feel free to try that as well.
Within geom_point, the option position = position_jitter(width = .1) adds random noise to the x-axis so points do not overlap.
ggplot(df, aes(x=factor(x), y=y, color=factor(id), shape=factor(class), group = id)) +
  geom_point(position = position_jitter(width = .1)) +
  geom_smooth(method = 'lm', se = FALSE) +
  labs(
    x = "x",
    color = "ID",
    shape = 'Class'
  )
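The geom_line variant mentioned above is not shown, so here is one possible sketch of it. It makes an assumption of its own: rather than connecting every raw observation, it connects the per-individual mean at each time point, computed with aggregate() into a helper data frame called means (a name invented for this example). Individuals with values at only one time point (ids 3 and 6 here) get points but no line.

```r
library(ggplot2)

# Sample data copied from the question.
df <- data.frame(
  x = rep(c(0, 1), each = 10),  # 0 = before, 1 = after
  y = c(45, 30, 10, 40, 10, NA, 30, 80, 80, NA,
        95, NA, 90, NA, 90, 70, 10, 80, 98, 95),
  class = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1),
  id = c(1, 1, 2, 3, 4, 4, 4, 4, 5, 6, 1, 1, 2, 3, 4, 4, 4, 4, 5, 6)
)

# Mean per individual and time point; the formula interface drops NA rows,
# so combinations that contain only NA do not appear in the result.
means <- aggregate(y ~ x + id + class, data = df, FUN = mean)

p <- ggplot(df, aes(factor(x), y, color = factor(id), shape = factor(class))) +
  # Jitter horizontally only, with a seed so the layout is repeatable.
  geom_point(position = position_jitter(width = 0.1, height = 0, seed = 1),
             na.rm = TRUE) +
  # Connect each individual's before and after means.
  geom_line(data = means, aes(group = id)) +
  labs(x = "x", color = "ID", shape = "Class")
print(p)
```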

R Highlight point on ecdf line graph

I'm creating a frequency plot using ggplot and the stat_ecdf function. I would like to add the Y-value to the graph for specific X-values, but just can't figure out how. geom_point or geom_text seem like likely options, but as stat_ecdf automatically calculates Y, I don't know how to call that value in the geom_point/text mappings.
Sample code for my initial plot is:
x = as.data.frame(rnorm(100))
ggplot(x, aes(x)) +
stat_ecdf()
Now how would I add specific points here, e.g. the y-value at x = -1?

The easiest way is to create the ecdf function beforehand using ecdf() from the stats package, then plot it using geom_label().
library(ggplot2)
# create a data.frame with column name
x = data.frame(col1 = rnorm(100))
# create ecdf function
e = ecdf(x$col1)
# plot the result
ggplot(x, aes(col1)) +
  stat_ecdf() +
  geom_label(aes(x = -1, y = e(-1)),
             label = e(-1))
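Because geom_label() inherits the plot data, the call above draws the same label once per row, all stacked on one spot. A possible variation, sketched here as an alternative rather than as the answer's own method, is to use annotate(), which draws a single label. The seed, the rounding and the red marker are choices made for this example.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed for reproducibility
x <- data.frame(col1 = rnorm(100))

# ecdf() returns a function; e(-1) is the share of observations <= -1.
e <- ecdf(x$col1)

p <- ggplot(x, aes(col1)) +
  stat_ecdf() +
  annotate("point", x = -1, y = e(-1), color = "red") +
  annotate("label", x = -1, y = e(-1), label = round(e(-1), 2), vjust = -0.5)
print(p)
```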

You can try
library(tidyverse)
# data
set.seed(123)
df = data.frame(x=rnorm(100))
# Plot
Values <- c(-1,0.5,2)
df %>%
  mutate(gr=FALSE) %>%
  bind_rows(data.frame(x=Values,gr=TRUE)) %>%
  mutate(y=ecdf(x)(x)) %>%
  mutate(xmin=min(x)) %>%
  ggplot(aes(x, y)) +
  stat_ecdf() +
  geom_point(data=. %>% filter(gr), aes(x, y)) +
  geom_segment(data=. %>% filter(gr),aes(y=y,x=xmin, xend=x,yend=y), color="red")+
  geom_segment(data=. %>% filter(gr),aes(y=0,x=x, xend=x,yend=y), color="red") +
  ggrepel::geom_label_repel(data=. %>% filter(gr),
                            aes(x, y, label=paste("x=",round(x,2),"\ny=",round(y,2))))
The idea is to add the y values at the beginning, together with the indicator gr specifying which Values you want to show.
Edit:
Because this code adds points to the actual data, which can distort the curve, consider excluding them at least from the ECDF layer: stat_ecdf(data=. %>% filter(!gr))
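One way to avoid mixing the highlighted x-values into the data at all is to fit the ECDF on the observed values only and keep the points of interest in a separate data frame. This is a sketch under that assumption; marks is a made-up name, and plain geom_text is used in place of ggrepel to keep the example dependency-free.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(x = rnorm(100))

# Fit the ECDF on the observed data only, then evaluate it at the chosen x-values.
e <- ecdf(df$x)
marks <- data.frame(x = c(-1, 0.5, 2))
marks$y <- e(marks$x)
marks$xmin <- min(df$x)  # left end of the horizontal guide lines

p <- ggplot(df, aes(x)) +
  stat_ecdf() +
  # Horizontal and vertical guide lines to each highlighted point.
  geom_segment(data = marks, aes(x = xmin, xend = x, y = y, yend = y), color = "red") +
  geom_segment(data = marks, aes(x = x, xend = x, y = 0, yend = y), color = "red") +
  geom_point(data = marks, aes(x, y)) +
  geom_text(data = marks,
            aes(x, y, label = paste0("x=", x, "\ny=", round(y, 2))),
            hjust = -0.2, vjust = 1.2)
print(p)
```

Here the curve and the highlighted y-values both come from the same 100 observations, so the markers sit exactly on the plotted steps.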

Plotting a bar graph in R

Here is a snapshot of data:
restaurant_change_sales = c(3330.443, 3122.534)
restaurant_change_labor = c(696.592, 624.841)
restaurant_change_POS = c(155.48, 139.27)
rest_change = data.frame(restaurant_change_sales, restaurant_change_labor, restaurant_change_POS)
I want two bars for each of the columns indicating the change. One graph for each of the columns.
I tried:
ggplot(aes(x = rest_change$restaurant_change_sales), data = rest_change) + geom_bar()
This is not giving the result the way I want. Please help!!

So ... something like:
library(ggplot2)
library(dplyr)
library(tidyr)
restaurant_change_sales = c(3330.443, 3122.534)
restaurant_change_labor = c(696.592, 624.841)
restaurant_change_POS = c(155.48, 139.27)
rest_change = data.frame(restaurant_change_sales,
                         restaurant_change_labor,
                         restaurant_change_POS)
cbind(rest_change,
      change = c("Before", "After")) %>%
  gather(key,value,-change) %>%
  ggplot(aes(x = change,
             y = value)) +
  geom_bar(stat="identity") +
  facet_grid(~key)
Edit:
To be extra fancy, e.g. to make the x-axis labels run from "Before" to "After", you can add scale_x_discrete(limits = c("Before", "After")) to the end of the ggplot call.
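Another way to get the same ordering is to store change as a factor with explicit levels, so that no scale adjustment is needed. The sketch below also swaps in pivot_longer() and geom_col(), the current tidyr and ggplot2 equivalents of gather() and geom_bar(stat = "identity"). The free y scales are an extra choice for this example, since the three measures differ a lot in magnitude.

```r
library(ggplot2)
library(tidyr)

rest_change <- data.frame(
  restaurant_change_sales = c(3330.443, 3122.534),
  restaurant_change_labor = c(696.592, 624.841),
  restaurant_change_POS   = c(155.48, 139.27)
)

# Assumption carried over from the answer above: row 1 is "Before", row 2 is "After".
# The factor levels fix the order of the bars on the x axis.
rest_change$change <- factor(c("Before", "After"), levels = c("Before", "After"))

long <- pivot_longer(rest_change, -change, names_to = "key", values_to = "value")

p <- ggplot(long, aes(change, value)) +
  geom_col() +
  facet_wrap(~key, scales = "free_y")
print(p)
```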

Your data are not formatted properly to work well with ggplot2, or really any of the plotting packages in R. So we'll fix your data up first, and then use ggplot2 to plot it.
library(tidyr)
library(dplyr)
library(ggplot2)
# We need to differentiate between the values in the rows for them to make sense.
rest_change$category <- c('first val', 'second val')
# Now we use tidyr to reshape the data to the format that ggplot2 expects.
rc2 <- rest_change %>% gather(variable, value, -category)
rc2
# Now we can plot it.
# The category that we added goes along the x-axis, the values go along the y-axis.
# We want a bar chart and the value column contains absolute values, so no summation
# necessary, hence we use 'identity'.
# facet_grid() gives three miniplots within the image for each of the variables.
ggplot(rc2, aes(x=category, y=value)) +
  geom_bar(stat='identity') +
  facet_grid(~variable)

You have to melt your data:
library(reshape2) # or library(data.table)
rest_change$rowN <- 1:nrow(rest_change)
rest_change <- melt(rest_change, id.var = "rowN")
ggplot(rest_change,aes(x = rowN, y = value)) + geom_bar(stat = "identity") + facet_wrap(~ variable)