I am really new to R and trying to learn it on my own, but I have got stuck. I tried to add a shaded area to one facet, but it gets added to every facet. I want to highlight only the period from 2017-08-08 to 2017-08-16, and I don't know how to do it.
The code I have right now looks like this:
ggplot(data = Andmed3, aes(x= Aeg, y = arch, group = Prooviala)) +
stat_summary(geom = "errorbar",fun.data = mean_se, aes(group = Prooviala, color = Prooviala)) +
stat_summary(fun = mean,geom="line",lwd=1,aes (group= Prooviala, color = Prooviala)) +
stat_summary(geom = "point", fun = mean, shape = 16, size = 2, aes(color = Prooviala)) +
facet_grid(~Periood, scales = "free_x", space = "free_x") +
labs(x = "Kuupäev", y = "Arhede 16S rRNA geenikoopiate arv")+
theme_bw() +
theme(axis.text.x = element_text(angle = 30, hjust = 1)) +
scale_x_datetime(date_breaks = "1 week", date_labels = " %e %b %y") +
geom_rect(aes(xmin = as.POSIXct('2017-08-08', format = '%Y-%m-%d'),xmax = as.POSIXct('2017-08-16', format = '%Y-%m-%d'),ymin = -Inf,ymax = Inf), alpha = 0.2, fill = 'grey')
I need my graph to look like my second picture but just with the highlighted background in the first facet.
Can someone help me with the code?
Thank You!
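One common way to shade a single facet is to give geom_rect its own small data frame that contains the facetting variable, so the rectangle is only drawn in the panel whose value matches. The sketch below uses made-up stand-in data (the toy and shade data frames and the period labels "I" and "II" are assumptions, not the asker's Andmed3), so treat it as an illustration rather than a drop-in fix.

```r
library(ggplot2)

# Hypothetical stand-in for Andmed3: two periods (facets), three dates each.
toy <- data.frame(
  Aeg = as.POSIXct(c("2017-08-01", "2017-08-10", "2017-08-20",
                     "2017-10-01", "2017-10-10", "2017-10-20"), tz = "UTC"),
  arch = c(1, 3, 2, 4, 6, 5),
  Periood = rep(c("I", "II"), each = 3)
)

# One row per rectangle. The Periood column decides which facet receives it.
shade <- data.frame(
  xmin = as.POSIXct("2017-08-08", tz = "UTC"),
  xmax = as.POSIXct("2017-08-16", tz = "UTC"),
  Periood = "I"
)

p <- ggplot(toy, aes(x = Aeg, y = arch)) +
  # Drawn first so the shading sits behind the line; inherit.aes = FALSE
  # stops the layer from looking for Aeg and arch in the shade data frame.
  geom_rect(data = shade,
            aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf),
            inherit.aes = FALSE, alpha = 0.2, fill = "grey") +
  geom_line() +
  facet_grid(~Periood, scales = "free_x", space = "free_x")

p
```

A layer without its own data inherits the full data set, which covers every facet, so it is drawn in every panel and once per row. Supplying a one-row data frame avoids both effects.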
I conducted some interviews and wanted to create box plots with ggplot based on them. I managed to create the box plots, but I cannot get the outliers included in the boxes and whiskers. I have only a few observations, so I want the outliers to be part of the box plot rather than drawn as separate points.
This is the code that I have so far:
data_insurances_boxplot_merged <- ggplot(data_insurances_merged, aes(x = value, y = func, fill = group)) +
stat_boxplot(geom = "errorbar", width = 0.3, position = position_dodge(width = 0.75)) +
geom_boxplot() +
stat_summary(fun = mean, geom = "point", shape = 20, size = 3, color = "red",
position = position_dodge2(width = 0.75,
preserve = "single")) +
scale_x_continuous(breaks = seq(1, 7, 1), limits = c(1, 7)) +
scale_fill_manual(values = c("#E6645E", "#EF9C9D")) +
labs(x = "",
y = "", title = "") +
theme_light(base_size = 12) +
theme(legend.title = element_blank())
data_insurances_boxplot_merged
And this is the box plot that is generated:
Does anyone know how to achieve this?
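One possible approach is the coef argument of the boxplot stat, which sets the whisker length as a multiple of the IQR (default 1.5). With a multiplier large enough to cover the data, no point is classed as an outlier and the whiskers run to the minimum and maximum. The sketch below uses invented toy data, and the value 100 is an arbitrary assumption.

```r
library(ggplot2)

# Hypothetical toy data: one group with a single extreme value (20).
toy <- data.frame(group = "a",
                  value = c(1, 2, 2, 3, 3, 3, 4, 4, 5, 20))

# Default: whiskers stop at 1.5 * IQR, so 20 is drawn as a separate point.
p_default <- ggplot(toy, aes(x = group, y = value)) +
  geom_boxplot()

# Large coef: whiskers extend to the most extreme observations instead.
p_full <- ggplot(toy, aes(x = group, y = value)) +
  geom_boxplot(coef = 100)

p_full
```

If a separate stat_boxplot(geom = "errorbar") layer is used for the whisker caps, it would need the same coef value so that the caps line up. Note that when a group's IQR is zero, any point outside the box is still flagged whatever finite coef is used.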
I am trying to make a bar graph with percentage labels, but I cannot round the percentages.
In this variant the percentages are shown, but I am not able to round them:
ggplot(data=true_dizzy, aes(x = reorder(factor(ICD_subgroup_new), ICD_subgroup_new, length),
y = prop.table(stat(count)),
label = scales::percent(prop.table(stat(count))))) +
geom_bar() +
coord_flip() +
geom_text(stat = "count", position = position_dodge(.9), hjust = 0, size = 3) +
scale_y_continuous(label = scales::percent) +
labs(y = c(""), x = c(""), title = c("Subgroups of Dizzy Patients (%)")) +
theme_bw()
In this variant the numbers are rounded, but there are no percent signs after them:
ggplot(data=true_dizzy, aes(x = reorder(factor(ICD_subgroup_new), ICD_subgroup_new, length),
y = prop.table(stat(count)),
label = scales::percent(prop.table(stat(count))))) +
geom_bar() +
coord_flip() +
geom_text(stat = "count", aes(label = round(x = (..count../sum(..count..))*100, digits = 1), hjust = 0),
position = position_dodge(.9), hjust = 0, size = 3) +
scale_y_continuous(label = scales::percent) +
labs(y = c(""), x = c(""), title = c("Subgroups of Dizzy Patients (%)")) +
theme_bw()
I know the problem is in the geom_text line, but I can't seem to figure it out.
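Rounding and the percent sign can both come from scales::percent(), whose accuracy argument sets the rounding step (0.1 gives one decimal place). The counts below are invented for illustration and are not taken from true_dizzy.

```r
library(scales)

# Hypothetical category counts standing in for the bar heights.
counts <- c(12, 30, 7)

# Proportions of the total, as prop.table() would give inside aes().
props <- prop.table(counts)

# accuracy = 0.1 rounds to one decimal place and keeps the percent sign.
labels <- percent(props, accuracy = 0.1)
labels
```

Inside the plot, the same call could be used for the label, for example aes(label = scales::percent(after_stat(count / sum(count)), accuracy = 0.1)) in geom_text(stat = "count"), assuming a ggplot2 version that provides after_stat().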
Here's some sample data for a company's Net revenue split by two cohorts:
data <- data.frame(dates = rep(seq(as.Date("2000/1/1"), by = "month", length.out = 48), each = 2),
revenue = rep(seq(10000, by = 1000, length.out = 48), each = 2) * rnorm(96, mean = 1, sd = 0.1),
cohort = c("Group 1", "Group 2"))
I can show one year's worth of data and it returns what I would expect:
start = "2000-01-01"
end = "2000-12-01"
ggplot(data, aes(fill = cohort, x = dates, y = revenue)) +
geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
xlab("Month") +
ylab("Net Revenue") +
geom_text(aes(label = round(revenue, 0)), vjust = -0.5, size = 3, position = position_dodge(width = 25)) +
scale_x_date(date_breaks = "1 month", limits = as.Date(c(start, end))) +
ggtitle("Monthly Revenue by Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title = element_text(hjust = 0.5)) +
scale_fill_manual(values=c("#00BFC4", "#F8766D"))
But if I expand the date range to two years or more and rerun the graph, it shows additional months on both sides of the x-axis despite not displaying any information on the y-axis.
start = "2000-01-01"
end = "2001-12-01"
#rerun the ggplot code from above
Note the nonexistent data points for 1999-12-01 and 2002-01-01. Why do these appear and how can I remove them?
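Many (all?) of the scale_* functions take expand= as an argument. R plots commonly pad the axes a little so that lines and points are not scrunched up against the "box" boundary: base graphics add 4% at each end by default, and ggplot2 continuous scales (dates included) add 5%. With a two-year range, that padding is wide enough to reach the next monthly break on each side, which is why 1999-12-01 and 2002-01-01 show up.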
If you include expand=c(0,0), you get what you want.
(BTW: you have mismatched parens. Fixed here.)
ggplot(data, aes(fill = cohort, x = dates, y = revenue)) +
geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
xlab("Month") +
ylab("Net Revenue") +
geom_text(aes(label = round(revenue, 0)), vjust = -0.5, size = 3, position = position_dodge(width = 25)) +
scale_x_date(date_breaks = "1 month", limits = as.Date(c(start, end)), expand = c(0, 0)) +
ggtitle("Monthly Revenue by Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title = element_text(hjust = 0.5)) +
scale_fill_manual(values=c("#00BFC4", "#F8766D"))
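The arithmetic behind the extra months can be checked in base R. This is a simplified model that assumes the default 5% multiplicative padding of continuous scales and monthly breaks on the first of each month. The helper names padded_limits and shows_extra_months are made up for this sketch, and ggplot2's actual break placement may differ in detail.

```r
# Axis limits after padding each side by `mult` times the date range.
padded_limits <- function(start, end, mult = 0.05) {
  start <- as.Date(start)
  end <- as.Date(end)
  pad <- as.numeric(end - start) * mult
  c(start - pad, end + pad)
}

# Does the padded axis reach the monthly break just outside each limit?
shows_extra_months <- function(start, end, mult = 0.05) {
  lim <- padded_limits(start, end, mult)
  before <- seq(as.Date(start), by = "-1 month", length.out = 2)[2]
  after <- seq(as.Date(end), by = "1 month", length.out = 2)[2]
  c(before = before >= lim[1], after = after <= lim[2])
}

# One year: 335 days, about 17 days of padding, no neighbouring month reached.
shows_extra_months("2000-01-01", "2000-12-01")

# Two years: 700 days, 35 days of padding, both neighbouring months reached.
shows_extra_months("2000-01-01", "2001-12-01")

# With the padding removed (as expand = c(0, 0) does), neither is reached.
shows_extra_months("2000-01-01", "2001-12-01", mult = 0)
```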
I am not sure what exactly the issue is, but if you change the x-axis from the "Date" class to any other class, it seems to work as expected. It also helps to filter the data to the specific range before passing it to ggplot.
For example, in this case you can change dates to a month-year factor:
library(dplyr)
library(ggplot2)
start = as.Date("2000-01-01")
end = as.Date("2001-12-01")
all_fac <- c(outer(month.abb, 2000:2001, paste, sep = "-"))
data %>%
filter(between(dates, start, end)) %>%
mutate(dates = factor(format(dates, "%b-%Y"),levels = all_fac)) %>%
ggplot() + aes(fill = cohort, x = dates, y = revenue) +
geom_bar(stat = "identity", position = "dodge") +
xlab("Month") +
ylab("Net Revenue") +
geom_text(aes(label = round(revenue, 0))) +
ggtitle("Monthly Revenue by Group") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title =
element_text(hjust = 0.5)) +
scale_fill_manual(values=c("#00BFC4", "#F8766D"))
Please beautify/change the labels on the bars.
I am trying to make a graph with two y axes. I know there are a lot of similar questions out there, but I just can't seem to figure it out based on other posts.
The issue I am having is with the y-axis scale. Here is what I am doing:
Time <- c("June-2018-30", "July-2018-31", "August-2018-31", "September-2018-30",
"October-2018-31", "November-2018-30", "December-2018-31", "January-2019-31",
"February-2019-28", "March-2019-31", "April-2019-30", "May-2019-31")
Bitcoin <- c(3.469861e-17, 3.188903e-17, 2.685114e-17, 2.42335e-17, 2.322641e-17,
2.447058e-17, 3.18029e-17, 2.944836e-17, 2.839419e-17, 2.76008e-17,
2.661607e-17, 2.536966e-17)
`USD Return` <- c(2.35e-13, 2.27e-13, 1.80e-13, 1.60e-13, 1.51e-13, 1.33e-13, 1.18e-13,
1.08e-13, 1.047e-13, 1.09e-13, 1.37e-13, 1.83e-13)
total.values3 <- data.frame(Time, Bitcoin,`USD Return`, stringsAsFactors = F)
library(ggplot2)
ggplot(data=total.values3, aes(x=Time, y=`USD Return`, group=1)) +
geom_line(aes(y = `USD Return`), color = "blue") +
geom_line(aes(y = Bitcoin), color = "red") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_y_continuous("USD Return",
sec.axis = sec_axis(~./10000, name = "Bitcoin Return")) +
scale_x_date(labels=date_format("%B-%Y-%d"),
date_labels = "%B-%Y", breaks = total.values3$Time)
Here is a picture of the output.
I am not sure what is going wrong. I can see that the scale is wrong, but I can't figure out why the Bitcoin line is just a straight line. I also don't know why the y axis on the right side goes negative.
total.values3$Time <- as.Date(total.values3$Time, format = "%B-%Y-%d")
ggplot(data=total.values3, aes(x=Time, group=1)) +
geom_line(aes(y = `USD Return`), color = "blue") +
geom_line(aes(y = Bitcoin*10000), color = "red") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_y_continuous("USD Return",
sec.axis = sec_axis(~./10000, name = "Bitcoin Return")) +
scale_x_date(date_labels = "%B-%Y", breaks = total.values3$Time)
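This should do the trick. Time is converted to a Date so that scale_x_date can be used, and Bitcoin is multiplied by 10000 so that it lands in the same range as USD Return on the primary axis. The sec_axis(~./10000) transformation only relabels the right-hand axis; it does not rescale the data for you.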
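Rather than guessing the multiplier, it could be derived from the data. The sketch below rounds the ratio of the two maxima to the nearest power of ten, which is just one possible convention, and the name k is made up for illustration.

```r
# Values copied from the question.
usd <- c(2.35e-13, 2.27e-13, 1.80e-13, 1.60e-13, 1.51e-13, 1.33e-13,
         1.18e-13, 1.08e-13, 1.047e-13, 1.09e-13, 1.37e-13, 1.83e-13)
btc <- c(3.469861e-17, 3.188903e-17, 2.685114e-17, 2.42335e-17,
         2.322641e-17, 2.447058e-17, 3.18029e-17, 2.944836e-17,
         2.839419e-17, 2.76008e-17, 2.661607e-17, 2.536966e-17)

# Ratio of the maxima, rounded to the nearest power of ten.
k <- 10^round(log10(max(usd) / max(btc)))
k

# Plot btc * k on the primary axis; the secondary axis divides by k again.
btc_scaled <- btc * k
range(btc_scaled)
range(usd)
```

The same k would then appear in both places: aes(y = Bitcoin * k) for the line and sec_axis(~ . / k) for the right-hand labels.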
I am trying to overlay a bar chart with a line graph on a single plot with ggplot in R. My line graph works fine but the data are much larger than the data for the bar chart component.
How could I use an additional scale for this bar chart or do something that will get this to look nice all in one graph.
Here is my plot code thus far:
chart <- data.frame("QuantileName" = 1:5, "AvgLoss" = c(100, 500, 1000, 2500, 3000), "AvgFactor" = c(1.0, 1.1, 1.3, 1.4, 1.5))
Plot <- ggplot(chart, aes(x = 1:5)) +
scale_x_continuous(name = "Quintile", limits = c(0, 5 + .5), breaks = seq(1, 5)) +
geom_line(aes(y = AvgLoss, colour = "AvgLoss")) +
geom_bar(aes(y = AvgFactor, colour = "AvgFactor" ), stat = "identity") +
geom_text(aes(y = AvgLoss, label = round(AvgLoss)), position = position_nudge(x = .3)) +
geom_point(aes(y = AvgLoss)) +
ylab("AvgLoss") +
scale_colour_manual("",breaks = c("AvgLoss","AvgFactor"), values = c("AvgLoss" = "red", "AvgFactor" = "grey")) +
ggtitle("Quintile Plot") +
theme(plot.title = element_text(hjust=0.5))
Plot
Thank you for any help!
Essentially, multiply your AvgFactor variable by a number
+ geom_bar(aes(y = AvgFactor*1000, colour = "AvgFactor" ), stat = "identity")
and set
+ scale_y_continuous(sec.axis = sec_axis(~ ./1000, name = "AvgFactor"))
so your plot code would look like
Plot <- ggplot(chart, aes(x = 1:5)) +
scale_x_continuous(name = "Quintile", limits = c(0, 5 + .5),
breaks = seq(1, 5)) +
geom_bar(aes(y = AvgFactor*1000, colour = "AvgFactor" ),
stat = "identity") +
geom_line(aes(y = AvgLoss, colour = "AvgLoss")) +
geom_text(aes(y = AvgLoss,
label = round(AvgLoss)),
position = position_nudge(x = .3)) +
geom_point(aes(y = AvgLoss)) +
ylab("AvgLoss") +
scale_colour_manual("",breaks = c("AvgLoss","AvgFactor"),
values = c("AvgLoss" = "red", "AvgFactor" = "grey")) +
ggtitle("Quintile Plot") +
theme(plot.title = element_text(hjust=0.5)) +
scale_y_continuous(sec.axis = sec_axis(~ ./1000, name = "AvgFactor"))
However, I think it is probably more elegant to avoid secondary axes whenever possible.
It may be useful to know that geom_col(...) is shorthand for geom_bar(..., stat = 'identity')