I have a plot of the depth of individual fish over time. The background represents the temperature, the grey dots are the raw depth data, and the black line is the geom_smooth fit to the raw data (an image of the plot is attached here). I used ggplot to make the graphs, but my x-axis (date/time) is shifted slightly to the right. I need the axis to be centered (standardized). This is my very long code for the plot:
tibble(y=c(-7:0)) %>%
expand_grid(TBRtemperature %>% select(`Date and Time (UTC)`, Temperature)) %>%
rename(dt="Date and Time (UTC)") %>%
filter(yday(dt)>136&yday(dt)<147) %>%
mutate(dt=with_tz(dt, "Europe/Oslo")) %>%
ggplot(aes(dt, y, fill=Temperature)) +
geom_tile() +
scale_fill_gradientn(colours = c("lightblue", "white", "red")) +
scale_x_datetime(expand = c(0,0)) +
scale_y_continuous(expand = c(0,0)) +
geom_point(data=fbd_TBR %>% filter(yday(dt)>136&yday(dt)<147, n()>100), aes(dt, -Data/10, group=paste0(ID, Trial)), colour="grey50", alpha=0.2) +
geom_smooth(data=fbd_TBR %>% filter(yday(dt)>136&yday(dt)<147, n()>100), aes(dt, -Data/10, group=paste0(ID, Trial), colour=paste0(ID, sep=" - ", Weight)), colour="black") +
labs(y="Depth (m)", x=("Time (days)"), title = "Trial 2") +
facet_wrap(~paste(ID, sep = " - ", Weight)) +
theme_classic() +
theme(plot.title = element_text(face="bold"), strip.text = element_text(face = "bold"))
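Does anyone know how the axis can be adjusted?
I think specifying the range you want by adding coord_cartesian() to the ggplot call should solve it. The numbers below are placeholders; because your x-axis is a date-time, xlim needs POSIXct values rather than plain numbers:
+ coord_cartesian(xlim = c(1, 10), ylim = c(10, 40))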
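As a rough sketch of what that could look like on a date-time axis: the data frame `depth`, its columns, and the year 2021 are hypothetical stand-ins (the question does not state the year), so the limits would need to be adapted to the real data.

```r
library(ggplot2)

# Hypothetical stand-in for the depth data: one reading per hour over ten days
set.seed(42)
depth <- data.frame(
  dt    = seq(as.POSIXct("2021-05-17 00:00", tz = "Europe/Oslo"),
              by = "1 hour", length.out = 240),
  depth = -runif(240, 0, 7)
)

# Limits on a date-time axis are POSIXct values, in the same time zone as the data
x_limits <- as.POSIXct(c("2021-05-17 00:00", "2021-05-27 00:00"),
                       tz = "Europe/Oslo")

p <- ggplot(depth, aes(dt, depth)) +
  geom_point(colour = "grey50", alpha = 0.2) +
  scale_x_datetime(expand = c(0, 0)) +
  # coord_cartesian() zooms the view without dropping any observations
  coord_cartesian(xlim = x_limits, ylim = c(-7, 0))
```

Unlike setting limits on the scale, coord_cartesian() does not remove data outside the range, so a smoother fitted in the same plot still sees every point.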
I'll use the diamonds data set from ggplot2 to illustrate my point. I want to draw a histogram of price, but I want to show the count in each bin for each cut.
This is my code:
ggplot(aes(x = price ) , data = diamonds_df) +
geom_histogram(aes(fill = cut , binwidth = 1500)) +
stat_bin(binwidth= 1500, geom="text", aes(label=..count..) ,
vjust = -1) +
scale_x_continuous(breaks = seq(0 , max(stores_1_5$Weekly_Sales) , 1500 )
, labels = comma)
Here is my current plot:
As you can see, the number shows the count for all cuts combined in each bin; I want to display the count for each cut in each bin.
A bonus point if I can also configure the y-axis manually, instead of it displaying numbers in steps of 5000.
Update for ggplot2 2.2.0 and later
You can now center labels within stacked bars without pre-summarizing the data using position=position_stack(vjust=0.5). For example:
ggplot(aes(x = price ) , data = diamonds) +
geom_histogram(aes(fill=cut), binwidth=1500, colour="grey20", lwd=0.2) +
stat_bin(binwidth=1500, geom="text", colour="white", size=3.5,
aes(label=..count.., group=cut), position=position_stack(vjust=0.5)) +
scale_x_continuous(breaks=seq(0,max(diamonds$price), 1500))
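On recent ggplot2 releases the `..count..` notation and `lwd` still work but are flagged as deprecated. A sketch of the same plot in the newer spelling, assuming ggplot2 3.4.0 or later (where `after_stat()` and `linewidth` are both available):

```r
library(ggplot2)

p <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(aes(fill = cut), binwidth = 1500,
                 colour = "grey20", linewidth = 0.2) +
  # One label per cut and bin, centred inside its stacked segment
  stat_bin(binwidth = 1500, geom = "text", colour = "white", size = 3.5,
           aes(label = after_stat(count), group = cut),
           position = position_stack(vjust = 0.5)) +
  scale_x_continuous(breaks = seq(0, max(diamonds$price), 1500))
```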
Original Answer
You can get the counts for each value of cut by adding cut as a group aesthetic to stat_bin. I also moved binwidth outside of aes, which was causing binwidth to be ignored in your original code:
ggplot(aes(x = price ), data = diamonds) +
geom_histogram(aes(fill = cut ), binwidth=1500, colour="grey20", lwd=0.2) +
stat_bin(binwidth=1500, geom="text", colour="white", size=3.5,
aes(label=..count.., group=cut, y=0.8*(..count..))) +
scale_x_continuous(breaks=seq(0,max(diamonds$price), 1500))
One issue with the code above is that I'd like the labels to be vertically centered within each bar section, but I'm not sure how to do that within stat_bin, or if it's even possible. Multiplying by 0.8 (or whatever) moves each label by a different relative amount. So, to get the labels centered, I created a separate data frame for the labels in the code below:
# Create text labels
dat = diamonds %>%
group_by(cut,
price=cut(price, seq(0,max(diamonds$price)+1500,1500),
labels=seq(0,max(diamonds$price),1500), right=FALSE)) %>%
summarise(count=n()) %>%
group_by(price) %>%
mutate(ypos = cumsum(count) - 0.5*count) %>%
ungroup() %>%
mutate(price = as.numeric(as.character(price)) + 750)
ggplot(aes(x = price ) , data = diamonds) +
geom_histogram(aes(fill = cut ), binwidth=1500, colour="grey20", lwd=0.2) +
geom_text(data=dat, aes(label=count, y=ypos), colour="white", size=3.5)
To configure the breaks on the y axis, just add scale_y_continuous(breaks=seq(0,20000,2000)) or whatever breaks you'd like.
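For instance, a minimal sketch with breaks every 2000 counts (the name `y_breaks` is only illustrative):

```r
library(ggplot2)

# Any numeric vector works here; seq() gives evenly spaced breaks
y_breaks <- seq(0, 20000, 2000)

p <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(aes(fill = cut), binwidth = 1500) +
  scale_y_continuous(breaks = y_breaks)
```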
With ggplot2 2.2.0, the position_stack() options make this easier:
library(ggplot2)
s <- ggplot(mpg, aes(manufacturer, fill = class))
s + geom_bar(position = "stack") +
theme(axis.text.x = element_text(angle=90, vjust=1)) +
geom_text(stat='count', aes(label=..count..), position = position_stack(vjust = 0.5),size=4)
I created a ggplot graph using geom_segment for certain subcategories and their cost.
df <- data.frame(category = c("A","A","A","A","A","A","B","B","B","B","B","B","B"),
subcat = c("S1","S2","S3","S4","S5","S6","S7","S8","S9","S10","S11","S12","S13"),
value = c(100,200,300,400,500,600,700,800,900,1000,1100,1200,1300))
df2 <- df %>%
arrange(desc(value)) %>%
mutate(subcat=factor(subcat, levels = subcat)) %>%
ggplot(aes(x=subcat, y=value)) +
geom_segment(aes(xend=subcat, yend=0)) +
geom_point(size=4, color="steelblue") +
geom_text(data=df, aes(x=subcat, y=value, label = dollar(value, accuracy = 1)), position = position_nudge(x = -0.3), hjust = "inward") +
theme_classic() +
coord_flip() +
scale_y_continuous(labels = scales::dollar_format()) +
ylab("Cost Value") +
xlab("subcategory")
df2
This code results in a graph that is shown below:
My main issue is I want the category variable on the left of the subcategory variables. It should look like this:
How do I add the category variables in the y-axis, such that it looks nested?
As mentioned in my comment, and adapting this post by @AllanCameron to your case, one option to achieve your desired result is the "facet trick", which uses faceting to get the nesting and some styling to remove the facet look:
Facet by category and free the scales and the space so that the distance between categories is the same.
Remove the spacing between panels and place the strip text outside of the axis text.
Additionally, set the expansion of the discrete x scale to .5 to ensure that the distance between categories is the same at the facet boundaries as inside the facets.
library(dplyr)
library(ggplot2)
library(scales)
df1 <- df %>%
arrange(desc(value)) %>%
mutate(subcat=factor(subcat, levels = subcat))
ggplot(df1, aes(x=subcat, y=value)) +
geom_segment(aes(xend=subcat, yend=0)) +
geom_point(size=4, color="steelblue") +
geom_text(data=df, aes(x=subcat, y=value, label = dollar(value, accuracy = 1)), position = position_nudge(x = -0.3), hjust = "inward") +
theme_classic() +
coord_flip() +
scale_y_continuous(labels = scales::dollar_format()) +
scale_x_discrete(expand = c(0, .5)) +
facet_grid(category~., scales = "free_y", switch = "y", space = "free_y") +
ylab("Cost Value") +
xlab("subcategory") +
theme(panel.spacing.y = unit(0, "pt"), strip.placement = "outside")
I have a Lorenz Curve graph that I filled by factor variables (male and female). This was done simply enough and overlapping was not an issue because there were only two factors.
Wage %>%
ggplot(aes(x = salary, fill = gender)) +
stat_lorenz(geom = "polygon", alpha = 0.65) +
geom_abline(linetype = "dashed") +
coord_fixed() +
scale_fill_hue() +
theme(legend.title = element_blank()) +
labs(x = "Cumulative Percentage of Observations",
y = "Cumulative Percentage of Wages",
title = "Lorenz curve by sex")
This provides the following graph:
However, when I have more than two factors (in this case four), the overlapping becomes a serious problem even if I use contrasting colors. Changing alpha does not do much at this stage. Have a look:
Wage %>%
ggplot(aes(x = salary, fill = Diploma)) +
stat_lorenz(geom = "polygon", alpha = 0.8) +
geom_abline(linetype = "dashed") +
coord_fixed() +
scale_fill_manual(values = c("green", "blue", "black", "white")) +
theme(legend.title = element_blank()) +
labs(x = "Cumulative Percentage of Observations",
y = "Cumulative Percentage of Wages",
title = "Lorenz curve by diploma")
At this point I've tried all sorts of color palettes: hues, brewers, manual scales, etc. I've also tried reordering the factors but, as you can imagine, this did not work either.
What I need is probably a single argument or function to stack all these areas on top of each other so they all have their distinct colors. Funny enough, I've failed to find what I'm looking for and decided to ask for help.
Thanks a lot.
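The problem was solved by a dear friend: instead of defining the Lorenz curve once for the whole data set, add one stat_lorenz layer per category. Layers are drawn in the order they are added, so the order of the calls decides which polygon ends up on top.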
library(ggplot2)
library(gglorenz)     # stat_lorenz()
library(wesanderson)  # wes_palette()
# Each layer receives one Diploma level through data=, so aes() can use bare column names
ggplot() + scale_fill_manual(values = wes_palette("GrandBudapest2", n = 4)) +
stat_lorenz(data = subset(Wage, Diploma == levels(Diploma)[3]), aes(x = salary, fill = Diploma), geom = "polygon") +
stat_lorenz(data = subset(Wage, Diploma == levels(Diploma)[4]), aes(x = salary, fill = Diploma), geom = "polygon") +
stat_lorenz(data = subset(Wage, Diploma == levels(Diploma)[2]), aes(x = salary, fill = Diploma), geom = "polygon") +
stat_lorenz(data = subset(Wage, Diploma == levels(Diploma)[1]), aes(x = salary, fill = Diploma), geom = "polygon") +
geom_abline(linetype = "dashed") +
coord_fixed() +
theme(legend.title = element_blank()) +
labs(x = "Cumulative Percentage of Observations",
y = "Cumulative Percentage of Wages",
title = "Lorenz curve by diploma")
Which yields:
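The layer order above was chosen by hand. A possible way to derive it automatically is to draw the largest polygon first, so that smaller ones land on top. The sketch below is an illustration of that idea rather than the method used in the answer: it computes the Lorenz curves in base R instead of using stat_lorenz, and the data frame `wages` and the helpers `lorenz_points` and `lorenz_area` are hypothetical.

```r
library(ggplot2)

# Hypothetical data: three groups with increasingly unequal salaries
set.seed(1)
wages <- data.frame(
  group  = rep(c("a", "b", "c"), each = 200),
  salary = c(rlnorm(200, sdlog = 0.3),
             rlnorm(200, sdlog = 0.8),
             rlnorm(200, sdlog = 1.5))
)

# Lorenz curve of one vector: cumulative share of observations (p)
# against cumulative share of the total (share)
lorenz_points <- function(x) {
  x <- sort(x)
  data.frame(p     = c(0, seq_along(x) / length(x)),
             share = c(0, cumsum(x) / sum(x)))
}

# Area between the diagonal and the curve, using the trapezoid rule
lorenz_area <- function(d) {
  0.5 - sum(diff(d$p) * (head(d$share, -1) + tail(d$share, -1)) / 2)
}

curves     <- lapply(split(wages$salary, wages$group), lorenz_points)
areas      <- vapply(curves, lorenz_area, numeric(1))
draw_order <- names(sort(areas, decreasing = TRUE))

# One polygon layer per group, largest first, so smaller ones are drawn on top
plt <- ggplot()
for (g in draw_order) {
  plt <- plt + geom_polygon(data = transform(curves[[g]], group = g),
                            aes(p, share, fill = group))
}
plt <- plt + geom_abline(linetype = "dashed") + coord_fixed()
```

Each polygon runs from (0, 0) along the curve to (1, 1) and is closed by the diagonal, so the filled region is the gap between the curve and the line of equality.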
I have this data
TX_growth<-data.frame(year=c(2017,2016, 2015),statewide=c(61, 62,57),black=c(58,58,53),hispanic=c(59,60,55),white=c(65,64,61))
So far I have this chart, produced with the following code:
My chart until now
ggplot() + geom_line(data = TX_growth, aes(x=year, y= statewide), color = "blue", size=1) +
geom_line(data = TX_growth, aes(x=year, y= white), color = "red", size=1) +
geom_line(data = TX_growth, aes(x=year, y= black), color = "green", size=1) +
geom_line(data = TX_growth, aes(x=year, y= hispanic), color = "orange", size=1) +
labs(title = "Figure 1: Statewide Percent who Met or Exceeded Progress",
subtitle = "Greater percentage means that student subgroup progressed at higher percentage than previous year.",
x = "Year", y = "Percentage progress")+ theme_bw() +
scale_x_continuous(breaks=c(2017,2016,2015))
I want to add (a) a legend showing the name and color of each line and (b) a table below the chart with all the values in my data frame. Something like this:
What I want
Instead of cities, my chart would have "Statewide", "White", "Black", and "Hispanic". Also, my table would have years (from 2015 to 2017), rather than months. I don't want the seasons or "freezing" line. I just want to add the legend and table like they did it.
Part 1 - Fixing the legend
Concerning the legend, this is not the ggplot way. Convert your data from wide to long, and then map the key column (named what below) to colour as an aesthetic.
library(tidyverse)
TX_growth %>%
gather(what, value, -year) %>%
ggplot() +
geom_line(aes(x=year, y= value, colour = what), size=1) +
labs(
title = "Figure 1: Statewide Percent who Met or Exceeded Progress",
subtitle = "Greater percentage means that student subgroup progressed at higher percentage than previous year.",
x = "Year", y = "Percentage progress") +
theme_bw() +
scale_x_continuous(breaks=c(2017,2016,2015))
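The same reshaping can be written with tidyr::pivot_longer(), which the tidyr documentation recommends over gather() in new code. A sketch, assuming tidyr 1.0.0 or later and ggplot2 3.4.0 or later (for `linewidth`); the name `TX_long` is only illustrative:

```r
library(ggplot2)
library(tidyr)

TX_growth <- data.frame(year      = c(2017, 2016, 2015),
                        statewide = c(61, 62, 57),
                        black     = c(58, 58, 53),
                        hispanic  = c(59, 60, 55),
                        white     = c(65, 64, 61))

# Wide to long: one row per year and group
TX_long <- pivot_longer(TX_growth, cols = -year,
                        names_to = "what", values_to = "value")

p <- ggplot(TX_long, aes(x = year, y = value, colour = what)) +
  geom_line(linewidth = 1) +
  theme_bw() +
  scale_x_continuous(breaks = 2015:2017)
```

Because colour is mapped inside aes(), ggplot2 builds the legend on its own, with one entry per value of `what`.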
Part 2 - Adding a table
Concerning the table, this seems to be somewhat of a duplicate of "Adding a table of values below the graph in ggplot2".
To summarise from various posts, we can use egg::ggarrange to add a table at the bottom; here is a minimal example:
library(tidyverse)
gg.plot <- TX_growth %>%
gather(what, value, -year) %>%
ggplot() +
geom_line(aes(x=year, y= value, colour = what), size=1) +
theme_bw() +
scale_x_continuous(breaks=c(2017,2016,2015))
gg.table <- TX_growth %>%
gather(what, value, -year) %>%
ggplot(aes(x = year, y = as.factor(what), label = value, colour = what)) +
geom_text() +
theme_bw() +
scale_x_continuous(breaks=c(2017,2016,2015)) +
guides(colour = "none") +
theme_minimal() +
theme(
axis.title.y = element_blank())
library(egg)
ggarrange(gg.plot, gg.table, ncol = 1)
All that remains to do is some final figure polishing.
Part 3 - After some polishing ...
library(tidyverse)
gg.plot <- TX_growth %>%
gather(Group, value, -year) %>%
ggplot() +
geom_line(aes(x = year, y = value, colour = Group)) +
theme_bw() +
scale_x_continuous(breaks = 2015:2017)
gg.table <- TX_growth %>%
gather(Group, value, -year) %>%
ggplot(aes(x = year, y = as.factor(Group), label = value, colour = Group)) +
geom_text() +
theme_bw() +
scale_x_continuous(breaks = 2015:2017) +
scale_y_discrete(position = "right") +
guides(colour = "none") +
theme_minimal() +
theme(
axis.title.y = element_blank(),
axis.title.x = element_blank(),
axis.text.x = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
library(egg)
ggarrange(gg.plot, gg.table, ncol = 1, heights = c(4, 1))
I've got a data frame with three variables, location, price, and varname.
I'd like to use ggplot2's geom_tile to make a heat map of sorts. This plot almost looks like a bar chart, but I prefer geom_tile because I like the values, big or small, to be allocated the same amount of physical space on the plot. My code almost gets me there.
The first problem is that I can't format the plot so as to get rid of all the white space to the left and right of my pseudo-bar. The second problem is that I can't remove the price label below the plot; I'd like Price to appear only in the title above the plot.
Thanks for any help!
Starting point (df):
df <- data.frame(location=c("AZ","MO","ID","MI"),price=c(1380.45677,1745.1245,12.45652,1630.65341),varname=c("price","price","price","price"))
Current code:
library(ggplot2)
ggplot(df, aes(varname,location, width=.2)) + geom_tile(aes(fill = price),colour = "white") + geom_text(aes(label = round(price, 3))) +
scale_fill_gradient(low = "ivory1", high = "green") +
theme_classic() + labs(x = "", y = "") + theme(legend.position = "none") + ggtitle("Price")
Don't set the width to 0.2.
Use theme to disable the labels and ticks.
You might want to use coord_equal to get nice proportions (i.e. squares). expand = FALSE gets rid of all white space.
ggplot(df, aes(varname, location)) +
geom_tile(aes(fill = price), colour = "white") +
geom_text(aes(label = round(price, 3))) +
scale_fill_gradient(low = "ivory1", high = "green") +
theme_classic() + labs(x = "", y = "") +
theme(legend.position = "none", axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
ggtitle("Price") +
coord_equal(expand = FALSE)