I'm trying to plot mean values for species, but the mean values are all negative. I want the smaller (more negative) values to sit toward the bottom of the y axis and the larger (less negative) values to sit higher up.
I've tried changing coord_cartesian and ylim, and neither works.
ggplot(meanWUE, aes(x = Species, y = mean, fill = Species)) +
coord_cartesian(ylim = c(-0.8, -0.7)) +
scale_fill_manual(values = c("EUCCHR" = "darkolivegreen2", "ESCCAL" = "darkgoldenrod2", "ARTCAL" = "darkcyan", "DEIFAS" = "darkred", "ENCCAL" = "darkorchid2", "SALMEL" = "deepskyblue1", "ERIFAS" = "blue3", "BRANIG" = "azure3", "PHAPAR" = "palevioletred")) +
scale_y_reverse() +
geom_bar(position = position_dodge(), stat="identity") +
geom_errorbar(aes(ymin=mean-se, ymax=mean+se),width=.3) +
labs(x="Species", y="WUE")+
theme_bw() +
theme(panel.grid.major = element_blank(), legend.position = "none")
I want ESCCAL and EUCCHR to be the shortest bars essentially, but currently they're being shown as the tallest.
Species vs water use efficiency
If I don't do scale_y_reverse, I get a plot that looks like this second image
One approach is to shift all the numbers to show their value over a baseline, and then adjust the labeling the same way:
df <- data.frame(Species = LETTERS[1:10],
mean = -80:-71/100)
ggplot(df, aes(x = Species, y = mean, fill = Species)) +
geom_bar(position = position_dodge(), stat="identity")
Here we shift the values to show them against a new baseline. Then we can show larger numbers as larger bars the way we'd normally expect for positive numbers. At the same time, we change the labels on the y axis so they correspond to the original values. So -0.8 becomes +0.1 vs. a baseline of -0.9. But we adjust the labels too, so that adjusted 0 has a label of -0.9, and adjusted +0.1 has a label of -0.8, its original value.
baseline <- -0.9
ggplot(df, aes(x = Species, y = mean - baseline, fill = Species)) +
geom_bar(position = position_dodge(), stat="identity") +
scale_y_continuous(breaks = 0:100*0.02,
labels = 0:100*0.02 + baseline, minor_breaks = NULL)
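The hard-coded breaks and labels above could also be derived from the baseline with a labelling function, so they stay in sync if the baseline changes. This is only a minimal sketch under the same assumptions as the example (same toy df, baseline of -0.9); `unshift` is a hypothetical helper name, not part of ggplot2:

```r
library(ggplot2)

df <- data.frame(Species = LETTERS[1:10],
                 mean = -80:-71 / 100)
baseline <- -0.9  # assumed baseline; must lie below the smallest mean

# Hypothetical helper: map a shifted axis position back to the original value
unshift <- function(y) y + baseline

p <- ggplot(df, aes(x = Species, y = mean - baseline, fill = Species)) +
  geom_col() +
  # ggplot2 picks the break positions; unshift() only rewrites their labels
  scale_y_continuous(labels = unshift) +
  labs(y = "mean")
```

Printing `p` should draw bars that grow upward from the baseline while the axis labels still show the original negative values.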
I'd like to draw a bar plot like this, but with a dual y-axis:
(https://i.stack.imgur.com/ldMx0.jpg)
The first three indices range from 0 to 1,
so I want the left y-axis (corresponding to NSE, KGE, VE) to range from 0 to 1,
and the right y-axis (corresponding to PBIAS) to range from -15 to 5.
The following is my data and code:
library("ggplot2")
## data
data <- data.frame(
value=c(0.82,0.87,0.65,-3.39,0.75,0.82,0.63,1.14,0.85,0.87,0.67,-7.03),
sd=c(0.003,0.047,0.006,4.8,0.003,0.028,0.006,4.77,0.004,0.057,0.014,4.85),
index=c("NSE","KGE","VE","PBIAS","NSE","KGE","VE","PBIAS","NSE","KGE","VE","PBIAS"),
period=c("all","all","all","all","calibration","calibration","calibration","calibration","validation","validation","validation","validation")
)
## fix index sequence
data$index <- factor(data$index, levels = c('NSE','KGE','VE',"PBIAS"))
data$period <- factor(data$period, levels = c('all','calibration', 'validation'))
## bar plot
ggplot(data, aes(x=index, y=value, fill=period))+
geom_bar(position="dodge", stat="identity")+
geom_errorbar(aes(ymin=value-sd, ymax=value+sd),
position = position_dodge(0.9), width=0.2 ,alpha=0.5, size=1)+
theme_bw()
I tried to scale and shift the second y-axis,
but the PBIAS bars were removed because they fall outside the scale limits, as follows:
(https://i.stack.imgur.com/n6Jfm.jpg)
The following is my code with the dual y-axis:
## bar plot (scale and shift the second y-axis with slope/intercept in 20/-15)
ggplot(data, aes(x=index, y=value, fill=period))+
geom_bar(position="dodge", stat="identity")+
geom_errorbar(aes(ymin=value-sd, ymax=value+sd),
position = position_dodge(0.9), width=0.2 ,alpha=0.5, size=1)+
theme_bw()+
scale_y_continuous(limits = c(0,1), name = "value", sec.axis = sec_axis(~ 20*.- 15, name="value"))
Any advice on how to move the bars, or any other solution?
Taking a different approach: instead of using a dual axis, one option would be to make two separate plots and glue them together using patchwork. IMHO that is much easier than fiddling around with rescaling the data (that's the step you missed, i.e. if you want to have a secondary axis you also have to rescale the data), and it makes it clearer that the indices are measured on different scales:
library(ggplot2)
library(patchwork)
data$facet <- data$index %in% "PBIAS"
plot_fun <- function(.data) {
ggplot(.data, aes(x = index, y = value, fill = period)) +
geom_bar(position = "dodge", stat = "identity") +
geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
position = position_dodge(0.9), width = 0.2, alpha = 0.5, size = 1
) +
theme_bw()
}
p1 <- subset(data, !facet) |> plot_fun() + scale_y_continuous(limits = c(0, 1))
p2 <- subset(data, facet) |> plot_fun() + scale_y_continuous(limits = c(-15, 15), position = "right")
p1 + p2 +
plot_layout(guides = "collect", widths = c(3, 1))
A second but similar option would be to use ggh4x, which via ggh4x::facetted_pos_scales allows setting the limits for facet panels individually. One drawback: the panels have the same width. (I failed to make this approach work with facet_grid and space="free".)
library(ggplot2)
library(ggh4x)
data$facet <- data$index %in% "PBIAS"
ggplot(data, aes(x = index, y = value, fill = period)) +
geom_bar(position = "dodge", stat = "identity") +
geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
position = position_dodge(0.9), width = 0.2, alpha = 0.5, size = 1
) +
facet_wrap(~facet, scales = "free") +
facetted_pos_scales(
y = list(
facet ~ scale_y_continuous(limits = c(-15, 15), position = "right"),
!facet ~ scale_y_continuous(limits = c(0, 1), position = "left")
)
) +
theme_bw() +
theme(strip.text.x = element_blank())
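For completeness, the rescaling step mentioned at the top of this answer (the one needed if a true secondary axis is kept) could look roughly like the sketch below. It assumes the question's data and its 20/-15 slope/intercept; the helper `to_primary` and the columns `y`, `ymin`, `ymax` are hypothetical names introduced only for illustration:

```r
library(ggplot2)

data <- data.frame(
  value = c(0.82, 0.87, 0.65, -3.39, 0.75, 0.82, 0.63, 1.14, 0.85, 0.87, 0.67, -7.03),
  sd = c(0.003, 0.047, 0.006, 4.8, 0.003, 0.028, 0.006, 4.77, 0.004, 0.057, 0.014, 4.85),
  index = rep(c("NSE", "KGE", "VE", "PBIAS"), times = 3),
  period = rep(c("all", "calibration", "validation"), each = 4)
)
data$index <- factor(data$index, levels = c("NSE", "KGE", "VE", "PBIAS"))

# The secondary axis displays 20 * y - 15, so PBIAS values have to be
# mapped onto the primary axis with the inverse transformation.
to_primary <- function(v) (v + 15) / 20
is_pbias <- data$index == "PBIAS"

data$y    <- ifelse(is_pbias, to_primary(data$value), data$value)
data$ymin <- ifelse(is_pbias, to_primary(data$value - data$sd), data$value - data$sd)
data$ymax <- ifelse(is_pbias, to_primary(data$value + data$sd), data$value + data$sd)

p <- ggplot(data, aes(x = index, y = y, fill = period)) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = ymin, ymax = ymax),
                position = position_dodge(0.9), width = 0.2, alpha = 0.5) +
  # No hard limits here: one PBIAS error bar extends above 5 on the right axis
  scale_y_continuous(name = "NSE, KGE, VE",
                     sec.axis = sec_axis(~ 20 * . - 15, name = "PBIAS")) +
  theme_bw()
```

One caveat with this sketch: the bars are still drawn from 0 on the left axis, which corresponds to -15 on the right axis, so the PBIAS bar lengths are not measured from a PBIAS of zero. That is another reason the separate-panel options above may be easier to read.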
I'm making sediment profile grain size distribution graphs, with stacked bar charts representing sand, silt and clay and an added line showing the median value for each depth. The graph looks good, yet the legend of my final output is mixing up some of my items.
Here is a breakdown of my code:
GS_as = data.frame(Depth = c(10,30,50,70,90),
clay = c(0.99,0,0,2.86,3.62),
silt = c(55.48,81.48,53.26,79.5,70.71),
sand = c(43.53,18.52,46.74,17.64,25.67))
library(reshape2) # provides melt()
long = melt(GS_as, id = "Depth")
df = data.frame(Depth = c(10,30,50,70,90),
value = c(34.8,24.84,48.9,12.7,19.73),
variable = c("median","median","median","median","median"))
ggplot(long,aes(x=Depth,y=value,fill=variable)) +
geom_bar(stat="identity") + coord_flip() +
scale_y_continuous(position = "right") +
scale_x_continuous(breaks = seq(10,900,by = 20),trans='reverse') +
scale_fill_grey() +
geom_line(data=df, aes(x= Depth, y = value,group=variable,colour=variable)) +
geom_point(data=df,aes(x= Depth, y = value,group=variable,colour=variable))
The final output is this graph (graph 1).
Now, how do I remove median from the grayscale legend of grain sizes, and how do I remove the points from each grayscale box? The points should only appear with the median, as a separate variable. I've searched for a long time without finding a solution. I suspect I arrived at my final graph in a strange, unintuitive way.
Additionally, if it's possible, I would also like the median line and points to be black, the "variable" legend title removed, and all the items grouped under one level.
I appreciate any help you can give.
To fix your first issue, with the median showing up in the fill legend, you could make fill a local aes of geom_bar. For a black color you could set the color via scale_color_manual. The legend titles could be set or removed via labs, and finally (as far as I understand you) you could "group all the items under 1 level" via theme options, by removing the spacing between the legends and almost all the margin around them.
library(ggplot2)
ggplot(long, aes(x = Depth, y = value)) +
geom_bar(aes(fill = variable), stat = "identity") +
coord_flip() +
scale_y_continuous(position = "right") +
scale_x_continuous(breaks = seq(10, 900, by = 20), trans = "reverse") +
scale_fill_grey() +
geom_line(data = df, aes(x = Depth, y = value, group = variable, colour = variable)) +
geom_point(data = df, aes(x = Depth, y = value, group = variable, colour = variable)) +
scale_color_manual(values = c("black")) +
labs(fill = NULL, color = NULL) +
theme(legend.spacing.y = unit(0, "pt"), legend.margin = margin(1, 0, 0, 0))
I use n.breaks to get a labeled x-axis mark for each cluster. This works well for 4, 5, or 6 clusters. Now I tried it with two clusters and it does not work anymore.
I build the graphs like this:
country_plot <- ggplot(Data) + aes(x = Cluster) +
theme(legend.title = element_blank(), axis.title.y = element_blank()) +
geom_bar(aes(fill = country), stat = "count", position = "fill", width = 0.85) +
scale_fill_manual(values = color_map_3, drop = F) +
scale_x_continuous(n.breaks = max(unique(Data$Cluster))) + scale_y_continuous(labels = percent) +
ggtitle("Country")
and export it like this:
ggsave("country_plot.png", plot = country_plot, device = "png", width = 16, height = 8, units = "cm")
When it works it looks something like this:
But with two clusters I get something like this, with only one mark, labeled 2.5, beyond the actual bars:
I manually checked the return value of
max(unique(Data$Cluster))
and it returns 2 which in my understanding should lead to two x-axis marks with 1 and 2 like it works with more clusters.
edit:
mutate(country = factor(country, levels = 1:3)) %>%
mutate(country =fct_recode(country,!!!country_factor_naming))%>%
mutate(Gender = factor(Gender, levels = 1:2)) %>%
mutate(Gender = fct_recode(Gender, !!!gender_factor_naming))%>%
If I understand correctly, the issue is caused by Cluster being treated as a continuous variable. It needs to be turned into a factor.
Here is a minimal, reproducible example using the mtcars dataset that reproduces the unwanted behaviour:
First attempt (continuous x-axis)
library(ggplot2)
library(scales)
ggplot(mtcars) +
aes(x = gear, fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_y_continuous(labels = percent)
In this example, gear takes over the role of Cluster and is assigned to the x-axis.
There are unwanted labeled tick marks at x = 2.5, 3.5, 4.5, 5.5 which are due to the continuous scale.
Second attempt (continuous x-axis with n.breaks given)
ggplot(mtcars) +
aes(x = gear, fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_x_continuous(n.breaks = length(unique(mtcars$gear))) +
scale_y_continuous(labels = percent)
Specifying n.breaks in scale_x_continuous() does not change the x-axis to discrete.
Third attempt (discrete x-axis, gear as factor)
When gear is turned into a factor, we get a labeled tick mark for each factor value:
ggplot(mtcars) +
aes(x = factor(gear), fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_y_continuous(labels = percent)
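If the x-axis has to stay continuous for some reason, another possibility is to pass the break positions explicitly instead of relying on n.breaks, which only acts as a hint to the break algorithm. A minimal sketch on the same mtcars example; `gear_breaks` is a hypothetical variable name:

```r
library(ggplot2)

# One break per distinct x value that actually occurs in the data
gear_breaks <- sort(unique(mtcars$gear))

p <- ggplot(mtcars) +
  aes(x = gear, fill = factor(vs)) +
  geom_bar(stat = "count", position = "fill", width = 0.85) +
  # Explicit breaks replace the automatically chosen "pretty" ones
  scale_x_continuous(breaks = gear_breaks) +
  scale_y_continuous(labels = scales::percent)
```

For the original question this would presumably translate to breaks = sort(unique(Data$Cluster)), although converting Cluster to a factor is usually the simpler fix.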
I have a graph made in ggplot that looks like this:
I want the numeric labels on each of the bars to be grounded/glued to the x axis, where y <= 0.
This is the code to generate the graph as such:
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=numofpics, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels = as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
I've tried vjust and experimented with position_nudge for the geom_text element, but every solution I can find shifts each geom_text element relative to its current position. As a result, everything I try ends up in a situation like this one:
How can I make ggplot ground the text at the bottom, on the x axis where y <= 0, ideally with the option to also set angle = 45?
Link to dataframe = https://drive.google.com/file/d/1b-5AfBECap3TZjlpLhl1m3v74Lept2em/view?usp=sharing
As I said in the comments, just set the y-coordinate of the text to 0 or below, and specify the angle : geom_text(aes(x=row, y=-100, label=bbch), angle=45)
I'm behind a proxy server that blocks connections to google drive so I can't access your data. I'm not able to test this, but I would introduce a new label field in my dataset that sets y to be 0 if y<0:
library(dplyr)
df <- df %>%
mutate(labelField = if_else(numofpics < 0, 0, numofpics))
I would then use this label field in my geom_text call:
geom_text(aes(x=row, y=labelField, label=bbch), angle = 45)
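Since the original data could not be downloaded, here is a small self-contained sketch of the same idea on made-up data. The data frame `toy` and its values are purely hypothetical; only the column names follow the question, and base R's ifelse stands in for dplyr::if_else:

```r
library(ggplot2)

# Hypothetical stand-in data, including two negative bars
toy <- data.frame(
  row = 1:5,
  numofpics = c(120, -40, 60, 410, -15),
  bbch = c("10", "21", "30", "51", "65")
)

# Label height: 0 for negative bars, the bar height otherwise
toy$labelField <- ifelse(toy$numofpics < 0, 0, toy$numofpics)

p <- ggplot(toy, aes(x = row)) +
  geom_col(aes(y = numofpics)) +
  geom_text(aes(y = labelField, label = bbch), angle = 45)
```

To pin every label to the axis instead, labelField could simply be set to a constant such as 0 for all rows.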
Hope that helps.
You can simply define the y-value in geom_text (e.g. -50)
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=-50, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels =
as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
I have two vectors of values, both with the same number of entries. When these vectors are plotted as histograms, the corresponding distributions should show counts vs. values. I'm not sure whether I'm misinterpreting something or have plotted something wrong, but in my understanding the red values should not exceed the green values everywhere. When both vectors have the same number of entries, one distribution must be lower somewhere if it is higher than the other elsewhere. Or not?
The plot command:
number_ticks<- function(n) {function(limits) pretty(limits, n)}
ggplot(data, aes(x = value, fill = Parameter)) +
geom_histogram(
binwidth = 0.25,
color = "black",
alpha = 0.75) +
theme_classic() +
theme(legend.position = c(0.21, 0.85)) +
labs(title = "",
x = TeX("$\\Delta U_{bias}$ / V")) +
scale_x_continuous(breaks = number_ticks(20)) +
guides(fill = guide_legend(title = "Parameter"))
Currently the red histogram goes on top of the green one: they are stacked. That is, position = "stack" is the default option in geom_histogram, while you want to use position = "identity".
For instance, compare
ggplot(diamonds, aes(price, fill = cut)) +
geom_histogram(binwidth = 500)
with
ggplot(diamonds, aes(price, fill = cut)) +
geom_histogram(binwidth = 500, position = "identity", alpha = 0.5)
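To check the reasoning in the question (equal-sized groups cannot both be higher everywhere), the per-bin counts can be pulled out of the plot. The sketch below uses simulated data; `sim` and its two groups "A" and "B" are hypothetical stand-ins for the question's data and Parameter column:

```r
library(ggplot2)

set.seed(1)
# Two simulated groups with 500 values each
sim <- data.frame(
  value = c(rnorm(500, mean = 0, sd = 1), rnorm(500, mean = 1, sd = 1.5)),
  Parameter = rep(c("A", "B"), each = 500)
)

# Default: bars of the second group are stacked on top of the first
p_stack <- ggplot(sim, aes(value, fill = Parameter)) +
  geom_histogram(binwidth = 0.25)

# Overlaid: both groups start at zero, so their heights are comparable
p_ident <- ggplot(sim, aes(value, fill = Parameter)) +
  geom_histogram(binwidth = 0.25, position = "identity", alpha = 0.5)

# Per-bin counts as computed by the histogram stat, summed per group
counts <- layer_data(p_ident)
totals <- tapply(counts$count, counts$group, sum)
```

Each entry of `totals` equals the size of its group, which confirms that the counts themselves are the same in both plots; only the stacking makes one histogram appear to sit above the other everywhere.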