Graph of zero point clumped together after applying ggbreak() in ggplot2 - r

I don't have data between 1 and 11 h (on the x axis), so I want to delete this portion from my graph.
How can I show the zero values separately for each group instead of having them clumped together?
I tried to plot the graph as follows:
library(ggplot2)
library(ggbreak) # provides scale_x_break()
plot <- ggplot(bgse_lmean, aes(time_h)) +
geom_point(aes(y = Conc, shape = Isolate, col = Isolate, group = Isolate), size = 2.5)+
geom_line(aes(y = Conc, group= Isolate, col = Isolate))+
scale_x_continuous(breaks=c(0,12, 16, 20, 24))+
xlab("Time (h)") + ylab("Conc mg/ml") +
scale_color_brewer(palette = "Set1") +
facet_wrap(parameters~.,
strip.position = "top",
nrow = 3)
plot
p2 <- plot + scale_x_break(c(1,11))

Related

Bunched up x axis ticks on multi panelled plot in ggplot

I am attempting to make a multi-panelled plot from three individual plots (see images). However, I am unable to fix the bunched x-axis tick labels when the plots are in the multi-panel format. The script for the individual plots and the multi-panel plot follows:
Individual Plot:
NewDat [[60]]
EstRes <- NewDat [[60]]
EstResPlt = ggplot(EstRes, aes(Distance3, `newBa`)) +
geom_line() +
scale_x_continuous(n.breaks = 10, limits = c(0, 3500)) +
scale_y_continuous(n.breaks = 10, limits = c(0, 25)) +
xlab("Distance from Core (μm)") +
ylab("Ba:Ca concentration(μmol:mol)") +
geom_hline(yintercept = 2.25, linetype = "dashed", color = "red") +
geom_vline(xintercept = 1193.9, linetype = "dashed", color = "grey") +
geom_vline(xintercept = 1965.5, linetype = "dashed", color = "grey") +
geom_vline(xintercept = 2616.9, linetype = "dashed", color = "grey") +
geom_vline(xintercept = 3202.8, linetype = "dashed", color = "grey") +
geom_vline(xintercept = 3698.9, linetype = "dashed", color = "grey")
EstResPlt
Multi-panel plot:
MultiP <- grid.arrange(MigrPlt,OcResPlt,EstResPlt, nrow =1)
I have attempted to include:
MultiP <- grid.arrange(MigrPlt,OcResPlt,EstResPlt, nrow =1)+
theme(axis.text.x = element_text (angle = 45)) )
MultiP
but have only received errors. It's not necessary for all tick marks to be included. An initial, mid and end value is sufficient and therefore they would not need to all be included or angled. I'm just not sure how to do this. Assistance would be much appreciated.
There are several options to resolve the crowded axes. Let's consider the following example which parallels your case. The default labelling strategy wouldn't overcrowd the x-axis.
library(ggplot2)
library(patchwork)
library(scales)
df <- data.frame(
x = seq(0, 3200, by = 20),
y = cumsum(rnorm(161))
)
p <- ggplot(df, aes(x, y)) +
geom_line()
(p + p + p) / p &
scale_x_continuous(
name = "Distance (um)"
)
However, because you've given n.breaks = 10 to the scale, it becomes crowded. So a simple solution would just be to remove that.
(p + p + p) / p &
scale_x_continuous(
n.breaks = 10,
name = "Distance (um)"
)
Alternatively, you could convert the micrometers to millimeters, which makes the labels less wide.
(p + p + p) / p &
scale_x_continuous(
n.breaks = 10,
labels = label_number(scale = 1e-3, accuracy = 0.1),
name = "Distance (mm)"
)
Yet another alternative is to put breaks only every n units, 1000 in the case below. By chance, this happens to give the same result as omitting n.breaks = 10.
(p + p + p) / p &
scale_x_continuous(
breaks = breaks_width(1000),
name = "Distance (um)"
)
Created on 2021-11-02 by the reprex package (v2.0.1)
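Since the question says that an initial, middle, and end value would be enough, the breaks can also be given explicitly. A minimal sketch, assuming the limits of 0 to 3500 from the question; three_breaks() is a hypothetical helper written for this example, not a ggplot2 or scales function.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): first, middle, and last value of a range
three_breaks <- function(limits) {
  c(limits[1], mean(limits), limits[2])
}

# Same kind of example data as above
df <- data.frame(
  x = seq(0, 3200, by = 20),
  y = cumsum(rnorm(161))
)

x_limits <- c(0, 3500)  # assumed from the limits used in the question

p <- ggplot(df, aes(x, y)) +
  geom_line() +
  scale_x_continuous(
    limits = x_limits,
    breaks = three_breaks(x_limits),  # ticks at 0, 1750, and 3500 only
    name = "Distance (um)"
  )
p
```

With only three labels per panel there should be little risk of overlap, even when several panels sit side by side.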
I thought it would be better to show this with an example.
What I meant was: you made MigrPlt, OcResPlt, and EstResPlt each with ggplot() +...... For each plot whose x axis text you want to rotate, add + theme(axis.text.x = element_text(angle = 45)).
For example, with the iris data, rotate the x axis text only for a, like this:
a <- ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
geom_point() +
theme(axis.text.x = element_text (angle = 45))
b <- ggplot(iris, aes(Petal.Width, Petal.Length)) +
geom_point()
gridExtra::grid.arrange(a,b, nrow = 1)

Scale density curve made with geom_density to similar height of geom_histogram?

I need to align the density line with the height of geom_histogram and keep count values on the y axis instead of density.
I have these 2 versions:
# Creating dataframe
library(ggplot2)
values <- c(rep(0,2), rep(2,3), rep(3,3), rep(4,3), 5, rep(6,2), 8, 9, rep(11,2))
data_to_plot <- as.data.frame(values)
# Option 1 ( y scale shows frequency, but geom_density line and geom_histogram are not matching )
ggplot(data_to_plot, aes(x = values)) +
geom_histogram(aes(y = ..count..), binwidth = 1, colour= "black", fill = "white") +
geom_density(aes(y=..count..), fill="blue", alpha = .2)+
scale_x_continuous(breaks = seq(0, max(data_to_plot$values), 1))
y scale shows frequency, but geom_density line and geom_histogram are not matching
# Option 2 (geom_density line and geom_histogram are matching, but y scale density = 1)
ggplot(data_to_plot, aes(x = values)) +
geom_histogram(aes(y = after_stat(ndensity)), binwidth = 1, colour= "black", fill = "white") +
geom_density(aes(y = after_stat(ndensity)), fill="blue", alpha = .2)+
scale_x_continuous(breaks = seq(0, max(data_to_plot$values), 1))
geom_density line and geom_histogram are matching, but y scale density = 1
What I need is the plot from Option 2 with the y scale from Option 1. I can get it by using aes(y = 1.25*..count..) for this particular data, but my data is not static and this will not work for another dataset (just modify values to test):
# Option 3 (with coefficient in aes())
ggplot(data_to_plot, aes(x = values)) +
geom_histogram(aes(y = ..count..), binwidth = 1, colour= "black", fill = "white") +
geom_density(aes(y=1.25*..count..), fill="blue", alpha = .2)+
scale_x_continuous(breaks = seq(0, max(data_to_plot$values), 1))
Desired result: y scale shows frequency and geom_density line is matching with geom_histogram height
I cannot hardcode coefficient or bins.
This problem is close to the ones discussed here, but it did not work for my case:
Programatically scale density curve made with geom_density to similar height to geom_histogram?
How to put geom_density and geom_histogram on same counts scale
A density curve integrates to 1, so for data like these its values stay well below 1, whereas counts are whole numbers. So it mostly does not make sense to plot both on the same y-axis.
The left plot shows the density line and histogram for data similar to yours; I just added some values. The height of each bar shows the proportion of counts for the corresponding x-value. The y-scale stays below 1.
The right plot shows the same as the left, but another histogram is added which shows the count. The y-scale goes up and the two density plots shrink.
If you want to bring both onto the same scale, you can do this by calculating a scaling factor. I have used this scaling factor to add a secondary y-axis to the third plot, scaling the secondary y-axis accordingly.
To make clear what belongs to which scale, I have colored the second y-axis and the data belonging to it red.
library(ggplot2)
library(patchwork)
values <- c(rep(0,2),rep(1,4), rep(2,6), rep(3,8), rep(4,12), rep(5,7), rep(6,4),rep(7,2))
df <- as.data.frame(values)
p1 <- ggplot(df, aes(x = values)) +
stat_density(geom = 'line') +
geom_histogram(aes(y = ..density..), binwidth = 1,color = 'white', fill = 'red', alpha = 0.2)
p2 <- ggplot(df, aes(x = values)) +
stat_density(geom = 'line') +
geom_histogram(aes(y = ..count..), binwidth = 1, color = 'white', alpha = 0.2) +
geom_histogram(aes(y = ..density..), binwidth = 1, color = 'white', alpha = 0.2) +
ylab('density and counts')
# Find maximum of ..density..
m <- max(table(df$values)/sum(table(df$values)))
# Find maximum count in df$values
mm <- max(table(df$values))
# Create Scaling factor for secondary axis
scaleF <- m/mm
p3 <- p1 + scale_y_continuous(
limits = c(0, m),
# Features of the first axis
name = "density",
# Add a second axis and specify its features
sec.axis = sec_axis( trans=~(./scaleF), name = 'counts')
) +
theme(axis.ticks.y.right = element_line(color = "red"),
axis.line.y.right = element_line(color = 'red'),
axis.text.y.right = element_text(color = 'red'),
axis.title.y.right = element_text(color = 'red')) +
annotate("segment", x = 5, xend = 7,
y = 0.25, yend = .25, colour = "pink", size=3, alpha=0.6, arrow=arrow())
p1 | p2 | p3
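If the goal is the picture from Option 2 with the count axis from Option 1, one possibility is to rescale the normalised density by the height of the tallest bar, computed from the data rather than hardcoded. A minimal sketch, assuming binwidth = 1 and integer-valued data as in the question, so that table() gives the same counts as the histogram bins; max_count is a name introduced here for illustration.

```r
library(ggplot2)

values <- c(rep(0, 2), rep(2, 3), rep(3, 3), rep(4, 3), 5, rep(6, 2), 8, 9, rep(11, 2))
data_to_plot <- data.frame(values = values)

# Height of the tallest bar. Assumption: with binwidth = 1 and integer data,
# every distinct value falls into its own bin, so table() matches the bin counts.
max_count <- max(table(data_to_plot$values))

p <- ggplot(data_to_plot, aes(x = values)) +
  geom_histogram(aes(y = after_stat(count)), binwidth = 1,
                 colour = "black", fill = "white") +
  # ndensity peaks at 1, so multiplying by max_count puts the peak at the tallest bar
  geom_density(aes(y = after_stat(ndensity) * max_count),
               fill = "blue", alpha = 0.2) +
  scale_x_continuous(breaks = seq(0, max(data_to_plot$values), 1)) +
  ylab("count")
p
```

For other bin widths or non-integer data, max_count would have to come from the actual binning instead of table(), so treat this as a starting point rather than a general solution.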

Plotting multiple x axis profiles from a csv file in R?

I am planning to plot vertical profiles of multiple parameters (for example salinity, temperature, and density) on the x axis against pressure on the y axis, all in the same graph. This is the kind of plot I am hoping to get:
Here is a sample from my data :
IntD.Date. IntT.Time. Salinity..psu. SIGMA.Kg.m3. Pressure.dbar.
1 21-April-2019 5:31:55 PM 30.2502 20.2241 0.7160
2 21-April-2019 5:32:00 PM 31.0254 20.8081 0.8409
3 21-April-2019 5:32:05 PM 31.2654 20.9930 1.0551
4 21-April-2019 5:32:10 PM 31.2953 21.0176 1.2694
Temp..0C. Vbatt.volt.
1 23.4054 12.29
2 23.4148 12.30
3 23.4060 12.29
4 23.4024 12.33
I already used these codes:
data <- read.csv('file location')
vert_plot <- ggplot(data, aes(x = Pressure.dbar., y = Temp..0C.)) +
geom_line(color = '#088DA5', size = 0.75) +
labs(size = 18) +
ggtitle("temp vs pressure") +
theme_grey() +
coord_flip() +
scale_y_reverse()
Which generated this plot :
As you can see, I was able to plot a single profile, but the pressure axis is not in reverse order: I would prefer the pressure values (0, 5, 10, ...) to start from the top left corner, whereas in my plot they begin in the bottom left corner.
I'd be grateful if someone could help me get a figure with multiple vertical profiles in the same graph, where the y axis is pressure in reverse order, as shown in the barrier layer thickness picture.
Add one geom_line() per variable, each with its own aes(). For breaks every 5 units, use scale_x_reverse() and pass it a sequence of breaks. In the code, df stands for the data frame read from the CSV (called data in the question).
vert_plot <- ggplot(df) +
geom_line(aes(x = Pressure.dbar., y = Temp..0C.), color = 'blue', size = 0.75) +
geom_line(aes(x = Pressure.dbar., y = Salinity..psu.), color = 'red', size = 0.75) +
geom_line(aes(x = Pressure.dbar., y = SIGMA.Kg.m3.), color = 'green', size = 0.75) +
labs(size = 18) + ggtitle("Dummy Title") + xlab("Pressure") + ylab("Dummy Label") +
scale_x_reverse(limits = c(40, 0), breaks = seq(40, 0, -5)) +
theme_grey() + coord_flip() + scale_y_reverse()
Alternate method:
Instead of going through all these, you can melt the data frame keeping the variable names as groups.
library(reshape2)
newdf <- melt(df, id.vars = c("IntD.Date.", "IntT.Time.", "Pressure.dbar."),
variable.name = "group")
vert_plot <- ggplot(newdf, aes(x = Pressure.dbar., y = value, color = group)) +
geom_line(size = 0.75) +
labs(size = 18) + ggtitle("Dummy Title") +
xlab("Pressure") + ylab("Dummy Label") +
scale_x_reverse(limits = c(40, 0), breaks = seq(40, 0, -5)) +
theme_grey() + coord_flip() + scale_y_reverse()
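A different sketch, not taken from the answer above: map pressure to y directly, reverse it with scale_y_reverse(), and draw with geom_path(), which connects the points in row order (here, order of measurement) instead of sorting them by x. This avoids coord_flip() altogether. The data frame reuses the four sample rows from the question; profile and long are names introduced for this example, and stack() is base R.

```r
library(ggplot2)

# The four sample rows shown in the question
profile <- data.frame(
  Pressure.dbar. = c(0.7160, 0.8409, 1.0551, 1.2694),
  Salinity..psu. = c(30.2502, 31.0254, 31.2654, 31.2953),
  SIGMA.Kg.m3.   = c(20.2241, 20.8081, 20.9930, 21.0176),
  Temp..0C.      = c(23.4054, 23.4148, 23.4060, 23.4024)
)

# Long format: one row per (pressure, variable) pair.
# stack() returns the columns `values` and `ind` (the variable name).
params <- c("Salinity..psu.", "SIGMA.Kg.m3.", "Temp..0C.")
long <- cbind(
  Pressure.dbar. = rep(profile$Pressure.dbar., times = length(params)),
  stack(profile[params])
)

p <- ggplot(long, aes(x = values, y = Pressure.dbar., colour = ind)) +
  geom_path(linewidth = 0.75) +  # use size = 0.75 on ggplot2 older than 3.4.0
  scale_y_reverse() +            # pressure increases downwards, 0 at the top
  labs(x = "Value", y = "Pressure (dbar)", colour = "Parameter")
p
```

Because the parameters have different ranges, they share one x axis only roughly; whether that is acceptable depends on the real data.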

How to fix: when overlaying two scatter plots with using reorder of aes, the reorder gets lost

I have two scatter plots obtained from two sets of data that I would like to overlay. When creating a single plot with ggplot2, I use a log scale and then order the values so that the scatter plot falls into a kind of horizontal S shape. But when I overlay the two, the information about the reordering gets lost and the plot loses its shape.
This is what the df looks like (one has 1076 entries and the other 1448):
protein Light_Dark log10
AT1G01080 1.1744852 0.06984755
AT1G01090 1.0710359 0.02980403
AT1G01100 0.4716955 -0.32633823
AT1G01320 156.6594802 2.19495668
AT1G02500 0.6406005 -0.19341276
AT1G02560 1.3381804 0.12651467
AT1G03130 0.6361147 -0.19646458
AT1G03475 0.7529015 -0.12326181
AT1G03630 0.7646064 -0.11656207
AT1G03680 0.8340107 -0.07882836
This is the code for the single plot:
p1 <- ggplot(ratio_log_ENR4, aes(x=reorder(protein, -log10), y=log10)) +
geom_point(size = 1) +
#coord_cartesian(xlim = c(0, 1000)) +
geom_hline(yintercept=0.1, col = "red") + #check gene
geom_hline(yintercept=-0.12, col = "red") +#check gene
labs(x = "Protein")+
theme_classic()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())+
labs(y = "ratio Light_Dark log10")+
labs(x="Protein")
image=p1
ggsave(file="p1_ratio_data_ENR4_cys.svg", plot=image, width=10, height=8)
And for the overlay:
p1_14a <- ggplot(ratio_log_ENR1, aes(x=reorder(protein, -log10), y=log10)) +
geom_point(size = 1) +
#coord_cartesian(xlim = c(0, 1000)) +
geom_hline(yintercept=0.1, col = "red") + #check gene
geom_hline(yintercept=-0.12, col = "red") +#check gene
labs(x = "Protein")+
theme_classic()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())+
labs(y = "ratio Light_Dark log10")+
labs(x="Protein")+
geom_point()+
geom_point(data=ratio_log_ENR4, color="red")
p=ggplot(ratio_log_ENR1, aes(x=reorder(protein, -log10), y=log10)) +
geom_point(size = 1) +
#coord_cartesian(xlim = c(0, 1000)) +
geom_hline(yintercept=0.1, col = "red") + #check gene
geom_hline(yintercept=-0.12, col = "red") +#check gene
labs(x = "Protein")+
theme_classic()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())+
labs(y = "ratio Light_Dark log10")+
labs(x="Protein")
p = p + geom_point(data=ratio_log_ENR4, aes(x=reorder(protein, -log10), y=log10), color ="red" )
p
I tried changing the classes, but that can't be the problem, since the single plot works as it is.
The easiest solution I see for you is just binding together your two dataframes before plotting.
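library(dplyr) # provides %>%
library(forcats) # provides fct_reorder()
library(ggplot2)
# a and b stand for your two data frames (e.g. ratio_log_ENR1 and ratio_log_ENR4)
a$color <- 'red'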
b$color <- 'blue'
ab <- a %>%
rbind(b)
ggplot(ab, aes(x = fct_reorder(protein, -log10), y = log10, color = color)) +
geom_point() +
scale_color_identity()
You can find a nice cheat-sheet for working with factors here: https://stat545.com/block029_factors.html

Highlight single points in scatter plot with ggplot2 and ggrepel

I want to highlight 4 single points in a scatter plot with a box surrounding the name associated with each point. I am using ggrepel to create the boxes surrounding the names and to repel them.
This is the code I have:
library(ggplot2)
gg <- ggplot(X, aes(x = XX, y = XY)) +
geom_point(col = "steelblue", size = 3) +
geom_smooth(method = "lm", col = "firebrick", se = FALSE) +
labs(title = "XX vs XY", subtitle = "X", y = "XX", x = "XY") +
scale_x_continuous(breaks = seq(76, 82, 1)) +
scale_y_continuous(breaks = seq(15, 19, 1))
library(ggrepel)
gg + geom_text_repel(aes(label = Female), size = 3, data = X)
gg + geom_label_repel(aes(label = Female), size = 2, data = X)
With that code, I obtain boxes for all the points. However, I only want boxes for 4 specific points and no boxes for the others. How can I do that?
Thanks in advance! Regards,
TD
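One common approach is to pass only the rows to be labelled to the data argument of geom_label_repel(), while the point layer keeps the full data. A minimal sketch: the data frame X below is a made-up stand-in (the question does not show its data), and the four names in to_label are hypothetical.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical stand-in for the data frame X from the question
set.seed(1)
X <- data.frame(
  Female = paste0("F", 1:20),
  XX = runif(20, 76, 82),
  XY = runif(20, 15, 19)
)

# Hypothetical choice of the four points to highlight
to_label <- c("F3", "F8", "F12", "F17")
X_labelled <- X[X$Female %in% to_label, ]

gg <- ggplot(X, aes(x = XX, y = XY)) +
  geom_point(col = "steelblue", size = 3) +
  # Only the subset is handed to the label layer, so only these rows get boxes
  geom_label_repel(data = X_labelled, aes(label = Female), size = 2)
gg
```

The same subset could also be given to an extra geom_point() layer with a different colour if the highlighted points themselves should stand out.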
