Issue with log_2 scaling using ggplot2 and log2_trans() - r

I am trying to plot data using ggplot2 in R.
The data points occur at every 2^i-th x-value (4, 8, 16, 32, ...). For that reason, I want to scale my x-axis by log_2 so that the data points are spread out evenly. Currently most of the data points are clustered on the left side, making my plot hard to read (see first image).
I used the following command to get this image:
ggplot(summary, aes(x=xData, y=yData, colour=groups)) +
geom_errorbar(aes(ymin=yData-se, ymax=yData+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd)
However, trying to scale my x-axis with log2_trans yields the second image, which is not what I expected and does not follow my data.
Code used:
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=solver.name)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(trans = log2_trans(),
breaks = trans_breaks("log2", function(x) 2^x),
labels = trans_format("log2", math_format(2^.x)))
Using only scale_x_continuous(trans = log2_trans()) doesn't help either.
EDIT:
Attached the data for reproducing the results:
https://pastebin.com/N1W0z11x
EDIT 2:
I had used pd <- position_dodge(1000) to keep my error bars from overlapping, and that is what caused the problem.
Removing the position=pd arguments solved the issue.

Here is a way you could format your x-axis:
# Generate dummy data
x <- 2^seq(1, 10)
df <- data.frame(
x = c(x, x, x),
y = c(0.5*x, x, 1.5*x),
z = rep(letters[seq_len(3)], each = length(x))
)
The plot of this would look like this:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line()
Adjusting the x-axis would work like so:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line() +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
The labels argument is only there to produce labels in the format 2^x; you could change it to whatever you like.
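As a rough illustration of changing the labels, here is a minimal sketch of a custom labelling function. The name label_pow2 is hypothetical and not part of the original answer; it only assumes that the breaks are exact powers of two.

```r
# Hypothetical labelling helper (not part of the original answer):
# turns break values such as 4, 8, 16 into the strings "2^2", "2^3", "2^4".
label_pow2 <- function(breaks) {
  paste0("2^", log2(breaks))
}

example_labels <- label_pow2(c(4, 8, 16))
print(example_labels)
```

Since the labels argument of scale_x_continuous() also accepts a function that receives the breaks, something like labels = label_pow2 should work in place of the math_format() call if plain-text labels are preferred.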

I had used pd <- position_dodge(1000) to keep my error bars from overlapping, and that is what caused the problem.
Adjusting the amount of position dodge and the width of the error bars to match the new scaling solved the problem.
pd <- position_dodge(0.2) # move them .2 to the left and right
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=algorithm)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=0.4, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
Adding scale_y_continuous(trans="log2") yields the results I was looking for.
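To see why the dodge and error bar widths had to shrink, it may help to compare the axis range before and after the transformation. As far as I understand, ggplot2 applies position adjustments after the scale transformation, so the widths are interpreted in log2 units. The sketch below uses made-up x-values following the 2^i pattern; they are an assumption, not the data from the question.

```r
# Hypothetical x-values following the 2^i pattern described in the question
x <- 2^(2:16)

# Extent of the axis on the original scale and after the log2 transformation
range_linear <- diff(range(x))        # 65536 - 4
range_log2   <- diff(range(log2(x)))  # 16 - 2

dodge_old <- 1000  # dodge width used with the untransformed axis
dodge_new <- 0.2   # dodge width used with the log2 axis

# Dodge width as a fraction of the visible axis
print(dodge_old / range_linear)  # small fraction of the linear axis
print(dodge_old / range_log2)    # many times wider than the whole log2 axis
print(dodge_new / range_log2)    # small fraction of the log2 axis
```

Under these assumptions, a dodge of 1000 is far wider than the entire transformed axis, which would explain the distorted second plot.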

Related

How to create custom labels at the ends of the x-axis in a faceted ggplot

I've created a plot (below) and basically want the left and right ends of each facet to state "Ov" and "Cx." I've tried using scale_x_continuous but the issue is that the x-axis for each facet is different.
What I have right now (image):
What I'd like to get ideally:
all_prm %>% ggplot(aes(y_coord, prominence)) +
geom_point() +
facet_wrap(~interaction(ms, sample), scales="free_x") +
scale_x_continuous(breaks=c(10000), labels=c("Ov")) +
theme(
axis.title.y=element_text(margin=margin(r=7)),
axis.title.x=element_text(margin=margin(t=7)),
panel.background = element_rect(fill='white', color='grey10')) +
xlab("Oviduct-Cervical Axis") +
ylab("Prominence")
You can use a custom function to set the break points, in this case using the range of the x limit values with an adjustment argument to move the labels away from the axis limits relative to their scale.
Using the iris dataset:
break_range <- function(x, adjust = .025) {
rng <- range(x)
rng + diff(rng) * c(adjust, -adjust)
}
ggplot(iris, aes(Petal.Length, Sepal.Length )) +
geom_point() +
facet_wrap(. ~ Species, nrow = 3, scales="free_x") +
scale_x_continuous(breaks = break_range, labels = c("Ov", "Cx")) +
theme(
axis.title.y=element_text(margin=margin(r=7)),
axis.title.x=element_text(margin=margin(t=7)),
panel.background = element_rect(fill='white', color='grey10'))
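To make the behaviour of break_range concrete, here is a small check outside of ggplot2. When breaks is given as a function, ggplot2 calls it with the limits of each panel's scale; the limits c(1, 7) used here are made up for illustration.

```r
# Same helper as in the answer: two breaks, pulled slightly inside the limits
break_range <- function(x, adjust = .025) {
  rng <- range(x)
  rng + diff(rng) * c(adjust, -adjust)
}

# Hypothetical axis limits for one facet
brks <- break_range(c(1, 7))
print(brks)

# A larger adjust moves both labels further towards the centre
brks_wide <- break_range(c(1, 7), adjust = 0.1)
print(brks_wide)
```

Because the function always returns exactly two values, the two labels "Ov" and "Cx" line up with them in every facet, whatever its x-range.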

How to trim extra space from ggplot

I am trying to make an extremely simple heatmap of percentages using ggplot2, which ideally will just be two thin columns. I tried the following code, believing that the width option in aes would solve the problem.
p_prev_tg <- ggplot(tg_melt, aes(x = variable , y = OTU, fill = value,
width=.3)) + geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7))
p_prev_tg
Unfortunately, this returns a plot with lots of empty space as shown. The plot I would like is those two bars side by side, how can I do this in ggplot?
thanks
What about this solution?
set.seed(1234)
tg_melt <- data.frame(variable=rep(c("Prevalence_T","Prevalence_NT"), each=10),
OTU=rep(paste0("OTU_",1:10),2),
value=rnorm(20),
stringsAsFactors=TRUE) # needed in R >= 4.0 so that as.numeric(variable) and levels() work below
library(RColorBrewer)
library(ggplot2)
hm.palette2 <- colorRampPalette(rev(brewer.pal(11, 'Spectral')))
p_prev_tg <- ggplot(tg_melt, aes(x = as.numeric(variable), y = OTU, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7)) +
scale_x_continuous(breaks=c(1,2),
limits=c(0,3),
labels=levels(tg_melt$variable))+
theme_bw()
p_prev_tg
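The approach above relies on variable being a factor, so that as.numeric() returns its integer codes and levels() returns the matching labels. A minimal sketch of that mapping, using a shortened version of the column:

```r
# Shortened stand-in for tg_melt$variable, explicitly created as a factor
variable <- factor(rep(c("Prevalence_T", "Prevalence_NT"), each = 2))

# Levels are sorted alphabetically, so "Prevalence_NT" gets code 1
lv <- levels(variable)
codes <- as.numeric(variable)

print(lv)
print(codes)
```

The codes 1 and 2 are what breaks=c(1,2) refers to, and limits=c(0,3) sets how much room is left on either side of the two columns.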

Visualizing crosstab tables with a plot in R - changing colours

I have the following code in R which is modified from here, which plots a crosstab table:
#load ggplot2
library(ggplot2)
# Set up the vectors
xaxis <- c("A", "B")
yaxis <- c("A","B")
# Create the data frame
df <- expand.grid(xaxis, yaxis)
df$value <- c(120,5,30,200)
#Plot the Data
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value), colour = "lightblue") + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
It produces the right figure, which is great, but I was hoping to custom colour the four dots, ideally so that the top left and bottom right are both one colour and the top right and bottom left are another.
I have tried to use:
+ scale_color_manual(values=c("blue","red","blue","red"))
but that doesn't seem to work. Any ideas?
I would suggest that you colour by a vector in your data frame. Since you don't have a column that gives you this, you can either create one or make a rule based on existing columns (which I have done below):
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value, colour = (Var2!=Var1))) + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
The important part is colour = (Var2!=Var1). Note that I put this inside the aesthetic (aes) for the geom_point.
Edit: if you wish to remove the legend (you annotate the chart with totals, so I guess you don't really need it), you can add: g + theme(legend.position="none") to remove it
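To see what the rule Var2 != Var1 evaluates to, here is a short check on the same data frame without any plotting. The column name off_diagonal is made up for illustration.

```r
# Same data frame as in the question
xaxis <- c("A", "B")
yaxis <- c("A", "B")
df <- expand.grid(xaxis, yaxis)
df$value <- c(120, 5, 30, 200)

# TRUE where the two categories differ (off-diagonal cells), FALSE otherwise
df$off_diagonal <- df$Var2 != df$Var1
print(df)
```

The two diagonal cells share one value and the two off-diagonal cells share the other, so mapping this logical vector to colour yields two colour groups.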

adding vertical line to date x axis [duplicate]

Even though I found Hadley's post in the google group on POSIXct and geom_vline, I could not get it done. I have a time series from and would like to draw a vertical line for years 1998, 2005 and 2010 for example. I tried with ggplot and qplot syntax, but still I either see no vertical line at all or the vertical line is drawn at the very first vertical grid and the whole series is shifted somewhat strangely to the right.
gg <- ggplot(data=mydata,aes(y=somevalues,x=datefield,color=category)) +
layer(geom="line")
gg + geom_vline(xintercept=mydata$datefield[120],linetype=4)
# returns just the time series plot I had before,
# interestingly the legend contains dotted vertical lines
My date field has format "1993-07-01" and is of class Date.
Try as.numeric(mydata$datefield[120]):
gg + geom_vline(xintercept=as.numeric(mydata$datefield[120]), linetype=4)
A simple test example:
library("ggplot2")
tmp <- data.frame(x=rep(seq(as.Date(0, origin="1970-01-01"),
length=36, by="1 month"), 2),
y=rnorm(72),
category=gl(2,36))
p <- ggplot(tmp, aes(x, y, colour=category)) +
geom_line() +
geom_vline(xintercept=as.numeric(tmp$x[c(13, 24)]),
linetype=4, colour="black")
print(p)
You could also do geom_vline(xintercept = as.numeric(as.Date("2015-01-01")), linetype=4) if you want the line to stay in place whether or not your date is in the 120th row.
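As a quick check of what as.numeric() does to a Date: R stores a Date as the number of days since 1970-01-01, so a fixed date always maps to the same x position regardless of which rows the data contain.

```r
# A Date is stored as the number of days since 1970-01-01
one_day <- as.numeric(as.Date("1970-01-02"))
print(one_day)

# The conversion is reversible, so the numeric value identifies the date
fixed <- as.numeric(as.Date("2015-01-01"))
back  <- as.Date(fixed, origin = "1970-01-01")
print(back)
```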
Depending on how you pass your "Dates" column to aes, either as.numeric or as.POSIXct works:
library(ggplot2)
Using aes(as.Date(Dates), ...):
ggplot(df, aes(as.Date(Dates), value)) +
geom_line() +
geom_vline(xintercept = as.numeric(as.Date("2020-11-20")),
color = "red",
lwd = 2)
Using aes(Dates, ...):
ggplot(df, aes(Dates, value)) +
geom_line() +
geom_vline(xintercept = as.POSIXct(as.Date("2020-11-20")),
color = "red",
lwd = 2)
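The reason the two cases differ is probably the unit behind each class: a Date counts days since 1970-01-01, while a POSIXct counts seconds. A minimal sketch of the difference (the time zone is fixed to UTC here so the result does not depend on the local setting):

```r
# Same calendar day, two different classes
days_val <- as.numeric(as.Date("1970-01-02"))
secs_val <- as.numeric(as.POSIXct("1970-01-02", tz = "UTC"))

print(days_val)  # days since the epoch
print(secs_val)  # seconds since the epoch
```

An xintercept given in days on a POSIXct axis (or in seconds on a Date axis) lands far away from the intended position, which would match the symptoms described in the question.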
as.numeric works for me:
ggplot(data=bmelt)+
geom_line(aes(x=day,y=value,colour=type),size=0.9)+
scale_color_manual(labels = c("Observed","Counterfactual"),values = c("1","2"))+
geom_ribbon(data=ita3,aes(x=day,
y=expcumresponse, ymin=exp.cr.ll,ymax=exp.cr.uu),alpha=0.2) +
labs(title="Italy Confirmed cases",
y ="# Cases ", x = "Date",color="Output")+
geom_vline(xintercept = as.numeric(ymd("2020-03-13")), linetype="dashed",
color = "blue", size=1.5)+
theme_minimal()

ggplot2 add offset to jitter positions

I have data that looks like this
df = data.frame(x=sample(1:5,100,replace=TRUE),y=rnorm(100),assay=sample(c('a','b'),100,replace=TRUE),project=rep(c('primary','secondary'),50))
and am producing a plot using this code
ggplot(df,aes(project,x)) + geom_violin(aes(fill=assay)) + geom_jitter(aes(shape=assay,colour=y),height=.5) + coord_flip()
which gives me this
This is 90% of the way to what I want, but I would like each point to be plotted only on top of the violin for the matching assay type. That is, the jittered positions should be set so that the triangles appear only on the upper teal violin and the circles only on the lower red violin for each project type.
Any ideas how to do this?
In order to get the desired result, it is probably best to use position_jitterdodge as this gives you the best control over the way the points are 'jittered':
ggplot(df, aes(x = project, y = x, fill = assay, shape = assay, color = y)) +
geom_violin() +
geom_jitter(position = position_jitterdodge(dodge.width = 0.9,
jitter.width = 0.5,
jitter.height = 0.2),
size = 2) +
coord_flip()
which gives:
You can use interaction between assay & project:
p <- ggplot(df,aes(x = interaction(assay, project), y=x)) +
geom_violin(aes(fill=assay)) +
geom_jitter(aes(shape=assay, colour=y), height=.5, cex=4)
p + coord_flip()
The labeling can be adjusted by using a numeric x-axis:
# cbind the interaction as a numeric
df$group <- as.numeric(interaction(df$assay, df$project))
# plot
p <- ggplot(df,aes(x=group, y=x, group=cut_interval(group, n = 4))) +
geom_violin(aes(fill=assay)) +
geom_jitter(aes(shape=assay, colour=y), height=.5, cex=4)
# project is a character column in R >= 4.0, so convert it to a factor to get its levels
p + coord_flip() + scale_x_continuous(breaks = c(1.5, 3.5), labels = levels(factor(df$project)))
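The breaks at 1.5 and 3.5 follow from how interaction() orders its levels: the first factor varies fastest. A small sketch with one row per combination (the vectors are made up for illustration):

```r
# One row per assay/project combination
assay   <- c("a", "b", "a", "b")
project <- c("primary", "primary", "secondary", "secondary")

grp <- interaction(assay, project)
grp_levels <- levels(grp)
grp_codes  <- as.numeric(grp)

print(grp_levels)
print(grp_codes)
```

Groups 1 and 2 belong to "primary" and groups 3 and 4 to "secondary", so the midpoints 1.5 and 3.5 are natural positions for the two project labels.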
