geom_point plot with only number without circles - r

In ggplot2 in R, is it possible to plot each point as a unique number, without the surrounding circle? I tried setting the color to "white", but that doesn't work.

I would recommend geom_text.
set.seed(101)
dd <- data.frame(x=rnorm(50),y=rnorm(50),id=1:50)
library(ggplot2)
ggplot(dd,aes(x,y))+geom_text(aes(label=id))

I'll show how to do it with geom_text and/or geom_point.
Using geom_text (recommended)
For this example I'll use the built-in dataset mtcars and pretend the numbers you want to display are the weights (wt):
library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg))
# label each point with its weight instead of drawing a marker
p + geom_text(aes(label = wt))
or, if you want an example with truly unique numbers, we can just make up an index with seq_len:
p <- ggplot(mtcars, aes(wt, mpg))
# one running index per row of mtcars (1 to 32)
p + geom_text(aes(label = seq_len(nrow(mtcars))))
Using geom_point
While it would require more work, it actually is possible to do this with geom_point.
This is a reference image of some of the shapes you can use with geom_point:
As you can see, shapes 48 to 57 are the digits 0 to 9. You can use these shapes (and combine several of them to form larger numbers) via geom_point like this:
d=data.frame(p=c(48:57))
ggplot() +
scale_y_continuous(name="") +
scale_x_continuous(name="") +
scale_shape_identity() +
geom_point(data=d, mapping=aes(x=p%%16, y=p%/%16, shape=p), size=5, fill="red")
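The shape codes used above are ASCII code points, so a digit d corresponds to shape 48 + d. A minimal sketch of that mapping in base R follows; digit_to_shape is a hypothetical helper name introduced here for illustration, not a ggplot2 function.

```r
# digit_to_shape() is a hypothetical helper (not part of ggplot2):
# it maps the digits 0-9 to the plotting-symbol codes 48-57.
digit_to_shape <- function(d) {
  stopifnot(all(d %in% 0:9))
  48L + as.integer(d)
}

shape_codes <- digit_to_shape(0:9)

# Decode the codes back to characters to confirm they are the digits
digit_chars <- intToUtf8(shape_codes, multiple = TRUE)
print(shape_codes)
print(digit_chars)
```

The resulting codes can be passed as the shape aesthetic together with scale_shape_identity(), as in the example above.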
Finally, a trivial example using mtcars + geom_point with arbitrary numbers:
# shape codes 48:57 are the digits 0-9; recycle them to get one code per row of mtcars
d <- data.frame(wt=mtcars$wt, mpg=mtcars$mpg, p=c(rep(48:57, 3), 48, 49))
ggplot(d) +
scale_y_continuous(name="") +
scale_x_continuous(name="") +
scale_shape_identity() +
geom_point(mapping=aes(x=wt, y=mpg, shape=p), size=5, fill="red")

Related

How to guarantee exactly equal size of faceted plots across different plots?

I'm doing something like the following. I tend to use cowplot for saving images of ggplot-generated plots but non-cowplot solutions are also fine. The first plot produces three facets (and one empty space) in a 2x2 arrangement, the second produces 6 facets in a 3x2 arrangement. I set base_height and base_width assuming a size of 2 square for each plot. In the images generated from the code below, the individual plots (each facet) are not quite the same size, across the two images.
library(ggplot2)
library(cowplot)
ggplot(mtcars, aes(x=mpg, y=hp))+
geom_point()+
facet_wrap(~cyl,ncol=2) +
ggtitle("hp vs. mpg, by cyl") +
theme_cowplot(font_size=10)
save_plot("car1.png", last_plot(), base_height=4, base_width=4)
ggplot(mtcars, aes(x=mpg, y=hp))+
geom_point()+
ggtitle("hp vs. mpg, by carb") +
facet_wrap(~carb,ncol=3)+
theme_cowplot(font_size=10)
save_plot("car2.png", last_plot(), base_height=4, base_width=6)
I tried including the png files in the post but they each get scaled differently so it would be misleading.
I know I could generate each facet separately as its own plot and use plot_grid, and then base_height and base_width would set the size of each plot on its own, but is there any way to use facet_wrap or facet_grid and set the absolute size when saved of each facet?
Does cowplot::align_plots work for you?
library(ggplot2)
library(cowplot)
p1 <- ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point() +
facet_wrap(~cyl, ncol = 2) +
ggtitle("hp vs. mpg, by cyl") +
theme_cowplot(font_size = 10)
p2 <- ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point() +
ggtitle("hp vs. mpg, by carb") +
facet_wrap(~carb, ncol = 3) +
theme_cowplot(font_size = 10)
twoplots <- align_plots(p1, p2, align = "hv", axis = "tblr") # "tblr" aligns all four margins
save_plot("car1.png", ggdraw(twoplots[[1]]))
save_plot("car2.png", ggdraw(twoplots[[2]]))
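If the goal is to size the saved image by the number of facets rather than hard-coding 4x4 and 4x6, the facet grid dimensions can be read from the built plot. This is only a sketch: facet_dims and image_size are hypothetical helpers, the 2-inch panel size is taken from the question, and the approach scales the overall image without guaranteeing identical panel sizes, since titles, strips and axes still take up space.

```r
library(ggplot2)

p1 <- ggplot(mtcars, aes(mpg, hp)) + geom_point() + facet_wrap(~cyl, ncol = 2)
p2 <- ggplot(mtcars, aes(mpg, hp)) + geom_point() + facet_wrap(~carb, ncol = 3)

# facet_dims() is a hypothetical helper, not part of ggplot2 or cowplot:
# it reads the built facet layout and returns the number of rows and columns.
facet_dims <- function(p) {
  layout <- ggplot_build(p)$layout$layout
  c(nrow = max(layout$ROW), ncol = max(layout$COL))
}

# image_size() is also hypothetical: total width and height in inches,
# assuming `panel` inches per facet in each direction.
image_size <- function(p, panel = 2) {
  d <- facet_dims(p)
  c(width = d[["ncol"]] * panel, height = d[["nrow"]] * panel)
}

size1 <- image_size(p1)
size2 <- image_size(p2)
print(size1)
print(size2)
```

The row and column counts can also be handed to cowplot::save_plot() through its nrow and ncol arguments, in which case base_height and base_width are interpreted per sub-plot.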

Coloring subset of lines in a plot using ggplot

I am trying to connect the dots on my plot using geom_path(). I also want to color certain lines (intervals) based on a group variable (t). This is what I have so far:
ggplot(data, aes(x=x, y=x)) +
geom_point() +
geom_path(color=t)
What this does is it "incorrectly" connects the points based on this group. I just want the correct connecting lines to have a separate color.
Could anyone help me with this?
Since you did not share your data, I can only guess: you may be hitting an edge case that occurs when you color by a boolean, e.g. a test for a specific value of a variable.
In this case, ggplot groups your geom_path by the result of that test (var == x) and draws one path per value. You can prevent this by adding group = 1.
Basic (somewhat contrived) example
ggplot(mtcars) +
geom_point(aes(mpg, hp)) +
geom_path(aes(mpg, hp))
Above plot with color = cyl == 4
ggplot(mtcars) +
geom_point(aes(mpg, hp)) +
geom_path(aes(mpg, hp, color = cyl == 4))
Above plot with group = 1
ggplot(mtcars) +
geom_point(aes(mpg, hp)) +
geom_path(aes(mpg, hp, color = cyl == 4, group = 1))
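One way to see the effect of group = 1 without looking at the plot is to count the groups ggplot computes for the path layer. The sketch below uses layer_data() from ggplot2; n_groups is a hypothetical helper name introduced here.

```r
library(ggplot2)

# Path coloured by a logical: ggplot derives the grouping from the colour
p_split <- ggplot(mtcars, aes(mpg, hp)) +
  geom_path(aes(colour = cyl == 4))

# Same path, but forced into a single group
p_single <- ggplot(mtcars, aes(mpg, hp)) +
  geom_path(aes(colour = cyl == 4, group = 1))

# n_groups() is a hypothetical helper: number of distinct groups in layer 1
n_groups <- function(p) length(unique(layer_data(p, 1)$group))

groups_split <- n_groups(p_split)
groups_single <- n_groups(p_single)
print(c(split = groups_split, single = groups_single))
```

With two groups, two separate paths are drawn; with one group, a single path is drawn and the colour changes along it.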
If you pass either a single color (not what you want) or a vector of colors with one entry per point, you can get ggplot to color the lines for you. So, for instance,
data <- data.frame(x = 1:10, y = 1:10)
ggplot(data, aes(x=x, y=y)) +
geom_point() +
geom_path(color=rainbow(10))

How to place multiple boxplots in the same column with ggplot(geom_boxplot)

I would like to build a boxplot in which the 4 factors (N1:N4) are overlaid in the same column. For example, with the following data:
Q<-c("C1","C1","C2","C3","C3","C1","C1","C2","C2","C3","C3","Q1","Q1","Q1","Q1","Q3","Q3","Q4","Q4","Q1","Q1","Q1","Q1","Q3","Q3","Q4","Q4")
N<-c("N2","N3","N3","N2","N3","N2","N3","N2","N3","N2","N3","N0","N1","N2","N3","N1","N3","N0","N1","N0","N1","N2","N3","N1","N3","N0","N1")
Value<-c(4.7,8.61,8.34,5.89,8.36,1.76,2.4,5.01,2.12,1.88,3.01,2.4,7.28,4.34,5.39,11.61,10.14,3.02,9.45,8.8,7.4,6.93,8.44,7.37,7.81,6.74,8.5)
df<-data.frame(N=N,Value=Value)
With the following (usual) code, the output is 4 box-plots displayed in 4 columns for the 4 variables:
ggplot(df, aes(x=N, y=Value,color=N)) + theme_bw(base_size = 20)+ geom_boxplot()
many thanks
Updated Answer
Based on your comment, here's a way to add marginal boxplots. We'll use the built-in mtcars data frame.
First, some set-up:
library(ggplot2)
library(cowplot)
# Common theme elements
thm = list(theme_bw(),
guides(colour="none", fill="none"),
theme(plot.margin=unit(rep(0,4),"lines")))
Now, create the three plots:
# Main plot
p1 = ggplot(mtcars, aes(wt, mpg, colour=factor(cyl), fill=factor(cyl))) +
geom_smooth(method="lm") + labs(colour="Cyl", fill="Cyl") +
scale_y_continuous(limits=c(10,35)) +
thm[-2] +
theme(legend.position = c(0.85,0.8))
# Top margin plot
p2 = ggplot(mtcars, aes(factor(cyl), wt, colour=factor(cyl))) +
geom_boxplot() + thm + coord_flip() + labs(x="Cyl", y="")
# Right margin plot
p3 = ggplot(mtcars, aes(factor(cyl), mpg, colour=factor(cyl))) +
geom_boxplot() + thm + labs(x="Cyl", y="") +
scale_y_continuous(limits=c(10,35))
Lay out the plots and add the legend:
plot_grid(plotlist=list(p2, ggplot(), p1, p3), ncol=2,
rel_widths=c(5,1), rel_heights=c(1,5), align="hv")
Original Answer
You can overlay all four boxplots in a single column, but the plot will be unreadable. The first example below removes N as the x coordinate but keeps N as the colour aesthetic. This results in the four levels of N being plotted at a single tick mark (which I've removed by setting breaks to NULL). However, the plots are still dodged. To plot them one on top of the other, set the dodge width to zero, as I've done in the second example.
ggplot(df, aes(x="", y=Value,color=N)) +
theme_bw(base_size = 20) +
geom_boxplot() +
scale_x_discrete(breaks=NULL) +
labs(x="")
ggplot(df, aes(x="", y=Value,color=N)) +
theme_bw(base_size = 20) +
geom_boxplot(position=position_dodge(0)) +
scale_x_discrete(breaks=NULL) +
labs(x="")

plot number of data points in r

I am using the following in R to generate a Boxplot out of a given set of data:
ggplot(data = daten, aes(x=Bodentyp, y=Fracht)) + geom_boxplot(aes(fill=Bewirtschaftungsform))
Now I want to display the number of data points going into each category of the column "Bodentyp". How do I achieve this?
You can use fun.data to apply a function (f) to the grouped data that returns a count (length(y)) and a position for the label (median(y)):
f <- function(y)
c(label=length(y), y=median(y))
library(ggplot2)
data(mtcars)
ggplot(mtcars, aes(x=as.factor(cyl), y=mpg)) +
geom_boxplot() + theme_bw() +
stat_summary(fun.data=f, geom="text", vjust=-0.5, col="blue")
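An alternative is to compute the counts up front and add them with geom_text. The sketch below uses only base R for the summary; the counts data frame and its column names are assumptions made for this example, and the labels are placed at each group's maximum rather than its median.

```r
library(ggplot2)

# Count and label position (group maximum) per cylinder class
n_per_group <- table(mtcars$cyl)
counts <- data.frame(
  cyl = factor(names(n_per_group)),
  n = as.vector(n_per_group),
  y = as.vector(tapply(mtcars$mpg, mtcars$cyl, max))
)
print(counts)

# Boxplot with "n = ..." written just above each box
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  theme_bw() +
  geom_text(data = counts,
            aes(x = cyl, y = y, label = paste0("n = ", n)),
            vjust = -0.5, colour = "blue")
```

Precomputing the table keeps the counts available for inspection, at the cost of an extra data frame that has to stay in sync with the plotted data.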

Plotting means as a line plot onto a scatter plot with ggplot

I have this simple data frame holding three replicates (value) for each factor (CT). I would like to plot it as geom_point and then the means of the points as geom_line.
gene <- c("Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5","Ckap5")
value <- c(0.86443, 0.79032, 0.86517, 0.79782, 0.79439, 0.89221, 0.93071, 0.87170, 0.86488, 0.91133, 0.87202, 0.84028, 0.83242, 0.74016, 0.86656)
CT <- c("ET","ET","ET", "HP","HP","HP","HT","HT","HT", "LT","LT","LT","P","P","P")
# build the data frame directly; cbind() would coerce value to character
df <- data.frame(gene, value, CT = factor(CT))
So, I can make the scatter plot.
ggplot(df, aes(x=CT, y=value)) + geom_point()
How do I get a geom_line representing the means for each factor? I have tried stat_summary:
ggplot(df, aes(x=CT, y=value)) + geom_point() +
stat_summary(aes(y = value,group = CT), fun.y=mean, colour="red", geom="line")
But it does not work.
"geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?"
But each group has three observations, what is wrong?
Ps. I am also interested in a smooth line.
You should set the group aes to 1:
ggplot(df, aes(x=CT, y=value)) + geom_point() +
stat_summary(aes(group = 1), fun = mean, colour = "red", geom = "line") # fun replaces the deprecated fun.y (ggplot2 >= 3.3.0)
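To check that the line really runs through the per-group means, the summarised layer can be pulled out with layer_data() and compared against means computed by hand. This is a sketch under two assumptions: ggplot2 3.3.0 or later (for the fun argument), and a data frame built with data.frame() so that value stays numeric and CT is a factor.

```r
library(ggplot2)

value <- c(0.86443, 0.79032, 0.86517, 0.79782, 0.79439, 0.89221, 0.93071,
           0.87170, 0.86488, 0.91133, 0.87202, 0.84028, 0.83242, 0.74016,
           0.86656)
CT <- rep(c("ET", "HP", "HT", "LT", "P"), each = 3)
df <- data.frame(CT = factor(CT), value = value)

p <- ggplot(df, aes(x = CT, y = value)) +
  geom_point() +
  stat_summary(aes(group = 1), fun = mean, colour = "red", geom = "line")

# Layer 2 is the stat_summary line: one row per level of CT, all in one group
line_data <- layer_data(p, 2)
manual_means <- as.vector(tapply(df$value, df$CT, mean))
print(line_data[, c("x", "y", "group")])
```

With group = CT each level forms its own group containing a single summarised point, which is what triggers the "only one observation" message; group = 1 joins the five means into one line.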
You can use the dplyr package to get the means of each factor.
library(dplyr)
group_means <- df %>%
group_by(CT) %>%
summarise(mean = mean(value))
Then you will need to convert the factors to numeric to let you plot lines on the graph using the geom_segment function. In addition, the scale_x_continuous function will let you set the labels for the x axis.
ggplot(df, aes(x=as.numeric(CT), y=value)) + geom_point() +
geom_segment(aes(x=as.numeric(CT)-0.4, xend=as.numeric(CT)+0.4, y=mean, yend=mean),
data=group_means, colour="red") +
scale_x_continuous("name", labels=levels(df$CT), breaks=seq_along(levels(df$CT)))
Following on from hrbrmstr's comment you can add the smooth line using the following:
ggplot(df, aes(x=as.numeric(CT), y=value, group=1)) + geom_point() +
geom_segment(aes(x=as.numeric(CT)-0.4, xend=as.numeric(CT)+0.4, y=mean, yend=mean),
data=group_means, colour="red") +
scale_x_continuous("name", labels=levels(df$CT), breaks=seq_along(levels(df$CT))) +
geom_smooth()
