R - ggplot geom_dotplot shape option - r

I want to use geom_dotplot to distinguish two different variables by shape of the dots (rather than colours as the documentation suggests). For example:
library(ggplot2)
set.seed(1)
x = rnorm(20)
y = rnorm(20)
df = data.frame(x,y)
ggplot(data = df) +
geom_dotplot(aes(x = x), fill = "red") +
geom_dotplot(aes(x=y), fill = "blue")
That is, I want to distinguish x and y in the example above: all the x values should be drawn as dots and all the y values as triangles.
Is this possible?
Thanks!

You could probably hack together something similar to what you want using the information from geom_dotplot plus base R's stripchart function.
#Save the dot plot in an object.
dotplot <- ggplot(data = df) +
geom_dotplot(aes(x = x), fill = "red") +
geom_dotplot(aes(x=y), fill = "blue")
#Use ggplot_build to save information including the x values.
dotplot_ggbuild <- ggplot_build(dotplot)
main_info_from_ggbuild_x <- dotplot_ggbuild$data[[1]]
main_info_from_ggbuild_y <- dotplot_ggbuild$data[[2]]
#Include only the first occurrence of each x value.
main_info_from_ggbuild_x <-
  main_info_from_ggbuild_x[!duplicated(main_info_from_ggbuild_x$x), ]
main_info_from_ggbuild_y <-
  main_info_from_ggbuild_y[!duplicated(main_info_from_ggbuild_y$x), ]
#To demonstrate, let's first roughly reproduce the original plot.
stripchart(rep(main_info_from_ggbuild_x$x,
times=main_info_from_ggbuild_x$count),
pch=19,cex=2,method="stack",at=0,col="red")
stripchart(rep(main_info_from_ggbuild_y$x,
times=main_info_from_ggbuild_y$count),
pch=19,cex=2,method="stack",at=0,col="blue",add=TRUE)
#Now, redo using what we actually want.
#You didn't specify if you want the circles and triangles filled or not.
#If you want them filled in, just change the pch values.
stripchart(rep(main_info_from_ggbuild_x$x,
times=main_info_from_ggbuild_x$count),
pch=21,cex=2,method="stack",at=0)
stripchart(rep(main_info_from_ggbuild_y$x,
times=main_info_from_ggbuild_y$count),
pch=24,cex=2,method="stack",at=0,add=TRUE)
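For comparison, here is a rough ggplot2-only variation. It does not use geom_dotplot at all: it bins the values by hand, stacks them, and draws them with geom_point so that the shape aesthetic can be mapped. The bin width of 0.25 and the column names (variable, value, bin, stack) are assumptions made for illustration, and the stacking is cruder than geom_dotplot's dot-density binning.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(20), y = rnorm(20))

# Long format: one row per observation, with a column naming its source.
long <- data.frame(
  variable = rep(c("x", "y"), each = nrow(df)),
  value    = c(df$x, df$y)
)

# Assumed bin width; snap every value to the centre of its bin.
binwidth <- 0.25
long$bin <- round(long$value / binwidth) * binwidth

# Stack position within each bin (1, 2, 3, ...), counted across both
# variables so that circles and triangles never land on the same spot.
long$stack <- ave(long$value, long$bin, FUN = seq_along)

p <- ggplot(long, aes(x = bin, y = stack, shape = variable)) +
  geom_point(size = 3) +
  scale_shape_manual(values = c(x = 16, y = 17))  # 16 = circle, 17 = triangle
p
```

Because the stack index is counted across both variables, the two shapes share one pile per bin instead of being overplotted as in the two-layer geom_dotplot version.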

Related

R: plotting a line and horizontal barplot on the same plot

I am trying to combine a line plot and horizontal barplot on the same plot. The difficult part is that the barplot is actually counts of the y values of the line plot.
Can someone show me how this can be done using the example below ?
library(ggplot2)
library(plyr)
x <- c(1:100)
dff <- data.frame(x = x,y1 = sample(-500:500,size=length(x),replace=T), y2 = sample(3:20,size=length(x),replace=T))
counts <- ddply(dff, ~ y1, summarize, y2 = sum(y2))
# line plot
ggplot(data=dff) + geom_line(aes(x=x,y=y1))
# bar plot
ggplot() + geom_bar(data=counts,aes(x=y1,y=y2),stat="identity")
I believe what I need is presented in the pseudocode below but I do not know how to write it out in R.
Apologies. I actually meant the secondary x axis representing the value of counts for the barplot, while primary y-axis is the y1.
ggplot(data=dff) + geom_line(aes(x=x,y=y1)) + geom_bar(data=counts , aes(primary y axis = y1,secondary x axis =y2),stat="identity")
I just want the bars to be plotted horizontally, so I tried the code below, but it flips both the line chart and the barplot, which is also not what I wanted.
ggplot(data=dff) +
geom_line(aes(x=x,y=y1)) +
geom_bar(data=counts,aes(x=y2,y=y1),stat="identity") + coord_flip()
You can combine two plots in ggplot like you want by specifying different data = arguments in each geom_ layer (and none in the original ggplot() call).
ggplot() +
geom_line(data=dff, aes(x=x,y=y1)) +
geom_bar(data=counts,aes(x=y1,y=y2),stat="identity")
This draws both layers on one plot. However, since x and y1 have different ranges, are you sure this is what you want?
Perhaps you want y1 on the vertical axis for both plots. Something like this works:
ggplot() +
geom_line(data=dff, aes(x=y1 ,y = x)) +
geom_bar(data=counts,aes(x=y1,y=y2),stat="identity", color = "red") +
coord_flip()
Maybe you are looking for this. Based on your last code, you are looking for a double axis. Using dplyr, you can store the counts in the same data frame and then plot all the variables. Here is the code:
library(ggplot2)
library(dplyr)
#Data
x <- c(1:100)
dff <- data.frame(x = x,y1 = sample(-500:500,size=length(x),replace=T), y2 = sample(3:20,size=length(x),replace=T))
#Code
dff %>% group_by(y1) %>% mutate(Counts=sum(y2)) -> dff2
#Scale factor
sf <- max(dff2$y1)/max(dff2$Counts)
# Plot
ggplot(data=dff2)+
geom_line(aes(x=x,y=y1),color='blue',size=1)+
geom_bar(stat='identity',aes(x=x,y=Counts*sf),fill='tomato',color='black')+
scale_y_continuous(name="y1", sec.axis = sec_axis(~./sf, name="Counts"))
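The scale factor is the only non-obvious part of the code above, so here is a small sketch of the arithmetic behind it. The numbers (y1_max, counts) are hypothetical and chosen only to make the round trip easy to follow: the bars are drawn in primary-axis units after multiplying by sf, and the secondary axis divides by sf to print the original counts.

```r
# Hypothetical values, chosen only to illustrate the arithmetic.
y1_max <- 500                 # largest value on the primary axis
counts <- c(3, 12, 20)        # values meant for the secondary axis

sf <- y1_max / max(counts)    # same formula as sf above; here 500 / 20 = 25

scaled    <- counts * sf      # heights actually drawn by geom_bar: 75 300 500
recovered <- scaled / sf      # labels produced by sec_axis(~ . / sf): 3 12 20

print(scaled)
print(recovered)
```

The largest count is stretched to the top of the primary axis, which is why the bars and the line fill the same vertical space.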

how to combine in ggplot line / points with special values?

I'm quite new to ggplot but I like the systematic way you build your plots. Still, I'm struggling to achieve the desired results. I can replicate plots where you have categorical data. However, for my use I often need to fit a model to certain observations and then highlight them in a combined plot. With the usual plot function I would do:
library(splines)
set.seed(10)
x <- seq(-1,1,0.01)
y <- x^2
s <- interpSpline(x,y)
y <- y+rnorm(length(y),mean=0,sd=0.1)
plot(x,predict(s,x)$y,type="l",col="black",xlab="x",ylab="y")
points(x,y,col="red",pch=4)
points(0,0,col="blue",pch=1)
legend("top",legend=c("True Values","Model values","Special Value"),text.col=c("red","black","blue"),lty=c(NA,1,NA),pch=c(4,NA,1),col=c("red","black","blue"),cex = 0.7)
My biggest problem is how to build the data frame for ggplot which automatically then draws the legend? In this example, how would I translate this into ggplot to get a similar plot? Or is ggplot not made for this kind of plots?
Note that this is just a toy example. Usually the model values are derived from a more complex model, in case you wanted to suggest using a stat in ggplot.
The key part here is that you can map colors in aes by giving a string, which will produce a legend. In this case, there is no need to include the special value in the data.frame.
df <- data.frame(x = x, y = y, fit = predict(s, x)$y)
ggplot(df, aes(x, y)) +
geom_line(aes(y = fit, col = 'Model values')) +
geom_point(aes(col = 'True values')) +
geom_point(aes(col = 'Special value'), x = 0, y = 0) +
scale_color_manual(values = c('True values' = "red",
'Special value' = "blue",
'Model values' = "black"))
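A possible variation, sketched under the assumption that the special value is kept in its own one-row data frame (the name special is hypothetical): passing that data frame to the layer draws exactly one point, and the shapes can be set to match the base-graphics version. The legend still comes from the constant strings mapped to col inside aes.

```r
library(ggplot2)
library(splines)

set.seed(10)
x <- seq(-1, 1, 0.01)
y <- x^2
s <- interpSpline(x, y)
y <- y + rnorm(length(y), mean = 0, sd = 0.1)

df <- data.frame(x = x, y = y, fit = predict(s, x)$y)
special <- data.frame(x = 0, y = 0)  # hypothetical one-row data frame

p <- ggplot(df, aes(x, y)) +
  geom_line(aes(y = fit, col = "Model values")) +
  geom_point(aes(col = "True values"), shape = 4) +
  # This layer uses its own data, so only one point is drawn.
  geom_point(data = special, aes(col = "Special value"), shape = 1, size = 3) +
  scale_color_manual(values = c("True values"   = "red",
                                "Special value" = "blue",
                                "Model values"  = "black"))
p
```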

Highlight specific dot in ggplot2

I am trying to plot a scatterplot using ggplot2 in R. I have data as follows in csv format
A B
-4.051587034 -2.388276692
-4.389339837 -3.742321425
-4.047207557 -3.460923901
-4.458420756 -2.462180905
-2.12090412 -2.251811973
I want to highlight the two specific dots that correspond to -2.462180905 and -3.742321425 by giving them a colour different from the default colour in the plot. I tried the following code:
library(ggplot2)
library(reshape2)
library(methods)
library(RSvgDevice)
Data<-read.csv("table.csv",header=TRUE,sep=",")
data1<-Data[,-3]
plot2<-ggplot(data1,aes(x = A, y = B)) + geom_point(aes(size=2,color=ifelse(y=-2.462180905,'red')))
graph<-plot2 + theme_bw()+opts(axis.line = theme_segment(colour = "black"),panel.grid.major=theme_blank(),panel.grid.minor=theme_blank(),panel.border = theme_blank())
ggsave(graph,file="figure.svg",height=6,width=7)
It is not working the way I want: it draws all the dots in the same color. Can anybody help?
Another way, which may be more or less efficient depending on your requirements, would be to add another geom_point():
x <- c(-4.051587034, -4.389339837, -4.047207557, -4.458420756, -2.12090412)
y <- c(-2.388276692, -3.742321425, -3.460923901, -2.462180905, -2.251811973)
d <- data.frame(x, y)
require("ggplot2")
h <- c(2, 4) # put row numbers in here or use condition
ggplot() +
geom_point(data = d, aes(x, y), colour = "red", size = 5) +
geom_point(data = d[h, ], aes(x, y), colour = "blue", size = 5)
# notice the colour is outside the aesthetic arguments
Add a different column with the same value for all points except the highlighted point, assign the colour aesthetic to that column, then change the colours manually.
data1$highlight <- data1$B == -2.462180905 # FALSE except for the one you want
ggplot(data1, aes(x = A, y = B)) +
geom_point(aes(colour = highlight), size = 2) +
scale_colour_manual(values = c("FALSE" = "black", "TRUE" = "red"))
Note that the condition in the first line will have to be exact in order to get TRUE at the right row. Either ensure the value is exact or use a condition that will match the desired row.
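One way to avoid depending on exact floating-point equality is to compare within a small tolerance. The sketch below assumes a tolerance of 1e-8 (the names targets and tol are made up for illustration) and flags both values mentioned in the question.

```r
data1 <- data.frame(
  A = c(-4.051587034, -4.389339837, -4.047207557, -4.458420756, -2.12090412),
  B = c(-2.388276692, -3.742321425, -3.460923901, -2.462180905, -2.251811973)
)

targets <- c(-2.462180905, -3.742321425)  # values of B to highlight
tol <- 1e-8                               # assumed tolerance

# TRUE when a row's B is within tol of any target value.
data1$highlight <- vapply(
  data1$B,
  function(b) any(abs(b - targets) < tol),
  logical(1)
)

print(data1)
```

The resulting highlight column can be mapped to colour exactly as in the code above.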
Also note that opts is deprecated. Use theme instead. But that's another question.
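As a rough sketch of what the opts() call from the question might look like with theme(), assuming a ggplot2 version where element_line() and element_blank() replace theme_segment() and theme_blank():

```r
library(ggplot2)

data1 <- data.frame(
  A = c(-4.051587034, -4.389339837, -4.047207557, -4.458420756, -2.12090412),
  B = c(-2.388276692, -3.742321425, -3.460923901, -2.462180905, -2.251811973)
)

graph <- ggplot(data1, aes(x = A, y = B)) +
  geom_point(size = 2) +
  theme_bw() +
  theme(
    axis.line        = element_line(colour = "black"),  # was theme_segment()
    panel.grid.major = element_blank(),                 # was theme_blank()
    panel.grid.minor = element_blank(),
    panel.border     = element_blank()
  )
graph
```

Note that size is set outside aes() here, since it is a constant rather than a mapping to a column.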

Need to arrange several plots (3*2) in a single page (grid.arrange) (With Unified X axis) and shifted y axis

I am trying to facet six plots on a single page using ggplot, in a 3*2 grid.
I also want to move the y label from left to right for all the plots in the second column.
I am trying to have only one scale for all six plots combined.
How can I have a single x axis for the first and second columns of plots?
I have a function like below...
plot_4 <- function( for (i in 1:7) { var[i] = readline("enter the variable name \n")\n) {
# var [i] doesnt work, how to get multiple inputs in this case ?
#Next I am doing aggregation, and its fine
gg4 <- aggregate(cbind(get(var1),get(var2),get(var3),get(var4),get(var5),get(var6))~Ei+Mi+hours,a, FUN=mean)
#For future calling purpose I am using below vector to store the variable names
myvars<-c(var1,var2,var3,var4,var5,var6)
names(gg4)[4] <- var1
names(gg4)[5] <- var2
names(gg4)[6] <- var3
names(gg4)[7] <- var4
names(gg4)[8] <- var5
names(gg4)[9] <- var6
# Plotting for each variable
plot_exp <- function(i,plotvar) {
dat <- subset(gg4, var = myvars[i] ) # myvars [i] doesnt work, I dont know how to choose each #variable separately for each iterations here in R.
ggplot(gg4,aes_string(x="hours", y=eval(get(var)), fill = "Mi")) +
geom_point(aes(color= "Mi"), size = 3) +
geom_smooth(stat= "smooth" , alpha = I(0.01), method="loess", color = "blue") +
facet_grid(myvars~.)
}
#Arranging in grid
ll <- do.call(Map, c(plot_exp, expand.grid(i=1:6, plotvar=myvars, stringsAsFactors=F)))
do.call(grid.arrange, ll)
}
This is not working.
Please help me fix the errors in the above code and make the changes I described at the beginning.
I have made a few rearrangements to get the graph close to what I wanted (except for the y axis on the right, but that's OK), but now I am not able to edit the legend title.
ggplot(dd,aes(x=hours, y=value)) +
geom_point(aes(color=factor(Mi)), size = 4, alpha = I (0.5)) +
geom_smooth(stat= "smooth" , alpha = I(0.01), method="loess", color = "blue") +
facet_wrap(~variable, nrow=3, ncol=2,scales = "free_y")
The legend title currently shows factor(Mi). How can I edit the legend title? I don't want factor(Mi) as its name. Please let me know.
Thanks in advance.
I have tried scale_fill_continuous(guide = guide_legend(title = "V")) and scale_fill_continuous(name = "V"), but neither works here.
Rather than looping and calling ggplot multiple times, it's better to melt your data into one data.frame and then plot all at the same time using facets to arrange your plots. This will automatically help coordinate the x-axes.
First, let's start at the point where you have myvars. For example
myvars <- c("Nphy", "Cphy", "CHLphy", "Nhet", "Chet", "Ndet")
Now we can extract those values from gg4 and "melt" them so that each observation is on its own row (melt() is available in, for example, the reshape2 package).
dd <- melt(gg4,id.vars=c("Ei","Mi","hours"), measure.vars=myvars)
Now we can make all the plots at once with
ggplot(dd,aes(x=hours, y=value, fill = Mi)) +
geom_point(aes(color=Mi), size = 3) +
geom_smooth(stat= "smooth" , alpha = I(0.01), method="loess", color = "blue") +
facet_wrap(~variable, nrow=3, ncol=2)
And that's it!
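On the legend title: in the plot from the question, the legend comes from the colour aesthetic (color = factor(Mi)), so a fill scale would not be expected to change it. A minimal sketch, using made-up data in the same long shape as dd (the values are random and purely illustrative), sets the title with labs():

```r
library(ggplot2)

set.seed(42)
myvars <- c("Nphy", "Cphy", "CHLphy", "Nhet", "Chet", "Ndet")

# Hypothetical stand-in for the melted data frame dd.
dd <- expand.grid(hours = 1:10, Mi = c(1, 2), variable = myvars)
dd$value <- runif(nrow(dd))

p <- ggplot(dd, aes(x = hours, y = value)) +
  geom_point(aes(color = factor(Mi)), size = 4, alpha = 0.5) +
  facet_wrap(~variable, nrow = 3, ncol = 2, scales = "free_y") +
  labs(color = "Mi")  # title of the colour legend
p
```

scale_color_discrete(name = "Mi") should have the same effect, since factor(Mi) is a discrete variable mapped to colour.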

ggplot2: Reading maximum bar height from plot object containing geom_histogram

Like this previous poster, I am also using geom_text to annotate plots in ggplot2. And I want to position those annotations in relative coordinates (proportion of facet H & W) rather than data coordinates. Easy enough for most plots, but in my case I'm dealing with histograms. I'm sure the relevant information as to the y scale must be lurking in the plot object somewhere (after adding geom_histogram), but I don't see where.
My question: How do I read maximum bar height from a faceted ggplot2 object containing geom_histogram? Can anyone help?
Try this:
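library(ggplot2)
library(plyr)
library(scales)
p <- ggplot(mtcars, aes(mpg)) + geom_histogram(aes(y = ..density..)) + facet_wrap(~am)
# ggplot_build() returns the computed layer data without relying on print()'s return value
r <- ggplot_build(p)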
# in data coordinate
(dc <- dlply(r$data[[1]], .(PANEL), function(x) max(x$density)))
(mx <- dlply(r$data[[1]], .(PANEL), function(x) x[which.max(x$density), ]$x))
# add annotation (see figure below)
p + geom_text(aes(x, y, label = text),
data = data.frame(x = unlist(mx), y = unlist(dc), text = LETTERS[1:2], am = 0:1),
colour = "red", vjust = 0)
# scale range
(yr <- llply(r$panel$ranges, "[[", "y.range"))
# in relative coordinates
(rc <- mapply(function(d, y) rescale(d, from = y), dc, yr))
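The code above was written against an older ggplot2. As a hedged sketch for more recent releases, assuming layer_data() and after_stat() are available, the per-panel maximum bar height and its x position can be read without plyr. The choice of bins = 30 is an assumption that only makes the default explicit.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30) +
  facet_wrap(~am)

# Computed histogram bars for layer 1, one row per bar, with a PANEL column.
bars <- layer_data(p, 1)

# Tallest bar in each facet, in data coordinates.
max_height <- tapply(bars$density, bars$PANEL, max)

# x position (bin centre) of that tallest bar in each facet.
x_at_max <- sapply(split(bars, bars$PANEL),
                   function(d) d$x[which.max(d$density)])

print(max_height)
print(x_at_max)
```

Converting these heights to relative coordinates still needs each panel's y range, as in the rescale() step above.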
