geom_point not dodging when geom_errorbar does (R)

I can't figure out how to get these geom_points to properly dodge! I've searched many, MANY how-to's and questions on different stackexchange pages, but none of them fix the problem.
analyze_weighted <- data.frame(
  mus = c(clean_mu,b_mu,d_mu,g_mu,bd_mu,bg_mu,dg_mu,bdg_mu,m_mu),
  sds = c(clean_sigma,b_sigma,d_sigma,g_sigma,bd_sigma,bg_sigma,dg_sigma,bdg_sigma,m_sigma),
  SNR = c("No shifts","1 shift","1 shift","1 shift","2 shifts","2 shifts","2 shifts","3 shifts","4 shifts")
)
And then I try to plot it:
ggplot(analyze_weighted, aes(x=SNR,y=mus,color=SNR,group=mus)) +
  geom_point(position="dodge",na.rm=TRUE) +
  geom_errorbar(position="dodge",aes(ymax=mus+sds/2,ymin=mus-sds/2), width=0.25)
And it manages to dodge the error bars but not the points! I'm going crazy here, what do I do?
Here's what it looks like now--I want the points to be slightly dodged!

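geom_point requires you to state the dodge width explicitly. Error bars have a width that position="dodge" can pick up, but points have no width, so there is nothing for them to dodge by unless you supply one.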
This should work:
ggplot(analyze_weighted, aes(x=SNR,y=mus,color=SNR,group=mus)) +
geom_point(position=position_dodge(width=0.2),na.rm=TRUE) +
geom_errorbar(position=position_dodge(width=0.2),aes(ymax=mus+sds/2,ymin=mus-sds/2),width=0.25)
Please note that your example was not fully reproducible, because the values of the variables used to construct mus and sds were not provided.
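To try the fix without the original values, here is a minimal sketch with made-up numbers. The means and standard deviations are invented purely for illustration; only the column names and the SNR labels come from the question. The dodge is stored in one object so both layers are guaranteed to use the same width.

```r
library(ggplot2)

# Made-up means and standard deviations, for illustration only
analyze_weighted <- data.frame(
  mus = c(1.0, 1.2, 1.3, 1.1, 1.6, 1.5, 1.7, 2.0, 2.4),
  sds = c(0.2, 0.3, 0.25, 0.2, 0.3, 0.35, 0.3, 0.4, 0.5),
  SNR = c("No shifts", "1 shift", "1 shift", "1 shift",
          "2 shifts", "2 shifts", "2 shifts", "3 shifts", "4 shifts")
)

# One shared position object keeps points and error bars aligned
dodge <- position_dodge(width = 0.2)

p <- ggplot(analyze_weighted, aes(x = SNR, y = mus, color = SNR, group = mus)) +
  geom_point(position = dodge) +
  geom_errorbar(aes(ymin = mus - sds / 2, ymax = mus + sds / 2),
                position = dodge, width = 0.25)
print(p)
```

Note that group = mus only separates the points because every mean in this toy data is distinct.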

Related

Why doesn't the 'fill=' argument work in a boxplot in ggplot2?

I am making a boxplot with ggplot2 and want to split each time point into two groups, 'treated' and 'control', so I use 'fill=treatment', but I still get only one box at each time point.
However, when I use 'fill=treatment' in a bar plot, it works.
Can you help me fix this? Thanks a lot!
newcrk10m <- melt(newcrk10,id.vars="time point",variable.name="treatment",
value.name="value")
ggplot(newcrk10m,aes(`time point`,value,fill=treatment))+
geom_bar(stat="identity",position="dodge")+
scale_x_continuous(breaks = seq(0,72,24))
ggplot(newcrk10m,aes(x=`time point`,y=value,
group=`time point`,fill=treatment))+
geom_boxplot(size=0.5)+scale_x_continuous(breaks = seq(0,72,24))
I fixed it: I pasted 'time point' and 'treatment' together and made a new data frame. It works, thanks!
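The likely cause is that an explicit group=`time point` overrides the grouping that fill would otherwise contribute, so each time point yields a single box. A possible alternative to pasting the columns by hand, sketched here on made-up data standing in for newcrk10m, is to group by the interaction of the two variables:

```r
library(ggplot2)

# Made-up long-format data standing in for newcrk10m
set.seed(1)
newcrk10m <- data.frame(
  `time point` = rep(c(0, 24, 48, 72), each = 10),
  treatment = rep(c("treated", "control"), times = 20),
  value = rnorm(40),
  check.names = FALSE  # keep the space in the column name
)

# One group (and therefore one box) per time point x treatment combination
p <- ggplot(newcrk10m, aes(x = `time point`, y = value,
                           group = interaction(`time point`, treatment),
                           fill = treatment)) +
  geom_boxplot() +
  scale_x_continuous(breaks = seq(0, 72, 24))
print(p)
```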

Changing the Order of Levels through ggplot

I am trying to learn the cregg package through the tutorial here. The tutorial works fine, but I run into a problem when I try to change the default behaviour of the functions. When plotting, the levels (and the corresponding coefficient dots in the legend) appear to be ordered alphabetically or numerically. I have tried two ways to change the order to, say, 3, 1, 5, 2, 4: one through ggplot functions, the other by reordering the levels in advance. Neither works. The original code is as follows:
data("immigration")
stacked <- cj(immigration, ChosenImmigrant ~ Gender +
Education + LanguageSkills + CountryOfOrigin + Job + JobExperience +
JobPlans + ReasonForApplication + PriorEntry, id = ~ CaseID,
estimate = "mm", by = ~ contest_no)
plot(stacked, group = "contest_no", feature_headers = FALSE)
My question is how I can change the order of the levels of contest_no, both on the plot and in the legend. One thing I have found is that the order of the levels of contest_no seems to be determined first by the function cj (you can check it with stacked[["contest_no"]]). Thank you!
Thanks to @Tung! (I know I left a similar comment, but I still want to post this as an answer and close the question.) The answer is simple and straightforward; my question already half-contained it, but I did not think it through. Since stacked[["contest_no"]] shows the order of its levels, I just changed that order with stacked[["contest_no"]] <- factor(stacked[["contest_no"]], levels=c(3, 1, 5, 2, 4)) and then plotted the whole stacked object. It works fine.
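The releveling step can be checked in isolation with base R. This is only a sketch: the vector below is a hypothetical stand-in for stacked[["contest_no"]], not output from cj.

```r
# Hypothetical grouping vector standing in for stacked[["contest_no"]]
contest_no <- factor(c(1, 2, 3, 4, 5, 1, 2))
print(levels(contest_no))   # default level order is sorted

# Rebuild the factor with the desired level order; the values are unchanged
reordered <- factor(contest_no, levels = c(3, 1, 5, 2, 4))
print(levels(reordered))
```

Only the level order changes, which is what plotting functions use to order groups and legend entries.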

How to use a bubbleplot in ggplot2/R to deal with overplotting

I have a plot of categorical variables as below:
http://i.imgur.com/d1hJP21.png
This is a very small subset of the actual data (n > 10000)
While jittering handles the overplotting, it is ugly and can lead to ambiguity. I was keen to instead place bubbles to show the number of points that are co-incident.
I can't seem to find a simple and repeatable way to do this.
Thank you in advance!
Edit:
Thanks for the feedback. Here is what I hope is a reproducible example:
First, a CSV of the data (long, but relevant in this example):
ID,g,wf,fi
1824848,14,2,4
1314001,14,2,3
670960,14,1,3
1313235,15,3,4
1172304,3,5,4
1859973,15,1,3
1826951,14,1,4
1868238,15,1,2
1911869,15,1,4
1911861,15,1,2
926829,14,1,3
1609578,3,4,4
1306895,3,5,4
1199557,15,1,4
692849,10,3,4
1923352,3,5,4
1881724,4,4,4
1384603,3,5,4
1928829,15,1,4
493503,3,5,4
902650,15,1,3
1887582,6,4,4
1887584,3,5,4
1933992,13,1,4
635372,3,3,4
1892765,15,1,2
1934773,13,2,4
1892530,14,2,4
936786,3,5,4
1897585,13,3,4
1895932,15,1,3
422785,15,1,3
1219573,8,1,4
1897817,3,2,4
1899612,14,3,4
1939157,15,1,4
1952043,14,1,3
1938048,14,1,3
1896607,15,1,2
1941385,15,1,3
1959437,3,5,4
1064010,15,1,3
1951600,13,3,4
541439,15,1,4
1938609,3,5,4
1958667,15,1,2
1943792,10,1,4
1943782,14,1,4
1893714,14,1,4
1335502,15,1,1
1950179,3,2,4
1959069,15,1,2
1958811,15,1,2
1958808,15,3,4
1959878,15,1,1
1949904,15,1,3
1961475,15,1,4
1876863,15,1,4
384705,15,1,3
1966338,15,1,4
1980290,3,4,4
1966997,15,2,4
1967107,15,1,1
1976077,15,1,2
1967579,11,1,4
1967387,4,2,4
1973408,3,3,4
1684881,3,3,3
...and the plot code:
sx <- ggplot(dx, aes(x=fi, y=wf)) +
geom_point(shape=19, alpha=1, size=1, position=position_jitter(width=0.1,height=.1))
print(sx)
I really don't know where to go from here, other than manually making a count matrix...
Thanks again (sorry, new to stackoverflow).
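One possible way to avoid the manual count matrix, offered as a sketch rather than a definitive answer: count the coincident (fi, wf) pairs with aggregate() and map the count to point size. The data frame below holds only the first ten rows of the CSV above; the helper column n and the name counts are introduced here for illustration.

```r
library(ggplot2)

# First ten rows of the CSV above (fi and wf columns only)
dx <- data.frame(
  fi = c(4, 3, 3, 4, 4, 3, 4, 2, 4, 2),
  wf = c(2, 2, 1, 3, 5, 1, 1, 1, 1, 1)
)

# Number of observations at each (fi, wf) location
counts <- aggregate(n ~ fi + wf, data = transform(dx, n = 1), FUN = sum)

# Bubble area proportional to the number of coincident points
sx <- ggplot(counts, aes(x = fi, y = wf, size = n)) +
  geom_point(shape = 19) +
  scale_size_area()
print(sx)
```

Recent ggplot2 versions also provide geom_count(), which does the counting internally when applied to the raw data.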

ggplot2 & stat_ellipse: Draw ellipses around multiple groups of points

This might be a simple one, but I'm trying to draw ellipses around my treatments on my PCoA plot.
My data frame (sc) is:
MDS1 MDS2 Treatment
X1xF1 -0.19736183 -0.24299825 1xFlood
X1xF2 -0.17409568 -0.29727596 1xFlood
X1xF3 -0.15272444 -0.28553837 1xFlood
S1 -0.06643271 0.47049959 Start
S2 -0.15143350 0.31152966 Start
S3 -0.26156297 0.12296849 Start
X3xF1 0.29840827 0.04581617 3xFloods
X3xF2 0.50503749 -0.07011503 3xFloods
X3xF3 0.20016537 -0.05488630 3xFloods
and my code is:
ggplot(data=sc,(aes(x=MDS1,y=MDS2,colour = Treatment)))+geom_point(size=3)+
  ggtitle(paste("PCoA of samples at 'class' level(method='Bray')\n",sep=''))+
  theme_bw()+guides(colour = guide_legend(override.aes = list(size=3)))+
  stat_ellipse()
It plots the PCoA okay up until stat_ellipse(). I've tried it with various parameters and at best I can get one ellipse for the whole plot (although I can't seem to reproduce that now).
What I'm after is three CI ellipses for the three treatments, coloured the same as the treatments. Any help would be very appreciated!
Thanks.
At the time of writing there is no stat_ellipse(...) in the ggplot2 package (it was added later, in ggplot2 1.0.0), so you must have retrieved it from somewhere else. Care to share? There are at least two versions that I am aware of, here and here. Neither of these seems to work with your dataset, which is odd because both have worked with other datasets.
I finally fell back on the option of generating the ellipses externally to ggplot, which is not that difficult really.
library(ggplot2)
library(ellipse)
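# centroid (mean MDS1, MDS2) of each treatment
centroids <- aggregate(cbind(MDS1,MDS2)~Treatment,sc,mean)
# one 95% confidence ellipse per treatment, centred on that treatment's centroid
conf.rgn <- do.call(rbind,lapply(unique(sc$Treatment),function(t)
  data.frame(Treatment=as.character(t),
             ellipse(cov(sc[sc$Treatment==t,1:2]),
                     # look the centroid up by value so it works for factor or character Treatment
                     centre=as.matrix(centroids[centroids$Treatment==t,2:3]),
                     level=0.95),
             stringsAsFactors=FALSE)))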
ggplot(data=sc,(aes(x=MDS1,y=MDS2,colour = Treatment)))+
geom_point(size=3)+
geom_path(data=conf.rgn)+
ggtitle(paste("PCoA of samples at 'class' level(method='Bray')\n",sep=''))+
theme_bw()+
guides(colour = guide_legend(override.aes = list(size=3)))

Weird ggplot2 error: Empty raster

Why does
ggplot(data.frame(x=c(1,2),y=c(1,2),z=c(1.5,1.5)),aes(x=x,y=y,color=z)) +
geom_point()
give me the error
Error in grid.Call.graphics(L_raster, x$raster, x$x, x$y, x$width, x$height, : Empty raster
but the following two plots work
ggplot(data.frame(x=c(1,2),y=c(1,2),z=c(2.5,2.5)),aes(x=x,y=y,color=z)) +
geom_point()
ggplot(data.frame(x=c(1,2),y=c(1,2),z=c(1.5,2.5)),aes(x=x,y=y,color=z)) +
geom_point()
I'm using ggplot2 0.9.3.1
TL;DR: Check your data -- do you really want to use a continuous color scale with only one possible value for the color?
The error does not occur if you add + scale_color_continuous(guide = FALSE) to the plot. (This turns off the legend.)
ggplot(data.frame(x=c(1,2), y=c(1,2), z=c(1.5,1.5)), aes(x=x,y=y,color=z)) +
geom_point() + scale_color_continuous(guide = FALSE)
The error seems to be triggered in cases where a continuous color scale uses only one color. The current GitHub version already includes the relevant pull request. Install it via:
devtools::install_github("hadley/ggplot2")
But more probably there is an issue with the data: why would you use a continuous color scale with only one value?
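A quick way to check for this situation before plotting, as a sketch: test whether the color variable has a single distinct value and, if so, treat it as discrete instead of continuous. The variable name single_value is introduced here for illustration.

```r
library(ggplot2)

df <- data.frame(x = c(1, 2), y = c(1, 2), z = c(1.5, 1.5))

# TRUE when z holds one distinct value, so a gradient has nothing to span
single_value <- length(unique(df$z)) == 1

p <- if (single_value) {
  # a discrete scale handles a single value without needing a color bar
  ggplot(df, aes(x = x, y = y, color = factor(z))) + geom_point() + labs(color = "z")
} else {
  ggplot(df, aes(x = x, y = y, color = z)) + geom_point()
}
print(p)
```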
I got the same behaviour (i.e. the "Empty raster" error) with a value other than 1.5.
Try the following:
ggplot(data.frame(x=c(1,2),y=c(1,2),z=c(0.02,0.02)),aes(x=x,y=y,color=z)) +
  geom_point()
You get the same error again (tried with both the 0.9.3.1 and 1.0.0.0 versions), so it looks like a nasty and weird bug.
This definitely sounds like an edge case better suited for a bug report, as others have mentioned, but here's some generalizable code that might be useful to somebody as a clunky workaround or for handling labels/colors. It plots a rescaled variable and uses the real values as labels.
require(scales)
z <- c(1.5,1.5)
# rescale z to 0:1
z_rescaled <- rescale(z)
# customizable number of breaks in the legend
max_breaks_cnt <- 5
# break z and z_rescaled by quantiles determined by number of maximum breaks
# and use 'unique' to remove duplicate breaks
breaks_z <- unique(as.vector(quantile(z, seq(0,1,by=1/max_breaks_cnt))))
breaks_z_rescaled <- unique(as.vector(quantile(z_rescaled, seq(0,1,by=1/max_breaks_cnt))))
# make a color palette
Pal <- colorRampPalette(c('yellow','orange','red'))(500)
# plot z_rescaled with breaks_z used as labels
ggplot(data.frame(x=c(1,2),y=c(1,2),z_rescaled),aes(x=x,y=y,color=z_rescaled)) +
geom_point() + scale_colour_gradientn("z",colours=Pal,labels = breaks_z,breaks=breaks_z_rescaled)
This is quite off-topic but I like to use rescaling to send tons of changing variables to a function like this:
colorfunction <- gradient_n_pal(colours = colorRampPalette(c('yellow','orange','red'))(500),
values = c(0:1), space = "Lab")
colorfunction(z_rescaled)
