I've started producing the charts for a paper. For the bar charts I've used the "Pastel1" palette (the book on ggplot2 recommends pastel colours over saturated ones for fill areas such as bars).
The problem with Pastel1 at least is that when printed on a B&W laser printer, the colours are indistinguishable. I don't know if the readers will view the paper on screen or will print it on B&W, so I'm looking for either of the following:
how to add hash lines to a palette such as Pastel1 (hopefully the hash lines are also subtle)
a colour palette easy on the eyes that also produces distinct grey areas for B&W for, say, up to 3-4 different colours.
Granted, I could find the latter by experimenting and using up toner, but I suppose this is a common problem that has already been solved. And yes, I did google for this, but didn't find anything pertinent.
Thank you.
Use http://colorbrewer2.org/ and only show colour schemes that are printer friendly.
Also see scale_fill_grey.
Currently it's not possible to use hash lines due to a limitation in the underlying grid drawing package.
There is the col2grey function in the TeachingDemos package that will convert a set of colors to an approximation of the grey color that will result from printing. You can use this to try different palettes without wasting toner/paper.
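The same kind of check can be roughed out in base R. This is only a sketch under stated assumptions, not the TeachingDemos implementation: the helper names `to_grey` and `spread` are hypothetical, the 0.30/0.59/0.11 weights are the common luma approximation rather than a model of any particular printer, and the four hex codes are the first four Pastel1 colours as listed by ColorBrewer.

```r
# Hypothetical helper: approximate the grey a colour prints as,
# using the common luma weights (an assumption, not a printer model).
to_grey <- function(cols) {
  rgb_mat <- grDevices::col2rgb(cols)                         # 3 x n matrix, 0-255
  luma <- as.vector(round(c(0.30, 0.59, 0.11) %*% rgb_mat))   # one value per colour
  grDevices::rgb(luma, luma, luma, maxColorValue = 255)
}

# Hypothetical helper: distance between the darkest and lightest grey level
spread <- function(cols) diff(range(grDevices::col2rgb(cols)[1, ]))

# First four Pastel1 colours (assumed values, taken from ColorBrewer)
pastel      <- c("#FBB4AE", "#B3CDE3", "#CCEBC5", "#DECBE4")
pastel_grey <- to_grey(pastel)

# A base R grey palette for comparison
printable <- grDevices::grey.colors(4, start = 0.3, end = 0.9)

# Draw both sets side by side when run interactively
if (interactive()) {
  barplot(rep(1, 8), col = c(pastel_grey, printable), axes = FALSE)
}
```

Comparing `spread(pastel_grey)` with `spread(printable)` gives a quick number for how far apart the bars will be in print; a small spread suggests the fills will be hard to tell apart.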
Use this to select another color combination (gray scale option included)
Often when I am quickly doing some plots to get a feel for my data, I don't wish to agonize over exactly which colors my color map should have. I just want something "good enough" that I can call with a single short command, and move on.
On this page, I found some basic color palettes:
rainbow(n)
heat.colors(n)
terrain.colors(n)
topo.colors(n)
cm.colors(n)
This is sort of what I want, except there isn't enough of it. terrain and topo are specialized for maps, rainbow is notoriously bad, cm is for data roughly centered around 0, and heat doesn't look good for data spanning [0, 1].
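To judge these five side by side, one option is to draw each as a strip of colours. The sketch below uses only base R; the helper names `base_palettes` and `preview_palettes` are hypothetical, and 16 colours per palette is an arbitrary choice.

```r
# Collect the five base palettes named above, each evaluated at n colours
base_palettes <- function(n) {
  list(
    rainbow        = rainbow(n),
    heat.colors    = heat.colors(n),
    terrain.colors = terrain.colors(n),
    topo.colors    = topo.colors(n),
    cm.colors      = cm.colors(n)
  )
}

# Hypothetical helper: draw one horizontal colour strip per palette
preview_palettes <- function(pals) {
  old <- par(mfrow = c(length(pals), 1), mar = c(0.5, 8, 0.5, 0.5))
  on.exit(par(old))
  for (name in names(pals)) {
    # Bars of equal height with no gaps form a continuous strip
    barplot(rep(1, length(pals[[name]])), col = pals[[name]],
            border = NA, space = 0, axes = FALSE)
    mtext(name, side = 2, las = 1, line = 1)
  }
}

pals <- base_palettes(16)
if (interactive()) preview_palettes(pals)
```

Any other function that returns a character vector of colours can be added to the list and previewed the same way.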
Are there more colormaps? Is there a site or document somewhere showing a list of all the colormaps from different packages so I can just look through it and pick whichever one looks best?
How are you plotting your graphs?
I use ggplot2 a lot, and its default colour keys are pretty good. It also has very good tools for mapping variables to colour, and it works well with RColorBrewer, scales, and ggthemes.
Information on ggplot2 colours: http://www.cookbook-r.com/Graphs/Colors_(ggplot2)/
More information on ggthemes: http://cran.r-project.org/web/packages/ggthemes/vignettes/ggthemes.html
I'm trying to use the stat_binhex() in ggplot2 to drop hex tiles on a plot, and the automatic settings vary the color of the bins, depending on count. That is, all the hexes are the same size, but have different colors.
I want to vary the size of the hex symbol itself, so that some are bigger than others, and I also want to vary the color based on a third variable. I read through the documentation of ggplot2 and couldn't find any way to do this. The *hexbin* package has an option like this (lattice), but its plot() functions are maddening, so I was hoping to stay in ggplot2. Any other suggestions would be extremely helpful as well.
If you know Kirk Goldsberry's NBA shot charts on Grantland, that's very similar to what I'd like to accomplish with my dataset.
A while ago, I asked "How to change Lattice graphics default groups colors?", and got a helpful response from BenBarnes. This allowed me to define more than 7 cycling colors for different data in the same plot in R's Lattice package, which I did. However, I found that it's difficult to define more than 9, maybe 10, colors that (a) are not hard to see on a white background, and (b) do not include pairs of colors that look very similar. (That might be why seven colors is Lattice's default.) It occurs to me, though, that there are people out there who are much better at managing colors in information display than I am, and that maybe someone has already defined a good list of 10, 12, maybe even 15 colors for display of data in the same plot. Does anybody know of such a list? Any color specification that I can convert into a Lattice format would work. If it's already been done in Lattice, even better! (Is there a better place to ask this question?)
There's a large body of work on choosing colors. Check out the RColorBrewer and colorspace packages as a starting point. In the documentation for colorspace there is a link to an excellent paper (and the vignette summarizes much of the paper). And think about your color blind colleagues, with dichromat.
In general, I think it is very difficult to pick a large set of colors that don't end up being hard to distinguish from one another. When I am looking for a large number (>8) of colors that I want to be noticeably distinct and aesthetically pleasing, I usually use the rich.colors palette in the gplots package. I find it more useful than the similar rainbow palette, because the colors don't wrap around on each other.
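Whichever palette you settle on, it reaches Lattice as a plain character vector of colours. The sketch below shows only that plumbing. The palette itself is a stand-in and an assumption: evenly spaced hues from base R's `hcl()`, not rich.colors, RColorBrewer, or colorspace output. The data frame `d` and the choice of 12 groups are made up for illustration.

```r
library(lattice)  # recommended package, shipped with standard R installations

n_groups <- 12

# Stand-in palette (an assumption): evenly spaced hues at fixed chroma and luminance.
# Replace `cols` with any other vector of colours of the same length.
cols <- grDevices::hcl(h = seq(0, 360, length.out = n_groups + 1)[-1], c = 80, l = 60)

# Made-up data: one noisy horizontal series per group
set.seed(42)
d <- data.frame(
  x = rep(1:10, times = n_groups),
  g = factor(rep(seq_len(n_groups), each = 10))
)
d$y <- as.integer(d$g) + rnorm(nrow(d), sd = 0.3)

# Pass the colours to the grouped display through par.settings
p <- xyplot(y ~ x, data = d, groups = g, type = "b",
            par.settings = list(
              superpose.symbol = list(col = cols, pch = 16),
              superpose.line   = list(col = cols)
            ),
            auto.key = list(columns = 4, points = TRUE, lines = FALSE))

if (interactive()) print(p)
```

Setting the colours through `par.settings` keeps them local to this one plot, so other Lattice plots in the session keep their defaults.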
I am in the process of finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow), in order to convey as much information as possible, to create the following graph that presents the means in the foreground and the raw data in the background:
However, one problem remains, and that is overplotting. For example, the marked point looks like it reflects one data point, but in fact 5 data points exist with the same value at that place.
Therefore, I would like to know if there is a way to deal with overplotting in base graphics when using the points function.
It would be ideal if e.g., the respective points get darker, or thicker or,...
Doing it manually is not an option (too many graphs and points like this). Furthermore, I don't want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).
Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!
This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).
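The standard approach is to add some noise to the data before plotting. R has a function jitter() which does exactly that. You could use it to add the necessary noise to the coordinates in your plot. For example:
set.seed(1)  # make the random sample reproducible
X <- rep(1:10, 10)
Z <- as.factor(sample(letters[1:10], 100, replace = TRUE))
# jitter the factor codes horizontally so tied points no longer overlap
plot(jitter(as.numeric(Z), factor = 0.2), X, xaxt = "n", xlab = "Z")
# label the axis by the levels actually present in Z
axis(1, at = seq_along(levels(Z)), labels = levels(Z))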
Besides jittering, another good approach is alpha blending, which you can obtain (on the graphics devices supporting it) as the fourth color parameter. I provided an example for 'overplotting' of two histograms in this SO question.
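A minimal base R sketch of alpha blending for points, assuming made-up integer data with many ties; the opacity of 0.2 and the helper name `plot_blended` are arbitrary choices for illustration:

```r
# Made-up data on a coarse grid, so many points coincide
set.seed(1)
x <- sample(1:5, 200, replace = TRUE)
y <- sample(1:5, 200, replace = TRUE)

# Semi-transparent black: the fourth argument to rgb() is the opacity
pt_col <- rgb(0, 0, 0, alpha = 0.2)

# Hypothetical helper: positions holding more points come out darker
plot_blended <- function() {
  plot(x, y, pch = 16, cex = 2, col = pt_col)
}

if (interactive()) plot_blended()
```

With an opacity of 0.2, a position holding a single point stays light grey, while one holding several stacked points approaches black.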
One additional idea for the general problem of showing the number of points is using a rug plot (the rug function), which places small tick marks along the margin that can show how many points contribute (still use jittering or alpha blending for ties). This allows the actual points to show their true rather than jittered values, while the rug indicates which parts of the plot have more values.
For the example plot direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
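As a rough sketch of the rug idea, assuming made-up integer data with ties and a hypothetical helper name `plot_with_rug`: the points are drawn at their true positions and only the rug ticks are jittered.

```r
# Made-up data with many tied values
set.seed(1)
x <- sample(1:10, 100, replace = TRUE)
y <- sample(1:10, 100, replace = TRUE)

# Hypothetical helper: true positions in the plot, jittered ticks in the margins
plot_with_rug <- function() {
  plot(x, y, pch = 16)
  rug(jitter(x), side = 1)  # bottom margin: distribution of x
  rug(jitter(y), side = 2)  # left margin: distribution of y
}

if (interactive()) plot_with_rug()
```

Denser clusters of ticks along an axis mark the values that hide several coinciding points.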
You may also use sunflowerplot, although it would be hard to implement here. I would use alpha blending, as Dirk suggested.
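For comparison, here is a small sketch on made-up data (not the plot from the question). It shows sunflowerplot next to a related base R option that is an addition here, not something suggested above: counting ties with xyTable and scaling the point size by the count. The helper name `plot_counts` and the square-root scaling are assumptions for illustration.

```r
# Made-up data on a coarse grid, so many points coincide
set.seed(1)
x <- sample(1:5, 60, replace = TRUE)
y <- sample(1:5, 60, replace = TRUE)

# Count how many observations fall on each distinct (x, y) position
tab <- xyTable(x, y)

# Hypothetical helper: sunflowers on the left, count-scaled points on the right
plot_counts <- function() {
  old <- par(mfrow = c(1, 2))
  on.exit(par(old))
  sunflowerplot(x, y, main = "sunflowerplot")
  plot(tab$x, tab$y, pch = 16, cex = sqrt(tab$number), main = "size by count")
}

if (interactive()) plot_counts()
```

Scaling `cex` by the square root of the count keeps the point area roughly proportional to the number of observations.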