Are there good predefined color sequences for different data in one plot? - r

A while ago, I asked How to change Lattice graphics default groups colors?, and got a helpful response from BenBarnes. This allowed me to define more than 7 cycling colors for different data in the same plot in R's Lattice package, which I did. However, I found that it's difficult to define more than 9, maybe 10, colors without some of them being (a) hard to see on a white background or (b) very similar to another color in the set. (That might be why seven colors is Lattice's default.) It occurs to me, though, that there are people out there who are much better at managing colors in information display than I am, and that maybe someone has already defined a good list of 10, 12, maybe even 15 colors for displaying data in the same plot. Does anybody know of such a list? Any color specification that I can convert into a Lattice format would work. If it's already been done in Lattice, even better! (Is there a better place to ask this question?)

There's a large body of work on choosing colors. Check out the RColorBrewer and colorspace packages as a starting point. In the documentation for colorspace there is a link to an excellent paper (and the vignette summarizes much of the paper). And think about your color blind colleagues, with dichromat.
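As a minimal sketch of how a RColorBrewer palette could be plugged into Lattice (the toy data frame d and the choice of the "Paired" palette are assumptions for illustration, not something from the question):

```r
library(lattice)
library(RColorBrewer)

# 12 qualitative colors; "Paired" and "Set3" are the Brewer palettes that go up to 12
cols <- brewer.pal(12, "Paired")

# Toy data with 12 groups (hypothetical, for illustration only)
set.seed(1)
d <- data.frame(x = rep(1:10, times = 12),
                g = factor(rep(1:12, each = 10)))
d$y <- d$x + as.integer(d$g) + rnorm(nrow(d), sd = 0.3)

# Pass the colors through par.settings so the lines and the key agree
p <- xyplot(y ~ x, data = d, groups = g, type = "l",
            par.settings = list(superpose.line = list(col = cols)),
            auto.key = list(columns = 4, lines = TRUE, points = FALSE))
print(p)
```

Setting the colors in par.settings rather than via a col argument keeps the legend in sync with the plotted lines.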

In general, I think it is very difficult to pick a large set of colors that don't end up being hard to distinguish from one another. When I am looking for a large number (>8) of colors that I want to be noticeably distinct and aesthetically pleasing, I usually use the rich.colors palette in the gplots package. I find it more useful than the similar rainbow palette, because the colors don't wrap around on each other.
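A quick way to compare the two palettes side by side might look like this (a sketch; n = 12 and the swatch layout are arbitrary choices):

```r
library(gplots)

n <- 12
rich <- rich.colors(n)   # from gplots
rain <- rainbow(n)       # from grDevices

# Draw one strip of swatches per palette
op <- par(mfrow = c(2, 1), mar = c(1, 1, 2, 1))
barplot(rep(1, n), col = rich, border = NA, space = 0, axes = FALSE,
        main = "rich.colors")
barplot(rep(1, n), col = rain, border = NA, space = 0, axes = FALSE,
        main = "rainbow")
par(op)
```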

Related

Is there a simple page showing a visual comparison of the different pre-made R color scales available?

Often when I am quickly doing some plots to get a feel for my data, I don't wish to agonize over exactly which colors my color map should have. I just want something "good enough" that I can call with a single short command, and move on.
On this page, I found some basic color palettes:
rainbow(n)
heat.colors(n)
terrain.colors(n)
topo.colors(n)
cm.colors(n)
This is sort of what I want, except there isn't enough of it. terrain and topo are specialized for maps, rainbow is notoriously bad, cm is for data roughly centered around 0, and heat doesn't look good for data spanning [0, 1].
Are there more colormaps? Is there a site or document somewhere showing a list of all the colormaps from different packages so I can just look through it and pick whichever one looks best?
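Beyond the five functions listed above, recent versions of R ship more palettes in grDevices: hcl.colors() and hcl.pals() (R 3.6.0 and later) and palette.colors() and palette.pals() (R 4.0.0 and later). The sketch below lists them and draws swatches; show_palettes() is a hypothetical helper written for this example, not part of any package.

```r
# Names of the built-in palette collections (base R, grDevices)
hcl_names <- hcl.pals()       # needs R >= 3.6.0
pal_names <- palette.pals()   # needs R >= 4.0.0

# Hypothetical helper: draw one row of n swatches per HCL palette name
show_palettes <- function(pal_names, n = 8) {
  op <- par(mar = c(1, 8, 1, 1))
  on.exit(par(op))
  plot(0, 0, type = "n", xlim = c(0, n), ylim = c(0, length(pal_names)),
       axes = FALSE, xlab = "", ylab = "")
  for (i in seq_along(pal_names)) {
    rect(0:(n - 1), i - 1, 1:n, i - 0.1,
         col = hcl.colors(n, pal_names[i]), border = NA)
  }
  axis(2, at = seq_along(pal_names) - 0.5, labels = pal_names,
       las = 1, tick = FALSE, cex.axis = 0.7)
}

# Show the first 15 sequential palettes
show_palettes(head(hcl.pals("sequential"), 15))
```

A single call such as hcl.colors(10, "Viridis") then gives a usable color map without further tuning.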
How are you plotting your graphs?
I use ggplot2 a lot, and its default colour keys are pretty good. It also has very good tools for mapping variables to colour, and it works well with RColorBrewer, scales, and ggthemes.
Information on ggplot2 colours: http://www.cookbook-r.com/Graphs/Colors_(ggplot2)/
More information on ggthemes: http://cran.r-project.org/web/packages/ggthemes/vignettes/ggthemes.html
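For example, swapping the default ggplot2 colours for a Brewer palette takes one extra line. This is only a sketch: the mpg dataset (bundled with ggplot2) and the "Dark2" palette are choices made for illustration.

```r
library(ggplot2)

# class has 7 levels, which fits within the 8 colours of "Dark2"
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  scale_colour_brewer(palette = "Dark2")
print(p)
```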

ggplot: Palette Greyscale On Print, Colourful on Screen [duplicate]

I've started to produce the charts for a paper. For some of them which are bar charts I've used the "Pastel1" palette (as recommended in the book on ggplot2, pastel colours are better than saturated ones for fill areas, such as bars).
The problem with Pastel1 at least is that when printed on a B&W laser printer, the colours are indistinguishable. I don't know if the readers will view the paper on screen or will print it on B&W, so I'm looking for either of the following:
how to add hash lines to a palette such as Pastel1 (hopefully the hash lines are also subtle)
a colour palette easy on the eyes that also produces distinct grey areas for B&W for, say, up to 3-4 different colours.
Granted, I could find the latter by experimenting and using toner, but perhaps this has already been solved, I suppose it's a common problem. And yes, I did google for this, but didn't find anything pertinent.
Thank you.
Use http://colorbrewer2.org/ and only show colour schemes that are printer friendly.
Also see scale_fill_grey.
Currently it's not possible to use hash lines due to a limitation in the underlying grid drawing package.
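A minimal sketch of scale_fill_grey on a bar chart (the data frame d and the start/end grey levels are made up for illustration):

```r
library(ggplot2)

# Toy data: four bars, one per group (hypothetical values)
d <- data.frame(group = c("a", "b", "c", "d"),
                value = c(3, 5, 2, 4))

p <- ggplot(d, aes(x = group, y = value, fill = group)) +
  geom_col() +
  scale_fill_grey(start = 0.3, end = 0.9) +  # dark grey to light grey
  theme_bw()
print(p)
```

Widening the gap between start and end increases the contrast between neighbouring bars.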
There is the col2grey function in the TeachingDemos package that will convert a set of colors to an approximation of the grey that will result from printing. You can use this to try different palettes without wasting toner or paper.
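The same idea can be sketched in base R with a weighted sum of the RGB channels. Note that to_grey() below is a hypothetical helper using common luminance weights (0.30, 0.59, 0.11); it is an assumption for illustration and not necessarily what TeachingDemos does internally.

```r
library(RColorBrewer)

# Hypothetical helper: approximate the grey a colour prints as
to_grey <- function(cols) {
  rgb_vals <- col2rgb(cols)                                  # 3 x n matrix, 0-255
  lum <- as.vector(c(0.30, 0.59, 0.11) %*% rgb_vals) / 255   # luminance in [0, 1]
  grey(pmin(lum, 1))                                         # guard against rounding above 1
}

pastel <- brewer.pal(4, "Pastel1")
greys  <- to_grey(pastel)

# Compare the palette on screen with its approximate B&W rendering
op <- par(mfrow = c(1, 2))
barplot(rep(1, 4), col = pastel, axes = FALSE, main = "On screen")
barplot(rep(1, 4), col = greys,  axes = FALSE, main = "Approx. B&W")
par(op)
```

If the right-hand bars look nearly identical, the palette is unlikely to survive a B&W printer.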
Use this to select another color combination (gray scale option included)

Is there any color pattern that exist in R that can be used to color a graph?

I am wondering whether R has a pre-existing package that can color sets inside a graph, or a package that can generate a list of colors that are not close to one another.
I have a graph with many clusters that I want to color, but I don't want the colors to be close.
I have found a nice answer here, but I am wondering if there is a pre-existing package for this.
You may also want to check out the package RColorBrewer for other built in color palettes. However, you may run into issues if you need large numbers of colours. There is a nice post on CrossValidated which addresses the large n issue and offers a few nice solutions as well. Specifically, would it make sense to facet your plot based on some large groupings? Do you need to plot all of the items at once? ggplot2 makes it easy to facet based on a column in your data. I'm sure there are equivalent functions in base graphics and lattice, but I'm not as familiar with them.
See the functions rainbow, heat.colors, terrain.colors etc, described in the help pages (?rainbow). These are part of the grDevices package, which is installed by default.
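Two base-R ways to get a larger set of distinct colors are sketched below, assuming R 4.0.0 or later for palette.colors(). The value n = 20 and the toy cluster data are made up for illustration.

```r
n <- 20

# Option 1: a fixed palette designed for distinctness (up to 36 colors)
cols_poly <- unname(palette.colors(n, palette = "Polychrome 36"))

# Option 2: evenly spaced hues at constant chroma and luminance
hues <- seq(15, 375, length.out = n + 1)[-(n + 1)]
cols_hcl <- hcl(h = hues, c = 100, l = 65)

# Toy clusters: color each point by its cluster id
set.seed(1)
cl <- sample(n, 300, replace = TRUE)
plot(rnorm(300) + cl, rnorm(300), col = cols_poly[cl], pch = 16,
     xlab = "x", ylab = "y")
```

With evenly spaced hues, neighbouring colors become harder to tell apart as n grows, so a fixed palette tends to hold up better for 15 or more groups.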

How to avoid overplotting (for points) using base-graph?

I am in the process of finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow), in order to convey as much information as possible, to create the following graph, which presents the means in the foreground and the raw data in the background:
However, one problem remains, and that is overplotting. For example, the marked point looks like it reflects one data point, but in fact 5 data points exist with the same value at that place.
Therefore, I would like to know if there is a way to deal with overplotting in base graph using points as the function.
It would be ideal if e.g., the respective points get darker, or thicker or,...
Doing it manually is not an option (too many graphs and points like this). Furthermore, I don't want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).
Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!
This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).
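The standard approach is to add some noise to the data before plotting. R has a function jitter() which does exactly that. You could use it to add the necessary noise to the coordinates in your plot, e.g.:
# 100 values spread over 10 randomly assigned factor levels
X <- rep(1:10, 10)
Z <- as.factor(sample(letters[1:10], 100, replace = TRUE))
# jitter the numeric factor codes horizontally; suppress the default x axis
plot(jitter(as.numeric(Z), factor = 0.2), X, xaxt = "n")
# label the x axis with the factor levels
axis(1, at = 1:10, labels = levels(Z))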
Besides jittering, another good approach is alpha blending, which you can obtain (on the graphics devices supporting it) as the fourth color parameter. I provided an example for 'overplotting' of two histograms in this SO question.
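A minimal base-graphics sketch of alpha blending with points (the data are random and the alpha value of 0.2 is an arbitrary choice):

```r
# Toy data with many exact ties (hypothetical)
set.seed(42)
x <- sample(1:5, 200, replace = TRUE)
y <- sample(1:5, 200, replace = TRUE)

# Semi-transparent black: coincident points stack up and appear darker
col_alpha <- adjustcolor("black", alpha.f = 0.2)

plot(x, y, pch = 16, cex = 2, col = col_alpha)
```

With alpha at 0.2, roughly five coincident points are needed before a spot looks close to solid black.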
One additional idea for the general problem of showing the number of points is using a rug plot (rug function), this places small tick marks along the margin that can show how many points contribute (still use jittering or alpha blending for ties). This allows the actual points to show their true rather than jittered values, but the rug can then indicate which parts of the plot have more values.
For the example plot direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
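The rug idea could be sketched like this (random toy data; the jitter amount of 0.1 is an arbitrary choice):

```r
# Toy data with ties (hypothetical)
set.seed(7)
g <- sample(1:5, 100, replace = TRUE)
v <- sample(1:10, 100, replace = TRUE)

# Points stay at their true values
plot(g, v, pch = 16, xlab = "group", ylab = "value")

# Jittered rugs along both axes show how many points share each value
rug(jitter(g, amount = 0.1), side = 1)
rug(jitter(v, amount = 0.1), side = 2)
```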
You may also use sunflowerplot, though it would be hard to implement here. I would use alpha blending, as Dirk suggested.
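For reference, a standalone sunflowerplot call on tied data looks like this (a sketch with random data, not adapted to the means-plus-raw-data graph from the question):

```r
# Toy data with many coincident points (hypothetical)
set.seed(3)
x <- sample(1:5, 100, replace = TRUE)
y <- sample(1:5, 100, replace = TRUE)

# Each location gets one "petal" per coincident observation
res <- sunflowerplot(x, y)

# The invisible return value holds the count at each unique location
head(data.frame(x = res$x, y = res$y, n = res$number))
```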
