What function can I use to emulate ggplot2's default color palette for a desired number of colors? For example, an input of 3 would produce a character vector of three HEX colors.
It is just equally spaced hues around the color wheel, starting from 15:
gg_color_hue <- function(n) {
hues = seq(15, 375, length = n + 1)
hcl(h = hues, l = 65, c = 100)[1:n]
}
For example:
n = 4
cols = gg_color_hue(n)
dev.new(width = 4, height = 4)
plot(1:n, pch = 16, cex = 2, col = cols)
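As a quick sanity check, here is a minimal sketch that redefines gg_color_hue and compares its output for n = 3 with the three hex codes usually seen on a three-level discrete ggplot2 scale. The `expected` vector is an assumption taken from typical ggplot2 output; it is not computed from ggplot2 here.

```r
# Same function as above, written with the full argument name length.out.
gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

cols3 <- gg_color_hue(3)
print(cols3)

# Assumed reference values: the usual ggplot2 defaults for three levels.
expected <- c("#F8766D", "#00BA38", "#619CFF")
print(identical(toupper(cols3), expected))
```

Only base R (grDevices) is needed, so this works without ggplot2 or scales installed.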
This is the result from
library(scales)
show_col(hue_pal()(4))
show_col(hue_pal()(3))
These answers are all very good, but I wanted to share another approach I found on Stack Overflow that is really quite useful.
Basically, @DidzisElferts shows how you can get all the colours, coordinates, etc. that ggplot uses to build a plot you created. Very nice!
library(ggplot2)
p <- ggplot(mpg, aes(x = class, fill = class)) + geom_bar()
ggplot_build(p)$data
[[1]]
fill y count x ndensity ncount density PANEL group ymin ymax xmin xmax
1 #F8766D 5 5 1 1 1 1.111111 1 1 0 5 0.55 1.45
2 #C49A00 47 47 2 1 1 1.111111 1 2 0 47 1.55 2.45
3 #53B400 41 41 3 1 1 1.111111 1 3 0 41 2.55 3.45
4 #00C094 11 11 4 1 1 1.111111 1 4 0 11 3.55 4.45
5 #00B6EB 33 33 5 1 1 1.111111 1 5 0 33 4.55 5.45
6 #A58AFF 35 35 6 1 1 1.111111 1 6 0 35 5.55 6.45
7 #FB61D7 62 62 7 1 1 1.111111 1 7 0 62 6.55 7.45
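If only the colours are needed, the fill column can be pulled out of the built data directly. This is a sketch assuming ggplot2 is installed; it uses layer_data(), which returns the computed data for one layer, and the names `built` and `fills` are just illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class, fill = class)) + geom_bar()

# layer_data() gives the computed data of the first layer,
# the same table that ggplot_build(p)$data[[1]] shows above.
built <- layer_data(p)

# One hex code per class, in the order the bars are drawn.
fills <- unique(built$fill)
print(fills)
```

The resulting character vector can be passed to scale_fill_manual(values = ...) or to base graphics functions.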
From page 106 of the ggplot2 book by Hadley Wickham:
The default colour scheme, scale_colour_hue picks evenly spaced hues
around the hcl colour wheel.
With a bit of reverse engineering you can construct this function:
ggplotColours <- function(n = 6, h = c(0, 360) + 15){
if ((diff(h) %% 360) < 1) h[2] <- h[2] - 360/n
hcl(h = (seq(h[1], h[2], length = n)), c = 100, l = 65)
}
Demonstrating this in barplot:
y <- 1:3
barplot(y, col = ggplotColours(n = 3))
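The `if` line in ggplotColours drops the last hue when the range spans a full circle, because hue 375 is the same as hue 15. The sketch below compares it with the simpler gg_color_hue from the earlier answer, which does the same thing by generating n + 1 hues and discarding the last. Both functions are copied from this thread; only the comparison code is new.

```r
gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

ggplotColours <- function(n = 6, h = c(0, 360) + 15) {
  # A full circle would repeat the first hue, so stop one step early.
  if ((diff(h) %% 360) < 1) h[2] <- h[2] - 360 / n
  hcl(h = seq(h[1], h[2], length.out = n), c = 100, l = 65)
}

# Compare the two palettes for n = 1, ..., 8.
same <- vapply(1:8,
               function(n) identical(gg_color_hue(n), ggplotColours(n)),
               logical(1))
print(same)
```

The extra `h` argument of ggplotColours is what it adds over gg_color_hue: it lets you restrict the palette to part of the colour wheel.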
To get the hex values instead of the plot you can use:
hue_pal()(3)
Instead of this code:
show_col(hue_pal()(3))
I'm struggling with color coding and was hoping you could help.
Here is my issue. I have a dummy dataset:
set.seed(2021)
df <- data.frame(x = 1:10, y = sample(1:100, 10, replace = FALSE))
I first want to plot these data, using a specific color scale:
ggplot(data = df, aes(x = x, y = y, fill = y)) +
geom_point(shape = 21) +
scale_fill_continuous_divergingx(palette = "RdBu",
mid = 50,
rev = TRUE)
Now I would like to use the color that corresponds to, let's say, x = 6 (i.e. y = 70) for another plot. Given certain constraints, I cannot make just another simple ggplot with the same scale to do this, but would instead need to 'hardcode' the hexadecimal value of that specific color, i.e. #e99c8f.
Is there a way to do this, so that I can just then use fill = "#e99c8f" in my other plot?
Hardcoding hex values is easy for certain color scales, e.g. viridis, but I haven't found a way to do it with this one, which I need... :/
Thanks for your help!
Try the ggplot_build() function: it returns the data for every layer as data frames, and the fill column shows the color each point is mapped to:
R> p <- ggplot(data = df, aes(x = x, y = y, fill = y)) +
+ geom_point(shape = 21) +
+ scale_fill_continuous_divergingx(palette = "RdBu",
+ mid = 50,
+ rev = TRUE)
R> x <- ggplot_build(p)
R> x$data
[[1]]
fill x y PANEL group shape colour size alpha stroke
1 #00578C 1 7 1 -1 21 black 1.5 NA 0.5
2 #CBDEEB 2 38 1 -1 21 black 1.5 NA 0.5
3 #ECF1F5 3 46 1 -1 21 black 1.5 NA 0.5
4 #F3D9D6 4 58 1 -1 21 black 1.5 NA 0.5
5 #0772AC 5 12 1 -1 21 black 1.5 NA 0.5
6 #E69C90 6 70 1 -1 21 black 1.5 NA 0.5
7 #EEBCB4 7 64 1 -1 21 black 1.5 NA 0.5
8 #611300 8 99 1 -1 21 black 1.5 NA 0.5
9 #E8A197 9 69 1 -1 21 black 1.5 NA 0.5
10 #4EA4CB 10 23 1 -1 21 black 1.5 NA 0.5
In addition, scale_fill_continuous_divergingx() comes from the colorspace package, and its "RdBu" palette closely follows ColorBrewer's "RdBu" palette, whose hex colors you can query with:
R> RColorBrewer::brewer.pal(11, "RdBu")
R> scales::show_col(RColorBrewer::brewer.pal(11, "RdBu"))
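To get just the one value to hardcode, the built data can be subset by x. This is a sketch assuming ggplot2 and colorspace are installed; `built` and `fill_x6` are illustrative names, and the seed is set before sampling so the data match the question.

```r
library(ggplot2)
library(colorspace)

set.seed(2021)
df <- data.frame(x = 1:10, y = sample(1:100, 10, replace = FALSE))

p <- ggplot(df, aes(x = x, y = y, fill = y)) +
  geom_point(shape = 21) +
  scale_fill_continuous_divergingx(palette = "RdBu", mid = 50, rev = TRUE)

# Computed data of the point layer, including the mapped fill colours.
built <- layer_data(p)

# Hex code of the point at x = 6.
fill_x6 <- built$fill[built$x == 6]
print(fill_x6)
```

The printed string can then be used as fill = "..." in the other plot. It depends on the scale limits, so it only stays valid while the data range is unchanged.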
I would like to display a histogram showing the distribution of school grades.
The dataframe looks like:
> print(xls)
# A tibble: 103 x 2
X__1 X__2
<dbl> <chr>
1 3 w
2 1 m
3 2 m
4 1 m
5 1 w
6 0 m
7 3 m
8 1 w
9 0 m
10 5 m
I create the histogram with:
hist(xls$X__1, main='Grade distribution', xlab='Grade (0 = not graded)', ylab='Count')
It looks like:
Why are there spaces between 1,2,3 but not between 0 & 1?
Thanks, BR Bernd
Use ggplot2 for that, and your bars will be aligned:
library(ggplot2)
ggplot(xls, aes(x = X__1)) + geom_histogram(binwidth = 1)
You can try
barplot(table(xls$X__1))
or try
h <- hist(xls$X__1, xaxt = "n", breaks = seq(min(xls$X__1) - 0.5, max(xls$X__1) + 0.5, by = 1))
axis(side = 1, at = h$mids, labels = seq(min(xls$X__1), max(xls$X__1)))
and using ggplot
ggplot(xls, aes(X__1)) +
geom_histogram(binwidth = 1, color=2) +
scale_x_continuous(breaks = seq(min(xls$X__1), max(xls$X__1)))
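As for why the bars look uneven: hist() uses bins that are closed on the right, and by default the lowest break is also included in the first bin. With breaks sitting on the data values, 0 and the next value end up next to each other (or in the same bin) while the other integers are separated. A minimal sketch of the usual remedy, half-integer breaks, follows; `grades` is a made-up vector standing in for xls$X__1.

```r
# Hypothetical grades standing in for xls$X__1 (integers from 0 to 5).
grades <- c(3, 1, 2, 1, 1, 0, 3, 1, 0, 5)

# Breaks at -0.5, 0.5, ..., 5.5 put every integer in the middle of its own bin.
breaks <- seq(min(grades) - 0.5, max(grades) + 0.5, by = 1)

# plot = FALSE returns the binning without drawing anything.
h <- hist(grades, breaks = breaks, plot = FALSE)

# Bin midpoints are the grades themselves; counts match table(grades).
counts <- setNames(h$counts, h$mids)
print(counts)
```

Dropping plot = FALSE draws the histogram with equally wide, evenly spaced bars.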
I have the following code. It produces a levelplot in which square values less than 0 should be colored in a red hue and squares with values greater than 0 in a blue hue. And then I would like squares with values of 0 to be colored white. However, nothing ends up being white. How can I fix this?
All three squares in that first column should be white.
library(lattice)
cc = colorRampPalette( c("red", "white","blue"))
trellis.par.set(regions=list(col=cc(20)))
x = c(1,2,3,1,2,3,1,2,3)
y = c(1,1,1,2,2,2,3,3,3)
z = c(0,-2,-3,0,2,3,0,1,-1)
df = data.frame(x,y,z)
p <- levelplot(z~x*y, df,
panel=function(...) {
arg <- list(...)
panel.levelplot(...)
})
print(p)
Update:
Here is a reproducible example that attempts to fix it, but still isn't quite right:
Here is a dataframe df:
x y z
1 1 1 -0.17457167
2 2 1 0.93407856
3 3 1 0.55129545
4 4 1 0.97388216
5 5 1 -1.00000000
6 6 1 0.52883410
7 7 1 -1.00000000
8 8 1 0.85112829
9 9 1 -1.00000000
10 10 1 1.00000000
11 11 1 -0.87714166
12 12 1 1.00000000
13 13 1 -0.95403260
14 14 1 1.00000000
15 15 1 -0.91600501
16 16 1 1.00000000
17 17 1 -1.00000000
18 18 1 -0.38800669
19 19 1 -0.52110322
20 20 1 0.00000000
21 21 1 -0.08211450
22 22 1 0.55390723
23 23 1 1.00000000
24 24 1 -0.04147514
25 25 1 -1.00000000
26 26 1 -0.39751358
27 27 1 -0.99550773
28 28 1 0.00000000
29 29 1 0.20737568
30 30 1 0.00000000
31 31 1 0.00000000
32 32 1 0.00000000
33 33 1 -0.26702883
And then here is the code:
cc = colorRampPalette( c("red", "white","blue"))
trellis.par.set(regions=list(col=cc(21)))
zrng <- range(z) # what's the range of z
tol <- 1e-2 # what tolerance is necessary?
colorBreaks <- c(
seq(zrng[1] - 0.01, 0 - tol, length.out = 11),
seq(0 + tol,zrng[2] + 0.01,length.out = 10))
p <- levelplot(z~x*y, df,
at = colorBreaks,
panel=function(...) {
arg <- list(...)
panel.levelplot(...)
})
print(p)
It produces this plot, which does not have a slot for the color white in the spectrum:
As thelatemail pointed out, cc(20) will never produce white ("#FFFFFF"). You have to use an odd number of colors for the middle value of the color ramp to be represented exactly (compare cc(3) with cc(4)).
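The odd/even point is easy to check directly. This small sketch uses only base R; `has_white` is a helper name introduced here for illustration.

```r
cc <- colorRampPalette(c("red", "white", "blue"))

# TRUE if pure white is one of the generated colours.
has_white <- function(cols) "#FFFFFF" %in% toupper(cols)

print(has_white(cc(20)))  # even number of colours: no colour sits at the midpoint
print(has_white(cc(21)))  # odd number of colours: the middle colour is the midpoint

# Position of white within the 21-colour ramp.
white_pos <- which(toupper(cc(21)) == "#FFFFFF")
print(white_pos)
```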
Now, you need to set the at argument for levelplot to set breakpoints for the colors. The default is at = pretty(z):
#[1] -3 -2 -1 0 1 2 3
But you don't want 0 to be a breakpoint. You want it to have its own color, aligned with the middle of the color ramp.
You can achieve that by placing breakpoints just either side of 0 (within some tolerance tol) so that no other value maps to white. The rough idea is to leave a narrow slot for 0, for example at = c(seq(-3.01, -0.00001, length.out = 11), seq(0.00001, 3.01, length.out = 11)), or to use the similar method shown below. Because the color ramp has an odd number of colors, the sequence needs an even number of breakpoints: n colors need n + 1 breakpoints including both ends (3 colors are separated by 2 interior breakpoints, 4 colors by 3).
trellis.par.set(regions=list(col=cc(21)))
# Define a sequence of breaks for the at argument to levelplot.
zrng <- range(z) # what's the range of z
tol <- 1e-5 # what tolerance is necessary?
colorBreaks <- c(
seq(zrng[1] - 0.01, # adding a small buffer on end
0 - tol,
length.out = 11),
seq(0 + tol,
zrng[2] + 0.01,
length.out = 11))
# note, I chose length.out = 11.
# Don't do more than roughly ceiling((# of colors) / 2)
p <- levelplot(z~x*y, df,
at = colorBreaks,
panel=function(...) {
arg <- list(...)
panel.levelplot(...)
})
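To see that the breaks do what is intended without drawing anything, one can check which interval each z value falls into. This sketch uses the z values from the question and assumes that, when the number of colours equals the number of intervals, the k-th interval gets the k-th colour; `bin` and `zero_colour` are illustrative names.

```r
cc <- colorRampPalette(c("red", "white", "blue"))
z <- c(0, -2, -3, 0, 2, 3, 0, 1, -1)

zrng <- range(z)
tol <- 1e-5
colorBreaks <- c(seq(zrng[1] - 0.01, 0 - tol, length.out = 11),
                 seq(0 + tol, zrng[2] + 0.01, length.out = 11))

# 22 breakpoints define 21 intervals, one per colour in cc(21).
n_intervals <- length(colorBreaks) - 1

# Index of the interval containing each z value.
bin <- findInterval(z, colorBreaks)

# Colour of the interval that holds the zeros.
zero_colour <- unique(cc(n_intervals)[bin[z == 0]])
print(n_intervals)
print(zero_colour)
```

With 11 and 10 breakpoints, as in the question's attempt, there are only 20 intervals for 21 colours, so the narrow slot around 0 no longer lines up with the middle colour.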
I am not sure whether I phrased the title/question correctly. Maybe part of my problem is that I am missing the right vocabulary. Sorry. But let's try:
I have data (sleep in this example) that I would describe as three-dimensional. Maybe a real statistician wouldn't describe it that way?
I think I want to draw multiple two-dimensional plots in one three-dimensional plot, side by side. Please correct me if I am wrong.
My problem here is that there is only one line.
There are two groups, and I want one line per group. The same data with type='h' gives a better picture, I think:
Can you imagine the two lines here? What am I missing in that concept?
We could use another plotting library for printing/publication; currently it doesn't matter to me which one. Maybe I am totally in the wrong place?
This is the code:
require('mise')
require('scatterplot3d')
mise() # clear the workspace
# example data
print(sleep)
scatterplot3d(x=sleep$ID,
x.ticklabs=levels(sleep$ID),
y=sleep$group,
y.ticklabs=levels(sleep$group),
lab = c(length(unique(sleep$ID)), 1),
z=sleep$extra,
type='o')
And the data
extra group ID
1 0.7 1 1
2 -1.6 1 2
3 -0.2 1 3
4 -1.2 1 4
5 -0.1 1 5
6 3.4 1 6
7 3.7 1 7
8 0.8 1 8
9 0.0 1 9
10 2.0 1 10
11 1.9 2 1
12 0.8 2 2
13 1.1 2 3
14 0.1 2 4
15 -0.1 2 5
16 4.4 2 6
17 5.5 2 7
18 1.6 2 8
19 4.6 2 9
20 3.4 2 10
You could add the lines manually in two steps:
# Store the plot in rr
rr <- scatterplot3d(x=as.numeric(sleep$ID),
x.ticklabs=levels(sleep$ID),
y=sleep$group,
y.ticklabs=levels(sleep$group),
z=sleep$extra)
# find all that belong to group one
idx = sleep$group == 1
# add the first line
rr$points3d(x = sleep$ID[idx], y = rep(1, each = sum(idx)), z = sleep$extra[idx], type = 'l', col = 'red')
# add the second line
rr$points3d(x = sleep$ID[!idx], y = rep(2, each = sum(!idx)), z = sleep$extra[!idx], type = 'l', col = 'blue')
So to add ribbons instead of lines things change a bit. In particular, the ribbons are plotted with the polygon function. However, this function only handles 2D coordinates, so we need to transform our 3D coordinates to 2D coordinates with the function rr$xyz.convert.
rr <- scatterplot3d(x=sleep$ID,
x.ticklabs=levels(sleep$ID),
y=sleep$group,
y.ticklabs=levels(sleep$group),
z=sleep$extra)
idx = sleep$group == 1
# draw first group
mat = matrix(c(rep(sleep$ID[idx], 2),
rep(c(1, 1.05), each = sum(idx)), # 1.05 determines width
rep(sleep$extra[idx], 2)), ncol = 3)
ll = rr$xyz.convert(mat)
polygon(x = ll$x[c(1:10, 20:11)],
y = ll$y[c(1:10, 20:11)], col = 'red')
# draw second group
mat = matrix(c(rep(sleep$ID[!idx], 2),
rep(c(2, 1.95), each = sum(!idx)), # 1.95 determines width
rep(sleep$extra[!idx], 2)), ncol = 3)
ll = rr$xyz.convert(mat)
polygon(x = ll$x[c(1:10, 20:11)],
y = ll$y[c(1:10, 20:11)], col = 'blue')
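The index c(1:10, 20:11) is what turns the two edges into a closed outline: it walks along the first edge (rows 1 to 10) and then back along the second edge in reverse (rows 20 down to 11). The sketch below rebuilds the first group's matrix with base R only and shows that ordering; `outline` is an illustrative name, and the factor ID is converted with as.numeric() explicitly.

```r
idx <- sleep$group == 1
n <- sum(idx)

# Two copies of the line: one at y = 1, one shifted to y = 1.05.
mat <- cbind(x = rep(as.numeric(sleep$ID[idx]), 2),
             y = rep(c(1, 1.05), each = n),
             z = rep(sleep$extra[idx], 2))

# Forward along the first edge, backward along the second.
outline <- c(seq_len(n), rev(n + seq_len(n)))
print(outline)

# Vertices in drawing order; the last row sits next to the first one.
print(mat[outline, ])
```

Writing the index in terms of n rather than the literal 10 and 20 keeps it working if the groups have a different number of observations.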
I am trying to adjust the colour scale of a geom_tile plot.
A short version of my data (in data.frame format) is:
mydat <-
Sc K n minC
A 2 1 NA
A 2 2 37.453023
A 2 3 23.768316
A 2 4 17.628376
A 3 1 NA
A 3 2 12.693124
A 3 3 8.884226
A 3 4 7.436250
A 10 1 2.128121
A 10 2 2.116539
A 10 3 2.737923
A 10 4 3.509773
A 20 1 1.104592
A 20 2 1.840195
A 20 3 2.717198
A 20 4 3.616501
B 2 1 NA
B 2 2 25.090085
B 2 3 15.924186
B 2 4 11.811022
B 3 1 NA
B 3 2 8.827183
B 3 3 6.179484
B 3 4 5.175331
B 10 1 2.096934
B 10 2 2.064984
B 10 3 2.662373
B 10 4 3.407246
B 20 1 1.096871
B 20 2 1.802418
B 20 3 2.649153
B 20 4 3.517776
My code to prepare the data to plot is the following:
mydat$Sc <- factor(mydat$Sc, levels =c("A", "B"))
mydat$K <- factor(mydat$K, levels =c("2", "3","10","20"))
mydat.m <- melt(mydat,id.vars=c("Sc","K","n"), measure.vars=c("minC"))
I want to plot with geom_tile the value of minC with K and n as axis and different facets for Sc with the following:
mydat.m.p <- ggplot(mydat.m, aes(x=n, y=K))
mydat.m.p +
geom_tile(data=mydat.m, aes(fill=value)) +
scale_fill_gradient(low="palegreen", high="lightcoral") +
facet_wrap(~ Sc, ncol=2)
This gives me a plot for each Sc factor. However, the colour scale does not reflect what I want to portray, because a few high values make all the low values look the same.
I want to adjust to a relevant scale in 4 breaks, i.e., 1-2, 2-3, 3-5, >5.
Looking at other questions there was a suggestion to use the cut function and scale fill manual as:
mydat.m$value1 <- cut(mydat.m$value, breaks = c(1:5, Inf), right = FALSE)
Then use the following in geom_tile:
scale_fill_manual(breaks = c("[1,2)", "[2,3)", "[3,5)", "[5,Inf)"),
values = c("darkgreen", "palegreen", "lightcoral", "red"))
However, I am not sure how this can be applied to a data.frame with other factors and in long format.
You're almost there. Simply use cut before melting:
mydat$minC.cut <- cut(mydat$minC, breaks = c(1:3, 5, Inf), right = FALSE)
mydat.cut <- melt(mydat, id.vars=c("Sc", "K", "n"), measure.vars=c("minC.cut"))
Now, you don't need to specify breaks since we took care of that already.
ggplot(mydat.cut, aes(x=n, y=K)) +
geom_tile(aes(fill=value)) +
facet_wrap(~ Sc, ncol=2) +
scale_fill_manual(values = c("darkgreen", "palegreen", "lightcoral", "red"))
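To see what cut() produces before plotting, here is a small base R sketch on a few minC values taken from the data above (including an NA); `minC` and `binned` are illustrative names.

```r
# A few values from the minC column, including a missing one.
minC <- c(NA, 37.453023, 2.128121, 1.104592, 3.509773)

# right = FALSE makes the intervals closed on the left: [1,2), [2,3), ...
binned <- cut(minC, breaks = c(1:3, 5, Inf), right = FALSE)

print(levels(binned))
print(as.character(binned))
```

The four levels come out in increasing order, which is why the four colours in values line up with them. NA stays NA and is drawn with the scale's na.value.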