I'm trying to draw a geom_tile plot for a dataset in which I need to highlight the max and min values in every row (with a colour palette going from green to red).
Dataset:
draft_mean trim rf_pwr
1 12.0 1.0 12253
2 12.0 0.8 12052
3 12.0 0.6 12132
4 12.0 0.4 12280
5 12.0 0.2 11731
6 12.0 0.0 11317
7 12.0 -0.2 12126
8 12.0 -0.4 12288
9 12.0 -0.6 12461
10 12.0 -0.8 12791
11 12.0 -1.0 12808
12 12.2 1.0 12346
13 12.2 0.8 12041
14 12.2 0.6 12345
15 12.2 0.4 12411
16 12.2 0.2 12810
17 12.2 0.0 12993
18 12.2 -0.2 12796
19 12.2 -0.4 12411
20 12.2 -0.6 12342
21 12.2 -0.8 12671
22 12.2 -1.0 13161
ggplot(dataset, aes(trim, draft_mean)) +
geom_tile(aes(fill=rf_pwr), color="black") +
scale_fill_gradient(low= "green", high= "red") +
scale_x_reverse() +
scale_y_reverse()
This plot maps the minimum value to green and the maximum value to red across the whole plot. What I need is for the colour palette to go from green to red (minimum to maximum) within every row of the plot (2 rows in this plot) rather than over the whole plot.
For draft_mean=12.2, rf_pwr should be colour formatted from minimum to maximum over the trim values.
For every value of draft_mean, I should be able to tell the trim values with the lowest and highest rf_pwr.
I can plot individual draft_mean values to check, but all draft_mean values need to be visualized together.
You can create a scaled variable where min = 0 and max = 1 per group like this:
library(tidyverse)
# create toy data
set.seed(1)
df <- data.frame(
draft_mean =sort(rep(c(12,12.2),11 )),
trim=rep(sample(seq(-1,1,length.out = 11), replace = F),2),
rf_pwr = sample(11000:13000,22)
)
# create a scaled variable per unique draft_mean (min = 0 and max = 1)
df <- df %>%
  group_by(draft_mean) %>%
  mutate(rf_scl = (rf_pwr - min(rf_pwr)) / (max(rf_pwr) - min(rf_pwr)))
ggplot(df, aes(trim, draft_mean)) +
geom_tile(aes(fill=rf_scl), color="black") +
scale_fill_gradient(low= "green", high= "red") +
scale_x_reverse() +
scale_y_reverse()
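To check that the rescaling does what the question asks, the same idea can be applied to the data posted in the question. This is only a sketch in base R: ave() stands in for group_by() plus mutate(), and the names rf_scl and extremes are illustrative. Note that a draft_mean group in which every rf_pwr is identical would give NaN, because the denominator is zero.

```r
# Data copied from the question
dataset <- data.frame(
  draft_mean = rep(c(12.0, 12.2), each = 11),
  trim = rep(c(1.0, 0.8, 0.6, 0.4, 0.2, 0.0, -0.2, -0.4, -0.6, -0.8, -1.0),
             times = 2),
  rf_pwr = c(12253, 12052, 12132, 12280, 11731, 11317, 12126, 12288, 12461,
             12791, 12808,
             12346, 12041, 12345, 12411, 12810, 12993, 12796, 12411, 12342,
             12671, 13161)
)

# Rescale rf_pwr to [0, 1] separately within each draft_mean
dataset$rf_scl <- ave(dataset$rf_pwr, dataset$draft_mean,
                      FUN = function(x) (x - min(x)) / (max(x) - min(x)))

# For each draft_mean, find the trim with the lowest and highest rf_pwr
extremes <- do.call(rbind, lapply(split(dataset, dataset$draft_mean), function(g) {
  data.frame(draft_mean   = g$draft_mean[1],
             trim_lowest  = g$trim[which.min(g$rf_pwr)],
             trim_highest = g$trim[which.max(g$rf_pwr)])
}))
extremes
```

For this data the lowest rf_pwr is at trim 0.0 for draft_mean 12.0 and at trim 0.8 for draft_mean 12.2, and the highest is at trim -1.0 in both rows. Those are the tiles that end up pure green and pure red when rf_scl is mapped to fill.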
I'd like to make a boxplot that shows the mean instead of the median. Moreover, I would like the whiskers to stop at the 5% (lower) and 95% (upper) quantiles. Here is the code:
ggplot(data, aes(x=Cement, y=Mean_Gap, fill=Material)) +
geom_boxplot(fatten = NULL,aes(fill=Material), position=position_dodge(.9)) +
xlab("Cement") + ylab("Mean cement layer thickness") +
stat_summary(fun=mean, geom="point", aes(group=Material), position=position_dodge(.9),color="black")
I'd like to change geom to errorbar, but this doesn't work. I tried middle = mean(Mean_Gap), but this doesn't work either. I tried ymin = quantile(y,0.05), but nothing was changing. Can anyone help me?
The standard boxplot using ggplot. fill is Material:
Here is how you can create the boxplot using custom parameters for the box and whiskers. It's the solution shown by @lukeA in stackoverflow.com/a/34529614/6288065, but this one also shows how to draw several boxes by group.
The R built-in data set called "ToothGrowth" is similar to your data structure so I will use that as an example. We will plot the length of tooth growth (len) for each vitamin C supplement group (supp), separated/filled by dosage level (dose).
# "ToothGrowth" at a glance
head(ToothGrowth)
# len supp dose
#1 4.2 VC 0.5
#2 11.5 VC 0.5
#3 7.3 VC 0.5
#4 5.8 VC 0.5
#5 6.4 VC 0.5
#6 10.0 VC 0.5
library(dplyr)
library(ggplot2)
# recreate the data structure with specific "len" coordinates to plot for each group
df <- ToothGrowth %>%
group_by(supp, dose) %>%
summarise(
y0 = quantile(len, 0.05),
y25 = quantile(len, 0.25),
y50 = mean(len),
y75 = quantile(len, 0.75),
y100 = quantile(len, 0.95))
df
## A tibble: 6 x 7
## Groups: supp [2]
# supp dose y0 y25 y50 y75 y100
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 OJ 0.5 8.74 9.7 13.2 16.2 19.7
#2 OJ 1 16.8 20.3 22.7 25.6 26.9
#3 OJ 2 22.7 24.6 26.1 27.1 30.2
#4 VC 0.5 4.65 5.95 7.98 10.9 11.4
#5 VC 1 14.0 15.3 16.8 17.3 20.8
#6 VC 2 19.8 23.4 26.1 28.8 33.3
# boxplot using the mean for the middle line and the 5%/95% quantiles for the whiskers
ggplot(df, aes(supp, fill = as.factor(dose))) +
geom_boxplot(
aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
stat = "identity"
) +
labs(y = "len", title = "Boxplot with Mean Middle Line") +
theme(plot.title = element_text(hjust = 0.5))
In the figure above, the boxplot on the left is the standard boxplot with regular median line and regular min/max whiskers. The boxplot on the right uses the mean middle line and 5%/95% quantile whiskers.
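As an alternative sketch (not the method used in the answer above), the five values can be computed inside the plot with stat_summary() instead of being pre-summarised with dplyr. The helper name mean_box is hypothetical; it returns the ymin, lower, middle, upper and ymax columns that the boxplot geom expects.

```r
library(ggplot2)

# Hypothetical helper: 5%/95% whiskers, 25%/75% box, mean as the middle line
mean_box <- function(y) {
  q <- quantile(y, c(0.05, 0.25, 0.75, 0.95), names = FALSE)
  data.frame(ymin = q[1], lower = q[2], middle = mean(y),
             upper = q[3], ymax = q[4])
}

# One box per supp/dose combination, computed from the raw ToothGrowth data
p <- ggplot(ToothGrowth, aes(supp, len, fill = as.factor(dose))) +
  stat_summary(fun.data = mean_box, geom = "boxplot",
               position = position_dodge(0.9)) +
  labs(y = "len", fill = "dose")
p
```

This keeps the raw data in the plot call, which can be convenient when further layers (points, facets) need the unsummarised values. Pre-summarising as in the answer makes the plotted numbers easier to inspect.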
I have a data set of UK earthquakes that I want to plot by location on a map. (Later I hope to make the point size represent the magnitude.) I have made a map of the UK using ggmap, but I am struggling to add the points to the map.
I keep getting one of two errors and cannot plot my points on the map. The errors are either
- Error: Aesthetics must be either length 1 or the same as the data (990): x, y
or
- Error in FUN(X[[i]], ...) : object 'group' not found
depending on how I try to plot the points.
This is what I have so far:
table <- data.frame(long2, lat2, mag1)
table
long2 lat2 mag1
1 -2.62 52.84 1.9
2 1.94 57.03 4.2
3 -0.24 51.16 0.6
4 -2.34 53.34 0.8
5 -3.16 55.73 2.0
6 -0.24 51.16 1.0
7 -4.11 53.03 1.5
8 -0.24 51.16 0.2
9 -0.24 51.16 1.1
10 -5.70 57.08 1.6
11 -2.40 53.00 1.4
12 -1.19 53.35 1.2
13 -1.02 53.84 1.7
14 -4.24 52.62 0.8
15 -3.23 54.24 0.3
16 -2.06 52.62 1.0
17 1.63 54.96 1.7
18 -5.24 56.05 0.7
19 -5.86 55.84 1.3
20 -3.22 54.23 0.3
21 -0.24 51.16 -1.4
22 -0.24 51.16 -0.7
23 -4.01 55.92 0.3
24 -5.18 50.08 2.3
25 -1.95 54.44 1.0
library(ggplot2)
library(maps)
w <- map_data("world", region = "uk")
uk <- ggplot(data = w, aes(x = long, y = lat, group=group)) + geom_polygon(fill = "seagreen2", colour="white") + coord_map()
uk + geom_point(data=table, aes(x=long2, y=lat2, colour="red", size=2), position="jitter", alpha=I(0.5))
Is it the way I have built my map, or how I am plotting my points? And how do I fix it?
I've made three changes to your code, and one or more of them solved the problems you were having. I'm not sure exactly which—feel free to experiment!
I named your data pdat (point data) instead of table. table is the name of a built-in R function, and it's best to avoid using it as a variable name.
I have placed both data= expressions inside the geom function that needs that data (instead of placing the data= and aes() inside the initial ggplot() call.) When I use two or more data.frames in a single plot, I do this defensively and find that it avoids many problems.
I have moved colour="red" and size=2 outside of the aes() function. aes() is used to create an association between a column in your data.frame and a visual attribute of the plot. Anything that's not a name of a column doesn't belong inside aes().
# Load data.
pdat <- read.table(header=TRUE,
text="long2 lat2 mag1
1 -2.62 52.84 1.9
2 1.94 57.03 4.2
3 -0.24 51.16 0.6
4 -2.34 53.34 0.8
5 -3.16 55.73 2.0
6 -0.24 51.16 1.0
7 -4.11 53.03 1.5
8 -0.24 51.16 0.2
9 -0.24 51.16 1.1
10 -5.70 57.08 1.6
11 -2.40 53.00 1.4
12 -1.19 53.35 1.2
13 -1.02 53.84 1.7
14 -4.24 52.62 0.8
15 -3.23 54.24 0.3
16 -2.06 52.62 1.0
17 1.63 54.96 1.7
18 -5.24 56.05 0.7
19 -5.86 55.84 1.3
20 -3.22 54.23 0.3
21 -0.24 51.16 -1.4
22 -0.24 51.16 -0.7
23 -4.01 55.92 0.3
24 -5.18 50.08 2.3
25 -1.95 54.44 1.0")
library(ggplot2)
library(maps)
w <- map_data("world", region = "uk")
uk <- ggplot() +
geom_polygon(data = w,
aes(x = long, y = lat, group = group),
fill = "seagreen2", colour = "white") +
coord_map() +
geom_point(data = pdat,
aes(x = long2, y = lat2),
colour = "red", size = 2,
position = "jitter", alpha = 0.5)
ggsave("map.png", plot=uk, height=4, width=6, dpi=150)
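A likely cause of the "object 'group' not found" error is aesthetic inheritance: when group=group is set in the initial ggplot() call, every layer inherits it, including geom_point(), whose data has no group column. The sketch below reproduces this with a toy square instead of the UK map (poly and pts are made-up stand-ins) and shows that inherit.aes = FALSE is another way to avoid it.

```r
library(ggplot2)

# Toy stand-ins for the map polygon and the earthquake points (made-up values)
poly <- data.frame(long = c(0, 1, 1, 0), lat = c(0, 0, 1, 1), group = 1)
pts  <- data.frame(long2 = c(0.2, 0.8), lat2 = c(0.3, 0.6))

# group = group is set globally, so geom_point() inherits it,
# but pts has no column called group
bad <- ggplot(poly, aes(x = long, y = lat, group = group)) +
  geom_polygon() +
  geom_point(data = pts, aes(x = long2, y = lat2))
bad_result <- tryCatch(ggplot_build(bad),
                       error = function(e) conditionMessage(e))

# Stopping the point layer from inheriting the global aesthetics avoids the lookup
ok <- ggplot(poly, aes(x = long, y = lat, group = group)) +
  geom_polygon() +
  geom_point(data = pts, aes(x = long2, y = lat2), inherit.aes = FALSE,
             colour = "red", size = 2)
ok_result <- tryCatch(ggplot_build(ok),
                      error = function(e) conditionMessage(e))

is.character(bad_result)  # TRUE when building the first plot failed
is.character(ok_result)   # FALSE when the second plot built cleanly
```

Putting each aes() inside the geom that uses it, as in the answer's code, has the same effect without needing inherit.aes.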
I am trying to create a plot that shows the mean and standard deviation of my data. I have the code that creates the plot, and it works. However, my points are out of order. Below is the data that is being plotted (called Sumcircle):
reduction.area.size antlerless.harvest.rate.sink mean sd
1 0.3 23.14362 5.1980318
5 0.3 24.82013 2.9770937
10 0.3 25.30464 1.9167845
15 0.3 25.27654 1.6662350
20 0.3 24.86209 1.5823747
25 0.3 25.17401 1.3082544
1 0.35 20.65101 4.9711989
5 0.35 21.47942 2.6411690
10 0.35 21.72935 1.8211059
15 0.35 21.30290 1.6275956
20 0.35 21.49806 1.3869719
25 0.35 21.10958 1.1720223
1 0.4 18.09449 4.8401543
5 0.4 17.56596 2.2518005
10 0.4 18.22319 1.7100441
15 0.4 17.89776 1.2087007
20 0.4 17.84899 1.2016877
25 0.4 18.05289 1.0047864
1 0.45 14.35913 4.0633069
5 0.45 14.78276 2.1630511
10 0.45 15.18299 1.6615803
15 0.45 14.83019 1.2601986
20 0.45 14.90748 1.1107997
25 0.45 14.69429 0.8485477
1 0.5 11.75290 3.5159347
5 0.5 12.10627 2.2036029
10 0.5 12.47646 1.4440110
15 0.5 12.31346 0.9431687
20 0.5 12.20568 0.9091177
25 0.5 12.14800 0.8264364
Here is the code I use to create the plot:
library(ggplot2)
pd = position_dodge(.8)
circlelineplot<-ggplot(Sumcircle,
aes(x = antlerless.harvest.rate.sink,
y = mean,
color = reduction.area.size)) +
geom_point(shape = 15,
size = 4,
position = pd) +
geom_errorbar(aes(ymin = mean - sd,
ymax = mean + sd),
width = 0.2,
size = 0.7,
position = pd) +
theme_bw() +
theme(axis.title = element_text(face = "bold"))
# Change the y axis name
circlelineplot + ggtitle("Population Density in Circular Reduction
Area with 30 deer/mi2 Ambient Density") +
theme(plot.title = element_text(hjust=0.5)) +
scale_y_continuous(name ="End Population Density",
breaks=seq(0,30,5)) +
scale_x_discrete(name ="Antlerless Harvest Rate",
breaks=c("0.3","0.35","0.4","0.45","0.5"),
labels=c("30%","35%","40%","45%","50%")) +
scale_color_manual(values=c("brown","brown1","brown2","brown3",
"brown4","darkred"), name="Size of Reduction Area",
limits=c("1","5","10","15","20","25"))
However, this code gives me the following plot:
How do I get the data for the size "5" reduction area to go between the data for sizes "1" and "10"? I thought the limits argument would do that, but it does not. Thanks for any help!
I would like to plot a grouped boxplot using ggplot. Something like the picture below:
Below please see a sample (10 rows) from my data:
alpha colsample_bytree best_F1
35 0.00 0.5 0.5825656
78 0.10 0.3 0.4716612
68 0.00 0.3 0.4714286
27 0.40 1.0 0.4786216
49 0.15 0.5 0.4943968
62 0.00 0.3 0.4938805
70 0.00 0.3 0.4849785
73 0.10 0.3 0.4997061
59 0.30 0.5 0.4856369
88 0.20 0.3 0.4552402
sort(unique(data$alpha))
0 0.1 0.15 0.2 0.3 0.4
sort(unique(data$colsample_bytree))
0.3 0.5 1
My code is the following:
library(ggplot2)
library(ggthemes)
ggplot(data, aes(x= colsample_bytree, y = best_F1, fill = as.factor(alpha))) +
geom_boxplot(alpha = 0.5, position=position_dodge(1)) + theme_economist() +
ggtitle("F1 for alpha and colsample_bytree")
This produces the following plot:
and the following Warning:
Warning message:
"position_dodge requires non-overlapping x intervals"
Since the variable colsample_bytree takes 3 discrete values and the variable alpha takes 6, I would expect to see 3 groups of boxplots, each group made up of 6 boxplots corresponding to the different alpha values and each group positioned at a different value of colsample_bytree, i.e. 0.3, 0.5 and 1.
I would expect the boxplots to not overlap just like in the example I cited.
You just have to include data$colsample_bytree <- as.factor(data$colsample_bytree) before you plot your data with the ggplot command.
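The reason this helps is that a numeric x is treated as continuous, so the boxes are grouped by the fill variable only and cannot be dodged side by side at each x value. A minimal sketch with made-up data (the rep column and the runif() values are assumptions standing in for the real best_F1 scores):

```r
library(ggplot2)

set.seed(42)
# Made-up data with the same alpha and colsample_bytree values as the question
data <- expand.grid(alpha = c(0, 0.1, 0.15, 0.2, 0.3, 0.4),
                    colsample_bytree = c(0.3, 0.5, 1),
                    rep = 1:5)
data$best_F1 <- runif(nrow(data), min = 0.45, max = 0.60)

# Convert x to a factor so each value becomes its own discrete position
data$colsample_bytree <- as.factor(data$colsample_bytree)

p <- ggplot(data, aes(x = colsample_bytree, y = best_F1,
                      fill = as.factor(alpha))) +
  geom_boxplot(alpha = 0.5, position = position_dodge(1)) +
  ggtitle("F1 for alpha and colsample_bytree")
p
```

Writing x = factor(colsample_bytree) directly inside aes() should work as well and leaves the original column numeric.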
I want to get the hexadecimal codes of the colors that the scale_fill_grey function uses to fill the categories of the barplot produced by the following code:
library(ggplot2)
data <- data.frame(
Meal = factor(c("Breakfast","Lunch","Dinner","Snacks"),
levels=c("Breakfast","Lunch","Dinner","Snacks")),
Cost = c(9.75,13,19,10.20))
ggplot(data=data, aes(x=Meal, y=Cost, fill=Meal)) +
geom_bar(stat="identity") +
scale_fill_grey(start=0.8, end=0.2)
scale_fill_grey() uses grey_pal() from the scales package, which in turn uses grey.colors(). So, you can generate the codes for the scale of four colours that you used as follows:
grey.colors(4, start = 0.8, end = 0.2)
## [1] "#CCCCCC" "#ABABAB" "#818181" "#333333"
This shows a plot with the colours:
image(1:4, 1, matrix(1:4), col = grey.colors(4, start = 0.8, end = 0.2))
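If the codes are needed per category, they can be paired with the factor levels. This is a small sketch in base R; meal_cols is an illustrative name, and it assumes the scale assigns colours in level order, as in the bar plot above.

```r
# Same data as in the question
data <- data.frame(
  Meal = factor(c("Breakfast", "Lunch", "Dinner", "Snacks"),
                levels = c("Breakfast", "Lunch", "Dinner", "Snacks")),
  Cost = c(9.75, 13, 19, 10.20))

# One grey per factor level, named by the level it belongs to
meal_cols <- setNames(
  grey.colors(nlevels(data$Meal), start = 0.8, end = 0.2),
  levels(data$Meal))
meal_cols
```

A named vector like this can also be passed to scale_fill_manual(values = meal_cols) if the same greys are needed in another plot.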
Alternatively, use the ggplot_build() function:
#assign ggplot to a variable
myplot <- ggplot(data=data, aes(x=Meal, y=Cost, fill=Meal)) +
geom_bar(stat="identity") +
scale_fill_grey(start=0.8, end=0.2)
#get build
myplotBuild <- ggplot_build(myplot)
#see colours
myplotBuild$data
# [[1]]
# fill x y PANEL group ymin ymax xmin xmax colour size linetype alpha
# 1 #CCCCCC 1 9.75 1 1 0 9.75 0.55 1.45 NA 0.5 1 NA
# 2 #ABABAB 2 13.00 1 2 0 13.00 1.55 2.45 NA 0.5 1 NA
# 3 #818181 3 19.00 1 3 0 19.00 2.55 3.45 NA 0.5 1 NA
# 4 #333333 4 10.20 1 4 0 10.20 3.55 4.45 NA 0.5 1 NA
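To pull out only the colour codes rather than the whole table, the fill column of the first layer can be extracted. A short sketch using layer_data() from ggplot2, which returns the same per-layer data as the build object (fills is an illustrative name):

```r
library(ggplot2)

data <- data.frame(
  Meal = factor(c("Breakfast", "Lunch", "Dinner", "Snacks"),
                levels = c("Breakfast", "Lunch", "Dinner", "Snacks")),
  Cost = c(9.75, 13, 19, 10.20))

myplot <- ggplot(data = data, aes(x = Meal, y = Cost, fill = Meal) ) +
  geom_bar(stat = "identity") +
  scale_fill_grey(start = 0.8, end = 0.2)

# Fill colours actually used by layer 1 (the bars), one per category
fills <- unique(layer_data(myplot, 1)$fill)
fills
```

The result should match the fill column in the table above.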