I have data that can be divided by two grouping variables: one is the year and the other is a field characteristic.
box<-as.data.frame(1:36)
box$year <- c(1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997,
1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997)
box$year <- as.character(box$year)
box$case <- c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
6.00,6.11,6.40,7.00,NA,5.44,6.00, NA,6.00,
6.00,6.20,6.40,6.64,6.33,6.60,7.14,6.89,7.10,
6.73,6.27,6.64,6.41,6.42,6.17,6.05,5.89,5.82)
box$code <- c("L","L","L","L","L","L","L","L","L","L","L","L",
"L","L","L","L","L","L","M","M","M","M","M","M",
"M","M","M","M","M","M","M","M","M","M","M","M")
colour <- factor(box$code, labels = c("#F8766D", "#00BFC4"))
In the boxplots, I want to overlay the points to see how the data are distributed. That is easily done with a single boxplot per year:
ggplot(box, aes(x = year, y = case, fill = "#F8766D")) +
geom_boxplot(alpha = 0.80) +
geom_point(colour = colour, size = 5) +
theme(text = element_text(size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
legend.position = "none")
But it becomes more complicated once I map fill to code:
ggplot(box, aes(x = year, y = case, fill = code)) +
geom_boxplot(alpha = 0.80) +
geom_point(colour = colour, size = 5) +
theme(text = element_text(size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
legend.position = "none")
And now the question: how do I move these points onto the boxplots where they belong, so that the blue points sit over the blue boxplot and the red points over the red one?
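Like Henrik said, use position_jitterdodge() and shape = 21 (a filled circle, so the points pick up the same fill mapping as the boxes and are dodged along with them). You can clean up your code a bit too: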
No need to define box, then fill it piece by piece
You can let ggplot hash out the colors if you wish and skip constructing the colors factor. If you want to change the defaults, look into scale_fill_manual and scale_color_manual.
box <- data.frame(year = c(1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997,
1996,1996,1996,1996,1996,1996,1996,1996,1996,
1997,1997,1997,1997,1997,1997,1997,1997,1997),
case = c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
6.00,6.11,6.40,7.00,NA,5.44,6.00, NA,6.00,
6.00,6.20,6.40,6.64,6.33,6.60,7.14,6.89,7.10,
6.73,6.27,6.64,6.41,6.42,6.17,6.05,5.89,5.82),
code = c("L","L","L","L","L","L","L","L","L","L","L","L",
"L","L","L","L","L","L","M","M","M","M","M","M",
"M","M","M","M","M","M","M","M","M","M","M","M"))
ggplot(box, aes(x = factor(year), y = case, fill = code)) +
geom_boxplot(alpha = 0.80) +
geom_point(aes(fill = code), size = 5, shape = 21, position = position_jitterdodge()) +
theme(text = element_text(size = 18),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
legend.position = "none")
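As a rough sketch of the scale_fill_manual suggestion above: the data frame demo and the vector group_fills below are made up for illustration and are not part of the original question. The jitter.width and dodge.width values are just starting points, and outlier.shape = NA is an optional extra so that outliers are not drawn twice (once by the boxplot, once as a jittered point).

```r
library(ggplot2)

# Small illustrative data set (values made up for this sketch)
demo <- data.frame(
  year = factor(rep(c(1996, 1997), each = 6, times = 2)),
  code = rep(c("L", "M"), each = 12),
  case = c(6.4, 6.8, 6.1, 6.3, 5.5, 5.4,  6.0, 6.1, 6.4, 7.0, 5.4, 6.0,
           6.0, 6.2, 6.4, 6.6, 6.3, 6.6,  6.7, 6.3, 6.6, 6.4, 6.4, 6.2)
)

# Hypothetical named vector: one fill colour per level of code
group_fills <- c(L = "#F8766D", M = "#00BFC4")

p <- ggplot(demo, aes(x = year, y = case, fill = code)) +
  # Hide the boxplot's own outlier points; the jittered points show them anyway
  geom_boxplot(alpha = 0.8, outlier.shape = NA) +
  # shape = 21 uses the fill aesthetic, so points are dodged with their box
  geom_point(shape = 21, size = 3,
             position = position_jitterdodge(jitter.width = 0.2,
                                             dodge.width = 0.75)) +
  scale_fill_manual(values = group_fills)
```

Printing p draws the plot. Keeping dodge.width at 0.75 matches the default dodge of geom_boxplot, which is what lines the points up with their boxes.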
I see you've already accepted @JakeKaupp's nice answer, but I thought I would throw in a different option, using geom_dotplot. The data you are visualizing is rather small, so why not forgo the boxplot?
ggplot(box, aes(x = factor(year), y = case, fill = code))+
geom_dotplot(binaxis = 'y', stackdir = 'center',
position = position_dodge())
This heatmap has a built-in grid that I cannot find a way to customize.
I want to keep the horizontal lines of the grid (and, if possible, make them thicker) and disable the vertical lines. Each row should look like a continuous time series where data is present and blank where it is not.
Adding vertical or horizontal lines on top would possibly cover some data, so grid lines, or controlled gaps between the tiny rectangles, are preferable.
Alternatively, geom_raster doesn't show any grid at all, in which case I would need to add the horizontal grid lines myself.
I tried changing linetype in geom_tile, which does change the line type and allows disabling the grid completely with linetype=0, but that does not let me keep the horizontal grid lines. I didn't see any change when modifying the size argument.
This is the code generating the plot as above:
ggplot( DF, aes( x=rows, y=name, fill = value) ) +
#geom_raster( ) +
geom_tile( colour = 'white' ) +
scale_fill_gradient(low="steelblue", high="black",
na.value = "white")+
theme_minimal() +
theme(
legend.position = "none",
plot.margin=margin(grid::unit(0, "cm")),
#line = element_blank(),
#panel.grid = element_blank(),
panel.border = element_blank(),
panel.grid = element_blank(),
panel.spacing = element_blank(),
#panel.grid = element_line(color="black"),
#panel.grid.minor = element_blank(),
plot.caption = element_text(hjust=0, size=8, face = "italic"),
plot.subtitle = element_text(hjust=0, size=8),
plot.title = element_text(hjust=0, size=12, face="bold")) +
labs( x = "", y = "",
#caption= "FUENTE: propia",
fill = "Legend Title",
#subtitle = "Spaces without any data (missing, filtered, etc)",
title = "Time GAPs"
)
I tried to attach DF %>% dput but I get Body is limited to 30000 characters; you entered 203304. If anyone is familiar with a similar Dataset, please advise.
Additionally,
There are two gaps at the left and right of the plot area: one is visible between the y-axis and the data, and on the right the x-axis extends past the data. Neither is controlled by the plot.margin argument.
I would want to set the grid to a thicker line when month changes.
The following data set has the same names and essential structure as your own, and will suffice for an example:
set.seed(1)
DF <- data.frame(
name = rep(replicate(35, paste0(sample(0:9, 10, T), collapse = "")), 100),
value = runif(3500),
rows = rep(1:100, each = 35)
)
Let us recreate your plot with your own code, using the geom_raster version:
library(ggplot2)
p <- ggplot( DF, aes( x=rows, y=name, fill = value) ) +
geom_raster( ) +
scale_fill_gradient(low="steelblue", high="black",
na.value = "white") +
theme_minimal() +
theme(
legend.position = "none",
plot.margin=margin(grid::unit(0, "cm")),
panel.border = element_blank(),
panel.grid = element_blank(),
panel.spacing = element_blank(),
plot.caption = element_text(hjust=0, size=8, face = "italic"),
plot.subtitle = element_text(hjust=0, size=8),
plot.title = element_text(hjust=0, size=12, face="bold")) +
labs( x = "", y = "", fill = "Legend Title", title = "Time GAPs")
p
The key here is to realize that discrete axes are "actually" numeric axes "under the hood", with the discrete ticks being placed at integer values, and factor level names being substituted for those integers on the axis. That means we can draw separating white lines using geom_hline, with values at 0.5, 1.5, 2.5, etc:
p + geom_hline(yintercept = 0.5 + 0:35, colour = "white", size = 1.5)
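To change the thickness of the lines, simply change the size parameter (from ggplot2 3.4.0 onwards, line thickness is set with linewidth instead, and size is deprecated for lines).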
Created on 2022-08-01 by the reprex package (v2.0.1)
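The same idea can be sketched without hard-coding the number of rows, and with the side gaps from the question removed by turning off the x-axis expansion. This is only an illustration under a few assumptions: the small data frame and the names n_levels and separators are made up here, and linewidth assumes ggplot2 3.4.0 or later (use size on older versions).

```r
library(ggplot2)

set.seed(1)
# Made-up stand-in for DF: 5 named series observed at 20 time steps
DF <- data.frame(
  name  = rep(sprintf("series_%02d", 1:5), times = 20),
  rows  = rep(1:20, each = 5),
  value = runif(100)
)

# Discrete levels sit at 1, 2, ..., n on the numeric axis, so the
# separators go half-way between them: 0.5, 1.5, ..., n + 0.5
n_levels   <- length(unique(DF$name))
separators <- seq(0.5, n_levels + 0.5, by = 1)

p <- ggplot(DF, aes(x = rows, y = name, fill = value)) +
  geom_raster() +
  geom_hline(yintercept = separators, colour = "white", linewidth = 1.5) +
  # expand = c(0, 0) removes the padding left and right of the tiles
  scale_x_continuous(expand = c(0, 0)) +
  scale_fill_gradient(low = "steelblue", high = "black", na.value = "white") +
  theme_minimal() +
  theme(panel.grid = element_blank(), legend.position = "none")
```

Deriving separators from the data keeps the lines in the right place if series are added or filtered out later.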
I have a dataset like this:
Year<-rep(2001:2005, each = 5)
name<-c("John","Ellen","Mark","Randy","Luisa")
Name<-c(rep(name,5))
Value<-sample(seq(0,25,by=1),25)
mydata<-data.frame(Year,Name,Value)
And my plot looks like this:
p <- ggplot(mydata, aes(x=Year, y=reorder(Name, desc(Name)), size = Value)) +
geom_point(aes(colour = Value,
alpha = I(as.numeric(Value > 0))))
p <- p + scale_colour_viridis_c(option = "D", direction = -1,
limits = c(1, 25)) +
scale_size_area(guide = "none") +
ylab("Name") +
theme(axis.line = element_blank(),
axis.text.x=element_text(size=11,margin=margin(b=10),colour="black"),
axis.text.y=element_text(size=13,margin=margin(l=10),colour="black",
face="italic"),
axis.ticks = element_blank(),
axis.title=element_text(size=18,face="bold"),
panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.text = element_text(size=14),
legend.title = element_text(size=18))
I would like to improve it in two ways but I couldn't figure out how.
I would like to add a black border around the points. I know I should use pch > 20 and specify colour, but because my colours are mapped to a variable in the dataset (Value, in this case), I don't know exactly how to do that. Note that points with Value = 0 are not plotted. Simple tricks such as plotting bigger black points under my points don't seem feasible to me.
I would like to change the breaks of the scale (e.g., instead of having breaks every 5, I'd like to have breaks every 2.5), but it is a continuous scale, and I'm not sure how to do that.
I am not very familiar with ggplot2, so any help would be appreciated!
You can indeed use a shape >20, e.g. I use shape=21 here. Then you need to change your scale_color_ to scale_fill_, because the color is now black (it is the border of the shape).
For breaks, you could just specify them in the scale itself. Combining both:
ggplot(mydata, aes(x=Year, y=reorder(Name, desc(Name)), size = Value)) +
geom_point(aes(fill = Value,
alpha = I(as.numeric(Value > 0))), shape=21, color = "black") +
scale_fill_viridis_c(option = "D", direction = -1,
limits = c(1, 25), breaks=seq(1, 25, 2.5)) +
scale_size_area(guide = "none") +
ylab("Name") +
theme(axis.line = element_blank(),
axis.text.x=element_text(size=11,margin=margin(b=10),colour="black"),
axis.text.y=element_text(size=13,margin=margin(l=10),colour="black",
face="italic"),
axis.ticks = element_blank(),
axis.title=element_text(size=18,face="bold"),
panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.text = element_text(size=14),
legend.title = element_text(size=18))
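A possible variation on the above, offered only as a sketch: instead of hiding the zero rows with alpha, they can be dropped before plotting, and the breaks can start at 2.5 so that they fall on multiples of 2.5. The names visible and legend_breaks are made up for this example, the data are regenerated with a fixed seed, and the reorder()/desc() call from the question is left out to avoid the extra dependency it implies.

```r
library(ggplot2)

set.seed(42)
mydata <- data.frame(
  Year  = rep(2001:2005, each = 5),
  Name  = rep(c("John", "Ellen", "Mark", "Randy", "Luisa"), times = 5),
  Value = sample(0:25, 25)
)

# Drop the rows that should not be drawn, rather than making them transparent
visible <- subset(mydata, Value > 0)

# Breaks every 2.5 units: 2.5, 5.0, ..., 25.0
legend_breaks <- seq(2.5, 25, by = 2.5)

p <- ggplot(visible, aes(x = Year, y = Name, size = Value, fill = Value)) +
  # shape 21: fill is the mapped interior, colour is the fixed black border
  geom_point(shape = 21, colour = "black", stroke = 0.8) +
  scale_fill_viridis_c(option = "D", direction = -1,
                       limits = c(1, 25), breaks = legend_breaks) +
  scale_size_area(guide = "none")
```

The stroke argument controls the border thickness of shapes 21 to 25.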
I'm trying to map some points across a map of the ocean... this is my dataframe (rik_data)
location mean_longitude mean_latitude
NSTR002 -63.53341 44.47846
NSTR002 -63.53341 44.47846
NSTR001 -63.52704 44.46643
NSTR001 -63.52704 44.46643
NSTR003 -63.50115 44.41449
HFX014 -63.24095 44.21091
HFX014 -63.24095 44.21091
HFX023 -63.22477 44.19080
HFX0165 -63.21937 44.16828
HFX0165 -63.21937 44.16828
HFX020 -63.20010 44.12228
HFX020 -63.20010 44.12228
I want to plot these points so that every location starting with "HFX" is one colour and every location starting with "NSTR" is another. I'm using this code for the graph.
canada = map_data("worldHires", "Canada")
p = ggplot(data = canada) +
geom_polygon(data = canada, aes(x=long, y = lat, group = group), fill = "lightgrey") +
coord_sf(xlim=c(-64.5,-62.8), ylim=c(42.7,45), expand = FALSE) +
#HOW TO ASSIGN COLOR BY STRINGS?
geom_point(data = rik_data,
mapping = aes(x = mean_longitude,
y = mean_latitude), color = "black",
size = 4, alpha = 0.5) +
labs(colour = "Location") +
theme(panel.background = element_rect(fill = "#add8e6"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.ticks.y = element_blank(),
axis.ticks.x = element_blank(),
axis.title.y =element_blank(),
axis.title.x = element_blank(),
axis.text.x=element_blank(),
axis.text.y = element_blank(),
legend.position = c(0.2, 0.2),
legend.background = element_blank(),
text = element_text(size = 25,
family = "sans"))
Does anyone know how to assign red to locations starting with "NSTR" and black to locations starting with "HFX"?
You can create the classification within the ggplot call using a grepl (this assumes all cases are either HFX or NSTR). Replace your geom_point statement with something like this:
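geom_point(data = rik_data,
mapping = aes(x = mean_longitude,
y = mean_latitude,
color = grepl("^HFX", location)), # TRUE for HFX, FALSE otherwise
size = 4, alpha = 0.5) +
# grepl() returns a logical, so the scale's levels are "FALSE" and "TRUE"
scale_color_manual(breaks = c("FALSE", "TRUE"), labels = c("NSTR", "HFX"),
values = c("FALSE" = "red", "TRUE" = "black"))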
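An alternative sketch, in case more prefixes turn up later: derive the group from the location code in a helper column and map colour to that. The column name site_group is made up for this example, the map layer is left out to keep it self-contained, and the regular expression assumes every code is letters followed by digits.

```r
library(ggplot2)

# Unique rows from the question's rik_data
rik_data <- data.frame(
  location       = c("NSTR002", "NSTR001", "NSTR003",
                     "HFX014", "HFX023", "HFX0165", "HFX020"),
  mean_longitude = c(-63.53341, -63.52704, -63.50115,
                     -63.24095, -63.22477, -63.21937, -63.20010),
  mean_latitude  = c(44.47846, 44.46643, 44.41449,
                     44.21091, 44.19080, 44.16828, 44.12228)
)

# Hypothetical helper column: strip the trailing digits to get the prefix
rik_data$site_group <- sub("[0-9]+$", "", rik_data$location)

p <- ggplot(rik_data,
            aes(x = mean_longitude, y = mean_latitude, colour = site_group)) +
  geom_point(size = 4, alpha = 0.5) +
  # Named values tie each colour to a prefix regardless of level order
  scale_colour_manual(values = c(NSTR = "red", HFX = "black"),
                      name = "Location")
```

Because the values are named, the legend shows the prefixes directly and no relabelling of TRUE/FALSE is needed.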
I have been trying to shift my legend title across to be centered over the legend contents using the guide function. I've been trying to use the following code:
guides(colour=guide_legend(title.hjust = 20))
I thought of trying to make a reproducible example, but I think the reason it's not working has something to do with the above line not matching the rest of my code specifically. So here is the rest of the code I'm using in my plot:
NH4.cum <- ggplot(data=NH4_by_Date, aes(x=date, y=avg.NH4, group = CO2, colour=CO2)) +
geom_line(aes(linetype=CO2), size=1) + #line options
geom_point(size=3) + #point symbol sizes
#scale_shape_manual(values = c(1, 16)) + #manually choose symbols
theme_bw()+
theme(axis.text.x=element_text(colour="white"), #change x axis labels to white.
axis.title=element_text(size=12),
axis.title.x = element_text(color="white"), #Change x axis label colour to white
panel.border = element_blank(), #remove box border
axis.line.x = element_line(color="black", size = 0.5), #add x axis line
axis.line.y = element_line(color="black", size = 0.5), #add y axis line
legend.key = element_blank(), #remove grey box from around legend
legend.position = c(0.9, 0.6))+ #change legend position
geom_vline(xintercept=c(1.4,7.5), linetype="dotted", color="black")+ #put in dotted lines for season boundaries
scale_color_manual(values = c("#FF6600", "green4", "#0099FF"),
name=expression(CO[2]~concentration~(ppm))) + #manually define line colour
scale_linetype_manual(guide="none", values=c("solid", "solid", "solid")) + #manually define line types
scale_shape_manual(values = c(16, 16, 16)) + #manually choose symbols
guides(colour=guide_legend(title.hjust = 20))+
scale_y_continuous(expand = c(0, 0), limits = c(0,2200), breaks=seq(0,2200,200))+ #change x axis to intercept y axis at 0
xlab("Date")+
ylab(expression(Membrane~available~NH[4]^{" +"}~-N~(~mu~g~resin^{-1}~14~day^{-1})))+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
geom_errorbar(aes(ymin = avg.NH4 - se.NH4, #set y error bars
ymax = avg.NH4 + se.NH4),
width=0.1)
I have tried doing the following instead with no luck:
guides(fill=guide_legend(title.hjust=20))
I have also adjusted the hjust value from values between -2 to 20 just to see if that made a difference but it didn't.
I'll try to attach a picture of the graph so far so you can see what I'm talking about.
I've looked through all the questions I can on Stack Overflow and, to the best of my knowledge, this is not a duplicate, as it's specific to a coding error of my own somewhere.
Thank-you in advance!!
The obvious approach e.g.
theme(legend.title = element_text(hjust = .5))
didn't work for me. I wonder if it is related to this open issue in ggplot2. In any case, one manual approach would be to remove the legend title, and position a new one manually:
ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
geom_point() +
stat_smooth(se = FALSE) +
theme_bw() +
theme(legend.position = c(.85, .6),
legend.title = element_blank(),
legend.background = element_rect(fill = alpha("white", 0)),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
annotate("text", x = 5, y = 27, size = 3,
label = "CO[2]~concentration~(ppm)", parse = TRUE)
I asked a question earlier. Here is the link: how to add a vertical line using theme() function in my plot
Now a new problem has come up: the horizontal line for band6 is not displayed completely. Can anyone give me some suggestions? Thank you.
And my code is below:
p <- ggplot(data = df1, aes(x = df1$MeanDecreaseAccuaracy, y = reorder(factor(df1$Variables),df1$MeanDecreaseAccuaracy)))
p + geom_segment(aes(yend = df1$Variables,xend = 0)) +
geom_point() +
theme_minimal() +
scale_x_continuous(expand = c(0,0),breaks = c(5,10,15,20,25,30,35,40,45)) +
labs(x = "Mean Decrease in Accuracy",y = "Predictor variables") +
theme(axis.line = element_line(colour = "black"),
axis.text.x = element_text(colour = "black"),
axis.text.y = element_text(colour = "black"),
axis.ticks.x = element_line(size = 0.2,colour = "black"),
axis.ticks.y = element_line(size = 0.2,colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
And the output figure is as follows.
Okay, posting as answer:
Don't use data$column inside aes(). It will cause problems if you try to facet or use other advanced features. You should have
aes(x = MeanDecreaseAccuracy,
y = reorder(factor(Variables), MeanDecreaseAccuracy))
To solve your problem, I would recommend setting limits = c(0, 1.05 * max(df1$MeanDecreaseAccuracy)) inside your scale_x_continuous. (Note that this is not inside aes(), so you do need to use the data$column identifier here.)
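Putting both suggestions together might look roughly like this. The data frame below is a made-up stand-in for df1 (the real values are not shown in the question), and x_max is a name introduced here for the padded upper limit.

```r
library(ggplot2)

# Hypothetical stand-in for df1; values are invented for illustration
df1 <- data.frame(
  Variables            = paste0("band", 1:6),
  MeanDecreaseAccuracy = c(12.1, 18.4, 9.7, 25.3, 30.8, 45.0)
)

# Leave 5% of headroom so the longest segment is not clipped at the edge
x_max <- 1.05 * max(df1$MeanDecreaseAccuracy)

p <- ggplot(df1, aes(x = MeanDecreaseAccuracy,
                     y = reorder(factor(Variables), MeanDecreaseAccuracy))) +
  geom_segment(aes(yend = Variables, xend = 0)) +
  geom_point() +
  scale_x_continuous(expand = c(0, 0),
                     limits = c(0, x_max),
                     breaks = seq(5, 45, by = 5)) +
  labs(x = "Mean Decrease in Accuracy", y = "Predictor variables") +
  theme_minimal() +
  theme(axis.line = element_line(colour = "black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())
```

With expand = c(0, 0) the axis stops exactly at the limits, which is why the upper limit has to be a little larger than the largest value.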