Changing ggplot graphs for comparison


I produce these two graphs using p-values obtained from a pairwise.wilcox.test and the following script. My problem is that I want to have the same colour for the different breaks in both graphs for comparison purposes. I’m aware that the problem here is that in the first graph (Sum of ROH) I don’t have any value in the break (0.001,0.05]. However I want to force the graph to add this break in the legend and to have the same colour as the second graph (Mean ROH Size)
test.result$value<-cut(test.result$value, breaks=c(-Inf,0.001,0.05,1),right=T)
windows()
ggplot(data = test.result, aes(X1, X2, fill = value))+
ggtitle("Sum of ROH")+
xlab("")+
ylab("")+
geom_tile(aes(fill=test.result$value),color="white")+
scale_fill_brewer(palette="Blues", direction = -1,name="p-Val")+
theme_minimal()+
theme(axis.text.x = element_text(angle = 45, vjust = 1,
size = 12, hjust = 1))+
coord_fixed()

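Set the levels of your factor to be the same in both data sets; levels with no observations are allowed. Note that cut() keeps every interval as a level, even an empty one, so with identical breaks the levels should already match. If a level is missing for some other reason, add it with factor() rather than by assigning to levels(), because levels<- relabels the existing levels by position.
For example
test.result$value <- factor(test.result$value, levels = c("(-Inf,0.001]", "(0.001,0.05]", "(0.05,1]"))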
Then add drop = FALSE into your fill scale to keep the value in the legend.
scale_fill_brewer(palette = "Blues", direction = -1, drop = FALSE, name = "p-Val")
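As a minimal sketch of the whole idea, the two plots below share one set of breaks and one fill scale. The p-values and the names p_sum, p_mean and make_plot are made up for illustration; they stand in for the output of pairwise.wilcox.test, which is not shown in the question.

```r
library(ggplot2)

# Hypothetical p-values (assumption: stand-ins for the real test results)
p_sum  <- c(0.0002, 0.0004, 0.30, 0.80)  # nothing falls in (0.001, 0.05]
p_mean <- c(0.0002, 0.01, 0.03, 0.80)

brks   <- c(-Inf, 0.001, 0.05, 1)
f_sum  <- cut(p_sum,  breaks = brks, right = TRUE)
f_mean <- cut(p_mean, breaks = brks, right = TRUE)

# cut() keeps all three intervals as levels, even the empty one
levels(f_sum)
table(f_sum)

# Hypothetical helper: same scale settings for every plot
make_plot <- function(f, title) {
  d <- data.frame(X1 = c("a", "b", "a", "b"),
                  X2 = c("a", "a", "b", "b"),
                  value = f)
  ggplot(d, aes(X1, X2, fill = value)) +
    geom_tile(color = "white") +
    # drop = FALSE keeps unused levels in the legend and in the palette
    scale_fill_brewer(palette = "Blues", direction = -1,
                      drop = FALSE, name = "p-Val") +
    ggtitle(title) +
    coord_fixed()
}

plot_sum  <- make_plot(f_sum,  "Sum of ROH")
plot_mean <- make_plot(f_mean, "Mean ROH Size")
```

Because both factors carry the same three levels and the scale does not drop unused ones, each interval should get the same colour in both plots.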

Related

ggplot2 geom_points won't colour or dodge

So I'm using ggplot2 to plot both a bar graph and points. I'm currently getting this:
As you can see, the bars are nicely separated and colored as desired. However, my points are all uncolored and stacked on top of each other. I would like each point to sit above its designated bar and have the same color.
#Add bars
A <- A + geom_col(aes(y = w1, fill = factor(Species1)),
position = position_dodge(preserve = 'single'))
#Add colors
A <- A + scale_fill_manual(values = c("A. pelagicus"= "skyblue1","A. superciliosus"="dodgerblue","A. vulpinus"="midnightblue","Alopias sp."="black"))
#Add points
A <- A + geom_point(aes(y = f1/2.5),
shape= 24,
size = 3,
fill = factor(Species1),
position = position_dodge(preserve = 'single'))
#change x and y axis range
A <- A + scale_x_continuous(breaks = c(2000:2020), limits = c(2016,2019))
A <- A + expand_limits(y=c(0,150))
# now adding the secondary axis, following the example in the help file ?scale_y_continuous
# and, very important, reverting the above transformation
A <- A + scale_y_continuous(sec.axis = sec_axis(~.*2.5, name = " "))
# modifying axis and title
A <- A + labs(y = " ",
x = " ")
A <- A + theme(plot.title = element_text(size = rel(4)))
A <- A + theme(axis.text.x = element_text(face="bold", size=14, angle=45),
axis.text.y = element_text(face="bold", size=14))
#A <- A + theme(legend.title = element_blank(),legend.position = "none")
#Print plot
A
When I run this code I get the following error:
Error: Unknown colour name: A. pelagicus
In addition: Warning messages:
1: Width not defined. Set with position_dodge(width = ?)
2: In max(table(panel$xmin)) : no non-missing arguments to max; returning -Inf
I've tried a couple of things, but I can't figure out why it works for geom_col and not for geom_point.
Thanks in advance

You have two basic problems: the color error and the points not dodging. The first comes from setting fill = factor(Species1) outside aes() in geom_point, and the second is solved by applying the group= aesthetic.
Both are easier to see with an example:
# dummy dataset
year <- c(rep(2017, 4), rep(2018, 4))
species <- rep(c('things', 'things1', 'wee beasties', 'ew'), 2)
values <- c(10, 5, 5, 4, 60, 10, 25, 7)
pt.value <- c(8, 7, 10, 2, 43, 12, 20, 10)
df <-data.frame(year, species, values, pt.value)
I made "values" for my column heights and, for illustrative purposes, a separate y variable for the points called "pt.value". Otherwise, the data setup is similar to your own. Note that df$year is numeric, so it's best to convert it to a Date (more trouble than it's worth here) or to a factor, since "2017.5" isn't going to make much sense here :). The point is, I need "year" to be discrete, not continuous.
Solve the color error
For the plot, I'll try to keep it similar to yours. Your scale_fill_manual call is fine as it is: values= accepts a named vector (c(name1 = color1, name2 = color2, ...)). The error comes from geom_point, where fill = factor(Species1) sits outside aes(). Anything outside aes() is taken as a literal setting, so ggplot2 tries to use "A. pelagicus" as a color name. To color the points by species, map fill inside aes() instead.
ggplot(df, aes(x=as.factor(year), y=values)) +
geom_col(aes(fill=species), position=position_dodge(width=0.62), width=0.6) +
scale_fill_manual(values=
c('ew' = 'skyblue1', 'things' = 'dodgerblue',
'things1'='midnightblue', 'wee beasties' = 'gray')) +
geom_point(aes(y=pt.value), shape=24, position=position_dodge(width=0.62)) +
theme_bw() + labs(x='Year')
So the colors are applied correctly and my axis is discrete, and the y values of the points are mapped to pt.value like I wanted, but why don't the points dodge?!
Solve the dodging issue
Dodging is a funny thing in ggplot2. position_dodge works on groups, and by default the groups come from the discrete variables mapped in the plot or layer. geom_col has fill=species mapped, so its bars are split by species and dodge. geom_point only has the discrete x (year), so all the points for a year fall into a single group and there is nothing to dodge against.
Basically, we need to let the points know what they are dodging based on. We do that with a group= aesthetic, which tells geom_point what to use as the criterion for dodging. Mapping fill=species in geom_point also groups the points, and colors them to match the bars. When you add that, it works!
ggplot(df, aes(x=as.factor(year), y=values, group=species)) +
geom_col(aes(fill=species), position=position_dodge(width=0.62), width=0.6) +
scale_fill_manual(values=
c('ew' = 'skyblue1', 'things' = 'dodgerblue',
'things1'='midnightblue', 'wee beasties' = 'gray')) +
geom_point(aes(y=pt.value, fill=species), shape=24, position=position_dodge(width=0.62)) +
theme_bw() + labs(x='Year')
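To check that the points really pick up the species colors and dodge, here is a small self-contained sketch. The data frame is made up (the original data behind A is not shown), and the column names w and f are assumptions that loosely mirror w1 and f1 from the question.

```r
library(ggplot2)

# Hypothetical data: two species in two years
df2 <- data.frame(
  year    = factor(rep(c(2017, 2018), each = 2)),
  species = rep(c("A. pelagicus", "A. vulpinus"), times = 2),
  w       = c(10, 5, 60, 25),   # bar heights
  f       = c(8, 7, 43, 20)     # point heights
)

# A named vector is a valid values= argument
cols  <- c("A. pelagicus" = "skyblue1", "A. vulpinus" = "midnightblue")
dodge <- position_dodge(width = 0.9)

p <- ggplot(df2, aes(x = year, group = species)) +
  geom_col(aes(y = w, fill = species), position = dodge) +
  # fill is mapped inside aes(), so the scale translates species to colors
  geom_point(aes(y = f, fill = species), shape = 24, size = 3,
             position = dodge) +
  scale_fill_manual(values = cols)

# Inspect what ggplot2 computed for the point layer
pts <- layer_data(p, 2)
pts[, c("x", "y", "fill")]
```

layer_data() returns the data after scales and position adjustments have been applied, so the fill column should contain the colors from cols and the x column should show four distinct, dodged positions.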

Stat summary for each factor in scatter plot ggplot2: What about fun.x, fun_y combinations?

I have a bunch of data for people touching bacteria for up to 5 touches. I'm comparing how much they pick up with and without gloves. I'd like to plot the mean by the factor NumberContacts and colour it red. E.g. the red dots on the following graphs.
So far I have:
require(tidyverse)
require(reshape2)
Make some data
df<-data.frame(Yes=rnorm(n=100),
No=rnorm(n=100),
NumberContacts=factor(rep(1:5, each=20)))
Calculate the mean for each group= NumberContacts
centroids<-aggregate(data=melt(df,id.vars ="NumberContacts"),value~NumberContacts+variable,mean)
Get them into two columns
centYes<-subset(centroids, variable=="Yes",select=c("NumberContacts","value"))
centNo<-subset(centroids, variable=="No",select="value")
centroids<-cbind(centYes,centNo)
colnames(centroids)<-c("NumberContacts","Gloved","Ungloved")
Make an ugly plot.
ggplot(df,aes(x=Yes,y=No))+
geom_point()+
geom_abline(slope=1,linetype=2)+
stat_ellipse(type="norm",linetype=2,level=0.975)+
geom_point(data=centroids,aes(x=Gloved,y=Ungloved),size=5,color='red')+
#stat_summary(fun.y="mean",colour="red")+ doesn't work
facet_wrap(~NumberContacts,nrow=2)+
theme_classic()
Is there a more elegant way to do this using stat_summary? Also, how can I change the look of the boxes at the top of my graphs?

stat_summary is not an option because (see ?stat_summary):
stat_summary operates on unique x
That is, while we can take a mean of y, x remains fixed. But we may do something else that is very concise:
ggplot(df, aes(x = Yes, y = No, group = NumberContacts)) +
geom_point() + geom_abline(slope = 1, linetype = 2)+
stat_ellipse(type = "norm", linetype = 2, level = 0.975)+
geom_point(data = df %>% group_by(NumberContacts) %>% summarise_all(mean), size = 5, color = "red")+
facet_wrap(~ NumberContacts, nrow = 2) + theme_classic() +
theme(strip.background = element_rect(fill = "black"),
strip.text = element_text(color = "white"))
which also shows that to modify the boxes above you want to look at strip elements of theme.
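The data passed to the second geom_point is just one row of means per facet. A minimal sketch of that summary on its own, written with across() (available in dplyr 1.0.0 and later) instead of summarise_all(); set.seed and the name centroids are only there to make the illustration reproducible:

```r
library(dplyr)

set.seed(1)
df <- data.frame(Yes = rnorm(n = 100),
                 No  = rnorm(n = 100),
                 NumberContacts = factor(rep(1:5, each = 20)))

# One row per NumberContacts, holding the mean of Yes and the mean of No
centroids <- df %>%
  group_by(NumberContacts) %>%
  summarise(across(c(Yes, No), mean), .groups = "drop")

centroids
```

Because the summary keeps the column names Yes and No, it can be dropped into a layer that inherits aes(x = Yes, y = No) without any extra mapping.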

geom_point isn't filled by scale_fill_manual

I would like to draw a chart with ggplot for a couple of model accuracies. The details of the plotted result don't matter; my problem is filling the geom_point objects.
A sample file can be found here: https://ufile.io/z1z4c
My code is:
library(ggplot2)
library(ggthemes)
Palette <- c('#A81D35', '#085575', '#1DA837')
results <- read.csv('test.csv', colClasses=c('factor', 'factor', 'factor', 'numeric'))
results$dates <- factor(results$dates, levels = c('01', '15', '27'))
results$pocd <- factor(results$pocd, levels = c('without POCD', 'with POCD', 'null accuracy'))
results$model <- factor(results$model, levels = c('SVM', 'DT', 'RF', 'Ada', 'NN'))
ggplot(data = results, group = pocd) +
geom_point(aes(x = dates, y = acc,
shape = pocd,
color = pocd,
fill = pocd,
size = pocd)) +
scale_shape_manual(values = c(0, 1, 3)) +
scale_color_manual(values = c(Palette[1], Palette[2], Palette[3])) +
scale_fill_manual(values = c(Palette[1], Palette[2], Palette[3])) +
scale_size_manual(values = c(2, 2, 1)) +
facet_grid(. ~ model) +
xlab('Date of knowledge') +
ylab('Accuracy') +
theme(legend.position = 'right',
legend.title = element_blank(),
axis.line = element_line(color = '#DDDDDD'))
As a result, I get unfilled circles and squares. How can I fix it so that the squares and circles are filled with the specific color?
Additional question: I would like to add a geom_line to the graph, connecting the three points in each group. However, I fail to adjust the linetype and width. The line always takes the values from scale_*_manual, which is a problem especially in the case of size.
Thanks for helping!

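You need to use shapes that have a fill. Shapes 0, 1 and 3 are drawn as outlines only and ignore fill; shapes 21 to 25 have both an outline (color) and an interior (fill):
scale_shape_manual(values = c(21,22,23)) +
For your additional question, that should be solved if you set aes(size = ...) in the first part of your code (under ggplot(data = ...)) and then override it with a constant size = 1 in geom_line, as in + geom_line(size = 1, ...). In ggplot2 3.4.0 and later, the width of a line is set with linewidth instead of size.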
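A minimal sketch of both points, assuming made-up accuracies because the linked test.csv is not available here. In this variant size is mapped only inside geom_point, so the line layer never sees it; the line width is left at its default to avoid the size/linewidth naming difference between ggplot2 versions.

```r
library(ggplot2)

Palette     <- c('#A81D35', '#085575', '#1DA837')
pocd_levels <- c('without POCD', 'with POCD', 'null accuracy')

# Hypothetical results for a single model
results <- data.frame(
  dates = factor(rep(c('01', '15', '27'), times = 3)),
  acc   = c(0.70, 0.74, 0.78, 0.72, 0.77, 0.81, 0.60, 0.60, 0.60),
  pocd  = factor(rep(pocd_levels, each = 3), levels = pocd_levels)
)

p <- ggplot(results, aes(x = dates, y = acc, group = pocd)) +
  # constant linetype set outside aes(); size is not mapped in this layer
  geom_line(aes(color = pocd), linetype = "dashed") +
  geom_point(aes(shape = pocd, color = pocd, fill = pocd, size = pocd)) +
  scale_shape_manual(values = c(21, 22, 23)) +  # fillable shapes
  scale_color_manual(values = Palette) +
  scale_fill_manual(values = Palette) +
  scale_size_manual(values = c(2, 2, 1))

# Shapes and fills actually used by the point layer
pts <- layer_data(p, 2)
unique(pts[, c("shape", "fill")])
```

The group = pocd mapping is what lets geom_line connect points across the discrete dates axis.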

Ggplot2 in R gives incorrect coloring when creating overlapping demographic pyramids

I am creating an overlapping demographic pyramids in R with ggplot2 library to compare demographic data from two different sources.
However, I have run into problems with ggplot2 and the colouring when using the alpha parameter. I have tried to make sense of ggplot2 and the geom_bar structure, but so far it has gotten me nowhere. The idea is to draw four geom_bars, two of which overlap the other two (males and females, respectively). I'd have no problems if I didn't need to use alpha to show the differences in my data.
I would really appreciate some pointers on where I am going wrong here. As an R programmer I am pretty close to a beginner, so bear with me if my code looks weird.
Below is my code which results in the image also shown below. I have altered my demographic data to be random for this question.
library(ggplot2)
# Here I randomise my data for StackOverflow
poptest<-data.frame(matrix(NA, nrow = 101, ncol = 5))
poptest[,1]<- seq(0,100)
poptest[,2]<- rpois(n = 101, lambda = 100)
poptest[,3]<- rpois(n = 101, lambda = 100)
poptest[,4]<- rpois(n = 101, lambda = 100)
poptest[,5]<- rpois(n = 101, lambda = 100)
colnames(poptest) <- c("age","A_males", "A_females","B_males", "B_females")
myLimits<-c(-250,250)
myBreaks<-seq(-250,250,50)
# Plot demographic pyramid
poptestPlot <- ggplot(data = poptest) +
geom_bar(aes(age,A_females,fill="black"), stat = "identity", alpha=0.75, position = "identity")+
geom_bar(aes(age,-A_males, fill="black"), stat = "identity", alpha=0.75, position="identity")+
geom_bar(aes(age,B_females, fill="white"), stat = "identity", alpha=0.5, position="identity")+
geom_bar(aes(age,-B_males, fill="white"), stat = "identity", alpha=0.5, position="identity")+
coord_flip()+
#set the y-axis which (because of the flip) shows as the x-axis
scale_y_continuous(name = "",
limits = myLimits,
breaks = myBreaks,
#give the values on the y-axis a name, to remove the negatives
#give abs() command to remove negative values
labels = paste0(as.character(abs(myBreaks))))+
#set the x-axis which (because of the flip) shows as the y-axis
scale_x_continuous(name = "age",breaks=seq(0,100,5)) +
#remove the legend
theme(legend.position = 'none')+
# Annotate geom_bars
annotate("text", x = 100, y = -200, label = "males",size=6)+
annotate("text", x = 100, y = 200, label = "females",size=6)
# show results in a separate window
x11()
print(poptestPlot)
This is what I get as result: (sorry, as a StackOverflow noob I can't embed my pictures)
The colouring is really nonsensical. Black is not black and white is not white. Instead it may use some sort of default coloring because R or ggplot2 can't interpret my code.
I welcome any and all answers. Thank you.

You are mapping the string "black" to the fill aesthetic as if it were data. ggplot2 then treats "black" as a category label and assigns it a colour from the default palette. To get the literal colour that way, you would have to add a manual scale and tell ggplot to fill each instance of "black" with the colour "black". There is a shortcut for this called scale_fill_identity. However, if this is your only level, it is much easier to just use fill outside the aes. This way the whole geom is filled in black or white respectively:
poptestPlot <- ggplot(data = poptest) +
geom_bar(aes(age,A_females),fill="black", stat = "identity", alpha=0.75, position = "identity")+
geom_bar(aes(age,-A_males), fill="black", stat = "identity", alpha=0.75, position="identity")+
geom_bar(aes(age,B_females), fill="white", stat = "identity", alpha=0.5, position="identity")+
geom_bar(aes(age,-B_males), fill="white", stat = "identity", alpha=0.5, position="identity")+
coord_flip()+
#set the y-axis which (because of the flip) shows as the x-axis
scale_y_continuous(name = "",
limits = myLimits,
breaks = myBreaks,
#give the values on the y-axis a name, to remove the negatives
#give abs() command to remove negative values
labels = paste0(as.character(abs(myBreaks))))+
#set the x-axis which (because of the flip) shows as the y-axis
scale_x_continuous(name = "age",breaks=seq(0,100,5)) +
#remove the legend
theme(legend.position = 'none')+
# Annotate geom_bars
annotate("text", x = 100, y = -200, label = "males",size=6)+
annotate("text", x = 100, y = 200, label = "females",size=6)
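The difference between the three approaches can be checked on a toy data set. The sketch below uses made-up counts and geom_col (equivalent to geom_bar with stat = "identity"); the object names are only for illustration.

```r
library(ggplot2)

# Hypothetical counts for five ages
d <- data.frame(age = 0:4, a = c(5, 6, 7, 6, 5), b = c(6, 5, 8, 5, 6))

# 1. "black" inside aes(): treated as a category, gets a default palette colour
p_mapped <- ggplot(d) + geom_col(aes(age, a, fill = "black"))

# 2. fill outside aes(): a literal colour for the whole layer
p_const <- ggplot(d) +
  geom_col(aes(age, a), fill = "black", alpha = 0.75) +
  geom_col(aes(age, b), fill = "white", alpha = 0.5)

# 3. colour names stored in the data, used as-is via scale_fill_identity()
d_long <- data.frame(age = rep(d$age, 2),
                     n   = c(d$a, d$b),
                     col = rep(c("black", "white"), each = 5))
p_ident <- ggplot(d_long, aes(age, n, fill = col)) +
  geom_col(position = "identity", alpha = 0.5) +
  scale_fill_identity()

# Compare the fill values ggplot2 ends up drawing
fill_mapped <- unique(layer_data(p_mapped, 1)$fill)
fill_const  <- unique(layer_data(p_const, 1)$fill)
fill_ident  <- sort(unique(layer_data(p_ident, 1)$fill))
```

Only the second and third variants draw with the literal colours; the first ends up with whatever the default discrete palette assigns to the single level "black".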

Override legend with interacting variables in ggplot2

In the following plot:
library(ggplot2)
test <- data.frame(Depth=c(rep(c(0,10,20),4)),
Core=c(rep("A", 6), rep("B",6)),
Variable=c(rep("Treat1",3),rep("Treat2",3), rep("Treat1",3),rep("Treat2",3)),
Value=runif(12,0,1))
ggplot(test, aes(Value, Depth, col=Variable, shape=Core, lty=Core))+
geom_path(aes(group=interaction(Variable, Core))) +
geom_point(aes(group=interaction(Variable, Core)))+
theme_bw()+
guides(colour = guide_legend(aes.override=list(linetype = "solid")))
Is it possible to remove the shapes from the colour-based legend (titled "Variable"), as I tried with aes.override in guides?
My real life example produces this legend:
and I want to remove the shapes from the left legend; in fact I want to replace the current legend keys (lines and shapes) with filled boxes. Since the aes contains an interaction-argument, I fear my attempt to manipulate the legend via colour=guide_legend is futile.

Use override.aes instead of aes.override, and specify linetype = 0 and shape = 15 (filled squares):
ggplot(test, aes(Value, Depth,
color = Variable, shape = Core, lty = Core))+
geom_path(aes(group = interaction(Variable, Core))) +
geom_point(aes(group = interaction(Variable, Core)))+
theme_bw()+
guides(colour = guide_legend(override.aes=list(shape = 15, size = 5, linetype = 0)))
