Visualizing crosstab tables with a plot in R - changing colours

I have the following code in R which is modified from here, which plots a crosstab table:
#load ggplot2
library(ggplot2)
# Set up the vectors
xaxis <- c("A", "B")
yaxis <- c("A","B")
# Create the data frame
df <- expand.grid(xaxis, yaxis)
df$value <- c(120,5,30,200)
#Plot the Data
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value), colour = "lightblue") + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
It produces the right figure, which is great, but I was hoping to custom colour the four dots, ideally so that the top left and bottom right are both one colour and the top right and bottom left are another.
I have tried to use:
+ scale_color_manual(values=c("blue","red","blue","red"))
but that doesn't seem to work. Any ideas?

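scale_color_manual only affects a colour that is mapped inside aes(); in your code the colour is a fixed value, so the scale has nothing to act on. I would suggest colouring by a vector in your data frame. Since you don't have a column that provides this, you can either create one or derive a rule from existing columns (which I have done below):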
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value, colour = (Var2!=Var1))) + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
The important part is colour = (Var2!=Var1); note that I put this inside the aesthetic (aes) for geom_point.
Edit: if you wish to remove the legend (you annotate the chart with totals, so I guess you don't really need it), you can add: g + theme(legend.position="none")
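For the other option mentioned above (creating a column), here is a minimal sketch. The column name off_diagonal and the blue/red choice are my own assumptions for illustration, not part of the original answer:

```r
library(ggplot2)

xaxis <- c("A", "B")
yaxis <- c("A", "B")
df <- expand.grid(xaxis, yaxis)
df$value <- c(120, 5, 30, 200)

# Hypothetical helper column: TRUE where the x and y labels differ
df$off_diagonal <- as.character(df$Var1) != as.character(df$Var2)

g <- ggplot(df, aes(Var1, Var2)) +
  geom_point(aes(size = value, colour = off_diagonal)) +
  geom_text(aes(label = value)) +
  scale_size_continuous(range = c(10, 30)) +
  # The manual scale works here because colour is mapped inside aes()
  scale_colour_manual(values = c("FALSE" = "blue", "TRUE" = "red")) +
  theme_bw() + xlab("") + ylab("") +
  theme(legend.position = "none")
g
```

With colour mapped to a column, scale_colour_manual needs one value per level of that column (two here), not one per point.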

Related

Problem when trying to plot two histograms using fill aesthetic

I've been trying to plot two histograms by using the fill aesthetic and a specific column with two levels. However, instead of displaying both desired histograms, my code displays one histogram with the whole data and another only for the second classification. I don't know whether there is a problem in my syntax or whether this is some kind of tricky issue.
library(tidyverse)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
P1 <- ggplot(db1, aes(x=val)) + geom_histogram()
P2 <- ggplot(db2, aes(x=val)) + geom_histogram()
PF <- ggplot(dbf, aes(x=val)) + geom_histogram()
I want to get this, P1 and P2
ggplot(db1, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
What I want
But the code I think should work, P1 and P2 using the fill aesthetic on column type
ggplot(dbf, aes(x=val)) + geom_histogram(aes(fill=type), alpha=0.5)
My code
Produces the combination of PF and P2
ggplot(dbf, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
What I get
Any help or idea will be highly appreciated!
All you need is to pass position = "identity" to your geom_histogram function.
library(tidyverse)
library(ggplot2)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
ggplot(dbf, aes(x=val, fill = type)) + geom_histogram(alpha=0.5, position = "identity")
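To see why this helps: geom_histogram uses position = "stack" by default, so one group's bars are drawn on top of the other's instead of starting at zero. A rough sketch that inspects the computed bars with layer_data(); the seed, the bins value and the object names are assumptions I added for the illustration:

```r
library(ggplot2)

set.seed(1)
dbf <- data.frame(type = rep(c("A", "B"), c(100, 150)),
                  val  = rnorm(250, mean = 50, sd = 10))

# Default behaviour: bars of the two groups are stacked
p_stack <- ggplot(dbf, aes(x = val, fill = type)) +
  geom_histogram(bins = 30, alpha = 0.5)

# position = "identity": each group's bars start at the baseline and overlap
p_ident <- ggplot(dbf, aes(x = val, fill = type)) +
  geom_histogram(bins = 30, alpha = 0.5, position = "identity")

stack_data <- layer_data(p_stack)
ident_data <- layer_data(p_ident)

any(stack_data$ymin > 0)   # some stacked bars sit on top of the other group
all(ident_data$ymin == 0)  # with "identity", every bar starts at zero
```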
Is your goal to show the overlap via the color combination? I'm not sure how to force geom_histogram to show the overlap, but geom_density does do what you want. You can play with the bandwidth (bw) to show more or less detail.
dbf %>% ggplot() +
aes(x = val, fill = type) +
geom_density(alpha = .5, bw = .5) +
scale_fill_manual(values = c("red","green"))

How to trim extra space from ggplot

I am trying to make an extremely simple heatmap of percentages using ggplot2, which ideally will just be two thin columns. I tried the following code, believing that the width option in aes would solve the problem.
p_prev_tg <- ggplot(tg_melt, aes(x = variable , y = OTU, fill = value,
width=.3)) + geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7))
p_prev_tg
Unfortunately, this returns a plot with lots of empty space as shown. The plot I would like is those two bars side by side, how can I do this in ggplot?
thanks
What about this solution ?
set.seed(1234)
# variable must be a factor: since R 4.0 data.frame() keeps strings as character,
# and as.numeric() / levels() below rely on factor levels
tg_melt <- data.frame(variable=factor(rep(c("Prevalence_T","Prevalence_NT"), each=10)),
OTU=rep(paste0("OTU_",1:10),2),
value=rnorm(20))
library(RColorBrewer)
library(ggplot2)
hm.palette2 <- colorRampPalette(rev(brewer.pal(11, 'Spectral')))
p_prev_tg <- ggplot(tg_melt, aes(x = as.numeric(variable), y = OTU, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
scale_x_continuous(breaks=c(1,2),
limits=c(0,3),
labels=levels(tg_melt$variable))+
# apply the complete theme first, otherwise it resets the axis.text size
theme_bw() +
theme(axis.text=element_text(size=7))
p_prev_tg

Removing ggplot2 legend removes whole data from the plot

I have a 2-dimensional numeric array, dataset, and a 1-dimensional numeric array of labels, clustering. I plot them with the following code:
s = data.frame(x = dataset[,1], y = dataset[,2])
p = ggplot(s, aes(x, y))
p + geom_point(aes(colour = factor(clustering)))
which displays a beautiful picture:
Now I want to remove the legend completely, and I found this possible solution:
# Remove legend for a particular aesthetic (fill)
p + guides(fill=FALSE)
# It can also be done when specifying the scale
p + scale_fill_discrete(guide=FALSE)
# This removes all legends
p + theme(legend.position="none")
but none of these commands help. Each one shows an empty plot instead:
So how do I remove the legend from my plot?
Try this:
library(ggplot2)
s = data.frame(x = rnorm(20), y = rnorm(20), clustering = rep(c(1, 2), 10))
p <- ggplot(s, aes(x, y))+
guides(fill=FALSE)+
geom_point(aes(colour = factor(clustering)))+
scale_fill_discrete(guide=FALSE)+
theme(legend.position="none")
p
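In your code, p only holds ggplot(s, aes(x, y)); the geom_point layer was added in a separate expression that was never saved, so p + theme(...) has no points to draw. You are not saving the plot again each time you add something to it. You can fix this by changing the lines that add to the plot: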
# Remove legend for a particular aesthetic (fill)
p = p + guides(fill=FALSE)
But the way I wrote it is more common R formatting.
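A quick way to check this yourself is to count the layers stored in each plot object. This is only a sketch: the data is made up, and p_points is a name I chose for the illustration:

```r
library(ggplot2)

s <- data.frame(x = rnorm(20), y = rnorm(20), clustering = rep(c(1, 2), 10))

# p has data and aesthetics, but no geom yet
p <- ggplot(s, aes(x, y))
length(p$layers)         # no layers: anything built from p alone draws empty

# Save the result of adding the points, then remove the legend from that object
p_points <- p + geom_point(aes(colour = factor(clustering)))
length(p_points$layers)  # one layer: the points

p_points + theme(legend.position = "none")
```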
Use show.legend = FALSE within geom_point. Here is an example using ggplot2's diamonds dataset.
s <- diamonds
p <- ggplot(data = s, aes(x = depth, y = price))
p + geom_point(aes(colour = factor(cut)), show.legend = FALSE)
Just try this:
p + geom_point(aes(colour = factor(clustering)),show.legend=FALSE)

ggplot: Manually add legends for aesthetics that are not mapped

I want to produce a barplot overlaid with dots where both have separate legends. Also, I want to choose the color of the bars and the size of the dots using arguments outside aes(). As neither is mapped, no legend is produced.
1) How can I add a legend manually for both fill and size?
library(ggplot2)
d <- data.frame(group = 1:3,
prop = 1:3 )
ggplot(d, aes(x=group, y=prop)) +
geom_bar(stat="identity", fill="red") +
geom_point(size=5)
This is what I came up with: I used dummy mappings and modified the legend according to my needs afterwards. But this approach appears clumsy to me.
2) Is there a manual way to say: Add a legend with this title, these shapes, these colors etc.?
d <- data.frame(dummy1="d1",
dummy2="d2",
group = 1:3,
prop = 1:3 )
ggplot(d, aes(x=group, y=prop, fill=dummy1, size=dummy2)) +
geom_bar(stat="identity", fill="red") +
geom_point(size=5) +
scale_fill_discrete(name="fill legend", label="fill label") +
scale_size_discrete(name="size legend", label="size label")
Above I mapped fill to dummy1. So I would expect scale_fill_discrete to alter this legend. But it appears to modify the size legend instead.
3) I am not sure what went wrong here. Any ideas?
I'm not sure why you say "Also, I want to choose the color of the bars and the size of the dots using the arguments outside aes()". Is it something you're trying to do, or is it something that you think you have to do given how ggplot works?
If it's the latter, one solution is as follows:
library(ggplot2)
d <- data.frame(group = 1:3,
prop = 1:3 )
ggplot(d, aes(x=group, y=prop)) +
geom_bar(stat="identity",aes( fill="label")) +
geom_point(aes(size='labelsize')) +
scale_fill_manual(breaks = 'label', values = 'red')+
scale_size_manual(breaks = 'labelsize', values = 5)
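If you also want to control the legend titles and labels, as asked in 2), one possible variation is to use the constant string as the label and pass name to the manual scales. A sketch only, reusing the title and label texts from the question; geom_col is used here as shorthand for geom_bar(stat="identity"):

```r
library(ggplot2)

d <- data.frame(group = 1:3, prop = 1:3)

p <- ggplot(d, aes(x = group, y = prop)) +
  # Constant strings inside aes() create one-entry legends
  geom_col(aes(fill = "fill label")) +
  geom_point(aes(size = "size label")) +
  # name sets the legend title; the named values set colour and point size
  scale_fill_manual(name = "fill legend", values = c("fill label" = "red")) +
  scale_size_manual(name = "size legend", values = c("size label" = 5))
p
```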

Individual Gradient Fill with Facet Wrap

I'm trying to achieve an output where the fill gradient is independent on each histogram. I know I could make individual plots and then combine them using grid.arrange, but I want this to work on a data set with any number of columns.
Any help is appreciated.
P.S. I would include an image but I don't have the reputation points.
library(ggplot2)
library(reshape2) # provides melt()
# rm(list=ls())
var_his <- function(this_data){
this_data <- melt(this_data)
ggplot(this_data, aes(x = value)) +
geom_histogram(aes(x = value, y = ..density.., fill = ..count..), position="identity") +
facet_wrap(~variable, scales = "free") +
scale_fill_gradient('count', low='lightblue', high='steelblue')
}
data(Seatbelts)
data <- data.frame(Seatbelts)
var_his(data)
