I'm trying to achieve an output where the fill gradient is independent on each histogram. I know I could make individual plots and then combine them using grid.arrange, but I want this to work on a data set with any number of columns.
Any help is appreciated.
P.S. I would include an image but I don't have the reputation points.
# rm(list=ls())
library(ggplot2)
library(reshape2)  # melt() is assumed to come from reshape2
var_his <- function(this_data){
this_data <- melt(this_data)
ggplot(this_data, aes(x = value)) +
geom_histogram(aes(x = value, y = ..density.., fill = ..count..), position="identity") +
facet_wrap(~variable, scales = "free") +
scale_fill_gradient('count', low='lightblue', high='steelblue')
}
data(Seatbelts)
data <- data.frame(Seatbelts)
var_his(data)
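One possible approach, as a sketch only: stat_bin also computes ncount, the bin count rescaled to a maximum of 1, and it does so separately for each panel. Mapping fill to ncount instead of count should therefore let every facet run through the whole gradient on its own. The function name var_his_scaled and the bins argument are made up for this illustration, and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)
library(reshape2)  # assumed source of melt()

# Hypothetical variant of var_his(): fill follows ncount (count / max count per panel),
# so the colour gradient is independent in each facet.
var_his_scaled <- function(this_data, bins = 30) {
  long <- melt(this_data, measure.vars = names(this_data))
  ggplot(long, aes(x = value)) +
    geom_histogram(aes(y = after_stat(density), fill = after_stat(ncount)),
                   bins = bins, position = "identity") +
    facet_wrap(~variable, scales = "free") +
    scale_fill_gradient("relative count", low = "lightblue", high = "steelblue")
}

p <- var_his_scaled(data.frame(Seatbelts))
p
```

The legend then shows the relative count (0 to 1) rather than the raw count, which is the price of sharing one fill scale across panels.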
I've been trying to plot two histograms by using the fill aesthetic and a specific column with two levels. However, instead of displaying both desired histograms, my code displays one histogram with the whole data and another only for the second classification. I don't know whether there is a problem in my syntax or whether this is some kind of tricky issue.
library(tidyverse)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
P1 <- ggplot(db1, aes(x=val)) + geom_histogram()
P2 <- ggplot(db2, aes(x=val)) + geom_histogram()
PF <- ggplot(dbf, aes(x=val)) + geom_histogram()
I want to get this, P1 and P2
ggplot(db1, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
What I want
But the code I think should work, P1 and P2 with the fill aesthetic for column type
ggplot(dbf, aes(x=val)) + geom_histogram(aes(fill=type), alpha=0.5)
My code
Produces the combination of PF and P2
ggplot(dbf, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
What I get
Any help or idea will be highly appreciated!
All you need is to pass position = "identity" to your geom_histogram function.
library(tidyverse)
library(ggplot2)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
ggplot(dbf, aes(x=val, fill = type)) + geom_histogram(alpha=0.5, position = "identity")
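Some background on why this works: geom_histogram stacks the groups by default (position = "stack"), so the bars of one type sit on top of the other and the outline looks like a histogram of the whole data. A small check, as a sketch with made-up object names (p_stack, p_ident), is to compare where the bars start in each version:

```r
library(ggplot2)

set.seed(1)
dbf <- data.frame(type = rep(c("A", "B"), times = c(100, 150)),
                  val  = rnorm(250, mean = 50, sd = 10))

# Default: bars of one group are stacked on top of the other
p_stack <- ggplot(dbf, aes(x = val, fill = type)) +
  geom_histogram(bins = 30, alpha = 0.5)

# position = "identity": every bar starts at zero, so the groups overlap
p_ident <- ggplot(dbf, aes(x = val, fill = type)) +
  geom_histogram(bins = 30, alpha = 0.5, position = "identity")

range(layer_data(p_stack)$ymin)  # upper end is above zero: stacked bars
range(layer_data(p_ident)$ymin)  # both ends are zero
```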
Is your goal to show the overlap via the color combination? I'm not sure how to force geom_histogram to show the overlap, but geom_density does do what you want. You can play with the bandwidth (bw) to show more or less detail.
dbf %>% ggplot() +
aes(x = val, fill = type) +
geom_density(alpha = .5, bw = .5) +
scale_fill_manual(values = c("red","green"))
I am trying to add labels to the coloured segments of the bars in a histogram. Here is a reproducible example.
ggplot(aes(displ),data =mpg) + geom_histogram(aes(fill=class),binwidth = 1,col="black")
This code produces a histogram and gives the bars a different colour for each car "class". Is there any way I can add the "class" labels inside the corresponding colours in the graph?
The inbuilt functions geom_histogram and stat_bin are perfect for quickly building plots in ggplot. However, if you are looking to do more advanced styling it is often required to create the data before you build the plot. In your case you have overlapping labels which are visually messy.
The following code builds a binned frequency table for the data frame:
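library(ggplot2)   # provides the mpg data set
library(plyr)      # provides ddply() and .()
library(reshape2)  # melt() is assumed to come from reshape2
# Subset data
mpg_df <- data.frame(displ = mpg$displ, class = mpg$class)
# Optional: print the raw (unbinned) frequency table for inspection
melt(table(mpg_df[, c("displ", "class")]))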
# Bin Data
breaks <- 1
cuts <- seq(0.5, 8, breaks)
mpg_df$bin <- .bincode(mpg_df$displ, cuts)
# Count the data
mpg_df <- ddply(mpg_df, .(mpg_df$class, mpg_df$bin), nrow)
names(mpg_df) <- c("class", "bin", "Freq")
You can use this new table to set a conditional label, so boxes are only labelled if there are more than a certain number of observations:
ggplot(mpg_df, aes(x = bin, y = Freq, fill = class)) +
geom_bar(stat = "identity", colour = "black", width = 1) +
geom_text(aes(label=ifelse(Freq >= 4, as.character(class), "")),
position=position_stack(vjust=0.5), colour="black")
I don't think it makes a lot of sense duplicating the labels, but it may be more useful showing the frequency of each group:
ggplot(mpg_df, aes(x = bin, y = Freq, fill = class)) +
geom_bar(stat = "identity", colour = "black", width = 1) +
geom_text(aes(label=ifelse(Freq >= 4, Freq, "")),
position=position_stack(vjust=0.5), colour="black")
Update
I realised you can actually filter the labels selectively using ..count.., a variable that ggplot computes internally for each bin. No need to preformat the data!
ggplot(mpg, aes(x = displ, fill = class, label = class)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=ifelse(..count..>4, ..count.., "")))
This post is useful for explaining special variables within ggplot: Special variables in ggplot (..count.., ..density.., etc.)
This second approach will only work if you want to label the dataset with the counts. If you want to label the dataset by the class or another parameter, you will have to prebuild the data frame using the first method.
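For reference, ggplot2 3.3.0 and later also offer after_stat() as a way of referring to the same computed variables. A minimal sketch of the plot from the update in that spelling (the object names p and labs are only for illustration):

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, fill = class)) +
  geom_histogram(binwidth = 1, colour = "black") +
  # label a segment with its count only when it holds more than 4 observations
  stat_bin(aes(label = after_stat(ifelse(count > 4, count, ""))),
           binwidth = 1, geom = "text",
           position = position_stack(vjust = 0.5))
p

# The computed labels can be inspected without drawing the plot
labs <- layer_data(p, 2)
head(labs[, c("x", "count", "label")])
```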
Looking at the examples from the other stackoverflow links you shared, all you need to do is change the vjust parameter.
ggplot(mpg, aes(x = displ, fill = class, label = class)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", vjust=1.5)
That said, it looks like you have other issues. Namely, the labels stack on top of each other because there aren't many observations at each point. Instead I'd just let people use the legend to read the graph.
I am trying to make an extremely simple heatmap of percentages using ggplot2, which ideally will be just two thin columns. I tried the following code, believing that the width option in aes would solve the problem.
p_prev_tg <- ggplot(tg_melt, aes(x = variable , y = OTU, fill = value,
width=.3)) + geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7))
p_prev_tg
Unfortunately, this returns a plot with lots of empty space, as shown. The plot I would like is those two bars side by side. How can I do this in ggplot?
thanks
What about this solution?
set.seed(1234)
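# factor() is needed because data.frame() no longer converts strings to factors (R >= 4.0);
# without it, as.numeric(variable) and levels(tg_melt$variable) below would not work
tg_melt <- data.frame(variable=factor(rep(c("Prevalence_T","Prevalence_NT"), each=10)),
OTU=rep(paste0("OTU_",1:10),2),
value=rnorm(20))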
library(RColorBrewer)
library(ggplot2)
hm.palette2 <- colorRampPalette(rev(brewer.pal(11, 'Spectral')))
p_prev_tg <- ggplot(tg_melt, aes(x = as.numeric(variable), y = OTU, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
scale_x_continuous(breaks=c(1,2),
limits=c(0,3),
labels=levels(tg_melt$variable)) +
# theme_bw() goes first; added last, it would reset the axis text size set in theme()
theme_bw() +
theme(axis.text=element_text(size=7))
p_prev_tg
I have the following R code, modified from here, which plots a crosstab table:
#load ggplot2
library(ggplot2)
# Set up the vectors
xaxis <- c("A", "B")
yaxis <- c("A","B")
# Create the data frame
df <- expand.grid(xaxis, yaxis)
df$value <- c(120,5,30,200)
#Plot the Data
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value), colour = "lightblue") + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
It produces the right figure, which is great, but I was hoping to custom colour the four dots, ideally so that the top left and bottom right are both one colour and the top right and bottom left are another.
I have tried to use:
+ scale_color_manual(values=c("blue","red","blue","red"))
but that doesn't seem to work. Any ideas?
I would suggest that you colour by a vector in your data frame. As you don't have a column that gives you this, you can either create one or make a rule based on existing columns (which I have done below):
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value, colour = (Var2!=Var1))) + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
The important part is colour = (Var2!=Var1). Note that I put this inside the aesthetic (aes) for the geom_point.
Edit: if you wish to remove the legend (you annotate the chart with totals, so I guess you don't really need it), you can add: g + theme(legend.position="none") to remove it
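To get the specific colours from the question, a manual scale can be combined with the logical mapping. This is only a sketch: scale_colour_manual is matched against the two values of the logical (FALSE and TRUE), which is why supplying four colours did not have the intended effect. The choice of blue for the diagonal and red for the off-diagonal is an assumption about what is wanted.

```r
library(ggplot2)

df <- expand.grid(Var1 = c("A", "B"), Var2 = c("A", "B"))
df$value <- c(120, 5, 30, 200)

g <- ggplot(df, aes(Var1, Var2)) +
  # TRUE for the off-diagonal cells, FALSE for the diagonal cells
  geom_point(aes(size = value, colour = Var1 != Var2)) +
  geom_text(aes(label = value)) +
  scale_size_continuous(range = c(10, 30)) +
  # one colour per logical value; guide = "none" hides the colour legend
  scale_colour_manual(values = c("FALSE" = "blue", "TRUE" = "red"), guide = "none") +
  theme_bw() + xlab("") + ylab("")
g
```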
I have a dataset with binary variables like the one below.
M4 = matrix(sample(1:2,20*5, replace=TRUE),20,5)
M4 <- as.data.frame(M4)
M4$id <- 1:20
I have produced a stacked bar plot using the code below
library(reshape)
library(ggplot2)
library(scales)
M5 <- melt(M4, id="id")
M5$value <- as.factor(M5$value)
ggplot(M5, aes(x = variable)) + geom_bar(aes(fill = value), position = 'fill') +
scale_y_continuous(labels = percent_format())
Now I want the percentage for each field in each bar to be displayed in the graph, so that each bar reaches 100%. I have tried the approaches from several similar questions, but I can't find any example that fits my situation. How can I manage this task?
Try this method:
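# position belongs in the geom, not in aes(); stat = "count" is used because
# stat_bin no longer accepts a discrete x in current ggplot2
test <- ggplot(M5, aes(x = variable, fill = value)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = percent_format()) +
geom_text(aes(label = paste("n = ", ..count..)), stat = "count", position = position_fill(vjust = 0.5))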
test
EDITED: to give percentages and using the scales package:
require(scales)
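test <- ggplot(M5, aes(x = variable, fill = value)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = percent_format()) +
# divide each count by the total of its own bar, so the labels in every bar add up to 100%
geom_text(aes(label = scales::percent(..count.. / tapply(..count.., ..x.., sum)[as.character(..x..)])),
stat = "count", position = position_fill(vjust = 0.5))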
test
You could use the sjp.stackfrq function from the sjPlot-package (see examples here).
library(sjPlot)
M4 = matrix(sample(1:2,20*5, replace=TRUE),20,5)
M4 <- as.data.frame(M4)
sjp.stackfrq(M4)
# alternative colors: sjp.stackfrq(M4, barColor = c("aquamarine4", "brown3"))
Plot appearance can be customized with various parameters...
I really like the usage of the implicit information that is created by ggplot itself, as described in this post:
using the ggplot_build() function
From my point of view this provides a lot of opportunities to finally control the appearance of a ggplot chart.
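As a rough illustration of what ggplot_build() gives you (a sketch on the mpg data, with made-up object names), the computed data for each layer can be pulled out and reused, for example to build labels yourself:

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = class)) + geom_bar()

# ggplot_build() runs the stats and scales without drawing anything;
# the data element holds one data frame per layer
built <- ggplot_build(p)
bar_data <- built$data[[1]]
bar_data[, c("x", "count")]

# Reuse the computed counts as labels placed at the top of each bar
p + annotate("text", x = bar_data$x, y = bar_data$count,
             label = bar_data$count, vjust = -0.5)
```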
Hope this helps somehow
Tom