symmetric (same axis) Heatmap with ggplot2 - r

I'd like to use ggplot2 to create a symmetrical heatmap. The x-axis should show exactly the same labels as the y-axis. Unfortunately, the ddply() step affects the order.
The input.csv looks like this:
Names,Peter,Tom,Marc
Peter,1,6,1
Tom,2,4,12
Marc,3,0,21
I'm using the following code so far:
library(ggplot2)
library(plyr)
library(reshape2)
library(scales)
dat <- read.csv("input.csv") # read input
dat.m <- melt(dat) # "melt" the dataset into long (pairwise) format
dat.s <- ddply(dat.m, .(variable), transform, rescale = scale(value)) # add a rescaled value per column
file <- ggplot(dat.s, aes(Names, variable)) +
geom_tile(aes(fill = value), colour = "white") +
theme(axis.text.x = element_text(angle = 90, hjust = 1), legend.position = "top")
pdf(file = paste("output", ".pdf", sep = "")) # open the output file
plot(file) # make plot
dev.off() # close the file
This results in a plot where the y-axis (top to bottom) has the labels Marc-Tom-Peter, but the x-axis (left to right) has the labels Marc-Peter-Tom.
Does anyone know how I can get a plot where the labels on both axes have the same (original) order (Peter, Tom, Marc)? Note that this is just a toy example; the real data has more than 100 labels, so manually defining the pairs would not help.
Thanks in advance

First create a vector of the names ordered the way you like it:
lvls <- as.character(dat$Names)
Next, set the levels of variable so that they match the order of Names:
dat.s$variable <- factor(dat.s$variable, levels = lvls)
Now try plotting.
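A fuller sketch of this approach, assuming the toy data from the question is read inline rather than from input.csv. It sets the levels of both Names and variable, because recent R versions read Names as a character column, which ggplot2 would otherwise sort alphabetically. Reversing the y levels is an assumption about the desired layout (Peter at the top).

```r
library(ggplot2)
library(reshape2)

# Toy data from the question, read inline (assumption: same layout as input.csv)
dat <- read.csv(text = "Names,Peter,Tom,Marc
Peter,1,6,1
Tom,2,4,12
Marc,3,0,21")

# Long format with columns Names, variable, value
dat.m <- melt(dat, id.vars = "Names")

# The original row order defines the level order for both axes
lvls <- as.character(dat$Names)
dat.m$Names <- factor(dat.m$Names, levels = lvls)

# The y-axis is drawn bottom-to-top, so reverse the levels
# to read Peter, Tom, Marc from the top
dat.m$variable <- factor(dat.m$variable, levels = rev(lvls))

p <- ggplot(dat.m, aes(Names, variable)) +
  geom_tile(aes(fill = value), colour = "white") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1),
        legend.position = "top")
```

Calling print(p) draws the heatmap. Because the order comes from dat$Names, nothing has to be typed by hand for the real data with 100+ labels.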

You could also just add the limits to your scales. Note that the default is from bottom-to-top, so if I understand you correctly, you also have to use rev to reverse the order. Here's a possible solution:
ggplot(dat.s, aes(Names,variable)) +
geom_tile(aes(fill = value),colour = "white") +
theme(axis.text.x = element_text(angle = 90, hjust = 1),legend.position="top") +
scale_x_discrete(limits = dat$Names) +
scale_y_discrete(limits = rev(dat$Names))

Related

changing ggplot legend unit scale

This question is motivated by a previous post illustrating various ways to change how axis scales are plotted in a ggplot figure, from the default exponential notation to the full integer value (when one's axis values are very large). While I am able to convert the axis scales from exponential notation to full values, I am unclear how one would achieve the same goal for the values appearing in the legend.
While I understand that one can manually change the length of the legend scale with "scale_color..." or "scale_fill..." followed by the "limits" argument, this does not appear to be a solution to getting my legend values to show "6000000000" rather than "6e+09" (or "0" rather than "0e+00" for that matter).
The following example should suffice. My hope is someone can point out how to apply the 'scales' package to legend scales rather than axis scales.
Thanks very much.
library(ggplot2)
library(scales)
Data <- data.frame(
pi = c(2,71,828,1828,45904,523536,2874713,52662497,757247093,6999595749),
e = c(3,14,159,2653,58979,311599,7963468,54418516,1590576171, 99),
face = 1:10)
p <- ggplot(data = Data, aes(x=face, y=e, colour = pi))
myplot <- p + geom_point() +
scale_y_continuous(labels = comma) +
scale_color_gradientn(colours = rainbow(2), limits=c(0,7000000000))
myplot
Use the comma formatter in scale_color_gradientn by setting labels = comma, e.g.:
p <- ggplot(data = Data, aes(x=face, y=e, colour = pi))
myplot <- p + geom_point() +
scale_y_continuous(labels = comma) +
scale_color_gradientn(colours = rainbow(2), limits=c(0,7000000000), labels = comma)
myplot
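The comma formatter prints thousands separators ("6,000,000,000"). If the goal is literally "6000000000", any function that turns the breaks into strings can be passed as labels. A minimal sketch, where plain() is a hypothetical helper defined here and not part of the scales package:

```r
library(ggplot2)

# Hypothetical helper: plain digits, no exponent, no thousands separator
plain <- function(x) format(x, scientific = FALSE, trim = TRUE)

Data <- data.frame(
  pi = c(2, 71, 828, 1828, 45904, 523536, 2874713, 52662497, 757247093, 6999595749),
  e = c(3, 14, 159, 2653, 58979, 311599, 7963468, 54418516, 1590576171, 99),
  face = 1:10)

myplot <- ggplot(Data, aes(x = face, y = e, colour = pi)) +
  geom_point() +
  scale_y_continuous(labels = plain) +
  # The same labels argument controls the text shown in the legend
  scale_color_gradientn(colours = rainbow(2), limits = c(0, 7000000000),
                        labels = plain)
```

The labels argument works the same way on axis scales and on colour or fill scales, so one helper covers both.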

R ggplot2: Line overlayed on Bar Graph (from separate data frames)

I have a bar graph coming from one set of monthly data and I want to overlay on it data from another set of monthly data in the form of a line. Here is a simplified example (in my data the second data set is not a simple manipulation of the first):
library(reshape2)
library(ggplot2)
test<-abs(rnorm(12)*1000)
test<-rbind(test, test+500)
colnames(test)<-month.abb[seq(1:12)]
rownames(test)<-c("first", "second")
otherTest<-apply(test, 2, mean)
test<-melt(test)
otherTest<-as.data.frame(otherTest)
p<-ggplot(test, aes(x=Var2, y=value, fill=Var1, order=-as.numeric(Var2))) + geom_bar(stat="identity")+
theme_bw() + theme(panel.border = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.line = element_line(colour = "black")) +
ggtitle("Test Graph") +
scale_fill_manual(values = c(rgb(1,1,1), rgb(.9,0,0))) +
guides(fill=FALSE) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
works great to get the bar graph:
but I have tried multiple iterations to get the line on there and can't figure it out (like this):
p + geom_line(data=otherTest,size=1, color=rgb(0,.5,0))
Also, if anybody knows how I can make the bars in front of each other so that all you see is a red bar of height 500, I would appreciate any suggestions. I know I can just take the difference between the two lines of the matrix and keep it as a stacked bar but I thought there might be an easy way to put both bars on the x-axis, white in front of red. Thanks!
You have a few problems to deal with here.
Directly answering your question, if you don't provide a mapping via aes(...) in a geom call (like your geom_line...), then the mapping will come from ggplot(). Your ggplot() specifies x=Var2, y=value, fill=Var1.... All of these variable names must exist in your data frame otherTest for this to work, and they don't right now.
So, you either need to ensure that these variable names exist in otherTest, or specify mapping separately in geom_line. You might want to read up about how these layering options work. E.g., here's a post of mine that goes into some detail.
If you go for the first option, some other problems to think about:
Is Var2 a factor with the same levels in both data frames? It probably should be.
To use geom_line as you are, you might need to add group = 1. See here.
Some others too, but here's a brief example of what you might do:
library(reshape2)
library(ggplot2)
test <- abs(rnorm(12)*1000)
test <- rbind(test, test+500)
colnames(test) <- month.abb[seq(1:12)]
rownames(test) <- c("first", "second")
otherTest <- apply(test, 2, mean)
test <- melt(test)
otherTest <- data.frame(
Var2 = names(otherTest),
value = otherTest
)
otherTest$Var2 = factor(otherTest$Var2, levels = levels(test$Var2))
ggplot(test, aes(x = Var2, y = value, group = 1)) +
geom_bar(aes(fill = Var1), stat="identity") +
geom_line(data = otherTest)
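The second option, giving geom_line its own mapping, could look roughly like the sketch below. The data frames and column names (bars, line, month, series) are made up for illustration; the point is only that the line layer carries its own data and aes() and does not inherit the plot-level mapping.

```r
library(ggplot2)

set.seed(1)
v <- abs(rnorm(12) * 1000)

# Bar data in long format (illustrative column names)
bars <- data.frame(
  month  = factor(rep(month.abb, times = 2), levels = month.abb),
  series = rep(c("first", "second"), each = 12),
  value  = c(v, v + 500)
)

# Line data in a separate data frame: the monthly mean of the two series
line <- aggregate(value ~ month, data = bars, FUN = mean)

p <- ggplot(bars, aes(x = month, y = value)) +
  geom_col(aes(fill = series)) +
  # This layer uses its own data and mapping; group = 1 joins the
  # points across the discrete x-axis into a single line
  geom_line(data = line, aes(x = month, y = value, group = 1),
            inherit.aes = FALSE, colour = rgb(0, 0.5, 0))
```

With inherit.aes = FALSE the second data frame only needs the columns named in its own aes() call, so fill = series never has to exist there.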

How to reorder the x axis on a stacked area plot

I have the following data frame and want to plot a stacked area plot:
library(ggplot2)
set.seed(11)
df <- data.frame(a = rlnorm(30), b = as.factor(1:10), c = rep(LETTERS[1:3], each = 10))
ggplot(df, aes(x = as.numeric(b), y = a, fill = c)) +
geom_area(position = 'stack') +
theme_grey() +
scale_x_discrete(labels = levels(as.factor(df$b))) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
The resulting plot on my system looks like this:
Unfortunately, the x-axis doesn't seem to show up. I want to plot the values of df$b rotated so that they don't overlap, and ultimately I would like to sort them in a specific way (haven't gotten that far yet, but I will take any suggestions).
Also, according to ?factor() using as.numeric() with a factor is not the best way to do it. When I call ggplot but leave out the as.numeric() for aes(x=... the plot comes up empty.
Is there a better way to do this?
Leave b as a factor. You will further need to add a group aesthetic which is the same as the fill aesthetic. (This tells ggplot how to "connect the dots" between separate factor levels.)
ggplot(df, aes(x = b, y = a, fill = c, group = c)) +
geom_area(position = 'stack') +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
As for the order, the x-axis follows the order of the factor levels. To change the order of the axis, simply change the order of the factor levels. reorder() works well if you are basing the order on a numeric column (or a function of a numeric column). For arbitrary orders, just specify the order of the levels directly in a factor call, something like df$b = factor(df$b, levels = c("1", "5", "2", ...)). For more examples of this, see the r-faq Order bars in ggplot. Yours isn't a barplot, but the principle is identical.
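Both ways of setting the level order can be tried on a small example. The data frame below is illustrative only and is not the one from the question:

```r
# Small illustrative data frame
df <- data.frame(
  b = factor(c("1", "2", "3", "1", "2", "3")),
  a = c(5, 1, 3, 5, 1, 3)
)

# Arbitrary order: list the levels explicitly
b_manual <- factor(df$b, levels = c("3", "1", "2"))
levels(b_manual)
# [1] "3" "1" "2"

# Data-driven order: sort the levels by the per-level sum of a (ascending)
b_by_sum <- reorder(df$b, df$a, FUN = sum)
levels(b_by_sum)
# [1] "2" "3" "1"
```

Assigning either result back to df$b before plotting changes the axis order, since ggplot2 lays out a discrete axis in level order.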

Plot continuous raster data in binned classes with ggplot2 in R

I quite like the look and feel of ggplot2 and often use it to display raster data (e.g., faceting over timesteps for time-varying precipitation fields is very useful).
However, I'm still wondering whether it is easily possible to bin the continuous raster values into discrete bins and assign to each bin a single colour, that is shown in the legend (as many GIS systems do).
I tried the guide = "legend" and breaks arguments of scale_fill_gradient2. However, these affect only the legend at the side of the graph; the plotted values are still continuous.
library(ggplot2)
data <- data.frame(x=rep(seq(1:10),times = 10), y=rep(seq(1:10),each = 10), value = runif(100,-10,10))
ggplot(data = data, aes(x=x,y=y)) +
geom_raster(aes(fill = value)) +
coord_equal() +
scale_fill_gradient2(low = "darkred", mid = "white", high = "midnightblue",
guide = "legend", breaks = c(-8,-4,0,4,8))
My question is mainly how to discretize the data that is plotted in ggplot, so that the reader of the graph can make quantitative conclusions on the values represented by the colors.
Secondly, how can I still use a diverging color palette (similar to scale_fill_gradient2), that is centered around zero or another specific value?
You should use the raster package to work with raster data. This package provides several functions for working with categorical rasters. For example, with reclassify you can convert a continuous raster into a discrete one. The next example is adapted from this question:
library(raster)
f <- system.file("external/test.grd", package="raster")
r <- raster(f)
r <- reclassify(r, c(0, 500, 1,
500, 2000, 2))
On the other hand, if you want to use the ggplot2 functions, the rasterVis package provides a simple wrapper around ggplot that works with RasterLayer objects:
library(rasterVis)
gplot(r) +
geom_raster(aes(fill = factor(value))) +
coord_equal()
To define your own colors, you can then add:
scale_fill_manual(values = c('red', 'green'))
The best approach is indeed to modify the underlying data set by discretizing it manually. The answer below is based on the answer by joran.
library(ggplot2)
set.seed(1)
data <- data.frame(x = rep(seq(1:10),times = 10),
y = rep(seq(1:10),each = 10),
value = runif(100,-10,10))
# Define category breaks
breaks <- c(-Inf,-3:3,Inf)
data$valueDiscr <- cut(data$value,
breaks = breaks,
right = FALSE)
# Define colors using the function also used by "scale_fill_gradient2"
discr_colors_fct <-
scales::div_gradient_pal(low = "darkred",
mid = "white",
high = "midnightblue")
discr_colors <- discr_colors_fct(seq(0, 1, length.out = length(breaks)))
discr_colors
# [1] "#8B0000" "#B1503B" "#D18978" "#EBC3B9" "#FFFFFF" "#C8C0DB" "#9184B7" "#5B4C93" "#191970"
ggplot(data = data, aes(x=x,y=y)) +
geom_raster(aes(fill = valueDiscr)) +
coord_equal() +
scale_fill_manual(values = discr_colors) +
guides(fill = guide_legend(reverse=T))
Update 2021-05-31:
Based on the comment by @slhck one can indeed discretize the data in the aesthetic mapping as follows:
library(ggplot2)
set.seed(1)
data <- data.frame(x = rep(seq(1:10),times = 10),
y = rep(seq(1:10),each = 10),
value = runif(100,-10,10))
# Define category breaks
breaks <- c(-Inf,-3:3,Inf)
discr_colors <- scales::div_gradient_pal(low = "darkred", mid = "white", high = "midnightblue")(seq(0, 1, length.out = length(breaks)))
# [1] "#8B0000" "#B1503B" "#D18978" "#EBC3B9" "#FFFFFF" "#C8C0DB" "#9184B7" "#5B4C93" "#191970"
ggplot(data = data, aes(x=x,y=y)) +
geom_raster(aes(fill = cut(value, breaks, right=FALSE))) +
coord_equal() +
scale_fill_manual(values = discr_colors) +
guides(fill = guide_legend(reverse=T))
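As a further option, newer ggplot2 releases (3.3.0 and later, as far as I know) include binned scales, so the data can stay continuous while the scale does the binning. A minimal sketch using scale_fill_steps2, which takes the same low/mid/high/midpoint arguments as scale_fill_gradient2; the break values are an assumption chosen to be symmetric around zero:

```r
library(ggplot2)

set.seed(1)
data <- data.frame(x = rep(1:10, times = 10),
                   y = rep(1:10, each = 10),
                   value = runif(100, -10, 10))

# Bin edges (assumption: symmetric around the midpoint 0)
bin_breaks <- c(-8, -4, 0, 4, 8)

p <- ggplot(data = data, aes(x = x, y = y)) +
  geom_raster(aes(fill = value)) +
  coord_equal() +
  # Binned diverging scale: one colour per interval between breaks
  scale_fill_steps2(low = "darkred", mid = "white", high = "midnightblue",
                    midpoint = 0, breaks = bin_breaks)
```

Compared with cut() plus scale_fill_manual, this keeps the value column numeric, at the cost of less direct control over the individual bin colours.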

How do I create a categorical scatterplot in R like boxplots?

Does anyone know how to create a scatterplot in R like these from GraphPad Prism:
I tried using boxplots, but they don't display the data the way I want. The column scatterplots that GraphPad can generate show the data better for me.
Any suggestions would be appreciated.
As @smillig mentioned, you can achieve this using ggplot2. The code below reproduces the plot that you are after pretty well, but be warned that it is quite tricky. First load the ggplot2 package and generate some data:
library(ggplot2)
dd = data.frame(values=runif(21), type = c("Control", "Treated", "Treated + A"))
Next change the default theme:
theme_set(theme_bw())
Now we build the plot.
Construct a base object - nothing is plotted:
g = ggplot(dd, aes(type, values))
Add on the points: adjust the default jitter and change glyph according to type:
g = g + geom_jitter(aes(pch=type), position=position_jitter(width=0.1))
Add on the "box": calculate where the box ends. In this case, I've chosen the average value. If you don't want the box, just omit this step.
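# fun, fun.min and fun.max replace the deprecated fun.y, fun.ymin and fun.ymax (ggplot2 >= 3.3.0)
g = g + stat_summary(fun = mean,
geom="bar", fill="white", colour="black")
Add on some error bars: calculate the upper/lower bounds (here a t-based 95% confidence interval for the mean) and adjust the bar width:
g = g + stat_summary(
# standard error is sd/sqrt(n); the t quantile uses n - 1 degrees of freedom
fun.max=function(i) mean(i) + qt(0.975, length(i) - 1)*sd(i)/sqrt(length(i)),
fun.min=function(i) mean(i) - qt(0.975, length(i) - 1)*sd(i)/sqrt(length(i)),
geom="errorbar", width=0.2)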
Display the plot
g
In my R code above I used stat_summary to calculate the values needed on the fly. You could also create separate data frames and use geom_errorbar and geom_bar.
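The separate-data-frame route might look like the sketch below. mean_ci() is a hypothetical helper written for this example, and the interval it computes (mean plus or minus a t quantile times sd/sqrt(n), with n - 1 degrees of freedom) is an assumption about what the error bars should show.

```r
library(ggplot2)

# Hypothetical helper: mean and t-based confidence limits for one group
mean_ci <- function(x, level = 0.95) {
  n <- length(x)
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(y = mean(x),
             ymin = mean(x) - half_width,
             ymax = mean(x) + half_width)
}

set.seed(1)
dd <- data.frame(values = runif(21),
                 type = rep(c("Control", "Treated", "Treated + A"), times = 7))

# One summary row per group, computed up front instead of inside stat_summary()
groups <- split(dd$values, dd$type)
summ <- do.call(rbind, lapply(groups, mean_ci))
summ$type <- names(groups)

g <- ggplot(dd, aes(type, values)) +
  # Bars and error bars are drawn from the summary data frame
  geom_col(data = summ, aes(x = type, y = y),
           fill = "white", colour = "black", inherit.aes = FALSE) +
  geom_errorbar(data = summ, aes(x = type, ymin = ymin, ymax = ymax),
                width = 0.2, inherit.aes = FALSE) +
  # Raw points go on top so the white bars do not hide them
  geom_jitter(aes(shape = type), width = 0.1)
```

Keeping the summary in its own data frame makes the numbers behind the error bars easy to inspect or export, which is harder when stat_summary computes them during plotting.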
To use base R, have a look at my answer to this question.
If you don't mind using the ggplot2 package, there's an easy way to make similar graphics with geom_boxplot and geom_jitter. Using the mtcars example data:
library(ggplot2)
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot() + geom_jitter() + theme_bw()
which produces the following graphic:
The documentation can be seen here: http://had.co.nz/ggplot2/geom_boxplot.html
I recently faced the same problem and found my own solution, using ggplot2.
As an example, I created a subset of the chickwts dataset.
library(ggplot2)
library(dplyr)
data(chickwts)
Dataset <- chickwts %>%
filter(feed == "sunflower" | feed == "soybean")
Since it is not possible in geom_dotplot() to change the dots to symbols, I used geom_jitter() as follows:
Dataset %>%
ggplot(aes(feed, weight, fill = feed)) +
geom_jitter(aes(shape = feed, col = feed), size = 2.5, width = 0.1)+
stat_summary(fun = mean, geom = "crossbar", width = 0.7,
col = c("#9E0142","#3288BD")) +
scale_fill_manual(values = c("#9E0142","#3288BD")) +
scale_colour_manual(values = c("#9E0142","#3288BD")) +
theme_bw()
This is the final plot:
For more details, you can have a look at this post:
http://withheadintheclouds1.blogspot.com/2021/04/building-dot-plot-in-r-similar-to-those.html?m=1

Resources