asymmetric colour gradient with ggplot bin2d - r

I have a chart which I want to colour the bin density (as below). But I want to have single bins (value=1) as black and higher values either as a single other colour, or better, as a gradient.
I have only been able to have a single black->red gradient, or completely discrete colours which is too confusing. I haven't been able to successfully map manual colours to the 'count' variable of the bin2d function. Can anyone suggest a fix?
My code:
ggplot(x, aes(x=as.factor(V4), y=V2)) +
geom_bin2d(binwidth = c(1,100)) +
scale_fill_continuous(low="black", high="red") +
facet_wrap(~V1, nrow = 1)
Zoomed version, showing how difficult it is to differentiate 2s
EDIT: I've realised a better way to represent this. What I want is a scale that looks like this:
My data (x) looks like this:
V1 V2 V3 V4
5 5831 30 A
5 20451 38 A
5 23151 34 B
5 30061 39 A
5 34191 32 B
5 41641 30 A
So, V2 is position of the row up the y axis, V1 is the facets and V4 is the vertical columns. Existence of the row (previously determined by V3 but not relevant here) contributes to the bin2d count.

I have managed to work this out. I found that you can map to the bin count using "..count.." (in ggplot2 3.3.0 and later, after_stat(count) is the preferred spelling), so the code now reads:
ggplot(x, aes(x=as.factor(V4), y=V2)) +
geom_bin2d(binwidth = c(1,100), aes(fill=as.factor(..count..))) +
scale_fill_manual(values = c("#000000", "#FF9900", "#FF6600", "#FF3300")) +
scale_y_continuous(breaks = scales::pretty_breaks(12)) +
facet_wrap(~V1, nrow = 1)
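
For bins with larger counts, typing every colour by hand gets tedious. Here is a minimal sketch, assuming the maximum bin count is known in advance. The helper name count_palette is hypothetical and not part of ggplot2. It keeps black for a count of 1 and interpolates the remaining colours between the orange and red endpoints used above.

```r
# Hypothetical helper: black for count == 1, then a ramp from `low` to `high`
# for counts 2..max_count. The names are the counts as strings, so the vector
# can be matched against the levels of as.factor(count).
count_palette <- function(max_count, low = "#FF9900", high = "#FF3300") {
  stopifnot(max_count >= 1)
  if (max_count == 1) {
    return(c("1" = "#000000"))
  }
  ramp <- grDevices::colorRampPalette(c(low, high))(max_count - 1)
  stats::setNames(c("#000000", ramp), as.character(seq_len(max_count)))
}

pal <- count_palette(4)
print(pal)
```

The result can be passed to scale_fill_manual(values = pal) in place of the hand-written vector.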

Related

Creating a mirrored, grouped barplot?

I have attempted adapting some other solutions for slightly different situations, but I have not been able to sort this out.
I would like to build a mirrored barplot comparing chemicals with controls, but with results grouped by chemical concentration and (if possible) with positive values on both sides of the axis.
I provide data below, and an example of what I would like it to look like in general.
volatiles<-c("hexenal3", "trans2hexenal", "trans2hexenol", "ethyl2hexanol", "phenethylalcohol", "methylsalicylate", "geraniol", "eugenol")
require(reshape2)
dat<-list(
conc1=data.frame("volatile"=volatiles, "focal"=c(26,27,28,28,31,31,30,28), "control"=c(24,31,30,29,24,23,21,25)),
conc2=data.frame("volatile"=volatiles, "focal"=c(29,18,34,17,30,32,35,27), "control"=c(21,42,20,40,25,16,17,29)),
conc3=data.frame("volatile"=volatiles, "focal"=c(33, 5,38, 7,37,35,40,26), "control"=c(18,51,14, 50,15,12,13,31))
)
long.dat<-melt(dat)
Attempting the following isn't working. Perhaps I should input a different data structure?
ggplot(long.dat, aes(x=L1, group=volatile, y=value, fill=variable)) +
geom_bar(stat="identity", position= "identity")
I would like it to look similar to this, but with the bars grouped in the triads of different concentrations (and, if possible, with all positive values).
Thanks in advance!
Try this:
long.dat$value[long.dat$variable == "focal"] <- -long.dat$value[long.dat$variable == "focal"]
library(ggplot2)
gg <- ggplot(long.dat, aes(interaction(volatile, L1), value)) +
geom_bar(aes(fill = variable), color = "black", stat = "identity") +
scale_y_continuous(labels = abs) +
scale_fill_manual(values = c(control = "#00000000", focal = "blue")) +
coord_flip()
gg
I suspect that the order on the left axis (originally x, but flipped to the left with coord_flip) will be relevant to you. If the current order isn't what you need and using interaction(L1, volatile) instead does not give you the correct order, then you will need to combine the two columns yourself before ggplot(..), convert the result to a factor, and control the levels= so that they are in the order (and string formatting) you need.
Most other aspects can be controlled via + theme(...), such as legend.position="top". I don't know what the asterisks in your demo image might be, but they can likely be added with geom_point (making sure to negate those that should be on the left).
For instance, if you have a $star variable that indicates there should be a star on each particular row,
set.seed(42)
long.dat$star <- sample(c(TRUE,FALSE), prob=c(0.2,0.8), size=nrow(long.dat), replace=TRUE)
head(long.dat)
# volatile variable value L1 star
# 1 hexenal3 focal -26 conc1 TRUE
# 2 trans2hexenal focal -27 conc1 TRUE
# 3 trans2hexenol focal -28 conc1 FALSE
# 4 ethyl2hexanol focal -28 conc1 TRUE
# 5 phenethylalcohol focal -31 conc1 FALSE
# 6 methylsalicylate focal -31 conc1 FALSE
then you can add it with a single geom_point call (and adding the legend move):
gg +
geom_point(aes(y=value + 2*sign(value)), data = ~ subset(., star), pch = 8) +
theme(legend.position = "top")
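
The factor-ordering step mentioned above could be sketched as follows. This is only an illustration on a toy subset of the question's data; the column name label and the " / " separator are assumptions, not anything the question prescribes.

```r
# Toy subset of the question's data: two volatiles, three concentrations.
volatiles <- c("hexenal3", "trans2hexenal")
long.dat <- expand.grid(volatile = volatiles,
                        L1 = c("conc1", "conc2", "conc3"),
                        stringsAsFactors = FALSE)

# Build one label per row, then fix the level order explicitly:
# volatiles in their original order, concentrations ascending within each.
long.dat$label <- paste(long.dat$volatile, long.dat$L1, sep = " / ")
ord <- order(match(long.dat$volatile, volatiles), long.dat$L1)
long.dat$label <- factor(long.dat$label, levels = unique(long.dat$label[ord]))

levels(long.dat$label)
```

Using label as the x aesthetic in place of interaction(volatile, L1) should then draw the bars in this level order; note that coord_flip() places the first level at the bottom.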
Like r2evans stated, the only way to do this is to use negative values in your data, and then manually use abs() when labelling. More specifically, it would look something like this:
ggplot(long.dat, aes(x=L1, group=volatile, y=value, fill=variable)) +
geom_bar(stat="identity", position= "identity") +
scale_y_continuous(breaks= c(-25,-15,-5,5,15,25),labels=abs(c(-25,-15,-5,5,15,25)))
Of course, use whatever labels make the most sense for your data, or you can set a sequence of numbers using the seq() function.
P.S. I also had trouble with your code, so next time please make sure your example is reproducible; you'll get answers quicker!

plotting the count of x-value in one col in df in r

I am struggling with something that should be very simple, and I can't find a solution.
My data frame looks like this:
df
Ch1 V1 V2 ID
A a1 a2 1
B b1 b2 2
C a1 b2 1
D d1 d2 3
...
The ID values range from 1 to 9.
I simply want to plot how often each value (1, 2, 3, ..., 9) occurs in this data frame. My code is this:
ggplot(df, aes(ID)) + # I read that leaving out the y value makes ggplot count the occurrences, which is true
geom_bar()
This works but unfortunately I get this as a result
I want all values to be displayed though.
I tried to modify this with scale_x_continuous
but it didn't work (made the whole x-axis go away and display only 1)
I know I can also create a table = table(df)
But I want to find a universal solution. Because later I want to be able to apply this while making several bars per x-axis value with dependency on V1 or V2 ...
Thank you very much for your help!
According to the OP, the intention is to create
several bars per x-axis value with dependency on V1 or V2
This can be solved either by using fill = V1 and position = "dodge", as already suggested by H 1, or by facetting. Both approaches have their merits depending on the aspect the OP wants to focus on.
Note that in all variants ID is turned into a discrete variable (using factor()) and the default axis title is overridden, which solves the issue with labeling the x-axis.
Dodged position
library(ggplot2)
ggplot(df) +
aes(x = factor(ID), fill = V1) +
geom_bar(position = "dodge") +
xlab("ID")
This is good if the focus is on comparing the differences between V1 within each ID value.
Facets
library(ggplot2)
ggplot(df) +
aes(x = factor(ID), fill = V1) +
geom_bar() +
xlab("ID") +
facet_wrap(~ V1, nrow = 1L)
Here, the focus is on comparing the distribution of ID counts within each V1.
Colouring the bars in addition to faceting is redundant (but I find it aesthetically more pleasing as compared to all-black bars).
Data
As there were no reproducible data supplied in the question, I have tried to simulate the data by
nr <- 1000L
set.seed(123L) # required to reproduce the data
df <- data.frame(Ch1 = sample(LETTERS[1:4], nr, TRUE),
V1 = paste0(sample(letters[1:4], nr, TRUE), "1"),
V2 = paste0(sample(letters[1:4], nr, TRUE), "2"),
ID = pmin(1L + rgeom(nr, 0.3), 9L)
)
"Raw" plot for comparison with OP's chart
library(ggplot2)
ggplot(df) +
aes(x = ID) +
geom_bar()
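
Turning ID into a factor, as done above, only creates levels for values that actually occur. If some IDs between 1 and 9 may be missing from the data, declaring the levels explicitly keeps them as zero counts. Here is a minimal sketch in base R; the toy ID vector is invented for illustration.

```r
# Toy IDs in which the values 4 to 8 never occur.
ID <- c(1, 2, 1, 3, 9, 2, 1)

# Declaring all nine levels keeps the empty ones as zero counts.
ID_f <- factor(ID, levels = 1:9)
counts <- table(ID_f)
print(counts)
```

In ggplot2, the same idea would be aes(x = factor(ID, levels = 1:9)) combined with scale_x_discrete(drop = FALSE), which keeps unused levels on the axis.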

Error: Don't know how to add e2 to a plot

Hello I am working on a data set which looks like as below
raw_data =
week v1 v3 v4 v5 v6
1 17 20.983819 7.799831 16.0600278 113.018687
2 34 22.651678 8.090671 16.4898951 120.824817
3 15 24.197048 6.892516 16.9805836 128.105372
4 14 26.016688 5.272781 17.471264 140.15794
5 26 27.572317 10.767018 17.8686156 154.886518
6 37 29.018684 21.280104 19.8096452 165.244061
7 27 30.395094 32.140543 22.937902 176.453934
8 24 31.832068 44.008145 28.714597 184.7598
9 16 33.383742 45.704626 39.2958153 193.461108
10 28 34.877819 39.355206 45.9069661 201.305558
What I am trying to achieve is to plot variables from v3 to v6 as a stacked area plot while variable v1 as a line plot in the same graph plot across the week.
I have tried the following code which does plot the stack area plot but not the line plot.
mdf <- melt(raw_data, id="Week") # convert to long format
p <- ggplot(mdf, aes(x=Week, y=value)) + geom_area(aes(fill= mdf$variable), position = 'stack') + theme_classic()
p + ggplot(raw_data, aes(x=Week, y=v1)) +geom_line()
and I get the following error
Error: Don't know how to add e2 to a plot
I tried the method suggested by this article, How to overlay geom_bar and geom_line plots with different number of elements using ggplot2?, and used the code below
mdf <- melt(raw_data, id="Week") # convert to long format
p <- ggplot(mdf, aes(x=Week, y=value)) + geom_area(aes(colour =
mdf$variable, fill= mdf$variable), position = 'stack') + theme_classic()
p + geom_line(aes(x=Week, y=mdf$variable=="v1"))
but then I got the below error
Error: Discrete value supplied to continuous scale
I tried to convert the v1 variable as per below code referencing the following article, however it did not help to resolve.
How do I get discrete factor levels to be treated as continuous?
raw_data$v1 <- as.numeric(as.character(raw_data$v1))
Please help me resolve the issue. Also, how do I draw a black border line around each area in my stacked graph so that the areas are easy to tell apart?
Thanks a lot for the help in advance!!
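Your melt command does not work for me, so I'm using gather instead.
The error arises because a second ggplot() call cannot be added to an existing plot with +; only layers, scales and similar components can. All you need to do is add geom_line and specify its data and mapping: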
mdf <- tidyr::gather(raw_data, variable, value, -week, -v1)
ggplot(mdf, aes(week, value)) +
geom_area(aes(fill = variable), position = 'stack', color = 'black') +
geom_line(aes(y = v1), raw_data, lty = 2)
Note: don't use $ inside aes, ever!
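
If neither reshape2 nor tidyr is at hand, the same long format can be built in base R. This is a minimal sketch using the first three rows of the question's data. Note that the printed data uses a lower-case week column, which may be why melt(raw_data, id="Week") fails; that is an assumption, as the question does not show the real column names.

```r
# First three rows of the question's data (note the lower-case `week`).
raw_data <- data.frame(
  week = 1:3,
  v1 = c(17, 34, 15),
  v3 = c(20.983819, 22.651678, 24.197048),
  v4 = c(7.799831, 8.090671, 6.892516),
  v5 = c(16.0600278, 16.4898951, 16.9805836),
  v6 = c(113.018687, 120.824817, 128.105372)
)

# Stack v3..v6 into week/variable/value; v1 stays in raw_data for the line.
value_cols <- c("v3", "v4", "v5", "v6")
mdf <- do.call(rbind, lapply(value_cols, function(col) {
  data.frame(week = raw_data$week, variable = col, value = raw_data[[col]])
}))

str(mdf)
```

The resulting mdf can feed geom_area, while geom_line takes raw_data directly, as in the code above.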

vertical line chart - change line plotting direction to top-down in R

I am looking for a way to connect data points from top to bottom in order to visualize a ranking. The y-axis represents the rank and the x-axis the attributes. With the default settings the line connects the points from left to right, so the points are connected in the wrong order.
With the data below the line should be connected from (6,1) to (4,2) and then (5,3) etc. Ideally the ranking scale should also be inverted so that rank one is at the top.
data <- read.table(header=TRUE, text='
attribute rank
1 6
2 5
3 4
4 2
5 3
6 1
7 7
8 11
9 10
10 8
11 9
')
plot(data$attribute,data$rank,type="l")
Is there a way to change the line drawing direction? My second idea would be to rotate the graph or maybe you have better ideas.
The graph I am trying to achieve is somewhat similar to this one:
example vertical line chart
You can do this with ggplot:
library(ggplot2)
ggplot(data, aes(y = attribute, x = rank)) +
geom_line() +
coord_flip() +
scale_x_reverse()
It solves the problem exactly the way you suggested. The first part of the command (ggplot(...) + geom_line()) creates an "ordinary" line plot. Note that I have already switched x- and y-coordinates. The next command (coord_flip()) flips x- and y-axis, and the last one (scale_x_reverse) changes the ordering of the x-axis (which is plotted as the y-axis) such that 1 is in the top left corner.
Just to show you that something like the example you linked in your question can be done with ggplot2, I add the following example:
library(tidyr)
data$attribute2 <- sample(data$attribute)
data$attribute3 <- sample(data$attribute)
plot_data <- pivot_longer(data, cols = -"rank")
ggplot(plot_data, aes(y = value, x = rank, colour = name)) +
geom_line() +
geom_point() +
coord_flip() +
scale_x_reverse()
If you intend to do your plots with R, learning ggplot2 is really worthwhile. You can find many examples on Cookbook for R.
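
For completeness, the single-line version can probably also be done with base graphics, as in the question's own plot() call. A minimal sketch, assuming the same data: sort the rows by rank so that type = "l" joins them in rank order, and reverse the y-axis limits.

```r
data <- data.frame(attribute = 1:11,
                   rank = c(6, 5, 4, 2, 3, 1, 7, 11, 10, 8, 9))

# type = "l" joins points in row order, so sort the rows by rank first.
by_rank <- data[order(data$rank), ]

# Reversing ylim puts rank 1 at the top.
plot(by_rank$attribute, by_rank$rank, type = "l",
     ylim = rev(range(by_rank$rank)),
     xlab = "attribute", ylab = "rank")
```

After sorting, the first rows are attributes 6, 4 and 5, which matches the order (6,1), (4,2), (5,3) asked for in the question.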

adding a key for geom_line to legend from geom_area

I have a data frame, where I am talking about different flows of water at a dam (water units are kcfs—1000 cubic feet per second—if anyone is interested)
Call it df4plot
date kcfs Flowtype
10/1/2010 50 Power
10/1/2010 10 Spill_Overgen
10/1/2010 8 Spill_Force
10/2/2010 52 Power
10/2/2010 7 Spill_Overgen
10/2/2010 10 Spill_Force
(there are 3x365 rows in the data frame)
So what I want to do is make an aggregated area graph that shows each of these flows
p <- ggplot(data = df4plot, aes(date,kcfs)) +
geom_area(aes(colour = Flowtype, fill=Flowtype), position = "stack")
I want to control the colors used, so I added
plot_colors_aggregate <- c("forestgreen","lightsalmon","dodgerblue")
p <- p +
scale_color_manual(values = plot_colors_aggregate) +
scale_fill_manual(values = plot_colors_aggregate)
Now I want to add a dashed line, showing the maximum turbine capacity—the flow limits for power generation—that vary by month. I have a separate dataframe for this (365 rows long), df4FGline
Date FGlimit
10/1/2010 52
10/2/2010 52
…
11/1/2010 60
11/2/2010 60
...
Etc
So now I have
p <- p +
geom_line(data = df4FGline, aes(x=date,y=FGlimit), colour = "darkblue", linetype = "dashed")
p
The legend is currently just the three blocks for the three types of Flowtype. I’d like to add the dashed line for the flow gate limits to the bottom, but I can’t get it to show up there.
It is probably related to my incomplete understanding of aes (help(aes) is AMAZINGLY unhelpful).
I’ve tried something similar to this and something similar to this, but since I’m only trying to add 1 line to a pre-existing legend, maybe?, this is not working for me.
I tried adding “legend = TRUE” inside the parentheses for the geom_line, but it put a dashed line inside each color box in the legend, AND created a 4th entry for the legend, but offset from the rest of the legend (below and to the right)... ARG!
I swear I have the book on order... any help you can share so that I understand this aesthetic thing and how it relates to the legend a little better, I'd be extremely grateful.
This should help:
df <- data.frame(x = 1:10,y = 1:10)
ggplot(df,aes(x = x,y = y)) +
geom_line(aes(linetype = "dashed")) +
scale_linetype_manual(name = "Linetype",values = "dashed")
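
Applied to the dam example, the same idea might look like the sketch below. The data are the six rows shown in the question plus two limit values, the dates are parsed as Date objects, and the legend label "Turbine capacity" is an invented name. The key point is that linetype is mapped inside aes(), which is what creates a legend entry, while the colour stays a fixed setting outside aes().

```r
library(ggplot2)

# Toy versions of the question's two data frames.
df4plot <- data.frame(
  date = rep(as.Date("2010-10-01") + 0:1, each = 3),
  kcfs = c(50, 10, 8, 52, 7, 10),
  Flowtype = rep(c("Power", "Spill_Overgen", "Spill_Force"), times = 2)
)
df4FGline <- data.frame(
  date = as.Date("2010-10-01") + 0:1,
  FGlimit = c(52, 52)
)

p <- ggplot(df4plot, aes(date, kcfs)) +
  geom_area(aes(fill = Flowtype), position = "stack") +
  # linetype is mapped to a constant label, so it gets its own legend key
  geom_line(data = df4FGline,
            aes(y = FGlimit, linetype = "Turbine capacity"),
            colour = "darkblue") +
  scale_fill_manual(values = c("forestgreen", "lightsalmon", "dodgerblue")) +
  scale_linetype_manual(name = NULL,
                        values = c("Turbine capacity" = "dashed"))
p
```

Because the fill legend and the linetype legend come from different scales, they are drawn as separate blocks; with name = NULL the line entry has no title and usually sits directly below the Flowtype keys.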
