R ggplot2 barplot+lineplot change origin of lineplot [duplicate] - r

I'm having a problem with geom_bar wherein the bars are not rendered when I specify limits on the y-axis. I believe the following should reproduce the problem:
data <- structure(list(RoleCond = structure(c(1L, 1L, 2L, 2L), .Label = c("Buyer", "Seller"), class = "factor"),
ArgCond = structure(c(1L, 2L, 1L, 2L), .Label = c("No Argument", "Argument"), class = "factor"),
mean = c(2210.71428571429, 2142.70833333333, 2282.40740740741, 2346.2962962963),
se = c(20.1231042081511, 16.7408757749718, 20.1471554637891, 15.708092540868)),
.Names = c("RoleCond", "ArgCond", "mean", "se"), row.names = c(NA, -4L), class = "data.frame")
library(ggplot2)
ggplot(data=data, aes(fill=RoleCond, y=mean, x=ArgCond)) +
geom_bar(position="dodge", stat="identity") +
geom_errorbar(aes(ymin = mean - se, ymax = mean + se), position = position_dodge(width = 0.9), width = 0.1, size = .75) +
scale_y_continuous(limits=c(2000,2500))
which gives me this
The same code without the limits specified works fine. The geom_errorbar() doesn't seem to be related to the problem, but it does illustrate where the bars should be showing up.
I've tried using coord_cartesian(ylim=c(2000,2500)), which works for limiting the y-axis and getting the bars to display, but the axis labels get messed up and I don't understand what I'm doing with it.
Thanks for any suggestions! (I'm using R 2.15.0 and ggplot2 0.9.0)

You could try, with library(scales):
+ scale_y_continuous(limits=c(2000,2500),oob = rescale_none)
instead, as outlined here.
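A minimal sketch of that suggestion applied to the question's data (values rounded here for brevity; the error-bar aesthetics and the dodge width of 0.9 are assumptions, since the question does not define limits or dodge):

```r
library(ggplot2)
library(scales)

# Cell means and standard errors from the question, rounded
dat <- data.frame(
  RoleCond = rep(c("Buyer", "Seller"), each = 2),
  ArgCond  = factor(rep(c("No Argument", "Argument"), times = 2),
                    levels = c("No Argument", "Argument")),
  mean     = c(2210.7, 2142.7, 2282.4, 2346.3),
  se       = c(20.1, 16.7, 20.1, 15.7)
)
dodge <- position_dodge(width = 0.9)  # assumed dodge width

p <- ggplot(dat, aes(x = ArgCond, y = mean, fill = RoleCond)) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                position = dodge, width = 0.1) +
  # rescale_none() returns out-of-bounds values unchanged,
  # so the bar base at 0 is not replaced with NA
  scale_y_continuous(limits = c(2000, 2500), oob = rescale_none)
```

Printing p should then show the bars rising from the bottom of the panel, with the y-axis running from 2000 to 2500.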

Adding an answer for my case which was slightly different in case someone comes across this:
When using position="dodge", the bars get horizontally resized automatically to fill space that is often well beyond the limits of the data itself. As a result, even if both your x-axis and y-axis limits are limits=c(min-1, max+1), for certain data sets position="dodge" might resize the bars beyond that limit range, causing them not to appear. This might even occur if your limit floor is 0, unlike the case above.
Using oob=rescale_none in both scale_y_continuous() AND scale_x_continuous() fixes this issue: values outside the limits are kept instead of being replaced with NA, so the resized bars are still drawn and are merely clipped at the panel edge.
As per earlier comment, it requires package:scales so run library(scales) first.
Hope this helps someone else where the above answers only get you part way.
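The x-axis variant can be sketched with a small made-up data set. Dodging is left out here on the assumption that the mechanism is the same: the bar edges (xmin/xmax) extend past the outermost data values, so limits set exactly at the data range put those edges out of bounds.

```r
library(ggplot2)
library(scales)

# Hypothetical data with a continuous x; bars are 0.9 wide by default
dat <- data.frame(x = 1:3, y = c(2, 5, 3))

# Default oob: bar edges outside c(1, 3) are censored
p_censored <- ggplot(dat, aes(x, y)) +
  geom_col() +
  scale_x_continuous(limits = c(1, 3))

# oob = rescale_none: bar edges outside c(1, 3) are kept
p_kept <- ggplot(dat, aes(x, y)) +
  geom_col() +
  scale_x_continuous(limits = c(1, 3), oob = rescale_none)

# Compare the bar edges that ggplot2 computed in each case
layer_data(p_censored)[, c("xmin", "xmax")]
layer_data(p_kept)[, c("xmin", "xmax")]
```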

This worked for me based on the link shared previously.
p + coord_cartesian(ylim=c(5,15))

This is a community wiki essentially copying user teunbrand's canonical answer to that topic - for more visibility added to this larger thread.
Consider the following plot (geom_col() is equivalent to geom_bar(stat = "identity")):
df <- data.frame(x = letters[1:7],
y = 1:7)
g <- ggplot(df, aes(x, y)) +
geom_col()
g
You can clearly see that the bars look like rectangles. Checking the underlying plot data confirms that the bars are parameterised as rectangles via xmin/xmax/ymin/ymax:
> layer_data(g)
x y PANEL group ymin ymax xmin xmax colour fill size linetype alpha
1 1 1 1 1 0 1 0.55 1.45 NA grey35 0.5 1 NA
2 2 2 1 2 0 2 1.55 2.45 NA grey35 0.5 1 NA
3 3 3 1 3 0 3 2.55 3.45 NA grey35 0.5 1 NA
4 4 4 1 4 0 4 3.55 4.45 NA grey35 0.5 1 NA
5 5 5 1 5 0 5 4.55 5.45 NA grey35 0.5 1 NA
6 6 6 1 6 0 6 5.55 6.45 NA grey35 0.5 1 NA
7 7 7 1 7 0 7 6.55 7.45 NA grey35 0.5 1 NA
Now consider the following plot:
g2 <- ggplot(df, aes(x, y)) +
geom_col() +
scale_y_continuous(limits = c(1, 7))
This one is empty, and reflects the case you have posted. Inspecting the underlying data yields the following:
> layer_data(g2)
y x PANEL group ymin ymax xmin xmax colour fill size linetype alpha
1 1 1 1 1 NA 1 0.55 1.45 NA grey35 0.5 1 NA
2 2 2 1 2 NA 2 1.55 2.45 NA grey35 0.5 1 NA
3 3 3 1 3 NA 3 2.55 3.45 NA grey35 0.5 1 NA
4 4 4 1 4 NA 4 3.55 4.45 NA grey35 0.5 1 NA
5 5 5 1 5 NA 5 4.55 5.45 NA grey35 0.5 1 NA
6 6 6 1 6 NA 6 5.55 6.45 NA grey35 0.5 1 NA
7 7 7 1 7 NA 7 6.55 7.45 NA grey35 0.5 1 NA
You can see that the ymin column is replaced by NAs. This behaviour depends on the oob (out-of-bounds) argument of scale_y_continuous(), which defaults to the scales::censor() function. This censors (replaces with NA) any values that are outside the axis limits, which includes the 0 which should be the ymin column. As a consequence, the rectangles can't be drawn.
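The censoring can be seen in isolation by calling scales::censor() directly on a few numbers (the vector below is made up for illustration):

```r
library(scales)

# 0 and 8 lie outside the limits c(1, 7); 1, 4 and 7 lie inside
y_edges  <- c(0, 1, 4, 7, 8)
censored <- censor(y_edges, range = c(1, 7))
censored
#> [1] NA  1  4  7 NA
```

The 0 plays the role of the bar base here: once it becomes NA, the rectangle has no lower edge left to draw.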
There are two ways to work around this. One is, as Magnus suggested, to use the ylim argument of the coord_cartesian() function:
ggplot(df, aes(x, y)) +
geom_col() +
coord_cartesian(ylim = c(1, 7))
Specifying the limits inside a coord_* function causes the graphical objects to be clipped. You can see this in action when you turn the clipping off:
ggplot(df, aes(x, y)) +
geom_col() +
coord_cartesian(ylim = c(1, 7), clip = "off")
The other option is to pass an alternative oob function to scale_y_continuous(), for example scales::squish:
g3 <- ggplot(df, aes(x, y)) +
geom_col() +
scale_y_continuous(limits = c(1, 7),
oob = scales::squish)
g3
This replaces any value outside the limits with the nearest limit, e.g. the ymin of 0 becomes 1:
> layer_data(g3)
y x PANEL group ymin ymax xmin xmax colour fill size linetype alpha
1 1 1 1 1 1 1 0.55 1.45 NA grey35 0.5 1 NA
2 2 2 1 2 1 2 1.55 2.45 NA grey35 0.5 1 NA
3 3 3 1 3 1 3 2.55 3.45 NA grey35 0.5 1 NA
4 4 4 1 4 1 4 3.55 4.45 NA grey35 0.5 1 NA
5 5 5 1 5 1 5 4.55 5.45 NA grey35 0.5 1 NA
6 6 6 1 6 1 6 5.55 6.45 NA grey35 0.5 1 NA
7 7 7 1 7 1 7 6.55 7.45 NA grey35 0.5 1 NA
Another thing you could do is provide a custom function to the oob argument that simply returns its input. Since clipping is on by default, this reflects the coord_cartesian(ylim = c(1,7)) case:
ggplot(df, aes(x, y)) +
geom_col() +
scale_y_continuous(limits = c(1, 7),
oob = function(x, ...){x})

Related

Problems adapting the y-axis to 2x2 ANOVA bargraph using R and ggplot [duplicate]

This question already has answers here:
geom_bar bars not displaying when specifying ylim
(4 answers)
Closed last year.
I am not a Pro R user but I already tried multiple things and can't find a solution to the problem.
I created a bar graph for 2x2 ANOVA including error bars, APA theme and custom colors based on this website: https://sakaluk.wordpress.com/2015/08/27/6-make-it-pretty-plotting-2-way-interactions-with-ggplot2/
It works nicely but the y-axis starts at 0 although my scale only ranges from 1 - 7. I am trying to adapt the axis but I get strange errors.
This is what I did:
# see https://sakaluk.wordpress.com/2015/08/27/6-make-it-pretty-plotting-2-way-interactions-with-ggplot2/
interactionMeans(anova.2)
plot(interactionMeans(anova.2))
#using ggplot
install.packages("ggplot2")
library(ggplot2)
# create factors with value
GIFTSTUDY1DATA$PRICE <- ifelse (Scenario == 3 | Scenario == 4, 1, -1 )
table(GIFTSTUDY1DATA$PRICE)
GIFTSTUDY1DATA$PRICE <- factor(GIFTSTUDY1DATA$PRICE, levels = c(-1, +1),
labels = c("2 expensive", "1 inexpensive"))
GIFTSTUDY1DATA$AFFECT <- ifelse (Scenario == 1 | Scenario == 3, -1, +1 )
table(GIFTSTUDY1DATA$AFFECT)
GIFTSTUDY1DATA$AFFECT <- factor(GIFTSTUDY1DATA$AFFECT,
levels = c(-1,1),
labels = c("poor", "rich"))
# get descriptives
dat2 <- describeBy(EVALUATION,list(GIFTSTUDY1DATA$PRICE,GIFTSTUDY1DATA$AFFECT),
mat=TRUE,digits=2)
dat2
names(dat2)[names(dat2) == 'group1'] = 'Price'
names(dat2)[names(dat2) == 'group2'] = 'Affect'
dat2$se = dat2$sd/sqrt(dat2$n)
# error bars +/- 1 SE
limits = aes(ymax = mean + se, ymin=mean - se)
dodge = position_dodge(width=0.9)
# set layout
apatheme=theme_light()+
theme(panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
panel.border=element_blank(),
axis.line=element_line(),
text=element_text(family='Arial'))
#plot
p=ggplot(dat2, aes(x = Affect, y = mean, fill = Price))+
geom_bar(stat='identity', position=dodge)+
geom_errorbar(limits, position=dodge, width=0.15)+
apatheme+
ylab('mean gift evaluation')+
scale_fill_manual(values=c("yellowgreen","skyblue4"))
p
Which gives me this figure:
https://i.stack.imgur.com/MwdVo.png
Now, if I try to change the y-axis using ylim or scale_y_continuous:
p + ylim(1,7)
p + scale_y_continuous(limits = c(1,7))
I get a graph with the y-axis as wanted but no bars, and a warning stating
Removed 4 rows containing missing values (geom_bar).
https://i.stack.imgur.com/p66H8.png
Using
p + expand_limits(y=c(1,7))
p
changes the upper end of the y-axis but still includes the zero!
What am I doing wrong? Do I have to start all over without using geom_bar?
Thanks in advance.
While Magnus Nordmo's answer is helpful, I would like to add the reason why ggplot2 behaves this way.
Consider the following plot (friendly reminder that geom_col() is shorthand for geom_bar(stat = "identity")):
df <- data.frame(x = letters[1:7],
y = 1:7)
g <- ggplot(df, aes(x, y)) +
geom_col()
g
You can clearly see that the bars look like rectangles. Checking the underlying plot data confirms that the bars are parameterised as rectangles via xmin/xmax/ymin/ymax:
> layer_data(g)
x y PANEL group ymin ymax xmin xmax colour fill size linetype alpha
1 1 1 1 1 0 1 0.55 1.45 NA grey35 0.5 1 NA
2 2 2 1 2 0 2 1.55 2.45 NA grey35 0.5 1 NA
3 3 3 1 3 0 3 2.55 3.45 NA grey35 0.5 1 NA
4 4 4 1 4 0 4 3.55 4.45 NA grey35 0.5 1 NA
5 5 5 1 5 0 5 4.55 5.45 NA grey35 0.5 1 NA
6 6 6 1 6 0 6 5.55 6.45 NA grey35 0.5 1 NA
7 7 7 1 7 0 7 6.55 7.45 NA grey35 0.5 1 NA
Now consider the following plot:
g2 <- ggplot(df, aes(x, y)) +
geom_col() +
scale_y_continuous(limits = c(1, 7))
This one is empty, and reflects the case you have posted. Inspecting the underlying data yields the following:
> layer_data(g2)
y x PANEL group ymin ymax xmin xmax colour fill size linetype alpha
1 1 1 1 1 NA 1 0.55 1.45 NA grey35 0.5 1 NA
2 2 2 1 2 NA 2 1.55 2.45 NA grey35 0.5 1 NA
3 3 3 1 3 NA 3 2.55 3.45 NA grey35 0.5 1 NA
4 4 4 1 4 NA 4 3.55 4.45 NA grey35 0.5 1 NA
5 5 5 1 5 NA 5 4.55 5.45 NA grey35 0.5 1 NA
6 6 6 1 6 NA 6 5.55 6.45 NA grey35 0.5 1 NA
7 7 7 1 7 NA 7 6.55 7.45 NA grey35 0.5 1 NA
You can see that the ymin column is replaced by NAs. This behaviour depends on the oob (out-of-bounds) argument of scale_y_continuous(), which defaults to the scales::censor() function. This censors (replaces with NA) any values that are outside the axis limits, which includes the 0 which should be the ymin column. As a consequence, the rectangles can't be drawn.
There are two ways to work around this. One is, as Magnus suggested, to use the ylim argument of the coord_cartesian() function:
ggplot(df, aes(x, y)) +
geom_col() +
coord_cartesian(ylim = c(1, 7))
Specifying the limits inside a coord_* function causes the graphical objects to be clipped. You can see this in action when you turn the clipping off:
ggplot(df, aes(x, y)) +
geom_col() +
coord_cartesian(ylim = c(1, 7), clip = "off")
The other option is to pass an alternative oob function to scale_y_continuous(), for example scales::squish:
g3 <- ggplot(df, aes(x, y)) +
geom_col() +
scale_y_continuous(limits = c(1, 7),
oob = scales::squish)
g3
This replaces any value outside the limits with the nearest limit, e.g. the ymin of 0 becomes 1:
> layer_data(g3)
y x PANEL group ymin ymax xmin xmax colour fill size linetype alpha
1 1 1 1 1 1 1 0.55 1.45 NA grey35 0.5 1 NA
2 2 2 1 2 1 2 1.55 2.45 NA grey35 0.5 1 NA
3 3 3 1 3 1 3 2.55 3.45 NA grey35 0.5 1 NA
4 4 4 1 4 1 4 3.55 4.45 NA grey35 0.5 1 NA
5 5 5 1 5 1 5 4.55 5.45 NA grey35 0.5 1 NA
6 6 6 1 6 1 6 5.55 6.45 NA grey35 0.5 1 NA
7 7 7 1 7 1 7 6.55 7.45 NA grey35 0.5 1 NA
Another thing you could do is provide a custom function to the oob argument that simply returns its input. Since clipping is on by default, this reflects the coord_cartesian(ylim = c(1,7)) case:
ggplot(df, aes(x, y)) +
geom_col() +
scale_y_continuous(limits = c(1, 7),
oob = function(x, ...){x})
I hope this clarified what is going on here.
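Applied to a dodged 2x2 bar chart like the one in the question, the coord_cartesian() route might look as follows. The cell means and standard errors are invented for illustration (the question's data are not shown), and the explicit breaks = 1:7 is an assumption about how the 1 to 7 scale should be labelled:

```r
library(ggplot2)

# Hypothetical 2x2 cell means and standard errors on a 1-7 scale
dat <- data.frame(
  Affect = rep(c("poor", "rich"), each = 2),
  Price  = rep(c("expensive", "inexpensive"), times = 2),
  mean   = c(3.2, 4.1, 5.0, 5.6),
  se     = c(0.20, 0.25, 0.22, 0.18)
)
dodge <- position_dodge(width = 0.9)

p <- ggplot(dat, aes(x = Affect, y = mean, fill = Price)) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                position = dodge, width = 0.15) +
  # Zoom the view instead of setting scale limits, so no values are censored
  coord_cartesian(ylim = c(1, 7)) +
  scale_y_continuous(breaks = 1:7)
```

Because the limits live in the coord rather than in the scale, the bar base at 0 stays in the data and the bars are simply cut off at the lower edge of the panel.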
I have encountered a similar problem which was solved by replacing
scale_y_continuous(limits = c()) with
coord_cartesian(ylim = c())
I think this might work for you.
Here is an example:
library(tidyverse)
ggplot(mtcars,aes(factor(am),hp)) +
geom_bar(stat = "identity") +
coord_cartesian(ylim = c(1000,3000))
Also see link:
Google R Discussion

How to plot negative values on x-axis from min to max (from -1 to 0)

I plotted values (both positive and negative) with geom_bar in ggplot. On the x-axis the negative values run from -0.15 to -1, but I need them to run from -1 to 0.
Could you help me fix it?
data frame looks like:
id value33333
<dbl> <chr>
1 -0.6
2 -0.8
3 -1
4 -0.2
5 -1
6 0.4
7 -1
8 -1
9 -0.6
10 0.1
11 -0.6
12 -1
13 0.1
14 0.15
15 0.5
16 0.4
17 -0.95
18 0.5
19 -0.6
20 0.05
I need to plot value33333 on x-axis and percent on y axis.
Thanks a lot!
ggplot(data = value33333) + geom_bar(mapping = aes(x = value33333, y = ..prop.., group = 1), stat = "count") +
scale_y_continuous(labels = scales::percent_format()) + theme_bw()
Using xlim(-1.1,0) (-1.1 to include the last bar) works without errors.
head(value33333)
interviewer internalID value
1 Nuriya 3 -0.6
2 Nuriya 5 -0.8
3 Nuriya 7 -1.0
4 Nuriya 9 -0.2
5 Nuriya 11 -1.0
6 Nuriya 13 0.4
ggplot(data = value33333) +
geom_bar(aes(x = value, y = ..prop.., group = 1), stat = "count") +
scale_y_continuous(labels = scales::percent_format()) + theme_bw() +
xlim(-1.1,0)
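One detail in the question's printout is that the value column is shown as <chr>. If the column really is stored as text (an assumption; the head() output above looks numeric), ggplot2 would treat x as discrete and order the labels as text, which could produce the reversed-looking axis. A minimal sketch with made-up values, converting to numeric before plotting:

```r
library(ggplot2)

# Hypothetical data: numbers stored as text, as the <chr> header suggests
dat <- data.frame(value = c("-0.6", "-0.8", "-1", "-0.2", "0.4", "0.1"),
                  stringsAsFactors = FALSE)

# Convert to numeric so that x becomes a continuous, numerically ordered axis
dat$value <- as.numeric(dat$value)

p <- ggplot(dat) +
  geom_bar(aes(x = value, y = after_stat(prop), group = 1)) +
  scale_y_continuous(labels = scales::percent_format()) +
  theme_bw()
```

after_stat(prop) is the newer spelling of ..prop.. and needs ggplot2 3.3.0 or later.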

Stacked/Dodged barplot using base R with x-axis is numerical data

I have looked at all the barplot questions on the site but still couldn't figure out what to do with my dataset. I don't know if it's a duplicate, but any help would be much appreciated.
My dataset
Region Scenario HC NPV1 NPV2
C 1 0.1 10 5
C 2 0.2 8 4
C 3 0.3 7 3
C 4 0.4 6 2
N 1 0.1 10 5
N 2 0.2 8 4
N 3 0.3 7 3
N 4 0.4 6 2
W 1 0.1 10 5
W 2 0.2 8 4
W 3 0.3 7 3
W 4 0.4 6 2
I want to create a barplot where HC and Scenario are on the x-axis, and NPV1 and NPV2 give the bar heights and are distinguished by different patterns. The region should appear as a common label in the middle of each group of four scenarios.
Thanks a lot.
Expected output is something like this.
Further to my above comments, I'm quite unclear about how you'd like to visualise your data. What exactly would you like to show on the x axis?
As a start, perhaps you are after something like this?
library(tidyverse)
df %>%
gather(key, val, -Region, -Scenario, -HC) %>%
unite(x, Region, Scenario, HC) %>%
ggplot(aes(x, val, fill = key)) +
geom_col()
Here categories on the x-axis are of the form <Region>_<Scenario>_<HC>.
Update
To achieve a plot similar to the one you're showing you can do the following
library(tidyverse)
df %>%
gather(key, val, -Region, -Scenario, -HC) %>%
ggplot(aes(HC, val, fill = key)) +
geom_col(position = "dodge2") +
facet_wrap(~Region, nrow = 1, strip.position = "bottom") +
theme_minimal() +
theme(strip.placement = "outside")
Explanation: strip.position = "bottom" ensures that strip labels are at the bottom, and strip.placement = "outside" ensures that strip labels are below the axis labels (to be precise, between the axis labels and axis title).
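Since the question title asks for base R, a rough base-graphics counterpart is sketched below. It is only one reading of the request, not part of the answer above: the shading densities and angles, the label line, and the choice to show HC under each bar pair are all assumptions.

```r
# Same sample data as used in this answer
df <- read.table(text = "Region Scenario HC NPV1 NPV2
C 1 0.1 10 5
C 2 0.2 8 4
C 3 0.3 7 3
C 4 0.4 6 2
N 1 0.1 10 5
N 2 0.2 8 4
N 3 0.3 7 3
N 4 0.4 6 2
W 1 0.1 10 5
W 2 0.2 8 4
W 3 0.3 7 3
W 4 0.4 6 2
", header = TRUE)

# barplot() wants one column per bar group: rows = NPV1/NPV2, columns = 12 scenarios
heights <- t(as.matrix(df[, c("NPV1", "NPV2")]))

# beside = TRUE dodges the two series; density/angle give hatched patterns
mids <- barplot(heights, beside = TRUE,
                names.arg = df$HC,
                density = c(15, 40), angle = c(45, 135), col = "black",
                legend.text = rownames(heights),
                ylab = "NPV")

# Centre one region label under each block of four scenarios
region_centres <- tapply(colMeans(mids), df$Region, mean)
mtext(names(region_centres), side = 1, line = 2.5, at = region_centres)
```

barplot() returns the bar midpoints, which is what makes it possible to place the region labels with mtext().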
Sample data
df <- read.table(text =
"Region Scenario HC NPV1 NPV2
C 1 0.1 10 5
C 2 0.2 8 4
C 3 0.3 7 3
C 4 0.4 6 2
N 1 0.1 10 5
N 2 0.2 8 4
N 3 0.3 7 3
N 4 0.4 6 2
W 1 0.1 10 5
W 2 0.2 8 4
W 3 0.3 7 3
W 4 0.4 6 2
", header = T)

ggplot2 expecting square matrix even though matrix is not symmetric

Hi, I am trying to plot a heat map in ggplot2, using a matrix with 9 rows and 10 columns.
I melt the matrix using the "as.matrix" notation in reshape2 and get the following output:
A1 = melt(as.matrix(A))
Var1 Var2 value
1 1 X0.05 8.690705e-01
2 2 X0.05 1.930320e-01
3 3 X0.05 1.474900e-02
4 4 X0.05 3.498176e-04
5 5 X0.05 2.451419e-06
6 6 X0.05 4.946808e-09
7 7 X0.05 2.832895e-12
8 8 X0.05 4.563140e-16
9 9 X0.05 2.055474e-20
10 1 X0.1 5.906241e-01
11 2 X0.1 7.416265e-01
12 3 X0.1 2.311771e-01
13 4 X0.1 3.892639e-02
14 5 X0.1 3.361408e-03
15 6 X0.1 1.445629e-04
16 7 X0.1 3.043528e-06
17 8 X0.1 3.103555e-08
18 9 X0.1 1.522292e-10
The output is correct, with each column represented by 9 values.
I then rescale by value and get the following output:
A2 = ddply(A1, .(Var2), transform, rescale = rescale(value))
Var1 Var2 value rescale
1 1 X0.05 8.690705e-01 1.000000e+00
2 2 X0.05 1.930320e-01 2.221132e-01
3 3 X0.05 1.474900e-02 1.697101e-02
4 4 X0.05 3.498176e-04 4.025192e-04
5 5 X0.05 2.451419e-06 2.820737e-06
6 6 X0.05 4.946808e-09 5.692068e-09
7 7 X0.05 2.832895e-12 3.259684e-12
8 8 X0.05 4.563140e-16 5.250361e-16
9 9 X0.05 2.055474e-20 0.000000e+00
10 1 X0.1 5.906241e-01 7.963902e-01
11 2 X0.1 7.416265e-01 1.000000e+00
12 3 X0.1 2.311771e-01 3.117163e-01
13 4 X0.1 3.892639e-02 5.248786e-02
14 5 X0.1 3.361408e-03 4.532480e-03
15 6 X0.1 1.445629e-04 1.949266e-04
16 7 X0.1 3.043528e-06 4.103651e-06
17 8 X0.1 3.103555e-08 4.164269e-08
18 9 X0.1 1.522292e-10 0.000000e+00
Everything still looks fine, and when I plot the heat map the output is correct. So far so good:
ggplot(A2, aes(Var2, Var1)) + geom_tile(aes(fill = rescale), colour = "white") + scale_fill_gradient(low = "light blue", high = "dark blue")
The problem comes up when I add custom labels, where the y-axis goes from 1 to 9 (displaying the number of heterozygote individuals) and the x-axis goes from 0.05 to 0.5 (displaying the minor allele frequency):
x = c(0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50)
y = 1:9
ggplot(A4, aes(Var2, Var1)) + geom_tile(aes(fill = rescale), colour = "white") + scale_fill_gradient(low = "light blue", high = "dark blue") + scale_x_discrete(labels= x, expression("Minor allele frequency")) + scale_y_discrete(labels= y, expression("Number of Heterozygotes"))
However, this time the y-axis is all messed up.
It seems to me that ggplot automatically assumes a 10x10 matrix and tries to add the missing labels. I tried to find an option in reshape where I could maybe declare the shape of the matrix, but I was unable to find a solution. Has anyone come across this problem? Any help would be much appreciated, thanks in advance.
Here is one approach. You can change tick mark labels with scale_x_discrete. As for y, I converted Var1 to factor.
ggplot(mydf, aes(x = Var2, y = as.factor(Var1), fill = rescale)) +
geom_tile(color = "white") +
scale_fill_gradient(low = "light blue", high = "dark blue") +
scale_x_discrete(breaks=c("X0.05", "X0.1"), labels=c("0.05", "0.1")) +
xlab("Minor allele frequency") +
ylab("Number of Heterozygotes")
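A self-contained sketch of the same idea for all ten columns is given below. The matrix is random stand-in data, and the label function that strips the leading "X" is an assumption about how the column names should be shown; it avoids listing every break by hand.

```r
library(ggplot2)

# Hypothetical 9 x 10 matrix standing in for A; column names mimic the "X0.05" style
set.seed(1)
A <- matrix(runif(90), nrow = 9,
            dimnames = list(NULL, paste0("X", sprintf("%.2f", (1:10) / 20))))

# Long format with an integer row index, similar in shape to melt(as.matrix(A))
long <- expand.grid(Var1 = seq_len(nrow(A)), Var2 = colnames(A),
                    stringsAsFactors = FALSE)
long$value <- as.vector(A)

# factor(Var1) makes y discrete, so there is one tick per row and no extra labels
p <- ggplot(long, aes(x = Var2, y = factor(Var1), fill = value)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  scale_x_discrete(labels = function(l) sub("^X", "", l)) +
  xlab("Minor allele frequency") +
  ylab("Number of Heterozygotes")
```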
