ggplot2: how to show the legend [duplicate] - r

I made a simple plot with ggplot2 that draws two series in one graph, but the legend is not showing. I didn't use the melt/reshape approach; I plotted the columns directly. Below is my code.
df <- read.csv("testDataFrame.csv")
graph <- ggplot(df, aes(A)) +
geom_line(aes(y=res1), colour="1") +
geom_point(aes(y=res1), size=5, shape=12) +
geom_line(aes(y=res2), colour="2") +
geom_point(aes(y=res2), size=5, shape=20) +
scale_colour_manual(values=c("red", "green")) +
scale_x_discrete(name="X axis") +
scale_y_continuous(name="Y-axis") +
ggtitle("Test")
#scale_shape_discrete(name ="results",labels=c("Res1", "Res2"),solid=TRUE)
print(graph)
The data frame is:
A,res1,res2
1,11,25
2,29,40
3,40,42
4,50,51
5,66,61
6,75,69
7,85,75
Any suggestion on how to show the legend for the above graph?

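In ggplot2, a legend is drawn for each aesthetic that is mapped to a variable inside aes(), such as colour or shape (group alone does not produce a legend). Constants set outside aes(), like colour="1" in your code, get no legend. To map a variable, you'll have to get your data into long form: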
A variable value
1 res1 11
... ... ...
7 res1 85
1 res2 25
... ... ...
7 res2 75
You can accomplish this with reshape2 using melt (as shown below):
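library(reshape2)
library(ggplot2)
# melt() stacks res1 and res2 into a "variable"/"value" pair, keeping A as the id column
ggplot(data = melt(df, id.vars = "A"), aes(x = A, y = value)) +
  geom_line(aes(colour = variable, group = variable)) +
  geom_point(aes(colour = variable, shape = variable, group = variable), size = 4)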
For example, if you don't want colour for points, then just remove colour=variable from geom_point(aes(.)). For more legend options, follow this link.
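If reshaping is not an option, a legend can also be obtained from the wide data by mapping constant labels inside aes(). The following is a minimal sketch, not the answer's method: the data frame is rebuilt inline from the values in the question, and the labels "res1" and "res2" and the legend title "results" are illustrative choices.

```r
library(ggplot2)

# Same values as the question's testDataFrame.csv, built inline so the sketch is self-contained.
df <- data.frame(
  A    = 1:7,
  res1 = c(11, 29, 40, 50, 66, 75, 85),
  res2 = c(25, 40, 42, 51, 61, 69, 75)
)

# "res1" and "res2" are constant labels mapped inside aes(); because they are
# mapped rather than set, ggplot2 builds colour and shape legends from them.
graph <- ggplot(df, aes(x = A)) +
  geom_line(aes(y = res1, colour = "res1")) +
  geom_point(aes(y = res1, colour = "res1", shape = "res1"), size = 4) +
  geom_line(aes(y = res2, colour = "res2")) +
  geom_point(aes(y = res2, colour = "res2", shape = "res2"), size = 4) +
  # Giving both scales the same name and labels lets ggplot2 merge them into one legend.
  scale_colour_manual(name = "results", values = c(res1 = "red", res2 = "green")) +
  scale_shape_manual(name = "results", values = c(res1 = 12, res2 = 20)) +
  labs(x = "X axis", y = "Y-axis", title = "Test")

print(graph)
```

This stays manageable for two series; with more columns the long format usually scales better.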

Related

ggplot2 - why does changing axis scale affect summary statistics of variables? [duplicate]

I have the following data:
x <- data.frame('myvar'=c(10,10,9,9,8,8, runif(100)), 'mygroup' = c(rep('a', 26), rep('b', 80)))
I want to describe the data using a box-and-whiskers plot in ggplot2. I have also included the mean using a stat_summary.
library(ggplot2)
ggplot(x, aes(x=myvar, y=mygroup)) +
geom_boxplot() +
stat_summary(fun=mean, geom='point', shape=20, color='red', fill='red')
This is fine, but for some of my graphs, the outliers are so huge, that it's hard to make sense of the total distribution. In these cases, I have cut the x axis:
ggplot(x, aes(x=myvar, y=mygroup)) +
geom_boxplot() +
stat_summary(fun=mean, geom='point', shape=20, color='red', fill='red') +
scale_x_continuous(limit=c(0,5))
Note that the means (and medians?) are now calculated using only the subset of data that is visible on the graph. Is there a ggplot way to include the outlier observations in the calculation but drop them from the visualisation?
My desired output would be a graph with x limits at c(0,5) and a red dot at 2.48 for group mygroup='a'.
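Setting limits in scale_x_continuous removes the points that fall outside them before any statistics are computed, so the box plot and the mean only see the remaining data. You want to use coord_cartesian to "zoom in" without removing your data: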
ggplot(x, aes(x=myvar, y=mygroup)) +
geom_boxplot() +
stat_summary(fun=mean, geom='point', shape=20, color='red', fill='red') +
coord_cartesian(xlim = c(0, 5))
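The difference between the two approaches can be checked numerically without drawing anything. The sketch below recreates the question's data with an arbitrary seed (set.seed(1) is an assumption, added only for reproducibility) and compares the mean of group 'a' over all values with the mean over the values that survive limits of c(0, 5).

```r
set.seed(1)  # arbitrary seed, only so the numbers are reproducible
x <- data.frame(
  myvar   = c(10, 10, 9, 9, 8, 8, runif(100)),
  mygroup = c(rep("a", 26), rep("b", 80))
)

a <- x$myvar[x$mygroup == "a"]

# What stat_summary sees under coord_cartesian(): every observation in the group.
mean_all <- mean(a)

# What it sees with scale limits of c(0, 5): values outside the range are dropped first.
mean_clipped <- mean(a[a >= 0 & a <= 5])

print(c(all = mean_all, clipped = mean_clipped))
```

The first value is the one the question asks for (around 2.5, depending on the random draw); the second is below 1 because only the runif values remain.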

How to Add Lines With A Facet R [duplicate]

So I have a faceted graph, and I want to be able to add lines to it that change by each facet.
Here's the code:
p <- ggplot(mtcars, aes(x=wt))+
geom_histogram(bins = 20,aes(fill = factor(cyl)))+
facet_grid(.~cyl)+
scale_color_manual(values = c('red','green','blue'))+
geom_vline(xintercept = mean(mtcars$wt))
p
So my question is: how do I get each facet to show the mean of its own subset of the data, rather than the overall mean?
I hope that makes sense and appreciate your time regardless of your answering capability.
You can do this within the ggplot call by using stat_summaryh from the ggstance package. In the code below, I've also changed scale_colour_manual to scale_fill_manual on the assumption that you were trying to set the fill colors of the histogram bars:
library(tidyverse)
library(ggstance)
ggplot(mtcars, aes(x=wt))+
geom_histogram(bins = 20,aes(fill = factor(cyl)))+
stat_summaryh(fun.x=mean, geom="vline", aes(xintercept=..x.., y=0),
colour="grey40") +
facet_grid(.~cyl)+
scale_fill_manual(values = c('red','green','blue')) +
theme_bw()
Another option is to calculate the desired means within geom_vline (this is an implementation of the summary approach that @Ben suggested). In the code below, the . is a "pronoun" that refers to the data frame (mtcars in this case) that was fed into ggplot:
ggplot(mtcars, aes(x=wt))+
geom_histogram(bins = 20,aes(fill = factor(cyl)))+
geom_vline(data = . %>% group_by(cyl) %>% summarise(wt=mean(wt)),
aes(xintercept=wt), colour="grey40") +
facet_grid(.~cyl)+
scale_fill_manual(values = c('red','green','blue')) +
theme_bw()
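The same idea can be written without ggstance or dplyr by computing the per-facet means up front and passing them to geom_vline as a separate data frame. This is a minimal sketch under that assumption; the name facet_means is hypothetical.

```r
library(ggplot2)

# One row per cyl value; the cyl column lets facet_grid route each line to its own panel.
facet_means <- aggregate(wt ~ cyl, data = mtcars, FUN = mean)

p <- ggplot(mtcars, aes(x = wt)) +
  geom_histogram(bins = 20, aes(fill = factor(cyl))) +
  # The layer uses its own data, so each facet gets only the mean for its cyl value.
  geom_vline(data = facet_means, aes(xintercept = wt), colour = "grey40") +
  facet_grid(. ~ cyl) +
  scale_fill_manual(values = c("red", "green", "blue")) +
  theme_bw()

print(p)
```

Because the summary data frame carries the faceting variable, ggplot2 matches each line to its panel automatically.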

ggplot2 different facet width for categorical x-axis [duplicate]

I am plotting different facets of categorical data:
df <- as.data.frame(as.factor(c("A","B","C","D","E","F")))
names(df) <- "Xvar"
df$Yvar <- c(2,1,4,5,3,7)
df$facet <- c(rep("facet 1",2),rep("facet 2",4))
ggplot(df, aes(x=Xvar, y=Yvar, group=1)) +
geom_line() +
facet_wrap(~facet, scales="free_x")
How can I make it such that facet 1 consisting of only two categories is half the size of facet 2 containing four categories? I.e. that the width of each facet is proportional to the number of categorical x-axis data points? I tried scales="free_x" to no avail.
If you're willing to use facet_grid instead of facet_wrap, you can do this with the space parameter.
ggplot(df, aes(x=Xvar, y=Yvar, group=1)) +
geom_line() +
facet_grid(~facet, scales="free_x", space = "free_x")

Multiple graphs with different x-axis ticks [duplicate]

I have the following data.frame:
ef2 <- data.frame(X1=c(50,100,'bb','aa'), X2=c('A','A','B','B'), value=c(1,4,3,6))
I want to create two plots, one for each group in X2.
Here is the code I have and the plot obtained:
ggplot(data=ef2, aes(x=X1, y=value, group=X2)) +
facet_grid(.~X2, scales="free_x") +
geom_line(size=1) +
geom_point(size=3) +
xlab('') +
ylab('Y')
The problem is that the x-axis is ordered alphabetically and I don't know how to fix it. I have tried adding scale_x_discrete, but I don't know how to separate the groups. This is the code with that parameter added:
ggplot(data=ef2, aes(x=X1, y=value, group=X2)) +
facet_grid(.~X2, scales="free_x") +
geom_line(size=1) +
geom_point(size=3) +
xlab('') +
ylab('Y') +
scale_x_discrete(limits=ef2$X1)
Edited: I can't change ef2 data.frame. I've tried ordering factors in another data.frame:
ef2 <- data.frame(X1=c(50,100,'bb','aa'), X2=c('A','A','B','B'), value=c(1,4,3,6))
ef2$X1 <- as.character(ef2$X1)
nou <- data.frame(X1=factor(ef2$X1), levels=ef2$X1, X2=ef2$X2, value=ef2$value)
But it doesn't work.
This worked for me but I am not sure if it is exactly what you need:
ef2 <- data.frame(X1=factor(c('50','100','bb','aa'), levels = c('50','100','bb','aa')), X2=c('A','A','B','B'), value=c(1,4,3,6))
ggplot(data=ef2, aes(x=X1, y=value, group=X2)) +
facet_grid(.~X2, scales="free_x") +
geom_line(size=1) +
geom_point(size=3) +
xlab('') +
ylab('Y')
According to this post: Avoid ggplot sorting the x-axis while plotting geom_bar()
ggplot orders the values automatically (alphabetically for character data) unless you provide a factor whose levels are already in the order you want.
Update:
The code you use has an error: levels is an argument of the factor function, not of data.frame, so it has to go inside the factor() call.
Try this:
ef2 <- data.frame(X1=c(50,100,'bb','aa'), X2=c('A','A','B','B'), value=c(1,4,3,6))
ef2$X1 <- factor(ef2$X1, levels = unique(ef2$X1))
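To confirm that the levels follow the order of appearance rather than alphabetical order, the factor can be inspected directly. A small sketch, using the same X1 values as above (the names x1 and f are illustrative):

```r
# c() coerces the mixed values to character: "50" "100" "bb" "aa"
x1 <- c(50, 100, "bb", "aa")

# unique() keeps first-appearance order, so the levels follow the data, not the alphabet.
f <- factor(x1, levels = unique(x1))

print(levels(f))
# "50"  "100" "bb"  "aa"
```

ggplot2 places discrete values on the axis in level order, so this is the order the x-axis will use.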

Fit curve to histogram ggplot [duplicate]

I know that I can fit a density curve to my histogram in ggplot in the following way.
df = data.frame(x=rnorm(100))
ggplot(df, aes(x=x, y=..density..)) + geom_histogram() + geom_density()
However, I want my y-axis to be frequency (counts) instead of density, and retain a curve that fits the distribution. How do I do that?
Depending on your goals, something like this may work by just scaling the density curve using multiplication:
ggplot(df, aes(x=x)) + geom_histogram() + geom_density(aes(y=..density..*10))
or
ggplot(df, aes(x=x)) + geom_histogram() + geom_density(aes(y=..count../10))
Choose other values (instead of 10) if you want to scale things differently.
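If the curve should line up with the bar heights rather than be scaled by eye, one option is to multiply the density by the number of observations times the bin width, since a bar's count is approximately density × n × binwidth. The sketch below assumes ggplot2 3.3.0 or later (for after_stat) and a hypothetical bin width of 0.5; the seed is arbitrary.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed for reproducibility
df <- data.frame(x = rnorm(100))

bw <- 0.5                        # hypothetical bin width, set explicitly
scale_factor <- nrow(df) * bw    # counts per bar are roughly density * n * binwidth

p <- ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = bw) +
  # after_stat(density) is the current spelling of ..density..
  geom_density(aes(y = after_stat(density) * scale_factor))

print(p)
```

Fixing binwidth explicitly matters here: with the default of 30 bins the bin width depends on the data range, and the factor would have to be recomputed.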
Edit:
Since you are defining your scaling factor in the global environment, you can define it within aes:
ggplot(df, aes(x=x)) + geom_histogram() + geom_density(aes(n=n, y=..density..*n))
# or
ggplot(df, aes(x=x, n=n)) + geom_histogram() + geom_density(aes(y=..density..*n))
or another, less nice way using get:
ggplot(df, aes(x=x)) +
geom_histogram() +
geom_density(aes(y=..density.. * get("n", pos = .GlobalEnv)))
