ggplot2: Violin Plot with Stat="Identity"

I'm trying to use ggplot to create a violin plot where instead of the widths of the violins being controlled by a density function, they directly represent the count of relevant elements.
I think this can be accomplished by setting geom_violin(stat="identity"), but R then complains:
> ggplot(allData, aes(x = tool, y = length)) + geom_violin(stat="identity")
Warning: Ignoring unknown parameters: trim, scale
Error in eval(substitute(list(...)), `_data`, parent.frame()) :
object 'violinwidth' not found
Trying to add aes(violinwidth=0.2*count), as this answer suggests, gives
> ggplot(allData, aes(x = tool, y = length)) + geom_violin(stat="identity", aes(violinwidth=0.2*count))
Warning: Ignoring unknown parameters: trim, scale
Warning: Ignoring unknown aesthetics: violinwidth
Error in FUN(X[[i]], ...) : object 'count' not found
And while I can set violinwidth to just a constant, this makes the violins just rectangles. How can I fix this?

When I run this with some sample data, both plots are generated without errors, with and without the changes to stat and violinwidth. Is count a column in allData?
library(ggplot2)
dt <- data.frame(category = rep(letters[1:2], each = 10),
                 response = runif(20),
                 count = rpois(20, 5))
ggplot(dt, aes(x = category, y = response)) + geom_violin()
ggplot(dt, aes(x = category, y = response)) +
  geom_violin(stat = "identity", aes(violinwidth = 0.1*count))
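If count is not already a column, one option is to tabulate it first. The following is a minimal sketch under stated assumptions: the allData frame here is a made-up stand-in for the asker's data, and scaling violinwidth to a maximum of 1 is an illustrative choice. geom_violin(stat = "identity") may still emit warnings about unknown parameters or aesthetics, and the rendered result can differ between ggplot2 versions.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the asker's allData (integer-valued lengths)
allData <- data.frame(tool   = rep(c("a", "b"), each = 50),
                      length = c(rpois(50, 5), rpois(50, 8)))

# One row per (tool, length) combination, with the number of rows in `count`
counts <- as.data.frame(table(tool = allData$tool, length = allData$length),
                        responseName = "count", stringsAsFactors = FALSE)
counts$length <- as.numeric(counts$length)

# Scale so the most frequent value gets the full violin width (assumption)
counts$violinwidth <- counts$count / max(counts$count)

# Build the plot; print(p) to draw it
p <- ggplot(counts, aes(x = tool, y = length)) +
  geom_violin(stat = "identity", aes(violinwidth = violinwidth))
```

Because the widths now come straight from the tabulated counts, the outline follows the raw frequencies rather than a smoothed density estimate.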

Related

Error plotting in a second series in ggplot2

I want to plot a scatter plot with two different series:
Ej4 = read.xlsx("C:\\Users\\Ulia\\Downloads\\Parte 5.xlsx", sheet = 4)
fix(Ej4)
graf = ggplot(Ej4, aes(x1,x1)) + geom_point(alpha = 0.8)
graf
variable = data.frame(y = c(6,9,6,17,12), y2= c(6,9,6,17,12))
variable
grafica2 = graf + geom_point() +
geom_point(data = variable, colour ="blue")
grafica2
But R shows this error:
Error in FUN(X[[i]], ...) : object 'x' not found
There's no problem plotting graf, but I can't understand why R tells me there's an error with grafica2.
PS: Ej4 is a data frame of numerical variables, with exactly the same size as 'variable'.
There's another way:
ggplot() + geom_point(data = Ej4, aes(x1,y1)) + geom_point(data = variable, aes(y,y2))
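The reason this works is that each layer carries its own data and its own aes(), so nothing is inherited from ggplot(). Here is a small self-contained sketch; the Ej4 values are invented stand-ins, since the original spreadsheet is not available.

```r
library(ggplot2)

# Hypothetical stand-in for the spreadsheet data
Ej4 <- data.frame(x1 = 1:5, y1 = c(2, 4, 5, 7, 11))
variable <- data.frame(y = c(6, 9, 6, 17, 12), y2 = c(6, 9, 6, 17, 12))

# Each layer names its own data frame and its own column mapping
p <- ggplot() +
  geom_point(data = Ej4, aes(x = x1, y = y1), alpha = 0.8) +
  geom_point(data = variable, aes(x = y, y = y2), colour = "blue")

# layer_data() returns the data each layer will actually draw
first_series  <- layer_data(p, 1)
second_series <- layer_data(p, 2)
```

In the original code, the second geom_point() inherited the mapping from ggplot(Ej4, aes(...)) and looked for those columns in variable, which does not have them.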

How do I plot flights data from nycflights13 so that x=airlines, y = dep_delay?

When I try to plot x = airlines, y = dep_delay, I get an error message.
My hypothesis is that delays are caused by the inefficiency of the airlines above and beyond any other factors. I simply want to plot these two variables, I get an error message.
I try this code but it doesn't work.
ggplot(data = flights, mapping = aes(x = airlines, y= dep_delay)) +
geom_point() +
geom_smooth(se = FALSE)
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (336776): x
You are using two "+" instead of a single "+" sign
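The error text also points to another likely cause: in nycflights13, airlines is a separate lookup table, not a column of flights, so aes(x = airlines) maps a whole data frame to x. The airline code in flights is stored in the carrier column. A minimal sketch follows, using tiny made-up stand-ins for both tables so it runs without the package; geom_boxplot() is swapped in here as one reasonable way to compare delays across a categorical x.

```r
library(ggplot2)

# Hypothetical miniature stand-ins for nycflights13::flights and ::airlines
flights <- data.frame(carrier   = c("AA", "AA", "UA", "UA", "DL"),
                      dep_delay = c(5, -2, 30, 12, 0))
airlines <- data.frame(carrier = c("AA", "DL", "UA"),
                       name    = c("American", "Delta", "United"))

# Attach the readable airline name to each flight via the shared carrier code
flights_named <- merge(flights, airlines, by = "carrier")

# Map a column of the plotted data frame to x, not the lookup table itself
p <- ggplot(flights_named, aes(x = name, y = dep_delay)) +
  geom_boxplot()

box_stats <- layer_data(p)
```

With the real data, the same idea would be aes(x = carrier, y = dep_delay) on flights directly.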

Weighted ggplot2 warning: Ignoring unknown aesthetics: weight

I am trying to plot a weighted density with ggplot2. The results seem to be fine, but I get the following warning: Warning: Ignoring unknown aesthetics: weight. Similar problems seem to appear in other ggplot2 applications, so I am wondering whether the warning can be ignored.
Reproducible example:
library(ggplot2)
set.seed(123)
# Some random data & weights
x <- rnorm(1000, 5)
w <- x^5
# Plot unweighted
ggplot() + stat_density(aes(x = x))
# Plot weighted - Warning: Ignoring unknown aesthetics: weight
ggplot() + stat_density(aes(x = x, weight = w / sum(w))) # Weighting seems to work fine
# Comparison of weighted density in base graphics - Same results as with ggplot2
plot(density(x, weights = w / sum(w)))
Can this warning message be ignored?
You can avoid the warning by using geom_density:
ggplot() +
geom_density(aes(x = x, weight = w / sum(w)), color = "green") +
geom_density(aes(x = x), color = "blue")
I would have expected the stat_ function to handle the same aesthetics as the geom and it appears to do so. The warning would then be a bug that should be reported to the maintainers.
Here is another solution:
ggplot(data=NULL, aes(x = x, weight=w/sum(w))) + stat_density()
And:
ggplot(data=NULL, aes(x = x, weight=w/sum(w))) +
stat_density(fill=NA, color = "green") +
stat_density(aes(x=x), fill=NA, color = "blue", inherit.aes=F)
Whether or not you get a warning appears to depend on where you give the weight argument (ggplot2 version 2.2.1):
Following these answers:
Create weighted histogram,
Histogram with weights
Setup data:
w = seq(1,1000)
v = sort(runif(1000))
foo = data.frame(v,w)
The following command produces a warning:
ggplot(foo) + geom_histogram(aes(v, weight=w),bins = 30)
These commands do not produce a warning:
ggplot(foo, aes(v, weight=w)) + geom_histogram(bins = 30)
ggplot(foo, aes(weight=w)) + geom_histogram(aes(v),bins = 30)
But all three commands produce the same plot.
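One way to check that claim without comparing pictures is to compare the computed bin counts. This is a sketch under the same setup; the set.seed() call is an addition for reproducibility, and whether a warning appears will depend on the installed ggplot2 version.

```r
library(ggplot2)

set.seed(1)  # added for reproducibility; not part of the original setup
w <- seq(1, 1000)
v <- sort(runif(1000))
foo <- data.frame(v, w)

# weight given inside the layer vs. in the global mapping
p_layer  <- ggplot(foo) + geom_histogram(aes(v, weight = w), bins = 30)
p_global <- ggplot(foo, aes(v, weight = w)) + geom_histogram(bins = 30)

# layer_data() exposes the values computed by the stat for each bin
d_layer  <- layer_data(p_layer)
d_global <- layer_data(p_global)

same_counts  <- isTRUE(all.equal(d_layer$count, d_global$count))
weights_used <- isTRUE(all.equal(sum(d_layer$count), sum(foo$w)))
```

If the weights are being applied, the bin counts should sum to sum(w) rather than to the number of rows.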

Scatterplot: Error in FUN(X[[i]], ...) : object 'Group' not found

I'm trying to plot some data using ggplot and I'm having some problems with the significance lines and asterisks.
This is the code I am using:
p <- ggplot(Hematoxilin_tumor_necrosis, aes(x=total, y=necro, colour = Group))+
labs(y="Necrotic area",x="Total area")+
theme_minimal()
path = data.frame(x=c(78,79,79,78),y=c(22,22,34,34))
p + geom_point(size=0.7)+
geom_smooth(method=lm, se = F, size=0.8) +
scale_color_manual(values=c("#999999","#333333"))+
#Adding asterisks
geom_path(data = path, aes(x = x,y = y)) +
annotate("text",x = 80, y = 27, label="*", cex=7)
Which gives me the following error:
Error in FUN(X[[i]], ...) : object 'Group' not found
I know that the problem is in the geom_path(data = path, aes(x = x,y = y)) call, but I am kind of lost. I am new to ggplot, so I expect it is a simple problem.
Any advice?
Aesthetics are inherited by default. geom_path is looking for the Group variable in the path dataset to get the colour. You should use inherit.aes = FALSE in the geom_path call:
geom_path(data = path, aes(x = x,y = y), inherit.aes = FALSE )
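A minimal self-contained sketch of the same fix, using an invented toy data frame in place of Hematoxilin_tumor_necrosis (its values and group labels are assumptions for illustration only):

```r
library(ggplot2)

# Hypothetical stand-in for Hematoxilin_tumor_necrosis
toy <- data.frame(total = c(10, 20, 30, 40, 50, 60),
                  necro = c(2, 5, 7, 9, 14, 15),
                  Group = rep(c("control", "treated"), 3))

# Bracket coordinates; note there is no Group column here
path <- data.frame(x = c(58, 59, 59, 58), y = c(5, 5, 12, 12))

p <- ggplot(toy, aes(x = total, y = necro, colour = Group)) +
  geom_point(size = 0.7) +
  # inherit.aes = FALSE stops this layer from picking up colour = Group
  geom_path(data = path, aes(x = x, y = y), inherit.aes = FALSE)

bracket <- layer_data(p, 2)
```

An alternative with the same effect on this layer would be to move colour = Group out of ggplot() and into the layers that need it.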

ggplot2: how to add sample numbers to density plot?

I am trying to generate a (grouped) density plot labelled with sample sizes.
Sample data:
set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))
The unlabelled density plot is generated and looks as follows:
ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4)
What I want to do is add text labels somewhere near the peak of each density, showing the number of samples in each group. However, I cannot find the right combination of options to summarise the data in this way.
I tried to adapt the code suggested in this answer to a similar question on boxplots: https://stackoverflow.com/a/15720769/1836013
n_fun <- function(x){
return(data.frame(y = max(x), label = paste0("n = ",length(x))))
}
ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4) +
stat_summary(geom = "text", fun.data = n_fun)
However, this fails with Error: stat_summary requires the following missing aesthetics: y.
I also tried adding y = ..density.. within aes() for each of the geom_density() and stat_summary() layers, and in the ggplot() object itself... none of which solved the problem.
I know this could be achieved by manually adding labels for each group, but I was hoping for a solution that generalises, and e.g. allows the label colour to be set via aes() to match the densities.
Where am I going wrong?
The y returned by fun.data is not the y aesthetic. stat_summary complains that it cannot find y, which must be specified either globally, as in ggplot(df, aes(x = val, group = ab.class, y = ...)), or in the layer, as in stat_summary(aes(y = ...)), if no global y is available. fun.data then computes where to display the point/text/... at each x, based on the y supplied through aes. (I am not sure whether I have made this clear; I am not a native English speaker.)
Even if you specify y through aes, you won't get the desired result, because stat_summary computes a y at each x.
However, you can add text to desired positions by geom_text or annotate:
# save the plot as p
p <- ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4)
# build the data displayed on the plot.
p.data <- ggplot_build(p)$data[[1]]
# Note that column 'scaled' is used for plotting
# so we extract the max density row for each group
p.text <- lapply(split(p.data, f = p.data$group), function(df){
df[which.max(df$scaled), ]
})
p.text <- do.call(rbind, p.text) # we can also get p.text with dplyr.
# now add the text layer to the plot
p + annotate('text', x = p.text$x, y = p.text$y,
label = sprintf('n = %d', p.text$n), vjust = 0)
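If the label colour should follow the groups via aes(), as the question asks, another option is to summarise the data into a small label data frame and pass it to geom_text(). This is a sketch rather than the answerer's method: it estimates each peak with base R's density(), which should land close to, but not necessarily exactly on, the curve ggplot2 draws. The labels data frame and its column names are introduced here for illustration.

```r
library(ggplot2)

set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
                 val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))

# One row per group: approximate peak position, peak height, and sample size
labels <- do.call(rbind, lapply(split(df, df$ab.class), function(g) {
  d <- density(g$val)
  data.frame(ab.class = g$ab.class[1],
             val      = d$x[which.max(d$y)],
             dens     = max(d$y),
             label    = paste0("n = ", nrow(g)))
}))

p <- ggplot(df, aes(x = val)) +
  geom_density(aes(fill = ab.class), alpha = 0.4) +
  # x = val is inherited; y, label and colour come from the label data frame
  geom_text(data = labels,
            aes(y = dens, label = label, colour = ab.class),
            vjust = -0.3)

text_layer <- layer_data(p, 2)
```

Because the labels live in their own data frame, adding a third group to df requires no change to the plotting code.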
