How to make frequency barplot in groups? - r

Suppose my data has two columns, "Condition" and "Stars":
food <- data.frame(Condition = c("A", "B", "A", "B", "A"), Stars=c('good','meh','meh','meh','good'))
How can I make a barplot of the frequency of "Stars", grouped by "Condition"?
I read here but would like to expand that answer to include groups.
For now I have
q <- ggplot(food, aes(x=Stars))
q + geom_bar(aes(y=..count../sum(..count..)))
but that gives proportions of the full data set.
How can I make a plot with four bars, grouped by 'Condition'?
E.g., 'Condition A' would have 'good' at 0.66 and 'meh' at 0.33.

I guess this is what you are looking for:
food <- data.frame(Condition = c("A", "B", "A", "B", "A"), Stars=c('good','meh','meh','meh','good'))
library(ggplot2)
library(dplyr)
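# Group by Condition first: summarize() drops the last grouping level (Stars),
# so sum(n) in mutate() is the total within each Condition (A: good 0.67, meh 0.33)
data <- food %>% group_by(Condition, Stars) %>% summarize(n = n()) %>% mutate(freq = n/sum(n))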
ggplot(data, aes(x=Stars, fill = Condition, group = Condition)) + geom_bar(aes(y=freq), stat="identity", position = "dodge")
First I calculated the frequencies with the dplyr package; these are used as the y aesthetic in geom_bar(). Then I set fill = Condition in ggplot(), which splits the bars by Condition. Finally, I set position = "dodge" to place the bars next to each other and stat = "identity" because the frequencies are already calculated.
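As a cross-check on the per-condition proportions, the same numbers can be computed in base R without dplyr. This is only a sketch; the names counts and props are made up for illustration.

```r
food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
                   Stars = c("good", "meh", "meh", "meh", "good"))

# Cross-tabulate: rows are Condition, columns are Stars
counts <- table(food$Condition, food$Stars)

# margin = 1 divides each row by its row total,
# i.e. proportions within each Condition
props <- prop.table(counts, margin = 1)
print(round(props, 2))

# Base-graphics grouped barplot: one cluster per Condition, one bar per Stars level
barplot(t(props), beside = TRUE, legend.text = TRUE,
        xlab = "Condition", ylab = "Proportion")
```

Condition A comes out as roughly 0.67 'good' and 0.33 'meh', which matches the numbers asked for in the question.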

I used the computed variable ..prop.., the group aesthetic, and facet_wrap(). With group = Condition, the proportions are computed within each group, and facet_wrap() plots each condition in its own panel.
require(ggplot2)
food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
Stars=c('good','meh','meh','meh','good'))
ggplot(food) +
geom_bar(aes(x = Stars, y = ..prop.., group = Condition)) +
facet_wrap(~ Condition)
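In recent ggplot2 releases (3.3.0 and later, as far as I know) the ..prop.. notation can also be written as after_stat(prop). A sketch of the same plot in that notation, assuming such a ggplot2 version is installed:

```r
library(ggplot2)

food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
                   Stars = c("good", "meh", "meh", "meh", "good"))

# after_stat(prop) refers to the proportion computed by stat_count();
# group = Condition makes the proportions sum to 1 within each condition
p <- ggplot(food) +
  geom_bar(aes(x = Stars, y = after_stat(prop), group = Condition)) +
  facet_wrap(~ Condition)

print(p)
```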

Related

Add significant letters for single panel with facet_wrap

I used facet_wrap to split the plot into two panels, one per letter. I would like to add asterisks and bars to show the level of significance between two groups (m and n).
My pseudo data looks like this:
my_data <- data.frame(Letter = c("a", "a", "a", "a", "a",
"b", "b", "b", "b", "b"),
value = c(19,13.5, 6.4, 17.5, 14.2,
0.3, 0.4, 0.7, 0.8,0.9),
group = c("m", "n"))
To test for significant differences between groups, I am using ANOVA and will plot boxplots for visualization. To meet the assumptions of ANOVA, I transformed the data using several transformations (e.g., log10, Box-Cox, etc.).
My first visualization looks like:
ggplot(my_data, aes(group, value, colour = group, shape = group)) +
geom_boxplot(width = .5, alpha = 2) +
ylab(NULL) +
xlab("") +
facet_wrap(~Letter, scales = "free", ncol=4, strip.position = "left") +
facetted_pos_scales(
y = list(
Letter == "a" ~ scale_y_log10(),
Letter == "b" ~ scale_y_log10()))
I would like to add the asterisks and bars manually to each panel, where the panels have different y scales. Any suggestions? Thank you in advance!
My desired output looks like:
P.S. I drew another single boxplot in case we have more than 3 groups.
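One possible approach, sketched under several assumptions: keep the annotations in a small data frame that has the faceting column (Letter), so each bar and label lands only in its own panel. The labels "*" and "ns" below are placeholders and not the result of any test, the sig data frame and the 10% offset are made up for illustration, and the sketch uses plain linear y scales rather than facetted_pos_scales().

```r
library(ggplot2)

my_data <- data.frame(
  Letter = rep(c("a", "b"), each = 5),
  value  = c(19, 13.5, 6.4, 17.5, 14.2, 0.3, 0.4, 0.7, 0.8, 0.9),
  group  = rep(c("m", "n"), length.out = 10)
)

# Per-panel bar height: 10% above the largest value in that panel
tops <- aggregate(value ~ Letter, data = my_data, FUN = max)
sig <- data.frame(
  Letter = tops$Letter,
  y      = tops$value * 1.1,
  label  = c("*", "ns")   # placeholder labels, NOT computed from the data
)

p <- ggplot(my_data, aes(group, value)) +
  geom_boxplot(aes(colour = group), width = 0.5) +
  # Horizontal bar spanning the two groups (discrete positions 1 and 2)
  geom_segment(data = sig, aes(x = 1, xend = 2, y = y, yend = y),
               inherit.aes = FALSE) +
  # Label centred above the bar
  geom_text(data = sig, aes(x = 1.5, y = y, label = label),
            vjust = -0.4, inherit.aes = FALSE) +
  facet_wrap(~ Letter, scales = "free_y")

print(p)
```

Because sig carries the Letter column, facet_wrap() routes each row to the matching panel, so the bar heights follow each panel's own y range.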

Why pies are flat in geom_scatterpie in R?

Why are the pies flat?
df<- data.frame(
Day=(1:6),
Var1=c(172,186,191,201,205,208),
Var2= c(109,483,64010,161992,801775,2505264), A=c(10,2,3,4.5,16.5,39.6), B=c(10,3,0,1.4,4.8,11.9), C=c(2,5,2,0.1,0.5,1.2), D=c(0,0,0,0,0.1,0.2))
ggplot() +
geom_scatterpie(data = df, aes(x = Var1 , y = Var2, group = Var1), cols = c("A", "B", "C", "D"))
I have tried using coord_fixed(), but it does not work either.
The problem seems to be the scales of the x- and y-axes. If you rescale both to have zero mean and unit variance, the plot works. So one option is to plot the rescaled values but transform the labels back to the original scale. To do this:
Make the data:
df<- data.frame(
Day=(1:6),
Var1=c(172,186,191,201,205,208),
Var2= c(109,483,64010,161992,801775,2505264), A=c(10,2,3,4.5,16.5,39.6), B=c(10,3,0,1.4,4.8,11.9), C=c(2,5,2,0.1,0.5,1.2), D=c(0,0,0,0,0.1,0.2))
Rescale the variables
library(dplyr)
df <- df %>%
mutate(x = c(scale(Var1)),
y = c(scale(Var2)))
Find the linear map that transforms the rescaled values back into their original values. Then, you can use the coefficients from the model to make a function that will transform the rescaled values back into the original ones.
m1 <- lm(Var1 ~ x, data=df)
m2 <- lm(Var2 ~ y, data=df)
trans_x <- function(x)round(coef(m1)[1] + coef(m1)[2]*x)
trans_y <- function(x)round(coef(m2)[1] + coef(m2)[2]*x)
Make the plot, using the transformation functions as the call to labels in the scale_[xy]_continuous() functions
ggplot() +
geom_scatterpie(data=df, aes(x = x, y=y), cols = c("A", "B", "C", "D")) +
scale_x_continuous(labels = trans_x) +
scale_y_continuous(labels = trans_y) +
coord_fixed()
There may be an easier way than this, but it wasn't apparent to me.
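A possibly simpler variant, offered as a sketch: since scale() subtracts the mean and divides by the standard deviation, the back-transformation can be written directly from those two numbers, without fitting lm(). The function name back_x is hypothetical.

```r
Var1 <- c(172, 186, 191, 201, 205, 208)

# scale() centres on the mean and divides by the sample standard deviation
x <- c(scale(Var1))

# Inverse of the rescaling: multiply by the sd, add the mean
back_x <- function(z) round(z * sd(Var1) + mean(Var1))

print(back_x(x))

# The lm() coefficients from the steps above are the same two numbers
m1 <- lm(Var1 ~ x)
print(coef(m1))
```

back_x could then be passed as labels = back_x to scale_x_continuous(), in the same way as trans_x.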
The range on the y-axis is so large it's compressing the disks to lines. Change the y-axis to a log scale, and you can see the shapes. Adding coord_fixed() to keep the pies circular:
ggplot() +
geom_scatterpie(data = df, aes(x = Var1 , y = Var2, group = Var1), cols = c("A", "B", "C", "D")) +
scale_y_log10() +
coord_fixed()
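To see how lopsided the two axes are, the ranges can be compared directly. This is a small base-R sketch with made-up variable names; it assumes, as I understand scatterpie, that the pies are drawn in data coordinates, which is why a large mismatch in axis ranges distorts them.

```r
Var1 <- c(172, 186, 191, 201, 205, 208)
Var2 <- c(109, 483, 64010, 161992, 801775, 2505264)

x_span <- diff(range(Var1))          # width of the x range
y_span <- diff(range(Var2))          # width of the y range
print(c(x = x_span, y = y_span))
print(y_span / x_span)               # y covers tens of thousands of times more units than x

# On a log10 scale the y range shrinks to a few units
log_span <- diff(range(log10(Var2)))
print(log_span)
```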

For looping x-axis in ggplot

I would like to create multiple histograms (ggplot) using a for loop. The problem is that the x-axis label of every plot stays the same, "value". Do you know how to change the x-axis label on each iteration?
My dataframe for example:
df <- data.frame(variable = c("A", "A", "B", "B", "C", "C"), value = c(1, 2, 4, 5, 2, 3))
So that means I should get three plots with the x-axis labels "A", "B" and "C".
My code:
for (i in unique(df$variable)){
d <- subset(df, df$variable == i)
print(ggplot(d, aes(x = value)) + geom_histogram())
}
You can use purrr::imap() to give each plot its own x-axis label after splitting the data by variable.
library(ggplot2)
library(magrittr)  # provides %>%
list_plot <- df %>%
split(.$variable) %>%
purrr::imap(~ggplot(.x, aes(x = value)) +
geom_histogram() + xlab(.y))
Also, have you considered using facets? The x-axis stays the same and you get A, B, C as facet names.
ggplot(df, aes(x = value)) + geom_histogram() + facet_wrap(~variable)
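If you prefer to keep the original for loop, a minimal sketch is to add xlab(i) inside the loop. Storing the plots in a list (here called plots, a made-up name) and setting bins = 5 are my own additions, not part of the question.

```r
library(ggplot2)

df <- data.frame(variable = c("A", "A", "B", "B", "C", "C"),
                 value = c(1, 2, 4, 5, 2, 3))

plots <- list()
for (i in unique(df$variable)) {
  d <- df[df$variable == i, ]
  # xlab(i) replaces the default label "value" with the current variable name
  plots[[i]] <- ggplot(d, aes(x = value)) +
    geom_histogram(bins = 5) +
    xlab(i)
}

# Draw all three plots
for (p in plots) print(p)
```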

ggplot2: No legend with multiple geom_point

I have a plot with several single geom_point-plots and I would like to specify the shape and color for each plot individually.
Somehow I am really struggling with getting a proper legend and also I could not find a solution on stackoverflow.
I tried to use "fill" in the aes-command, but if I have more than two plots with fill, I get the error:
"Error: Aesthetics must be either length 1 or the same as the data
(1): x, y"
This is a simplified minimal example of the basic structure of my plot:
da <- as.character(c(1:10))
type <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a" )
value <- c(1:10)
df <- data.frame(da, type, value)
require("ggplot2")
ggplot() +
geom_point(data = subset(df, type %in% c("a")), aes(x=da, y=value), shape=1, color="red", size=5) +
geom_point(data = subset(df, type %in% c("b")), aes(x=da, y=value), shape=2, color="darkorange", size=3) +
geom_point(data = subset(df, type %in% c("c")), aes(x=da, y=value), shape=3, color="violet", size=3)
How can I add a legend with custom labels?
Thanks! :-)
Why create separate layers and build a legend manually when you can create one layer and map aesthetics to your data (in this case, simply "type")? If you want specific colour or shape values, you can specify them with scales such as scale_colour_manual, scale_shape_manual, etc.
da <- as.character(c(1:10))
type <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a" )
value <- c(1:10)
df <- data.frame(da, type, value)
require("ggplot2")
#> Loading required package: ggplot2
ggplot(df, aes(x=da, y=value, color=type, shape = type, size = type)) +
geom_point()
#> Warning: Using size for a discrete variable is not advised.
Created on 2020-01-20 by the reprex package (v0.3.0)
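To reproduce the exact colours, shapes and sizes from the question with custom legend labels, the manual scales could be used as sketched below. The label text ("Type a" and so on) and the legend title are placeholders; giving all three scales the same name and labels is what lets ggplot2 merge them into one legend.

```r
library(ggplot2)

da <- as.character(1:10)
type <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a")
value <- 1:10
df <- data.frame(da, type, value)

legend_labels <- c("Type a", "Type b", "Type c")   # placeholder labels

p <- ggplot(df, aes(x = da, y = value, colour = type, shape = type, size = type)) +
  geom_point() +
  # Same name and labels on every scale so the legends are combined
  scale_colour_manual(name = "Type", labels = legend_labels,
                      values = c(a = "red", b = "darkorange", c = "violet")) +
  scale_shape_manual(name = "Type", labels = legend_labels,
                     values = c(a = 1, b = 2, c = 3)) +
  scale_size_manual(name = "Type", labels = legend_labels,
                    values = c(a = 5, b = 3, c = 3))

print(p)
```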

How to make legends independent of aesthetics?

I've managed to make the plot below using ggplot2 containing a boxplot, the observations and the weighted mean.
I managed to get the legend that I want by using three different aesthetics (fill, color and size), but I would like to produce the legend without relying on aesthetics. Using aesthetics makes it impossible to customize the plot's colors, fills and sizes, and if I one day need a fourth element plotted, I will run out of aesthetics.
Is there any way to treat legends individually on a "geom-basis" using the same aesthetics for all geoms?
More specifically, I want the edges of the boxplot colored like the fill, the observations colored like the boxplot, and the weighted average colored black, but if I specify these colors outside of aes(), the legend is deleted or altered.
library(dplyr)
library(tidyr)
library(ggplot2)
study <- c(1:10)
observations <- c(seq(10, 100, by = 10))
type <- c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B")
rate <- c(runif(10, 0, 1))
data1 <- data.frame(study, type, observations, rate)
average <- data1 %>%
group_by(type) %>%
summarise(rate = weighted.mean(rate, observations))
data1 %>%
ggplot() +
geom_boxplot(aes(x = type, y = rate, fill = type), alpha = 0.2) +
geom_point(aes(x = type, y = rate, size = "Observations")) +
geom_point(data = average,
aes(x = type, y = rate, color = "Weighted mean"),
shape = 18, size = 5) +
guides(fill = guide_legend(title = "Legend"),
color = guide_legend(title = ""),
size = guide_legend(title = ""))
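One possible direction, sketched here as an assumption rather than a definitive answer: let both point layers share a single shape aesthetic whose values are set with scale_shape_manual(), keep colour and fill mapped to type, and fix the weighted-mean colour outside aes(). The shape codes 16 and 18 and the use of set.seed() are my own choices.

```r
library(ggplot2)

set.seed(1)  # only to make the random rates reproducible
data1 <- data.frame(
  study = 1:10,
  type = rep(c("A", "B"), each = 5),
  observations = seq(10, 100, by = 10),
  rate = runif(10, 0, 1)
)

# Weighted mean of rate per type, weighted by the number of observations
wm <- sapply(split(data1, data1$type),
             function(d) weighted.mean(d$rate, d$observations))
average <- data.frame(type = names(wm), rate = unname(wm))

p <- ggplot(data1, aes(x = type, y = rate)) +
  # Box edges and fill both follow type
  geom_boxplot(aes(fill = type, colour = type), alpha = 0.2) +
  # Observations take the boxplot colour; the legend entry comes from shape
  geom_point(aes(colour = type, shape = "Observations")) +
  # Weighted mean: colour and size are fixed outside aes(), shape gives the legend entry
  geom_point(data = average, aes(shape = "Weighted mean"),
             colour = "black", size = 5) +
  scale_shape_manual(name = NULL,
                     values = c("Observations" = 16, "Weighted mean" = 18))

print(p)
```

This uses only one extra aesthetic (shape) for both point layers, which leaves size free for a later element.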
