Reorder sub-groups by group in a frequency barplot - r

I have the following problem:
My code is like this:
ggplot(data, aes(x = fct_infreq(`sub-group`), fill = group)) + geom_bar()
And the result was this:
I want to plot the red group first (in ascending order) and then the blue group (also in ascending order), all in the same plot.
How can I do this?
Thanks in advance!

The following simply supplies the limits of the y-axis in the order you want, without bothering with factors.
library(ggplot2)
df <- data.frame(
  y = LETTERS[1:20],
  group = rep(c("A", "B"), 10),
  x = rnorm(20)
)
ggplot(df, aes(x, y, fill = group)) +
  geom_col() +
  scale_y_discrete(
    limits = df$y[rev(order(df$group, df$x))]
  )
Created on 2021-12-16 by the reprex package (v2.0.1)
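The same idea can be carried over to a frequency barplot like the one in the question. The following is only a sketch under assumptions: the data frame `data`, the column name `sub_group` (renamed with an underscore so it is a valid R name) and the group labels "red" and "blue" are made up for illustration. It counts the rows per sub-group, orders the sub-groups by group first and by count second, and passes that order as the limits of the discrete x-axis.

```r
library(ggplot2)

# Hypothetical data: one row per observation, each sub-group belongs to one group
set.seed(1)
data <- data.frame(sub_group = sample(letters[1:6], 100, replace = TRUE))
data$group <- ifelse(data$sub_group %in% c("a", "b", "c"), "red", "blue")

# Count rows per sub-group and look up the group each sub-group belongs to
counts <- as.data.frame(table(sub_group = data$sub_group), stringsAsFactors = FALSE)
counts$group <- data$group[match(counts$sub_group, data$sub_group)]

# Order: "red" sub-groups first, then the rest; ascending count within each group
ord <- counts$sub_group[order(counts$group != "red", counts$Freq)]

ggplot(data, aes(x = sub_group, fill = group)) +
  geom_bar() +
  scale_x_discrete(limits = ord)
```

Because the order is supplied through `limits`, the underlying column does not have to be converted to a factor.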

Related

Legend for scatter-line graph in ggplot2 (without colour)

I'm very new to this, so apologies if there's an obvious answer. I'm trying to add a legend to a scatter-line graph with 2 y variables. I'm aware this can be done using colour, but ideally I want to keep the plot black and white and distinguish the variables in the legend by linetype/point shape instead. Is there any way to do this?
ggplot(birds, aes(distance)) +
  geom_point(aes(y = individuals_AC)) +
  geom_point(aes(y = species_AC, shape = 17)) +
  geom_line(aes(y = individuals_AC)) +
  geom_line(aes(y = species_AC, linetype = "dashed")) +
  scale_shape_identity() +
  scale_linetype_identity() +
  theme_classic()
library(tidyverse)
# create some dummy data
df <- tibble(
  x = runif(10),
  y = runif(10),
  type = rep(c("a", "b"), 5)
)
# plot it with a different shape for each type
df %>%
  ggplot(aes(x, y, shape = type)) +
  geom_point()
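For data laid out like the question's, with the two y variables in separate columns, one possible route is to reshape to long format first so that a single column can drive both shape and linetype. This is a sketch only: the `birds` data frame below is a made-up stand-in that reuses the column names from the question, and `variable` and `count` are names chosen here for illustration.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for the asker's `birds` data (values are invented)
birds <- data.frame(
  distance       = 1:5,
  individuals_AC = c(12, 9, 7, 4, 2),
  species_AC     = c(5, 4, 4, 3, 1)
)

# One row per (distance, variable) pair instead of one column per variable
birds_long <- pivot_longer(
  birds,
  cols      = c(individuals_AC, species_AC),
  names_to  = "variable",
  values_to = "count"
)

# Mapping shape and linetype to the same column yields one merged legend
ggplot(birds_long, aes(distance, count, shape = variable, linetype = variable)) +
  geom_point() +
  geom_line() +
  theme_classic()
```

Since no colour aesthetic is mapped, the plot stays black and white and the legend separates the two series by point shape and line type.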

Adding entry for NA-values in continuous ggplot-legend

There is a very similar question here: Add NA value to ggplot legend for continuous data map.
I tried to understand it, but I didn't manage to make it work for my data.
So I created a super simple example. I have this data:
set.seed(1)
df = data.frame(a=rnorm(50), b=rnorm(50), c=rep(1:5, 10))
df[sample(1:50, 10), ]$c = NA
where all columns are numeric. Now I'd like to make a ggplot with a legend entry for the NA-values. When I do the following:
ggplot(df) +
  geom_point(
    aes(x = a, y = b, col = c)
  )
This is the result
What I want is something like this (when c is a factor, it automatically gets a legend entry):
ggplot(df) +
  geom_point(
    aes(x = a, y = b, col = factor(c))
  )
Is there a reasonably easy way to get a similar result while keeping my values numeric?
Defining a color for NA is easy by adding scale_color_continuous(na.value="red"), but it is not explicitly labeled in the legend.
To achieve that you could add a second color scale just for the NA value using ggnewscale:
library(ggplot2)
library(ggnewscale)
set.seed(1)
df = data.frame(a=rnorm(50), b=rnorm(50), c=rep(1:5, 10))
df[sample(1:50, 10), ]$c = NA
na.value.forplot <- 'red'
ggplot(df) +
  geom_point(aes(x = a, y = b, col = c)) +
  scale_color_continuous(guide = guide_colorbar(order = 2)) +
  new_scale_color() +
  # second layer: only the NA rows, with their own one-entry colour scale
  geom_point(data = subset(df, is.na(c)),
             aes(x = a, y = b, col = "NA")) +
  scale_color_manual(name = NULL, labels = "NA", values = na.value.forplot)
Created on 2021-03-31 by the reprex package (v1.0.0)

How to draw a barplot from counts data in R?

I have a data-frame 'x'
I want barplot like this
I tried
barplot(x$Value, names.arg = x$'Categorical variable')
ggplot(as.data.frame(x$Value), aes(x$'Categorical variable')
Nothing seems to work properly. In barplot, all axis labels (freq values) are different. ggplot is filling all bars to 100%.
You can try plotting with geom_bar(). The following code generates what you are looking for.
df = data.frame(X = c("A","B C","D"),Y = c(23,12,43))
ggplot(df,aes(x=X,y=Y)) + geom_bar(stat='identity') + coord_flip()
It helps to read the ggplot documentation. ggplot requires a few things, including data and aes(). You've got both of those statements there but you're not using them correctly.
library(ggplot2)
library(magrittr)  # provides the %>% pipe used below
set.seed(256)
dat <-
  data.frame(variable = c("a", "b", "c"),
             value = rnorm(3, 10))
dat %>%
  ggplot(aes(x = variable, y = value)) +
  geom_bar(stat = "identity", fill = "blue") +
  coord_flip()
Here, I'm piping dat to ggplot as the data argument and referring to the x and y variables by name rather than passing data$... values. Next, I add geom_bar() with stat = "identity", which tells ggplot to use the actual values in value instead of counting the rows in each category.
You have to use stat = "identity" in geom_bar().
dat <- data.frame("cat" = c("A", "BC", "D"),
                  "val" = c(23, 12, 43))
ggplot(dat, aes(as.factor(cat), val)) +
  geom_bar(stat = "identity") +
  coord_flip()
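As a side note, geom_col() is a shorthand for geom_bar(stat = "identity"), and base R's barplot() can draw the same chart. The sketch below reuses the toy values from the answer above; the object name `p` is introduced here only for illustration.

```r
library(ggplot2)

# Same toy data as in the answer above
dat <- data.frame(cat = c("A", "BC", "D"), val = c(23, 12, 43))

# geom_col() plots the values as given, like geom_bar(stat = "identity")
p <- ggplot(dat, aes(x = cat, y = val)) +
  geom_col() +
  coord_flip()
p

# Base R equivalent: horizontal bars labelled by category
barplot(dat$val, names.arg = dat$cat, horiz = TRUE)
```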

How to create a heatmap with continuous scale using ggplot2 in R

I have got a data frame with several 1000 rows in the form of
group = c("gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3")
pos = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
color = c(2,2,2,2,3,3,2,2,3,2,1,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,2,2)
df = data.frame(group, pos, color)
and would like to make a kind of heatmap in which one axis has a continuous scale (position). The color column is categorical. However, because of the large number of data points, I want to use binning, i.e. treat it as a continuous variable.
This is more or less what the plot should look like:
I can't think of a way to create such a plot using ggplot2/R. I have tried several geometries, e.g. geom_point()
ggplot(data = df, aes(x = group, y = pos, color = color)) +
  geom_point() +
  scale_colour_gradientn(colors = c("yellow", "black", "orange"))
Thanks for your help in advance.
Does this help you?
library(ggplot2)
group = c("gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3")
pos = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
color = c(2,2,2,2,3,3,2,2,3,2,1,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,2,2)
df = data.frame(group, pos, color)
ggplot(data = df, aes(x = group, y = pos)) + geom_tile(aes(fill = color))
Looks like this
Here is an improved version with a three-colour gradient, if you like:
library(scales)
ggplot(data = df, aes(x = group, y = pos)) +
  geom_tile(aes(fill = color)) +
  scale_fill_gradientn(colours = c("orange", "black", "yellow"),
                       values = rescale(c(1, 2, 3)),
                       guide = "colorbar")
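The question also asks about binning when there are many positions. One possible approach, sketched here under assumptions, is to average the color value within fixed-width position bins before drawing the tiles. The data below is randomly generated, and `bin_width`, `bin` and `binned` are names introduced only for this illustration.

```r
library(ggplot2)

# Made-up data in the same shape as the question, with more positions
set.seed(1)
df <- data.frame(
  group = rep(c("gr1", "gr2", "gr3"), each = 1000),
  pos   = rep(1:1000, times = 3),
  color = sample(1:3, 3000, replace = TRUE)
)

bin_width <- 50  # assumed bin size; tune to the data

# Assign each position to the midpoint of its bin
df$bin <- (df$pos - 1) %/% bin_width * bin_width + bin_width / 2

# Mean color value per group and bin, which turns the category into a continuous value
binned <- aggregate(color ~ group + bin, data = df, FUN = mean)

ggplot(binned, aes(x = group, y = bin, fill = color)) +
  geom_tile() +
  scale_fill_gradientn(colours = c("orange", "black", "yellow"))
```

Averaging only makes sense if the categories have a meaningful numeric order; otherwise a per-bin majority category might be a better summary.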

Plotting the average values for each level in ggplot2

I'm using ggplot2 and am trying to generate a plot which shows the following data.
df=data.frame(score=c(4,2,3,5,7,6,5,6,4,2,3,5,4,8),
age=c(18,18,23,50,19,39,19,23,22,22,40,35,22,16))
str(df)
df
Instead of doing a frequency plot of the variables (see the code below), I want to generate a plot of the average values for each x value. So I want to plot the average score at each age level. At age 18 on the x axis, we might have a 3 on the y axis for score. At age 23, we might have an average score of 4.5, and so forth (Edit: average values corrected). This would ideally be represented with a barplot.
ggplot(df, aes(x=factor(age), y=factor(score))) + geom_bar()
Error: stat_count() must not be used with a y aesthetic.
I'm just not sure how to do this in R with ggplot2 and can't seem to find anything on such plots. Statistically, I don't know if the plot I want is even the right thing to do, but that's a different story.
Thanks!
You can use summary functions in ggplot. Here are two ways of achieving the same result:
# Option 1
ggplot(df, aes(x = factor(age), y = score)) +
  geom_bar(stat = "summary", fun = "mean")
# Option 2
ggplot(df, aes(x = factor(age), y = score)) +
  stat_summary(fun = "mean", geom = "bar")
Older versions of ggplot use fun.y instead of fun:
ggplot(df, aes(x = factor(age), y = score)) +
  stat_summary(fun.y = "mean", geom = "bar")
If I understood you right, you could try something like this:
library(plyr)
library(ggplot2)
ggplot(ddply(df, .(age), summarise, score = mean(score)), aes(x = factor(age), y = score)) + geom_col()
You can also use aggregate() in base R instead of loading another package.
temp = aggregate(list(score = df$score), list(age = factor(df$age)), mean)
ggplot(temp, aes(x = age, y = score)) + geom_col()
Another option is to group by the x values with group_by() and summarise the "mean_score" per "age" using dplyr, all in one pipe. You can also use geom_col instead of geom_bar. Here is a reproducible example:
df=data.frame(score=c(4,2,3,5,7,6,5,6,4,2,3,5,4,8),
age=c(18,18,23,50,19,39,19,23,22,22,40,35,22,16))
library(dplyr)
library(ggplot2)
df %>%
  group_by(age) %>%
  summarise(mean_score = mean(score)) %>%
  ggplot(aes(x = factor(age), y = mean_score)) +
  geom_col() +
  labs(x = "Age", y = "Mean score")
Created on 2022-08-26 with reprex v2.0.2
