Is there a way to add a legend with the count to give density of each row?
Or an easier way to show it?
Thanks very much!
Couldn't even get a legend added :)
Code I used:
data %>%
  ggplot(aes(x = subscribed, y = campaign)) +
  geom_point() +
  geom_jitter()
You could compute the number of observations per group (subscribed) beforehand with n() and store it as a string in a label column. Mapping that column to colour in aes() makes the counts show up in the legend. Here is a reproducible example:
library(dplyr)
library(ggplot2)
df %>%
  group_by(subscribed) %>%
  mutate(count = paste0(subscribed, ' (n = ', n(), ')')) %>%
  ggplot(aes(subscribed, campaign, colour = factor(count))) +
  geom_jitter()
Created on 2023-01-12 with reprex v2.0.2
Created data:
df <- data.frame(campaign = runif(100),
                 subscribed = rep(c("no", "yes"), 50))
I found another way to show similar data more clearly.
However, I couldn't figure out the legend lol
The code I used was:
p <- ggplot(data = data, aes(x = subscribed, y = pdays)) +
  geom_count() +
  scale_size_continuous(range = c(7, 30))
p + geom_text(data = ggplot_build(p)$data[[1]],
              aes(x, y, label = n), color = "#ffffff") +
  scale_y_continuous(breaks = seq(0, 30, by = 4))
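One way to get both the in-bubble counts and a size legend without reaching into ggplot_build() is to count the combinations first and plot the summary table. This is only a sketch: the toy data frame, the legend title "Observations", and the object names are made up for illustration, and the original data is not available here.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the real data: 200 rows with a yes/no flag
# and a numeric pdays value on a coarse grid.
set.seed(1)
toy <- data.frame(
  subscribed = sample(c("no", "yes"), 200, replace = TRUE),
  pdays = sample(seq(0, 28, by = 4), 200, replace = TRUE)
)

# One row per (subscribed, pdays) combination; count() names the tally column n.
counts <- count(toy, subscribed, pdays)

count_plot <- ggplot(counts, aes(x = subscribed, y = pdays)) +
  geom_point(aes(size = n)) +                    # bubble area encodes the count
  geom_text(aes(label = n), colour = "white") +  # print the count inside each bubble
  scale_size_continuous(range = c(7, 30), name = "Observations") +
  scale_y_continuous(breaks = seq(0, 30, by = 4))

count_plot
```

Because size is mapped only inside geom_point(), the text labels keep a constant size while the legend describes the bubbles.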
As the title says, I want to add a macron to a faceting label. An example:
library(tidyverse)
# subset data
df2 <- diamonds %>%
  sample_n(500)
# plot
ggplot(df2, aes(x = carat, y = price)) +
  geom_point() +
  facet_wrap(~cut)
Now I want to add a macron over the a in Fair
# attempt to recode Fair to Fāir
df2 <- df2 %>%
  mutate(cut2 = fct_recode(cut, "F\u0101ir" = "Fair"))
# doesn't work - produces exactly the same plot as above.
ggplot(df2, aes(x = carat, y = price)) +
  geom_point() +
  facet_wrap(~cut2)
Any tips would be greatly appreciated.
Looks like it's a problem with fct_recode rather than ggplot2. This seems to work just fine:
df2 <- diamonds %>%
  sample_n(500)
df2$cut2 <- df2$cut
levels(df2$cut2)[1] <- "F\u0101ir"
ggplot(df2, aes(x = carat, y = price)) +
  geom_point() +
  facet_wrap(~cut2)
Actually, I guess it has to do with argument names in R in general. It doesn't look like you can use Unicode argument names (at least not in R 4.0.5, which I tested with):
foo <- function(...) {
  print(match.call())
}
foo("F\u0101ir" = 1)
# foo(Fair = 1)
foo(Fāir = 1)
# foo(Fair = 1)
It seems the names are just converted to ASCII.
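If argument names are the sticking point, another way around it is to leave the factor alone and change only the strip text with a labeller function, so the macron only ever appears in a string value. A minimal sketch, assuming a graphics device that can render UTF-8; add_macron is a hypothetical helper name, not part of ggplot2:

```r
library(ggplot2)

# Hypothetical helper: replaces the facet label "Fair" with "Fāir".
# The non-ASCII character sits in a string value, never in an argument name.
add_macron <- function(labels) {
  sub("^Fair$", "F\u0101ir", labels)
}

set.seed(42)
df2 <- diamonds[sample(nrow(diamonds), 500), ]

macron_plot <- ggplot(df2, aes(x = carat, y = price)) +
  geom_point() +
  # as_labeller() accepts a function and applies it to the facet labels
  facet_wrap(~cut, labeller = as_labeller(add_macron))

macron_plot
```

The underlying cut column is untouched, so any later filtering or grouping by "Fair" keeps working.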
Suppose I have some code like the following, generating a lineplot with a considerable number of lines (example taken from here)
library(ggplot2)
library(reshape2)
n = 1000
set.seed(123)
mat = matrix(rnorm(n^2), ncol=n)
cmat = apply(mat, 2, cumsum)
cmat = t(cmat)
rownames(cmat) = paste("trial", seq(n), sep="")
colnames(cmat) = paste("time", seq(n), sep="")
dat = as.data.frame(cmat)
dat$trial = rownames(dat)
mdat = melt(dat, id.vars="trial")
mdat$time = as.numeric(gsub("time", "", mdat$variable))
p = ggplot(mdat, aes(x=time, y=value, group=trial)) +
  theme_bw() +
  theme(panel.grid=element_blank()) +
  geom_line(size=0.2, alpha=0.1)
So here, "trial number" is my group producing all of these lines, and there are 1000 trials.
Suppose I want to "group my grouping variable" now - that is, I want to see the exact same lines in this plot, but I want the first 500 trial lines to be one color and the next 500 trial lines to be another. How can I do this with ggplot? I've been poking around for some time and I can't figure out how to manually set the colors per group.
Add a variable splitting the data into two groups, then use it to color the lines in ggplot:
dat = as.data.frame(cmat)
dat$trial = rownames(dat)
dat$group = rep(c("a","b"), each = n/2)
mdat = melt(dat, id.vars=c("trial", "group"))
mdat$time = as.numeric(gsub("time", "", mdat$variable))
p = ggplot(mdat, aes(x=time, y=value, group=trial, color = group)) +
  theme_bw() +
  theme(panel.grid=element_blank()) +
  geom_line(size=0.2, alpha=0.1)
One possible solution is to create a new column holding the trial number as a numeric index, use an ifelse() condition on it to assign each trial to a group, and then pass that grouping variable as color in aes():
library(dplyr)  # for %>% and mutate()
mdat %>% mutate(Trial = as.numeric(sub("trial","",trial))) %>%
  mutate(Group = ifelse(Trial < 51,"A","B")) %>%  # first 50 trials vs. the rest (n = 100 here)
  ggplot(aes(x=time, y=value, group=trial, color = Group)) +
  theme_bw() +
  theme(panel.grid=element_blank()) +
  geom_line(size=0.2, alpha=0.8)
Is this what you are looking for?
NB: I only used n = 100 to get a smaller data frame.
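Since the question also asks about setting the colors manually, here is a small sketch that adds scale_colour_manual() on top of the grouping idea. The shrunken simulation (10 trials, 50 time points), the group labels, and the two colour names are arbitrary choices for illustration:

```r
library(ggplot2)

set.seed(123)
n_trials <- 10
n_times <- 50

# Long-format random walks: one row per (trial, time) pair.
walks <- data.frame(
  trial = rep(paste0("trial", seq_len(n_trials)), each = n_times),
  time = rep(seq_len(n_times), times = n_trials),
  step = rnorm(n_trials * n_times)
)
walks$value <- ave(walks$step, walks$trial, FUN = cumsum)

# First half of the trials in one group, second half in the other.
trial_index <- as.numeric(sub("trial", "", walks$trial))
walks$group <- ifelse(trial_index <= n_trials / 2, "first half", "second half")

colour_plot <- ggplot(walks, aes(x = time, y = value, group = trial, colour = group)) +
  geom_line(alpha = 0.6) +
  # A named vector ties each group label to a specific colour.
  scale_colour_manual(values = c("first half" = "steelblue",
                                 "second half" = "firebrick")) +
  theme_bw() +
  theme(panel.grid = element_blank())

colour_plot
```

Keeping group = trial draws one line per trial, while colour = group only decides which of the two colours each line gets.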
My goal is to produce two overlapping PMFs of binomial distributions using ggplot2, color-coded according to colors that I specify, with a legend at the bottom.
So far, I think I have set up the data frame right.
successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep(' A ',11),rep(' B ',11))
df1 <- data.frame(cbind(successes,freq,class))
However, this gives the wrong result.
library(ggplot2)
g <- ggplot(df1, aes(successes),y=freq)
g + geom_bar(aes(fill = class))
I feel like I'm following an example yet getting a totally different result. This (almost) does what I want: it would be exact if it gave relative frequencies.
g <- ggplot(mpg, aes(class))
g + geom_bar(aes(fill = drv))
A couple of questions:
1) Where am I going wrong in my block of code?
2) Is there a better way to show two PMFs in one graph? I'm not determined to use a histogram or bar chart.
3) How can I set this up to give me the ability to choose the colors?
4) How do I order the values on the x-axis? They aren't categories. They are the numbers 0-10 and have a natural order that I want to preserve.
Thanks!
UPDATE
The following two blocks worked.
successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep(' A ',11),rep(' B ',11))
df1 <- data.frame(successes,freq,class)
ggplot(df1, aes(successes, y = freq, fill = class)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(breaks = seq(0,10,1)) +
  scale_fill_manual(values = c("blue", "green")) +
  theme_bw()
AND
successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep(' A ',11),rep(' B ',11))
df1 <- data.frame(successes,freq,class)
ggplot(df1, aes(x = successes, y = freq)) +
  geom_col(aes(fill = class)) +
  scale_x_continuous(breaks = seq(0,10,1)) +
  scale_fill_manual(values = c("blue", "green")) +
  theme_bw()
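I think your issue is that successes and freq are being changed to factors (or character strings in R 4.0 and later) when you create df1: cbind() first combines the three vectors into a single character matrix, so data.frame() never sees them as numbers.
Maybe this is what you're thinking of?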
successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep(' A ',11),rep(' B ',11))
df1 <- data.frame(successes = as.numeric(successes), freq = as.numeric(freq), class)
ggplot(df1, aes(x = successes, y = freq)) +
  geom_bar(stat = "identity", aes(fill = class))
If not, happy to answer any further questions!
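To see the coercion directly, this short base R sketch compares the two ways of building the data frame. The variable is renamed to class_label here purely to avoid shadowing the base function class(); that rename is not in the original post.

```r
successes <- rep(0:10, times = 2)
freq <- c(dbinom(0:10, 10, 0.2), dbinom(0:10, 10, 0.8))
class_label <- rep(c(" A ", " B "), each = 11)

# cbind() builds a single matrix, and a matrix can hold only one type,
# so the character vector forces every column to character.
as_matrix <- cbind(successes, freq, class_label)
typeof(as_matrix)

# Wrapping that matrix in data.frame() cannot recover the numbers.
df_bad <- data.frame(as_matrix)
str(df_bad)

# Passing the vectors directly keeps each column's own type.
df_ok <- data.frame(successes, freq, class_label)
str(df_ok)
```

With non-numeric successes, ggplot2 treats the x-axis as discrete, which is also why the values lose their natural 0 to 10 ordering.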
Is this what you're looking for?
library(ggplot2)
g <- ggplot(df1, aes(successes, y = freq, fill = class))
g + geom_bar(stat = "identity") +
  scale_fill_manual(values = c("blue", "green"))
Of course, keep in mind that you would need to change your data frame creation to:
successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep(' A ',11),rep(' B ',11))
df1 <- data.frame(successes,freq,class)
as suggested in the comments.
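On the original goal of two overlapping PMFs with the legend at the bottom: geom_col() and geom_bar(stat = "identity") stack bars by default, so the plots above show stacked rather than overlapping bars. A minimal sketch of one way to overlay them, assuming semi-transparent bars are acceptable; the trimmed class labels, the alpha value, and the object names are illustrative choices, not from the original post:

```r
library(ggplot2)

successes <- rep(0:10, times = 2)
freq <- c(dbinom(0:10, 10, 0.2), dbinom(0:10, 10, 0.8))
class <- rep(c("A", "B"), each = 11)
pmf_df <- data.frame(successes, freq, class)

pmf_plot <- ggplot(pmf_df, aes(x = successes, y = freq, fill = class)) +
  # position = "identity" draws both sets of bars from zero instead of stacking;
  # alpha lets the bar behind show through where the two overlap.
  geom_col(position = "identity", alpha = 0.5) +
  scale_x_continuous(breaks = 0:10) +
  scale_fill_manual(values = c(A = "blue", B = "green")) +
  theme_bw() +
  theme(legend.position = "bottom")

pmf_plot
```

Naming the entries of values ties each colour to a class explicitly instead of relying on the order of the levels.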
I have data with lots of factor variables that I am visualising to get a feel for each of the variables. I am reproducing a lot of the code with minor tweaks for variable names etc., so I decided to write a function to simplify things. I just can't get it to work...
Dummy Data
ID <- sample(1:32, 128, replace = TRUE)
AgeGrp <- sample(c("18-65", "65-75", "75-85", "85+"), 128, replace = TRUE)
ID <- factor(ID)
AgeGrp <- factor(AgeGrp)
data <- data_frame(ID, AgeGrp)
data
Basically what I am trying to do with each factor variable is produce a bar chart with labels of percentages inside the bars. For example with the dummy data.
plotstats <- # Create a table with pre-summarised percentages
  data %>%
  group_by(AgeGrp) %>%
  summarise(count = n()) %>%
  mutate(pct = count/sum(count)*100)
age_plot <- # Plot the data
  ggplot(data, aes(x = AgeGrp)) +
  geom_bar() + # Add the percentage labels using pre-summarised table
  geom_text(data = plotstats, aes(label=paste0(round(pct,1),"%"),y=pct),
            size=3.5, vjust = -1, colour = "sky blue") +
  ggtitle("Count of Age Group")
age_plot
This works fine with the dummy data - but when I try to create a function...
basic_plot <-
  function(df, x){
    plotstats <-
      df %>%
      group_by_(x) %>%
      summarise_(
        count = ~n(),
        pct = ~count/sum(count)*100)
    plot <-
      ggplot(df,aes(x = x)) +
      geom_bar() +
      geom_text(data = plotstats, aes(label=paste0(round(pct,1),"%"),
                y=pct), size=3.5, vjust = -1, colour = "sky blue")
    plot
  }
basic_plot(data, AgeGrp)
I get the error:
Error in UseMethod("as.lazy") : no applicable method for 'as.lazy' applied to an object of class "factor"
I have looked at questions here, here, and here and also looked at the NSE Vignette but can't find my fault.
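For comparison, here is a hedged sketch of how such a function is commonly written with current tidy evaluation, assuming dplyr and ggplot2 versions that support the embrace operator {{ }} (rlang 0.4 or later). The underscore verbs group_by_() and summarise_() are deprecated in dplyr. This is not the asker's final solution. It also places each label at the bar's count rather than at pct, which is a deliberate change so the text sits on top of the bar:

```r
library(dplyr)
library(ggplot2)

basic_plot <- function(df, x) {
  # {{ x }} forwards the unquoted column name into dplyr and ggplot2.
  plotstats <- df %>%
    group_by({{ x }}) %>%
    summarise(count = n()) %>%
    mutate(pct = count / sum(count) * 100)

  ggplot(df, aes(x = {{ x }})) +
    geom_bar() +
    # Labels come from the pre-summarised table and sit at the bar height.
    geom_text(data = plotstats,
              aes(y = count, label = paste0(round(pct, 1), "%")),
              size = 3.5, vjust = -1, colour = "skyblue")
}

# Usage with dummy data shaped like the example above.
set.seed(1)
dummy <- data.frame(
  AgeGrp = factor(sample(c("18-65", "65-75", "75-85", "85+"), 128, replace = TRUE))
)
basic_plot(dummy, AgeGrp)
```

The column is passed unquoted, as in basic_plot(dummy, AgeGrp), and the same function then works for any other factor column in the data frame.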