How to draw graph with dodged position? - r

I am an R novice. I will try to be as brief and simple as possible. Currently, I am trying to connect points between two conditions based on another condition all over a single discrete x-axis.
Below is some test data and my attempt to plot some data.
library(tibble)
library(ggplot2)
set.seed(42)
# Test case data
mydf1 <- tibble(
xx = rep('myLabel', 8),
yy = rnorm(8),
grp = rep(c(1, 2), each = 4),
cond = rep(c('a', 'b', 'c', 'd'), length.out = 8)
)
ggplot(mydf1, aes(x = xx, y = yy, col = factor(grp))) +
geom_point(position = position_dodge(width = 0.9)) +
geom_path(position = position_dodge(width = 0.9), aes(group = cond), col = "black") +
theme_bw() +
ggtitle("Test Case for geom_path and position_dodge")
From what I can tell, it seems that position_dodge is applied after the draw. Is there a way to change this behavior, or to achieve the overall goal of connecting these points in this way?
Thank you for your time.
EDIT: details.
EDIT2:
I would like to capture a before-and-after relationship between the two grp values for each of the 4 conditions, all within one main condition.

Probably you want this.
set.seed(42)
library(ggplot2)
ggplot(mydf1, aes(x = grp, y = yy, col = factor(grp))) +
geom_point() +
geom_path(aes(group = cond), col = "black") +
theme_bw() +
ggtitle("Test Case for geom_path and position_dodge") +
xlim(c(.5, 2.5)) +
labs(color = "Group", x = "myLabel", y = "yy") +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank())
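If the single discrete label has to stay on the x-axis, one possible workaround is to compute the dodged x positions by hand. This is only a sketch: position_dodge() offsets by the group aesthetic, and the path is grouped by cond rather than grp, so the points and the path end up dodged differently. The column x_num and the variable dodge_width are hypothetical names introduced here, and data.frame() stands in for tibble() to keep the example dependency-free.

```r
library(ggplot2)

set.seed(42)
mydf1 <- data.frame(
  xx   = rep("myLabel", 8),
  yy   = rnorm(8),
  grp  = rep(c(1, 2), each = 4),
  cond = rep(c("a", "b", "c", "d"), length.out = 8)
)

# Assumed dodge width, matching the 0.9 used in the question
dodge_width <- 0.9

# Numeric x position: index of the label plus a per-group offset,
# so grp 1 sits left of the label position and grp 2 sits right of it
mydf1$x_num <- as.numeric(factor(mydf1$xx)) +
  (mydf1$grp - mean(unique(mydf1$grp))) * dodge_width / 2

p <- ggplot(mydf1, aes(x = x_num, y = yy)) +
  geom_path(aes(group = cond), colour = "black") +   # one segment per cond
  geom_point(aes(colour = factor(grp))) +            # points coloured by grp
  scale_x_continuous(breaks = 1, labels = "myLabel", limits = c(0.5, 1.5)) +
  labs(x = NULL, colour = "Group") +
  theme_bw()
p
```

Because both layers read the same precomputed x_num, the segments end exactly on the points, and the axis still shows only the one label.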

You could plot a categorical x axis.
ggplot(mydf1, aes(x = cond, y = yy, col = factor(grp))) +
geom_point() +
geom_path(aes(group = cond), col = "black") +
theme_bw() +
ggtitle("Test Case for categorical X-axis")
Alternatively, if you need comparison across multiple categorical dimensions mapped to the x axis, you can try facets.
ggplot(mydf1, aes(x = cond, y = yy, col = factor(grp))) +
geom_point() +
geom_path(aes(group = cond), col = "black") +
theme_bw() +
ggtitle("Test Case for Categorical X-axis and Facets") +
facet_wrap(~cond)

Related

How to segregate by one factor but colour by another in ggplot2 R?

In my dataset, I have segregated the data by a parameter par for either Black or Red noise, which are staggered in representation. Now, for both species, I want to colour the "Black" noise black and the "Red" noise red. Furthermore, I want to join the points by par: specifically, par = No with a dashed line and par = Yes with a solid line. I tried the code below (and multiple versions of it), but no luck. Any suggestions?
#Data
set.seed(100)
sp <- factor(c("A","A","A","A","B","B","B","B"))
par <- factor(c("No","No","Yes","Yes","No","No","Yes","Yes"))
y <- rnorm(8, 2,3)
noise <- factor(c("Black","Red","Black","Red","Black","Red","Black","Red"))
df <- data.frame(sp, par, y, noise)
df$noise <- factor(df$noise, levels = c("Black","Red"))
library(ggplot2)
ggplot(data = df, aes(x = noise, y = y, fill = par, color = par)) +
geom_point(size = 4) +
facet_wrap(.~sp) +
theme_classic() +
scale_fill_manual(values = c("black","red")) + scale_color_manual(values = c("black","red")) +
geom_line(aes(linetype=par)) + scale_linetype_manual(name = "indicator", values = c(2,1,2))
geom_path(aes(group = par,linetype=par), geom = "path")
ERROR: geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
In your code, you forgot to add a + to link geom_path() to the rest of the ggplot() call. Since the aes() mappings of geom_point() and geom_path() don't match, you'll need to put them in the corresponding geom_*() calls.
library(tidyverse)
ggplot(data = df, aes(x = noise, y = y, group = par, linetype = par)) +
geom_point(aes(fill = noise, color = noise), size = 4) +
facet_wrap(.~sp) +
theme_classic() +
scale_fill_manual(values = c("black","red")) +
scale_color_manual(values = c("black","red")) +
geom_line() +
scale_linetype_manual(name = "indicator", values = c(2,1,2)) +
geom_path()

adding a label in geom_line in R

I have two very similar plots, which have two y-axis - a bar plot and a line plot:
code:
sec_plot <- ggplot(data, aes_string (x = year, group = 1)) +
geom_col(aes_string(y = frequency), fill = "orange", alpha = 0.5) +
geom_line(aes(y = severity))
However, there are no labels. I want to get a label for the barplot as well as a label for the line plot, something like:
How can I add the labels to the plot if there is only one single group? Is there a way to specify this manually? Until now I have only found options where the labels can be added by specifying them in the aes.
EXTENSION (added a posteriori):
getSecPlot <- function(data, xvar, yvar, yvarsec, groupvar){
if ("agegroup" %in% xvar) xvar <- get("agegroup")
# data <- data[, startYear:= as.numeric(startYear)]
data <- data[!claims == 0][, ':=' (scaled = get(yvarsec) * max(get(yvar))/max(get(yvarsec)),
param = max(get(yvar))/max(get(yvarsec)))]
param <- data[1, param] # important, otherwise not found in ggplot
sec_plot <- ggplot(data, aes_string (x = xvar, group = groupvar)) +
geom_col(aes_string(y = yvar, fill = groupvar, alpha = 0.5), position = "dodge") +
geom_line(aes(y = scaled, color = gender)) +
scale_y_continuous(sec.axis = sec_axis(~./(param), name = paste0("average ", yvarsec),labels = function(x) format(x, big.mark = " ", scientific = FALSE))) +
labs(y = paste0("total ", yvar)) +
scale_alpha(guide = 'none') +
theme_pubclean() +
theme(legend.title=element_blank(), legend.background = element_rect(fill = "white"))
}
plot.ExposureYearly <- getSecPlot(freqSevDataAge, xvar = "agegroup", yvar = "exposure", yvarsec = "frequency", groupvar = "gender")
plot.ExposureYearly
How can the same be done on a plot where both the line plot as well as the bar plot are separated by gender?
Here is a possible solution. The method I used was to move the color and fill inside the aes and then use scale_*_identity to create and format the legends.
Also, I needed to rescale severity by hand (here by dividing by 1000), since a ggplot2 secondary axis is only a one-to-one transformation of the primary axis and does not rescale the data for you.
data<-data.frame(year= 2000:2005, frequency=3:8, severity=as.integer(runif(6, 4000, 8000)))
library(ggplot2)
library(scales)
sec_plot <- ggplot(data, aes(x = year)) +
geom_col(aes(y = frequency, fill = "orange"), alpha = 0.6) +
geom_line(aes(y = severity/1000, color = "black")) +
scale_fill_identity(guide = "legend", labels = "Claim frequency (Number of paid claims per 100 Insured exposure)", name = NULL) +
scale_color_identity(guide = "legend", labels = "Claim Severity (Average insurance payment per claim)", name = NULL) +
theme(legend.position = "bottom") +
scale_y_continuous(sec.axis =sec_axis( ~ . *1, labels = label_dollar(scale=1000), name="Severity") ) + #formats the 2nd axis
guides(fill = guide_legend(order = 1), color = guide_legend(order = 2)) #control which scale plots first
sec_plot
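The fixed factor of 1000 could also be derived from the data, much like the param value in the question's extension. A minimal sketch, assuming you want the largest severity to line up with the largest frequency; the name k is introduced here for illustration, and set.seed(1) is only there to make the made-up data reproducible:

```r
library(ggplot2)

set.seed(1)
data <- data.frame(year = 2000:2005,
                   frequency = 3:8,
                   severity = as.integer(runif(6, 4000, 8000)))

# Scale factor (assumption): map max(severity) onto max(frequency)
k <- max(data$severity) / max(data$frequency)

sec_plot <- ggplot(data, aes(x = year)) +
  geom_col(aes(y = frequency), fill = "orange", alpha = 0.6) +
  geom_line(aes(y = severity / k)) +               # squeeze severity onto the primary axis
  scale_y_continuous(name = "Frequency",
                     sec.axis = sec_axis(~ . * k,  # undo the scaling for the axis labels
                                         name = "Severity"))
sec_plot
```

The division in geom_line() and the multiplication in sec_axis() have to be exact inverses, otherwise the right-hand axis no longer describes the line.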

ggplot2 - how to limit panel and axis?

I want to know how to turn this plot:
Into this plot:
As you can see the panel and axis on the 2nd plot are limited to the data extent. I made the second graph using design software but want to know the code.
I've already limited the x and y axes using xlim and ylim, but it made no difference.
Please see my code below. Sorry it's so messy, it's my first time using RStudio. Thanks!
ggplot() +
geom_errorbar(data = U1483_Coiling_B_M_Removed_R, mapping = aes(x = `Age (Ma) Linear Age Model`, ymin = `Lower interval*100`, ymax = `Upper interval*100`), width = 0.025, colour = 'grey') +
geom_line(data = U1483_Coiling_B_M_Removed_R, aes(x = `Age (Ma) Linear Age Model`, y = `Percent Dextral`)) +
geom_point(data = U1483_Coiling_B_M_Removed_R, aes(x = `Age (Ma) Linear Age Model`, y = `Percent Dextral`), colour = 'red') +
geom_point(data = U1483_Coiling_B_M_Removed_R, aes(x = `Age (Ma) Linear Age Model`, y = `Lab?`)) +
theme(axis.text.x=element_text(angle=90, size=10, vjust=0.5)) +
theme(axis.text.y=element_text(angle=90, size=10, vjust=0.5)) +
theme_classic() +
theme(panel.background = element_rect(colour = 'black', size = 1)) +
xlim(0, 2.85) +
ylim(0, 100)
You can use expand when specifying axis scales, like so:
# Load library
library(ggplot2)
# Set RNG
set.seed(0)
# Create dummy data
df <- data.frame(x = seq(0, 3, by = 0.1))
df$y <- 100 - abs(rnorm(nrow(df), 0, 10))
# Plot results
# Original
ggplot(df, aes(x, y)) +
geom_line() +
geom_point(colour = "#FF3300", size = 5)
# With expand
ggplot(df, aes(x, y)) +
geom_line() +
geom_point(colour = "#FF3300", size = 5) +
scale_y_continuous(expand = c(0, 0))
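The same idea can be applied to both axes at once, together with explicit limits. The sketch below reuses the dummy data from above; the limits c(0, 3) and c(0, 100) are assumptions that happen to fit this dummy data, and coord_cartesian(clip = "off") is an optional extra so that points sitting on the panel edge are not cut in half.

```r
library(ggplot2)

set.seed(0)
df <- data.frame(x = seq(0, 3, by = 0.1))
df$y <- 100 - abs(rnorm(nrow(df), 0, 10))

p <- ggplot(df, aes(x, y)) +
  geom_line() +
  geom_point(colour = "#FF3300", size = 5) +
  # expand = c(0, 0) removes the padding ggplot2 adds around the limits
  scale_x_continuous(limits = c(0, 3), expand = c(0, 0)) +
  scale_y_continuous(limits = c(0, 100), expand = c(0, 0)) +
  # optional: let points on the panel border draw outside the panel
  coord_cartesian(clip = "off")
p
```

Note that limits set on a scale drop any data outside them, so they need to cover the full range you want to show.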

Aesthetics must be either length 1 or the same as the data (1): x, y, label

I'm working on some data on party polarization (something like this) and used geom_dumbbell from ggalt and ggplot2. I keep getting the same aes error and other solutions in the forum did not address this as effectively. This is my sample data.
df <- data_frame(policy=c("Not enough restrictions on gun ownership", "Climate change is an immediate threat", "Abortion should be illegal"),
Democrats=c(0.54, 0.82, 0.30),
Republicans=c(0.23, 0.38, 0.40),
diff=sprintf("+%d", as.integer((Democrats-Republicans)*100)))
I wanted to keep order of the plot, so converted policy to factor and wanted % to be shown only on the first line.
df <- arrange(df, desc(diff))
df$policy <- factor(df$policy, levels=rev(df$policy))
percent_first <- function(x) {
x <- sprintf("%d%%", round(x*100))
x[2:length(x)] <- sub("%$", "", x[2:length(x)])
x
}
Then I used ggplot that rendered something close to what I wanted.
gg2 <- ggplot()
gg2 <- gg + geom_segment(data = df, aes(y=country, yend=country, x=0, xend=1), color = "#b2b2b2", size = 0.15)
# making the dumbbell
gg2 <- gg + geom_dumbbell(data=df, aes(y=country, x=Democrats, xend=Republicans),
size=1.5, color = "#B2B2B2", point.size.l=3, point.size.r=3,
point.color.l = "#9FB059", point.color.r = "#EDAE52")
I then wanted the dumbbell to read Democrat and Republican on top to label the two points (like this). This is where I get the error.
gg2 <- gg + geom_text(data=filter(df, country=="Government will not control gun violence"),
aes(x=Democrats, y=country, label="Democrats"),
color="#9fb059", size=3, vjust=-2, fontface="bold", family="Calibri")
gg2 <- gg + geom_text(data=filter(df, country=="Government will not control gun violence"),
aes(x=Republicans, y=country, label="Republicans"),
color="#edae52", size=3, vjust=-2, fontface="bold", family="Calibri")
Any thoughts on what I might be doing wrong?
I think it would be easier to build your own "dumbbells" with geom_segment() and geom_point(). Working with your df and changing the variable references from "country" to "policy":
library(tidyverse)
# gather data into long form to make ggplot happy
df2 <- gather(df,"party", "value", Democrats:Republicans)
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
# our dumbbell
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
# the text labels
geom_text(aes(label = party), vjust = -1.5) + # use vjust to shift the text up so it does not overlap the points
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red")) + # named vector maps colors to the values in df2
scale_x_continuous(limits = c(0,1), labels = scales::percent) # scales::percent formats the axis labels instead of pasting "%"
Produces this plot:
Which has some overlapping labels. I think you could avoid that if you use just the first letter of party like this:
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
geom_text(aes(label = gsub("^(\\D).*", "\\1", party)), vjust = -1.5) + # just the first letter instead
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red"),
guide = "none") +
scale_x_continuous(limits = c(0,1), labels = scales::percent)
Only label the top issue with names:
ggplot(data = df2, aes(y = policy, x = value, color = party)) +
geom_path(aes(group = policy), color = "#b2b2b2", size = 2) +
geom_point(size = 7, show.legend = FALSE) +
geom_text(data = filter(df2, policy == "Not enough restrictions on gun ownership"),
aes(label = party), vjust = -1.5) +
scale_color_manual(values = c("Democrats" = "blue", "Republicans" = "red")) +
scale_x_continuous(limits = c(0,1), labels = scales::percent)
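A side note on the reshaping step: gather() still works, but tidyr documents it as superseded by pivot_longer(). A sketch of the equivalent call, assuming the same column names as above (the diff column is left out for brevity, and data.frame() is used instead of data_frame()):

```r
library(tidyr)

df <- data.frame(
  policy = c("Not enough restrictions on gun ownership",
             "Climate change is an immediate threat",
             "Abortion should be illegal"),
  Democrats = c(0.54, 0.82, 0.30),
  Republicans = c(0.23, 0.38, 0.40)
)

# Same long format as gather(df, "party", "value", Democrats:Republicans)
df2 <- pivot_longer(df,
                    cols = c(Democrats, Republicans),
                    names_to = "party",
                    values_to = "value")
df2
```

The rows come out in a different order than with gather() (grouped by policy rather than by party), which makes no difference to the plots above.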

Colour average lines in ggplot

I would like to colour the dashed lines, which mark the mean values of the two respective categories, with the same colours from the default ggplot palette that are used to fill the distributions.
This is the code used:
library(ggplot2)
print(ggplot(dati, aes(x=ECU_fuel_consumption_L_100Km_CF, fill=Model))
+ ggtitle("Fuel Consumption density histogram, by Model")
+ ylab("Density")
+ geom_density(alpha=.3)
+ scale_x_continuous(breaks=pretty(dati$ECU_fuel_consumption_L_100Km_CF, n=10))
+ geom_vline(aes(xintercept = mean(ECU_fuel_consumption_L_100Km_CF[dati$Model == "500X"])), linetype="dashed", size=1)
+ geom_vline(aes(xintercept = mean(ECU_fuel_consumption_L_100Km_CF[dati$Model == "Renegade"])), linetype="dashed", size=1)
)
Thank you all in advance!
No reproducible example, but you probably want to do something like this:
library(dplyr)
library(ggplot2)
# make up some data
d <- data.frame(x = c(mtcars$mpg, mtcars$hp),
var = rep(c('mpg', 'hp'), each = nrow(mtcars)))
means <- d %>% group_by(var) %>% summarize(m = mean(x))
ggplot(d, aes(x, fill = var)) +
geom_density(alpha = 0.3) +
geom_vline(data = means, aes(xintercept = m, col = var),
linetype = "dashed", size = 1)
This approach is extendable to any number of groups.
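To illustrate that last point, here is the same pattern with three groups. It is only a sketch on the built-in iris data (not the asker's data); nothing changes except the number of rows in the summary table.

```r
library(dplyr)
library(ggplot2)

# One mean per group; works the same for any number of groups
means <- iris %>%
  group_by(Species) %>%
  summarize(m = mean(Sepal.Length))

p <- ggplot(iris, aes(Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.3) +
  # mapping colour to the same variable as fill gives matching default colours
  geom_vline(data = means, aes(xintercept = m, colour = Species),
             linetype = "dashed")
p
```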
An option that doesn't require pre-calculation, but is also a bit more hacky, is:
ggplot(d, aes(x, fill = var)) +
geom_density(alpha = 0.3) +
geom_vline(aes(col = 'hp', xintercept = x), linetype = "dashed", size = 1,
data = data.frame(x = mean(d$x[d$var == 'hp']))) +
geom_vline(aes(col = 'mpg', xintercept = x), linetype = "dashed", size = 1,
data = data.frame(x = mean(d$x[d$var == 'mpg'])))
