Change tickmark labels in ggplot2 [duplicate] - r

I would like to plot a short time series showing the heterogeneity of heroin seizures in Europe over a span of 22 years. However, a different number of countries is included in some of the years. I would like to show this in the graph by putting "n=xx" under each year on the x-axis. Does anyone know how I should do this?
across_time<- ggplot(by_year, aes(year, value)) + # closing parenthesis of aes() was missing
geom_errorbar(aes(ymin=value-se, ymax=value+se), width=.4) +
geom_line(colour="black", size= 2) +
geom_point(size=4, shape=21, fill="white") + # 21 is filled circle
xlab("Year") +
ylab("Seizures") +
ggtitle("Heterogeneity Across Time") +
scale_x_continuous(breaks = seq(1990, 2012, by=2)) # a tick every two years
across_time
Here is a link to what the graph looks like:
http://imgur.com/XWhBqqi

I found this as a solution:
#make a list of the labels you want
lab<- c("1990\nn=26", "1991\nn=29", "1992\nn=30", "1993\nn=32", "1994\nn=36", "1995\nn=35", "1996\nn=33", "1997\nn=38", "1998\nn=36", "1999\nn=39", "2000\nn=39", "2001\nn=40", "2002\nn=38", "2003\nn=40", "2004\nn=39", "2005\nn=41", "2006\nn=42", "2007\nn=43", "2008\nn=44", "2009\nn=41", "2010\nn=41", "2011\nn=41", "2012\nn=42")
lab<- as.factor(lab)
#bind our label list to our table
by_year<-cbind(lab, by_year)
#make a column of zeros to group by for the line
by_year$g<- 0
# Standard error of the mean
across_time<- ggplot(by_year, aes(x=lab, y=value)) +
geom_errorbar(aes(ymin=value-se, ymax=value+se), width=.4) +
geom_line(aes(group=g), colour="black", size= 2) + #notice the grouping
geom_point(size=4, shape=21, fill="white") + # 21 is filled circle
scale_x_discrete(labels = by_year$lab) + # discrete not continuous
xlab("Year & Number of Reporting Countries") +
ylab("Total Annual Seizures") +
ggtitle("Heterogeneity of Heroin Seizures in Europe")
across_time
Here is the final result:

Have you tried using the labels argument in scale_x_continuous? If you have a vector with the "n=xx" strings you want as labels, and it has the same length as breaks, this should work.
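A minimal sketch of that suggestion, assuming the per-year country counts are available as a column. The toy by_year data frame and its n column are made up for illustration and are not the asker's data:

```r
library(ggplot2)

# Toy stand-in for by_year; the values and the n column are assumptions
by_year <- data.frame(
  year  = 1990:1995,
  value = c(10, 12, 9, 14, 13, 11),
  se    = c(1.0, 1.2, 0.8, 1.1, 0.9, 1.0),
  n     = c(26, 29, 30, 32, 36, 35)
)

# One label per break: the year, a line break, then the sample size
year_labels <- paste0(by_year$year, "\nn=", by_year$n)

across_time <- ggplot(by_year, aes(year, value)) +
  geom_errorbar(aes(ymin = value - se, ymax = value + se), width = 0.4) +
  geom_line() +
  geom_point(size = 4, shape = 21, fill = "white") +
  # breaks and labels must have the same length
  scale_x_continuous(breaks = by_year$year, labels = year_labels)
across_time
```

This keeps the x-axis continuous, so the line needs no extra grouping column, and the labels are built from the data instead of being typed by hand.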

Related

How do I sort the bars when the barchart represents the number of occurrences?

I draw a barchart in R:
ggplot(data, aes(x=rating, fill=rating)) +
geom_bar(stat="count") +
ggtitle("Rating in stories")+
coord_flip()+
xlab("rating")+
ylab("number of stories")+
theme(legend.position="none")
The result is here.
The bars represent the number of times each specific value (M, T, K or K+) occurs in the rating variable.
How do I sort the bars decreasingly?
OK, I found what I was looking for. I needed to use fct_rev(fct_infreq()) on the variable.
ggplot(data, aes(forcats::fct_rev(forcats::fct_infreq(rating)), fill=rating)) + # namespace both forcats calls
geom_bar(stat = "count") +
ggtitle("Rating in stories")+
coord_flip()+
xlab("rating")+
ylab("number of stories")+
theme(legend.position="none")
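To see what the two forcats calls do to the factor levels, here is a small sketch on a made-up rating vector (the counts are invented so that there are no ties):

```r
library(forcats)

# Hypothetical ratings: T x4, M x3, K x2, K+ x1
rating <- factor(c(rep("T", 4), rep("M", 3), rep("K", 2), "K+"))

# fct_infreq() orders levels by frequency, most frequent first
by_freq <- fct_infreq(rating)
levels(by_freq)

# fct_rev() reverses that order; after coord_flip() the first level is
# drawn at the bottom, so the most frequent level ends up at the top
flipped <- fct_rev(by_freq)
levels(flipped)
```

Without coord_flip(), fct_infreq() alone would already give bars in decreasing order from left to right.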

ggplot histogram: present both overall count in addition to group count in each bin

I am trying to generate a histogram using ggplot which on the x axis has speeds and on the y axis has the counts. In addition, each bin shows how many of those were during the day and night.
I need to present the counts themselves on the plot. I managed to add the counts within each bar but now I would like to present another number, the total count, on top of each bar. Is that possible?
This is my code:
ggplot(aes(x = speedmh ) , data = GPSdataset1hDFDS48) +
geom_histogram(aes(fill=DayActv), bins=15, colour="grey20", lwd=0.2) + ylim(0, 400) +xlim(0,500)+
stat_bin(bins=15, geom="text", colour="white", size=3.5,
aes(label=..count.., group=DayActv), position=position_stack(vjust=0.5))
and this is the result I get:
How do I add the total count of speeds within each bin to the top of every bar?
Ideally I would like to make this histogram of proportions of speeds instead of counts, but I think that is too complicated for me at the moment.
Thank you!!
Mia
One way is to add another stat_bin command without the grouping:
library(ggplot2)
ggplot(aes(x = speedmh) , data = GPSdataset1hDFDS48) +
geom_histogram(aes(fill=DayActv), bins=15, colour="grey20", lwd=0.2) + ylim(0, 400) +
xlim(0,500) +
stat_bin(bins=15, geom="text", colour="white", size=3.5,
aes(label=..count.., group=DayActv), position=position_stack(vjust=0.5)) +
stat_bin(bins=15, geom="text", colour="black", size=3.5,
aes(label=..count..), vjust=-0.5)
Data:
GPSdataset1hDFDS48 <- data.frame(speedmh=rexp(1000, 0.015), DayActv=factor(sample(0:1, 1000,TRUE)))
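In ggplot2 3.3.0 and later, the ..count.. notation is superseded by after_stat(count). A sketch of the same idea with the newer syntax, using simulated data in a hypothetical gps data frame (the seed and the name are my own choices):

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the simulated data is reproducible
gps <- data.frame(
  speedmh = rexp(1000, 0.015),
  DayActv = factor(sample(0:1, 1000, TRUE))
)

p <- ggplot(gps, aes(x = speedmh)) +
  geom_histogram(aes(fill = DayActv), bins = 15, colour = "grey20") +
  # per-group counts, centred inside each stacked segment
  stat_bin(bins = 15, geom = "text", colour = "white", size = 3.5,
           aes(label = after_stat(count), group = DayActv),
           position = position_stack(vjust = 0.5)) +
  # ungrouped layer: total count per bin, nudged above the bar
  stat_bin(bins = 15, geom = "text", colour = "black", size = 3.5,
           aes(label = after_stat(count)), vjust = -0.5)
p
```

Both stat_bin layers use the same bins value as the histogram, so the text positions line up with the bars.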

Drawing a bar chart using bin sizes

Arrival_Frequency  Total_Arrival
0-1                2633586
2-4                223079
4-7                5281
7+                 1718
How do I get a bar plot for this? If I use the normal geom_bar(), it plots the row count, not the total.
Do you want this?
library(scales)
library(tidyverse)
ggplot(df, aes(x=Arrival_Frequency, y=Total_Arrival))+
geom_bar(position=position_dodge(), stat="identity") +
scale_y_continuous(labels = label_number()) +
ylab("Total Arrival") + xlab("Arrival Frequency")
Your values are far apart (they span several orders of magnitude), so you could consider transforming the y-axis, for example with a log scale:
ggplot(df, aes(x=Arrival_Frequency, y=Total_Arrival))+
geom_col() +
scale_y_continuous(trans = "log", labels = label_number()) +
ylab("log (Total Arrival)") + xlab("Arrival Frequency")
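The snippets above assume a data frame called df that is not shown in the question. A minimal sketch of how it could be built from the posted table (the explicit factor levels are my assumption, added to keep the bins in table order):

```r
library(ggplot2)

bins <- c("0-1", "2-4", "4-7", "7+")

# Rebuild the table from the question; levels fix the bar order
df <- data.frame(
  Arrival_Frequency = factor(bins, levels = bins),
  Total_Arrival     = c(2633586, 223079, 5281, 1718)
)

# geom_col() plots the y values as given, equivalent to
# geom_bar(stat = "identity")
ggplot(df, aes(x = Arrival_Frequency, y = Total_Arrival)) +
  geom_col() +
  ylab("Total Arrival") + xlab("Arrival Frequency")
```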

Color areas in a radar chart using geom_area() in ggplot2

Before reading any further, I suggest you download and look at the original code in this question posted on this forum.
I run this ggplot code:
ggplot(data=data, aes(x=X2, y=Count, group=X3, colour=X3)) +
geom_point(size=5) +
geom_line() +
xlab("Decils") +
ylab("% difference in nº Pk") +
ylim(-50,25) + ggtitle("CL") +
geom_hline(aes(yintercept=0), lwd=1, lty=2) +
scale_x_discrete(limits=c(orden_deciles)) +
coord_polar() +
geom_area(aes(color=X3, fill=X3), alpha=0.2)
And got this plot:
As you may imagine, something's wrong with the code. The blue group seems to be colored properly. I would like to color all the groups, taking as reference the black dashed ring, which represents 0.
How can I implement this?
geom_area() has a default position argument of "stack". This stacks each area on top of the others, which creates weird results in this case.
Simply change the position to identity:
library(ggplot2)
## This was not given; I assumed it has this form
orden_deciles <- paste0('DC', 1:10)
ggplot(data=data, aes(x=X2, y=Count, group=X3, colour=X3)) +
geom_point(size=5) +
geom_line() +
xlab("Decils") +
ylab("% difference in nº Pk") +
ylim(-50,25) + ggtitle("CL") +
geom_hline(aes(yintercept=0), lwd=1, lty=2) +
scale_x_discrete(limits=c(orden_deciles)) +
geom_area(aes(fill=X3), alpha=0.2, position = position_identity()) +
coord_polar()
#> Warning: Removed 5 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_path).
Created on 2018-05-15 by the reprex package (v0.2.0).
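Since the original data was not posted, here is a self-contained sketch with made-up values. The radar_data frame, its three groups and the random Count values are all assumptions, chosen only to mimic the X2/X3/Count layout used above:

```r
library(ggplot2)

orden_deciles <- paste0("DC", 1:10)

set.seed(42)  # arbitrary seed for reproducible toy values
radar_data <- data.frame(
  X2    = rep(orden_deciles, times = 3),
  X3    = rep(c("A", "B", "C"), each = 10),
  Count = runif(30, min = -40, max = 20)
)

p <- ggplot(radar_data, aes(x = X2, y = Count, group = X3, colour = X3)) +
  geom_line() +
  geom_hline(yintercept = 0, linetype = 2) +
  # position = "identity" fills each group from 0 on its own,
  # instead of stacking the groups on top of each other
  geom_area(aes(fill = X3), alpha = 0.2, position = "identity") +
  scale_x_discrete(limits = orden_deciles) +
  ylim(-50, 25) +
  coord_polar()
p
```

Swapping position = "identity" for position = "stack" in this sketch should reproduce the kind of distorted fill described in the question.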

Subgroup Boxplots in R

I'm trying to make a graphic that will show three things side-by-side. First is to show change in the individual over time. Next is to show change in their peer group over time. Last is to show change in the overall population over time.
I have four time points for each observation. What I'd like to see is two sets of boxplots next to each other, one for the peer group and one for the population. Overlaid on each of these would be the data points for a given individual. Each set would show data at time1, time2, time3, and time4. The overlaid points would convey where the individual had been at each time, so all the information can be conveyed in two sets of boxplots.
Here is code to simulate the sort of data I am working with, and my ineffective attempt at creating my plot.
peer <- c(rep(1, 15), rep(2, 41))
year <- rep(c(1, 2), 28)
pct <- rep(1:8, 7)
dat <- data.frame(cbind(peer, year, pct))
ggplot(dat, aes(peer==1, pct)) + geom_boxplot() + facet_grid(. ~ year)
I don't think my ggplot approach is even close to correct. Please help!
Here's a sketch of what I'm trying to do.
Is this close to what you had in mind? There's a boxplot for each value of peer for each year. I've also included the mean values for each group.
# Boxplots for each combination of year and peer, with means superimposed
ggplot(dat, aes(year, pct, group=interaction(year,peer), colour=factor(peer))) +
geom_boxplot(position=position_dodge(width=0.4), width=0.4) +
stat_summary(fun=mean, geom="line", position=position_dodge(width=0.4), # fun replaces fun.y in ggplot2 >= 3.3.0
aes(group=peer)) +
stat_summary(fun=mean, geom="point", position=position_dodge(width=0.4), size=4,
aes(group=peer)) +
scale_x_continuous(breaks=unique(dat$year))
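The group=interaction(year, peer) aesthetic is what produces one box per year-and-peer combination. A quick sketch on the simulated data from the question shows the groups it creates:

```r
# Simulated data from the question
peer <- c(rep(1, 15), rep(2, 41))
year <- rep(c(1, 2), 28)
pct  <- rep(1:8, 7)
dat  <- data.frame(peer, year, pct)

# interaction() builds one factor level per year/peer pair,
# named "<year>.<peer>"
grp <- interaction(dat$year, dat$peer)
levels(grp)  # "1.1" "2.1" "1.2" "2.2"

# Number of observations behind each box
table(grp)
```

The summary layers then override this with aes(group=peer), so the mean line connects the years within each peer group.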
You can add a population boxplot, but then the plot starts to look cluttered:
# Add population boxplot (not grouped by peer)
ggplot(dat, aes(year, pct, group=interaction(year,peer), colour=factor(peer))) +
geom_boxplot(aes(group=year), width=0.05, colour="grey60", fill="#FFFFFF90") +
geom_boxplot(position=position_dodge(width=0.4), width=0.2) +
stat_summary(fun=mean, geom="line", position=position_dodge(width=0.4), # fun replaces fun.y in ggplot2 >= 3.3.0
aes(group=peer)) +
stat_summary(fun=mean, geom="point", position=position_dodge(width=0.4), size=4,
aes(group=peer)) +
scale_x_continuous(breaks=unique(dat$year))
UPDATE: Based on your comment, maybe something like this:
# Add an ID variable to the data
dat$id = rep(1:(nrow(dat)/2), each=2)
library(gridExtra) # For grid.arrange function
pdf("plots.pdf", 7, 5)
for (i in unique(dat$id)) {
p1 = ggplot() +
geom_boxplot(data=dat[dat$peer %in% unique(dat$peer[dat$id==i]),], # %in% avoids silent recycling if an id spans two peer groups
aes(year, pct, group=year)) +
geom_point(data=dat[dat$id==i,], aes(year, pct),
pch=8, colour="red", size=5) +
ggtitle("Your Peers")
p2 = ggplot() +
geom_boxplot(data=dat, aes(year, pct, group=year)) +
geom_point(data=dat[dat$id==i,], aes(year, pct),
pch=8, colour="red", size=5) +
ggtitle("All Participants")
grid.arrange(p1, p2, ncol=2, top=paste0("ID = ", i)) # gridExtra >= 2.0.0 uses top instead of main
}
dev.off()
Here's what the first plot looks like:
