Only display label per category - r

I have the following dataset:
year <- as.factor(c(1999,2000,2001))
era <- c(0.4,0.6,0.7)
player_id <- as.factor(c(2,2,2))
df <- data.frame(year, era, player_id)
Using this data I created the following graph:
ggplot(data = df, aes(x = year, y=era, colour = player_id))+
geom_line() +
geom_text(aes(label = player_id), hjust=0.7)
The problem is that I now get a label at every data point. I only want a label at the end of each line.
Any thoughts on what I should change so that I get only one label?

If I understand correctly, you want a single label at the end of each line. You could do this with the directlabels package, as below:
library(ggplot2)
library(directlabels)
ggplot(data = df, aes(x = year, y=era, group = player_id, colour = player_id))+
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_discrete(expand=c(0, 1)) +
geom_dl(aes(label = player_id), method = list(dl.combine("last.points"), cex = 0.8))

If I am understanding correctly what you want, then you can replace the geom_text(...) with geom_point()
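If you would rather not add a dependency, a minimal sketch with plain ggplot2 is to give geom_text its own data frame containing only the last row per player. The name last_points is made up for this example, and it assumes the rows are already sorted by year. Note that group = player_id is needed because year is a factor; without it, each point ends up in its own group and no line is drawn.

```r
library(ggplot2)

df <- data.frame(
  year = factor(c(1999, 2000, 2001)),
  era = c(0.4, 0.6, 0.7),
  player_id = factor(c(2, 2, 2))
)

# Keep only the last row for each player (assumes rows are sorted by year)
last_points <- do.call(
  rbind,
  lapply(split(df, df$player_id), function(d) d[nrow(d), ])
)

p <- ggplot(df, aes(x = year, y = era, group = player_id, colour = player_id)) +
  geom_line() +
  # The label layer only sees the last point of each line
  geom_text(data = last_points, aes(label = player_id), hjust = -0.5)

p
```

The hjust value pushes the label to the right of the last point; adjust it to taste.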

Related

ggplot geom_line - setting colour of lines doesn't work?

I'm trying to plot several lines and then colouring them grey. However, whatever the colour I set, I get black lines. And if I put colour inside the aesthetic, then I get different colours (as expected), even if I specify the argument colour again outside aes().
I'm sure I'm missing something very basic here!
library(tidyverse)
library(ggplot2)
country <- c(rep("A", 10), rep("B",10), rep("C", 10))
year <- c(2000:2009, 2000:2009, 2000:2009)
value <- c(rnorm(10), rnorm(10, mean = 0.5), rnorm(10, mean = 1.1))
myData <- tibble(country, year, value) %>%
mutate(avg = mean(value))
ggplot(myData,
aes(x = year, y = value, country = country),
colour = "grey") +
geom_line()
Try this:
ggplot(myData, aes(x = year, y = value, group = country, colour = I("grey"))) +
geom_line()
Here is another approach, using scale_color_manual:
p <- ggplot(myData, aes(x = year, y = value, color=country)) +
geom_line()
p + scale_color_manual(values=c("#a6a6a6", "#a6a6a6", "#a6a6a6"))
Instead of using hex color you could also use:
p + scale_color_manual(values=c("gray69", "gray69", "gray69"))
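For what it's worth, the reason the original call draws black lines is that colour = "grey" was passed to ggplot() rather than to the layer, so geom_line() never sees it. A minimal sketch of the usual fix, assuming you want one grey line per country: map country to group inside aes() and set the constant colour in geom_line(). The data below is a made-up stand-in for myData.

```r
library(ggplot2)

set.seed(1)
# Stand-in data with the same columns as myData in the question
myData <- data.frame(
  country = rep(c("A", "B", "C"), each = 10),
  year = rep(2000:2009, times = 3),
  value = rnorm(30)
)

p <- ggplot(myData, aes(x = year, y = value, group = country)) +
  # A constant colour is set outside aes(), on the layer itself
  geom_line(colour = "grey")

p
```

Because the colour is a fixed value and not a mapped aesthetic, no legend is drawn.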

Plot many variables

I have a data frame like this one:
data <- data.frame(year = c(2010,2011,2012,2010,2011,2012),
name = c("stock1","stock1","stock1","stock2","stock2","stock2"),
value = c(0,3,1,4,1,3))
I would like to create a plot and I use this:
library(ggplot2)
ggplot(data=data, xName="year", groupName="name", brewerPalette="Blues")
but I don't get a plot. Is anything wrong with the call?
I think you need something like this:
library(ggplot2)
library(dplyr)
library(RColorBrewer)
data %>%
group_by(name) %>%
ggplot(aes(year,value,fill=name))+
geom_col()+
scale_fill_brewer(palette = "Blues")
If you want a grouped bar plot (as I guessed from your code), this code may be helpful:
ggplot(data = data, aes(x = as.factor(year), y = value, fill = name)) +
geom_bar(stat = "identity", position = position_dodge(0.8), width = 0.7) +
scale_fill_brewer(palette = "Blues")
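If what you are after is one line per stock over time and not bars, a sketch along the same lines would map name to colour and group. This is a guess at the intent; xName, groupName and brewerPalette are not arguments of ggplot(), so the mappings have to go through aes() and the palette through a scale.

```r
library(ggplot2)

data <- data.frame(
  year = c(2010, 2011, 2012, 2010, 2011, 2012),
  name = c("stock1", "stock1", "stock1", "stock2", "stock2", "stock2"),
  value = c(0, 3, 1, 4, 1, 3)
)

p <- ggplot(data, aes(x = year, y = value, colour = name, group = name)) +
  geom_line() +
  geom_point() +
  # Show only the years present in the data on the x axis
  scale_x_continuous(breaks = unique(data$year)) +
  scale_colour_brewer(palette = "Blues")

p
```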

Box plot with multiple groups + Dots + Counts

I have a boxplot with multiple groups in R.
When I add the dots to the boxplots, they are not centered within their boxes.
This happens because each week has a different number of boxplots.
The problem is in the geom_point part.
I uploaded my df.m data as a text file, along with a figure of what I get.
I am using ggplot, and here is my code:
setwd("/home/usuario")
dput("df.m")
df.m = read.table("df.m.txt")
df.m$variable <- as.factor(df.m$variable)
give.n = function(elita){
return(c(y = median(elita)*-0.1, label = length(elita)))
}
p = ggplot(data = df.m, aes(x=variable, y=value))
p = p + geom_boxplot(aes(fill = Label))
p = p + geom_point(aes(fill = Label), shape = 21,
position = position_jitterdodge(jitter.width = 0))
p = p + stat_summary(fun.data = give.n, geom = "text", fun.y = median)
p
Here is my data in a text file:
https://drive.google.com/file/d/1kpMx7Ao01bAol5eUC6BZUiulLBKV_rtH/view?usp=sharing
The dots are centered only for variable 12, because it has 3 groups (the maximum possible).
I would also like to show the number of observations. With the code shown, I only get the total count across all groups; I would like the count for EACH GROUP.
Thank you in advance
Here's a solution using boxplot and dotplot and an example dataset:
library(tidyverse)
# example data
dt <- data.frame(week = c(1,1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2,2),
value = c(6.40,6.75,6.11,6.33,5.50,5.40,5.83,4.57,5.80,
6.00,6.11,6.40,7.00,3,5.44,6.00,5,6.00),
donor_type = c("A","A","A","A","CB","CB","CB","CB","CB",
"CB","CB","CB","CB","CB","A","A","A","A"))
# create the plot
ggplot(dt, aes(x = factor(week), y = value, fill = donor_type)) +
geom_boxplot() +
geom_dotplot(binaxis='y', stackdir='center', position = position_dodge(0.75))
You should be able to adjust my code to your real dataset easily.
Edited answer with OP's dataset:
Using some generated data and geom_point():
library(tidyverse)
df.m <- df.m %>%
mutate(variable = as.factor(variable)) %>%
filter(!is.na(value))
ggplot(df.m, aes(x = variable, y = value, fill = Label)) +
geom_boxplot() +
geom_point(shape = 21, position = position_jitterdodge(jitter.width = 0)) +
scale_x_discrete("variable", drop = FALSE)
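For the per-group counts, here is a sketch on the example data from above. In the question, fill = Label is set only inside geom_boxplot(), so stat_summary() never sees the grouping and counts each week as a whole. Moving fill into the top-level aes() and dodging the text by the same width as the boxes should give one count per box. give_n is a made-up helper, and the 0.3 offset below the group minimum is arbitrary.

```r
library(ggplot2)

# Same example data as in the answer above
dt <- data.frame(
  week = rep(c(1, 2), each = 9),
  value = c(6.40, 6.75, 6.11, 6.33, 5.50, 5.40, 5.83, 4.57, 5.80,
            6.00, 6.11, 6.40, 7.00, 3, 5.44, 6.00, 5, 6.00),
  donor_type = c("A", "A", "A", "A", "CB", "CB", "CB", "CB", "CB",
                 "CB", "CB", "CB", "CB", "CB", "A", "A", "A", "A")
)

# Hypothetical helper: label position (just below the group minimum)
# and the number of observations in the group
give_n <- function(y) {
  data.frame(y = min(y) - 0.3, label = length(y))
}

p <- ggplot(dt, aes(x = factor(week), y = value, fill = donor_type)) +
  geom_boxplot() +
  # fill is inherited, so the summary is computed per week and donor_type
  stat_summary(fun.data = give_n, geom = "text",
               position = position_dodge(width = 0.75))

p
```

The dodge width of 0.75 matches the default used by geom_boxplot(), which is what lines the numbers up under their boxes.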

plot multiple lines in ggplot

I need to plot hourly data for different days using ggplot, and here is my dataset:
The data consists of hourly observations, and I want to plot each day's observation into one separate line.
Here is my code
xbj1 = bj[c(1:24),c(1,6)]
xbj2 = bj[c(25:48),c(1,6)]
xbj3 = bj[c(49:72),c(1,6)]
ggplot()+
geom_line(data = xbj1,aes(x = Date, y= Value), colour="blue") +
geom_line(data = xbj2,aes(x = Date, y= Value), colour = "grey") +
geom_line(data = xbj3,aes(x = Date, y= Value), colour = "green") +
xlab('Hour') +
ylab('PM2.5')
Please advise.
I'll make some fake data (I won't try to transcribe yours) first:
set.seed(2)
x <- data.frame(
Date = rep(Sys.Date() + 0:1, each = 24),
# Year, Month, Day ... are not used here
Hour = rep(0:23, times = 2),
Value = sample(1e2, size = 48, replace = TRUE)
)
Here are two straightforward ggplot2 plots: the first draws one coloured line per day, the second puts each day in its own facet:
library(ggplot2)
ggplot(x) +
geom_line(aes(Hour, Value, color = as.factor(Date))) +
scale_color_discrete(name = "Date")
ggplot(x) +
geom_line(aes(Hour, Value)) +
facet_grid(Date ~ .)
I highly recommend you find good tutorials for ggplot2, such as http://www.cookbook-r.com/Graphs/. Others exist, many quite good.
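If your real data has a single timestamp column and no separate date and hour columns, a sketch like the following derives both first, so there is no need to slice the data frame by row index. The column names timestamp, Day and Hour are assumptions for this example, as is the fake bj data; adapt them to whatever your data actually contains.

```r
library(ggplot2)

set.seed(42)
# Hypothetical stand-in for `bj`: 72 hourly readings over three days
bj <- data.frame(
  timestamp = seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
                  by = "hour", length.out = 72),
  Value = sample(100, size = 72, replace = TRUE)
)

# Derive the day and the hour of day from the timestamp
bj$Day <- as.Date(bj$timestamp, tz = "UTC")
bj$Hour <- as.integer(format(bj$timestamp, "%H"))

p <- ggplot(bj, aes(x = Hour, y = Value, colour = factor(Day), group = Day)) +
  geom_line() +
  labs(x = "Hour", y = "PM2.5", colour = "Day")

p
```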

ggplot with variable line types and colors

In R with ggplot, I want to create a spaghetti plot (2 quantitative variables) grouped by a third variable to specify line color. Secondly, I want to aggregate that grouping variable with the line type or width.
Here's an example using the airquality dataset. I want the line's color to represent the month, and the summer months to have a different line width from non-summer months.
First, I created an indicator variable for the aggregated groups:
airquality$Summer <- with(airquality, ifelse(Month >= 6 & Month < 9, 1, 0))
I would like something like this, but with differing line widths:
However, this fails:
library(ggplot2)
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month), group = Summer)) +
geom_point() +
geom_line(linetype = as.factor(Summer))
This also fails (specifying airquality$Summer):
ggplot(data = airquality, aes(x=Wind, y = Temp,
color = as.factor(Month), group = airquality$Summer)) +
geom_point() +
geom_line(linetype = as.factor(airquality$Summer))
I attempted this solution, but get another error:
lty <- setNames(c(0, 1), levels(airquality$Summer))
ggplot(data = airquality, aes(x=Wind, y = Temp,
color = as.factor(Month), group = airquality$Summer)) +
geom_point() +
geom_line(linetype = as.factor(airquality$Summer)) +
scale_linetype_manual(values = lty)
Any ideas?
EDIT:
My actual data show very clear trends, and I want to differentiate the top line from all the others below. My goal is to convince people they should make more than just the minimum payment on their student loans:
You just need to change the group to Month and put linetype in aes:
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month), group = Month)) +
geom_point() +
geom_line(aes(linetype = factor(Summer)))
If you want to specify the linetype you can use a few methods. Here is one way:
lineT <- c("solid", "dotdash")
names(lineT) <- c("1","0")
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month))) +
geom_point() +
geom_line(aes(linetype = factor(Summer))) +
scale_linetype_manual(values = lineT)
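Since the question also asks about line width, the same idea should carry over: map the indicator inside aes() and set the actual widths with a manual scale. This sketch assumes ggplot2 3.4.0 or later, where the aesthetic for lines is called linewidth; on older versions the equivalent would be size with scale_size_manual(). The labels "summer" and "other" and the two widths are made up for the example.

```r
library(ggplot2)

aq <- airquality
# Label June through August as summer, everything else as other
aq$Summer <- factor(ifelse(aq$Month >= 6 & aq$Month < 9, "summer", "other"))

p <- ggplot(aq, aes(x = Wind, y = Temp,
                    colour = factor(Month), group = Month)) +
  geom_point() +
  # Width is mapped to the indicator; the scale picks the actual values
  geom_line(aes(linewidth = Summer)) +
  scale_linewidth_manual(values = c(summer = 1.2, other = 0.4))

p
```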
