I have a plot built from several separate geom_point layers, and I would like to specify the shape and color of each layer individually.
I am struggling to get a proper legend, and I could not find a solution on Stack Overflow.
I tried mapping "fill" inside aes(), but as soon as more than two layers use fill, I get the error:
"Error: Aesthetics must be either length 1 or the same as the data
(1): x, y"
This is a simplified minimal example of the basic structure of my plot:
da <- as.character(c(1:10))
type <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a" )
value <- c(1:10)
df <- data.frame(da, type, value)
require("ggplot2")
ggplot() +
geom_point(data = subset(df, type %in% c("a")), aes(x=da, y=value), shape=1, color="red", size=5) +
geom_point(data = subset(df, type %in% c("b")), aes(x=da, y=value), shape=2, color="darkorange", size=3) +
geom_point(data = subset(df, type %in% c("c")), aes(x=da, y=value), shape=3, color="violet", size=3)
How can I add a legend with custom labels?
Thanks! :-)
Why would you create separate layers and build a legend by hand when you can create one layer and map aesthetics to your data (in this case, simply "type")? If you want specific colour, shape or size values, you can set them with scales such as scale_colour_manual, scale_shape_manual, etc.
da <- as.character(c(1:10))
type <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a" )
value <- c(1:10)
df <- data.frame(da, type, value)
require("ggplot2")
#> Loading required package: ggplot2
ggplot(df, aes(x=da, y=value, color=type, shape = type, size = type)) +
geom_point()
#> Warning: Using size for a discrete variable is not advised.
Created on 2020-01-20 by the reprex package (v0.3.0)
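To get the exact shapes, colours and sizes from the question together with custom legend labels, the manual scales can be sketched roughly as follows. The labels "Group A" to "Group C" and the legend title "Type" are placeholders I made up, not something from the question. Giving all three scales the same name and labels is what lets ggplot2 merge them into a single legend.

```r
library(ggplot2)

df <- data.frame(
  da    = as.character(1:10),
  type  = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a"),
  value = 1:10
)

# Hypothetical legend labels (assumption: replace with your own wording)
type_labels <- c(a = "Group A", b = "Group B", c = "Group C")

p <- ggplot(df, aes(x = da, y = value, colour = type, shape = type, size = type)) +
  geom_point() +
  # Same name and labels on every scale, so the legends are merged into one
  scale_colour_manual(name = "Type", labels = type_labels,
                      values = c(a = "red", b = "darkorange", c = "violet")) +
  scale_shape_manual(name = "Type", labels = type_labels,
                     values = c(a = 1, b = 2, c = 3)) +
  scale_size_manual(name = "Type", labels = type_labels,
                    values = c(a = 5, b = 3, c = 3))

print(p)
```

Because size is now set through scale_size_manual, the "Using size for a discrete variable" warning should no longer appear.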
Why are the pies flat?
df<- data.frame(
Day=(1:6),
Var1=c(172,186,191,201,205,208),
Var2= c(109,483,64010,161992,801775,2505264), A=c(10,2,3,4.5,16.5,39.6), B=c(10,3,0,1.4,4.8,11.9), C=c(2,5,2,0.1,0.5,1.2), D=c(0,0,0,0,0.1,0.2))
ggplot() +
geom_scatterpie(data = df, aes(x = Var1 , y = Var2, group = Var1), cols = c("A", "B", "C", "D"))
I have tried using coord_fixed(), but that does not work either.
The problem seems to be the very different scales of the x- and y-axes. If you rescale both variables to have zero mean and unit variance, the plot works. So, one thing you could do is plot the rescaled values, but transform the axis labels back into the original scale. To do this, you would have to do the following:
Make the data:
df<- data.frame(
Day=(1:6),
Var1=c(172,186,191,201,205,208),
Var2= c(109,483,64010,161992,801775,2505264), A=c(10,2,3,4.5,16.5,39.6), B=c(10,3,0,1.4,4.8,11.9), C=c(2,5,2,0.1,0.5,1.2), D=c(0,0,0,0,0.1,0.2))
Rescale the variables:
library(dplyr)
df <- df %>%
mutate(x = c(scale(Var1)),
y = c(scale(Var2)))
Find the linear map that transforms the rescaled values back into their original values. Then, you can use the coefficients from the model to make a function that will transform the rescaled values back into the original ones.
m1 <- lm(Var1 ~ x, data=df)
m2 <- lm(Var2 ~ y, data=df)
trans_x <- function(x)round(coef(m1)[1] + coef(m1)[2]*x)
trans_y <- function(x)round(coef(m2)[1] + coef(m2)[2]*x)
Make the plot, passing the transformation functions as the labels argument of the scale_[xy]_continuous() functions:
library(ggplot2)
library(scatterpie)
ggplot() +
geom_scatterpie(data=df, aes(x = x, y=y), cols = c("A", "B", "C", "D")) +
scale_x_continuous(labels = trans_x) +
scale_y_continuous(labels = trans_y) +
coord_fixed()
There may be an easier way than this, but it wasn't apparent to me.
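One possibly simpler route, sketched here under the assumption that the rescaling was done with base R's scale(): scale() stores the centre and the standard deviation it used as attributes, so the back-transformation can be read off directly instead of being fitted with lm(). The names z, centre, spread and back_y are my own.

```r
Var2 <- c(109, 483, 64010, 161992, 801775, 2505264)

# scale() returns a matrix and records what it subtracted and divided by
z      <- scale(Var2)
centre <- attr(z, "scaled:center")  # the mean of Var2
spread <- attr(z, "scaled:scale")   # the standard deviation of Var2

# Inverse of the rescaling: original = rescaled * sd + mean
back_y <- function(v) round(v * spread + centre)

# The lm() approach recovers the same two numbers as intercept and slope
y  <- c(z)
m2 <- lm(Var2 ~ y)
all.equal(unname(coef(m2)), c(centre, spread))
```

A function like back_y could then be passed as labels in scale_y_continuous(), in the same way as trans_y above.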
The range on the y-axis is so large that it compresses the disks into lines. Change the y-axis to a log scale and you can see the shapes; add coord_fixed() to keep the pies circular:
ggplot() +
geom_scatterpie(data = df, aes(x = Var1 , y = Var2, group = Var1), cols = c("A", "B", "C", "D")) +
scale_y_log10() +
coord_fixed()
I would like to create multiple histograms (ggplot) using a for loop. The problem is that the x-axis label of every plot stays the same, namely "value". Do you know how to change the x-axis label on each iteration of the loop?
My dataframe for example:
df <- data.frame(variable = c("A", "A", "B", "B", "C", "C"), value = c(1, 2, 4, 5, 2, 3))
So I would like to get three plots with the x-axis labels "A", "B" and "C".
My code:
for (i in unique(df$variable)){
d <- subset(df, df$variable == i)
print(ggplot(d, aes(x = value)) + geom_histogram())
}
You can use purrr::imap() to give each plot its own x-axis label after splitting the data by variable.
library(ggplot2)
library(dplyr) # provides the %>% pipe
list_plot <- df %>%
split(.$variable) %>%
purrr::imap(~ggplot(.x, aes(x = value)) +
geom_histogram() + xlab(.y))
Also, have you considered using facets? The x-axis is then shared, and A, B and C appear as the facet titles.
ggplot(df, aes(x = value)) + geom_histogram() + facet_wrap(~variable)
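If you would rather keep the original for loop, a minimal sketch is to add xlab(i) inside the loop, so that each plot is labelled with the current value of variable. Storing the plots in a list called plots and setting bins = 5 are my own additions (the latter only to silence the default-bins message).

```r
library(ggplot2)

df <- data.frame(variable = c("A", "A", "B", "B", "C", "C"),
                 value    = c(1, 2, 4, 5, 2, 3))

plots <- list()
for (i in unique(df$variable)) {
  d <- df[df$variable == i, ]
  # xlab(i) replaces the default axis label "value" with "A", "B" or "C"
  plots[[i]] <- ggplot(d, aes(x = value)) +
    geom_histogram(bins = 5) +
    xlab(i)
  print(plots[[i]])
}
```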
I'm new to R and I'm trying to create a single plot with data from 2 melted dataframes.
Ideally I would have a legend for each of the dataframes with their respective titles; however, I get only a single legend with the title of the first aesthetic.
My starting point is:
aerobic_melt <- melt(aerobic, id.vars = 'Distance', variable.name = 'Aerobic')
anaerobic_melt <- melt(anaerobic, id.vars = 'Distance', variable.name = 'Anaerobic')
plot <- ggplot() +
geom_line(data = aerobic_melt, aes(Distance, value, col=Aerobic)) +
geom_line(data = anaerobic_melt, aes(Distance, value, col= Anaerobic)) +
xlim(0, 125) +
ylab('Energy (J/kg )') +
xlab('Distance (m)')
Which results in
I've searched, but with my limited ability I haven't been able to find a way to do it.
My question is:
How do I create separate legends with titles 'Aerobic' and 'Anaerobic' which should respectively refer to A,B,C,F,G,L and E,H,I,J,K?
Any help is appreciated.
Obviously we don't have your data, but I have created some sample data that should have the same names and structure as your own data frames, since it works with your own plot code. See the end of the answer for the data used here.
You can use the package ggnewscale if you want two color scales on the same plot. Just add in a new_scale_color() call between your geom_line calls. I have left the rest of your code as-is.
library(ggplot2)
library(ggnewscale)
plot <- ggplot() +
geom_line(data = aerobic_melt, aes(Distance, value, col=Aerobic)) +
new_scale_color() +
geom_line(data = anaerobic_melt, aes(Distance, value, col= Anaerobic)) +
xlim(0, 125) +
ylab('Energy (J/kg )') +
xlab('Distance (m)')
plot
Data
set.seed(1)
aerobic_melt <- data.frame(
Aerobic = rep(c("A", "B", "C", "F", "G", "L"), each = 120),
value = as.numeric(replicate(6, cumsum(rnorm(120)))),
Distance = rep(1:120, 6))
anaerobic_melt <- data.frame(
Anaerobic = rep(c("E", "H", "I", "J", "K"), each = 120),
value = as.numeric(replicate(5, cumsum(rnorm(120)))),
Distance = rep(1:120, 5))
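If you also want each legend to have its own palette or an explicit title, a colour scale can be added for each layer: one before new_scale_color() for the first layer and one after the second layer. The sketch below uses smaller made-up data of the same shape as above; the Brewer palettes "Dark2" and "Set1" are arbitrary choices of mine.

```r
library(ggplot2)
library(ggnewscale)

set.seed(1)
# Made-up data with the same structure as the sample data above
aerobic_melt <- data.frame(
  Aerobic  = rep(c("A", "B", "C"), each = 20),
  value    = as.numeric(replicate(3, cumsum(rnorm(20)))),
  Distance = rep(1:20, 3))
anaerobic_melt <- data.frame(
  Anaerobic = rep(c("E", "H"), each = 20),
  value     = as.numeric(replicate(2, cumsum(rnorm(20)))),
  Distance  = rep(1:20, 2))

p <- ggplot() +
  geom_line(data = aerobic_melt, aes(Distance, value, col = Aerobic)) +
  # This scale belongs to the first layer because it comes before new_scale_color()
  scale_colour_brewer(name = "Aerobic", palette = "Dark2") +
  new_scale_color() +
  geom_line(data = anaerobic_melt, aes(Distance, value, col = Anaerobic)) +
  # This scale belongs to the second layer
  scale_colour_brewer(name = "Anaerobic", palette = "Set1") +
  ylab("Energy (J/kg)") +
  xlab("Distance (m)")

print(p)
```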
I've no idea where to even start with this. I've looked at ggplot2, plotly, etc. to try to find the right thing, but haven't come across anything.
This is an example of my data, though:
Skill <- c("Tackling", "Shooting", "Technique", "Passing", "Pace", "Stamina")
Grade <- c("A", "C", "C", "B", "A", "B")
data <- data.frame(Skill, Grade)
This is the sort of graph I'd like
I'm a football scout and it would be fantastic to be able to have a graph like that to compare the players we have to the player I'm scouting.
So if the grade is D, it would just show red; if the grade is C, it would show red and orange, etc.
This is quite close to what you want:
Skill <- c("Tackling", "Shooting", "Technique", "Passing", "Pace", "Stamina")
Grade <- c("A", "C", "C", "B", "A", "B")
data <- data.frame(Skill, Grade)
library(ggplot2)
library(dplyr)
data$grade <- factor(data$Grade, levels=c("D","C","B","A"))
data$grade2 <- recode(data$grade, A="B")
data$grade3 <- recode(data$grade2, B="C")
data$grade4 <- recode(data$grade3, C="D")
ggplot(data, aes(x=Skill, y=grade)) +
geom_bar(stat="identity", fill="green",col="black",width=1) +
geom_bar(aes(y=grade2),stat="identity", fill="yellow",col="black",width=1) +
geom_bar(aes(y=grade3),stat="identity", fill="orange",col="black",width=1) +
geom_bar(aes(y=grade4),stat="identity", fill="red",col="black",width=1) +
scale_y_discrete(limits = c("D","C","B","A")) +
coord_polar(start = pi/6) + theme_bw() + theme(axis.text.y = element_blank()) +
theme(axis.ticks = element_blank(), axis.title = element_blank())
How about this?
library(ggplot2)
ggplot(data = data, aes(Skill, Grade, fill = Grade)) +
geom_tile() +
coord_polar() +
theme_bw()
To show every level below the coded grade, you need rows for all of those lower levels in the data frame, which is redundant in a way, isn't it?
# gr is the grade as a number: D = 1, C = 2, B = 3, A = 4
d = transform(data, gr = as.numeric(factor(data$Grade, c("D", "C", "B", "A"))))
# For each skill, create one row per level from 1 up to gr
d = do.call(rbind, lapply(split(d, d$Skill), function(x){
with(x, setNames(data.frame(Skill[1], Grade[1], seq_len(gr)), names(x)))
}))
library(ggplot2)
ggplot(d, aes(Skill, gr, fill = factor(gr, 4:1))) +
geom_col() +
coord_polar()
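The expansion step above can also be written without split() and lapply(). This is only a sketch of the same idea: rep() repeats each skill once per level, and sequence() generates 1..n for each skill. I have used geom_tile() rather than geom_col() so that every ring has the same height, and picked the colours from the question (red, orange, yellow, green); the names grade_order, n_levels and rings are my own.

```r
library(ggplot2)

data <- data.frame(
  Skill = c("Tackling", "Shooting", "Technique", "Passing", "Pace", "Stamina"),
  Grade = c("A", "C", "C", "B", "A", "B")
)

grade_order <- c("D", "C", "B", "A")
# Number of rings per skill: D = 1, C = 2, B = 3, A = 4
n_levels <- match(data$Grade, grade_order)

# One row per skill and per level up to the skill's grade
rings <- data.frame(
  Skill = rep(data$Skill, times = n_levels),
  level = factor(grade_order[sequence(n_levels)], levels = grade_order)
)

p <- ggplot(rings, aes(x = Skill, y = level, fill = level)) +
  geom_tile(colour = "black") +
  scale_y_discrete(limits = grade_order) +
  scale_fill_manual(values = c(D = "red", C = "orange", B = "yellow", A = "green")) +
  coord_polar() +
  theme_bw()

print(p)
```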
Suppose my data is two columns, one is "Condition", one is "Stars"
food <- data.frame(Condition = c("A", "B", "A", "B", "A"), Stars=c('good','meh','meh','meh','good'))
How can I make a barplot of the frequency of "Stars", grouped by "Condition"?
I read here but would like to expand that answer to include groups.
For now I have
q <- ggplot(food, aes(x=Stars))
q + geom_bar(aes(y=..count../sum(..count..)))
but that is the proportion of the full data set.
How can I make a plot with four bars, grouped by 'Condition'?
E.g. 'Condition A' would have 'good' as 0.66 and 'meh' as 0.33.
I guess this is what you are looking for:
food <- data.frame(Condition = c("A", "B", "A", "B", "A"), Stars=c('good','meh','meh','meh','good'))
library(ggplot2)
library(dplyr)
data <- food %>% group_by(Condition, Stars) %>% summarize(n=n()) %>% mutate(freq=n/sum(n))
ggplot(data, aes(x=Stars, fill = Condition, group = Condition)) + geom_bar(aes(y=freq), stat="identity", position = "dodge")
First I calculated the frequencies with the dplyr package. Grouping by Condition first matters: after summarize() the data is still grouped by Condition, so freq is the proportion within each condition (0.66 "good" and 0.33 "meh" for A). This freq column is used as the y aesthetic in geom_bar(). Then I used fill=Condition in ggplot(), which splits the bars by Condition. Additionally, I set position="dodge" to place the bars next to each other, and stat="identity" because the frequencies are already calculated.
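As a cross-check on the numbers, the per-condition proportions can also be computed in base R with table() and prop.table(). This is just a sketch, and the names counts, props and freq_df are mine. margin = 1 normalises each row, i.e. each Condition, to sum to 1.

```r
food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
                   Stars     = c("good", "meh", "meh", "meh", "good"))

# Rows are conditions, columns are star ratings
counts <- table(Condition = food$Condition, Stars = food$Stars)

# margin = 1: divide each row by its row total
props <- prop.table(counts, margin = 1)
props

# Long format with one row per Condition/Stars pair, including the zero for B/good
freq_df <- as.data.frame(props, responseName = "freq")
freq_df
```

Since freq_df keeps the empty B/good combination with freq 0, plotting it with geom_col(position = "dodge") reserves a slot for all four Condition/Stars combinations.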
I have used the computed value ..prop.., the group aesthetic and facet_wrap(). With the group aesthetic set, the proportions are computed within each group, and facet_wrap() plots each condition in its own panel.
require(ggplot2)
food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
Stars=c('good','meh','meh','meh','good'))
ggplot(food) +
geom_bar(aes(x = Stars, y = ..prop.., group = Condition)) +
facet_wrap(~ Condition)
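In ggplot2 3.4.0 and later the ..prop.. notation is deprecated in favour of after_stat(). A sketch of the same plot in the newer notation, assuming a ggplot2 version that provides after_stat() (3.3.0 or later):

```r
library(ggplot2)

food <- data.frame(Condition = c("A", "B", "A", "B", "A"),
                   Stars     = c("good", "meh", "meh", "meh", "good"))

p <- ggplot(food) +
  # after_stat(prop) is the proportion computed by stat_count() within each group
  geom_bar(aes(x = Stars, y = after_stat(prop), group = Condition)) +
  facet_wrap(~ Condition)

print(p)
```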