I have a data frame that looks like so:
dat<-structure(list(x = 1:14, y = c(1.26071476002898, 1.97600316441492,
2.41629009067185, 3.48953782319898, 10, 8.49584395945854, 3.80688560348562,
3.07092373734549, 2.96665740569527, 2.73020216450355, 2.39926441554745,
2.4236111796397, 2.63338121290737, 2.13662243060685)), .Names = c("x",
"y"), row.names = c(NA, -14L), class = "data.frame")
x y
1 1.260715
2 1.976003
3 2.416290
4 3.489538
5 10.000000
6 8.495844
7 3.806886
8 3.070924
9 2.966657
10 2.730202
11 2.399264
12 2.423611
13 2.633381
14 2.136622
I'm trying to create a circular plot in ggplot2 where the circle is divided into the 14 data points I have and the length of each of the arcs corresponds to the value of y. Something like this:
My code produces a very weird output with the bars overlapping one another. I have searched everywhere to fix it but no success. Here is my code:
ggplot(dat, aes(x, y)) + geom_bar(breaks = seq(1,14), width = 2, colour = "grey", stat="identity") + coord_polar(start = 0) + scale_x_continuous("", limits = c(1, 14), breaks = seq(1, 14), labels = seq(1, 14))
Please help me… thanks in advance...
It's because you specified width = 2. The bars are centred one unit apart, so a width of 2 makes each bar overlap its neighbours.
Note also that you don't need the breaks argument in geom_bar.
Try this:
library(ggplot2)
ggplot(dat, aes(x, y)) +
geom_bar(stat="identity") +
coord_polar(start = 0) +
scale_x_continuous("",breaks = seq(1, 14))
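To see why width = 2 overlaps: a bar centred at x with width w covers x - w/2 to x + w/2, and the x values here are one unit apart. A minimal sketch of that check follows. The helpers bar_span() and bars_overlap() are names made up for this illustration, and the y values are rounded from the question's data.

```r
library(ggplot2)

# y values rounded from the question's data
dat <- data.frame(
  x = 1:14,
  y = c(1.26, 1.98, 2.42, 3.49, 10, 8.50, 3.81,
        3.07, 2.97, 2.73, 2.40, 2.42, 2.63, 2.14)
)

# Hypothetical helper: left and right edge of each bar for a given width
bar_span <- function(x, w) data.frame(left = x - w / 2, right = x + w / 2)

# Hypothetical helper: TRUE if any bar extends past the start of the next one
bars_overlap <- function(x, w) {
  s <- bar_span(sort(x), w)
  any(head(s$right, -1) > tail(s$left, -1))
}

bars_overlap(dat$x, 2)  # width 2: neighbouring bars overlap
bars_overlap(dat$x, 1)  # width 1: bars touch but do not overlap

# width = 1 gives wedges that fill the circle without gaps or overlap
p <- ggplot(dat, aes(x, y)) +
  geom_col(width = 1, colour = "grey") +
  coord_polar(start = 0) +
  scale_x_continuous("", breaks = 1:14)
p
```

Any width up to 1 avoids the overlap; smaller values simply leave gaps between the wedges.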
I have a time series data frame in R.
There are 5 groups, and the data for each group (Variable) is acquired at 3 different timepoints.
Group Timepoint Variable
A 1 1.4705745
B 1 4.6090900
C 1 2.2480962
D 1 1.6443650
E 1 4.4812444
A 2 0.8026552
B 2 4.7803944
C 2 1.3743527
D 2 4.0399467
E 2 3.5651057
A 3 4.7275369
B 3 2.4491532
C 3 3.9508347
D 3 3.4278974
E 3 0.6917490
I made a line plot using the following code,
plot_data <- ggplot(data, aes(x = Timepoint, y = Variable, color = Group, group = Group))+geom_line()+
scale_color_discrete("Group")+
scale_y_continuous(limits = c(0, 6))
plot_data
but I also want to add significance asterisks, for instance like this:
Is there any way to add asterisks to the plot manually?
You can use annotate like this:
library(ggplot2)
plot_data <- ggplot(data, aes(x = Timepoint, y = Variable, color = Group, group = Group))+geom_line()+
scale_color_discrete("Group")+
scale_y_continuous(limits = c(0, 6)) +
annotate('text', x = 1, y = c(5,5.2), label='"*"', parse=TRUE, color = c("pink", "yellow")) +
annotate('text', x = 3, y = c(5,5.2), label='"*"', parse=TRUE, color = c("red", "green"))
plot_data
Created on 2022-09-02 with reprex v2.0.2
To get exactly the same colors as the default ggplot2 palette, you can look them up with hue_pal from the scales package like this:
library(ggplot2)
library(scales)
show_col(hue_pal()(5))
plot_data <- ggplot(data, aes(x = Timepoint, y = Variable, color = Group, group = Group))+geom_line()+
scale_color_discrete("Group")+
scale_y_continuous(limits = c(0, 6)) +
annotate('text', x = 1, y = c(5,5.2), label='"*"', parse=TRUE, color = c("#A3A500", "#E76BF3")) +
annotate('text', x = 3, y = c(5,5.2), label='"*"', parse=TRUE, color = c("#F8766D", "#00BF7D"))
plot_data
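As an alternative sketch, the asterisks can come from a small data frame drawn with geom_text, so they pick up the group colours from the same colour scale as the lines and no hex codes need to be copied by hand. The stars table below is hypothetical (which groups and timepoints are significant is an assumption chosen to match the colours used above); the data values are copied from the question.

```r
library(ggplot2)
library(scales)

# Data copied from the question
data <- data.frame(
  Group     = rep(c("A", "B", "C", "D", "E"), times = 3),
  Timepoint = rep(1:3, each = 5),
  Variable  = c(1.4705745, 4.6090900, 2.2480962, 1.6443650, 4.4812444,
                0.8026552, 4.7803944, 1.3743527, 4.0399467, 3.5651057,
                4.7275369, 2.4491532, 3.9508347, 3.4278974, 0.6917490)
)

# Default discrete colours, named by group so they can be looked up if needed
group_cols <- setNames(hue_pal()(5), sort(unique(data$Group)))

# Hypothetical significance table: one row per asterisk
stars <- data.frame(
  Timepoint = c(1, 1, 3, 3),
  Group     = c("B", "E", "A", "C"),
  Variable  = c(5, 5.2, 5, 5.2)
)

# The asterisk layer inherits x, y and colour, so it shares the colour scale
plot_data <- ggplot(data, aes(Timepoint, Variable, colour = Group, group = Group)) +
  geom_line() +
  geom_text(data = stars, label = "*", size = 8, show.legend = FALSE) +
  scale_y_continuous(limits = c(0, 6))
plot_data
```

Adding or moving an asterisk then only means editing a row of stars.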
I'm trying to use ggplot2 to plot a horizontal number line with 0 at the center to compare different items along that axis.
Example
Say that we're interested in the effect of diets given to different mice. Each mouse is fed with a different type of food, and after a month we compare the mice to each other in terms of weight. For each mouse we want to know whether there was a weight gain or weight loss, and by how much.
library(dplyr)
df <-
data.frame(mouse = c("Mickey", "Jerry", "Gonzales", "Remi"),
weight_pre = c(10.1, 6.2, 9.5, 13.3),
weight_post = c(9.2, 12.4, 2.3, 10))
df_with_diff <-
df %>%
mutate(diff = weight_post - weight_pre)
df_with_diff
#> mouse weight_pre weight_post diff
#> 1 Mickey 10.1 9.2 -0.9
#> 2 Jerry 6.2 12.4 6.2
#> 3 Gonzales 9.5 2.3 -7.2
#> 4 Remi 13.3 10.0 -3.3
Created on 2021-05-25 by the reprex package (v2.0.0)
Desired Output
I'm trying to achieve a simple horizontal number line like the one below, with 0 at the center:
(clearly this is not drawn to scale, but just to demonstrate my intention)
Any idea how to do this (or something close enough) with ggplot?
I would use pointrange and theme_minimal. For example:
library(ggplot2)
library(dplyr)
library(ggrepel)
df_with_diff %>%
mutate(min = ifelse(diff > 0, 0, diff),
max = ifelse(diff > 0, diff, 0)) %>%
ggplot(aes(x = diff, y = mouse, col = mouse, label = diff)) +
geom_pointrange(aes(xmin = min, xmax = max)) +
geom_text_repel() +
theme_minimal() +
theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
legend.position = "none")
To give an overview of which tools you can use and how they work, have a look at this code:
#Draw horizontal lines and set up plot
maxAbsDiff <- max(abs(df_with_diff$diff))
xLimits <- c(-2*maxAbsDiff,2*maxAbsDiff)
p <- ggplot() + geom_hline(yintercept = 1:NROW(df_with_diff)) + geom_point(mapping = aes(x=0,y=1:NROW(df_with_diff))) +
scale_x_continuous(limits = xLimits, expand=c(0,0)) + labs(x=NULL,y=NULL,title = NULL) +
scale_y_continuous(breaks = 1:NROW(df_with_diff), labels = df_with_diff$mouse)
#Draw the lines decoding the diff
lineData <- data.frame(xStart=rep(0,NROW(df_with_diff)), xEnd = df_with_diff$diff,
yStart = 1:NROW(df_with_diff)+0.2, yEnd =1:NROW(df_with_diff)+0.2,
col = paste0("something_",1:NROW(df_with_diff)))
p <- p + geom_segment(data = lineData, mapping = aes(x=xStart,xend = xEnd, y = yStart, yend = yEnd, col = col))
p <- p + theme(legend.position = "none")
#Annotate with the diff-values
annotationData <- data.frame(xpos = df_with_diff$diff/2, ypos = 1:NROW(df_with_diff)+0.5, lab = as.character(df_with_diff$diff))
p <- p + annotate(geom = "text", x = annotationData$xpos, y = annotationData$ypos, label = annotationData$lab)
producing
It's not styled the way you want, but the documentation of the functions used should get you the rest of the way. To get rid of the background, have a look at the documentation of the theme function. ggplot2 also offers theme_void(), a completely empty theme, as a shortcut for removing the grid lines instead of setting every theme parameter yourself.
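A rough sketch of the single number line with theme_void(), assuming all mice share one axis and that symmetric limits are enough to keep 0 in the centre. The label placement (angle, nudge) is an arbitrary choice for illustration, not something from the question.

```r
library(ggplot2)

# diff values from the question
df_with_diff <- data.frame(
  mouse = c("Mickey", "Jerry", "Gonzales", "Remi"),
  diff  = c(-0.9, 6.2, -7.2, -3.3)
)

# Symmetric limits keep 0 in the centre of the line
lim <- max(abs(df_with_diff$diff))

p <- ggplot(df_with_diff, aes(x = diff, y = 0)) +
  geom_hline(yintercept = 0) +                       # the number line itself
  annotate("text", x = 0, y = -0.15, label = "0") +  # mark the centre
  geom_point(aes(colour = mouse), size = 3) +
  geom_text(aes(label = paste(mouse, diff)),
            angle = 45, hjust = 0, nudge_y = 0.1) +
  scale_x_continuous(limits = c(-lim, lim) * 1.1) +
  scale_y_continuous(limits = c(-1, 1)) +
  theme_void() +                                     # no axes, grid or background
  theme(legend.position = "none")
p
```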
I think I'm missing something very easy here, but I just can't figure it out at the moment:
I would like to consistently assign colors to certain values from a column across multiple plots.
So I have this tibble (sl):
# A tibble: 15 x 3
class hex x
<chr> <chr> <int>
1 translational slide #c23b22 1
2 rotational slide #AFC6CE 2
3 fast flow-type #b7bf5e 3
4 complex #A6CEE3 4
5 area subject to rockfall/topple #1F78B4 5
6 fall-type #B2DF8A 6
7 n.d. #33A02C 7
8 NA #FB9A99 8
9 area subject to shallow-slides #E31A1C 9
10 slow flow-type #FDBF6F 10
11 topple #FF7F00 11
12 deep-seated movement #CAB2D6 12
13 subsidence #6A3D9A 13
14 areas subject to subsidence #FFFF99 14
15 area of expansion #B15928 15
This should recreate it:
structure(list(class = c("translational slide", "rotational slide",
"fast flow-type", "complex", "area subject to rockfall/topple",
"fall-type", "n.d.", NA, "area subject to shallow-slides", "slow flow-type",
"topple", "deep-seated movement", "subsidence", "areas subject to subsidence",
"area of expansion"), hex = c("#c23b22", "#AFC6CE", "#b7bf5e",
"#A6CEE3", "#1F78B4", "#B2DF8A", "#33A02C", "#FB9A99", "#E31A1C",
"#FDBF6F", "#FF7F00", "#CAB2D6", "#6A3D9A", "#FFFF99", "#B15928"
), x = 1:15), row.names = c(NA, -15L), class = c("tbl_df", "tbl",
"data.frame"))
Now I would like to plot each class with a bar in the color of its hex code (for now just for visualization purposes). So I did:
ggplot(sl) +
geom_col(aes(x = x,
y = 1,
fill = class)) +
scale_fill_manual(values = sl$hex) +
geom_text(aes(x = x,
y = 0.5,
label = class),
angle = 90)
But these are not the colors as they are in the tibble.
So I tried to follow this guide: How to assign colors to categorical variables in ggplot2 that have stable mapping? and created this:
# create the color palette
mycols = sl$hex
names(mycols) = sl$class
and then plotted it with
ggplot(sl) +
geom_col(aes(x = x,
y = 1,
fill = class)) +
scale_fill_manual(values = mycols) +
geom_text(aes(x = x,
y = 0.5,
label = class),
angle = 90)
But the result is the same. It's this:
For example, the translational slide has the hex code "#c23b22" and should be a pastel darkish red.
Does anyone have an idea what I'm missing here?
Consider this:
sl <- structure(list(class = c("translational slide", "rotational slide",
"fast flow-type", "complex", "area subject to rockfall/topple",
"fall-type", "n.d.", NA, "area subject to shallow-slides", "slow flow-type",
"topple", "deep-seated movement", "subsidence", "areas subject to subsidence",
"area of expansion"), hex = c("#c23b22", "#AFC6CE", "#b7bf5e",
"#A6CEE3", "#1F78B4", "#B2DF8A", "#33A02C", "#FB9A99", "#E31A1C",
"#FDBF6F", "#FF7F00", "#CAB2D6", "#6A3D9A", "#FFFF99", "#B15928"
), x = 1:15), row.names = c(NA, -15L), class = c("tbl_df", "tbl",
"data.frame"))
sl$class <- factor( sl$class, levels=unique(sl$class) )
cl <- sl$hex
names(cl) <- paste( sl$class )
ggplot(sl) +
geom_col(aes(x = x,
y = 1,
fill = class)) +
scale_fill_manual( values = cl, na.value = cl["NA"] ) +
geom_text(aes(x = x,
y = 0.5,
label = class),
angle = 90)
By converting class to a factor with explicit levels, using a named vector for values in scale_fill_manual, and setting na.value properly, you should get something that looks more as expected.
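For the "consistent across multiple plots" part, one possible sketch is to build the named palette once and reuse a single scale object. It assumes the NA class gets its colour through na.value rather than through a name; sl_small, pal, na_col and fill_scale are illustrative names, and only three rows of the tibble are used to keep it short.

```r
library(ggplot2)

# Three rows taken from the question's tibble, including the NA class
sl_small <- data.frame(
  class = c("translational slide", "rotational slide", NA),
  hex   = c("#c23b22", "#AFC6CE", "#FB9A99"),
  x     = 1:3
)

# Named palette for the real classes; the NA class is handled via na.value
is_na  <- is.na(sl_small$class)
pal    <- setNames(sl_small$hex[!is_na], sl_small$class[!is_na])
na_col <- sl_small$hex[is_na]

# One scale object, reused so every plot maps classes to the same colours
fill_scale <- scale_fill_manual(values = pal, na.value = na_col)

# Full data and a subset: "rotational slide" has the same colour in both
p_all  <- ggplot(sl_small, aes(x, 1, fill = class)) + geom_col() + fill_scale
p_some <- ggplot(sl_small[2:3, ], aes(x, 1, fill = class)) + geom_col() + fill_scale
```

Because the values are matched by name, the colours do not depend on which classes happen to be present in a given plot.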
You need to supply the colors in the same order as the levels of the fill variable. Since there is already a column called 'x' in row order, I mapped fill to factor(x). I also replaced NA with the character string 'NA'. I have checked a few of them; please let me know if this is not the desired output. Thanks
#Assuming df is your dataframe:
df[is.na(df$class), 'class'] <- 'NA'
ggplot(df) +
geom_col(aes(x = x,
y = 1,
fill = factor(x))) +
scale_fill_manual(values = df$hex, labels=df$class) +
geom_text(aes(x = x,
y = 0.5,
label = class),
angle = 90)
Output:
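I think the problem is that scale_fill_manual assigns unnamed values to the fill levels in sorted order, so the order of its values and labels arguments has to match those levels. This isn't the case with your dataset.
Does
sl %>% ggplot() +
geom_col(aes(x = x,
y = 1,
fill = hex)) +
geom_text(aes(x = x,
y = 0.5,
label = class),
angle = 90) +
# name the values and set breaks so each hex code maps to itself, in row order
scale_fill_manual(values = setNames(sl$hex, sl$hex), breaks = sl$hex, labels = sl$class)
give you what you want?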
Next time, please dput() your test data: it took me as long to create the test dataset as to answer your question. Also, using hex codes for colours makes it difficult to check that the colours are as expected. For a MWE, blue/green/black etc. would have been more helpful.
I'm not sure what the correct name for this type of plot would be, but let's say we have a list of names (or letters here): data <- data.frame(letters[1:10])
Let's also say that we want to illustrate which of these names are connected based on some empirical decision, so we have a list of observations we want to connect in a plot like the following (done in PowerPoint):
Can this be done in ggplot?
Yes, it can be done in ggplot.
Let's start by setting up a data frame of letters, with associated positions on the x and y axis of a plot. We'll make the x values 1 and 2 (though this is arbitrary), and the y values 1:10 (also arbitrary, as long as they are evenly spaced)
labels <- data.frame(x = c(rep(1, 10), rep(2, 10)),
y = rep(1:10, 2),
labs = rep(LETTERS[10:1], 2),
stringsAsFactors = FALSE)
Now we also need some way of deciding which letters will be joined. Let's do this by having a simple data frame of "left" and "right" values, where each row describes which two letters will be joined:
set.seed(69)
joins <- data.frame(left = sample(LETTERS[1:10], 6, TRUE),
right = sample(LETTERS[1:10], 6, TRUE),
stringsAsFactors = FALSE)
joins
#> left right
#> 1 A G
#> 2 B B
#> 3 H J
#> 4 G D
#> 5 G J
#> 6 F B
Now we can assign start and end x and y co-ordinates for the lines by matching the letters in these two columns to the columns in our labels data frame:
joins$x <- rep(1.05, nrow(joins))
joins$xend <- rep(1.9, nrow(joins))
joins$y <- labels$y[match(joins$left, labels$labs)]
joins$yend <- labels$y[match(joins$right, labels$labs)]
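The lookup relies on match(), which returns the position of the first occurrence of each value. A tiny sketch with two hand-picked joins (taken from the first and third rows of the table above) shows what ends up in y and yend; only one column of labels is needed here because match() always finds the first occurrence.

```r
# Letters J..A get y positions 1..10, as in the labels data frame above
labels <- data.frame(y = 1:10,
                     labs = LETTERS[10:1],
                     stringsAsFactors = FALSE)

# Two example joins: A -> G and H -> J
joins <- data.frame(left = c("A", "H"),
                    right = c("G", "J"),
                    stringsAsFactors = FALSE)

# match() gives the row of each letter in labels; index y with it
joins$y    <- labels$y[match(joins$left, labels$labs)]
joins$yend <- labels$y[match(joins$right, labels$labs)]
joins
```

"A" is the last entry of LETTERS[10:1], so it gets y = 10, while "J" is the first and gets y = 1.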
This just leaves the plot. We want to get rid of all the axes, titles and legends so we use theme_void:
library(ggplot2)
ggplot(labels, aes(x, y)) +
geom_text(aes(label = labs), size = 8) +
geom_segment(data = joins, aes(xend = xend, yend = yend, color = left),
arrow = arrow(type = "closed", length = unit(0.02, "npc"))) +
coord_cartesian(xlim = c(0.5, 2.5)) +
theme_void() +
theme(legend.position = "none")
Created on 2020-07-10 by the reprex package (v0.3.0)
This solution could be tidied up, but gives a start using geom_segment
library(tidyverse)
tibble(x0 = 0, x1 = 1, y0 = sample(letters[1:10]), y1 = sample(letters[1:10])) %>%
mutate(y0 = factor(y0, levels = rev(letters[1:10])),
y1 = factor(y1, levels = rev(letters[1:10]))) %>%
ggplot(aes(x = x0, xend = x1, y = y0, yend = y1)) +
geom_segment(arrow = arrow(length = unit(0.03, "npc"))) +
geom_text(aes(x = x1, y = y1, label = y1), nudge_x = 0.01)
I was thinking of doing this in R but am new to it and would appreciate any help.
I have a dataset (pitches) of baseball pitches identified by 'pitchNumber' and 'outcome', e.g. S = swinging strike, B = ball, H = hit, etc.
e.g.
1 B
2 H
3 S
4 S
5 X
6 H
etc.
All I want is a graph that plots them in a line, cf. BHSSXB, but replacing each letter with a small bar colored to represent the letter, with a legend, and optionally the pitch number above the color. Somewhat like a sparkline.
Any suggestions on how to implement this would be much appreciated.
And the same graph using ggplot.
Data courtesy of @GavinSimpson.
library(ggplot2)
ggplot(baseball, aes(x=pitchNumber, y=1, ymin=0, ymax=1, colour=outcome)) +
geom_point() +
geom_linerange() +
ylab(NULL) +
xlab(NULL) +
scale_y_continuous(breaks=c(0, 1)) +
# theme() and element_blank() replace opts() and theme_blank() in current ggplot2
theme(
panel.background = element_blank(),
panel.grid.minor = element_blank(),
axis.text.y = element_blank()
)
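A variant closer to the "small coloured bar" described in the question, as a sketch only: geom_tile draws one filled block per pitch and geom_text puts the pitch number above it. The data frame holds just the six pitches listed in the question, and the tile size and text position are arbitrary choices.

```r
library(ggplot2)

# The six pitches listed in the question
pitches <- data.frame(
  pitchNumber = 1:6,
  outcome     = factor(c("B", "H", "S", "S", "X", "H"))
)

p <- ggplot(pitches, aes(x = pitchNumber, y = 1, fill = outcome)) +
  geom_tile(width = 0.9, height = 1) +                     # one coloured block per pitch
  geom_text(aes(y = 1.7, label = pitchNumber), size = 3) + # pitch number above the block
  scale_y_continuous(limits = c(0, 2)) +
  theme_void() +                                           # sparkline look: no axes or grid
  theme(legend.position = "bottom")
p
```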
Here is a base graphics idea from which to work. First some dummy data:
set.seed(1)
baseball <- data.frame(pitchNumber = seq_len(50),
outcome = factor(sample(c("B","H","S","S","X","H"),
50, replace = TRUE)))
> head(baseball)
pitchNumber outcome
1 1 H
2 2 S
3 3 S
4 4 H
5 5 H
6 6 H
Next we define the colours we want:
## better colours - like ggplot for the cool kids
##cols <- c("red","green","blue","yellow")
cols <- head(hcl(seq(from = 0, to = 360,
length.out = nlevels(with(baseball, outcome)) + 1),
l = 65, c = 100), -1)
then plot the pitchNumber as a height 1 histogram-like bar (type = "h"), suppressing the normal axes, and we add on points to the tops of the bars to help visualisation:
with(baseball, plot(pitchNumber, y = rep(1, length(pitchNumber)), type = "h",
ylim = c(0, 1.2), col = cols[outcome],
ylab = "", xlab = "Pitch", axes = FALSE, lwd = 2))
with(baseball, points(pitchNumber, y = rep(1, length(pitchNumber)), pch = 16,
col = cols[outcome]))
Add on the x-axis and the plot frame, plus a legend:
axis(side = 1)
box()
## note: this assumes that the levels are in alphabetical order B,H,S,X...
legend("topleft", legend = c("Ball","Hit","Swinging Strike","X??"), lty = 1,
pch = 16, col = cols, bty = "n", ncol = 2, lwd = 2)
Gives this:
This is in response to your last comment on @Gavin's answer. I'm going to build off of the data provided by @Gavin and the ggplot2 plot by @Andrie. ggplot() supports the concept of faceting by a variable or variables. Here you want to facet by pitcher and at the pitch limit of 50 per row. We'll create a new variable that corresponds to each row we want to plot separately. The equivalent code in base graphics would entail adjusting mfrow or mfcol in par() and calling separate plots for each group of data.
library(plyr) #for ddply()
library(ggplot2)
#150 pitches represents a somewhat typical 9 inning game.
#Thanks to Gavin for sample data.
longGame <- rbind(baseball, baseball, baseball)
#Starter goes 95 pitches, middle relief throws 35, closer comes in for 20 and the glory
longGame$pitcher <- c(rep("S", 95), rep("M", 35), rep("C",20))
#Adjust pitchNumber accordingly
longGame$pitchNumber <- c(1:95, 1:35, 1:20)
#We want to show 50 pitches at a time, so will combine the pitcher name
#with which set of pitches this is
longGame$facet <- with(longGame, paste(pitcher, ceiling(pitchNumber / 50), sep = ""))
#Create the x-axis in increments of 1-50, by pitcher
longGame <- ddply(longGame, "facet", transform, pitchFacet = rep(1:50, 5)[1:length(facet)])
#Convert facet to factor in the right order
longGame$facet <- factor(longGame$facet, levels = c("S1", "S2", "M1", "C1"))
#Thanks to Andrie for ggplot2 function. I change the x-axis and add a facet_wrap
ggplot(longGame, aes(x=pitchFacet, y=1, ymin=0, ymax=1, colour=outcome)) +
geom_point() +
geom_linerange() +
facet_wrap(~facet, ncol = 1) +
ylab(NULL) +
xlab(NULL) +
scale_y_continuous(breaks=c(0, 1)) +
# theme() and element_blank() replace opts() and theme_blank() in current ggplot2
theme(
panel.background = element_blank(),
panel.grid.minor = element_blank(),
axis.text.y = element_blank()
)
You can obviously change the labels for the facet variable, but the above code will produce: