I have the following data:
data <- data.frame(x = letters[1:6],
group = rep(letters[1:2], each = 3),
y = 1:6)
x group y
1 a a 1
2 b a 2
3 c a 3
4 d b 4
5 e b 5
6 f b 6
And I would like to plot y ~ x and split into facets by groups with ggplot2.
ggplot(data, aes(x, y)) +
geom_bar(stat = "identity") +
facet_grid(group ~ .)
The problem is that some (x, group) combinations don't exist in my data (for example, there is no row with x = a and group = b), but they are still kept on the x-axis of both facets. I would like to drop them, so that each facet shows only its own x levels and has no blank space where a factor level is missing from that group.
I thought scales = "free_x" or drop = TRUE could do the trick but I couldn't manage to do it.
Any help would be appreciated, Thanks !
Use facet_wrap instead:
ggplot(data, aes(x, y)) +
geom_col() +
facet_wrap(~group, scales = 'free', nrow = 2, strip.position = 'right')
Also note geom_col() as an alternative to geom_bar(stat = "identity").
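As a sketch of why scales = "free_x" has no visible effect with facet_grid(group ~ .): in facet_grid, panels stacked in the same column share one x scale, so with a single column there is nothing to free. The snippet below compares the x limits that each layout assigns to the top panel. The object names p_grid, p_wrap, grid_limits and wrap_limits are mine, not from the question.

```r
library(ggplot2)

data <- data.frame(x = letters[1:6],
                   group = rep(letters[1:2], each = 3),
                   y = 1:6)

# facet_grid with stacked rows: both panels sit in one column and share x
p_grid <- ggplot(data, aes(x, y)) +
  geom_col() +
  facet_grid(group ~ ., scales = "free_x")

# facet_wrap: every panel can get its own x scale
p_wrap <- ggplot(data, aes(x, y)) +
  geom_col() +
  facet_wrap(~group, scales = "free_x", nrow = 2, strip.position = "right")

# x limits of the top panel (row 1, column 1) under each layout
grid_limits <- layer_scales(p_grid, i = 1, j = 1)$x$get_limits()
wrap_limits <- layer_scales(p_wrap, i = 1, j = 1)$x$get_limits()
```

If the facets may sit side by side instead of stacked, facet_grid(. ~ group, scales = "free_x", space = "free_x") should also drop the unused levels, since the x scale is then free to vary between columns.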
I just started learning R. I melted my dataframe and used ggplot to get this graph. There's supposed to be two lines on the same graph, but the lines connecting seem random.
Correct points plotted, but wrong lines.
# Melted my data to create new dataframe
AvgSleep2_DF <- melt(AvgSleep_DF , id.vars = 'SleepDay_Date',
variable.name = 'series')
# Plotting
ggplot(AvgSleep2_DF, aes(SleepDay_Date, value, colour = series)) +
geom_point(aes(colour = series)) +
geom_line(aes(colour = series))
With or without the aes(colour = series) in the geom_line results in the same graph. What am I doing wrong here?
The following might explain what geom_line() does when you specify aesthetics in the ggplot() call.
I assign a deliberate colour column that differs from the series specification!
df <- data.frame(
x = c(1,2,3,4,5)
, y = c(2,2,3,4,2)
, colour = factor(c(rep(1,3), rep(2,2)))
, series = c(1,1,2,3,3)
)
df
x y colour series
1 1 2 1 1
2 2 2 1 1
3 3 3 1 2
4 4 4 2 3
5 5 2 2 3
Each layer inherits the aesthetics defined in the ggplot() call unless the layer overrides them.
ggplot(data = df, aes(x = x, y = y, colour = colour)) +
geom_point(size = 3) + # setting the size to stress point layer call
geom_line() # geom_line will "inherit" a "grouping" from the colour set above
This gives you one line per colour level: one through the first three points and one through the last two.
We can instead control the "grouping" associated with each line (segment) as follows:
ggplot(data = df, aes(x = x, y = y, colour = colour)) +
geom_point(size = 3) +
geom_line(aes(group = series)) # defining a specific grouping
Note: As I defined a separate "group" in the series column for the 3rd point, it is depicted - in this case - as a single point "line".
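To check which grouping a line layer actually ends up with, one option is to inspect the data ggplot2 computes for that layer. This is a minimal sketch using layer_data() on the two plots above; the names p_inherited, p_explicit, groups_inherited and groups_explicit are introduced here for illustration only.

```r
library(ggplot2)

df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 2, 3, 4, 2),
  colour = factor(c(rep(1, 3), rep(2, 2))),
  series = c(1, 1, 2, 3, 3)
)

# grouping inherited from the discrete colour aesthetic
p_inherited <- ggplot(df, aes(x = x, y = y, colour = colour)) +
  geom_point(size = 3) +
  geom_line()

# grouping set explicitly on the line layer
p_explicit <- ggplot(df, aes(x = x, y = y, colour = colour)) +
  geom_point(size = 3) +
  geom_line(aes(group = series))

# layer 2 is geom_line(); its computed data carries a `group` column
groups_inherited <- layer_data(p_inherited, i = 2)$group
groups_explicit  <- layer_data(p_explicit, i = 2)$group
```

Rows that share a group value are connected by one line, so comparing the two vectors shows where the line is broken.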
I need to plot two grouped barplots from two data frames that have different numbers of rows: 6 and 5.
I tried many approaches in R, but I don't know how to fix it.
Here are my data frames. The Freq column must be on the y-axis, and the inter and intra columns must be on the x-axis.
> freqinter
inter Freq
1 0.293040975264367 17
2 0.296736775990729 2
3 0.297619926364764 4
4 0.587377012109561 1
5 0.595245125315916 4
6 0.597022018595893 2
> freqintra
intra Freq
1 0 3
2 0.293040975264367 15
3 0.597022018595893 4
4 0.598809552335782 2
5 0.898227748764939 6
I expect to draw both barplots in the same plot and to distinguish the inter and intra values by colour.
I want a picture like this one:
You probably want a histogram. Use the raw data if possible. For example:
library(tidyverse)
freqinter <- data.frame(x = c(
0.293040975264367,
0.296736775990729,
0.297619926364764,
0.587377012109561,
0.595245125315916,
0.597022018595893), Freq = c(17,2,4,1,4,2))
freqintra <- data.frame(x = c(
0 ,
0.293040975264367,
0.597022018595893,
0.598809552335782,
0.898227748764939), Freq = c(3,15,4,2,6))
df <- bind_rows(freqinter, freqintra, .id = "id") %>%
uncount(Freq)
ggplot(df, aes(x, fill = id)) +
geom_histogram(binwidth = 0.1, position = 'dodge', col = 1) +
scale_fill_grey() +
theme_minimal()
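For readers unfamiliar with uncount(): it repeats each row as many times as its count column says and, by default, drops that column, which is what turns the frequency table back into "raw" observations for geom_histogram(). A small sketch on the intra table from the question (the name expanded is mine):

```r
library(tidyr)

freqintra <- data.frame(x = c(
  0,
  0.293040975264367,
  0.597022018595893,
  0.598809552335782,
  0.898227748764939), Freq = c(3, 15, 4, 2, 6))

# one row per observation: each x is repeated Freq times
expanded <- uncount(freqintra, Freq)

# the number of rows now equals the total frequency
nrow(expanded) == sum(freqintra$Freq)
```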
With the data you posted, I don't think you can make this graph look good. You can't have bars thin enough to differentiate 0.293 and 0.296 when your data ranges from 0 to 0.9.
Maybe you could try to treat it as a factor just to illustrate what you want to do:
library(dplyr)   # bind_rows()
library(ggplot2)
freqinter <- data.frame(x = c(
0.293040975264367,
0.296736775990729,
0.297619926364764,
0.587377012109561,
0.595245125315916,
0.597022018595893), Freq = c(17,2,4,1,4,2))
freqintra <- data.frame(x = c(
0 ,
0.293040975264367,
0.597022018595893,
0.598809552335782,
0.898227748764939), Freq = c(3,15,4,2,6))
df <- bind_rows(freqinter, freqintra, .id = "id")
ggplot(df, aes(x = as.factor(x), y = Freq, fill = id)) +
geom_bar(stat = "identity", position = position_dodge2(preserve = "single")) +
theme(axis.text.x = element_text(angle = 90)) +
scale_fill_discrete(labels = c("inter", "intra"))
You can also check the problem by not treating your x variable as a factor:
ggplot(df, aes(x = x, y = Freq, fill = id)) +
geom_bar(stat = "identity", width = 0.05, position = "dodge") +
theme(axis.text.x = element_text(angle = 90)) +
scale_fill_discrete(labels = c("inter", "intra"))
Either the bars must be very thin (small width), or you'll get overlapping x intervals breaking the plot.
I want to use ggplot to loop over several columns to create multiple plots, but using the placeholder in the for loop changes the behavior of ggplot.
If I have this:
t <- data.frame(w = c(1, 2, 3, 4), x = c(23,45,23, 34),
y = c(23,34,54, 23), z = c(23,12,54, 32))
This works fine:
ggplot(data=t, aes(w, x)) + geom_line()
But this does not:
i <- 'x'
ggplot(data=t, aes(w, i)) + geom_line()
Which is a problem if I want to eventually loop over x, y and z.
Any help?
You just need to use aes_string instead of aes, like this:
ggplot(data=t, aes_string(x = "w", y = i)) + geom_line()
Note that w then needs to be specified as a string, too.
ggplot2 > 3.0.0 supports the tidy evaluation pronoun .data, so we can do the following:
Build a function that takes x- & y- column names as inputs. Note the use of .data[[]].
Then loop through every column using purrr::map.
library(rlang)
library(tidyverse)
dt <- data.frame(
w = c(1, 2, 3, 4), x = c(23, 45, 23, 34),
y = c(23, 34, 54, 23), z = c(23, 12, 54, 32)
)
Define a function that accepts strings as input
plot_for_loop <- function(df, x_var, y_var) {
ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]])) +
geom_point() +
geom_line() +
labs(x = x_var, y = y_var) +
theme_classic(base_size = 12)
}
Loop through every column
plot_list <- colnames(dt)[-1] %>%
map( ~ plot_for_loop(dt, colnames(dt)[1], .x))
# view all plots individually (not shown)
plot_list
# Combine all plots
library(cowplot)
plot_grid(plotlist = plot_list,
ncol = 3)
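If a plain for loop is preferred over purrr::map, something like the following should work. One caveat, as far as I understand aes(): it captures its arguments without evaluating them, so a bare loop variable would only be looked up when the plot is drawn, by which time the loop has finished. The sketch below therefore injects the current column name with !!. The names y_cols and plots are mine.

```r
library(ggplot2)

dt <- data.frame(
  w = c(1, 2, 3, 4), x = c(23, 45, 23, 34),
  y = c(23, 34, 54, 23), z = c(23, 12, 54, 32)
)

y_cols <- setdiff(names(dt), "w")
plots <- list()

for (col in y_cols) {
  # !! injects the current value of `col` into the aesthetic right away;
  # without it, `col` would only be looked up when the plot is built
  plots[[col]] <- ggplot(dt, aes(x = w, y = .data[[!!col]])) +
    geom_line() +
    labs(y = col)
}

# plots are not auto-printed inside a loop; use print(plots[["x"]]) etc.
```

Wrapping the ggplot() call in a function, as above, avoids the issue in a different way, because each call gets its own environment.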
Edit: the above function can also be written w/ rlang::sym & !! (bang bang).
plot_for_loop2 <- function(df, .x_var, .y_var) {
# convert strings to variable
x_var <- sym(.x_var)
y_var <- sym(.y_var)
# unquote variables using !!
ggplot(df, aes(x = !! x_var, y = !! y_var)) +
geom_point() +
geom_line() +
labs(x = .x_var, y = .y_var) + # label with the original strings
theme_classic(base_size = 12)
}
Or we can just use facet_grid/facet_wrap after converting the data frame from wide to long format (tidyr::gather):
dt_long <- dt %>%
tidyr::gather(key, value, -w)
dt_long
#> w key value
#> 1 1 x 23
#> 2 2 x 45
#> 3 3 x 23
#> 4 4 x 34
#> 5 1 y 23
#> 6 2 y 34
#> 7 3 y 54
#> 8 4 y 23
#> 9 1 z 23
#> 10 2 z 12
#> 11 3 z 54
#> 12 4 z 32
### facet_grid
ggp1 <- ggplot(dt_long,
aes(x = w, y = value, color = key, group = key)) +
facet_grid(. ~ key, scales = "free", space = "free") +
geom_point() +
geom_line() +
theme_bw(base_size = 14)
ggp1
### facet_wrap
ggp2 <- ggplot(dt_long,
aes(x = w, y = value, color = key, group = key)) +
facet_wrap(. ~ key, nrow = 2, ncol = 2) +
geom_point() +
geom_line() +
theme_bw(base_size = 14)
ggp2
### bonus: reposition legend
# https://cran.r-project.org/web/packages/lemon/vignettes/legends.html
library(lemon)
reposition_legend(ggp2 + theme(legend.direction = 'horizontal'),
'center', panel = 'panel-2-2')
The problem is how you access the data frame t. As you probably know, there are several ways of doing so but unfortunately using a character is obviously not one of them in ggplot.
One way that could work is using the numerical position of the column in your example, e.g., you could try i <- 2. However, whether this works depends on ggplot, which I have never used (though I know other work by Hadley, and I guess it should work).
Another way of circumventing this is by creating a new temporary data frame every time you call ggplot. e.g.:
tmp <- data.frame(a = t[['w']], b = t[[i]])
ggplot(data=tmp, aes(a, b)) + geom_line()
Depending on what you are trying to do, I find facet_wrap or facet_grid to work well for creating multiple plots with the same basic structure. Something like this should get you in the right ballpark:
library(reshape2) # melt()
library(ggplot2)
t.m <- melt(t, id = "w")
ggplot(t.m, aes(w, value)) + facet_wrap(~ variable) + geom_line()
I am working on a figure which should contain 3 different lines on the same graph. The data frame I am working on is the following:
I would like to use ind (my data point) on the x-axis and then draw 3 different lines using the data from the columns med, b and c.
I only managed to draw one line.
Could you please help me? The code I am using now is:
ggplot(data=f, aes(x=ind, y=med, group=1)) +
geom_line(aes())+ geom_line(colour = "darkGrey", size = 3) +
theme_bw() +
theme(plot.background = element_blank(),panel.grid.major = element_blank(),panel.grid.minor = element_blank())
The key is to gather the columns in question into a single new variable (long format). This happens in the gather() step in the code below. The rest is pretty much boilerplate ggplot2.
library(ggplot2)
library(tidyr)
xy <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10),
ind = 1:10)
# we gather a and b into a new pair of columns (myvariable, myvalue)
xy <- gather(xy, key = myvariable, value = myvalue, a, b)
ggplot(xy, aes(x = ind, y = myvalue, color = myvariable)) +
theme_bw() +
geom_line()
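The same reshaping can be sketched with tidyr::pivot_longer(), which has superseded gather() in current tidyr. This version gathers all three columns so that three lines are drawn; set.seed() and the name xy_long are my additions, and the random columns a, b, c stand in for the asker's real data.

```r
library(ggplot2)
library(tidyr)

set.seed(1) # only so the random example data is reproducible
xy <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10),
                 ind = 1:10)

# wide to long: one row per (ind, variable) pair
xy_long <- pivot_longer(xy, cols = c(a, b, c),
                        names_to = "myvariable", values_to = "myvalue")

p <- ggplot(xy_long, aes(x = ind, y = myvalue, color = myvariable)) +
  theme_bw() +
  geom_line()
```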
With melt and ggplot:
df$ind <- 1:nrow(df)
head(df)
a b med c ind
1 -87.21893 -84.72439 -75.78069 -70.87261 1
2 -107.29747 -70.38214 -84.96422 -73.87297 2
3 -106.13149 -105.12869 -75.09039 -62.61283 3
4 -93.66255 -97.55444 -85.01982 -56.49110 4
5 -88.73919 -95.80307 -77.11830 -47.72991 5
6 -86.27068 -83.24604 -86.86626 -91.32508 6
library(reshape2) # melt()
library(ggplot2)
df <- melt(df, id = 'ind')
ggplot(df, aes(ind, value, group = variable, col = variable)) + geom_line(lwd = 2)
I would like to make a plot using facet_wrap where the axes can vary for each panel but within a panel the x and y axes should be the same scale.
e.g. see the following plots
df <- read.table(text = "
x y g
1 5 a
2 6 a
3 7 a
4 8 a
5 9 b
6 10 b
7 11 b
8 12 b", header = TRUE)
library(ggplot2)
ggplot(df, aes(x=x,y=y,g=g)) +
geom_point() +
facet_wrap(~ g) # all axes 1-12
ggplot(df, aes(x=x,y=y,g=g)) +
geom_point() +
facet_wrap(~ g, scales = "free")
# free axes, but x & y axes don't match within a panel
What I want is for the x and y axes of panel a both to range from 1 to 8, and for the x and y axes of panel b both to range from 5 to 12.
Is this possible?
Using this answer you could try the following:
dummy <- data.frame(x = c(1, 8, 5, 12), y = c(1, 8, 5, 12), g = c("a", "a", "b", "b"))
ggplot(df, aes(x=x,y=y)) +
geom_point() +
facet_wrap(~ g, scales = "free") +
geom_blank(data = dummy)
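Rather than typing the dummy limits by hand, they could be derived from the data. This is a rough sketch in base R that takes, for each level of g, the range over x and y together; it reuses the name dummy from above, while lims is a helper variable of mine.

```r
library(ggplot2)

df <- read.table(text = "
x y g
1 5 a
2 6 a
3 7 a
4 8 a
5 9 b
6 10 b
7 11 b
8 12 b", header = TRUE)

# one (min, max) pair per panel, taken over x and y together
dummy <- do.call(rbind, lapply(split(df, df$g), function(d) {
  lims <- range(c(d$x, d$y))
  data.frame(x = lims, y = lims, g = d$g[1])
}))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~ g, scales = "free") +
  geom_blank(data = dummy)
```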
Another solution is to trick the axes of the individual facet_wrap() panels by adding invisible points with x and y reversed, so that the plotted data is "square", e.g.,
library(ggplot2)
p <- ggplot(data = df) +
geom_point(mapping = aes(x = x, y = y)) +
geom_point(mapping = aes(x = y, y = x), alpha = 0) +
facet_wrap( ~ g, scales = "free")
print(p)
You could also use geom_blank(). You don't need dummy data.
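My reading of this suggestion is that geom_blank() can take the swapped aesthetics directly, which trains both scales on both columns without drawing anything. A minimal sketch under that assumption; nrow = 1 is added only so the panel positions used below are unambiguous, and the scales_a / scales_b names are mine.

```r
library(ggplot2)

df <- read.table(text = "
x y g
1 5 a
2 6 a
3 7 a
4 8 a
5 9 b
6 10 b
7 11 b
8 12 b", header = TRUE)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # draws nothing, but feeds y into the x scale and x into the y scale
  geom_blank(aes(x = y, y = x)) +
  facet_wrap(~ g, scales = "free", nrow = 1)

# scales of panel a (row 1, col 1) and panel b (row 1, col 2)
scales_a <- layer_scales(p, i = 1, j = 1)
scales_b <- layer_scales(p, i = 1, j = 2)
```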
This wasn't an option when the question was asked, but these days I would highly recommend patchwork for combining plots.