proportional line width ggplot2, in Gantt chart - r

I aim to plot line widths proportional to a variable in a data.frame, a topic which has, for example, been discussed here.
My application is (although the issue is probably not related to that) within a Gantt chart adapted from here as in:
library(reshape2)
library(ggplot2)
MA <- c("A", "B", "C")
dfr <- data.frame(
name = factor(MA, levels = MA),
start.date = as.Date(c("2012-09-01", "2013-01-01","2014-01-01")),
end.date = as.Date(c("2019-01-01", "2017-12-31","2019-06-30")),
prozent = c(1,0.5,0.75)*100
)
mdfr <- melt(dfr, measure.vars = c("start.date", "end.date"))
ggplot(mdfr, aes(value, name)) + geom_line(aes(size = prozent))
This yields a plot where I do get different line widths, which however do not look proportional to prozent.
Is there a way to make the line widths proportional to prozent?

Just add + scale_size_area(), which ensures that a value of 0 is mapped to a size of 0:
ggplot(mdfr, aes(value, name)) +
geom_line(aes(size = prozent)) +
scale_size_area()
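
A caveat worth checking on your own plot: as far as I can tell, scale_size_area() uses an area (square-root) palette, so the drawn widths follow sqrt(prozent) rather than prozent itself. Below is a minimal sketch of a strictly linear mapping. It assumes ggplot2 >= 3.4.0, where line width has its own linewidth aesthetic and scale_linewidth(); the output range of 0 to 10 is an arbitrary choice, and geom_segment() is swapped in for melt() + geom_line() only to keep the example self-contained.

```r
library(ggplot2)

MA <- c("A", "B", "C")
dfr <- data.frame(
  name       = factor(MA, levels = MA),
  start.date = as.Date(c("2012-09-01", "2013-01-01", "2014-01-01")),
  end.date   = as.Date(c("2019-01-01", "2017-12-31", "2019-06-30")),
  prozent    = c(1, 0.5, 0.75) * 100
)

# Assumption: ggplot2 >= 3.4.0 (linewidth aesthetic and scale_linewidth()).
# limits = c(0, 100) anchors the scale at zero, and the linear palette then
# maps 50 to half the width of 100. The range c(0, 10) is an arbitrary choice.
p <- ggplot(dfr) +
  geom_segment(aes(x = start.date, xend = end.date,
                   y = name, yend = name,
                   linewidth = prozent)) +
  scale_linewidth(range = c(0, 10), limits = c(0, 100))
p
```

With these settings the bar for B (50) should come out half as wide as the bar for A (100).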

Related

Individual legends for separate geom_line aesthetics in the same ggplot

I'm new to R and I'm trying to create a single plot with data from 2 melted dataframes.
Ideally I would have a legend for each of the dataframes with their respective titles; however, I get only a single legend with the title of the first aesthetic.
My starting point is:
aerobic_melt <- melt(aerobic, id.vars = 'Distance', variable.name = 'Aerobic')
anaerobic_melt <- melt(anaerobic, id.vars = 'Distance', variable.name = 'Anaerobic')
plot <- ggplot() +
geom_line(data = aerobic_melt, aes(Distance, value, col=Aerobic)) +
geom_line(data = anaerobic_melt, aes(Distance, value, col= Anaerobic)) +
xlim(0, 125) +
ylab('Energy (J/kg )') +
xlab('Distance (m)')
This results in a single legend covering both sets of lines.
I've searched, but with my limited ability I haven't been able to find a way to do it.
My question is:
How do I create separate legends with titles 'Aerobic' and 'Anaerobic' which should respectively refer to A,B,C,F,G,L and E,H,I,J,K?
Any help is appreciated.
Obviously we don't have your data, but I have created some sample data that should have the same names and structure as your own data frames, since it works with your own plot code. See the end of the answer for the data used here.
You can use the package ggnewscale if you want two color scales on the same plot. Just add in a new_scale_color() call between your geom_line calls. I have left the rest of your code as-is.
library(ggplot2)
library(ggnewscale)
plot <- ggplot() +
geom_line(data = aerobic_melt, aes(Distance, value, col=Aerobic)) +
new_scale_color() +
geom_line(data = anaerobic_melt, aes(Distance, value, col= Anaerobic)) +
xlim(0, 125) +
ylab('Energy (J/kg )') +
xlab('Distance (m)')
plot
Data
set.seed(1)
aerobic_melt <- data.frame(
Aerobic = rep(c("A", "B", "C", "F", "G", "L"), each = 120),
value = as.numeric(replicate(6, cumsum(rnorm(120)))),
Distance = rep(1:120, 6))
anaerobic_melt <- data.frame(
Anaerobic = rep(c("E", "H", "I", "J", "K"), each = 120),
value = as.numeric(replicate(5, cumsum(rnorm(120)))),
Distance = rep(1:120, 5))
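
If the two legends should also be easy to tell apart, each colour scale can get its own title and palette. The following is a sketch only: it reuses the sample data above, and the two ColorBrewer palettes ("Blues" and "Reds") are arbitrary choices of mine, not part of the original question.

```r
library(ggplot2)
library(ggnewscale)

# Same sample data as above (made up for illustration).
set.seed(1)
aerobic_melt <- data.frame(
  Aerobic  = rep(c("A", "B", "C", "F", "G", "L"), each = 120),
  value    = as.numeric(replicate(6, cumsum(rnorm(120)))),
  Distance = rep(1:120, 6))
anaerobic_melt <- data.frame(
  Anaerobic = rep(c("E", "H", "I", "J", "K"), each = 120),
  value     = as.numeric(replicate(5, cumsum(rnorm(120)))),
  Distance  = rep(1:120, 5))

# The first colour scale is declared before new_scale_color(); the second
# one after it, so each layer keeps its own legend, title and palette.
p <- ggplot() +
  geom_line(data = aerobic_melt, aes(Distance, value, colour = Aerobic)) +
  scale_colour_brewer(name = "Aerobic", palette = "Blues") +
  new_scale_color() +
  geom_line(data = anaerobic_melt, aes(Distance, value, colour = Anaerobic)) +
  scale_colour_brewer(name = "Anaerobic", palette = "Reds") +
  xlim(0, 125) +
  ylab("Energy (J/kg)") +
  xlab("Distance (m)")
p
```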

Barplot side by side and line charts in the same plot

I want to create in R a plot which contains side by side bars and line charts as follows:
I tried:
Total <- c(584,605,664,711,759,795,863,954,1008,1061,1117,1150)
Infected <- c(366,359,388,402,427,422,462,524,570,560,578,577)
Recovered <- c(212,240,269,301,320,359,385,413,421,483,516,548)
Death <- c(6,6,7,8,12,14,16,17,17,18,23,25)
day <- itemizeDates(startDate="01.04.20", endDate="12.04.20")
df <- data.frame(Day=day, Infected=Infected, Recovered=Recovered, Death=Death, Total=Total)
value_matrix = matrix(, nrow = 2, ncol = 12)
value_matrix[1,] = df$Recovered
value_matrix[2,] = df$Death
plot(c(1:12), df$Total, ylim=c(0,1200), xlim=c(1,12), type = "b", col="peachpuff", xaxt="n", xlab = "", ylab = "")
points(c(1:12), df$Infected, type = "b", col="red")
barplot(value_matrix, beside = TRUE, col = c("green", "black"), width = 0.35, add = TRUE)
But the bar chart does not fit the line chart. I guess it would be easier to use ggplot2, but don't know how. Could anyone help me? Thanks a lot in advance!
With ggplot2, the margins are handled nicely for you, but you'll need the data in two separate long forms. Reshape from wide to long with tidyr::gather, tidyr::pivot_longer, reshape2::melt, reshape, or whatever you prefer.
library(tidyr)
library(ggplot2)
df <- data.frame(
Total = c(584,605,664,711,759,795,863,954,1008,1061,1117,1150),
Infected = c(366,359,388,402,427,422,462,524,570,560,578,577),
Recovered = c(212,240,269,301,320,359,385,413,421,483,516,548),
Death = c(6,6,7,8,12,14,16,17,17,18,23,25),
day = seq(as.Date("2020-04-01"), as.Date("2020-04-12"), by = 'day')
)
ggplot(
tidyr::gather(df, Population, count, Total:Infected),
aes(day, count, color = Population, fill = Population)
) +
geom_line() +
geom_point() +
geom_col(
data = tidyr::gather(df, Population, count, Recovered:Death),
position = 'dodge', show.legend = FALSE
)
Another way to do it is to gather twice before plotting. Not sure if this is easier or harder to understand, but you get the same thing.
df %>%
tidyr::gather(Population, count, Total:Infected) %>%
tidyr::gather(Resolution, count2, Recovered:Death) %>%
ggplot(aes(x = day, y = count, color = Population)) +
geom_line() +
geom_point() +
geom_col(
aes(y = count2, color = Resolution, fill = Resolution),
position = 'dodge', show.legend = FALSE
)
You can actually plot the lines and points without reshaping by making separate calls for each, but to dodge bars (or get legends), you'll definitely need to reshape.
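
A sketch of that last variant, in case it helps: the lines and points are drawn straight from the wide data with one call per series, and only the bar data is reshaped. It uses tidyr::pivot_longer() (the newer replacement for gather()); the colours are taken from the question's base-R attempt, and the name bars is just a helper introduced here.

```r
library(tidyr)
library(ggplot2)

df <- data.frame(
  Total     = c(584,605,664,711,759,795,863,954,1008,1061,1117,1150),
  Infected  = c(366,359,388,402,427,422,462,524,570,560,578,577),
  Recovered = c(212,240,269,301,320,359,385,413,421,483,516,548),
  Death     = c(6,6,7,8,12,14,16,17,17,18,23,25),
  day       = seq(as.Date("2020-04-01"), as.Date("2020-04-12"), by = "day")
)

# Only the dodged bars need long data: one row per day and resolution.
bars <- pivot_longer(df, c(Recovered, Death),
                     names_to = "Resolution", values_to = "count")

p <- ggplot(df, aes(day)) +
  geom_col(data = bars, aes(y = count, fill = Resolution),
           position = "dodge") +
  # One call per series; colours are fixed, so these lines get no legend.
  geom_line(aes(y = Total), colour = "peachpuff") +
  geom_point(aes(y = Total), colour = "peachpuff") +
  geom_line(aes(y = Infected), colour = "red") +
  geom_point(aes(y = Infected), colour = "red")
p
```

The trade-off is visible in the result: the bars get a legend through the fill mapping, while the hard-coded line colours do not.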

Highlight / Draw a box around some of the plots when using `facet_grid` in ggplot2

I am creating a matrix of plots similar to
ggplot(mpg, aes(displ, hwy)) + geom_point() + facet_grid(rows = vars(cyl), cols = vars(drv))
Now, I would like to have some way to highlight some of the individual plots, say the ones where cyl is 5 or 6, and drv is f. So, ideally, this might look like this:
But I would also be happy with those panels having a different look by setting ggtheme to classic or similar.
However, it is very unclear to me how I can modify individually selected plots within a matrix of plots generated via facet_grid
Adapting @joran's answer found here, this is what I get:
[EDIT] code edited to select multiple facets
if(!require(tidyverse)){install.packages("tidyverse")}
library(tidyverse)
#dummy dataset
df = data.frame(type = as.character(c("a", "b", "c", "d")),
id = as.character(c("M5", "G5", "A7", "S3")),
val = runif(4, min = 1, max = 10),
temp = runif(4))
# use a rectangle to individually select plots
ggplot(data = df, aes(x = val, y = temp)) +
geom_point() +
geom_rect(data = subset(df, type %in% c("b", "c") & id %in% c("A7","G5")),
fill = NA, colour = "red", xmin = -Inf,xmax = Inf,
ymin = -Inf,ymax = Inf) +
facet_grid(type~id)
It does not use theme() but it seems simple enough to highlight some facets.
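
Applied to the mpg example from the question, the same idea might look like the sketch below. The helper names sel and highlight are mine; the key point is to pass geom_rect() exactly one row per panel to be framed, so the rectangle is drawn once per panel rather than once per data point.

```r
library(ggplot2)

# Rows belonging to the panels to highlight: cyl is 5 or 6 and drv is "f".
sel <- mpg[mpg$cyl %in% c(5, 6) & mpg$drv == "f", ]

# Keep a single row per (cyl, drv) combination, i.e. one row per panel.
highlight <- sel[!duplicated(sel[c("cyl", "drv")]), ]

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  # Infinite bounds make the rectangle span the whole panel without
  # affecting the axis limits.
  geom_rect(data = highlight,
            fill = NA, colour = "red",
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  facet_grid(rows = vars(cyl), cols = vars(drv))
p
```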

How to draw a barplot from counts data in R?

I have a data-frame 'x'
I want barplot like this
I tried
barplot(x$Value, names.arg = x$'Categorical variable')
ggplot(as.data.frame(x$Value), aes(x$'Categorical variable')
Nothing seems to work properly. In barplot, all axis labels (freq values) are different. ggplot is filling all bars to 100%.
You can try plotting using geom_bar(). Following code generates what you are looking for.
df = data.frame(X = c("A","B C","D"),Y = c(23,12,43))
ggplot(df,aes(x=X,y=Y)) + geom_bar(stat='identity') + coord_flip()
It helps to read the ggplot documentation. ggplot requires a few things, including data and aes(). You've got both of those statements there but you're not using them correctly.
library(ggplot2)
library(magrittr) # provides the %>% pipe used below
set.seed(256)
dat <-
data.frame(variable = c("a", "b", "c"),
value = rnorm(3, 10))
dat %>%
ggplot(aes(x = variable, y = value)) +
geom_bar(stat = "identity", fill = "blue") +
coord_flip()
Here, I'm piping my dat to ggplot as the data argument and using the names of the x and y variables rather than passing a data$... value. Next, I add the geom_bar() statement, and I have to use stat = "identity" to tell ggplot to use the actual values in my value column rather than counting the number of rows for each variable.
You have to use stat = "identity" in geom_bar().
dat <- data.frame("cat" = c("A", "BC", "D"),
"val" = c(23, 12, 43))
ggplot(dat, aes(as.factor(cat), val)) +
geom_bar(stat = "identity") +
coord_flip()
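
For what it's worth, geom_col() is the ggplot2 shorthand for geom_bar(stat = "identity"), so the same plot can be sketched a little more briefly. The data frame below just repeats the made-up values from the answer above.

```r
library(ggplot2)

dat <- data.frame(cat = c("A", "BC", "D"),
                  val = c(23, 12, 43))

# geom_col() uses the y values as bar heights directly,
# so no stat = "identity" is needed.
p <- ggplot(dat, aes(cat, val)) +
  geom_col() +
  coord_flip()
p
```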

Setting the x-axis for R ggplot2

How can I make the x-axis display the text in the "xaxisTitles" vector?
Here is my code you can run:
require(ggplot2)
require(reshape)
xaxisTitles<- cbind("a","b","c","d","e","f","g","h","j","k")
df <- data.frame(time = 1:10,
a = cumsum(rnorm(10)),
b = cumsum(rnorm(10)),
c = cumsum(rnorm(10)))
df <- melt(df , id = 'time', variable_name = 'series')
# plot on same grid, each series colored differently --
# good if the series have same scale
ggplot(df, aes(time,value)) + geom_line(aes(colour = series))+ theme(axis.text.x = xaxisTitles)
I am getting the error:
Error in (function (el, elname) :
Element axis.text.x must be a element_text object.
The reason you are getting the error is that theme(...) is used to set the appearance of the axis text (e.g., color, font family, font face, size, orientation, etc.), but not the values of the text. To do that, as @SteveReno points out, you have to use scale_x_discrete(...).
require(ggplot2)
require(reshape)
set.seed(321)
# xaxisTitles<- cbind("a","b","c","d","e","f","g","h","j","k")
xaxisTitles <- letters[1:10] # easier way to do this...
df <- data.frame(time = 1:10,
a = cumsum(rnorm(10)),
b = cumsum(rnorm(10)),
c = cumsum(rnorm(10)))
df <- melt(df , id = 'time', variable_name = 'series')
# plot on same grid, each series colored differently --
# good if the series have same scale
ggplot(df, aes(time,value)) +
geom_line(aes(colour = series))+
scale_x_discrete(labels=xaxisTitles)+
theme(axis.text.x=element_text(face="bold",colour="red",size=14))
The best way to do this is to make the time variable a factor rather than a numeric vector, as long as you remember to adjust the group aesthetic:
df$time = factor(xaxisTitles[df$time])
ggplot(df, aes(time, value)) + geom_line(aes(colour = series, group=series))
(If you don't add the group=series argument, it won't know that you want to connect lines across the factor on the x axis).
You can just use scale_x_discrete to set the labels.
ggplot(df, aes(time,value)) + geom_line(aes(colour = series))+ scale_x_discrete(labels= xaxisTitles)
Here's some more helpful info http://docs.ggplot2.org/0.9.3.1/scale_discrete.html
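
Since time is numeric in this data, another option is to keep the continuous x scale and only relabel its breaks. This is a sketch under that assumption; the long data frame is built by hand here (instead of with melt()) purely to keep the example self-contained, and the names wide and long are mine.

```r
library(ggplot2)

set.seed(321)
xaxisTitles <- c("a", "b", "c", "d", "e", "f", "g", "h", "j", "k")

wide <- data.frame(time = 1:10,
                   a = cumsum(rnorm(10)),
                   b = cumsum(rnorm(10)),
                   c = cumsum(rnorm(10)))

# Long format built by hand: one row per time point and series.
long <- data.frame(time   = rep(wide$time, 3),
                   series = rep(c("a", "b", "c"), each = 10),
                   value  = c(wide$a, wide$b, wide$c))

# One break per time point, each labelled with the matching title.
p <- ggplot(long, aes(time, value)) +
  geom_line(aes(colour = series)) +
  scale_x_continuous(breaks = 1:10, labels = xaxisTitles)
p
```

Because the x axis stays continuous, no group aesthetic is needed to connect the points of each series.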
