Stacked Bar Chart in ggplot - r

I would like to create a stacked bar chart. I successfully created my data frame using lubridate, but since I can only specify x and y values, I do not know how to 'put in' my data values.
The dataframe is looking like so:
Date Feature1 Feature2 Feature3
2020-01-01 72 0 0
2020-02-01 90 21 5
2020-03-01 112 28 2
2020-04-01 140 36 0
...
The date should be on the x-axis, and each row represents one bar in the bar chart (the height of the bar is the sum of Feature1 + Feature2 + Feature3).
The only thing I get is this:
ggplot(dataset_monthly, aes(x = dataset_monthly$Date, y =dataset_monthly$????)) +
  geom_bar(stat = "stack")

We can reshape the data to 'long' format first, so that each row holds one Date/Feature pair:
library(dplyr)
library(tidyr)
library(ggplot2)
dataset_monthly %>%
  pivot_longer(cols = -Date, names_to = 'Feature') %>%
  ggplot(aes(x = Date, y = value, fill = Feature)) +
  geom_col()
data
dataset_monthly <- structure(list(Date =
structure(c(18262, 18293, 18322, 18353), class = "Date"),
Feature1 = c(72L, 90L, 112L, 140L), Feature2 = c(0L, 21L,
28L, 36L), Feature3 = c(0L, 5L, 2L, 0L)), row.names = c(NA,
-4L), class = "data.frame")
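To see why the reshape helps, here is a minimal sketch of what the long format looks like for the sample data. The names `long` and `totals` are only for illustration and do not appear in the answer; the per-date total should match the height of each stacked bar.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
dataset_monthly <- data.frame(
  Date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01")),
  Feature1 = c(72L, 90L, 112L, 140L),
  Feature2 = c(0L, 21L, 28L, 36L),
  Feature3 = c(0L, 5L, 2L, 0L)
)

# One row per Date/Feature pair; pivot_longer() calls the
# value column "value" by default
long <- dataset_monthly %>%
  pivot_longer(cols = -Date, names_to = "Feature")

# The height of each stacked bar is the sum over the features
totals <- long %>%
  group_by(Date) %>%
  summarise(total = sum(value))
```

Because `Feature` is now a single column, it can be mapped to `fill`, which is what lets ggplot2 stack the segments.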

A slightly modified version using geom_bar. Thanks to akrun!
library(tidyverse)
# Bring the data into long format -> same code as akrun's
df <- dataset_monthly %>%
  pivot_longer(cols = -Date, names_to = 'Feature')
ggplot(df, aes(x=Date, y=value, fill=Feature, label = value)) +
  geom_bar(stat="identity")+
  geom_text(size = 3, position = position_stack(vjust = 0.8)) +
  scale_fill_brewer(palette="Paired")+
  theme_classic()

Related

plotting means in a horizontal bar with a vertical line

ID  score1  score 2  score 3  score 4
1   200     300      400      -200
2   250     -310     -470     -200
3   210     400      480      -200
4   220     -10      -400     -200
5   150     -50      400      -200
I am new to R. I want to make a graph that presents the mean of each score.
The scores should be lined up along the y axis, with a vertical line representing 0.
For every score mean above zero, a horizontal bar should extend from the center to the right.
For every score mean below zero, a horizontal bar should extend from the center to the left.
Thanks for the help!
You could achieve your desired result by first converting your dataset to long format and by computing the means per score afterwards. After these data wrangling steps you could plot the means using ggplot2 via geom_col and add a vertical zero line using geom_vline:
df <- data.frame(
ID = c(1L, 2L, 3L, 4L, 5L),
score1 = c(200L, 250L, 210L, 220L, 150L),
score.2 = c(300L, -310L, 400L, -10L, -50L),
score.3 = c(400L, -470L, 480L, -400L, 400L),
score.4 = c(-200L, -200L, -200L, -200L, -200L)
)
library(dplyr)
library(tidyr)
library(ggplot2)
df1 <- df |>
  tidyr::pivot_longer(-ID, names_to = "score") |>
  group_by(score) |>
  summarise(value = mean(value))
ggplot(df1, aes(value, score)) +
  geom_vline(xintercept = 0) +
  geom_col()
EDIT: To label the bars you could use geom_text. The tricky part is aligning the labels. To this end I use an ifelse to right-align (hjust = 1) the labels for a positive mean and left-align (hjust = 0) them for a negative mean. In practice I used 1.1 and -.1 to add some padding between the label and the bar. The axis labels can be set via the labels argument of the scale, which in your case is scale_y_discrete. Personally I prefer to use a named vector, which assigns labels to categories in the data.
ggplot(df1, aes(value, score)) +
  geom_vline(xintercept = 0) +
  geom_col() +
  geom_text(aes(label = value, hjust = ifelse(value > 0, 1.1, -.1)), color = "white") +
  scale_y_discrete(labels = c("score1" = "Test1", "score.2" = "Test2", "score.3" = "Test3", "score.4" = "Test4"))
A similar approach to stefan's, but with a slightly different choice of functions:
The data:
dat <- structure(list(ID = 1:5, score1 = c(200L, 250L, 210L, 220L, 150L
), score2 = c(300L, -310L, 400L, -10L, -50L), score3 = c(400L,
-470L, 480L, -400L, 400L), score4 = c(-200L, -200L, -200L, -200L,
-200L)), class = "data.frame", row.names = c(NA, -5L))
The chain of functions
library(dplyr)
library(tidyr)
library(purrr)   # for map_df()
library(ggplot2)
dat %>%
  select(-ID) %>%
  map_df(mean) %>%
  pivot_longer(everything(), names_to = "score", values_to = "means") %>%
  ggplot() +
  coord_flip() +
  geom_col(aes(x = score, y = means))
In case you want to change the labels on the tick marks ("score1", "score2", etc.) to other labels, you can use scale_x_discrete.
In addition, in case you want to show the numeric value on top of each bar, you can use geom_text with hjust to adjust the label positions.
For example:
dat %>%
  select(-ID) %>%
  map_df(mean) %>%
  pivot_longer(everything(), names_to = "score", values_to = "means") %>%
  ggplot() +
  coord_flip() +
  geom_col(aes(x = score, y = means)) +
  scale_x_discrete(labels = c("Test A", "Test B", "Test C", "Test D")) +
  geom_text(aes(x = score, y = means, label = means),
            hjust = c(-0.5, -0.5, -0.5, 1.1))

Interaction plot with multiple facets using ggplot

I am using RStudio, and I am working on a graph that allows comparison between an input vector and what the database has.
The data looks like this:
Type P1 P2 P3
H1 2000 60 4000
H2 1500 40 3000
H3 1000 20 2000
The input vector for comparison will look like this:
Type P1 P2 P3
C 1200 30 5000
and I want my final plot to look like this:
The most important thing is a visual comparison between the input vector and the different types, for each P component. The scale of the y axis should adapt to each type of P, because there are big differences between them.
library(dplyr)
library(tidyr)
library(ggplot2)
d %>%
  gather(var1, val, -Type) %>%
  mutate(input = as.numeric(d2[cbind(rep(1, max(row_number())),
                                     match(var1, names(d2)))]),
         slope = factor(sign(val - input), -1:1)) %>%
  gather(var2, val, -Type, -var1, -slope) %>%
  ggplot(aes(x = var2, y = val, group = 1)) +
  geom_point(aes(fill = var2), shape = 21) +
  geom_line(aes(colour = slope)) +
  scale_colour_manual(values = c("red", "blue")) +
  facet_grid(Type ~ var1)
DATA
d = structure(list(Type = c("H1", "H2", "H3"),
P1 = c(2000L, 1500L, 1000L),
P2 = c(60L, 40L, 20L),
P3 = c(4000L, 3000L, 2000L)),
class = "data.frame",
row.names = c(NA, -3L))
d2 = structure(list(Type = "C", P1 = 1200L, P2 = 30L, P3 = 5000L),
class = "data.frame",
row.names = c(NA, -1L))
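The `mutate()` step above looks up, for each P component, the matching value in the input vector `d2` and records whether the database value lies above or below it. A possibly easier-to-read sketch of the same lookup is shown here. It assumes `pivot_longer()` and `left_join()` in place of the answer's matrix indexing, and the names `input_long` and `compared` are hypothetical.

```r
library(dplyr)
library(tidyr)

d <- data.frame(
  Type = c("H1", "H2", "H3"),
  P1 = c(2000L, 1500L, 1000L),
  P2 = c(60L, 40L, 20L),
  P3 = c(4000L, 3000L, 2000L)
)
d2 <- data.frame(Type = "C", P1 = 1200L, P2 = 30L, P3 = 5000L)

# Long version of the input vector: one row per P component
input_long <- d2 %>%
  select(-Type) %>%
  pivot_longer(everything(), names_to = "var1", values_to = "input")

# Attach the matching input value to every Type/P combination,
# then record the direction of the difference (-1, 0 or 1)
compared <- d %>%
  pivot_longer(-Type, names_to = "var1", values_to = "val") %>%
  left_join(input_long, by = "var1") %>%
  mutate(slope = sign(val - input))
```

The resulting `compared` data frame has one row per facet of the plot, which makes it straightforward to inspect the values before plotting.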

How do I plot the differences between two groups, across multiple sampling days?

I am looking to plot, in a barplot, the differences in value between two groups (Elevated Temp and Control).
I'd like to be able to plot these in the same way as my original graph, with Months along the x axis.
Here is the script I used to produce my current barplot, which shows plant growth on the y axis and Months on the x axis.
Script: Current Barplot
Tempmean <- data %>%
  group_by(Treatment, Month) %>%
  summarize(TTmean = mean(Amean, na.rm = TRUE), TTsd = sd(Amean, na.rm = TRUE))
p <- ggplot(data = Tempmean, aes(x = factor(Month), y = TTmean, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge", colour = "black", size = 0.25, width = 0.5) +
  geom_errorbar(aes(ymin = TTmean - TTsd, ymax = TTmean + TTsd), width = .1,
                position = position_dodge(.5)) +
  scale_fill_manual(values = c("darkgray", "darkolivegreen")) +
  scale_x_discrete(breaks = 6:8, labels = c("June", "July", "August")) +
  scale_y_continuous(limits = c(0, 20), breaks = seq(0, 20, 2))
p
This is the data I am working with. I would like to subtract the TTmean of eCO2 from the TTmean of aCO2.
Data:
structure(list(Treatment = c("aCO2", "aCO2", "aCO2", "eCO2","eCO2", "eCO2"), Month = c(6L, 7L, 8L, 6L, 7L, 8L), TTmean = c(10.1922587348143,10.1061784054575, 8.27148533916994, 12.0261355594138,10.8954781586458, 10.9468200269188), TTsd =c(7.04936647397141,4.18653008350561, 1.50026716071241, 3.25471492346035, 0.742036555955107, 2.00464198948226)), row.names = c(NA, -6L), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), vars = "Treatment", drop = TRUE, indices = list(0:2, 3:5), group_sizes = c(3L, 3L), biggest_group_size = 3L, labels = structure(list(Treatment = c("aCO2", "eCO2")), row.names = c(NA, -2L),class = "data.frame", vars = "Treatment", drop = TRUE))
This should do the trick.
I dropped TTsd because it doesn't seem like you need it. The trick is to spread() the data so you can easily compute the difference in the values. I computed as aCO2 minus eCO2; but you can change that in mutate()
library(tidyverse)
Tempmean %>%
  select(-TTsd) %>%
  # either group_by Month, or just ungroup entirely
  group_by(Month) %>%
  spread(Treatment, TTmean) %>%
  mutate(T_diff = aCO2 - eCO2) %>%
  ggplot(aes(factor(Month), T_diff)) +
  geom_bar(position = "dodge", stat = "identity", size = 0.25, width=0.5) +
  scale_x_discrete(breaks=6:8,labels=c("June","July","August"))
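Since `spread()` has been superseded by `pivot_wider()` in tidyr, the reshaping step could also be sketched as follows. This is an assumed equivalent rather than the answer's code; the data frame is rebuilt here as a plain, ungrouped data frame without TTsd, and `diffs` is a hypothetical name.

```r
library(dplyr)
library(tidyr)

# Means from the question, as an ungrouped data frame
Tempmean <- data.frame(
  Treatment = rep(c("aCO2", "eCO2"), each = 3),
  Month = rep(6:8, times = 2),
  TTmean = c(10.1922587348143, 10.1061784054575, 8.27148533916994,
             12.0261355594138, 10.8954781586458, 10.9468200269188)
)

# One column per treatment, one row per month, then the difference
diffs <- Tempmean %>%
  pivot_wider(names_from = Treatment, values_from = TTmean) %>%
  mutate(T_diff = aCO2 - eCO2)
```

`diffs` can then be passed to the same `ggplot()` call as above. With these values every difference is negative, so the bars extend below zero.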

Multiple plots in R with time series

I have the following data. Can anyone please help me plot it? I have tried a lot of different commands, but none has given me the graph I want.
year x y
2012 4 5
2014 7 9
2017 4 3
I need to make a plot like the one in this picture.
Based on your comments you might be looking for:
library(tidyverse)
plot1 <- df %>%
  gather(key = measure, value = value, -year) %>%
  ggplot(aes(x = year, y = value, color = measure))+
  geom_point()+
  geom_line()+
  facet_wrap(~measure)
plot1
The biggest points here are gather and facet_wrap. I recommend the following two links:
https://ggplot2.tidyverse.org/reference/facet_grid.html
https://ggplot2.tidyverse.org/reference/facet_wrap.html
You need to convert the year column to the Date type.
This is a tidyverse-style solution:
library(tidyverse)
mydf %>%
  rename("col1" = x, "col2" = y) %>%
  mutate(year = paste0(year, "-01-01")) %>%
  mutate(year = as.Date(year)) %>%
  ggplot() +
  geom_line(aes(x = year, y = col1), color = "red", size = 2) +
  geom_line(aes(x = year, y = col2), color = "blue", size = 2) +
  theme_minimal()
Using the data shown reproducibly in the Note below, use matplot. No packages are used.
matplot(dd[[1]], dd[-1], pch = c("x", "y"), type = "o", xlab = "year", ylab = "value")
Note
dd <- structure(list(year = c(2012L, 2014L, 2017L), x = c(4L, 7L, 4L),
y = c(5L, 9L, 3L)), class = "data.frame", row.names = c(NA, -3L))

Plot multiple columns with different scales

I have some data in the following format:
Section Env. Ar. Width Length
A 8.38 8.76 7 36
B 11.84 13.51 11 57
C 16.69 16.49 17 87
D 11.04 11.62 9 44
E 19.56 16.79 20 106
F 17.93 21.34 19 98
I need to have a plot with section on X axis and Env. and Ar. on one Y axis and Width and Length on another Y axis, since it has a different scale. I know how to plot them in just one Y axis using ggplot, but I am stuck in how to do as I mentioned with two different Y axes. Any help will be appreciated.
Thanks!
What about using this?
library(tidyverse)
d <- structure(list(Section = structure(1:6, .Label = c("A", "B",
"C", "D", "E", "F"), class = "factor"), Env. = c(8.38, 11.84,
16.69, 11.04, 19.56, 17.93), Ar. = c(8.76, 13.51, 16.49, 11.62,
16.79, 21.34), Width = c(7L, 11L, 17L, 9L, 20L, 19L), Length = c(36L,
57L, 87L, 44L, 106L, 98L)), .Names = c("Section", "Env.", "Ar.",
"Width", "Length"), class = "data.frame", row.names = c(NA, -6L))
d %>%
  gather(key, value,-Section) %>%
  ggplot(aes(Section, value, colour=key, group= key)) +
  geom_line(size=1.1) + geom_point(size=4)+
  scale_y_continuous(name="Env_Ar",
                     sec.axis = sec_axis(~., name = "Width_Length"))
You can also try using different facets with "free_y" scaling. This is IMO much cleaner and more elegant.
d %>%
  gather(key, value,-Section) %>%
  mutate(group=ifelse(key %in% c("Width","Length"), 2, 1)) %>%
  ggplot(aes(Section, value, colour=key, group= key)) +
  geom_line(size=1.1) + geom_point(size=4)+
  facet_wrap(~group, scales = "free_y")
Edit
Here is an approach for a different scaling (10 times higher) of the right y-axis:
d %>%
  mutate(Width=Width*10,
         Length=Length*10) %>%
  gather(key, value,-Section) %>%
  ggplot(aes(Section, value, colour=key, group= key)) +
  geom_line(size=1.1) + geom_point(size=4)+
  scale_y_continuous(name="Env_Ar",
                     sec.axis = sec_axis(~.*10, name = "Width_Length"))
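A common variation on this idea is to rescale the second group of columns in one direction and undo that rescaling in `sec_axis()`, so the right-hand tick labels read in the original Width/Length units. The sketch below assumes a scale factor `k = 5`, picked by eye so that Length falls into roughly the same range as Env. and Ar.; both `k` and the name `scaled` are hypothetical and not part of the answer.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

d <- data.frame(
  Section = c("A", "B", "C", "D", "E", "F"),
  Env. = c(8.38, 11.84, 16.69, 11.04, 19.56, 17.93),
  Ar. = c(8.76, 13.51, 16.49, 11.62, 16.79, 21.34),
  Width = c(7L, 11L, 17L, 9L, 20L, 19L),
  Length = c(36L, 57L, 87L, 44L, 106L, 98L)
)

k <- 5  # assumed scale factor between the two axes

# Shrink Width and Length onto the primary (left) axis range
scaled <- d %>%
  mutate(Width = Width / k, Length = Length / k) %>%
  pivot_longer(-Section, names_to = "key", values_to = "value")

# sec_axis() multiplies by k again, so the right axis shows original units
p <- ggplot(scaled, aes(Section, value, colour = key, group = key)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(name = "Env_Ar",
                     sec.axis = sec_axis(~ . * k, name = "Width_Length"))
```

Printing `p` draws the plot. The key design point is that the transformation applied to the data and the one given to `sec_axis()` are inverses of each other.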
