Bar plot in R with multiple bars per x variable

How do I draw a bar plot so that every treatment group on the x-axis has two bars, one for avgRDM and one for avgSDM? I would like the bars to be colored by measure (avgRDM vs. avgSDM).
The data for the plot is in the following image:
Thank you

I'm a big fan of ggplot, so here is an option in that vein. It's easiest (and tidiest) to reshape the data from wide to long and then map the fill aesthetic to the key:
library(tidyverse)
df %>%
  gather(key, val, -trt) %>%
  ggplot(aes(trt, val, fill = key)) +
  geom_col(position = "dodge2")
PS. For future posts, please share data in a reproducible way, e.g. using dput; screenshots are never a good idea, as they require respondents to manually type out your sample data.
Sample data
df <- read.table(text =
"trt avgRDM avgSDM
F10 49.5 108.333
NH4Cl 12.583 50.25
NH4NO3 17.333 73.33
'F10 + ANU843' 6.0 7.333", header = TRUE)
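A note for readers on current tidyr: gather() still works, but the tidyr documentation marks it as superseded in favour of pivot_longer(). Below is a minimal sketch of the same plot with pivot_longer(), reusing the sample data from the answer. The names df_long and p are only illustrative; key and val are kept to match the answer's code.

```r
library(tidyr)
library(ggplot2)

# Sample data copied from the answer
df <- read.table(text =
"trt avgRDM avgSDM
F10 49.5 108.333
NH4Cl 12.583 50.25
NH4NO3 17.333 73.33
'F10 + ANU843' 6.0 7.333", header = TRUE)

# Wide -> long: one row per (treatment, measure) pair
df_long <- pivot_longer(df, cols = -trt, names_to = "key", values_to = "val")

# Two dodged bars per treatment, coloured by measure
p <- ggplot(df_long, aes(trt, val, fill = key)) +
  geom_col(position = "dodge2")
print(p)
```

The only real difference from the gather() version is the explicit names_to / values_to arguments.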

Related

How to create a stacked area chart in R from a csv with non-numerical data

I am trying to create a stacked area chart in R using data from this csv: https://raw.githubusercontent.com/fivethirtyeight/data/master/masculinity-survey/raw-responses.csv
(The above file is raw content, for better readability of the data look here: https://github.com/fivethirtyeight/data/blob/master/masculinity-survey/masculinity-survey.csv)
I am trying to create a percentage-based stacked area chart that is similar to this example: https://r-charts.com/en/evolution/percentage-stacked-area_files/figure-html/percentage-areaplot.png
The problem is that, since I am working with non-numerical data only, it is hard for me to get a proper graph.
My goal is to have the graph display the different age groups on the x-axis (column "age3" in the raw content) and use the ethnicities as the fill (column "racethn4" in the raw content), while the y-axis simply shows the percentage of the total number of answers in the survey (which of course goes up to 100).
I tried it the following way, but I'm not sure what the y value should be:
df <- read_csv("Path to csv")
ggplot(df, aes(x = df$age3, y = ???, fill = df$racethn4)) + geom_stream()
Any ideas on how to represent the plot as described?
I'm not too well versed in ggplot as I use other graphing packages but I gave this a shot. I don't believe you can use geom_area when x is a categorical variable. At least I did not have any luck trying that. So I used geom_col instead.
Here are two approaches to transforming the data, one using data.table and one using dplyr. Feel free to pick whichever is more natural for you.
You need to sum up the number of observations per group combo first and then get the percent total for the y values.
library(data.table)
library(ggplot2)
library(dplyr)
dat = fread("temp.csv") # from data.table::fread
# data.table way
dat_sub = dat[, .(age3 = as.factor(age3), racethn4 = as.factor(racethn4))][,.N, by = .(age3,racethn4)]
dat_sub[, tot := sum(N), by = age3][, perc := N/tot*100][order(age3)]
# dplyr way
dat_sub = dat %>%
  select(age3, racethn4) %>%
  group_by(age3, racethn4) %>%
  summarise(n = n()) %>%
  group_by(age3) %>%
  mutate(tot = sum(n),
         perc = n / tot * 100)
# using a stacked bar chart instead of stacked area
ggplot(dat_sub, aes(x = age3, y = perc, fill = racethn4)) +
  geom_col()
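If only the plot is needed, the counting and percentage steps can probably be left to ggplot2 itself: geom_bar() counts rows, and position = "fill" rescales each bar to sum to 1. A minimal sketch on a small made-up data frame follows; only the column names age3 and racethn4 come from the question, and the category values are hypothetical stand-ins for the survey responses.

```r
library(ggplot2)

# Hypothetical stand-in for the survey data; only the column names
# age3 and racethn4 come from the question.
dat <- data.frame(
  age3     = c("18 - 34", "18 - 34", "35 - 64", "35 - 64", "35 - 64", "65 and up"),
  racethn4 = c("White", "Black", "White", "White", "Hispanic", "White")
)

# geom_bar() counts the rows in each age3/racethn4 combination;
# position = "fill" rescales every bar so its segments sum to 1.
p <- ggplot(dat, aes(x = age3, fill = racethn4)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "share of responses")
print(p)

# The same within-group shares as a table, for checking the plot
shares <- prop.table(table(dat$age3, dat$racethn4), margin = 1)
```

The explicit perc column from the answer is still the better choice when the percentages are needed outside the plot.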

R ggplot2 multiple column stacked histogram, separate bar for each column

I am trying to make a histogram of percentages for multiple columns of data in one graph. Is there a way to do this without transforming the data into an even longer format? Basically, I want to combine multiple histograms on one plot with the same y axis. I can't get facet_grid and facet_wrap to work because everything is in different columns. Here is some sample data:
data <- data.frame("participant"=c(1,2,3,4,5),
"metric1"=c(0,1,2,0,1),
"metric2"=c(1,2,0,1,2),
"metric3"=c(2,0,1,2,0),
"date"=rep("8/14/2021",5))
Ideally, I would have a stacked bar for metric 1, next to that a stacked bar for metric 2, and finally a stacked bar for metric 3. I can generate one stacked bar at a time with the following code:
ggplot(data = data,
aes(x = date, group = factor(metric1), fill=factor(metric1))) +
geom_bar(position = "fill") +
scale_y_continuous(labels = scales::percent)
How do I combine this graph with the graphs for metric 2 and 3 so that they are all on the same graph with the same axes? Can it be done without making the data long? My real data is more complicated than the test data, and I'd like to avoid transforming it. Thank you for reading and any help you can offer.
Reshape to 'long' format with pivot_longer and create the bar plot:
library(dplyr)
library(ggplot2)
library(tidyr)
data %>%
  pivot_longer(cols = starts_with('metric'), values_to = 'metric') %>%
  ggplot(aes(x = date, group = factor(metric), fill = factor(metric))) +
  geom_bar() +
  facet_wrap(~ name)
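The question asked for percentage bars sitting side by side on one set of axes. One possible variant, sketched here with the question's sample data, puts the metric name on the x-axis instead of faceting and uses position = "fill" as in the asker's own attempt. The names long and p are illustrative only, and the data still has to be reshaped to long format first.

```r
library(tidyr)
library(ggplot2)

# Sample data from the question
data <- data.frame(participant = c(1, 2, 3, 4, 5),
                   metric1 = c(0, 1, 2, 0, 1),
                   metric2 = c(1, 2, 0, 1, 2),
                   metric3 = c(2, 0, 1, 2, 0),
                   date = rep("8/14/2021", 5))

# One row per participant and metric
long <- pivot_longer(data, cols = starts_with("metric"),
                     names_to = "name", values_to = "metric")

# One stacked 100% bar per metric, all sharing the same y-axis
p <- ggplot(long, aes(x = name, fill = factor(metric))) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = NULL, y = "share of participants", fill = "value")
print(p)
```

With several dates in the real data, adding facet_wrap(~ date) would give one panel per date while keeping the three bars together.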

cluster bar plot in R

I am trying to create a clustered bar plot for 3 different types of precipitation data. I've been doing various searches, how this might be done in R with a similar data set. However, I couldn't find any good help.
This is the dataset I am currently using. I have tried adding multiple geom_bar() but that didn't work out. See attempt below:
ggplot(ppSAcc,aes(x=date,y=as.numeric(Precipitation)))+geom_bar(stat="identity",aes(color="blue"),show.legend=FALSE,size=1)+
geom_bar(ppMAcc,stat="identity",aes(x=date,y=as.numeric(Precipitation),color="purple"),show.legend = FALSE,size=1)+
labs(title="Accumulated Solid Precipitation (Snow)",y="Precipitation (mm)")
In my second attempt, I tried creating a dataframe which includes all three precipitation types.
data<-data.frame(date=ppSAcc$date,snow=ppSAcc$Precipitation,mixed=ppMAcc$Precipitation,rain=ppRAcc$Precipitation)
Which gave me the dataframe shown above.
This is where I am stuck. I started with ggplot(data,aes(x=date))+geom_bar(position = "dodge",stat = "identity"), but I'm not sure how to write the code so that I get three bars (snow, mixed, rain) for each year. I'm not sure how to set up the aes() part.
You need to reshape your dataframe into a longer format before plotting it with ggplot2. You can use the pivot_longer function from tidyr:
library(tidyr)
library(dplyr)
library(ggplot2)
library(lubridate)
# `data` is the combined data frame built in the question
data %>%
  pivot_longer(-date, names_to = "var", values_to = "val") %>%
  ggplot(aes(x = ymd(date), y = val, fill = var)) +
  geom_col(position = position_dodge())
Does this answer your question?
If not, please provide a reproducible example of your dataset by following this guide: How to make a great R reproducible example
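Since the original data was only posted as an image, here is a self-contained sketch of the same idea on made-up numbers. The column names date, snow, mixed and rain follow the data frame built in the question; all values are hypothetical, and as.Date() is used in place of lubridate::ymd() only to keep the dependencies small.

```r
library(tidyr)
library(ggplot2)

# Hypothetical values; only the column names come from the question
data <- data.frame(
  date  = c("2001-01-01", "2002-01-01", "2003-01-01"),
  snow  = c(120, 95, 140),
  mixed = c(30, 42, 25),
  rain  = c(310, 280, 335)
)

# One row per date and precipitation type
long <- pivot_longer(data, cols = -date, names_to = "var", values_to = "val")

# Three dodged bars (mixed, rain, snow) at each date
p <- ggplot(long, aes(x = as.Date(date), y = val, fill = var)) +
  geom_col(position = position_dodge()) +
  labs(x = "date", y = "Precipitation (mm)", fill = "type")
print(p)
```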

Grouped barplot with ggplot2: grouping two numeric variables under each year

My problem is simple but I have not been able to find a post that solves it.
Here is my data set DF:
  Year  CO2Seq CO2Seq2
1 2000 1135704 1107400
2 2003 3407111 3444508
3 2010 1703555 1661100
4 2015 2271407 2296339
I would like to create a barplot where the bars CO2Seq and CO2Seq2 are next to each other for each year.
For the moment, I have only been able to create a simple barplot for CO2Seq with this script
ggplot(DF,aes(x=factor(Year), y=CO2Seq))+geom_bar(stat="identity")
Could you help me?
Thanks a lot
ggplot has generally been designed for use with long rather than wide data, so the first step is to reshape your data, then plotting is straightforward.
library(ggplot2)
library(tidyr)
# DF is the data frame from the question (R names are case-sensitive)
DF %>%
  pivot_longer(cols = -Year) %>%
  ggplot(aes(x = factor(Year), y = value, fill = name)) +
  geom_bar(stat = "identity", position = "dodge")

Plot multicolor vertical lines by using ggplot to show average time taken for each type as facet. Each type will have different vertical lines

I want to plot a chart in R that shows vertical lines for each type, with one facet per type.
df is a data frame giving the time in minutes that each person (X, Y, Z) takes to get from A to B, from B to C, and so on.
I have tried the code below but have not been able to get the result.
df<-data.frame(type =c("X","Y","Z"), "A_to_B"= c(20,56,57), "B_to_C"= c(10,35,50), "C_to_D"= c(53,20,58))
ggplot(df, aes(x = 1,y = df$type)) + geom_line() + facet_grid(type~.)
I have attached an image from Excel showing the desired output, but I need only vertical lines where the segments join instead of the entire horizontal bar.
I would not use facets in your case, because there are only 3 variables.
So, to get a similar plot in R using ggplot2, you first need to reshape the dataframe using gather() from tidyr (part of the tidyverse). Then it's in long or tidy format.
To my knowledge, there is no geom that does what you want in standard ggplot2, so some fiddling is necessary.
However, it's possible to produce the plot using geom_segment() and cumsum():
library(tidyverse)
# First reshape to long format and calculate cumulative sums by type.
# This works because the route names begin with A, B, C
# and are thus ordered correctly.
df <- df %>%
  gather(-type, key = "route", value = "time") %>%
  group_by(type) %>%
  mutate(cumulative_time = cumsum(time))
segment_length <- 0.2
df %>%
  mutate(route = fct_rev(route)) %>%
  ggplot(aes(color = route)) +
  # factor() is needed because type is a character column in R >= 4.0
  geom_segment(aes(x = as.numeric(factor(type)) + segment_length,
                   xend = as.numeric(factor(type)) - segment_length,
                   y = cumulative_time, yend = cumulative_time)) +
  # labels follow the factor levels: X = 1, Y = 2, Z = 3
  scale_x_discrete(limits = c("1", "2", "3"), labels = c("X", "Y", "Z")) +
  coord_flip() +
  ylim(0, max(df$cumulative_time)) +
  labs(x = "type")
EDIT
This solution works because it assigns labels to X, Y, Z in scale_x_discrete. Be careful to assign the correct labels! Also compare this answer.
