cluster bar plot in R - r

I am trying to create a clustered bar plot for 3 different types of precipitation data. I have searched for how this might be done in R with a similar data set, but I couldn't find anything helpful.
This is the dataset I am currently using. I have tried adding multiple geom_bar() layers, but that didn't work. See the attempt below:
ggplot(ppSAcc,aes(x=date,y=as.numeric(Precipitation)))+geom_bar(stat="identity",aes(color="blue"),show.legend=FALSE,size=1)+
geom_bar(ppMAcc,stat="identity",aes(x=date,y=as.numeric(Precipitation),color="purple"),show.legend = FALSE,size=1)+
labs(title="Accumulated Solid Precipitation (Snow)",y="Precipitation (mm)")
In my second attempt, I tried creating a dataframe which includes all three precipitation types.
data<-data.frame(date=ppSAcc$date,snow=ppSAcc$Precipitation,mixed=ppMAcc$Precipitation,rain=ppRAcc$Precipitation)
Which gave me the dataframe shown above.
This is where I am stuck. I started with ggplot(data,aes(x=date))+geom_bar(position = "dodge",stat = "identity") but I'm not sure how to write the code so that I get three columns (snow, mixed, rain) for each year. I'm not sure how to set the aes() part.

You need to reshape your dataframe into a longer format before plotting it with ggplot2. You can use the pivot_longer function from tidyr:
library(tidyr)
library(dplyr)
library(ggplot2)
library(lubridate)
df %>% pivot_longer(-date, names_to = "var", values_to = "val") %>%
ggplot(aes(x = ymd(date), y= val, fill = var))+
geom_col(position = position_dodge())
Does this answer your question?
If not, please provide a reproducible example of your dataset by following this guide: How to make a great R reproducible example
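Since the question does not include the data itself, here is a minimal self-contained sketch of the same idea. The `precip` data frame and all of its values are made up for illustration; only the column layout (date, snow, mixed, rain) follows the question. The year is turned into a discrete label so that each year gets its own group of three bars.

```r
library(tidyr)
library(ggplot2)

# Hypothetical data: one row per year, one column per precipitation type
precip <- data.frame(
  date  = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01")),
  snow  = c(120, 95, 140),
  mixed = c(30, 45, 25),
  rain  = c(400, 380, 420)
)

# Long format: one row per date and precipitation type
precip_long <- pivot_longer(precip, -date, names_to = "type", values_to = "mm")

# fill = type gives one bar per type; position_dodge() puts them side by side
p <- ggplot(precip_long, aes(x = factor(format(date, "%Y")), y = mm, fill = type)) +
  geom_col(position = position_dodge()) +
  labs(x = "Year", y = "Precipitation (mm)", fill = "Type")
p
```

The key point is that fill is mapped to the column holding the former column names, so ggplot2 can dodge one bar per precipitation type.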

Related

geom_bar combine 2 dataset onto one graph

I have two dataframes with paired scores, each scoring patients on a 1-8 scoring system (Where 1= managing well and 8 = terminally ill).
One score is done by the patient and one by the clinician.
sample data
df <- data.frame(Patient = c(1,1,2,4,5,3,2,6,7,6,3,4,2,3,5,6,7,3,8,1), Clinican= c(1,2,2,5,4,5,4,4,4,2,3,5,4,6,5,4,3,7,7,1))
I'd like to create a bar chart similar to the one below using my dataset.
Any help would be much appreciated.
I believe I need pivot_longer (from tidyr), similar to this post:
geom_bar two datasets together in R
Here is a solution using pivot_longer and geom_bar as you asked.
Libraries
library(dplyr)
library(tidyr)
library(ggplot2)
Solution
You can change name and value for whatever name you prefer.
Also, the x-axis is categorical, so we have to convert it to a factor with mutate.
You can then recode the factor value for the labels you need (e.g., 'well', 'very fit'...)
df %>%
pivot_longer(Patient:Clinican, names_to = "name", values_to = "value") %>%
mutate(value = factor(value)) %>%
ggplot(aes(x = value, fill = name)) +
geom_bar(position = "dodge")
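The recoding step mentioned above could look roughly like this. The labels in `score_labels` are placeholders (the real names of the 8 scale points are not given in the question), so treat this as a sketch rather than the final code.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(
  Patient  = c(1,1,2,4,5,3,2,6,7,6,3,4,2,3,5,6,7,3,8,1),
  Clinican = c(1,2,2,5,4,5,4,4,4,2,3,5,4,6,5,4,3,7,7,1)
)

# Placeholder labels; replace with the actual names of the 8 scale points
score_labels <- paste("Score", 1:8)

long <- df %>%
  pivot_longer(Patient:Clinican, names_to = "name", values_to = "value") %>%
  # Fix the level order to 1..8 and attach a label to each level
  mutate(value = factor(value, levels = 1:8, labels = score_labels))

p <- ggplot(long, aes(x = value, fill = name)) +
  # preserve = "single" keeps bar widths equal when one rater never used a score
  geom_bar(position = position_dodge(preserve = "single")) +
  # drop = FALSE keeps scores on the axis even if nobody used them
  scale_x_discrete(drop = FALSE)
p
```

Setting levels explicitly also guarantees the bars appear in score order rather than in whatever order the values happen to occur.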

Various variables (y-axis) in the same graph grouped by factors in R?

I have a problem making a graph in R. I have the following data, with flower type, and a color index for different points (distance_petal).
flower_type<-c(rep("blue",3),rep("red",3))
distance_petal1<-c(2,3,2,7,6,7)
distance_petal2<-c(2,2,1,8,6,8)
distance_petal3<-c(1,1,2,9,9,8)
data<-as.data.frame(cbind(flower_type,distance_petal1,distance_petal2,distance_petal3))
data$flower_type<-as.factor(flower_type)
I am trying to make a graph showing distance_petal1, distance_petal2 and distance_petal3 in the X-axis, and the value in the Y-axis. So I want to obtain 2 lines, one for each flower type with 3 points in the Y-axis.
I mean, I want to make something like this, but instead of plotting all values, just plotting the mean for each variable for each factor.
library(GGally)
ggparcoord(data,
columns = 2:4, groupColumn = 1,scale="globalminmax"
)
Does anyone know how to do this?
Thank you very much in advance, have a nice day!
Using ggplot, dplyr and tidyr you can try:
I'm not sure whether you want a line, as in the GGally example, or points, as described in the question, so I have included both. Just remove the geom_line or geom_point line from the ggplot code to get what you need.
library(dplyr)
library(tidyr)
library(ggplot2)
data %>%
pivot_longer(-1) %>%
mutate(value = as.numeric(value))%>%
group_by(flower_type, name) %>%
summarise(mean = mean(value)) %>%
ggplot(aes(name, mean, colour = flower_type))+
geom_line(aes(group = flower_type))+
geom_point()
Created on 2021-04-25 by the reprex package (v2.0.0)
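As a side note, the as.numeric() step above is only needed because cbind() on mixed character and numeric vectors produces a character matrix. A sketch that builds the data frame directly (same values as in the question) keeps the columns numeric, so the means can be computed without conversion:

```r
library(dplyr)
library(tidyr)

# Building the data frame directly keeps the distance columns numeric
data <- data.frame(
  flower_type     = factor(c(rep("blue", 3), rep("red", 3))),
  distance_petal1 = c(2, 3, 2, 7, 6, 7),
  distance_petal2 = c(2, 2, 1, 8, 6, 8),
  distance_petal3 = c(1, 1, 2, 9, 9, 8)
)

# One mean per flower type and variable; .groups = "drop" returns an ungrouped tibble
means <- data %>%
  pivot_longer(-flower_type) %>%
  group_by(flower_type, name) %>%
  summarise(mean = mean(value), .groups = "drop")
means
```

The resulting `means` table can be passed to the same ggplot call as above.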

Plotting multiple continuous variables by frequencies together with same scale margin in r

I am trying to visualize my data. All I need is a plot to compare the distribution of the different variables.
I already tried with multi.hist. Actually, that would be enough for me. But the problem is, I cannot manage the margins of the scale to stay the same for each histogram to compare the distributions as it is already trying to fit for each variable.
I also have a categorical variable in my data (topic 1-5). Maybe there is a good way to visualize this too, but it is not essential if it is not easy to do.
I tried a lot with ggplot as well, but I am rather new to R and could not produce anything good yet.
Below you see an example for my data.
Thank you very much in advance :)
My data:
Data
Try first converting your data to long format:
df2 <- df %>% pivot_longer(cols = 1:5, names_to = 'set', values_to = 'sub_means')
Then you can do a density plot, either colouring by set and faceting by topic:
df2 %>% ggplot(aes(x = sub_means, fill = set)) + geom_density() + facet_wrap(~topic)
Or vice versa:
df2 %>% ggplot(aes(x = sub_means, fill = factor(topic))) + geom_density() + facet_wrap(~set)
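If histograms on a common scale are what you are after (as with multi.hist), a faceted histogram is one option. The sketch below uses simulated data, because the original data was not posted; the column names v1 to v5 are made up, while `topic`, `set` and `sub_means` follow the code above.

```r
library(tidyr)
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the real data: five numeric columns plus a topic column
df <- data.frame(
  v1 = rnorm(50, 3),   v2 = rnorm(50, 4), v3 = rnorm(50, 2),
  v4 = rnorm(50, 3.5), v5 = rnorm(50, 2.5),
  topic = sample(1:5, 50, replace = TRUE)
)

df2 <- pivot_longer(df, cols = 1:5, names_to = "set", values_to = "sub_means")

# scales = "fixed" (the default) gives every panel the same x and y axes
p <- ggplot(df2, aes(x = sub_means)) +
  geom_histogram(bins = 20) +
  facet_wrap(~set, scales = "fixed")
p
```

Because all panels share the same axes, the distributions can be compared directly.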

ggplot (geom_bar) not sorting y-axis according to numeric values

I am trying to sort the y-axis numerically according to population values. I have tried other Stack Overflow answers that suggested reorder or converting columns to a numeric data type (as.numeric), but those solutions do not seem to work for me.
Without using reorder, the plot is sorted alphabetically:
Using reorder, the plot is sorted as such:
The code I am using:
library(ggplot2)
library(ggpubr)
library(readr)
library(tidyverse)
library(lemon)
library(dplyr)
pop_data <- read_csv("respopagesextod2011to2020.csv")
temp2 <- pop_data %>% filter(`Time` == '2019')
ggplot(data=temp2,aes(x=reorder(PA, Pop),y=Pop)) + geom_bar(stat='identity') + coord_flip()
How should I go about sorting my y-axis? Any help will be much appreciated. Thanks!
I am using data filtered from: https://www.singstat.gov.sg/-/media/files/find_data/population/statistical_tables/singapore-residents-by-planning-areasubzone-age-group-sex-and-type-of-dwelling-june-20112020.zip
The functions are all working as intended. You don't see the expected result because reorder() orders PA using a summary of the individual observations (by default, the mean of Pop within each PA), whereas the bars you are plotting stack all the rows for each PA and therefore show their sum.
The easiest solution is probably to perform the summarizing first, then plot and reorder the summarized dataset. This way, the reordering reflects an ordering of the summarized data, which is what you want.
temp3 <- pop_data %>% filter(`Time` == '2019') %>%
group_by(PA) %>%
summarize(Pop = sum(Pop))
ggplot(data=temp3, aes(x=reorder(PA, Pop),y=Pop)) +
geom_bar(stat='identity') + coord_flip()
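An alternative that avoids the separate summary step is to tell reorder() to use the sum instead of its default, the mean. Here is a small sketch with a made-up data frame `toy` (the values are invented purely to show the difference):

```r
library(ggplot2)

# Hypothetical data: several rows per planning area
toy <- data.frame(
  PA  = c("A", "A", "A", "B", "C", "C"),
  Pop = c(10, 10, 10, 25, 5, 40)
)

# Per-PA means are A = 10, B = 25, C = 22.5; per-PA sums are A = 30, B = 25, C = 45
by_mean <- levels(reorder(toy$PA, toy$Pop))            # default FUN = mean
by_sum  <- levels(reorder(toy$PA, toy$Pop, FUN = sum)) # matches the stacked bar heights

# Bars are stacked sums, so ordering by sum gives a properly sorted plot
p <- ggplot(toy, aes(x = reorder(PA, Pop, FUN = sum), y = Pop)) +
  geom_col() +
  coord_flip()
p
```

In this toy example `by_mean` and `by_sum` differ, which is the same mismatch that makes the original plot look unsorted.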

bar plot in r with multiple bars per x variable

How do I make a bar plot so that every treatment group on the x-axis displays two bars, representing avgRDM and avgSDM? I would like the bars to be colored by variable (avgRDM or avgSDM).
The data for the plot is in the following image:
Thank you
I'm a big fan of ggplot, so here is an option in that vein. It's easiest (and tidiest) to reshape the data from wide to long and then map the fill aesthetic to the key:
library(tidyverse)
df %>%
gather(key, val, -trt) %>%
ggplot(aes(trt, val, fill = key)) +
geom_col(position = "dodge2")
PS. For future posts, please share data in a reproducible way using e.g. dput; screenshots are never a good idea as it requires respondents to manually type out your sample data.
Sample data
df <- read.table(text =
"trt avgRDM avgSDM
F10 49.5 108.333
NH4Cl 12.583 50.25
NH4NO3 17.333 73.33
'F10 + ANU843' 6.0 7.333", header = T)
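For what it's worth, gather() still works but is marked as superseded in tidyr; the same reshape can be written with pivot_longer(). A sketch using the sample data above:

```r
library(tidyr)
library(ggplot2)

df <- read.table(text =
"trt avgRDM avgSDM
F10 49.5 108.333
NH4Cl 12.583 50.25
NH4NO3 17.333 73.33
'F10 + ANU843' 6.0 7.333", header = TRUE)

# Equivalent of gather(key, val, -trt)
long <- pivot_longer(df, -trt, names_to = "key", values_to = "val")

# One bar per key within each treatment group
p <- ggplot(long, aes(trt, val, fill = key)) +
  geom_col(position = "dodge2")
p
```

The plot is the same as with gather(); only the reshaping call changes.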
