I have the following data frame:
Quarter x y p q
1 2001 8.714392 8.714621 3.3648435 3.3140090
2 2002 8.671171 8.671064 0.9282508 0.9034387
3 2003 8.688478 8.697413 6.2295996 8.4379698
4 2004 8.685339 8.686349 3.7520135 3.5278024
My goal is to generate a facet plot where the x and y columns share one panel and p and q share another, instead of 4 separate facets.
If I do the following:
x.df.melt <- melt(x.df[,c('Quarter','x','y','p','q')],id.vars=1)
ggplot(x.df.melt, aes(Quarter, value, col=variable, group=1)) + geom_line()+
facet_grid(variable~., scales='free_y') +
scale_color_discrete(breaks=c('x','y','p','q'))
This gives me all four series in 4 separate facets, but how do I combine x and y into one facet and p and q into another? Preferably with no legend.
One idea would be to create a new grouping variable:
x.df.melt$var <- ifelse(x.df.melt$variable == "x" | x.df.melt$variable == "y", "A", "B")
You can then use it for faceting while still using variable for grouping:
ggplot(x.df.melt, aes(Quarter, value, col=variable, group=variable)) + geom_line()+
facet_grid(var~., scales='free_y') +
scale_color_discrete(breaks=c('x','y','p','q'), guide = "none")
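If the number of series grows, the ifelse can get awkward. Here is a minimal sketch of the same idea using a named lookup vector instead. The data is copied from the question; the name facet_of and the labels "A"/"B" are just placeholders I made up, and hiding the legend via theme() is one of several ways to do it.

```r
library(reshape2)
library(ggplot2)

# Data copied from the question
x.df <- data.frame(
  Quarter = 2001:2004,
  x = c(8.714392, 8.671171, 8.688478, 8.685339),
  y = c(8.714621, 8.671064, 8.697413, 8.686349),
  p = c(3.3648435, 0.9282508, 6.2295996, 3.7520135),
  q = c(3.3140090, 0.9034387, 8.4379698, 3.5278024)
)

x.df.melt <- melt(x.df, id.vars = "Quarter")

# Lookup vector: series name -> facet label ("A"/"B" are arbitrary placeholders)
facet_of <- c(x = "A", y = "A", p = "B", q = "B")
x.df.melt$var <- unname(facet_of[as.character(x.df.melt$variable)])

# Sanity check: each facet should hold two series of four points
table(x.df.melt$var)

plt <- ggplot(x.df.melt,
              aes(Quarter, value, colour = variable, group = variable)) +
  geom_line() +
  facet_grid(var ~ ., scales = "free_y") +
  theme(legend.position = "none")  # no legend, as requested
plt
```

Adding another pair then only means adding entries to facet_of.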
I think beetroot's answer above is more elegant but I was working on the same problem and arrived at the same place a different way. I think it is interesting because I used a "double melt" (yum!) to line up the x,y/p,q pairs. Also, it demonstrates tidyr::gather instead of melt.
library(tidyr)
library(dplyr) # needed for group_by(), filter() and the joins below
x.df<- data.frame(Year=2001:2004,
x=runif(4,8,9),y=runif(4,8,9),
p=runif(4,3,9),q=runif(4,3,9))
x.df.melt<-gather(x.df,"item","item_val",-Year,-p,-q) %>%
group_by(item,Year) %>%
gather("comparison","comp_val",-Year,-item,-item_val) %>%
filter((item=="x" & comparison=="p")|(item=="y" & comparison=="q"))
> x.df.melt
# A tibble: 8 x 5
# Groups: item, Year [8]
Year item item_val comparison comp_val
<int> <chr> <dbl> <chr> <dbl>
1 2001 x 8.400538 p 5.540549
2 2002 x 8.169680 p 5.750010
3 2003 x 8.065042 p 8.821890
4 2004 x 8.311194 p 7.714197
5 2001 y 8.449290 q 5.471225
6 2002 y 8.266304 q 7.014389
7 2003 y 8.146879 q 7.298253
8 2004 y 8.960238 q 5.342702
See below for the plotting statement.
One weakness of this approach (and of beetroot's use of ifelse) is that the filter statement quickly becomes unwieldy if you have a lot of pairs to compare. In my use case I was comparing mutual fund performance to a number of benchmark indices, and each fund has a different benchmark. I solved this with a table of metadata that pairs the fund tickers with their respective benchmarks, and then used left_join/right_join. In this case:
#create meta data
pair_data<-data.frame(item=c("x","y"),comparison=c("p","q"))
#create comparison name for each item name
x.df.melt2<-x.df %>% gather("item","item_val",-Year) %>%
left_join(pair_data)
#join comparison data alongside item data
x.df.melt2<-x.df.melt2 %>%
select(Year,item,item_val) %>%
rename(comparison=item,comp_val=item_val) %>%
right_join(x.df.melt2,by=c("Year","comparison")) %>%
na.omit() %>%
group_by(item,Year)
ggplot(x.df.melt2,aes(Year,item_val,color="item"))+geom_line()+
geom_line(aes(y=comp_val,color="comp"))+
guides(col = guide_legend(title = NULL))+
ylab("Value")+
facet_grid(~item)
Since there is no need for a new grouping variable, we preserve the names of the reference items as labels for the facet plot.
I am creating a bar graph with continuous x-labels of 'Fiscal Years', such as "2009/10", "2010/11", etc. I have a column in my dataset with a specific Fiscal Year that I would like the x-labels to begin at (see example image below). Then, I would like the x-labels to be every continuous Fiscal Year until the present. The last x-label should be "2018/19". When I try to set the limits with scale_x_continuous, I receive an error of Error: Discrete value supplied to continuous scale. However, if I use 'scale_x_discrete', I get a graph with only two bars: my chosen "Start" date and the "End" of 2018/19.
Start<-Project_x$Start[c(1)]
End<-"2018/2019"
ggplot(Project_x, (aes(x=`FY`, y=Amount)), na.rm=TRUE)+
geom_bar(stat="identity", position="stack")+
scale_x_continuous(limits = c(Start,End))
Error: Discrete value supplied to continuous scale
Thank you.
My data is:
df <- data.frame(Project = c(5, 6, 5, 5, 9, 5),
FY = c("2010/11","2017/18","2012/13","2011/12","2003/04","2000/01"),
Start=c("2010/11", "2011/12", "2010/11", "2010/11", "2001/02", "2010/11"),
Amount = c(500,502,788,100,78,NA))
To use the code in the answer below, I need to base my Start_Year off of my Start column rather than the FY column, and the graph should just be for Project #5.
as.tibble(df) %>%
mutate(Start_Year = as.numeric(sub("/\\d{2}","",Start)))
xlabel_start<-subset(df$Start_Year, Project == 5)
xlabel_end<-2018
filter(between(Start_Year,xlabel_start,xlabel_end)) %>%
ggplot(aes(x = FY, y = Amount))+
geom_col()
When running this, my xlabel_start is NULL.
In ggplot2, continuous scales are reserved for numeric values. Here, your fiscal years are stored as character (or factor) values, so they are treated as discrete and sorted alphabetically by ggplot2.
One possible solution to get your expected plot is to create a new variable containing the starting year of the fiscal year and filter for values between 2010 and 2018.
But first, we are going to isolate the project and the starting year of interest by creating a new dataframe:
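library(dplyr)
xlabel_end <- 2018 # last fiscal start year to keep, as defined in the question
xlabel_start <- as_tibble(df) %>% # as_tibble() replaces the deprecated as.tibble()
mutate(Start_Year = as.numeric(sub("/\\d{2}","",Start))) %>%
distinct(Project, Start_Year) %>%
filter(Project == 5)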
# A tibble: 1 x 2
Project Start_Year
<dbl> <dbl>
1 5 2010
Now, using almost the same pipeline, we can isolate the values of interest by doing:
library(tidyverse)
as_tibble(df) %>%
mutate(Year = as.numeric(sub("/\\d{2}","",FY))) %>%
filter(Project == 5 & between(Year,xlabel_start$Start_Year,xlabel_end))
# A tibble: 3 x 5
Project FY Start Amount Year
<dbl> <fct> <fct> <dbl> <dbl>
1 5 2010/11 2010/11 500 2010
2 5 2012/13 2010/11 788 2012
3 5 2011/12 2010/11 100 2011
And once you have done this, you can simply add the ggplot plotting part at the end of this pipe sequence:
library(tidyverse)
as_tibble(df) %>%
mutate(Year = as.numeric(sub("/\\d{2}","",FY))) %>%
filter(Project == 5 & between(Year,xlabel_start$Start_Year,xlabel_end)) %>%
ggplot(aes(x = FY, y = Amount))+
geom_col()
Does it answer your question?
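If you also want every fiscal year from the start year up to "2018/19" to appear on the x-axis, even years without data, one option is to pass the full set of labels as limits to scale_x_discrete. This is only a sketch: fy_labels is a hypothetical helper I made up for illustration, and the small data frame just repeats the three Project 5 rows from the filtered output above.

```r
library(ggplot2)

# Hypothetical helper (not from the question): build labels like "2010/11"
fy_labels <- function(start_year, end_year) {
  years <- seq(start_year, end_year)
  sprintf("%d/%02d", as.integer(years), as.integer((years + 1) %% 100))
}

all_fy <- fy_labels(2010, 2018)
all_fy

# The three Project 5 rows kept by the filter above
p5 <- data.frame(
  FY = c("2010/11", "2012/13", "2011/12"),
  Amount = c(500, 788, 100)
)

# limits fixes both the order and the full set of x positions;
# fiscal years without data simply show up as empty slots
ggplot(p5, aes(x = FY, y = Amount)) +
  geom_col() +
  scale_x_discrete(limits = all_fy)
```

In your real pipeline you would presumably replace 2010 with xlabel_start$Start_Year.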
My data is
           PC_Name Electors_2009 Electors_2014 Electors_2019 Voters_2009 Voters_2014 Voters_2019
 1        Amritsar       1241099       1477262       1507875      814503     1007196      859513
 2  Anandpur Sahib       1338596       1564721       1698876      904606     1086563     1081727
 3        Bhatinda       1336790       1525289       1621671     1048144     1176767     1200810
 4        Faridkot       1288090       1455075       1541971      930521     1032107      974947
 5 Fatehgarh Sahib       1207556       1396957       1502861      838150     1030954      985948
 6        Ferozpur       1342488       1522111       1618419      956952     1105412     1172033
 7       Gurdaspur       1318967       1500337       1595284      933323     1042699     1103887
 8      Hoshiarpur       1299234       1485286       1597500      843123      961297      990791
 9       Jalandhar       1339842       1551497       1617018      899607     1040762     1018998
10   Khadoor Sahib       1340145       1563256       1638842      946690     1040518     1046032
11        Ludhiana       1309308       1561201       1683325      846277     1100457     1046955
12         Patiala       1344864       1580273       1739600      935959     1120933     1177903
13         Sangrur       1251401       1424743       1529432      931247     1099467     1105888
I have written the code
data <- read.csv(file = "Punjab data 3.csv")
data
library(ggplot2)
library(reshape2)
long <- reshape2::melt(data, id.vars = "PC_Name")
ggplot(long, aes(PC_Name, value, fill = variable)) + geom_freqpoly(stat="identity",binwidth = 500)
I am trying to plot something like this
I tried a line chart with geom_line, but I am not sure where the problem lies. I am now trying geom_polygon, but it is not plotting. I want to compare either voters or electors (not both at once) across the years 2009, 2014 and 2019. Sorry for my bad English.
I want to plot PC_Name on x-axis and compare Electors_2009 with Voters_2009 and Electors_2014 with Voters_2014 and all these on same graph. So on y axis I will have 'values' after melting.
It sounds like you are interested in PC_Name on the horizontal axis and value (after melting) on the vertical axis. Perhaps you might be interested in a barplot that compares electors and voters side by side?
As suggested by @camille, you could split your data frame's variable column after melting into two columns (one with either Electors or Voters, and the other with the year). This would provide flexibility in plot options.
Here are a couple of possibilities to start with:
You could order your variable factor how you would like (e.g., Electors_2009, Voters_2009, Electors_2014, etc. for comparison) and use geom_bar.
You could use facet_wrap to make comparisons between Electors and Voters by year.
library(ggplot2)
library(reshape2)
long <- reshape2::melt(data, id.vars = "PC_Name")
# Split electors/voters from year into 2 columns
long <- cbind(long, colsplit(long$variable, "_", c("type", "year")))
# Change order of variable factor for comparisons
long$variable <- factor(long$variable, levels =
c("Electors_2009", "Voters_2009",
"Electors_2014", "Voters_2014",
"Electors_2019", "Voters_2019"))
# Plot value vs. PC_Name using barplot (all years together)
ggplot(long, aes(PC_Name, value, fill = variable)) +
geom_bar(position = "dodge", stat = "identity")
# Show example plot faceted by year
ggplot(long, aes(PC_Name, value, fill = type)) +
geom_bar(position = "dodge", stat = "identity") +
facet_wrap(~year, ncol = 1)
Please let me know if this is what you had in mind. There would be alternative options available.
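In case the colsplit step is unclear, here is a small sketch of what it does on its own, applied to just the six column names rather than the full melted data. The vector name vars is only for illustration.

```r
library(reshape2)

# The six melted variable names from the data above
vars <- c("Electors_2009", "Voters_2009", "Electors_2014",
          "Voters_2014", "Electors_2019", "Voters_2019")

# Split each name at "_" into a 'type' part and a 'year' part
parts <- colsplit(vars, "_", c("type", "year"))
parts
str(parts)
```

The type column is what drives fill in the faceted plot, and year is what facet_wrap splits on.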
Sorry if this question already exists - was googling for a while now already and didn't find anything.
I am relatively new to R and learning while doing all of this.
I'm supposed to create some PDF via r markdown that analyses patient-data with specific main-diagnosis and secondary-diagnosis. For this I'm supposed to plot some numbers via ggplot (geom_bar and geom_boxplot).
So what I do so far is, I retrieve data-sets that include both codes via SQL and load them into data.table-objects afterwards. Afterwards I join them to get the data I need.
After this I add columns that contain sub-strings of those codes, and others that contain the count of each sub-string (so I can plot the occurrences of every code).
I wanted now for example to put certain data.table into a geom_bar or geom_boxplot and make it visible. This actually works, but my y-axis has a weird scale that doesn't fit the numbers it actually should show. The proportions of the bars are also not accurate.
For example: one diagnoses appears 600 times and the other one 1000 times. The y-axis shows steps of 0 - 500.000 - 1.000.000 - 1.500.000 - ....
The Bar that shows 600 is super small and the bar with 1000 goes up to 1.500.000
If I first create a new variable, count what I need via count(), and plot that, it just works. The columns I use for the y-axis have the same data type (integer) in both variables.
So here is just how I create the data.table that I use for plotting
exazerbationsHdComorbiditiesNd <- allExazerbationsHd[allComorbiditiesNd, on="encounter_num", nomatch=0]
exazerbationsHdComorbiditiesNd <- exazerbationsHdComorbiditiesNd[, c("i.DurationGroup", "i.DurationInDays", "i.start_date", "i.end_date", "i.duration", "i.patient_num"):=NULL]
exazerbationsHdComorbiditiesNd[ , IcdHdCodeCount := .N, by = concept_cd]
exazerbationsHdComorbiditiesNd[ , IcdHdCodeClassCount := .N, by = IcdHdClass]
If I want to bar-plot now for example IcdHdClass by IcdHdCodeClassCount I do following:
ggplot(exazerbationsHdComorbiditiesNd, aes(exazerbationsHdComorbiditiesNd$IcdHdClass, exazerbationsHdComorbiditiesNd$IcdHdCodeClassCount, label=exazerbationsHdComorbiditiesNd$IcdHdCodeClassCount)) + geom_bar(stat = "identity") + geom_text(vjust = 0, size = 5)
It outputs said bar-plot with weird proportions.
If I do first:
plotTest <- count(exazerbationsHdComorbiditiesNd, exazerbationsHdComorbiditiesNd$IcdHdClass)
And then bar-plot it:
ggplot(plotTest, aes(plotTest$`exazerbationsHdComorbiditiesNd$IcdHdClass`, plotTest$n, label=plotTest$n)) + geom_bar(stat = "identity") + geom_text(vjust = 0, size = 5)
It's all perfect and works.
I checked also data-types of the columns I needed:
sapply(exazerbationsHdComorbiditiesNd, class)
sapply(plotTest, class)
In both variables the columns I need are of the type character and integer
Edit:
Unfortunately I can't post images, so here are just the links to them.
Here is a screenshot of the plot with wrong y-axis:
https://ibb.co/CbxX1n7
And here is a screenshot of the plot shown right:
https://ibb.co/Xb8gyx1
Here is some example-data that I copied out the data.table object:
Exampledata
Since you added the class counts as an additional column--rather than aggregating--what’s happening is that for each row in your data, the class counts get stacked on top of each other:
library(tidyverse)
set.seed(42)
df <- tibble(class = sample(letters[1:3], 10, replace = TRUE)) %>%
add_count(class, name = "count")
df # this is essentially what your data looks like
#> # A tibble: 10 x 2
#> class count
#> <chr> <int>
#> 1 a 5
#> 2 a 5
#> 3 a 5
#> 4 a 5
#> 5 b 3
#> 6 b 3
#> 7 b 3
#> 8 a 5
#> 9 c 2
#> 10 c 2
ggplot(df, aes(class, count)) + geom_bar(stat = "identity")
You could use position = "identity" so that the bars don’t get stacked:
ggplot(df, aes(class, count)) +
geom_bar(stat = "identity", position = "identity")
However, that creates a whole bunch of unnecessary layers in your plot that you can’t see. A better approach would be to drop the extra rows from your data before plotting:
df %>%
distinct(class, count)
#> # A tibble: 3 x 2
#> class count
#> <chr> <int>
#> 1 a 5
#> 2 b 3
#> 3 c 2
df %>%
distinct(class, count) %>%
ggplot(aes(class, count)) +
geom_bar(stat = "identity")
Created on 2019-09-05 by the reprex package (v0.3.0.9000)
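Since the question works with data.table, the same aggregation can be sketched there as well. This assumes a table with one row per encounter and a class column; the name IcdHdClass is taken from the question, but the values below are made-up dummy data, not the asker's.

```r
library(data.table)
library(ggplot2)

# Dummy stand-in for the real table; only the class column matters here
dt <- data.table(IcdHdClass = c("J44", "J44", "J44", "J45", "J45", "J18"))

# Aggregate instead of adding a column: one row per class,
# N = number of rows in that class
class_counts <- dt[, .N, by = IcdHdClass]
class_counts

ggplot(class_counts, aes(IcdHdClass, N, label = N)) +
  geom_col() +
  geom_text(vjust = 0, size = 5)
```

The difference from `:=` is that `dt[, .N, by = ...]` returns a new, collapsed table, so each bar is drawn from exactly one row.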
I am using the ..count.. transformation in geom_bar and get the warning
position_stack requires non-overlapping x intervals when some of my categories have few counts.
This is best explained using some mock data (my data involves direction and windspeed and I retain names relating to that)
#make data
set.seed(12345)
FF=rweibull(100,1.7,1)*20 #mock speeds
FF[FF>60]=59
dir=sample.int(10,size=100,replace=TRUE) # mock directions
#group into speed classes
FFcut=cut(FF,breaks=seq(0,60,by=20),ordered_result=TRUE,right=FALSE,drop=FALSE)
# stuff into data frame & plot
df=data.frame(dir=dir,grp=FFcut)
ggplot(data=df,aes(x=dir,y=(..count..)/sum(..count..),fill=grp)) + geom_bar()
This works fine, and the resulting plot shows the frequency of directions grouped according to speed. It is of relevance that the velocity class with the fewest counts (here "[40,60)") will have 5 counts.
However, using more velocity classes leads to a warning. For instance, with
FFcut=cut(FF,breaks=seq(0,60,by=15),ordered_result=TRUE,right=FALSE,drop=FALSE)
the velocity class with the fewest counts (now "[45,60)") will have only 3 counts and ggplot2 will warn that
position_stack requires non-overlapping x intervals
and the plot will show data in this category spread out along the x axis.
It seems that 5 is the minimum size for a group to have for this to work correctly.
I would appreciate knowing if this is a feature or a bug in stat_bin (which geom_bar is using) or if I am simply abusing geom_bar.
Also, any suggestions how to get around this would be appreciated.
Sincerely
This occurs because df$dir is numeric, so the ggplot object assumes a continuous x-axis, and aesthetic parameter group is based on the only known discrete variable (fill = grp).
As a result, when there simply aren't that many dir values in grp = [45,60), ggplot gets confused over how wide each bar should be. This becomes more visually obvious if we split the plot into different facets:
ggplot(data=df,
aes(x=dir,y=(..count..)/sum(..count..),
fill = grp)) +
geom_bar() +
facet_wrap(~ grp)
> for(l in levels(df$grp)) print(sort(unique(df$dir[df$grp == l])))
[1] 1 2 3 4 6 7 8 9 10
[1] 1 2 3 4 5 6 7 8 9 10
[1] 2 3 4 5 7 9 10
[1] 2 4 7
We can also check manually that the minimum difference between sorted df$dir values is 1 for the first three grp values, but 2 for the last one. The default bar width is thus wider.
The following solutions should all achieve the same result:
1. Explicitly specify the same bar width for all groups in geom_bar():
ggplot(data=df,
aes(x=dir,y=(..count..)/sum(..count..),
fill = grp)) +
geom_bar(width = 0.9)
2. Convert dir to a categorical variable before passing it to aes(x = ...):
ggplot(data=df,
aes(x=factor(dir), y=(..count..)/sum(..count..),
fill = grp)) +
geom_bar()
3. Specify that the group parameter should be based on both df$dir & df$grp:
ggplot(data=df,
aes(x=dir,
y=(..count..)/sum(..count..),
group = interaction(dir, grp),
fill = grp)) +
geom_bar()
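A side note on syntax: in recent ggplot2 versions (3.4.0 onward, as far as I know) the ..count.. notation is deprecated in favour of after_stat(), which has been available since 3.3.0. Below is a sketch of solution 2 rewritten that way, using the mock data from the question with the 15-unit breaks. I left out the drop argument because cut() does not appear to use it.

```r
library(ggplot2)

# Mock data from the question, with the finer speed classes
set.seed(12345)
FF <- rweibull(100, 1.7, 1) * 20
FF[FF > 60] <- 59
dir <- sample.int(10, size = 100, replace = TRUE)

df <- data.frame(
  dir = dir,
  grp = cut(FF, breaks = seq(0, 60, by = 15),
            ordered_result = TRUE, right = FALSE)
)

# Discrete x (factor) plus after_stat() instead of ..count..
p <- ggplot(df, aes(x = factor(dir),
                    y = after_stat(count / sum(count)),
                    fill = grp)) +
  geom_bar()
p
```

The same substitution should work for solutions 1 and 3.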
This doesn't directly solve the issue, because I also don't get what's going on with the overlapping values, but it's a dplyr-powered workaround, and might turn out to be more flexible anyway.
Instead of relying on geom_bar to take the cut factor and give you shares via ..count../sum(..count..), you can easily enough just calculate those shares yourself up front, and then plot your bars. I personally like having this type of control over my data and exactly what I'm plotting.
First, I put dir and FF into a data frame/tbl_df, and cut FF. Then count lets me group the data by dir and grp and count up the number of observations for each combination of those two variables, after which I calculate the share of each n over the sum of n. I'm using geom_col, which is like geom_bar(stat = "identity"): it is meant for when you supply the y value in your aes yourself.
library(tidyverse)
set.seed(12345)
FF <- rweibull(100,1.7,1) * 20 #mock speeds
FF[FF > 60] <- 59
dir <- sample.int(10, size = 100, replace = TRUE) # mock directions
shares <- tibble(dir = dir, FF = FF) %>%
mutate(grp = cut(FF, breaks = seq(0, 60, by = 15), ordered_result = T, right = F, drop = F)) %>%
count(dir, grp) %>%
mutate(share = n / sum(n))
shares
#> # A tibble: 29 x 4
#> dir grp n share
#> <int> <ord> <int> <dbl>
#> 1 1 [0,15) 3 0.03
#> 2 1 [15,30) 2 0.02
#> 3 2 [0,15) 4 0.04
#> 4 2 [15,30) 3 0.03
#> 5 2 [30,45) 1 0.01
#> 6 2 [45,60) 1 0.01
#> 7 3 [0,15) 6 0.06
#> 8 3 [15,30) 1 0.01
#> 9 3 [30,45) 2 0.02
#> 10 4 [0,15) 6 0.06
#> # ... with 19 more rows
ggplot(shares, aes(x = dir, y = share, fill = grp)) +
geom_col()
I have a dataframe that includes four bacteria types: R, B, P, Bi - this is in variable.x
value.y is their abundance and variable.y is various groups they are in.
I would like to plot them according to their food categories: "FiberCategory", "FruitCategory", "VegetablesCategory" & "WholegrainCategory." I have made 4 separate files, each structured like this:
Sample Bacteria Abundance Category Level
30841102 R 0.005293192 1 Low
30841102 P 0.000002570 1 Low
30841102 B 0.005813275 1 Low
30841102 Bi 0.000000000 1 Low
49812105 R 0.003298709 1 Low
49812105 P 0.000000855 1 Low
49812105 B 0.131147541 1 Low
49812105 Bi 0.000350086 1 Low
So, I would like a bar plot of how much of each bacteria is in each category. So it should be 4 plots, for each bacteria, with value on the y-axis and food category on the x-axis.
I have tried this code:
library(dplyr)
genus_veg %>% group_by(Genus, Abundance) %>% summarise(Abundance = sum(Abundance)) %>%
ggplot(aes(x = Level, y= Abundance, fill = Genus)) + geom_bar(stat="identity")
But get this error:
Error: cannot modify grouping variable
Any suggestions?
TL;DR Combine individual plots with cowplot
In another interpretation of the super unclear question, this time from:
Plotting Bacteria according to Food Groups & Abundance in R
and
would like to plot them according to their food categories: "FiberCategory", "FruitCategory", "VegetablesCategory" & "WholegrainCategory." I have made 4 separate files
You might be asking for:
You want a bar chart
You want 4 plots, one for each of the food categories
x-axis = bacteria type
y-axis = abundance of bacteria
Input
Let's say you have a data frame for each food category. (Again, I'm using dummy data)
library(tidyr)
library(dplyr)
library(ggplot2)
## The categories you have defined
bacteria <- c("R", "B", "P", "Bi")
food <- c("FiberCategory", "FruitCategory", "VegetablesCategory", "WholegrainCategory")
## Create dummy data for plotting
set.seed(1)
num_rows <- length(bacteria)
num_cols <- length(food)
dummydata <-
matrix(data = abs(rnorm(num_rows*num_cols, mean=0.01, sd=0.05)),
nrow=num_rows, ncol=num_cols)
rownames(dummydata) <- bacteria
colnames(dummydata) <- food
dummydata <-
dummydata %>%
as.data.frame() %>%
tibble::rownames_to_column("bacteria") %>%
gather(food, abundance, -bacteria)
## If we have 4 data frames
filter_food <- function(dummydata, foodcat){
dummydata %>%
filter(food == foodcat) %>%
select(-food)
}
dd_fiber <- filter_food(dummydata, "FiberCategory")
dd_fruit <- filter_food(dummydata, "FruitCategory")
dd_veg <- filter_food(dummydata, "VegetablesCategory")
dd_grain <- filter_food(dummydata, "WholegrainCategory")
Where one data frame looks something like
#> dd_grain
# bacteria abundance
#1 R 0.02106203
#2 B 0.10073499
#3 P 0.06624655
#4 Bi 0.00775332
Plot
You can create separate plots. (Here, I'm using a function to generate my plots)
plot_food <- function(dd, title=""){
dd %>%
ggplot(aes(x = bacteria, y = abundance)) +
geom_bar(stat = "identity") +
ggtitle(title)
}
plt_fiber <- plot_food(dd_fiber, "fiber")
plt_fruit <- plot_food(dd_fruit, "fruit")
plt_veg <- plot_food(dd_veg, "veg")
plt_grain <- plot_food(dd_grain, "grain")
And then combine them using cowplot
cowplot::plot_grid(plt_fiber, plt_fruit, plt_veg, plt_grain)
TL;DR Plotting by facets
How you posed the question is super unclear. So I have interpreted your question from
So, I would like a bar plot of how much of each bacteria is in each category. So it should be 4 plots, for each bacteria, with value on the y-axis and food category on the x-axis.
as:
You want a bar chart
You want 4 plots, one for each of the bacteria types: R, B, P, Bi
x-axis = food category
y-axis = abundance of bacteria
Input
Regarding the input data, it was unclear, e.g. you did not describe what "Sample", "Level", or "Category" are. Ideally, you would keep all the food categories in one data frame, e.g.
library(tidyr)
library(dplyr)
library(ggplot2)
## The categories you have defined
bacteria <- c("R", "B", "P", "Bi")
food <- c("FiberCategory", "FruitCategory", "VegetablesCategory", "WholegrainCategory")
## Create dummy data for plotting
set.seed(1)
num_rows <- length(bacteria)
num_cols <- length(food)
dummydata <-
matrix(data = abs(rnorm(num_rows*num_cols, mean=0.01, sd=0.05)),
nrow=num_rows, ncol=num_cols)
rownames(dummydata) <- bacteria
colnames(dummydata) <- food
dummydata <-
dummydata %>%
as.data.frame() %>%
tibble::rownames_to_column("bacteria") %>%
gather(food, abundance, -bacteria)
of which the output looks like:
#> dummydata
# bacteria food abundance
#1 R FiberCategory 0.021322691
#2 B FiberCategory 0.019182166
#3 P FiberCategory 0.031781431
#4 Bi FiberCategory 0.089764040
#5 R FruitCategory 0.026475389
#6 B FruitCategory 0.031023419
#7 P FruitCategory 0.034371453
#8 Bi FruitCategory 0.046916235
#9 R VegetablesCategory 0.038789068
#10 B VegetablesCategory 0.005269419
#11 P VegetablesCategory 0.085589058
#12 Bi VegetablesCategory 0.029492162
#13 R WholegrainCategory 0.021062029
#14 B WholegrainCategory 0.100734994
#15 P WholegrainCategory 0.066246546
#16 Bi WholegrainCategory 0.007753320
Plot
Once you have the data formatted as above, you can simply do:
dummydata %>%
ggplot(aes(x = food,
y = abundance,
group = bacteria)) +
geom_bar(stat="identity") +
## Split into 4 plots
## Note: can also use 'facet_grid' to do this
facet_wrap(~bacteria) +
theme(
## rotate the x-axis label
axis.text.x = element_text(angle=90, hjust=1, vjust=.5)
)
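If your data currently lives in four separate data frames (one per food category), one way to get to the single long data frame used above is dplyr::bind_rows with the .id argument. This is a sketch under that assumption; the tiny data frames below are made-up stand-ins, not your real data.

```r
library(dplyr)

# Tiny made-up stand-ins for the four per-category data frames
dd_fiber <- data.frame(bacteria = c("R", "B"), abundance = c(0.02, 0.01))
dd_fruit <- data.frame(bacteria = c("R", "B"), abundance = c(0.03, 0.04))
dd_veg   <- data.frame(bacteria = c("R", "B"), abundance = c(0.05, 0.00))
dd_grain <- data.frame(bacteria = c("R", "B"), abundance = c(0.02, 0.10))

# Stack them; the list names end up in a new 'food' column
combined <- bind_rows(
  list(FiberCategory      = dd_fiber,
       FruitCategory      = dd_fruit,
       VegetablesCategory = dd_veg,
       WholegrainCategory = dd_grain),
  .id = "food"
)
combined
```

The result has the same food/bacteria/abundance columns as dummydata, so it can go straight into the facet_wrap plot.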