How to visualize two columns in a bar chart using R? - r

I don't know if my question is clear enough...
I have this table
Name Mark_Oral Mark_Written Total_M_Oral Total_M_Written
1 Hercule Poirot 50 49 858 781
2 Joe O'Neil 70 79 1056 1083
3 John McAuley 81 99 1219 1333
and I have to visualize the last two columns in a bar chart using R to compare the students' total marks.
Data
table <- structure(list(Name = c("Hercule Poirot", "Joe O'Neil", "John McAuley"),
Mark_Oral = c(50L, 70L, 81L),
Mark_Written = c(49L, 79L, 99L),
Total_M_Oral = c(858L, 1056L, 1219L),
Total_M_Written = c(781L, 1083L, 1333L)),
.Names = c("Name", "Mark_Oral", "Mark_Written", "Total_M_Oral", "Total_M_Written"),
row.names = c("1", "2", "3"), class = "data.frame")

You can use + to combine other plots on the same ggplot object. For example:
ggplot(survey, aes(often_post,often_privacy)) +
geom_point() +
geom_smooth() +
geom_point(aes(frequent_read,often_privacy)) +
geom_smooth(aes(frequent_read,often_privacy))

With ggplot2 (as your tags suggest) the syntax is:
ggplot(data = table,aes(x= Total_M_Oral,y=Total_M_Written))+geom_bar(stat = "identity")
Where table is replaced by the name of your dataframe.
Edit
I was unsure that my first answer really answered your question (multiple uses of bars).
Create dummy data
df<-data.frame(x = rpois(n = 100,lambda = 800),y = rpois(n = 100,lambda = 800))
With previous plot:
If you want to count and have a color for Oral and one for written
df2<-data.frame(x = c(df$x,df$y),y = rep(c("written","oral"),each = nrow(df)))
ggplot(data = df2, aes(x = x, fill = y)) + geom_bar(stat = "count", alpha = 0.5)
Which gives:
Comment: alpha parameter is not necessary, it just deals with the transparency so that you can see when there are overlapping bars.
With student names
df3<-data.frame(name = rep(table$Name,times = 2),
y = c(table$Total_M_Oral, table$Total_M_Written),
fill = rep(c("oral","written"),each = nrow(table)))
ggplot(data = df3, aes(x = name, y = y, fill = fill)) + geom_bar(stat = "identity", alpha = 0.5)
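As a possible alternative to building df3 by hand, the reshaping step can be sketched with tidyr::pivot_longer(), and the bars can be placed side by side instead of stacked. This is only a sketch: it assumes tidyr 1.0.0 or later is installed, and the names marks, marks_long, exam and total are illustrative, not part of the original answer.

```r
library(ggplot2)
library(tidyr)

# Same totals as in the question, rebuilt here so the sketch is self-contained
# (named `marks` rather than `table` to avoid confusion with base::table).
marks <- data.frame(
  Name = c("Hercule Poirot", "Joe O'Neil", "John McAuley"),
  Total_M_Oral = c(858L, 1056L, 1219L),
  Total_M_Written = c(781L, 1083L, 1333L)
)

# Reshape to long format: one row per student and exam type
marks_long <- pivot_longer(
  marks,
  cols = c(Total_M_Oral, Total_M_Written),
  names_to = "exam",
  values_to = "total"
)

# Dodged bars put the oral and written totals next to each other
p <- ggplot(marks_long, aes(x = Name, y = total, fill = exam)) +
  geom_col(position = "dodge")
p
```

Dodged bars make the oral and written totals of one student directly comparable, whereas stacked bars emphasise the combined total.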

Related

ggplot: edit shape of points based on second column

My goal is to plot a map with dwelling locations as points, where the points are divided into two colours based on a categorical variable named category. Of those dwellings, a few need to have a different shape, e.g., a star. The column that describes this is called star in the example below. My dataframe looks like this:
x y category star
123 456 1 0
143 556 0 0
124 556 1 1
233 256 1 0
ggplot(data = df, aes(x = x, y = y, color=category)) +
geom_point()
The code above gives me what I need, except for the 'stars'. How can I distinguish this second column?
I have assumed you want the points where star has a value of 1 to be drawn with a star shape.
library(ggplot2)
ggplot(data = df1, aes(x = x, y = y, color=factor(category), shape = factor(star))) +
geom_point(size = 8) +
scale_shape_manual(breaks = c(0, 1),
values = c(1, 11))+
labs(color = "Category",
shape = "Star")
data
df1 <- structure(list(x = c(123L, 143L, 124L, 233L),
y = c(456L, 556L, 556L, 256L),
category = c(1L, 0L, 1L, 1L),
star = c(0L, 0L, 1L, 0L)),
class = "data.frame", row.names = c(NA, -4L))
Created on 2022-10-13 with reprex v2.0.2

Plot data from data frame list

I have a list of data frames,
>head(df.list.xyg[["archae.list"]])
motif obs pred prop stat pval stdres
AAB 1189 760.1757 0.05556028 11811.94 0 16.00425
CDD 1058 249.7147 0.01825133 11811.94 0 51.62291
DDE 771 415.1314 0.03034143 11811.94 0 17.73730
FBB 544 226.3529 0.01654385 11811.94 0 21.28994
>head(df.list.xyg[["eukaryote.list"]])
motif obs pred prop stat pval stdres
ABG 82015 48922.33 0.08773749 321891.7 0 156.64562
GBC 51601 64768.42 0.11615591 321891.7 0 -55.03402
AGG 41922 30136.56 0.05404701 321891.7 0 69.80141
CGG 25545 14757.24 0.02646569 321891.7 0 90.00215
BTT 15795 12433.58 0.02229843 321891.7 0 30.48747
I would like to:
1. barplot motif versus stdres such that the plots are displayed in a single column but two rows, and
2. label each plot with the corresponding part of the file name: "archae", "eukaryote" and so on.
After searching around, I found that the following code snippet does the job, but only partly: it only plots the last dataset. I assume it is "overwriting" the earlier dataset. How do I fix this? Can this be achieved with facet_wrap or facet_grid?
myplots<-lapply(df.list.xyg,
function(x)
p<-ggplot(x,aes(x=motif,y=stdres)) +
geom_bar(stat="identity",width=0.5, color="blue", fill="gray")
)
print (myplots)
If you just want the plots to appear together in a single plotting window, you can use facets. You can bind the rows of your data frames together with dplyr::bind_rows, which will create an id column to label which data frame each row belongs to. We facet by this ID variable.
library(ggplot2)
ggplot(dplyr::bind_rows(df.list.xyg, .id = "Kingdom"), aes(motif, stdres)) +
geom_col(width = 0.5, fill = "deepskyblue4") +
geom_hline(yintercept = 0, color = "gray75") +
facet_grid(.~Kingdom) +
theme_bw(base_size = 16)
If you have lots of different data frames in your list, facet_grid may be better than facet_wrap.
Depending on how you wish to present the data, you may prefer to save individual plots to files using ggsave inside your lapply
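A minimal sketch of the ggsave approach, iterating over the list names so that each plot gets its own title and file. The small df_list stands in for df.list.xyg, and out_dir and the .png file names are assumptions made for illustration.

```r
library(ggplot2)

# Small stand-in list; in the question this would be df.list.xyg
df_list <- list(
  archae = data.frame(motif = c("AAB", "CDD"), stdres = c(16.0, 51.6)),
  eukaryote = data.frame(motif = c("ABG", "GBC"), stdres = c(156.6, -55.0))
)

out_dir <- tempdir()  # hypothetical output location

# Loop over the names (not the data frames) so the name is available
# for both the plot title and the file name
files <- vapply(names(df_list), function(nm) {
  p <- ggplot(df_list[[nm]], aes(x = motif, y = stdres)) +
    geom_col(width = 0.5, color = "blue", fill = "gray") +
    ggtitle(nm)
  path <- file.path(out_dir, paste0(nm, ".png"))
  ggsave(path, plot = p, width = 6, height = 4)
  path
}, character(1))
```

Looping over names(df_list) rather than the list itself is what makes the label available inside the function; lapply over the data frames alone does not pass the element names through.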
Reproducible data taken from question
df.list.xyg <- list(archae = structure(list(motif = c("AAB",
"CDD", "DDE", "FBB"), obs = c(1189L, 1058L, 771L, 544L),
pred = c(760.1757, 249.7147,
415.1314, 226.3529), prop = c(0.05556028, 0.01825133, 0.03034143,
0.01654385), stat = c(11811.94, 11811.94, 11811.94, 11811.94),
pval = c(0L, 0L, 0L, 0L), stdres = c(16.00425, 51.62291,
17.7373, 21.28994)), class = "data.frame", row.names = c(NA,
-4L)), eukaryote.list = structure(list(motif = c("ABG", "GBC",
"AGG", "CGG", "BTT"), obs = c(82015L, 51601L, 41922L, 25545L,
15795L), pred = c(48922.33, 64768.42, 30136.56, 14757.24, 12433.58
), prop = c(0.08773749, 0.11615591, 0.05404701, 0.02646569, 0.02229843
), stat = c(321891.7, 321891.7, 321891.7, 321891.7, 321891.7),
pval = c(0L, 0L, 0L, 0L, 0L), stdres = c(156.64562, -55.03402,
69.80141, 90.00215, 30.48747)), class = "data.frame", row.names = c(NA,
-5L)))
Created on 2022-05-23 by the reprex package (v2.0.1)
@allancameron: Elegant solution! Thanks. The key, then, is to row-bind the data so that it can be faceted by kingdom. I modified the code as below to:
- plot in a single column with two rows
- scale the y-axis according to the range of the data
- turn the y-axis labels 90 degrees for better readability.
library(ggplot2)
ggplot(dplyr::bind_rows(df.list.xyg, .id = "Kingdom"), aes(motif, stdres)) +
geom_col(width = 0.5, fill = "deepskyblue4") +
geom_hline(yintercept = 0, color = "gray75") +
facet_grid(Kingdom ~ ., scales = "free") +
theme_bw(base_size = 16)
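The code above covers the single-column layout and the free y scales, but contains no call that rotates the axis labels. One way this could be done is with theme() and element_text(angle = 90), sketched below on a small stand-in data frame (demo is a made-up name; with the real data, the bind_rows() result would take its place).

```r
library(ggplot2)

# Stand-in for dplyr::bind_rows(df.list.xyg, .id = "Kingdom")
demo <- data.frame(
  Kingdom = rep(c("archae", "eukaryote"), each = 2),
  motif = c("AAB", "CDD", "ABG", "GBC"),
  stdres = c(16.0, 51.6, 156.6, -55.0)
)

p <- ggplot(demo, aes(motif, stdres)) +
  geom_col(width = 0.5, fill = "deepskyblue4") +
  facet_grid(Kingdom ~ ., scales = "free") +
  theme_bw(base_size = 16) +
  # Rotate the y-axis tick labels; use axis.text.x for the motif labels instead
  theme(axis.text.y = element_text(angle = 90, hjust = 0.5))
p
```

The theme() call has to come after theme_bw(), because a complete theme added later would overwrite the rotation.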

Interaction plot with multiple facets using ggplot

I am using RStudio, and I am working on a graph that allows a comparison between an input vector and what the database has.
The data looks like this:
Type P1 P2 P3
H1 2000 60 4000
H2 1500 40 3000
H3 1000 20 2000
The input vector for comparison will look like this:
Type P1 P2 P3
C 1200 30 5000
and I want my final plot to look like this:
The most important thing is a visual comparison between the input vector and the different types, for each P component. The scale of the y axis should adapt to each P component, because there are big differences between them.
library(dplyr)
library(tidyr)
library(ggplot2)
d %>% gather(var1, val, -Type) %>%
mutate(input = as.numeric(d2[cbind(rep(1, max(row_number())),
match(var1, names(d2)))]),
slope = factor(sign(val - input), -1:1)) %>%
gather(var2, val, -Type, -var1, -slope) %>%
ggplot(aes(x = var2, y = val, group = 1)) +
geom_point(aes(fill = var2), shape = 21) +
geom_line(aes(colour = slope)) +
scale_colour_manual(values = c("red", "blue")) +
facet_grid(Type ~ var1)
DATA
d = structure(list(Type = c("H1", "H2", "H3"),
P1 = c(2000L, 1500L, 1000L),
P2 = c(60L, 40L, 20L),
P3 = c(4000L, 3000L, 2000L)),
class = "data.frame",
row.names = c(NA, -3L))
d2 = structure(list(Type = "C", P1 = 1200L, P2 = 30L, P3 = 5000L),
class = "data.frame",
row.names = c(NA, -1L))
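A simpler alternative, offered only as a sketch and not as the answer's method: draw one panel per P component with its own y scale, show each type as a bar, and mark the input value with a horizontal line. It assumes tidyr 1.0.0 or later; the names d_long, d2_long, component, value and input are illustrative.

```r
library(ggplot2)
library(tidyr)

# Same values as in the DATA section, rebuilt so the sketch is self-contained
d <- data.frame(Type = c("H1", "H2", "H3"),
                P1 = c(2000, 1500, 1000),
                P2 = c(60, 40, 20),
                P3 = c(4000, 3000, 2000))
d2 <- data.frame(Type = "C", P1 = 1200, P2 = 30, P3 = 5000)

# Long format: one row per Type and component
d_long <- pivot_longer(d, cols = -Type,
                       names_to = "component", values_to = "value")
d2_long <- pivot_longer(d2, cols = -Type,
                        names_to = "component", values_to = "input")

p <- ggplot(d_long, aes(x = Type, y = value)) +
  geom_col(fill = "gray70") +
  # One reference line per panel, taken from the input vector
  geom_hline(data = d2_long, aes(yintercept = input),
             colour = "red", linetype = "dashed") +
  # scales = "free_y" lets each component use its own y range
  facet_wrap(~ component, scales = "free_y")
p
```

Because d2_long carries the same component column as d_long, each reference line lands in the matching panel.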

wrong labeling in ggplot pie chart

I am trying to create a pie chart of percentage values. When I try to label the slices, the labels end up in the wrong place:
the values point to the wrong positions in the graph.
ggplot(Consumption_building_type, aes(x="", y=percentage, fill=Building_type))+ geom_bar(width = 0.5,stat ="identity")+coord_polar(theta = "y",direction = -1)+geom_text(aes(x=1.3,y = percentage/3 + c(0, cumsum(percentage)[- length(percentage)]),label = round(Consumption_building_type$percentage,0))) + theme_void()+ scale_fill_brewer(palette="GnBu")+ggtitle("Breakdown of building types")+theme_minimal()
This is the code I used and this is the result I got:
When I change the direction=1 both the graph and the labels shift
the data I used
structure(list(
Building_type = c("Commercial", "Industrial", "Institutional", "Large residential",
"Large Residential", "Residential", "Small residential"),
Total_consumption_GJ = c(99665694, 5970695, 10801610, 63699633,
16616981, 24373766, 70488556),
average_consumption_GJ = c(281541.508474576, 72813.3536585366, 109107.171717172,
677655.670212766, 213038.217948718, 123099.828282828, 640805.054545455),
total = c(354L, 82L, 99L, 94L, 78L, 198L, 110L),
percentage = c(34.8768472906404, 8.07881773399015, 9.75369458128079,
9.26108374384236, 7.68472906403941, 19.5073891625616, 10.8374384236453)),
.Names = c("Building_type", "Total_consumption_GJ", "average_consumption_GJ", "total", "percentage"),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -7L)))
Really sorry about the data format; I am a new user and not sure how to paste the data.
Update for ggplot2 2.2.0+
ggplot2 2.2.0+ has some new parameters for position_stack() that make solving this problem much simpler. There's no need to calculate the center point of each bar manually (though that solution may still be preferred in some situations and is therefore preserved below). Instead, we can simply use the "vjust" parameter of position_stack():
g <- ggplot(Consumption_building_type, aes(x="", y=percentage, fill=Building_type))+
geom_bar(width = 0.5,stat ="identity")+
coord_polar(theta = "y",direction = 1)+
geom_text(aes(x=1.3,y = percentage, label = round(Consumption_building_type$percentage,0)), position = position_stack(vjust = 0.5)) +
scale_fill_brewer(palette="GnBu")+ggtitle("Breakdown of building types")+theme_minimal() +
labs(x = NULL)
General solution: calculating the midpoint of stacked bars manually
I'm assuming that your goal is to place a label for each bar at the bar's center point. In that case, first we can calculate the center point and add it to the data frame:
Consumption_building_type$zone.start <- with(Consumption_building_type, c(0, cumsum(percentage)[-length(percentage)]))
Consumption_building_type$zone.end <- with(Consumption_building_type, cumsum(percentage))
Consumption_building_type$label.point <- with(Consumption_building_type, (zone.start + zone.end) / 2)
Building_type Total_consumption_GJ average_consumption_GJ total percentage zone.start zone.end label.point
1 Commercial 99665694 281541.51 354 34.87 0.00 34.87 17.435
2 Industrial 5970695 72813.35 82 8.07 34.87 42.94 38.905
3 Institutional 10801610 109107.17 99 9.75 42.94 52.69 47.815
4 Large residential 63699633 677655.67 94 9.26 52.69 61.95 57.320
5 Large Residential 16616981 213038.22 78 7.68 61.95 69.63 65.790
6 Residential 24373766 123099.83 198 19.50 69.63 89.13 79.380
7 Small residential 70488556 640805.05 110 10.83 89.13 99.96 94.545
And then the y aesthetic in geom_text() is simply the newly created "label.point" column.
I've also added labs(x = NULL) so that there are no empty quote marks on the y-axis of the final plot.
new.plot <- ggplot(Consumption_building_type, aes(x="", y=percentage, fill=Building_type))+
geom_bar(width = 0.5,stat ="identity")+
coord_polar(theta = "y",direction = 1)+
geom_text(aes(x=1.3,y = label.point, label = round(Consumption_building_type$percentage,0))) +
scale_fill_brewer(palette="GnBu")+ggtitle("Breakdown of building types")+theme_minimal() +
labs(x = NULL)
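The midpoint arithmetic used for label.point can be checked on a small vector. The values in pct below are made up for illustration and chosen so the result is easy to verify by hand.

```r
# Hypothetical percentages that sum to 100
pct <- c(50, 30, 20)

# Each segment starts where the previous one ended
zone_start <- c(0, cumsum(pct)[-length(pct)])  # 0, 50, 80
zone_end <- cumsum(pct)                        # 50, 80, 100

# The label sits halfway between the start and end of its segment
label_point <- (zone_start + zone_end) / 2     # 25, 65, 90
label_point
```

Each label point is the start of its segment plus half of its own percentage, which is why the percentage/3 offset in the question's code misses the center.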

Bubble chart with ggplot with no circle

The data frame a that I use is:
x y size
589 127 16,4724409449
465 58 21,0517241379
408 58 15,9137931034
I use this to draw a bubble chart:
library(ggplot2)
a <- read.csv("numbers.csv", header = TRUE)
ggplot(a,aes(x,y))+geom_point(size=a$size)
but I can't see any bubbles in the chart. How can I make them appear?
Here is the dput of a data frame:
structure(list(x = c(589L, 465L, 408L), y = c(127L, 58L, 58L),
size = structure(c(2L, 3L, 1L), .Label = c("15,9137931034",
"16,4724409449", "21,0517241379"), class = "factor")), .Names = c("x",
"y", "size"), class = "data.frame", row.names = c(NA, -3L))
Also, is it possible to add names and different colours to every bubble?
x y size name
589 127 16,4724409449 nameA
465 58 21,0517241379 nameB
408 58 15,9137931034 nameC
To generate a bubble chart with size mapped to a$size, and colour and labels to a$name, we can try:
ggplot(a, aes(x, y, label = name)) +
geom_point(aes(size = size, colour = name)) +
geom_text()
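Note that the dput output in the question shows size stored as a factor with decimal commas, so mapping it to size directly treats it as a discrete variable. A minimal sketch of converting it to numeric first, using the question's values (the vjust offset for the labels is an arbitrary choice):

```r
library(ggplot2)

# Data as in the question, with `size` read in as text with decimal commas
a <- data.frame(
  x = c(589L, 465L, 408L),
  y = c(127L, 58L, 58L),
  size = factor(c("16,4724409449", "21,0517241379", "15,9137931034")),
  name = c("nameA", "nameB", "nameC")
)

# Replace the decimal comma with a point, then convert to numeric.
# as.character() is needed first: as.numeric() on a factor returns level codes.
a$size <- as.numeric(sub(",", ".", as.character(a$size), fixed = TRUE))

p <- ggplot(a, aes(x, y, label = name)) +
  geom_point(aes(size = size, colour = name)) +
  geom_text(vjust = -1.5)  # nudge labels above the bubbles
p
```

Depending on how numbers.csv is delimited, passing dec = "," to read.csv() may avoid the conversion step altogether; that depends on the file layout, which the question does not show.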

Resources