I have this data frame
library(dplyr)
dat =data.frame(parent = c("J","J","F","F"),group= c("A(4)","C(3)","A(4)","D(5)"),value=c(1,2,3,4),
count = c(4,3,4,5))
dat %>% arrange(parent,-count )
parent group value count
1 F D(5) 4 5
2 F A(4) 3 4
3 J A(4) 1 4
4 J C(3) 2 3
You can see above that the data is ordered by parent and then count descending. I would like the chart to keep this ordering BUT when I plot it in the "J" chart C(3) comes on top of A(4) and that should be revered. How can that be done?
ggplot(dat, aes(x = group, y= value))+
geom_bar(stat ="identity",position = "dodge")+
coord_flip()+
facet_wrap(~parent, scale = "free_y")
You can set your factor on the group column using the currently ordered values as the levels (reversed to get the ordering you want). This will lock in the ordering.
library(dplyr)
dat =data.frame(parent = c("J","J","F","F"),group= c("A(4)","C(3)","A(4)","D(5)"),value=c(1,2,3,4),
count = c(4,3,4,5))
dat <- dat %>% arrange(parent,-count )
dat$group <- factor(dat$group, levels = rev(unique(dat$group)))
ggplot(dat, aes(x = group, y= value))+
geom_bar(stat ="identity",position = "dodge")+
coord_flip()+
facet_wrap(~parent, scale = "free_y")
You can try:
dat %>% arrange(parent,-count ) %>%
mutate(group2=factor(c(2,1,4,3))) %>%
ggplot(aes(x = group2, y= value))+
geom_bar(stat ="identity",position = "dodge")+
coord_flip()+
facet_wrap(~parent, scale = "free_y")+
scale_x_discrete(breaks=1:4,labels = c("A(4)","D(5)","C(3)","A(4)"))
Related
I have the height of male and females in my data grouped by cm of 10. I want to plot them togheter side by side.
My graph looks somewhat what I want it to be, but the x-axis says factor(male). It should be height in cm.
Also I got three bars, but there should be two, one for male and one for female.
# Library
library(ggplot2)
library(tidyverse) # function "%>%"
# 1. Define data
data = read.csv2(text = "Height;Male;Female
160-170;5;2
170-180;5;5
180-190;6;5
190-200;2;2")
# 2. Print table
df <- as.data.frame(data)
df
# 3. Plot Variable with column chart
ggplot(df, aes(factor(Male),
fill = factor(Male))) +
geom_bar(position = position_dodge(preserve = "single")) +
theme_classic()
pivot_longer to longformat
Then use geom_bar with fill
library(tidyverse)
df1 <- df %>% pivot_longer(
cols = c(Male, Female),
names_to = "Gender",
values_to = "N"
)
# 3. Plot Variable with column chart
ggplot(df1, aes(x=Height, y=N)) +
geom_bar(aes(fill = Gender), position = "dodge", stat="identity") +
theme_classic()
One solution would be:
df %>%
pivot_longer(cols = 2:3, names_to = "gender", values_to = "count") %>%
ggplot(aes(x = Height, y = count, fill = gender)) +
geom_bar(stat = "identity", position = "dodge") +
theme_classic()
I need plot two grouped barcodes with two dataframes that has distinct number of rows: 6, 5.
I tried many codes in R but I don't know how to fix it
Here are my data frames: The Freq colum must be in Y axis and the inter and intra columns must be the x axis.
> freqinter
inter Freq
1 0.293040975264367 17
2 0.296736775990729 2
3 0.297619926364764 4
4 0.587377012109561 1
5 0.595245125315916 4
6 0.597022018595893 2
> freqintra
intra Freq
1 0 3
2 0.293040975264367 15
3 0.597022018595893 4
4 0.598809552335782 2
5 0.898227748764939 6
I expect to plot the barplots in the same plot and could differ inter e intra values by colour
I want a picture like this one:
You probably want a histogram. Use the raw data if possible. For example:
library(tidyverse)
freqinter <- data.frame(x = c(
0.293040975264367,
0.296736775990729,
0.297619926364764,
0.587377012109561,
0.595245125315916,
0.597022018595893), Freq = c(17,2,4,1,4,2))
freqintra <- data.frame(x = c(
0 ,
0.293040975264367,
0.597022018595893,
0.598809552335782,
0.898227748764939), Freq = c(3,15,4,2,6))
df <- bind_rows(freqinter, freqintra, .id = "id") %>%
uncount(Freq)
ggplot(df, aes(x, fill = id)) +
geom_histogram(binwidth = 0.1, position = 'dodge', col = 1) +
scale_fill_grey() +
theme_minimal()
With the data you posted I don't think you can have this graph to look good. You can't have bars thin enough to differentiate 0.293 and 0.296 when your data ranges from 0 to 0.9.
Maybe you could try to treat it as a factor just to illustrate what you want to do:
freqinter <- data.frame(x = c(
0.293040975264367,
0.296736775990729,
0.297619926364764,
0.587377012109561,
0.595245125315916,
0.597022018595893), Freq = c(17,2,4,1,4,2))
freqintra <- data.frame(x = c(
0 ,
0.293040975264367,
0.597022018595893,
0.598809552335782,
0.898227748764939), Freq = c(3,15,4,2,6))
df <- bind_rows(freqinter, freqintra, .id = "id")
ggplot(df, aes(x = as.factor(x), y = Freq, fill = id)) +
geom_bar(stat = "identity", position = position_dodge2(preserve = "single")) +
theme(axis.text.x = element_text(angle = 90)) +
scale_fill_discrete(labels = c("inter", "intra"))
You can also check the problem by not treating your x variable as a factor:
ggplot(df, aes(x = x, y = Freq, fill = id)) +
geom_bar(stat = "identity", width = 0.05, position = "dodge") +
theme(axis.text.x = element_text(angle = 90)) +
scale_fill_discrete(labels = c("inter", "intra"))
Either the bars must be very thin (small width), or you'll get overlapping x intervals breaking the plot.
There are multiple questions (here for instance) on how to arrange the x axis by frequency in a bar chart with ggplot2. However, my aim is to arrange the categories on the X-axis in a stacked bar chart by the relative frequency of a subset of the fill. For instance, I would like to sort the x-axis by the percentage of category B in variable z.
This was my first try using only ggplot2
library(ggplot2)
library(tibble)
library(scales)
factor1 <- as.factor(c("ABC", "CDA", "XYZ", "YRO"))
factor2 <- as.factor(c("A", "B"))
set.seed(43)
data <- tibble(x = sample(factor1, 1000, replace = TRUE),
z = sample(factor2, 1000, replace = TRUE))
ggplot(data = data, aes(x = x, fill = z, order = z)) +
geom_bar(position = "fill") +
scale_y_continuous(labels = percent)
When that didn't work I created a summarised data frame using dplyr and then spread the data and sort it by B and then gather it again. But plotting that didn't work either.
library(dplyr)
library(tidyr)
data %>%
group_by(x, z) %>%
count() %>%
spread(z, n) %>%
arrange(-B) %>%
gather(z, n, -x) %>%
ggplot(aes(x = reorder(x, n), y = n, fill = z)) +
geom_bar(stat = "identity", position = "fill") +
scale_y_continuous(labels = percent)
I would prefer a solution with ggplot only in order not to be dependent of the order in the data frame created by dplyr/tidyr. However, I'm open for anything.
If you want to sort by absolute frequency:
lvls <- names(sort(table(data[data$z == "B", "x"])))
If you want to sort by relative frequency:
lvls <- names(sort(tapply(data$z == "B", data$x, mean)))
Then you can create the factor on the fly inside ggplot:
ggplot(data = data, aes(factor(x, levels = lvls), fill = z)) +
geom_bar(position = "fill") +
scale_y_continuous(labels = percent)
A solution using tidyverse would be:
data %>%
mutate(x = forcats::fct_reorder(x, as.numeric(z), fun = mean)) %>%
ggplot(aes(x, fill = z)) +
geom_bar(position = "fill") +
scale_y_continuous(labels = percent)
I have the following R code, where I transform the data and then order it by a specific column:
df2 <- df %>%
group_by(V2, news) %>%
tally() %>%
complete(news, fill = list(n = 0)) %>%
mutate(percentage = n / sum(n) * 100)
df22 <- df2[order(df2$news, -df2$percentage),]
I want to apply the ordered data "df22" in ggplot:
ggplot(df22, aes(x = V2, y = percentage, fill = factor(news, labels = c("Read","Otherwise")))) +
geom_bar(stat = "identity", position = "fill", width = .7) +
coord_flip() + guides(fill = guide_legend(title = "Online News")) +
scale_fill_grey(start = .1, end = .6) + xlab("Country") + ylab("Share")
Unfortunately, ggplot still returns me a plot without the order:
Does anyone know what is wrong with my code? This is not the same as to order bar chart with a single value per bar like here Reorder bars in geom_bar ggplot2. I try to order the cart by a specific category of a factor. In particular, I want to see countries with the largest share of Read news first.
Here is the data:
V2 news n percentage
1 United States News Read 1583 1.845139
2 Netherlands News Read 1536 1.790356
3 Germany News Read 1417 1.651650
4 Singapore News Read 1335 1.556071
5 United States Otherwise 581 0.6772114
6 Netherlands Otherwise 350 0.4079587
7 Germany Otherwise 623 0.7261665
8 Singapore Otherwise 635 0.7401536
I used the following R code:
df2 <- df %>%
group_by(V2, news) %>%
tally() %>%
complete(news, fill = list(n = 114)) %>%
mutate(percentage = n / sum(n) * 100)
df2 <- df2[order(df2$news, -df2$percentage),]
df2 <- df2 %>% group_by(news, percentage) %>% arrange(desc(percentage))
df2$V2 <- factor(df2$V2, levels = unique(df2$V2))
ggplot(df2, aes(x = V2, y = percentage, fill = news))+
geom_bar(stat = "identity", position = "stack") +
guides(fill = guide_legend(title = "Online News")) +
coord_flip() +
scale_x_discrete(limits = rev(levels(df2$V2)))
Everything was fine except some countries break the order for some reason and I do not understand why. Here is the picture:
What I did with the hints from guys, I used "arrange" command instead of dplyr
df4 <- arrange(df2, news, desc(percentage))
Here is the result:
Here's what I have - hope this is useful. As mentioned #Axeman - the trick is to reorder the labels as factors. Further, using coord_flip() reorders the labels in the opposite direction so scale_x_discrete() is needed.
I am using the small sample you provided.
library(ggplot2)
library(dplyr)
df <- read.csv("data.csv")
df <- arrange(df, news, desc(Percentage))
df$V2 <- factor(df$V2, levels = unique(df$V2))
ggplot(df, aes(x = V2, y = Percentage, fill = news))+
geom_bar(stat = "identity", position = "stack") +
guides(fill = guide_legend(title = "Online News")) +
coord_flip() +
scale_x_discrete(limits = rev(levels(df$V2)))
I have a data frame with five columns and five rows. the data frame looks like this:
df <- data.frame(
day=c("m","t","w","t","f"),
V1=c(5,10,20,15,20),
V2=c(0.1,0.2,0.6,0.5,0.8),
V3=c(120,100,110,120,100),
V4=c(1,10,6,8,8)
)
I want to do some plots so I used the ggplot and in particular the geom_bar:
ggplot(df, aes(x = day, y = V1, group = 1)) + ylim(0,20)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V2, group = 1)) + ylim(0,1)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V3, group = 1)) + ylim(50,200)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V4, group = 1)) + ylim(0,15)+ geom_bar(stat = "identity")
My question is, How can I do a grouped ggplot with geom_bar with multiple y axis? I want at the x axis the day and for each day I want to plot four bins V1,V2,V3,V4 but with different range and color. Is that possible?
EDIT
I want the y axis to look like this:
require(reshape)
data.m <- melt(df, id.vars='day')
ggplot(data.m, aes(day, value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity") +
facet_grid(variable ~ .)
You can also change the y-axis limits if you like (here's an example).
Alternately you may have meant grouped like this:
require(reshape)
data.m <- melt(df, id.vars='day')
ggplot(data.m, aes(day, value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity")
For the latter examples if you want 2 Y axes then you just create the plot twice (once with a left y axis and once with a right y axis) then use this function:
double_axis_graph <- function(graf1,graf2){
graf1 <- graf1
graf2 <- graf2
gtable1 <- ggplot_gtable(ggplot_build(graf1))
gtable2 <- ggplot_gtable(ggplot_build(graf2))
par <- c(subset(gtable1[['layout']], name=='panel', select=t:r))
graf <- gtable_add_grob(gtable1, gtable2[['grobs']][[which(gtable2[['layout']][['name']]=='panel')]],
par['t'],par['l'],par['b'],par['r'])
ia <- which(gtable2[['layout']][['name']]=='axis-l')
ga <- gtable2[['grobs']][[ia]]
ax <- ga[['children']][[2]]
ax[['widths']] <- rev(ax[['widths']])
ax[['grobs']] <- rev(ax[['grobs']])
ax[['grobs']][[1]][['x']] <- ax[['grobs']][[1]][['x']] - unit(1,'npc') + unit(0.15,'cm')
graf <- gtable_add_cols(graf, gtable2[['widths']][gtable2[['layout']][ia, ][['l']]], length(graf[['widths']])-1)
graf <- gtable_add_grob(graf, ax, par['t'], length(graf[['widths']])-1, par['b'])
return(graf)
}
I believe there's also a package or convenience function that does the same thing.
First I reshaped as described in the documentation in the link below the question.
In general ggplot does not support multiple y-axis. I think it is a philosophical thing. But maybe faceting will work for you.
df <- read.table(text = "day V1 V2 V3 V4
m 5 0.1 120 1
t 10 0.2 100 10
w 2 0.6 110 6
t 15 0.5 120 8
f 20 0.8 100 8", header = TRUE)
library(reshape2)
df <- melt(df, id.vars = 'day')
ggplot(df, aes(x = variable, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_grid(.~day)
If I understand correctly you want to include facets in your plot. You have to use reshape2 to get the data in the right format. Here's an example with your data:
df <- data.frame(
day=c("m","t","w","t","f"),
V1=c(5,10,20,15,20),
V2=c(0.1,0.2,0.6,0.5,0.8),
V3=c(120,100,110,120,100),
V4=c(1,10,6,8,8)
)
library(reshape2)
df <- melt(df, "day")
Then plot with and include facet_grid argument:
ggplot(df, aes(x=day, y=value)) + geom_bar(stat="identity", aes(fill=variable)) +
facet_grid(variable ~ .)