Order data on ggplot [duplicate] - r

This question already has answers here:
Reorder bars in geom_bar ggplot2 by value
(3 answers)
Closed 3 years ago.
I currently have a ggplot, but the variables are shown in alphabetical order. I want the plot to show the variable with the highest importance score first, with the rest in descending order. See the attached image of the plot and the code below.
library(ggplot2)
ggplot(data= VIMP, aes(x=(VIMP$Y),y=VIMP$X)) +
geom_bar(position="dodge",stat="identity",width = 0, color =
"black") +
coord_flip() + geom_point(color='skyblue') +
xlab("Variables")+ylab(" Importance Score")+
ggtitle("Variable Importance") +
theme(plot.title = element_text(hjust = 0.5)) +
theme(panel.background = element_rect(fill = 'white', colour =
'black'))

To solve this problem you can use the forcats package, which was written for working with factors in R.
This code should work for you.
library(dplyr)
library(ggplot2)
# Reorder the levels of Y by X (ascending); after coord_flip() the largest X is drawn at the top.
# For the opposite order use fct_reorder(Y, desc(X)) or fct_reorder(Y, X, .desc = TRUE).
VIMP <- VIMP %>%
mutate(Y = forcats::fct_reorder(Y, X))
# Refer to columns by name inside aes() rather than with VIMP$...
ggplot(data = VIMP, aes(x = Y, y = X)) +
geom_bar(position = "dodge", stat = "identity", width = 0, color = "black") +
coord_flip() + geom_point(color = 'skyblue') +
xlab("Variables") + ylab("Importance Score") +
ggtitle("Variable Importance") +
theme(plot.title = element_text(hjust = 0.5)) +
theme(panel.background = element_rect(fill = 'white', colour = 'black'))
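The reordering step can be checked on a small made-up data frame before applying it to the real one. This is only a sketch: the `VIMP` values below are invented for illustration and are not the asker's data.

```r
library(ggplot2)
library(forcats)

# Hypothetical stand-in for the asker's data:
# Y = variable name, X = importance score (values are invented)
VIMP <- data.frame(
  Y = c("age", "bmi", "dose", "sex"),
  X = c(0.10, 0.45, 0.30, 0.05)
)

# Order the factor levels of Y by X, ascending. After coord_flip() the
# last level is drawn at the top, so the highest score appears first.
VIMP$Y <- fct_reorder(VIMP$Y, VIMP$X)

p <- ggplot(VIMP, aes(x = Y, y = X)) +
  geom_col(width = 0.05, fill = "black") +
  geom_point(colour = "skyblue", size = 3) +
  coord_flip() +
  labs(x = "Variables", y = "Importance Score", title = "Variable Importance")

# Inspect the new level order (lowest score first)
levels(VIMP$Y)
```

Printing `p` draws the plot. Without `coord_flip()`, passing `.desc = TRUE` to `fct_reorder()` would be the way to get the highest score first.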

Related

Question about ggplot2 bar graph transformation method

I wanted the barplot to appear in two forms, so I created repeated data and used it as the input.
With the data in that form, I wrote the following code to plot it.
Select <- "Mbp"
if(Select == "Mbp"){
Select <- "Amount of sequence (Mbp)"
} else if (Select == "Gbp"){
Select <- "Amount of sequence (Gbp)"
}
ggplot(G4, aes(x = INDV, y = Bp, fill = Group)) + theme_light() +
geom_bar(stat = 'identity', position = 'dodge', width = 0.6) + coord_flip() +
scale_x_discrete(limits = rev(unname(unlist(RAW_TRIM[1])))) +
scale_fill_discrete(breaks = c("Raw data","Trimmed data"))+
scale_y_continuous(labels = scales::comma, position = "right") +
theme(axis.text = element_text(colour = "black", face = "bold", size = 15)) +
theme(legend.position = "bottom", legend.text = element_text(face = "bold", size = 15),
legend.title = element_blank()) + ggtitle(Select) + xlab("") + ylab("") +
theme(plot.title = element_text(size = 25, face = "bold", hjust = 0.5))
This gives me the plot below, but I want the red bars to be on top of the green bars.
I tried changing the order of the data, and I tried several solutions from Stack Overflow and other sites, but none of them solved the problem.
If you know a solution, please let me know how to modify the code or change the data.
Thank you.
You seem to be asking more than one question at once here, but the main one is: why do the bars for Raw appear under those for Trimmed? The short answer is: factor levels and the behaviour of coord_flip().
Let's make a toy dataset:
library(tidyverse)
G4 <- data.frame(INDV = c("C_01", "C_01", "C_41", "C_41"),
Group = c("Raw data", "Trimmed data", "Raw data", "Trimmed data"),
Bp = c(200, 100, 500, 400))
A simple dodged bar chart. Note that Raw comes before Trimmed, because R is before T in the alphabet:
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = Group),
position = "dodge")
Now we coord_flip:
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = Group),
position = "dodge") +
coord_flip()
This has the effect of reversing the variables, so Raw is now below Trimmed.
We can fix that by altering factor levels. As there are only two groups we can just reverse them using fct_rev() from the forcats package:
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = fct_rev(Group)),
position = "dodge") +
coord_flip()
The bar for Raw is now on top but unfortunately, the colours are now reversed so that Raw bars are green. We can fix that using scale_fill_manual():
G4 %>%
ggplot(aes(INDV, Bp)) +
geom_col(aes(fill = fct_rev(Group)),
position = "dodge") +
coord_flip() +
scale_fill_manual(values = c("#00BFC4", "#F8766D"))
Now the Raw bars are on top, and they are red.
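A variation on the same fix, sketched here as an alternative rather than as the answer's own method: set the level order once on the data and give `scale_fill_manual()` a named colour vector, so each colour stays attached to its group whatever the level order. The `breaks` argument is used only to control the legend order.

```r
library(ggplot2)

G4 <- data.frame(
  INDV  = c("C_01", "C_01", "C_41", "C_41"),
  Group = c("Raw data", "Trimmed data", "Raw data", "Trimmed data"),
  Bp    = c(200, 100, 500, 400)
)

# Reverse the level order explicitly so "Raw data" is drawn on top
# after coord_flip() (same effect as fct_rev() in the example above)
G4$Group <- factor(G4$Group, levels = c("Trimmed data", "Raw data"))

# Named values tie each colour to a group, independent of level order
group_cols <- c("Raw data" = "#F8766D", "Trimmed data" = "#00BFC4")

p <- ggplot(G4, aes(INDV, Bp, fill = Group)) +
  geom_col(position = "dodge") +
  coord_flip() +
  scale_fill_manual(values = group_cols,
                    breaks = c("Raw data", "Trimmed data"))  # legend order
```

Printing `p` draws the plot. Because the factor is changed in the data, the legend title stays `Group` instead of becoming `fct_rev(Group)`.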

how to show all mean values in the boxplot with ggplot2? [duplicate]

This question already has answers here:
Add mean to grouped box plot in R with ggplot2
(2 answers)
Closed 1 year ago.
I am trying to add the mean values (as shown in red dots in the plot below) in the boxplot with ggplot2. I used stat_summary to add mean values.
However, the following plot is not the exact one that I am looking for. What I'd like to get is to show two mean values for both Y (blue box) and N (red box), not one mean value for both.
Here is my code.
ggplot(data = df.08.long,
aes(x = TMT_signals, y = as.numeric(TMT_Intensities), fill = `probe.Mod.or.not(Y/N)`)) +
geom_boxplot() +
stat_summary(fun.y=mean, geom="point", shape=20, size=5, color="red", fill="red") +
coord_cartesian(
xlim = NULL,
ylim = c(0, 2e4),
expand = TRUE,
default = FALSE,
clip = "on") +
theme_classic() +
theme(axis.title=element_text(size=8),
axis.text=element_text(size=10),
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
Does anyone know how to solve this problem?
Thanks so much for any help!
mtcars example
Code
library(dplyr)
library(ggplot2)
mtcars %>%
ggplot(aes(as.factor(vs),drat, fill = as.factor(am)))+
geom_boxplot()+
stat_summary(
fun=mean,
geom="point",
shape=21,
size=5,
#Define the aesthetic inside stat_summary
aes(fill = as.factor(am)),
position = position_dodge2(width = .75),
show.legend = FALSE
)
Output
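To confirm that there is one mean per box, the group means can also be computed up front and drawn with `geom_point()`. This is an alternative sketch, not the method used in the answer above; the dodge width of 0.75 is an assumption chosen to line the points up with the default box placement.

```r
library(ggplot2)

# Mean of drat for every vs/am combination: one row per box
group_means <- aggregate(drat ~ vs + am, data = mtcars, FUN = mean)

p <- ggplot(mtcars, aes(as.factor(vs), drat, fill = as.factor(am))) +
  geom_boxplot() +
  # The fill mapping is inherited, so the points are dodged by am
  geom_point(
    data = group_means,
    shape = 21, size = 5,
    position = position_dodge(width = 0.75),
    show.legend = FALSE
  )

group_means
```

Having the means in a data frame also makes it easy to label them or to check the plotted points against the numbers.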

How can I group my bars in ggplot2 by fill color, while maintaining descending order? [duplicate]

This question already has an answer here:
Control the fill order and groups for a ggplot2 geom_bar
(1 answer)
Closed 2 years ago.
I am currently working on a barplot where I have data collected from timeA and timeB. I took a measurement (ex: mass) of different objects. The objects measured are not consistent in the timeA group and timeB group.
I want the bars to be in descending order
But I also want the bars to be grouped by time aka fill=time
How would I achieve this?
Dataset example:
df$measures <- c(2,4,26,10,18,20,14,22,12,16,24,6,8,28)
df$object <- seq.int(nrow(df))
df$time<- "timea"
df[6:14, 3] = "timeb"
This is roughly the code I have for my graph so far
plot <-ggplot(df, aes(x=reorder(object, -measures), y=measures, fill=time))+
geom_bar(stat="identity") +
theme_minimal()+
theme(axis.text.x = element_text(angle = 45, hjust = 1), plot.title = element_text(hjust = 0.5))+
scale_fill_manual(name="Legend", values = c("firebrick", "cornflowerblue"), labels=c("timea", "timeb"))
plot
I have attached a picture of roughly what the graph looks like now. Ideally, I would like all the blue bars on one side in descending order and all the red bars on the other side in descending order, all in the same plot.
If you want everything in the same graph you can play with factor levels in the dataframe.
library(dplyr)
library(ggplot2)
df %>%
arrange(time, desc(measures)) %>%
mutate(object = factor(object, object)) %>%
ggplot() + aes(x=object, y=measures, fill=time)+
geom_bar(stat="identity") +
theme_minimal()+
theme(axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(hjust = 0.5))+
scale_fill_manual(name="Legend", values = c("firebrick", "cornflowerblue"),
labels=c("timea", "timeb"))
You can also make use of facets here -
ggplot(df) + aes(x=reorder(object, -measures), y=measures, fill=time)+
geom_bar(stat="identity") +
theme_minimal()+
theme(axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(hjust = 0.5))+
facet_wrap(.~time, scales = 'free') +
scale_fill_manual(name="Legend", values = c("firebrick", "cornflowerblue"),
labels=c("timea", "timeb"))
data
df <- tibble::tibble(measures = c(2,4,26,10,18,20,14,22,12,16,24,6,8,28),
object = seq_along(measures),
time = sample(c("timea", "timeb"), length(measures), replace = TRUE))
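The first approach works because `factor(object, object)` uses the current row order as the level order. A small sketch makes that visible; the data here is a shortened, fixed version of the example (no random sampling), chosen only so the result is reproducible.

```r
library(dplyr)

# Shortened example data with a fixed time column (assumption: not sampled)
df <- tibble::tibble(
  measures = c(2, 4, 26, 10, 18, 20),
  object   = seq_along(measures),
  time     = c("timea", "timea", "timea", "timeb", "timeb", "timeb")
)

ordered_df <- df %>%
  arrange(time, desc(measures)) %>%               # group by time, then descending
  mutate(object = factor(object, levels = object))  # freeze row order as level order

# The x axis will follow this order: timea objects first, each block descending
levels(ordered_df$object)
```

Here the levels come out as 3, 2, 1 for timea followed by 6, 5, 4 for timeb, which is the order the bars are drawn in.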

Manually change order of y axis items on complicated stacked bar chart in ggplot2

I've been stuck on an issue and can't find a solution. I've tried many suggestions on Stack Overflow and elsewhere about manually ordering a stacked bar chart, since that should be a pretty simple fix, but those suggestions don't work with the huge complicated mess of code I plucked from many places. My only issue is y-axis item ordering.
I'm making a series of stacked bar charts, and ggplot2 changes the ordering of the items on the y-axis depending on which dataframe I am trying to plot. I'm trying to make 39 of these plots and want them to all have the same ordering. I think ggplot2 only wants to plot them in ascending order of their numeric mean or something, but I'd like all of the bar charts to first display the group "Bird Advocates" and then "Cat Advocates." (This is also the order they appear in my data frame, but that ordering is lost at the coord_flip() point in plotting.)
I think that taking the data frame through so many changes is why I can't just add something simple at the end or use the reorder() function. Adding things into aes() also doesn't work, since the stacked bar chart I'm creating seems to depend on those items being exactly a certain way.
Here's one of my data frames where ggplot2 is ordering my y-axis items incorrectly, plotting "Cat Advocates" before "Bird Advocates":
Group,Strongly Opposed,Opposed,Slightly Opposed,Neutral,Slightly Support,Support,Strongly Support
Bird Advocates,0.005473026,0.010946052,0.012509773,0.058639562,0.071149335,0.31118061,0.530101642
Cat Advocates,0.04491726,0.07013396,0.03624901,0.23719464,0.09141056,0.23404255,0.28605201
And here's all the code that takes that and turns it into a plot:
library(ggplot2)
library(reshape2)
library(plotly)
#Importing data from a .csv file
data <- read.csv("data.csv", header=TRUE)
data$s.Strongly.Opposed <- 0-data$Strongly.Opposed-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Opposed <- 0-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Slightly.Opposed <- 0-data$Slightly.Opposed-.5*data$Neutral
data$s.Neutral <- 0-.5*data$Neutral
data$s.Slightly.Support <- 0+.5*data$Neutral
data$s.Support <- 0+data$Slightly.Support+.5*data$Neutral
data$s.Strongly.Support <- 0+data$Support+data$Slightly.Support+.5*data$Neutral
#to percents
data[,2:15]<-data[,2:15]*100
#melting
mdfr <- melt(data, id=c("Group"))
mdfr<-cbind(mdfr[1:14,],mdfr[15:28,3])
colnames(mdfr)<-c("Group","variable","value","start")
#remove dot in level names
mylevels<-c("Strongly Opposed","Opposed","Slightly Opposed","Neutral","Slightly Support","Support","Strongly Support")
mdfr$variable<-droplevels(mdfr$variable)
levels(mdfr$variable)<-mylevels
pal<-c("#bd7523", "#e9aa61", "#f6d1a7", "#999999", "#c8cbc0", "#65806d", "#334e3b")
ggplot(data=mdfr) +
geom_segment(aes(x = Group, y = start, xend = Group, yend = start+value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
geom_hline(yintercept = 0, color =c("#646464")) +
coord_flip() +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white")) +
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
The plot:
I think this works, you may need to play around with the axis limits/breaks:
library(dplyr)
mdfr <- mdfr %>%
mutate(group_n = as.integer(case_when(Group == "Bird Advocates" ~ 2,
Group == "Cat Advocates" ~ 1)))
ggplot(data=mdfr) +
geom_segment(aes(x = group_n, y = start, xend = group_n, yend = start + value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
scale_x_continuous(limits = c(0,3), breaks = c(1, 2), labels = c("Cat", "Bird")) +
geom_hline(yintercept = 0, color =c("#646464")) +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
coord_flip() +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white"))+
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
produces this plot:
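You want to convert the 'Group' variable to a factor whose levels are in the order in which you want the bars to appear. Note that coord_flip() draws the first level at the bottom, so if the bars come out in the opposite order, reverse the levels vector.
mdfr$Group <- factor(mdfr$Group, levels = c("Bird Advocates", "Cat Advocates"))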
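How the level order interacts with `coord_flip()` can be sketched on a two-row toy data frame. The `toy` data and the `wanted` vector are hypothetical names introduced for illustration, and the values are invented.

```r
library(ggplot2)

# Invented numbers, only to have something to draw
toy <- data.frame(
  Group = c("Bird Advocates", "Cat Advocates"),
  value = c(53, 29)
)

# Top-to-bottom order wanted on the flipped axis
wanted <- c("Bird Advocates", "Cat Advocates")

# coord_flip() places the first level at the bottom,
# so the levels are given in reverse of the wanted order
toy$Group <- factor(toy$Group, levels = rev(wanted))

p <- ggplot(toy, aes(x = Group, y = value)) +
  geom_col() +
  coord_flip()

levels(toy$Group)
```

Applying the same `factor(..., levels = rev(wanted))` line to every data frame before plotting should keep the order identical across all plots, since the order no longer depends on the data values.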

Why ggplot2 legend not show in the graph [duplicate]

This question already has answers here:
Add legend to ggplot2 line plot
(4 answers)
Closed 2 years ago.
I am using ggplot to draw a scatterplot of two datasets and want to show a legend in the top left. I tried the code below, but no legend appears, and I am not sure why.
ggplot(mf, aes(log10(mf[,2]), mf[,1])) +
ggtitle("Plot") +
geom_point(color = "blue") + theme(plot.margin = unit(c(1,2,1,1), "cm")) +
xlab("xxx") + ylab("yyy") +
theme(plot.title = element_text(size = 18, hjust = 0.5, vjust = 4)) +
geom_point(data = mf2, aes(log10(mf2[,2]), mf2[,1]), color = "red") +
theme(axis.title.x = element_text(size = rel(1.3))) +
theme(axis.title.y = element_text(size = rel(1.3))) +
scale_color_discrete(name = "Dataset", labels = c("Dataset 1", "Dataset 2"))
Since no data was provided, I have used my own values for demonstration.
mf is a data frame with log and val as its columns.
You need to put the color parameter inside aes(). ggplot2 only builds a legend for mapped aesthetics, so a color set outside aes() never appears in one. After that you can set the actual colors manually with scale_color_manual().
You can use the code below to get the desired result.
ggplot(mf, aes(val,log))+
geom_point(aes(color = "Dataset1"))+
geom_point(data=mf2,aes(color="Dataset2"))+
labs(colour="Datasets",x="xxx",y="yyy")+
theme(legend.position = c(0, 1),legend.justification = c(0, 1))+
scale_color_manual(values = c("blue","red"))
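Another common way to get the same legend, sketched here as an alternative under the assumption that both data frames share the same columns, is to stack them with an identifying column and map colour to that column. The data values and the `dataset` column name are invented for illustration.

```r
library(ggplot2)

# Invented data standing in for mf and mf2
mf  <- data.frame(val = c(1, 2, 3), log = c(0.0, 0.3, 0.5))
mf2 <- data.frame(val = c(1, 2, 3), log = c(0.1, 0.2, 0.6))

# Label each data frame, then stack them into one
mf$dataset  <- "Dataset 1"
mf2$dataset <- "Dataset 2"
both <- rbind(mf, mf2)

# Named values keep each colour attached to its dataset
dataset_cols <- c("Dataset 1" = "blue", "Dataset 2" = "red")

p <- ggplot(both, aes(val, log, colour = dataset)) +
  geom_point() +
  scale_colour_manual(name = "Dataset", values = dataset_cols) +
  labs(x = "xxx", y = "yyy")
```

With a single data frame there is only one `geom_point()` layer, and adding a third dataset means adding rows and one more named colour rather than another layer.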
