Boxplot of specific groups and all groups using ggplot2 - r

I have an example data set, representative of the larger data set I'm working with, that looks like this:
dat <- data.frame(groupid = c(rep("2ppm", 5), rep("20ppm", 5)),
                  var1 = c(222, 212, 245, 233, 213, 444, 454, 464, 434, 424),
                  var2 = c(111, 112, 145, 133, 113, 744, 754, 764, 734, 724))
I want to plot the variables side-by-side grouped by their groupid, hence I did the following:
library(reshape2)
mDat <- melt(dat, id.vars = "groupid")
With this data set, I can easily plot the boxplots needed using ggplot2:
library(ggplot2)
bp <- ggplot(mDat, aes(x = variable, y = value, fill = groupid)) +
  geom_boxplot()
So far so good. However, I want to add an additional boxplot at the end of the plot that shows all values of both variables, to see the overall spread. I couldn't figure out how to modify the melted data set to get this result, e.g. by adding another group to groupid called all.
Thanks in advance for your help!
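One possible approach, sketched here under my own assumptions rather than taken from an accepted answer: copy the melted data, relabel both variable and groupid as "all" in the copy, and append it to the original. The names allDat and plotDat are made up for illustration.

```r
library(reshape2)
library(ggplot2)

dat <- data.frame(groupid = c(rep("2ppm", 5), rep("20ppm", 5)),
                  var1 = c(222, 212, 245, 233, 213, 444, 454, 464, 434, 424),
                  var2 = c(111, 112, 145, 133, 113, 744, 754, 764, 734, 724))

# Long format: one row per (groupid, variable, value)
mDat <- melt(dat, id.vars = "groupid")

# Copy of every row, relabelled so it forms one pooled "all" box
allDat <- mDat
allDat$variable <- "all"
allDat$groupid <- "all"

plotDat <- rbind(mDat, allDat)

# Fix the x-axis order so that "all" is drawn last
plotDat$variable <- factor(plotDat$variable, levels = c("var1", "var2", "all"))

bp <- ggplot(plotDat, aes(x = variable, y = value, fill = groupid)) +
  geom_boxplot()
bp
```

Because the "all" position holds only one fill group, its box is not dodged and will be drawn wider than the others.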

Related

R : ggplot2 to show two summary data [duplicate]

df1 <- data.frame(
"Item" = c("20170315","20170316","20170409","20170411","20170525"),
"Value" = c(400, 515, 743, 682, 458))
df2 <- data.frame(
"Item" = c("20180102","20180227","20180311","20180318","20180522","20180628"),
"Value" = c(793, 541, 777, 847, 901, 433))
I want to show the two data frames in one plot, like this picture. Have a nice day!
Like this?
Create a column cond and bind the data sets. Then it's a normal dodged bar plot.
df1 <- data.frame(
"Item" = c("20170315","20170316","20170409","20170411","20170525"),
"Value" = c(400, 515, 743, 682, 458))
df2 <- data.frame(
"Item" = c("20180102","20180227","20180311","20180318","20180522","20180628"),
"Value" = c(793, 541, 777, 847, 901, 433))
suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})
bind_rows(
  df1 %>% mutate(cond = "A"),
  df2 %>% mutate(cond = "B")
) %>%
  ggplot(aes(Item, Value, fill = cond)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1))
Created on 2022-08-21 by the reprex package (v2.0.1)
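A slightly shorter variant of the same idea, as a sketch: bind_rows() can create the identifier column itself when the data frames are passed as named arguments together with .id. The name both is only for illustration.

```r
suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

df1 <- data.frame(
  Item = c("20170315", "20170316", "20170409", "20170411", "20170525"),
  Value = c(400, 515, 743, 682, 458))
df2 <- data.frame(
  Item = c("20180102", "20180227", "20180311", "20180318", "20180522", "20180628"),
  Value = c(793, 541, 777, 847, 901, 433))

# The argument names "A" and "B" become the values of the new cond column
both <- bind_rows(A = df1, B = df2, .id = "cond")

p <- ggplot(both, aes(Item, Value, fill = cond)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1))
p
```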

Barplot labels too long, is it possible to set a "label width"

I am trying to create a grouped barplot (beside = TRUE). Here is the data used to generate the figure:
significant <- c(27, 44, 25, 54, 40, 31, 25, 9, 57, 59)
annotated <- c(119, 267, 109, 373, 250, 173, 124, 20, 452, 478)
names <- c("mitochondrial gene expression","ncRNA metabolic process",
"mitochondrial translation", "translation",
"ribonucleoprotein complex biogenesis", "ribosome biogenesis",
"rRNA metabolic process", "transcription preinitiation complex asse...",
"peptide biosynthetic process", "amide biosynthetic process")
data = rbind(significant, annotated)
colnames(data) <- names
rownames(data) <- c("significant", "annotated")
My plotting code is:
printBarPlots <- function(input, main) {
  data <- rbind(input[, 4], input[, 3])
  colnames(data) <- input[, 2]
  rownames(data) <- c("Significant", "Annotated")
  par(mar = c(5, 9, 4, 2))
  mybar <- barplot(data, width = 3, xlab = "Number of genes", main = main,
                   horiz = TRUE, cex.axis = 0.8, beside = TRUE, las = 1,
                   cex.names = 0.8, legend = TRUE, args.legend = list(x = "right"))
}
Using this code, the bar labels extend far to the left of my plot. My question is unlike this question because splitting the bar names at each space would still require a small cex.names. Rather than having something like "transcription preinitiation complex asse..." written on one line, is it possible to spread it over two lines, as below, to make better use of space? Or is there some code that splits names onto separate lines after a certain number of characters (e.g. start a new line after 13 characters)?
"transcription preinitiation
complex asse..."
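One way to get a "label width" is base R's strwrap(), which breaks text at word boundaries so that no line reaches a given width. The sketch below wraps each label and joins the pieces with a newline before plotting. The helper wrap_labels, the variable go_terms, the width of 30 and the left margin of 10 lines are my own choices for illustration, not part of the question.

```r
significant <- c(27, 44, 25, 54, 40, 31, 25, 9, 57, 59)
annotated <- c(119, 267, 109, 373, 250, 173, 124, 20, 452, 478)
go_terms <- c("mitochondrial gene expression", "ncRNA metabolic process",
              "mitochondrial translation", "translation",
              "ribonucleoprotein complex biogenesis", "ribosome biogenesis",
              "rRNA metabolic process", "transcription preinitiation complex asse...",
              "peptide biosynthetic process", "amide biosynthetic process")

# Hypothetical helper: wrap each label at word boundaries and
# join the resulting lines with "\n" so barplot() draws them stacked
wrap_labels <- function(labels, width = 30) {
  vapply(labels,
         function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

data <- rbind(significant, annotated)
colnames(data) <- wrap_labels(go_terms, width = 30)

par(mar = c(5, 10, 4, 2))  # left margin sized for two-line labels (a guess)
barplot(data, horiz = TRUE, beside = TRUE, las = 1, cex.names = 0.8,
        xlab = "Number of genes", legend.text = TRUE,
        args.legend = list(x = "right"))
```

Lowering width gives narrower labels at the cost of more lines per label, so it may need tuning together with the margin and cex.names.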

Boxplot displays incorrectly when converting from factor to numeric

My graph displays correctly when I don't use a scale. To make it look better, I convert the factor to numeric and then use scale_x_continuous. However, the graph looks wrong after the conversion (see How to convert a factor to an integer\numeric without a loss of information?). I can't use the scale without converting to numeric. Please run the sample code below with and without these two lines: main$U <- as.numeric(as.character(main$U)) and + scale_x_continuous(name="Temperature", limits=c(0, 160)). Thank you.
library("ggplot2")
library("plyr")
df<-data.frame(U = c(25, 25, 25, 25, 25, 85, 85, 85, 125, 125),
V =c(1.03, 1.06, 1.1,1.08,1.87,1.56,1.75,1.82, 1.85, 1.90),
type=c(2,2,2,2,2,2,2,2,2,2))
df1<-data.frame(U = c(25, 25,25,85, 85, 85, 85, 125, 125,125),
V =c(1.13, 1.24,1.3,1.17, 1.66,1.76,1.89, 1.90, 1.95,1.97),
type=c(5,5,5,5,5,5,5,5,5,5))
df2<-data.frame(U = c(25, 25, 25, 85, 85,85,125, 125,125),
V =c(1.03, 1.06, 1.56,1.75,1.68,1.71,1.82, 1.85,1.88),
type=c(7,7,7,7,7,7,7,7,7))
main <- rbind(df,df1,df2)
main$type <- as.factor(main$type)
main <- transform(main, type = revalue(type,c("2"="type2", "5"="type5", "7" = "type7")))
main$U <- as.factor(main$U)
main$U <- as.numeric(as.character(main$U))
ggplot(main, aes(U, V,color=type)) +
geom_boxplot(width=0.5/length(unique(main$type)), size=.3, position="identity") +
scale_x_continuous(name="Temperature", limits=c(0, 160))
Once U is numeric, ggplot2 no longer splits the boxes by x value, so you have to specify the group in your call to geom_boxplot. To keep the legend you can use color=factor(U) (i.e., converting U back). To avoid losing information on the groups that share the same x-values, I think it is best to create a new grouping column first: take all unique pairs of U and type, and assign each row the index of the pair it falls into.
main$U <- as.character(main$U)
main$type <- as.character(main$type)
grp_keys <- unique(as.matrix(main[, c("U", "type")]))
grp_inds <- 1:nrow(grp_keys)
main$grps <- apply(main, 1, function(x) {
  grp_inds[colSums(as.character(x[c("U", "type")]) == t(grp_keys)) == length(c("U", "type"))]
})
Then plot, with the width increased because the boxes look very narrow on the wider numeric axis:
main$U <- as.numeric(as.character(main$U))
ggplot(main, aes(U, V,color=type)) +
geom_boxplot(aes(group = grps, color = type), width=20/length(unique(main$type)), size=.3, position="identity") +
scale_x_continuous(name="Temperature", limits=c(0, 160))
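A more compact way to build the same kind of grouping column, as a sketch rather than the answer's own method, is base R's interaction(), which produces one factor level per combination of U and type. The data are rebuilt here so the example runs on its own, with type written directly as "type2", "type5" and "type7".

```r
library(ggplot2)

main <- rbind(
  data.frame(U = c(25, 25, 25, 25, 25, 85, 85, 85, 125, 125),
             V = c(1.03, 1.06, 1.1, 1.08, 1.87, 1.56, 1.75, 1.82, 1.85, 1.90),
             type = "type2"),
  data.frame(U = c(25, 25, 25, 85, 85, 85, 85, 125, 125, 125),
             V = c(1.13, 1.24, 1.3, 1.17, 1.66, 1.76, 1.89, 1.90, 1.95, 1.97),
             type = "type5"),
  data.frame(U = c(25, 25, 25, 85, 85, 85, 125, 125, 125),
             V = c(1.03, 1.06, 1.56, 1.75, 1.68, 1.71, 1.82, 1.85, 1.88),
             type = "type7")
)

# One level per (U, type) pair; drop = TRUE discards unused combinations
main$grps <- interaction(main$U, main$type, drop = TRUE)

p <- ggplot(main, aes(U, V, color = type)) +
  geom_boxplot(aes(group = grps),
               width = 20 / length(unique(main$type)),
               position = "identity") +
  scale_x_continuous(name = "Temperature", limits = c(0, 160))
p
```

U stays numeric throughout, so no conversion back and forth is needed before calling scale_x_continuous.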

Print box plot in R with plotly

I'm using plotly to create plots and I'm trying to create a box plot with the following data
data = data.frame(conn_1000 = c(970.09, 384, 1495), conn_2000 = c(970.09, 384, 1495), conn_4000 = c(1042.72, 685, 1495), conn_6000 = c(1012.92, 68, 1482))
plot_ly(y = data, type = "box")
As a result, I get an empty plot. Where is the mistake?
plot_ly needs the data in long format, but yours is in wide format. Try melting the data and passing group and y:
plotly::plot_ly( data=reshape2::melt(data), type = "box",group = variable,y=value)
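The call above uses the older plotly interface. In plotly 4.0 and later, as far as I know, columns are referenced with formulas (~column) and the group argument is no longer used. A sketch for that version, using base R's stack() instead of melt to get the long format (stack() names its columns values and ind):

```r
library(plotly)

data <- data.frame(conn_1000 = c(970.09, 384, 1495),
                   conn_2000 = c(970.09, 384, 1495),
                   conn_4000 = c(1042.72, 685, 1495),
                   conn_6000 = c(1012.92, 68, 1482))

# Wide -> long: "values" holds the numbers, "ind" the original column name
long <- stack(data)

# One box per original column
p <- plot_ly(long, x = ~ind, y = ~values, type = "box")
p
```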

RColorBrewer Treemap package R, Change color for neutral value?

I would like to change the color of the boxes in the treemap below. As the code is written now, all my boxes are colored green because all values are higher than zero. How can I change the "neutral color value" in the palette so that 100 represents the neutral value? In this case the boxes with a value smaller than 100 should be colored red.
To just get the colors right I could take the value and subtract it by 100 but I also want my numbers to be correct.
Any and all help will be greatly appreciated.
Code:
library(treemap)
library(RColorBrewer)
name = c('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')
weight = c(53796, 6897, 12928, 19838, 22745, 13456, 2333, 17567)
value = c(79, 87, 73, 109, 85, 76, 91, 104)
df = data.frame(name, weight, value)
treemap(df,
        index = "name",
        vSize = "weight",
        vColor = "value",
        type = "value",
        palette = brewer.pal(n = 8, "RdYlGn"))
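Add the mapping argument, which takes the values mapped to the low end, the middle, and the high end of the palette, e.g.: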
treemap(df,
        index = "name",
        vSize = "weight",
        vColor = "value",
        type = "value",
        palette = brewer.pal(n = 8, "RdYlGn"),
        mapping = c(min(value), 100, max(value)))
