I would like to make four boxplots side-by-side using ggplot2, but I am struggling to find an explanation that suits my purposes.
I am using the well-known Iris dataset, and I simply want to make a chart that has boxplots of the values for sepal.length, sepal.width, petal.length, and petal.width all next to one another. These are all numerical values.
I feel like this should be really straightforward but I am struggling to figure this one out.
Any help would be appreciated.
Try this. The approach is to select the numeric variables and reshape them to long format with tidyverse functions, then draw the plot. You can use facet_wrap() to give each variable its own panel, or leave it out to get a single plot. Here is the code (two options):
library(tidyverse)
#Data
data("iris")
#Code
iris %>% select(-Species) %>%
pivot_longer(everything()) %>%
ggplot(aes(x=name,y=value,fill=name))+
geom_boxplot()+
facet_wrap(.~name,scales='free')
Or if you want all the data in one plot, you can avoid the facet_wrap() and use this:
#Code 2
iris %>% select(-Species) %>%
pivot_longer(everything()) %>%
ggplot(aes(x=name,y=value,fill=name))+
geom_boxplot()
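To see what the reshaping step produces before it reaches ggplot(), here is a minimal sketch that stores the long data in a variable. The name iris_long is only illustrative, and select(where(is.numeric)) is swapped in for select(-Species) on the assumption that a reasonably recent dplyr (1.0 or later) is installed.

```r
library(dplyr)
library(tidyr)

# Keep only the numeric columns, then stack them into name/value pairs.
iris_long <- iris %>%
  select(where(is.numeric)) %>%
  pivot_longer(everything(), names_to = "name", values_to = "value")

# One row per (observation, variable) pair: 150 rows x 4 variables.
dim(iris_long)
head(iris_long, 4)
```

Each of the four measurement columns becomes a level of name, which is why mapping x = name gives one box per variable.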
This is a one-liner using reshape2::melt
ggplot(reshape2::melt(iris), aes(variable, value, fill = variable)) + geom_boxplot()
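Called without arguments, melt() guesses the id column and prints a message saying it is using Species. A slightly more explicit sketch of the same idea names the id column directly (iris_melted is just an illustrative variable name):

```r
library(ggplot2)

# Name the id column explicitly; every other column is treated as a measurement.
iris_melted <- reshape2::melt(iris, id.vars = "Species")

ggplot(iris_melted, aes(variable, value, fill = variable)) +
  geom_boxplot()
```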
In base R, it can be done more easily in a one-liner
boxplot(iris[-5])
Or using ggboxplot from ggpubr
library(ggpubr)
library(dplyr)
library(tidyr)
iris %>%
select(-Species) %>%
pivot_longer(everything()) %>%
ggboxplot(x = 'name', fill = "name", y = 'value',
palette = c("#00AFBB", "#E7B800", "#FC4E07", "#00FABA"))
I am using the mtcars dataset as an example and I use this code.
library(ggplot2)
library(ggpubr)  # provides ggviolin()
library(ggsci)
ggviolin(mtcars, x="cyl", y="disp", fill="cyl", palette="jco", facet.by = "am")
To each facet, I would like to add a fourth category on the x-axis (maybe call this "6or8"), in which the 6- and 8-cylinder groups (but not the 4-cylinder group) are combined. I found this similar post, but it did not help me, because of my facets and addition of two instead of all categories.
Does anyone have a suggestion? Thank you.
You could try this:
library(dplyr)
# Stack the original rows on top of a copy of the 6- and 8-cylinder rows relabelled '6or8'
newmtcars <- rbind(mtcars %>% mutate(cyl = as.character(cyl)),
                   mtcars %>% filter(cyl %in% c(6,8)) %>% mutate(cyl = '6or8')) %>% arrange(cyl)
ggviolin(newmtcars, x="cyl", y="disp", fill="cyl", palette="jco", facet.by = "am")
You can manually change the levels for cyl to change the ordering in the plot (if, for example, you want "6or8" to be the first/last level).
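As a rough sketch of that reordering, one option is to turn cyl into a factor with the levels listed in the order you want them on the x-axis. The order below, with "6or8" last, is only an example.

```r
library(dplyr)
library(ggpubr)

# Same combined data as above, but with an explicit level order for cyl.
newmtcars <- rbind(
  mtcars %>% mutate(cyl = as.character(cyl)),
  mtcars %>% filter(cyl %in% c(6, 8)) %>% mutate(cyl = "6or8")
) %>%
  mutate(cyl = factor(cyl, levels = c("4", "6", "8", "6or8")))

ggviolin(newmtcars, x = "cyl", y = "disp", fill = "cyl",
         palette = "jco", facet.by = "am")
```

Moving "6or8" to the front of the levels vector would put the combined group first instead.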
I want to create a plot in R with ggplot() to visualise the data included in variable matrix that looks like this:
matrix <- matrix(c(time =c(1,2,3,4,5),v1=rnorm(5),v2=c(NA,1,0.5,0,0.1)),nrow=5)
colnames(matrix) <- c("time","v1","v2")
df <-data.frame(
time=rep(matrix[,1],2),
values=c(matrix[,2],matrix[,3]),
names=rep(c("v1","v2"), each=length(matrix[,1]))
)
ggplot(df, aes(x=time,y=values,color=names)) +
geom_point()+
facet_grid(names~.)
Is there a faster way than transforming the data into a data.frame like I do? This way seems very laborious.
I would appreciate any help! Thanks in advance.
A tidyverse approach:
This will produce the data structure you need to use in ggplot
library(tidyverse)
matrix %>%
as_tibble() %>%
gather(., names, value, -time)
This will generate data structure and plot all at once
matrix %>%
as_tibble() %>%
gather(., names, value, -time) %>%
ggplot(., aes(x=time,y=value,color=names)) +
geom_point()+
facet_grid(names~.)
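In newer versions of tidyr, gather() is superseded by pivot_longer(). A sketch of the same pipeline in that style is below. The matrix is rebuilt here with cbind() under the name m (an illustrative name, chosen to avoid masking the matrix() function), and set.seed(1) is an assumption added only to make the random column reproducible.

```r
library(tidyverse)

set.seed(1)  # assumption: fixed seed so the rnorm() column is reproducible
m <- cbind(time = 1:5, v1 = rnorm(5), v2 = c(NA, 1, 0.5, 0, 0.1))

# Matrix -> tibble -> long format with one row per (time, series) pair.
long <- m %>%
  as_tibble() %>%
  pivot_longer(-time, names_to = "names", values_to = "value")

# geom_point() drops the single NA in v2 with a warning.
ggplot(long, aes(x = time, y = value, color = names)) +
  geom_point() +
  facet_grid(names ~ .)
```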
When I integrate tables and figures in a document using knitr, adding the code makes it more reproducible and interesting.
Often a combination of dplyr and ggvis can make a plot that has relatively legible code (using the magrittr pipe operator %>%).
mtcars %>%
group_by(cyl, am) %>%
summarise( weight = mean(wt) ) %>%
ggvis(x=~am, y=~weight, fill=~cyl) %>%
layer_bars()
The problem is that the ggvis plot does not look quite as pretty as the ggplot2 plot (I know, factoring of cyl).
However, for ggplot2 we need:
mtcars %>%
group_by(am, cyl) %>%
summarise( weight = mean(wt) ) %>%
ggplot( aes(x=am, y=weight, fill=cyl) ) +
geom_bar(stat='identity')
My problem is that this switches from %>% to + for piping. I know this is a very minor itch, but I would much prefer to use:
mtcars %>%
group_by(am, cyl) %>%
summarise( weight = mean(wt) ) %>%
ggplot( aes(x=am, y=weight, fill=cyl) ) %>%
geom_bar(stat='identity')
Is there a way to modify the behaviour of ggplot2 so that this would work?
P.S. I don't like the idea of using magrittr's add(), since this again makes the code more complicated to read.
This is too long for a comment, and from your reply I can't tell whether you tried the code I provided and it didn't work, or whether you had already tried something similar without success. Here is a wrapper function:
geom_barw<-function(DF,x,y,fill,stat){
require(ggplot2)
p<-ggplot(DF,aes_string(x=x,y=y,fill=fill)) + geom_bar(stat=stat)
return(p)
}
library(magrittr)
library(dplyr)
library(ggplot2)
mtcars %>%
group_by(cyl, am) %>%
summarise( weight = mean(wt) ) %>%
geom_barw(x='am', y='weight', fill='cyl', stat='identity')
This works for me with:
dplyr_0.4.2 ggplot2_2.1.0 magrittr_1.5
Of course geom_barw could be modified so you don't need to use the quotes anymore.
EDIT: There should be a more elegant and safer way with lazy evaluation (see the lazyeval package), but a very quick adaptation would be to use substitute (as pointed out by Axeman, though without the deparse part):
geom_barw<-function(DF,x,y,fill,stat){
require(ggplot2)
x<-substitute(x)
y<-substitute(y)
fill<-substitute(fill)
p<- ggplot(DF,aes_string(x=x,y=y,fill=fill))
p<- p + geom_bar(stat=stat)
return(p)
}
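For readers on current package versions, here is a hedged sketch of the same wrapper idea using the embrace operator {{ }}, which aes() accepts in ggplot2 3.0.0 and later. It is not the code from the answer above: bar_plot is a hypothetical name, and the .groups argument assumes dplyr 1.0 or later.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: takes unquoted column names and forwards them to aes().
bar_plot <- function(data, x, y, fill, stat = "identity") {
  ggplot(data, aes(x = {{ x }}, y = {{ y }}, fill = {{ fill }})) +
    geom_bar(stat = stat)
}

# Mean weight per cyl/am combination.
avg_wt <- mtcars %>%
  group_by(cyl, am) %>%
  summarise(weight = mean(wt), .groups = "drop")

# The whole chain, including the plot, now uses %>% only.
p <- avg_wt %>%
  bar_plot(x = am, y = weight, fill = cyl)
p
```

The + is still there, but it is hidden inside the helper, so the calling code reads as a single pipe.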
I am trying to make a grouped barplot and I am running into trouble. For example, using the mtcars dataset, I want to group everything by the 'vs' column (col #8), find the average of all remaining columns, and then plot them by group.
Below is a very poor example of what I am trying to do and I know it is incorrect.
Ideally, mpg for vs=1 & vs=0 would be side by side, followed by cyl's means side by side, etc. I don't care if aggregate is skipped for dplyr, or if ggplot is used, or even if the aggregate step is not needed. I am just looking for a way to do this, since it is driving me crazy.
df = mtcars
agg = aggregate(df[,-8], by=list(df$vs), FUN=mean)
agg
barplot(t(agg), beside=TRUE, col=df$vs)
Try
library(ggplot2)
library(dplyr)
library(tidyr)
df %>%
group_by(vs=factor(vs)) %>%
summarise(across(everything(), mean)) %>%
gather(Var, Val, -vs) %>%
ggplot(., aes(x=Var, y=Val, fill=vs))+
geom_bar(stat='identity', position='dodge')
Or using base R
m1 <- as.matrix(agg[-1])
row.names(m1) <- agg[,1]
barplot(m1, beside=TRUE, col=c('red', 'blue'), legend=row.names(m1))
I am trying to replicate ggplot2's position="fill" in ggvis. I use this handy option all the time when presenting results. Below is a reproducible example that works in ggplot2, followed by my ggvis attempt. Can it be done using the scale_numeric function?
library(ggplot2)
p <- ggplot(mtcars, aes(x=factor(cyl), fill=factor(vs)))
p+geom_bar()
p+geom_bar(position="fill")
library(ggvis)
q <- mtcars %>%
ggvis(~factor(cyl), fill = ~factor(vs))%>%
layer_bars()
# Something like this?
q %>% scale_numeric("y", domain = c(0,1))
I think that to do this sort of thing with ggvis you have to do the heavy lifting of data reshaping before sending the data to ggvis. ggplot2's geom_bar handily does a lot of calculations for you (counting things up, weighting them, etc.) that you need to do explicitly yourself in ggvis. So try something like the below (there may be more elegant ways):
mtcars %>%
mutate(cyl=factor(cyl), vs=as.factor(vs)) %>%
group_by(cyl, vs) %>%
summarise(count=length(mpg)) %>%
group_by(cyl) %>%
mutate(proportion = count / sum(count)) %>%
ggvis(x= ~cyl, y = ~proportion, fill = ~vs) %>%
layer_bars()
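To check the data-preparation step on its own, without ggvis, the sketch below computes the same proportions with dplyr's count() and confirms that they sum to 1 within each cyl group, which is what position="fill" shows in ggplot2. The names props and totals are illustrative.

```r
library(dplyr)

# Count cars per cyl/vs combination, then convert counts to within-cyl shares.
props <- mtcars %>%
  count(cyl = factor(cyl), vs = factor(vs), name = "count") %>%
  group_by(cyl) %>%
  mutate(proportion = count / sum(count)) %>%
  ungroup()

# Each cyl group's proportions should add up to 1.
totals <- props %>%
  group_by(cyl) %>%
  summarise(total = sum(proportion))

props
totals
```

Combinations that never occur in the data (8-cylinder cars with vs = 1) simply have no row, so that bar consists of a single segment.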