Combine groups into one group to display in boxplot (ggplot2, R)

I am using the mtcars dataset as an example and I use this code.
library(ggplot2)
library(ggsci)
library(ggpubr)  # ggviolin() comes from ggpubr
ggviolin(mtcars, x = "cyl", y = "disp", fill = "cyl", palette = "jco", facet.by = "am")
To each facet, I would like to add a fourth category on the x-axis (maybe call this "6or8"), in which the 6- and 8-cylinder groups (but not the 4-cylinder group) are combined. I found this similar post, but it did not help me, because of my facets and addition of two instead of all categories.
Does anyone have a suggestion? Thank you.

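You could try this:
library(dplyr)
library(ggpubr)
# Stack the original rows on top of a copy of the 6- and 8-cylinder rows relabelled "6or8"
newmtcars <- rbind(mtcars %>% mutate(cyl = as.character(cyl)),
                   mtcars %>% filter(cyl %in% c(6, 8)) %>% mutate(cyl = "6or8")) %>%
  arrange(cyl)
ggviolin(newmtcars, x = "cyl", y = "disp", fill = "cyl", palette = "jco", facet.by = "am")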
You can manually change the levels for cyl to change the ordering in the plot (if, for example, you want "6or8" to be the first/last level).
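As a minimal sketch of that reordering, the combined column can be turned into a factor with explicit levels. The level order below (with "6or8" last) is only one possible choice, not something the answer prescribes.

```r
library(dplyr)

# Same stacking as in the answer, followed by an explicit level order.
newmtcars <- rbind(
  mtcars %>% mutate(cyl = as.character(cyl)),
  mtcars %>% filter(cyl %in% c(6, 8)) %>% mutate(cyl = "6or8")
) %>%
  # Assumed ordering: put the combined group last on the x-axis.
  mutate(cyl = factor(cyl, levels = c("4", "6", "8", "6or8")))

levels(newmtcars$cyl)
```

The ggviolin() call from the answer can then be used unchanged; the x-axis should follow the factor levels instead of alphabetical order.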

Related

How to remove one row of column labels in a gt table?

# Preparing the data and loading packages
library(modelsummary)
library(tidyverse)
library(gt)
as_tibble(mtcars)
df <- mtcars %>%
  mutate(cyl_ = factor(cyl)) %>%
  dplyr::select(cyl_, mpg, vs, am, hp, wt)
# Get a table of descriptive statistics for the different subsets of the data
print(t1 <- datasummary_balance(~cyl_,
                                data = df,
                                output = "gt"))
# This hides the "Std. Dev." columns
t1 %>% cols_hide(c(3, 5, 7))
# Now I want to hide the "Mean" column labels, but keep the "cyl_" value column labels. Any ideas how?
I want something like this:

Using the gt package, you can pipe your table to tab_options(column_labels.hidden = TRUE) to remove column labels. Unfortunately, this will remove both levels: the column headers, and the spanning labels that include the cyl info you want to keep.
Note that datasummary_balance() produces a highly customized table which is intended to be used as a ready-made output. In cases like these, it might be easier to just build the custom table you want using datasummary() instead of trying to customize datasummary_balance() (square peg, round hole, etc). For example:
library(modelsummary)
library(tidyverse)
df <- mtcars %>%
  select(cyl, mpg, vs, am, hp, wt) %>%
  # ave() counts the rows per cyl value; a bare n() here would report 32 for every group
  mutate(cyl = factor(sprintf("%s (N = %s)", cyl, ave(cyl, cyl, FUN = length)))) %>%
  as.data.frame() # The `All()` function does not accept tibbles
datasummary(
  All(df) ~ Mean * cyl,
  data = df,
  output = "gt")
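For reference, the tab_options(column_labels.hidden = TRUE) call mentioned at the start of this answer can be tried on any gt table. The sketch below uses a small plain table built from mtcars purely for illustration; it is not the datasummary_balance() output.

```r
library(gt)

# A small plain table, used only to illustrate the option.
tbl <- gt(head(mtcars[, c("mpg", "hp", "wt")]))

# Hide the column labels row of the rendered table.
tbl_no_labels <- tab_options(tbl, column_labels.hidden = TRUE)
```

As the answer notes, on a table with spanning labels this hides those as well, which is why building the table with datasummary() is suggested instead.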

Break ggplot2 into Windows/facets with labels in alphabetical order

I want to generate a point plot that shows an equal number of rows in each frame (facets are fine if nothing else works), in alphabetical order (A1, A2, B1, B2, etc.), because the plot is too tall to read the axis labels clearly. I want to break this plot into 4 windows with the same number of rows, i.e. 13 each (preferably with tidyverse and without hard-coding the number of rows).
library(tidyverse)
df <- data.frame(names = c(paste0(LETTERS, 1), paste0(LETTERS, 2)), value = 1:52)
df %>%
  arrange(desc(names)) %>%
  ggplot(aes(y = names, x = value)) +
  geom_point() +
  scale_y_discrete(limits = rev)

We can create a grouping column with gl and use facet_wrap
library(dplyr)
library(ggplot2)
df %>%
  arrange(desc(names)) %>%
  mutate(grp = as.integer(gl(n(), ceiling(n() / 4), n()))) %>%
  ggplot(aes(y = names, x = value)) +
  geom_point() +
  facet_wrap(~ grp, scales = 'free_y')
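To see what the gl() step produces, the grouping column can be inspected on its own. This is a small sketch that sorts ascending (a choice made here, so that group 1 holds the alphabetically first labels), whereas the answer sorts descending.

```r
library(dplyr)

df <- data.frame(names = c(paste0(LETTERS, 1), paste0(LETTERS, 2)), value = 1:52)

grouped <- df %>%
  arrange(names) %>%
  # gl(n, k, length): n levels, each repeated k times, truncated to `length`.
  # With 52 rows and k = ceiling(52 / 4) = 13, only levels 1 to 4 are used.
  mutate(grp = as.integer(gl(n(), ceiling(n() / 4), n())))

table(grouped$grp)
```

Because the block size is derived from n(), the number of rows per window is not hard-coded.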

Boxplots of four variables in the same plot

I would like to make four boxplots side-by-side using ggplot2, but I am struggling to find an explanation that suits my purposes.
I am using the well-known Iris dataset, and I simply want to make a chart that has boxplots of the values for sepal.length, sepal.width, petal.length, and petal.width all next to one another. These are all numerical values.
I feel like this should be really straightforward but I am struggling to figure this one out.
Any help would be appreciated.

Try this. The approach is to select the numeric variables and reshape them to long format with tidyverse functions, then draw the plot. You can use facet_wrap() to get a matrix-style plot, or leave it out to get a single plot. Here is the code (two options):
library(tidyverse)
#Data
data("iris")
#Code
iris %>%
  select(-Species) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  facet_wrap(. ~ name, scales = "free")
Or if you want all the data in one plot, you can avoid the facet_wrap() and use this:
#Code 2
iris %>%
  select(-Species) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(x = name, y = value, fill = name)) +
  geom_boxplot()
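The reshaping step used in both options can be looked at separately. The sketch below only shows what pivot_longer(everything()) returns for the four numeric iris columns; iris_long is a name introduced here for illustration.

```r
library(dplyr)
library(tidyr)

# Drop the factor column, then stack the four numeric columns into
# a `name` column (the variable) and a `value` column (the measurement).
iris_long <- iris %>%
  select(-Species) %>%
  pivot_longer(everything())

dim(iris_long)
unique(iris_long$name)
```

Each of the 150 rows contributes four rows to the long table, one per variable, which is what lets a single geom_boxplot() call draw all four boxes.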

This is a one-liner using reshape2::melt
ggplot(reshape2::melt(iris), aes(variable, value, fill = variable)) + geom_boxplot()

In base R, it can be done more easily in a one-liner
boxplot(iris[-5])
Or using ggboxplot from ggpubr
library(ggpubr)
library(dplyr)
library(tidyr)
iris %>%
  select(-Species) %>%
  pivot_longer(everything()) %>%
  ggboxplot(x = 'name', fill = "name", y = 'value',
            palette = c("#00AFBB", "#E7B800", "#FC4E07", "#00FABA"))

How to add legends in this context?

songs %>% group_by(year) %>% summarise(count = nth(pop, 1)) %>%
  ggplot(aes(x = factor(year), y = count, fill = year)) +
  geom_bar(stat = 'identity') + theme_classic()
1. How can I adjust my legend to show the years (2010:2019) rather than what it is showing right now?
2. scale_size_manual is not working.

You need to set year as a factor each time (or externally), not just once. I don't have your data, so I'll use mtcars.
library(ggplot2)
library(dplyr)
# first plot
mtcars %>%
  ggplot(aes(factor(carb), disp, fill = carb)) +
  geom_bar(stat = "identity")
# second plot
mutate(mtcars, carb = factor(carb)) %>%
  ggplot(aes(carb, disp, fill = carb)) +
  geom_bar(stat = "identity")
# alternate code for second plot, not shown
mtcars %>%
  # factor() is applied to both uses of carb: the x aesthetic and fill
  ggplot(aes(factor(carb), disp, fill = factor(carb))) +
  geom_bar(stat = "identity")
(There are numerous ways to convert to a factor. I'm using dplyr here, but it can easily be done in base or data.table.)
I included the "alternate" code above that shows the manual factor being applied to each use of carb; this is not the preferred method in my mind, since if you're doing it multiple times, just do it once before the plotting and use it multiple times. If you need both the ordinal year and the numeric version, you can add a new field, such as ordinal_year=factor(year).
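A minimal sketch of that last suggestion, again with mtcars standing in for the unavailable data: the numeric column is kept and a factor copy is added next to it. The names cars and carb_f are introduced here for illustration only.

```r
library(dplyr)
library(ggplot2)

# Keep the numeric `carb` and add an ordinal (factor) version alongside it.
cars <- mtcars %>% mutate(carb_f = factor(carb))

# Use the factor for both x and fill so the legend lists each value separately.
p <- ggplot(cars, aes(carb_f, disp, fill = carb_f)) +
  geom_bar(stat = "identity")
```

The numeric column stays available for arithmetic, while the factor column drives the discrete axis and legend.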

Plotting Average/Median of each column in data frame grouped by factors

I am trying to make a grouped barplot and I am running into trouble. For example, using the mtcars dataset, I want to group everything by the 'vs' column (col #8), find the average of all remaining columns, and then plot them by group.
Below is a very poor example of what I am trying to do and I know it is incorrect.
Ideally, mpg for vs=1 & vs=0 would be side by side, followed by cyl's means side by side, etc. I don't care if aggregate is swapped for dplyr, if ggplot is used, or even if the aggregate step is not needed. I am just looking for a way to do this, since it is driving me crazy.
df = mtcars
agg = aggregate(df[,-8], by=list(df$vs), FUN=mean)
agg
barplot(t(agg), beside=TRUE, col=df$vs)

Try
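library(ggplot2)
library(dplyr)
library(tidyr)
# summarise_each()/funs() are deprecated and gather() is superseded;
# across() and pivot_longer() are the current equivalents
df %>%
  group_by(vs = factor(vs)) %>%
  summarise(across(everything(), mean)) %>%
  pivot_longer(-vs, names_to = "Var", values_to = "Val") %>%
  ggplot(aes(x = Var, y = Val, fill = vs)) +
  geom_bar(stat = 'identity', position = 'dodge')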
Or using base R
m1 <- as.matrix(agg[-1])
row.names(m1) <- agg[,1]
barplot(m1, beside=TRUE, col=c('red', 'blue'), legend=row.names(m1))
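The question title also mentions medians. As a sketch under that reading, the summary function can be passed in as an argument so the same steps give either statistic; summarise_long is a hypothetical helper name, not part of any package.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: group by `vs`, summarise every other column with `fun`,
# and return one row per (vs, variable) pair, ready for a dodged bar plot.
summarise_long <- function(data, fun) {
  data %>%
    group_by(vs = factor(vs)) %>%
    summarise(across(everything(), fun)) %>%
    pivot_longer(-vs, names_to = "Var", values_to = "Val")
}

means_long   <- summarise_long(mtcars, mean)
medians_long <- summarise_long(mtcars, median)
```

Either result can be handed to ggplot with aes(x = Var, y = Val, fill = vs) and a dodged bar geom, as in the answer above.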
