Break ggplot2 into Windows/facets with labels in alphabetical order - r

I want to generate a point plot that shows an equal number of rows in each frame (facets are fine if nothing else works), with labels in alphabetical order (A1, A2, B1, B2, etc.), because the plot is too tall to read the axis labels clearly. I want to break this plot into 4 windows with the same number of rows, i.e. 13 each (preferably with tidyverse and without hard-coding the number of rows).
library(tidyverse)
df <- data.frame(names=c(paste0(LETTERS,1),paste0(LETTERS,2)),value=1:52)
df %>%
arrange(desc(names)) %>%
ggplot(aes(y=names,x=value))+
geom_point()+
scale_y_discrete(limits=rev)

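We can create a grouping column with gl() and use facet_wrap(). Here gl() labels consecutive blocks of ceiling(n()/4) rows, so the number of rows per facet is derived from the data rather than hard-coded: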
library(dplyr)
library(ggplot2)
df %>%
arrange(desc(names)) %>%
mutate(grp = as.integer(gl(n(), ceiling(n()/4), n()))) %>%
ggplot(aes(y=names,x=value))+
geom_point() +
facet_wrap(~ grp, scales = 'free_y')
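As a variation on the grouping step above, here is a minimal sketch that sorts the names in ascending order first and then splits the rows into consecutive chunks with dplyr's ntile(). The names n_windows and df_grouped are made up for this illustration, and the choice of ntile() over gl() is an assumption of the sketch, not part of the original answer.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(names = c(paste0(LETTERS, 1), paste0(LETTERS, 2)), value = 1:52)

n_windows <- 4  # hypothetical name: how many panels to split into

df_grouped <- df %>%
  arrange(names) %>%                             # A1, A2, B1, B2, ...
  mutate(grp = ntile(row_number(), n_windows))   # consecutive, near-equal-sized chunks

# Each panel gets its own y axis showing only its own labels
p <- ggplot(df_grouped, aes(y = names, x = value)) +
  geom_point() +
  facet_wrap(~ grp, scales = "free_y")
```

Printing p draws the four panels. With 52 rows and 4 windows every chunk has 13 rows; when the row count is not divisible by the number of windows, ntile() makes the chunks differ by at most one row.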

Related

How to compare 2 categories to the whole categories using facet in ggplot

Hi, I would like to compare two categories to the whole set of categories using facet_grid, facet_wrap, or another function in ggplot. For example, I would like to compare the statistics of Hospitals 3 and 4 to those of all hospitals combined.
Hospital<-c("Hosp1","Hosp1","Hosp1","Hosp1","Hosp1",
"Hosp2","Hosp2","Hosp2","Hosp2","Hosp2",
"Hosp3","Hosp3","Hosp3","Hosp3","Hosp3",
"Hosp4","Hosp4","Hosp4","Hosp4","Hosp4")
Disease<-c("D1","D1","D2","D2","D3",
"D1","D1","D1","D3","D3",
"D3","D3","D2","D2","D3",
"D1","D1","D2","D2","D2")
data<-data.frame(Hospital,Disease)
plot<-ggplot(data, aes(x=Disease,fill=Disease))+
geom_bar()+facet_grid(~Hospital)+coord_flip()
Using facet_grid, I get a graph that compares all four hospitals, which is not what I want.
I would rather have something like the following, but built with facets instead of "grid.arrange", because I want to display all disease categories (even those with zero counts) in every graph (to make comparison easy), and I don't want the x-axis label repeated for each graph because it takes up a lot of space:
wh<-ggplot(data, aes(x=Disease,fill=Disease))+
geom_bar()+coord_flip()+labs(title = "whole hospital")
H3<-ggplot(data[data$Hospital=="Hosp3",], aes(x=Disease,
fill=Disease))+ geom_bar()+coord_flip()+
labs(title = "hospital3")
H4<-ggplot(data[data$Hospital=="Hosp4",], aes(x=Disease,
fill=Disease))+ geom_bar()+coord_flip()+
labs(title = "hospital4")
library(gridExtra) # provides grid.arrange()
grid.arrange(wh,H3,H4,ncol=3)
How about this approach, inspired by gghighlight:
library(ggplot2)
library(dplyr)
data_all <-
data %>%
mutate(Hospital = "Hosp_all") %>%
group_by(Disease) %>%
summarise(total = n())
data %>%
filter(Hospital %in% c("Hosp3", "Hosp4")) %>%
ggplot(aes(x = Disease, fill = Disease))+
geom_col(data = data_all, aes(Disease, total), fill = "gray80")+
geom_bar()+
coord_flip()+
facet_wrap(~Hospital)+
theme(legend.position = "bottom")
Created on 2020-06-23 by the reprex package (v0.3.0)
If your data is not too large, one way is to bind the data frames together, add another column that indicates the dataset (or hospital), and then plot with facets:
library(dplyr)
library(ggplot2)
rbind(data,subset(data,Hospital == "Hosp3"),subset(data,Hospital == "Hosp4")) %>%
mutate(hospital=rep(c("whole hospital","Hosp3","Hosp4"),
c(nrow(data),sum(data$Hospital == "Hosp3"),sum(data$Hospital == "Hosp4")))
) %>%
mutate(hospital=factor(hospital,levels=c("whole hospital","Hosp3","Hosp4"))) %>%
ggplot(aes(x=Disease,fill=Disease))+ geom_bar()+coord_flip()+
facet_wrap(~hospital,scales="free_y")
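The bind-and-label idea above can also be sketched with bind_rows(), which avoids counting rows by hand. The names selected, panel, and plot_data are hypothetical, and keeping the default fixed scales (so that every panel shows all disease categories, including empty ones) is an assumption about what the question asks for.

```r
library(dplyr)
library(ggplot2)

# Same data as in the question
data <- data.frame(
  Hospital = rep(paste0("Hosp", 1:4), each = 5),
  Disease  = c("D1","D1","D2","D2","D3",
               "D1","D1","D1","D3","D3",
               "D3","D3","D2","D2","D3",
               "D1","D1","D2","D2","D2")
)

selected <- c("Hosp3", "Hosp4")  # hypothetical name: hospitals to compare

plot_data <- bind_rows(
  data %>% mutate(panel = "whole hospital"),       # every row, labelled as the whole
  data %>% filter(Hospital %in% selected) %>%
    mutate(panel = as.character(Hospital))         # the selected hospitals again
) %>%
  mutate(panel = factor(panel, levels = c("whole hospital", selected)))

# Fixed (shared) scales keep every disease category in every panel
p <- ggplot(plot_data, aes(x = Disease, fill = Disease)) +
  geom_bar() +
  coord_flip() +
  facet_wrap(~ panel)
```

Because the discrete axis is shared across panels, Hosp3 still shows an empty D1 slot, and the axis labels appear only once per row of panels.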

Frequencies of bargraph as independent list

Suppose I have the following dataset
set.seed(85)
a <- data.frame(replicate(10,sample(0:3,5,rep=TRUE)))
and I plot it in the following way:
library(ggplot2)
ggplot(stack(a), aes(x = values)) +
geom_bar()
From the graph I can read that there are a little less than 1250 occurrences of '3' in the dataset, but is there a way to output frequency of each x-axis value in the dataset as an independent list (i.e. not as numbers on the barplot)? I am looking for a list of how many occurrences of '3' there are in the dataset (and also for the values, 0, 1, & 2).
output:
0: 1249
1: 1200
2: ...
3: ...
Any help is much appreciated
We can convert to 'long' format and then do the count
library(dplyr)
library(tidyr)
a %>%
pivot_longer(everything()) %>%
count(value)
To get the barplot
library(ggplot2)
a %>%
pivot_longer(everything()) %>%
count(value) %>%
ggplot(aes(x = value, y = n)) +
geom_col()
In base R, unlist and get the table
table(unlist(a))
or for plotting
barplot(table(unlist(a)))
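If a value happens not to occur in the sample, table() simply omits it. A minimal sketch, assuming the possible values are known to be 0:3, that keeps zero counts and returns the frequencies as a data frame and as a named vector (the names vals, tab, freq, and counts are made up here):

```r
set.seed(85)
a <- data.frame(replicate(10, sample(0:3, 5, rep = TRUE)))

# Fixing the factor levels keeps values with zero occurrences in the result
vals <- factor(unlist(a), levels = 0:3)
tab  <- table(value = vals)

freq   <- as.data.frame(tab, responseName = "n")    # columns: value, n
counts <- setNames(as.vector(tab), names(tab))      # named integer vector
```

Printing freq or counts gives the frequencies as plain output, independent of any plot.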

Barplot with groups and subgroups

I need to make a barplot with groups and variables using the data below:
df<-as.data.frame(cbind(c(1,2,3),
c(0.4,-0.11,-0.07),
c(0.31,0.07,0),
c(0.45,-0.23,0.02)))
names(df)<-c('cat','var1','var2','var3')
I need to make a barplot with cat on the abscissa (x-axis) and the measurement of each variable on the ordinate (y-axis).
For example, for cat=1, I need three bars at position 1 on the abscissa, representing the values of var1, var2, and var3.
library(tidyverse)
df <- df %>%
gather(var, val, -cat)
ggplot(df, aes(cat, val, fill=var)) +
geom_col(position="dodge")
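The same reshaping can be sketched with pivot_longer(), which tidyr recommends over gather() for new code. The name df_long and the use of factor(cat) to get a discrete axis are choices made for this illustration only.

```r
library(tidyr)
library(ggplot2)

df <- data.frame(cat  = c(1, 2, 3),
                 var1 = c(0.4, -0.11, -0.07),
                 var2 = c(0.31, 0.07, 0),
                 var3 = c(0.45, -0.23, 0.02))

# One row per (cat, variable) pair
df_long <- pivot_longer(df, -cat, names_to = "var", values_to = "val")

# factor(cat) makes the x axis discrete, one group of three bars per category
p <- ggplot(df_long, aes(factor(cat), val, fill = var)) +
  geom_col(position = "dodge") +
  labs(x = "cat")
```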

How to plot using ggplot2

I have a task in which I need to plot a graph using ggplot2.
I have a vector of ratings (Samsung S4 ratings from its users).
I generate the data like this:
TestRate <- data.frame(rating = sample(x = 1:5, size = 100, replace = TRUE), month = sample(x = 1:12, size = 100, replace = TRUE))
Now I need to plot a graph with dates on the X axis (months in this example data) and 5 different lines, one for each rating (1,2,3,4,5). Each line shows the count of that rating for the corresponding month.
How can I plot this in ggplot2?
First you need to count the number of elements per (rating, month) pair:
library(data.table)
setDT(TestRate)[,count:=.N,by=list(month, rating)]
And then you can plot the result:
library(ggplot2)
ggplot(TestRate, aes(month, count, color=as.factor(rating))) + geom_line()
If your data.table is not set (so to speak), you can use dplyr (and rename the legend while you are at it).
library(dplyr)
library(ggplot2)
df <- TestRate %>% group_by(rating, month) %>% summarise(count = n())
ggplot(df, aes(x=month, y=count, color=as.factor(rating))) + geom_line() + labs(color = "Rating")
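With random data some (rating, month) pairs may never occur, in which case the line for that rating just skips the month. A minimal sketch, assuming that missing pairs should be drawn as zero, using dplyr::count() and tidyr::complete(); the seed and the name counts are made up for the illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumed seed, only to make the sketch reproducible
TestRate <- data.frame(rating = sample(1:5, 100, replace = TRUE),
                       month  = sample(1:12, 100, replace = TRUE))

counts <- TestRate %>%
  count(rating, month, name = "count") %>%                        # one row per observed pair
  complete(rating = 1:5, month = 1:12, fill = list(count = 0L))   # add unobserved pairs as 0

p <- ggplot(counts, aes(month, count, color = factor(rating))) +
  geom_line() +
  scale_x_continuous(breaks = 1:12) +
  labs(color = "Rating")
```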

Plotting Average/Median of each column in data frame grouped by factors

I am trying to make a grouped barplot and I am running into trouble. For example, using the mtcars dataset, I want to group everything by the 'vs' column (col #8), find the average of all remaining columns, and then plot them by group.
Below is a very poor example of what I am trying to do, and I know it is incorrect.
Ideally, mpg for vs=1 & vs=0 would be side by side, followed by cyl's means side by side, etc. I don't care if aggregate is skipped in favor of dplyr, if ggplot is used, or even if the aggregate step is not needed; I am just looking for a way to do this since it is driving me crazy.
df = mtcars
agg = aggregate(df[,-8], by=list(df$vs), FUN=mean)
agg
barplot(t(agg), beside=TRUE, col=df$vs)
Try
library(ggplot2)
library(dplyr)
library(tidyr)
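df %>%
group_by(vs=factor(vs)) %>%
# across() replaces the deprecated summarise_each(funs(mean))
summarise(across(everything(), mean)) %>%
# pivot_longer() replaces gather(Var, Val, -vs)
pivot_longer(-vs, names_to = "Var", values_to = "Val") %>%
ggplot(aes(x=Var, y=Val, fill=vs))+
geom_col(position='dodge')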
Or using base R
m1 <- as.matrix(agg[-1])
row.names(m1) <- agg[,1]
barplot(m1, beside=TRUE, col=c('red', 'blue'), legend=row.names(m1))
