How to make a grouped barchart with two groups on x-axis - r

I have data that looks like this:
Name, Clusters, incorrectly_classified
PCA, 2, 34.37
PCA, 6, 60.80
ICA2, 2, 37.89
ICA6, 2, 33.20
ICA2, 6, 69.66
ICA6, 6, 60.54
RP2, 2, 32.94
RP4, 2, 33.59
RP6, 2, 31.25
RP2, 6, 68.75
RP4, 6, 61.58
RP6, 6, 56.77
I would like to create a barplot for the above data, similar to this plot I drew.
The x-axis will have two values, 2 and 6. The y-axis will show incorrectly_classified, and each Name will be plotted as a bar within the 2 group or the 6 group. Each Name should have the same color in both groups.
Is this possible to achieve with a bar chart? If not, what is a good way to plot this data?

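I think the following is what you are after, assuming mydf is your data frame.
library(ggplot2)
# factor() makes Clusters discrete, so the bars are grouped under 2 and 6
ggplot(data = mydf, aes(x = factor(Clusters), y = incorrectly_classified, fill = Name)) +
geom_bar(stat = "identity", position = "dodge") +
labs(x = "Clusters", y = "Incorrectly classified")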
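For completeness, here is a self-contained sketch of the same idea with the data from the question typed in by hand. It uses geom_col(), which should be equivalent to geom_bar(stat = "identity") in current ggplot2 versions; the object name p is only illustrative.

```r
library(ggplot2)

# Data from the question, entered by hand
mydf <- data.frame(
  Name = c("PCA", "PCA", "ICA2", "ICA6", "ICA2", "ICA6",
           "RP2", "RP4", "RP6", "RP2", "RP4", "RP6"),
  Clusters = c(2, 6, 2, 2, 6, 6, 2, 2, 2, 6, 6, 6),
  incorrectly_classified = c(34.37, 60.80, 37.89, 33.20, 69.66, 60.54,
                             32.94, 33.59, 31.25, 68.75, 61.58, 56.77)
)

# One group of bars per cluster count, one bar per Name within each group
p <- ggplot(mydf, aes(x = factor(Clusters), y = incorrectly_classified, fill = Name)) +
  geom_col(position = "dodge") +
  labs(x = "Clusters", y = "Incorrectly classified")
print(p)
```

Because fill is mapped to Name, each Name gets the same color in both groups.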

This can be done with barplot.
An example:
counts <- table(mtcars$vs, mtcars$gear)
barplot(counts, main="Car Distribution by Gears and VS",
xlab="Number of Gears", col=c("darkblue","red"),
legend = rownames(counts), beside=TRUE)
EDIT
I will also work out my answer to demonstrate the barplot option (although ggplot is much cooler :-) ).
If df is your data frame:
dfwide<-reshape(df,timevar="Clusters",v.names="incorrectly_classified",idvar="Name",direction="wide")
rownames(dfwide) <- dfwide$Name
dfwide$Name<-NULL
names(dfwide)[names(dfwide)=="incorrectly_classified.2"] <- "2"
names(dfwide)[names(dfwide)=="incorrectly_classified.6"] <- "6"
dfwide<-as.matrix(dfwide)
barplot(dfwide, main="Your Graph",
xlab="Clusters",ylab="incorrectly_classified",col=c("darkblue","red","orange","green","purple","grey"),
legend = rownames(dfwide), beside=TRUE,args.legend = list(x = "topleft", bty = "n", inset=c(0.15, -0.15)))
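As a possible shortcut for the reshape() steps above, xtabs() can build the Name-by-Clusters matrix in one call. This is only a sketch: it assumes each Name/Clusters pair occurs exactly once (as in the question's data), because xtabs() sums duplicates, and the names df and m are illustrative.

```r
# Data from the question, read from inline text
df <- read.csv(text = "Name,Clusters,incorrectly_classified
PCA,2,34.37
PCA,6,60.80
ICA2,2,37.89
ICA6,2,33.20
ICA2,6,69.66
ICA6,6,60.54
RP2,2,32.94
RP4,2,33.59
RP6,2,31.25
RP2,6,68.75
RP4,6,61.58
RP6,6,56.77")

# Rows = Name, columns = Clusters, cells = incorrectly_classified
m <- xtabs(incorrectly_classified ~ Name + Clusters, data = df)

barplot(m, beside = TRUE,
        xlab = "Clusters", ylab = "incorrectly_classified",
        col = c("darkblue", "red", "orange", "green", "purple", "grey"),
        legend.text = rownames(m),
        args.legend = list(x = "topleft", bty = "n"))
```

Note that xtabs() orders the rows alphabetically by Name, so the bar order may differ from the order in the original data frame.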

Related

Lollipop chart with repeated elements in different groups

I am trying to plot a lollipop chart with 5 groups and repeated elements in those groups. If all elements have different names it works as expected:
Intended behavior:
The problem is that I want to plot only 5 algorithms in different groups, and when I actually name them from Algorithm 1-5 this happens with the plot:
Unexpected behavior:
This is my snippet that produces the correct behavior of the lollipop chart (except for the wrong labels):
library(ggpubr)
# Create dataset
data <- data.frame(
algorithm=paste( "Algorithm ", seq(1,25), sep=""),
category=as.factor(c( rep('A', 5), rep('B', 5), rep('C', 5), rep('D', 5), rep('E', 5))),
metric=c(rep(rev(96:100), 5))
)
ggdotchart(data, x = "algorithm", y = "metric",
color = "category", # Color by groups
palette = c("#264653", "#2a9d8f", "#e9c46a", "#f4a261", "#e76f51"), # Custom color palette
sorting = "descending", # Sort value in descending order
add = "segments", # Add segments from y = 0 to dots
rotate = TRUE, # Rotate vertically
group = "category", # Order by groups
dot.size = 7, # Large dot size
label = round(data$metric), # Add metric values as dot labels
font.label = list(color = "white", size = 8,
vjust = 0.5), # Adjust label parameters
ggtheme = theme_pubr() # ggplot2 theme
) +
labs(y = "Metric (%)", color="")
This is the new data snippet that causes this behavior:
# Create dataset
data <- data.frame(
algorithm=rep(paste( "Algorithm ", seq(1,5), sep=""), 5),
category=as.factor(c( rep('A', 5), rep('B', 5), rep('C', 5), rep('D', 5), rep('E', 5))),
metric=c(rep(rev(96:100), 5))
)
How can I possibly solve this issue?
Once produced, we can edit this like any other ggplot object. We can use scale_x_discrete() to manipulate the axis labels, which avoids any confusion with the original plot definition and construction under the hood of ggdotchart(). Using your first plot as p, we can do:
alg_labels <- rep(paste( "Algorithm ", seq(1,5), sep=""), 5)
p +
scale_x_discrete(
labels = alg_labels
)
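A variation that does not depend on the order of the labels is to give every row a unique key and strip the prefix when the axis is drawn. The sketch below assumes that scale_x_discrete() accepts a function for labels (it does in current ggplot2); the key column and the strip_key() helper are hypothetical names introduced only for this example.

```r
library(ggpubr)

# The data from the question, with repeated algorithm names
data <- data.frame(
  algorithm = rep(paste0("Algorithm ", 1:5), 5),
  category  = factor(rep(c("A", "B", "C", "D", "E"), each = 5)),
  metric    = rep(rev(96:100), 5)
)

# Hypothetical helper column: one unique key per (category, algorithm) row
data$key <- paste(data$category, data$algorithm, sep = "|")

# Hypothetical helper: drop everything up to and including the "|"
strip_key <- function(x) sub("^[^|]*\\|", "", x)

p <- ggdotchart(data, x = "key", y = "metric",
                color = "category", group = "category",
                sorting = "descending", add = "segments",
                rotate = TRUE, dot.size = 7,
                ggtheme = theme_pubr()) +
  scale_x_discrete(labels = strip_key) +   # show only the algorithm name
  labs(y = "Metric (%)", color = "")
print(p)
```

Each dot keeps its own position because the keys are unique, while the axis shows only "Algorithm 1" to "Algorithm 5" within every category.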

How to make three different bar charts of similar type clustered in the same plot?

I need to plot my Erosion values for different levels of tillage (columns) at three levels of soil depth (rows A1, A2, A3). I want all of this to be shown as a bar chart in a single plot.
Here is a reproducible example:
a<- matrix(c(1:36), byrow = T, ncol = 4)
rownames(a)<-(c("A1","B1","C1","A2","B2","C2","A3","B3","C3"))
colnames(a)<-(c("Int_till", "Redu_till", "mulch_till", "no_till"))
barplot(a[1,], xlab = "A1", ylab = "Erosion")
barplot(a[4,], xlab = "A2", ylab = "Erosion")
barplot(a[7,], xlab = "A3", ylab = "Erosion")
##I want these three barchart side by side in a single plot
## for comparison
### and need similar plots for all the "Bs" and "Cs"
### Lastly, i want these three plots in the same page.
I have seen people do similar things using "fill" in ggplot (for lines) and specifying a factor, which nicely separates the chart into categories. I tried doing the same but always ran into errors, maybe because my data is continuous.
If anybody could help me with these two things, it would be a great help. I will appreciate it.
Thank you!
We can use ggplot
library(reshape2)
library(ggplot2)
library(dplyr)
melt(a) %>%
ggplot(., aes(x = Var2, y = value, fill = Var1)) +
geom_bar(stat = 'identity',
position = position_dodge2(preserve = "single")) +
facet_wrap(~ Var1)
Set mfcol to specify a 3x3 grid and then for each row generate its bar plot. Also, you could consider adding the barplot argument ylim = c(0, max(a)) so that all graphs use the same Y axis. title and mtext can be used to set the overall title and various margin text as we do below. See ?par, ?title and ?mtext for more information.
opar <- par(mfcol = c(3, 3), oma = c(0, 3, 0, 0))
for(r in rownames(a)) barplot(a[r, ], xlab = r, ylab = "Erosion")
par(opar)
title("My Plots", outer = TRUE, line = -1)
mtext(LETTERS[1:3], side = 2, outer = TRUE, line = -1,
at = c(0.85, 0.5, 0.17), las = 2)
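The shared y-axis suggestion can be sketched as follows. This is a minimal variant of the loop above with ylim fixed to the range of the whole matrix; the name y_range is illustrative.

```r
# Example matrix from the question
a <- matrix(1:36, byrow = TRUE, ncol = 4)
rownames(a) <- c("A1", "B1", "C1", "A2", "B2", "C2", "A3", "B3", "C3")
colnames(a) <- c("Int_till", "Redu_till", "mulch_till", "no_till")

# Same y-axis limits for every panel, so bar heights are comparable
y_range <- c(0, max(a))

# mfcol fills the 3x3 grid column by column: A1, B1, C1 go in the first column
opar <- par(mfcol = c(3, 3))
for (r in rownames(a)) {
  barplot(a[r, ], xlab = r, ylab = "Erosion", ylim = y_range)
}
par(opar)  # restore the previous layout settings
```

With this layout each row of panels holds one letter (A, B or C) and each column holds one depth (1, 2 or 3).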

Add categorical grouping to scatter plot of continuous data in R?

Sorry if image 1 is a little basic - layout sent by my project supervisor! I have created a scatterplot of total grey seal abundance (Total) over observation time (Obsv_time), and fitted a gam over the top, as seen in image 2:
plot(Total ~ Obsv_time,
data = R_Count,
ylab = "Total",
xlab = "Observation Time (Days)",
pch = 20, cex = 1, bty = "l",col="dark grey")
lines(R_Count$Obsv_time, fitted(gam.tot2))
I would like to somehow show on the graph the corresponding Season (Image 1) - from a categorical factor variable (4 levels: Pre-breeding,Breeding,Post-breeding,Moulting), which corresponds to Obsv_time.
I am unsure if I need to plot a secondary axis or just add labels to the graph...and how to do each! Thanks!
Wanted graph layout - indicate season from factor variable
Scatterplot with GAM curve
You can do this with base R graphics. Leave off the x-axis in the original plot, and add an axis with the season labels separately. You can indicate the seasons by overlaying semi-transparent rectangles.
## Some bogus data
x = sort(runif(50,0,250))
y = 800*(sin(x/40) + x/100 + rnorm(50,0, 0.2)) + 500
FittedY = 800*(sin(x/40) + x/100)+500
plot(x,y, pch= 20, col='lightgray', ylim=c(300,2700), xaxt='n',
xlab="", ylab='Total')
lines(x, FittedY)
axis(1, at=c(25,95,155,215), tick=FALSE,
labels=c('PreBreed', 'Repro', 'PostBreed', 'Moulting'))
rect(c(-10,65,125,185), 0, c(65,125,185,260), 3000,
col=rainbow(4, alpha=0.05), border=NA)
If you are able to use ggplot2, you could add (or compute from time) another factor variable to your data-frame which would be your season. Then it is just a matter of using color (or any other) aesthetic which would use this season variable.
require(ggplot2)
df <- data.frame(total = c(26, 41, 31, 75, 64, 32, 7, 89),
time = c(1, 2, 3, 4, 5, 6, 7, 8))
df$season <- cut(df$time, breaks=c(0, 2, 4, 6, 8),
labels=c("winter", "spring", "summer", "autumn"))
ggplot(df, aes(x=time, y=total)) +
geom_smooth(color="black") +
geom_point(aes(color=season))
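If shaded bands are preferred over colored points, the base R rect() idea can probably be carried over to ggplot2 with geom_rect(). The sketch below reuses the toy data from this answer; the seasons data frame and its boundaries are assumptions chosen to match the cut() breaks, not real season dates.

```r
library(ggplot2)

df <- data.frame(total = c(26, 41, 31, 75, 64, 32, 7, 89),
                 time  = 1:8)

# Assumed season boundaries (illustrative only)
season_names <- c("winter", "spring", "summer", "autumn")
seasons <- data.frame(
  season = factor(season_names, levels = season_names),
  start  = c(0, 2, 4, 6),
  end    = c(2, 4, 6, 8)
)

p <- ggplot(df, aes(x = time, y = total)) +
  # One translucent band per season, spanning the full height of the panel
  geom_rect(data = seasons, inherit.aes = FALSE,
            aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf, fill = season),
            alpha = 0.15) +
  geom_point() +
  labs(x = "Observation Time (Days)", y = "Total", fill = "Season")
print(p)
```

A fitted curve, for example from geom_smooth() or a geom_line() over the GAM's fitted values, can be added as another layer on top of the bands.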

offsetting the mean on a scatter plot?

The figure with offset points but the mean in the middle
I'm plotting two sets of data on the same plot, distinguishing the two sets by using different pch and by offsetting them. I also want to plot the mean of both sets of data but so far I've only been able to offset the data points, not the means. This is my code
points(jitter(as.numeric(gen$genord)-0.1,0.1),ai$propaiacts, pch=15,col="dimgray",cex=1)
points(jitter(as.numeric(ugen$genord)+0.1,0.1),uai$propuaiacts, pch=6)
s=split(gen$propaiacts,gen$gencode)
points(jitter(sapply(s, mean)+0.5,0.5),pch="__", cex=2)
s=split(ugen$propuaiacts,ugen$gencode)
points(jitter(sapply(s, mean)-0.1,0.1),pch="__", cex=2)
this is the relevant data:
dput(c(gen$genord,gen$propaiacts))
c(3, 1, 2, 3, 3, 1, 1, 2, 1, 2, 1, 2, 13.5986733, 6.6115702,
9.2198582, 0.6001775, 1.0177719, 6.4348071, 10.0849649, 16.5116934,
11.00971, 14.2514897, 4.366077, 7.3884464)
> dput(c(ugen$ugenord,ugen$propuaiacts))
c(3, 1, 2, 3, 3, 1, 1, 2, 1, 2, 1, 2, 1, 9.4512195, 6.3064133,
7.2121554, 0.6486974, 1.0140406, 5.9735066, 10.076442, 12.5423729,
9.6563923, 13.3744272, 4.4930535, 5.3341665, 21.0191083)
Working with your code and dataset was difficult, so I will use the iris dataset instead; hopefully it helps you get started. As an alternative to base R, I used ggplot2. I converted the data from wide to long and then added position = position_dodge(width = 0.5) to the geom_point() call. To add the mean for each group (black dot), I summarised the dataset iris_melt. I hope this helps you get what you want.
library(reshape2) # melt()
library(plyr) # ddply()
library(ggplot2)
# Wide to long: one row per Species / variable / value
iris_melt <- melt(iris, id.vars=c("Species"))
# Mean of each variable within each species
iris_melt_s <- ddply(iris_melt, c("Species", "variable"), summarise,
meanv = mean(value))
ggplot(data=iris_melt, aes(x=variable, y=value, group=Species, color=Species, shape=Species)) +
geom_point(position = position_dodge(width = 0.5)) +
geom_point(data=iris_melt_s, aes(x=variable, y=meanv, group=Species, color=Species), color="black", position = position_dodge(width = 0.5))
I realised that I could simply specify the positions of the categories on the x-axis and then shift them. It's a bit manual, but it works for now.
s=split(ai$propaiacts,ai$recallord)
points(c(1,2,3)-0.1,sapply(s, mean), pch="__", cex=2)
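The manual shift can be written out as a self-contained base R sketch. It assumes that the dput() output in the question is the group codes followed by the proportions (12 + 12 values for the first set, 13 + 13 for the second); the variable names and the 0.1 offset are illustrative.

```r
# Assumed split of the dput() output: group codes first, then proportions
gen_ord   <- c(3, 1, 2, 3, 3, 1, 1, 2, 1, 2, 1, 2)
gen_prop  <- c(13.5986733, 6.6115702, 9.2198582, 0.6001775, 1.0177719,
               6.4348071, 10.0849649, 16.5116934, 11.00971, 14.2514897,
               4.366077, 7.3884464)
ugen_ord  <- c(3, 1, 2, 3, 3, 1, 1, 2, 1, 2, 1, 2, 1)
ugen_prop <- c(9.4512195, 6.3064133, 7.2121554, 0.6486974, 1.0140406,
               5.9735066, 10.076442, 12.5423729, 9.6563923, 13.3744272,
               4.4930535, 5.3341665, 21.0191083)

offset <- 0.1  # horizontal shift applied to each data set

# Empty plot with one x position per group
plot(1, type = "n", xlim = c(0.5, 3.5), ylim = range(gen_prop, ugen_prop),
     xaxt = "n", xlab = "Group", ylab = "Proportion")
axis(1, at = 1:3)

# Points: first set shifted left, second set shifted right
points(jitter(gen_ord - offset, amount = 0.03), gen_prop, pch = 15, col = "dimgray")
points(jitter(ugen_ord + offset, amount = 0.03), ugen_prop, pch = 6)

# Group means, drawn as short horizontal bars at the same shifted positions
gen_means  <- tapply(gen_prop, gen_ord, mean)
ugen_means <- tapply(ugen_prop, ugen_ord, mean)
x_pos <- as.numeric(names(gen_means))
segments(x_pos - offset - 0.08, gen_means, x_pos - offset + 0.08, gen_means, lwd = 2)
segments(x_pos + offset - 0.08, ugen_means, x_pos + offset + 0.08, ugen_means, lwd = 2)
```

The means are not jittered, so each bar sits exactly at the center of its offset cloud of points.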

How to plot three vectors/line charts on one figure?

How to draw one line chart with 3 lines in R?
min<-c(1,1,4,5)
max<-c(8,9,8,10)
d<-c(-2,3,4,3)
We can use matplot after cbinding the vectors to create a matrix
matplot(cbind(min, max, d), type='l')
To change the x-axis labels, we can plot with xaxt='n' and add the labels with axis()
matplot(cbind(min, max, d), type='l', xaxt='n', col=2:4)
axis(1, at=1:4, labels=letters[1:4])
legend('topright', legend=c('min', 'max', 'd'), col=2:4, lty=1:3)
Another solution to complement @akrun's very good answer, using ggplot2 and directlabels:
require(ggplot2)
require(reshape2)
require(directlabels)
min <- c( 1, 1, 4, 5)
max <- c( 8, 9, 8, 10)
d <- c(-2, 3, 4, 3)
df <- data.frame(min=min, max=max, d=d, x=1:4)
df.m <- melt(df,id.vars="x")
p <- ggplot(df.m, aes(x=x, y=value, color=variable)) + geom_line()
direct.label(p)

Resources