Split barplot by grouping by days - r

I have the following bar chart produced using this code:
MD1<-read.csv("MD_qual_OTU_sorted.csv")
MD1<-data.frame(Samples=c("A","B","C","D","E","F","G","H","I","J","K","L","M", "N","O","P","Q", "R"), Number.of.OTUs=c(13,10,9,9,15,11,7,7,9,9,5,10,10,7,15,17,8,9))
par(las=1)
barplot(MD1[,2],names.arg=MD1[,1], ylab='OTU Count', yaxt='n', xlab='MD samples', main='Total OTU count/Sample',density=c(90,90, 90, 90, 90, 90, 10, 10, 10, 10, 10, 10, 40, 40, 40, 40, 40, 40), col=c("yellow","yellow","pink", "pink","green","green","red","red", "purple", "purple", "blue", "blue", "orange", "orange","cyan", "cyan","chartreuse4", "chartreuse4" ))
usr <- par("usr")
par(usr=c(usr[1:2], 0, 20))
axis(2, at=seq(0,20,5))
I want to split the samples into three separate groups: A-F (Day 3), G-L (Day 5) and M-R (Day 15).
There are similar questions posted; however, I am not sure how to restructure my data so that I can use those solutions.

You could consider using ggplot2; separate panels are very easy to produce with facet_wrap or facet_grid.
library(ggplot2)
#create a grouping variable
MD1$Day <- rep(c("Day 03","Day 05","Day 15"), each=6)
p1 <- ggplot(MD1, aes(x=Samples, y=Number.of.OTUs)) +
  geom_bar(stat="identity") +
  facet_wrap(~Day, scales="free_x")
p1
Or, if you want to use base-R and approach your original image:
#add colors/densities
MD1$col <- c("yellow","yellow","pink", "pink","green","green","red","red",
"purple", "purple", "blue", "blue", "orange", "orange","cyan", "cyan","chartreuse4", "chartreuse4" )
MD1$density <- c(90,90, 90, 90, 90, 90, 10, 10, 10, 10, 10, 10, 40, 40, 40, 40, 40, 40)
#set 1 row three cols for plotting
par(mfrow=c(1,3))
#split and plot
lapply(split(MD1, MD1$Day), function(x){
  barplot(x[,2],
          names.arg=x[,1],
          ylab='OTU Count',
          ylim=c(0,20),
          main=unique(x$Day),
          col=x$col,
          density=x$density)
})

Related

ggplot2 - geom_histogram / scale_fill_manual

I am working the following dataframe (df):
df$GP<-c(0,0,0,1,1,2,3,3,3,3,4,4,9,15,18,18,19,19,20,20,21,22,22,23)
df$colour<-c("g","g","g","g","g","g","g","g","g","g","g","g","t","t","g","g","g","g","g","g","g","g","g","g")
I want the histogram below, but showing a different fill for colour=="g" and colour=="t".
However, when I run the following code, the bars with colour=="t" go out of scale (up to 1, plot 2), whereas they should be at 0.25 (plot 1).
ggplot(data=df,aes(x=GP,y=..ndensity..))+geom_histogram(bins=25,aes(fill=colour))+scale_fill_manual(values=c("black","grey"))
Do you have any idea of how this could be achieved?
Thank you very much for your help with this one!
I used a tibble for the dataset, with different variable names (tbx and tby).
The result is just what you want.
library(ggplot2)
library(tibble)
tb <- tibble(
  tbx = c(0, 0, 0, 1, 1, 2, 3, 3, 3, 3, 4, 4, 9, 15, 18, 18, 19, 19, 20, 20, 21, 22, 22, 23),
  tby = c("g","g","g","g","g","g","g","g","g","g","g","g","t","t","g","g","g","g","g","g","g","g","g","g")
)
# note: tby = ..ndensity.. is not a y mapping, so the y-axis shows stat_bin's default (count)
ggplot(tb, aes(tbx, tby = ..ndensity..)) +
  geom_histogram(bins = 25, aes(fill = tby)) +
  scale_fill_manual(values = c("red", "grey"))
And this is the output plot:
I hope this addresses your question.
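One possible way to get the scaling the question asks for is to divide the bin counts by the largest count in the whole layer, rather than relying on ..ndensity.., which is scaled within each fill group. This is only a minimal sketch: it assumes ggplot2 3.3.0 or later (for after_stat()), and the names df and p are just for illustration.

```r
library(ggplot2)

# Data from the question, rebuilt as a standalone data frame
df <- data.frame(
  GP = c(0, 0, 0, 1, 1, 2, 3, 3, 3, 3, 4, 4, 9, 15,
         18, 18, 19, 19, 20, 20, 21, 22, 22, 23),
  colour = c(rep("g", 12), "t", "t", rep("g", 10))
)

# Divide each bin count by the largest single-group bin count in the layer,
# so all fill groups share one scale instead of each peaking at 1
p <- ggplot(df, aes(x = GP, y = after_stat(count / max(count)))) +
  geom_histogram(bins = 25, aes(fill = colour)) +
  scale_fill_manual(values = c("black", "grey")) +
  ylab("count / max(count)")
p
```

With this data the tallest "g" bin holds four values and each "t" bin holds one, so the "t" bars should end up at 0.25. Note that max(count) is taken per group and bin, so if two groups shared the tallest bin the stacked bar could exceed 1.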

Even display of unevenly spaced numbers on x/y coordinates

Could you advise on how I could make an even display of unevenly spaced numbers on a graph? For example, consider the code below:
BREAKS = c(0, 0.1, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500)
a <- seq(0,100,0.1)
b <- seq(0,1000,0.1)
plot(ecdf(a), col="red", xlim=c(0,100), main=NA, breaks=BREAKS)
plot(ecdf(b), col="green", xlim=c(0,100), add=T, breaks=BREAKS)
I would like the values 0, 0.1, 1 and 10 to be evenly spaced along the x-axis.

Table format of ANOVA output

I have a large dataset on which I am performing ANOVA analysis. I'm not sure how to get the output of the analysis into a table that I can use in a Word document (without retyping all of the values manually).
Here is an example of what I'm trying to do:
var1 <- c("Red", "Green", "Blue", "Blue", "Red","Red", "Green", "Blue", "Blue", "Red",
"Red", "Blue", "Green", "Blue", "Red","Red", "Green", "Blue", "Blue", "Red")
var2 <- c(10, 20, 15, 32, 10, 20, 15, 32, 10, 20, 15, 32, 10, 20, 15, 32, 10, 20, 15, 32)
df <- data.frame(var1, var2)
TukeyHSD(aov(var2 ~ var1))
This produces an output that looks like this:
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = var2 ~ var1)
$var1
            diff        lwr       upr     p adj
Green-Blue -8.25 -21.183389  4.683389 0.2580147
Red-Blue   -2.75 -13.310068  7.810068 0.7848043
Red-Green   5.50  -7.433389 18.433389 0.5323260
I would like the output to be in a format that is easy to cut and paste into Word that includes the headings, "Variable", "Difference" and "p value". Any help would be appreciated.
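One way this might be approached is to pull the comparison matrix out of the TukeyHSD() result, keep only the columns of interest, and write them to a CSV file that can be opened in Excel and pasted into Word as a table. This is only a sketch: var1 is converted to a factor here, and the object names (fit, tukey, res, out) and the file name tukey_table.csv are assumptions made for illustration.

```r
# Example data from the question (var1 stored as a factor)
var1 <- factor(c("Red", "Green", "Blue", "Blue", "Red", "Red", "Green", "Blue", "Blue", "Red",
                 "Red", "Blue", "Green", "Blue", "Red", "Red", "Green", "Blue", "Blue", "Red"))
var2 <- c(10, 20, 15, 32, 10, 20, 15, 32, 10, 20, 15, 32, 10, 20, 15, 32, 10, 20, 15, 32)

fit <- aov(var2 ~ var1)
tukey <- TukeyHSD(fit)

# TukeyHSD() returns one matrix per factor; its rows are the pairwise comparisons
res <- as.data.frame(tukey$var1)

# Keep the comparison name, the difference and the adjusted p value
out <- data.frame(
  Variable = rownames(res),
  Difference = round(res$diff, 2),
  "p value" = round(res[["p adj"]], 3),
  check.names = FALSE
)
print(out)

# Hypothetical output location; change the path as needed
out_file <- file.path(tempdir(), "tukey_table.csv")
write.csv(out, out_file, row.names = FALSE)
```

The rounding (2 and 3 decimal places) is an arbitrary choice for readability.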

Changing legend labels in ggplotly()

I have a plot of polygons that are colored according to a quantitative variable in the dataset, binned at certain discrete cutoff values (0, 5, 10, 15, 20, 25). I currently have a static ggplot() output that "works" the way I intend: the legend values are the cutoff values (0, 5, 10, 15, 20, 25). The static plot is below:
However, when I simply convert this static plot to an interactive plot, the legend values become hexadecimal color codes (#54278F, #756BB1, etc.) instead of the cutoff values (0, 5, 10, 15, 20, 25). A screenshot of this interactive plot is shown below:
I am trying to find a way to change the legend labels in the interactive plot to the cutoff values (0, 5, 10, 15, 20, 25). Any suggestions or support would be greatly appreciated!
Below is the code I used to create the static and interactive plot:
library(plotly)
library(ggplot2)
library(RColorBrewer)
set.seed(1)
x = abs(rnorm(30))
y = abs(rnorm(30))
value = runif(30, 1, 30)
myData <- data.frame(x=x, y=y, value=value)
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueColor <- cut(myData$value, breaks=c(0, cutList, 30), labels=rev(purples))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueColor)) + geom_polygon(stat="identity") + scale_fill_manual(labels = as.character(c(0, cutList)), values = levels(myData$valueColor), name = "Value")
# Interactive plot
ip <- ggplotly(sp)
Label the factor levels with the cut points (rather than with the hex colors) and supply the colors through scale_fill_manual.
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueLab <- cut(myData$value, breaks=c(0, cutList, 30), labels=as.character(c(0, cutList)))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueLab)) + geom_polygon(stat="identity") + scale_fill_manual(values = rev(purples))
# Interactive plot
ip <- ggplotly(sp)

Animate ggplot2 stacked line chart in R

I'm trying to animate a stacked line chart in ggplot2.
Here's the plot I'd like to animate:
Here's the code to generate a similar plot:
#Data
mydata <- data.frame(year=rep(1:6, times=4),
                     activity=as.factor(rep(c("research","coursework","clinical work","teaching"), each=6)),
                     time=c(40, 35, 40, 60, 85, 90,
                            50, 40, 10, 0, 5, 0,
                            5, 20, 20, 40, 10, 10,
                            5, 5, 30, 0, 0, 0))
mydata$activity <- ordered(mydata$activity, levels = c("research","clinical work","coursework","teaching"))
labels <- data.frame(activity=c("research","coursework","clinical work","teaching"),
                     xaxis=c(5, 1.8, 2.5, 2.97),
                     yaxis=c(25, 70, 48, 90))
#Plot
ggplot(mydata, aes(x=year, y=time, fill=activity)) +
  geom_area(stat="smooth", span=.35, color="black") +
  theme(legend.position = "none") +
  geom_text(data=labels, aes(x=xaxis, y=yaxis, label=activity)) +
  ggtitle("Time in Different Activities by Year in Program") +
  ylab("Percentage of Time") +
  xlab("Year in Program")
The first frame should display only the axes and text. In the second, I'd like to gradually reveal the "Research" area (including its color and border) over time, from left to right. The third should reveal the "Clinical Work" area in the same way, the fourth the "Coursework" area and, finally, the fifth the "Teaching" area.
Ideally, the output would be very smooth (no jagged jumps) and compatible with PowerPoint.
Here is an R-based solution. It saves individual figures (.png) that can be stepped through within a presentation.
Alternatively, you could create an animation (for example, by converting the figures to a .gif) using ImageMagick http://www.imagemagick.org/
library(ggplot2)
#Data
mydata <- data.frame(year=rep(1:6, times=4),
                     activity=as.factor(rep(c("research","coursework","clinical work","teaching"), each=6)),
                     time=c(40, 35, 40, 60, 85, 90,
                            50, 40, 10, 0, 5, 0,
                            5, 20, 20, 40, 10, 10,
                            5, 5, 30, 0, 0, 0))
#order the activities and then the dataframe
mydata$activity <- ordered(mydata$activity, levels = c("research","clinical work","coursework","teaching"))
mydata <- mydata[order(mydata$activity),]
#labels
labels <- data.frame(activity=c("research","coursework","clinical work","teaching"),
                     xaxis=c(5, 1.8, 2.5, 2.97),
                     yaxis=c(25, 70, 48, 90))
#create a function that draws the plot for the first `leg` activities
draw.stacks <- function(leg){
  int <- leg*6
  #seq_len(int) selects no rows when leg is 0 (1:0 would pick row 1)
  a <- ggplot(data=mydata[seq_len(int),], aes(x=year, y=time, fill=activity)) +
    geom_area(stat="smooth", span=.35, color="black") +
    theme_bw() +
    scale_fill_discrete(limits = c("research","clinical work","coursework","teaching"), guide="none") +
    theme(panel.grid.major = element_blank(),
          panel.grid.minor = element_blank()) +
    coord_cartesian(xlim=c(1,6), ylim=c(0,100)) +
    geom_text(data=labels, aes(x=xaxis, y=yaxis, label=activity)) +
    ggtitle("Time in Different Activities by Year in Program") +
    ylab("Percentage of Time") +
    xlab("Year in Program")
  print(a)
}
# save individual png figures (activity.0.png to activity.4.png)
for (i in 0:4) {
  png(paste("activity", i, "png", sep="."))
  draw.stacks(i)
  dev.off()
}
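The ImageMagick route mentioned above could be driven from R roughly as follows. This is only a sketch under a few assumptions: ImageMagick is installed and on the PATH, the five frames were written to the working directory by the loop above, and activity.gif is a made-up output name. The -delay and -loop options are standard ImageMagick options; the delay is given in hundredths of a second.

```r
# Frame files written by the loop above: activity.0.png ... activity.4.png
frames <- paste("activity", 0:4, "png", sep = ".")
gif_file <- "activity.gif"  # hypothetical output name

# -delay 100 shows each frame for one second; -loop 0 repeats forever
args <- c("-delay", "100", "-loop", "0", frames, gif_file)

# ImageMagick 7 ships the command as `magick`; older versions use `convert`
im <- Sys.which(c("magick", "convert"))
im <- im[nzchar(im)]

# Only call ImageMagick if it was found and all frames exist
if (length(im) > 0 && all(file.exists(frames))) {
  system2(im[[1]], args)
}
```

With only five frames the result steps between the plots rather than revealing each area smoothly; a smoother animation would need many more intermediate frames.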
Sorry for bringing in a non-programmer solution, but I would simply generate the plot for each iteration separately, put them in PowerPoint (one plot per slide), and use some fancy slide transition effects (I tried the Random Bars effect on your example, and it looked nice).
If you are determined to find an R-based solution, you can take a look at the animate package (see a Strategic Zombie Simulation example here).
