Animate ggplot2 stacked line chart in R

I'm trying to animate a stacked line chart in ggplot2.
Here's the plot I'd like to animate:
Here's the code to generate a similar plot:
library(ggplot2)
#Data
mydata <- data.frame(year=rep(1:6, times=4),
activity=as.factor(rep(c("research","coursework","clinical work","teaching"), each=6)),
time=c(40, 35, 40, 60, 85, 90,
50, 40, 10, 0, 5, 0,
5, 20, 20, 40, 10, 10,
5, 5, 30, 0, 0, 0))
mydata$activity <- ordered(mydata$activity, levels = c("research","clinical work","coursework","teaching"))
labels <- data.frame(activity=c("research","coursework","clinical work","teaching"),
xaxis=c(5, 1.8, 2.5, 2.97),
yaxis=c(25, 70, 48, 90))
#Plot
ggplot(mydata, aes(x=year, y=time, fill=activity)) +
geom_area(stat="smooth", span=.35, color="black") +
theme(legend.position = "none") +
geom_text(data=labels, aes(x=xaxis, y=yaxis, label=activity)) +
ggtitle("Time in Different Activities by Year in Program") +
ylab("Percentage of Time") +
xlab("Year in Program")
I'm looking for the first image to display all axes and text. The second iteration, I'd like to gradually reveal over time, from left to right, the "Research" stacked line (including color and border). The third iteration, I'd like to gradually reveal, from left to right, the "Clinical Work" stacked line. Fourth, the "Coursework" stacked line. And finally, the "Teaching" stacked line.
Ideally, the output format would be very smooth (no jagged jumps) and would be compatible with PowerPoint.

Here is an R-based solution. It saves individual figures (.png) that can be stepped through within a presentation.
Alternatively, you could create an animation (for example, by converting the figures to a .gif) using ImageMagick: http://www.imagemagick.org/
library(ggplot2)
#Data
mydata <- data.frame(year=rep(1:6, times=4),
activity=as.factor(rep(c("research","coursework","clinical work","teaching"), each=6)),
time=c(40, 35, 40, 60, 85, 90,
50, 40, 10, 0, 5, 0,
5, 20, 20, 40, 10, 10,
5, 5, 30, 0, 0, 0))
#order the activities and then the dataframe
mydata$activity <- ordered(mydata$activity, levels = c("research","clinical work","coursework","teaching"))
mydata <- mydata[order(mydata$activity),]
#labels
labels <- data.frame(activity=c("research","coursework","clinical work","teaching"),
xaxis=c(5, 1.8, 2.5, 2.97),
yaxis=c(25, 70, 48, 90))
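#create a function that draws the plot with the first `leg` activities revealed
draw.stacks<-function(leg){
int <- leg*6
#seq_len(0) selects no rows, so leg=0 draws only the axes and labels
#(1:int would evaluate to c(1, 0) and pick up the first data row)
a<-ggplot(data=mydata[seq_len(int),], aes(x=year, y=time, fill=activity))+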
geom_area(stat="smooth", span=.35, color="black") +
theme_bw()+
scale_fill_discrete(limits = c("research","clinical work","coursework","teaching"), guide="none")+
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
coord_cartesian(xlim=c(1,6),ylim=c(0,100))+
geom_text(data=labels, aes(x=xaxis, y=yaxis, label=activity)) +
ggtitle("Time in Different Activities by Year in Program") +
ylab("Percentage of Time") +
xlab("Year in Program")
print(a)
}
# save individual png figures
for (i in 0:4) {
png(paste("activity", i, "png", sep="."))
draw.stacks(i)
dev.off()
}
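The ImageMagick route mentioned above can also be driven from R. The following is a minimal sketch, assuming the magick package (an R wrapper around ImageMagick, not part of the original answer) is installed. The stand-in frames are hypothetical placeholders so the example runs on its own; in practice you would point `files` at the activity.0.png to activity.4.png files written by the loop.

```r
library(magick)  # assumption: the magick package is installed

# Stand-in frames (hypothetical) so the sketch is self-contained.
# Replace with the real activity.<i>.png files from the loop above.
frame_dir <- tempdir()
files <- file.path(frame_dir, sprintf("activity.%d.png", 0:4))
for (i in seq_along(files)) {
  png(files[i], width = 480, height = 360)
  plot(seq_len(i), main = paste("frame", i - 1))
  dev.off()
}

frames <- image_read(files)             # read all frames, in order
anim <- image_animate(frames, fps = 1)  # one frame per second
out <- file.path(frame_dir, "activity.gif")
image_write(anim, out)                  # write the animated GIF
```

This only steps between whole frames. A smooth left-to-right reveal would need many more intermediate frames per activity.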

Sorry for bringing in a non-programmer solution, but I would simply generate the plot for each iteration separately, put them in PowerPoint (one plot per slide), and use some fancy slide transition effects (I tried the Random Bars effect on your example, and it looked nice).
If you are determined to find an R-based solution, you can take a look at the animation package (see a Strategic Zombie Simulation example here).

Related

Plotting a curved line directly through some datapoints in R (ggplot)

I am struggling with plotting a figure on the motion profile of a simulator.
What I'm trying to show are the displacements of the simulator over time.
Some sample data:
Time = c(0, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44)
Displacement = c(0, 0, 7, 0, 0, 7, 0, 0, -7, 0, 0)
DD = data.frame(Time, Displacement)
I want to plot a curved/smoothed line that goes directly through these datapoints.
Using geom_line of course generates a spiky line.
The closest I have gotten to get a smoother line is using this piece of code:
ggplot(DD, aes(x=Time, y=Displacement, c(0,7))) +
geom_smooth(method = "loess", se = FALSE, span = 0.2, colour="black")
However, the curves are still quite spiky, and I am hoping to get a more beautiful plot.
Hoping anyone can be of help :)
Anne
Try with a polynomial fit:
library(ggplot2)
#Code
ggplot(DD, aes(x=Time, y=Displacement)) +
geom_smooth(method = "lm", formula = y~poly(x,3), se = FALSE, colour="black")
Output:
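Note that a polynomial fit is a least-squares smooth, so it is not guaranteed to pass through the data points, which is what the question asks for. A minimal sketch of one alternative, assuming an interpolating spline is acceptable: base R's splinefun() builds a curve that passes through every point, and the result can be drawn with geom_line. The grid size of 200 is an arbitrary choice.

```r
library(ggplot2)

Time <- c(0, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44)
Displacement <- c(0, 0, 7, 0, 0, 7, 0, 0, -7, 0, 0)
DD <- data.frame(Time, Displacement)

# Natural cubic spline through the data; an interpolating spline
# reproduces every input point exactly.
f <- splinefun(DD$Time, DD$Displacement, method = "natural")

# Evaluate the spline on a fine grid to get a smooth-looking line.
grid <- data.frame(Time = seq(min(DD$Time), max(DD$Time), length.out = 200))
grid$Displacement <- f(grid$Time)

p <- ggplot(DD, aes(x = Time, y = Displacement)) +
  geom_line(data = grid, colour = "black") +
  geom_point()
p
```

The spline may overshoot slightly between the flat segments (dipping below 0 next to a peak), so check the curve against what the simulator physically does.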

Changing legend labels in ggplotly()

I have a plot of polygons that are colored according to a quantitative variable in the dataset being cut off at certain discrete values (0, 5, 10, 15, 20, 25). I currently have a static ggplot() output that "works" the way I intend. Namely, the legend values are the cut off values (0, 5, 10, 15, 20, 25). The static plot is below -
However, when I simply convert this static plot to an interactive plot, the legend values become hexadecimal values (#54278F, #756BB1, etc.) instead of the cut off values (0, 5, 10, 15, 20, 25). A screenshot of this interactive plot is shown below -
I am trying to determine a way to change the legend labels in the interactive plot to be the cut off values (0, 5, 10, 15, 20, 25). Any suggestions or support would be greatly appreciated!
Below is the code I used to create the static and interactive plot:
library(plotly)
library(ggplot2)
library(RColorBrewer)
set.seed(1)
x = abs(rnorm(30))
y = abs(rnorm(30))
value = runif(30, 1, 30)
myData <- data.frame(x=x, y=y, value=value)
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueColor <- cut(myData$value, breaks=c(0, cutList, 30), labels=rev(purples))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueColor)) + geom_polygon(stat="identity") + scale_fill_manual(labels = as.character(c(0, cutList)), values = levels(myData$valueColor), name = "Value")
# Interactive plot
ip <- ggplotly(sp)
Label using the cut points and use scale_fill_manual for the colors.
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueLab <- cut(myData$value, breaks=c(0, cutList, 30), labels=as.character(c(0, cutList)))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueLab)) + geom_polygon(stat="identity") + scale_fill_manual(values = rev(purples))
# Interactive plot
ip <- ggplotly(sp)
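The key step in this answer is that cut() stores the cut points as the factor's levels, so the fill variable carries readable labels instead of color codes. A small check of that step, using hypothetical values chosen so that one falls in each bin:

```r
cutList <- c(5, 10, 15, 20, 25)
value <- c(2.5, 7, 12, 18, 22, 29)  # hypothetical values, one per bin

# Label each bin by its lower cut point rather than by a hex color.
valueLab <- cut(value, breaks = c(0, cutList, 30),
                labels = as.character(c(0, cutList)))

levels(valueLab)  # the labels a legend would be built from
table(valueLab)   # number of values per bin
```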

R: Combine a graph layout with a ggplot2 object and a gplots object

I want to have one figure with two plots, one of them is a ggplot2 object, and the second is a plot generated with gplots. For example combine the next two plots in a row:
library(ggplot2)
library(gplots) #For plotmeans
df = structure(list(age = c(14, 22, 35, 21, 88, 66, 14, 22, 35, 21),
values = c(22, 8, 1.9, 26.8, 32, 15.,1.9, 26.8, 32, 15.)),
.Names = c("age", "values"),
row.names = 1:10,
class = "data.frame")
ggplot(df, aes(values)) + geom_histogram()
plotmeans(df$values ~ df$age)
I tried grid, gridExtra, par and layout but w/o success.
Any idea how can I do so?
I found the next solution using gridBase:
(based on https://stackoverflow.com/a/14125565/890739)
library(grid) # For pushViewport and plotViewport
library(gridBase) # To combine two plots
par(mfrow=c(1, 2))
plot.new()
vps <- baseViewports()
pushViewport(vps$figure)
vp1 <-plotViewport(c(1.8,1,0,1))
#Plot histogram
g1 <- ggplot(df, aes(values)) + geom_histogram()
print(g1,vp = vp1)
plotmeans(df$values ~ df$age)
Is there a simpler way?

Split barplot by grouping by days

I have the following bar chart produced using this code:
MD1<-read.csv("MD_qual_OTU_sorted.csv")
MD1<-data.frame(Samples=c("A","B","C","D","E","F","G","H","I","J","K","L","M", "N","O","P","Q", "R"), Number.of.OTUs=c(13,10,9,9,15,11,7,7,9,9,5,10,10,7,15,17,8,9))
par(las=1)
barplot(MD1[,2],names.arg=MD1[,1], ylab='OTU Count', yaxt='n', xlab='MD samples', main='Total OTU count/Sample',density=c(90,90, 90, 90, 90, 90, 10, 10, 10, 10, 10, 10, 40, 40, 40, 40, 40, 40), col=c("yellow","yellow","pink", "pink","green","green","red","red", "purple", "purple", "blue", "blue", "orange", "orange","cyan", "cyan","chartreuse4", "chartreuse4" ))
usr <- par("usr")
par(usr=c(usr[1:2], 0, 20))
axis(2, at=seq(0,20,5))
I want to split samples A-F into a separate group (Day 3), G-L (Day 5) and M-R (Day 15)
There are similar questions posted however I am not sure how to tidy up the manner in which I have inputted my data to be able to use these solutions.
You could consider using ggplot2; separate plots are very easy with facet_wrap and facet_grid.
library(ggplot2)
#create a grouping variable
MD1$Day <- rep(c("Day 03","Day 05","Day 15"),
each=6)
p1 <- ggplot(MD1, aes(x=Samples,y=Number.of.OTUs)) +
geom_bar(stat="identity") + facet_wrap(~Day,
scales="free_x")
p1
Or, if you want to use base R and approximate your original image:
#add colors/densities
MD1$col <- c("yellow","yellow","pink", "pink","green","green","red","red",
"purple", "purple", "blue", "blue", "orange", "orange","cyan", "cyan","chartreuse4", "chartreuse4" )
MD1$density <- c(90,90, 90, 90, 90, 90, 10, 10, 10, 10, 10, 10, 40, 40, 40, 40, 40, 40)
#set 1 row three cols for plotting
par(mfrow=c(1,3))
#split and plot
lapply(split(MD1, MD1$Day),function(x){
barplot(x[,2],
names.arg=x[,1],
ylab='OTU Count',
ylim=c(0,20),
main=unique(x$Day),
col=x$col,
density=x$density)
})

create a heatmap with regions in R

I have the following kind of data: on a rectangular piece of land (120x50 yards), there are 6 (also rectangular) smaller areas, each with a different kind of plant. The idea is to study the attractiveness of the various kinds of plant to birds. Each time a bird sits down somewhere on the land, I have the exact coordinates of where the bird sits down.
I don't care exactly where the bird sits down, but only care which of the six areas it is. To show the relative preference of birds for the various plants, I want to make a heatmap that makes the areas that are frequented most the darkest.
So, I need to convert the coordinates to code which area the bird visits, and then create a heatmap that shows the differential preference for each land area.
(the research is a bit more involved than this, but this is the general idea.)
How would I do this in R? Is there an R function that takes a vector of coordinates and turns it into such a heatmap? If not, do you have some hints on how to do this?
Not meant to be the answer you are looking for, but might give you some inspiration.
# Simulate some data
birdieLandingSimulator <- data.frame(t(sapply(1:100, function(x) c(runif(1, -10,10), runif(1, -10,10)))))
# Assign some coordinates, which ended up not really being used much at all, except for the point colors
assignCoord <- function(x)
{
# Assign the four coordinates clockwise: 1, 2, 3, 4
ifelse(all(x>0), 1, ifelse(!sum(x>0), 3, ifelse(x[1]>0, 2, 4)))
}
birdieLandingSimulator <- cbind(birdieLandingSimulator, Q = apply(birdieLandingSimulator, 1, assignCoord))
# Plot
require(ggplot2)
ggplot(birdieLandingSimulator, aes(x = X1, y = X2)) +
stat_density2d(geom="tile", aes(fill = 1/..density..), contour = FALSE) +
geom_point(aes(color = factor(Q))) + theme_classic() +
theme(axis.title = element_blank(),
axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank()) +
scale_color_discrete(guide = "none", h=c(180, 270)) +
scale_fill_continuous(name = "Birdie Landing Location")
Use ggplot2. Take a look at the examples for geom_bin2d. It's pretty simple to get 2d bins. Notice that you pass in binwidth for both x and y:
> df = data.frame(x=c(1,2,4,6,3,2,4,2,1,7,4,4),y=c(2,1,4,2,4,4,1,4,2,3,1,1))
> ggplot(df,aes(x=x, y=y,alpha=0.5)) + geom_bin2d(binwidth=c(2,2))
If you don't want to use ggplot, you can use the cut function to separate your data into bins.
# Test data.
x <- sample(1:120, 100, replace=T)
y <- sample(1:50, 100, replace=T)
# Separate the data into bins.
x <- cut(x, c(0, 40, 80, 120))
y <- cut(y, c(0, 25, 50))
# Now plot it, suppressing reordering.
heatmap(table(y, x), Colv=NA, Rowv=NA)
Alternatively, to actually plot the regions in their true geographic location, you could draw the boxes yourself with rect. You would have to count the number of points in each region.
# Test data.
x <- sample(1:120, 100, replace=T)
y <- sample(1:50, 100, replace=T)
regions <- data.frame(xleft=c(0, 40, 40, 80, 0, 80),
ybottom=c(0, 0, 15, 15, 30, 40),
xright=c(40, 120, 80, 120, 80, 120),
ytop=c(30, 15, 30, 40, 50, 50))
# Color gradient.
col <- colorRampPalette(c("white", "red"))(30)
# Make the plot.
plot(NULL, xlim=c(0, 120), ylim=c(0, 50), xlab="x", ylab="y")
invisible(apply(regions, 1, function (r) {
count <- sum(x >= r["xleft"] & x < r["xright"] & y >= r["ybottom"] & y < r["ytop"])
# count + 1 so that a count of 0 maps to the first color; cap at the last color
rect(r["xleft"], r["ybottom"], r["xright"], r["ytop"], col=col[min(count + 1, length(col))])
text( (r["xright"]+r["xleft"])/2, (r["ytop"]+r["ybottom"])/2, count)
}))
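The two steps the question asks for (coding each landing as a region, then shading regions by how often they are visited) can also be done with ggplot2. This is a minimal sketch under stated assumptions: the region boundaries are the same illustrative ones as in the rect example, the landing coordinates are simulated, and `region_of` is a hypothetical helper name.

```r
library(ggplot2)

set.seed(42)
# Simulated landing coordinates on the 120 x 50 yard field (hypothetical data).
birds <- data.frame(x = runif(100, 0, 120), y = runif(100, 0, 50))

# Illustrative region boundaries; replace with the real plot layout.
regions <- data.frame(id = 1:6,
                      xleft   = c(0, 40, 40, 80, 0, 80),
                      ybottom = c(0, 0, 15, 15, 30, 40),
                      xright  = c(40, 120, 80, 120, 80, 120),
                      ytop    = c(30, 15, 30, 40, 50, 50))

# Hypothetical helper: id of the region containing a point, NA if none.
region_of <- function(px, py) {
  hit <- which(px >= regions$xleft & px < regions$xright &
               py >= regions$ybottom & py < regions$ytop)
  if (length(hit) == 0) NA_integer_ else regions$id[hit[1]]
}
birds$region <- mapply(region_of, birds$x, birds$y)

# Count landings per region, keeping regions with zero landings.
regions$count <- as.vector(table(factor(birds$region, levels = regions$id)))

# Draw each region at its true location, shaded by its count.
p <- ggplot(regions) +
  geom_rect(aes(xmin = xleft, xmax = xright, ymin = ybottom, ymax = ytop,
                fill = count), colour = "black") +
  geom_text(aes(x = (xleft + xright) / 2, y = (ybottom + ytop) / 2,
                label = count)) +
  scale_fill_gradient(low = "white", high = "red") +
  coord_fixed()
p
```

If the regions differ in size, dividing each count by the region's area before shading may give a fairer comparison of preference.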
