ggplot2 animation shows some empty plots - r

I'm trying to plot a lot of scatterplots in an animation, but many of the frames are just empty plots. Which frames are empty also changes every time I run the code or adjust the range.
Some plots do work and they are all supposed to look like this:
But most of the plots look like this:
This is my code:
library(ggplot2)
library(animation)
begintime <- min(dfL$time)
endtime <- max(dfL$time)
beginRange <- begintime
endRange <- begintime + 10
dateRangeBetween <- function(x,y){dfL[dfL$time >= x & dfL$time <= y,]}
saveHTML({
for (i in 1:20) {
dfSub <- dateRangeBetween(beginRange, endRange)
ggScatterplot = ggplot(data = dfSub, aes(x = UTM_WGS84.Longitude, y = UTM_WGS84.Latitude)) + ggtitle("Coordinates") + xlab("Longitude") + ylab("Latitude") + theme(legend.position = "top") + geom_point()
beginRange <- beginRange + 10
endRange <- endRange + 10
print(ggScatterplot)
}
}, img.name = "coordinatesplots", imgdir = "coordinatesplots", htmlfile = "coordinatesplots.html",
outdir = getwd(), autobrowse = FALSE, ani.height = 400, ani.width = 600,
verbose = FALSE, autoplay = TRUE, title = "Coordinates")
This is an example of my dataframe:
track time UTM_WGS84.Longitude UTM_WGS84.Latitude
1 1 2015-10-14 23:59:55.711 5.481687 51.43635
2 1 2015-10-14 23:59:55.717 5.481689 51.43635
3 1 2015-10-14 23:59:55.723 5.481689 51.43635
4 1 2015-10-14 23:59:55.730 5.481690 51.43635
5 1 2015-10-14 23:59:55.763 5.481691 51.43635
Can someone please help me with this?

The most likely reason for your plots being empty is that the subset of your data.frame itself is empty.
I think (hard to say without seeing your full data) that the problem is that you're not incrementing by the correct amount of time. If time is stored as POSIXct, adding a number to it adds that many seconds (for a Date it would add days). Your sample rows are only milliseconds apart, so I suspect the full range of your data is less than 10 seconds, and therefore only the first plot will show some data. After that, the time window lies outside the range of your data.
If that is the case, just change the + 10 to the actual amount of time you want to add, for example 1 for one second or 0.1 for a tenth of a second.
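To check whether this is what is happening, a minimal sketch along these lines may help. It assumes time is a POSIXct column; the small data frame, the 0.02 second step, and the half-open window (`<` rather than `<=` at the upper end) are stand-ins chosen for illustration, not taken from the real data.

```r
# Hypothetical stand-in for dfL, with timestamps a few milliseconds apart.
t0 <- as.POSIXct("2015-10-14 23:59:55", tz = "UTC")
dfL <- data.frame(
  time = t0 + c(0.711, 0.717, 0.723, 0.730, 0.763),
  UTM_WGS84.Longitude = c(5.481687, 5.481689, 5.481689, 5.481690, 5.481691),
  UTM_WGS84.Latitude = 51.43635
)

# Adding a number to a POSIXct value shifts it by that many seconds.
shift <- as.numeric(difftime(t0 + 10, t0, units = "secs"))

step <- 0.02  # window width in seconds (assumed value)
beginRange <- min(dfL$time)
rows_per_window <- integer(0)
for (i in 1:3) {
  endRange <- beginRange + step
  dfSub <- dfL[dfL$time >= beginRange & dfL$time < endRange, ]
  rows_per_window[i] <- nrow(dfSub)
  # An empty subset is what produces an empty frame in the animation.
  if (nrow(dfSub) == 0) message("window ", i, " is empty")
  beginRange <- endRange
}
rows_per_window
```

Printing nrow(dfSub) inside the original loop in the same way shows directly which frames have no data to draw.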

Related

How to accommodate several plots using facet_wrap in ggplot

My code produces a lattice of plots, but there are 77 of them, so each panel is squished and unreadable. I have tried adjusting the width and height, and facet_wrap_paginate as well, but none of these give me the output I want. Each panel is a line plot for one subject and EVTEST, and there are 77 subjects.
I want to split the plots into panels of 4x3 or 4x4 spread over multiple pages. I am currently writing the output to RTF, but I am open to PDF if that works. How do I do it?
Below is fictitious data in the same format as the input to my code. I am trying to draw line plots for this longitudinal data, with VISITDY_D on the x axis and EVSTRESN on the y axis. I group the plots by a concatenated handle (SUBJID_EVTEST_SITE).
SUBJID SITE EVTEST EVSTRESN VISITDY_D SIDE
1 AB ABC 1.1 D00 Left
1 AB ABC 2.1 D28 Right
1 AB ABC 2.2 D56 Left
1 AB ABC 2.3 D84 Left
2 AB ABC 1.5 D00
2 AB ABC 1.6 D28 Right
#read the data (csv file)
donnees <- read.csv(paste0(path_data,"Sample.csv"), sep = ";",header =
T,stringsAsFactors = FALSE)
Params = c("Phenotype1","Phenotype2")
#PLOTTING FUNCTION
pf1<-function(subD,tit1){ # subD:Input data, tit1: Title for the plot
subD$SUBJID1 <-
as.factor(paste0(subD$RANDOID,'_',subD$EVTEST,'_',subD$SITE))
p1 <- ggplot(subD,aes(x = VISITDY_D,y = EVSTRESN, color=SIDE,group=SIDE)) +
geom_line(position=position_dodge(width=0.7))+
geom_point() + facet_wrap_paginate(~ SUBJID1, nrow=3,ncol=3,page=1) +
theme()
print(p1)
p1 <- ggplot(subD,aes(x = VISITDY_D,y = EVSTRESN,
color=SIDE,group=SIDE)) +
geom_line(position=position_dodge(width=0.7))+
geom_point() + facet_wrap(~ SUBJID1) + theme()
print(p1)
}
# OUTPUT; Calling the plotting function in RTF
oP <- "/output_directory"
setwd(oP)
rtf <- RTF(file = paste0("TEST_","individual profiles.rtf"))
addTOC(rtf)
addPageBreak(rtf)
for(s in 1:length(Params)){
dat = subset(donnees,donnees$EVTEST %in% Params[s])
SUBJID1 <-as.factor(paste0(dat$RANDOID,'_',dat$EVTEST,'_',dat$SITE))
tit1<-paste0(Params[s])
addHeader(rtf,tit1,font.size = 4, TOC.level = 1)
addPlot(rtf, plot.fun=pf1, subD= dat,tit1= tit1, width= 7, height=5.2,
res=250)
}
done(rtf)

Alter my values to surround a certain point R

I have the following data, which shows the values for 5 different cohorts of patients (3 patients in each cohort):
dat <- data.frame(Cohort=c(1,1,1, 2,2,2, 3,3,3, 4,4,4, 5,5,5),
LEN_Dose=c(15,15,15, 25,25,25, 15,15,15, 10,10,10, 10,10,10),
DLT=c("N","N","N", "Y","Y","N", "Y","N","Y", "N","N","Y", "N","N","Y"))
I would like to modify the cohort levels to be +/- 0.2 of the main cohort number so they don't sit on top of one another in a graph. I can achieve what I want like this:
dat$Cohort <- dat$Cohort-0.2
dat$Cohort <- ifelse(duplicated(dat$Cohort), dat$Cohort+0.2, dat$Cohort)
dat$Cohort <- ifelse(duplicated(dat$Cohort), dat$Cohort+0.2, dat$Cohort) # have to run this twice as there are 3 patients
So the result is:
head(dat)
# Cohort LEN_Dose DLT
# 0.8 15 N
# 1.0 15 N
# 1.2 15 N
# 1.8 25 Y
# 2.0 25 Y
# 2.2 25 N
But I'm wondering if there's a better way to do this? Eg somehow inputting the base cohort level and some function automatically works out the 3 values I need?
The point is to eventually graph the data using this graph:
ggplot(aes(x=Cohort, y=as.numeric(LEN_Dose)), data = dat) +
ylab("Dose Level\n") +
xlab("\nCohort") +
ggtitle("\n") +
scale_y_continuous(breaks = c(5, 10, 15, 25),
label = c("1.2mg/kg\n5mg", "1.2mg/kg\n10mg", "1.8mg/kg\n15mg", "1.8mg/kg\n25mg")) +
scale_fill_manual(values = c("white", "darkred"),
name="Had DLT") +
geom_line(colour="grey20", size=1) +
geom_point(shape=23, size=6, aes(fill=DLT), stroke=1.1, colour="grey20") + # 21 for circles
theme_classic() +
theme(legend.box.margin=margin(c(0,0,0,-10))) +
expand_limits(y=c(5,25))
EDIT: I have tried position = position_jitter, position = position_dodge and all the other types of positions within ggplot itself, but they don't space the points equally or in any particular order, which is why I'm trying to modify the dataframe itself
How about writing your own jitter function, something like:
jitterit <- function(xTojitter = dat$Cohort, howMuchjitter = 0.2) {
  x <- xTojitter
  # loop over the distinct non-missing values; NAs are left untouched
  for (u in unique(x[!is.na(x)])) {
    idx <- which(xTojitter == u)  # positions of this value, computed once
    n <- length(idx)
    if (n == 1) next              # a single value stays where it is
    half <- floor(n / 2)
    x[idx[seq_len(half)]] <- u - howMuchjitter     # first half moves left
    x[idx[(n - half + 1):n]] <- u + howMuchjitter  # last half moves right
    # with an odd count, the middle element keeps the original value
  }
  return(x)
}
It works for both even and odd numbers of duplicates, and leaves NA values as they are:
jitterit(xTojitter = c(1,1,2,1,2,NA), howMuchjitter=0.2)
[1] 0.8 1.0 1.8 1.2 2.2 NA
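If equal spacing for any group size is wanted, a shorter variant can be sketched with base R's ave(). This is an alternative to the function above, not a drop-in replacement: spread_evenly and its width argument are hypothetical names, and it spaces all duplicates evenly between -width and +width instead of moving them in two halves. It assumes there are no NA values in the input.

```r
# Hypothetical helper: spread duplicated x values evenly around their
# common value. Assumes x contains no NA values.
spread_evenly <- function(x, width = 0.2) {
  offsets <- function(v) {
    n <- length(v)
    if (n == 1) return(v)  # nothing to separate
    v + seq(-width, width, length.out = n)
  }
  # ave() applies offsets() within each group of identical values
  ave(as.numeric(x), x, FUN = offsets)
}

dat <- data.frame(Cohort = c(1,1,1, 2,2,2, 3,3,3, 4,4,4, 5,5,5))
dat$Cohort_pos <- spread_evenly(dat$Cohort)
head(dat$Cohort_pos)
```

Keeping the shifted positions in a separate column (Cohort_pos here) leaves the original cohort number available for labels and grouping.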

Overlap plots in R - from zoo package

Using the following code:
library("ggplot2")
require(zoo)
args <- commandArgs(TRUE)
input <- read.csv(args[1], header=F, col.names=c("POS","ATT"))
id <- args[2]
prot_len <- nrow(input)
manual <- prot_len/100 # 4.3
att_name <- "Entropy"
att_zoo <- zoo(input$ATT)
att_avg <- rollapply(att_zoo, width = manual, by = manual, FUN = mean, align = "left")
autoplot(att_avg, col="att1") + labs(x = "Positions", y = att_name, title="")
With data:
> str(input)
'data.frame': 431 obs. of 2 variables:
$ POS: int 1 2 3 4 5 6 7 8 9 10 ...
$ ATT: num 0.652 0.733 0.815 1.079 0.885 ...
I do:
I would like to load input2, which has a different length (and therefore a different x-axis), and overlay the two curves in the same plot. By overlay I mean that both curves should fill the same plot area, so I will ignore the overlapping axis labels and titles. I want to compare the shapes regardless of the length of the input.
First I've tried by generating toy input2 changing manual value, so that I have att_avg2 in which manual equals e.g. 7. In between original autoplot and new autoplot-2 I add par(new=TRUE), but this is not my expected output. Any hint on how doing this? Maybe it's better to save att_avg from zoo series to data.frame and not use autoplot? Thanks
UPDATE, response to G. Grothendieck:
If I do:
[...]
att_zoo <- zoo(input$ATT)
att_avg <- rollapply(att_zoo, width = manual, by = manual, FUN = mean, align = "left") #manual=4.3
att_avg2 <- rollapply(att_zoo, width = 7, by = 7, FUN = mean, align = "left")
autoplot(cbind(att_avg, att_avg2), facet=NULL) +
labs(x = "Positions", y = att_name, title="")
I get
and a warning message:
Removed 1 rows containing missing values (geom_path).
par is used with classic graphics, not for ggplot2. If you have two zoo series just cbind or merge the series together and autoplot them using facet=NULL:
library(zoo)
library(ggplot2)
z1 <- zoo(1:3) # length 3
z2 <- zoo(5:1) # length 5
autoplot(cbind(z1, z2), facet = NULL)
Note: The question omitted input2 so there could be some additional considerations from aspects not shown.
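To see what cbind does with series of different lengths, the merged object from the toy example can be inspected directly. This sketch only uses the z1 and z2 series defined above; that the padding is related to the missing-values warning in the question is an assumption, since input2 is not shown.

```r
library(zoo)

z1 <- zoo(1:3)  # length 3
z2 <- zoo(5:1)  # length 5

# cbind on zoo series aligns them on the union of their indexes
# and fills positions that one series lacks with NA.
m <- cbind(z1, z2)
m_values <- coredata(m)        # plain matrix of the merged values
n_missing <- sum(is.na(m_values))
m
```

Here z1 has no values at index 4 and 5, so those two cells are NA, and ggplot2 reports dropped rows when it meets such gaps.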

How to dodge points in ggplot2 in R

df = data.frame(subj=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10), block=factor(rep(c(1,2),10)), acc=c(0.75,0.83,0.58,0.75,0.58,0.83,0.92,0.83,0.83,0.67,0.75,0.5,0.67,0.83,0.92,0.58,0.75,0.5,0.67,0.67))
ggplot(df,aes(block,acc,group=subj)) + geom_point(position=position_dodge(width=0.3)) + ylim(0,1) + labs(x='Block',y='Accuracy')
How do I get points to dodge each other uniformly in the horizontal direction? (I grouped by subj in order to get it to dodge at all, which might not be the correct thing to do...)
I think this might be what you were looking for, although no doubt you have solved it by now.
Hopefully it will help someone else with the same issue.
A simple way is to use geom_dotplot like this:
ggplot(df,aes(x=block,y=acc)) +
geom_dotplot(binaxis = "y", stackdir = "center", binwidth = 0.03) + ylim(0,1) + labs(x='Block',y='Accuracy')
This looks like this:
Note that x (block in this case) has to be a factor for this to work.
If they don't have to be perfectly aligned horizontally, here's one quick way of doing it, using geom_jitter. You don't need to group by subj.
Method 1 [Simpler]: Using geom_jitter()
ggplot(df,aes(x=block,y=acc)) + geom_jitter(position=position_jitter(0.05)) + ylim(0,1) + labs(x='Block',y='Accuracy')
Play with the jitter width for greater degree of jittering.
which produces:
Method 2: Deterministically calculating the jitter value for each row
We first use aggregate to count the number of duplicated entries. Then in a new data frame, for each duplicated value, move it horizontally to the left by an epsilon distance.
df$subj <- NULL #drop this so that aggregate works.
#a new data frame that shows duplicated values
agg.df <- aggregate(list(numdup=seq_len(nrow(df))), df, length)
agg.df$block <- as.numeric(agg.df$block) #convert block from a factor to its numeric level code (1 or 2)
# block acc numdup
#1 2 0.50 2
#2 1 0.58 2
#3 2 0.58 1
#4 1 0.67 2
#...
epsilon <- 0.02 #jitter distance
new.df <- NULL #create an expanded dataframe, with block value jittered deterministically
r <- 0
for (i in 1:nrow(agg.df)) {
for (j in 1:agg.df$numdup[i]) {
r <- r+1 #row counter in the expanded df
new.df$block[r] <- agg.df$block[i]
new.df$acc[r] <- agg.df$acc[i]
new.df$jit.value[r] <- agg.df$block[i] - (j-1)*epsilon
}
}
new.df <- as.data.frame(new.df)
ggplot(new.df,aes(x=jit.value,y=acc)) + geom_point(size=2) + ylim(0,1) + labs(x='Block',y='Accuracy') + xlim(0,3)
which produces:
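The same deterministic offsets as in Method 2 can also be computed without the nested loop. The following is a sketch using base R's ave(); dup_rank is a name introduced here for illustration, and the data frame is the one from the question without the subj column.

```r
df <- data.frame(
  block = factor(rep(c(1, 2), 10)),
  acc = c(0.75, 0.83, 0.58, 0.75, 0.58, 0.83, 0.92, 0.83, 0.83, 0.67,
          0.75, 0.5, 0.67, 0.83, 0.92, 0.58, 0.75, 0.5, 0.67, 0.67)
)

epsilon <- 0.02                 # jitter distance, as in Method 2
x <- as.numeric(df$block)       # factor level codes: 1 or 2

# Running count (1, 2, 3, ...) within each (block, acc) combination.
dup_rank <- ave(seq_along(x), df$block, df$acc, FUN = seq_along)

# The k-th duplicate is moved (k - 1) * epsilon to the left.
df$jit.value <- x - (dup_rank - 1) * epsilon
```

df can then be passed to ggplot with aes(x = jit.value, y = acc), in the same way as new.df above.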

Coloring line segments in ggplot2

Suppose I have following data for a student's score on a test.
set.seed(1)
df <- data.frame(question = 0:10,
resp = c(NA,sample(c("Correct","Incorrect"),10,replace=TRUE)),
score.after.resp=50)
for (i in 1:10) {
ifelse(df$resp[i+1] == "Correct",
df$score.after.resp[i+1] <- df$score.after.resp[i] + 5,
df$score.after.resp[i+1] <- df$score.after.resp[i] - 5)
}
df
question resp score.after.resp
1 0 <NA> 50
2 1 Correct 55
3 2 Correct 60
4 3 Incorrect 55
5 4 Incorrect 50
6 5 Correct 55
7 6 Incorrect 50
8 7 Incorrect 45
9 8 Incorrect 40
10 9 Incorrect 35
11 10 Correct 40
I want to get following graph:
library(ggplot2)
ggplot(df,aes(x = question, y = score.after.resp)) + geom_line() + geom_point()
My problem is that I want to color the segments of this line according to the student's response. After a correct response (score increasing) the segment should be green, and after an incorrect response (score decreasing) it should be red.
I tried the following code, but it did not work:
ggplot(df,aes(x = question, y = score.after.resp, color=factor(resp))) +
geom_line() + geom_point()
Any ideas?
I would probably approach this a little differently, and use geom_segment instead:
df1 <- as.data.frame(with(df,cbind(embed(score.after.resp,2),embed(question,2))))
colnames(df1) <- c('yend','y','xend','x')
df1$col <- ifelse(df1$y - df1$yend >= 0,'Decrease','Increase')
ggplot(df1) +
geom_segment(aes(x = x,y = y,xend = xend,yend = yend,colour = col)) +
geom_point(data = df,aes(x = question,y = score.after.resp))
A brief explanation:
I'm using embed to transform the x and y variables into starting and ending points for each line segment, and then simply adding a variable that indicates whether each segment went up or down. Then I used the previous data frame to add the original points themselves.
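To make the embed step concrete, here is a small sketch on a short made-up score vector (the values are illustrative, not the full data from the question). Each row of embed(x, 2) pairs a value with the one before it, which is why the first column is named yend and the second y.

```r
score <- c(50, 55, 60, 55)   # illustrative scores
question <- 0:3

# embed(x, 2) returns a two-column matrix: column 1 holds x[t],
# column 2 holds x[t - 1], for t = 2, ..., length(x).
seg <- as.data.frame(cbind(embed(score, 2), embed(question, 2)))
colnames(seg) <- c("yend", "y", "xend", "x")

# A segment goes up when its end point is higher than its start point.
seg$col <- ifelse(seg$y - seg$yend >= 0, "Decrease", "Increase")
seg
```

Four points give three segments, each with its own start, end, and direction label.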
Alternatively, I suppose you could use geom_line something like this:
df$resp1 <- c(as.character(df$resp[-1]),NA)
ggplot(df,aes(x = question, y = score.after.resp, color=factor(resp1),group = 1)) +
geom_line() + geom_point(color = "black")
By default ggplot2 groups the data according to the aesthetics that are mapped to factors. You can override this default by setting group explicitly:
last_plot() + aes(group=NA)
