How to create a bwplot with date on x-axis - r

users
thanks to the reply of #McQueenDon on r-nabble
http://r.789695.n4.nabble.com/boxplot-with-x-axis-time-td4686787.html#a4687746
I managed to produce a base-graphics boxplot of a single variable with the x-axis correctly formatted and spaced according to the date of acquisition.
How can I produce the same plot with lattice::bwplot? I need lattice because I would also like to use a conditioning factor.
Here is a reproducible example (thanks again to #McQueenDon):
data(iris)
pippo= stack(iris[,-5])
pippo$date= rep(c("2013/01/29", "2013/03/01", "2013/11/01",
"2013/12/01", "2014/02/01", "2014/07/02"), 100)
pippo$date= as.Date(pippo$date)
boxplot(pippo$values ~ pippo$date) ## NOT exactly what I want
bx <- boxplot(pippo$values ~ pippo$date, plot = FALSE)
bxp(bx, at = sort(unique(pippo$date))) # this is what I was looking for!
require(lattice)
bwplot(values ~ date, pippo, horizontal = FALSE) # dates are correctly ordered and formatted, but not correctly spaced
# finally, I would like to condition on the 'ind' variable
bwplot(values ~ date | ind, pippo, horizontal = FALSE, layout = c(2, 2))
Thanks
Giuseppe

How about
xyplot(values ~ date | ind, pippo, horizontal = FALSE, layout = c(2, 2),
       panel = panel.bwplot, box.width = 20)
Here we use xyplot with a custom panel= argument rather than bwplot, because bwplot first converts x to a factor, which renumbers the levels as sequential integers; xyplot does not do this, so the dates keep their true spacing.
If you want to label the exact dates, you could try
dts <- unique(pippo$date)
xyplot(values ~ date | ind, pippo, horizontal = FALSE, layout = c(2, 2),
       panel = panel.bwplot, box.width = 20,
       scales = list(x = list(at = dts)))
but that looks quite crowded in this particular example.
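To see why the factor conversion matters, here is a small base R sketch (not part of the original answer) that compares the positions the six dates get as factor codes with the positions they get as Date values. The variable names factor_pos and date_pos are made up for illustration.

```r
# Six unevenly spaced acquisition dates (same values as in the question).
dates <- as.Date(c("2013/01/29", "2013/03/01", "2013/11/01",
                   "2013/12/01", "2014/02/01", "2014/07/02"))

# Treated as a factor, the dates become the integer codes 1..6.
factor_pos <- as.integer(factor(dates))

# Kept as Date, the positions are days since 1970-01-01,
# so the uneven gaps between acquisitions are preserved.
date_pos <- as.numeric(dates)

print(diff(factor_pos))  # every gap is 1
print(diff(date_pos))    # gaps in days between consecutive dates
```

The second vector is what an axis needs if the boxes are to sit at their true positions in time.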

Related

Plot multiple columns saved in data frame with no x

My problem is multifaceted.
I would like to plot multiple columns saved in a data frame. Those columns do not have an x variable but would essentially be 1 to 101 consistent for all. I have seen that I can transfer them into long format but most ggplot options require an X. I tried zoo which does what I want it to, but the x-label is all jumbled and I am not aware of how to fix it. (Example of data below, and plot)
df <- zoo(HIP_131_Y0_LC_walk1[1:9])
plot(df)
I have multiple data frames saved in a list so ultimately would like to run a function and apply to all. The zoo function solves step one but I am not able to apply to all the data frames in the list.
graph<-lapply(myfiles,function(x) zoo(x) )
print(graph)
Ideally I would like to also mark minimum and maximum, which I am aware can be done with ggplot but not zoo.
Thank you so much for your help in advance
Assuming the problem is overlapping panel names, there are several solutions:
Abbreviate the names using abbreviate. We show this for plot.zoo and autoplot.zoo.
Put the panel name in the upper left. We show this for plot.zoo using a custom panel.
Use a header on each panel. We show this using xyplot.zoo and using ggplot.
The examples below use the test input in the Note at the end. (Next time please provide a complete example including all input in reproducible form.)
The first two examples below abbreviate the panel names, using plot.zoo and autoplot.zoo (which uses ggplot2). The third example uses xyplot.zoo (which uses lattice); it produces panel headers automatically and is probably the easiest solution.
library(zoo)
plot(z, ylab = abbreviate(names(z), 8))
library(ggplot2)
zz <- setNames(z, abbreviate(names(z), 8))
autoplot(zz)
library(lattice)
xyplot(z)
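As a quick check of what abbreviate does to the long names, this sketch applies it to the same nine names used in the test input, without needing zoo. The variable short is just an illustrative name.

```r
# Same long names as in the test input: first nine state names plus "XYZ".
nms <- paste0(head(state.name, 9), "XYZ")

# Shorten each name to a minimum length of 8 characters.
# The result is a character vector named by the original strings.
short <- abbreviate(nms, minlength = 8)
print(short)
```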
This fourth example puts the panel names in the upper left of the panels themselves, using plot.zoo with a custom panel.
pnl <- function(x, y, ..., pf = parent.frame()) {
legend("topleft", names(z)[pf$panel.number], bty = "n", inset = -0.1)
lines(x, y)
}
plot(z, panel = pnl, ylab = "")
We can also get headers with autoplot.zoo, similar to the lattice example above.
library(ggplot2)
autoplot(z, facets = ~ Series, col = I("black")) +
theme(legend.position = "none")
List
If you have a list of vectors L (see Note at end for a reproducible example of such a list) then this will produce a zoo object:
do.call("merge", lapply(L, zoo))
Note
Test input used above.
library(zoo)
set.seed(123)
nms <- paste0(head(state.name, 9), "XYZ") # long names
m <- matrix(rnorm(101*9), 101, dimnames = list(NULL, nms))
z <- zoo(m)
L <- split(m, col(m)) # test list using m in Note
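The question also asked about marking the minimum and maximum of each series. The following is only a sketch in base graphics (not zoo or ggplot2), using a smaller made-up matrix with three columns; the helper extremes is hypothetical and not part of any package.

```r
set.seed(123)
# Small stand-in for the data: 101 rows, three series, no explicit x variable.
m <- matrix(rnorm(101 * 3), 101, dimnames = list(NULL, c("A", "B", "C")))

# Hypothetical helper: row index and value of the minimum and maximum of one series.
extremes <- function(y) {
  i <- c(which.min(y), which.max(y))
  data.frame(index = i, value = y[i], row.names = c("min", "max"))
}

# One small data frame of extremes per column.
ext <- lapply(as.data.frame(m), extremes)

# One panel per column; x is the implicit row index 1..101.
op <- par(mfrow = c(ncol(m), 1), mar = c(2, 4, 2, 1))
for (nm in colnames(m)) {
  plot(m[, nm], type = "l", ylab = "", main = nm)
  # Blue point marks the minimum, red point the maximum.
  points(ext[[nm]]$index, ext[[nm]]$value, col = c("blue", "red"), pch = 19)
}
par(op)
```

Wrapping the loop in a function and calling it with lapply over a list of data frames would be one way to apply the same plot to every element of the list.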

Multiple histograms with title and mean as a line?

I'm struggling with the histogram function in my exploratory analysis. I would like to run a couple of variables in my dataset through a histogram function and, for each, add the title and a line at the arithmetic mean. This is how far I've got (but the main title is still missing):
histo.abline <-function(x){
hist(x)
abline(v = mean(x, na.rm = TRUE), col = "blue", lwd = 4)}
sapply(dataset[c(7:10)], histo.abline)
I tried to add a main argument to the histogram function, but it just doesn't pick up the right variable name from my dataset. When I put main=x there, it returns NULL for each variable. colnames, names and other functions didn't work either. Could you help me?
You can try to do it with ggplot2:
library(ggplot2)
histo.abline <-function(dataset,colnum){
p<-ggplot(dataset,aes(dataset[,colnum]))+geom_histogram(bins=5,fill=I("blue"),col=I("red"), alpha=I(.2))+
geom_vline(xintercept = mean(dataset[,colnum], na.rm = TRUE))+xlab(as.character(names(dataset)[colnum]))
return(p)
}
Since you have not provided data, let's work with mtcars and create a list of histograms:
dataset=mtcars
listOfHistograms<-lapply(3:7,function(x) histo.abline(dataset,x))
The list has 5 histograms; you can plot, for instance, the first one with:
print(listOfHistograms[[1]])
More histogram options for ggplot here: https://www.r-bloggers.com/how-to-make-a-histogram-with-ggplot2/
hope this helps
EDIT: Multiple plots in one graph
One way to do it is with the cowplot library:
library(cowplot)
plot_grid(plotlist=listOfHistograms[1:4])
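If you would rather stay with base hist, one option is to iterate over the column names instead of the columns, so that the name is available for the title. This is only a sketch on mtcars; the extra title argument and the returned mean are my additions to the function from the question.

```r
# Base-graphics variant: the column name is passed in explicitly as the title.
histo.abline <- function(x, title) {
  hist(x, main = title, xlab = title)
  m <- mean(x, na.rm = TRUE)
  abline(v = m, col = "blue", lwd = 4)
  invisible(m)  # return the mean so it can be inspected afterwards
}

dataset <- mtcars
cols <- names(dataset)[3:7]

# Loop over names, not columns: sapply(dataset[3:7], ...) hands the function
# only the values, so the column name is not visible inside it.
means <- sapply(cols, function(nm) histo.abline(dataset[[nm]], nm))
print(means)
```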

45 degree line in Plot function in R

I have data like this:
df <- data.frame(X=rnorm(10,0,1), Y=rnorm(10,0,1), Z=rnorm(10,0,1))
I need to plot each variable against every other one, so I used
plot(df)
It plotted each variable in df against each of the others, which is exactly what is required.
But I want to add a 45 degree line (where x=y) to each and every subplot. How can this be done? I also tried a loop, but due to "space constraint" it did not work (in reality I have 5 variables in df). Please help.
Thanks
plot(df) calls pairs to plot data.frames. So, using this answer, we can try:
my_line <- function(x,y,...){
points(x,y,...)
segments(min(x), min(y), max(x), max(y),...)
}
pairs(df, lower.panel = my_line, upper.panel = my_line)
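Note that segments(min(x), min(y), max(x), max(y)) joins two corners of the data's bounding box, which lies on x = y only when both variables share the same minimum and maximum. If the identity line itself is wanted, a variant with abline(0, 1) is sketched below; identity_line and corner_slope are illustrative names of mine.

```r
set.seed(1)
df <- data.frame(X = rnorm(10, 0, 1), Y = rnorm(10, 0, 1), Z = rnorm(10, 0, 1))

# Panel function: draw the points, then the line y = x (intercept 0, slope 1).
identity_line <- function(x, y, ...) {
  points(x, y, ...)
  abline(a = 0, b = 1, col = "red")
}

pairs(df, lower.panel = identity_line, upper.panel = identity_line)

# For comparison: slope of the corner-to-corner segment in the X vs Y panel.
corner_slope <- with(df, diff(range(Y)) / diff(range(X)))
print(corner_slope)
```

Depending on the data, the line x = y may cross only part of a panel, or miss it entirely when the two variables have very different ranges.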

R : Bad graphic of ordered boxplot according to median

Here is what I am trying to do : I have a data.frame (data) of 160 rows with 2 variables (fact (8 groups) and response) and I want to do a boxplot of response ~ fact, ordered in increasing order of the medians.
Code :
data <- read.table("box.txt",header=T)
attach(data)
index <- order(tapply(response,fact,median))
ordered <- factor(rep(index,rep(20,8)))
boxplot(response~ordered,notch=T,names=as.character(index),xlab="treatments",ylab="response")
but in the resulting graphic the boxes are badly plotted (not in the right order and with "false" Min, Max, etc.).
I'm using RStudio with R 3.0.2 on Windows 7.
Any clue what that means?
One reproducible and seemingly correct answer would be:
set.seed(1)
data <- data.frame(response=10*rnorm(160), fact=factor(rep(1:8), labels=letters[1:8]))
data$fact <- reorder(data$fact, data$response, median)
boxplot(response~fact, data=data, notch=TRUE, xlab="treatments", ylab="response")
Names on the ticks of the x axis are correct, without further ado.
No idea why it looks 'bad', but the order is wrong because you use order instead of rank to find the index. For the other issues you probably have to make a reproducible example.
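The difference between order and rank is easy to see on a tiny example. The medians below are made up for illustration, and pos is just an illustrative name.

```r
# Made-up medians for three groups (illustration only).
med <- c(a = 3.2, b = 1.5, c = 2.7)

print(order(med))  # which group comes first, second, third when sorted
print(rank(med))   # the sorted position of each group, in the original group order

# Relabelling group i with order(med)[i] scrambles the groups;
# rank(med) gives the position each group really has.
fact <- factor(c("a", "b", "c", "a", "b", "c"))
pos  <- rank(med)[as.character(fact)]
print(pos)
```

Here order(med) says "take b, then c, then a", while rank(med) says "a is third, b is first, c is second", which is what is needed to relabel the rows.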
The reproducible example is as follows, with two boxplots to compare. In my case the plot (possibly) looks bad because of the devil's ears. As for the OP's question, I take "bad" to mean that using order() instead of rank() caused other mishaps as well (although I wouldn't know why).
data <- data.frame(response=rnorm(160), fact=factor(rep(1:8), labels=letters[1:8]))
boxplot(response~fact, data=data, notch=TRUE, xlab="treatments", ylab="response")
data$ordered <- rank(tapply(data$response, data$fact, median))
boxplot(response~ordered, data=data, notch=TRUE, xlab="treatments", ylab="response")

Time data values in R

How can I have a data set of only time intervals (no dates) in R, like the following:
TREATMENT_A TREATMENT_B
1:01:12 0:05:00
0:34:56 1:08:09
and compute mean times, etc, and draw boxplots with time intervals in the y-axis?
I am new to R, and I searched for this but found no example on the net.
Thanks
The chron package has a 'times' class that supports arithmetic. You could also do all of that with POSIXct objects and format the date-time output so that it does not include the date. I thought the axis.POSIXct function had a format argument that should give time-only labels. However, it does not seem to get dispatched properly, so I needed to construct the axis "by hand":
dft <- data.frame(x = factor(sample(1:2, 100, replace = TRUE)),
                  y = Sys.time() + rnorm(100) * 4000)
boxplot(y~x, data=dft, yaxt='n')
axis(2, at=seq(from=range(dft$y)[1], to =range(dft$y)[2], by=3000) ,
labels=format.POSIXct(seq(from=range(dft$y)[1], to =range(dft$y)[2], by=3000),
format ="%H:%M:%S") )
There did turn out to be an appropriate method, Axis.POSIXt (to which I thought boxplot should have been turning for plotting, but it did not seem to recognize the class of the 'y' argument):
boxplot(y~x, data=dft, yaxt='n')
Axis(side=2, x=range(dft$y), format ="%H:%M:%S")
Regarding your request for something "simpler", take a look at this ggplot2-based solution, using the dft data frame defined above with POSIXct times. (I did try with the chron times object but got a message saying ggplot did not support that class):
require(ggplot2); p <- ggplot(dft, aes(x,y))
p + geom_boxplot()
Check out the "lubridate" package, and the "hms" function within it.
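For durations as small as the ones in the question, a plain base R sketch may already be enough: convert each "h:mm:ss" string to seconds, do the arithmetic on numbers, and format back only for display. The helpers to_seconds and to_hms are hypothetical names written for this example, not functions from chron or lubridate.

```r
# Hypothetical helper: "h:mm:ss" strings to seconds.
to_seconds <- function(hms) {
  parts <- strsplit(hms, ":", fixed = TRUE)
  sapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)))
}

# Hypothetical helper: seconds (rounded to whole seconds) back to "h:mm:ss".
to_hms <- function(s) {
  s <- round(s)
  sprintf("%d:%02d:%02d", s %/% 3600, (s %% 3600) %/% 60, s %% 60)
}

# The two rows shown in the question.
times <- data.frame(TREATMENT_A = c("1:01:12", "0:34:56"),
                    TREATMENT_B = c("0:05:00", "1:08:09"),
                    stringsAsFactors = FALSE)

secs <- as.data.frame(lapply(times, to_seconds))
mean_secs <- colMeans(secs)
print(to_hms(mean_secs))

# Boxplot on the numeric scale, with the y-axis relabelled as h:mm:ss.
boxplot(secs, yaxt = "n", ylab = "duration")
ticks <- pretty(range(unlist(secs)))
axis(2, at = ticks, labels = to_hms(ticks))
```

This keeps everything numeric, so mean, median and the other summaries work directly; for longer or messier inputs one of the packages mentioned above is probably the better choice.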
