Box-and-Whisker plot grouped by year using R - r

I have a time series object in R that contains the AirPassengers bookings for every month from 1949 to 1960. It's easy to draw a box plot grouped by month with the command boxplot(AP ~ cycle(AP)). How do I draw a box plot grouped by year instead?

Sample code to get you started:
ap <- data.frame(AirPassengers)
year <- rep(seq(1949,1960), rep(12,12))
boxplot(ap$AirPassengers~year)
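As a variation on the snippet above, the year labels can be derived from the series itself instead of being typed in. This is only a minimal sketch: it assumes the series is made up of complete years, as AirPassengers is, and `yr` is an illustrative name.

```r
# Assumes a ts consisting of whole years (AirPassengers runs Jan 1949 to Dec 1960)
AP <- AirPassengers
# One year label per observation, repeated once for each period in the year
yr <- rep(start(AP)[1]:end(AP)[1], each = frequency(AP))
boxplot(as.numeric(AP) ~ yr, xlab = "Year", ylab = "Passengers")
```

If the series started or ended mid-year, the labels would no longer line up with the observations and this shortcut should not be used.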

Related

Is it possible to get data from a plot in R

I have this plot where I plotted patient IDs on the x axis and BMI on the y axis. I found a cluster of data in the "severely underweight" category, as you can see in the plot. How can I get a table of all the points that fall in that cluster?
OR
How can I extract one category from a column in R.
Assuming that your data is in the d1 data frame and the category is given by the group column, one possibility is to use the dplyr package:
library(dplyr)  # without this, filter() resolves to stats::filter and fails
suw <- filter(d1, group == "Severely underweight")
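The same subset can also be taken in base R without extra packages. A minimal sketch on made-up data: the rows of `d1`, the `id` and `bmi` columns, and the other category labels are hypothetical and only mirror the names assumed above.

```r
# Hypothetical stand-in for the real data
d1 <- data.frame(
  id    = 1:6,
  bmi   = c(14.2, 22.5, 15.1, 27.3, 13.8, 24.0),
  group = c("Severely underweight", "Normal", "Severely underweight",
            "Overweight", "Severely underweight", "Normal")
)
# Keep only the rows whose category matches
suw <- d1[d1$group == "Severely underweight", ]
suw
```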

Multiple plots factor by ID and Day

Hi, I am trying to draw multiple plots split by ID and DAY. Every ID has data for several days, so each ID should get one plot per day. I tried a lattice plot as shown below, but conditioning on both DAY and ID is the problem.
library("lattice")
# require("lattice") - you do not need this line
xyplot(IPRED + PRED + DV ~ TIME | ID, data = df, type = c("l", "l", "p"), col = c("blue", "black", "red"),
       distribute.type = TRUE, xlab = "Time (h)", ylab = "conc", layout = c(0, 4))
Columns ID DAY TIME DV IPRED PRED
Not too sure what your ultimate goal is, but this may be of some assistance. facet_wrap from the ggplot2 package allows you to split the plots by multiple variables.
library(ggplot2)
data(iris)
iris$Day<-rep(weekdays(Sys.Date()+0:4),each=10)
ggplot(data=iris,aes(x=Sepal.Width,y=Sepal.Length))+
geom_point(aes(colour=Species))+
facet_wrap(~Day+Species,nrow=5)

Histogram to show the count per month or day in R

I'm trying to create a histogram that shows the count of events by date for each month, so I can see the total for each month. When I create the histogram, the left-hand axis shows a density instead of a count.
How do I get a graph that shows the total number of dates per month?
Example code (rough indication only)
data_toview <- read.csv("file_with_data.csv", stringsAsFactors = FALSE)
#Distribution of count per day, so i know how the data can spike
hist(data_toview$interesting_date, breaks = "days")
Note: I may be using the wrong plot type, which is why I did not specify histogram in the question title. Any suggestions for getting the months onto the axis labels are also welcome.
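One possible way to get totals per month is to count the dates by month label and draw a bar chart of those counts. This is a sketch under assumptions rather than a fix for the exact data: the sample dates are made up, and `dates` and `month_counts` are illustrative names.

```r
# Made-up event dates standing in for data_toview$interesting_date
dates <- as.Date(c("2015-01-03", "2015-01-17", "2015-02-09",
                   "2015-03-01", "2015-03-15", "2015-03-28"))
# Count events per calendar month; "%Y-%m" labels sort chronologically
month_counts <- table(format(dates, "%Y-%m"))
# Bar heights are counts, and the month labels appear on the x axis
barplot(month_counts, xlab = "Month", ylab = "Number of events")
```

Months with no events do not appear in the table, so they are missing from the chart rather than shown as empty bars.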

How to create boxplot based on 5 year intervals in R

I have a continuous variable y measured on different dates. I need to make boxplots with a box showing the distribution of y for each 5 year interval.
Sample data:
rdob <- as.Date(dob, format= "%m/%d/%y")
ggplot(data = data, aes(x=rdob, y=ageyear)) + geom_boxplot()
#Warning message:
#Continuous x aesthetic -- did you forget aes(group=...)?
This image is the first one I tried. What I want is a box for every five year interval, instead of a box for every year.
Here is a way to pull out the year in base R:
format(as.Date("2008-11-03", format="%Y-%m-%d"), "%Y")
Simply wrap your date vector in a format() and add the "%Y". To get this to be integer, you can use as.integer.
You could also take a look at the year function in the lubridate package which will make this extraction a little bit more straightforward.
One method to get 5 year intervals is to use cut to create a factor variable that creates levels at selected break points. Unless you have dozens of years your best bet would be to set the break points manually:
df$myTimeInterval <- cut(df$years, breaks=c(1995, 2000, 2005, 2010, 2015))
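To see what cut produces, here is a small sketch on made-up years. By default the intervals are closed on the right, so a value equal to the lowest break (1995 here) would fall outside every bin and become NA; passing right = FALSE closes the intervals on the left instead. The `years` vector and the name `bins` are hypothetical.

```r
# Hypothetical years to bin
years <- c(1995, 1999, 2000, 2004, 2010, 2014)
# right = FALSE gives bins [1995,2000), [2000,2005), [2005,2010), [2010,2015)
bins <- cut(years, breaks = c(1995, 2000, 2005, 2010, 2015), right = FALSE)
table(bins)
```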
Here's an example taking Dave2e's suggestion of using cut on date intervals along with ggplot's group aesthetic mapping:
library(ggplot2)
n <- 1000
## Randomly sample birth dates and dummy up an effect that trends upward with DOB
dobs <- sample(seq(as.Date('1970/01/01'), Sys.Date(), by="day"), n)
effect <- rnorm(n) + as.numeric(as.POSIXct(dobs)) / as.numeric(as.POSIXct(Sys.Date()))
data <- data.frame(dob=dobs, effect=effect)
## boxplot w/ DOB binned to 5 year intervals
ggplot(data=data, aes(x=dob, y=effect)) + geom_boxplot(aes(group=cut(dob, "5 year")))
library(lubridate)
year <- year(rdob)  # extract the year from each date

limiting x-axis by the facets used in qplot, R

I was trying to create a plot for each of the 4 quarters in my data. So I used:
qplot(date, mape, data = df, facets = .~quarter)
The resulting plot was:
The x-axis uses all the months present in the "date" column, from January to December, but the data in each facet is limited to the months of that quarter, e.g. Jan-Feb-Mar in the Q1 facet, Apr-May-Jun in Q2, and so on. How can I limit the x-axis of each facet to the months of its quarter only?
As an alternative, I did this, but it's not exactly how I want to display my data: qplot(strftime(df$date, format = "%b"), mape, data = predicted.modelset)
It would be best if you could provide data for this but I am pretty sure the following will work:
qplot(date, mape, data = df) + facet_wrap(~quarter,scales="free_x") #without a dot before ~
