How to produce a scatter plot of dates vs magnitudes in R?

This is what I have done so far, but it's wrong.
earthquakes<- c(6.6,6.8,8.4)
dates <- (13/02/2001 ,28/02/2001,23/06/2001)
plot(earthquakes,dates)
I have only started learning R. Please help.

earthquakes<- c(6.6,6.8,8.4)
dates <- as.Date(c("13/02/2001", "28/02/2001", "23/06/2001"), format="%d/%m/%Y")
plot(dates, earthquakes)
You had a few issues:
Dates should be in quotes; otherwise R thinks you're trying to do arithmetic (i.e. 13 divided by 02 divided by 2001).
To convert the strings to actual date objects, use as.Date: pass a vector of dates (this is the c(...) part), and then specify the format they are in so that R knows how to parse the strings.
You had x and y swapped.
Note: the as.Date step is not strictly necessary, but if you skip it, the x-axis of the plot will place every item equidistant, irrespective of how far apart the dates actually are in time.
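To see why the conversion matters, here is a minimal sketch using the same data. Once the strings are Date objects, the gaps between them are real day counts, which is what lets plot() space the points according to elapsed time. The axis labels and point style are illustrative choices, not part of the original answer.

```r
earthquakes <- c(6.6, 6.8, 8.4)
dates <- as.Date(c("13/02/2001", "28/02/2001", "23/06/2001"), format = "%d/%m/%Y")

# Gaps between consecutive events, in days (uneven: 15, then 115)
gaps <- as.numeric(diff(dates))
print(gaps)

# Scatter plot with dates on the x-axis and magnitudes on the y-axis
plot(dates, earthquakes, xlab = "Date", ylab = "Magnitude", pch = 19)
```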

Related

R date variable operation

I have a string/character variable that contains a calendar date, e.g.,
x <- "2018-10-31"
I also have a variable y that contains a length of time, say 200 days.
y <- 200
How do I find the calendar date for x + y?
I am not familiar with the date type in R and am struggling with how to approach this.
An add-on question: would this calculation be different if y = 4.3 months? Of course I can convert this into days, but I wonder if there is a more direct way to handle months without converting.
You could utilise the lubridate package, which is specifically designed for handling date time data.
library(lubridate)
x <- ymd("2018-10-31")
x + days(200)
[1] "2019-05-19"
lubridate works with 'period' objects, which require integers, so you would need to convert "4.3" months into something interpretable beforehand. "4.3" doesn't mean anything concrete in terms of date-time calculation anyways.
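For comparison, a base R sketch of the same day arithmetic, with no extra package. The handling of 4.3 months rests on a loudly stated assumption: an average month of 365.25 / 12 days. That is only one possible convention, and the names days_per_month, x_plus_days and x_plus_months are introduced here for illustration.

```r
x <- as.Date("2018-10-31")   # ISO format, so no format string is needed
y <- 200

# Adding a number to a Date adds that many days
x_plus_days <- x + y
print(x_plus_days)           # 2019-05-19, same as the lubridate result

# Assumption: one "month" is 365.25 / 12 days on average
days_per_month <- 365.25 / 12
x_plus_months <- x + round(4.3 * days_per_month)
print(x_plus_months)
```

A different convention (for example, 30-day months) would give a slightly different date, which is why a fractional month count has to be pinned down before any calculation.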

Weekly time series plot in R

I am trying to create a plot of weekly data. Though this is not the exact problem I am having, it illustrates it well. Imagine you want to plot 1, 2, ..., 7 for 7 weeks from Jan 1 2015. My plot should just be a line that trends upward, but instead I get 7 different lines. I tried the code below (and some others, to no avail). Help would be greatly appreciated.
startDate = "2015-01-01"
endDate = "2015-02-19"
y=c(1,2,3,4,5,6,7)
tsy=ts(y,start=as.Date(startDate),end=as.Date(endDate))
plot(tsy)
You are plotting both the time and y together as individual plots.
Instead use:
plot(y)
lines(y)
Also, create a date column from the start date and weekly spacing you gave, so that the data forms a time series. You can then put the dates on the x-axis to see how your variable changes over time.
To make your life easier, I think your first step should be to create an xts time series object (install/load the xts package); then it is a piece of cake to plot, subset or do whatever you like with the series.
Build your vector of dates as a sequence from a start date:
seq( as.Date("2011-07-01"), by=1, len=7)
and your data vector: 1:7
A one-liner builds and plots the above time series object:
plot(as.xts(1:7,order.by=seq( as.Date("2011-07-01"), by=1, len=7)))
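The answers above use daily steps and an unrelated start date. A minimal base R sketch adapted to the question's weekly data might look like the following; it assumes the seven values fall exactly one week apart starting on 2015-01-01, and the variable name weeks is hypothetical.

```r
y <- 1:7

# Seven dates, one week apart, starting on 1 Jan 2015
weeks <- seq(as.Date("2015-01-01"), by = "week", length.out = 7)

# A single upward-trending line with real dates on the x-axis
plot(weeks, y, type = "l", xlab = "Week starting", ylab = "y")
```

The same weeks vector could be passed as order.by to as.xts if the xts package is preferred.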

Plotting truncated times from zoo time series

Let's say I have a data frame with lots of values under these headers:
df <- data.frame(c("Tid", "Value"))
#Tid.format = %Y-%m-%d %H:%M
Then I turn that data frame over to zoo, because I want to handle it as a time series:
library("zoo")
df <- zoo(df$Value, df$Tid)
Now I want to produce a smooth scatter plot over which time of day each measurement was taken (i.e. discard date information and only keep time) which supposedly should be done something like this: https://stat.ethz.ch/pipermail/r-help/2009-March/191302.html
But it seems the time() function doesn't produce any time at all; instead it just produces a number sequence. Whatever I do from that link, I can't get a scatter plot of values over an average day. The data.frame code that actually does work (without using zoo time series) looks like this (i.e. extracting the hour from the time and converting it to numeric):
smoothScatter(data.frame(as.numeric(format(df$Tid,"%H")),df$Value))
Another thing I want to do is produce a density plot of how many measurements I have per hour. I have plotted on hours using a regular data.frame with no problems, so the data I have is fine. But when I try to do it using zoo then I either get errors or I get the wrong results when trying what I have found through Google.
I did manage to get something plotted through this line:
plot(density(as.numeric(trunc(time(df),"01:00:00"))))
But it is not correct. It seems again that it is just producing a sequence from 1 to 217, where I wanted it to be truncating any date information and just keep the time rounded off to hours.
I am able to plot this:
plot(density(df))
Which produces a density plot of the Values. But I want a density plot over how many values were recorded per hour of the day.
So, if someone could please help me sort this out, that would be great. In short, what I want to do is:
1) smoothScatter(x-axis: time of day (0-24), y-axis: value)
2) plot(density(x-axis: time of day (0-24)))
EDIT:
library("zoo")
df <- data.frame(Tid=strptime(c("2011-01-14 12:00:00","2011-01-31 07:00:00","2011-02-05 09:36:00","2011-02-27 10:19:00"),"%Y-%m-%d %H:%M"),Values=c(50,52,51,52))
df <- zoo(df$Values,df$Tid)
summary(df)
df.hr <- aggregate(df, trunc(df, "hours"), mean)
summary(df.hr)
png("temp.png")
plot(df.hr)
dev.off()
This code is some actual values that I have. I would have expected the plot of "df.hr" to be an hourly average, but instead I get some weird new index that is not time at all...
There are three problems with the aggregate statement in the question:
We wish to truncate the times, not df.
trunc.POSIXt unfortunately returns a POSIXlt result, so it needs to be converted back to POSIXct.
It seems you did not intend to truncate to the hour in the first place but wanted to extract the hours.
To address the first two points, the aggregate statement needs to be changed to:
tt <- as.POSIXct(trunc(time(df), "hours"))
aggregate(df, tt, mean)
but to address the last point it needs to be changed entirely to
tt <- as.POSIXlt(time(df))$hour
aggregate(df, tt, mean)
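The hour-extraction idea can also be sketched in base R without zoo, which may help when checking what the index should look like. This uses the sample values from the question; the tz = "UTC" setting and the names tid, values, hour_of_day and hourly_mean are assumptions made for the illustration.

```r
# Sample timestamps from the question; tz = "UTC" is an assumption to keep hours stable
tid <- as.POSIXct(c("2011-01-14 12:00:00", "2011-01-31 07:00:00",
                    "2011-02-05 09:36:00", "2011-02-27 10:19:00"), tz = "UTC")
values <- c(50, 52, 51, 52)

lt <- as.POSIXlt(tid)
hour_of_day <- lt$hour + lt$min / 60   # e.g. 09:36 becomes 9.6

# Mean value per whole hour, ignoring the date
hourly_mean <- tapply(values, lt$hour, mean)
print(hourly_mean)

# Density of measurements over the day
plot(density(hour_of_day), main = "Measurements by time of day")
```

Plotting values against hour_of_day (for instance with smoothScatter) would give the first plot the question asks for.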

Line up ts or zoo timeseries of different frequencies at "midperiod" on X axis

I need to plot a number of time series of different frequencies in R, and I need them to have the points centered on a period instead of starting at the beginning of each period. Here is an illustration of what I'm running into:
test1 <- ts(rnorm(24), start=2004, freq=12)
test2 <- ts(rnorm(2), start=2004, freq=1)
plot(test1, type='l')
lines(test2, col='red')
I'd like the red line to essentially be shifted forward 6 months, to the middle of each year. I've spent a little time with the R documentation for "ts" and haven't figured out how to do this -- any suggestions?
How about changing the time-series start?
test2 <- ts(rnorm(2), start=2004.5, freq=1)
I agree with @haggai_e that shifting the 'start' parameter makes sense, but if you already have a ts-object then the code to use those values would be:
lines(ts(test2, start=2004.5, freq=frequency(test2)) )
ts-objects are really just numeric vectors with attributes. You recover those attributes with start, end and frequency. The end follows from the other values: it equals start + (length - 1)/frequency.
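A short sketch that checks both points: the shifted annual series lands on mid-year time points, and the end of a series is consistent with its start, length and frequency. The seed and the name end_test1 are arbitrary additions for the illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
test1 <- ts(rnorm(24), start = 2004, frequency = 12)
test2 <- ts(rnorm(2), start = 2004.5, frequency = 1)  # annual, centred mid-year

print(time(test2))   # time points of the shifted annual series

# end = start + (length - 1) / frequency
end_test1 <- tsp(test1)[1] + (length(test1) - 1) / frequency(test1)

plot(test1, type = "l")
lines(test2, col = "red")
```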

how, in R, to plot time of day versus calendar date

I regret that I come to post here after being burned-out on hours of Internet searching regarding this simplistic question.
I have several data sets to plot in R, each consisting of two columns of data: time, date. I am using R 2.11.0 on a Windows computer, via the Rgui.
Time is "time of day" that an event is observed. As an example, it is recognized as:
Factor w/ 87 levels "5:53","5:54",..: 84 85 85 85 86 ...
Date is calendar date, recognized as:
Class 'Date' num [1:730] 13879 13880 13881 13882 13883 ...
The time values are recorded in the format of a 24-hr clock, h:mm or hh:mm. The date values are displayed yyyy-mm-dd.
I want to plot time (y-axis) vs. date (x-axis).
Using
plot(date,time)
gives an accurate-looking plot, but the y-axis is labeled as the numeric factor values (about 0 to 90), rather than the desired, temporally-ordered levels of the factor variable. The x-axis is labeled in the desired, human-readable format.
How can I correct this? Is there a "time of day" format in R that I can convert my "time" variable into? I would subsequently like to do arithmetic on the time values as well, and would not mind having to carry one column of values to use in plotting and one column of values for maths.
I ran across several examples online of manipulation of (date + time) variables in R, and converting those to different formats. I do not believe this is my problem, as I have separate fields for time and date and want to plot one against the other.
My thanks to you in advance for your suggestions, or your directions to a web-accessible resource (no appropriate libraries or bookstores at my location).
There may be an easier way to do this, but you can always label the y-axis yourself. Adjust the ticksAt vector below to find something that looks suitable for your data.
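# factor() reproduces the question's data; since R 4.0, data.frame() keeps strings as character
Data <- data.frame(date=Sys.Date()+1:10,time=factor(paste(5,41:50,sep=":")))
# plot the factor's integer codes and suppress the default y-axis
with(Data, plot(date,as.numeric(time),yaxt="n",ylab="time"))
ticksAt <- c(1,3,5,7,9)
# label the chosen codes with the matching factor levels
axis(2, at=ticksAt, labels=levels(Data$time)[ticksAt])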
?plot.zoo has some good examples of how to create pretty axis annotations, though some of them may be zoo-specific. ?par is also a good resource.
ts and timeSeries are two good choices.
Let's assume you have two vectors, one of Date class named "dt" and the other a factor named "tm":
x <- paste(as.character(dt[1:2]), as.character(tm[1:2]))
strptime(x, "%Y-%m-%d %H:%M")
## [1] "2008-01-01 05:53:00" "2008-01-02 05:54:00"
class(strptime(x, "%Y-%m-%d %H:%M"))
## [1] "POSIXt" "POSIXlt"
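If the goal is to keep date and time of day as separate axes, one possible approach is to convert the "h:mm" factor into decimal hours, which also supports arithmetic. The sketch below uses hypothetical stand-in data for dt and tm (the original columns are not shown in full), and the name hours is introduced here.

```r
# Hypothetical stand-ins for the question's columns
dt <- as.Date("2008-01-01") + 0:2
tm <- factor(c("5:53", "5:54", "17:30"))

# Split "h:mm" into hours and minutes, then combine into decimal hours
parts <- strsplit(as.character(tm), ":", fixed = TRUE)
hours <- vapply(parts, function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60, numeric(1))

# Plot time of day against date, with a 0-24 hour axis
plot(dt, hours, ylim = c(0, 24), xlab = "Date", ylab = "Hour of day")
```

Converting through as.character first matters: calling as.numeric directly on the factor would return its internal level codes, which is the source of the 0 to 90 axis described in the question.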
