How to extract time from POSIXct and plot? - r

I have a data frame that looks like the following, with the first column ("sample_time") of class POSIXct and the second column "latitude":
> head(b)
sample_time latitude
3813442 2015-05-21 19:02:41 39.92770
3813483 2015-05-21 19:03:16 39.92770
3813485 2015-05-21 19:14:30 39.92433
3813515 2015-05-21 19:14:59 39.92469
3813550 2015-05-21 19:15:30 39.92520
3813585 2015-05-21 19:16:00 39.92585
Now, I want to plot latitude vs. sample_time, with the x axis representing the 24 hours of a single day, and with latitude grouped by day.
Any help will be appreciated! Many thanks.

First, you need to define "day", as opposed to the full time. Then you need to figure out what you mean by "group" ... let's just say you want to aggregate and take the daily mean. Third, you need to make the plot.
b$day <- as.Date(format(b[,"sample_time"], "%Y-%m-%d"))  # calendar day, dropping the time of day
b_agg <- aggregate(list(latitude=b[,"latitude"]), by=list(day=b[,"day"]), FUN=mean)  # daily mean latitude
plot(b_agg)
Edit:
Just an additional thought: if you didn't want to aggregate, you could skip the second step and change the third to plot(b[,"day"], b[,"latitude"]). Alternatively, you may even want something like boxplot(latitude~day, data=b).
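To get the plot the question describes (time of day on the x axis, one group per day), one option is to split each timestamp into a calendar day and a decimal hour. The following is a minimal sketch rather than the answerer's code: the hour column is a name introduced here, the rows for the second day are made up for illustration, and tz = "UTC" is an assumption.

```r
# Toy data in the shape of the question; the 2015-05-22 rows are invented
b <- data.frame(
  sample_time = as.POSIXct(c("2015-05-21 19:02:41", "2015-05-21 19:14:30",
                             "2015-05-22 08:15:00", "2015-05-22 19:16:00"),
                           tz = "UTC"),
  latitude = c(39.92770, 39.92433, 39.92520, 39.92585)
)

# Calendar day as a grouping factor
b$day <- factor(format(b$sample_time, "%Y-%m-%d"))

# Time of day as decimal hours in [0, 24)
b$hour <- as.numeric(format(b$sample_time, "%H")) +
  as.numeric(format(b$sample_time, "%M")) / 60 +
  as.numeric(format(b$sample_time, "%S")) / 3600

# One colour per day, x axis fixed to a 24-hour range
plot(b$hour, b$latitude, xlim = c(0, 24), pch = 19,
     col = as.integer(b$day), xlab = "hour of day", ylab = "latitude")
legend("topleft", legend = levels(b$day),
       col = seq_along(levels(b$day)), pch = 19)
```

Because every day shares the same 0 to 24 axis, points from different days overlay each other and can be compared directly.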

Related

converting bi-monthly Julian days to date in raster image

I have a few raster images that represent bi-monthly data. I want to convert the bi-monthly data into monthly data by averaging the two images in each month.
There are 23 images in total (each a single band, i.e. a single layer). If I stack the images using stack() on the output of list.files, for some reason it reads 46 layers, but if I open each image individually with the raster function and then stack them, it reads only 23 layers.
I opened the images individually and then stacked them, but when I convert the bi-monthly Julian days to dates, they are not read correctly after the 4th month.
library(raster)
setwd("F:/LANDSAT-NDVI/testAverage")
x1<-raster("landsatNDVISC05SLC2000001.tif")
x2<-raster("landsatNDVISC05SLC2000017.tif")
x3<-raster("landsatNDVISC05SLC2000033.tif")
x4<-raster("landsatNDVISC05SLC2000049.tif")
x5<-raster("landsatNDVISC05SLC2000065.tif")
x6<-raster("landsatNDVISC05SLC2000081.tif")
x7<-raster("landsatNDVISC05SLC2000097.tif")
x8<-raster("landsatNDVISC05SLC2000113.tif")
x9<-raster("landsatNDVISC05SLC2000129.tif")
x10<-raster("landsatNDVISC05SLC2000145.tif")
x11<-raster("landsatNDVISC05SLC2000161.tif")
x12<-raster("landsatNDVISC05SLC2000177.tif")
x13<-raster("landsatNDVISC05SLC2000193.tif")
x14<-raster("landsatNDVISC05SLC2000209.tif")
x15<-raster("landsatNDVISC05SLC2000225.tif")
x16<-raster("landsatNDVISC05SLC2000241.tif")
x17<-raster("landsatNDVISC05SLC2000257.tif")
x18<-raster("landsatNDVISC05SLC2000273.tif")
x19<-raster("landsatNDVISC05SLC2000289.tif")
x20<-raster("landsatNDVISC05SLC2000305.tif")
x21<-raster("landsatNDVISC05SLC2000321.tif")
x22<-raster("landsatNDVISC05SLC2000337.tif")
x23<-raster("landsatNDVISC05SLC2000353.tif")
data <- stack(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23)
julday <-c("landsatNDVISC05SLC2000001.tif","landsatNDVISC05SLC2000017.tif","landsatNDVISC05SLC2000033.tif",
"landsatNDVISC05SLC2000049.tif","landsatNDVISC05SLC2000065.tif","landsatNDVISC05SLC2000081.tif",
"landsatNDVISC05SLC2000097.tif","landsatNDVISC05SLC2000113.tif","landsatNDVISC05SLC2000129.tif",
"landsatNDVISC05SLC2000145.tif","landsatNDVISC05SLC2000161.tif","landsatNDVISC05SLC2000177.tif",
"landsatNDVISC05SLC2000193.tif","landsatNDVISC05SLC2000209.tif","landsatNDVISC05SLC2000225.tif",
"landsatNDVISC05SLC2000241.tif","landsatNDVISC05SLC2000257.tif","landsatNDVISC05SLC2000273.tif",
"landsatNDVISC05SLC2000289.tif","landsatNDVISC05SLC2000305.tif","landsatNDVISC05SLC2000321.tif",
"landsatNDVISC05SLC2000337.tif","landsatNDVISC05SLC2000353.tif")
julday <- as.numeric(substr(julday, 24,26)) #24 to 26th digit in the file name represents Julian days#
dates <- as.Date(julday, origin=as.Date("2000-01-01"))
combinddat <- setZ(data, dates)
monthly <- zApply(combinddat, by = format(dates,"%Y-%m"), fun = mean, na.rm = T)
The dates produced this way are wrong; the result is as follows:
> dates
[1] "2000-01-02" "2000-01-18" "2000-02-03" "2000-02-19" "2000-03-06"
[6] "2000-03-22" "2000-04-07" "2000-01-14" "2000-01-30" "2000-02-15"
[11] "2000-03-02" "2000-03-18" "2000-04-03" "2000-01-10" "2000-01-26"
[16] "2000-02-11" "2000-02-27" "2000-03-14" "2000-03-30" "2000-01-06"
[21] "2000-01-22" "2000-02-07" "2000-02-23"
But I want the dates to span all 12 months, based on my Julian days.
This does not answer your question but you probably could improve your code a lot with:
setwd("F:/LANDSAT-NDVI/testAverage")
library(raster)
f <- list.files(pattern="\\.tif$")
f <- sort(f)
data <- stack(f)
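As for the wrong dates themselves, a likely cause, judging only from the file names shown in the question: the three-digit day of year occupies characters 23 to 25, so substr(julday, 24, 26) picks up the last two digits plus the dot (e.g. "13." for day 113, which matches the "2000-01-14" in the output). In addition, adding day 1 to an origin of 2000-01-01 gives January 2, not January 1. A sketch that instead parses the year and day of year together with the %j format, assuming every file name ends in seven digits followed by ".tif":

```r
# Three of the file names from the question
f <- c("landsatNDVISC05SLC2000001.tif",
       "landsatNDVISC05SLC2000113.tif",
       "landsatNDVISC05SLC2000353.tif")

# Keep the 7 digits (YYYYDDD) that precede ".tif"
yd <- sub("^.*([0-9]{7})\\.tif$", "\\1", f)

# %Y = four-digit year, %j = day of year (001-366)
dates <- as.Date(yd, format = "%Y%j")

dates
format(dates, "%Y-%m")  # month labels that could be used for grouping
```

The resulting dates vector could then be passed to setZ() in place of the one built from substr().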

Select a value from time series by date in R

How do I select the value of a time series that corresponds to a given date?
I create a monthly time series object with the command:
producers.price <- ts(producers.price, start=2012+0/12, frequency=12)
Then I try the following:
value <- producers.price[as.Date("01.2015", "%m.%Y")]
But this doesn't do what I want, and value is equal to
[1] NA
instead of 10396.8212805739, where producers.price is:
producers.price <- structure(c(7481.52109434237, 6393.18959031561, 6416.63065650718,
5672.08354710121, 7606.24186413516, 5201.59247092013, 6488.18361474813,
8376.39182893415, 9199.50916585545, 8261.87133079494, 8293.8195347453,
8233.13630279516, 7883.17272003961, 7537.21001580393, 6566.60260432381,
7119.99345843556, 8086.40101607729, 9125.11104610046, 10134.0228610828,
10834.5732454454, 9410.35031874371, 9559.36933274129, 9952.38679679724,
10390.3628690951, 11134.8432864557, 11652.0075507499, 12626.9616107684,
12140.6698452193, 11336.8315981684, 10526.0309052316, 10632.1492109584,
8341.26367412737, 9338.95688558448, 9732.80173656971, 10724.5525831506,
11272.2273444623, 10396.8212805739, 10626.8428853062, 11701.0802817581,
NA), .Tsp = c(2012, 2015.25, 12), class = "ts")
So, I had/have a similar problem and was looking all over to solve it. My solution is not as great as I'd have wanted it to be, but it works. I tried it out with your data and it seems to give the right result.
Explanation
It turns out that in R a time series is really stored as a plain sequence indexed from 1, not by its time values. E.g., if you have a time series that starts in 1950 and ends in 1960 with one observation per year, the value for 1950 is ts[1] and the value for 1960 is ts[11].
Based on this logic, you need to subtract the start of the series from the date you want, convert the difference to a number of periods, and add 1 to get the index of the value at that point.
This code in R gives you the result you expect.
producers.price[((as.yearmon("2015-01")- as.yearmon("2012-01"))*12)+1]
If you need help with the time calculations, check this answer: Get the difference between dates in terms of weeks, months, quarters, and years
You will need the zoo and lubridate packages.
Hope it helps :)
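The index arithmetic described in this answer can also be checked without extra packages. A minimal sketch on a toy monthly series (x, idx, and the target date are illustrative choices, not part of the original answer):

```r
# Toy monthly series starting January 2012; the values are just 1..40
x <- ts(1:40, start = c(2012, 1), frequency = 12)

# Target: January 2015
target_year  <- 2015
target_month <- 1

# Whole months elapsed since the start of the series, plus 1 for 1-based indexing
idx <- (target_year - 2012) * 12 + (target_month - 1) + 1

x[idx]        # value stored at that position
time(x)[idx]  # the time stamp at that position, to confirm it is 2015.0
```

Working in whole months keeps the index an exact integer, which avoids depending on floating-point differences between fractional years.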
1) window.ts
The window.ts function is used to subset a "ts" time series by a time window. The window command produces a time series with one data point and the [[1]] makes it a straight numeric value:
window(producers.price, start = 2015 + 0/12, end = 2015 + 0/12)[[1]]
## [1] 10396.82
2) zoo
Alternatively, we can convert it to zoo and subscript it by a yearmon object, then use [[1]] or coredata to convert the result to a plain number; or we can use window.zoo much as we did with window.ts:
library(zoo)
as.zoo(producers.price)[as.yearmon("2015-01")][[1]]
## [1] 10396.82
coredata(as.zoo(producers.price)[as.yearmon("2015-01")])
## [1] 10396.82
window(as.zoo(producers.price), 2015 + 0/12 )[[1]]
## [1] 10396.82
coredata(window(as.zoo(producers.price), 2015 + 0/12 ))
## [1] 10396.82
3) xts
The four lines in (2) also work if library(zoo) is replaced with library(xts) and as.zoo is replaced with as.xts.
Looking for a simple command, one line and no library needed?
You might try this:
as.numeric(window(producers.price, start = c(2015, 1), end = c(2015, 1)))

Sample exactly four maintaining almost equal sample distances

I am trying to generate appointment times for yearly scheduled visits. The available days are days=1:365, and the first appointment should be chosen at random: first=sample(days,1,replace=F)
Given the first appointment, I want to generate 3 more appointments within 1:365, so that there are exactly 4 appointments in the year, spaced as equally as possible.
I have tried
point<-sort(c(first-1:5*364/4,first+1:5*364/4 ));point<-point[point>0 & point<365]
but it does not always give me 4 appointments. I eventually ran this many times and kept only the samples with 4 appointments, but I wanted to ask whether there is a more elegant way to get exactly 4 points that are as equally spaced as possible.
I was thinking of equal spacing (around 91 days between appointments) in a year starting at the first appointment... Essentially one appointment per quarter of the year.
# Find how many days in a quarter of the year
quarter = floor(365/4)
first = sample(days, 1)
all = c(first, first + (1:3)*quarter)
all[all > 365] = all[all > 365] - 365
all
sort(all)
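The same wrap-around idea can be written with modular arithmetic, which guarantees exactly four days in 1:365 for any starting day. This is a sketch only; quarterly_visits and its arguments are hypothetical names introduced here:

```r
# Hypothetical helper: k visits starting at 'first', spaced floor(n_days / k) apart,
# wrapping past the end of the year back to day 1
quarterly_visits <- function(first, n_days = 365, k = 4) {
  step <- floor(n_days / k)                        # 91 days for a 365-day year
  sort(((first - 1 + (0:(k - 1)) * step) %% n_days) + 1)
}

first <- sample(1:365, 1)
visits <- quarterly_visits(first)
visits
diff(visits)  # gaps between consecutive visits within the year
```

Since 365 is not divisible by 4, one of the four gaps around the year is 92 days and the others are 91.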
Is this what you're looking for?
set.seed(1) # for reproducible example ONLY - you need to take this out.
first <- sample(1:365,1)
points <- c(first+(0:3)*(365-first)/4)
points
# [1] 97 164 231 298
Another way uses
points <- c(first+(0:3)*(365-first)/3)
This creates 4 points equally spaced on [first, 365], but the last point will always be 365.
The reason your code gives unexpected results is that you use first-1:5*364/4. This creates points prior to first, some of which can be < 0. Then you exclude those with points[points>0...].

Using abline() when x-axis is date (ie, time-series data)

I want to add multiple vertical lines to a plot.
Normally you would specify abline(v=x-intercept) but my x-axis is in the form Jan-95 - Dec-09. How would I adapt the abline code to add a vertical line for example in Feb-95?
I have tried abline(v=as.Date("Jan-95")) and other variants of this piece of code.
Following on from this, is it possible to add multiple vertical lines with one piece of code, for example at Feb-95, Feb-97 and Jan-98?
An alternative solution could be to alter my plot. I have a column with the month and a column with the year; how do I combine these to get a year-month on the x-axis?
example[25:30,]
Year Month YRM TBC
25 1997 1 Jan-97 136
26 1997 2 Feb-97 157
27 1997 3 Mar-97 163
28 1997 4 Apr-97 152
29 1997 5 May-97 151
30 1997 6 Jun-97 170
The first note: your YRM column is probably a factor, not a datetime object, unless you converted it manually. I assume we do not want to do that and our plot is looking fine with YRM as a factor.
In that case
vline_month <- function(s) abline(v=which(levels(df$YRM) %in% s))
# keep original order of levels
df$YRM <- factor(df$YRM, levels=unique(df$YRM))
plot(df$YRM, df$TBC)
vline_month(c("Jan-97", "Apr-97"))
Disclaimer: this solution is a quick hack; it is neither universal nor scalable. For accurate representation of datetime objects and extensible tools for them, see packages zoo and xts.
I see two issues:
a) converting your data to a date/POSIX element, and
b) actually plotting vertical lines at specific rows.
For the first, create a proper date string then use strptime().
The second issue is resolved by converting the POSIX date to numeric using as.numeric().
# dates need Y-M-D
example$ymd <- paste(example$Year, '-', example$Month, '-01', sep='')
# convert to POSIX date
example$ymdPX <- strptime(example$ymd, format='%Y-%m-%d')
# may want to define tz otherwise system tz is used
# plot your data
plot(example$ymdPX, example$TBC, type='b')
# add vertical lines at first and last record
abline(v=as.numeric(example$ymdPX[1]), lwd=2, col='red')
abline(v=as.numeric(example$ymdPX[nrow(example)]), lwd=2, col='red')
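Since abline() accepts a vector for v, several vertical lines can be drawn in one call, which covers the second part of the question. A minimal sketch using the Date class instead of POSIX times; the date column and the marks vector are names introduced here, and the data frame is rebuilt from the six rows shown in the question:

```r
# The six rows shown in the question
example <- data.frame(Year = 1997, Month = 1:6,
                      TBC = c(136, 157, 163, 152, 151, 170))

# Combine year and month into a Date on the first of each month
example$date <- as.Date(paste(example$Year, example$Month, "01", sep = "-"))

# Months to mark with vertical lines
marks <- as.Date(c("1997-02-01", "1997-04-01"))

plot(example$date, example$TBC, type = "b", xlab = "", ylab = "TBC")
# v takes a vector, so one call draws every line
abline(v = as.numeric(marks), col = "red", lty = 2)
```

With the full data set the same call would work for dates such as Feb-95, Feb-97 and Jan-98, provided they fall within the plotted range.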

Loop to create series of graphs from different files

I am trying to plot histograms with long term (several years) mean precipitation (pp) for each day of the month from a series of files. Each file has data collected from a different place (and has a different code). Each of my files looks like this:
X code year month day pp
1 2867 1945 1 1 0.0
2 2867 1945 1 2 0.0
...
And I am using the following code:
files <- list.files(pattern=".csv")
par(mfrow=c(4,6))
for (i in 1:24) {
obs <- read.table(files[i],sep=",", header=TRUE)
media.dia <- ddply(obs, .(day), summarise, daily.mean<-mean(pp))
codigo <- unique(obs$code)
hist(daily.mean, main=c("hist per day of month", codigo))
}
I get 24 histograms with 24 different codes in the title, but instead of 24 DIFFERENT histograms from 24 different locations, I get the same histogram 24 times (with 24 different titles). Can anybody tell me why? Thanks!
There are at least two errors I can see in your code:
1) There is an error in your ddply statement.
2) You are passing the wrong variable to hist, thus plotting something that may or may not exist depending on previous session actions.
The problem in your ddply statement is that you are doing an invalid assignment (using <-). Fix this by using =:
media.dia<- ddply(obs, .(day),summarise, daily.mean = mean(pp))
Then edit your hist statement:
hist(media.dia$daily.mean,main=c("hist per day of month",codigo))
I suspect the problem is that you are not passing the correct parameter to hist. The reason that your code actually produces a plot at all, is because in some previous step in your session you must have created a variable called daily.mean (as Brandon points out in the comment.)
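For comparison, here is a sketch of the corrected loop written with base R's aggregate() swapped in for ddply, so it runs without plyr. The two simulated stations stand in for the CSV files, and the second station code and the precipitation values are made up for illustration:

```r
set.seed(42)  # only to make the simulated data reproducible

# Simulated stand-ins for two station files: 3 years of 31 days each
stations <- lapply(c(2867, 2868), function(code) {
  data.frame(code = code, day = rep(1:31, times = 3), pp = rexp(93))
})

par(mfrow = c(1, 2))
for (obs in stations) {
  # Mean precipitation per day of month, recomputed for each station
  media.dia <- aggregate(pp ~ day, data = obs, FUN = mean)
  names(media.dia)[2] <- "daily.mean"
  codigo <- unique(obs$code)
  # Plot the column of the per-station result, not a global variable
  hist(media.dia$daily.mean,
       main = paste("hist per day of month", codigo))
}
```

The key point is the same as in the answer: hist() reads the daily.mean column from the data frame computed inside the loop, so each panel reflects its own station.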
I think the daily.mean calculated in the ddply function is assigned in a separate environment, and does not exist in an environment hist can see.
Try daily.mean<<-mean(pp)
