Convert timestamp column in dataframe to day of year in R - r

I have a dataframe in R where one of the columns is a timestamp such as 2014-03-08 00:20:34
I would like to convert all of these timestamps in the dataframe to the day of the year. For example, March 29, 2014 would be 88.
This will do the conversion if 'df' is my dataframe and 'updated' is my timestamp column:
strftime(df$updated, format = "%j")
How can I apply this across the entire dataframe?
Thanks,

as.numeric(as.Date("2014-03-08 00:20:34")-as.Date("2014-01-01"))+1
#[1] 67
So assign the result to whatever column name you want, passing df$updated in place of the character value. The other way is with POSIXlt objects, which are really multi-element named lists:
as.POSIXlt("2014-03-08 00:20:34")$yday
#[1] 66
as.POSIXlt("2014-03-08 00:20:34")$yday + 1
#[1] 67   (add 1 because `yday` is zero-based, unlike most of R)
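Since both approaches are vectorised, they can be applied to the whole column in one assignment. The following is a minimal sketch; the data frame contents and the new column names `doy` and `doy_j` are made up for illustration, and `tz = "UTC"` is an assumption about how the timestamps should be read.

```r
# Hypothetical data frame standing in for the asker's `df`
df <- data.frame(
  id = 1:3,
  updated = c("2014-03-08 00:20:34", "2014-03-29 12:00:00", "2014-12-31 23:59:59"),
  stringsAsFactors = FALSE
)

# Parse the whole column once (the time zone is an assumption)
parsed <- as.POSIXlt(df$updated, tz = "UTC")

# yday is zero-based, so add 1 to get the conventional day of year
df$doy <- parsed$yday + 1L

# Same result via strftime; "%j" returns a zero-padded string, so convert it
df$doy_j <- as.integer(strftime(parsed, format = "%j"))

df$doy
# [1]  67  88 365
```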

Related

making 2 digit month and day columns in R

I'm very new to R, so this might seem straightforward. But I have a data frame with an original date column that has values that look like this: 4-02-91, 5-29-93 (i.e. m-d-y). I am trying to separate this column into 3, where months, days, and years are separate. Then I need to combine them again to this format 19910402, 19930529 - I need it this way in order to compare it to another dataset with similar dates.
Here is what I've been trying to do:
# Make DATE an actual date column ("%y" because the years have two digits)
dataframe$DATE <- as.Date(dataframe$DATE, format="%m-%d-%y")
# This changes the original date column into something that looks like this: 1991-04-02, 1993-05-29
# Separate DATE into multiple columns (year(), month() and day() come from lubridate)
library(lubridate)
dataframe$year <- year(dataframe$DATE)
dataframe$month <- month(dataframe$DATE)
dataframe$day <- day(dataframe$DATE)
# Combine dates again to get string
dataframe$raster_date<-paste(dataframe$year, dataframe$month, dataframe$day, sep = "")
The last step looks great except where the months or days are single digits. It's coming out as 199142 and 1993529 instead of 19910402 and 19930529. How do I insert zeros when the month and day values are 1 digit?
Here we can use sprintf instead of paste. The year, month and day functions from lubridate return numeric values, and numbers carry no leading zeros, so paste drops the padding. The %02d specifiers in sprintf add the zeros back:
sprintf("%04d%02d%02d", dataframe$year, dataframe$month, dataframe$day)
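If the separate year, month and day columns are not needed for anything else, the same string can be produced directly from the Date column with format(). This is a sketch on two made-up input values; `raw` and `dates` are illustrative names, not the asker's.

```r
# Hypothetical sample of the asker's m-d-y strings
raw <- c("4-02-91", "5-29-93")

# "%y" reads a two-digit year; values 69-99 are mapped to 19xx
dates <- as.Date(raw, format = "%m-%d-%y")

# format() writes zero-padded month and day, so no manual padding is needed
raster_date <- format(dates, "%Y%m%d")
raster_date
# [1] "19910402" "19930529"
```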

Subset a dataframe based on numerical values of a string inside a variable

I have a data frame which is a time series of meteorological measurement with monthly resolution from 1961 till 2018. I am interested in the variable that measures the monthly average temperature since I need the multi-annual average temperature for the summers.
To do this I must filter on the fifth and sixth digits of the "DateVariable" column, which are the month.
The values in the time column are formatted like this:
"19610701". So I need the 07 (July) after 1961.
I started by coding this for a single month for other purposes, so I have not tried anything worth mentioning. I guess that grepl could do the work, but I do not know how the "matching" operator works.
So I started with this code, which works.
summersmonth <- Df[DateVariable %like% "19610101" | DateVariable %like% "19610201"]
I am expecting code like this
summermonths <- Df[DateVariable %like% "**06**" | DateVariable %like% "**07**" ..]
So that all entries with month digit from 06 to 09 are saved in the new dataframe summermonths.
Thanks in advance for any reply or feedback regarding my question.
Update
Thanks to your answers I got the first part, which is to convert the variable to a date and extract the month name from it (class character).
Now I need to select the months from June to September.
A horrible way to get the result I wanted is to do several subsets and an rbind afterward.
Sommer1<-subset(Df, MonthVar == "Mai")
Sommer2<-subset(Df, MonthVar == "Juli")
Sommer3<-subset(Df, MonthVar == "September")
SummerTotal<-rbind(Sommer1,Sommer2,Sommer3)
I would be very glad to see this written in a tidy way.
Update 2 - Solution
Here is the tidy way, as described in "Using multiple criteria in subset function and logical operators":
Veg_Seas<-subset(Df, subset = MonthVar %in% c("Mai","Juni","Juli","August","September"))
You can convert your date variable to Date class and take the month (month() here comes from the lubridate package):
allmonths <- month(as.Date(Df$DateVariable, format="%Y%m%d"))
Note that if your column was originally imported as a factor, you need to convert it to character first:
allmonths <- month(as.Date(as.character(Df$DateVariable), format="%Y%m%d"))
Then you can check whether it is a summermonth:
summersmonth <- Df[allmonths %in% 6:9, ]
Example:
as.Date("20190702", format="%Y%m%d")
[1] "2019-07-02"
month(as.Date("20190702", format="%Y%m%d"))
[1] 7
We can use anydate from anytime to convert to Date class and then extract the month
library(anytime)
month(anydate(as.character(Df$DateVariable)))
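Since the question asks specifically about the fifth and sixth digits, a base R route without any date conversion is also possible. This is only a sketch: the `Df` below is invented sample data, and it assumes every value is an eight-character "yyyymmdd" string.

```r
# Hypothetical stand-in for the asker's data
Df <- data.frame(
  DateVariable = c("19610101", "19610601", "19610701", "19610901", "19611001"),
  Temp = c(-1.2, 16.4, 18.1, 13.9, 9.0),
  stringsAsFactors = FALSE
)

# Characters 5 and 6 of "yyyymmdd" hold the month
mon <- as.integer(substr(as.character(Df$DateVariable), 5, 6))

# Keep the rows whose month is June through September
summermonths <- Df[mon %in% 6:9, ]
summermonths$DateVariable
# [1] "19610601" "19610701" "19610901"
```

The as.character() call makes the same code work if the column was imported as a factor or as a number.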

How do i subset a zoo object based on values for a particular month [duplicate]

This question already has an answer here:
Subsetting winter (Dez, Jan, Feb) from daily time series (zoo)
(1 answer)
Closed 7 years ago.
My zoo object
I would like to create a subset of values (these are discharge values) containing only December flow values.
Thank you!
We can extract the 'months' from the index with format, get a logical index by comparing with 'Dec', and use that to subset the zoo object.
z1[format(index(z1), '%b')=='Dec']
#1938-12-03 1938-12-10 1938-12-17 1938-12-24 1938-12-31
# 49 50 51 52 53
If we convert to xts object, .indexmon from the xts package can be also used. The .indexmon starts from 0, so December is 11.
library(xts)
z1[.indexmon(as.xts(z1))==11]
Other options from the comments are using grep on the index to get the numeric index and subset (from @Pierre Lafortune)
z1[grep("-12-",index(z1))]
Or with the subset/months option (from @G. Grothendieck)
subset(z1, months(time(z1)) == "December")
data
library(zoo)
z1 <- zoo(1:100, order.by = seq(as.Date('1938-01-01'),
length.out=100, by = '1 week'))
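Note that comparing against 'Dec' or "December" depends on the session locale, because month names are translated. A locale-independent variant compares the month number instead. The sketch below uses only base R on the same weekly dates as the index of `z1`; `idx` and `is_dec` are illustrative names.

```r
# Same dates as the index of z1 in the data section
idx <- seq(as.Date("1938-01-01"), length.out = 100, by = "1 week")

# "%m" gives the month as a two-digit number regardless of locale
is_dec <- format(idx, "%m") == "12"

which(is_dec)
# [1] 49 50 51 52 53
```

With the zoo object itself, the equivalent would be z1[format(index(z1), "%m") == "12"].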

How to determine the correct argument for origin in as.Date, R

I have a data set in R that contains a column of dates in the format yyyy/mm/dd. I am trying to use as.Date to convert these dates to date objects in R. However, I cannot seem to find the correct argument for origin to input into as.Date. The following code is an example of what I have been trying. I am using a CSV file from Excel, so I used origin="1899/12/30" based on other sites I have looked at.
> as.Date(2001/04/26, origin="1899/12/30")
[1] "1900-01-18"
However, this is not working since the input date 2001/04/26 is returned as "1900-01-18". I need to convert the dates into date objects so I can then convert the dates into julian dates.
You can use as.Date either with a numeric value or with a character value. When you type just 2001/04/26 into R, that's doing division and getting 19.24 (a numeric value). Numeric values require an origin, and the number you supply is the offset in days from that origin. So you're getting 19 days away from your origin, i.e. "1900-01-18". A date like Apr 26 2001 would be
as.Date(37007, origin="1899-12-30")
# [1] "2001-04-26"
If your dates from Excel "look like" dates, chances are they are character values (or factors). To convert a character value to a Date with as.Date() you want to specify a format. Here
as.Date("2001/04/26", format="%Y/%m/%d")
# [1] "2001-04-26"
See ?strptime for details on the special % variables. Now if you've read your data into a data.frame with read.table or something similar, there's a chance your variable may be a factor. If that's the case, you'll want to convert to character with
as.Date(as.character(mydf$datecol), format="%Y/%m/%d")
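Once the column is a Date, the "julian" value the question mentions can be derived from it. What that means varies, so the sketch below shows two common readings: day of year, and days since R's own origin of 1970-01-01. The vector `datecol` is made-up sample data.

```r
# Hypothetical column as it might arrive from a CSV export
datecol <- c("2001/04/26", "2001/01/01", "2000/12/31")

# Character input needs a format, not an origin
d <- as.Date(datecol, format = "%Y/%m/%d")

# Reading 1: day of the year (1-366)
doy <- as.integer(format(d, "%j"))
doy
# [1] 116   1 366

# Reading 2: days elapsed since 1970-01-01, R's internal Date origin
days_since_epoch <- as.numeric(d)
days_since_epoch
# [1] 11438 11323 11322
```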

Converting hour, minutes columns in dataframe to time format

I have a dataframe containing 3 columns including minutes and hours.
I want to convert these columns (namely minutes and hours) to time format. Given the data in drame:
Score Hour Min
10 10 56
23 17 01
I would like to get:
Score Time
10 10:56:00
23 17:01:00
You could use ISOdatetime to convert the numbers in the hour and min columns to a POSIXct object. However, a POSIXct object is only defined when it also includes a year, month and day. So depending on your needs you can do several things:
If you need a real time object, which is printed correctly in graphs, for example, and can be used in arithmetic (addition, subtraction), you need to use ISOdatetime. ISOdatetime returns a so-called POSIXct object, which is an R object that represents time. In ISOdatetime you then just use fixed values for year, month, and day. This of course only works if your dataset does not span multiple years.
If you just need a character column Time, you can convert the POSIXct output to a string using strftime, setting the format argument to "%H:%M:00". In this case, however, you could also use sprintf to create the new character column without converting to POSIXct: sprintf("%02d:%02d:00", drame$Hour, drame$Min). The %02d zero-pads each number, so a minute of 1 prints as 01; it assumes Hour and Min hold whole numbers.
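Both options can be sketched on the example data as follows. The placeholder date 2000-01-01 and `tz = "UTC"` are arbitrary assumptions, and `TimeStr` is an illustrative column name.

```r
# The example data from the question
drame <- data.frame(Score = c(10, 23), Hour = c(10, 17), Min = c(56, 1))

# Option 1: a real time object; a fixed placeholder date makes POSIXct defined
drame$Time <- ISOdatetime(2000, 1, 1, drame$Hour, drame$Min, 0, tz = "UTC")

# Option 2: a character column, formatted from the POSIXct values
drame$TimeStr <- format(drame$Time, "%H:%M:%S")
drame$TimeStr
# [1] "10:56:00" "17:01:00"
```

Arithmetic works on the POSIXct column, for example drame$Time + 60 shifts every time by one minute.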
You can use paste() function to merge the two column data into a char and then use strptime() to convert to timestamp
x<-1:6
##1 2 3 4 5 6
y<-8:13
## 8 9 10 11 12 13
timestamp <- paste(x,":",y,":00",sep="")
timestamp
will result in
#1:8:00 2:9:00 3:10:00 4:11:00 5:12:00 6:13:00
If you prefer to convert this to a timestamp object, try using
strptime(timestamp,"%H:%M:%S")
## uses current date by default
If you happen to have the date in another column, use paste() to make a character-formatted date-time (called mergedData here) and use the call below to get the date-time
##strptime(mergedData,"%d/%m/%Y %H:%M:%S")
