Converting variables to dates in R

There is a matrix with two columns: years and months
dates.m
1492 April
1492 August
1492 October
How can I convert these two variables into a single date-format variable (for example mm/yyyy)? Thanks.

How about this:
dates.m<-data.frame(dates.m,stringsAsFactors=F)
dfl=split(dates.m,1:nrow(dates.m))
dates.m$data=do.call(rbind,lapply(dfl,function(rn)
paste(as.Date(paste(paste(rn,collapse = "/"),"01",sep="/"),"%Y/%b/%d"),sep="")))
dates.m$data
[,1]
1 "1492-04-01"
2 "1492-08-01"
3 "1492-10-01"
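A vectorized alternative in base R, offered as a sketch rather than the answerer's method: it assumes the matrix columns are named "year" and "month" (names not given in the question) and that the months are spelled as full English names. Matching against the built-in month.name constant avoids any dependence on the session locale.

```r
# Hypothetical reconstruction of the matrix from the question;
# the column names "year" and "month" are assumptions.
dates.m <- cbind(year  = c("1492", "1492", "1492"),
                 month = c("April", "August", "October"))

# Look up each month's number in the built-in English month names
mon <- match(dates.m[, "month"], month.name)

# Build ISO strings such as "1492-04-01" and parse them in one call
d <- as.Date(sprintf("%s-%02d-01", dates.m[, "year"], mon))

# Display as mm/yyyy; note that format() returns character, not Date
mm_yyyy <- format(d, "%m/%Y")
```

Keeping d as a Date column and calling format() only for display preserves the ability to sort and do date arithmetic.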

When it comes to dates, I love working with the lubridate package. Here is example code to solve this if you have one column containing the date (assuming it is ordered year-month-day; otherwise change the order of the letters in the function name):
require(lubridate)
df$date<-ymd(df$dates.m)
Or, if you have them in separate columns:
require(lubridate)
require(stringr)
df$date<-ymd(str_c(as.character(df$Year),as.character(df$Month),
as.character(df$Day),sep="-"))

Related

Converting monthly numerics to readable dates in R

How can I set R to count months instead of days when converting integers to dates?
After reading several threads on how to convert dates in R, it seems that nobody has asked how to convert numeric dates when the numbers index a monthly time series. E.g. 552 represents January 2006.
I have tried several things, such as using as.Date(dates,origin="1899-12-01"), but I recognize that R counts days instead of months. Thus, the code applied to year-month number 552 above yields "1901-06-06" instead of the correct 2006-01-01.
Sidenote: I also want the format to be YEARmonth, but does R allow displaying dates without days?
I think your starting date should be '1960-01-01', since 552 months is exactly 46 years before January 2006.
Anyway, you can solve this problem using the lubridate package.
In this case you can start from a date and add months:
library(lubridate)
as.Date('1960-01-01') %m+% months(552)
It gives you:
[1] "2006-01-01"
You can display only the year and month of a date, but in that case R converts the date to a character string:
format(as.Date('2006-01-01'), "%Y-%m")
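The same conversion can be sketched in base R with integer arithmetic, without lubridate. This assumes, as the answer above suggests, that month 0 corresponds to January 1960; the variable names are illustrative only.

```r
# Month counts since the assumed origin of January 1960
month_index <- c(0, 552, 553)

# Whole years elapsed and the month within the year (1-12)
yr  <- 1960 + month_index %/% 12
mon <- month_index %% 12 + 1

# Build first-of-month dates from the year and month components
dates <- as.Date(sprintf("%d-%02d-01", yr, mon))

# Year-month display; the result is character, not Date
ym <- format(dates, "%Y-%m")
```

Because everything here is vectorized, a whole column of month counts can be converted in one step.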

Year column to time series [duplicate]

OK, this should be really simple but I'm not 'getting it.' I have a data frame with a column "Year" that I want to convert to a time series, but the format is tripping me up. How do I convert the "Year" value to a date, with the actual date being the end of each respective year (e.g. 2015 -> December 31st 2015)?
Year Production
1 1900 38400000
2 1901 43400000
3 1902 49000000
4 1903 44100000
5 1904 49800000
Goal is to get this to a time series data frame. (e.g. xts)
It is not quite the same as a previous question that converted a vector of years to dates. "Convert four digit year values to date type". Goal is to index the data by date, converting it to xts or similar object.
Edited:
This was the final solution:
df <- xts(x = df_original, order.by = as.Date(paste0(df_original[,1], "-12-31")))
whereby the "[,1]" indicates the first column of the original data frame.
If you want each full date to be 31 December, you could use paste along with as.Date to cast to a date:
df$date <- as.Date(paste0(df$Year, "-12-31"))
In addition to Tim Biegeleisen's answer, I will just add another way
df$final_date <- as.Date(ISOdate(df$Year, 12, 31))

Convert characters to dates in a panel data set

I am downloading and using the following panel data,
# load / install package
library(rsdmx)
library(dplyr)
# Total
Assets.PIT <- readSDMX("http://widukind-api.cepremap.org/api/v1/sdmx/IMF/data/IFS/..Q.BFPA-BP6-USD")
Assets.PIT <- as.data.frame(Assets.PIT)
names(Assets.PIT)[10]<-"A.PI.T"
names(Assets.PIT)[6]<-"Code"
AP<-Assets.PIT[c("WIDUKIND_NAME","Code","TIME_PERIOD","A.PI.T")]
AP<-rename(AP, Country=WIDUKIND_NAME, Year=TIME_PERIOD)
My goal is to convert the column vector Year in the dataframe AP into a vector of class dates. In other words, I want R to understand the time series part of my panel data. For your information, I have quarterly data, with unbalanced date range across cross sections (in my case countries).
head(AP$Year)
[1] "2008-Q2" "2008-Q3" "2008-Q4" "2009-Q1" "2009-Q2" "2009-Q3"
Or,
AP$Year<-as.factor(AP$Year)
head(AP$Year)
[1] 2008-Q2 2008-Q3 2008-Q4 2009-Q1 2009-Q2 2009-Q3
264 Levels: 1950-Q1 1950-Q2 1950-Q3 1950-Q4 1951-Q1 1951-Q2 1951-Q3 1951-Q4 1952-Q1 1952-Q2 1952-Q3 ... 2015-Q4
Is there any easy solution to convert these character dates into time-series dates?
library(zoo)
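as.Date(as.yearqtr(AP$Year, format = "%Y-Q%q"))
This should do it. The format string has to match the "2008-Q2" layout of the data, and the result is the first day of each quarter.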
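If zoo is not available, the quarter labels can also be parsed by hand. This is a minimal base R sketch that assumes every label has exactly the "YYYY-Qn" layout shown in the question; the vector below is sample data, not the downloaded series.

```r
# Sample labels in the same layout as AP$Year
periods <- c("2008-Q2", "2008-Q3", "2008-Q4", "2009-Q1")

# Characters 1-4 hold the year, character 7 holds the quarter number
yr  <- as.integer(substr(periods, 1, 4))
qtr <- as.integer(substr(periods, 7, 7))

# First month of each quarter: Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10
first_month <- (qtr - 1L) * 3L + 1L

# Date of the first day of each quarter
quarter_start <- as.Date(sprintf("%d-%02d-01", yr, first_month))
```

Using the first day of the quarter mirrors what as.Date() does by default for a yearqtr object.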

Converting hours:minutes with uneven column lengths - zeros

I am trying to convert a data.frame that holds amounts of time in the format hours:minutes.
I found this post useful and like the simple approach of using the POSIXlt type:
R: Convert hours:minutes:seconds
However, each column represents a month's worth of days, so the columns are uneven. When I try the code below, following several other SO posts, I get zeros in the one column with fewer values.
The code is below. Note that when run, you get all zeros for feb, which has fewer data values in its rows.
rDF <- data.frame(jan=c("9:59","10:02","10:04"),
feb=c("9:59","10:02",""),
mar=c("9:59","10:02","10:04"),stringsAsFactors = FALSE)
for (i in 1:3) {
Res <- as.POSIXlt(paste(Sys.Date(), rDF[,i]))
rDF[,i] <- Res$hour + Res$min/60
}
Thank you for any suggestions to fix this issue. I'm open to a more efficient approach as well.
Best,
Leah
You could try using the package lubridate. Here we convert your data column by column to hour-minute format (using hm), then extract the hours and add the minutes divided by 60:
library(lubridate)
rDF[] <- lapply(rDF, function(x){hm(x)$hour + hm(x)$minute/60})
jan feb mar
1 9.983333 9.983333 9.983333
2 10.033333 10.033333 10.033333
3 10.066667 NA 10.066667
This could easily be achieved with package lubridate's hm:
library(lubridate)
temp<-lapply(rDF,hm)
NewDF<-data.frame(jan=temp[[1]],feb=temp[[2]],mar=temp[[3]])
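For comparison, here is a base R sketch that avoids date-time parsing altogether by splitting each string at the colon. The helper name to_hours is made up for illustration; empty or malformed entries are returned as NA rather than zero.

```r
rDF <- data.frame(jan = c("9:59", "10:02", "10:04"),
                  feb = c("9:59", "10:02", ""),
                  mar = c("9:59", "10:02", "10:04"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: "H:MM" strings to decimal hours
to_hours <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    # Anything that is not exactly hours and minutes becomes NA
    if (length(p) != 2) return(NA_real_)
    as.numeric(p[1]) + as.numeric(p[2]) / 60
  }, numeric(1))
}

# Apply to every column while keeping the data.frame structure
rDF[] <- lapply(rDF, to_hours)
```

Each cell is parsed on its own, so a blank entry in one column cannot affect how the other entries in that column are interpreted.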

Selecting Specific Dates in R

I am wondering how to create a subset of data in R based on a list of dates, rather than by a date range.
For example, I have the following data set data which contains 3 years of 6-minute data.
date zone month day year hour minute temp speed gust dir
1 09/06/2009 00:00 PDT 9 6 2009 0 0 62 2 15 156
2 09/06/2009 00:06 PDT 9 6 2009 0 6 62 13 16 157
I have used breeze<-subset(data, ws>=15 & wd>=247.5 & wd<=315, select=date:dir) to select the rows which meet my criteria for a sea breeze, which is fine, but what I want to do is create a subset of the days which contain those times that meet my criteria.
I have used...
as.character(breeze$date)
trimdate<-strtrim(breeze$date, 10)
breezedate<-as.Date(trimdate, "%m/%d/%Y")
breezedate<-format(breezedate, format="%m/%d/%Y")
...to extract the dates from each row that meets my criteria, so I have a variable called breezedate that contains a list of the dates that I want (not the most elegant code, I'm sure). There are about two hundred dates in the list. With the next command I am trying to create a subset of my original dataset data that contains every row from the days which meet the sea breeze criteria, not just the specific times.
breezedays<-(data$date==breezedate)
I think one of my issues here is that I am comparing one value to a list of values, but I am not sure how to make it work.
Let's assume your breezedate list looks like this and data$date is a simple string:
breezedate <- as.Date(c("2009-09-06", "2009-10-01"))
This is probably what you want (note the trailing comma, which selects rows rather than columns):
breezedays <- data[as.Date(data$date, '%m/%d/%Y') %in% breezedate, ]
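A small self-contained sketch of the same idea, using made-up observations and a simplified criterion (speed >= 15) in place of the full sea breeze test. The data frame name obs and its values are assumptions for illustration only.

```r
# Made-up sample in the same date layout as the question
obs <- data.frame(
  date  = c("09/06/2009 00:00", "09/06/2009 00:06",
            "09/07/2009 00:00", "10/01/2009 12:00"),
  speed = c(2, 16, 3, 20),
  stringsAsFactors = FALSE
)

# as.Date() stops parsing after the date part and ignores the time
obs$day <- as.Date(obs$date, format = "%m/%d/%Y")

# Days on which at least one observation meets the criterion
breezedate <- unique(obs$day[obs$speed >= 15])

# Keep every row from those days, not only the rows that matched
breezedays <- obs[obs$day %in% breezedate, ]
```

The %in% operator tests each row's day against the whole set of dates, whereas == compares element by element and recycles the shorter vector.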
The intersect() function compares two vectors and returns the values they have in common. Note that it returns only the matching values, not the full rows, and both vectors must be in the same format for anything to match.
To use, run the following:
breezedays <- intersect(data$date,breezedate) # returns the values shared between data$date and breezedate

Resources